MySQL Internals Prepared Stored
← Back to MySQL Internals overview page
Contents |
[edit] Prepared Statements and Stored Routines Re-execution
Let us start with a general description of the MySQL statement processing workflow in order to provide the reader with understanding of the problem of reexecution and vocabulary for the following sections.
Conventional statements, that is, SQL queries sent in COM_QUERY protocol packet, are the only statements present in MySQL server prior to version 4.1. Execution of such statements is performed in a batch mode, one query processed by the server at a time. The original implementation is streamlined for this mode and has a single global connection state THD shared among all operational steps.
When executing a query in conventional mode, the server sequentially parses its text, acquires table level locks, analyzes the parsed tree, builds an execution plan, executes the built plan and releases the locks.
Memory for parsing is allocated using block allocator MEM_ROOT in 4k chunks and freed once in the end of execution. Memory for execution is allocated in the memory root of the parsed tree, as well as in the system heap, and in some cases in local "memory roots" of execution modules.
The role of the parser is to create a set of objects to represent the query. E.g. for a SELECT statement, this set includes a list of Item's for SELECT list, a list of tables (TABLE_LIST object for each table) for FROM clause, and a tree of Item's for WHERE clause.
During context analysis phase, links are established from the parsed tree to the physical objects of the database, such as open tables and table columns. A physical table is represented by a heir of class handler that corresponds to the storage engine the table belongs to, and is saved in TABLE_LIST::file.
When context analysis is done, the query optimizer is run. It performs two major tasks:
- Query transformation a transformation of the parsed tree to an equivalent one, which is simpler and more efficient to execute.
- Creation of an execution plan, including evaluation of an order of joins and initialization of methods to access the used tables. At this step parts of the execution plan are attached to the parsed tree.
Finally, the query is passed to the execution runtime an interpreter that operates with and modifies both the parsed tree and the execution plan in order to execute the query.
It should be noted that the overall procedure is infamous for breaking borders between abstraction layers. For example, MySQL performs [sub]query transformation during context analysis; moreover, most parts of the code rely on the fact that THD is processing only one statement at a time.
[edit] Statement Re-execution Requirements
Features of MySQL 4.1 and 5.0 put a new demand on the execution process: prepared statements and stored routines need to reuse the same parsed tree to execute a query many times.
So far no easy mechanism that would allow query reexecution using the conventional query processing code has been found. For instance, copying of the parsed tree before each reexecution is not simple to implement as a parsed tree, which can contain instances of more than 300 different classes, has a lot of cross-references between its objects.
The present solution introduces a concept of change management for the changes of the parsed tree and is largely a unification of numerous fixes of bugs in reexecution. The solution has two aspects.
The first one is that modifications of the parsed tree are tracked and a way to restore the tree to a state that allows reexecution is introduced.
The second aspect is that a dedicated block allocator (memory root) is used to store the parsed tree, and the memory allocated in this memory root is freed only when the parsed tree is destroyed. Later this memory root will be denoted as the permanent memory root of a statement.
In order to properly restore the parsed tree to a usable state, all modifications of the tree are classified as destructive or non-destructive and an appropriate action is taken for every type of modification.
A non-destructive modification does not depend on actual values of prepared statement placeholders or contents of the tables used in a query. Such modification is [and should be, for future changes] made only once and the memory for it is allocated in the permanent memory root of the statement.
As a result, the modified parsed tree remains usable.
Examples of non-destructive and useful modifications of the parsed tree are:
-
WHERE/ONclause flattening -
NOTelimination -
LEFT JOINelimination, when it can be done based on the constants explicitly specified in the query
The rest of modifications are destructive, generally because they are based on actual contents of tables or placeholders.
Examples of destructive modifications are:
- Equality propagation
- Sorting of members of
INarray for quick evaluation ofINexpression.
Destructive modifications are (and should be for all future changes) allocated in a memory root dedicated to execution, are registered in THD::change_list and rolled back in the end of each execution. Later the memory root dedicated to execution of a statement will be denoted as the runtime memory root of the statement. Because allocations are done indirectly via THD::mem_root, THD::mem_root at any given moment of time can point either to the permanent or to the runtime memory root of the statement. Consequently, THD::mem_root and THD::free_list can be denoted as 'currently active arena' of THD.
[edit] Preparation of a Prepared Statement
As mentioned above, THD is currently a required argument and the runtime context for every function in the server. Therefore, in order to call the parser and allocate memory in the statement memory root we perform several save-restore steps with THD::mem_root and THD::free_list (the active arena of THD).
- In order to parse a statement, we save the currently active arena of
THDand assign its members from the permanent arena of the statement. This is achieved by callingTHD::set_and_backup_active_arena. This wayalloc_queryandyyparseoperate on the permanent arena. - We don't want the garbage which is created during statement validation to be left in the permanent arena of the statement. For that, after parse but before validation of the statement, we restore the THD arena saved in (1). In other words, we use the arena of
THDthat was active whenPrepared_statement::preparewas invoked as the runtime arena of the statement when it is validated. - Statement validation is performed in function
check_prepared_statement(). This function will subsequently callst_select_lex_unit::prepare()andsetup_fields()for the main LEX unit, createJOINsfor every unit, and callJOIN::preparefor every join (JOINsin MySQL represents a part of the execution plan). Our prepared statement engine does not save the execution plan in a prepared statement for reuse, and ideally we should not create it at prepare stage. However, currently there is no other way to validate a statement except to callJOIN::preparefor all its units. - During validation we may perform a transformation of the parsed tree. In a clean implementation this would belong to a separate step, but in our case the majority of the server runtime was not refactored to support reexecution of statements, and a permanent transformation of the parsed tree can happen at any moment during validation. Such transformations absolutely must use the permanent arena of the prepared statement. To make this arena accessible, we save a pointer to it in
thd->stmt_arenabefore callingcheck_prepared_statement.
Later, whenever we need to perform a permanent transformation, we first call THD::activate_stmt_arena_if_needed to make the permanent arena active, transform the tree, and restore the runtime arena.
- Some parts of the execution do not distinguish between preparation of a prepared statement and its execution and perform destructive optimizations of the parsed tree even during validation. These changes of the parsed tree are recorded in
THD::change_list usingmethodTHD::register_item_tree_change. - After the validation is done, we rollback the changes registered in
THD::change_listand free new items and other memory allocated by destructive transformations.
[edit] Execution of a Prepared Statement
In order to call mysql_execute_command (the function that executes a statement) for a prepared statement and not damage its parse tree, we backup and restore the active Query_arena of THD.
- We don't want the garbage created during execution to be left in the permanent arena of the statement. To ensure that, every statement is executed in the runtime arena of
THD. In other words, the arena which was active whenmysql_stmt_executewas called is used as the runtime arena of the statement during its execution. - Before calling
mysql_stmt_execute, weallocate thd->querywith parameter markers ('?') replaced with their values: the new query is allocated in the runtime arena. We'll need this query for general, binary, error and slow logs. - The execution plan created at prepare stage is not saved (see [prepared-stored.html#prepared-stored-statement-preparation Section??10.2, Preparation of a Prepared Statement]), and at execute we simply create a new set of JOINs and then prepare and optimize it. During the first execution of the prepared statement the server may perform non-destructive transformations of statement's parsed tree: normally that would belong to a separate step executed at statement prepare, but once again, this haven't been done in 4.1 or 5.0. Such transformations absolutely must use the permanent arena of the prepared statement (saved in
thd->stmt_arena). Whenever we need to perform a permanent transformation, we first callTHD::activate_stmt_arena_if_neededto make the permanent arena active, transform the tree, and restore the runtime arena. To avoid double transformations in such cases, we track current state of the parsed tree inQuery_arena::state.
This state may be one of the following:
-
INITIALIZEDwe're in statementPREPARE. -
INITIALIZED_FOR_SPwe're in first execution of a stored procedure statement. -
PREPAREDwe're in first execution of a prepared statement. -
EXECUTEDwe're in a subsequent execution of a prepared statement or a stored procedure statement. -
CONVENTIONAL_EXECUTIONwe're executing a pre-4.1 query.
-
One can use helper methods of Query_arena to check this state (is_conventional_execution(), is_stmt_prepare(), is_stmt_execute(), is_stmt_prepare_or_first_sp_execute()).
Additionally, st_select_lex_unit::first_execution contains a flag for the state of each subquery in a complex statement. A separate variable is needed because not all subqueries may get executed during the first execution of a statement.
- Some optimizations damage the parsed tree, e.g. replace leafs and subtrees of items with other items or leave item objects cluttered with runtime data. To allow re-execution of a prepared statement the following mechanisms are currently employed:
- A hierarchy of
Item::cleanup()andst_select_lex::cleanup()methods to restore the parsed tree to the condition of right-after-parse. These cleanups are called inPrepared_statement::cleanup_stmt()after the statement has been executed. - In order to roll back destructive transformations of the parsed tree, every replacement of one item with another is registered in
THD::change_listby usingTHD::change_item_tree(). In the end of execution all such changes are rolled back in reverse order.
- A hierarchy of
Example:
if (!(fld= new Item_field(from_field))) goto error; thd->change_item_tree(reference, fld);
If a transformation is a non-destructive, it should not be registered, but performed only once in the permanent memory root. Additionally, be careful to not supply a pointer to stack as the first argument of change_item_tree(); that will lead to stack corruption when a tree is restored.
-
AND/ORsubtrees ofWHEREandONclauses are created anew for each execution. It was easier to implement in 4.1, and the approach with change record list used in (b) could not have been used forAND/ORtransformations, because these transformations not only replace one item with another, but also can remove a complete subtree. Leafs ofAND/ORsubtrees are not copied by this mechanism because currently they are not damaged by the transformation. For details, seeItem::copy_andor_structure(). - No other mechanism exists in the server at the moment to allow re-execution. If the code that you're adding transforms the parsed tree, you must use one of the mechanisms described above, or propose and implement a better approach.
-
- When execution is done, we rollback the damage of the parsed tree.
[edit] Execution of a Stored Procedure Statement
Execution of a stored procedure statement is similar to execution of a prepared statement. The few existing exceptions are described below.
During execution of a stored procedure, THD::stmt_arena points to the permanent query arena of the stored procedure. This arena happens to be also the permanent query arena of every instruction of the procedure, as the parser creates all instructions in the same arena. More generally, THD::stmt_arena is always set and always points to the permanent arena of a statement. If the statement is a conventional query, then the permanent arena simply points to the runtime arena of the query.
An own runtime memory root is set up for execution of every stored procedure statement and freed in the end of execution. This is a necessary measure to avoid memory leaks if a stored procedure statement is executed in a loop.
With regard to the transformations and restoration of the parsed tree, execution of a stored procedure statement follows the path of execution of a prepared statement, with the exception that there is no separate prepare step. THD::is_first_sp_execute() is used to determine whether it's the first execution, and in this case non-destructive permanent transformations of the parsed tree are made in the permanent memory root of the statement that is currently being executed.
During subsequent executions no non-destructive transformations are performed, while all destructive ones are rolled back in the end of execution using the same algorithm as in prepared statements.
[edit] All mechanisms put together: major logical steps of re-execution
With the above mechanisms in place, execution workflow for a statement consists of the following major steps:
- Call to reinit_statement_before_use(): it's a legacy call introduced originally to clean up TABLE_LIST::table elements before they are used.
- Open and lock of all used tables and objects (stored procedures, functions) used in the statement.
- Optimization, an execution plan is created, and then execution of the statement. This potentially involves calls to re-execution facilities, such as change_item_tree(), reset_and_backup_item_arena().
- Destruction of the execution plan (st_select_unit::cleanup()) and clean-up of SELECT_LEX hierarchy.
- The tables and objects used in the query are unlocked and closed.
- Restoration of all the items used in the parsed tree (Item::cleanup() is called for every item that was not created for this execution only but belongs to the original query).
- Rollback of the item tree changes.
- Destruction of the execution-time memory and objects: all temporary items created at step 3 are destroyed, execution memory root is is freed.
These steps are applied to all re-executable objects: stored procedures, triggers, stored functions, events, prepared statemens.