Category: MySQLDevelopment

MySQL Internals Prepared Stored

← Back to MySQL Internals overview page

Contents

[edit] Prepared Statements and Stored Routines Re-execution

Let us start with a general description of the MySQL statement processing workflow in order to provide the reader with understanding of the problem of reexecution and vocabulary for the following sections.

Conventional statements, that is, SQL queries sent in COM_QUERY protocol packet, are the only statements present in MySQL server prior to version 4.1. Execution of such statements is performed in a batch mode, one query processed by the server at a time. The original implementation is streamlined for this mode and has a single global connection state THD shared among all operational steps.

When executing a query in conventional mode, the server sequentially parses its text, acquires table level locks, analyzes the parsed tree, builds an execution plan, executes the built plan and releases the locks.

Memory for parsing is allocated using block allocator MEM_ROOT in 4k chunks and freed once in the end of execution. Memory for execution is allocated in the memory root of the parsed tree, as well as in the system heap, and in some cases in local "memory roots" of execution modules.

The role of the parser is to create a set of objects to represent the query. E.g. for a SELECT statement, this set includes a list of Item's for SELECT list, a list of tables (TABLE_LIST object for each table) for FROM clause, and a tree of Item's for WHERE clause.

During context analysis phase, links are established from the parsed tree to the physical objects of the database, such as open tables and table columns. A physical table is represented by a heir of class handler that corresponds to the storage engine the table belongs to, and is saved in TABLE_LIST::file.

When context analysis is done, the query optimizer is run. It performs two major tasks:

Finally, the query is passed to the execution runtime an interpreter that operates with and modifies both the parsed tree and the execution plan in order to execute the query.

It should be noted that the overall procedure is infamous for breaking borders between abstraction layers. For example, MySQL performs [sub]query transformation during context analysis; moreover, most parts of the code rely on the fact that THD is processing only one statement at a time.

[edit] Statement Re-execution Requirements

Features of MySQL 4.1 and 5.0 put a new demand on the execution process: prepared statements and stored routines need to reuse the same parsed tree to execute a query many times.

So far no easy mechanism that would allow query reexecution using the conventional query processing code has been found. For instance, copying of the parsed tree before each reexecution is not simple to implement as a parsed tree, which can contain instances of more than 300 different classes, has a lot of cross-references between its objects.

The present solution introduces a concept of change management for the changes of the parsed tree and is largely a unification of numerous fixes of bugs in reexecution. The solution has two aspects.

The first one is that modifications of the parsed tree are tracked and a way to restore the tree to a state that allows reexecution is introduced.

The second aspect is that a dedicated block allocator (memory root) is used to store the parsed tree, and the memory allocated in this memory root is freed only when the parsed tree is destroyed. Later this memory root will be denoted as the permanent memory root of a statement.

In order to properly restore the parsed tree to a usable state, all modifications of the tree are classified as destructive or non-destructive and an appropriate action is taken for every type of modification.

A non-destructive modification does not depend on actual values of prepared statement placeholders or contents of the tables used in a query. Such modification is [and should be, for future changes] made only once and the memory for it is allocated in the permanent memory root of the statement.

As a result, the modified parsed tree remains usable.

Examples of non-destructive and useful modifications of the parsed tree are:

The rest of modifications are destructive, generally because they are based on actual contents of tables or placeholders.

Examples of destructive modifications are:

Destructive modifications are (and should be for all future changes) allocated in a memory root dedicated to execution, are registered in THD::change_list and rolled back in the end of each execution. Later the memory root dedicated to execution of a statement will be denoted as the runtime memory root of the statement. Because allocations are done indirectly via THD::mem_root, THD::mem_root at any given moment of time can point either to the permanent or to the runtime memory root of the statement. Consequently, THD::mem_root and THD::free_list can be denoted as 'currently active arena' of THD.

[edit] Preparation of a Prepared Statement

As mentioned above, THD is currently a required argument and the runtime context for every function in the server. Therefore, in order to call the parser and allocate memory in the statement memory root we perform several save-restore steps with THD::mem_root and THD::free_list (the active arena of THD).

  1. In order to parse a statement, we save the currently active arena of THD and assign its members from the permanent arena of the statement. This is achieved by calling THD::set_and_backup_active_arena. This way alloc_query and yyparse operate on the permanent arena.
  2. We don't want the garbage which is created during statement validation to be left in the permanent arena of the statement. For that, after parse but before validation of the statement, we restore the THD arena saved in (1). In other words, we use the arena of THD that was active when Prepared_statement::prepare was invoked as the runtime arena of the statement when it is validated.
  3. Statement validation is performed in function check_prepared_statement(). This function will subsequently call st_select_lex_unit::prepare() and setup_fields() for the main LEX unit, create JOINs for every unit, and call JOIN::prepare for every join (JOINs in MySQL represents a part of the execution plan). Our prepared statement engine does not save the execution plan in a prepared statement for reuse, and ideally we should not create it at prepare stage. However, currently there is no other way to validate a statement except to call JOIN::prepare for all its units.
  4. During validation we may perform a transformation of the parsed tree. In a clean implementation this would belong to a separate step, but in our case the majority of the server runtime was not refactored to support reexecution of statements, and a permanent transformation of the parsed tree can happen at any moment during validation. Such transformations absolutely must use the permanent arena of the prepared statement. To make this arena accessible, we save a pointer to it in thd->stmt_arena before calling check_prepared_statement.

Later, whenever we need to perform a permanent transformation, we first call THD::activate_stmt_arena_if_needed to make the permanent arena active, transform the tree, and restore the runtime arena.

  1. Some parts of the execution do not distinguish between preparation of a prepared statement and its execution and perform destructive optimizations of the parsed tree even during validation. These changes of the parsed tree are recorded in THD::change_list using method THD::register_item_tree_change.
  2. After the validation is done, we rollback the changes registered in THD::change_list and free new items and other memory allocated by destructive transformations.

[edit] Execution of a Prepared Statement

In order to call mysql_execute_command (the function that executes a statement) for a prepared statement and not damage its parse tree, we backup and restore the active Query_arena of THD.

This state may be one of the following:

One can use helper methods of Query_arena to check this state (is_conventional_execution(), is_stmt_prepare(), is_stmt_execute(), is_stmt_prepare_or_first_sp_execute()). Additionally, st_select_lex_unit::first_execution contains a flag for the state of each subquery in a complex statement. A separate variable is needed because not all subqueries may get executed during the first execution of a statement.

Example:

if (!(fld= new Item_field(from_field)))
goto error;
thd->change_item_tree(reference, fld);

If a transformation is a non-destructive, it should not be registered, but performed only once in the permanent memory root. Additionally, be careful to not supply a pointer to stack as the first argument of change_item_tree(); that will lead to stack corruption when a tree is restored.

[edit] Execution of a Stored Procedure Statement

Execution of a stored procedure statement is similar to execution of a prepared statement. The few existing exceptions are described below.

During execution of a stored procedure, THD::stmt_arena points to the permanent query arena of the stored procedure. This arena happens to be also the permanent query arena of every instruction of the procedure, as the parser creates all instructions in the same arena. More generally, THD::stmt_arena is always set and always points to the permanent arena of a statement. If the statement is a conventional query, then the permanent arena simply points to the runtime arena of the query.

An own runtime memory root is set up for execution of every stored procedure statement and freed in the end of execution. This is a necessary measure to avoid memory leaks if a stored procedure statement is executed in a loop.

With regard to the transformations and restoration of the parsed tree, execution of a stored procedure statement follows the path of execution of a prepared statement, with the exception that there is no separate prepare step. THD::is_first_sp_execute() is used to determine whether it's the first execution, and in this case non-destructive permanent transformations of the parsed tree are made in the permanent memory root of the statement that is currently being executed.

During subsequent executions no non-destructive transformations are performed, while all destructive ones are rolled back in the end of execution using the same algorithm as in prepared statements.

[edit] All mechanisms put together: major logical steps of re-execution

With the above mechanisms in place, execution workflow for a statement consists of the following major steps:

  1. Call to reinit_statement_before_use(): it's a legacy call introduced originally to clean up TABLE_LIST::table elements before they are used.
  2. Open and lock of all used tables and objects (stored procedures, functions) used in the statement.
  3. Optimization, an execution plan is created, and then execution of the statement. This potentially involves calls to re-execution facilities, such as change_item_tree(), reset_and_backup_item_arena().
  4. Destruction of the execution plan (st_select_unit::cleanup()) and clean-up of SELECT_LEX hierarchy.
  5. The tables and objects used in the query are unlocked and closed.
  6. Restoration of all the items used in the parsed tree (Item::cleanup() is called for every item that was not created for this execution only but belongs to the original query).
  7. Rollback of the item tree changes.
  8. Destruction of the execution-time memory and objects: all temporary items created at step 3 are destroyed, execution memory root is is freed.

These steps are applied to all re-executable objects: stored procedures, triggers, stored functions, events, prepared statemens.

Retrieved from "http://forge.mysql.com/wiki/MySQL_Internals_Prepared_Stored"

This page has been accessed 4,599 times. This page was last modified 16:12, 31 May 2007.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...