Categories: MySQLDevelopment | Optimizer

MySQL Internals Optimizer

← Back to MySQL Internals overview page

Contents

[edit] The Optimizer

This chapter describes the operation of the MySQL Query optimizer, which is used to determine the most efficient means for executing queries.

[edit] Code and Concepts

This section discusses key optimizer concepts, terminology, and how these are reflected in the MySQL server source code.

[edit] Definitions

This description uses a narrow definition: The optimizer is the set of routines which decide what execution path the DBMS should take for queries.

MySQL changes these routines frequently, so you should compare what is said here with what's in the current source code. To make that easy, this description includes notes referring to the relevant file and routine, such as See: /sql/select_cc, optimize_cond().

A transformation occurs when one query is changed into another query which delivers the same result. For example, a query could be changed from

SELECT ... WHERE 5 = a

to

SELECT ...WHERE a = 5

Most transformations are less obvious. Some transformations result in faster execution.

[edit] The Optimizer Code

This diagram shows the structure of the function handle_select() in /sql/sql_select.cc (the server code that handles a query):

handle_select()
   mysql_select()
     JOIN::prepare()
       setup_fields()
     JOIN::optimize()            /* optimizer is from here ... */
       optimize_cond()
       opt_sum_query()
       make_join_statistics()
         get_quick_record_count()
         choose_plan()
           /* Find the best way to access tables */
           /* as specified by the user.          */
           optimize_straight_join()
             best_access_path()
           /* Find a (sub-)optimal plan among all or subset */
           /* of all possible query plans where the user    */
           /* controls the exhaustiveness of the search.   */
           greedy_search()
             best_extension_by_limited_search()
               best_access_path()
           /* Perform an exhaustive search for an optimal plan */
           find_best()
       make_join_select()        /* ... to here */
     JOIN::exec()

The indentation in the diagram shows what calls what. Thus you can see that handle_select() calls mysql_select() which calls JOIN::prepare() which calls setup_fields(), and so on. The first part of mysql_select() is JOIN::prepare() which is for context analysis, metadata setup, and some subquery transformations. The optimizer is JOIN::optimize() and all its subordinate routines. When the optimizer finishes, JOIN::exec() takes over and does the job that JOIN::optimize() decides upon.

Although the word JOIN appears, these optimizer routines are applicable to all query types.

The optimize_cond() and opt_sum_query() routines perform transformations. The make_join_statistics() routine puts together all the information it can find about indexes that might be useful for accessing the query's tables.

[edit] Primary Optimizations

This section discusses the most important optimizations performed by the server.

[edit] Optimizing Constant Relations

[edit] Constant Propagation

A transformation takes place for expressions like this:

WHERE column1 = column2 AND column2 = 'x'

For such expressions, since it is known that, if A=B and B=C then A=C (the Transitivity Law), the transformed condition becomes:

WHERE column1='x' AND column2='x'

This transformation occurs for column1 <operator> column2 conditions if and only if <operator> is one of these operators:

=, <, >, <=, >=, <>, <=>, LIKE
 

That is, transitive transformations don't apply for BETWEEN. Probably they should not apply for LIKE either, but that's a story for another day.

Constant propagation happens in a loop, so the output from one propagation step can be input for the next step.

See</span>: /sql/sql_select.cc, change_cond_ref_to_const(). Or See: /sql/sql_select.cc, propagate_cond_constants().

[edit] Eliminating Dead Code

A transformation takes place for conditions that are always true, for example:

WHERE 0=0 AND column1='y'

In this case, the first condition is removed, leaving

WHERE column1='y'

See: /sql/sql_select.cc, remove_eq_conds().

A transformation also takes place for conditions that are always false. For example, consider this WHERE clause:

WHERE (0 = AND s1 = OR s1 = 7

Since the parenthesized part is always false, it is removed, reducing this expression to

WHERE s1 = 7

In some cases, where the WHERE clause represents an impossible condition, the optimizer might eliminate it completely. Consider the following:

WHERE (0 = AND s1 = 5)

Because it is never possible for this condition to be true, the EXPLAIN statement will show the words Impossible WHERE. Informally, we at MySQL say that the WHERE has been optimized away.

If a column cannot be NULL, the optimizer removes any non-relevant IS NULL conditions. Thus,

WHERE not_null_column IS NULL

is an always-false situation, and

WHERE not_null_column IS NOT NULL

is an always-true situation so such columns are also eliminated from the conditional expression. This can be tricky. For example, in an OUTER JOIN, a column which is defined as NOT NULL might still contain a NULL. The optimizer leaves IS NULL conditions alone in such exceptional situations.

The optimizer will not detect all Impossible WHERE situations there are too many possibilities in this regard. For example:

CREATE TABLE Table1 (column1 CHAR(1));
...
SELECT * FROM Table1 WHERE column1 = 'Canada';

The optimizer will not eliminate the condition in the query, even though the CREATE TABLE definition makes it an impossible condition.

[edit] Folding of Constants

A transformation takes place for this expression:

WHERE column1 = 1 + 2

which becomes:

WHERE column1 = 3

Before you say, but I never would write 1 + 2 in the first place, remember what was said earlier about constant propagation. It is quite easy for the optimizer to put such expressions together. This process simplifies the result.

[edit] Constants and Constant Tables

A MySQL constant is something more than a mere literal in the query. It can also be the contents of a constant table, which is defined as follows:

  1. A table with zero rows, or with only one row
  2. A table expression that is restricted with a WHERE condition, containing expressions of the form column = "constant", for all the columns of the table's primary key, or for all the columns of any of the table's unique keys (provided that the unique columns are also defined as NOT NULL).

For example, if the table definition for Table0 contains

... PRIMARY KEY (column1,column2)

then this expression

FROM Table0 ... WHERE column1=5 AND column2=7 ...

returns a constant table. More simply, if the table definition for Table1 contains

... unique_not_null_column INT NOT NULL UNIQUE

then this expression

FROM Table1 ... WHERE unique_not_null_column=5

returns a constant table.

These rules mean that a constant table has at most one row value. MySQL will evaluate a constant table in advance, to find out what that value is. Then MySQL will plug that value into the query. Here's an example:

SELECT Table1.unique_not_null_column, Table2.any_column
    FROM Table1, Table2
    WHERE Table1.unique_not_null_column = Table2.any_column
    AND Table1.unique_not_null_column = 5;

When evaluating this query, MySQL first finds that table Table1 after restriction with Table1.unique_not_null_column is a constant table according to the second definition above. So it retrieves that value.

If the retrieval fails (there is no row in the table with unique_not_null_column = EXPLAIN for the statement:

Impossible WHERE noticed after reading const tables

Alternatively, if the retrieval succeeds (there is exactly one row in the table with unique_not_null_column = MySQL transforms the query to this:

SELECT 5, Table2.any_column
    FROM Table1, Table2
    WHERE 5 = Table2.any_column
    AND 5 = 5;

Actually this is a grand-combination example. The optimizer does some of the transformation because of constant propagation, which we described earlier. By the way, we described constant propagation first because it happens happens before MySQL figures out what the constant tables are. The sequence of optimizer steps sometimes makes a difference.

Although many queries have no constant-table references, it should be kept in mind that whenever the word constant is mentioned hereafter, it refers either to a literal or to the contents of a constant table.

See: /sql/sql_select.cc, make_join_statistics().

[edit] Optimizing Joins

This section discusses the various methods used to optimize joins.

[edit] Determining the Join Type

When evaluating a conditional expression, MySQL decides what join type the expression has. (Again: despite the word join, this applies for all conditional expressions, not just join expressions. A term like access type would be clearer.) These are the documented join types, in order from best to worst:

See: /sql/sql_select.h, enum join_type{}. Notice that there are a few other (undocumented) join types too, for subqueries.

The optimizer can use the join type to pick a driver expression. For example, consider this query:

SELECT *
FROM Table1
WHERE indexed_column = AND unindexed_column = 6

Since indexed_column has a better join type, it is more likely to be the driver. You'll see various exceptions as this description proceeds, but this is a simple first rule.

What is significant about a driver? Consider that there are two execution paths for the query:

The Bad Execution Plan: Read every row in the table. (This is called a sequential scan of Table1 or simply table scan.) For each row, examine the values in indexed_column and in unindexed_column, to see if they meet the conditions.

The Good Execution Plan: Via the index, look up the rows which have indexed_column = This is called an indexed search.) For each row, examine the value in unindexed_column to see if it meets the condition.

An indexed search generally involves fewer accesses than a sequential scan, and far fewer accesses if the table is large but the index is unique. That is why it is better to access with the good execution plan, and why it is often good to choose indexed_column as the driver.

[edit] Joins and Access Methods

Bad join choices can cause more damage than bad choices in single-table searches, so MySQL developers have spent proportionally more time making sure that the tables in a query are joined in an optimal order and that optimal access methods (often called access paths) are chosen to retrieve table data. A combination of a fixed order in which tables are joined and the corresponding table access methods for each table is called query execution plan (QEP). The goal of the query optimizer is to find an optimal QEP among all such possible plans. There are several general ideas behind join optimization.

Each plan (or part of plan) is assigned a cost. The cost of a plan reflects roughly the resources needed to compute a query according to the plan, where the main factor is the number of rows that will be accessed while computing a query. Once we have a way to assign costs to different QEPs we have a way to compare them. Thus, the goal of the optimizer is to find a QEP with minimal cost among all possible plans.

In MySQL, the search for an optimal QEP is performed in a bottom-up manner. The optimizer first considers all plans for one table, then all plans for two tables, and so on, until it builds a complete optimal QEP. Query plans that consist of only some of the tables (and predicates) in a query are called partial plans. The optimizer relies on the fact that the more tables that are added to a partial plan, the greater its cost. This allows the optimizer to expand with more tables only the partial plans with lower cost than the current best complete plan.

The key routine that performs the search for an optimal QEP is sql/sql_select.cc, find_best(). It performs an exhaustive search of all possible plans and thus guarantees it will find an optimal one.

Below we represent find_best() in an extremely free translation to pseudocode. It is recursive, so some input variables are labeled so far to indicate that they come from a previous iteration.

remaining_tables = {t1, ..., tn}; /* all tables referenced in a query */

procedure find_best(
   partial_plan in,      /* in, partial plan of tables-joined-so-far */
   partial_plan_cost,    /* in, cost of partial_plan */
   remaining_tables,     /* in, set of tables not referenced in partial_plan */
   best_plan_so_far,     /* in/out, best plan found so far */
   best_plan_so_far_cost)/* in/out, cost of best_plan_so_far */
{
   for each table T from remaining_tables
   {
     /* Calculate the cost of using table T. Factors that the
        optimizer takes into account may include:
          Many rows in table (bad)
          Many key parts in common with tables so far (very good)
          Restriction mentioned in the WHERE clause (good)
          Long key (good)
          Unique or primary key (good)
          Full-text key (bad)
        Other factors that may at some time be worth considering:
          Many columns in key
          Short average/maximum key length
          Small table file
          Few levels in index
          All ORDER BY / GROUP columns come from this table */
     cost = complex-series-of-calculations;
     /* Add the cost to the cost so far. */
     partial_plan_cost+= cost;

     if (partial_plan_cost >= best_plan_so_far_cost)
       /* partial_plan_cost already too great, stop search */
       continue;

     partial_plan= expand partial_plan by best_access_method;
     remaining_tables= remaining_tables - table T;
     if (remaining_tables is not an empty set)
     {
       find_best(partial_plan, partial_plan_cost,
                 remaining_tables,
                 best_plan_so_far, best_plan_so_far_cost);
     }
     else
     {
       best_plan_so_far_cost= partial_plan_cost;
       best_plan_so_far= partial_plan;
     }
   }
}

Here the optimizer applies a depth-first search algorithm. It performs estimates for every table in the FROM clause. It will stop a search early if the estimate becomes worse than the best estimate so far. The order of scanning will depend on the order that the tables appear in the FROM clause.

See: /sql/table.h, struct st_table.

ANALYZE TABLE may affect some of the factors that the optimizer considers.

See also: /sql/sql_sqlect.cc, make_join_statistics().

The straightforward use of find_best() and greedy_search() will not apply for LEFT JOIN or RIGHT JOIN. For example, starting with MySQL 4.0.14, the optimizer may change a left join to a straight join and swap the table order in some cases. See also LEFT JOIN and RIGHT JOIN Optimization.

[edit] The range Join Type

Some conditions can work with indexes, but over a (possibly wide) range of keys. These are known as range conditions, and are most often encountered with expressions involving these operators: >, >=, <, <=, IN, LIKE, BETWEEN

To the optimizer, this expression:

column1 IN (1,2,3)

is the same as this one:

column1 = OR column1 = OR column1 = 3

and MySQL treats them the same there is no need to change IN to OR for a query, or vice versa.

The optimizer will use an index (range search) for

column1 LIKE 'x%'

but not for

column1 LIKE '%x'

That is, there is no range search if the first character in the pattern is a wildcard.

To the optimizer,

column1 BETWEEN 5 AND 7

is the same as this expression

column1 >= AND column1 <= 7

and again, MySQL treats both expressions the same.

The optimizer may change a Range to an ALL join type if a condition would examine too many index keys. Such a change is particularly likely for < and > conditions and multiple-level secondary indexes. See: (for MyISAM indexes) /myisam/mi_range.c, mi_records_in_range().

[edit] The index Join Type

Consider this query:

SELECT column1 FROM Table1;

If column1 is indexed, then the optimizer may choose to retrieve the values from the index rather than from the table. An index which is used this way is called a covering index in most texts. MySQL simply uses the word index in EXPLAIN descriptions.

For this query:

SELECT column1, column2 FROM Table1;

the optimizer will use join type = index only if the index has this definition:

CREATE INDEX ... ON Table1 (column1, column2);

In other words, all columns in the select list must be in the index. (The order of the columns in the index does not matter.) Thus it might make sense to define a multiple-column index strictly for use as a covering index, regardless of search considerations.

[edit] The Index Merge Join Type

[edit] Overview

Index Merge is used when table condition can be converted to form:

cond_1 OR cond_2 ... OR cond_N

The conditions for conversion are that each cond_i can be used for a range scan, and no pair (cond_i, cond_j) uses the same index. (If cond_i and cond_j use the same index, then cond_i OR cond_j can be combined into a single range scan and no merging is necessary.)

For example, Index Merge can be used for the following queries:

SELECT * FROM t WHERE key1=c1 OR key2<c2 OR key3 IN (c3,c4);

SELECT * FROM t WHERE (key1=c1 OR key2<c2) AND nonkey=c3;

Index Merge is implemented as a container for range key scans constructed from cond_i conditions. When doing Index Merge, MySQL retrieves rows for each of the keyscans and then runs them through a duplicate elimination procedure. Currently the Unique class is used for duplicate elimination.

[edit] Index Merge Optimizer

A single SEL_TREE object cannot be constructed for conditions that have different members of keys in the OR clause, like in condition:

key1 < c1 OR key2 < c2

Beginning with MySQL 5.0, these conditions are handled with the Index Merge method, and its range optimizer structure, class SEL_IMERGE. SEL_IMERGE represents a disjunction of several SEL_TREE objects, which can be expressed as:

sel_imerge_cond = (t_1 OR t_1 OR ... OR t_n)

where each of t_i stands for a SEL_TREE object, and no pair (t_i, t_j) of distinct SEL_TREE objects can be combined into single SEL_TREE object.

The current implementation builds SEL_IMERGE only if no single SEL_TREE object can be built for the part of the query condition it has analyzed, and discards SEL_TREE immediately if it discovers that a single SEL_TREE object can be constructed. This is actually a limitation, and can cause worse row retrieval strategy to be used. E.g. for query:

SELECT * FROM t WHERE (goodkey1=c1 OR goodkey1=c2) AND badkey=c3

scan on badkey will be chosen even if Index Merge on (goodkey1, goodkey) would be faster.

The Index Merge optimizer collects a list of possible ways to access rows with Index Merge. This list of SEL_IMERGE structures represents the following condition:

  (t_11 OR t_12 OR ... OR t_1k) AND
  (t_21 OR t_22 OR ... OR t_2l) AND
   ...                          AND
  (t_M1 OR t_M2 OR ... OR t_mp)

where t_ij is one SEL_TREE and one line is for one SEL_IMERGE object.

The SEL_IMERGE object with minimal cost is used for row retrieval.

In sql/opt_range.cc, see imerge_list_and_list(), imerge_list_or_list(), and SEL_IMERGE class member functions for more details of Index Merge construction.

See the get_index_merge_params function in the same file for Index Merge cost calculation algorithm.

[edit] The range Optimizer

For range queries, the MySQL optimizer builds a SEL_TREE object which represents a condition in this form:

range_cond = (cond_key_1 AND cond_key_2 AND ... AND cond_key_N)

Each of cond_key_i is a condition that refers to components of one key. MySQL creates a cond_key_i condition for each of the usable keys. Then the cheapest condition cond_key_i is used for doing range scan.

A single cond_key_i condition is represented by a pointer-linked network of SEL_ARG objects. Each SEL_ARG object refers to particular part of the key and represents the following condition:

  sel_arg_cond= (inf_val < key_part_n AND key_part_n < sup_val) (1)
                AND next_key_part_sel_arg_cond                  (2)
                OR left_sel_arg_cond                            (3)
                OR right_sel_arg_cond                           (4)
  1. is for an interval, possibly without upper or lower bound, either including or not including boundary values.
  2. is for a SEL_ARG object with condition on next key component.
  3. is for a SEL_ARG object with an interval on the same field as this SEL_ARG object. Intervals between the current and left objects are disjoint and left_sel_arg_cond.sup_val <= inf_val.
  4. is for a SEL_ARG object with an interval on the same field as this SEL_ARG object. Intervals between the current and right objects are disjoint and left_sel_arg_cond.min_val >= max_val.

MySQL is able to convert arbitrary-depth nested AND-OR conditions to the above conjunctive form.

[edit] Row Retrieval Algorithm

Index Merge works in two steps:

Preparation step:

activate 'index only';
foreach key_i in (key_scans \ clustered_pk_scan)
{
  while (retrieve next (key, rowid) pair from key_i)
  {
    if (no clustered PK scan ||
        row doesn't match clustered PK scan condition)
      put rowid into Unique;
  }
}
deactivate 'index only';

Row retrieval step:

for each rowid in Unique
{
  retrieve row and pass it to output;
}
if (clustered_pk_scan)
{
  while (retrieve next row for clustered_pk_scan)
   pass row to output;
}

See: sql/opt_range.cc, QUICK_INDEX_MERGE_SELECT class members for Index Merge row retrieval code.

[edit] Transpositions

MySQL supports transpositions (reversing the order of operands around a relational operator) for simple expressions only. In other words:

WHERE - 5 = column1

becomes:

WHERE column1 = -5

However, MySQL does not support transpositions where arithmetic exists. Thus:

WHERE 5 = -column1

is not treated the same as:

WHERE column1 = -5

Transpositions to expressions of the form column = constant are ideal for index lookups. If an expression of this form refers to an indexed column, then MySQL always uses the index, regardless of the table size. (Exception: If the table has only zero rows or only one row, it is a constant table and receives special treatment. See Constants and Constant Tables.)

[edit] AND Relations

An ANDed search has the form condition1 AND condition2, as in this example:

WHERE column1 = 'x' AND column2 = 'y'

Here, the optimizer's decision process can be described as follows:

  1. If (neither condition is indexed) use sequential scan.
  2. Otherwise, if (one condition has better join type) then pick a driver based on join type (see Determining the Join Type).
  3. Otherwise, since (both conditions are indexed and have equal join type) pick a driver based on the first index that was created.

The optimizer can also choose to perform an index_merge index intersection, as described in Index Merge Optimization.

Here's an example:

CREATE TABLE Table1 (s1 INT, s2 INT);
CREATE INDEX Index1 ON Table1 (s2);
CREATE INDEX Index2 ON Table1 (s1);
...
SELECT * FROM Table1 WHERE s1 = AND s2 = 5;

When choosing a strategy to solve this query, the optimizer picks s2 = Regard this as an accidental effect rather than a rule it could change at any moment.

[edit] OR Relations

An ORed search has the form "condition1" OR "condition2", as in this example:

WHERE column1 = 'x' OR column2 = 'y'

Here the optimizer's decision is to use a sequential scan.

There is also an option to use index merge under such circumstances. See Index Merge Optimizer and Index Merge Optimization, for more information.

The above warning does not apply if the same column is used in both conditions. For example:

WHERE column1 = 'x' OR column1 = 'y'

In such a case, the search is indexed because the expression is a range search. This subject will be revisited during the discussion of the IN predicate.

[edit] UNION Queries

All SELECT statements within a UNION are optimized separately. Therefore, for this query:

SELECT * FROM Table1 WHERE column1 = 'x'
UNION ALL
SELECT * FROM TABLE1 WHERE column2 = 'y'

if both column1 and column2 are indexed, then each SELECT is done using an indexed search, and the result sets are merged. Notice that this query might produce the same results as the query used in the OR example, which uses a sequential scan.

[edit] NOT (<>) Relations

It is a logical rule that

column1 <> 5

is the same as

column1 < 5 OR column1 > 5

However, MySQL does not transform in this circumstance. If you think that a range search would be better, then you should do your own transforming in such cases.

It is also a logical rule that

WHERE NOT (column1 != 5)

is the same as

WHERE column1 = 5

However, MySQL does not transform in this circumstance either.

We expect to add optimizations for both the previous cases.

[edit] ORDER BY Clauses

In general, the optimizer will skip the sort procedure for the ORDER BY clause if it sees that the rows will be in order anyway. But let's examine some exceptional situations.

For the query:

SELECT column1 FROM Table1 ORDER BY 'x';

the optimizer will throw out the ORDER BY clause. This is another example of dead code elimination.

For the query:

SELECT column1 FROM Table1 ORDER BY column1;

the optimizer will use an index on column1, if it exists.

For the query:

SELECT column1 FROM Table1 ORDER BY column1+1;

the optimizer will use an index on column1, if it exists. But don't let that fool you! The index is only for finding the values. (It's cheaper to do a sequential scan of the index than a sequential scan of the table, that's why index is a better join type than ALL see [optimizer.html#optimizer-index-join-type Section??3.2.2.4, The index Join Type].) There will still be a full sort of the results.

For the query:

SELECT * FROM Table1
WHERE column1 > 'x' AND column2 > 'x'
ORDER BY column2;

if both column1 and column2 are indexed, the optimizer will choose an index on ... column1. The fact that ordering takes place by column2 values does not affect the choice of driver in this case.

See: /sql/sql_select.cc, test_if_order_by_key(), and /sql/sql_select.cc, test_if_skip_sort_order().

ORDER BY Optimization, provides a description of the internal sort procedure which we will not repeat here, but urge you to read, because it describes how the buffering and the quicksort mechanisms operate.

See: /sql/sql_select.cc, create_sort_index().

[edit] GROUP BY and Related Conditions

These are the main optimizations that take place for GROUP BY and related items (HAVING, COUNT(), MAX(), MIN(), SUM(), AVG(), DISTINCT()).

SELECT COUNT(*) FROM Table1;

gets the count without going through all the rows. This is true for MyISAM tables, but not for InnoDB tables. Note that the query

SELECT COUNT(column1) FROM Table1;

is not subject to the same optimization, unless column1 is defined as NOT NULL.

SELECT MAX(column1)
  FROM Table1
  WHERE column1 < 'a';

If column1 is indexed, then it's easy to find the highest value by looking for 'a' in the index and going back to the key before that.

SELECT DISTINCT column1 FROM Table1;

to

SELECT column1 FROM Table1 GROUP BY column1;

if and only if both of these conditions are true:

Because DISTINCT is not always transformed to GROUP BY, do not expect that queries with DISTINCT will always cause ordered result sets. (You can, however, rely on that rule with GROUP BY, unless the query includes ORDER BY NULL.)

See: /sql/sql_select.cc, opt_sum_query(), and /sql/sql_select.cc, remove_duplicates().

[edit] Other Optimizations

In this section, we discuss other, more specialized optimizations performed in the MySQL server.

[edit] NULLs Filtering for ref and eq_ref Access

This section discusses the NULLs filtering optimization used for ref and eq_ref joins.

[edit] Early NULLs Filtering

Suppose we have a join order such as this one:

...,  tblX, ..., tblY, ...

Suppose further that table tblY is accessed via ref or eq_ref access on


tblY.key_column = tblX.column

or, in the case of ref access using multiple key parts, via


... AND tblY.key_partN = tblX.column AND ...

where tblX.column can be NULL. Here the early NULLs filtering for ref (or eq_ref) access is applied. We make the following inference:


(tblY.key_partN = tblX.column)  =>  (tblX.column IS NOT NULL)

The original equality can be checked only after we've read the current rows of both tables tblX and tblY. The IS NOT NULL predicate can be checked after we've read the current row of table tblX. If there are any tables in the join order between tblX and tblY, the added IS NOT NULL check will allow us to skip accessing those tables.

This feature is implemented in these places in the server code:

It is possible to add IS NOT NULL predicates for all equalities that could be used for ref access (and not for those that are actually used). However, this is currently not done.

[edit] Late NULLs Filtering

Suppose we have a query plan with table tblX being accessed via the ref access method:


tblX.key_part1 = expr1 AND tblX.key_part2 = expr2 AND ...

Before performing an index lookup, we determine whether any of the expri values is NULL. If it is, we don't perform the lookup, but rather immediately return that the matching tuple is not found.

This optimization reuses the null_rejecting attribute produced by the early NULLs filtering code (see #Early NULLs Filtering). The check itself is located in the function join_read_always_key().

[edit] Partitioning-Related Optimizations

This section discussions optimizations relating to MySQL Partitioning. See Partitioning for general information about the partitioning implementation in MySQL 5.1 and later.

[edit] Partition pruning

The operation of partition pruning is defined as follows:

Given a query over partitioned table, match the table DDL against any WHERE or ON clauses, and find the minimal set of partitions that must be accessed to resolve the query.

The set of partitions thus obtained (hereafter referred to as used) can be smaller then the set of all table partitions. Partitions that did not get into this set (that is, those that were pruned away) will not be accessed at all: this is how query execution is made faster.

Non-Transactional Table Engines.?? With non-transactional tables such as MyISAM, locks are placed on entire partitioned table. It is theoretically possible to use partition pruning to improve concurrency by placing locks only on partitions that are actually used, but this is currently not implemented.

Partition pruning doesn't depend on what table engine is used. Therefore its implementation is a part of the MySQL Query Optimizer. The next few sections provide a detailed description of partition pruning.

[edit] Partition Pruning Overview

Partition pruning is performed using the following steps:

  1. Analyze the WHERE clause and construct an interval graph describing the results of this analysis.
  2. Walk the graph, and find sets of partitions (or subpartitions, if necessary) to be used for each interval in the graph.
  3. Construct a set of partitions used for the entire query.

The description represented by the interval graph is structured in a bottom-up fashion. In the discussion that follows, we first define the term partitioning interval, then describe how partitioning interval are combined to make an interval graph, and then describe the graph walking process.

[edit] Partitioning Intervals
[edit] Single-Point Intervals

Let's start from simplest cases. Suppose that we have a partitioned table with N columns, using partitioning type p_type and the partitioning function p_func, represented like this:

CREATE TABLE t (columns)
PARTITION BY p_type(p_func(col1, col2,... colN)...);

Suppose also that we have a WHERE clause of the form

WHERE t.col1=const1 AND t.col2=const2 AND ... t.colN=constN

We can calculate p_func(const1, const2 ... constN) and discover which partition can contain records matching the WHERE clause. Note that this process works for all partitioning types and all partitioning functions.

Note: This process works only if the WHERE clause is of the exact form given above that is, each column in the table must be tested for equality with some arbitrary constant (not necessarily the same constant for each column). For example, if col1=const1 were missing from the example WHERE clause, then we would not be able to calculate the partitioning function value and so would be unable to restrict the set of partitions to those actually used.
[edit] Interval Walking

Let a partitioned table t be defined with a set of column definitions columns, a partitioning type p_type using a partitioning function p_func taking an integer column int_col, as shown here:

CREATE TABLE t (columns)
PARTITION BY
p_type(p_func(int_col))
...

Now suppose that we have a query whose WHERE clause is of the form

WHERE const1 <= int_col <= const2

We can reduce this case to a number of cases of single-point intervals by converting the WHERE clause into the following relation:

int_field=const1       OR
int_field=const1 + 1   OR
int_field=const1 + 2   OR
...                    OR
int_field=const2

In the source code this conversion is referred to as interval walking. Walking over short intervals is not very expensive, since we can reduce the number of partitions to scan to a small number. However, walking over long intervals may not be very efficient there will be lots of numbers to examine, and we are very likely to out that all partitions need to be scanned.

The threshold for interval walking is determined by

 #define MAX_RANGE_TO_WALK=10
 
Note: The logic of the previous example also applies for a relation such as this one:


const1 >= int_col >= const2
[edit] Interval mapping

Let a partitioned table t be defined as follows:

CREATE TABLE t (columns)
PARTITION BY RANGE|LIST(unary_ascending_function(column))

Suppose we have a query on table t whose WHERE clause is of one of the forms shown here:

Since the partitioning function is ascending, the following relationship holds:

const1 <= t.col <= const2

  => p_func(const1) <=

p_func(t.column) <= p_func(const2)

Using A and B to denote the leftmost and rightmost parts of this relation, we can rewrite it like this:

A <= p_func(t.column) <= B
Note: In this instance, the interval is closed and has two bounds. However, similar inferences can be performed for other kinds of intervals.

For RANGE partitioning, each partition occupies one interval on the partition function value axis, and the intervals are disjoint, as shown here:

                        p0     p1      p2
  table partitions    ------x------x--------x-------->

  search interval     ----x==============x----------->
                          A              B

A partition needs to be accessed if and only if its interval has a non-empty intersection with the search interval [A, B].

For LIST partitioning, each partition covers a set of points on the partition function value axis. Points produced by various partitions may be interleaved, as shown here:

                      p0  p1   p2   p1   p1   p0
table partitions    --+---+----+----+----+----+---->

search interval     ----x===================x------>
                        A                   B

A partition needs to be accessed if it has at least one point in the interval [A, B]. The set of partitions used can be determined by running from A to B and collecting partitions that have their points within this range.

[edit] Subpartitioning Intervals

In the previous sections we've described ways to infer the set of used partitions from "elementary" WHERE clauses. Everything said there about partitions also applies to subpartitions (with exception that subpartitioning by RANGE or LIST is currently not possible).

Since each partition is subpartitioned in the same way, we'll find which subpartitions should be accessed within each partition.

[edit] From WHERE Clauses to Intervals

Previous sections deal with inferring the set of partitions used from WHERE clauses that represent partitioning or subpartitioning intervals. Now we look at how MySQL extracts intervals from arbitrary WHERE clauses.

The extraction process uses the Range Analyzer a part of the MySQL optimizer that produces plans for the range access method. This is because the tasks are similar. In both cases we have a WHERE clause as input: the range access method needs index ranges (that is, intervals) to scan; partition pruning module needs partitioning intervals so that it can determine which partitions should be used.

For range access, the Range Analyzer is invoked with the WHERE clause and descriptions of table indexes. Each index is described by an ordered list of the columns which it covers:

(keypart1, keypart2, ..., keypartN)

For partition pruning, Range Analyzer is invoked with the WHERE clause and a list of table columns used by the partitioning and subpartitioning functions:


(part_col1, part_col2, ... part_colN,
subpart_col1, subpart_col2, ... subpart_colM)

The result of the Range Analyzer's work is known as a SEL_ARG</code> graph. This is a complex (and not yet fully documented) structure, which we will not attempt to describe here. What's important for the current discussion is that we can walk over it and collect partitioning and subpartitioning intervals.

The following example illustrates the structure and the walking process. Suppose a table t is partitioned as follows:


CREATE TABLE t (..., pf INT, sp1 CHAR(5), sp2 INT, ... )
  PARTITION BY LIST (pf)
  SUBPARTITION BY HASH(sp1, sp2) (
    PARTITION p0 VALUES IN (1),
    PARTITION p1 VALUES IN (2),
    PARTITION p2 VALUES IN (3),
    PARTITION p3 VALUES IN (4),
    PARTITION p4 VALUES IN (5),
  );

Now suppose that a query on table t has a highly complex WHERE clause, such as this one:

pf=1 AND (sp1='foo' AND sp2 IN (40,50))

OR

(pf1=3 OR pf1=4) AND sp1='bar' AND sp2=33

OR

((pf=3 OR pf=4) AND sp1=5)

OR

p=8

The SEL_ARG graph for this is shown here:

(root)
   |               :
   | Partitioning  :         Sub-partitioning
   |               :
   |               :
   |               :
   |   +------+    :     +-----------+   +--------+
   \---| pf=1 |----:-----| sp1='foo' |---| sp2=40 |
       +------+    :     +-----------+   +--------+
          |        :                         |
          |        :                     +--------+
          |        :                     | sp2=50 |
          |        :                     +--------+
          |        :
          |        :
       +------+    :     +-----------+   +--------+
       | pf=3 |----:--+--| sp1='bar' |---| sp2=33 |
       +------+    :  |  +-----------+   +--------+
          |        :  |
       +------+    :  |
       | pf=4 |----:--+
       +------+    :
          |        :
          |        :
       +------+    :     +-----------+
       | pf=8 |----:-----| sp1='baz' |
       +------+    :     +-----------+

In the previous diagram, vertical edges (|) represent OR and the horizontal ones (-) represent AND (the line with both horizontal and vertical segments also represents AND).

The partition-pruning code walks the graph top to bottom and from left to right, making these inferences:

  1. Start with an empty set of used partitions at the topmost and leftmost interval.
    1. Perform interval analysis for pf=1; find a corresponding set of partitions P0; move right.
    2. Move right again, to sp2=40.
    3. Analyze the interval sp1='foo' AND sp2=40 interval; find that it covers rows in some subpartition SP1. Make first inference: within each partition making up set P0, mark subpartition SP1 as used.
    4. Move down to sp2=50.
    5. Analyze the interval sp1='foo' AND sp2=50, finding that it covers rows in some subpartition SP2. Make another inference: within each partition of set P0, mark subpartition SP2 as used.
    6. Move back to pf=1, and then down to pf=3.
    7. Perform interval analysis for pf=3; find a corresponding set of partitions P1; move right.
    8. Move right again, to sp2=33.
    9. Analyze the interval sp1='foo' AND sp2=33, find that it covers rows in a subpartition SP3. Make another inference: within each partition from set P1, mark subpartition SP3 as used.
    10. Move back to pf=3, then down to pf=4.
    11. Perform interval analysis for pf=4; find a corresponding set of partitions P2; move right.
    12. Perform moves and inferences analogous to what we did to the right of pf=3. There is some potential inefficiency due to the fact that that we will analyze the interval for sp1='foo' AND sp2=33 again, but this should not have much impact on overall performance.
    13. Move back to pf=3, then down to pf=8.
    14. Perform interval analysis for pf=8; find a corresponding set of partitions P3, move right.
    15. Now we've arrived at sp1='baz', and find that we can't move any further to the right and can't construct a subpartitioning interval. We remember this, and move back to pf=8.
    16. In the previous step we could not limit the set of subpartitions, so we make this inference: for every partition in set P3, assume that all subpartitions are active, and mark them as such.
  2. Try to move down from pf=8; find that there is nothing there; this completes the graph analysis.
Note: In certain cases the result of the RANGE optimizer will be several SEL_ARG graphs that are to be combined using OR or AND operators. This happens for WHERE clauses which either are very complicated or do not allow for the construction of a single list of intervals. In such cases, the partition pruning code takes apprpriate action, an example being this query:
SELECT * FROM t1 WHERE partition_id=10 OR subpartition_id=20

No single list of intervals can be constructed in this instance, but the partition pruning code correctly infers that the set of partitions used is a union of:

  1. All subpartitions within the partition containing rows with partition_id=10; and

a subpartition containing rows with subpartition_id=20 within each partition.

[edit] Partition Pruning in the Source Code

Here is a short walkthrough of what is where in the code:

This file contains the implementation of what is described in #From WHERE Clauses to Intervals. The entry point is the function prune_partitions(). There are also detailed code-level comments about partition pruning; search for PartitionPruningModule to find the starting point.

class partition_info {
  ...
  /*
    Bitmap of used (i.e. not pruned away) partitions. This is where result
    of partition pruning is stored.
  */
  MY_BITMAP used_partitions;

  /*
    "virtual function" pointers to functions that perform interval analysis
    on this partitioned table (used by the code in opt_range.cc)
  */
  get_partitions_in_range_iter get_part_iter_for_interval;
  get_partitions_in_range_iter get_subpart_iter_for_interval;

};

This file contains the functions implementing all types of partitioning interval analysis.

[edit] Partition selection

If a partitioned table is accessed in a series of index lookups (that is, using the ref, eq_ref, or ref_or_null access methods), MySQL checks to see whether it needs to make index lookups in all partitions or that it can limit access to a particular partition. This is performed for each index lookup.

Consider this example:

CREATE TABLE t1 (a INT, b INT);

INSERT INTO t1 VALUES (1,1),(2,2),(3,3);

CREATE TABLE t2 (
    keypart1 INT,
    keypart2 INT,
    KEY(keypart1, keypart2)
)
PARTITION BY HASH(keypart2);

INSERT INTO t2 VALUES (1,1),(2,2),(3,3);

The query

SELECT * FROM t1, t2
    WHERE  t2.keypart1=t1.a
    AND    t2.keypart2=t1.b;

is executed using this algorithm:

(for each record in t1:)
{
  t2->index_read({current-value-of(t1.a), current-value-of(t1.b)});
  while( t2->index_next_same() )
    pass row combination to query output;
 }
 

In the index_read() call, the partitioned table handler will discover that the value of all partitioning columns (in this case, the single column b) is fixed, and find a single partition to access. If this partition was pruned away, then no partitions will be accessed at all.

Retrieved from "http://forge.mysql.com/wiki/MySQL_Internals_Optimizer"

This page has been accessed 9,420 times. This page was last modified 16:04, 31 May 2007.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...