6.0 Subquery Optimization Cheatsheet
Contents |
This page describes only what you can already try in the code. For the list of all subquery optimization tasks, including incomplete and planned items, see Subquery_Works.
[edit] Semi-join optimizations
Semi-join subqueries are IN-subqueries that are AND-parts of the WHERE/ON clause:
SELECT ... WHERE outer_expr IN (SELECT inner_expr FROM ... WHERE ... ) AND ...
or
SELECT ... WHERE (outer_expr1, ... outer_exprN) IN (SELECT inner_expr1, ... inner_exprN FROM ... WHERE ...) AND ...
The subquery must satisfy these conditions:
- It is not a UNION or UNION ALL
- It doesn't have GROUP BY, HAVING or any kind of aggregate functions
- It doesn't have ORDER BY ... LIMIT clause
The subquery may or may not be correlated (new optimizations provide efficient handling for both kinds), it may also have DISTINCT or LIMIT clauses.
[edit] Strategies for handling semi-join subqueries
If a subquery meets semi-join criteria, MySQL will convert it to semi-join and then make a cost-based choice from these options.:
- WL#3740: Convert subquery to join, or pull out some of the tables out of the subquery
- WL#3741: Duplicate Weedout: run semi-join as if it was a join and then remove the duplicate records using a temporary table
- WL#3750: FirstMatch: run as join but shortcut as soon as we've found one match. This is close to how MySQL 5.x ran subqueries.
- WL#3751: LooseScan: scan a subquery table using an index that allows to pick only one value from each subquery's value group.
- WL#3985: Materialize a subquery into a temporary table with index and then run a join with it. Temporary table's index is used both to remove duplicates and when joining with parent select's tables.
[edit] How semi-joins are reflected in EXPLAIN
EXPLAIN EXTENDED + SHOW WARNINGS shows the rewritten query. If a subquery was converted to semi-join you will see that then subquery predicate is gone and subquery's tables and WHERE clause were merged into parent's subquery's join list and WHERE.
- If WL#3740 pulled out a table, the rewritten subquery will have an inner join between parent select's tables and subquery's table. Otherwise it will be a semi join.
- Other strategies are reflected in the Extra column:
[edit] Materialization
Materialization handles all kinds of IN-subqueries that are uncorrelated:
... outer_expr [NOT] IN (SELECT inner_expr FROM /* some uncorrelated subquery */ ) ...
or
... (outer_expr1, ... outer_exprN) [NOT] IN
(SELECT inner_expr1, ... inner_exprN FROM /* some uncorrelated subquery */ )
- The subselect must be uncorrelated
- One of the following must hold:
-
outer_exprandinner_exprcannot ever be NULL, or - it doesn't matter whether the result of subquery is FALSE or NULL (this is the case when the subuqery is not a NOT IN, and it is AND-part or OR-part of the WHERE clause).
-
- GROUP BY, aggregation, UNION, etc are allowed.
- The subquery can be located everywhere: it can be select list, be an argument to some function, etc.
At the moment, materialization doesn't stand out much in EXPLAIN output: you'll see select_type change from DEPENDENT SUBQUERY to SUBQUERY.
[edit] Turning new optimizations on/off
The new @@optimizer_switch variable allows to turn new optimizations on and off. The value is comma-separated list of any subset of these keywords: "no_materialization", "no_semijoin", "no_loosescan", "no_firstmatch". no_semijoin implies no_loosescan and no_firstmatch.
To switch to MySQL 5.1 behavior, use this command:
mysql> set @@optimizer_switch='no_semijoin,no_materialization';
To switch everything back on, use this:
mysql> set @@optimizer_switch='';
[edit] Further reading
This page is a short summary. For a full overview of 6.0 optimizations, see
- New_Optimizer_Features_in_MySQL_6.0 MySQL University session, or
- New Subquery Optimizations in MySQL 6.0 slides from MySQL User Conference 2008 (this is the most up to date source) (but it doesn't cover WL#3985)
Also see Category:Subquery_Optimizations.