Categories: Optimizer | Subquery Optimizations

6.0 Subquery Optimization Benchmarks

Contents


SergeyP has analyzed the performance of new subquery optimizations that are already pushed into MySQL 6.0. For short description of these optimizations, see 6.0 Subquery Optimization Cheatsheet. Ongoing work and future plans are listed at Subquery_Works.

[edit] Summary statistics

Analyzed 25 subquery performance bugs, which is a random sample of all subquery performance bugs we had (TODO: Ranger: expand the sampling)

[edit] Preliminary conclusions

These conclusions were made after analyzing bugs db cases:


[edit] Detailed bug list

I've done two searches, one for "subquery" and one for "subselect". Unfortunately I didn't notice I've left the Status: select in "Active" state. My intent was to find completely everything.

BUG# Summary Subquery type Testcase available? Handled by 6.0 optimizations? Speedup
Bug#4132 MySQL hang up when run a subquery. select ... where oe not in (/*uncorrelated*/ select min(x) ... group by y) Query only Pending: materialization (waiting for WL#3830)
Bug#8139 subselect index not used for "in" but used for "=" select ... where oe in (uncorrelated semi-join-ok select) Query only Yes, both
Bug#9021 Query with subquery does not use index in outer query select ... where oe in (uncorrelated semi-join-ok select) Query only Yes, both
Bug#9090 Optimization Problem with a subquery in an IN Operation select ... where oe in (uncorrelated select w/ group by) Query only Yes, materialization
Bug#10312 subquery cant be optimized correctly select ... where oe in (uncorrelated semi-join-ok select) Query only Yes, both
Bug#10815 Sub-queries failure when using GROUP BY select ... where oe in (uncorrelated group-by select) Query only Yes, materialization
Bug#11254 Subquery IN Crashes MySQL (Service Pegs CPU) select ... where oe in (uncorrellated semi-join-ok select) and (another uncorrellated semi-join-ok select) Query and data Yes, both Yes, 9300 times
Bug#14253 Optimizer uses partial indexes for subqueries, where it should not select ... where exists (correlated semi-join ok select) Query and data No, it's EXISTS
Bug#15135 Dependent subquery performs very badly select ... where oe in (uncorrelated select w/ group by) Query and data Yes, materialization TODO
Bug#17560 select with subselect does not use indexed key select ... where oe in (uncorrellated semi-join-ok select) Query only Yes, both
Bug#17952 nested query super slow..should not be select ... where oe in (uncorrelated select) Query only Yes, both
Bug#18465 Optimizer/execution plan bug with dependent subqueries in ON clauses of JOINs select ... inner join on ((select min(...) where ie=oe1) = oe2) ... Query and data No
Bug#18826 Using subquery in "in (...)" clause forces full table scan select ... where oe in (uncorrelated semi-join-ok select) Query only Yes, both
Bug#19895 No optimization of where clause subqueries rewritable as joins select ... where oe in (semi-join-ok select), where exists (...) Query only Yes, both
Bug#22765 subselect uses all partitions instead of the actual needed ones. select ... where oe in (semi-join-ok uncorrelated select) Query only Yes, both
Bug#24770 update with subquery slow update ... where oe in (uncorrelated semi-join-ok select) Query and data Yes, materialization (todo: try it)
Pending: handling UPDATEs with w/ semi-join
TODO
Bug#25796 Query plan is not using expected index select ... where oe in (uncorrelated semi-join-ok select) Query only Yes, both
Bug#25926 dependent suqbuery takes too long select ... where oe in (uncorrelated group-by select) Query and data Yes, materialization Yes, 950 times
Bug#27199 Performance problem when query contains a left join against a subquery select ... left join (mergeable select ...) ... Query and data No
Bug#27452 Correlated subquery produces inefficient optimizer plan for INFORMATION_SCHEMA select ... left join (mergeable select ...) ... Query only No
Bug#28257 MySQL hangs on "Copying to tmp table" or "Sending data" - Part 2 of Bug #14070 select ... where oe in (uncorrelated group-by select)
select ... where oe in (uncorrelated semi-join-ok select with distinct)
Query and data Materialization - yes+yes, semi-join - yes+no Yes. 10 times and 3140 times
Bug#30659 300% Performance Regression 5.0.45 -> 5.2.6 on Correlated Subquery select ... from (uncorrelated group-by select)
select ... where oe=(select max(ie) ... where correlated_cond)
Query and data No, its FROM or correlated group-by.
Bug#30883 Dependent Sub Queries Very Slow select ... where oe not in (uncorrelated semi-join-ok select) Query only Pending: materialization (waiting for WL#3830)
Bug#31468 EXPLAIN slow when joining big table to a subquery select ... from (select ...) Query and data No
Bug#32341 Subquery does not use index with session variable, works with string literal select ... where oe = (select ... where .. @var .. order by limit 1) Query and data No
Bug#32665 Query with dependent subquery is too slow select .. where oe in (uncorrelated semi-join-ok select) Query only Yes, both

[edit] Results of search in DBT-x benchmark suites

query no query pattern Handled by 6.0
optimizations?
2 select ... where oe=(uncorrelated select min(..) from 4 tables) No
4 select ... where exists (select * from tbl where oe=ie and uncorr_cond) No, EXISTs
7 select ... from (view-mergeable select) group by x order by y No
8 select ... from (view-mergeable select)
/*no where*/ group by x order by y
No
9 select ... from (view-mergeable select)
/*no where*/ group by x order by y
No
11 select ... having sum(..) > (uncorrelated select(sum) ...) No
13 select ... from (group-by select) /*no where*/ group by x order by y No
15 select ... from tbl, group_by_view_1
where group_by_view_1.group_col=tbl.col and
oe = (select max(col) from group_by_view1)
No
16 select ... where oe not in (uncorrelated semi-join-ok select) Yes, materialization
17 select ... where oe < (select avg( ...) ... where correlated cond) No
18 select ... where oe in (uncorrelated group-by select) Yes, materialization
20 select ... where oe in (semi-join-ok select where ... and oe in > (select sum(x) from tbl where corr_cond)) No
21 select ... where
exists (select ... where oe1=ie1 and oe2!=ie2) and
not exists (select ... where oe1=ie1 and oe2!=ie2)
No
22 select .. from (select ... where oe > (uncorrelated select avg(x)) and not exists (select * from tbl where oe=ie))
/*no where*/ order by x group by y
No

[edit] Results of search in MySQL support cases

The search found 47 cases related to subqueries. The below table has private info removed, the full table is at 6.0_Subquery_Optimization_Benchmarks page on the intranet)

(note: in query abbreviations, i='inner'(i.e. defined in the subquery) o='outer', e='expression'. 'oe'='an expression that depends on outer tables'.)

Query type Handled by new optimization? Notes
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
WHERE oe= ANY(uncorrelated semi-join-ok select) Yes, both subquery has dummy ORDER BY; Testcase available
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both Testcase available
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
WHERE .. IN (uncorrelated semi-join-ok select) Yes, both
WHERE oe IN (correlated semi-join-ok select) Yes, semi-join
WHERE oe IN (uncorrelated semi-join-ok select w/ selective clauses) Yes, both Inside-out order needed
WHERE oe IN (uncorrelated semi-join-ok select w/ selective clauses) Yes, both Inside-out order seems to be needed
WHERE oe IN (uncorrelated semi-join-ok select w/ selective clauses) Yes, both
equivalent to WHERE oe IN (uncorrelated semi-join-ok select w/ selective clauses) Yes, both
WHERE oe IN (uncorrelated semi-join-ok select w/ selective clauses) Yes, both Subquery is a 5-way inner join
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both Inside-out order seems to be needed
WHERE oe IN (uncorrelated semi-join-ok select distinct(x) ... ) Yes, both
WHERE oe IN (SELECT x FROM t1, tw WHERE uncorrelated_cond GROUP BY x) No? Will be handleable by both if we make the optimizer detect and ignore such redundant GROUP BY clauses
UPDATE ... WHERE oe NOT IN (uncorrelated semi-join-ok select, no access to updated table) Yes, materialization Could use semi-join if we make update code convert such queries to multi-table UPDATEs
select ... from outer_table where oe in (uncorrelated semi-join-ok select w/ LIMIT 1) Yes, both Why LIMIT 1?
HAVING oe IN (SELECT sum() ... FROM (SELECT ... ) no_where GROUP BY ...) Yes, materialization
WHERE oe IN (uncorreated semi-join-ok select distinct) Yes, both
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
WHERE outer_tbl.col1 = (SELECT max(col) FROM inner_tbl WHERE inner_tbl.col1=outer_tbl.col1) AND outer_tbl.col1 IN (1,2,3) No Rewriteable as FROM. Is this a case for equality propagation to span subquery bounds?
DELETE .. WHERE NOT EXISTS (semi-join-ok correlated select)
SELECT ... WHERE NOT EXISTS (semi-join-ok correlated select)
No, its EXISTS Testcase available
WHERE oe IN (SELECT column FROM table) Yes, both
WHERE oe IN (SELECT col FROM tbl1 WHERE NOT EXISTS(SELECT ... WHERE ... AND tbl1.col= inner_tbl.col) IN-subqery - Yes, both
EXISTS - No.
Another case of equality-correlated EXISTS
select ... inner joins ... where oe = (uncorrelated semi-join-ok select) No Subqueries are simple 'lookups' so the case itself is not interesting. In general, this hints on necessity to position subquery predicates based on their costs.
WHERE .. IN (uncorrelated group-by select) Yes, Materialization
Several correlated LIMIT 1 subqueries in select select list No This could probably use subquery argument caching.
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
UPDATE tbl ... WHERE tbl.col IN (uncorrelated semi-join-ok select) No Manual conversion to multi-table join update helped
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both Manual conversion to join provided speedup
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both Insideout order neede
WHERE oe IN (uncorrelated semi-join-ok select distinct) Yes, both Insideout order needed
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both Insideout order needed
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
WHERE oe IN (uncorrelated semi-join-ok select) Yes, both
WHERE NOT EXISTS (SELECT 1 FROM inner_tbl WHERE inner_tbl.col=outer_tbl.col AND another-similar-cond) No, EXISTS
WHERE oe IN (uncorrelated group by select) Yes, Materialization 2-level nesting
WHERE oe IN (SELECT col FROM tbl1, tbl2 WHERE uncorrelated_where GROUP BY col) Yes, Materialization (TODO: check if we can convert this to semi-join) Another redundant GROUP BY case
WHERE oe IN (uncorrelated semi-join-ok select), and its EXISTS form Variant 1: Yes, both
Variant 2: No, it's EXISTS
SELECT .. FROM ... LEFT OUTER JOIN (SELECT ...) No, it's FROM.
WHERE .. IN (uncorrelated semi-join-ok select) Yes, both
WHERE oe NOT IN (uncorrelated select w/ aggregates and group by) Yes, materialization
WHERE oe =(SELECT 3-way join WHERE correlated_cond ORDER BY x LIMIT 1) No Subq value cache might help here.
WHERE oe IN (uncorrelated semi-join-ok select, 2-way join) Yes, both
WHERE [NOT] EXISTS (SELECT 1 FROM it1 STRAIGHT JOIN it2 WHERE it1.col=ot.col AND so forth LIMIT 1 clause) No, it's EXISTS
WHERE oe NOT IN (uncorrelated semi-join-ok select) Yes, Materialization

[edit] TODO

.

Retrieved from "http://forge.mysql.com/wiki/6.0_Subquery_Optimization_Benchmarks"

This page has been accessed 2,719 times. This page was last modified 19:03, 30 April 2008.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...