Category: Optimizer

MySQL Internals Optimizer/Storage engine Interface

This page is work in progress. Information may be misleading or incorrect


This is intended to serve as a guide of what information a storage engine needs to provide to MySQL Query Optimizer.

Contents

[edit] Number of rows in the table

Number of records in the table is reported by the table handler by filling handler->stats.records in

 handler::info(HA_STATUS_VARIABLE & maybe_other_bits);

call. The optimizer interprets handler->stats.records value as follows: 0 means the table has exactly zero rows. Interpretation of other values depend on whether handler->table_flags() returns HA_STATS_RECORDS_IS_EXACT bit. If the bit is set, the value is assumed to be the exact number of rows in the table (and query "SELECT COUNT(*) FROM tbl" can be answered immediately). If the bit is not set, the value is assumed to be an estimate (the optimizer will treat it as expected value of the number of records).

[edit] Upper bound of number of rows

The call

 ha_rows handler::estimate_rows_upper_bound()

returns the upper bound of rows in the table. If the storage engine cannot provide the upper bound, it must return HA_POS_ERROR to indicate that. The optimizer uses the upper bound to allocate memory for sorting, returning more records then was reported as upper bound will cause ORDER BY queries to fail.

[edit] Indexes

[edit] Index statistics

Index statistics is reported in

 handler::info(HA_STATUS_CONST & maybe_other_bits);

call. The call retrieves statistics for all indexes, statistics for i-th index is stored in the

 handler->table->key_info[i].rec_per_key

array. The array has #keyparts elements. The optimizer interpets its elements as

 rec_per_key[k] = AVG(#rows such that
                        keypart1=c1 AND keypart2=c2 AND ... AND keypart{k} = c_k, 
                        where c_i are arbitrary constants))
 
 rec_per_key[k] = 0 means the estimate is not known.

Note: the above doesn't apply to fulltext or R-TREE indexes. Optimizer will not use rec_per_key values for those types of indexes

[edit] NULL values

SQL's rules for NULL values are that NULL!=NULL. How should NULL value population affect rec_per_key statistics is a gray area. See Bug#9622 for an attempt to deal with this in MyISAM.

[edit] records_in_range() call

TODO

(returning 0 means you guarantee the range has 0 rows)

[edit] Index properties

[edit] Cost functions

[edit] Cost Units

[edit] read_time()

[edit] scan_time()

[edit] Condition pushdown

[edit] Index Condition pushdown

[edit] Multi Range Read interface

[edit] Default cost functions implementation

what assumptions they make about the engine

[edit] Implementations in known storage engines

Production of statistics and records_in_range() implementation:

Engine rec_per_key records_in_range
MyISAM Static values, updateable on ANALYZE/OPTIMIZE and some DDL or bulk ops Calculated by doing index dives to range endpoints and counting fraction of index pages that is between the endpoints
InnoDB Dynamically calculated by doing several index dives (TODO: when is it recalculated?). That's why running ANALYZE several times may produce different, non-converging results. Similar to MyISAM
NDB MySQL 5.0 - trivial function: { return unique_key? 1 : 10 ;}

MySQL 5.1: For BTREE indexes, ha_ndbcluster can obtain a reasonably accurate estimate by asking the backend. Communication with backend has a cost, so ha_ndbcluster maintains an LRU range estimate cache of 32 elements. The backend is accessed on every 20th records_in_range() call. For other calls the result is calculated from the information in the cache. See NdbIndexStat.h|cpp.

HEAP BTREE: a variant of index dives
HASH: Based on # of hash buckets
Falcon TODO Heuristic formula based on index cardinality value, number of keyparts used by the left interval bound, and type of the interval (disinguishes between equality and all other intervals)
PBXT
Solid (something nontrivial)


ANALYZE/OPTIMIZE table actions:

Engine ANALYZE OPTIMIZE
MyISAM for each index, run through the entire index and update rec_per_key values
InnoDB
NDB
HEAP unsupported unsupported
Falcon
PBXT
Solid

Retrieved from "http://forge.mysql.com/wiki/MySQL_Internals_Optimizer/Storage_engine_Interface"

This page has been accessed 1,639 times. This page was last modified 13:44, 21 December 2007.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...