Tuning MySQL5 SQL And Stored Procedures
by Guy Harrison
[edit] Notes by Sheeri Kritzer
Un-tuned SQL or stored procedures often fail to scale as table volumes increase, inefficiency increases exponentially with size.
Tune SQL/stored procedures and then buy new hardware.
use EXPLAIN to help optimize queries. Also use the slow query log.
EXPLAIN EXTENDED shows sql that was actually used — ie, optimizer may rewrite query, so it’s a neat tool.
you can always give optimizer hints, but they’re not recommended — keep checking them as your app grows — STRAIGHT_JOIN, FORCE INDEX, USE INDEX, and one other one.
SHOW STATUS gives you status variables. innodb_buffer_pool_read_requests and innodb_data_read will show how much data is being read from the buffer pool vs. data.
Index isn’t always used, if more than 20% or so of rows, MySQL will use a full table scan. There’s usually a range where MySQL will choose a full table scan when an index is more appropriate, or vice versa, so that’s when you’d use hints. Hey, nobody’s perfect!
think indexes — joining tables of non-trivial size Subqueries ( [NOT] EXISTS, [NOT] IN) in WHERE clause. Use index to avoid a sort, use “covering” indexes.
Establish the best set of multi-column indexes along with singular indexes.
Derived tables (subqueries in FROM cause) can’t use an index. VIEWs with UNION or GROUP BY also can’t use index — all these use TEMPTABLE view algorithm. (temp table created, and then reads from temp table).
Sorts can be improved by increasing memory (sort_buffer_size) or using an index.
Use procedures to:
- Avoid self joins
- Correlated updates (subqueries accessing same data)
Performance of SQL within a stored routine that dominates the performance. When SQL is tuned, optimize the routine using traditional techniques:
- only put what’s needed in a loop
- stop testing when you know the answer
- order tests by most likely first
Recursion:
- only allowed in procedures, not functions
- depth controlled by max_sp_recursion_depth
- iterative alternatives are almost always faster and scaleable
TRIGGERS non-trivial (12% at least) to even simplest trigger. No trigger should EVER contain expensive SQL, because they are done for each row.
Quest free software for MySQL — http://www.quest.com/mysql/
[edit] Notes by Ronald Bradford
Another one of the sessions at the MySQL Users Conference I attended was Tuning MySQL5 SQL and Stored Procedures by Guy Harrison from Quest Software. A global company with 6000+ customers.
Guy has written a number of Oracle Performance Books in the past. His work now is on the “Spotlight” product family - Database diagnostic tools converting data to graphical representations. For these products, MySQL 5 and InnoDB only is necessary, simply due to accessing the right internal information for presentation. There are Freeware MySQL product downloads.
In this presentation he stated, nothing he was talking about specifically was relatively new. He did make quite a funny comment, “He is now seeking refugee status in the MySQL Community”.
Guy is author of O’Reilly “MySQL Stored Procedure Programming” Book. I managed to get for free at the conference from the MySQL Quiz night, in addition to a shirt and cap for stumping a Guru.
His talk were on tools and techniques for tuning MySQL.
- Explain Command - reveals what the optimizer intends to do
- Explain Extended
mysql> explain extended select ...; mysql> show warnings \G Shows what the optimizer actually did. In this example, An IN was converted to EXISTS
There were 4 ways to provide optimizer hints.
1. STRAIGHT_JOIN
2. USE INDEX(…)
3. FORCE INDEX(…)
4. IGNORE INDEX(…)
In addition to the Show Query Log, there are Innodb specific commands, two in particular.
show status like 'innodb%' * innodb_buffer_pool_read_requests * innodb_data_read
Indexing and the optimizer
- In MySQL Index is the best tool to improve performance, however sometimes it’s better to access the entire table.
- Indexes generally effective when between 5% and 20% of rows are accessed.
- Subqueries need to be satisified by an index or performance will be quite inefficent.
- Overloading indexes with additional columns when key queries only use a few columns can enable improved performance.
Not all indexes are created equal. In the following examples, each advancement improved performance.
- No indexes ()
- Single Index (customer)
- multiple indexes (customer, product)
- concatenated indexes (customer + product)
- covering index (including required columns, customer+product+qty)
Examples of SQL that can’t benefit from Indexes.
- Derived tables - SELECT table in a from clause, creates a temporary table and will never get an index.
- Views with UNIONS/GROUP BY
A comment from the audience was that derived tables can be of a benefit to a correlated sub-query in specific examples.
Stored Procedures provided a mixed blessing for performance.
- Can improved perfomrance when high network overhead.
- Some improvement on parsing.
- Breaking up complex queries may provide benefits.
- SQL is highly optimized for SET operations.
- SP is not optimized for number crunching. Computionally not a fast language.
A written routine to calculate prime numbers provided the following performance (from most expensive to least) MySQL SP, Oracle SP,PHP,Perl,Java ,VB.NET ,C (gcc). This showed an example that was excessively inefficient. On the other hand, if the program is network dependent (e.g. access a million rows, perform some statisical aggretation). Comparatively the same between Java and SP locally, but much better in a remote host mode.
Performance of SQL in a SP will dominate overall performance. Where SQL is tuned, goto tried and proven traditional optimisation techniques.
- Optimize iterations
- Optimize Logic/Testing
- Avoid recursion
Loop Management
- Only perform necessary code within iterations
- LEAVE or CONTINUE when possible in loops
- Test the most likely IF/THEN statements first
- extract if comparisions duplicated to produce nested if’s (within reason)
Some guidelines for Triggers.
- Triggers will have a non-trival overhead for even the simplest trigger.
- Due to FOR EACH ROW only, don’t have expensive SQL in any trigger.
- Very carefully tune SQL in triggers.
- Empty trigger produced 12% overhead.