WL#4813: PERFORMANCE_SCHEMA Instrumenting StagesAffects: Benchmarks-3.0 — Status: Un-Assigned — Priority: Very HighWL#2360 "Performance Schema" mentions "stages". They're part of a hierarchy Transaction -> Statement -> Stage -> Wait. WL#2360 has a section "[ NESTING-EVENT SUMMARIES ]" which describes the appropriate columns of a nesting event. It's unnecessary to repeat them here. The main concern is; "what is a stage". By definition a stage is a process that takes place during statement processing (it doesn't span a statement), and examples that we've talked about include "parse" and "filesort". We should be more exact. Stages in mysql.backup_progress, show profiles ---------------------------------------------- An example of 'stages' in mysql-6.0-backup: > > SELECT * FROM mysql.backup_progress; +---------+--------------+--------------+--------+----------+--------+---------+ | backup_ | start_time | stop_time | total_ | progress | error_ | notes | | id | | | bytes | | num | | +---------+--------------+--------------+--------+----------+--------+---------+ | 279 | ... 00:00:00 | ... 00:00:00 | 0 | 0 | 0 | starting| | 279 | ... 00:00:00 | ... 00:00:00 | 0 | 0 | 0 | running | | 279 | ... 00:00:00 | ... 00:00:00 | 0 | 0 | 0 | validity| | 279 | ... 22:14:46 | ... 00:00:00 | 0 | 0 | 0 | vp time | | 279 | ... 00:00:00 | ... 00:00:00 | 0 | 0 | 0 | running | | 279 | ... 00:00:00 | ... 00:00:00 | 0 | 0 | 0 | complete| +---------+--------------+--------------+--------+----------+--------+-- ------+ 6 rows in set (2.16 sec) An example of 'stages' in mysql-6.0 with the profiling patch: mysql> show profile cpu for query 2; +--------------------+----------+----------+------------+ | Status | Duration | CPU_user | CPU_system | +--------------------+----------+----------+------------+ | starting | 0.000337 | 0.000000 | 0.000000 | | Opening tables | 0.000200 | 0.000000 | 0.000000 | | System lock | 0.000118 | 0.000000 | 0.000000 | | Table lock | 0.000089 | 0.000000 | 0.000000 | | init | 0.000193 | 0.000000 | 0.000000 | | end | 0.000076 | 0.000000 | 0.000000 | | query end | 0.000072 | 0.000000 | 0.000000 | | freeing items | 0.000221 | 0.000000 | 0.000000 | | logging slow query | 0.000072 | 0.000000 | 0.000000 | | cleaning up | 0.000111 | 0.000000 | 0.000000 | +--------------------+----------+----------+------------+ 10 rows in set (0.01 sec) The "Status" values are I guess as in the MySQL Reference Manual "7.5.5.2. General Thread States" "7.5.5.3. Delayed-Insert Thread States" "7.5.5.4. Replication Master Thread States" "7.5.5.5. Replication Slave I/O Thread States" "7.5.5.6. Replication Slave SQL Thread States" "7.5.5.7. Replication Slave Connection Thread States" "7.5.5.8. Event Scheduler Thread States" starting at http://mysql.cs.pu.edu.tw/doc/refman/6.0/en/general-thread-states.html and SHOW PROCESSLIST is similar. There's no decision yet whether we should keep the current categories. If they appear to be the product of careful thought, okay. Description of stages in another product can be seen by clicking on 'Progress' and looking at changes for 2009-04-13. It's interesting that some of the above "stages" implementations use getrusage() or equivalent. We considered getrusage to be of limited use for short WAIT instrumenting, but for longer operations we have to consider it. Peter Gulutzan thinks the overhead is too great and would prefer to only have getrusage() statistics for statements not stages. Sergei Golubchik suggested we should have whatever SHOW PROFILE has. See dev-private email thread "Re: BUG#28288 Query Profiler not available in MySQL 5.1 and 6.0", [mysql internal address]/secure/mailarchive/mail.php?folder=4&mail=25820 That correspondence mentions that Peter would like to deprecate SHOW PROFILE, perhaps as part of this task. Requirements ------------ Stages "should" be sequential. That is: * one stage should not be nested within another stage * one stage should not overlap with another stage. But we must accept that nesting and overlapping will happen. Keeping a stack is possible, but not for all the thousands of aggregator bytes. The best we can hope for is to have: Row#1 First part of Event A Row#2 All of Event B Row#3 Second part of Event A For this, we need to recover the fact that we are resuming Event A when Event B ends. So, when starting Event B, we need to save that we were in Event A. And when ending Event B, we recover that fact. And in Row#3 there is an indication that "this is Part 2" or "Continued". Tables in PERFORMANCE_SCHEMA ---------------------------- The name "stages" is uncertain. Other possible names are STATE (which MySQL uses already) or SEGMENT or STEP or DATABASE CALL, or [ fill in your own favourite here ]. So the main tables are EVENTS_STAGES_CURRENT equivalent to EVENTS_WAITS_CURRENT EVENTS_STAGES_HISTORY equivalent to EVENTS_WAITS_HISTORY and the column descriptions are the same as in EVENTS tables, plus summary columns (see WL#4816 section "[ NESTING-EVENT SUMMARIES ]"). So why have a different set of tables for stages? One thought was: their content is backed up by a different set of internal buffers. Another thought was: they obviously aren't WAITS. Indeed, they're obviously not EVENTS either, but we're sticking to the word EVENTS for all the tables with instrumentation data. EVENTS_STAGES_SUMMARY_BY_THREAD_BY_GENUS is the summaries equivalent for STAGES equivalent to EVENTS_WAITS_SUMMARY_BY_THREAD_BY_GENUS. There are also of course the nesting-event summaries. We should consider at least two alternative ways to fill these tables: (1) At start / During Read / At End. When the stage begins, save all the summaries data from the WAITS equivalent table, for example "COUNT_IO = 555" Whenever somebody reads EVENTS_STAGES_SUMMARY_BY_THREAD_BY_GENUS, get the most recent summaries data from the WAITS equivalent table, for example "COUNT_IO = 666", subtract what the value was at the start, and return the difference (666 - 555) "COUNT_IO = 111" to the reader. When the stage ends, do the same difference calculation, it's guaranteed to be the total number of IO waits done by the thread (though not by any sub-threads) while the stage was going on. It works because waits are atomic and nested within stages. (2) Continuously. Whenever a WAIT event takes place, add to the counters for the NESTING_EVENT_ID. This method probably has higher overhead if there are many waits in a stage. But when we get to higher points in the nesting hierarchy, e.g. statements, the trick of alternative way (1) might become impossible so the code for it isn't reusable. (The problem is that statements can contain statements.) References ---------- dev-private email thread "Tables for showing what stage a statement has reached" starting with [mysql internal address]/secure/mailarchive/mail.php?folder=4&mail=21981 WL#1023 Check if we can get CPU time used by thread (1023) WL#4116 Online Backup: Record Online Backup Progress WL#4170 Provide progress indication for long queries and operations Blog article Show profile + Information_schema.profiling http://blogs.mysql.com/peterg/2008/11/06/show-profile-information_schemaprofiling/ No Comments yet |
VotesWatches0 members are watching this worklog
You must be logged in to track this worklog.
Provide Feedback
You must be logged in to comment
|