MySQL Internals Locking Overview
[edit] Introduction: Concurrency and isolation control facilities in MySQL
- Meta-data locks: implemented inside the table cache. Provide isolation for Data Definition Language (DDL) operations. A specific type of a metadata lock is called Name Lock. (layer: SQL)
- Table-level data locks (layer: SQL)
- Engine-specific mechanisms - row locks, page locks, table locks, versioning (implemented inside engines)
- The global read lock - necessary to support FLUSH TABLES WITH READ LOCK (layer: SQL)
[edit] Life cycle of a table during statement execution
A typical execution flow for a Data Manipulation Language (DML) statement:
- compute the set of all tables used in the statement
- For each table:
- "open" the table - get a TABLE object from the table cache and acquire a meta-data lock on the table
- Wait if GLOBAL READ LOCK and the statement changes data
- For each table:
- "lock" the table - acquire a table-level data lock on the table
- execute the statement:
- calls handler::write_row()/read_rnd()/read_index(), etc.
- engine-level locking facilities are invoked implicitly
- For each table:
- release the data lock on the table
- For each table:
- release the DDL lock on the table and put the table back into the table cache
There is no typical execution scheme for DDL statements.
[edit] Acquiring meta-data locks
- Meta-data locks are implemented as a property of a TABLE object, which represents an entry of the table cache
- A metadata lock can be either:
- shared - taken implicitly just by marking a TABLE object as "used"
- semi-exclusive, also called "name lock". Acquired by RENAME on the source and destination tables.
- exclusive, also called "exclusive name lock". Acquired by CREATE TABLE ... SELECT on the destination table when it does not exist
[edit] The table cache
- is a HASH variable called open_cache
- TABLE objects are elements of the hash.
- operations with the hash are protected by LOCK_open mutex
[edit] The table cache: internal structure
- each physical table may be represented by multiple TABLE instances in the cache
- all instances of the same table are connected into a linked list
- each instance has a copy of the table cache version - instances whose saved version does not match the current version of the table cache, are considered old and removed from the cache
- the instances that are "in use" by some statement are marked as not available for other statements - this is the guts of meta data locking
- TABLE object that is in the cache normally has a valid handler instance attached to it
[edit] The table cache: operations
- main code in sql/sql_base.cc, sql/lock.cc, sql/table.h, sql/sql_table.cc
- main functions are open_table(), close_thread_tables(), close_cached_table(), lock_table_names()
- Actually a mix of concepts/objects used not only for caching or locking
- LOCK_open mutex is used for other operations (e.g. makes table creation on disk and in handler atomic)
- Typical operations (important from isolation PoV):
- open/close table (for a statement) – shared lock
- force and wait until all instances of the table are closed – exclusive lock (but not exactly!)
- Name-lock – special case of the latter when we don't have TABLE object at hand so use a special placeholder (table even may not exist!)
[edit] The table cache: locking multiple tables
- Uses a sort of try and back-off technique to avoid deadlocks (optimistic locking)
- A set of tricks/hacks for DDL operations e.g. to issue a lock upgrade or prevent DDL starvation
[edit] Infamous LOCK_open problem
LOCK_open mutex:
- protects table cache structures
- groups creation of .frm file for table and object in engine (also provides atomicity for renames)
- is taken for each statement that uses tables at least twice: during open_tables() and close_thread_tables()!
- Held during disk writes/reads and even syncs when performing DDL operations
It is one of the major MySQL concurrency bottlenecks. Please do not use this mutex!
[edit] Interesting example: ALTER TABLE
A simplified scheme of ALTER TABLE execution
- open and lock table with TL_WRITE_ALLOW_READ
- create an altered copy of the table with a temporary name
- force and wait until all instances of table are closed (lock upgrade!)
- swap the new and old versions
- drop the old version
This is a general case, there are optimized cases.
[edit] A debug trace for ALTER TABLE
Simplified debug trace for ALTER table
[edit] Another example: RENAME TABLE
- Obtain name-lock on source and destination names
(i.e. insert special TABLE placeholders into the table cache and wait until all instances of these tables are closed)
- Rename .frm files for table and call handler::rename_table() method
- Remove name-locks
Note that we hold LOCK_open during the whole second phase!
Simplified debug trace for RENAME TABLE
[edit] Table level locks
- Main code in sql/lock.cc and mysys/thr_lock.cc
- mysql_lock/unlock_tables() (SQL-layer operations) and thr_multi_lock()/thr_lock() (lock-compatibility logic) ...
- Table should be open to be locked
- lock objects are associated with handler (minor)
- engines can adjust lock type, e.g. innodb/bdb, and actual number of objects to be locked, e.g. merge/partition (see handler::store_lock() method).
- Uses lock hierarchy for deadlock avoidance
- all tables should be locked at once
- if engine adjusts locks deadlocks are its own problem
- Tables can be unlocked earlier in some cases
[edit] A few words about pre-locking
- Deadlock avoidance scheme historically utilized for table-level data locks demands that we lock all the tables used in a statement at once
- So for statements that use functions/triggers we have to open all the tables used (directly or indirectly) and lock them (we build a transitive closure of used tables for this)
- To achieve efficiency we mix layers and access to some parser/statement context from modules that mostly deal with tables otherwise (which is bad)
[edit] Global read lock
- Implements support for FLUSH TABLES WITH READ LOCK (used for backup purproses)
- Prevents DDL and DML from execution
- is advisory: each DDL/DML statement statement checks if there is a pending global read lock and stops if there is any
- by calling wait_if_global_read_lock() directly (in this case we will set a protection from global read lock, and have to call start_waiting_global_read_lock() to remove this protection, usually there are no open tables in this case)
- or via mysql_lock_tables() (in the latter case we also reopen tables)
- Thread that does FLUSH TABLES WITH READ sets global read lock flag and initiates a FLUSH TABLES statement. Then it waits until all tables are flushed.