Category: MySQLDevelopment

MySQL Internals Binary Log

← Back to MySQL Internals overview page

Contents

[edit] The Binary Log

This chapter describes the MySQL binary log, which contains information about data modifications made to a MySQL server instance.

Most of the information here applies equally to the relay log used on replication slave servers because a relay log has the same format as a binary log.

[edit] Binary Log Overview

[Some information information in this section is derived from The Binary Log, in the MySQL Reference Manual.]

The binary log is a set of log files that contain information about data modifications made to a MySQL server instance. The log is enabled by starting the server with the --log-bin option.

The binary log was introduced in MySQL 3.23.14. It contains all statements that update data. It also contains statements that potentially could have updated it (for example, a DELETE which matched no rows), unless row-based logging is used. Statements are stored in the form of "events" that describe the modifications. The binary log also contains information about how long each statement took that updated data.

The binary log also contains some other metadata, including:

The binary log is a trace of changes of the server's global state generated during its operation. The events that it contains describe changes of this state. More precisely, binary log events describe actions that can be used to reproduce the same changes of global state which have happened on server.

The binary log has two important purposes:

There are two types of binary logging:

Mixed logging uses statement-based logging by default but switches to row-based logging automatically as necessary.

Row-based (and mixed) logging is available beginning with MySQL 5.1.

For more details about row-based logging, see #Row-Based Binary Logging.

The mysqlbinlog utility can be used to print binary or relay log contents in readable form.

[edit] High-Level Binary Log Structure and Contents

The binary log is a set of files that contain information about data modifications made to a MySQL server instance.

The details about event structure have changed over time, which gives rise to different versions of the binary log format. Currently, there are three binary log format versions, described in #Binary Log Versions.

The log files are sequentially numbered using a .NNNNNN suffix. The index file has a suffix of .index. All files share a common basename. The default binary log file-naming basename is "HOSTNAME-bin". With the default basename, the binary log has files with names like this:

...
HOSTNAME-bin.0000101
HOSTNAME-bin.0000102
HOSTNAME-bin.0000103
...
HOSTNAME-bin.index

Relay log file naming is similar to that of the binary log files. The default relay log file-naming basename is "HOSTNAME-relay". With the default basename, the relay log has files with names like this:

...
HOSTNAME-relay.0000101
HOSTNAME-relay.0000102
HOSTNAME-relay.0000103
...
HOSTNAME-relay.index

[edit] Source Files Related to the Binary Log

This section describes the files in a MySQL source tree that are most relevent to binary log processing.

sql directory:

client directory:

[edit] Source File Archaeological Notes

log_event_old.h/log_event_old.cc (present in MySQL 5.1.18 and up): Classes to read and execute old versions of the row log events:

Prior to MySQL 5.1.17, those classes were known as:

For information about the relationship of the Xxx_rows_log_event and Xxx_row_log_event_old classes, see #Event Classes and Types.

[edit] Generating Browsable Binary Log Information

Source files in the sql directory of a MySQL source tree or source distribution contain comments that can be processed with doxygen to generate HTML files that describe classes, files, and so forth. Those files can be viewed in your Web browser.

To generate the HTML files and view information about the classes related to binary logging, do this:

  1. In your MySQL source tree, change location to the sql directory:
    shell> cd sql
  2. Run doxygen to generate the HTML files. These commands create a subdirectory named html containing the HTML output:
    shell> doxygen -g
    shell> doxygen
  3. To view the top-level index page, load the html/index.html file into your browser.
  4. To view the classes for binary logging, load the html/class_log__event.html page. The Log_event class is the main event class, and the others inherit from Log_event.

The pages also contain links that take you to other related pages. For example, to navigate from index.html to class_log__event.html, click on the Classes tab. On the next page, scroll down to Log_event and click on it.

[edit] Event Classes and Types

Internally, the server uses C++ classes to represent binary log events. Prototypes are in log_event.h. Code for methods of these classes is in log_event.cc.

Log_event is the base class. Other more specific event subclasses are derived from it. Type codes are associated with subclasses because class instance contents are written to binary or relay logs or are sent over the network from master to slave. In those contexts, an event is just a sequence of bytes, not a class structure, so a type code is needed to allow recognition of the event type from the byte sequence.

An event byte sequence has a header part and a data part. The type code appears in the header part of each event.

The possible type codes for events are listed in the Log_event_type enumeration:

enum Log_event_type
{
  UNKNOWN_EVENT= 0, START_EVENT_V3= 1, QUERY_EVENT= 2, STOP_EVENT= 3,
  ROTATE_EVENT= 4, INTVAR_EVENT= 5, LOAD_EVENT= 6, SLAVE_EVENT= 7,
  CREATE_FILE_EVENT= 8, APPEND_BLOCK_EVENT= 9, EXEC_LOAD_EVENT= 10,
  DELETE_FILE_EVENT= 11, NEW_LOAD_EVENT= 12, RAND_EVENT= 13,
  USER_VAR_EVENT= 14, FORMAT_DESCRIPTION_EVENT= 15, XID_EVENT= 16,
  BEGIN_LOAD_QUERY_EVENT= 17, EXECUTE_LOAD_QUERY_EVENT= 18,
  TABLE_MAP_EVENT = 19, PRE_GA_WRITE_ROWS_EVENT = 20,
  PRE_GA_UPDATE_ROWS_EVENT = 21, PRE_GA_DELETE_ROWS_EVENT = 22,
  WRITE_ROWS_EVENT = 23, UPDATE_ROWS_EVENT = 24,
  DELETE_ROWS_EVENT = 25, INCIDENT_EVENT= 26,
  HEARTBEAT_LOG_EVENT= 27,
  ENUM_END_EVENT /* end marker */
};

The INTVAR_EVENT type has "subtypes," listed in the Int_event_type enumeration:

enum Int_event_type
{
  INVALID_INT_EVENT = LAST_INSERT_ID_EVENT = INSERT_ID_EVENT = 2
};

The following table summarizes the relationship between event classes and type codes. Each class is derived from Log_event unless otherwise indicated. As can be seen, an event class is associated with a single type code in most cases, although there are some exceptions:

Value Type Code Class
Log_event
  • Base class for most other classes
Muted_query_log_event
  • Added in 5.0.23
  • Removed in 6.0.4
Rows_log_event
  • Added in 5.1.5
  • Base class for Write_rows_log_event, Update_rows_log_event, Delete_rows_log_event
Old_rows_log_event
  • Added in 5.1.22
  • Base class for Write_rows_log_event_old, Update_rows_log_event_old, Delete_rows_log_event_old
0 UNKNOWN_EVENT Unknown_log_event
1 START_EVENT_V3 Start_log_event_v3
  • Renamed from START_EVENT/Start_log_event in 5.0.0
  • Base class for Format_description_log_event
2 QUERY_EVENT Query_log_event
  • Base class for Execute_load_query_log_event
3 STOP_EVENT Stop_log_event
4 ROTATE_EVENT Rotate_log_event
5 INTVAR_EVENT Intvar_log_event
6 LOAD_EVENT Load_log_event
  • Base class for Create_file_log_event
7 SLAVE_EVENT Slave_log_event
  • Added in 4.0.0
8 CREATE_FILE_EVENT Create_file_log_event
  • Added in 4.0.0
  • Derived from Load_log_event
9 APPEND_BLOCK_EVENT Append_block_log_event
  • Added in 4.0.0
  • Base class for Begin_load_query_log_event
10 EXEC_LOAD_EVENT Execute_load_log_event
  • Added in 4.0.0
11 DELETE_FILE_EVENT Delete_file_log_event
  • Added in 4.0.0
12 NEW_LOAD_EVENT Load_log_event
  • Added in 4.0.0
13 RAND_EVENT Rand_log_event
  • Added in 4.0.5
14 USER_VAR_EVENT User_var_log_event
  • Added in 4.1.0
15 FORMAT_DESCRIPTION_EVENT Format_description_log_event
  • Added in 5.0.0
  • Derived from Start_log_event_v3
16 XID_EVENT Xid_log_event
  • Added in 5.0.3
17 BEGIN_LOAD_QUERY_EVENT Begin_load_query_log_event
  • Added in 5.0.3
  • Derived from Append_block_log_event
18 EXECUTE_LOAD_QUERY_EVENT Execute_load_query_log_event
  • Added in 5.0.3
  • Derived from Query_log_event
19 TABLE_MAP_EVENT Table_map_log_event
  • Added in 5.1.5
20 PRE_GA_WRITE_ROWS_EVENT Write_rows_log_event_old
  • Added in 5.1.5 as WRITE_ROWS_EVENT/Write_rows_log_event and derived from Rows_log_event
  • Renamed in 5.1.18 to PRE_GA_WRITE_ROWS_EVENT/Write_rows_log_event_old and derived from Write_rows_log_event
  • As of 5.1.22, derived from Old_rows_log_event
21 PRE_GA_UPDATE_ROWS_EVENT Update_rows_log_event_old
  • Added in 5.1.5 as UPDATE_ROWS_EVENT/Update_rows_log_event and derived from Rows_log_event
  • Renamed in 5.1.18 to PRE_GA_UPDATE_ROWS_EVENT/Update_rows_log_event_old and derived from Update_rows_log_event
  • As of 5.1.22, derived from Old_rows_log_event
22 PRE_GA_DELETE_ROWS_EVENT Delete_rows_log_event_old
  • Added in 5.1.5 as DELETE_ROWS_EVENT/Delete_rows_log_event and derived from Rows_log_event
  • Renamed in 5.1.18 to PRE_GA_DELETE_ROWS_EVENT/Delete_rows_log_event_old and derived from Delete_rows_log_event
  • As of 5.1.22, derived from Old_rows_log_event
23 WRITE_ROWS_EVENT Write_rows_log_event
  • Derived from Rows_log_event
  • Renumbered in 5.1.18 from 20 to 23
24 UPDATE_ROWS_EVENT Update_rows_log_event
  • Derived from Rows_log_event
  • Renumbered in 5.1.18 from 21 to 24
25 DELETE_ROWS_EVENT Delete_rows_log_event
  • Derived from Rows_log_event
  • Renumbered in 5.1.18 from 22 to 25
26 INCIDENT_EVENT Incident_log_event
  • Added in 5.1.18
27 HEARTBEAT_LOG_EVENT Heartbeat_log_event
  • Added in 6.0.5

[edit] Event Class Archaeological Notes

Despite the "V3" in the type code name, START_EVENT_V3 currently is used as the type code not only for v3 start events, but also for v1 start events. The original symbol for type code 1 was START_EVENT in the format now known as v1. Later, when v3 was developed, type code 1 was reused and the symbol associated with it was renamed from START_EVENT to START_EVENT_V3. The start events for both v1 and v3 therefore have a type code of 1, although the event structures differ and must be distinguished by examining their contents.

Up to MySQL 5.1.17, event type codes from 20 to 22 were associated with symbols and classes as follows:

Value Type Code Class
20 WRITE_ROWS_EVENT Write_rows_log_event
21 UPDATE_ROWS_EVENT Update_rows_log_event
22 DELETE_ROWS_EVENT Delete_rows_log_event

In 5.1.18, the symbols and classes were renamed:

Value Type Code Class
20 PRE_GA_WRITE_ROWS_EVENT Write_rows_log_event_old
21 PRE_GA_UPDATE_ROWS_EVENT Update_rows_log_event_old
22 PRE_GA_DELETE_ROWS_EVENT Delete_rows_log_event_old

Also in 5.1.18, the original symbols were reused with different values and new implementations of the classes that used the original names:

Value Type Code Class
23 WRITE_ROWS_EVENT Write_rows_log_event
24 UPDATE_ROWS_EVENT Update_rows_log_event
25 DELETE_ROWS_EVENT Delete_rows_log_event

Events with type codes 20 to 22 are obsolete now and appear only in binary logs created by servers from MySQL 5.1.5 to 5.1.17.

[edit] Event Meanings

The following descriptions briefly summarize the meaning of each event type:

[edit] Event Structure

This section describes the general properties of events as byte sequences as they are written to binary or relay log files.

All events have a common general structure consisting of an event header followed by event data:

+===================+
|   event header    |
+===================+
|    event data     |
+===================+

The details about what goes in the header and data parts have changed over time, which gives rise to different versions of the binary log format:

A v2 format was used briefly (in early MySQL 4.0.x versions), but it is obsolete and no longer supported.

Some details of event structure are invariant across binary log versions; others depend on the version. Within any given version, different types of events vary in the structure of the data part.

The first event in a log file is special. It is a descriptor event that provides information such as the binary log version and the server version. The information in the descriptor event enables programs to determine which version of the binary log format applies to the file so that the remaining events in the file can be properly read and interpreted.

For details about the initial descriptor event and how to use it to determine the format of a binary log file, see #Binary Log Versions. For additional information about other types of events, see #Event Data for Specific Event Types.

The following event diagrams contain field descriptions written using these conventions:

The overall structure for events in the different binary versions is shown here. The following sections describe the header and data parts in more detail.

v1 event structure:

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
+=====================================+
| event  | fixed part       13 : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+

header length = 13 bytes
data length = (event_length - 13) bytes

y is specific to the event type.

v3 event structure:

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
+=====================================+
| event  | fixed part       19 : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+

header length = 19 bytes
data length = (event_length - 19) bytes

y is specific to the event type.

v4 event structure:

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    |
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    |
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
|        +----------------------------+
|        | extra_headers    19 : x-19 |
+=====================================+
| event  | fixed part        x : y    |
| data   +----------------------------+
|        | variable part              |
+=====================================+

header length = x bytes
data length = (event_length - x) bytes
fixed data length = y bytes variable data length = (event_length - (x + y)) bytes

x is given by the header_length field in the format description event (FDE). Currently, x is 19, so the extra_headers field is empty.

y is specific to the event type, and is given by the FDE. The fixed-part length is the same for all events of a given type, but may vary for different event types.

The fixed part of the event data is sometimes referred to as the "post-header" part. The variable part is sometimes referred to as the "payload" or "body."

For information about how to use the FDE to interpret v4 events, see #Binary Log Formats.

[edit] Event Content-Writing Conventions

Event contents are written using these conventions:

Some events use Packed Integers, a special format for efficient representation of unsigned integers. A Packed Integer has the capacity of storing up to 8-byte integers, while small integers still can use 1, 3, or 4 bytes. The value of the first byte determines how to read the number, according to the following table.

First byte Format
0-250 The first byte is the number (in the range 0-250). No additional bytes are used.
252 Two more bytes are used. The number is in the range 251-0xffff.
253 Three more bytes are used. The number is in the range 0xffff-0xffffff.
254 Eight more bytes are used. The number is in the range 0xffffff-0xffffffffffffffff.

Packed Integer format derives from the "Length Coded Binary" representation used in the MySQL client/server network protocol (see MySQL Client/Server Protocol). That representation allows a first byte value of 251 to represent the SQL NULL value, but 251 is apparently unused for Packed Integers in the binary log.

[edit] Event Header Fields

Each event starts with a header of size LOG_EVENT_HEADER_LEN. The value of this constant is 13 in MySQL 3.23 (v1 format), and 19 in MySQL 4.0 and up (v3 format and up). The value is larger as of 4.0 because next position and flags fields were added to the header format then:

The header for any version is a superset of the header for all earlier versions:

Because the event header in a newer binary log format starts with the header of the old formats, headers in different formats are backward compatible.

v1 event header:

+============================+
| timestamp         0 : 4    |
+----------------------------+
| type_code         4 : 1    |
+----------------------------+
| server_id         5 : 4    |
+----------------------------+
| event_length      9 : 4    |
+============================+

The 13 bytes of the v1 header are also present in the header of all subsequent binary log versions.

v3 event header:

+============================+
| timestamp         0 : 4    |
+----------------------------+
| type_code         4 : 1    |
+----------------------------+
| server_id         5 : 4    |
+----------------------------+
| event_length      9 : 4    |
+----------------------------+
| next_position    13 : 4    |
+----------------------------+
| flags            17 : 2    |
+============================+

Compared to v1, the header in v3 and up contains two additional fields, for a total of 19 bytes.

v4 event header:

+============================+
| timestamp         0 : 4    |
+----------------------------+
| type_code         4 : 1    |
+----------------------------+
| server_id         5 : 4    |
+----------------------------+
| event_length      9 : 4    |
+----------------------------+
| next_position    13 : 4    |
+----------------------------+
| flags            17 : 2    |
+----------------------------+
| extra_headers    19 : x-19 |
+============================+

The v4 format includes an extra_headers field; this is a mechanism for adding extra fields to the header without breaking the format. This extension mechanism is implemented via the format description event that appears as the first event in the file. (See #Binary Log Versions for details.) Currently, x = 19, so the extra_headers field is empty; thus, the v4 header is the same as the v3 header.

Note: The extra_headers field does not appear in the FORMAT_DESCRIPTION_EVENT or ROTATE_EVENT header.

The offsets of several fields within the event header are available as constants in log_event.h:

The header fields contain the following information:

4 bytes. This is the time at which the statement began executing. It is represented as the number of seconds since 1970 (UTC), like the TIMESTAMP SQL data type.

1 byte. The type of event. 1 means START_EVENT_V3, 2 means QUERY_EVENT, and so forth. These numbers are defined in the enum Log_event_type enumeration in log_event.h. (See #Event Classes and Types.)

4 bytes. The ID of the mysqld server that originally created the event. It comes from the server-id option that is set in the server configuration file for the purpose of replication. The server ID enables endless loops to be avoided when circular replication is used (with option --log-slave-updates on). Suppose that M1, M2, and M3 have server ID values of 1, 2, and 3, and that they are replicating in circular fashion: M1 is the master for M2, M2 is the master for M3, and M3 is that master for M1. The master/server relationships look like this:

M1---->M2
 ^      |
 |      |
 +--M3<-+
 

A client sends an INSERT statement to M1. This is executed on M1 and written to its binary log with an event server ID of 1. The event is sent to M2, which executes it and writes it to its binary log; the event is still written with server ID 1 because that is the ID of the server that originally created the event. The event is sent to M3, which executes it and writes it to its binary log, still with server ID 1. Finally, the event is sent to M1, which sees server ID = 1 and understands this event originated from itself and therefore must be ignored.

4 bytes. The total size of this event. This includes both the header and data parts. Most events are less than 1000 bytes, except when using LOAD DATA INFILE (where events contain the loaded file, so they can be big).

4 bytes. The position of the next event in the master's binary log, as returned by tell(). In a binary log, this is just the position of the next event in the file. In a relay log, this is the position of the next event in the master's binlog. In other words, it is the position in the binary log where the next event was originally created. The value is copied as-is to the relay log. It is used on the slave, for SHOW SLAVE STATUS to be able to show coordinates of the last executed event in the master's coordinate system. If this value were not stored in the event, we could not know these coordinates because the slave cannot invoke tell() for the master's binary log.

In a binary log file, an identity should hold for every event: The position of the event in the file plus the value in the event length field should equal the value in the next position field. In a relay log file, this will not generally be true because events need not be written at the same position as in the original binary log file.

2 bytes. The possible flag values are described at #Event Flags.

Variable-sized. The size of this field is determined by the format description event that occurs as the first event in the file. Currently, the size is 0, so, in effect, this field never actually occurs in any event. At such time as the size becomes non-zero, this field still will not appear in events of type FORMAT_DESCRIPTION_EVENT or ROTATE_EVENT.

[edit] Event Flags

Event headers for v3 format and up contain event flags in the two flag bytes at position FLAGS_OFFSET = 17. There are comments about these flags in log_event.h, in addition to the remarks in this section.

Current event flags:

<thread id 1>
CREATE TEMPORARY TABLE t (a INT);
<thread id 2>
CREATE TEMPORARY TABLE t (a INT);

In this case, simply feeding this into mysql will lead to a "table t already exists" error. This is why events that use temporary tables are marked with the flag, so that mysqlbinlog knows it has to set the pseudo_thread_id system variable before, like this:

SET PSEUDO_THREAD_ID=1;
CREATE TEMPORARY TABLE t (a INT);
SET PSEUDO_THREAD_ID=2;
CREATE TEMPORARY TABLE t (a INT);

This way there is no confusion for the server that receives these statements. Always printing SET PSEUDO_THREAD_ID, even when temporary tables are not used, would cause no bug, it would just slow down.

Obsolete event flags:

They are now commented out in log_event.h and their values are available for reuse or have already been reused. (But see the associated comments in log_event.h for various cautions!)

[edit] Event Data Fields (Event-Specific Information)

The structure of an event's data part depends on the event type:

The data part of an event consists of a fixed-size part and a variable-size part. Either or both parts may be empty, depending on the event type. (For example, a STOP_EVENT consists only of the header part; the fixed and variable data parts are both empty.)

The size of the event data part is the event size (contained in the header) minus the header size. The size of the fixed data part is a function of the event type. The size of the variable data part is the event size minus the size of the header, minus the size of the fixed data part.

The following principles hold across all events in a binary log file:

For details about the fixed and variable parts of event data for different events, see #Event Data for Specific Event Types.

[edit] Event Data Field Notational Caveat

The fixed part of the event data goes under different names, depending on which source file, work log, bug report, etc. you are reading:

The variable part of event data also goes under different names, such as the event "payload" or "body."

[edit] Binary Log Versions

There are several versions of the binary log file format:

A v2 format was used briefly (in early MySQL 4.0.x versions), but it is obsolete and no longer supported.

Programs that process the binary log must be able to account for each of the supported binary log formats. This section describes how the server distinguishes each format to identify which one a binary log file uses. mysqlbinlog uses the same principles.

Important constants:

A binary log file begins with a 4-byte magic number followed by an initial descriptor event that identifies the format of the file.

Elsewhere you may see either term used generically to refer collectively to both types of event. This discussion uses "descriptor event" as the collective term.

The header and data parts of the descriptor event for each binary log format version are shown following. The diagrams use the same conventions as those described earlier in #Event Structure.

v1 start event (size = 69 bytes):

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    | = START_EVENT_V3 = 1
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    | = 69
+=====================================+
| event  | binlog_version   13 : 2    | = 1
| data   +----------------------------+
|        | server_version   15 : 50   |
|        +----------------------------+
|        | create_timestamp 65 : 4    |
+=====================================+

v3 start event (size = 75 bytes):

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    | = START_EVENT_V3 = 1
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    | = 75
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
+=====================================+
| event  | binlog_version   19 : 2    | = 3
| data   +----------------------------+
|        | server_version   21 : 50   |
|        +----------------------------+
|        | create_timestamp 71 : 4    |
+=====================================+

v4 format description event (size >= 91 bytes; the size is 76 + the number of event types):

+=====================================+
| event  | timestamp         0 : 4    |
| header +----------------------------+
|        | type_code         4 : 1    | = FORMAT_DESCRIPTION_EVENT = 15
|        +----------------------------+
|        | server_id         5 : 4    |
|        +----------------------------+
|        | event_length      9 : 4    | >= 91
|        +----------------------------+
|        | next_position    13 : 4    |
|        +----------------------------+
|        | flags            17 : 2    |
+=====================================+
| event  | binlog_version   19 : 2    | = 4
| data   +----------------------------+
|        | server_version   21 : 50   |
|        +----------------------------+
|        | create_timestamp 71 : 4    |
|        +----------------------------+
|        | header_length    75 : 1    |
|        +----------------------------+
|        | post-header      76 : n    | = array of n bytes, one byte per event
|        | lengths for all            |   type that the server knows about
|        | event types                |
+=====================================+

In all binary log versions, the event data for the descriptor event begins with a common set of fields

The binary log version number (1, 3, or 4).

The server version as a string.

The creation timestamp, if non-zero, is the time in seconds when this event was created; it indicates the moment when the binary log was created. This field is actually of no value: If non-zero, it is redundant because it has the same value that is in the header timestamp.

Note: In practice, the creation timestamp field should be considered reserved for future use and programs should not rely on its value. This field may be commandeered in the future to serve another purpose.

The v4 format descriptor event data contains two additional fields that enable interpretation of other types of events:

The length of the event header. This value includes the extra_headers field, so this header length - 19 yields the size of the extra_headers field.

Currently in v4, the header length (at offset 75) is 19, which means that in other events, no extra headers will follow the flags field. If in the future the header length is some value x > 19, then x-19 extra header bytes will appear in other events in the extra_headers field following the flags field.

Note: The FORMAT_DESCRIPTION_EVENT itself contains no extra_headers field. Suppose that the FDE did have a header_length field after the flags field. That would introduce this problem:

In other words, you would need to know x to find the header_length field, but you cannot know x until you read the header_length field. (A circular dependency.) This means that the event extensibility mechanism afforded by the FDE does not apply to the FDE itself, which therefore is not itself extensible.

The lengths for the fixed data part of each event. This is an array that provides post-header lengths for all events beginning with START_EVENT_V3 (type code 1). The array includes no length for UNKNOWN_EVENT (type code 0).

[edit] Determining the Binary Log Version

Given any binary log file, the information in this section describes how to determine the format in which it is written.

Some important points about descriptor event formats:

To determine the version of a binary log file, use the following procedure:

1) The file begins with a 4-byte magic number. Skip over that to get to the first event in the file (which in most cases is a start event or format description event).

2) From the first event, read two values:

3) If the type code is not START_EVENT_V3 or FORMAT_DESCRIPTION_EVENT, the file format is v3. (See Exceptional Condition 1 later in this section.)

4) If the type code is START_EVENT_V3 (1), check the event length. If the length is less than 75, the file format is v1, and v3 otherwise. Why the value 75? Because that is the length of a v3 start event:

Summing those lengths yields 19 + 2 + 50 + 4 = 75

Therefore, if the event is shorter than 75 bytes, it must be from a v1 file because that will have a shorter first event than a v3 file.

5) If the type code is FORMAT_DESCRIPTION_EVENT (15), the file format is v4.

The preceding steps describe the general binary log format-recognition principles. However, there are some exceptional conditions that must be accounted for:

Exceptional Condition 1: In MySQL 4.0 and 4.1, the initial event in a binary log file might not be a start event. This occurs because the server writes the start event only to the first binary log file that it creates after startup. For subsequent files, the server writes an event of type ROTATE_EVENT to the end of the current log file, closes it, and the begins the next file without writing a start event to it. If a log file begins with an event that is not START_EVENT_V3 or FORMAT_DESCRIPTION_EVENT, it can be assumed to be a v3 file because this behavior occurs only in MySQL 4.0 and 4.1, and all servers in those versions use v3 format.

Exceptional Condition 2: In MySQL 5.1 and 5.2, several early versions wrote binary log files using v4 format, but using different event numbers from those currently used in v4. Therefore, when the FDE is read and discovered to be v4, it is also necessary to read the server version, which is a string that occurs at position 21. If the version is one of those in the set of affected versions, event renumbering occurs such that events read from the file are mapped onto the current v4 event numbering.

[edit] Ensuring Compatibility of Future Binary Log Versions

To enable any future binary log formats to be correctly understood, the following conventions must hold:

a) The binary log file must start with a descriptor event
b) The descriptor event must start with a v3 header (19 bytes)
c) The 2 bytes following the header (at position 19) must contain the binary log format version number

With respect to the current formats, only a) holds for v1. However, as indicated earlier, v1-format files can be recognized from the initial event in the file, by a type code of START_EVENT_V3 and an event length less than 75.

The v4 format description event is designed so that it can handle future format updates. A new format with the same layout of event packets as in v4 but with additional fields in the header and post-headers can use this format description event to correctly describe itself. Actually, it is (theoretically) possible to have different "flavors" of v4 format that have different (larger) header lengths and even a different number of events.

The current code is written to handle this possibility. That is, any code that parses a binary log and discovers that it is v4 uses the header lengths as given by the format description event (thus potentially different lengths from the values hard-wired in the server code).

Note: Although headers of events in v4 format can be longer than 19 bytes, the format description event is an exception. Its header is always 19 bytes long to meet the preceding backward compatibility requirements. That is, the FORMAT_DESCRIPTION_EVENT does not include an extra_headers field.

[edit] Event Data for Specific Event Types

The following sections provide details about what appears in the fixed and variable parts of the event data for each event type.

LOAD DATA INFILE statements have been associated over time with several different events. The event contents are detailed in this section, and #LOAD DATA INFILE Events provides a historical perspective on which events were used when.

Start_log_event_v3/START_EVENT_V3

This event occurs at the beginning of v1 or v3 binary log files. See #Binary Log Versions for how it is used.

In MySQL 4.0 and 4.1, such events are written only to the first binary log file that mysqld creates after startup. Log files created subsequently (when someone issues a FLUSH LOGS statement or the current binary log file becomes too large) do not contain this event. In MySQL 5.0 and up, all binary log files start with a FORMAT_DESCRIPTION_EVENT.

Fixed data part:

Variable data part:

Query_log_event/QUERY_EVENT

Fixed data part:

Variable part:

Each entry in the status variable block has a code and a value, where the value format is as indicated in the following list. The list provides basic information about each variable. For additional details, see log_event.h.

Table_map_for_update is used to evaluate the filter rules specified by --replicate-do-table / --replicate-ignore-table.

Stop_log_event/STOP_EVENT

A Stop_log_event is written under these circumstances:

Fixed data part:

Variable data part:

Rotate_log_event/ROTATE_EVENT

When a binary log file exceeds a size limit, a ROTATE_EVENT is written at the end of the file that points to the next file in the squence. This event is information for the slave to know the name of the next binary log it is going to receive.

ROTATE_EVENT is generated locally and written to the binary log on the master. It is written to the relay log on the slave when FLUSH LOGS occurs, and when receiving a ROTATE_EVENT from the master. In the latter case, there will be two rotate events in total originating on different servers.

There are conditions under which the terminating log-rotation event does not occur. For example, the server might crash.

Fixed data part:

Variable data part:

Intvar_log_event/INTVAR_EVENT

Fixed data part:

Variable data part:

Load_log_event/LOAD_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

LOAD_EVENT shares the Load_log_event class with NEW_LOAD_EVENT. The primary difference is that LOAD_EVENT allows only single-character field and line option values, whereas NEW_LOAD_EVENT allows multiple-character values. Also, LOAD_EVENT has no file ID or data block because with this event, the slave asks the master to transfer the data file at event execution time.

The format for this event is more complicated than for others, because it contains information about many LOAD DATA INFILE statement clauses.

Fixed data part:

Variable data part:

Note: Because this event allows only single-character field and line option values, presumably LOAD DATA INFILE statements will not replicate correctly if any such option contains multiple characters.

Slave_log_event/SLAVE_EVENT

This event is never written, so it cannot exist in a binary log file. It was meant for failsafe replication, which has never been implemented.

Create_file_log_event/CREATE_FILE_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

This event tells the slave to create a temporary file and fill it with a first data block. Later, zero or more APPEND_BLOCK_EVENT events append blocks to this temporary file. EXEC_LOAD_EVENT tells the slave to load the temporary file into the table, or DELETE_FILE_EVENT tells the slave not to do the load and to delete the temporary file. DELETE_FILE_EVENT occurs when the LOAD DATA failed on the master: On the master we start to write loaded blocks to the binary log before the end of the statement. If for some reason there is an error, we must tell the slave to abort the load.

The format for this event is more complicated than for others, because it contains information about many LOAD DATA INFILE statement clauses.

Fixed data part:

Variable data part:

Append_block_log_event/APPEND_BLOCK_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

Contains data to be written to the data file for a LOAD DATA INFILE statement.

Fixed data part:

Variable data part:

Execute_log_event/EXEC_LOAD_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

Indicates the end of a successful LOAD DATA INFILE statement and that the data file should be loaded.

Fixed data part:

Variable data part:

Delete_file_log_event/DELETE_FILE_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

Indicates the end of an unsuccessful LOAD DATA INFILE statement and that the data file should not be loaded.

Fixed data part:

Variable data part:

Load_log_event/NEW_LOAD_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

The format for this event is more complicated than for others, because it contains information about many LOAD DATA INFILE statement clauses.

LOAD_EVENT shares the Load_log_event class with NEW_LOAD_EVENT. The primary difference is that LOAD_EVENT allows only single-character field and line option values, whereas NEW_LOAD_EVENT allows multiple-character values. Each of these is encoded as a length followed by a string rather than as a single character. Because of that, NEW_LOAD_DATA does not have the flags byte that indicates whether each option is empty.

Fixed data part:

Variable data part:

There is no file ID or data block in the variable data part. The slave is supposed to request the file from the master in a separate connection.

Rand_log_event/RAND_EVENT

RAND() in MySQL generates a random number. A RAND_EVENT contains two seed values that set the rand_seed1 and rand_seed2 system variables that are used to compute the random number.

Fixed data part:

Variable data part:

User_var_log_event/USER_VAR_EVENT

Fixed data part:

Variable data part:

Format_description_log_event/FORMAT_DESCRIPTION_EVENT

This event occurs at the beginning of v4 binary log files. See #Binary Log Versions for how it is used.

In MySQL 5.0 and up, all binary log files start with a FORMAT_DESCRIPTION_EVENT, but there will be a way to distinguish between a FORMAT_DESCRIPTION_EVENT created at mysqld startup and other FORMAT_DESCRIPTION_EVENT instances; such a distinction is needed because the first category of FORMAT_DESCRIPTION_EVENT (which means the master has started) should trigger some cleaning tasks on the slave. (Suppose the master died brutally and restarted: the slave must delete old replicated temporary tables.)

Fixed data part:

Variable data part:


Xid_log_event/XID_EVENT

An XID event is generated for a commit of a transaction that modifies one or more tables of an XA-capable storage engine. Strictly speaking, Xid_log_event is used if thd->transaction.xid_state.xid.get_my_xid() returns non-zero.

Here is an example of how to generate an XID event (it occurs whether or not innodb_support_xa is enabled):

CREATE TABLE t1 (a INT) ENGINE = INNODB;
START TRANSACTION;
INSERT INTO t1 VALUES (1);
COMMIT;

Fixed data part:

Variable data part:

Note: Contrary to all other numeric fields, the XID transaction number is not always written in little-endian format. The bytes are copied unmodified from memory to disk, so the format is machine-dependent. Hence, when replicating from a little-endian to a big-endian machine (or vice versa), the numeric value of transaction numbers will differ. In particular, the output of mysqlbinlog differs. This should does not cause inconsistencies in replication because the only important property of transaction numbers is that different transactions have different numbers (relative order does not matter).

Begin_load_query_log_event/BEGIN_LOAD_QUERY_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

Fixed data part:

Variable data part:

Execute_load_query_log_event/EXECUTE_LOAD_QUERY_EVENT

This event is used for LOAD DATA INFILE statements. See also #LOAD DATA INFILE Events.

Indicates the end of a successful LOAD DATA INFILE statement and that the data file should be loaded. It is similar to QUERY_EVENT, but before executing the statement, it substitutes the original filename in the statement with the name of the slave-side temporary file. The first 13 bytes of the fixed data part are the same as for QUERY_EVENT, as is the initial status variable block in the variable data part. See the description of that event type for additional information.

Fixed data part:

Variable data part:

Table_map_log_event/TABLE_MAP_EVENT

Used for row-based binary logging beginning with MySQL 5.1.5.

Fixed data part:

Variable data part:

Write_rows_log_event_old/PRE_GA_WRITE_ROWS_EVENT

Used for row-based binary logging from MySQL 5.1.5 to 5.1.17, using the old implementation of Write_rows_log_event/WRITE_ROWS_EVENT. The structure is similar to that for the newer event.

Update_rows_log_event_old/PRE_GA_UPDATE_ROWS_EVENT

Used for row-based binary logging from MySQL 5.1.5 to 5.1.17, using the old implementation of Update_rows_log_event/UPDATE_ROWS_EVENT. The structure is similar to that for the newer event.

Delete_rows_log_event_old/PRE_GA_DELETE_ROWS_EVENT

Used for row-based binary logging from MySQL 5.1.5 to 5.1.17, using the old implementation of Delete_rows_log_event/DELETE_ROWS_EVENT. The structure is similar to that for the newer event.

Write_rows_log_event/WRITE_ROWS_EVENT

Used for row-based binary logging beginning with MySQL 5.1.18.

[TODO: following needs verification; it's guesswork]

Fixed data part:

Variable data part:

For each row, the following is done:

Update_rows_log_event/UPDATE_ROWS_EVENT

Used for row-based binary logging beginning with MySQL 5.1.18.

See the description for WRITE_ROWS_EVENT.

Delete_rows_log_event/DELETE_ROWS_EVENT

Used for row-based binary logging beginning with MySQL 5.1.18.

See the description for WRITE_ROWS_EVENT.

Incident_log_event/INCIDENT_EVENT

Fixed data part:

Variable data part:

Incident number codes are listed in rpl_constant.h. The only code currently used is INCIDENT_LOST_EVENTS, which indicates that there may be lost events (a "gap") in the replication stream that requires databases to be resynchronized.

Heartbeat_log_event/HEARTBEAT_LOG_EVENT

A Heartbeat_log_event is sent by a master to a slave to let the slave know that the master is still alive. Events of this type do not appear in the binary or relay logs. They are generated on a master server by the thread that dumps events and sent straight to the slave without ever being written to the binary log. The slave examines the event contents and then discards it without writing it to the relay log.

Fixed data part:

Variable data part:

Muted_query_log_event

This is a subclass of Query_log_event that is not written to the log. It is used as a means of flushing a transaction without logging an event.

This event class was added in MySQL 5.0.23 and removed in 6.0.4. It was a solution to Bug#16206 that became unnecessary with the fix for Bug#29020.

[edit] LOAD DATA INFILE Events

LOAD DATA INFILE is not written to the binary log like other statements. It is written as one or more events in a packed format, not as a cleartext statement in the binary log. The events indicate what options are present in the statement and how to process the data file.

Historically, there seem to have been at least four event sequences for representing LOAD DATA INFILE operations.

1) In MySQL 3.23, there was only one event: Load_log_event (type code LOAD_EVENT = 6). Load_log_event only contains the filename, not the file itself. When the slave sees a Load_log_event, it requests that the master send the file in a separate connection. This has the drawback that the binary log is not self-contained: If the file has been removed on the master or the slave cannot access the master, the file transfer fails.

2) In MySQL 4.0.0, the file contents were included in the binary log. Several new event types were introduced: Create_file_log_event (type code CREATE_FILE_EVENT = 8), Append_block_log_event (type code APPEND_BLOCK_EVENT = 9), Execute_load_log_event (type code EXEC_LOAD_EVENT = 10), and Delete_file_log_event (type code DELETE_FILE_EVENT = 11). The event sequence is:

The Create_file_log_event contains the options to LOAD DATA INFILE. This was a design flaw since the file cannot be loaded until the Exec_load_log_event is seen. To work around this, the slave, when executing the Create_file_log_event, wrote the Create_file_log_event to a temporary file. When the Execute_load_log_event was seen, this temporary file was read back so that the LOAD DATA INFILE statement could be constructed.

Append_block_log_event is used for files larger than a threshold. In this case, the file is split and the pieces are sent in separate events. The threshold is around 2^17 = 131072 bytes.

CREATE_FILE_EVENT tells the slave to create a temporary file and fill it with a first data block. Later, zero or more APPEND_BLOCK_EVENT events append blocks to this temporary file. EXEC_LOAD_EVENT tells the slave to load the temporary file into the table, or DELETE_FILE_EVENT tells the slave not to do the load and to delete the temporary file. DELETE_FILE_EVENT occurs when the LOAD DATA failed on the master: On the master we start to write loaded blocks to the binary log before the end of the statement. If for some reason there is an error, we must tell the slave to abort the load.

MySQL 4.0.0 also introduced the NEW_LOAD_EVENT = 12 type code.

If a slave reads a NEW_LOAD_EVENT from a binlog, it will use it as a LOAD_EVENT (but allowing longer separator names). Lost in the mysteries of time is the knowledge of whether there was ever a server version capable of writing NEW_LOAD_EVENT.

3) The original Load_log_event used one character for each of the delimiters (FIELDS TERMINATED BY, and so forth). At an unknown point in the version history, the format was modified to allow multiple-character strings as separators. This uses the same class, Load_log_event, but has the type code NEW_LOAD_EVENT = 12. This affects Create_file_log_event, since that inherits from <code class="literal">Load_log_event. So the new feature in Load_log_event allows Create_file_log_event to use multiple-character delimiters.

[A guess for when this occurred would be MySQL 4.0.0: That is when the sql_ex structure that holds the single-character field/line options was renamed to old_sql_ex and a new sql_ex structure was created that allows multiple-character values.]

4) In 5.0.3, the event sequence was changed again with the addition of two new event types: Begin_load_query_log_event (type code BEGIN_LOAD_QUERY_EVENT = 17) and Execute_load_query_log_event (type code EXECUTE_LOAD_QUERY_EVENT = 18). The event sequence is:

With the new sequence, information about the options to LOAD DATA INFILE is moved from the first event to the last event. Consequently, Begin_load_query_log_event is almost the same as Append_file_log_event (it contains only file data), whereas Execute_load_query_log_event contains the text of the LOAD DATA INFILE statement. The revised event sequence fixes the design flaw in the 4.0 format.

Also, the temp file that stores the parameters to LOAD DATA INFILE is not needed anymore. There is still a temp file containing all the data to be loaded.

Here is a concrete example (it applies to MySQL 4.0 and 4.1):

On the master we have a file named /m/tmp/u.txt that contains these lines:

>1,2,3
>4,5,6
>7,8,9
>10,11,12

And we issue this statement on the master while the default database is test:

load data infile '/m/tmp/u.txt' replace into table x fields
terminated by ',' optionally enclosed by '"' escaped by '\\'
lines starting by '>' terminated by '\n' ignore 2 lines (a,b,c);

Then in the master's binary log we have this CREATE_FILE_EVENT (hexadecimal dump):

00000180:                     db4f 153f 0801 0000  .........O.?....
00000190: 006f 0000 0088 0100 0000 0004 0000 0000  .o..............
000001a0: 0000 0002 0000 0001 0403 0000 0003 0000  ................
000001b0: 0001 2c01 2201 0a01 3e01 5c06 0101 0161  ..,."...>.\....a
000001c0: 0062 0063 0078 0074 6573 7400 2f6d 2f74  .b.c.x.test./m/t
000001d0: 6d70 2f75 2e74 7874 003e 312c 322c 330a  mp/u.txt.>1,2,3.
000001e0: 3e34 2c35 2c36 0a3e 372c 382c 390a 3e31  >4,5,6.>7,8,9.>1
000001f0: 302c 3131 2c31 32db 4f15 3f0a 0100 0000  0,11,12.O.?.....
00000200: 1700 0000 f701 0000 0000 0300 0000       ..............

The next byte in the file is the beginning of the EXEC_LOAD_EVENT event.

[edit] Row-Based Binary Logging

Originally, the binary log was written using statement-based logging. Row-based logging was added in MySQL 5.1.5.

Several event types are used specific to row-based logging:

It's common to refer to "row-based logging" (RBL) as "row-based replication" (RBR), but RBR is in fact a misnomer. Logging in this format can be done independent of whether the log is used for replication. That is, replication need not figure into the use of the binary log at all. Similar remarks apply to the terms "statement-based logging" (SBL) versus "statement-based replication" (SBR). Unfortunately, by now the RBR/SBR terminology is probably too well entrenched for there to be much hope of a return to the proper RBL/SBL terms, so I must content myself with including this nomenclatural screed here. :-)

[edit] Additional Resources

The following resources provide additional information on the structure, content, or use of the binary log.

Retrieved from "http://forge.mysql.com/wiki/MySQL_Internals_Binary_Log"

This page has been accessed 7,471 times. This page was last modified 22:33, 17 August 2009.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...