WL#5046: Online Backup: Pluggable Storage Modules

Affects: Server-Prototype Only — Status: In-Design — Priority: Medium

RATIONALE
---------
Support arbitrary types of storage for backup images by means of modules which
can plug into the MySQL backup system.

DESCRIPTION
-----------
The idea is that the user will be able to say:

sql> BACKUP DATABASE foo TO '<prefix>:<specification of backup location>'

where <prefix> is something like 'file', 'xbsa', 'http' and the format of
<specification of backup location> depends on the prefix.

Then, registered backup storage modules will be consulted to see if one of them
recognizes the prefix. If yes, then the resulting backup stream will be sent to
the module which will handle it accordingly. For example, an xbsa module can
send the image to an XBSA server.

There will be a default module which will be used if no other module recognizes
the given location string. This module will implement the current behaviour,
that is, store images on the server's filesystem.


Notes: 

1. This WL and its design is based on the MyBRM framework developed by Andreas
Almroth.

2. This WL is needed to implement XBSA support in MySQL backup (WL#4089).
Note: This specification focuses on the storage module functionality which is
needed by the backup kernel to implement its current behaviour. In this WL we do
not try to make the interface complete in the sense that it covers all sensible
services that a storage module could provide. For example, we do not include
services for managing backup locations. It is understood that a storage module
can implement a richer interface - this WL specifies the core API which should
be implemented so that such a module can be used with the MySQL backup system.


Responsibilities of a Backup Storage Module
===========================================
R1 Create backup locations and store backup image stream there.
R2 Read backup stream from a specified backup location.
R3 Ensure that locations opened for reading contain backup image data (and
not some other kind of data) - warn if this is not the case.
R4 Store and inform about format version number of an image stored in a
given location.
R5 Protect against overwriting existing locations.
R6 Free previously used locations upon request.

Notes
-----
1. Backup storage modules do not need to understand backup image format to
implement their functionality. But they understand that they work with backup
images and that these are versioned.

2. Marking a stored backup image so that it can be distinguished from other
types of data is a responsibility of the storage module. This can be implemented
in a number of ways: magic number, file extension, external file attributes etc.

3. Storing image format version together with backup image is also a
responsibility of backup storage module and can be implemented in a way most
suitable for the underlying media.

4. Listing and more advanced management of locations is not required from a
backup storage module. However, a module can implement such services to be used
by other clients. For example, an external application for managing backup
locations.
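
Notes 2 and 3 can be illustrated with a minimal sketch of how a file-like
module might mark its images and keep the format version next to them. The
magic value, the header layout and the function names are assumptions made for
this illustration only; this WL does not prescribe any of them.

```python
import io
import struct

# Hypothetical header layout: 4-byte magic followed by a 2-byte
# little-endian format version. Neither value comes from this WL.
MAGIC = b"MYBK"
HEADER = struct.Struct("<4sH")


def write_header(stream, version):
    """Store the marker and the format version ahead of the image data."""
    stream.write(HEADER.pack(MAGIC, version))
    return HEADER.size  # size of the area reserved by the module


def read_header(stream):
    """Return the format version, or None if this is not a backup image."""
    raw = stream.read(HEADER.size)
    if len(raw) < HEADER.size:
        return None
    magic, version = HEADER.unpack(raw)
    return version if magic == MAGIC else None


if __name__ == "__main__":
    buf = io.BytesIO()
    reserved = write_header(buf, 3)
    buf.write(b"image bytes")
    buf.seek(0)
    print(reserved, read_header(buf))
```

A module for other media (for example XBSA) could keep the same two pieces of
information in object attributes instead of a header.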

Specifying backup locations
===========================
Backup location is a place where a backup image can be stored. What a backup
location is depends on the storage module which handles it. For example, in case
of a filesystem storage, backup location will be a path to a file. In case of
XBSA storage, backup locations are XBSA objects which will store backup images.

How to specify a backup location is completely determined by a storage module.
It can use paths, url strings or any other format suitable for the functionality
of the module.

A location string specified in BACKUP or RESTORE statement is passed to a list
of known backup storage modules. The first module which recognizes the string
will handle given location.

Although not strictly required by this specification, a recommended format of
location string is as follows:

<prefix>:<further specifications>

where <prefix> determines the module which should handle given location and the
format of <further specifications> is determined by that module. <Further
specifications> should specify the name of the location, but they can also
contain additional information such as server connection string, options etc.

Note that it is possible that a single module handles more than one <prefix>. If
several modules recognize the same <prefix> (in general: the same location
string), the first one will be used.
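
The selection rule described above can be sketched as follows. This is only an
illustration of "first module that recognizes the string wins, otherwise the
default module is used"; the class and function names are hypothetical and the
real interface will be defined in LLD.

```python
class PrefixModule:
    """Hypothetical module that recognizes locations by their prefix."""

    def __init__(self, name, prefixes):
        self.name = name
        self.prefixes = tuple(prefixes)

    def recognizes(self, location):
        prefix, sep, _rest = location.partition(":")
        return bool(sep) and prefix in self.prefixes


class DefaultModule:
    """Fallback that accepts any location (server filesystem storage)."""

    name = "default"

    def recognizes(self, location):
        return True


def select_module(modules, location, default):
    """Return the first registered module which recognizes the location."""
    for module in modules:
        if module.recognizes(location):
            return module
    return default


if __name__ == "__main__":
    registered = [
        PrefixModule("xbsa", ["xbsa"]),
        PrefixModule("net", ["http", "ftp"]),  # one module, two prefixes
    ]
    chosen = select_module(registered, "xbsa:user@server/obj", DefaultModule())
    print(chosen.name)
```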

Backup Storage Session
======================
Whenever a storage module agrees to handle given location, it creates a session
which provides a necessary context for using that location. This context is then
passed around to all functions implementing backup storage services. The context
must be freed when work with given location is finished.

Native Compression Support
==========================
Storage modules can provide efficient internal compression methods (see note
[1] below). Handling this will be done independently from the backup kernel.
The user can request native compression in a storage module via the location
string, for example by appending "?compress=on" to it. This will be interpreted
internally by the storage module (when the backup storage session is opened).
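
As an illustration, a module could extract such options when the session is
opened roughly as shown here. The query-string syntax beyond the single
"?compress=on" example, and both function names, are assumptions of this
sketch; each module is free to define its own option format.

```python
from urllib.parse import parse_qs


def split_location(location):
    """Split 'name?opt=val&...' into the location name and an options dict."""
    name, _sep, query = location.partition("?")
    # parse_qs returns a list per key; keep the last value given.
    options = {key: values[-1] for key, values in parse_qs(query).items()}
    return name, options


def wants_native_compression(location):
    """True if the location string asks the module to compress the image."""
    _name, options = split_location(location)
    return options.get("compress", "off").lower() == "on"


if __name__ == "__main__":
    print(split_location("xbsa:server/object?compress=on"))
```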

The WITH COMPRESSION clause of BACKUP statement will retain its current
meaning. If present, backup kernel will compress backup image internally
before sending it to a storage module. This is independent from the possible
compression within the storage module itself (as determined by the location
string). The user should be aware that usually it is not a good idea to do
both internal and native compression (see note [2] below).

Backup Storage Module Services
==============================
This is a high level specification. It tries to capture what services a
backup storage module should implement and what information is passed in and out
for each of the services. It does not try to specify what data types will be
used to pass the information around, not even how the invocation of a service is
implemented. All these details will be specified in LLD.

Signalling errors in services
-----------------------------
When a service is called, there are three possible outcomes:

a) service call succeeds: the specified information (if any) should be
returned.
b) non-fatal error: service call fails, but the session is usable and other
services can be called.
c) fatal error: service fails and the session is not usable any more.

In case of error, an internal error code is returned. It will be decided in LLD
how errors are signalled and how fatal errors are distinguished from non-fatal
ones.

If a fatal error was signalled, the storage session should not be used. The
storage module should free any resources used by that session itself - after a
fatal error the session will not be explicitly terminated or aborted.

Notes:

1. After a fatal error the client can try to create a new session with service
S1, but this attempt can fail if the problem is really severe.

2. There are no provisions in the interface for reporting warnings upon
successful completion of a service.
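
The three outcomes and the rule about fatal errors can be pictured with a
small client-side sketch. How outcomes are really signalled is left to LLD, so
the enum, the class and its method are purely hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = 0
    NON_FATAL = 1
    FATAL = 2


class SessionGuard:
    """Client-side bookkeeping for one storage session (hypothetical)."""

    def __init__(self):
        self.usable = True
        self.last_error = None

    def record(self, outcome, error_code=None):
        """Note the result of a service call; return True on success."""
        if not self.usable:
            raise RuntimeError("session used after a fatal error")
        if outcome is Outcome.SUCCESS:
            return True
        self.last_error = error_code
        if outcome is Outcome.FATAL:
            # The module frees its resources itself; the client must not
            # call terminate (S2) or abort (S3) on this session any more.
            self.usable = False
        return False
```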

Short list of services
----------------------
S1 Initialize backup storage session.
S2 Terminate backup storage session.
S3 Abort backup storage session.
S4 Open stream for writing.
S5 Open stream for reading.
S6 Write bytes to location.
S7 Read bytes from location.
S8 Close input/output stream.
S9 Free the location.
S10 Get information about image stored in the location.
S11 Get canonical name of the location.
S12 Get error description.
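
For orientation, the twelve services could be grouped into two abstract
interfaces as sketched here. This is not the API of this WL: method names,
argument lists and return conventions are assumptions, and the binding
definitions will come from LLD.

```python
from abc import ABC, abstractmethod


class BackupStorageSession(ABC):
    """Context for one location, as created by service S1."""

    @abstractmethod
    def terminate(self):                  # S2
        """Terminate the session and free all resources."""

    @abstractmethod
    def abort(self):                      # S3
        """Abort the session; data sent so far can be lost."""

    @abstractmethod
    def open_for_writing(self, version):  # S4
        """Return (preferred_block_size or None, reserved_size)."""

    @abstractmethod
    def open_for_reading(self):           # S5
        """Return (version, preferred_block_size or None, reserved_size)."""

    @abstractmethod
    def write(self, data):                # S6
        """Return the number of bytes actually written."""

    @abstractmethod
    def read(self, size):                 # S7
        """Return (data, end_of_stream)."""

    @abstractmethod
    def close_stream(self):               # S8
        """Close the stream opened for reading or writing."""

    @abstractmethod
    def free_location(self):              # S9
        """Make the location available for writing again."""

    @abstractmethod
    def image_info(self):                 # S10
        """Return size and timestamp, or report that there is no image."""

    @abstractmethod
    def canonical_name(self):             # S11
        """Return the canonical name of the location."""


class BackupStorageModule(ABC):
    @abstractmethod
    def init_session(self, location):     # S1
        """Return a session, or None if the location is not recognized."""

    @abstractmethod
    def error_description(self, code, locale):  # S12
        """Return a human readable description of an internal error."""
```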

Service specifications
----------------------

S1 Initialize backup storage session.
IN: Location string.
OUT: Backup storage session or information that given location is
not recognized.

Look at the given location string and create a new session for it if
this location is recognized by the module. Otherwise inform that it was not
recognized.

S2 Terminate backup storage session.
IN: Backup storage session.

Terminate previously created session and free all resources.

S3 Abort backup storage session.
IN: Backup storage session.

Abort active session and free resources. This request can be made at any
moment, regardless of the state of the session. Results of previous
operations and all data sent to the location can be lost. The location
might become invalid, that is, impossible to open for reading or
writing. In that case it must be freed to be able to use it (see free
location request below).

S4 Open stream for writing.
IN: Backup storage session and image format version number.
OUT: Preferred I/O block size (optional) and the size of a reserved area
at the beginning of the first block.

Prepare session for writing backup image in a given format. The
implementation ensures that format version number is stored in the
location and will be reported when that location is opened for reading
(see below). Service fails if location is occupied. There is a
service request for freeing such occupied location (see below).

The preferred I/O block size tells the client that sending data in
blocks of that size can speed-up I/O. The client does not have to
observe this suggestion and can send arbitrary amounts of data to the
stream (see "Write bytes" service). The module does not have to provide
this information.

Depending on implementation, storage module can reserve certain number
of bytes at the beginning of the stream for storing internal data. In that
case service informs about the size of the reserved area (it can be 0 if
no bytes are reserved). It should be taken into account when calculating
I/O block boundaries. Note that the reserved area is not accessible by
the client. In particular, the write request below will write data after
the reserved area.
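
One way a client could take the reserved area into account when it
chooses to follow the preferred block size is sketched next. The function name
is hypothetical, and the sketch assumes that the reserved area is smaller than
one block.

```python
def chunk_sizes(total, block_size, reserved):
    """Yield write sizes so that every chunk ends on a block boundary.

    The first block already holds `reserved` bytes owned by the module,
    so only `block_size - reserved` client bytes fit into it.
    """
    if block_size <= 0 or not 0 <= reserved < block_size:
        raise ValueError("need 0 <= reserved < block_size")
    room = block_size - reserved
    while total > 0:
        size = min(room, total)
        yield size
        total -= size
        room = block_size


if __name__ == "__main__":
    # 10000 bytes, 4096-byte blocks, 16 bytes reserved by the module
    print(list(chunk_sizes(10000, 4096, 16)))
```

With these numbers the first chunk carries 4080 bytes, so that together with
the 16 reserved bytes it fills exactly one block.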

S5 Open stream for reading.
IN: Backup storage session
OUT: Image format version number, preferred I/O block size (optional)
and the size of a reserved area at the beginning of the first block.

Prepare session for reading. Error is reported if location is empty or
contains invalid data (not a backup image). If it contains a backup
image, the version number of its format is returned.

The meaning of preferred I/O block size is as for "Open for writing"
service. If there are bytes reserved by the implementation at the
beginning of the stream then the size of that reserved area is reported
(0 if no bytes are reserved). However, client does not have access to
these reserved bytes - the read request below will start reading from the
position indicated by stream offset.

S6 Write bytes to location.
IN: Backup storage session, data buffer and amount of data to
be written.
OUT: Amount of data that has been written.

Write the given number of bytes to a location which was previously opened
for writing. It can happen that fewer bytes than requested are written.
The number of bytes actually written is returned.
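
Because a write can be partial, a client typically loops until everything has
been accepted. A minimal sketch, in which `write` stands in for service S6 and
treating "no progress" as an error is an assumption of the sketch rather than
something this WL specifies:

```python
def write_all(write, data):
    """Send all of `data` through `write`, retrying after short writes.

    `write(chunk)` returns the number of bytes it accepted, which may be
    fewer than len(chunk).
    """
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        written = write(view[sent:])
        if written <= 0:
            # Assumption: a call that accepts nothing is treated as failure
            # to avoid looping forever.
            raise IOError("storage module made no progress")
        sent += written
    return sent
```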

S7 Read bytes from location.
IN: Backup storage session, data buffer and its size.
OUT: Amount of data read and information about end of stream.

Read bytes from location which was previously opened for reading. The
amount of bytes read is reported. If there are no more bytes in the
location then end of stream is reported.
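
The corresponding read loop could look as follows. Here `read` stands in for
service S7 and is assumed to return the bytes read together with an
end-of-stream flag; this calling convention is hypothetical.

```python
def read_all(read, buffer_size=4096):
    """Collect the whole stream; `read(n)` returns (data, end_of_stream)."""
    parts = []
    while True:
        data, end_of_stream = read(buffer_size)
        parts.append(data)
        if end_of_stream:
            break
    return b"".join(parts)
```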

S8 Close input/output stream.
IN: Backup storage session.

Close stream which was previously opened for reading or writing. Session
can be used to open a new stream.

S9 Free the location.
IN: Backup storage session.

Make the current location available for writing. Normally, it is not
possible to write to a location which already stores an image. With this
request a client informs that it is not intending to access the old
image any more and thus it is ok to overwrite it. Still, the
implementation can choose to archive the old image somewhere to be
accessed by other means.

It should be possible to free a location which became invalid due to
aborted session. After that the location is again available for writing
backup image to it.
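
The interplay between overwrite protection (S4) and freeing a location (S9)
can be shown with a toy in-memory store. It is an illustration only; the class,
its methods and the exception are hypothetical.

```python
class LocationOccupied(Exception):
    """Raised when a location already stores an image."""


class MemoryStore:
    """Toy storage keeping images in a dict."""

    def __init__(self):
        self._images = {}

    def open_for_writing(self, name, version):  # cf. S4
        if name in self._images:
            raise LocationOccupied(name)
        self._images[name] = {"version": version, "data": bytearray()}

    def free(self, name):                       # cf. S9
        # A real module could archive the old image instead of dropping it.
        self._images.pop(name, None)

    def version(self, name):
        return self._images[name]["version"]
```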

S10 Get information about image stored in the location.
IN: Backup storage session.
OUT: Size and timestamp of the image or information that location does
not contain a backup image.

If current location contains backup image, its size and time of creation
are returned. The exact meaning of "time of creation" is implementation
defined. But it should be a time between the moment given location was
opened for writing and the moment the output stream was closed.

If location does not contain a backup image, then this is signalled by
the service. Service should inform whether location is empty or contains
some other data which is not a backup image.

Note that for some implementations size can make no sense - in that
case storage module does not provide that information.

S11 Get canonical name of the location.
IN: Backup storage session.
OUT: Canonical name of the location.

The same location can possibly be specified in several different ways.
Also, a string specifying location can contain additional data. Once a
session for a given location has been created, this request will
determine the canonical name by which this location is identified within
the storage module.

S12 Get error description.
IN: Internal error number and locale info.
OUT: Human readable description of the error.

The service provides human readable description of an error. Description
should be in the language specified by the given locale. If not supported,
it is acceptable to return plain English description.
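
A possible shape of the locale fallback is sketched here. The message table,
the error numbers and the locale format ("de_DE") are made up for the example.

```python
# Hypothetical per-language message tables of one storage module.
MESSAGES = {
    "en": {7: "Insufficient permissions", 11: "Could not connect to server"},
    "de": {7: "Unzureichende Berechtigungen"},
}


def describe_error(code, locale="en"):
    """Return a description in the requested language, else in English."""
    lang = locale.split("_")[0].lower()
    for table in (MESSAGES.get(lang, {}), MESSAGES["en"]):
        if code in table:
            return table[code]
    return "Unknown error %d" % code
```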


String handling
---------------
Implementation should support non-ascii characters in all strings (backup
locations, error messages). For example, they can be represented using UTF-8
encoding.


Error reporting
===============
To present the general idea, some examples follow. They are just examples
and should not be treated as a binding specification. The implementation will
decide what errors will be reported in each particular situation and how
error messages will be formulated.

Examples
--------

1. Trying to restore from file which does not exist.

sql> RESTORE FROM 'file:the/file':

Error: Backup image 'path/to/the/file' does not exist.
Error: Error when preparing for restore operation.

Note that canonical location name (as given by service S11) is used in the
error message (after adding backupdir etc). Kernel will learn that image
does not exist using service S10 "Get information about image stored in
the location". In this case storage module does not report error but
simply informs that the location is empty.

2. Trying to restore from file for which user does not have read access.

sql> RESTORE FROM 'file:the/file':

Error: Filesystem backup storage module error 7: Insufficient permissions
when opening file '/path/to/the/file' for reading.
Error: Can't read backup location '/path/to/the/file'.
Error: Error when preparing for restore operation.

This will lead to an error in service S5 "Open stream for reading". The error
from the storage module is reported using information obtained from service
S12 "Get error description". Then additional error context information is
provided.

3. Error when accessing XBSA server:

sql> BACKUP DATABASE db1 TO 'xbsa:user@server_name/path/to/object':

Error: XBSA backup storage module error 11: could not connect to XBSA
server 'server_name' - connection refused.
Error: Could not initialize location
'xbsa:user@server_name/path/to/object'.
Error: Error when preparing for backup operation.

The error happens when backup storage session is initialized with service
S1. But we don't mention session in error messages because it is an
implementation detail not really relevant for the final user.


Design principles
-----------------

- Backup kernel does not try to interpret errors reported by backup storage
modules, it only notes that an error has happened and possibly forwards it to
backup error log. There is no global convention about which error number means
what.

- Errors from storage modules are reported as a single "error from storage
module" error. The error message contains module name, internal error number and
error description provided by the module.

- Together with an error from storage module, backup kernel reports more errors
informing about the context in which the error has happened.

- Whenever possible, if error message refers to the location, its canonical form
is used (as given by service S11).

- Not all implementation details are exposed to the user in error messages. For
example, the fact that a backup storage session is used internally is hidden
from the user as irrelevant.

Proposed error messages
-----------------------
If appropriate, existing error codes are used. For new messages ER_NEWX
code is used which should be eventually replaced by a more appropriate name.

When backup storage module reports error to the kernel, then kernel can forward
it to the backup error log using this general error message:

ER_NEW1: M backup storage module error N: D

where:

M = name of the module
N = internal error number
D = description of the error as provided by the module
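
A minimal sketch of how the kernel could assemble the reported lines from M, N
and D plus its own context messages (see the design principles above). The
function name and the "Error: " prefix handling are assumptions of this
example.

```python
def storage_error_report(module, number, description, context):
    """Build the error lines logged for one failed service call."""
    lines = [
        "%s backup storage module error %d: %s" % (module, number, description)
    ]
    lines.extend(context)  # kernel messages describing the context
    return ["Error: " + line for line in lines]


if __name__ == "__main__":
    for line in storage_error_report(
        "Filesystem",
        7,
        "Insufficient permissions when opening file "
        "'/path/to/the/file' for reading.",
        [
            "Can't read backup location '/path/to/the/file'.",
            "Error when preparing for restore operation.",
        ],
    ):
        print(line)
```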

On top of that, backup kernel will report additional message informing about the
context in which storage module error was detected. The exact list of errors
that are reported and exact form of error messages will be decided by the
implementation. Below, there is an initial list of errors that backup kernel can
report when each of the services fails with suggestions for the corresponding
error messages.

In error messages the following abbreviations are used:

L = backup location string as specified by the user
C = canonical name of the location as returned by service S11

(S1) Initialize backup storage session.

ER_BACKUP_INVALID_LOC: Invalid backup location L.
or
ER_NEW2: L was not recognized as a valid backup location.
ER_NEW3: Could not initialize backup location L.

(S2) Terminate backup storage session.

ER_BACKUP_CONTEXT_REMOVE: Error when cleaning up after backup/restore
operation.

(S3) Abort backup storage session.

ER_NEW4: Errors when aborting backup/restore operation.

(S4) Open stream for writing.

ER_BACKUP_WRITE_LOC: Can't write to backup location C.
or
ER_NEW3: Could not open backup image C for writing.

(S5) Open stream for reading.

ER_BACKUP_READ_LOC: Can't read backup location C.
or
ER_NEW4: Could not open backup image C for reading.

(S6) Write bytes to location.

ER_BACKUP_WRITE_HEADER: Can't write backup archive preamble.
ER_BACKUP_WRITE_META: Error when saving metadata of %-.64s.
ER_BACKUP_WRITE_DATA: Error when writing data from %-.64s backup driver
(data block for table #%d)

(S7) Read bytes from location.

ER_BACKUP_READ_HEADER: Can't read backup archive preamble.
ER_BACKUP_READ_META: Error when reading metadata.
ER_BACKUP_READ_DATA: Error when reading data from backup stream.

(S8) Close input/output stream.

ER_BACKUP_CLOSE: Backup/Restore: Error on close of backup stream.

(S9) Free the location.

ER_NEW5: Could not free location C.

(S10) Get information about image stored in the location.

ER_NEW6: Backup image C does not exist.
ER_NEW7: C is not a valid backup image.
ER_NEW8: Could not determine size and/or creation time of backup image C.

(S11) Get canonical name of the location.

ER_NEW9: Could not determine canonical name for location L.


Ingo 2009-11-04: HLS approved in the current version. The alternatives below
came up during discussions. They will not be implemented.

Alternatives
============
For the record, here are some alternatives which were considered during design
discussion.

A1 Storing meta-info in backup stream.
--------------------------------------
Meta-info about backup image is the backup image's format version number
and a "fingerprint" which distinguishes it from other kinds of data. In
the above design it is responsibility of a backup storage module to store
this meta-info (R3 & R4). The alternative is that meta-information is
stored in the backup image and storage module is not aware of it.

Advantages:
- Backup modules do not have to implement meta-info storage
which makes them simpler.
- Storing "magic bytes" in the stream provides extra check for data
consistency.
- Potential code duplication is avoided (each BSM must implement the
same functionality).

Disadvantage: More efficient, storage-specific implementation of
meta-info is not possible. For example using XBSA object attributes.

A2 Handle native compression from backup kernel
-----------------------------------------------
The WITH COMPRESSION clause of BACKUP statement will use native
compression if supported by storage module. Kernel will query the module
for native compression. If module supports it, then kernel will rely on
native compression. Otherwise kernel will do compression itself. Note: a
new service for querying native compression would be needed for this
solution.

A3 Simplified "Write bytes" service
-----------------------------------
When requesting writing N bytes of data, either all N bytes are written
or error is reported.

Advantage: Simplifies code for writing data.

Disadvantage: Prevents "asynchronous" I/O implementation. A request for
writing huge amount of data will potentially block the calling thread for
a long time before it is completed.

A4 Use global MySQL error codes for BSM errors
----------------------------------------------
Backup storage modules will register all errors in the MySQL error
database (errmsg.txt). Translation from error number to error message
would be done using server mechanisms without participation of storage
module - no need to have a service like S12. Also, no need to pass locale
info to the storage module.

Advantages:
- Storage modules do not need to handle locales.
- Error codes from each storage module are unique.

Disadvantage: This will make it impossible to upgrade a running server
with a new storage module whose error messages were not registered at
server compilation time.

Comments
========

[1] Why it is good to use native compression (Andreas):

"Today it is better to leave compression to be up to the storage to implement.
With the current trends in storage, a lot of vendors provide several means of
efficient storage, ranging from normal 1:1 (raw), 2:1 (LZH or alike) to 10-100:1
in de-duplicated storage area. I really don't think MySQL backup kernel should
be bothered with compressing with its own algorithm, but leave it to the BSM."

[2] Why it might be bad if backup kernel does its own compression (Andreas):

"To explain a real scenario #1: The BSM implementation is made by a vendor using
only de-duplicated backend storage. By using de-duplication the source data
should be uncompressed in order to achieve the most efficient de-duplication
ratio. If the backup kernel compresses the data, then the storage cannot de-
duplicate. Typically JPG and MP3 files cannot be de-duplicated because of the
compression used in the files. De-duplication of normal databases can range
anywhere from 1-100:1 or even more depending on whether it is static data, or
just null values etc. Even at a baseline backup using de-duplication we see
better than 2:1 ratios, and subsequent backups will range in the region of 5-
50:1.

To explain a real scenario #2: Compressing the data stream in the backup kernel
will use additional CPU cycles on the server before sending the data to the BSM.
In most enterprises today, the backup systems use tape storage systems, which do
hardware compression (2:1 algorithms). So, by sending an uncompressed data
stream, less CPU cycles are used on the server, and the compression is done in
hardware which typically much faster. LTO4 tape drives can stream 120MB/s with
2:1 compression. Most servers don't, and if they do, they use excessive CPU."
