Configuring mod_ndb
Request handlers
mod_ndb defines several possible request handlers which can be specified using Apache's SetHandler directive.
ndb-cluster
The primary handler. Enables mod_ndb to respond to HTTP requests by querying the NDB data nodes as specified in the Apache configuration file.
<Location /ndb>
  SetHandler ndb-cluster
  Database mod_ndb_test
</Location>
ndb-exec-batch
The ndb-exec-batch handler is used in scripting to define an endpoint that will execute all of the transactions in a batch. See Mod_ndb_scripting for more information.
<Location /ndb-commit-all>
  SetHandler ndb-exec-batch
</Location>
ndb-dump-format
The ndb-dump-format handler dumps an output format, either as source code that can be used in httpd.conf or as a JSON representation of its internal compiled form. Both user-defined formats and mod_ndb's internal formats can be dumped. (The three internal formats are "JSON", "raw", and "XML".)
The following configuration is supplied by default in test.conf:
<Location /ndb/format>
  SetHandler ndb-dump-format
</Location>
By default, a compiled format is available at /ndb/format, and a source format is at /ndb/format/source. For example, a request to "/ndb/format?JSON" will return a representation of the compiled internal JSON result format, while a request to "/ndb/format/source?JSON" will return its source code definition.
Server level configuration
ndb-connectstring
NDB connection string; refer to the MySQL Manual for details of the connection string syntax. By default, mod_ndb will attempt to connect to a management server running on the local machine.
ndb-max-read-subrequests
The maximum number of read operations that can be batched together in a single transaction. Operation batching occurs only when mod_ndb is scripted in PHP or Perl. The default value, defined by the compile-time constant DEFAULT_MAX_READ_OPERATIONS, is 20.
ndb-retry-ms
When attempting to execute a transaction, mod_ndb may encounter a temporary error condition. If this happens, it retries the transaction for up to ndb-retry-ms milliseconds. If the temporary error condition persists after that time, mod_ndb returns a "503 Service Unavailable" response.
The default value, defined at compile time in the file defaults.h, is 50 ms.
Mod_ndb actually waits for 5 + (2 * retries * retries) ms after each failed attempt to execute the transaction (i.e. 5 ms, then 7 ms, then 13 ms, then 23 ms, then 37 ms, etc.). Whenever the wait time already accumulated plus the next wait of 5 + (2 * retries * retries) ms would be greater than ndb-retry-ms, it gives up and returns the 503 response.
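The wait schedule described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not mod_ndb's source code: the function name retry_waits is hypothetical, and the assumption that the budget is checked before each wait is taken from the description above.

```python
def retry_waits(retry_ms):
    """Return the waits (in ms) performed before giving up with a 503."""
    waits = []
    elapsed = 0
    retries = 0
    while True:
        wait = 5 + 2 * retries * retries
        if elapsed + wait > retry_ms:
            break  # the next wait would exceed ndb-retry-ms: give up
        waits.append(wait)
        elapsed += wait
        retries += 1
    return waits


print(retry_waits(50))       # [5, 7, 13, 23]
print(sum(retry_waits(50)))  # 48
```

Under these assumptions, the default of 50 ms allows four waits totalling 48 ms before the 503 response is returned.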
ndb-force-restart
Mod_ndb is not able to handle DDL changes, such as those made from a MySQL server using ALTER TABLE, CREATE TABLE, and DROP TABLE, while it is running. If such changes are made, you must restart Apache (e.g. with apachectl graceful) before mod_ndb is able to use the newly changed table.
If mod_ndb cannot execute a query because the data dictionary has changed, it will return a 500 server error response and write a message in Apache's error log. Additionally, if ndb-force-restart is set to 1, it will automatically begin an Apache graceful restart.
The default value of ndb-force-restart is 0.
<ResultFormat>
The <ResultFormat> container is used to define a custom output format. See Output Formats.
Mod_ndb creates a response page by using an output format to present a database result set. Internal JSON, XML, and raw formats are built into mod_ndb. Custom output formats can be defined at the server level using a <ResultFormat> section, as described in detail at Mod_ndb_formats.
Common endpoint configuration
The Database, Table, and Format directives supply basic information on how mod_ndb should behave. These parameters can be inherited down the directory tree, as illustrated in the example above, where Database mod_ndb_test and Table cars apply to every section, including /ndb/car/by_id and /ndb/car/photo, though they are specified only once.
Database
Specifies the MySQL database used in queries.
Table
Specifies the table for a specific directory.
Optionally, a table may be marked with "scan" to define a Table Scan Endpoint (See Table Scans, below). As an additional option, the name of an ordered index to use in table scans can be specified.
Format: Table table_name [scan] [ordered_index_name]
Example: Table users
Example: Table users scan
Example: Table users scan PRIMARY
Format
Specifies the output format used for the response, which may be either an internal format or a user-defined format. The internally-defined formats are:
- JSON -- Results are formatted in JavaScript Object Notation, as described at http://www.json.org/. This is the default output format.
- Raw -- Results are not formatted or structured in any way; the literal value of a single column is sent as the response. When raw results are used, only a single column may appear in the output. BLOB and TEXT columns can *only* be returned using raw output. A mod_mime DefaultType directive should be used to set the MIME type of the output appropriately, as illustrated in the example above.
- XML -- A simple XML format.
ETags
A flag which can be set to On or Off. With ETags On, mod_ndb computes an MD5 fingerprint of the response and sends an ETag header containing the fingerprint. It also responds appropriately to "If-Match" and "If-None-Match" conditional requests from caches and proxy servers. If ETags is set to Off, the MD5 checksum is not computed, and conditional requests are treated as normal requests. The default is ETags On.
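The following Python sketch illustrates the general idea of an MD5-based ETag and an "If-None-Match" check. It is not mod_ndb's implementation: the function names etag_for and respond are hypothetical, the exact header value mod_ndb sends is an assumption, and "If-Match" handling is left out for brevity.

```python
import hashlib


def etag_for(body):
    # MD5 fingerprint of the response body, quoted as HTTP entity tags are.
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond(body, if_none_match=None, etags_on=True):
    """Return (status, headers, body) for a GET request."""
    if not etags_on:
        # ETags Off: no checksum, conditional headers are ignored.
        return 200, {}, body
    tag = etag_for(body)
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        if tag in candidates:
            # The client's cached copy is still current.
            return 304, {"ETag": tag}, b""
    return 200, {"ETag": tag}, body


first = respond(b'{"id": 1}')
second = respond(b'{"id": 1}', if_none_match=first[1]["ETag"])
print(first[0], second[0])  # 200 304
```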
PathInfo
Used to implicitly associate key columns (which must be named elsewhere in a PrimaryKey, UniqueIndex, OrderedIndex, or Filter directive) with the rightmost parts of the URL path name. (Note: it is a current limitation of mod_ndb that you cannot insert records from a directory where PathInfo is also configured.)
Format: PathInfo key_col/key_col/...
Example: PathInfo user_id/icon_id
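As a rough illustration of how the rightmost parts of a URL path line up with the key columns named in PathInfo, consider the Python sketch below. The function name bind_path_info and the request path /ndb/icons/42/7 are hypothetical, and raising an error when the path is too short is an assumption, not documented mod_ndb behavior.

```python
def bind_path_info(spec, request_path):
    """Map the rightmost path segments onto the key columns in spec."""
    columns = spec.split("/")
    segments = [s for s in request_path.split("/") if s]
    if len(segments) < len(columns):
        raise ValueError("path has too few segments for " + spec)
    # Only the rightmost len(columns) segments are used as key values.
    return dict(zip(columns, segments[-len(columns):]))


print(bind_path_info("user_id/icon_id", "/ndb/icons/42/7"))
# {'user_id': '42', 'icon_id': '7'}
```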
Apache-style endpoint configuration
Table Scans
A location defined as a table scan endpoint will return every row in a table. These endpoints are defined by adding the keyword "scan" to the Table definition. A table scan endpoint may not define any other access method (e.g. PrimaryKey, UniqueIndex, etc.), nor may PathInfo or query arguments be supplied to it.
Directives specifying data access plans
Columns
Specifies the list of columns (separated by white space) to be returned in the result set. These are actual column names in the table structure, not the column aliases, as described below, used for the request parameters.
AllowUpdate
Specifies the list of columns (separated by white space) that may be updated by a POST query. Any parameters that are supplied in the POST request but are not in this list will be ignored.
Deletes
A flag, set to either On or Off, indicating whether HTTP DELETE requests should be accepted. The default is Deletes Off. This directive cannot be inherited in the directory hierarchy; it must be specified independently for each directory where deletes are allowed.
PrimaryKey
Allows primary key access to a table.
Format: PrimaryKey column_alias [ column_alias ...]
Example: PrimaryKey car_id
UniqueIndex
Allows access to a table using a specific unique index.
Format: UniqueIndex indexname column_alias [ column_alias ...]
Example: UniqueIndex name$unique name
OrderedIndex
Allows access to a table using a specific ordered index. Unlike lookups on primary keys and unique indexes, an ordered index scan can return multiple result rows. The column list can also include a sort flag, one of [ASC] or [DESC], which will cause results to be returned in sorted order. (Note that unsorted scans can be parallelized, and may perform faster than sorted ones.)
Format: OrderedIndex indexname column_alias [ column_alias ...] [ sort_flag ]
Example: OrderedIndex license_tag license_tag
The PrimaryKey, UniqueIndex, and OrderedIndex directives define access paths for HTTP requests against a table. PrimaryKey refers to NDB's unique hash primary key (sometimes called the distribution key), which is present for every table. The UniqueIndex and OrderedIndex directives require you to supply the name of the index as it is known to NDB. Note that this often differs from the index name in MySQL in several ways:
- A MySQL primary key usually corresponds to two separate keys in NDB: the unique hash distribution key (PrimaryKey) and an OrderedIndex named PRIMARY.
- A unique index in MySQL usually corresponds to an NDB UniqueIndex with a name ending in $unique.
- Other MySQL indexes are usually represented by a pair of NDB indexes. For instance, a MySQL index named "icon_id" corresponds to an OrderedIndex named icon_id and a UniqueIndex named icon_id$unique.
MySQL's "SHOW CREATE TABLE" and "SHOW INDEXES" statements will reveal a table's index structure as seen by MySQL, and the command-line utility ndb_desc can reveal the index names used by NDB.
PrimaryKey lookups are the most efficient data access path. OrderedIndex scans can return multiple rows of results.
While the supplied index names must match NDB's actual indexes, the column_alias names do not need to correspond to actual column names.
A directive like
PrimaryKey i j
is interpreted so that the HTTP parameter i is mapped to the first column of a two-part primary key, and j is mapped to the second part of the key. It makes no difference whether the two parts of the primary key are actually columns named i and j or if they are named something else, say user_id and icon_id.
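The positional mapping between aliases and key parts can be pictured with a short Python sketch. The function bind_key is hypothetical and only models the idea that aliases are matched to key columns by position, not by name.

```python
def bind_key(aliases, key_columns, params):
    """Pair each alias from the directive with the key part at the same position."""
    if len(aliases) != len(key_columns):
        raise ValueError("alias count must match the number of key parts")
    return {column: params[alias] for alias, column in zip(aliases, key_columns)}


# "PrimaryKey i j" on a table whose key parts are user_id and icon_id
print(bind_key(["i", "j"], ["user_id", "icon_id"], {"i": "42", "j": "7"}))
# {'user_id': '42', 'icon_id': '7'}
```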
Filter
Filters define small interpreted programs that run on NDB data nodes and filter particular rows out of a result set. Filters are used only with scans -- i.e. table scans and queries that use an ordered index. They correspond to pushed-down conditions in mysqld and to NdbScanFilter objects in the NDB API.
Format: Filter column_name operator parameter
Example: Filter age <= max_age
The supported operators are:
= < <= > >= != LIKE NOTLIKE
It is the presence of a filter parameter in the request that causes the filter to be applied to the query. Several filters can be defined at an endpoint, but the only filters actually used will be those whose parameters are supplied in a given request.
Filters have no effect in one-row operations (lookups that use a primary key or unique index).
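The rule that a filter is applied only when its parameter appears in the request can be sketched as follows. This is a simplified Python model, not mod_ndb code: parse_filter and active_filters are hypothetical helpers, and the second directive (using min_age) is an invented example.

```python
from collections import namedtuple

Filter = namedtuple("Filter", "column operator parameter")
OPERATORS = {"=", "<", "<=", ">", ">=", "!=", "LIKE", "NOTLIKE"}


def parse_filter(directive):
    """Parse a line of the form 'Filter column_name operator parameter'."""
    keyword, column, operator, parameter = directive.split()
    if keyword != "Filter" or operator not in OPERATORS:
        raise ValueError("not a valid Filter directive: " + directive)
    return Filter(column, operator, parameter)


def active_filters(filters, params):
    """Keep only the filters whose parameter was supplied in the request."""
    return [(f.column, f.operator, params[f.parameter])
            for f in filters if f.parameter in params]


filters = [parse_filter("Filter age <= max_age"),
           parse_filter("Filter age >= min_age")]
print(active_filters(filters, {"max_age": "30"}))  # [('age', '<=', '30')]
```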
N-SQL endpoint configuration
N-SQL is an SQL-like language providing an alternative to the Apache-style access plan directives (Columns, Deletes, PrimaryKey, UniqueIndex, and OrderedIndex). N-SQL configuration and Apache-style configuration are very similar in their capabilities, and can be freely mixed.
Syntactically, N-SQL statements begin with a keyword, span one or several lines, and end with a semicolon. Apache-style configuration directives, in contrast, are contained on a single line.
N-SQL is a small subset of standard SQL with several notable differences:
- There is no SELECT *. You are required to explicitly list every column that will appear in the results.
- N-SQL configuration, like Apache-style configuration, requires you to identify by name a driving index for each query. In standard SQL, the SQL optimizer can examine the columns in a WHERE clause and use the system data dictionary to determine which indexes to use. But mod_ndb has no optimizer and does not access the NDB data dictionary at configuration-time.
- Currently, only SELECT and DELETE are supported. Use the Apache-style AllowUpdate directive to define updates and inserts of data.
N-SQL allows a few possibilities that cannot be defined otherwise:
- In Apache-style configuration, each column of an ordered index can only be tested for equality with its associated column alias in the request. N-SQL also allows the relational operators <, <=, >, and >=.
- An N-SQL WHERE clause can include a condition comparing a database column to a constant string or numeric value.
An N-SQL Statement
NSQL := SelectQuery | DeleteQuery | QueryPlan ';' .
An N-SQL statement may consist of a SelectQuery, a DeleteQuery, or a bare QueryPlan. Every N-SQL statement ends with a semicolon.
An N-SQL statement does not necessarily correspond to an SQL query.
- mod_ndb allows you to define several possible access plans at an endpoint. At runtime, the named column parameters associated with an access plan are used to distinguish it.
- mod_ndb allows you to mix Apache-style configuration and N-SQL configuration at an endpoint. This means that a statement may describe only one aspect of a complete query -- for example, only a select list, or only a query plan.
SelectQuery
SelectQuery := SelectList [ QueryPlan ] .
SelectList := "SELECT" Column { "," Column } "FROM" [ DBName "." ] TableName .
Column := ColumnName [ "AS" ColumnAlias ] .
A complete SelectQuery contains a SelectList followed by a QueryPlan. A bare SelectList, without a QueryPlan, is also syntactically acceptable; in this case, one or more query plans should be supplied as separate N-SQL statements or using Apache-style directives. (Unlike standard SQL, the bare SelectList does not imply a full table scan.)
DeleteQuery
DeleteQuery := "DELETE" "FROM" [ DBName "." ] TableName OneRowWhereClause .
A DeleteQuery allows an HTTP DELETE request at an endpoint. DELETE requests must use a primary key or unique index; only single-row deletes are supported.
See also: Deletes.
QueryPlan
QueryPlan := OneRowWhereClause | Scan .
A QueryPlan specifies a set of query parameters and their associated data access plan. There are two sorts of QueryPlan:
- A OneRowWhereClause uses a primary key or unique index and returns a single row of data.
- A Scan uses an ordered index or a full table scan, and possibly Filters, and can return many rows of results.
OneRowWhereClause
OneRowWhereClause := "WHERE" ("PRIMARY" "KEY" | "UNIQUE" "INDEX" Name ) "=" ValueList .
ValueList := IndexValue { "," IndexValue } .
IndexValue := ( ["$"]HTTP_param_name | quoted_string | number ) .
In a OneRowWhereClause, a primary key or named unique index is associated with one or more request parameters. If the index contains more than one column, all columns must be represented in order in the value list. HTTP parameter names may optionally begin with a dollar sign ($), and this practice is encouraged, as it may become a requirement in a future version.
Note that the name of a unique index created by MySQL will often end in the string "$unique". The ndb_desc utility can be used to discover actual NDB index names.
A quoted string or number that appears in a ValueList is interpreted as a literal constant, and will be used in the database request. Note that you cannot perform a query solely with constants -- at least one part of an index must be mapped to a request parameter, or the index will never be used.
See also: PrimaryKey, UniqueIndex.
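To make the ValueList rules concrete, here is a small Python sketch that classifies each IndexValue as a request parameter, quoted string, or number, and rejects a list made only of constants. The helper names kind and resolve are hypothetical, the accepted quote characters and number syntax are assumptions, and returning None when a parameter is missing from the request is a modeling choice rather than documented behavior.

```python
import re

NUMBER = re.compile(r"^-?\d+(\.\d+)?$")


def kind(token):
    """Classify an IndexValue token as 'string', 'number', or 'param'."""
    if len(token) >= 2 and token[0] in "'\"" and token[-1] == token[0]:
        return "string"
    if NUMBER.match(token):
        return "number"
    return "param"


def resolve(tokens, params):
    """Return the key values for a request, or None if a parameter is missing."""
    if not any(kind(t) == "param" for t in tokens):
        raise ValueError("a value list needs at least one request parameter")
    values = []
    for token in tokens:
        k = kind(token)
        if k == "string":
            values.append(token[1:-1])        # literal constant, quotes removed
        elif k == "number":
            values.append(float(token) if "." in token else int(token))
        else:
            name = token[1:] if token.startswith("$") else token
            if name not in params:
                return None                   # this plan does not apply
            values.append(params[name])
    return values


print(resolve(["$user_id", "'icon'", "7"], {"user_id": "42"}))
# ['42', 'icon', 7]
```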
Scan
Scan := "USING" ("TABLE" "SCAN" | IndexScan ["ORDER" ("ASC" | "DESC") ] ).
A Scan defines an access plan that can return multiple rows of results. A full-table scan returns all rows of a table, while an ordered IndexScan returns a range of rows using an ordered index.
See also: Table, OrderedIndex, Filter.
IndexScan
IndexScan := "ORDERED" "INDEX" [ Name ] [ WhereClause ] .
WhereClause := "WHERE" IndexCondition { "AND" IndexCondition } .
IndexCondition := ColumnName rel_op IndexValue .
rel_op := ( "=" | "<" | "<=" | ">" | ">=" ) .
IndexValue := ( ["$"]HTTP_param_name | quoted_string | number ) .
An IndexScan defines an access plan using a named ordered index. If no name is specified, the default ordered index, PRIMARY, is used. The PRIMARY ordered index, normally created by MySQL, is an ordered index on the table's primary key.
While the Apache-style OrderedIndex directive can define index bounds only using equality, an IndexScan can use any of the five relational operators listed above.
See also: OrderedIndex, Filter.
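The WhereClause grammar above is small enough to check with a few lines of Python. The sketch below is a simplified reading of that grammar, not the parser mod_ndb uses: parse_where is a hypothetical helper, the column year and the parameters min_year and max_year are invented, and quoted strings containing spaces or the word AND are not handled.

```python
import re

# ColumnName rel_op IndexValue, with the five relational operators listed above
CONDITION = re.compile(r"^(\w+)\s*(<=|>=|=|<|>)\s*(\S+)$")


def parse_where(clause):
    """Split a WhereClause into (column, rel_op, value) triples."""
    body = clause.strip()
    if not body.upper().startswith("WHERE "):
        raise ValueError("expected a WHERE clause")
    conditions = []
    for part in re.split(r"\s+AND\s+", body[6:].strip()):
        match = CONDITION.match(part.strip())
        if match is None:
            raise ValueError("not an IndexCondition: " + part)
        conditions.append(match.groups())
    return conditions


print(parse_where("WHERE year >= $min_year AND year < $max_year"))
# [('year', '>=', '$min_year'), ('year', '<', '$max_year')]
```

An operator outside the five in rel_op, such as !=, is rejected by this sketch with a ValueError.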
How access plans are selected at runtime
When mod_ndb processes its configuration, each request parameter (or column alias) becomes associated either with an index or a filter. When mod_ndb handles an HTTP request at runtime, it examines each parameter supplied by the client. If the parameter points to an index, that index is used to access the table. If the parameter belongs to a filter, and the table is being scanned, the filter applies to the scan.
The column aliases supplied in the request determine the access path for the request. An indexed column in MySQL might belong to both a UniqueIndex and an OrderedIndex in NDB; you could supply different alias parameters in mod_ndb to use one index or the other.
If the configuration attempts to define a column alias twice, using multiple indexes, the second instance will overwrite the first, and mod_ndb will write a line (at log level error) to Apache's error log to note that it is "reassociating a column."
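The selection rules in this section can be modeled with a small Python class. This is a sketch under stated assumptions, not mod_ndb's actual logic: the Endpoint class, the aliases id, id_range, and max_age, and the log message text are all hypothetical; choosing the first index parameter found is an assumption; and table scan endpoints are not modeled.

```python
SCAN_KINDS = {"ordered"}  # access paths that scan, so filters can apply


class Endpoint:
    def __init__(self):
        self.aliases = {}  # alias -> (kind, name)
        self.log = []

    def associate(self, alias, kind, name):
        """Configuration time: tie an alias to an index or a filter."""
        if alias in self.aliases:
            # A later definition overwrites the earlier one and is logged.
            self.log.append("reassociating a column: " + alias)
        self.aliases[alias] = (kind, name)

    def plan(self, params):
        """Runtime: pick an access path and filters from the request parameters."""
        access = None
        filters = []
        for param in params:
            kind, name = self.aliases.get(param, (None, None))
            if kind == "filter":
                filters.append(name)
            elif kind is not None and access is None:
                access = (kind, name)
        if access is None or access[0] not in SCAN_KINDS:
            filters = []  # filters have no effect on one-row lookups
        return access, filters


ep = Endpoint()
ep.associate("id", "unique", "icon_id$unique")
ep.associate("id_range", "ordered", "icon_id")
ep.associate("max_age", "filter", "age <= max_age")
print(ep.plan({"id": "7", "max_age": "30"}))
# (('unique', 'icon_id$unique'), [])
print(ep.plan({"id_range": "7", "max_age": "30"}))
# (('ordered', 'icon_id'), ['age <= max_age'])
```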


