WL#1803: Prepared Statements: overview of tasks

Affects: Server-7.1 — Status: Assigned — Priority: Low

This is a meta worklog task for design flaws, bugs, feature requests in prepared
statements. It's split into two parts: outstanding and closed issues.
An issue may have or have not a separate worklog task, depending on its size.
If a task exists, a reference to it is present. All issues are described
shortly, in order to allow ease of maintenance: once a work on a problem 
is started, it's provided with an elaborate description.

-------------------------------------------------------------------------------
Table of contents.
-------------------------------------------------------------------------------
1. Outstanding issues.
1a. References to open worklog tasks.
2. Solved issues.
2a. References to closed worklog tasks and design documents.

-------------------------------------------------------------------------------
1. Outstanding issues.
-------------------------------------------------------------------------------
PROBLEM
Add a way to execute any statement using the prepared statements
API and get results back in the binary format without having to
prepare  the statement first. In other words, the benefits of the
binary protocol should be available without having to prepare the
query. This is WL#4627 "Prepared Statements: add C API call
mysql_stmt_execute_immediate()"

PROBLEM

Total list of SQL commands not supported in prepared statemnts mode is:

ALTER_DB, ANALYZE, ASSIGN_TO_KEYCACHE, BACKUP_TABLE, CHANGE_DB
CHANGE_MASTER, CHECK, CHECKSUM, COMMIT, CREATE_DB, CREATE_FUNCTION, 
CREATE_INDEX, DROP_DB, DROP_FUNCTION, DROP_INDEX, DROP_USER
FLUSH, GRANT, HA_CLOSE, HA_OPEN, HA_READ, HELP, KILL, LOAD
LOAD_MASTER_DATA, LOAD_MASTER_TABLE, PRELOAD_KEYS, PURGE,
PURGE_BEFORE, REPAIR, REPLACE_SELECT, RESET, REVOKE, REVOKE_ALL
SQLCOM_ROLLBACK, ROLLBACK_TO_SAVEPOINT, SAVEPOINT, SET_OPTION
SHOW_BINLOG_EVENTS, SHOW_BINLOGS, SHOW_CREATE, SHOW_ERRORS
SHOW_INNODB_STATUS, SHOW_MASTER_STAT, SHOW_NEW_MASTER, SHOW_SLAVE_HOSTS
SHOW_SLAVE_STAT, SHOW_WARNS, SLAVE_START, SLAVE_STOP, TRUNCATE, UPDATE_MULTI
RESTORE_TABLE

The problem is aggravated by the fact that most commands not supported
in prepared statements not function in stored procedures either.

This is WL#2871 Prepare any SQL

PROBLEM
We need to support named placeholders:
SELECT :hello, :world


PROBLEM:
- We send information about columns of result set to client only in case of
  SELECT statement, and now we do it twice: in reply to COM_PREPARE
  command, and in the beginning of reply to COM_EXECUTE command.
  Mark says: sending it early, that is in mysql_prepare, is good because
  in JDBC users want to know type of columns in result set at this stage.
  We need to clarify how our clients should work with the second metadata
  packet, and what to do in case when underlying tables were changed between
  prepare and execute calls. In case tables weren't changed significantly, we
  can re-prepare the query on the fly. Otherwise we have to send error to
  client.

  Georg notes: we dont' send result set metadata for SHOW commands at
  prepare stage.

DECISION:
SQL_SHOW_ commands must be fixed.
This issue also pops up in BUG#2812.

We must send metadata twice only in case it was changed between prepare
and execute (for example, if there was ALTER TABLE statement in between).
In case metadata was changed, we should issue a warning (this can be handled
by client library), because user may rely on old values (for
example, cache it in his application).
4.1-beta server implementation should simply always send metadata only in
reply to COM_PREPARE, but the client library should be able to make use of
metadata received in reply to COM_EXECUTE too.
If changes to underlying tables were incompatible with prepared statement,
we of course issue an error (this is done already).


PROBLEM:
- we need to explicitly state that currently mysql_fetch_column doesn't
  perform piece-by-piece LOB fetching and is just a stub for future
  extensions. Probably we should provide user with pointer to internal LOB
  storage to prevent excessive data copying.

  Georg says: DbSl requires that we're able to send blobs bigger than
  max_allowed_packet.

DECISION:
No decision was made on it.
We plan to support piecewise data fetching at some point.

PROBLEM:
(Array execution of statements)
- we need to have ability to bind array of columns (Mark Matthews,
  Georg Richter, Guenter Drach), example:

  stmt= mysql_stmt_prepare("insert into foo (id) values (?)");
  int values[100];
  bind.buffer= (char *) values;
  mysql_execute(stmt, 100); // all values are sent in one packet

  The same principle should be applicable for result set data.

DECISION:
MYSQL_BIND structure was designed with this in mind.
We need to extend it to support binding of arrays soon.
Monty proposed sample extensions to MYSQL_BIND structure and mysql_execute
call.
We added support in binary protocol for this item before MySQL 4.1.2 got out.


PROBLEM
We need to support placeholders inside specifiers:
SELECT _utf8 ?
To be discussed with PeterG and Monty.


PROBLEM:
For that we need to copy
mysql->status to stmt->status after each COM_EXECUTE on this statement.


PROBLEM
We need to check that all sequences of new API calls are SIGSEGV robust
and reasonable error messages are given in case of misuse of API.

PROBLEM:
VARBINARY datatype needs typecode in MySQL C API.
(Discussed with Monty on IRC and deferred).


PROBLEM (raw)
* ability for cursors and use_result to know data sizes before fetching long
  data its lenght.  Maybe give client access to internal buffer.

Otherwise for non-buffered or cursor-buffered fetch you need to somehow know
in advance size of buffers and preallocate them - or, alternatively,
you're stuck with mysql_stmt_fetch(zero buffers) -> mysql_stmt_fetch_column
combination.


PROBLEM
Prepared statements binlogging needs to be rewritten in more efficient
and extending-friendly manner.

PROBLEM
Client-server protocol wise, a prepared statement can have no more than
65535 placeholders (only two bytes are reserved for placeholder count).
This should be fixed by extending the protocol.

PROBLEM:
* for variable length types we need to be able to bind pointer to type
    length, instead of providing length by value - this is the only way
    values of variable length types can be reused without performing
    rebind between executions.

    Example:

    bind_array[position].buffer_type= MYSQL_TYPE_STRING;
    bind_array[position].buffer= (char *) value;
    bind_array[position].buffer_length= strlen(value);

    mysql_bind_param(stmt, bind_array);
    mysql_execute(stmt);

    /* value is changed and has a new string */
    /* oops, we need to rebind it
        because value length has chanded too! */

Monty explains:
This is why we have both 'length' and 'buffer_length' variables in MYSQL_BIND.
For character strings you should set length to the pointer, containing
length of the string, for fixed lenght types you can use buffer_length.
This explaination should be added to the manual.

TODO: add tests of this item into client_test.c

-------------------------------------------------------------------------------
1a. References to open WL tasks.
-------------------------------------------------------------------------------

Other open worklog tasks for prepared statements:
WL#945  -- Extensions to server/client protocol
WL#1055 -- Prepared statements - Shared Global Cache
WL#1085 -- Prepared Statements and the General Query Log
WL#2661 -- Prepared Statements: Dynamic SQL in Stored Procedures
WL#2793 --Prepared Statements: convenient syntax of SQL prepared statements for
use in Dynamic SQL

For the list of closed WL tasks, see the end of this document.

Note added by Trudy Pelzer, 2007-05-25
At the Dev-MT Offsite meeting in Santa Cruz, Brian and Monty 
made the following time estimate for WL#1569:
- prepared statements (WL#1569 and others); 4+ mths
* This is 4 tasks:
** WL#1569 "Prepared Statements: implement support of Query Cache"
** WL#? include all SQL
** WL#? update server for max ps statements
** WL#? make ps able to recreate itself if it doesn't exist
* Brian: Prepared statements aren't useful until this 4-part
task is complete.
* Current estimate: Update for qcache: 1 mth. Update client library 
to recreate ps if it goes missing: 1 week. Update server to know 
about max number of statements it remembers: 2 weeks. Include all 
sql: 2 months (this includes needed discussion but not the extra
time QA needs to adjust mysqltest for the new feature).

On May 25th, Kostja and Guilhem agreed that sufficient work has been 
done (fixing BUG#735 and BUG#26842) that WL#1569 can be considered
complete. The other 3 tasks that Monty and Brian determined should
be done to make prepared statements fully useful will still be 
deferred so this note has been added to this  WL#1803; it now contains
a complete list of work that still needs to be done to make prepared 
statements full-featured.

-----------------------------------------------------------------------------
2. Solved items.
-----------------------------------------------------------------------------
PROBLEM:
- we need to be able to bind output parameters of stored procedures in
prepared mode. BUG#17898
- stored procedures can produce several result set. We need to have support
for it in prepared mode.

DECISION:
Yes, we do (Saint-Petersbourg stored procedures session).


PROBLEM (SOLVED):
- We need to implement support of prepared statements in query cache.
DECISION:
This is WL#1569 "Query cache for prepared statements"

PROBLEM (SOLVED):
- mysql_stmt_bind_param(): why require is_null to be a pointer? why not use it
  as a boolean variable inside MYSQL_BIND structure?
Monty explains:
- if your data is always NULL, use MYSQL_TYPE_NULL to bind it.
- if your data is always NOT NULL, set is_null= (my_bool *) 0
- in all other cases you should provide pointer to a my_bool variable
and change value referenced by pointer between executions.
This explaination should be added to the manual.

(communicated to the docs team).

PROBLEM (SOLVED)
- we need to support placeholders in LIMIT clause
- there are other places in our grammar where we don't support placeholders
  unlike other RDBMS. One of them is explicitly requested by Guenther (SAP):
  LIKE ? ESCAPE ? clause.
DECISION:
 This is WL#1785 "Prepared statements: implement support for placeholders in
 LIMIT clause" which is in Konstantin's sprint now.

(WL#1785 is complete)


PROBLEM (SOLVED)
After Guilhem patch with implementation of SET NAMES commands family is
pushed we need to verify that character sets conversion works in prepared
statements.


PROBLEM (SOLVED)
There should be statistic variables for COM_PREPARE and COM_EXECUTE,
as there are for COM_QUERY.
See also:
https://intranet.mysql.com/secure/mailarchive/mail.php?folder=5&mail=28611

Solution: WL#2379 "Prepared Statements: add status variables", which
was implemented.

PROBLEM (SOLVED)
Add a variable that limits the total number of prepared statements in the
server. This is necessary to control statement leaks in application
and protect the server against DoS attack when statements are created
without being freed.

Solution: BUG#16365 which was fixed in 4.1.19 and 5.0.20.

PROBLEM (SOLVED)
Logging of prepared statements to the general query log should be extended.
Currently when a prepared statement is executed, the log gets only 'Com_execute'
message and nothing else (no statement id, text, placeholder values).
This is a security breach as the general query log is used for audit 
purposes.

An expanded query is now always written to the general and slow logs.
(WL#1085)


PROBLEM (SOLVED)
Client-server protocol wise, mysql_stmt_prepare() has no means to report
warnings. This should be fixed by extending the protocol.

Implemented by Monty in 5.0 by extending COM_STMT_PREPARE reply packet.

PROBLEM (SOLVED)
- logging of prepare and execute commands was added to the general
  query log (--log) in 4.1.9 and 5.0.3.
  This is mainly needed for audit and performance monitoring purposes.

PROBLEM (SOLVED):
We need to report truncations when data is converted from binary format
to client side buffers by the client library.
VERSION: 5.0

Conversion takes place when a value of one type is saved in a buffer
of another type. For any result set column a user can provide
buffer of any host language type, hence the client library has to support
any conversion sequence.
Possible host language buffer types are: unsigned and signed char, short, 
int, long, long long; float and double; char * and MYSQL_TIME.
Possible column types include all columns types allowed by the server,
where each column has its' native representation in a buffer of 
host language type. For example for MYSQL_TYPE_LONG the native
representation is a 32-bit signed integer.
When the client side buffer doesn't strictly correspond to the native type
of a column, a truncation is possible.
Reasons of truncation are:
- data is out of range of the target type. E.g. on attempt to save a long 
number in 16-bit integer buffer
- loss of precision when saving data. E.g. on attempt to save a long long 
value in a buffer of type 'float'.
- data is not conversion compatible. E.g. on attempt to save a string
in a numeric buffer, or convert a date to time.
- overflow or underflow when converting a string representation of 
a number to its binary representation.
All these truncations should be reported by the client library.
Note: the current exception is string -> floating point conversion, 
where we have no means to track down underflow or overflow.

Implementation.

MYSQL_BIND structure, which is used by the user to describe target buffers
of result set data, should be extended with my_bool *error member.
If at least one error pointer of the mysql_stmt_bind_result bind list is not 
0, the client library should report truncations happened during row fetch.
Reporting is performed from the client API call mysql_stmt_fetch(),
which gets additional return value MYSQL_DATA_TRUNCATED.
If no 'error' buffer of the bind list is set, mysql_stmt_fetch() behaves as
before (which is: returns 0 for successful fetch, MYSQL_NO_DATA for
 after-last position, and 1 otherwise).

UPDATE: The patch for this item has been submitted for review.
bk commit - 4.1 tree (konstantin:1.2154)
UPDATE: Pushed into 5.0 tree, is available since 5.0.3

PROBLEM (SOLVED)
We need call 'mysql_stmt_field_count'. 
Added in 4.1.3,  bk commit - 4.1 tree (konstantin:1.1895)

PROBLEM (SOLVED):
- some calls of new API have mysql_ prefix, while some have mysql_stmt_,
  The reason is that there are similar calls in the old API, but operating on
  the entire connection. There is no easy way to remember prefix for a call,
  and  now you have to look at docs each time to find out if a call doesn't
  have '_stmt_' in its name.  What if we rename all calls to have mysql_stmt_
  prefix? This will make clear distinction between old and new APIs and make
  all calls of new API easy to remember.
  Brian says: we probably should call mysql_fetch mysql_cursor_fetch, because
  in future it will be used to fetch from cursor.
  Georg says: lots of drivers already use our API.

DECISION:
We will rename all new calls to have mysql_stmt_ prefix, including
mysql_fetch. Konstantin will make patch for source and documentation trees
ASAP.
This was done and pushed into 4.1.2

PROBLEM (SOLVED):
- we need mysql_stmt_execute_direct() - with that call we can claim that the
  new prepared statement API is complete replacement of old API. This is a
  replacement of mysql_real_query, with such advantages as usage of binary
  protocol to send data to the client, and ability to supply output bind
  parameters without calling mysql_prepare.

  The problem is that with this call we still need to be able to set execution
  flags before statement execution, to open cursors. An example:

   stmt= mysql_prepare(mysql, query, query_length);
   mysql_stmt_set_attr(stmt, MYSQL_STMT_OPEN_CURSOR);
   stmt= mysql_execute(stmt);

   (prepare/execute/fetch of this and other statements)
   with mysql_stmt_execute_direct we don't have place to set statement
   attributes.

DECISION:
We split mysql_prepare call in two:
stmt= mysql_stmt_init(mysql);
mysql_stmt_prepare(stmt, query, query_length);
This was done and pushed into MySQL 4.1.2

PROBLEM (SOLVED):
- we need mysql_stmt_insert_id  (Georg Richter: we need to copy
  all data of 'OK' packet to the statement).
This was done and pushed into 4.1.2.
However better support in the client library is needed, this is not
yet described anywhere, but mysql_stmt_insert_id doesn't work as documented
and does not always behave in the same way as mysql_insert_id.

PROBLEM (SOLVED):
- we need to send types of placeholders to the client
  in reply to COM_PREPARE command. This is a requirement to be able to do
  strict type checking on client side. (Sun request)

  Georg Richter says: in PHP (probably also in Perl) we have the same problem
  - we have to specify types of marker bound variables explicitly.
  ODBC has calls to access this data too.

DECISION: we need to reserve place for this data in the protocol before
MySQL 4.1 goes beta. At the moment we will just send MYSQL_TYPE_UNKNOWN
for all placeholders.
It was verified that current client library will malfunction or get SIGSEGV if
server replies with packet in new format. The library needs to be fixed.

Support for this was added to the binary protocol. However types
of placeholders are not tracked now.

PROBLEM (SOLVED):
  * we should check for missing C types we can bind via mysql_bind_param:
    unsigned types and long double can't be bound without truncation at
    the moment.
    If unsigned types can be bound as if they were the same as their signed
    counterparts, it should be stated in the manual.

DECISION:

No decision's been made for 'long double' type yet.
For all unsigned types we added 'is_unsigned' member to MYSQL_BIND
structure.

PROBLEM (SOLVED):
  * do we really need to call mysql_stmt_reset just to reset long data
    state of the statement on server side? If we don't need
    mysql_stmt_reset for long data, do we need it at all, because if an
    error has happened on server side, we can just force the user to
    close this statement and create a new one?

DECISION:
We should not require calling mysql_stmt_reset just to reset long
data state of placeholders.
If long data for one of placeholders wasn't supplied, its values will
be taken from MYSQL_BIND array.
To implement it we will ensure that calls 'mysql_stmt_execute' and
'mysql_stmt_reset' reset long data state of placeholders on both client
and server sides.
We should keep mysql_stmt_reset because it's purpose is more generic
than to reset long data errors, though there is only this use of it
at the moment.
We should require mysql_stmt_reset to always send reply to the client.

All of the above was done.


- mysql_send_long_data:

PROBLEM:
  * we don't send any reply to this command. In case of error the report will
    be deferred till mysql_execute. This is not very convenient.

DECISION:
  We won't change it. Reply to each packet slow downs communication too much
  in case when packet size is small. Monty says: common size of packet in ODBC
  is 1K.


PROBLEM:
  * mysql_send_long_data should be able to send long
    data for any placeholder, if placeholder type is compatible with long
    data. An example to clarify the problem:

    INSERT INTO two_blobs_table (id, lob1, lob2) VALUES (?, ?, ?);

    In this case it should be possible to send long data both for lob1 and
    lob2.

DECISION:
Yes, it should be possible already. If it is not, it's a bug.

It was verified that it's possible.


PROBLEM:
  * mysql_stmt_send_long_data doesn't support character set conversions
    and doesn't have/make distinction between BLOBS and CLOBS.

DECISION:
This is a bug. We need to make conversion of long data to the server character
set at execution time. To distinguish BLOBs and CLOBs we will use parameter
type value sent to server in MYSQL_BIND array.
When discussing this item it was noted, that implementation of
logging of prepared statements is very inefficient and should
be rewritten.

-------------------------------------------------------------------------------
2a. References to closed worklog tasks and design documents.
-------------------------------------------------------------------------------

A very early spec for prepared statements on Wiki:
https://intranet.mysql.com/secure/wiki/Client-Server-Protocol-Enhancement

There also exists several other WL entries for prepared statements:
WL#1564 -- Intensive test of prepared statements via 'mysql' [Client-Sprint]
 - complete
WL#1014 -- Port client_test.c to normal test library [Server-Sprint]
(Matthias) - complete
WL#1622 -- SQL Syntax for Prepared Statements [Server-Sprint]
(SergeyP) - complete
WL#1781   -- Testing and profiling prepared statements [Benchmarks-Sprint]
(Walrus)
WL#1569 -- Prepared statements and query cache

WL#4435: Support OUT-parameters in prepared statements
BUG#17898 No straightforward way to deal with output parameters

You must be logged in to tag this worklog

No Comments yet

Votes

Not yet rated.
You must be logged in to vote.

Watches

1 members are watching this worklog
You must be logged in to track this worklog.

Provide Feedback

Please note:
HTML will be purified, but we allow for a number of HTML tags so that you have the flexibility to decorate your comment text to some extent. The comments allow the following HTML tags:

strong, b, em, blockquote, a, code, pre

To put code into your comment, simply encapsulate your code with
[code language="XXX"][/code], where XXX is any common language, for instance "PHP", "SQL", "C", etc.



You must be logged in to comment