MySQL Internals Algorithms
← Back to MySQL Internals overview page
[edit] Important Algorithms and Structures
MySQL uses many different algorithms and structures. This chapter tries to describe some of them.
[edit] The Item Class
To us, the word Item means more than just thingamabob; it is a technical term with a precise definition in the context of our source code. Item is a class. Each instance of the Item class has:
- an analogue in the SQL language
- a value
- a data type descriptor
All of the following SQL thingamabobs are modeled in the Item class:
- literals
- column references
- session or global variables
- procedure variables
- parameters
- SQL functions (not a surprise since SQL functions have data types and return values).
In the function category we include operators such as + and ||, because operators are merely functions that return values. We also include operators such as = and LIKE, which are operators that return boolean values. Consider the following statement:
SELECT UPPER(column1) FROM t WHERE column2 = @x;
For this statement, MySQL will need to store a list of items for the select list ('column1' column reference and UPPER function), and a list of items for the WHERE clause ('column2' column reference and '@x' variable and '=' operator).
Terminology: an Item instance in a MySQL program roughly corresponds to a "site", which according to the standard_SQL definition is "a place that holds an instance of a value of a specified data type", Another word that you'll see often in MySQL code is "field", which means column reference, and the Item_field subclass is generally for column values that occur for the intersection of a row and column in a table.
MySQL's Item class is defined in .../sql/item.h, and its subclasses are defined in .../sql/item*.h (that is, in item.h, item_cmpfunc.h, item_func.h, item_geofunc.h, item_row.h, item_strfunc.h, item_subselect.h, item_sum.h, item_timefunc.h). Page-width limitations prevent us from displaying the whole tree, but these are the main Item subclasses, and the subclasses of the subclasses:
Item_ident (Item_field, Item_ref) Item_null Item_num (Item_int, Item_real) Item_param Item_string (Item_static_string_func, Item_datetime, Item_empty_string) Item_hex_string (Item_bin_string) Item_result_field (all "item_func.h" "item_subselect.h" "item_sub.h" classes) Item_copy_string Item_cache (Item_cache_int, Item_cache_real, Item_cache_str, Item_cache_row) Item_type_holder Item_row
There's no formal classification of subclasses, but the main distinctions are by use (field, parameter, function) and by data type (num, string).
So, how does MySQL use items? You'll find that nearly every .cc program in the /sql directory makes some use of the Item class and its subclasses, so this list of programs is only partial and very general:
sql_parse.cc: Makes new items in add_field_to_list() item_sum.cc: Uses item_func subclasses for COUNT, AVG, SUM item_buff.cc: Where buffers for item values can be stored item_cmpfunc.cc: Comparison functions with item_func subclasses item_create.cc For creating items that the lex might use item_subselect.cc Subqueries are another type of function mysqld.cc: When main() ends, it uses clean_up() for items opt_range.cc: Uses field, compare-condition, and value subclasses procedure.cc: Notice Procedure * has a pointer to an item list protocol.cc: Uses send_fields() to pass item values back to users sys_var.cc: System variables have Item associations too sql_base.cc: Thread-specific Item searchers like find_field_in_table() sql_class.cc: Look at cleanup_after_query() sql_delete.cc This (like sql_insert.cc etc.) has field references sql_error.cc Has one of many examples of SHOW's use of items sql_lex.cc: Notice "add...to_list" functions sql_select.cc The largest program that uses items, apparently udf_example.cc The comments in this program are extensive
Whenever there's a need for an SQL operation that assigns, compares, aggregates, accepts, sends, or validates a site, you'll find a MySQL use of Item and its subclasses.
[edit] How MySQL Does Sorting (filesort)
MySQL has two filesort algorithms for sorting and retrieving results. The original method uses only the ORDER BY columns. The modified method uses not just the ORDER BY columns, but all the columns used in the query.
The optimizer selects which filesort algorithm to use. Prior to MySQL 4.1, it uses the original algorithm. As of MySQL 4.1, it normally uses the modified algorithm except when BLOB or TEXT columns are involved, in which case it uses the original algorithm.
The original filesort algorithm works as follows:
- Read all rows according to key or by table scanning. Rows that do not match the
WHEREclause are skipped. - For each row, store a pair of values in a buffer (the sort key and the row pointer). The size of the buffer is the value of the
sort_buffer_sizesystem variable. - When the buffer gets full, run a qsort (quicksort) on it and store the result in a temporary file. Save a pointer to the sorted block. (If all pairs fit into the sort buffer, no temporary file is created.)
- Repeat the preceding steps until all rows have been read.
- Do a multi-merge of up to
MERGEBUFF(7) regions to one block in another temporary file. Repeat until all blocks from the first file are in the second file. - Repeat the following until there are fewer than
MERGEBUFF2(15) blocks left. - On the last multi-merge, only the pointer to the row (the last part of the sort key) is written to a result file.
- Read the rows in sorted order by using the row pointers in the result file. To optimize this, we read in a big block of row pointers, sort them, and use them to read the rows in sorted order into a row buffer. The size of the buffer is the value of the
read_rnd_buffer_sizesystem variable. The code for this step is in thesql/records.ccsource file.
One problem with this approach is that it reads rows twice: One time when evaluating the WHERE clause, and again after sorting the pair values. And even if the rows were accessed successively the first time (for example, if a table scan is done), the second time they are accessed randomly. (The sort keys are ordered, but the row positions are not.)
The modified filesort algorithm incorporates an optimization such that it records not only the sort key value and row position, but also the columns required for the query. This avoids reading the rows twice. The modified filesort algorithm works like this:
- Read the rows that match the
WHEREclause. - For each row, record a tuple of values consisting of the sort key value and row position, and also the columns required for the query.
- Sort the tuples by sort key value
- Retrieve the rows in sorted order, but read the required columns directly from the sorted tuples rather than by accessing the table a second time.
Using the modified filesort algorithm, the tuples are longer than the pairs used in the original method, and fewer of them fit in the sort buffer (the size of which is given by sort_buffer_size). As a result, it is possible for the extra I/O to make the modified approach slower, not faster. To avoid a slowdown, the optimization is used only if the total size of the extra columns in the sort tuple does not exceed the value of the max_length_for_sort_data system variable. (A symptom of setting the value of this variable too high is that you should see high disk activity and low CPU activity.)
[edit] Bulk Insert
The logic behind bulk insert optimization is simple.
Instead of writing each key value to B-tree (that is, to the key cache, although the bulk insert code doesn't know about the key cache), we store keys in a balanced binary (red-black) tree, in memory. When this tree reaches its memory limit, we write all keys to disk (to key cache, that is). But since the key stream coming from the binary tree is already sorted, inserting goes much faster, all the necessary pages are already in cache, disk access is minimized, and so forth.
[edit] How MySQL Does Caching
MySQL has the following caches. (Note that the some of the filenames contain an incorrect spelling of the word cache.)
- Key Cache
A shared cache for all B-tree index blocks in the different NISAM files. Uses hashing and reverse linked lists for quick caching of the most recently used blocks and quick flushing of changed entries for a specific table. (mysys/mf_keycash.c)
- Record Cache
This is used for quick scanning of all records in a table. (mysys/mf_iocash.c and isam/_cash.c)
- Table Cache
This holds the most recently used tables. (sql/sql_base.cc)
- Hostname Cache
For quick lookup (with reverse name resolving). This is a must when you have a slow DNS. (sql/hostname.cc)
- Privilege Cache
To allow quick change between databases, the last used privileges are cached for each user/database combination. (sql/sql_acl.cc)
- Heap Table Cache
Many uses of GROUP BY or DISTINCT cache all found rows in a HEAP table. (This is a very quick in-memory table with hash index.)
- Join Buffer Cache
For every full join in a SELECT statement the rows found are cached in a join cache. (A full join here means there were no keys that could be used to find rows for the next table in the list.) In the worst case, one SELECT query can use many join caches.
[edit] How MySQL Uses the Join Buffer Cache
Basic information about the join buffer cache:
- The size of each join buffer is determined by the value of the
join_buffer_sizesystem variable. - This buffer is used only when the join is of type
ALLorindex(in other words, when no possible keys can be used). - A join buffer is never allocated for the first non-const table, even if it would be of type
ALLorindex. - The buffer is allocated when we need to do a full join between two tables, and freed after the query is done.
- Accepted row combinations of tables before the
ALL/indexare stored in the cache and are used to compare against each read row in theALLtable. - We only store the used columns in the join buffer, not the whole rows.
Assume you have the following join:
Table name Type t1 range t2 ref t3 ALL
The join is then done as follows:
- While rows in t1 matching range
- Read through all rows in t2 according to reference key
- Store used fields from t1, t2 in cache
- If cache is full
- Read through all rows in t3
- Compare t3 row against all t1, t2 combinations in cache
- If row satisfies join condition, send it to client
- Empty cache
- Read through all rows in t3
- Compare t3 row against all stored t1, t2 combinations in cache
- If row satisfies join condition, send it to client
The preceding description means that the number of times table t3 is scanned is determined as follows:
S = size-of-stored-row(t1,t2) C = accepted-row-combinations(t1,t2) scans = (S * C)/join_buffer_size + 1
Some conclusions:
- The larger the value of
join_buffer_size, the fewer the scans oft3. Ifjoin_buffer_sizeis already large enough to hold all previous row combinations, there is no speed to be gained by making it larger. - If there are several tables of join type
ALLorindex, then we allocate one buffer of sizejoin_buffer_sizefor each of them and use the same algorithm described above to handle it. (In other words, we store the same row combination several times into different buffers.)
A slightly more user-friendly explanation of join buffering is available here: http://s.petrunia.net/blog/?p=18
[edit] How MySQL Handles FLUSH TABLES
-
FLUSH TABLESis handled insql/sql_base.cc::close_cached_tables(). - The idea of
FLUSH TABLESis to force all tables to be closed. This is mainly to ensure that if someone adds a new table outside of MySQL (for example, by copying files into a database directory withcp), all threads will start using the new table. This will also ensure that all table changes are flushed to disk (but of course not as optimally as simply calling a sync for all tables)! - When you do a
FLUSH TABLES, the variablerefresh_versionis incremented. Every time a thread releases a table, it checks if the refresh version of the table (updated at open) is the same as the currentrefresh_version. If not, it will close it and broadcast a signal onCOND_refresh(to await any thread that is waiting for all instances of a table to be closed). - The current
refresh_versionis also compared to the openrefresh_versionafter a thread gets a lock on a table. If the refresh version is different, the thread will free all locks, reopen the table and try to get the locks again. This is just to quickly get all tables to use the newest version. This is handled bysql/lock.cc::mysql_lock_tables()andsql/sql_base.cc::wait_for_tables(). - When all tables have been closed,
FLUSH TABLESreturns an okay to the client. - If the thread that is doing
FLUSH TABLEShas a lock on some tables, it will first close the locked tables, then wait until all other threads have also closed them, and then reopen them and get the locks. After this it will give other threads a chance to open the same tables.
[edit] Full-text Search
MySQL uses Ranking with Vector Spaces for ordinary full-text queries.
Rank, also known as relevance rank, also known as relevance measure, is a number that tells us how good a match is.
Vector Space, which MySQL sometimes calls "natural language", is a well-known system based on a metaphor of lines that stretch in different dimensions (one dimension per term) for varying distances (one distance unit per occurrence of term). The value of thinking of it this way is: once you realize that term occurrences are lines in a multi-dimensional space, you can apply basic trigonometry to calculate "distances", and those distances are equatable with similarity measurements. A comprehensible discussion of vector space technology is here: http://www.miislita.com/term-vector/term-vector-1.html. And a text which partly inspired our original developer is here: ftp://ftp.cs.cornell.edu/pub/smart/smart.11.0.tar.Z ("SMART").
But let's try to describe the classic formula:
w = tf * idf
This means "weight equals term frequency times inverse of document frequency", or "increase weight for number of times term appears in one document, decrease weight for number of documents the term appears in". (For historical reasons we're using the word "weight" instead of "distance", and we're using the information-retrieval word "document" throughout; when you see it, think of "the indexed part of the row".)
For example: if "rain" appears three times in row #5, weight goes up; but if "rain" also appears in 1000 other documents, weight goes down.
MySQL uses a variant of the classic formula, and adds on some calculations for "the normalization factor". In the end, MySQL's formula looks something like:
w = (log(dtf)+1)/sumdtf * U/(1+0.0115*U) * log((N-nf)/nf)
Where:
dtf is the number of times the term appears in the document sumdtf is the sum of (log(dtf)+1)'s for all terms in the same document U is the number of Unique terms in the document N is the total number of documents nf is the number of documents that contain the term
The formula has three parts: base part, normalization factor, global multiplier.
The base part is the left of the formula, "(log(dtf)+1)/sumdtf".
The normalization factor is the middle part of the formula. The idea of normalization is: if a document is shorter than average length then weight goes up, if it's average length then weight stays the same, if it's longer than average length then weight goes down. We're using a pivoted unique normalization factor. For the theory and justification, see the paper "Pivoted Document Length Normalization" by Amit Singhal and Chris Buckley and Mandar Mitra ACM SIGIR'96, 21-29, 1996: http://ir.iit.edu/~dagr/cs529/files/handouts/singhal96pivoted.pdf. The word "unique" here means that our measure of document length is based on the unique terms in the document. We chose 0.0115 as the pivot value, it's PIVOT_VAL in the MySQL source code header file myisam/ftdefs.h.
If we multiply the base part times the normalization factor, we have the term weight. The term weight is what MySQL stores in the index.
The global multiplier is the final part of the formula. In the classic Vector Space formula, the final part would be the inverse document frequency, or simply
log(N/nf)
We have replaced it with
log((N-nf)/nf)
This variant is more often used in "probabilistic" formulas. Such formulas try to make a better guess of the probability that a term will be relevant. To go back to the old system, look in myisam/ftdefs.h for "#define GWS_IN_USE GWS_PROB" (i.e. global weights by probability) and change it to "#define GWS_IN_USE GWS_IDF" (i.e. global weights by inverse document frequency).
Then, when retrieving, the rank is the product of the weight and the frequency of the word in the query:
R = w * qf;
Where:
w is the weight (as always) qf is the number of times the term appears in the query
In vector-space speak, the similarity is the product of the vectors.
And R is the floating-point number that you see if you say: SELECT MATCH(...) AGAINST (...) FROM t.
To sum it up, w, which stands for weight, goes up if the term occurs more often in a row, goes down if the term occurs in many rows, goes up / down depending whether the number of unique words in a row is fewer / more than average. Then R, which stands for either Rank or Relevance, is w times the frequency of the term in the AGAINST expression.
The Simplest Possible Example
First, make a fulltext index. Follow the instructions in the "MySQL Full-Text Functions" section of the MySQL Reference Manual. Succinctly, the statements are:
CREATE TABLE articles (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
title VARCHAR(200),
body TEXT,
FULLTEXT (title,body)
);
INSERT INTO articles (title,body) VALUES
('MySQL Tutorial','DBMS stands for DataBase ...'),
('How To Use MySQL Well','After you went through a ...'),
('Optimizing MySQL','In this tutorial we will show ...'),
('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
('MySQL vs. YourSQL','In the following database comparison ...'),
('MySQL Security','When configured properly, MySQL ...');
Now, let's look at the index.
There's a utility for looking at the fulltext index keys and their weights. The source code is myisam/myisam_ftdump.c, and the executable comes with the binary distribution. So, if exedir is where the executable is, and datadir is the directory name that you get with "SHOW VARIABLES LIKE 'datadir%'", and dbname is the name of the database that contains the articles table, then this works:
>/exedir/myisam_ftdump /datadir/dbname/articles 1 -d
b8 0.9456265 1001
f8 0.9560229 comparison
140 0.8148246 configured
0 0.9456265 database
f8 0.9560229 database
0 0.9456265 dbms
0 0.9456265 mysql
38 0.9886308 mysql
78 0.9560229 mysql
b8 0.9456265 mysql
f8 0.9560229 mysql
140 1.3796179 mysql
b8 0.9456265 mysqld
78 0.9560229 optimizing
140 0.8148246 properly
b8 0.9456265 root
140 0.8148246 security
78 0.9560229 show
0 0.9456265 stands
b8 0.9456265 tricks
0 0.9456265 tutorial
78 0.9560229 tutorial
f8 0.9560229 yoursql
Let's see how one of these numbers relates to the formula.
The term 'tutorial' appears in document 0. The full document is "MySQL Tutorial / DBMS stands for DataBase ...". The word "tutorial" appears once in the document, so dtf = The word "for" is a stopword, so there are only 5 unique terms in the document ("mysql", "tutorial", "dbms", "stands", "database"), so U = Each of these terms appears once in the document, so sumdtf is the sum of log(1)+1, five times. So, taking the first two parts of the formula (the term weight), we have:
(log(dtf)+1)/sumdtf * U/(1+0.0115*U)
which is
(log(1)+1)/( (log(1)+1)*5) * 5/(1+0.0115*5)
which is
0.9456265
which is what myisam_ftdump says. So the term weight looks good.
Now, what about the global multiplier? Well, myisam_ftdump could calculate it, but you'll see it with the mysql client. The total number of rows in the articles table is 6, so N = And "tutorial" occurs in two rows, in row 0 and in row 78, so nf = So, taking the final (global multiplier) part of the formula, we have:
log((N-nf)/nf)
which is
log((6-2)/2)
which is
0.6931472
So what would we get for row 0 with a search for 'tutorial'? Well, first we want w, so: Multiply the term weight of tutorial (which is 0.9456265) times the global multiplier (which is 0.6931472). Then we want R, so: Multiply w times the number of times that the word 'tutorial' appears in the search (which is 1). In other words, R = Here's the proof:
mysql>select round(0.9456265 * 0.6931472 * 1, 7) as R;+-----------+ | R | +-----------+ | 0.6554583 | +-----------+ 1 row in set (0.00 sec) mysql>select round(match(title,body) against ('tutorial'), 7) as R->from articles limit 1;+-----------+ | R | +-----------+ | 0.6554583 | +-----------+ 1 row in set (0.00 sec)
You'll need memory
The MySQL experience is that many users appreciate the full-text precision or recall, that is, the rows that MySQL returns are relevant and the rows that MySQL misses are rare, in the judgment of some real people. That means that the weighting formula is probably justifiable for most occasions. Since it's the product of lengthy academic research, that's understandable.
On the other hand, there are occasional complaints about speed. Here, the tricky part is that the formula depends on global factors -- specifically N (the number of documents) and nf (the number of documents that contain the term). Every time that insert/update/delete occurs for any row in the table, these global weight factors change for all rows in the table.
If MySQL was a search engine and there was no need to update in real time, this tricky part wouldn't matter. With occasional batch runs that redo the whole index, the global factors can be stored in the index. Search speed declines as the number of rows increases, but search engines work.
However, MySQL is a DBMS. So when updates happen, users expect the results to be visible immediately. It would take too long to replace the weights for all keys in the fulltext index, for every single update/insert/delete. So MySQL only stores the local factors in the index. The global factors are more dynamic. So MySQL stores an in-memory binary tree of the keys. Using this tree, MySQL can calculate the count of matching rows with reasonable speed. But speed declines logarithmically as the number of terms increases.
Weighting in boolean mode
The basic idea is as follows: In an expression of the form A or B or (C and D and E), either A or B alone is enough to match the whole expression, whereas C, D, and E should all match. So it's reasonable to assign weight 1 to each of A, B, and (C and D and E). Furthermore, C, D, and E each should get a weight of 1/3.
Things become more complicated when considering boolean operators, as used in MySQL full-text boolean searching. Obviously, +A +B should be treated as A and B, and A B - as A or B. The problem is that +A B can not be rewritten in and/or terms (that's the reason why thisextendedset of operators was chosen). Still, approximations can be used. +A B C can be approximated as A or (A and (B or C)) or as A or (A and B) or (A and C) or (A and B and C). Applying the above logic (and omitting mathematical transformations and normalization) one gets that for +A_1 +A_2 ... +A_N B_1 B_2 ... B_M the weights should be: A_i = N, B_j=1 if N==0, and, otherwise, in the first rewriting approach B_j = B_j = (1+(M-1)*2^M)/(M*(2^(M+1)-1)).
The second expression gives a somewhat steeper increase in total weight as number of matched B_j values increases, because it assigns higher weights to individual B_j values. Also, the first expression is much simpler, so it is the first one that is implemented in MySQL.
[edit] FLOAT and DOUBLE data types and their representation.
The MySQL Reference Manual has a discussion of floating-point numbers in Section 11.2 Numeric Types, including details about the storage. Let us now take up the story from where the MySQL Reference Manual leaves off.
The following discussion concentrates on the case where no display width and decimals are given. This means that FLOAT is stored as whatever the C type float is and REAL or DOUBLE [PRECISION] is stored as whatever the C type double is. The field length is selected by the MySQL code.
This document was created when [http://bugs.mysql.com/4457 Bug#4457] (Different results in SQL-Statements for the same record) was fixed at the end of August 2004. Until then there was some confusion in the double-to-string conversion at different places in the code.
The bugfix for [http://bugs.mysql.com/4937 Bug#4937] (INSERT + SELECT + UNION ALL + DATE to VARCHAR(8) conversion problem) produced a conversion function which was a promising approach to the conversion problems. Unfortunately it was only used for direct field conversions and not for function results etc. It did not take small numbers (absolute value less than 1) and negative numbers into account. It did not take the limited precision of float and double data types into account. The bugfix was developed in two steps: The first attempt looked like this (in principle):
length= sprintf(buff, "%.*g", field_length, nr); if (length > field_length) length= sprintf(buff, "%.*g", field_length-5, nr);
If the libc conversion produces too many characters, the precision is reduced by the space required for the scientific notation (1.234e+05). Thus the printf() conversion is forced to switch to the scientific notation, since the value would not fit otherwise. Or, if it was scientific already, the precision is reduced and also uses less space. I left out some important stuff around limit checking just to show the idea. This simple algorithm should work quite well in most cases, but has been discarded for the sake of performance. The double call to the slow printf() conversion %g didn't seem reasonable, though it would only be used for extreme values and small fields. During my explorations of the code I didn't find places where float or double were to be converted into small fields. Remember that I talk only of conversions where field length and precision are not given. In this case a sufficient field length is selected at several places, except for a bug where it was selected wrongly. If a field length is given, a different conversion is used anyway. But since the code is quite complex, I don't claim to grasp it in full, and therefore may be in error. So let us look further:
The second attempt to fix the bug looked like this:
bool use_scientific_notation=TRUE;
if (field_length < 32 && nr > 1)
{
double e[]={1, 1e1, 1e2, 1e4, 1e8, 1e16 }, p=1;
for (int i=sizeof(e), j=1<<i-- ; j; i--, j>>=1 )
{
if (field_length & j)
p*=e[i];
}
use_scientific_notation=(p < nr);
}
length= sprintf(buff, "%.*g", use_scientific_notation ?
field_length-5 : field_length, nr);
Here we evaluate if the string representation of a given number fits into field_length characters. If not, we reduce the precision to make it fit. Again, I left out important details. For example, the evaluation is done only once per field for the sake of performance. The downside here is the unconditional reduction of precision for field length > 31 (which doesn't really matter), for negative numbers and for small numbers (absolute value less than 1).
Both algorithms do not take the limited precision of float and double values into account. This could lead to conversions with ridiculous bogus precision output. For example a value of 0.7 converted with %.30g will give a lot of digits, which pretend to tell about deviations from the value 0.7 and are completely absurd: 0.699999988079071044921875. To understand more about the %g conversion, I quote from a comment introduced in the source at the beginning of bugfixing #4937 (this comment was removed because it mainly describes, how the printf() conversion works, but I think it's valuable enough to include it here):
/*
Let's try to pretty print a floating point number. Here we use
'%-*.*g' conversion string:
'-' stands for right-padding with spaces, if such padding will take
place
'*' is a placeholder for the first argument, field_length, and
signifies minimal width of result string. If result is less than
field length it will be space-padded. Note, however, that we'll not
pass spaces to Field_string::store(const char *, ...), due to
strcend in the next line.
'.*' is a placeholder for DBL_DIG and defines maximum number of
significant digits in the result string. DBL_DIG is a hardware
specific C define for maximum number of decimal digits of a floating
point number, such that rounding to hardware floating point
representation and back to decimal will not lead to loss of
precision. That is: if DBL_DIG is 15, number 123456789111315 can be
represented as double without precision loss. As one can judge from
this description, choosing DBL_DIG here is questionable, especially
because it introduces a system dependency.
'g' means that conversion will use [-]ddd.ddd (conventional) style,
and fall back to [-]d.ddde[+|i]ddd (scientific) style if there is not
enough space for all digits.
Maximum length of result string (not counting spaces) is (I guess)
DBL_DIG + 8, where 8 is 1 for sign, 1 for decimal point, 1 for
exponent sign, 1 for exponent, and 4 for exponent value.
XXX: why do we use space-padding and trim spaces in the next line?
*/
sprintf(to,"%-*.*g",(int) field_length,DBL_DIG,nr);
to=strcend(to,' ');
There is one small misapprehension in the comment. %g does not switch to scientific notation when there is 'not enough space for all digits'. As the commentator says, the field length gives the minimal output length. printf() happily outputs more characters if required to produce a result with 'precision' digits. In fact it switches to scientific when the value can no longer be represented by 'precision' digits in conventional notation. The man page says "Style e is used if the exponent from its conversion is less than -4 or greater than or equal to the precision." In explanation, a precision of 3 digits can print a value of 345 in conventional notation, but 3456 needs scientific notation, as it would require 4 digits (a precision of 4) in conventional notation. Thus, it is printed as 3.46e+03 (rounded).
Since we don't want spaces in the output, we should not give a field length, but always use "%.*g". However, the precision matters, as seen above. It is worth its own paragraph.
Since MySQL uses the machine-dependent binary representation of float and double to store values in the database, we have to care about these. Today, most systems use the IEEE standard 754 for binary floating-point arithmetic. It describes a representation for single precision numbers as 1 bit for sign, 8 bits for biased exponent and 23 bits for fraction and for double precision numbers as 1-bit sign, 11-bit biased exponent and 52-bit fraction. However, we can not rely on the fact that every system uses this representation. Luckily, the ISO C standard requires the standard C library to have a header float.h that describes some details of the floating point representation on a machine. The comment above describes the value DBL_DIG. There is an equivalent value FLT_DIG for the C data type float.
So, whenever we print a floating-point value, we must not specify a precision above DBL_DIG or FLT_DIG respectively. Otherwise we produce a bogus precision, which is wrong. For the honor of the writer of the first attempt above, I must say that his complete algorithm took DBL_DIG into account, if however only for the second call to sprintf(). But FLT_DIG has never been accounted for. At the conversion section of the code, it was not even known whether the value came from a float or double field.
My attempt to solve the problems tries to take all this into account. I tried to concentrate all float/double-to-string conversions in one function, and to bring the knowledge about float versus double to this function wherever it is called. This solution managed to keep the test suite happy while solving the new problem of [http://bugs.mysql.com/4457 Bug#4457]. Luckily the first problem was not big, as the test cases have been very carefully selected, so that they succeed as long as the machine uses IEEE 754.
Nevertheless, the function is still not perfect. It is not possible to guess how many significant digits a number has. Given that, it is not simple to tell how long the resulting string would be. This applies to numbers with an absolute value smaller then 1. There are probably ways to figure this out, but I doubt that we would win in terms of performance over the simple solution of the first attempt, and besides we might cause new bugs. The compromise taken here is to accept that the resulting string may exceed the destination field length by five characters in the worst case.
if (nr < 0.0)
{
abs_nr= -nr;
extra_space= 1;
}
else
{
abs_nr= nr;
extra_space= 0;
}
precision= is_float ? FLT_DIG : DBL_DIG;
if (precision > field_length)
precision= field_length;
if (! initialized)
{
/* Better switch to scientific too early than too late. */
double mult;
mult= 1e0;
for (length= DBL_DIG; length++)
mult/= 1e1;
mult= 1e1 - mult;
double val;
val= 1.0;
for (int idx= DBL_DIG+1; idx++)
{
DBUG_PRINT("info",("double_to_string_conv: big[%d] %.*g",
idx, DBL_DIG+3, val));
big_number[idx]= val;
val*= mult;
}
small_number[0]= 1e0;
small_number[1]= 1e0;
small_number[2]= 1e0;
small_number[3]= 1e-1;
small_number[4]= 1e-2;
small_number[5]= 1e-3;
small_number[6]= 1e-4;
/* %g switches to scientific when exponent < -4. */
for (int idx= DBL_DIG+1; idx++)
small_number[idx]= 1e-4;
initialized= TRUE;
}
use_scientific_notation= (abs_nr != 0.0) &&
((abs_nr > big_number[precision]) ||
(abs_nr < small_number[precision]));
if (use_scientific_notation)
{
if (((nr >= 0.0) && ((nr >= 1e+100) || (nr <= 1e-100))) ||
((nr < 0.0) && ((nr <= -1e+100) || (nr >= -1e-100))))
extra_space+= 6; /* .e+100 or .e-100 */
else
extra_space+= 5; /* .e+99 or .e-99 */
}
if (field_length < extra_space)
precision= 0;
else if (precision > (field_length - extra_space))
precision= field_length - extra_space;
length= sprintf(buff, "%.*g", precision, nr);
This solution takes performance into account by initializing the limiting numbers arrays only once into static space. It copes with negative numbers and tries to decide even over small numbers. The latter has only small implications, as the prefix 0.000 is exactly the same size as the postfix e-100. But knowing if scientific notation will be selected by sprintf() allows for saving one digit when the exponent is larger than -100.
The calculations for the big number array are less precise than in the second attempt, but faster. The precision is sufficient for the guess whether sprintf() uses scientific notation. There may be number to field length combinations which exploit the gap, but these won't emerge anyway as I found no situation where this function is called with small field lengths. Remember again that it is not called with user-supplied field lengths.
However in the current stable releases (including gamma) we have some places where the field length is too small by one character. Thus, the precision is sometimes one digit smaller than DBL_DIG would allow for. Consequently, we cannot use the simple algorithm in the stable releases. There is a chance of doing it in a development release, though.
Addendum:
There turned out to be a new solution to the "big number array" problem. We have a statically initialized array log_10, which holds the necessary values. But I did not check whether these values are safe. Even if computed by the compiler, they could carry values slightly above the decimal powers, which would be bad. In this case we needed to initialize by 9.99999999e+xxx, where the number of nines is equal to DBL_DIG. This must be protected by #if DBL_DIG == yy, so that a new DBL_DIG on a new platform is detected. And the array is of limited length. We must at least protect it by a DBUG_ASSERT(sizeof(log_10)/sizeof(log_10[0]) > DBL_DIG).
But all of this is probably completely unnecessary, since we are only speaking of cases where no user-supplied field length is given. So MySQL selects the field length on its own. So it is totally possible, indeed highly desirable, that MySQL selects a field length, which allows for a maximum of precision for all possible values. And these are DBL_DIG+7 or FLT_DIG+6 respectively as far as IEEE 754 is used. In this case we can have values of about +/-1e-307 to +/-1e+308 for double and +/-1e-37 to +/-1e+38 for float. That is, for example -1.<DBL_DIG-1 digits>e+100. For cases where a precision above IEEE 754 is possible, we may need +8 instead. We can detect this with #if DBL_MAX_10_EXP >= So using a field length of DBL_DIG+8 in all cases should be sufficient for a simple sprintf(buff, "%.*g", DBL_DIG, nr) or sprintf(buff, "%.*g", FLT_DIG, nr), respectively. To be safe, we should not use the machine dependent constants everywhere, but instead concentrate them into definitions like these:
#if (DBL_MAX_10_EXP > 9999) || (DBL_MIN_10_EXP < -9999) # error "Need new definition for UNSPECIFIED_DOUBLE_FIELD_LENGTH" #elif (DBL_MAX_10_EXP > 999) || (DBL_MIN_10_EXP < -999) # define UNSPECIFIED_DOUBLE_FIELD_LENGTH (DBL_DIG+8) #else # define UNSPECIFIED_DOUBLE_FIELD_LENGTH (DBL_DIG+7) #endif #if (FLT_MAX_10_EXP > 999) || (FLT_MIN_10_EXP < -999) #error "Need new definition for UNSPECIFIED_FLOAT_FIELD_LENGTH" #elif (FLT_MAX_10_EXP > 99) || (FLT_MIN_10_EXP < -99) # define UNSPECIFIED_FLOAT_FIELD_LENGTH (FLT_DIG+7) #else # define UNSPECIFIED_FLOAT_FIELD_LENGTH (FLT_DIG+6) #endif
These definitions should be used wherever an item or field of type float or double without an explicit field length specification is encountered. We have to propagate these lengths though all derived items and fields and we have to select the maximum of all field lengths wherever in two or more of them are used in an expression or a function.
We need to treat the precision (DBL_DIG/FLT_DIG) similarly, but have to select the minimum in expressions or functions.
[edit] Threads
Threads in mysqld can run at four different priorities, defined in mysql_priv.h:
#define INTERRUPT_PRIOR 10
#define CONNECT_PRIOR 9
#define WAIT_PRIOR 8
#define QUERY_PRIOR 6
Some threads try to set their priority; others don't. These calls are passed along to pthread_setschedparam() if the native threading library implements it.
The different threads are:
- The main thread. Runs at CONNECT_PRIOR priority. Calls thr_setconcurrency() if it is available at compile time; this call is generally assumed to exist only on Solaris, its value should reflect the number of physical CPUs.
- The "bootstrap" thread. See handle_bootstrap() in sql_parse.cc. The mysql_install_db script starts a server with an option telling it to start this thread and read commands in from a file. Used to initialize the grant tables. Runs once and then exits.
- The "maintenance" thread. See sql_manager_cc. Like the old "sync" daemon in unix, this thread occasionally flushes MyISAM tables to disk. InnoDB has a separate maintenance thread, but BDB also uses this one to occasionally call berkeley_cleanup_log_files(). Begins at startup and persists until shutdown.
- The "handle TCP/IP sockets" thread. See handle_connections_sockets() in mysqld.cc. Loop with a select() function call, to handle incoming connections.
- The "handle named pipes" thread. Only on Windows.
- The "handle shared memory connections" thread. Only on Windows.
- Signal handler ("interrupt") thread. See signal_hand() in mysqld.cc. Runs at INTERRUPT_PRIOR priority. Sets up to receive signals, and then handles them as they come in. Begins at server startup and persists until shutdown.
- The "shutdown" thread. See kill_server() in mysqld.cc. Created by the signal handling thread. Closes all connections with close_connections(), the ends.
- Active and cached per-connection threads. See handle_one_connection() in sql_parse.cc. These can run at QUERY_PRIOR priority or WAIT_PRIOR priority depending on what they are doing.
- The "delayed" thread. See handle_delayed_insert() in sql_insert.cc. Used for MyISAM's delayed inserts.
- The two slave threads, in slave.cc. One thread connects to the master and handles network IO. The other reads queries from the relay log and executes them.
In InnoDB, all thread management is handled through os/os0thread.c InnoDB's threads are:
- The I/O handler threads, See io_handler_thread().
- Two "watchmen" threads: srv_lock_timeout_and_monitor_thread(), and srv_error_monitor_thread().
- The master thread "which does purge and other utility operations", See srv_master_thread().
InnoDB's internal os_thread_set_priority() function implements three priorities (Background, normal, and high) but only on windows. The function is a no-op on unix.
[edit] Character/Collation Sets
Character sets are used by MySQL when storing information, both to ensure that the information is stored (and returned) in the correct format, but also for the purposes of collation and sorting. Each character set supports one or more collations, and so these are collectively known as Collation Sets, rather than character sets.
Character sets are recorded against individual tables and returned as part of the field data. For example, the MYSQL_FIELD data type definition includes the field charsetnr:
typedef struct st_mysql_field {
char *name; /* Name of column */
char *org_name; /* Original column name, if an alias */
char *table; /* Table of column if column was a field */
char *org_table; /* Org table name, if table was an alias */
char *db; /* Database for table */
char *catalog; /* Catalog for table */
char *def; /* Default value (set by mysql_list_fields) */
unsigned long length; /* Width of column (create length) */
unsigned long max_length; /* Max width for selected set */
unsigned int name_length;
unsigned int org_name_length;
unsigned int table_length;
unsigned int org_table_length;
unsigned int db_length;
unsigned int catalog_length;
unsigned int def_length;
unsigned int flags; /* Div flags */
unsigned int decimals; /* Number of decimals in field */
unsigned int charsetnr; /* Character set */
enum enum_field_types type; /* Type of field. See mysql_com.h for types */
} MYSQL_FIELD;
Character set and collation information are specific to a server version and installation, and are generated automatically from the sql/share/charsets/Index.xml file in the source distribution.
You can obtain a list of the available character sets configured within a server by running SHOW COLLATION, or by running a query on the INFORMATION_SCHEMA.COLLATION table. A sample of the information from that table has been provided here for reference.
| Collation Id | Charset | Collation | Default | Sortlen |
| 64 | armscii8
| armscii8_bin
| ?? | 1 |
| 32 | armscii8
| armscii8_general_ci
| Yes | 1 |
| 65 | ascii
| ascii_bin
| ?? | 1 |
| 11 | ascii
| ascii_general_ci
| Yes | 1 |
| 84 | big5
| big5_bin
| ?? | 1 |
| 1 | big5
| big5_chinese_ci
| Yes | 1 |
| 63 | binary
| binary
| Yes | 1 |
| 66 | cp1250
| cp1250_bin
| ?? | 1 |
| 44 | cp1250
| cp1250_croatian_ci
| ?? | 1 |
| 34 | cp1250
| cp1250_czech_cs
| ?? | 2 |
| 26 | cp1250
| cp1250_general_ci
| Yes | 1 |
| 50 | cp1251
| cp1251_bin
| ?? | 1 |
| 14 | cp1251
| cp1251_bulgarian_ci
| ?? | 1 |
| 52 | cp1251
| cp1251_general_cs
| ?? | 1 |
| 23 | cp1251
| cp1251_ukrainian_ci
| ?? | 1 |
| 51 | cp1251
| cp1251_general_ci
| Yes | 1 |
| 67 | cp1256
| cp1256_bin
| ?? | 1 |
| 57 | cp1256
| cp1256_general_ci
| Yes | 1 |
| 58 | cp1257
| cp1257_bin
| ?? | 1 |
| 29 | cp1257
| cp1257_lithuanian_ci
| ?? | 1 |
| 59 | cp1257
| cp1257_general_ci
| Yes | 1 |
| 80 | cp850
| cp850_bin
| ?? | 1 |
| 4 | cp850
| cp850_general_ci
| Yes | 1 |
| 81 | cp852
| cp852_bin
| ?? | 1 |
| 40 | cp852
| cp852_general_ci
| Yes | 1 |
| 68 | cp866
| cp866_bin
| ?? | 1 |
| 36 | cp866
| cp866_general_ci
| Yes | 1 |
| 96 | cp932
| cp932_bin
| ?? | 1 |
| 95 | cp932
| cp932_japanese_ci
| Yes | 1 |
| 69 | dec8
| dec8_bin
| ?? | 1 |
| 3 | dec8
| dec8_swedish_ci
| Yes | 1 |
| 98 | eucjpms
| eucjpms_bin
| ?? | 1 |
| 97 | eucjpms
| eucjpms_japanese_ci
| Yes | 1 |
| 85 | euckr
| euckr_bin
| ?? | 1 |
| 19 | euckr
| euckr_korean_ci
| Yes | 1 |
| 86 | gb2312
| gb2312_bin
| ?? | 1 |
| 24 | gb2312
| gb2312_chinese_ci
| Yes | 1 |
| 87 | gbk
| gbk_bin
| ?? | 1 |
| 28 | gbk
| gbk_chinese_ci
| Yes | 1 |
| 93 | geostd8
| geostd8_bin
| ?? | 1 |
| 92 | geostd8
| geostd8_general_ci
| Yes | 1 |
| 70 | greek
| greek_bin
| ?? | 1 |
| 25 | greek
| greek_general_ci
| Yes | 1 |
| 71 | hebrew
| hebrew_bin
| ?? | 1 |
| 16 | hebrew
| hebrew_general_ci
| Yes | 1 |
| 72 | hp8
| hp8_bin
| ?? | 1 |
| 6 | hp8
| hp8_english_ci
| Yes | 1 |
| 73 | keybcs2
| keybcs2_bin
| ?? | 1 |
| 37 | keybcs2
| keybcs2_general_ci
| Yes | 1 |
| 74 | koi8r
| koi8r_bin
| ?? | 1 |
| 7 | koi8r
| koi8r_general_ci
| Yes | 1 |
| 75 | koi8u
| koi8u_bin
| ?? | 1 |
| 22 | koi8u
| koi8u_general_ci
| Yes | 1 |
| 47 | latin1
| latin1_bin
| ?? | 1 |
| 15 | latin1
| latin1_danish_ci
| ?? | 1 |
| 48 | latin1
| latin1_general_ci
| ?? | 1 |
| 49 | latin1
| latin1_general_cs
| ?? | 1 |
| 5 | latin1
| latin1_german1_ci
| ?? | 1 |
| 31 | latin1
| latin1_german2_ci
| ?? | 2 |
| 94 | latin1
| latin1_spanish_ci
| ?? | 1 |
| 8 | latin1
| latin1_swedish_ci
| Yes | 1 |
| 77 | latin2
| latin2_bin
| ?? | 1 |
| 27 | latin2
| latin2_croatian_ci
| ?? | 1 |
| 2 | latin2
| latin2_czech_cs
| ?? | 4 |
| 21 | latin2
| latin2_hungarian_ci
| ?? | 1 |
| 9 | latin2
| latin2_general_ci
| Yes | 1 |
| 78 | latin5
| latin5_bin
| ?? | 1 |
| 30 | latin5
| latin5_turkish_ci
| Yes | 1 |
| 79 | latin7
| latin7_bin
| ?? | 1 |
| 20 | latin7
| latin7_estonian_cs
| ?? | 1 |
| 42 | latin7
| latin7_general_cs
| ?? | 1 |
| 41 | latin7
| latin7_general_ci
| Yes | 1 |
| 43 | macce
| macce_bin
| ?? | 1 |
| 38 | macce
| macce_general_ci
| Yes | 1 |
| 53 | macroman
| macroman_bin
| ?? | 1 |
| 39 | macroman
| macroman_general_ci
| Yes | 1 |
| 88 | sjis
| sjis_bin
| ?? | 1 |
| 13 | sjis
| sjis_japanese_ci
| Yes | 1 |
| 82 | swe7
| swe7_bin
| ?? | 1 |
| 10 | swe7
| swe7_swedish_ci
| Yes | 1 |
| 89 | tis620
| tis620_bin
| ?? | 1 |
| 18 | tis620
| tis620_thai_ci
| Yes | 4 |
| 90 | ucs2
| ucs2_bin
| ?? | 1 |
| 138 | ucs2
| ucs2_czech_ci
| ?? | 8 |
| 139 | ucs2
| ucs2_danish_ci
| ?? | 8 |
| 145 | ucs2
| ucs2_esperanto_ci
| ?? | 8 |
| 134 | ucs2
| ucs2_estonian_ci
| ?? | 8 |
| 146 | ucs2
| ucs2_hungarian_ci
| ?? | 8 |
| 129 | ucs2
| ucs2_icelandic_ci
| ?? | 8 |
| 130 | ucs2
| ucs2_latvian_ci
| ?? | 8 |
| 140 | ucs2
| ucs2_lithuanian_ci
| ?? | 8 |
| 144 | ucs2
| ucs2_persian_ci
| ?? | 8 |
| 133 | ucs2
| ucs2_polish_ci
| ?? | 8 |
| 131 | ucs2
| ucs2_romanian_ci
| ?? | 8 |
| 143 | ucs2
| ucs2_roman_ci
| ?? | 8 |
| 141 | ucs2
| ucs2_slovak_ci
| ?? | 8 |
| 132 | ucs2
| ucs2_slovenian_ci
| ?? | 8 |
| 142 | ucs2
| ucs2_spanish2_ci
| ?? | 8 |
| 135 | ucs2
| ucs2_spanish_ci
| ?? | 8 |
| 136 | ucs2
| ucs2_swedish_ci
| ?? | 8 |
| 137 | ucs2
| ucs2_turkish_ci
| ?? | 8 |
| 128 | ucs2
| ucs2_unicode_ci
| ?? | 8 |
| 35 | ucs2
| ucs2_general_ci
| Yes | 1 |
| 91 | ujis
| ujis_bin
| ?? | 1 |
| 12 | ujis
| ujis_japanese_ci
| Yes | 1 |
| 83 | utf8
| utf8_bin
| ?? | 1 |
| 202 | utf8
| utf8_czech_ci
| ?? | 8 |
| 203 | utf8
| utf8_danish_ci
| ?? | 8 |
| 209 | utf8
| utf8_esperanto_ci
| ?? | 8 |
| 198 | utf8
| utf8_estonian_ci
| ?? | 8 |
| 210 | utf8
| utf8_hungarian_ci
| ?? | 8 |
| 193 | utf8
| utf8_icelandic_ci
| ?? | 8 |
| 194 | utf8
| utf8_latvian_ci
| ?? | 8 |
| 204 | utf8
| utf8_lithuanian_ci
| ?? | 8 |
| 208 | utf8
| utf8_persian_ci
| ?? | 8 |
| 197 | utf8
| utf8_polish_ci
| ?? | 8 |
| 195 | utf8
| utf8_romanian_ci
| ?? | 8 |
| 207 | utf8
| utf8_roman_ci
| ?? | 8 |
| 205 | utf8
| utf8_slovak_ci
| ?? | 8 |
| 196 | utf8
| utf8_slovenian_ci
| ?? | 8 |
| 206 | utf8
| utf8_spanish2_ci
| ?? | 8 |
| 199 | utf8
| utf8_spanish_ci
| ?? | 8 |
| 200 | utf8
| utf8_swedish_ci
| ?? | 8 |
| 201 | utf8
| utf8_turkish_ci
| ?? | 8 |
| 192 | utf8
| utf8_unicode_ci
| ?? | 8 |
| 33 | utf8
| utf8_general_ci
| Yes | 1 |
Note that it is the collation ID, not the character set ID, that is used to identify the unique combination of character set and collation. Thus, when requesting character set information using one of the character set functions in mysys/charset.c, such as get_charset(), different IDs may return the same base character set, but a different collation set.
The following functions provide an internal interface to the collation and character set information, enabling you to access the information by name or ID:
static uint get_collation_number_internal(const char *name)
uint get_collation_number(const char *name)
uint get_charset_number(const char *charset_name, uint cs_flags)
const char *get_charset_name(uint charset_number)
static CHARSET_INFO *get_internal_charset(uint cs_number, myf flags)
CHARSET_INFO *get_charset(uint cs_number, myf flags)
CHARSET_INFO *get_charset_by_name(const char *cs_name, myf flags)
CHARSET_INFO *get_charset_by_csname(const char *cs_name,
uint cs_flags,
myf flags)
The table below details the functions, the key argument that is supplied, and the return value.
| Function | Supplied Argument | Return Value |
get_collation_number_internal()
| Collation name | Collation ID |
get_collation_number()
| Collation name | Collation ID |
get_charset_number()
| Character set name | Collation ID |
get_charset_name()
| Collation ID | Character set name |
get_internal_charset()
| Collation ID | Character datatype |
get_charset()
| Collation ID | Character datatype |
An example of using the collation/character set functions is available in the extras/charset2html.c, which outputs an HTML version of the internal collation set table.
[edit] Error flags and functions
The following flags can be examined or set to alter the behavior during error handling:
-
thd->net.report_error
thd->net.report_error is set in my_message_sql() if the error message was registered. (my_message_sql() is called by my_error(), my_printf_error(), my_message()).
-
thd->query_error
Like net.report_error, but is always set to 1 in my_message_sql() if error was not caught by an error handler. Used by replication to see if a query generated any kind of errors.
-
thd->no_warnings_for_error
Normally an error also generates a warning. The warning can be disabled by setting thd->no_warnings_for_error. (This allows one to catch all error messages generated by a statement)
-
thd->lex->current_select->no_error
This is set to in case likes INSERT IGNORE ... SELECT. In this case we ignore all not fatal errors generated by the select.
-
thd->is_fatal_error
Set this if we should abort the current statement (and any multi-line statements) because something went fatally wrong. (for example, a stored procedure continue handler should not be able to catch this). This is reset by mysql_reset_thd_for_next_command().
-
thd->abort_on_warning
Strict mode flag, which means that we should abort the statement if we get a warning. In the field::store function this changes the warning level from WARN to ERROR. In other cases, this flag is mostly tested with thd->really_abort_on_warning() to ensure we don't abort in the middle of an update with not transactional tables.
-
thd->count_cuted_fields
If set, we generate warning for field conversations (normal case for INSERT/UPDATE/DELETE). This is mainly set to 0 when doing internal copying of data between fields and we don't want to generate any conversion errors at any level.
-
thd->killed
Set in case of error in connection protocol or in case of 'kill'. In this case we should abort the query and kill the connection.
Error functions
-
thd->really_abort_on_warning()
This function returns 1 if a warning should be converted to an error, like in strict mode when all tables are transactional. The conversion is handled in sql_error.cc::push_warning().
-
thd->fatal_error()
Should be called if we want to abort the current statement and any multi-line statement.
-
thd->clear_error()
Resets thd->net.report_error and thd->query_error.
[edit] Functions in the mysys Library
Functions in mysys: (For flags see my_sys.h)
-
int my_copy _A((const char *from, const char *to, myf MyFlags));
Copy file from from to to.
-
int my_rename _A((const char *from, const char *to, myf MyFlags));
Rename file from from to to.
-
int my_delete _A((const char *name, myf MyFlags));
Delete file name.
-
int my_redel _A((const char *from, const char *to, int MyFlags));
Delete from before rename of to to from. Copies state from old file to new file. If MY_COPY_TIME is set, sets old time.
-
int my_getwd _A((string buf, uint size, myf MyFlags));,int my_setwd _A((const char *dir, myf MyFlags));
Get and set working directory.
-
string my_tempnam _A((const char *dir, const char *pfx, myf MyFlags));
Make a unique temporary file name by using dir and adding something after pfx to make the name unique. The file name is made by adding a unique six character string and TMP_EXT after pfx. Returns pointer to malloc()'ed area for filename. Should be freed by free().
-
File my_open _A((const char *FileName,int Flags,myf MyFlags));,File my_create _A((const char *FileName, int CreateFlags, int AccsesFlags, myf MyFlags));,int my_close _A((File Filedes, myf MyFlags));,uint my_read _A((File Filedes, byte *Buffer, uint Count, myf MyFlags));,uint my_write _A((File Filedes, const byte *Buffer, uint Count, myf MyFlags));,ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags));,ulong my_tell _A((File fd,myf MyFlags));
Use instead of open, open-with-create-flag, close, read, and write to get automatic error messages (flag MYF_WME) and only have to test for != MY_NABP</code>).
-
FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags));,FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags));,int my_fclose _A((FILE *fd,myf MyFlags));,uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags));,uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count, myf MyFlags));,ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags));,ulong my_ftell _A((FILE *stream,myf MyFlags));
Same read-interface for streams as for files.
-
gptr _mymalloc _A((uint uSize,const char *sFile,uint uLine, myf MyFlag));,gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile,uint uLine, myf MyFlag));,void _myfree _A((gptr pPtr,const char *sFile,uint uLine));,int _sanity _A((const char *sFile,unsigned int uLine));,gptr _myget_copy_of_memory _A((const byte *from,uint length,const char *sFile, uint uLine,myf MyFlag));
malloc(size,myflag) is mapped to these functions if not compiled with -DSAFEMALLOC.
-
void TERMINATE _A((void));
Writes malloc() info on stdout if compiled with -DSAFEMALLOC.
-
int my_chsize _A((File fd, ulong newlength, myf MyFlags));
Change size of file fd to newlength.
-
void my_error _D((int nr, myf MyFlags, ...));
Writes message using error number (see mysys/errors.h) on stdout, or using curses, if MYSYS_PROGRAM_USES_CURSES() has been called.
-
void my_message _A((const char *str, myf MyFlags));
Writes str on stdout, or using curses, if MYSYS_PROGRAM_USES_CURSES() has been called.
-
void my_init _A((void ));
Start each program (in main()) with this.
-
void my_end _A((int infoflag));
Gives info about program. If infoflag & MY_CHECK_ERROR, prints if some files are left open. If infoflag & MY_GIVE_INFO, prints timing info and malloc() info about program.
-
int my_copystat _A((const char *from, const char *to, int MyFlags));
Copy state from old file to new file. If MY_COPY_TIME is set, sets old time.
-
string my_filename _A((File fd));
Returns filename of open file.
-
int dirname _A((string to, const char *name));
Copy name of directory from filename.
-
int test_if_hard_path _A((const char *dir_name));
Test if dir_name is a hard path (starts from root).
-
void convert_dirname _A((string name));
Convert dirname according to system. On Windows, changes all characters to capitals and changes '/' to '\'.
-
string fn_ext _A((const char *name));
Returns pointer to extension in filename.
-
string fn_format _A((string to,const char *name,const char *dsk,const char *form,int flag));
Format a filename with replacement of library and extension and convert between different systems. The to and name parameters may be identical. Function doesn't change name if name != to. flag may be:
| 1 | Force replace filenames library with 'dsk' |
| 2 | Force replace extension with 'form' */ |
| 4 | Force unpack filename (replace ~ with home directory)
|
| 8 | Pack filename as short as possible for output to user |
All open requests should always use at least open(fn_format(temp_buffer, name, "", "", 4), ...) to unpack home and convert filename to system-form.
-
string fn_same _A((string toname, const char *name, int flag));
Copies directory and extension from name to toname if needed. Copying can be forced by same flags used in fn_format().
-
int wild_compare _A((const char *str, const char *wildstr));
Compare if str matches wildstr. wildstr can contain '*' and '?' as wildcard characters. Returns 0 if str and wildstr match.
-
void get_date _A((string to, int timeflag));
Get current date in a form ready for printing.
-
void soundex _A((string out_pntr, string in_pntr))
Makes in_pntr to a 5 char long string. All words that sound alike have the same string.
-
int init_key_cache _A((ulong use_mem, ulong leave_this_much_mem));
Use caching of keys in MISAM, PISAM, and ISAM. KEY_CACHE_SIZE is a good size. Remember to lock databases for optimal caching.
-
void end_key_cache _A((void));
End key caching.
[edit] Bitmaps
Inside the mysys directory is a file named my_bitmap.c. It contains functions for manipulating bitmaps. Specifically there are functions for setup or teardown (bitmap_init, bitmap_free), for setting and clearing individual bits or whole sections of the bitmap (bitmap_set_bit, bitmap_fast_test_and_set, bitmap_clear_all, bitmap_set_all, bitmap_set_prefix, bitmap_set_above), and for performing comparisons and set operations on two bitmaps (bitmap_cmp, bitmap_intersect, bitmap_subtract, bitmap_union). Bitmaps are useful, so the functions are called from several places (opt_range.cc, slave.cc, mysqld.c, sql_insert.cc, log_event.cc, sql_show.cc) and we're expecting to make more use of them in the next version of MySQL, MySQL 5.1.
There are a few warnings and limitations that apply for the present bitmap implementation. First: the allocation is an integral number of bytes, and it is not possible to determine whether the last few bits are meaningful. Second: the whole bitmap might have to be protected by a mutex for manipulations; this is settable by passing appropriate flag values. Third: the bitmap is allocated with a 'uint' size, which means that ordinarily it can't have more than 2^32 bytes. Fourth: when unioning two bitmaps, they must be of the same size.