How to Create Good Tests

Contents

[edit] How to Create Good Test Cases

There are a lot of strict rules and rules of thumb which may increase the quality of tests written for the tool pair mysqltest/mysqltest-run.pl (MTR). The following presentation should give you some hints.

  • This document should not replace reading our excellent manual about testing http://dev.mysql.com/doc/mysqltest/en/writing-tests.html.
  • Some of the examples might not work
    • with MTR1 or
    • on some operating system.
  • MTR2 means the second version of mysql-test-run.pl/mysqltest. MTR2 replaced MTR1 ~ Jan 2009 in MySQL 5.1 and 6.0.
  • The term "protocol" means in most cases "output from a test case."
October 2007 - Matthias Leich - Create this document for a MySQL University session
March 2009 - Matthias Leich - Updates and corrections, add "Examples of suspicious scripts"
March 2009 - Patrick Crews - Corrections, add "Notes on MTR"

[edit] Formal stuff

Rules about formatting within scripts and similar stuff.

[edit] Coding style

Please have a look at the C/C++ coding guidelines and apply them when they make sense: Coding Guidelines

Some more or less strict rules:

Think about a comparison of the new and the old version of a test script within a graphical diff tool.
Having the difference frequent at the end of long lines is very uncomfortable.
Ignore this guideline if your intent is to check the processing of mixed lettercases ;-)
Please follow this guideline for new tests. Rewriting existing tests to use better formatting is good, but can be tedious - a rule of thumb: don't rewrite unless you intend to touch the entire subtest (not the entire .test file, but the statements that make up a complete testing 'unit')
[edit] SQL statement example
SELECT f1 AS "my_column", f10 ....
FROM mysqltest1.t5
WHERE (f2 BETWEEN 17 AND 25 OR f2 = 61)
  AND f3 IN (SELECT ....
             FROM mysqltest1.t4
             WHERE .....)
ORDER BY ...
[edit] Please avoid too dense code

except you intend to check the parser or the performance of the reader of the code ;-).

 Unfortunate example:
 select 1+1,1-1,1+1*2,8/5,8%5,mod(8,5),mod(8,5)|0,-(1+1)*-2;
 Improved example:
 SELECT 1 + 1, 1 - 1, 1 + 1 * 2, 8 / 5, 8 % 5,
        MOD(8,5), MOD(8,5) | 0, -(1+1) * -2;
 
[edit] Header of scripts

For each test or auxiliary script, create a header that contains the following information:

Note: The header, like the rest of the test, should not mention confidential information. Remember, our tests are available publicly.

[edit] Header example
########### suite/funcs_1/t/processlist_val_no_prot.test #############
#                                                                    #
# Testing of values within INFORMATION_SCHEMA.PROCESSLIST            #
#                                                                    #
# The prepared statement variant of this test is                     #
# suite/funcs_1/t/processlist_val_ps.test.                           #
#                                                                    #
# There is important documentation within                            #
#       suite/funcs_1/datadict/processlist_val.inc                   #
#                                                                    #
#                                                                    #
# Creation:                                                          #
# 2007-08-09 mleich Implement this test as part of                   #
#                   WL#3982 Test information_schema.processlist      #
#                                                                    #
######################################################################
[edit] Comments within boxes

Please work 100% perfect when using boxes made of '#'. Such boxes "jump" into the eye of most readers. Bad example similar to what I found in some tests:

###############################################
#                                             #
# Some text                                   #
# Some text                                   #
#                                             #
# Author : .....                                   #
#                                             #
# Some text                                  #
#                                             #
###############################################

Please note that this is caused by spaces, not tabs.

[edit] Comments Everywhere

Write comments, they save the time of others.

[edit] Complicated test architecture or tricky code

If your test is very complicated, write a sufficient explanation of the test architecture. This helps avoid having someone else come along and unintentially change the test in ways that:

Comment your code when you do or check something that someone else may think is not trivial. Please write what you intend to check if it is not obvious.

[edit] A subtest case is able to reveal a bug

Write a comment if the next subtest revealed a bug in history. Please mention number and title.

Please use exact the formatting "# Bug#nnnnn <Title>"

Example: # Bug#3671 Stored procedure crash if function has "set @variable=param"

because it does not look nice to see so many different variants like "BUG #nnnnn", "bug #nnnnn - Title" often even within the same test.


[edit] Subtests

At least in cases where your file contains many subtests

And please explain what each subtest checks, unless it is obvious.

Example:

--echo #----------------- Testcase 3.3.1.52 ---------------------#
###############################################################################
# Ensure that a view that is a subset of every column and some rows of a single
# underlying table, contains the correct row-and-column data; such a view has
# a definition that is semantically equivalent to CREATE VIEW <view_name>
# AS SELECT * FROM <table_name> WHERE ...
###############################################################################
[edit] Make test protocols more understandable

Please have the test write comments into the protocol if this makes the surrounding protocol content much more understandable. This is especially true for tests that do the following:

[edit] Better protocols example

Good script with message about switching the connection:

SET @aux = 1;
SELECT @aux AS "content of @aux is";
--echo # Establish session con1 (user=root)
connect (con1,localhost,root,,,,);
SELECT @aux AS "content of @aux is";

Protocol of good script:

SET @aux = 1;
SELECT @aux AS "content of @aux is";
content of @aux is
1   
# Establish session con1 (user=root)
SELECT @aux AS "content of @aux is";
content of @aux is
NULL

This is what the protocol looks like without the comment. The output becomes less clear because there is no indication that the connection has been changed.

SET @aux = 1;
SELECT @aux AS "content of @aux is";
content of @aux is
1   
SELECT @aux AS "content of @aux is";
content of @aux is
NULL

[edit] Some Rules Outside of the Formal Stuff

[edit] Error masking

Use error names instead of error numbers whenever possible.

Example:

--error ER_BAD_DB_ERROR
USE <not existing database>;

You can find the error names in the include/mysqld_error.h file of a MySQL source distribution, or in the Server Error Codes and Messages section of the MySQL Reference Manual]. NOTE: Don't do this with error numbers > 2000. Use of the error name does not seem to work well in these cases.

[edit] Use of the option "--disable_abort_on_error"

This option is very useful when starting to write a new test because the test will not abort if your script contains some failing statements or SQL syntax errors. By disabling the abort, you get to see more of the errors per test run and can fix more of them at a time.

But there are only rare situations where the final versions of a test should use "--disable_abort_on_error", either at all or during a sequence of several SQL statements.

[edit] Perfect cleanup at the end of a test

MTR2 runs now by default with "check-testcases" enabled. It checks if there are additional objects like user or tables, modifed system table etc. "check-testcases" will be soon improved. However, proper cleanup should still be a responsibility of both the test author and reviewer.

[edit] Use of OS-specific commands

The exec and system commands enable tests to execute external commands. However, many of such commands are available only on certain platforms. (For example, rm is Unix-specific and not available on Windows.) Please avoid these commands if possible. They harm the portability and stability of tests.

Have a look at the mysqltest manual and the t/mysqltest.test test file. There are now several mysqltest built-in commands such as

"--write_file", "--cat_file", "--remove_file", ...

which are reliable on all operating systems. Sometimes OS commands could be also replaced by SQL statement sequences writing to and reading from files and some SQL string functions.

[edit] "Negative" Tests

A "negative" test is a test for which you expect to see a failure. If an error does not occur, that itself indicates a problem.

DO NOT FORGET "NEGATIVE" TESTS where we expect to see fine error messages from the server.

This section contains just a few examples of what to test. Please be creative and imagine what could go wrong in rough reality.

Column with numeric data type:

Column with string data type:

Limited number of tables, columns, variables, ...

Assume an SQL operation that affects the filesystem (LOAD DATA, CREATE SCHEMA or TABLE, backup/restore, ...). What will happen if the following conditions occur for a file or directory to be read, written, or implicitly created or deleted?

[edit] Tests with Several Variants

Please think twice before you create

or

happen. They might be

[edit] Variation of the storage engine
The storage engines to be used for information_schema tables are hardcoded. Therefore tests focussed on permissions, optimizer strategies, column data types etc. when selecting on INFORMATION_SCHEMA tables should not run with storage engine variations.
That means mostly:
DO NOT use hardcoded storage engine assignments within CREATE TABLE statements.
create/run with storage engine variants and check this.

Solution 1:

Do not assign the storage engine within your CREATE TABLE statements at all. The default storage engine MyISAM will be used for your tables. Check your assumption with:

./myql-test-run.pl --mysqld="--default-storage-engine=<engine>" <test_case>

Solution 2:

Assign the storage engine to be used via $variable.

Top level script:

let $engine_type= MyISAM;

The same script or a sourced script:

eval CREATE TABLE .... ENGINE = $engine_type ...

Check your assumption by creating and running storage engine variants of the top level test.

--source include/have_falcon.inc
let $engine_type= Falcon;
[edit] Variation of the protocol option

It is usual to run tests with and without the mysql-test-run.pl startup option "--ps-protocol". And there are also attempts of System QA to run the other protocols.

We have a lot of tests running very similar and extremely simple SQL just for the creation of a situation to be tested, check of table content, etc. So it can be assumed that the n'th test running again simple statements does not improve the coverage.

Conclusion:

If your test contains

Do not prevent the protocol runs.

Another problem around runs with such protocols is that there are cases where we get different protocol content.

Example:

Script:

SELECT * FROM processlist ...

Protocol content if running without any "--*-protocol":

ID USER ... COMMAND ... STATE     INFO
1  root ... Query   ... preparing SELECT * FROM processlist ...

Protocol content if running with "--ps-protocol":

ID USER ... COMMAND ... STATE     INFO
1  root ... Execute ... preparing SELECT * FROM processlist ...

So please check whether every new test gives the same result with every protocol but at least with "--ps-protocol". If not, do something of the following:

Example solution (code within the top level scripts):

# The file with expected results fits only to a run without
if (`SELECT $PS_PROTOCOL + $SP_PROTOCOL
          + $CURSOR_PROTOCOL + $VIEW_PROTOCOL > 0`)
{
   --skip Test requires: ps-protocol/sp-protocol/cursor-protocol/view-protocol disabled
}
--source include/<whatever>.inc

or

# The file with expected results fits only to a run with "--ps-protocol".
if (`SELECT $SP_PROTOCOL + $CURSOR_PROTOCOL + $VIEW_PROTOCOL > 0
         OR $PS_PROTOCOL = 0`)
{
   --skip Test requires: ps-protocol enabled, other protocols disabled
}
--source include/<whatever>.inc

[edit] Miscellaneous Tips


[edit] Stability checks for random timing problems
./mysql-test-run.pl --mem .... --repeat=<number> <your test>
rmdir var    # "var" must not be a symlink pointing into a RAM based filesystem
./mysql-test-run.pl .... --repeat=<number> <your test>

  • all OS: Compile a MySQL version from source
  • Unix derivates (OpenSolaris,Linux,OSX,...): tar -chvf - <some path> > /dev/null
  • Unix derivates: root: dd if=<area of disk containing the "var" directory> of=/dev/null
  • Windows: Defragmentation
rmdir var    # "var" must not be a symlink pointing into a RAM based filesystem
./mysql-test-run.pl .... --repeat=<number> <your test>
  • all OS:
./mysql-test-run.pl .... --repeat=<number> --parallel=8 <your test> 
[edit] Stability checks for hostname-related problems
sudo hostname <other hostname>
./mysql-test-run.pl --mem <name of test>

Good values for <other hostname> to check are "0", "bbb", "mmm", "zzz".

[edit] Stability checks for unexpected problems

Ask a colleague to run your test or try another testing box (preferably one with a different operating system).

[edit] Row order within result sets

Please keep in mind that the row order within a significant number of result sets depends on storage engine properties and in worst cases the current load (NDB!, partitioning?, parallel query?) on the testing box.

Either use

--sorted_result
SELECT * FROM t1;

or, if possible, "decorate" your SELECT with ORDER BY to make the row order static.

[edit] Test case behavior should depend on the result of a SHOW

Please have a look at the t/mysqltest.test file where "query_get_value()" is checked.

[edit] Does a test execution hang?

Symptom: No text flooding through the command window where a test is running.

Run this command in a different window:

tail -f <var_dir>/mysqld.1/data/mysql/general_log.CSV

If you see some changes, your test is alive.

[edit] Your test is complicated, support debugging

Top level test script:

...
##### Option, for debugging support #####
let $debug= 0;
...

At various places within your scripts:

if ($debug)
{
   --echo # var1: $var1 , var2: $var2, .....
   SELECT ......
}

or

--disable_query_log
if ($debug)
{
   --enable_query_log
}
[edit] You need to know some syntax, option etc., but hate reading manuals

Try this:

grep -i '<one keyword you know or guess>' t/* include/* | less

[edit] Notes on MTR2

MTR2 has brought some new utility that can improve testing, but also some elements that can be pitfalls. This section attempts to describe both.

  • $MYSQLTEST_VARDIR/master-data -> $MYSQLD_DATADIR
  • This change was introduced to work with parallel. Since MTR2 can run several instances of itself, one path is needed for each instance.
  • You must also use let $MYSQLD_DATADIR= `SELECT @@datadir`;
  • max-test-fail: The number of test cases that can fail before the test run aborts. Default=10, set to 0 to continue the run regardless of failure count.
NOTE: --force alone is no longer sufficient to guarantee a full test suite run. If the number of failing tests == max-test-fail, then the test run will abort.
  • parallel: Set n number of workers to run tests in parallel. The current recommendation is 2x the number of processors on your machine.
NOTE: You can no longer be sure of executing test cases in a particular order due to this and other changes brought on by MTR2. *Do not* try to create multi-part tests (test_pt1.test, test_pt2.test, etc).
NOTE: Be cautious when using a high (>2) value for parallel and --mem in combination. It is possible to use vast amounts of resources on your system and cause problematic performance till a crash of the operating system while MTR2 is running.
  • repeat: Run the input test(s) n times in succession. Very good for diagnosing random failures - set a high value for 'n', create a high load on your machine, and see if you can duplicate the failure.


[edit] Examples of "suspicious" tests and scripts


[edit] Probably mismatch of focus of test and code sequence
  CREATE TABLE t1 (
  id INT NOT NULL AUTO_INCREMENT,
  my_column VARCHAR(30),
  name LONGTEXT,
  PRIMARY KEY (id));
  INSERT INTO t1(my_column,name) VALUES('2','two');
  INSERT INTO t1(my_column,name) VALUES('1','one');
  INSERT INTO t1(my_column,name) VALUES('4','four');
  INSERT INTO t1(my_column,name) VALUES('2','two');
  INSERT INTO t1(my_column,name) VALUES('3','three');

This test will

==> no coverage for other features or properties checked within this test

Some questions with recommended action depending on the answer:

Yes: Please ensure that the test does not break or gets skipped if the default storage engine does not support AUTO_INCREMENT or LONGTEXT and you are done.
No: no action
Yes: no action
No: Remove the use of AUTO_INCREMENT
Yes: no action
No: Remove the use of LONGTEXT
Yes: no action
No: Split the test at least if it should be a test of basic functionality
[edit] Too greedy test

Let's assume we have to check that every new created table causes a row in information_schema.tables.

 --replace_column 15 <CREATE_TIME> 16 <UPDATE_TIME> 17 <CHECK_TIME>
 SELECT * FROM information_schema.tables;

The SELECT above makes the test extreme "greedy" for changes in behaviour which is maybe good for general bug hunting but not for smart and frequent automatic tests. It is to be expected that such a test requires frequent maintenance like adjustment of expected results.

The problems with the select above: We will (sometimes only maybe) get a result set difference whenever

Completeness of result sets without WHERE are outside of the focus.
Refined statement:
 --replace_column 15 <CREATE_TIME> 16 <UPDATE_TIME> 17 <CHECK_TIME>
 SELECT * FROM information_schema.tables
 WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1';
Lets assume something intentional like VERSION is now 11 instead of 10.
This means we should try to avoid "SELECT *".

Final statement:

 Variant 1:
 SELECT TABLE_SCHEMA,TABLE_NAME FROM information_schema.tables
 WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1';
Are we really consequent? No, but we could
 Variant 2:
 SELECT COUNT(*) FROM information_schema.tables
 WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1';

It's on you which variant you choose. I decide depending on the situation.

[edit] Risky handling around additional sessions

Disconnects are performed asynchronous. This is most probably good for the throughput of usual application but has some annoying consequences for the development of tests.

If a

occurs within a script for mysqltest than the next statements will be executed even if the disconnect or kill is not complete finished.

I guess this also applies to the end (-> end of statements to be executed by mysqltest) of a test. It triggers some disconnect for any open session but this is also asynchronous.

Not complete finished means that the session is visible within the processlist and the entry about the 'quit' of the session is not already appended to the general log.

So it could happen that a succeeding test suffers by unexpected events like

According to my experiences the likelihood that some test could harm a successing test by unfinished disconnects increases with

In my opinion the most robust solution is to add a

 --source include/wait_until_disconnected.inc

just after the disconnect.

In case you don't believe me, run the sniplet following later

./mtr  --skip-ndb --no-check-testcases --repeat=100 <sniplet>

and you will most probably observe something like

TEST                                 RESULT   TIME (ms)
------------------------------------------------------------
...
<sniplet>                           [ pass ]      4
<sniplet>                           [ fail ]
...
CURRENT_TEST: <sniplet>
--- <result>
+++ <reject>
@@ -1,4 +1,4 @@
SELECT COUNT(*) FROM information_schema.processlist WHERE id <> CONNECTION_ID();
COUNT(*)
-0
+1
SELECT SLEEP(10);
...
mysqltest: Result content mismatch

Sniplet for demonstration purposes:

 SELECT COUNT(*) FROM information_schema.processlist WHERE id <> CONNECTION_ID();  # (1)
 connect (con1,localhost,root,,);
 send SELECT SLEEP(10); # (2)
 connection default;
 # Wait till the SQL statement of con1 is "in work"
 let $wait_condition= SELECT COUNT(*) = 1 FROM information_schema.processlist
 WHERE State = 'User sleep' AND Info = 'SELECT SLEEP(10)';
 --source include/wait_condition.inc  # (3)
 SELECT COUNT(*) = 1 FROM information_schema.processlist
 WHERE State = 'User sleep' AND Info = 'SELECT SLEEP(10)'; # (4)
 disconnect con1; (5)
 <end of script>

The sniplet is intended for demonstration purposes and contains code which is obvious "crap" but needed to enforce the intended effect. Some notes:

 (5) should be replaced by
 connection con1;
 disconnect con1;
 --source include/wait_until_disconnected.inc
 connection default;
  • There should be a 'reap' for every preceeding 'send <statement>'. This is not just for academic completeness it also ensures here that (2) is really finished.
  • Do we really need 10 seconds within the SLEEP? According to my experiences sleep times <= 2 seconds are critical under high load. What about SLEEP(5)?

Retrieved from "http://forge.mysql.com/wiki/How_to_Create_Good_Tests"

This page has been accessed 6,120 times. This page was last modified 15:17, 1 April 2009.

Find

Browse
MySQLForge
Main Page
Current events
Recent changes
Random page
Help
Edit
Edit this page
Editing help
This page
Discuss this page
Post a comment
Printable version
Context
Page history
What links here
Related changes
My pages
Special pages
New pages
File list
Statistics
Bug reports
More...