Category: RandomQueryGenerator

RandomQueryGeneratorToDo


The following items are known to be deficient or missing from the Random Query Generator. Ideas or patches on how to solve those are very welcome!

As you put items on this page, please put a small note as to why the feature would be nice to have, e.g. how it is going to increase productivity, prevent mistakes or allow new stuff to be tested.

Please note that this page resides on forge.mysql.com, and is therefore public.

You may use the Random Query Generator team's Launchpad Blueprints to track the design and implementation of specific functionality. Specifications may be added to or listed at the following wiki page:

RandomQueryGeneratorSpecifications


The Framework

The Validators

The Grammar

SELECT _field FROM _table ;

The field may not exist in the table, because the field is chosen before the table. Possible solutions are to pick only a table that contains the field already selected, or to pick the table first even though it appears later in the statement.
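A minimal sketch of the second solution, written in Python for illustration only (RQG itself is written in Perl). The SCHEMA dictionary and the expand_select function are hypothetical names, not part of RQG. The only point is the order: the table is drawn first and the field is then drawn from that table's columns, even though the field appears earlier in the statement.

```python
import random

# Hypothetical schema: table name -> column names (not taken from RQG).
SCHEMA = {
    "t1": ["pk", "int_key"],
    "t2": ["pk", "varchar_key", "date_col"],
}

def expand_select(schema, rng):
    """Expand 'SELECT _field FROM _table' so the field always exists in the table."""
    table = rng.choice(sorted(schema))   # pick the table first ...
    field = rng.choice(schema[table])    # ... then a field of that table
    return f"SELECT {field} FROM {table} ;", table, field

if __name__ == "__main__":
    print(expand_select(SCHEMA, random.Random(1))[0])
```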

The Executor

One solution would be to use mysql_use_result() and discard the result altogether if no Validators are defined to process this query. Alternatively, we could allow a Validator to process the result set one row at a time and still use mysql_use_result(). However, this would cause affected_rows to become unreliable. In addition, the MySQL server may be forced to hold locks for longer, disrupting the concurrency of the test.

The Data Generator


Unsorted stuff by Matthias Leich

I am new to RQG, so I cannot exclude that there are already existing or simple solutions for the issues I found. And thank you for this great tool. Aside from its great capabilities around testing, I at least have a lot of fun when using RQG. Feel free to remove any of my entries if they are outdated.

Duplicate rule definitions such as
 table_name:
    ABC ;
 ...
 table_name:
    DEF ;
cause bad effects (the last definition always wins).
A check within RQG that catches such duplicates would be very helpful.
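Such a duplicate check could look roughly like the following Python sketch (illustration only; RQG is written in Perl). It assumes that a rule definition is any line that starts with a name followed by a colon. Real .yy grammars would need a more careful parser (comments, embedded Perl, quoted strings), and the function name duplicate_rules is hypothetical.

```python
import re
from collections import Counter

# Assumption: a rule definition is a line starting with "name:".
RULE_DEF = re.compile(r"^[ \t]*([A-Za-z_]\w*)[ \t]*:", re.MULTILINE)

def duplicate_rules(grammar_text):
    """Return the sorted names of rules that are defined more than once."""
    counts = Counter(RULE_DEF.findall(grammar_text))
    return sorted(name for name, n in counts.items() if n > 1)

if __name__ == "__main__":
    sample = "table_name:\n    ABC ;\nquery:\n    SELECT 1 ;\ntable_name:\n    DEF ;\n"
    print(duplicate_rules(sample))
```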
  • _field_list - comma-separated random list containing the names of all existing columns
Example of _field_list content and usage:
 CREATE TABLE def SELECT _field_list FROM abc
finally becomes
 CREATE TABLE def SELECT f3,f1,f2 FROM abc;
  • _field_count - number of existing columns
Example of _field_count and usage:
 SELECT * FROM abc ORDER BY ( _field_count - 1 )
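The intended behaviour of the two proposed items can be sketched as follows (Python, for illustration only). The function names field_list and field_count are hypothetical stand-ins for the proposed _field_list and _field_count, and the column list is passed in explicitly rather than read from a server.

```python
import random

def field_list(columns, rng):
    """Comma-separated list of all column names in random order (_field_list)."""
    shuffled = list(columns)
    rng.shuffle(shuffled)
    return ",".join(shuffled)

def field_count(columns):
    """Number of existing columns (_field_count)."""
    return len(columns)

if __name__ == "__main__":
    cols = ["f1", "f2", "f3"]
    rng = random.Random()
    print(f"CREATE TABLE def SELECT {field_list(cols, rng)} FROM abc;")
    print(f"SELECT * FROM abc ORDER BY ( {field_count(cols)} - 1 )")
```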
Constructs like
_field <comparison operator> <_digit or _year or ...>
quite often produce warnings about data truncation or fail in strict SQL_MODE, because the statically assigned data value does not fit the data type of the randomly picked column. It would be nice, but not extremely important, to have a solution such as
_field > $_field_value
where $_field_value provides data of the correct type.
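A rough sketch of what a type-aware $_field_value could do, in Python for illustration only. The type names, the value ranges and the helper names are assumptions made for this example and are not RQG's actual data definitions.

```python
import random

# Hypothetical generators per column type (illustrative assumptions).
VALUE_GENERATORS = {
    "int": lambda rng: str(rng.randint(-2**31, 2**31 - 1)),
    "year": lambda rng: str(rng.randint(1901, 2155)),
    "char": lambda rng: "'" + rng.choice("abcdefghijklmnopqrstuvwxyz") + "'",
}

def field_value(column_type, rng):
    """Return an SQL literal that fits the given column type."""
    return VALUE_GENERATORS[column_type](rng)

def comparison(field, column_type, rng, operator=">"):
    """Build '<field> <operator> <value>' with a value of the matching type."""
    return f"{field} {operator} {field_value(column_type, rng)}"

if __name__ == "__main__":
    print(comparison("col_year", "year", random.Random()))
```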
LIKE '%int%'
gets expanded to something like
LIKE '%146079744%'
though there is no '_' before the 'int'.
Workaround (by Philip): Use
LIKE '%INT%'
instead.
Use case: SHOW FULL COLUMNS FROM t1_161 LIKE ....
  • '_field' is not expanded (most probably because of the inconsistent quoting mentioned above).
  • A piece of a string which occurs quite often in column names is "int". Therefore I tried this.
This separation could be done by requiring that user-defined grammar items follow some naming convention (ugly-looking example: "__update", where the leading "__" is the required part).
If RQG checked that the user-defined grammar follows this naming convention, then a lot of debugging in grammars with unexpected behavior could probably be avoided.
Example:
table:
  table_name | table_name | (SELECT * FROM table_name) ;
table_name:
  ...
... <many lines later> ...
pick_table_name1:
  # Fill $table_name1
  SET @aux = 'Merge table will use base_table_name_sl '; <more statements>
Outcome: Some statements were missing altogether, grammar items had no value, etc.
Example: see last example
We should have a grammar syntax checker.

https://blueprints.launchpad.net/randgen/+spec/yy-grammar-syntax-checker

Finding the right pattern is quite error-prone.
The assigned pattern might be either plain wrong (example: typo) or too restrictive (example: the distance between two expected words fluctuates).
Bad effect: After some hours of simplifier runs failing on the first oracle check, it turns out that the pattern was wrong.
It would be nice to have a tool which checks whether an expected output pattern, given on the command line or in a config file, matches the content of a given file.

https://blueprints.launchpad.net/randgen/+spec/expected-output-checker
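The core of such a checker is small. The following Python sketch is only an illustration under assumptions: the function name pattern_matches_file is hypothetical, the pattern is treated as a regular expression, and "." is allowed to span line breaks so that a fluctuating distance between two expected words can be covered with ".*".

```python
import re

def pattern_matches_file(pattern, path):
    """Return True if the regular expression matches anywhere in the file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # re.DOTALL lets '.' match newlines, so 'word1.*word2' may span lines.
        return re.search(pattern, handle.read(), re.DOTALL) is not None
```

Running this once against a log that is known to show the failure, before starting a long simplifier run, would catch a mistyped pattern within seconds.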

  • The current grammar simplifier has only one strategy, which goes deeper (into some element of a rule) as soon as its attempt to remove some element has failed.
This has the following disadvantages:
  • The simplifier goes into the simplification of details very early, whereas more frequently used or top-level rules could be simplified too. The hypothesis is that giving priority to the removal of such frequently used or top-level rules speeds up the simplification process.
  • If the simplification process gets interrupted, the next simplification process will again focus first on the same grammar rules where the first simplification was not successful.
  • Therefore we should have at least one alternative simplification strategy.

https://blueprints.launchpad.net/randgen/+spec/alternative-grammar-simplification-strategy

  • Any improvement is welcome, though I am aware that the structure of a specific grammar has a high impact on how fast grammar simplification can be. So the grammar author will always be responsible, to a significant extent, for the speed of possible grammar simplification.
Have a negative list with string patterns which occur within the backtrace when some known bug is hit. This negative list is stored in a file to be processed by the RQG simplifier.
 Example:
 Item_func::fix_fields # Bug....
 Diagnostics_area::set_ok_status # Bug....
The RQG simplifier could check whether any string from the negative list occurs within the backtrace. Reasons for the existence of a backtrace:
  • server crash
  • assert
  • RQG simplifier detected deadlock, hanging server etc. + killed the server process
If any of the negative list strings is found, then the simplification attempt is evaluated as not successful.
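The negative-list check can be sketched as follows (Python, for illustration only). The file format is assumed to be the one in the example above, that is one plain string per line with an optional "#" comment, and the function names are hypothetical.

```python
def load_negative_list(text):
    """Parse lines like 'Item_func::fix_fields # Bug....' into plain strings."""
    patterns = []
    for line in text.splitlines():
        pattern = line.split("#", 1)[0].strip()   # drop the trailing comment
        if pattern:
            patterns.append(pattern)
    return patterns

def hits_known_bug(backtrace, patterns):
    """True if any negative-list string occurs within the backtrace."""
    return any(pattern in backtrace for pattern in patterns)
```

A simplification attempt for which hits_known_bug() returns True would then be counted as not successful.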
Greedy hunt mode 1: Check the grammar with seed 1 to n (n is configurable).
Greedy hunt mode 2: Check the grammar with seed n (n is configurable).
Both hunt modes stop if
  • a backtrace is found which matches no entry of the negative list, and the positive list is empty.
  • a backtrace is found which matches no entry of the negative list but matches an entry of the positive list.
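The stop condition of both hunt modes can be written as one small function. This Python sketch is an illustration only and rests on one reading of the two rules above: "<>" is taken to mean "contains no entry of" and "=" to mean "contains at least one entry of". The function name should_stop is hypothetical.

```python
def should_stop(backtrace, negative, positive):
    """Decide whether a hunt run found something worth stopping for."""
    if not backtrace:
        return False                  # nothing crashed or hung in this run
    if any(p in backtrace for p in negative):
        return False                  # known bug: keep hunting
    if not positive:
        return True                   # unknown backtrace and no positive list
    return any(p in backtrace for p in positive)
```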
My first prototype (I am aware that shell + GNU tools have portability problems and that it could be coded more elegantly) as a proof of concept:
PROT=/dev/shm/prt1
SEED=1
RUN=1
# Negative list: backtrace strings of already known bugs.
bug47249='MDL_global_lock::is_lock_type_compatible'
bug47150='Field_long::val_int'
bug46615='Query_cache::invalidate'
# bug46958 has a pattern like bug46425
bug46425='Diagnostics_area::set_ok_status'
while [ "$RUN" -eq 1 ]
do
   # Redirect stderr before the pipe so that tee records both streams.
   perl runall.pl --mem \
   --basedir=/work2/6.0/mysql-next-bugfixing --threads=1 --queries=10000 \
   --reporter=Deadlock,Backtrace --mysqld=--table-lock-wait-timeout=1 \
   --mysqld=--innodb-lock-wait-timeout=1 --grammar=conf/bugnnnn-experimental.yy \
   --gendata=conf/WL5004_data.zz --seed=$SEED 2>&1 | tee "$PROT.$SEED"
   grep -n 'exit status 0' "$PROT.$SEED"
   RC=$?
   # RC = 1 means that 'exit status 0' was not found, so the run hit some problem.
   if [ $RC -eq 1 ]
   then
      # Count the lines which match an entry of the negative list.
      CNT=$(grep -E -c \
      -e "$bug47249" \
      -e "$bug47150" \
      -e "$bug46615" \
      -e "$bug46425" \
      "$PROT.$SEED")
      if [ "$CNT" -eq 0 ]
      then
         # No known bug matched: stop the hunt.
         RUN=0
      fi
   fi
   rm "$PROT.$SEED"
   SEED=$((SEED + 1))
done
This prototype performed well and I found several new bugs. It must be mentioned that my huge grammar (~1500 lines, 67 KiB) suits this script very well. My grammar
  • is very sensitive to changes of seed
  • (a) runs into a known bug with high likelihood
  • (b) checks many features
(a) and (b) are the most problematic properties. If I disable the checking of some features because I want to avoid (a), then (b) no longer holds. But then I also get a rapid decrease in the chance that my grammar catches any unknown bug.
  • the environment variable MTR_BUILD_THREAD for the computation of
  • vardirs -- where the tables, server logs etc. are stored
  • ports
This would make parallel runs of RQG easier.
  • MTR version 2 for starting the server etc.
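How MTR_BUILD_THREAD could drive the vardir and the ports is sketched below (Python, for illustration only). The port formula, the port offsets, the directory layout and the function name derive_run_settings are assumptions made for this example; the real values would have to be aligned with what MTR does.

```python
import os

def derive_run_settings(environ=None, vardir_root="/tmp/rqg"):
    """Derive per-run vardir and ports from MTR_BUILD_THREAD (assumed scheme)."""
    environ = os.environ if environ is None else environ
    build_thread = int(environ.get("MTR_BUILD_THREAD", "0"))
    base_port = 10000 + 10 * build_thread     # assumed formula
    return {
        "vardir": f"{vardir_root}/var_{build_thread}",
        "master_port": base_port,
        "slave_port": base_port + 1,          # assumed offset
    }

if __name__ == "__main__":
    print(derive_run_settings())
```

Two RQG runs started with different MTR_BUILD_THREAD values would then use disjoint ports and directories and could run in parallel on the same box.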
1. Execute a statement mix on the master server.
2. Cause the slave server to crash.
3. Restart the slave server and let it recover.
4. Bring the master and slave servers into a state in which it can be checked whether both are in sync.

Simplification

be passed as an option from the simplification script.

Combinations

Currently the combinations facility only acts on the top-level query element. If you have lower-level rules such as "ddl" and "dml", the Combinations facility will not open them up and treat their items as individual query types to be enabled/disabled. A solution would be to take the N top-most levels of the query tree and generate combinations using all of them, not just the "query".
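The proposed solution can be sketched as follows (Python, for illustration only). The GRAMMAR dictionary is a hypothetical, heavily simplified stand-in for a parsed .yy grammar in which every alternative is either a rule name or a literal statement, and the function names are not part of RQG.

```python
from itertools import combinations

# Hypothetical grammar: rule name -> list of alternatives.
GRAMMAR = {
    "query": ["ddl", "dml"],
    "ddl": ["CREATE TABLE t (a INT)", "DROP TABLE t"],
    "dml": ["INSERT INTO t VALUES (1)", "UPDATE t SET a = 2"],
}

def query_types(grammar, rule="query", levels=1):
    """Collect the alternatives found after opening up `levels` levels of rules."""
    items = []
    for alternative in grammar[rule]:
        if levels > 1 and alternative in grammar:
            items.extend(query_types(grammar, alternative, levels - 1))
        else:
            items.append(alternative)
    return items

def all_combinations(items):
    """Every non-empty subset of query types that could be enabled together."""
    return [subset for size in range(1, len(items) + 1)
            for subset in combinations(items, size)]

if __name__ == "__main__":
    print(query_types(GRAMMAR, levels=1))   # only the top-level "query" items
    print(query_types(GRAMMAR, levels=2))   # "ddl" and "dml" opened up
```

The number of subsets doubles with every additional item, so N would have to stay small in practice.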

Reporters

A DynamicBinlogFormat Reporter can be used to periodically change the binlog format for a running test. Combined with --rpl_mode, this allows tests to exercise all replication modes without having to have the SET BINLOG_FORMAT coded in the grammar.

A DynamicStorageEngine Reporter can be used to periodically change the default storage engine, so that grammars containing CREATE TABLE will pick up the new engine without having to have ENGINE = innodb|myisam|memory hard-coded in the grammar. This would allow the test to use only the engines that are currently available on the server. At the same time, ALTER-ing from one engine into another will still require a specific construct in the grammar.

Other generators

In addition to GenTest::Generator::FromGrammar (which is the actual RQG), other generators could be handy. A GenTest::Generator::FromFile which takes queries from a file sequentially could be useful if the RQG generates interesting queries after several hours (or days) which one would want to re-run in the framework.
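The behaviour of such a file-based generator can be sketched as follows (Python, for illustration only; a real GenTest::Generator::FromFile would be a Perl module). The class name, the next() method and the one-query-per-line file format are assumptions made for this example.

```python
class FromFileGenerator:
    """Hypothetical file-based generator: replay queries in file order."""

    def __init__(self, path):
        with open(path, encoding="utf-8") as handle:
            # Assumption: one query per line; blank lines are skipped.
            self._queries = [line.strip() for line in handle if line.strip()]
        self._position = 0

    def next(self):
        """Return the next query, or None when the file is exhausted."""
        if self._position >= len(self._queries):
            return None
        query = self._queries[self._position]
        self._position += 1
        return query
```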

One could also envision alternative random query generators based on external powerful tools, like DGL (http://cs.baylor.edu/~maurer/dgl.php).

Replication

MTR --start-and-exit does not work if a replication slave from a previous run is still present. On the other hand, using the Shutdown Reporter does cause both the master and the slave to shut down, but it also causes the final slave sync and dump in runall.pl to fail, which is reported with an error code of 255.

Retrieved from "http://forge.mysql.com/wiki/RandomQueryGeneratorToDo"

This page has been accessed 5,856 times. This page was last modified 13:09, 14 October 2010.
