MySQL General Log ParserDeveloped In: Perl — Contributed by: Gavin ToweyThe mysql general log outputs all statements and commands sent to the server in a plain text format. This is useful for many things such as gathering all queries that the server is running, including selects that wouldn't otherwise show up in the binary logs. However this format isn't always easy to parse and extract whole statements from because queries with newline characters mean that you have to read until the beginning of the next entry to find out where your statement actually ends. This perl script will take care of finding these boundaries and printing statements. The output can then be used more easily to feed into mysqlslap and/or maatkit's mk-query-profiler for instance.
Perl
Actually my problem is, to filter all threads from ONE db-user@host out of the generel log (a big-big generel log !! :-) "Connect xxx@host on" is helpfull, but then i am running in the problem with the query newlines an the "date time" fields. The following patch contains a bug fix to the regexes, which did not successfully match lines from the general log that begin with the date/time fields. Also added is the ability to output both date/time of query and connection ID of query. [code] --- mysql_log_parser.pl.orig 2008-09-11 11:19:02.000000000 -0400 +++ mysql_log_parser.pl 2008-09-11 11:21:00.000000000 -0400 @@ -4,6 +4,7 @@ use Getopt::Long; # author: Gavin Towey, 2008 gtowey@gmail.com +# contributions from: Moshe Hyzon, 2008 moshe.hyzon@grantstreet.com # todo, add "follow thread" capability # so we can give a process name or thread id & see # all activity in sequence for each thread @@ -16,7 +17,9 @@ "type|t=s", "pattern|p=s", "preserve-newlines|n", - "separator|s=s" ) + "separator|s=s", + "time|T", + "connection-id|c" ) ) { @@ -61,7 +64,7 @@ } if ( !$in_block - && $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6} [\d:]{8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) + && $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6}\s+[\d:]{7,8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) { # we have the beginning of a line if ( $#LINEBUFFER == 0 ) { # begin block capture @@ -72,7 +75,7 @@ elsif ($in_block) { - if ( $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6} [\d:]{8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) { + if ( $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6}\s+[\d:]{7,8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) { if ( $#LINEBUFFER > 0 ) { #end block # return everything up to this statement $query_found = ''; @@ -97,6 +100,7 @@ sub main { open( FILE, $file ); my $done = 0; + my $datetime = ''; while ( !$done ) { my $query = get_next_query( \*FILE ); if ( $query eq -1 ) { @@ -104,9 +108,12 @@ } else { chomp($query); - $query =~ /^(\d{6} [\d:]{8})?\s+(\d+)\s(\w+)(\s+(.*))?/s; - my ($type, $query ) = (lc($3), $5); + $query =~ /^(\d{6}\s+[\d:]{7,8})?\s+(\d+)\s(\w+)(\s+(.*))?/s; + my ($type, $id, $query ) = (lc($3), $2, $5); + if ($1 != '') { $datetime = $1 ; } if (!$OPTIONS{'preserve-newlines'}) { $query =~ s/[\r\n]/ /g; } + if ($OPTIONS{'connection-id'}) { $query = "$id\t$query"; } + if ($OPTIONS{'time'}) { $query = "$datetime\t$query"; } if ( $type eq $OPTIONS{'type'}) { if (defined($OPTIONS{'pattern'})) { if ( $query =~ /$OPTIONS{'pattern'}/ ) { @@ -159,10 +166,18 @@ Keep original newlines in multiline queries default is to make all queries single line. - -separator=s + --separator=s -s Add the separator after every query + --time + -T + Print the timestamp of each query + + --connection-id + -c + Print the connection ID of the query + EOF exit; } [/code] Moshe Hyzon |
VotesWatches4 members are watching this tool
You must be logged in to track this tool.
Provide Feedback
You must be logged in to comment
|
Let's see if a <pre> block helps formatting any
--- mysql_log_parser.pl.orig 2008-09-11 11:19:02.000000000 -0400 +++ mysql_log_parser.pl 2008-09-11 11:21:00.000000000 -0400 @@ -4,6 +4,7 @@ use Getopt::Long; # author: Gavin Towey, 2008 gtowey@gmail.com +# contributions from: Moshe Hyzon, 2008 moshe.hyzon@grantstreet.com # todo, add "follow thread" capability # so we can give a process name or thread id & see # all activity in sequence for each thread @@ -16,7 +17,9 @@ "type|t=s", "pattern|p=s", "preserve-newlines|n", - "separator|s=s" ) + "separator|s=s", + "time|T", + "connection-id|c" ) ) { @@ -61,7 +64,7 @@ } if ( !$in_block - && $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6} [\d:]{8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) + && $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6}\s+[\d:]{7,8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) { # we have the beginning of a line if ( $#LINEBUFFER == 0 ) { # begin block capture @@ -72,7 +75,7 @@ elsif ($in_block) { - if ( $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6} [\d:]{8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) { + if ( $LINEBUFFER[$#LINEBUFFER] =~ /^(\d{6}\s+[\d:]{7,8})?\s+(\d+)\s(\w+)(\s(.*))?/ ) { if ( $#LINEBUFFER > 0 ) { #end block # return everything up to this statement $query_found = ''; @@ -97,6 +100,7 @@ sub main { open( FILE, $file ); my $done = 0; + my $datetime = ''; while ( !$done ) { my $query = get_next_query( \*FILE ); if ( $query eq -1 ) { @@ -104,9 +108,12 @@ } else { chomp($query); - $query =~ /^(\d{6} [\d:]{8})?\s+(\d+)\s(\w+)(\s+(.*))?/s; - my ($type, $query ) = (lc($3), $5); + $query =~ /^(\d{6}\s+[\d:]{7,8})?\s+(\d+)\s(\w+)(\s+(.*))?/s; + my ($type, $id, $query ) = (lc($3), $2, $5); + if ($1 != '') { $datetime = $1 ; } if (!$OPTIONS{'preserve-newlines'}) { $query =~ s/[\r\n]/ /g; } + if ($OPTIONS{'connection-id'}) { $query = "$id\t$query"; } + if ($OPTIONS{'time'}) { $query = "$datetime\t$query"; } if ( $type eq $OPTIONS{'type'}) { if (defined($OPTIONS{'pattern'})) { if ( $query =~ /$OPTIONS{'pattern'}/ ) { @@ -159,10 +166,18 @@ Keep original newlines in multiline queries default is to make all queries single line. - -separator=s + --separator=s -s Add the separator after every query + --time + -T + Print the timestamp of each query + + --connection-id + -c + Print the connection ID of the query + EOF exit; }