SQL query analysis with MySQL Proxy

Long before there was the official Query Analyzer (QUAN), a component of MySQL Enterprise, SQL analysis was possible using MySQL Proxy.

The following is an introduction to logging and query analysis with MySQL Proxy.

Get MySQL Proxy

You need to first download MySQL Proxy. In this example I am using the Linux RHEL5 64bit OS and Version 0.7.2

$ wget http://dev.mysql.com/get/Downloads/MySQL-Proxy/mysql-proxy-0.7.2-linux-rhel5-x86-64bit.tar.gz/from/http://mirror.trouble-free.net/mysql_mirror/
$ tar xvfz mysql-proxy-0.7.2-linux-rhel5-x86-64bit.tar.gz
$ ln -s mysql-proxy-0.7.2-linux-rhel5-x86-64bit mysql-proxy
$ export PATH=`pwd`/mysql-proxy/sbin:$PATH
$ mysql-proxy --help-all


MySQL Proxy uses TCP/IP, so it is important you connect via the actual hostname. You should first confirm this, as appropriate MySQL permissions may be necessary. For example:

$ mysql -h`hostname` -u -p

On confirmation this works, you can then connect directly to the proxy

$ mysql -h`hostname` -P4040 -u -p


$ cd mysql-proxy/share/doc/mysql-proxy/
$ wget -O log.lua http://ronaldbradford.com/mysql-dba/mysql-proxy/log.lua
$ mysql-proxy --proxy-lua-script=share/doc/mysql-proxy/log.lua &
$ tail -f mysql.log

This script is based on simple query logging which requires a modification to work in more current versions of MySQL proxy.

$ mysql -hhostname -P4040 -u -p
mysql>  SELECT host,user,password FROM mysql.user;
mysql>  SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema;
mysql>  SELECT NOW(), SLEEP(3);
mysql>  EXIT
$ cat mysql.log
2009-09-02 17:15:01     58 -- select @@version_comment limit 1
2009-09-02 17:16:15     58 -- SELECT host,user,password FROM mysql.user
2009-09-02 17:16:30     58 -- SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema
2009-09-02 17:16:39     58 -- SELECT NOW(), SLEEP(3)

Query Analysis

Restart proxy with the histogram.lua sample provided.

$ mysql-proxy --proxy-lua-script=share/doc/mysql-proxy/histogram.lua &

Connect and run some queries.

$ mysql -hhostname -P4040 -u -p
mysql>  SELECT host,user,password FROM mysql.user;
mysql>  SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema;
mysql>  SELECT NOW(), SLEEP(3);

While connected to the proxy, you can now review data from two pseudo tables.

mysql>  SELECT * FROM histogram.tables;
mysql>  SELECT * FROM histogram.queriesG
mysql>  DELETE FROM histogram.tables;
mysql>  DELETE FROM histogram.queries;

mysql> SELECT * FROM histogram.tables;
| table                     | reads | writes |
| information_schema.tables |     1 |      0 |
| mysql.user                |     1 |      0 |

mysql> SELECT * FROM histogram.queries;
| query                                                                                            | count | max_query_time | avg_query_time |
| SELECT @@version_comment LIMIT ?                                                                 |     1 |            300 |            300 |
| SELECT `table_schema` , COUNT( * ) FROM `information_schema` . `tables` GROUP BY `table_schema`  |     1 |           1822 |           1822 |
| SELECT `host` , `user` , `password` FROM `mysql` . `user`                                        |     1 |            494 |            494 |
| SELECT NOW( ) , SLEEP( ? )                                                                       |     1 |        3000735 |        3000735 |

Moving forward

The power is that with Lua you have the flexibility to write your own logging. Some improvements to these scripts could be.

  • Add the query time, number of rows, and result set size to the logging
  • Be able to sort histogram results or see top percentile. Being able to copy data into real tables would enable any level of analysis
  • Combine the logging and histogram scripts
  • Enable global enable/disabling of logging with SET GLOBAL commands
  • Support variable length IN queries, those that pass multiple values, so you end up with a subset of all queries
  • Provide a actual query example, making it easy to do a QEP. For normalized queries you need to do additional work to find values.
  • The histogram does not support the C API multi query functionality, where multiple queries can be passed to the server at one time. The problem is there is no way to time the individual queries.

Read on in SQL Analysis with MySQL Proxy – Part 2.


A good introduction document
MySQL Proxy – From architecture to implementation – OSCON 2008


  1. says

    The problem is that it adds extra overhead, more components, and potential failure points.
    Particularly the systems that are in the most need of something like this shouldn’t be burdened in that way, their systems can’t handle it.
    From your list, I think most if not all are covered by the microslow extensions in OurDelta/Percona builds.

  2. says

    tcpdump-ing, then analyzing, is now my preferred way to do this. Zero impact on running systems, no need to restart, etc. (See mk-query-digest).