Long before there was the official Query Analyzer (QUAN), a component of MySQL Enterprise, SQL analysis was possible using MySQL Proxy.
The following is an introduction to logging and query analysis with MySQL Proxy.
Get MySQL Proxy
You need to first download MySQL Proxy. In this example I am using the Linux RHEL5 64bit OS and Version 0.7.2
$ wget http://dev.mysql.com/get/Downloads/MySQL-Proxy/mysql-proxy-0.7.2-linux-rhel5-x86-64bit.tar.gz/from/http://mirror.trouble-free.net/mysql_mirror/ $ tar xvfz mysql-proxy-0.7.2-linux-rhel5-x86-64bit.tar.gz $ ln -s mysql-proxy-0.7.2-linux-rhel5-x86-64bit mysql-proxy $ export PATH=`pwd`/mysql-proxy/sbin:$PATH $ mysql-proxy --help-all
Pre-requisites
MySQL Proxy uses TCP/IP, so it is important you connect via the actual hostname. You should first confirm this, as appropriate MySQL permissions may be necessary. For example:
$ mysql -h`hostname` -u -p
On confirmation this works, you can then connect directly to the proxy
$ mysql -h`hostname` -P4040 -u -p
Logging
$ cd mysql-proxy/share/doc/mysql-proxy/ $ wget -O log.lua http://ronaldbradford.com/mysql-dba/mysql-proxy/log.lua $ mysql-proxy --proxy-lua-script=share/doc/mysql-proxy/log.lua & $ tail -f mysql.log
This script is based on simple query logging which requires a modification to work in more current versions of MySQL proxy.
$ mysql -hhostname -P4040 -u -p mysql> SELECT host,user,password FROM mysql.user; mysql> SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema; mysql> SELECT NOW(), SLEEP(3); mysql> EXIT
$ cat mysql.log 2009-09-02 17:15:01 58 -- select @@version_comment limit 1 2009-09-02 17:16:15 58 -- SELECT host,user,password FROM mysql.user 2009-09-02 17:16:30 58 -- SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema 2009-09-02 17:16:39 58 -- SELECT NOW(), SLEEP(3)
Query Analysis
Restart proxy with the histogram.lua sample provided.
$ mysql-proxy --proxy-lua-script=share/doc/mysql-proxy/histogram.lua &
Connect and run some queries.
$ mysql -hhostname -P4040 -u -p mysql> SELECT host,user,password FROM mysql.user; mysql> SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema; mysql> SELECT NOW(), SLEEP(3);
While connected to the proxy, you can now review data from two pseudo tables.
mysql> SELECT * FROM histogram.tables; mysql> SELECT * FROM histogram.queriesG mysql> DELETE FROM histogram.tables; mysql> DELETE FROM histogram.queries; mysql> SELECT * FROM histogram.tables; +---------------------------+-------+--------+ | table | reads | writes | +---------------------------+-------+--------+ | information_schema.tables | 1 | 0 | | mysql.user | 1 | 0 | +---------------------------+-------+--------+ mysql> SELECT * FROM histogram.queries; +--------------------------------------------------------------------------------------------------+-------+----------------+----------------+ | query | count | max_query_time | avg_query_time | +--------------------------------------------------------------------------------------------------+-------+----------------+----------------+ | SELECT @@version_comment LIMIT ? | 1 | 300 | 300 | | SELECT `table_schema` , COUNT( * ) FROM `information_schema` . `tables` GROUP BY `table_schema` | 1 | 1822 | 1822 | | SELECT `host` , `user` , `password` FROM `mysql` . `user` | 1 | 494 | 494 | | SELECT NOW( ) , SLEEP( ? ) | 1 | 3000735 | 3000735 | +--------------------------------------------------------------------------------------------------+-------+----------------+----------------+
Moving forward
The power is that with Lua you have the flexibility to write your own logging. Some improvements to these scripts could be.
- Add the query time, number of rows, and result set size to the logging
- Be able to sort histogram results or see top percentile. Being able to copy data into real tables would enable any level of analysis
- Combine the logging and histogram scripts
- Enable global enable/disabling of logging with SET GLOBAL commands
- Support variable length IN queries, those that pass multiple values, so you end up with a subset of all queries
- Provide a actual query example, making it easy to do a QEP. For normalized queries you need to do additional work to find values.
- The histogram does not support the C API multi query functionality, where multiple queries can be passed to the server at one time. The problem is there is no way to time the individual queries.
Read on in SQL Analysis with MySQL Proxy – Part 2.
References
A good introduction document
MySQL Proxy – From architecture to implementation – OSCON 2008
Arjen Lentz says
The problem is that it adds extra overhead, more components, and potential failure points.
Particularly the systems that are in the most need of something like this shouldn’t be burdened in that way, their systems can’t handle it.
From your list, I think most if not all are covered by the microslow extensions in OurDelta/Percona builds.
Baron says
tcpdump-ing, then analyzing, is now my preferred way to do this. Zero impact on running systems, no need to restart, etc. (See mk-query-digest).