Don't Assume – Session Scope

MySQL system variables and status variables have two scopes. These are GLOBAL and SESSION which are self explanatory.
This is important to realize when altering system variables dynamically. The following example does not produce the expected results.

mysql> USE test;
Database changed
mysql> CREATE TABLE example1(
    -> col1 VARCHAR(10) NOT NULL,
    -> PRIMARY KEY(id)
    -> );
Query OK, 0 rows affected (0.29 sec)

mysql> SHOW CREATE TABLE example1G
*************************** 1. row ***************************
       Table: example1
Create Table: CREATE TABLE `example1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `col1` varchar(10) NOT NULL,
  PRIMARY KEY (`id`)
1 row in set (0.00 sec)

We see that the table has a default CHARACTER SET of latin1. If you wanted to ensure all tables are created as utf8 you change the appropriate system variable. For example, we change the GLOBAL system variable and re-create the table.

mysql> SHOW GLOBAL VARIABLES like 'char%';
| Variable_name            | Value                                                          |
| character_set_client     | latin1                                                         |
| character_set_connection | latin1                                                         |
| character_set_database   | latin1                                                         |
| character_set_filesystem | binary                                                         |
| character_set_results    | latin1                                                         |
| character_set_server     | latin1                                                         |
| character_set_system     | utf8                                                           |
| character_sets_dir       | /Users/rbradfor/mysql/mysql-5.1.39-osx10.5-x86/share/charsets/ |
8 rows in set (0.00 sec)

mysql> SET GLOBAL character_set_server=utf8;
Query OK, 0 rows affected (0.10 sec)
mysql> DROP TABLE example1;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.12 sec)

mysql> SHOW CREATE TABLE example1G
*************************** 1. row ***************************
       Table: example1
Create Table: CREATE TABLE `example1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `col1` varchar(10) NOT NULL,
  PRIMARY KEY (`id`)
1 row in set (0.02 sec)

The table still is latin1. This is because now we have a SESSION scope that differs from the GLOBAL scope as seen in this output.

mysql> SELECT @@GLOBAL.character_set_server,@@SESSION.character_set_server;
| @@GLOBAL.character_set_server | @@SESSION.character_set_server |
| utf8                          | latin1                         |
1 row in set (0.00 sec)

The solution is easy however the trap can be easily overlooked and especially when changing other MySQL system variables.

mysql> SET SESSION character_set_server=utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> DROP TABLE example1;Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.09 sec)

mysql> SHOW CREATE TABLE example1G
*************************** 1. row ***************************
       Table: example1
Create Table: CREATE TABLE `example1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `col1` varchar(10) NOT NULL,
  PRIMARY KEY (`id`)
1 row in set (0.00 sec)

By default, and when not specified in SHOW and SET commands the default scope is GLOBAL, however prior to MySQL 5.0.2 the default was SESSION. A note you will find in the 5.0 Reference Manual but not the current GA version 5.1 Reference Manual. See also SHOW STATUS Gotcha written in August 2006.

There are also other gotchas with scope that we will discuss at some other time.


About “Don’t Assume”

“Don’t Assume” is a series of posts to help the Oracle DBA understand, use and appreciate the subtle differences and unique characteristics of the MySQL RDBMS in comparison to Oracle. These points as essential to operate MySQL effectively in a production environment and avoid any loss of data or availability.

For more posts in this series be sure to follow the mysql4oracledba tag and also watch out for other MySQL for Oracle DBA presentations.

The MySQLCamp for the Oracle DBA is a series of educational talks all Oracle DBA resources should attend. Two presentations from this series IGNITION and LIFTOFF will be presented at the MySQL Users Conference 2010 in Santa Clara, April 2010 This series also includes JUMPSTART and VELOCITY. If you would like to here these presentations in your area, please contact me.

Don't Assume Series – MySQL for the Oracle DBA

As part of my MySQLCamp for the Oracle DBA series of talks to help the Oracle DBA understand, use and appreciate MySQL I have also developed a series of short interesting posts I have termed “Don’t Assume”. Many of these are re-occurring points during my consulting experiences as I observe Oracle DBA’s using MySQL. I am putting the finishing touches to my MySQL for the Oracle DBA series of talks and I’m excited to highlight some of the subtle differences and unique characteristics of MySQL RDBMS in comparison to Oracle and some extent other products including SQL Server.

Stay tuned for more soon.

I will be presenting at the MySQL Users Conference 2010 in Santa Clara, April 2010 two presentations from this series, IGNITION and LIFTOFF. This series also includes JUMPSTART and VELOCITY.

Edward Screven of Oracle to Answer Questions for future of MySQL

For those of you on the O’Reilly MySQL conference list you will no doubt see this email, but for readers here is the important bits.

Oracle Executive Will Speak at O’Reilly MySQL Conference & Expo
Edward Screven to Answer Questions re: Future of MySQL

Sebastopol, CA, February 24, 2010—Wonder about the future of MySQL? Curious about what Oracle plans for the open source database software? Expect answers when Edward Screven, Oracle’s chief corporate architect and leader of the MySQL business, speaks at the O’Reilly MySQL Conference & Expo, scheduled for April 12-15, at the Santa Clara Convention Center and the Hyatt Regency Santa Clara.

Edward Screven reports to CEO Larry Ellison, and he drives technology and architecture decisions across all Oracle products to ensure that product directions are consistent with Oracle’s overall strategy. He’ll discuss the current and future state of MySQL, now part of the Oracle family of products. His presentation will also cover Oracle’s investment in MySQL technology and community, as well as the role that open source in general is playing within heterogeneous customer environments around the world.

I have not found a link yet to provide reference to this.

Ineffective concatenated indexes

In MySQL significant performance improvements can be achieved by the correct use of indexes. It is important to understand different MySQL index implementations and one key improvement on indexes defined on single columns is to use multiple column or more commonly known concatenated indexes.

However it’s also possible to define ineffective indexes. This example shows you how to identify a concatenated index that is ineffective.

CREATE TABLE example (
  x  VARCHAR(10),
  y  VARCHAR(10),
  z  VARCHAR(10),
UNIQUE INDEX (a,b,c,d)

INSERT INTO example(a,b,c,d) VALUES
(1,0,1,1),(1,0,1,2), (1,0,2,3), (1,0,4,5),
(2,0,2,1),(2,0,2,2), (2,0,2,3), (2,0,2,5),
(3,0,2,1),(3,0,2,3), (3,0,3,3), (3,0,3,5);

And our sample query is

SELECT id,x,y,z
FROM   example
WHERE  a = 2
AND    c = 2
AND    d = 3;

The EXPLAIN plan is

| id | select_type | table   | type | possible_keys | key  | key_len | ref   | rows | Extra       |
|  1 | SIMPLE      | example | ref  | a             | a    | 4       | const |    4 | Using where |

While we are using the index (see the key column), the full benefit of the index is not utilized (see the key_len). 4 indicates the number of bytes used, that is only 1 INT column.

Let’s look at a second example.

SELECT id,x,y,z
FROM   example
WHERE  a = 2
AND    b = 0
AND    c = 2
AND    d = 3;

| id | select_type | table   | type  | possible_keys | key  | key_len | ref                     | rows | Extra |
|  1 | SIMPLE      | example | const | a             | a    | 16      | const,const,const,const |    1 |       |

In this example the index is used however the key_len is 16, that is 4 x 4 byte INT columns. This is ideal for this index.

In the above example, the client was using a common data structure however was not using one column, it’s values were all effectively 0. Certain queries were written with this knowledge and instead of specifying the column, they elected to remove it, however the impact of this developer code change was increased load on the database and more inefficient performance.

While this was easily addressed by a code change, an alternative could have been to change the index definition. It was not possible to remove the column due to legacy requirements.

mysql> explain SELECT id,x,y,z FROM   example  WHERE  a = 2 AND    c = 2 AND    d = 3;
| id | select_type | table   | type  | possible_keys | key  | key_len | ref               | rows | Extra |
|  1 | SIMPLE      | example | const | a             | a    | 12      | const,const,const |    1 |       |

A better solution may be to enforce better integrity on this business rule, that b only contains 0. MySQL does not support check constraint. The closest option would be to use an ENUM data type and a STRICT sql_mode however ENUM is only a string object so this would break other code.


Not only is performance impacted in this situation, in other examples of ineffective indexes it’s simply a waste of diskspace to sort additional columns in an index that are never used, and also a waste of memory as the index pages stored in memory contain information that is not used.

Getting started with Cassandra

With the motivation from today’s public news on Twitter’s move from MySQL to Cassandra, my own skills desire following in-depth discussions at last November’s Open SQL Camp to consider Cassandra and yesterday’s discussion with a new client on persistent key-value store products, today I download installed and configured for the first time. Not that today’s news was unexpected, if you follow the Twitter Engineering Open Source projects you would have seen Cassandra as well as other products being used or evaluated by Twitter.

So I went from nothing to a working Cassandra node in under 5 minutes. This is what I did.

  1. While I knew this was an Apache project, a Google Search yields for me the 3rd link for the The Apache Cassandra Project at Congrats for Cassandra now a top level Apache Project. This url will update soon.
  2. Download Cassandra. Hard to miss with a big green button on home page. Current version is 0.5
  3. I read Getting Started, which is the 3rd top level link on menu after Home and Download. Step 1 is picking a version which I’ve already done, Step 2 is Running a single node.
  4. The Getting Started indicated a problem on Mac OS X for the required minimum Java version. I was installing on Mac OS X 10.5 and CentOS 5.4. I’ve experienced this Java 6 default path issue before. Set my JAVA_HOME and PATH accordingly (after I updated the wiki with correct value)
  5. I extracted the tar file, changed to the directory and took at look at the README.txt file. Yes, I always check this first with any software and relevant because it includes valuable instructions on creating the default data and log directories.
  6. Start with bin/cassandra -f. No problems!
  7. I then followed the instructions from the link in Step 2 with the CassandraCli. This tests and confirms the installation is operational.

Ok, a working environment. I’ve now installed on a second machine and tested however I now need to configure the cluster, and the documentation is not as straightforward. Time to try out Google again.

On a side note, this is one reason why I love Open Source. I followed the instructions online and found a mistake in the Mac OS X path, I simply registered and corrected providing the benefit of my experience for the next reader(s).

You may also like to view future posts including.

The correct approach to rolling MySQL logs

I say correct because there are several incorrect approaches to managing MySQL logs. In MySQL you have two important log files, the MySQL error log (configured with –log-error) and the MySQL slow query log (configured with –log-slow-queries or –slow-query-log and –slow-query-log-file which is available from 5.1.29).

The ideal management of these log files is different for each type of file.

The MySQL Error Log

With the error log you want to have one file showing a contiguous history of the server instance. This log provides valuable information over time and you should not discard this. You do NOT want to roll your error log. If for feel the content in the error log is too much, then these are errors you need to be addressing, and archiving after correcting. There are circumstances where error logs are rolled outside of your control.

The first is the FLUSH LOGS command. When this command is run, the error log is renamed to -old and a new log is created. For example:

$ ls -l log/error*
-rw-rw---- 1 mysql root  1733 Feb 22 16:11 error.log

$ mysql -uroot -p -e "FLUSH LOGS"

$ ls -l log/error*
-rw-rw---- 1 mysql mysql    0 Feb 22 18:08 error.log
-rw-rw---- 1 mysql root  1733 Feb 22 16:11 error.log-old

What happens when you run this command again?

$ mysql -uroot -p -e "FLUSH LOGS"

$ ls -l log/error*
-rw-rw---- 1 mysql mysql 0 Feb 22 18:10 log/error.log
-rw-rw---- 1 mysql mysql 0 Feb 22 18:08 log/error.log-old

You have now lost all valuable information in the error log, both the current log and the -old log are ZERO bytes in size.

The second example is the Ubuntu specific MySQL distribution on Ubuntu OS that logs MySQL information to the system error log (i.e. /var/log/syslog). You are then receiving a daily log rotate via the default OS log-rotate settings. You effectively lose information after 7 days. Here is what you will find on a stock Ubuntu server.

$ ls -l /var/log
-rw-r----- 1 mysql       adm       0 2008-05-28 20:33 mysql.err
-rw-r----- 1 mysql       adm       0 2008-05-28 20:33 mysql.log
-rw-r----- 1 syslog      adm  278480 2010-02-22 20:22 syslog
-rw-r----- 1 syslog      adm  366934 2010-02-22 06:25 syslog.0
-rw-r----- 1 syslog      adm   21025 2010-02-21 06:27 syslog.1.gz
-rw-r----- 1 syslog      adm   18551 2010-02-20 06:47 syslog.2.gz
-rw-r----- 1 syslog      adm   20086 2010-02-19 06:25 syslog.3.gz
-rw-r----- 1 syslog      adm   17135 2010-02-18 06:40 syslog.4.gz
-rw-r----- 1 syslog      adm   19238 2010-02-17 06:32 syslog.5.gz
-rw-r----- 1 syslog      adm   16101 2010-02-16 06:34 syslog.6.gz

You have to troll the syslog files to find any mysql specific information, even that is not possible with one command.

$ grep syslog syslog.0 | grep mysql

$ zcat syslog*gz | grep mysql
Feb 16 22:12:20 db1 mysqld[21769]: 100216 22:12:20 [ERROR] /usr/sbin/mysqld: Incorrect key file for table '/tmp/#sql_5508_40.MYI'; try to repair it
Feb 16 22:12:20 db1 mysqld[21769]: 100216 22:12:20 [ERROR] /usr/sbin/mysqld: Incorrect key file for table '/tmp/#sql_5508_33.MYI'; try to repair it
Feb 16 22:30:15 db1 mysqld[21769]: 100216 22:19:13 [ERROR] /usr/sbin/mysqld: Incorrect key file for table '/tmp/#sql_5508_29.MYI'; try to repair it
Feb 16 22:30:17 db1 mysqld[21769]: 100216 22:19:13 [ERROR] /usr/sbin/mysqld: Incorrect key file for table '/tmp/#sql_5508_44.MYI'; try to repair it

I’m lucky I looked, because tomorrow would have been too late to see these errors, as this is the oldest log file.

I place a disclaimer that if you have proper backups in place, these files are retrievable, but that takes time and effort which I consider unnecessary for a database server. The mysql log is one of the most important log files for this type of server, you want this information on the server.

The solution to this problem is easy in Ubuntu, always define the -log-error variable for the [mysqld] and [mysqld_safe] sections to a specific file, e.g. /var/log/mysql/mysql.log

The MySQL slow query log

With the slow query log, you DO want to rotate the log file produced. The best practice is to rotate this file daily. In addition you should both analyze the log file producing a top 5 or top 10 list of slow SQL queries each day. As load may not always predictable, it’s ideal to also analyze the combined logs of the past 7 days for cross reference.

The daily granularity allows you to track your load of slow performing queries more consistently and it also enables you verify more easily the impact of improvements made when they have been deployed.

Care needs to be taken to roll your log filed. Simply moving the log file will not work. For example.

A good tip to help in slow query SQL analysis is to
Comment your SQL (url)

$ mysql -uroot -p -e "select sleep(5)"
$ tail -1 log/slow.log
select sleep(5);

$ ls -l log/slow*
-rw-rw---- 1 mysql mysql 352 Feb 22 15:26 log/slow.log

$ mv log/slow.log log/slow.log.1; touch log/slow.log
$ ls -l log/slow*
-rw-r--r-- 1 root  root    0 Feb 22 15:26 log/slow.log
-rw-rw---- 1 mysql mysql 352 Feb 22 15:26 log/slow.log.1

$ mysql -uroot -p -e "SELECT SLEEP(4)"

$ ls -l log/slow*
-rw-r--r-- 1 root  root    0 Feb 22 15:26 log/slow.log
-rw-rw---- 1 mysql mysql 533 Feb 22 16:01 log/slow.log.1
$ tail -1 log/slow.log.1

As you can see, the slow log was not written to, but the previous file which has the same inode.

$ rm -f log/slow.log; mv log/slow.log.1 log/slow.log
$ ls -l log/slow.log*
-rw-rw---- 1 mysql mysql 533 Feb 22 16:01 log/slow.log
$ cp log/slow.log log/slow.log.`date +%M`; > log/slow.log
$ mysql -uroot -p -e "SELECT SLEEP(3)"
$ ls -l log/slow.log*
-rw-rw---- 1 mysql mysql 181 Feb 22 16:03 log/slow.log
-rw-r----- 1 root  root  533 Feb 22 16:03 log/slow.log.03
$  tail -1 log/slow.log

$ cp log/slow.log log/slow.log.`date +%M`; > log/slow.log
$ mysql -uroot -p -e "SELECT SLEEP(6)"
$ ls -l log/slow.log*
-rw-rw---- 1 mysql mysql 181 Feb 22 16:04 log/slow.log
-rw-r----- 1 root  root  533 Feb 22 16:03 log/slow.log.03
-rw-r----- 1 root  root  181 Feb 22 16:04 log/slow.log.04
$ tail -1 log/slow.log

As you can see by copying and truncating you can perform an effective log rotate manually. Ideally you should config logrotate to manage this log.

More Information

What's your MySQL version?

I’ve heard that the mechanic’s wife always has a car that needs repair or tuneup, the painter’s wife always had walls of peeling paint, you get the picture. What about MySQL DBA’s and their own databases? While I have many versions of MySQL for testing including for example the latest 5.1.44 which I was using for my previous post, what is running on my production server? Let’s see:

mysql> select version();
| version() |
| 5.1.25-rc |

That’s really old. And yes, to prove my point that we can be our own worst enemy, the previous version before 5.1.25 was 5.1.6. Yes, .6 which worked just fine, and never crashed once for my 20+ websites. While I have downloaded onto my production server several versions ready for upgrade including versions 5.1.30, 5.1,38, and 5.4.1 I’ve never actually gone through the upgrade process.

Migrating MySQL latin1 to utf8 – Character Set Options

Continuing on from preparation in our MySQL latin1 to utf8 migration let us first understand where MySQL uses character sets. MySQL defines the character set at 4 different levels for the structure of data.

  • Instance
  • Schema
  • Table
  • Column

In MySQL 5.1, the default character set is latin1. If not specified, this is what you will get. For example.

mysql> create table test1(c1 varchar(10) not null);
mysql> show create table test1G
Create Table: CREATE TABLE `test1` (
  `c1` varchar(10) NOT NULL

If you want all tables in your instance to always be a default of utf8, you can changed the server variable character_set_server. This can be set dynamically.

mysql> set global character_set_server=utf8;
mysql> set session character_set_server=utf8;
mysql> create table test2(c1 varchar(10) not null);
mysql> show create table test2G
Create Table: CREATE TABLE `test2` (
  `c1` varchar(10) NOT NULL

If you change this dynamically be sure to include the option in your my.cnf to ensure this option is persisted for a mysqld restart.

You can define the default character set for all new tables in a given schema. You specify this when you create the schema.

mysql> set global character_set_server=latin1;
mysql> set session character_set_server=latin1;
mysql> create schema test_ucs2 default character set ucs2;
mysql> use test_ucs2;
mysql> create table test3(c1 varchar(10) not null);
mysql> show create table test3G
Create Table: CREATE TABLE `test3` (
  `c1` varchar(10) NOT NULL

Even though we have a schema default, you can always specify the default character set for a given table which overrides any defaults.

mysql> use test_ucs2;
mysql> create table test4_utf8 (c varchar(10) not null) default charset utf8;
mysql> show create table test4_utf8G
Create Table: CREATE TABLE `test4_utf8` (
  `c` varchar(10) NOT NULL

And finally, if you really wanted to be specific you can define the character set on a per column level.

mysql> create table test4_utf8_latin1 (c varchar(10) not null, c2 varchar(20) charset latin1) default charset utf8;
mysql> show create table test4_utf8_latin1G
*************************** 1. row ***************************
       Table: test4_utf8_latin1
Create Table: CREATE TABLE `test4_utf8_latin1` (
  `c` varchar(10) NOT NULL,
  `c2` varchar(20) CHARACTER SET latin1 DEFAULT NULL

With great flexibility comes great responsibility. You should have a defined standard for your application that is simple and easy to understand. I am not a proponent of using utf8 for everything, the primary reason why is memory. As part of my consulting I spend a lot of time with clients that have limited resources, e.g. database servers with 2GB or 4GB of RAM. MySQL stores utf8 efficiently on disk, but when this data is stored in memory for internal usage, it automatically uses 3 bytes, when on disk it may only be 1 byte. You can test this by creating a MEMORY table with latin1 and utf8 examples and comparing the difference in size. Is this a serious problem? Well that depends on many factors such as the number of database connections, persistent or not persistent connections, the size of the results etc. While it’s difficult in MySQL to instrument the memory precisely on a per connection basis, prudence should be a consideration for any physical resources, especially RAM.

Now that we understand what’s possible, how can we change our existing latin1 tables in our preparation example?

We could try a simple ALTER TABLE command.

mysql> alter table test_latin1 default charset utf8;
mysql> show create table test_latin1G
Create Table: CREATE TABLE `test_latin1` (
  `c` varchar(100) CHARACTER SET latin1 NOT NULL

This does not work, because we are only changing the default storage engine of the table. The underlying columns remain the same. If we were to add a new column, it would default to utf8. We can however achieve what we expected with the CONVERT option.

mysql> alter table test_latin1 convert to character set utf8;
mysql> show create table test_latin1G
Create Table: CREATE TABLE `test_latin1` (
  `c` varchar(100) NOT NULL

We look at our data, and it looks great? Are we done with our conversion?

mysql> select * from test_latin1;
| c             |
| a             |
| abc           |
| ☺           |
| abc ☺☹☻ |

The answer is no. While it may look like the data is correct, MySQL also manages character sets for the communication channel. In this case, we are still communicating in latin1. To ensure moving forward in the future we must always communicate in utf8 to ensure we correctly pass utf8 to the database. We can test this with the mysql client, and as you will see our data is still corrupt.

mysql> set names utf8;
mysql> select * from test_latin1;
| c                      |
| a                      |
| abc                    |
| ☺                 |
| abc ☺☹☻ |

mysql> show session variables like 'character%';
| Variable_name            | Value                                                          |
| character_set_client     | utf8                                                           |
| character_set_connection | utf8                                                           |
| character_set_database   | latin1                                                         |
| character_set_filesystem | binary                                                         |
| character_set_results    | utf8                                                           |
| character_set_server     | latin1                                                         |
| character_set_system     | utf8                                                           |

While you can see how we could migrate the schema definition, this does not complete our migration. In my next post, I will discuss the various different ways to correctly perform a data migration between latin1 and utf8.

Checked your MySQL recovery process recently?

I sound like a broken record with every client when I talk to about the resilience of their production environments. It’s very simple in theory, however in practice many organizations fail.

Ask yourself these checklist questions for your MySQL backup and recovery process?

  1. Do you have MySQL backups in place?
  2. Do you backup ALL your MySQL data?
  3. Do you have consistent MySQL backups?
  4. Do you have backups that include both static snapshot and point in time transactions?
  5. Do you review your backup logs EVERY SINGLE day or have tested backup monitoring in place?
  6. Do you perform a test recovery of your static backup?
  7. Do you perform a test recovery to point in time?
  8. Do you time your backup and recovery process and review over time?
  9. Do you have off-site copies of your backups?
  10. Do you backup your primary binary logs?

In the past month I’ve discovered clients that have an online only business (i.e. Their MySQL data is their only tangible asset), they perform daily backups but they don’t have binary logging enabled. I’ve also discovered an example of backup logs not being checked, and an underlying mysqldump error was resulting in an incomplete backup, yet the backup script apparently completed successfully.

Disaster is inevitable.

If you don’t score 8 or better in the above checklist in your business, you are at higher risk. If you are a owner/founder/executive this should keep you awake at night if your not sure of your business viability. If your organization needs help, please contact me for assistance.

  • Can you recover from a small or large disaster?
  • Do you have confidence in your DR plan?
  • Do you know how long your DR plan will take.
  • What does your online business look like or operate during your DR time?

The Blue Pill or the Red Pill

At the recent FOSDEM 2010 event, I presented in my keynote Dolphins, now and beyond a option which I termed the “Blue Pill” or the “Red Pill”. The following slide produced noticed interest in a packed room, and subsequent conversation.

While the ownership of MySQL has changed, the option between MySQL and Oracle as a product for use still remains. While MySQL is the most popular for modern online applications, Oracle continues to have the widely used enterprise database product and has a large number of Oracle DBAs in the IT marketplace.

Over the past 5 years I have presented a number of topics on MySQL for Oracle DBA’s. At the upcoming MySQL Users Conference 2010 I will be presenting the first two talks in my new series “MySQLCamp for the Oracle DBA”.

  • IGNITION – is the preparation necessary for a successful launch of a MySQL ecosystem for an Oracle DBA. This volume covers the preparation needed to be ready for ongoing production administration of MySQL.
  • LIFTOFF – Only after a successful preparation covered in IGNITION can you be ready for the implementation and management of a MySQL ecosystem and a successful launch of your product.

Migrating MySQL latin1 to utf8 – Preparation

This article is Part 1 of a series of articles regarding MySQL character set conversion.
Be sure to also check out character set options and the process for more information.

Before undertaking such migration the first step is a lesson in understanding more about how latin1 and utf8 work and interact in MySQL. latin1 in a common and historical character set used in MySQL. utf8 (first available in MySQL Version 4.1) is an encoding supporting multiple bytes and is the system default in MySQL 5.0

  • latin1 is a single byte character set.
  • utf8 is a 1-3 byte character set depending on the size of the character. NOTE: MySQL utf8 does not support the RFC 3629 4 byte sequences. (Updated: MySQL 5.5 now supports full Unicode support with the ” utf8m4″charset))

MySQL variables

MySQL has a number of different system variables to consider, the following is the default representation in MySQL 5.1

mysql> show global variables like '%char%';
| Variable_name            | Value                                                          |
| character_set_client     | latin1                                                         |
| character_set_connection | latin1                                                         |
| character_set_database   | latin1                                                         |
| character_set_filesystem | binary                                                         |
| character_set_results    | latin1                                                         |
| character_set_server     | latin1                                                         |
| character_set_system     | utf8                                                           |

MySQL enables you to specify the character set for data at the Instance,Schema,Table and Column level. This is important because this complexity, especially between the Table and Columns can catch you out.

It is also important to ensure that not only is data stored in the appropriate format, the communication between client and server also needs to support the character set you wish to use.

latin1 example

We start by creating a simple table, inserting some data, and reviewing the data.

mysql> create table test_latin1(c varchar(100) not null) default charset latin1;
mysql> insert into test_latin1(c) values ('a'),('abc'),('☺'),('abc ☺☹☻');

mysql> select c,length(c),char_length(c),charset(c),hex(c) from test_latin1;
| c             | length(c) | char_length(c) | charset(c) | hex(c)                     |
| a             |         1 |              1 | latin1     | 61                         |
| abc           |         3 |              3 | latin1     | 616263                     |
| ☺             |         3 |              3 | latin1     | E298BA                     |
| abc ☺☹☻       |        13 |             13 | latin1     | 61626320E298BAE298B9E298BB |
4 rows in set (0.00 sec)

We use the LENGTH() and CHAR_LENGTH() functions to confirm the known length of the string and the true number of characters.

In the above examples, the smiley characters can be reproduced in a web page with the following.


utf8 example

In a separate session (because we re-use these later) we repeat for utf8.

mysql> set  names utf8;
mysql> show session variables like '%char%';
| Variable_name            | Value                                                          |
| character_set_client     | utf8                                                           |
| character_set_connection | utf8                                                           |
| character_set_database   | latin1                                                         |
| character_set_filesystem | binary                                                         |
| character_set_results    | utf8                                                           |
| character_set_server     | latin1                                                         |
| character_set_system     | utf8                                                           |

mysql> create table test_utf8(c varchar(100) not null) default charset utf8;
mysql> insert into test_utf8(c) values ('a'),('abc'),('☺'),('abc ☺☹☻');

mysql> select c,length(c),char_length(c),charset(c), hex(c) from test_utf8;
| c             | length(c) | char_length(c) | charset(c) | hex(c)                     |
| a             |         1 |              1 | utf8       | 61                         |
| abc           |         3 |              3 | utf8       | 616263                     |
| ☺            |         3 |              1 | utf8       | E298BA                     |
| abc ☺☹☻       |        13 |              7 | utf8       | 61626320E298BAE298B9E298BB |
4 rows in set (0.00 sec)

As you can see, the key difference here is the CHAR_LENGTH() of the utf8 strings differ.

Mixing latin1 with utf8

If we now look at the latin1 data in the utf8 session we see that while the underlying data via LENGTH(),CHAR_LENGTH() and HEX() remains unchanged (because this is the database representation of the data), the actual output presentation is garbled due to the mismatch in the client communication.

mysql> select c,length(c),char_length(c),charset(c), hex(c) from test_latin1;
| c                      | length(c) | char_length(c) | charset(c) | hex(c)                     |
| a                      |         1 |              1 | latin1     | 61                         |
| abc                    |         3 |              3 | latin1     | 616263                     |
| ☺                    |         3 |              3 | latin1     | E298BA                     |
| abc ☺☹☻          |        13 |             13 | latin1     | 61626320E298BAE298B9E298BB |
4 rows in set (0.00 sec)

If we look at the utf8 data in latin1, we see a different garbled representation.

mysql> select c,length(c),char_length(c),charset(c),hex(c) from test_utf8;
| c       | length(c) | char_length(c) | charset(c) | hex(c)                     |
| a       |         1 |              1 | utf8       | 61                         |
| abc     |         3 |              3 | utf8       | 616263                     |
| ?       |         3 |              1 | utf8       | E298BA                     |
| abc ??? |        13 |              7 | utf8       | 61626320E298BAE298B9E298BB |
4 rows in set (0.00 sec)


Armed now with a basic understanding we are ready to consider what approaches we may undertake to migrate this example table, and how we may be able to verify our data.


Character Set Support
Connection Character Sets and Collations

Beyond MySQL GA: patches, storage engines, forks, and pre-releases – FOSDEM 2010

Kristian Nielsen presented “Beyond MySQL GA: patches, storage engines, forks, and pre-releases”.
This included a history of current products:

Google Patches (5.0 & 5.1) included improvements in :

  • statistics/monitoring
  • lock contention
  • binlog
  • malloc()
  • filesorts
  • innodb I/O and wait statistics
  • SHOW …STATISTICS statements
  • smp scalability
  • I/O scalability
  • semisync replication
  • many more

Percona Patches (5.0) focus on

  • statistics/monitoring
  • performance/scalability
  • buffer pool content/mutexes
  • microslow patch

These have been ported to 5.1 and mainly integrated into XtraDB.

EBay Patches (5.0) have included:

  • variable length memory storage engine
  • pool of threads
  • Virtual columns

XtraDB storage engine (5.1) includes

  • Percona patches
  • Google patches
  • Innodb patches
  • Has XtraBackup for backup

Other engines/patches discussed included:

  • PBXT storage engine – community contribution
  • FederatedX – replacement to Federated
  • Sphinx storage engine
  • Pinba storage engine – Collects PHP statistics
  • Others OQGraph/Spider
  • Galera – Synchronous replication
  • Drizzle

Alternative packaging options for MySQL 5.0 and MySQL 5.1 including Our Delta, Percona and MariaDB.

FOSDEM 2010 MySQL Developer Room Schedule
FOSDEM 2010 Website
Brussels, Belgium
February 7, 2010

Multi-Master Manager for MySQL – FOSDEM 2010

The next presentation by Piotr Biel from Percona was on Multi-Master Manager for MySQL.

The introduction included a discussion of the popular MySQL HA solutions including:

  • MySQL Master-slave replication with failover
  • MMM managed bi-directional replication
  • Heartbeat/SAN
  • Heartbeat/DRBD
  • NDB Cluster

A key problem that was clarified in the talk is the discussion of Multi-Master and this IS NOT master-master. You only write to a single node. With MySQL is this critical because MySQL replication does not manage collision detection.

The MMM Cluster Elements are:

  • monitoring node
  • database nodes

And the Application Components are:

  • mon
  • agent
  • angel

MMM works with 3 layers.

  • Network Layer – uses a virtual IP address, related to servers, not a physical machine
  • Database Layer
  • Application Layer

MMM uses two roles for management with your application.

  • exclusive – also known as the writer
  • balanced – also known as the reader

There are 3 different statuses are used to indicate node state

  • proper operation
  • maintenance
  • fatal errors

The mmm_control is the tool used to manage the cluster including:

  • move roles
  • enable/disable individual nodes
  • view cluster status
  • configure failover

The Implementation challenges require the use of the following MySQL settings to minimize problems.

  • auto_increment_offset/auto_increment_increment
  • log_slave_updates
  • read_only

FOSDEM 2010 MySQL Developer Room Schedule
FOSDEM 2010 Website
Brussels, Belgium
February 7, 2010

10x Performance Improvements in MySQL – A Case Study

The slides for my presentation at FOSDEM 2010 are now available online at slideshare. In this presentation I describe a successful client implementation with the result of 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.

The end result was a page load improvement from 700+ms load time to a a consistent 60ms.

State of phpMyAdmin – FOSDEM 2010

Following the opening keynote “Dolphins, now and beyond”, Marc Delisle presented on “State of phpMyAdmin”.

phpMyAdmin is an DBA administration tool for MySQL available today in 57 different languages. This is found today in many distributions, LAMP stack products and also in cpanel. The product is found at

There are current two versions, the legacy 2.x version to support older php 3.x & 4.x, The current version 3.x is for PHP 5.2 or greater.

The current UI includes some new features including.

  • calendar input for date fields
  • meta data for mime types e.g images, which is great for showing the output as an image, otherwise blob data
  • Relational designer with the able to show and create foreign keys

The New features in 3.3 (currently in beta) include:

  • Replication support including configuring master/slave, start/stop slave.
  • Synchronization model showing structure and data differences between two servers and ability to sync.
  • New export to php array, xslx, mediawiki, new importing features including progress bar.
  • Changes tracking for changes on per instance or per table. Providing change report and export options.

FOSDEM 2010 MySQL Developer Room Schedule
FOSDEM 2010 Website
Brussels, Belgium
February 7, 2010

Be sure to know your my.cnf [sections]

The MySQL configuration file, e.g. /etc/my.cnf has a number of different section headings including [mysql], [mysqld], [mysqld_safe]. It is important that you ensure you put the right variables into the right section. For example, the following my.cnf configuration file will not operate as the user probably expects.

query_cache_size = 100M
query_cache_type = 1



In this example, this configuration does not give you a MyISAM key buffer of 600M, it’s actually the default of 8M.

mysql> show global variables like 'key_buffer_size';
| Variable_name   | Value   |
| key_buffer_size | 8388600 |

Be sure to add the right options to the [mysqld] section.

What I didn’t know until yesterday was that some programs read from multiple groups. From the 5.1.2. Server Command Options MySQL reference manual page. In helping the describe the problem for the benefit of readers I actually learned something new myself.

mysqld reads options from the [mysqld] and [server] groups. mysqld_safe reads options from the [mysqld], [server], [mysqld_safe], and [safe_mysqld] groups. mysql.server reads options from the [mysqld] and [mysql.server] groups.

I have for example always put log-error in both the [mysqld_safe] and [mysql]d sections because both of these write different errors. Seems that is unnecessary.

Speaking at MySQL UC 2010

My talk on 10x performance improvements – A case study has just been approved for the 2010 MySQL Conference. This will be my 5th straight year speaking at the MySQL conferences. For those in Europe wanting a sneak peek I am also speaking at FOSDEM 2010 in Brussels on Feb 7th where I’ll be giving an abridged version.

As an independent MySQL consultant, my work generally covers performance tuning and scalability and sometimes database architecture. Often however my work involves a review of a given problem and recommendations (immediate, short and long term). I’m rarely involved in the full implementation and generally do not see the full fruits of the proposed work.

Recently however I was able to work with a client, first in resolving critical performance issues and then in a review of the application architecture and MySQL environment, provide recommendations and also helping internal resources in the successful implementation. The result was a very successful engagement and an ideal case study on a strategy for tackling improving performance of an application using MySQL.

With an existing environment that included a MySQL master, 3 database slaves and 6 web servers this application provided the grounds of a well sized configuration and the need for greater availability and scalability.

The Call for Papers for the 2010 MySQL conference is still open. If you would like to share your experiences with MySQL submit your talk today.

O'Reilly MySQL Conference & Expo 2010

Europe conference options for MySQL Developers

For those in the US the annual MySQL UC is taking place again in April. For those in Europe we have dedicated room for MySQL and MySQL related products/variants/branches at FOSDEM 2010 being held in Brussels, Belgium on 6-7 Feb.

This conference will feature a full day of talks with a format of 20 minutes presentation and 5 minutes Q&A. More information about submissions can be found at Call for Papers for “MySQL and Friends” Developer Room at FOSDEM 2010 now open!

Other references:


Wednesday January 6th is the last date for submissions. Extension for FOSDEM MySQL

Monitoring the right MySQL slow queries

When looking at a set of SQL statements in isolation with tools such as the slow query log, processlist and tcpdump/mk-query-digest it is easy to identify queries that are slow.

It is more difficult however to identify the frequency of the query, and whether the slow query is actually acceptable in your overall system design.

I very simple technique to help you is to comment your queries. For example:

SELECT /* 10m cache */ ...;

When I’m working for a client, this process enables me to realize the purpose of the query more quickly and to help prioritize which queries I need to improve first. This does not mean I simply ignore these less frequent queries, however my choices for reviewing and indexing queries on given tables is adjusted generally for OLTP queries first, and batch second.

As with all practices it is important to implement across your full code base.

What do MySQL staff think of the acquisition?

It finally dawned on me while reflecting on the year past this Sunday that the missing voice since the announcement of the Oracle acquisition of Sun Microsystems (and therefore MySQL) has been the MySQL employees.

When I worked as an employee for MySQL Inc, the acquisition by Sun Microsystems in 2008 lead to several requirements about the acquisition.

  • You were not allowed to talk about the acquisition publically.
  • You were not allowed to communicate with any Sun (i.e. the acquirer) resources.

In other words it was “business as usual” which is really an oxymoron, because business will never be exactly as it was before the announcement. The ongoing delay in pending acquisition by Oracle Corporation is really hurting everybody with getting on with doing their jobs, being happy with their work, and making a difference in open source and in the lives of all the benefit from using MySQL.

I’m sure many that have words to say are disappointed, worried or even fearful of their own future careers. Comments are always welcome via Mr Anonymous using 10 minute email.

Understanding installing MySQL rpm versions

I have a problem with an easy way to install MySQL via rpm without resorting to specifying the exact point release of MySQL. Presently my local yum repository has versions of 5.0, 5.1,5.4 and 5.5.

If I want to install MySQL Sever, I can just run:

$ sudo yum install -y MySQL-server
Setting up Install Process
Package MySQL-server-community-5.5.0-1.rhel5.x86_64 already installed and latest version
Nothing to do

The issue here is the most current version is installed. If I want to install the most current version of 5.1 for example, I have found no way to specify MySQL-server-5.1, or MySQL-server-community-5.1, I have to specify the point release MySQL-server-community-5.1.40

I suspect there is some internal aliasing that may be possible within rpm’s to support this. I’m seeking help from any rpm experts and would appreciate any feedback.

My current products include:

$ sudo yum list MySQL-server-community
Installed Packages
MySQL-server-community.x86_64      5.5.0-1.rhel5        installed
Available Packages
MySQL-server-community.x86_64      5.0.82-0.rhel5       mydb-rhel5-server-x86_64
MySQL-server-community.x86_64      5.0.82-0.rhel5       mydb-rhel5-x86_64
MySQL-server-community.x86_64      5.1.40-0.rhel5       mydb-rhel5-server-x86_64
MySQL-server-community.x86_64      5.1.40-0.rhel5       mydb-rhel5-x86_64
MySQL-server-community.x86_64      5.4.3-0.rhel5        mydb-rhel5-server-x86_64
MySQL-server-community.x86_64      5.4.3-0.rhel5        mydb-rhel5-x86_64
MySQL-server-community.x86_64      5.5.0-1.rhel5        mydb-rhel5-server-x86_64
MySQL-server-community.x86_64      5.5.0-1.rhel5        mydb-rhel5-x86_64

The Oracle EU statement on MySQL – What's missing

Many providers embedd MySQL with their commercial products including Adobe, Macfee, Nokia, Symantec and ScienceLogic just to name a few. In addition most commercial third party storage engines have for years been forced to provided very customized versions of MySQL due to limitations in the storage engine API. These situations require a license agreement necessary with the trademark holder of MySQL. The Oracle Corporation EU Statement released on December 14, 2009 has carefully worded in the statement about these OEM licenses and storage engine providers there will be no changes for 5 years.

One specific detail is missing, what happens then?

As an individual that uses, recommends, promotes and advises clients especially on various storage engine offerings, I see the downstream effect of uncertainty for these providers can kill the entrepreneurial spirit that the economy desperately needs.

I see the possible impact as two fold. If company X that pays a license agreement is forced in 5 years to now pay say 3x the fee for example and cannot do so, the supplier chain is broken. What about companies that use the product from company X. Are they now liable with a cease and desist notice, therefore jeopardizing these businesses as well.

I see a further complexity in the agreement for storage engine providers. While there is a commitment to maintain the API, basically all storage engine providers don’t use the API, they have custom extensions and custom binaries essential for operation. The agreement makes specific reference to using the provided interfaces. Will the ability to provide heavily customized versions be permitted?

I’m not an expert in open source and commercial licenses and I am not a lawyer, however in the past I admit I may have been ignorant to the specifics that can affect the livelihood of many including myself.

As with any problem, I seek to have or find a solution. I am unclear here what options do exist, and would appreciate feedback.

Due to the anti competitive (*) nature of this resulting acquisition (#), should there be provisions to ensure unrestricted commercial licensing will always be available in the future regardless of time frame. Should limits on the amount of increase or change in license costs to a reasonable amount be enforced to protect companies.

It is most unclear about what will happen and this uncertainly I’m sure is affecting commercial interests.

Until this statement was released, the greatest fear in the MySQL community has been uncertainly. This statement does mention some details but the specifics of commercial business will never be known and therefore still affects us.

(*) Anti competitive nature. Contrary to statements by others, MySQL and Oracle are competitors in the relational database space. While generally they service different spectrums, there is sufficient overlap. As an expert and speaker in Oracle to MySQL migrations I clearly see the competition with these products. For example, while I will never purchase a Ferrari over a Holden for example due to cost and desirable but unnecessary features, they are still both cars and are competitors in their respective industry, the same applies to Oracle and MySQL.

(#) It is important to realize that Oracle is not acquiring MySQL directly. Oracle is acquiring Sun Microsystems which includes Sun hardware platform, the Solaris Operating System and Java which I’m sure are the primary motivations. MySQL is included as it is now owned by Sun Microsystems, and this is the current issue.

Couldn't load plugin named 'innodb'

As part of reviewing storage engines for my work on the upcoming Expert PHP and MySQL book, I finally had an excuse to try out the InnoDB Plugin for MySQL which is now conveniently included with MySQL 5.1 since 5.1.38.

Following the MySQL 5.1 Reference Manual instructions at 13.6. The InnoDB Storage Engine I included the bare minimum as documented to my my.cnf.

ignore-builtin-innodb ; ; ; ; ; ;

However to my misfortune the following error occured.

091212 17:45:14 [ERROR] Can't open shared library '/home/rbradfor/mysql/mysql-5.1.41-linux-x86_64-glibc23/lib/plugin/ ' (errno: 22 cannot open shared object file: No such file or directory)
091212 17:45:14 [ERROR] Couldn't load plugin named 'innodb' with soname ' '.

I double checked the files were indeed included in the binary release before undertaking a google search, to reveal the official Innodb Plugin Manual page but still not much to go on with the information ” Can’t open shared library library_name — Diagnose the cause from the following message details.”

Back to re-reading the error message, as referenced, I observed there is a trailing space ‘ ‘ ater the plugin name. Observing that I had indeed had this the my.cnf, the result of joining the lines from the MySQL manual which were broken down for presentation means (as noted).

Removed the space, and viola it works.

mysql> show plugins;
| Name                | Status   | Type               | Library             | License |
| binlog              | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| partition           | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| ARCHIVE             | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| BLACKHOLE           | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| CSV                 | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| FEDERATED           | DISABLED | STORAGE ENGINE     | NULL                | GPL     |
| MEMORY              | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| MyISAM              | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| MRG_MYISAM          | ACTIVE   | STORAGE ENGINE     | NULL                | GPL     |
| InnoDB              | ACTIVE   | STORAGE ENGINE     | | GPL     |

I would question that this level of pedantic checking is unnecessary. A simple trim on semi-colon ‘;’ separated values could be an easy fix.

Is your database schema in sync?

If you have more then a single MySQL database for your production environment, e.g. a development and test environment, or a MySQL replication topology, ensuring your schema’s are in sync can be task that requires some time if not managed correctly.

There is a tool I do use for MySQL environments called Schema Sync – a MySQL Schema Versioning and Migration Utility. There are many reasons why schema’s get out of sync. Developers may not ensure their changes are reflected in any software to be deployed, and when not tested you could end up with broken functionality. A DBA might try some different index strategies on a slave, but not the master, and never implement or revert.

While some people want the quick and hidden just sync version akin to Rails, I really like this product as it produces proper patch and revert SQL scripts. As a DBA I really want to know what’s going to be applied to my schema, I want to be able to review it before it’s just magically applied. The revert process is also ideal, in time critical situations when you may need to rollback a change, either it’s taking to long to deploy.

This is definitely a tool to use over time, especially if you have multiple developers, DBA’s and more then one MySQL environment that should be the same.

Monitoring MySQL with MONyog

It just works. In absence of any MySQL monitoring for your site, I have found no solution that gets you operational as quickly and easily. MONyog can be deployed in 60 seconds, and configured in another 60 seconds. Within 5 minutes you can have visual monitoring of your MySQL environment.

MONyog is an agentless process, which is an advantage for easy install, but does not provide for monitoring redundancy in the capture of information due to agentless nature. It’s a static standalone executable which is great if you need something to work out of the box. You can easily configure multiple servers in a replication topology, or different servers in your environment. You get the ability to monitor all the usual information, with a dashboard and detailed graphs. While MONyog does provide customizations of rules for the graphs and presentation order, that’s about it. You can’t at this time for example change the colors, what’s on graphs except for what MONyog monitors or the security of certain options in the GUI to different users, however I hope they offer these suggestions in future releases.

MONyog includes some nice features that are overlooked in other products. You have the ability to monitor the MySQL error log (if configured appropriately) which is a common complaint of end users. You can also see the process list, and when configured you can also perform query gathering and analysis.

MONyog is a well priced commercial product with a free trial download without registration requirements which gives no barrier to access and evaluate. As a solution and ease of deployment, there is no excuse not to evaluate this product. If you have no monitoring, you can now quickly and easily. I find a number of clients that simply have no monitoring. There really is no excuse as it’s critical information you need to have and record for a successful business.

You can get it from

MySQL Permissions – Restarting MySQL

I am working with a client that is using managed hosting on dedicated servers. This has presented new challenges in obtaining the right permissions to undertake MySQL tasks but not have either ‘root’ or ‘mysql’ access and not have to involve a third party everytime.

Adding the following to the /etc/sudoers file enabled the ability to restart MySQL.

User_Alias	DBA = rbradfor, user2, etc
Host_Alias 	DB_SERVERS =,, etc
Cmnd_Alias	MYSQL = /etc/init.d/mysqld, /usr/sbin/tcpdump


As you can see I also got tcpdump, which I find valuable to monitor via mk-query-digest.

Next, permissions for log files.

Monitoring MySQL resource limits

I have for the first time seen a client implement MySQL Resource Limits. I got the following error tying to connect to the database.

$ mysql -udba -p
ERROR 1226 (42000): User 'dba' has exceeded the 'max_user_connections' resource (current value: 10)

I see from the documentation the ability to see the limits in the mysql.user table. I see this is included in the SHOW GRANTS output.

SHOW GRANTS for 'dba'@'%';
| Grants for [email protected]%                                                                                                                         |
1 row in set (0.01 sec)

The documentation includes details that you can flush the resource limits, however I have found no way to monitor the current usage.

I welcome any feedback from the MySQL Community.

Take a look at mk-query-digest

Q: What SQL is running on your MySQL database server now?
A: The bane of pain for MySQL DBA’s when there is no official MySQL instrumentation that is dynamic and fine grained sufficiently to solve this problem at the SQL interface.

While hybrid solutions exist, the lack of dynamic and real-time are the issues. There is however great work being done by Baron and others on Maatkit mk-query-digest and packet sniffing the MySQL TCP packets.

$ sudo tcpdump -i eth0 port 3306 -s 65535  -x -n -q -tttt | ./mk-query-digest --type tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
# Caught SIGINT.
5444 packets captured
8254 packets received by filter
2809 packets dropped by kernel
# 2.1s user time, 40ms system time, 22.23M rss, 57.60M vsz
# Overall: 1.58k total, 38 unique, 262.67 QPS, 2.34x concurrency _________
#                    total     min     max     avg     95%  stddev  median
# Exec time            14s    41us      2s     9ms    23ms    72ms   236us
# Time range        2009-10-05 08:17:44.377926 to 2009-10-05 08:17:50.052460
# bytes            271.31k       8   8.79k  176.28  621.67  412.51   44.60
# Rows affe            248       0       3    0.16    0.99    0.38       0
# Warning c              0       0       0       0       0       0       0
#   3% (58)   No_index_used

# Query 1: 118.67 QPS, 1.23x concurrency, ID 0x16219655761820A2 at byte 2167682
#              pct   total     min     max     avg     95%  stddev  median
# Count         45     712
# Exec time     52      7s    41us      1s    10ms    23ms    80ms   138us
# Hosts                 11 10.251.199... (132), 10.251.103... (129)... 9 more
# Time range 2009-10-05 08:17:44.377926 to 2009-10-05 08:17:50.051177
# bytes          2   6.43k       8      17    9.25   16.81    3.15    7.70
# Errors                 1    none
# Rows affe      1       4       0       1    0.01       0    0.07       0
# Warning c      0       0       0       0       0       0       0       0
#   0% (1)    No_index_used
# Query_time distribution
#   1us
#  10us  ################################################
# 100us  ################################################################
#   1ms  #####
#  10ms  #############
# 100ms  #
#    1s  #
#  10s+
select 1G


# Rank Query ID           Response time    Calls   R/Call     Item
# ==== ================== ================ ======= ========== ====
#    1 0x16219655761820A2     7.3861 54.9%     712   0.010374 SELECT
#    2 0x930DE584EC815E11     1.6664 12.4%      35   0.047611 SELECT X
#    3 0x68B1E4E47977667A     1.4265 10.6%      71   0.020092 SELECT Y Z

In this real-time example, the SELECT 1 a Connector/J keep alive in version 3.1.4 using iBATIS is the major SQL statement used. (yes MM I know about /* ping*/, have suggested to client). I was however with additional sample times able to identify a new query and confirm a full table scan by lack of good index. Monitoring had highlighted an increase in SQL statements and table scans, but you need tools such as this to identify the problem SQL in a well tuned system.

There is a lot of information to digest with this output, to confirm and determine the relative benefit of each number, the histogram etc, but the identification of SQL in real-time and the good work of overall summaries and comments for EXPLAIN and SHOW CREATE TABLE for example shows this tool has been designed by MySQL DBA’s for MySQL DBA’s.

Sheeri just wrote about Dynamic General and Slow Query Log Before MySQL 5.1 which apart from the File I/O overhead is an idea I’d not considered before. What may be a good idea, is to pass this information into a named pipe and then let another process do whatever. Drizzle solves this problem with query logging information being able to be shipped off to Gearman.

Looking just at the data

There are many areas you need to review when addressing MySQL performance such as current database load, executed SQL statements, connections, configuration parameters, memory usage, disk to memory ratio, hardware performance & bottlenecks just to name a few.

If you were to just look at the data that is held in the database, what would you consider?
Here are my tips, when looking just at the data.

  1. What is the current database size?
  2. What is the growth of data over time, say daily, weekly?
  3. Which are the 2 largest tables now?
  4. What 2 tables are growing the fastest?
  5. What tables have greatest churn, specifically DELETE’s?
  6. How often do you optimize your tables?
  7. What is your archiving/purging strategy? Do you even have one?
  8. Review data types? I average 25% reduction in footprints, just by choosing optimal data types, generally with zero code changes.
  9. What further data simplification can occur to reduce size, eg. INT for IP’s, enums, removing repeating text etc?
  10. What normalization of data can occur?
  11. What storage engines are in use?
  12. What data is write once data?
  13. Can data be stored in other forms, e.g. outside a relational database?

Even without looking at the SQL statements or the MySQL configuration you can generally deduce a lot of information about the application by just looking at the data.

What alert monitoring do you use?

More importantly, how often to you confirm access to your server and database with that alert monitoring?

With a client yesterday the primary database server while still usable and serving connections for a while, but was not accessible via SSH to investigate performance issues. It eventually became non responsive and required a physical reboot. With alert monitoring for system availability only recorded every 5 minutes this was simply too long a delay.

This lead to a discussion with more questions then answers including.

  • How often should you ping your server(s), both internally and externally?
  • How often do you connect physically to your server for confirmation, e.g. a ssh keyed authentication test?
  • How often do you perform a physical database connection test?
  • How often do you do an end to end test, including http request to database query test?

As with all of these, you also want to time these operations for any deviations.

I’ve created a very simple MySQL Alert Monitoring survey. I would appreciate your input.