2007 February

At a customer site recently I came across the following table definition for an Innodb table. 140 columns, 3 part primary key totally 44 bytes, 2 million rows giving over 900MB in data size. And it had 15 indexes, totaling over 2.3GB in size. Add into the mix a Windows Server 2003 OS, 640MB innodb_buffer_pool_size setting and table scans out the warzoo. This all leads to very poor performance.

In is generally considered best practice for Innodb to use the shortest primary key as possible, and there is a clear reason for this. Innodb stores with every index the full primary key. So for example if an indexed column was 4 bytes in length, in this example the index row would be 48 bytes (before overhead). Fortunately an easy solution presented itself, because of this index storage requirement, Innodb will create an internal 6 byte primary key if none exists for a table. I of course had known about this but had never tried it in a production situation. I come from old school where every table is defined with a primary key.

So a ALTER TABLE [name] DROP PRIMARY KEY results in a long wait, and a reduction in the Index size to 900MB. WOOT! Now, in closer analysis the Primary Key is the Primary Key because it’s the Unique requirement for the table. No problem I just add a Unique Key to replace the previously dropped Primary Key. A check to review the Index Size showed a size of 2.3GB. What the!

It seems if you read the fine print of the MySQL documentation for Innodb Table Structures there is an exception clause. If you do not define a PRIMARY KEY for your table, MySQL picks the first UNIQUE index that has only NOT NULL columns as the primary key and InnoDB uses it as the clustered index.

In my situation, by adding a Unique Key this was in turn converted internally to the new Primary Key. Drat! So to the Innodb developers out there. I’d like to see a way for the internally generated key to remain in this situation, or at least provide the ability for the designer to choose this capability.

The only result is to physically create an INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY column definition. This may seem like not a big thing, but the customer did not want to make any schema changes. However it was a necessity simply as the first step to get reasonable performance.

This was only the first part of the problem, solving the full table scans via creative Indexing was not possible, code changes were out of the question for an immediate fix. The real problem was the size of the innodb_buffer_pool_size. It was just too small to handle both the table data and index pages in memory. Perfmon analysis showed the machine simply when into intensive disk I/O every time any queries were run. On a Linux system with just a database and Innodb tables, it can be recommended that 70-80% of available memory can be allocated to the innodb_buffer_pool_size.

Alas this is Windows 32 bit, and their is an implied 2GB memory limit for any process, so the best one could manage in this situation was 1600MB.
The long and the short was that even with poor database design, and immediate performance improvement occurred with an optimal Primary Key and sufficient memory allocated to the correct MySQL buffers. This is only a temporary fix for a greater problem.

Ronald Bradford | Enterprise Data Architect | MySQL Subject Matter Expert | Author | Speaker

CU@UC07

Using Innodb Primary Keys wisely