10x Performance Improvements in MySQL – A Case Study

The slides for my presentation at FOSDEM 2010 are now available online at slideshare. In this presentation I describe a successful client implementation with the result of 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.

The end result was a page load improvement from 700+ms load time to a a consistent 60ms.

Calculating your database size

I generally use the following MySQL INFORMATION_SCHEMA (I_S) query to Calculate Your MySQL Database Size. This query and most others that access the MySQL INFORMATION_SCHEMA can be very slow to execute because they are not real tables and are not governed by physical data, memory buffers and indexes for example but rather internal MySQL data structures.

Mark Leith indicates in his post on innodb_stats_on_metadata that Innodb performs 8 random(ish) dives in to the index, when anybody accesses any of SHOW TABLE STATUS, SHOW INDEX, INFORMATION_SCHEMA.TABLES,INFORMATION_SCHEMA.STATISTICS for InnoDB tables. This can have an effect on performance, especially with a large number of Innodb tables, and a poor ratio of innodb_buffer_pool_size to disk data+index footprint.

What is even more incredible is when the result of this apparently harmless query causes the mysqld process to actual crash with a core dump due to these random index dives. The following core dump analysis highlights my query as the cause of the problem. This has happened now at least twice in for recent core crashes on a production environment.

(gdb) bt
#0 0x000000327280b6b2 in pthread_kill () from ./lib64/libpthread.so.0
#1 0x000000000055b136 in handle_segfault ()
#3 0x00000000007e1c21 in rec_get_offsets_func ()
#4 0x0000000000766007 in btr_estimate_number_of_different_key_vals ()
#5 0x000000000070d4c2 in dict_update_statistics_low ()
#6 0x000000000061fa84 in ha_innobase::info ()
#7 0x0000000000636972 in fill_schema_charsets ()
#8 0x0000000000639a66 in get_all_tables ()
#9 0x0000000000634633 in get_schema_tables_result ()
#10 0x00000000005bde37 in JOIN::exec ()
#11 0x00000000005bf7a7 in mysql_select ()
#12 0x00000000005c0127 in handle_select ()
#13 0x000000000056fcf0 in mysql_execute_command ()
#14 0x0000000000574c83 in mysql_parse ()
#15 0x00000000005751a0 in dispatch_command ()
#16 0x0000000000576483 in do_command ()
#17 0x0000000000577002 in handle_one_connection ()
#18 0x0000003272806367 in start_thread () from ./lib64/libpthread.so.0
#19 0x0000003271cd30ad in clone () from ./lib64/libc.so.6
Cannot access memory at address 0x3271cd3040

This is an information_schema query that caused innodb to open a table.
This is totally normal. On first open, innodb tables get automatically  analyzed.
This analyze process crashed in innodb.

This exact query *provoked* a crash:

(gdb) x/1s 0x00002aaabc961dd0
0x2aaabc961dd0: "SELECT table_schema,table_name,engine,row_format,
table_rows, avg_row_length,
(data_length+index_length)/1024/1024 as total_mb,
(data_length)/1024/1024 as data_mb,
(index_length)/1024/1024 as index_mb,
CURDATE() AS today
FROM information_schema.tables
WHERE table_schema=@schema

The issue however is which table is the problem? How widespread is the corruption. Would an ALTER TABLE ENGINE=Innodb rebuild the table and eliminate the problem. Would an ANALYZE on an Innodb table identify the problem? (I doubt this second point). The problem however is even more significant due to the actual system. The largest single table of this 1TB database is 500GB. The impact of performing the ALTER, the time to undertake this blocking operation, the increase in the Innodb data file that can’t be reclaimed are just two factors that the inexperienced may fall victim of.

A saying I use is “Disaster is inevitable”. In this situation the disaster appears to not be significant but the ramifications due to the lack of appropriate and expert architectural design considerations to correct the problem are.

Is your environment capable of supporting this maintenance requirement? If not, then is the decision maker in your organization worried enough to seek the expert advice to address pro actively or will it be too late.

Understanding Innodb Transaction Isolation

The MySQL Innodb storage engine provides ACID compliance, the ‘I’ being isolation. There are four states defined in MySQL with the tx_isolation system variable, READ-UNCOMMITTED, READ-COMMITTED, REPEATABLE-READ and SERIALIZABLE.

Generally MySQL installations do not modify the default value of tx_isolation = REPEATABLE-READ, however I have seen with a number of clients the default value has been changed to READ-COMMITTED. When I attempt to identify the reason why, I have always received the same reason. Oracle uses a default transaction isolation of READ-COMMITTED. See how Oracle Manages Data Concurrency and Consistency for more information.

However, while the literal string is the same, the actual implementation of READ-COMMITTED in Oracle more closely represents the REPEATABLE-READ in MySQL.

The following demonstrates what you can expect to see between the operation of these two modes.

Session 1 Session 2
DROP TABLE IF EXISTS transaction_test;
CREATE TABLE transaction_test(
  val  VARCHAR(20) NOT NULL,

INSERT INTO transaction_test(val) VALUES ('a'),('b'),('c');
SELECT @@global.tx_isolation, @@session.tx_isolation;
| @@global.tx_isolation | @@session.tx_isolation |
SELECT * FROM transaction_test;
| id | val | created             |
|  1 | a   | 2009-09-21 00:19:43 |
|  2 | b   | 2009-09-21 00:19:43 |
|  3 | c   | 2009-09-21 00:19:43 |
INSERT INTO transaction_test(val) VALUES ('x'),('y'),('z');
SELECT * FROM transaction_test;
| id | val             | created             |
|  1 | a               | 2009-09-21 00:19:43 |
|  2 | b               | 2009-09-21 00:19:43 |
|  3 | c               | 2009-09-21 00:19:43 |
|  4 | x               | 2009-09-21 00:21:00 |
|  5 | y               | 2009-09-21 00:21:00 |
|  6 | z               | 2009-09-21 00:21:00 |
INSERT INTO transaction_test(val) VALUES (@@session.tx_isolation);
SELECT * FROM transaction_test;
| id | val             | created             |
|  1 | a               | 2009-09-21 00:19:43 |
|  2 | b               | 2009-09-21 00:19:43 |
|  3 | c               | 2009-09-21 00:19:43 |
|  7 | REPEATABLE-READ | 2009-09-21 00:21:01 |

TRUNCATE TABLE transaction_test;
INSERT INTO transaction_test(val) VALUES ('a'),('b'),('c');
SELECT @@global.tx_isolation, @@session.tx_isolation;
| @@global.tx_isolation | @@session.tx_isolation |
SELECT * FROM transaction_test;
| id | val | created             |
|  1 | a   | 2009-09-23 22:49:44 |
|  2 | b   | 2009-09-23 22:49:44 |
|  3 | c   | 2009-09-23 22:49:44 |
INSERT INTO transaction_test(val) VALUES ('x'),('y'),('z');
SELECT * FROM transaction_test;
| id | val | created             |
|  1 | a   | 2009-09-23 22:49:44 |
|  2 | b   | 2009-09-23 22:49:44 |
|  3 | c   | 2009-09-23 22:49:44 |
|  4 | x   | 2009-09-23 22:52:38 |
|  5 | y   | 2009-09-23 22:52:38 |
|  6 | z   | 2009-09-23 22:52:38 |
INSERT INTO transaction_test(val) VALUES (@@session.tx_isolation);
SELECT * FROM transaction_test;
| id | val            | created             |
|  1 | a              | 2009-09-23 22:49:44 |
|  2 | b              | 2009-09-23 22:49:44 |
|  3 | c              | 2009-09-23 22:49:44 |
|  4 | x              | 2009-09-23 22:52:38 |
|  5 | y              | 2009-09-23 22:52:38 |
|  6 | z              | 2009-09-23 22:52:38 |
|  7 | READ-COMMITTED | 2009-09-23 22:56:10 |


As you can see, under READ-COMMITTED your result set can change during the transaction. However, how practical is this example in an actual application.

In what circumstances would you consider using READ-COMMITTED? Is there an improvement in locking contention that can lead to less deadlock contention? What is the overhead in other areas?

Harrison writes in My Favorite New Feature of MySQL 5.1: Less InnoDB Locking that best locking out of InnoDB in 5.1 will be with READ-COMMITTED. Note that as mentioned, the impact is a change in replication mode that may have a more dramatic effect.

Heikki Tuuri comments in Understanding InnoDB MVCC that using READ-COMMITTED should help in a specific locking issue.

I am still unclear of the specific benefits in general terms for all environments. Review of the The InnoDB Transaction Model and Locking and specifically Consistent Nonlocking Reads provides “With READ COMMITTED isolation level, each consistent read within a transaction sets and reads its own fresh snapshot” which indicates that for certain workloads the reduced locking is a benefit.

Every environment is different and ultimately the actual transaction statements will determine what options and benefits work best.

Setting up sysbench with MySQL & Drizzle

Sysbench is a open source product that enables you to perform various system benchmarks including databases. Drizzles performs regression testing of every trunk revision with a branched version of sysbench within Drizzle Automation.

A pending branch https://code.launchpad.net/~elambert/sysbench/trunk_drizzle_merge by Eric Lambert now enables side by side testing with MySQL and Drizzle. On a system running MySQL and Drizzle I was able install this sysbench branch with the following commands.

cd bzr
bzr branch lp:~elambert/sysbench/trunk_drizzle_merge
cd trunk_drizzle_merge/
sudo make install

Running the default lua tests supplied required me to ensure drizzle was in my path and that I created the ‘sbtest’ schema. I’ll be sure it add that checking to my future developed benchmark scripts.

$ cd sysbench/tests/db
$ sysbench --test=insert.lua --db_driver=drizzle prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

FATAL: unable to connect to Drizzle server: 23
FATAL: error 0: Unknown database 'sbtest'
FATAL: failed to execute function `prepare': insert.lua:7: Failed to connect to the database
$ drizzle -e "create schema sbtest"
$ sysbench --test=insert.lua --db_driver=drizzle prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Creating table 'sbtest'...

And running produces the following results.

$ sysbench --num-threads=1 --test=insert.lua --db_driver=drizzle run
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Threads started!

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           10000
        other:                           0
        total:                           10000
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 10000  (879.68 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          11.3678s
    total number of events:              10000
    total time taken by event execution: 11.3354s
    per-request statistics:
         min:                                  0.32ms
         avg:                                  1.13ms
         max:                                 68.74ms
         approx.  95 percentile:               2.41ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   11.3354/0.0

Rerunning the prepare also lacked some auto cleanup to allow for automated re-running.

$ sysbench --test=insert.lua --db_driver=drizzle prepare
Creating table 'sbtest'...
ALERT: Drizzle Query Failed: 1050:Table 'sbtest' already exists
FATAL: failed to execute function `prepare': insert.lua:57: Database query failed


$ sysbench --test=insert.lua --db_driver=mysql --mysql_table_engine=innodb prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Creating table 'sbtest'...

Unfortunately this doesn’t actually create the table in the right storage engine, I had to hack the code to ensure I was comparing InnoDB in each test.

$ sysbench --num-threads=1 --test=insert.l
ua --db_driver=mysql run
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Threads started!

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           10000
        other:                           0
        total:                           10000
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 10000  (897.67 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          11.1399s
    total number of events:              10000
    total time taken by event execution: 11.1084s
    per-request statistics:
         min:                                  0.27ms
         avg:                                  1.11ms
         max:                                252.63ms
         approx.  95 percentile:               2.48ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   11.1084/0.00

Armed with a working environment I can now write some more realistic production like tests in Lua.

What to do at 3:25am

Look at MySQL bug reports of course? Well actually I’m writing multiple blog posts, and I was confirming additional reference sources and links when I came across MySQL Bug #29847 – Large CPU usage of InnoDB crash recovery with a big buf pool.

Taking the time to actually read the information exchange I stumble upon.

[8 Jun 23:29] liz drachnik

Hello Heikki -

In order for us to continue the process of reviewing your contribution to MySQL - We need
you to review and sign the Sun|MySQL contributor agreement (the "SCA")

The process is explained here:


Getting a signed/approved SCA on file will help us facilitate your contribution-- this
one, and others in the future.

Thank you !

Liz Drachnik  - Program Manager - MySQL

Oops. Well it made me laugh out loud for so many reasons. First your talking to the creator of InnoDB, a part of MySQL for at least 5+ years. Second, there is clearly an agreement between Oracle and MySQL already for the incorporation of InnoDB in the current builds, but mostly because at this late stage of probably Oracle acquisition (which side note has seemed too quite for too long) it seems rather a mute point to be chasing up paperwork.

Fortunately sanity obviously prevailed, unfortunately the public record persists. Still, humor is always good.

[9 Jun 18:04] Liz Drachnik

Please disregard the previous request for an SCA.
thank you

Understanding the various MySQL Products & Variants

The MySQL marketplace today is far more complex then simply choosing between a particular version of MySQL that Sun/MySQL produces.
The MySQL server product in general is released under the GNU General Public License (GPL) v2, however you should carefully review the MySQL Legal Policies as a number of exceptions and different license agreements operate for companion tools such as MySQL Cluster, MySQL client libraries and documentation for example.

Looking into the MySQL ecosystem for products, I’ve produced the following categories:

  • Sun/MySQL Official Products
    • MySQL Versions
  • MySQL Variants
    • Community
    • Enterprise
  • MySQL Plugins
  • MySQL Patches
  • MySQL Alternatives

Why does such a diversification occur?  I attribute this to three primary causes:

  • The GPL license by nature allows for an organization to take the product, modify it and use it for their specific needs. They can also provide these patches under GPL for others to use and incorporate. While this has occurred for example  Google , FaceBook, eBay , Proven Scaling and Percona to name a few, Sun/MySQL has elected not to undertake any proactive process of incorporating these in any timely fashion.
  • The policy of Sun/MySQL to allow for contributions was so strict, and combined with a properietory Version Control System BitKeeper you had to purchase, there was little incentive for community contributions in relation to so many other open source projects
  • The Sun/MySQL management and decision makers didn’t listen to the community and paying customers, and over the past 3-5 years the product life cycle, features, release schedule and quality can be questioned.

Sun/MySQL Official Products

Sun/MySQL holds the license to the MySQL server products. They release official binaries and the source code (due to GPL).  Even within MySQL, there are several products that differ subtly and to the untrained eye it can be confusing to understand and determine what is best. Your can download from www.mysql.com the following versions:

  • MySQL Server 5.1  GA
  • MySQL Community Server 5.0 GA
  • MySQL Enterprise Server 5.0 GA
  • MySQL Cluster NDB 6.3

  • MySQL Server 4.1 (EOL)
  • MYSQL Server 6.0 (Alpha)

MySQL Versions

It is important that you understand the MySQL Versions, especially in evaluating any of the following referenced variants, patches etc.
The common path for MySQL Server versions is with a generally linear numbering systems including historical versions 3.23, 4.0 and 4.1.  These versions have now reached End Of Life (EOL) for support, however emergency security patches are applied where necessary.
Continuing from 4.1, you have the 5.0, 5.1 versions which are both Generally Available (GA), and then version 6.0 which is currently Alpha.

Further complexity happens when within the Sun/MySQL Official products, several forks/branches have occurred.  These include:

  • The MySQL 5.0 Community & Enterprise split occurred at MySQL Version 5.0.27
  • At this time, the Community version (free to download) continued with the intention of allowing for community contributions. Only one patch was ever accepted, and SHOW PROFILES was introduced in MySQL 5.0.37.  To date, 11 versions have been released to the current 5.0.77 version.
  • MySQL Enterprise (available under subscription) is itself comprised of three subtypes, these are Rapid Update Service Packs(monthly), Quarterly Service Packs (quarterly) and Hot-fix releases.  To date 37 versions have been released to the current 5.0.78 version.
  • MySQL Cluster, was part of the MySQL Server product until this was branched/forked at  MySQL Version 5.1.23. This enabled MySQL Cluster to be labeled as Production Ready for Cluster clients, and not be held back by continued delays in the 5.1 server release.   Starting with a new versioning scheme with 6.1, the MySQL Cluster NDB produces new versions far exceeding the volume of the server, with to date 23 versions in 6.1 , 18 in version 6.2 and 24 in version 6.3.      I am not advocating that features and quality are better or worse, simply that activity and interaction with community and users is far greater.
  • MySQL 5.1 Maria is a special branch starting at MySQL Version 5.1.24 that includes the Maria Storage Engine. This is the next generation of the MyISAM Storage Engine, both architected by the creator of MySQL, Monty Widenius.  It is undercertain this will continue as a product released officially by Sun/MySQL.

In Review

With just reading this introduction you can understand the confusion that exists when new customers/clients are beginning to evaluate the different MySQL Versions.

In my next post, I’ll talk more about:

  • MySQL Variants, those I consider variants use the MySQL Interface, protocol and support the standard connectors.  These include community versions (e.g. Solid, Infobright, Sphinx) and commercial versions (e.g. KickFire, InfoBright, Nitro).  
  • MySQL patches are improvements that have been released to the community and are now becoming part of common third party MySQL packages, such as Percona, Proven Scaling and Out Delta
  • MySQL Plugins are a feature of MySQL 5.1, and allow for pluggable storage engines into MySQL.  While several companies have had to produce custom binaries due to the API limitions (especially with the optimizer), a number of engines support the API including Innodb, PBXT and filesystem engine.
  • MySQL Alternatives include any MySQL related products that have now deviated from being supported under the MySQL protocol.  Most notably here is Drizzle.

More Information

Ronald Bradford is Principal at 42SQL. We provide consulting and advisory services for the MySQL ecosystem with a focus on MySQL database performance, architecture and scalability. 42SQL also provides education in MySQL including the “MySQL Essentials” training course. You can find more information regarding this offering and an upcoming schedule at 42SQL Education.

A few words from Jonathan Schwartz

Following Marten Mickos, the second opening keynote at the 2008 MySQL Conference and Expo was by Jonathan Schwartz CEO and President of Sun Microsystems. Blog

His opening joke was about dinner with Marten, to which Marten said “You not going to get a keynote, unless you buy the company.”

So what was striking for me in his presentation “What is Sun’s Agenda?

  • There is no open-source phone yet, but that’s an industry that needs disrupting.
  • Like the need for water or electricity, The Network Has Become A Social Utility.
  • We want to work with the community, create greater innovation.
  • The future, the price tag of Free, the philosophy of Freedom

I had a chance to meet Jonathan and Rich Green on Sunday night, and it was great to see Jonathan learning about, and getting behind the product PBXT – The Community Engine a MySQL 5.1 open source engine.

The pursuit of a synchronous world

Well at least your MySQL database world.

As Paul eluded to, PrimeBase Technologies has a project to provide synchronous replication for MySQL in a High Availability environment. It is more then an idea, there is a plan.

Is it possible?
What are the use cases?
How can you use it?
Would you use it?

Some input to date. We need these questions and more, and we seeking more input for discussion.

Unfortunately the opportunity to hear any input during a presentation is left to the last day of the conference, so there has been created a BoF session on Tuesday night for a round table discussion if necessary. People are encouraged to bring specific cases and situations for feedback, the reasons why MySQL Replication, MySQL Cluster, DRBD/HeatBeat or any other solution does not satisfy your needs, and what would.

If you can’t wait you may need to seek out Booth #518, and make a time to seek out the Technology Expert.

Just today, Peter Zaitsev of MySQL Performance Blog also writes in State of MySQL Market and will Replication live ?, “Customers are constantly asking me if there is something which would help them to scale MySQL and get some HA out of the box even on the medium level. Seriously – MySQL Cluster, Continuent, Master-Master Replication, DRBD or SAN based HA architectures all have their limits which makes neither of them used for very wide class of applications.” to confirm this pursuit.

Storage Engines at the MySQL Conference

I’ll be following closely the progression of Storage Engines available in the MySQL Database server, well soon to be available when 5.1 gets to GA (hopefully by end of Q2 which is what we have been told). Tick, Tick, time is running out.

PrimeBase XT (PBXT) and Blob Streaming is a focus as PrimeBase Technologies, the company which I want to note for people is an Open Source company, committed at providing an open source alternative to the other commercial players. You also have at the MySQL Conference talks on the the existing InnoDB from Innobase (a subsidiary of market RDBMS leader Oracle). There is a Nitro presentation, an Infobright presentation, no Solid presentation surprisingly (the IBM news happening after submissions closed). We also have from MySQL, presentations on the internally developed storage engines Falcon and Maria, both products that won’t even be in 5.1 but 6.0, however Maria is presently a different branch of 5.1 so I don’t know how that works. Will it be in 5.1?

But what I want to seek is more news of KickFire, a Diamond Sponsor, an engine with embedded H/W, something that’s been obviously worked on in reasonable stealth. For me it’s not just interesting, it’s a competitor in our technology space, so I’ve been researching Joseph Chamdani and some of his patents.

Plenty of news in the past few weeks on Kickfire including Kickfire Update by Keith Murphy on April 3, Kickfire: stream-processing SQL queries by Baron Schwartz on April 4, Kickfire looking to push MySQL limits by Farhan Mashraqi on April 4, and Kickfire Kickfire Kickfire by Peter Zaitsev on April 4, and myself back on March 23.

So what can I make from the lack of company information and posted information to date.

  • Hardware based acceleration.
  • No Solid State Drive (SSD) Technology, at least not yet but C2App mentions SSD.
  • Data Warehousing, lending to thinking it’s not a transactional storage engine
  • A new storage engine and a new approach to data storage. I find this surprising, as it takes years to develop a feature complete storage engine, and most new 5.1 storage engines are indeed existing products, take Nitro, Solid, Infobright and Falcon. Only PBXT has been written from the ground up for MySQL 5.1, so looking to know more about it’s development
  • Expensive, it’s dedicated H/W + (assuming) MySQL Enterprise + Storage Engine

Come and get a t-shirt at UC2008

Here I am at my desk sporting the PrimeBase supporters t-shirt that will be available at the exhibitors booth at the 2008 MySQL Conference. The front is rather uneventful with the official logo, but the back will be worth the experience. So everybody interested in supporting PBXT as the transactional storage engine for MySQL developed by the community and for the community, please come and see us and mention the secret password.

We have been placed way back in the right hand side of the exhibitors hall at booth 518, in front of the Open Source and OEM providers.

A day with Paul McCullagh – Architect of PBXT and BlobStreaming

Continuing on from my lightning visits with Jan Kneschke and Michael Zinner, today I got to spend a day with Paul McCullagh at his home in Hamburg Germany.

Paul is the architect of the PBXT Pluggable Storage Engine for MySQL 5.1, and also the Blob Streaming Storage Engine. His work was acknowledged with the MySQL Community contributor for the year in 2007. The successful PrimeBase product for the publishing Industry in Europe and North America also now uses PBXT for underlying data storage which is great to see a company use it’s own products.

Like each friend and expert this week, Paul develops using Mac, but at least uses the US English version & keyboard layout, not the German layout. As Paul states, “the German version is not productive for coding”.

Paul uses NetNewsWire Lite as his RSS feeder, as I observe some PlanetMySQL links. I’m more old school, I just go to the site every few days. I really should move from the dark ages, one advantage is the RSS feeder shows you what you have read.

However, there is a particular reason why I visited Paul, other then to spend some time with a good friend. I’m doing more investigation regarding making coding changes to MySQL, and I’m very out of the development cycle. I’ve forgotten more then I remember with C++ and in particular the tools and techniques used to develop, debug and deploy. Of course I know what I need to do, I just need to know what now is the most efficient means of doing today.

Paul uses Xcode to view and edit his MySQL code. The development environment provided by Apple is good for Java and C++. As an Eclipse person from my Java days, I was using CDT for my C++ environment. XCode is used for project management (i.e. looking at files in project) and for file editing. A standard terminal is used to do make. Xcode can do a quick compile for syntax checking and you get a list of errors and hot links to lines of code.

The next key part is debugging, and you can very easily define executables and enable you to run in an integrated debugger. A good option testing is to run mysql test with is –manual-gdb. This gives you the command you can run. What Paul does it copy the parameters, and add to an executable defined within XCode, the arguments from the –manual-gdb output.

The MySQL source tree was added as a project but only partial source directories such as (storage,sql,include) are required. You do need to make sure it does compile (but necessary for a build via Xcode.)

Additional debugging help is In xcode you can do jump to variable definition. So you can add breakpoints, and then run the defined executable that was build via command line make, and copy mysqld binary to initially deployed directory.

Regarding testing, we got to talk again about the issues with the MySQL Test Suite including the issues I encountered while working with Nitro. Hard coded storage engine syntax such as such as engine type=Innodb make testing more complicated then necessary. MySQL test requires some modifications for pluggable storage engines. You need to add a row to the plugins table via mysql_system_tables_data.sql to enable –default-storage-engine=pbxt as a command line argument. Paul did state that on using the MySQL testing suite, the Perl version is a great improvement in performance.

So now I understand the environment, I get onto my problem, and we are specifically looking at the thread sort_buffer and sql/filesort.cc

So when I ask about a few lines of specific code, Paul shakes his head, and says “This looks crazy”. After we spend some time on it, his comment is upgraded, “this doesn’t look crazy, it is crazy. When you look at the effect of the code, it appears it can be done in 3 lines, and it would make sense what it is doing.” This is worth further discussion in a future post.

One last tip learned. You need to be careful with multiple versions and open projects of the same, not to make changes 1/2 in one version and 1/2 in another version.

Just what are MySQL 9.x features?

Top marks to Jay Pipes for getting the Forge 2.0 finally out after quite some time, as well as in the midst of the MySQL Conference he is organizing.

I am worried however about some of the content, as shown in the screenshot below, the opening page lists Worklog tasks/features for versions 6.x or 7.x , that’s ok, but features in 9.x. Where is the practicality of thinking more then 2 releases ahead, and just having a future bucket. Indeed, we have 5.1 and 6.0 already frozen and not releases, so 6.x is already 3 releases out.

Tonight we were told at the NY PHP Meeting MySQL 5.1 is not due to late Q2, so that’s at least June 2008.
The MySQL 5.1 Release Notes reveals a history that I don’t find very flattering.

  • MySQL 5.1.3 (29 November 2005)
  • MySQL 5.1.9 (12 April 2006) *First reported beta via docs
  • MySQL 5.1.22 (24 September 2007: Release Candidate)

I hope that Sun will take on board this very slow release cycle of producing GA products, the last version MySQL 5.0.15 (19 October 2005: Production) 2 years and 6 months ago.

I’m even more interested then previously in the ultimate release and success of MySQL 5.1 as this is a pre-requisite of my new employer’s key product the PBXT Storage Engine for MySQL.

My passion for Open Source

I am a very strong proponent of Open Source (excluding my Macbook). Joining MySQL Inc was a wonderful achievement, being part of the team behind the most popular open source database. Leaving MySQL was not an easy decision due to the people, but the Sun transition and requirements did help. However it is no surprise I am joining another open source company – Primebase Technologies in Hamburg, Germany. My association with the MySQL Community will only be strengthened with my full work and support behind the PBXT and Blob Streaming pluggable storage engines for MySQL.

It is actually poetic that I am joing Primebase for I have the auspicious recognition while an active part of the MySQL community of introducing Paul McCullagh to MySQL some 2 years ago. Only weeks later, PBXT was MySQL news in the opening CEO keynote my Marten Mickos at the 2006 MySQL Users Conference.

I expect this opportunity will increase my contributions, in particular in the lacking areas of instrumentation and memory management. As well I will now be focusing my efforts on MySQL 5.1 and ensuring this is of the highest quality for the MySQL Community. While my open source code contributions have been small to date, my first some years ago adding in JMeter support for MySQL Stored procedures and Transactions, the feeling of seeding instead of leaching is a wonderful thing. I look forward to a greater opportunity with the team at Primebase Technologies.

Using PBXT 0.9.5

Paul has released Version 0.95 of his PBXT MySQL Transactional Storage Engine.

Here is what I did to get it operational under CentOS 4.3.

su -
useradd pbxt
su - pbxt
wget http://www.primebase.com/xt/download/mysql-4.1.16-pbxt-0.9.5.tar.gz
tar xvfz mysql-4.1.16-pbxt-0.9.5.tar.gz
cd mysql-4.1.16-pbxt-0.9.5
./configure --with-debug=full --with-pbxt-storage-engine --without-query-cache --with-mysqld-user=pbxt  --prefix=/home/pbxt/mysql
make install
cd /home/pbxt/mysql
./bin/mysqld_safe --user=pbxt --basedir=/home/pbxt/mysql &

Now, lets test and confirm PBXT is operational.

bin/mysql -uroot mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 1 to server version: 4.1.16-pbxt-0.9.5-debug

Type 'help;' or 'h' for help. Type 'c' to clear the buffer.

mysql> show engines;
| Engine     | Support | Comment                                                    |
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance     |
| HEAP       | YES     | Alias for MEMORY                                           |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables  |
| MERGE      | YES     | Collection of identical MyISAM tables                      |
| MRG_MYISAM | YES     | Alias for MERGE                                            |
| ISAM       | NO      | Obsolete storage engine, now replaced by MyISAM            |
| MRG_ISAM   | NO      | Obsolete storage engine, now replaced by MERGE             |
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys |
| INNOBASE   | YES     | Alias for INNODB                                           |
| BDB        | NO      | Supports transactions and page-level locking               |
| BERKELEYDB | NO      | Alias for BDB                                              |
| NDBCLUSTER | NO      | Clustered, fault-tolerant, memory-based tables             |
| NDB        | NO      | Alias for NDBCLUSTER                                       |
| EXAMPLE    | NO      | Example storage engine                                     |
| ARCHIVE    | NO      | Archive storage engine                                     |
| CSV        | NO      | CSV storage engine                                         |
| BLACKHOLE  | NO      | Storage engine designed to act as null storage             |
| PBXT       | YES     | Super fast transactional, row-level locking engine         |
18 rows in set (0.00 sec)

Now to take if for a spin with the MySQL Sakila Sample Database. At this time I need a modified version of the schema, as the sample database is designed for Version 5.

Download sakila.mysql4 script.
Download sakila.pbxt.sql script.

bin/mysql -uroot mysql
mysql>  source sakila.mysql4
mysql> source  sakila.pbxt.sql
# NOTE: Alter table of film_text fails as expected due to FULLTEXT index
mysql> show create table actor;
mysql> exit;

A few warnings in my sakila.mysql4 script I’ll need to review, but it looks ok from an installation step. Now back to my tests that I’m writing to use the Sakila Sample Database which I was working on, so I can really test the ACID state and performance of PBXT.

Migrating an MyISAM schema to use Referential Integrity

Here are some steps involved. Using the current MySQL defacto engine InnoDB. Of course, Falcon, PBXT and others will enable alternative engines to be used.

Convert Table Storage Engine Types

$ mysql -u[user] -p[password] [database] -e "SHOW TABLES" | grep -v "Tables_in" | sed -e "s/^/ALTER TABLE /" | sed -e "s/$/ ENGINE=InnoDB;/" > upgrade.sql
$ mysql -u[user] -p[password] [database] < upgrade.sql

NOTE: This may not work for all tables, for example those with FULLTEXT indexes will fail.

For the introduction of Referential Integrity we need to ensure the following.

  • Each Foreign Key column should have an index. Prior to 4.1 I think this was a requirement, however it's a good general practice regardless for SQL performance.
  • The datatype must match between Primary Key and Foreign Keys. The most obvious oversight is normally UNSIGNED, however you also for example have INT and INT, and not INT and BIGINT as datatypes.
  • Optional Foreign Keys must contain NULL values and not the normal practice of having a default of 0.

A Sample Foreign Key Syntax.

mysql> ALTER TABLE wp_usermeta ADD FOREIGN KEY usermeta_user_fk (user_id) REFERENCES wp_user(id);
ERROR 1005 (HY000): Can't create table './wordpress/#sql-cd9_10.frm' (errno: 150)

A closer investigation of what this error really is:

$ perror 150
MySQL error code 150: Foreign key constraint is incorrectly formed

A confirmation of table definitions.

mysql> desc wp_usermeta;
| Field      | Type                | Null | Key | Default | Extra          |
| umeta_id   | bigint(20)          |      | PRI | NULL    | auto_increment |
| user_id    | bigint(20) unsigned |      | MUL | 0       |                |
| meta_key   | varchar(255)        | YES  | MUL | NULL    |                |
| meta_value | longtext            | YES  |     | NULL    |                |
4 rows in set (0.00 sec)

 mysql> desc wp_users;
| Field               | Type                | Null | Key | Default             | Extra          |
| id                  | bigint(20) unsigned |      | PRI | NULL                | auto_increment |
| user_login          | varchar(60)         |      | UNI |                     |                |
| user_pass           | varchar(64)         |      |     |                     |                |
| user_nicename       | varchar(50)         |      |     |                     |                |
| user_email          | varchar(100)        |      |     |                     |                |
| user_url            | varchar(100)        |      |     |                     |                |
| user_registered     | datetime            |      |     | 0000-00-00 00:00:00 |                |
| user_activation_key | varchar(60)         |      |     |                     |                |
| user_status         | int(11)             |      |     | 0                   |                |
| display_name        | varchar(250)        |      |     |                     |                |
10 rows in set (0.00 sec)

A second pair of eyes (thanks Jon), showed that I actually spelt a table name wrong. Did you spot it. Of course it would have been nice if the error message actually told me this. This rather bland message Can’t create table could actually mean.

  • missing index (pre 4.1)
  • incompatible columns data types
  • Invalid Table.

I’m sure if I tried to break it I’d find more examples, but just a trap for unsuspecting people.

Now migrating an existing schema to using Referential Integrity provides some initial benefits (row level locking, misplaced key updates/deletes) but it does not provide true integrity unless your application has been written to support transactions. Chances are it hasn’t, but this is at least the first step.

PBXT – The MySQL Community Transactional Storage Engine

In having a discussion with Paul McCullagh (the creator of PBXT transactional storage engine) and Taneli Otala MySQL AB CTO after the keynote presentation at the MySQL User Conference, Taneli made the following comment (paraphrased and reproduced with permission).

“I talk about PBXT in discussions shamelessly. The development of PBXT was excellent timing in the MySQL community landscape”.

It was an excellent commendation that MySQL AB management considered so highly the contributions from the community. As mentioned in the opening keynote, MySQL with the Storage Engine API with MySQL 5.1 has great potential to expand what options are available to user of MySQL.

Good work Paul. I like many others wish you the best in your continued development and contribution.

Opening Conference Keynote

Mike writes a good summary of the MySQL Conference opening keynote State of the Dolphin: Interview with Kaj, Monty and David of MySQL.

I’d like to add just two comments.

Firstly, it was great to see community awards to Giuseppe, Roland, Marcus and Rasmus. Well done!

Second, it was a great thing to see on the slides a reference to Paul’s PBXT Transactional Storage Engine. A MySQL storage engine from the community. I’ve had a chance to meet Paul and have a number of great discussions with him. I wish him all the best, and I’m happy to contribute what I can to see his engine make it into the MySQL product in the future.

Contributing to JMeter

As part of my using JMeter for the purpose of testing a new Transactional storage engine PBXT for MySQL, I’ve been investigating the best approach for handling transactions. Read more about earlier decisions at my earlier post Testing a new MySQL Transactional Storage Engine.

I found that the JMeter JDBC Sampler only supports SELECT and UPDATE Statements, and not calls to stored procedures. This is just one approach I’m considering taking.

Well, I guess it’s time to contribute code to an Apache Project. I’ve modified code and logged bugs before for Tomcat, but this will be my first attempt of modify code and submit.

A summary of what I did (really for my own short term memory):

Now I just have to wait to see if it’s accepted. Regardless, it works for me. And that’s Open Source. FREEDOM

svn checkout http://svn.apache.org/repos/asf/jakarta/jmeter/trunk/ jmeter

$ svn diff JDBCSampler.java > JDBCSampler.java.patch
$ cat  JDBCSampler.java.patch
Index: JDBCSampler.java
--- JDBCSampler.java    (revision 388876)
+++ JDBCSampler.java    (working copy)
@@ -23,6 +23,7 @@
 import java.sql.ResultSetMetaData;
 import java.sql.SQLException;
 import java.sql.Statement;
+import java.sql.CallableStatement;

 import org.apache.avalon.excalibur.datasource.DataSourceComponent;
 import org.apache.jmeter.samplers.Entry;
@@ -45,6 +46,8 @@

        public static final String QUERY = "query";
        public static final String SELECT = "Select Statement";
+       public static final String UPDATE = "Update Statement";
+       public static final String STATEMENT = "Call Statement";

        public String query = "";

@@ -69,6 +72,7 @@
                log.debug("DataSourceComponent: " + pool);
                Connection conn = null;
                Statement stmt = null;
+               CallableStatement cs = null;

                try {

@@ -88,14 +92,19 @@
                                        Data data = getDataFromResultSet(rs);
                                } finally {
-                                       if (rs != null) {
-                                               try {
-                                                       rs.close();
-                                               } catch (SQLException exc) {
-                                                       log.warn("Error closing ResultSet", exc);
-                                               }
-                                       }
+                                       close(rs);
+                       // execute stored procedure
+                       } else if (STATEMENT.equals(getQueryType())) {
+                               try {
+                                       cs = conn.prepareCall(getQuery());
+                                       cs.execute();
+                                       String results = "Executed";
+                                       res.setResponseData(results.getBytes());
+                               } finally {
+                                       close(cs);
+                               }
+                       // Insert/Update/Delete statement
                        } else {
                                int updateCount = stmt.getUpdateCount();
@@ -112,20 +121,8 @@
                } finally {
-                       if (stmt != null) {
-                               try {
-                                       stmt.close();
-                               } catch (SQLException ex) {
-                                       log.warn("Error closing statement", ex);
-                               }
-                       }
-                       if (conn != null) {
-                               try {
-                                       conn.close();
-                               } catch (SQLException ex) {
-                                       log.warn("Error closing connection", ex);
-                               }
-                       }
+                       close(stmt);
+                       close(conn);

@@ -164,6 +161,38 @@
                return data;

+       public static void close(Connection c) {
+               try {
+                       if (c != null) c.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing Connection", e);
+               }
+       }
+       public static void close(Statement s) {
+               try {
+                       if (s != null) s.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing Statement", e);
+               }
+       }
+       public static void close(CallableStatement cs) {
+               try {
+                       if (cs != null) cs.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing CallableStatement", e);
+               }
+       }
+       public static void close(ResultSet rs) {
+               try {
+                       if (rs != null) rs.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing ResultSet", e);
+               }
+       }
        public String getQuery() {
                return query;

$ svn diff JDBCSamplerBeanInfo.java > JDBCSamplerBeanInfo.java.patch
$ cat JDBCSamplerBeanInfo.java.patch
Index: JDBCSamplerBeanInfo.java
--- JDBCSamplerBeanInfo.java    (revision 388876)
+++ JDBCSamplerBeanInfo.java    (working copy)
@@ -50,7 +50,7 @@
                p.setValue(NOT_UNDEFINED, Boolean.TRUE);
                p.setValue(DEFAULT, JDBCSampler.SELECT);
-               p.setValue(TAGS,new String[]{JDBCSampler.SELECT,"Update Statement"});
+               p.setValue(TAGS,new String[]{JDBCSampler.SELECT,JDBCSampler.UPDATE,JDBCSampler.STATEMENT});

                p = property("query");
                p.setValue(NOT_UNDEFINED, Boolean.TRUE);


Good to know somebody read my post, and responded positively. The quickest way for patches is to log a Bugzilla request. Seemed somebody already had, so it was easy for me to just to contribute to Bug #38682

Atomicity, Consistency, Isolation, and Durability = ACID

ACID is the key transaction processing feature for a RDBMS. Without this, the integrity of the database cannot be guaranteed.

In Summary.

Atomicity is an all-or-none proposition.
Consistency guarantees that a transaction never leaves your database in a half-finished state.
Isolation keeps transactions separated from each other until they’re finished.
Durability guarantees that the database will keep track of pending changes in such a way that the server can recover from an abnormal termination.

A clearer definition from the Wikipedia.

  • Atomicity refers to the ability of the DBMS to guarantee that either all of the tasks of a transaction are performed or none of them are. The transfer of funds can be completed or it can fail for a multitude of reasons, but atomicity guarantees that one account won’t be debited if the other is not credited as well.
  • Consistency refers to the database being in a legal state when the transaction begins and when it ends. This means that a transaction can’t break the rules, or integrity constraints, of the database. If an integrity constraint states that all accounts must have a positive balance, then any transaction violating this rule will be aborted.
  • Isolation refers to the ability of the application to make operations in a transaction appear isolated from all other operations. This means that no operation outside the transaction can ever see the data in an intermediate state; a bank manager can see the transferred funds on one account or the other, but never on both—even if she ran her query while the transfer was still being processed. More formally, isolation means the transaction history (or schedule) is serializable. For performance reasons, this ability is the most often relaxed constraint.
  • Durability refers to the guarantee that once the user has been notified of success, the transaction will persist, and not be undone. This means it will survive system failure, and that the database system has checked the integrity constraints and won’t need to abort the transaction. Typically, all transactions are written into a log that can be played back to recreate the system to its state right before the failure. A transaction can only be deemed committed after it is safely in the log.

Testing a new MySQL Transactional Storage Engine

As part of my A call to arms! post about a month ago, I’ve had a number of unofficial comments of support. In addition, I’ve also been approached to assist in the completion of a MySQL Transactional support engine. More information on the PBXT engine will be forthcoming soon by it’s creator.

Anyway, I’ve taken on the responsiblity of assisting in testing this new storage engine. This will also give me the excuse of being able to pursue some other ideas about the performance of differing storage engines for differing tables in business circumstances, such as MyIsam verses InnoDB in a highly OLTP environment. Part of testing will be ensure ACID conformance in varying situations and multi-concurrency use. Of course the ability to also do performance and load testing would be a obvious extension.

Considering how I’m going to benchmark is an interesting approach. I of course want to use Java, my choice of language at present. This presents a problem, in another factor towards performance, however by using Java, I’m simulating a more real world environment of a programming overhead and JDBC Connector rather then just raw performance output.

Laying out a plan would include an ability to have an existing database structure and data, be able to bulk define SQL statements and transactions, and parameterise SQL during transactions. I would need to be able to verify the state of database from the transactions, and clearly identify any invalid data. I would also need the ability for handling threads, and of course adequate reporting of my results.

As of MySQL 5.1.4, there is a supported benchmarking tool called mysqlslap in MySQL. I’ve discounted using this because I figured at this early stage, the documentation and exposure of this is of course limiting, and I’m sure I’d still need to perform other development.

Along comes JMeter. Within Java development I use JUnit quite extensively. This is key in the test-driven agile methodology approach of Extreme Programming. In discussion with this problem with a collegue on a new project, I found that JMeter was used for extensive load testing for web applications, but also performed database testing, and provides the support to integrate JUnit tests.

So yesterday I had a quick look at JMeter. The capabilities for defining, reporting and threading are quite complete. It took litterally minutes to install, configure, run an initial test and view results all in a GUI interface. A little more work gave me scripting handling of my initial tests. I’ve posted my initial investigations of JMeter – Performance Testing Software and JMeter and Ant Integration earlier.

With this behind me, I’ve just got to define the approach for more complete transactional tests, explictly confirming the results (I’m hoping to achieve this in custom JUnit tests). If I can solve this, then I can spend the most of time in the defining of adequate tests. Let’s see what the next few days work provides.

A call to arms!

With Oracle Corporation purchasing InnoBase, the company providing the InnoDB Storage Engine, and now reliable rumors of the acquisition of SleepyCat, the BDB Storage Engine, both key transactional storage engines for MySQL are effectively owned by a competitor.

While the is a strange and probably unchartered territory for both organisations, I’m personally concerned. I use InnoDB extensively, however if there was a comparable alternative within MySQL I’d consider switching out of principle. Is Oracle purchasing these organisations a bad thing? We don’t know. That’s the problem. While MySQL will undoubtly continue to provide these storage engines as part of the MySQL Database I believe a call to arms is needed.

It’s true that Oracle helped more general adoption of Linux when it announced this as it’s primary platform some years back. I’m sure with a middleware suite such as JBoss this will benefit from Oracle’s exposure, but will it benefit in it’s development? Will funding for development skew the product torwards what Oracle wants, not the community? Even things such as BerkleyDB being embedded in Open Office, and Google wanting to spend into the refinement of Open Office for possible web services options just makes you wonder whats happening. I’m on holiday’s trying to relax, and not do any reading, developing or communicating, but the one thing that makes my mind turn is, why is Oracle doing this?

My concerns are: look what happened to the acquired big commercial competitors, the PeopleSoft, Siebel, JD Edwards. Now, it’s a little difficult with the Open Source software, but I’m no licence expert. While it will continue to be available in current licencing options, I’m sure there has to be some concerns. Even something trivial like, all MySQL downloads that include these engines must first be registered on the website, and all these statistics must go to Oracle. Can they do this? I have no idea, but what if they could.

I’m sure internally in MySQL AB there are plans afoot for alternative transactional storage engines. The pluggable nature of these within MySQL makes it easy to move in this direction. I think sometimes, some functionality is kept close to heart, and you only here about it when some actual work is released.

I’m interested to guage reaction to see if a public working group should be setup to specifically tackle the issue of an independent transactional storage engine. Are there others out there that feel the same way I do. Now let me be clear, I’m not anti Oracle, infact rather pro Oracle, but I’m very anti Microsoft. If it was Microsoft buying these companies how would people react?

In my opinion this should be a bold announcement from MySQL now. In stating the development and release of a new Transactional Storage Engine this year, and then not evening mentioning InnoDB and BDB, they are downplaying the Oracle buy in, and emphasising a true Open Source Company option. In no means say they are no longer supporting InnoDB and BDB, but if the media exposure from MySQL continues to mention them, then it’s going to bleed into some reference back to Oracle.

I can’t contribute to the actual development in C++, if it was Java that would be a different matter, but as I move more away from hardcore coding there are plenty of other areas in which I could contribute.