Seeking public data for benchmarks

I have several side projects when time permits and one is that of benchmarking various MySQL technologies (e.g. MySQL 5.0,5.1,5.4), variants (e.g. MariaDB, Drizzle) and storage engines (e.g. Tokutek, Innodb plugin) and even other products like Tokyo Cabinet which is gaining large implementations.

You have two options with benchmarks, the brute force approach such as Sysbench, TPC, sysbench, Juice Benchmark, iibench, mysqlslap, skyload. I prefer the realistic approach however these are always on client’s private data. What is first needed is better access to public data for benchmarks. I have compiled this list to date and I am seeking additional sources for reference.

Of course, the data is only the starting point, having representative transactions and queries to execute and a framework to execute and a reporting module are also necessary. The introduction of Lua into Sysbench may now be a better option then my tool of choice mybench which I use simply because I can configure, write and deploy generally for a client in under 1 hour.

If anybody has other good references to free public data that’s easily loadable into MySQL please let me know.

Understanding your RAID Configuration

For any production MySQL Database system, running RAID is a given these days. Do you know what RAID your database is? Are you sure?. Ask for quantifiable reproducible output from your systems provider or your System Administrator.

As a consultant I don’t always know the specific tools for the clients deployed H/W, but I ask the question. On more the one occasion the actual result differed from the clients’ perspective or what they were told, and twice I’ve discovered that clients when asked if their RAID was running in a degraded mode, it actually was and they didn’t know.

You can read about various benchmarks at MySQL blogs such as BigDBAHead and MySQL Performance Blog however getting first hand experience of your actually RAID configuration, the H/W and S/W variables is critical to knowing how your technology works. You can then build on this to run your own benchmarks.

Over 50% of my clients run on DELL equipment, most using local storage or shared storage options such as Dell MD1000, Dell MD3000, NetApps or EMC. I’ve had the opportunity to spend a few days looking into the more details of RAID, specifically the DELL PERC 5/i Raid Controllers, and I’ve started a few MySQL Cheatsheets for my own reference that others may also benefit from.

Understanding PERC RAID Controllers gives an overview of using the MegaCLI tools to retrieve valuable information on the Adapter, Physical Drives, Logical Drives and the all important Battery Backed Cache.

There are several Google search results out there about finding the MegaCLI tools, I found them to be all outdated. There is of course other tools including Dell OpenManage Server Administrator (GUI and CLI) and an Open Source project called megactl.

Here is just a summary of a few lines of each that yields valuable information:

Adapter Details

$ /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL

RAID Level Supported             : RAID0, RAID1, RAID5, RAID10, RAID50
Max Stripe Size         : 128kB
Stripe Size                    : 64kB

Physical Details

$ /opt/MegaRAID/MegaCli/MegaCli64 -LDPDInfo -aall

Adapter #0
Number of Virtual Disks: 1

RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
Stripe Size: 64kB
Number Of Drives:4
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

Raw Size: 70007MB [0x88bb93a Sectors]
Inquiry Data: FUJITSU MAY2073RC       D108B363P7305KAU
Inquiry Data: FUJITSU MAY2073RC       D108B363P7305KAJ
Inquiry Data: FUJITSU MAY2073RC       D108B363P7305JSW
Inquiry Data: FUJITSU MAY2073RC       D108B363P7305KB1


$ MegaCli -AdpBbuCmd -aALL
  Fully Charged           : Yes
  Discharging             : Yes

BBU Capacity Info for Adapter: 0
Relative State of Charge: 100 %
Absolute State of charge: 88 %
Run time to empty: 65535 Min

BBU Properties for Adapter: 0

Auto Learn Period: 7776000 Sec
Next Learn time: 304978518 Sec

The big detail that was missing was the details in this ouput of the drive speed, such as 7.2K, 10K, 15K. What is the impact? Well that’s the purpose of the next step.

Following this investigation, testing of the RAID configuration with Bonnie++ was performed to determine the likely performance of various configurations, and to test RAID0, RAID1, RAID5 and RAID10.

Further testing that would be nice would include for example RAID 5 with 3 drives verses 4 drives. The speed of the drives, the performance in a degraded situation, and the performance during a disk rebuild.

This still leaves the question about how to test the performance with and without the Battery Backed Cache. You can easily disable this via CLI tools, but testing an actually database test, and pulling the power plug for example with and without would yield some interesting results. More concerning is when Dell specifically discharges the batters, and it takes like 8 hours to recharge. In your production environment you are then running in degraded mode. Disaster always happens at the worse time.

Leveraging the power of Twitter

Last week I posted the following twitter request“Can somebody loan me (or buy me) a Dell 2950 decked out so I can run and publish some benchmarks. Please!”

In a same day response I was offered access to use 2 x Dell 1950’s, and today I’m now actually using these machines for my own testing. I would like to thank cafemom (Barry, Anthony & Dan) for the loan of hardware.

And now the chance to better understand the RAID configuration of the DELL PERC Controllers, trying out some different RAID types, LVM configurations and disk tests. When I’m done with my System Administrator refresher, I’m then be trying some different MySQL Benchmarks to test various MySQL configuration settings including using the new Juice benchmark.