MySQL Conference Submissions have closed

If you didn’t get your proposal in for MySQL Conference 2008 , that’s too bad.

I often wondered from past conferences why submissions were needed so early, like 5 months before. Well, as being invited to be part of the MySQL Conference committee this year I now know why, and have a greater appreciation. With near 300 submissions, it takes time to review them, and this is just the first step in a number to get to a completed schedule for next years conference.

Also getting a sneak preview of what’s to come is really cool! I’m already excited.

NY Users Group – Analyzing MySQL Status and your SQL

This month I continued my Performance Analysis talks at the Local NY MySQL Meetup. Previous discussions can be found here.

Our focus was a more in-depth look at gathering and reviewing MySQL Status and your applications SQL statements using MySQL Proxy. Even after preparing the slides over the weekend Jan added more functionality that was particularly interesting. So today while addressing a client issues I further extended this work to do even more funky monitoring.

Today’s monitoring.lua script does:

  • Logs to file, Date/Time, Query Time, Response Time, Rows Affected, Normalized SQL and Actual SQL for each query
  • Has histogram of tables used with read/write figures.
  • Has histogram of SQL with avg and max execution times.

You get the best of both worlds, you can see SQL access live, get statistics of this, and then be able to drill down to every instance of a given SQL statement for more information.

You can get a copy of my slides Here. You can get a full tar of my demo work here. This includes MySQL Proxy, My Bench for some benchmarks and the Sakila Sample Database.

LAST_INSERT_ID(expr) – The lesser known usage

I am of the attitude, the day you stop learning something is the day you die. I’m not prepared to induce MySQL into both sides of that equation, however some days it never ceases to amaze me what little thing I didn’t know about MySQL.

Today I saw in reviewing SQL statements for an application SELECT LAST_INSERT_ID(). No big deal, that is expected, however I then saw UPDATE … SET id=LAST_INSERT_ID(id+1) WHERE …

Having never seen this syntax I was forced to review it’s usage. See MySQL Documentation


If expr is given as an argument to LAST_INSERT_ID(), the value of the argument is returned by the function and is remembered as the next value to be returned by LAST_INSERT_ID(). This can be used to simulate sequences:

1.Create a table to hold the sequence counter and initialize it:

mysql> CREATE TABLE sequence (id INT NOT NULL);
mysql> INSERT INTO sequence VALUES (0);

2. Use the table to generate sequence numbers like this:

mysql> UPDATE sequence SET id=LAST_INSERT_ID(id+1);
mysql> SELECT LAST_INSERT_ID();

The UPDATE statement increments the sequence counter and causes the next call to LAST_INSERT_ID() to return the updated value. The SELECT statement retrieves that value. The mysql_insert_id() C API function can also be used to get the value. See Section 17.2.3.35, “mysql_insert_id()”.

Seems like a layman’s idea of Sequence support, but you are still restricted with the problems of a manual implementation. Transactions support, and if you use MyISAM, table level locking.

MySQL Camp II – Memorable Quotes Day 1

Better late then never, this week I finally have the chance to catch up on some overdue posts. At the first MySQL Camp I made a list of memorable Quotes, see Day 1, Day 2, Day 3. I didn’t get as much chance as last time, however here are some pearls of the recent MySQL Camp II.

“I walked in of the street for the free food. I’m here for the free education” – Adrian.

“I’m his boss, I’m here to make sure he’s really here, and not playing hooky.”

“Mashups, collating the worse bugs from multiple API’s all in one place” – OSCON badge

“Compiles 114,000 combinations of MySQL that we are interested in.” Sandro – Skoll Project — You mean to say there are are combinations your not interested in.

“Ok, people this is the second site were are going to crash today. You have heard you have been slash dotted, now you have been camped.”

Jay turning down the lights, to the whole auditorm, not just the stage. “That won’t work. My paper isn’t backlit.” — Andrew

Jeremy: “It’s a porn site.”
Sheeri: “technocation.com it’s not a porn site, I work for a porn site.”

Jeremy: “You don’t realize how many Google properties you use, google reader was down, google search was down, google maps was down.”
Sheeri: “I had to use Yahoo maps, I feel so dirty.”

Jay: “How many people are interested in a tour of the MySQL Source Code.”
Jeremy II: “It is a guided tour, isn’t it.”

“I can’t remember if was the cold, hot or luke warm”. — Bob, In the backup talk.

“How many environments have it. They all have the presumption of it.”

DateTime vs Timestamp

I was asked a question today, “DATETIME vs TIMESTAMP. When to use which & why?”

It’s a good MySQL introduction question, here are some general considerations for choosing one.

Do you need Date values other then an EPOCH value? (i.e. before 1970) If the answer is yes, then DATETIME is required.
If you do not however, then TIMESTAMP is the best choice for a few reasons.

1. The TIMESTAMP columns uses 4 Bytes to record it’s value, while DATETIME uses 8 bytes. Using the smallest storage is always a best practice for all columns.
2. The TIMESTAMP column supports the CURRENT_DATE syntax in the CREATE TABLE command. This enables the column to have a default value for INSERT or for UPDATE, but not both. Indeed this is the only data type that allows for any default value that is not a constant.
3. All date functions (at least the ones I use) work equally as well with TIMESTAMP and DATETIME.

I have yet to find any benchmarking to indicate any performance differences of not selecting TIMESTAMP.

And just for a piece of trivia, the DATE datatype is 3 bytes, the TIME datetype is 3 bytes, so why is the DATETIME 8 bytes?
Yes, for those that intend to reply I do know the answer, however others readers may not. Comments please!

The woes of MySQL Community tools under Solaris

Yesterday I attempted to get a working MySQL environment to support the number of utilities we all use including mytop, innotop, mybench, mysqltoolkit. These products require a number of Perl Dependencies, and while that may be a rather trivial task under Linux and with the power of cpan, working on Solaris is a whole different story.

For the record, I’m working with Solaris 9 SPARC 64bit.

I won’t detail you with how hard it was to get to this point, except to say thanks to Jeremy, Baron and Frank so far. Here is where I’m at.

You need a number of pre-requisites, most from sunfreeware.com

Pre-Requisites

$ wget ftp://ftp.sunfreeware.com/pub/freeware/sparc/9/libiconv-1.11-sol9-sparc-local.gz
$ wget ftp://ftp.sunfreeware.com/pub/freeware/sparc/9/gcc-3.4.6-sol9-sparc-local.gz
$ wget ftp://ftp.sunfreeware.com/pub/freeware/sparc/9/make-3.81-sol9-sparc-local.gz
$ gunzip libiconv-1.11-sol9-sparc-local.gz gcc-3.4.6-sol9-sparc-local.gz make-3.81-sol9-sparc-local.gz
$ pkgadd -d libiconv-1.11-sol9-sparc-local
$ pkgadd -d gcc-3.4.6-sol9-sparc-local
$ pkgadd -d make-3.81-sol9-sparc-local
$ wget http://search.cpan.org/CPAN/authors/id/A/AB/ABURLISON/Solaris-PerlGcc-1.3.tar.gz
$ gunzip Solaris-PerlGcc-1.3.tar.gz
$ tar xvf Solaris-PerlGcc-1.3.tar
$ cd Solaris-PerlGcc-1.3
$ perl Makefile.PL
$ make install
$ wget http://search.cpan.org/CPAN/authors/id/M/MS/MSCHWERN/ExtUtils-MakeMaker-6.36.tar.gz
$ gunzip ExtUtils-MakeMaker-6.36.tar.gz
$ tar xvf ExtUtils-MakeMaker-6.36.tar
$ cd ExtUtils-MakeMaker-6.36
$ perl Makefile.PL
$ make install

Back to installing

$ PATH=/usr/local/bin:$PATH;export PATH
$ PATH=/usr/perl5/5.6.1/bin/:$PATH;export PATH

$ perlgcc Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for Term::ReadKey
netcontrol1:/tmp/TermReadKey-2.30# make install
make: *** No rule to make target `/usr/perl5/5.6.1/lib/sun4-solaris-64intConfig.pm', needed by `Makefile'.  Stop.

This is where the trail ends as my friend google has not helped me past this point, so I’m turning the community. If anybody can assist please let me know.

A MySQL Story

The Taso Dump

I’ve gotto go, my time has come. (maintenance window needed)
Could not get a lock. (table lock)
I needed a blocker to ensure a single thread. (–single-transaction)

Finally, got a lock. (–lock-tables)
Need to ensure no transactions lost.
Dump, Dump, Dump. (mysqldump)

I’m having flow problems.
Was the buffer to small? (key_buffer_size)
Was it the query cache? (query_cache)
My Dirty Pages may be too high? (innodb_max_dirty_pages_pct)
Or was it just Too Many Connections? (max_connections)
But it was just waiting on the flush (flush tables with read lock)

Time passes, Time Passes. No output yet.
Is it network saturation?
Is it IO bound?
Do I need a better flushing method (innodb_flush_method)
No, it was just the lag? (Seconds_behind_master)

Dump is complete (unlock tables)
Now it’s time to Purge (purge logs)

If there is an attempt to restore.
The backup would be empty.
Blackhole would be found as the cause.
In the end, all transactions lost!

If you weren’t at Friday dinner after MySQL Camp II you missed it.
It’s not meant to be MySQL grammar correct, it’s just some random words we were throwing around.

MySQL Camp II – Post Dinner

MySQL Camp II is complete. A small group of about 18 had post dinner at Tiny Thai in New York City. Some elected to drive from Brooklyn, they arrived at least 30 minutes after those of us that the subway.

I have a lot of notes to write, if ever the time permits. For now, the following few that joined for drinks are below. I know other people took photos of the camp, for a change I actually took none. If you want to add a link in comments of photos from the camp that would be great.

MySQL Camp II Post Dinner Drinks

Other sizes here

MySQL Camp II – Introductions

We have started MySQL Camp II. The first session is Introductions.

I didn’t catch all the employers, but here is part of the list of attendees at the Introduction section. Great to see multiple people from many places including ESPN, priceline.com, Proven Scaling, Solid Tech – sponsors of Dorsal Source, ForSaleByOwner. fontshop.com, 9Mmedia, CafeMom, JP Morgan, Upoc, ClubMom, Stock Photo Finder, AmieStreet,LogicWorks, Skoll – Distributed Continuous QA , AOL, Minggl, Minggl New Test Site, Visibone. Others include OnlineBuddies.com, NT Snort User Group, DreamweaverNY User Group, A law firm. As well as a few people from MySQL.

Many people mentioned having an Oracle background, or working with Oracle now, at least 6 people that heard me speak at “MySQL DBABootcamp for the Oracle DBA” last week.

There were a lot of MySQL Beginners here which was really great.

And now we are onto the discussion of the sessions.

MySQL Camp II – It begins


Well readers, your either here or your not. MySQL Camp II starts today in Brooklyn, New York, at Polytechnic University. Last night’s pre drinks meetup in NYC went well, but today it’s brass tacks time. View Larger Map

For those of you not able to make it, IRC@Freenode #mysql-camp will be the place to hang out to hear what’s happening. If your not at the camp, please identify yourself. Be sure to also check out the Camp Web Site MySQL Camp II for the plans for today and tomorrow.

For those of you not here, MySQL Camp III is already in planning.

MySQL Backup & Recovery – Part 1

I realized recently from observation that some smaller websites which use MySQL do not have a working backup and recovery strategy. Some may have a backup approach but it will not work in a recovery. As part of a number of discussions on Backup & Recovery here is part 1.

Using straight ‘cp’ for Backup

Using ‘cp’ to simply copy your running database *DOES NOT WORK*. You may be lucky, but in a world of guarantees, this is no guarantee that your can recovery your system. Why is this so.

  • The data is not consistent during the backup. If it takes say 5 minutes to copy your files, they are copied probably in alphabetical order, what if data is written during the backup to a table starting with ‘A’ and a table starting with ‘Z’, the A file has already been copied.
  • When using MyISAM, Indexes are not flushed to disk unlike Table data. This means that while MyISAM has the facility of recover and rebuild indexes using myisamchk, you need to know if you have corruption, you need to rebuild your tables offline, and there is not assurance your data can be corrected.
  • Likewise with using Innodb, Data and Indexes are not flushed to disk. While Innodb does flush the redo logs to enable crash recovery, I have seen on a production website, granted 4.1 that Innodb failed to recover and caused major downtime and serious business ramifications.

Using ‘cp’ correctly

In order to copy your database, there are two different ways. The first is to shutdown your database first then copy your data files. Be sure to also copy your my.cnf file, this is important in a recovery process. The second is to use FLUSH command. For example.

mysql> FLUSH TABLES WITH READ LOCK;
-- In second session copy all data
mysql> UNLOCK TABLES;

The advantage is you don’t have to shutdown your instance, the disadvantage is that FLUSH TABLES will halt any WRITE statements to your database, so depending on the time to copy your data files that could have significant effect on your system. In a high write environment the FLUSH command could also take a significant time to complete.

I’ll leave with one simple tip. You should always have sufficient diskspace on your machine for performing the backup. This is important for two parts. First, if you say compress your backup during the process, you are taking a longer time then a straight copy and your database is unavailable longer. Second, over 90% of any recovery occurs with last nights backup, and the time take to uncompress the backup affects your recovery time, particularly in a large system. Your ‘cp’ should indeed be a two phase process. First ‘cp’ on the local server for a local backup, then have an off server compressed backup.

In my next Part, I’ll discuss more alternatives to improving your backup strategy.

References

FLUSH TABLES

Today's interesting MySQL Error Message

You have to love error messages some times. Today is was “ERROR 1289 (HY000): The ‘UNKNOWN’ feature is disabled;”

For those interested.

mysql> select version();
+-------------+
| version()   |
+-------------+
| 6.0.0-alpha |
+-------------+
1 row in set (0.00 sec)
mysql> set session sql_mode=NO_ENGINE_SUBSTITUTION;
Query OK, 0 rows affected (0.00 sec)

mysql> create table T2 (id INT UNSIGNED NOT NULL) ENGINE=InnoDB;
ERROR 1289 (HY000): The 'UNKNOWN' feature is disabled; you need MySQL built with 'UNKNOWN'
to have it working

In this case it’s obviously something that’s not quite right, so being a good community contributor I raised a bug. You can read more at Bug #29373.

MySQL values the contributions from the community, even as trivial as documentation or error messages. If you have an inquiry raise it at the forums or the lists, or if it’s obviously an issue raise it in the Bugs System

SHOW STATUS WHERE

When you use SHOW STATUS can can restrict with the LIKE syntax, allowing for a subset of values. For example:

mysql> SHOW GLOBAL STATUS LIKE 'Com%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Com_admin_commands       | 0     |
| Com_alter_db             | 0     |
| Com_alter_table          | 0     |
| Com_analyze              | 0     |
| Com_backup_table         | 0     |
| Com_begin                | 0     |
| Com_change_db            | 0     |
| Com_change_master        | 0     |
...

That’s great, but sometimes you want specific values. Using WHERE can achieve this. For Example.

mysql> SHOW GLOBAL STATUS WHERE VARIABLE_NAME IN (’Com_insert’,'Innodb_buffer_pool_pages_latched’,'threads_running’);
+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Com_insert                       | 0     |
| Innodb_buffer_pool_pages_latched | 0     |
| Threads_running                  | 1     |
+----------------------------------+-------+
3 rows in set (0.00 sec)

Cool, the downside is you loose the wildcard capability, however you can string commands together with LIKE

mysql> SHOW GLOBAL STATUS WHERE VARIABLE_NAME LIKE 'innodb%' OR VARIABLE_NAME LIKE  'com%';

MySQL NY Meetup – Part 2 in our series

Last night at the MySQL NY Meetup we continued on from a very successful July presentation on “Practical Performance Tips & Tricks”. I must admit after speaking and standing all day for the MySQL DBA Bootcamp for the Oracle DBA it was a stretch, and we didn’t cover all material as expected, but the evening was still very productive to everybody. Links are here for my August Presentation and July Presentation.

Thanks to Marc and the team from LogicWorks for again sponsoring our NY Meetup Event. We don’t get the beer and food any other way.

As a consultant working on client sites even in the time from the previous meeting, I see a number of simple steps that can be applied to every single MySQL environment and last nights talk placed some focus on this. The following is the homework that has been set for attendees, so next month we can have a panel discussion and then continue more on the analysis of MySQL.

This homework focuses on three areas. It seems a lot, but it will be worth it.

  1. Collection of SQL statements (in various forms)
  2. Monitoring of raw MySQL Status information
  3. Review of Schema Data Types for more optimal storage

Collection

This information is to be collected.

  • Production is to have slow query log enabled and long_query_time =1
  • Daily review of Production slow query log to view the longest running and most frequent (Slide Aug-13)
  • All development and test environments have general log and slow query log enabled.
  • All application paths to be tested in development or testing, to capture all SQL. Use self documenting approach (Slide: Aug-21)

Outputs
Your outputs will include offending SQL statements via the Slow Query log that require investigation, extra credit for doing an EXPLAIN and a SHOW CREATE TABLE for each table in the query for analysis of improving the queries.
You will have a review of every SQL statement. A simple desk check will determine are there any queries which are exact duplicates, are there any queries that could be combined (as they use the same where condition). Are there any queries that are repeated a lot, but with only changing values. These are a clear candidate for combining into a single query.

MySQL Status Information

The MySQL Status can provide help into where to look for problems. This is the raw fundamentals that either home grown scripts or MySQL Network Monitoring and Advisory Service use to provide monitoring of important information (e.g. Buffer Hit Ratio) and can provide alerts of present or potential pending problems. We will not be reviewing all these status (~250) or developing any detailed analysis, however this is designed to show you the building blocks.

You should collect figures for 1 minute, 1 hour and 1 day that represent appropriate load on your system.

1 Minute

$ mysqladmin -uroot -p -r -i 1 -c 60 extended-status | grep -v " | 0 " > minute.log

1 Hour

$ mysqladmin -uroot -p -i 60 -c 60 extended-status > hourly.log

1 Day

$ mysqladmin -uroot -p -i 3600 -c 24 extended-status > daily.log

Extra credit for downloading Statpack and doing an analysis of the statistics. The trick is to take the first and last status output from each of the Hourly and Daily logs to get a 1 hour picture and a 1 day picture.

Review Schema

For reviewing the schema, try to use a static production sized copy (e.g. a copy of production in a test environment). This ensures you don’t make changes that will cause your production system to fail, and before and after figures are for the same data set as it’s not changing during production use.

  • Calculate size of present data
  • Review schema for immediate size improvements (Slide Jul-59)
  • Apply size improvements and re-run to see size improvements.

SQL

Calculate Size of all schemas

select table_schema, count(*) as tables,
           sum(data_length+index_length)/1024/1024 as total_mb,
           sum(data_length)/1024/1024 as data_mb,
           sum(index_length)/1024/1024 as index_mb
from information_schema.tables
group by table_schema
order by 3 desc;

Calculate Size of all tables for given schema

use test;
select table_name,engine,row_format,avg_row_length,table_rows,
          (data_length+index_length)/1024/1024 as total_mb,
          (data_length)/1024/1024 as data_mb,
          (index_length)/1024/1024 as index_mb
from information_schema.tables
where table_schema=database()
order by 6 desc;

Learning MySQL as an Oracle DBA

Updated

I have an entire section now devoted to various MySQL for the Oracle DBA Resources. You will find additional information here.

This week I presented two one day free seminars, “MySQL DBA Bootcamp for the Oracle DBA” in New York and San Francisco. Both were very successful days providing an opportunity to speak to seasoned enterprise professionals.

One question I was asked was “As an Oracle DBA, how can I become a MySQL DBA, what do I do, where do I start?”

Here are my references and recommendations that have zero cost to get started.

  • Read the MySQL Documentation Reference Manual.
  • Download MySQL install and use it.
  • The MySQL Developer Zone is a great sources for articles, information and references.
  • Planet MySQL is our official consolidated Blog Aggregator. Read it daily.
  • The MySQL Forge is a detailed reference of MySQL related projects, code snippets, wiki and MySQL WorkLog. A look at what MySQL is doing in future versions, and what others like yourself would like to see in future versions.
  • The MySQL Conference website has many papers from the recent 2007 Conference. You can also review the 2006 Conference Papers.
  • MySQL has various Forums and Email Lists. We have a specific Oracle Forum to assist Oracle DBA’s and Developers with MySQL questions.
  • MySQL also provides a large number of White Papers and Case Studies in it’s Why MySQL section. These are helpful to see how MySQL is being used today.
  • Register at mysql.com, if you join a list, fill in download form or respond to a forum, your already registered, but if not you will get a regular newletter that provides helpful information, including events, webinars and training
  • Sheeri Kritzer, the She-BA of MySQL and this years community award winner has an extensive list of resources on her website including podcasts and a long list of Audio & Video from the recent MySQL Conference.

For additional Oracle to MySQL specific references from recent conferences include 2006 – MySQL For Oracle DBA’s,
MySQL For Oracle Developers and 2007 – MySQL For Oracle DBA’s and Developers.

For your reference material, I would initially recommend the following books.

  • “MySQL 5.0 Certification Guide” – which all attendees received for free.
  • “MySQL Cookbook” by Paul DuBois, “MySQL” by the same Paul DuBois or “Pro MySQL” by Michael Kruckenberg and Jay Pipes.
  • “MySQL Administrator’s Guide and Language Reference”. – This is just a printed copy of the MySQL Manual, however some people may appreciate this.

Following that, additional resources depending on your level of interest in development or internal workings etc would include.

  • “MySQL Stored Procedure Programming” by Guy Harrison.
  • “Understanding MySQL Internals” by Sasha Pachev.

MySQL Professional Services also provides training and certification for MySQL. With 9 different training courses held world wide, and 5 different certification courses there are various programs to suit DBA’s and Developers at different skill levels. For more information see MySQL Training and Certification.

Let's improve MySQL security

We have all done it in the past, and probably most people that read this (will admit| or lie) to still doing it, but everybody must start making an effort to improving MySQL security in the usage on your MySQL Installation, including just on your laptop, and in presentations that people read.

I spotted a reference article on Planet MySQL this evening and without looking at the details the syntax presented typifies two basic and fundamental 101 MySQL security issues.

1. Always, always, always have a password for a MySQL account, especially for the ‘root’ user.
2. Don’t use the ‘root’ user unless you really have to. The SUPER privilege is just that SUPER, there are many things you really don’t want every person accessing to have. In a larger environment you want to ensure good principles, but also in single developer environment.

I was asked on Thursday at a seminar I was giving. “Our developers login as ‘root’ for development and use ‘root’ for MySQL access, is this bad?”. To which my simple response was “Yes”.

I think MySQL should improve the product and not allow a root user without a password. You can of course use mysql_secure_installation to set this, but it’s not a mandatory step. You can also use SQL_MODE to ensure users can’t be created without a password, in both instances the horse has already bolted.

5.1.20 Gotcha – The MySQL Error Log

While using the latest MySQL 5.1.20 yesterday I came across another situation that was not expected as with previous editions of MySQL. The background is experimenting with DRBD. When I configured MySQL to startup with a /etc/my.cnf file with data on a DRBD partition I got a failed startup error message with mysqld_safe.

$ bin/mysqld_safe &
[1] 12615
070720 10:10:42 mysqld_safe Starting mysqld daemon with databases from /drbd/data
070720 10:10:42 mysqld_safe mysqld from pid file /drbd/data/newyork.localdomain.pid ended

Ok. Well this happens so I went to the data directory to look for `hostname`.err.

$ cd /drbd/data
$ ls -l

What the! There is no error log. Then the discussion started about this. Apparently mysqld_safe now uses syslog (e.g. /var/log/messages) for logging messages. Ok, but where is the line between mysqld_safe and mysqld. There was some confusion here about this, however investigation of /var/log/messages showed what was necessary to at least find the error.

$ tail -30 /var/log/messages
Jul 19 16:59:51 newyork mysqld_safe[6156]: Starting mysqld daemon with databases from /drbd/data
Jul 19 16:59:54 newyork mysqld[6160]: /usr/local/mysql/bin/mysqld: Can't find file: './mysql/plugin.frm' (errno: 13)
Jul 19 16:59:54 newyork mysqld[6160]: [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: The first specified data file ./ibdata1 did not exist:
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: a new database to be created!
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Setting file ./ibdata1 size to 10 MB
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Database physically writes the file full: wait...
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Log file ./ib_logfile0 did not exist: new to be created
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Setting log file ./ib_logfile0 size to 5 MB
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Database physically writes the file full: wait...
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Log file ./ib_logfile1 did not exist: new to be created
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Setting log file ./ib_logfile1 size to 5 MB
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Database physically writes the file full: wait...
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Doublewrite buffer not found: creating new
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Doublewrite buffer created
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Creating foreign key constraint system tables
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Foreign key constraint system tables created
Jul 19 16:59:54 newyork mysqld[6160]: InnoDB: Started; log sequence number 0 0
Jul 19 16:59:54 newyork mysqld[6160]: [ERROR] /usr/local/mysql/bin/mysqld: Can't find file: './mysql/host.frm' (errno: 13)
Jul 19 16:59:54 newyork mysqld[6160]: [ERROR] Fatal error: Can't open and lock privilege tables: Can't find file: './mysql/host.frm' (errno: 13)

To ensure backward compatibility now with previous versions of MySQL, it is necessary to specific in my.cnf the log-error parameter.

#my.cnf
[mysqld]
log-error = /drbd/data/mysql.err

Starting gives you expected error logging.

$ more /drbd/data/mysql.err
070720 10:16:30 mysqld_safe Starting mysqld daemon with databases from /drbd/data
/usr/local/mysql/bin/mysqld: Can't find file: './mysql/plugin.frm' (errno: 13)
070720 10:16:30 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
070720 10:16:30  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
InnoDB: File name ./ibdata1
InnoDB: File operation call: 'open'.
InnoDB: Cannot continue operation.
070720 10:16:30 mysqld_safe mysqld from pid file /drbd/data/newyork.localdomain.pid ended

So, you need to read the fine print in the Changes in release 5.1.20 (25 June 2007) 5.1 Manual you get.


Incompatible change: mysqld_safe now supports error logging to syslog on systems that support the logger command. The new –syslog and –skip-syslog options can be used in conjunction with the –log-error option to control logging behavior, as described in Section 5.3.1, “mysqld_safe — MySQL Server Startup Script”. The default is to use syslog, which differs from the previous default behavior of writing an error log file. To maintain the older behavior, use the –log-error option. (Bug#4858)

If you read mysqld_safe — MySQL Server Startup Script you also see the following note.


Note

As of MySQL 5.1.20, the default error logging behavior is to write errors to syslog on systems that support the logger program. This differs from the default behavior of writing an error log file for versions prior to 5.1.20. To maintain the older behavior, use the –log-error option.

RTFM those release notes!

MySQL 5.1.20 Installation

MySQL recently released 5.1.20 beta. I just ran into another gotcha, which prompted me to post this draft I’ve had for a while.

rbradford@newyork:/opt/mysql51$ scripts/mysql_install_db
Installing MySQL system tables...
070709 23:24:08 [Note] Plugin 'InnoDB' disabled by command line option
OK
Filling help tables...
070709 23:24:08 [Note] Plugin 'InnoDB' disabled by command line option
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
./bin/mysqladmin -u root password 'new-password'
./bin/mysqladmin -u root -h newyork password 'new-password'
See the manual for more instructions.
You can start the MySQL daemon with:
cd . ; ./bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl

Please report any problems with the ./bin/mysqlbug script!

The latest information about MySQL is available on the web at

http://www.mysql.com

Support MySQL by buying support/licenses at http://shop.mysql.com

Wait, there is an Innodb error message I’ve not seen . I’ll have to look into that.

mysql> show engines;
+------------+---------+----------------------------------------------------------------+--------------+-----+------------+
| Engine     | Support | Comment                                                        | Transactions | XA  | Savepoints |
+------------+---------+----------------------------------------------------------------+--------------+-----+------------+
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys     | YES          | YES | YES        |
| MRG_MYISAM | YES     | Collection of identical MyISAM tables                          | NO           | NO  | NO         |
| BLACKHOLE  | YES     | /dev/null storage engine (anything you write to it disappears) | NO           | NO  | NO         |
| CSV        | YES     | CSV storage engine                                             | NO           | NO  | NO         |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables      | NO           | NO  | NO         |
| FEDERATED  | YES     | Federated MySQL storage engine                                 | YES          | NO  | NO         |
| ARCHIVE    | YES     | Archive storage engine                                         | NO           | NO  | NO         |
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance         | NO           | NO  | NO         |
+------------+---------+----------------------------------------------------------------+--------------+-----+------------+
8 rows in set (0.00 sec)

mysql> show plugins;
+------------+--------+----------------+---------+---------+
| Name       | Status | Type           | Library | License |
+------------+--------+----------------+---------+---------+
| binlog     | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| partition  | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| ARCHIVE    | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| BLACKHOLE  | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| CSV        | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| FEDERATED  | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| MEMORY     | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| InnoDB     | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| MyISAM     | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
| MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL    | GPL     |
+------------+--------+----------------+---------+---------+
10 rows in set (0.00 sec)

Yet, it works fine! So what does message “[Note] Plugin ‘InnoDB’ disabled by command line option” really mean. On closer inspection it doesn’t say Error, however it’s a different message from previous installs, so it requires more investigate to know what’s happening.

OS Freedom

I’m at some internal MySQL training today at our US HQ. In a room of 11 people we are all using our laptops for access to additional machines. The beauty of this is we have Linux (Ubuntu 5.10, 6.10, 7.04, Red Hat Fedora), Mac OS/X and Windows Operating Systems on various staff machines. It’s great to realize we are not locked into just one infrastructure, and we have the ability to all do our work effectively and MySQL works just as well across these technologies. I can’t however SSH from my iPhone yet, unlike my previous Nokia E62 PDA.

MySQL NY Meetup Presentation

Tonight I spoke at the NY MySQL Meetup. The topic “Practical Performance Tips & Tricks” was a full packed 1 hour session, with 4 x 15 minute sections on Beginner, Intermediate, Advanced and MySQL Proxy. The goal to hopefully cover content for different level of attendees. This meeting followed up the large turnout from last month’s meeting with at least 35 people. Thanks again to Logicworks for sponsoring the night and providing the beer and food, especially to adjust for the late arriving presenter.

My Slides are download able in PDF format here.

Some additional links that were discussed during the meeting for reference included.

Details of my previous MySQL presentations can be found in my Articles section.

MySQL Proxy. Playing with the tutorials

I was playing with the 5 sample tutorial Lua scripts available here with the MySQL Proxy, but I was doing something a little inefficiently.

I started mysqld, then I started the MySQL Proxy with the lua script, then connected to MySQL via the proxy. To test a different script I was actually killing the MySQL Proxy and restarting with appropriate script, but this is unnecessary. MySQL Proxy will re-read the lua script, as specified with –proxy-lua-script on new connection. All I need to do is copy in the file in question and get a new mysql client connection.

The tech version of the right way:

$ cp tutorial-basic.lua running.lua
$ ./mysql-proxy --proxy-lua-script=running.lua &
$ mysql -uusr -p -P4040 -h127.0.0.1
mysql> # do my stuff
mysql> exit;
$ cp tutorial-inject.lua running.lua
$ mysql -uusr -p -P4040 -h127.0.0.1

In the tutorial-resultset.lua example, we see the creation of a new command, show querycounter which gives you the number of queries executed for the connection. Immediately I can see that people will be creating pseudo Com_insert,Com_select,… variables per connection so you can get a better granularity of information then what default status variables can provide.

It was interesting however to see the counter be a value of 20 when you open a connection. What you may ask, but we forget the little things sometimes when running the mysql client. On invocation it actually executed 20 commands to the server, that is 20 round trips.

There are indeed: SHOW DATABASES; SHOW TABLES; then 16 Field Lists for each mysql table (e.g. host,db,func…), then a SELECT @@version_comment. Most of these as a result of the rehash or tab-completion capabilities in mysql client.

Wow, I also learned something new, there is a @@version_comment, I just knew about @@version. You learn some new trivial point every day!

 mysql> select @@version,@@version_comment;
+------------+------------------------------+
| @@version  | @@version_comment            |
+------------+------------------------------+
| 5.0.37-log | MySQL Community Server (GPL) |
+------------+------------------------------+
1 row in set (0.00 sec)

MySQL Proxy. Get it here.

What's your disk I/O thoughtput?

MySQL uses disk. Ok, so everybody knew that. MySQL uses disk in two primary ways.

  • Random I/O (Reading & Writing Data/Index blocks)
  • Sequential I/O (Binary Log, InnoDB Redo Log)

Historically it’s been best practice to separate these onto different spindles, and also separating the OS and tmp space onto a third spindle. With commodity H/W that can be easily done, but today a lot of people use SAN. Is this a good thing for a MySQL Database?
That’s a topic of much discussion at a later time, however I’ll add two points. A lot of SAN configurations are RAID 5, and RAID 10 is a better choice due to removing the requirement to calculate the parity. Second, last week I observed a RAID disk failure and it took an incredible long time for the disk to be re-built. Just how many SAN uses our there have actually timed a disk rebuild on a loaded system and seen the impact on the system in general.

Back on topic, I don’t have access to any variety of hardware, so community here is where you can help. Those that can spare 5 mins, and have some free disk space (< 5GB), here is an exercise.

Commands

$ time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
$ ls -lh testfile.32k
$ rm testfile.32k

You should see something like (FYI: from a 5400rpm laptop drive)


$ time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
100000+0 records in
100000+0 records out
3276800000 bytes (3.3 GB) copied, 160.172 seconds, 20.5 MB/s
real 2m40.342s
user 0m0.120s
sys 0m15.277s
$ ls -lh testfile.32k
-rw-r--r-- 1 usr grp 3.1G 2007-06-28 10:02 testfile.32k

If your output doesn’t provide the dd M/B output (like Solaris for example) if you could also add:

$bc
3.2*1024/160
20
^D

NOTE: Replace 160 with the number of seconds from the real time (e.g. 2*60+40)

Of course I’m not collecting a lot of stuff, like comparing different block sizes, or looking at iostat for existing load and introduced load. I thought I’d ask an easy question to see what type of response and output I’d find.

If you want to add any additional information such as Drive Types & Speeds (e.g. SATA 5400rpm), RAID configuration or SAN configuration (e.g. RAID and connection type), OS and File System type that would be great, but completely optional.

If you would like add your results anonymously, please email me directly.

Thanks in advance.

Top 5 Best Practices for MySQL

We had the Top 5 wishes for MySQL started by Jay recently. So in true chain letter fashion I’m starting a new one this week. “The top 5 Best Practices for MySQL”. This like it’s predecessor is generally vague, so it can include points on development, design, administration etc.

My list:

1. Write your application to support Transactions (and therefore use a Transactional Storage Engine).

2. Always use SQL_MODE. e.g. at least TRADITIONAL and ANSI to ensure better data integrity and errors as errors.

3. Use the most optimal data types (particularly for number (e.g. TINY/SMALL/BIG INT and nullability) and especially in relation to columns in indexes.

4. When using InnoDB use the shortest primary key possible (e.g. INT UNSIGNED. BIGINT unless you have more then 4 billion rows in your potential data set is laziness).

5. VARCHAR(255) is just plain dumb and lazy. This is not database design and for the record, yes there is an impact when your queries use certain buffers (e.g. sort_buffer). Last year I wrote on this topic in If you don’t know your data, you don’t know your application.. Combined with SELECT * FROM TABLE in queries is not a well designed application.

Obviously I need to clarify that this is a baseline for Best Practice and many considerations can lead to a more optimized means depending on circumstances, for example using MyISAM or other MySQL non-transactional storage engines etc, when not to use sql_mode etc and when the shortest Innodb is not the best when you are being killed by I/O. For points 3 and 5, there are no exceptions.

To all Planet MySQL bloggers and readers, the challenge is on.

Got MySQL Proxy yet!

If you haven’t got MySQL Proxy yet, then stop and get it now. Jan announced the release a few days ago of this new product offering from MySQL.

I first heard about MySQL Proxy at the recent MySQL Conference 2007 and actually used it a few weeks later to help address slow running queries during benchmarking with a granularity of milliseconds — Wow. The product has grown immensely since then and I’ve watched in true amazement at the speed of development by Jan, who I only found out recently was the creator of Lighthttpd.

Now it’s past midnight and I think of those things I’ve wanted, can the proxy now do them. Here are a few ideas for me to play with.

  • Find Slow Queries < 1 second
  • Add a time (HH:MM:SS) column to SHOW GLOBAL STATUS output, and really hope it comes out in mysqladmin extended-status. Cross fingers here
  • Be able to log slow queries to a Memory table, a little like mysql.slow_log but more intelligent, and in 5.0.
  • True SQL access to things like PROCESSLIST, SHOW GLOBAL VARIABLES and SHOW GLOBAL STATUS, again like the 5.1 I_S tables, but I want them now in 5.0

The mind boggles. As Giuseppe describes it’s also an excuse to learn a new language called Lua. Those familiar with the MySQL GUI Products and have played with the GRT would have already seen Lua in action, as early as the 2006 User Conference.

Woot! Time to hit post and check out the real www.woot.com

Query Cache is deterministic

I was reading some points on a Blog post today here. I didn’t see it via Planet MySQL but via a Google Alert.

The post has several good beginner points on MySQL Performance, but included the point “Literal current date is better than using the CURRENT_DATE() function because literals will use the query cache but functions won’t.”

How true, I must admit I’d not considered this in reviewing code for sites that use the Query Cache heavily. Non deterministic functions invalid use of the Query Cache (Except for recent Bug #29053 which is also an interesting read. Note to self, read the new bugs more often when free time permits). Back on point.

So if you have a query like “Select the news for today” and use CURDATE(), it will not be stored in the Query Cache, yet if your application code determines the current date and passes “Select the news for 2007-06-26″ for example it will.

Sometimes the simple things are easy to forget, or not have thought about in this case for me.

My 'Hourly' MySQL Monitor Script Version 0.05

I’ve been able to steal some more time to work on my script following My ‘hourly’ MySQL monitor script Version 0.03 almost a month ago.

The purpose of this script is simple. Create an hourly ‘cron’ job that records and monitors information against the OS and MySQL Server. This is most helpful when environments simply don’t have any monitoring in place. I’ve found it very productive when running a benchmark on site to simply enable for an hour. An amount of analysis is required, but I at least have a baseline of data collection. That is the first goal.

So hourly.0.05.tar.gz is taking some shape, and has the following new features:

  • Provide configurable flag to enable/disable OS and MySQL tests
  • Added MYSQL_SID support, allowing for running MySQL tests against multiple instances on a single server
  • Added configurable mysql.conf support, again supporting multiple server instances
  • Revised file name standards, again to support multiple server instances
  • Added additional error checking for correct MySQL Configuration and operational environment
  • Corrected file pathing of dependent files

My next step now is to provide some simple analysis, I’ll be starting with Statpack Version 2 by colleague Mark Leith.

Watching for Disk Based Tables being created

I didn’t know you could actually do this before addressing this problem in a benchmark using 5.0.36.

MySQL allows you to update two tables with a single UPDATE statement. I knew you could reference two tables in an UPDATE statement but not update both. However when working with a client after benchmarking I observed a large number of Created_tmp_disk_tables via SHOW GLOBAL STATUS and found that this query was the offending query only for certain circumstances. Understanding took a little longer.

schema.sql

DROP TABLE IF EXISTS a;
CREATE TABLE a(
id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
i   INT UNSIGNED NOT NULL,
c   CHAR(1) NOT NULL)
ENGINE=INNODB;
DROP TABLE IF EXISTS b;
CREATE TABLE b(
id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
i   INT UNSIGNED NOT NULL,
c   CHAR(1) NOT NULL,
t   TEXT NOT NULL)
ENGINE=INNODB;

data.sql

INSERT INTO a VALUES(1,1,'a');
INSERT INTO a VALUES(2,2,'b');
INSERT INTO a VALUES(3,3,'c');
INSERT INTO a VALUES(4,4,'d');
INSERT INTO b VALUES(1,1,'a',REPEAT('The quick brown fox jumped over the lazy dog',100));
INSERT INTO b VALUES(2,2,'b',REPEAT('The quick brown fox jumped over the lazy dog',200));
INSERT INTO b VALUES(3,3,'c',REPEAT('The quick brown fox jumped over the lazy dog',300));
INSERT INTO b VALUES(4,4,'d',REPEAT('The quick brown fox jumped over the lazy dog',400));

test.sql

SHOW SESSION STATUS LIKE '%tmp%tables';
SHOW SESSION STATUS LIKE '%tmp%tables';
UPDATE a,b SET a.i = 10, a.c = 'x', b.i = 20 WHERE a.id = b.id and a.id=1;
SHOW SESSION STATUS LIKE '%tmp%tables';
UPDATE a,b SET a.i = 10, a.c = 'x', b.i = 20,b.t=REPEAT('b',5000) WHERE a.id = b.id and a.id=3;
SHOW SESSION STATUS LIKE '%tmp%tables';
UPDATE b SET t = REPEAT('a',5000) WHERE b.id=2;
SHOW SESSION STATUS LIKE '%tmp%tables';

Output

The following output shows the following.

  1. Impact of a SHOW SESSION STATUS increments Created_tmp_tables
  2. A two table update increments Created_tmp_tables
  3. A two table update including a text field increments Created_tmp_tables and Created_tmp_disk_tables
  4. A single table update with the text field increments Created_tmp_tables

So the culprit was the combination of a two table update and a TEXT field.

mysql> source test.sql
--------------
SHOW SESSION STATUS LIKE '%tmp%tables'
--------------

+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 0     |
| Created_tmp_tables      | 3     |
+-------------------------+-------+
2 rows in set (0.00 sec)

--------------
SHOW SESSION STATUS LIKE '%tmp%tables'
--------------

+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 0     |
| Created_tmp_tables      | 4     |
+-------------------------+-------+
2 rows in set (0.00 sec)

--------------
UPDATE a,b SET a.i = 10, a.c = 'x', b.i = 20 WHERE a.id = b.id and a.id=1
--------------

Query OK, 2 rows affected (0.02 sec)
Rows matched: 2  Changed: 2  Warnings: 0

--------------
SHOW SESSION STATUS LIKE '%tmp%tables'
--------------

+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 0     |
| Created_tmp_tables      | 6     |
+-------------------------+-------+
2 rows in set (0.00 sec)

--------------
UPDATE a,b SET a.i = 10, a.c = 'x', b.i = 20,b.t=REPEAT('b',5000) WHERE a.id = b.id and a.id=3
--------------

Query OK, 2 rows affected (0.00 sec)
Rows matched: 2  Changed: 2  Warnings: 0

--------------
SHOW SESSION STATUS LIKE '%tmp%tables'
--------------

+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 1     |
| Created_tmp_tables      | 8     |
+-------------------------+-------+
2 rows in set (0.00 sec)

--------------
UPDATE b SET t = REPEAT('a',5000) WHERE b.id=2
--------------

Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

--------------
SHOW SESSION STATUS LIKE '%tmp%tables'
--------------

+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 1     |
| Created_tmp_tables      | 9     |
+-------------------------+-------+
2 rows in set (0.00 sec)

The Facebook effect

First it was the SlashDot effect, then it was the Digg effect, now it’s the Facebook effect. I have a friend at Facebook and he was talking about the effect of the Facebook Platform API that was released a few weeks ago. Sites were now struggling to cope with the effect of massive amounts of new traffic, ensuring that experienced MySQL Consultants will have plenty of scale out opportunities.

Here is an abstract from an article I read recently. Analyzing the Facebook Platform, three weeks in


Translation: unless you already have, or are prepared to quickly procure, a 100-500+ server infrastructure and everything associated with it — networking gear, storage gear, ISP interconnetions, monitoring systems, firewalls, load balancers, provisioning systems, etc. — and a killer operations team, launching a successful Facebook application may well be a self-defeating proposition.

This is a “success kills” scenario — the good news is you’re successful, the bad news is you’re flat on your back from what amounts to a self-inflicted denial of service attack, unless you have the money and time and knowledge to tackle the resulting scale challenges.

This comes from the success of iLike. Some more reading references are Crazy love when startup iLike hits pay dirt and Holy cow… 6mm users and growing 300k/day!. Wow!

You can’t buy viral marketing with this type of traffic growth figures. Given the current MySQL 12 Days of Scale-out and recent experiences where clients are seeking HA & scale-out solutions but have not architectured present systems to manage any level of scale-out via the MySQL proven techniques of replication and sharding.

It’s an important lesson that any organization wanting to develop a successful web site needs to ensure the architecture is designed with massive scale-out in mind from the beginning. This means starting with your application supporting partitioning of your data (both vertically and horizontally), and supporting replication, including the possibility of lag with MySQL Replication slaves.

You see on a lot of larger Web 2.0 sites these days after saving data, a message like “Your information will be available momentarily” or a message of this nature and the data saved not automatically displayed. This is a clear means of supporting lag, even if only for a few milliseconds. This is just the first of many steps in application design managing scale-out architectures.

Top 5 wishes for MySQL

Note: My views are just that: mine.

1. Real time Query Monitoring

MySQL 5.0 GA provides only 3 ways to look at queries that are executed on a server in some way or another. Slow Query Log, General Query Log and Binary Log. All require a server reboot to activate and de-activate. In a production system, it’s sometimes critical to be able to know “what is going on”, and you simply can’t reboot the server twice (once to turn on, once to turn off). 5.1 goes some way with Log Tables to being able to turn on General and Slow Logs into tables. Question is, as Kevin Burton listed in his points, when is 5.1 going to be out.

Real time query monitoring also needs to have a granularity of operation better then “server”. There needs to be a capacity to assign this on per connection basis. A server is being hammered, certain status variables are increasing greatly, I need to know now what queries are causing this. MySQL provides no means of doing this. MySQL Proxy is a great new idea/project people will start hearing more about, and it goes a long way to helping, but it’s not dynamic in that I can simply turn on logging on a production system without impact to the MySQL server or connections.

Real time query monitoring granularity of time, also need to be in better units, it’s very difficult to find slow running queries > 100ms when the present granularity is seconds. MySQL Proxy as mentioned, and also Connector/J provide this, (BTW Connector/J has excellent features in it’s many connection options if you develop with Java but it’s yet another output to look at, and when your application server and database servers are on different machine architectures it’s a lot of work to sync).

I am also behind SHOW PROFILE. I’d like to see it being able to be attached to existing connections, and applied to queries, and then output discarded for a time base condition (say < 100ms). Granted the act of observation slows things done, it the ability to be able to observe, see and use the information that’s around in bits that would be a start.

2. Consistent Release Cycles

As Stewart pointed out, it’s ridiculous to have 2 years for release of a product. A consistent cycle is needed. 5.0 Change Notes shows first release to GA 22 months. 5.1 Change Notes still not GA is at 19 months.
We talk now about getting new features, 5.1 is frozen, 5.2 seems lost now in any discussions. 6.0 has a few key features but I’m sure significant new features will be limited to ensure the exposure of Falcon. So, a key new feature maybe in 6.1+

Scope creep, lack of clear planning, test coverage, and user community contribution I feel are all factors. I know that user community testing and contribution is continuing to increase and I applaud the valuable contribution of the community. I wish I could do more myself.

3. Information Schema Extensions

I’ve heard of a Pluggable API for I_S tables in 5.1 Could somebody really confirm? I’ve seen Google doing File System Storage Engines (e.g. for /proc info). I really, really, really wish things like 5.1 processlist and status/variables tables were backported to 5.0 to start with.

In addition, now that I have started, there is need for more detailed information on queries, extensions of status variables that are needed. SHOW PROFILE goes someway internally to indicate what’s going on, but knowing that a certain buffer is being used, and what portion of it per connection will help in sizing. It’s important for example to know the tmp_table_size actually used in a result set, VARCHAR and COLLATION have a huge effect that people simply don’t consider. As the number of Pluggable storage engines increases, the need to know what is really happening is going to be more important. Some of this may be more in relation to Real Time Query Monitoring, but I feel certain additional information is needed to be stored.

4. Online Table Maintenance

It was not until I had to time operations recently for ALTER TABLE ADD COLUMN|INDEX did I realize the extent of the time it takes for InnoDB tables (i.e. is takes your database table offline during this time for any OLTP). My tests were taking over an hour (and I was not in the 3 digit GB range for a table). A real uptime system can’t support downtime like this. Traditionally large scale out MySQL applications have been developed around this limitation, however to compete more with Enterprise experiences, and resources coming from enterprise background this is simply not an option. Add the fact you can’t add a datafile to a Innodb Tablespace online (why not!). While speaking of datafiles, I echo Frank’s comment with the limitation of when using innodb_file_per_table, you can’t copy the file between MySQL Instances (assuming for example all the same version of H/W, O/S, MySQL).

I really hope that Falcon addresses these issues to provide a transactional storage engine offering with these enterprise uptime features.

5. Published Benchmarks

MySQL does not publish any benchmarks, well at least not what I know about. The first problem is: how long is a piece of string. There are millions of variables, but it would be great if even a number of cases of straight forward cases were proven. People may then have a better indication of baseline systems.

Here is my initial wish list.

  • Classification of Server Configuration. Let’s say a comparison between 1 CPU (dual Core), 2 CPU (dual Core) and 4 CPU (dual Core) commodity H/W. With the same memory (4G), disk (local disk), O/S (64 bit to support > 4GB), sample data (20-50GB) and same queries (simple OLTP only) just what are the benefits. Can we get a cost of throughput to $ Cost.
  • Disk Configuration. Just how does local Raid 1/Raid 5/Raid 10, compare with SAN (Raid 5) and SAN (Raid 10). Ok there are many types of disks as well as Raid, but start with commodity SATA 72K, 8MB cache. In addition how does a machine with 6 drives (in Raid 1 with OS,Data,Logs split) compare with Raid 5 or Raid 10.
  • RAM. How do our tests run when we take a 4GB system with 20-50GB of data and give it now 16GB.
  • Backup/Recovery. How long does it take to backup and restore.
  • Admin, how long does it take to ALTER TABLE, add data file, even something simply like how long to load data into a memory table across H/W has been interesting

As you can see it turns into a nightmare quickly, we didn’t talk about storage engines like MyISAM/InnoDB, tuning parameters, different O/S etc but surely something really is better then nothing. If only there was a baseline of data and queries to start with. Surely with the data sources available out there, some enterprising person could create a 20GB, 50GB + realistic production type data source, and 20-50 OLTP queries and we have a baseline.

There is talk of the Build Farm by Jay for compiling, let’s get that baseline so we can run some tests across thousands of configurations.
If sufficient work was done by MySQL to get some standard start, then the community might take up the challenge of taking the data/queries/benchmarking framework and test on all the configurations out there, tune to the wazoo and provide back to the MySQL Forge data for everybody to look at.

One day, knowing that this type of disk with this type of battery backup in this RAID configuration just isn’t anywhere near as good as 3 other types of options at roughly the same price.

The Rest

There are more, but in keeping with the spirit of 5 knowing that at least 2 people have shown scope creep already I’ll stop. I really want to mention more.

More About Top 5

Jay Pipes started the Top 5 wishes for MySQL recently. Here are the Planet MySQL contributors to date.

Jay Pipes
Marten Mickos – MySQL CEO
Stewart Smith
Kevin Burton
Farhan Mashaqi
Jeremy Cole

It’s almost like a chain letter, so I’ll start it by passing it onto 3 more, my challenge is to: — Mike Kruckenberg, my evil(he isn’t really evil) twin – Roland Bouman & Paul McCullagh.

27 June 2007 Update
Since my posting we have also had:
Antony Curtis
Alan Kasindorf
Jim Winstead
Jonathon Coombes