Recent Presentations Buenos Aires MySQL/NoSQL/Cloud Conference

The first annual Latin America MySQL/NoSQL/Cloud Conference was held in Buenos Aires Argentina from June 26-28. Kudos to Santiago Lertora from Binlogic who had the vision for the conference in his country and made it happen. I look forward to the second annual event.

My first presentation was “Improving Performance with Better Indexes”. This presentation details the six steps to SQL performance analysis, Capture, Identify, Confirm, Analyze, Optimize and Verify. An explanation of MySQL EXPLAIN, and working examples to create indexes and better covering indexes in several examples are provided. A production example of a 13 table join is used to detail how covering indexes and partial column indexes can make a dramatic improvement in performance. Download Presentation (PDF).

More detailed information about EXPLAIN and creating indexes is available in book Effective MySQL: Optimizing SQL Statements.

My second presentation was “MySQL Backup and Recovery Essentials”. This presentation covers the most common options for MySQL backup and the respective restore options. Also covered is the importance of the master binary logs and point in time recovery capabilities. Download Presentation (PDF)

More detailed information about the right backup and recovery strategy and associated tools is available in book Effective MySQL: Backup and Recovery.

References

Latin America MySQL/NoSQL/Cloud Conference Program.

UTF-8 with MySQL and LAMP

A recent question on a mailing list was the best practices for UTF-8 and PHP/MySQL. The following are the configurations I used in my multi-language projects.

MySQL UTF-8 Configuration

# my.cnf
[mysqld]
default_character_set = utf8
character_set_client       = utf8
character_set_server       = utf8
[client]
default_character_set = utf8

PHP UTF-8 Configuration

#php.ini
default_charset = "utf-8"

Apache UTF-8 Configuration

#httpd.conf
AddDefaultCharset UTF-8
<VirtualHost>
    AddCharset UTF-8   .htm
</VirtualHost>

HTML file UTF-8 Configuration

 <meta charset="utf-8">

PHP file UTF-8 Configuration

header('Content-type: text/html; charset=UTF-8');

MySQL connection (extra precaution)

SET NAMES utf8;

Shell UTF-8

And last but not least, even editing files in shell can be affected (.e.g UTF-8 data to be inserted into DB from file). Ensure at least

#~/.profile
export LC_CTYPE=en_US.UTF-8
export LANG=en_US.UTF-8

Amateurs – They give us professionals a bad name

Any person with half a brain would see from the error messages below that the MySQL server is not operating optimally, or more specifically the MySQL upgrade has not completely successfully and let users can go happily use the website. It amazing me when web hosting providers tell their paying client that an upgrade has been performed yet they did not have the intelligence to actually look at the error log for confirmation. Got a mysql> prompt, it’s all good. One of the first things I check is the error log.

When will people learn the MySQL error log is a valuable resource both for what it contains, and what it should not contain.

120426 17:36:00 [Note] /usr/libexec/mysqld: Shutdown complete

120426 17:36:00 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
120426 17:36:00 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120426 17:36:00 [Note] Plugin 'FEDERATED' is disabled.
/usr/libexec/mysqld: Table 'mysql.plugin' doesn't exist
120426 17:36:00 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
120426 17:36:00 InnoDB: The InnoDB memory heap is disabled
120426 17:36:00 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120426 17:36:00 InnoDB: Compressed tables use zlib 1.2.3
120426 17:36:00 InnoDB: Using Linux native AIO
120426 17:36:00 InnoDB: Initializing buffer pool, size = 128.0M
120426 17:36:00 InnoDB: Completed initialization of buffer pool
120426 17:36:00 InnoDB: highest supported file format is Barracuda.
120426 17:36:00  InnoDB: Waiting for the background threads to start
120426 17:36:01 InnoDB: 1.1.8 started; log sequence number 232577699
120426 17:36:01 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it
120426 17:36:01 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_current' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_history' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_history_long' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_consumers' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_instruments' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_timers' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'performance_timers' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'threads' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_thread_by_event_name' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_instance' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_global_by_event_name' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_summary_by_event_name' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_summary_by_instance' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'mutex_instances' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'rwlock_instances' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'cond_instances' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_instances' has the wrong structure
120426 17:36:01 [ERROR] Column count of mysql.db is wrong. Expected 22, found 20. Created with MySQL 50077, now running 50523. Please use mysql_upgrade to fix this error.
120426 17:36:01 [ERROR] mysql.user has no `Event_priv` column at position 29
120426 17:36:01 [ERROR] Cannot open mysql.event
120426 17:36:01 [ERROR] Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.
120426 17:36:01 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.23-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL) by Atomicorp
120426 17:46:01 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it
120426 17:46:01 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
120426 17:46:01 [ERROR] Column count of mysql.proc is wrong. Expected 20, found 16. Created with MySQL 50077, now running 50523. Please use mysql_upgrade to fix this error.

Some more light reading at Have you checked your MySQL error log today? and Monitoring MySQL – The error log

An excellent conference (5 out of 5 stars)

I wanted to extend thanks as others have also about the excellent annual MySQL Conference, now a Percona Live event. This was easily the best run, attended and energetic event in at least the past 3 years. With over a 1000 attendees a well stocked exhibitors hall (and good involvement in the hall), and good talks; there was just a great community vibe. To Terry, Kortney and all Percona staff involved, well done. The event ran on time, I personally did not see or hear of any issues. The only complaint was from many that wanted to attend multiple talks at the same time, another indication of the quality of speakers for the event.

Thank you to those that attended my two sessions on Explaining the MySQL Explain and MySQL Idiosyncrasies that BITE. Many people thanked me after presentations, along also with people coming up to me to say they appreciated the first book of the Effective MySQL Series. My desires to speak and write are only for the benefit of the MySQL community to hopefully learn and appreciate how to best use MySQL.

It was of course great to see many MySQL alumni, and old friends I have seen since meeting at my first MySQL conference in 2006.

SQL_MODE and storage engines

I was again reminded why setting SQL_MODE is so important in any new MySQL environment. While performing benchmark tests on parallel backup features with a common InnoDB tablespace and per file tablespace, I inadvertently missed an important step in the data migration. The result was the subsequent test that performed data population worked without any issues however there was no data in any InnoDB tables.

These are the steps used in the migration of InnoDB tables from a common tablespace model to a per-table tablespace model.

  1. Dump all InnoDB tables
  2. Drop all InnoDB tables
  3. Shutdown MySQL
  4. Change the my.cnf to include innodb-file-per-table
  5. Remove the InnoDB ibdata1 tablespace file
  6. Remove the InnoDB transactional log files
  7. Start MySQL
  8. Verify the error log
  9. Create and load new InnoDB tables

However, step 6 was not performed correctly due to a sudo+shell wildcard issue. The result was MySQL started, and tables were subsequently created incorrectly. What should have happened was:

mysql> CREATE TABLE `album` (
    ->   `album_id` int(10) unsigned NOT NULL,
    ->   `artist_id` int(10) unsigned NOT NULL,
    ->   `album_type_id` int(10) unsigned NOT NULL,
    ->   `name` varchar(255) NOT NULL,
    ->   `first_released` year(4) NOT NULL,
    ->   `country_id` smallint(5) unsigned DEFAULT NULL,
    ->   PRIMARY KEY (`album_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ERROR 1286 (42000): Unknown table engine 'InnoDB'

However, because by default MySQL will fallback to the legacy default of MyISAM, no actual error occurred. In order for this to produce an error, an appropriate SQL_MODE is necessary.

mysql> SET GLOBAL sql_mode='NO_ENGINE_SUBSTITUTION';

A check of the MySQL error log shows why InnoDB was not available.

120309  0:59:36  InnoDB: Starting shutdown...
120309  0:59:40  InnoDB: Shutdown completed; log sequence number 0 1087119693
120309  0:59:40 [Note] /usr/sbin/mysqld: Shutdown complete

120309  1:00:16 [Warning] No argument was provided to --log-bin, and --log-bin-index was not used; so replication may break when this MySQL server acts as a master and has his hostname changed!! Please use '--log-bin=ip-10-190-238-14-bin' to avoid this problem.
120309  1:00:16 [Note] Plugin 'FEDERATED' is disabled.
120309  1:00:16  InnoDB: Initializing buffer pool, size = 500.0M
120309  1:00:16  InnoDB: Completed initialization of buffer pool
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
120309  1:00:16  InnoDB: Setting file ./ibdata1 size to 64 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Error: all log files must be created at the same time.
InnoDB: All log files must be created also in database creation.
InnoDB: If you want bigger or smaller log files, shut down the
InnoDB: database and make sure there were no errors in shutdown.
InnoDB: Then delete the existing log files. Edit the .cnf file
InnoDB: and start the database again.
120309  1:00:17 [ERROR] Plugin 'InnoDB' init function returned error.
120309  1:00:17 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
120309  1:00:17 [Note] Event Scheduler: Loaded 0 events
120309  1:00:17 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.58-1ubuntu1-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Ubuntu)

NOTE: This was performed on Ubuntu using the standard distro MySQL version of MySQL 5.1.

As previously mentioned, SQL_MODE may not be perfect, however what features do exist warrant correctly configuration your MySQL environment not to use the default.

More Information.

Why SQL_MODE is essential even when not perfect

In a recent rant on Why I think SQL_MODE is useless…, I wanted to counteract this statement with why we MUST all use SQL_MODE, even with the inherit flaws.

The fundamental principle of a database is to restore and retrieve data. When I can insert data into the database and then I select this data it is different, this is fundamentally wrong. This is a loss of essential data integrity, something a database should NEVER do.

SQL_MODE solves the problem of “silent truncation” in most instances, and produces an all important error. As pointed out, the SQL_MODE has several limitations, however the benefits do out way the risks. Quality control on source code can reduce the limitations, but no amount of coding can stop the CRUD that comes out of the database without some SQL_MODE settings.

I would ask two more important questions.

  1. How in the first place can such a critical feature of silent data truncation ever be permitted in MySQL? Who made that decision and why?
  2. When is the owner of MySQL codebase realize this is rather ridiculous and enforce essential minimual data integrity that can be obtain with options including STRICT_ALL_TABLES, NO_ZERO_DATE, NO_ZERO_IN_DATE and NO_ENGINE_SUBSTITUTION.

References

Colorado MySQL Users Group Presentation

In addition to speaking at RMOUG event in Denver, I also spoke today in Broomfield on “Improving MySQL Performance with Better Indexes”.

This presentation included details on :

  • Effective examples of capture SQL via application logging and TCP/IP analysis necessary for identifying the best candidates. Slow is not always the best query to start with.
  • All the commands necessary to identify why you may need an index, how to create varying types of indexes, and how to confirm there true effectiveness.
  • How the number of table rows and different storage engines can greatly effect the optimization choice and query performance
  • The presentation shows how to determine/create and verify covering indexes for a single table example, a master/child example and a production 13 table join.

You can obtain the slides from Improving MySQL Performance With Better Indexes Presentation.

Event Details

MySQL Security Essentials Presentation

Today at the RMOUG Training Days 2012 event I gave an introduction presentation on MySQL Security Essentials covering the following topics:

  • MySQL Security defaults
  • MySQL Security Improvements
  • OS Security
  • User Privileges
  • Data Integrity
  • Installation Practices
  • Auditing Options
  • Better Security
  • Further References

Download slides for MySQL Security Essentials.

Exciting upcoming MySQL events

At the IOUC leaders’ summit in San Francisco this week, key leaders from Oracle, Java and MySQL user groups world wide have been meeting. This has included the key Oracle MySQL resources from the community, marketing and product teams. The Java User Groups and MySQL User Groups have been well represented and there has been very welcoming discussion with the IOUC about how we can become active within the Oracle Community.

There has been key discussions of upcoming and proposed MySQL events including the great outreach by the Oracle MySQL team with existing Open Source conferences this year including Scale, FOSDEM and South East Linuxfest just to name a few.

You can see the upcoming events at http://mysql.com/news-and-events/events/. In February alone there will be events in North Carolina, California, Texas, Frankfurt Germany and Stockholm Sweden. I will also be speaking in Denver at 2 events and Salt Lake City.

There is a much longer list then what is shown here, and we are working on getting the full list more available.

More info by Dave Stokes at SCaLE’s MySQL Day a big hit and Keith Larson More User Groups.

Why is searching the manual so hard

As a consultant I often use the MySQL Reference Manual to provide additional information for clients. I am very happy to recognize the quality of the content in the MySQL documentation, but why is the searching of the manual so, so bad?

While reading the General Security Issues section of the MySQL 5.5 manual, I performed a search for “CREATE USER”. I was not asking for anything abstract, this is an actual SQL command. I was rather horrified to find that the results could not even list the appropriate manual page in the first page of results.

I am not an expert in full-text search, however it does not take a rocket scientist to realize that a SQL keyword, the title of a page, in the language of the current page (English) and the current version of the Manual (5.5) should be an easy result. This is a simple weighted result right? Wrong.



The most important MySQL Reference Manual page

In my opinion, The Server Option and Variable Reference at http://dev.mysql.com/doc/refman/5.5/en/mysqld-option-tables.html rates as my most important page. This is a consolidated index that enables a drill down to the Server Command Options, System Variables, Startup and replication specifics, as well as important information on default values and differences between versions including point releases.

However, there is another page not in the actual manual, but at http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-optvar.html which is an Options/Variables reference akin to the Reference Manual, but includes a 5.x version matrix.

Recently I was asked about some options that had to be removed from an upgrade to MySQL 5.5. Some of these were obvious, however not all. This page enabled me to confirm deprecation (as expected), and also point to important reference material.

These options where:

  • default_table_type
  • log_long_format
  • master-connect-retry
  • default-character-set

The use of table_type was a 3.x/4.x relic, replaced with engine, so I was surprised this option was still even valid. The option replaced with default_storage_engine. The page defined this as deprecated in MySQL 5.0
log_long_format is also old, and definitely modified since MySQL 5.1 with the general_log_xxx options. In fact this has been deprecated since 4.1
I have never liked the master-xxx options, in favor of a CHANGE MASTER command and synchronization issues with the master.info file and master-xx options. master-connect-retry and several other options were deprecated in 5.1.17. On a side note, if you look at this option in the MySQL 5.5 Reference Manual you get Obsolete options. The following options are removed in MySQL 5.5. If you attempt to start mysqld with any of these options in MySQL 5.5, the server aborts with an unknown variable error. To set the replication parameters formerly associated with these options, you must use the CHANGE MASTER TO … statement (see Section 12.4.2.1, “CHANGE MASTER TO Syntax”).
Finally default-character-set. Initially I thought that is definitely still applicable, however the handy cross reference reminded me, this is also deprecated in MySQL 5.0 and the Reference Manual again stating default-character-set is also deprecated in 5.0 in favor of character-set-server.. The name has simply changed in newer version.

With so many options and as a consultant I work with many different versions each week, I sometimes need a refresher of the changes in the versions of the past 5 years. Definitely my second most important page.

If you have a favorite page, please let me know.

I would also like to say thank you to the Oracle/MySQL Documentation team that do a great job in providing an excellent resource to an open source product. We would all do well to appreciate this in comparison to other open source documentation in companion technologies and related tools. With every new release of a MySQL product you don’t realize that somebody reviewed, tested and wrote about features without receive the limelight.

Binary Log Replayer

When using the replication slave stream, or mysql command line client and mysqlbinlog output from a binary/relay log, all statements are executed in a single thread as quickly as possible.

I am seeking a tool to simulate the replay of the binary/relay log for a benchmark at a pace that is more representative to original statements. For a simple example, if the Binary Log has 3 transactions in the first second, 2 transactions in the second second, and 5 transactions in the third second, I am wanting to simulate the replay to take roughly 3 seconds, not as fast as possible (which would be sub-second). The tool should try to wait the remainder of a second before processing SQL statements in the incoming stream.

Does anybody know of a tool that currently provides this type of functionality? Any input appreciated before I create my own.

Reasons to use MySQL 5.5 Presentation

I recently gave a presentation at the New York Effective MySQL Meetup on the new features of, and some of the compelling reasons to upgrade to MySQL 5.5. There are also a number of new MySQL variables that can have a dramatic effect on performance in a highly transactional environment, innodb_buffer_pool_instances and innodb_purge_threads are just two to consider.

For more information on all the new variables, status, reserved words and benchmarks of new features you can Download Presentation Slides.

Utilizing multiple indexes per MySQL table join

Historically it was considered that MySQL will generally use only one index per referenced table in a SQL query. In MySQL 5.0 the introduction of merge indexes enabled for certain conditions the possibility to utilize two indexes however this could result in worst performance then creating a better index. In MySQL 5.1 it became possible to control optimization switches with the optimizer_switch system variable.

However in explaining how to utilize the intersection, union and sort union in queries I discovered that MySQL could use three indexes for one given table.

        Extra: Using union(name,intersect(founded,type)); Using where

I was not aware of this.

Extra: Using Index

Many people consider this information in the MySQL Query Execution Plan (QEP) to indicate that the referenced table is using an index. It actually means that ONLY the index is used. This can for larger and more frequent queries provide a significant boost.

In a recent example, adding an index dropped a query from 190ms to 6ms. However adding a better index dropped that 6ms query to 1.2ms. When executed 100s/1000s of times per second these millisecond improvements can have a huge benefit in greater scalability. While people often tune slow running queries, in a well tuned system shaving milliseconds of queries, in this example making 6ms query 80% better is a far greater improvement.

You can get a detailed explanation of how to identify, create and verify covering indexes from my Percona Live presentation Improving performance with better indexes where I also include another great 10 table join example, reducing a query running 20,000+ times per second from 175ms to 10ms.

Upcoming MySQL presentation in New York

On Tuesday I will be speaking in New York at the Effective MySQL Meetup group where I will be giving the presentation “MySQL Idiosyncrasies That Bite”. For more information and to register, check out the Meetup Event. There are just 10 seats left.

To promote the upcoming Percona Live event in New York, for attendees to the Meetup there will be a draw for a FREE ticket to the May 26th event.

Effective MySQL New York is the only MySQL group now operating in New York. Please join our group for the latest information and events for the MySQL community.

Basic scalability principles to avert downtime

In the press in the last two days has been the reported outage of Amazon Web Services Elastic Compute Cloud (EC2) in just one North Virginia data center. This has affected many large website includes FourSquare, Hootsuite, Reddit and Quora. A detailed list can be found at ec2disabled.com.

For these popular websites was this avoidable? Absolutely.

Basic scalability principles if deployed in these systems architecture would have averted the significant downtime regardless of your development stack. While I work primarily in MySQL these principles are not new, nor are they complicated, however they are fundamental concepts in scalability that apply to any technology including the popular MongoDB that is being used by a number of affected sites.

Scalability 101 involves some simple basic rules. Here are just two that seem to have been ignored by many affected by this recent AWS EC2 outage.

  1. Never put all your eggs in one basket. If you rely on AWS completely, or you rely on just one availability zone that is putting all your eggs in one basket.
  2. Always keep your important data close to home. When it comes to what is most critical to your business you need access and control to your information. At 5am in the morning when the CEO asks how long will our business be unavailabla and what is needed to resolve the problem, the answer “We have no control over this and have no ETA” is not an acceptable answer.

With a successful implementation and appropriate data redundancy you may not have an environment immediately available however you have access to your important information and the ability to create one quickly. Many large hosting companies can provide additional H/W on near demand, especially if you have an initial minimal footprint. Indeed using Amazon Web Services (AWS) as a means to avert a data center disaster is an ideal implementation of Infrastructure As A Service (IAAS). Even with this issue, organizations that had planned for this type of outage could have easily migrated to another AWS availability zone that was unaffected.

Furthermore, system architecture to support various levels of data availability and scalability ensure you can handle many more various types of unavailability without significant system down time as recently seen. There are many different types of availability and unavailability, know what your definition of downtime is and supporting disasters should be your primary focus of scalability, not an after thought.

As an expert in performance and scalability I can help your organization in the design of a suitable architecture to support successful scalability and disaster. This is not rocket science however many organizations gamble without the expertise of a professional to ensure business viability.

Use Replication for backups? Are you schemas consistent?

Many people have a master/slave MySQL environment of various different topologies, and many use the slave as a backup.
Is your slave schema identical to your production schema? As long as an SQL statements completes without an error, your slave schema can differ. Common examples are different indexes or storage engines for example. However if you use the slave as backup, you want to ensure when you recover, you are recovering a production environment, not a slave environment.

While the changes may be small, the can lead to different results. For a client I found that the default value of a price field was 10.00 in one schema and 0.00 in another. Not withstanding using defaults for important fields and not defining in an INSERT is a different issue, it highlighted the different schemas can easily exist.

My tool of choice is SchemaSync. The reason why I like this command is that it provides migration scripts to upgrade or downgrade your schemas.

I vote for Planet MySQL moderation

How this happens or who does it is obviously a larger and more complex conversation however it is better then involving innocent animals.

How is it that trivial $#*! gets voted and has a string of comments I can not explain, however Planet MySQL should have practical MySQL related content. I would vote for the front page and RSS feed to show only moderated content, and content that did not pass the cut (and there is a lot of this) can be available on a less important page if necessary.

Rather then complaining like many open source communities, let us propose a way to improve the system we use.

Checked that MySQL backup log lately?

Running a MySQL backup and ensuring it completed successfully and backup files exist is not enough. In my B&R Quiz from Checked your MySQL recovery process recently? one important step is “Do you review your backup logs EVERY SINGLE day or have tested backup monitoring in place?”

This is what I found when reviewing a backup log for a client today.

mysqldump: Got error: 1142: SELECT,LOCK TABL command denied to user 'root'@'localhost' for table 'cond_instances' when using LOCK TABLES
mysqldump: Got error: 1142: SELECT,LOCK TABL command denied to user 'root'@'localhost' for table 'cond_instances' when using LOCK TABLES

The backup script was completing, backup files were in place (and are listed in the log file) however these errors were occurring.

Further investigation was less then one minute of actual work. This server runs multiple MySQL instances and recently one instance was upgraded from MySQL 5.1 to MySQL 5.5 however the call to mysqldump was not. This error was the result of running 5.1 mysqldump against a 5.5 database. Change the hard coded path and the error message went away.

The greater problem is teaching people to understand the importance of backups.

MySQL conference schedule

I am one of the crazy individuals(*) that will be speaking at both the regular O’Reilly MySQL Conference and the IOUG Collaborate conference both being held in the second week of April. My 4 presentations are:

Upcoming NY Presentation – How Better Indexes Save You Money

For all those in New York this is an upcoming MySQL presentation held in conjunction with our colleagues at General Assembly on March 22nd 2011.

This presentation “How Better Indexes Save You Money” will be discussing how one simple technique can result in huge MySQL performance improvements and with zero code changes necessary. Many people think they know indexes, however MySQL and MySQL Storage Engines have some specifics that differ from more traditional RDBMS products. Learn some of the key analysis and verification techniques and be able to see immediate potential results in performance.

You can find more details at Meetup.com EffectiveMySQL. This new group is all about highly technical MySQL related content “no fluff, just stuff”.

Your PHP installation appears to be missing the MySQL extension which is required by WordPress.

I recently deployed a new WordPress installation to my existing production webserver running Apache, MySQL and PHP for other websites, yet I was presented with the following message.

“Your PHP installation appears to be missing the MySQL extension which is required by WordPress.”

This thread at wordpress.org did not help me, however I was able to solve the problem, but this thread is now marked as closed. That’s poor form because I can’t share the solution I found.

My PHP configuration file did not have the following.

#php.ini
[PHP]
extension=mysql.so

Adding this and restarting Apache did not fix the problem.

The problem was more fundamental and required PHP to be recompiled. Orginally PHP was configured with the ‘–with-mysqli’ option. PHP requires the ‘–with-mysql’ which is rather stupid they have this dependency.

Recompiling PHP and adding the necessary extension were both necessary to get my new WordPress installation operational.

Part 2 – Simple lessons in improving scalability

Given the popular response from my first lesson in improving scalability where I detailed simple ways to eliminate unnecessary SQL, let me share another common bottleneck with MySQL scalability that can be instantly overcome.

Analyzing the writes that occur on a system can expose obvious potential bottlenecks. The MySQL Binary Log is a wealth of information that can be mined. Simple DML Counts per table can be achieved by a single line command.

Let’s look at the following example output of a production system:

mysqlbinlog /path/to/mysql-bin.000999 |  
   grep -i -e "^update" -e "^insert" -e "^delete" -e "^replace" -e "^alter"  | 
   cut -c1-100 | tr '[A-Z]' '[a-z]' |  
   sed -e "s/t/ /g;s/`//g;s/(.*$//;s/ set .*$//;s/ as .*$//" | sed -e "s/ where .*$//" |  
   sort | uniq -c | sort -nr

Of the approx 100,000 DML statements we get the following breakdown.

  55283 update sessions
  25204 insert into sessions
  12610 update items
  10536 insert into item_categories
   7532 update users
   5168 delete from item_categories
 

More then 50% of the statements that are written to the binary log and therefore replicated are INSERT’s into the sessions table. A further 25% are UPDATE’s to the same table. This represents 75% of DML statements in just the two most frequent statements.

What is disappointing is that these statements do not belong in MySQL. This is an example of when MySQL is being abused for a purpose where other products are more suited. While there is the argument in using MySQL for storing data, the impact in MySQL memory management, backup/recovery and slave replication throughput and lag can significantly impact scalability of your important MySQL data.

What is observed here is session management where a key value store product should be used as an alternative. In most circumstances it is likely this information is not even required to be persisted. The obvious replacement here is using memcached. If you do wish to persist this data there is an ever increasing list of products including Redis, Tokyo Cabinet/Kyoto Cabinent, Membrain, Voldemort etc that are specifically designed as a key-value store. Even the popular noSQL MongoDB can be easily substituted to perform as a key-value session manager with the added benefits of being a more fully functional product for other purposes.

This is often a common mistake when you use a framework such as Ruby on Rails (RoR) or PHP Code Igniter and many others.

Optimizing UPDATE and DELETE statements

Updated Nov 2011. Check out my latest book on Optimizing SQL Statements for more information. MySQL 5.6.2 also now provides an EXPLAIN syntax for UPDATE and DELETE statements natively.

While most people look at performance optimizations for SELECT statements, UPDATE and DELETE statements are often overlooked. These can benefit from the principles of analyzing the Query Execution Plan (QEP). You can only run an EXPLAIN on a SELECT statement, however it’s possible to rewrite an UPDATE or DELETE statement to perform like a SELECT statement.

To optimize an UPDATE, look at the WHERE clause. If you are using the PRIMARY KEY, no further analysis is necessary. If you are not, it is of benefit to rewrite your UPDATE statement as a SELECT statement and obtain a QEP as previously detailed to ensure optimal indexes are used. For example:

UPDATE t
SET	c1 = ‘x’, c2 = ‘y’, c3 = 100
WHERE c1 = ‘x’
AND	d = CURDATE()

You can rewrite this UPDATE statement as a SELECT statement for using EXPLAIN:

EXPLAIN SELECT c1, c2, c3 FROM	t WHERE c1 = ‘x’ AND	d = CURDATE()

You should now apply the same principles as you would when optimizing SELECT statements.

Simple lessons in improving scalability

It can be very easy to improve scalability with a MySQL server by a few simple rules. Here is one of them.

“The most efficient way to improve an SQL statement is to eliminate it”

There are numerous ways to eliminate SQL statements, however before I give a classic example that I’ve observed again with a client, let me explain the basic premise of why this improves scalability?

The MySQL kernel can only physically process a certain number of SQL statements for a given time period (e.g. per second). Regardless of the type of machine you have, there is a physical limit. If you eliminate SQL statements that are unwarranted and unnecessary, you automatically enable more important SQL statements to run. There are numerous other downstream affects, however this is the simple math. To run more SQL, reduce the number of SQL you need to run.

Here is the output of a small sample of analyzed TCP/IP packets via mk-query-digest.

# Rank Query ID           Response time Calls R/Call Apdx V/M   Item
# ==== ================== ============= ===== ====== ==== ===== ==========
#    1 0xD631CB919867DB50  0.0436 47.3%    92 0.0005 1.00  0.00 SELECT TTDOD
#    2 0x04FE01C5B31FD305  0.0258 27.9%   329 0.0001 1.00  0.00 ADMIN PING
#    3 0x93321857BCD8E771  0.0229 24.8%    36 0.0006 1.00  0.00 SELECT TTD

There are many problems here including the Row at a Time (RAT) nature of the SQL, the excessive pings however that’s a topic for another time. Let us look at the first statement.

SELECT `Date` FROM TTDOD WHERE ID = 9999;

That seems a simple enough query however let’s look at the table.

mysql> select count(*) from TTDOD;
+----------+
| count(*) |
+----------+
|        0 |
+----------+

In this case, the query will NEVER return any rows because the table is currently empty. Sure this may change in the future, however as this is more an exception processing situation the simple act of managing the knowledge this table rarely has any rows, and building a solution to inform the application of this can completely eliminate the need for this query to ever be executed.

FYI, the above sample is from less then 2 seconds of sampling. Removing the first query reduces the number of queries executed in this time slice by 20%. Regardless of whether this is typical load or load during a batch job the principle stands. We have not even started to look at what we can do with the next query.

MySQL 5.5.9

You blink and there is a new version. I have not seen an Planet MySQL release as yet about this new version. Release Notes.

I’d like to say I installed it, but I downloaded the Linux – Generic 2.6 (x86, 64-bit), TAR file, only to find it contains 6 rpm files. #fail, I’m using Ubuntu.

You have to scroll to the bottom of the list (another stupid thing for a generic binary choice) to get Linux – Generic 2.6 (x86, 64-bit), Compressed TAR Archive. Double #fail

Leveraging the InnoDB Plugin

Beginning with MySQL 5.1 as an additional plugin and included by default in MySQL 5.5 the InnoDB plugin includes many performance improvements. To leverage the support of new file formats however a very important setting is necessary.

#my.cnf
[mysqld]
innodb_file_per_table

The use of innodb_file_per_table with an existing system or during an upgrade to 5.1 or 5.5 requires a complete reload of your database to use effectively. In summary.

  • Backup all InnoDB tables via mysqldump
  • Drop InnoDB tables
  • Verify InnoDB not used
  • Stop MySQL
  • Enable innodb_file_per_table & simplified innodb_data_file_path (if applicable)
  • Remove ibdata? files
  • Start MySQL
  • Create Tables
  • Reload Data
  • Verify InnoDB Operation
    • The primary reason is we are moving from using a common tablespace to a tablespace per table. InnoDB wil not shrink the common tablespace so this process is necessary in order to purge the diskspace currently being used. You should also reduce your innodb_data_file_path options if specified. For example if currently set to :

      innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend
      

      I would suggest you change to

      innodb_data_file_path = ibdata1:100M:autoextend
      

      InnoDB still requires this common tablespace, however now each table has it’s own disk file the volume required is signficantly less.

Interesting MySQL 5.5 upgrade gotcha

Today I discovered an interesting upgrade problem with a client migrating from MySQL 5.0 to 5.5. The client who is undertaking the upgrade reported that MySQL 5.5 did not support the DECIMAL(18,5) data type. I easily confirmed this not to be the case:

mysql> drop table if exists x;
mysql> create table x (col1 DECIMAL(18,5));
Query OK, 0 rows affected (0.01 sec)

Delving more into the issue in question, I looked at the complete CREATE TABLE statement, recreating the syntax.

mysql> drop table if exists x;
mysql> create table x ( MinValue DECIMAL(18,5));
Query OK, 0 rows affected (0.00 sec)

No problem there.

mysql> drop table if exists x;
mysql> create table x (Department INT NOT NULL, MinValue DECIMAL(18,5) NULL, MaxValue DECIMAL(18,5) NULL);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'MaxValue DECIMAL(18,5) NULL)' at line 1

I could immediately see the problem because if you look closely at the error message it starts with the word “MaxValue”, most likely this is a reserved word. This was easily confirmed with:

mysql> create table x (Department INT NOT NULL, MinValue DECIMAL(18,5) NULL, `MaxValue` DECIMAL(18,5) NULL);
Query OK, 0 rows affected (0.00 sec)

And a confirmation with the 5.5 Reference Manual does indeed show there are new reserved words in MySQL 5.5. These are:

  • GENERAL
  • IGNORE_SERVER_IDS
  • MASTER_HEARTBEAT_PERIOD
  • MAXVALUE
  • RESIGNAL SIGNAL
  • SLOW