I want a mysqldump –ignore-database option

While working with RDS and Google Cloud SQL I have come to realize that excluding the mysql schema from a mysqldump is important. However with many databases, the –all-databases option enables you only to select all or none. There is however an easy solution to exclude one or more databases in mysqldump with this little gem I created.

$ time mysqldump --databases `mysql --skip-column-names -e "SELECT GROUP_CONCAT(schema_name SEPARATOR ' ') FROM information_schema.schemata WHERE schema_name NOT IN ('mysql','performance_schema','information_schema');" >` >/mysql/backup/rds2.sql

An you can exclude as many schemas as you want.

I checked the mysqldump –help, there was no option in MySQL 5.1, asked a colleague just to be sure I wasn’t wasting my time, and it took all of 2 minutes to create and test a working solution.

When is a database schema not a database schema?

mysql> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| innodb             |
| mysql              |
| performance_schema |
+--------------------+
4 rows in set (0.00 sec)
mysql> drop schema innodb;
ERROR 1010 (HY000): Error dropping database (can't rmdir './innodb/', errno: 17)

This is an additional schema that is included in an AWS RDS installation. You should not put directories in the MySQL data directory.

An excellent conference (5 out of 5 stars)

I wanted to extend thanks as others have also about the excellent annual MySQL Conference, now a Percona Live event. This was easily the best run, attended and energetic event in at least the past 3 years. With over a 1000 attendees a well stocked exhibitors hall (and good involvement in the hall), and good talks; there was just a great community vibe. To Terry, Kortney and all Percona staff involved, well done. The event ran on time, I personally did not see or hear of any issues. The only complaint was from many that wanted to attend multiple talks at the same time, another indication of the quality of speakers for the event.

Thank you to those that attended my two sessions on Explaining the MySQL Explain and MySQL Idiosyncrasies that BITE. Many people thanked me after presentations, along also with people coming up to me to say they appreciated the first book of the Effective MySQL Series. My desires to speak and write are only for the benefit of the MySQL community to hopefully learn and appreciate how to best use MySQL.

It was of course great to see many MySQL alumni, and old friends I have seen since meeting at my first MySQL conference in 2006.

MySQL now has two user conferences (*)

PC World has written a post with this title(*) about the upcoming MySQL Connect conference and references the Percona Live conference and an official Percona comment. As this is not syndicated in Planet MySQL I encourage you to read the full article.

This is the MySQL conference to get technical presentations by the many great Oracle/MySQL technical staff who will not be in attendance at Percona Live. There will also be a strong community presence in speaking at Oracle Connect in September. While Oracle was organizing a dedicated MySQL event in April for the community with all vendors including Percona to replace the conference dropped by long term partner O’Reilly (kudos for many years of great events), Percona decided to go at it without including Oracle, the owners and developers of MySQL. The statement quoted in the PC World article regarding “lack of momentum around the [annual community] event” is clearly inaccurate and not a true representation of actual events.

It is difficult to keep up with all the community events Oracle is now running including multiple OTN MySQL Developer days per month across the US and Europe. I will be speaking at the upcoming OTN NY event in April, the Rocky Mountain Training Day in May, and hopefully the MySQL Innovation day in June. Get the full list at Upcoming MySQL Events.

Indeed MySQL content and presentations have also been represented at Oracle Open World for a number of years. 2011 was a very large turnout and many MySQL presentations. As a senior consultant for MySQL Inc I manned a MySQL booth at OOW exhibition hall back in 2007, prior to both Sun and Oracle acquisitions.

SQL_MODE and storage engines

I was again reminded why setting SQL_MODE is so important in any new MySQL environment. While performing benchmark tests on parallel backup features with a common InnoDB tablespace and per file tablespace, I inadvertently missed an important step in the data migration. The result was the subsequent test that performed data population worked without any issues however there was no data in any InnoDB tables.

These are the steps used in the migration of InnoDB tables from a common tablespace model to a per-table tablespace model.

  1. Dump all InnoDB tables
  2. Drop all InnoDB tables
  3. Shutdown MySQL
  4. Change the my.cnf to include innodb-file-per-table
  5. Remove the InnoDB ibdata1 tablespace file
  6. Remove the InnoDB transactional log files
  7. Start MySQL
  8. Verify the error log
  9. Create and load new InnoDB tables

However, step 6 was not performed correctly due to a sudo+shell wildcard issue. The result was MySQL started, and tables were subsequently created incorrectly. What should have happened was:

mysql> CREATE TABLE `album` (
    ->   `album_id` int(10) unsigned NOT NULL,
    ->   `artist_id` int(10) unsigned NOT NULL,
    ->   `album_type_id` int(10) unsigned NOT NULL,
    ->   `name` varchar(255) NOT NULL,
    ->   `first_released` year(4) NOT NULL,
    ->   `country_id` smallint(5) unsigned DEFAULT NULL,
    ->   PRIMARY KEY (`album_id`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ERROR 1286 (42000): Unknown table engine 'InnoDB'

However, because by default MySQL will fallback to the legacy default of MyISAM, no actual error occurred. In order for this to produce an error, an appropriate SQL_MODE is necessary.

mysql> SET GLOBAL sql_mode='NO_ENGINE_SUBSTITUTION';

A check of the MySQL error log shows why InnoDB was not available.

120309  0:59:36  InnoDB: Starting shutdown...
120309  0:59:40  InnoDB: Shutdown completed; log sequence number 0 1087119693
120309  0:59:40 [Note] /usr/sbin/mysqld: Shutdown complete

120309  1:00:16 [Warning] No argument was provided to --log-bin, and --log-bin-index was not used; so replication may break when this MySQL server acts as a master and has his hostname changed!! Please use '--log-bin=ip-10-190-238-14-bin' to avoid this problem.
120309  1:00:16 [Note] Plugin 'FEDERATED' is disabled.
120309  1:00:16  InnoDB: Initializing buffer pool, size = 500.0M
120309  1:00:16  InnoDB: Completed initialization of buffer pool
InnoDB: The first specified data file ./ibdata1 did not exist:
InnoDB: a new database to be created!
120309  1:00:16  InnoDB: Setting file ./ibdata1 size to 64 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Error: all log files must be created at the same time.
InnoDB: All log files must be created also in database creation.
InnoDB: If you want bigger or smaller log files, shut down the
InnoDB: database and make sure there were no errors in shutdown.
InnoDB: Then delete the existing log files. Edit the .cnf file
InnoDB: and start the database again.
120309  1:00:17 [ERROR] Plugin 'InnoDB' init function returned error.
120309  1:00:17 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
120309  1:00:17 [Note] Event Scheduler: Loaded 0 events
120309  1:00:17 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.58-1ubuntu1-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Ubuntu)

NOTE: This was performed on Ubuntu using the standard distro MySQL version of MySQL 5.1.

As previously mentioned, SQL_MODE may not be perfect, however what features do exist warrant correctly configuration your MySQL environment not to use the default.

More Information.

IOUC Leaders Summit Presentation

At the recent leaders summit of world leaders in Oracle, Java and MySQL user communities I gave a presentation on Why Upgrade to MySQL 5.5

This is a more high level overview presentation, specifically designed for Oracle resources with little to no knowledge about MySQL, however it provides a great management approach to the consideration of using the current MySQL GA version.

I have a more detailed technical presentation from last year on Reasons to Use MySQL 5.5.

Why SQL_MODE is essential even when not perfect

In a recent rant on Why I think SQL_MODE is useless…, I wanted to counteract this statement with why we MUST all use SQL_MODE, even with the inherit flaws.

The fundamental principle of a database is to restore and retrieve data. When I can insert data into the database and then I select this data it is different, this is fundamentally wrong. This is a loss of essential data integrity, something a database should NEVER do.

SQL_MODE solves the problem of “silent truncation” in most instances, and produces an all important error. As pointed out, the SQL_MODE has several limitations, however the benefits do out way the risks. Quality control on source code can reduce the limitations, but no amount of coding can stop the CRUD that comes out of the database without some SQL_MODE settings.

I would ask two more important questions.

  1. How in the first place can such a critical feature of silent data truncation ever be permitted in MySQL? Who made that decision and why?
  2. When is the owner of MySQL codebase realize this is rather ridiculous and enforce essential minimual data integrity that can be obtain with options including STRICT_ALL_TABLES, NO_ZERO_DATE, NO_ZERO_IN_DATE and NO_ENGINE_SUBSTITUTION.

References

Colorado MySQL Users Group Presentation

In addition to speaking at RMOUG event in Denver, I also spoke today in Broomfield on “Improving MySQL Performance with Better Indexes”.

This presentation included details on :

  • Effective examples of capture SQL via application logging and TCP/IP analysis necessary for identifying the best candidates. Slow is not always the best query to start with.
  • All the commands necessary to identify why you may need an index, how to create varying types of indexes, and how to confirm there true effectiveness.
  • How the number of table rows and different storage engines can greatly effect the optimization choice and query performance
  • The presentation shows how to determine/create and verify covering indexes for a single table example, a master/child example and a production 13 table join.

You can obtain the slides from Improving MySQL Performance With Better Indexes Presentation.

Event Details

MySQL Security Essentials Presentation

Today at the RMOUG Training Days 2012 event I gave an introduction presentation on MySQL Security Essentials covering the following topics:

  • MySQL Security defaults
  • MySQL Security Improvements
  • OS Security
  • User Privileges
  • Data Integrity
  • Installation Practices
  • Auditing Options
  • Better Security
  • Further References

Download slides for MySQL Security Essentials.

Exciting upcoming MySQL events

At the IOUC leaders’ summit in San Francisco this week, key leaders from Oracle, Java and MySQL user groups world wide have been meeting. This has included the key Oracle MySQL resources from the community, marketing and product teams. The Java User Groups and MySQL User Groups have been well represented and there has been very welcoming discussion with the IOUC about how we can become active within the Oracle Community.

There has been key discussions of upcoming and proposed MySQL events including the great outreach by the Oracle MySQL team with existing Open Source conferences this year including Scale, FOSDEM and South East Linuxfest just to name a few.

You can see the upcoming events at http://mysql.com/news-and-events/events/. In February alone there will be events in North Carolina, California, Texas, Frankfurt Germany and Stockholm Sweden. I will also be speaking in Denver at 2 events and Salt Lake City.

There is a much longer list then what is shown here, and we are working on getting the full list more available.

More info by Dave Stokes at SCaLE’s MySQL Day a big hit and Keith Larson More User Groups.

Why is searching the manual so hard

As a consultant I often use the MySQL Reference Manual to provide additional information for clients. I am very happy to recognize the quality of the content in the MySQL documentation, but why is the searching of the manual so, so bad?

While reading the General Security Issues section of the MySQL 5.5 manual, I performed a search for “CREATE USER”. I was not asking for anything abstract, this is an actual SQL command. I was rather horrified to find that the results could not even list the appropriate manual page in the first page of results.

I am not an expert in full-text search, however it does not take a rocket scientist to realize that a SQL keyword, the title of a page, in the language of the current page (English) and the current version of the Manual (5.5) should be an easy result. This is a simple weighted result right? Wrong.



The most important MySQL Reference Manual page

In my opinion, The Server Option and Variable Reference at http://dev.mysql.com/doc/refman/5.5/en/mysqld-option-tables.html rates as my most important page. This is a consolidated index that enables a drill down to the Server Command Options, System Variables, Startup and replication specifics, as well as important information on default values and differences between versions including point releases.

However, there is another page not in the actual manual, but at http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-optvar.html which is an Options/Variables reference akin to the Reference Manual, but includes a 5.x version matrix.

Recently I was asked about some options that had to be removed from an upgrade to MySQL 5.5. Some of these were obvious, however not all. This page enabled me to confirm deprecation (as expected), and also point to important reference material.

These options where:

  • default_table_type
  • log_long_format
  • master-connect-retry
  • default-character-set

The use of table_type was a 3.x/4.x relic, replaced with engine, so I was surprised this option was still even valid. The option replaced with default_storage_engine. The page defined this as deprecated in MySQL 5.0
log_long_format is also old, and definitely modified since MySQL 5.1 with the general_log_xxx options. In fact this has been deprecated since 4.1
I have never liked the master-xxx options, in favor of a CHANGE MASTER command and synchronization issues with the master.info file and master-xx options. master-connect-retry and several other options were deprecated in 5.1.17. On a side note, if you look at this option in the MySQL 5.5 Reference Manual you get Obsolete options. The following options are removed in MySQL 5.5. If you attempt to start mysqld with any of these options in MySQL 5.5, the server aborts with an unknown variable error. To set the replication parameters formerly associated with these options, you must use the CHANGE MASTER TO … statement (see Section 12.4.2.1, “CHANGE MASTER TO Syntax”).
Finally default-character-set. Initially I thought that is definitely still applicable, however the handy cross reference reminded me, this is also deprecated in MySQL 5.0 and the Reference Manual again stating default-character-set is also deprecated in 5.0 in favor of character-set-server.. The name has simply changed in newer version.

With so many options and as a consultant I work with many different versions each week, I sometimes need a refresher of the changes in the versions of the past 5 years. Definitely my second most important page.

If you have a favorite page, please let me know.

I would also like to say thank you to the Oracle/MySQL Documentation team that do a great job in providing an excellent resource to an open source product. We would all do well to appreciate this in comparison to other open source documentation in companion technologies and related tools. With every new release of a MySQL product you don’t realize that somebody reviewed, tested and wrote about features without receive the limelight.

Binary Log Replayer

When using the replication slave stream, or mysql command line client and mysqlbinlog output from a binary/relay log, all statements are executed in a single thread as quickly as possible.

I am seeking a tool to simulate the replay of the binary/relay log for a benchmark at a pace that is more representative to original statements. For a simple example, if the Binary Log has 3 transactions in the first second, 2 transactions in the second second, and 5 transactions in the third second, I am wanting to simulate the replay to take roughly 3 seconds, not as fast as possible (which would be sub-second). The tool should try to wait the remainder of a second before processing SQL statements in the incoming stream.

Does anybody know of a tool that currently provides this type of functionality? Any input appreciated before I create my own.

Speaking in Denver

Following a heavy schedule in the last month speaking in Tokyo, Beijing, Manila and Auckland it is nice to be on home soil for upcoming speaking. I will be in Denver, Colorado for RMOUG 2012 from February 14-16, 2012 where I will be speaking about the essentials of MySQL security.

I hope to also organize another presentation in the area for the local MySQL users group. More to follow.

And a friendly reminder, the annual MySQL conference is on again, same place (Santa Clara), same dates (April 10-12), same great speakers (too many to say), just a new primary sponsor. Why not submit a paper to present. Call for papers is still open, closing on December 5. More information at Percona Live MySQL Conference & Expo 2012.

Do you have a MySQL horror story to share?

I am looking for a few more unique examples to add to the final chapter of my upcoming book on MySQL Backup & Recovery. If you would like to share your fun experience, receive a mention and a free copy please let me know via comment. If you would like to share but not have your comment published, please note at top of your feedback.

Thanks for helping to contribute to a detailed list of what could go wrong and how to be prepared for a MySQL disaster.

NoSQL from a RDBMS company

Oracle has announced an open source product for the NoSQL space, the Oracle NoSQL Database. Unlike other popular products including Redis, MongoDB, Cassandra, Voldermort and many others, Oracle has set a benchmark on the features that are truly necessary for highly available data systems.

Many products in the NoSQL space have told you that consistency is not needed, eventual consistency is good enough, that transactions are not performant enough to include as a feature. No standards exists, there is no common interface for communication, or key features that products aim to meet or better. With this product, features including transactions, replicated data and failover which are built in, are features other open source NoSQL products will need to match.

Oracle NoSQL Database is a key value store, supporting a major/minor key for co-locating regularly accessed information for more consistent data retrieval. The API (built in Java) supports GET, PUT, and DEL operators. The system is designed to not have a single point of failure, and to support a node failure without impact. The replication factor is reported to enable up to 7 copies of information, which would be a feature to support cross data center management. The database driver is latency aware, so this can support load balancing operations for optimal performance.

I am excited to hear about this and looking forward to evaluating the software. I will be watching more closely how the integration of MySQL and Oracle NoSQL can be an offering for startups and Web 2.0

The Effective MySQL Book Series

Effective MySQL: Optimizing SQL StatementsAnnounced on Sunday at Oracle Open World 2011 is the release of the Effective MySQL book series starting with the “Optimizing SQL Statements” title. The goal of the Effective MySQL series is a highly practical, concise and topic specific reference providing applicable knowledge to use on each page. A feedback comment provided today was “no fluff” which is great comment to re-enforce the practical nature of the series.

Details on the Effective MySQL Optimizing SQL Statements page include a sample chapter, code downloads and purchase links for print and e-books at Amazon, McGraw-Hill and Barnes & Noble.

Reasons to use MySQL 5.5 Presentation

I recently gave a presentation at the New York Effective MySQL Meetup on the new features of, and some of the compelling reasons to upgrade to MySQL 5.5. There are also a number of new MySQL variables that can have a dramatic effect on performance in a highly transactional environment, innodb_buffer_pool_instances and innodb_purge_threads are just two to consider.

For more information on all the new variables, status, reserved words and benchmarks of new features you can Download Presentation Slides.

Utilizing multiple indexes per MySQL table join

Historically it was considered that MySQL will generally use only one index per referenced table in a SQL query. In MySQL 5.0 the introduction of merge indexes enabled for certain conditions the possibility to utilize two indexes however this could result in worst performance then creating a better index. In MySQL 5.1 it became possible to control optimization switches with the optimizer_switch system variable.

However in explaining how to utilize the intersection, union and sort union in queries I discovered that MySQL could use three indexes for one given table.

        Extra: Using union(name,intersect(founded,type)); Using where

I was not aware of this.

Extra: Using Index

Many people consider this information in the MySQL Query Execution Plan (QEP) to indicate that the referenced table is using an index. It actually means that ONLY the index is used. This can for larger and more frequent queries provide a significant boost.

In a recent example, adding an index dropped a query from 190ms to 6ms. However adding a better index dropped that 6ms query to 1.2ms. When executed 100s/1000s of times per second these millisecond improvements can have a huge benefit in greater scalability. While people often tune slow running queries, in a well tuned system shaving milliseconds of queries, in this example making 6ms query 80% better is a far greater improvement.

You can get a detailed explanation of how to identify, create and verify covering indexes from my Percona Live presentation Improving performance with better indexes where I also include another great 10 table join example, reducing a query running 20,000+ times per second from 175ms to 10ms.

Why SQL_MODE is important

Today was another example of where a correct SQL_MODE saved customer data from being corrupted. By default, MySQL does not enforce data integrity. It allows what is called silent truncations where the result of what you INSERT or UPDATE does not represent truth. NOTE: I see very few customers ever have this correctly configured, those that do have actually listened to my advice.

If you do not read any further, your production MySQL environments should be running with at the bare minimum of SQL_MODE=STRICT_ALL_TABLES however I would also advocate for additional SQL_MODE settings.

For this example, some modified undesirable code attempted to reduce a counter by 1, however because of an UNSIGNED data type and a correctly set SQL_MODE, the application produced an error and data was not corrupted.

This is what should happen with your SQL.

mysql> update stats set loss_count=loss_count - 1 where user_id=42;
ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(`db`.`stats`.`loss_count` - 1)'

It is interesting to note that the error message actually is misleading. The datatype for the column is SMALLINT however the error message prompted an unnecessary schema verification. Even with the calculation I would have assumed the inherited data type would be the column definition before the subtraction. If you try to set the value automatically to negative you get a better message.

mysql> update stats set loss_count=-1 where user_id=42;
ERROR 1264 (22003): Out of range value for column 'loss_count' at row 1

So why is this a big deal? If your MySQL instance is not using SQL_MODE then the result would have been a value of 0 and this would never highlight the bug in the code.

I should point out that only Drizzle of the MySQL forks and variants has tightened up this data integrity, however I have to agree to disagree with the decision to drop UNSIGNED. While I under internally the code simplification behind the decision, without check constraints in MySQL, UNSIGNED is a saving grace. In this example, Drizzle in this case would not have reported an error.

Percona Live New York is underway

Today we have a dedicated MySQL conference in New York with Percona Live. It is great to see an overflowing room in the opening keynote. With over 20 speakers and 4 dedicated tracks there is a lot of content for attendees.

With all the confusion over conference ownership since the Oracle acquisition I applaud Percona for taking an initiative, first in San Francisco and now here in New York. Also announced today is the next Percona Live in London which is great for the MySQL ecosystem in Europe.

Another reason to avoid RDS

My list of reasons for never using or recommending Amazon’s MySQL RDS service grows every time I experience problems with customers. This was an interesting and still unresolved issue.

ERROR 126 (HY000): Incorrect key file for table '/rdsdbdata/tmp/#sql_5b7_1.MYI'; try to repair it

You may see this is a MyISAM table. The MySQL database is version 5.5, all InnoDB tables and is very small 100MB in total size.
What is happening is that MySQL is generating a temporary table, and this table is being written to disk. I am unable to change the code to improve the query causing this disk I/O.

What I can not understand and have no ability to diagnose is why this error occurs sometimes and generally when the database is under additional system load. With RDS you have no visibility of the server running the production database. While you have SQL access, an API for managing MySQL configuration options (I also add not all MySQL variables), and limited system statistics via a graphical interface, all information about the system performance, disk configuration etc is hidden and not accessible. This is a frustrating limitation of using RDS.

NOTE: While I cannot recommend RDS, I am very happy with AWS EC2 services when correctly configured. For a cloud based MySQL solution I would definitely recommend greater control over your MySQL database using EC2 and EBS.

query_cache_size=0 is not enough

Last week at the OUG Harmony conference thanks to Dimitri Kravtchuk I learned that setting query_cache_size=0 does not disable and remove locking from the Query Cache. You actually need to also set query_cache_type=0. This appears to been a bug, seen in the presently still open MySQL bugs database entry #38511.

My recommendation to customers now is to set both variables on all existing MySQL versions if you are not using the MySQL Query Cache.

Thanks to the Performance Schema in MySQL 5.5 for uncovering this. More information in Dimitri’s detailed post at MySQL Performance: Using Performance Schema

Details of all MySQL presentations at OUG Harmony 2011 in Helsinki, Finland.

Free does not mean cheap

Many organizations consider MySQL as a database because the initial license cost is free (*). Larger organizations that use Oracle and SQL Server also consider implementing MySQL as a means to lower the total cost of software infrastructure due to the initial cost for new software licenses or expensive upgrades due to new hardware.

However free software does not mean that services to support MySQL should be also free or even cheap. Recently a large multi-national customer wanted professional consulting and training for MySQL resources and they seemed shocked that I wanted to charge a reasonable rate for professional services. My MySQL consulting rates are cheaper then industry MySQL peers and also similar skilled resources providing Oracle consulting.

With over 10 years experience in MySQL, many doing consulting and also over 10 years professional experience before MySQL I am more then qualified to provide the best possible consulting available for architecture design, performance analysis and tuning, high availability, training and education. My significant contributions to the MySQL community including blogging, speaking and presenting is also not an indicator that companies should expect a significantly different nor cheaper price for professional consulting.

Speaking at Percona Live New York

As the top MySQL expert in New York it is great to join the team at Percona for the upcoming Percona Live in New York City on May 26th. As an invited speaker I am joining a select list of expert speakers including Harrison Fisk from Facebook, Kurt von Finck from Monty Program and Monty Taylor from the core Drizzle team.

My presentation will be on Improving Performance with Better Indexes where I will not only show how to apply indexes to improve query performance, but how to apply better indexes and provide even greater performance gains via a great technique known as a covering index.

Upcoming MySQL presentation in New York

On Tuesday I will be speaking in New York at the Effective MySQL Meetup group where I will be giving the presentation “MySQL Idiosyncrasies That Bite”. For more information and to register, check out the Meetup Event. There are just 10 seats left.

To promote the upcoming Percona Live event in New York, for attendees to the Meetup there will be a draw for a FREE ticket to the May 26th event.

Effective MySQL New York is the only MySQL group now operating in New York. Please join our group for the latest information and events for the MySQL community.