Posts Tagged ‘MySQL’

Utilizing multiple indexes per MySQL table join

Monday, June 13th, 2011

Historically it was considered that MySQL will generally use only one index per referenced table in a SQL query. In MySQL 5.0 the introduction of merge indexes enabled for certain conditions the possibility to utilize two indexes however this could result in worst performance then creating a better index. In MySQL 5.1 it became possible to control optimization switches with the optimizer_switch system variable.

However in explaining how to utilize the intersection, union and sort union in queries I discovered that MySQL could use three indexes for one given table.

        Extra: Using union(name,intersect(founded,type)); Using where

I was not aware of this.

Extra: Using Index

Monday, June 13th, 2011

Many people consider this information in the MySQL Query Execution Plan (QEP) to indicate that the referenced table is using an index. It actually means that ONLY the index is used. This can for larger and more frequent queries provide a significant boost.

In a recent example, adding an index dropped a query from 190ms to 6ms. However adding a better index dropped that 6ms query to 1.2ms. When executed 100s/1000s of times per second these millisecond improvements can have a huge benefit in greater scalability. While people often tune slow running queries, in a well tuned system shaving milliseconds of queries, in this example making 6ms query 80% better is a far greater improvement.

You can get a detailed explanation of how to identify, create and verify covering indexes from my Percona Live presentation Improving performance with better indexes where I also include another great 10 table join example, reducing a query running 20,000+ times per second from 175ms to 10ms.

Upcoming MySQL presentation in New York

Monday, April 25th, 2011

On Tuesday I will be speaking in New York at the Effective MySQL Meetup group where I will be giving the presentation “MySQL Idiosyncrasies That Bite”. For more information and to register, check out the Meetup Event. There are just 10 seats left.

To promote the upcoming Percona Live event in New York, for attendees to the Meetup there will be a draw for a FREE ticket to the May 26th event.

Effective MySQL New York is the only MySQL group now operating in New York. Please join our group for the latest information and events for the MySQL community.

Basic scalability principles to avert downtime

Saturday, April 23rd, 2011

In the press in the last two days has been the reported outage of Amazon Web Services Elastic Compute Cloud (EC2) in just one North Virginia data center. This has affected many large website includes FourSquare, Hootsuite, Reddit and Quora. A detailed list can be found at ec2disabled.com.

For these popular websites was this avoidable? Absolutely.

Basic scalability principles if deployed in these systems architecture would have averted the significant downtime regardless of your development stack. While I work primarily in MySQL these principles are not new, nor are they complicated, however they are fundamental concepts in scalability that apply to any technology including the popular MongoDB that is being used by a number of affected sites.

Scalability 101 involves some simple basic rules. Here are just two that seem to have been ignored by many affected by this recent AWS EC2 outage.

  1. Never put all your eggs in one basket. If you rely on AWS completely, or you rely on just one availability zone that is putting all your eggs in one basket.
  2. Always keep your important data close to home. When it comes to what is most critical to your business you need access and control to your information. At 5am in the morning when the CEO asks how long will our business be unavailabla and what is needed to resolve the problem, the answer “We have no control over this and have no ETA” is not an acceptable answer.

With a successful implementation and appropriate data redundancy you may not have an environment immediately available however you have access to your important information and the ability to create one quickly. Many large hosting companies can provide additional H/W on near demand, especially if you have an initial minimal footprint. Indeed using Amazon Web Services (AWS) as a means to avert a data center disaster is an ideal implementation of Infrastructure As A Service (IAAS). Even with this issue, organizations that had planned for this type of outage could have easily migrated to another AWS availability zone that was unaffected.

Furthermore, system architecture to support various levels of data availability and scalability ensure you can handle many more various types of unavailability without significant system down time as recently seen. There are many different types of availability and unavailability, know what your definition of downtime is and supporting disasters should be your primary focus of scalability, not an after thought.

As an expert in performance and scalability I can help your organization in the design of a suitable architecture to support successful scalability and disaster. This is not rocket science however many organizations gamble without the expertise of a professional to ensure business viability.

Use Replication for backups? Are you schemas consistent?

Thursday, March 31st, 2011

Many people have a master/slave MySQL environment of various different topologies, and many use the slave as a backup.
Is your slave schema identical to your production schema? As long as an SQL statements completes without an error, your slave schema can differ. Common examples are different indexes or storage engines for example. However if you use the slave as backup, you want to ensure when you recover, you are recovering a production environment, not a slave environment.

While the changes may be small, the can lead to different results. For a client I found that the default value of a price field was 10.00 in one schema and 0.00 in another. Not withstanding using defaults for important fields and not defining in an INSERT is a different issue, it highlighted the different schemas can easily exist.

My tool of choice is SchemaSync. The reason why I like this command is that it provides migration scripts to upgrade or downgrade your schemas.

I vote for Planet MySQL moderation

Tuesday, March 29th, 2011

How this happens or who does it is obviously a larger and more complex conversation however it is better then involving innocent animals.

How is it that trivial $#*! gets voted and has a string of comments I can not explain, however Planet MySQL should have practical MySQL related content. I would vote for the front page and RSS feed to show only moderated content, and content that did not pass the cut (and there is a lot of this) can be available on a less important page if necessary.

Rather then complaining like many open source communities, let us propose a way to improve the system we use.

Checked that MySQL backup log lately?

Tuesday, March 29th, 2011

Running a MySQL backup and ensuring it completed successfully and backup files exist is not enough. In my B&R Quiz from Checked your MySQL recovery process recently? one important step is “Do you review your backup logs EVERY SINGLE day or have tested backup monitoring in place?”

This is what I found when reviewing a backup log for a client today.

mysqldump: Got error: 1142: SELECT,LOCK TABL command denied to user 'root'@'localhost' for table 'cond_instances' when using LOCK TABLES
mysqldump: Got error: 1142: SELECT,LOCK TABL command denied to user 'root'@'localhost' for table 'cond_instances' when using LOCK TABLES

The backup script was completing, backup files were in place (and are listed in the log file) however these errors were occurring.

Further investigation was less then one minute of actual work. This server runs multiple MySQL instances and recently one instance was upgraded from MySQL 5.1 to MySQL 5.5 however the call to mysqldump was not. This error was the result of running 5.1 mysqldump against a 5.5 database. Change the hard coded path and the error message went away.

The greater problem is teaching people to understand the importance of backups.

MySQL conference schedule

Monday, March 14th, 2011

I am one of the crazy individuals(*) that will be speaking at both the regular O’Reilly MySQL Conference and the IOUG Collaborate conference both being held in the second week of April. My 4 presentations are:

Upcoming NY Presentation – How Better Indexes Save You Money

Friday, March 11th, 2011

For all those in New York this is an upcoming MySQL presentation held in conjunction with our colleagues at General Assembly on March 22nd 2011.

This presentation “How Better Indexes Save You Money” will be discussing how one simple technique can result in huge MySQL performance improvements and with zero code changes necessary. Many people think they know indexes, however MySQL and MySQL Storage Engines have some specifics that differ from more traditional RDBMS products. Learn some of the key analysis and verification techniques and be able to see immediate potential results in performance.

You can find more details at Meetup.com EffectiveMySQL. This new group is all about highly technical MySQL related content “no fluff, just stuff”.

Your PHP installation appears to be missing the MySQL extension which is required by WordPress.

Tuesday, March 1st, 2011

I recently deployed a new WordPress installation to my existing production webserver running Apache, MySQL and PHP for other websites, yet I was presented with the following message.

“Your PHP installation appears to be missing the MySQL extension which is required by WordPress.”

This thread at wordpress.org did not help me, however I was able to solve the problem, but this thread is now marked as closed. That’s poor form because I can’t share the solution I found.

My PHP configuration file did not have the following.

#php.ini
[PHP]
extension=mysql.so

Adding this and restarting Apache did not fix the problem.

The problem was more fundamental and required PHP to be recompiled. Orginally PHP was configured with the ‘–with-mysqli’ option. PHP requires the ‘–with-mysql’ which is rather stupid they have this dependency.

Recompiling PHP and adding the necessary extension were both necessary to get my new WordPress installation operational.

Part 2 – Simple lessons in improving scalability

Thursday, February 24th, 2011

Given the popular response from my first lesson in improving scalability where I detailed simple ways to eliminate unnecessary SQL, let me share another common bottleneck with MySQL scalability that can be instantly overcome.

Analyzing the writes that occur on a system can expose obvious potential bottlenecks. The MySQL Binary Log is a wealth of information that can be mined. Simple DML Counts per table can be achieved by a single line command.

Let’s look at the following example output of a production system:

mysqlbinlog /path/to/mysql-bin.000999 |  \
   grep -i -e "^update" -e "^insert" -e "^delete" -e "^replace" -e "^alter"  | \
   cut -c1-100 | tr '[A-Z]' '[a-z]' |  \
   sed -e "s/\t/ /g;s/\`//g;s/(.*$//;s/ set .*$//;s/ as .*$//" | sed -e "s/ where .*$//" |  \
   sort | uniq -c | sort -nr

Of the approx 100,000 DML statements we get the following breakdown.

  55283 update sessions
  25204 insert into sessions
  12610 update items
  10536 insert into item_categories
   7532 update users
   5168 delete from item_categories
 

More then 50% of the statements that are written to the binary log and therefore replicated are INSERT’s into the sessions table. A further 25% are UPDATE’s to the same table. This represents 75% of DML statements in just the two most frequent statements.

What is disappointing is that these statements do not belong in MySQL. This is an example of when MySQL is being abused for a purpose where other products are more suited. While there is the argument in using MySQL for storing data, the impact in MySQL memory management, backup/recovery and slave replication throughput and lag can significantly impact scalability of your important MySQL data.

What is observed here is session management where a key value store product should be used as an alternative. In most circumstances it is likely this information is not even required to be persisted. The obvious replacement here is using memcached. If you do wish to persist this data there is an ever increasing list of products including Redis, Tokyo Cabinet/Kyoto Cabinent, Membrain, Voldemort etc that are specifically designed as a key-value store. Even the popular noSQL MongoDB can be easily substituted to perform as a key-value session manager with the added benefits of being a more fully functional product for other purposes.

This is often a common mistake when you use a framework such as Ruby on Rails (RoR) or PHP Code Igniter and many others.

Optimizing UPDATE and DELETE statements

Thursday, February 24th, 2011

Updated Nov 2011. Check out my latest book on Optimizing SQL Statements for more information. MySQL 5.6.2 also now provides an EXPLAIN syntax for UPDATE and DELETE statements natively.

While most people look at performance optimizations for SELECT statements, UPDATE and DELETE statements are often overlooked. These can benefit from the principles of analyzing the Query Execution Plan (QEP). You can only run an EXPLAIN on a SELECT statement, however it’s possible to rewrite an UPDATE or DELETE statement to perform like a SELECT statement.

To optimize an UPDATE, look at the WHERE clause. If you are using the PRIMARY KEY, no further analysis is necessary. If you are not, it is of benefit to rewrite your UPDATE statement as a SELECT statement and obtain a QEP as previously detailed to ensure optimal indexes are used. For example:

UPDATE t
SET	c1 = ‘x’, c2 = ‘y’, c3 = 100
WHERE c1 = ‘x’
AND	d = CURDATE()

You can rewrite this UPDATE statement as a SELECT statement for using EXPLAIN:

EXPLAIN SELECT c1, c2, c3 FROM	t WHERE c1 = ‘x’ AND	d = CURDATE()

You should now apply the same principles as you would when optimizing SELECT statements.

Simple lessons in improving scalability

Wednesday, February 16th, 2011

It can be very easy to improve scalability with a MySQL server by a few simple rules. Here is one of them.

“The most efficient way to improve an SQL statement is to eliminate it”

There are numerous ways to eliminate SQL statements, however before I give a classic example that I’ve observed again with a client, let me explain the basic premise of why this improves scalability?

The MySQL kernel can only physically process a certain number of SQL statements for a given time period (e.g. per second). Regardless of the type of machine you have, there is a physical limit. If you eliminate SQL statements that are unwarranted and unnecessary, you automatically enable more important SQL statements to run. There are numerous other downstream affects, however this is the simple math. To run more SQL, reduce the number of SQL you need to run.

Here is the output of a small sample of analyzed TCP/IP packets via mk-query-digest.

# Rank Query ID           Response time Calls R/Call Apdx V/M   Item
# ==== ================== ============= ===== ====== ==== ===== ==========
#    1 0xD631CB919867DB50  0.0436 47.3%    92 0.0005 1.00  0.00 SELECT TTDOD
#    2 0x04FE01C5B31FD305  0.0258 27.9%   329 0.0001 1.00  0.00 ADMIN PING
#    3 0x93321857BCD8E771  0.0229 24.8%    36 0.0006 1.00  0.00 SELECT TTD

There are many problems here including the Row at a Time (RAT) nature of the SQL, the excessive pings however that’s a topic for another time. Let us look at the first statement.

SELECT `Date` FROM TTDOD WHERE ID = 9999;

That seems a simple enough query however let’s look at the table.

mysql> select count(*) from TTDOD;
+----------+
| count(*) |
+----------+
|        0 |
+----------+

In this case, the query will NEVER return any rows because the table is currently empty. Sure this may change in the future, however as this is more an exception processing situation the simple act of managing the knowledge this table rarely has any rows, and building a solution to inform the application of this can completely eliminate the need for this query to ever be executed.

FYI, the above sample is from less then 2 seconds of sampling. Removing the first query reduces the number of queries executed in this time slice by 20%. Regardless of whether this is typical load or load during a batch job the principle stands. We have not even started to look at what we can do with the next query.

MySQL 5.5.9

Friday, February 11th, 2011

You blink and there is a new version. I have not seen an Planet MySQL release as yet about this new version. Release Notes.

I’d like to say I installed it, but I downloaded the Linux – Generic 2.6 (x86, 64-bit), TAR file, only to find it contains 6 rpm files. #fail, I’m using Ubuntu.

You have to scroll to the bottom of the list (another stupid thing for a generic binary choice) to get Linux – Generic 2.6 (x86, 64-bit), Compressed TAR Archive. Double #fail

Leveraging the InnoDB Plugin

Friday, February 11th, 2011

Beginning with MySQL 5.1 as an additional plugin and included by default in MySQL 5.5 the InnoDB plugin includes many performance improvements. To leverage the support of new file formats however a very important setting is necessary.

#my.cnf
[mysqld]
innodb_file_per_table

The use of innodb_file_per_table with an existing system or during an upgrade to 5.1 or 5.5 requires a complete reload of your database to use effectively. In summary.

  • Backup all InnoDB tables via mysqldump
  • Drop InnoDB tables
  • Verify InnoDB not used
  • Stop MySQL
  • Enable innodb_file_per_table & simplified innodb_data_file_path (if applicable)
  • Remove ibdata? files
  • Start MySQL
  • Create Tables
  • Reload Data
  • Verify InnoDB Operation
    • The primary reason is we are moving from using a common tablespace to a tablespace per table. InnoDB wil not shrink the common tablespace so this process is necessary in order to purge the diskspace currently being used. You should also reduce your innodb_data_file_path options if specified. For example if currently set to :

      innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend
      

      I would suggest you change to

      innodb_data_file_path = ibdata1:100M:autoextend
      

      InnoDB still requires this common tablespace, however now each table has it’s own disk file the volume required is signficantly less.

Interesting MySQL 5.5 upgrade gotcha

Wednesday, February 9th, 2011

Today I discovered an interesting upgrade problem with a client migrating from MySQL 5.0 to 5.5. The client who is undertaking the upgrade reported that MySQL 5.5 did not support the DECIMAL(18,5) data type. I easily confirmed this not to be the case:

mysql> drop table if exists x;
mysql> create table x (col1 DECIMAL(18,5));
Query OK, 0 rows affected (0.01 sec)

Delving more into the issue in question, I looked at the complete CREATE TABLE statement, recreating the syntax.

mysql> drop table if exists x;
mysql> create table x ( MinValue DECIMAL(18,5));
Query OK, 0 rows affected (0.00 sec)

No problem there.

mysql> drop table if exists x;
mysql> create table x (Department INT NOT NULL, MinValue DECIMAL(18,5) NULL, MaxValue DECIMAL(18,5) NULL);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'MaxValue DECIMAL(18,5) NULL)' at line 1

I could immediately see the problem because if you look closely at the error message it starts with the word “MaxValue”, most likely this is a reserved word. This was easily confirmed with:

mysql> create table x (Department INT NOT NULL, MinValue DECIMAL(18,5) NULL, `MaxValue` DECIMAL(18,5) NULL);
Query OK, 0 rows affected (0.00 sec)

And a confirmation with the 5.5 Reference Manual does indeed show there are new reserved words in MySQL 5.5. These are:

  • GENERAL
  • IGNORE_SERVER_IDS
  • MASTER_HEARTBEAT_PERIOD
  • MAXVALUE
  • RESIGNAL SIGNAL
  • SLOW

Higher Availability (HA) starts with two database servers

Tuesday, February 8th, 2011

Many early startups that use a single server for all services or a single database server for their website talk about how they would like to achieve higher availability with MySQL. This is not possible without at least two database servers. Using MySQL replication you can then support higher availability in several varying capacities. An additional MySQL database server can satisfy several infrastructure needs including:

  • A primary hot backup
  • A datasource for performing backups
  • Read scalability infrastructure
  • A reporting server
  • A benchmarking server
  • A fail-over/fail back master environment

MySQL replication is very easy to configure and deploy, a task that takes < 30 minutes for an experienced MySQL DBA. However, altering your backup and recovery strategy, modifying your application to support read/write splitting or implementing a fail-over/fail back strategy (also known as a MySQL Pairs implementation) are more complex tasks for implementing a higher availability MySQL solution.

About the Author

Ronald Bradford is a well respected industry expert in the MySQL community. Ronald is internationally recognized as an Oracle ACE Director in MySQL, a published author of Expert PHP and MySQL and the all-time top individual contributor of MySQL blog content at Planet MySQL.

Do you use PHPMyAdmin?

Wednesday, February 2nd, 2011

If so then were is it installed on your publicly accessible website. If the location is where the documentation states not to put it, or in other popular locations then you can easily become open to an attack. The following are apache logs of a simple hacker test to find a potential security hole on a new IP public address for a client.

My advice is never have PHPMyAdmin accessible by default. You should use HTTP authentication, firewall rules and additional security practices all to protect any level of access to your data.

78.111.81.180 - - [02/Feb/2011:05:29:24 -0500] "GET //phpmyadmin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:24 -0500] "GET //phpMyAdmin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:24 -0500] "GET //pma/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:24 -0500] "GET //dbadmin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:25 -0500] "GET //myadmin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:25 -0500] "GET //phppgadmin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:26 -0500] "GET //PMA/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:26 -0500] "GET //admin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"
78.111.81.180 - - [02/Feb/2011:05:29:26 -0500] "GET //MyAdmin/ HTTP/1.1" 401 290 "-" "Made by ZmEu @ WhiteHat Team - www.whitehat.ro"

If you are an administrator, 78.111.81.180 should be added to your blacklist permanently.
I should also state I do not use PHPMyAdmin on public servers, and also note the 401 response.

Microsoft’s position on MySQL

Monday, January 31st, 2011

While Oracle provides no official information they are planning on improving MySQL and using as a product to compete with Microsoft SQL Server, it is rather obvious from what little information you can glean from public announcements this is a clear business goal.

Microsoft however are publicly seeking a Senior Product Manager, MySQL Compete in the Marketing department. Your goal is nothing technical, it’s all PR to dispel MySQL as a viable product. I quote “you will equip field and partners to win in competitive engagements against MySQL, and you will influence market perception in favor of Microsoft technologies.” Here is the Full job description for those that want an amusing read.

This information came from an Oracle colleague of mine based in Asia.

Welcome new Oracle ACE’s

Tuesday, January 25th, 2011

I am pleased to announce that the Oracle ACE program has two new MySQL inductees. These people actively contribute to the MySQL community via a varied means in an unbiased and non commercial way.

Giuseppe Maxia (aka DataCharmer) is no stranger to the MySQL community having filled a position with MySQL Inc/Sun Microsystems/Oracle Corporation in the Community team for many years. Giuseppe was an active member in the community before this position, and continues to provide great input with tools such as the MySQL sandbox and many project and code snippets on the MySQL Forge as well as many writings for the Developer zone.

Patrick Galbraith (aka CaptTufo) may be a lesser known name to some, however he is the maintainer of the MySQL Perl DBD::mysql driver, the creator of the Federated Storage Engine for MySQL and is the author of two MySQL books. He also is the creator of the Memcached UDF functions.

Welcome, and I hope to see your continued contributions help grow our MySQL community.

Changes in using Profiling in MySQL 5.5

Wednesday, January 19th, 2011

In the past I’ve used the profiling features (e.g. SHOW PROFILES) in MySQL to help with timing SQL statements, especially those in the < 10 millisecond range.

Out of habit I did use this to time all SQL statements however in MySQL 5.5.8 GA I've found this no longer to be representative.

As you can see, the query takes some 50+ms longer with profiling enabled, not to mention they have broken the Source_file column which I've actually used to troll the source code with.

mysql> set profiling=1;

4 rows in set (1.14 sec)
4 rows in set (1.15 sec)
4 rows in set (1.17 sec)

mysql> set profiling=0;

4 rows in set (0.37 sec)
4 rows in set (0.37 sec)
4 rows in set (0.37 sec)

Investigating further showed the cause. There appears to be some new overhead that causes profiling to log excessive amount of information.

mysql> show profile source for query 35;
+--------------------------------+----------+-----------------------+----------------------+-------------+
| Status                         | Duration | Source_function       | Source_file          | Source_line |
+--------------------------------+----------+-----------------------+----------------------+-------------+
| starting                       | 0.000047 | NULL                  | NULL                 |        NULL |
| Waiting for query cache lock   | 0.000008 | try_lock              | /export/home/pb2/bui |         454 |
| checking query cache for query | 0.000112 | send_result_to_client | /export/home/pb2/bui |        1537 |
| checking permissions           | 0.000008 | check_access          | /export/home/pb2/bui |        4613 |
| checking permissions           | 0.000006 | check_access          | /export/home/pb2/bui |        4613 |
| checking permissions           | 0.000006 | check_access          | /export/home/pb2/bui |        4613 |
| checking permissions           | 0.000009 | check_access          | /export/home/pb2/bui |        4613 |
| Opening tables                 | 0.000035 | open_tables           | /export/home/pb2/bui |        4732 |
| System lock                    | 0.000015 | mysql_lock_tables     | /export/home/pb2/bui |         299 |
| Waiting for query cache lock   | 0.000056 | try_lock              | /export/home/pb2/bui |         454 |
| init                           | 0.000068 | mysql_select          | /export/home/pb2/bui |        2545 |
| optimizing                     | 0.000019 | optimize              | /export/home/pb2/bui |         858 |
| statistics                     | 0.000036 | optimize              | /export/home/pb2/bui |        1049 |
| preparing                      | 0.000025 | optimize              | /export/home/pb2/bui |        1071 |
| Creating tmp table             | 0.000050 | optimize              | /export/home/pb2/bui |        1587 |
| Sorting for group              | 0.000010 | optimize              | /export/home/pb2/bui |        1632 |
| executing                      | 0.000006 | exec                  | /export/home/pb2/bui |        1818 |
| Copying to tmp table           | 0.000049 | exec                  | /export/home/pb2/bui |        1965 |
| optimizing                     | 0.000011 | optimize              | /export/home/pb2/bui |         858 |
| statistics                     | 0.000053 | optimize              | /export/home/pb2/bui |        1049 |
| preparing                      | 0.000014 | optimize              | /export/home/pb2/bui |        1071 |
| executing                      | 0.000006 | exec                  | /export/home/pb2/bui |        1818 |
| Sending data                   | 0.000025 | exec                  | /export/home/pb2/bui |        2356 |
....

| Sending data                   | 0.000009 | exec                  | /export/home/pb2/bui |        2356 |
| executing                      | 0.000006 | exec                  | /export/home/pb2/bui |        1818 |
| Sending data                   | 0.000051 | exec                  | /export/home/pb2/bui |        2356 |
| Sorting result                 | 0.000034 | exec                  | /export/home/pb2/bui |        2244 |
| Sending data                   | 0.000052 | exec                  | /export/home/pb2/bui |        2356 |
| end                            | 0.000008 | mysql_select          | /export/home/pb2/bui |        2581 |
| removing tmp table             | 0.000014 | free_tmp_table        | /export/home/pb2/bui |       11121 |
| end                            | 0.000009 | free_tmp_table        | /export/home/pb2/bui |       11146 |
| query end                      | 0.000010 | mysql_execute_command | /export/home/pb2/bui |        4310 |
| closing tables                 | 0.000023 | mysql_execute_command | /export/home/pb2/bui |        4362 |
| freeing items                  | 0.000026 | mysql_parse           | /export/home/pb2/bui |        5509 |
| Waiting for query cache lock   | 0.000009 | try_lock              | /export/home/pb2/bui |         454 |
| freeing items                  | 0.000051 | NULL                  | NULL                 |        NULL |
| Waiting for query cache lock   | 0.000006 | try_lock              | /export/home/pb2/bui |         454 |
| freeing items                  | 0.000006 | NULL                  | NULL                 |        NULL |
| storing result in query cache  | 0.000017 | end_of_result         | /export/home/pb2/bui |        1020 |
| logging slow query             | 0.000006 | log_slow_statement    | /export/home/pb2/bui |        1444 |
| cleaning up                    | 0.000009 | dispatch_command      | /export/home/pb2/bui |        1400 |
+--------------------------------+----------+-----------------------+----------------------+-------------+
124504 rows in set (0.48 sec)

MySQL 5.5.8 GA and PHP 5.3.4 don’t get along with libmysql

Thursday, December 16th, 2010

Today I discovered that you are unable to compile the current stable PHP version 5.3.4 with yesterday’s MySQL 5.5.8 GA release. I was able to download the current MySQL 5.1.54 and compile without issue.

You can find all the gory details in Bug #58987 however I was able to edit a number of MySQL include file to get a build. Does this mean it’s a MySQL packaging problem or a PHP problem I don’t know, but I would hope that Oracle in the testing phase of a GA release test this against popular programming languages starting with the LAMP stack to ensure compatibility such as what I uncovered.

Five reasons to upgrade to MySQL 5.5

Wednesday, December 15th, 2010

Updated Nov 2011. Check out my Reasons to use MySQL 5.5 Presentation for more in-depth information about installing/configuring and using MySQL 5.5

I have been looking forward to the general availability (GA) release of MySQL 5.5 since is was publically announced in September that we would see this in 2010. While I already have a production client with 5.5.7rc, the badge of general availability is a great way to promote why environments should consider moving to using MySQL 5.5. Here is my quick short list of why I’d promote moving to MySQL 5.5.

1. Improved integration

The first significant improvement is that InnoDB is now again firmly a default included storage engine. The InnoDB plugin 1.1.x is now the builtin version of the engine, not a plugin version. Also the 1.1.x version has continued improvements over the 1.0.x version available as an included but not enabled plugin in current MySQL 5.1.x versions. Removing the complexity for end users over the choice of InnoDB and the necessary configuration changes is a great simplification. The introduction in the InnoDB plugin 1.0.x of a new file format (known as Barracuda) and additional new row formats such as dynamic and compressed should also be considered for improved I/O performance. Greater eduction and customer comparison results are still needed here.

2. Improved performance

“Improved scalability on multi-core CPUs” is the tag line Oracle touts and while the marketing graphs do show improved performance with huge numbers of threads, it is the continued work at addressing the support for large cores that is the key to have MySQL perform on newer H/W. MySQL was always known as the software that runs on low cost commodity H/W, however it also needs to work well on new hardware architectures.

InnoDB has many improvements, you can see a long list at 13.7.7. Performance and Scalability Enhancements. I have seen an upgrade from 5.0 to 5.5 significantly reduce contention of queries. The help of multiple buffer pool instances, multiple rollback segments, additional threads and functional mutexes have all reduced high throughput mutex contention.

3. Improved recoverability

While I have yet to use Semisynchronous replication in a production situation I see this as a continued progression to improving the recoverability and resilience of MySQL. This was work first seen by the Google Patch in MySQL 5.0 several years ago. The lack of a single unbreakable backup strategy is the number 1 overlooked feature in MySQL as reported by customer advisory groups in the past. The move to making InnoDB the default storage engine is another small step to improved recoverability in a default MySQL setting.

4. Improved instrumentation

A lack of detailed instrumentation rates number 2 on my list of areas of improvement needed with MySQL. With 5.5 there are more continued improvements. While some of these are very technical for high end analysis (e.g. the performance_schema), simple improvements in common instrumentation can help you identify triggers for poor performance. There is a lot more information available in the INFORMATION_SCHEMA. Here is a great example via SHOW PROCESSLIST that shows a 4 second lock in the Query Cache (QC) that until now was not possible to summarize easily what happens in the Query Cache black box.

... | Query   |    4 | Waiting for query cache lock | REPLACE INTO statistics_w SET IP = ...

Identification is the first necessary step to continued improvement.

My work with the PERFORMANCE_SCHEMA has been limited to experimenting such as described by Tracking IO with performance_schema however a recent discussion with Mark Leith highlighted that with the performance_schema it will be possible to determine the exact size of disk based temporary tables created by the MySQL kernel for example.

5. Improved development methodology

We have heard a lot of noise from a vocal few that Oracle will not do right by MySQL, the product, the community, the OEM providers. While I believe it is too early to tell the long term plans of MySQL under Oracle, I believe that action speaks louder than words. Oracle has made a number of commitments publically, Oracle are investing in resources to improve the development of the product and It’s clear the targeting of the Microsoft OS highlights MySQL has a future as a SQL Server competitor. Oracle brings a great wealth of experience, resources and processes and hopefully that will be invested into MySQL.

There is a good list of What Is New in MySQL 5.5 as part of the current development, features I’ve not mention include SIGNAL/RESIGNAL, change in the default storage engine, partitioning additions and pluggable MySQL authentication capability.

The MySQL documentation team also do a great job. An example of new work is a great comparison of Options and Variables from all 5.x versions shown on a single page for comparison.

The way forward with MySQL

Oracle as steward of MySQL is here to stay. While people still rumble about it, everybody has to get over it. For the end consumer little will change, for clients that use MySQL Enterprise Support that are affected by the changes in policy, many other viable support options exist. What is unknown and my single point of issue with the EU 10 point Statement is what will happen with OEM providers. This affects the financial viability of a number of providers and while I’m not directly involved I do not want to see this good work lost to the MySQL community.

Points of uncertainty include what features will be developed next, will they be pay only features, will they be available to enterprise customers and not community customers, what is the frequency of community releases etc. These still need good PR from Oracle for the MySQL community. For those concerned, Oracle has made continued investment in acquired RDBMS products including BerkeleyDB and InnoDB and RDB. Oracle RDB for example was acquired in 2004 and is still being actively developed and supported. As long as MySQL is profitable, we will see continued releases.

About the Author

Ronald Bradford is a well respected expert in the MySQL community. Ronald is internationally recognized as an Oracle ACE Director in MySQL, the highest industry recognition and is also a published author of Expert PHP and MySQL. He is available for MySQL consulting work now.

Wish list for MySQL thread polling events

Monday, December 13th, 2010

It is great to draw inspiration from other Open Source communities. Brad Fitzpatrick recently wrote about Android Strict Mode. His twitter tag line for this post was “I see you were doing 120 ms in a 16 ms zone” which is all I needed to hear from somebody who also worries unreasonably about responsiveness (Web site quote).

How would I apply this to a MySQL context? This is what happens in Android. “Strict Mode lets you set a policy on a thread declaring what you’re not allowed to do on that thread, and what the penalty is if you violate the policy. Implementation-wise, this policy is simply a thread-local integer bitmask. By default everything is allowed and it won’t get in your way unless you want it to”

In a MySQL thread I would like to know certain things every time they occurs. Some events I’d consider relevant are:

  • Any disk writes (e.g. Created_tmp_disk_tables) occured
  • Any disk reads (e.g. data was not in buffer_pool for example)
  • Exceeded execution time obviously
  • Internal memory buffer usage, size/amount of read/read_rnd/join/sort buffer
  • Other paths triggered, e.g. Sort_merges, read from disk for table_open_cache etc

Also with this mode is not just the instrumentation of the thread, but the penalty if violated. Do you log, kill, change priority, report an annoying message (difficult in SQL world).
I feel it would be unwise to consider this on a per thread request due to the complexity of SQL, so the next consideration is how to you categorize or identify given SQL query paths to a given polling policy.

While I have already raised more questions then answers for a possible solution the principle is important in what can we do to make our job easier and more simpler. These appproaches are the next generation of requirements in performance tuning, where you are becoming proactive rather then reactive in identification, analysis and correction.

Unexpected mysqld crashing in 5.5

Monday, November 22nd, 2010

An update of MySQL from 5.0 to 5.5 on CentOS 5.5 64bit has not resulted in a good experience. The mysqld process would then crash every few minutes with the following message.

101120 8:29:27 InnoDB: Operating system error number 22 in a file operation.
InnoDB: Error number 22 means ‘Invalid argument’.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/operating-system-error-codes.html
InnoDB: File name /tmpfs/#sql6cf3_5c_0.ibd
InnoDB: File operation call: ‘aio write’.
InnoDB: Cannot continue operation.

The work around was to change the tmpdir=/tmpfs (which was a 16G tmpfs filesystem) to a physical disk.

The referenced URL didn’t provide any more information of help. Unlike Bug #26662 O_DIRECT is not specified as the flush method.

Damm you Peformance Schema

Monday, November 22nd, 2010

One significant new feature of MySQL 5.5 is the Performance Schema. I recently performed an upgrade from 5.0 to 5.5, however my check of differences in the MySQL variables via mysqladmin variables failed because we now have a new record long variable name “performance_schema_events_waits_history_long_size”.

The full list of new performance_schema variables are:

> | performance_schema_events_waits_history_long_size | 10000                                                                                                                  |
> | performance_schema_events_waits_history_size      | 10                                                                                                                     |
> | performance_schema_max_cond_classes               | 80                                                                                                                     |
> | performance_schema_max_cond_instances             | 1000                                                                                                                   |
> | performance_schema_max_file_classes               | 50                                                                                                                     |
> | performance_schema_max_file_handles               | 32768                                                                                                                  |
> | performance_schema_max_file_instances             | 10000                                                                                                                  |
> | performance_schema_max_mutex_classes              | 200                                                                                                                    |
> | performance_schema_max_mutex_instances            | 1000000                                                                                                                |
> | performance_schema_max_rwlock_classes             | 30

Not really a big problem, and not a complaint, just no longer a simple diff.

The Casual MySQL DBA – Operational Basics

Wednesday, November 17th, 2010

So your not a MySQL DBA, but you have to perform like one. If you have a production environment that’s running now, what are the first things you do when it’s not running or reported as not running?

  1. Are the MySQL processes running? (i.e. mysqld and mysqld_safe)
  2. Can you connect locally via cli?
  3. What’s in the MySQL error log?
  4. What are current MySQL threads doing? Locked? long running? how many? idle sources?
  5. Can you connect remotely via cli?
  6. Verify free diskspace?
  7. Verify system physical resources?
  8. If this is a slave, is MySQL replication running? Is it up to date?
  9. What is the current MySQL load, e.g. reads/writes/throughput/network/disk etc?
  10. What is the current InnoDB state and load? (based on if your using InnoDB)

After you do this manually more then once you should be scripting these commands to be productive for future analysis and proactive monitoring?

Is a problem obvious? Does the output look different to what a normal environment looks like? (HINT: This list is not just for when there is a problem)

So moving forward?

  1. Is disk/memory/cpu/network bottleneck an issue you can resolve?
  2. Can you improving locking statements (if applicable)?
  3. Can you identify, analyse and tune long running statements?
  4. Do you know how to restart MySQL?
  5. Do you know who to call when you have a non working environment?
  6. When did your backup last run?
  7. Does your last backup work?

In order to support any level of production MySQL environment you need to know the answers to these questions? If you don’t, then this is your homework checklist for MySQL DBA operations 101. There a number of resources where you can find the answers, and this help can be available online, however never assume the timeliness of responses, especially if your expecting if for FREE! Open source software can be free, open source support rarely is.

MySQL HandlerSocket under Ubuntu

Friday, November 5th, 2010

Starting with the great work of Yoshinori-san Using MySQL as a NoSQL – A story for exceeding 750,000 qps on a commodity server and Golan Zakai who posted Installing Dena’s HandlerSocket NoSQL plugin for MySQL on Centos I configured and tested HandlerSocket under Ubuntu 10.04 64bit.

NOTE: This machine already compiles MySQL and Drizzle. You should refer to appropriate source compile instructions for necessary dependencies.

# Get Software
cd /some/path
export DIR=`pwd`
wget http://download.github.com/ahiguti-HandlerSocket-Plugin-for-MySQL-1.0.6-10-gd032ec0.tar.gz
wget http://mysql.mirror.iweb.ca/Downloads/MySQL-5.1/mysql-5.1.52.tar.gz
wget  http://mysql.mirror.iweb.ca/Downloads/MySQL-5.1/mysql-5.1.52-linux-x86_64-glibc23.tar.gz

# Install MySQL Binaries
tar xvfz mysql-5.1.52-linux-x86_64-glibc23.tar.gz
ln -s mysql-5.1.52-linux-x86_64-glibc23 mysql5152
rm -rf mysql5152/mysql-test
rm -rf mysql5152/sql-bench
cd mysql5152
scripts/mysql_install_db
bin/mysqld_safe &

# Install MySQL Source
tar xvfz mysql-5.1.52.tar.gz

# Install HandlerSocket
tar xvfz ahiguti-HandlerSocket-Plugin-for-MySQL-1.0.6-10-gd032ec0.tar.gz
cd ahiguti-HandlerSocket-Plugin-for-MySQL-d032ec0/
./autogen.sh
./configure --with-mysql-source=${DIR}/mysql-5.1.51 --with-mysql-bindir=${DIR}/mysql5152/bin
make
sudo make install

# libtool: install: /usr/bin/install -c .libs/handlersocket.lai /home/rbradfor/projects/handlersocket/mysql5152/lib/plugin/handlersocket.la

# Install the Perl dependency
cd perl-Net-HandlerSocket
perl Makefile.PL
Checking if your kit is complete...
Warning: the following files are missing in your kit:
	lib/HandlerSocket.pm
Please inform the author.
Writing Makefile for Net::HandlerSocket
# For some reason I had to run this twice, after error I check, saw nothing wrong, ran a second time and it worked ???
perl Makefile.PL
Writing Makefile for Net::HandlerSocket
make
sudo make install

Files found in blib/arch: installing files in blib/lib into architecture dependent library tree
Installing /usr/local/lib/perl/5.10.1/auto/Net/HandlerSocket/HandlerSocket.so
Installing /usr/local/lib/perl/5.10.1/auto/Net/HandlerSocket/HandlerSocket.bs
Installing /usr/local/lib/perl/5.10.1/Net/HandlerSocket.pm
Installing /usr/local/lib/perl/5.10.1/Net/HandlerSocket/Pool.pm
Installing /usr/local/man/man3/Net::HandlerSocket.3pm
Appending installation info to /usr/local/lib/perl/5.10.1/perllocal.pod

cp lib/Net/HandlerSocket.pm blib/lib/Net/HandlerSocket.pm
cp lib/Net/HandlerSocket/Pool.pm blib/lib/Net/HandlerSocket/Pool.pm
/usr/bin/perl /usr/share/perl/5.10/ExtUtils/xsubpp  -typemap /usr/share/perl/5.10/ExtUtils/typemap  HandlerSocket.xs > HandlerSocket.xsc && mv HandlerSocket.xsc HandlerSocket.c
Please specify prototyping behavior for HandlerSocket.xs (see perlxs manual)
g++ -fPIC -c  -I. -I../libhsclient -D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -g -O3 -Wall -Wno-unused   -DVERSION=\"0.01\" -DXS_VERSION=\"0.01\" -fPIC "-I/usr/lib/perl/5.10/CORE"   HandlerSocket.c
Running Mkbootstrap for Net::HandlerSocket ()
chmod 644 HandlerSocket.bs
rm -f blib/arch/auto/Net/HandlerSocket/HandlerSocket.so
LD_RUN_PATH="/usr/local/lib" g++ -fPIC  -shared -O2 -g -L/usr/local/lib -fstack-protector HandlerSocket.o  -o blib/arch/auto/Net/HandlerSocket/HandlerSocket.so 	\
	   -lhsclient  	\

chmod 755 blib/arch/auto/Net/HandlerSocket/HandlerSocket.so
cp HandlerSocket.bs blib/arch/auto/Net/HandlerSocket/HandlerSocket.bs
chmod 644 blib/arch/auto/Net/HandlerSocket/HandlerSocket.bs
Manifying blib/man3/Net::HandlerSocket.3pm

Now all built, it’s as simple as the following to install:

${DIR}/mysql5152/bin/mysql -uroot
mysql> INSTALL PLUGIN HandlerSocket SONAME 'handlersocket.so';
mysql> SHOW PLUGINS;
+---------------+----------+----------------+------------------+---------+
| Name          | Status   | Type           | Library          | License |
+---------------+----------+----------------+------------------+---------+
| binlog        | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| partition     | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| ARCHIVE       | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| BLACKHOLE     | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| CSV           | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| FEDERATED     | DISABLED | STORAGE ENGINE | NULL             | GPL     |
| MEMORY        | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| InnoDB        | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| MyISAM        | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| MRG_MYISAM    | ACTIVE   | STORAGE ENGINE | NULL             | GPL     |
| handlersocket | ACTIVE   | DAEMON         | handlersocket.so | BSD     |
+---------------+----------+----------------+------------------+---------+
11 rows in set (0.00 sec)

However, I found it not actually running even after restarting MySQL. You need the following configuration

cd ${DIR}/mysql5152
echo "[mysqld]
plugin-load=handlersocket.so
loose_handlersocket_port = 9998 # the port number to bind to (for read requests)
loose_handlersocket_port_wr = 9999 # the port number to bind to (for write requests)
loose_handlersocket_threads = 16 # the number of worker threads (for read requests)
loose_handlersocket_threads_wr = 1 # the number of worker threads (for write requests)" >> my.cnf

After restarting you get the following confirmation.

mysql>SHOW PROCESSLIST;
+----+-------------+-----------------+---------------+---------+------+-------------------------------------------+------------------+
| Id | User        | Host            | db            | Command | Time | State                                     | Info             |
+----+-------------+-----------------+---------------+---------+------+-------------------------------------------+------------------+
|  1 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  2 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  3 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  4 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  5 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  6 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  7 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  8 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
|  9 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 10 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 11 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 12 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 13 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 14 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 15 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 16 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |
| 17 | system user | connecting host | handlersocket | Connect | NULL | handlersocket: mode=wr, 0 conns, 0 active | NULL             |
| 18 | root        | localhost       | NULL          | Query   |    0 | NULL                                      | SHOW PROCESSLIST |
+----+-------------+-----------------+---------------+---------+------+-------------------------------------------+------------------+
18 rows in set (0.00 sec)

My Test Data

USE test
CREATE TABLE user (
  user_id INT UNSIGNED NOT NULL,
  name VARCHAR(50) NOT NULL,
  email VARCHAR(255) NOT NULL,
  created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY(user_id)
) ENGINE=InnoDB; 

INSERT INTO user (user_id,name, email) VALUES
(100,'Mickey Mouse','mickey@acme.inc'),
(101,'Marvin the Martian','mars@warnerbros.inc'),
(102,'Spiderman','spidy@marvelcomics.inc');

SELECT user_id,name,email FROM user WHERE user_id=101;
+---------+--------------------+---------------------+
| user_id | name               | email               |
+---------+--------------------+---------------------+
|     101 | Marvin the Martian | mars@warnerbros.inc |
+---------+--------------------+---------------------+

My test retrieval program is basically unchanged from the original post.

$ cat retrieve.pl
#!/usr/bin/perl

use strict;
use warnings;
use Net::HandlerSocket;

#1. establishing a connection
my $args = { host => '10.0.0.6', port => 9998 };
my $hs = new Net::HandlerSocket($args);

#2. initializing an index so that we can use in main logics.
 # MySQL tables will be opened here (if not opened)
my $res = $hs->open_index(0, 'test', 'user', 'PRIMARY',
    'name,email,created');
die $hs->get_error() if $res != 0;

#3. main logic
 #fetching rows by id
 #execute_single (index id, cond, cond value, max rows, offset)
$res = $hs->execute_single(0, '=', [ '101' ], 1, 0);
die $hs->get_error() if $res->[0] != 0;
shift(@$res);
for (my $row = 0; $row < 1; ++$row) {
  my $user_name= $res->[$row + 0];
  my $user_email= $res->[$row + 1];
  my $created= $res->[$row + 2];
  print "$user_name\t$user_email\t$created\n";
}

#4. closing the connection
$hs->close();

If your HandlerSocket is not running you will see:

$ perl retrieve.pl
FATAL_EXIT: getaddrinfo failed: ip_to_remote_host:9998
$ telnet 10.0.0.6 9998
Trying 10.0.0.6...
telnet: Unable to connect to remote host: Connection refused

When working

$ telnet 10.0.0.6 9998
Trying 10.0.0.6...
Connected to 10.0.0.6.
Escape character is '^]'.
P	0	test	user	PRIMARY	name,email,created
0	1
0	=	1	101
0	3	Marvin the Martian	mars@warnerbros.inc	2010-11-05 13:11:34

$ perl retrieve.pl
Marvin the Martian	mars@warnerbros.inc	2010-11-05 13:11:34

Again, great work by Yoshinori-san to have this code as production deployable. Now to run some real benchmarks.

Improving MySQL Insert thoughput

Friday, November 5th, 2010

There are three simple practices that can improve general INSERT throughput. Each requires consideration on how the data is collected and what is acceptable data loss in a disaster.

General inserting of rows can be performed as single INSERT’s for example.

INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);
INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?);

While this works, there are two scalability limitations. First is the network overhead of the back and forth of each SQL statement, the second is the synchronous nature, that is your code can not continue until your INSERT is successfully completed.

The first improvement is to use MySQL’s multi values capability with INSERT. That is you can insert multiple rows with a single INSERT statement. For example:

INSERT INTO table (col1, col2, col3) VALUES (?, ?, ?),  (?, ?, ?), (?, ?, ?);

Depending on how you collect the information to be inserted, you may be able to easily implement this. The benefit, as previously mentioned in The value of multi insert values shows an 88% improvement in performance.

One system variable to be aware of is max_allowed_packet. You may have to adjust this for larger INSERT statements.

Next is the ability to INSERT data based on information already in another table. You can also leverage for example another storage engine like MEMORY to batch up data to be inserted via this approach.

INSERT INTO table (col1, col2, col3) SELECT col1, col2, col3 FROM anothertable

The third option is to batch load your data from a flat file. Depending on how you source the information you are inserting, that may also be a significant improvement in throughput in bulk inserting data.

LOAD DATA [CONCURRENT] INFILE 'file'
INTO TABLE (col1, col2, col3)

On a closing note, the choice of storage engine can also have a significant effect on INSERT throughput. MySQL also offers other non ANSI options including DELAYED, IGNORE and LOW_PRIORITY. These should definitely be avoided.

OTN MySQL conference slides

Wednesday, November 3rd, 2010

2010 has been the first year I have re-presented any of my developed MySQL presentations. Historically I have always created new presentations, however Paul Vallee gave me some valuable advice at UC 2010. In the past two weeks I’ve traveled to seven countries in South America on the OTN LA tour where I have been speaking about and promoting MySQL.

My three current presentations have been improved and even simplified, more future improvements are planned. There is definitely a benefit in repeating a good presentation multiple times.