Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Posts Tagged ‘performance’

Speaking at Surge Scalability 2010 – Baltimore, MD

Wednesday, July 28th, 2010

I will be joining a great list of quality speakers including John Allspaw, Theo Schlossnagle, Rasmus Lerdorf and Tom Cook at Surge 2010 in Baltimore, Maryland on Thu 30 Sep, and Fri Oct 1st 2010. Surge 2010 Speaker - Baltimore, MD

My presentation on “The most common MySQL scalability mistakes, and how to avoid them.” will include discussing various experiences observed in the field as a MySQL Consultant and MySQL Performance Tuning expert.

Abstract:

The most common mistakes are easy to avoid however many startups continue to fall prey, with the impact including large re-design costs, delays in new feature releases, lower staff productivity and less then ideal ROI. All growing and successful sites need to achieve higher Availability, seamless Scalability and proven Resilience. Know the right MySQL environments to provide a suitable architecture and application design to support these essential needs.

Overview:

Some details of the presentation would include:

  • The different types of accessible data  (e.g. R/W, R, none)
  • What limits MySQL availability (e.g software upgrades, blocking statements, locking etc)
  • The three components of scalability – Read Scalability/Write Scalability/Caching
  • Design practices for increasing scalability and not physical resources
  • Disaster is inevitable. Having a tested and functional failover strategy
  • When other products are better (e.g. Static files, Session management via Key/Value store)
  • What a lack of accurate monitoring causes
  • What a lack of breakability testing causes
  • What does “No Downtime” mean to your organization.
  • Implementing a successful “failed whale” approach with preemptive analysis
  • Identifying when MySQL is not your bottleneck

Optimizing SQL Performance – The Art of Elimination

Thursday, July 8th, 2010

The most efficient performance optimization of a SQL statement is to eliminate it. Cary Millsap’s recent Kaleidoscope presentation again highlighted that improving performance is function of code path. Removing code will improve performance.

You may think that it could be hard to eliminate SQL, however when you know every SQL statement that is executed in your code path obvious improvements may be possible. In the sequence SQL was implemented sometimes easy observations can lead to great gains. Let me provide some actual client examples that were discovered by using the MySQL General Log.

Example 1

5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist` WHERE (ArtistID = 196 )
5 Query   SELECT *  FROM `artist` WHERE (ArtistID = 2188 )
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`

In this example, the following was executed for a single page load. Not only did I find a bug where full-table scans occurred rather then being qualified, there were many repeating and unnecessary occurrences.

Example 2

SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'
SELECT option_value FROM wp_options WHERE option_name = 'aiosp_title_format' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_show_only_even' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_num_months' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_day_length' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_hide_event_box' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_advanced' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_navigation' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_disable_popups' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'sidebars_widgets' LIMIT 1

This is a stock Wordpress installation and highlights a classic Row at a Time (RAT) processing.

Example 3

SELECT * FROM activities_theme WHERE theme_parent_id=0
SELECT * FROM activities_theme WHERE theme_parent_id=1
SELECT * FROM activities_theme WHERE theme_parent_id=2
SELECT * FROM activities_theme WHERE theme_parent_id=11
SELECT * FROM activities_theme WHERE theme_parent_id=16

In this client example, again RAT processing, I provided a code improvement to run these multiple queries in a single statement, otherwise known as Chunk At a Time (CAT) processing. It’s not rocket science however the elimination of the network component of several SQL statements can greatly reduce page load time.

SELECT *
FROM   activities_theme
WHERE  theme_parent_id in  (0,1,2,11,16) 

Example 4

The following represents one of the best improvement. During capture, the following query was executed 6,000 times over a 5 minute period. While you make think this is acceptable, the value passed wae 0. The pages_id is an auto_increment column which by definition does not have a 0 value. In this instance, a simple boundary condition in the code would eliminate this query.

SELECT pages_id, pages_livestats_code, pages_title,
       pages_parent, pages_exhibid, pages_theme,
       pages_accession_num
FROM pages WHERE pages_id = 0

There are many tips to improving and optimizing SQL. This is the simplest and often overlooked starting point.

Timing your SQL queries

Wednesday, July 7th, 2010

When working interactively with the MySQL client, you receive feedback of the time the query took to complete to a granularity of 10 ms.

Enabling profiling is a simple way to get more a more accurate timing of running queries. In the following example you can see the time the kernel took to run an explain, the query, and alter, and repeat explain and query.


mysql> set profiling=1;
mysql> EXPLAIN SELECT ...
mysql> SELECT ...
mysql> ALTER ...
mysql> show profiles;
+----------+------------+-------------------------
| Query_ID | Duration   | Query
+----------+------------+-------------------------
|        1 | 0.00036500 | EXPLAIN SELECT sbvi.id a
|        2 | 0.00432700 | SELECT sbvi.id as sbvi_i
|        3 | 2.83206100 | alter table sbvi drop in
|        4 | 0.00047500 | explain SELECT sbvi.id a
|        5 | 0.00367100 | SELECT sbvi.id as sbvi_i
+----------+------------+-------------------------

More information at Show Profiles documentation page.

Why is my database slow?

Tuesday, May 11th, 2010

Not part of my Don’t Assume series, but when a client states “Why is my database slow”", you need to determine if indeed the database is slow.

Some simple tools come to the rescue here, one is Firebug. If a web page takes 5 seconds to load, but the .htm file takes 400ms, and the 100+ assets being downloaded from one base url, then is the database actually slow? Tuning the database will only improve the 400ms portion of 5,000ms download.

There some very simple tips here. MySQL is my domain expertise and I will not profess to improving the entire stack however perception is everything to a user and you can often do a lot. Some simple points include:

  • Know about blocking assets in your <head> element, e.g. .js files.
  • Streamline .js, .css and images to what’s needed. .e.g. download a 100k image only to resize to a thumbnail via style elements.
  • Sprites. Like many efficient but simple SQL statements, network overhead is your greatest expense.
  • Splitting images to a different domain.
  • Splitting images to multiple domains (e.g. 3 via CNAME only needed.) — Hint: Learn about the protocol
  • Cookieless domains for static assets
  • Lighter web container for static assets (e.g. nginx, lighttpd)
  • Know about caching, expires and etags
  • Stripping out http://ww.domain.com from all your internal links (that one alone saved 12% of HTML page size for a client). You may ask is that really a big deal, well in a high volume site the sooner you can release the socket on your webserver, the sooner you can start serving a different request.

Like tuning a database, some things work better then others, some require more testing then others, and consultants never tell you all the tricks.

References

As with everything in tuning, do your research and also determine what works in your environment and what doesn’t. Two excellent resources to start with are Steve Souders and Best Practices for Speeding Up Your Web Site by Yahoo.

2010 MySQL Conference Presentations

Monday, April 19th, 2010

I have uploaded my three presentations from the 2010 MySQL Users Conference in Santa Clara, California which was my 5th consecutive year appearing as a speaker.

A full history of my MySQL presentations can be found on the Presenting page.

Ineffective concatenated indexes

Wednesday, February 24th, 2010

In MySQL significant performance improvements can be achieved by the correct use of indexes. It is important to understand different MySQL index implementations and one key improvement on indexes defined on single columns is to use multiple column or more commonly known concatenated indexes.

However it’s also possible to define ineffective indexes. This example shows you how to identify a concatenated index that is ineffective.

CREATE TABLE example (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  a  INT UNSIGNED NOT NULL,
  b  INT UNSIGNED NOT NULL,
  c  INT UNSIGNED NOT NULL,
  d  INT UNSIGNED NOT NULL,
  x  VARCHAR(10),
  y  VARCHAR(10),
  z  VARCHAR(10),
PRIMARY KEY (id),
UNIQUE INDEX (a,b,c,d)
) ENGINE=InnoDB;

INSERT INTO example(a,b,c,d) VALUES
(1,0,1,1),(1,0,1,2), (1,0,2,3), (1,0,4,5),
(2,0,2,1),(2,0,2,2), (2,0,2,3), (2,0,2,5),
(3,0,2,1),(3,0,2,3), (3,0,3,3), (3,0,3,5);

And our sample query is

SELECT id,x,y,z
FROM   example
WHERE  a = 2
AND    c = 2
AND    d = 3;

The EXPLAIN plan is

+----+-------------+---------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+---------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | example | ref  | a             | a    | 4       | const |    4 | Using where |
+----+-------------+---------+------+---------------+------+---------+-------+------+-------------+

While we are using the index (see the key column), the full benefit of the index is not utilized (see the key_len). 4 indicates the number of bytes used, that is only 1 INT column.

Let’s look at a second example.

SELECT id,x,y,z
FROM   example
WHERE  a = 2
AND    b = 0
AND    c = 2
AND    d = 3;

+----+-------------+---------+-------+---------------+------+---------+-------------------------+------+-------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref                     | rows | Extra |
+----+-------------+---------+-------+---------------+------+---------+-------------------------+------+-------+
|  1 | SIMPLE      | example | const | a             | a    | 16      | const,const,const,const |    1 |       |
+----+-------------+---------+-------+---------------+------+---------+-------------------------+------+-------+

In this example the index is used however the key_len is 16, that is 4 x 4 byte INT columns. This is ideal for this index.

In the above example, the client was using a common data structure however was not using one column, it’s values were all effectively 0. Certain queries were written with this knowledge and instead of specifying the column, they elected to remove it, however the impact of this developer code change was increased load on the database and more inefficient performance.

While this was easily addressed by a code change, an alternative could have been to change the index definition. It was not possible to remove the column due to legacy requirements.

ALTER TABLE example
DROP INDEX a,  ADD UNIQUE INDEX (a,c,d);
mysql> explain SELECT id,x,y,z FROM   example  WHERE  a = 2 AND    c = 2 AND    d = 3;
+----+-------------+---------+-------+---------------+------+---------+-------------------+------+-------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref               | rows | Extra |
+----+-------------+---------+-------+---------------+------+---------+-------------------+------+-------+
|  1 | SIMPLE      | example | const | a             | a    | 12      | const,const,const |    1 |       |
+----+-------------+---------+-------+---------------+------+---------+-------------------+------+-------+

A better solution may be to enforce better integrity on this business rule, that b only contains 0. MySQL does not support check constraint. The closest option would be to use an ENUM data type and a STRICT sql_mode however ENUM is only a string object so this would break other code.

ALTER TABLE example MODIFY b ENUM('0') NOT NULL DEFAULT '0';

Not only is performance impacted in this situation, in other examples of ineffective indexes it’s simply a waste of diskspace to sort additional columns in an index that are never used, and also a waste of memory as the index pages stored in memory contain information that is not used.

10x Performance Improvements in MySQL – A Case Study

Sunday, February 7th, 2010

The slides for my presentation at FOSDEM 2010 are now available online at slideshare. In this presentation I describe a successful client implementation with the result of 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.

The end result was a page load improvement from 700+ms load time to a a consistent 60ms.

Speaking at MySQL UC 2010

Wednesday, January 20th, 2010

My talk on 10x performance improvements – A case study has just been approved for the 2010 MySQL Conference. This will be my 5th straight year speaking at the MySQL conferences. For those in Europe wanting a sneak peek I am also speaking at FOSDEM 2010 in Brussels on Feb 7th where I’ll be giving an abridged version.

As an independent MySQL consultant, my work generally covers performance tuning and scalability and sometimes database architecture. Often however my work involves a review of a given problem and recommendations (immediate, short and long term). I’m rarely involved in the full implementation and generally do not see the full fruits of the proposed work.

Recently however I was able to work with a client, first in resolving critical performance issues and then in a review of the application architecture and MySQL environment, provide recommendations and also helping internal resources in the successful implementation. The result was a very successful engagement and an ideal case study on a strategy for tackling improving performance of an application using MySQL.

With an existing environment that included a MySQL master, 3 database slaves and 6 web servers this application provided the grounds of a well sized configuration and the need for greater availability and scalability.

The Call for Papers for the 2010 MySQL conference is still open. If you would like to share your experiences with MySQL submit your talk today.


O'Reilly MySQL Conference & Expo 2010

SQL Analysis with MySQL Proxy – Part 2

Thursday, September 3rd, 2009

As I outlined in Part 1 MySQL Proxy can be one tool for performing SQL analysis. The impact with any monitoring is the art of monitoring will affect the results, in this case the performance. I don’t recommend enabling this level of detailed monitoring in production, these techniques are designed for development, testing, and possibly stress testing.

This leads to the question, how do I monitor SQL in production? The simple answer to this question is, Sampling. Take a representative sample of your production system. The implementation of this depends on many factors including your programming technology stack, and your MySQL topology.

If for example you are using PHP, then defining MySQL proxy on a production system, and executing firewall rules to redirect incoming 3306 traffic to 4040 for a period of time, e.g. 2 seconds can provide a wealth of information as to what’s happening on the server now. I have used this very successfully in production as an information gathering an analysis tool. It is also reasonably easy to configure, execute and the impact on any failures for example are minimized due to the sampling time.

If you run a distributed environment with MySQL Slaves, or many application servers, you can also introduce sampling to a certain extent as these specific points, however like scaling options, it is key to be able to handle and process the write load accurately.

Another performance improvement is to move processing of the gathered information in MySQL proxy to a separate thread or process, removing this work from the thread execution path and therefore increasing the performance. I’m interested to explore the option of passing this information off to memcached or gearman and having MySQL proxy simply capture the packet information and distributing the output. I have yet to see how memcached and/or gearman integrate with the Lua/C bindings. If anybody has experience or knowledge I would be interested to know more.

It is interesting to know that Drizzle provides a plugin to send this level of logging information to gearman automatically.

Understanding Different MySQL Index Implementations

Wednesday, July 22nd, 2009

It is important to know and understand that while indexing columns in MySQL will generally improve performance, using the appropriate type of index can make a greater impact on performance.

There are four general index types to consider when creating an appropriate index to optimize SQL queries.

  • Column Index
  • Concatenated Index
  • Covering Index
  • Partial Index

For the purpose of this discussion I am excluding other specialized index types such as fulltext, spatial and hash in memory engine.

Example Table

For the following examples, I will use this test table structure.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_name VARCHAR(20) NOT NULL,
  first_name VARCHAR(30) NOT NULL,
  last_name VARCHAR(30) NOT NULL,
  external_id INT UNSIGNED NOT NULL,
  country_id SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY(id)
) ENGINE=InnoDB;

Column Index

Quite simply, you have an index on a single column to help with performance. For example, if you were to query your data on external_id, without an index the system will need to read all data pages and then sequential scan pages to identify matching records. As there is no information known about how many rows satisfy the criteria, all data must be read. You can confirm this with the QEP.

SELECT id, user_name
FROM   t1
WHERE external_id = 1;

By adding an index to external_id, the query is optimized to only look at records that satisfy your criteria.

ALTER TABLE t1
  ADD INDEX (external_id);

Concatenated Index

I often see many single column indexes on tables, when these are simply not needed, and generally will be not used. This is easily identified when looking at the QEP and seeing multiple 3,4,5 possible keys.
You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

A concatenated index uses multiple columns. Let’s look a modified version of our query.

SELECT id, user_name
FROM   t1
WHERE external_id = 1
AND      country_id = 5;

The original external_id index will be used, however if we create a concatenated index on external_id and country_id we improve the query path.

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, country_id);

What about an index on country_id, external_id? If your access to your data always includes these two columns, you can consider swapping the columns based on the cardinality. However, if you have queries that search on external_id or external_id and country_id, then creating an index on country_id, external_id will not be used.

Tip In the QEP look at the key length to determine how effective concatenated indexes are.

Covering Index

A covering index as the name describes covers all columns in a query. The benefit of a covering index is that the lookup of the various Btree index pages necessary satisfies the query, and no additional data page lookups are necessary.

If we revisit our earlier example, by modifying the external_id index, and create a concatenated index on external_id and user_name we actually satisfy

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, user_name);
SELECT id, user_name
FROM   t1
WHERE external_id = 1;

With MySQL, the QEP will indicate in Extra, ‘Using Index’. This is not a reference to the index actually being used, but the index satisfies all requirements of the query.

Partial Index

The final type is the partial index. This is a MySQL feature which allows you specify a subset of a column for the index.

Let’s say we query data and allow pattern matching on last name.

SELECT id, first_name, last_name, user_name
FROM   t1
WHERE last_name like 'A%'

We should add an index to last_name to improve performance.

ALTER TABLE t1
  ADD INDEX (last_name);

Depending on the average length of data in last_name (you can use PROCEDURE ANALYSE as a quick tool to sample this), creating a partial index may greatly reduce the size of the index, and minimize the additional data lookups required.

ALTER TABLE t1
  DROP INDEX last_name,
  ADD INDEX (last_name(10));

In this example, you would want to investigate the size of the index, the improvement, and then the amount of additional reads necessary for sample queries. If your accessed data is generally hot, then the benefit of a smaller index will not be impacted by additional data seeks.

Conclusion

As with any performance tuning, sufficient analysis and before and after testing is necessary for your specific environment.

Some future topics on indexes not discussed here include:

  • Using UNIQUE Indexe
  • The impact of NULL columns and values on indexes
  • Eliminating filesort by using indexes
  • The affect of too many indexes
  • Index cardinality

You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

I common question I am also asked is about function based indexes? MySQL provides no means to use a scalar function against a column in an index.