Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Posts Tagged ‘performance’

Ineffective concatenated indexes

Wednesday, February 24th, 2010

In MySQL significant performance improvements can be achieved by the correct use of indexes. It is important to understand different MySQL index implementations and one key improvement on indexes defined on single columns is to use multiple column or more commonly known concatenated indexes.

However it’s also possible to define ineffective indexes. This example shows you how to identify a concatenated index that is ineffective.

CREATE TABLE example (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  a  INT UNSIGNED NOT NULL,
  b  INT UNSIGNED NOT NULL,
  c  INT UNSIGNED NOT NULL,
  d  INT UNSIGNED NOT NULL,
  x  VARCHAR(10),
  y  VARCHAR(10),
  z  VARCHAR(10),
PRIMARY KEY (id),
UNIQUE INDEX (a,b,c,d)
) ENGINE=InnoDB;

INSERT INTO example(a,b,c,d) VALUES
(1,0,1,1),(1,0,1,2), (1,0,2,3), (1,0,4,5),
(2,0,2,1),(2,0,2,2), (2,0,2,3), (2,0,2,5),
(3,0,2,1),(3,0,2,3), (3,0,3,3), (3,0,3,5);

And our sample query is

SELECT id,x,y,z
FROM   example
WHERE  a = 2
AND    c = 2
AND    d = 3;

The EXPLAIN plan is

+----+-------------+---------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+---------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | example | ref  | a             | a    | 4       | const |    4 | Using where |
+----+-------------+---------+------+---------------+------+---------+-------+------+-------------+

While we are using the index (see the key column), the full benefit of the index is not utilized (see the key_len). 4 indicates the number of bytes used, that is only 1 INT column.

Let’s look at a second example.

SELECT id,x,y,z
FROM   example
WHERE  a = 2
AND    b = 0
AND    c = 2
AND    d = 3;

+----+-------------+---------+-------+---------------+------+---------+-------------------------+------+-------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref                     | rows | Extra |
+----+-------------+---------+-------+---------------+------+---------+-------------------------+------+-------+
|  1 | SIMPLE      | example | const | a             | a    | 16      | const,const,const,const |    1 |       |
+----+-------------+---------+-------+---------------+------+---------+-------------------------+------+-------+

In this example the index is used however the key_len is 16, that is 4 x 4 byte INT columns. This is ideal for this index.

In the above example, the client was using a common data structure however was not using one column, it’s values were all effectively 0. Certain queries were written with this knowledge and instead of specifying the column, they elected to remove it, however the impact of this developer code change was increased load on the database and more inefficient performance.

While this was easily addressed by a code change, an alternative could have been to change the index definition. It was not possible to remove the column due to legacy requirements.

ALTER TABLE example
DROP INDEX a,  ADD UNIQUE INDEX (a,c,d);
mysql> explain SELECT id,x,y,z FROM   example  WHERE  a = 2 AND    c = 2 AND    d = 3;
+----+-------------+---------+-------+---------------+------+---------+-------------------+------+-------+
| id | select_type | table   | type  | possible_keys | key  | key_len | ref               | rows | Extra |
+----+-------------+---------+-------+---------------+------+---------+-------------------+------+-------+
|  1 | SIMPLE      | example | const | a             | a    | 12      | const,const,const |    1 |       |
+----+-------------+---------+-------+---------------+------+---------+-------------------+------+-------+

A better solution may be to enforce better integrity on this business rule, that b only contains 0. MySQL does not support check constraint. The closest option would be to use an ENUM data type and a STRICT sql_mode however ENUM is only a string object so this would break other code.

ALTER TABLE example MODIFY b ENUM('0') NOT NULL DEFAULT '0';

Not only is performance impacted in this situation, in other examples of ineffective indexes it’s simply a waste of diskspace to sort additional columns in an index that are never used, and also a waste of memory as the index pages stored in memory contain information that is not used.

10x Performance Improvements in MySQL – A Case Study

Sunday, February 7th, 2010

The slides for my presentation at FOSDEM 2010 are now available online at slideshare. In this presentation I describe a successful client implementation with the result of 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.

The end result was a page load improvement from 700+ms load time to a a consistent 60ms.

Speaking at MySQL UC 2010

Wednesday, January 20th, 2010

My talk on 10x performance improvements – A case study has just been approved for the 2010 MySQL Conference. This will be my 5th straight year speaking at the MySQL conferences. For those in Europe wanting a sneak peek I am also speaking at FOSDEM 2010 in Brussels on Feb 7th where I’ll be giving an abridged version.

As an independent MySQL consultant, my work generally covers performance tuning and scalability and sometimes database architecture. Often however my work involves a review of a given problem and recommendations (immediate, short and long term). I’m rarely involved in the full implementation and generally do not see the full fruits of the proposed work.

Recently however I was able to work with a client, first in resolving critical performance issues and then in a review of the application architecture and MySQL environment, provide recommendations and also helping internal resources in the successful implementation. The result was a very successful engagement and an ideal case study on a strategy for tackling improving performance of an application using MySQL.

With an existing environment that included a MySQL master, 3 database slaves and 6 web servers this application provided the grounds of a well sized configuration and the need for greater availability and scalability.

The Call for Papers for the 2010 MySQL conference is still open. If you would like to share your experiences with MySQL submit your talk today.


O'Reilly MySQL Conference & Expo 2010

SQL Analysis with MySQL Proxy – Part 2

Thursday, September 3rd, 2009

As I outlined in Part 1 MySQL Proxy can be one tool for performing SQL analysis. The impact with any monitoring is the art of monitoring will affect the results, in this case the performance. I don’t recommend enabling this level of detailed monitoring in production, these techniques are designed for development, testing, and possibly stress testing.

This leads to the question, how do I monitor SQL in production? The simple answer to this question is, Sampling. Take a representative sample of your production system. The implementation of this depends on many factors including your programming technology stack, and your MySQL topology.

If for example you are using PHP, then defining MySQL proxy on a production system, and executing firewall rules to redirect incoming 3306 traffic to 4040 for a period of time, e.g. 2 seconds can provide a wealth of information as to what’s happening on the server now. I have used this very successfully in production as an information gathering an analysis tool. It is also reasonably easy to configure, execute and the impact on any failures for example are minimized due to the sampling time.

If you run a distributed environment with MySQL Slaves, or many application servers, you can also introduce sampling to a certain extent as these specific points, however like scaling options, it is key to be able to handle and process the write load accurately.

Another performance improvement is to move processing of the gathered information in MySQL proxy to a separate thread or process, removing this work from the thread execution path and therefore increasing the performance. I’m interested to explore the option of passing this information off to memcached or gearman and having MySQL proxy simply capture the packet information and distributing the output. I have yet to see how memcached and/or gearman integrate with the Lua/C bindings. If anybody has experience or knowledge I would be interested to know more.

It is interesting to know that Drizzle provides a plugin to send this level of logging information to gearman automatically.

Understanding Different MySQL Index Implementations

Wednesday, July 22nd, 2009

It is important to know and understand that while indexing columns in MySQL will generally improve performance, using the appropriate type of index can make a greater impact on performance.

There are four general index types to consider when creating an appropriate index to optimize SQL queries.

  • Column Index
  • Concatenated Index
  • Covering Index
  • Partial Index

For the purpose of this discussion I am excluding other specialized index types such as fulltext, spatial and hash in memory engine.

Example Table

For the following examples, I will use this test table structure.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_name VARCHAR(20) NOT NULL,
  first_name VARCHAR(30) NOT NULL,
  last_name VARCHAR(30) NOT NULL,
  external_id INT UNSIGNED NOT NULL,
  country_id SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY(id)
) ENGINE=InnoDB;

Column Index

Quite simply, you have an index on a single column to help with performance. For example, if you were to query your data on external_id, without an index the system will need to read all data pages and then sequential scan pages to identify matching records. As there is no information known about how many rows satisfy the criteria, all data must be read. You can confirm this with the QEP.

SELECT id, user_name
FROM   t1
WHERE external_id = 1;

By adding an index to external_id, the query is optimized to only look at records that satisfy your criteria.

ALTER TABLE t1
  ADD INDEX (external_id);

Concatenated Index

I often see many single column indexes on tables, when these are simply not needed, and generally will be not used. This is easily identified when looking at the QEP and seeing multiple 3,4,5 possible keys.
You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

A concatenated index uses multiple columns. Let’s look a modified version of our query.

SELECT id, user_name
FROM   t1
WHERE external_id = 1
AND      country_id = 5;

The original external_id index will be used, however if we create a concatenated index on external_id and country_id we improve the query path.

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, country_id);

What about an index on country_id, external_id? If your access to your data always includes these two columns, you can consider swapping the columns based on the cardinality. However, if you have queries that search on external_id or external_id and country_id, then creating an index on country_id, external_id will not be used.

Tip In the QEP look at the key length to determine how effective concatenated indexes are.

Covering Index

A covering index as the name describes covers all columns in a query. The benefit of a covering index is that the lookup of the various Btree index pages necessary satisfies the query, and no additional data page lookups are necessary.

If we revisit our earlier example, by modifying the external_id index, and create a concatenated index on external_id and user_name we actually satisfy

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, user_name);
SELECT id, user_name
FROM   t1
WHERE external_id = 1;

With MySQL, the QEP will indicate in Extra, ‘Using Index’. This is not a reference to the index actually being used, but the index satisfies all requirements of the query.

Partial Index

The final type is the partial index. This is a MySQL feature which allows you specify a subset of a column for the index.

Let’s say we query data and allow pattern matching on last name.

SELECT id, first_name, last_name, user_name
FROM   t1
WHERE last_name like 'A%'

We should add an index to last_name to improve performance.

ALTER TABLE t1
  ADD INDEX (last_name);

Depending on the average length of data in last_name (you can use PROCEDURE ANALYSE as a quick tool to sample this), creating a partial index may greatly reduce the size of the index, and minimize the additional data lookups required.

ALTER TABLE t1
  DROP INDEX last_name,
  ADD INDEX (last_name(10));

In this example, you would want to investigate the size of the index, the improvement, and then the amount of additional reads necessary for sample queries. If your accessed data is generally hot, then the benefit of a smaller index will not be impacted by additional data seeks.

Conclusion

As with any performance tuning, sufficient analysis and before and after testing is necessary for your specific environment.

Some future topics on indexes not discussed here include:

  • Using UNIQUE Indexe
  • The impact of NULL columns and values on indexes
  • Eliminating filesort by using indexes
  • The affect of too many indexes
  • Index cardinality

You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

I common question I am also asked is about function based indexes? MySQL provides no means to use a scalar function against a column in an index.

Percona Performance Conference Talk

Thursday, April 23rd, 2009

My final presentation during the 2009 MySQL Conference and Expo week was with the Percona Performance Conference on the topic of The Ideal Performance Architecture. My talk included discussions on Technology, Disk, Memory, Indexes, SQL and Data.

Updated 09/18/09
you can now see video of the event at Percona TV.