Percona Live Presentation: Improving Performance With Better Indexes

The slides for my Improving Performance With Better Indexes presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss how to identify, review and analyze SQL statements in order to create better indexes for your queries. This includes understanding the EXPLAIN syntax and how to create and identify covering and partial column indexes.

This presentation is based on the work with a customer showing the 95% improvement of a key 15 table join query running 15,000 QPS in a 25 billion SQL statements per day infrastructure.

As mentioned, Explaining the MySQL Explain is an additional presentation that goes into more detail for learning how to read Query Execution Plans (QEP) in MySQL.

Recent Presentations Buenos Aires MySQL/NoSQL/Cloud Conference

The first annual Latin America MySQL/NoSQL/Cloud Conference was held in Buenos Aires Argentina from June 26-28. Kudos to Santiago Lertora from Binlogic who had the vision for the conference in his country and made it happen. I look forward to the second annual event.

My first presentation was “Improving Performance with Better Indexes”. This presentation details the six steps to SQL performance analysis, Capture, Identify, Confirm, Analyze, Optimize and Verify. An explanation of MySQL EXPLAIN, and working examples to create indexes and better covering indexes in several examples are provided. A production example of a 13 table join is used to detail how covering indexes and partial column indexes can make a dramatic improvement in performance. Download Presentation (PDF).

More detailed information about EXPLAIN and creating indexes is available in book Effective MySQL: Optimizing SQL Statements.

My second presentation was “MySQL Backup and Recovery Essentials”. This presentation covers the most common options for MySQL backup and the respective restore options. Also covered is the importance of the master binary logs and point in time recovery capabilities. Download Presentation (PDF)

More detailed information about the right backup and recovery strategy and associated tools is available in book Effective MySQL: Backup and Recovery.

References

Latin America MySQL/NoSQL/Cloud Conference Program.

Utilizing multiple indexes per MySQL table join

Historically it was considered that MySQL will generally use only one index per referenced table in a SQL query. In MySQL 5.0 the introduction of merge indexes enabled for certain conditions the possibility to utilize two indexes however this could result in worst performance then creating a better index. In MySQL 5.1 it became possible to control optimization switches with the optimizer_switch system variable.

However in explaining how to utilize the intersection, union and sort union in queries I discovered that MySQL could use three indexes for one given table.

        Extra: Using union(name,intersect(founded,type)); Using where

I was not aware of this.

Extra: Using Index

Many people consider this information in the MySQL Query Execution Plan (QEP) to indicate that the referenced table is using an index. It actually means that ONLY the index is used. This can for larger and more frequent queries provide a significant boost.

In a recent example, adding an index dropped a query from 190ms to 6ms. However adding a better index dropped that 6ms query to 1.2ms. When executed 100s/1000s of times per second these millisecond improvements can have a huge benefit in greater scalability. While people often tune slow running queries, in a well tuned system shaving milliseconds of queries, in this example making 6ms query 80% better is a far greater improvement.

You can get a detailed explanation of how to identify, create and verify covering indexes from my Percona Live presentation Improving performance with better indexes where I also include another great 10 table join example, reducing a query running 20,000+ times per second from 175ms to 10ms.

Optimizing UPDATE and DELETE statements

Updated Nov 2011. Check out my latest book on Optimizing SQL Statements for more information. MySQL 5.6.2 also now provides an EXPLAIN syntax for UPDATE and DELETE statements natively.

While most people look at performance optimizations for SELECT statements, UPDATE and DELETE statements are often overlooked. These can benefit from the principles of analyzing the Query Execution Plan (QEP). You can only run an EXPLAIN on a SELECT statement, however it’s possible to rewrite an UPDATE or DELETE statement to perform like a SELECT statement.

To optimize an UPDATE, look at the WHERE clause. If you are using the PRIMARY KEY, no further analysis is necessary. If you are not, it is of benefit to rewrite your UPDATE statement as a SELECT statement and obtain a QEP as previously detailed to ensure optimal indexes are used. For example:

UPDATE t
SET	c1 = ‘x’, c2 = ‘y’, c3 = 100
WHERE c1 = ‘x’
AND	d = CURDATE()

You can rewrite this UPDATE statement as a SELECT statement for using EXPLAIN:

EXPLAIN SELECT c1, c2, c3 FROM	t WHERE c1 = ‘x’ AND	d = CURDATE()

You should now apply the same principles as you would when optimizing SELECT statements.

10x Performance Improvements in MySQL – A Case Study

The slides for my presentation at FOSDEM 2010 are now available online at slideshare. In this presentation I describe a successful client implementation with the result of 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.

The end result was a page load improvement from 700+ms load time to a a consistent 60ms.

EXPLAIN – An essential tool for MySQL developers.

Just recently I came across the presentation “Bend SQL to your will with EXPLAIN” by Ligaya Turmelle via the Linked In – MySQL Speakers and Presenters group. Slides available at Slideshare. While I know a little about MySQL, I always enjoy spending time reviewing content when possible, perhaps as a good reference when asked, perhaps to glean some small point of knowledge I didn’t know, or didn’t understand completely, or had forgotten about. It’s also fun to randomly find worthwhile information.

EXPLAIN is an essential tool for MySQL developers, if you don’t know what QEP is, the listed google search link gives you a clear and obvious easy definition (using basic contextual searching techniques). You would then use EXPLAIN to determine the QEP, and then learn how to use it well.

For those that want to learn about EXPLAIN as an essential tool for MySQL developers I recommend you check the presentation out.

Explain this

The EXPLAIN command is an important tool to review how a SQL query is executed and in this example includes what indexes are used.

By adding a covering index I ended up with the following EXPLAIN plan I was unable to explain. The end result was a boost in server performance which was the ultimate goal.

mysql> explain select max(md)  from e_r  where email = '[email protected]' and id = '36981';
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

The queries still produced the expected results.

Understanding Different MySQL Index Implementations

It is important to know and understand that while indexing columns in MySQL will generally improve performance, using the appropriate type of index can make a greater impact on performance.

There are four general index types to consider when creating an appropriate index to optimize SQL queries.

  • Column Index
  • Concatenated Index
  • Covering Index
  • Partial Index

For the purpose of this discussion I am excluding other specialized index types such as fulltext, spatial and hash in memory engine.

Example Table

For the following examples, I will use this test table structure.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_name VARCHAR(20) NOT NULL,
  first_name VARCHAR(30) NOT NULL,
  last_name VARCHAR(30) NOT NULL,
  external_id INT UNSIGNED NOT NULL,
  country_id SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY(id)
) ENGINE=InnoDB;

Column Index

Quite simply, you have an index on a single column to help with performance. For example, if you were to query your data on external_id, without an index the system will need to read all data pages and then sequential scan pages to identify matching records. As there is no information known about how many rows satisfy the criteria, all data must be read. You can confirm this with the QEP.

SELECT id, user_name
FROM   t1
WHERE external_id = 1;

By adding an index to external_id, the query is optimized to only look at records that satisfy your criteria.

ALTER TABLE t1
  ADD INDEX (external_id);

Concatenated Index

I often see many single column indexes on tables, when these are simply not needed, and generally will be not used. This is easily identified when looking at the QEP and seeing multiple 3,4,5 possible keys.
You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

A concatenated index uses multiple columns. Let’s look a modified version of our query.

SELECT id, user_name
FROM   t1
WHERE external_id = 1
AND      country_id = 5;

The original external_id index will be used, however if we create a concatenated index on external_id and country_id we improve the query path.

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, country_id);

What about an index on country_id, external_id? If your access to your data always includes these two columns, you can consider swapping the columns based on the cardinality. However, if you have queries that search on external_id or external_id and country_id, then creating an index on country_id, external_id will not be used.

Tip In the QEP look at the key length to determine how effective concatenated indexes are.

Covering Index

A covering index as the name describes covers all columns in a query. The benefit of a covering index is that the lookup of the various Btree index pages necessary satisfies the query, and no additional data page lookups are necessary.

If we revisit our earlier example, by modifying the external_id index, and create a concatenated index on external_id and user_name we actually satisfy

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, user_name);
SELECT id, user_name
FROM   t1
WHERE external_id = 1;

With MySQL, the QEP will indicate in Extra, ‘Using Index’. This is not a reference to the index actually being used, but the index satisfies all requirements of the query.

Partial Index

The final type is the partial index. This is a MySQL feature which allows you specify a subset of a column for the index.

Let’s say we query data and allow pattern matching on last name.

SELECT id, first_name, last_name, user_name
FROM   t1
WHERE last_name like 'A%'

We should add an index to last_name to improve performance.

ALTER TABLE t1
  ADD INDEX (last_name);

Depending on the average length of data in last_name (you can use PROCEDURE ANALYSE as a quick tool to sample this), creating a partial index may greatly reduce the size of the index, and minimize the additional data lookups required.

ALTER TABLE t1
  DROP INDEX last_name,
  ADD INDEX (last_name(10));

In this example, you would want to investigate the size of the index, the improvement, and then the amount of additional reads necessary for sample queries. If your accessed data is generally hot, then the benefit of a smaller index will not be impacted by additional data seeks.

Conclusion

As with any performance tuning, sufficient analysis and before and after testing is necessary for your specific environment.

Some future topics on indexes not discussed here include:

  • Using UNIQUE Indexe
  • The impact of NULL columns and values on indexes
  • Eliminating filesort by using indexes
  • The affect of too many indexes
  • Index cardinality

You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

I common question I am also asked is about function based indexes? MySQL provides no means to use a scalar function against a column in an index.

A 5.1 QEP nicety – Using join buffer

I was surprised to find yesterday when using MySQL 5.1.26-rc with a client I’m recommending 5.1 to, some information not seen in the EXPLAIN plan before while reviewing SQL Statements.

Using join buffer

+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key          | key_len | ref                    | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | lr    | ALL    | NULL          | NULL         | NULL    | NULL                   |  1084 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ca    | ref    | update_check  | update_check | 4       | XXXXXXXXXXXXXXXXX      |     4 | Using where; Using index                     |
|  1 | SIMPLE      | ce    | ALL    | NULL          | NULL         | NULL    | NULL                   | 13319 | Using where; Using join buffer               |
|  1 | SIMPLE      | co    | eq_ref | PRIMARY       | PRIMARY      | 4       | XXXXXXXXXXXXXXXXX      |     1 | Using where                                  |
+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
4 rows in set (0.00 sec)
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.1.26-rc |
+-----------+
1 row in set (0.00 sec)

Sergey Petrunia of the MySQL Optimizer team writes about this in Use of join buffer is now visible in EXPLAIN.