Percona Live Presentation: Improving Performance With Better Indexes

The slides for my Improving Performance With Better Indexes presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss how to identify, review and analyze SQL statements in order to create better indexes for your queries. This includes understanding the EXPLAIN syntax and how to create and identify covering and partial column indexes.

This presentation is based on the work with a customer showing the 95% improvement of a key 15 table join query running 15,000 QPS in a 25 billion SQL statements per day infrastructure.

As mentioned, Explaining the MySQL Explain is an additional presentation that goes into more detail for learning how to read Query Execution Plans (QEP) in MySQL.

Explain this

The EXPLAIN command is an important tool to review how a SQL query is executed and in this example includes what indexes are used.

By adding a covering index I ended up with the following EXPLAIN plan I was unable to explain. The end result was a boost in server performance which was the ultimate goal.

mysql> explain select max(md)  from e_r  where email = '[email protected]' and id = '36981';
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

The queries still produced the expected results.

Understanding Different MySQL Index Implementations

It is important to know and understand that while indexing columns in MySQL will generally improve performance, using the appropriate type of index can make a greater impact on performance.

There are four general index types to consider when creating an appropriate index to optimize SQL queries.

  • Column Index
  • Concatenated Index
  • Covering Index
  • Partial Index

For the purpose of this discussion I am excluding other specialized index types such as fulltext, spatial and hash in memory engine.

Example Table

For the following examples, I will use this test table structure.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_name VARCHAR(20) NOT NULL,
  first_name VARCHAR(30) NOT NULL,
  last_name VARCHAR(30) NOT NULL,
  external_id INT UNSIGNED NOT NULL,
  country_id SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY(id)
) ENGINE=InnoDB;

Column Index

Quite simply, you have an index on a single column to help with performance. For example, if you were to query your data on external_id, without an index the system will need to read all data pages and then sequential scan pages to identify matching records. As there is no information known about how many rows satisfy the criteria, all data must be read. You can confirm this with the QEP.

SELECT id, user_name
FROM   t1
WHERE external_id = 1;

By adding an index to external_id, the query is optimized to only look at records that satisfy your criteria.

ALTER TABLE t1
  ADD INDEX (external_id);

Concatenated Index

I often see many single column indexes on tables, when these are simply not needed, and generally will be not used. This is easily identified when looking at the QEP and seeing multiple 3,4,5 possible keys.
You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

A concatenated index uses multiple columns. Let’s look a modified version of our query.

SELECT id, user_name
FROM   t1
WHERE external_id = 1
AND      country_id = 5;

The original external_id index will be used, however if we create a concatenated index on external_id and country_id we improve the query path.

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, country_id);

What about an index on country_id, external_id? If your access to your data always includes these two columns, you can consider swapping the columns based on the cardinality. However, if you have queries that search on external_id or external_id and country_id, then creating an index on country_id, external_id will not be used.

Tip In the QEP look at the key length to determine how effective concatenated indexes are.

Covering Index

A covering index as the name describes covers all columns in a query. The benefit of a covering index is that the lookup of the various Btree index pages necessary satisfies the query, and no additional data page lookups are necessary.

If we revisit our earlier example, by modifying the external_id index, and create a concatenated index on external_id and user_name we actually satisfy

ALTER TABLE t1
  DROP INDEX external_id,
  ADD INDEX (external_id, user_name);
SELECT id, user_name
FROM   t1
WHERE external_id = 1;

With MySQL, the QEP will indicate in Extra, ‘Using Index’. This is not a reference to the index actually being used, but the index satisfies all requirements of the query.

Partial Index

The final type is the partial index. This is a MySQL feature which allows you specify a subset of a column for the index.

Let’s say we query data and allow pattern matching on last name.

SELECT id, first_name, last_name, user_name
FROM   t1
WHERE last_name like 'A%'

We should add an index to last_name to improve performance.

ALTER TABLE t1
  ADD INDEX (last_name);

Depending on the average length of data in last_name (you can use PROCEDURE ANALYSE as a quick tool to sample this), creating a partial index may greatly reduce the size of the index, and minimize the additional data lookups required.

ALTER TABLE t1
  DROP INDEX last_name,
  ADD INDEX (last_name(10));

In this example, you would want to investigate the size of the index, the improvement, and then the amount of additional reads necessary for sample queries. If your accessed data is generally hot, then the benefit of a smaller index will not be impacted by additional data seeks.

Conclusion

As with any performance tuning, sufficient analysis and before and after testing is necessary for your specific environment.

Some future topics on indexes not discussed here include:

  • Using UNIQUE Indexe
  • The impact of NULL columns and values on indexes
  • Eliminating filesort by using indexes
  • The affect of too many indexes
  • Index cardinality

You need to also consider in your MySQL Index theory, that in general only one index is used for each table in a MySQL query. There are a few exceptions however these are rare.

I common question I am also asked is about function based indexes? MySQL provides no means to use a scalar function against a column in an index.