Selecting wise indexes

Indexes are a great way to improve performed in a MySQL database, when used appropriately.
When used in-appropriately the impact can be a degradation of performance.

The following example from Movable Type shows how when reviewing the slow query log I found numerous occurrences of Inserts take 3 or more seconds, with no reported lock contention time for this insert.

# Query_time: 3  Lock_time: 0  Rows_sent: 0  Rows_examined: 0
SET insert_id=6281;
INSERT INTO mt_comment
(comment_author, comment_blog_id, comment_commenter_id, comment_created_by,
 comment_created_on, comment_email, comment_entry_id, comment_ip, comment_junk_log,
comment_junk_score, comment_junk_status, comment_last_moved_on, comment_modified_by,
comment_modified_on, comment_parent_id, comment_text, comment_url, comment_visible)
VALUES (...)

The impact here, is that SELECT statements to the mt_comment table are also blocked because this table is in MyISAM. It was reviewing slow running SELECT statements that the cause of the slow inserts was easily determined.

mysql> explain SELECT comment_id
    -> FROM mt_comment
    -> WHERE (comment_visible = '1') AND (comment_blog_id = '3') AND (comment_entry_id = '276')
    -> ORDER BY comment_created_on DESC;


*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: mt_comment
         type: ref
possible_keys: mt_comment_visible,mt_comment_entry_id,mt_comment_blog_id,mt_comment_blog_stat,mt_comment_visible_date,mt_comment_entry_visible,mt_comment_blog_visible,mt_comment_blog_ip_date,mt_comment_blog_url
          key: mt_comment_entry_visible
      key_len: 6
          ref: const,const
         rows: 99
        Extra: Using where
1 row in set (0.00 sec)


CREATE TABLE `mt_comment` (
  `comment_id` int(11) NOT NULL auto_increment,
  `comment_author` varchar(100) default NULL,
  `comment_blog_id` int(11) NOT NULL default '0',
  `comment_commenter_id` int(11) default NULL,
  `comment_created_by` int(11) default NULL,
  `comment_created_on` datetime default NULL,
  `comment_email` varchar(75) default NULL,
  `comment_entry_id` int(11) NOT NULL default '0',
  `comment_ip` varchar(16) default NULL,
  `comment_junk_log` mediumtext,
  `comment_junk_score` float default NULL,
  `comment_junk_status` smallint(6) default '0',
  `comment_last_moved_on` datetime NOT NULL default '2000-01-01 00:00:00',
  `comment_modified_by` int(11) default NULL,
  `comment_modified_on` datetime default NULL,
  `comment_parent_id` int(11) default NULL,
  `comment_text` mediumtext,
  `comment_url` varchar(255) default NULL,
  `comment_visible` tinyint(4) default NULL,
  PRIMARY KEY  (`comment_id`),
  KEY `mt_comment_commenter_id` (`comment_commenter_id`),
  KEY `mt_comment_visible` (`comment_visible`),
  KEY `mt_comment_junk_score` (`comment_junk_score`),
  KEY `mt_comment_ip` (`comment_ip`),
  KEY `mt_comment_parent_id` (`comment_parent_id`),
  KEY `mt_comment_entry_id` (`comment_entry_id`),
  KEY `mt_comment_email` (`comment_email`),
  KEY `mt_comment_last_moved_on` (`comment_last_moved_on`),
  KEY `mt_comment_created_on` (`comment_created_on`),
  KEY `mt_comment_junk_status` (`comment_junk_status`),
  KEY `mt_comment_blog_id` (`comment_blog_id`),
  KEY `mt_comment_blog_stat` (`comment_blog_id`,`comment_junk_status`,`comment_created_on`),
  KEY `mt_comment_visible_date` (`comment_visible`,`comment_created_on`),
  KEY `mt_comment_entry_visible` (`comment_entry_id`,`comment_visible`,`comment_created_on`),
  KEY `mt_comment_blog_visible` (`comment_blog_id`,`comment_visible`,`comment_created_on`,`comment_id`),
  KEY `mt_comment_blog_ip_date` (`comment_blog_id`,`comment_ip`,`comment_created_on`),
  KEY `mt_comment_junk_date` (`comment_junk_status`,`comment_created_on`),
  KEY `mt_comment_blog_url` (`comment_blog_id`,`comment_visible`,`comment_url`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

As you can see, the table has 18 indexes. This means that for every row inserted, 18 separate index inserts are required.

When adding an Index to a table, first determine the usage patterns that will use the index, consolidating indexes when possible and removing obvious duplicates (in the above example, the single column comment_blog_id is a classic duplicate index).

Adding an index will generally help SELECT performance, depending on cardinality, but will always impact INSERT,UPDATE and DELETE performance.
Another down side of too many indexes is the MySQL optimizer has much more work to do to eliminate beneficial indexes for every Query Execution Plan (QEP) that is undertaken.

Indeed I have seen worse, in one case a table with ~120 columns, move then 20 single column indexes AND a 3 part primary key summing 40 bytes in InnoDB. The impact was terrible for performance, with the Index size being 3x times the data size.


About the Author

Ronald Bradford , Principal of 42SQL provides Consulting and Advisory Services in Data Architecture, Performance and Scalability for MySQL Solutions. An IT industry professional for two decades with extensive database experience in MySQL, Oracle and Ingres his expertise covers data architecture, software development, migration, performance analysis and production system implementations. His knowledge from 10 years of specialized consulting across many industry sectors, technologies and countries has provided unique insight into being able to provide solutions to problems. For more information Contact Ronald .

Tagged with: Databases MySQL

Related Posts

Tracking new AWS Database Infrastructure Availability

AWS can drop 10+ articles a day just in the What’s New feed. You either need an eagle eye, or luck to keep up if you run multiple AWS database products across multiple regions and managing infrastructure.

Read more

Evaluating Readyset Caching for MySQL

Readyset is a database caching solution for MySQL and PostgreSQL . For applications that have increased load on your primary database, or use scale-out infrastructure to support a high-read workload, ReadySet can be a drop-in solution to address current performance issues.

Read more

Creating a More Realistic Benchmark

Common benchmark approaches fall into two general categories, synthetic testing and realistic testing. You have the most generic operations from a synthetic test, starting with the most simple example using a single table, a single column, and for a single DML operation.

Read more