Identifying Resource Bottlenecks – Disk

With a discussion on identifying CPU and Memory bottlenecks achieved, let us now look at how Disk can affect MySQL performance.

One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources (CPU,Memory,Disk,Network), then identify any obvious bottlenecks in these resources.

There are several commands you can use including vmstat, iostat and sar (both part of the sysstat package) to provide information on disk I/O.

vmstat is a good place to start for an overall view for multiple resources, however for disk it only provides a total system overview.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
  4  0 249256 107768 116360 1519596    0    0     0   112 8151 11256 40  4 56  0
 3  0 249256 107588 116360 1519680    0    0     4  1068 8021 11514 51  2 47  0
 1  0 249256 107396 116360 1519788    0    0     0    88 8880 12832 35  6 60  0
 0  0 249256 107432 116360 1519928    0    0     4    40 9393 14561  8  4 89  0
 2  0 249256 107160 116360 1519988    0    0     4  5636 9790 14245 23  6 71  0
 1  0 249256 107140 116360 1520356    0    0     4   180 9077 13285 33  3 65  0
 3  0 249256 107100 116360 1520352    0    0     0  1516 7970 13099 22  2 75  0
 4  1 249256 107184 116360 1520476    0    0     4   108 9756 15478 67  4 29  0
 2  0 249256 106196 116360 1520652    0    0     0     0 9512 14212 61  4 35  0

We want to look at is bi, bo and wa. The descriptions of these columns from the man page is:

  • bi: Blocks received from a block device (blocks/s).
  • bo: Blocks sent to a block device (blocks/s).
  • wa: Time spent waiting for IO.

As you can see from this above example, there is no disk reading, just disk writing, and there is no wait for I/O. This is good.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 3  2 888552  86716 321940 712480   32  636  5352  1800 18177 22731 31  9 29 31
 0  5 888552  86748 321944 712592    0    0  2072   264 15592 19435 27  6 45 23
 4  5 888552  86296 321944 712796   16    0  5556  8388 15559 19674 28  5 11 55
 4  2 888552  86112 321948 713408   24    0  4404  4936 15215 19354 26  6 20 48
 6  0 888552  85732 321948 713608   56    0  6348  4368 15123 19109 25  5 37 34
 2  3 888552  85188 321956 713936   60    0  3080  4104 16322 21044 29  6 48 18
 2  3 888552  84972 321964 714376   20    0  4464 10852 20483 26013 33  9 25 34
 1 10 888552  84772 321980 714724   12    0  9332 12868 16981 21948 28  6 19 48
 2  3 888552  84080 321988 714952  112    0 11144  8944 15973 20190 27  6  1 65

In this above example we see a production system that has high disk reads and writes, and wait I/O is high. If you see the CPU waiting for Disk I/O at 60%-70%-80% you have effectively reached disk saturation.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0 28      0  14436   4616 2405504    0    0     0     2  101   92  0  0  0 100
 0  4      0  14320   4616 2405504    0    0   120  2300  191  213  0  0  1 98
 0  5      0  15064   4616 2405056    0    0  1688    62 1961 2080  2  0 32 63
 0  4      0  14136   4616 2405916    0    0   952   272  584  569  0  0  0 98
 0  5      0  16864   4624 2403068    0    0   336    76  886 1389  2  1  0 97
 0  3      0  16052   4624 2403928    0    0   800   288  373  352  0  0  0 99
 0  3      0  15380   4624 2404556    0    0   720   240  373  357  0  0  0 98
 0  3      0  14452   4624 2405588    0    0   912   400  330  324  1  0  0 97
 0 57      0  15688   4624 2404380    0    0  1956   893  439  828  1  0  0 97
 0 56      0  15572   4632 2404544    0    0   208    12  947 1402  2  0  0 97
 0 53      0  14784   4632 2405320    0    0   784     1  310  261  0  0  0 98
 0 50      0  14436   4632 2405664    0    0   288   120  175  140  0  0  0 99
 0 50      0  14228   4632 2405688    0    0   160     0   97   85  0  0  0 100
 1 49      0  14112   4632 2406032    0    0   276     0  183  184  0  0  0 100

In the above example the system is past total disk saturation. The system is waiting completely for disk. This is the output of an actual production system. This is a good example because it is important to look at all three figures. Look at how low the amount of bi/bo in ration to wa. This is an indicator of a potential underlying disk subsystem problem, and in-fact this system failed within 24 hours of this output. I have never see 100% Disk Wait I/O before this example.

To get an indication of load on a multi disk system you need to use iostat for example. In the following example, we have an idle system with two drives. I simulate load on sdb with the following command.

$ dd if=/dev/zero of=/disk1/dd.out bs=1024k count=10000
$ iostat 5
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.80         1.60        97.60          8        488
sdb               2.80         0.00        25.60          0        128

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.10    0.00    0.05    0.10    0.00   99.75

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               5.80        19.20       225.60         96       1128
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.80    0.00   11.86    6.30    0.00   81.04

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              19.20        17.60       294.40         88       1472
sdb              18.20         0.00     18246.40          0      91232

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.40    0.00    3.15   34.88    0.00   61.57

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              35.20        30.40     14409.60        152      72048
sdb              86.20         1.60     83763.20          8     418816

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.30    0.00    1.45   38.87    0.00   58.38

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              30.80         4.80       451.20         24       2256
sdb              84.40         0.00     84172.80          0     420864

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.40    0.00    1.15   35.43    0.00   61.02

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              15.40         3.20       270.40         16       1352
sdb              84.80         0.00     83353.60          0     416768

You can see from this example, the increase in blocks written to sdb, and overal %iowait.

If you want to do a blanket test of your disk subsystem you should consider a program that adequately test the different interactions, especially any caching or battery backed cached that your system may have. I have written several articles on using Bonnie++ including Using Bonnie++, Extending Bonnie++ and Bonnie++ Results.

There are a number of variables that make it more difficult to audit disks. RAID configuration is often difficult as this involves custom OS/RAID provider commands. The disk controller cache, and battery backed cache (BBRU) are just two factors. It is important you know these commands, you study them and most importantly you know when your system is running in a degraded mode. The popular Dell 1950/2950 etc series generally have PERC 5/6i cards, you can use the folllowing as a guide to Understanding PERC RAID Controllers.

My 60 second take on RAID and Disk Configuration. I concur with Matt Yonkivit. You should separate your OS from your database on disk, RAID 1 (2 drives) works fine for the OS. For databases, in theory (pure math when understanding RAID), RAID10 is better then RAID5.