Jun
29

What’s your disk I/O throughput?

MySQL uses disk. Ok, so everybody knew that. MySQL uses disk in two primary ways.

  • Random I/O (Reading & Writing Data/Index blocks)
  • Sequential I/O (Binary Log, InnoDB Redo Log)

Historically it’s been best practice to separate these onto different spindles, and also to separate the OS and tmp space onto a third spindle. With commodity H/W that can be easily done, but today a lot of people use a SAN. Is this a good thing for a MySQL Database?
That’s a topic for much discussion at a later time, but I’ll add two points. First, a lot of SAN configurations are RAID 5, and RAID 10 is a better choice because writes don’t incur the parity calculation. Second, last week I observed a RAID disk failure, and it took an incredibly long time for the disk to be rebuilt. Just how many SAN users out there have actually timed a disk rebuild on a loaded system and seen the impact on the system in general?

Back on topic: I don’t have access to much variety of hardware, so this is where you, the community, can help. For those who can spare 5 minutes and have some free disk space (< 5GB), here is an exercise.

Commands

$ time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
$ ls -lh testfile.32k
$ rm testfile.32k

You should see something like (FYI: from a 5400rpm laptop drive)


$ time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
100000+0 records in
100000+0 records out
3276800000 bytes (3.3 GB) copied, 160.172 seconds, 20.5 MB/s
real 2m40.342s
user 0m0.120s
sys 0m15.277s
$ ls -lh testfile.32k
-rw-r--r-- 1 usr grp 3.1G 2007-06-28 10:02 testfile.32k

If your dd doesn’t report the MB/s figure (Solaris, for example), please also work it out with bc:

$ bc
3.2*1024/160
20
^D

NOTE: Replace 160 with the number of seconds from the real time (e.g. 2*60+40)
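
The same arithmetic can be wrapped in a small shell function. This is only a sketch: the name throughput_mbs is made up for this example, and it assumes decimal megabytes (1 MB = 1,000,000 bytes), which is how the dd output above reports its figure.

```shell
# throughput_mbs BYTES SECONDS
# Hypothetical helper (not part of dd): prints throughput in MB/s,
# using decimal megabytes (1 MB = 1,000,000 bytes).
throughput_mbs() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Figures from the laptop run above: 3276800000 bytes in 160.172 seconds
throughput_mbs 3276800000 160.172
```

With the laptop figures this prints 20.5, which agrees with the 20.5 MB/s that dd reported.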

Of course I’m not collecting a lot of stuff, like comparing different block sizes, or looking at iostat for existing load and introduced load. I thought I’d ask an easy question to see what type of response and output I’d find.
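
For anyone who does want to compare block sizes, a loop along these lines is one possible starting point. It is a rough sketch under stated assumptions: write_test and TOTAL_BYTES are names invented for this example, `date +%s` is assumed to be available, timing is only to the nearest second, and the default size is deliberately tiny so the loop finishes quickly. Set TOTAL_BYTES=3276800000 to match the exercise above.

```shell
# write_test BS_BYTES TOTAL_BYTES [DIR]
# Hypothetical helper: writes TOTAL_BYTES of zeros in BS_BYTES blocks,
# flushes to disk, prints "BS_BYTES ELAPSED_SECONDS", then removes the file.
write_test() {
    wt_bs=$1
    wt_total=$2
    wt_file="${3:-.}/testfile.$wt_bs"
    wt_start=$(date +%s)
    dd if=/dev/zero of="$wt_file" bs="$wt_bs" count=$((wt_total / wt_bs)) 2>/dev/null
    sync                                   # include the flush to disk in the timing
    wt_end=$(date +%s)
    rm -f "$wt_file"
    echo "$wt_bs $((wt_end - wt_start))"
}

# TOTAL_BYTES is an assumed variable; the default (32 MiB) keeps the demo short.
for bs in 4096 32768 1048576; do
    write_test "$bs" "${TOTAL_BYTES:-33554432}"
done
```

Each output line is the block size in bytes followed by the elapsed seconds. At the tiny default size the times will mostly read 0 or 1, so only the full-size run says anything about the disk.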

If you want to add any additional information such as Drive Types & Speeds (e.g. SATA 5400rpm), RAID configuration or SAN configuration (e.g. RAID and connection type), OS and File System type that would be great, but completely optional.

If you would like to add your results anonymously, please email me directly.

Thanks in advance.

Posted under Databases, MySQL, Professional on 29 Jun 2007

8 Comments »

  1. Here are my results with two different fiber attached arrays (both Linux 2.6.9-42.0.3.ELsmp):

    Fiber attached array(raid5), ext3 filesystem:

    $ time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out

    real 0m20.191s
    user 0m0.155s
    sys 0m16.379s
    $ ls -lh testfile.32k
    -rw-r--r-- 1 root root 3.1G Jun 28 11:05 testfile.32k

    Fiber attached array(raid5), ext3 filesystem over lvm over drbd:

    $ time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out

    real 0m30.538s
    user 0m0.134s
    sys 0m19.267s
    $ ls -lh testfile.32k
    -rw-r--r-- 1 root root 3.1G Jun 28 11:16 testfile.32k

    Comment by Jason McKay — June 29, 2007 @ 3:59 am

  2. Putting the WAL on one disk and the innodb table on another won’t help actually. It’s better to RAID 0 them (or partition the tables).

    Innodb can’t write to the WAL AND the data files at the same time so you won’t be performing concurrent work.

    The binary log (and slow query log) are another story though.

    Comment by Kevin Burton — June 29, 2007 @ 7:30 am

  3. That just measures filesystem append speed.

    The underlying engines tend to prefer preallocating the filespace, and then do seek/writes inside the file, since that’s usually much faster than filesystem append.

    A set of tools that measure random seek-read and random seek-write numbers would be illuminating. However, they can’t just be easily whipped off on the UNIX command line… :)

    Comment by Mark Atwood — June 29, 2007 @ 9:39 am

  4. I agree with Mark Atwood. I normally use several runs of bonnie, and a few other testers, to check. Things other than drives and RAID config affect the performance a lot too, like the amount of RAM on the RAID card and write caching. I prefer to stay away from RAID 5 because of the slowness with a drive failure. I still wonder about the NetApp RAID-DP setup.

    This is a 6drive U160 15krpm raid10

    3276800000 bytes transferred in 77.806677 secs (42114638 bytes/sec)
    real 1m17.834s
    user 0m0.206s
    sys 0m23.531s
    -rw-r--r-- 1 root wheel 3.1G Jun 28 22:27 testfile.32k

    Comment by Patrick Domack — June 29, 2007 @ 12:36 pm

  5. (all on XFS… on my /home… so a used file system)

    time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out
    3276800000 bytes (3.3 GB) copied, 122.893 seconds, 26.7 MB/s

    real 2m3.181s
    user 0m0.068s
    sys 0m9.637s

    yes, my laptop drive is teh roxor!

    although seemed tricky to repeat these numbers….

    Although… 1MB blocksize and O_DIRECT:
    3145728000 bytes (3.1 GB) copied, 119.849 seconds, 26.2 MB/s

    real 1m59.871s
    user 0m0.004s
    sys 0m1.568s

    What’s more interesting is the rates at different parts of the disk… there is a difference.

    Comment by Stewart Smith — June 29, 2007 @ 3:39 pm

  6. Most good SAN configurations use RAID6 today (like NetApp, which I’ll be testing in a few days).

    In the meantime, this is what we use (all these numbers are in production):

    6 drive U300 SCSI 15k on LVS aggregate stripe size=1MB, XFS

    time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out
    3276800000 bytes (3.3 GB) copied, 31.4418 seconds, 104 MB/s

    real 0m31.449s
    user 0m0.024s
    sys 0m8.309s

    —

    2 drive SAS 10k on HW RAID 0, XFS:

    time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out
    3276800000 bytes (3.3 GB) copied, 47.756 seconds, 68.6 MB/s

    real 0m47.795s
    user 0m0.040s
    sys 0m7.370s

    I expect better numbers from the FC SAN.

    Comment by Guillaume Lefranc — June 30, 2007 @ 12:59 am

  7. Software RAID 10 with 32k chunks over 6 S-ATA drives 7200rpm attached to on board S-ATA connectors. Hardened Gentoo Linux 2.6.20 with XFS filesystem, mounted with noatime,logbufs=8 options

    time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out
    3276800000 bytes (3.3 GB) copied, 18.2788 s, 179 MB/s

    real 0m18.283s
    user 0m0.192s
    sys 0m12.301s

    Comment by Jean-Pierre Schwickerath — July 1, 2007 @ 9:10 pm

  8. Looks like we’re pushing about 269MB/s on an HP XP128 across a stack of HP SAN switches. The XP128 is set up with a whole mess of RAID5 volumes presented to our Linux boxes which use LVM to manage the volumes. We always have multipathing involved in essential systems (read: MySQL boxes) in an active/passive configuration.

    If you do the math, though, we may actually be running into a bottleneck on the inter-switch trunks. Our trunks are running at 2Gbps.

    269MB/s * 8bits/Byte = 2.152Gbps

    —–
    $ sudo time dd if=/dev/zero count=100000 of=testfile.32k bs=32k
    100000+0 records in
    100000+0 records out
    0.06user 8.75system 0:12.17elapsed 72%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+180minor)pagefaults 0swaps
    $ ll test*
    -rw-r--r-- 1 root root 3276800000 Aug 9 10:06 testfile.32k
    $ bc
    bc 1.06
    Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
    This is free software with ABSOLUTELY NO WARRANTY.
    For details type `warranty'.
    3.2*1024/12.17
    269

    Comment by Aaron Conaway — August 10, 2007 @ 12:22 am
