Opinions, Expertise, Passion.

Information in black and white, and sometimes some color.

Apr
27

MySQL Conference - YouTube


MySQL Conference 2007 Day 4 rolled quickly into the second keynote, “Scaling MySQL at YouTube” by Paul Tuckfield.

Paul Tuckfield introduced himself with: “What do I know about anything? I was just the DBA at PayPal, now I’m just the DBA at YouTube. There are only 3 DBAs at YouTube.”

This talk had a number of great performance points, with various caching situations. Very interesting.

Scaling MySQL at YouTube

Top Reasons for YouTube Scalability

The technology stack:

  • Python
  • Memcache
  • MySQL Replication

Caching outside the database is huge.
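
The talk did not show how YouTube wires Memcache in front of MySQL, so the following is only a minimal sketch of the common cache-aside pattern. A plain dict stands in for a memcached client, and the names `CacheAside`, `load_from_db` and the `videos` data are hypothetical.

```python
class CacheAside:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(self, cache, load_from_db):
        self.cache = cache                # dict standing in for a memcached client
        self.load_from_db = load_from_db  # callable that would run the real query
        self.db_reads = 0                 # how many reads actually reached the DB

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: hit the database once, then remember the answer.
            value = self.load_from_db(key)
            self.db_reads += 1
            self.cache[key] = value
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self.cache.pop(key, None)


# Hypothetical data standing in for a MySQL table.
videos = {"v1": "Scaling MySQL", "v2": "Keynote"}
store = CacheAside(cache={}, load_from_db=videos.__getitem__)

store.get("v1")
store.get("v1")
print(store.db_reads)  # 1: the second read never touched the "database"
```

Every read answered by the cache is a read the database never sees, which is the whole point of caching outside the database.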

When showing the number of hits per day (using an Alexa graph), he said “I can neither confirm or deny the interpretation will work here”. This is not the first time I’ve heard this standard “Google” response. They must get lessons from lawyers in what you can say.

Standardizing on DB boxes (but they crash almost daily)

  • 4×2 GHz Opteron cores
  • 16 GB RAM
  • 12×10k SCSI disks
  • LSI hardware RAID 10
  • Replication played a big part in fixing the crashes
  • Get a reliable H/W supplier

Replication Lessons

  • You don’t worry when a replica fails.
  • One thing that sucks: InnoDB doesn’t recover very fast. It does that durability thing, but it takes hours to finish recovering (was it ever going to finish?)
  • How many backups can you restore? When you switch to a replica, are you sure it’s right?
  • Did you test recovery? Did you test your backups?
  • Replication was key to trying different H/W permutations to identify incompatible H/W (combinations of controllers/disks)
  • We got good at re-parenting/promoting replicas, really fast
  • We built up ways to clone databases as fast as possible
  • Excellent way to test tuning changes or fixes (powerful place to test things)
  • Keep “intentional lag”/Stemcell replicas: stop the SQL thread, which keeps a server a few hours or a day behind. If you drop a table, say, you have an online backup.
  • When upgrading, always mysqldump then reload, rather than upgrading the database in place.
  • Don’t care about CPUs. I want as much memory as possible, I want as many spindles as possible.
  • For YouTube 2-3 second lag is acceptable.
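
To make the lag tolerance concrete, here is a small sketch of how an application might pick replicas for reads. It assumes each replica’s lag in seconds is already known (for example from `Seconds_Behind_Master`, which MySQL reports as NULL when the SQL thread is stopped); the replica names and the `readable_replicas` helper are hypothetical, not something shown in the talk.

```python
MAX_LAG_SECONDS = 3  # the talk called a 2-3 second lag acceptable


def readable_replicas(lag_by_replica, max_lag=MAX_LAG_SECONDS):
    """Return replicas fresh enough to serve reads, least-lagged first.

    lag_by_replica maps a replica name to its lag in seconds, or None when
    the lag is unknown (for example, the SQL thread is stopped on purpose).
    """
    fresh = [(lag, name) for name, lag in lag_by_replica.items()
             if lag is not None and lag <= max_lag]
    return [name for lag, name in sorted(fresh)]


# Hypothetical fleet: "stemcell" is an intentionally lagged replica.
lags = {"replica-a": 1, "replica-b": 7, "stemcell": None, "replica-c": 0}
print(readable_replicas(lags))  # ['replica-c', 'replica-a']
```

An intentionally lagged replica drops out of the read pool automatically, because its lag is either unknown or far beyond the threshold.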

If your DB fits in RAM, great. Otherwise:

  • Cache is king
  • Writes should be cached by the RAID controller (buffered really), not the OS
  • Only the DB should cache reads (not RAID, not the Linux buffer cache)

Only DB should cache reads

  • A hit in the DB cache means the lower caches went unused.
  • A miss in the DB cache can only miss in the other caches, since they’re smaller.
  • Caching reads below the DB is worse than useless: it serializes writes.
  • Avoiding serialization in reads reaps compound benefits under high concurrency

An important lesson learned: do not cache reads in the F/S or the RAID controller.
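
The argument about smaller lower-level caches can be checked with a toy simulation. This is only a sketch under loud assumptions: both tiers are plain LRU caches, reads are uniformly random, and the sizes are made up. The smaller cache sits below the DB cache and therefore only sees the DB cache’s misses.

```python
import random
from collections import OrderedDict


class LRU:
    """Tiny LRU cache that only tracks which block ids it holds."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, key):
        """Return True on a hit; on a miss, admit the key and evict the oldest."""
        if key in self.items:
            self.items.move_to_end(key)
            self.hits += 1
            return True
        self.misses += 1
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)
        return False


def simulate(n_blocks=1000, db_size=500, lower_size=100, n_reads=20000, seed=1):
    """Send random reads to the DB cache; only its misses reach the lower cache."""
    rng = random.Random(seed)
    db_cache, lower_cache = LRU(db_size), LRU(lower_size)
    for _ in range(n_reads):
        block = rng.randrange(n_blocks)
        if not db_cache.access(block):
            lower_cache.access(block)
    return db_cache, lower_cache


db_cache, lower_cache = simulate()
print("DB cache hits:   ", db_cache.hits)
print("lower cache hits:", lower_cache.hits, "of", db_cache.misses, "lookups")
```

Under these assumptions almost everything the lower cache holds is also still in the larger DB cache, so the memory spent on it buys very little. Real workloads are not uniform, so treat the numbers as an illustration rather than a measurement.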

Caching Lessons
Overcoming Mystery Serialization

  • Use O_DIRECT
  • vm.swappiness=1-5
  • If you’re >80% busy, you may not be doing I/O concurrently. Look at other figures, e.g. one configuration at 80% busy with 8 I/Os, the next configuration at 80% with only 4 I/Os
  • Mirror in H/W, stripe in S/W
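
The first two settings can be checked mechanically. The sketch below assumes that “Use O_DIRECT” maps to InnoDB’s `innodb_flush_method = O_DIRECT` in my.cnf (the talk did not name the option) and that swappiness is set in a sysctl.conf-style file. The helper names and sample file contents are hypothetical.

```python
def parse_settings(text):
    """Parse 'key = value' lines, ignoring blanks, comments and [sections]."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_io_settings(my_cnf_text, sysctl_text):
    """Return a list of problems; an empty list means both settings look right."""
    problems = []
    my_cnf = parse_settings(my_cnf_text)
    sysctl = parse_settings(sysctl_text)
    # Assumption: O_DIRECT is requested through innodb_flush_method.
    if my_cnf.get("innodb_flush_method", "").upper() != "O_DIRECT":
        problems.append("innodb_flush_method is not O_DIRECT")
    swappiness = sysctl.get("vm.swappiness")
    if swappiness is None or not 1 <= int(swappiness) <= 5:
        problems.append("vm.swappiness is outside 1-5")
    return problems


# Sample file contents, made up for illustration.
MY_CNF = """
[mysqld]
innodb_flush_method = O_DIRECT
"""
SYSCTL = """
# keep the kernel from swapping out the database
vm.swappiness = 1
"""
print(check_io_settings(MY_CNF, SYSCTL))  # []
```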

Scale Out

  • Writes are parallel to master, but serialized to replicas. We need true horizontal partitioning.
  • We want true independent masters
  • EMD - Even More Databases — Extreme Makeover Database
  • Slave transactions must serialize to preserve commit order (this is why replication is always way slower)
  • The oracle caching algorithm (that’s a small o) — predicting the future
  • Replication lags: one IO bound thread. You do know the future, commands are coming up serially.
  • Write a script that does the reads ahead of the upcoming updates, so the updates themselves become cache hits.
  • The diamond. For go-live, play the shards’ binlogs back to the original master for fallback.
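
The read-ahead idea can be sketched as turning each upcoming write into a SELECT that touches the same rows, so the single replication thread finds them already cached. This is not the script from the talk: it is a deliberately naive regex rewrite that only handles simple single-table UPDATE and DELETE statements with a WHERE clause, and `prefetch_query` is a hypothetical helper.

```python
import re

# Naive patterns: one table, a WHERE clause, no subqueries or quoted "WHERE".
_UPDATE = re.compile(
    r"^\s*UPDATE\s+(?P<table>\S+)\s+SET\s+.+?\s+(?P<where>WHERE\s+.+?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_DELETE = re.compile(
    r"^\s*DELETE\s+FROM\s+(?P<table>\S+)\s+(?P<where>WHERE\s+.+?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def prefetch_query(statement):
    """Return a SELECT touching the rows a write will modify, or None."""
    for pattern in (_UPDATE, _DELETE):
        match = pattern.match(statement)
        if match:
            return "SELECT * FROM {} {}".format(
                match.group("table"), match.group("where")
            )
    return None  # INSERTs and anything unrecognised are skipped


# Hypothetical statements read ahead from the relay log.
upcoming = [
    "UPDATE videos SET views = views + 1 WHERE id = 42;",
    "DELETE FROM comments WHERE video_id = 7",
    "INSERT INTO videos (id) VALUES (43)",
]
for statement in upcoming:
    print(prefetch_query(statement))
```

A real tool would run these SELECTs on the replica a little ahead of the SQL thread’s position; reading too far ahead risks the rows being evicted again before the write arrives.
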
Posted under Databases, General, MySQL, mysqluc07 on 27 Apr 2007
