MySQL Conference – YouTube

Ronald Bradford
April 27, 2007

MySQL Conference 2007 Day 4 rolled quickly into the second keynote Scaling MySQL at YouTube by Paul Tuckfield.

The introduction by Paul Tuckfield was; “What do I know about anything, I was just the DBA at PayPal, now I’m just the DBA at youTube. There are only 3 DBA’s at YouTube.”

This talk had a number of great performance points, with various caching situations. Very interesting.

Scaling MySQL at YouTube

Top Reasons for YouTube Scalability

The technology stack:

Python
Memcache
MySQL Replication

Caching outside the database is huge.

It a display of numbers of hits per day it was said “I can neither confirm or deny the interpretation will work here (using an Alexa graph)”. This is not the first time I’ve heard this standard “Google” response. They must get lessons by lawyers in what you can say.

Standardizing on DB boxes (but they crash almost daily)

4x2ghz opteron core
16G RAM
12x10k scsi
LSI hardware raid 10
Replication played a big part in fixing
Get a reliable H/W supplier

Replication Lessons

You don’t worry about it when a replicas fail.
One thing that sucks, Innodb doesn’t recover very fast. It does that durability think, but it takes hours to finish recovering (was it going to finish)
How many backups can you restore. When you switch you a replica, are you sure it’s right?
Did you test recovery, did you test your backups.
replication was key to trying different H/W permutations to identify incompatible H/W (combinations of controllers/disks)
we got good at re-parenting/promoting replicas, really fast
we built up ways to clone databases as fast as possible
Excellent way to test tuning changes or fixes (powerful place to test things)
Keep “intentional lag”/Stemcell replicas – Stop SQL thread, keeps a server a few hours or a day behind. Say if you drop a table you have a online backup.
When upgrading, always mysqldump then reload, rather then upgrade database.
Don’t care about CPU’s. I want as much memory as possible, I want as many spindles as possible.
For YouTube 2-3 second lag is acceptable.

If you db fits in ram, great otherwise

Cache is king
Writes should be cached by raid controller (buffered really) not the OS
Only the db should cache reads (not raid, not Linux buffer cache)

Only DB should cache reads

Hit in db cache means lower caches went unused.
Miss in db cache can only miss in other caches since they’re smaller.
Caching reads is worse then useless. It’s serialized writes.
Avoiding serialization in reads reaps compounds benefits under high concurrency

An important lesson learned. Do no cache reads in F/S and Raid Controller.

Caching Lessons
Overcoming Mystery Serialization

Use O_DIRECT
vm.swappiness=1-5
if you’re >80% buys — your not doing I/O concurrently look at other figures e.g. 80% busy 8 I/O’s, next configuration 80%, only 4 I/O’s
Mirror in H/W strip in S/W

Scale Out

Writes are parallel to master, but serialized to replicas. We need true horizontal partitioning.
We want true independent masters
EMD – Even More Databases — Extreme Makeover Database
Slave transactions must serialize to preserve commit order (this is why replication is always way slower)
The oracle caching algorithm (that’s a small o) — predicting the future
Replication lags: one IO bound thread. You do know the future, commands are coming up serially.
Write a script to do reads, before updates coming up (because they are cache hits).
The diamond. For golive, play shards binlogs back to original master for fallback.

Tagged with: Databases General MySQL MySQL Conference &Amp; Expo 2007

MySQL and Heatwave Summit Presentation

Ronald Bradford
April 30, 2025

Last week I had the opportunity to speak at the MySQL and Heatwave Summit in San Francisco. I discussed the impact of the new MySQL 8.0 default caching_sha2_password authentication, replacing the mysql_native_password authentication that was the default for approximately 20 of the 30 years that MySQL has existed.

Readyset QueryPilot Announcement

Ronald Bradford
April 22, 2025

At the MySQL and Heatwave Summit 2025 today, Readyset announced a new data systems architecture pattern named Readyset QueryPilot . This architecture which can front a MySQL or PostgreSQL database infrastructure, combines the enterprise-grade ProxySQL and Readyset caching with intelligent query monitoring and routing to help support applications scale and produce more predictable results with varied workloads.

More CPUs or Newer CPUs

Ronald Bradford
April 2, 2025

In a CPU-bound database workload, regardless of price, would you scale-up or scale-new? What if price was the driving factor, would you scale-up or scale-new? I am using as a baseline the first available AWS Graviton2 processor for RDS (r6g).

MySQL Conference – YouTube

Related Posts

MySQL and Heatwave Summit Presentation

Readyset QueryPilot Announcement

More CPUs or Newer CPUs