I realized recently from observation that some smaller websites which use MySQL do not have a working backup and recovery strategy. Some may have a backup approach, but it will not work when a recovery is actually needed. As part of a number of discussions on Backup & Recovery, here is Part 1.
Using straight ‘cp’ for Backup
Using ‘cp’ to simply copy your running database *DOES NOT WORK*. You may be lucky, but in a world of guarantees there is no guarantee that you can recover your system. Why is this so?
- The data is not consistent during the backup. If it takes, say, 5 minutes to copy your files, they are probably copied in alphabetical order. What if data is written during the backup to a table starting with ‘A’ and a table starting with ‘Z’? The ‘A’ file has already been copied, so the two files no longer reflect the same point in time.
- When using MyISAM, indexes are not flushed to disk the way table data is. While MyISAM has the facility to check and rebuild indexes using myisamchk (see the sketch after this list), you need to know that you have corruption, you need to rebuild your tables offline, and there is no assurance your data can be corrected.
- Likewise with InnoDB, data and indexes are not flushed to disk at the moment you copy. While InnoDB does flush its redo logs to enable crash recovery, I have seen a production website (granted, on MySQL 4.1) where InnoDB failed to recover, causing major downtime and serious business ramifications.
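If you do find yourself with a MyISAM backup of questionable integrity, a check and repair pass with myisamchk is the first step. The following is a minimal sketch only; the data directory path is an assumption for illustration and will differ on your system.

# Run against an offline copy of the data files, never against a live server.
# /var/lib/mysql/mydb is an assumed datadir path; adjust for your installation.
cd /var/lib/mysql/mydb

# Check all MyISAM tables for corruption
myisamchk --check *.MYI

# Attempt a recovery (rebuilds indexes, salvages what rows it can)
myisamchk --recover *.MYI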
Using ‘cp’ correctly
In order to copy your database there are two different ways. The first is to shut down your database and then copy your data files. Be sure to also copy your my.cnf file; this is important in a recovery process. The second is to use the FLUSH TABLES WITH READ LOCK command. For example:
mysql> FLUSH TABLES WITH READ LOCK;
-- In a second session, copy all data files
mysql> UNLOCK TABLES;
The advantage is that you don’t have to shut down your instance; the disadvantage is that FLUSH TABLES WITH READ LOCK will halt any WRITE statements to your database, so depending on the time it takes to copy your data files this could have a significant effect on your system. In a high-write environment the FLUSH command itself can also take a significant time to complete, as it has to wait for running statements to finish.
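For the first approach, a scripted cold backup can be as simple as the following. This is a sketch only; the service name, data directory and backup paths are assumptions that you should adjust for your own installation.

# Cold backup: stop the server, copy data files and my.cnf, restart.
# Paths and service name below are assumptions; adjust for your system.
/etc/init.d/mysql stop

BACKUP_DIR=/backup/mysql/$(date +%Y%m%d)
mkdir -p "$BACKUP_DIR"

# Copy the entire data directory, preserving permissions and timestamps
cp -rp /var/lib/mysql "$BACKUP_DIR/data"

# Don't forget the configuration file
cp /etc/my.cnf "$BACKUP_DIR/my.cnf"

/etc/init.d/mysql start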
I’ll leave you with one simple tip. You should always have sufficient disk space on your machine for performing the backup. This is important for two reasons. First, if you compress your backup during the process, you are taking longer than a straight copy and your database is unavailable for longer. Second, over 90% of recoveries are performed from last night’s backup, and the time taken to uncompress the backup adds to your recovery time, particularly on a large system. Your ‘cp’ should indeed be a two-phase process: first a ‘cp’ on the local server for a local backup, then a compressed copy taken off the server.
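To illustrate the two-phase idea, a sketch of the second phase might look like the following. The hostname and remote path are assumptions for illustration only.

# Phase 1 has already produced an uncompressed local copy in $BACKUP_DIR
BACKUP_DIR=/backup/mysql/$(date +%Y%m%d)

# Phase 2: compress and ship a copy off the server.
# backuphost and /archive/mysql are assumed names; substitute your own.
tar -czf /tmp/mysql-backup-$(date +%Y%m%d).tar.gz -C "$BACKUP_DIR" .
scp /tmp/mysql-backup-$(date +%Y%m%d).tar.gz backuphost:/archive/mysql/

# The local uncompressed copy stays on the server for fast recovery.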
In my next part, I’ll discuss more alternatives for improving your backup strategy.