MySQL Backup & Recovery – Part 1

I realized recently from observation that some smaller websites which use MySQL do not have a working backup and recovery strategy. Some may have a backup approach but it will not work in a recovery. As part of a number of discussions on Backup & Recovery here is part 1.

Using straight ‘cp’ for Backup

Using ‘cp’ to simply copy your running database *DOES NOT WORK*. You may be lucky, but in a world of guarantees, this is no guarantee that your can recovery your system. Why is this so.

  • The data is not consistent during the backup. If it takes say 5 minutes to copy your files, they are copied probably in alphabetical order, what if data is written during the backup to a table starting with ‘A’ and a table starting with ‘Z’, the A file has already been copied.
  • When using MyISAM, Indexes are not flushed to disk unlike Table data. This means that while MyISAM has the facility of recover and rebuild indexes using myisamchk, you need to know if you have corruption, you need to rebuild your tables offline, and there is not assurance your data can be corrected.
  • Likewise with using Innodb, Data and Indexes are not flushed to disk. While Innodb does flush the redo logs to enable crash recovery, I have seen on a production website, granted 4.1 that Innodb failed to recover and caused major downtime and serious business ramifications.

Using ‘cp’ correctly

In order to copy your database, there are two different ways. The first is to shutdown your database first then copy your data files. Be sure to also copy your my.cnf file, this is important in a recovery process. The second is to use FLUSH command. For example.

mysql> FLUSH TABLES WITH READ LOCK;
-- In second session copy all data
mysql> UNLOCK TABLES;

The advantage is you don’t have to shutdown your instance, the disadvantage is that FLUSH TABLES will halt any WRITE statements to your database, so depending on the time to copy your data files that could have significant effect on your system. In a high write environment the FLUSH command could also take a significant time to complete.

I’ll leave with one simple tip. You should always have sufficient diskspace on your machine for performing the backup. This is important for two parts. First, if you say compress your backup during the process, you are taking a longer time then a straight copy and your database is unavailable longer. Second, over 90% of any recovery occurs with last nights backup, and the time take to uncompress the backup affects your recovery time, particularly in a large system. Your ‘cp’ should indeed be a two phase process. First ‘cp’ on the local server for a local backup, then have an off server compressed backup.

In my next Part, I’ll discuss more alternatives to improving your backup strategy.

References

FLUSH TABLES

Comments

  1. says

    Hi Ronald,

    It’s worth noting, that even with FLUSH TABLES WITH READ LOCK, a “cp” copy of the files is not good enough for InnoDB. FLUSH TABLES WITH READ LOCK does block commit, but my understanding is that it doesn’t ensure all activity is stopped within InnoDB itself.

    Regards,

    Jeremy

  2. Brent says

    I realize there are some “approved” backup methods for InnoDB tables, but it would be very nice to have something as simple as a copy available. The database I administer is medium-sized, but a full restore from a backup will still take several days. If, instead, I knew of some way to simply make a copy of it — even if it means shutting down MySQL at 3:00 AM for a few minutes — I would find that preferable to many of the alternatives I’ve seen.

  3. says

    This is from first hand experience:

    A snapshot taken while

    SHOW FLUSH TABLES WITH READ LOCK

    is in place, will be corrupt, especially if you are in a high transaction environment and if innodb_max_purge_lag is set to a high number. This can be verified by issuing a READ LOCK and then either by monitoring the timestamps or by calculating cksum.

    On such a system where I had innodb_max_purge_lag set to 70000 activity fully stopped approx. 30 minutes after issuing the READ LOCK.

    Frank

  4. says

    Heh,

    Indeed this blog post gives worst backup advice ever :)

    It does not work for Innodb as many guys mentioned and it is unlikely work for any other storage engines which have background activity.

    Interesting enough it may look like it works if you test with very light load so be especially careful.