Handling Disaster 101

I’ve had to accept the “practice what you preach” pill recently due to a disaster at my hosting provider. See Learning from a Disaster .

While it was my own personal site on a dedicated server in question and not a business generating review I found that my MySQL Backup Strategy was incomplete ( It is also based on code 4 years old). I found that I had not tested my Disaster Recovery Plan adequately. I have used my backup and recovery approach in the past for various controlled situations and testing successfully.

So what mistakes did I make. There were two.

  1. I was using a cold backup approach. That is specifically copying the entire MySQL Database in a controlled manner at the file system level. These were also copied to an alternate shared hosting server for storage. This works fine when you backup server supports a means of restoring data in this format, however if your backup shared hosting facility does not give you access to the MySQL data area, then this does not work. Not wanting to pay for two dedicated hosts this backup solution is impractical for my present hosting. Time to consider alternatives, such as being prepared with an EC2 image.

  2. Recently I moved to using two MySQL instances, both 5.0 GA and 5.1 RC. The problem is I didn’t adjust my backup scripts appropriately to reflect two instances. Of course when my server was unavailable for 43 hours I was completely screwed in at least I could only throw an Site Unavailable page rather then my website. Combined with my hosting provider totally screwing DNS and admin access to manage DNS for 3-4 days more then almost 2 days of total server unavailability I also had to change support for my DNS. Would have been easier if I had a DNS DR plan.

The lesson is, no matter how trivial the information or website, if you actually want the content, make sure you test your recovery strategy ahead of a disaster.

Tagged with: Databases MySQL

Related Posts

Why Being Proactive Is Always a Winning Approach

Many companies manage production infrastructure using a reactive model rather than a proactive one. Organizations typically react to warnings and alerts, then implement corrective actions in response. While some companies have well-designed architectural patterns—such as feature flags and rate limiting—that can quickly mitigate the impact of issues, these are merely temporary solutions, not resolutions.

Read more

AWS CLI support for Aurora DSQL and S3 Tables

If you were following the AWS Re:invent keynote yesterday there were several data specific announcements including Aurora DSQL and S3 Tables . Wanting to check them out, I downloaded the latest AWS CLI 2.

Read more

Migrating off of WordPress - A Simplified Stack

The ongoing drama between Wordpress v WP Engine continues to cross my reading list, but I have permanently removed WordPress from my website. I have finally transitioned away from the complex Linux/Apache/MySQL/PHP (LAMP) stack required for self-hosting WordPress on my professional website.

Read more