Basic scalability principles to avert downtime

Ronald Bradford
April 23, 2011

In the press in the last two days has been the reported outage of Amazon Web Services Elastic Compute Cloud (EC2) in just one North Virginia data center. This has affected many large website includes FourSquare , Hootsuite , Reddit and Quora . A detailed list can be found at ec2disabled.com .

For these popular websites was this avoidable? Absolutely.

Basic scalability principles if deployed in these systems architecture would have averted the significant downtime regardless of your development stack. While I work primarily in MySQL these principles are not new, nor are they complicated, however they are fundamental concepts in scalability that apply to any technology including the popular MongoDB that is being used by a number of affected sites.

Scalability 101 involves some simple basic rules. Here are just two that seem to have been ignored by many affected by this recent AWS EC2 outage.

Never put all your eggs in one basket. If you rely on AWS completely, or you rely on just one availability zone that is putting all your eggs in one basket.
Always keep your important data close to home. When it comes to what is most critical to your business you need access and control to your information. At 5am in the morning when the CEO asks how long will our business be unavailabla and what is needed to resolve the problem, the answer “We have no control over this and have no ETA” is not an acceptable answer.

With a successful implementation and appropriate data redundancy you may not have an environment immediately available however you have access to your important information and the ability to create one quickly. Many large hosting companies can provide additional H/W on near demand, especially if you have an initial minimal footprint. Indeed using Amazon Web Services (AWS) as a means to avert a data center disaster is an ideal implementation of Infrastructure As A Service (IAAS). Even with this issue, organizations that had planned for this type of outage could have easily migrated to another AWS availability zone that was unaffected.

Furthermore, system architecture to support various levels of data availability and scalability ensure you can handle many more various types of unavailability without significant system down time as recently seen. There are many different types of availability and unavailability, know what your definition of downtime is and supporting disasters should be your primary focus of scalability, not an after thought.

As an expert in performance and scalability I can help your organization in the design of a suitable architecture to support successful scalability and disaster. This is not rocket science however many organizations gamble without the expertise of a professional to ensure business viability.

Tagged with: Amazon Amazon Web Services (AWS) Databases EC2 MySQL

Databases MySQL

Part 2 – Simple lessons in improving scalability

Ronald Bradford
February 24, 2011

Given the popular response from my first lesson in improving scalability where I detailed simple ways to eliminate unnecessary SQL, let me share another common bottleneck with MySQL scalability that can be instantly overcome.

Databases MySQL

Simple lessons in improving scalability

Ronald Bradford
February 16, 2011

It can be very easy to improve scalability with a MySQL server by a few simple rules. Here is one of them. “The most efficient way to improve an SQL statement is to eliminate it”

Databases MySQL Oracle/MySQL Conferences South America OTN LAD Tour 2010

OTN MySQL conference slides

Ronald Bradford
November 3, 2010

2010 has been the first year I have re-presented any of my developed MySQL presentations. Historically I have always created new presentations, however Paul Vallee gave me some valuable advice at UC 2010.

Basic scalability principles to avert downtime

Related Posts

Part 2 – Simple lessons in improving scalability

Simple lessons in improving scalability

OTN MySQL conference slides