Replication++, Replication 2.0, Replication.Next

There has been some discussion recently regarding the death of MySQL Read Replication starting with Brian Aker and then Farhan , Arjen and Paul have all chimed in. Whatever you want to call it, the next generation of replication approach is clearly on the agenda of the industry leaders and pack followers. We should take a programmatic look however and ask ourselves a few questions. Such as:

  1. The Use — What is/was MySQL Replication used for?
  2. The Reasons — Why was it used?
  3. The Problems — Why is there a need for something better, different or improved?

Read Scalability

  1. The Use — Clearly the most commonly known method of MySQL Replication is for Read Scalability. The ability for many more reads to occur against the data then writes. I call it the YouTube effect, thousands/millions of reads (i.e. view a video) to a single write (i.e. upload of a video)
  2. The Reasons — Database servers could not support the read/write load. Applications were poorly designed for sharding data. The database was being used for a function when it wasn’t really needed. The MySQL Server can’t scale to a massive number of cores.
  3. The Problems — MySQL Replication suffers from the asynchronous condition. A successful transaction on the Master, is not confirmed with a Slave before acknowledgment to the client. Google has made inroads in their own environment to improve the asynchronous nature with the SemiSyncReplication patch, now submitted back to the community. The number one effect of asynchronous replication is lag, and again poorly developed applications simply can’t support successful scale-out. Other problems include locking contention when not using a transactional engine.

Backups

  1. The Use — A Replication slave is used to perform a database backup, designed to not affect production usage.
  2. The Reasons — MySQL does not support a single “unbreakable” online backup solution. Period. There are partial/incomplete implementations, dependent on storage engines and data volume. There is a Roadmap for a solution in MySQL 6.0 (finally), but it’s way too early to tell what features will there be, will you need to pay for some of them, will it just work, let alone who is going to upgrade to 6.0 for this.
  3. The Problems — There exists no consistent method for backup across all storage engines. There is no compatibility for backup/point in time recovery (e.g. binary logs/redo logs) across different master/slave servers. No mirrored binary logs is also a problem, something also in the Google Patch.

Failover

  1. The Use — You need Business Continuity or Revenue Continuity Solutions (RCS) as discussed by Marc Simony. You need to ensure the maximum uptime possible with a failover infrastructure when some component (software or hardware) fails and your production database is not accessible.
  2. The Reasons — MySQL provides no Online DDL. In a 24×7 large volume, changing functionality application this provides an unnecessary complexity for applying changes easily.
  3. The Problems — When correctly configured and applied in a controlled situation, failover can work, provided your data is not too large, and your traffic is not too much. It simply doesn’t have a guarantee to work without data loss in an uncontrolled (i.e. disaster recovery) situation.

Others

There are more reasons, something to write about at a later time. These would include.

  • Testing
  • Alternative data access path (e.g. Data Warehousing)
  • Upgrade Path/Migration Path

Options

To handle Read Scalability. Why select something from the database (or various level of caches in the database call when it can be managed by the Application). Memcache is the obvious buzz word here. With tighter database integration (such as the patches and commands FaceBook has) this could be a winner.

Backups is a tough one, without practical storage engine agnostic consistent solutions I don’t feel the backup conundrum will be resolved anytime soon.

To handle a successful failover you need a guarantee that the data is consistent. A Synchronous Replication solution will solve this. MySQL Cluster is a synchronous solution, however it is impractical for basically every application that is already running with MySQL Replication.

What other options exist? Food for thought.

About Memcache

A little on Memcache for those still in the dark ages. Many large sites such as FaceBook ,Fotolog , Wikipedia and Slashdot are strong proponents of Memcached . Created originally for Live Journal Memcached has become the defacto standard in this level of application data caching.

The description from the source.

memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Tagged with: Databases MySQL

Related Posts

Your Attack Vector Extends Beyond Production Systems

A common data security issue is the unprotected copying of production data to non-production environments without any redaction, masking, or filtering. This practice poses a serious risk. A malicious actor will target the weakest link in your infrastructure, including non-production accounts and the developer systems accessing them.

Read more

Easy Money-Saving Tips for Your AWS Cloud Spend

There are numerous Cloud Service Provider (CSP) FinOps products that can review, collate, summarize, and recommend ways to optimize your cloud spend. If you’re using one or more cloud providers and don’t actively manage your Cost and Usage Reports (CURs) on a daily basis, investing in such a product is a smart move.

Read more

Why Being Proactive Is Always a Winning Approach

Many companies manage production infrastructure using a reactive model rather than a proactive one. Organizations typically react to warnings and alerts, then implement corrective actions in response. While some companies have well-designed architectural patterns—such as feature flags and rate limiting—that can quickly mitigate the impact of issues, these are merely temporary solutions, not resolutions.

Read more