In a recent podcast on Mastering EOL Migrations: Lessons learned from MySQL 5.7 to 8.0 I discuss with my colleague Adam North not only the technical issues that become a major migration but also key business and management requirements with having a well-articulated strategy that covers:
- Planning
- Testing
- Be Prepared
- Proactiveness
Having a plan is key to any significant task including data migrations. You should heed the warnings and the deprecations and consider all potential downstream product impacts such as connector upgrades. The plan includes a timeline but also needs to define all the stakeholders both technical and business, the definition of a successful migration, and most importantly the decision tree for a non-successful migration that would include any outage, failback, rollback, or fix-forward requirements.
Test, Test, Test. Leveraging the simple design pattern of read-write splitting (hint: if your application does not support this, it should) you get to test with minimal risk all of your application reads and with real load from 1% to 100%. You can validate all writes but this does not match concurrency, however, you can emulate load testing and using this two-way door strategy, verify and prevent many common problems before the decision point of failover.
Being prepared is assuming your migration will fail, rather than assuming it will succeed. Rehearsal of all steps that are documented and reproducible. Validating that your backup and recovery strategy is still optimal and operational with the new version, preparing supporting staff for availability before, during, and after the migration. There are probably not many technologists that can say, “Well that was a boring, uneventful successful migration”. The question is why not?
Being proactive is just as important. Leaving a large migration to the last minute is procrastination and a cause of unneeded stress during a non-successful migration. The Meltdown/Spectre vulnerabilities are one example of a highly impactful event outside of your control that sidelined entire teams in many companies for months. Does an outage of your cloud provider impact your uptime requirements and force you to delay a last-minute migration due to customer SLA obligations? While being prepared is for the reasons you could think of, being proactive and prepared is for the situations you have not thought of.
Having solid architectural design practices will aid greatly in many critical business requirements of uptime, read-only mode, scale-out, scale-up, and sharding. These design patterns also greatly enhance the likelihood of a successful database migration.
We have also created a Checklist to cover the planning and execution of a migration. Any input is welcome.
You can check out the video podcast on YouTube or listen with your favorite podcast tool.