Learning from a Disaster

As Farhan has already pointed out to us, Disaster is Inevitable – Must shutdown generators . My primary hosting provider The Planet had a serious meltdown, 9,000 servers unavailable, DNS and administration application .

My server was effectively totally unavailable from 3PM Saturday until 10AM Monday, 43 hours in total.

The problem didn’t stop there. Started and verified servers and domains, but like 8 hours later I find that the DNS is wrong on two important domains (I didn’t discover this because I have them in local /etc/hosts) because I moved them to a different IP like 2 weeks ago.

The Planet denied any problems, ticket logged to get them fixed because admin interface was still down.
Wind forward, Service Updates pages states

June 3 – 3:00pm CDT
All DNS Zone files for ns1, ns2, ns5 and ns6 are completely updated as of the information that was available 5:30PM Central Saturday. All DNS servers have been rebooted and BIND has been restarted.
.

What a flat out lie. DNS lookup directly against name server confirms it’s wrong. At Wed 12:00am Wednesday, the admin interface finally gives access to view and update zone files. What the. It indicates and IP address which is should be, but clearly not what it is.

I will so be sending a complaint to notify.management@theplanet.com when I finally get this resolved.

And just to make this all worse, my present professional site ronaldbradford.com is of critical importance. It’s my only exposure for preparing to provide information to potential employers, and it’s used for the preparation of consulting information. I’ve been forced earlier today because of events starting today for NY Internet Week to purchase DNS services at www.easydns.com . That didn’t also got to plan, yet another story.

Tagged with: Uncategorized

Related Posts

Why Being Proactive Is Always a Winning Approach

Many companies manage production infrastructure using a reactive model rather than a proactive one. Organizations typically react to warnings and alerts, then implement corrective actions in response. While some companies have well-designed architectural patterns—such as feature flags and rate limiting—that can quickly mitigate the impact of issues, these are merely temporary solutions, not resolutions.

Read more

AWS CLI support for Aurora DSQL and S3 Tables

If you were following the AWS Re:invent keynote yesterday there were several data specific announcements including Aurora DSQL and S3 Tables . Wanting to check them out, I downloaded the latest AWS CLI 2.

Read more

Migrating off of WordPress - A Simplified Stack

The ongoing drama between Wordpress v WP Engine continues to cross my reading list, but I have permanently removed WordPress from my website. I have finally transitioned away from the complex Linux/Apache/MySQL/PHP (LAMP) stack required for self-hosting WordPress on my professional website.

Read more