Digital Tech Trek Digest [#Issue 2024.09]

As an entrepreneur, pricing is an important consideration in any evaluation, development, and customer testing. In How To Price A SaaS Product, we see different pricing strategies, cost-based pricing, competitor-based pricing, penetration pricing, value-based pricing, freemium pricing. None of these match what I am ultimately considering: consumption-based pricing. Pricing is critical to define the value proposition statement and determine the range of the total lifetime value (TLV). It can vary greatly for B2C, B2B, and B2B enterprise offerings. If we look at YCombinator https://www.ycombinator.com/library/6h-startup-pricing-101 a basic principle is determining the gap between price and cost. That is your margin and your incentive to sell, and you work with either cost-plus or value-based pricing. Starting with founder-led sales is difficult as you do not have the luxury of a dedicated and experienced head of sales to work on different models and guide a technical founder, even before you enter the minefield of enterprise sales with applicable bids, contract, and compliance complexities. I am drawn back to “Consumption-based pricing is a pricing model that charges customers based on their product or service usage. Consumption-based pricing calculates pricing based on usage volume rather than the number of users and is a popular pricing model for IT services, SaaS, and cloud computing and storage” Cite: Consumption-Based Pricing.

Moving a Billion Postgres Rows on a $100 Budget

I wrote recently about the 1 Billion Row Challenge (1BR). This week, I found this article on the same number with a different title. The objective was not performance; it was cost. PeerDB enables the efficient extraction of data from PostgreSQL into a data warehouse, such as Big Query, ClickHouse or Snowflake. It was interesting to see Arvo as a format used over, for example, Parquet. The product also offers different streaming modes, including log-based (CDC), cursor-based (timestamp or integer), and XMIN-based. I will need to do further research on this new term XMIN-Based.

Test queries against your production database (responsibly)

This post links off to a YouTube video of The Safest Way to Test Postgres Destructive Queries, which provides a basic introduction to branching of the Neon PostgreSQL DBaaS. While the title originally interested me, the example showing the mechanics is like many other product examples in which it is extremely simplistic and not a true representation of “production” size or workload. I see this as a similar concept to AWS RDS Aurora cloning. However, any example should modify the structure of a table, measure the impact of that structure against production queries (note plural), and provide additional metadata rather than just a response time. These are important considerations in my own evaluation of test coverage of data access and the gathering of configuration, data, and infrastructure when running experiments to determine a more optimal data access path or a new functionality requirement. More documentation can be found here on Neon Branching

About “Digital Tech Trek Digest”

Most days, I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new about professional and personal topics of interest. I turn what I read into actionable notes in a short, committed time window, summarizing what I learned, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS, Micro Newsletter to name a few.

Random Wisdom

This week, I was reminded via a very interesting statement that work-life balance and joy in what you do are critically important. You will not find on a tombstone the statement:

“I never worked enough hours.”

Digital Tech Trek Digest [#Issue 2024.08]

The One Billion Row Challenge Shows That Java Can Process a One Billion Rows File in Two Seconds

Well, it’s way under 2 seconds for the 1brc.The published results are in and if you’re good you can read 1 billion data points of weather data and analyze it. The final best number as per the article release is “00:00.323″. Yes, that answer is in milliseconds “Result (m:s.ms)”. Mind-blowing.

ScyllaDB Summit 2024

Last week I attended this virtual event. All the presentations can be found online. I had never used the product before so while some new features like Tablets were not as applicable in understanding the full impact, the comparisons of DynamoDB performance and cost were very applicable.

So what is ScyllaDB? It is a distributed NoSQL DBaaS that speaks Cassandra protocol (do large companies still use this?), and it speaks AWS DynamoDB protocol. That is really interesting to me. You can choose a Cloud Hosted offering, or if you’re into managing your setup, you can use the Open Source ScyllaDB version available from GitHub. I started at ScyllaDB University to get a grip on the basics. I have yet to try the local Docker Compose setup.

Thanks also to the team for the swag which I received.

Playing a game with your CI/CD pipeline

My friend Sergey has created a game in GitLab called GitTerra. Drop a few lines into your .gitlab-ci.yml and each build will give you a generated 3D map of a city based on your commit. I really look forward to some of his next steps, leveraging potentially different colors for languages, or different building structures for artifacts found in your commit.

We raised 11.6M to build Serverless Postgres for Modern SaaS

Congrats to Gwen and her co-founder for getting seed funding for Nile Serverless Postgres for Modern SaaS. Awesome news for an entrepreneur and I’m very hopeful for the success of Nile.

The Safest Way to Test Postgres Destructive Queries

While I am a user of ElephantSQL serverless PostgreSQL and Neon, Nile and Xata are just a few that are competing in the space. With multiple other products also speaking PostgreSQL protocol, you can easily trial a small product in an RDBMS in the cloud at no cost. PostgreSQL is definitely outdoing MySQL in this space. You have the extensive set of NoSQL Cloud offerings, SycllaDB I just mentioned, and D1 by CloudFlare I have yet to try this branching feature for your database, sounds interesting and I’ve added to my just as long list of products to try, as books to read. Nit: It’s PostgreSQL, not Postgres.

About “Digital Tech Trek Digest”

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS, Micro Newsletter to name a few.

Digital Tech Trek Digest [#Issue 2024.07]

Everything you need to know about seed funding for startups

A recent call with a startup founder funded by TinySeed led me to learn about MicroConf and Rob Walling. (Thanks Tony for the info). This has led to a lot of great info in several new newsletters and videos including this video. A few very valuable tips I learned included the answer to Why should you raise funds at all? The 1-9-90 rule, and different types of funding including Indie funding. It was interesting to find out that the TinySeed accelerator is 1 year, and not 13 weeks, which is common in NY. Rather than sharing my notes, go watch the video.

5 Books That Paved My Path to Entrepreneurial Success

I have not heard of any of these books, and I have such a long list, perhaps I need to publish my list and elicit feedback on prioritizing. The list from this article is as follows:

    1. Mastering negotiations with never split the difference
  1. 2. Embracing risk with skin in the game
  2. 3. Building habit-forming products with Hook
  3. 4. The roadmap to a billion dollar app in How to Build a Billion Dollar App

Visualization

Last week I was at two events in Brussels. I chose to head to London to fly home. I found this map present in many tube stations (The tube is the London Metro Subway). It’s been a decade since I was in London, and over two decades since I lived in the UK. I found the new map great. When I mentioned it as a good visualization, I was surprised that locals of the London area thought it was horrible. I saw the value in the visualization, but perhaps others see it like art, “in the eyes of the beholder”. It could also be “habit”.
London Tube Map - 2024 presentation
Typical London Tube map

Cats and Dogs

How many *NIX `cat` memes can there be? Well, a lot cat is the most misused thing by programmers new to Linux. I cringe every time someone uses it wrong in a bash script. Thread below with proper uses of cat only.

Hey, dogs, you are in the count also with HTTP STATUS DOGS. My picks are 300 Multiple Choices, and 429 Too Many Requests for me.

Upcoming Events on my radar

About “Digital Tech Trek Digest”

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS to name a few.

Random

I shared with a colleague on Feb 9. “3 SQL databases walked into a NoSQL bar. A little while later, they walked out because they couldn’t find a table.”

Digital Tech Trek Digest [#Issue 2024.06]

In this past week, I’ve been able to not just read or watch content online but meet people in person. In Brussels, first at the MySQL Belgian Days 2024 event, followed by FOSDEM 2024.

There was a wide array of presentations covering many different topics, this is just a summary. Fred talked history of Command Line Monitoring and an intro to the new player Dolphie. Dave Stokes talked security, Sunny Bains gave us a brain dump of TiDB scalable architecture. We got an update on PMM and MySQL on k8s from Peter Zaitsev as well as a chat about his new product coroot. And then a great intro to a new generation of online schema change at scale with Sprit by Morgan Tocker. Alex Rubin shows us how not how to hack MySQL, but how MySQL can hack you. We have all crossed paths as MySQL Inc. employees or MySQL community members since 2006.

Marcelo Altmann gave us a detailed intro of a new era of caching with ReadySet. We also heard updates on Vitess. And that was just Day 1 presentations. The evening event was at the incredibly wall-to-wall packed Delirium Café, sponsored by ReadySet which we offer great thanks and cheers.

Day 2 was packed with great content about MySQL Shell, MySQL Heatwave ML and Vector, MySQL Router, and the MySQL optimizer from many well-known Oracle MySQLers before amazing awards, Belgian beer, and black vodka of course.

Congratulations Giuseppe Maxia on your MySQL Legends award at MySQL Belgium Days 2024. It is well deserved for all of your community contributions over the decades.

Check out the details at Unveiling the Highlights: A Look Back at MySQL Belgian Days 2024.

Saturday and Sunday were FOSDEM 24 and its usual location. So many people crossing the university, tunnels, and weird transit paths between all the university lecture halls it can feel like a blur. For the first time, I had no fixed agenda so I could check out random talks on random topics.

A shout-out also to so many people I know and some new people I met. Colin Charles, Alkin Tezuysal, Walter Heck, Charly Batista, Robert Hodges, Jens Bollmann, Monty Widenius, Matthias Crauwels, Michael Pope, Marcelo Altmann, Emerson Gaudencio, Aldo Junior and tons more I have forgotten to mention by name. Many conversations also with random community people I didn’t even get names for example the team at Canonical.

About “Digital Tech Trek Digest”

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS to name a few.

Digital Tech Trek Digest [#Issue 2024.05]

Because the world needs better dashboards

While my professional interests in Building Better Data Insights Faster rely on using visuals and narratives to show data-driven results, “Starting from first principles” is the question you have to ask. Identifying the quality of data sources, the time to delivery, and the confidence of accuracy are critical aspects of any dashboard.

Source: WrapText by Equals

This is the second article I’ve read about Equals in a week, and while I’m not ready to go back to a spreadsheet, this company has some great previous posts with excellent content, such as the 2023 summary and How to ship fast. An appropriate statement would be.

What a year. We embraced AI. We reimagined BI. We waved freemium goodbye. And as the cliché goes, we’re only just getting started.

[Last Week in AWS] Issue #352: New Year, New You, Here’s December in Review

Damm right, I think you are giving too much created by saying “a year”. More than once I had to rewrite code because AWS was years behind standard Python releases. AWS Lambda adds support for Python 3.12.

Whatever was going on with the delays in getting new language runtimes out a year or more after the language version itself was released seems to have been resolved. I wonder how long it’ll take that unpleasant chapter to fade from the collective awareness around Lambda.

Source: Last Week in AWS

Latency is the new outage

While technically a video that I listened to with Getting Started with ElastiCache for Redis Performance & Cost Optimization, this needs to be a slogan used more frequently. It is so true. The speaker in the opening minutes also describes some compelling reasons why our proliferation of data can contribute to a negative impact.

Source: Random AWS reading.

About “Digital Tech Trek Digest”

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned, what I should really learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS to name a few.

Digital Tech Trek Digest [#Issue 2024.04]

NoOps and Serverless solutions

I was reminded of an upcoming expiry of a test website that I have on PythonAnywhere. This site enables you to host, run, and code Python in the cloud without any infrastructure and starts with a free account and then a $5 account. Striving towards NoOps and serverless is an important consideration for any small and simple application, I’d forgotten completely about this service.

5 IT services industry trends on tap for 2024

As major companies either want to use a service provider or maintain a relationship with one, knowing the trends lets you consider what SaaS providers of all the services you use like authentication, security, chatbots, support systems, and more are thinking about.

This article considers these trends:

  • Cloud cost optimization
  • Focused transformation, innovation
  • Investment in generative AI skills
  • Vertical market focus
  • Partner programs, reconsidered

Rapid developments in AI will also shape business prospects for consulting firms, MSPs, and systems integrators. AI could potentially provide a way to deliver new capabilities in shorter timeframes that satisfy the C-level demand for a quick ROI.

Source:https://www.techtarget.com/searchitchannel/feature/IT-services-industry-trends-on-tap

Context switching is killing your productivity

I believe the title says it all. The article provides several ways to combat this productivity killer.
Source: https://asana.com/resources/context-switching

Exploding Topics

A colleague pointed me to Exploding Topics. An interesting look at the growth of certain topics over recent years. I’m not sure if they are measuring, articles, products, websites, or just conversations on the topic in question.

Thoughtworks Technology Radar

I spent a lot of time reviewing the recent Thoughtworks Technology Radar. I was hoping that 2024 would issue a current version however Sep 2023 is still recent. My thoughts on the tools, techniques platforms, and frameworks in vogue I’ll leave for a separate post.

Why I’m excited about profit-sharing startups

Every year there is a list of the startups that failed and 2023 failures was no different. There is also the list of likely IPOs for the year. Is it going to be Space X, DataBricks, and Reddit for example.

This article along with a host of links reaching out to sites such as Creator Fund, Humanism and Weekend Fund and other interesting stories re-iterate that it is great people and not great ideas that are the right way of being an entrepreneur. The concept of investing that asks for a return of 1-5% of future earnings is an interesting movement from going down the VC slog.

The article lists these points:

  • There’s a culture shift in tech toward profit-generating businesses.
  • There’s a tech shift that enables talent to build more with less.
  • There’s a regulatory shift that makes exits challenging.

… believe a few big shifts will drive more founders and investors to pursue profit-sharing models in 2024 and beyond.

This tweet talks about Gumroad issuing dividends back to our investors. I always understood that investors wanted to see a return, or a positive change in the return capabilities within a 5 year horizon. Also interesting is this Challenging your assumptions about starutps video.

Combined with Why the Future of Startups are Studios really helps me consider what I started back in 2011 with a number of technology leaders in New York as a viable alternative to what we know about funding a startup. We were always able to get through the first 3 steps easily.

  • Generate an idea
  • Flesh out the idea
  • Launch and experiment
  • Create a project
  • Create a big company

I believe Graham was ahead of his time with Ultra Light Startups some 15 years ago.

Source: TLDR

About ‘Digital Tech Trek Digest’

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering the professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck to name a few.

Digital Tech Trek Digest [#Issue 2024.03]

Lessons from going freemium: a decision that broke our business

As an entrepreneur always considering how to produce a sustaining passive revenue, what licensing model to use, and how to acquire and retain customers, the allure of a freemium model is ever present in so many offerings. You may wish to read this article and look at the visualizations provided with the narrative. I found this a useful data storytelling example.

The allure of seeing a new product is the strongest motivator new users have to complete setup. If you make onboarding too easy, they’ll never come back to do the hard task you let them skip.

Read more at Lessons from going freemium: a decision that broke our business Source: TLDR

Newsletters and online content creators

Lenny’s Newsletter from the prior article, listed with over 574,000 subscribers is one of several Substack newsletters I subscribe to. Substack is described as “The subscription network for independent writers and creators”. I have been collecting the number of subscribers from several newsletters I follow, however, there is no way to see that growth over time. Also missing are the price rates over time and the ratio of free to paying subscribers. Random Idea: What is missing is a history of this information. Other stats I’ve noted previously include 66,000 subscribers for Kent Becks Newsletter with 3 subscription plan offerings, 1,250,000 subscribers for the free TLDR (I can remember this years ago being much less) and 65,000 subscribers for the Seattle Data Guy newsletter.

FWIW this post from Lenny’s Newsletter This newsletter is growing up is from 2020.

Golden Kitty Awards 2022

I came across the Golden Kitty Awards which unfortunately are only current to 2022 (fail on being current). It was interesting to scan the list for innovative ideas. I’ve yet to visit any sites, but I’m always encouraged by what people think of and commit to building regardless of the motivation or incentive. What counts is an entrepreneur takes an idea and releases a product. 

Source: Random

Streamer JS – Video stream layout manager for OBS Studio and other streaming applications.

I am a new user of Twitch streaming for personal projects.  My good friend Sergey Chernyshev organizer of the large New York Web Performance Group has created Streamer JS as a means to drive more dynamic content in the browser and with common languages of HTML/CSS/Javascript and using OBS more as the streaming only component.  One objective is to have better version control management of assets/scenes/sources/filters/etc.  It’s interesting that  PouchDB is an eventually consistent distributed datastore in Javascript. Yet another simple data store to review for suitability. 

Source: Word of mouth

The 37th Chaos Communication Congress (37C3) by the Chaos Computer Club

Last month I was introduced to the Chaos Computer Club. This is a large German-based annual tech conference with a focus on security & infrastructure/hacking.  Over 100 talks from the most recent event last week have been posted here.

Source: Word of mouth

About ‘Digital Tech Trek Digest’

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck to name a few.

Digital Tech Trek Digest [#Issue 2024.02]

Indie Newsletter Tool Generates $15,000 a Month

There are so many different email newsletter sites you could wonder if there is market saturation. MailChimp, Mailgun, ConvertKit, Sendgrid (now part of Twilio it seems), Moosend and Mailersend come to mind.

It seems the space still has plenty of revenue-producing options including buttondown.email reportedly a side gig generating $15k per month. Source: BoringCashCow

When I asked a good friend and author of the Technical SEO Weekly his use of ConvertKit directed me to this Baremetrics Dashboard which is another product to look at sometime.

LLMs and Programming in the first days of 2024

How do use an LLM? If you are still on the fence start getting into the habit of using it more frequently then start. I now use ChatGPT and Claude AI daily, and with a crowded market there are many other emerging technologies to also consider.

I use ChatGPT for coding and image generation with DALL.E. I use Claude more for reviewing large documents that seem to be ideal for producing a summary, or to generate a fictitious movie script from those documents.

I do not like Javascript nor do I wish to actually learn this language however I write it daily via ChatGPT. Javascript is the ever-changing technology of web development and it’s impossible to keep up with the next product, or version of a product you may know. ChatGPT helps me navigate this combined with asking for HTML and TailwindCSS.  However, it’s not perfect, you need to be an experienced engineer who has learned how to write code for many years to ask the right questions and to correct the LLM when it does not produce what you expect. Let’s look at CSS. Now there is flex and grid and it’s hard to keep up with changing features that browsers support. This is where ChatGPT has helped me. I have been using Tailwindcss but it still took an expert friend 30 minutes to help me debug a CSS formatting issue of a future OBS Twitch streaming project to correctly size the content all in a 1920×1080 box. I learned a lot of new features of Google Chrome Developer Tools Inspector I did not know and are probably just the start of expert debugging features.

Until a few months ago I never knew it’s now much easier to read JSON in Javascript.

async function fetchData() {
  try {
    const response = await fetch('data.json');
    const data = await response.json();
    console.log(data);
    return data
  } catch (error) {
    console.error('Error fetching data:', error);
  }
}

let data = await fetchData();

I’d like to remind users that  ChatGPT can make mistakes. Consider checking important information.. Source:  TLDR

ParadeDB (GitHub Repo)

Every day there is another PostgreSQL product to review.  I am a current user of ElephantSQL which I didn’t know existed two months ago. Neon and Tembo are two more PostgreSQL serverless-related products on my product review list.  Now adding ParadeDB as well as reading Thoughts on PostgreSQL in 2024.

About ‘Digital Tech Trek Digest’

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow to name a few.

Digital Tech Trek Digest [#Issue 2024.01]

The Tiny Stack (Astro, SQLite, Litestream)

I spent many years in the LAMP stack, and there are often many more acronyms of technology stacks in our evolving programming ecosystem. New today is “The Tiny Stack”, consisting of Astro, a modern meta-framework for javascript (not my words), and Lightstream Continuously stream SQLite changes. I’ve never been a fan of Javascript, a necessary evil in modern stacks, but it changes so rapidly it’s a constant stream of new products with never the time to learn any. Lightstream is interesting. Replication of SQL operations to a database is nothing new, the Change Data Capture (CDC) of your data, however, I’d not thought of SQLite which is embedded everywhere offered this type of capability.

Amazon Aurora Introduces Long-Awaited RDS Data API to Simplify Serverless Workloads

AWS Aurora Serverless version 2 has been out for at least a year (actually 20 months – Apr 21, 2022), but a feature of version 1 that was not available in version 2 is the Data API. This is for developers without SQL skills to have a RESTful interface to the database, however, it only works in AppSync and only for recent versions of PostgreSQL and only in certain regions. I’ve never used it myself, but it is news.

Speaking of what is available in what AWS region, recently released InstanceHunt allows you to identify the instance families/types available in different regions across various AWS Database services. I developed this in just a few days and released it only last week as a working MVP. Future goals are to include other clouds and other categories of services such as Compute. The prior announcement may facilitate a future version that supports the features of services in regions.

Stop Stalling And Start Your Dream Side Business In 2024

The title kinda says it all. As an inspiring entrepreneur, my pursuits have only offered limited minimal success over the decades and never a passive revenue stream. While the article did not provide valuable nuggets, the title did. One of my goals for 2024 is to elevate my creation and release of side projects, regardless of each project being a source of revenue. I consider refining my design, development, testing, and implementation skills and providing information of value, are all resources of a soft income that showcase some of my diverse skills.

About ‘Digital Tech Trek Digest ‘

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest.

Announcing InstanceHunt

InstanceHunt identifies the instance (families/types/classes) available for a cloud service across all the regions of that cloud.

The initial version is a working example of several AWS database services. Future releases will enable advanced filtering and will cover other service categories (e.g. compute) as well as GCP and Azure cloud platforms, as well as providing the full list of instance types within families within the service matrix.

For a few days investment this MVP is a usable service, complete with adding new regions the same day, for example ca-central-1 data was available the day of release. It is interesting and can answer questions like what regions the new generation 7 instance families are available? What consistent instance types can use use across Europe regions? Where is MemoryDB not available?

Feature requests are welcome. From today’s reading, being able to show a feature of a service may be also a useful future matrix, e.g. AWS Aurora Serverless Data API now available in Serverless v2, but only one of two engines and only in a few regions.

China regions and AWS Gov Cloud regions are coming soon.

InstanceHunt - Find what instances you can use for your cloud services

Mastering MySQL 5.7 EOL migrations

In a recent podcast on Mastering EOL Migrations: Lessons learned from MySQL 5.7 to 8.0 I discuss with my colleague Adam North not only the technical issues that become a major migration but also key business and management requirements with having a well-articulated strategy that covers:

  • Planning
  • Testing
  • Be Prepared
  • Proactiveness

Having a plan is key to any significant task including data migrations. You should heed the warnings and the deprecations and consider all potential downstream product impacts such as connector upgrades. The plan includes a timeline but also needs to define all the stakeholders both technical and business, the definition of a successful migration, and most importantly the decision tree for a non-successful migration that would include any outage, failback, rollback, or fix-forward requirements.

Test, Test, Test. Leveraging the simple design pattern of read-write splitting (hint: if your application does not support this, it should) you get to test with minimal risk all of your application reads and with real load from 1% to 100%. You can validate all writes but this does not match concurrency, however, you can emulate load testing and using this two-way door strategy, verify and prevent many common problems before the decision point of failover.

Being prepared is assuming your migration will fail, rather than assuming it will succeed. Rehearsal of all steps that are documented and reproducible. Validating that your backup and recovery strategy is still optimal and operational with the new version, preparing supporting staff for availability before, during, and after the migration. There are probably not many technologists that can say, “Well that was a boring, uneventful successful migration”. The question is why not?

Being proactive is just as important. Leaving a large migration to the last minute is procrastination and a cause of unneeded stress during a non-successful migration. The Meltdown/Spectre vulnerabilities are one example of a highly impactful event outside of your control that sidelined entire teams in many companies for months. Does an outage of your cloud provider impact your uptime requirements and force you to delay a last-minute migration due to customer SLA obligations? While being prepared is for the reasons you could think of, being proactive and prepared is for the situations you have not thought of.

Having solid architectural design practices will aid greatly in many critical business requirements of uptime, read-only mode, scale-out, scale-up, and sharding. These design patterns also greatly enhance the likelihood of a successful database migration.

We have also created a Checklist to cover the planning and execution of a migration. Any input is welcome.

You can check out the video podcast on YouTube or listen with your favorite podcast tool.

Data Masking 101

I continue to dig up and share this simple approach for production data masking via SQL to create testing data sets. Time to codify it into a post.

Rather than generating a set of names and data from tools such as Mockaroo, it is more practical to use actual data for a variety of testing reasons.

The SQL below is a self-explanatory approach of removing Personal Identifiable Information (PII), but keeping data relevant. I use this approach for a number of reasons.

  • We are using production data rather than synthetic data. Data volume, distribution, and additional column values are realistic. This is a subset of an example, but dates and locations are therefore realistic
  • Indexes (and unique indexes) still work, and distribution across the index is adequate for searching. Technically the index will be a little larger in disk footprint.
  • You cannot reverse engineer the masked value into a real value with just this data set. An engineer in a test environment cannot obtain the underlying information.
  • If you identify an issue with data quality for any row of data, there is a way to present the uniqueness of that row. This enables a person with production access to match the underlying row. Of course, any unique identifier (auto increment or UUID) should also be modified to mask real data.


SELECT CONCAT(SUBSTR(first_name,1,2),REPEAT('*',LENGTH(first_name)-2)) AS first_name,
CONCAT(SUBSTR(last_name,1,3),REPEAT('*',LENGTH(last_name)-3),' ', SUBSTRING(MD5(CONCAT(first_name,last_name)),1,6)) AS last_name,
CONCAT(SUBSTR(organization,1,3),REPEAT('*',LENGTH(organization)-3),' ', SUBSTRING(MD5(CONCAT(organization)),1,6)) AS organization,
created, country
FROM customer
LIMIT 10;

+------------+--------------------+------------------+---------------------+---------+
| first_name | last_name | organization | created | country |
+------------+--------------------+------------------+---------------------+---------+
| Sa**** | Cor**** 4c23cd | Ski*** d21420 | 2022-09-20 03:30:14 | PH |
| Fu**** | Wat*** 8b97de | Jax***** e629c2 | 2022-04-08 03:20:22 | BY |
| Mo**** | Zis***** b11d94 | Rhy**** b4073a | 2022-10-06 15:58:38 | IR |
| So**** | Bad** 232cc2 | Rhy*** 1734bd | 2022-02-01 07:35:39 | ID |
| Ni***** | Ter***** d9ffb5 | Wor****** 6e476c | 2021-11-08 17:07:34 | IL |
| Ka****** | Scr***** 9201db | Jax**** 481fd8 | 2022-08-18 19:17:54 | BR |
| Li*** | Coz** 0447f6 | Nlo**** 11da59 | 2022-07-29 06:47:56 | HR |
| Ch***** | Hal******** f5d9c8 | Zoo**** c6e07d | 2022-09-28 04:54:30 | UA |
| Er****** | Ste******* d005f2 | Eid** ffc305 | 2022-04-28 18:50:11 | PT |
| Fo** | O'S***** b35c44 | Buz**** 2c8598 | 2022-09-11 02:05:55 | RU |
+------------+--------------------+------------------+---------------------+---------+

A reliable and dependable application requires observability

Observability (o11y) is a critical pre-requisite component in software architecture when advocating for and preparing organizations for making informed decisions on the success of their application. Open Telemetry from the Cloud Native Computing Foundation is the goto standard regardless of your choices of monitoring tools. However, observability is just a building block that I need to explain when advocating for having reliable and dependable systems. Observability will not inform you “Was the customer actually impacted, or how many, or how long?”. Observability will not tell you “the root cause of a problem?”.

My five layers of building blocks for Reliability are:

  1. Observability – The collection of telemetry (metrics, traces and logs) should just be there. If you are using Kubernetes (k8s) and a Java/Python/Node.js application it is already built in. Just do it.
  2. Reproducibility – The ability with a known set of steps and a given configuration and setup you can reproduce an outcome showing the same observed results is a necessary pre-requisite for any feature development or bug fixes.
  3. Testability – After being able to consistently reproduce an observed event with measurable results, the running of various experiments using a variety of changes enables you to adequately test future improvements or corrections to the initial situation, whether it’s a known bug, or a new piece of functionality. Reproducible and consistent testing is an essential component to the release of software for a reliable application.
  4. Scalability – It is impossible to adequately test a system to failure without an observable, reproducible, and testable framework. Many organizations suffer from the management “Can we support X operations” syndrome, when instead the application should know what “X” is automatically, and have adequate safeguards in place to prevent its occurrence. The ability to proactively disable [expensive] features for the good of the entire system is not a common practice for software (aka a dark mode). In fact, many organizations do not even have the capability of customer-level and individual component-level feature flags or related rate limits that can manually be implemented.
  5. Dependability – A reliable, highly available, and dependable application requires all of the prior layers to be in place to give a level of assurance to your customers and your company that your product is dependable.

AWS RDS Aurora wish list

I’ve had this list on a post-it note on my monitor for all of 2022. I figured it was time to write it down, and reuse the space.

In summary, AWS suffers from the same problem that almost every other product does. It sacrifices improved security for backward compatibility of functionality. IMO this is not in the best practices of a data ecosystem that is under constant attack.

  • Storage should be encrypted by default. When you launch an RDS cluster its storage is not encrypted. This goes against their own AWS Well-Architected Framework Section 2 – Security.
  • Plain text passwords. To launch a cluster you must specify a password in plain text on the command line, again not security best practice. At least change this to using a known secret from AWS secrets manager.
  • TLS for administrative accounts should be the only option. The root user should only be REQUIRE SSL (MySQL syntax).
  • Expanding on the AWS secrets manager usage for passwords, there should not need to be lambda code and cloudwatch cron event for rotation, it should just be automatically built in.
  • The awscli has this neat wait command that will block until you can execute the next statement in a series of sequential events to prepare and launch a cluster, but it doesn’t work for create-db-cluster. You have to build in your own manual “wait” until “available” process.
  • In my last position, I was unable to enforce TLS communications to the database from the application. This insecure practice is a more touchy situation, however, there needs to be some way to ensure security best practices over application developer laziness in the future.
  • AWS has internal special flags that only AWS support can set when say you have a bug in a version. Call it a per-client feature flag. However, there is no visibility into what is set, which account, which cluster, etc. Transparency is of value so that the customer knows to get that special flag unset after minor upgrades.
  • When you launch a new RDS Cluster, for example, MySQL 2.x, you get the oldest version, back earlier in the year it was like 2.7.2, even when 2.10.1 was released. AWS should be using a default version when only an engine is specified as a more current version. I would advocate the latest version is not the automatic choice, but it’s better to be more current.
  • the ALTER SYSTEM CRASH functionality is great, but it’s incomplete. You cannot for example crash a global cluster, forcing a region-specific failover. If you have a disaster resiliency plan that is multi-region it’s impossible to actually test it. You can emulate a controlled failover, but this is a different use case to a real failover (aka Dec 2021)
  • Use arn when it’s required not id. This goes back to my earlier point over maximum compatibility over usability, but when a --db-instance-identifier, or --db-instance-identifier requires the value to be the ARN, then the parameter should be specific. IMO –identifier is what you use for that argument, e.g. --db-cluster-identifier. When you specify for example --replication-source-identifier this must be (as per docs) “The Amazon Resource Name (ARN) of the source DB instance or DB cluster if this DB cluster is created as a read replica.” It should then be --replication-source-arn. There are a number of different occurrences of this situation.

Our Data Security Moonshot Starts With Prevention

The recent re-announcement of the Cancer Moonshot highlighted a common enemy to many endeavors to improve our society as a whole, and that is using common sense and already known methods.

At a high level The goal of the Cancer Moonshot Scholars program is to inspire and support the next generation of world-class and diverse researchers focused on scientific breakthroughs that will make a difference for patients and drive progress toward the goal of ending cancer as we know it today. source fact sheet

As stories of this announcement filtered thru news outlets with interviews of medical professionals, a known thread appeared. Both lacking in the message, and the single greatest advancement to the problem, which is already known, is prevention. This includes known prevention measures, early detection measures, and education.

As a Data Strategist, Data Security is a critical component of any business and the single best defense is prevention and using common sense.

Here are just some simple basics that seem to have to be discussed and argued repeatedly company after company, product after product, yet there is no single effort to eliminate these poor practices.

  • No clear text passwords. If you have to enter a password on the command line (cough cough AWS CLI) or put a clear text password in a configuration file (cough cough 100s of products), you enable simple techniques to obtain unvetted access to your data.
  • Using clear text passwords is amplified when products offer a more secure means of access and identity management but they do not employ it everywhere.Check out Password Plaintext Storage
  • Clear text transport. It pains me to say it but even in recent employment that held critical PII data, I could not enforce TLS communication between applications and databases. While it was as simple as a configuration option, the constant excuses by engineering management were it was too hard to implement (cough cough BS).
  • The default configuration settings for a product need to be secure, not the default that is most compatible with prior versions. For example, if you launch a new cloud instance database with defaults, is it the most secure options, or the least secure options>
  • Credential rotation. Long-lived credentials should just be eliminated. Often these are also not named users, but commonly used processes.
  • Communicating passwords in clear text. This should never ever happen, yet it does. Have you ever received an (insecure protocol) email saying here is your username and password? A short known list of 5880+ sites can be found in the https://plaintextoffenders.com/ list on github offenders.csv.
  • Data systems accessible via the public internet For example MongoDB article, MySQL/MariaDB article, Redis & ElasticSearch etc, etc
  • Data systems that have no credentials required
  • Data systems that have default credentials that were never changed
  • Storing passwords in clear text
  • Storing passwords with a single salt
  • Storing passwords with a symmetric encryption approach
  • Administrators that use a common account for “root” privileges, not individual named accounts
  • Not patching products with fixed vulnerabilities CVS Program Mission
  • and the list could go on and on….

In all of the above points, there are numerous examples of these data security anti-patterns. While many are due to the products in use, some of these examples represent poor business practices. It should not have to be explained that most attacks and breaches are internal. The common and very incorrect attitude of, we are within our Virtual Private Cloud (VPC) we do not need to encrypt our data is well, plainly wrong.

Transparency

One of the greatest threats to businesses is ransomware. Attackers gain access to system via various means, those above are just the simplest means and then hold businesses ransom. Ransomware has multiple impacts including the loss of a business operating, the process and time of making a decision, the penalty for payment to release the random, and generally the threat of release of their data if a fine is not paid.

There is a lot to unpack even with this ransomware statement. Can you not restore your entire business operations within a suitable RTO and RPO? Is important data not encrypted. Are passwords in your business able to de-encrypted (this should never even be possible). Do you have a disaster recovery (DR) strategy? Can you access critical data via others means and systems independently?

The stigma of a ransomware attack is organizations do not share this openly. They do not share why it happened, what could have been done to prevent this, and sharing all information with federal authorities that should be tracking all occurrences. This information is an important and critical education feedback loop for the whole industry and IMO lacking of attention. Do you know of a website that shared known ransomware attack vectors.

Conclusion

If security is an important aspect to the data in your organization, can you name the people in your security department? Can any individual point out an insecure product with a known fixed vulnerability? Is that information transparent? Is there a process to address that as a top priority, moving engineering and operations goals accordingly? While organizations may employ an error budget for outages, do they employ a security vulnerability budget? Do companies note version updates of all their software, have people read ALL the release notes of each point release, or even know every version of each software product in use in the organization?

For more information, check out

There will always be better and more determined attempts to attack data systems, we have to stop the most obvious first, and we have to participate in identification and remediation endeavors.

Using a simple relatable example to every person, your home. We should start with not leaving the door open, or leaving the keys in the door or simply removing the door all together.

Spoiler – Owning your data isn’t good enough

While this is a catchy title, if you use Software as a Service (SaaS), or an online cloud provider, do you actually own and have total control of your business data and its infrastructure? For all the free and paid services your business uses, what happens if one day, a portion of that were no longer available?

When you have data in a CRM, an analytics platform, a marketing platform, a payments platform, if one of those service providers locks you out of your data, you have lost control and access to a part of your business. Can you still operate unaffected? What is the actual impact? What is your contingency? You could be lucky and the impact is temporary, such as a day or a week, but it could also be longer or even indefinite.

Let me give you a simple but concrete example. Fellow woodworker Eric of Spencley Design posted recently on YouTube “I just lost half of my business”. If you listen to just 2 1/2 minutes from 12:00 to 14:30 of his youtube video explanation you will understand that this business relies on several online SaaS services. Many are free, but for an unexplained reason, whether bad code, bad ML/AI, or several other plausible reasons, one of his income streams was shut down without notice. This was not by his doing, or any of his actions but for unrelated reasons. Online attempts to appeal this situation caused a permanent suspension. Talking to a human to understand what happened, why it happened, and how this can be resolved, was also unanswered because there is no ability to physically speak to a human.

This problem is not limited to online services. A great example of just a decade ago is your business credit card stops working, transactions are declined. If you were lucky you could physically call your bank manager, or go to your bank manager to get to the bottom of the situation. You knew your bank account contained sufficient funds as you maintained on-premise accounting practices and you could provide evidence of such facts. If you run a small business today, do you think you can talk to a human that would have the ability to correct this problem, or would you have to talk to 5 humans, multiple automated (and annoying) systems, costing countless hours of time and frustration?

If you rely on Acme George Inc workspaces product for your small business email and shared documents, what if that becomes blocked? How do you communicate with your customers? What if you use Acme Archie Inc for your customer support ticketing system, and for a week it is unavailable to use? Not only can your customers not report issues, but you have no access to see what issues were already outstanding and work on them independently.

At times there are widespread outages of online presences that have a wide effect across industries from hours to weeks. Cloudflare Jun 21, 2022, Fastly June 8, 2021, Amazon Web Services Dec 7, 2021, and then Dec 15 and Dec 22. A blog post called it the AWS’s December Outagepalooza. The Atlassian April 2022 outage for paying customers lasted upto 2 weeks. Even a free social media company and its related entities incurred widespread impact Facebook Oct 4 ,2021 that affected many gig economy businesses. These outages can have far ranging effects. Actual examples include you cannot pay your employees, your staff at a hospital cannot authenticate to access patient records, transportation and logistics of your shipping business is halted.

I am referring here to loss of access to your data in a SaaS environment, and loss of cloud infrastructure that supported your SaaS services or even your internally developed and maintained systems running on cloud infrastructure. If you are not convinced of the larger ramifications of an extreme loss of infrastructure services what was the impact to Parler in 2021.

My point here is you cannot simply stop using these services, or your cloud provider(s) infrastructure. You need to be prepared. In a traditional system, you backup your data for some degree of disaster, and you support the capability to recover both infrastructure and data from this, and if you a smart you actually test this. Sidebar a colleague recently shared that even with massive investment in infrastructure and global redundancy, a scheduled test for this large bank took down services for 12 hours.

Large SaaS organizations could offer services that offer multi-region or multi-cloud capabilities, but they are also at the mercy of the SaaS providers they use. Do you know all the interdependencies? Look no further than the wipe out of Okta’s stock (down 30%) in one day. CEO of Okta Todd McKinnon cited several factors including a security impact by text message provider Twilio. Read more about that at Twilio Employeee, Customer Accounts Breached Through Texts. And yes, the headline here has an incorrect spelling. I tried to add a comment to offer feedback, but the MarketWatch paywall of 4 articles would not let me create an account to login to leave a comment!

The solution is not to host all of your own infrastructure either. Facebook’s very long outage was self-inflicted and they controlled all of their own infrastructure. It not only had an impact on their websites, their internal staff were unable to use security badges to access critical infrastructure to correct the problem because they were physically locked out of buildings holding the infrastructure.

Returning to the small business owner who uses a marketing platform, an analytics platform, a CRM, a payment platform or even a social media platform. Do you keep current copies of your data in these systems so that if there were a loss, you knew who to communicate with? In the first cited case, did Eric have a list of all of his subscribers, a copy of all his online content, and all comments made by subscribers. Was there a means to communicate with them via other means, or was access to sufficient PII not even possible for what was his original content?

In future posts I will share some of my techniques for ensuring you have a data acquisition strategy.

What is the right length of a blog post?

A question without a definitive answer. Finding opinions from authoritative sources can also be easily obscured due to search engine optimization or even the choice of words used while searching.

I used the following search terms initially in Google and DuckDuckGo.

  • what is the right size of a blog post
  • what is the ideal length of a blog post

I then started with the term “ideal blog post”, and here are the type-ahead responses. Clearly “length” is the definitive winner in word association. My first thought was “size”, is that a technical difference?

DuckDuckGo

  • ideal blog post length
  • ideal blog post length for seo
  • ideal blog post size
  • ideal blog post length 2021
  • ideal word count for a blog post
  • ideal length for a blog post
  • ideal length of a blog post
  • ideal length for a blog post

NOTE: Size mentioned only once.

Google

  • ideal blog post length
  • ideal blog post length for seo
  • ideal blog post title length
  • ideal blog post length 2022
  • ideal blog post length for seo 2022
  • ideal blog post length 2021
  • ideal blog post length 2020
  • ideal blog post length for seo 2020
  • ideal blog post frequency
  • ideal blog posts

NOTE: Size not mentioned once. As a result the original title of my post was changed from size to length.

Search Outcomes

Using Google, which now often will provide a summarized result (known as a feature snippet) before examples of what People also ask, or ad results that are even before ranked actual results.

what is the right size of a blog post – Google

2,100-2,400 words
For SEO, the ideal blog post length should be 2,100-2,400 words, according to HubSpot data. We averaged the length of our 50 most-read blog posts in 2019, which yielded an average word count of 2,330. Individual blog post lengths ranged from 333 to 5,581 words, with a median length of 2,164 words. Mar 2, 2020

ideal blog post length – Google

about 1,500 to 2,000 words
Although your blog post length may vary depending on your topic and audience, it is often best to aim for about 1,500 to 2,000 words for articles or posts. Longer pieces seem to do better when it comes to ranking on SERPs.

DuckDuckGo

I have not yet seen, nor in these examples is DuckDuckGo creating a single answer summary. Probably IMO a good thing.

what is the right size of a blog post – Bing

Branching out I was curious what other possible engines provided.

1,600 words – According to 2 sources

And then a non copy/paste answer that I had to extract from developer tools

In the infographic “ The Internet is a Zoo: The Ideal Length of Everything Online ” from Buffer, they find that the ideal blog post length is 1,600 words. But some sources think a good blog post should be even longer than that. In a Medium article, the writer says that posts with an average read time of 7 minutes captured the most attention.

According to research done by popular blogging platform, Medium, the ideal length for blog posts is 1,600 words (or seven minutes of reading). This number is based on an analysis of the “average total seconds spent on each post and compared this to the post length.”

ideal blog post length – Bing

To sum up, here’s a list of common blog posts lengths to help you find your own ideal length:

Micro content: 75–300 words. Super-short posts are best for generating discussion. They rarely get many shares on social…
Short-form content: 300–600 words. This is the standard blogging length, recommended by many “expert” bloggers. Shorter…

More …

what is the right size of a blog post – Yahoo

Above the fold, after ads and before People also ask and actual results was

For SEO , the ideal blog post length should be 2,100-2,400 words, according to HubSpot data. We averaged the length of our 50 most-read blog posts in 2019, which yielded an average word count of 2,330. Individual blog post lengths ranged from 333 to 5,581 words, with a median length of 2,164 words.

ideal blog post length – Baidu

As the homepage was all Chinese and I wasn’t sure if I should continue but I cut/pasted english and hit the button and got results in English.

The text of the first search response was something I’d not seen on any other page, so for reference apparently there are Blog styles :)

Ideal Blog Post Length for SEO Blog posts vary in length from a few short paragraphs (Seth Godin style) to 40,000 words (Neil Patel style).

What an SEO SME says

So I reached out to my most knowledgeable friend in SEO and asked them the question Without googling or searching online, based on your SME.

Q: What is the right size of a blog post?
A: You mean content length? 1500 to start, ideally more towards the 5,000 or 10,000

Q: What is the best reading time for a blog post?
A: depends – long form vs short – some times a simple paragraph is all you need. Other times you want a book.

Summary

Using what the engines provide as a single recommendation, not the top organic search result.

Source Response
Google 1,500-2,000 or 2,100-2,400 depending on question
DuckDuckGo -
Bing 1,600 (only to mention time of 7 minutes)
Yahoo 2,100-2,400
Human SEO SME 1,500

Additional Helpers

A recent edition to my short reading email summaries of useful articles is TLDR. While this is not new information the inclusion of 1 minute read, 2 minute read, 11 minute read is useful data to me in making an informed decision based on the factors at the moment. Other information that helps this example which is a newsletter is 300,000 Subscribers and 43% Open Rate. There are also other data points that help, and could narrow your audience and determine what you may consider and ideal size.

Returning to the summarized results of various search engines, only one, Bing, provided this additional measurement of time, and the answer was “average read time of 7 minutes captured the most attention.” which translated into 1,600 words.

I cannot ofter any personal validation of either of these data points, but I should perhaps start collecting it.

Conclusion

What is the answer? Well, only your target audience can inform you of this. The question(s) is then who is your target audience? Is your target audience who you think they are?

For the record, my last blog post was 1973 words long, and this one is 1216 words long, therefore averaging 1594 words. NOTE: These numbers were the original versions length, both of which have changed/evolved over time with additional feedback.

This leads to a more important question. How are you measuring the impact of your blog posts and how does size/length/time play a role in that?

Sidebar: Is a blog post actually the best way for people to read your content, or at least gain insights into what may be useful for your readers. Is a newsletter a better option?

Going back to the TLDR newsletter for a moment, this information can be found on the website.

  1. Highly technical audience, primarily software engineers and other tech workers
  2. 30% United States, 10% United Kingdom, 10% Canada, 25% other EU, 25% other non-EU
  3. 50% ages 25 to 34, 20% ages 18 to 24, 20% ages 35 to 44, 10% other
  4. Primary sponsors get between 1000 to 1250 clicks
  5. Developer sponsors get between 750 to 1000 clicks
  6. Subscribers from companies like: Google, Amazon, Facebook, Apple, … (it’s interesting this is a list of logos, and what order they are in, FWIW)

I do not have access to the data so I am unable to gain more insights as to what is most read articles based on time. Hint: Interesting infographic for TLDR to publish.

I would ask how do they know point 1 and point 3 of my information without additional data mining providing this detail? I provided an @gmail email address, and my location can be determined via IP.

Can a picture replace a text description?

Data visualization, data storytelling, and data lineage are all ways to better describe and visualize a specific situation for a set of data. Generally, I find these techniques are used as a means to uncover or identify information that ultimately pertains to individuals. For example how many sales have we made across time/location/business unit? How many customers do we have? How many social media photos has a person provided over a period of time? Unfortunately, this is not the kind of data that I feel has real-world meaning to me. It doesn’t describe the advancements made in the biomedical field to help fight disease, it doesn’t tell us the amount of energy that we have saved or the amount of energy that we failed to collect and the impact that has locally and globally on our world, or it doesn’t describe valuable human experiences in history about people and places. I find this value in data visualizations of others.

During a recent vacation, I thought about the impact of the visualization of my experiences and just how much information was not collected, and how much information was collected but is of average or poor value or is extremely valuable. How hard it was to collate even what I had collected, and to who or what the value of this information is? While these are personal experiences and not that of a commercial organization, Google certainly informed me of how many people were viewing a public image I uploaded and a comment I made of an iconic Australian location and food.

Inaccessible value in a text description

I recently caught up with a very dear friend. She had lost her husband 10 years ago after more than 50 years of marriage. He kept a written diary every year since 1946, starting a year after the end of world war II that he served it.

So from 1946 to 2012, that is 66 years, there is a wealth of information that includes personal feelings, expectations, and perhaps thoughts about what is going well and what’s not going well. It would also include valuable information about the world view from a very intelligent and influential individual. These diaries are still located today on the same shelf in the same office they have for the short 25 years I was aware.

To draw a conclusion to the question with a data analogy. A single copy, in a single location of un-indexed information, which first-hand sources know has unlocked potential. It is also an immutable and finite time capsule. It would also IMHO contain great value in feelings, emotions, family, and history that is important to a small community. Could a picture represent this data?

A picture

Whilst traveling I used my camera as a means to record the experiences that I was having with my family. I am a photographer and not a videographer so my expression is a picture rather than a video and optionally audio. Sidebar, we did give our child a diary for this vacation to write in, however the attempt to build this new habit only lasted 2 days. Forming new habits can be hard.

But can a picture relay adequate information to describe when and where this photo was taken, why it was taken, how I felt, who I was with, what did it inspire, how did it make me think about related experiences in the past.

Today there is technology now that can take a picture and describe the contents, effectively it could create a summarized description of the picture. With additional metadata such as Exif data where you can extract more details such as time and location. With machine learning you can do picture comparison to identify locations even if location was not specified with the photo.

You can now have AI create an image from a description, if only my Dall-E-2 account would not keep crashing I would try it out.

A picture on its own only contains some value. If you collect all this information and combined with other sources, for example when I used my phone and not my camera, this is stored use google photos. This company can use this information to create a timeline of where you were, when you were there, perhaps you were with and combined with all the sources this company has such as your Google calendar, and Gmail it can and does create a timeline much like the timeline you see in social media platforms such as Facebook if you are regularly user of such a platform.

So we have not a picture, but a collection of pictures, including those not taken or owned by yourself, combined with other structured and unstructured data that can provide an improved timeline.

In comparison in data visualization there is usually a time component for most data. Animated data visualizations which can be awesome usually represent data across time.

Example pictures

Let me give you some simple images as an example and I’ll add some information that is not included in the photos specifically what is available in today’s modern technology such as GPS location. First my existing EOS 5D camera does not provide that information and second I do not enable that on my phone because I want to keep that information private and Google does not provide a capability that would enable me to store personal information but do not share that information for consumption by for example use that information for other machine learning capabilities.

Emu

I had never seen an emu roaming in the wild in Australia before this trip. From a conversation with a friend on a different topic did she provide information that driving between Jeriderie and Narandera you will find emus in the wild. Indeed we did multiple times on this specific highway without having to randomly goto some isolated place hoping for the same outcome.

This is a truly unique animal without only an ostrich looking similar.
Emu in nature, NSW Australia

AWS Rekognition output with values >90% categorize this photo with Antelope, Animal, Mammal, Wildlife, Bird, Sheep, with parents categories of Wildlife, Mammal, Animal. Well that is horribly wrong. An Antelope has 4 legs, this clearly has 2. An Emu is not a Mammal.

AWS Rekognition Response – August 2022. Larger version

That was so bad (and unexpected) I wanted to give the technology another chance with a different emu pic.
Emu in nature, NSW Australia
This time AWS Rekognition output with values >90% describe this photo with Bird, Animal, Emu, Sheep, Mammal, and at 88% Antelope and Wildlife. So if you get Emu (that has two legs), why would you say Sheep which is 4 legs and not a bird? And if you said Emu and Bird, why would you then select Antelope, also with 4 legs, but so not a Bird.

AWS Rekogintion Response – August 2022. Larger version.

Feeling a bit duped by technology, I tried Google Image search next. The first image was recognized as “Tasmanian Emu”. I didn’t know there was such a thing, but it did say Emu and all other related visual matches were Emu. I was surprised it only picked 3 of the 4 animals in the first pic.

The second image was recognized as “Tasmanian emu”, “Emu” and “Common ostrich”. Doh!

Platypus

This next image I was confident AWS Rekognition would be spot on. It’s even a more unique animal, and there are no grass or obstacles to obscure the animal. Boy was I wrong.

Platypus in nature, Eungella QLD Australia

AWS Rekognition output with values >90% describe this photo with Wildlife, Animal, Mammal and at 87% Hippo. It is true that a Hippo is a mammal, and you do find them in water, but?

Platypus in nature, Eungella QLD Australia
Finding an even more evident picture that anybody would recognize, well the software could not. Wildlife, Animal, Mammal. At 85% Lizard, Reptile and Otter. A lizard is not a mammal?

This post is starting to turn into a self proposition even more than I thought.

Had I provided GPS, it would have said Eungella, QLD and any more additional searching would show that it’s a popular destination for finding a Platypus in nature. In-fact a precise GPS location would give the name literally as “Platypus Deck”. here. A human would quickly articulate this by reading text from basic online searches.

Would AWS Rekognition discount values responses if it could know the precise location or even the country. Somehow I feel not.

For what it’s worth, a Google Image Search of Platypus yields tons of pictures AWS should use in it’s recognition validate and ML model.

What this image does not say with whom I was traveling with, that is a local of the area and his comment that in 20 years I’ve not seen platypus as easily and playful as this. It would not describe that on social media, many locals were surprised with the quality of images and videos. It would also not describe what the video shows, how they dive and then burrow into the muddy water using their bill, then rise to the surface and dive again.

Trying Google Image search again, the first image yielded platypus which it is. For the second image, google found no results. Again, I was rather shocked in comparison to the images of platypus a Google Image search shows and the fact this second image showed more detail IMO.

If I correctly label this image with alt text as a platypus, will Google Image search in future show this within search results, or will the output of correct recognition improve at a later time.

Other animals

At this time I decided to give up on animals. An echidna is a unique animal unlike almost any other, a quokka is also, but does resemble a small kangaroo. I do have a hippo, I’ll have to try that out.

Other images

I decided to try some more easier images and I was again overall disappointed and therefore decided to stop adding content here after these two images.

Surfers during summer waves in Oahu

AWS Rekognition above 90% accuracy was Sea, Ocean, Water, Nature, Outdoors, Person, Human, Surfing, Sea Waves, Sport, Sports.
Google Image search gave the first visual response as Viewing summer Swells on Oahu, which is spot on. It was Oahu, not Maui (wonders if I have a surfer image from Maui), and it was more importantly in Summer and not winter which has much larger waves.

Finally I tried to pick one of the most uniquely observable images that you will only find in one location and this image also included readable text.
U.S.S Arizona Memorial at Pearl Harbor Hawaii

AWS Rekognition above 90% was Flag, Symbol, Watercraft, Vessel, Vehicle, Transportation, Waterfront, Water, with Ferry and Boat being 88%. Rather shocked it could not pull out not 1 but 2 specific and clear areas of text and used this. The visible text is “U.S.S ARIZONA MEMORIAL” and “USS ARIZONA BB 39″.

Google Image search described this image as the Pearl Harbor National Memorial which is exactly what it is (in summary).

Even a picture of the Sydney Opera House, a truly unique building did not yield the result of Sydney via AWS Rekognition.

I look forward to this post being indexed to see if Google can give the source of the image as my site. I am also going to have to look into Google APIs for image recognition rather then the very slow and painful web browser option.

In conclusion

To return to the question of this post. Can a picture replace a text description? In short, no. A picture can summarize and quickly convey meaning simply because of how our mind processes visuals but it cannot replace a full text description.

In these examples I would expect almost every reader to be about to summarize these images more accurately, whereas (accessible) machine learning has a very long way to go. Even with location information, only some people would be able to add additional value or better infer an initial description, however all the information I could convey that is of applicable value is not within the picture itself. You cannot interpret additional meaning and value without more context, or without intrinsic knowledge. Using the surfers example, you can find waves used by surfers all around the world. In Oahu specifically in summer, surfable waves are infrequent and small, whereas in winter apparently they are very different.

With data storytelling, a data visualization is going to provide a similar outcome. Better visualizations will contain color and legends that visually describe information more clearly. They will also offer additional insights in creative ways, such as magnifications or clear differentiators. Should photographs as these shown here, automatically contain further context that is shown just in the image. Should image recognition automatically suggest titles that can be further edited by the author? Should an audio summary of the image be able to be recorded with the image? Should any single picture use context of other pictures around locality or time or similarities in the view to build a better picture. It is interesting to consider how technology could improve to provide greater value to the consumer.

How much text is needed is an entirely different question. Is the saying “A picture is worth a 1,000 words” approximately accurate?

Even a complex picture that takes months to review all the detail does not replace a full text description. If you would like an example for comparison checkout the “L. Tellier Kitchen Poster”. A piece of art that I own.

A summer sabbatical

In recent weeks I have been sharing more informal thoughts and in the upcoming weeks, there will be a period of greater radio silence.

After three decades as a professional, I am taking the entire summer off. This will be a chance to intentionally not sit at a desk, stare at a screen, look at my phone, read emails, read articles, and all those other work and personal related activities one does.

Weekly Musings – July 8, 2022

A very succinct description of the responsibilities of leadership by Jawad Nagda (infographic below) shows a number of key features of management that are also needed in data storytelling such as empathy, integrity, and listening. I wish annual 1:1 performance reviews gave employees an opportunity to rate their manager in such detail? It would be an interesting infographic to compare your managers for the past 3-5 jobs visually?

The Data Management Value Realization Journey by Bill Schmarzo really shows the depth and breadth of what an organization needs to be prepared for. The infographic (shown below) provides a lot of detail and each time you look at it from a different perspective you can see a wealth of key terms and thoughts to value. This starts with your data, the velocity, variety, and volume of data, which is described as fast, diverse and deep. The value journey to operationalizing this information clearly outweighs the risks of not being prepared. As much of Sydney is now underwater, and with the frequency of “rain bombs” in Australia, how could your company prepare for an influx of data (a once-in 500-year event)? Could you filter valuable data from invaluable data and draw insights quickly, or would you need to create an infrastructure to do so and train resources? The timeliness of your investment may be too late. The Data Management Value Creation Journey Map should be at the forefront of your business for planning how information drives your business success.

When your computer is idle, is it really idle? Peter Zaitsev shared this article What does an idle CPU do? which is a great read. Reminds me of my very old Unix (yes before Linux) kernel core dump analysis, where I had the Unix source code in question. A computer does not just do nothing unless it’s powered off or sleeping.

AI is used in many different fields. DALL-E 2 is a service that will create art and images based on a description. While this sounds interesting, I consider ML/AI as tools to help improve our society, and our decision-making and remove and replace redundant workloads. I feel creative expression is a talent and gift of an individual and the value of the work is in the eyes of the beholder. DALL-E 2 had to learn by imitating other famous works of art, some artists would learn this way, but some are just naturally talented. Will there be an AI to critique works of art, and how would it describe DALL-2 E’s works?

Speaking of art. I have always been fascinated with water structures and large outdoor works of art. The Bellagio fountain is one example of that. And for those naysays of water usage do your research, this project is actually very water efficient. This video of ultra-slow motion fluid dynamics (2 minutes) is just incredible. (some screenshots below)

Some images of the week.

Volcano + lightning

Weekly Musings – June 10, 2022

A large part of my work week was spent u-hauling across 1/3 of the country. This was a very mentally intense time, indeed 8-10 hrs per day of concentration working with dangerous equipment and sometimes in unpredictable situations with little break was harder than sitting at a desk. I had a lot of time to look at all those trucks on the highway and compose some thoughts about improving our planet. Yes, I did pay $5 per gallon for fuel, and at one stop $150 didn’t even fill the tank.

Easily 50% of all vehicles on the highway were semi-trailer trucks, a cab hauling one or two trailers (henceforth just trucks). If 100 trucks are moving from Point A to Point B, and let’s say it’s 8hr to 16hrs in travel distance, it is highly possible for longer trips you are also away from family. That’s 100 people that are always on, focused on the sole task of driving, you cannot step away for a quick break like in other roles. Electric vehicles will reduce emissions, but that’s not solving the problem. Driverless vehicles will help but that is also decades away from practical use. While 90% of vehicles will remain operated manually for many decades if not always, I see this as an impractical short-term solution.

I had to feel that rail is the obvious alternative here. You can with fewer individuals haul 100 containers, which reduces the human impact. The track is fixed, providing you have the correct support for trains in the opposite directions, so no dealing with the varying speeds of vehicles and crazy drivers. That reduces the mental complexity, and it also reduces the volume of larger vehicles with passenger vehicles. But rail has significant limitations in the change in elevation and direction unlike a road. Any tangible improvement to reduce traffic on highways would work best in areas of flat country. Is this geographical limitation alone a sufficient deterrent.

However, a train goes from point A1 to point B1. It still requires transportation of the container from individual companies’ locations A to A1, and B1 to B. These are much smaller distances and require those 100 drivers, however, they spend less time on the road, less stress on long-hauling, less time away from family. You also cannot just drive the trailer onto a rail car, so there is the complexity, and bottleneck of getting containers onto and off of trains. So is there a way to solve the actual problem of too many vehicles with so much human requirement that also requires concentration and attention, and a volume that is every increasing. The reason this would never work is capitalism. We live in a world where every company wants their own trucks, their own product traveling on their own schedule. Until we stop thinking like 1000s of individual companies and 100s of individual countries to focus on 10s of critical problems facing the planet, I feel the root cause is never actually being tackled. Ironic that in software engineering, the same issue of not tackling the actual root cause in larger strategic ways also occurs.

Changing topics. Let me start with a technical analogy of the following real-life experience.

You have terrible technical debt. They may be known reasons why this occurred in the past, but those reasons and those people are long gone. Yet all subsequent workers suffer from this accumulated technical debt and the impact on product quality and time efficiency is never actually measured or calculated but it should because the impact would be staggering. Vain attempts are made to make some improvements but the amount of technical debt grows, as the number of people writing code grows, the number of varying tools and their apparent effectiveness grows making it all easier to access faster ways of doing things poorly. Highly specialized individuals are hired to help address the problem, but then instead of being able to apply their wisdom to the advertised position, they are subjugated by the few, and either capitulate and are assimilated, or leave feeling worthless and powerless to a solvable problem because of the power and greediness of just those few that try to wield their power. Many may whisper in the shadows or wish for a better situation, but instead, accept the unacceptable normal as the new normal. Soon they have no idea how to relate to what is actually the right thing, except that they believe it is wrong because it’s not what is done now.

I generally refrain from any personal statements, however today I’m going to talk about my closest experience with “Guns in America”. Some facts to start.

  • The US accounts for 4.25% of the world population, let’s say 1/20.
  • The US has between 40% and 50% of the estimated number of guns in the world, so almost 1/2.
  • There are more guns in the US than people. Cite America’s gun culture – in seven charts
  • There are more mass shootings (4 or more wounded by a gun) in 2022 in the US than days in the year
  • I live just 20 minutes away from Sandy Hook. Our church has a memorial for that tragedy. Thankfull have never had to deal with the impact of gun violence..

As a parent, I could not fathom the lifelong anguish for parents of senseless deaths of their children to guns in schools or churches or supermarkets, or hospitals. It is articulated that many gun owners are responsible gun owners, so why does the gun industry, protected from being sued in the country that sues for everything, control the narrative of the safety of humans? I don’t have to be a scholar to read a document that is over 300 years old to see how a few have twisted its meaning, and control the entire population because of it, unwavering in being reasonable that things have changed in 300 years. They certainly afford all the improvements made living in our society in the past 300 years.

My neighbors own guns responsibly. They are also parents. You require a gun license, just like if you were driving a car. They are stored in a locked gun safe, just like you would with other vital possessions or dangerous ones, however this week I came to the realization that many people are not as fortunate.

This week I was at an event, where the circumstances brought me the closest to the real potential of guns in America. Skipping forward from important preamble. I was part of a subsequent conversation with brother B of individual A who asked family member C about his guns. “He has two handguns, one may be in the car (the car he left in that police subsequently arrested him in), he has two shotguns, he has a rifle, like a sniper rifle, that’s big it will be easy to find, and at least 4 semi-automatic machines guns including the AR-15″. Person B was going to collect these items, and they were not secured in any way, so the conversation was where they may be in the home. What happened was individual A wasn’t going to be even arrested, until other ex-law enforcement strongly suggested it happen. This individual was out on bail within a few hours.

This situation could have been very different. Individual A could have left feeling betrayed and returned with weapons of mass destruction. They could have just started out like that. They could have returned home to find their guns gone, and just gone and purchased more, or even possibly just borrowed others easily. I am skipping over a lot of important details as to why this was more of a close call then I am describing.

Guns in America is a complex problem, however when every single recommendation from politicians for fixing the gun problem by doing everything else except tackling the actual root cause, the gun, well that’s insanity. There is simply no other single word. When there is a press conference regarding a terrible mass shooting at a school, and not one single immediate action regarding guns is mentioned, why? My thoughts and prayers are also for all those suffering, but removing machine guns, requiring licenses, requiring background checks, raising the age, limiting the amount of bullets and magazine capacity, not allowing sale of body armor, these are all reasonable requests that still let you own a gun, just like a car. I have to provide proof of Id to buy Sudafed for a cold, but I could walk into a gun show and buy a machine gun. You have to be 21 and show your Id to purchase alcohol, but I can easily get body armor. I was forced to provide my age to buy one container cough medicine from a grocery store, yet you can buy an excessive amount of ammo more easily.

Returning to the technical analogy, it seems the gun problem is just like a technical debt problem. It never goes away, there are always ways to make the increase of technical debt easier. The priority is to add to the technical-debt not to prioritize removing it. In an organizations of 1000s, the few that try to make the world a better place, and constantly battling an ideological world view in software engineering that is well, wrong.

And the week in several images.



Weekly Musings – June 3, 2022

This week I wanted to share more about Observability and the CNCF foundation project Open Telemetry. Observability is a necessary foundation for any information system however observability does not answer questions that are essential for a successful business to operate. Let me explain in more detail.

Observability on it’s own does not answer these questions:

  • Was the customer impacted due to an event?
  • What is the root cause of a customer impacted situation?

So, no matter how much data one can provide here, what is the data story you need to be telling?

Let me give you a concrete example of a recent actual outage example.  Your cloud provider has an outage at one data center within one availability zone in one region. Your observability shows that 13% of your fleet’s infrastructure is impacted. You employ a multi-AZ single region primary customer-facing website.  While there are alarms and alerts and pages, your infrastructure balances the load, IaC relaunches the necessary replacements and most systems return to an apparent steady-state (I’ll leave the “hint” of apparent for another time).  

Was the customer experience actually impacted?  There are alerts of an increase in 500 errors, however, this quickly resolves. There are some small increases in latency of primary functions that you have on your dashboard? What did the customer actually experience?  Was it just a few customers, all of your customers, or certain customers based on what level of functionality they were performing, for example searching for products to purchase, adding products to a shopping cart, or checking out?

Observability is not going to answer the fundamental question of “Was the customer impacted?”.  Your business needs to define the metrics of measurement and actually capture this. Is a single customer of 100,000 active customers receiving a few 500 errors considered impact? Is 1% of served traffic affected considered impact? What duration?  What is actually necessary are business-specific metrics around your customer sentiment. Is it simply a measurement of revenue per minute compared with seasonal measurements of the same time of day, the same day of the week, and with the same impactful event such as a public holiday. Is it more complicated? No amount of RTO, RPO, MTTD, MTTR, multi-AZ, or DR resiliency is going to help you here.

Let’s take the same situation, but this time the IaC doesn’t work. More alarms are going off, and certain layers of your infrastructure are highly saturated.  Manual attempts to correct the loss of resources do not work? Where is the root cause of the problem? How can you fix the root cause? What if the root cause is a portion of your infrastructure that is a purchased product by another provider, and is a technology stack that does not match your own companies or the skills of your employees? How do you address this “house is on fire” situation?

In the above example, AWS suffered 3 outages last December 2021 and one was the loss of power to a single us-east-1 availability zone.  If you did not know this, us-east-1a for your account does not mean us-east-1a for a different customer of AWS. In fact, it doesn’t even mean the same if you have multiple different accounts per environment.  An availability zone is also not one data center.  Prior incidents have shown that it could be a small percentage. One AWS AZ could comprise 5-10-15+ different data centers.

Also, in the above example, if your container registry is highly-available, but an incorrectly configured third-party product and is now in a state where you cannot re-launch any infrastructure because the necessary images are inaccessible, your business is hobbled.  Have you planned for this situation before?  Let me share some more hypothetical questions about this scenario.  The stack is not what your on-call resources know, there is insufficient documentation about this system, and there is no test infrastructure in order to reproduce the issue, or validate any hypothesis. What if there is no support agreement with the company that sold the product?

As you can see the role of an architect, whether a solutions architect, a data architect, an enterprise architect or the principal architect, you could consider in many organizations this far exceeds the likely scope of your day-to-day obligations.  Is there such a thing as a disaster-preparedness architect, or a chaos architect?  Is the architect not even a sufficiently leveled responsibility here! Is it the Head of, or the Director of need in your business? Is there such a thing as a Chief Reliability Officer (CRO)? Seems a google search finds results. Added to my reading todo list.

My professional experience is that Observability is the first essential layer of total observability infrastructure for your organization.  The full stack actually includes:

  1. Observability
  2. Reproducibility
  3. Testability
  4. Scalability
  5. Reliability

All of these layers are essential. Each layer is a prerequisite for the next.  In your position in the organization where do you start? As a reliability resource you need Observability first. As a test engineer, you actually cannot start with Testability. As a C-suite executive, you need to know that system Reliability comes first, but how to you validate that?

I will be providing a much more in depth paper on this in the future. 

What is also missing from this list is one essential business-wide requirement — Ownership.  If in the entire organization, from the developer to the manager, to the customer support representative to the c-suite officer, every level is needed to take joint ownership in customer success. The weakest link is the actual problem and no amount of instrumentation, process or dashboards can address that.

Moving on, VS Code again came up in conversation in my tech circles, I really should practice using it.

My neighbor purchased the company Steel Bee – Long live your razors. It was a fascinating conversation about not creating a new product, but selecting an existing product that has a drop-ship infrastructure already in place and an Amazon and Shopify store presence. How do you measure the success of something you did not build? How can you improve on it?
 
On a personal note, I am about to venture into the world of CNC routing. Anybody with tips & tricks and open-source software to use? I am currently trying Carbide Create

With all that is happening locally, let us not forget to #StandWithUkraine.

This week in images.




Weekly Musings – May 27, 2022

———

We should all take a moment to reflect that going to school should be a safe, happy, and memorable part of everybody’s life. That was taken away this week from 19 children because common-sense laws, licenses, and checks do not apply to deadly weapons in this country. They apply to get a car license, to require car insurance when purchasing a vehicle, or to purchase Sudafed for a stuffy nose. I reside just 25 minutes from Sandy Hook Elementary school. My church has a memorial for that tragedy. As a parent, I could not comprehend what the grief of loss could be. My prayers to everybody affected in Uvalde, and to all other school districts this year, last year, and all years before that.

———

In recent months I have focussed on improving my data visualization technology skills, and working on my data storytelling skills. 3 Tips You Need to Be Successful in Data Visualization sums this up well.  “Data visualization is not just a skill, it’s a lifestyle. Keep learning and find new ways to get better”. If you are interested, my favorite physical book to date on the subject area is Effective Data Storytelling by Brent Dykes. Great detail, as well as great quotes.  This week Brent has published 100 Essential Data Storytelling Quotes from his book which is a timely affirmation.

How well we communicate is determined not by how well we say things but how well we are understood” — Andrew Grove


More reading and discussion on what is Web 3.0? What does it mean for our field?  What does it mean for my future skills?  The hard truths about Web3: What no one else is talking about was something I read this week after it was recommended by a good friend. The takeaway is in the closing thoughts “Instead, educate yourself on the long-term sustainable use cases of blockchain technology.”. My friends’ takeaway about Blockchain is “It’s a tool, not a solution.” I would tend to agree.

I launched a new project last weekend and I’ve selected for a second time to go with Hugo for a static site generator. If you want a drag and drop template well it’s good, but there is definitely a learning curve if you want to make just minor tweaks. My theme for example said it included Bootstrap, but I wanted to accent a post with a TIP box (in Bootstrap they are called Alerts). Do you think it was trivial to work out why Bootstrap alerts didn’t work in my Hugo template? I spent over an hour because of the complexity of a low-code, no-code solution, whereas if I’d built a site with straight HTML/CSS/JS/Bootstrap it would have just worked. Maybe I’m old school, but clean code and not three levels of abstraction is IMO more maintainable. Does it take longer to be productive? At the start of a new project perhaps, but if you don’t have very technically capable resources that are at your avail, the selection of an internal tool for an essential part of your business may be a poor choice.

As an example. Last year my employer suffered a long outage due to the rough AWS Cloud Dec 2021 with three separate incidents. In one occurrence, the loss of power to a data center that knocked out approximately 7% of one AZ would not be an issue for any organization’s business that runs in a highly available multi-AZ model right? Wrong. The use of a Docker Container Registry product, that was configured has HA went down, along with multiple nodes. Those nodes could not be relaunched because the registry was down. The images could not be rebuilt because they relied on additional images. The entire site was degraded because of one component that was configured in a HA capability, but it was configured incorrectly. To further complicate the matter, the entire stack, from the IAAS to underlying technologies was not part of the stack the DevOps team used, and without clearly documented installation, testing, and chaos experiments. To further complicate the issue, this required obtaining commercial support for the product being used right then, opening a ticket, and getting a support person of said commercial company to help address the issue. The moral here is. If your business relies on it’s availability and you do not have the technical skills and capabilities and redundancies of your staff to ensure its availability, then are you really thinking hard about being prepared, or are you chasing the next sale, the next feature, the next new wave of technology?

Want to get your links to render nicely in the varying products you use? Twitter Card Validator can be a bit of a hit/miss effect. I have found that if I cut/paste a link in chat programs including Slack, Google Chat, and Signal which all provide a different experience but seem to be more responsive. I guess I will keep working on it. (Damm you Hugo!)

On a more personal note and a sore pain point is 401k retirement plans and planning for retirement in the U.S.A. Have you been burned by the 3-year vesting rule of your employer’s matching contributions that you didn’t know about when you looked at the initial offer package? I have. It seems it’s a wide industry problem that affects all levels of employees. Opinion: This giant pension scandal is hiding in plain sight. You are expected to financially plan for retirement only to find that limits, types of plans, and employer decisions put roadblocks in your way.

This week in images.





Weekly Musings – May 20, 2022

The Linux Foundation came across my reading path two separate times this week. As I continue to re-establish my larger footprint solely in the open-source ecosystem Setting an Open Source Strategy is a detailed report for any business to identify the potential return on investment (ROI) of participating in the open-source ecosystem. Every company uses open source. Even if you consume open source in your organization and do not plan to contribute to open source it is a good read to determine what is the inflection point where you (or your employees) may want to invest.

This week I spent some more time looking at the various Open Source Foundations after reading White House joins OpenSSF and the Linux Foundation in securing open-source software. The Open Source Security Foundation (OpenSSF) is a project of the The Linux Foundation. OpenSSF has created the “The Open Source Software Security Mobilization Plan”. This plan lists 10 streams of investment for open source security and I feel it’s important to reiterate these.

  • Security Education – Deliver baseline secure software development education and certification to all.
  • Risk Assessment – Establish a public, vendor-neutral, objective, metrics-based risk assessment dashboard for the top 10,000 (or more) OSS components.
  • Digital Signatures – Accelerate the adoption of digital signatures on software releases.
  • Memory Safety – Eliminate root causes of many vulnerabilities through replacement of non-memory-safe languages.
  • Incident Response – Establish an OpenSSF Incident Response Team of security experts to assist open source projects accelerate their responses to newly discovered vulnerabilities.
  • Better Scanning – Accelerate discovery of new vulnerabilities by maintainers and experts through advanced security tools and expert guidance.
  • Code Audits – Conduct third-party code reviews (and any necessary remediation work) of up to 200 of the most-critical OSS components once per year.
  • Data Sharing – Coordinate industry-wide data sharing to improve the research that helps determine the most critical OSS components.
  • SBOMs Everywhere – Improve SBOM tooling and training to drive adoption.
  • Improved Software Supply Chains – Enhance the 10 most critical OSS build systems, package managers, and distribution systems with better supply chain security tools and best practices.

While I have not read this, CNCF released the Cloud Native Security Whitepaper v2 this week.

In open source conference land we saw in-person events including Percona Live 2022 and KubeCon + CloudNativeCon Europe 2022. Which I was there!

In unrelated tech news, I have cut the cord following ongoing poor customer service with a legacy provider. Welcome to YouTube TV. I am automatically impressed with more features and 1/3 of the price.
Also, Derek Muller has a new video out. Check out my favorite YouTube channel Veritasium.

I’ll leave this blog with a few images reflecting the week.

handcalcs
Azure Cloud Infographic
For Application Security in your Pipelines
Shark Tracking

Weekly musings – May 13 2022

As I reflect on this week of my technology journey with the conversations I had, what I learned, and what I wanted to do and write about, I decided what better way to work on multiple blog posts than write about what I’d like to write about.

The 2022 observability conference https://o11yfest.org/ is a wrap. For those that are interested in OpenTelemetry this event had plenty of great content with videos with transcripts will become available. Thanks Paul Bruce for your organizing work. While I could only attend some sessions “Building Software Reliability with Distributed Tracing” by Ricardo Ferreira and “Bad Observability” by Stephen Townshend are definitely on my rewatch list. I heard about new things such as keptn – Cloud-native application life-cycle orchestration, and cloudevents – A specification for describing event data in a common way.

A big shot out to Ashton Rodenhiser of Mind’s Eye Creative, who did these amazing animated canvasas during the presentations, I’ve included one at the bottom of this post.

I have never been that into podcasts. I guess I have always been more of a reader than a listener, but this week while having to do some driving, I dove into listing and realized again why I like to read more. Several times I wish I could stop and take notes however lucky for me I was able to see that Thoughworks Technology Podcasts have online transcripts. Coding lessons from the pandemic, The big five tech trends for 2022 and Following an unusual career path: from dev to CEO were all valuable listening. The single best snippet was on rethinking estimation or “no estimate techniques”. I hope I can discuss and implement myself, the “is basically just three things. It’s just right, it’s too big, or it’s insane”.

I took an intro into Web 3.0 with this F5 webinar What is Web3 and How to Build a Dapp?. Yep, I still don’t get Web 3.0 fully, but I can now launch my own blockchain solution with Scaffold-ETH, write Solidity by Example and Learn how to build on Ethereum; the superpowers and the gotchas should I want to in the future.

While I have my favorite YouTube channels that intersect topics including Math, Physics, Engineering, Technology, Facts and Figures, and woodworking (such as Veritasium (11.9M), CGP Grey (5.35M), DIYMontreal (151K) and 3×3 Custom (620K), as part of having random conversations in the social networking of https://o11yfest.org/ I’ve added two new ones to my list of never having enough time. Fireship (1.31M), and TechLinked (1.73M).

So what did I learn on YouTube this week in addition to you can make a video of a topic in 100 seconds. VS Code Top-Ten Pro Tips. I know Microsoft’s Visual Studio Code is more popular, I see it in presentations, but I never knew it has become the goto integrated platform. While I default to the good old CLI for vi, git and the like, and Atom, this video highlighted I need to use VS Code. We all know computer and math gives undesired results Why do computers suck at math? was fun to watch. And I’ve ordered the plans and getting supplies to make this 6-in-1 Trim Router Jig.

I’ll leave this blog with a few images reflecting the week.

Building Software Reliability with distributed Tracing
It's not my job
Test Data and Training Data
The AI Model they want, The data they give
Easter Island - Dig Deeper

SELECT 1

If you have worked with an RDBMS for some time, you will likely have come across the statement SELECT 1.

However, rarely is it correctly explained to engineers what the origin of SELECT 1 is, and why it’s useless and wasteful? A google search is not going to give you the response you would hope, these ranked responses are just as useless as the statement itself.

Bloat

Seeing a SELECT 1 confirms two things. First you are using a generic ORM framework, quote, and second, you have never optimized your SQL traffic patterns.

“Frameworks generally suck.
They CLAIM to improve the speed of development and abstract the need to know SQL.
The REALITY is the undocumented cost to sub-optimal performance, especially with data persistence.”

Connection Pooling

SELECT 1 comes from early implementations of connection pooling.

What is a connection pool? Rather than a new request or call getting a new database connection each time you wanted to return some data, programming languages implemented a cache with a pre-loaded pool of pre-established database connections. The intended goal is to reduce the execution time of an initial expensive operation of getting a new database connection if you were retrieving data from a simple SELECT statement. If intelligent enough (many are not), these pools would include features such as a low watermark, a high watermark, a pruning backoff of idle connections, and an ability to flush all connections.

When your code wanted to access the database to retrieve data, it would first ask the connection pool for an available connection from its pool, mark the connection as in-use and provide that for subsequent consumption.

Here is a simple example of the two queries that would actually be necessary to retrieve one piece of information.

SELECT 1
SELECT email_address, phone, position, active FROM employee where employee_id = ?

Staleness

SELECT 1 was implemented as the most light-weight SQL statement (i.e., minimal parsing, privilege checking, execution) that would validate that your connection was still active and usable. If SELECT 1 failed, i.e. a protocol communication across your network, the connection could be dropped from the connection pool, and a new connection from the pool could be requested. While this may appear harmless, it leads to multiple code in-efficiencies, a topic for a subsequent discussion.

Failed error handling

SELECT 1 was a lazy and flawed means to perform error handling. In reality, every single SQL statement requires adequate error handling, any statement can fail at any time to complete. In the prior example, what happens if the SELECT 1 succeeds but a simple indexed SELECT statement fails? This anti-pattern also generally shows that error handling is inconsistent and highly duplicated rather than at the correct position in the data access path.

By definition, error handling is needed in an abstraction function for all SQL statements, and it needs to handle all types of error handling including the connection no longer valid, connection terminated, timed out, etc.

If you had the right error handling SELECT 1 would then be redundant, and as I stated useless. You simply run the actual SELECT statement and handle any failure accordingly.

High availability

In today’s cloud-first architectures where high availability consists of multiple availability zones and multiple regions where application A can communicate with database B, every unneeded network round-trip in a well-tuned system is wasteful, i.e. it is costing you time to render a result quicker. We all know studies have shown that slow page loads drive users away from your site.

The cost of the cloud

This AWS Latency Monitoring grid by Matt Adorjan really shows you the impact that physics has on your resiliency testing strategy when application A and database B are geographically separated and you just want one piece of information.

Conclusion

The continued appearance of SELECT 1 is a re-enforcement that optimizing for performance is a missing skill for the much larger engineering code-writing workforce that have lost the ability for efficiency. It is also another easy win that becomes an unnecessary battle for Data Architects to ensure your organization provides a better customer experience.

What does the MySQL mysqlsh util.checkForServerUpgrade() execute

During a recent Aurora MySQL 8 upgrade process, a number of validation checks have failed. This is an analysis of the error message “present in INFORMATION_SCHEMA’s INNODB_SYS_TABLES table but missing from TABLES table”.

Some background

During a Major Upgrade from Aurora MySQL 5.7 to Aurora MySQL 8.0 the cluster instances were left in an incompatible-parameters state. The upgrade-prechecks.log shed some more light on the situation with

{
            "id": "schemaInconsistencyCheck",
            "title": "Schema inconsistencies resulting from file removal or corruption",
            "status": "OK",
            "description": "Error: Following tables show signs that either table datadir directory or frm file was removed/corrupted. Please check server logs, examine datadir to detect the issue and fix it before upgrade",
            "detectedProblems": [
                {
                    "level": "Error",
                    "dbObject": "flinestones.fred",
                    "description": "present in INFORMATION_SCHEMA's INNODB_SYS_TABLES table but missing from TABLES table"
                }
            ]
        }, 

For anonymity the troublesome table here is played by flinestones.fred

This error could be reproduced more quickly with the util.checkForServerUpgrade() check that saves the creation of a snapshot of your cluster, restore from the snapshot cluster, then the launch cluster instance path.

18) Schema inconsistencies resulting from file removal or corruption
  Error: Following tables show signs that either table datadir directory or frm
    file was removed/corrupted. Please check server logs, examine datadir to
    detect the issue and fix it before upgrade

  mysql.rds_heartbeat2 - present in INFORMATION_SCHEMA's INNODB_SYS_TABLES
    table but missing from TABLES table
  flinstones.fred -
    present in INFORMATION_SCHEMA's INNODB_SYS_TABLES table but missing from
    TABLES table 

As I am using the MySQL community mysqlsh tool with a managed AWS RDS MySQL cluster, I have discounted any rds specific messages.

Back to investigating the cause. Some basic spot checks within the Cluster confirmed this mismatch.

mysql > desc flinstones.fred;
ERROR 1146 (42S02): Table flinstones.fred ' doesn't exist

mysql > select * from information_schema.innodb_sys_tables where name = ' flinstones/fred';

*results*
(1 row)

A closer inspection of the Aurora MySQL error log re-iterated there was some issue.

$ aws rds download-db-log-file-portion --db-instance-identifier ${INSTANCE_ID} --log-file-name error/mysql-error-running.log --output text

... 
[Warning] InnoDB: Tablespace 'flinstones/fred' exists in the cache with id 5233285 != 4954605
...

What is this check

It is easy enough to look at the SQL behind this using open-source software, you go to the source and look at the SQL https://github.com/mysql/mysql-shell .. upgrade_check.cc. As the message is near identical to what AWS provides I am making an educated assumption the check is the same.

// clang-format off
std::unique_ptr
Sql_upgrade_check::get_schema_inconsistency_check() {
  return std::make_unique(
      "schemaInconsistencyCheck",
      "Schema inconsistencies resulting from file removal or corruption",
      std::vector{
       "select A.schema_name, A.table_name, 'present in INFORMATION_SCHEMA''s "
       "INNODB_SYS_TABLES table but missing from TABLES table' from (select "
       "distinct "
       replace_in_SQL("substring_index(NAME, '/',1)")
       " as schema_name, "
       replace_in_SQL("substring_index(substring_index(NAME, '/',-1),'#',1)")
       " as table_name from "
       "information_schema.innodb_sys_tables where NAME like '%/%') A left "
       "join information_schema.tables I on A.table_name = I.table_name and "
       "A.schema_name = I.table_schema where A.table_name not like 'FTS_0%' "
       "and (I.table_name IS NULL or I.table_schema IS NULL) and A.table_name "
       "not REGEXP '@[0-9]' and A.schema_name not REGEXP '@[0-9]';"},
      Upgrade_issue::ERROR,
      "Following tables show signs that either table datadir directory or frm "
      "file was removed/corrupted. Please check server logs, examine datadir "
      "to detect the issue and fix it before upgrade");
}

Ok, that’s a little more difficult to read than plain text, and what if I wanted to review other SQL statements this could become tedious.

Gather the SQL statements executed by util.checkForServerUpgrade()

Let’s use a more straightforward means of capturing SQL statements, the MySQL general log.

MYSQL_PASSWD=$(date | md5sum - | cut -c1-20)

docker network create -d bridge mynetwork
docker run --name mysql57 -e MYSQL_ROOT_PASSWORD="${MYSQL_PASSWD}" -d mysql:5.7
docker network connect mynetwork mysql57
docker inspect mysql57 | grep "IPAddress"
IP=$(docker inspect mysql57 | grep '"IPAddress":' | head -1 | cut -d'"' -f4)
docker exec -it mysql57 mysql -uroot -p${MYSQL_PASSWD} -e "SET GLOBAL general_log=1"
docker exec -it mysql57 mysql -uroot -p${MYSQL_PASSWD} -e "SHOW GLOBAL VARIABLES LIKE 'general_log_file'"
GENERAL_LOG_FILE=$(docker exec -it mysql57 mysql -uroot -p${MYSQL_PASSWD} -e "SHOW GLOBAL VARIABLES LIKE 'general_log_file'" | grep general_log_file | cut -d'|' -f3)


docker run --name mysql8 -e "MYSQL_ALLOW_EMPTY_PASSWORD=yes" -d mysql/mysql-server
docker exec -it mysql8 mysqlsh -h${IP} -uroot -p${MYSQL_PASSWD} --js -- util checkForServerUpgrade | tee check.txt

docker exec -it mysql57 grep Query ${GENERAL_LOG_FILE} | cut -c41- | tee check.sql


# Cleanup
docker stop mysql8 && docker rm mysql8
docker stop mysql57 && docker rm mysql57
docker network rm mynetwork

And we are left with the output of util.checkForServerUpgrade() and the SQL of all checks including of said statement:

check.sql

SET NAMES 'utf8mb4'
select current_user()
SELECT PRIVILEGE_TYPE, IS_GRANTABLE FROM INFORMATION_SCHEMA.USER_PRIVILEGES WHERE GRANTEE = '\'root\'@\'%\''
SELECT PRIVILEGE_TYPE, IS_GRANTABLE, TABLE_SCHEMA FROM INFORMATION_SCHEMA.SCHEMA_PRIVILEGES WHERE GRANTEE = '\'root\'@\'%\'' ORDER BY TABLE_SCHEMA
SELECT PRIVILEGE_TYPE, IS_GRANTABLE, TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLE_PRIVILEGES WHERE GRANTEE = '\'root\'@\'%\'' ORDER BY TABLE_SCHEMA, TABLE_NAME
select @@version, @@version_comment, UPPER(@@version_compile_os)
SET show_old_temporals = ON
SELECT table_schema, table_name,column_name,column_type FROM information_schema.columns WHERE column_type LIKE 'timestamp /* 5.5 binary format */'
SET show_old_temporals = OFF
select SCHEMA_NAME, 'Schema name' as WARNING from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT TABLE_SCHEMA, TABLE_NAME, 'Table name' as WARNING FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE != 'VIEW' and TABLE_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, 'Column name' as WARNING FROM information_schema.columns WHERE TABLE_SCHEMA not in ('information_schema', 'performance_schema') and COLUMN_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, 'Trigger name' as WARNING FROM INFORMATION_SCHEMA.TRIGGERS WHERE TRIGGER_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT TABLE_SCHEMA, TABLE_NAME, 'View name' as WARNING FROM INFORMATION_SCHEMA.VIEWS WHERE TABLE_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT ROUTINE_SCHEMA, ROUTINE_NAME, 'Routine name' as WARNING FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT EVENT_SCHEMA, EVENT_NAME, 'Event name' as WARNING FROM INFORMATION_SCHEMA.EVENTS WHERE EVENT_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
select SCHEMA_NAME, concat('schema''s default character set: ',  DEFAULT_CHARACTER_SET_NAME) from INFORMATION_SCHEMA.schemata where SCHEMA_NAME not in ('information_schema', 'performance_schema', 'sys') and DEFAULT_CHARACTER_SET_NAME in ('utf8', 'utf8mb3')
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, concat('column''s default character set: ',CHARACTER_SET_NAME) from information_schema.columns where CHARACTER_SET_NAME in ('utf8', 'utf8mb3') and TABLE_SCHEMA not in ('sys', 'performance_schema', 'information_schema', 'mysql')
SELECT TABLE_SCHEMA, TABLE_NAME, 'Table name used in mysql schema in 8.0' as WARNING FROM INFORMATION_SCHEMA.TABLES WHERE LOWER(TABLE_SCHEMA) = 'mysql' and LOWER(TABLE_NAME) IN ('catalogs', 'character_sets', 'collations', 'column_type_elements', 'columns', 'dd_properties', 'events', 'foreign_key_column_usage', 'foreign_keys', 'index_column_usage', 'index_partitions', 'index_stats', 'indexes', 'parameter_type_elements', 'parameters', 'routines', 'schemata', 'st_spatial_reference_systems', 'table_partition_values', 'table_partitions', 'table_stats', 'tables', 'tablespace_files', 'tablespaces', 'triggers', 'view_routine_usage', 'view_table_usage', 'component', 'default_roles', 'global_grants', 'innodb_ddl_log', 'innodb_dynamic_metadata', 'password_history', 'role_edges')
select table_schema, table_name, concat(engine, ' engine does not support native partitioning') from information_schema.Tables where create_options like '%partitioned%' and upper(engine) not in ('INNODB', 'NDB', 'NDBCLUSTER')
select table_schema, table_name, 'Foreign key longer than 64 characters' as description from information_schema.tables where table_name in (select left(substr(id,instr(id,'/')+1), instr(substr(id,instr(id,'/')+1),'_ibfk_')-1) from information_schema.innodb_sys_foreign where length(substr(id,instr(id,'/')+1))>64)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MAXDB sql_mode') from information_schema.routines where find_in_set('MAXDB', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MAXDB sql_mode' from information_schema.EVENTS where find_in_set('MAXDB', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MAXDB sql_mode' from information_schema.TRIGGERS where find_in_set('MAXDB', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MAXDB option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MAXDB', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete DB2 sql_mode') from information_schema.routines where find_in_set('DB2', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete DB2 sql_mode' from information_schema.EVENTS where find_in_set('DB2', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete DB2 sql_mode' from information_schema.TRIGGERS where find_in_set('DB2', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete DB2 option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('DB2', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MSSQL sql_mode') from information_schema.routines where find_in_set('MSSQL', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MSSQL sql_mode' from information_schema.EVENTS where find_in_set('MSSQL', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MSSQL sql_mode' from information_schema.TRIGGERS where find_in_set('MSSQL', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MSSQL option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MSSQL', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MYSQL323 sql_mode') from information_schema.routines where find_in_set('MYSQL323', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MYSQL323 sql_mode' from information_schema.EVENTS where find_in_set('MYSQL323', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MYSQL323 sql_mode' from information_schema.TRIGGERS where find_in_set('MYSQL323', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MYSQL323 option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MYSQL323', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MYSQL40 sql_mode') from information_schema.routines where find_in_set('MYSQL40', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MYSQL40 sql_mode' from information_schema.EVENTS where find_in_set('MYSQL40', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MYSQL40 sql_mode' from information_schema.TRIGGERS where find_in_set('MYSQL40', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MYSQL40 option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MYSQL40', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_AUTO_CREATE_USER sql_mode') from information_schema.routines where find_in_set('NO_AUTO_CREATE_USER', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_AUTO_CREATE_USER sql_mode' from information_schema.EVENTS where find_in_set('NO_AUTO_CREATE_USER', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_AUTO_CREATE_USER sql_mode' from information_schema.TRIGGERS where find_in_set('NO_AUTO_CREATE_USER', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_AUTO_CREATE_USER option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_AUTO_CREATE_USER', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_FIELD_OPTIONS sql_mode') from information_schema.routines where find_in_set('NO_FIELD_OPTIONS', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_FIELD_OPTIONS sql_mode' from information_schema.EVENTS where find_in_set('NO_FIELD_OPTIONS', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_FIELD_OPTIONS sql_mode' from information_schema.TRIGGERS where find_in_set('NO_FIELD_OPTIONS', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_FIELD_OPTIONS option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_FIELD_OPTIONS', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_KEY_OPTIONS sql_mode') from information_schema.routines where find_in_set('NO_KEY_OPTIONS', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_KEY_OPTIONS sql_mode' from information_schema.EVENTS where find_in_set('NO_KEY_OPTIONS', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_KEY_OPTIONS sql_mode' from information_schema.TRIGGERS where find_in_set('NO_KEY_OPTIONS', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_KEY_OPTIONS option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_KEY_OPTIONS', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_TABLE_OPTIONS sql_mode') from information_schema.routines where find_in_set('NO_TABLE_OPTIONS', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_TABLE_OPTIONS sql_mode' from information_schema.EVENTS where find_in_set('NO_TABLE_OPTIONS', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_TABLE_OPTIONS sql_mode' from information_schema.TRIGGERS where find_in_set('NO_TABLE_OPTIONS', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_TABLE_OPTIONS option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_TABLE_OPTIONS', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete ORACLE sql_mode') from information_schema.routines where find_in_set('ORACLE', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete ORACLE sql_mode' from information_schema.EVENTS where find_in_set('ORACLE', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete ORACLE sql_mode' from information_schema.TRIGGERS where find_in_set('ORACLE', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete ORACLE option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('ORACLE', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete POSTGRESQL sql_mode') from information_schema.routines where find_in_set('POSTGRESQL', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete POSTGRESQL sql_mode' from information_schema.EVENTS where find_in_set('POSTGRESQL', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete POSTGRESQL sql_mode' from information_schema.TRIGGERS where find_in_set('POSTGRESQL', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete POSTGRESQL option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('POSTGRESQL', variable_value)
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, UPPER(DATA_TYPE), COLUMN_TYPE, CHARACTER_MAXIMUM_LENGTH from information_schema.columns where data_type in ('enum','set') and CHARACTER_MAXIMUM_LENGTH > 255 and table_schema not in ('information_schema')
SELECT TABLE_SCHEMA, TABLE_NAME, concat('Partition ', PARTITION_NAME, ' is in shared tablespace ', TABLESPACE_NAME) as description FROM information_schema.PARTITIONS WHERE PARTITION_NAME IS NOT NULL AND (TABLESPACE_NAME IS NOT NULL AND TABLESPACE_NAME!='innodb_file_per_table')
SELECT tablespace_name, concat('circular reference in datafile path: \'', file_name, '\'') FROM INFORMATION_SCHEMA.FILES where file_type='TABLESPACE' and (file_name rlike '[^\\.]/\\.\\./' or file_name rlike '[^\\.]\\\\\\.\\.\\\\')
select table_schema, table_name, '', 'VIEW', UPPER(view_definition) from information_schema.views where table_schema not in ('performance_schema','information_schema','sys','mysql')
select routine_schema, routine_name, '', routine_type, UPPER(routine_definition) from information_schema.routines where routine_schema not in ('performance_schema','information_schema','sys','mysql')
select TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME, 'COLUMN', UPPER(GENERATION_EXPRESSION) from information_schema.columns where extra regexp 'generated' and table_schema not in ('performance_schema','information_schema','sys','mysql')
select TRIGGER_SCHEMA, TRIGGER_NAME, '', 'TRIGGER', UPPER(ACTION_STATEMENT) from information_schema.triggers where TRIGGER_SCHEMA not in ('performance_schema','information_schema','sys','mysql')
select event_schema, event_name, '', 'EVENT', UPPER(EVENT_DEFINITION) from information_schema.events where event_schema not in ('performance_schema','information_schema','sys','mysql')
select table_schema, table_name, 'VIEW', UPPER(view_definition) from information_schema.views where table_schema not in ('performance_schema','information_schema','sys','mysql') and (UPPER(view_definition) like '%ASC%' or UPPER(view_definition) like '%DESC%')
select routine_schema, routine_name, routine_type, UPPER(routine_definition) from information_schema.routines where routine_schema not in ('performance_schema','information_schema','sys','mysql') and (UPPER(routine_definition) like '%ASC%' or UPPER(routine_definition) like '%DESC%')
select TRIGGER_SCHEMA, TRIGGER_NAME, 'TRIGGER', UPPER(ACTION_STATEMENT) from information_schema.triggers where TRIGGER_SCHEMA not in ('performance_schema','information_schema','sys','mysql') and (UPPER(ACTION_STATEMENT) like '%ASC%' or UPPER(ACTION_STATEMENT) like '%DESC%')
select event_schema, event_name, 'EVENT', UPPER(EVENT_DEFINITION) from information_schema.events where event_schema not in ('performance_schema','information_schema','sys','mysql') and (UPPER(event_definition) like '%ASC%' or UPPER(event_definition) like '%DESC%')
select 'global.sql_mode', 'does not contain either NO_ZERO_DATE or NO_ZERO_IN_DATE which allows insertion of zero dates' from (SELECT @@global.sql_mode like '%NO_ZERO_IN_DATE%' and @@global.sql_mode like '%NO_ZERO_DATE%' as zeroes_enabled) as q where q.zeroes_enabled = 0
select 'session.sql_mode', concat(' of ', q.thread_count, ' session(s) does not contain either NO_ZERO_DATE or NO_ZERO_IN_DATE which allows insertion of zero dates') FROM (select count(thread_id) as thread_count from performance_schema.variables_by_thread WHERE variable_name = 'sql_mode' and (variable_value not like '%NO_ZERO_IN_DATE%' or variable_value not like '%NO_ZERO_DATE%')) as q where q.thread_count > 0
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, concat('column has zero default value: ', COLUMN_DEFAULT) from information_schema.columns where TABLE_SCHEMA not in ('performance_schema','information_schema','sys','mysql') and DATA_TYPE in ('timestamp', 'datetime', 'date') and COLUMN_DEFAULT like '0000-00-00%'
select A.schema_name, A.table_name, 'present in INFORMATION_SCHEMA''s INNODB_SYS_TABLES table but missing from TABLES table' from (select distinct replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(NAME, '/',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as schema_name, replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(NAME, '/',-1),'#',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_name from information_schema.innodb_sys_tables where NAME like '%/%') A left join information_schema.tables I on A.table_name = I.table_name and A.schema_name = I.table_schema where A.table_name not like 'FTS_0%' and (I.table_name IS NULL or I.table_schema IS NULL) and A.table_name not REGEXP '@[0-9]' and A.schema_name not REGEXP '@[0-9]'
select a.table_schema, a.table_name, concat('recognized by the InnoDB engine but belongs to ', a.engine) from information_schema.tables a join (select replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(NAME, '/',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_schema, replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(NAME, '/',-1),'#',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_name from information_schema.innodb_sys_tables where NAME like '%/%') b on a.table_schema = b.table_schema and a.table_name = b.table_name where a.engine != 'Innodb'
FLUSH LOCAL TABLES
SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA not in ('information_schema', 'performance_schema', 'sys')
CHECK TABLE `mysql`.`columns_priv` FOR UPGRADE
CHECK TABLE `mysql`.`db` FOR UPGRADE
CHECK TABLE `mysql`.`engine_cost` FOR UPGRADE
CHECK TABLE `mysql`.`event` FOR UPGRADE
CHECK TABLE `mysql`.`func` FOR UPGRADE
CHECK TABLE `mysql`.`general_log` FOR UPGRADE
CHECK TABLE `mysql`.`gtid_executed` FOR UPGRADE
CHECK TABLE `mysql`.`help_category` FOR UPGRADE
CHECK TABLE `mysql`.`help_keyword` FOR UPGRADE
CHECK TABLE `mysql`.`help_relation` FOR UPGRADE
CHECK TABLE `mysql`.`help_topic` FOR UPGRADE
CHECK TABLE `mysql`.`innodb_index_stats` FOR UPGRADE
CHECK TABLE `mysql`.`innodb_table_stats` FOR UPGRADE
CHECK TABLE `mysql`.`ndb_binlog_index` FOR UPGRADE
CHECK TABLE `mysql`.`plugin` FOR UPGRADE
CHECK TABLE `mysql`.`proc` FOR UPGRADE
CHECK TABLE `mysql`.`procs_priv` FOR UPGRADE
CHECK TABLE `mysql`.`proxies_priv` FOR UPGRADE
CHECK TABLE `mysql`.`server_cost` FOR UPGRADE
CHECK TABLE `mysql`.`servers` FOR UPGRADE
CHECK TABLE `mysql`.`slave_master_info` FOR UPGRADE
CHECK TABLE `mysql`.`slave_relay_log_info` FOR UPGRADE
CHECK TABLE `mysql`.`slave_worker_info` FOR UPGRADE
CHECK TABLE `mysql`.`slow_log` FOR UPGRADE
CHECK TABLE `mysql`.`tables_priv` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_leap_second` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_name` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_transition` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_transition_type` FOR UPGRADE
CHECK TABLE `mysql`.`user` FOR UPGRADE

check.txt

Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
WARNING: Using a password on the command line interface can be insecure.
The MySQL server at 172.17.0.3:3306, version 5.7.33 - MySQL Community Server
(GPL), will now be checked for compatibility issues for upgrade to MySQL
8.0.24...

1) Usage of old temporal type
  No issues found

2) Usage of db objects with names conflicting with new reserved keywords
  No issues found

3) Usage of utf8mb3 charset
  No issues found

4) Table names in the mysql schema conflicting with new tables in 8.0
  No issues found

5) Partitioned tables using engines with non native partitioning
  No issues found

6) Foreign key constraint names longer than 64 characters
  No issues found

7) Usage of obsolete MAXDB sql_mode flag
  No issues found

8) Usage of obsolete sql_mode flags
  Notice: The following DB objects have obsolete options persisted for
    sql_mode, which will be cleared during upgrade to 8.0.
  More information:

https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html#mysql-nutshell-removals

  global system variable sql_mode - defined using obsolete NO_AUTO_CREATE_USER
    option

9) ENUM/SET column definitions containing elements longer than 255 characters
  No issues found

10) Usage of partitioned tables in shared tablespaces
  No issues found

11) Circular directory references in tablespace data file paths
  No issues found

12) Usage of removed functions
  No issues found

13) Usage of removed GROUP BY ASC/DESC syntax
  No issues found

14) Removed system variables for error logging to the system log configuration
  To run this check requires full path to MySQL server configuration file to be specified at 'configPath' key of options dictionary
  More information:

https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-13.html#mysqld-8-0-13-logging

15) Removed system variables
  To run this check requires full path to MySQL server configuration file to be specified at 'configPath' key of options dictionary
  More information:

https://dev.mysql.com/doc/refman/8.0/en/added-deprecated-removed.html#optvars-removed

16) System variables with new default values
  To run this check requires full path to MySQL server configuration file to be specified at 'configPath' key of options dictionary
  More information:

https://mysqlserverteam.com/new-defaults-in-mysql-8-0/

17) Zero Date, Datetime, and Timestamp values
  No issues found

18) Schema inconsistencies resulting from file removal or corruption
  No issues found

19) Tables recognized by InnoDB that belong to a different engine
  No issues found

20) Issues reported by 'check table x for upgrade' command
  No issues found

21) New default authentication plugin considerations
  Warning: The new default authentication plugin 'caching_sha2_password' offers
    more secure password hashing than previously used 'mysql_native_password'
    (and consequent improved client connection authentication). However, it also
    has compatibility implications that may affect existing MySQL installations.
    If your MySQL installation must serve pre-8.0 clients and you encounter
    compatibility issues after upgrading, the simplest way to address those
    issues is to reconfigure the server to revert to the previous default
    authentication plugin (mysql_native_password). For example, use these lines
    in the server option file:

    [mysqld]
    default_authentication_plugin=mysql_native_password

    However, the setting should be viewed as temporary, not as a long term or
    permanent solution, because it causes new accounts created with the setting
    in effect to forego the improved authentication security.
    If you are using replication please take time to understand how the
    authentication plugin changes may impact you.
  More information:

https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-caching-sha2-password-compatibility-issues


https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-caching-sha2-password-replication

Errors:   0
Warnings: 1
Notices:  1

No fatal errors were found that would prevent an upgrade, but some potential issues were detected. Please ensure that the reported issues are not significant before upgrading.

The pre-pre SQL check

I now am armed with an simplified single SQL statement. It does of course take a long to run in a cluster with thousands of tables.

select A.schema_name, A.table_name, 
       'present in INFORMATION_SCHEMA''s INNODB_SYS_TABLES table but missing from TABLES table' 
from (select distinct replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(NAME, '/',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as schema_name, 
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(NAME, '/',-1),'#',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_name
 from information_schema.innodb_sys_tables 
where NAME like '%/%') A 
left join information_schema.tables I on A.table_name = I.table_name and A.schema_name = I.table_schema 
where A.table_name not like 'FTS_0%' 
and (I.table_name IS NULL or I.table_schema IS NULL) 
and A.table_name not REGEXP '@[0-9]' 
and A.schema_name not REGEXP '@[0-9]')

I then performed a number of drop/remove/restart/re-create/discard tablespace steps with no success. As a managed service RDS the only course of action now is to open an AWS Support ticket for help with this specific internal corruption.

Upgrading to AWS Aurora MySQL 8

With Aurora MySQL 8 now generally available to all, you may want to consider the plan for an upgrade path if you would like to take advantage of the new features for your application, for example, Common Table Expressions (CTE). This new major release has a much improved and streamlined upgrade progress from Aurora MySQL 5.7.

This tutorial will provide all the steps to allow you to try out setting up an Aurora cluster and performing an upgrade without the impact on your existing AWS environment. The two pre-requisites to getting started are:

You can find all the CLI cut/paste commands in my AWS Tutorials repo. This will lead you through all of the various AWS dependencies for a successful RDS Aurora cluster including IAM, KMS, VPC and EC2 requirements.

Create an RDS Aurora MySQL Cluster and Aurora MySQL Major upgrade – Aurora 2.x to Aurora 3.x can provide you with a POC of the primary operations path to achieving the goal of this post in under 30 minutes.

While this example will produce an upgraded cluster with some warnings, in real life a more detailed upgrade assessment is needed for any new version of software. The MySQL and Aurora pre-checks can be performed to minimize surprises during the final process of your data migration.

mysqlcheck –check-upgrade and the mysqlsh util.checkForServerUpgrade() pre-checks can help to assist in being prepared and not have your Cluster instances with the incompatible-parameters status. At this point download the upgrade-prechecks.log Aurora Log and trash your cluster and instance. They are unusable. Reviewing the upgrade-prechecks.log can contain more information than mysqlsh util.checkForServerUpgrade() output.

With an Aurora cluster configured with an instance parameter group enabling MySQL binary log replication, it is easy to have a functioning Aurora 5.7 Cluster with real-time replication to an Aurora 8 Cluster to minimize any downtime in your production environment and then benefit from an atomic data dictionary, roles, descending indexes, improved internal temporary table, additional JSON functions, Window Functions, CTEs and more!

More Reading

Reviewing your strengths and areas for improvement

The end of the year is often a time to review the progress of your yearly goals and to set new goals for the next year. These goals may include improving your professional and personal development. There are many different ways to assess your personality for your profession and over the decades I participated in both employer-sponsored assessments and personal improvement assessments. Some of these have included Myers-Briggs Type Indicator® (MBTI®), the Facet 5 Personality Profile, NERIS Type Explorer®, and StandOut® strengths assessment.  This StandOut® assessment was part of the onboarding process a year ago and is also available by my employer to repeat at no cost. This different opportunity from other assessments I have undertaken has enabled me to re-assess my strengths as I review this year and reflect on goals for next year. 
 
Your StandOut® assessment lists the top two roles which “are the focal point of all your talents and skills. They represent your instinctive way of making a difference in the world.”. You also see a ranking of the remaining seven roles, which was most helpful the second time when my results were slightly different, however, the top three roles were overall consistent.

Right now, you can take this StandOut® assessment for free(1)

During this time I also found online audio descriptions of my original top two roles, these being Creator and Teacher. What I liked about this audio addition was a different description of the written report. While I do not wish to repeat the information you can find online of these two roles or the roles that may best match your strengths, I found both the written summary and audio to be a precise reflection of who I am and “are the focal point of all your talents and skills. They represent your instinctive way of making a difference in the world.”. Here is a summary of my top two roles:

Creator

  • “Creators make sense of the world, pulling it apart, seeing a better configuration, and creating it.” 
  • “As a creator, the first question you ask in any situation is what do I understand? When you look out at the world,  you don’t jump right in and leap to conclusions, instead, you stop, you take a step back, and try to look thru the superficial details at the surface, and get underneath to the causes of the effect we see on the surface”. (audio intro translation)

Teacher

  • “Teachers are thrilled by the potential they see in each person. Their power comes from learning how to unleash it.”
  • “As a teacher the first question you ask in any situation is what can I learn? What can (s)he learn?  You take your own development very seriously, your inquisitive, you read, you want to grow, you want to develop, and you see your life as a constant journey of development for yourself… and you also take other peoples development seriously. You see little increments of growth in others and you get a kick out of these”. (audio intro translation)

While information from assessments can be an affirmation.  Sometimes a key point can be completely wrong.  This happened with my Facet 5 report from several years ago, where one line really stuck out. “Having to spend too much time on the following elements has been shown to be demotivating …Being asked to be creative”. I could not find that line to be any further from the truth. My creativeness extends from the professional outlook in problem-solving to the numerous personal activities I undertake, from creating a children’s card and board game, writing a self-published infants board book and creating new products without plans from 100% recycled wood to name a few.  I am glad that the StandOut® description for creator included  “put things in a more creative configuration”.  This was a pleasing correction.

No assessment is perfect, however they can help reaffirm your strengths and can also offer insights into areas of improvement you may wish to incorporate into your future goals.

(1) *This is not an endorsement or promotion as an employee.*

Re-posted on LinkedIn

AWS Aurora MySQL 8 is now generally available

AWS has just announced the general availability of Aurora MySQL 8 compatibility (known as Aurora Version 3). This is long awaited addition to RDS MySQL 8 and provides many of the new features that can be found in the open-source MySQL 8 community version.

For those unfamiliar with Amazon Aurora my Understanding AWS RDS Aurora Capabilities presentation from Percona Live 2021 provides a great introduction of the benefits of this managed service.

There is a lot to digest and the Aurora User Guide provides details of the new features from the MySQL 8 community version, and of Aurora 3 new features, and feature differences or unsupported features. This AWS blog post also provides a general introduction.

It is very easy to spin up a new Aurora MySQL 3.01.0 cluster in an existing environment containing existing Aurora clusters. After defining new cluster and instance parameter groups for the aurora-mysql8.0 family, or starting with the available default.aurora-mysql8.0 parameter groups, there are no other differences in aws rds create-db-cluster syntax, or using the AWS Console or Terraform syntax for example.

Before considering a migration of an existing Aurora cluster, there is a lot of information around parameter changes (including inclusive language functionality), and those related status and CloudWatch Metrics changes. Yes, looking at the 29 ‘Removed from Aurora MySQL version 3′, 30 ‘This parameter applies to Aurora MySQL version 3 and higher’ and presently ‘Currently not available in Aurora MySQL version 3′ LOAD|SELECT S3 capabilities is important. There are new reserved words to be aware of, you will need to note how to take advantage of roles within the Aurora permissions model.

Migrating an existing Aurora MySQL 2 cluster to Aurora 3 is a little more involved than specifying the snapshot-id. Seeing your restored Aurora 2 snapshot in an Aurora 3 cluster but with a status of incompatible-parameters is a good indication that more work is needed. While I will detail some of my experiences in a subsequent post, one helpful tip is found in those additional pages of the 5 rows of logs for your new cluster after all the error.log files, you will find an upgrade-prechecks.log file. This contains an extensive list of checks and warnings performed for the upgrade. Skipping to the end of the JSON will give you an idea of your errorCount, warningCount and noticeCount.

Searching then for an object of “status”: “ERROR” will find the errorCount entries matching the count. Several other checks provide a “detectedProblems” section and a “level”: “Error” which would seem to be needed to be also corrected. There are a lot of checks between the INFORMATION_SCHEMA, InnoDB internal data dictionary and actual data/files on disk. You will also be presented with a nice long list of tables/columns using reserved words, as well as character set deprecations.

At a more technical glance of the documentation, there is a key change in how internal temporary tables are created, and how this differs from writer and reader instances. Benchmarking your application in different configurations will definitely be recommended.

Restoring an Aurora 2 cluster into Aurora 3 also took significantly more time; many hours; than a simple restore-db-cluster-from-snapshot you may be used to. While Terraform cluster creation timeouts need to be increased for global clusters, this time the default 1h30 timeout for an instance was also exceeded.

While different features will benefit different consumers of your Aurora database, one of the most anticipated is CTEs. From the operations perspective, as a managed service Aurora offers a subset of community features. One great feature that is now available in Aurora 3 is binary log filtering, a simple long-lived option in MySQL land that will help replacing more complex functionality.

This is a great and long awaited version release for Aurora.