The RAT and the CAT

Ronald Bradford
August 24, 2006

No, it’s not a bedtime story; it’s a serious systems design concept, and I’m amazed that people don’t know about it.
As I mentioned in If you don’t know your data, you don’t know your application, I was doing a Java code review and found a clear case for a much simpler solution. How simple, you ask?

Well, without completing the task 100%, I achieved in less than 1 day (and let’s say, for the sake of argument, 1 more day of work) what somebody else has been working on for a week, with an estimate of 2 more weeks to complete. So let’s add 50% to my estimate: that’s a total of 3 days versus 15 days. You do the math. And yes, that was last week, and that task is still being worked on the same way, even with my working code available as a reference. Not to mention that the code is simpler by a similar magnitude, and simplicity means cost savings in support; people so quickly forget that.

So what are the RAT and the CAT?

RAT – Row At a Time
CAT – Chunk At a Time

This concept is really rather simple; however, it’s amazing how often implementations never consider it. Let me give you a simple example.

You’re importing some external data: a flat file of 5 columns, which in turn will be stored in 2 tables in a normalised form. The RAT way of doing things would entail:

Using the language in question, open the file and read it line by line, converting each line into an object for reference.
Now, for each row read:
- You get the values (A and B), and then do a select to see if this row already exists in one of the normalised tables (call it X). If it does, you have the surrogate key; otherwise you insert the row and get the surrogate key.
- Now you do the same with the next set of values (C, D, E), which reference the normalised values (A and B), inserting into Y.
- And so on and so on.
- Report exceptions line by line if found.
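
To make the RAT steps concrete, here is a minimal sketch in Python. It is not the code from the review: it uses an in-memory SQLite database as a stand-in for MySQL, and the table layouts, the sample rows and the "D must be a whole number" rule are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical schema: X holds the normalised (A, B) pairs, Y holds (C, D, E).
SCHEMA = """
CREATE TABLE x (id INTEGER PRIMARY KEY, a TEXT, b TEXT, UNIQUE (a, b));
CREATE TABLE y (id INTEGER PRIMARY KEY, x_id INTEGER REFERENCES x(id),
                c TEXT, d INTEGER, e TEXT);
"""

# Hypothetical contents of the 5-column flat file (A, B, C, D, E).
ROWS = [
    ("acme", "us", "widget", "10", "2006-08-01"),
    ("acme", "us", "gadget", "5", "2006-08-02"),
    ("bolt", "uk", "widget", "x", "2006-08-03"),  # D is not a number
]


def rat_import(conn, rows):
    """Row At a Time: every input row costs its own SQL statements."""
    statements = 0
    errors = []
    for line_no, (a, b, c, d, e) in enumerate(rows, start=1):
        # Assumed validation rule, reported line by line.
        if not d.isdigit():
            errors.append((line_no, "d is not a number"))
            continue
        # Select to see whether (A, B) already exists in X.
        found = conn.execute(
            "SELECT id FROM x WHERE a = ? AND b = ?", (a, b)
        ).fetchone()
        statements += 1
        if found:
            x_id = found[0]
        else:
            # Not there yet: insert it and pick up the surrogate key.
            x_id = conn.execute(
                "INSERT INTO x (a, b) VALUES (?, ?)", (a, b)
            ).lastrowid
            statements += 1
        # Insert the dependent values into Y using that key.
        conn.execute(
            "INSERT INTO y (x_id, c, d, e) VALUES (?, ?, ?, ?)",
            (x_id, c, int(d), e),
        )
        statements += 1
    return statements, errors


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    print(rat_import(conn, ROWS))
```

The thing to notice is that the statement count grows with the size of the file: every valid row costs two or three statements, each a round trip to the database.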

For those now laughing, it’s not really funny; it’s sad that programs are written this way. For those that say “but that’s the only way”, well, welcome to a different, radical and hard-to-learn approach.

The CAT way of doing things would entail:

Create a temporary table
Bulk load the data into the temporary table via the appropriate command, e.g. mysqlimport or LOAD DATA.
Using one select, insert into X the rows from the temporary table that are not already present.
Using one select, insert into Y the rows from the temporary table that are not present, joining to X to get the appropriate surrogate key.
Report exceptions via one select of the data that wasn’t inserted due to some rules.
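
The CAT steps can be sketched the same way. Again this is only an illustration under stated assumptions: SQLite stands in for MySQL, `executemany` stands in for the bulk load (on MySQL you would use LOAD DATA or mysqlimport), and the schema, the sample rows, the "D must be a whole number" rule and the definition of "already present in Y" are all hypothetical.

```python
import sqlite3

# Hypothetical schema: X holds the normalised (A, B) pairs, Y holds (C, D, E).
SCHEMA = """
CREATE TABLE x (id INTEGER PRIMARY KEY, a TEXT, b TEXT, UNIQUE (a, b));
CREATE TABLE y (id INTEGER PRIMARY KEY, x_id INTEGER REFERENCES x(id),
                c TEXT, d INTEGER, e TEXT);
"""

# Hypothetical contents of the 5-column flat file (A, B, C, D, E).
ROWS = [
    ("acme", "us", "widget", "10", "2006-08-01"),
    ("acme", "us", "gadget", "5", "2006-08-02"),
    ("bolt", "uk", "widget", "x", "2006-08-03"),  # D is not a number
]

# Assumed rule: a row is rejected when D is empty or contains a non-digit.
BAD_D = "(s.d = '' OR s.d GLOB '*[^0-9]*')"


def cat_import(conn, rows):
    """Chunk At a Time: a fixed number of statements, whatever the file size."""
    conn.execute(
        "CREATE TEMP TABLE staging (a TEXT, b TEXT, c TEXT, d TEXT, e TEXT)"
    )
    # Stand-in for the bulk load step (LOAD DATA / mysqlimport on MySQL).
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?, ?, ?)", rows)
    # One statement: add every (A, B) pair that X does not have yet.
    conn.execute("""
        INSERT INTO x (a, b)
        SELECT DISTINCT s.a, s.b FROM staging s
        WHERE NOT EXISTS (SELECT 1 FROM x WHERE x.a = s.a AND x.b = s.b)
    """)
    # One statement: add the valid rows to Y, joining to X for the key.
    conn.execute(f"""
        INSERT INTO y (x_id, c, d, e)
        SELECT x.id, s.c, CAST(s.d AS INTEGER), s.e
        FROM staging s JOIN x ON x.a = s.a AND x.b = s.b
        WHERE NOT {BAD_D}
          AND NOT EXISTS (SELECT 1 FROM y
                          WHERE y.x_id = x.id AND y.c = s.c
                            AND y.d = CAST(s.d AS INTEGER) AND y.e = s.e)
    """)
    # One statement: report everything the rule kept out of Y.
    rejected = conn.execute(
        f"SELECT s.a, s.b, s.c, s.d, s.e FROM staging s WHERE {BAD_D}"
    ).fetchall()
    conn.execute("DROP TABLE staging")
    return rejected


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    print(cat_import(conn, ROWS))
```

However many rows the file has, the database sees the same handful of statements, and because each insert skips rows that are already present, running the import a second time adds nothing.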

It’s not rocket science, people.

I should mention this is a simple example, and it’s not always possible to do CAT processing for all operations; generally, though, some portion of batch work can be done this way, and the remainder must be done in a RAT way.

Updated
See also some more recent articles, including We need more CATs (2009) and The Art of Elimination (2010).

Tagged with: Databases General MySQL
