Slow Queries aren’t always that bad!

Well, now I have your attention, Slow Queries are bad (unless it’s a single user system and you don’t care). However there are worse things then slow queries in a large enterprise system.

I’ve been asked in recent weeks a number of questions which has brought this topic to discussion, as well as a current implementation I’m undertaking for a client of a purchased product.

High volume repetitive queries can have a worse effect on your system’s performance. Combined with slow queries that take locks, these queries can have an extreme effect on performance and if you don’t know your application, or have the right tools, it can be initially hard to diagonise easily.

This problem is the classic Wasting CPU cycles problem, seen it before, will see it again.

Here’s a classic example for reference. Using the current product that I’m customising and installing (I’ve not be involved in any development), a typical Customer Information System (CIS).

The system when deployed will have 1.1 Million Customers. Each Customer has a Balance. Now the Customer Balance is not recorded against the customer record, it’s calculated every time it’s required. So, no big deal, well yes. The stored procedure to calculate the balance hit’s 4 different tables, and one of these tables is one of the largest in the system (recording all detailed financial transactions). Multiple batch reports alone that work with large sets of customers all require or use the current balance in some means.

There are only a small set of transactions that affect the balance including Invoice Statements, Payments, Adjustments, Credit Control, Write-Offs . These processes by there own nature are either batch or small online transactions. The calculation of the balance can be applied against the Chunk of customers in a batch, or individual Row for online transactions (See The RAT and the CAT ).

Even for the paranoid, the re-calculation of the customer balance for all customers in batch is more efficient after hours, yet only customers where some transaction that has been applied since the last time the re-calculation ran is necessary.

MySQL doesn’t be default have any analysis tools to identify and manage these types of queries. Peter Zaitsev in his article Slow Query Log analyzes tools highlights the issue with a good approach by a source code change to long_query_time as well as some additional scripts.

Hopefully MySQL will consider a more improved approach in future releases that doesn’t require patching the source code. Now it’s unlikely that MySQL will do another SHOW STATUS Gotcha and for example change the unit of measure from seconds to milliseconds, but they could make it a float, so you could specify 0.1 for example.

Tagged with: Databases General MySQL

Related Posts

More CPUs or Newer CPUs

In a CPU-bound database workload, regardless of price, would you scale-up or scale-new? What if price was the driving factor, would you scale-up or scale-new? I am using as a baseline the first available AWS Graviton2 processor for RDS (r6g).

Read more

An Interesting Artifact with AWS RDS Aurora Storage

As part of using public datasets with my own Benchmarking Suite I wanted upsize a dataset for larger volume testing. I have always used the INFORMATION_SCHEMA.TABLES data_length and index_length columns as a sufficiently accurate measurement for actual disk space used.

Read more

How long does it take the ReadySet cache to warm up?

During my setup of benchmarking I run a quick test-sysbench script to ensure my configuration is right before running an hour+ duration test. When pointing to a Readyset cache where I have cached the 5 queries used in the sysbench test, but I have not run any execution of the SQL, throughput went up 10x in 5 seconds.

Read more