Well, now I have your attention, Slow Queries are bad (unless it’s a single user system and you don’t care). However there are worse things then slow queries in a large enterprise system.
I’ve been asked in recent weeks a number of questions which has brought this topic to discussion, as well as a current implementation I’m undertaking for a client of a purchased product.
High volume repetitive queries can have a worse effect on your system’s performance. Combined with slow queries that take locks, these queries can have an extreme effect on performance and if you don’t know your application, or have the right tools, it can be initially hard to diagonise easily.
This problem is the classic Wasting CPU cycles problem, seen it before, will see it again.
Here’s a classic example for reference. Using the current product that I’m customising and installing (I’ve not be involved in any development), a typical Customer Information System (CIS).
The system when deployed will have 1.1 Million Customers. Each Customer has a Balance. Now the Customer Balance is not recorded against the customer record, it’s calculated every time it’s required. So, no big deal, well yes. The stored procedure to calculate the balance hit’s 4 different tables, and one of these tables is one of the largest in the system (recording all detailed financial transactions). Multiple batch reports alone that work with large sets of customers all require or use the current balance in some means.
There are only a small set of transactions that affect the balance including Invoice Statements, Payments, Adjustments, Credit Control, Write-Offs . These processes by there own nature are either batch or small online transactions. The calculation of the balance can be applied against the Chunk of customers in a batch, or individual Row for online transactions (See The RAT and the CAT).
Even for the paranoid, the re-calculation of the customer balance for all customers in batch is more efficient after hours, yet only customers where some transaction that has been applied since the last time the re-calculation ran is necessary.
MySQL doesn’t be default have any analysis tools to identify and manage these types of queries. Peter Zaitsev in his article Slow Query Log analyzes tools highlights the issue with a good approach by a source code change to long_query_time as well as some additional scripts.
Hopefully MySQL will consider a more improved approach in future releases that doesn’t require patching the source code. Now it’s unlikely that MySQL will do another SHOW STATUS Gotcha and for example change the unit of measure from seconds to milliseconds, but they could make it a float, so you could specify 0.1 for example.