What is the optimal thread specific buffer size?

So you want to know what join_buffer_size, sort_buffer_size, read_buffer_size and read_rnd_buffer_size values for your application should be? These MySQL thread specific buffers are variables I can never get right because there are insufficient metrics, instrumentation or even abstract details with the present MySQL Versions. These are important because Memory is a resource that you want to maximum towards your database data (the System Global Area), and not towards the Process Global Area in which there is no limitations.

I’ve wanted to know this answer for some time, I’ve asked many people including internal MySQL resources, I’d hoped that when joining MySQL more details would be available, but I’ve never been able to get an answer. I’ve always been meaning to work this out, it’s now 2008 and well the time has now come to do something about it.

The questions I want to answer include?

  • When are these buffers used, i.e. which SQL Statements trigger these. Ok, you can work this out when you review the SQL + Schema but I want something that works for clients.
  • I want to know if the buffers are fully allocated or incrementally allocated. My understanding is that for these 4 buffers, there are pre-allocated unlike internal temporary tables, but since reading some of the code I’m no longer completely convinced.
  • I want to know what size of the buffer (if fully allocated) is actually used.

My initial goal is to add the following new status variables.

  • sort_buffer_count, sort_buffer_usage
  • read_buffer_count, read_buffer_usage
  • read_rnd_buffer_count, read_rnd_buffer_usage
  • join_buffer_count, join_buffer_usage

Now, it’s likely there will be some overlap, for example sort_scan, but nothing like introducing some consistency here.

There per session status variables will allow you when reviewing individual SQL statements to see the impact. Ultimately I’d like to add more smarts into MySQL, because things like “gathering all SQL Statements” can be difficult and often not possible in production. Combined with leverage MySQL Proxy we will get there.

I’m looking forward to getting in the bows of the MySQL Source Code, it’s going to be a steep curve as I do little development these days, less in C++ but nothing like a challenge and questions to a lot of close colleagues for small snippets of help.

Tagged with: Databases MySQL

Related Posts

Tracking new AWS Database Infrastructure Availability

AWS can drop 10+ articles a day just in the What’s New feed. You either need an eagle eye, or luck to keep up if you run multiple AWS database products across multiple regions and managing infrastructure.

Read more

Evaluating Readyset Caching for MySQL

Readyset is a database caching solution for MySQL and PostgreSQL . For applications that have increased load on your primary database, or use scale-out infrastructure to support a high-read workload, ReadySet can be a drop-in solution to address current performance issues.

Read more

Creating a More Realistic Benchmark

Common benchmark approaches fall into two general categories, synthetic testing and realistic testing. You have the most generic operations from a synthetic test, starting with the most simple example using a single table, a single column, and for a single DML operation.

Read more