For MySQL DBA fame and glory. Prize included.

I came across the following configuration today on a Production MySQL system (5.0.67) running 30+ blogs using WordPress MU.

$ cat /etc/my.cnf
[mysqld]
set-variable = max_connections=500
safe-show-database

No I did not truncate the output. I could say I’ve seen worse, but that’s a stretch.


So the quiz and a prize for the best response, for the next 48 hours I’ll accept your comments as responses to this post for the top 5 settings you would add, and additionally what information you may need to add these settings. Bonus points for giving a reason why you would add the settings as well.

For example, I’ll give you the most obvious.

key_buffer_size = ????

To determine a key_buffer_size to start with I would look at the size of all Indexes via I_S, and combine with some estimate of growth, say 2x-5x.

For the best answer with the top 5 settings, I’ll send you a copy of Sheeri Cabral’s book MySQL Administrators Bible which even I as a resident MySQL Expert has enjoyed scanning and learning something new.

Comments

  1. Dennis van Zuijlekom says

    Well, this one is fairly obvious probably;

    default-character-set = utf8

    Because databases should be encoded in UTF-8 charset instead of the default latin-1 charset.

    Also, you will probably want to specify the following;

    thread_cache_size = ???
    query_cache_size = ???M

    For performance tuning and database scaling, you need to know what hardware the db is running on and how much RAM will be available for query result caching.

    When replicating to another server, you’ll probably want to specify the following;

    server-id = ??

    To uniquely identify the server.

    auto-increment-increment = ??
    auto-increment-offset = ??

    For multi-master replication without the risk of duplicate unique ID’s.

    log-bin = ??

    Binary logging file, required for replication.

    master-info-file = ??

    Replication agreement information goes in here.

    relay-log = ??
    relay-log-index = ??
    relay-log-info-file = ??

    To be able to do multi-master replication with another master.

    binlog-do-db = ??
    binlog-ignore-db = ??

    Which databases to replicate when the ‘USE database’ command is issued or which one to skip.

    Disclaimer: I’m not a DBA nor claim to be.

  2. says

    The actual config syntax and options are of 3.23 origin!
    You may wish to provide info on things like available RAM, and whether the MySQL instance is on a dedicated server or shared with for instance the webserver. Also presuming MyISAM is used.
    I would not actually use INFORMATION_SCHEMA to investigate as that might upset an already stressed server.

    I’ll take a rough stab that the top settings you requested, this can be done in the live server with SET GLOBAL … and would yield immediate benefits. Then a more detailed investigation of stats can take place.

    #default key buffer would 8M which is not enough; presuming sufficient RAM but not going overboard at first bash
    #don’t think there’s a need to have all or even most of the indexes in memory. let’s just improve basic perf for now.
    key_buffer_size = 256M
    # sensible for tablescans when indexes not getting used
    read_buffer_size = 1M
    # repeated queries not re-executed
    query_cache_size = 64M

    # Could also add these

    # for reading larger chunks on indexed row reads, could make sense for blog entries
    read_rnd_buffer_size = 2M

    # improve connection speed, presuming mysql.allow_persistent_connections=Off in php.ini (which it should be)
    thread_cache_size = 256

    # so these’s the 5, above.

    # tends to already be like 2M by default so not bothering with this one yet
    #sort_buffer_size = 2M

    # these might not help much since there’ll be TEXT columns in the result sets.
    # also we need to set both so it would eat up two items from your top 5 ;-)
    #tmp_table_size = 32M
    #max_heap_table_size=32M

    # In addition… (later!) requires restart of server, may need to wait for a suitable maintenance slot.

    # binary log enable
    log-bin

    # allow mysql to check and auto-recover attempt myisam.
    myisam-recover = QUICK,BACKUP

    # log slow queries so we can find slow/unindexed queries
    log-slow-queries
    log-queries-not-using-indexes
    # enable/lower (down to 2, etc) this one later when unindexed dealt with
    #long_query_time = 5

  3. Robin says

    log-slow-queries
    log-queries-not-using-indexes
    set-variable=long_query_time=1
    log=mysql.log

    Changing your key buffer size, or any of the other buffers and caches without any information on how much data is stored and how often data gets queried and updated would be premature optimization.
    After you have set these settings, gathered some data and posted them we could analyze them and consider the next step: optimizing your server.

  4. says

    Free book! Woohoo!

    I’d like to make a go. I’ll list 5 parameters, then explain why these, then explain about my personal circumstance which will surely break your heart.

    So were assuming WordPress installation with MyISAM tables, and the following already defined:
    max-connections
    safe-show-database (?!)
    key_buffer_size

    My 5 are:
    =========
    table_cache=512
    thread_cache_size=64
    myisam_recover = force,backup
    server-id=12345
    log-bin=my-machine-bin

    Reasons:
    – I just can’t stand a server with no binary logging. It means no replication; no incremental backups. If I had more variables, I’d have added expire-log-days, but purging master logs can also be done in other ways.
    – 30 blogs, and about a dozen tables per wordpress installation (if I understand correctly, there’s a speparate schema per blog) make for about 500 tables. I’d like to keep them open. I usually allow for 1024 or 2048 cache size – I;m not sure though was OS this is, aind since there may be 500 connections, 512+500 make is almost 1024, which is (I think?) max in Windows, and default-max on linux. Anyway, I’m assuming since 500 connections are allowed, this is an expected number, which means lots of reads, which means lots of open tables, possible same tables open concurrently many times. So this can help out in flushing etc.
    – again, 500 connections make for lots of concurrency. So better keep up a decent thread cache. I find that 64 is large enough even for 500 concurrent connections, and make for very log thread creation per second. Again, this reduces impact from OS.
    – myisam-recover – Hey! You’re using MyISAM! Better watch out for yourself and prepare for the worse.

    I’m skipping the following important variables:
    sql-mode: because WordPress I’m assuming WordPress to work correctly, thereby avoiding data errors
    wait-timeout: since I’m assuming PHP pool to function correctly
    …ummm… bad idea to do this list, since there are dozens of flags…

    And now for my personal circumstance:
    I’m pennyless. The trip to the conference made me broke. I’m on the verge of selling my Community Member of the Year Award for the higher bidder (it’s a nice glass sphere, as you very well know).
    We both stood on the same stage and got it. You wouldn;t break such a bond of friendship, would you?
    And as for my health! Oh! Where shall I start?!
    If only I had a good book to comfort me in those cold, dark nights, maybe I would allow myself lighting another light bulb, in the hope that such good education can bring me, some year into the future, one more step in the direction of prosperity…

    PS. I’d also like to argue that some of the comments above have cheated, they contain more than 5. Now, would you not prefer to give the prize to an honest man?

    PS, you do realize it’s not only the book you’re offering for free, it’s also shipment!

    Regards,
    Shlomi

  5. says

    Goodness I forgot table_cache didn’t I… I’ll “swap out” thread_cache_size in my first 5 for table_cache=500
    They’re both important, but table_cache is *more* important.

  6. Tobias says

    My Config would be:

    # While WordPress relies on MyISAM and blogs are about 99.999% read-only we get a big benefit of a very big cache
    query_cache_type = On
    query_cache_size = 128M # we should know the RAM size but 128MB will never be too much for the use of this server (read-only sites)

    # MyISAM has no crash recovery like InnoDB, so we have to do it a similar way, or we’ll have trouble some night (Moore’s Law)
    myisam-recover = BACKUP,FORCE

    # explained later ;)
    log-slow-queries

    The other configuration options i would not touch – they are pretty ok. The basic wordpress functions do not use many tables, so there’s no need in increasing tablecache or keybuffersize.
    And database querys of wordpress are optimized, we only need the logging feature for slow querys to see that “something ist not working correctly”, we choose false parameters, the os is swapping or maybe our raid is doing some consistency checks.

    Tobias

  7. Toba says

    Why every assumes MyISAM for wordpress? We run wordpress fine with Innodb :)

    So mine are,

    # Set InnoDB as default engine so wordpress will create tables as innodb.
    default-storage-engine = InnoDB

    # More memory for Innodb, I usually set this about 75% of all RAM if running only innodb.
    innodb_buffer_pool_size = 1024M

    # Table cache is still usefull even with innodb tablespace because we want to cache .frm files.
    table_cache = 2048

    # Binary logging for replication and PITR
    bin-log = server-bin

    # For both mysqld and mysqldump. Using bigger multiline inserts reading the sql dump is lot faster.
    max_allowed_packet = 16M

  8. says

    Also consider if you are on a 32 or 64 bit machine. If 32, your large caches (e.g. innodb_buffer_pool and key_buffer_size) will be limited to 2G.

    Remember that some values are per thread (connection) e.g.: read_buffer_size and some can be allocated multiple times per thread, e.g.: join_buffer_size

    Other values are allocated globalally, e.g. innodb_buffer_pool and query_cache_size.

    Keep in mind the total amount of physical memory on the server when you are calculating global buffers and per connection buffers (Also note the setting of max_connections. Can your server handle 1000 connections, for example, allocating as much per-thread memory as you’ve configured?)

  9. Tobias says

    @Tobia
    While WordPress is mostly readonly and does not use any of InnoDB’s optimizations (clustered indexes).
    And in the past there were a few problems in using Innodb: SELECT COUNT(*) FROM tab; and such things.

  10. Michael says

    log-bin
    – For point in time recovery & replication

    table_cache <= 1024
    – WP opens and closes tables faster than you can sneeze. It also [heavily] uses temp tables. This can be a huge bottleneck if set too high.

    query_cache_size
    – WP is heavily read oriented, having an ample Query Cache size alleviates table accesses

    read_rnd_buffer_size
    – WP uses poorly written queries — many GROUP BY and ORDER BY. Setting this to a higher than normal value can help speed up these queries.

    If MyISAM:
    key_buffer_size (if large indexes, multiple key caches)
    – To help cache keys in memory, try to avoid disk seeks

    If InnoDB:
    innodb_buffer_pool_size
    – To help cache data in main memory, try and avoid disk seeks

  11. Gavin Towey says

    My 5 (assuming non-innodb system as suggested by key_buffer being the most obvious addition):
    key_buffer_size:
    # Large enough to hold current mysiam indexes, but no more than 2G or 25% of total system ram.

    table_cache
    # somewhere around 512 to 2048. Adjust this value up until the opened_tables status stops incrementing most of the time.

    thread_cache_size
    # probably around 32 to 128. Adjust it up until threads_created counter stops incrementing.

    query_cache_size
    # probably around 32M to 128M. Check the cache hits and fill rate for size. Check ratio of insert rate to hit rate to decide if query cache is even usefull. If there is a low number of hits compared to inserts, then it would be best to disable query cache.

    log-bin
    # incremental backups, and point in time recovery are critical. Plus it’s better to have this on, so you don’t require a server restart when it’s needed for replication or backups later.

  12. Gavin Towey says

    Also note that I didn’t choose to adjust sort_buffer_size, read_buffer_size or read_rnd_buffer_size since defaults in more recent versions of mysql are sensible values. They are the next values I’d tune *after* the top 5 above.

  13. says

    Just to point out (competition aside) that for replication purpose, log-bin is not sufficient: one must also define server-id (and though this is a dynamic variable, it really should be set in the config file)

  14. Michael says

    @Shlomi Noach:
    True that for replication to work correctly you need to assign server_id, however as TOP 5 configuration parameters, I’d recommend log-bin and leave server_id out. Replication will not start correctly without server_id, however the server [as a master] will be ripe/ready to serve replication slaves (assuming binary logs are not purged). As a result, you will eventually be forced to set server_id …

  15. says

    @Michael,

    you are right; I was merely stating the fact, regardless of the 5 parameters limitation. I was in fact relating to Toba #8.