Given the popular response from my first lesson in improving scalability where I detailed simple ways to eliminate unnecessary SQL, let me share another common bottleneck with MySQL scalability that can be instantly overcome.
Analyzing the writes that occur on a system can expose obvious potential bottlenecks. The MySQL Binary Log is a wealth of information that can be mined. Simple DML Counts per table can be achieved by a single line command.
Let’s look at the following example output of a production system:
mysqlbinlog /path/to/mysql-bin.000999 | \ grep -i -e "^update" -e "^insert" -e "^delete" -e "^replace" -e "^alter" | \ cut -c1-100 | tr '[A-Z]' '[a-z]' | \ sed -e "s/\t/ /g;s/\`//g;s/(.*$//;s/ set .*$//;s/ as .*$//" | sed -e "s/ where .*$//" | \ sort | uniq -c | sort -nr
Of the approx 100,000 DML statements we get the following breakdown.
55283 update sessions 25204 insert into sessions 12610 update items 10536 insert into item_categories 7532 update users 5168 delete from item_categories
More then 50% of the statements that are written to the binary log and therefore replicated are INSERT’s into the sessions table. A further 25% are UPDATE’s to the same table. This represents 75% of DML statements in just the two most frequent statements.
What is disappointing is that these statements do not belong in MySQL. This is an example of when MySQL is being abused for a purpose where other products are more suited. While there is the argument in using MySQL for storing data, the impact in MySQL memory management, backup/recovery and slave replication throughput and lag can significantly impact scalability of your important MySQL data.
What is observed here is session management where a key value store product should be used as an alternative. In most circumstances it is likely this information is not even required to be persisted. The obvious replacement here is using memcached. If you do wish to persist this data there is an ever increasing list of products including Redis, Tokyo Cabinet/Kyoto Cabinent, Membrain, Voldemort etc that are specifically designed as a key-value store. Even the popular noSQL MongoDB can be easily substituted to perform as a key-value session manager with the added benefits of being a more fully functional product for other purposes.