Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Posts Tagged ‘monitoring’

What do MySQL Consultants do?

Thursday, July 8th, 2010

One role of a MySQL consultant is to review an existing production system. Sometimes you have sufficient time and access, and other times you don’t. If I am given a limited time here is a general list of things I look at.

  1. Review Server architecture, OS, Memory, Disks (including raid and partition type), Network etc
  2. Review server load and identify physical bottleneck
  3. Look at all running processes
  4. Look specifically at MySQL processes
  5. Review MySQL Error Log
  6. Determine MySQL version
  7. Look at MySQL configuration (e.g. /etc/my.cnf)
  8. Look at running MySQL Variables
  9. Look at running MySQL status (x n times)
  10. Look at running MySQL INNODB status (x n times) if used
  11. Get Database and Schema Sizes
  12. Get Database Schema
  13. Review Slow Query Log
  14. Capture query sample via SHOW FULL PROCESSLIST (locked and long running)
  15. Analyze Binary Log file
  16. Capture all running SQL

Here are some of the commands I would run.

2. Review server load and identify physical bottleneck

$ vmstat 5 720 > vmstat.`date +%y%m%d.%H%M%S`.txt

4. Look at MySQL processes

$ ps -eopid,fname,rss,vsz,user,command | grep -e "RSS" -e "mysql"
  PID COMMAND    RSS    VSZ USER     COMMAND
 5463 grep       764   5204 ronald   grep -e RSS -e mysql
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/vol/mysql/mysqldata --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld

$ ps -eopid,fname,rss,vsz,user,command | grep " mysqld " | grep -v grep | awk '{print $3,$4}'
4787820 5127208

5. Review MySQL Error Log

The error log can be found in various different places based on the operating system and configuration. It is important to find the right log, the SHOW GLOBAL VARIABLES LIKE ‘log_error’ will determine the location.

This is generally overlooked, however this can quickly identify some underlying problems with a MySQL environment.

7. Look at MySQL configuration

$ [ -f /etc/my.cnf ] &&  cat /etc/my.cnf
$ [ -f /etc/mysql/my.cnf ] &&  cat /etc/mysql/my.cnf
$ find / -name  "*my*cnf" 2>/dev/null

8. Look at running MySQL Variables

$ mysqladmin -uroot -p variables

9. Look at running MySQL status (x n times)

$ mysqladmin -uroot -p extended-status

It is important to run this several times at regular intervals, say 60 seconds, 60 minutes, or 24 hours.

I also have dedicated scripts that can perform this. Check out Log MySQL Stats.

11. Get Database and Schema Sizes

Check out my scripts on my MySQL DBA page

14. Capture Locked statements

Check out my script for Capturing MySQL sessions.

15. Analyze Binary Log file

Check out my post on using mk-query-digest.

16. Capture all SQL

Check out my post on DML Stats per table

Moving forward

Of course the commands I run exceeds this initial list, and gathering this information is only

Monitoring MySQL with MONyog

Wednesday, November 25th, 2009

It just works. In absence of any MySQL monitoring for your site, I have found no solution that gets you operational as quickly and easily. MONyog can be deployed in 60 seconds, and configured in another 60 seconds. Within 5 minutes you can have visual monitoring of your MySQL environment.

MONyog is an agentless process, which is an advantage for easy install, but does not provide for monitoring redundancy in the capture of information due to agentless nature. It’s a static standalone executable which is great if you need something to work out of the box. You can easily configure multiple servers in a replication topology, or different servers in your environment. You get the ability to monitor all the usual information, with a dashboard and detailed graphs. While MONyog does provide customizations of rules for the graphs and presentation order, that’s about it. You can’t at this time for example change the colors, what’s on graphs except for what MONyog monitors or the security of certain options in the GUI to different users, however I hope they offer these suggestions in future releases.

MONyog includes some nice features that are overlooked in other products. You have the ability to monitor the MySQL error log (if configured appropriately) which is a common complaint of end users. You can also see the process list, and when configured you can also perform query gathering and analysis.

MONyog is a well priced commercial product with a free trial download without registration requirements which gives no barrier to access and evaluate. As a solution and ease of deployment, there is no excuse not to evaluate this product. If you have no monitoring, you can now quickly and easily. I find a number of clients that simply have no monitoring. There really is no excuse as it’s critical information you need to have and record for a successful business.

You can get it from www.webyog.com.

What alert monitoring do you use?

Wednesday, October 7th, 2009

More importantly, how often to you confirm access to your server and database with that alert monitoring?

With a client yesterday the primary database server while still usable and serving connections for a while, but was not accessible via SSH to investigate performance issues. It eventually became non responsive and required a physical reboot. With alert monitoring for system availability only recorded every 5 minutes this was simply too long a delay.

This lead to a discussion with more questions then answers including.

  • How often should you ping your server(s), both internally and externally?
  • How often do you connect physically to your server for confirmation, e.g. a ssh keyed authentication test?
  • How often do you perform a physical database connection test?
  • How often do you do an end to end test, including http request to database query test?

As with all of these, you also want to time these operations for any deviations.

I’ve created a very simple MySQL Alert Monitoring survey. I would appreciate your input.

MySQL Monitoring 101 – Graph your results

Sunday, February 17th, 2008

The problem

Hands up those that don’t monitor their production MySQL web server. I’m a little surprised by this, but I’ve visited several clients that have absolutely no monitoring other then “the customers will tell us when something is wrong”. The lack of system monitoring is one of the topics in my book “101 ways to screw up a successful startup”.

Why is monitoring important? First it can tell you when something is wrong, most monitoring systems introduce some level of constraints that trigger notifications via email, SMS or red flashing screens. Second, and I consider more important, is it allows you to analyze change and compare results over time. Let’s say you added more memory to your server, and then remembered to also increase the MySQL buffers appropriately. How much improvement did it make? Rather then “it seems faster”, you can have hard and fast numbers to back it up.

The Monitoring

If you have zero monitoring, you need to implement at least the following. Create the following script and run daily at midnight via cron. I’ve made the script as simple as can be.

#!/bin/sh
SCRIPT_NAME="monitor"
DATETIME=`date +%Y%m%d.%H%M`
HOSTNAME=`hostname -s`
LOG_DIR="/tmp"

vmstat 5 17280 > $LOG_DIR/os.vmstat.5.$HOSTNAME.$DATETIME.log &
iostat -x 5 17280 > $LOG_DIR/os.iostat.5.$HOSTNAME.$DATETIME.log &
exit 0

If you don’t have iostat installed, you will find iostat/sar as part of the sysstat package that most distributions will have generally available.

The results

vmstat output of relative idle system.

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 304332 289484 435052    0    0     3     4  206  177  1  0 98  0
 0  0      0 304108 289496 435072    0    0     0    45  411  736  5  0 95  0
 0  0      0 304116 289500 435072    0    0     0    13  395  716  3  0 97  0

vmstat of system under disk load

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 234784 289844 516800    0    0     3     4  206  177  1  0 98  0
 1  1      0 101636 289988 645552    0    0     1 33190  448  626 35  5 31 29
 1  2      0  33092 280888 723276    0    0     4 17233  469  616 24  4  6 66
 1  0      0  62876 177444 797056    0    0     2 14846  837  938 33  5 31 30
 1  1      0  33252 168440 834064    0    0     1 30376  969  904 31  6 22 41

iostat of system under load

 iostat -x 5
Linux 2.6.22-14-generic (newyork)       02/16/2008

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.31    0.00    0.29    0.15    0.00   98.26

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.83     5.82    0.71    0.94    13.04    53.56    40.53     0.09   52.60   2.08   0.34
sr0               0.00     0.00    0.00    0.00     0.00     0.00    10.40     0.00  237.00 235.00   0.01

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          41.30    0.00    5.70   29.00    0.00   24.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5153.60    0.60   75.80     4.80 42774.40   559.94    81.57 1354.93  13.09 100.00
sr0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          30.63    0.00    4.10   61.36    0.00    3.90

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  3850.00    0.20   60.00     1.60 39763.20   660.54    83.68 1450.90  13.86  83.44
sr0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.10    0.00    3.80   42.30    0.00   31.80

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  7645.80    0.40   66.80     3.20 44009.60   654.95   100.34 1192.33  14.14  95.04
sr0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

The analysis

The below graph shows CPU usage over a few hours. What was only observed via the graph was a spike of CPU that occurred every 5 minutes. In fact it occurred just on one server of several servers, and this enabled analysis to find a rouge cron job running every 5 minutes.

The below graph shows CPU idle usage over a 24 hour period. The colors represent different usage patterns such as during the day, batch processing overnight etc.

The below graph shows IO utilization over the same 24 hour period.

NOTE: Color’s added manually to highlight points.
These past two graphs highlight a number of things that require further analysis, but as in this example, it only shows part of a bigger picture. For example only CPU idle is graphed, not Wait I/O for example which as you can see from the I/O is excessive at times.

Still, these graphs were created from grep’ing the appropriate data into a CSV file, loading into Open Office, and click just the graph button, about a 60 second process for each type of data. This alone gave a better picture of what was happening then no picture.

What’s next

This level of monitoring is impractical for production systems, especially if you are monitoring more then one system. You need to monitoring more things, compare different results at the same time, and get better results at hand instantly. In my next topic “Monitoring 201″ I’ll talk about a number of easy and practical options.