Unexplained function output

I was asked today to confirm the operation of INET_ATON() and INET_NTOA() functions for converting IP4 strings to numeric representations. My tests on the machine I was just connected to at the very instant reported the following results.

mysql> SELECT VERSION();
+-----------------+
| VERSION()       |
+-----------------+
| 5.1.31-1ubuntu2 |
+-----------------+
1 row in set (0.00 sec)

mysql> SELECT INET_ATON('74.165.97.204') AS ipn;
+------------+
| ipn        |
+------------+
| 1252352460 |
+------------+
1 row in set (0.00 sec)

mysql> SELECT INET_NTOA(1252352460) as ipa;
+---------------+
| ipa           |
+---------------+
| 74.165.97.204 |
+---------------+
1 row in set (0.01 sec)

The results on the client via ODBC using MySQL 5.0.45 was?

ipn
1252352460

ipa
37342E3136352E39372E323034

What is causing this? It was immediately suspicious the pattern of 2E (the only letter), which translates from Hex to decimal to ‘.’ and 37 becomes 7 and 34 becomes 4, so the string is in hexidecimal and then with the UNHEX() function you get the right answer?

mysql> select unhex('37342E3136352E39372E323034');
+-------------------------------------+
| unhex('37342E3136352E39372E323034') |
+-------------------------------------+
| 74.165.97.204                       |
+-------------------------------------+

Could it be the ODBC drivers? Investigation showed them current as 5.01.05.00

Searching the MySQL Bugs database didn’t reveal anything noticable, nor on the forums.

My best suggestion was to post on the MySQL ODBC forums but I welcome any feedback from my readers.

MySQL Query Cache path

Just how effective is the Query Cache on performance? If you are not comfortable reading the MySQL code you can cheat very easily with the SHOW PROFILE command. This demonstration will show you at a high level the relative impact without and with the Query Cache. First let us confirm the Query Cache is not used.

mysql> SHOW GLOBAL VARIABLES LIKE 'query_cache%';
+------------------------------+---------+
| Variable_name                | Value   |
+------------------------------+---------+
| query_cache_limit            | 1048576 |
| query_cache_min_res_unit     | 4096    |
| query_cache_size             | 0       |
| query_cache_type             | ON      |
| query_cache_wlock_invalidate | OFF     |
+------------------------------+---------+
5 rows in set (0.00 sec)

We now enable profiling.

mysql> SET PROFILING=1;

We run our sample query.

mysql> select name,id,competitions from olympic_games where host_city='Sydney'G

We can then enable the query cache and re-run the query.

mysql> SET GLOBAL query_cache_size=1024*1024*16;
mysql> select name,id,competitions from olympic_games where host_city='Sydney'G
mysql> select name,id,competitions from olympic_games where host_city='Sydney'G

NOTE: We run the query twice after enabling the cache, the first time, the query is cached, the second time it is retrieved from the cache. Now let us look at the profiling information.

mysql> SHOW PROFILES;
+----------+------------+-------------------------------------------------------------------------+
| Query_ID | Duration   | Query                                                                   |
+----------+------------+-------------------------------------------------------------------------+
|        1 | 0.00096100 | select name,id,competitions from olympic_games where host_city='Sydney' |
|        2 | 0.00029700 | SET GLOBAL query_cache_size=1024*1024*16                                |
|        3 | 0.00837900 | select name,id,competitions from olympic_games where host_city='Sydney' |
|        4 | 0.00009500 | select name,id,competitions from olympic_games where host_city='Sydney' |
+----------+------------+-------------------------------------------------------------------------+

mysql> SHOW PROFILE SOURCE FOR QUERY 3;
+--------------------------------+----------+---------------------------+---------------+-------------+
| Status                         | Duration | Source_function           | Source_file   | Source_line |
+--------------------------------+----------+---------------------------+---------------+-------------+
| starting                       | 0.000033 | NULL                      | NULL          |        NULL |
| checking query cache for query | 0.000088 | send_result_to_client     | sql_cache.cc  |        1262 |
| Opening tables                 | 0.000025 | open_tables               | sql_base.cc   |        4482 |
| System lock                    | 0.000006 | mysql_lock_tables         | lock.cc       |         258 |
| Table lock                     | 0.000081 | mysql_lock_tables         | lock.cc       |         269 |
| init                           | 0.000037 | mysql_select              | sql_select.cc |        2350 |
| optimizing                     | 0.000016 | optimize                  | sql_select.cc |         772 |
| statistics                     | 0.000021 | optimize                  | sql_select.cc |         954 |
| preparing                      | 0.000021 | optimize                  | sql_select.cc |         964 |
| executing                      | 0.000005 | exec                      | sql_select.cc |        1648 |
| Sending data                   | 0.000500 | exec                      | sql_select.cc |        2190 |
| end                            | 0.000008 | mysql_select              | sql_select.cc |        2395 |
| query end                      | 0.000005 | mysql_execute_command     | sql_parse.cc  |        4821 |
| freeing items                  | 0.007489 | mysql_parse               | sql_parse.cc  |        5827 |
| storing result in query cache  | 0.000028 | query_cache_end_of_result | sql_cache.cc  |         813 |
| logging slow query             | 0.000007 | log_slow_statement        | sql_parse.cc  |        1628 |
| cleaning up                    | 0.000009 | dispatch_command          | sql_parse.cc  |        1595 |
+--------------------------------+----------+---------------------------+---------------+-------------+
17 rows in set (0.00 sec)

mysql> SHOW PROFILE SOURCE FOR QUERY 4;
+--------------------------------+----------+-----------------------+--------------+-------------+
| Status                         | Duration | Source_function       | Source_file  | Source_line |
+--------------------------------+----------+-----------------------+--------------+-------------+
| starting                       | 0.000035 | NULL                  | NULL         |        NULL |
| checking query cache for query | 0.000014 | send_result_to_client | sql_cache.cc |        1262 |
| checking privileges on cached  | 0.000010 | send_result_to_client | sql_cache.cc |        1346 |
| sending cached result to clien | 0.000026 | send_result_to_client | sql_cache.cc |        1441 |
| logging slow query             | 0.000005 | log_slow_statement    | sql_parse.cc |        1628 |
| cleaning up                    | 0.000005 | dispatch_command      | sql_parse.cc |        1595 |
+--------------------------------+----------+-----------------------+--------------+-------------+
6 rows in set (0.00 sec)

It does not take a rocket scientist to determine that 6 steps within the MySQL kernel is better then 17, regardless of what those steps are, and how different in timing they may be.

I’m not wanting to represent how much saving you may have here, there are many factors such as a realistic example, a loaded warmed up environment etc. You should try this in your own environment with your own queries.

This information was to provide an introduction into looking a little deeper at the Query Cache path within MySQL.

Using the Query Cache effectively

Maximize your strengths, minimize your weaknesses.

You can apply this approach to many things in life, I apply it to describing and using MySQL the product, and it’s components. The Query Cache like many features in MySQL, and indeed features in many different RDBMS products (don’t get me started on Oracle *features*) have relative benefits. In one context it can be seen as ineffective, or even detrimental to your performance, however it’s course grain nature makes it both trivial to disable dynamically (SET GLOBAL query_cache_size=0;), and also easy to get basic statistics on current performance (SHOW GLOBAL STATUS LIKE ‘QCache%';) to determine effectiveness and action appropriately.

The Query Cache is course grained, that is it is rather simple/dumb in nature. When you understand the path of execution of a query within the MySQL kernel you learn a few key things.

  • When enabled, by default the Query Cache will cache all SELECT statements within certain defined system parameter conditions. There are of course exceptions such as non-deterministic functions, prepared statements in earlier versions etc.
  • Any DML/DDL statement for a table that has a query cached, flushes all query cache results that pertain to this table.
  • You can use SQL_CACHE and SQL_NO_CACHE as hints however you can’t configure on a table by table, or query basis.
  • The query cache works on an exact match of the query (including spaces and case) and other settings such the client character set, and protocol version. If a match is found, data is returned in preformed network packets.<.li>

The Query Cache was not good when set to large values (e.g. > 128M) due to in-efficient cache invalidation. I’m not certain of the original source of this condition however Bug #21074, fixed in 5.0.50 and 5.1.21 is likely the reason.

My advice is to disable the Query Cache by default, especially for testing. As a final stress test you can enable to determine if there is a benefit.

I wish MySQL would spend time in improving key features, for example the Query Cache lacks sufficient instrumentation like what queries are in the cache, what tables are in the cache, and also lack all the sufficient system parameters exposed to fine tune. I believe there is a patch to show the queries for example, but I was unable to find via a google search.

It is a powerful and easy technology if you use it well. It involves architecting your solution appropriately, and knowing when the Query Cache is ineffective.

I have a number of circumstances where the query cache is extremely effective, or could be with simple modifications. A recommendation to a recent client with a 1+TB database was to split historical and current data into two different instances. The data was already in separated tables, the application already performed dual queries, so the change was a simple as a new connection pool. The benefits were huge, not only would the backup process be more efficient, some 500GB of data now only had to be backed up once (as is was 100% static), the scaling and recovery process improved, but the second MySQL instance could enable the query cache and the application would get a huge performance improvement with ZERO code changes for caching. That’s a quick and easy win.

On a side note, I wanted to title this “The MySQL Query Cache is not useless”. When I was a MySQL employee I got reprimanded (twice) for blogging anything about MySQL that wasn’t positive. This blog post is in direct response to Konstantin, a Sun/MySQL employee who actually works on the actually MySQL server code who wrote Query cache = useless?. In my view it is not useless.

EXPLAIN – An essential tool for MySQL developers.

Just recently I came across the presentation “Bend SQL to your will with EXPLAIN” by Ligaya Turmelle via the Linked In – MySQL Speakers and Presenters group. Slides available at Slideshare. While I know a little about MySQL, I always enjoy spending time reviewing content when possible, perhaps as a good reference when asked, perhaps to glean some small point of knowledge I didn’t know, or didn’t understand completely, or had forgotten about. It’s also fun to randomly find worthwhile information.

EXPLAIN is an essential tool for MySQL developers, if you don’t know what QEP is, the listed google search link gives you a clear and obvious easy definition (using basic contextual searching techniques). You would then use EXPLAIN to determine the QEP, and then learn how to use it well.

For those that want to learn about EXPLAIN as an essential tool for MySQL developers I recommend you check the presentation out.

Calculating your database size

I generally use the following MySQL INFORMATION_SCHEMA (I_S) query to Calculate Your MySQL Database Size. This query and most others that access the MySQL INFORMATION_SCHEMA can be very slow to execute because they are not real tables and are not governed by physical data, memory buffers and indexes for example but rather internal MySQL data structures.

Mark Leith indicates in his post on innodb_stats_on_metadata that Innodb performs 8 random(ish) dives in to the index, when anybody accesses any of SHOW TABLE STATUS, SHOW INDEX, INFORMATION_SCHEMA.TABLES,INFORMATION_SCHEMA.STATISTICS for InnoDB tables. This can have an effect on performance, especially with a large number of Innodb tables, and a poor ratio of innodb_buffer_pool_size to disk data+index footprint.

What is even more incredible is when the result of this apparently harmless query causes the mysqld process to actual crash with a core dump due to these random index dives. The following core dump analysis highlights my query as the cause of the problem. This has happened now at least twice in for recent core crashes on a production environment.

(gdb) bt
#0 0x000000327280b6b2 in pthread_kill () from ./lib64/libpthread.so.0
#1 0x000000000055b136 in handle_segfault ()
#2 
#3 0x00000000007e1c21 in rec_get_offsets_func ()
#4 0x0000000000766007 in btr_estimate_number_of_different_key_vals ()
#5 0x000000000070d4c2 in dict_update_statistics_low ()
#6 0x000000000061fa84 in ha_innobase::info ()
#7 0x0000000000636972 in fill_schema_charsets ()
#8 0x0000000000639a66 in get_all_tables ()
#9 0x0000000000634633 in get_schema_tables_result ()
#10 0x00000000005bde37 in JOIN::exec ()
#11 0x00000000005bf7a7 in mysql_select ()
#12 0x00000000005c0127 in handle_select ()
#13 0x000000000056fcf0 in mysql_execute_command ()
#14 0x0000000000574c83 in mysql_parse ()
#15 0x00000000005751a0 in dispatch_command ()
#16 0x0000000000576483 in do_command ()
#17 0x0000000000577002 in handle_one_connection ()
#18 0x0000003272806367 in start_thread () from ./lib64/libpthread.so.0
#19 0x0000003271cd30ad in clone () from ./lib64/libc.so.6
Cannot access memory at address 0x3271cd3040

This is an information_schema query that caused innodb to open a table.
This is totally normal. On first open, innodb tables get automatically  analyzed.
This analyze process crashed in innodb.

This exact query *provoked* a crash:

(gdb) x/1s 0x00002aaabc961dd0
0x2aaabc961dd0: "SELECT table_schema,table_name,engine,row_format,
table_rows, avg_row_length,
(data_length+index_length)/1024/1024 as total_mb,
(data_length)/1024/1024 as data_mb,
(index_length)/1024/1024 as index_mb,
CURDATE() AS today
FROM information_schema.tables
WHERE table_schema=@schema
ORDER BY 7 DESC"

The issue however is which table is the problem? How widespread is the corruption. Would an ALTER TABLE ENGINE=Innodb rebuild the table and eliminate the problem. Would an ANALYZE on an Innodb table identify the problem? (I doubt this second point). The problem however is even more significant due to the actual system. The largest single table of this 1TB database is 500GB. The impact of performing the ALTER, the time to undertake this blocking operation, the increase in the Innodb data file that can’t be reclaimed are just two factors that the inexperienced may fall victim of.

A saying I use is “Disaster is inevitable”. In this situation the disaster appears to not be significant but the ramifications due to the lack of appropriate and expert architectural design considerations to correct the problem are.

Is your environment capable of supporting this maintenance requirement? If not, then is the decision maker in your organization worried enough to seek the expert advice to address pro actively or will it be too late.

Understanding Innodb Transaction Isolation

The MySQL Innodb storage engine provides ACID compliance, the ‘I’ being isolation. There are four states defined in MySQL with the tx_isolation system variable, READ-UNCOMMITTED, READ-COMMITTED, REPEATABLE-READ and SERIALIZABLE.

Generally MySQL installations do not modify the default value of tx_isolation = REPEATABLE-READ, however I have seen with a number of clients the default value has been changed to READ-COMMITTED. When I attempt to identify the reason why, I have always received the same reason. Oracle uses a default transaction isolation of READ-COMMITTED. See how Oracle Manages Data Concurrency and Consistency for more information.

However, while the literal string is the same, the actual implementation of READ-COMMITTED in Oracle more closely represents the REPEATABLE-READ in MySQL.

The following demonstrates what you can expect to see between the operation of these two modes.

REPEATABLE-READ
Session 1 Session 2
DROP TABLE IF EXISTS transaction_test;
CREATE TABLE transaction_test(
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  val  VARCHAR(20) NOT NULL,
  created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY(id)
) ENGINE=InnoDB DEFAULT CHARSET latin1;

INSERT INTO transaction_test(val) VALUES ('a'),('b'),('c');
SELECT @@global.tx_isolation, @@session.tx_isolation;
+-----------------------+------------------------+
| @@global.tx_isolation | @@session.tx_isolation |
+-----------------------+------------------------+
| REPEATABLE-READ       | REPEATABLE-READ        |
+-----------------------+------------------------+
START TRANSACTION;
SELECT * FROM transaction_test;
+----+-----+---------------------+
| id | val | created             |
+----+-----+---------------------+
|  1 | a   | 2009-09-21 00:19:43 |
|  2 | b   | 2009-09-21 00:19:43 |
|  3 | c   | 2009-09-21 00:19:43 |
+----+-----+---------------------+
SELECT SLEEP(20);
START TRANSACTION;
INSERT INTO transaction_test(val) VALUES ('x'),('y'),('z');
SELECT * FROM transaction_test;
+----+-----------------+---------------------+
| id | val             | created             |
+----+-----------------+---------------------+
|  1 | a               | 2009-09-21 00:19:43 |
|  2 | b               | 2009-09-21 00:19:43 |
|  3 | c               | 2009-09-21 00:19:43 |
|  4 | x               | 2009-09-21 00:21:00 |
|  5 | y               | 2009-09-21 00:21:00 |
|  6 | z               | 2009-09-21 00:21:00 |
+----+-----------------+---------------------+
COMMIT;
INSERT INTO transaction_test(val) VALUES (@@session.tx_isolation);
SELECT * FROM transaction_test;
+----+-----------------+---------------------+
| id | val             | created             |
+----+-----------------+---------------------+
|  1 | a               | 2009-09-21 00:19:43 |
|  2 | b               | 2009-09-21 00:19:43 |
|  3 | c               | 2009-09-21 00:19:43 |
|  7 | REPEATABLE-READ | 2009-09-21 00:21:01 |
+----+-----------------+---------------------+

COMMIT;
READ-COMMITTED
SET SESSION tx_isolation='READ-COMMITTED';
TRUNCATE TABLE transaction_test;
INSERT INTO transaction_test(val) VALUES ('a'),('b'),('c');
SELECT @@global.tx_isolation, @@session.tx_isolation;
+-----------------------+------------------------+
| @@global.tx_isolation | @@session.tx_isolation |
+-----------------------+------------------------+
| REPEATABLE-READ       | READ-COMMITTED         |
+-----------------------+------------------------+
START TRANSACTION;
SELECT * FROM transaction_test;
+----+-----+---------------------+
| id | val | created             |
+----+-----+---------------------+
|  1 | a   | 2009-09-23 22:49:44 |
|  2 | b   | 2009-09-23 22:49:44 |
|  3 | c   | 2009-09-23 22:49:44 |
+----+-----+---------------------+
SELECT SLEEP(20);
START TRANSACTION;
INSERT INTO transaction_test(val) VALUES ('x'),('y'),('z');
SELECT * FROM transaction_test;
+----+-----+---------------------+
| id | val | created             |
+----+-----+---------------------+
|  1 | a   | 2009-09-23 22:49:44 |
|  2 | b   | 2009-09-23 22:49:44 |
|  3 | c   | 2009-09-23 22:49:44 |
|  4 | x   | 2009-09-23 22:52:38 |
|  5 | y   | 2009-09-23 22:52:38 |
|  6 | z   | 2009-09-23 22:52:38 |
+----+-----+---------------------+
COMMIT;
INSERT INTO transaction_test(val) VALUES (@@session.tx_isolation);
SELECT * FROM transaction_test;
+----+----------------+---------------------+
| id | val            | created             |
+----+----------------+---------------------+
|  1 | a              | 2009-09-23 22:49:44 |
|  2 | b              | 2009-09-23 22:49:44 |
|  3 | c              | 2009-09-23 22:49:44 |
|  4 | x              | 2009-09-23 22:52:38 |
|  5 | y              | 2009-09-23 22:52:38 |
|  6 | z              | 2009-09-23 22:52:38 |
|  7 | READ-COMMITTED | 2009-09-23 22:56:10 |
+----+----------------+---------------------+

COMMIT;

As you can see, under READ-COMMITTED your result set can change during the transaction. However, how practical is this example in an actual application.

In what circumstances would you consider using READ-COMMITTED? Is there an improvement in locking contention that can lead to less deadlock contention? What is the overhead in other areas?

Harrison writes in My Favorite New Feature of MySQL 5.1: Less InnoDB Locking that best locking out of InnoDB in 5.1 will be with READ-COMMITTED. Note that as mentioned, the impact is a change in replication mode that may have a more dramatic effect.

Heikki Tuuri comments in Understanding InnoDB MVCC that using READ-COMMITTED should help in a specific locking issue.

I am still unclear of the specific benefits in general terms for all environments. Review of the The InnoDB Transaction Model and Locking and specifically Consistent Nonlocking Reads provides “With READ COMMITTED isolation level, each consistent read within a transaction sets and reads its own fresh snapshot” which indicates that for certain workloads the reduced locking is a benefit.

Every environment is different and ultimately the actual transaction statements will determine what options and benefits work best.

SELECT INTO DUMPFILE

While learning a new ORDER BY syntax recently, as a diligent architect/DBA I reviewed the documentation. What I also found in the SELECT syntax which I did not also know was the keyword DUMPFILE.

The SELECT Syntax from MySQL 5.1 Manual states:

If you use INTO DUMPFILE instead of INTO OUTFILE, MySQL writes only one row into the file, without any column or line termination and without performing any escape processing. This is useful if you want to store a BLOB value in a file.

It’s a shame there is no middle ground, where you get the features of OUTFILE (i.e. all rows), and the features of DUMPFILE (i.e. no heading)

Identifying Resource Bottlenecks – Disk

With a discussion on identifying CPU and Memory bottlenecks achieved, let us now look at how Disk can affect MySQL performance.

One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources (CPU,Memory,Disk,Network), then identify any obvious bottlenecks in these resources.

There are several commands you can use including vmstat, iostat and sar (both part of the sysstat package) to provide information on disk I/O.

vmstat is a good place to start for an overall view for multiple resources, however for disk it only provides a total system overview.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
  4  0 249256 107768 116360 1519596    0    0     0   112 8151 11256 40  4 56  0
 3  0 249256 107588 116360 1519680    0    0     4  1068 8021 11514 51  2 47  0
 1  0 249256 107396 116360 1519788    0    0     0    88 8880 12832 35  6 60  0
 0  0 249256 107432 116360 1519928    0    0     4    40 9393 14561  8  4 89  0
 2  0 249256 107160 116360 1519988    0    0     4  5636 9790 14245 23  6 71  0
 1  0 249256 107140 116360 1520356    0    0     4   180 9077 13285 33  3 65  0
 3  0 249256 107100 116360 1520352    0    0     0  1516 7970 13099 22  2 75  0
 4  1 249256 107184 116360 1520476    0    0     4   108 9756 15478 67  4 29  0
 2  0 249256 106196 116360 1520652    0    0     0     0 9512 14212 61  4 35  0

We want to look at is bi, bo and wa. The descriptions of these columns from the man page is:

  • bi: Blocks received from a block device (blocks/s).
  • bo: Blocks sent to a block device (blocks/s).
  • wa: Time spent waiting for IO.

As you can see from this above example, there is no disk reading, just disk writing, and there is no wait for I/O. This is good.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 3  2 888552  86716 321940 712480   32  636  5352  1800 18177 22731 31  9 29 31
 0  5 888552  86748 321944 712592    0    0  2072   264 15592 19435 27  6 45 23
 4  5 888552  86296 321944 712796   16    0  5556  8388 15559 19674 28  5 11 55
 4  2 888552  86112 321948 713408   24    0  4404  4936 15215 19354 26  6 20 48
 6  0 888552  85732 321948 713608   56    0  6348  4368 15123 19109 25  5 37 34
 2  3 888552  85188 321956 713936   60    0  3080  4104 16322 21044 29  6 48 18
 2  3 888552  84972 321964 714376   20    0  4464 10852 20483 26013 33  9 25 34
 1 10 888552  84772 321980 714724   12    0  9332 12868 16981 21948 28  6 19 48
 2  3 888552  84080 321988 714952  112    0 11144  8944 15973 20190 27  6  1 65

In this above example we see a production system that has high disk reads and writes, and wait I/O is high. If you see the CPU waiting for Disk I/O at 60%-70%-80% you have effectively reached disk saturation.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0 28      0  14436   4616 2405504    0    0     0     2  101   92  0  0  0 100
 0  4      0  14320   4616 2405504    0    0   120  2300  191  213  0  0  1 98
 0  5      0  15064   4616 2405056    0    0  1688    62 1961 2080  2  0 32 63
 0  4      0  14136   4616 2405916    0    0   952   272  584  569  0  0  0 98
 0  5      0  16864   4624 2403068    0    0   336    76  886 1389  2  1  0 97
 0  3      0  16052   4624 2403928    0    0   800   288  373  352  0  0  0 99
 0  3      0  15380   4624 2404556    0    0   720   240  373  357  0  0  0 98
 0  3      0  14452   4624 2405588    0    0   912   400  330  324  1  0  0 97
 0 57      0  15688   4624 2404380    0    0  1956   893  439  828  1  0  0 97
 0 56      0  15572   4632 2404544    0    0   208    12  947 1402  2  0  0 97
 0 53      0  14784   4632 2405320    0    0   784     1  310  261  0  0  0 98
 0 50      0  14436   4632 2405664    0    0   288   120  175  140  0  0  0 99
 0 50      0  14228   4632 2405688    0    0   160     0   97   85  0  0  0 100
 1 49      0  14112   4632 2406032    0    0   276     0  183  184  0  0  0 100

In the above example the system is past total disk saturation. The system is waiting completely for disk. This is the output of an actual production system. This is a good example because it is important to look at all three figures. Look at how low the amount of bi/bo in ration to wa. This is an indicator of a potential underlying disk subsystem problem, and in-fact this system failed within 24 hours of this output. I have never see 100% Disk Wait I/O before this example.

To get an indication of load on a multi disk system you need to use iostat for example. In the following example, we have an idle system with two drives. I simulate load on sdb with the following command.

$ dd if=/dev/zero of=/disk1/dd.out bs=1024k count=10000
$ iostat 5
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               2.80         1.60        97.60          8        488
sdb               2.80         0.00        25.60          0        128

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.10    0.00    0.05    0.10    0.00   99.75

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               5.80        19.20       225.60         96       1128
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.80    0.00   11.86    6.30    0.00   81.04

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              19.20        17.60       294.40         88       1472
sdb              18.20         0.00     18246.40          0      91232

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.40    0.00    3.15   34.88    0.00   61.57

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              35.20        30.40     14409.60        152      72048
sdb              86.20         1.60     83763.20          8     418816

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.30    0.00    1.45   38.87    0.00   58.38

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              30.80         4.80       451.20         24       2256
sdb              84.40         0.00     84172.80          0     420864

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.40    0.00    1.15   35.43    0.00   61.02

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              15.40         3.20       270.40         16       1352
sdb              84.80         0.00     83353.60          0     416768

You can see from this example, the increase in blocks written to sdb, and overal %iowait.

If you want to do a blanket test of your disk subsystem you should consider a program that adequately test the different interactions, especially any caching or battery backed cached that your system may have. I have written several articles on using Bonnie++ including Using Bonnie++, Extending Bonnie++ and Bonnie++ Results.

There are a number of variables that make it more difficult to audit disks. RAID configuration is often difficult as this involves custom OS/RAID provider commands. The disk controller cache, and battery backed cache (BBRU) are just two factors. It is important you know these commands, you study them and most importantly you know when your system is running in a degraded mode. The popular Dell 1950/2950 etc series generally have PERC 5/6i cards, you can use the folllowing as a guide to Understanding PERC RAID Controllers.

My 60 second take on RAID and Disk Configuration. I concur with Matt Yonkivit. You should separate your OS from your database on disk, RAID 1 (2 drives) works fine for the OS. For databases, in theory (pure math when understanding RAID), RAID10 is better then RAID5.

My favorite MySQL data type – DECIMAL(31,0)

It may seem hard to believe, but I have seen DECIMAL(31,0) in action on a production server. Not just in one column, but in 15 columns just in the largest 4 tables of one schema. The column was being used to represent a integer primary or foreign key column.

In a representative production instance (one of a dozen plus distributed production database servers) the overall database footprint was decreased from ~10 GB to ~2 GB, a 78% saving. In total, 15 columns across just 4 tables were changed from DECIMAL(31,0) to INT UNSIGNED.

One single table > 5GB was reduced to under 1GB (a 81% saving). This being my record for any GB+ tables in my time working with the MySQL database.

Had this server for example had 4GB of RAM, and say 2.5GB allocated to the innodb_buffer_pool_size, this one change moved the system from requiring more consistent disk access (4x data to memory) to being able to store all data in memory. Tests showed a clear improvement in Innodb buffer pool reads and hit ratio.

Today’s lesson as described in my 2008 conference presentation Top 20 design tips for data architects is, choose the right integer data type for your data.

MySQL SHOW PRIVILEGES

Some days you learn about MySQL commands even without knowing about them. Today I wanted to check the privileges a user had because they did not have permissions to drop a view. Rather then typing SHOW GRANTS I quite by accident typed SHOW PRIVILEGES only to realize not only was it a valid command, it actually provided information that means I don’t have to go to the Privileges Provided by MySQL documentation page which I was already on.

Old dog, new trick.

mysql> show privileges;
+-------------------------+---------------------------------------+-------------------------------------------------------+
| Privilege               | Context                               | Comment                                               |
+-------------------------+---------------------------------------+-------------------------------------------------------+
| Alter                   | Tables                                | To alter the table                                    |
| Alter routine           | Functions,Procedures                  | To alter or drop stored functions/procedures          |
| Create                  | Databases,Tables,Indexes              | To create new databases and tables                    |
| Create routine          | Functions,Procedures                  | To use CREATE FUNCTION/PROCEDURE                      |
| Create temporary tables | Databases                             | To use CREATE TEMPORARY TABLE                         |
| Create view             | Tables                                | To create new views                                   |
| Create user             | Server Admin                          | To create new users                                   |
| Delete                  | Tables                                | To delete existing rows                               |
| Drop                    | Databases,Tables                      | To drop databases, tables, and views                  |
| Event                   | Server Admin                          | To create, alter, drop and execute events             |
| Execute                 | Functions,Procedures                  | To execute stored routines                            |
| File                    | File access on server                 | To read and write files on the server                 |
| Grant option            | Databases,Tables,Functions,Procedures | To give to other users those privileges you possess   |
| Index                   | Tables                                | To create or drop indexes                             |
| Insert                  | Tables                                | To insert data into tables                            |
| Lock tables             | Databases                             | To use LOCK TABLES (together with SELECT privilege)   |
| Process                 | Server Admin                          | To view the plain text of currently executing queries |
| References              | Databases,Tables                      | To have references on tables                          |
| Reload                  | Server Admin                          | To reload or refresh tables, logs and privileges      |
| Replication client      | Server Admin                          | To ask where the slave or master servers are          |
| Replication slave       | Server Admin                          | To read binary log events from the master             |
| Select                  | Tables                                | To retrieve rows from table                           |
| Show databases          | Server Admin                          | To see all databases with SHOW DATABASES              |
| Show view               | Tables                                | To see views with SHOW CREATE VIEW                    |
| Shutdown                | Server Admin                          | To shut down the server                               |
| Super                   | Server Admin                          | To use KILL thread, SET GLOBAL, CHANGE MASTER, etc.   |
| Trigger                 | Tables                                | To use triggers                                       |
| Update                  | Tables                                | To update existing rows                               |
| Usage                   | Server Admin                          | No privileges - allow connect only                    |
+-------------------------+---------------------------------------+-------------------------------------------------------+
29 rows in set (0.00 sec)

More woes with java version on Ubuntu

Armed with more information on Drizzle JDBC being a JDBC 4.0 implementation (helps to explain my issues in Getting started with Drizzle JDBC) I took the time to read about some other new JDBC 4.0 features.

There was reference to handling chained exceptions, however when trying to get this working for SQLException was more complex on Ubuntu 9.04 then I anticipated.

My first problem was an apparent source level problem.

$ javac ExampleDrizzle.java
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	    ^^^^^^^^^^^^^^^^
Syntax error, 'for each' statements are only available if source level is 1.5

That’s weird, what java version was I running now I’d changed with update-alternatives –config java yesterday.

$ java -version
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)

No issues here, a quick man reference gives me:

-1.5                    set compliance level to 1.5

I try that, and well that fixes one problem, but creates another.

$ javac -1.5 ExampleDrizzle.java
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	                  ^^
Can only iterate over an array or an instance of java.lang.Iterable

Now Class SQLException 1.6 javadocs shows SQLException as implementing the generics Iterable<Throwable>, while 1.5 javadoc does not. I guess I need to use 1.6 then.

$ javac -1.6 ExampleDrizzle.java
Annotation processing got disabled, since it requires a 1.6 compliant JVM
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	                  ^^
Can only iterate over an array or an instance of java.lang.Iterable

Wait a minute, I’m using a 1.6 compliant JVM. Double checking

$ ls -al /etc/alternatives/java*
lrwxrwxrwx 1 root root 36 2009-09-17 18:53 /etc/alternatives/java -> /usr/lib/jvm/java-6-sun/jre/bin/java
lrwxrwxrwx 1 root root 46 2009-09-17 18:53 /etc/alternatives/java.1.gz -> /usr/lib/jvm/java-6-sun/jre/man/man1/java.1.gz
lrwxrwxrwx 1 root root 31 2009-09-17 17:50 /etc/alternatives/javac -> /usr/lib/jvm/java-gcj/bin/javac
lrwxrwxrwx 1 root root 41 2009-09-17 17:50 /etc/alternatives/javac.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javac.1.gz
lrwxrwxrwx 1 root root 33 2009-09-17 17:50 /etc/alternatives/javadoc -> /usr/lib/jvm/java-gcj/bin/javadoc
lrwxrwxrwx 1 root root 43 2009-09-17 17:50 /etc/alternatives/javadoc.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javadoc.1.gz
lrwxrwxrwx 1 root root 31 2009-09-17 17:50 /etc/alternatives/javah -> /usr/lib/jvm/java-gcj/bin/javah
lrwxrwxrwx 1 root root 41 2009-09-17 17:50 /etc/alternatives/javah.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javah.1.gz
lrwxrwxrwx 1 root root 33 2009-09-11 10:06 /etc/alternatives/javap -> /usr/lib/jvm/java-6-sun/bin/javap
lrwxrwxrwx 1 root root 43 2009-09-11 10:06 /etc/alternatives/javap.1.gz -> /usr/lib/jvm/java-6-sun/man/man1/javap.1.gz
lrwxrwxrwx 1 root root 39 2009-09-11 10:06 /etc/alternatives/java_vm -> /usr/lib/jvm/java-6-sun/jre/bin/java_vm
lrwxrwxrwx 1 root root 38 2009-09-11 10:06 /etc/alternatives/javaws -> /usr/lib/jvm/java-6-sun/jre/bin/javaws
lrwxrwxrwx 1 root root 48 2009-09-11 10:06 /etc/alternatives/javaws.1.gz -> /usr/lib/jvm/java-6-sun/jre/man/man1/javaws.1.gz

javac is not using Sun Java 6. I have no idea how that happened, but it explains now the problem, should be checking javac version, not java version.

$ javac -version
Eclipse Java Compiler 0.894_R34x, 3.4.2 release, Copyright IBM Corp 2000, 2008. All rights reserved.

What the? I was writing Java code on this server by hand, but decided last night to install eclipse after the fact. Did this affect this. I’m not certain whether I installed eclipse before or after my work last night.

I try to change the alternatives again.

$ sudo update-alternatives --config java

There are 4 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
*         1    /usr/lib/jvm/java-6-sun/jre/bin/java
          2    /usr/bin/gij-4.3
          3    /usr/bin/gij-4.2
 +        4    /usr/lib/jvm/java-gcj/jre/bin/java

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.

$ javac -version
Eclipse Java Compiler 0.894_R34x, 3.4.2 release, Copyright IBM Corp 2000, 2008. All rights reserved.

That doesn’t work. One needs to know that java and javac operate independently.

$ sudo update-alternatives --config javac

There are 4 alternatives which provide `javac'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/lib/jvm/java-6-sun/bin/javac
          2    /usr/bin/ecj
          3    /usr/bin/gcj-wrapper-4.3
*+        4    /usr/lib/jvm/java-gcj/bin/javac

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/bin/javac' to provide 'javac'.
$ javac -version
javac 1.6.0_16

$ javac ExampleDrizzle.java

Buyer beware with Ubuntu and it’s rather messed up implementation approach toward alternative java JVM’s.

Getting started with Drizzle JDBC

In preparation for some Java work I wanted to configure and test the Drizzle JDBC Driver. Any chance to swing Drizzle into a MySQL discussion is worth the research. What I found was an issue compiling and an issue running on Ubuntu 9.04

You can start by downloading and building the Drizzle JDBC. My first problem was when I tried to build a usable .jar. I got errors in the test cases which caused by default no built .jar to work with. I raised Bug #432146 – org.drizzle.jdbc.MySQLDriverTest Tests fail. As I stated it may not be a real bug, but it seems at present that you require a running MySQL instance as well as a running Drizzle instance. In my case I didn’t have MySQL running, and I think to be fair, I should be able to build a Drizzle driver without MySQL.

Anyway, as per the Wiki Docs I proceeded to package without successful test cases. My next problem was more interesting, and perhaps found earlier from the tests?

I first created a test schema my code was going to use.

$ ~/drizzle/deploy/bin/drizzle
Your Drizzle connection id is 724
Server version: 2009.09.1126 Source distribution (trunk)

drizzle> create schema test_java;
Query OK, 1 row affected (0 sec)
drizzle> exit

I wrote a simple Java program.

$ cat ExampleDrizzle.java
import java.sql.*;

public class ExampleDrizzle {

  public static void main(String args[]) {

    try {
      Class.forName("org.drizzle.jdbc.Driver");
    } catch (Exception e) {
      System.out.println(e.getMessage());
      System.exit(1);
    }

    try {
      Connection con = DriverManager.getConnection("jdbc:drizzle://localhost:4427/test_java");
      Statement st = con.createStatement();
      st.executeUpdate("CREATE TABLE a (id int not null primary key, value varchar(20))");
      st.close();
      con.close();
    } catch (SQLException e) {
      System.out.println(e.getMessage());
    }
  }
}

Compiled.

$ javac ExampleDrizzle.java

Ran.

$ java ExampleDrizzle
org.drizzle.jdbc.Driver not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:mysql-connector-java-5.1.8-bin.jar,file:./], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}

Oops, been a while since using Java. I was amazed I could write the code in vi in the first place.

$ export CLASSPATH=drizzle-jdbc-0.5-SNAPSHOT.jar:.
$ java ExampleDrizzle
17-Sep-09 6:48:45 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol 
INFO: Connected to: localhost:4427
Exception in thread "main" java.lang.NoClassDefFoundError: org.drizzle.jdbc.DrizzleConnection
   at java.lang.Class.initializeClass(libgcj.so.90)
   at org.drizzle.jdbc.Driver.connect(Driver.java:74)
   at java.sql.DriverManager.getConnection(libgcj.so.90)
   at ExampleDrizzle.main(ExampleDrizzle.java:15)
Caused by: java.lang.ClassNotFoundException: java.sql.SQLFeatureNotSupportedException not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:drizzle-jdbc-0.5-SNAPSHOT.jar,file:./], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
   at java.net.URLClassLoader.findClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.Class.forName(libgcj.so.90)
   at java.lang.Class.initializeClass(libgcj.so.90)
   ...3 more

Hmmm, that’s disappointing. I thought about it a minute, figured some guidance would be beneficial , so I sought out the best Java person on #drizzle IRC. Getting a name, but no response from an initial inquiry after about a half hour I thought again at the problem. Just what java are you using?

$ java -version
java version "1.5.0"
gij (GNU libgcj) version 4.3.3

$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2009-07-17 12:36 /usr/bin/java -> /etc/alternatives/java

$ sudo find / -name java
[sudo] password for rbradfor:
/usr/lib/java
/usr/lib/ure/share/java
/usr/lib/jvm/java-6-sun-1.6.0.16/bin/java
/usr/lib/jvm/java-6-sun-1.6.0.16/jre/bin/java
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/bin/java
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/bin/java
/usr/bin/java
/usr/include/c++/4.3/gnu/java
/usr/include/c++/4.3/java
/usr/local/include/google/protobuf/compiler/java

$ ls -l /etc/alternatives/j*
...
lrwxrwxrwx   1 root root    33 2009-09-17 17:50 jar -> /usr/lib/jvm/java-gcj/jre/bin/jar
lrwxrwxrwx   1 root root    39 2009-09-17 17:50 jar.1.gz -> /usr/lib/jvm/java-gcj/man/man1/jar.1.gz
lrwxrwxrwx   1 root root    35 2009-09-17 17:50 jarsigner -> /usr/lib/jvm/java-gcj/bin/jarsigner
lrwxrwxrwx   1 root root    45 2009-09-17 17:50 jarsigner.1.gz -> /usr/lib/jvm/java-gcj/man/man1/jarsigner.1.gz
lrwxrwxrwx   1 root root    34 2009-09-17 17:50 java -> /usr/lib/jvm/java-gcj/jre/bin/java
lrwxrwxrwx   1 root root    40 2009-09-17 17:50 java.1.gz -> /usr/lib/jvm/java-gcj/man/man1/java.1.gz
lrwxrwxrwx   1 root root    31 2009-09-17 17:50 javac -> /usr/lib/jvm/java-gcj/bin/javac
lrwxrwxrwx   1 root root    41 2009-09-17 17:50 javac.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javac.1.gz
...

I wonder if I should use the real Sun Java.

$ sudo apt-get install sun-java6-jdk
Reading package lists... Done
Building dependency tree
Reading state information... Done
sun-java6-jdk is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
$ sudo update-alternatives --config java

There are 4 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/lib/jvm/java-6-sun/jre/bin/java
          2    /usr/bin/gij-4.3
          3    /usr/bin/gij-4.2
*+        4    /usr/lib/jvm/java-gcj/jre/bin/java

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.

$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2009-07-17 12:36 /usr/bin/java -> /etc/alternatives/java
$ ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 36 2009-09-17 18:53 /etc/alternatives/java -> /usr/lib/jvm/java-6-sun/jre/bin/java

Yep, it took a minute to discover the update-alternatives command, lucky I didn’t try that manually.

A second try.

$ javac ExampleDrizzle.java
$ java ExampleDrizzle
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol 
INFO: Connected to: localhost:4427
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol close
INFO: Closing connection
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.common.packet.AsyncPacketFetcher run
INFO: Connection closed

$ ~/drizzle/deploy/bin/drizzle test_java
Server version: 2009.09.1126 Source distribution (trunk)

drizzle> show tables;
+---------------------+
| Tables_in_test_java |
+---------------------+
| a                   |
+---------------------+
1 row in set (0 sec)

drizzle> desc a;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int         | NO   | PRI | NULL    |       |
| value | varchar(20) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0 sec)

And I’ve got a working testcase.

Engine agnostic MySQL test cases

Mark writes Now we all need the storage-engine independent test suite. I could not agree more. I have made comments about this probably as early as 4 years ago, and both before and while working for MySQL Inc.

There is however a way to do it with the current mysql-test syntax. While not ideal, it does actually work.
It took me like an hour to dig though old, old backup code, but I found it.

The Test Case:

$ cat t/engine_agnostic.test
CREATE TABLE i(id INT UNSIGNED NOT NULL);
let $ENGINE=`select variable_value from information_schema.global_variables where variable_name='STORAGE_ENGINE'`;
--replace_result $ENGINE ENGINE
SHOW CREATE TABLE i;

The Test Result:

cat r/engine_agnostic.result
CREATE TABLE i(id INT UNSIGNED NOT NULL);
SHOW CREATE TABLE i;
Table	Create Table
i	CREATE TABLE `i` (
  `id` int(10) unsigned NOT NULL
) ENGINE=ENGINE DEFAULT CHARSET=latin1

You can now drive different storage engine tests via using the default-storage-engine configuration option. It’s not ideal, and it’s not pretty, but it does work.

I should also say for 5.1+ versions.

How do I create a simple MySQL database

I was asked this question recently “I am wanting to create a simple MySQL database consisting of 5 tables”?

While it’s easy to tell people to RTFM, the question does warrant an answer for the MySQL beginner to provide a more specific guidance as to where to start, and what to do. As a expert in MySQL it’s easy to forget how you would describe what to do. Here are my tips to getting started.

Step 1. Download the MySQL 5.1 software for your platform (e.g. Linux, Windows, Mac etc) from MySQL 5.1 Downloads. There are many different versions of MySQL, MySQL 5.1 is the current production version.

Step 2. You will need to install the MySQL software. The MySQL reference manual is the place to go, Chapter 2 describes installing MySQL. You can also download a copy of the manual in various different formats at MySQL Documentation. This is also valuable for the time when the documentation may be be unavailable online.

Step 3. Download a GUI tool to help you in the design of your first MySQL Tables. There are a number of products available to do this, the MySQL Query Browser and WebYog are just two examples. If your bold, you can use the mysql client command line tool and use the CREATE TABLE command to create your table structures.

MySQL by itself is ineffective for producing a client facing end result unless you have an application purpose and therefore a general application to access the data in MySQL. Using a LAMP/WAMP stack is a good place to start. XAMPP is a good cross platform program that gives you MySQL and a PHP technology stack. You also get PhpMyAdmin included with XAMPP which is a good web based design tool. I don’t mention earlier because it needs a running php/apache/mysql environment. If you elect to start with this stack, then you don’t need to install any GUI tools.

Finally, there a wealth of knowledge, not at least the MySQL Forums and the #mysql channel on irc.freenode.net which can be good places to get free beginner information.

How do I find the storage engine of a MySQL table

This seems quite a trivial question, but developers don’t often know what a MySQL storage engine is and how to determine what storage engine is used for a table.

The first choice is to describe the table with the DESC[RIBE] command. Side Note: people often don’t realize that DESC is a short acceptable version here.

mysql> desc stats;
+---------+---------------------+------+-----+-------------------+----------------+
| Field   | Type                | Null | Key | Default           | Extra          |
+---------+---------------------+------+-----+-------------------+----------------+
| stat_id | int(10) unsigned    | NO   | PRI | NULL              | auto_increment |
| created | timestamp           | NO   |     | CURRENT_TIMESTAMP |                |
| version | tinyint(3) unsigned | NO   |     | NULL              |                |
| referer | varchar(500)        | NO   |     | NULL              |                |
| q       | varchar(100)        | YES  |     | NULL              |                |
| uri     | varchar(100)        | YES  |     | NULL              |                |
| server  | text                | NO   |     | NULL              |                |
| headers | text                | YES  |     | NULL              |                |
+---------+---------------------+------+-----+-------------------+----------------+

Alias, this command does not provide the details of the storage engine.
You need to use the SHOW CREATE TABLE as a means to get a more detailed description including the storage engine.

mysql> SHOW CREATE TABLE statsG
*************************** 1. row ***************************
       Table: stats
Create Table: CREATE TABLE `stats` (
  `stat_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `version` tinyint(3) unsigned NOT NULL,
  `referer` varchar(500) NOT NULL,
  `q` varchar(100) DEFAULT NULL,
  `uri` varchar(100) DEFAULT NULL,
  `server` text NOT NULL,
  `headers` text,
  PRIMARY KEY (`stat_id`)
) ENGINE=MyISAM AUTO_INCREMENT=21964 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

As another side note tip, try the G instead of a semicolon (;) which provides a better vertical output.

Another option is to query the INFORMATION_SCHEMA.TABLES meta data.

mysql> SELECT table_name,engine FROM INFORMATION_SCHEMA.TABLES WHERE table_schema=DATABASE();
+--------------+--------+
| table_name   | engine |
+--------------+--------+
| stats        | MyISAM |
+--------------+--------+
1 row in set (0.00 sec)

Monitoring MySQL – The error log

It is important that you monitor the MySQL error log. There are a few different options available for defining the details of the log. If not specified the default is [datadir]/[hostname].err. This is not an ideal location for 2 reasons.

First, a correctly configured MySQL datadir will have permissions only for the mysql user, and this is generally restrictive access to the user only and no group or world permissions. Other users/groups should have limited access to the mysql error log.

Second, the datadir is for data, not logs, especially logs that can potentially fill file systems. I am referring here to more then just the error log.

I would recommend you create a separate directory for MySQL logs such as the error, slow and general logs. An example I implement for single installation environments using Linux mysql packages is:

mkdir /var/log/mysql
chown mysql:mysql /var/log/mysql
chmod 750 /var/log/mysql

There does not seem to be a consensus over whether to include the hostname or not in the error log filename. My preference is to not include. I would rather the filename to be consistent across multiple servers. The argument is what about when consolidating logs from multiple servers. I discount this because you have to connect to the server to retrieve logs, create a sub directory of that hostname for consolidated logs.

With Linux distributions you may not find log files where you expect. Ubuntu packages for example has the log going to syslog. While the theory is to make system logging and monitoring easier, it makes MySQL specific monitoring more difficult. You also suffer a logrotate problem where you may only have 7 days of log. I prefer to have access to all historical MySQL log information.

The best choice is to define the error log with log-error, in both the [mysqld_safe] and [mysqld] section of your servers my.cnf

[mysqld_safe]
log-error=/var/log/mysql/error.log

[mysqld]
log-error=/var/log/mysql/error.log

In MySQL 5.1 you have the luxury of different output sources, FILE, TABLE or BOTH for the general log and the slow log with –log-output. No option exists for the error log.

Other my.cnf options to be aware of include:

  • log-warnings | skip-log-warnings
  • syslog | skip-syslog

There is generally also lacking in the standard monitoring products/plugins that present MySQL status information. In my monitoring MySQL solutions I provide a line count of the MySQL error log, so that a delta can be easily detected and then reviewed more proactively.

One issue with a recent client is the lack of access to the physical box by different parties and therefore the lack of access to the log. The identification that something needs to be viewed, then the ability to be able to view is an important problem to be solved.

References

Some other references for MySQL error log monitoring.

Explain this

The EXPLAIN command is an important tool to review how a SQL query is executed and in this example includes what indexes are used.

By adding a covering index I ended up with the following EXPLAIN plan I was unable to explain. The end result was a boost in server performance which was the ultimate goal.

mysql> explain select max(md)  from e_r  where email = '[email protected]' and id = '36981';
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

The queries still produced the expected results.

MySQL Replication 102

One of the most asked questions is how to setup MySQL replication. The MySQL Reference Manual provides a good Replication How To as a starting guide on MySQL Replication 101.

MySQL replication has many uses including read scalability, backups, failover, online maintenance, upgrade testing and verification, software upgrades, alternative data or structure for performance queries and even benchmarking to name the popular uses.

When reviewing an installation of MySQL replication I use the following as part of my checklist of steps used for confirming your replication environment is operational.

Master Variables

  • server-id – Replication will not work without this correctly set and unique
  • log-bin – Pre-requisite for working replication
  • log-bin-index
  • max_binlog_size
  • binlog_cache_size
  • expire_logs_days – a value from 5 to 10 is good, not set can result in a full disk.
  • binlog-do-db/binlog-ignore-db – Use with caution
  • sync_binlog
  • innodb_support_xa

Slave Variables

  • server-id – Replication will not work without this correctly set and unique
  • read_only = TRUE
  • log-bin – may or may not be present
  • relay-log
  • relay-log-index
  • max_binlog_size
  • binlog_cache_size
  • expire_logs_days – a value from 5 to 10 is good, not set can result in a full disk.
  • replicate-do-???? – Warning, use these with caution. Your slave will not be the same as your master.
  • slave-skip-errors – Warning, this can lead to your slave being inconsistent with your slave.

On the Master I audit the following information.

  • SHOW MASTER STATUS
    • If any Binlog_Do_DB, then a SHOW SCHEMAS for verification
  • SHOW MASTER LOGS
    • Confirm physical files as well as available diskspace on log-bin disk partition
  • SHOW SLAVE STATUS (in a true master/slave environment this should be empty)
  • SHOW GLOBAL VARIABLES LIKE ‘binlog_cache_size';
  • SHOW GLOBAL STATUS LIKE ‘Binlog%’
  • SELECT host,user,password FROM mysql.user WHERE Repl_slave_priv=’Y’ AND Super_priv=’N';

On the Slave I audit the following information.

  • SHOW SLAVE STATUS
  • SHOW MASTER STATUS – This will determine if you have log-bin enabled on the slave

The key information for MySQL slaves is in the SHOW SLAVE STATUS command. An example output is:

mysql> show slave statusG
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 10.10.1.1
                Master_User: slave
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: bin-log.001817
        Read_Master_Log_Pos: 369684547
             Relay_Log_File: relay-log.000449
              Relay_Log_Pos: 42347742
      Relay_Master_Log_File: bin-log.001817
           Slave_IO_Running: No
          Slave_SQL_Running: No
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 369684547
            Relay_Log_Space: 42347742
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: NULL
1 row in set (0.00 sec)

It is important that you learn and understand these values. In this above case, replication is NOT running as indicated by Slave_IO_Running and Slave_SQL_Running.

This information is just an introduction as to what to look at. In my next lesson, I’ll spend more detail of the output of the various commands, as well as describe in greater detail the relationship of underlying files that are important for a working MySQL Replication environment.

Other References

Verifying MySQL Replication in Action
MySQL Replication Architecture

Where can you find MySQL Events?

As a frequent traveler for my MySQL consulting (last 4 weeks were Sydney, San Francisco, New York and Vancouver), I like to keep abreast of any local tech event that includes MySQL that I may be able to attend.

Now there is a consolidated location that you can use, the Open Source Events Calendar. Kudos to the MySQL Community team members Lenz Grimmer and Giuseppe Maxia who have put this together.

We need your help. If you have a local event, please submit your event request. This projects needs the support of all.

You will also find valuable conference information including dates for close of proposals. A great tool for scheduling your upcoming conference year.

InnoDB I_S.tables.table_rows out by a factor of 100x

I’ve always believed that the MySQL Information_schema.tables.table_rows figure for Innodb tables to be while approximate, approximately accurate.

Today I found that the figures varied on one table from 10x to 100x wrong.

Before performing an ALTER I always verify sizes for reference.

+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| table_s | table_name | engine | row_format | table_rows | avg_row_length | total_mb   | data_mb    | index_mb  | today      |
+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| XXXXX   | s_a        | InnoDB | Compact    |     208993 |           7475 |  1491.5312 |  1490.0156 |    1.5156 | 2009-09-09 |

mysql> alter table s_a modify col VARCHAR(255);
Query OK, 23471 rows affected (4 min 26.23 sec)

After

+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| XXXXX   | s_a        | InnoDB | Compact    |    2407063 |            629 |  1447.0312 |  1445.5156 |    1.5156 | 2009-09-09 |

I was so caught out by this. The table reported 200k rows, but the alter returned 23k, that’s like 10x out.
I ran my query again, and the second time I got.

+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| XXXXX   | s_a        | InnoDB | Compact    |      21813 |          69487 |  1447.0312 |  1445.5156 |    1.5156 | 2009-09-09 |

This is closer to what I’d expected, 22k verses 23k.

I have to say, while I have always treated the data and index size as accurate, I now how little confidence in the table_rows any more.

MySQL DML stats per table

MySQL provides a level of statistics for your INSERT, UPDATE, DELETE, REPLACE Data Manipulation Language (DML) commands using the STATUS output of various Com_ variables, however it is per server stats. I would like per table stats.

You can achieve this with tools such as MySQL Proxy and mk-query-digest, however there is actually a very simple solution that requires no additional tools.
The following 1 line Linux command (reformatted for ease of reading) gave me exactly what I wanted, and it had ZERO impact on the database.

$ mysqlbinlog /path/to/mysql-bin.000999 |  
   grep -i -e "^update" -e "^insert" -e "^delete" -e "^replace" -e "^alter"  | 
   cut -c1-100 | tr '[A-Z]' '[a-z]' |  
   sed -e "s/t/ /g;s/`//g;s/(.*$//;s/ set .*$//;s/ as .*$//" | sed -e "s/ where .*$//" |  
   sort | uniq -c | sort -nr


  33389 update e_acc
  17680 insert into r_b
  17680 insert into e_rec
  14332 insert into rcv_c
  13543 update e_rec
  10805 update loc
   3339 insert into r_att
   2781 insert into o_att
...

Granted the syntax could do with some regex improvements, but in 2 minutes I was able to deduce some approximate load. The mysqlbinlog command also gives option to retrieve data for a given time period, so it is very easy to get these statistics on a per hour basis.

Sometimes the most simple options are right in front of you, just just need to strive to find the simplest solution.

Has your blog been hacked?

While not a MySQL topic, as most of my readers view my MySQL Blog, my WordPress blog has been hacked? Has yours?

Like many, I’m sure you may have read about it like at WordPress blogs under attack from hack attack but I was surprised when my custom permlinks did not work.

Being surprised I looked at Administrator accounts, and I found that there was one more number then being displayed in the list. I had to dig into the database to find the problem.

mysql> select * from wp_users where ID in (select user_id from wp_usermeta where meta_key = 'wp_capabilities' and meta_value like '%admin%');
+-----+-------------+------------------------------------+---------------+------------------------------+---------------------------+---------------------+---------------------+-------------+--------------+
| ID  | user_login  | user_pass                          | user_nicename | user_email                   | user_url                  | user_registered     | user_activation_key | user_status | display_name |
+-----+-------------+------------------------------------+---------------+------------------------------+---------------------------+---------------------+---------------------+-------------+--------------+
|   1 | admin       | $P$BHZFK/prDplb/W/024yrH49JvAmmCE. | ronald        | [email protected] | http://ronaldbradford.com | 2005-11-21 23:43:47 |                     |           0 | Ronald       |
| 127 | ronald      | $P$B..e75VtFsv9bUGj5H5NTiXXPQIitr1 | ronald        | [email protected]    | http://ronaldbradford.com | 2009-02-22 20:13:33 |                     |           0 | ronald       |
| 133 | ChaseKent87 | $P$Bl8cVSzBums33Md6u2PQtUVY2PPBHK. | chasekent87   |                              |                           | 2009-09-05 06:36:59 |                     |           0 | ChaseKent87  |
+-----+-------------+------------------------------------+---------------+------------------------------+---------------------------+---------------------+---------------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> delete from wp_users where ID=133;
mysql> delete from wp_usermeta where user_id=133;

However the damage has been done, and an update to the recommend 2.8.4 is unlikely to fix the data corruption.

Being a good DBA I have a nightly backup of my database. Being a diligent system administrator, I have not 1 copy, by 3 copies of my system, one on my web site and two offsite.

The problem is I don’t keep older backups of my data, only a day old version.

What do you monitor in MySQL?

If you are unfamiliar with what to monitor in MySQL, starting with looking at what popular Monitoring products monitor. For example, the following is the list of MySQL Cacti Plugin measurements.

Innodb Buffer Pool Activity

  • Pages Created
  • Pages Written
  • Pages Read

Innodb Buffer Pool Pages

  • Pool Size
  • Database Pages
  • Free Pages
  • Modified Pages

Inoodb File I/O

  • File Reads
  • Files Writes
  • Log Writes
  • File Fsyncs

Innodb Pending I/O

  • Aio Log Ios
  • Aio Sync ios
  • Buffer Pool Flushes
  • Chkp Writes
  • Ibuf Aio Reads
  • Log Flushes
  • Log Writes
  • Normal Aio Reads
  • Normal Aio Writes

Innodb Insert Buffer

  • Inserts
  • Merged
  • Merges

Innodb Log

  • Log Buffer Size
  • Log Bytes Written
  • Log Bytes Flushed
  • Unflushed Log

Innodb Row Operations

  • Rows Read
  • Rows Deleted
  • Rows Updated
  • Rows Inserted

Innodb Semaphores

  • Spin Rounds
  • Spin Waits
  • OS Waits

Innodb Transactions

  • Innodb Transactions
  • Current Transactions
  • History List
  • Read Views

MySQL Binary/Relay Logs

  • Binlog Cache use
  • Binlog Cache Disk Use
  • Binary Log Space
  • Relay Log Space

MySQL Command Counters

  • Questions
  • SELECT
  • DELETE
  • INSERT
  • UPDATE
  • REPLACE
  • LOAD
  • DELETE MULTI
  • INSERT SELECT
  • UPDATE MULTI
  • REPLACE SELECT

MySQL Connections

  • Max Connections
  • Max Used Connections
  • Aborted Clients
  • Aborted Connects
  • Threads Connected
  • Connections

MySQL Files and Tables

  • Table Cache
  • Open Tables
  • Open Files
  • Opened Tables

MySQL Network Traffic

  • Bytes Received
  • Bytes Sent

MySQL Processlist

  • State Closing Tables
  • State Copying to Tmp Table
  • State End
  • State Freeing Items
  • State Init
  • State Locked
  • State Login
  • State Preparing
  • State Reading From Net
  • State Sending Data
  • State Sorting Result
  • State Statistics
  • State Updating
  • State Writing to Net
  • State None
  • State Other

MySQL Query Cache

  • Queries In Cache
  • Hits
  • Inserts
  • Not Cached
  • Lowmem Prunes

MySQL Query Cache Memory

  • Query Cache Size
  • Free Memory
  • Total Blocks
  • Free Blocks

MySQL Replication

  • Slave Running
  • Slave Stopped
  • Slave Lag
  • Slave Open Temp Tables
  • Slave Retried Transactions

MySQL Select Types

  • Select Full Join
  • Select Full Range Join
  • Select Range
  • Select Range Check
  • Select Scan

MySQL Sorts

  • Sort Rows
  • Sort Range
  • Sort Merge Passes
  • Sort Scan

MySQL Table Locks

  • Table Locks Immediate
  • Table Locks Waited
  • Slow Queries

MySQL Temporary Objects

  • Created Tmp Tables
  • Created Tmp Disk Tables
  • Created Tmp Files

MySQL Threads

  • Thread Cache Size
  • Threads Created

SQL Analysis with MySQL Proxy – Part 2

As I outlined in Part 1 MySQL Proxy can be one tool for performing SQL analysis. The impact with any monitoring is the art of monitoring will affect the results, in this case the performance. I don’t recommend enabling this level of detailed monitoring in production, these techniques are designed for development, testing, and possibly stress testing.

This leads to the question, how do I monitor SQL in production? The simple answer to this question is, Sampling. Take a representative sample of your production system. The implementation of this depends on many factors including your programming technology stack, and your MySQL topology.

If for example you are using PHP, then defining MySQL proxy on a production system, and executing firewall rules to redirect incoming 3306 traffic to 4040 for a period of time, e.g. 2 seconds can provide a wealth of information as to what’s happening on the server now. I have used this very successfully in production as an information gathering an analysis tool. It is also reasonably easy to configure, execute and the impact on any failures for example are minimized due to the sampling time.

If you run a distributed environment with MySQL Slaves, or many application servers, you can also introduce sampling to a certain extent as these specific points, however like scaling options, it is key to be able to handle and process the write load accurately.

Another performance improvement is to move processing of the gathered information in MySQL proxy to a separate thread or process, removing this work from the thread execution path and therefore increasing the performance. I’m interested to explore the option of passing this information off to memcached or gearman and having MySQL proxy simply capture the packet information and distributing the output. I have yet to see how memcached and/or gearman integrate with the Lua/C bindings. If anybody has experience or knowledge I would be interested to know more.

It is interesting to know that Drizzle provides a plugin to send this level of logging information to gearman automatically.

SQL query analysis with MySQL Proxy

Long before there was the official Query Analyzer (QUAN), a component of MySQL Enterprise, SQL analysis was possible using MySQL Proxy.

The following is an introduction to logging and query analysis with MySQL Proxy.

Get MySQL Proxy

You need to first download MySQL Proxy. In this example I am using the Linux RHEL5 64bit OS and Version 0.7.2

$ wget http://dev.mysql.com/get/Downloads/MySQL-Proxy/mysql-proxy-0.7.2-linux-rhel5-x86-64bit.tar.gz/from/http://mirror.trouble-free.net/mysql_mirror/
$ tar xvfz mysql-proxy-0.7.2-linux-rhel5-x86-64bit.tar.gz
$ ln -s mysql-proxy-0.7.2-linux-rhel5-x86-64bit mysql-proxy
$ export PATH=`pwd`/mysql-proxy/sbin:$PATH
$ mysql-proxy --help-all

Pre-requisites

MySQL Proxy uses TCP/IP, so it is important you connect via the actual hostname. You should first confirm this, as appropriate MySQL permissions may be necessary. For example:

$ mysql -h`hostname` -u -p

On confirmation this works, you can then connect directly to the proxy

$ mysql -h`hostname` -P4040 -u -p

Logging

$ cd mysql-proxy/share/doc/mysql-proxy/
$ wget -O log.lua http://ronaldbradford.com/mysql-dba/mysql-proxy/log.lua
$ mysql-proxy --proxy-lua-script=share/doc/mysql-proxy/log.lua &
$ tail -f mysql.log

This script is based on simple query logging which requires a modification to work in more current versions of MySQL proxy.

$ mysql -hhostname -P4040 -u -p
mysql>  SELECT host,user,password FROM mysql.user;
mysql>  SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema;
mysql>  SELECT NOW(), SLEEP(3);
mysql>  EXIT
$ cat mysql.log
2009-09-02 17:15:01     58 -- select @@version_comment limit 1
2009-09-02 17:16:15     58 -- SELECT host,user,password FROM mysql.user
2009-09-02 17:16:30     58 -- SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema
2009-09-02 17:16:39     58 -- SELECT NOW(), SLEEP(3)

Query Analysis

Restart proxy with the histogram.lua sample provided.

$ mysql-proxy --proxy-lua-script=share/doc/mysql-proxy/histogram.lua &

Connect and run some queries.

$ mysql -hhostname -P4040 -u -p
mysql>  SELECT host,user,password FROM mysql.user;
mysql>  SELECT table_schema,COUNT(*) FROM information_schema.tables GROUP BY table_schema;
mysql>  SELECT NOW(), SLEEP(3);

While connected to the proxy, you can now review data from two pseudo tables.

mysql>  SELECT * FROM histogram.tables;
mysql>  SELECT * FROM histogram.queriesG
mysql>  DELETE FROM histogram.tables;
mysql>  DELETE FROM histogram.queries;

mysql> SELECT * FROM histogram.tables;
+---------------------------+-------+--------+
| table                     | reads | writes |
+---------------------------+-------+--------+
| information_schema.tables |     1 |      0 |
| mysql.user                |     1 |      0 |
+---------------------------+-------+--------+

mysql> SELECT * FROM histogram.queries;
+--------------------------------------------------------------------------------------------------+-------+----------------+----------------+
| query                                                                                            | count | max_query_time | avg_query_time |
+--------------------------------------------------------------------------------------------------+-------+----------------+----------------+
| SELECT @@version_comment LIMIT ?                                                                 |     1 |            300 |            300 |
| SELECT `table_schema` , COUNT( * ) FROM `information_schema` . `tables` GROUP BY `table_schema`  |     1 |           1822 |           1822 |
| SELECT `host` , `user` , `password` FROM `mysql` . `user`                                        |     1 |            494 |            494 |
| SELECT NOW( ) , SLEEP( ? )                                                                       |     1 |        3000735 |        3000735 |
+--------------------------------------------------------------------------------------------------+-------+----------------+----------------+

Moving forward

The power is that with Lua you have the flexibility to write your own logging. Some improvements to these scripts could be.

  • Add the query time, number of rows, and result set size to the logging
  • Be able to sort histogram results or see top percentile. Being able to copy data into real tables would enable any level of analysis
  • Combine the logging and histogram scripts
  • Enable global enable/disabling of logging with SET GLOBAL commands
  • Support variable length IN queries, those that pass multiple values, so you end up with a subset of all queries
  • Provide a actual query example, making it easy to do a QEP. For normalized queries you need to do additional work to find values.
  • The histogram does not support the C API multi query functionality, where multiple queries can be passed to the server at one time. The problem is there is no way to time the individual queries.

Read on in SQL Analysis with MySQL Proxy – Part 2.

References

A good introduction document
MySQL Proxy – From architecture to implementation – OSCON 2008