Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Archive for the ‘General’ Category

My favorite MySQL data type – DECIMAL(31,0)

Friday, September 18th, 2009

It may seem hard to believe, but I have seen DECIMAL(31,0) in action on a production server. Not just in one column, but in 15 columns just in the largest 4 tables of one schema. The column was being used to represent a integer primary or foreign key column.

In a representative production instance (one of a dozen plus distributed production database servers) the overall database footprint was decreased from ~10 GB to ~2 GB, a 78% saving. In total, 15 columns across just 4 tables were changed from DECIMAL(31,0) to INT UNSIGNED.

One single table > 5GB was reduced to under 1GB (a 81% saving). This being my record for any GB+ tables in my time working with the MySQL database.

Had this server for example had 4GB of RAM, and say 2.5GB allocated to the innodb_buffer_pool_size, this one change moved the system from requiring more consistent disk access (4x data to memory) to being able to store all data in memory. Tests showed a clear improvement in Innodb buffer pool reads and hit ratio.

Today’s lesson as described in my 2008 conference presentation Top 20 design tips for data architects is, choose the right integer data type for your data.

More woes with java version on Ubuntu

Friday, September 18th, 2009

Armed with more information on Drizzle JDBC being a JDBC 4.0 implementation (helps to explain my issues in Getting started with Drizzle JDBC) I took the time to read about some other new JDBC 4.0 features.

There was reference to handling chained exceptions, however when trying to get this working for SQLException was more complex on Ubuntu 9.04 then I anticipated.

My first problem was an apparent source level problem.

$ javac ExampleDrizzle.java
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	    ^^^^^^^^^^^^^^^^
Syntax error, 'for each' statements are only available if source level is 1.5

That’s weird, what java version was I running now I’d changed with update-alternatives –config java yesterday.

$ java -version
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)

No issues here, a quick man reference gives me:

-1.5                    set compliance level to 1.5

I try that, and well that fixes one problem, but creates another.

$ javac -1.5 ExampleDrizzle.java
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	                  ^^
Can only iterate over an array or an instance of java.lang.Iterable

Now Class SQLException 1.6 javadocs shows SQLException as implementing the generics Iterable<Throwable>, while 1.5 javadoc does not. I guess I need to use 1.6 then.

$ javac -1.6 ExampleDrizzle.java
Annotation processing got disabled, since it requires a 1.6 compliant JVM
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	                  ^^
Can only iterate over an array or an instance of java.lang.Iterable

Wait a minute, I’m using a 1.6 compliant JVM. Double checking

$ ls -al /etc/alternatives/java*
lrwxrwxrwx 1 root root 36 2009-09-17 18:53 /etc/alternatives/java -> /usr/lib/jvm/java-6-sun/jre/bin/java
lrwxrwxrwx 1 root root 46 2009-09-17 18:53 /etc/alternatives/java.1.gz -> /usr/lib/jvm/java-6-sun/jre/man/man1/java.1.gz
lrwxrwxrwx 1 root root 31 2009-09-17 17:50 /etc/alternatives/javac -> /usr/lib/jvm/java-gcj/bin/javac
lrwxrwxrwx 1 root root 41 2009-09-17 17:50 /etc/alternatives/javac.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javac.1.gz
lrwxrwxrwx 1 root root 33 2009-09-17 17:50 /etc/alternatives/javadoc -> /usr/lib/jvm/java-gcj/bin/javadoc
lrwxrwxrwx 1 root root 43 2009-09-17 17:50 /etc/alternatives/javadoc.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javadoc.1.gz
lrwxrwxrwx 1 root root 31 2009-09-17 17:50 /etc/alternatives/javah -> /usr/lib/jvm/java-gcj/bin/javah
lrwxrwxrwx 1 root root 41 2009-09-17 17:50 /etc/alternatives/javah.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javah.1.gz
lrwxrwxrwx 1 root root 33 2009-09-11 10:06 /etc/alternatives/javap -> /usr/lib/jvm/java-6-sun/bin/javap
lrwxrwxrwx 1 root root 43 2009-09-11 10:06 /etc/alternatives/javap.1.gz -> /usr/lib/jvm/java-6-sun/man/man1/javap.1.gz
lrwxrwxrwx 1 root root 39 2009-09-11 10:06 /etc/alternatives/java_vm -> /usr/lib/jvm/java-6-sun/jre/bin/java_vm
lrwxrwxrwx 1 root root 38 2009-09-11 10:06 /etc/alternatives/javaws -> /usr/lib/jvm/java-6-sun/jre/bin/javaws
lrwxrwxrwx 1 root root 48 2009-09-11 10:06 /etc/alternatives/javaws.1.gz -> /usr/lib/jvm/java-6-sun/jre/man/man1/javaws.1.gz

javac is not using Sun Java 6. I have no idea how that happened, but it explains now the problem, should be checking javac version, not java version.

$ javac -version
Eclipse Java Compiler 0.894_R34x, 3.4.2 release, Copyright IBM Corp 2000, 2008. All rights reserved.

What the? I was writing Java code on this server by hand, but decided last night to install eclipse after the fact. Did this affect this. I’m not certain whether I installed eclipse before or after my work last night.

I try to change the alternatives again.

$ sudo update-alternatives --config java

There are 4 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
*         1    /usr/lib/jvm/java-6-sun/jre/bin/java
          2    /usr/bin/gij-4.3
          3    /usr/bin/gij-4.2
 +        4    /usr/lib/jvm/java-gcj/jre/bin/java

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.

$ javac -version
Eclipse Java Compiler 0.894_R34x, 3.4.2 release, Copyright IBM Corp 2000, 2008. All rights reserved.

That doesn’t work. One needs to know that java and javac operate independently.

$ sudo update-alternatives --config javac

There are 4 alternatives which provide `javac'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/lib/jvm/java-6-sun/bin/javac
          2    /usr/bin/ecj
          3    /usr/bin/gcj-wrapper-4.3
*+        4    /usr/lib/jvm/java-gcj/bin/javac

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/bin/javac' to provide 'javac'.
$ javac -version
javac 1.6.0_16

$ javac ExampleDrizzle.java

Buyer beware with Ubuntu and it’s rather messed up implementation approach toward alternative java JVM’s.

Explain this

Monday, September 14th, 2009

The EXPLAIN command is an important tool to review how a SQL query is executed and in this example includes what indexes are used.

By adding a covering index I ended up with the following EXPLAIN plan I was unable to explain. The end result was a boost in server performance which was the ultimate goal.

mysql> explain select max(md)  from e_r  where email = 'xxxx@gmail.com' and id = '36981';
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.00 sec)

The queries still produced the expected results.

MySQL Replication 102

Monday, September 14th, 2009

One of the most asked questions is how to setup MySQL replication. The MySQL Reference Manual provides a good Replication How To as a starting guide on MySQL Replication 101.

MySQL replication has many uses including read scalability, backups, failover, online maintenance, upgrade testing and verification, software upgrades, alternative data or structure for performance queries and even benchmarking to name the popular uses.

When reviewing an installation of MySQL replication I use the following as part of my checklist of steps used for confirming your replication environment is operational.

Master Variables

  • server-id – Replication will not work without this correctly set and unique
  • log-bin – Pre-requisite for working replication
  • log-bin-index
  • max_binlog_size
  • binlog_cache_size
  • expire_logs_days – a value from 5 to 10 is good, not set can result in a full disk.
  • binlog-do-db/binlog-ignore-db – Use with caution
  • sync_binlog
  • innodb_support_xa

Slave Variables

  • server-id – Replication will not work without this correctly set and unique
  • read_only = TRUE
  • log-bin – may or may not be present
  • relay-log
  • relay-log-index
  • max_binlog_size
  • binlog_cache_size
  • expire_logs_days – a value from 5 to 10 is good, not set can result in a full disk.
  • replicate-do-???? – Warning, use these with caution. Your slave will not be the same as your master.
  • slave-skip-errors – Warning, this can lead to your slave being inconsistent with your slave.

On the Master I audit the following information.

  • SHOW MASTER STATUS
    • If any Binlog_Do_DB, then a SHOW SCHEMAS for verification
  • SHOW MASTER LOGS
    • Confirm physical files as well as available diskspace on log-bin disk partition
  • SHOW SLAVE STATUS (in a true master/slave environment this should be empty)
  • SHOW GLOBAL VARIABLES LIKE ‘binlog_cache_size’;
  • SHOW GLOBAL STATUS LIKE ‘Binlog%’
  • SELECT host,user,password FROM mysql.user WHERE Repl_slave_priv=’Y’ AND Super_priv=’N';

On the Slave I audit the following information.

  • SHOW SLAVE STATUS
  • SHOW MASTER STATUS – This will determine if you have log-bin enabled on the slave

The key information for MySQL slaves is in the SHOW SLAVE STATUS command. An example output is:

mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 10.10.1.1
                Master_User: slave
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: bin-log.001817
        Read_Master_Log_Pos: 369684547
             Relay_Log_File: relay-log.000449
              Relay_Log_Pos: 42347742
      Relay_Master_Log_File: bin-log.001817
           Slave_IO_Running: No
          Slave_SQL_Running: No
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 369684547
            Relay_Log_Space: 42347742
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: NULL
1 row in set (0.00 sec)

It is important that you learn and understand these values. In this above case, replication is NOT running as indicated by Slave_IO_Running and Slave_SQL_Running.

This information is just an introduction as to what to look at. In my next lesson, I’ll spend more detail of the output of the various commands, as well as describe in greater detail the relationship of underlying files that are important for a working MySQL Replication environment.

Other References

Verifying MySQL Replication in Action
MySQL Replication Architecture

Has your blog been hacked?

Tuesday, September 8th, 2009

While not a MySQL topic, as most of my readers view my MySQL Blog, my WordPress blog has been hacked? Has yours?

Like many, I’m sure you may have read about it like at Wordpress blogs under attack from hack attack but I was surprised when my custom permlinks did not work.

Being surprised I looked at Administrator accounts, and I found that there was one more number then being displayed in the list. I had to dig into the database to find the problem.

mysql> select * from wp_users where ID in (select user_id from wp_usermeta where meta_key = 'wp_capabilities' and meta_value like '%admin%');
+-----+-------------+------------------------------------+---------------+------------------------------+---------------------------+---------------------+---------------------+-------------+--------------+
| ID  | user_login  | user_pass                          | user_nicename | user_email                   | user_url                  | user_registered     | user_activation_key | user_status | display_name |
+-----+-------------+------------------------------------+---------------+------------------------------+---------------------------+---------------------+---------------------+-------------+--------------+
|   1 | admin       | $P$BHZFK/prDplb/W/024yrH49JvAmmCE. | ronald        | ronald.bradford@xxxx.xxx.xx | http://ronaldbradford.com | 2005-11-21 23:43:47 |                     |           0 | Ronald       |
| 127 | ronald      | $P$B..e75VtFsv9bUGj5H5NTiXXPQIitr1 | ronald        | ronald.bradford@xxxxx.xxx    | http://ronaldbradford.com | 2009-02-22 20:13:33 |                     |           0 | ronald       |
| 133 | ChaseKent87 | $P$Bl8cVSzBums33Md6u2PQtUVY2PPBHK. | chasekent87   |                              |                           | 2009-09-05 06:36:59 |                     |           0 | ChaseKent87  |
+-----+-------------+------------------------------------+---------------+------------------------------+---------------------------+---------------------+---------------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> delete from wp_users where ID=133;
mysql> delete from wp_usermeta where user_id=133;

However the damage has been done, and an update to the recommend 2.8.4 is unlikely to fix the data corruption.

Being a good DBA I have a nightly backup of my database. Being a diligent system administrator, I have not 1 copy, by 3 copies of my system, one on my web site and two offsite.

The problem is I don’t keep older backups of my data, only a day old version.

SQL Analysis with MySQL Proxy – Part 2

Thursday, September 3rd, 2009

As I outlined in Part 1 MySQL Proxy can be one tool for performing SQL analysis. The impact with any monitoring is the art of monitoring will affect the results, in this case the performance. I don’t recommend enabling this level of detailed monitoring in production, these techniques are designed for development, testing, and possibly stress testing.

This leads to the question, how do I monitor SQL in production? The simple answer to this question is, Sampling. Take a representative sample of your production system. The implementation of this depends on many factors including your programming technology stack, and your MySQL topology.

If for example you are using PHP, then defining MySQL proxy on a production system, and executing firewall rules to redirect incoming 3306 traffic to 4040 for a period of time, e.g. 2 seconds can provide a wealth of information as to what’s happening on the server now. I have used this very successfully in production as an information gathering an analysis tool. It is also reasonably easy to configure, execute and the impact on any failures for example are minimized due to the sampling time.

If you run a distributed environment with MySQL Slaves, or many application servers, you can also introduce sampling to a certain extent as these specific points, however like scaling options, it is key to be able to handle and process the write load accurately.

Another performance improvement is to move processing of the gathered information in MySQL proxy to a separate thread or process, removing this work from the thread execution path and therefore increasing the performance. I’m interested to explore the option of passing this information off to memcached or gearman and having MySQL proxy simply capture the packet information and distributing the output. I have yet to see how memcached and/or gearman integrate with the Lua/C bindings. If anybody has experience or knowledge I would be interested to know more.

It is interesting to know that Drizzle provides a plugin to send this level of logging information to gearman automatically.

Seeking public data for benchmarks

Friday, August 28th, 2009

I have several side projects when time permits and one is that of benchmarking various MySQL technologies (e.g. MySQL 5.0,5.1,5.4), variants (e.g. MariaDB, Drizzle) and storage engines (e.g. Tokutek, Innodb plugin) and even other products like Tokyo Cabinet which is gaining large implementations.

You have two options with benchmarks, the brute force approach such as Sysbench, TPC, sysbench, Juice Benchmark, iibench, mysqlslap, skyload. I prefer the realistic approach however these are always on client’s private data. What is first needed is better access to public data for benchmarks. I have compiled this list to date and I am seeking additional sources for reference.

Of course, the data is only the starting point, having representative transactions and queries to execute and a framework to execute and a reporting module are also necessary. The introduction of Lua into Sysbench may now be a better option then my tool of choice mybench which I use simply because I can configure, write and deploy generally for a client in under 1 hour.

If anybody has other good references to free public data that’s easily loadable into MySQL please let me know.

Getting started with Gearman

Sunday, July 26th, 2009

Gearman is an open source generic framework for distributed processing. At OSCON 2009 I attended the Gearman: Build Your Own Distributed Platform in 3 Hours tutorial.

While it’s very easy to install Gearman, and follow the first example, if you missed the all important additional PHP steps listed on just one slide you may be left with the “‘Class ‘GearmanClient’ not found” error.

The following are detailed instructions for the installation and configuration of Gearman and PHP on Ubuntu 9.04 Jaunty.

Add the Drizzle PPA to get pre-packaged versions of Gearman.

cp /etc/apt/sources.list /etc/apt/sources.list.orig
echo "deb http://ppa.launchpad.net/drizzle-developers/ppa/ubuntu intrepid main
deb-src http://ppa.launchpad.net/drizzle-developers/ppa/ubuntu intrepid main" >> /etc/apt/sources.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 06899068
apt-get update

Get the gearman packages

apt-get install -y gearman gearman-job-server gearman-tools libgearman1 libgearman-dev libgearman-dbg libgearman-doc

Get the German PHP extension.

wget http://pecl.php.net/get/gearman-0.4.0.tgz
tar xvfz gearman-0.4.0.tgz
cd gearman-0.4.0/
phpize
./configure
make
make install

If phpize is not available then you are missing the development packages.

$ apt get php5-dev

You also configure PHP to load the extension. This will vary on different Linux environments. In this case.

echo 'extension="gearman.so"' >>/etc/php5/cli/php.ini

Verify the PHP Gearman extension is configured.

$ php --info | grep gearman
gearman
gearman support => enabled
libgearman version => 0.8

Now you are ready for working with the Gearman PHP examples.

Setting up sysbench with MySQL & Drizzle

Thursday, July 23rd, 2009

Sysbench is a open source product that enables you to perform various system benchmarks including databases. Drizzles performs regression testing of every trunk revision with a branched version of sysbench within Drizzle Automation.

A pending branch https://code.launchpad.net/~elambert/sysbench/trunk_drizzle_merge by Eric Lambert now enables side by side testing with MySQL and Drizzle. On a system running MySQL and Drizzle I was able install this sysbench branch with the following commands.

cd bzr
bzr branch lp:~elambert/sysbench/trunk_drizzle_merge
cd trunk_drizzle_merge/
./autogen.sh
./configure
make
sudo make install

Running the default lua tests supplied required me to ensure drizzle was in my path and that I created the ’sbtest’ schema. I’ll be sure it add that checking to my future developed benchmark scripts.

$ cd sysbench/tests/db
$ sysbench --test=insert.lua --db_driver=drizzle prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

FATAL: unable to connect to Drizzle server: 23
FATAL: error 0: Unknown database 'sbtest'
FATAL: failed to execute function `prepare': insert.lua:7: Failed to connect to the database
$ drizzle -e "create schema sbtest"
$ sysbench --test=insert.lua --db_driver=drizzle prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Creating table 'sbtest'...

And running produces the following results.

$ sysbench --num-threads=1 --test=insert.lua --db_driver=drizzle run
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Threads started!

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           10000
        other:                           0
        total:                           10000
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 10000  (879.68 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          11.3678s
    total number of events:              10000
    total time taken by event execution: 11.3354s
    per-request statistics:
         min:                                  0.32ms
         avg:                                  1.13ms
         max:                                 68.74ms
         approx.  95 percentile:               2.41ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   11.3354/0.0

Rerunning the prepare also lacked some auto cleanup to allow for automated re-running.

$ sysbench --test=insert.lua --db_driver=drizzle prepare
Creating table 'sbtest'...
ALERT: Drizzle Query Failed: 1050:Table 'sbtest' already exists
FATAL: failed to execute function `prepare': insert.lua:57: Database query failed

For MySQL

$ sysbench --test=insert.lua --db_driver=mysql --mysql_table_engine=innodb prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Creating table 'sbtest'...

Unfortunately this doesn’t actually create the table in the right storage engine, I had to hack the code to ensure I was comparing InnoDB in each test.

$ sysbench --num-threads=1 --test=insert.l
ua --db_driver=mysql run
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Threads started!

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           10000
        other:                           0
        total:                           10000
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 10000  (897.67 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          11.1399s
    total number of events:              10000
    total time taken by event execution: 11.1084s
    per-request statistics:
         min:                                  0.27ms
         avg:                                  1.11ms
         max:                                252.63ms
         approx.  95 percentile:               2.48ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   11.1084/0.00

Armed with a working environment I can now write some more realistic production like tests in Lua.

configure: error: mysql_config executable not found

Thursday, July 23rd, 2009

If your compiling a product that includes a dependency of MySQL, you can easily get the error

configure: error: mysql_config executable not found

I generally don’t see this problem, because I use MySQL binary tar files, however if you use MySQL packages, such as Ubuntu, you can easily miss the required dependency.

My currently installed MySQL packages on this Ubuntu machine are:

$ sudo dpkg -l | grep mysql
ii  libdbd-mysql                               0.8.2-1-4.1                               MySQL database server driver for libdbi
ii  libdbd-mysql-perl                          4.008-1                                   A Perl5 database interface to the MySQL data
ii  libmysqlclient15off                        5.1.30really5.0.75-0ubuntu10.2            MySQL database client library
ii  libmysqlclient16                           5.1.31-1ubuntu2                           MySQL database client library
ii  libqt4-sql-mysql                           4.5.0-0ubuntu4.1                          Qt 4 MySQL database driver
ii  mysql-client-5.1                           5.1.31-1ubuntu2                           MySQL database client binaries
ii  mysql-common                               5.1.30really5.0.75-0ubuntu10.2            MySQL database common files
ii  mysql-server-5.1                           5.1.31-1ubuntu2                           MySQL database server binaries
ii  php5-mysql                                 5.2.6.dfsg.1-3ubuntu4.1                   MySQL module for php5

The missing link is the development version of the libmysqlclient library.

sudo apt-get install libmysqlclient15-dev