Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Archive for the ‘Drizzle’ Category

The Drizzle Census

Wednesday, April 28th, 2010

One thing I have often wondered is just how many MySQL instances exist in the world and what MySQL versions and architectures are in use. We hear of 50,000 windows downloads per day but this is misleading because MySQL is basically bundled with Linux by default or installed from various repositories. Linux servers powers many websites.

In Drizzle we have a proposed plan, the Drizzle Census. From the productive Drizzle Developers Day recently at the 2010 MySQL conference we sat down and created a blueprint, and subsequent high level spec of what we considered this optional plugin should do. We didn’t get as far as I would have liked in a code skeleton to at least gather and store a sample result, but the hope is that with the community we will in the near future.

Here is the list of information we decided was appropriate for anonymous information of value.

  • Kernel Version/Architecture
  • CPU type
  • SID – HASH(processor_id,listener address,first listener port)
  • Drizzle Version
  • Drizzle Uptime
  • Drizzled process memory usage

How to find MySQL developers?

Wednesday, March 24th, 2010

Brian wrote recently Where did all of the MySQL Developers Go?, while over in Drizzle land they have been accepted for the Google Summer of code along with many other open source projects. MySQL from my observation a noticeable absentee.

Historically, the lack of opportunity to enable community contributions and see them implemented in say under 5 years, has really hurt MySQL in recent times. There is plenty of history here so that’s not worth repeating. The current landscape of patches, forks and custom MySQL binaries for storage engine provider has provided a boom of innovation that sadly is now lost from the core MySQL product.

In Drizzle, community contribution is actively sought and a good portion of committed code is not from the core Drizzle developers (wherever they work). As a Drizzle GSoC project contributor last year Padraig for example this year is helping to mentor. The Drizzle project contribution philisophy, GSoC and other activities such as the Drizzle Developer Day all enable the next generation of developers to be part of ongoing project developement.

Oracle, what are you going to do to foster an active community and new long term developers for MySQL?

Understanding Drizzle user authentication options – Part 2

Friday, March 12th, 2010

A key differentiator in Drizzle from it’s original MySQL roots is user based authentication. Gone is the host/user and schema/table/column model that was stored in the MyISAM based mysql.user table.

Authentication is now completely pluggable, leveraging existing systems such as PAM, LDAP via PAM and Http authentication.

In this post I’ll talk about HTTP authentication which requires an external http server to implement successfully. You can look at Part 1 for PAM authentication.

Compiling for http auth support

By default during compilation you may find.

checking for libcurl... no
configure: WARNING: libcurl development lib not found: not building auth_http plugin. On Debian this is found in libcurl4-gnutls-dev. On RedHat it's in libcurl-devel.

In my case I needed:

$ sudo yum install curl-devel

NOTE: Bug #527255 talks about issues of the message being incorrect for libcurl-devel however this appears it may be valid in Fedora Installs

After successfully installing the necessary pre-requisite you should see.

checking for libcurl... yes
checking how to link with libcurl... -lcurl
checking if libcurl has CURLOPT_USERNAME... no

HTTP Authentication

We need to enable the plugin at server startup.

$ sbin/drizzled --mysql-protocol-port=3399 --plugin_add=auth_http &

You need to ensure the auth_http plugin is active by checking the data dictionary plugin table.

drizzle> select * from data_dictionary.plugins where plugin_name='auth_http';
+-------------+----------------+-----------+-------------+
| PLUGIN_NAME | PLUGIN_TYPE    | IS_ACTIVE | MODULE_NAME |
+-------------+----------------+-----------+-------------+
| auth_http   | Authentication | TRUE      |             |
+-------------+----------------+-----------+-------------+

The auth_http plugin also has the following system variables.

drizzle> SHOW GLOBAL VARIABLES LIKE '%http%';
+------------------+-------------------+
| Variable_name    | Value             |
+------------------+-------------------+
| auth_http_enable | OFF               |
| auth_http_url    | http://localhost/ |
+------------------+-------------------+
2 rows in set (0 sec)

In order to configure Http authentication, you need to have the following settings added to your drizzled.cnf file. For example:

$ cat etc/drizzled.cnf
[drizzled]
auth_http_enable=TRUE
auth_http_url=http://thedrizzler.com/auth

NOTE: Replace the domain name with something you have, even localhost.

A Drizzle restart gives us

$ bin/drizzle -e "SHOW GLOBAL VARIABLES LIKE 'auth_http%'"
+------------------+-----------------------------+
| Variable_name    | Value                       |
+------------------+-----------------------------+
| auth_http_enable | ON                          |
| auth_http_url    | http://thedrizzler.com/auth |
+------------------+-----------------------------+

By default, currently if the settings result in an invalid url, then account validation does not fail and you can still login. It is recommended that you always configure pam authentication as well as a fall back.

$ wget -O tmp http://thedrizzler.com/auth
--17:32:32--  http://thedrizzler.com/auth
Resolving thedrizzler.com... 208.43.73.220
Connecting to thedrizzler.com|208.43.73.220|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
17:32:32 ERROR 404: Not Found.

$ bin/drizzle
drizzle > exit

Configuring passwords

To correctly configured your web server to perform the HTTP auth, you can use this Apache syntax as an example.

The following is added to the VirtualHost entry in your web browser.

<Directory /var/www/drizzle/auth>
AllowOverride FileInfo All AuthConfig
AuthType Basic
AuthName "Drizzle Access Only"
AuthUserFile /home/drizzle/.authentication
Require valid-user
</Directory>
$ sudo su -
$ mkdir /var/www/drizzle/auth
$ touch /var/www/drizzle/auth/index.htm
$ apachectl graceful

We check we now need permissions for the URL.

$ wget -O tmp http://thedrizzler.com/auth
--17:35:48--  http://thedrizzler.com/auth
Resolving thedrizzler.com... 208.43.73.220
Connecting to thedrizzler.com|208.43.73.220|:80... connected.
HTTP request sent, awaiting response... 401 Authorization Required
Authorization failed.

You need to create the username/password for access.

$ htpasswd -cb /home/drizzle/.authentication testuser sakila
$ cat /home/drizzle/.authentication
testuser:85/7CbdeVql4E

Confirm that the http auth with correct user/password works.

$ wget -O tmp http://thedrizzler.com/auth --user=testuser --password=sakila
--17:37:45--  http://thedrizzler.com/auth
Resolving thedrizzler.com... 208.43.73.220
Connecting to thedrizzler.com|208.43.73.220|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently

Drizzle HTTP Authentication in action

By default we now can’t login

$ bin/drizzle
ERROR 1045 (28000): Access denied for user ''@'127.0.0.1' (using password: NO)
$ bin/drizzle --user=testuser --password=sakila999
ERROR 1045 (28000): Access denied for user 'testuser'@'127.0.0.1' (using password: YES)

$ bin/drizzle --user=testuser --password=sakila
Welcome to the Drizzle client..  Commands end with ; or \g.
Your Drizzle connection id is 6
Server version: 7 Source distribution (trunk)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

drizzle>

Understanding Drizzle user authentication options – Part 1

Friday, March 12th, 2010

A key differentiator in Drizzle from it’s original MySQL roots is user based authentication. Gone is the host/user and schema/table/column model that was stored in the MyISAM based mysql.user table.

Authentication is now completely pluggable, leveraging existing systems such as PAM, LDAP via PAM and Http authentication.

In this post I’ll talk about PAM authentication which is effectively your current Linux based user security.

This information is based on the current build 1317.

Compiling for PAM support

Your Drizzle environment needs to be compiled with PAM support. You would have received the following warning during a configure.

$ ./configure
...
checking for libpam... no
configure: WARNING: Couldn't find PAM development support, pam_auth will not be built. On Debian, libpam is in libpam0g-dev. On RedHat it's in pam-devel.

The solution is provided in the warning message which is another great thing about Drizzle. The pre checks for dependencies and the optional messages like these far exceed the MySQL equivalent compilation process. In my case:

$ sudo yum install pam-devel

When correctly configured, it should look like:

checking for libpam... yes
checking how to link with libpam... -lpam

Working with PAM

You need to enable the PAM authentication plugin at drizzled startup.

sbin/drizzled --plugin_add=auth_pam &

Unfortunately connecting fails to work with

time sbin/drizzle --user=testuser --password=***** --port=4427
real 0m0.003s
user 0m0.003s
sys 0m0.001s

A look into the source at src/drizzle-2010.03.1317/plugin/auth_pam/auth_pam.cc shows a needed config file

117     retval= pam_start("check_user", userinfo.name, &conv_info, &pamh);

Configuring PAM

In order to enable PAM with Drizzle you need to have the following system configuration.

$ cat /etc/pam.d/check_user
auth    required        pam_unix.so
account required        pam_unix.so

$ time sbin/drizzle --user=testuser --password=***** --port=4427
ERROR 1045 (28000): Access denied for user 'testuser'@'127.0.0.1' (using password: YES)

real 0m2.055s
user 0m0.002s
sys 0m0.002s

This did some validation but still failed.

It seems Bug #484069 may fix this problem, however this is not currently in the main line!

Stay Tuned!

Drizzle’s Data Dictionary and Global Status

Thursday, March 11th, 2010

With the recent news by Brian about the Data Dictionary in Drizzle replacing the INFORMATION_SCHEMA, I was looking into the server status variables (aka INFORMATION_SCHEMA.GLOBAL_STATUS) and I came across an interesting discovery.

select * from data_dictionary.global_status;
...
| Table_locks_immediate      | 0              |
| Table_locks_waited         | 0              |
| Threads_connected          | 8134064        |
| Uptime                     | 332            |
| Uptime_since_flush_status  | 332            |
+----------------------------+----------------+
51 rows in set (0 sec)

This only retrieved 51 rows, which is way less then previous. What I wanted was clearly missing, all the old com_ status variables. Looking at what the data_dictionary actually has available revealed a new table.

drizzle> select * from data_dictionary.global_statements;
+-----------------------+----------------+
| VARIABLE_NAME         | VARIABLE_VALUE |
+-----------------------+----------------+
| admin_commands        | 0              |
| alter_db              | 0              |
| alter_table           | 0              |
| analyze               | 0              |
| begin                 | 0              |
| change_db             | 1              |
| check                 | 0              |
| checksum              | 0              |
| commit                | 0              |
| create_db             | 0              |
| create_index          | 0              |
| create_table          | 0              |
| delete                | 0              |
| drop_db               | 0              |
| drop_index            | 0              |
| drop_table            | 0              |
| empty_query           | 0              |
| flush                 | 0              |
| insert                | 0              |
| insert_select         | 0              |
| kill                  | 0              |
| load                  | 0              |
| release_savepoint     | 0              |
| rename_table          | 0              |
| replace               | 0              |
| replace_select        | 0              |
| rollback              | 0              |
| rollback_to_savepoint | 0              |
| savepoint             | 0              |
| select                | 10             |
| set_option            | 0              |
| show_create_db        | 0              |
| show_create_table     | 0              |
| show_errors           | 0              |
| show_warnings         | 0              |
| truncate              | 0              |
| unlock_tables         | 0              |
| update                | 0              |
+-----------------------+----------------+
38 rows in set (0 sec)

Kudos to this. Looking at list I saw an obvious omission, of “ping”. Something that caught me out some years ago with huge (300-500 per second admin_commands). I’m also a fan of Mark’s recent work An evening hack – Com_ping in MySQL.

Getting started with Drizzle JDBC

Thursday, September 17th, 2009

In preparation for some Java work I wanted to configure and test the Drizzle JDBC Driver. Any chance to swing Drizzle into a MySQL discussion is worth the research. What I found was an issue compiling and an issue running on Ubuntu 9.04

You can start by downloading and building the Drizzle JDBC. My first problem was when I tried to build a usable .jar. I got errors in the test cases which caused by default no built .jar to work with. I raised Bug #432146 – org.drizzle.jdbc.MySQLDriverTest Tests fail. As I stated it may not be a real bug, but it seems at present that you require a running MySQL instance as well as a running Drizzle instance. In my case I didn’t have MySQL running, and I think to be fair, I should be able to build a Drizzle driver without MySQL.

Anyway, as per the Wiki Docs I proceeded to package without successful test cases. My next problem was more interesting, and perhaps found earlier from the tests?

I first created a test schema my code was going to use.

$ ~/drizzle/deploy/bin/drizzle
Your Drizzle connection id is 724
Server version: 2009.09.1126 Source distribution (trunk)

drizzle> create schema test_java;
Query OK, 1 row affected (0 sec)
drizzle> exit

I wrote a simple Java program.

$ cat ExampleDrizzle.java
import java.sql.*;

public class ExampleDrizzle {

  public static void main(String args[]) {

    try {
      Class.forName("org.drizzle.jdbc.Driver");
    } catch (Exception e) {
      System.out.println(e.getMessage());
      System.exit(1);
    }

    try {
      Connection con = DriverManager.getConnection("jdbc:drizzle://localhost:4427/test_java");
      Statement st = con.createStatement();
      st.executeUpdate("CREATE TABLE a (id int not null primary key, value varchar(20))");
      st.close();
      con.close();
    } catch (SQLException e) {
      System.out.println(e.getMessage());
    }
  }
}

Compiled.

$ javac ExampleDrizzle.java

Ran.

$ java ExampleDrizzle
org.drizzle.jdbc.Driver not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:mysql-connector-java-5.1.8-bin.jar,file:./], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}

Oops, been a while since using Java. I was amazed I could write the code in vi in the first place.

$ export CLASSPATH=drizzle-jdbc-0.5-SNAPSHOT.jar:.
$ java ExampleDrizzle
17-Sep-09 6:48:45 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol 
INFO: Connected to: localhost:4427
Exception in thread "main" java.lang.NoClassDefFoundError: org.drizzle.jdbc.DrizzleConnection
   at java.lang.Class.initializeClass(libgcj.so.90)
   at org.drizzle.jdbc.Driver.connect(Driver.java:74)
   at java.sql.DriverManager.getConnection(libgcj.so.90)
   at ExampleDrizzle.main(ExampleDrizzle.java:15)
Caused by: java.lang.ClassNotFoundException: java.sql.SQLFeatureNotSupportedException not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:drizzle-jdbc-0.5-SNAPSHOT.jar,file:./], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
   at java.net.URLClassLoader.findClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.Class.forName(libgcj.so.90)
   at java.lang.Class.initializeClass(libgcj.so.90)
   ...3 more

Hmmm, that’s disappointing. I thought about it a minute, figured some guidance would be beneficial , so I sought out the best Java person on #drizzle IRC. Getting a name, but no response from an initial inquiry after about a half hour I thought again at the problem. Just what java are you using?

$ java -version
java version "1.5.0"
gij (GNU libgcj) version 4.3.3

$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2009-07-17 12:36 /usr/bin/java -> /etc/alternatives/java

$ sudo find / -name java
[sudo] password for rbradfor:
/usr/lib/java
/usr/lib/ure/share/java
/usr/lib/jvm/java-6-sun-1.6.0.16/bin/java
/usr/lib/jvm/java-6-sun-1.6.0.16/jre/bin/java
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/bin/java
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/bin/java
/usr/bin/java
/usr/include/c++/4.3/gnu/java
/usr/include/c++/4.3/java
/usr/local/include/google/protobuf/compiler/java

$ ls -l /etc/alternatives/j*
...
lrwxrwxrwx   1 root root    33 2009-09-17 17:50 jar -> /usr/lib/jvm/java-gcj/jre/bin/jar
lrwxrwxrwx   1 root root    39 2009-09-17 17:50 jar.1.gz -> /usr/lib/jvm/java-gcj/man/man1/jar.1.gz
lrwxrwxrwx   1 root root    35 2009-09-17 17:50 jarsigner -> /usr/lib/jvm/java-gcj/bin/jarsigner
lrwxrwxrwx   1 root root    45 2009-09-17 17:50 jarsigner.1.gz -> /usr/lib/jvm/java-gcj/man/man1/jarsigner.1.gz
lrwxrwxrwx   1 root root    34 2009-09-17 17:50 java -> /usr/lib/jvm/java-gcj/jre/bin/java
lrwxrwxrwx   1 root root    40 2009-09-17 17:50 java.1.gz -> /usr/lib/jvm/java-gcj/man/man1/java.1.gz
lrwxrwxrwx   1 root root    31 2009-09-17 17:50 javac -> /usr/lib/jvm/java-gcj/bin/javac
lrwxrwxrwx   1 root root    41 2009-09-17 17:50 javac.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javac.1.gz
...

I wonder if I should use the real Sun Java.

$ sudo apt-get install sun-java6-jdk
Reading package lists... Done
Building dependency tree
Reading state information... Done
sun-java6-jdk is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
$ sudo update-alternatives --config java

There are 4 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/lib/jvm/java-6-sun/jre/bin/java
          2    /usr/bin/gij-4.3
          3    /usr/bin/gij-4.2
*+        4    /usr/lib/jvm/java-gcj/jre/bin/java

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.

$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2009-07-17 12:36 /usr/bin/java -> /etc/alternatives/java
$ ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 36 2009-09-17 18:53 /etc/alternatives/java -> /usr/lib/jvm/java-6-sun/jre/bin/java

Yep, it took a minute to discover the update-alternatives command, lucky I didn’t try that manually.

A second try.

$ javac ExampleDrizzle.java
$ java ExampleDrizzle
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol 
INFO: Connected to: localhost:4427
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol close
INFO: Closing connection
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.common.packet.AsyncPacketFetcher run
INFO: Connection closed

$ ~/drizzle/deploy/bin/drizzle test_java
Server version: 2009.09.1126 Source distribution (trunk)

drizzle> show tables;
+---------------------+
| Tables_in_test_java |
+---------------------+
| a                   |
+---------------------+
1 row in set (0 sec)

drizzle> desc a;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int         | NO   | PRI | NULL    |       |
| value | varchar(20) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0 sec)

And I’ve got a working testcase.

SQL Analysis with MySQL Proxy – Part 2

Thursday, September 3rd, 2009

As I outlined in Part 1 MySQL Proxy can be one tool for performing SQL analysis. The impact with any monitoring is the art of monitoring will affect the results, in this case the performance. I don’t recommend enabling this level of detailed monitoring in production, these techniques are designed for development, testing, and possibly stress testing.

This leads to the question, how do I monitor SQL in production? The simple answer to this question is, Sampling. Take a representative sample of your production system. The implementation of this depends on many factors including your programming technology stack, and your MySQL topology.

If for example you are using PHP, then defining MySQL proxy on a production system, and executing firewall rules to redirect incoming 3306 traffic to 4040 for a period of time, e.g. 2 seconds can provide a wealth of information as to what’s happening on the server now. I have used this very successfully in production as an information gathering an analysis tool. It is also reasonably easy to configure, execute and the impact on any failures for example are minimized due to the sampling time.

If you run a distributed environment with MySQL Slaves, or many application servers, you can also introduce sampling to a certain extent as these specific points, however like scaling options, it is key to be able to handle and process the write load accurately.

Another performance improvement is to move processing of the gathered information in MySQL proxy to a separate thread or process, removing this work from the thread execution path and therefore increasing the performance. I’m interested to explore the option of passing this information off to memcached or gearman and having MySQL proxy simply capture the packet information and distributing the output. I have yet to see how memcached and/or gearman integrate with the Lua/C bindings. If anybody has experience or knowledge I would be interested to know more.

It is interesting to know that Drizzle provides a plugin to send this level of logging information to gearman automatically.

Seeking public data for benchmarks

Friday, August 28th, 2009

I have several side projects when time permits and one is that of benchmarking various MySQL technologies (e.g. MySQL 5.0,5.1,5.4), variants (e.g. MariaDB, Drizzle) and storage engines (e.g. Tokutek, Innodb plugin) and even other products like Tokyo Cabinet which is gaining large implementations.

You have two options with benchmarks, the brute force approach such as Sysbench, TPC, sysbench, Juice Benchmark, iibench, mysqlslap, skyload. I prefer the realistic approach however these are always on client’s private data. What is first needed is better access to public data for benchmarks. I have compiled this list to date and I am seeking additional sources for reference.

Of course, the data is only the starting point, having representative transactions and queries to execute and a framework to execute and a reporting module are also necessary. The introduction of Lua into Sysbench may now be a better option then my tool of choice mybench which I use simply because I can configure, write and deploy generally for a client in under 1 hour.

If anybody has other good references to free public data that’s easily loadable into MySQL please let me know.

Setting up sysbench with MySQL & Drizzle

Thursday, July 23rd, 2009

Sysbench is a open source product that enables you to perform various system benchmarks including databases. Drizzles performs regression testing of every trunk revision with a branched version of sysbench within Drizzle Automation.

A pending branch https://code.launchpad.net/~elambert/sysbench/trunk_drizzle_merge by Eric Lambert now enables side by side testing with MySQL and Drizzle. On a system running MySQL and Drizzle I was able install this sysbench branch with the following commands.

cd bzr
bzr branch lp:~elambert/sysbench/trunk_drizzle_merge
cd trunk_drizzle_merge/
./autogen.sh
./configure
make
sudo make install

Running the default lua tests supplied required me to ensure drizzle was in my path and that I created the ’sbtest’ schema. I’ll be sure it add that checking to my future developed benchmark scripts.

$ cd sysbench/tests/db
$ sysbench --test=insert.lua --db_driver=drizzle prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

FATAL: unable to connect to Drizzle server: 23
FATAL: error 0: Unknown database 'sbtest'
FATAL: failed to execute function `prepare': insert.lua:7: Failed to connect to the database
$ drizzle -e "create schema sbtest"
$ sysbench --test=insert.lua --db_driver=drizzle prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Creating table 'sbtest'...

And running produces the following results.

$ sysbench --num-threads=1 --test=insert.lua --db_driver=drizzle run
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Threads started!

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           10000
        other:                           0
        total:                           10000
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 10000  (879.68 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          11.3678s
    total number of events:              10000
    total time taken by event execution: 11.3354s
    per-request statistics:
         min:                                  0.32ms
         avg:                                  1.13ms
         max:                                 68.74ms
         approx.  95 percentile:               2.41ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   11.3354/0.0

Rerunning the prepare also lacked some auto cleanup to allow for automated re-running.

$ sysbench --test=insert.lua --db_driver=drizzle prepare
Creating table 'sbtest'...
ALERT: Drizzle Query Failed: 1050:Table 'sbtest' already exists
FATAL: failed to execute function `prepare': insert.lua:57: Database query failed

For MySQL

$ sysbench --test=insert.lua --db_driver=mysql --mysql_table_engine=innodb prepare
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Creating table 'sbtest'...

Unfortunately this doesn’t actually create the table in the right storage engine, I had to hack the code to ensure I was comparing InnoDB in each test.

$ sysbench --num-threads=1 --test=insert.l
ua --db_driver=mysql run
sysbench v0.4.10:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Threads started!

OLTP test statistics:
    queries performed:
        read:                            0
        write:                           10000
        other:                           0
        total:                           10000
    transactions:                        0      (0.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 10000  (897.67 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          11.1399s
    total number of events:              10000
    total time taken by event execution: 11.1084s
    per-request statistics:
         min:                                  0.27ms
         avg:                                  1.11ms
         max:                                252.63ms
         approx.  95 percentile:               2.48ms

Threads fairness:
    events (avg/stddev):           10000.0000/0.00
    execution time (avg/stddev):   11.1084/0.00

Armed with a working environment I can now write some more realistic production like tests in Lua.

Drizzle Query logging

Tuesday, July 21st, 2009

Currently Drizzle offers three (3) separate query logging plugins. These plugins offer an extensible means of gathering all or selected queries and provide the foundation for a query analyser tool. Additional filtering includes selecting queries by execution time, result size, rows processed and by any given regular expression via PCRE.

During this tutorial I’ll be stepping though the various logging_query parameters which log SQL in a CSV format.

Confirm Logging Plugins

You can view the current ACTIVE plugins in Drizzle with the following SQL.

drizzle> select version();
+--------------+
| version()    |
+--------------+
| 2009.07.1097 |
+--------------+

drizzle> select * from information_schema.plugins where plugin_name like 'logging%';
+-----------------+----------------+---------------+--------------------------------------+---------------------------------+----------------+
| PLUGIN_NAME     | PLUGIN_VERSION | PLUGIN_STATUS | PLUGIN_AUTHOR                        | PLUGIN_DESCRIPTION              | PLUGIN_LICENSE |
+-----------------+----------------+---------------+--------------------------------------+---------------------------------+----------------+
| logging_gearman | 0.1            | ACTIVE        | Mark Atwood  mark @fallenpegasus.com | Log queries to a Gearman server | GPL            |
| logging_query   | 0.2            | ACTIVE        | Mark Atwood  mark @fallenpegasus.com | Log queries to a CSV file       | GPL            |
| logging_syslog  | 0.2            | ACTIVE        | Mark Atwood  mark @fallenpegasus.com | Log to syslog                   | GPL            |
+-----------------+----------------+---------------+--------------------------------------+---------------------------------+----------------+
3 rows in set (0.01 sec)

Logging all queries

You can define the following configuration variables to enable query logging.

/etc/drizzle/drizzled.cnf
[drizzled]
logging_query_enable=true
logging_query_filename=/var/log/drizzle/general.csv

You can confirm the settings with the following SHOW VARIABLES.

drizzle> show global variables like 'logging_query%';
+---------------------------------------+------------------------------+
| Variable_name                         | Value                        |
+---------------------------------------+------------------------------+
| logging_query_enable                  | ON                           |
| logging_query_filename                | /var/log/drizzle/general.csv |
| logging_query_pcre                    |                              |
| logging_query_threshold_big_examined  | 0                            |
| logging_query_threshold_big_resultset | 0                            |
| logging_query_threshold_slow          | 0                            |
+---------------------------------------+------------------------------+

This command showing queries to be logged.

$ cat /var/log/drizzle/general.csv
1248214561824590,1,1,"","select @@version_comment limit 1","Query",1248214561824590,1240,1240,1,00,0
1248214582588346,1,3,"","show global variables like 'logging_query%'","Query",1248214582588346,1958,1706,6,62,0

Unfortunately the log does not yet provide a header. You need to turn the source code to get a better description of the columns.

      snprintf(msgbuf, MAX_MSG_LEN,
               "%"PRIu64",%"PRIu64",%"PRIu64",\"%.*s\",\"%s\",\"%.*s\","
               "%"PRIu64",%"PRIu64",%"PRIu64",%"PRIu64",%"PRIu64
               "%"PRIu32",%"PRIu32"\n",
               t_mark,
               session->thread_id,
               session->query_id,
               // dont need to quote the db name, always CSV safe
               dbl, dbs,
               // do need to quote the query
               quotify((unsigned char *)session->query,
                       session->query_length, qs, sizeof(qs)),
               // command_name is defined in drizzled/sql_parse.cc
               // dont need to quote the command name, always CSV safe
               (int)command_name[session->command].length,
               command_name[session->command].str,
               // counters are at end, to make it easier to add more
               (t_mark - session->connect_utime),
               (t_mark - session->start_utime),
               (t_mark - session->utime_after_lock),
               session->sent_row_count,
               session->examined_row_count,
               session->tmp_table,
               session->total_warn_count);

The important parts of this information include:

  • getmicrotime – 1248214561824590
  • Session Id – 1
  • Query Id – 1
  • Schema
  • The Query: “show global variables like ‘logging_query%’”
  • The Query type “Query”
  • Time session connected – 1248214582588346
  • The total execution time – 1958
  • The execution time after necessary locks – 1706
  • The number of rows returned – 6
  • The number of rows examined – 6
  • The number of temporary tables used – 2
  • The total warning count – 0

I also found what I believe is a formatting problem logged as Bug #402831.

You can enable logging dynamically.

drizzle> select now();
+---------------------+
| now()               |
+---------------------+
| 2009-07-22 02:14:31 |
+---------------------+
1 row in set (0 sec)

drizzle> set global logging_query_enable=true;
Query OK, 0 rows affected (0 sec)

drizzle> select curdate();
+------------+
| curdate()  |
+------------+
| 2009-07-22 |
+------------+
1 row in set (0 sec)

drizzle> set global logging_query_enable=false;
Query OK, 0 rows affected (0 sec)

drizzle> select now();
+---------------------+
| now()               |
+---------------------+
| 2009-07-22 02:14:54 |
+---------------------+
1 row in set (0 sec)
1248228876381645,4,3,"","set global logging_query_enable=true","Query",1248228876381645,761,761,0,00,0
1248228886866882,4,4,"","select curdate()","Query",1248228886866882,105,105,1,00,0

I was not able to alter the logging_query_filename dynamically. Need to confirm with the development team about this functionality for the future.

drizzle> set global logging_query_filename='/tmp/general.csv';
ERROR 1238 (HY000): Variable 'logging_query_filename' is a read only variable

Logging slow queries

If you just wanted to emulate the MySQL slow query log, with a long_query_time of 1 second, you could use the following.

/etc/drizzle/drizzled.cnf
[drizzled]
logging_query_enable=true
logging_query_filename=/var/log/drizzle/slow.csv
logging_query_threshold_slow=1000000

Drizzle supports the ability to set a threshold in microseconds.

NOTE: I wanted to demonstrate this using the popular MySQL SLEEP() function, only to find this is currently not available in Drizzle. This is an ideal example of a simple UDF that can be written and added to Drizzle. One day if I ever have the time.

Here is some sample output using queries > 1 second.

1248216457856195,1,43,"test","insert into numbers   select...","Query",1248216457856195,2160680,2160620,0,26214420,0
1248216462738678,1,45,"test","insert into numbers   select...","Query",1248216462738678,4530327,4530263,0,52428821,0
1248216472430813,1,47,"test","insert into numbers   select...","Query",1248216472430813,8990965,8990890,0,104857622,0
1248216473592812,1,48,"test","select @counter := count(*) from numbers","Query",1248216473592812,1152319,1152257,1,104857622,0

Logging by threshold

Drizzle Query Logging provides the ability to return results by 2 thresholds, the number of rows in the result, and the number of rows examined by the storage engine.

/etc/drizzle/drizzled.cnf
[drizzled]
logging_query_enable=true
logging_query_filename=/var/log/drizzle/slow.csv
logging_query_threshold_big_resultset=100
1248216631322097,1,5,"test","select * from numbers limit 100","Query",1248216631322097,281,217,100,1002,0
1248216642763174,1,6,"test","select * from numbers limit 101","Query",1248216642763174,268,215,101,1012,0
/etc/drizzle/drizzled.cnf
[drizzled]
logging_query_enable=true
logging_query_filename=/var/log/drizzle/slow.csv
logging_query_threshold_big_examined=1000
1248216785430588,1,6,"test","select * from numbers limit 1000","Query",1248216785430588,8055,7983,1000,10002,0
1248216800327928,1,7,"test","select count(*) from numbers","Query",1248216800327928,1041322,1041222,1,10485762,0

Logging by pattern

The final option is to return queries that match a given pattern via a PCRE expression.


/etc/drizzle/drizzled.cnf
[drizzled]
logging_query_enable=true
logging_query_filename=/var/log/drizzle/slow.csv
logging_query_pcre=now
drizzle> select now();
+---------------------+
| now()               |
+---------------------+
| 2009-07-22 03:24:32 |
+---------------------+
1 row in set (0 sec)

drizzle> select curdate();
+------------+
| curdate()  |
+------------+
| 2009-07-22 |
+------------+
1 row in set (0 sec)

drizzle> select "now";
+-----+
| now |
+-----+
| now |
+-----+
1 row in set (0 sec)

drizzle> select "know how";
+----------+
| know how |
+----------+
| know how |
+----------+
1 row in set (0 sec)
1248233072792211,3,2,"","select now()","Query",1248233072792211,154,154,1,00,0
1248233085807520,3,4,"","select \"now\"","Query",1248233085807520,92,92,1,00,0
1248233096659018,3,5,"","select \"know how\"","Query",1248233096659018,75,75,1,00,0

Another example using a pattern.

/etc/drizzle/drizzled.cnf
[drizzled]
logging_query_enable=true
logging_query_filename=/var/log/drizzle/slow.csv
logging_query_pcre="[0-9][0-9][0-9]"
drizzle> select 1;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0 sec)

drizzle> select 11;
+----+
| 11 |
+----+
| 11 |
+----+
1 row in set (0 sec)

drizzle> select 111;
+-----+
| 111 |
+-----+
| 111 |
+-----+
1 row in set (0 sec)

drizzle> select 1111;
+------+
| 1111 |
+------+
| 1111 |
+------+
1 row in set (0 sec)

drizzle> select 11+22;
+-------+
| 11+22 |
+-------+
|    33 |
+-------+
1 row in set (0 sec)
1248233336460373,3,4,"","select 111","Query",1248233336460373,79,79,1,00,0
1248233339300429,3,5,"","select 1111","Query",1248233339300429,82,82,1,00,0

Unfortunately it seems that this variable is also not configurable dynamically at this time.

drizzle> set global logging_query_pcre="now";
ERROR 1238 (HY000): Variable 'logging_query_pcre' is a read only variable

This is definitely an improvement over current MySQL logging.