A beginner’s look at Drizzle – SQL_MODE

A new feature in MySQL Version 5 was the introduction of SQL_MODE to support STRICT… or TRADITIONAL values.

This feature brought MySQL closer to the behavior of other RDBMS products. By default MySQL performs a number of silent data changes, which do not provide the level of data integrity you would expect if you come from a more traditional database background. By default MySQL reports these changes as warnings; with an appropriate SQL_MODE they are instead treated as errors.

How does Drizzle handle this? Very simply. There is no SQL_MODE. By default Drizzle operates in a strict mode, producing errors for any invalid data. The following test cases show the varying conditions.

Test Case

select version();
create database if not exists test;
use test;
drop table if exists t1;
create table t1(i1 int, c1 char(10), d1 timestamp);
#Pass Tests
insert into t1(i1) values (500000000);
insert into t1(c1) values('1234567890');
insert into t1(i1) values (5000000000);
#Fail Tests
insert into t1(c1) values('12345678901');
insert into t1(d1) values(now());
insert into t1(d1) values(0);

Drizzle Output

drizzle> select version();
+-------------------------+
| version()               |
+-------------------------+
| 2009.03.970-development |
+-------------------------+
1 row in set (0.00 sec)

drizzle> create database if not exists test;
Query OK, 1 row affected (0.01 sec)

drizzle> use test;
Database changed
drizzle> create table t1(i1 int, c1 char(10), d1 timestamp);
Query OK, 0 rows affected (0.17 sec)
#Pass Tests
drizzle> insert into t1(i1) values (500000000);
Query OK, 1 row affected (0.08 sec)
drizzle> insert into t1(c1) values('1234567890');
Query OK, 1 row affected (0.05 sec)
drizzle> insert into t1(d1) values(now());
Query OK, 1 row affected (0.02 sec)
#Fail Tests
drizzle> insert into t1(i1) values (5000000000);
ERROR 1264 (22003): Out of range value for column 'i1' at row 1
drizzle> insert into t1(c1) values('12345678901');
ERROR 1406 (22001): Data too long for column 'c1' at row 1
drizzle> insert into t1(d1) values(0);
ERROR 1685 (HY000): Received an invalid value '0' for a UNIX timestamp.

MySQL Output

mysql> create database if not exists test;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> drop table if exists t1;
Query OK, 0 rows affected (0.05 sec)

mysql> create table t1(i1 int, c1 char(10), d1 timestamp);
Query OK, 0 rows affected (0.16 sec)

mysql> #Pass Tests
mysql> insert into t1(i1) values (500000000);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(c1) values('1234567890');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(i1) values (5000000000);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> #Fail Tests
mysql> insert into t1(c1) values('12345678901');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into t1(d1) values(now());
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(d1) values(0);
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1;
+------------+------------+---------------------+
| i1         | c1         | d1                  |
+------------+------------+---------------------+
|  500000000 | NULL       | 2009-04-06 12:14:21 |
|       NULL | 1234567890 | 2009-04-06 12:14:21 |
| 2147483647 | NULL       | 2009-04-06 12:14:21 |
|       NULL | 1234567890 | 2009-04-06 12:14:21 |
|       NULL | NULL       | 2009-04-06 12:14:21 |
|       NULL | NULL       | 0000-00-00 00:00:00 |
+------------+------------+---------------------+
6 rows in set (0.00 sec)
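
The clamped integer (2147483647) and the truncated string (1234567890) in the result set above can be modelled with a few lines of Python. This is only a sketch of the observed behaviour, not MySQL or Drizzle code: the function names and the exception class are made up for illustration, and the timestamp case is not modelled.

```python
# Simplified model of how a 32-bit signed INT and a CHAR(n) column treat
# invalid values. It mirrors the behaviour seen in the output above;
# it is not MySQL or Drizzle code.

INT_MIN, INT_MAX = -2**31, 2**31 - 1


class InvalidDataError(ValueError):
    """Hypothetical stand-in for errors 1264 and 1406."""


def store_int(value, strict):
    if INT_MIN <= value <= INT_MAX:
        return value
    if strict:
        raise InvalidDataError("Out of range value")
    # Non-strict: silently clamp to the nearest limit (a warning in MySQL).
    return max(INT_MIN, min(INT_MAX, value))


def store_char(value, length, strict):
    if len(value) <= length:
        return value
    if strict:
        raise InvalidDataError("Data too long")
    # Non-strict: silently truncate (a warning in MySQL).
    return value[:length]


print(store_int(5000000000, strict=False))          # 2147483647
print(store_char("12345678901", 10, strict=False))  # 1234567890
```

With strict=True both calls raise instead of changing the data, which is the behaviour Drizzle shows by default.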

MySQL SQL_MODE=STRICT_ALL_TABLES Output

mysql> set sql_mode = STRICT_ALL_TABLES;
Query OK, 0 rows affected (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.1.32    |
+-----------+
1 row in set (0.00 sec)

mysql> create database if not exists test;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> drop table if exists t1;
Query OK, 0 rows affected (0.01 sec)

mysql> create table t1(i1 int, c1 char(10), d1 timestamp);
Query OK, 0 rows affected (0.12 sec)

mysql> #Pass Tests
mysql> insert into t1(i1) values (500000000);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(c1) values('1234567890');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(i1) values (5000000000);
ERROR 1264 (22003): Out of range value for column 'i1' at row 1
mysql> #Fail Tests
mysql> insert into t1(c1) values('12345678901');
ERROR 1406 (22001): Data too long for column 'c1' at row 1
mysql> insert into t1(d1) values(now());
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(d1) values(0);
Query OK, 1 row affected (0.00 sec)


Update
Thanks to Robert Wultsch who highlighted to me that SQL_MODE has been around since 4.1.

Identifying resource bottlenecks – Memory

Continuing on from CPU, we turn our attention to Memory. One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources, then identify any obvious bottlenecks in these resources.

In auditing, I start with the ‘free’ command, the ‘vmstat’ command already used for CPU, and the /proc/meminfo and /proc/sys/vm/swappiness files to get an indication of memory and swap resources. While we are looking at memory, the configuration of swap is also very important; I discuss this in more detail later.

$ free -m
             total       used       free     shared    buffers     cached
Mem:          3955       3838        117          0        402       2366
-/+ buffers/cache:       1069       2886
Swap:         1027          0       1027
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
 0  0    128 234088 409636 2474392    0    0     0   100 1085  209  0  0 100  0  0
 0  0    128 233836 409636 2474396    0    0     0     0 1014  184  3  0 97  0  0
 0  0    128 233284 409636 2474396    0    0     0     0 1182  435  2  0 98  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1024  104  1  0 99  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1079  195  0  0 100  0  0
 1  0    128 233168 409644 2474396    0    0     0   232 1021  188  3  0 97  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1111  213  2  0 98  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1005   60  0  0 100  0  0

Memory

  • swpd: the amount of virtual memory used.
  • free: the amount of idle memory.
  • buff: the amount of memory used as buffers.
  • cache: the amount of memory used as cache.
  • inact: the amount of inactive memory. (-a option)
  • active: the amount of active memory. (-a option)

Swap

  • si: Amount of memory swapped in from disk (/s).
  • so: Amount of memory swapped to disk (/s).
$ cat /proc/meminfo
MemTotal:      4050776 kB
MemFree:        120984 kB
Buffers:        411928 kB
Cached:        2423468 kB
SwapCached:          0 kB
Active:        1861536 kB
Inactive:      1492152 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      4050776 kB
LowFree:        120984 kB
SwapTotal:     1052248 kB
SwapFree:      1052120 kB
Dirty:             172 kB
Writeback:           0 kB
AnonPages:      518112 kB
Mapped:          23140 kB
Slab:           544448 kB
PageTables:       9528 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   3077636 kB
Committed_AS:   859208 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    263132 kB
VmallocChunk: 34359474803 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
$ cat /proc/sys/vm/swappiness
60

What’s important is not to treat the ‘free’ or ‘MemFree’ values as the total free memory. Linux uses otherwise idle memory as a file system cache, and discards this cached data when additional memory is needed. Add the Buffers and Cached values to get a better indication of the memory truly available; this is the figure ‘free’ reports on the ‘-/+ buffers/cache’ line (2886 MB above, against only 117 MB shown as free).
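
As a rough sketch of that arithmetic, the following Python parses /proc/meminfo style text and adds the buffer and cache figures to the free figure. The function names are my own, and the result is an approximation, since not every cached page can be given back immediately.

```python
# Sample values copied from the /proc/meminfo listing above.
SAMPLE = """\
MemTotal:      4050776 kB
MemFree:        120984 kB
Buffers:        411928 kB
Cached:        2423468 kB
"""


def parse_meminfo(text):
    """Return a dict of field name -> numeric value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def effective_free_kb(info):
    # Free memory plus the buffers and page cache the kernel can release.
    return info["MemFree"] + info["Buffers"] + info["Cached"]


info = parse_meminfo(SAMPLE)
print(effective_free_kb(info))          # 2956380 (kB)
print(effective_free_kb(info) // 1024)  # 2887 (MB)
```

On a live Linux system the same function can be fed the real file, for example parse_meminfo(open("/proc/meminfo").read()).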

Your initial audit should also look at the processes that are using the memory. Options include ‘top’ and ‘ps’.

$ top
M

top - 15:33:56 up 36 days, 17:08,  2 users,  load average: 0.01, 0.11, 0.08
Tasks: 133 total,   1 running, 132 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050776k total,  3792952k used,   257824k free,   368140k buffers
Swap:  1052248k total,      128k used,  1052120k free,  2329212k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2159 mysql     15   0  411m 204m 5624 S    0  5.2  33:26.43 mysqld
 2965 buildbot  15   0  280m 105m 1932 S    0  2.7 393:15.54 buildbot
 8533 nobody    15   0  168m  25m 4092 S    0  0.7   0:03.49 httpd
 9121 nobody    15   0  166m  24m 3432 S    0  0.6   0:01.61 httpd
...
$ ps -eopid,fname,rss,vsz,user,command | more
  PID COMMAND    RSS    VSZ USER     COMMAND
    1 init       700  10324 root     init [3]
   346 httpd    25252 170484 nobody   /opt/httpd-2.2.9/bin/httpd -k start
  364 httpd    25184 170344 nobody   /opt/httpd-2.2.9/bin/httpd -k start
  425 kpsmouse     0      0 root     [kpsmoused]
  452 httpd    21000 165684 nobody   /opt/httpd-2.2.9/bin/httpd -k start
...
 2095 mysqld_s  1204  63800 root     /bin/sh bin/mysqld_safe
 2159 mysqld   209448 421248 mysql   /opt/mysql51/bin/mysqld --basedir=/opt/mysql51 --datadir=/opt/mysql51/data --user=mysql --log-error=/opt/mysql51/data/dc1.onegreendog.com.err --pid-file=/opt/mysql51/data/dc1.onegreendog.com.pid
 ...
  • rss (RSS): resident set size, the non-swapped physical memory that a task has used, in kilobytes (alias rssize, rsz).
  • vsz (VSZ): virtual memory size of the process in KiB (1024-byte units). Device mappings are currently excluded; this is subject to change (alias vsize).

As I have written about previously in “Are you monitoring RSS & VSZ?”, these columns are important and should be monitored appropriately.
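
One simple way to start monitoring these columns is to parse the ps output and rank processes by RSS. The sketch below assumes the column order used in the ps command above (pid, fname, rss, vsz, user, command) and that fname contains no spaces; the function names are illustrative only.

```python
# Rows copied from the ps output above (the mysqld command line is shortened).
SAMPLE = """\
  PID COMMAND    RSS    VSZ USER     COMMAND
    1 init       700  10324 root     init [3]
  364 httpd    25184 170344 nobody   /opt/httpd-2.2.9/bin/httpd -k start
 2095 mysqld_s  1204  63800 root     /bin/sh bin/mysqld_safe
 2159 mysqld   209448 421248 mysql   /opt/mysql51/bin/mysqld
"""


def parse_ps(text):
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        # Split into at most six fields so the command keeps its spaces.
        pid, fname, rss, vsz, user, command = line.split(None, 5)
        rows.append({"pid": int(pid), "fname": fname, "rss_kb": int(rss),
                     "vsz_kb": int(vsz), "user": user, "command": command})
    return rows


def top_by_rss(rows, n=3):
    return sorted(rows, key=lambda r: r["rss_kb"], reverse=True)[:n]


for row in top_by_rss(parse_ps(SAMPLE)):
    print(row["pid"], row["fname"], row["rss_kb"], row["vsz_kb"])
```

Recording these figures at regular intervals, rather than looking once, is what shows whether mysqld is growing over time.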

Regarding swap: this is pre-assigned disk space that is used to swap out (dump) process memory when you have run out of physical memory for all the running processes. You never want to run out of memory on your database server. Swapping is extremely slow, and if your database server swaps out the mysqld process, this will effectively kill your database. If you have insufficient swap space for the process, on a database server this can in turn cause your system to crash.

Historically in the Unix world, swap was always defined as 2x memory. This also doubled as a place to dump all memory in a kernel panic. I spent time in a past life doing core dump analysis.
Today, most Linux systems are ill-configured for swap. If you use a dedicated server, for example, you may be limited to what is configured by a third party. The above example shows a configuration I would not recommend, where swap is less than 1x the memory.
There is also the option of having no swap. By setting swappiness to 0, you are effectively saying never swap. I would also not recommend this.
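
To put a number on the example above, this small sketch expresses swap as a multiple of physical memory using the MemTotal and SwapTotal values from the earlier /proc/meminfo listing. The function names are my own; read_swappiness is defined for use on a live Linux system and is not called here.

```python
def swap_ratio(mem_total_kb, swap_total_kb):
    """Swap size expressed as a multiple of physical memory."""
    return swap_total_kb / mem_total_kb


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Read the current swappiness setting (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


# MemTotal and SwapTotal from the /proc/meminfo listing above.
ratio = swap_ratio(4050776, 1052248)
print(f"swap is {ratio:.2f}x memory")  # swap is 0.26x memory

if ratio < 1.0:
    print("swap is smaller than physical memory")
```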

Monitoring memory usage closely is important. Because you can’t cap the total memory the mysqld process uses, taking appropriate action when it grows is critical. Correctly configuring MySQL to use memory optimally is key to a stable, long-running database server.

Next, we will be looking at Disk and Network resource bottlenecks.

Identifying resource bottlenecks – CPU

One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources, then identify any obvious bottlenecks in these resources.

When dealing with CPU, a quick audit should include identifying the number of CPU cores your server has, and the types of these cores. The key file on Linux systems is /proc/cpuinfo.

The number of cores can be found via the command cat /proc/cpuinfo | grep "^processor" | wc -l

You need to look more closely at the file to determine the type of CPU (e.g. below, the model name shows Intel(R) Xeon(R) CPU X3220 @ 2.40GHz). Combining the number of processors (cores) listed with the physical id and siblings values helps identify how many CPUs, and how many cores per CPU, exist.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz
stepping	: 11
cpu MHz		: 2394.051
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips	: 4789.96
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

...
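
The same audit can be scripted. The following sketch counts logical processors, physical CPUs and cores per CPU from /proc/cpuinfo style text. The sample is abbreviated: the stanza for processor 0 follows the listing above, while the stanzas for processors 1 to 3 (elided above) are assumed for illustration, and the function name is my own.

```python
# Abbreviated /proc/cpuinfo content. Processor 0 follows the listing above;
# the stanzas for processors 1-3 are assumed for illustration.
SAMPLE = "\n\n".join(
    f"processor\t: {n}\nphysical id\t: 0\nsiblings\t: 4\n"
    f"core id\t\t: {n}\ncpu cores\t: 4"
    for n in range(4)
)


def cpu_topology(text):
    logical = 0
    sockets = set()
    cores_per_socket = 0
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1            # one stanza per logical processor
        elif key == "physical id":
            sockets.add(value)      # distinct ids = physical CPUs
        elif key == "cpu cores":
            cores_per_socket = int(value)
    return {"logical": logical, "sockets": len(sockets),
            "cores_per_socket": cores_per_socket}


print(cpu_topology(SAMPLE))
# {'logical': 4, 'sockets': 1, 'cores_per_socket': 4}
```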

Other commands that help with identifying CPU/cores include mpstat and top.

$ mpstat -P ALL 5

11:43:43 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
11:43:48 AM  all    0.00    0.00    0.00    0.00    0.05    0.00    0.00   99.95   1033.00
11:43:48 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.40
11:43:48 AM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:43:48 AM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     31.40
11:43:48 AM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      1.00
$ top
1

top - 11:42:09 up 36 days, 13:17,  2 users,  load average: 0.20, 0.24, 0.25
Tasks: 133 total,   1 running, 132 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050776k total,  3825584k used,   225192k free,   397580k buffers
Swap:  1052248k total,      128k used,  1052120k free,  2302408k cached

You can easily identify a CPU bottleneck using the vmstat command.

The following shows an idle system.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
 0  0    128 234088 409636 2474392    0    0     0   100 1085  209  0  0 100  0  0
 0  0    128 233836 409636 2474396    0    0     0     0 1014  184  3  0 97  0  0
 0  0    128 233284 409636 2474396    0    0     0     0 1182  435  2  0 98  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1024  104  1  0 99  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1079  195  0  0 100  0  0
 1  0    128 233168 409644 2474396    0    0     0   232 1021  188  3  0 97  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1111  213  2  0 98  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1005   60  0  0 100  0  0

The key columns (from the man page) are:

CPU – These are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
  • st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.

Procs

  • r: The number of processes waiting for run time.

NOTE: The columns of vmstat may vary between different Linux Operating Systems.

If your system is CPU bound, you will observe the following. Look at the id, us, sy and r columns.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    128 275684 397176 2300672    0    0     0     0 1118  427 74  2 25  0  0
 3  0    128 217404 397176 2300672    0    0     0     0 1017  138 74  1 25  0  0
 6  0    128 239584 397176 2300672    0    0     0     0 1086  350 93  2  5  0  0
 4  0    128 269468 397176 2300672    0    0     0     0 1005  229 98  2  0  0  0
 4  0    128 217636 397180 2300668    0    0     0   168 1087  251 99  2  0  0  0
 4  0    128 240576 397180 2300668    0    0     0     0 1006  182 99  2  0  0  0
 4  0    128 270708 397180 2300668    0    0     0     0 1079  338 98  2  0  0  0
 4  0    128 218752 397180 2300684    0    0     0     0 1005  106 99  1  0  0  0
 4  0    128 226316 397180 2300684    0    0     0     0 1077  308 98  2  0  0  0
 4  0    128 198664 397184 2300680    0    0     0    76 1010  250 99  1  0  0  0
 4  0    128 179444 397184 2300680    0    0     0     0 1077  238 100  0  0  0  0
 4  0    128 185396 397184 2300688    0    0     0     0 1006  210 99  1  0  0  0
 4  0    128 199408 397184 2300688    0    0     0     0 1079  336 99  1  0  0  0
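
The check described above (little or no idle time, with at least as many runnable processes as cores) can be sketched in Python. The thresholds and function names are my own assumptions, not values from the vmstat man page; the sample rows are taken from the output above.

```python
# Header and selected rows copied from the CPU-bound vmstat output above.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    128 275684 397176 2300672    0    0     0     0 1118  427 74  2 25  0  0
 6  0    128 239584 397176 2300672    0    0     0     0 1086  350 93  2  5  0  0
 4  0    128 269468 397176 2300672    0    0     0     0 1005  229 98  2  0  0  0
 4  0    128 179444 397184 2300680    0    0     0     0 1077  238 100  0  0  0  0
"""


def parse_vmstat(text):
    header, rows = None, []
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["r", "b"]:
            header = fields  # column names line
        elif header and len(fields) == len(header) \
                and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def is_cpu_bound(row, cores, idle_threshold=5):
    # Assumed rule of thumb: almost no idle time and a run queue
    # at least as long as the number of cores.
    return row["id"] <= idle_threshold and row["r"] >= cores


rows = parse_vmstat(SAMPLE)
print([is_cpu_bound(row, cores=4) for row in rows])
# [False, True, True, True]
```

A single sample proves little; the pattern matters when it holds over many consecutive intervals, as it does in the output above.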

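You should also be wary of a single CPU-bound process. This is why knowing the number of cores is important. In this example, one CPU is bound: on this four-core server, us sits at around 25% in vmstat, while mpstat shows CPU 3 at 100% user time.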

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st

 0  0    128  99592 412544 2477580    0    0     0     0 1017   89  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1090  222  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1019   98  0  0 100  0  0
 1  0    128  99592 412544 2477580    0    0     0     0 1096  347 14  0 86  0  0
 1  0    128  99592 412548 2477576    0    0     0    84 1030  194 25  0 75  0  0
 1  0    128  99592 412548 2477576    0    0     0     0 1094  300 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1012   76 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1096  318 25  0 75  0  0
 1  0    128  73192 412548 2477580    0    0     0     0 1039  273 29  0 70  0  0
 1  0    128  77284 412556 2477572    0    0     0   268 1122  373 25  1 75  0  0
 2  0    128  83592 412556 2477584    0    0     0     0 1036  374 27  1 72  0  0
 0  0    128  56220 412564 2477576    0    0     0   172 1017   84  7  0 94  0  0
 0  0    128  56220 412564 2477576    0    0     0     0 1078  192  0  0 100  0  0
$ mpstat -P ALL 1
12:15:55 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:56 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1072.00
12:15:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00
12:15:56 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:56 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     62.00
12:15:56 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      9.00

12:15:56 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:57 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1021.00
12:15:57 PM    0    0.00    0.00    0.00    0.00    0.00    1.00    0.00   99.00   1001.00
12:15:57 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:57 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     18.00
12:15:57 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00
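
Spotting a single saturated core in mpstat output can also be scripted. This sketch assumes the column layout shown above and uses an arbitrary idle threshold of 5%; the function name is illustrative.

```python
# First sample copied from the mpstat output above.
SAMPLE = """\
12:15:55 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:56 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1072.00
12:15:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00
12:15:56 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:56 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     62.00
12:15:56 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      9.00
"""


def busy_cores(text, idle_below=5.0):
    """Return the CPU numbers whose %idle is below the threshold."""
    header, busy = None, []
    for line in text.splitlines():
        fields = line.split()
        if "CPU" in fields:
            header = fields  # column names line
            continue
        if not header or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        if row["CPU"] != "all" and float(row["%idle"]) < idle_below:
            busy.append(int(row["CPU"]))
    return busy


print(busy_cores(SAMPLE))  # [3]
```

The "all" row alone would suggest a server that is 75% idle, which is why the per-core rows are worth checking.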

I will detail how to identify Memory, Disk and Network bottlenecks in future posts. You can also find out more at the MySQL User Conference “Monitoring 101 – Simple stuff to save your bacon” session.

Two *FREE* MySQL Conferences Next Month

The annual MySQL Conference & Expo will be held this year on April 20-23 in Santa Clara, California, with a double twist.

Not one, but *two* FREE additional MySQL Conferences are running at the same time, in the same hotel. If you are on the West Coast you can effectively get a free conference with many MySQL experts speaking. I am speaking at all three, on three different topics.

The first announcement was the 2009 MySQL Camp organized by Sheeri K. Cabral – The She-BA in line with the O’Reilly approach of having a smaller un-conference within a conference such as with Web 2.0 NY last year.

However, the big news was that the Percona Performance Conference, which was to be held in an adjacent location at the same time, is now being held within the same hotel, the Hyatt Regency. In some regards, its schedule is a better lineup than the official conference.

Percona is a major player in providing services to the MySQL eco-system, and the failure to include Percona speakers such as Baron Schwartz was seen as a bizarre move. As a member of the MySQL Conference review committee, I saw that Percona submitted a number of presentations, and a number received high scores, more than sufficient in comparison to others accepted.

The MySQL eco-system is at a very fragile junction point. There are numerous independent versions, patches and forks now available, and the recent decision by Sun Microsystems, the trademark owner of the ‘mysql’ term, to request Google to enforce trademark usage within Google AdWords did not seem a productive move for an open source product. Read more at Hurting the little guy and MySQL Banned In Google Adword Campaigns. Combined with recent news that MySQL may now become part of IBM, greater uncertainty may prevail.

These conferences will be a unique opportunity to see and hear, hopefully, more details of what’s happening in the MySQL world.

Twitter Tips

I have in the past questioned the value of Twitter as an effective business tool, but it continues to defy the trend of social media failing to bridge the gap to business.

Even with its continuing growth problems (at least it’s not down as much), Twitter is everywhere I go, see or do. You see it at business events, on business cards, at meetups, even on CNN Headline News. There are so many different Twitter sites, applications, widgets etc., I’m surprised there isn’t a Twitter index just of the Twitter related sites.

I have now incorporated Twitter into my professional site and I’m using this micro-blogging approach more to share my professional skills and interests to my growing band of followers. I don’t expect to make the Twitter top list which is headed by CNN Breaking News with 667,353 followers.

Even Lance Armstrong (who rates 9th) used Twitter this week for press releases about his injuries.

For more reading check out How Twitter Makes You A Better Writer and 27 Twitter Applications Your Small Business Can Use Today.

I was surprised to see How to get a job by blogging: Tips for a setting up the kind of professional blog that will get you hired, barely mention Twitter.

Now be sure to add a background appropriate to your Twitter. This one is wicked.

Priceless Monty

While working with Monty Widenius on a bug I reported in MariaDB, I was surprised not to get an IRC response for a few minutes. When committed, Monty can identify a problem, create a workaround, and patch the code before you have time to read all the responses he is also typing. See my Monty’s Monument “Passion is a timeless wisdom” comment. One thing I forgot to say in that entry was that, while it was 9pm in New York, it was at least 3am-5am Monty time.

We all love memorable quotes, so here is one.

“sorry, machine died; First time in years”

[9:40pm] montywi: then just try compiling mysqld.o, no need to wait for everything else...
....
[9:58pm] rbradfor: make is clean.
[9:59pm] montywi: sorry, machine died;  First time in years
[10:00pm] rbradfor: laughs out load, a priceless monty quote, love it.
...

Hurting the little guy?

Today I came back from the dentist and, as if that wasn’t bad enough news, I got an email from Google AdWords titled Your Google AdWords Approval Status.

In the email, all my AdWords campaigns are now disapproved, because of:

SUGGESTIONS:
-> Ad Content: Please remove the following trademark from your ad:
mysql.

Yeah right. I can’t put the word ‘MySQL’ in my ads. How are people to find me now? It would appear that many ads have been pulled, not just mine. Is this a proactive measure by Google? Is this a complaint from the MySQL trademark holder, Sun Microsystems?

I’d like any comment, feedback or suggestions on how one can proceed here.

It reminds me of the days CentOS advertised itself as an “Open source provider of a popular North American Operating System”, or something of that nature.

Understanding the various MySQL Products & Variants

The MySQL marketplace today is far more complex than simply choosing between particular versions of MySQL that Sun/MySQL produces.
The MySQL server product in general is released under the GNU General Public License (GPL) v2, however you should carefully review the MySQL Legal Policies as a number of exceptions and different license agreements operate for companion tools such as MySQL Cluster, MySQL client libraries and documentation for example.

Looking into the MySQL ecosystem for products, I’ve produced the following categories:

  • Sun/MySQL Official Products
    • MySQL Versions
  • MySQL Variants
    • Community
    • Enterprise
  • MySQL Plugins
  • MySQL Patches
  • MySQL Alternatives

Why does such a diversification occur?  I attribute this to three primary causes:

  • The GPL license by nature allows an organization to take the product, modify it and use it for its specific needs. It can also provide these patches under the GPL for others to use and incorporate. While this has occurred, for example with Google, Facebook, eBay, Proven Scaling and Percona to name a few, Sun/MySQL has elected not to undertake any proactive process of incorporating these in any timely fashion.
  • The policy of Sun/MySQL for accepting contributions was so strict, and combined with a proprietary Version Control System, BitKeeper, that you had to purchase, there was little incentive for community contributions compared with so many other open source projects.
  • The Sun/MySQL management and decision makers didn’t listen to the community and paying customers, and over the past 3-5 years the product life cycle, features, release schedule and quality can be questioned.

Sun/MySQL Official Products

Sun/MySQL holds the license to the MySQL server products. They release official binaries and the source code (due to GPL). Even within MySQL, there are several products that differ subtly, and to the untrained eye it can be confusing to determine which is best. You can download the following versions from www.mysql.com:

  • MySQL Server 5.1  GA
  • MySQL Community Server 5.0 GA
  • MySQL Enterprise Server 5.0 GA
  • MySQL Cluster NDB 6.3

  • MySQL Server 4.1 (EOL)
  • MySQL Server 6.0 (Alpha)

MySQL Versions

It is important that you understand the MySQL Versions, especially in evaluating any of the following referenced variants, patches etc.
The common path for MySQL Server versions follows a generally linear numbering system, including historical versions 3.23, 4.0 and 4.1. These versions have now reached End Of Life (EOL) for support, however emergency security patches are applied where necessary.
Continuing from 4.1, you have the 5.0, 5.1 versions which are both Generally Available (GA), and then version 6.0 which is currently Alpha.

Further complexity happens when within the Sun/MySQL Official products, several forks/branches have occurred.  These include:

  • The MySQL 5.0 Community & Enterprise split occurred at MySQL Version 5.0.27
  • At this time, the Community version (free to download) continued with the intention of allowing for community contributions. Only one patch was ever accepted, and SHOW PROFILES was introduced in MySQL 5.0.37.  To date, 11 versions have been released to the current 5.0.77 version.
  • MySQL Enterprise (available under subscription) itself comprises three subtypes: Rapid Update Service Packs (monthly), Quarterly Service Packs (quarterly) and Hot-fix releases. To date 37 versions have been released to the current 5.0.78 version.
  • MySQL Cluster was part of the MySQL Server product until it was branched/forked at MySQL Version 5.1.23. This enabled MySQL Cluster to be labeled as Production Ready for Cluster clients, and not be held back by continued delays in the 5.1 server release. Starting a new versioning scheme with 6.1, MySQL Cluster NDB produces new versions at a rate far exceeding that of the server, with to date 23 versions in 6.1, 18 in version 6.2 and 24 in version 6.3. I am not advocating that features and quality are better or worse, simply that activity and interaction with community and users is far greater.
  • MySQL 5.1 Maria is a special branch starting at MySQL Version 5.1.24 that includes the Maria Storage Engine. This is the next generation of the MyISAM Storage Engine, both architected by the creator of MySQL, Monty Widenius. It is uncertain whether this will continue as a product released officially by Sun/MySQL.
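
When evaluating a server, it can help to turn the version string into something comparable. The sketch below maps a version to the release status described above (as of this writing) and compares versions as tuples. The lookup table and function names are my own, and suffixes such as "-community" are only examples of what a version string may carry.

```python
import re

# Release status per series, as described in the text above.
SERIES_STATUS = {
    (3, 23): "EOL", (4, 0): "EOL", (4, 1): "EOL",
    (5, 0): "GA", (5, 1): "GA",
    (6, 0): "Alpha",
}


def parse_version(version):
    """Turn a string such as '5.0.77-community' into (5, 0, 77)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"unrecognised version: {version!r}")
    return tuple(int(part) for part in match.groups())


def series_status(version):
    major, minor, _ = parse_version(version)
    return SERIES_STATUS.get((major, minor), "unknown")


print(series_status("5.1.32"))                # GA
print(series_status("4.1.22"))                # EOL
# SHOW PROFILES arrived in Community 5.0.37, so 5.0.27 predates it.
print(parse_version("5.0.27") >= (5, 0, 37))  # False
```

Comparing tuples rather than strings avoids the classic mistake of ranking 5.0.9 above 5.0.37.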

In Review

With just reading this introduction you can understand the confusion that exists when new customers/clients are beginning to evaluate the different MySQL Versions.

In my next post, I’ll talk more about:

  • MySQL Variants. Those I consider variants use the MySQL interface and protocol, and support the standard connectors. These include community versions (e.g. Solid, Infobright, Sphinx) and commercial versions (e.g. KickFire, Infobright, Nitro).
  • MySQL Patches are improvements that have been released to the community and are now becoming part of common third party MySQL packages, such as Percona, Proven Scaling and OurDelta.
  • MySQL Plugins are a feature of MySQL 5.1, and allow for pluggable storage engines in MySQL. While several companies have had to produce custom binaries due to the API limitations (especially with the optimizer), a number of engines support the API including InnoDB, PBXT and the filesystem engine.
  • MySQL Alternatives include any MySQL related products that have now deviated from being supported under the MySQL protocol.  Most notably here is Drizzle.

More Information

Ronald Bradford is Principal at 42SQL. We provide consulting and advisory services for the MySQL ecosystem with a focus on MySQL database performance, architecture and scalability. 42SQL also provides education in MySQL including the “MySQL Essentials” training course. You can find more information regarding this offering and an upcoming schedule at 42SQL Education.

Beginner CSV Engine issues

I’ve just started using the CSV engine for a practical application and I’ve come across a few subtle and unexpected issues/limitations.

First, you can’t define any column in a CSV table as nullable; every column must be declared NOT NULL.

mysql> create table t1(i INT) ENGINE=CSV;
ERROR 1178 (42000): The storage engine for the table doesn't support nullable columns

RTFM shows this restriction was introduced in 5.1.23. See CSV Limitations.

The second and more annoying was creating a CSV table, inserting a sample row (just to check the syntax), then replacing the file ([datadir]/[schema]/[table].CSV) with the same format, but with additional generated rows. This was to no avail when attempting to SELECT from the table.

The solution was to do a REPAIR TABLE [table] in order to see the newly refreshed data.
Furthermore, some more RTFM shows in Repairing and Checking CSV Tables an important caveat to using the CSV engine and loading data directly by file copy.

Warning: Note that during repair, only the rows from the CSV file up to the first damaged row are copied to the new table. All other rows from the first damaged row to the end of the table are removed, even valid rows.

Buyer beware.
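
Given that caveat, it may be worth checking a generated file before copying it into the data directory. The following is a minimal sketch only: it assumes a "damaged" row is one with the wrong number of fields, which is just one of the ways MySQL could consider a row damaged, and it uses Python's standard csv dialect rather than the exact quoting rules of the CSV engine. The function name and sample data are hypothetical.

```python
import csv
import io


def first_bad_row(csv_text, expected_columns):
    """Return the 1-based number of the first row whose field count
    differs from expected_columns, or None if every row matches."""
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        if len(row) != expected_columns:
            return row_number
    return None


# Hypothetical file content: the third row is missing a column.
sample = '1,"alpha"\n2,"beta"\n3\n4,"delta"\n'
print(first_bad_row(sample, expected_columns=2))
```

If a row number is reported, every row from that point on is at risk of being dropped by REPAIR TABLE, including valid ones.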

Infobright Community Edition(ICE) – It's Free

The March NY MySQL Meetup featured a presentation from Infobright, a data warehousing solution built on the MySQL Product.

With a pitch of “Simplicity, Scalability and low TCO” I became more impressed with the capability to deliver on these as the presentation proceeded. Here are some highlights.

  • The company and product have been around for a few years. Infobright started as a compression engine to sit beside Teradata, providing a significant cost saving to clients, and allowing a two way data transfer with Teradata.
  • In September 2008, an open source community edition was released, called ICE. (Which I didn’t know)
  • The technology is based on Rough Set theory, a mathematical approach.
  • Using a column oriented approach, compression generally starts at 10:1, and different applications can get 30:1 or better.
  • There is basically no tuning, and there are no indexes. Knowledge is gleaned at data loading and each data pack node holds key information per column, such as range of values (min,max).
  • Some interesting results: load time is constant and doesn’t degrade as the size of your data increases. Also, query performance scales with data volume.
  • Depending on queries, the knowledge grid can retrieve results without having to uncompress the data, i.e. introspection of the meta data is all that is needed
  • Infobright is not a pluggable storage engine, rather a custom binary of MySQL. This is due to the restrictions of the API and the lack of optimizer push down conditions for example.
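
To illustrate how per-column (min,max) information can answer part of a query without touching compressed data, here is a conceptual sketch. It is not Infobright's implementation; the function, the pack values and the three labels are assumptions chosen for illustration.

```python
def classify_packs(packs, low, high):
    """Classify each pack for the predicate low <= value <= high,
    using only the (min, max) pair recorded for the pack."""
    result = []
    for pack_min, pack_max in packs:
        if pack_max < low or pack_min > high:
            result.append("irrelevant")  # no row in the pack can match
        elif low <= pack_min and pack_max <= high:
            result.append("relevant")    # every row in the pack matches
        else:
            result.append("suspect")     # the pack must be read to decide
    return result


# Hypothetical (min, max) pairs for four packs of one column.
packs = [(1, 10), (11, 20), (15, 40), (50, 60)]
print(classify_packs(packs, low=11, high=30))
```

Only the "suspect" pack would need to be uncompressed in this example; the other three are resolved from metadata alone.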

The product is not without some limitations, but you have to realize the product is for a data warehousing implementation, not an OLTP web app. It’s not great with SELECT *, and large text strings for example.

Functionality continues to be added, with a recent release adding many more MySQL functions. Again, Infobright does not claim to be a solution for everybody. There is no UDF or SP support at this time, however I’d warrant this is really not needed.

While the presentation went into some detail regarding the knowledge grid, data packs, data pack nodes, and pack to pack integration from a slide perspective, it lacked the technical detail: here is how you use the loader to get data out of MySQL and into Infobright, here is the throughput, etc. As a marketing presentation it had the right content, but I’d now like to see the companion technical presentation.

Having previously been part of the MySQL Consulting team, and having worked also in the Storage Engine API with the Nitro Storage engine I have a distinct advantage of knowing the complexities of integration with MySQL. We can only hope this continues to improve with future releases of MySQL enabling Infobright and other products to integrate better and keep up to date with the MySQL Release cycle.

Are you monitoring RSS & VSZ?

Monitoring MySQL Memory is a rather critical task because you can’t limit MySQL’s usage of physical memory resources. Improperly configured servers running MySQL can crash because you don’t understand memory usage.

MySQL uses memory in a number of different ways. Using the Oracle analogy, you can divide the mysqld memory usage into main areas of:

  • SGA – System Global Area
  • PGA – Process Global Area

The SGA is the footprint that MySQL uses for startup. This is attributed to the base footprint of the mysqld process and a number of buffers including:

NOTE: This is for a default MySQL 5.1 install. Other storage engines and/or other versions of MySQL may have additional buffers. Falcon for example in MySQL 6.x has additional buffers.

The PGA is more complex, and is the usual cause of a server running out of memory and needing to swap. The goal of monitoring memory usage is, of course, to avoid this.
This additional memory is a combination of a few areas including:

  • MEMORY tables
  • Connection management (such as thread_cache and table_cache)
  • Per Connection memory usage

The latter is the cause of greatest concern, especially for environments that have persistent connections. Per Connection memory usage is a combination of many buffers including the thread_stack, 2 x net_buffer_length (to max_allowed_packet), read_buffer_size, read_rnd_buffer_size, join_buffer_size, sort_buffer_size, and up to min(max_heap_table_size,tmp_table_size). In fact, a single query may use multiple temporary tables.

Memory on a per connection basis is kept until the connection is closed. In some instances, such as net_buffer_length, this is apparently reduced after each SQL statement result. With a persistent connection model (e.g. Java), ensuring idle connections drop to a low watermark is a valuable task. The confusing part is that MySQL instrumentation does not tell you exactly how much is used, and it’s impossible to calculate with the available data.
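
While an exact figure is not available, a rough per connection number can still be useful for sanity checking a configuration. The sketch below simply adds up the buffers named above for one connection. It is an approximation only: the net buffers can grow towards max_allowed_packet and a query can use several temporary tables, so real usage may be higher or lower. The variable values are illustrative assumptions, not recommended settings.

```python
def rough_per_connection_bytes(variables):
    """Add up the per connection buffers named in the text, in bytes."""
    return (
        variables["thread_stack"]
        + 2 * variables["net_buffer_length"]
        + variables["read_buffer_size"]
        + variables["read_rnd_buffer_size"]
        + variables["join_buffer_size"]
        + variables["sort_buffer_size"]
        + min(variables["max_heap_table_size"], variables["tmp_table_size"])
    )


# Illustrative values only (bytes); read your own from SHOW GLOBAL VARIABLES.
example = {
    "thread_stack": 196608,
    "net_buffer_length": 16384,
    "read_buffer_size": 131072,
    "read_rnd_buffer_size": 262144,
    "join_buffer_size": 131072,
    "sort_buffer_size": 2097152,
    "max_heap_table_size": 16777216,
    "tmp_table_size": 33554432,
}
per_connection = rough_per_connection_bytes(example)
print(per_connection, "bytes per connection")
print(per_connection * 100, "bytes for 100 connections")
```

Multiplying by the number of connections you expect gives a feel for how quickly this area can outgrow physical memory.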

As part of monitoring your server, you should monitor the size of the mysqld memory usage, because this lets you be proactive rather than reactive about scarce memory resources. You can easily get this using the ps command. For example:

$ ps -eopid,fname,rss,vsz,user,command | grep -e "RSS" -e "mysql"
  PID COMMAND    RSS    VSZ USER     COMMAND
 5463 grep       764   5204 ronald   grep -e RSS -e mysql
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/vol/mysql/mysqldata --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld
$ ps -eopid,fname,rss,vsz,user,command | grep " mysqld " | grep -v grep | awk '{print $3,$4}'
4787820 5127208


From man ps
rss RSS resident set size, the non-swapped physical memory that a task has used (in kiloBytes). (alias rssize, rsz).
vsz VSZ virtual memory size of the process in KiB (1024-byte units). Device mappings are currently excluded; this is subject to change.
(alias vsize).

The moral of the story: don’t just monitor the free memory of your system. For a database server, you need to closely monitor the primary process on the server, that is, mysqld.
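
For recording these numbers over time, the ps output can be parsed by a small script. This is a minimal sketch that assumes the column order produced by the -o pid,fname,rss,vsz,user,command option used above; the function name is hypothetical and the sample is an abridged copy of the output shown earlier.

```python
def mysqld_memory_kb(ps_output):
    """Return (rss_kb, vsz_kb) for the first line whose fname column
    is exactly 'mysqld', or None if no such line exists."""
    for line in ps_output.splitlines():
        fields = line.split()
        # Columns: pid, fname, rss, vsz, user, command...
        if len(fields) >= 4 and fields[1] == "mysqld":
            return int(fields[2]), int(fields[3])
    return None


# Abridged sample of the ps output shown above.
sample_ps = """\
  PID COMMAND    RSS    VSZ USER     COMMAND
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld
"""
print(mysqld_memory_kb(sample_ps))
```

Matching on the fname column avoids picking up mysqld_safe (shown as mysqld_s) or the grep process itself.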

References

How MySQL Uses Memory

More Information

Join me for my MySQL User Conference talk on “Monitoring 101 – Simple stuff to save your bacon”.

I also cover monitoring MySQL in my “MySQL Essentials” training course. For more information visit MySQL Education.

Testing your system

I have raised this specific topic 3 times this week alone, twice in a MySQL setting.

The fundamental philosophy of testing is NOT to verify features of your product that work, it is to BREAK your system.

One such discussion this week was with a service provider that deployed a new system into an existing ecosystem. The release has been delayed due to development issues, and credibility with customers is now being further damaged because the system is reaching physical hardware limitations after just one month.

When this was described to me, my simple response was: you did not stress test your system to breaking point. To know the limit of your capacity ahead of time is a proactive analysis, not a reactive one.

It’s not that complicated to do, and it is easier at an early stage before you have a 50-100-1000 server environment, but it’s a best practice I do not see often enough.

Configuration management concepts for database objects

Correctly managing your MySQL database objects such as schemas, tables, indexes, base data etc, is critical to the success of a 24×7 online website. I rarely encounter a robust working solution as part of my consulting so I would like to share my experience in identifying the best practices you should be adopting, whether you’re an existing organization or just an individual with a simple website.

Many of the following concepts actually predate my involvement in MySQL (since 1999), so this is not just applicable to a MySQL RDBMS. For the purposes of this discussion I’d like to focus on the theory successfully used with clients.

Under version control I have the following directory structure:

NOTE If your first observation was “Arrh, Version Control?”, you are in more trouble than you want to be right from day one. You need Version Control such as svn, cvs, bzr, git etc for any website no matter how small.

/database
  /scripts
  /sql
    /schema
    /patch
    /revert
    /admin
  /data
  ....

The /database is a top level directory, and for software packaging for all database related operations, you simply include all contents from /database.

At its core, every database object change for configuration management will be addressed in three (3) files.

  • A schema file
  • A patch file
  • A revert file

In fact, you can add version control rules, for example to ensure that when you add a patch file, a corresponding revert file and schema file change are also included.
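
Such a rule could be a small check run from a version control hook. The sketch below is one possible shape, assuming file names of the form patch.YYYYMMDD.XX.sql and revert.YYYYMMDD.XX.sql as used in the examples that follow. The function name is hypothetical and how the file lists are obtained from your version control system is left out.

```python
import re

PATCH_NAME = re.compile(r"^patch\.(\d{8}\.\d{2})\.sql$")


def patches_missing_revert(patch_files, revert_files):
    """Return the patch file names that have no matching revert file."""
    reverts = set(revert_files)
    missing = []
    for name in sorted(patch_files):
        match = PATCH_NAME.match(name)
        if match and "revert.%s.sql" % match.group(1) not in reverts:
            missing.append(name)
    return missing


patch_dir = ["patch.20090303.01.sql", "patch.20090303.02.sql"]
revert_dir = ["revert.20090303.01.sql"]
print(patches_missing_revert(patch_dir, revert_dir))
```

A hook would reject the commit when the returned list is not empty.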

For a “current” working environment, there are two paths for database object management.

  • An upgrade path
  • A new version creation.

An upgrade path, which is the normal production operation, takes an existing database schema and ‘patches’ this to a new revision. As the name suggests, for each ‘patch’ file a corresponding ‘revert’ file can be used to revert the upgrade. For testing and development environments, a current version of the full schema can always be created without using the upgrade path, simply by creating the schema with the current schema file.

For the purposes of understanding how this would work in a real environment, I’ll use the Sakila test database and I’ll step through a few examples.

Seeding your configuration management

Because we already have an existing schema, the first step is to seed our new configuration management with the existing schema information.

This would actually involve some duplication, however this will become more apparent in future examples.

We will be creating the following three (3) files:

  • /database/sql/schema/schema.sakila.sql
  • /database/sql/patch/patch.20090303.01.sql
  • /database/sql/revert/revert.20090303.01.sql

/database/sql/schema/schema.sakila.sql
This will be a copy of the sakila-db/sakila-schema.sql. You will need to edit this file to remove the following lines.

DROP SCHEMA IF EXISTS sakila;
CREATE SCHEMA sakila;
USE sakila;

No configuration file may contain schema creation, removal or selection statements such as these. This will be discussed in more detail at a later time.

/database/sql/patch/patch.20090303.01.sql
This will be a copy of the above file.

/database/sql/revert/revert.20090303.01.sql

DROP PROCEDURE IF EXISTS rewards_report ;
DROP FUNCTION IF EXISTS get_customer_balance;
DROP PROCEDURE IF EXISTS film_in_stock;
DROP PROCEDURE IF EXISTS film_not_in_stock;
DROP FUNCTION IF EXISTS inventory_held_by_customer;
DROP FUNCTION IF EXISTS inventory_in_stock;
DROP VIEW IF EXISTS customer_list;
DROP VIEW IF EXISTS film_list;
DROP VIEW IF EXISTS nicer_but_slower_film_list;
DROP VIEW IF EXISTS staff_list;
DROP VIEW IF EXISTS sales_by_store;
DROP VIEW IF EXISTS sales_by_film_category;
DROP VIEW IF EXISTS actor_info;
DROP TABLE IF EXISTS actor;
DROP TABLE IF EXISTS address;
DROP TABLE IF EXISTS category;
DROP TABLE IF EXISTS city;
DROP TABLE IF EXISTS country;
DROP TABLE IF EXISTS customer;
DROP TABLE IF EXISTS film;
DROP TABLE IF EXISTS film_actor;
DROP TABLE IF EXISTS film_category;
DROP TABLE IF EXISTS film_text;
DROP TABLE IF EXISTS inventory;
DROP TABLE IF EXISTS language;
DROP TABLE IF EXISTS payment;
DROP TABLE IF EXISTS rental;
DROP TABLE IF EXISTS staff;
DROP TABLE IF EXISTS store;

For the purposes of this theory, I will discuss schema creation and management at a later time. For this example, we assume the ‘sakila’ schema has been created and is empty.

The ‘two’ paths

The default path is to apply the patch file to the appropriate schema. In this case, by using the patch file, this would create the current ‘sakila’ schema.

If this fails, you should automatically apply the revert script, which should restore your environment to its original state, in this case an empty schema.

If you wanted to create a new test environment for example, (following creation of the schema), you could simply apply the schema file.
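
The patch then revert on failure flow can be sketched as a small controlling function. This is an outline under stated assumptions: execute is a hypothetical callable that runs a script against the target schema and raises an exception on failure (for example a wrapper around the mysql client), and the fake executor below exists only to demonstrate the flow.

```python
def apply_patch(patch_sql, revert_sql, execute):
    """Run the patch; if it fails, run the revert script.
    Returns True when the patch applied, False when it was reverted."""
    try:
        execute(patch_sql)
        return True
    except Exception:
        # The revert script must cope with a partially applied patch.
        execute(revert_sql)
        return False


# Fake executor for demonstration: records statements, fails on 'BROKEN'.
log = []


def fake_execute(sql):
    log.append(sql)
    if "BROKEN" in sql:
        raise RuntimeError("simulated failure")


print(apply_patch("CREATE TABLE t (id INT);", "DROP TABLE IF EXISTS t;", fake_execute))
print(apply_patch("BROKEN STATEMENT;", "DROP TABLE IF EXISTS t;", fake_execute))
```

A real controlling script would also record which patches succeeded or failed.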

Let’s perform another iteration, to see the full working process.

Adding new objects

Let’s say we wanted to keep additional information such as famous quotes an actor has made. We want to create a new table ‘actor_quote’.

For this we would first create a patch and revert script to manage this new object.
/database/sql/patch/patch.20090303.02.sql

CREATE TABLE actor_quote (
  quote_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  actor_id SMALLINT UNSIGNED NOT NULL,
  quote   VARCHAR(200) NOT NULL,
  PRIMARY KEY  (quote_id),
  KEY idx_fk_actor_id (actor_id),
  CONSTRAINT fk_actor_quote_actor FOREIGN KEY (actor_id) REFERENCES actor (actor_id) ON DELETE RESTRICT ON UPDATE CASCADE
)ENGINE=InnoDB DEFAULT CHARSET=utf8;

/database/sql/revert/revert.20090303.02.sql

DROP TABLE IF EXISTS actor_quote;

/database/sql/schema/schema.sakila.sql
The contents of the patch file should be appended to this file.

The ‘two’ paths

If we look at the two paths again.

The normal production operation, by using the patch file, would create the new database object.

If this fails, you should automatically apply the revert script, which should restore your environment to its original state, in this case drop the table if it exists. In this simplest example,

If you wanted to create a new test environment for example, (following creation of the schema), you could simply apply the schema file.

Review

We have only touched on the entire process of configuration management for database objects. The implementation of this practice includes meta data and controlling scripts that manage the order of execution, recording operations performed successfully or unsuccessfully for example.

About Standards

Within this overview a number of standards are in place. These include:

  • SQL scripts do not contain any CREATE/DROP DATABASE|SCHEMA commands
  • SQL scripts do not contain any schema/database specific references. This is important for being able to easily test and verify operations. In our above examples, the default Sakila DB contains such information and would be edited appropriately.
  • For Patch and Revert files a chronological date format for naming is used, e.g. YYYYMMDD.XX, where XX is a sequential number for multiple patch/revert scripts for any given day.
  • All SQL statements must be terminated with ‘;’. This is important for the management processes and automated scripts that take these fundamental schema/patch/revert scripts as source information.
  • Where possible, try to make revert scripts support either a successful or failed patch process. For example, adding IF EXISTS to a DROP TABLE statement supports both cases.
  • It is reasonably obvious to have schema, patch and revert directories as a naming standard, but file names also include this as a prefix. This is performed as a double check: if a file is seen in isolation its type can be determined regardless of directory location. Also for logging, only filenames are used.
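
The chronological naming standard makes ordering simple for a controlling script, because a fixed width YYYYMMDD.XX identifier sorts correctly as plain text. The following is a minimal sketch of selecting the patches still to be applied. The function name, the last_applied argument and the patch.20090304.01.sql file name are hypothetical.

```python
import re

PATCH = re.compile(r"^patch\.(\d{8}\.\d{2})\.sql$")


def pending_patches(filenames, last_applied=""):
    """Return patch files newer than last_applied (a 'YYYYMMDD.XX'
    string), in the order they should be executed."""
    keyed = []
    for name in filenames:
        match = PATCH.match(name)
        if match and match.group(1) > last_applied:
            keyed.append((match.group(1), name))
    return [name for _, name in sorted(keyed)]


files = [
    "patch.20090304.01.sql",   # hypothetical later patch
    "revert.20090303.02.sql",
    "patch.20090303.02.sql",
    "patch.20090303.01.sql",
]
print(pending_patches(files, last_applied="20090303.01"))
```

Because the type prefix is part of the file name, revert files are ignored even if they end up in the same listing.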

More Information

Configuration Management in MySQL is one of the topics discussed in the “MySQL Essentials” training course. You can find more information regarding this and other training offerings including an upcoming schedule at 42SQL Education.

Planet MySQL at a new URL

Did anybody notice that http://planetmysql.org now redirects to http://planet.mysql.com?

Curious to know the reason why, perhaps an official MySQL person can give us some details.
Also it’s a 302 redirect, not a 301 redirect, interesting?

 wget http://planetmysql.org
--2009-02-26 14:40:09--  http://planetmysql.org/
Resolving planetmysql.org... 213.136.52.29
Connecting to planetmysql.org|213.136.52.29|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.planetmysql.org/ [following]
--2009-02-26 14:40:10--  http://www.planetmysql.org/
Resolving www.planetmysql.org... 213.136.52.29
Connecting to www.planetmysql.org|213.136.52.29|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://planet.mysql.com/ [following]
--2009-02-26 14:40:10--  http://planet.mysql.com/
Resolving planet.mysql.com... 213.136.52.29
Connecting to planet.mysql.com|213.136.52.29|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

Eliminating unnecessary internal temporary tables

I can’t stress enough that you should look at the SQL statements being executed against your production MySQL database, and optimize queries when you can.

Often it’s the improvement to the large number of similar queries executed that can optimize resources. In this example, we take a very simple query, and by removing an unnecessary order by, we eliminate MySQL internally creating a temporary (in memory) table.

So what’s the big deal?

  • The query is simpler to read and understand
  • Memory for the temporary table no longer needs to be allocated to the connection
  • A number of internal steps are no longer required (4 of 21 logging messages, not an ideal measurement, but an indication). In this case, it was easily a 10% performance improvement for each query.

This query is executed 10-100 times per second, so the improvement in performance is significant.

mysql> explain select max(mdate) as mdate from tbl  where original_account = '[email protected]' and id = '15847' order by mdate desc;
+----+-------------+----------------------+------+----------------------+------------------+---------+-------------+------+------------------------------+
| id | select_type | table                | type | possible_keys        | key              | key_len | ref         | rows | Extra                        |
+----+-------------+----------------------+------+----------------------+------------------+---------+-------------+------+------------------------------+
|  1 | SIMPLE      | tbl                  | ref  | ids,original_account | original_account | 388     | const,const |  146 | Using where; Using temporary |
+----+-------------+----------------------+------+----------------------+------------------+---------+-------------+------+------------------------------+
1 row in set (0.00 sec)
mysql> explain select max(mdate) as mdate from tbl  where original_account = '[email protected]' and id = '15847';
+----+-------------+----------------------+------+----------------------+------------------+---------+-------------+------+-------------+
| id | select_type | table                | type | possible_keys        | key              | key_len | ref         | rows | Extra       |
+----+-------------+----------------------+------+----------------------+------------------+---------+-------------+------+-------------+
|  1 | SIMPLE      | tbl                  | ref  | ids,original_account | original_account | 388     | const,const |  146 | Using where |
+----+-------------+----------------------+------+----------------------+------------------+---------+-------------+------+-------------+
1 row in set (0.00 sec)
mysql> show profile cpu,memory,source for query 1;
+--------------------------------+----------+----------+------------+---------------------------+---------------+-------------+
| Status                         | Duration | CPU_user | CPU_system | Source_function           | Source_file   | Source_line |
+--------------------------------+----------+----------+------------+---------------------------+---------------+-------------+
| (initialization)               | 0.00001  | 0        | 0          | send_result_to_client     | sql_cache.cc  |        1143 |
| checking query cache for query | 0.000079 | 0        | 0          | open_tables               | sql_base.cc   |        2652 |
| Opening tables                 | 0.000024 | 0        | 0          | mysql_lock_tables         | lock.cc       |         153 |
| System lock                    | 0.000015 | 0        | 0          | mysql_lock_tables         | lock.cc       |         163 |
| Table lock                     | 0.000041 | 0        | 0          | mysql_select              | sql_select.cc |        2282 |
| init                           | 0.000046 | 0        | 0          | optimize                  | sql_select.cc |         765 |
| optimizing                     | 0.000027 | 0        | 0          | optimize                  | sql_select.cc |         924 |
| statistics                     | 0.000173 | 0        | 0          | optimize                  | sql_select.cc |         934 |
| preparing                      | 0.000028 | 0        | 0          | optimize                  | sql_select.cc |        1383 |
| Creating tmp table             | 0.000053 | 0        | 0          | exec                      | sql_select.cc |        1603 |
| executing                      | 0.000011 | 0        | 0          | exec                      | sql_select.cc |        1743 |
| Copying to tmp table           | 0.002226 | 0        | 0          | exec                      | sql_select.cc |        2123 |
| Sending data                   | 0.000148 | 0        | 0          | mysql_select              | sql_select.cc |        2327 |
| end                            | 0.000013 | 0        | 0          | free_tmp_table            | sql_select.cc |       10115 |
| removing tmp table             | 0.000064 | 0        | 0          | free_tmp_table            | sql_select.cc |       10143 |
| end                            | 0.000014 | 0        | 0          | mysql_execute_command     | sql_parse.cc  |        5154 |
| query end                      | 0.000012 | 0        | 0          | query_cache_end_of_result | sql_cache.cc  |         735 |
| storing result in query cache  | 0.000047 | 0        | 0          | mysql_parse               | sql_parse.cc  |        6155 |
| freeing items                  | 0.000021 | 0        | 0          | dispatch_command          | sql_parse.cc  |        2146 |
| closing tables                 | 0.000014 | 0        | 0          | log_slow_statement        | sql_parse.cc  |        2204 |
| logging slow query             | 0.000011 | 0        | 0          | dispatch_command          | sql_parse.cc  |        2169 |
+--------------------------------+----------+----------+------------+---------------------------+---------------+-------------+
21 rows in set (0.00 sec)


mysql> show profile cpu,memory,source for query 2;
+--------------------------------+-----------+----------+------------+---------------------------+---------------+-------------+
| Status                         | Duration  | CPU_user | CPU_system | Source_function           | Source_file   | Source_line |
+--------------------------------+-----------+----------+------------+---------------------------+---------------+-------------+
| (initialization)               | 0.000021  | 0        | 0          | send_result_to_client     | sql_cache.cc  |        1143 |
| checking query cache for query | 0.000090  | 0        | 0          | open_tables               | sql_base.cc   |        2652 |
| Opening tables                 | 0.000022  | 0        | 0          | mysql_lock_tables         | lock.cc       |         153 |
| System lock                    | 0.000014  | 0        | 0          | mysql_lock_tables         | lock.cc       |         163 |
| Table lock                     | 0.000044  | 0        | 0          | mysql_select              | sql_select.cc |        2282 |
| init                           | 0.000049  | 0        | 0          | optimize                  | sql_select.cc |         765 |
| optimizing                     | 0.000028  | 0        | 0          | optimize                  | sql_select.cc |         924 |
| statistics                     | 0.000179  | 0        | 0          | optimize                  | sql_select.cc |         934 |
| preparing                      | 0.000029  | 0        | 0          | exec                      | sql_select.cc |        1603 |
| executing                      | 0.000016  | 0        | 0          | exec                      | sql_select.cc |        2123 |
| Sending data                   | 0.00229   | 0        | 0          | mysql_select              | sql_select.cc |        2327 |
| end                            | 0.000039  | 0        | 0          | mysql_execute_command     | sql_parse.cc  |        5154 |
| query end                      | 0.000012  | 0        | 0          | query_cache_end_of_result | sql_cache.cc  |         735 |
| storing result in query cache  | 0.000011  | 0        | 0          | mysql_parse               | sql_parse.cc  |        6155 |
| freeing items                  | 0.00002   | 0        | 0          | dispatch_command          | sql_parse.cc  |        2146 |
| closing tables                 | 0.000014  | 0        | 0          | log_slow_statement        | sql_parse.cc  |        2204 |
| logging slow query             | 0.00001   | 0        | 0          | dispatch_command          | sql_parse.cc  |        2169 |
+--------------------------------+-----------+----------+------------+---------------------------+---------------+-------------+
17 rows in set (0.00 sec)

Announcing "MySQL Essentials" Training

Are you having problems getting up to speed on MySQL? Are you asking yourself “Is there a hands-on training course we can send a developer/system admin to learn MySQL?”. In response, at 42SQL we have put together two new training courses, MySQL Essentials and MySQL Operations.

MySQL Essentials Training Details

With MySQL Essentials we tackle the core essentials that a developer/system admin/junior DBA would require in order to support an initial development environment that uses MySQL. Essentials training teaches the following skills:

  • Which version of MySQL to use (including the various different variants and patches available)
  • Backup, retention, and recovery strategies
  • Configuration and Monitoring of MySQL
  • Optimal schema and data objects configuration management
  • more information here

We are now accepting registrations for MySQL Essentials training being held on April 1st – 2nd in New York, and April 6th – 7th in Washington DC.

About the presenter

Ronald Bradford is a two-decade veteran with extensive database experience in MySQL, Oracle and Ingres. His expertise covers data architecture, software development, migration, performance analysis and production system implementations. With ten years experience in MySQL, his involvement in the MySQL ecosystem has included working as Senior Consultant with MySQL Inc, speaker at four MySQL Conferences, and creator of the “MySQL for Oracle DBA’s” one-day workshop. Ronald holds MySQL Certifications including DBA 5.0, Developer 5.0 and MySQL Cluster 5.1.

The art of looking at the actual SQL statements

It’s a shame that MySQL does not provide better granularity when you want to look at all SQL statements being executed in a MySQL server. I grant that you can with the general log, but it has inherent starting/stopping problems in 5.0 (improved in 5.1), and I would still like to see the option on a per connection basis, or even for a time period. MySQL Proxy can provide a solution here but also with some caveats.

You should however, in a NON production environment, take the time to enable the general log and look at the SQL statements. Prior to looking at the SQL, monitoring of the GLOBAL STATUS variables combined with Statpack revealed the following in a 1 minute interval.

====================================================================================================
                                         Statement Activity
====================================================================================================

                     SELECT:           16,042                   267.37                8,177,050 (46.03%)
                     INSERT:            5,838                    97.30                1,826,616 (10.28%)
                     UPDATE:            1,109                    18.48                  738,546 (4.16%)
                     DELETE:            2,018                    33.63                1,374,983 (7.74%)
                    REPLACE:                0                     0.00                        0 (0.00%)
          INSERT ... SELECT:                0                     0.00                       27 (0.00%)
         REPLACE ... SELECT:                0                     0.00                        0 (0.00%)
               Multi UPDATE:                0                     0.00                        0 (0.00%)
               Multi DELETE:                0                     0.00                        0 (0.00%)
                     COMMIT:            5,708                    95.13                2,161,232 (12.17%)
                   ROLLBACK:            5,746                    95.77                3,485,828 (19.62%)

If you notice the last 2 lines, some 19% of statements executed on the server are ROLLBACK. Further analysis of the schema shows mainly InnoDB tables (good, as COMMIT and ROLLBACK are supported), but also some MyISAM tables.
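
The percentages in the report can be reproduced from the counts in the right-hand column. This is a minimal sketch that assumes the percentage is each statement type's share of the sum of the listed counts; rows with a zero count are left out and the function name is hypothetical.

```python
def statement_share(counters):
    """Return each statement type's share of the total, as a
    percentage rounded to two decimal places."""
    total = sum(counters.values())
    if total == 0:
        return {}
    return {name: round(100.0 * count / total, 2)
            for name, count in counters.items()}


# Counts copied from the right-hand column of the report above.
counters = {
    "SELECT": 8177050,
    "INSERT": 1826616,
    "UPDATE": 738546,
    "DELETE": 1374983,
    "INSERT ... SELECT": 27,
    "COMMIT": 2161232,
    "ROLLBACK": 3485828,
}
share = statement_share(counters)
print(share["ROLLBACK"], share["COMMIT"])
```

The same calculation can be run against your own status counters to spot an unexpectedly large share of any one statement type.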

The following is a snippet from the general log.

                     23 Query       select 1
                     23 Query       INSERT INTO JMS_TRANSACTIONS (TXID) values(17719)
                     23 Query       UPDATE JMS_MESSAGES SET TXID=17719, TXOP='D' WHERE MESSAGEID=16248 AND DESTINATION='QUEUE.receivemail'
                     23 Query       commit
                     23 Query       rollback
                     23 Query       select 1
                     23 Query       DELETE FROM JMS_MESSAGES WHERE TXID=17719 AND TXOP='D'
                     23 Query       DELETE FROM JMS_TRANSACTIONS WHERE TXID = 17719
                     23 Query       commit
                     23 Query       rollback

This turns out to be most interesting. These tables are used by the Java Messaging Service, and I observed three points.

  • the ‘select 1’ is effectively a ping test to confirm the connection is still valid. MySQL provides a more lightweight COM_PING. It would be good to know if this environment using JBoss could support that.
  • There is a ‘ROLLBACK’ after every command, totally redundant, and most likely part of higher level framework.
  • The ‘COMMIT’ is used in conjunction with a number of statements, however when I mentioned earlier some tables were MyISAM, these were the JMS tables, so in this situation the commit is useless as this is not a transactional storage engine.

A number of decisions are needed to correct this problem, however the point of raising this is: always look at your SQL.
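
Spotting this pattern by eye works for a snippet, but a short script helps with a large log. The sketch below counts ROLLBACK statements that directly follow a COMMIT on the same connection. It assumes each line has the shape shown above (connection id, the word Query, then the statement); a real general log also contains timestamps and other command types, which this sketch simply skips. The function name is hypothetical.

```python
def count_rollback_after_commit(log_lines):
    """Count 'rollback' statements that directly follow a 'commit'
    on the same connection id."""
    previous = {}  # connection id -> previous statement on that connection
    redundant = 0
    for line in log_lines:
        parts = line.split(None, 2)
        if len(parts) < 3 or parts[1] != "Query":
            continue
        conn_id, statement = parts[0], parts[2].strip().lower()
        if statement == "rollback" and previous.get(conn_id) == "commit":
            redundant += 1
        previous[conn_id] = statement
    return redundant


# The general log snippet shown above.
snippet = """\
                     23 Query       select 1
                     23 Query       INSERT INTO JMS_TRANSACTIONS (TXID) values(17719)
                     23 Query       UPDATE JMS_MESSAGES SET TXID=17719, TXOP='D' WHERE MESSAGEID=16248 AND DESTINATION='QUEUE.receivemail'
                     23 Query       commit
                     23 Query       rollback
                     23 Query       select 1
                     23 Query       DELETE FROM JMS_MESSAGES WHERE TXID=17719 AND TXOP='D'
                     23 Query       DELETE FROM JMS_TRANSACTIONS WHERE TXID = 17719
                     23 Query       commit
                     23 Query       rollback
""".splitlines()
print(count_rollback_after_commit(snippet))
```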

Watching a slave catchup

This neat one line command can be of interest when you are rebuilding a MySQL slave and replication is currently catching up.

$ watch --interval=1 --differences 'mysql -uuser -ppassword -e "SHOW SLAVE STATUS\G"'

You will see the standard SHOW SLAVE STATUS output, but the watch command presents an updated view every second, and highlights differences. This can be useful in a background window to keep an eye on those ‘Seconds Behind Master’.

*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.10.10.10
                Master_User: slave
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000626
        Read_Master_Log_Pos: 88159239
             Relay_Log_File: slave-relay.000005
              Relay_Log_Pos: 426677632
      Relay_Master_Log_File: mysql-bin.000621
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 426677495
            Relay_Log_Space: 2714497549
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 24131
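
If you want to do more than eyeball the output, the vertical format is easy to parse. The following is a minimal Python sketch, assuming output in the layout shown above. The helper names are hypothetical, and the "binlog files behind" figure is simply the difference between the numeric suffixes of Master_Log_File (what the I/O thread is reading) and Relay_Master_Log_File (what the SQL thread is executing).

```python
def parse_vertical_output(text):
    """Parse 'Name: value' lines from \\G style output into a dict of strings."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("*") or ":" not in line:
            continue
        name, _, value = line.partition(":")
        status[name.strip()] = value.strip()
    return status


def lag_summary(status):
    """Summarise how far the SQL thread is behind the I/O thread."""
    raw = status.get("Seconds_Behind_Master", "NULL")
    seconds = None if raw == "NULL" else int(raw)  # NULL when not replicating
    io_file = int(status["Master_Log_File"].rsplit(".", 1)[1])
    sql_file = int(status["Relay_Master_Log_File"].rsplit(".", 1)[1])
    return {"seconds_behind": seconds, "binlog_files_behind": io_file - sql_file}


def format_lag(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# A few of the fields from the output above.
sample_status = """
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
            Master_Log_File: mysql-bin.000626
      Relay_Master_Log_File: mysql-bin.000621
      Seconds_Behind_Master: 24131
"""

summary = lag_summary(parse_vertical_output(sample_status))
print(summary, format_lag(summary["seconds_behind"]))
```

For the sample above this reports the slave as 5 binary log files and a little under 7 hours behind.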

Some Drupal observations

I had the opportunity to review a client’s production Drupal installation recently. This is a new site and traffic is just starting to pick up. Drupal is a popular open source CMS for the LAMP stack, using the MySQL database.

Unfortunately I don’t always have the chance to focus on one product when consulting. Sometimes the time available is anywhere from minutes to a few hours. Here are some observations from looking at Drupal.

Disk footprint

Presently, content and traffic volume is low, but it is expected to ramp up. I did however find most of the disk volume, 636MB of 813MB or close to 80%, in one table called ‘watchdog’.


+--------------+--------------+--------------+-------------+--------+
| table_schema | total_mb     | data_mb      | index_mb    | tables |
+--------------+--------------+--------------+-------------+--------+
| xxxxx        | 812.95555878 | 745.34520721 | 67.61035156 |    191 |
+--------------+--------------+--------------+-------------+--------+

+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| table_name                                | engine | row_format | table_rows | avg_row_length | total_mb     | data_mb      | index_mb    |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| watchdog                                  | MyISAM | Dynamic    |      63058 |            210 | 636.42242813 | 607.72516251 | 28.69726563 |
| cache_menu                                | MyISAM | Dynamic    |        145 |         124892 |  25.33553696 |  25.32577133 |  0.00976563 |
| search_index                              | MyISAM | Dynamic    |     472087 |             36 |  23.40134048 |  16.30759048 |  7.09375000 |
| comments                                  | MyISAM | Dynamic    |      98272 |            208 |  21.83272934 |  19.58272934 |  2.25000000 |

Investigating the content of the ‘watchdog’ table shows detailed logging. Drilling down just on the key ‘type’ records shows the following.

mysql> select message,count(*) from watchdog where type='page not found' group by message order by 2 desc limit 10;
+--------------------------------------+----------+
| message                              | count(*) |
+--------------------------------------+----------+
| content/images/loadingAnimation.gif  |    17198 |
| see/images/loadingAnimation.gif      |     6659 |
| images/loadingAnimation.gif          |     6068 |
| node/images/loadingAnimation.gif     |     2774 |
| favicon.ico                          |     1772 |
| sites/all/modules/coppa/coppa.js     |      564 |
| users/images/loadingAnimation.gif    |      365 |
| syndicate/google-analytics.com/ga.js |      295 |
| content/img_pos_funny_lowsrc.gif     |      230 |
| content/google-analytics.com/ga.js   |      208 |
+--------------------------------------+----------+
10 rows in set (2.42 sec)

Some 25% of rows are just reports of one missing file. Correcting this one file cuts down a pile of unnecessary logging.
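
The top entry is only part of the story, because the same file name is being requested under several paths. As a worked example, this small Python sketch regroups the ten rows above by file name and compares the result to the 63058 rows reported for the watchdog table. The figures are copied from the output above. Grouping by basename is my own choice for illustration.

```python
from collections import Counter
import posixpath

WATCHDOG_ROWS = 63058  # table_rows for watchdog in the listing above

# (message, count) pairs copied from the 'page not found' query output.
rows = [
    ("content/images/loadingAnimation.gif", 17198),
    ("see/images/loadingAnimation.gif", 6659),
    ("images/loadingAnimation.gif", 6068),
    ("node/images/loadingAnimation.gif", 2774),
    ("favicon.ico", 1772),
    ("sites/all/modules/coppa/coppa.js", 564),
    ("users/images/loadingAnimation.gif", 365),
    ("syndicate/google-analytics.com/ga.js", 295),
    ("content/img_pos_funny_lowsrc.gif", 230),
    ("content/google-analytics.com/ga.js", 208),
]

# Sum the hits per file name, ignoring the directory part of the path.
by_file = Counter()
for path, hits in rows:
    by_file[posixpath.basename(path)] += hits

for name, hits in by_file.most_common():
    share = 100 * hits / WATCHDOG_ROWS
    print(f"{name:30} {hits:6} {share:5.1f}%")
```

Grouped this way, loadingAnimation.gif alone accounts for 33064 rows, a little over half of the table.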

Repeating Queries

Looking at just 1 random second of SQL logging shows 1200+ SELECT statements.
355 are SELECT changed FROM node

$ grep would_you_rather drupal.1second.log
              7 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              5 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              3 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1
              4 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              6 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
             10 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              8 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1

There is plenty of information regarding monitoring the slow queries in MySQL, but I have also promoted that it’s not the slow queries that ultimately slow a system down, but the 1000’s of repeating fast queries.

MySQL of course has the Query Cache to assist, but this is a coarse-grained solution, and in a high volume read/write environment it is meaningless.

There is a clear need for either application level caching, or a database redesign to pull rather than poll this information. However, without a more in-depth review of Drupal I cannot make any judgment calls.
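
Finding the repeating queries in a log like this does not need much tooling. Here is a minimal Python sketch that reduces each statement to a fingerprint by replacing literals with a placeholder, then counts the fingerprints. The fingerprint rules are a deliberately crude assumption of mine (quoted strings and whole numbers only), and the log layout is assumed to match the grep output above.

```python
import re
from collections import Counter


def fingerprint(sql):
    """Replace literals with '?' so identical query shapes compare equal."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numbers
    return re.sub(r"\s+", " ", sql).strip().lower()


def count_fingerprints(lines):
    """Count query shapes in general query log lines containing 'Query'."""
    counts = Counter()
    for line in lines:
        if "Query" not in line:
            continue
        sql = line.split("Query", 1)[1]
        counts[fingerprint(sql)] += 1
    return counts


# The nine lines from the grep output above.
changed = "Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1"
field = "Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1"
sample_log = [
    f"  7 {changed}", f"  5 {changed}", f"  3 {field}",
    f"  4 {changed}", f"  6 {changed}", f" 10 {changed}",
    f"  9 {changed}", f"  8 {changed}", f"  9 {field}",
]

counts = count_fingerprints(sample_log)
for shape, hits in counts.most_common():
    print(hits, shape)
```

Run against a full second of logging, the top few fingerprints are the first candidates for caching.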

Best Practices in Migrating to MySQL

This week I was the invited speaker to give a 4 hour presentation to the Federal Government Sector in Washington DC on “Best Practices in Migrating to MySQL”. This was a followup to my day long “MySQL for the Oracle DBA Bootcamp” which I presented in Washington DC last year. It was good to see a number of attendees from my first DC presentation.

There was good attendance across various government departments and companies providing services to the government sector, as well as a variety of job descriptions.

Thanks to Carahsoft and Sun/MySQL for organizing and sponsoring the event. Thanks also to Phil Hildebrand who provided fantastic support during my preparation answering all my SQL Server questions.

Thanks also to Baron Schwartz, creator of Maatkit, who as my invited guest was nice enough to table a list of attendee questions, which is always a good reference for revising slides and writing more blog posts.

You can find the first of seven sessions online in my presentations section.

Updated
Thanks to Baron Schwartz for his follow-up blog posts Migrating US Government applications from Oracle to MySQL and 50 things to know before migrating Oracle to MySQL.

Strict mode can still throw warnings

MySQL by default is very lax with data validation. Silent conversion is a concept that is not common practice in other databases. In MySQL, instead of throwing an error, a warning was thrown and many applications simply did not handle warnings. With the introduction of sql_mode=STRICT_ALL_TABLES (or TRADITIONAL) in MySQL 5, a better level of validation now exists.

My understanding was that Warnings are now thrown as Errors, therefore eliminating the need to do a SHOW WARNINGS to confirm any problems after every query (this is a performance overhead on a high volume system due to the round trip latency).

However I found an instance where MySQL in strict mode still throws warnings, leading to the questions: are there any other areas, and does the earlier statement “Warnings are now thrown as Errors” hold true?

Here is my seeding process to show the problem.

mysql> create table i(i tinyint, unique key( i));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into i values(999);
Query OK, 1 rows affected (0.00 sec)

Using default settings, the out of range value 999 is silently stored as 127, the TINYINT maximum. Attempting to INSERT a duplicate row then throws an error; using INSERT IGNORE does not.

mysql> insert into i values(999);
ERROR 1062 (23000): Duplicate entry '127' for key 'i'
mysql> insert ignore into i values(999);
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1264 | Out of range value for column 'i' at row 1 |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)

When using a strict mode, a recommendation for all new systems, it is generally accepted that warnings are translated into errors, which implies your code should never have to consider checking for warnings.

mysql> truncate table i;
mysql> set sql_mode=strict_all_tables;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into i values(999);
ERROR 1264 (22003): Out of range value for column 'i' at row 1
mysql> insert ignore into i values(999);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1264 | Out of range value for column 'i' at row 1 |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)
mysql> set sql_mode=traditional;
Query OK, 0 rows affected (0.00 sec)

mysql> insert ignore into i values(9990);
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1264 | Out of range value for column 'i' at row 1 |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)

I should also caveat this post by urging caution with INSERT IGNORE. It should only be used if you never care about errors, which I would never consider a best practice design approach.
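
To make the behaviour seen above easier to reason about, here is a small Python model of how an out of range TINYINT value is handled in the three situations tested. This is a simplified sketch of the observed behaviour only, not MySQL code, and the function and exception names are hypothetical.

```python
TINYINT_MIN, TINYINT_MAX = -128, 127  # range of a signed TINYINT


class OutOfRange(Exception):
    """Stands in for ERROR 1264 in this model."""


def store_tinyint(value, strict=False, ignore=False):
    """Return (stored_value, warnings) as observed in the sessions above."""
    if TINYINT_MIN <= value <= TINYINT_MAX:
        return value, []
    message = "Out of range value for column 'i' at row 1"
    if strict and not ignore:
        # Strict mode without IGNORE: the statement fails with an error.
        raise OutOfRange(message)
    # Non-strict mode, or IGNORE in any mode: clamp the value and warn.
    clamped = max(TINYINT_MIN, min(TINYINT_MAX, value))
    return clamped, [message]


print(store_tinyint(999))                            # default mode
print(store_tinyint(999, strict=True, ignore=True))  # strict mode with IGNORE
try:
    store_tinyint(999, strict=True)                  # strict mode
except OutOfRange as error:
    print("ERROR:", error)
```

The point of the model is the middle case: with IGNORE the statement succeeds even in strict mode, so code relying on errors alone never sees the problem.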

Reducing the MySQL 5.1.30 disk footprint

The current size of a MySQL 5.1.30 installation is around 420M.

$ du -sh .
426M	.

A further breakdown.

$ du -sh *
213M	bin
20K	COPYING
9.8M	docs
8.0K	EXCEPTIONS-CLIENT
436K	include
12K	INSTALL-BINARY
121M	lib
504K	man
4.0K	my.cnf
77M	mysql-test
4.0K	README
20K	scripts
2.3M	share
2.9M	sql-bench
100K	support-files

A means to reduce the footprint by about 20% is to delete some unused stuff.

$ rm -rf docs/ mysql-test/ sql-bench/
$ du -sh .
337M	.

It’s no big deal, however removing the mysql-test directory and files certainly does cut down on verbose output in the backup logs.
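
For the record, the arithmetic behind the saving works out as follows, using only the du figures shown above (sizes in MB).

```python
# Directory sizes in MB as reported by du above.
removed = {"docs": 9.8, "mysql-test": 77, "sql-bench": 2.9}
before_mb, after_mb = 426, 337

removed_mb = sum(removed.values())            # sum of the three directories
saved_by_du = before_mb - after_mb            # difference between the two du totals
percent_saved = 100 * saved_by_du / before_mb

print(f"removed {removed_mb:.1f}M, du difference {saved_by_du}M, {percent_saved:.1f}% smaller")
```

Almost all of the saving comes from the mysql-test directory. The bin and lib directories make up the bulk of what remains.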

Technology changes, humans don't. – Web 2.0 NY Second keynote

I needed a rest from my opening keynote review NY Tech 1995-2008. Opening Web 2.0 Expo NY Keynote, but here are a few significant points from The Death of the Grand Gesture by Deb Schultz.

  • An interesting site is Visual Complexity showing graphical representations of many social networks.
  • All the binary communication becomes white noise — Information Overload.
  • “Technology changes, humans don’t” – Deb Schultz

A 5.1 QEP nicety – Using join buffer

While reviewing SQL statements yesterday using MySQL 5.1.26-rc, with a client I’m recommending 5.1 to, I was surprised to find some information I had not seen in the EXPLAIN plan before.

Using join buffer

+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key          | key_len | ref                    | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | lr    | ALL    | NULL          | NULL         | NULL    | NULL                   |  1084 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ca    | ref    | update_check  | update_check | 4       | XXXXXXXXXXXXXXXXX      |     4 | Using where; Using index                     |
|  1 | SIMPLE      | ce    | ALL    | NULL          | NULL         | NULL    | NULL                   | 13319 | Using where; Using join buffer               |
|  1 | SIMPLE      | co    | eq_ref | PRIMARY       | PRIMARY      | 4       | XXXXXXXXXXXXXXXXX      |     1 | Using where                                  |
+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
4 rows in set (0.00 sec)
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.1.26-rc |
+-----------+
1 row in set (0.00 sec)

Sergey Petrunia of the MySQL Optimizer team writes about this in Use of join buffer is now visible in EXPLAIN.

VirtualBox, compiling Part 2

So after some trial and error I managed to find all the dependencies for compiling VirtualBox 1.6.4 under Ubuntu 8.04, only then finding the Linux build instructions to confirm them.

The build however was not successful, throwing the following error:

kBuild: Compiling dyngen - dyngen.c
kBuild: Linking dyngen
kmk[2]: Leaving directory `/usr/local/VirtualBox-1.6.4/src/recompiler'
kmk[2]: Entering directory `/usr/local/VirtualBox-1.6.4/src/apps'
kmk[2]: pass_bldprogs: No such file or directory
kmk[2]: *** No rule to make target `pass_bldprogs'. Stop.
kmk[2]: Leaving directory `/usr/local/VirtualBox-1.6.4/src/apps'
kmk[1]: *** [pass_bldprogs_before] Error 2
kmk[1]: Leaving directory `/usr/local/Virtu

After more searching, I found I needed to add two more files manually. Read More Here.

After a long wait compiling for 20+ minutes, and a necessary reboot because the upgraded images threw another error, I got 1.6.4 running and was able to boot a Fedora Core 9 image created under 1.5.6.

But the real test, and the need for this version, was to install Intrepid.

This also failed, with a kernel panic during boot. More searching shows this reported as both an Ubuntu Bug and a Virtual Box Bug.

More work still needed.

Virtual Box, a world of hurt

I successfully installed Virtual Box via a few simple apt-get commands under Ubuntu 8.04 via these instructions.

It started fine, after two small annoying messages: install this module, add this group. I was even able to install Ubuntu Intrepid from an .iso. But from here it was downhill.

Attempting to start the VM gives the error.

This kernel requires the following features not present on the CPU:
pae
Unable to boot - please use a kernel appropriate for the CPU

Some digging around brought confirmation that the current packaged version of Virtual Box doesn’t support PAE. You’d think they could tell you before successfully installing an OS. I’m running 1.5.6, I need 1.6.x.

$ dpkg -l | grep virtualbox
ii  virtualbox-ose                             1.5.6-dfsg-6ubuntu1                      x86 virtualization solution - binaries
ii  virtualbox-ose-modules-2.6.24-19-generic   24.0.4                                   virtualbox-ose module for linux-image-2.6.24
ii  virtualbox-ose-source                      1.5.6-dfsg-6ubuntu1                      x86 virtualization solution - kernel module
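
As an aside, the PAE complaint can be checked by looking at the flags line of /proc/cpuinfo on the Linux system in question. Here is a minimal Python sketch of that check. The helper names and the sample flag lists are made up for illustration, and the layout assumed is the usual x86 one with a single "flags : ..." line per processor.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo style text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


def has_pae(cpuinfo_text):
    """True when the 'pae' feature flag is present."""
    return "pae" in cpu_flags(cpuinfo_text)


# Hypothetical samples. On a real system read the text with:
#   open("/proc/cpuinfo").read()
without_pae = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr mce cx8\n"
with_pae = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce cx8\n"

print(has_pae(without_pae), has_pae(with_pae))
```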

Off to the Virtual Box Downloads to get 1.6.4.
Don’t make the same mistake I did and use the first download link. That’s the commercial version that doesn’t install what you expect; you need the OSE. Of course this is not packaged, it’s only source.

  ./configure
Checking for environment: Determined build machine: linux.x86, target machine: linux.x86, OK.
Checking for kBuild: found, OK.
Checking for gcc: found version 4.2.3, OK.
Checking for as86:
  ** as86 (variable AS86) not found!

Ok, well I went through this step like 4 times, installing one package at a time. I wish they could do a pre-check and give you all missing requirements. I installed bin86, bcc, iasl.

Then I got to the following error.

$ ./configure
...
Checking for libxml2:
  ** not found!

Well it’s installed, so it’s all too hard. Throw Virtual Box away as virtualization software. And why am I using it anyway? Because VMWare Server doesn’t work under Ubuntu 8.04 either, because of some ancient gcc dependency. It seems I may have to go back to that. I just want a working virtualization product on the most popular Linux distro to install other current distros. It’s not a difficult request.

$ dpkg -l | grep libxml
ii  libxml-parser-perl                         2.34-4.3                                 Perl module for parsing XML files
ii  libxml-twig-perl                           1:3.32-1                                 Perl module for processing huge XML document
ii  libxml2                                    2.6.31.dfsg-2ubuntu1                     GNOME XML library
ii  libxml2-utils                              2.6.31.dfsg-2ubuntu1                     XML utilities
ii  python-libxml2                             2.6.31.dfsg-2ubuntu1                     Python bindings for the GNOME XML library

Choosing MySQL 5.1 over 5.0

I have been asked twice this week what version of MySQL I would choose for a new project.
As with most questions in life the answer is: It depends.

In general I would now recommend selecting 5.1 for a new project, and here is why.

  1. If it’s a new project and you’re not managing existing applications with older versions, then 5.1 is slated for General Availability (GA) at some imminent time. Having been at Release Candidate (RC) for quite some time (almost 1 year), many people, both internally and in the community, are just waiting for Sun/MySQL to get this version out.
  2. MySQL 5.0 is in maintenance mode, it’s now 3 years old. MySQL is placing (I’m assuming) resourcing energies into current and future releases.
  3. If you’re looking at releasing a product in the next 3 months for example, you do not want to have to consider the testing and deployment of a new version (e.g. 5.1) in the next 6-9 months.
  4. Unless you’re comparing specific performance between 5.1 and 5.0 in your edge cases, a new project should start with 5.1, and you should be testing and confirming performance and reliability there. The worst case is that you can test any specific problem in 5.0.
  5. 5.1 gives you new features of course. Partitioning may be of benefit, but don’t assume it’s going to be a great improvement unless your application’s SQL naturally tends to the MySQL partitioning strengths.
  6. The single biggest benefit is the Pluggable Storage Engine Architecture. This can give you some benefits, and in the case of transactional storage engines that are production ready, Innodb now has a pluggable version, much improved on the MySQL supplied version. There is a long list of other engines under development with relative strengths and weaknesses, however be wary of versions that require customized builds of MySQL.

There are some concerns where I don’t have answers. For example, if you have MySQL Support, is 5.1 supported? I know a common answer to problems in pre 5.0 versions is: have you tried upgrading to 5.1?

Why is it not released? This is a good question. The answer is obviously a level of quality, however it is generally discussed that 5.1 is of better quality in existing features than 5.0. It is the 5.1 specific features you need to be careful of. It’s important that you carefully read the 5.1 Release Notes to see where bug fixes or compatibility changes are still occurring.

As with any choice in the Open Source world, some level of risk assessment is necessary. If you have good metrics and measurement in place for your system, and you adequately test your software, there is no reason not to now consider 5.1 as a viable alternative for new development.

As I post this, I note that the list of bugs for the as yet unreleased 5.1.28 still shows issues of concern.

Project Darkstar

It may sound like either an astronomical research project or a Star Wars spin-off, but Project Darkstar is an open source infrastructure from Sun Microsystems that states it will “simplify the development and operation of massively scalable online games, virtual worlds, and social networking applications.”

The advertising sounds promising, like many sites. The emphasis seems to be on gaming throughout the material, so it is interesting they threw in the term “social networking applications” specifically in the opening descriptions.

I believe it is worthy of investigation, if only to see how they solve some classic problems. So, Learn some more, Start your rockets and Participate.