Percona Performance Conference Talk

My final presentation during the 2009 MySQL Conference and Expo week was with the Percona Performance Conference on the topic of The Ideal Performance Architecture. My talk included discussions on Technology, Disk, Memory, Indexes, SQL and Data.

Updated 09/18/09
you can now see video of the event at Percona TV.

MySQL Monitoring 101

At the 2009 MySQL Conference and Expo I presented to a full room on MySQL Monitoring 101.

This presentation focused on the following four goals.

  • Know what to monitor
  • Know how you can monitor
  • Learn practices to diagnose problems
  • Have a foundation of historical information

Updated 09/18/09
You can also find additional materials at:

A change in the MySQL Binary distributions

Yesterday was the surprise announcement of MySQL 5.4 at the 2009 MySQL Conference and Expo. It was unfortunate that the supporting information was not that forthcoming on the MySQL website. I tried for several hours to try and download, but no mirrors were initially available. Today I see some information on the mysql.com home page and finally able to get the binary.

What I found most significant with this new major version release is a change in the binary distribution, as seen on the Download page.

MySQL 5.4 is only available on 3 platforms:

  • Linux (AMD64 / Intel EM64T)
  • Solaris 10 (SPARC, 64-bit)
  • Solaris 10 (AMD64 / Intel EM64T, 64-bit)

I was also surprised that this beta release highlights the emphasis of community contributions (long overdue), yet the community and indeed many employees of Sun/MySQL were simply unaware of this work. This is clearly a change in involving the community. While I applaud the beta status, hopefully a more stable product to start with, it’s development was done in a very closed company model.

Setting up MySQL on Amazon Web Services (AWS) Presentation

On Tuesday at the MySQL Camp 2009 in Santa Clara I presented Setting up MySQL on Amazon Web Services (AWS).

This presentation assumed you know nothing about AWS, and have no account. With Internet access via a Browser and a valid Credit Card, you can have your own running Web Server on the Internet in under 10 minutes, just point and click.

We also step into some more detail online click and point and supplied command line tools to demonstrate some more advanced usage.

What's happening with InnoDB

I have moved on to InnoDB: Innovative Technologies for Performance and Data Protection by Ken Jacobs at MySQL Conference and Expo.

With a brief history lesson of inception from 1994, inclusion in MySQL in 2000 and acquired by Oracle in 2005. Most of the work was done by one person. InnoDB is based on sound database computer science using Gray & Reuters definitive text on database design.

Some key points in Ken’s discussion.

  • Adaptive Hash indexing for frequent queries on keys.
  • In plugin Adaptive Hash is configurable
  • Insert Buffering – Deferring secondary index writes
  • Fast Index Create – doesn’t requires all indexes to be rebuilt
  • Table Compression – Changing the page size

The InnoDB plugin available in 5.1 has a number of new benefits.

  • fast index creation
  • table compression
  • info schema tables
  • new row storage format
  • file format management

All InnoDB 1.0.3 plugin features will be available in MySQL 5.4

The big announcement is a new product – Embedded InnoDB. This has the high performance, reliability and rich functionality of InnoDB, has a flexible programmatic API. No SQL, No security.

Search at Craigslist

I am now sitting in on MySQL and Search at Craigslist by Jeremy Zawodny at MySQL Users Conference

Some of the technical difficulties that required addressing.

  • High churn rate
  • half life can be very short
  • Growth
  • Traffic
  • Need to archive postings, e.g. 100M but be searchable
  • Internationalization and UTF-8

Some of the Craigslist Goals

  • Open Source
  • Easy and approachable
  • be green with energy use

A review of the Internals server configuration

  • Load Balancer (perlbal like)
  • Read Proxy Array (perl+memcached)
  • Web Read Array (apache 1.3 + mod_perl)
  • Object Cache (Perl + memcached)
  • Read DB Cluster (MySQL 5.0.x)
  • Search Cluster (Sphinx)

Clusters of DB servers have good vertical partitioning by Roles. These being

  • Users
  • Classified
  • Forums
  • Stats
  • Archive

Sphinx is a full standalone full text search that is used. Did compare with Apache Solr, but it seemed more complex and complicated. The Sphinx configuration:

  • Partitioned based on cities (people search locally)
  • Attributes v Keywords
  • Persistent Connections
  • Minimal stopword list
  • Partition in 2 clusters (1 master, 4 slaves)

The results of implementing Sphinx were:

  • decrease in 25 MySQL boxes to 10 sphinx boxes
  • no locking
  • 1,000+ qps
  • 50M queries per day
  • Better separation of code

MySQL Users Conference Opening Lines

Opening introduction from Colin Charles got us started. Karen Tegan Padir VP MySQL & Software Infrastructure was the opening keynote.

She comes from a strong tech background and is passionate about open source, the communities and how to make a successful product.

There isn’t a person that doesn’t go a day without interacting with a website or hardware system that uses a MySQL database.

The big news was the announcement of MySQL 5.4 – Performance & Scalability. Key features include.

  • InnoDb scalability 16way x86 and 64 way CMT servers
  • subquery optimization
  • new query algorithms
  • improved stored procedures, and prepared statements
  • enhanced Information Schema
  • improved DTrace Support

More information at MySQL 5.4 Announcement Details….

Other key points includes:

1. Ken Jacobs announces today an Embedded Innodb with a powerful API (not SQL based). Read more at Innobase Introduces Embedded InnoDB
2. MySQLCluster 7.0 is also released today. Some benchmarks 4.3x improvements. New features also include LDAP support.
3. The next release of MySQL Query Analyzer, 2.1 announced.
4. Sun announces a commitment to accept contributions from the community.
5. Community also gets the Monthly Rapid Updates.
6. MySQL Drizzle Project is discussed as a technology incubator.

Partners of the year: Intel, Infobright and Lifeboard.
Appliation of the year: Zappos, Alcatel-lucent and Symantec.
Community members of the year: Marc Delisle, Ronald Bradford, Shlomi Noach.

Where is the MySQL in Sun's announcement

I find it surprising that in the official Sun Announcement there is no mention of MySQL for two reasons. Firstly, this was Sun largest single purchase of $1 billion only 12 months ago. Second, MySQL’s largest competitor is Oracle.

While the Sun website shows the news in grandeur, the MySQL website is noticeably absent in any information of it’s owners’ acquisition.

On my professional side, as an independent speaker for Sun Microsystems with plans for upcoming webinars and future speaking on “Best Practices in Migrating to MySQL from Oracle”, this news does not benefit my bottom line.

Drizzle + PHP = Sweet

I’ve just successfully configured Drizzle with the PHP Extension and successfully retrieve data to present on a web page.

Qudos to Eric Day for his work. I was able to identify a problem with the current tar release, and a quick confirmation on #drizzle at IRC confirmed a fix had already been commited.

I’m looking forward to evaluating WordPress and Drupal, two popular and common LAMP stack applications that run on MySQL, and to provide any feedback to the community for future support of Drizzle.

What questions do you ask?

When you have to evaluate a MySQL System & Environment, what questions do you ask in order to determine critical information about the environment and evaluate the business success and viability. You don’t have to be a consultant to ask these questions, ask them of your own environment. Do any of the answers shock or concern you?

I was prompted to write about this from a conversation with a colleague about “accepting risk”. His comment was, “every IT server on the planet is vulnerable regardless of best practices.”

Here is my list of questions for you based on an immediate response from this discussion?

Technology

  • What is your full technology stack, i.e. Operating System, Database, Application Server, Development Language(s) and other essential components?
  • What are the versions of these technologies?
  • What new technologies or versions of existing technologies are you presently evaluating?

Disaster Recovery

  • What is your Backup and Recovery strategy including your Database, Application and Administration?
  • Have you tested your Backup and Recovery strategy?
  • Have you really tested your Backup and Recovery strategy from end to end? How long ago? How long did it take?
  • What RAID do you run? Have you physically verified that? When did you confirm you are not running in a degraded RAID situation?
  • What does your website look like when it’s unavailable? What is the physical content on the website. Let’s pretend your entire data center is unavailable for 40 hours.
  • Have you ever had a major disaster? What did you learn from this experience?

Development Processes

  • Do you use a version control system? Which one? Is everything under version control?
  • Do you have a controlled and reproducible build and release process? Is it automated in any way?
  • What are your levels of testing. Unit test for coding? Regression testing for new features? Volume Testing? End User Testing?
  • Do you have a proper test environment (which is not production) where you can accurately evaluate production software and production problems?

Infrastructure

  • How do you know when there is a problem with your site? Do you have monitoring and alert notification in place?
  • When you have a performance problem, can you evaluate if it is new, re-occuring or a gradually worsening problem?
  • What are you two biggest performance problems right now? What are the specific details of the problem? “My website is slow” is not an answer
  • Can you roll out new features without taking your website down for general use?

Business Viability

  • How long would a customer stay with your site if it was unavailable?
  • Can your clients be satisfied with the Twitter “failed whale” approach or will they leave?
  • When will your system crash under load? Do you know this figure? What is the load today and the projections to this failure point?

Given more time I’d probably revise the list, but this was just an initial response.

This post is part of 31 Days to Build a Better BlogWrite a List Post.

mysql.com search is so broken

Today, while on the MySQL manual page, I typed in ‘select’ in the search manual box to confirm the SELECT syntax.

The result was not what I expected, the “SELECT” command. Instead I only got two options “Speed of SELECT …” and “Optimizing SELECT and …”.

Ok, well that’s not what I want, there is a suggestion box to the right so I pick the top option “mysql select”. Not only is this worse with “Type Conversion in …”, “Searching on Two Keys” I also get 3 totally useless “Keymatch” records

Download MySQL -http://dev.mysql.com/downloads/
MySQL Training – http://www.mysql.com/training/ KeyMatch
Buy MySQL Enterprise -http://shop.mysql.com/enterprise/

I know in the past just entering ‘SELECT’ worked, because I’ve been presently writing tests on JOIN syntax and I wanted to link in my blog reference.

Images of my searches.


Developing Code Coverage for MySQL tests

I have always been a strong advocate of good testing of any system. I started on a project last year with Drizzle to produce coverage tests to facilitate verifying syntax and helping in comparison with MySQL.

From my extensive experience in code generation from the past 20 years, I produced about 3 years ago when at MySQL Professional Services a Java based solution with a small meta language to automated the creation of a large number of tests. At the time is was some 475 tests and 200k lines of mysql-test syntax when I first looked at validating the Nitro Storage Engine.

A number of issues with mysql-tests including the support of multiple storage engines had me last year write a Proposed Testing Protocol for Drizzle for further discussion. While the desire for improvements still exists, we can not deny the benefit of leveraging the large amount of scope the current MySQL Test Suite, and so I have started in the creation of tests in this format.

My first output is to validate all possible variants of creating various data types in MySQL. This includes the obvious datatype, but also NULL, NOT NULL syntax, sizing with (n) and (m,n), UNSIGNED and ZEROFILL. What my tests also cover is not that the syntax is valid, but also the cases where the syntax is invalid (e.g. you can use UNSIGNED for Character fields). This POC is a small example of what can be produced.

The below first test enabled me to easily confirm valid and invalid syntax within Drizzle. This for example helped in clarifying Drizzle Data Types. I hope to be able to use the same framework to confirm MariaDB syntax and MySQL compatibility.

ct_syntax_1col.test

--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings
CREATE TABLE t1 (c1 tinyint NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bit NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 date NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 time NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 datetime NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 timestamp NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 tinytext NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 text NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 mediumtext NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 longtext NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 tinyblob NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 blob NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 mediumblob NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 longblob NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 char NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 binary NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 varchar NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary NULL);

CREATE TABLE t1 (c1 tinyint NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bit NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 date NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 time NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 datetime NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 timestamp NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 tinytext NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 text NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 mediumtext NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 longtext NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 tinyblob NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 blob NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 mediumblob NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 longblob NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 char NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 binary NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 varchar NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary NOT NULL);

CREATE TABLE t1 (c1 tinyint(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bit(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 char(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 varchar(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 binary(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 varbinary(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 text(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 blob(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 timestamp(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year(1) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 double(1) NULL);
--error 1064
CREATE TABLE t1 (c1 date(1) NULL);
--error 1064
CREATE TABLE t1 (c1 time(1) NULL);
--error 1064
CREATE TABLE t1 (c1 datetime(1) NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext(1) NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext(1) NULL);
--error 1064
CREATE TABLE t1 (c1 longtext(1) NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob(1) NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob(1) NULL);
--error 1064
CREATE TABLE t1 (c1 longblob(1) NULL);

CREATE TABLE t1 (c1 tinyint(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bit(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 char(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 varchar(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 binary(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 varbinary(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 text(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 blob(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 timestamp(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year(1) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 double(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 date(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 time(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 datetime(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longtext(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob(1) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longblob(1) NOT NULL);

CREATE TABLE t1 (c1 decimal(5,2) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float(5,2) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double(5,2) NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 tinyint(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 smallint(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 int(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 bigint(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 bit(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 date(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 time(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 datetime(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 year(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 char(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 varchar(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 binary(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 text(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 longtext(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 blob(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob(5,2) NULL);
--error 1064
CREATE TABLE t1 (c1 longblob(5,2) NULL);

CREATE TABLE t1 (c1 decimal(5,2) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float(5,2) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double(5,2) NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 tinyint(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 smallint(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 int(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 bigint(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 bit(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 date(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 time(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 datetime(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 year(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 char(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varchar(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 binary(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 text(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longtext(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 blob(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob(5,2) NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longblob(5,2) NOT NULL);

CREATE TABLE t1 (c1 tinyint UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year UNSIGNED NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 bit UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 date UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 time UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 datetime UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 char UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 varchar UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 binary UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 text UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 longtext UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 blob UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob UNSIGNED NULL);
--error 1064
CREATE TABLE t1 (c1 longblob UNSIGNED NULL);

CREATE TABLE t1 (c1 tinyint UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year UNSIGNED NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 bit UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 date UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 time UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 datetime UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 char UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varchar UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 binary UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 text UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longtext UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 blob UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob UNSIGNED NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longblob UNSIGNED NOT NULL);

CREATE TABLE t1 (c1 tinyint ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 bit ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 date ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 time ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 datetime ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 char ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 varchar ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 binary ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 text ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 longtext ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 blob ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 longblob ZEROFILL NULL);

CREATE TABLE t1 (c1 tinyint ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 bit ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 date ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 time ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 datetime ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 char ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varchar ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 binary ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 text ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longtext ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 blob ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longblob ZEROFILL NOT NULL);

CREATE TABLE t1 (c1 tinyint UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year UNSIGNED ZEROFILL NOT NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 bit UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 date UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 time UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 datetime UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 char UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varchar UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 binary UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 text UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longtext UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 blob UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob UNSIGNED ZEROFILL NOT NULL);
--error 1064
CREATE TABLE t1 (c1 longblob UNSIGNED ZEROFILL NOT NULL);

CREATE TABLE t1 (c1 tinyint UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 smallint UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 int UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 bigint UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 float UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 double UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 decimal UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;
CREATE TABLE t1 (c1 year UNSIGNED ZEROFILL NULL);
SHOW CREATE TABLE t1;
DROP TABLE t1;

#Error conditions
--error 1064
CREATE TABLE t1 (c1 bit UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 date UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 time UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 datetime UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 timestamp UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 char UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 varchar UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 binary UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 varbinary UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 tinytext UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 text UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 mediumtext UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 longtext UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 tinyblob UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 blob UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 mediumblob UNSIGNED ZEROFILL NULL);
--error 1064
CREATE TABLE t1 (c1 longblob UNSIGNED ZEROFILL NULL);

A beginners look at Drizzle – SQL_MODE

A new feature to MySQL Version 5 was the introduction of SQL_MODE to support STRICT… or TRADITIONAL values.

This feature enabled a closer compatibility to other RDBMS products. MySQL by default performs a number of silent data changes which do not help in providing a level of data integrity if you come from a more traditional background. MySQL by default represents these as warnings, while with an appropriate SQL_MODE, these are in turn treated as errors.

How does Drizzle handle this? Very simple. There is no SQL_MODE. By default Drizzle handling a strict mode of producing errors for any invalid data. The following are some test case examples showing the varying conditions.

Test Case

select version();
create database if not exists test;
use test;
drop table if exists t1;
create table t1(i1 int, c1 char(10), d1 timestamp);
#Pass Tests
insert into t1(i1) values (500000000);
insert into t1(c1) values('1234567890');
insert into t1(i1) values (5000000000);
#Fail Tests
insert into t1(c1) values('12345678901');
insert into t1(d1) values(now());
insert into t1(d1) values(0);

Drizzle Output

drizzle> select version();
+-------------------------+
| version()               |
+-------------------------+
| 2009.03.970-development |
+-------------------------+
1 row in set (0.00 sec)

drizzle> create database if not exists test;
Query OK, 1 row affected (0.01 sec)

drizzle> use test;
Database changed
drizzle> create table t1(i1 int, c1 char(10), d1 timestamp);
Query OK, 0 rows affected (0.17 sec)
#Pass Tests
drizzle> insert into t1(i1) values (500000000);
Query OK, 1 row affected (0.08 sec)
drizzle> insert into t1(c1) values('1234567890');
Query OK, 1 row affected (0.05 sec)
drizzle> insert into t1(d1) values(now());
Query OK, 1 row affected (0.02 sec)
#Fail Tests
drizzle> insert into t1(i1) values (5000000000);
ERROR 1264 (22003): Out of range value for column 'i1' at row 1
drizzle> insert into t1(c1) values('12345678901');
ERROR 1406 (22001): Data too long for column 'c1' at row 1
drizzle> insert into t1(d1) values(0);
ERROR 1685 (HY000): Received an invalid value '0' for a UNIX timestamp.

MySQL Output

mysql> create database if not exists test;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> drop table if exists t1;
Query OK, 0 rows affected (0.05 sec)

mysql> create table t1(i1 int, c1 char(10), d1 timestamp);
Query OK, 0 rows affected (0.16 sec)

mysql> #Pass Tests
mysql> insert into t1(i1) values (500000000);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(c1) values('1234567890');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(i1) values (5000000000);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> #Fail Tests
mysql> insert into t1(c1) values('12345678901');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into t1(d1) values(now());
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(d1) values(0);
Query OK, 1 row affected (0.00 sec)

mysql> select * from t1;
+------------+------------+---------------------+
| i1         | c1         | d1                  |
+------------+------------+---------------------+
|  500000000 | NULL       | 2009-04-06 12:14:21 |
|       NULL | 1234567890 | 2009-04-06 12:14:21 |
| 2147483647 | NULL       | 2009-04-06 12:14:21 |
|       NULL | 1234567890 | 2009-04-06 12:14:21 |
|       NULL | NULL       | 2009-04-06 12:14:21 |
|       NULL | NULL       | 0000-00-00 00:00:00 |
+------------+------------+---------------------+
6 rows in set (0.00 sec)

MySQL SQL_MODE=STRICT_ALL_TABLES Output

mysql> set sql_mode = STRICT_ALL_TABLES;
Query OK, 0 rows affected (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.1.32    |
+-----------+
1 row in set (0.00 sec)

mysql> create database if not exists test;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> drop table if exists t1;
Query OK, 0 rows affected (0.01 sec)

mysql> create table t1(i1 int, c1 char(10), d1 timestamp);
Query OK, 0 rows affected (0.12 sec)

mysql> #Pass Tests
mysql> insert into t1(i1) values (500000000);
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(c1) values('1234567890');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(i1) values (5000000000);
ERROR 1264 (22003): Out of range value for column 'i1' at row 1
mysql> #Fail Tests
mysql> insert into t1(c1) values('12345678901');
ERROR 1406 (22001): Data too long for column 'c1' at row 1
mysql> insert into t1(d1) values(now());
Query OK, 1 row affected (0.00 sec)

mysql> insert into t1(d1) values(0);
Query OK, 1 row affected (0.00 sec)


Update
Thanks to Robert Wultsch who highlighted to me that SQL_MODE has been around since 4.1.

Identifying resource bottlenecks – Memory

Continuing on from CPU, we turn our attention to Memory. One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources, then identify any obvious bottlenecks in these resources.

In auditing, I start with the ‘free’ command, the already used ‘vmstat’ command, the /proc/meminfo and /proc/sys/vm/swappiness files to get an indication of memory and swap resources. While we are looking at the Memory, the configuration of Swap is also very important. I will discuss this in more detail later.

$ free -m
             total       used       free     shared    buffers     cached
Mem:          3955       3838        117          0        402       2366
-/+ buffers/cache:       1069       2886
Swap:         1027          0       1027
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
 0  0    128 234088 409636 2474392    0    0     0   100 1085  209  0  0 100  0  0
 0  0    128 233836 409636 2474396    0    0     0     0 1014  184  3  0 97  0  0
 0  0    128 233284 409636 2474396    0    0     0     0 1182  435  2  0 98  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1024  104  1  0 99  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1079  195  0  0 100  0  0
 1  0    128 233168 409644 2474396    0    0     0   232 1021  188  3  0 97  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1111  213  2  0 98  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1005   60  0  0 100  0  0

Memory

  • swpd: the amount of virtual memory used.
  • free: the amount of idle memory.
  • buff: the amount of memory used as buffers.
  • cache: the amount of memory used as cache.
  • inact: the amount of inactive memory. (-a option)
  • active: the amount of active memory. (-a option)

Swap

  • si: Amount of memory swapped in from disk (/s).
  • so: Amount of memory swapped to disk (/s).
$ cat /proc/meminfo
MemTotal:      4050776 kB
MemFree:        120984 kB
Buffers:        411928 kB
Cached:        2423468 kB
SwapCached:          0 kB
Active:        1861536 kB
Inactive:      1492152 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      4050776 kB
LowFree:        120984 kB
SwapTotal:     1052248 kB
SwapFree:      1052120 kB
Dirty:             172 kB
Writeback:           0 kB
AnonPages:      518112 kB
Mapped:          23140 kB
Slab:           544448 kB
PageTables:       9528 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   3077636 kB
Committed_AS:   859208 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    263132 kB
VmallocChunk: 34359474803 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
$ cat /proc/sys/vm/swappiness
60

What’s important is not to believe that ‘free’ or ‘MemFree’ values are the total free memory. Linux/Unix distributions use available memory as a File System Cache, dumping this data if additional memory is needed. It’s important to add Cached to get a better indication of the true memory available.

Your initial audit should also look at the processes that are using the memory. Options include ‘top’ and ‘ps’.

$ top
M

op - 15:33:56 up 36 days, 17:08,  2 users,  load average: 0.01, 0.11, 0.08
Tasks: 133 total,   1 running, 132 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050776k total,  3792952k used,   257824k free,   368140k buffers
Swap:  1052248k total,      128k used,  1052120k free,  2329212k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2159 mysql     15   0  411m 204m 5624 S    0  5.2  33:26.43 mysqld
 2965 buildbot  15   0  280m 105m 1932 S    0  2.7 393:15.54 buildbot
 8533 nobody    15   0  168m  25m 4092 S    0  0.7   0:03.49 httpd
 9121 nobody    15   0  166m  24m 3432 S    0  0.6   0:01.61 httpd
...
$ ps -eopid,fname,rss,vsz,user,command | more
  PID COMMAND    RSS    VSZ USER     COMMAND
    1 init       700  10324 root     init [3]
   346 httpd    25252 170484 nobody   /opt/httpd-2.2.9/bin/httpd -k start
  364 httpd    25184 170344 nobody   /opt/httpd-2.2.9/bin/httpd -k start
  425 kpsmouse     0      0 root     [kpsmoused]
  452 httpd    21000 165684 nobody   /opt/httpd-2.2.9/bin/httpd -k start
...
 2095 mysqld_s  1204  63800 root     /bin/sh bin/mysqld_safe
 2159 mysqld   209448 421248 mysql   /opt/mysql51/bin/mysqld --basedir=/opt/mysql51 --datadir=/opt/mysql51/data --user=mysql --log-error=/opt/mysql51/data/dc1.onegreendog.com.err --pid-file=/opt/mysql51/data/dc
1.onegreendog.com.pid
 ...
  • rss RSS resident set size, the non-swapped physical memory that a task has used (in kiloBytes). (alias rssize, rsz).
  • vsz VSZ virtual memory size of the process in KiB (1024-byte units). Device mappings are currently excluded; this is subject to change.
    (alias vsize).

As I have written about previously, Are you monitoring RSS & VSZ?, these columns are important and should be monitored appropriately.

Regarding Swap. This is pre assigned disk space that is used to swap out (dump) memory processes when you have run out of memory for all the running processes. You never want to run out of memory on your database server. Swapping is both extremely slow, and if your database server swaps out the mysqld process, this will effectively kill your database. If you have insufficient swap space for the process, again in the case of database server this can cause your system to crash.

Historically in Unix world, swap was always defined as 2x Memory. This also doubled as a place to dump all memory in a kernel panic. I spent time in a past life doing core dump analysis.
Today, most Linux systems are ill-configured for swap. If you use a dedicated server for example, you may be limited to what is configured by a third party. The above example shows a configuration I would not recommend where swap is less then 1x the memory.
There is also a consideration to have no Swap. By setting swapiness to 0, you are effectively saying never swap. I would also not recommend this.

Monitoring memory usage closely is important. Taking appropriate action regarding the mysql process because you can’t control how much total memory it uses is critical. Correctly configuring mysql to use memory optimally is key to a well and long running database server.

Next, we will be looking at Disk and Network resource bottlenecks.

A beginners look at Drizzle – Datatypes and Tables

The Drizzle database, while similar to MySQL includes a number of significant differences. In this post we will look at data types and table syntax that is valid in Drizzle. For more background information you can also review A beginners look at Drizzle – Getting around with SHOW.

Data Types

This comparison is with Drizzle 2009.03.970 and MySQL 5.1.32 GA. More information at MySQL 5.1 Data Types.

The following data types are not valid in Drizzle.

  • TINYINT
  • SMALLINT
  • MEDIUMINT
  • BIT
  • TIME
  • YEAR
  • BINARY
  • SET

Tests used the following data-types in comparison with MySQL 5.1.
TINYINT,SMALLINT,MEDIUMINT,INT,BIGINT, FLOAT,DOUBLE,DECIMAL,BIT, ENUM, SET, DATE,TIME,DATETIME,TIMESTAMP,YEAR, CHAR,VARCHAR,BINARY,VARBINARY, TEXT,TINYTEXT,MEDIUMTEXT,LONGTEXT, BLOB,TINYBLOB,MEDIUMBLOB,LONGBLOB

I’m surprised that a number of data types are still valid in Drizzle however the product is still under heavy development. My kill-list for further reducing the syntax scope would include TINYTEXT,MEDIUMTEXT,LONGTEXT, TINYBLOB,MEDIUMBLOB,LONGBLOB

Numeric sizing for INT,BIGINT,FLOAT datatypes is not supported. i.e. INT(1). Very glad about that.

The UNSIGNED and ZEROFILL syntax for Numeric datatypes are not supported.

Character Sets

Another difference is the lack of the CHARACTER SET syntax for character fields. Drizzle only supports the UTF8 character set for all text fields. This also differs from MySQL, as it is a 4 byte field, not 3 byte in MySQL.

drizzle> select * from information_schema.character_sets;
+--------------------+----------------------+---------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION   | MAXLEN |
+--------------------+----------------------+---------------+--------+
| utf8               | utf8_general_ci      | UTF-8 Unicode |      4 |
| binary             | binary               |               |      1 |
+--------------------+----------------------+---------------+--------+
2 rows in set (0.09 sec)

mysql> select * from information_schema.character_sets where character_set_name in ('utf8','binary');
+--------------------+----------------------+-----------------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION           | MAXLEN |
+--------------------+----------------------+-----------------------+--------+
| utf8               | utf8_general_ci      | UTF-8 Unicode         |      3 |
| binary             | binary               | Binary pseudo charset |      1 |
+--------------------+----------------------+-----------------------+--------+
2 rows in set (0.00 sec)

Collations (the COLLATE syntax) is still supported.

drizzle> select * from information_schema.collations order by 1;
+--------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME     | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+--------------------+--------------------+-----+------------+-------------+---------+
| binary             | binary             |  63 | Yes        | Yes         |       1 |
| utf8_bin           | utf8               |  46 |            | Yes         |       1 |
| utf8_czech_ci      | utf8               | 234 |            | Yes         |       8 |
| utf8_danish_ci     | utf8               | 235 |            | Yes         |       8 |
| utf8_esperanto_ci  | utf8               | 241 |            | Yes         |       8 |
| utf8_estonian_ci   | utf8               | 230 |            | Yes         |       8 |
| utf8_general_ci    | utf8               |  45 | Yes        | Yes         |       1 |
| utf8_hungarian_ci  | utf8               | 242 |            | Yes         |       8 |
| utf8_icelandic_ci  | utf8               | 225 |            | Yes         |       8 |
| utf8_latvian_ci    | utf8               | 226 |            | Yes         |       8 |
| utf8_lithuanian_ci | utf8               | 236 |            | Yes         |       8 |
| utf8_persian_ci    | utf8               | 240 |            | Yes         |       8 |
| utf8_polish_ci     | utf8               | 229 |            | Yes         |       8 |
| utf8_romanian_ci   | utf8               | 227 |            | Yes         |       8 |
| utf8_roman_ci      | utf8               | 239 |            | Yes         |       8 |
| utf8_sinhala_ci    | utf8               | 243 |            | Yes         |       8 |
| utf8_slovak_ci     | utf8               | 237 |            | Yes         |       8 |
| utf8_slovenian_ci  | utf8               | 228 |            | Yes         |       8 |
| utf8_spanish2_ci   | utf8               | 238 |            | Yes         |       8 |
| utf8_spanish_ci    | utf8               | 231 |            | Yes         |       8 |
| utf8_swedish_ci    | utf8               | 232 |            | Yes         |       8 |
| utf8_turkish_ci    | utf8               | 233 |            | Yes         |       8 |
| utf8_unicode_ci    | utf8               | 224 |            | Yes         |       8 |
+--------------------+--------------------+-----+------------+-------------+---------+
23 rows in set (0.09 sec)

mysql> select * from information_schema.collations where character_set_name in ('utf8','binary') order by 1;
+--------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME     | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+--------------------+--------------------+-----+------------+-------------+---------+
| binary             | binary             |  63 | Yes        | Yes         |       1 |
| utf8_bin           | utf8               |  83 |            | Yes         |       1 |
| utf8_czech_ci      | utf8               | 202 |            | Yes         |       8 |
| utf8_danish_ci     | utf8               | 203 |            | Yes         |       8 |
| utf8_esperanto_ci  | utf8               | 209 |            | Yes         |       8 |
| utf8_estonian_ci   | utf8               | 198 |            | Yes         |       8 |
| utf8_general_ci    | utf8               |  33 | Yes        | Yes         |       1 |
| utf8_hungarian_ci  | utf8               | 210 |            | Yes         |       8 |
| utf8_icelandic_ci  | utf8               | 193 |            | Yes         |       8 |
| utf8_latvian_ci    | utf8               | 194 |            | Yes         |       8 |
| utf8_lithuanian_ci | utf8               | 204 |            | Yes         |       8 |
| utf8_persian_ci    | utf8               | 208 |            | Yes         |       8 |
| utf8_polish_ci     | utf8               | 197 |            | Yes         |       8 |
| utf8_romanian_ci   | utf8               | 195 |            | Yes         |       8 |
| utf8_roman_ci      | utf8               | 207 |            | Yes         |       8 |
| utf8_slovak_ci     | utf8               | 205 |            | Yes         |       8 |
| utf8_slovenian_ci  | utf8               | 196 |            | Yes         |       8 |
| utf8_spanish2_ci   | utf8               | 206 |            | Yes         |       8 |
| utf8_spanish_ci    | utf8               | 199 |            | Yes         |       8 |
| utf8_swedish_ci    | utf8               | 200 |            | Yes         |       8 |
| utf8_turkish_ci    | utf8               | 201 |            | Yes         |       8 |
| utf8_unicode_ci    | utf8               | 192 |            | Yes         |       8 |
+--------------------+--------------------+-----+------------+-------------+---------+
22 rows in set (0.00 sec)

Perhaps a new MySQL 6.0 collation that is in Drizzle is ‘utf8_sinhala_ci’.

Storage Engines

A key difference in Drizzle is the default storage engine. This defaults to InnoDB, rather then MyISAM. MyISAM is still currently packaged with Drizzle, however I hope that Maria becomes it’s replacement.

The MERGE and FEDERATED storage engines are not supported.
BLACKHOLE by default is not supported, however documentation indicates this can be compiled.

Table Syntax

FULLTEXT indexes which is valid for MyISAM only in MySQL is not supported.
PARTITIONS is not supported.

The following table options are still valid in Drizzle. These were only tested as valid syntax, not any usage of this functionality.
ENGINE, AUTO_INCREMENT, AVG_ROW_LENGTH, CHECKSUM, COLLATE, COMMENT, CONNECTION, DELAY_KEY_WRITE, KEY_BLOCK_SIZE, PACK_KEYS

INSERT_METHOD, CHARACTER SET is not support.

I suspect a number of these could be eliminated, definitely CONNECTION as this is related to the FEDERATED storage engine.

VIEWS are not supported in Drizzle.

I will need to invest more time to confirm INDEX and CONSTRAINT syntax.

A beginners look at Drizzle – Getting around with SHOW

Assuming you have successfully compiled Drizzle, and you are ready to start for the first time, here are some beginner differences with those familiar with the current MySQL 5.1 GA version. Note: drizzle is actually an original fork of the 6.0.x codebase however the code looks very little like mysql anymore.

Getting started differences

The first default MySQL installation step scripts/install_mysql_db is gone, and there is no replacement. The ‘mysql’ schema no longer exists. Talking about schemas, ‘test’ is also gone by default.

While a bin/drizzled_safe exists to start drizzle, this presently doesn’t operate as expected. You should start drizzle with sbin/drizzled &

They still have an empty password for ‘root’. Would be nice if it was at least a default like an Oracle installation, and a password became a mandatory requirement. Speaking of passwords, trying ‘drizzle -uroot -pxxxx’ will result in an unexpected error.

$ bin/drizzle -uroot -pxxx

ERROR:
Non-integer value supplied for port.  If you are trying to enter a password please use --password instead.

Taking a look around at some common mysql commands you may be familiar with.

Information Schema

drizzle> show schemas;
+--------------------+
| Database           |
+--------------------+
| information_schema |
+--------------------+
1 row in set (0.00 sec)
drizzle> use information_schema;
drizzle> show tables;
+---------------------------------------+
| Tables_in_information_schema          |
+---------------------------------------+
| CHARACTER_SETS                        |
| COLLATIONS                            |
| COLLATION_CHARACTER_SET_APPLICABILITY |
| COLUMNS                               |
| GLOBAL_STATUS                         |
| GLOBAL_VARIABLES                      |
| KEY_COLUMN_USAGE                      |
| PLUGINS                               |
| PROCESSLIST                           |
| REFERENTIAL_CONSTRAINTS               |
| SCHEMATA                              |
| SESSION_STATUS                        |
| SESSION_VARIABLES                     |
| STATISTICS                            |
| TABLES                                |
| TABLE_CONSTRAINTS                     |
| INNODB_CMP                            |
| INNODB_CMP_RESET                      |
| INNODB_CMPMEM                         |
| INNODB_CMPMEM_RESET                   |
| INNODB_LOCK_WAITS                     |
| INNODB_LOCKS                          |
| INNODB_TRX                            |
+---------------------------------------+
23 rows in set (0.00 sec)

Gone are COLUMN_PRIVILEGES ENGINES EVENTS FILES KEY_COLUMN_USAGE PARTITIONS PROFILING ROUTINES SCHEMA_PRIVILEGES TABLE_PRIVILEGES TRIGGERS USER_PRIVILEGES VIEWS. With MySQL privileges and a number of 5.x features removed, that’s expected. I was surprised to see ENGINES gone. More on that later.

SIDE NOTE: Disappointed to see (0.00 sec) still around. I would have expected the client interface could have had an overhaul for timing output, and report say 0.00 milliseconds.

Variables

drizzle> show variables
+----------------------------------------+---------------------------------------------+
| Variable_name                          | Value                                       |
+----------------------------------------+---------------------------------------------+
| archive_aio                            | OFF                                         |
..
| innodb_change_buffering                | inserts                                     |
| innodb_file_format                     | Antelope                                    |
| innodb_file_format_check               | Antelope                                    |
| innodb_replication_delay               | 0                                           |
| innodb_stats_on_metadata               | ON                                          |
| innodb_stats_sample_pages              | 8                                           |
| innodb_strict_mode                     | OFF                                         |
| innodb_use_sys_malloc                  | ON                                          |
| innodb_version                         | 1.0.3                                       |
...
| logging_query_enable                   | OFF                                         |
| logging_query_filename                 |                                             |
| logging_query_threshold_big_examined   | 0                                           |
| logging_query_threshold_big_resultset  | 0                                           |
| logging_query_threshold_slow           | 0                                           |
| logging_syslog_enable                  | OFF                                         |
| logging_syslog_facility                | local0                                      |
| logging_syslog_ident                   | drizzled                                    |
| logging_syslog_priority                | info                                        |
| logging_syslog_threshold_big_examined  | 0                                           |
| logging_syslog_threshold_big_resultset | 0                                           |
| logging_syslog_threshold_slow          | 0                                           |
...
| multi_thread_max_threads               | 2048                                        |
| myisam_block_size                      | 1024                                        |
| optimizer_switch                       |                                             |
| optimizer_use_mrr                      | force                                       |
| pool_of_threads_size                   | 8                                           |
| replicator_directory                   |                                             |
| replicator_enabled                     | OFF                                         |

161 rows in set (0.01 sec)

MySQL 5.1 has some 265 variables values, so huge changes here. The big sections gone are around CHARACTER SETS, SSL, REPLICATION, QUERY_CACHE, LOG, FULL TEXT. The above list shows new variables, some of these are Innodb plugin related.

Status

drizzle> show status
...
112 rows in set (0.01 sec)

MySQL 5.1 has some 290 status and Drizzle currently 112. Same suspects, REPLICATION, SSL, QUERY_CACHE, a lot of COM_ commands, all INNODB related status and THREADS_CREATED and THREADS_RUNNING. Nothing new from Drizzle.

Engines

drizzle> show engines;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your Drizzle server version for the right syntax to use near 'engines' at line 1
drizzle> show plugins;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your Drizzle server version for the right syntax to use near 'plugins' at line 1

An initial surprise here, but knowing that all work is converting to plugins, you found this all in INFORMATION_SCHEMA.PLUGINS

drizzle> select * from plugins;
+---------------------+----------------+---------------+--------------------+----------------+--------------------------------------+--------------------------------------------------------------------------+----------------+
| PLUGIN_NAME         | PLUGIN_VERSION | PLUGIN_STATUS | PLUGIN_TYPE        | PLUGIN_LIBRARY | PLUGIN_AUTHOR                        | PLUGIN_DESCRIPTION                                                       | PLUGIN_LICENSE |
+---------------------+----------------+---------------+--------------------+----------------+--------------------------------------+--------------------------------------------------------------------------+----------------+
| binlog              | 1.0            | ACTIVE        | STORAGE ENGINE     | NULL           | MySQL AB                             | This is a pseudo storage engine to represent the binlog in a transaction | GPL            |
| compress            | 1.0            | ACTIVE        | UDF                | NULL           | Stewart Smith                        | UDF for compress()                                                       | GPL            |
| crc32               | 1.0            | ACTIVE        | UDF                | NULL           | Stewart Smith                        | UDF for computing CRC32                                                  | GPL            |
| logging_query       | 0.2            | ACTIVE        | LOGGER             | NULL           | Mark Atwood  mark @fallenpegasus.com  | Log queries to a CSV file                                                | GPL            |
| logging_syslog      | 0.2            | ACTIVE        | LOGGER             | NULL           | Mark Atwood  mark @fallenpegasus.com  | Log to syslog                                                            | GPL            |
| multi_thread        | 0.1            | ACTIVE        | SCHEDULING         | NULL           | Brian Aker                           | One Thread Per Session Scheduler                                         | GPL            |
| pool_of_threads     | 0.1            | ACTIVE        | SCHEDULING         | NULL           | Brian Aker                           | Pool of Threads Scheduler                                                | GPL            |
| replicator          | 0.1            | ACTIVE        | REPLICATOR         | NULL           | Brian Aker                           | Basic replication module                                                 | GPL            |
| signal_handler      | 0.1            | ACTIVE        | DAEMON             | NULL           | Brian Aker                           | Default Signal Handler                                                   | GPL            |
| single_thread       | 0.1            | ACTIVE        | SCHEDULING         | NULL           | Brian Aker                           | Single Thread Scheduler                                                  | GPL            |
| uncompressed_length | 1.0            | ACTIVE        | UDF                | NULL           | Stewart Smith                        | UDF for compress()                                                       | GPL            |
| uncompress          | 1.0            | ACTIVE        | UDF                | NULL           | Stewart Smith                        | UDF for compress()                                                       | GPL            |
| ARCHIVE             | 3.5            | ACTIVE        | STORAGE ENGINE     | NULL           | Brian Aker, MySQL AB                 | Archive storage engine                                                   | GPL            |
| CSV                 | 1.0            | ACTIVE        | STORAGE ENGINE     | NULL           | Brian Aker, MySQL AB                 | CSV storage engine                                                       | GPL            |
| MEMORY              | 1.0            | ACTIVE        | STORAGE ENGINE     | NULL           | MySQL AB                             | Hash based, stored in memory, useful for temporary tables                | GPL            |
| InnoDB              | 1.0.1          | ACTIVE        | STORAGE ENGINE     | NULL           | Innobase Oy                          | Supports transactions, row-level locking, and foreign keys               | GPL            |
| INNODB_TRX          | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | InnoDB transactions                                                      | GPL            |
| INNODB_LOCKS        | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | InnoDB conflicting locks                                                 | GPL            |
| INNODB_LOCK_WAITS   | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | InnoDB which lock is blocking which                                      | GPL            |
| INNODB_CMP          | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | Statistics for the InnoDB compression                                    | GPL            |
| INNODB_CMP_RESET    | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | Statistics for the InnoDB compression; reset cumulated counts            | GPL            |
| INNODB_CMPMEM       | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | Statistics for the InnoDB compressed buffer pool                         | GPL            |
| INNODB_CMPMEM_RESET | 1.0.2          | ACTIVE        | INFORMATION SCHEMA | NULL           | Innobase Oy                          | Statistics for the InnoDB compressed buffer pool; reset cumulated counts | GPL            |
| MyISAM              | 1.0            | ACTIVE        | STORAGE ENGINE     | NULL           | MySQL AB                             | Default engine as of MySQL 3.23 with great performance                   | GPL            |
+---------------------+----------------+---------------+--------------------+----------------+--------------------------------------+--------------------------------------------------------------------------+----------------+
24 rows in set (0.00 sec)

This is where we start to see some significant differences. Storage engines are now all plugins. We see some entire new features for SCHEDULING, LOGGING and REPLICATOR.

drizzle> select plugin_type,count(*) from information_schema.plugins group by plugin_type order by 2 desc;
+--------------------+----------+
| plugin_type        | count(*) |
+--------------------+----------+
| INFORMATION SCHEMA |        7 |
| STORAGE ENGINE     |        6 |
| UDF                |        4 |
| SCHEDULING         |        3 |
| LOGGER             |        2 |
| REPLICATOR         |        1 |
| DAEMON             |        1 |
+--------------------+----------+
7 rows in set (0.00 sec)

SHOW Commands

Drizzle supports the following SHOW Commands.

SHOW COLUMNS FROM information_schema.tables;
SHOW CREATE DATABASE information_schema;
SHOW DATABASES;
SHOW ENGINE INNODB STATUS;
SHOW ERRORS;
SHOW INDEX FROM information_schema.tables;
SHOW OPEN TABLES;
SHOW PROCESSLIST;
SHOW STATUS;
SHOW TABLE STATUS;
SHOW TABLES;
SHOW VARIABLES;
SHOW WARNINGS;
SHOW CREATE SCHEMA;
SHOW SCHEMAS;

The following are no longer valid.

AUTHORS, CHARACTER SET, COLLATION, CONTRIBUTORS, EVENT event_name, FUNCTION func_name, PROCEDURE proc_name, TRIGGER trigger_name, VIEW view_name, ENGINES, EVENTS, FUNCTION CODE func_name, FUNCTION STATUS, GRANTS FOR , INNODB STATUS, PLUGINS, PROCEDURE CODE proc_name, PROCEDURE STATUS, PRIVILEGES, PROFILE, PROFILES, SCHEDULER STATUS, TRIGGERS, BINARY LOGS, MASTER LOGS, BINLOG EVENTS, MASTER STATUS, SLAVE HOSTS, SLAVE STATUS,

A Drizzle update – Running version 2009.03.970-development

I’ve not looked at compiling and running Drizzle on my server for the past four weeks. Well overdue time for a check and see how it’s going. I saw in today’s planet.mysql.com by Eric Day a new dependency is needed. libdrizzle 0.2.0 now in Drizzle is now required, so I started there.

cd ~/bzr
bzr branch lp:libdrizzle
cd libdrizzle
./config/autorun.sh
./configure
make
sudo make install

No problems there, also documented at the Drizzle Wiki. Great to see the docs up to date. I see my old work on starting the compiling page still relevant. Tested on CentOS 5 and Mac OS/X 10.5

Compiling drizzle was not much more difficult.

cd ~/bzr/drizzle
bzr update
make distclean
./config/autorun.sh
./configure --prefix=/home/drizzle/deploy
make
make install

The problems happened when I started drizzle. Initially I was using bin/drizzled_safe, but it was recommended via IRC#drizzle I stick with sbin/drizzled

sbin/drizzled &
error while loading shared libraries: libprotobuf.so.2: cannot open shared object file: No such file or directory

An investigation of Google Proto Buffers.

$ protoc --version
libprotoc 2.0.2

I see that protobuf 2.0.3 is now available, but this was not the problem.

I got around the problem by specifying the current library path:

$ LD_LIBRARY_PATH=/usr/local/lib sbin/drizzled &

I corrected this problem by adding /usr/local/lib to the default ld path, both the libdrizzle and libprotobuf libs are located there.

$ echo "/usr/local/lib" > /etc/ld.so.conf.d/drizzle.conf
$ ldconfig
$ ls -l /usr/local/lib
total 37240
-rw-r--r-- 1 root root  1194602 Mar 31 17:42 libdrizzle.a
-rwxr-xr-x 1 root root      940 Mar 31 17:42 libdrizzle.la
lrwxrwxrwx 1 root root       19 Mar 31 17:42 libdrizzle.so -> libdrizzle.so.0.0.2
lrwxrwxrwx 1 root root       19 Mar 31 17:42 libdrizzle.so.0 -> libdrizzle.so.0.0.2
-rwxr-xr-x 1 root root  1117979 Mar 31 17:42 libdrizzle.so.0.0.2
-rw-r--r-- 1 root root 12199302 Nov 30 23:32 libprotobuf.a
-rwxr-xr-x 1 root root      836 Nov 30 23:32 libprotobuf.la
lrwxrwxrwx 1 root root       20 Nov 30 23:32 libprotobuf.so -> libprotobuf.so.2.0.0
lrwxrwxrwx 1 root root       20 Aug 27  2008 libprotobuf.so.0 -> libprotobuf.so.0.0.0
-rwxr-xr-x 1 root root  5027949 Aug 27  2008 libprotobuf.so.0.0.0
lrwxrwxrwx 1 root root       20 Nov 30 23:32 libprotobuf.so.2 -> libprotobuf.so.2.0.0
-rwxr-xr-x 1 root root  5586965 Nov 30 23:32 libprotobuf.so.2.0.0
-rw-r--r-- 1 root root  9264068 Nov 30 23:32 libprotoc.a
-rwxr-xr-x 1 root root      852 Nov 30 23:32 libprotoc.la
lrwxrwxrwx 1 root root       18 Nov 30 23:32 libprotoc.so -> libprotoc.so.0.0.0
lrwxrwxrwx 1 root root       18 Nov 30 23:32 libprotoc.so.0 -> libprotoc.so.0.0.0
-rwxr-xr-x 1 root root  3645396 Nov 30 23:32 libprotoc.so.0.0.0
drwxr-xr-x 2 root root     4096 Mar 31 17:42 pkgconfig

Starting

$ sbin/drizzled &
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins.
090331 18:38:08  InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
090331 18:38:08  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
090331 18:38:08 InnoDB Plugin 1.0.3 started; log sequence number 46419
sbin/drizzled: ready for connections.
Version: '2009.03.970-development'  socket: ''  port: 4427  Source distribution

Verifying

$ bin/drizzle -uroot
Welcome to the Drizzle client..  Commands end with ; or g.
Your Drizzle connection id is 1
Server version: 2009.03.970-development Source distribution
Type 'help;' or 'h' for help. Type 'c' to clear the buffer.
drizzle> select version();
+-------------------------+
| version()               |
+-------------------------+
| 2009.03.970-development |
+-------------------------+
1 row in set (0.00 sec)
drizzle> exit

Sweet! Now to try some testing & benchmarking before the barrage of conferences next month, 2009 MySQL Camp, Percona Performance Conference and MySQL Conference & Expo.

I’m going to check out The Juice Database Benchmark next as a more realistic benchmark to DBT2 and sysbench.

Extending vmplot

Taking the work already done with vmplot.sh, a useful tool for MySQL performance tuning by Yves and Matt at BigDBAHead, and in true Open Source fashion I’ve enhanced and modified for my own purposes.

These changes include:

  • Error checking for ‘gnuplot’ command on the system
  • Eliminate the first row of sample data, as this is often not a complete sample for the vmstat duration.
  • Created a HTML output file for easy browser viewing
  • Changed Memory scale values from Kilobytes to Megabytes
  • Resizing png’s for optimal 1024×768 display output (2 per row)

Download vmplot

And I get:

So more specifics of what I did.

Install gnuplot.

$ yum install gnuplot

Create vmstat sample file.

vmstat 1 100 > vmstat.out

Generate output graphs (need to work out those warnings)

 ./vmplot -i vmstat.out -o tmp
Warning: empty y range [0:0], adjusting to [-1:1]
Warning: empty y range [0:0], adjusting to [0:1]

And then I can view via a browser, in this case http://localhost/tmp/vmplot.htm

Some worthwhile references include Gnuplot Reference Manual, Gnuplot Examples and Gnuplot in Action

Identifying resource bottlenecks – CPU

One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources, then identify any obvious bottlenecks in these resources.

When dealing with CPU, a quick audit should include identifying the number of CPU cores your server has, and the types of these cores. The key file on Linux systems is /proc/cpuinfo.

Number of cores can be found via the command cat /proc/cpuinfo | grep “^processor” | wc -l

You need to look more closely at the file to determine the type of CPU (e.g. below the model name shows Intel(R) Xeon(R) CPU X3220 @ 2.40GHz. The combination of knowing the number of processors (cores) listed and physical id and siblings helps identify how many CPUs and how many cores per CPU exist.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz
stepping	: 11
cpu MHz		: 2394.051
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips	: 4789.96
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

...

Other commands that help with identifying CPU/cores include mpstat and top.

$ mpstat -P ALL 5

11:43:43 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
11:43:48 AM  all    0.00    0.00    0.00    0.00    0.05    0.00    0.00   99.95   1033.00
11:43:48 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.40
11:43:48 AM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:43:48 AM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     31.40
11:43:48 AM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      1.00
$ top
0

top - 11:42:09 up 36 days, 13:17,  2 users,  load average: 0.20, 0.24, 0.25
Tasks: 133 total,   1 running, 132 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050776k total,  3825584k used,   225192k free,   397580k buffers
Swap:  1052248k total,      128k used,  1052120k free,  2302408k cached

You can easily identify a CPU bottleneck using the vmstat command.

The following shows an idle system.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
 0  0    128 234088 409636 2474392    0    0     0   100 1085  209  0  0 100  0  0
 0  0    128 233836 409636 2474396    0    0     0     0 1014  184  3  0 97  0  0
 0  0    128 233284 409636 2474396    0    0     0     0 1182  435  2  0 98  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1024  104  1  0 99  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1079  195  0  0 100  0  0
 1  0    128 233168 409644 2474396    0    0     0   232 1021  188  3  0 97  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1111  213  2  0 98  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1005   60  0  0 100  0  0

The key columns (from the man page are)

CPU – These are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
  • st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.
  • Procs

  • r: The number of processes waiting for run time.

NOTE: The columns of vmstat may vary between different Linux Operating Systems.

If you system is CPU Bound then you will observe this. Look at id,us,sy,r

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    128 275684 397176 2300672    0    0     0     0 1118  427 74  2 25  0  0
 3  0    128 217404 397176 2300672    0    0     0     0 1017  138 74  1 25  0  0
 6  0    128 239584 397176 2300672    0    0     0     0 1086  350 93  2  5  0  0
 4  0    128 269468 397176 2300672    0    0     0     0 1005  229 98  2  0  0  0
 4  0    128 217636 397180 2300668    0    0     0   168 1087  251 99  2  0  0  0
 4  0    128 240576 397180 2300668    0    0     0     0 1006  182 99  2  0  0  0
 4  0    128 270708 397180 2300668    0    0     0     0 1079  338 98  2  0  0  0
 4  0    128 218752 397180 2300684    0    0     0     0 1005  106 99  1  0  0  0
 4  0    128 226316 397180 2300684    0    0     0     0 1077  308 98  2  0  0  0
 4  0    128 198664 397184 2300680    0    0     0    76 1010  250 99  1  0  0  0
 4  0    128 179444 397184 2300680    0    0     0     0 1077  238 100  0  0  0  0
 4  0    128 185396 397184 2300688    0    0     0     0 1006  210 99  1  0  0  0
 4  0    128 199408 397184 2300688    0    0     0     0 1079  336 99  1  0  0  0

You should also be wary of a Single CPU Bound process. This is why knowing the number of cores is important. In this example, one CPU is bound.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st

 0  0    128  99592 412544 2477580    0    0     0     0 1017   89  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1090  222  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1019   98  0  0 100  0  0
 1  0    128  99592 412544 2477580    0    0     0     0 1096  347 14  0 86  0  0
 1  0    128  99592 412548 2477576    0    0     0    84 1030  194 25  0 75  0  0
 1  0    128  99592 412548 2477576    0    0     0     0 1094  300 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1012   76 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1096  318 25  0 75  0  0
 1  0    128  73192 412548 2477580    0    0     0     0 1039  273 29  0 70  0  0
 1  0    128  77284 412556 2477572    0    0     0   268 1122  373 25  1 75  0  0
 2  0    128  83592 412556 2477584    0    0     0     0 1036  374 27  1 72  0  0
 0  0    128  56220 412564 2477576    0    0     0   172 1017   84  7  0 94  0  0
 0  0    128  56220 412564 2477576    0    0     0     0 1078  192  0  0 100  0  0
$ mpstat -P ALL 1
12:15:55 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:56 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1072.00
12:15:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00
12:15:56 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:56 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     62.00
12:15:56 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      9.00

12:15:56 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:57 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1021.00
12:15:57 PM    0    0.00    0.00    0.00    0.00    0.00    1.00    0.00   99.00   1001.00
12:15:57 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:57 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     18.00
12:15:57 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00

I will be detailing identifying bottlenecks of Memory, Disk and Network in future posts. You can also find out more at the MySQL User Conference “Monitoring 101 – Simple stuff to save your bacon” session.

Two *FREE* MySQL Conferences Next Month

The annual MySQL Conference & Expo will be held in this year on April 20-23 in Santa Clara, California with a double twist.

Not one, but *two* FREE additional MySQL Conferences are running at the same time, in the same hotel. If you on the west coast you can effectively get a free conference with many MySQL experts speaking at them. I am speaking at all three on three different topics.

The first announcement was the 2009 MySQL Camp organized by Sheeri K. Cabral – The She-BA in line with the O’Reilly approach of having a smaller un-conference within a conference such as with Web 2.0 NY last year.

However the big news was the Percona Performance Conference which was to be held in an adjacent location at the same time, is available now within the same hotel, the Hyatt Regency. In some regards, the Schedule is a better lineup then the official conference.

Percona is a major player in providing services to the MySQL eco-system, the failure to include Percona speakers such as Baron Schwartz was seen as a bizarre move. As a member of the MySQL Conference review committee, Percona submitted a number of presentations, and a number received high scores, more then sufficient in comparison to others accepted.

The MySQL eco-system is at a very fragile junction point, there are numerous independent versions, patches and forks now available, and the recent decision by Sun Microsystems, the trademark owner of the ‘mysql’ term to request Google to enforce trademark usage within Google Ad Words did not seem a productive move for a open source produce. Read more at
Hurting the little guy and MySQL Banned In Google Adword Campaigns. Combined with recent news that MySQL, may now become part of IBM, greater uncertainty may prevail.

These conferences will be unique opportunity to see and hear hopefully more details of what’s happening in the MySQL World.

Priceless Monty

While working with Monty Widenius on a bug I reported in MariaDB I was surprised to not get an IRC response for a few minutes. When committed Monty can identify, create a workaround, and patch a problem in code before you have time to read all the responses he also types. See my Monty’s Monument “Passion is a timeless wisdom” comment. One thing I forgot to say in that entry was, while 9pm in New york, it was at last 3am-5am Monty time.

We all love memorable quotes, so here is one.

“sorry, machine died; First time in years”

[9:40pm] montywi: then just try compiling mysqld.o, no need to wait for everything else...
....
[9:58pm] rbradfor: make is clean.
[9:59pm] montywi: sorry, machine died;  First time in years
[10:00pm] rbradfor: laughs out load, a priceless monty quote, love it.
...

Your Code, Your Community, Your Cloud… Project Kenai

Following the opening keynote announcement about Kenai I ventured into a talk on Project Kenai.

With today’s economy, the drive is towards efficiency is certainly a key consideration, it was quoted that dedicated hosting servers only run at 30% efficiency.

An overview again of Cloud Computing

  • Economics – Pay as you go,
  • Developer Centric – rapid self provisioning, api-driven, faster deployment
  • Flexibility – standard services, elastic, on demand, multi-tenant

Types of Clouds

  • Public – pay as you go, multi-tenant application and services
  • Private – Cloud computing model run within a company’s own data center
  • Mixed – Mixed user of public and private clouds according to applications

SmugMug was referenced as a Mixed Cloud example.

Cloud Layers

  • Infrastructure as a Services – Basic storage and computer capabilities offer as a service (eg. AWS)
  • Platform as a Service – Developer platform with build-in services. e.g. Google App Engine
  • Software as Service – applications offered on demand over the network e.g salesforce.com

Some issues raised about this layers included.

  • IaaS issues include Service Level, Privacy, Security, Cost of Exit
  • PaaS interesting point, one that is the bane of MySQL performance tuning, that is instrumentation
  • SaaS nothing you need to download, you take the pieces you need, interact with the cloud. More services simply like doing your Tax online.

Sun offers Project Kenai as well as Zembly.

Project Kenai

  • A platform and ecosystem for developers.
  • Freely host open source projects and code.
  • Connect, community, collaborate and Code with peers
  • Eventually easily deploy application/services to “clouds”

Kenai Features

  • Code Repository with SVN, Mercurial, or an external repository
  • Issue tracking with bugzilla, jira
  • collaboration tools such as wiki, forums, mailing lists
  • document hosting
  • your profile
  • administrative role

Within Kenai you can open up to 5 open source projects and various metrics of the respositories, issue trackers, wiki etc.

The benefits were given as the features are integrated into your project, not distributed across different sites. Agile development within the project sees a release every 2 weeks. Integration with NetBeans and Eclipse is underway.

Kenai is targeted as being the core of the next generation of Sun’s collaboration tools. However when I asked for more details about uptake in Sun, it’s only a request, not a requirement for internal teams.

The API’s for the Sun Cloud are at http://kenai.com/projects/suncloudapis.

Event: CommunityOne East in New York, NY.
Presenter: Tori Wieldt, Sun Microsystems
Article Author: Ronald Bradford

Everybody is talking About Clouds

From the opening keynote at CommunityOne East we begin with Everybody is talking About Clouds.

It’s difficult to get a good definition, the opening cloud definition today was Software/Platform/Storage/Database/Infrastructure as a service. Grid Computing, Visualization, Utility Computing, Application Hosting. Basically all the buzz words we currently know.

Cloud computing has the ideals of truly bringing a freedom of choice. For inside or outside of an enterprise, the lower the barrier, time and cost into freedom of choice give opportunities including:

  • Self-service provisioning
  • Scale up, Scale down.
  • Pay for only what you use.

Sun’s Vision has existed since 1984 with “The NETWORK is the Computer”.

Today, Sun’s View includes Many Clouds, Public and Private, Tuned up for different application needs, geographical, political, with a goal of being Open and Compatible.

How do we think into the future for developing and deploying into the cloud? The answer given today was, The Sun Open Cloud Platform which includes the set of core technologies, API’s and protocols that Sun hopes to see uptake among many different providers.

The Sun Cloud Platform

  • Products and Technologies – VirtualBox, Sun xVM, Q-Laser, MySQL
  • Expertise and Services
  • Partners – Zmanda, Rightscale, Kickapps
  • Open Communities – Glashfish, Java, Open Office, Zfs, Netbeans, Eucalyptus

The Sun Cloud includes:

  • Compute Service
  • Storage Service
  • Virtual Data Center
  • Open API – Public, RESTful, Java, Python, Ruby

The public API has been released today and is available under Kenai. It includes two key points:

  • Everything is a resource http GET, POST, PUT etc
  • A single starting point, other URI’s are discoverable.

What was initially showed was CLI interface exmaples, great to see this still is common, a demonstration using drag and drop via a web interface was also given, showing a load balanced, multi-teired, multi server environment. This was started and tested during the presentation.

Then Using Cyberduck (a WebDAV client on Mac OS/X) and being able to access the storage component at storage.network.com directly, then from Open Office you now get options to Get/Save to Cloud ( using TwoGuys.com, Virtual Data Center example document).

Seamless integration between the tools, and the service. That was impressive.

More information at sun.com/cloud. You can get more details also at the Sun Microsystems Unveils Open Cloud PlatformOfficial Press Release.

Event: CommunityOne East in New York, NY.
Article Author: Ronald Bradford

CommunityOne East – An open developer conference

With an opening video from thru-you.com – an individual taking random you-tube video and producing video mashup’s, the CommunityOne East conference in New York, NY beings.

The opening introduction was by Chief Sustainability Officer Dave Douglas. Interesting job title.

His initial discussion was around what is the relationship between technology and society. A plug for his upcoming book “Citizen Engineer” – The responsibilities of a 21st Century Engineer. He quotes “Crisis loves an Innovation” by Jonathan Schwartz, and extends with “Crisis loves a Community”.

He asks us to consider the wider community ecosystem such as schools, towns, governments, NGO’s etc with our usage and knowledge of technology.

Event: CommunityOne East in New York, NY.
Author: Ronald Bradford

Hurting the little guy?

Today I come back from the dentist, if that wasn’t bad enough news, I get an email from Google AdWords titled Your Google AdWords Approval Status.

In the email, all my AdWords campaigns are now disapproved, because of:

SUGGESTIONS:
-> Ad Content: Please remove the following trademark from your ad:
mysql.

Yeah right. I can’t put the word ‘MySQL’ in my ads. How are people to now find me? It would appear that many ads have been pulled not just mine. Is this a proactive measure by Google? is this a complaint from the MySQL trademark holder Sun Microsystems?

I’d like any comment, feedback or suggestions on how one can proceed here.

It reminds me of the days CentOS advertised itself as an “Open source provider of a popular North American Operating System”, or something of that nature.

Understanding the various MySQL Products & Variants

The MySQL marketplace today is far more complex then simply choosing between a particular version of MySQL that Sun/MySQL produces.
The MySQL server product in general is released under the GNU General Public License (GPL) v2, however you should carefully review the MySQL Legal Policies as a number of exceptions and different license agreements operate for companion tools such as MySQL Cluster, MySQL client libraries and documentation for example.

Looking into the MySQL ecosystem for products, I’ve produced the following categories:

  • Sun/MySQL Official Products
    • MySQL Versions
  • MySQL Variants
    • Community
    • Enterprise
  • MySQL Plugins
  • MySQL Patches
  • MySQL Alternatives

Why does such a diversification occur?  I attribute this to three primary causes:

  • The GPL license by nature allows for an organization to take the product, modify it and use it for their specific needs. They can also provide these patches under GPL for others to use and incorporate. While this has occurred for example  Google , FaceBook, eBay , Proven Scaling and Percona to name a few, Sun/MySQL has elected not to undertake any proactive process of incorporating these in any timely fashion.
  • The policy of Sun/MySQL to allow for contributions was so strict, and combined with a properietory Version Control System BitKeeper you had to purchase, there was little incentive for community contributions in relation to so many other open source projects
  • The Sun/MySQL management and decision makers didn’t listen to the community and paying customers, and over the past 3-5 years the product life cycle, features, release schedule and quality can be questioned.

Sun/MySQL Official Products

Sun/MySQL holds the license to the MySQL server products. They release official binaries and the source code (due to GPL).  Even within MySQL, there are several products that differ subtly and to the untrained eye it can be confusing to understand and determine what is best. Your can download from www.mysql.com the following versions:

  • MySQL Server 5.1  GA
  • MySQL Community Server 5.0 GA
  • MySQL Enterprise Server 5.0 GA
  • MySQL Cluster NDB 6.3

  • MySQL Server 4.1 (EOL)
  • MYSQL Server 6.0 (Alpha)

MySQL Versions

It is important that you understand the MySQL Versions, especially in evaluating any of the following referenced variants, patches etc.
The common path for MySQL Server versions is with a generally linear numbering systems including historical versions 3.23, 4.0 and 4.1.  These versions have now reached End Of Life (EOL) for support, however emergency security patches are applied where necessary.
Continuing from 4.1, you have the 5.0, 5.1 versions which are both Generally Available (GA), and then version 6.0 which is currently Alpha.

Further complexity happens when within the Sun/MySQL Official products, several forks/branches have occurred.  These include:

  • The MySQL 5.0 Community & Enterprise split occurred at MySQL Version 5.0.27
  • At this time, the Community version (free to download) continued with the intention of allowing for community contributions. Only one patch was ever accepted, and SHOW PROFILES was introduced in MySQL 5.0.37.  To date, 11 versions have been released to the current 5.0.77 version.
  • MySQL Enterprise (available under subscription) is itself comprised of three subtypes, these are Rapid Update Service Packs(monthly), Quarterly Service Packs (quarterly) and Hot-fix releases.  To date 37 versions have been released to the current 5.0.78 version.
  • MySQL Cluster, was part of the MySQL Server product until this was branched/forked at  MySQL Version 5.1.23. This enabled MySQL Cluster to be labeled as Production Ready for Cluster clients, and not be held back by continued delays in the 5.1 server release.   Starting with a new versioning scheme with 6.1, the MySQL Cluster NDB produces new versions far exceeding the volume of the server, with to date 23 versions in 6.1 , 18 in version 6.2 and 24 in version 6.3.      I am not advocating that features and quality are better or worse, simply that activity and interaction with community and users is far greater.
  • MySQL 5.1 Maria is a special branch starting at MySQL Version 5.1.24 that includes the Maria Storage Engine. This is the next generation of the MyISAM Storage Engine, both architected by the creator of MySQL, Monty Widenius.  It is undercertain this will continue as a product released officially by Sun/MySQL.

In Review

With just reading this introduction you can understand the confusion that exists when new customers/clients are beginning to evaluate the different MySQL Versions.

In my next post, I’ll talk more about:

  • MySQL Variants, those I consider variants use the MySQL Interface, protocol and support the standard connectors.  These include community versions (e.g. Solid, Infobright, Sphinx) and commercial versions (e.g. KickFire, InfoBright, Nitro).  
  • MySQL patches are improvements that have been released to the community and are now becoming part of common third party MySQL packages, such as Percona, Proven Scaling and Out Delta
  • MySQL Plugins are a feature of MySQL 5.1, and allow for pluggable storage engines into MySQL.  While several companies have had to produce custom binaries due to the API limitions (especially with the optimizer), a number of engines support the API including Innodb, PBXT and filesystem engine.
  • MySQL Alternatives include any MySQL related products that have now deviated from being supported under the MySQL protocol.  Most notably here is Drizzle.

More Information

Ronald Bradford is Principal at 42SQL. We provide consulting and advisory services for the MySQL ecosystem with a focus on MySQL database performance, architecture and scalability. 42SQL also provides education in MySQL including the “MySQL Essentials” training course. You can find more information regarding this offering and an upcoming schedule at 42SQL Education.

Beginner CSV Engine issues

I’ve just started using the CSV engine for a practical application and I’ve come across a few subtle and unexpected issues/limitations.

First, you can’t create any columns in your CSV table nullable.

mysql> create table t1(i INT) ENGINE=CSV;
ERROR 1178 (42000): The storage engine for the table doesn't support nullable columns

RTFM shows this was introduced in 5.1.23. See CSV Limitations

The second and more annoying was creating a CSV table, inserting a sample row (just to check the syntax), then replacing the file ([datadir]/[schema]/[table].CSV) with the same format, but with additional generated rows. This was to no avail when attempting to SELECT from the table.

The solution was to do a REPAIR TABLE [table] in order to see the newly refreshed data.
Futhermore, some more RTFM shows in Repairing and Checking CSV Tables an important caveat to using the CSV engine and loading data directly by file copy.

Warning: Note that during repair, only the rows from the CSV file up to the first damaged row are copied to the new table. All other rows from the first damaged row to the end of the table are removed, even valid rows.

Buyer beware.

Infobright Community Edition(ICE) – It's Free

The March NY MySQL Meetup featured a presentation from Infobright, a data warehousing solution built on the MySQL Product.

With a pitch of “Simplicity, Scalability and low TCO” I became more impressed with the capability to delivery on these as the presentation proceeded. Here are some highlights.

  • The company and product has been around for a few years. Infobright started as a compression engine to sit beside Teradata, providing a significant cost saving to clients, and allowing a two way data transfer between Teradata.
  • In September 2008, a open source community edition was released, called ICE. (Which I didn’t know)
  • The technology is based on a Rough Set theory, a mathematical approach
  • Using a column oriented approach, compression generally starts at 10:1, different applications can get 30:1 or better
  • There is basically no tuning, there are no indexes. Knowledge is gleaned at data loading and each data pack node holds key information per column, such as range of values (min,max).
  • Some interesting results are, there is a constant load time, it doesn’t degrade over time as the size of your data increases. Also, Query performance scales with data volume.
  • Depending on queries, the knowledge grid can retrieve results without having to uncompress the data, i.e. introspection of the meta data is all that is needed
  • Infobright is not a pluggable storage engine, rather a custom binary of MySQL. This is due to the restrictions of the API and the lack of optimizer push down conditions for example.

The product is not without some limitations, but you have to realize the product is for a data warehousing implementation, not an OLTP web app. It’s not great with SELECT *, and large text strings for example.

Functionality continues to be added, with a recent release adding many more MySQL Functions, but again, Infobright does not claim to be a solution to everybody, there is not UDF support or SP support at this time, however I’d warrant this is really not needed.

While the presentation went into some detail regarding the knowledge grid, data packs, data pack nodes, and pack to pack integration from a slide perspective, the presentation lacked the technical here is how you use the loader to get data out of MySQL and into Infobright. Here is the throughput, etc. As a marketing presentation it had the right content, but I’d like to now see the companion technical presentation.

Having previously been part of the MySQL Consulting team, and having worked also in the Storage Engine API with the Nitro Storage engine I have a distinct advantage of knowing the complexities of integration with MySQL. We can only hope this continues to improve with future releases of MySQL enabling Infobright and other products to integrate better and keep up to date with the MySQL Release cycle.

Identifying Bad Memory

I was having problems recently with a dedicated production server, that runs my MySQL Server and a number of websites. It’s most annoying when your system crashes without any reporting in /var/log/messages

The tool of choice from the host provider SoftLayer was PassMark BurnInTest Linux which is installed with every dedicated server.

I will need to investigate open source alternatives, as this is a commercial product, but for the purposes of my pain, this included tool was well worth the investment.

**************
RESULT SUMMARY
**************
Test Start time: Sun Feb 22 16:02:48 2009
Test Stop time: Sun Feb 22 16:07:49 2009
Test Duration: 000h 05m 01s

Test Name Cycles Operations Result Errors Last Error
CPU - Maths 261 488 Billion PASS 0 No errors
Memory (RAM) 2 3.081 Billion FAIL 1 Error verifying data in RAM
Network: 127.0.0.1 412995 4.295 Billion PASS 0 No errors
TEST RUN FAILED

*********************
SERIOUS ERROR SUMMARY
*********************
SERIOUS : 2009-02-22 16:07:31, RAM, SERIOUS: Error verifying data in RAM (x 1)

It was great to get a simple resolution to the problem, bad memory?
With a scheduled maintenance replacement I was operational again.

 **************
RESULT SUMMARY
**************
Test Start time: Sun Feb 22 20:34:37 2009
Test Stop time: Sun Feb 22 20:39:38 2009
Test Duration: 000h 05m 01s

Test Name Cycles Operations Result Errors Last Error
CPU - Maths 267 406 Billion PASS 0 No errors
Memory (RAM) 1 3.664 Billion PASS 0 No errors
Network: 127.0.0.1 334578 3.480 Billion PASS 0 No errors
TEST RUN PASSED

*********************
SERIOUS ERROR SUMMARY
*********************

Are you monitoring RSS & VSZ?

Monitoring MySQL Memory is a rather critical task because you can’t limit MySQL’s usage of physical memory resources. Improperly configured servers running MySQL can crash because you don’t understand memory usage.

MySQL uses memory in a number of different ways. Using the Oracle analogy, you can divide the mysqld memory usage into main areas of:

  • SGA – System Global Area
  • PGA – Process Global Area

The SGA is the footprint that MySQL uses for startup. This is attributed to the base footprint of the mysqld process and a number of buffers including:

NOTE: This is for a default MySQL 5.1 install. Other storage engines and/or other versions of MySQL may have additional buffers. Falcon for example in MySQL 6.x has additional buffers.

The PGA is more complex, and the cause of problems for the possible occurrence of your server running out of memory and needing to swap. The goal of monitoring memory usage is to of course avoid this.
This additional memory is a combination of a few areas including:

  • MEMORY tables
  • Connection management (such as thread_cache and table_cache)
  • Per Connection memory usage

The later is the cause of greatest concern, especially for environments that have persistent connections. Per Connection memory usage is a combination of many buffers including the thread_stack, 2 x net_buffer_length (to max_allowed_packet), read_buffer_size, read_rnd_buffer_size, join_buffer_size, sort_buffer_size, and up to min(max_heap_table_size,tmp_table_size). In fact, for example with temporary tables, a query may use multiple temporary tables.

Memory on a per connection basis is kept until the connection is closed. In some instances such as next_buffer_size, this is apparently reduced aftter each SQL Statement result. With a persistent connection model (e.g. Java), ensuring idle connections drop to a low watermark is a valuable task. The confusing part is MySQL instrumentation does not tell you exactly how much is used, and it’s impossible to calculate with available provided data.

As part of monitoring your server, you should monitor the size of the mysqld memory usage, because this will cause you to be proactive rather then reactive to scarce memory resources. You can easily get this using the ps command. For example:

$ps -eopid,fname,rss,vsz,user,command | grep -e "RSS" -e "mysql"
  PID COMMAND    RSS    VSZ USER     COMMAND
 5463 grep       764   5204 ronald   grep -e RSS -e mysql
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/vol/mysql/mysqldata --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld
$ ps -eopid,fname,rss,vsz,user,command | grep " mysqld " | grep -v grep | awk '{print $3,$4}'
4787820 5127208


From man ps
rss RSS resident set size, the non-swapped physical memory that a task has used (in kiloBytes). (alias rssize, rsz).
vsz VSZ virtual memory size of the process in KiB (1024-byte units). Device mappings are currently excluded; this is subject to change.
(alias vsize).

The motto of the story, don’t just monitor the free memory of your system, for a database server, you need to closely monitor the primary process on the server, that is mysqld.

References

How MySQL Uses Memory

More Information

Join me for my MySQL User Conference talk on “Monitoring 101 – Simple stuff to save your bacon”.

I also cover monitoring MySQL in my “MySQL Essentials” training course. For more information visit MySQL Education.