DP#4 The importance of using sql_mode

What if the data you retrieved from the database did not match the data the application claimed to had successfully stored? How comfortable would your organization feel about your skills and the products that are being used to store important information if data integrity was not guaranteed?

MySQL employs a terrible default technique known as silent truncation where the product determines that it knows about your data better than you. Never has the saying “do not assume” because it makes an “ass” out of “u” and “me” been more applicable.

Customer Example

A HTML form for new customers provide input fields for the customer first and last name. Good design was considered with the HTML form client validation to ensure that each field could not exceed 20 characters in length. However, the database design is different, where the first name is only defined as 10 characters. In most cases this is sufficient, however for first names longer than 10 characters, the data retrieved does not match the data that was apparently successfully stored because there was no SQL error. The following SQL reproduces this situation.

DROP TABLE IF EXISTS dp4;
CREATE TABLE dp4 (
  customer_id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name      VARCHAR(10) NOT NULL,
  last_name       VARCHAR(20) NOT NULL,
  PRIMARY KEY (customer_id)
) ENGINE=InnoDB DEFAULT CHARSET utf8;

INSERT INTO dp4 (customer_id, first_name, last_name)
          VALUES (NULL,'Evangeline','Jones');
INSERT INTO dp4 (customer_id, first_name, last_name)
          VALUES (NULL,'Christopher','Smith');
INSERT INTO dp4 (customer_id, first_name, last_name)
          VALUES (NULL,'Alexander','Bell');

SELECT * FROM dp4;
+-------------+------------+-----------+
| customer_id | first_name | last_name |
+-------------+------------+-----------+
|           1 | Evangeline | Jones     |
|           2 | Christophe | Smith     |
|           3 | Alexander  | Bell      |
+-------------+------------+-----------+
3 rows in set (0.00 sec)

As you can see, the first name of Christopher Smith is not actually correctly stored in the database. MySQL DID NOT produce an error message, rather it performed a silent truncation of the data.

Defining sql_mode

To demonstrate what level of data integrity you should expect with MySQL, you must define the sql_mode configuration option. The following example demonstrates the dynamic syntax for a given connection and the error you should expect.

SET SESSION sql_mode='STRICT_ALL_TABLES,NO_ZERO_DATE,NO_ZERO_IN_DATE';

TRUNCATE TABLE dp4;
INSERT INTO dp4 (customer_id, first_name, last_name)
          VALUES (NULL,'Christopher','James');
ERROR 1406 (22001): Data too long for column 'first_name' at row 1
SELECT * FROM dp4;
Empty set (0.00 sec)

When MySQL is first installed the following configuration option should always be added to all environments as a default.

$ cat /etc/my/cnf
[mysqld]
sql_mode=STRICT_ALL_TABLES,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION

Starting with MySQL 5.6, the default MySQL configuration files when created by MySQL define sql_mode variable with STRICT_ALL_TABLES,NO_ENGINE_SUBSTITUTION. This is a step in the right direction.

For more information, refer to the MySQL Reference Manual for sql_mode at http://dev.mysql.com/doc/refman/5.6/en/server-sql-mode.html

NOTE: MySQL provides many different options with sql_mode. Careful consideration is needed to determine which options are best for your application. Some options help in providing syntax and compatibility with other database products however these can affect and even break existing products written specifically for MySQL.

MySQL Warnings

The underlying cause of this loss of data integrity is how MySQL handles success and error conditions with SQL Statements. There are the obvious success and failure states, however MySQL has a third state known was warnings, or more specifically success with warnings. As the use of warnings is uncommon with other data store products, many applications, developers and programming languages ignore checking for warnings, or are simply unaware of this inbuilt feature.

Using the MySQL command line client, you can get a visual indication of warnings following an SQL statement which then help the need for reviewing what warnings occurred.

SET SESSION sql_mode='';
INSERT INTO dp4 (customer_id, first_name, last_name)
VALUES (NULL,'Christopher','Smith');
Query OK, 1 row affected, 1 warning (0.00 sec)

SHOW WARNINGS;
+---------+------+-------------------------------------------------+
| Level   | Code | Message                                         |
+---------+------+-------------------------------------------------+
| Warning | 1265 | Data truncated for column 'first_name' at row 1 |
+---------+------+-------------------------------------------------+
1 row in set (0.00 sec)

When using PHP there is no indication of SQL warnings unless you specifically check after every SQL statement. For example:

<?php
  $con = mysqli_connect('localhost', 'scott', 'sakila', 'design');
  if (mysqli_connect_errno()) {
    print 'Failed to connect to MySQL: ' . mysqli_connect_error() . "n";
    exit(1);
  }

  if (!mysqli_query($con, 'INSERT INTO dp4 (customer_id, first_name, last_name) ' .
                          'VALUES (NULL,"Christopher","Holt") ')) {
    print 'Failed to insert data: ' . mysqli_error($con) . "n";
  }
  if (($warnings = mysqli_warning_count($con)) > 0) {
    if ($rs = mysqli_query($con, "SHOW WARNINGS")) {
        $row = mysqli_fetch_row($rs);
        printf("%s (%d): %sn", $row[0], $row[1], $row[2]);
        mysqli_free_result($rs);
    }
  }
  mysqli_close($con);
?>

The best recommendation is to avoid all situations where MySQL can produce a warning and does not provide the best possible data integrity.

Refer to the MySQL Reference Manual for more information on SHOW WARNINGS at
http://dev.mysql.com/doc/refman/5.6/en/show-warnings.html

The Larger Problem

This underlying problem is actually more difficult to correct for an existing production system than enabling the correct sql_mode configuration option. Using the customer example, the identification of any rows that are 10 characters in length could be valid, or may have been truncated. There is no easy way to obtain the actual value that was originally submitted. The use of the correct numerical data type (DP#14) can provide a check constraint for values, however it can also suffer from the same truncation problem. You especially hope that this does not affect your payroll, your frequent flyer points balance or your accumulated score from your favorite online game.

The solution is to avoid the problem of producing incorrect data.

Review

While this example is using a character data type, field truncation can also occur with numeric and date data types. The use of applicable sql_mode configuration settings is a critical MySQL design practice to ensure adequate data integrity that all systems need to implement.

More References

About Expert MySQL Design Practices

This new series by leading MySQL Expert Ronald Bradford helps the software engineer understand, appreciate and develop the right skills and techniques to build scalable software solutions. These proven and reproducible design practices will ensure your use of MySQL to improve performance, scalability and reliability.

These expert design practices are from 25 years of professional experience following formal university qualifications in computer science. All of these practices are written for use with a MySQL based data system however most of the content in these practices predate the existence of the MySQL product and have stood the test of time with emerging technologies and software development approaches. Many practices apply directly to other data stores, whether relational or the new NoSQL products and include working with persistent and non-persistent data storage products.

More information about the series can be found at http://ronaldbradford.com/blog/expert-mysql-design-practices/

Unexplained (trivial) MySQL behavior

The -N or –skip-column-names is a convenient option with the mysql client to skip the header line of output.
However I found when viewing the output via the terminal, some interesting and unexplained output.

$ mysql -h*** -u*** -p -e "SELECT VARIABLE_NAME,VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS"
+--------------------------------+----------------+
| VARIABLE_NAME                  | VARIABLE_VALUE |
+--------------------------------+----------------+
| ABORTED_CLIENTS                | 710001         |
| ABORTED_CONNECTS               | 31             |
| BINLOG_CACHE_DISK_USE          | 0              |
| BINLOG_CACHE_USE               | 0              |
| BYTES_RECEIVED                 | 2522301004     |
| BYTES_SENT                     | 317785976      |
| COM_ADMIN_COMMANDS             | 2890667        |
| COM_ASSIGN_TO_KEYCACHE         | 0              |
| COM_ALTER_DB                   | 0              |
| COM_ALTER_DB_UPGRADE           | 0              |
$ mysql -N -h*** -u*** -p -e "SELECT VARIABLE_NAME,VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS"
+--------------------------------+--------------+
|                ABORTED_CLIENTS |       710001 |
|               ABORTED_CONNECTS |           31 |
|          BINLOG_CACHE_DISK_USE |            0 |
|               BINLOG_CACHE_USE |            0 |
|                 BYTES_RECEIVED |   2522947764 |
|                     BYTES_SENT |    348838502 |
|             COM_ADMIN_COMMANDS |      2890742 |


As you can see all the values of the first column are right aligned in terminal display. When written to a file (which is the intended outcome), the data is not.

$ mysql ... > /tmp/x
$ head /tmp/x
ABORTED_CLIENTS	710009
ABORTED_CONNECTS	31
BINLOG_CACHE_DISK_USE	0
BINLOG_CACHE_USE	0
BYTES_RECEIVED	2526760299
BYTES_SENT	530046795
COM_ADMIN_COMMANDS	2890742
COM_ASSIGN_TO_KEYCACHE	0
COM_ALTER_DB	0
COM_ALTER_DB_UPGRADE	0

Improved Security with MySQL 5.6

Installed on a clean CentOS 6.4 AWS instance.

sudo su -
cd /tmp
wget http://cdn.mysql.com/Downloads/MySQL-5.6/MySQL-5.6.13-1.el6.x86_64.rpm-bundle.tar
tar xvf MySQL-5.6.13-1.el6.x86_64.rpm-bundle.tar
yum install -y libaio perl
rpm -i MySQL*.rpm

The following output is the sign that security is being considered with new MySQL versions. Woot!

A RANDOM PASSWORD HAS BEEN SET FOR THE MySQL root USER !
You will find that password in '/root/.mysql_secret'.

You must change that password on your first connect,
no other statement but 'SET PASSWORD' will be accepted.
See the manual for the semantics of the 'password expired' flag.

Also, the account for the anonymous user has been removed.

In addition, you can run:

  /usr/bin/mysql_secure_installation

which will also give you the option of removing the test database.
This is strongly recommended for production servers.

See the manual for more instructions.

Please report any problems with the /usr/bin/mysqlbug script!

The latest information about MySQL is available on the web at

http://www.mysql.com

Support MySQL by buying support/licenses at http://shop.mysql.com

New default config file was created as /usr/my.cnf and
will be used by default by the server when you start it.
You may edit this file to change server settings

However, moving the MySQL configuration to /usr. WTF?

Identify package management file contents

Recently when implementing Milliseconds in MySQL Proxy Lua Script I needed to identify what was installed from a given package.

For CentOS

$ sudo yum install -y yum-utils
$ repoquery -q -l --plugins lua-socket

/usr/lib64/lua/5.1/mime
/usr/lib64/lua/5.1/mime/core.so
/usr/lib64/lua/5.1/socket
/usr/lib64/lua/5.1/socket/core.so
/usr/share/doc/lua-socket-2.0.2
/usr/share/doc/lua-socket-2.0.2/LICENSE
/usr/share/doc/lua-socket-2.0.2/README
/usr/share/doc/lua-socket-2.0.2/dns.html
/usr/share/doc/lua-socket-2.0.2/ftp.html
/usr/share/doc/lua-socket-2.0.2/http.html
/usr/share/doc/lua-socket-2.0.2/index.html
/usr/share/doc/lua-socket-2.0.2/installation.html
/usr/share/doc/lua-socket-2.0.2/introduction.html
/usr/share/doc/lua-socket-2.0.2/ltn12.html
/usr/share/doc/lua-socket-2.0.2/luasocket.png
/usr/share/doc/lua-socket-2.0.2/mime.html
/usr/share/doc/lua-socket-2.0.2/reference.css
/usr/share/doc/lua-socket-2.0.2/reference.html
/usr/share/doc/lua-socket-2.0.2/smtp.html
/usr/share/doc/lua-socket-2.0.2/socket.html
/usr/share/doc/lua-socket-2.0.2/tcp.html
/usr/share/doc/lua-socket-2.0.2/udp.html
/usr/share/doc/lua-socket-2.0.2/url.html
/usr/share/lua/5.1/ltn12.lua
/usr/share/lua/5.1/mime.lua
/usr/share/lua/5.1/socket
/usr/share/lua/5.1/socket.lua
/usr/share/lua/5.1/socket/ftp.lua
/usr/share/lua/5.1/socket/http.lua
/usr/share/lua/5.1/socket/smtp.lua
/usr/share/lua/5.1/socket/tp.lua
/usr/share/lua/5.1/socket/url.lua

For Ubuntu

$ dpkg-query -L liblua5.1-socket2

/.
/usr
/usr/share
/usr/share/doc
/usr/share/doc/liblua5.1-socket2
/usr/share/doc/liblua5.1-socket2/README
/usr/share/doc/liblua5.1-socket2/copyright
/usr/share/doc/liblua5.1-socket2/changelog.Debian.gz
/usr/share/lua
/usr/share/lua/5.1
/usr/share/lua/5.1/ltn12.lua
/usr/share/lua/5.1/mime.lua
/usr/share/lua/5.1/socket
/usr/share/lua/5.1/socket/http.lua
/usr/share/lua/5.1/socket/url.lua
/usr/share/lua/5.1/socket/tp.lua
/usr/share/lua/5.1/socket/ftp.lua
/usr/share/lua/5.1/socket/smtp.lua
/usr/share/lua/5.1/socket.lua
/usr/lib
/usr/lib/liblua5.1-mime.so.2.0.0
/usr/lib/liblua5.1-socket.so.2.0.0
/usr/lib/liblua5.1-unix.so.2.0.0
/usr/lib/lua
/usr/lib/lua/5.1
/usr/lib/lua/5.1/mime
/usr/lib/lua/5.1/socket
/usr/lib/liblua5.1-mime.so.2
/usr/lib/liblua5.1-socket.so.2
/usr/lib/liblua5.1-unix.so.2
/usr/lib/lua/5.1/mime/core.so
/usr/lib/lua/5.1/socket/core.so
/usr/lib/lua/5.1/socket/unix.so

MySQL Proxy and microseconds

By default the included Lua within MySQL proxy (0.8.3) does not include socket, necessary for getting microsecond granularity. To setup you have to install Lua and socket on the OS first:

For CentO5

$ sudo yum install lua lua-socket

For Ubuntu

$ sudo apt-get install lua5.1 liblua5.1-socket2

The following enables use within MySQL Proxy.

cp /usr/share/lua/5.1/socket.lua /path/to/mysqlproxy/lib/mysql-proxy/lua
cp -r /usr/lib64/lua/5.1/socket /path/to/mysqlproxy/lib/mysql-proxy/lua
cp -r /usr/lib64/lua/5.1/mime /path/to/mysqlproxy/lib/mysql-proxy/lua

My lua script can now use syntax similar to:

require 'socket'

function read_query( packet )
...
  now=socket.gettime()
  print( string.format("# %s.%3dn%s;n",os.date("%X",now),select(2,math.modf(now))*1000 , query))

Announcing the MySQL Plugin for New Relic

Many application developers would know of New Relic. A SaaS performance and monitoring tool targeted towards your web application monitoring including PHP, Ruby, Java, .Net, Python and Node.

With the release today (June 19, 2013) of the New Relic Platform, custom monitoring of data stores including MySQL are now possible. Try it now free. This link will provide you a free standard account (no cost, no billing details necessary), that enables you to perform application monitoring, server monitoring, MySQL instance monitoring and monitoring of many other products via many plugins.

Over the next few posts I will be discussing some of the design decisions I made for this MySQL plugin. New Relic has certain features that lend towards really helping developers monitor and diagnose the application (I have been surprised how it has helped in debugging DB and OS problems directly from PHP code for example). However, often it’s important that knowing the server resources, the database usage is critical to seeing the whole picture, and with the new plugins, New Relic gives developers, system administrators and database administrator some well targeted insights. When building custom dashboards you can see CPU usage, Database usage, and your web application volume, page load time and other metrics on one page.

The MySQL plugin has two pre-requisite requirements. A MySQL server running 5.0 or better, and a Java JRE 1.6 or better. The plugin can work either directly on each MySQL server/instance, and therefore needs the JRE, or it is possible to configure a single server to collect all MySQL statistics and report them to New Relic. There are no specifics that restrict this plugin working for any MySQL variants/forks, infact I specifically designed the plugin to be forward compatible with new version and status variables for example, and support custom recording of any metrics (more on that later).

This is first release of the New Relic Platform and MySQL plugin so I expect a lot of refinements, improvements and suggestions as we move forward. As an integral part of developing the MySQL plugin and using the New Relic Java SDK, there are a number of roadmap items to better serve MySQL and other products that will be coming in future releases. The beta version of the MySQL plugin has been running on production MySQL servers for several months now and working well.

New Relic provides two ways to display data, first by graphs, and second by tables. There is a handy information option in the Server monitoring that is not presently available for the plugins. Graphs work best with multiple data points and constantly changing data and records averages. When looking at the SQL commands for example is great to see the total breakdown, monitoring MySQL replication lag (a single metric), that hopefully is generally zero can appear a little bland. A cool trick is to click on the legend, this toggles the displayed value, and can really help when one value in a graph hogs the metric.

At present the plugin has 4 tabs of display:

  • Overview provides a high level view of total reads v total writes, database connections, network communications and a table of key utilizations (which I will discuss more in another post)
  • Query Analysis shows details of SQL commands, temporary (memory/disk) tables, slow queries, query cache usage, select query types, sort types and table locking.
  • InnoDB Metrics include buffer pool operations, a page breakdown, row operations, log writes, log and data fsyncs, checkpoint age, history list, internal threads and mutexes.
  • Replication shows lag, relay log volume, I/O thread lag, slave errors and master binary log volume (when on a master).

The plugin is written to be extensible via JSON configuration. Those wishing to monitor different variables, or say custom metrics from storage engines like Tokutek can be easily defined, either a key/value set, or single row of metrics. However, the initial version of the dashboards does not allow the customers to modify the present dashboard. Requests are welcome for me to expand the current global dashboards.

In the News

Percona Ireland??

Anybody else noticed that Percona appears to not be a US entity any more?

I observed it today.

$ sudo /usr/bin/innobackupex ...
InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy
and Percona Ireland Ltd 2009-2012.  All Rights Reserved.

This software is published under
the GNU GENERAL PUBLIC LICENSE Version 2, June 1991.

In previous versions this did say Percona Inc, 2009-2012.

The footer of the Documentation also states © Copyright Percona Ireland Ltd. 2009-2013.

Percona Live Conference Recommendations

Percona Live MySQL Conference and Expo, April 22-25, 2013

While many attendees are repeat offenders, if 2013 is your first MySQL conference and you are relatively new with MySQL (say < 2 years experience), it can be daunting to determine which of the 8 or more concurrent sessions you should attend during the conference.

Here are my top recommendations that give you a good grounding in the various conference topics and a wealth of experience from known MySQL authorities, on important topics.

  1. A backup today saves you tomorrow by Ben Mildred at Pythian. Losing your data is a terrible experience. Learn what is needed to keep your data safe and you system highly available.
  2. Survey of Percona Toolkit: Command-Line tools for MySQL by Daniel Nichter at Percona. There are a wealth of additional MySQL tools that any resource should be familiar with. These are some of the most popular.
  3. Script It. Make Professional DBA tools out of nothing by Giuseppe Maxia at Continuent. I seasoned expert in the MySQL field, his expertise is invaluable to learn how to use MySQL effeciently. Giuseppe is also the creator of MySQL Sandbox, a huge productivity tool for developers.
  4. Practices for reducing MySQL database size by Yoshinori Matsunobu at Facebook. This is a consulting technique I use for great advantage with clients to improve performance. Yoshinori is also one of the most popular technical speakers at events.
  5. MHA: Getting started and moving past the quirks by Colin Charles at Monty Program. Creating a HA environment is essential for any successful application. MHA is one open source approach that should be considered.
  6. Managing data and data archiving using MySQL 5.6 new features of portable tablespace and exchange partition by Marco tusa at Pythian. If there was one footnote feature in MySQL 5.6 that has a huge benefit, this is the feature. As data continues to grow rapidly in size, archiving is more important.

This year’s conference talks are organized by topic and skill level. This can also help you find talks specific to your needs. Topics include the following:

  • Developing Applications
  • Tools
  • Best Practices for Businesses
  • Database Administration
  • Utilizing Hardware
  • Replication and High Availability Strategies
  • Treads in Architecture and Design
  • New Features

2013 is sure to be a great event, with a lineup of many MySQL product features for the MySQL ecosystem.

When is the error log filename not the right filename

When evaluating a MySQL system one of the first things to look at is the MySQL error log. This is defined by the log[_-]error variable in the MySQL Configuration file. Generally found like:

grep log.error /etc/my.cnf
log_error=/var/lib/mysql/logs/mysql_error_log
log-error=/var/lib/mysql/logs/mysql_error_log

It is possible to find multiple rows because this could be defined in the [mysqld] and [mysqld_safe] sections. It is also possible it is incorrectly defined twice in any given section.

Immediately I see a problem here, and the following describes why. If you look at this file name, in this case it’s actually found, but the file is empty.

$ ls -l /var/lib/mysql/logs/mysql_error_log
-rw-r----- 1 mysql mysql 0 Feb 19 20:35 /var/lib/mysql/logs/mysql_error_log

An error log should never exist and be empty, because starting the instance producing messages. An error log could be empty because the system does a log rotate (BTW, never rotate the error log, see The correct approach to rolling MySQL logs). However it is empty in this case because MySQL is not writing to the error log filename as defined, because it does not have file extension. NOTE: there is no .log or similar extension. Looking more closely.

$ ls -l /var/lib/mysql/logs/mysql_error_log*
-rw-r----- 1 mysql mysql      0 Feb 19 20:35 /var/lib/mysql/logs/mysql_error_log
-rw-rw---- 1 mysql root  394530 Feb 19 20:35 /var/lib/mysql/logs/mysql_error_log.err

As you can see, MySQL has overwritten your parameter and given the file an extension, as verified by the runtime value.

mysql> show global variables like 'log_error';
+---------------+-----------------------------------------+
| Variable_name | Value                                   |
+---------------+-----------------------------------------+
| log_error     | /var/lib/mysql/logs/mysql_error_log.err |
+---------------+-----------------------------------------+

So, the tip is, always use .log for your MySQL error (and slow query log).

NOTE: Placing the log files in the MySQL datadir (which defaults to /var/lib/mysql) is also a bad idea. A topic for another blog post at a later time.

The MySQL symlink trap

Many users of MySQL install and use the standard directories for MySQL data and binary logs. Generally this is /var/lib/mysql.
As your system grows and you need more disk space on the general OS partition that commonly holds /tmp, /usr and often /home, you create a dedicated partition, for example /mysql. The MySQL data, binary logs etc are then moved to this partition (hopefully in dedicated directories). For example data is placed in /mysql/data.
Often however, a symbolic link (symlink) is used to so MySQL still refers to the data in /var/lib/mysql.

When it comes to removing the symlink and correctly configuring MySQL, you first stop MySQL and correctly defining the datadir my.cnf variable to point to the right location. However, MySQL still keeps the legacy directory information around and this will cause MySQL replication to fail in several ways when you attempt to restart your MySQL instance.

The binary log index, the relay log index, and the relay log info files all contain the legacy path. MySQL does not make it easy to also determine these actual files.

The relay_log_index variable defines the index, but defaults to [relay_log].index when not defined, so with SHOW GLOBAL VARIABLES this may be blank.
log-bin-index is an configurable option, but no matching global variable. It defaults to [log-bin].index.
relay_log_info will contain a value, generally only a file that is relevant to the data directory.

In these situations, your only option to to manually edit these files, specifying the new datadir (or log-bin) path in order to correctly remove symlinks.

The best advice, is to consider the design of your system first, and never place data in default locations if you feel this has to be modified later. Define those dedicated directories before you start using your MySQL instance.

Poor programming practices

When will it stop. These amateur programmers that simply cut/paste code really affect those good programmers in the ecosystem trying to make a decent living. I was reviewing a developed (but incomplete) PHP/MySQL system using a common framework (which in itself is irrelevant for this post).

In one source file there were 12 repetitions of the following code:

   //permissions
    $this->security_model->setUserPermissions($id);
    if (!array_key_exists($id,$this->session->userdata['permissions']) OR
	!array_key_exists('id', $this->session->userdata['permissions'][$id]) OR
	!array_key_exists('scope', $this->session->userdata['permissions'][$id]['name'])){
      $this->session->set_flashdata('alert', 'You are not authorized to go there.');
      redirect($this->agent->referrer());
    }

It’s bad enough when code is repeated and not put in a simple re factored function. When it’s repeated 12 times in one file, and OMG over 100 times in the product, that is a recipe for bugs, and high maintenance codes due to extremely poor coding practice.

Upgrading to MySQL 5.5 on Ubuntu 10.04 LTS

Ubuntu does not provide an apt-get repository package for MySQL 5.5 on this older OS, however this is still a widely used long term support version. The following steps will upgrade an existing MySQL 5.1 apt-get version to a standard MySQL 5.5 binary.

Step 1. Remove existing MySQL 5.1 retaining data and configuration

sudo su -
service mysql stop
cp -r /etc/mysql /etc/mysql.51
cp -r /var/lib/mysql /var/lib/mysql.51
which mysqld
dpkg -P mysql-server mysql-server-5.1 mysql-server-core-5.1
which mysqld
which mysql
dpkg -P mysql-client-5.1 mysql-client-core-5.1
which mysql
dpkg -P libdbd-mysql-perl libmysqlclient16 mysql-common
# This will not remove /etc/mysql if any other files are in the directory
dpkg -l | grep mysql
[ -d /etc/mysql ] && mv /etc/mysql /etc/mysql.uninstall
cp -r /etc/mysql.51 /etc/mysql

2. Prepare configuration and required directories.

sudo su -
MYCNF="/etc/mysql/my.cnf"
grep basedir ${MYCNF}
sed -ie "s/^basedir.*$/basedir=/opt/mysql/" ${MYCNF}
grep basedir ${MYCNF}
sed -ie "/^[mysqld_safe]/a
skip-syslog" ${MYCNF}
chown -R mysql:mysql /var/lib/mysql
mkdir -p /var/run/mysqld
sudo chown mysql:mysql /var/run/mysqld

Install MySQL 5.5

sudo su -
mkdir -p /opt
cd /opt
# Install MySQL 5.5 Binaries
apt-get install -y libaio-dev  # New 5.5 dependency
wget http://cdn.mysql.com/Downloads/MySQL-5.5/mysql-5.5.28-linux2.6-x86_64.tar.gz
tar xvfz mysql*.tar.gz
ln -s mysql-5.5.28-linux2.6-x86_64 /opt/mysql
echo "export MYSQL_HOME=/opt/mysql
export PATH=$MYSQL_HOME/bin:$PATH" > /etc/profile.d/mysql.sh
chmod +x /etc/profile.d/mysql.sh
. /etc/profile.d/mysql.sh
echo $MYSQL_HOME
echo $PATH
which mysql

4. Upgrade and verify MySQL Instance

su - mysql
cd $MYSQL_HOME
bin/mysqld_safe --skip-syslog &
tail /var/log/mysql/error.log
# There will be some expected ERRORS in error log
bin/mysql_upgrade -uroot
bin/mysqladmin -uroot  shutdown
bin/mysqld_safe --skip-syslog &
tail -100 /var/log/mysql/error.log
mysql -uroot -e "SELECT VERSION();"
bin/mysqladmin -uroot  shutdown
exit

5. Setup MySQL for system use

# As Root
sudo su -
INIT="/etc/init.d/mysqld"
cp /opt/mysql/support-files/mysql.server ${INIT}
sed -ie "s/^basedir=$/basedir=/opt/mysql/;s/^datadir=$/datadir=/var/lib/mysql/" ${INIT}
${INIT} start
mysql -uroot -e "SELECT VERSION();"
${INIT} stop

Not a cool new feature for Master_Host

I was surprised to find on a customer MySQL server this new syntax for Master_host in SHOW SLAVE STATUS.

mysql> SHOW SLAVE STATUSG
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: testdb1.xxx.com or 10.XXX.XX.XXX
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1-354215-bin-log.000005
          Read_Master_Log_Pos: 1624
               Relay_Log_File: db2-354214-relay-log.000001
   

Is this a fancy new Percona Server feature? No. It’s operator error.

We read a little further to find.

mysql> SHOW SLAVE STATUSG
...
             Slave_IO_Running: Connecting

...
                Last_IO_Errno: 2005
                Last_IO_Error: error connecting to master 'repl@ testdb1.xxx.com or 10.XXX.XX.XXX' - retry-time: 60  retries: 86400

How can this be created.
Using MySQL MHA, you get the following message in the output of commands to manage replication.

...
Thu Jan  3 17:06:40 2013 - [info]  All other slaves should start replication from here.
Statement should be: CHANGE MASTER TO MASTER_HOST='testdb1.xxx.com or 10.XXX.XX.XXX', MASTER_PORT=3306,
 MASTER_LOG_FILE='db1-354215-bin-log.000005', MASTER_LOG_POS=1624, MASTER_USER='repl',
MASTER_PASSWORD='xxx';
...

Needless to say, this syntax was taken literately, and MySQL did not complain.

I would suggest here that while MySQL does not do any validation on the value of the MASTER_HOST value in the CHANGE MASTER TO command to ensure it is resolvable it should at least do some validation to ensure the value is either a DNS entry or an IPV4,IPV6 value, that is space ‘ ‘, is not a valid character in these situations.

Installing MySQL MHA with Percona Server

MySQL MHA by Oracle ACE Director Yoshinori Matsunobu is an excellent open source tool to help in providing HA with native MySQL replication. The installation however is dependent on some Perl packages and to the untrained eye this may be an issue if you are using Percona Server as your choice of MySQL implementation.

The MHA Node page requires the perl-DBD-MySQL package to be installed. The installation on RedHat/CentOS/Oracle Linux look like this:

$ sudo yum install perl-DBD-MySQL
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.anl.gov
* extras: mirror.anl.gov
* updates: mirror.anl.gov
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package perl-DBD-MySQL.x86_64 0:3.0007-2.el5 set to be updated
--> Finished Dependency Resolution

However, recently with a client the following error “mysql conflicts with Percona-Server” occurred.

$ sudo yum install perl-DBD-MySQL
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.anl.gov
* extras: mirror.anl.gov
* updates: mirror.anl.gov
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package perl-DBD-MySQL.x86_64 0:3.0007-2.el5 set to be updated
--> Processing Dependency: libmysqlclient.so.15(libmysqlclient_15)(64bit) for package: perl-DBD-MySQL
--> Processing Dependency: libmysqlclient.so.15()(64bit) for package: perl-DBD-MySQL
--> Running transaction check
---> Package mysql.x86_64 0:5.0.95-1.el5_7.1 set to be updated
--> Processing Conflict: mysql conflicts MySQL
--> Finished Dependency Resolution
mysql-5.0.95-1.el5_7.1.x86_64 from updates has depsolving problems
 --> mysql conflicts with Percona-Server-client-55
Error: mysql conflicts with Percona-Server-client-55

This is actually an easy problem to solve, but if you are unfamiliar with all the various MySQL packages for RPM’s, and you installed Percona Server with the standard documented commands you may get lost. What is necessary is the installation of the compatibility libraries, as simple as:

$ sudo yum install -y Percona-Server-shared-compat

SQLSTATE[HY000]: General error: 2006 MySQL server has gone away

This would have to be one of the most common MySQL error messages that is misleading to the end user developer. The MySQL Manual page confirms the broad range of possible conditions, but offers little to a PHP developer that does not speak MySQL Geek. I am commonly asked to help solve this issue from a developer.

The problem is that there are several conditions that can cause this error, and a more meaningful explanation to the end user would help in addressing the issue. In general terms, this actually means “Your SQL statement has failed because the connection to the database has been disconnected because of ???”.

Here are a few common situations and how to check for what “???” is.

1. Your MySQL server really did go away.

We can easily check this by looking at the server uptime and the server error log.

$ mysql -uroot -p -e "show global status like 'uptime';"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Uptime        | 68928 |
+---------------+-------+
1 row in set (0.04 sec)
$ tail /var/log/mysql/error.log
130101 22:22:30 InnoDB: Initializing buffer pool, size = 256.0M
130101 22:22:30 InnoDB: Completed initialization of buffer pool
130101 22:22:30 InnoDB: highest supported file format is Barracuda.
130101 22:22:30 InnoDB: 1.1.8 started; log sequence number 63444325509
130101 22:22:30 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
130101 22:22:30 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
130101 22:22:30 [Note] Server socket created on IP: '127.0.0.1'.
130101 22:22:30 [Note] Event Scheduler: Loaded 0 events
130101 22:22:30 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.28-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

In both these cases, the server has been up some time, and there are zero error messages to indicate problems.

If the MySQL server did go away, was it shutdown or did it crash? The MySQL error log will provide the answers. Generally the mysql daemon (mysqld) will be restarted by the mysqld_safe wrapper process.

2. The connection timed out

$ mysql -uroot -p -e "show global variables like '%timeout';"
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 30       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
| net_write_timeout          | 60       |
| slave_net_timeout          | 3600     |
| wait_timeout               | 28800    |
+----------------------------+----------+

These values are relatively sane MySQL defaults. If however you have very short timeouts, you may get this error. Here is just one example.

mysql> SET SESSION wait_timeout=5;

## Wait 10 seconds

mysql> SELECT NOW();
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    132361
Current database: *** NONE ***

+---------------------+
| NOW()               |
+---------------------+
| 2013-01-02 11:31:15 |
+---------------------+
1 row in set (0.00 sec)

3. Your SQL statement was killed

Some systems will proactively kill SQL statements that have been running too long. You can easily check if this may be happening proactively by looking at how many KILL statements have been executed.

$ mysql -uroot -p -e "show global status like 'com_kill'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_kill      | 0     |
+---------------+-------+

Not killed this time.

4. Your SQL statement was too large.

A little harder to test and verify, but MySQL uses a maximum packet site for communications between the server and the client. If this includes large fields (for example BLOB columns), you may be getting a termination of your SQL statement due to size.

By default this is relatively small.

mysql> show global variables like 'max_allowed_packet';
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 1048576 |
+--------------------+---------+
1 row in set (0.00 sec)

You can increase, for example to 16M with:

mysql> set global max_allowed_packet=1024*1024*16;
mysql> show global variables like 'max_allowed_packet';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16777216 |
+--------------------+----------+
1 row in set (0.00 sec)

The good news, is this was the cause for the customer today, and now no more errors!

Be sure to keep this value during MySQL restarts.

#my.cnf
[mysqld]
max_allowed_packet = 16M

Open Source Database Schemas

I am seeking the help of the community. I am working on an evaluation project about schema design in open source applications. While it’s possible for me to download the software of many popular apps, and install the software and then do a mysqldump, it takes time. Quite often there is no simple schema.sql file, but a process for creating the schema. If you are using an open source project, would you take a moment and run the following.

$ mysqldump -u[user] -p --skip-lock-tables --no-data --databases [schema]  > [schema].sql

This will only dump the table definitions, and should therefore contain nothing company specific. I have at this time:

  • WordPress
  • Drupal
  • Mediawiki
  • OS Commerce
  • Joomla
  • EzPublish
  • PHPWiki

I am open to any projects, and it doesn’t matter if the version is not the most current, what I am seeking is to understand trends. There is a huge list of possibilities including Bugzilla, PHPBB, SugarCRM, Magento, PHPWiki just to name a few.

Your help would be greatly appreciated. You are welcome to add a a comment, or email me at [this domain] with your results.

I am also happy to accept sanitized schemas of any projects, however please ensure no company or propriety specific information is provided.

Controlled failover simplicity with MySQL

As part of a recent engagement, I described the relative products to manage a MySQL pair (i.e. an Active/Passive MySQL masters configuration). This included the steps to undertake a controlled failover for supporting software maintenance using manual procedures. The upcoming Effective MySQL: Replication Techniques in Depth book details each step and all conditions to review over a dozen pages. While the steps are straightforward and generally well known, scripting this for your environment takes a certain amount of work to ensure your information is correct, and application connectivity loss is kept to a minimum.

In Continuent Tungsten (which I have just been reviewing these past few weeks), I achieved the same result with a single command.

$ echo "switch" | /opt/continuent/tungsten/tungsten-manager/bin/cctrl

In addition to all the checks and balances to ensure data is consistent and no information can be lost, Continuent Tungsten Connectors ensure ALL database connections are maintained, i.e. they are not dropped. This is ideal for an application that uses persistent connections, e.g. Java applications. This is a feature that other options do not provide. This command also supports additional read slaves with no additional work.

This post has been in draft for a while, the current chatter on disasters, controlled and uncontrolled failover, and the disaster preparedness for your company information is a very important. There are several options for implementing a more highly available (HA) MySQL solution depending on your business continuity requirements.

The fine print

Here is the full output of the command, between 3 servers, alpha, beta and gamma.

$ echo "switch" | /opt/continuent/tungsten/tungsten-manager/bin/cctrl
Tungsten Enterprise 1.5.2 build 69
connect to 'nyc@alpha'
nyc: session established
[LOGICAL] /nyc > switch
SELECTED SLAVE: beta@nyc
PURGE REMAINING ACTIVE SESSIONS ON CURRENT MASTER 'alpha@nyc'
PURGED A TOTAL OF 0 ACTIVE SESSIONS ON MASTER 'alpha@nyc'
FLUSH TRANSACTIONS ON CURRENT MASTER 'alpha@nyc'
PUT THE NEW MASTER 'beta@nyc' ONLINE
PUT THE PRIOR MASTER 'alpha@nyc' ONLINE AS A SLAVE
RECONFIGURING SLAVE 'gamma@nyc' TO POINT TO NEW MASTER 'beta@nyc'
SWITCH TO 'beta@nyc' WAS SUCCESSFUL
[LOGICAL] /nyc >
Exiting...

Joining the Continuent Team

This month I have joined the team at Continuent. No stranger to the MySQL ecosystem, Continuent provides replication and clustering technology for managing data between MySQL, Oracle, PostgreSQL, Vertica and a growing list of data stores.

I have known many of the team at Continuent for some time, and will again be joining Giuseppe Maxia from our days at MySQL Inc/AB starting back in 2006.

I am looking forward to taking the hard work out of administration of MySQL systems with the simplicity of Continuent Tungsten, simplifying tasks including automatic failover, multi-master and geo cluster redundancy to a single command.

Catch me speaking at the upcoming MySQL Connect (San Francisco) and Percona Live (New York) conferences, and where the third book of the Effective MySQL Series Replication Techniques in Depth will also be available for sale.

Looking for MySQL 4.1

I had need today to download a version of MySQL 4.1 to test something. The MySQL Developer Zone archives no longer provides any software before 5.0.

While this may have long reached EOL and is no longer support, customers still do run this version of MySQL.

Anybody that can help out with binaries (on several OS’s), it would be appreciated.

SPOF Internet

SPOF (i.e. Single Point of Failure) is the bane for technologists. Avoiding SPOF generally requires redundancy, and redundancy has a cost, often more then a business is prepared to pay. In the database field, I see this regularly and advise clients on how to improve availability and potential avoid disasters that can affect their business.

Today, at approximately 10:30am, the Con Edison work crew in front of my home (digging a 5″ deep trench down the road), severed multiple Time Warner Cable fibre connections. ($#&* and the lack of ownership to correct timely is another story). No Internet, no ability to work actively with clients (which I was doing), etc, etc.

As an individual that works from home, I have recognized this SPOF and have redundancy in place. That is, a Verizon MiFi HotSpot, normally used for travel, but a backup in times of Internet downtime to my home. The moral here is, one level of redundancy is often not enough, just as MySQL replication is not a backup solution, only one part thereof, my backup redundancy was in maintenance mode (in this case loaned to a good overseas friend that is visiting and traveling in the US). Disaster often strikes unexpectedly, and often causes multiple failures, this being an example of a cascading failure in my Internet redundancy procedures.

The fact that I am writing this post, shows that I have a second backup, that is a portable WiFi hotspot on my T-Mobile phone. It’s not great, but it is an emergency.

This is not a satisfactory solution long term. My first estimate for repairs was September 17th. Even after stressing that was unsatisfactory, my second estimate is still unacceptable for my business.

Amen, to co-working spaces in New York. The ability to work at a location for a fixed daily/weekly/monthly cost. It pays to know where they are, and have an informal relationship for such an emergency. While inconvenient, I can take a laptop and have power and Internet to work for some core hours in the day, again far less then ideal.

For those that have actually decided to read this far, the moral of the story is this. What is your plan when this happens to your business connection, or your primary MySQL database server? What is acceptable downtime, and how to address correcting issues outside of your control (e.g. An explosion in a data center taking out 15,000 servers, which has happened to me, or damage to the 5 servers you have, all in a single rack). With practically every client, there is not a defined plan in the event of a disaster. There should be.

MySQL client password security

In case you missed it, MySQL 5.6.6, also known as Milestone 9, was recently released. I have yet to install this, however just one part of the MySQL 5.6.6 Release Notes makes placing installing and testing high on my TODO list.

Updated 20 Sep, 2012. Be sure to also read Todd’s post Understanding mysql_config_editor’s security aspects about a more in-depth and accurate description of this new feature. In summary, “It makes secure access via MySQL client applications easier to use”.

That is the reported improvements in password management. From the release notes:

Security Improvements

These security improvements were implemented:

MySQL now provides a method for storing authentication credentials securely in an option file named .mylogin.cnf. To create the file, use the mysql_config_editor utility. The file can be read later by MySQL client programs to obtain authentication credentials for connecting to a MySQL server. mysql_config_editor writes the .mylogin.cnf file using encryption so the credentials are not stored as clear text, and its contents when decrypted by client programs are used only in memory. In this way, passwords can be stored in a file in non-cleartext format and used later without ever needing to be exposed on the command line or in an environment variable. This improves security for interactive use of MySQL client programs, as well as security for noninteractive tasks that require a MySQL password from a file. For more information, see Section 4.6.6, “mysql_config_editor — MySQL Configuration Utility”.

The .mylogin.cnf file can contain multiple sets of options, known as “login paths.” To specify which option group to use from the .mylogin.cnf file for connecting to the server, use the –login-path option. See Section 4.2.3.4, “Command-Line Options that Affect Option-File Handling”.

There are additional improvements and modifications around encryption. Well worth reading about in MySQL 5.6.6 Release Notes.

When is a crashing MySQL bug not a bug?

Answer: When Oracle acknowledges the bug in 5.5.25 (to the owner only), corrects the bug in 5.5.27 (to the owner only), yet hides all information of its existence.

Recently a colleague and good friend discovered a bug in MySQL 5.5 replication that would crash MySQL. This was initially reported as Bug #65740, and after a lot of back and forth, a reproducible test case was found. Excellent work on the part of my colleague to spend the time to clearly identify the specific conditions. I remember looking at this initial thread in detail for an UPDATE statement using variables combined with an –ignore-database configuration option.

For no explanation by Oracle, this bug was subsequently marked as private (after I originally viewed the thread publicly), corrected, and the corrected bug is not referenced in the 5.5.27 Release Notes, yet the private bug clearly states.

[20 Jul 11:59] XXXX
Fixed in 5.1.65/5.5.27.

The reason why I raise this concern is due to the lack of consistency. Why this was not included in the Release Notes is not acceptable in my view. However, the installation of 5.2.25 was needed to address another obscure Bug #45670 that was corrected and remains open to the MySQL Community.

I do not know of [yet] a public statement by Oracle that details why certain information about MySQL is open, and certain information is closed.

Recent Presentations in Cali, Colombia

On July 4 I gave two presentations at the OTN Tour Day, and on July 5 I have three presentations at the MySQL Training Days. This was my 3rd visit to Colombia and it was great to see a receptive audience. Thanks to Robin for organizing the events in 2010, 2011 and 2012.

You can download all presentations from the provided links.

New security fixes for MySQL reported

6 new security fixes for Oracle MySQL have been detailed in the most current Oracle Critical Patch Update (CPU).

These are:

  • CVE-2012-1735 (5.5.23 and earlier)
  • CVE-2012-0540 (5.1.62 and earlier, 5.5.23 and earlier)
  • CVE-2012-1757 (5.5.23 and earlier )
  • CVE-2012-1756 (5.5.23 and earlier)
  • CVE-2012-1734 (5.1.62 and earlier, 5.5.23 and earlier )
  • CVE-2012-1689 (5.1.62 and earlier, 5.5.22 and earlier )

Oracle strongly recommends that customers apply CPU fixes as soon as possible. Unfortunately there is no easy description for MySQL users what that really entails. There is a reference to Critical Patch Update July 2012 Patch Delivery Document for Oracle Sun Products Suite My Oracle Support Note 1446033.1, however all the information is behind having a support license. There appears to be no information easily available for the community users.

A full description of these CVEs can be found here. Unfortunately most say Vulnerability in the MySQL Server component of Oracle MySQL (subcomponent: Server). Supported versions that are affected are 5.5.23 and earlier. Easily exploitable vulnerability allows successful authenticated network attacks via multiple protocols. Successful attack of this vulnerability can result in unauthorized ability to cause a hang or frequently repeatable crash (complete DOS) of MySQL Server. which is effectively useless information.

There is external information that can be found at the National Vulnerability Database (not linked in the Oracle article). For example http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2012-1735 however this does not provide any more meaningful information either.

There was a recent 5.5.25a released on 2012-07-05, however this, 5.5.25, 5.5.24, and the yet to be released 5.5.26 release notes provide no information about these security issues.

While security is important for a database and system administrator, on first inspection the information provided does not offer an easy way to assess the risk and take appropriate actions.

More information at

I will be speaking at Percona Live New York

Percona Live New York City, October 1 - 2, 2012
Percona is back for a second New York Percona Live Conference. As the resident New York MySQL Expert, I will again be presenting. My session will be on MySQL Backup and Recovery Essentials.

You can only present so much in one hour, and this presentation just touches on the highlights of what is possible. More detailed information about the right backup and recovery strategy and associated tools is available in my current book Effective MySQL: Backup and Recovery.

Encrypting your MySQL backups and more

Assuming you have a backup and recovery strategy in place, how secure is your data? Does a hacker need to obtain access to your production system bypassing all the appropriate security protection you have in place, or just the unencrypted data on the backup server?

Encryption with zNcrypt

The following steps demonstrate how I setup a mysqldump encrypted backup with zNcrypt, a product from Gazzang. You can request a free trial evaluation of the software from http://gazzang.com/request-a-trial. I asked for a AWS EC2 instance, and was able to provide my bootstrap instructions for OS and MySQL installation. Following installation and configuration, the first step is to verify the zNcrypt process is running:

$ sudo ezncrypt-service status
  ezncrypt | Checking system dependencies
** ezncrypt system is UP and running **
       log | File: /var/log/ezncrypt/ezncrypt-service.log

If the process is not running you would find the following error message:

$ sudo ezncrypt-service status
  ezncrypt | Checking system dependencies
** ezncrypt system is NOT running **
       log | File: /var/log/ezncrypt/ezncrypt-service.log

$ sudo ezncrypt-service start
  ezncrypt | Checking system dependencies
  ezncrypt | checking encryption directories
    keymgr | Retrieving key from KSS
           |  > Encryption password retrieved from KSS
  ezncrypt | starting service
           |  > using "aes_256" cipher algorithm
           | done!
    access | Loading access control list
           | done!
  ezncrypt | Thank you for using ezncrypt.
       log | File: /var/log/ezncrypt/ezncrypt-service.log

Under the covers you will find the following attached devices, and no actual processes.

$ df -h
Filesystem ...
...
/var/lib/ezncrypt/storage/encrypted_private
/var/lib/ezncrypt/ezncrypted

$ ps -ef | grep ezn
uid  4947  3327  0 23:15 pts/3    00:00:00 grep ezn

$ ps -ef | grep cry
root        30     2  0 21:41 ?        00:00:00 [ecryptfs-kthrea]
root        31     2  0 21:41 ?        00:00:00 [crypto]
uid  4951  3327  0 23:15 pts/3    00:00:00 grep cry

The first step is to create a backup directory and encrypt all contents that are placed in the directory. ezNcrypt uses the concept of an @category for reference with an encrypted file or directory.

$ mkdir /mysql/backup/encrypted
$ sudo ezncrypt --encrypt @backup /mysql/backup/encrypted
  ezncrypt | Checking system dependencies
           | Verifying ezncrypt license
           | getting information about location
           |   > path: /var/lib/ezncrypt/ezncrypted/backup
  ezncrypt | Checking encryption status
           | done!
    keymgr | Retrieving key from KSS
           |  > Encryption password retrieved from KSS
           | generating keys
           | done!
    backup | backing up data
           | This can take a while. Please be patient
           |  > backing up /mysql/backup/encrypted
           |  > File: /opt/ezncrypt/backup/2012-04-27/encrypted.tar.gz
           | done!
  ezncrypt | encrypting files
           |  > checking disk space
           |  > encrypting /mysql/backup/encrypted
           | done!
  ezncrypt | congratulations. you have encrypted your Files!!
       log | File: /var/log/ezncrypt/ezncrypt.log

The underlying regular directory is now replaced:

$ ls -l /mysql/backup
total 0
lrwxrwxrwx 1 root root 59 2012-04-27 00:03 encrypted -> /var/lib/ezncrypt/ezncrypted/backup//mysql/backup/encrypted

Any attempts to write to this encrypted directory will now fail, even with the Linux super user:

$ mysqldump --all-databases > /mysql/backup/encrypted/edump1.sql
-bash: /mysql/backup/encrypted/edump1.sql: Permission denied

$ sudo mysqldump --all-databases > /mysql/backup/encrypted/edump1.sql
-bash: /mysql/backup/encrypted/edump1.sql: Permission denied

In order to read and write from an encrypted directory you need to grant access controls to a given program, for example mysqldump:

$ sudo ezncrypt-access-control -a "ALLOW @backup * /usr/bin/mysqldump"
passphrase:
salt:
Rule added

You verify the defined access control rules with:

$ sudo ezncrypt-access-control -L
passphrase:
salt:
# -  Type     Category       Path    Process
1    ALLOW    @backup        *       /usr/bin/mysqldump

However, writing with mysqldump still causes an error because it is the shell redirection that is performing the writing, as seen in the system error log:

$ mysqldump --all-databases > /mysql/backup/encrypted/edump1.sql
-bash: /mysql/backup/encrypted/edump1.sql: Permission denied
$ dmesg | tail
[4138848.618559] ezncryptfs: DENIED type="acl" exec="/bin/bash" script="/dev/pts/4" comm="bash" path="/var/lib/ezncrypt/ezncrypted/backup" pid=7448 uid=1000

You can use the –result-file option with mysqldump to enable the process to create the file directly. For example:

$ time mysqldump --all-databases   --result-file=/mysql/backup/encrypted/edump2.sql
real      1m34.714s
user      0m59.388s
sys       0m9.589s

$ sudo ezncrypt-run "ls -l /mysql/backup/encrypted/"
passphrase:
salt:
total 3.0G
-rw-rw-r-- 1 uid gid 2.9G 2012-04-27 02:43 edump2.sql

In this single test, the transparent encryption added only a very nominal overhead to the mysqldump test backup used. You can easily extract the file from the encrypted directory, however that would defeat the purpose of using encryption. The following syntax is shown just to confirm the validity of the encrypted file:

$ sudo /usr/sbin/ezncrypt-run "cp /mysql/backup/encrypted/edump2.sql ."
passphrase:
salt:
$ ls –al edump*
total 3916
-rw-r--r-- 1 uid gid 2.9G 2012-04-27 02:55 edump2.sql


$ grep "^CREATE.*DATABASE" edump2.sql
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `book2` /*!40100 DEFAULT CHARACTER SET latin1 */;
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `employees` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `musicbrainz` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `mysql` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `sakila` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `world_innodb` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `world_myisam` ...

When using correctly configured directories and access controls, the use is truly transparent to the backup process.

Restoring an encrypted file is a little more involved. The best approach is to create a script to perform the work, than encrypt this script. When executed, this script will have the permissions necessary to read and apply the encrypted file.

Perhaps the best tip about using this type of transparent encryption is that it is possible to encrypt the MySQL user and password securely in a plain text configuration file and used with appropriate MySQL client commands. This helps to address another common security problem.

What compression do you use?

The following is an evaluation of various compression utilities that I tested when reviewing the various options for MySQL backup strategies. The overall winner in performance was pigz, a parallel implementation of gzip. If you use gzip today as most organizations do, this one change will improve your backup compression times.

Details of the test:

  • The database is 5.4GB of data
  • mysqldump produces a backup file of 2.9GB
  • The server is an AWS t1.xlarge with a dedicated EBS volume for backups

The following testing was performed to compare the time and % compression savings of various available open source products. This was not an exhaustive test with multiple iterations and different types of data files.

Compression
Utility
Compression Time
(sec)
Decompression Time
(sec)
New Size
(% Saving)
lzo (-3) 21 34 1.5GB (48%)
pigz (-1) 43 33 995MB (64%)
pigz (-3) 56 34 967MB (67%)
gzip (-1) 81 43 995MB (64%)
fastlz 92 128 1.3GB (55%)
pigz [-6] 105 25 902MB (69%)
gzip (-3) 106 43 967MB (67%)
compress 145 36 1.1GB (62%)
pigz (-9) 202 23 893MB (70%)
gzip [-6] 232 78 902MB (69%)
zip 234 50 902MB (69%)
gzip (-9) 405 43 893MB (70%)
bzip2 540 175 757MB (74%)
rzip 11 minutes 360 756MB (74%)
lzo (-9) 20 minutes 82 1.2GB (58%)
7z 33 minutes 122 669MB (77%)
lzip 47 minutes 132 669MB (77%)
lzma 58 minutes 180 639MB (78%)
xz 59 minutes 160 643MB (78%)

Observations

  • The percentage savings and compression time of results will vary depending on the type of data that is stored in the MySQL database.
  • The pigz compression utility was the surprising winner in best compression time producing at least a size of gzip. This was a full 50% faster than gzip.
  • For this compression tests, only one large file was used. Some utilities work much better with many smaller files.

Find our more information of these tests and the results in Effective MySQL: Backup and Recovery