Speaking at Oracle Open World 2010


I will be one of 18 MySQL speakers at Oracle Open World 2010 at the first ever MySQL Sunday. With a great diversity of technical, community and product talks this will be a great opportunity to get a cross section of MySQL content. Combined with Java One, this year’s Oracle Open World will include a lot more opportunity of technical and developer content then the more regular marketing material.

As the program chair for the first dedicated MySQL track at last month’s ODTUG Kaleidoscope 2010 our MySQL community now includes a larger number of target people. From the Oracle community come many highly technical and skilled resources, some with an understanding or appreciation of MySQL and many that are new to MySQL.

This is a great opportunity to share our knowledge and experience with MySQL.

References

Optimizing SQL Performance – The Art of Elimination

The most efficient performance optimization of a SQL statement is to eliminate it. Cary Millsap’s recent Kaleidoscope presentation again highlighted that improving performance is function of code path. Removing code will improve performance.

You may think that it could be hard to eliminate SQL, however when you know every SQL statement that is executed in your code path obvious improvements may be possible. In the sequence SQL was implemented sometimes easy observations can lead to great gains. Let me provide some actual client examples that were discovered by using the MySQL General Log.

Example 1

5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist` WHERE (ArtistID = 196 )
5 Query   SELECT *  FROM `artist` WHERE (ArtistID = 2188 )
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`
5 Query   SELECT *  FROM `artist`

In this example, the following was executed for a single page load. Not only did I find a bug where full-table scans occurred rather then being qualified, there were many repeating and unnecessary occurrences.

Example 2

SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'
SELECT option_value FROM wp_options WHERE option_name = 'aiosp_title_format' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_show_only_even' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_num_months' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_day_length' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_hide_event_box' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_advanced' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_navigation' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'ec3_disable_popups' LIMIT 1
SELECT option_value FROM wp_options WHERE option_name = 'sidebars_widgets' LIMIT 1

This is a stock WordPress installation and highlights a classic Row at a Time (RAT) processing.

Example 3

SELECT * FROM activities_theme WHERE theme_parent_id=0
SELECT * FROM activities_theme WHERE theme_parent_id=1
SELECT * FROM activities_theme WHERE theme_parent_id=2
SELECT * FROM activities_theme WHERE theme_parent_id=11
SELECT * FROM activities_theme WHERE theme_parent_id=16

In this client example, again RAT processing, I provided a code improvement to run these multiple queries in a single statement, otherwise known as Chunk At a Time (CAT) processing. It’s not rocket science however the elimination of the network component of several SQL statements can greatly reduce page load time.

SELECT *
FROM   activities_theme
WHERE  theme_parent_id in  (0,1,2,11,16) 

Example 4

The following represents one of the best improvement. During capture, the following query was executed 6,000 times over a 5 minute period. While you make think this is acceptable, the value passed wae 0. The pages_id is an auto_increment column which by definition does not have a 0 value. In this instance, a simple boundary condition in the code would eliminate this query.

SELECT pages_id, pages_livestats_code, pages_title,
       pages_parent, pages_exhibid, pages_theme,
       pages_accession_num
FROM pages WHERE pages_id = 0

There are many tips to improving and optimizing SQL. This is the simplest and often overlooked starting point.

Related articles include: The RAT and the CAT and We need more CAT’s

What do MySQL Consultants do?

One role of a MySQL consultant is to review an existing production system. Sometimes you have sufficient time and access, and other times you don’t. If I am given a limited time here is a general list of things I look at.

  1. Review Server architecture, OS, Memory, Disks (including raid and partition type), Network etc
  2. Review server load and identify physical bottleneck
  3. Look at all running processes
  4. Look specifically at MySQL processes
  5. Review MySQL Error Log
  6. Determine MySQL version
  7. Look at MySQL configuration (e.g. /etc/my.cnf)
  8. Look at running MySQL Variables
  9. Look at running MySQL status (x n times)
  10. Look at running MySQL INNODB status (x n times) if used
  11. Get Database and Schema Sizes
  12. Get Database Schema
  13. Review Slow Query Log
  14. Capture query sample via SHOW FULL PROCESSLIST (locked and long running)
  15. Analyze Binary Log file
  16. Capture all running SQL

Here are some of the commands I would run.

2. Review server load and identify physical bottleneck

$ vmstat 5 720 > vmstat.`date +%y%m%d.%H%M%S`.txt

4. Look at MySQL processes

$ ps -eopid,fname,rss,vsz,user,command | grep -e "RSS" -e "mysql"
  PID COMMAND    RSS    VSZ USER     COMMAND
 5463 grep       764   5204 ronald   grep -e RSS -e mysql
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/vol/mysql/mysqldata --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld

$ ps -eopid,fname,rss,vsz,user,command | grep " mysqld " | grep -v grep | awk '{print $3,$4}'
4787820 5127208

5. Review MySQL Error Log

The error log can be found in various different places based on the operating system and configuration. It is important to find the right log, the SHOW GLOBAL VARIABLES LIKE ‘log_error’ will determine the location.

This is generally overlooked, however this can quickly identify some underlying problems with a MySQL environment.

7. Look at MySQL configuration

$ [ -f /etc/my.cnf ] &&  cat /etc/my.cnf
$ [ -f /etc/mysql/my.cnf ] &&  cat /etc/mysql/my.cnf
$ find / -name  "*my*cnf" 2>/dev/null

8. Look at running MySQL Variables

$ mysqladmin -uroot -p variables

9. Look at running MySQL status (x n times)

$ mysqladmin -uroot -p extended-status

It is important to run this several times at regular intervals, say 60 seconds, 60 minutes, or 24 hours.

I also have dedicated scripts that can perform this. Check out Log MySQL Stats.

11. Get Database and Schema Sizes

Check out my scripts on my MySQL DBA page

14. Capture Locked statements

Check out my script for Capturing MySQL sessions.

15. Analyze Binary Log file

Check out my post on using mk-query-digest.

16. Capture all SQL

Check out my post on DML Stats per table

Moving forward

Of course the commands I run exceeds this initial list, and gathering this information is only

Timing your SQL queries

When working interactively with the MySQL client, you receive feedback of the time the query took to complete to a granularity of 10 ms.

Enabling profiling is a simple way to get more a more accurate timing of running queries. In the following example you can see the time the kernel took to run an explain, the query, and alter, and repeat explain and query.


mysql> set profiling=1;
mysql> EXPLAIN SELECT ...
mysql> SELECT ...
mysql> ALTER ...
mysql> show profiles;
+----------+------------+-------------------------
| Query_ID | Duration   | Query
+----------+------------+-------------------------
|        1 | 0.00036500 | EXPLAIN SELECT sbvi.id a
|        2 | 0.00432700 | SELECT sbvi.id as sbvi_i
|        3 | 2.83206100 | alter table sbvi drop in
|        4 | 0.00047500 | explain SELECT sbvi.id a
|        5 | 0.00367100 | SELECT sbvi.id as sbvi_i
+----------+------------+-------------------------

More information at Show Profiles documentation page.

Improving MySQL Productivity – From Design to Implementation

My closing presentation at the dedicated MySQL track at ODTUG Kaleidoscope 2010 discussed various techniques and best practices for improving the ROI of developer resources using MySQL. Included in the sections on Design, Security, Development, Testing, Implementation, Instrumentation and Support were also a number of horror stories of not what to do, combined with practical examples of improving productivity.

MySQL Idiosyncrasies That Bite

The following are my slides that I presented at ODTUG Kaleidoscope 2010. This presentation talks about the MySQL defaults including a non-transactional state, silent data truncations, date management, and transaction isolation options. These are all critical for data integrity and consistency. I also cover in-depth topics including SQL_MODE, character sets and collations.

Still room at Kaleidoscope for MySQL attendees

Today I received notice that next week’s Velocity conference is at maximum capacity. With just under 2 weeks before the start of ODTUG Kaleidoscope in Washington DC we still have room for late registrations. There is 4 days of MySQL content, free events and also a Sunday Symposium that includes talks on performance and high availability.

Contact any of the MySQL speakers directly and you can receive a special 50% discount code. This is only for MySQL attendees.

If you live in the DC area and only want the FREE option then come along and join use on Monday night for a free session and reception.

ODTUG Kaleidoscope 2010
July 27 – July 1
Marriott Wardman Part Hotel
2660 Woodley Road NW
Washington, District Of Columbia 20008
www.odtugkaleidoscope.com

Conference highlights include

Community Service Day – Saturday, June 26, 8:00 a.m. – 1:00 p.m.
Join ODTUG volunteers and help refurbish a school in D.C.  Under the guidance of Greater DC Cares (GDCC), the leading and largest nonprofit coordinator of volunteerism in the D.C. region, ODTUGgers will: Sort books, beautify school grounds, and paint games on blacktop outside of hte school.

There is still time to sign up!  

Four Full-day Symposia – Sunday, June 27, 8:30 a.m. – 4:00 p.m.
Application Express; Oracle EPM and Essbase; Security, Scalability, and Performance; SOA and BPM. One-day registration available.

Welcome Reception/Battle of the Rock Bands – Sunday, June 27, 6:15 – 8:00 p.m.
Meet the exhibitors and compete in the “Battle of the Rock Bands.” Sign up to play.


Opening General Session – Monday, June 28, 8:30 – 10:00 a.m.
Awards for Best Technical Paper and Best 2009 Presentations
Keynote – “Future of the Internet and its Social Impact” by Lee Rainie, Director of the PEW Research Center’s Internet & American Life Project.
Sundown Sessions with Oracle ACE Directors – Monday, June 28, 5:45 – 6:45 p.m.
Reception to meet the Oracle ACE Directors immediately follows – 6:45 – 7:45 p.m.

Special Event – Wednesday, June 30, 6:30 – 10:00 p.m.
Featuring comedian John Heffron, 2nd season champion of the hit TV show, Last Comic Standing.
Music by live cover band, Right Foot Red

Oracle resources for the MySQL Community

While I have spent a lot of time recently helping the MySQL community interact with and integrate with various Oracle User Groups including ODTUG, IOUG, NoCOUG, NYOUG, DAOG I thought I’d share some resources for the MySQL Community that wanted to know more about Oracle.

The Oracle family of products is huge. You only have to look at the acquisitions via Wikipedia to get an idea. The first thing is to narrow your search, e.g. Database, APEX, Middleware, BI, Hyperion, Financials, development via Java, PHP or Oracle Forms etc.

While Oracle is a commercial product you can download all software for FREE via Oracle Technology Network. There is also documentation, forums, blogs and events.

Some Oracle bloggers I have already been reading however I’m expanding my list. People you may want to consider include:

Cary Millsap,Lewis Cunningham, Debra Lilley, Dimitri Gielis,Duncan Mills, Edward Roske, Mark Rittman, Scott Spendolini, Tim Tow, Tom Kyte

If you want a comparison of the Oracle and MySQL community, be sure to also check out Sheeri Cabral’s keynote address at the 2010 MySQL User Conference for reference.

When SET GLOBAL affects SESSION scope

We have all been caught out with using SET and not realizing that the default GLOBAL Scope (since 5.0.2) does not change the current SESSION scope.

I was not aware until today that changing GLOBAL scope has some exceptions that also automatically affect SESSION scope.

What I expected with a change in the GLOBAL scope is no affect SESSION scope. For example.

mysql> SHOW GLOBAL VARIABLES LIKE 'read_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| read_buffer_size | 131072 |
+------------------+--------+
1 row in set (0.00 sec)

mysql> SHOW SESSION VARIABLES LIKE 'read_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| read_buffer_size | 131072 |
+------------------+--------+
1 row in set (0.00 sec)

mysql> SET GLOBAL read_buffer_size=1024*256;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW GLOBAL VARIABLES LIKE 'read_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| read_buffer_size | 262144 |
+------------------+--------+
1 row in set (0.00 sec)

mysql> SHOW SESSION VARIABLES LIKE 'read_buffer_size';
+------------------+--------+
| Variable_name    | Value  |
+------------------+--------+
| read_buffer_size | 131072 |
+------------------+--------+
1 row in set (0.00 sec)

However I was no prepared for this when changing an important variable for transaction management.

mysql> SHOW GLOBAL VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
1 row in set (0.00 sec)

mysql> SET autocommit=0;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW GLOBAL VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

mysql> SHOW SESSION VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

However even more perplexing was the following message:

mysql> SET GLOBAL autocommit=0;
ERROR 1228 (HY000): Variable 'autocommit' is a SESSION variable and can't be used with SET GLOBAL

So this is another case were the definition of variables is not applicable in a GLOBAL level, yet the tools of the trade represent in some manner misleading information.
To prove my point, here is another new concurrent session started after the above.

mysql> SHOW GLOBAL VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
1 row in set (0.01 sec)

mysql> SHOW SESSION VARIABLES LIKE 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | ON    |
+---------------+-------+
1 row in set (0.00 sec)

The MySQL Documentation also had an incorrect specification with description regarding this scope, Bug #54215

Best Practices: Additional User Security

By default MySQL allows you to create user accounts and privileges with no password. In my earlier MySQL Best Practices: User Security I describe how to address the default installation empty passwords.

For new user accounts, you can improve this default behavior using the SQL_MODE variable, with a value of NO_AUTO_CREATE_USER. As detailed via the 5.1 Reference Manual

NO_AUTO_CREATE_USER

Prevent the GRANT statement from automatically creating new users if it would otherwise do so, unless a nonempty password also is specified.

Having set this variable I attempted to show the error of operation to demonstrate in my upcoming “MySQL Idiosyncrasies that bite” presentation.

Confirm Settings

mysql> show global variables like 'sql_mode';
+---------------+---------------------+
| Variable_name | Value               |
+---------------+---------------------+
| sql_mode      | NO_AUTO_CREATE_USER |
+---------------+---------------------+
1 row in set (0.00 sec)

mysql> show session variables like 'sql_mode';
+---------------+---------------------+
| Variable_name | Value               |
+---------------+---------------------+
| sql_mode      | NO_AUTO_CREATE_USER |
+---------------+---------------------+
1 row in set (0.00 sec)

Create error condition

mysql> CREATE USER superuser@localhost;
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL ON *.* TO superuser@localhost;
Query OK, 0 rows affected (0.00 sec)
mysql> exit

What the? Surely this isn’t right.

$ mysql -usuperuser

mysql> SHOW GRANTS;
+--------------------------------------------------------+
| Grants for superuser@localhost                         |
+--------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'superuser'@'localhost' |
+--------------------------------------------------------+

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.1.39    |
+-----------+

Well that’s broken functionality.

What should happen as described in Bug #43938 is a cryptic message as reproduced below.

mysql> GRANT SELECT ON foo.* TO 'geert12'@'localhost';
ERROR 1133 (42000): Can't find any matching row in the user table
mysql> GRANT SELECT ON *.* TO geert12@localhost IDENTIFIED BY 'foobar';
Query OK, 0 rows affected (0.00 sec)

It seems however that the user of CREATE USER first nullifies this expected behavior, hence new Bug #54208.

Eventually consistent Group Commit

Having just written an interview response about NoSQL concepts for a RDBMS audience it was poetic that an inconspicuous title “(4 of 3)” highlights that both a MySQL read scalable implementation via replication and a NoSQL solution can share a common lack of timely consistency of data. For the sake of Group Commit I hope my data is always consistent at some location at some point in time as soon as possible.

In attempting to comment to Kristian Nielsen’s Fixing MySQL group commit (part 4 of 3) I was forced to watch an ad before I could even add a comment. Go jump Live Journal, it’s quicker to write my own blog post.

And if anybody is still reading, I had just written the following.

“There is clearly a place for NoSQL solutions. The two primary types of products are a key/value store and a schema-less solution. You need to learn the strengths, benefits and weaknesses of both. For a RDBMS resource the lack of transactions, the lack of joins and the concept of eventually consistent can take some time to accept.”

mk-query-digest Tips – Showing all hosts & users

The Maatkit tools provide a suite of additional MySQL commands. There is one command I use constantly and that is mk-query-digest.

Unfortunately the documentation does leave a lot to be desired for usability. While throughout, it is a man page and not a user guide. Several of us have discussed writing better documentation however it’s always a matter of time. I have however learned a number of tips and I’d like to share them in smaller digests.

The first is showing additional display. Maatkit works on truncating per line output to a reasonable length of 73 characters?

One of those lines is the list of hosts that connected to MySQL for a query, for example.

# Hosts                  4 192.168.40... (2), 192.168.40... (2)... 2 more
# Hosts                  3 99.99.245.14 (12), 999.106.206.167 (6)... 1 more

The problem is I want to know what that 1 more is so I can gather a complete list of IP addresses that connect to this server. You do that with the –show-all=host argument.

Without

$ cat external.tcpdump | ./mk-query-digest --type tcpdump | grep Hosts | uniq -c
#
      1 # Hosts                  3 99.99.245.14 (12), 999.106.206.167 (6)... 1 more
      1 # Hosts                  1 99.99.139.140

With

$ cat external.tcpdump | ./mk-query-digest --type tcpdump --show-all=host | grep Hosts | uniq -c
      1 # Hosts                  3 99.99.245.14 (12), 999.106.206.167 (6), 99.99.139.140 (2)
      1 # Hosts                  1 99.99.139.140

You can apply the same principle to the Users as well with –show-all=user

$ cat external.tcpdump | ./mk-query-digest --type tcpdump  --show-all=user | grep Users | uniq -c
      1 # Users                  2 xxx (13), phpmysqlmo... (5)
     49 # Users                  1  xxx

The problem is a still gett a truncation of the name ‘phpmysqlmo…’ That’s the one thing I’m trying to uncover, because that IP and usernme are not valid permissions for this system.

tcpdump errors on FreeBSD for mk-query-digest

While I use this tcpdump command for MySQL query analysis with mk-query-digest, I found recently that it didn’t work on FreeBSD

$ tcpdump -i bge0 port 3306 -s 65535 -x -n -q -tttt -c 5
tcpdump: syntax error

It left me perplexed and reading the man page seemed to indicate my options were valid. I tried a few variances just to be sure without success.

$ tcpdump -i bge0 -c 5 port 3306 -x
tcpdump: syntax error
$ tcpdump -i bge0 -c 5 port 3306 -q
tcpdump: syntax error
$ tcpdump -i bge0 -c 5 port 3306 -tttt
tcpdump: syntax error

The solution was actually quite simple in the end, it had nothing to do with the commands, it had everything to do with the order of them. Placing port as the last option solved the problem.

$ tcpdump -i bge0 -s 65535 -x -n -q -tttt -c 5  port 3306
$ uname -a
FreeBSD db4.example.com 6.3-RELEASE-p3 FreeBSD 6.3-RELEASE-p3 #0: Wed Jul 16 05:13:50 EDT 200

MySQL Best Practices: User Security

It is critical that you do not use the default MySQL installation security, it’s simply insecure.

Default Installation

When installed, MySQL enables any user with physical permissions to the server to connect to the MySQL via unauthenticated users. MySQL also provides complete access to all super user privileges via the ‘root’ user with no default password.

$ mysql -uroot
mysql> SELECT host,user,password FROM mysql.user;
+--------------+------+-------------------------------------------+
| host         | user | password                                  |
+--------------+------+-------------------------------------------+
| localhost    | root |                                           |
| server.local | root |                                           |
| 127.0.0.1    | root |                                           |
| localhost    |      |                                           |
| server.local |      |                                           |
+--------------+------+-------------------------------------------+

What you see here are two types of users.

  • The ‘root’ user which has MySQL super user privileges for your server or ‘localhost’ connections with no password.
  • Unauthenticated users indicated by the blank ‘user’ column

The absolute minimum you should do, is run the provided optional command for immediate improvements mysql_secure_installation. When running this command, you’re prompted for the following
options — the output has been trimmed for presentations purposes.

$ mysql_secure_installation
Enter current password for root (enter for none):
Set root password? [Y/n] y
New password:
Re-enter new password:
Remove anonymous users? [Y/n] Y
Disallow root login remotely? [Y/n] Y
Remove test database and access to it? [Y/n] Y
Reload privilege tables now? [Y/n] Y

If you revisit permissions now, you’ll see what you would expect from a more initially secure installation.

mysql> SELECT host,user,password FROM mysql.user;
+-----------+------+-------------------------------------------+
| host      | user | password                                  |
+-----------+------+-------------------------------------------+
| localhost | root | *FDAF706717E70DB8DDAD0C5214B13770E1A80B0E |
+-----------+------+-------------------------------------------+

This is only the first step to hardening your MySQL instance and server.

Recommendations

The following are my recommendations for the minimum MySQL security permissions:

  • Always set a MySQL ‘root’ user password
  • Change the MySQL ‘root’ user id to a different name, e.g. ‘dba’
  • Only enable SUPER privileges to dba accounts, and only ever for ‘localhost’.
  • Application user permissions should be as restrictive as possible.
  • Never use ‘%’ for a hostname
  • Never use ALL TO *.*
  • Ideally the application should have at least two types of users, a read/write user and a read user.

There is a lot more information about physical Operating System security and the MySQL permission/privilege model to be discussed. One product I know of that help is SecuRich – The MySQL Security Package featuring roles, password history and many other cool functionalities.

References

A recent post by Lance Miller quoted the following.


I cant tell you how many times in the past 18 months that I’ve found real enterprises running vulnerable databases with default passwords, weak passwords and no real permissions management. It’s bad enough that the stats right now are this (so I guess I can tell you):
– 9 out of 10 organizations have a Microsoft SQL Database with a blank “sa” password (or an sa password of “sa”, “sql” or “password”)
– 9 out of 10 organizations have a Postgres Database with a default password
– 9 out of 10 organizations have a Sybase Database with a default password

The article didn’t include MySQL however some organizations don’t change the default password, probably not 9 of 10 in my experience.

More Information

MySQL Monitoring – What's really needed

The implementation of MySQL Monitoring is critical for any organization that uses a database and wants to avoid the inevitable disaster. There are 3 important components that all serve a key purpose to “MySQL Monitoring” in general:

  • Monitoring – Historical and graphical information
  • Alerting – Tell me when something is wrong
  • Dashboard – The State of NOW

Monitoring

There is no one option for Monitoring that is significantly better then another. A short list of what’s on offer can be found at http://monitoring-mysql.com/monitoring-products. What’s important is you have monitoring in place so historically you can review situations and compare across your servers and enabling the better identification of physical or database bottlenecks. My recommendations for products are Cacti which is packaged with most popular Linux distrubtions, so can be installed via a single apt-get/yum/rpm command, and the MySQL Cacti Templates. This is not the best product solution or combination, it’s in my opinion the most common and covers all the essential bases.

It’s best to define a different web server (i.e. publically accessible) to be the monitoring server rather then installing the web interface on any single DB server.

Alerting

Alerting is key to notify you of a problem without the need for somebody to be viewing a screen and see it happening via your monitoring. The identification of high CPU load, a disk nearing capacity, database locking etc often helps avoid a current problem before it becomes some level of disaster. Almost all companies use Nagios or a derivative such as the main fork Icigna or products that include Nagios like Opsview or Groundworks.

Dashboard

While both Monitoring and Alerting are necessary, they both however lack a key component necessary for successful administration. That is timing. Both of these earlier options sample, e.g. 1 minute or generally 5 minutes (by default), but problems can happen quickly. This is why each organization needs to have a Dashboard. I don’t know of any products here unless you try and adjust a monitoring product, but want’s needed is a very lightweight and very business centric single page of Green/Yellow/Red status’s of your environment including databases, webservers, response time and traffic etc. This is for the state of NOW. A Dashboard should sample every 5 to 10 seconds. I have seen larger and more successful companies have various home grown implementations. I developed a product for one company and it included the following on a single page. You can see the screen output in a presentation at http://www.slideshare.net/ronaldbradford/10x-performance-improvements. This included

  • 5 DB servers monitoring load average, ping time, database connections, active,free,locked, replication availability and lag
  • 5 web servers monitoring load average, ping time, apache connections
  • Application metrics monitoring 3 different page load times, and page size

It is often important to be able to identify a key problem and then drill down to this more quickly rather then the usual “the website is slow” question and having to investigate the same repetitive tasks. You need to automate, and be more pro-active in response especially to load and locking issues.

Advanced Monitoring

Information is only have the requirement, it is what you do with this information that determines how to be proactive rather then reactive. If I was the DBA of a company I’d do even more then these initial 3 steps which are a necessary base. I would also monitor for example:

  • Database size and growth. This is important to be preemptive about your capacity. Example SQL at http://ronaldbradford.com/mysql-dba/#allschemas
  • Error Log changes
  • Backup timing. This is important as your DB grows as it affects recovery.
  • Recovery timing
  • Gather raw MySQL status information because monitoring tools only capture what you ask it to do, not everything. While you may not analyze all now, you may want to in the future look back in time. Example scripts at http://ronaldbradford.com/mysql-dba/#log-stats
  • Hourly/Daily text reports. Producing a easy readable SHOW GLOBAL status report such as with statpack will for example enable me to know network throughtput in the DB, transaction throughput and key indicators of locking, disk access etc. While you may have a graphic interface, it’s a lot easier to automate and grep text reports.
  • Proactive restrictions. The Twitter failed whale is a great example of when the system moves closer to known limits, but before those limits they start limiting load. This includes for example to disable less critical but resource intensive functionality, e.g. people search. The also start rejecting connections so they do not reach a crash state. This could be proactively changing timeout values so the DB fails queries, and the webservers respond accordingly with a try again approach.

While these are important if you have only limited resources, too much information can be just as much of a burden then people just start ignoring the information and miss what’s important.

More on understanding sort_buffer_size

There have been a few posts by Sheeri and Baron today on the MySQL sort_buffer_size variable. I wanted to add some more information about this buffer, what is impacted when it is changed, and what to do about it?

The first thing you need to know is the sort_buffer_size is a per session buffer. That is this memory is assigned per connection/thread. I’ve seen clients that set this assuming it’s a global buffer Don’t Assume – Per Session Buffers.

Second, internally in the OS usage independently of MySQL, there is a threshold > 256K. From Monty Taylor “if buffer is set to over 256K, it uses mmap() instead of malloc() for memory allocation. Actually – this is a libc malloc thing and is tunable, but defaults to 256k. From the manual:” . He goes on in a further to shows that impact > 256K for a buffer is 37x slower. This applies to all per session buffers, not just sort buffer. Now I have heard recently about this limit being 512K. I wasn’t able to nail down the specific speaker to see if this was a newer library or kernel or OS.

With MySQL instrumentation and the sort_buffer_size we are lucky, there is the Sort_merge_passes status variable. While it’s not perfect, it does indicate if the size of the buffer is in-sufficient, however even if we use a sort_buffer_size of say 256K, and you see Sort_merge_passes increasing slowly, does not indicate you have to increase the buffer.

So, all this does not tell you how to tune the buffer? Unfortunately with MySQL there is no actual easy answer. You do need to monitor the mysqld memory usage overall, especially if you are using persistent connections. A connection/thread will not release the memory assigned until it is closed, so it’s important to monitor for memory creep of the PGA, knowing what your initial SGA is. Morgan Tocker wrote a patch in Bug #33540 to create a RESET CONNECTION type command.

You do need to look for memory as a bottleneck. You need to learn how MySQL use memory, not just the sort_buffer_size. I actually started many years ago to write global/session variables to indicate when buffers were used, and how much and I started with the sort_buffer_size which was buried down in some very old filesort code. When I sought the input of an expert C coder around this, they wondered how the code, especially a loop handler even actually worked.

Nobody knows what the optimal setting is, and that’s the problem. In certain areas especially memory usage the MySQL instrumentation is simply non-existent, and I’d like to see this as something that is fixed.

In conclusion, if I ever see a sort_buffer_size above 256K, e.g. 1M or 2M, I always reset it to 256K. My reasoning is simple. Until you have evidence in your specific environment increasing the buffer makes performance better, it’s better to use a smaller value. There are bigger wins, like not using sorting, or better design, or even better simplifying or eliminating SQL.

References

sort_buffer_size and Knowing Why
How to tune MySQL’s sort_buffer_size
Read Buffer performance hit
more on malloc() speed

Free MySQL Event in Washington DC

As the program chair for the recently announced MySQL Track at the ODTUG Kaleidoscope conference located in Washington DC we are also looking into an associated free community event for MySQL locals in addition to a dedicated track for 4 days.

Please let us know your name and email via the form at http://ronaldbradford.com/ODTUG/free-event/ so we can provide more details in the coming week as we try to finalize logistics.

Registration will be necessary for attendance however for now we just want to know who is local so we can provide more details soon!

Updated. Full details of the free Monday night sundown sessions and reception can be found at MySQL track with free event at Kaleidoscope 2010

The MySQL community impacting the Oracle community

I’m happy to announce that the MySQL community has been given the opportunity to speak at the upcoming Oracle Developer Tools User Group (ODTUG) Kaleidoscope conference in Washington DC. We will be releasing more details this week of the MySQL presentations and topics and we are finalizing details of possible options to include the local MySQL community during the event.

The various independent Oracle User Groups in North America that embody “by the community and for the community” have been very positive with including the MySQL community. With the Sun/MySQL now Oracle community team of Giuseppe Maxia, Lenz Grimmer, Kaj Arnö and Oracle ACE Directors Sheeri K Cabral and myself we have been happy with the openness and willingness to include us in the larger Oracle ecosystem.

We’ll announce the schedule when we finalize it, but we have had a great response from an impressive list of speakers.

Additional References

The MySQL documentation is not always right

Let me premise this post with the statement I think the MySQL documentation is an excellent and highly accurate resource. I think the MySQL docs team do a great job, however like software and people, documentation is not perfect.

As members of the MySQL community you can always contribute to improve the process by reading the documentation and logging any issues as Documentation Bugs.

Some time ago in a discussion with a friend and colleague, we were talking about changes in historical defaults that had been improved finally in MySQL 5.4 The specific discussion was on the new default innodb_buffer_pool_size and we both agreed it increased significantly. One said 1GB, the other said 128MB. Who was right? Well we both were, and we were both inaccurate depending on versions.

Referencing the 5.4 Manual in InnoDB Startup Options and System Variables the current value for Linux is 128M, but for Windows it’s 1GB.

However I was confident I was told in a presentation, perhaps even the keynote the value was 1GB. Firing up my server and seeing the original version I used of 5.4.0 (which was not available on Windows) we find that the default for Linux was 1GB at some time, i.e. the first release.

mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 1073741824 |
+-------------------------+------------+
1 row in set (0.00 sec)

mysql> select 1073741824/1024/1024 as MB;
+---------------+
| MB            |
+---------------+
| 1024.00000000 |
+---------------+
1 row in set (0.00 sec)

mysql> show global variables like 'version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| version                 | 5.4.0-beta                   |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | unknown-linux-gnu            |
+-------------------------+------------------------------+
4 rows in set (0.00 sec)

I’m not trying to nit pick here, I’m highlighting that MySQL is a evolving product with many different versions and architectures. It’s our job of the MySQL community to help make the documentation the best for all readers. In this above case I’ve not logged the issue, because 5.4 is a defunct product, however if you want an example of how I identified a problem, provided a test case, and saw that my contribution was reviewed, verified and implemented check out Bug #51739 –core-file is not default TRUE (incorrect docs).

In conclusion, always read the documentation but pay special attention to the current version that matches the documentation, and the version you are actually running. Defaults change between versions, e.g. innodb_thread_concurrency is a complex example, and I’ve been caught with a large enterprise client with assuming the default of a Connector/J options as true, when it was in 5.0.6, but in 5.0.5 the version the client was running it was false.

An old saying, “trust by verify” is a good motto to consider.

The Drizzle Census

One thing I have often wondered is just how many MySQL instances exist in the world and what MySQL versions and architectures are in use. We hear of 50,000 windows downloads per day but this is misleading because MySQL is basically bundled with Linux by default or installed from various repositories. Linux servers powers many websites.

In Drizzle we have a proposed plan, the Drizzle Census. From the productive Drizzle Developers Day recently at the 2010 MySQL conference we sat down and created a blueprint, and subsequent high level spec of what we considered this optional plugin should do. We didn’t get as far as I would have liked in a code skeleton to at least gather and store a sample result, but the hope is that with the community we will in the near future.

Here is the list of information we decided was appropriate for anonymous information of value.

  • Kernel Version/Architecture
  • CPU type
  • SID – HASH(processor_id,listener address,first listener port)
  • Drizzle Version
  • Drizzle Uptime
  • Drizzled process memory usage

2010 MySQL Conference Presentations

I have uploaded my three presentations from the 2010 MySQL Users Conference in Santa Clara, California which was my 5th consecutive year appearing as a speaker.

A full history of my MySQL presentations can be found on the Presenting page.

State of the Dolphin – Opening keynote

Edward Screven – Chief Corporate Architect of Oracle provided the opening keynote at the 2010 MySQL Users Conference.

Overall I was disappointed. The first half was more an Oracle Sales pitch, we had some product announcements, we had some 5.5 performance buzz. While a few numbers and features were indeed great to hear, there was a clear lack of information to the MySQL ecosystem including employees, alumni and various support services. I hope more is unveiled this week.

Some notes of the session.

  • Oracle’s Strategy covers storage, servers, virtual machines, operating system, database, middleware, applications
  • We build a complete technology stack that is “open” and “integrated” based on “open standards”
  • products talk via open standards with the intention for customers to not feel locked in to any technology
  • Examples include apache, java, linux, xen, eclipse, and innodb
  • Unbreakable linux has now over 4,500 customers

After the sales pitch we got down to more about MySQL.

What MySQL means to Oracle? We make the Oracle solution more complete as a stack for customers.

What is the investment in MySQL?

  • Make MySQL a better MySQL
  • Develop, promote and support MySQL
  • MySQL community edition

Integration with Oracle Enterprise Manager, Oracle Secure Backup and Oracle Audit Vault infrastructure. *This I expected and have blogged about, so I’m glad to see this commitment.

MySQL 5.5 is now in Alpha, some features are

  • InnoDB will be default engine
  • Semi sync replication
  • Replication heartbeat
  • Signal
  • Performance Schema

MySQL 5.5 is planned on being faster with Innodb Performance Improvements & MySQL Performance Improvements.
MySQL 5.5 sysbench claims, read 200% faster, write 364% faster.

MySQL Workbench 5.2 announcement

  • SQL Development
  • Database Administration
  • Data Modelling

MySQL Cluster 7.1 GA announcement

  • Improved Administration
  • Higher Performance
  • Carrier Grade Availability & Performance

MySQL Enterprise Backup announcement

  • Online backup for InnoDB only
  • Formally InnoDB hot backup with additional features including incremental backups

MySQL Enterprise Monitor 2.2 Beta announcment

In closing the statement was “MySQL lets Oracle be more complete at the database layer”. Is that good for the MySQL Community or better for the Oracle revenue model?

Using ext4 for MySQL

This week with a client I saw ext4 used for the first time on a production MySQL system which was running Ubuntu 9.10 (Karmic Koala). I observe today while installing 9.10 Server locally that ext4 is the default option. The ext4 filesystem is described as better performance, reliability and features while there is also information about improvements in journaling.

At OSCON 2009 I attended a presentation on Linux Filesystem Performance for Databases by Selena Deckelmann in which ext4 was included. While providing some improvements in sequential reading and writing, there were issue with random I/O which is the key for RDBMS products.

Is the RAID configuration (e.g. RAID 5, RAID 10), strip size, buffer caches, LVM etc more important then upgrading from ext3 to ext4? I don’t have access to any test equipment in order to determine myself however I’d like to know of any experiences from members of the MySQL community and if anybody has experienced any general problems running ext4.

ext4 References

How do I identify the MySQL my.cnf file?

If you are unfamiliar in administrating MySQL, the current MySQL configuration file generally found is named my.cnf (my.ini on windows). Where is that file.

If only that question was easy to answer!

Use of configuration files

MySQL will by default use at least one configuration file from the following defaults. MySQL also uses a cascade approach for configuration files. When you have multiple files in the appropriate paths you can see unexpected behavior when you override certain values in different files.

You can however for example specify –no-defaults to use no configuration file, or add options to your command line execution, so even looking at all configuration files is no guarantee of your operating configuration.

However for most environments, these complexities do not exist.

Default Location

By default and on single instance MySQL servers you are most likely to find this file called my.cnf and found at:

  • /etc/my.cnf
  • /etc/mysql/my.cnf

These are known as the global options files.

Alternative Locations

MySQL has both instance specific and user specific locations. For the inclusion of an instance specific file, the location is:

  • $MYSQL_HOME/my.cnf

where MYSQL_HOME is a defined environment variable. Historical MySQL versions also looked at [datadir]/my.cnf however I am unaware if this is applicable in 5.x versions.

You can also specific options on a per user basis for default inclusion. These are found at:

  • $HOME/.my.cnf

Distro specific locations

Ubuntu for example also provides an ability to add options via an include directory.

Specifying a configuration at runtime

While you may have these default files, you may elect to start mysql with a specific configuration file as specified by –defaults-file. This option will override all global/instance/user locations and use just this configuration file. You can also specify additional configuration that supplements and not overrides the default with –defaults-extra-file.

What files are on my system?

Again, assuming the default names you can perform a brute force check with:

$ sudo find / -name "*my*cnf"

This is actually worthwhile, especially if you find a /root/.my.cnf file which is default MySQL settings for the Operating System ‘root’ user.

MySQL recommendations

MySQL by default provides a number of recommended files however these are generally outdated especially for newer hardware. These files include my-huge.cnf, my-large.cnf, my-medium.cnf, my-small.cnf and my-innodb-heavy-4G.cnf. Don’t assume replacing your configuration with one of these files will make your system perform better.

MySQL made some attempt to correct these and at least some very poor defaults with MySQL 5.4 however I am unsure what’s in MySQL 5.5

MySQL Configuration at runtime

While several commands can help with identifying your configuration files and print defaults etc, it’s also possible to change your configuration at runtime. It’s possible that these changes are not reflected in your configuration files and pose an additional mismatch.

References

Don't Assume – Per Session Buffers

MySQL has a number of global buffers, i.e. your SGA. There are also a number of per session/thread buffers that combined with other memory usage constitutes an unbounded PGA. One of the most common errors in mis-configured MySQL environments is the setting of the 4 primary per session buffers thinking they are global buffers.

Global buffers include:

    The four important per session buffers are:

    I have seen people see these values > 5M. The defaults range from 128K to 256K. My advice for any values above 256K is simple. What proof do you have this works better? When nothing is forthcoming, the first move is to revert to defaults or a maximum of 256K for some benchmarkable results. The primary reason for this is MySQL internally as quoted by Monty Taylor – for values > 256K, it uses mmap() instead of malloc() for memory allocation.

    These are not all the per session buffers you need to be aware of. Others include thread_stack, max_allowed_packet,binlog_cache_size and most importantly max_connections.

    MySQL also uses memory in other areas most noticeably in internal temporary tables and MEMORY based tables.

    As I mentioned, there is no bound for the total process memory allocation for MySQL, so some incorrectly configured variables can easily blow your memory usage.

    References

    About “Don’t Assume”

    “Don’t Assume” is a series of posts to help the Oracle DBA understand, use and appreciate the subtle differences and unique characteristics of the MySQL RDBMS in comparison to Oracle. These points as essential to operate MySQL effectively in a production environment and avoid any loss of data or availability.

    For more posts in this series be sure to follow the mysql4oracledba tag and also watch out for MySQL for Oracle DBA presentations.

    The MySQLCamp for the Oracle DBA is a series of educational talks all Oracle DBA resources should attend. Two presentations from this series IGNITION and LIFTOFF will be presented at the MySQL Users Conference 2010 in Santa Clara, April 2010 This series also includes JUMPSTART and VELOCITY. If you would like to here these presentations in your area, please contact me.

Advanced reporting options for MySQL

I’m seeking help from the MySQL community for what tools are used today to generate complex reports for enterprise applications that use MySQL. In an Oracle world, you have Oracle Report Writer, in Microsoft Crystal Reports.

In the open source world there is Jasper Reports, Pentaho Reports and BIRT however I don’t know the power of complex reporting with these.

If anybody has experience using or evaluating these tools please let me know. This may lead to possible work.

Migrating MySQL latin1 to utf8 – The process

Having covered the preparation and character set options of performing a latin1 to utf8 MySQL migration, just how do you perform the migration correctly.

Example Case

Just to recap, we have the following example table and data.

mysql> select c,length(c),char_length(c),charset(c), hex(c) from conv.test_latin1;
+---------------+-----------+----------------+------------+----------------------------+
| c             | length(c) | char_length(c) | charset(c) | hex(c)                     |
+---------------+-----------+----------------+------------+----------------------------+
| a             |         1 |              1 | latin1     | 61                         |
| abc           |         3 |              3 | latin1     | 616263                     |
| ☺             |         3 |              3 | latin1     | E298BA                     |
| abc ☺☹☻       |        13 |             13 | latin1     | 61626320E298BAE298B9E298BB |
+---------------+-----------+----------------+------------+----------------------------+

Migration approach 1

The easiest way is to use mysqldump to dump the schema and data, to change the schema definitions and then a client reload process, all with the correct client character sets. Working on our sample table test_latin1 which I have in a conv schema you can do.

$ mysqldump -uroot -p --default-character-set=latin1 --skip-set-charset conv > conv.sql
$ sed -i -e "s/latin1/utf8/g" conv.sql
$ mysql -uroot -p --default-character-set=utf8 conv < conv.sql

If you attempt this which technically works (as I've seen a similar example on the Internet) your bound to screw up something. While the mysqldump and mysql load use the correct client command line options, performing a blind conversion of all references of latin1 to utf8 will not only change your table definition, in my example it will change the name of table, and if would change any values of data that contained the word 'latin1'. For example.

mysql> select c,length(c),char_length(c),charset(c), hex(c) from conv.test_utf8;
+---------+-----------+----------------+------------+----------------------------+
| c       | length(c) | char_length(c) | charset(c) | hex(c)                     |
+---------+-----------+----------------+------------+----------------------------+
| a       |         1 |              1 | utf8       | 61                         |
| abc     |         3 |              3 | utf8       | 616263                     |
| ?       |         3 |              1 | utf8       | E298BA                     |
| abc ??? |        13 |              7 | utf8       | 61626320E298BAE298B9E298BB |
+---------+-----------+----------------+------------+----------------------------+
4 rows in set (0.00 sec)

mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> select c,length(c),char_length(c),charset(c), hex(c) from conv.test_utf8;
+---------------+-----------+----------------+------------+----------------------------+
| c             | length(c) | char_length(c) | charset(c) | hex(c)                     |
+---------------+-----------+----------------+------------+----------------------------+
| a             |         1 |              1 | utf8       | 61                         |
| abc           |         3 |              3 | utf8       | 616263                     |
| ☺             |         3 |              1 | utf8       | E298BA                     |
| abc ☺☹☻       |        13 |              7 | utf8       | 61626320E298BAE298B9E298BB |
+---------------+-----------+----------------+------------+----------------------------+
4 rows in set (0.01 sec)

Be sure to realize now you need to connect with --default-character-set=utf8 or in our example, we use set names as a stop gap measure.

Migration Approach 2

A good practice with using mysqldump regardless of migration is to always dump the schema with the --no-data option, and then dump the data separately with the --no-create-info option. In this case, you can then manually edit the schema file, carefully changing latin1 where appropriate. This is your production data, so some manual hand verification is a good thing. The additional benefit is you can create your schema and verify the syntax is correct before loading any data. While mysqldump creates a single file, due to this dump and reload, you can split your data file and perform some level of parallelism for data loading depending on your hardware capabilities.

$ mysqldump -uroot -p --default-character-set=latin1 --skip-set-charset --no-data conv > conv.schema.sql
$ mysqldump -uroot -p --default-character-set=latin1 --skip-set-charset --no-create-info conv > conv.data.sql
# For simplicity in this example I'm using a more strict global replacement. NOTE: this syntax depends on the version of mysql
$ sed -i -e "s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g" conv.schema.sql
$ mysql -uroot -p --default-character-set=utf8 conv < conv.schema.sql
$ mysql -uroot -p --default-character-set=utf8 conv < conv.data.sql
$ mysql -uroot -p --default-character-set=utf8 conv
mysql> select c,length(c),char_length(c),charset(c), hex(c) from conv.test_latin1;
+---------------+-----------+----------------+------------+----------------------------+
| c             | length(c) | char_length(c) | charset(c) | hex(c)                     |
+---------------+-----------+----------------+------------+----------------------------+
| a             |         1 |              1 | utf8       | 61                         |
| abc           |         3 |              3 | utf8       | 616263                     |
| ☺             |         3 |              1 | utf8       | E298BA                     |
| abc ☺☹☻       |        13 |              7 | utf8       | 61626320E298BAE298B9E298BB |
+---------------+-----------+----------------+------------+----------------------------+
4 rows in set (0.00 sec)

Other software interactions

While I only discuss the MySQL data and mysql client usage here, you should ensure that your programming language connecting to MySQL uses the correct character set management, e.g. PHP uses the default_charset settings in php.ini and even your webserver may benefit from the correct encoding, e.g. Apache httpd uses AddDefaultCharset httpd.conf.

Migration Approach Problems

In our example our migration is successfully, however like life, your production data is unlikely to be perfect. In my next post I will talk about identifying and handling exceptions, including data that was wrongly encoded originally into latin1, or translation such as html entities from rich editor input fields for example.

Don't Assume – Data Integrity

MySQL has the same level of data integrity for numbers and strings as Oracle; when MySQL is correctly configured. By default (a reason I wish I knew why it is still the default), MySQL performs silent conversions on boundary conditions of data that will result in your data not always being what is specified. Let’s look at the following examples to demonstrate default behavior.

For numbers

mysql> DROP TABLE IF EXISTS example;
mysql> CREATE TABLE example(i1  TINYINT, i2 TINYINT UNSIGNED, c1 VARCHAR(5));
mysql> INSERT INTO example (i1) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 1 warning (0.08 sec)
mysql> SELECT * FROM example;
+------+------+------+
| i1   | i2   | c1   |
+------+------+------+
|    1 | NULL | NULL |
|   -1 | NULL | NULL |
|  100 | NULL | NULL |
|  127 | NULL | NULL |
+------+------+------+
4 rows in set (0.00 sec)

As you can see for one value we inserted 500, yet the value of 127 is stored? For this example I have used the TINYINT numeric data type to demonstrate truncation. TINYINT is a 1 byte integer that stores values from -128 to +127. Unlike Oracle, MySQL has 9 different data types for numeric columns, and using these wisely can improve your database disk footprint, for example BIGINT v INT. Is there a big deal?.

MySQL also has a nice feature for numeric data types, the UNSIGNED attribute that ensures only a positive integer or 0 value. Let’s see what happens with this column.

Unsigned

mysql> TRUNCATE TABLE example;
mysql> INSERT INTO example (i2) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 2 warnings (0.00 sec)
mysql> SELECT * FROM example;
+------+------+------+
| i1   | i2   | c1   |
+------+------+------+
| NULL |    1 | NULL |
| NULL |    0 | NULL |
| NULL |  100 | NULL |
| NULL |  255 | NULL |
+------+------+------+
4 rows in set (0.00 sec)

Now you see that -1 and 500 are now not the expected values, and before while 500 was silently truncated to 127, now it’s truncated to 255.

For Strings

As you can now assume, the following also occurs for strings.

mysql> TRUNCATE TABLE example;
mysql> INSERT INTO example (c1) VALUES (NULL),('a'),('abcde'),('xyz12345');
Query OK, 4 rows affected, 1 warning (0.00 sec)
mysql> SELECT * FROM example;
+------+------+-------+
| i1   | i2   | c1    |
+------+------+-------+
| NULL | NULL | NULL  |
| NULL | NULL | a     |
| NULL | NULL | abcde |
| NULL | NULL | xyz12 |
+------+------+-------+
4 rows in set (0.00 sec)

Show warnings

As you can see here, the mysql client shows that warnings occurred, but if you don’t review the warning you would never know, a situation that is rarely reviewed with development in richer programming languages. Let us look at these actual warnings more closely.

mysql> INSERT INTO example (i1) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 1 warning (0.08 sec)
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1264 | Out of range value for column 'i1' at row 4 |
+---------+------+---------------------------------------------+

mysql> INSERT INTO example (i2) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 2 warnings (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1264 | Out of range value for column 'i2' at row 2 |
| Warning | 1264 | Out of range value for column 'i2' at row 4 |
+---------+------+---------------------------------------------+
2 rows in set (0.00 sec)
mysql> INSERT INTO example (c1) VALUES (NULL),('a'),('abcde'),('xyz12345');
Query OK, 4 rows affected, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+-----------------------------------------+
| Level   | Code | Message                                 |
+---------+------+-----------------------------------------+
| Warning | 1265 | Data truncated for column 'c1' at row 4 |
+---------+------+-----------------------------------------+
1 row in set (0.00 sec)

Using sql_mode

The solution is the sql_mode configuration variable and at minimum the value of STRICT_ALL_TABLES defined. We can demonstrate the expected behavior with the following syntax.

mysql> set SESSION sql_mode=STRICT_ALL_TABLES;
Query OK, 0 rows affected (0.00 sec)

mysql> TRUNCATE TABLE example;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO example (i1) VALUES (1), (-1), (100), (500);
ERROR 1264 (22003): Out of range value for column 'i1' at row 4
mysql> SELECT * FROM example;
+------+------+------+
| i1   | i2   | c1   |
+------+------+------+
|    1 | NULL | NULL |
|   -1 | NULL | NULL |
|  100 | NULL | NULL |
+------+------+------+
3 rows in set (0.00 sec)

As you can see, even with an error for a single INSERT statement, some data was actually stored. You should read Don’t Assume – Transactions for some insights here.

When it comes to dates, there is greater complexity and this is grounds for another entry of this series.

References

About “Don’t Assume”

“Don’t Assume” is a series of posts to help the Oracle DBA understand, use and appreciate the subtle differences and unique characteristics of the MySQL RDBMS in comparison to Oracle. These points as essential to operate MySQL effectively in a production environment and avoid any loss of data or availability.

For more posts in this series be sure to follow the mysql4oracledba tag and also watch out for MySQL for Oracle DBA presentations.

The MySQLCamp for the Oracle DBA is a series of educational talks all Oracle DBA resources should attend. Two presentations from this series IGNITION and LIFTOFF will be presented at the MySQL Users Conference 2010 in Santa Clara, April 2010 This series also includes JUMPSTART and VELOCITY. If you would like to here these presentations in your area, please contact me.

How to crash mysqld intentionally

While some may think I’m daft, I have a legitimate reason for wanting to crash mysqld. However first we need to find a way to crash it.

Great thanks to Alan K, Mark L, Harrison and Hartmut on #mysql-dev for several suggestions and a config option I was unaware of. My investigation even lead to a documentation bug logged as #51739.

My first thought was to find a known bug and if necessary install the correct version to test that. A good one was suggested, Bug #48508 which fails on several versions that I will use to demonstrate with, however the simplest way is to issue kill -11

By default, no core file will be produced which is what I’m seeking but with the right options this is possible. First, the user running mysqld probably has a core file limit size of 0.

$ ulimit -c
0

You can fix this with with ulimit or you can specify this in the [mysqld_safe] section with core-file-size=unlimited

$ ulimit -c unlimited
$ ulimit -c
unlimited

The option I was not aware of is you also have to also specify core-file in your my.cnf

[mysqld]
core-file

I also for my CentOS 5.4 installation ran the following kernel commands, but this may be unnecessary.

sudo /sbin/sysctl -w kernel.core_pattern="core"
sudo /sbin/sysctl -w fs.suid_dumpable= 1

It is now easy to produce a core file.

$ bin/mysqld_safe &
$ killall -11 mysqld
$ bin/mysqld_safe: line 137:  2656 Segmentation fault      (core dumped) ...
100304 16:46:43 mysqld_safe Number of processes running now: 0
100304 16:46:43 mysqld_safe mysqld restarted
$ find . -name "core*"
./data/core.99999

NOTE: Do no run killall on a multi-instance server. I use this syntax here only for simplicity in presentation. It is best to run ps and kill the appropriate pid.

On a side note, I also tried to produce a core on Mac OS X without success. I’d still like to document that way, so if anybody can assist please ping me.

Don't Assume – Transactions

MySQL by default is a NON transactional database. For the hobbyist (See The Hobbyist and the Professional), startup entrepreneur and website developer this may not appear foreign, however to the seasoned Oracle DBA who has only used Oracle the concept is very foreign.

In MySQL you have to be concerned with two situations that will catch the unprepared out. The first is the default autocommit mode. This is TRUE, i.e. all statements are automatically committed on completion.

mysql> SELECT @@autocommit,TRUE;
+--------------+------+
| @@autocommit | TRUE |
+--------------+------+
|            1 |    1 |
+--------------+------+
1 row in set (0.00 sec)

The second is the storage engine used. Again a foreign term for Oracle DBA’s, a storage engine is a technology that stores and retrieves the underlying data from the MySQL database. MySQL has many different storage engines, each with relative strengths and weaknesses and different features. For the purpose of this discussion it is important to know that engines are either non-transactional or transactional. The default storage engine MyISAM is NON transactional. MySQL provides by default the InnoDB storage engine which is transactional. There are distinct advantages of a non transactional environment which I will not go into at this time.

Having recently written about this in my upcoiming book Expert PHP and MySQL I will demonstrate what happens with both MyISAM and InnoDB.

Non-transactional Tables

To show the difference, Listing 6-7 demonstrates that atomicity is not possible with non-transactional tables. The following tables are used in this example.

DROP TABLE IF EXISTS non_trans_parent;
CREATE TABLE non_trans_parent (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  val  VARCHAR(10) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY (val)
) ENGINE=MyISAM DEFAULT CHARSET latin1;
DROP TABLE IF EXISTS non_trans_child;
CREATE TABLE non_trans_child (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
  parent_id INT UNSIGNED NOT NULL,
  created   TIMESTAMP NOT NULL,
PRIMARY KEY (id),
INDEX (parent_id)
) ENGINE=MyISAM DEFAULT CHARSET latin1;

To test things out, perform a sample transaction that inserts records into these two tables:

START TRANSACTION;
INSERT INTO non_trans_parent(val) VALUES(‘a’);
INSERT INTO non_trans_child(parent_id,created) VALUES(LAST_INSERT_ID(),NOW());
INSERT INTO non_trans_parent (val) VALUES(‘a’);
ERROR 1062 (23000): Duplicate entry ‘a’ for key ‘val’
ROLLBACK;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+--------+------+--------------------------------------------------------------
| Level  | Code | Message
+--------+------+--------------------------------------------------------------
| Warning| 1196 | Some non-transactional changed tables couldn’t be rolled back
+--------+------+--------------------------------------------------------------
SELECT * FROM non_trans_parent;
+----+-----+
| id | val |
+----+-----+
|  1 | a   |
+----+-----+
SELECT * FROM non_trans_child;
+----+-----------+---------------------+
| id | parent_id | created             |
+----+-----------+---------------------+
|  1 |         1 | 2009–09–21 23:44:25 |
+----+-----------+---------------------+

As you can see, data that you would have expected to not exist from the transaction is present.

Transactional Tables

Repeat these SQL statements using the transactional storage engine InnoDB; you will observe the difference between transactional and non transactional processing. The following tables, shown in Listing 6-8, are used in this example.

DROP TABLE IF EXISTS trans_parent;
CREATE TABLE trans_parent (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  val  VARCHAR(10) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY (val)
) ENGINE=InnoDB DEFAULT CHARSET latin1;
DROP TABLE IF EXISTS trans_child;
CREATE TABLE trans_child (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
  parent_id INT UNSIGNED NOT NULL,
  created   TIMESTAMP NOT NULL,
PRIMARY KEY (id),
INDEX (parent_id)
) ENGINE=InnoDB DEFAULT CHARSET latin1;

Perform a sample transaction that inserts records into these two tables:

START TRANSACTION;
INSERT INTO trans_parent (val) VALUES(‘a’);
INSERT INTO trans_child (parent_id,created) VALUES(LAST_INSERT_ID(),NOW());
INSERT INTO trans_parent (val) VALUES(‘a’);
ERROR 1062 (23000): Duplicate entry ‘a’ for key ‘val’
ROLLBACK;
Query OK, 0 rows affected (0.01 sec)
SELECT * FROM trans_parent;
Empty set (0.00 sec)
SELECT * FROM trans_child;
Empty set (0.00 sec)

As you can see, no data has been recorded as part of the failing transaction.

About “Don’t Assume”

“Don’t Assume” is a series of posts to help the Oracle DBA understand, use and appreciate the subtle differences and unique characteristics of the MySQL RDBMS in comparison to Oracle. These points as essential to operate MySQL effectively in a production environment and avoid any loss of data or availability.

For more posts in this series be sure to follow the mysql4oracledba tag and also watch out for MySQL for Oracle DBA presentations.

The MySQLCamp for the Oracle DBA is a series of educational talks all Oracle DBA resources should attend. Two presentations from this series IGNITION and LIFTOFF will be presented at the MySQL Users Conference 2010 in Santa Clara, April 2010 This series also includes JUMPSTART and VELOCITY. If you would like to here these presentations in your area, please contact me.