MySQL pre version 5.0 was very lax in it’s management of valid data. It was easy for data integrity to be abused if you knew how. The most common examples were truncations and silent conversions that if not understood could provide a serious data integrity issue.
In version 5.0, the introduction of SQL_MODE solved this problem. We will look at one example of how SQL_MODE can be enabled to provided improved data integrity.
You want to store the individual RGB (red/green/blue) decimal values of colors in a table. Each of these has a range from 0 to 255. You read that you can store 255 values in a TINYINT Integer data type, so you create a table like:
DROP TABLE IF EXISTS color_to_decimal; CREATE TABLE color_to_decimal( name VARCHAR(20) NOT NULL PRIMARY KEY, red TINYINT NOT NULL, green TINYINT NOT NULL, blue TINYINT NOT NULL);
You insert some data like:
INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('white',255,255,255); INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('black',0,0,0); INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('red',255,0,0); INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('green',0,255,0); INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('blue',0,0,255); INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('yellow',255,255,0);
Great, but when you look at you data you get?
SELECT name, red, green, blue FROM color_to_decimal ORDER BY name; +--------+-----+-------+------+ | name | red | green | blue | +--------+-----+-------+------+ | black | 0 | 0 | 0 | | blue | 0 | 0 | 127 | | green | 0 | 127 | 0 | | red | 127 | 0 | 0 | | white | 127 | 127 | 127 | | yellow | 127 | 127 | 0 | +--------+-----+-------+------+ 6 rows in set (0.01 sec)
What happened, you delete the data and re-insert only to find no changes. You have been the victim of a silent conversion, via a means of truncation.
The TINYINT data type is 1 byte (8 bits). 8 bits can store the values from 0 to 255. When you use this integer data type, only 7 bits are actually available, which gives the range of 0 to 127. Why? Because MySQL reserved one bit for the sign, either positive or negative, even though you didn’t want a sign.
So knowing this, you go back and recreate your table with the following definition.
DROP TABLE IF EXISTS color_to_decimal; CREATE TABLE color_to_decimal( name VARCHAR(20) NOT NULL PRIMARY KEY, red TINYINT UNSIGNED NOT NULL, green TINYINT UNSIGNED NOT NULL, blue TINYINT UNSIGNED NOT NULL);
You load your data and look at it again, and you see.
+--------+-----+-------+------+ | name | red | green | blue | +--------+-----+-------+------+ | black | 0 | 0 | 0 | | blue | 0 | 0 | 255 | | green | 0 | 255 | 0 | | red | 255 | 0 | 0 | | white | 255 | 255 | 255 | | yellow | 255 | 255 | 0 | +--------+-----+-------+------+ 6 rows in set (0.00 sec)
But, should you have been told about this, should there have been an error. Well, in MySQL this is actually a warning, and most applications never support and cater for warnings. It is only when you use the MySQL client program, as in these examples, you are given an indication, with the following line after each insert. If you look closely.
mysql> INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('blue',0,0,255); Query OK, 1 row affected, 1 warning (0.00 sec)
However there is a savior for this situation, and that is SQL_MODE.
When set to the setting TRADITIONAL, an error and not a warning is generated, and most applications support catching errors. Look at what happens in our example using the original table.
SET SQL_MODE=TRADITIONAL; DROP TABLE IF EXISTS color_to_decimal; CREATE TABLE color_to_decimal( name VARCHAR(20) NOT NULL PRIMARY KEY, red TINYINT NOT NULL, green TINYINT NOT NULL, blue TINYINT NOT NULL); INSERT INTO color_to_decimal (name, red,green,blue) VALUES ('white',255,255,255); ERROR 1264 (22003): Out of range value for column 'red' at row 1
As an added benefit you get this data integrity for free. We didn’t test it, because we know the data coming in is in the range of 0-255, but what if the user entered 500 for example. Let’s see.
TRUNCATE TABLE color_to_decimal; SET SQL_MODE=''; INSERT INTO color_to_decimal (name, red, green, blue) VALUES('a bad color',500,0,0); SELECT name, red, green, blue FROM color_to_decimal; SET SQL_MODE=TRADITIONAL; INSERT INTO color_to_decimal (name, red, green, blue) VALUES('a bad color',500,0,0);
Looking closely at the response in the client.
mysql> TRUNCATE TABLE color_to_decimal; Query OK, 0 rows affected (0.02 sec) mysql> SET SQL_MODE=''; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO color_to_decimal (name, red, green, blue) VALUES('a bad color',500,0,0); Query OK, 1 row affected, 1 warning (0.00 sec) mysql> SELECT name, red, green, blue FROM color_to_decimal; +-------------+-----+-------+------+ | name | red | green | blue | +-------------+-----+-------+------+ | a bad color | 127 | 0 | 0 | +-------------+-----+-------+------+ 1 row in set (0.00 sec) mysql> SET SQL_MODE=TRADITIONAL; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO color_to_decimal (name, red, green, blue) VALUES('a bad color',500,0,0); ERROR 1264 (22003): Out of range value for column 'red' at row 1
As discussed in my presentation Top 20 design tips for Data Architects the UNSIGNED column construct should be always defined unless there is a specific reason not to.
In some respects I would argue that the default for an Integer column should be actually UNSIGNED, and that SIGNED should be specified when you want a sign. Most integer columns generally in schema’s only contain positive numbers. There are of course plenty of examples, positional geo data, financial data, medical data for example.
One could also argue that MySQL should make the default SQL_MODE at least TRADITIONAL, and that only when you want backward compatibility should you then change the SQL_MODE.
This is Part I on SQL_MODE, there are few more interesting cases to discuss at a later time.
About the Author
Ronald Bradford provides Consulting and Advisory Services in Data Architecture, Performance and Scalability for MySQL Solutions. An IT industry professional for two decades with extensive database experience in MySQL, Oracle and Ingres his expertise covers data architecture, software development, migration, performance analysis and production system implementations. His knowledge from 10 years of consulting across many industry sectors, technologies and countries has provided unique insight into being able to provide solutions to problems. For more information Contact Ronald.