Don’t Assume – Data Integrity

MySQL has the same level of data integrity for numbers and strings as Oracle; when MySQL is correctly configured. By default (a reason I wish I knew why it is still the default), MySQL performs silent conversions on boundary conditions of data that will result in your data not always being what is specified. Let’s look at the following examples to demonstrate default behavior.

For numbers

mysql> DROP TABLE IF EXISTS example;
mysql> CREATE TABLE example(i1  TINYINT, i2 TINYINT UNSIGNED, c1 VARCHAR(5));
mysql> INSERT INTO example (i1) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 1 warning (0.08 sec)
mysql> SELECT * FROM example;
+------+------+------+
| i1   | i2   | c1   |
+------+------+------+
|    1 | NULL | NULL |
|   -1 | NULL | NULL |
|  100 | NULL | NULL |
|  127 | NULL | NULL |
+------+------+------+
4 rows in set (0.00 sec)

As you can see for one value we inserted 500, yet the value of 127 is stored? For this example I have used the TINYINT numeric data type to demonstrate truncation. TINYINT is a 1 byte integer that stores values from -128 to +127. Unlike Oracle, MySQL has 9 different data types for numeric columns, and using these wisely can improve your database disk footprint, for example BIGINT v INT. Is there a big deal?.

MySQL also has a nice feature for numeric data types, the UNSIGNED attribute that ensures only a positive integer or 0 value. Let’s see what happens with this column.

Unsigned

mysql> TRUNCATE TABLE example;
mysql> INSERT INTO example (i2) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 2 warnings (0.00 sec)
mysql> SELECT * FROM example;
+------+------+------+
| i1   | i2   | c1   |
+------+------+------+
| NULL |    1 | NULL |
| NULL |    0 | NULL |
| NULL |  100 | NULL |
| NULL |  255 | NULL |
+------+------+------+
4 rows in set (0.00 sec)

Now you see that -1 and 500 are now not the expected values, and before while 500 was silently truncated to 127, now it’s truncated to 255.

For Strings

As you can now assume, the following also occurs for strings.

mysql> TRUNCATE TABLE example;
mysql> INSERT INTO example (c1) VALUES (NULL),('a'),('abcde'),('xyz12345');
Query OK, 4 rows affected, 1 warning (0.00 sec)
mysql> SELECT * FROM example;
+------+------+-------+
| i1   | i2   | c1    |
+------+------+-------+
| NULL | NULL | NULL  |
| NULL | NULL | a     |
| NULL | NULL | abcde |
| NULL | NULL | xyz12 |
+------+------+-------+
4 rows in set (0.00 sec)

Show warnings

As you can see here, the mysql client shows that warnings occurred, but if you don’t review the warning you would never know, a situation that is rarely reviewed with development in richer programming languages. Let us look at these actual warnings more closely.

mysql> INSERT INTO example (i1) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 1 warning (0.08 sec)
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1264 | Out of range value for column 'i1' at row 4 |
+---------+------+---------------------------------------------+

mysql> INSERT INTO example (i2) VALUES (1), (-1), (100), (500);
Query OK, 4 rows affected, 2 warnings (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1264 | Out of range value for column 'i2' at row 2 |
| Warning | 1264 | Out of range value for column 'i2' at row 4 |
+---------+------+---------------------------------------------+
2 rows in set (0.00 sec)
mysql> INSERT INTO example (c1) VALUES (NULL),('a'),('abcde'),('xyz12345');
Query OK, 4 rows affected, 1 warning (0.00 sec)
mysql> SHOW WARNINGS;
+---------+------+-----------------------------------------+
| Level   | Code | Message                                 |
+---------+------+-----------------------------------------+
| Warning | 1265 | Data truncated for column 'c1' at row 4 |
+---------+------+-----------------------------------------+
1 row in set (0.00 sec)

Using sql_mode

The solution is the sql_mode configuration variable and at minimum the value of STRICT_ALL_TABLES defined. We can demonstrate the expected behavior with the following syntax.

mysql> set SESSION sql_mode=STRICT_ALL_TABLES;
Query OK, 0 rows affected (0.00 sec)

mysql> TRUNCATE TABLE example;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO example (i1) VALUES (1), (-1), (100), (500);
ERROR 1264 (22003): Out of range value for column 'i1' at row 4
mysql> SELECT * FROM example;
+------+------+------+
| i1   | i2   | c1   |
+------+------+------+
|    1 | NULL | NULL |
|   -1 | NULL | NULL |
|  100 | NULL | NULL |
+------+------+------+
3 rows in set (0.00 sec)

As you can see, even with an error for a single INSERT statement, some data was actually stored. You should read Don’t Assume – Transactions for some insights here.

When it comes to dates, there is greater complexity and this is grounds for another entry of this series.

References

About “Don’t Assume”

“Don’t Assume” is a series of posts to help the Oracle DBA understand, use and appreciate the subtle differences and unique characteristics of the MySQL RDBMS in comparison to Oracle. These points as essential to operate MySQL effectively in a production environment and avoid any loss of data or availability.

For more posts in this series be sure to follow the mysql4oracledba tag and also watch out for MySQL for Oracle DBA presentations.

The MySQLCamp for the Oracle DBA is a series of educational talks all Oracle DBA resources should attend. Two presentations from this series IGNITION and LIFTOFF will be presented at the MySQL Users Conference 2010 in Santa Clara, April 2010 This series also includes JUMPSTART and VELOCITY. If you would like to here these presentations in your area, please contact me.

Tags: , , ,

3 Responses to “Don’t Assume – Data Integrity”

  1. Maarten says:

    About the defaults: the defaults are far too conservative, just as their total strategy. They should have had the guts to improve standard behaviour years ago. SQL_MODE itself is also a big WTF. It could’ve been nice as some sort of transition setting for migrating to newer mysql versions, but then again you should deprecate/remove stuff faster or else you’ll drown in keeping backward compatibility in thousands of combinations of legacy settings.

  2. Doug says:

    You should mention innodb_strict_mode as well.

  3. ronald says:

    Doug you are correct, I did not know of this. Thanks

    innodb_strict_mode, available in MySQL binaries from 5.1.38 is applicable ONLY to the InnoDB plugin.

    http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_strict_mode