Why SQL_MODE is important

Today was another example of where a correct SQL_MODE saved customer data from being corrupted. By default, MySQL does not enforce data integrity. It allows what is called silent truncations where the result of what you INSERT or UPDATE does not represent truth. NOTE: I see very few customers ever have this correctly configured, those that do have actually listened to my advice.

If you do not read any further, your production MySQL environments should be running with at the bare minimum of SQL_MODE=STRICT_ALL_TABLES however I would also advocate for additional SQL_MODE settings.

For this example, some modified undesirable code attempted to reduce a counter by 1, however because of an UNSIGNED data type and a correctly set SQL_MODE, the application produced an error and data was not corrupted.

This is what should happen with your SQL.

mysql> update stats set loss_count=loss_count - 1 where user_id=42;
ERROR 1690 (22003): BIGINT UNSIGNED value is out of range in '(`db`.`stats`.`loss_count` - 1)'

It is interesting to note that the error message actually is misleading. The datatype for the column is SMALLINT however the error message prompted an unnecessary schema verification. Even with the calculation I would have assumed the inherited data type would be the column definition before the subtraction. If you try to set the value automatically to negative you get a better message.

mysql> update stats set loss_count=-1 where user_id=42;
ERROR 1264 (22003): Out of range value for column 'loss_count' at row 1

So why is this a big deal? If your MySQL instance is not using SQL_MODE then the result would have been a value of 0 and this would never highlight the bug in the code.

I should point out that only Drizzle of the MySQL forks and variants has tightened up this data integrity, however I have to agree to disagree with the decision to drop UNSIGNED. While I under internally the code simplification behind the decision, without check constraints in MySQL, UNSIGNED is a saving grace. In this example, Drizzle in this case would not have reported an error.

Tagged with: Databases Drizzle MySQL

Related Posts

Sysbench Under the Covers

Sysbench is a popular open-source benchmarking tool designed to evaluate the performance of system components such as CPU, memory, disk I/O, and databases. It is commonly used for testing MySQL, PostgreSQL, and other databases under different load conditions.

Read more

Tracking new AWS Database Infrastructure Availability

AWS can drop 10+ articles a day just in the What’s New feed. You either need an eagle eye, or luck to keep up if you run multiple AWS database products across multiple regions and managing infrastructure.

Read more

Evaluating Readyset Caching for MySQL

Readyset is a database caching solution for MySQL and PostgreSQL . For applications that have increased load on your primary database, or use scale-out infrastructure to support a high-read workload, ReadySet can be a drop-in solution to address current performance issues.

Read more