The curse of MySQL warnings

MySQL warnings are an anti-pattern when it comes to maintaining data integrity. When the information retrieved from a database does not match what was entered, and this is not identified immediately, this can be permanently lost.

MySQL by default for several decades until the most recent versions enabled you to insert incorrect data, or insert data that was then truncated, or other patterns that resulted in failed data integrity. Very few applications considered handling warnings as errors, and there is a generation of software products that have never informed the developers that warnings were occurring.

The most simplest example is:

CREATE SCHEMA IF NOT EXISTS warnings;
USE warnings;

CREATE TABLE short_name(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY(id)
);

INSERT INTO short_name (name) VALUES ('This name is too long and will get truncated');
ERROR 1406 (22001): Data too long for column 'name' at row 1

This is what you expect would happen. In many, many applications IT DOES NOT.

For almost 20 years the default setting was to support possible data corruption

If you used an older version without setting up a more strict SQL_MODE from the default you end up with.

INSERT INTO short_name (name) VALUES ('This Name is too long and will get truncated');
Query OK, 1 row affected, 1 warning (0.00 sec)

SELECT * FROM short_name;
+----+----------------------+
| id | name                 |
+----+----------------------+
|  1 | This name is too lon |
+----+----------------------+
1 row in set (0.00 sec)

Only if you run SHOW WARNINGS and after the actual SQL statement would you know? There is no other way to find this information in any logs. There is no way to

mysql> SHOW WARNINGS;
+---------+------+-------------------------------------------+
| Level   | Code | Message                                   |
+---------+------+-------------------------------------------+
| Warning | 1265 | Data truncated for column 'name' at row 1 |
+---------+------+-------------------------------------------+
1 row in set (0.00 sec)

Numerous other examples can shock a customer when, after some time, expected data in a production is lost and unretrievable.

If you came from a more strict RDBMS background, or you tuned your MySQL installation or uncovered this and many other poor defaults, you would have improved your data integrity with and improved SQL_MODE.

So MySQL warnings are bad? No, they are ideal when used appropriately. However, the next critical dilemma occurs.

Warnings are valuable when used to identify important characteristics of an SQL statement that a developer or database administrator should be aware of. However, the only way to retrieve these warnings is from the application making the connection to the database at each statement, and generally, these warnings are just lost.

Here are some examples of warnings that are important for the engineering team that define criteria such as deprecation notices, which are important for production database upgrades.

SELECT JSON_MERGE('["a"]','["b"]'); 
Warning (Code 1287): 'JSON_MERGE' is deprecated and will be removed in a future release. Please use JSON_MERGE_PRESERVE/JSON_MERGE_PATCH instead

SELECT ST_GeomFromWKB(Point(0, 0));
Warning: (3195) st_geometryfromwkb(geometry) is deprecated and will be replaced by st_srid(geometry, 0) in a future version. Use st_geometryfromwkb(st_aswkb(geometry), 0) instead.

SELECT DATE('2024-01-01 10:00:00') 
Warning (Code 4096): Delimiter ' ' in position 11 in datetime value '2024-01-01 10:00:00' at row 1 is superfluous and is deprecated. Please remove.

SELECT BINARY 'a' = 'A' 
Warning (Code 1287): 'BINARY expr' is deprecated and will be removed in a future release. Please use CAST instead 

You definitely want to know about these, collect them (hard), add them to your backlog, and don’t leave it until its too late in the I can’t upgrade my database to have to address.

If you want to know about these, collect them (hard), add them to your backlog, and don’t leave it until it’s too late for a critical last-minute upgrade to my database to have to address.

There are also warnings that should be collected and used for performance verification, which apply to running systems. I wanted to show one specific example uncovered during testing of a MySQL upgrade to version 8.0.

Warning (Code 3170): Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.

In fact, this warning occurs in MySQL 5.7, but the customer never knew because they did not look at the warnings. How many other SQL statements in your application produce warnings now? How can you find this out?

It was rather easy to create a reproducible test case but what now?

  • Do you set range_optimizer_max_mem_size=0
  • Do you set to the value you need, which you can identify with SELECT * FROM performance_schema.memory_summary_by_thread_by_event_name WHERE thread_id=PS_CURRENT_THREAD_ID() AND event_name='memory/sql/test_quick_select'\G
  • Do you need to modify your optimizer_switch settings?
  • Do you try something else?
  • Do you refactor your application?
  • Do you just leave it as is?

When you want to consider several different options, which one works best for this query? What about the impact on your entire production workload? Knowing statistically which is the best choice for your full workload and under various conditions is the optimal output, but how?

Next BaseLine was built to perform experiments comparing changes to your data, configuration, and infrastructure to validate the next version of your product statistically performs better than your current version across all of your application at different workloads.

Next BaseLine also provides numerous benefits for a major database upgrade, so I’ve focussed on getting these capabilities to customers quicker to save money. It provides the benefit of detecting SQL statements that produce errors in the next MySQL version, enabling you to categorize and prioritize areas of your application that must be corrected. It also captures important information about the performance and quality of the data from your MySQL queries; this also can help in identifying the most critical aspects of your application to invest engineering time and mitigate risk in your database upgrade plan. It can also collect warning messages such as these discussed when considering migrating from MySQL 5.7 to MySQL 8, or it can just find them with your current application.

What is your pain point with MySQL database upgrades? What are you doing right now to help reduce this additional budget spend? Join our private beta program now to find out more.

Next BaseLine

Helping to create a better and faster next version of your data-driven product

Introducing the MySQL Cloud Service

The MySQL keynote at Oracle Open World 2016 announced the immediate availability of the MySQL Cloud Service, part of the larger Oracle Cloud offering. You can evaluate this now with a trial copy at cloud.oracle.com/mysql. MySQL server product manager Morgan Tocker gave two presentations at the event including a deep dive session.

This is the first release of the MySQL cloud service. As with all first releases there are some highlights and some pipeline features. All major cloud providers have MySQL offerings. AWS RDS (traditional, MAZ and Aurora) GCP Cloud SQL and Azure MySQL App Service. Users of OpenStack have Trove for comparison. I am not going to be evaluating features between cloud offerings in this post.

Highlights

The differentiating highlights as I see them from the presentation. I will provide a followup blog on actual usage at a later time.

  • MySQL 5.7
  • MySQL Enterprise Edition (a key difference with other cloud providers)
    • MySQL Enterprise features like Firewall, Thread Pool, Auditing etc
    • MySQL Enterprise support is included in price
    • MySQL Enterprise Monitor (MEM) is available and included in price
  • SSH access to machine
    • SSH access is a non-privileged user (opc). This shows and intention on security first policy.
  • Separated partitioning in OS/MySQL disk layout
  • ZFS. (Nice, I have missed using this)
  • Optimized partition workloads different for data and sequential logging
  • Two predefined backup policies, ZFS appliance (7 day retention) and cloud storage (30 day retention)
  • The managed backup philosophy is a weekly full backup, and daily incrementals
  • Sane default MySQL configuration (my.cnf)
  • Patching notification and capability. Automated backup before patching, and rollback capability
  • The Ksplice Oracle UEK functionality for improved host uptime with security vulnerabilities or kernel improvements

Overall an A effort on paper in V1 with the willingness to be useful, sane and flexible. In a future post I will evaluate the actual MySQL cloud capabilities with the available trial.

Observations

Features and functionality I see missing from V1 during this presentation. Some are features I would like to see, some are just observations, and some are likely present features but not discussed. I will leave it up the reader to decide which is which.

  • No MySQL 5.6. There was mention of supporting two versions in future moving forward (i.e. 5.7 and 8)
  • Separated MEM configuration and management. See my later thoughts on this.
  • MySQL topologies and easy to apply templates, including the future MySQL InnoDB Cluster
  • A longer archive storage retention capability policy for backups and/or binary logs (e.g. for compliance reasons)
  • The size of the pre-defined dedicated logging partition and binary logging may be problematic for longer retention capacity
  • Provisioned IOPS capacity or performance guarantees for Disk I/O
  • An ability to define MySQL configuration templates (e.g. dev, test, prod, master, slave etc) and be able to select these for a new instance deployment. You can of course manage this after the fact manually.
  • The compute workloads are more generic at present. There appears to be no optimized disk, network or CPU variants.
  • Improved key management being able to select an already defined SSH public key (e.g. with another instance)

Only offering MySQL 5.7 is an adoption impediment. This requires any organization with applications that are not greenfield to have already migrated to MySQL 5.7. I can understand the long-term rationale view here, but I see it as a clear limitation for more rapid adoption.

The details

The MySQL Cloud Service takes the hard parts out of managing MySQL. This is deployed in the Oracle Public Cloud, leveraging the fault-tolerant regional deployments in place. This Database as a Service (PaaS) helps to remove those annoying pieces of administration including backups, patches, monitoring etc. Powered by MySQL 5.7 Enterprise edition (the only cloud provider to offer this), the cloud system version in use is identical to the downloadable on-premise version. The Cloud service offers an initially optimized MySQL configuration of my.cnf to begin with, i.e. improvements on 5.7 defaults, and has variety of compute workload sizes to choose from. Storage is a ZFS appliance, but there is no information on provisioned IOPS for intensive workloads. You can use the web interface or REST API endpoints to create, deploy and manage your cloud instances. The REST API endpoints were not demonstrated in this session.

The predefined disk layout for storage is a very sane configuration. The Operating System (Oracle Unbreakable Linux 6 ) has a dedicated partition, (not part of sizing). There is a dedicated and throughput optimized ZFS LUN for data (what you size with the setup), a dedicated and latency optimized ZFS LUN for binary and InnoDB logs (which appears not initially sizable at present) and a dedicated ZFS LUN for backups. There is also a secondary backup storage capacity by default in Cloud Storage.

The UI interface provides the capability to configure a MEM server and a MEM client. To conserve presentation time Morgan consolidated these into his initial demo instance. I feel there is room here to optimize the initial setup and to separate out the “management” server capabilities, e.g. selecting your MEM configuration, and by default offering just the MEM client authentication (if MEM server is configured). For users not familiar with MySQL Enterprise features separating the definition and management in the initial creation stage is an optimization to remove complexity. There may even be an option for a getting started quick setup step that can provision your MEM setup in a dedicated instance when there is none detected or with a new account. Here is the flip side. An inexperienced user starting out may launch a MEM server with several test instances because the initial UI setup offers these as input fields, this is not the goal when managing multiple servers. The current version of MEM shown was 3.2, with 3.3 planned. Version 3.3. includes it’s own web interface for backup management.

Some things that are not in the initial release but I’m sure are on the roadmap. One is an upsize and downsize optimization. It would appear via the demo, that when a compute size modification occurs, the existing MySQL instance is shutdown and the VM is shutdown. A new VM is provisioned using the setup and disk partitions of the prior VM. An optimization is to provision a new VM, startup MySQL, then stop MySQL on new, stop on old, unmount on old, mount on new, and start MySQL. This removes the downtime in the VM provisioning step. Ideally I’d like to see the capability to perform this on a slave, and promote a slave more seamlessly. Practically however, this has many more moving pieces than in theory and so the future use MySQL router is a solution. The upcoming MySQL InnoDB cluster will also suffer from the complexity of resizing and uptime availability, especially when nodes are of varying compute sizes. As mentioned, I would like to see pre defined MySQL configurations. I would also like the option to pre-create multiple user authentications for instances, rather than having to specific one each time. I can see for a class of servers, e.g. a load test environment of a master/slave setup, and an application with several MySQL accounts, a means of bulk user and permission management.

Under the Hood Morgan talked about the InnoDB IO configuration optimizations, the number of IO Threads, use of O_DIRECT, the redo log size and buffer pool optimized to compute shape. The thread pool is enabled by default. The same considerations are in place for the operating system, Oracle Linux 6 UEK, MySQL task priority, memlock, and ext4 filesystem.

Again, those unfamiliar with MySQL Enterprise features will need greater help and UI help understanding the features, capabilities and configuration of Firewall, Encryption, Authentication, Audit, Monitor, Backup and Thread Pool.

The SSH access is what gives this first release control to be flexible. You can modify the MySQL configuration, incorporate configuration management processes. You can utilize on system database restore capabilities. You can monitor physical resource utilizations. I am unsure of the total control of changing (or breaking the system and the kernel).

There was a lot to digest in the 45 minute practical demonstration session. I am sure as with more reading and evaluation there will be more to share. As the roadmap for MySQL InnoDB cluster develops I can see there being a good cadence on new features and functionality released for the MySQL Cloud Service.

My Live Tweets (as the presentation was happening)

Oracle MySQL Public Cloud landing page