The curse of MySQL warnings

MySQL warnings are an anti-pattern when it comes to maintaining data integrity. When the information retrieved from a database does not match what was entered, and this is not identified immediately, this can be permanently lost.

MySQL by default for several decades until the most recent versions enabled you to insert incorrect data, or insert data that was then truncated, or other patterns that resulted in failed data integrity. Very few applications considered handling warnings as errors, and there is a generation of software products that have never informed the developers that warnings were occurring.

The most simplest example is:

CREATE SCHEMA IF NOT EXISTS warnings;
USE warnings;

CREATE TABLE short_name(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY(id)
);

INSERT INTO short_name (name) VALUES ('This name is too long and will get truncated');
ERROR 1406 (22001): Data too long for column 'name' at row 1

This is what you expect would happen. In many, many applications IT DOES NOT.

For almost 20 years the default setting was to support possible data corruption

If you used an older version without setting up a more strict SQL_MODE from the default you end up with.

INSERT INTO short_name (name) VALUES ('This Name is too long and will get truncated');
Query OK, 1 row affected, 1 warning (0.00 sec)

SELECT * FROM short_name;
+----+----------------------+
| id | name                 |
+----+----------------------+
|  1 | This name is too lon |
+----+----------------------+
1 row in set (0.00 sec)

Only if you run SHOW WARNINGS and after the actual SQL statement would you know? There is no other way to find this information in any logs. There is no way to

mysql> SHOW WARNINGS;
+---------+------+-------------------------------------------+
| Level   | Code | Message                                   |
+---------+------+-------------------------------------------+
| Warning | 1265 | Data truncated for column 'name' at row 1 |
+---------+------+-------------------------------------------+
1 row in set (0.00 sec)

Numerous other examples can shock a customer when, after some time, expected data in a production is lost and unretrievable.

If you came from a more strict RDBMS background, or you tuned your MySQL installation or uncovered this and many other poor defaults, you would have improved your data integrity with and improved SQL_MODE.

So MySQL warnings are bad? No, they are ideal when used appropriately. However, the next critical dilemma occurs.

Warnings are valuable when used to identify important characteristics of an SQL statement that a developer or database administrator should be aware of. However, the only way to retrieve these warnings is from the application making the connection to the database at each statement, and generally, these warnings are just lost.

Here are some examples of warnings that are important for the engineering team that define criteria such as deprecation notices, which are important for production database upgrades.

SELECT JSON_MERGE('["a"]','["b"]'); 
Warning (Code 1287): 'JSON_MERGE' is deprecated and will be removed in a future release. Please use JSON_MERGE_PRESERVE/JSON_MERGE_PATCH instead

SELECT ST_GeomFromWKB(Point(0, 0));
Warning: (3195) st_geometryfromwkb(geometry) is deprecated and will be replaced by st_srid(geometry, 0) in a future version. Use st_geometryfromwkb(st_aswkb(geometry), 0) instead.

SELECT DATE('2024-01-01 10:00:00') 
Warning (Code 4096): Delimiter ' ' in position 11 in datetime value '2024-01-01 10:00:00' at row 1 is superfluous and is deprecated. Please remove.

SELECT BINARY 'a' = 'A' 
Warning (Code 1287): 'BINARY expr' is deprecated and will be removed in a future release. Please use CAST instead 

You definitely want to know about these, collect them (hard), add them to your backlog, and don’t leave it until its too late in the I can’t upgrade my database to have to address.

If you want to know about these, collect them (hard), add them to your backlog, and don’t leave it until it’s too late for a critical last-minute upgrade to my database to have to address.

There are also warnings that should be collected and used for performance verification, which apply to running systems. I wanted to show one specific example uncovered during testing of a MySQL upgrade to version 8.0.

Warning (Code 3170): Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.

In fact, this warning occurs in MySQL 5.7, but the customer never knew because they did not look at the warnings. How many other SQL statements in your application produce warnings now? How can you find this out?

It was rather easy to create a reproducible test case but what now?

  • Do you set range_optimizer_max_mem_size=0
  • Do you set to the value you need, which you can identify with SELECT * FROM performance_schema.memory_summary_by_thread_by_event_name WHERE thread_id=PS_CURRENT_THREAD_ID() AND event_name='memory/sql/test_quick_select'\G
  • Do you need to modify your optimizer_switch settings?
  • Do you try something else?
  • Do you refactor your application?
  • Do you just leave it as is?

When you want to consider several different options, which one works best for this query? What about the impact on your entire production workload? Knowing statistically which is the best choice for your full workload and under various conditions is the optimal output, but how?

Next BaseLine was built to perform experiments comparing changes to your data, configuration, and infrastructure to validate the next version of your product statistically performs better than your current version across all of your application at different workloads.

Next BaseLine also provides numerous benefits for a major database upgrade, so I’ve focussed on getting these capabilities to customers quicker to save money. It provides the benefit of detecting SQL statements that produce errors in the next MySQL version, enabling you to categorize and prioritize areas of your application that must be corrected. It also captures important information about the performance and quality of the data from your MySQL queries; this also can help in identifying the most critical aspects of your application to invest engineering time and mitigate risk in your database upgrade plan. It can also collect warning messages such as these discussed when considering migrating from MySQL 5.7 to MySQL 8, or it can just find them with your current application.

What is your pain point with MySQL database upgrades? What are you doing right now to help reduce this additional budget spend? Join our private beta program now to find out more.

Next BaseLine

Helping to create a better and faster next version of your data-driven product

Digital Tech Trek Digest [#Issue 2024.06]

MySQL Belgian Days 2024 and FOSDEM 2024

In this past week, I’ve been able not just to read or watch digital content online but to meet people in person. In Brussels, first at the MySQL Belgian Days 2024 event, followed by FOSDEM 2024.

There was a wide array of presentations covering many different topics; this is just a summary. Fred talked history of Command Line Monitoring and an intro to the new player Dolphie. Dave Stokes talked security, Sunny Bains gave us a brain dump of TiDB scalable architecture. We got an update on PMM and MySQL on k8s from Peter Zaitsev as well as a chat about his new product coroot. And then a great intro to a new generation of online schema change at scale with Sprit by Morgan Tocker. Alex Rubin shows us how not how to hack MySQL, but how MySQL can hack you. We have all crossed paths as MySQL Inc. employees or MySQL community members since 2006.

Marcelo Altmann gave us a detailed intro of a new era of caching with ReadySet. We also heard updates on Vitess. And that was just Day 1 presentations. The evening event was at the incredibly wall-to-wall packed Delirium Café, sponsored by ReadySet, which we offer great thanks and cheers.

Day 2 was packed with great content about MySQL Shell, MySQL Heatwave ML and Vector, MySQL Router, and the MySQL optimizer from many well-known Oracle MySQLers before amazing awards, Belgian beer, and black vodka, of course.

Congratulations Giuseppe Maxia on your MySQL Legends award at MySQL Belgium Days 2024. It is well deserved for all of your community contributions over the decades.

Check out the details at Unveiling the Highlights: A Look Back at MySQL Belgian Days 2024.

Saturday and Sunday were FOSDEM 24 and its usual location. So many people crossing the university, tunnels, and weird transit paths between all the university lecture halls it can feel like a blur. For the first time, I had no fixed agenda so I could check out random talks on random topics.

A shout-out to many people I know and some new people I met. Colin Charles, Alkin Tezuysal, Walter Heck, Charly Batista, Robert Hodges, Jens Bollmann, Monty Widenius, Matthias Crauwels, Michael Pope, Marcelo Altmann, Emerson Gaudencio, Aldo Junior and tons more I have forgotten to mention by name. There were many conversations also with random community people I didn’t even get names, for example, the team at Canonical.

About “Digital Tech Trek Digest”

Most days, I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new about professional and personal topics of interest. I turn what I read into actionable notes in a short, committed time window, summarizing what I learned, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS to name a few.

What does the MySQL mysqlsh util.checkForServerUpgrade() execute

During a recent Aurora MySQL 8 upgrade process, a number of validation checks have failed. This is an analysis of the error message “present in INFORMATION_SCHEMA’s INNODB_SYS_TABLES table but missing from TABLES table”.

Some background

During a Major Upgrade from Aurora MySQL 5.7 to Aurora MySQL 8.0 the cluster instances were left in an incompatible-parameters state. The upgrade-prechecks.log shed some more light on the situation with

{
            "id": "schemaInconsistencyCheck",
            "title": "Schema inconsistencies resulting from file removal or corruption",
            "status": "OK",
            "description": "Error: Following tables show signs that either table datadir directory or frm file was removed/corrupted. Please check server logs, examine datadir to detect the issue and fix it before upgrade",
            "detectedProblems": [
                {
                    "level": "Error",
                    "dbObject": "flinestones.fred",
                    "description": "present in INFORMATION_SCHEMA's INNODB_SYS_TABLES table but missing from TABLES table"
                }
            ]
        }, 

For anonymity the troublesome table here is played by flinestones.fred

This error could be reproduced more quickly with the util.checkForServerUpgrade() check that saves the creation of a snapshot of your cluster, restore from the snapshot cluster, then the launch cluster instance path.

18) Schema inconsistencies resulting from file removal or corruption
  Error: Following tables show signs that either table datadir directory or frm
    file was removed/corrupted. Please check server logs, examine datadir to
    detect the issue and fix it before upgrade

  mysql.rds_heartbeat2 - present in INFORMATION_SCHEMA's INNODB_SYS_TABLES
    table but missing from TABLES table
  flinstones.fred -
    present in INFORMATION_SCHEMA's INNODB_SYS_TABLES table but missing from
    TABLES table 

As I am using the MySQL community mysqlsh tool with a managed AWS RDS MySQL cluster, I have discounted any rds specific messages.

Back to investigating the cause. Some basic spot checks within the Cluster confirmed this mismatch.

mysql > desc flinstones.fred;
ERROR 1146 (42S02): Table flinstones.fred ' doesn't exist

mysql > select * from information_schema.innodb_sys_tables where name = ' flinstones/fred';

*results*
(1 row)

A closer inspection of the Aurora MySQL error log re-iterated there was some issue.

$ aws rds download-db-log-file-portion --db-instance-identifier ${INSTANCE_ID} --log-file-name error/mysql-error-running.log --output text

... 
[Warning] InnoDB: Tablespace 'flinstones/fred' exists in the cache with id 5233285 != 4954605
...

What is this check

It is easy enough to look at the SQL behind this using open-source software, you go to the source and look at the SQL https://github.com/mysql/mysql-shell .. upgrade_check.cc. As the message is near identical to what AWS provides I am making an educated assumption the check is the same.

// clang-format off
std::unique_ptr
Sql_upgrade_check::get_schema_inconsistency_check() {
  return std::make_unique(
      "schemaInconsistencyCheck",
      "Schema inconsistencies resulting from file removal or corruption",
      std::vector{
       "select A.schema_name, A.table_name, 'present in INFORMATION_SCHEMA''s "
       "INNODB_SYS_TABLES table but missing from TABLES table' from (select "
       "distinct "
       replace_in_SQL("substring_index(NAME, '/',1)")
       " as schema_name, "
       replace_in_SQL("substring_index(substring_index(NAME, '/',-1),'#',1)")
       " as table_name from "
       "information_schema.innodb_sys_tables where NAME like '%/%') A left "
       "join information_schema.tables I on A.table_name = I.table_name and "
       "A.schema_name = I.table_schema where A.table_name not like 'FTS_0%' "
       "and (I.table_name IS NULL or I.table_schema IS NULL) and A.table_name "
       "not REGEXP '@[0-9]' and A.schema_name not REGEXP '@[0-9]';"},
      Upgrade_issue::ERROR,
      "Following tables show signs that either table datadir directory or frm "
      "file was removed/corrupted. Please check server logs, examine datadir "
      "to detect the issue and fix it before upgrade");
}

Ok, that’s a little more difficult to read than plain text, and what if I wanted to review other SQL statements this could become tedious.

Gather the SQL statements executed by util.checkForServerUpgrade()

Let’s use a more straightforward means of capturing SQL statements, the MySQL general log.

MYSQL_PASSWD=$(date | md5sum - | cut -c1-20)

docker network create -d bridge mynetwork
docker run --name mysql57 -e MYSQL_ROOT_PASSWORD="${MYSQL_PASSWD}" -d mysql:5.7
docker network connect mynetwork mysql57
docker inspect mysql57 | grep "IPAddress"
IP=$(docker inspect mysql57 | grep '"IPAddress":' | head -1 | cut -d'"' -f4)
docker exec -it mysql57 mysql -uroot -p${MYSQL_PASSWD} -e "SET GLOBAL general_log=1"
docker exec -it mysql57 mysql -uroot -p${MYSQL_PASSWD} -e "SHOW GLOBAL VARIABLES LIKE 'general_log_file'"
GENERAL_LOG_FILE=$(docker exec -it mysql57 mysql -uroot -p${MYSQL_PASSWD} -e "SHOW GLOBAL VARIABLES LIKE 'general_log_file'" | grep general_log_file | cut -d'|' -f3)


docker run --name mysql8 -e "MYSQL_ALLOW_EMPTY_PASSWORD=yes" -d mysql/mysql-server
docker exec -it mysql8 mysqlsh -h${IP} -uroot -p${MYSQL_PASSWD} --js -- util checkForServerUpgrade | tee check.txt

docker exec -it mysql57 grep Query ${GENERAL_LOG_FILE} | cut -c41- | tee check.sql


# Cleanup
docker stop mysql8 && docker rm mysql8
docker stop mysql57 && docker rm mysql57
docker network rm mynetwork

And we are left with the output of util.checkForServerUpgrade() and the SQL of all checks including of said statement:

check.sql

SET NAMES 'utf8mb4'
select current_user()
SELECT PRIVILEGE_TYPE, IS_GRANTABLE FROM INFORMATION_SCHEMA.USER_PRIVILEGES WHERE GRANTEE = '\'root\'@\'%\''
SELECT PRIVILEGE_TYPE, IS_GRANTABLE, TABLE_SCHEMA FROM INFORMATION_SCHEMA.SCHEMA_PRIVILEGES WHERE GRANTEE = '\'root\'@\'%\'' ORDER BY TABLE_SCHEMA
SELECT PRIVILEGE_TYPE, IS_GRANTABLE, TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLE_PRIVILEGES WHERE GRANTEE = '\'root\'@\'%\'' ORDER BY TABLE_SCHEMA, TABLE_NAME
select @@version, @@version_comment, UPPER(@@version_compile_os)
SET show_old_temporals = ON
SELECT table_schema, table_name,column_name,column_type FROM information_schema.columns WHERE column_type LIKE 'timestamp /* 5.5 binary format */'
SET show_old_temporals = OFF
select SCHEMA_NAME, 'Schema name' as WARNING from INFORMATION_SCHEMA.SCHEMATA where SCHEMA_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT TABLE_SCHEMA, TABLE_NAME, 'Table name' as WARNING FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE != 'VIEW' and TABLE_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, 'Column name' as WARNING FROM information_schema.columns WHERE TABLE_SCHEMA not in ('information_schema', 'performance_schema') and COLUMN_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT TRIGGER_SCHEMA, TRIGGER_NAME, 'Trigger name' as WARNING FROM INFORMATION_SCHEMA.TRIGGERS WHERE TRIGGER_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT TABLE_SCHEMA, TABLE_NAME, 'View name' as WARNING FROM INFORMATION_SCHEMA.VIEWS WHERE TABLE_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT ROUTINE_SCHEMA, ROUTINE_NAME, 'Routine name' as WARNING FROM INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
SELECT EVENT_SCHEMA, EVENT_NAME, 'Event name' as WARNING FROM INFORMATION_SCHEMA.EVENTS WHERE EVENT_NAME in ('ADMIN', 'CUBE', 'CUME_DIST', 'DENSE_RANK', 'EMPTY', 'EXCEPT', 'FIRST_VALUE', 'FUNCTION', 'GROUPING', 'GROUPS', 'JSON_TABLE', 'LAG', 'LAST_VALUE', 'LEAD', 'NTH_VALUE', 'NTILE', 'OF', 'OVER', 'PERCENT_RANK', 'PERSIST', 'PERSIST_ONLY', 'RANK', 'RECURSIVE', 'ROW', 'ROWS', 'ROW_NUMBER', 'SYSTEM', 'WINDOW', 'LATERAL', 'ARRAY' ,'MEMBER' )
select SCHEMA_NAME, concat('schema''s default character set: ',  DEFAULT_CHARACTER_SET_NAME) from INFORMATION_SCHEMA.schemata where SCHEMA_NAME not in ('information_schema', 'performance_schema', 'sys') and DEFAULT_CHARACTER_SET_NAME in ('utf8', 'utf8mb3')
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, concat('column''s default character set: ',CHARACTER_SET_NAME) from information_schema.columns where CHARACTER_SET_NAME in ('utf8', 'utf8mb3') and TABLE_SCHEMA not in ('sys', 'performance_schema', 'information_schema', 'mysql')
SELECT TABLE_SCHEMA, TABLE_NAME, 'Table name used in mysql schema in 8.0' as WARNING FROM INFORMATION_SCHEMA.TABLES WHERE LOWER(TABLE_SCHEMA) = 'mysql' and LOWER(TABLE_NAME) IN ('catalogs', 'character_sets', 'collations', 'column_type_elements', 'columns', 'dd_properties', 'events', 'foreign_key_column_usage', 'foreign_keys', 'index_column_usage', 'index_partitions', 'index_stats', 'indexes', 'parameter_type_elements', 'parameters', 'routines', 'schemata', 'st_spatial_reference_systems', 'table_partition_values', 'table_partitions', 'table_stats', 'tables', 'tablespace_files', 'tablespaces', 'triggers', 'view_routine_usage', 'view_table_usage', 'component', 'default_roles', 'global_grants', 'innodb_ddl_log', 'innodb_dynamic_metadata', 'password_history', 'role_edges')
select table_schema, table_name, concat(engine, ' engine does not support native partitioning') from information_schema.Tables where create_options like '%partitioned%' and upper(engine) not in ('INNODB', 'NDB', 'NDBCLUSTER')
select table_schema, table_name, 'Foreign key longer than 64 characters' as description from information_schema.tables where table_name in (select left(substr(id,instr(id,'/')+1), instr(substr(id,instr(id,'/')+1),'_ibfk_')-1) from information_schema.innodb_sys_foreign where length(substr(id,instr(id,'/')+1))>64)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MAXDB sql_mode') from information_schema.routines where find_in_set('MAXDB', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MAXDB sql_mode' from information_schema.EVENTS where find_in_set('MAXDB', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MAXDB sql_mode' from information_schema.TRIGGERS where find_in_set('MAXDB', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MAXDB option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MAXDB', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete DB2 sql_mode') from information_schema.routines where find_in_set('DB2', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete DB2 sql_mode' from information_schema.EVENTS where find_in_set('DB2', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete DB2 sql_mode' from information_schema.TRIGGERS where find_in_set('DB2', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete DB2 option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('DB2', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MSSQL sql_mode') from information_schema.routines where find_in_set('MSSQL', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MSSQL sql_mode' from information_schema.EVENTS where find_in_set('MSSQL', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MSSQL sql_mode' from information_schema.TRIGGERS where find_in_set('MSSQL', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MSSQL option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MSSQL', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MYSQL323 sql_mode') from information_schema.routines where find_in_set('MYSQL323', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MYSQL323 sql_mode' from information_schema.EVENTS where find_in_set('MYSQL323', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MYSQL323 sql_mode' from information_schema.TRIGGERS where find_in_set('MYSQL323', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MYSQL323 option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MYSQL323', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete MYSQL40 sql_mode') from information_schema.routines where find_in_set('MYSQL40', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete MYSQL40 sql_mode' from information_schema.EVENTS where find_in_set('MYSQL40', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete MYSQL40 sql_mode' from information_schema.TRIGGERS where find_in_set('MYSQL40', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete MYSQL40 option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('MYSQL40', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_AUTO_CREATE_USER sql_mode') from information_schema.routines where find_in_set('NO_AUTO_CREATE_USER', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_AUTO_CREATE_USER sql_mode' from information_schema.EVENTS where find_in_set('NO_AUTO_CREATE_USER', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_AUTO_CREATE_USER sql_mode' from information_schema.TRIGGERS where find_in_set('NO_AUTO_CREATE_USER', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_AUTO_CREATE_USER option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_AUTO_CREATE_USER', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_FIELD_OPTIONS sql_mode') from information_schema.routines where find_in_set('NO_FIELD_OPTIONS', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_FIELD_OPTIONS sql_mode' from information_schema.EVENTS where find_in_set('NO_FIELD_OPTIONS', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_FIELD_OPTIONS sql_mode' from information_schema.TRIGGERS where find_in_set('NO_FIELD_OPTIONS', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_FIELD_OPTIONS option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_FIELD_OPTIONS', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_KEY_OPTIONS sql_mode') from information_schema.routines where find_in_set('NO_KEY_OPTIONS', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_KEY_OPTIONS sql_mode' from information_schema.EVENTS where find_in_set('NO_KEY_OPTIONS', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_KEY_OPTIONS sql_mode' from information_schema.TRIGGERS where find_in_set('NO_KEY_OPTIONS', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_KEY_OPTIONS option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_KEY_OPTIONS', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete NO_TABLE_OPTIONS sql_mode') from information_schema.routines where find_in_set('NO_TABLE_OPTIONS', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete NO_TABLE_OPTIONS sql_mode' from information_schema.EVENTS where find_in_set('NO_TABLE_OPTIONS', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete NO_TABLE_OPTIONS sql_mode' from information_schema.TRIGGERS where find_in_set('NO_TABLE_OPTIONS', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete NO_TABLE_OPTIONS option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('NO_TABLE_OPTIONS', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete ORACLE sql_mode') from information_schema.routines where find_in_set('ORACLE', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete ORACLE sql_mode' from information_schema.EVENTS where find_in_set('ORACLE', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete ORACLE sql_mode' from information_schema.TRIGGERS where find_in_set('ORACLE', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete ORACLE option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('ORACLE', variable_value)
select routine_schema, routine_name, concat(routine_type, ' uses obsolete POSTGRESQL sql_mode') from information_schema.routines where find_in_set('POSTGRESQL', sql_mode)
select event_schema, event_name, 'EVENT uses obsolete POSTGRESQL sql_mode' from information_schema.EVENTS where find_in_set('POSTGRESQL', sql_mode)
select trigger_schema, trigger_name, 'TRIGGER uses obsolete POSTGRESQL sql_mode' from information_schema.TRIGGERS where find_in_set('POSTGRESQL', sql_mode)
select concat('global system variable ', variable_name), 'defined using obsolete POSTGRESQL option' as reason from performance_schema.global_variables where variable_name = 'sql_mode' and find_in_set('POSTGRESQL', variable_value)
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, UPPER(DATA_TYPE), COLUMN_TYPE, CHARACTER_MAXIMUM_LENGTH from information_schema.columns where data_type in ('enum','set') and CHARACTER_MAXIMUM_LENGTH > 255 and table_schema not in ('information_schema')
SELECT TABLE_SCHEMA, TABLE_NAME, concat('Partition ', PARTITION_NAME, ' is in shared tablespace ', TABLESPACE_NAME) as description FROM information_schema.PARTITIONS WHERE PARTITION_NAME IS NOT NULL AND (TABLESPACE_NAME IS NOT NULL AND TABLESPACE_NAME!='innodb_file_per_table')
SELECT tablespace_name, concat('circular reference in datafile path: \'', file_name, '\'') FROM INFORMATION_SCHEMA.FILES where file_type='TABLESPACE' and (file_name rlike '[^\\.]/\\.\\./' or file_name rlike '[^\\.]\\\\\\.\\.\\\\')
select table_schema, table_name, '', 'VIEW', UPPER(view_definition) from information_schema.views where table_schema not in ('performance_schema','information_schema','sys','mysql')
select routine_schema, routine_name, '', routine_type, UPPER(routine_definition) from information_schema.routines where routine_schema not in ('performance_schema','information_schema','sys','mysql')
select TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME, 'COLUMN', UPPER(GENERATION_EXPRESSION) from information_schema.columns where extra regexp 'generated' and table_schema not in ('performance_schema','information_schema','sys','mysql')
select TRIGGER_SCHEMA, TRIGGER_NAME, '', 'TRIGGER', UPPER(ACTION_STATEMENT) from information_schema.triggers where TRIGGER_SCHEMA not in ('performance_schema','information_schema','sys','mysql')
select event_schema, event_name, '', 'EVENT', UPPER(EVENT_DEFINITION) from information_schema.events where event_schema not in ('performance_schema','information_schema','sys','mysql')
select table_schema, table_name, 'VIEW', UPPER(view_definition) from information_schema.views where table_schema not in ('performance_schema','information_schema','sys','mysql') and (UPPER(view_definition) like '%ASC%' or UPPER(view_definition) like '%DESC%')
select routine_schema, routine_name, routine_type, UPPER(routine_definition) from information_schema.routines where routine_schema not in ('performance_schema','information_schema','sys','mysql') and (UPPER(routine_definition) like '%ASC%' or UPPER(routine_definition) like '%DESC%')
select TRIGGER_SCHEMA, TRIGGER_NAME, 'TRIGGER', UPPER(ACTION_STATEMENT) from information_schema.triggers where TRIGGER_SCHEMA not in ('performance_schema','information_schema','sys','mysql') and (UPPER(ACTION_STATEMENT) like '%ASC%' or UPPER(ACTION_STATEMENT) like '%DESC%')
select event_schema, event_name, 'EVENT', UPPER(EVENT_DEFINITION) from information_schema.events where event_schema not in ('performance_schema','information_schema','sys','mysql') and (UPPER(event_definition) like '%ASC%' or UPPER(event_definition) like '%DESC%')
select 'global.sql_mode', 'does not contain either NO_ZERO_DATE or NO_ZERO_IN_DATE which allows insertion of zero dates' from (SELECT @@global.sql_mode like '%NO_ZERO_IN_DATE%' and @@global.sql_mode like '%NO_ZERO_DATE%' as zeroes_enabled) as q where q.zeroes_enabled = 0
select 'session.sql_mode', concat(' of ', q.thread_count, ' session(s) does not contain either NO_ZERO_DATE or NO_ZERO_IN_DATE which allows insertion of zero dates') FROM (select count(thread_id) as thread_count from performance_schema.variables_by_thread WHERE variable_name = 'sql_mode' and (variable_value not like '%NO_ZERO_IN_DATE%' or variable_value not like '%NO_ZERO_DATE%')) as q where q.thread_count > 0
select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, concat('column has zero default value: ', COLUMN_DEFAULT) from information_schema.columns where TABLE_SCHEMA not in ('performance_schema','information_schema','sys','mysql') and DATA_TYPE in ('timestamp', 'datetime', 'date') and COLUMN_DEFAULT like '0000-00-00%'
select A.schema_name, A.table_name, 'present in INFORMATION_SCHEMA''s INNODB_SYS_TABLES table but missing from TABLES table' from (select distinct replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(NAME, '/',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as schema_name, replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(NAME, '/',-1),'#',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_name from information_schema.innodb_sys_tables where NAME like '%/%') A left join information_schema.tables I on A.table_name = I.table_name and A.schema_name = I.table_schema where A.table_name not like 'FTS_0%' and (I.table_name IS NULL or I.table_schema IS NULL) and A.table_name not REGEXP '@[0-9]' and A.schema_name not REGEXP '@[0-9]'
select a.table_schema, a.table_name, concat('recognized by the InnoDB engine but belongs to ', a.engine) from information_schema.tables a join (select replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(NAME, '/',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_schema, replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(NAME, '/',-1),'#',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_name from information_schema.innodb_sys_tables where NAME like '%/%') b on a.table_schema = b.table_schema and a.table_name = b.table_name where a.engine != 'Innodb'
FLUSH LOCAL TABLES
SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA not in ('information_schema', 'performance_schema', 'sys')
CHECK TABLE `mysql`.`columns_priv` FOR UPGRADE
CHECK TABLE `mysql`.`db` FOR UPGRADE
CHECK TABLE `mysql`.`engine_cost` FOR UPGRADE
CHECK TABLE `mysql`.`event` FOR UPGRADE
CHECK TABLE `mysql`.`func` FOR UPGRADE
CHECK TABLE `mysql`.`general_log` FOR UPGRADE
CHECK TABLE `mysql`.`gtid_executed` FOR UPGRADE
CHECK TABLE `mysql`.`help_category` FOR UPGRADE
CHECK TABLE `mysql`.`help_keyword` FOR UPGRADE
CHECK TABLE `mysql`.`help_relation` FOR UPGRADE
CHECK TABLE `mysql`.`help_topic` FOR UPGRADE
CHECK TABLE `mysql`.`innodb_index_stats` FOR UPGRADE
CHECK TABLE `mysql`.`innodb_table_stats` FOR UPGRADE
CHECK TABLE `mysql`.`ndb_binlog_index` FOR UPGRADE
CHECK TABLE `mysql`.`plugin` FOR UPGRADE
CHECK TABLE `mysql`.`proc` FOR UPGRADE
CHECK TABLE `mysql`.`procs_priv` FOR UPGRADE
CHECK TABLE `mysql`.`proxies_priv` FOR UPGRADE
CHECK TABLE `mysql`.`server_cost` FOR UPGRADE
CHECK TABLE `mysql`.`servers` FOR UPGRADE
CHECK TABLE `mysql`.`slave_master_info` FOR UPGRADE
CHECK TABLE `mysql`.`slave_relay_log_info` FOR UPGRADE
CHECK TABLE `mysql`.`slave_worker_info` FOR UPGRADE
CHECK TABLE `mysql`.`slow_log` FOR UPGRADE
CHECK TABLE `mysql`.`tables_priv` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_leap_second` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_name` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_transition` FOR UPGRADE
CHECK TABLE `mysql`.`time_zone_transition_type` FOR UPGRADE
CHECK TABLE `mysql`.`user` FOR UPGRADE

check.txt

Cannot set LC_ALL to locale en_US.UTF-8: No such file or directory
WARNING: Using a password on the command line interface can be insecure.
The MySQL server at 172.17.0.3:3306, version 5.7.33 - MySQL Community Server
(GPL), will now be checked for compatibility issues for upgrade to MySQL
8.0.24...

1) Usage of old temporal type
  No issues found

2) Usage of db objects with names conflicting with new reserved keywords
  No issues found

3) Usage of utf8mb3 charset
  No issues found

4) Table names in the mysql schema conflicting with new tables in 8.0
  No issues found

5) Partitioned tables using engines with non native partitioning
  No issues found

6) Foreign key constraint names longer than 64 characters
  No issues found

7) Usage of obsolete MAXDB sql_mode flag
  No issues found

8) Usage of obsolete sql_mode flags
  Notice: The following DB objects have obsolete options persisted for
    sql_mode, which will be cleared during upgrade to 8.0.
  More information:

https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html#mysql-nutshell-removals

  global system variable sql_mode - defined using obsolete NO_AUTO_CREATE_USER
    option

9) ENUM/SET column definitions containing elements longer than 255 characters
  No issues found

10) Usage of partitioned tables in shared tablespaces
  No issues found

11) Circular directory references in tablespace data file paths
  No issues found

12) Usage of removed functions
  No issues found

13) Usage of removed GROUP BY ASC/DESC syntax
  No issues found

14) Removed system variables for error logging to the system log configuration
  To run this check requires full path to MySQL server configuration file to be specified at 'configPath' key of options dictionary
  More information:

https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-13.html#mysqld-8-0-13-logging

15) Removed system variables
  To run this check requires full path to MySQL server configuration file to be specified at 'configPath' key of options dictionary
  More information:

https://dev.mysql.com/doc/refman/8.0/en/added-deprecated-removed.html#optvars-removed

16) System variables with new default values
  To run this check requires full path to MySQL server configuration file to be specified at 'configPath' key of options dictionary
  More information:

https://mysqlserverteam.com/new-defaults-in-mysql-8-0/

17) Zero Date, Datetime, and Timestamp values
  No issues found

18) Schema inconsistencies resulting from file removal or corruption
  No issues found

19) Tables recognized by InnoDB that belong to a different engine
  No issues found

20) Issues reported by 'check table x for upgrade' command
  No issues found

21) New default authentication plugin considerations
  Warning: The new default authentication plugin 'caching_sha2_password' offers
    more secure password hashing than previously used 'mysql_native_password'
    (and consequent improved client connection authentication). However, it also
    has compatibility implications that may affect existing MySQL installations.
    If your MySQL installation must serve pre-8.0 clients and you encounter
    compatibility issues after upgrading, the simplest way to address those
    issues is to reconfigure the server to revert to the previous default
    authentication plugin (mysql_native_password). For example, use these lines
    in the server option file:

    [mysqld]
    default_authentication_plugin=mysql_native_password

    However, the setting should be viewed as temporary, not as a long term or
    permanent solution, because it causes new accounts created with the setting
    in effect to forego the improved authentication security.
    If you are using replication please take time to understand how the
    authentication plugin changes may impact you.
  More information:

https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-caching-sha2-password-compatibility-issues


https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html#upgrade-caching-sha2-password-replication

Errors:   0
Warnings: 1
Notices:  1

No fatal errors were found that would prevent an upgrade, but some potential issues were detected. Please ensure that the reported issues are not significant before upgrading.

The pre-pre SQL check

I now am armed with an simplified single SQL statement. It does of course take a long to run in a cluster with thousands of tables.

select A.schema_name, A.table_name, 
       'present in INFORMATION_SCHEMA''s INNODB_SYS_TABLES table but missing from TABLES table' 
from (select distinct replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(NAME, '/',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as schema_name, 
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(substring_index(substring_index(NAME, '/',-1),'#',1), '@002d', '-'), '@003a', ':'), '@002e', '.'), '@0024', '$'), '@0021', '!'), '@003f', '?'), '@0025', '%'), '@0023', '#'), '@0026', '&'), '@002a', '*'), '@0040', '@')  as table_name
 from information_schema.innodb_sys_tables 
where NAME like '%/%') A 
left join information_schema.tables I on A.table_name = I.table_name and A.schema_name = I.table_schema 
where A.table_name not like 'FTS_0%' 
and (I.table_name IS NULL or I.table_schema IS NULL) 
and A.table_name not REGEXP '@[0-9]' 
and A.schema_name not REGEXP '@[0-9]')

I then performed a number of drop/remove/restart/re-create/discard tablespace steps with no success. As a managed service RDS the only course of action now is to open an AWS Support ticket for help with this specific internal corruption.

Upgrading to AWS Aurora MySQL 8

With Aurora MySQL 8 now generally available to all, you may want to consider the plan for an upgrade path if you would like to take advantage of the new features for your application, for example, Common Table Expressions (CTE). This new major release has a much improved and streamlined upgrade progress from Aurora MySQL 5.7.

This tutorial will provide all the steps to allow you to try out setting up an Aurora cluster and performing an upgrade without the impact on your existing AWS environment. The two pre-requisites to getting started are:

You can find all the CLI cut/paste commands in my AWS Tutorials repo. This will lead you through all of the various AWS dependencies for a successful RDS Aurora cluster including IAM, KMS, VPC and EC2 requirements.

Create an RDS Aurora MySQL Cluster and Aurora MySQL Major upgrade – Aurora 2.x to Aurora 3.x can provide you with a POC of the primary operations path to achieving the goal of this post in under 30 minutes.

While this example will produce an upgraded cluster with some warnings, in real life a more detailed upgrade assessment is needed for any new version of software. The MySQL and Aurora pre-checks can be performed to minimize surprises during the final process of your data migration.

mysqlcheck –check-upgrade and the mysqlsh util.checkForServerUpgrade() pre-checks can help to assist in being prepared and not have your Cluster instances with the incompatible-parameters status. At this point download the upgrade-prechecks.log Aurora Log and trash your cluster and instance. They are unusable. Reviewing the upgrade-prechecks.log can contain more information than mysqlsh util.checkForServerUpgrade() output.

With an Aurora cluster configured with an instance parameter group enabling MySQL binary log replication, it is easy to have a functioning Aurora 5.7 Cluster with real-time replication to an Aurora 8 Cluster to minimize any downtime in your production environment and then benefit from an atomic data dictionary, roles, descending indexes, improved internal temporary table, additional JSON functions, Window Functions, CTEs and more!

More Reading

TDD for Infrastructure

Test Driven Development (TDD) is an important principle for producing quality software. This is not a new concept. The Extreme Programming (XP) agile methodology (1999) outlined the concept before the acronym became more widely accepted as “Another requirement is testability. You must be able to create automated unit and functional tests… You may need to change your system design to be easier to test. Just remember, where there is a will there is a way to test.” Another clear way to describe the hurdles TDD has encountered as a common sense approach is “This is opposed to software development that allows code to be added that is not proven to meet requirements.”

Infrastructure setup is still software. All setup should have adequate testing to ensure at anytime (not just during installation or configuration) any system is in a known state. While Configuration Management (CM) works with the goal of convergence, i.e. ensuring a system is in a known state, testing should be able to validate and identify any non-conformance and not to attempt to correct.

The Bash Automated Test System (BATS) is a known framework for shell scripting. It is very easy to use.

Good habits come from always doing them. Even for a quick test of a running MySQL server via vagrant for a blog post, the automated installation during setup includes validating a simple infrastructure setup via a bats test.

$ tail install.sh

...
sudo mysql -NBe "SHOW GRANTS"
systemctl status mysqld.service
ps -ef | grep mysqld
pidof mysqld
bats /vagrant/mysql8.bats

Rather than having some output that requires a human to read and interpret each line and make a determination, automated it. A good result is:

$ vagrant up
...
    mysql8: ok 1 bats present
    mysql8: ok 2 rpm present
    mysql8: ok 3 openssl present
    mysql8: ok 4 mysql rpm install
    mysql8: ok 5 mysql server command present
    mysql8: ok 6 mysql client command present
    mysql8: ok 7 mysqld running
    mysql8: ok 8 automated mysql access 

A unsuccessful installation is:

$ vagrant provision
...
    mysql8: not ok 8 automated mysql access
    mysql8: # (in test file /vagrant/mysql8.bats, line 60)
    mysql8: #   `[ "${status}" -eq 0 ]' failed
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

$ echo $?
1

This amount of very simple testing and re-execution of testing either via ssh or a re-provision highlighted a bug in the installation script. Anybody that wishes to identify please reach out directly!

...
# Because openssl does not always give you a special character
NEWPASSWD="$(openssl rand -base64 24)+"
mysql -uroot -p${PASSWD} -e "ALTER USER USER() IDENTIFIED BY '${NEWPASSWD}'" --connect-expired-password
# TODO: create mylogin.cnf which is more obfuscated
echo "[mysql]
user=root
password='$NEWPASSWD'" | sudo tee -a /root/.my.cnf
sudo mysql -NBe "SHOW GRANTS"
systemctl status mysqld.service
ps -ef | grep mysqld
pidof mysqld
bats /vagrant/mysql8.bats

A simple trick with a BATS test is to echo any output that will help debug a failing test. If the test succeeds no output is given, if it fails you get the information for free. For example, lets say your test is:

# Note: additional security to both access the server via ssh
#       and accessing sudo should be in place for production systems
@test "automated mysql access" {
  local EXPECTED="${USER}@localhost"
  run sudo mysql -NBe "SELECT USER()"
  [ "${status}" -eq 0 ]
  [ "${output}" = "${EXPECTED}" ]
}

Execution will only provide:

 ✗ automated mysql access
   (in test file /vagrant/mysql8.bats, line 62)
     `[ "${output}" = "${EXPECTED}" ]' failed

What you want to see to more easily identify the problem is:

 ✗ automated mysql access
   (in test file /vagrant/mysql8.bats, line 62)
     `[ "${output}" = "${EXPECTED}" ]' failed
   root@localhost != vagrant@localhost

This echo enables a better and quicker ability to correct the failing test.

...
  [ "${status}" -eq 0 ]
  echo "${output} != ${EXPECTED}"
  [ "${output}" = "${EXPECTED}" ]
...

Testing is only as good as the boundary conditions put in place. Here is an example where your code used a number of environment variables and your testing process performed checks that these variables existed.

@test "EXAMPLE_VAR is defined ${EXAMPLE_VAR}" {
  [ -n "${EXAMPLE_VAR}" ]
}

The code was subsequently refactored and the environment variable was removed. Do you remove the test that checks for its existence? No. You should not ensure the variable is not set, so that any code now or in the future acts as desired.

@test "EXAMPLE_VAR is NOT defined" {
  [ -z "${EXAMPLE_VAR}" ]
}

References:
[1] https://en.wikipedia.org/wiki/Test-driven_development
[2] http://www.extremeprogramming.org/when.html
[3] https://github.com/sstephenson/bats
[4] https://github.com/bats-core/bats-core

Defensive Data Techniques

As a data architect I always ensure that for any database schema change there a fully recoverable execution path.
I have generally advised to create a patch/revert process for every change.  For example, if a change adds a new column or index to a table, a revert script would remove the respective column or index.
The goal is to always have a defensive position for any changes. The concept is that simple, it is not complex.

In its simplest form I use the following directory and file structure.

/schema
    schema.sql
    /patch
        YYYYMMDDXX.sql     where XX,ZZ are sequential 2 digit numbers, e.g. 01,02
        YYYYMMDDZZ.sql
   /revert
       YYYYMMDDXX.sql   This is the same file name in the revert sub-directory.
       YYYYMMDDZZ.sql

At any commit or tag in configuration management it is possible to create a current copy of the schema, i.e. use schema.sql.
It is also possible to take the first version of schema.sql and apply chronologically all the patch scripts to arrive at the same consistent structure of the schema that is in schema.sql. You can also run a validation process to confirm these are equivalent.
For each tagged version or commit of this directory structure and files in version control, this should always hold true.
While not the desired execution path, every revert script can be applied in a reverse chronological order and return to the first version of the schema.
If you want to maintain a first_schema.sql file within the directory structure, you can always create any version of the schema from a given commit in a roll-forward or roll-back scenario.

In reality however this is rarely implemented. There is always divergence or drift. Drift occurs for several primary reasons. The first is non-adherence to the defined process. The second and more critical is the lack of adequate testing and verification at each and every step.  A Test Driven Design (TDD) that validates the given approach would enable a verification of end state of the schema and enable the verification at each accumulated

In addition to each patch/revert there needs to be a state that is maintained of what has been applied.  Generally for RDBMS storing this metadata within a table is recommended.

The above example shows files of .sql extension. Any schema management process needs to cater for .sh, .py or other extensions to cater for more complex operations.
 
What about data changes?  I would recommend that for all configuration information you follow the same management principles as for schema objects, that is you have a patch to insert/update/delete data, and you have a revert script that can restore that data.  Generally the complexity of the rollback process is a hurdle for developers/engineers. Having a framework is important to manage how data consistency is maintained. This framework could generate a statement to restore the data (e.g. a selective mysqldump), require a hand-crafted statement, or leverage the benefit of the RDBMS by storing the data into intermediate shallow tables.

Using a least privileged model complicates an applicable framework approach. Does the user applying the change now require the FILE privilege, or CREATE/DROP privilege to create tables for the ability to restore data.

If there is strict referential integrity at the database level, those protections will defend against unintended consequences. For example, deleting a row that is dependent on a foreign key relationship.  In a normal operating system accommodations are made generally for the sake of performance, but also for supporting poor data cleansing requirements. If the application maintains a level of referential integrity, the schema management process also needs to support this, adding a further complexity.  Ensuring data integrity is an important separate topic. If there is a dangling row, what is the impact? The data still exists, it is just not presented in a user interface or included in calculations. This generally leads to greater unintended consequences that are generally never obvious at the time of execution, but rather days, weeks or months later.

When it comes to objects within the structure of an RDBMS the situation is more complex.  A classic example in MySQL is a user.  A user in MySQL is actually the user definition which is just the username, password and host.  A user contains one or more grants. The user may be the owner of additional objects. Using default and legacy MySQL, it is simply not possible to determine if a user is actually being used. Percona and other variances support INFORMATION_SCHEMA.USER_STATISTICS which is a better method of evaluating the use of a user.  This does however require the intervention of time-based data collection, as this table is the accumulative statistics since an instance restart or flush.

With this type of object, or meta object several defensive techniques exist.  

If you had the user `blargie` and that user had grants to read data from several schemas, is the user used?  I don’t think so, let’s just delete it is not a fact-based approach to avoiding a subsequent problem.
Is the user used? Let’s revoke the users privileges and monitor for errors or user feedback? Or let’s change the user’s password?  With each of these strategies it is important to always have a defensive process to rollback.
A different approach is to use a common data technique of marking information as deleted before it’s physically deleted (think trash can before you empty the trash).  For MySQL users there is no default functionality (in the most recent versions of MySQL you can DISABLE a user).  One implementation to apply this pattern is to rename the user, which has the benefit of keeping the user’s password and privileges intack, therefore reducing the amount of complexity in restoring.

Regardless of the technique, it is important there is always a recovery path.  In a subsequent post I will discuss this approach towards cloud metadata, for example an AWS KMS policy, IAM Rule or ASG setting and the impact of  Infrastructure as a Service (IaaS) such as Terraform.

More reading https://en.wikipedia.org/wiki/Test-driven_development, https://en.wikipedia.org/wiki/Defensive_programming

Identifying MySQL SSL communication using ngrep

Prior to MySQL 5.7 client communications with a MySQL instance were unencrypted by default. This plaintext capability allowed for various tools including pt-query-digest to analyze TCP/IP traffic. Starting with MySQL 5.7 SSL is enabled by default and will be used in client communications if SSL is detected on the server.

We can detect and confirm this easily with the ngrep command.

Default client connection traffic (5.6)

On a MySQL 5.6 server we monitor the default MySQL port traffic.

mysql56$ sudo ngrep -d eth1 -wi -P ' ' -W single -l port 3306
interface: eth1 (192.168.42.0/255.255.255.0)
filter: ( port 3306 ) and ((ip || ip6) || (vlan && (ip || ip6)))
...

We connect to this server using TCP/IP.

host$ mysql -uexternal -p -h192.168.42.16
mysql56> select 'unencrypted';

We can observe the communication to and from the server (in this example 192.168.42.16) is plaintext.

...
#
T 192.168.42.1:47634 -> 192.168.42.16:3306 [AP]      select 'unencrypted'
#
T 192.168.42.16:3306 -> 192.168.42.1:47634 [AP]      !    def    unencrypted  ! !                       unencrypted
#

SSL System Variables (5.6 default)

A default 5.6 installation does not have SSL enabled as verified by system variables.

mysql56 >SHOW  VARIABLES  LIKE '%ssl%';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| have_openssl  | DISABLED |
| have_ssl      | DISABLED |
| ssl_ca        |          |
| ssl_capath    |          |
| ssl_cert      |          |
| ssl_cipher    |          |
| ssl_crl       |          |
| ssl_crlpath   |          |
| ssl_key       |          |
+---------------+----------+
9 rows in set (0.02 sec)

Default client connection traffic (5.7)

Running the same example client connection with MySQL 5.7 you will observe that communications to and from the server (in this example 192.168.42.17) are not in plaintext.

mysql57$ sudo ngrep -d eth1 -wi -P ' ' -W single -l port 3306
interface: eth1 (192.168.42.0/255.255.255.0)
filter: ( port 3306 ) and ((ip || ip6) || (vlan && (ip || ip6)))

host$ mysql -uexternal -p -h192.168.42.17
mysql57> select 'encrypted';


T 192.168.42.1:36781 -> 192.168.42.17:3306 [AP]     @    F   l   d iVr  H   b ^    s t Z      ( 2d   " ?  |   )
#
T 192.168.42.17:3306 -> 192.168.42.1:36781 [AP]     p%  s`   3u5!%P]   v=  r # x   E   a y  '!    )Z    8   Js  z.  \t   (r H@     0 2 5k\    <   M  @)E& b q|q@    h

SSL System Variables (5.7 default)

A new MySQL 5.7 installation will have SSL enabled by default as seen in the MySQL variables.

mysql57 > SHOW GLOBAL VARIABLES LIKE '%ssl%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| have_openssl  | YES             |
| have_ssl      | YES             |
| ssl_ca        | ca.pem          |
| ssl_capath    |                 |
| ssl_cert      | server-cert.pem |
| ssl_cipher    |                 |
| ssl_crl       |                 |
| ssl_crlpath   |                 |
| ssl_key       | server-key.pem  |
+---------------+-----------------+
9 rows in set (0.00 sec)

-no-ssl Client connection traffic (5.7)

If you want to emulate the unencrypted nature of MySQL 5.6 within any SSL enabled MySQL version (e.g. MySQL 5.7) you specify the --ssl option for mysql client connections. In MySQL 5.7 this option is also deprecated and --ssl-mode=disabled should be used>.

host$ > mysql -uexternal -p -h192.168.42.17 --ssl=0

host >select '-ssl=0 unencrypted';

T 192.168.42.1:36785 -> 192.168.42.17:3306 [AP]      select '-ssl=0 unencrypted'
#
T 192.168.42.17:3306 -> 192.168.42.1:36785 [AP]      '    def    -ssl=0 unencrypted  ! 3              -ssl=0 unencrypted

References

https://wiki.christophchamp.com/index.php?title=Ngrep
http://infoheap.com/ngrep-quick-start-guide/
Encrypted Connections - MySQL 5.7 Reference Manual
Implementing MySQL Security Features - Tutorial at Percona Live Europe 2017.

Q: Does MySQL support ACID? A: Yes

I was recently asked this question by an experienced academic at the NY Oracle Users Group event I presented at.

Does MySQL support ACID? (ACID is a set of properties essential for a relational database to perform transactions, i.e. a discrete unit of work.)

Yes, MySQL fully supports ACID, that is Atomicity, Consistency, Isolation and Duration. (*)

This is contrary to the first Google response found searching this question which for reference states “The standard table handler for MySQL is not ACID compliant because it doesn’t support consistency, isolation, or durability”.

The question is however not a simple Yes/No because it depends on timing within the MySQL product’s lifecycle and the version/configuration used in deployment. What is also *painfully* necessary is to understand why this question would even be asked of the most popular open source relational database.

MySQL has a unique characteristic of supporting multiple storage engines. These engines enabling varying ways of storing and retrieving data via the SQL interface in MySQL and have varying features for supporting transactions, locking, index strategies, compression etc. The problem is that the default storage engine from version 3.23 (1999) to 5.1 (2010) was MyISAM, a non-transactional engine, and hence the first point of confusion.

The InnoDB storage engine has been included and supported from MySQL 3.23. This is a transactional engine supporting ACID properties. However, not all of the default settings in the various MySQL versions have fully meet all ACID needs, specifically the durability of data. This is the second point of confusion. Overtime other transactional storage engines in MySQL have come and gone. InnoDB has been there since the start so there is no excuse to not write applications to fully support transactions. The custodianship of Oracle Corporation starting in 2010 quickly corrected this *flaw* by ensuring the default storage engine in MySQL 5.5 is InnoDB. But the damage to the ecosystem that uses MySQL, that is many thousands of open source projects, and the resources that work with MySQL has been done. Recently working on a MySQL 5.5 production system in 2016, the default engine was specifically defined in the configuration defined as MyISAM, and some (but not all tables) were defined using MyISAM. This is a further conversation as to why, is this a upgrade problem? Are there legacy dependencies with applications? Are the decision makers and developers simply not aware of the configuration? Or, are developers simply not comfortable with transactions?

Like other anti-reasonable MySQL defaults the unaware administrator or developer could consider MySQL as supporting ACID properties, however until detailed testing with concurrency and error conditions not realize the impact of poor configuration settings.

The damage of having a non-transactional storage engine as the default for over a decade has created a generation of professionals and applications that abuses one of the primary usages of a relational database, that is a transaction, i.e. to product a unit for work that is all or nothing. Popular open source projects such as WordPress, Drupal and hundreds more have for a long time not supported transactions or used InnoDB. Mediawiki was at least one popular open source project that was proactive towards InnoDB and transaction usage. The millions of plugins, products and startups that build on these technologies have the same flaws.

Further confusion arises when an application uses InnoDB tables but does not use transactions, or the application abuses transactions, for example 3 different transactions that should really be 1.

While newer versions of MySQL 5.6 and 5.7 improve default configurations, until these versions a more commonly implemented non-transactional use in a relational database will remain. A recent Effective MySQL NYC Meetup survey showed that installations of version 5.0 still exist, and that few have a policy for a regular upgrade cadence.

Expired MySQL passwords

I was surprised to find on one of my websites the message “Connect failed: Your password has expired. To log in you must change it using a client that supports expired passwords.

Not knowing that I was using a MySQL password expiry policy I reviewed the 5.7 documentation quickly which *clearly* states “The default default_password_lifetime value is 0, which disables automatic password expiration.”.

I then proceeded to investigate further, my steps are below the following comment.

However, it is always important with MySQL documentation and a new feature (in this case a 5.7 feature) to review release notes when installing versions or to least read ALL the documentation, because you may miss important information, such as.

Note:
From MySQL 5.7.4 to 5.7.10, the default default_password_lifetime value is 360 (passwords must be changed approximately once per year). For those versions, be aware that, if you make no changes to the default_password_lifetime variable or to individual user accounts, all user passwords will expire after 360 days, and all user accounts will start running in restricted mode when this happens. Clients…

I would encourage you to view the MySQL password expiry policy to see the full note. I have only include the intro here are a teaser, because you need to read the entire note.

Analysis

Back to impatient analysis steps.

$ mysql -u admin -p 
*********

SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.7.9-log |
+-----------+

SHOW GLOBAL VARIABLES LIKE 'default_p%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| default_password_lifetime | 360   |
+---------------------------+-------+


SELECT host,user,password_last_changed 
FROM mysql.user
WHERE password_last_changed + INTERVAL @@default_password_lifetime DAY < CURDATE();
+-----------+--------------+-----------------------+
| host      | user         | password_last_changed |
+-----------+--------------+-----------------------+
| localhost | XXX          | 2014-12-01 12:53:36   |
| localhost | XXXXX        | 2014-12-01 12:54:04   |
| localhost | XX_XXXX      | 2015-06-04 11:01:11   |
+-----------+--------------+-----------------------+

Indeed there are some passwords that have expired.

After finding the applicable application credentials I looked at verifying the problem.

$ mysql -uXX_XXXX -p
*******************
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Server version: 5.7.9-log

mysql>

Interesting, there was no error to make a client connection, however.

mysql> use XXXX;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

I then proceeded to change the password with the applicable hint shown.

ALTER USER XX_XXXX@localhost IDENTIFIED BY '*************************';

I chose to reuse the same password because changing the password would require a subsequent code change. MySQL accepted the same password. (A topic for a separate discussion on this point).

A manual verification showed the application and users operating as it should be, so immediate loss of data was averted. Monitoring of the sites home page however did not detect this problem of a partial page error, so should a password expiry policy be used, an applicable check in a regularly scheduled operational task is a good feature request.

All of this could have been avoided if my analysis started with reading the documentation and the note (partly shown above) which has an alternative and potentially more practical immediate solution.

In a firefighting operational mode it can be a priority to correct the problem, however more detailed analysis is prudent to maintain a "Being proactive rather than reactive" mindset. Being a Friday, I feel the old saying "There is more than one way to skin a cat" is applicable.

I am also more familiar with the SET PASSWORD syntax, so reviewing this 5.7 manual page is also a good read to determine what specific syntax is now deprecated and what "ALTER USER is now the preferred statement for assigning passwords" also.

Percona Live Presentation: Improving Performance With Better Indexes

The slides for my Improving Performance With Better Indexes presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss how to identify, review and analyze SQL statements in order to create better indexes for your queries. This includes understanding the EXPLAIN syntax and how to create and identify covering and partial column indexes.

This presentation is based on the work with a customer showing the 95% improvement of a key 15 table join query running 15,000 QPS in a 25 billion SQL statements per day infrastructure.

As mentioned, Explaining the MySQL Explain is an additional presentation that goes into more detail for learning how to read Query Execution Plans (QEP) in MySQL.

What is FTS_BEING_DELETED.ibd

I currently have on a MySQL 5.6 database using innodb_file_per_table the following individual tablespace file.

schema/FTS_00000000000001bb_BEING_DELETED.ibd

The schema is all InnoDB tables, and there ARE NO Full Text Indexes. I cannot comment on if a developer has tried to create one previously.
I am none the wiser in explaining the ongoing use of these files, or if it can be/should be deleted.

On closer inspection there are infact a number of FTS files.

$ ls -al FTS*
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001bb_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001bb_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:00 FTS_00000000000001bb_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001c7_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001c7_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001c7_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001c7_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001c7_DELETED.ibd

Any MySQL gurus with knowledge to share, and for the benefit of others that Internet search at a later time.

Related articles included Overview and Getting Started with InnoDB FTS and Difference between InnoDB FTS and MyISAM FTS but do not mention file specifics.

The article InnoDB Full-text Search in MySQL 5.6 (part 1) provides more insight that these files remain even if a full text index was created and has since being removed. It is not clear from the filename which tables these files relate to.

What SQL is running in MySQL

Using the MySQL 5.6 Performance Schema it is very easy to see what is actually running on your MySQL instance. No more sampling or installing software or worrying about disk I/O performance with techniques like SHOW PROCESSLIST, enabling the general query log or sniffing the TCP/IP stack.

The following SQL is used to give me a quick 60 second view on a running MySQL system of ALL statements executed.

use performance_schema;
update setup_consumers set enabled='YES' where name IN ('events_statements_history','events_statements_current','statements_digest');
truncate table events_statements_current; truncate table events_statements_history; truncate table events_statements_summary_by_digest;
do sleep(60);
select now(),(count_star/(select sum(count_star) FROM events_statements_summary_by_digest) * 100) as pct, count_star, left(digest_text,150) as stmt, digest from events_statements_summary_by_digest order by 2 desc;
update setup_consumers set enabled='NO' where name IN ('events_statements_history','events_statements_current','statements_digest');

NOTE: These statements are for simple debugging and demonstration purposes. If you want to monitor SQL statements on an ongoing basis, you should not simply truncate tables and globally enable/disable options.

There are four performance schema tables that are applicable for looking at initial SQL analysis.

  1. The events_statements_summary_by_digest table shown below gives as the name suggests a way to summarize all queries into a common query pattern (or digest). This is great to get a picture of volume and frequency of SQL statements.
  2. The events_statements_current shows the currently running SQL statements
  3. The events_statements_history shows the fun, because it provides a *short, default 10 threads* history of the SQL statements that have run in any given thread.
  4. The events_statements_history_long (when enabled) gives you a history of the most recent 10,000 events.

One query can give me a detailed review of the type and frequency of ALL SQL statements run. The ALL is important, because on a slave you also get ALL replication applied events.

mysql> select now(),(count_star/(select sum(count_star) FROM events_statements_summary_by_digest) * 100) as pct, count_star, left(digest_text,150) as stmt, digest from events_statements_summary_by_digest order by 2 desc;
select * from events_statements_current where digest='ffb6231b78efc022175650d37a837b99'G
+---------------------+---------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+
| now()               | pct     | count_star | stmt                                                                                                                                                   | digest                           |
+---------------------+---------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+
| 2013-11-07 18:24:46 | 60.6585 |       7185 | SELECT * FROM `D.....` WHERE `name` = ?                                                                                                                | d6399273d75e2348d6d7ea872489a30c |
| 2013-11-07 18:24:46 | 23.4192 |       2774 | SELECT nc . id , nc . name FROM A.................. anc JOIN N........... nc ON anc . ............_id = nc . id WHERE ......._id = ?                   | c6e2249eb91767aa09945cbb118adbb3 |
| 2013-11-07 18:24:46 |  5.5298 |        655 | BEGIN                                                                                                                                                  | 7519b14a899fd514365211a895f5e833 |
| 2013-11-07 18:24:46 |  4.6180 |        547 | INSERT INTO V........ VALUES (...) ON DUPLICATE KEY UPDATE v.... = v.... + ?                                                                           | ffb6231b78efc022175650d37a837b99 |
| 2013-11-07 18:24:46 |  1.0891 |        129 | SELECT COUNT ( * ) FROM T............... WHERE rule = ? AND ? LIKE concat ( pattern , ? )                                                              | 22d984df583adc9a1ac282239e7629e2 |
| 2013-11-07 18:24:46 |  1.0553 |        125 | SELECT COUNT ( * ) FROM T............... WHERE rule = ? AND ? LIKE concat ( ? , pattern , ? )                                                          | a8ee43287bb2ee35e2c144c569a8b2de |
| 2013-11-07 18:24:46 |  0.9033 |        107 | INSERT IGNORE INTO `K......` ( `id` , `k......` ) VALUES (...)                                                                                         | 675e32e9eac555f33df240e80305c013 |
| 2013-11-07 18:24:46 |  0.7936 |         94 | SELECT * FROM `K......` WHERE k...... IN (...)                                                                                                         | 8aa7dc3b6f729aec61bd8d7dfa5978fa |
| 2013-11-07 18:24:46 |  0.4559 |         54 | SELECT COUNT ( * ) FROM D..... WHERE NAME = ? OR NAME = ?                                                                                              | 1975f53832b0c2506de482898cf1fd37 |
| 2013-11-07 18:24:46 |  0.3208 |         38 | SELECT h . * FROM H........ h LEFT JOIN H............ ht ON h . id = ht . ......_id WHERE ht . ........._id = ? ORDER BY h . level ASC                 | ca838db99e40fdeae920f7feae99d19f |
| 2013-11-07 18:24:46 |  0.2702 |         32 | SELECT h . * , ( POW ( ? * ( lat - - ? ) , ? ) + POW ( ? * ( ? - lon ) * COS ( lat / ? ) , ? ) ) AS distance FROM H........ h FORCE INDEX ( lat ) WHER | cd6e32fc0a20fab32662e2b0a282845c |
| 2013-11-07 18:24:46 |  0.1857 |         22 | SELECT h . * , ( POW ( ? * ( lat - ? ) , ? ) + POW ( ? * ( - ? - lon ) * COS ( lat / ? ) , ? ) ) AS distance FROM H........ h FORCE INDEX ( lat ) WHER | a7b43944f5811ef36c0ded7e79793536 |
| 2013-11-07 18:24:46 |  0.0760 |          9 | SELECT h . * , ( POW ( ? * ( lat - ? ) , ? ) + POW ( ? * ( ? - lon ) * COS ( lat / ? ) , ? ) ) AS distance FROM H........ h FORCE INDEX ( lat ) WHERE  | 4ccd8b28ae9e87a9c0b372a58ca22af7 |
| 2013-11-07 18:24:46 |  0.0169 |          2 | SELECT * FROM `K......` WHERE k...... IN (?)                                                                                                           | 44286e824d922d8e2ba6d993584844fb |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SELECT h . * , ( POW ( ? * ( lat - - ? ) , ? ) + POW ( ? * ( - ? - lon ) * COS ( lat / ? ) , ? ) ) AS distance FROM H........ h FORCE INDEX ( lat ) WH | 299095227a67d99824af2ba012b81633 |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SELECT * FROM `H........` WHERE `id` = ?                                                                                                               | 2924ea1d925a6e158397406403a63e3a |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SHOW ENGINE INNODB STATUS                                                                                                                              | 0b04d3acd555401f1cbc479f920b1bac |
| 2013-11-07 18:24:46 |  0.0084 |          1 | DO `sleep` (?)                                                                                                                                         | 3d6e973c2657d0d136bbbdad05e68c7a |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SHOW ENGINE INNODB MUTEX                                                                                                                               | a031f0e6068cb12c5b7508106687c2cb |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SELECT NOW ( ) , ( `count_star` / ( SELECT SUM ( `count_star` ) FROM `events_statements_summary_by_digest` ) * ? ) AS `pct` , `count_star` , LEFT ( `d | 8a9e990cd85d6c42a2e537d04c8c5910 |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SHOW SLAVE STATUS                                                                                                                                      | d2a0ffb1232f2704cef785f030306603 |
| 2013-11-07 18:24:46 |  0.0084 |          1 | TRUNCATE TABLE `events_statements_summary_by_digest`                                                                                                   | a7bef5367816ca771571e648ba963515 |
| 2013-11-07 18:24:46 |  0.0084 |          1 | UPDATE `setup_consumers` SET `enabled` = ? WHERE NAME IN (...)                                                                                         | 8205ea424267a604a3a4f68a76bc0bbb |
| 2013-11-07 18:24:46 |  0.0084 |          1 | SHOW GLOBAL STATUS                                                                                                                                     | ddf94d7d7b176021b8586a3cce1e85c9 |
+---------------------+---------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+

This immediately shows me a single simple application query that is executed 60% of the time. Further review of the data and usage pattern shows that should be cached. This is an immediate improvement on system scalability.

While you can look at the raw performance schema data, using ps_helper from Mark Leith makes live easier using the statement_analysis view because of normalizing timers into human readable formats (check out lock_latency).

mysql> select * from ps_helper.statement_analysis order by exec_count desc limit 10;
+-------------------------------------------------------------------+-----------+------------+-----------+------------+---------------+-------------+-------------+--------------+-----------+---------------+--------------+------------+-----------------+-------------+-------------------+----------------------------------+
| query                                                             | full_scan | exec_count | err_count | warn_count | total_latency | max_latency | avg_latency | lock_latency | rows_sent | rows_sent_avg | rows_scanned | tmp_tables | tmp_disk_tables | rows_sorted | sort_merge_passes | digest                           |
+-------------------------------------------------------------------+-----------+------------+-----------+------------+---------------+-------------+-------------+--------------+-----------+---------------+--------------+------------+-----------------+-------------+-------------------+----------------------------------+
| CREATE VIEW `io_by_thread_by_l ... SUM ( `sum_timer_wait` ) DESC  |           |     146117 |         0 |          0 | 00:01:47.36   | 765.11 ms   | 734.74 us   | 00:01:02.00  |         3 |             0 |            3 |          0 |               0 |           0 |                 0 | c877ec02dce17ea0aca2f256e5b9dc70 |
| SELECT nc . id , nc . name FRO ...  nc . id WHERE ......._id = ?  |           |      41394 |         0 |          0 | 16.85 s       | 718.37 ms   | 407.00 us   | 5.22 s       |    155639 |             4 |       312077 |          0 |               0 |           0 |                 0 | c6e2249eb91767aa09945cbb118adbb3 |
| BEGIN                                                             |           |      16281 |         0 |          0 | 223.24 ms     | 738.82 us   | 13.71 us    | 0 ps         |         0 |             0 |            0 |          0 |               0 |           0 |                 0 | 7519b14a899fd514365211a895f5e833 |
| INSERT INTO V........ VALUES ( ...  KEY UPDATE v.... = v.... + ?  |           |      12703 |         0 |          0 | 1.73 s        | 34.23 ms    | 136.54 us   | 696.50 ms    |         0 |             0 |            0 |          0 |               0 |           0 |                 0 | ffb6231b78efc022175650d37a837b99 |
| SELECT * FROM `D.....` WHERE `name` = ?                           |           |      10620 |         0 |          0 | 3.85 s        | 25.21 ms    | 362.52 us   | 705.16 ms    |         1 |             0 |            1 |          0 |               0 |           0 |                 0 | d6399273d75e2348d6d7ea872489a30c |
| SELECT COUNT ( * ) FROM T..... ... ? LIKE concat ( pattern , ? )  |           |       2830 |         0 |          0 | 1.22 s        | 2.14 ms     | 432.60 us   | 215.62 ms    |      2830 |             1 |       101880 |          0 |               0 |           0 |                 0 | 22d984df583adc9a1ac282239e7629e2 |
| SELECT COUNT ( * ) FROM T..... ... KE concat ( ? , pattern , ? )  |           |       2727 |         0 |          0 | 932.01 ms     | 30.95 ms    | 341.77 us   | 189.47 ms    |      2727 |             1 |        38178 |          0 |               0 |           0 |                 0 | a8ee43287bb2ee35e2c144c569a8b2de |
| INSERT IGNORE INTO `K......` ( `id` , `k......` ) VALUES (...)    |           |       2447 |         0 |          0 | 499.33 ms     | 9.65 ms     | 204.06 us   | 108.28 ms    |         0 |             0 |            0 |          0 |               0 |           0 |                 0 | 675e32e9eac555f33df240e80305c013 |
| SELECT * FROM `K......` WHERE k...... IN (...)                    |           |       2237 |         0 |          0 | 1.58 s        | 62.33 ms    | 704.19 us   | 345.61 ms    |     59212 |            26 |        59212 |          0 |               0 |           0 |                 0 | 8aa7dc3b6f729aec61bd8d7dfa5978fa |
| SELECT COUNT ( * ) FROM D..... WHERE NAME = ? OR NAME = ?         |           |       1285 |         0 |          0 | 797.72 ms     | 131.29 ms   | 620.79 us   | 340.45 ms    |      1285 |             1 |            8 |          0 |               0 |           0 |                 0 | 1975f53832b0c2506de482898cf1fd37 |
+-------------------------------------------------------------------+-----------+------------+-----------+------------+---------------+-------------+-------------+--------------+-----------+---------------+--------------+------------+-----------------+-------------+-------------------+----------------------------------+

Indeed, this simple query highlights a pile of additional information necessary for analysis like:

  1. What is that CREATE VIEW command that’s executed many more times?
  2. In this view, query 2 is executed some 3x more then query 4, yet in my 60 second sample it was 3x less. Has the profile of query load changed. What exactly is being sampled in this view?
  3. The lock_latency shows some incredibility large lock times, over 5 seconds for the top SELECT statement. Is this an outlier. Unfortunately the views give min/avg/max for the total_latency but no breakdown on lock_latency to see how much of a problem this actually is?

A quick note, the statement_analysis_raw view gives you the full SQL statement, so for example the first point listed the statement actually was.

select query from ps_helper.statement_analysis_raw order by exec_count desc limit 1;
CREATE VIEW `io_by_thread_by_latency` AS SELECT IF ( `processlist_id` IS NULL , `SUBSTRING_INDEX` ( NAME , ? , - ? ) , `CONCAT` ( `processlist_user` , ? , `processlist_host` ) ) SYSTEM_USER , SUM ( `count_star` ) `count_star` , `format_time` ( SUM ( `sum_timer_wait` ) ) `total_latency` , `format_time` ( MIN ( `min_timer_wait` ) ) `min_latency` , `format_time` ( AVG ( `avg_timer_wait` ) ) `avg_latency` , `format_time` ( MAX ( `max_timer_wait` ) ) `max_latency` , `thread_id` , `processlist_id` FROM `performance_schema` . `events_waits_summary_by_thread_by_event_name` LEFT JOIN `performance_schema` . `threads` USING ( `thread_id` ) WHERE `event_name` LIKE ? AND `sum_timer_wait` > ? GROUP BY `thread_id` ORDER BY SUM ( `sum_timer_wait` ) DESC

An unexplained connection experience

The “Too many connections” problem is a common issue with applications using excessive permissions (and those that grant said global permissions). MySQL will always grant a user with SUPER privileges access to a DB to investigate the problem with a SHOW PROCESSLIST and where you can check the limits. I however found the following.

mysql> show global variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 2000  |
+-----------------+-------+
1 row in set (0.00 sec)

mysql> show global status like 'max%';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 6637  |
+----------------------+-------+
1 row in set (0.00 sec)

How can the max_used_connection exceed max_connections? This is possible because you can dynamically change max_connections in a normal MySQL environment. However ,this is AWS RDS where you cannot change variables dynamically via mysql client. You can via other command line options but this has not happened. Furthermore, this server is using the defauly.mysql.5.5 parameter group to further validate the claim that it has not been changed.

I do not have an answer for the client in this case.

I would also add this as another ding on the usability of RDS in production environments. I was locked out of the DB for a long time, and with no visibility of what was going on. The only options were wait, or restart the server. RDS does not provide this level of visibility of the processlist using a privileged user that could see what was going on. Perhaps an interface they should consider in future.

Additional DB objects in AWS RDS

To expand on Jervin’s Default RDS Account Privileges, RDS for MySQL provides a number of routines and triggers defined the the ‘mysql’ meta schema. These help in various tasks because the SUPER privilege is not provided.

SELECT routine_schema,routine_name
FROM information_schema.routines;
+----------------+-----------------------------------+
| routine_schema | routine_name                      |
+----------------+-----------------------------------+
| mysql          | rds_collect_global_status_history |
| mysql          | rds_disable_gsh_collector         |
| mysql          | rds_disable_gsh_rotation          |
| mysql          | rds_enable_gsh_collector          |
| mysql          | rds_enable_gsh_rotation           |
| mysql          | rds_kill                          |
| mysql          | rds_kill_query                    |
| mysql          | rds_rotate_general_log            |
| mysql          | rds_rotate_global_status_history  |
| mysql          | rds_rotate_slow_log               |
| mysql          | rds_set_configuration             |
| mysql          | rds_set_gsh_collector             |
| mysql          | rds_set_gsh_rotation              |
| mysql          | rds_show_configuration            |
| mysql          | rds_skip_repl_error               |
+----------------+-----------------------------------+
15 rows in set (0.00 sec)

SELECT trigger_schema, trigger_name,
          CONCAT(event_object_schema,'.',event_object_table) AS table_name,
          CONCAT(action_timing,' ',event_manipulation) AS trigger_action
FROM information_schema.triggers;
+----------------+--------------+------------+----------------+
| trigger_schema | trigger_name | table_name | trigger_action |
+----------------+--------------+------------+----------------+
| mysql          | block_proc_u | mysql.proc | BEFORE UPDATE  |
| mysql          | block_proc_d | mysql.proc | BEFORE DELETE  |
| mysql          | block_user_i | mysql.user | BEFORE INSERT  |
| mysql          | block_user_u | mysql.user | BEFORE UPDATE  |
| mysql          | block_user_d | mysql.user | BEFORE DELETE  |
+----------------+--------------+------------+----------------+

Unexplained (trivial) MySQL behavior

The -N or –skip-column-names is a convenient option with the mysql client to skip the header line of output.
However I found when viewing the output via the terminal, some interesting and unexplained output.

$ mysql -h*** -u*** -p -e "SELECT VARIABLE_NAME,VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS"
+--------------------------------+----------------+
| VARIABLE_NAME                  | VARIABLE_VALUE |
+--------------------------------+----------------+
| ABORTED_CLIENTS                | 710001         |
| ABORTED_CONNECTS               | 31             |
| BINLOG_CACHE_DISK_USE          | 0              |
| BINLOG_CACHE_USE               | 0              |
| BYTES_RECEIVED                 | 2522301004     |
| BYTES_SENT                     | 317785976      |
| COM_ADMIN_COMMANDS             | 2890667        |
| COM_ASSIGN_TO_KEYCACHE         | 0              |
| COM_ALTER_DB                   | 0              |
| COM_ALTER_DB_UPGRADE           | 0              |
$ mysql -N -h*** -u*** -p -e "SELECT VARIABLE_NAME,VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS"
+--------------------------------+--------------+
|                ABORTED_CLIENTS |       710001 |
|               ABORTED_CONNECTS |           31 |
|          BINLOG_CACHE_DISK_USE |            0 |
|               BINLOG_CACHE_USE |            0 |
|                 BYTES_RECEIVED |   2522947764 |
|                     BYTES_SENT |    348838502 |
|             COM_ADMIN_COMMANDS |      2890742 |


As you can see all the values of the first column are right aligned in terminal display. When written to a file (which is the intended outcome), the data is not.

$ mysql ... > /tmp/x
$ head /tmp/x
ABORTED_CLIENTS	710009
ABORTED_CONNECTS	31
BINLOG_CACHE_DISK_USE	0
BINLOG_CACHE_USE	0
BYTES_RECEIVED	2526760299
BYTES_SENT	530046795
COM_ADMIN_COMMANDS	2890742
COM_ASSIGN_TO_KEYCACHE	0
COM_ALTER_DB	0
COM_ALTER_DB_UPGRADE	0

Upgrading to MySQL 5.5 on Ubuntu 10.04 LTS

Ubuntu does not provide an apt-get repository package for MySQL 5.5 on this older OS, however this is still a widely used long term support version. The following steps will upgrade an existing MySQL 5.1 apt-get version to a standard MySQL 5.5 binary.

Step 1. Remove existing MySQL 5.1 retaining data and configuration

sudo su -
service mysql stop
cp -r /etc/mysql /etc/mysql.51
cp -r /var/lib/mysql /var/lib/mysql.51
which mysqld
dpkg -P mysql-server mysql-server-5.1 mysql-server-core-5.1
which mysqld
which mysql
dpkg -P mysql-client-5.1 mysql-client-core-5.1
which mysql
dpkg -P libdbd-mysql-perl libmysqlclient16 mysql-common
# This will not remove /etc/mysql if any other files are in the directory
dpkg -l | grep mysql
[ -d /etc/mysql ] && mv /etc/mysql /etc/mysql.uninstall
cp -r /etc/mysql.51 /etc/mysql

2. Prepare configuration and required directories.

sudo su -
MYCNF="/etc/mysql/my.cnf"
grep basedir ${MYCNF}
sed -ie "s/^basedir.*$/basedir=/opt/mysql/" ${MYCNF}
grep basedir ${MYCNF}
sed -ie "/^[mysqld_safe]/a
skip-syslog" ${MYCNF}
chown -R mysql:mysql /var/lib/mysql
mkdir -p /var/run/mysqld
sudo chown mysql:mysql /var/run/mysqld

Install MySQL 5.5

sudo su -
mkdir -p /opt
cd /opt
# Install MySQL 5.5 Binaries
apt-get install -y libaio-dev  # New 5.5 dependency
wget http://cdn.mysql.com/Downloads/MySQL-5.5/mysql-5.5.28-linux2.6-x86_64.tar.gz
tar xvfz mysql*.tar.gz
ln -s mysql-5.5.28-linux2.6-x86_64 /opt/mysql
echo "export MYSQL_HOME=/opt/mysql
export PATH=$MYSQL_HOME/bin:$PATH" > /etc/profile.d/mysql.sh
chmod +x /etc/profile.d/mysql.sh
. /etc/profile.d/mysql.sh
echo $MYSQL_HOME
echo $PATH
which mysql

4. Upgrade and verify MySQL Instance

su - mysql
cd $MYSQL_HOME
bin/mysqld_safe --skip-syslog &
tail /var/log/mysql/error.log
# There will be some expected ERRORS in error log
bin/mysql_upgrade -uroot
bin/mysqladmin -uroot  shutdown
bin/mysqld_safe --skip-syslog &
tail -100 /var/log/mysql/error.log
mysql -uroot -e "SELECT VERSION();"
bin/mysqladmin -uroot  shutdown
exit

5. Setup MySQL for system use

# As Root
sudo su -
INIT="/etc/init.d/mysqld"
cp /opt/mysql/support-files/mysql.server ${INIT}
sed -ie "s/^basedir=$/basedir=/opt/mysql/;s/^datadir=$/datadir=/var/lib/mysql/" ${INIT}
${INIT} start
mysql -uroot -e "SELECT VERSION();"
${INIT} stop

Not a cool new feature for Master_Host

I was surprised to find on a customer MySQL server this new syntax for Master_host in SHOW SLAVE STATUS.

mysql> SHOW SLAVE STATUSG
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: testdb1.xxx.com or 10.XXX.XX.XXX
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1-354215-bin-log.000005
          Read_Master_Log_Pos: 1624
               Relay_Log_File: db2-354214-relay-log.000001
   

Is this a fancy new Percona Server feature? No. It’s operator error.

We read a little further to find.

mysql> SHOW SLAVE STATUSG
...
             Slave_IO_Running: Connecting

...
                Last_IO_Errno: 2005
                Last_IO_Error: error connecting to master 'repl@ testdb1.xxx.com or 10.XXX.XX.XXX' - retry-time: 60  retries: 86400

How can this be created.
Using MySQL MHA, you get the following message in the output of commands to manage replication.

...
Thu Jan  3 17:06:40 2013 - [info]  All other slaves should start replication from here.
Statement should be: CHANGE MASTER TO MASTER_HOST='testdb1.xxx.com or 10.XXX.XX.XXX', MASTER_PORT=3306,
 MASTER_LOG_FILE='db1-354215-bin-log.000005', MASTER_LOG_POS=1624, MASTER_USER='repl',
MASTER_PASSWORD='xxx';
...

Needless to say, this syntax was taken literately, and MySQL did not complain.

I would suggest here that while MySQL does not do any validation on the value of the MASTER_HOST value in the CHANGE MASTER TO command to ensure it is resolvable it should at least do some validation to ensure the value is either a DNS entry or an IPV4,IPV6 value, that is space ‘ ‘, is not a valid character in these situations.

SQLSTATE[HY000]: General error: 2006 MySQL server has gone away

This would have to be one of the most common MySQL error messages that is misleading to the end user developer. The MySQL Manual page confirms the broad range of possible conditions, but offers little to a PHP developer that does not speak MySQL Geek. I am commonly asked to help solve this issue from a developer.

The problem is that there are several conditions that can cause this error, and a more meaningful explanation to the end user would help in addressing the issue. In general terms, this actually means “Your SQL statement has failed because the connection to the database has been disconnected because of ???”.

Here are a few common situations and how to check for what “???” is.

1. Your MySQL server really did go away.

We can easily check this by looking at the server uptime and the server error log.

$ mysql -uroot -p -e "show global status like 'uptime';"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Uptime        | 68928 |
+---------------+-------+
1 row in set (0.04 sec)
$ tail /var/log/mysql/error.log
130101 22:22:30 InnoDB: Initializing buffer pool, size = 256.0M
130101 22:22:30 InnoDB: Completed initialization of buffer pool
130101 22:22:30 InnoDB: highest supported file format is Barracuda.
130101 22:22:30 InnoDB: 1.1.8 started; log sequence number 63444325509
130101 22:22:30 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
130101 22:22:30 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
130101 22:22:30 [Note] Server socket created on IP: '127.0.0.1'.
130101 22:22:30 [Note] Event Scheduler: Loaded 0 events
130101 22:22:30 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.28-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

In both these cases, the server has been up some time, and there are zero error messages to indicate problems.

If the MySQL server did go away, was it shutdown or did it crash? The MySQL error log will provide the answers. Generally the mysql daemon (mysqld) will be restarted by the mysqld_safe wrapper process.

2. The connection timed out

$ mysql -uroot -p -e "show global variables like '%timeout';"
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| connect_timeout            | 30       |
| delayed_insert_timeout     | 300      |
| innodb_lock_wait_timeout   | 50       |
| innodb_rollback_on_timeout | OFF      |
| interactive_timeout        | 28800    |
| lock_wait_timeout          | 31536000 |
| net_read_timeout           | 30       |
| net_write_timeout          | 60       |
| slave_net_timeout          | 3600     |
| wait_timeout               | 28800    |
+----------------------------+----------+

These values are relatively sane MySQL defaults. If however you have very short timeouts, you may get this error. Here is just one example.

mysql> SET SESSION wait_timeout=5;

## Wait 10 seconds

mysql> SELECT NOW();
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    132361
Current database: *** NONE ***

+---------------------+
| NOW()               |
+---------------------+
| 2013-01-02 11:31:15 |
+---------------------+
1 row in set (0.00 sec)

3. Your SQL statement was killed

Some systems will proactively kill SQL statements that have been running too long. You can easily check if this may be happening proactively by looking at how many KILL statements have been executed.

$ mysql -uroot -p -e "show global status like 'com_kill'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_kill      | 0     |
+---------------+-------+

Not killed this time.

4. Your SQL statement was too large.

A little harder to test and verify, but MySQL uses a maximum packet site for communications between the server and the client. If this includes large fields (for example BLOB columns), you may be getting a termination of your SQL statement due to size.

By default this is relatively small.

mysql> show global variables like 'max_allowed_packet';
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 1048576 |
+--------------------+---------+
1 row in set (0.00 sec)

You can increase, for example to 16M with:

mysql> set global max_allowed_packet=1024*1024*16;
mysql> show global variables like 'max_allowed_packet';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16777216 |
+--------------------+----------+
1 row in set (0.00 sec)

The good news, is this was the cause for the customer today, and now no more errors!

Be sure to keep this value during MySQL restarts.

#my.cnf
[mysqld]
max_allowed_packet = 16M

Joining the Continuent Team

This month I have joined the team at Continuent. No stranger to the MySQL ecosystem, Continuent provides replication and clustering technology for managing data between MySQL, Oracle, PostgreSQL, Vertica and a growing list of data stores.

I have known many of the team at Continuent for some time, and will again be joining Giuseppe Maxia from our days at MySQL Inc/AB starting back in 2006.

I am looking forward to taking the hard work out of administration of MySQL systems with the simplicity of Continuent Tungsten, simplifying tasks including automatic failover, multi-master and geo cluster redundancy to a single command.

Catch me speaking at the upcoming MySQL Connect (San Francisco) and Percona Live (New York) conferences, and where the third book of the Effective MySQL Series Replication Techniques in Depth will also be available for sale.

MySQL client password security

In case you missed it, MySQL 5.6.6, also known as Milestone 9, was recently released. I have yet to install this, however just one part of the MySQL 5.6.6 Release Notes makes placing installing and testing high on my TODO list.

Updated 20 Sep, 2012. Be sure to also read Todd’s post Understanding mysql_config_editor’s security aspects about a more in-depth and accurate description of this new feature. In summary, “It makes secure access via MySQL client applications easier to use”.

That is the reported improvements in password management. From the release notes:

Security Improvements

These security improvements were implemented:

MySQL now provides a method for storing authentication credentials securely in an option file named .mylogin.cnf. To create the file, use the mysql_config_editor utility. The file can be read later by MySQL client programs to obtain authentication credentials for connecting to a MySQL server. mysql_config_editor writes the .mylogin.cnf file using encryption so the credentials are not stored as clear text, and its contents when decrypted by client programs are used only in memory. In this way, passwords can be stored in a file in non-cleartext format and used later without ever needing to be exposed on the command line or in an environment variable. This improves security for interactive use of MySQL client programs, as well as security for noninteractive tasks that require a MySQL password from a file. For more information, see Section 4.6.6, “mysql_config_editor — MySQL Configuration Utility”.

The .mylogin.cnf file can contain multiple sets of options, known as “login paths.” To specify which option group to use from the .mylogin.cnf file for connecting to the server, use the –login-path option. See Section 4.2.3.4, “Command-Line Options that Affect Option-File Handling”.

There are additional improvements and modifications around encryption. Well worth reading about in MySQL 5.6.6 Release Notes.

Recent Presentations in Cali, Colombia

On July 4 I gave two presentations at the OTN Tour Day, and on July 5 I have three presentations at the MySQL Training Days. This was my 3rd visit to Colombia and it was great to see a receptive audience. Thanks to Robin for organizing the events in 2010, 2011 and 2012.

You can download all presentations from the provided links.

I will be speaking at Percona Live New York

Percona Live New York City, October 1 - 2, 2012
Percona is back for a second New York Percona Live Conference. As the resident New York MySQL Expert, I will again be presenting. My session will be on MySQL Backup and Recovery Essentials.

You can only present so much in one hour, and this presentation just touches on the highlights of what is possible. More detailed information about the right backup and recovery strategy and associated tools is available in my current book Effective MySQL: Backup and Recovery.

Encrypting your MySQL backups and more

Assuming you have a backup and recovery strategy in place, how secure is your data? Does a hacker need to obtain access to your production system bypassing all the appropriate security protection you have in place, or just the unencrypted data on the backup server?

Encryption with zNcrypt

The following steps demonstrate how I setup a mysqldump encrypted backup with zNcrypt, a product from Gazzang. You can request a free trial evaluation of the software from http://gazzang.com/request-a-trial. I asked for a AWS EC2 instance, and was able to provide my bootstrap instructions for OS and MySQL installation. Following installation and configuration, the first step is to verify the zNcrypt process is running:

$ sudo ezncrypt-service status
  ezncrypt | Checking system dependencies
** ezncrypt system is UP and running **
       log | File: /var/log/ezncrypt/ezncrypt-service.log

If the process is not running you would find the following error message:

$ sudo ezncrypt-service status
  ezncrypt | Checking system dependencies
** ezncrypt system is NOT running **
       log | File: /var/log/ezncrypt/ezncrypt-service.log

$ sudo ezncrypt-service start
  ezncrypt | Checking system dependencies
  ezncrypt | checking encryption directories
    keymgr | Retrieving key from KSS
           |  > Encryption password retrieved from KSS
  ezncrypt | starting service
           |  > using "aes_256" cipher algorithm
           | done!
    access | Loading access control list
           | done!
  ezncrypt | Thank you for using ezncrypt.
       log | File: /var/log/ezncrypt/ezncrypt-service.log

Under the covers you will find the following attached devices, and no actual processes.

$ df -h
Filesystem ...
...
/var/lib/ezncrypt/storage/encrypted_private
/var/lib/ezncrypt/ezncrypted

$ ps -ef | grep ezn
uid  4947  3327  0 23:15 pts/3    00:00:00 grep ezn

$ ps -ef | grep cry
root        30     2  0 21:41 ?        00:00:00 [ecryptfs-kthrea]
root        31     2  0 21:41 ?        00:00:00 [crypto]
uid  4951  3327  0 23:15 pts/3    00:00:00 grep cry

The first step is to create a backup directory and encrypt all contents that are placed in the directory. ezNcrypt uses the concept of an @category for reference with an encrypted file or directory.

$ mkdir /mysql/backup/encrypted
$ sudo ezncrypt --encrypt @backup /mysql/backup/encrypted
  ezncrypt | Checking system dependencies
           | Verifying ezncrypt license
           | getting information about location
           |   > path: /var/lib/ezncrypt/ezncrypted/backup
  ezncrypt | Checking encryption status
           | done!
    keymgr | Retrieving key from KSS
           |  > Encryption password retrieved from KSS
           | generating keys
           | done!
    backup | backing up data
           | This can take a while. Please be patient
           |  > backing up /mysql/backup/encrypted
           |  > File: /opt/ezncrypt/backup/2012-04-27/encrypted.tar.gz
           | done!
  ezncrypt | encrypting files
           |  > checking disk space
           |  > encrypting /mysql/backup/encrypted
           | done!
  ezncrypt | congratulations. you have encrypted your Files!!
       log | File: /var/log/ezncrypt/ezncrypt.log

The underlying regular directory is now replaced:

$ ls -l /mysql/backup
total 0
lrwxrwxrwx 1 root root 59 2012-04-27 00:03 encrypted -> /var/lib/ezncrypt/ezncrypted/backup//mysql/backup/encrypted

Any attempts to write to this encrypted directory will now fail, even with the Linux super user:

$ mysqldump --all-databases > /mysql/backup/encrypted/edump1.sql
-bash: /mysql/backup/encrypted/edump1.sql: Permission denied

$ sudo mysqldump --all-databases > /mysql/backup/encrypted/edump1.sql
-bash: /mysql/backup/encrypted/edump1.sql: Permission denied

In order to read and write from an encrypted directory you need to grant access controls to a given program, for example mysqldump:

$ sudo ezncrypt-access-control -a "ALLOW @backup * /usr/bin/mysqldump"
passphrase:
salt:
Rule added

You verify the defined access control rules with:

$ sudo ezncrypt-access-control -L
passphrase:
salt:
# -  Type     Category       Path    Process
1    ALLOW    @backup        *       /usr/bin/mysqldump

However, writing with mysqldump still causes an error because it is the shell redirection that is performing the writing, as seen in the system error log:

$ mysqldump --all-databases > /mysql/backup/encrypted/edump1.sql
-bash: /mysql/backup/encrypted/edump1.sql: Permission denied
$ dmesg | tail
[4138848.618559] ezncryptfs: DENIED type="acl" exec="/bin/bash" script="/dev/pts/4" comm="bash" path="/var/lib/ezncrypt/ezncrypted/backup" pid=7448 uid=1000

You can use the –result-file option with mysqldump to enable the process to create the file directly. For example:

$ time mysqldump --all-databases   --result-file=/mysql/backup/encrypted/edump2.sql
real      1m34.714s
user      0m59.388s
sys       0m9.589s

$ sudo ezncrypt-run "ls -l /mysql/backup/encrypted/"
passphrase:
salt:
total 3.0G
-rw-rw-r-- 1 uid gid 2.9G 2012-04-27 02:43 edump2.sql

In this single test, the transparent encryption added only a very nominal overhead to the mysqldump test backup used. You can easily extract the file from the encrypted directory, however that would defeat the purpose of using encryption. The following syntax is shown just to confirm the validity of the encrypted file:

$ sudo /usr/sbin/ezncrypt-run "cp /mysql/backup/encrypted/edump2.sql ."
passphrase:
salt:
$ ls –al edump*
total 3916
-rw-r--r-- 1 uid gid 2.9G 2012-04-27 02:55 edump2.sql


$ grep "^CREATE.*DATABASE" edump2.sql
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `book2` /*!40100 DEFAULT CHARACTER SET latin1 */;
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `employees` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `musicbrainz` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `mysql` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `sakila` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `world_innodb` ...
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `world_myisam` ...

When using correctly configured directories and access controls, the use is truly transparent to the backup process.

Restoring an encrypted file is a little more involved. The best approach is to create a script to perform the work, than encrypt this script. When executed, this script will have the permissions necessary to read and apply the encrypted file.

Perhaps the best tip about using this type of transparent encryption is that it is possible to encrypt the MySQL user and password securely in a plain text configuration file and used with appropriate MySQL client commands. This helps to address another common security problem.

What compression do you use?

The following is an evaluation of various compression utilities that I tested when reviewing the various options for MySQL backup strategies. The overall winner in performance was pigz, a parallel implementation of gzip. If you use gzip today as most organizations do, this one change will improve your backup compression times.

Details of the test:

  • The database is 5.4GB of data
  • mysqldump produces a backup file of 2.9GB
  • The server is an AWS t1.xlarge with a dedicated EBS volume for backups

The following testing was performed to compare the time and % compression savings of various available open source products. This was not an exhaustive test with multiple iterations and different types of data files.

Compression
Utility
Compression Time
(sec)
Decompression Time
(sec)
New Size
(% Saving)
lzo (-3) 21 34 1.5GB (48%)
pigz (-1) 43 33 995MB (64%)
pigz (-3) 56 34 967MB (67%)
gzip (-1) 81 43 995MB (64%)
fastlz 92 128 1.3GB (55%)
pigz [-6] 105 25 902MB (69%)
gzip (-3) 106 43 967MB (67%)
compress 145 36 1.1GB (62%)
pigz (-9) 202 23 893MB (70%)
gzip [-6] 232 78 902MB (69%)
zip 234 50 902MB (69%)
gzip (-9) 405 43 893MB (70%)
bzip2 540 175 757MB (74%)
rzip 11 minutes 360 756MB (74%)
lzo (-9) 20 minutes 82 1.2GB (58%)
7z 33 minutes 122 669MB (77%)
lzip 47 minutes 132 669MB (77%)
lzma 58 minutes 180 639MB (78%)
xz 59 minutes 160 643MB (78%)

Observations

  • The percentage savings and compression time of results will vary depending on the type of data that is stored in the MySQL database.
  • The pigz compression utility was the surprising winner in best compression time producing at least a size of gzip. This was a full 50% faster than gzip.
  • For this compression tests, only one large file was used. Some utilities work much better with many smaller files.

Find our more information of these tests and the results in Effective MySQL: Backup and Recovery

Recent Presentations Buenos Aires MySQL/NoSQL/Cloud Conference

The first annual Latin America MySQL/NoSQL/Cloud Conference was held in Buenos Aires Argentina from June 26-28. Kudos to Santiago Lertora from Binlogic who had the vision for the conference in his country and made it happen. I look forward to the second annual event.

My first presentation was “Improving Performance with Better Indexes”. This presentation details the six steps to SQL performance analysis, Capture, Identify, Confirm, Analyze, Optimize and Verify. An explanation of MySQL EXPLAIN, and working examples to create indexes and better covering indexes in several examples are provided. A production example of a 13 table join is used to detail how covering indexes and partial column indexes can make a dramatic improvement in performance. Download Presentation (PDF).

More detailed information about EXPLAIN and creating indexes is available in book Effective MySQL: Optimizing SQL Statements.

My second presentation was “MySQL Backup and Recovery Essentials”. This presentation covers the most common options for MySQL backup and the respective restore options. Also covered is the importance of the master binary logs and point in time recovery capabilities. Download Presentation (PDF)

More detailed information about the right backup and recovery strategy and associated tools is available in book Effective MySQL: Backup and Recovery.

References

Latin America MySQL/NoSQL/Cloud Conference Program.

UTF-8 with MySQL and LAMP

A recent question on a mailing list was the best practices for UTF-8 and PHP/MySQL. The following are the configurations I used in my multi-language projects.

MySQL UTF-8 Configuration

# my.cnf
[mysqld]
default_character_set = utf8
character_set_client       = utf8
character_set_server       = utf8
[client]
default_character_set = utf8

PHP UTF-8 Configuration

#php.ini
default_charset = "utf-8"

Apache UTF-8 Configuration

#httpd.conf
AddDefaultCharset UTF-8
<VirtualHost>
    AddCharset UTF-8   .htm
</VirtualHost>

HTML file UTF-8 Configuration

 <meta charset="utf-8">

PHP file UTF-8 Configuration

header('Content-type: text/html; charset=UTF-8');

MySQL connection (extra precaution)

SET NAMES utf8;

Shell UTF-8

And last but not least, even editing files in shell can be affected (.e.g UTF-8 data to be inserted into DB from file). Ensure at least

#~/.profile
export LC_CTYPE=en_US.UTF-8
export LANG=en_US.UTF-8

Amateurs – They give us professionals a bad name

Any person with half a brain would see from the error messages below that the MySQL server is not operating optimally, or more specifically the MySQL upgrade has not completely successfully and let users can go happily use the website. It amazing me when web hosting providers tell their paying client that an upgrade has been performed yet they did not have the intelligence to actually look at the error log for confirmation. Got a mysql> prompt, it’s all good. One of the first things I check is the error log.

When will people learn the MySQL error log is a valuable resource both for what it contains, and what it should not contain.

120426 17:36:00 [Note] /usr/libexec/mysqld: Shutdown complete

120426 17:36:00 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
120426 17:36:00 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120426 17:36:00 [Note] Plugin 'FEDERATED' is disabled.
/usr/libexec/mysqld: Table 'mysql.plugin' doesn't exist
120426 17:36:00 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
120426 17:36:00 InnoDB: The InnoDB memory heap is disabled
120426 17:36:00 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120426 17:36:00 InnoDB: Compressed tables use zlib 1.2.3
120426 17:36:00 InnoDB: Using Linux native AIO
120426 17:36:00 InnoDB: Initializing buffer pool, size = 128.0M
120426 17:36:00 InnoDB: Completed initialization of buffer pool
120426 17:36:00 InnoDB: highest supported file format is Barracuda.
120426 17:36:00  InnoDB: Waiting for the background threads to start
120426 17:36:01 InnoDB: 1.1.8 started; log sequence number 232577699
120426 17:36:01 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it
120426 17:36:01 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_current' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_history' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_history_long' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_consumers' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_instruments' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_timers' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'performance_timers' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'threads' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_thread_by_event_name' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_instance' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_global_by_event_name' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_summary_by_event_name' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_summary_by_instance' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'mutex_instances' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'rwlock_instances' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'cond_instances' has the wrong structure
120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_instances' has the wrong structure
120426 17:36:01 [ERROR] Column count of mysql.db is wrong. Expected 22, found 20. Created with MySQL 50077, now running 50523. Please use mysql_upgrade to fix this error.
120426 17:36:01 [ERROR] mysql.user has no `Event_priv` column at position 29
120426 17:36:01 [ERROR] Cannot open mysql.event
120426 17:36:01 [ERROR] Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.
120426 17:36:01 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.23-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL) by Atomicorp
120426 17:46:01 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it
120426 17:46:01 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist
120426 17:46:01 [ERROR] Column count of mysql.proc is wrong. Expected 20, found 16. Created with MySQL 50077, now running 50523. Please use mysql_upgrade to fix this error.

Some more light reading at Have you checked your MySQL error log today? and Monitoring MySQL – The error log

An excellent conference (5 out of 5 stars)

I wanted to extend thanks as others have also about the excellent annual MySQL Conference, now a Percona Live event. This was easily the best run, attended and energetic event in at least the past 3 years. With over a 1000 attendees a well stocked exhibitors hall (and good involvement in the hall), and good talks; there was just a great community vibe. To Terry, Kortney and all Percona staff involved, well done. The event ran on time, I personally did not see or hear of any issues. The only complaint was from many that wanted to attend multiple talks at the same time, another indication of the quality of speakers for the event.

Thank you to those that attended my two sessions on Explaining the MySQL Explain and MySQL Idiosyncrasies that BITE. Many people thanked me after presentations, along also with people coming up to me to say they appreciated the first book of the Effective MySQL Series. My desires to speak and write are only for the benefit of the MySQL community to hopefully learn and appreciate how to best use MySQL.

It was of course great to see many MySQL alumni, and old friends I have seen since meeting at my first MySQL conference in 2006.