Upcoming Percona Live 2021 Presentations

I am pleased to have been selected to present at Percona Live 2021 May 12-13. My presentations include talks on AWS RDS Aurora and QLDB managed services.

Understanding AWS RDS Aurora Capabilities

The RDS Aurora MySQL/PostgreSQL capabilities of AWS extend the HA capabilities of RDS read replicas and Multi-AZ.

In this presentation we will discuss the different capabilities and HA configurations with RDS Aurora including:

* RDS Cluster single instance
* RDS Cluster multiple instances (writer + 1 or more readers)
* RDS Cluster multi-master
* RDS Global Cluster
* RDS Cluster options for multi-regions

Each option has its relative merits and limitations. The right choice will depend on your business requirements, global needs and budget.

This presentation will include setup, monitoring and failover evaluations for the attendee, with the goal of providing a feature matrix of when/how to consider each option, as well as some details of the subtle differences Aurora provides.

This presentation is not going to go into the technical details of RDS Aurora’s underlying infrastructure or a feature by feature comparison of AWS RDS to AWS RDS Aurora.

A QLDB Cheatsheet for MySQL Users

Amazon's new ledger database (QLDB) is an auditor's best friend and lives up to the stated description: “Amazon QLDB can be used to track each and every application data change and maintains a complete and verifiable history of changes over time.”

This presentation will go over a MySQL application that provided auditing of activity changes for key data, and how it is being migrated to QLDB.

While QLDB uses a SQL format for DML, and you can perform the traditional INSERT/UPDATE/DELETE/SELECT, the ability to extend these statements to manipulate Amazon Ion data (a superset of JSON) gives you improved data manipulation capabilities, for example with the FROM statement.

Get a blow-by-blow comparison of MySQL structures (multiple tables and lots of columns) and SQL converted into a single QLDB table with an immutable, cryptographically verifiable transaction log. No more triggers, duplicated tables, or extra auditing to detect abuse of binary log activity.

We also cover the simplicity of using X Protocol and JSON output for data migration, and the complexity of AWS RDS not supporting X Protocol.

#WDILTW – Creating examples can be hard

This week I was evaluating AWS QLDB, specifically the verifiable history of changes, to determine how to simplify present processes that perform auditing via CDC. This is not the first time I have looked at QLDB, so there was nothing that new to learn.

What I found was that creating a workable solution with an existing application is hard. Even harder is creating an example to publish in this blog (and the purpose of this post).

First some background.

Using MySQL as the source of information, how can you leverage QLDB? It's easy to stream data from MySQL Aurora, and it's easy to stream data from QLDB, but it is not that easy to place real-time data into QLDB. AWS DMS is a good way to move data from a source to a target; previously my work has included MySQL to MySQL, MySQL to Redshift, and MySQL to Kinesis; however, there is no QLDB target.

Turning the problem upside down, using QLDB as the source of information and streaming to MySQL for compatibility seemed a way forward.

After setting up the QLDB ledger and an example table, it was time to populate it with existing data. The documented reference example looked very JSON compatible. Side bar: it is actually Amazon Ion, a superset of JSON.

INSERT INTO Person
<< {
    'FirstName' : 'Raul',
    'LastName' : 'Lewis',
    'DOB' : `1963-08-19T`,
    'GovId' : 'LEWISR261LL',
    'GovIdType' : 'Driver License',
    'Address' : '1719 University Street, Seattle, WA, 98109'
},
{
    'FirstName' : 'Brent',
    'LastName' : 'Logan',
    'DOB' : `1967-07-03T`,
    'GovId' : 'LOGANB486CG',
    'GovIdType' : 'Driver License',
    'Address' : '43 Stockert Hollow Road, Everett, WA, 98203'
}
>>

Now, MySQL offers the X Protocol. This is something that lefred has evangelized for many years and I have seen presented many times, but finally I had a chance to use it. The MySQL Shell JSON output looked ideal.

{
    "ID": 1523,
    "Name": "Wien",
    "CountryCode": "AUT",
    "District": "Wien",
    "Info": {
        "Population": 1608144
    }
}
{
    "ID": 1524,
    "Name": "Graz",
    "CountryCode": "AUT",
    "District": "Steiermark",
    "Info": {
        "Population": 240967
    }
}

And now, onto some of the things I learned this week.

Using AWS RDS Aurora MySQL is the first stumbling block: X Protocol is not supported. As this was an example, the simple approach was to mysqldump some reference data, load it into a MySQL 8 instance, and extract it into JSON, so as to potentially emulate a pipeline.

Here are my experiences of trying to refactor this into a demo to write up.

Launch a MySQL Docker container as per my standard notes. Harmless, right?

MYSQL_ROOT_PASSWORD="$(date | md5sum | cut -c1-20)#"
echo $MYSQL_ROOT_PASSWORD
docker run --name=qldb-mysql -p3306:3306 -v mysql-volume:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=$MYSQL_ROOT_PASSWORD -d mysql/mysql-server:latest
docker logs qldb-mysql
docker exec -it qldb-mysql /bin/bash

As it's a quick demo, I shortcut credentials to make using the mysql client easier. NOTE: as I always generate a new password for each container, it's safe to include here.

# echo "[mysql]
user=root
password='ab6ea7b0436cbc0c0d49#'" > .my.cnf

# mysql 
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

What the? Did I make a mistake? I test manually and check.

# mysql -u root -p

# cat .my.cnf

Nothing wrong there. Next check

# pwd
/
bash-4.2# grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

And there is the first Dockerism. I don't live in Docker, so these 101 learnings would be known to regulars. First, I really think using "root" by default is a horrible idea. And when you shell in, you are not dropped into the home directory? Solved, we move on.

# mv /.my.cnf /root/.my.cnf

Mock an example as quickly as I can think.

# mysql

mysql> create schema if not exists demo;
Query OK, 1 row affected (0.00 sec)

mysql> use demo;
Database changed
mysql> create table sample(id int unsigned not null auto_increment, name varchar(30) not null, location varchar(30) not null, domain varchar(50) null, primary key(id));
Query OK, 0 rows affected (0.03 sec)
mysql> show create table sample;

mysql> insert into sample values (null,'Demo Row','USA',null), (null,'Row 2','AUS','news.com.au'), (null,'Kiwi','NZ', null);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from sample;
+----+----------+----------+-------------+
| id | name     | location | domain      |
+----+----------+----------+-------------+
|  1 | Demo Row | USA      | NULL        |
|  2 | Row 2    | AUS      | news.com.au |
|  3 | Kiwi     | NZ       | NULL        |
+----+----------+----------+-------------+
3 rows in set (0.00 sec)

Cool, now to look at it in JavaScript using MySQL Shell. Hurdle 2.

# mysqlsh
MySQL Shell 8.0.22

Copyright (c) 2016, 2020, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

 MySQL  JS > var session=mysqlx.getSession('root:ab6ea7b0436cbc0c0d49#@localhost')
mysqlx.getSession: Argument #1: Invalid URI: Illegal character [#] found at position 25 (ArgumentError)


What the? It doesn't like the password format. I'm not a JavaScript person, and, well, this is an example for blogging, which is not what was actually set up, so do it the right way: create a user.
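
(In hindsight, an untested sketch: percent-encoding the offending character in the connection URI would likely also work.)

 MySQL  JS > var session=mysqlx.getSession('root:ab6ea7b0436cbc0c0d49%23@localhost')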

# mysql

mysql> create user demo@localhost identified by 'qldb';
Query OK, 0 rows affected (0.01 sec)

mysql> grant ALL ON sample.* to demo@localhost;
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> SHOW GRANTS FOR demo@localhost;
+----------------------------------------------------------+
| Grants for demo@localhost                                |
+----------------------------------------------------------+
| GRANT USAGE ON *.* TO `demo`@`localhost`                 |
| GRANT ALL PRIVILEGES ON `sample`.* TO `demo`@`localhost` |
+----------------------------------------------------------+
2 rows in set (0.00 sec)

Back into the MySQL Shell, and hurdle 3.

MySQL  JS > var session=mysqlx.getSession('demo:qldb@localhost')
mysqlx.getSession: Access denied for user 'demo'@'127.0.0.1' (using password: YES) (MySQL Error 1045)

Did I create the creds wrong? Verify. No, my password is correct.

#  mysql -udemo -pqldb -e "SELECT NOW()"
mysql: [Warning] Using a password on the command line interface can be insecure.
+---------------------+
| NOW()               |
+---------------------+
| 2021-03-06 23:15:26 |
+---------------------+

I don't have time to debug this. User, take 2.

mysql> drop user demo@localhost;
Query OK, 0 rows affected (0.00 sec)

mysql> create user demo@'%' identified by 'qldb';
Query OK, 0 rows affected (0.01 sec)

mysql> grant all on demo.* to demo@'%'
    -> ;
Query OK, 0 rows affected (0.00 sec)

mysql> show grants;
+--
| Grants for root@localhost                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+---
| GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, SUPER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `root`@`localhost` WITH GRANT OPTION                                                                                                                                                                                                                    |
| GRANT APPLICATION_PASSWORD_ADMIN,AUDIT_ADMIN,BACKUP_ADMIN,BINLOG_ADMIN,BINLOG_ENCRYPTION_ADMIN,CLONE_ADMIN,CONNECTION_ADMIN,ENCRYPTION_KEY_ADMIN,FLUSH_OPTIMIZER_COSTS,FLUSH_STATUS,FLUSH_TABLES,FLUSH_USER_RESOURCES,GROUP_REPLICATION_ADMIN,INNODB_REDO_LOG_ARCHIVE,INNODB_REDO_LOG_ENABLE,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_APPLIER,REPLICATION_SLAVE_ADMIN,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SERVICE_CONNECTION_ADMIN,SESSION_VARIABLES_ADMIN,SET_USER_ID,SHOW_ROUTINE,SYSTEM_USER,SYSTEM_VARIABLES_ADMIN,TABLE_ENCRYPTION_ADMIN,XA_RECOVER_ADMIN ON *.* TO `root`@`localhost` WITH GRANT OPTION |
| GRANT PROXY ON ''@'' TO 'root'@'localhost' WITH GRANT OPTION                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+---
3 rows in set (0.00 sec)

mysql> show grants for demo@'%';
+--------------------------------------------------+
| Grants for demo@%                                |
+--------------------------------------------------+
| GRANT USAGE ON *.* TO `demo`@`%`                 |
| GRANT ALL PRIVILEGES ON `demo`.* TO `demo`@`%`   |
+--------------------------------------------------+
2 rows in set (0.00 sec)

Right, initially I showed the grants of the wrong (not the new) user, but note to self: I should check out the MySQL 8 improved grants. I wonder how RDS MySQL 8 handles these, and how Aurora MySQL 8 will (whenever it drops, but that's another story).

Third time is a charm. It is also nice to see query timing with sub-millisecond granularity.

 MySQL  JS > var session=mysqlx.getSession('demo:qldb@localhost')
 MySQL  JS > var sql='SELECT * FROM demo.sample'
 MySQL  JS > session.sql(sql)
+----+----------+----------+-------------+
| id | name     | location | domain      |
+----+----------+----------+-------------+
|  1 | Demo Row | USA      | NULL        |
|  2 | Row 2    | AUS      | news.com.au |
|  3 | Kiwi     | NZ       | NULL        |
+----+----------+----------+-------------+
3 rows in set (0.0006 sec)

Now to get that in JSON output. NOTE: There are 3 different JSON result formats; json/array matched what I needed.

bash-4.2# mysqlsh
MySQL Shell 8.0.22

Copyright (c) 2016, 2020, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > var session=mysqlx.getSession('demo:qldb@localhost')
 MySQL  JS > var sql='SELECT * FROM demo.sample'
 MySQL  JS > shell.options.set('resultFormat','json/array')
 MySQL  JS > session.sql(sql)
[
{"id":1,"name":"Demo Row","location":"USA","domain":null},
{"id":2,"name":"Row 2","location":"AUS","domain":"news.com.au"},
{"id":3,"name":"Kiwi","location":"NZ","domain":null}
]
3 rows in set (0.0006 sec)

Ok, that works in the interactive interface; I need it scripted.

# vi
bash: vi: command not found
# yum install vi
Loaded plugins: ovl
http://repo.mysql.com/yum/mysql-connectors-community/el/7/x86_64/repodata/repomd.xml: [Errno 14] HTTP Error 403 - Forbidden
Trying other mirror.
...

And another downer of Docker containers: no other tools, nor easy ways to install them. Again, I want to focus on the actual example and not all this preamble, so:

# echo "var session=mysqlx.getSession('demo:qldb@localhost')
var sql='SELECT * FROM demo.sample'
shell.options.set('resultFormat','json/array')
session.sql(sql)" > dump.js


# mysqlsh < dump.js

What the? Hurdle 4. No output. Did I typo this as well? I check the file, cut/paste it interactively, and get what I expect.

# cat dump.js
var session=mysqlx.getSession('demo:qldb@localhost')
var sql='SELECT * FROM demo.sample'
shell.options.set('resultFormat','json/array')
session.sql(sql)
# mysqlsh
MySQL Shell 8.0.22

Copyright (c) 2016, 2020, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
 MySQL  JS > var session=mysqlx.getSession('demo:qldb@localhost')
 MySQL  JS > var sql='SELECT * FROM demo.sample'
 MySQL  JS > shell.options.set('resultFormat','json/array')
 MySQL  JS > session.sql(sql)
[
{"id":1,"name":"Demo Row","location":"USA","domain":null},
{"id":2,"name":"Row 2","location":"AUS","domain":"news.com.au"},
{"id":3,"name":"Kiwi","location":"NZ","domain":null}
]
3 rows in set (0.0022 sec)

This is getting crazy.
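
In hindsight, a likely explanation (an assumption, not verified at the time) is that piping a script runs MySQL Shell in batch mode, where statement results are not automatically displayed the way they are interactively. Forcing interactive behavior should replay the script as if typed:

# mysqlsh --interactive < dump.js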

# echo '[
> {"id":1,"name":"Demo Row","location":"USA","domain":null},
> {"id":2,"name":"Row 2","location":"AUS","domain":"news.com.au"},
> {"id":3,"name":"Kiwi","location":"NZ","domain":null}
> ]' > sample.json
bash-4.2# jq . sample.json
bash: jq: command not found

Oh, the Docker!!! Switching back to my EC2 instance now.

$ echo '[
> {"id":1,"name":"Demo Row","location":"USA","domain":null},
> {"id":2,"name":"Row 2","location":"AUS","domain":"news.com.au"},
> {"id":3,"name":"Kiwi","location":"NZ","domain":null}
> ]' > sample.json
$ jq . sample.json
[
  {
    "id": 1,
    "name": "Demo Row",
    "location": "USA",
    "domain": null
  },
  {
    "id": 2,
    "name": "Row 2",
    "location": "AUS",
    "domain": "news.com.au"
  },
  {
    "id": 3,
    "name": "Kiwi",
    "location": "NZ",
    "domain": null
  }
]

I am now way over the time I would like to spend on this weekly post, and it's getting way too long, and I'm nowhere near showing what I actually want. Still, we trek on.

Boy, this stock EC2 image uses AWS CLI version 1; I'm sure we need v2, and, well, the command does not work!!!

$  aws qldb list-ledgers
ERROR:
$ aws --version

$ curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
$ unzip awscliv2.zip
$ sudo ./aws/install
$ export PATH=/usr/local/bin:$PATH
$ aws --version

Can I finally get a ledger now?

$ aws qldb create-ledger --name demo --tags JIRA=DEMO-5826,Owner=RonaldBradford --permissions-mode ALLOW_ALL --no-deletion-protection
 
{
    "Name": "demo",
    "Arn": "arn:aws:qldb:us-east-1:999:ledger/demo",
    "State": "CREATING",
    "CreationDateTime": "2021-03-06T22:46:41.760000+00:00",
    "DeletionProtection": false
}

$  aws qldb list-ledgers

{
    "Ledgers": [
        {
            "Name": "xx",
            "State": "ACTIVE",
            "CreationDateTime": "2021-03-05T20:12:44.611000+00:00"
        },
        {
            "Name": "demo",
            "State": "ACTIVE",
            "CreationDateTime": "2021-03-06T22:46:41.760000+00:00"
        }
    ]
}

$ aws qldb describe-ledger --name demo
{
    "Name": "demo",
    "Arn": "arn:aws:qldb:us-east-1:999:ledger/demo",
    "State": "ACTIVE",
    "CreationDateTime": "2021-03-06T22:46:41.760000+00:00",
    "DeletionProtection": false
}

Oh, the Python 2 default and the lack of user packaging; more crud in the way of getting to an example.

$ pip install pyqldb==3.1.0
ERROR

$ echo "alias python=python3
alias pip=pip3" >> ~/.bash_profile
source ~/.bash_profile
$ pip --version
pip 9.0.3 from /usr/lib/python3.6/site-packages (python 3.6)

$ python --version
Python 3.6.8

$ pip install pyqldb==3.1.0

ERROR

$ sudo pip install pyqldb==3.1.0

Yeah! After all that, my example code works and data is inserted.

$ cat demo.py
from pyqldb.config.retry_config import RetryConfig
from pyqldb.driver.qldb_driver import QldbDriver

# Configure retry limit to 3
retry_config = RetryConfig(retry_limit=3)

# Initialize the driver
print("Initializing the driver")
qldb_driver = QldbDriver("demo", retry_config=retry_config)


def create_table(transaction_executor, table):

    print("Creating table {}".format(table))
    transaction_executor.execute_statement("Create TABLE {}".format(table))

def create_index(transaction_executor, table, column):
    print("Creating index {}.{}".format(table, column))
    transaction_executor.execute_statement("CREATE INDEX ON {}({})".format(table,column))


def insert_record(transaction_executor, table, values):
    print("Inserting into {}".format(table))
    transaction_executor.execute_statement("INSERT INTO {} ?".format(table),  values)


table="sample"
column="id"
qldb_driver.execute_lambda(lambda executor: create_table(executor, table))
qldb_driver.execute_lambda(lambda executor: create_index(executor, table, column))


record1 = { 'id': "1",
            'name': "Demo Row",
            'location': "USA",
            'domain':  ""
        }

qldb_driver.execute_lambda(lambda x: insert_record(x, table, record1))
$ python demo.py
Initializing the driver
Creating table sample
Creating index sample.id
Inserting into sample

One verifies in the AWS Console, but you cannot show that as text in this blog, so I went to find a simple client, and there is qldbshell.

What the? I installed it and it complains about pyqldb.driver.pooled_qldb_driver. I literally used that package in the last example.

$ pip3 install qldbshell
Collecting qldbshell
  Downloading PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/site-packages/amazon.ion-0.7.0-py3.6-nspkg.pth' -> '/tmp/pip-p8j4d45d-uninstall/usr/local/lib/python3.6/site-packages/amazon.ion-0.7.0-py3.6-nspkg.pth'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3.6/site-packages/pip/commands/install.py", line 365, in run
    strip_file_prefix=options.strip_file_prefix,
  File "/usr/lib/python3.6/site-packages/pip/req/req_set.py", line 783, in install
    requirement.uninstall(auto_confirm=True)
  File "/usr/lib/python3.6/site-packages/pip/req/req_install.py", line 754, in uninstall
    paths_to_remove.remove(auto_confirm)
  File "/usr/lib/python3.6/site-packages/pip/req/req_uninstall.py", line 115, in remove
    renames(path, new_path)
  File "/usr/lib/python3.6/site-packages/pip/utils/__init__.py", line 267, in renames
    shutil.move(old, new)
  File "/usr/lib64/python3.6/shutil.py", line 565, in move
    os.unlink(src)
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/site-packages/amazon.ion-0.7.0-py3.6-nspkg.pth'
[centos@ip-10-204-101-224] ~
$ sudo pip3 install qldbshell
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting qldbshell
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/local/lib/python3.6/site-packages (from botocore<1.21.0,>=1.20.21->boto3>=1.9.237->qldbshell)
Installing collected packages: amazon.ion, qldbshell
  Found existing installation: amazon.ion 0.7.0
    Uninstalling amazon.ion-0.7.0:
      Successfully uninstalled amazon.ion-0.7.0
  Running setup.py install for amazon.ion ... done
  Running setup.py install for qldbshell ... done
Successfully installed amazon.ion-0.5.0 qldbshell-1.2.0


$ sudo pip3 install qldbshell

$ qldbshell
Traceback (most recent call last):
  File "/usr/local/bin/qldbshell", line 11, in <module>
    load_entry_point('qldbshell==1.2.0', 'console_scripts', 'qldbshell')()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 476, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2700, in load_entry_point
    return ep.load()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2318, in load
    return self.resolve()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2324, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/local/lib/python3.6/site-packages/qldbshell/__main__.py", line 25, in <module>
    from pyqldb.driver.pooled_qldb_driver import PooledQldbDriver
ModuleNotFoundError: No module named 'pyqldb.driver.pooled_qldb_driver'
$ pip list qldbshell
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
amazon.ion (0.5.0)
boto3 (1.17.21)
botocore (1.20.21)
ionhash (1.1.0)
jmespath (0.10.0)
pip (9.0.3)
prompt-toolkit (3.0.16)
pyqldb (3.1.0)
python-dateutil (2.8.1)
qldbshell (1.2.0)
s3transfer (0.3.4)
setuptools (39.2.0)
six (1.15.0)
urllib3 (1.26.3)

So, uninstalled and re-installed and voila, my data.

$ qldbshell
usage: qldbshell [-h] [-v] [-s QLDB_SESSION_ENDPOINT] [-r REGION] [-p PROFILE]
                 -l LEDGER
qldbshell: error: the following arguments are required: -l/--ledger
$ qldbshell -l demo

Welcome to the Amazon QLDB Shell version 1.2.0
Use 'start' to initiate and interact with a transaction. 'commit' and 'abort' to commit or abort a transaction.
Use 'start; statement 1; statement 2; commit; start; statement 3; commit' to create transactions non-interactively.
Use 'help' for the help section.
All other commands will be interpreted as PartiQL statements until the 'exit' or 'quit' command is issued.

qldbshell >

qldbshell > SELECT * FROM sample;                                                                                                                           
INFO:
{
 id: "1",
 name: "Demo Row",
 location: "USA",
 domain: ""
}
INFO: (0.1718s)

qldbshell > \q                                                                                                                                              
WARNING: Error while executing query: An error occurred (BadRequestException) when calling the SendCommand operation: Lexer Error: at line 1, column 1: invalid character at, '\' [U+5c];
INFO: (0.1134s)
qldbshell > exit                                                                                                                                            
Exiting QLDB Shell

Right, \q is a mysqlism of the mysql client; I need to rewire myself.

Now I have a ledger, I created an example table, mocked a row of data, and verified it. Now I can just load the sample JSON data I created earlier, right? Wrong!!!

$ cat load.py
import json
from pyqldb.config.retry_config import RetryConfig
from pyqldb.driver.qldb_driver import QldbDriver

# Configure retry limit to 3
retry_config = RetryConfig(retry_limit=3)

# Initialize the driver
print("Initializing the driver")
qldb_driver = QldbDriver("demo", retry_config=retry_config)

def insert_record(transaction_executor, table, values):
  print("Inserting into {}".format(table))
  transaction_executor.execute_statement("INSERT INTO {} ?".format(table),  values)


table="sample"

with open('sample.json') as f:
  data=json.load(f)

qldb_driver.execute_lambda(lambda x: insert_record(x, table, data))

$ python load.py
Traceback (most recent call last):
  File "load.py", line 2, in <module>
    from pyqldb.config.retry_config import RetryConfig
ModuleNotFoundError: No module named 'pyqldb'
[centos@ip-10-204-101-224] ~

Oh sweet, I'd installed that, and used it, and re-installed it.

$ pip list | grep pyqldb
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
[centos@ip-10-204-101-224] ~
$ sudo pip3 install pyqldb
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting pyqldb
Requirement already satisfied: boto3<2,>=1.16.56 in /usr/local/lib/python3.6/site-packages (from pyqldb)
Requirement already satisfied: botocore<2,>=1.19.56 in /usr/local/lib/python3.6/site-packages (from pyqldb)
Requirement already satisfied: ionhash<2,>=1.1.0 in /usr/local/lib/python3.6/site-packages (from pyqldb)
Requirement already satisfied: six in /usr/local/lib/python3.6/site-packages (from amazon.ion<1,>=0.7.0->pyqldb)
Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /usr/local/lib/python3.6/site-packages (from boto3<2,>=1.16.56->pyqldb)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /usr/local/lib/python3.6/site-packages (from boto3<2,>=1.16.56->pyqldb)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/local/lib/python3.6/site-packages (from botocore<2,>=1.19.56->pyqldb)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/local/lib/python3.6/site-packages (from botocore<2,>=1.19.56->pyqldb)
Installing collected packages: amazon.ion, pyqldb
  Found existing installation: amazon.ion 0.5.0
    Uninstalling amazon.ion-0.5.0:
      Successfully uninstalled amazon.ion-0.5.0
  Running setup.py install for amazon.ion ... done
  Running setup.py install for pyqldb ... done
Successfully installed amazon.ion-0.7.0 pyqldb-3.1.0

Load one more time.


$ cat load.py
import json
from pyqldb.config.retry_config import RetryConfig
from pyqldb.driver.qldb_driver import QldbDriver

# Configure retry limit to 3
retry_config = RetryConfig(retry_limit=3)

# Initialize the driver
print("Initializing the driver")
qldb_driver = QldbDriver("demo", retry_config=retry_config)

def insert_record(transaction_executor, table, values):
  print("Inserting into {}".format(table))
  transaction_executor.execute_statement("INSERT INTO {} ?".format(table),  values)


table="sample"

with open('sample.json') as f:
  data=json.load(f)

qldb_driver.execute_lambda(lambda x: insert_record(x, table, data))

$ python load.py
Initializing the driver
Inserting into sample

And done, I've got my JSON-extracted MySQL 8 data in QLDB. I go to vet it in the client and, boy, I didn't expect yet another package screw-up. Clearly these two AWS Python packages have incompatible dependencies. That's a venv need, but I'm now at double my desired time to show this.

$ qldbshell -l demo
Traceback (most recent call last):
  File "/usr/local/bin/qldbshell", line 11, in <module>
    load_entry_point('qldbshell==1.2.0', 'console_scripts', 'qldbshell')()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 476, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2700, in load_entry_point
    return ep.load()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2318, in load
    return self.resolve()
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2324, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/local/lib/python3.6/site-packages/qldbshell/__main__.py", line 25, in <module>
    from pyqldb.driver.pooled_qldb_driver import PooledQldbDriver
ModuleNotFoundError: No module named 'pyqldb.driver.pooled_qldb_driver'
[centos@ip-10-204-101-224] ~
$ pip list | grep qldbshell
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
qldbshell (1.2.0)


$ sudo pip uninstall qldbshell pyqldb

$ sudo pip install qldbshell
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting qldbshell
  Downloading Requirement already satisfied: boto3>=1.9.237 in /usr/local/lib/python3.6/site-packages (from qldbshell)
Requirement already satisfied: amazon.ion<0.6.0,>=0.5.0 in /usr/local/lib/python3.6/site-packages (from qldbshell)
Requirement already satisfied: prompt_toolkit<3.1.0,>=3.0.5 in /usr/local/lib/python3.6/site-packages (from qldbshell)
Requirement already satisfied: ionhash~=1.1.0 in /usr/local/lib/python3.6/site-packages (from qldbshell)
Requirement already satisfied: s3transfer<0.4.0,>=0.3.0 in /usr/local/lib/python3.6/site-packages (from boto3>=1.9.237->qldbshell)
Requirement already satisfied: botocore<1.21.0,>=1.20.21 in /usr/local/lib/python3.6/site-packages (from boto3>=1.9.237->qldbshell)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /usr/local/lib/python3.6/site-packages (from boto3>=1.9.237->qldbshell)
Requirement already satisfied: six in /usr/local/lib/python3.6/site-packages (from amazon.ion<0.6.0,>=0.5.0->qldbshell)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.6/site-packages (from prompt_toolkit<3.1.0,>=3.0.5->qldbshell)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/local/lib/python3.6/site-packages (from botocore<1.21.0,>=1.20.21->boto3>=1.9.237->qldbshell)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/local/lib/python3.6/site-packages (from botocore<1.21.0,>=1.20.21->boto3>=1.9.237->qldbshell)
Installing collected packages: qldbshell
  Running setup.py install for qldbshell ... done
Successfully installed qldbshell-1.2.0
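
Rather than this uninstall/reinstall dance, isolating qldbshell in its own virtual environment would sidestep the dependency clash; a minimal sketch (paths are illustrative):

$ python3 -m venv ~/qldbshell-venv
$ source ~/qldbshell-venv/bin/activate
$ pip install qldbshell
$ qldbshell -l demo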

Can I see my data now


$ qldbshell -l demo

Welcome to the Amazon QLDB Shell version 1.2.0
Use 'start' to initiate and interact with a transaction. 'commit' and 'abort' to commit or abort a transaction.
Use 'start; statement 1; statement 2; commit; start; statement 3; commit' to create transactions non-interactively.
Use 'help' for the help section.
All other commands will be interpreted as PartiQL statements until the 'exit' or 'quit' command is issued.

qldbshell > select * from sample;                                                                                                                           
INFO:
{
 id: 1,
 name: "Demo Row",
 location: "USA",
 domain: null
},
{
 id: 1,
 name: "Demo Row",
 location: "USA",
 domain: null
},
{
 id: "1",
 name: "Demo Row",
 location: "USA",
 domain: ""
},
{
 id: 3,
 name: "Kiwi",
 location: "NZ",
 domain: null
},
{
 id: 2,
 name: "Row 2",
 location: "AUS",
 domain: "news.com.au"
},
{
 id: 3,
 name: "Kiwi",
 location: "NZ",
 domain: null
},
{
 id: 2,
 name: "Row 2",
 location: "AUS",
 domain: "news.com.au"
}
INFO: (0.0815s)

And yes, data, though I see it's duplicated, so I must have, somewhere in between the 10 steps, run the load twice. This does highlight a known limitation of QLDB: no unique constraints.

But wait, that data is not really correct; I don't want null. I go back to the JSON to see that the MySQL Shell gave me that.

$ jq . sample.json
[
  {
    "id": 1,
    "name": "Demo Row",
    "location": "USA",
    "domain": null
  },
...

At some point I also got this load error, but by now I've given up documenting how to do something, in order to demonstrate something.

NameError: name 'null' is not defined

One has to wrap the only nullable column with IFNULL(domain,'') AS domain and redo all those steps again. This is not going to be practical for a wider table, having to wrap every nullable column with IFNULL.
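
For this demo table, the adjusted extraction is a one-line change (a sketch; the json/array output step is as before):

 MySQL  JS > var sql="SELECT id, name, location, IFNULL(domain,'') AS domain FROM demo.sample"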

However, having exhausted all this time on what was supposed to be a few quiet weekend hours, my post is way too long, and I've learned that "creating examples can be hard".

#WDILTW – What can I run from my AWS Aurora database

When you work with AWS Aurora you have limited admin privileges. There are some different grants for MySQL, including SELECT INTO S3 and LOAD FROM S3, that replace the lost functionality of SELECT INTO OUTFILE and mysqldump/mysqlimport using a delimited format. While I know and use Lambda capabilities, I have never executed anything with INVOKE LAMBDA directly from the database.

This week I found out about INVOKE COMPREHEND (I had to look that product up), and INVOKE SAGEMAKER (which I have used independently). These are machine learning capabilities that enable you to build custom integrations using Comprehend and SageMaker. I did not have any chance to evaluate these capabilities, so I am unable to share any use cases or experiences. There are two built-in Comprehend functions, aws_comprehend_detect_sentiment() and aws_comprehend_detect_sentiment_confidence(), a likely future starting place. SageMaker is invoked as an extension of CREATE FUNCTION that provides the ALIAS AWS_SAGEMAKER_INVOKE_ENDPOINT syntax.
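
For reference, the shape of the SageMaker integration is a CREATE FUNCTION of this form (the function name, arguments and endpoint here are illustrative, not a tested example):

CREATE FUNCTION predict_churn (state VARCHAR(2), total_calls INT)
RETURNS FLOAT
ALIAS AWS_SAGEMAKER_INVOKE_ENDPOINT
ENDPOINT NAME 'churn-prediction-endpoint';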

Also available are some MySQL status variables including Aurora_ml_logical_response_cnt, Aurora_ml_actual_request_cnt, Aurora_ml_actual_response_cnt, Aurora_ml_cache_hit_cnt, Aurora_ml_single_request_cnt.

Some googling found an interesting simple example: calculating the positive/negative sentiment and confidence of sentences of text. I could see this being useful for analyzing comments. I've included the example from this site here to encourage my readers to take a look, as I plan to do. Post-IAM configuration, I will be really curious to evaluate the responsiveness of this example. Is this truly a batch-only operation, or could you return a meaningful response in a timely fashion?
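
A representative form of that example (a sketch; table and column names are illustrative, and the Aurora-to-Comprehend IAM configuration is assumed):

SELECT comment_text,
       aws_comprehend_detect_sentiment(comment_text, 'en') AS sentiment,
       aws_comprehend_detect_sentiment_confidence(comment_text, 'en') AS confidence
FROM product_comments
LIMIT 10;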

This also led to bookmarking for reading https://awsauroralabsmy.com/, https://github.com/aws-samples/amazon-aurora-labs-for-mysql/ and https://squidfunk.github.io/mkdocs-material/ all from this one page.

#WDILTW – To use an RDBMS is to use a transaction

I learned this week that 30+ years of Relational Database Management System (RDBMS) experience still does not prepare you for the disappointment of working with organizations that use an RDBMS (MySQL specifically), have a released production product, and have dozens to hundreds of developers, team leaders and architects, but do not know the importance of, nor use, transactions. If I were to ask about this when interviewing somebody who would work with a database, and the response was that it is not important, or not used these days, it would be a hard fail.

To use an RDBMS is to understand a very simple principle, the foundation of a transactional system: ACID. Atomicity, Consistency, Isolation, Durability.

A simplistic description of this 50-year-old principle:

  • A – It’s all or nothing
  • C – Your work meets all existing constraints
  • I – Your work is independent and not affected by other concurrent work
  • D – Your work is reproducible in the event of a hard failure

A – Atomicity means a transaction. It’s not rocket science; it’s actually in the description of an RDBMS. What is a transaction? It is a means, in Structured Query Language (SQL) notation, to ensure your unit of work is all or nothing.

The most basic of examples is very simple. You want to deposit a check that was given to you. For the bank to accept this deposit, it needs to ensure that the account holder that issued the check has sufficient funds. That is Step 1: subtract amount X from account A, and Step 2: deposit amount X into account B. You cannot do Step 1 or Step 2 independently; you must do 1 and 2 together, otherwise real money could be lost or created from thin air. The lack of transactions can also be compounded by not handling I – Isolation.
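
In SQL notation, a minimal sketch (table and column names are illustrative):

START TRANSACTION;
UPDATE account SET balance = balance - 100 WHERE account_id = 'A';
UPDATE account SET balance = balance + 100 WHERE account_id = 'B';
COMMIT;

If either UPDATE fails, a ROLLBACK returns both accounts to their prior state; without the transaction, a failure between the two statements loses or creates money.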

We have to ask our educators and responsible professionals why. Why can such a bedrock principle not be used? Why, in a team of individuals in a company, across years of development and support, does not one single individual say: you have to use a transaction to ensure the integrity of the data you are working with? It’s called a database for a reason.

The unfortunate reality is there are several reasons for this utter failure, and for the frustration of professionals like myself.

  1. The RDBMS is not the bedrock of data management it was 20, 30, even 50 years ago. Today, with a proliferation of different products, transactions are simply not taught. Even SQL is not taught, yet most products that want to enable users to query data ultimately provide a SQL-like interface. Many users today want a drag & drop GUI interface, but do not realize that is not how you manipulate data. Hadoop was a life change with its map-reduce approach to large datasets. It did not start with SQL, but it has a widely used SQL interface now. Look at the newest hot products like Snowflake. What is its method of data access? SQL.
  2. MySQL, the most popular open source database, does not enforce transactions; they are optional. What, you say? MySQL has some unique features, including the capability for multiple storage engines that provide different features and capabilities, such as index approaches, consistency, and support for transactions. An entire generation of open source products was released that does not use transactions, despite using a transactional storage engine within a transactional product.
  3. Still on the topic of MySQL, it also does not enforce strict handling like other RDBMS products. The default for decades was to allow an SQL statement to corrupt data integrity, i.e. C – Consistency. Because MySQL produced warnings, and those warnings were generally not checked for by developers, data in did not guarantee data out (see the example after this list).
  4. Every mini-generation (e.g. 5 years) of new developers thinks they know more than seasoned professionals.
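
A quick illustration of that lax default handling (a sketch, assuming a pre-5.7 default sql_mode):

mysql> SET SESSION sql_mode = '';
mysql> CREATE TABLE t1 (c CHAR(3));
mysql> INSERT INTO t1 VALUES ('abcdef');   -- accepted with only a truncation warning; stored as 'abc'
mysql> SET SESSION sql_mode = 'STRICT_ALL_TABLES';
mysql> INSERT INTO t1 VALUES ('abcdef');
ERROR 1406 (22001): Data too long for column 'c' at row 1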

As it was for me in the 80s, anyone using a relational database, and even a non-relational database, should read and understand the seminal work An Introduction to Database Systems by C.J. Date. It has been required learning at universities for decades, but it seems to have lost its favor with multiple iterations of new software developers.

#3 – What Did I Learn This Week. WDILTW.

Defensive Data Techniques

As a data architect I always ensure that for any database schema change there is a fully recoverable execution path.
I have generally advised creating a patch/revert process for every change. For example, if a change adds a new column or index to a table, a revert script would remove the respective column or index.
The goal is to always have a defensive position for any changes. The concept is that simple; it is not complex.

In its simplest form I use the following directory and file structure.

/schema
    schema.sql
    /patch
        YYYYMMDDXX.sql     where XX, ZZ are sequential 2-digit numbers, e.g. 01, 02
        YYYYMMDDZZ.sql
    /revert
        YYYYMMDDXX.sql     The same file name in the revert sub-directory.
        YYYYMMDDZZ.sql

At any commit or tag in configuration management it is possible to create a current copy of the schema, i.e. use schema.sql.
It is also possible to take the first version of schema.sql and apply chronologically all the patch scripts to arrive at the same consistent structure of the schema that is in schema.sql. You can also run a validation process to confirm these are equivalent.
For each tagged version or commit of this directory structure and files in version control, this should always hold true.
While not the desired execution path, every revert script can be applied in reverse chronological order to return to the first version of the schema.
If you want to maintain a first_schema.sql file within the directory structure, you can always create any version of the schema from a given commit in a roll-forward or roll-back scenario.

In reality, however, this is rarely implemented. There is always divergence, or drift. Drift occurs for several primary reasons. The first is non-adherence to the defined process. The second, and more critical, is the lack of adequate testing and verification at each and every step. A Test Driven Design (TDD) approach that validates the process would enable verification of the end state of the schema, and verification at each accumulated patch along the way.

In addition to each patch/revert there needs to be a maintained state of what has been applied. Generally, for an RDBMS, storing this metadata within a table is recommended.
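
A minimal sketch of such a tracking table (names are illustrative):

CREATE TABLE schema_version (
  patch      VARCHAR(12) NOT NULL,    -- e.g. '2021030601'
  applied_at TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
  applied_by VARCHAR(64) NOT NULL,
  PRIMARY KEY (patch)
);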

The above example shows files with a .sql extension. Any schema management process also needs to cater for .sh, .py or other extensions that support more complex operations.
 
What about data changes? I would recommend that for all configuration information you follow the same management principles as for schema objects; that is, you have a patch to insert/update/delete data, and you have a revert script that can restore that data. Generally the complexity of the rollback process is a hurdle for developers/engineers. Having a framework is important to manage how data consistency is maintained. This framework could generate a statement to restore the data (e.g. a selective mysqldump), require a hand-crafted statement, or leverage the benefit of the RDBMS by storing the data in intermediate shallow tables.
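
For example, a selective dump taken before the change can serve as the revert script; an illustrative sketch (schema, table and predicate are hypothetical):

$ mysqldump --no-create-info --skip-triggers --where="config_group='pricing'" app_db config > revert/2021030601.sql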

Using a least-privileged model complicates an applicable framework approach. Does the user applying the change now require the FILE privilege, or the CREATE/DROP privilege to create tables, for the ability to restore data?

If there is strict referential integrity at the database level, those protections will defend against unintended consequences, for example, deleting a row that is depended on by a foreign key relationship. In normal operations, accommodations are generally made for the sake of performance, but also to support poor data-cleansing requirements. If the application maintains a level of referential integrity, the schema management process also needs to support this, adding a further complexity. Ensuring data integrity is an important separate topic. If there is a dangling row, what is the impact? The data still exists, it is just not presented in a user interface or included in calculations. This leads to greater unintended consequences that are rarely obvious at the time of execution, but rather days, weeks or months later.

When it comes to objects within the structure of an RDBMS the situation is more complex. A classic example in MySQL is a user. A user in MySQL is actually the user definition, which is just the username, password and host. A user contains one or more grants. The user may be the owner of additional objects. Using default and legacy MySQL, it is simply not possible to determine if a user is actually being used. Percona and other variants support INFORMATION_SCHEMA.USER_STATISTICS, which is a better method of evaluating the use of a user. This does however require time-based data collection, as this table holds cumulative statistics since an instance restart or flush.

With this type of object, or meta-object, several defensive techniques exist.

If you had the user `blargie` and that user had grants to read data from several schemas, is the user used? “I don’t think so, let’s just delete it” is not a fact-based approach to avoiding a subsequent problem.
Is the user used? Let’s revoke the user’s privileges and monitor for errors or user feedback. Or let’s change the user’s password. With each of these strategies it is important to always have a defensive process to roll back.
A different approach is to use a common data technique of marking information as deleted before it’s physically deleted (think trash can before you empty the trash). For MySQL users there is no default functionality for this (although in the most recent versions of MySQL you can lock an account). One implementation of this pattern is to rename the user, which has the benefit of keeping the user’s password and privileges intact, therefore reducing the complexity of restoring.
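
A minimal sketch of this soft-delete pattern (user and host are illustrative):

RENAME USER 'blargie'@'%' TO 'deleted_20210306_blargie'@'%';
-- recent MySQL versions can instead lock the account:
ALTER USER 'blargie'@'%' ACCOUNT LOCK;
-- and the defensive rollback:
RENAME USER 'deleted_20210306_blargie'@'%' TO 'blargie'@'%';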

Regardless of the technique, it is important there is always a recovery path. In a subsequent post I will discuss this approach towards cloud metadata, for example an AWS KMS policy, IAM role or ASG setting, and the impact of Infrastructure as Code (IaC) such as Terraform.

More reading https://en.wikipedia.org/wiki/Test-driven_development, https://en.wikipedia.org/wiki/Defensive_programming

MySQL Data Security Risk Assessment presentation

Securing your data is only as good as your weakest link. A clear-text password in a file or history file, shared privileges between test and production, or open sudo access when you can connect as an unprivileged user are all security flaws. This talk discusses how to navigate the poor defaults MySQL has in place, how to strengthen processes, and how to audit your environment. It also covers the complexity of deploying changes in an always-available production environment.

Presented at the Data.Ops Conference in Barcelona, Spain.
Download slides

Identifying MySQL SSL communication using ngrep

Prior to MySQL 5.7, client communications with a MySQL instance were unencrypted by default. This plaintext capability allowed various tools, including pt-query-digest, to analyze TCP/IP traffic. Starting with MySQL 5.7, SSL is enabled by default and will be used in client communications if SSL is detected on the server.

We can detect and confirm this easily with the ngrep command.

Default client connection traffic (5.6)

On a MySQL 5.6 server we monitor the default MySQL port traffic.

mysql56$ sudo ngrep -d eth1 -wi -P ' ' -W single -l port 3306
interface: eth1 (192.168.42.0/255.255.255.0)
filter: ( port 3306 ) and ((ip || ip6) || (vlan && (ip || ip6)))
...

We connect to this server using TCP/IP.

host$ mysql -uexternal -p -h192.168.42.16
mysql56> select 'unencrypted';

We can observe the communication to and from the server (in this example 192.168.42.16) is plaintext.

...
#
T 192.168.42.1:47634 -> 192.168.42.16:3306 [AP]      select 'unencrypted'
#
T 192.168.42.16:3306 -> 192.168.42.1:47634 [AP]      !    def    unencrypted  ! !                       unencrypted
#

SSL System Variables (5.6 default)

A default 5.6 installation does not have SSL enabled as verified by system variables.

mysql56> SHOW VARIABLES LIKE '%ssl%';
+---------------+----------+
| Variable_name | Value    |
+---------------+----------+
| have_openssl  | DISABLED |
| have_ssl      | DISABLED |
| ssl_ca        |          |
| ssl_capath    |          |
| ssl_cert      |          |
| ssl_cipher    |          |
| ssl_crl       |          |
| ssl_crlpath   |          |
| ssl_key       |          |
+---------------+----------+
9 rows in set (0.02 sec)

Default client connection traffic (5.7)

Running the same example client connection with MySQL 5.7 you will observe that communications to and from the server (in this example 192.168.42.17) are not in plaintext.

mysql57$ sudo ngrep -d eth1 -wi -P ' ' -W single -l port 3306
interface: eth1 (192.168.42.0/255.255.255.0)
filter: ( port 3306 ) and ((ip || ip6) || (vlan && (ip || ip6)))

host$ mysql -uexternal -p -h192.168.42.17
mysql57> select 'encrypted';


T 192.168.42.1:36781 -> 192.168.42.17:3306 [AP]     @    F   l   d iVr  H   b ^    s t Z      ( 2d   " ?  |   )
#
T 192.168.42.17:3306 -> 192.168.42.1:36781 [AP]     p%  s`   3u5!%P]   v=  r # x   E   a y  '!    )Z    8   Js  z.  \t   (r H@     0 2 5k\    <   M  @)E& b q|q@    h

SSL System Variables (5.7 default)

A new MySQL 5.7 installation will have SSL enabled by default as seen in the MySQL variables.

mysql57 > SHOW GLOBAL VARIABLES LIKE '%ssl%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| have_openssl  | YES             |
| have_ssl      | YES             |
| ssl_ca        | ca.pem          |
| ssl_capath    |                 |
| ssl_cert      | server-cert.pem |
| ssl_cipher    |                 |
| ssl_crl       |                 |
| ssl_crlpath   |                 |
| ssl_key       | server-key.pem  |
+---------------+-----------------+
9 rows in set (0.00 sec)

--ssl=0 client connection traffic (5.7)

If you want to emulate the unencrypted nature of MySQL 5.6 within any SSL-enabled MySQL version (e.g. MySQL 5.7), you specify --ssl=0 for mysql client connections. In MySQL 5.7 this option is deprecated and --ssl-mode=DISABLED should be used.

host$ mysql -uexternal -p -h192.168.42.17 --ssl=0

mysql57> select '-ssl=0 unencrypted';

T 192.168.42.1:36785 -> 192.168.42.17:3306 [AP]      select '-ssl=0 unencrypted'
#
T 192.168.42.17:3306 -> 192.168.42.1:36785 [AP]      '    def    -ssl=0 unencrypted  ! 3              -ssl=0 unencrypted

References

https://wiki.christophchamp.com/index.php?title=Ngrep
http://infoheap.com/ngrep-quick-start-guide/
Encrypted Connections - MySQL 5.7 Reference Manual
Implementing MySQL Security Features - Tutorial at Percona Live Europe 2017.

Testing MySQL/MariaDB/Percona versions with Docker

Giuseppe Maxia has provided some great MySQL Docker images. Percona and MariaDB also provide versions via Docker Hub. In an attempt to have a consistent means of launching these different images I created the following convenience functions.

  1. Install docker for your OS. See Official Docker installation instructions.
  2. Get dockerhelper.sh
  3. Run your desired variant and version.

$ wget https://raw.githubusercontent.com/ronaldbradford/mysql-docker-minimal/master/dockerhelper.sh
$ . ./dockerhelper.sh
Docker Registered functions are:  docker_mysql, docker_percona, docker_mariadb

$ docker_mysql
ERROR: Specify a MySQL version to launch. Valid versions are 5.0 5.1 5.5 5.6 5.7 8.0
$ docker_percona
ERROR: Specify a Percona version to launch. Valid versions are 5.5 5.6 5.7
$ docker_mariadb
ERROR: Specify a MariaDB version to launch. Valid versions are 5.5 10.0 10.1

OTN appreciation day: The Performance Schema of MySQL 5.6+

To focus on just one point for OTN appreciation day on October 11, 2016, and for the benefit of all users of MySQL, consider the extremely convenient and rich value of information available in the MySQL Performance Schema for understanding what SQL queries are running in a MySQL instance right now. The MySQL Performance Schema in MySQL 5.6 is enabled by default (performance_schema=on).

The following one-off SQL statements will enable the instrumentation of SQL statements at the most detailed level of assessment.
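
A representative form of those statements (a sketch, not the exact embedded original):

UPDATE performance_schema.setup_consumers SET ENABLED='YES' WHERE NAME LIKE 'events_statements%';
UPDATE performance_schema.setup_instruments SET ENABLED='YES', TIMED='YES' WHERE NAME LIKE 'statement/%';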

The following query will show you the longest running queries in your database at this present time.
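
A representative form of such a query (a sketch; TIMER_WAIT is in picoseconds, hence the division):

SELECT t.processlist_id, s.sql_text, s.timer_wait/1000000000000 AS exec_seconds
FROM performance_schema.events_statements_current s
JOIN performance_schema.threads t USING (thread_id)
WHERE t.processlist_command <> 'Sleep'
ORDER BY s.timer_wait DESC
LIMIT 10;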

This ease of accessing what is running in a MySQL instance replaces many different and creative techniques, such as those I describe in Improving MySQL Performance with Better Indexes, that were needed in versions of MySQL before 5.6.

If your organization does not have dedicated performance experts reviewing new functionality consistently and monitoring your production systems regularly for database optimization, the value of having the MySQL Performance Schema available, with its large number of different forms of instrumentation, outweighs any reason not to use it.

One of the best presentations at Percona Live Amsterdam last week, in the last time slot of the event (before beer and food), was Performance Schema and sys schema by Mark Leith. I hope to provide a review of this presentation soon, along with my interest in exploring the new MySQL 5.7 and 8.0 Performance Schema instruments. A few of my live tweets included:

MySQL 5.7 & 8.0 Performance Schema


Why I wrote this appreciation?

On Friday I was asked to review the MySQL performance and load of a newly developed product during simulated testing. When I was first given access to the MySQL database server I was very disappointed that, for a new and unreleased product, the MySQL version chosen was 5.5. This is in no way disrespectful of the great stability, functionality and features of MySQL 5.5; however, for any new system under development, MySQL 5.6 and MySQL 5.7 are both much more appropriate options for many reasons. If for no other reason, upgrading to at least MySQL 5.6 to become a better expert with this Performance Schema functionality is one key consideration.


OTN Appreciation Day

This post format was suggested by Tim Hall, a well-known community champion among Oracle users, who maintains a rich web site of news and free technical info. Following his suggestion, this post adds to the OTN appreciation day, a distributed community effort to show something useful, or pleasant, or both, related to the Oracle world.

For those not used to the Oracle Technology Network (OTN), it is the center of Oracle technology, the place where users can get all software (proprietary or open source) and other resources related to Oracle products. In the image below you may find several familiar names.

MySQL Group Replication OOW Tutorial

The second MySQL tutorial session at Oracle Open World was “MySQL Group Replication in a Nutshell” by MySQL Community Manager Frederic Descamps. This is succinctly described as:

“Multi-master update anywhere replication for MySQL with built-in conflict detection and resolution, automatic distributed recovery, and group membership.”

MySQL Group Replication (GR) is a virtually synchronous replication solution, which is an integral component of MySQL InnoDB Cluster announced at the MySQL keynote. You can download a labs version of MySQL InnoDB Cluster, which includes three components:

  • MySQL Router
  • MySQL Shell
  • MySQL Group Replication

While included as part of MySQL InnoDB Cluster, MySQL Group Replication can be run standalone. It is a plugin, made and packaged by MySQL. With the plugin architecture in MySQL 5.7, the time to release new features is greatly reduced from the more typical 2+ year general availability (GA) cycle. Plugins also allow for functionality to not be enabled by default, therefore preserving the stability of an existing MySQL instance running version 5.7. This is a change in the philosophy of new functionality that I discussed in Understanding the MySQL Release Cadence, which in 5.7.13 introduced the SQL interface for keyring key management. Not all in the community are happy; however, I consider it an important requirement for time-to-market in a fast-paced open source data ecosystem.

MySQL GR is based on Replicated Database State Machine theory and uses Paxos for evaluating consensus of available nodes in the cluster, referred to as the Group Communication System (GCS). This is one key difference from Galera, as the Paxos approach accepts the certification stage within the cluster after a majority of the nodes have acknowledged, rather than all nodes. MySQL GR is supported on a wide range of platforms including Linux, Windows, FreeBSD and Mac OS X, another difference from Galera.

The current Release Candidate (RC) version of MySQL Group Replication has some required configurations, and there are situations and applications that may not be ideal use cases for a synchronous solution. There are complexities in the migration process for any existing infrastructure considering MySQL Group Replication, which has minimum requirements of MySQL 5.7, GTIDs and row-based replication. I would like to see MySQL put a lot more effort into the education and promotion of MySQL migrations from older versions to the current MySQL 5.7. Ideally I’d like to see better tools starting with MySQL 5.0, which I still see in production operation.

Some things are just the impact of current development priorities. The shell does not offer a means to promote a master in a single-write configuration, i.e. the only way to simulate a failure is to produce a failure, which really means your three-node cluster is no longer highly available. The use of savepoints is not currently available, a needed feature for future full compatibility in an OpenStack deployment. The creation of a cluster via the MySQL shell requires you to decide between supporting multi-master writes or a single write master. I can see an ideal need, in order to better support large batch transactions and DDL (some of those edge cases), to be able to toggle to a single write master and back; the current workaround is to utilize MySQL Router to simulate this use case. The MySQL shell greatly reduces the complexity of orchestration. One of the features I like is a very convenient means to validate an instance to see if the configuration matches minimum requirements. For example:

$ mysqlsh
> dba.validateInstance('root@mysql3:3306')

...
ERROR: Error executing the 'check' command: The operation could not
continue due to the following requirements not being met:
Some active options on server 'mysql3@3306' are incompatible with Group
Replication.
Please restart the server 'mysql3@3306' with the updated options file
and try again.
Option name                       Required Value   Current Value   Result
--------------------------------  ---------------  --------------  ------
binlog_checksum                   NONE             CRC32           FAIL
master_info_repository            TABLE            FILE            FAIL
relay_log_info_repository         TABLE            FILE            FAIL
transaction_write_set_extraction  XXHASH64         OFF             FAIL
 at (shell):1:4
in

These are settings you can now change dynamically and persist across restarts in MySQL 8.0 using the SET PERSIST syntax.
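
A minimal sketch in MySQL 8.0, using two of the failed options shown above:

mysql> SET PERSIST binlog_checksum = 'NONE';
mysql> SET PERSIST transaction_write_set_extraction = 'XXHASH64';
mysql> SELECT variable_name, variable_value FROM performance_schema.persisted_variables;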

The overall setup in a greenfield application is reasonably clear and will improve as the product moves towards general availability. The MySQL shell has a lot of future potential for a number of administrative functions, and the ability to switch easily between JavaScript and SQL means you can get the best of multiple languages.

In subsequent posts I will look into more of the detail of setup and monitoring of a cluster with performance_schema. I hope that existing monitoring tools will also start to support monitoring Group Replication. As the author of the New Relic MySQL Plugin in 2013 I may need to get motivated to offer a SaaS solution also.

You can find more information with official blog posts on MySQL Group Replication.

Presentations at Percona Live Amsterdam 2016

I was fortunate enough to give four presentations at the Percona Live 2016 event in Amsterdam. The slides for these are now available.

New UUID functions in MySQL 8.0.0

MySQL 8.0.0 introduces three new miscellaneous UUID functions: IS_UUID(), UUID_TO_BIN() and BIN_TO_UUID(), joining the UUID() (in 5.0) and UUID_SHORT() (in 5.1) functions. See the 8.0.0 Release Notes.

Thanks to the great work and hosting by Marcus Popp, anybody can test out the SQL syntax of MySQL 8.0.0 using db4free without installing anything. If you want a minimal install, Giuseppe Maxia provides minimal Docker images of 5.0+ versions including 8.0.0.

A running docker container with MySQL 8.0 is as easy as:
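
For example, using the official mysql/mysql-server image (the tag and password here are assumptions):

$ docker run --name mysql8 -e MYSQL_ROOT_PASSWORD=MyNewPass -d mysql/mysql-server:8.0.0
$ docker exec -it mysql8 mysql -uroot -pMyNewPass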

The following script shows the usage and checks of these new functions.

Historically, the way to encode a UUID into a BINARY(16) datatype was to use the UNHEX(REPLACE()) syntax. There was however no easy way to decode a BINARY(16) back into the original value. BIN_TO_UUID(), as shown in the output below, solves this problem.

mysql> SELECT IS_UUID(1);
+------------+
| IS_UUID(1) |
+------------+
|          0 |
+------------+
1 row in set (0.01 sec)

mysql> SET @uuid='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT IS_UUID(@uuid) AS is_uuid;
+---------+
| is_uuid |
+---------+
|       1 |
+---------+
1 row in set (0.01 sec)

mysql> SELECT IS_UUID(REPLACE(@uuid,'-','')) AS is_uuid;
+---------+
| is_uuid |
+---------+
|       1 |
+---------+
1 row in set (0.00 sec)

mysql> SELECT @uuid_bin := UUID_TO_BIN(@uuid) AS uuid_bin, LENGTH(@uuid_bin) AS len;
+------------------+------+
| uuid_bin         | len  |
+------------------+------+
| ���������������� |   16 |
+------------------+------+
1 row in set (0.00 sec)

mysql> SELECT @old_uuid_bin := UNHEX(REPLACE(@uuid,'-','')) AS old_uuid_bin, LENGTH(@old_uuid_bin) AS len;
+------------------+------+
| old_uuid_bin     | len  |
+------------------+------+
| ���������������� |   16 |
+------------------+------+
1 row in set (0.00 sec)

mysql> SELECT @uuid_bin = @old_uuid_bin;
+---------------------------+
| @uuid_bin = @old_uuid_bin |
+---------------------------+
|                         1 |
+---------------------------+
1 row in set (0.00 sec)

mysql> SELECT BIN_TO_UUID(@uuid_bin) AS uuid, HEX(@old_uuid_bin) AS uuid_old;
+--------------------------------------+----------------------------------+
| uuid                                 | uuid_old                         |
+--------------------------------------+----------------------------------+
| aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee | AAAAAAAABBBBCCCCDDDDEEEEEEEEEEEE |
+--------------------------------------+----------------------------------+
1 row in set (0.01 sec)

Introducing the MySQL Cloud Service

The MySQL keynote at Oracle Open World 2016 announced the immediate availability of the MySQL Cloud Service, part of the larger Oracle Cloud offering. You can evaluate this now with a trial copy at cloud.oracle.com/mysql. MySQL server product manager Morgan Tocker gave two presentations at the event including a deep dive session.

This is the first release of the MySQL Cloud Service. As with all first releases there are some highlights and some pipeline features. All major cloud providers have MySQL offerings: AWS RDS (traditional, Multi-AZ and Aurora), GCP Cloud SQL and Azure MySQL App Service. Users of OpenStack have Trove for comparison. I am not going to evaluate features across cloud offerings in this post.

Highlights

These are the differentiating highlights as I see them from the presentation. I will provide a followup blog on actual usage at a later time.

  • MySQL 5.7
  • MySQL Enterprise Edition (a key difference with other cloud providers)
    • MySQL Enterprise features like Firewall, Thread Pool, Auditing etc
    • MySQL Enterprise support is included in price
    • MySQL Enterprise Monitor (MEM) is available and included in price
  • SSH access to machine
    • SSH access is via a non-privileged user (opc). This shows an intention of a security-first policy.
  • Separated partitioning in OS/MySQL disk layout
  • ZFS. (Nice, I have missed using this)
  • Partition workloads optimized differently for data and for sequential logging
  • Two predefined backup policies, ZFS appliance (7 day retention) and cloud storage (30 day retention)
  • The managed backup philosophy is a weekly full backup, and daily incrementals
  • Sane default MySQL configuration (my.cnf)
  • Patching notification and capability. Automated backup before patching, and rollback capability
  • The Ksplice Oracle UEK functionality for improved host uptime with security vulnerabilities or kernel improvements

Overall an A effort on paper in V1 with the willingness to be useful, sane and flexible. In a future post I will evaluate the actual MySQL cloud capabilities with the available trial.

Observations

Features and functionality I see missing from V1 during this presentation. Some are features I would like to see, some are just observations, and some are likely present features that were simply not discussed. I will leave it up to the reader to decide which is which.

  • No MySQL 5.6. There was mention of supporting two versions moving forward (i.e. 5.7 and 8)
  • Separated MEM configuration and management. See my later thoughts on this.
  • MySQL topologies and easy to apply templates, including the future MySQL InnoDB Cluster
  • A longer archive storage retention capability policy for backups and/or binary logs (e.g. for compliance reasons)
  • The size of the pre-defined dedicated logging partition and binary logging may be problematic for longer retention capacity
  • Provisioned IOPS capacity or performance guarantees for Disk I/O
  • An ability to define MySQL configuration templates (e.g. dev, test, prod, master, slave etc) and be able to select these for a new instance deployment. You can of course manage this after the fact manually.
  • The compute workloads are more generic at present. There appear to be no optimized disk, network or CPU variants.
  • Improved key management, e.g. being able to select an already defined SSH public key (e.g. from another instance)

Only offering MySQL 5.7 is an adoption impediment. This requires any organization with applications that are not greenfield to have already migrated to MySQL 5.7. I can understand the long-term rationale view here, but I see it as a clear limitation for more rapid adoption.

The details

The MySQL Cloud Service takes the hard parts out of managing MySQL. This is deployed in the Oracle Public Cloud, leveraging the fault-tolerant regional deployments in place. This Database as a Service (DBaaS) helps to remove those annoying pieces of administration including backups, patches, monitoring etc. Powered by MySQL 5.7 Enterprise Edition (the only cloud provider to offer this), the cloud version in use is identical to the downloadable on-premise version. The cloud service offers an initially optimized MySQL configuration (my.cnf), i.e. improvements on the 5.7 defaults, and has a variety of compute workload sizes to choose from. Storage is a ZFS appliance, but there is no information on provisioned IOPS for intensive workloads. You can use the web interface or REST API endpoints to create, deploy and manage your cloud instances. The REST API endpoints were not demonstrated in this session.

The predefined disk layout for storage is a very sane configuration. The Operating System (Oracle Unbreakable Linux 6) has a dedicated partition (not part of sizing). There is a dedicated and throughput-optimized ZFS LUN for data (what you size with the setup), a dedicated and latency-optimized ZFS LUN for binary and InnoDB logs (which does not appear to be sizable at present) and a dedicated ZFS LUN for backups. There is also secondary backup storage by default in Cloud Storage.

The UI provides the capability to configure a MEM server and a MEM client. To conserve presentation time Morgan consolidated these into his initial demo instance. I feel there is room here to optimize the initial setup and to separate out the “management” server capabilities, e.g. selecting your MEM configuration, and by default offering just the MEM client authentication (if a MEM server is configured). For users not familiar with MySQL Enterprise features, separating the definition and management in the initial creation stage is an optimization that removes complexity. There may even be an option for a getting-started quick setup step that can provision your MEM setup in a dedicated instance when none is detected or with a new account. Here is the flip side: an inexperienced user starting out may launch a MEM server with several test instances because the initial UI setup offers these as input fields; this is not the goal when managing multiple servers. The current version of MEM shown was 3.2, with 3.3 planned. Version 3.3 includes its own web interface for backup management.

Some things are not in the initial release but I’m sure are on the roadmap. One is an upsize and downsize optimization. It would appear via the demo that when a compute size modification occurs, the existing MySQL instance is shut down and the VM is shut down. A new VM is provisioned using the setup and disk partitions of the prior VM. An optimization is to provision a new VM, start MySQL, then stop MySQL on the new, stop on the old, unmount on the old, mount on the new, and start MySQL. This removes the downtime of the VM provisioning step. Ideally I’d like to see the capability to perform this on a slave, and to promote a slave more seamlessly. Practically however, this has many more moving pieces than in theory, and so the future use of MySQL Router is a solution. The upcoming MySQL InnoDB Cluster will also suffer from the complexity of resizing and uptime availability, especially when nodes are of varying compute sizes. As mentioned, I would like to see predefined MySQL configurations. I would also like the option to pre-create multiple user authentications for instances, rather than having to specify one each time. I can see, for a class of servers, e.g. a load test environment of a master/slave setup and an application with several MySQL accounts, the need for a means of bulk user and permission management.

Under the hood, Morgan talked about the InnoDB I/O configuration optimizations, the number of I/O threads, the use of O_DIRECT, and the redo log size and buffer pool optimized to the compute shape. The thread pool is enabled by default. The same considerations are in place for the operating system: Oracle Linux 6 UEK, MySQL task priority, memlock, and the ext4 filesystem.

Again, those unfamiliar with MySQL Enterprise features will need greater UI guidance to understand the features, capabilities and configuration of Firewall, Encryption, Authentication, Audit, Monitor, Backup and Thread Pool.

The SSH access is what gives this first release the control to be flexible. You can modify the MySQL configuration and incorporate configuration management processes. You can utilize on-system database restore capabilities. You can monitor physical resource utilization. I am unsure of the total control available for changing (or breaking) the system and the kernel.

There was a lot to digest in the 45 minute practical demonstration session. I am sure as with more reading and evaluation there will be more to share. As the roadmap for MySQL InnoDB cluster develops I can see there being a good cadence on new features and functionality released for the MySQL Cloud Service.

My Live Tweets (as the presentation was happening)

Oracle MySQL Public Cloud landing page

MySQL Operations in Docker at Oracle Open World 2016

One of the Monday tutorials at Oracle Open World was MySQL Operations in Docker, a 2 hour tutorial by Giuseppe Maxia. This tutorial showed what you can do with MySQL on Docker, which is specifically good for testing. Some key points from the tutorial included:

  • Differences between containers and Virtual Machines (VM)
    • VMs are a mutable architecture: you start, then modify
    • Containers are an immutable architecture
  • Containers are not micro-services
  • Understanding the “official” MySQL Docker image. (Hint: Use mysql/mysql-server, not mysql)
  • The issues of specifying a required password to install MySQL on a container
  • Understanding how to use volumes, for a file (e.g. /etc/my.cnf) or a directory (e.g. /var/lib/mysql); a sketch follows this list
  • How to produce a more secure MySQL installation using files
  • How to get MySQL 5.0, 5.1, 5.5, 5.6, 5.7 and 8.0 on CentOS, Ubuntu and Debian for Docker using his own minimal MySQL Docker images. (NOTE: the MySQL images by Oracle, Percona and MariaDB only cover the current version)
  • MySQL Group replication demo (mysql/mysql-gr)
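
As a minimal sketch of the volumes usage mentioned above (the local paths and image tag are assumptions):

$ docker run --name mysql57 \
    -v /opt/mysql/my.cnf:/etc/my.cnf \
    -v /opt/mysql/data:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=MyNewPass \
    -d mysql/mysql-server:5.7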

Giuseppe performed his demos on a dedicated Linux machine. My attempts to reproduce the tutorial steps on Mac failed, as mentioned, because of issues with volumes. MySQL Group Replication with Docker on Mac is also unpredictable.

In summary, Giuseppe talked about how wonderful Docker is for development and testing but not advisable for production. Some of the questions regarding production concerns included the inability to work with orchestrators, stability with volumes and overall container user security. In addition, a tough audience question, “How do you upgrade MySQL in production using containers?”, highlighted that this technology is evolving, and while becoming ideal for stateless applications, it is not ready for primetime for databases that require state to operate.

Get the Code Examples on GitHub.

Docker has become a popular technology for containers, starting in 2013. It did not invent containers; A Brief History of Container Technology gives a timeline of the technologies that have got us to where we are today.

MySQL Keynote at Oracle Open World 2016

Tomas Ulin made a number of key announcements at this year’s State of the Dolphin and Customer Experiences keynote: MySQL Public Cloud, MySQL 8.0 DMR, MySQL InnoDB Cluster and MySQL Group Replication (RC).

There were also user stories by Nicolai Plum – Senior Systems Architect at Booking.com and Andrew Archibald – VP of Development at Churchill Downs.

Nicolai talked about how Booking.com has evolved over the years, starting with the traditional MySQL replication model, moving to a more complex sharded and partitioned architecture, and now re-architecting towards a loosely coupled, write-optimized and read-optimized data model leveraging Redis queues. This work has enabled services that hide the complexity and the need for developers to write SQL, and leverage better data translation and interoperability, e.g. MySQL to Hadoop to MySQL. Booking.com is actively using MySQL 5.7, and is evaluating how to incorporate the new features of MySQL 8.0.

Andrew talked about how twinspires.com uses a multi-data-center master-master replication setup with MySQL 5.6 to manage critical availability needs for online wagering on horse races around the world. During peak times, load can increase 100x, similar to my own experiences handling 100x flash sales described in Improving performance – A full stack problem.

I have yet to discover how to deep link to this presentation at the OOW 2016 agenda to enable viewers to read the overview and speaker profiles.

Q: Does MySQL support ACID? A: Yes

I was recently asked this question by an experienced academic at the NY Oracle Users Group event I presented at.

Does MySQL support ACID? (ACID is a set of properties essential for a relational database to perform transactions, i.e. a discrete unit of work.)

Yes, MySQL fully supports ACID, that is Atomicity, Consistency, Isolation and Durability. (*)

This is contrary to the first Google response found when searching this question, which for reference states “The standard table handler for MySQL is not ACID compliant because it doesn’t support consistency, isolation, or durability”.

The question is however not a simple Yes/No because it depends on timing within the MySQL product’s lifecycle and the version/configuration used in deployment. What is also *painfully* necessary is to understand why this question would even be asked of the most popular open source relational database.

MySQL has a unique characteristic of supporting multiple storage engines. These engines enable varying ways of storing and retrieving data via the SQL interface in MySQL, and have varying features for supporting transactions, locking, index strategies, compression etc. The problem is that the default storage engine from version 3.23 (1999) to 5.1 (2010) was MyISAM, a non-transactional engine; hence the first point of confusion.

The InnoDB storage engine has been included and supported since MySQL 3.23. This is a transactional engine supporting ACID properties. However, not all of the default settings in the various MySQL versions have fully met all ACID needs, specifically the durability of data. This is the second point of confusion. Over time other transactional storage engines in MySQL have come and gone. InnoDB has been there since the start, so there is no excuse not to write applications to fully support transactions. The custodianship of Oracle Corporation, starting in 2010, quickly corrected this *flaw* by making InnoDB the default storage engine in MySQL 5.5. But the damage has been done to the ecosystem that uses MySQL, that is, many thousands of open source projects, and to the resources that work with MySQL. Recently, working on a MySQL 5.5 production system in 2016, the default engine was specifically defined as MyISAM in the configuration, and some (but not all) tables were defined using MyISAM. This prompts a further conversation as to why. Is this an upgrade problem? Are there legacy dependencies with applications? Are the decision makers and developers simply not aware of the configuration? Or are developers simply not comfortable with transactions?
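
As a sketch of the durability point, settings such as innodb_flush_log_at_trx_commit and sync_binlog determine whether committed data survives a crash; a value of 1 for each (not always the default in older versions) is the fully durable configuration:

mysql> SHOW GLOBAL VARIABLES WHERE variable_name IN ('innodb_flush_log_at_trx_commit', 'sync_binlog');
mysql> SET GLOBAL innodb_flush_log_at_trx_commit = 1;
mysql> SET GLOBAL sync_binlog = 1;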

As with other unreasonable MySQL defaults, the unaware administrator or developer could consider MySQL as supporting ACID properties, yet not realize the impact of poor configuration settings until detailed testing with concurrency and error conditions.

The damage of having a non-transactional storage engine as the default for over a decade has created a generation of professionals and applications that abuse one of the primary purposes of a relational database, that is, a transaction: to produce a unit of work that is all or nothing. Popular open source projects such as WordPress, Drupal and hundreds more have for a long time not supported transactions or used InnoDB. MediaWiki was at least one popular open source project that was proactive towards InnoDB and transaction usage. The millions of plugins, products and startups that build on these technologies have the same flaws.

Further confusion arises when an application uses InnoDB tables but does not use transactions, or the application abuses transactions, for example 3 different transactions that should really be 1.

While the newer versions MySQL 5.6 and 5.7 improve default configurations, until these versions are widely adopted, non-transactional use of a relational database will remain commonplace. A recent Effective MySQL NYC Meetup survey showed that installations of version 5.0 still exist, and that few have a policy for a regular upgrade cadence.

Do you control your database outages?

Working with a client last week I noted in my analysis, “The mysql server was restarted on Thursday and so the [updated] my.cnf settings seems current”. This occurred between starting my analysis on Wednesday and delivering my findings on Friday.

# more /var/lib/mysql/ip-104-238-102-213.secureserver.net.err
160609 17:04:43 [Note] /usr/sbin/mysqld: Normal shutdown

The client however stated they did not restart MySQL and would not do that at 5pm, which is still a high usage time of the production system. This is unfortunately not an uncommon finding: a production system had an outage, and the client did not know about it and did not instigate it.

There are several common causes and the “DevOps” mindset for current production systems has made this worse.

  • You have managed hosting and they perform software updates with/without your knowledge. I have for example worked with several Rackspace customers and there would be an outage because Rackspace engineers decided to apply an upgrade at a time that suited them, not their customers.
  • You have chosen automated updates for your Operating System.
  • Your developers update the software when they like.
  • You are using a 3rd party product that is making an arbitrary decision.

In this case the breadcrumbs lead to the last option: cPanel is performing this operation, as hinted by the cPanel-specific MySQL binaries installed.

$ rpm -qa | grep -i mysql
cpanel-perl-522-MySQL-Diff-0.43-1.cp1156.x86_64
MySQL55-devel-5.5.50-1.cp1156.x86_64
MySQL55-client-5.5.50-1.cp1156.x86_64
cpanel-perl-522-DBD-mysql-4.033-1.cp1156.x86_64
compat-MySQL50-shared-5.0.96-4.cp1136.x86_64
MySQL55-server-5.5.50-1.cp1156.x86_64
cpanel-mysql-libs-5.1.73-1.cp1156.x86_64
MySQL55-shared-5.5.50-1.cp1156.x86_64
MySQL55-test-5.5.50-1.cp1156.x86_64
compat-MySQL51-shared-5.1.73-1.cp1150.x86_64
cpanel-mysql-5.1.73-1.cp1156.x86_64

Also note that cPanel still uses MySQL 5.1 shared libraries.

So why did cPanel perform not one shutdown, but two in immediate succession? The first was 17 seconds, the second was 2 seconds. Not being experienced with cPanel I cannot offer an answer for this shutdown occurrence. I can for other causes, which I will detail later.

160609 17:04:24 [Note] /usr/sbin/mysqld: Normal shutdown
...
160609 17:04:28 [Note] /usr/sbin/mysqld: Shutdown complete
...
160609 17:04:41 [Note] /usr/sbin/mysqld: ready for connections.
...
160609 17:04:43 [Note] /usr/sbin/mysqld: Normal shutdown
...
160609 17:04:45 [Note] /usr/sbin/mysqld: ready for connections.
...

And why did the customer not know about the outage? If you use popular SaaS monitoring solutions such as New Relic and Pingdom you would not have been informed, because these products have a sampling interval of 60 seconds. I use these products along with Nagios on my personal blog site as they provide adequate instrumentation based on the frequency of usage. I would not recommend these tools as the only tools used in a high volume production system simply because of this one reason. In a high volume system you need sampling at a much finer granularity.
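
As a minimal sketch of a finer-grained probe (the credentials file and log path are assumptions):

$ cat mysql_ping.sh
#!/bin/sh
# Poll mysqld every second and record any failed ping
while true
do
  mysqladmin --defaults-extra-file=/etc/mysql/monitor.cnf ping > /dev/null 2>&1 || \
    echo "$(date '+%Y%m%d %H:%M:%S') mysqld not reachable" >> /var/log/mysql_ping.log
  sleep 1
done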

So just when you were going to justify that 17 seconds, while unexpected, is tolerable, I want to point out that the following subsequently occurred, and this outage was over 4 minutes.

160619 11:58:07 [Note] /usr/sbin/mysqld: Normal shutdown
...
160619 12:02:26 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
...

An analysis of the MySQL error log (which is correctly not rotated, as I always recommend) showed a pattern of regular MySQL updates, from 5.5.37 through 5.5.50. This is the most likely reason a 3rd party product performs a database outage: to perform a software update at a time of their choosing, not yours.
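
These version entries can be extracted from the error log with a simple filter, for example:

$ grep -A1 "ready for connections" /var/lib/mysql/ip-104-238-102-213.secureserver.net.err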

150316  3:54:11 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.37-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150316  3:54:22 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.37-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150316 19:07:31 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.37-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150317  2:05:45 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.40-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150317  2:05:54 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.40-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150319  1:17:26 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.42-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150319  1:17:34 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.42-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150616  1:39:44 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.42-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

150616  1:39:52 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.42-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

151006  1:01:45 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.45-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

151006  1:01:54 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.45-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

151027  1:21:12 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.46-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160105  1:31:35 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.47-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160211  1:52:47 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.48-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160211  1:52:55 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.48-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160503  1:14:59 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160503  1:15:03 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160521 18:46:24 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160522 11:51:45 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160529 15:26:41 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160529 15:30:12 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160604 23:29:15 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.49-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160609 17:04:41 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.50-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160609 17:04:45 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.50-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)

160619 12:21:58 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.50-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 0  MySQL Community Server (GPL)

What is intriguing from this analysis is that several versions were skipped, including .38, .39, .41, .43 and .44. One may ask why?

For Clients

This leads to several questions about the strategy used in your organization for controlling outages of your MySQL infrastructure, for upgrades or for other reasons.

  • What is an acceptable outage time?
  • What is the acceptable maintenance window to perform outages?
  • What is your release cadence for MySQL upgrades?
  • Who or what performs updates?
  • Can your monitoring detect small outages?

You should also consider in your business strategy having a highly available (HA) MySQL infrastructure to avoid any outage, or application intelligence to support varying levels of data access as I describe in Successful MySQL Scalability Principles.

Understanding the MySQL Release Cadence

At the recent New York Oracle Users Group summer general meeting I gave a presentation to the Oracle community on the MySQL product release cycle. Details included:

  • Identifying the server product options covering community, enterprise and ecosystem.
  • Describing MySQL enterprise products, features and support options.
  • Describing the DMR, RC, GA, EOL and labs product lifecycle.
  • Discussing the GA release frequency.
  • Talking about the MySQL Upgrade path.

Utilizing OpenStack Trove DBaaS for deployment management

Trove is used for self-service provisioning and lifecycle management for relational and non-relational databases in an OpenStack cloud. Trove provides a RESTful API interface that is the same regardless of the type of database. CLI tools and a web UI via Horizon are also provided, wrapping Trove API requests.

In simple terms: you are a MySQL shop. You run a replication environment with daily backups and failover capabilities, which you test and verify regularly. You have defined DBA and user credential ACLs across dev, test and prod environments. Now there is a request for using MongoDB or Cassandra; the engineering department has not decided, but they want to evaluate the capabilities. How long, as an operator, does it take to acquire the software, install, configure, set up replication, backups and ACLs, and enable the engineering department to evaluate the products?

With Trove DBaaS this complexity is eliminated due to a consistent interface: you perform the provisioning, configuration, HA, backup and recovery (B&R) and ACL management across other products the exact same way you perform these tasks for MySQL. This enables operations to be very proactive to changing technology requests supporting digital transformation strategies.
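
As a sketch of that consistent interface using the python-troveclient CLI (instance names, flavor ID, volume sizes and datastore versions are illustrative assumptions):

$ trove create mysql-dev 2 --size 5 --datastore mysql --datastore_version 5.6
$ trove create mongo-eval 2 --size 5 --datastore mongodb --datastore_version 2.6

Only the datastore arguments differ; the provisioning workflow is identical.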

Enabling this capability is not an automatic approval of a new technology stack. It is important that strategic planning, support and management are included in the business strategy to understand the DBaaS capability for your organization. Examples of operations due diligence include how to integrate these products into your monitoring, logging and alerting systems; determining what additional disk storage requirements may be needed; and testing, verifying and timing recovery strategies.

Trove specifically leverages several other OpenStack services for source image and instance management. Each Trove guest image includes a base operating system, the applicable database software and a database-technology-specific Trove guest agent. This agent is the intelligence that knows the specific syntax and version needs to perform the tasks. The agent is also the communication mechanism between Trove and the running Nova instance.

Trove is a total solution manager for the instance running your chosen database. Instances have no ssh, telnet or other general access. The only access is SQL communication via the defined ports, e.g. 3306 for MySQL.

The Trove lifecycle management covers the provisioning, management, security, configuration and tuning of your database. Amrith Kumar in a recent presentation at the NYC Postgres meetup provides a good description of the specifics.

Trove is capable of describing and supporting clustering and replication topologies for the various data stores. It can support backup and restore, failover and resizing of clusters without the operator needing to know the specific syntax or complexities of a database product they are unfamiliar with.

A great example is the subtle difference in replication management using GTIDs between MySQL and MariaDB. To the developer, the interaction with MySQL and MariaDB via SQL is the same; the management of a replication topology is not identical, but is handled by the Trove guest agent. To the operator, the management is the same.

Also in his presentation, Kumar described Tesora, an enterprise-class Trove service provided with a number of important additional features. Tesora supports additional database products including Oracle and DB2 Express, as well as commercial versions of Oracle MySQL, EnterpriseDB, Couchbase, DataStax and MongoDB. Using the Horizon UI customizations with pre-defined Trove instances greatly reduces the work needed for operators and deployers to build their own.

Understanding the DBaaS capability for your organization

As your organization transforms to embrace the wealth of digital information that is becoming available, the capability to store, manage and consume this data in any given format or product becomes an increasing burden for operations.

How does your organization handle the request, “I need to use product Z to store data for my new project?” There are several responses I have experienced first-hand with clients.

  1. Enforce the company policy that Products O and S are all that can be used.
  2. Ignore the request.
  3. Consider the request, but antagonize your own internal organization with long wait times (e.g. months or years) and with repeated delays to evaluate a product you simply do not want to support.
  4. Do whatever the developers say, they know best.

Unfortunately I have seen too many organizations use the first three options as the answer. You may consider the last option a non-valid answer; however, I have also seen this prevalent when there is no operations team or strategic technical oversight.

Ignoring the question only leads to a greater pressure point at a later time. This may be when your executive team enforces the requirement with their own timetable; I have seen this happen, with painful ramifications. With the ability to consume public cloud resources with only access to a credit card, development resources can now proceed unchecked more easily if ignored or delayed. When a successful proof of concept is produced this way and there is now a more urgent need to deploy, support and manage it, the opportunity to have a positive impact on the design decision of a new data product has passed.

Using DBaaS is one enabling tool within a strategic business model for your organization to satisfy this question with greater control. This however is not the solution but rather one tool combined with applicable processes. In order to scope the requirements for the original question, your model also needs to consider the following:

  • Provisioning capabilities
  • Strategic planning and insight
  • Support and management
  • Release criteria

Provisioning

This is the strength of DBaaS. Operations can enable development to independently provision resources and technology with little additional impeding dependency. There is input from operations to enable varying products to be available via self service; however, there is also some control. DBaaS can be viewed as a controlled and flexible enabler. A specific example is an organization that uses the MySQL relational database, and now a developer wishes to use the MongoDB NoSQL unstructured store. An operator may cringe at the notion of a lack of data consistency, structured data query access or performance capabilities. These are all valid points; however, those are strategic-level discussions about your workflow pipeline and should not be an impediment to iterating quickly. Without oversight, iterating quickly can lead to unmanageable outcomes.

Strategic Planning

There always needs to be oversight combined with applicable strategy. A single developer stating they want to use the new product Z for a distributed key/value store needs to be vetted first within the engineering organization and by their own developer peers. If another project is already using product Y, which has the same core data access and features, the burden of supporting an additional product should first be a self-contained discussion validating the need.

This is one strength of a good engineering manager that balances the requirements of the business needs and objectives with the capabilities of the resources available, including staff, tools and technology. Applicable principles put in place should also ensure that some aspect of planning is instilled into the development culture.

Support and Management

The development and engineering resources rarely consider the administration and support required for the suite of products and services used in an organization. The emphasis is on feature development and customer requirements, not the sustainability, longevity and security of any system. Operational support is a long list of needs, just a few include:

  • Information security.
  • Information availability.
  • Service level agreements (SLAs) between partners, service providers and the internal organization.
  • The backup ecosystem, time taken, consistency, point-in-time recovery, testing and verification, cost of H/W, S/W, licenses.
  • Internet connectivity.
  • Capacity planning and cost analysis of storing and archiving ever increasing sources of data.
  • Hardware and software upgrades.

Two-way communication, which is often overlooked, is the start of better understanding. That is, operations being included and involved in strategic development planning, and engineering resources being included in operations needs and requirements, to ensure those new product features operate for the benefit of customers. In summary, “bridging the communication chasm”.

DevOps is an abused term; it implies that developers now perform a subset of the responsibilities of operations. As an individual that has worked in development teams and led operations teams, I can say that the skills, personality, rational thinking and decision-making needs of your resources are vastly different between an engineering task and a production operations task.

Developers need to live a 24-hour day (with the unnecessary 3am emergency call) in the shoes of an operator. The reverse is also true; however, the ramifications to business continuity are not the same. Just one factor, the cost, or more specifically the loss to the business due to a production failure, alters the decision-making process. Failure can be anything from a hardware or connectivity problem, to bad code that was released, to a data breach.

Release Criteria

If an organization has a strong (and flexible) policy on release criteria, all parties, from the stakeholder, executive, engineering, operations and marketing, should be able to contribute to the discussion and decision for a new product and the applicable in-house or third-party support. This discussion is not a pre-requisite for any department or developer to iterate quickly; however, it is a pre-requisite to migrate from a proof-of-concept prototype to a supported feature. This is another often overlooked criterion in the pursuit of rapid deployment of new features, which are significantly more difficult to remove after publication.

Expired MySQL passwords

I was surprised to find on one of my websites the message “Connect failed: Your password has expired. To log in you must change it using a client that supports expired passwords.”

Not knowing that I was using a MySQL password expiry policy I reviewed the 5.7 documentation quickly which *clearly* states “The default default_password_lifetime value is 0, which disables automatic password expiration.”.

I then proceeded to investigate further; my steps are below the following comment.

However, it is always important with MySQL documentation and a new feature (in this case a 5.7 feature) to review the release notes when installing versions, or at least to read ALL the documentation, because you may miss important information, such as:

Note:
From MySQL 5.7.4 to 5.7.10, the default default_password_lifetime value is 360 (passwords must be changed approximately once per year). For those versions, be aware that, if you make no changes to the default_password_lifetime variable or to individual user accounts, all user passwords will expire after 360 days, and all user accounts will start running in restricted mode when this happens. Clients…

I would encourage you to view the MySQL password expiry policy documentation to see the full note. I have only included the intro here as a teaser, because you need to read the entire note.

Analysis

Back to my impatient analysis steps.

$ mysql -u admin -p 
*********

SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.7.9-log |
+-----------+

SHOW GLOBAL VARIABLES LIKE 'default_p%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| default_password_lifetime | 360   |
+---------------------------+-------+


SELECT host,user,password_last_changed 
FROM mysql.user
WHERE password_last_changed + INTERVAL @@default_password_lifetime DAY < CURDATE();
+-----------+--------------+-----------------------+
| host      | user         | password_last_changed |
+-----------+--------------+-----------------------+
| localhost | XXX          | 2014-12-01 12:53:36   |
| localhost | XXXXX        | 2014-12-01 12:54:04   |
| localhost | XX_XXXX      | 2015-06-04 11:01:11   |
+-----------+--------------+-----------------------+

Indeed there are some passwords that have expired.

After finding the applicable application credentials I looked at verifying the problem.

$ mysql -uXX_XXXX -p
*******************
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Server version: 5.7.9-log

mysql>

Interesting: there was no error making a client connection, however:

mysql> use XXXX;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

I then proceeded to change the password with the applicable hint shown.

ALTER USER XX_XXXX@localhost IDENTIFIED BY '*************************';

I chose to reuse the same password because changing the password would require a subsequent code change. MySQL accepted the same password. (A topic for a separate discussion on this point).

A manual verification showed the application and users operating as they should, so an immediate loss of data was averted. Monitoring of the site's home page however did not detect this problem of a partial page error, so should a password expiry policy be used, an applicable check in a regularly scheduled operational task is a good feature request.

All of this could have been avoided if my analysis had started with reading the documentation and the note (partly shown above), which has an alternative and potentially more practical immediate solution.
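
That alternative, as a sketch (the account name is illustrative), is to disable the global policy or to exempt specific accounts:

mysql> SET GLOBAL default_password_lifetime = 0;
mysql> ALTER USER 'app_user'@'localhost' PASSWORD EXPIRE NEVER;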

In a firefighting operational mode it can be a priority to correct the problem, however more detailed analysis is prudent to maintain a "Being proactive rather than reactive" mindset. Being a Friday, I feel the old saying "There is more than one way to skin a cat" is applicable.

I am also more familiar with the SET PASSWORD syntax, so reviewing this 5.7 manual page is also a good read to determine what specific syntax is now deprecated, noting that "ALTER USER is now the preferred statement for assigning passwords".
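
For comparison, a sketch of the deprecated and preferred syntax (account and password are illustrative):

mysql> SET PASSWORD FOR 'app_user'@'localhost' = PASSWORD('new-password'); -- deprecated in 5.7
mysql> ALTER USER 'app_user'@'localhost' IDENTIFIED BY 'new-password'; -- preferred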

Oracle OpenStack leveraging MySQL Cluster and Docker

At Oracle Open World this year, Oracle OpenStack Release 2 was announced. This Kilo-based distribution included some new deployment features not seen in other OpenStack distros, including the use of Kolla, Docker and MySQL Cluster. The press release states “First commercially available OpenStack implementation completely packaged as Docker instances”.

Using Docker to containerize each component of services is a very convenient means of dev/test/prod management. Your single-node developer environment gets HA automatically, you can easily deploy varying services to two or more management nodes, and they look and act just like your production environment. I have often struggled with developing in OpenStack, whether with single-project environments, creating a devstack, a previously installed 3-node physical server deployment that takes up room and power in my office, or comparing other single-node solutions including Canonical and Mirantis. I am often left using online services such as Mirantis Express, TryStack and HP Cloud to more easily evaluate the end product, but without any access to the operating cloud under the covers.

It is an interesting move to use MySQL Cluster, and I liked this announcement. This is a very robust master/master MySQL-compatible product with High Availability implemented through a shared-nothing architecture. You get the benefit of dynamically adding data shards as your system grows. However, MySQL Cluster is a very different product under the covers. For those familiar with managing MySQL Server, a different set of skills is required, starting with the concept of a management node, data nodes and SQL nodes, and the different ways to manage, monitor and triage. MySQL Cluster is effectively an in-memory solution, so this will require some additional sizing considerations, especially for production deployments. Your backup/recovery/disaster recovery strategy will also change. All of this administration exists for MySQL Cluster; it is just different. If you only use MySQL Server these are new skills to master. As an expert in MySQL Server who has not used MySQL Cluster for at least 7 years, I cannot provide insights, for example, on the use of common monitoring tools (including newer SaaS offerings). Still, MySQL Cluster is an extremely stable and production-ready product that scales to millions of QPS easily. Using this as an HA solution gives you a rock-solid base, which is what you need first.

While I attended a number of sessions and took the Hands On Lab for Oracle OpenStack, the proof is having a running local environment. My next post will talk about my installation experiences.

Managing MySQL Version Upgrades Presentation

The following presentation was given at the Oracle Technology Network (OTN) Latin America 2015 tour events in Uruguay, Argentina, Chile and Peru.

In this presentation I talk about the various versions and means of installing and upgrading MySQL including:

  • MySQL version history from 3.23 to 5.7.8
  • Historical installation options
  • Recommended use of Oracle yum repository for current version
  • The installation and upgrade process, and errors that occur
  • Compatibility changes between MySQL 5.5 and MySQL 5.6 including
    • Reserved words (and their true impact)
    • Legacy TIMESTAMP usage
    • FULLTEXT indexes
    • The query optimizer
    • Clear text password warnings and security improvements
  • Important configuration differences
  • Other recommendations
  • The use of replication

Testing and Verifying your MySQL Backup Strategy Presentation

This past week I have been the sole MySQL representative on the Oracle Technology Network (OTN) Latin America 2015 tour events in Uruguay, Argentina, Chile and Peru.

In this presentation I talk about the important steps for testing and verifying your MySQL backup strategy to ensure your business continuity in any disaster recovery situation. This includes:

  • Overview of the primary product options
  • Backup and recovery strategy considerations
  • Technical requirements
  • Common problems observed
  • What about a failover strategy

My thoughts on Architecture and Software Development with MySQL

Yesterday I was able to present to the Portland MySQL Users Group two presentations that are important foundations for effective development with MySQL.

With 26 years of architectural experience in RDBMS and 16 years of MySQL knowledge, my extensive consulting exposure to large and small companies has led to these presentations, which cover common obstacles I have seen and helped organizations with, all of which can be easily avoided. The benefits of improved development and better processes lead to better quality software, better performance and a lower cost of ownership, saving companies money.

Thanks to Emily and Daniel for organizing and New Relic for hosting.

Percona Live Presentation: MySQL Security Essentials

The slides for my MySQL Security Essentials presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss just how insecure legacy versions of MySQL are and what are the essential requirements for securing your installation on disk, via network and with user privileges. I provide recommendations for how to manage application access for your most important data asset.

This presentation describes the key security improvements in MySQL 5.6 and MySQL 5.7 as well as additional features provided in MariaDB 10.0 and 10.1 supporting roles and encryption.

I have also included slides for how easy it is to Hack MySQL and examples of denial of service attacks that are possible with even limited MySQL access.

Percona Live Presentation: Improving Performance With Better Indexes

The slides for my Improving Performance With Better Indexes presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss how to identify, review and analyze SQL statements in order to create better indexes for your queries. This includes understanding the EXPLAIN syntax and how to create and identify covering and partial column indexes.

This presentation is based on the work with a customer showing the 95% improvement of a key 15 table join query running 15,000 QPS in a 25 billion SQL statements per day infrastructure.

As mentioned, Explaining the MySQL Explain is an additional presentation that goes into more detail for learning how to read Query Execution Plans (QEP) in MySQL.

Updating MySQL on Ubuntu 12.04 LTS to MySQL 5.6

The Ubuntu 12.04.3 LTS release only provides MySQL 5.1 and MySQL 5.5 using the default Ubuntu package manager.

Oracle (owner of the MySQL™ trademark) now provides Debian/Ubuntu APT repositories for all GA and DMR versions of MySQL, including support for Ubuntu 12.04.

The following steps demonstrate upgrading from the Ubuntu 5.5 server package to the Oracle 5.6 server package.

Verify MySQL Packages

$ apt-cache search mysql-server
mysql-server - MySQL database server (metapackage depending on the latest version)
mysql-server-5.5 - MySQL database server binaries and system database setup
mysql-server-core-5.5 - MySQL database server binaries
auth2db - Powerful and eye-candy IDS logger, log viewer and alert generator
cacti - Frontend to rrdtool for monitoring systems and services
torrentflux - web based, feature-rich BitTorrent download manager

Verify MySQL on Server

$  dpkg -l | grep mysql
ii  libdbd-mysql-perl                      4.020-1build2                                       Perl5 database interface to the MySQL database
ii  libmysqlclient-dev                     5.5.34-0ubuntu0.12.04.1                             MySQL database development files
ii  libmysqlclient18                       5.5.34-0ubuntu0.12.04.1                             MySQL database client library
ii  mysql-client-5.5                       5.5.31-0ubuntu0.12.04.1                             MySQL database client binaries
ii  mysql-client-core-5.5                  5.5.34-0ubuntu0.12.04.1                             MySQL database core client binaries
ii  mysql-common                           5.5.34-0ubuntu0.12.04.1                             MySQL database common files, e.g. /etc/mysql/my.cnf
ii  mysql-server-5.5                       5.5.31-0ubuntu0.12.04.1                             MySQL database server binaries and system database setup
ii  mysql-server-core-5.5                  5.5.31-0ubuntu0.12.04.1                             MySQL database server binaries
ii  php5-mysqlnd                           5.3.10-1ubuntu3.8                                   MySQL module for php5 (Native Driver)

Results may vary based on dependencies.

Checking the MySQL error log (as it is always good practice to do):

$ sudo tail -50 /var/log/mysql/error.log
150402 16:02:49 [Note] Plugin 'FEDERATED' is disabled.
150402 16:02:49 InnoDB: The InnoDB memory heap is disabled
150402 16:02:49 InnoDB: Mutexes and rw_locks use GCC atomic builtins
150402 16:02:49 InnoDB: Compressed tables use zlib 1.2.3.4
150402 16:02:49 InnoDB: Initializing buffer pool, size = 1.0G
150402 16:02:49 InnoDB: Completed initialization of buffer pool
150402 16:02:49 InnoDB: highest supported file format is Barracuda.
150402 16:02:49  InnoDB: Waiting for the background threads to start
150402 16:02:50 InnoDB: 5.5.31 started; log sequence number 20079278867
150402 16:02:50 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
150402 16:02:50 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
150402 16:02:50 [Note] Server socket created on IP: '127.0.0.1'.
150402 16:02:50 [Note] Event Scheduler: Loaded 0 events
150402 16:02:50 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.31-0ubuntu0.12.04.1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Ubuntu)

A check shows that this is not the most current 5.5 version available from the Ubuntu packages.

$ apt-cache show mysql-server-5.5
Package: mysql-server-5.5
Priority: optional
Section: database
Installed-Size: 31947
Maintainer: Ubuntu Developers 
Original-Maintainer: Debian MySQL Maintainers 
Architecture: amd64
Source: mysql-5.5
Version: 5.5.41-0ubuntu0.12.04.1
...

Package: mysql-server-5.5
Status: install ok installed
Priority: optional
Section: database
Installed-Size: 31950
Maintainer: Ubuntu Developers 
Architecture: amd64
Source: mysql-5.5
Version: 5.5.31-0ubuntu0.12.04.1
...

Just to be consistent with keeping current versions, you may choose to update MySQL 5.5 to the current available version.

$ sudo apt-get install mysql-server-5.5
...

Installing Oracle APT Packaging

The recommended documented way to move to using the Oracle repo is:

cd /tmp
# See https://dev.mysql.com/downloads/repo/apt/ for your right distro version
wget https://dev.mysql.com/get/mysql-apt-config_0.3.3-2ubuntu12.04_all.deb
sudo dpkg -i mysql-apt-config*.deb

This unfortunately uses a curses-based interface, which is not something you automate for production systems and not the approach I would suggest.

So, doing manually what this does:

echo "deb http://repo.mysql.com/apt/ubuntu/ precise mysql-apt-config
deb http://repo.mysql.com/apt/ubuntu/ precise mysql-5.6" | sudo tee /etc/apt/sources.list.d/mysql.list
curl -s http://ronaldbradford.com/mysql/mysql.gpg | sudo apt-key add -
sudo apt-get update
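
If you do want to keep the vendor mysql-apt-config package but avoid the interactive prompts, preseeding debconf is a possible sketch. The key name below is an assumption; after one manual install you can dump the real keys with debconf-get-selections (from the debconf-utils package).

# Assumed debconf key mysql-apt-config/select-server - verify before relying on it
echo "mysql-apt-config mysql-apt-config/select-server select mysql-5.6" | sudo debconf-set-selections
sudo DEBIAN_FRONTEND=noninteractive dpkg -i mysql-apt-config*.deb
sudo apt-get update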

Now we can look at available versions.


$ apt-cache search mysql-server
mysql-server-5.5 - MySQL database server binaries and system database setup
mysql-server-core-5.5 - MySQL database server binaries
auth2db - Powerful and eye-candy IDS logger, log viewer and alert generator
cacti - Frontend to rrdtool for monitoring systems and services
torrentflux - web based, feature-rich BitTorrent download manager
mysql-community-server - MySQL Server
mysql-server - MySQL Server meta package depending on latest version

This is where life gets a little confusing. Because Ubuntu shipped MySQL 5.1 (as mysql-server) and MySQL 5.5 (as mysql-server-5.5), the package names can be misleading.

$ apt-cache show mysql-server
Package: mysql-server
Source: mysql-community
Version: 5.6.23-1ubuntu12.04
Architecture: amd64
Maintainer: MySQL Release Engineering 
Installed-Size: 46
Depends: mysql-community-server (= 5.6.23-1ubuntu12.04)
Homepage: http://www.mysql.com/
Priority: optional
Section: database
Filename: pool/mysql-5.6/m/mysql-community/mysql-server_5.6.23-1ubuntu12.04_amd64.deb
Size: 11644
SHA256: 1cb166cd230d2a4daca761ea80f2f34ee1fc0c92aaae972c914d81746f235d63
SHA1: 63548c852d5faeda751fbf038c0799fbbeac9905
MD5sum: da2f709a29a7cac97c834e6e69929891
Description: MySQL Server meta package depending on latest version
 The MySQL(TM) software delivers a very fast, multi-threaded, multi-user,
 and robust SQL (Structured Query Language) database server. MySQL Server
 is intended for mission-critical, heavy-load production systems as well
 as for embedding into mass-deployed software. MySQL is a trademark of
 Oracle. This is a meta package that depends on the latest mysql server
 package available in the repository.

Package: mysql-server
Priority: optional
Section: database
Installed-Size: 114
Maintainer: Ubuntu Developers 
Original-Maintainer: Debian MySQL Maintainers 
Architecture: all
Source: mysql-5.5
Version: 5.5.41-0ubuntu0.12.04.1
...

We are looking to ensure the Maintainer is MySQL Release Engineering, i.e. the official Oracle packages.
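
To remove any ambiguity about which repository wins when package names overlap, an apt pin can give the Oracle repository preference. A minimal sketch for a new file /etc/apt/preferences.d/mysql (the priority value is illustrative):

Package: *
Pin: origin repo.mysql.com
Pin-Priority: 900

apt-cache policy mysql-server will then show the pinned candidate.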

Upgrading to MySQL 5.6

sudo service mysql stop
ps -ef | grep mysql
sudo apt-get install -y mysql-server
$ sudo apt-get install -y mysql-server
...
The following extra packages will be installed:
  mysql-client mysql-common mysql-community-client mysql-community-server
The following packages will be REMOVED:
  mysql-client-5.5 mysql-client-core-5.5 mysql-server-5.5 mysql-server-core-5.5
The following NEW packages will be installed:
  mysql-client mysql-community-client mysql-community-server mysql-server
The following packages will be upgraded:
  mysql-common
...
Configuration file `/etc/mysql/my.cnf'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** my.cnf (Y/I/N/O/D/Z) [default=N] ? N
...
Installing new version of config file /etc/apparmor.d/usr.sbin.mysqld ...
2015-04-02 16:53:07 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 16:53:07 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
150402 16:53:14 mysqld_safe Can't log to error log and syslog at the same time.  Remove all --log-error configuration options for --syslog to take effect
...

You may think the process is completed, but it is not. Always, Always check the error log. Have you checked your MySQL error log today?

$ sudo tail -300 /var/log/mysql/error.log
...
2015-04-02 16:53:14 20429 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)

2015-04-02 16:53:14 20429 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 16:53:14 20429 [Note] Plugin 'FEDERATED' is disabled.
2015-04-02 16:53:14 20429 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-04-02 16:53:14 20429 [Note] InnoDB: The InnoDB memory heap is disabled
2015-04-02 16:53:14 20429 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-04-02 16:53:14 20429 [Note] InnoDB: Memory barrier is not used
2015-04-02 16:53:14 20429 [Note] InnoDB: Compressed tables use zlib 1.2.3.4
2015-04-02 16:53:14 20429 [Note] InnoDB: Using Linux native AIO
2015-04-02 16:53:14 20429 [Note] InnoDB: Not using CPU crc32 instructions
2015-04-02 16:53:14 20429 [Note] InnoDB: Initializing buffer pool, size = 1.0G
2015-04-02 16:53:15 20429 [Note] InnoDB: Completed initialization of buffer pool
2015-04-02 16:53:15 20429 [Note] InnoDB: Highest supported file format is Barracuda.
2015-04-02 16:53:15 20429 [Note] InnoDB: 128 rollback segment(s) are active.
2015-04-02 16:53:15 20429 [Note] InnoDB: Waiting for purge to start
2015-04-02 16:53:15 20429 [Note] InnoDB: 5.6.23 started; log sequence number 20079286519
2015-04-02 16:53:15 20429 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 487fda28-d97a-11e4-9254-e0cb4e3feb73.
2015-04-02 16:53:15 20429 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
2015-04-02 16:53:15 20429 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2015-04-02 16:53:15 20429 [Note] Server socket created on IP: '127.0.0.1'.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.events_waits_current is wrong. Expected 19, found 16. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.events_waits_history is wrong. Expected 19, found 16. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.events_waits_history_long is wrong. Expected 19, found 16. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_host_by_event_name' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Incorrect definition of table performance_schema.events_waits_summary_by_thread_by_event_name: expected column 'THREAD_ID' at position 0 to have type bigint(20), found type int(11).
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_user_by_event_name' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_account_by_event_name' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.file_summary_by_event_name is wrong. Expected 23, found 5. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.file_summary_by_instance is wrong. Expected 25, found 6. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'host_cache' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Incorrect definition of table performance_schema.mutex_instances: expected column 'LOCKED_BY_THREAD_ID' at position 2 to have type bigint(20), found type int(11).
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'objects_summary_global_by_type' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Incorrect definition of table performance_schema.rwlock_instances: expected column 'WRITE_LOCKED_BY_THREAD_ID' at position 2 to have type bigint(20), found type int(11).
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'setup_actors' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'setup_objects' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'table_io_waits_summary_by_index_usage' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'table_io_waits_summary_by_table' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'table_lock_waits_summary_by_table' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.threads is wrong. Expected 14, found 3. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_stages_current' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_stages_history' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_stages_history_long' has the wrong structure
...

Completing the MySQL 5.6 Upgrade

An upgrade of the MySQL meta schema is necessary.

$ sudo mysql_upgrade -uroot -p
Enter password:
Looking for 'mysql' as: mysql
Looking for 'mysqlcheck' as: mysqlcheck
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
mysql.columns_priv                                 OK
mysql.db                                           OK
mysql.event                                        OK
mysql.func                                         OK
mysql.general_log                                  OK
mysql.help_category                                OK
mysql.help_keyword                                 OK
mysql.help_relation                                OK
mysql.help_topic                                   OK
mysql.host                                         OK
mysql.ndb_binlog_index                             OK
mysql.plugin                                       OK
mysql.proc                                         OK
mysql.procs_priv                                   OK
mysql.proxies_priv                                 OK
mysql.servers                                      OK
mysql.slow_log                                     OK
mysql.tables_priv                                  OK
mysql.time_zone                                    OK
mysql.time_zone_leap_second                        OK
mysql.time_zone_name                               OK
mysql.time_zone_transition                         OK
mysql.time_zone_transition_type                    OK
mysql.user                                         OK
Running 'mysql_fix_privilege_tables'...
Warning: Using a password on the command line interface can be insecure.
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
...
OK
$ sudo service mysql restart
 * Stopping MySQL Community Server 5.6.23
....
 * MySQL Community Server 5.6.23 is stopped
 * Re-starting MySQL Community Server 5.6.23
150402 17:06:17 mysqld_safe Can't log to error log and syslog at the same time.  Remove all --log-error configuration options for --syslog to take effect.
......
 * MySQL Community Server 5.6.23 is started
$ sudo tail -300 /var/log/mysql/error.log
...
2015-04-02 17:06:15 20429 [Note] /usr/sbin/mysqld: Shutdown complete

150402 17:06:15 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
150402 17:06:17 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
2015-04-02 17:06:17 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 17:06:17 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-04-02 17:06:17 20994 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 5000)

2015-04-02 17:06:17 20994 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)

2015-04-02 17:06:17 20994 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 17:06:17 20994 [Note] Plugin 'FEDERATED' is disabled.
2015-04-02 17:06:17 20994 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-04-02 17:06:17 20994 [Note] InnoDB: The InnoDB memory heap is disabled
2015-04-02 17:06:17 20994 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-04-02 17:06:17 20994 [Note] InnoDB: Memory barrier is not used
2015-04-02 17:06:17 20994 [Note] InnoDB: Compressed tables use zlib 1.2.3.4
2015-04-02 17:06:17 20994 [Note] InnoDB: Using Linux native AIO
2015-04-02 17:06:17 20994 [Note] InnoDB: Not using CPU crc32 instructions
2015-04-02 17:06:17 20994 [Note] InnoDB: Initializing buffer pool, size = 1.0G
2015-04-02 17:06:17 20994 [Note] InnoDB: Completed initialization of buffer pool
2015-04-02 17:06:17 20994 [Note] InnoDB: Highest supported file format is Barracuda.
2015-04-02 17:06:17 20994 [Note] InnoDB: 128 rollback segment(s) are active.
2015-04-02 17:06:17 20994 [Note] InnoDB: Waiting for purge to start
2015-04-02 17:06:17 20994 [Note] InnoDB: 5.6.23 started; log sequence number 20081020877
2015-04-02 17:06:17 20994 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
2015-04-02 17:06:17 20994 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2015-04-02 17:06:17 20994 [Note] Server socket created on IP: '127.0.0.1'.
2015-04-02 17:06:17 20994 [Note] Event Scheduler: Loaded 0 events
2015-04-02 17:06:17 20994 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.23'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)

Correcting errors

As you can see there are several warnings/errors when starting MySQL.

The first is

 mysqld_safe Can't log to error log and syslog at the same time.  Remove all --log-error configuration options for --syslog to take effect.

We solve this with

$ sudo rm -f /etc/mysql/conf.d/mysqld_safe_syslog.cnf
$ sudo service mysql restart
 * Stopping MySQL Community Server 5.6.23
....
 * MySQL Community Server 5.6.23 is stopped
 * Re-starting MySQL Community Server 5.6.23
......
 * MySQL Community Server 5.6.23 is started

This is an Ubuntu default that conflicts with the my.cnf log_error we are familiar with for monitoring the MySQL error log. You can read my opinion on this in The correct approach to rolling MySQL logs.
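
To confirm where the server is now logging, a quick check of the effective setting (the path shown is the one used throughout this post):

$ mysql -uroot -p -e "SHOW GLOBAL VARIABLES LIKE 'log_error'"
+---------------+--------------------------+
| Variable_name | Value                    |
+---------------+--------------------------+
| log_error     | /var/log/mysql/error.log |
+---------------+--------------------------+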

The second is

2015-04-02 17:06:17 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.

We solve this with

sudo sed -i -e "s/^key_buffer\([^_]\)/key_buffer_size\1/" /etc/mysql/my.cnf

(The \1 backreference keeps the whitespace or = that follows the option name intact; without it, sed would consume that character. Note also that -ie is parsed by GNU sed as -i with backup suffix "e", leaving a stray my.cnfe file.)

Next

2015-04-02 17:18:06 22123 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.

We solve this with

sudo sed -i -e "s/^myisam-recover\([^-]\)/myisam-recover-options\1/" /etc/mysql/my.cnf
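
A quick grep confirms both renames took effect. The values shown are the Ubuntu defaults, included for illustration only:

$ grep -E "^(key_buffer|myisam-recover)" /etc/mysql/my.cnf
key_buffer_size         = 16M
myisam-recover-options  = BACKUP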

The remaining warnings are interesting, and will be part of the following post on MySQL 5.6 configuration changes, discussed in the next section.

2015-04-02 17:22:08 22626 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 5000)
2015-04-02 17:22:08 22626 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)
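
Ahead of that post, note the usual cause of these messages: the operating system open files limit (1024 here) caps what the server requested, which in turn shrinks table_cache. One hedged sketch is to let mysqld_safe raise the limit before the server starts, via my.cnf; the value is illustrative, not a recommendation:

[mysqld_safe]
# Illustrative value; size to your table count and max_connections
open_files_limit = 8192

$ mysql -uroot -p -e "SHOW GLOBAL VARIABLES LIKE 'open_files_limit'"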

Leveraging MySQL 5.6 benefits

We may now have a MySQL 5.6 installation, however we are far from fully utilizing the benefits of MySQL 5.6. In a subsequent post I will talk about the configuration options we need to now consider, both new options such as innodb_purge_threads and important improvements such as sync_binlog. There are more complex changes including innodb_file_per_table, master_info_repository and relay_log_info_repository, and then changes in defaults such as performance_schema.
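
As a preview, here is a my.cnf sketch of the options just mentioned. The values are illustrative placeholders only; the reasoning for each belongs in that subsequent post.

[mysqld]
# options highlighted in this post
innodb_purge_threads      = 1
sync_binlog               = 1
# changes with larger operational impact
innodb_file_per_table     = 1
master_info_repository    = TABLE
relay_log_info_repository = TABLE
# enabled by default in 5.6
performance_schema        = ON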

Validating MySQL version numbers

As part of a MySQL 5.5 to MySQL 5.6 upgrade across several Ubuntu servers of varying distro versions, an audit highlighted a trivial but interesting version identification error in Ubuntu’s packaging of MySQL.

Ubuntu 12.04 LTS

$ sudo dpkg -l | grep mysql-server-5.5
ii  mysql-server-5.5   5.5.41-0ubuntu0.12.04.1  ...
$ mysql -uroot -p -e "SELECT VERSION()"
+-------------------------+
| VERSION()               |
+-------------------------+
| 5.5.41-0ubuntu0.12.04.1 |
+-------------------------+

But when you look at the mysql --version output it does NOT say 5.5.41.

$ mysql --version
mysql  Ver 14.14 Distrib 5.5.34, for debian-linux-gnu (x86_64) using readline 6.2
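
One way to dig further (a sketch) is to identify which package owns the mysql client binary and compare client and server package versions. Note the earlier dpkg listing in this document showed mysql-client-core-5.5, which ships /usr/bin/mysql, at 5.5.34, matching the Distrib string above.

$ dpkg -S $(which mysql)
mysql-client-core-5.5: /usr/bin/mysql
$ dpkg -l 'mysql-client*' 'mysql-server*' | grep ^ii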

Ubuntu 14.04 LTS

On 14.04 I get expected results.

$ sudo dpkg -l | grep mysql-server-5.5
ii  mysql-server-5.5       5.5.41-0ubuntu0.14.04.1   ...
rbradfor@rubble:~$ mysql -uroot -p -e "SELECT VERSION()"
+-------------------------+
| VERSION()               |
+-------------------------+
| 5.5.41-0ubuntu0.14.04.1 |
+-------------------------+
$ mysql --version
mysql  Ver 14.14 Distrib 5.5.41, for debian-linux-gnu (x86_64) using readline 6.3

Dynamic recreation of InnoDB redo logs

MySQL 5.6 will now automatically recreate the InnoDB redo log files during a MySQL restart if the size (or number) of these logs changes, i.e. a change to innodb_log_file_size or innodb_log_files_in_group. See Changing the Number or Size of InnoDB Log Files which states “If InnoDB detects that the innodb_log_file_size differs from the redo log file size, it will write a log checkpoint, close and remove the old log files, create new log files at the requested size, and open the new log files.”

Before MySQL 5.6 it was necessary to stop MySQL and remove the InnoDB log files manually before restarting MySQL.
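
For reference, the pre-5.6 manual procedure was a sketch like the following; a clean shutdown is essential before touching these files:

$ sudo service mysql stop
$ sudo mv /var/lib/mysql/ib_logfile* /tmp
# edit innodb_log_file_size in /etc/mysql/my.cnf
$ sudo service mysql start
# remove /tmp/ib_logfile* only after confirming a healthy error log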

The error log shows:

tail -f /mysql/log/error.log
...
2015-03-28 21:51:25 3767 [Warning] InnoDB: Resizing redo log from 2*3072 to 2*65536 pages, LSN=1626017
2015-03-28 21:51:25 3767 [Warning] InnoDB: Starting to delete and rewrite log files.
2015-03-28 21:51:25 3767 [Note] InnoDB: Setting log file ./ib_logfile101 size to 1024 MB
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000
2015-03-28 21:51:28 3767 [Note] InnoDB: Setting log file ./ib_logfile1 size to 1024 MB
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000
2015-03-28 21:51:31 3767 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2015-03-28 21:51:31 3767 [Warning] InnoDB: New log files created, LSN=1626017
...

It was, however, odd that MySQL indicated it had successfully started while the underlying redo log files were not yet complete and in place, as seen by the following directory listings.

$ sudo service mysql start
......
 * MySQL Community Server 5.6.23 is started
$
$ ls -lh /var/lib/mysql/
total 1.9G
-rw-rw---- 1 mysql mysql   56 Mar 28 19:42 auto.cnf
-rw-rw---- 1 mysql mysql  12M Mar 28 21:51 ibdata1
-rw-rw---- 1 mysql mysql 902M Mar 28 21:51 ib_logfile1
-rw-rw---- 1 mysql mysql 1.0G Mar 28 21:51 ib_logfile101
drwxr-x--- 2 mysql mysql 4.0K Mar 28 19:42 mysql
drwx------ 2 mysql mysql 4.0K Mar 28 19:42 performance_schema
$ ls -lh /var/lib/mysql/
total 2.1G
-rw-rw---- 1 mysql mysql   56 Mar 28 19:42 auto.cnf
-rw-rw---- 1 mysql mysql  12M Mar 28 21:51 ibdata1
-rw-rw---- 1 mysql mysql 1.0G Mar 28 21:51 ib_logfile0
-rw-rw---- 1 mysql mysql 1.0G Mar 28 21:51 ib_logfile1
drwxr-x--- 2 mysql mysql 4.0K Mar 28 19:42 mysql
drwx------ 2 mysql mysql 4.0K Mar 28 19:42 performance_schema
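
Once the rename completes, the new configuration can be confirmed from the server itself; the values below correspond to the two 1.0G files in the listing above:

$ mysql -uroot -p -e "SHOW GLOBAL VARIABLES LIKE 'innodb_log_file%'"
+---------------------------+------------+
| Variable_name             | Value      |
+---------------------------+------------+
| innodb_log_file_size      | 1073741824 |
| innodb_log_files_in_group | 2          |
+---------------------------+------------+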