Utilizing OpenStack Trove DBaaS for deployment management

Trove is used for self service provisioning and lifecycle management for relational and non-relational databases in an OpenStack cloud. Trove provides a RESTful API interface that is same regardless of the type of database. CLI tools and a web UI via Horizon are also provided wrapping Trove API requests.

In simple terms. You are a MySQL shop. You run a replication environment with daily backups and failover capabilities which you test and verify regularly. You have defined DBA and user credentials ACL’s across dev, test and prod environments. Now there is a request for using MongoDB or Cassandra, the engineering department has not decided but they want to evaluate the capabilities. How long as a operator does it take to acquire the software, install, configure, setup replication, backups, ACLs and enable the engineering department to evaluate the products?

With Trove DBaaS this complexity is eliminated due to a consistent interface to perform the provisioning, configuration, HA, B&R, ACL across other products the exact same way you perform these tasks for MySQL. This enables operations to be very proactive to changing technology requests supporting digital transformation strategies.

Enabling this capability is not an automatic approval of a new technology stack. It is important that strategic planning, support and management is included in the business strategy to understanding the DBaaS capability for your organization. Examples of operations due diligence would include how to integrate these products into your monitoring, logging and alerting systems. Determine what additional disk storage requirements may be needed. Test, verify and time recovery strategies.

Trove specifically leverages several other OpenStack services for source image and instance management. Each trove guest image includes a base operating system, the applicable database software and a database technology specific trove guest agent. This agent is the intelligence that knows the specific syntax and version needs to perform the tasks. The agent is also the communication mechanism between Trove and the running nova instance.

Trove is a total solution manager for the instance running your chosen database. Instances have no ssh, telnet or other general access. The only access is via the SQL communication via the defined ports, e.g. 3306 for MySQL.

The Trove lifecycle management covers the provisioning, management, security, configuration and tuning of your database. Amrith Kumar in a recent presentation at the NYC Postgres meetup provides a good description of the specifics.

Trove is capable of describing and supporting clustering and replication topologies for the various data stores. It can support backup and restore, failover and resizing of clusters without the operator needing to know the specific syntax of complexities of a database product they are unfamiliar with.

A great example is the subtle difference in MySQL replication management using GTID’s between MySQL and MariaDb. To the developer, the interaction between MySQL and MariaDB via SQL is the same, the management of a replication topology is not identical, but is managed by the Trove guest agent. To the operator, the management is the same.

Also in his presentation, Kumar described Tesora, an enterprise class Trove service provided with a number of important additional features. Tesora supports additional database products including Oracle and DB2Express as well as commercial versions for Oracle MySQL, EnterpriseDB, Couchbase, Datastax, and mongoDB. Using the Horizon UI customizations with pre-defined trove instances greatly reduces the work needed for operators and deployers to build there own.

Understanding the DBaaS capability for your organization

As your organization transforms to embrace the wealth of digital information that is becoming available, the capability to store, manage and consume this data in any given format or product becomes an increasing burden for operations.

How does your organization handle the request, “I need to use product Z to store data for my new project?” There are several responses I have experienced first-hand with clients.

  1. Enforce the company policy that Products O and S are all that can be used.
  2. Ignore the request.
  3. Consider the request, but antagonize your own internal organization with long wait times (e.g. months or years) and with repeated delays to evaluate a product you simply do not want to support.
  4. Do whatever the developers say, they know best.

Unfortunately I have seen too many organizations use the first three options as the answer. The last option you make consider as a non valid answer however I have also seen this prevalent when there is no operations team or strategic technical oversight.

Ignorance of the question only leads to a greater pressure point at a later time. This may be when your executive team now enforces the requirement with their timetable. I have seen this happen and with painful ramifications. With the ability to consume public cloud resources with only access to a credit card, development resources can now proceed unchecked more easily if ignored or delayed. When a successful proof of concept is produced this way and now a more urgent need is required to deploy, support and manage, the opportunity to have a positive impact on the design decision of a new data product has passed.

Using DBaaS is one enabling tool within a strategic business model for your organization to satisfy this question with greater control. This however is not the solution but rather one tool combined with applicable processes. In order to scope the requirements for the original question, your model also needs to consider the following:

  • Provisioning capabilities
  • Strategic planning and insight
  • Support and management
  • Release criteria

Provisioning

This is the strength of DBaaS. Operations can enable development to independently provision resources and technology with little additional impeding dependency. There is input from operations to enable varying products to be available by self servicing, however there is also some control. DBaaS can be viewed as a controlled and flexible enabler. A specific example is an organization that uses the MySQL relational database, and now a developer wishes to use the MongoDB NoSQL unstructured store. An operator may cringe at the notion of a lack of data consistency, structure data query access or performance capabilities. These are all valid points, however those are discussions at a strategic level discussion your workflow pipeline and should not be an impediment to iterate quickly. Without oversight, to iterate quickly can lead to unmanageable outcomes.

Strategic Planning

There always needs to be oversight combined with applicable strategy. A single developer stating they want to use the new product Z for a distribution key/value store needs to be vetted first within the engineering organization and its own developer peers. If another project is already using Product Y that has the same core data access and features, this burden of an additional product support should be a self contained discussion validating the need first.

This is one strength of a good engineering manager that balances the requirements of the business needs and objectives with the capabilities of the resources available, including staff, tools and technology. Applicable principles put in place should also ensure that some aspect of planning is instilled into the development culture.

Support and Management

The development and engineering resources rarely consider the administration and support required for the suite of products and services used in an organization. The emphasis is on feature development and customer requirements, not the sustainability, longevity and security of any system. Operational support is a long list of needs, just a few include:

  • Information security.
  • Information availability.
  • Service level agreements (SLAs) between partners, service providers and the internal organization
  • The backup ecosystem, time taken, consistency, point-in-time recovery, testing and verification, cost of H/W, S/W, licenses.
  • Internet connectivity.
  • Capacity planning and cost analysis of storing and archiving ever increasing sources of data.
  • Hardware and software upgrades.

Two way communication which is often overlooked is the start of better understanding. That is, operations being included and involved in strategic development planning, and engineering resources included in operations needs and requirements for ensuring those new product features operate for the benefit of customers. In summary, “bridging the communication chasm”.

DevOps is an abused term, this implies that developers now perform a subset of responsibilities of Operations. As an individual that has worked in both development teams and lead operations teams, your resources skills, personality, rational thinking and decision making needs are vastly different between an engineering task and a production operations task.

Developers need to live a 24 hour day (with the unnecessary 3am emergency call) in the shoes of an operator. The reverse is also true, however the ramifications to business continuity are not the same. Just one factor, the cost, or more specifically the loss to the business due to a production failure alters the decision making process. Failure can be anything from a hardware or connectivity problem, bad code that was released to a data breach.

Release Criteria

If an organization has a strong (and flexible) policy on release criteria, all parties from the stack-holder, executive, engineering, operations and marketing should be able to contribute to the discussion and decision for a new product, and applicable in-house or third-party support. This discussion is not a pre-requisite for any department or developer to iterate quickly, however it is pre-requisite to migrate from a proof-of-concept prototype to a supported feature. Another often overlooked criteria in the pursuit for rapid deployment of new features which are significantly more difficult to remove after publication.

Expired MySQL passwords

I was surprised to find on one of my websites the message “Connect failed: Your password has expired. To log in you must change it using a client that supports expired passwords.

Not knowing that I was using a MySQL password expiry policy I reviewed the 5.7 documentation quickly which *clearly* states “The default default_password_lifetime value is 0, which disables automatic password expiration.”.

I then proceeded to investigate further, my steps are below the following comment.

However, it is always important with MySQL documentation and a new feature (in this case a 5.7 feature) to review release notes when installing versions or to least read ALL the documentation, because you may miss important information, such as.

Note:
From MySQL 5.7.4 to 5.7.10, the default default_password_lifetime value is 360 (passwords must be changed approximately once per year). For those versions, be aware that, if you make no changes to the default_password_lifetime variable or to individual user accounts, all user passwords will expire after 360 days, and all user accounts will start running in restricted mode when this happens. Clients…

I would encourage you to view the MySQL password expiry policy to see the full note. I have only include the intro here are a teaser, because you need to read the entire note.

Analysis

Back to impatient analysis steps.

$ mysql -u admin -p 
*********

SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.7.9-log |
+-----------+

SHOW GLOBAL VARIABLES LIKE 'default_p%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| default_password_lifetime | 360   |
+---------------------------+-------+


SELECT host,user,password_last_changed 
FROM mysql.user
WHERE password_last_changed + INTERVAL @@default_password_lifetime DAY < CURDATE();
+-----------+--------------+-----------------------+
| host      | user         | password_last_changed |
+-----------+--------------+-----------------------+
| localhost | XXX          | 2014-12-01 12:53:36   |
| localhost | XXXXX        | 2014-12-01 12:54:04   |
| localhost | XX_XXXX      | 2015-06-04 11:01:11   |
+-----------+--------------+-----------------------+

Indeed there are some passwords that have expired.

After finding the applicable application credentials I looked at verifying the problem.

$ mysql -uXX_XXXX -p
*******************
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Server version: 5.7.9-log

mysql>

Interesting, there was no error to make a client connection, however.

mysql> use XXXX;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

I then proceeded to change the password with the applicable hint shown.

ALTER USER XX_XXXX@localhost IDENTIFIED BY '*************************';

I chose to reuse the same password because changing the password would require a subsequent code change. MySQL accepted the same password. (A topic for a separate discussion on this point).

A manual verification showed the application and users operating as it should be, so immediate loss of data was averted. Monitoring of the sites home page however did not detect this problem of a partial page error, so should a password expiry policy be used, an applicable check in a regularly scheduled operational task is a good feature request.

All of this could have been avoided if my analysis started with reading the documentation and the note (partly shown above) which has an alternative and potentially more practical immediate solution.

In a firefighting operational mode it can be a priority to correct the problem, however more detailed analysis is prudent to maintain a "Being proactive rather than reactive" mindset. Being a Friday, I feel the old saying "There is more than one way to skin a cat" is applicable.

I am also more familiar with the SET PASSWORD syntax, so reviewing this 5.7 manual page is also a good read to determine what specific syntax is now deprecated and what "ALTER USER is now the preferred statement for assigning passwords" also.

Understanding the Oslo Libraries

Underpinning all of the OpenStack projects including Nova, Cinder, Keystone, Glance, Horizon, Heat, Trove, Murano and others is a set of core common libraries that provide a consistent, highly tested and compatible feature set. The Oslo project is a collection of over 30 libraries that are designed to reduce the technical debt of code duplication across projects and provide for a greater quality code path due to the frequency of use in OpenStack projects.

These libraries provide a variety of different features from the more commonly used functionality found in projects including configuration, logging, caching, messaging and database management to more specific features like deprecation management, handling plugins as well as frameworks for command line programs and state machines. The Oslo Python libraries are designed to be Python 2.7 and Python 3.4 compatible, leading the way in migration towards Python 3.

The first stable Oslo library oslo.config was included in the Grizzly release. Now over 30 libraries comprise the Oslo project. These libraries fall into a number of broad categories.

1. Stable OpenStack specific libraries

These libraries, using the olso. prefix are generally well described the library name.

  • oslo.cache
  • oslo.concurrency
  • oslo.context
  • oslo.config
  • oslo.db
  • oslo.i18n
  • oslo.log
  • oslo.messaging
  • oslo.middleware
  • oslo.policy
  • oslo.privsep
  • oslo.reports
  • oslo.serialization
  • oslo.service
  • oslo.utils
  • oslo.versionedobjects
  • oslo.vmware

2. Python libraries that can easily operate with other projects

In addition to the oslo namespace libraries, Oslo has a number of generically named libraries that are not OpenStack specific. The goal is that these libraries can be utilized outside of OpenStack by any Python project. These include:

  • automaton – a framework for building state machines.
  • cliff – a framework for building command line programs.
  • debtcollector – a collection of python patterns that help you collect your technical debt in a non-destructive manner (following deprecation patterns and strategies and so-on).
  • futurist – a collection of async functionality and additions from the future.
  • osprofiler – an OpenStack cross-project profiling library.
  • hacking – a library that provides a set of tools for enforcing coding style guidelines.
  • pbr – (or Python Build Reasonableness) is a add-on library that helps provide (and enforce) a set of sensible default setuptools behaviours.
  • pyCADF – a python implementation of the DMTF Cloud Audit (CADF) data model.
  • stevedore – a library for managing plugins for Python applications.
  • taskflow – a library that helps create applications that handle state/failures… in a reasonable manner.
  • tooz – a library that aims at centralizing the most common distributed primitives like group membership protocol, lock service and leader election

3. Convenience libraries

There are also several libraries that are used during the creation of, or support of OpenStack libraries.

The first was oslo-incubator where as the name suggests, initial libraries were incubated. As this code matured it was refactored into standard libraries. Projects have either graduated, been incorporated elsewhere or been deprecated. While the Oslo Incubator has been removed of libraries in Mitaka, one of the goals of the Newton cycle is to see the adoption of Oslo libraries in all projects. We will be providing a series of blogs to detail the walkthrough and reviews of existing projects for reference.

Other libraries include:

  • oslosphinx is a sphinx add-on library that provides theme and extension support for generating documentation with Sphinx. The Developer Documentation, Release Notes, a number of the OpenStack manuals including the Configuration Reference and now the Nova API Reference rely on this library.

  • oslotest is a helper library that provides base classes and fixtures for creating unit and functional tests.
  • oslo-cookiecutter is a project that creates a skeleton Oslo library from a set of templates.

4. Proposed or deprecated libraries

Some libraries fall outside of these categories, such as oslo.rootwrap. This was a mature library for handling fine filtering of shell commands to run as root. This is now deprecated in favor of oslo.privsep which is a mechanism for running selected python code with elevated privileges.

pylockfile is a legacy (and adopted) inter-process lock management library that was never used within OpenStack.

The oslo.version is an example of a proposed library at present to help in using python metadata to determine versioning.

The Oslo team is also evaluating what other common code may be suitable for an Oslo library.

The meaning behind the Oslo Name

Each OpenStack project has some reason behind the name. Oslo is in reference to the Oslo Peace Accords and “bringing peace” to the OpenStack project.

Oslo is also the capital of Norway, and in Norway you can find Moose. The moose is our project mascot.

Oracle OpenStack leveraging MySQL Cluster and Docker

At Oracle Openworld this year, Oracle OpenStack Release 2 was announced. This Kilo based distribution included some new deployment features not see in other OpenStack distros including the use of Kolla, Docker and MySQL Cluster. The press release states “First commercially available OpenStack implementation completely packaged as Docker instances”.

Using Docker to containerize each component of services is a very convenient means of dev/test/prod management. Your single node developer environment gets HA automatically, you can easily deploy to two or more management nodes varying services, and they look and act just like your production environment. I have often struggled with developing in OpenStack either with single project environments, creating a devstack, a previously installed 3 node physical server deployment which takes up room and power in my office, and also comparing other single node solutions including Canonical and Mirantis. I am often left to using online services such as Mirantis Express, TryStack and HP Cloud to more easily evaluate the end product, but without any access to the operating cloud under the covers.

It is an interesting move to using MySQL Cluster. I liked this announcement. This is a very robust Master/Master MySQL compatible product that starts with a High Availability implemented through a shared nothing architecture. You get the benefit of dynamic adding of data shards as your system grows. However MySQL Cluster is a very different product under the covers. For those familiar with managing MySQL server a different set of skills are required, starting with the concept of a management node, data nodes and SQL nodes and the different ways to manage, monitor and triage. MySQL Cluster is effectively an in-memory solution so this will require some additional sizing considerations especially for production deployments. Your backup/recovery/disaster recovery strategy will also change. All of this administration exists for MySQL Cluster, it is just different. If you only use MySQL server these are new skills to master. As an expert in MySQL server and having not used MySQL Cluster for at least 7 years, I cannot provide an insights for example of the use of common monitoring tools (including newer SaaS offerings). Still, MySQL Cluster is an extremely stable and production ready product, that scales to millions of QPS easily. Using this as a HA solution gives you a rock solid base which is what you need first.

While I attended a number of sessions and took the Hands On Lab for Oracle OpenStack the proof is having a running local environment. My next post will talk about my experiences installing.

References

Managing MySQL Version Upgrades Presentation

The following presentation was given at the Oracle Technology Network (OTN) Latin America 2015 tour events in Uruguay, Argentina, Chile and Peru.

In this presentation I talk about the various versions and means of installing and upgrading MySQL including:

  • MySQL version history from 3.23 to 5.7.8
  • Historical installation options
  • Recommended use of Oracle yum repository for current version
  • The installation and upgrade process, and errors that occur
  • Compatibility changes between MySQL 5.5 and MySQL 5.6 including
    • Reserved words (and their true impact)
    • Legacy TIMESTAMP usage
    • FULLTEXT indexes
    • The query optimizer
    • Clear text password warnings and security improvements
  • Important configuration differences
  • Other recommendations
  • The use of replication

Testing and Verifying your MySQL Backup Strategy Presentation

This past week I have been the sole MySQL representative on the Oracle Technology Network (OTN) Latin America 2015 tour events in Uruguay, Argentina, Chile and Peru.

In this presentation I talk about the important steps for testing and verifying your MySQL backup strategy to ensure your business continuity in any disaster recovery situation. This includes:

  • Overview of the primary product options
  • Backup and recovery strategy considerations
  • Technical requirements
  • Common problems observed
  • What about a failover strategy

Deploying Ubuntu OpenStack Kilo

My previous Ubuntu OpenStack setup has been using the Juno release. I received some installation problems for Kilo using the stable repo and so I switched to using the experimental repo. This comes with a number of surface changes.

  • The interactive installation asks for the installation type first, and password second.
  • The IP range of installed OpenStack services changes from 10.0.4.x to 10.0.7.x.
  • Juju GUI is no longer installed by default. You need to specifically add this as a service after initial installation.
  • The GUI displays additional information during installation.
  • The LXC container name changes from uoi-bootstrap to openstack-single-<user>.

Uninstall any existing environment

Remove any existing installed OpenStack cloud.

sudo openstack-install -k
sudo openstack-install -u

NOTE: Be sure to remove your existing cloud before upgrading. Failing to do so will mean you need to manually cleanup some things with:

sudo lxc-stop --name uoi-bootstrap
sudo lxc-destroy --name uoi-bootstrap
rm -rf $HOME/.cloud_install

Update the OpenStack installer

Upgrade Ubuntu OpenStack with the following commands. In my environment this installed version 0.99.14.

sudo apt-add-repository ppa:cloud-installer/experimental
sudo apt-get update
sudo apt-get upgrade openstack

Install OpenStack Kilo

Installing an Ubuntu OpenStack environment still uses the openstack-install command with an additional argument.

sudo openstack-install --upstream-ppa

NOTE: Updated 6/18/15 When using the experimental repo with version 0.99.12 or earlier you must specify the --extra-ppa argument and value, i.e. sudo openstack-install –extra-ppa ppa:cloud-installer/experimental. Thanks stokachu for pointing this out.

Adding Services

After setting up a Kilo cloud using Ubuntu OpenStack I was able to successfully add a Swift component. Something else that was not quite working as expected in stable.

References

Writing and testing unit tests in OpenStack

The following outlines an approach of identifying and improving unit tests in an OpenStack project.

Obtain the source code

You can obtain a copy of current source code for an OpenStack project at http://git.openstack.org. Active projects are categorized into openstack, openstack-dev, openstack-infra and stackforge.

NOTE: While you can find OpenStack projects on GitHub, these are just a mirror of the source repositories.

In this example I am going to use the Magnum project.

$ git clone git://git.openstack.org/openstack/magnum 
$ cd magnum

Run the current tests

The first step should be to run the current tests to verify the current code. This is to become familiar with the habit, especially if you may have made modifications and are returning to looking at your code. This will also create a virtual environment, which you will want to use later to test your changes.

$ tox -e py27

Should this fail, you may want to ensure all OpenStack developer dependencies are inplace on your OS.

Identify unit tests to work on

You can use the code coverage of unit tests to determine possible places to start adding to existing unit tests. The following command will produce a HTML report in the /cover directory of your project.

$ tox -e cover

This output will look similar to this example coverage output for Magnum. You can also produce a text based version with:

$ coverage report -m 

I will use this text version as a later verification.

Working on a specific unit test

Drilling down on any individual test file you will get an indication of code that does not have unit test coverage. For example in magnum/common/utils:

Once you have found a place to work with and you have identified the corresponding unit test file in the magnum/tests/unit sub-directory, in this example I am working on on magnum/tests/unit/common/test_utils.py, you will want to run this individual unit test in the virtual environment you previously created.

$ source .tox/py27/bin/activate
$ testr run test_utils -- -f

You can now start working on making your changes in whatever editor you wish. You may want to also work interactively in python initially to test and verify classes and methods especially if you are unfamiliar with how the code functions. For example, using the identical import found in test_utils.py for the test coverage I started with these simple checks.

(py27)$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from magnum.common import utils
>>> utils.is_valid_ipv4('10.0.0.1') == True
True
>>> utils.is_valid_ipv4('') == False
True

I then created some appropriate unit tests for these two methods based on this interactive validation. These tests show that I not only test for valid values, I also test various boundary contains for invalid values including blank, character and out of range values of IP addresses.

    def test_valid_ipv4(self):
        self.assertTrue(utils.is_valid_ipv4('10.0.0.1'))
        self.assertTrue(utils.is_valid_ipv4('255.255.255.255'))

    def test_invalid_ipv4(self):
        self.assertFalse(utils.is_valid_ipv4(''))
        self.assertFalse(utils.is_valid_ipv4('x.x.x.x'))
        self.assertFalse(utils.is_valid_ipv4('256.256.256.256'))
        self.assertFalse(utils.is_valid_ipv4(
                         'AA42:0000:0000:0000:0202:B3FF:FE1E:8329'))

    def test_valid_ipv6(self):
        self.assertTrue(utils.is_valid_ipv6(
                        'AA42:0000:0000:0000:0202:B3FF:FE1E:8329'))
        self.assertTrue(utils.is_valid_ipv6(
                        'AA42::0202:B3FF:FE1E:8329'))

    def test_invalid_ipv6(self):
        self.assertFalse(utils.is_valid_ipv6(''))
        self.assertFalse(utils.is_valid_ipv6('10.0.0.1'))
        self.assertFalse(utils.is_valid_ipv6('AA42::0202:B3FF:FE1E:'))

After making these changes you want to run and verify your modified test works as previously demonstrated.

$ testr run test_utils -- -f
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-160} \
${PYTHON:-python} -m subunit.run discover -t ./ ${OS_TEST_PATH:-./magnum/tests/unit} --list  -f
running=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
OS_TEST_TIMEOUT=${OS_TEST_TIMEOUT:-160} \
${PYTHON:-python} -m subunit.run discover -t ./ ${OS_TEST_PATH:-./magnum/tests/unit}  --load-list /tmp/tmpDMP50r -f
Ran 59 (+1) tests in 0.824s (-0.016s)
PASSED (id=19)

If your tests fail you will see a FAILED message like. I find it useful to write a failing test intentionally just to validate the actual testing process is working.


${PYTHON:-python} -m subunit.run discover -t ./ ${OS_TEST_PATH:-./magnum/tests/unit}  --load-list /tmp/tmpsZlk3i -f
======================================================================
FAIL: magnum.tests.unit.common.test_utils.UtilsTestCase.test_invalid_ipv6
tags: worker-0
----------------------------------------------------------------------
Empty attachments:
  stderr
  stdout

Traceback (most recent call last):
  File "magnum/tests/unit/common/test_utils.py", line 98, in test_invalid_ipv6
    self.assertFalse(utils.is_valid_ipv6('AA42::0202:B3FF:FE1E:832'))
  File "/home/rbradfor/os/openstack/magnum/.tox/py27/local/lib/python2.7/site-packages/unittest2/case.py", line 672, in assertFalse
    raise self.failureException(msg)
AssertionError: True is not false
Ran 55 (-4) tests in 0.805s (-0.017s)
FAILED (id=20, failures=1 (+1))

Confirming your new unit tests

You can verify this has improved coverage percentage by re-running the coverage commands. I use the text based version as an easy way to see a decrease in the number of lines not covered.

Before

$ coverage report -m | grep "common/utils"
magnum/common/utils    273     94     76     38    62%   92-94, 105-134, 151-157, 208-211, 215-218, 241-259, 267-270, 275-279, 325, 349-384, 442, 449-453, 458-459, 467, 517-518, 530-531, 544
$ tox -e cover

After

$ coverage report -m | grep "common/utils"
magnum/common/utils    273     86     76     38    64%   92-94, 105-134, 151-157, 241-259, 267-270, 275-279, 325, 349-384, 442, 449-453, 458-459, 467, 517-518, 530-531, 544

I can see 8 lines of improvement which I can also verify if I look at the html version.

Before

After

Additional Testing

Make sure you run a full test before committing. This runs all tests in multiple Python versions and runs the PEP8 code style validation for your modified unit tests.

$ tox -e py27

Here are some examples of PEP8 problems with my improvements to the unit tests.

pep8 runtests: commands[0] | flake8
./magnum/tests/unit/common/test_utils.py:88:80: E501 line too long (88 > 79 characters)
./magnum/tests/unit/common/test_utils.py:91:80: E501 line too long (87 > 79 characters)
...
./magnum/tests/unit/common/test_utils.py:112:32: E231 missing whitespace after ','
./magnum/tests/unit/common/test_utils.py:113:32: E231 missing whitespace after ','
./magnum/tests/unit/common/test_utils.py:121:30: E231 missing whitespace after ','
...

Submitting your work

In order for your time and effort to be included in the OpenStack project there are a number of key details you need to follow that I outlined in contributing to OpenStack. Specifically these documents are important.

You do not have to be familiar with the procedures in order to look at the code, and even look at improving the code. You will need to follow the steps as outlined in these links if you want to contribute your code. Remember if you are new, the best access to help is to jump onto the IRC channel of the project you are interested in.

This example along with additions for several other methods was submitted (See patch). It was reviewed and ultimately approved.

References

Some additional information about the tools and processes can be found in these OpenStack documentation and wiki pages.

Contributing to OpenStack

Following my first OpenStack Summit in Vancouver 4/2015 it was time to become involved with contributing to OpenStack.

I have lurked around the mailing lists and several IRC channels for a few weeks and familiarized myself with OpenStack in varying forms including devstack, the free hosted Mirantis Express and the VM version, Ubuntu OpenStack, and even building my own 3 physical server cloud from second hand hardware purchased on eBay.

There are several resources available however I suggest you start with this concise presentation I attended at the summit by Adrian Otto on “7 Habits of Highly Effective Contributors” (slides, video).

You should also look at contributions from existing developers by looking at current code being submitted for review at https://review.openstack.org. I spent several weeks just looking at submissions, and I look at new submissions most days. While it does not always make sense (including a lot initially) its important to look at the full scope of all the projects. It is extremely valuable to look at how the review process works, how others comment on contributions, and look at the types of patches and code changes that are being contributed. There are a number of ways of not doing it right which can be discouraging when you first start contributing. The following links are vital to read, and re-read.

Individual projects also have various information, for example Magnum’s Ways to Contribute.

The benefit of observing for some time is you can be better prepared when you start to contribute. I was also new to how unit testing and automated testing worked in Python (about 7th on my list of known languages), and so learning about running OpenStack tests with tox and understanding the different OpenStack tox configs were valuable lessons, helped by feedback of OpenStack developers on the mailing list and IRC (If you have not looked at the 7 Habits presentation, now is a great time).

I took the time to find areas of interest and value which become more apparent after attending my first Design Summit. I even committed to assist in a design priority in the Magnum project as a result of my learning about how unit testing worked.

And if you write about your experiences another thing you can do is Add your blog to Planet OpenStack. I have received great feedback from the OpenStack community when writing about my first experiences.

Tracking the Ubuntu OpenStack installation process

Following on from Installing Ubuntu OpenStack the following steps help you navigate around the single server installation, monitoring and debugging the installation process.

Configuration

The initial execution of the installer will create a default config.yaml file that defines the container and OpenStack services. After a successful installation this looks like:

$ more $HOME/.cloud-install/config.yaml
container_ip: 10.0.3.149
current_state: 2
deploy_complete: true
install_type: Single
openstack_password: openstack
openstack_release: juno
placements:
  controller:
    assignments:
      LXC:
      - nova-cloud-controller
      - glance
      - glance-simplestreams-sync
      - openstack-dashboard
      - juju-gui
      - keystone
      - mysql
      - neutron-api
      - neutron-openvswitch
      - rabbitmq-server
    constraints:
      cpu-cores: 2
      mem: 6144
      root-disk: 20480
  nova-compute-machine-0:
    assignments:
      BareMetal:
      - nova-compute
    constraints:
      mem: 4096
      root-disk: 40960
  quantum-gateway-machine-0:
    assignments:
      BareMetal:
      - quantum-gateway
    constraints:
      mem: 2048
      root-disk: 20480

This file changes during the installation process which I described later.

The LXC Container

The single server installation is managed within a single LXC container. You can obtain details of and connect to the container with the following.

$ sudo lxc-ls --fancy
----------------------------------------------------------------------------
uoi-bootstrap  RUNNING  10.0.3.149, 10.0.4.1, 192.168.122.1  -     YES      

$ sudo lxc-info --name uoi-bootstrap
Name:           uoi-bootstrap
State:          RUNNING
PID:            19623
IP:             10.0.3.149
IP:             10.0.4.1
IP:             192.168.122.1
CPU use:        27692.85 seconds
BlkIO use:      63.94 GiB
Memory use:     24.29 GiB
KMem use:       0 bytes
Link:           vethC0E9US
 TX bytes:      507.43 MiB
 RX bytes:      1.43 GiB
 Total bytes:   1.93 GiB

$ sudo lxc-attach --name uoi-bootstrap

You can also connect to the server directly. As I prefer to NEVER configure or connect to a server as root this is how I access the LXC container.

$ ssh [email protected]

Juju Status

When connected to the LXC container you can then look at the status of the Juju orchestration with.

$ export JUJU_HOME=~/.cloud-install/juju

$ juju status
environment: local
machines:
  "0":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: localhost
    instance-id: localhost
    series: trusty
    state-server-member-status: has-vote
  "1":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.62
    instance-id: ubuntu-local-machine-1
    series: trusty
    hardware: arch=amd64 cpu-cores=1 mem=4096M root-disk=40960M
  "2":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.77
    instance-id: ubuntu-local-machine-2
    series: trusty
    containers:
      2/lxc/0:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.147
        instance-id: ubuntu-local-machine-2-lxc-0
        series: trusty
        hardware: arch=amd64
      2/lxc/1:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.15
        instance-id: ubuntu-local-machine-2-lxc-1
        series: trusty
        hardware: arch=amd64
      2/lxc/2:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.135
        instance-id: ubuntu-local-machine-2-lxc-2
        series: trusty
        hardware: arch=amd64
      2/lxc/3:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.133
        instance-id: ubuntu-local-machine-2-lxc-3
        series: trusty
        hardware: arch=amd64
      2/lxc/4:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.119
        instance-id: ubuntu-local-machine-2-lxc-4
        series: trusty
        hardware: arch=amd64
      2/lxc/5:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.88
        instance-id: ubuntu-local-machine-2-lxc-5
        series: trusty
        hardware: arch=amd64
      2/lxc/6:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.155
        instance-id: ubuntu-local-machine-2-lxc-6
        series: trusty
        hardware: arch=amd64
      2/lxc/7:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.36
        instance-id: ubuntu-local-machine-2-lxc-7
        series: trusty
        hardware: arch=amd64
      2/lxc/8:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.11
        instance-id: ubuntu-local-machine-2-lxc-8
        series: trusty
        hardware: arch=amd64
    hardware: arch=amd64 cpu-cores=2 mem=6144M root-disk=20480M
  "3":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.10
    instance-id: ubuntu-local-machine-3
    series: trusty
    hardware: arch=amd64 cpu-cores=1 mem=2048M root-disk=20480M
  "4":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.96
    instance-id: ubuntu-local-machine-4
    series: trusty
    hardware: arch=amd64 cpu-cores=1 mem=512M root-disk=8192M
  "5":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.140
    instance-id: ubuntu-local-machine-5
    series: trusty
    hardware: arch=amd64 cpu-cores=1 mem=512M root-disk=8192M
  "6":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.197
    instance-id: ubuntu-local-machine-6
    series: trusty
    hardware: arch=amd64 cpu-cores=1 mem=512M root-disk=8192M
services:
  glance:
    charm: cs:trusty/glance-11
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      cluster:
      - glance
      identity-service:
      - keystone
      image-service:
      - nova-cloud-controller
      - nova-compute
      object-store:
      - swift-proxy
      shared-db:
      - mysql
    units:
      glance/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/4
        open-ports:
        - 9292/tcp
        public-address: 10.0.4.119
  glance-simplestreams-sync:
    charm: local:trusty/glance-simplestreams-sync-0
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      identity-service:
      - keystone
    units:
      glance-simplestreams-sync/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/5
        public-address: 10.0.4.88
  juju-gui:
    charm: cs:trusty/juju-gui-16
    exposed: false
    units:
      juju-gui/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/1
        open-ports:
        - 80/tcp
        - 443/tcp
        public-address: 10.0.4.15
  keystone:
    charm: cs:trusty/keystone-12
    exposed: false
    relations:
      cluster:
      - keystone
      identity-service:
      - glance
      - glance-simplestreams-sync
      - neutron-api
      - nova-cloud-controller
      - openstack-dashboard
      - swift-proxy
      shared-db:
      - mysql
    units:
      keystone/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/2
        public-address: 10.0.4.135
  mysql:
    charm: cs:trusty/mysql-12
    exposed: false
    relations:
      cluster:
      - mysql
      shared-db:
      - glance
      - keystone
      - neutron-api
      - nova-cloud-controller
      - nova-compute
      - quantum-gateway
    units:
      mysql/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/0
        public-address: 10.0.4.147
  neutron-api:
    charm: cs:trusty/neutron-api-6
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      cluster:
      - neutron-api
      identity-service:
      - keystone
      neutron-api:
      - nova-cloud-controller
      neutron-plugin-api:
      - neutron-openvswitch
      shared-db:
      - mysql
    units:
      neutron-api/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/7
        open-ports:
        - 9696/tcp
        public-address: 10.0.4.36
  neutron-openvswitch:
    charm: cs:trusty/neutron-openvswitch-2
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      neutron-plugin:
      - nova-compute
      neutron-plugin-api:
      - neutron-api
    subordinate-to:
    - nova-compute
  nova-cloud-controller:
    charm: cs:trusty/nova-cloud-controller-51
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      cloud-compute:
      - nova-compute
      cluster:
      - nova-cloud-controller
      identity-service:
      - keystone
      image-service:
      - glance
      neutron-api:
      - neutron-api
      quantum-network-service:
      - quantum-gateway
      shared-db:
      - mysql
    units:
      nova-cloud-controller/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/3
        open-ports:
        - 3333/tcp
        - 8773/tcp
        - 8774/tcp
        - 9696/tcp
        public-address: 10.0.4.133
  nova-compute:
    charm: cs:trusty/nova-compute-14
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      cloud-compute:
      - nova-cloud-controller
      compute-peer:
      - nova-compute
      image-service:
      - glance
      neutron-plugin:
      - neutron-openvswitch
      shared-db:
      - mysql
    units:
      nova-compute/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: "1"
        public-address: 10.0.4.62
        subordinates:
          neutron-openvswitch/0:
            upgrading-from: cs:trusty/neutron-openvswitch-2
            agent-state: started
            agent-version: 1.20.11.1
            public-address: 10.0.4.62
  openstack-dashboard:
    charm: cs:trusty/openstack-dashboard-9
    exposed: false
    relations:
      cluster:
      - openstack-dashboard
      identity-service:
      - keystone
    units:
      openstack-dashboard/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/6
        open-ports:
        - 80/tcp
        - 443/tcp
        public-address: 10.0.4.155
  quantum-gateway:
    charm: cs:trusty/quantum-gateway-10
    exposed: false
    relations:
      amqp:
      - rabbitmq-server
      cluster:
      - quantum-gateway
      quantum-network-service:
      - nova-cloud-controller
      shared-db:
      - mysql
    units:
      quantum-gateway/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: "3"
        public-address: 10.0.4.10
  rabbitmq-server:
    charm: cs:trusty/rabbitmq-server-26
    exposed: false
    relations:
      amqp:
      - glance
      - glance-simplestreams-sync
      - neutron-api
      - neutron-openvswitch
      - nova-cloud-controller
      - nova-compute
      - quantum-gateway
      cluster:
      - rabbitmq-server
    units:
      rabbitmq-server/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 2/lxc/8
        open-ports:
        - 5672/tcp
        public-address: 10.0.4.11

You can also look at a subset of the status for a particular service, for example keystone with:

$ juju status keystone
environment: local
machines:
  "1":
    agent-state: started
    agent-version: 1.20.11.1
    dns-name: 10.0.4.128
    instance-id: ubuntu-local-machine-1
    series: trusty
    containers:
      1/lxc/2:
        agent-state: started
        agent-version: 1.20.11.1
        dns-name: 10.0.4.142
        instance-id: ubuntu-local-machine-1-lxc-2
        series: trusty
        hardware: arch=amd64
    hardware: arch=amd64 cpu-cores=2 mem=6144M root-disk=20480M
services:
  keystone:
    charm: cs:trusty/keystone-12
    exposed: false
    relations:
      cluster:
      - keystone
      identity-service:
      - glance
      - glance-simplestreams-sync
      - neutron-api
      - nova-cloud-controller
      - openstack-dashboard
      shared-db:
      - mysql
    units:
      keystone/0:
        agent-state: started
        agent-version: 1.20.11.1
        machine: 1/lxc/2
        public-address: 10.0.4.142

Monitoring the Installation

When performing an installation you can monitor the executed commands with:

$ tail -f $HOME/.cloud-install/commands.log

...

This provides a lot of debugging output. A streamlined logging is actually possible with automated installation described later.

Uninstalling

As the single server instance is in a LXC container, as the documentation states uninstalling the environment is a rather trivial process that takes only a few seconds.

This will teardown the cloud but leaving userdata available for a subsequent deployment.

$ sudo openstack-install -k
Warning:

This will destroy the host Container housing the OpenStack private cloud. This is a permanent operation.
Proceed? [y/N] Y
Removing static route
Removing host container...
Container is removed.

You can also do a more permanent uninstall of the cloud and packages.

$ sudo openstack-install -u
Warning:

This will uninstall OpenStack and make a best effort to return the system back to its original state.
Proceed? [y/N] Y
Restoring system to last known state.
Ubuntu Openstack Installer Uninstalling ...Single install path.

This does not however seem to cleanup $HOME/.cloud-install. You can safely remove this or move it sideways when re-deploying without any issues.

Installation automation

As described in my original post, the openstack-install script is a cursors-based interactive view. You can automate the installation by defining the needed setup inputs in a separate configuration file and running in headless mode.

$ echo "install_type: Single
openstack_password: openstack" > install.yaml

$ sudo openstack-install --headless --config install.yaml

This has the added benefit providing a more meaningful log of the state of the installation with less verbose information then in the commands.log file.

[INFO  • 06-02 12:02:42 • cloudinstall.install] Running in headless mode.
[INFO  • 06-02 12:02:42 • cloudinstall.install] Performing a Single Install
[INFO  • 06-02 12:02:42 • cloudinstall.task] [TASKLIST] ['Initializing Environment', 'Creating container', 'Bootstrapping Juju']
[INFO  • 06-02 12:02:42 • cloudinstall.task] [TASK] Initializing Environment
[INFO  • 06-02 12:02:42 • cloudinstall.consoleui] Building environment
[INFO  • 06-02 12:02:42 • cloudinstall.single_install] Prepared userdata: {'extra_sshkeys': ['ssh-rsa ...\n'], 'seed_command': ['env', 'pollinate', '-q']}
[INFO  • 06-02 12:02:42 • cloudinstall.single_install] Setting permissions for user rbradfor
[INFO  • 06-02 12:02:43 • cloudinstall.task] [TASK] Creating container
[INFO  • 06-02 12:04:20 • cloudinstall.single_install] Setting DHCP properties for host container.
[INFO  • 06-02 12:04:20 • cloudinstall.single_install] Adding static route for 10.0.4.0/24 via 10.0.3.160
...
[INFO  • 06-02 12:22:50 • cloudinstall.consoleui] Checking availability of nova-cloud-controller: pending
[INFO  • 06-02 12:23:31 • cloudinstall.consoleui] Checking availability of nova-cloud-controller: installed
[INFO  • 06-02 12:23:52 • cloudinstall.consoleui] Checking availability of nova-cloud-controller: started
[INFO  • 06-02 12:24:34 • cloudinstall.consoleui] Checking availability of keystone: started
[INFO  • 06-02 12:24:44 • cloudinstall.consoleui] Checking availability of keystone: started
[INFO  • 06-02 12:24:44 • cloudinstall.consoleui] Checking availability of nova-cloud-controller: started
[INFO  • 06-02 12:27:38 • cloudinstall.consoleui] Checking availability of quantum-gateway: started
[INFO  • 06-02 12:27:38 • cloudinstall.consoleui] Checking availability of nova-cloud-controller: started
[INFO  • 06-02 12:27:38 • cloudinstall.consoleui] Validating network parameters for Neutron
[INFO  • 06-02 12:27:53 • cloudinstall.consoleui] All systems go!=

And 25 minutes later you have an available cloud.

If you attempt to look at the GUI status page with openstack-status you will be given a text based version of messages like.

$ sudo openstack-status
[INFO  • 06-02 12:06:21 • cloudinstall.core] Running openstack-status in headless mode.
[INFO  • 06-02 12:06:21 • cloudinstall.consoleui] Loaded placements from file.
[INFO  • 06-02 12:06:21 • cloudinstall.consoleui] Waiting for machines to start: 3 unknown
[INFO  • 06-02 12:08:20 • cloudinstall.consoleui] Waiting for machines to start: 1 pending, 2 unknown
[INFO  • 06-02 12:08:48 • cloudinstall.consoleui] Waiting for machines to start: 2 pending, 1 unknown
[INFO  • 06-02 12:09:04 • cloudinstall.consoleui] Waiting for machines to start: 1 down (started), 1 pending, 1 unknown
[INFO  • 06-02 12:09:13 • cloudinstall.consoleui] Waiting for machines to start: 1 down (started), 2 pending
[INFO  • 06-02 12:09:20 • cloudinstall.consoleui] Waiting for machines to start: 2 down (started), 1 pending
[INFO  • 06-02 12:09:26 • cloudinstall.consoleui] Waiting for machines to start: 1 pending, 2 started
[INFO  • 06-02 12:09:51 • cloudinstall.consoleui] Waiting for machines to start: 1 down (started), 2 started
[INFO  • 06-02 12:10:44 • cloudinstall.consoleui] Verifying service deployments
[INFO  • 06-02 12:10:44 • cloudinstall.consoleui] Missing ConsoleUI() attribute: set_pending_deploys
[INFO  • 06-02 12:10:44 • cloudinstall.consoleui] Checking if MySQL is deployed
[INFO  • 06-02 12:10:44 • cloudinstall.consoleui] Deploying MySQL to machine lxc:1
[INFO  • 06-02 12:10:49 • cloudinstall.consoleui] Deployed MySQL.
[INFO  • 06-02 12:10:49 • cloudinstall.consoleui] Checking if Juju GUI is deployed
[INFO  • 06-02 12:10:49 • cloudinstall.consoleui] Deploying Juju GUI to machine lxc:1
[INFO  • 06-02 12:11:00 • cloudinstall.consoleui] Deployed Juju GUI.
[INFO  • 06-02 12:11:00 • cloudinstall.consoleui] Checking if Keystone is deployed
[INFO  • 06-02 12:11:00 • cloudinstall.consoleui] Deploying Keystone to machine lxc:1
...

It seems you can trick it into providing both a GUI and text version with the following in another shell session.

$ sed -ie "/headless/d" $HOME/.cloud-install/config.yaml
$ sudo openstack-status

NOTE: You will not get any output until the initial container is completed. This also leaves a .pid file that must be manually cleaned up if you run to soon. The next invocation provides the following message.

$ sudo openstack-status
Another instance of openstack-status is running. If you're sure there are no other instances, please remove ~/.cloud-install/openstack.pid
$ rm $HOME/.cloud-install/openstack.pid

Monitoring the installation progress

The running config.yaml file changes over the duration of the installation.
It’s most basic configuration (when starting with the GUI) is:

$ more $HOME/.cloud-install/config.yaml
current_state: 0
openstack_release: juno

The release is also defined in the $HOME/.cloud-install/openstack_release file.

When starting by passing the configuration as previously mentioned it’s initial state is:

$ more $HOME/.cloud-install/config.yaml
config_file: install.yaml
current_state: 0
headless: true
install_type: Single
openstack_password: openstack
openstack_release: juno

This is updated when the LXC container is installed.

$ more $HOME/.cloud-install/config.yaml
config_file: install.yaml
container_ip: 10.0.3.77
current_state: 0
headless: true
install_type: Single
openstack_password: openstack
openstack_release: juno

And also updated during installation, such as.

$ more $HOME/.cloud-install/config.yaml
config_file: install.yaml
container_ip: 10.0.3.77
current_state: 0
headless: true
install_type: Single
openstack_password: openstack
openstack_release: juno
placements:
  controller:
    assignments:
      LXC:
      - nova-cloud-controller
      - glance
      - glance-simplestreams-sync
      - openstack-dashboard
      - juju-gui
      - keystone
      - mysql
      - neutron-api
      - neutron-openvswitch
      - rabbitmq-server
    constraints:
      cpu-cores: 2
      mem: 6144
      root-disk: 20480
  nova-compute-machine-0:
    assignments:
      BareMetal:
      - nova-compute
    constraints:
      mem: 4096
      root-disk: 40960
  quantum-gateway-machine-0:
    assignments:
      BareMetal:
      - quantum-gateway
    constraints:
      mem: 2048
      root-disk: 20480

When completed the configuration has the following settings.

config_file: install.yaml
container_ip: 10.0.3.77
current_state: 2
deploy_complete: true
install_type: Single
openstack_password: openstack
openstack_release: juno
placements:
...

Problems

When using the GUI installer the first time you quit (using Q), it seems to leave the terminal state wrong. The following will reset this to normal use.

$ stty sane  ^j    # (i.e. Ctrl-J together).

Subsequent uses of openstack-status do not have the same problem.

References

In my next post I am going to talk about the analysis taken to debug errors in the installation, starting with Keystone – hook failed: “config-changed” message I got attempting to install kilo, and hence this more detailed analysis of the installation process components.

Installing Ubuntu OpenStack

The The Canonical Distribution of Ubuntu OpenStack provides a simple installer to run an OpenStack cloud. You can deploy a simple single machine setup with fully containerized services (11 in total), or a multi server installation leveraging MAAS – Metal as a Service and Landscape Autopilot.

Installation

This post describes my experiences with the single machine setup on a 4 core machine with 32GB of RAM with a clean Ubuntu 14.04 LTS OS. The installation requires the following commands to configure the repo, install and configure your OpenStack cloud. In this example, the installed version is 0.22.3.

sudo apt-add-repository -y ppa:cloud-installer/stable
sudo apt-get update
sudo apt-get install -y openstack
sudo openstack-install --version
sudo openstack-install

The final step uses a cusors-based interface and only requires two steps before the installation.

  • Specify a password
  • Specify the install type




The UI provides a progress status of the installation. Initially new containers will start with a Pending status. Following the starting of the Juju GUI container the footer bar shows the URL for the JujuGUI, in my case http://10.0.4.112. Following the starting of the Openstack Dashboard you will then get a Horizon URL also detailed in the footer such as http://10.0.4.74/horizon.






Horizon

The Horizon display is what you generally expect.




JujuGUI

The JujuGUI provides a display of the deployment orchestration via charms. You can also drill down to specific services. An example is for the glance service using the charm cs:trusty/glance-11. This describes the relationships and configuration which are also seen in the GUI. You can also view online the full source code used to create this deployed service.




OpenStack Status

You can view the state of your containerized cloud with openstack-status which is a cursors-based display of the running installation, the same used during the installation. This displays the units deployed, status messages and a footer URL bar that indicates the URL’s of Horizon and JujuGUI. Each time you invoke this it will also check services, as indicated by the [INFO] messages.


Connecting to Containers

The installer will automatically create a SSH key for the user that you use to run the openstack-install command. This enables you to SSH to any of the containers, for example to connect to the MySQL container.

ssh [email protected]
$ mysql -uroot -p`sudo cat /var/lib/mysql/mysql.passwd` -e "SHOW SCHEMAS"
+--------------------+
| Database           |
+--------------------+
| information_schema |
| glance             |
| keystone           |
| mysql              |
| neutron            |
| nova               |
| performance_schema |
+--------------------+

You can use the various OpenStack clients to access OpenStack services. These are not installed by default.

sudo apt-get install -y python-glanceclient python-openstackclient python-novaclient python-keystoneclient
$ source $HOME/.cloud-install/openstack-admin-rc
$ glance image-list
+--------------------------------------+---------------------------------------------------------------+-------------+------------------+-----------+--------+
| ID                                   | Name                                                          | Disk Format | Container Format | Size      | Status |
+--------------------------------------+---------------------------------------------------------------+-------------+------------------+-----------+--------+
| f3cd4ec6-8ce6-4a44-85ec-2f8f066f351b | auto-sync/ubuntu-trusty-14.04-amd64-server-20150528-disk1.img | qcow2       | bare             | 257294848 | active |
+--------------------------------------+---------------------------------------------------------------+-------------+------------------+-----------+--------+

More Information

Read Tracking the Ubuntu OpenStack installation process for more detailed information on monitoring the installation process.

Thanks to the New York OpenStack Group and a presentation by Mark Baker of Canonical who demonstrated MAAS and Landscape AutoPilot installation of OpenStack. Slides of Automating hard things slides.

My thoughts on Architecture and Software Development with MySQL

Yesterday I was able to present to the Portland MySQL Users Group two presentations that are important foundations for effective development with MySQL.

With 26 years of architectural experience in RDBMS and 16 years of MySQL knowledge, my extensive exposure to large and small companies from consulting has lead to these presentations containing common obstacles I have seen an help organization with. All that can be easily avoided. The benefits of improved development and better processes leads to better quality software, better performance and a lower cost of ownership, that is saving companies money.

Thanks to Emily and Daniel for organizing and New Relic for hosting.

Python 3 semantics for integer division

As I refresh my skills in Python 2 to Python 3 semantics I discovered there is a difference in the division operator (i.e. /).

When using integers in Python 2 the result (by default) is an integer. For example.

$ python2
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
>>> 1/2
0
>>> 1/2.0
0.5

In Python 3 the result is a float.

$ python3
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
>>> 1/2
0.5

It has been encouraged in the Porting Python 2 Code to Python 3 documentation to perform the following import.

$ python2
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
>>> from __future__ import division
>>> 1/2
0.5

I was also unaware of the floor operator (i.e. //) as specified in PEP 238.

$ python2
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
>>> from __future__ import division
>>> 1//2
0
>>> 1/2
0.5

I uncovered this by playing with algorithms between versions. This Newton Method for the Square Root was something I was unaware of, and this example failed when using Python 2.

My improved version on the referenced example without an import.

def squareroot(number, precision = 5):
  root = number/2.0
  for i in range(20):
    nroot = (1/2.0)*(root + (number / root))
    #print i, nroot
    if (root - nroot < 1.0/10**precision):
      break
    root = nroot
  return round(nroot, precision)
>>> squareroot(10)
3.16228
>>> squareroot(10,1)
3.2
>>> squareroot(10,2)
3.16
>>> squareroot(10,5)
3.16228
>>> squareroot(10,10)
3.1622776602

Installing Python 3.3 on Ubuntu 14.04.2 LTS

Ubuntu 14.04 by default uses Python 2.7 and 3.4. If you want to install Python 3.3, in my case because various Openstack projects that maintain 3.3 compatibility.

I had a hard time finding what I would consider an official means. These are a third party PPA steps. I welcome comments for any other ways to install multiple Python environments.

$ sudo apt-get install -y python-software-properties
$ sudo add-apt-repository -y ppa:fkrull/deadsnakes
gpg: keyring `/tmp/tmpuljbio98/secring.gpg' created
gpg: keyring `/tmp/tmpuljbio98/pubring.gpg' created
gpg: requesting key DB82666C from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpuljbio98/trustdb.gpg: trustdb created
gpg: key DB82666C: public key "Launchpad Old Python Versions" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)

NOTE: The add repo command prompts for a user response without -y.

$ sudo add-apt-repository ppa:fkrull/deadsnakes

 This PPA has older and newer Python versions for Ubuntu. The packages in the official archives generally don't go back all that far, but people might still need to develop and test against these old Python interpreters. There also was a time when Google App Engine still ran on Python 2.5, but nobody likes to talk about that.

A disclaimer first: I do not guarantee any kind of updates. In particular, I shed all responsibility for security issues in these packages. If you want to use them in a security-or-otherwise-critical environment (say, on a production server), you do so at your own risk.

For Python 2.7 updates for supported Ubuntu releases, see my dedicated 2.7 PPA:

https://launchpad.net/~fkrull/+archive/ubuntu/deadsnakes-python2.7

Reporting Issues
================
Issues can be reported in the master issue tracker at:

https://bitbucket.org/fk/deadsnakes-issues

Donations
=========
If you like what I'm doing here, you can show your appreciation by donating:

https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=TTTFWBJ2DZK6E

Supported Ubuntu Versions
=========================
Supported — Precise, Trusty, Utopic

Generally, I try to support Ubuntu releases until their official End-of-Life.

Supported Python Versions
=========================
Currently supported releases — 2.3, 2.4, 2.5, 2.6, 2.7, 3.1, 3.2, 3.3, 3.4

Basically, if an Ubuntu version doesn't have an official package for a specific major Python version (be it "any more" or "yet"), look for one in this PPA. However, for a given Python major release, don't expect to find newer point releases if there is already an older point release in the official Ubuntu repositories (i.e., if an Ubuntu release has a package for Python 2.6.4, I won't provide a package with 2.6.5 for that Ubuntu release): newer Python point releases shouldn't add new features or change behaviour, so they're rather pointless (no pun intended) for development and testing; conversely, if that Python point release has a bug that is fixed in a newer release, that's still an issue with the original package and should be taken up with the Ubuntu or Debian maintainer of the package. Besides, making these update packages externally to the original repositories is a bit of a pain.

As an exception, I have updated Python 2.7 packages for several Ubuntu releases in a separate PPA:

https://launchpad.net/~fkrull/+archive/ubuntu/deadsnakes-python2.7

Supported Python Packages
=========================
Using third-party modules packaged for Debian or Ubuntu with the Python interpreters from this repository is a bit of a mixed bag. For Python 2, Python modules from the official repositories will not work, as a consequence of how Python packaging works in Debian. For Python 3 on the other hand, all pure-Python module packages at least should be available; compiled extension modules will not work however.

In general, you're better off installing Python modules using the common Python packaging tools rather than the system package manager. For an introduction into the Python packaging ecosystem and its tools, refer to the Python Packaging User Guide [1].

[1] https://packaging.python.org
 More info: https://launchpad.net/~fkrull/+archive/ubuntu/deadsnakes
Press [ENTER] to continue or ctrl-c to cancel adding it

gpg: keyring `/tmp/tmp1rda6tjx/secring.gpg' created
gpg: keyring `/tmp/tmp1rda6tjx/pubring.gpg' created
gpg: requesting key DB82666C from hkp server keyserver.ubuntu.com
gpg: /tmp/tmp1rda6tjx/trustdb.gpg: trustdb created
gpg: key DB82666C: public key "Launchpad Old Python Versions" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
OK
$ sudo apt-get update
$ sudo apt-cache show python3.3
Package: python3.3
Priority: optional
Section: python
Installed-Size: 255
Maintainer: Felix Krull 
Architecture: amd64
Version: 3.3.6-4+trusty1
Suggests: python3.3-doc, binutils
Depends: python3.3-minimal (= 3.3.6-4+trusty1), libpython3.3-stdlib (= 3.3.6-4+trusty1), mime-support
Filename: pool/main/p/python3.3/python3.3_3.3.6-4+trusty1_amd64.deb
Size: 136098
MD5sum: 924f7fcd5e84d0938c1ae8e8c5b7f226
SHA1: 8e34edec87644b7c620a98a5f2e5fa54f4cbba67
SHA256: 44bc419559695dd78f9b5e37a9bf7ce586c4d3b1922e896bb15ed9d243cb578c
Description-en: Interactive high-level object-oriented language (version 3.3)
 Python is a high-level, interactive, object-oriented language. Its 3.3 version
 includes an extensive class library with lots of goodies for
 network programming, system administration, sounds and graphics.
Description-md5: d77f2abf1b0095e2e7bf5e21022e3d54
Multi-Arch: allowed
Original-Maintainer: Matthias Klose 

And now you can install the specific version.

$ sudo apt-get install -y python3.3 python3.3-dev

At this time I now have 3 different versions as well as the python and python3 aliases.

$ python --version
Python 2.7.6

$ python3 --version
Python 3.4.0

$ python2.7 --version
Python 2.7.6

$ python3.3 --version
Python 3.3.6

$ python3.4 --version
Python 3.4.0

This enabled me to now use Python 3.3. for my Openstack tox testing which at first pass produces the same error as Python 3.4

$ tox -epy33 --notest

Percona Live Presentation: MySQL Security Essentials

The slides for my MySQL Security Essentials presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss just how insecure legacy versions of MySQL are and what are the essential requirements for securing your installation on disk, via network and with user privileges. I provide recommendations for how to manage application access for your most important data asset.

This presentation describes the key security improvements in MySQL 5.6 and MySQL 5.7 as well as additional features provided in MariaDB 10.0 and 10.1 supporting roles and encryption.

I have also included slides for how easy it is to Hack MySQL and examples of denial of service attacks that are possible with even limited MySQL access.

Percona Live Presentation: Improving Performance With Better Indexes

The slides for my Improving Performance With Better Indexes presentation at Percona Live 2015 MySQL Conference and Expo are now available.

In this presentation I discuss how to identify, review and analyze SQL statements in order to create better indexes for your queries. This includes understanding the EXPLAIN syntax and how to create and identify covering and partial column indexes.

This presentation is based on the work with a customer showing the 95% improvement of a key 15 table join query running 15,000 QPS in a 25 billion SQL statements per day infrastructure.

As mentioned, Explaining the MySQL Explain is an additional presentation that goes into more detail for learning how to read Query Execution Plans (QEP) in MySQL.

Updating MySQL on Ubuntu 12.04 LTS to MySQL 5.6

The Ubuntu 12.04.3 LTS release only provides MySQL 5.1 and MySQL 5.5 using the default Ubuntu package manager.

Oracle (owners of the MySQL(tm)) now provide Debian/Ubuntu APT repositories for all GA and DMR versions of MySQL including supporting Ubuntu 12.04.

The following steps demonstrate upgrading from the Ubuntu 5.5 server package to the Oracle 5.6 server package.

Verify MySQL Packages

$ apt-cache search mysql-server
mysql-server - MySQL database server (metapackage depending on the latest version)
mysql-server-5.5 - MySQL database server binaries and system database setup
mysql-server-core-5.5 - MySQL database server binaries
auth2db - Powerful and eye-candy IDS logger, log viewer and alert generator
cacti - Frontend to rrdtool for monitoring systems and services
torrentflux - web based, feature-rich BitTorrent download manager

Verify MySQL on Server

$  dpkg -l | grep mysql
ii  libdbd-mysql-perl                      4.020-1build2                                       Perl5 database interface to the MySQL database
ii  libmysqlclient-dev                     5.5.34-0ubuntu0.12.04.1                             MySQL database development files
ii  libmysqlclient18                       5.5.34-0ubuntu0.12.04.1                             MySQL database client library
ii  mysql-client-5.5                       5.5.31-0ubuntu0.12.04.1                             MySQL database client binaries
ii  mysql-client-core-5.5                  5.5.34-0ubuntu0.12.04.1                             MySQL database core client binaries
ii  mysql-common                           5.5.34-0ubuntu0.12.04.1                             MySQL database common files, e.g. /etc/mysql/my.cnf
ii  mysql-server-5.5                       5.5.31-0ubuntu0.12.04.1                             MySQL database server binaries and system database setup
ii  mysql-server-core-5.5                  5.5.31-0ubuntu0.12.04.1                             MySQL database server binaries
ii  php5-mysqlnd                           5.3.10-1ubuntu3.8                                   MySQL module for php5 (Native Driver)

Results may vary based on dependencies.

Checking the MySQL error log (as it’s the right good practice to always do)

$ sudo tail -50 /var/log/mysql/error.log
150402 16:02:49 [Note] Plugin 'FEDERATED' is disabled.
150402 16:02:49 InnoDB: The InnoDB memory heap is disabled
150402 16:02:49 InnoDB: Mutexes and rw_locks use GCC atomic builtins
150402 16:02:49 InnoDB: Compressed tables use zlib 1.2.3.4
150402 16:02:49 InnoDB: Initializing buffer pool, size = 1.0G
150402 16:02:49 InnoDB: Completed initialization of buffer pool
150402 16:02:49 InnoDB: highest supported file format is Barracuda.
150402 16:02:49  InnoDB: Waiting for the background threads to start
150402 16:02:50 InnoDB: 5.5.31 started; log sequence number 20079278867
150402 16:02:50 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
150402 16:02:50 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
150402 16:02:50 [Note] Server socket created on IP: '127.0.0.1'.
150402 16:02:50 [Note] Event Scheduler: Loaded 0 events
150402 16:02:50 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.31-0ubuntu0.12.04.1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Ubuntu)

A check shows that this is not the most current version available of 5.5 using the Ubuntu packages.

$ apt-cache show mysql-server-5.5
Package: mysql-server-5.5
Priority: optional
Section: database
Installed-Size: 31947
Maintainer: Ubuntu Developers 
Original-Maintainer: Debian MySQL Maintainers 
Architecture: amd64
Source: mysql-5.5
Version: 5.5.41-0ubuntu0.12.04.1
...

Package: mysql-server-5.5
Status: install ok installed
Priority: optional
Section: database
Installed-Size: 31950
Maintainer: Ubuntu Developers 
Architecture: amd64
Source: mysql-5.5
Version: 5.5.31-0ubuntu0.12.04.1
...

Just to be consistent with keeping current versions, you may choose to update MySQL 5.5 to the current available version.

$ sudo apt-get install mysql-server-5.5
...

Installing Oracle APT Packaging

The recommended documented way to move to using the Oracle repo is:

cd /tmp
# See https://dev.mysql.com/downloads/repo/apt/ for your right distro version
wget https://dev.mysql.com/get/mysql-apt-config_0.3.3-2ubuntu12.04_all.deb
sudo dpkg -i mysql-apt-config*.deb

This unfortunately uses a cursors based interface which is not something you automate for production systems and not the approach I would suggest.

So doing what this does

echo "deb http://repo.mysql.com/apt/ubuntu/ precise mysql-apt-config
deb http://repo.mysql.com/apt/ubuntu/ precise mysql-5.6" | sudo tee /etc/apt/sources.list.d/mysql.list
curl -s http://ronaldbradford.com/mysql/mysql.gpg | sudo apt-key add -
sudo apt-get update

Now we can look at available versions.


$ apt-cache search mysql-server
mysql-server-5.5 - MySQL database server binaries and system database setup
mysql-server-core-5.5 - MySQL database server binaries
auth2db - Powerful and eye-candy IDS logger, log viewer and alert generator
cacti - Frontend to rrdtool for monitoring systems and services
torrentflux - web based, feature-rich BitTorrent download manager
mysql-community-server - MySQL Server
mysql-server - MySQL Server meta package depending on latest version

This is where life gets a little confusing. Because Ubuntu supported MySQL 5.1 (as mysql-server) and MySQL 5.5 (as mysql-server-5.5) it can be misleading.

$ apt-cache show mysql-server
$ apt-cache show mysql-server
Package: mysql-server
Source: mysql-community
Version: 5.6.23-1ubuntu12.04
Architecture: amd64
Maintainer: MySQL Release Engineering 
Installed-Size: 46
Depends: mysql-community-server (= 5.6.23-1ubuntu12.04)
Homepage: http://www.mysql.com/
Priority: optional
Section: database
Filename: pool/mysql-5.6/m/mysql-community/mysql-server_5.6.23-1ubuntu12.04_amd64.deb
Size: 11644
SHA256: 1cb166cd230d2a4daca761ea80f2f34ee1fc0c92aaae972c914d81746f235d63
SHA1: 63548c852d5faeda751fbf038c0799fbbeac9905
MD5sum: da2f709a29a7cac97c834e6e69929891
Description: MySQL Server meta package depending on latest version
 The MySQL(TM) software delivers a very fast, multi-threaded, multi-user,
 and robust SQL (Structured Query Language) database server. MySQL Server
 is intended for mission-critical, heavy-load production systems as well
 as for embedding into mass-deployed software. MySQL is a trademark of
 Oracle. This is a meta package that depends on the latest mysql server
 package available in the repository.

Package: mysql-server
Priority: optional
Section: database
Installed-Size: 114
Maintainer: Ubuntu Developers 
Original-Maintainer: Debian MySQL Maintainers 
Architecture: all
Source: mysql-5.5
Version: 5.5.41-0ubuntu0.12.04.1
...

We are looking to ensure the Maintainer is the Official Release.

Upgrading to MySQL 5.6

sudo service mysql stop
ps -ef | grep mysql
sudo apt-get install -y mysql-server
$ sudo apt-get install -y mysql-server
...
The following extra packages will be installed:
  mysql-client mysql-common mysql-community-client mysql-community-server
The following packages will be REMOVED:
  mysql-client-5.5 mysql-client-core-5.5 mysql-server-5.5 mysql-server-core-5.5
The following NEW packages will be installed:
  mysql-client mysql-community-client mysql-community-server mysql-server
The following packages will be upgraded:
  mysql-common
...
Configuration file `/etc/mysql/my.cnf'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** my.cnf (Y/I/N/O/D/Z) [default=N] ? N
...
Installing new version of config file /etc/apparmor.d/usr.sbin.mysqld ...
2015-04-02 16:53:07 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 16:53:07 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
150402 16:53:14 mysqld_safe Can't log to error log and syslog at the same time.  Remove all --log-error configuration options for --syslog to take effect
...

You may think the process is completed, but it is not. Always, Always check the error log. Have you checked your MySQL error log today?

$ sudo tail -300 /var/log/mysql/error.log
...
2015-04-02 16:53:14 20429 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)

2015-04-02 16:53:14 20429 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 16:53:14 20429 [Note] Plugin 'FEDERATED' is disabled.
2015-04-02 16:53:14 20429 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-04-02 16:53:14 20429 [Note] InnoDB: The InnoDB memory heap is disabled
2015-04-02 16:53:14 20429 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-04-02 16:53:14 20429 [Note] InnoDB: Memory barrier is not used
2015-04-02 16:53:14 20429 [Note] InnoDB: Compressed tables use zlib 1.2.3.4
2015-04-02 16:53:14 20429 [Note] InnoDB: Using Linux native AIO
2015-04-02 16:53:14 20429 [Note] InnoDB: Not using CPU crc32 instructions
2015-04-02 16:53:14 20429 [Note] InnoDB: Initializing buffer pool, size = 1.0G
2015-04-02 16:53:15 20429 [Note] InnoDB: Completed initialization of buffer pool
2015-04-02 16:53:15 20429 [Note] InnoDB: Highest supported file format is Barracuda.
2015-04-02 16:53:15 20429 [Note] InnoDB: 128 rollback segment(s) are active.
2015-04-02 16:53:15 20429 [Note] InnoDB: Waiting for purge to start
2015-04-02 16:53:15 20429 [Note] InnoDB: 5.6.23 started; log sequence number 20079286519
2015-04-02 16:53:15 20429 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 487fda28-d97a-11e4-9254-e0cb4e3feb73.
2015-04-02 16:53:15 20429 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
2015-04-02 16:53:15 20429 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2015-04-02 16:53:15 20429 [Note] Server socket created on IP: '127.0.0.1'.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.events_waits_current is wrong. Expected 19, found 16. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.events_waits_history is wrong. Expected 19, found 16. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.events_waits_history_long is wrong. Expected 19, found 16. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_host_by_event_name' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Incorrect definition of table performance_schema.events_waits_summary_by_thread_by_event_name: expected column 'THREAD_ID' at position 0 to have type bigint(20), found type int(11).
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_user_by_event_name' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_account_by_event_name' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.file_summary_by_event_name is wrong. Expected 23, found 5. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.file_summary_by_instance is wrong. Expected 25, found 6. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'host_cache' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Incorrect definition of table performance_schema.mutex_instances: expected column 'LOCKED_BY_THREAD_ID' at position 2 to have type bigint(20), found type int(11).
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'objects_summary_global_by_type' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Incorrect definition of table performance_schema.rwlock_instances: expected column 'WRITE_LOCKED_BY_THREAD_ID' at position 2 to have type bigint(20), found type int(11).
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'setup_actors' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'setup_objects' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'table_io_waits_summary_by_index_usage' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'table_io_waits_summary_by_table' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'table_lock_waits_summary_by_table' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Column count of mysql.threads is wrong. Expected 14, found 3. Created with MySQL 50541, now running 50623. Please use mysql_upgrade to fix this error.
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_stages_current' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_stages_history' has the wrong structure
2015-04-02 16:53:15 20429 [ERROR] Native table 'performance_schema'.'events_stages_history_long' has the wrong structure
...

Completing the MySQL 5.6 Upgrade

A MySQL upgrade of the meta schema is necessary.

$ sudo mysql_upgrade -uroot -p
Enter password:
Looking for 'mysql' as: mysql
Looking for 'mysqlcheck' as: mysqlcheck
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
mysql.columns_priv                                 OK
mysql.db                                           OK
mysql.event                                        OK
mysql.func                                         OK
mysql.general_log                                  OK
mysql.help_category                                OK
mysql.help_keyword                                 OK
mysql.help_relation                                OK
mysql.help_topic                                   OK
mysql.host                                         OK
mysql.ndb_binlog_index                             OK
mysql.plugin                                       OK
mysql.proc                                         OK
mysql.procs_priv                                   OK
mysql.proxies_priv                                 OK
mysql.servers                                      OK
mysql.slow_log                                     OK
mysql.tables_priv                                  OK
mysql.time_zone                                    OK
mysql.time_zone_leap_second                        OK
mysql.time_zone_name                               OK
mysql.time_zone_transition                         OK
mysql.time_zone_transition_type                    OK
mysql.user                                         OK
Running 'mysql_fix_privilege_tables'...
Warning: Using a password on the command line interface can be insecure.
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
Running 'mysqlcheck' with connection arguments: '--port=3306' '--socket=/var/run/mysqld/mysqld.sock'
Warning: Using a password on the command line interface can be insecure.
...
OK
$ sudo service mysql restart
$ sudo service mysql restart
 * Stopping MySQL Community Server 5.6.23
....
 * MySQL Community Server 5.6.23 is stopped
 * Re-starting MySQL Community Server 5.6.23
150402 17:06:17 mysqld_safe Can't log to error log and syslog at the same time.  Remove all --log-error configuration options for --syslog to take effect.
......
 * MySQL Community Server 5.6.23 is started
$ sudo tail -300 /var/log/mysql/error.log
...
2015-04-02 17:06:15 20429 [Note] /usr/sbin/mysqld: Shutdown complete

150402 17:06:15 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
150402 17:06:17 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
2015-04-02 17:06:17 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 17:06:17 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-04-02 17:06:17 20994 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 5000)

2015-04-02 17:06:17 20994 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)

2015-04-02 17:06:17 20994 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2015-04-02 17:06:17 20994 [Note] Plugin 'FEDERATED' is disabled.
2015-04-02 17:06:17 20994 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-04-02 17:06:17 20994 [Note] InnoDB: The InnoDB memory heap is disabled
2015-04-02 17:06:17 20994 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-04-02 17:06:17 20994 [Note] InnoDB: Memory barrier is not used
2015-04-02 17:06:17 20994 [Note] InnoDB: Compressed tables use zlib 1.2.3.4
2015-04-02 17:06:17 20994 [Note] InnoDB: Using Linux native AIO
2015-04-02 17:06:17 20994 [Note] InnoDB: Not using CPU crc32 instructions
2015-04-02 17:06:17 20994 [Note] InnoDB: Initializing buffer pool, size = 1.0G
2015-04-02 17:06:17 20994 [Note] InnoDB: Completed initialization of buffer pool
2015-04-02 17:06:17 20994 [Note] InnoDB: Highest supported file format is Barracuda.
2015-04-02 17:06:17 20994 [Note] InnoDB: 128 rollback segment(s) are active.
2015-04-02 17:06:17 20994 [Note] InnoDB: Waiting for purge to start
2015-04-02 17:06:17 20994 [Note] InnoDB: 5.6.23 started; log sequence number 20081020877
2015-04-02 17:06:17 20994 [Note] Server hostname (bind-address): '127.0.0.1'; port: 3306
2015-04-02 17:06:17 20994 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2015-04-02 17:06:17 20994 [Note] Server socket created on IP: '127.0.0.1'.
2015-04-02 17:06:17 20994 [Note] Event Scheduler: Loaded 0 events
2015-04-02 17:06:17 20994 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.23'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)

Correcting errors

As you can see there are several warnings/errors when starting MySQL.

The first is

 mysqld_safe Can't log to error log and syslog at the same time.  Remove all --log-error configuration options for --syslog to take effect.

We solve this with

$ sudo rm -f /etc/mysql/conf.d/mysqld_safe_syslog.cnf
$ sudo service mysql restart
 * Stopping MySQL Community Server 5.6.23
....
 * MySQL Community Server 5.6.23 is stopped
 * Re-starting MySQL Community Server 5.6.23
......
 * MySQL Community Server 5.6.23 is started

This is an Ubuntu default that conflicts with the my.cnf log_error were are familiar with in monitoring the MySQL error log. You can read my opinion on this in The correct approach to rolling MySQL logs

The second is

2015-04-02 17:06:17 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.

We solve this with

sudo sed -ie "s/^key_buffer[^_]/key_buffer_size/" /etc/mysql/my.cnf

Next

2015-04-02 17:18:06 22123 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.

We solve this with

sudo sed -ie "s/^myisam-recover[^-]/myisam-recover-options/" /etc/mysql/my.cnf

The warnings are interesting, and will part of the following post on MySQL 5.6 configuration changes discussed in the next point.

2015-04-02 17:22:08 22626 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 5000)
2015-04-02 17:22:08 22626 [Warning] Buffered warning: Changed limits: table_cache: 431 (requested 2000)

Leveraging MySQL 5.6 benefits

We may now have a MySQL 5.6 installation however we are far from utilizing the benefits of MySQL 5.6 fully. In a subsequent post I will talk about the configuration options we need to now consider, both new options such as innodb_purge_threads and important improvements such as sync_binlog. There are far greater complex changes including innodb_file_per_table, master_info_repository and relay_log_info_repository and then changes in defaults such as performance_schema

Validating MySQL version numbers

As part of a MySQL 5.5 to MySQL 5.6 upgrade across several Ubuntu servers of varying distros an audit highlighted a trivial but interesting versioning identification error in Ubuntu’s packaging of MySQL.

Ubuntu 12.04 LTS

$ sudo dpkg -l | grep mysql-server-5.5
ii  mysql-server-5.5   5.5.41-0ubuntu0.12.04.1  ...
$ mysql -uroot -p -e "SELECT VERSION()"
+-------------------------+
| VERSION()               |
+-------------------------+
| 5.5.41-0ubuntu0.12.04.1 |
+-------------------------+

But when you look at the mysql --version it does NOT say 5.5.41.

$ mysql --version
mysql  Ver 14.14 Distrib 5.5.34, for debian-linux-gnu (x86_64) using readline 6.2

Ubuntu 14.04 LTS

On 14.04 I get expected results.

$ sudo dpkg -l | grep mysql-server-5.5
ii  mysql-server-5.5       5.5.41-0ubuntu0.14.04.1   ...
rbradfor@rubble:~$ mysql -uroot -p -e "SELECT VERSION()"
+-------------------------+
| VERSION()               |
+-------------------------+
| 5.5.41-0ubuntu0.14.04.1 |
+-------------------------+
$ mysql --version
mysql  Ver 14.14 Distrib 5.5.41, for debian-linux-gnu (x86_64) using readline 6.3

Dynamic recreation of InnoDB redo logs

MySQL 5.6 will now automatically recreate the InnoDB redo log files during a MySQL restart if the size (or number) of these logs changes, i.e. a change to innodb_log_file_size. See Changing the Number or Size of InnoDB Log Files which states “If InnoDB detects that the innodb_log_file_size differs from the redo log file size, it will write a log checkpoint, close and remove the old log files, create new log files at the requested size, and open the new log files.”

Before MySQL 5.6 it was necessary to stop MySQL and remove the InnoDB log files manually before restarting MySQL.

The error log shows:

tail -f /mysql/log/error.log
...
2015-03-28 21:51:25 3767 [Warning] InnoDB: Resizing redo log from 2*3072 to 2*65536 pages, LSN=1626017
2015-03-28 21:51:25 3767 [Warning] InnoDB: Starting to delete and rewrite log files.
2015-03-28 21:51:25 3767 [Note] InnoDB: Setting log file ./ib_logfile101 size to 1024 MB
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000
2015-03-28 21:51:28 3767 [Note] InnoDB: Setting log file ./ib_logfile1 size to 1024 MB
InnoDB: Progress in MB: 100 200 300 400 500 600 700 800 900 1000
2015-03-28 21:51:31 3767 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2015-03-28 21:51:31 3767 [Warning] InnoDB: New log files created, LSN=1626017
...

It was however odd that MySQL had indicated it had successfully started but the underlying Redo log files were not complete and in-place as seen by the following directory listings.

$ sudo service mysql start
......
 * MySQL Community Server 5.6.23 is started
$
$ ls -lh /var/lib/mysql/
total 1.9G
-rw-rw---- 1 mysql mysql   56 Mar 28 19:42 auto.cnf
-rw-rw---- 1 mysql mysql  12M Mar 28 21:51 ibdata1
-rw-rw---- 1 mysql mysql 902M Mar 28 21:51 ib_logfile1
-rw-rw---- 1 mysql mysql 1.0G Mar 28 21:51 ib_logfile101
drwxr-x--- 2 mysql mysql 4.0K Mar 28 19:42 mysql
drwx------ 2 mysql mysql 4.0K Mar 28 19:42 performance_schema
$ ls -lh /var/lib/mysql/
total 2.1G
-rw-rw---- 1 mysql mysql   56 Mar 28 19:42 auto.cnf
-rw-rw---- 1 mysql mysql  12M Mar 28 21:51 ibdata1
    -rw-rw---- 1 mysql mysql 1.0G Mar 28 21:51 ib_logfile0 -rw-rw---- 1 mysql mysql 1.0G Mar 28 21:51 ib_logfile1 drwxr-x--- 2 mysql mysql 4.0K Mar 28 19:42 mysql drwx------ 2 mysql mysql 4.0K Mar 28 19:42 performance_schema

SQL, ANSI Standards, PostgreSQL and MySQL

I have recently been working with the Donors Choose Open Data Set which happens to be in PostgreSQL. Easy enough to install and load the data in PostgreSQL, however as I live and breath MySQL, lets load the data into MySQL.

And here is where start our discussion, first some history.

SQL History

SQL – Structure Query Language is a well known common language for communicating with Relational Databases (RDBMS). It is not the only language I might add, having both used many years ago and just mentioned QUEL at a Looker Look and Tell event in New York. It has also been around since the 1970s making it; along with C; one of oldest in general use programming languages today.

SQL became an ANSI standard in 1986, and an ISO standard in 1987. The purpose of a standard is to provide commonality when communicating or exchanging information; in our case; a programming language communicating with a RDBMS. There have been several iterations of the standard as functionality and syntax improves. These are commonly referred to as SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2006, SQL:2008 and SQL:2011.

And so, with SQL being a standard it means that what we can do in PostgreSQL should translate to what we can do in MySQL.

SQL Communication

Both products provide a Command Line Interface (CLI) client tool for SQL communication, mysql for MySQL and psql for PostgreSQL. No surprises there. Both use by default the semicolon ; as a SQL statement terminator, and both CLI tools use \q as a means to quit and exit the tool. Certainly not a standard but great for syntax compatibility.

DDL Specification

Our journey begins with defining tables.

DROP TABLE

Both products SQL syntax support DROP TABLE. Infact, both support the DROP TABLE [IF EXISTS] syntax.

DROP TABLE donorschoose_projects;
DROP TABLE IF EXISTS donorschoose_projects;

CREATE TABLE

Both support CREATE TABLE.

Both support defining columns in the typical format <column_name> <datatype>, and both support the NOT NULL attribute. Talking about specific datatypes for columns is a topic on its own and so I discuss this later.

The PostgreSQL syntax was a table option WITHOUT OIDS which is not valid in MySQL. It is also obsolescent syntax in PostgreSQL 9.3. From the PostgreSQL manual “This optional clause specifies whether rows of the new table should have OIDs (object identifiers) assigned to them. The default is to have OIDs. Specifying WITHOUT OIDS allows the user to suppress generation of OIDs for rows of a table. This may be worthwhile for large tables … Specifying WITHOUT OIDS also reduces the space required to store the table on disk by 4 bytes per row of the table, thereby improving performance.”

In this example as this is just for testing, dropping the WITHOUT OIDS syntax creates a mutually compatible syntax.

Comments

Both MySQL and PostgreSQL support -- as an inline comment in an SQL statement. No need to strip those out.

ALTER TABLE

Both support ALTER TABLE ADD CONSTRAINT syntax which in our example is used to define the PRIMARY KEY, however while the syntax remains the same, the choice of datatype affects the outcome.

The following works in both products when the datatype is CHARACTER(32). More about CHARACTER() later.

ALTER TABLE donorschoose_projects ADD CONSTRAINT pk_donorschoose_projects PRIMARY KEY(_projectid);

In our example dataset, the primary key is defined with a TEXT datatype, and in MySQL this fails.

ERROR 1170 (42000): BLOB/TEXT column '_projectid' used in key specification without a key length

As the data in the dataset for primary keys by further analysis is indeed a 32 byte hexadecimal value, this is changed to CHARACTER(32) to be compatible for this data loading need. This however is an important key difference in any migration process with other data sets.

Side Note

Both products support the definition of the PRIMARY KEY in the CREATE TABLE syntax two different ways.

CREATE TABLE demo_pk1 (id character(32) NOT NULL PRIMARY KEY);
CREATE TABLE demo_pk2 (id character(32) NOT NULL, PRIMARY KEY(id));

CREATE INDEX

Both use CREATE INDEX syntax however with our sample dataset, this is the first observed difference in syntax with provided sample SQL statements.

PostgresSQL

CREATE INDEX projects_schoolid ON projects USING btree (_schoolid);

MySQL
The USING <type> qualifier must appear before the ON <table>.

CREATE INDEX USING btree projects_schoolid ON projects (_schoolid);

In both products USING btree is an optional syntax (for minimum compatibility) purposes so removing this provides a consistency.

Data Types

The following data types are defined in the PostgreSQL example data set. Each is discussed to identify a best fit in MySQL. For reference:

character

This data type is for a fixed width character field and requires a length attribute. MySQL supports CHARACTER(n) syntax for compatibility, however generally CHAR(n) is the preferred syntax. Indeed, PostgreSQL also supports CHAR(n).

The following showing both variants is valid in both products.

CREATE TABLE demo_character(c1 CHARACTER(1), c2 CHAR(1));

varchar/character varying

While this dataset does not use these datatypes, they are critical in the general conservations of character (aka string) types. This refers to a variable length string.

While character varying is not a valid MySQL syntax, varchar is compatible with both products.

CREATE TABLE demo_varchar(vc1 VARCHAR(10));

text

In PostgresSQL, text is used for variables of undefined length. The maximum length of a field is 1GB as stated in the FAQ.

In MySQL however TEXT only stores 2^16 characters (64K). The use of LONGTEXT is needed to support the full length capacity in PostgeSQL. This store 2^32 characters (~4GB).

Of all the complexity of this example dataset, the general use of text will be the most difficult to modify to a more applicable VARCHAR or TEXT datatype when optimizing in MySQL.

integer

PostgreSQL uses the integer datatype for a signed 4 byte integer value. MySQL supports the same syntax, however generally prefers to refer to the shorter INT syntax. Both products support both overall.

mysql> CREATE TABLE demo_integer(i1 INTEGER, i2 INT);
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO demo_integer VALUES (1,-1);
Query OK, 1 row affected (0.05 sec)

mysql> SELECT * FROM demo_integer;
+------+------+
| i1   | i2   |
+------+------+
|    1 |   -1 |
+------+------+
1 row in set (0.00 sec)
demo=# CREATE TABLE demo_integer(i1 INTEGER, i2 INT);
CREATE TABLE
demo=# INSERT INTO demo_integer VALUES (1,-1);
INSERT 0 1
demo=# SELECT * FROM demo_integer;
 i1 | i2
----+----
  1 | -1
(1 row)

And just to note the boundary of this data type.

mysql> TRUNCATE TABLE demo_integer;
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO demo_integer VALUES (2147483647, -2147483648);
Query OK, 1 row affected (0.04 sec)

mysql> SELECT * FROM demo_integer;
+------------+-------------+
| i1         | i2          |
+------------+-------------+
| 2147483647 | -2147483648 |
+------------+-------------+
1 row in set (0.00 sec)
demo=# TRUNCATE TABLE demo_integer;
TRUNCATE TABLE

demo=# INSERT INTO demo_integer VALUES (2147483647, -2147483648);
INSERT 0 1
demo=# SELECT * FROM demo_integer;
     i1     |     i2
------------+-------------
 2147483647 | -2147483648
(1 row)

The difference is in out-of-bounds value management, and here MySQL defaults suck. You can read my views at DP#4 The importance of using sql_mode.

demo=# TRUNCATE TABLE demo_integer;
TRUNCATE TABLE
demo=# INSERT INTO demo_integer VALUES (2147483647 + 1, -2147483648 - 1);
ERROR:  integer out of range
demo=# SELECT * FROM demo_integer;
 i1 | i2
----+----
(0 rows)
mysql> TRUNCATE TABLE demo_integer;
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO demo_integer VALUES (2147483647 + 1, -2147483648 - 1);
Query OK, 1 row affected, 2 warnings (0.07 sec)

mysql> SELECT * from demo_integer;
+------------+-------------+
| i1         | i2          |
+------------+-------------+
| 2147483647 | -2147483648 |
+------------+-------------+
1 row in set (0.00 sec)

While not in this dataset, both support the bigint data type. While the PostgreSQL docs indicate bigint is 8 bytes, testing with PostgresSQL 9.3 failed. Something to investigate more later.

demo=# CREATE TABLE demo_bigint(i1 BIGINT);
CREATE TABLE
demo=# INSERT INTO demo_bigint VALUES (2147483647 + 1), (-2147483648 - 1);
ERROR:  integer out of range
mysql> CREATE TABLE demo_bigint(i1 BIGINT);
Query OK, 0 rows affected (0.12 sec)

mysql> INSERT INTO demo_bigint VALUES (2147483647 + 1), (-2147483648 - 1);
Query OK, 2 rows affected (0.04 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * from demo_bigint;
+-------------+
| i1          |
+-------------+
|  2147483648 |
| -2147483649 |
+-------------+
2 rows in set (0.01 sec)

And for reference, both products support smallint, a 2-byte integer.

Each product has additional integer data types.

numeric

For a fixed-precision number, PostgreSQL uses numeric but supports decimal.It would not be surprising to know that MySQL uses DECIMAL and for compatibility supports NUMERIC.

This leads to a side-bar discussion on knowing your data-types for your product. In a recent interview for a MySQL Engineer, a candidate (with SQL Server experience) provided a code example defining the NUMERIC datatype. I knew it was technically valid in MySQL syntax, but never actually seen this in use. When I asked the candidate for what was the syntax commonly used for a fixed-precision datatype they were unable to answer.

real/double precision

This dataset does not include these data types, however for reference, PostgresSQL uses real for 4 bytes, and double precision for 8 bytes. MySQL uses float for 4 bytes, and double for 8 bytes. MySQL however supports both PostgreSQL syntax options, however PostgreSQL supports float, but not double.

demo=# CREATE TABLE demo_floatingpoint(f1 FLOAT, f2 REAL, d1 DOUBLE, d2 DOUBLE PRECISION);
ERROR:  type "double" does not exist
LINE 1: ...TE TABLE demo_floatingpoint(f1 FLOAT, f2 REAL, d1 DOUBLE, d2...

demo=# CREATE TABLE demo_floatingpoint(f1 FLOAT, f2 REAL, d2 DOUBLE PRECISION);
CREATE TABLE
mysql> CREATE TABLE demo_floatingpoint(f1 FLOAT, f2 REAL, d1 DOUBLE, d2 DOUBLE PRECISION);
Query OK, 0 rows affected (0.07 sec)

date

Both PostgreSQL and MySQL use the date data type.

timestamp

Both PostgreSQL and MySQL use the timestamp data type to store date/time values. However, there is a difference in both precision and implementation here.

In PostgresSQL, timestamp supports a date before EPOCH, while in MySQL it does not. MySQL uses the DATETIME datatype.

Using PostgresSQL timestamp and MySQL DATETIME, both support microsecond precision. MySQL however only started to provide this in MySQL 5.6.

A key difference in column definition is the PostgreSQL timestamp without time zone syntax, used in our example dataset. Analysis of data loading will determine the impact here.

boolean

SQL:1999 calls for a Boolean datatype, and both PostgreSQL and MySQL support defining a column as BOOLEAN. MySQL however implicitly converts this to a SIGNED TINYINT, and any future DDL viewing shows this reference.

When referencing boolean, in PostgreSQL WHERE column_name = TRUE or WHERE column_name = t retrieves a true value. In MySQL WHERE column_name = TRUE or WHERE column_name = 1. When you SELECT a boolean, in PostgresSQL the answer is ‘t’, in MySQL, the answer is 1.

demo=# CREATE TABLE demo_boolean (b1 boolean);
CREATE TABLE
demo=# INSERT INTO demo_boolean VALUES (TRUE),(FALSE);
INSERT 0 2
demo=# SELECT * FROM demo_boolean;
 b1
----
 t
 f
(2 rows)
mysql> CREATE TABLE demo_boolean (b1 boolean);
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO demo_boolean VALUES (TRUE),(FALSE);
Query OK, 2 rows affected (0.03 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM demo_boolean;
+------+
| b1   |
+------+
|    1 |
|    0 |
+------+
2 rows in set (0.00 sec)

Other Data Types

Only the data types in this example have been reviewed.

Other syntax

In our sample SQL script, there is psql specific syntax to show a debugging line with \qecho .... For compatibility these are removed.

The loading of data with the \COPY <table_name> FROM PSTDIN WITH CSV HEADER is PostgreSQL specific and so loading the data is a future topic.

Finally, the VACUUM ANALYZE <table_name> command is also PostgreSQL specific and removed. This is a means effectively of optimizing and analyzing the table.

Both PostgreSQL and MySQL have an ANALYZE command, however the syntax is different, with the required TABLE keyword in MySQL.

PostgresSQL

ANALYZE donorschoose_projects;

ANALYZE TABLE donorschoose_projects;
ERROR:  syntax error at or near "table"

MySQL

ANALYZE donorschoose_projects;
ERROR 1064 (42000): You have an error in your SQL syntax;...

ANALYZE TABLE donorschoose_projects;

MySQL has an OPTIMIZE TABLE syntax, however while technically valid syntax this is not compatible with the default storage table InnoDB.

mysql> OPTIMIZE TABLE donorschoose_projects;
+----------------------------+----------+----------+-------------------------------------------------------------------+
| Table                      | Op       | Msg_type | Msg_text                                                          |
+----------------------------+----------+----------+-------------------------------------------------------------------+
| test.donorschoose_projects | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| test.donorschoose_projects | optimize | status   | OK                                                                |
+----------------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (0.32 sec)

Loops in shell scripting

If you are die hard Bourne Shell (/bin/sh) scripter, it can be a challenge not to be enticed by the syntax niceties of the Born Again Borne Shell (/bin/bash).

One example is the {..} syntax

#!/bin/bash
for I in {0..5}
do
   echo $I
done
0
1
2
3
4
5

This syntax is not valid in /bin/sh on Linux.

#!/bin/sh
for I in {0..5}
do
   echo $I
done
{0..5}

NOTE: However apparently it does work in Mac OS X, which is derived from BSD, not Linux.

/bin/sh gives you a for loop but it requires the full list of iterated values instead of a range.

#!/bin/sh

for I in 0 1 2 3 4 5
do
  echo $I
done

Note: Passing a string does not work by default.

#!/bin/sh

for I in "0 1 2 3 4 5"
do
  echo $I
done

The approach to product the same result requires some format management.

#!/bin/sh

OIFS=$IFS
IFS=" "
for I in `echo "0 1 2 3 4 5"`
do
  echo $I
done
IFS=$OIFS

You can use while

#!/bin/sh

I=0
while [ $I -le 5 ]
do 
  echo $I
  I=`expr $I + 1`
done

You can use one of several other shell commands, in this example awk

#!/bin/sh

for I in `awk 'BEGIN{for (i=0;i<=5;i++) print i}'`
do 
  echo $I
done

Or, the function specifically design for sequences of numbers seq

#!/bin/sh

for I in `seq 0 5`
do 
  echo $I
done

And for these few examples, there will be more possibilities to achieve close to feature parity of the /bin/bash syntax.
An example found on BSD is jot - 0 5. This is not available Ubuntu by default but installed with the athena-jot package. However the syntax is then different for correct usage.

Understanding when EXPLAIN is not using an index as intended

When reading a MySQL Query Execution Plan (QEP) produced by the EXPLAIN command, generally one of the first observations is to validate an index is being used per table (i.e. per row of output). In MySQL, this is observed with the key column.

In the following two simple single table examples we see the use of the PRIMARY key. To the untrained eye this may lead to assume that the right index is being used.

Example 1

+----+-------------+----------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table          | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+----------------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | txxxxxxxxxxxx  | index | NULL          | PRIMARY | 4       | NULL |  100 | Using where |
+----+-------------+----------------+-------+---------------+---------+---------+------+------+-------------+

Example 2

+----+-------------+------------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table      | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+------------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | txxxxxxxxx | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
+----+-------------+------------+-------+---------------+---------+---------+-------+------+-------+

However this is not entirely true. While used, it is not as intended. A MySQL index can be used during three different stages of the query execution, the JOIN/WHERE, the GROUP BY and the ORDER BY. This type of use can be seen by looking at the Index Hint Syntax which enables you to suggest or force an index at these three various stages.

Without looking at the SQL statements for the above plans, the first giveaway is the possible_keys column which indicates the indexes considered when evaluating the JOIN/WHERE of your SQL statement. The absence in the first query informs you that no index was used. The use of the PRIMARY key in the first query is the result of an ORDER BY syntax. If we remove this ORDER BY we see the true possible execution.

+----+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table          | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+----------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | txxxxxxxxxxxx  | ALL  | NULL          | NULL | NULL    | NULL | 262827 | Using where |
+----+-------------+----------------+------+---------------+------+---------+------+--------+-------------+

The first query also includes a LIMIT and hence gives the perception that only a small number of rows are processed. In relational theory you would surmise this query is not efficient. Unfortunately as this example was on version MySQL 5.5, it was not possible to use the optimizer_trace functionality in MySQL 5.6 to delve deeper into understanding what decisions MySQL would take.

This example is not to say that you should add an index. This is one possible outcome in optimizing the SQL statement, however there are advantages and disadvantages. Each SQL statement should be reviewed in conjunction with it’s usage, the overall table structure(s) and all other SQL statements that utilize applicable tables.

What is testing?

In software development this is a simple question. What is [the purpose of] testing? If asked to give a one sentence answer what would you say? I have asked this simple question of attendees at many presentations, and also to software developers I have worked with or consulted to.

The most common answer is. “Testing is about making sure the software works, the function your testing does what it should, for example saves the information you entered”.

Unfortunately this is not the purpose of testing, and this attitude leads to what I generally term as poor quality software. “Testing is about trying to break your product any way possible, all the time.”

With this clarification in understanding of a basic and necessary software engineering principle, the attitude towards software development and the entire focus and mindset of engineering and quality assurance can change for the better.

Another very simple example which I often ask when consulting. What does your website look like when it’s down? Again, the general answer is often vague and/or incomplete. How do you know when your website is down? I have heard the response “The users will let you know”. You may laugh, but it is certainly not funny. Show me your website in a down state? Show me your website in a degraded state? When the answer is either unclear, or with a recent employment the same response, there has simply been little thought into producing a quality product by a testing process that is intent on breaking your software.

What procedures do you follow when receiving alerts about errors? What procedures do you put in place to ensure they do not happen again? Again, one has to be disappointed when the response is, “I will set up an email alert to the team for this type of error?” This reactive response is not addressing the problem, only acknowledging the existence of a problem. What is needed is being proactive. Was a bug raised? Can the problem be easily reproduced? How was the problem fixed the first time? Can this be corrected in the code? Can the interim resolution be automated?

When there is a negative user experience from any type of failure or error another important feedback loop is the post-mortem to review the when, why, how and who of the situation and to create a plan to ensure this does not happen again.

Testing needs to baked in to everything that is done, and practice makes for a more perfect outcome. In a high volume environment it is critical to have a simulated environment where you can benchmark performance of any new release for any regressions. A well defined load testing environment can be used to review experimental branches of possible performance improvements. It is also where you can determine the bottleneck and breaking point as you increase load 2X, 5X, 10X. It is impossible to be proactive when your system can fail at 2X load, and the engineering resources needed to implement a solution will not happen in time.

Disaster is inevitable. It will happen, whether small or large. Hardware and software inherently fails. How it fails and what is done to mitigate this to ensure the best possible consistent and rewarding consumer experience is only possible by consistently practicing to break your software at all stages in the development and deployment lifecycle.

MySQL Admin 101 for System Admins – key_buffer_size

As discussed in my presentation to NYLUG, I wanted to provide system administrations with some really quick analysis and performance fixes if you had limited knowledge of MySQL.

One of the most important things with MySQL is to tune memory properly. This can be complex as there are global buffers, and per session buffers, memory tables, and differences between storage engines. Even this first tip has conditions.

Configuration of MySQL can be found in the my.cnf file (How can I find that). Some variables are dynamic and some are not, and these can change between versions. Check out The most important MySQL Reference Manual page that everybody should bookmark for reference.

Here is a great example for the key_buffer_size found in the [mysqld] section of my.cnf. This is also historically known in legacy config files as key_buffer. This older format has been removed in 5.7. This is a global buffer that is responsible for caching the MyISAM Index data only. Two important things here, this is for the MyISAM storage engine only, and it’s only for indexes. MyISAM data relies on the OS file system cache.

We can confirm the current value in a running MySQL instance with:

mysql> SELECT LOWER(variable_name) as variable, variable_value/1024/1024 as MB 
       FROM   information_schema.global_variables 
       WHERE  variable_name = 'key_buffer_size';
+-----------------+------+
| variable        | MB   |
+-----------------+------+
| key_buffer_size |   16 |
+-----------------+------+
1 row in set (0.00 sec)

The following query will give you the current size of MyISAM indexes stored on disk in your instance.


mysql> SELECT FORMAT(SUM(data_length)/1024/1024,2) as data_mb, 
              FORMAT(SUM(index_length)/1024/1024,2) as index_mb 
       FROM   information_schema.tables 
       WHERE  engine='MyISAM';
+--------------+--------------+
| data_mb      | index_mb     |
+--------------+--------------+
| 504.01       | 114.48       |
+--------------+--------------+
1 row in set (2.36 sec)

NOTE: This is all MyISAM indexes in all schemas. At this time we have not determined what is “hot” data, “cold” data, backup tables etc. It’s a crude calculation, but in absence of more information, seeing that MyISAM is being used, and the buffer is not configured (default is generally 8MB), or is configured poorly as in this example shows that changing this value is an important step to consider. However, The first part of solving the problem is identifying the problem.

Tuning the buffer is hard. You have to take into consideration the amount of system RAM, is the server dedicated for MySQL only, or a shared server for example with a web container such as Apache. Are other storage engines used (for example InnoDB) that requires it’s own buffer size, are there multiple MySQL Instances on the server.

For this example of tuning, we are assuming a dedicated MySQL server and no other storage engines used.

Determining the system RAM and current usage can be found with:

$ free -m
             total       used       free     shared    buffers     cached
Mem:          3955       3846        109          0        424       1891
-/+ buffers/cache:       1529       2426
Swap:         1027          0       1027

With this information, we see a system with 4G of RAM (plenty of available RAM), a key_buffer_size of 16M, and the current maximum size of indexes is 114M. For this most simple case it’s obvious we can increase this buffer, to say 128M and not affect overall system RAM usage, but improve MyISAM performance.

Here are the same numbers for a different system to give you a comparison of what you may uncover.

mysql> SELECT LOWER(variable_name) as variable, variable_value/1024/1024 as MB
    ->        FROM   information_schema.global_variables
    ->        WHERE  variable_name = 'key_buffer_size';
+-----------------+------+
| variable        | MB   |
+-----------------+------+
| key_buffer_size |  354 |
+-----------------+------+
1 row in set (0.00 sec)

mysql> SELECT FORMAT(SUM(data_length)/1024/1024,2) as data_mb,
    ->               FORMAT(SUM(index_length)/1024/1024,2) as index_mb
    ->        FROM   information_schema.tables
    ->        WHERE  engine='MyISAM';
+------------+------------+
| data_mb    | index_mb   |
+------------+------------+
| 150,073.57 | 122,022.97 |
+------------+------------+
1 row in set (3.71 sec)

As I follow up in my next post on the innodb_buffer_pool_size, I will further clarify the complexity of MySQL memory tuning, and show that this information gathering is only a guide, and first step to a more complex analysis and tuning operation.

Improving performance – A full stack problem

Improving the performance of a web system involves knowledge of how the entire technology stack operates and interacts. There are many simple and common tips that can provide immediate improvements for a website. Some examples include:

  • Using a CDN for assets
  • Compressing content
  • Making fewer requests (web, cache, database)
  • Asynchronous management
  • Optimizing your SQL statements
  • Have more memory
  • Using SSD’s for database servers
  • Updating your software versions
  • Adding more servers
  • Configuring your software correctly
  • … And the general checklist goes on

Understanding where to invest your energy first, knowing what the return on investment can be, and most importantly the measurement and verification of every change made is the difference between blind trial and error and a solid plan and process. Here is a great example for the varied range of outcome to the point about “Updating your software versions”.

On one project the MySQL database was reaching saturation, both the maximum number of database connections and maximum number of concurrent InnoDB transactions. The first is a configurable limit, the second was a hard limit of the very old version of the software. Changing the first configurable limit can have dire consequences, there is a tipping point, however that is a different discussion. A simple software upgrade of MySQL which had many possible improvement benefits, combined with corrected configuration specific for this new version made an immediate improvement. The result moved a production system from crashing consistently under load, to at least barely surviving under load. This is an important first step in improving the customer experience.

In the PHP application stack for the same project the upgrading of several commonly used frameworks including Slim and Twig by the engineering department seemed like a good idea. However applicable load testing and profiling (after it was deployed, yet another discussion point) found the impact was a 30-40% increase in response time for the application layer. This made the system worse, and cancelled out prior work to improve the system.

How to tune a system to support 100x load increase with no impact in performance takes knowledge, experience, planning, testing and verification.

The following summarized graphs; using New Relic monitoring as a means of representative comparison; shows three snapshots of the average response time during various stages of full stack tuning and optimization. This is a very simplified graphical view that is supported by more detailed instrumentation using different products, specifically with much finer granularity of hundreds of metrics.

These graphs represent the work undertaken for a system under peak load showing an average 2,000ms response time, to the same workload under 50ms average response time. That is a 40x improvement!

If your organization can benefit from these types of improvements feel free to Contact Me.

There are numerous steps to achieving this. A few highlights to show the scope of work you need to consider includes:

  • Knowing server CPU saturation verses single core CPU saturation.
  • Network latency detection and mitigation.
  • What are the virtualization mode options of virtual cloud instances?
  • Knowing the network stack benefits of different host operating systems.
  • Simulating production load is much harder than it sounds.
  • Profiling, Profiling, Profiling.
  • Instrumentation can be misleading. Knowing how different monitoring works with sampling and averaging.
  • Tuning the stack is an iterative process.
  • The simple greatest knowledge is to know your code, your libraries, your dependencies and how to optimize each specific area of your technology stack.
  • Not everything works, some expected wins provided no overall or observed benefits.
  • There is always more that can be done. Knowing when to pause and prioritize process optimizations over system optimizations.

These graphs show the improvement work in the application tier (1500ms to 35ms to 25ms) and the database tier (500ms to 125ms to 10ms) at various stages. These graphs do not show for example improvements made in DNS resolution, different CDNs, managing static content, different types and ways of compression, remove unwanted software components and configuration, standardized and consistent stack deployments using chef, and even a reduction in overall servers. All of these successes contributed to a better and more consistent user experience.

40x performance improvements in LAMP stack

Correctly setting your mysql prompt using sudo

If you run multiple MySQL environments on multiple servers it’s a good habit to set your MySQL prompt to double check which server you are on.
however, using the MYSQL_PS1 environment variable I found this does not work under sudo (the normal way people run sudo).

I.e., the following syntax’s work.

$ mysql
$ sudo su - -c mysql
$ sudo su - ; mysql

but the following does not.

$ sudo mysql

The trick is actually to ensure via /etc/sudoers you inherit the MySQL_PS1 environment variable.

echo "export MYSQL_PS1="`hostname` [d]> "" | sudo tee /etc/profile.d/mysql.sh
echo 'Defaults    env_keep += "MYSQL_PS1"' | sudo tee /tmp/mysql
sudo chmod 400 /tmp/mysql
sudo mv /tmp/mysql /etc/sudoers.d

What is FTS_BEING_DELETED.ibd

I currently have on a MySQL 5.6 database using innodb_file_per_table the following individual tablespace file.

schema/FTS_00000000000001bb_BEING_DELETED.ibd

The schema is all InnoDB tables, and there ARE NO Full Text Indexes. I cannot comment on if a developer has tried to create one previously.
I am none the wiser in explaining the ongoing use of these files, or if it can be/should be deleted.

On closer inspection there are infact a number of FTS files.

$ ls -al FTS*
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001bb_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001bb_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001bb_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:00 FTS_00000000000001bb_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001c7_BEING_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:26 FTS_00000000000001c7_BEING_DELETED.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:21 FTS_00000000000001c7_CONFIG.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001c7_DELETED_CACHE.ibd
-rw-r----- 1 mysql mysql 98304 Jan 29 16:20 FTS_00000000000001c7_DELETED.ibd

Any MySQL gurus with knowledge to share, and for the benefit of others that Internet search at a later time.

Related articles included Overview and Getting Started with InnoDB FTS and Difference between InnoDB FTS and MyISAM FTS but do not mention file specifics.

The article InnoDB Full-text Search in MySQL 5.6 (part 1) provides more insight that these files remain even if a full text index was created and has since being removed. It is not clear from the filename which tables these files relate to.

Good Test Data

Over the years you collect datasets you have created for various types of testing, seeding databases etc. I have always thought one needs to better manage this for future re-use. Recently I wanted to do some “Big Data” playing and again that question of what datasets can I use let me to review the past collated list at Seeking public data for benchmarks.

The types of things I was wanting to do lead me to realize a lot of content is “public domain” and Project Gutenberg is just one great source of text in multiple languages. This was just one aspect of my wish list but text based data is used from blogs, comments, articles, microblogs etc, and multiple languages was important from some text analysis.

With a bit of thinking about the building blocks, I created Good Test Data. A way for me to have core data, IP’s, people’s names, User Agents strings, text for articles, comments and a lot more. And importantly the ability to generate large randomized amounts of this data quickly and easily.

Now I can build a list of 1 million random names with unique usernames and emails with ease. I can generate millions of varying articles, from a short microblog, a comment, a blog to a multi page article. Then be able to produce HTML/PDF/PNG versions giving me file attachments. I’ve been playing more with image generation, creating banner images with varying text, and now I’m generating MP4 video to simulate the various standard sizes for advertising and just to see what people need.

I’m not sure of the potential use and benefit for others and that wasn’t the primary goal, however I would like to know how these building blocks could be used. The data is relatively agnostic, being able to easily load into MySQL tables. Depending on demand, being able to create pre-configured open source product data for e-commence products, CRM or blogging are all possible options.