TDD for Infrastructure

Test Driven Development (TDD) is an important principle for producing quality software. This is not a new concept. The Extreme Programming (XP) agile methodology (1999) outlined the concept before the acronym became more widely accepted as “Another requirement is testability. You must be able to create automated unit and functional tests… You may need to change your system design to be easier to test. Just remember, where there is a will there is a way to test.” Another clear way to describe the hurdles TDD has encountered as a common sense approach is “This is opposed to software development that allows code to be added that is not proven to meet requirements.”

Infrastructure setup is still software. All setup should have adequate testing to ensure at anytime (not just during installation or configuration) any system is in a known state. While Configuration Management (CM) works with the goal of convergence, i.e. ensuring a system is in a known state, testing should be able to validate and identify any non-conformance and not to attempt to correct.

The Bash Automated Test System (BATS) is a known framework for shell scripting. It is very easy to use.

Good habits come from always doing them. Even for a quick test of a running MySQL server via vagrant for a blog post, the automated installation during setup includes validating a simple infrastructure setup via a bats test.

$ tail

sudo mysql -NBe "SHOW GRANTS"
systemctl status mysqld.service
ps -ef | grep mysqld
pidof mysqld
bats /vagrant/mysql8.bats

Rather than having some output that requires a human to read and interpret each line and make a determination, automated it. A good result is:

$ vagrant up
    mysql8: ok 1 bats present
    mysql8: ok 2 rpm present
    mysql8: ok 3 openssl present
    mysql8: ok 4 mysql rpm install
    mysql8: ok 5 mysql server command present
    mysql8: ok 6 mysql client command present
    mysql8: ok 7 mysqld running
    mysql8: ok 8 automated mysql access 

A unsuccessful installation is:

$ vagrant provision
    mysql8: not ok 8 automated mysql access
    mysql8: # (in test file /vagrant/mysql8.bats, line 60)
    mysql8: #   `[ "${status}" -eq 0 ]' failed
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

$ echo $?

This amount of very simple testing and re-execution of testing either via ssh or a re-provision highlighted a bug in the installation script. Anybody that wishes to identify please reach out directly!

# Because openssl does not always give you a special character
NEWPASSWD="$(openssl rand -base64 24)+"
mysql -uroot -p${PASSWD} -e "ALTER USER USER() IDENTIFIED BY '${NEWPASSWD}'" --connect-expired-password
# TODO: create mylogin.cnf which is more obfuscated
echo "[mysql]
password='$NEWPASSWD'" | sudo tee -a /root/.my.cnf
sudo mysql -NBe "SHOW GRANTS"
systemctl status mysqld.service
ps -ef | grep mysqld
pidof mysqld
bats /vagrant/mysql8.bats

A simple trick with a BATS test is to echo any output that will help debug a failing test. If the test succeeds no output is given, if it fails you get the information for free. For example, lets say your test is:

# Note: additional security to both access the server via ssh
#       and accessing sudo should be in place for production systems
@test "automated mysql access" {
  local EXPECTED="${USER}@localhost"
  run sudo mysql -NBe "SELECT USER()"
  [ "${status}" -eq 0 ]
  [ "${output}" = "${EXPECTED}" ]

Execution will only provide:

 ✗ automated mysql access
   (in test file /vagrant/mysql8.bats, line 62)
     `[ "${output}" = "${EXPECTED}" ]' failed

What you want to see to more easily identify the problem is:

 ✗ automated mysql access
   (in test file /vagrant/mysql8.bats, line 62)
     `[ "${output}" = "${EXPECTED}" ]' failed
   root@localhost != vagrant@localhost

This echo enables a better and quicker ability to correct the failing test.

  [ "${status}" -eq 0 ]
  echo "${output} != ${EXPECTED}"
  [ "${output}" = "${EXPECTED}" ]

Testing is only as good as the boundary conditions put in place. Here is an example where your code used a number of environment variables and your testing process performed checks that these variables existed.

@test "EXAMPLE_VAR is defined ${EXAMPLE_VAR}" {
  [ -n "${EXAMPLE_VAR}" ]

The code was subsequently refactored and the environment variable was removed. Do you remove the test that checks for its existence? No. You should not ensure the variable is not set, so that any code now or in the future acts as desired.

@test "EXAMPLE_VAR is NOT defined" {
  [ -z "${EXAMPLE_VAR}" ]


Writing re-runable shell script

I recently started playing with devstack again (An all-in-on OpenStack developer setup). Last time was over 3 years ago because I remember a pull request for a missing dependency at the time.

The installation docs provide information to bootstrap your system with a necessary user and privileges, however like many docs for software setup they contain one off instructions.

adduser stack
echo "stack ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

When you write operations code you need to always be thinking about “testability” and “automation”. It is important to write re-runable code. You should always write parameterized code when possible, which can be refactored into usable functions at any time.

This is a good example to demonstrate a simple test condition for making the initial instructions re-runable.

sudo su -
# This creates default group of same username
# This creates user with default HOME in /home/stack
[ `grep ${NEW_USER} /etc/passwd | wc -l` -eq 0 ] && useradd -s /bin/bash -m ${NEW_USER}
[ ! -s ${NEW_USER_SUDO_FILE} ] && umask 226 && echo "${NEW_USER} ALL=(ALL) NOPASSWD: ALL" > ${NEW_USER_SUDO_FILE}