Auditing an existing MySQL Installation

Yesterday I ran into an old collegue that now runs quite a successful computer store chain and highly successful web store here in Australia. Long story short he was having some MySQL problems, so I offered to pass my eye over it. Now, given that they had some data corruption in stock levels not being correct (e.g. getting to an inventory count of -1), my first split second thought was inappropiate transaction management.

In thinking last night, what would I do as part of auditing an existing MySQL Installation and application quickly for reference and also to highlight issues etc, assuming I would have no access to any administrators or developers.

So what would you do? I made some preliminary notes, here’s the full account of what I did do.

Audit Steps

OS Specifics

$ mkdir /tmp/arabx
$ cd /tmp/arabx
# keep a quick copy of stuff
$ script
# Linux generals
$ hostname
$ uname -a
$ uptime
$ df
$ which mysql
$ mysql --version

MySQL Specifics

$ mysql -uroot -p mysql
mysql>  show variables;
mysql>  show status;
mysql>  show databases;
mysql>  show processlist;
mysql>  show full processlist;
mysql>  select host,user,passwd from user;
mysql> exit;
$ cat /etc/my.cnf
# error not found?
$ find / -name my.cnf
# Results
$ cat /etc/mysql/my.cnf
$ cp /etc/mysql/my.cnf .
$ cd /usr/local/mysql
$ du
$ cd data
$ ls -al
$ cp *.err /tmp/arabx  # for later review
# What, no database, surely they were not recording in the mysql database
# quick confirm there was another database from earlier list?
$ grep datadir /etc/mysql/my.cnf
# I see off somewhere else
$ cd /var/lib/mysql
$ ls -al
$ du
$ cp *.err /tmp/arabx  # for later review
$ cd /tmp/arabx
$ mysqldump --no-data  -u[user] -p [db] > schema.sql
$ grep TYPE schema.sql
# Observation:  All MyISAM, the first proof of my initial theory
$ mysql -u[user] -p [db]
mysql> show tables;
mysql> show tables status;
mysql>exit;
$ cd /tmp
$ tar cvfz arabx.tar.gz arabx
$ scp arabx.tar.gz ...

Application Specifics
I’m not going to detail the steps here as this is really very implementation dependant. What I did in summary was:

  • Identified website location on filesystem, general du, ls -l commands
  • View working screens showing Stock Count logs, Forfill Order and Product Return
  • Review code of these areas, as well as the data, confirming what was seen on screen via SQL

Recommendations

Immediate

  • Cleanup current directory of html files to remove old files, or old backup copies. This ensures no PC has some old page with bad code bookmarked
  • Cleanup /usr/local/mysql/data as it’s now defined in /var/local/mysql. It threw me with to data directories and error logs. It was only that the database directory was not in the first location, otherwise I may have missed it initially.
  • Stock Adjustment page needs to log an Adjustment history for Audit Trail (Only Sales and Returns were in Audit Trail, so it left a possible gap?)
  • Implementation of Transactions. Given the volume of transactions, and that it would appear that LOCK TABLES had been implemented but removed due to performance, obvious choice is implement InnoDB tables, and transaction management in code. (Quite some work)
  • Change of some columns from DATE to DATETIME to record the time of occurance. (Code was using NOW(), so that part’s already done). Or, implement some TIMESTAMP columns (as none were in use) and leverage MySQL standard functionality.

Medium Term

  • Running MySQL 4.0, which is fine, but the data corruption and lack of clean website code, leads to an easier solution for auditing via triggers.
  • Upgrade to MySQL 5.0 Key reason, triggers. This enables more audit analysis with even less need for or initial understanding of the application
  • Implement a CVS respository. Even for single development, it’s a habit that far outways the impact of a few more commands.
  • Review Backup strategies for HTML code and respository, which was only being mirrored, but not backed up for a disaster recovery situation

I guess if I ever do this again, there is merit in cleaning this all up and providing some level of automation.
Does anybody have any suggestions of obvious other things to consider.

Tagged with: Databases General MySQL Open Source Web Development

Producing IQR and Outlier statistics with SQL

The interquartile range (IQR) measures the spread of the middle 50% of a distribution — the distance between the first quartile (Q1) and the third quartile (Q3). Combined with Tukey’s 1.

Producing Mode statistics with SQL

The mode is the value or values that appear most frequently in a dataset. Unlike the mean or median, it applies naturally to categorical and ordinal data — star ratings, product codes, survey responses — and reveals what is most common, not what is average.

Extending MySQL Capabilies with UDFs, Plugins and Components - Part 2

MySQL offers three different approaches to extending the SQL capabilities with the default product you download and install. These are: User Defined Function (UDF) MySQL Manual MySQL Plugin MySQL Manual MySQL Component MySQL Manual In my prior post I provided a new uuidv function that accepted a numeric argument to return a string of the version of UUID specified.