It's makes me cry

I got home today and sat down to read my home email list. Nothing new. But on a MySQL mailing list, there was an enquiry why performance was slowing in a given application. I didn’t even have to read the situation, nor the problem, it took less then the 200ms mentioned to identify the problem looking at the supplied schema.

In summary, the first table in the schema had a primary key of VARBINARY(255) and a engine type of Innodb. Hold on, wait, it’s a concatenated key of two VARBINARY(255) columns. And I should mention, that primary key was a foreign key in the next table. If this was a home website app with one user, ok well it’s still bad, but this application was having performance problems with reasonable volumes of transactions, it’s not a beginner application. (A recent reference If you don’t know your data, you don’t know your application). Where do people learn this from!

Now I want to work on a large scale out MySQL environment, an organisation or site that’s going places, but the world of even normal every day applications is littered with these basic fundamental design errors. It’s make me cry.

My last contract had many flaws across the business, and the Java code for production applications was just as bad. Check out things like What constitutes a good error message to the user?, Why IT professionals get a bad name (I could easily write one of these a day for a long time, or you could check out The Daily WFT). You think the new contract with an large existing multinational company and world-wide installed Java application I could get an improvement. I’ll leave these stories for another time. I don’t want to have to deal with this.

I’ve got 16 half written blogs on cool MySQL features or interesting topics, but I never seem to get the motivation at the end of the day. I just want to know, how do people in IT get away with this. Where is the Pride in people’s work, where is the desire for people to make what they do a better place. Where is:

An opportunity that’s a challenge, including involving hard work and tight time frames, but a job that provides the rewards of job satisfaction for a productive contribution. An importance on quality, emphasis on continued improvement, and goals of simplicity in complex situations while working in a team environment are also necessary.

SHOW STATUS Gotcha

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

In summary (without the surrounding fan-fare, I was seeking):

SELECT VERSION();
SHOW STATUS;
SHOW VARIABLES;  // Optional

I was however perplexed why my first data point analysis (Read/Write ratio) using the Status values Com_insert, Com_update, Com_delete and Com_select was not always giving me expected results. In particular, a number of server results showed 0 for values while I knew the results came from working MySQL environments.

So, sanity check with good friend Morgan and I get the response to answer the dilemma SHOW STATUS defaults to session based statistics in 5.0+, there’s a million people it’s caught out, and many bugs about “xyz variable not being incremented”.

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

So I’m really seeking:

SELECT VERSION();
SHOW GLOBAL STATUS;
SHOW GLOBAL VARIABLES;  // Optional

Well, that’s my MySQL trivia point for the day. MySQL between version 4.1 and 5.0 made a major change with the default value. The MySQL 5.0 Manual also states this.

UPDATE 23 Sep 2006

Following Eric’s Show @&!# status again!, I did then read the Bug #19093, and it was great to see Monty comment (that’s open source for you, enabling even founders to mix with the common folk), and I’d tend to agree to his comments. MySQL has attempted to correct something.

Perhaps the best compromise to the problem is to deprecate and drop the SHOW STATUS command, and make the scope component GLOBAL|SESSION compulsory, so SHOW GLOBAL STATUS or SHOW SESSION STATUS are the only valid commands!

Microfox ?

I’ve added Digg to my general lunch time reading web sites. I came across this yesterday. Microsoft invites Firefox development team to Redmond.

Well, isn’t that nice, the big boy opening his pond (including all the sharks) to the little fish. (if you don’t get it, it’s a line from Die Hard 2 about pissing in somebody else’s pond) What’s funny about the article is not the article but some of the comments, here are but a few.

  • Cool. Very cool.
  • they’re going to brainwash them into working for microsoft and their going to cast off the IE team to replace the firefox team
  • Just replace IE 7 with FF 2.0. Save every web developer 100s of hours.
  • “Come into my parlour,” said the spider to the fly
  • DON’T DO IT! My god, has noone seen Braveheart? The Untouchables? The Godfather? Scarface?
  • hey’re going to kidnap the Mozilla developers with a cunning trail of pizza and cola, and force them to work on IE8.
  • ITS A TRAP! http://itsatrap.ytmnd.com/

I like the last one, maybe just because of the cool picture on referenced site.

I see today an official response Firefox welcomes Microsoft’s offer of help

The fast pace of technology in a Web 2.0 world

I had need to goto the Wikipedia this morning to review the terminology of something, and on the front page in Today’s featured article is Mercury. Being a tad curious given I’d heard only on the radio a few hours ago that Pluto was no longer a planet in our Solar System, I drilled down to the bottom to check references to other planets (quicker then searching). So at the bottom I found the following graphic and details of The Solar System Summary.

Well blow me down, they didn’t waste any time there. Pluto is no longer a planet in our Solar System. It is now categorised as a Dwarf Planet.

There are over 100 edits to this page on Dwarf Planet in the past 12 hours, including all the links in Pluto and Solar System correctly referenced.

Now, that’s the power of content by the community, one of the characteristics of Web 2.0.
Of course all of this runs under the LAMP Stack and powered by the MySQL Database. Combined with the fact this is a breed of organisation that didn’t start with large amount of Venture Capital, another trend of the newer generation of popular and successful internet enterprises.

No Spam Today

Huh, tricked you. As if!

However I was looking at my Akismet Spam section in WordPress, the open source software that runs my blog, and it gave me this message.

Caught Spam

You have no spam currently in the queue. Must be your lucky day. :)

The RAT and the CAT

No, it’s not a bedtime story, is a serious system’s design concept and I’m amazing that people don’t know about this.
As I mentioned in If you don’t know your data, you don’t know your application I was doing a Java Code Review, and I found a clear case of a much simplier solution. How simple you ask?

Well, without completing the task 100%, I achieved in less then 1 day (and lets say for the argument 1 more day of work), what is being worked on by somebody else for a week, with an estimate of 2 more weeks to complete. So let’s add 50% to my estimate, that’s a total of 3 days verses 15 days. You do the math. and yes that was last week and that task is still being worked on the same way, even with reference to my working code. Not to mention the code is a similiar magnitude of simplicity, and simplicity means cost savings in support, people so quickly forget that.

So what is this RAT and CAT:

  • RAT – Row At a Time
  • CAT – Chunk At a Time

This concept is really rather simple, however it’s amazing how implementations never consider this. Let me give you a simple example.

Your importing some external data (a flat file of 5 columns, and is inturn will be stored in 2 tables in a normalised form). So the RAT way of doing things would entail:

  • Using the language in question, open the file, read line by line, converting into object for reference.
  • Now for each row read you
    • You get the values (A and B), and then do a select to see if this row already exists in one of the normalised tables (calling in X). If it does, you have the surrogate key, else you insert the row and get the surrogate key
    • Now you do the same with the next set of values (C, D, E) which reference the normalised values (A and B), inserting into Y)
    • And so on and so on.
    • Report exceptions line by line if found

For those now laughing, it’s not really funny, it’s sad that programs are written this way. For those that say, but that’s the only way, well welcome to a different, radical and hard to learn approach.

The Cat way of doing things would entail:

  • Create a temporary table
  • Bulk load the data into the temporary table via appropiate command, e.g. mysqlimport or LOAD DATA.
  • Using one select, insert into X rows from temporary table that are not already present
  • Using one select, insert into Y rows from the temporart table that are not present, joining to X to get the appropiate surrogate key
  • Report exceptions via one select of data that wasn’t inserted due to some rules

It’s not rocket science people.

I should mention this is a simple example, and it’s not always possible to do CAT processing for all operations, but generally some portion of batch work can be, and the remaining must be done in a RAT way.

Updated
Some more recent articles including We need more CATs (2009) and The Art of Elimination (2010)

If you don't know your data, you don't know your application.

The art of data modelling is definitely lost on some [most] people, or they never found it, even though they think they did. Over dinner with good friend Morgan last night we were swapping present stories on the topic.

Morgan wrote recently about I want my 4 bytes back damn it., and interesting example storing an ISBN. Further reference can be found at Getting started with MySQL of a more impractical ISBN example.

Disk is cheap now, so the attitude and poor excuse can be, well a few extra bytes doesn’t matter. Well no! If your a social hacker and have a website with a maximium concurrent connections of 2 maybe, but much like some recent Java Code Reviewing I just performed, just because the system isn’t 24×7, doesn’t give you excuse to be lazy about writing the code not to handle concurrency, thread safety and also as efficient as possible, in this case RAT verses CAT. (I’ll need to write about this, it seemed to go over some of the other professionals even)

I can remember a very specific example some 10 years ago in doing some performance analysis on a site. I’d identified the need for an additional index on a table. Now this table was sized for 200 million rows, and it already had about 70 million. The problem was adding another index required 4GB disk allocation. These things have an effect on sizing, growth and backups.

So the impact on appropiate sizing can clearly have an effect, if it was just one poorly sized column that’s acceptable (just), but normally it’s a pattern that litters a data model.

What’s important to realise is, it’s not just diskspace, it’s also memory. Without really touching on sizing data, I did mention some examples previously in Improving Open Source Databases – WordPress. Here the use of BIGINT(20) for primary keys proved my point. That’s 8 bytes, but unless you have going to have 4 billion categories, it’s a waste. It’s a waste when it’s a foreign key in a table, and it’s a big waste when it’s indexed, and that index is then in memory, and wasting more precious resources.

So how to do identify when the designer of the model has no idea about the intrinsic data value being stored? If you see tables with VARCHAR(255), that’s a clear sign. They have no idea regarding the data, so a default limit is used. Morgan referred to it as “Shooting guns in the dark with your eyes closed”. Books don’t help the cause, I was just skimming High Performance MySQL last night (one of the many freebies from the UC). There on page 82, is a table definition with not one column, but two with varchar(255). Hmmm!

If you see any new applications with VARHAR(255) they are even more lost, because MySQL 5, which has been around quite some time now, supports VARCHAR(65535). Speaking of that, has anybody seen VARCHAR(65535). I’d like to know.

Another example, is in Sheeri’s Top 8 SQL Best Practices Point 4 in regards to storing IP’s effeciently. If you log for example every page hit, this can be your largest table, and moving from varchar(15) to int can save you upto 11 bytes per row alone.

These may just be simple examples, but it’s the principle. When you define a column, consider it’s data, if you don’t know then take the time to do the reasearch and learn.

Become named in Firefox 2

So, FireFox have come up with a novel idea to promote it’s product. Check out Firefox Day.

The official blurb: Share Firefox with a friend. If your friend downloads Firefox before September 15, you’ll both be immortalized in Firefox 2.

You can even choose how to link your names together on the “Firefox Friends Wall”. Examples like ‘my name’ Informed ‘your name’, or ‘my name’ Empowered ‘your name’, or ‘my name’ Liberated ‘your name’.

Perhaps MySQL can leverage this idea for some what to promote future download!



Pride

I’ve recently completed a contract and I’ve been in discussions with agents and other employers for further work. Having had one of the worse experiences in my previous work, I’ve been extra careful to ensure what I’m told at the interview/meeting stage is indeed true and accurate (in my last case it was not). I’ve also not made the assumption that an organisation that is dependent on software has placed a certain level of value on what’s in place. (in my last case I did, simply due to the size of the organisation and volume of business).

So, when being asked by people what I’m seeking, outside of the technical skills and compendencies, I’m seeking an organisation that places value on it’s existing software, it’s software quality, it’s software improvement and most importantly it’s software developers. It was unfortunate that for an organisation that lived in software, and would not survive long (especially at the present scale of operations) without it, they met none of these criteria. It was really sad, and overall I found the environment “depressing”.

Now “depressing” could be considered a harsh word, but it comes from both my ethos and also present circumstances in life. I had these two post-it notes on my wall at work, which summed up “What I was seeking/Where I wanted to be”, and “Where I was” in my job.

There have been a few articles recently, it was probably Marcus in Supporting ideas and being productive, based on earlier articles of How to Come Up With Ideas and How To Kill Good Ideas by Zack that motivated me to publish this. These articles have many good points, and while I could put my slant on it, I wanted to focus on just one thing.

In Software Development 10 years ago I was driven, now I know and I’m driven by passion. I always strive for something better then their presently is, seeking better quality, and better productivity, and most importantly today, simplicity.

Everybody is at a different position in life and it’s important to find the right fit. For me money is not a motivator, being productive, making what I’m doing better, where I’m working more enjoyable and productive, and who I’m working with a better place to be are important to me.

I’ve choose the word for the moment to best represent this as PRIDE.

Pride is:

  • A sense of one’s own proper dignity or value; self-respect
  • Pleasure or satisfaction taken in an achievement, possession, or association
  • A cause or source of pleasure or satisfaction; the best of a group or class
  • The most successful or thriving condition
  • To indulge (oneself) in a feeling of pleasure or satisfaction

For me, I’m seeking an organisation that takes pride in it’s people, it’s products and it’s future direction, and that attracts people that have pride in what they do, they way they think and how they interact with others.

Compiling MySQL Tutorial 2 – Directly from the source


Should you want to be on the bleeding edge, or in my case, don’t want to download 70MB each day in a daily snapshot (especially when I’m getting build errors), you can use Bit Keeper Free Bit Keeper Client that at least lets you download the MySQL Repository. This client doesn’t allow commits, which is a good thing for those non-gurus in mysql internals (which definitely includes me).

wget http://www.bitmover.com/bk-client.shar
/bin/sh bk-client.shar
cd bk_client-1.1
make

By placing sfioball in your path you can execute.

sfioball bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

This took me about 4 mins, which seemed much quicker then getting a snapshot!

You can then get cracking with my instructions at Compiling MySQL Tutorial 1 – The Baseline.

A good reference in all this compiling is to take a good look at the MySQL Internals Manual. (which I only found out about recently)

If at a later time you want to update your repository to the latest, use the following command.

update bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

Documentation References

5.1 Reference Manual – 2.9. MySQL Installation Using a Source Distribution
5.1 Reference Manual – 2.9.3. Installing from the Development Source Tree
5.1 Reference Manual – 2.13.1.3. Linux Source Distribution Notes
MySQL Internals Manual

Requirements for compiling

To confirm earlier notes on minimum requirements for compiling the following details should be confirmed.

automake --version
autoconf --version
libtool --version
m4 --version
gcc --version
gmake --version
bison -version


My results running Fedora Core 5.

$ automake --version
automake (GNU automake) 1.9.6
$ autoconf --version
autoconf (GNU Autoconf) 2.59
$ libtool --version
ltmain.sh (GNU libtool) 1.5.22 (1.1220.2.365 2005/12/18 22:14:06)
$ m4 --version
GNU M4 1.4.4
$ gcc --version
gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
$ gmake --version
GNU Make 3.80
$ bison -version
bison (GNU Bison) 2.1

Compiling MySQL Tutorial 1 – The Baseline – Update

Just to confirm my earlier confusion about verified snapshots at Compiling MySQL Tutorial 1 – The Baseline.

“Daily snapshot sources are only published, if they compiled successfully (using the BUILD/compile-pentium-debug-max script) and passed the test suite (using make test). If the source tree snapshot fails to compile or does not pass the test suite, the source tarball will not be published.”

Seems the fine print at MySQL Database Server 5.1: Beta snapshots also states this. Well, need to take my RTFM pill there.

Thanks to Lenz for putting the record straight, and helping with my Forum Post. Seems I did uncover a Bug, now recorded as Bug #21463. Just need to get it fixed to continue on my tutorial.

Compiling MySQL Tutorial 1 – The Baseline

Pre-requisites

This tutorial is aimed at Linux installations that has the standard development tools already installed. The INSTALL file in the source archives provides good details of the software required (e.g. gunzip, gcc 2.95.2+,make).

Create a separate user for this development work.

su -
useradd mysqldev
su - mysqldev

Get an appropiate Snapshot

You can view recent snapshots at http://downloads.mysql.com/snapshots/mysql-5.1/.

The official statement on snapshots from MySQL AB.
Daily snapshot sources are only published, if they compiled successfully (using the BUILD/compile-pentium-debug-max script) and passed the test suite (using make test). If the source tree snapshot fails to compile or does not pass the test suite, the source tarball will not be published.

At the time of producing this the present snapshot was http://downloads.mysql.com/snapshots/mysql-5.1/mysql-5.1.12-beta-nightly-20060801.tar.gz

mkdir -p $HOME/src
cd $HOME/src
SNAPSHOT="mysql-5.1.12-beta-nightly-20060801";export SNAPSHOT
wget http://downloads.mysql.com/snapshots/mysql-5.1/${SNAPSHOT}.tar.gz
tar xfz ${SNAPSHOT}.tar.gz
cd $SNAPSHOT

Compiling

You can for example do the simple way with most C programs:

DEPLOY=${HOME}/deploy; export DEPLOY
./configure --prefix=${DEPLOY}
make
make install

However it’s recommended you use the significant number of pre-configured build scripts found in the BUILD directory. For Example:

./BUILD/compile-pentium-debug --prefix=${DEPLOY}

I always get the following warnings. Not sure what to do about it, but it doesn’t break any future work to date. Surely somebody in the development team knows about them.

../build_win32/include.tcl: no such file or directory
cp: cannot create regular file `../test/logtrack.list': No such file or directory

Testing

The easy, but time consuming part over, let’s test this new beast.

make install

This will deploy to the preconfigured area of $USER/deploy. For multiple versions within this process you should adjust this back in the compile process.

cd $DEPLOY
bin/mysql_install_db
bin/mysqld_safe &
bin/mysql -e "SELECT VERSION()"
bin/mysqladmin -uroot shutdown

Changes

So if you haven’t already sniffed around there are some reasonable changes in the structure. Here are a few points that caught me out:

  • mysql_install_db is now under /bin not /scripts, infact there is no /scripts
  • mysqld is not under /bin but infact under /libexec. A good reason to always used mysqld_safe
  • by default the data directory is /var, normal documentation etc always has this as /data

This is a quick confirm it all works, and I’ve for the purposes of this example ensured that no other MySQL installation is running, and there is no default /etc/my.cnf file. (I always place this in the MySQL installation directory anyway).

Some minonr considerations for improvements with locations, users and multiple instances.

cd $DEPLOY
rm -rf var
bin/mysql_install_db --datadir=${DEPLOY}/data --user=$USER
bin/mysqld_safe --basedir=${DEPLOY} --datadir=${DEPLOY}/data --user=$USER --port=3307 --socket=/tmp/mysqldev.sock &
bin/mysql -P3307 -S/tmp/mysqldev.sock -e "SELECT VERSION()"
bin/mysqladmin -P3307 -S/tmp/mysqldev.sock  -uroot shutdown

Troubleshooting

Now, there is no guarantee that the snapshot is a working one, in the case of mysql-5.1.12-beta-nightly-20060801 in this example, it didn’t work for me? I’ve just reinstalled my OS to Fedora Core 5, and the previous source version (a 5.1.10 snapshot worked ok)
Here are some tips to help out.

The preconfigured BUILD scripts have a nice display option with -n that allows you to see what will happen.

./BUILD/compile-pentium-debug --prefix=${DEPLOY} -n

In my case it gave me this:

testing pentium4 ... ok
gmake -k distclean || true
/bin/rm -rf */.deps/*.P config.cache storage/innobase/config.cache
   storage/bdb/build_unix/config.cache bdb/dist/autom4te.cache autom4te.cache innobase/autom4te.cache;

path=./BUILD
. "./BUILD/autorun.sh"
CC="gcc" CFLAGS="-Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wunused
-DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS  -DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX"
CXX="gcc" CXXFLAGS="-Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wctor-dtor-privacy
-Wnon-virtual-dtor -felide-constructors -fno-exceptions -fno-rtti  -DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS
-DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX" CXXLDFLAGS=""
./configure --prefix=/home/mysqldev/deploy --enable-assembler  --with-extra-charsets=complex  --enable-thread-safe-client
--with-readline  --with-big-tables  --with-debug=full --enable-local-infile
gmake -j 4

In my case this helped with my problem, that is the autorun.sh command is throwing an error in the Innodb configuration.
Part of the compiling is you can turn Innodb off with –without-innodb, but the BUILD scripts don’t accept parameters, so it’s a matter of ignoring the autorun.sh command above, and adding –without-innodb to the configure.

As it turned out, the autorun.sh provided the additional storage engine support.

Details of the problem in this instance.

[mysqldev@lamda mysql-5.1.12-beta-nightly-20060801]$ . "./BUILD/autorun.sh"
Updating Berkeley DB source tree permissions...
../build_win32/include.tcl: no such file or directory
Updating Berkeley DB README file...
Building ../README
autoconf: building aclocal.m4...
autoconf: running autoheader to build config.hin...
autoconf: running autoconf to build configure
Building ../test/logtrack.list
cp: cannot create regular file `../test/logtrack.list': No such file or directory
Building ../build_win32/db.h
configure.in:723: required file `zlib/Makefile.in' not found
Can't execute automake

The good news is today, the MySQL Barcamp was announced, a great place to get some better insite into the gurus of the MySQL internals. At least a place to ask these questions providing they have a beginner group.

My Next Tutorial will take a look at the example provided by Brian Aker at the recent MySQL Users Conference HackFest B: Creating New SHOW Commands.

Securing a Tomcat Webapp – Part 2

If you wish to password protect your webapp with an Apache .htaccess type authentication model, you require two configuration steps. The first within your WEB-INF/web.xml, add the following replacing rolename appropiately.

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>All Pages</web-resource-name>
        <url-pattern>*.htm</url-pattern>
        <url-pattern>*.html</url-pattern>
   </web-resource-collection>
    <auth-constraint>
       <role-name>rolename</role-name>
    </auth-constraint>
  </security-constraint>

  <!-- Define the Login Configuration for this Application -->
  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Test Application</realm-name>
  </login-config>

  <!-- Security roles referenced by this web application -->
  <security-role>
    <description>
      The role that is required to log in to the Application
    </description>
    <role-name>rolename</role-name>
  </security-role>

Second, within the tomcat $CATALINA_HOME/conf/server.xml, you need to define the Realm used within the appropiate host’s <Engine> definition.

  <Realm className="org.apache.catalina.realm.UserDatabaseRealm" debug="0" resourceName="UserDatabase"/>

This Realm connects with a known resource, which I define with the $CATALINA_HOME/conf/server.xml <GlobalNamingResources> definition.

<Resource name="UserDatabase" auth="Container"
          type="org.apache.catalina.UserDatabase"
          description="User database that can be updated and saved">
</Resource>
<ResourceParams name="UserDatabase">
    <parameter>
        <name>factory</name>
        <value>org.apache.catalina.users.MemoryUserDatabaseFactory</value>
    </parameter>
    <parameter>
        <name>pathname</name>
        <value>conf/custom/users.xml</value>
    </parameter>
</ResourceParams>

NOTE: The use of MemoryRealm has limited uses. Tomcat provides 5 different Realm implementations including JDBC, DataSource,JINDI, Memory and JAAS.

And of course you need to define your user authentication within the appropiately defined users file. In this case conf/custom/users.xml

Securing a Tomcat Webapp

If you require a webapp to always run in https mode using a SSL key, then you need to add the following to your WEB-INF/web.xml configuration.

 <security-constraint>
    <web-resource-collection>
        <web-resource-name>jsp</web-resource-name>
        <url-pattern>*.htm</url-pattern>
        <url-pattern>*.html</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
        <transport-guarantee>CONFIDENTIAL</transport-guarantee>
    </user-data-constraint>
</security-constraint>

MySQL Response to Bugs

I’ve read at times people complaining about the response to bugs, and people bag the support of MySQL on the forums at times.

Well today I logged a bug, not the first and I’m sure it’s not the last. See LAST_INSERT_ID() does not return results for a problem in the latest Connector/J 5.0.3 that was released just recently.

Now it took me about 2 hours to log the bug, and probably at least 2 hours of frustration prior to that. The initial frustration 2 hours was unsuccessful debugging of what I was sure was valid code (and it was near midnight last night). The second 2 hours today was testing the problem between two environments, different database versions and different Connector/J versions, and providing a simple reproducable case of said problem.

So the timeline of the Bug in the MySQL Bug Tracking System.

  • [1 Aug 5:40] – Logged Bug
  • [1 Aug 5:58] – Initial feedback, something for me to confirm, bug “in Progress”
  • [1 Aug 6:04] – Confirmation that problem is new JDBC-compliant features, and the issue was a backwards-compatible option
  • [1 Aug 7:14] – Patch submitted to correct the issue
  • [1 Aug 7:28] – My response to initial feedback, as I hadn’t checked my mail for at least an hour, and then I had to go find the JSTL source code and check

So let me summarise that.

For aS3 (Non-critical) Bug, it took 18 mins from logging for MySQL AB to acknowledge and review the problem, 24 mins from logging to get confirmation of compatibility issue, and 94 mins from logging to have a patch submitted that addressed this and probably some more compatibility issues across 7 source files.

Good Job Mark Matthews. Where do I send the comments for the “MySQL AB employee of the month” award.

Determining Linux Video Driver Refresh Rates

I found this cool command called glxgears that you can run under X.

$ glxgears
10524 frames in 5.0 seconds = 2104.650 FPS
10003 frames in 5.0 seconds = 2000.501 FPS
8417 frames in 5.0 seconds = 1683.340 FPS    * Moved the window from left to right
10538 frames in 5.0 seconds = 2107.443 FPS
5639 frames in 5.0 seconds = 1120.850 FPS   * Moved the window around constantly
X connection to :0.0 broken (explicit kill or server shutdown).
$ glxgears --help
Warrning: unknown parameter: --help

Note the spelling error in “Warrning”.

Unexplained replication authentication behaviour

I’m playing with the latest 5.1.11 beta in a master/slave replication situation. Given a lack of H/W lying around I’m configuring a mixed H/W setup to leverage an existing office’s after hours CPUs running Windows XP for my slaves. So here is my test setup.

Server

Linux – 192.168.100.170

The following are the relevent my.cnf settings

[mysqld]
port=3307
server-id = 1
log-bin=mysql-bin

Confirming the server. I did some test commands prior to confirm the position.

$ mysql -uroot -p mysql
Your MySQL connection id is 47 to server version: 5.1.11-beta-log
mysql> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 |     1759 |              |                  |
+------------------+----------+--------------+------------------+

Slave

Windows XP – 192.168.100.171

The following are the relevent my.cnf settings

server-id	= 11
master-host     = 192.168.100.170
master-user     = rep
master-password = dolphin
master-port     = 3307

Test

I’ve taken a few steps back for the documentation, to clearly identify any setup problems for those that are also reviewing this to setup replication.

Starting the Slave, I get the following error.

060720 16:36:22 [ERROR] Slave I/O thread: error connecting to master '[email protected]:3307':
  Error: 'Host '192.168.100.171' is not allowed to connect to this MySQL server'
  errno: 1130  retry-time: 60  retries: 86400

Makes sense, you need to give access to the slave to access the master. So to the uninitiated you may miss that a slave requires different access then a default ‘all’ for example.

On the master.

$ mysql -uroot -p mysql
mysql>  grant replication slave on *.* to rep@'192.168.100.%' identified by 'dolphin';

Just a side note, when defining the master replication you can specifically state databases to include or ignore, but to set up the appropiate grant on a particular database fails. For example:

mysql> grant replication slave on test.* to rep@'192.168.100.%' identified by 'dolphin';
ERROR 1221 (HY000): Incorrect usage of DB GRANT and GLOBAL PRIVILEGES

But the problem was not indeed here, looking closer at the error message, the host and the port matched the my.cnf file, but the user didn’t. It was ‘root’ when it should have been ‘rep’.

So, some advice later from the Planet MySQL #IRC (thanks Sheeri) you find if you RTFM at MySQL 5.1 Manual – Replication Startup Options.

If the master.info file exists when the slave server starts, the server uses its contents and ignores any options that correspond to the values listed in the file. Thus, if you start the slave server with different values of the startup options that correspond to values in the master.info file, the different values have no effect, because the server continues to use the master.info file. To use different values, you must either restart after removing the master.info file or (preferably) use the CHANGE MASTER TO statement to reset the values while the slave is running.

So investigating the master.info file located in the slave data directory I get:

14
mysql-bin.000001
1839
192.168.100.170
root
dolphin
3307
60
0

There is a mismatch between the master.info file and the my.cnf file. I don’t recall the exact series of events when I setup the Windows XP installation, I know I first tested it as a master, and it’s most likely I then configured the slave settings, but used a master-user of ‘root’ initially and started the server. Then thinking this was insecure before I’d created the permissions, I changed the my.cnf file to use the user ‘rep’.

So the lessons today are:

  1. Correctly configure your slave settings before starting the mysql server for the first time.
  2. MySQL records slave replication authentication information in two files, the my.cnf and the master.info.
  3. Making changes to my.cnf are not reflected in an existing master.info. In this situtation use the CHANGE MASTER TO command in addition.
  4. Read the Manual. MySQL does have good documentation, and when it’s not quite right, like I found yesterday with Bug #21134, report it to help everybody in the future.

Now I fully agree this is a Replication 101 error. My excuse is the last replication I setup was in 2003 under 3.23, and it worked perfectly well in a production system for 2 years with a number of 3.23 and 4.0 slaves. Hence while I’m just refreshing my replication skills now.

Web Sites – eyeOS

WebSite: www.eyeos.org

eyeOS is an Open Source Web Desktop Environment, commonly known as Web Operating System or Web Office. With eyeOS you can be organized, work and have fun anywhere, using your own personalized Web Desktop. For using eyeOS, you only have to go to your eyeOS server (or use the official eyeOS public server) and log in with your username and password. If you don’t have a username and password, you can create an account from there too.

Web Sites – Thumbshots

WebSite: www.thumbshots.org

This site enables you to embed a thumbnail image of a website. Rather cool, except there must be some level of caching for new website requests. This page at the time of saving doesn’t show the thumbshot of the official site.

Below is an example output from website that I got the idea from:

What is a BLOB?

If your answer was Binary Large Object, then you would be wrong. Check out Mike’s comments on History of Database Blobs from the man that created the BLOB, Mr Jim Starkey.

I had the chance to meet Jim a few months ago, and I spent quite some time talking on various database topics. He is one cool guy.

Mercurial Version Control Software

I got asked (being a Java developer) about what was involved in creating an Eclipse Plugin for Mercurial. Well in true Google style, why invent when somebody probably already has. A quick check finds Mercurial Eclipse by VecTrace.

Now until last week, I’d never heard of Mercurial, so this is really an introduction to somebody that has no idea.

What is Mercurial?

Mercurial is a fast, lightweight Source Control Management system designed for efficient handling of very large distributed projects.

Ok, so big deal, I use CVS. I also use Subversion (SVN) for my Apache contributions, and also for MySQL GUI products. Why do we need another Version Control Product? Mercurial is a Distributed Software Configuration Management Tool. The following is from the Mercurial Wiki

A distributed SCM tool is designed to support a model in which each Repository is loosely coupled to many others. Each Repository contains a complete set of metadata describing one or more projects. These repositories may be located almost anywhere. Individual developers only need access to their own repositories, not to a central one, in order to Commit changes.

Distributed SCMs provide mechanisms for propagating changes between repositories.

Distributed SCMs are in contrast to CentralisedSCMs.

So, clearly a distributed model would well in a large distributed organisation where external factors limit continous access to a single central repository. Low Bandwidth, Poor Internet Connectivity, being on a plane, and travelling are all things that would make a distributed model a more ideal solution. I know I’ve taken my laptop away, and being an “Agile Methodology” developer, I commit often. When you have several days of uncommitted work it goes against the normal operation.

You can get more information at the official website at http://www.selenic.com/mercurial. A few quick links are: Quick Start Guide, Tutorial, Glossary.

Installing Mercurial

su -
cd /src
wget http://www.selenic.com/mercurial/release/mercurial-0.9.tar.gz
tar xvfz mercurial-0.9.tar.gz
cd mercurial-0.9
# NOTE: Requires python 2.3 or better.
python -V
python setup.py install --force

A quick check of the syntax.

$ hg
Mercurial Distributed SCM

basic commands (use "hg help" for the full list or option "-v" for details):

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 init       create a new repository in the given directory
 log        show revision history of entire repository or files
 parents    show the parents of the working dir or revision
 pull       pull changes from the specified source
 push       push changes to the specified destination
 remove     remove the specified files on the next commit
 revert     revert files or dirs to their states as of some revision
 serve      export the repository via HTTP
 status     show changed files in the working directory
 update     update or merge working directory

A more detailed list:

$ hg help
Mercurial Distributed SCM

list of commands (use "hg help -v" to show aliases and global options):

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 archive    create unversioned archive of a repository revision
 backout    reverse effect of earlier changeset
 bundle     create a changegroup file
 cat        output the latest or given revisions of files
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 copy       mark files as copied for the next commit
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 grep       search for a pattern in specified files and revisions
 heads      show current repository heads
 help       show help for a given command or all commands
 identify   print information about the working copy
 import     import an ordered set of patches
 incoming   show new changesets found in source
 init       create a new repository in the given directory
 locate     locate files matching specific patterns
 log        show revision history of entire repository or files
 manifest   output the latest or given revision of the project manifest
 merge      Merge working directory with another revision
 outgoing   show changesets not found in destination
 parents    show the parents of the working dir or revision
 paths      show definition of symbolic path names
 pull       pull changes from the specified source
 push       push changes to the specified destination
 recover    roll back an interrupted transaction
 remove     remove the specified files on the next commit
 rename     rename files; equivalent of copy + remove
 revert     revert files or dirs to their states as of some revision
 rollback   roll back the last transaction in this repository
 root       print the root (top) of the current working dir
 serve      export the repository via HTTP
 status     show changed files in the working directory
 tag        add a tag for the current tip or a given revision
 tags       list repository tags
 tip        show the tip revision
 unbundle   apply a changegroup file
 update     update or merge working directory
 verify     verify the integrity of the repository
 version    output version and copyright information

Mercurial Eclipse Plugin

The plugin is still in it’s early days, but the “FREEDOM” of open source enables me to easily review. After a quick install and review of docs, I shot off an email to the developer, stating why I was looking, and while I have other projects on the go, I asked what I could do to help. It’s only be 2 days and we have already communicated via email several times on various topics. That’s one reason why I really love the Open Source Community. Generally people are very receptive to feedback, comments and especially help.

Within Eclipse

  • Help ->Software Updates-> Find and install…
  • Select “Search for new features to install”, click Next
  • Click “New Remote site…”
  • Enter following details and click Ok
    • Name: MercurialEclipse Beta site
    • URL: http://zingo.homeip.net:8000/eclipse-betaupdate/
  • Follow the prompts to accept the license and download.

So now with Eclipse, on a project you can simply go Right Click -> Team -> Share Project -> Select Mercurial

A Quick Mercurial Tutorial

Of course the quickest way to learn about using Mercurial is to look at an existing product. So taking this plugin project for a spin.

$ cd /tmp
$ hg clone http://zingo.homeip.net:8000/hg/mercurialeclipse com.vectrace.MercurialEclipse
$ hg clone com.vectrace.MercurialEclipse example
$ cd example
# Create some new dummy files
$ touch test.txt
$ touch html/test.html
# View files against respository status
$ hg status
? html/test.html
? test.txt
# Add the new files
$ hg add
adding html/test.html
adding test.txt
# Commit changes
$ hg commit -m "Testing Mercurial"

So other then the second clone command (which enabled me to not mess up the original repository and to test distributed handling next), this is just the same as CVS (checkout, diff, add, commit)

# The Distributed nature involves first Pulling from the "upstream" respository
$ hg pull ../com.vectrace.MercurialEclipse
pulling from ../com.vectrace.MercurialEclipse
searching for changes
no changes found
# Confirm our new file is not in "upstream" respository
$ ls ../com.vectrace.MercurialEclipse/test.txt
ls: ../com.vectrace.MercurialEclipse/test.txt: No such file or directory
# Push local respository changes to "upstream" respository
$ hg push ../com.vectrace.MercurialEclipse
pushing to ../com.vectrace.MercurialEclipse
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 2 changes to 2 files
$ ls ../com.vectrace.MercurialEclipse/test.txt
ls: ../com.vectrace.MercurialEclipse/test.txt: No such file or directory

Hmmm, missed something here. The Quick Start Guide docs seems to want to work back into the original respository, pulling in changes from the “downstream” repository, then executing a merge and commit command. I don’t want to do this just in case I start messing up the original repository. Time for me to create a new standalone repository so I don’t screw anything up. Stay tuned for an updated tutorial.

Of course this is very basic, I need to look into commands used in CVS like login, tagging (versioning), branching, diffing revisions and merging but this is a start.

I do have a concern with the number of distributed respositories, when and how do you know you are in sync with the “master”, indeed what is the “master”. It does merit some investigation to see what management is in place, like identifying all respositories, and comparing for example.

Conclusion

Ok, so now I’ve got a grasp on Mercurial, time to review the Java Code and see what works and what doesn’t in the Eclipse environment. Of course I don’t use Mercurial so what I may consider as functionality required, may be lower priority to those users out there. Any feedback I’m sure will be most appreciated by the original developer.

What is software quality?

Greg Lehey wrote today Is MySQL getting buggier?. The underlying question of his comments is a more fundamental and passionate topic, and especially for me. That is “Software Quality”.

The quintessential question is this. “How do you determine the ‘software quality’ of a product?” And then quickly followed by, “How do you benchmark this with other software products?”

The short answer to second question is simple. You can’t. The reasons why become apparent in addressing the first question. (There’s a mathematical term for this two question situation, another one of the million things to research and remember one day).

15 years ago as part of my masters research I worked on “Improving Software Quality and Software Productivity”. At the time when I started, I found that these were generally considered mutually exclusive. Quite simply, to improve software quality was to decrease productivity, and when you had to improve productivity, quality declined. Unless you had a clear end goal (which you can’t in software, features, cost, satisfaction etc.), it was impossible to measure the total “cost”. It was also impossible to clearly determine impact on key factors, as with software you never have a “control group” working in isolation of other groups trying different things, and being able to do imperial comparison analysis.

In summary, I found that to improve software quality and software productivity you had to reconsider the approach at several different levels.

  1. You had to strive to work towards a simpler business solution, rather then a more complex solution.
  2. You had to leverage converting business knowledge that was held in lengthy and often dated paper documents, into electronically managable content. Some of the benefits included concurrency, consistency and comparative analysis of the knowledge. The early days of CASE repositories were built on this principle, again before they became to complex.
  3. You also had to look at better ways of writing code, and the area I focussed on was code generation. That is “Code writing Code”. (I have a very successful story of 400,000-500,000 Lines of C code. The ODT was a specialised intelligent peer-to-peer selective replication, between a central repository and 32 distributed sites, allowing for independent heterogeneous operations. It provided levels of complexity within replication including two-way master of partial database data, selective row replication between nodes depending on business needs, and collision management. And I did this is1992. A story for another day).

I also found in the business and industry sectors of research, it was near impossible to get a commonality of some components, a higher level of re-use, a higher value of cost savings. There were many factors for this, another topic for another time.

Now that was a long time ago, and a number of things have changed greatly, including my views. These are far more prevalent regarding Open Source.

Some historical general principles in measurement don’t apply today in Open Source. I remember reading recently in the browser wars, that FireFox was “too honest”. Being more open to acknowledge problems/bugs and correcting them, then the Microsoft perspective. How do you compare quality in this situation. (Side Note: I wish there was a search engine that would search all pages that you had viewed, bookmarked etc., rather then searching the WWW for everything. Would make life much easier in finding references like this.)

Open Source does something commercial software can’t do. If you find a bug, and you can’t wait, then fix it yourself, or at worse pay somebody to fix it for you. The scope, functionality and rules are constantly changing, the resourcing and priorities as well. How do you adequately measure something so fluid.

I am a proponent of “Agile Development Methodologies”, having use for many years eXtreme Programming (XP) either rather seriously, or taking small attributes and applying to existing infrastructures. Terms include Test Driven Development (TDD), Feature Driven Development (FDD), Code Coverage, Unit Tests.

Speaking just on Unit Tests. You write a Unit Test before you write your code. It helps to understand what you are trying to achieve before you try to achieve it. It also provides coverage for when the rules change, you write new or changed tests, which allows you to have analysis performed by your code to address problems. I’ve found that writing extensive tests become a self documentation system of the functionality, it highlights what can and can’t occur. Simply ‘grep’ing can offer a clear summarised description of code, even if you have now knowledge of it. In Java you write long method names (some abstract examples would be ‘testFunctionDoesThisAndNotThis”, “testFunctionAcceptsTwoOfThese”, “testFunctionFailsWithValuesOfThis” etc.) I need to provide a more concrete example to better explain.

An Agile approach, also tends to better encapsulate the Web 2.0 principle of considering software as a service, and not a product. This entirely changes the general thought of software development, of the large cost of software development, followed by a maintenance period and cost of the product, followed by a review of potential new products as circumstances change, followed by … (you get the picture)

If you consider an acceptable ongoing cost for providing a service, perhaps pegged against other business revenue/expenses, and then your product was managed by what available resources you had (time, money, people), rather then by functionality requirements you entire perspective changes. You include a high level of user ownership, the release often principle, and the capacity to review and adapt to changing and improving technology and requirements, all these factors change the outcome on how software is development, managed and perceived.

The other thing about all this is, It’s common sense. It’s a simpler approach, and if everybody considered “How can I make this simpler?” first, it would change they way software is developed and perceived.

So getting back onto the topic of Software Quality. I found that in the right circumstances (and there are wrong circumstances), imploring a change in principle towards Agile concepts can have a significant effect on Software Quality. But, Beauty is in the eyes of the beholder. Software Quality is also much the same. What may be acceptable for somebody is unacceptable for somebody else. How then do you measure this?

So after all this, we are left with the question. “How do you determine the ‘software quality’ of a product?” Well, we can only consider a single product in isolation, and we are talking the MySQL server. I should note that you can’t consider all MySQL products in one analysis, different factors including the relative age of the product, the technologies etc. all affect the measurement. You need to consider each product individually.

So why is the number of bugs, or number of open bugs a measure of software quality. What about how big is the product (e.g. lines of code), how complex is the product, how mature is the product, how tested is the product. What we considered the lifespan of critical bugs as a measurement? So what if there were a larger number of critical bugs, but they were fixed immediately is that a good or poor reflection on software quality. If software is in the Web 2.0 “Perpetual Beta”, then are bugs really incomplete functionality.

In software and for organisations, I think an approach could be similar as mentioned by a prospective employer recently. “We know things aren’t the best they could be, but they are better then they were. If our goal could be to improve just 1% each day, then we would be easily 100% better in 6 months.” So with whatever metrics were in place (and you have to have some form of measurement), if there is always a continual improvement, then software quality is always improving.

Generating an internal SSL Certificate (for tomcat)

How to Generate an internal SSL certificate

Create the self-signed keystore

$ su -
$ URL="your.url.here";export URL
$ cd /opt/tomcat/conf
$ keytool -genkey -alias ${URL} -keyalg RSA -keystore ${URL}.keystore
Enter keystore password:  changeit
What is your first and last name?
  [Unknown]:  your.url.here
What is the name of your organizational unit?
  [Unknown]:  IT
What is the name of your organization?
  [Unknown]:  your.url.here
What is the name of your City or Locality?
  [Unknown]:  Brisbane
What is the name of your State or Province?
  [Unknown]:  QLD
What is the two-letter country code for this unit?
  [Unknown]:  AU
Is CN=your.url.here, OU=IT, O=your.url.here, L=Brisbane, ST=QLD, C=AU correct?
  [no]:  yes

Enter key password for <your.url.here>
        (RETURN if same as keystore password):

Turn the keystore into a X.509 certificate

$ keytool -export -alias ${URL} -keystore ${URL}.keystore -rfc -file ${URL}.cert
Enter keystore password:  changeit
Certificate stored in file <your.url.here.cert>

Delete existing trusted certificate

$ keytool -delete -alias ${URL} -file ${URL}.cert  -keystore /opt/java/jre/lib/security/cacerts  -storepass changeit

Import the certificate into cacerts – JRE trusted certificates

$ keytool -import -alias ${URL} -file ${URL}.cert  -keystore /opt/java/jre/lib/security/cacerts  -storepass changeit
Owner: CN=your.url.here, OU=IT, O=your.url.here, L=Brisbane, ST=QLD, C=AU
Issuer: CN=your.url.here, OU=IT, O=your.url.here, L=Brisbane, ST=QLD, C=AU
Serial number: 44ab628c
Valid from: Wed Jul 05 01:56:12 CDT 2006 until: Tue Oct 03 01:56:12 CDT 2006
Certificate fingerprints:
         MD5:  EC:76:01:04:7F:FC:21:CC:A8:41:AD:86:C8:B2:D5:6D
         SHA1: 2D:FD:7C:56:65:70:36:1B:1D:71:09:41:84:98:E6:8E:89:18:BC:18
Trust this certificate? [no]:  yes
Certificate was added to keystore


If you replaced an existing certificate you will need to restart Tomcat.

Differences in syntax between mysql and mysqltest

As I wrote earlier in Using the MySQL Test Suite I found an issue with using the current MySQL Sakila Sample Database as a test with mysqltest.

I was running an older version of 5.1.7 beta so I figured the best course of action was to upgrade to 5.1.11 beta.

Well the problem still exists, and I found that the cause was due to the syntax of the DELIMITER command. Assuming the creation of the schema tables, here is an example of what I found.

Running in an interactive mysql session the following works

DROP TRIGGER ins_film;
DELIMITER ;;
CREATE TRIGGER `ins_film` AFTER INSERT ON `film` FOR EACH ROW BEGIN
    INSERT INTO film_text (film_id, title, description)
        VALUES (new.film_id, new.title, new.description);
  END;;
DELIMITER ;

On a side note, why does DROP TRIGGER not contain the IF EXISTS syntax like DROP DATABASE and DROP TABLE. It’s a shame, it’s a nice SQL extension that MySQL provides.

Now running the same SQL in a mysqltest test (with appropiate create tables) produces the following error

TEST                            RESULT
-------------------------------------------------------
sakila-trigger                 [ fail ]

Errors are (from /opt/mysql/mysql-test/var/log/mysqltest-time) :
mysqltest: At line 53: Extra delimiter "" found
(the last lines may be the most important ones)

It seems, that the DELIMITER command within mysql accepts a line terminator as a statement terminator, while in mysqltest, this is not so.
The solution is to be more strict in the mysqltest, using the following syntax, note the first and list DELIMITER lines are terminated by the appropiate delimiter in use.

DELIMITER //;

CREATE TRIGGER `ins_film` AFTER INSERT ON `film` FOR EACH ROW BEGIN
  INSERT INTO film_text (film_id, title, description)
  VALUES (new.film_id, new.title, new.description);
END//

DELIMITER ;//

Surprisingly, this syntax does then not work in mysql? Is this a bug? I’m note sure, perhaps somebody could let me know.

Using PBXT 0.9.5

Paul has released Version 0.95 of his PBXT MySQL Transactional Storage Engine.

Here is what I did to get it operational under CentOS 4.3.

su -
useradd pbxt
su - pbxt
wget http://www.primebase.com/xt/download/mysql-4.1.16-pbxt-0.9.5.tar.gz
tar xvfz mysql-4.1.16-pbxt-0.9.5.tar.gz
cd mysql-4.1.16-pbxt-0.9.5
./configure --with-debug=full --with-pbxt-storage-engine --without-query-cache --with-mysqld-user=pbxt  --prefix=/home/pbxt/mysql
make
make install
scripts/mysql_install_db
cd /home/pbxt/mysql
./bin/mysqld_safe --user=pbxt --basedir=/home/pbxt/mysql &

Now, lets test and confirm PBXT is operational.

bin/mysql -uroot mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or g.
Your MySQL connection id is 1 to server version: 4.1.16-pbxt-0.9.5-debug

Type 'help;' or 'h' for help. Type 'c' to clear the buffer.

mysql> show engines;
+------------+---------+------------------------------------------------------------+
| Engine     | Support | Comment                                                    |
+------------+---------+------------------------------------------------------------+
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance     |
| HEAP       | YES     | Alias for MEMORY                                           |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables  |
| MERGE      | YES     | Collection of identical MyISAM tables                      |
| MRG_MYISAM | YES     | Alias for MERGE                                            |
| ISAM       | NO      | Obsolete storage engine, now replaced by MyISAM            |
| MRG_ISAM   | NO      | Obsolete storage engine, now replaced by MERGE             |
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys |
| INNOBASE   | YES     | Alias for INNODB                                           |
| BDB        | NO      | Supports transactions and page-level locking               |
| BERKELEYDB | NO      | Alias for BDB                                              |
| NDBCLUSTER | NO      | Clustered, fault-tolerant, memory-based tables             |
| NDB        | NO      | Alias for NDBCLUSTER                                       |
| EXAMPLE    | NO      | Example storage engine                                     |
| ARCHIVE    | NO      | Archive storage engine                                     |
| CSV        | NO      | CSV storage engine                                         |
| BLACKHOLE  | NO      | Storage engine designed to act as null storage             |
| PBXT       | YES     | Super fast transactional, row-level locking engine         |
+------------+---------+------------------------------------------------------------+
18 rows in set (0.00 sec)

Now to take if for a spin with the MySQL Sakila Sample Database. At this time I need a modified version of the schema, as the sample database is designed for Version 5.

Download sakila.mysql4 script.
Download sakila.pbxt.sql script.

bin/mysql -uroot mysql
mysql>  source sakila.mysql4
mysql> source  sakila.pbxt.sql
# NOTE: Alter table of film_text fails as expected due to FULLTEXT index
mysql> show create table actor;
mysql> exit;

A few warnings in my sakila.mysql4 script I’ll need to review, but it looks ok from an installation step. Now back to my tests that I’m writing to use the Sakila Sample Database which I was working on, so I can really test the ACID state and performance of PBXT.

Using the MySQL Test Suite

MySQL provides two different tools to test the MySQL Server with SQL statements. One is mysqltest and in 5.1 mysqlslap. Both of these tools have quite different purposes. This is a quick review of the usage of mysqltest.

Current Usage

Under Linux deploys, the README in the mysql-test directory gives you all the information you need to know to run.

To run the full test suite.

cd /opt/mysql/mysql-test
 ./mysql-test-run

Rather easy, it does take some time, and I was surprised to find a number of tests marked as ‘skipped’. The general purpose of having tests in a product is to provide coverage of software functionality, and tests should always be forward compatible, unless they are no longer applicable and should be removed. I wonder why the time wasn’t taken to ensure they work in each release.

Should you already have a mysqld running, you can also use this using the -extern syntax to run specific tests against the server. For Example.

./mysql-test-run --extern alias analyze

There are over 700 tests, so this approach is not practical for all tests. It is noted in the README that some tests can’t run in extern mode.
Looking a bit closer at both the server configuration and a running test sheds some light on the types of parameters that are used, and how mysqltest operates.

/opt/mysql/bin/mysqld --no-defaults --server-id=1 --basedir=/opt/mysql --port=9306 --port-open-timeout=380 --local-infile
--exit-info=256 --core --datadir=/opt/mysql/mysql-test/var/master-data --pid-file=/opt/mysql/mysql-test/var/run/master.pid
--socket=/opt/mysql/mysql-test/var/tmp/master.sock --log=/opt/mysql/mysql-test/var/log/master.log
 --character-sets-dir=/opt/mysql/share/mysql/charsets --default-character-set=latin1 --tmpdir=/opt/mysql/mysql-test/var/tmp
--language=/opt/mysql/share/mysql/english/ --innodb_data_file_path=ibdata1:128M:autoextend --open-files-limit=1024
--log-bin-trust-function-creators --loose-binlog-show-xid=0 --rpl-recovery-rank=1 --init-rpl-role=master --key_buffer_size=1M
--sort_buffer=256K --max_heap_table_size=1M --log-bin=/opt/mysql/mysql-test/var/log/master-bin --user=root

/opt/mysql/bin/mysqltest --no-defaults --socket=/opt/mysql/mysql-test/var/tmp/master.sock --database=test --user=root
--password= --silent -v --skip-safemalloc --tmpdir=/opt/mysql/mysql-test/var/tmp --port=9306 -R r/count_distinct3.result

Extending MySQL Test Suite

That’s all nice, but the power of this infrastructure is you can incoporate your own tests. This may be beneficial if you have complex statements and use bleeding edge versions of MySQL. Indeed, it’s another approach to being able to provide to MySQL reproducable test cases if you have a problem.

There are basically 2 steps to incoporating your own tests.

  1. Create a test script in the t subdirectory.
  2. Create a test results file in the r subdirectory.

Using the MySQL Sakila Sample Database as a test case, I created a test with the default sakila-schema.sql and sakila-data.sql files. I did need to replace the DROP and USE DATABASE commands and add a set of DROP TABLE statements.

cd /opt/mysql/mysql-test
vi t/sakila.test
./mysql-test-run --record sakila
more r/sakila.results

Example Results

./mysql-test-run sakila

Stopping master cluster
Installing Test Databases
Removing Stale Files
Installing Master Databases
running  ../bin/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/master-data --skip-innodb --skip-ndbcluster --skip-bdb
Installing Master Databases 1
running  ../bin/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/master-data1 --skip-innodb --skip-ndbcluster --skip-bdb
Installing Slave Databases
running  ../bin/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/slave-data --skip-innodb --skip-ndbcluster --skip-bdb
Manager disabled, skipping manager start.
Loading Standard Test Databases
Starting Tests

TEST                            RESULT
-------------------------------------------------------
sakila                         [ pass ]
-------------------------------------------------------

Ending Tests
Shutting-down MySQL daemon

Master shutdown finished
Slave shutdown finished
All 1 tests were successful.

Now, of surprise with my present version of 5.1.7 beta, a number of statements from the sakila-schema failed.
You can download my working version of the sakila.test from above here. The first failure was:

TEST                            RESULT
-------------------------------------------------------
sakila2                        [ fail ]

Errors are (from /opt/mysql/mysql-test/var/log/mysqltest-time) :
mysqltest: At line 220: query ';
CREATE TRIGGER `ins_film` AFTER INSERT ON `film` FOR EACH ROW BEGIN
INSERT INTO film_text (film_id, title, description)
VALUES (new.film_id, new.title, new.description);
END;;
CREATE TRIGGER `upd_film` AFTER UPDATE ON `film` FOR EACH ROW BEGIN
IF (old.title != new.title) or (old.description != new.description)
THEN
UPDATE film_text
SET title=new.title,
description=new.description,
film_id=new.film_id
WHERE film_id=old.film_id;
END IF;
END;;
CREATE TRIGGER `del_film` AFTER DELETE ON `film` FOR EACH ROW BEGIN
DELETE FROM film_text WHERE film_id = old.film_id;
END;;
DELIMITER ;
--' failed: 1065: Query was empty
(the last lines may be the most important ones)

Now is the time to upgrade to the latest 5.1.11 beta to confirm operations and isolate the errors into smaller tests, and hopefully all is well, if not, I’ve got something to contribute back.