It's makes me cry

I got home today and sat down to read my home email list. Nothing new. But on a MySQL mailing list, there was an enquiry why performance was slowing in a given application. I didn’t even have to read the situation, nor the problem, it took less then the 200ms mentioned to identify the problem looking at the supplied schema.

In summary, the first table in the schema had a primary key of VARBINARY(255) and a engine type of Innodb. Hold on, wait, it’s a concatenated key of two VARBINARY(255) columns. And I should mention, that primary key was a foreign key in the next table. If this was a home website app with one user, ok well it’s still bad, but this application was having performance problems with reasonable volumes of transactions, it’s not a beginner application. (A recent reference If you don’t know your data, you don’t know your application). Where do people learn this from!

Now I want to work on a large scale out MySQL environment, an organisation or site that’s going places, but the world of even normal every day applications is littered with these basic fundamental design errors. It’s make me cry.

My last contract had many flaws across the business, and the Java code for production applications was just as bad. Check out things like What constitutes a good error message to the user?, Why IT professionals get a bad name (I could easily write one of these a day for a long time, or you could check out The Daily WFT). You think the new contract with an large existing multinational company and world-wide installed Java application I could get an improvement. I’ll leave these stories for another time. I don’t want to have to deal with this.

I’ve got 16 half written blogs on cool MySQL features or interesting topics, but I never seem to get the motivation at the end of the day. I just want to know, how do people in IT get away with this. Where is the Pride in people’s work, where is the desire for people to make what they do a better place. Where is:

An opportunity that’s a challenge, including involving hard work and tight time frames, but a job that provides the rewards of job satisfaction for a productive contribution. An importance on quality, emphasis on continued improvement, and goals of simplicity in complex situations while working in a team environment are also necessary.

Your Ideal Job Requirements?

I came across in my papers while searching for some taxation information, a list that I made to use as an evaluation in an “ideal job”. This goes with a general comment I made only a few weeks ago to a number of colleagues.


I’m looking for an opportunity that’s a challenge, including involving hard work and tight time frames, but a job that provides the rewards of job satisfaction for a productive contribution. An importance on quality, emphasis on continued improvement, and goals of simplicity in complex situations while working in a team environment are also necessary.

The list I found and made in 2002 was:

  1. Use of Existing Core Skills – Technologies
    • Oracle, Unix, Java, HTML/Web
    • Open Source Projects (Apache – HTTP/Java/XML, MySQL, PHP, etc)
  2. Use of Existing Core Skills – Experience
    • Database Modelling
    • Large Systems Design & Development
    • Application Performance & Tuning
    • Technical Problem Solving
  3. About Newer Technologies
    • Encourage/Embrace use
    • Not bleeding edge
    • Opportunity for learning new relative skills
  4. Work Environment
    • Equipment
    • Location
    • Core working hours
    • Work at home options?
  5. Team Environment
    • Ability to undertaking varying roles (and not others due to a team)
    • Existing working relationship with people
    • Across Development/Management/Support teams
  6. Project Environment
    • Durations 1 month – 1 year
    • Challenging
    • Varying
    • Problem Solving
  7. Project Methodologies
    • Emphasis on Quality Procedures
    • Emphasis on Software Testing
    • Emphasis on Customer
  8. Remuneration
  9. Other
    • Some fun

What’s interesting to know, is that this list includes a lot of points I’d still consider essential for my “ideal job”. I’ll need to consider more a revised list, as it took some time to make this list up. It’s interesting to know that my present contract position, as well as my last position do not meet a great deal of my “ideal job” requirements.

SHOW STATUS Gotcha

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

In summary (without the surrounding fan-fare, I was seeking):

SELECT VERSION();
SHOW STATUS;
SHOW VARIABLES;  // Optional

I was however perplexed why my first data point analysis (Read/Write ratio) using the Status values Com_insert, Com_update, Com_delete and Com_select was not always giving me expected results. In particular, a number of server results showed 0 for values while I knew the results came from working MySQL environments.

So, sanity check with good friend Morgan and I get the response to answer the dilemma SHOW STATUS defaults to session based statistics in 5.0+, there’s a million people it’s caught out, and many bugs about “xyz variable not being incremented”.

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

So I’m really seeking:

SELECT VERSION();
SHOW GLOBAL STATUS;
SHOW GLOBAL VARIABLES;  // Optional

Well, that’s my MySQL trivia point for the day. MySQL between version 4.1 and 5.0 made a major change with the default value. The MySQL 5.0 Manual also states this.

UPDATE 23 Sep 2006

Following Eric’s Show @&!# status again!, I did then read the Bug #19093, and it was great to see Monty comment (that’s open source for you, enabling even founders to mix with the common folk), and I’d tend to agree to his comments. MySQL has attempted to correct something.

Perhaps the best compromise to the problem is to deprecate and drop the SHOW STATUS command, and make the scope component GLOBAL|SESSION compulsory, so SHOW GLOBAL STATUS or SHOW SESSION STATUS are the only valid commands!

Microfox ?

I’ve added Digg to my general lunch time reading web sites. I came across this yesterday. Microsoft invites Firefox development team to Redmond.

Well, isn’t that nice, the big boy opening his pond (including all the sharks) to the little fish. (if you don’t get it, it’s a line from Die Hard 2 about pissing in somebody else’s pond) What’s funny about the article is not the article but some of the comments, here are but a few.

  • Cool. Very cool.
  • they’re going to brainwash them into working for microsoft and their going to cast off the IE team to replace the firefox team
  • Just replace IE 7 with FF 2.0. Save every web developer 100s of hours.
  • “Come into my parlour,” said the spider to the fly
  • DON’T DO IT! My god, has noone seen Braveheart? The Untouchables? The Godfather? Scarface?
  • hey’re going to kidnap the Mozilla developers with a cunning trail of pizza and cola, and force them to work on IE8.
  • ITS A TRAP! http://itsatrap.ytmnd.com/

I like the last one, maybe just because of the cool picture on referenced site.

I see today an official response Firefox welcomes Microsoft’s offer of help

The fast pace of technology in a Web 2.0 world

I had need to goto the Wikipedia this morning to review the terminology of something, and on the front page in Today’s featured article is Mercury. Being a tad curious given I’d heard only on the radio a few hours ago that Pluto was no longer a planet in our Solar System, I drilled down to the bottom to check references to other planets (quicker then searching). So at the bottom I found the following graphic and details of The Solar System Summary.

Well blow me down, they didn’t waste any time there. Pluto is no longer a planet in our Solar System. It is now categorised as a Dwarf Planet.

There are over 100 edits to this page on Dwarf Planet in the past 12 hours, including all the links in Pluto and Solar System correctly referenced.

Now, that’s the power of content by the community, one of the characteristics of Web 2.0.
Of course all of this runs under the LAMP Stack and powered by the MySQL Database. Combined with the fact this is a breed of organisation that didn’t start with large amount of Venture Capital, another trend of the newer generation of popular and successful internet enterprises.

No Spam Today

Huh, tricked you. As if!

However I was looking at my Akismet Spam section in WordPress, the open source software that runs my blog, and it gave me this message.

Caught Spam

You have no spam currently in the queue. Must be your lucky day. :)

The RAT and the CAT

No, it’s not a bedtime story, is a serious system’s design concept and I’m amazing that people don’t know about this.
As I mentioned in If you don’t know your data, you don’t know your application I was doing a Java Code Review, and I found a clear case of a much simplier solution. How simple you ask?

Well, without completing the task 100%, I achieved in less then 1 day (and lets say for the argument 1 more day of work), what is being worked on by somebody else for a week, with an estimate of 2 more weeks to complete. So let’s add 50% to my estimate, that’s a total of 3 days verses 15 days. You do the math. and yes that was last week and that task is still being worked on the same way, even with reference to my working code. Not to mention the code is a similiar magnitude of simplicity, and simplicity means cost savings in support, people so quickly forget that.

So what is this RAT and CAT:

  • RAT – Row At a Time
  • CAT – Chunk At a Time

This concept is really rather simple, however it’s amazing how implementations never consider this. Let me give you a simple example.

Your importing some external data (a flat file of 5 columns, and is inturn will be stored in 2 tables in a normalised form). So the RAT way of doing things would entail:

  • Using the language in question, open the file, read line by line, converting into object for reference.
  • Now for each row read you
    • You get the values (A and B), and then do a select to see if this row already exists in one of the normalised tables (calling in X). If it does, you have the surrogate key, else you insert the row and get the surrogate key
    • Now you do the same with the next set of values (C, D, E) which reference the normalised values (A and B), inserting into Y)
    • And so on and so on.
    • Report exceptions line by line if found

For those now laughing, it’s not really funny, it’s sad that programs are written this way. For those that say, but that’s the only way, well welcome to a different, radical and hard to learn approach.

The Cat way of doing things would entail:

  • Create a temporary table
  • Bulk load the data into the temporary table via appropiate command, e.g. mysqlimport or LOAD DATA.
  • Using one select, insert into X rows from temporary table that are not already present
  • Using one select, insert into Y rows from the temporart table that are not present, joining to X to get the appropiate surrogate key
  • Report exceptions via one select of data that wasn’t inserted due to some rules

It’s not rocket science people.

I should mention this is a simple example, and it’s not always possible to do CAT processing for all operations, but generally some portion of batch work can be, and the remaining must be done in a RAT way.

Updated
Some more recent articles including We need more CATs (2009) and The Art of Elimination (2010)

If you don't know your data, you don't know your application.

The art of data modelling is definitely lost on some [most] people, or they never found it, even though they think they did. Over dinner with good friend Morgan last night we were swapping present stories on the topic.

Morgan wrote recently about I want my 4 bytes back damn it., and interesting example storing an ISBN. Further reference can be found at Getting started with MySQL of a more impractical ISBN example.

Disk is cheap now, so the attitude and poor excuse can be, well a few extra bytes doesn’t matter. Well no! If your a social hacker and have a website with a maximium concurrent connections of 2 maybe, but much like some recent Java Code Reviewing I just performed, just because the system isn’t 24×7, doesn’t give you excuse to be lazy about writing the code not to handle concurrency, thread safety and also as efficient as possible, in this case RAT verses CAT. (I’ll need to write about this, it seemed to go over some of the other professionals even)

I can remember a very specific example some 10 years ago in doing some performance analysis on a site. I’d identified the need for an additional index on a table. Now this table was sized for 200 million rows, and it already had about 70 million. The problem was adding another index required 4GB disk allocation. These things have an effect on sizing, growth and backups.

So the impact on appropiate sizing can clearly have an effect, if it was just one poorly sized column that’s acceptable (just), but normally it’s a pattern that litters a data model.

What’s important to realise is, it’s not just diskspace, it’s also memory. Without really touching on sizing data, I did mention some examples previously in Improving Open Source Databases – WordPress. Here the use of BIGINT(20) for primary keys proved my point. That’s 8 bytes, but unless you have going to have 4 billion categories, it’s a waste. It’s a waste when it’s a foreign key in a table, and it’s a big waste when it’s indexed, and that index is then in memory, and wasting more precious resources.

So how to do identify when the designer of the model has no idea about the intrinsic data value being stored? If you see tables with VARCHAR(255), that’s a clear sign. They have no idea regarding the data, so a default limit is used. Morgan referred to it as “Shooting guns in the dark with your eyes closed”. Books don’t help the cause, I was just skimming High Performance MySQL last night (one of the many freebies from the UC). There on page 82, is a table definition with not one column, but two with varchar(255). Hmmm!

If you see any new applications with VARHAR(255) they are even more lost, because MySQL 5, which has been around quite some time now, supports VARCHAR(65535). Speaking of that, has anybody seen VARCHAR(65535). I’d like to know.

Another example, is in Sheeri’s Top 8 SQL Best Practices Point 4 in regards to storing IP’s effeciently. If you log for example every page hit, this can be your largest table, and moving from varchar(15) to int can save you upto 11 bytes per row alone.

These may just be simple examples, but it’s the principle. When you define a column, consider it’s data, if you don’t know then take the time to do the reasearch and learn.

Become named in Firefox 2

So, FireFox have come up with a novel idea to promote it’s product. Check out Firefox Day.

The official blurb: Share Firefox with a friend. If your friend downloads Firefox before September 15, you’ll both be immortalized in Firefox 2.

You can even choose how to link your names together on the “Firefox Friends Wall”. Examples like ‘my name’ Informed ‘your name’, or ‘my name’ Empowered ‘your name’, or ‘my name’ Liberated ‘your name’.

Perhaps MySQL can leverage this idea for some what to promote future download!



New Toy


Got my new toy today, having only ordered it Thursday night, it arrived Monday. Normally Dell stuff comes from Asia, maybe they had some of these on hand locally.

A Dell 2407WFP 24″ LCD monitor with a 1920×1200 resolution.

Hmmm, not bad. I should mention $1,199 Delivered ($300 off normal price)

Peace Man

No, it’s not a slogan from the 1960’s and 1970’s, however if I could draw a picture in a wordpress textarea I’d draw a hand Victory signal with two fingers.

Today I started providing services as a Technical Analyst for Peace Software initially here in Brisbane. Stealing directly from the marketing blub.

“Peace Software is the world’s leading utility customer information software developer. Peace ™, the company’s flagship software product, is installed at major utilities in 35 regulated and competitive energy markets for billing and customer relationship management of millions of electric, gas and water customers. Peace Software has customers in North America, Europe and Asia-Pacific. ”

So, a different pace for me, especially in terms of the “end user” customer of the software I’ll be responsible for in some small way.

Things that are the same. Java, Web Client, Oracle, Unit Tests, some exposure to Agile Methodologies

Things that are different. Large, stable and established product, long standing company, corporate customer, lack of Internet end user urgency. And Still, no job working with MySQL which is my goal.

This is not the first time I’ve worked particularly in this type of software industry. I had a reasonable stint at Brisbane City Council when they first rolled out the RIMS system to manage Council Rates and Billing ($1 billion revenue p.y.) for one of the largest councils in the world.

Pride

I’ve recently completed a contract and I’ve been in discussions with agents and other employers for further work. Having had one of the worse experiences in my previous work, I’ve been extra careful to ensure what I’m told at the interview/meeting stage is indeed true and accurate (in my last case it was not). I’ve also not made the assumption that an organisation that is dependent on software has placed a certain level of value on what’s in place. (in my last case I did, simply due to the size of the organisation and volume of business).

So, when being asked by people what I’m seeking, outside of the technical skills and compendencies, I’m seeking an organisation that places value on it’s existing software, it’s software quality, it’s software improvement and most importantly it’s software developers. It was unfortunate that for an organisation that lived in software, and would not survive long (especially at the present scale of operations) without it, they met none of these criteria. It was really sad, and overall I found the environment “depressing”.

Now “depressing” could be considered a harsh word, but it comes from both my ethos and also present circumstances in life. I had these two post-it notes on my wall at work, which summed up “What I was seeking/Where I wanted to be”, and “Where I was” in my job.

There have been a few articles recently, it was probably Marcus in Supporting ideas and being productive, based on earlier articles of How to Come Up With Ideas and How To Kill Good Ideas by Zack that motivated me to publish this. These articles have many good points, and while I could put my slant on it, I wanted to focus on just one thing.

In Software Development 10 years ago I was driven, now I know and I’m driven by passion. I always strive for something better then their presently is, seeking better quality, and better productivity, and most importantly today, simplicity.

Everybody is at a different position in life and it’s important to find the right fit. For me money is not a motivator, being productive, making what I’m doing better, where I’m working more enjoyable and productive, and who I’m working with a better place to be are important to me.

I’ve choose the word for the moment to best represent this as PRIDE.

Pride is:

  • A sense of one’s own proper dignity or value; self-respect
  • Pleasure or satisfaction taken in an achievement, possession, or association
  • A cause or source of pleasure or satisfaction; the best of a group or class
  • The most successful or thriving condition
  • To indulge (oneself) in a feeling of pleasure or satisfaction

For me, I’m seeking an organisation that takes pride in it’s people, it’s products and it’s future direction, and that attracts people that have pride in what they do, they way they think and how they interact with others.

Troubleshooting Wireless Connections (under Windoze)

Router Configuration

In order to enable wireless you will most probably have a Wireless Router that is managing your Internet Connection (via cable or DSL), Each router is different, however they generally all have a web interface to access them, the Netgear and DLink routers seems to have generally moved to http://192.168.0.1 as the default address (which is also configurable) Refer to the router documentation for the correct address. You will be prompted for a username and password, this is generally admin and then a default password. Refer to appropiate router documentation for the default password.

Router Wireless Settings

Your router will allow you to configure a number of wireless settings. Where this menu option differs with every router, but normally a Wireless Settings option is easily visible. The common wireless settings are:

  • SSID. This is an identifier for your Wireless Network. This helps when you have access to multiple networks
  • Channel. This is a number generally between 1 and 13, and is just used as an offset from the Wireless frequency. Wireless 802.11 networks work at 2.4GHz can be interfered by other devices such as cordless phones and microwaves. It’s generally considered that a larger number is better, however I don’t know the reason why.
  • Mode: The present IEEE standard for wireless is 802.11 This is generally suffixed by either a ‘b’ or a ‘g’ (known as the mode), indicating the capacity of your wireless connection. Most present wireless network components are 802.11g with supports speeds to 54Mbps, will the older 802.11b had a maximum speed of 11Mbps. While some providers actually state today 110Mbps, they are actually cheating but using two 54Mbps channels in parallel where then split up your information, send in parallel and then reconstitute.
  • Security Encryption. By default, encryption is disabled, but it should always be enabled, and generally set to an Authentication Type of Automatic
  • WEP Encryption level. This can be:
    • Disabled – no data encryption
    • 64-bit (sometimes called 40-bit) encryption – ten hexadecimal digits (any combination of 0-9, A-F)
    • 128-bit encryption – twenty-six hexadecimal digits (any combination of 0-9, A-F)

The gets the terminology out of the way.

Available Network Connections

Start by looking at your present Windows Network Configuration on the machine in question.

  • Start | Control Panel
  • Now if you are in Classic View, you can just select Network Connections
  • If you are in Category View, you need to select Network and Internet Connections then Network Connections

At this point you should see a list of Network Connections. It’s not uncommon as per the screen to see 2 connections, a Wireless Network Connection from your wireless network card, and a Local Area Network if you used a fixed network cable.

Configuring your Wireless Connection

The first thing you need to check is if your Wireless Network connection has a Red Cross thru it. This is a good indicator that it’s not working. The following points run thru the options against a Wireless connection.

Disable/Enable

The Status column for the connection should first say Enabled. If it says Disabled, you should enable this single click on Wireless Network Connection, then Right Click, then select Enable.

Available Wireless Networks

A right Click and View Available Wireless Networks should provide a list of available Networks that have been found inrange and that broadcasting their SSID (it’s possible to not broadcast, but by default this is enabled). You should see the SSID you define for your Router, and you may well see other wireless networks. But Selecting an Available Network, you will be prompted to enter the WEP Encryption key if the network is using encryption. Do so, and click [Connect].

Status

Right Click, Status gives you information on your connection, It should have a Status of Connected, the other details if for informational purposes. Clicking on the Support tab is helpful to confirm your setttings. It’s important that the information is like:

  • Address Type: Assigned by DHCP
  • IP Address: 192.168.0.100
  • Subnet Mask: 255.255.255.0
  • Default Gateway: 192.168.0.1

These details define your connection, it’s important that the Default Gateway is the same as your router address, and that the IP Address is similar (with just the last part different)

Properties

Right Click, Properties gives you a lot to review. The three things you should review are:

  • Click on the General Tab, then click Internet Protocol (TCP/IP), then click Properties
  • Within this page, on the General Tab, you should ensure that Obtain an IP Address automatically is selected, as well as Obtain DNS server address automatically is selected. Click OK to return.
  • Click on Wireless Networks Tab, and ensure Use Windows to configure my wireless network settings is ticked.
  • Click on Authentication Tab, and ensure the Enable Network Access box is ticked.

If you have made any changes, it’s benefical to Disable and re Enable your connection.

Troubleshooting

  • Try Disable your Wireless Connection the Re-enable.
  • Start | Run then type in command and press Enter.
    • ipconfig – This should list your current connection details, including the IP address for your wireless connection, this should be like 192.168.0.100 (based on your DHCP settings, and the first 3 parts of the IP address should match your IP address of your Router.
    • ipconfig /release – This will manually disconnect your connection
    • ipconfig /renew – This will manually reconnect your connection. You should then do a ipconfig to reconfirm your details
  • If everything points to the connection working, you can test your network connection with the following commands.Start | Run then type in command and press Enter.
    • ping 192.168.0.1 – This should provide a few lines of response, do a network test between your computer and the router
    • ping 64.132.34.72 – This should again provide a few lines, and confirms a connection to the outside world (in this case www.news.com.au)
    • ping www.news.com.au – This finally ensures you are connected to the outside world, and that DNS (Domain Name Server) Resolution is working. This is the translation between human names, like websites, and computer identifications which is a 4 part IP address.
  • Should this all fail, it may be benefical to remove the Wireless driver settings completely and re-install, which is an entirely new exercise requiring documenting.

Protecting Your Network

Wireless in default operations is unsecure, and is very easy for example somebody else to use your bandwidth. It’s also easy for a hacker to intercept your communications and then analyse your information. There are two things you should always do:

  • Use an encrypted network using WEP
  • Change your router default password to something different

In addition for greater protection you can run with an Access List, that is specifically defining at your Router the specific Network Addresses of each Network Card that can access your network, but this requires more configuration.

References

Wikipedia IEEE 802.11

Compiling MySQL Tutorial 2 – Directly from the source


Should you want to be on the bleeding edge, or in my case, don’t want to download 70MB each day in a daily snapshot (especially when I’m getting build errors), you can use Bit Keeper Free Bit Keeper Client that at least lets you download the MySQL Repository. This client doesn’t allow commits, which is a good thing for those non-gurus in mysql internals (which definitely includes me).

wget http://www.bitmover.com/bk-client.shar
/bin/sh bk-client.shar
cd bk_client-1.1
make

By placing sfioball in your path you can execute.

sfioball bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

This took me about 4 mins, which seemed much quicker then getting a snapshot!

You can then get cracking with my instructions at Compiling MySQL Tutorial 1 – The Baseline.

A good reference in all this compiling is to take a good look at the MySQL Internals Manual. (which I only found out about recently)

If at a later time you want to update your repository to the latest, use the following command.

update bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

Documentation References

5.1 Reference Manual – 2.9. MySQL Installation Using a Source Distribution
5.1 Reference Manual – 2.9.3. Installing from the Development Source Tree
5.1 Reference Manual – 2.13.1.3. Linux Source Distribution Notes
MySQL Internals Manual

Requirements for compiling

To confirm earlier notes on minimum requirements for compiling the following details should be confirmed.

automake --version
autoconf --version
libtool --version
m4 --version
gcc --version
gmake --version
bison -version


My results running Fedora Core 5.

$ automake --version
automake (GNU automake) 1.9.6
$ autoconf --version
autoconf (GNU Autoconf) 2.59
$ libtool --version
ltmain.sh (GNU libtool) 1.5.22 (1.1220.2.365 2005/12/18 22:14:06)
$ m4 --version
GNU M4 1.4.4
$ gcc --version
gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
$ gmake --version
GNU Make 3.80
$ bison -version
bison (GNU Bison) 2.1

Compiling MySQL Tutorial 1 – The Baseline – Update

Just to confirm my earlier confusion about verified snapshots at Compiling MySQL Tutorial 1 – The Baseline.

“Daily snapshot sources are only published, if they compiled successfully (using the BUILD/compile-pentium-debug-max script) and passed the test suite (using make test). If the source tree snapshot fails to compile or does not pass the test suite, the source tarball will not be published.”

Seems the fine print at MySQL Database Server 5.1: Beta snapshots also states this. Well, need to take my RTFM pill there.

Thanks to Lenz for putting the record straight, and helping with my Forum Post. Seems I did uncover a Bug, now recorded as Bug #21463. Just need to get it fixed to continue on my tutorial.

Compiling MySQL Tutorial 1 – The Baseline

Pre-requisites

This tutorial is aimed at Linux installations that has the standard development tools already installed. The INSTALL file in the source archives provides good details of the software required (e.g. gunzip, gcc 2.95.2+,make).

Create a separate user for this development work.

su -
useradd mysqldev
su - mysqldev

Get an appropiate Snapshot

You can view recent snapshots at http://downloads.mysql.com/snapshots/mysql-5.1/.

The official statement on snapshots from MySQL AB.
Daily snapshot sources are only published, if they compiled successfully (using the BUILD/compile-pentium-debug-max script) and passed the test suite (using make test). If the source tree snapshot fails to compile or does not pass the test suite, the source tarball will not be published.

At the time of producing this the present snapshot was http://downloads.mysql.com/snapshots/mysql-5.1/mysql-5.1.12-beta-nightly-20060801.tar.gz

mkdir -p $HOME/src
cd $HOME/src
SNAPSHOT="mysql-5.1.12-beta-nightly-20060801";export SNAPSHOT
wget http://downloads.mysql.com/snapshots/mysql-5.1/${SNAPSHOT}.tar.gz
tar xfz ${SNAPSHOT}.tar.gz
cd $SNAPSHOT

Compiling

You can for example do the simple way with most C programs:

DEPLOY=${HOME}/deploy; export DEPLOY
./configure --prefix=${DEPLOY}
make
make install

However it’s recommended you use the significant number of pre-configured build scripts found in the BUILD directory. For Example:

./BUILD/compile-pentium-debug --prefix=${DEPLOY}

I always get the following warnings. Not sure what to do about it, but it doesn’t break any future work to date. Surely somebody in the development team knows about them.

../build_win32/include.tcl: no such file or directory
cp: cannot create regular file `../test/logtrack.list': No such file or directory

Testing

The easy, but time consuming part over, let’s test this new beast.

make install

This will deploy to the preconfigured area of $USER/deploy. For multiple versions within this process you should adjust this back in the compile process.

cd $DEPLOY
bin/mysql_install_db
bin/mysqld_safe &
bin/mysql -e "SELECT VERSION()"
bin/mysqladmin -uroot shutdown

Changes

So if you haven’t already sniffed around there are some reasonable changes in the structure. Here are a few points that caught me out:

  • mysql_install_db is now under /bin not /scripts, infact there is no /scripts
  • mysqld is not under /bin but infact under /libexec. A good reason to always used mysqld_safe
  • by default the data directory is /var, normal documentation etc always has this as /data

This is a quick confirm it all works, and I’ve for the purposes of this example ensured that no other MySQL installation is running, and there is no default /etc/my.cnf file. (I always place this in the MySQL installation directory anyway).

Some minonr considerations for improvements with locations, users and multiple instances.

cd $DEPLOY
rm -rf var
bin/mysql_install_db --datadir=${DEPLOY}/data --user=$USER
bin/mysqld_safe --basedir=${DEPLOY} --datadir=${DEPLOY}/data --user=$USER --port=3307 --socket=/tmp/mysqldev.sock &
bin/mysql -P3307 -S/tmp/mysqldev.sock -e "SELECT VERSION()"
bin/mysqladmin -P3307 -S/tmp/mysqldev.sock  -uroot shutdown

Troubleshooting

Now, there is no guarantee that the snapshot is a working one, in the case of mysql-5.1.12-beta-nightly-20060801 in this example, it didn’t work for me? I’ve just reinstalled my OS to Fedora Core 5, and the previous source version (a 5.1.10 snapshot worked ok)
Here are some tips to help out.

The preconfigured BUILD scripts have a nice display option with -n that allows you to see what will happen.

./BUILD/compile-pentium-debug --prefix=${DEPLOY} -n

In my case it gave me this:

testing pentium4 ... ok
gmake -k distclean || true
/bin/rm -rf */.deps/*.P config.cache storage/innobase/config.cache
   storage/bdb/build_unix/config.cache bdb/dist/autom4te.cache autom4te.cache innobase/autom4te.cache;

path=./BUILD
. "./BUILD/autorun.sh"
CC="gcc" CFLAGS="-Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wunused
-DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS  -DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX"
CXX="gcc" CXXFLAGS="-Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wctor-dtor-privacy
-Wnon-virtual-dtor -felide-constructors -fno-exceptions -fno-rtti  -DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS
-DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX" CXXLDFLAGS=""
./configure --prefix=/home/mysqldev/deploy --enable-assembler  --with-extra-charsets=complex  --enable-thread-safe-client
--with-readline  --with-big-tables  --with-debug=full --enable-local-infile
gmake -j 4

In my case this helped with my problem, that is the autorun.sh command is throwing an error in the Innodb configuration.
Part of the compiling is you can turn Innodb off with –without-innodb, but the BUILD scripts don’t accept parameters, so it’s a matter of ignoring the autorun.sh command above, and adding –without-innodb to the configure.

As it turned out, the autorun.sh provided the additional storage engine support.

Details of the problem in this instance.

[mysqldev@lamda mysql-5.1.12-beta-nightly-20060801]$ . "./BUILD/autorun.sh"
Updating Berkeley DB source tree permissions...
../build_win32/include.tcl: no such file or directory
Updating Berkeley DB README file...
Building ../README
autoconf: building aclocal.m4...
autoconf: running autoheader to build config.hin...
autoconf: running autoconf to build configure
Building ../test/logtrack.list
cp: cannot create regular file `../test/logtrack.list': No such file or directory
Building ../build_win32/db.h
configure.in:723: required file `zlib/Makefile.in' not found
Can't execute automake

The good news is today, the MySQL Barcamp was announced, a great place to get some better insite into the gurus of the MySQL internals. At least a place to ask these questions providing they have a beginner group.

My Next Tutorial will take a look at the example provided by Brian Aker at the recent MySQL Users Conference HackFest B: Creating New SHOW Commands.

Eclipse CVS Tutorial 2

Following on from my earlier Eclipse CVS Tutorial 1.



Decorators

The following options define what is presented in the Navigator View at the project, directory and file levels.

  • Window | Preferences
    • General | Appearance | Label Decorations
    • Team | CVS
    • Team | CVS | Label Decorations | Text Decorations

NOTE: There is a Mercurial Eclipse option under the top level menu in Window | Preferences. It should be moved to the Team submenu in line with CVS and SVN.


Menu Options

The following are the Right Click | Team menu views for both CVS and SVN.

NOTE: That depending on the Version Control, only the CVS or SVN options are shown. Presently the Mercurial options are shown for both, but should not be until the Project is shared with Right Click | Team | Share Project.

New File Management


To step through the process of adding and maintaining a new file within CVS.

  • Right Click | New | File adding test.txt
  • On test.txt, Right Click | Team | Commit. (Unlike the command line interface, where you must do a cvs add before a cvs commit, under eclipse the Team | Add to Version Control is optional)
  • Enter comment, and click Finish
  • Edit the file, and repeat a few times
  • To gain a history of information you can on test.txt do Right Click | Team | Show Resource History. This gives you the CVS Resource History tab and shows details of revision, user etc

Securing a Tomcat Webapp – Part 2

If you wish to password protect your webapp with an Apache .htaccess type authentication model, you require two configuration steps. The first within your WEB-INF/web.xml, add the following replacing rolename appropiately.

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>All Pages</web-resource-name>
        <url-pattern>*.htm</url-pattern>
        <url-pattern>*.html</url-pattern>
   </web-resource-collection>
    <auth-constraint>
       <role-name>rolename</role-name>
    </auth-constraint>
  </security-constraint>

  <!-- Define the Login Configuration for this Application -->
  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Test Application</realm-name>
  </login-config>

  <!-- Security roles referenced by this web application -->
  <security-role>
    <description>
      The role that is required to log in to the Application
    </description>
    <role-name>rolename</role-name>
  </security-role>

Second, within the tomcat $CATALINA_HOME/conf/server.xml, you need to define the Realm used within the appropiate host’s <Engine> definition.

  <Realm className="org.apache.catalina.realm.UserDatabaseRealm" debug="0" resourceName="UserDatabase"/>

This Realm connects with a known resource, which I define with the $CATALINA_HOME/conf/server.xml <GlobalNamingResources> definition.

<Resource name="UserDatabase" auth="Container"
          type="org.apache.catalina.UserDatabase"
          description="User database that can be updated and saved">
</Resource>
<ResourceParams name="UserDatabase">
    <parameter>
        <name>factory</name>
        <value>org.apache.catalina.users.MemoryUserDatabaseFactory</value>
    </parameter>
    <parameter>
        <name>pathname</name>
        <value>conf/custom/users.xml</value>
    </parameter>
</ResourceParams>

NOTE: The use of MemoryRealm has limited uses. Tomcat provides 5 different Realm implementations including JDBC, DataSource,JINDI, Memory and JAAS.

And of course you need to define your user authentication within the appropiately defined users file. In this case conf/custom/users.xml

Securing a Tomcat Webapp

If you require a webapp to always run in https mode using a SSL key, then you need to add the following to your WEB-INF/web.xml configuration.

 <security-constraint>
    <web-resource-collection>
        <web-resource-name>jsp</web-resource-name>
        <url-pattern>*.htm</url-pattern>
        <url-pattern>*.html</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
        <transport-guarantee>CONFIDENTIAL</transport-guarantee>
    </user-data-constraint>
</security-constraint>