MySQL Winter of Code

Our first session in Day 2 of the MySQL Camp was the MySQL Winter of Code, as well as an overview of the QA Pilot program and Overview of the Community Doxygen Project by Kaj Arnö and Jay Pipes.

Starting with discussions on Code Contributions & MySQL Winter of Code

Quality Contributer Program

  • More coding happens during wintertime then in summer
  • MySQL has less contributions than many other Open Source projects
  • Contributor License Agreement
    • We want to award contribution more then nominally
    • We want to encourage contributions in all areas
    • We prefer contributions in certain areas (especially encourage them)

Requirements for Winter of Code

  • A signed Contributor License Agreement
  • A well-formed proposal
  • Votes from the Community and/or MySQL

Topics for Winter of Code 2007

  • Connectors
    • Improvements in (pure drivers for) Perl, Apache APR, Python, Ruby
  • Storage Engines
    • File System Storage Engine
      • select directory,filename,size from files where size > 1000000;
      • select directory,sum(size) from files group by directory;
    • JPG/EXIF Storage Engine
      • update jpgfiles SET Author = ‘name';
  • Anything
    • Full Text Search for CJK
    • MySQL GIS improvements
    • Your Idea

Which versions does it go to?

  • MySQL 5.1 Community Server
  • MySQL 5.2 Enterprise Server

MySQL Quality Contributor Program

  • Searching for Quality Contributors
    • Bug Reports
    • Test Cases
    • Bug Patches
  • Defining a Quality Contributor
  • Encouraging Quality Contributors
    • Fixing Bugs
    • Responsiveness and feedback
    • Recognition and attribution
    • Privileges/Awards

Day 1 – Memorable Quotes

Plenty of people are writing highly technical stuff from MySQL Camp including your’s truly. However there needs to be a lighter side here, and well this is it, Memorable Quotes.

“That’s moderately easy to difficult.” Brian Aker talking about table_funcs in A MySQL Core Kernel

“That’s Trivial, it’s less then a day’s work”, Monty, also in “A MySQL Core Kernel”, of course Monty said “It’s Trivial” several times, and that’s fine, it probably is trivial and is a day’s work for the guru’s, the problem is there are presently 6,000 trivial day’s work on the list of things to do.

“I’m trying to estimate when my finger will fall off.” — Jay Pipes You had to be there. I will say no more.

“You work for InnoDB, right” — Dathan Vance Pattishall of Flickr “InnoDB works for me.” — Ken Jacobs of Oracle

“Absolutely” Steve Gunn of Google in “The MySQL at The Google” talk. And the question from the floor that prompted this response “Do schema changes ever affect the production systems”.

“Everything at Google grows at the rate Google grows. If you want a proper answer we have to file that with the SEC”. Steve Gunn of Google again in “The MySQL at The Google”.

“We like to use boxes that crash.” — Mark Callaghan of Google.

“I want to make it, but we have already met before.” — Paul Tuckfield while Jeremy bashing. Side Note, apparently I’ve been saying “bagging Jeremy” which is Aussie Slang, but here in the US it has other meanings!

“I’d love my business card to say Hacker Herder”. The very cool Leslie, our Google Liason person.

“Actually they are just extras, they have all been hired for the day.” — Sheeri. In reference to all the Google Employees wearing Google shirts.

“And we’ll give you a tee-shirt” — An Google employee about Job Opportunities.

“I’m going have to kill Jeremy. This wireless stinks, I’d rather have dialup” — Sheeri about our hotel connectivity, hotel being recommended by Jeremy.

“I’m the former founder of Live Journal.” — Brad Fitzpatrick. “How can you be a former founder” — Jeremy Cole.

There were of course so many more, I just didn’t write them down. But tomorrow I will be prepared.

Testing on the toilet


Yes you got it, even while in the restroom here at Google (you can’t say toilets here in the US, because that’s the device), Google keeps you occupied while standing or sitting with the writings of “Testing on the Toilet”.

In Episode 19, TOTT talks about “Converting Old Style Tests”. An interesting read, rather then the daily grind of the front page of USA Today, plus as well as something that can be obviously changed at a longer frequency.

So how was the toilet experience here at MySQL Camp. Well you have toilet warming seats , my first experience, it was a little weird, and then you get the builtin “bidet” as well, with the ability for front cleaning, rear cleaning and then drying. Now that was really weird.

There has been a policy of what photos we can and can’t take and that’s cool, so I can’t post a copy of it. I will however show you this cool testing logo of TOTT as it was also on a tee-shirt (yes, it’s a little stained, but geeks do that sometimes) of a Google employee in Kiev, which is where we can take photos.

MySQL Camp – Introductions & Comments

The great thing about this unconference, is the lack of total formal structure. For now , our first session we are having an open introduction of people, there are at good 60+ people here already, and people rolling in, and it’s great to hear people’s background, and also to bag Jeremy Cole at every opportunity. We have a variety of people from various backgrounds, companies and experience levels.

We are in the Kiev room, with power build into the desks, lots of desk space and full 360% swivel chairs. This is just another example of the company’s clear thinking about it’s requirements.

There have already been some very funny stories, I should have made more earlier notes. Here are some.

Adam Ritter (Proven Scaling ride winner) was the first to bag Oracle, really bold move with Ken Jacobs from Oracle directly behind him, and he had already made his introduction.

Paul Tuckfield of You Tube guys said to Jeremy Cole re his replication talk “I want to make it, but we have already met before.” There have been about 10 bags of Jeremy already, he is giving as good as he is getting. Proven Scaling are sponsoring Beer session tomorrow night. Great stuff Jeremy. He did also ask how many people were planning on coming, given the number of people at the MySQL Camp has tripled in the past few days.

Breaking news. Mark Callaghan from Google, “Is there anybody from You Tube here”, to which the Paul Tuckfield of You Tube identified himself. After a few quick words the Google comment was “Deals Off” which made everybody laugh. That’s been level of good interaction with people. here

Flickr DB dude (his words) Dathan Vance Pattishall said to Ken Jacobs “You work for InnoDB, right”. Ken Jacobs response was “InnoDB works for me.”. Again a lot of laughs.

My own Googlewear

So like two minutes later, some official looking Google people come over and saw “Come on over and get your Google Shirt”. So before the last past is even cold, we have our own Googlewear.

A minute later, Leslie is back again saying, guys and lady (Just for Sheeri), “Contintental breakfast is ready in the room”. Now to check out the Google Food!

Googlewear

Everybody here (that is not us visitors) are wearing Google shirts. It must be an official clothing label.

So Sheeri says “Actually they are just extras, they have been hired for the day.”

So the latest quote from Leslie is “Eat, joy and be merry, and stay inside the blue lines”. Of course I should also mention when we arrived the parking security guy said. “Follow the second yellow brick road”. This is going to be a weekend just of quotes!

I'm at Google Mountain View

We have made it to MySQL Camp being held at Google Head Quarters in Mountain View California. Directions WOOT!!!

So we are at the lobby reception of Building 40, and I’m lounging back in a large green beanbag behind all the name tags, this is so cool, the problem is with all our technology, nobody yet has the capability to read the photos from a digital camera so I can upload it. Both Sheeri and myself have left the right stuff in our hotel room. So stay tuned.

Leslie our Google co-coordinator wants her business card to read “Hacker Herder” which sounds so cool. This whole weekend is going to be a blast. More to come.

MySQL Quotes

Frank was on a role with MySQL quotes (it’s 1am here in New York – All that Red Bull & Vodka). Here are some of them:

Let me scale you!

Wanna scale.

Scale me Baby!

Backup Now!

MySQL – DBA Friendly.

MySQL – Use the Attitude.

MySQL. Be Bold!

MySQL. Look Again

MySQL – Coming to a website near you.

One small step for Data, one giant leap for DBA.

Data, we are serious about it.

My Job, My Passion. MySQL.

MySQL. Never Doubt.

MySQL. Scaling made Easy.

MySQL. Scaling all you want.

Got MySQL!

Do it with MySQL.

Scale Yourself.

MySQL or die.

I’ve also done some of my own shirts designs (see small images below), a number I already have on shirts (you can check them out on me at MySQL Camp).

Some other references include MySQL forge Merchandise and Arjen’s suggestions.


Linux One Liner – dirtree alternative

Linux has a cool command called dirtree that gives a more visual representation of your directory structure. If you have the misfortune of working on a Unix variant that doesn’t have it, checkout this cool one liner.

ls -R . | grep ":$" | sed -e 's/:$//' -e 's/[^-][^/]*\//--/g' -e 's/^/   /' -e 's/-/|/'

Thanks for the command Tom.

Log Buffer #13: a Carnival of the Vanities for DBAs

Unlike fellow author Giuseppe of last week’s Log Buffer #12 I volunteered for the job of this week’s Log Buffer. Lots to say, so little time, so lets get started with Log Buffer #13.

Tom Kyte has been at the DBForum 2006 in Denmark. Apart from the contents of the Forum, his picture and comment “I spied some artifacts from Mogens Oracle Museum, a copy of the Version 3 and Version 4 Oracle” in Dbforum 2006, in the past… was an impressive look back in time. Manuals, what are they? So how old is this? Wikipedia History places Oracle Version 4 at 1984, some 22 years ago. One of the comments to Tom’s entry takes us to Back to the future (Oracle 4.1 VM appliance). The title gives the article’s content away, but worth a view of Oracle history. Good to also see Tom won the 42 Question Quiz on day 1, but what was the question he got wrong?

Ric Smith gives us a window of this month’s upcoming Oracle Open World with some details of Oracle Open World 2006 – Oracle Develop, “a new event tailored for the “geek” in us all. The format will make for a more developer-oriented conference”. Craig Mullins is at the European International DB2 User Group conference being held in Vienna this week (must be the month for RDBMS conferences). Details of his presentation “Change Control for DB2 Access Paths” are at IDUG in Vienna.

Build Your Own Oracle RAC Cluster on Linux – Again references a very detailed article and explanation by Jeffrey Hunter on RAC and shows the benefit by contributions to the OTN Oracle Community. If you’re heading to Oracle Open World this month, Justin Kestelyn mentions details of a similar presentation being held during the “Oracle on Linux Experience” portion of OTN Night on Monday Oct 23.

David Aldridge in his article Linux 2.6 Kernel I/O Schedulers for Oracle Data Warehousing: Part II has received some good responses in his concise and simple Benchmark. I always strive for simplicity in solving problems and this looks like a good simple approach to graphing I/O.

A little off the beaten track is Applying Web 2.0 to the Enterprise by Jonathan Bruce. The reason why I mention this is two fold. Firstly, decisions made by Project Management can have a big effect on the software development process, and this can have a significant effect on the DBAs and System Administrators that support systems. The article also mentions Agile Software Development of which I am a strong proponent. As I have a very detailed database background I’m also wary of some of the “strenghts” mentioned generally with Agile. A topic I’m happy to discuss more at some time.

E A D G B E. I have no idea what that means, you will need to read Ian Thain’s article regarding the Sybase WorkSpace to find out. He publishes some healthy performance improvement throughput figures with his 3 tuning guidelines.

Firebird 2.00 Release Candidate 5 has also been released this week. The news article indicates that this will probably become the release version.

Peter Scott reminds us that with all the technology advances and an existing 8 year old system which includes documentation, things still happen in I hate on call.

Greg Sabino Mullane over at Planet PostgreSQL is keeping abreast of the various open source offerings with his report on Berkeley DB now does MVCC. His comment “Looks like Oracle is actually doing something with their purchase …. Curiously, this comes right at the point when MySQL is dropping the BDB engine from their product.”. Hmmm, interesting observation, however it wasn’t the actual reason why BDB was dropped from MySQL. The actual reason mentioned some time ago can be found at BDB Engine removal.

Are we working in a booming industry? “Overall, Gartner is predicting that the worldwide DBMS market is around $14 billion and will continue to grow by nearly 7% per year”. This comment by Zack Urlocker is from his attendance of the “Gartner Open Source Summit - a very thorough analysis of the impact of open source technology in the database market.” You can read all his comments of the summit in Gartner Mastermind panel and Gartner on Open Source Databases.

Exploring the secrets of intermediate materialization by Adam Machanic revives a trick he had in SQL Server 2000 in improving logical reads when query tuning. This example shows it’s operation in SQL Server 2005.

Peter Zaitsev gives us a quick refresher on his MySQL Performance Blog with What to tune in MySQL Server after installation. A good introduction reference of configurable system variables, particularly for those non-MySQL DBA’s that need to also support a MySQL installation. Mike Kruckenberg also gives us a valuable consolidated reference in his twin articles, Guide to Incompatibilities when Upgrading MySQL to Version 4.1 and Guide to Incompatibilities when Upgrading MySQL to Version 5.0. Essential reading for clearly understanding MySQL database upgrades and possible traps.

MySQL Tools for Microsoft Visual Studio 1.0.1 beta has been released. Enough said. Ok, well for those that want some more detail, I quote from Reggie Burnetta downloadable plug-in for Visual Studio 2005 that allows Windows developers to quickly build MySQL data-driven applications with Visual Studio. With this plug-in, developers will be able to create, modify and manage MySQL database objects with an easy-to-use interface from within the Visual Studio IDE.. If only I used Microsoft I could check it out!

Daniel Schneller highlights one of the problems in a large scale out MySQL implementation in his article MySQL replication timeout trap. Valuable information in a network infrastructure to ensure your slaves are performing optimally.

Normally I’d summarise a worthy article for review, this time I’ve reproduced the concise summary by Jason Gaylord in Preventing SQL Injection Attacks which explains his content. Scott Guthrie just posted some really good stuff about preventing SQL injection attacks. In his blog post he talks about an application that Michael Sutton created to check SQL injection attacks by screening Google search and looking for sites with QueryString, etc. Check out his post for more details: http://weblogs.asp.net/scottgu/archive/2006/09/30/tip_2f00_trick_3a00_-guard-against-sql-injection-attacks.aspx

Of the big 5 or 6 RDBMS products of the past 2 decades, DB2 is the only one that hasn’t crossed my path in some way. Willie Favero writes What’s in a name – The saga continues…, sharing his views on the official name of DB2® Version 9 for z/OS.

Jeremy Cole has been busy in recent months with his new found freedoms in his new venture Proven Scaling. He has released another MySQL Source Patch with On Triggers, Stored Procedures, and Call Stacks. Keep em comin’ Jeremy. And just as I complete this weeks Log Buffer, good mate Jay (the plumber) Pipes has published HOWTO: Making a Corresponding Test Case for your Patch. Very worthy information for all those past, present and future patch writers.

The Data Charmer Giuseppe Maxia gives us the inside goss on his recent vacation interests in Take the MySQL Certification in five steps. Good Advice, I liked Point 5, and the unofficial Point 6. Marcus Popp also points us to New Lists of Certified Candidates online so you can see your name in lights. Reminds me to stop procrastinating and to take the MySQL 5 exams myself. It’s been on the cards for a few months now.

We end this week with one of those feel good stories of something that inspires me. Paul McCullagh has written his own MySQL transactional storage engine. PBXT beta 0.97 has just been released as a Pluggable storage engine for MySQL 5.1. Quoting Paul “PBXT is the first full featured engine to be released in this form.”. This leverages a new feature in the upcoming MySQL 5.1 GA release where developers can use MySQL’s extensible Storage Engine Architecture as a plugin without the need for recompiling with MySQL source. Look out for a lot more opportunities in storing and access different types of data in the future with this feature. [Author Side Note: Compiling MySQL from the latest BK tree may contain code features that are not fully tested (e.g. 2 Oct 2006). It’s best when integrating other patches or plugins to use a known MySQL Source Snapshots, otherwise things may break!]

And with a certain amount of deja vu from last week’s closing Log Buffer #12 comment by Giuseppe, your’s truly will also be joining MySQL. Checkout my If you can’t beat them, join them.

That’s all for lucky #13. Thanks for the opportunity Dave.

If you can't beat them, join them!

Like fellow friends and MySQL’ers before me Morgan, Roland, Giuseppe, Markus and Sean, I’ve joined the MySQL juggernaut on the ride of my life, achieving two of my short/medium term professional goals in one step. Woot!

It says something to me about the company I’m very excited to work for when I knew of all these people before they joined MySQL this year (2006). I’ll also be joining other friends and MySQL people Arjen, Jon, Jay, Colin, Michael Z and I still have a list of friends that I’ve met while being part of the MySQL community.

And as Giuseppe said I’ll be working in a virtual company. Another article I like to tell others about MySQL is MySQL: Workers in 25 countries with no HQ.

I’ll leave you with the MySQL Values from the Company About MySQL AB page.

We want the MySQL server to be:

  • The best and the most used database in the world
  • Available and affordable for all
  • Easy to use
  • Continuously improved while remaining fast and safe
  • Fun to use and improve
  • Free from bugs

MySQL AB and the people of MySQL AB:

  • Subscribe to the Open Source philosophy
  • Aim to be good citizens
  • Prefer partners that share our values and mindset
  • Answer email and give support
  • Are a virtual company, networking with others

Tutorial – Beginner Web Services

An introduction to using Axis.

What is Axis?

Axis is essentially a SOAP engine — a framework for constructing SOAP processors such as clients, servers, gateways, etc. The current version of Axis is written in Java. But Axis isn’t just a SOAP engine — it also includes:

  • a simple stand-alone server,
  • a server which plugs into servlet engines such as Tomcat,
  • extensive support for the Web Service Description Language (WSDL),
  • emitter tooling that generates Java classes from WSDL.
  • some sample programs, and
  • a tool for monitoring TCP/IP packets.

Pre-Requisites

Installation

su -
cd /opt
wget http://apache.ausgamers.com/ws/axis/1_4/axis-bin-1_4.tar.gz
tar xvfz axis-bin-1_4.tar.gz
ln -s axis-1_4/ axis
echo "AXIS_HOME=/opt/axis;export AXIS_HOME" > /etc/profile.d/axis.sh
. /etc/profile.d/axis.sh
cp -r $AXIS_HOME/webapps/axis $CATALINA_HOME/webapps
catalina.sh stop
catalina.sh start

At this time, you should be able to confirm this installation was initially successful by going to http://localhost:8080/axis/

Installed Axis Options

The default Axis page, gives you a number of options. To confirm the installation, select the Validate Axis Link http://localhost:8080/axis/happyaxis.jsp. If there is anything missing this page will report it. In my case I was missing XML Security, which is optional.

cd /tmp
wget http://xml.apache.org/security/dist/java-library/xml-security-bin-1_3_0.zip
unzip  xml-security-bin-1_3_0.zip
cp xml-security-1_3_0/libs/xmlsec-1.3.0.jar /opt/tomcat/common/lib
catalina.sh stop
catalina.sh start

One of the links from the default home page are http://localhost:8080/axis/servlet/AxisServlet which Lists services.

First Use

One of the nicest parts of AXIS is its “instant Web service” feature called Java Web Service (JWS) — just take a Java file, rename it, and drop it into TOMCAT_HOME/webapps/axis to make all of the (public) methods in the class callable through Web services.

Quote.java

import java.util.HashMap;
import java.util.Map;

public class Quote {
  private HashMap quotes = null;
  public Quote() {
    quotes = new HashMap();
    quotes.put("Groucho Marx", "Time flies like an arrow.  Fruit flies like a banana.");
    quotes.put("Mae West", "When women go wrong, men go right after them.");
    quotes.put("Mark Twain", "Go to Heaven for the climate, Hell for the company.");
    quotes.put("Thomas Edison", "Genius is 1% inspiration, 99% perspiration.");
  }
  public String quote(String name) {
    String quote;
    if (name == null || name.length() == 0
      || (quote = (String) quotes.get(name)) == null) {
      quote = "No quotes.";
  }
  return (quote);
  }
  public int count() {
    return quotes.size();
  }
}
cp Quote.java /opt/tomcat/webapps/axis/Quote.jws

http://localhost:8080/axis/Quote.jws
http://localhost:8080/axis/Quote.jws?wsdl

More details can be found at Getting Started using Web Services with Tomcat and Axis.

What’s Next

In my next Tutorial, I’ll be moving to the practical use of Web Services using WSDL.

References

The price for digg success

I guess much like the Slashdot effect, the Digg effect is both a good thing for your exposure and traffic hits, and a bad thing for those ISP’s watching the traffic. (See Jay’s Slashdot Fame).

In the past week, I’ve gone to a number of digg article sites and they have been unavailable. I never kept details of these IT articles, but here is one the one article not of an IT I look at that I did. It referred to an image, which the host provider adjusted (see image to the right). The host provider was ImageShack. I didn’t read anything in the T & C about being too popular!

Here’s the original Digg Post Gmail ads get a little too personal.

Seems I’m not the only person wanting to see it, I found in the comments a repost of the image Here. Just for future reference my copy is Here

Logical Data Modelling (LDM)

Following my User Group Presentation I was asked by fellow MySQLer Kim about Logical Data Modelling (LDM), in relation to Physical Data Modelling.

Well, first the brain had to work overtime to remember when was the last time I worked on a Logical Data Model. The answer to that is 1996 doing R&D work for Oracle Corporation with their CASE repository tool, Oracle Designer, about version 1.3/1.3.2. I’ve learnt in the past 10 years to purge technical stuff from my brain, leading from the capacity in be able to remember in detail data models, data migration and data cleansing issues of projects even after leaving them 3 years eariler.

As Kim pointed out, he thinks physically, actually directly at the SQL level, then working backwards to produce an appropiate physical model. To think logically is to consider the entities and attributes and relations before considering the physical tables, columns and relationships. So how do you program somebody to think logically? In the case with Kim, he is undertaking formal studies after already grasping the concepts of software development. Generally today more people don’t undertake the formal education and we end up with The Hobbist and the Professional Syndrome.

I guess in summary I’d argue why bother. Does anybody still do traditional logical models? Feedback/comments are welcome. (Professionally I believe there is a place for LDM).
However for the purpose of the exercise lets start with a Physical Representation and present a Logical Representation of that to see the differences.



From this you can see a classic Library example, showing a table of Books and a Table of Authors, and an intersection table (3NF) to indicate that a Book can have one to many Authors, and likewise an Author can have one to many Books.

So how would this have been represented in a Logical Model.



Some things to consider in Logical Modelling.

  • Attributes really only require a name, and perhaps just a datatype using number, string, date (Jim Starkey would be happy), but nothing specific like SMALLINT, or VARCHAR(50).
  • An Attributes Mandatory/Optional state may or may not be known
  • Information is not in 3NF. i.e. relationships can be many to many (in our example, there is no concept of the intersection table

I’m going from memory here, so there is probably more points to consider.

So how do you teach this when you are trying to work backwards, when I learnt this back in 1988 (some 18 years ago), I’d never created a database or used SQL so I didn’t have the history. Arjen had a brilliant idea to consider Logical Modelling after the fact. Views. Consider an end user requirement for Reporting, and how you would represent your model better to an end user (effectively de-normalising these views so users don’t have to know about joins).

In this case you would create a view of Books and a View of Authors. Details such as mandatory/optional isn’t important to the end user report (ie, it’s not like it’s needed to be enforced), and specific datatype details again are not that important. The basics to know how to format a number or a date works.

An interesting approach that worked well in our explaining.

The Text Book

So it was interesting to go back to the text book using C.J. Date’s An Introduction to Database Systems to review definitions.

Logically isn’t really referenced, the term used is conceptual. A conceptual definition is “The right way to do database design is to get the logical design right first, without paying any attention whatsoever at that stage to physical – that is, performance – considerations”

Also for example Third Normal Form we get from Section 12.3 First, Second and Third Normal Forms.

Third normal form (very informal definition): A relvar is in 3NF if and only if the monkey attributes (if any) are both:

  • a. Mutually independent
  • b. Irreducibly dependent on the primary key

We explain the terms nonkey attribute and mutually independend (loosely) as follows:

  • A nonkey attribute is any attribute that does not participate in the primary key of the relvar concerned.
  • Two or more attributes are mutually independent if none of them is functionally dependent on any combination of the others. Such independence implies that each attributes can be updated independently of the rest.

Man, no wonder many years of experience and having generally seen most cases, enables me to forget this and not feel like I’ve fogotten something.

The Hobbyist and the Professional

I first coined this term in February 2006 in a paper titled “Overcoming the Challenges of Establishing Service and Support Channels” for the conference “Implementing Open Source for Optimal Business Performance” View Paper.

I must have referenced this several times, time to give this topic it’s due notice. In summary it targets three areas:

  • the lack of appropriate database design in open source projects
  • the lack of coding standards
  • the lack of sound programming principles

Here are the bullet points from the slide in a presentation.

Hobbyist

  • Download-able software and examples
  • Online tutorials
  • Books like Learn in 24 hours/For Dummies

Professional

  • Formal Qualifications
  • Grounding in sound programming practices
  • Understanding of SDLC principles
  • Worked in team environment

Middle Ground Developer

  • Time to skill verses output productivity
  • Depends on environment and requirements

Slow Queries aren't always that bad!

Well, now I have your attention, Slow Queries are bad (unless it’s a single user system and you don’t care). However there are worse things then slow queries in a large enterprise system.

I’ve been asked in recent weeks a number of questions which has brought this topic to discussion, as well as a current implementation I’m undertaking for a client of a purchased product.

High volume repetitive queries can have a worse effect on your system’s performance. Combined with slow queries that take locks, these queries can have an extreme effect on performance and if you don’t know your application, or have the right tools, it can be initially hard to diagonise easily.

This problem is the classic Wasting CPU cycles problem, seen it before, will see it again.

Here’s a classic example for reference. Using the current product that I’m customising and installing (I’ve not be involved in any development), a typical Customer Information System (CIS).

The system when deployed will have 1.1 Million Customers. Each Customer has a Balance. Now the Customer Balance is not recorded against the customer record, it’s calculated every time it’s required. So, no big deal, well yes. The stored procedure to calculate the balance hit’s 4 different tables, and one of these tables is one of the largest in the system (recording all detailed financial transactions). Multiple batch reports alone that work with large sets of customers all require or use the current balance in some means.

There are only a small set of transactions that affect the balance including Invoice Statements, Payments, Adjustments, Credit Control, Write-Offs . These processes by there own nature are either batch or small online transactions. The calculation of the balance can be applied against the Chunk of customers in a batch, or individual Row for online transactions (See The RAT and the CAT).

Even for the paranoid, the re-calculation of the customer balance for all customers in batch is more efficient after hours, yet only customers where some transaction that has been applied since the last time the re-calculation ran is necessary.

MySQL doesn’t be default have any analysis tools to identify and manage these types of queries. Peter Zaitsev in his article Slow Query Log analyzes tools highlights the issue with a good approach by a source code change to long_query_time as well as some additional scripts.

Hopefully MySQL will consider a more improved approach in future releases that doesn’t require patching the source code. Now it’s unlikely that MySQL will do another SHOW STATUS Gotcha and for example change the unit of measure from seconds to milliseconds, but they could make it a float, so you could specify 0.1 for example.

Web Site – Speed Test

Want to know your Internet Connection speed in a real world test?
Want a fancy graphical presentation of your internet Speed?

SpeedTest.Net has you covered. As you can see that even provide graphics results to can use on your own site.

So just how much does my Bittorrent download of Stargate and StarGate Atlantis weekly epsiodes affect my link (Azureus states about 20kB/s down and 20kB/s up. Here is the results. Hmmm, seems if affects my link much more then it really states!

Stories that impress and motivate you

I’ve worked for two Internet startup companies, both around 2 years each, both now long dead. The first was due to eventual lack of new VC funds, the second gross financial managment in the second year (apparently, when we were told there was no money December one year to pay us, the company that made large profits every month for over the first year, then had made losses every month for the past 12 months, but nobody knew about it. There were 5 Directors from 3 countries and nobody knew. Yeah Right!)


I’ve learnt a lot of non IT street smarts in this time. The first startup took the VC route, and after 3 rounds while I wasn’t involved in the process you pick up things. The single biggest tip here is the Bell-Mason Diagnostic. Here a few introduction references worthy of a quick review (One, Two).

When you take any great idea, and then consider the 4 quadrants and 12 axis you realise you really need to make larger circles of professional contacts.

Zac’s article Valley Boys Run MySQL talks about the new breed of Web entrepreneurs and Web 2.0. In particular check out
Valley Boys Digg.com’s Kevin Rose leads a new brat pack of young entrepreneurs
. This is the new wave of success that works without VC.

For me this recent post on the Meebo Blog really impressed me.

365 days ago

Nineteen releases, eleven (fantastic) team members, and 295,321 gummy bears later we see over a million users log into meebo each day. As a coder, all you hope is that your service will be able help people go about their day-to-day lives. Stability, bugs, and good usability are always top of mind.

1,000,000 users of your Web 2.0 application. This impresses and motivates me. What makes it possible to anybody, is you can get a LAMP stack, a live-cd, cheap hosting and you can turn your idea into something real for next to no cost. Of course, I won’t start on the nightmares out there of great ideas that are very poorly designed. At least the underlying stack can support anything you want to achieve, and MySQL is behind these success stories.

One more thing on Meebo, Check out the meebo map!. I’ve been told that Google has something of a similar nature at the Googleplex. Well I can say I’m very keen to see this, and will be 8 weeks time when the First MySQL Camp is held.



MySQL Trigger Features

Sheeri talked a little about MySQL Triggers in One TRIGGER Fact, Optimizer Rewriting Stuff. While it’s great that MySQL 5.0 has Triggers, there are some things that I feel could become features in future releases.

IF EXISTS

One of the beautiful features that MySQL has is IF EXISTS. This ternary operation that if the object exists performs the operation, of not it does nothing works wonders in reviewing logs for errors. One of the problems with Oracle for example, is the requirement to ignore the ORA errors for non-existent objects.

But this functionality doesn’t exist for Triggers? One must wonder why. I’d like to see this.

mysql> DROP TRIGGER IF EXISTS trigger_name;

MySQL Manual DROP TRIGGER

REPLACE

On feature that simplifies the lack of IF EXISTS functionality using Oracle is REPLACE. The syntax is:

oracle> CREATE OR REPLACE TRIGGER trigger_name...

In this case, this functionality effectively eliminates the need for a DROP IF EXISTS, however I’d not like to see this introduced into MySQL especially it’s an Oracle specific syntax and not ANSI SQL.

Multiple Triggers of same type per table

MySQL only allows one Trigger per type per table. That is, there can only be one BEFORE UPDATE for example. While you may ask the question why you would need this functionality. Here is a typical situation.

You use Triggers to perform some level of business functionality, determining values for optimised (denormalised) columns is a good example. So you need to write an appropiate trigger for that piece of functionality.
You also use Triggers to perform database auditing, that is for every insert/update/delete of data, you record a full copy of the change in an audit database. One way to ensure this is consistent across the entire database is to implement via triggers. So you leverage programming functionality to pre-create triggers for all tables to manage this.

The problem with MySQL occurs in that you have to now merge these triggers for tables that require both. If you want to deploy your application into a test environment, you may wish to not deploy your auditing triggers, but now you have this functionality mixed in with business logic.

Multi Type Triggers

Another cool Oracle feature is the capacity to define a trigger for multiple types. For example:

oracle> CREATE OR REPLACE TRIGGER trigger_name
oracle> BEFORE INSERT OR DELETE OR UPDATE ON table_name ...

In this example, with MySQL you would need to create three seperate triggers.

Other features

While not as important and one would need to consider if necessary in MySQL are some other Oracle provided trigger functionality. This includes:

  • WHEN CLAUSE trigger restriction
  • Triggers on DDL Statements such as CREATE, DROP, ALTER
  • Triggers on Database events such as LOGON, LOGOFF, STARTUP, and SHUTDOWN
  • INSTEAD OF Triggers (for Views)
  • STATEMENT based trigger

For more information you can check out the Oracle Documentation on Introduction to Triggers, CREATE TRIGGER, Documentation Search on Triggers

Handling Error Levels in Logging

In reviewing some provided code to a client, I observed a number of actions contray to generally accepted practices regarding logging. This is what I provided as the general programming conventions with regardings to logging.

Using Log4J (http://logging.apache.org/log4j/docs/), which is generally accepted as the benchmark for all Java applications. This provides the following logging levels.

  • FATAL
  • ERROR
  • WARN
  • INFO
  • DEBUG
  • TRACE – from 1.2.12, latest is 1.2.13

A description for what handling should occur per logging level.

FATAL. As the name suggests, all processing should stop. Should logging include a FATAL, the process is a Failure.

ERROR. An error has occured, and this requires attention, and action. Generally processing should stop, however additional post processing, or an alternative path could occur. Should logging include an ERROR, the process is a Failure.

WARN. Something that is unexpected occured, however it doesn’t affect the general processing from succeeding successfully. If a process includes WARN and not FATAL/ERROR it should be considered successful.

INFO. Information Only. On high volume systems, this level of logging may even be turned off. This generally indicates key information values or steps, and can assist when enabled in longer running processes to identify where a process is. You don’t log errors at INFO.

DEBUG. For Debugging Purposes only.

Ok, well that sounds like common sense. Here is what I observed on a client site.

  • Code logs a FATAL, but continues processing
  • A FATAL is logged, yet the calling process reports success
  • An ERROR is logged, yet the calling process reports success.
  • A lot of WARN are logged, and this is misleading, as it appears more information regarding XML elements not processed (We are talking 20+ Warnings per batch process). From what I’ve observed, these don’t require futher action, and should be changed in INFO.
  • Errors are being logged as INFO. A NullPointer RunTime Exception is logged as INFO. If an error provides an Exception argument where a stack trace is printed, it ain’t an INFO message.

Brisbane Users Group – MySQL Hackfest

Last night we had a number of keen souls at the Brisbane MySQL User Group. I was very impressed to see the majority of people with laptops at hand.

You can download my slides and code examples at my Articles page.

In our hands on Hackfest tutorial we created the new command SHOW USERGROUPS. Before anybody makes a comment, it was stated in the presentation that this command was made a dummy one, and is a poor candidate for two reasons.

  1. The results should be more dynamic, rather then hardcoded into the source tree
  2. USERGROUPS is not an ideal name due to comparisions to Users, Groups, Roles etc

Still it was productive, here was the outcome of our work for the evening.

mysql> SHOW USERGROUPS;
+----------------------+-------------------------+-----------------------------+
| Name                 | Location                | Website                     |
+----------------------+-------------------------+-----------------------------+
| Brisbane Users Group | Brisbane, QLD Australia | http://mysql.meetup.com/84/ |
| Boston Users Group   | Boston, USA             |                             |
+----------------------+-------------------------+-----------------------------+
2 rows in set (0.00 sec)

The trailing slash ‘/’ Hitcho’s contribution :)

mysql> SHOW AUTHORS;
+--------------------------------+---------------------------------------+----------------------------------------------------------------------+
| Name                           | Location                              | Comment                             |
+--------------------------------+---------------------------------------+----------------------------------------------------------------------+
| Brian (Krow) Aker              | Seattle, WA, USA                      | Architecture, archive, federated, bunch of little stuff :)           |
...
 Brisbane Users Group           | Brisbane, QLD Australia               | New Command SHOW USERGROUPS                             |
+--------------------------------+---------------------------------------+----------------------------------------------------------------------+
75 rows in set (0.05 sec)

I’ll be publishing more beginner tutorials in my MySQL Compiling Category over time.

Update Check out http://www.arabx.com.au/hackfest.diff.htm for a diff of files to produce this output.

Compiling MySQL Tutorial 3 – Debugging Output

Continuing on from Tutorial 2.

When reviewing the 2.1. C/C++ Coding Guidelines for MySQL, you will see that the MySQL Source uses within the C/C++ code DBUG (Fred Fish’s debug library). This is one of the 3rd party open source products that are documented in the Internals 1.4. The Open-source Directories.

You will also find some usage in the MySQL Manual E.3 The DBUG Package. So enough talk, how do you use this.

DBUG

You get the DBUG output by running a mysqld debug version with the argument –debug. The output for the SHOW AUTHORS commands is:

T@9190304: >dispatch_command
T@9190304: | >alloc_root
..
T@9190304: | query: SHOW AUTHORS
T@9190304: | >mysql_parse
T@9190304: | | >mysql_init_query
T@9190304: | | | >lex_start
T@9190304: | | | | >alloc_root
T@9190304: | | | | | enter: root: 0x9590b38
T@9190304: | | | | | exit: ptr: 0x95c1c70
T@9190304: | | | | <alloc_root
T@9190304: | | | <lex_start
T@9190304: | | | >mysql_reset_thd_for_next_command
T@9190304: | | | <mysql_reset_thd_for_next_command
T@9190304: | | <mysql_init_query
T@9190304: | | >Query_cache::send_result_to_client
T@9190304: | | <query_cache ::send_result_to_client
T@9190304: | | >mysql_execute_command
T@9190304: | | | >mysqld_show_authors
T@9190304: | | | | >alloc_root
T@9190304: | | | | | enter: root: 0x9590b38
T@9190304: | | | | | exit: ptr: 0x95c1c78
T@9190304: | | | | <alloc_root
...
T@9190304: | | | | >send_fields
T@9190304: | | | | | packet_header: Memory: 0x8c2480  Bytes: (4) 01 00 00 01
T@9190304: | | | | | >alloc_root
T@9190304: | | | | >Protocol::write
T@9190304: | | | | <protocol ::write
...
T@9190304: | | | | >send_eof
T@9190304: | | | | | packet_header: Memory: 0x8c2550  Bytes: (4) 05 00 00 51
T@9190304: | | | | | >net_flush
T@9190304: | | | | | | >vio_is_blocking
T@9190304: | | | | | | | exit: 0
T@9190304: | | | | | | <vio_is_blocking
T@9190304: | | | | | | >net_real_write
T@9190304: | | | | | | | >vio_write
T@9190304: | | | | | | | | enter: sd: 23, buf: 0x0x95b9bd0, size: 5107
T@9190304: | | | | | | | | exit: 5107
T@9190304: | | | | | | | <vio_write
T@9190304: | | | | | | <net_real_write
T@9190304: | | | | | <net_flush
T@9190304: | | | | | info: EOF sent, so no more error sending allowed
T@9190304: | | | | <send_eof
T@9190304: | | | <mysqld_show_authors
T@9190304: | | <mysql_execute_command
T@9190304: | | >query_cache_end_of_result
T@9190304: | | <query_cache_end_of_result
...
T@9190304: | <mysql_parse
T@9190304: | info: query ready
T@9190304: | >log_slow_statement
T@9190304: | <log_slow_statement
T@9190304: >dispatch_command

Using some of the debug options you get the following output with –debug=d,info,error,query,general,where:O,/tmp/mysqld.trace

create_new_thread: info: creating thread 3
create_new_thread: info: Thread created
?func: info: handle_one_connection called by thread 3
?func: info: New connection received on socket (23)
?func: info: Host: localhost
?func: info: vio_read returned -1,  errno: 11
thr_alarm: info: reschedule
process_alarm: info: sig: 14  active alarms: 1
?func: info: client_character_set: 8
check_user: info: Capabilities: 238213  packet_length: 16777216  Host: 'localhost'  Login user: 'rbradfor' Priv_user: ''  Using password: no Access: 0  db: '*none*'
send_ok: info: affected_rows: 0  id: 0  status: 2  warning_count: 0
send_ok: info: OK sent, so no more error sending allowed
do_command: info: vio_read returned -1,  errno: 11
thr_alarm: info: reschedule
process_alarm: info: sig: 14  active alarms: 1
do_command: info: Command on socket (23) = 3 (Query)
dispatch_command: query: SHOW AUTHORS
send_eof: info: EOF sent, so no more error sending allowed
dispatch_command: info: query ready
do_command: info: vio_read returned -1,  errno: 11
thr_alarm: info: reschedule
process_alarm: info: sig: 14  active alarms: 1
do_command: info: Command on socket (23) = 1 (Quit)
~THD(): info: freeing security context
end_thread: info: sending a broadcast
end_thread: info: unlocked thread_count mutex


I have found the following format good for comparing with the MySQL Source Code. —debug=d:N:F:L:O,/tmp/mysqld.trace

   49:   sql_parse.cc:  1536: do_command: info: Command on socket (23) = 3 (Query)
   50:     my_alloc.c:   172: alloc_root: enter: root: 0x9068b38
   51:     my_alloc.c:   219: alloc_root: exit: ptr: 0x9099c30
   52:   safemalloc.c:   128: _mymalloc: enter: Size: 16384
   53:   safemalloc.c:   197: _mymalloc: exit: ptr: 0x909cc80
   54:   safemalloc.c:   262: _myfree: enter: ptr: 0x9095bf8
   55:   sql_parse.cc:  1757: dispatch_command: query: SHOW AUTHORS
..
   88:    sql_show.cc:   385: mysqld_show_authors: packet_header: Memory: 0x940590  Bytes: (4)
..
  166:    protocol.cc:   385: send_eof: packet_header: Memory: 0x940550  Bytes: (4)
05 00 00 51
  167:    viosocket.c:   184: vio_is_blocking: exit: 0
  168:    viosocket.c:   105: vio_write: enter: sd: 23, buf: 0x0x9091bd0, size: 5107
  169:    viosocket.c:   117: vio_write: exit: 5107
  170:    protocol.cc:   342: send_eof: info: EOF sent, so no more error sending allowed
  171:     sql_lex.cc:   199: lex_end: enter: lex: 0x9068b58
  172:   sql_parse.cc:  1796: dispatch_command: info: query ready
  173:     my_alloc.c:   331: free_root: enter: root: 0x9068b38  flags: 1
  174:    viosocket.c:   184: vio_is_blocking: exit: 0
  175:    viosocket.c:    36: vio_read: enter: sd: 23, buf: 0x0x9091bd0, size: 4
  176:    viosocket.c:    49: vio_read: vio_error: Got error 11 during read
  177:    viosocket.c:    52: vio_read: exit: -1
  178:   sql_parse.cc:   809: do_command: info: vio_read returned -1,  errno: 11

  190:   sql_parse.cc:  1536: do_command: info: Command on socket (23) = 1 (Quit)
  191:     my_alloc.c:   331: free_root: enter: root: 0x9068b38  flags: 1
  192:      mysqld.cc:  1681: close_connection: enter: fd: socket (23)  error: ''
  193: sql_handler.cc:   628: mysql_ha_flush: enter: tables: (nil)  mode_flags: 0x02

This will make a lot more sense with my upcoming presentation where you will see clear use and interaction between source files such as sql_parse.cc & sql_show.cc.

For more information check out Making Trace Files

It's makes me cry

I got home today and sat down to read my home email list. Nothing new. But on a MySQL mailing list, there was an enquiry why performance was slowing in a given application. I didn’t even have to read the situation, nor the problem, it took less then the 200ms mentioned to identify the problem looking at the supplied schema.

In summary, the first table in the schema had a primary key of VARBINARY(255) and a engine type of Innodb. Hold on, wait, it’s a concatenated key of two VARBINARY(255) columns. And I should mention, that primary key was a foreign key in the next table. If this was a home website app with one user, ok well it’s still bad, but this application was having performance problems with reasonable volumes of transactions, it’s not a beginner application. (A recent reference If you don’t know your data, you don’t know your application). Where do people learn this from!

Now I want to work on a large scale out MySQL environment, an organisation or site that’s going places, but the world of even normal every day applications is littered with these basic fundamental design errors. It’s make me cry.

My last contract had many flaws across the business, and the Java code for production applications was just as bad. Check out things like What constitutes a good error message to the user?, Why IT professionals get a bad name (I could easily write one of these a day for a long time, or you could check out The Daily WFT). You think the new contract with an large existing multinational company and world-wide installed Java application I could get an improvement. I’ll leave these stories for another time. I don’t want to have to deal with this.

I’ve got 16 half written blogs on cool MySQL features or interesting topics, but I never seem to get the motivation at the end of the day. I just want to know, how do people in IT get away with this. Where is the Pride in people’s work, where is the desire for people to make what they do a better place. Where is:

An opportunity that’s a challenge, including involving hard work and tight time frames, but a job that provides the rewards of job satisfaction for a productive contribution. An importance on quality, emphasis on continued improvement, and goals of simplicity in complex situations while working in a team environment are also necessary.

SHOW STATUS Gotcha

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

In summary (without the surrounding fan-fare, I was seeking):

SELECT VERSION();
SHOW STATUS;
SHOW VARIABLES;  // Optional

I was however perplexed why my first data point analysis (Read/Write ratio) using the Status values Com_insert, Com_update, Com_delete and Com_select was not always giving me expected results. In particular, a number of server results showed 0 for values while I knew the results came from working MySQL environments.

So, sanity check with good friend Morgan and I get the response to answer the dilemma SHOW STATUS defaults to session based statistics in 5.0+, there’s a million people it’s caught out, and many bugs about “xyz variable not being incremented”.

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

So I’m really seeking:

SELECT VERSION();
SHOW GLOBAL STATUS;
SHOW GLOBAL VARIABLES;  // Optional

Well, that’s my MySQL trivia point for the day. MySQL between version 4.1 and 5.0 made a major change with the default value. The MySQL 5.0 Manual also states this.

UPDATE 23 Sep 2006

Following Eric’s Show @&!# status again!, I did then read the Bug #19093, and it was great to see Monty comment (that’s open source for you, enabling even founders to mix with the common folk), and I’d tend to agree to his comments. MySQL has attempted to correct something.

Perhaps the best compromise to the problem is to deprecate and drop the SHOW STATUS command, and make the scope component GLOBAL|SESSION compulsory, so SHOW GLOBAL STATUS or SHOW SESSION STATUS are the only valid commands!

Microfox ?

I’ve added Digg to my general lunch time reading web sites. I came across this yesterday. Microsoft invites Firefox development team to Redmond.

Well, isn’t that nice, the big boy opening his pond (including all the sharks) to the little fish. (if you don’t get it, it’s a line from Die Hard 2 about pissing in somebody else’s pond) What’s funny about the article is not the article but some of the comments, here are but a few.

  • Cool. Very cool.
  • they’re going to brainwash them into working for microsoft and their going to cast off the IE team to replace the firefox team
  • Just replace IE 7 with FF 2.0. Save every web developer 100s of hours.
  • “Come into my parlour,” said the spider to the fly
  • DON’T DO IT! My god, has noone seen Braveheart? The Untouchables? The Godfather? Scarface?
  • hey’re going to kidnap the Mozilla developers with a cunning trail of pizza and cola, and force them to work on IE8.
  • ITS A TRAP! http://itsatrap.ytmnd.com/

I like the last one, maybe just because of the cool picture on referenced site.

I see today an official response Firefox welcomes Microsoft’s offer of help

The fast pace of technology in a Web 2.0 world

I had need to goto the Wikipedia this morning to review the terminology of something, and on the front page in Today’s featured article is Mercury. Being a tad curious given I’d heard only on the radio a few hours ago that Pluto was no longer a planet in our Solar System, I drilled down to the bottom to check references to other planets (quicker then searching). So at the bottom I found the following graphic and details of The Solar System Summary.

Well blow me down, they didn’t waste any time there. Pluto is no longer a planet in our Solar System. It is now categorised as a Dwarf Planet.

There are over 100 edits to this page on Dwarf Planet in the past 12 hours, including all the links in Pluto and Solar System correctly referenced.

Now, that’s the power of content by the community, one of the characteristics of Web 2.0.
Of course all of this runs under the LAMP Stack and powered by the MySQL Database. Combined with the fact this is a breed of organisation that didn’t start with large amount of Venture Capital, another trend of the newer generation of popular and successful internet enterprises.

No Spam Today

Huh, tricked you. As if!

However I was looking at my Akismet Spam section in WordPress, the open source software that runs my blog, and it gave me this message.

Caught Spam

You have no spam currently in the queue. Must be your lucky day. :)