MySQL Conference – Scaling and High Availability Architectures Tutorial

My first tutorial today at MySQL Conference 2007 is Scaling and High Availability Architectures by Jeremy Cole and Eric Bergen of Proven Scaling.

Basic Tenets

While not discussed, the premise is to Cache Everything. Memcached is a key component of any scalable system.

Lifetime of a scalable system

Using the analogy of a growing child, Jeremy stepped us through the stages: Newborn, Toddler, Teenager, Late teens to 20s, and Adult.

Late teens to 20s, which he termed “the awkward stage”, is where most systems die a slow death. This is where scalability is critical; a meltdown, for example, can ruin you. Downtime is also simply not acceptable to your user community.

When you’re an Adult you need to perfect the ability to deploy incremental changes to your application seamlessly.

As the system grows, optimizations that worked earlier may start to hurt it. It’s important to revisit them at each stage.

Partitioning

Most applications implement a horizontal partitioning model. Different components of your system can be scaled by a “partition key”. The models include fixed “hash key” partitioning, dynamic “directory” partitioning, partition by “group” and partition by “user”.

The Dynamic “directory” is a lot harder to implement, but is ultimately more scalable.

One of the difficulties of partitioning is inter-partition interaction. Ultimately the solution is duplicating metadata or duplicating data. Overall reporting is also more difficult: what if we want an average of users per location when we partition by user? Most systems are user driven and partition by user. A newer strategy is to partition by group.

To implement Fixed Hash Key partitioning:

  • Divide data into B buckets
  • Divide the B buckets over M machines

You define 1024 physical buckets (a number that divides easily), numbered 0-1023 (user_id % 1024). Buckets are then mapped by range to physical machines: 0-255, 256-511, 512-767, 768-1023. The plus side is that it is very easy to implement, and you can always derive where something is. The biggest problem is scalability, e.g. going from 4 machines to 5. You also don’t have any fine-grained control over buckets.
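As a rough illustration of the fixed hash key scheme, here is a minimal Python sketch. The machine names and the helper `moved_buckets` are hypothetical, invented for this example; only the 1024 buckets, the `user_id % 1024` rule and the four ranges come from the talk.

```python
from bisect import bisect_right

NUM_BUCKETS = 1024

# First bucket of each range and the machine that owns it.
# Machine names are hypothetical placeholders.
RANGE_STARTS = [0, 256, 512, 768]
MACHINES = ["db1", "db2", "db3", "db4"]


def bucket_for(user_id: int) -> int:
    """Derive the bucket directly from the partition key."""
    return user_id % NUM_BUCKETS


def machine_for(user_id: int) -> str:
    """Map the bucket to a machine by range lookup."""
    index = bisect_right(RANGE_STARTS, bucket_for(user_id)) - 1
    return MACHINES[index]


def moved_buckets(old_machines: int, new_machines: int) -> int:
    """Count buckets whose owner changes if the buckets are simply
    re-split evenly over a different number of machines."""
    return sum(
        1
        for b in range(NUM_BUCKETS)
        if b * old_machines // NUM_BUCKETS != b * new_machines // NUM_BUCKETS
    )
```

Calling `moved_buckets(4, 5)` shows why growing is painful under this assumed even re-split: roughly half of the buckets end up on a different machine.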

For Dynamic Directory partitioning you maintain a database of mappings to partitions. A user can easily be moved at a later date, at a much finer grain. MySQL Cluster is designed for this type of application. It is not necessary, however; a well-configured InnoDB solution on suitable hardware with memcached can easily provide the same functionality. The only writes are new users or updated partition keys, with a lot of reads.
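The directory idea can be sketched in a few lines of Python. This is only an illustration: the class name `PartitionDirectory` is hypothetical, and two plain dictionaries stand in for the mapping database and for memcached.

```python
class PartitionDirectory:
    """Toy directory: user_id -> partition, with a read-through cache."""

    def __init__(self):
        self._directory = {}  # stands in for the mapping database
        self._cache = {}      # stands in for memcached

    def assign(self, user_id, partition):
        """Rare write: register a new user or move an existing one."""
        self._directory[user_id] = partition
        self._cache.pop(user_id, None)  # invalidate any stale cache entry

    def lookup(self, user_id):
        """Frequent read: serve from the cache when possible."""
        if user_id in self._cache:
            return self._cache[user_id]
        partition = self._directory[user_id]  # KeyError for unknown users
        self._cache[user_id] = partition
        return partition
```

Moving a single user is then just `directory.assign(42, "partition_b")`, which is the fine-grained control the fixed hash scheme lacks.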

HiveDB

This open source product implements a “standard” partition-by-key MySQL system written in Java.
Many organizations have built a somewhat similar system in-house, but this is an example of one that has been open sourced.

More information at www.hivedb.org.

The Hive API should be the only code that needs to be rewritten in the application development language (e.g. PHP, Ruby) when needed.

High Availability

The obvious goals.

  • Avoid downtime due to failures.
  • No single point of failure.
  • Extremely fast failover.
  • No dependency on DNS changes.
  • No dependency on code changes.
  • Painless and seamless failover.
  • Fail-back must be just as painless.

The overall objective is speed.

Understanding MySQL Replication is important to understanding HA options.

MySQL Replication is Master-Slave One Way asynchronous replication.

  • Slave requests binary logs from last position.
  • Master sends binary logs up to current time.
  • Master keeps sending binary logs in real-time.
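To make the pull model concrete, here is a deliberately simplified Python sketch. The `Master` and `Slave` classes are hypothetical; real MySQL replication tracks a binary log file name plus byte offset and streams events over a connection, whereas this toy uses a list index as the position.

```python
class Master:
    """Toy master: every write is appended to an in-memory binary log."""

    def __init__(self):
        self.binlog = []

    def write(self, event):
        # The master commits without waiting for any slave (asynchronous).
        self.binlog.append(event)

    def events_since(self, position):
        """Return all events from the given position up to now."""
        return self.binlog[position:]


class Slave:
    """Toy slave: remembers its last position and asks for what follows."""

    def __init__(self):
        self.position = 0
        self.applied = []

    def pull(self, master):
        for event in master.events_since(self.position):
            self.applied.append(event)  # apply events one at a time, in order
            self.position += 1
```

Anything the master has written but the slave has not yet pulled is the replication lag, and is what can be lost if the master fails at that moment.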

Dual Master provides an easy configuration for failover, but it doesn’t provide any throughput benefit. It can help with online schema changes without downtime, assuming existing queries will work against both the pre- and post-change schema (the tip is to SET SQL_LOG_BIN=0 for the session). There are a number of caveats.

Ultimately, for High Availability you have a trade-off between data loss (minute) and scalability.

CU@UC07


I’ll be speaking at the upcoming 2007 MySQL Conference & Expo (Why they dropped the word User, who knows), this time with Guy Harrison (Author of many books including MySQL Stored Procedures). We will be talking on MySQL for Oracle DBAs and Developers.

Anyway, good friend Paul McCullagh, creator of PBXT, will also be speaking, on PrimeBase XT: Design and Implementation of a Transactional Storage Engine. He coined “CU at the UC” in an email to me. I’ve done a further level of refactoring, and added marketing. You can buy the shirt online here. (More colors, including black, and more products are coming; if you want one now, please ask.)

MySQL Camp T-Shirts


For those that attended the MySQL Camp at Google HQ late last year, you may have seen me with my own T-Shirt designs. A number of people inquired about getting them. I’ve finally got around to making them available online, so anybody that would like one can order online.

There are two different shirts. If you want your name on the shirt, you need to make sure you choose the correct one.

  • Early Adopters – For those that were the first 48 that signed up, your name as well as position and company are on the shirt.
  • The Herd – For everybody that registered on the website, your name is on the shirt.

Ok. I’ve already been asked why 48. This was the number of registrants when I got the shirt made back in Australia a week or so before the Camp.

There are also plenty more of my MySQL designs at my MySQL Wear Store.

For those that also liked the runner-up pin “A mouse is a waste of a perfectly good hand”, you can also get this in its original graphical shirt design at Geek Cool – CLI.

Pluggable Storage Engines – What is the potential?

I started this post a month ago, but after Kaj’s discussion on the same topic at the MySQL Camp I figured it was time to post.

I had dinner with a friend recently (a very smart friend), and our conversation led him to ask “What’s different with MySQL?”. One of the things I tried to describe was the “Pluggable Storage Engine Architecture” (PSE) potential for the future that I expect will set MySQL apart from all other Open Source and even commercial databases.

Here are some details of the example I tried to provide, given somebody who understands enough of the general principles of RDBMS’s.

Consider that information (intelligent data) could be made available within a relational database via the appropriate tools and language (e.g. SQL) without being physically constrained to tables, columns and rows, plus an application to manage that data, which is the present traditional approach. Let’s use images that you take with your digital camera as an example.

In a typical RDBMS application you would create a number of tables to manage your data, with links to the images etc. Of course you would also need an application to both view and manage this information.

What if you simply pointed your database at a directory of images and were then able to query information such as photos by date, by size, by album, from a certain location, with a given keyword etc.? Most of this information about digital photographs is already there, encoded in the Exif format that is embedded within JPEG images.

So what’s missing from this information? Tags and comments are the most obvious, because this information can’t be determined electronically; it is something that humans supply. If you could also embed this information into an image with a suitable tool, then you would be ready to manage your photos.

A further extension would be to have Image Analysis capabilities that enabled you to search for photos that contained the sky, or people, or something that was the color red.

What if, in the future, your camera had a built-in GPS and this information was recorded within Exif, along with the ability to link your output to popular online mapping software such as Google Maps? You could then use your digital camera to track your movements, taking photos that plot your path over a holiday, and also enabling location-based queries.

It was interesting to postulate what ideas may be possible in the future. I suspect that it won’t be long before we actually see this. So what are the other possibilities that you may not have considered? Another example may be an MP3 jukebox-style PSE, managing all the information held in the ID3 tags of MP3 files, allowing you to do with music what could be done with images.

References

Exif Example

Here is some example Exif content, extracted using ExifTool:

 ./exiftool ~/Desktop/2006_02_23_AirShow/IMG_5966.JPG
ExifTool Version Number         : 6.50
File Name                       : IMG_5966.JPG
Directory                       : /home/rbradfor/Desktop/2006_02_23_AirShow
File Size                       : 2 MB
File Modification Date/Time     : 2006:09:24 17:44:32
File Type                       : JPEG
MIME Type                       : image/jpeg
Make                            : Canon
Camera Model Name               : Canon EOS 300D DIGITAL
Orientation                     : Horizontal (normal)
X Resolution                    : 180
Y Resolution                    : 180
Resolution Unit                 : inches
Modify Date                     : 2006:02:23 16:01:56
Y Cb Cr Positioning             : Centered
Exposure Time                   : 1/320
F Number                        : 10.0
ISO                             : 200
Exif Version                    : 0221
Date/Time Original              : 2006:02:23 16:01:56
Create Date                     : 2006:02:23 16:01:56
Components Configuration        : YCbCr
Compressed Bits Per Pixel       : 3
Shutter Speed Value             : 1/320
Aperture Value                  : 10.0
Max Aperture Value              : 3.5
Flash                           : No Flash
Focal Length                    : 18.0mm
Macro Mode                      : Unknown (0)
Self-timer                      : 0
Quality                         : Fine
Canon Flash Mode                : Off
Continuous Drive                : Single
Focus Mode                      : AI Focus AF
Canon Image Size                : Large
Easy Mode                       : Manual
Digital Zoom                    : Unknown (-1)
Contrast                        : +1
Saturation                      : +1
Sharpness                       : +1
Camera ISO                      : n/a
Metering Mode                   : Evaluative
Focus Range                     : Not Known
AF Point                        : Manual AF point selection
Canon Exposure Mode             : Program AE
Lens Type                       : Unknown (-1)
Long Focal                      : 55
Short Focal                     : 18
Focal Units                     : 1
Max Aperture                    : 3.6
Min Aperture                    : 22
Flash Activity                  : 0
Flash Bits                      : (none)
Zoom Source Width               : 3072
Zoom Target Width               : 3072
Color Tone                      : Normal
Focal Plane X Size              : 23.22mm
Focal Plane Y Size              : 15.49mm
Auto ISO                        : 100
Base ISO                        : 200
Measured EV                     : 9.00
Target Aperture                 : 10
Target Exposure Time            : 1/318
Exposure Compensation           : 0
White Balance                   : Auto
Slow Shutter                    : None
Shot Number In Continuous Burst : 0
Flash Guide Number              : 0
Flash Exposure Compensation     : 0
Auto Exposure Bracketing        : Off
AEB Bracket Value               : 0
Focus Distance Upper            : -0.01
Focus Distance Lower            : 5.46
Bulb Duration                   : 0
Camera Type                     : EOS Mid-range
Auto Rotate                     : None
ND Filter                       : Unknown (-1)
Self-timer 2                    : 0
Bracket Mode                    : Off
Bracket Value                   : 0
Bracket Shot Number             : 0
Canon Image Type                : IMG:EOS 300D DIGITAL JPEG
Canon Firmware Version          : Firmware Version 1.1.1
Camera Body No.                 : 0930402471
Serial Number Format            : Format 1
File Number                     : 159-5966
Owner's Name                    :
Canon Model ID                  : EOS Digital Rebel / 300D / Kiss Digital
Canon File Length               : 2387078
WB RGGB Levels Auto             : 1726 832 831 948
WB RGGB Levels Daylight         : 0 0 0 0
WB RGGB Levels Shade            : 0 0 0 0
WB RGGB Levels Cloudy           : 0 0 0 0
WB RGGB Levels Tungsten         : 0 0 0 0
WB RGGB Levels Fluorescent      : 0 0 0 0
WB RGGB Levels Flash            : 0 0 0 0
WB RGGB Levels Custom           : 0 0 0 0
WB RGGB Levels Kelvin           : 0 0 0 0
Color Temperature               : 5200
Num AF Points                   : 7
Canon Image Width               : 3072
Canon Image Height              : 2048
Canon Image Width As Shot       : 3072
Canon Image Height As Shot      : 2048
AF Points Used                  : Mid-left
Preview Quality                 : Normal
Preview Image Length            : 278318
Preview Image Width             : 1536
Preview Image Height            : 1024
Preview Image Start             : 2108760
Preview Focal Plane X Resolution: 3443.9
Preview Focal Plane Y Resolution: 3442.0
User Comment                    :
Flashpix Version                : 0100
Color Space                     : sRGB
Exif Image Width                : 3072
Exif Image Length               : 2048
Interoperability Index          : R98 - DCF basic file (sRGB)
Interoperability Version        : 0100
Related Image Width             : 3072
Related Image Length            : 2048
Focal Plane X Resolution        : 3443.946
Focal Plane Y Resolution        : 3442.017
Focal Plane Resolution Unit     : inches
Sensing Method                  : One-chip color area
File Source                     : Digital Camera
Custom Rendered                 : Normal
Exposure Mode                   : Auto
Scene Capture Type              : Standard
Compression                     : JPEG (old-style)
Thumbnail Offset                : 2560
Thumbnail Length                : 7680
Image Width                     : 3072
Image Height                    : 2048
Aperture                        : 10.0
Drive Mode                      : Single-frame shooting
Flash                           : Off
Image Size                      : 3072x2048
Lens                            : 18.0 - 55.0mm
Preview Image                   : (Binary data 278318 bytes, use -b option to extract)
Preview Image Size              : 1536x1024
Scale Factor To 35mm Equivalent : 1.6
Shooting Mode                   : Program AE
Shutter Speed                   : 1/320
Thumbnail Image                 : (Binary data 7680 bytes, use -b option to extract)
WB RGGB Levels                  : 1726 832 831 948
Blue Balance                    : 1.140108
Circle Of Confusion             : 0.019 mm
Focal Length                    : 18.0mm (35mm equivalent: 27.9mm)
Hyperfocal Distance             : 1.67 m
LV                              : 14.0
Lens                            : 18.0 - 55.0mm (35mm equivalent: 27.9 - 85.3mm)
Red Balance                     : 2.075767
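To show how output like the listing above could feed the kind of queries discussed earlier, here is a small Python sketch that turns the `Tag : Value` text into a dictionary. The function names are hypothetical, and this parses the printed text only; it is not how a storage engine would read Exif data.

```python
from datetime import datetime


def parse_exif_listing(text):
    """Parse 'Tag Name : value' lines into a dict.

    Lines without a colon (such as the command line itself) are skipped.
    When a tag appears twice (e.g. Flash, Lens), the later value wins.
    """
    info = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if not separator:
            continue
        info[key.strip()] = value.strip()
    return info


def taken_at(info):
    """Convert the 'Date/Time Original' tag to a datetime."""
    return datetime.strptime(info["Date/Time Original"], "%Y:%m:%d %H:%M:%S")


# A few lines copied from the listing above.
SAMPLE = """\
File Name                       : IMG_5966.JPG
Camera Model Name               : Canon EOS 300D DIGITAL
Date/Time Original              : 2006:02:23 16:01:56
Owner's Name                    :
Image Size                      : 3072x2048
"""
```

With a list of such dictionaries, one per image, "photos by date" or "photos by camera" becomes a simple filter on `taken_at(info)` or `info["Camera Model Name"]`.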

The desire for Performance SQL Tips

It seems people are clamoring for a more consolidated help guide for SQL Performance tips.

Jay Pipes at the MySQL Camp ran a session Interactive Top 10 SQL performance Tips. There was plenty of input and discussion, and at the time Sheeri simply typed them into a wiki page for later work.

Well it seems even that rough list is popular at Del.icio.us ranking near the top of the Hot List on the front page. I saw it earlier and it was second or third, but didn’t think of taking a screen shot until now, but it’s still high.

I’d say that we could easily get a Top 10 for up to 10 different categories. Good luck Jay.

The MySQL Joust

At our MySQL Camp Jay and Brian pitted off in the Umbrella Joust. Not sure if there was a winner or a loser, but in the end no blood was spilt (except Leslie’s, but that’s another story).

See these and more camp photos at Flickr.




The Falcon!

Some early notes by Brian Aker on Falcon as discussed at the MySQL Camp.

Falcon is a transactional engine MySQL will be introducing. The first discussions were held about 3 years ago with Ann Harrison, and about 1 1/2 years ago MySQL started taking the possibilities seriously.

Falcon is not an InnoDB replacement. It takes a different approach to how transactions are viewed and managed, and to how the engine is designed. It flips around the way data is stored. Some points:

  • It uses as much memory as possible, like Oracle SGA or InnoDB pool.
  • It has a row cache not a page cache for more optimal memory use.
  • No locking at all. Jim doesn’t believe in it for concurrency control. It has total versioning.
  • Falcon has to keep all changes in memory, so not great for user transactions that may take longer
  • Characteristics – Well optimised for short fast web transactions, Designed for environments with lots of memory.

In general discussion, a fear was raised from the floor that there will be so many storage engine options that you will need a matrix of what is good for what.

In conclusion, Brian mentioned it will be alpha before the end of year.

MyISAM++

Monty gave us a quick overview of next generation of MyISAM. It is set to include:

  • New data disk format
  • Transaction support
  • multi-versioning
  • row level locking and escalation to table level locks. (interesting)
  • bitmap indexes and new table-scan-optimizing indexes with up to 1000x performance.

No details of time frame were given for delivery, however development is well underway.

Doxygen Project

What the?

Well this is the inheritance diagram of the Item Class in the MySQL 5.1 Source tree, nicely documented using the Doxygen tool as mentioned by Jay in his presentation at MySQL Camp.

Jay started the Community Doxygen Project on the Forge to improve the level of commenting enabling a better platform for the community to contribute MySQL server code changes.

At this early stage David Shrewsbury is working on fine tuning initial documentation examples for QA and review. You can check out the Status Page of automated commenting conversion.

You can see the present documentation of MySQL 5.1 source here.

The joys of working at Google


So, mid morning on Day #3 of the MySQL Camp, especially after a heavy and late night drinking with new friends in Palo Alto, I was seeking a high-caffeine pick-me-up drink. Yesterday I had a Bawls, and after enjoying it I was a little concerned that when I returned to New York I would not be able to buy it. You can get it at Think Geek but that’s more complicated than a local supermarket.

So after getting a Googler to take me to the cafe fridge, we find out that there weren’t any there. No problem, let’s just go this way, I’m told. So we start a quick tour of the larger cafe area and another set of fridges, but no Bawls; we keep walking, and still none. At this point the recommendation is that I should try a Rockstar, but so far no luck there either. Then we head into another area (all in the same building) to a micro kitchen: no Bawls and no Rockstar, but man, it’s an entire kitchen, with something like 15+ cereals alone on tap. My host is now committed to finding me his recommendation, a Rockstar. Off we go, still in the same building, upstairs. By this time I’m blown away. The first desk I walk past has a Dell 24″ Widescreen LCD monitor, of which I have one (see my comments here). One blink later I see a desk in the next area with two side by side in vertical mode. Blink again, and in this area there are four desks, each with twin Dell 24″ Widescreen LCD monitors. They were everywhere I looked. WOW!!!!

So at the next micro kitchen we finally achieve our objective. A Rockstar. Sometimes the journey is just as rewarding as the prize. In this case I got a quick and very amazing tour of just a small part of one building.

Some details about the Rockstar. Firstly, not bad; it certainly was a pick-me-up, and it didn’t take long to kick in. And then I looked at the Supplement Facts on the can: % Daily Value = 130%. So this is today’s total intake plus 1/3 of tomorrow’s. Oops! Breakfast, morning tea, and now Indian for lunch! (Update. I’m a dick, hence the need for the drink originally as beer killed too many brain cells. As pointed out, it’s a value of 130 for calories and blank for % daily value.)

I was also told that Googlers can get addicted to the high energy drinks here. Yes, I’m certain that is true.

Return to Google Lobby – Camp Photo


Early on Sunday, Day #3, I dragged a few willing participants out for a “different photo” based on the umbrellas in each lobby. It worked out well. Special thanks to Kynan, who ran around to other lobbies to find additional umbrellas. (He is the one holding the white one, and yes, that’s a utility kilt.)

I’ll be uploading more in this series to My Flickr Photos MySQL Camp 01 soon.



You can get a larger copy of image Here.

MySQL Replibeertion

MySQL Replibeertion was the last scheduled session on Day 2. Notwithstanding the free beer (a lot of it), there was a serious side with a Replication Discussion.

One of the first questions by Jeremy was “Are there any big replication users?” to which Sheeri quickly replied “Are you calling me fat again”.

This was a highly interactive session, here are some of the points from the audience.

Some Uses of Replication

  • Backup
  • Hot standby
  • Scaling
  • Data Warehousing
    • Slaves are larger than your database
  • For no special reason
  • Consolidation of multiple sources
  • Support for multiple indexes

Issues

  • Can break
  • replication lag
  • bi-directional replication is not supported
  • hard to set up: replication/initialization is not at the point where you run one command
  • hard to diagnose when the slave is out of sync (working but broken)
  • does not manage binary logs for you (max-bin-logs )
  • no row level replication (5.1 row based replication, change based replication, good and bad )
  • Serialized execution on the slave
  • Master does not keep track of the slaves (to the master, the slave is just another connection) Jeremy comment “it really really sucks in production systems.”
  • No multi-master replication. A slave cannot have more than one master.
  • ring replication. No idea when something breaks what is right.
  • no ability for delayed duplication.
  • no way to get binlogs back. (manually tweak the info file)
  • Master doesn’t care what data is on the slave.
  • A replication backup is really only good for the restoration of that machine
  • Default reconnection timeout is way, way too high (default of 1 minute). It should be at most 1 second with exponential backoff. (master-connect-retry): no backoff, no max number of retries, logged in the error log every time.
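The reconnection behaviour the audience asked for can be sketched as follows. This is a hypothetical illustration in Python, not how MySQL behaves; the function name and the parameter defaults are assumptions chosen to match the "at most 1 second" starting point and to cap at the current 1 minute.

```python
def backoff_delays(initial=1.0, factor=2.0, max_delay=60.0, max_retries=8):
    """Yield the wait (in seconds) before each reconnection attempt.

    The delay starts small, grows by `factor` after every failure, is
    capped at `max_delay`, and stops after `max_retries` attempts.
    """
    delay = initial
    for _ in range(max_retries):
        yield min(delay, max_delay)
        delay *= factor
```

A caller would sleep for each yielded delay between connection attempts and give up (or alert) once the generator is exhausted.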

Feature Requests (Things replication needs, what you want to see)

  • Delayed Replication
  • Registered Slaves in the Master
  • Import Binlog
  • Checksum Table Events (Need ability for table checksum to be added to binlog periodically so it can be checked by the slave.)
  • Global Sequence Number
  • Connect Retry Exponential Backoff
  • Heterogeneous Replication (Oracle to MySQL). Golden Gate Software has a commercial offering
  • Command Exclusion List (sql_log_bin=0)
  • Replication filters by data on the slave
  • Show upcoming queries, skip query
  • Multi-Master to one slave
  • piping mysqlbinlog output into the mysql client fails for some character sets.
  • Binlog index capability
  • Checksum of Binary Events to determine a command is valid
  • command line interface in mysqlbinlog so you could go backwards and forwards, then execute commands.

Check out more at Google Code Blog.

Day 2 – Memorable Quotes

Continuing on from my Day 1 – Memorable Quotes from the MySQL Camp.

“Are there any big replication users” — Jeremy “Are you calling me fat again” — Sheeri

“Only some of us have problems with interruptions.” — Jeremy to Jay

“It really really sucks in production systems.” — Jeremy About Slave management by Master.

“So there are like 12 people here, it must be the CEO’s turn to talk.” — Marten Mickos MySQL CEO

“Kegs and Eggs” — Joel S. Regarding all beer that will still be available at breakfast tomorrow.

“You can fight to the death for it”, Jeremy to his two employees Joel and Justin about who gets to be called employee #1.

“Patches go to employee #1″ — Ronald directed to Joel when a replication patch was coined by Jeremy and Eric.

“It’s a little like Google, there are no numbers”. In response to getting any dates/times on a commitment to functionality by MySQL.

“There is a way, but you don’t want to do it.” — Monty on a topic in using Replication Slave for Master Backups

“The Blackhole storage engine is really really scary. It’s not just the name, it’s a hack.” — Jeremy

“It will suck you in.” — More on the Blackhole Storage Engine.

“It still scares me.” — Jeremy, after a long discussion by Brian on the Blackhole architecture concluding with the transactional state.

“I’m not sure I’d buy that.” — Brian continuing on more comments about the blackhole discussion.

“Let’s not optimize things that won’t happen in the grand scheme of things” — Jeremy

“You asked what I wanted to see, not what was practical” — Sheeri

“Wasting network bandwidth is great” — Jeremy

“People do lots of weird things to do performance”.

“All you need is beer and love”.

“Oh” — Sheeri. Long pause. “Light bulb pops up” — Jeremy

“Are we eating Oracle’s lunch? No we are eating Oracle’s dessert” — Marten Mickos MySQL CEO

MySQL Winter of Code

Our first session in Day 2 of the MySQL Camp was the MySQL Winter of Code, as well as an overview of the QA Pilot program and Overview of the Community Doxygen Project by Kaj Arnö and Jay Pipes.

Starting with discussions on Code Contributions & MySQL Winter of Code

Quality Contributor Program

  • More coding happens during wintertime than in summer
  • MySQL has less contributions than many other Open Source projects
  • Contributor License Agreement
    • We want to reward contributions more than nominally
    • We want to encourage contributions in all areas
    • We prefer contributions in certain areas (especially encourage them)

Requirements for Winter of Code

  • A signed Contributor License Agreement
  • A well-formed proposal
  • Votes from the Community and/or MySQL

Topics for Winter of Code 2007

  • Connectors
    • Improvements in (pure drivers for) Perl, Apache APR, Python, Ruby
  • Storage Engines
    • File System Storage Engine
      • select directory,filename,size from files where size > 1000000;
      • select directory,sum(size) from files group by directory;
    • JPG/EXIF Storage Engine
      • update jpgfiles SET Author = 'name';
  • Anything
    • Full Text Search for CJK
    • MySQL GIS improvements
    • Your Idea
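The File System Storage Engine idea can be approximated today without writing an engine. The sketch below is an assumption-laden stand-in: it uses Python's `os.walk` to load file metadata into an in-memory SQLite table named `files`, so that the two example queries from the list above can be run as written. A real storage engine would expose the files to MySQL directly instead of copying metadata; the function name `load_files` is hypothetical.

```python
import os
import sqlite3


def load_files(root):
    """Walk `root` and load (directory, filename, size) into SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (directory TEXT, filename TEXT, size INTEGER)"
    )
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            conn.execute(
                "INSERT INTO files VALUES (?, ?, ?)",
                (directory, filename, size),
            )
    return conn
```

After `conn = load_files("/some/path")`, both `select directory,filename,size from files where size > 1000000;` and `select directory,sum(size) from files group by directory;` can be passed to `conn.execute(...)`.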

Which versions does it go to?

  • MySQL 5.1 Community Server
  • MySQL 5.2 Enterprise Server

MySQL Quality Contributor Program

  • Searching for Quality Contributors
    • Bug Reports
    • Test Cases
    • Bug Patches
  • Defining a Quality Contributor
  • Encouraging Quality Contributors
    • Fixing Bugs
    • Responsiveness and feedback
    • Recognition and attribution
    • Privileges/Awards

Day 1 – Memorable Quotes

Plenty of people are writing highly technical stuff from MySQL Camp, including yours truly. However there needs to be a lighter side here, and well, this is it: Memorable Quotes.

“That’s moderately easy to difficult.” Brian Aker talking about table_funcs in A MySQL Core Kernel

“That’s Trivial, it’s less than a day’s work”, Monty, also in “A MySQL Core Kernel”. Of course Monty said “It’s Trivial” several times, and that’s fine; it probably is trivial and is a day’s work for the gurus. The problem is there are presently 6,000 trivial days’ work on the list of things to do.

“I’m trying to estimate when my finger will fall off.” — Jay Pipes You had to be there. I will say no more.

“You work for InnoDB, right” — Dathan Vance Pattishall of Flickr “InnoDB works for me.” — Ken Jacobs of Oracle

“Absolutely” Steve Gunn of Google in “The MySQL at The Google” talk. And the question from the floor that prompted this response “Do schema changes ever affect the production systems”.

“Everything at Google grows at the rate Google grows. If you want a proper answer we have to file that with the SEC”. Steve Gunn of Google again in “The MySQL at The Google”.

“We like to use boxes that crash.” — Mark Callaghan of Google.

“I want to make it, but we have already met before.” — Paul Tuckfield while Jeremy bashing. Side Note, apparently I’ve been saying “bagging Jeremy” which is Aussie Slang, but here in the US it has other meanings!

“I’d love my business card to say Hacker Herder”. The very cool Leslie, our Google Liaison person.

“Actually they are just extras, they have all been hired for the day.” — Sheeri. In reference to all the Google Employees wearing Google shirts.

“And we’ll give you a tee-shirt” — A Google employee about Job Opportunities.

“I’m going have to kill Jeremy. This wireless stinks, I’d rather have dialup” — Sheeri about our hotel connectivity, hotel being recommended by Jeremy.

“I’m the former founder of Live Journal.” — Brad Fitzpatrick. “How can you be a former founder” — Jeremy Cole.

There were of course so many more, I just didn’t write them down. But tomorrow I will be prepared.

Testing on the toilet


Yes you got it, even while in the restroom here at Google (you can’t say toilets here in the US, because that’s the device), Google keeps you occupied while standing or sitting with the writings of “Testing on the Toilet”.

In Episode 19, TOTT talks about “Converting Old Style Tests”. An interesting read compared with the daily grind of the front page of USA Today, and something that obviously doesn’t need to be changed as often.

So how was the toilet experience here at MySQL Camp? Well, you have toilet warming seats (my first experience, it was a little weird), and then you get the built-in “bidet” as well, with the ability for front cleaning, rear cleaning and then drying. Now that was really weird.

There has been a policy of what photos we can and can’t take and that’s cool, so I can’t post a copy of it. I will however show you this cool testing logo of TOTT as it was also on a tee-shirt (yes, it’s a little stained, but geeks do that sometimes) of a Google employee in Kiev, which is where we can take photos.

MySQL Camp – Introductions & Comments

The great thing about this unconference is the lack of formal structure. For our first session we are having an open introduction of people. There are a good 60+ people here already, with more rolling in, and it’s great to hear people’s backgrounds, and also to bag Jeremy Cole at every opportunity. We have a variety of people from various backgrounds, companies and experience levels.

We are in the Kiev room, with power built into the desks, lots of desk space and full 360° swivel chairs. This is just another example of the company’s clear thinking about its requirements.

There have already been some very funny stories, I should have made more earlier notes. Here are some.

Adam Ritter (Proven Scaling ride winner) was the first to bag Oracle, really bold move with Ken Jacobs from Oracle directly behind him, and he had already made his introduction.

Paul Tuckfield of You Tube guys said to Jeremy Cole re his replication talk “I want to make it, but we have already met before.” There have been about 10 bags of Jeremy already, he is giving as good as he is getting. Proven Scaling are sponsoring Beer session tomorrow night. Great stuff Jeremy. He did also ask how many people were planning on coming, given the number of people at the MySQL Camp has tripled in the past few days.

Breaking news. Mark Callaghan from Google asked, “Is there anybody from YouTube here?”, to which Paul Tuckfield of YouTube identified himself. After a few quick words the Google comment was “Deal’s off”, which made everybody laugh. That’s been the level of good interaction with people here.

Flickr DB dude (his words) Dathan Vance Pattishall said to Ken Jacobs, “You work for InnoDB, right?”. Ken Jacobs’ response was “InnoDB works for me.” Again a lot of laughs.

Google update – another 2 mins later


I’m outside enjoying a very lovely Danish and orange juice with Jay and Leslie, and like 3 motorised scooters and a guy on a skateboard go past. Did I mention how cool this place is?

Back in the foyer and Sheeri is sitting in the leather massage chair, as more people start streaming in. She has her laptop there and is IM’ing her boyfriend.

“So I’m in a massage chair at Google headquarters”. And his response is, “like right now”. Well, here will be the photo and video when we find somebody with a card reader for my camera.

Jay’s looking a little worried. Registrations are now over 200 (202 to be exact); yesterday it was 150, and like 3 days ago it was still in the 70s & 80s. People must have found out about a free event at MySQL. We are going to kick people out that don’t contribute. It is an unconference.

So now that I write this, registrations are at 206.

My own Googlewear

So like two minutes later, some official looking Google people come over and say “Come on over and get your Google Shirt”. So before the last post is even cold, we have our own Googlewear.

A minute later, Leslie is back again saying, guys and lady (just for Sheeri), “Continental breakfast is ready in the room”. Now to check out the Google Food!

Googlewear

Everybody here (that is not us visitors) is wearing Google shirts. It must be an official clothing label.

So Sheeri says “Actually they are just extras, they have been hired for the day.”

So the latest quote from Leslie is “Eat, joy and be merry, and stay inside the blue lines”. Of course I should also mention that when we arrived the parking security guy said, “Follow the second yellow brick road”. This is going to be a weekend just of quotes!

I'm at Google Mountain View

We have made it to MySQL Camp, being held at Google headquarters in Mountain View, California. Directions WOOT!!!

So we are at the lobby reception of Building 40, and I’m lounging back in a large green beanbag behind all the name tags. This is so cool. The problem is that with all our technology, nobody yet has the capability to read the photos from a digital camera so I can upload them. Both Sheeri and myself have left the right stuff in our hotel rooms. So stay tuned.

Leslie, our Google co-ordinator, wants her business card to read “Hacker Herder”, which sounds so cool. This whole weekend is going to be a blast. More to come.

MySQL :: Developer Zone Quick Polls

I don’t get to the MySQL Developer Zone main page often enough. In thinking about what pages I view every day or regularly, it doesn’t rate as high as Planet MySQL, MySQL Forums or even the MySQL Forge.

I was most disappointed in the results of a recent poll, What did you think of the 2006 Users Conference?. The top response was I had no idea there was a Users Conference. That’s not good to see.

An interesting poll was What are you most looking forward to at the MySQL Users Conference (April 24-27)?, where the clear winner was Drinking beer with MySQL gurus. What does this say about the attendees? Either they are all alcos or they just want to be around gurus in a less technical way.

I see this page also has a live feed of Planet MySQL. Perhaps we should get some more stuff down the right side of PlanetMySQL like the current Quick Poll itself and a feed of the current developer articles at the Developer Zone.

A Post MySQL Conference review. The 4 F's

Finally back home after some R&R at Yosemite before leaving the US. To sum up my experience of the 4th Annual MySQL Users Conference in one word: “Excellent”.
Here’s my take: Friends, Functionality & New Features, the Future.

Friends

I’ve used MySQL now for over 6 years, and full time for a number of years, yet I’ve only become active in the MySQL community, particularly Planet MySQL, in the past 6 months. Over that time, I’ve read a lot from members, and heard from many people. It was great at the conference to meet many of these people for the first time. The list includes:
Community Members: Frank Mash, Mike Kruckenberg, Markus Popp, Roland Bouman, Giuseppe Maxia and Paul McCullagh.
MySQL Employees: Mike Hillyer, Colin Charles, Jay Pipes, Mike Zinner.
New Contacts: Kristian Köhntopp, Jeremy Cole, Sheeri Kritzer, Taneli Otala, Laura Thompson, just to start the list.

Functionality

Not only was there plenty of discussion on Server Functionality, there was plenty of MySQL Client functionality including the MySQL Workbench, MySQL Migration Toolkit and the other MySQL GUI products.
There were a number of discussions on uses and implementations of MySQL in large web deployments. It would be great to see some more white papers here.

New Features

A few months ago I wrote an article A call to arms!. In some part, I was just giving my opinion and hoping to gee up some support and feedback from the community. Well, the MySQL 5 Pluggable Storage Architecture got a great boost with announcements of the transactional storage engines Falcon by Jim Starkey, Solid and PBXT. Add details of InnoDB New Features, MyISAM additions, and indications of other wonderful if not entirely practical options. I’m sure there is much more in store this year that wasn’t discussed.

A number of talks featured Cluster, including Monday’s tutorial, and with 5.1 and beyond I can see that next year there will be more discussion on successful Cluster implementations. There were a lot of talks about scaling out. I’d like to see more practical examples, perhaps a detailed tutorial.

The Future

What does the future hold for MySQL? The MySQL Server and Storage Engine Roadmap provided an insight of the upcoming planned features and releases over the next 2 years. Of course, the marketplace can change quickly, and MySQL is in a great position to react to the needs of the community quickly.

And before you know it, the 5th Annual MySQL Conference will be in play.

Conference Feedback

One thing I had a chance to discuss with Jay Pipes after the conference was that a number of talks just needed more time. I wasn’t the first to mention it, and plans are already in motion. Moving the schedule to 55 minute talks gives that extra time, even if it is just open question time from the floor, and it also makes knowing when sessions are on much easier if they always start at the top of the hour.

In Conclusion

Frank (a.k.a. Farhan Mashraqi) asked me what session I liked the most. Hard to say. Agile Database Techniques: Data Doesn’t Have to be a Four-Letter Word Anymore rated very highly, as the content was close to my heart and my expertise. HackFest B: Creating New SHOW Commands by Brian Aker showed just how easy it was to get into the MySQL source. Of course the internals are much more complex than this, but it was a good introduction. My favourite keynote was The Ubuntu Project: Improving Collaboration in the Free Software World. There were a number of talks I was disappointed in, as well as a number I didn’t get to due to 8 sessions running in parallel.

I would have to say that what impressed me most was no one single talk, but the functionality of the GRT Shell that Mike Zinner and his team have built into the GUI product line. I was very impressed, and I could see this providing extensive functionality and not just MySQL-centric tools. This will be an area I’ll be focussing my contributions on in the near future.

MySQL Stored Procedures Performance

Another one of the sessions I attended at the MySQL Users Conference was Tuning MySQL5 SQL and Stored Procedures by Guy Harrison from Quest Software, a global company with 6000+ customers.

Guy has written a number of Oracle performance books in the past. His work now is on the “Spotlight” product family, database diagnostic tools that convert data into graphical representations. These products require MySQL 5 and InnoDB, simply because that is what exposes the right internal information for presentation. There are freeware MySQL product downloads.

In this presentation he stated that nothing he was talking about was particularly new. He did make quite a funny comment, that he “is now seeking refugee status in the MySQL Community”.

Guy is the author of the O’Reilly book “MySQL Stored Procedure Programming”. I managed to get a copy for free at the conference from the MySQL Quiz night, in addition to a shirt and cap for stumping a Guru.

His talk was on tools and techniques for tuning MySQL.

  • Explain Command – reveals what the optimizer intends to do
  • Explain Extended
mysql> explain extended select ...;
mysql> show warnings \G
Shows what the optimizer actually did. In this example, an IN was converted to EXISTS.
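To make the two steps above concrete, here is a minimal sketch. The customers and sales tables and their columns are my own hypothetical names, not anything from Guy's slides, and the exact rewritten statement you get back will depend on your MySQL version.

```sql
-- Hypothetical tables for illustration: customers(id, name), sales(customer_id, qty).
-- EXPLAIN EXTENDED is the MySQL 5.0/5.1 era syntax used in the talk. Later
-- versions deprecated and then removed the EXTENDED keyword; there a plain
-- EXPLAIN followed by SHOW WARNINGS returns the same kind of information.
EXPLAIN EXTENDED
SELECT c.name
  FROM customers c
 WHERE c.id IN (SELECT s.customer_id
                  FROM sales s
                 WHERE s.qty > 10);

-- The Note returned here holds the statement as rewritten by the optimizer,
-- which is where a transformation such as IN to EXISTS becomes visible.
SHOW WARNINGS\G
```

The \G terminator just prints each column on its own line, which is much easier to read for a long rewritten statement.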

There were 4 ways to provide optimizer hints.

  1. STRAIGHT_JOIN
  2. USE INDEX(…)
  3. FORCE INDEX(…)
  4. IGNORE INDEX(…)
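As a rough sketch of where each of these four hints goes in a statement (the tables sales and customers, and the index names idx_customer and idx_product, are hypothetical and only there for illustration):

```sql
-- 1. STRAIGHT_JOIN: join the tables in the order they are listed in FROM.
SELECT STRAIGHT_JOIN c.name, s.qty
  FROM customers c
  JOIN sales s ON s.customer_id = c.id;

-- 2. USE INDEX: ask the optimizer to consider only the named index(es).
SELECT qty
  FROM sales USE INDEX (idx_customer)
 WHERE customer_id = 42;

-- 3. FORCE INDEX: like USE INDEX, but a table scan is also treated as
--    very expensive, so the named index is used whenever it possibly can be.
SELECT qty
  FROM sales FORCE INDEX (idx_customer)
 WHERE customer_id = 42;

-- 4. IGNORE INDEX: tell the optimizer not to use the named index(es).
SELECT qty
  FROM sales IGNORE INDEX (idx_product)
 WHERE customer_id = 42
   AND product_id = 7;
```

Index hints follow the table name (and alias, if there is one) that they apply to. It is worth checking with EXPLAIN that a hint actually improved the plan before leaving it in production code.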

In addition to the Slow Query Log, there are InnoDB-specific status variables, two in particular.

show status like 'innodb%'
* innodb_buffer_pool_read_requests
* innodb_data_read
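A small sketch of how I would look at these counters. The hit-ratio calculation in the comments is a common rule of thumb that I am adding here, not something from the talk.

```sql
-- Logical read requests against the buffer pool, plus the related counters.
-- Innodb_buffer_pool_read_requests = logical read requests
-- Innodb_buffer_pool_reads         = requests that had to go to disk
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';

-- Data read activity.
-- Innodb_data_read  = amount of data read so far, in bytes
-- Innodb_data_reads = number of data reads
SHOW STATUS LIKE 'Innodb_data_read%';

-- Rule-of-thumb buffer pool hit ratio (my assumption, not from the talk):
--   1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)
```

These values are cumulative since server start, so comparing two samples taken some time apart tells you more than a single reading.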

Indexing and the optimizer

  • In MySQL an index is the best tool to improve performance; however, sometimes it’s better to access the entire table.
  • Indexes are generally effective when between 5% and 20% of rows are accessed.
  • Subqueries need to be satisfied by an index or performance will be quite inefficient.
  • Overloading indexes with additional columns, when key queries only use a few columns, can enable improved performance.

Not all indexes are created equal. In the following examples, each advancement improved performance.

  • No indexes ()
  • Single Index (customer)
  • multiple indexes (customer, product)
  • concatenated indexes (customer + product)
  • covering index (including required columns, customer+product+qty)
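The progression above can be sketched as DDL. The sales table and index names are hypothetical; only the column roles (customer, product, qty) come from the talk.

```sql
-- Hypothetical table, used for illustration only.
CREATE TABLE sales (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  product_id  INT NOT NULL,
  qty         INT NOT NULL
) ENGINE=InnoDB;

-- Single index (customer)
CREATE INDEX idx_customer ON sales (customer_id);

-- Multiple separate indexes (customer, product)
CREATE INDEX idx_product ON sales (product_id);

-- Concatenated index (customer + product)
CREATE INDEX idx_customer_product ON sales (customer_id, product_id);

-- Covering index (customer + product + qty): every column the query
-- needs is in the index, so the table rows need not be read at all.
CREATE INDEX idx_customer_product_qty ON sales (customer_id, product_id, qty);

-- When the covering index is chosen, EXPLAIN typically reports
-- "Using index" in the Extra column for a query like this one.
EXPLAIN
SELECT SUM(qty)
  FROM sales
 WHERE customer_id = 42
   AND product_id = 7;
```

In practice you would not keep all four indexes at once, since each one adds write overhead; they are shown together here only to mirror the steps in the list.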

Examples of SQL that can’t benefit from Indexes.

  • Derived tables – a SELECT in a FROM clause creates a temporary table, which will never get an index.
  • Views with UNIONS/GROUP BY
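Here is a sketch of the derived table case, with one possible rewrite. The tables are hypothetical, and the two statements only return the same rows under the assumption that customers.id is the primary key.

```sql
-- Derived table: the inner SELECT is materialised as a temporary table,
-- and the join against it cannot use an index on that temporary result.
SELECT c.name, t.total_qty
  FROM customers c
  JOIN (SELECT customer_id, SUM(qty) AS total_qty
          FROM sales
         GROUP BY customer_id) t
    ON t.customer_id = c.id;

-- Plain join form of the same question. Assumes customers.id is unique,
-- so grouping by customer gives one row per customer as before. Here the
-- optimizer is free to use an index on sales(customer_id).
SELECT c.name, SUM(s.qty) AS total_qty
  FROM customers c
  JOIN sales s ON s.customer_id = c.id
 GROUP BY c.id, c.name;
```

Whether the rewrite is actually faster depends on the data, so compare both with EXPLAIN on a realistic data set.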

A comment from the audience was that derived tables can be of benefit over a correlated sub-query in specific examples.

Stored Procedures provided a mixed blessing for performance.

  • Can improve performance when there is high network overhead.
  • Some improvement on parsing.
  • Breaking up complex queries may provide benefits.
  • SQL is highly optimized for SET operations.
  • SP is not optimized for number crunching. Computationally it is not a fast language.

A routine written to calculate prime numbers gave the following performance (from most expensive to least): MySQL SP, Oracle SP, PHP, Perl, Java, VB.NET, C (gcc). This was an example where a stored procedure is excessively inefficient. On the other hand, if the program is network dependent (e.g. access a million rows and perform some statistical aggregation), performance was comparatively the same between Java and an SP when run locally, but the SP was much better when the client was on a remote host.

Performance of the SQL in an SP will dominate overall performance. Once the SQL is tuned, go to tried and proven traditional optimisation techniques.

  • Optimize iterations
  • Optimize Logic/Testing
  • Avoid recursion

Loop Management

  • Only perform necessary code within iterations
  • LEAVE or CONTINUE when possible in loops
  • Test the most likely IF/THEN statements first
  • Extract duplicated IF comparisons to produce nested IFs (within reason)
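To tie the loop guidelines back to the prime number benchmark, here is my own minimal sketch of such a routine. It is not Guy's actual test code, and count_primes and its parameters are hypothetical names. Note that in MySQL the statement for skipping to the next iteration is ITERATE, while LEAVE exits the loop.

```sql
DELIMITER //

-- Hypothetical procedure: counts the primes up to p_limit by trial division.
CREATE PROCEDURE count_primes(IN p_limit INT, OUT p_count INT)
BEGIN
  DECLARE n INT DEFAULT 2;
  DECLARE d INT;
  DECLARE is_prime INT;

  SET p_count = 0;

  WHILE n <= p_limit DO
    SET is_prime = 1;
    SET d = 2;

    -- Only test divisors up to the square root of n.
    inner_loop: WHILE d * d <= n DO
      IF n % d = 0 THEN
        SET is_prime = 0;
        LEAVE inner_loop;  -- stop as soon as a divisor is found
      END IF;
      SET d = d + 1;
    END WHILE inner_loop;

    SET p_count = p_count + is_prime;
    SET n = n + 1;
  END WHILE;
END //

DELIMITER ;

CALL count_primes(100, @primes);
SELECT @primes;  -- 25 primes up to 100
```

The LEAVE keeps the inner loop from doing unnecessary work once the answer is known, which is exactly the kind of saving that matters in a language that is slow at number crunching.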

Some guidelines for Triggers.

  • Triggers will have a non-trivial overhead for even the simplest trigger.
  • Because triggers are FOR EACH ROW only, don’t put expensive SQL in any trigger.
  • Very carefully tune SQL in triggers.
  • An empty trigger produced 12% overhead.
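If you want to measure the baseline cost yourself, a do-nothing trigger is enough. This is only a sketch against a hypothetical sales table; the 12% figure above is from the talk and your own numbers will vary.

```sql
DELIMITER //

-- A trigger with an empty body (an empty BEGIN ... END block is legal in
-- MySQL). It still fires once for every inserted row.
CREATE TRIGGER sales_before_insert
BEFORE INSERT ON sales
FOR EACH ROW
BEGIN
END //

DELIMITER ;

-- Time a bulk insert with the trigger in place, then drop it and
-- time the same insert again to compare.
DROP TRIGGER sales_before_insert;
```

Because the body runs per row, any SQL you add inside it is multiplied by the number of rows in the statement, which is why the advice above is to keep trigger SQL cheap and well tuned.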

For more information check out www.quest.com/mysql