Website of the Day – Slideshare


I came across an interesting site called SlideShare while reading World’s Best Presentation Contest Winners Announced by Guy Kawasaki.

It’s a happy medium between bulk image sites like Flickr and Yahoo Photos and video sites like Revver and YouTube, where you can easily add text to what you want to say in a slide show. Interestingly enough, as with most Web 2.0 communities, people come up with ideas you never considered; for example, check out Evangeline Lilly, which is effectively a portfolio photo shoot of an actress. Clever.

Reading the MySQL Manual

I was asked the question today, “How do I show the details of a Stored Procedure in MySQL? SHOW PROCEDURE ‘name’ didn’t work.”

The obvious answer was SELECT ROUTINE_NAME,ROUTINE_DEFINITION FROM INFORMATION_SCHEMA.ROUTINES, given I like to use the INFORMATION_SCHEMA whenever possible. This led me to wonder what the corresponding SHOW command is. A quick manual search got me to SHOW CREATE PROCEDURE.
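For reference, a minimal sketch of both approaches (the routine name my_proc is a hypothetical placeholder):

    SELECT ROUTINE_NAME, ROUTINE_DEFINITION
    FROM INFORMATION_SCHEMA.ROUTINES
    WHERE ROUTINE_SCHEMA = DATABASE()
    AND ROUTINE_NAME = 'my_proc';

    SHOW CREATE PROCEDURE my_proc;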

What was interesting was not this, but the list of other SHOW commands I didn’t know: SHOW MUTEX STATUS, SHOW OPEN TABLES and SHOW PROCEDURE CODE, for example.

It pays, if you have free time (whoever has that), to re-read the manual, or at least a detailed index, regularly. I’m sure there are plenty more helpful tips out there. Now, just what does the output of these new commands really show, and does it really help? If only I could get commands for the stuff I really want.

MySQL Cluster Certified

Jonathon Coombes recently blogged in MySQL Cluster Certified that he passed the MySQL Cluster DBA Certification and was the first Australian to do so. Lucky for him, I passed the exam after my presentation on the second day of the conference. I guess we Australians are leading the world!

As Jonathon said, it was rather hard, certainly more difficult than the other DBA exams, but nothing for an experienced Cluster DBA.

MySQL Conference – YouTube

MySQL Conference 2007 Day 4 rolled quickly into the second keynote Scaling MySQL at YouTube by Paul Tuckfield.

The introduction by Paul Tuckfield was: “What do I know about anything? I was just the DBA at PayPal, now I’m just the DBA at YouTube. There are only 3 DBAs at YouTube.”

This talk had a number of great performance points, with various caching situations. Very interesting.

Scaling MySQL at YouTube

Top Reasons for YouTube Scalability

The technology stack:

  • Python
  • Memcache
  • MySQL Replication

Caching outside the database is huge.

In a display of the number of hits per day, it was said “I can neither confirm nor deny this interpretation will work here (using an Alexa graph)”. This is not the first time I’ve heard this standard “Google” response. They must get lessons from lawyers in what you can say.

Standardizing on DB boxes (but they crash almost daily)

  • 4 x 2GHz Opteron cores
  • 16GB RAM
  • 12 x 10K SCSI
  • LSI hardware RAID 10
  • Replication played a big part in fixing
  • Get a reliable H/W supplier

Replication Lessons

  • You don’t worry about it when a replica fails.
  • One thing that sucks: InnoDB doesn’t recover very fast. It does that durability thing, but it takes hours to finish recovering (was it ever going to finish?).
  • How many backups can you restore? When you switch to a replica, are you sure it’s right?
  • Did you test recovery? Did you test your backups?
  • Replication was key to trying different H/W permutations to identify incompatible H/W (combinations of controllers/disks).
  • We got good at re-parenting/promoting replicas, really fast.
  • We built up ways to clone databases as fast as possible.
  • An excellent way to test tuning changes or fixes (a powerful place to test things).
  • Keep “intentional lag”/stemcell replicas – stop the SQL thread to keep a server a few hours or a day behind; if you drop a table you have an online backup (see the sketch after this list).
  • When upgrading, always mysqldump and reload, rather than upgrading the database in place.
  • Don’t care about CPUs. I want as much memory as possible, I want as many spindles as possible.
  • For YouTube a 2-3 second lag is acceptable.
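A minimal sketch of the “intentional lag” idea mentioned above (the commands are standard replication statements; in 4.x/5.0 the delay itself has to be managed by hand or by a cron job):

    -- On the lagged replica: stop applying events. The IO thread keeps
    -- downloading binlogs from the master, so nothing is lost.
    STOP SLAVE SQL_THREAD;

    -- Hours later (or after a bad DROP TABLE on the master, before it is
    -- applied here), catch up or recover:
    START SLAVE SQL_THREAD;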

If your db fits in RAM, great. Otherwise:

  • Cache is king
  • Writes should be cached by the RAID controller (buffered, really), not the OS
  • Only the db should cache reads (not RAID, not the Linux buffer cache)

Only DB should cache reads

  • A hit in the db cache means the lower caches went unused.
  • A miss in the db cache can only miss in the other caches, since they’re smaller.
  • Caching reads is worse than useless: it serializes with writes.
  • Avoiding serialization on reads reaps compound benefits under high concurrency.

An important lesson learned: do not cache reads in the F/S or RAID controller.

Caching Lessons
Overcoming Mystery Serialization

  • Use O_DIRECT (see the sketch after this list)
  • vm.swappiness=1-5
  • if you’re >80% busy but not doing I/O concurrently, look at the other figures: e.g. 80% busy with 8 concurrent I/Os vs. the next configuration at 80% busy with only 4 I/Os
  • Mirror in H/W, stripe in S/W
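A hedged sketch of where those two settings live (the setting names are standard; the values are as quoted in the talk, so verify them on your own platform):

    # /etc/my.cnf — bypass the OS buffer cache for InnoDB data files
    [mysqld]
    innodb_flush_method = O_DIRECT

    # Linux VM tuning (via sysctl -w, or persisted in /etc/sysctl.conf)
    vm.swappiness = 5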

Scale Out

  • Writes are parallel on the master, but serialized on replicas. We need true horizontal partitioning.
  • We want true independent masters
  • EMD – Even More Databases — Extreme Makeover Database
  • Slave transactions must serialize to preserve commit order (this is why replication is always slower)
  • The oracle caching algorithm (that’s a small o) — predicting the future
  • Replication lag: one I/O-bound thread. But you do know the future — the commands are coming up serially in the relay log.
  • Write a script to do the reads before the updates come up (so they become cache hits; see the sketch after this list).
  • The diamond. For go-live, play the shards’ binlogs back to the original master for fallback.
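My loose interpretation of the read-ahead script idea, as a sketch only (the real YouTube tool was not shown, and the statement parsing here is naively simplified):

    # Turn upcoming UPDATEs from the relay log into SELECTs so the needed
    # pages are in the buffer pool before the slave SQL thread applies them.
    mysqlbinlog slave-relay-bin.000123 |
      grep '^UPDATE' |
      sed 's/^UPDATE \([^ ]*\) SET .* WHERE \(.*\);$/SELECT 1 FROM \1 WHERE \2;/' |
      mysql --host=127.0.0.1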

MySQL Conference – Get Behind Dorsal Source


In a community session yesterday at MySQL Conference 2007, I first heard about Dorsal Source: a collaboration between SolidDB and Proven Scaling that allows community members to upload patches to MySQL, get them compiled across multiple platforms, and have a downloadable distribution available, built on H/W individual contributors would never have access to.

That’s a great idea. I also hope we get the opportunity to have patches compiled into multiple versions, as well as builds that combine a lot of patches together. Personally, I’m running 3 versions just to diagnose one problem: 5.0.36 with a custom binary change, 5.0.37 so I have SHOW PROFILE, and 5.0.33 so I have the microslow patch.

With new patches becoming available from the community, I hope I can see builds that combine all known patches that Dorsal Source may have.

I think this is going to be a great project.

MySQL Conference – PHP on Hormones

MySQL Conference 2007 Day 4 started early again at 8:20 am with PHP on Hormones by the father of PHP, Rasmus Lerdorf.

A very funny man, and one of the most insightful talks of the conference (rather scary, actually). Here are some opening comments.

  • In his own words as keynote speaker: “I’m here because I’m old”.
  • PHP 1, from 1994, started after seeing Mozilla in 1993. Because it was just me using it, I could change the language at any time.
  • In 2005 the code looks like this (in comparison to 1995) — I’m not sure if this is worth 10 years of development.
  • I wrote PHP to avoid programming.
  • It’s changed to be more OO because people expect that. Universities teach this.
  • Hey, I was fixing bugs in my sleep. I would wake up, and in my mailbox there would be fixes to bugs I didn’t even know I had.

Why do people contribute?

  • Self-interest
  • self expression
  • hormones
  • Improve the world

The slide included a great chemical equation for “the neuropeptide oxytocin” — nature’s trust hormone.

People need to attract other people, it makes you feel good, it comes out when you interact with people.

It’s not what people think about you, but rather what they think about themselves.

  • PHP was my baby; giving up control — just because I started it doesn’t mean I have a bigger say in it.
  • Systems that harness network effects and get better the more people use them, in a way that caters to their own self-interest. — Web 2.0
  • Once you build a framework you’re done; the users build the site, they drive the content.
  • The same people that work on open source projects are the same people that use websites:
    • Self-interest
    • self expression
    • hormones
    • Improve the world

1. Performance
If your site falls apart you’re done.

  • Benchmark
    • http_load
    • Callgrind inside valgrind
    • XDebug

valgrind --tool=callgrind

  • An excellent tool to see where time is spent in the code. You have to run a profiler.
  • Example using Drupal: it turns out 50% of the time was spent in the theme; it had 47 SQL queries, 46 SELECTs.
  • Went from 4 requests per second to 80 per second, without any code changes — some performance options and some caching.
  • Guaranteed: you can double the speed of your website by using a profiler.

2. Security
Critical problem areas.

  • 404 pages
  • Search page
  • PHP_SELF
  • $_GET, $_POST, $_COOKIE
  • $_SERVER
  • Lots of stupidity in IE (e.g. Always send a charset)

The web is broken you can all go home now.

People are vulnerable because they run older versions of browsers, their data is not secure, and you can’t secure their data.

What can happen?
9 out of 10 of you have a cross-site scripting hole on your site.

Remote Greasemonkey
Profile Hacks
JS Trojans

He added a PHP logo to the MySQL User Conference website (it’s really served from the PHP website), and to an IBM webpage, on an article about security.

He has a tool to find holes: banks, insurance companies, the CIA, even Yahoo where I work.

You know if they have been to bankofamerica.com; you can tell if they are a customer, you can tell if they are logged in, and you can then see their cookie credentials.

You don’t know if any sites have these problems.

A JS trojan: an iframe that captures traffic, reconfigures your wireless router, moves it outside your DMZ, then uses traditional techniques to attack your machine (that you thought was secure inside a firewall).

You should never ever click on a link. It sort of defeats the purpose of the web.

Never use the same browser instance to do personal stuff and browsing.

So what are we doing about this?
There isn’t much we (PHP) can do to secure sites already developed.
We built a filter extension into 5.2, backported to 5.1.

http://php.net/filter *** YOU MUST IMPLEMENT THIS
filter.default=special_chars

3. APIs are Cool!

Two lines to grab the Atom feed from Flickr of photos just uploaded.
That’s all I have to add to my code.

They really make you want to use the services. It’s so easy.

APIs drive passion, drive people to use your site.
You can add a lot of cool things to your sites.

What to do

  • Avoid Participation Gimmicks
  • Get their Oxytocin flowing
  • Solve One Problem
  • Clean and Intuitive UI
  • APIs
  • Make it work

A full copy of the slides can be found at http://talks.php.net/show/mysql07key

MySQL Conference – Google

MySQL: The Real Grid Database

Introduction

  • Can’t work on performance problems until we solve the availability problems
  • We want MySQL to fix our problems first.

The problem

  • Deploy a DBMS for a workload with
    • too many queries
    • too many transactions
    • too much data

A well known solution

  • deploy a grid database
    • use many replicas to scale read performance
    • shard your data over many masters to scale write performance
    • sharding is easy, resharding is hard

  • availability and manageability trump performance
    • make it easy to run many servers
    • unbeatable aggregate performance

  • we describe problems that matter to us.
  • The grid database approach

    • Deploy a large number of small servers
    • use highly redundant commodity components
    • added capacity has a low incremental cost
    • not much capacity lost when a server fails
    • support many servers with a few DBAs

    Manageability
    Make it easy to do the tasks that must be done. Reduce, reduce.
    Make all tasks scriptable.
    Why does it matter? So you can support hundreds of servers and spend time solving more interesting problems. You generally have lots of problems to solve.

    Underutilize your servers
    Require less maintenance
    Require less tuning
    Tolerate load spikes better
    Tolerate bad query plans better

    In a Perfect World
    Short running queries
    Users kill mistaken and runaway queries
    Accounts never use too many connections
    Query plans are good
    New apps increase database workload by a small amount
    Only appropriate data is stored in the database

    Reality

    • Long running transactions create replication delays everywhere
    • servers with round-robin DNS aliases make queries hard to find
    • applications create more connections when the database is slow
    • some storage engines use sampling to get query plan statistics
    • new applications create new database performance problems
    • applications use the database as long as rows are never deleted
    • many long running queries on replicas

    Solutions

    • Improve your ability to respond because prevention is impossible
    • Need tools to make monitoring easier
    • determine what is happening across servers
    • determine what happened in the past

    Mantra

    • Monitor everything you can, and archive as long as possible. (vmstat 5 secs, iostat, mysql error logs)
    • You will need these to reconstruct failures
    • save as much as possible
    • script as much as possible

    Monitoring Matters

    • Display what is happening
      • which table, account or statement caused most of the load
      • many fast queries can be as much of a problem as one slow query

    • Record what happened
      • archive SHOW STATUS counters somewhere
      • query data from the archive
      • visualize data from the archive

    • Record queries that have been run
      • archive SHOW PROCESSLIST output (do it every 30 seconds)
      • support queries on this archive

    • All of this must scale to an environment with many servers

    Monitoring Tools

    • Display counters and the rate of change of counters
    • aggregate values over many servers
    • visualize and rank results
    • display results over time

    Google mpgrep tools

    New Commands
    We changed MySQL: three new commands.
    SHOW USER_STATISTICS
    SHOW TABLE_STATISTICS
    SHOW INDEX_STATISTICS

    Per Account Activity
    USER_STATISTICS shows:
    seconds executing commands
    number of rows fetched and changed
    total connections
    number of selects/updates/other/commits/rollbacks and binlog bytes written

    TABLE_STATISTICS
    number of rows fetched/changed

    INDEX_STATISTICS
    displays the number of rows fetched per index
    helps find indexes that are never used

    Available at code.google.com for 4.0; a port to 5.0 is in progress.
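    Usage is as simple as the names suggest — a minimal sketch, assuming a server built with the Google patches applied (the exact spelling of the commands may differ between patch versions):

        SHOW USER_STATISTICS;
        SHOW TABLE_STATISTICS;
        SHOW INDEX_STATISTICS;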

    MySQL High Availability

    • Great options:
      • Cluster
      • Replication
      • Middleware — e.g. Continuent
      • DRBD
    • We need some features right now
    • We are committed to InnoDB and MySQL replication
      • a lot of application code works on this
      • our tools and processes support this
    • We favor commodity hardware

    These are all great features, but we are much more limited in what we can use.
    Management wants to know we don’t lose transactions, not that we only lose some transactions.

    Desired HA Functionality

    • Zero transaction loss on failures of a master
    • minimal downtime on failures of a master
    • reasonable cost in performance and dollars
    • fast and automatic failover to a local or remote server
    • no changes to our programming model
      • does it support MVCC?
      • does it support long running transactions? (5 mins – populate a temp table then use it to update another table, changing rows; 5 mins on the master causes 5 mins of lag on the slave, which causes code to fail over from slaves to the master)
    • replication and reporting are concurrent on a slave

    MVCC must allow updates concurrent with queries.

    Failures happen everywhere
    OS – kernel OOM or panic (older 2.4 32-bit systems)
    mysqld – crashes, caused also by code we added
    disk – misdirected writes, corrupt writes (love InnoDB checksums)
    file system – inconsistent after unplanned hardware reboot (we use ext2)
    server – bad RAM
    LAN, switch – losses
    Rack – reboots
    Data center – power loss, overheating, lightning, fire
    People – things get killed or rebooted by mistake (a typo can take out the wrong server, when names differ by a character or a digit)

    ext2 and 4.0 are great; they are of the same generation.
    We try not to use RAID — no battery-backed RAID etc.; we try to work around failures with software solutions. We do use RAID 0, but we also try software solutions.
    When we have the right HA solution, we won’t need RAID.

    Mark. “Yes, Google programmers have bugs. Not me personally, it was my predecessor.”

    HA Features we want in MySQL
    Synchronous replication as an option
    A product that watches a master and initiates a failover
    Archives of the master’s binlogs stored elsewhere
    State stored in the filesystem to be consistent after a crash
    • the InnoDB and MySQL dictionaries can get out of sync
    • replication state on a slave can get out of sync

    We could not wait
    Features we added to MySQL 4.0.26.
    We can do things a lot faster:
    • we have more developers lying around
    • our needs are specific, not a general product solution

    Transactional replication for slaves
    semi-synchronous replication
    mirrored binlogs
    fast and automated failover

    Transactional Replication
    Replication state on a slave is stored in files.
    The slave SQL thread commits to the storage engines and then updates a file;
    a crash between the two can make the replication state inconsistent.
    Transactional replication fixes this.
    MySQL could solve this in the future by storing replication state in tables.

    Semi-synchronous replication
    Block return from commit on a master until at least one slave has acknowledged receipt of the transaction.
    The slave IO thread acknowledges receipt after buffering the changes.
    We modified the MySQL replication protocol to support acknowledgments.
    Configuration options:
    • whether the master uses it
    • whether a slave uses it
    • how long the master waits for an acknowledgment

    You can run a server with some semi-sync replication slaves and some regular replication slaves.
    This can work with any storage engine that supports commit, but we only use InnoDB.

    * This is how we guarantee Zero Transaction Loss to management.

    Latency: single stream 1ms, multi-stream 10ms. This is acceptable for us.

    The MySQL Replication Protocol

    • The current replication protocol is efficient
    • a slave makes one request

    Replication Acknowledgment

      Slaves register as semi-sync or async at connect time.
      Flag bytes are prepended to all replication events sent to semi-sync clients.
      The master sets the flag bytes to request acknowledgment for replication events that represent the end of a transaction.
      The slave uses the existing connection for acknowledgments.

    Mirrored Binlogs
    MySQL does not provide a way to maintain a copy of a master’s binlog on a replica. By copy we mean a file of the same name, equivalent byte for byte.
    Hierarchical replication works much better when a slave can disconnect from one replication proxy and reconnect to another without adjusting binlog offsets.
    Hot backups taken before a failover are difficult to use after a failover.

    Mirrored Binlog Implementation
    Slave IO threads write their own relay log and a copy of the binlog;
    all events but the rotate log event are written.

    After failover, start a new binlog on the new master.

    Fast Failover

    Slaves use a hostname rather than an IP.
    You can’t enable the binlog dynamically (in 4.0), so we added new SQL statements that:
    • disconnect users with SUPER privilege
    • disable new connections
    • enable the binlog
    • enable connections from all users

    Automatic failover
    Something must decide that a master has failed.
    Something must choose the new master.

    Q: What keeps you from moving to 5.0?
    A: Queries don’t parse (joins).

    Data sets: servers have 8GB RAM; data sets are 50-100GB.

    MySQL Roadmap

    Here are some notes from the MySQL Server Roadmap session at the MySQL Conference 2007.

    MySQL: Past and Future

    • 2001: 3.23
    • 2003: 4.0 — UNION, Query Cache, Embedded
    • 2004: 4.1 — Subqueries
    • 2005: 5.0 — Stored Procedures, Triggers, Views
    • Now: 5.1.17 — Partitioning, Events, Row-based replication
    • 2007?: 6.0 Falcon, Performance, Conflict detection
    • 2008?: 6.1 Online Backup, FK Constraints

    2007 Timeline

    • Q1: 5.1 Beta, 5.1 Telco Production Ready, Monitoring Service 1.1, MySQL 6.0 Alpha, Community GA
    • Q2: MySQL 6.0 Beta, New Connectors GA
    • Q3: 5.1 RC, 6.0 Beta, MS 2.0, Enterprise Dashboard beta
    • Q4: 5.1 GA, 6.0 Beta

    Where are we today?

    • We are by far the most popular open source database
    • The Enterprise world is moving online and MySQL is well-positioned for that trend, But:
      • Transactional scalability
      • Manageability
      • Specific online features

    MySQL Server Vision – The Future

    • Always Online — 24×7, Online backup,online analytics, online schema changes
    • Dynamic Scale-out — online partitioning, add node, replication aides,
    • Reliable — fault-tolerant, easy diagnosis, stable memory, ultimately self-healing
    • High-performance — Interactive web, real-time response, apps, 10,000-100,000 clients
    • Ease of use — Portable, Best for development, multiple connectors, easy tuning
    • Modularity and Ubiquity — Storage engines, plug ins

    How can you help?

    • Bug finding and fixing — Community Quality Contributor
    • Feature/patch contribution
    • But, to expedite your patch

    The goal: “Be the Best Online Database for Modern Applications”

    MySQL Conference – For Oracle DBAs and Developers


    I have just completed my presentation at the MySQL Conference 2007 on MySQL for Oracle DBAs and Developers.

    Not mentioned in my slides, but referenced during the presentation, was what I consider the most important page of documentation in the MySQL Manual — 5.2.1. Option and Variable Reference.

    You can download a PDF copy of my presentation here.

    MySQL Conference – Building a Vertical Search Engine in a Day

    Moving into the user sessions on the first day at MySQL Conference 2007, I attended Building a Vertical Search Engine in a Day.

    Some of my notes for reference.

    Web Crawling 101

    • Injection List – the seed URLs you are starting from
    • Fetching the pages
    • Parsing the content – words and links
    • Updating the crawl DB
    • Whitelist
    • Blacklist
    • Convergence — avoiding the honey pots
    • Index
    • Map-reduce — split a large problem into little pieces, process in parallel, then combine results

    Focused content == vertical crawl

    • 20 billion pages out there, a lot of junk
    • Breadth-first would take years and cost millions of lives

    OPIC + Term Vectors = Depth-first

    • OPIC is “On-line Page Importance Calculation”. Fixing OPIC Scoring Paper
    • Pure OPIC means “Fetch well-linked pages first”
    • We modify it to “fetch pages about MySQL first”

    Nutch & Hadoop are the technologies, run on a 4-server cluster. A sample starting with www.mysql.com: in 23 loops, 150k pages fetched, 2M URLs found.

    Serving up the results

    MySQL Conference – RedHat Keynote – One Laptop Per Child

    Our third keynote at MySQL Conference 2007 was titled Building the Ultimate Database Container with RHEL, MySQL, and Virtualization by Michael Evans.

    The presentation was on Red Hat & One Laptop Per Child. His initial quote was “Thinking Past Platforms: The Next Challenge for Linux”, by Doc Searls, 2007-04-16 http://www.linuxjournal.com/node/1000210

    OLPC

    • A Non profit idea from Nicholas Negroponte.
    • Aim is to build & distribute inexpensive laptop systems to primary & secondary school students worldwide.
    • Sell to young children in developing countries.

    In summary, as presented to Red Hat — “Non-profit, run by a professor, we make hardware and sell to governments.”

    The overall dynamics have attracted a lot of interesting people in the world.

    The ability and goal is to make the device together, bringing all H/W and S/W people together.

    The people that get behind this project have the ethos — “I’m willing to jump into this to change the world.”

    This is the first new opportunity of its kind in the last 10 years.

    The sugar user interface is a completely new experience.

    When the UI designer was presenting to a room of head executives: “Whatever advice you’ve got, keep it to yourself; you’re not the target market.”

    One key point — No backward compatibility needs.

    More information at www.laptop.org. Wikipedia Reference. Some videos at You Tube Inside One Laptop per Child: Episode one and Slightly better demo of the OLPC User Interface.

    MySQL Conference – The next keynote with Guy Kawasaki

    Without missing a beat at MySQL Conference 2007, we moved from Mårten’s keynote to The Art of Innovation by Guy Kawasaki.

    Extremely fun and entertaining. His 10 points.

    1. Make Meaning

    • “To change the world”
    • To a VC, do not say “you want to make money”, that is understood. You will attract the wrong team.

    2. Make Mantra

    • Not a Mission statement (50-60 words long), but 2 or 3 words.
      • Wendy’s – “Healthy fast food”
      • Nike – “Authentic Athletic Performance”
      • FedEx – “Peace of Mind”
      • eBay – “Democratize commerce”
    • Create a mantra — Why do you exist?

    If you get stuck try the Dilbert mission statement generator.

    3. Jump to the next curve

    • Not 50% or 100% better, but “Do things 10x better”

    4. Roll the DICEE

    • “Create great stuff”
      • Deep: Fanning (Reef) sandal that opens beer bottles
      • Intelligent: BF-104 Flashlight (Panasonic) (takes 3 sizes of batteries)
      • Complete: Lexus
      • Elegant: Nano (Apple)
      • Emotive: Harley Davidson (They generate strong emotions)

    5. Don’t worry, be crappy

    Get it out there.

    6. Polarize people

    People love it or hate it.

    7. Let a hundred flowers blossom

    • People that are not your target market are using it.
    • Take the money; ask the people why they are buying, and ask what you can do better.

    8. Churn baby, churn.

    It’s OK to ship stuff with crappy stuff in it, but it’s important to continually revise and improve.

    9. Niche thyself
    With a nice graph.

    • Vertical axis — ability to provide a unique product or service
    • Horizontal axis — value to the customer
    • bottom right — Price
    • top left — Stupid
    • bottom left — Dotcom
    • top right — X. You need to be high and to the right.
    • Fandango — It’s either Fandango, or Clubbin.
    • Breitling Emergency – watch
    • Smart car – park perpendicular
    • LG Kimchi refrigerator

    You need to be like the President of the United States – you need to be high and to the right. That got a great laugh from the crowd.

    10. Follow the 10/20/30 rule

    To be innovative, you need to pitch for what you want.

    • The optimum number of slides is 10.
    • Give the slides in 20 minutes.
    • Use a 30 point font.

    11. Don’t let the bozos grind you down

    A bonus to our friends in the community.

    • “I think there is a world market for five computers.”
    • “This telephone has too many shortcomings to be seriously considered as a means of communication. The device is inherently of no value to us.” –Western Union 1876
    • “There is no reason why anyone would want a computer in their home.” — Digital Equipment Corp 1977
    • “It’s too far to drive, and I don’t see how it can be a business.” – Guy Kawasaki – Bozo (The company was Yahoo)

    Guy commenting on his lost opportunity with Yahoo — “It only covers the first billion, it’s the second billion that pisses me off.”

    Read more about Guy at his website Guy Kawasaki.

    The Art of Innovation. If you want a copy of the slides, send an email to [email protected]

    MySQL Conference – Opening Keynote with Mårten Mickos

    It’s an early start this morning at 8:20am at MySQL Conference 2007 with CEO Mårten Mickos’ keynote talk Welcome and State of MySQL AB.

    Here are some of the key points that made an impression on me.


    “The Participatory & Disruptive spirit of the Dolphin.”.

    Open Source disrupts inefficient models and produces new wealth.

    Architecture of Participation.

    • GPL and FOSS licenses
    • Google
    • del.icio.us

    MySQL Architecture of Participation
    You have the forge, the planet, the community contributor program and many more.

    Production of Amateurs

    • You Tube
    • Second Life
    • Wikipedia

    Some really great Quotes:

    “Fail often to succeed soon.” IDEO company slogan

    “Noah’s Ark was built by amateurs, but the Titanic by professionals.”

    Innovation Happens Here

    MySQL Monitoring & Advisory Service

    In his presentation of the MySQL Network Enterprise Dashboard, if you were quick you would have noticed the MySQL version 6.0.0-alpha-pb518.log.

    Leading Applications

    • Open Source ISVs
    • Web
    • Web 2.0
    • On-Demand
    • Hosting
    • Telco & VOIP
    • Security
    • Enterprise 2.0
    • Hub & Spoke

    We want to be known as The Online Database.

    Drawn to the Lamp

    • Microsoft
    • Oracle
    • IBM
    • Sun
    • HP
    • Dell
    • Unisys

    They all have an Open Source strategy; they develop open source products, and they use and partner with MySQL.

    He also mentioned MySQL Enterprise Connection Alliance – MECA.

    Global Distributed Organization

    • 70% work from home
    • 100 major locations in nearly 30 different countries.
    • There is a great focus on culture – global culture.

    Disruptive Businesses

    • A smarter way to produce the goods
    • A smarter way to distribute them
    • Serve the underserved
    • Keep it simple
    • Innovate with technology to give customers more convenience
    • Make money
    • Gartner: 66% of enterprises are deploying MySQL or planning to
    • 750,000 newsletter subscribers
    • MySQL installed in over 11 million installations

    Going Forward

    • Scalability
    • Availability
    • Manageability
    MySQL – The best online database in the world.

    MySQL Conference – Rewarding the Community

    At MySQL Conference 2007, CEO Mårten Mickos, in his opening keynote Welcome and State of MySQL AB, rewarded the community — those that contributed to “The best database in the world”.

    2007 MySQL Applications of the Year
    #1 in Online Video
    #1 in 3G Mobile Entertainment
    #1 in Creative Software

    And the winners: YouTube, Amp’d Mobile, and Adobe

    2007 MySQL Partners of the Year
    #1 reseller of MySQL Enterprise to govt
    #1 in MySQL Enterprise integration
    #1 in Open Source

    And the winners: Carahsoft, HP, and Red Hat

    2007 Community Members of the Year
    Quality Contributor
    Community Code Contributor
    Community Advocate

    And the Winners

    Martin Friebe
    Paul McCullagh
    Sheeri Kritzer

    MySQL Conference – Scaling and High Availability Architectures Tutorial

    My first tutorial today at MySQL Conference 2007 is Scaling and High Availability Architectures by Jeremy Cole and Eric Bergen of Proven Scaling.

    Basic Tenets

    While not discussed explicitly, the premise is to cache everything. Memcache is a key component of any scalable system.

    Lifetime of a scalable system

    Using the analogy of a newborn child, Jeremy stepped us through the categories Newborn, Toddler, Teenager, Late teens to 20s, and Adult.

    Late teens to 20s is where most systems die a slow death — he termed it “the awkward stage”. This is where scalability is critical, and a meltdown, for example, can ruin you. Downtime is also just not acceptable to your user community.

    When you’re an Adult you need to perfect the ability to deploy incremental changes to your application seamlessly.

    As the system grows, optimizations that may have worked earlier can now be hurting your system. It’s important to revisit them during each stage.

    Partitioning

    Most applications mainly implement a horizontal partitioning model. Different components of your system can be scaled by a “partition key”. The different models include fixed “hash key” partitioning, dynamic “directory” partitioning, partitioning by “group” and partitioning by “user”.

    The Dynamic “directory” is a lot harder to implement, but is ultimately more scalable.

    One of the partitioning difficulties is inter-partition interactions. Ultimately the solution is duplicating meta-data or duplicating data. Overall reporting is also more difficult: what if we want the average number of users per location, but we partition by user? Most systems are user driven and partition by user. A newer strategy is to partition by group.

    For implementing a Fixed Hash Key partitioning.

    • Divide data into B buckets
    • Divide the B buckets over M machines

    You define 1024 physical buckets (easily divisible), numbered 0-1023 (user_id % 1024), coded by range to physical machines: 0-255, 256-511, 512-767, 768-1023. The plus side is it’s very easy to implement, and you can always derive where something is. The biggest problem is scalability, e.g. going from 4 machines to 5. You also don’t have any fine-grained control over buckets.
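    A quick worked example of the bucket arithmetic (the user id is made up):

        -- 1024 fixed buckets, keyed on user_id
        SELECT 42424242 % 1024;   -- = 946

        -- Application-side range map of buckets to 4 machines:
        --   0-255 -> db1, 256-511 -> db2, 512-767 -> db3, 768-1023 -> db4
        -- so user 42424242 (bucket 946) lives on db4.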

    For dynamic directory partitioning you maintain a database of mappings to partitions. A user can easily be moved at a later date, at a much finer grain. MySQL Cluster is designed for this type of application. It is not necessary however; a well-configured InnoDB hardware solution with memcache can easily provide the same functionality. The only writes are new users or updates to partition keys, with a lot of reads.
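    A sketch of what such a directory might look like (the schema is illustrative, not from the tutorial):

        CREATE TABLE user_directory (
          user_id      INT UNSIGNED NOT NULL PRIMARY KEY,
          partition_id SMALLINT UNSIGNED NOT NULL  -- which shard holds this user
        ) ENGINE=InnoDB;

        -- One read per request, ideally served from memcache:
        SELECT partition_id FROM user_directory WHERE user_id = 42424242;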

    HiveDB

    This open source product implements a “standard” partition-by-key MySQL system, written in Java.
    Many organizations have built somewhat similar systems; this is an example of one that’s been open sourced.

    More information at www.hivedb.org.

    The Hive API should be the only code that needs to be re-written in the application development language (e.g. PHP, Ruby) when needed.

    High Availability

    The obvious goals.

    • Avoid downtime due to failures.
    • No single point of failure.
    • Extremely fast failover.
    • No dependency on DNS changes.
    • No dependency on code changes.
    • Painless and seamless failover.
    • Fail-back must be just as painless.

    The overall objective is speed.

    Understanding MySQL Replication is important to understanding HA options.

    MySQL replication is master-slave, one-way, asynchronous replication.

    • Slave requests binary logs from last position.
    • Master sends binary logs up to current time.
    • Master keeps sending binary logs in real-time.

    Dual master provides an easy configuration for failover; it doesn’t provide benefits in throughput. It can help solve online schema changes without downtime, assuming existing queries will perform with both the pre and post schema. (SET SESSION sql_log_bin=0 for the session is the tip; see the sketch below.) There are a number of caveats.
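    A sketch of the schema-change trick (the ALTER is a made-up example; the SET statement is standard MySQL):

        -- On the passive master, apply the change without binlogging it,
        -- so it does not replicate to the active master:
        SET SESSION sql_log_bin = 0;
        ALTER TABLE users ADD COLUMN last_login DATETIME NULL;
        SET SESSION sql_log_bin = 1;

        -- Then switch application traffic to this master and repeat
        -- the ALTER on the other one.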

    Ultimately for High Availability you have a trade-off between (minute amounts of) data loss and scalability.

    SHOW PROFILE

    I’ve been playing more with the SHOW PROFILE command, available as part of MySQL Community 5.0.37, thanks to author Jeremy Cole. This command can provide some interesting insight into the workings of MySQL. Like most new releases of software, it also prompts users to make suggestions for new features and functionality. Here is my wish list. Some I figure are practical, some are wish list, but if you never ask you never know.

    1. The unit of measure for duration is seconds. It would be great if the display could be tuned, say to milliseconds. If you look at my first example, all figures are effectively in millisecond or even microsecond granularity.
    2. I would like to see a total for the query. Again in Example 1, you have to add up all the figures to determine this query took 8ms.
    3. Again in Example 1, perhaps a percentage of total time for each line’s duration would be helpful.
    4. More descriptive status descriptions (this is part of the MySQL internal code and not the patch).
    5. SET PROFILING=1 can only be set for the current session, making it impossible to easily monitor a JDBC multi-connection test. There needs to be a way to enable profiling for a session other than the current interactive one, and also to see its results. You can’t do a SHOW PROFILE via a JDBC connection!
    6. I’d like to see a threshold, so queries running under the threshold are discarded, much like the long-query-time option. This would enable you to run a large number of SQL statements and only log profiles for the longer running ones.
    7. I’d like to see a level of logging to file, again like the slow query log, so you can simply gather information on a working system and review it at some later time. Combined with the previous point, you would then have a microsecond slow query log with explicit details.

    One major benefit of the SHOW PROFILE command is that I can accurately get a figure for how long a query is taking (time in milliseconds). You just have to sum all the figures (see wish list point 2).

    By default, the source details are not provided; you need to specify the SOURCE operand, which helps both in comparing with any debugging output and in tracing through the code. As in Example 1, I needed to find why 95% of the time was in a step with the most descriptive line of ‘end’.
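    For reference, the basic usage pattern looks like this (the profiled query and query id are arbitrary):

        SET PROFILING = 1;
        SELECT COUNT(*) FROM mysql.user;    -- any query you want profiled
        SHOW PROFILES;                      -- lists recent queries and their ids
        SHOW PROFILE SOURCE FOR QUERY 4;    -- source detail for one query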

    Example 1

    mysql> show profile SOURCE,MEMORY for query 4;
    +--------------------+------------+-----------------------+---------------+-------------+
    | Status             | Duration   | Source_function       | Source_file   | Source_line |
    +--------------------+------------+-----------------------+---------------+-------------+
    | Opening tables     | 0.00013200 | open_tables           | sql_base.cc   |        2106 |
    | System lock        | 0.00001800 | mysql_lock_tables     | lock.cc       |         153 |
    | Table lock         | 0.00000600 | mysql_lock_tables     | lock.cc       |         162 |
    | init               | 0.00001300 | mysql_select          | sql_select.cc |        2073 |
    | optimizing         | 0.00004800 | optimize              | sql_select.cc |         617 |
    | statistics         | 0.00002500 | optimize              | sql_select.cc |         773 |
    | preparing          | 0.00005200 | optimize              | sql_select.cc |         783 |
    | executing          | 0.00002200 | exec                  | sql_select.cc |        1407 |
    | Sending data       | 0.00000500 | exec                  | sql_select.cc |        1925 |
    | end                | 0.00786600 | mysql_select          | sql_select.cc |        2118 |
    | query end          | 0.00001400 | mysql_execute_command | sql_parse.cc  |        5085 |
    | freeing items      | 0.00000700 | mysql_parse           | sql_parse.cc  |        5973 |
    | closing tables     | 0.00001900 | dispatch_command      | sql_parse.cc  |        2120 |
    | logging slow query | 0.00001000 | log_slow_statement    | sql_parse.cc  |        2178 |
    | cleaning up        | 0.00000500 | dispatch_command      | sql_parse.cc  |        2143 |
    +--------------------+------------+-----------------------+---------------+-------------+
    15 rows in set (0.01 sec)
    

    Example 2

    We were experiencing increased latency in JDBC with a particular query. With a recommendation from MySQL Support, we tried SET SQL_BUFFER_RESULT=1;

    mysql> show profile source for query 14;
    +------------------------------+------------+-------------------------+---------------+-------------+
    | Status                       | Duration   | Source_function         | Source_file   | Source_line |
    +------------------------------+------------+-------------------------+---------------+-------------+
    | Opening tables               | 0.00006025 | open_tables             | sql_base.cc   |        2106 |
    | System lock                  | 0.00004875 | mysql_lock_tables       | lock.cc       |         153 |
    | Table lock                   | 0.00000400 | mysql_lock_tables       | lock.cc       |         162 |
    | init                         | 0.00001600 | mysql_select            | sql_select.cc |        2073 |
    | optimizing                   | 0.00005675 | optimize                | sql_select.cc |         617 |
    | statistics                   | 0.00001250 | optimize                | sql_select.cc |         773 |
    | preparing                    | 0.00005175 | optimize                | sql_select.cc |         783 |
    | Creating tmp table           | 0.00001275 | optimize                | sql_select.cc |        1206 |
    | executing                    | 0.00006025 | exec                    | sql_select.cc |        1407 |
    | Copying to tmp table         | 0.00000400 | exec                    | sql_select.cc |        1547 |
    | converting HEAP to MyISAM    | 0.04820900 | create_myisam_from_heap | sql_select.cc |        9914 |
    | Copying to tmp table on disk | 0.04049075 | create_myisam_from_heap | sql_select.cc |        9968 |
    | Sending data                 | 1.29302000 | exec                    | sql_select.cc |        1925 |
    | end                          | 0.09398425 | mysql_select            | sql_select.cc |        2118 |
    | removing tmp table           | 0.00004975 | free_tmp_table          | sql_select.cc |        9856 |
    | end                          | 0.00089125 | free_tmp_table          | sql_select.cc |        9884 |
    | query end                    | 0.00001850 | mysql_execute_command   | sql_parse.cc  |        5085 |
    | freeing items                | 0.00000825 | mysql_parse             | sql_parse.cc  |        5973 |
    | closing tables               | 0.00003425 | dispatch_command        | sql_parse.cc  |        2120 |
    | logging slow query           | 0.00001325 | log_slow_statement      | sql_parse.cc  |        2178 |
    | cleaning up                  | 0.00000675 | dispatch_command        | sql_parse.cc  |        2143 |
    +------------------------------+------------+-------------------------+---------------+-------------+
    21 rows in set (0.00 sec)
    

    Looking at the lines helped to indicate that the temporary table was being flushed to disk, showing we needed to add SET SESSION tmp_table_size=20*1024*1024;

    mysql> show profile source for query 18;
    +----------------------+------------+-----------------------+---------------+-------------+
    | Status               | Duration   | Source_function       | Source_file   | Source_line |
    +----------------------+------------+-----------------------+---------------+-------------+
    | Opening tables       | 0.00006050 | open_tables           | sql_base.cc   |        2106 |
    | System lock          | 0.00001250 | mysql_lock_tables     | lock.cc       |         153 |
    | Table lock           | 0.00000400 | mysql_lock_tables     | lock.cc       |         162 |
    | init                 | 0.00000775 | mysql_select          | sql_select.cc |        2073 |
    | optimizing           | 0.00005475 | optimize              | sql_select.cc |         617 |
    | statistics           | 0.00001225 | optimize              | sql_select.cc |         773 |
    | preparing            | 0.00005075 | optimize              | sql_select.cc |         783 |
    | Creating tmp table   | 0.00001350 | optimize              | sql_select.cc |        1206 |
    | executing            | 0.00006125 | exec                  | sql_select.cc |        1407 |
    | Copying to tmp table | 0.00000375 | exec                  | sql_select.cc |        1547 |
    | Sending data         | 0.29110925 | exec                  | sql_select.cc |        1925 |
    | end                  | 0.08023800 | mysql_select          | sql_select.cc |        2118 |
    | removing tmp table   | 0.00001525 | free_tmp_table        | sql_select.cc |        9856 |
    | end                  | 0.05971400 | free_tmp_table        | sql_select.cc |        9884 |
    | query end            | 0.00001925 | mysql_execute_command | sql_parse.cc  |        5085 |
    | freeing items        | 0.00000425 | mysql_parse           | sql_parse.cc  |        5973 |
    | closing tables       | 0.00004625 | dispatch_command      | sql_parse.cc  |        2120 |
    | logging slow query   | 0.00000800 | log_slow_statement    | sql_parse.cc  |        2178 |
    | cleaning up          | 0.00000300 | dispatch_command      | sql_parse.cc  |        2143 |
    +----------------------+------------+-----------------------+---------------+-------------+
    19 rows in set (0.00 sec)
    

    T60 Configuration

    This is a long story of the absolute nightmare I have had with this lemon. I wish I could send it back!

    I got my new Lenovo T60 from MySQL.

    My goal of course is to put Linux on this and I’ve chosen the latest Ubuntu 6.10 switching from Fedora Core 5 on my present personal laptop.

    Some initial investigation of the supplied media provided some surprises. Of an 80GB drive, only 69GB is available; there is a second partition of 5+GB which I later found out to be the IBM Recovery Area, as they don’t provide any CDs of software. So the specs say an 80GB drive, but it’s actually only 74.53GB via gparted. In addition, the installation of Windows XP Pro has so much other crap it takes a few minutes to fully boot up.

    So here is what I did to make this machine usable.

    Backup

    As I hadn’t installed anything, the only thing to back up was the Recovery Media, so I can restore to defaults if necessary. You can do this with:

    • Start | Programs | ThinkVantage | Create Recovery Media
    • Select Recovery Discs
    • You will need 1 CD + 1 DVD

    NOTE: Don’t lose these; you can’t select these options again to make another copy. This should be done as soon as you get the machine regardless, because if your hard drive crashes you lose it all.

    My next step was to install Ubuntu. I plan to reinstall Windows.
    Resize via Ubuntu to 20GB

    Reconfigure SATA Drive

    When I ran the XP Pro Installation CD I received the following error.

    Setup did not find any hard drives installed in your computer.
    Make sure any hard disk drives are powered on and properly connected to your computer, and that any disk-related hardware configuration is correct.
    This may involve running a manufacturer-supplied diagnostic or setup program.
    

    In order to reinstall XP via CD you need to reconfigure the BIOS to detect your hard drive appropriately. This can be achieved by:

    • Power on
    • ThinkVantage Button
    • F1 for Bios setup
    • Select Config
    • Select Serial ATA (SATA)
    • Change value (F5) to Compatibility
    • F10 – Save and Exit
    • Yes – Save Configuration and Exit

    Windows Installation

    The following software then had to be installed.

    • Windows XP Pro
    • Windows XP Pro Service Pack 2
    • Office 2003
    • Office 2003 Service Pack 2

    Following the default Windows XP installation most devices were not found. The Lenovo website lists 100s of software updates for the T60, so I decided the simplest solution was to use the provided ThinkVantage System Update, but that also took a number of steps.

    This then enabled wired Internet access (for some reason the download of the wireless driver didn’t help, as there was no Other Device that matched the wireless device).

    ThinkVantage System Update: follow the instructions. I had to install the Microsoft .NET Framework as stated. In true and typical form, I had to restart my machine two times (when will Windoze actually get this right?).

    Disk Crash

    An install that hung.

    Two times when I booted, both the keyboard and mouse would not work.

    Called Lenovo Technical Support:
    “The protocols are not the same. I need to call Microsoft Support.”

    Top 10 Things for IT Professionals

    These IT related lists are really quite accurate. I sound like a broken record sometimes when I repeat these things. The articles provide very good detailed descriptions; I’ve included the bullet points just to tempt you to read more.

    Top ten things ten years of professional software development has taught me.

    1. Object orientation is much harder than you think
    2. The difficult part of software development is communication
    3. Learn to say no
    4. If everything is equally important, then nothing is important
    5. Don’t over-think a problem
    6. Dive really deep into something, but don’t get hung up
    7. Learn about the other parts of the software development machine
    8. Your colleagues are your best teachers
    9. It all comes down to working software
    10. Some people are assholes

    The Top 10 Things They Never Taught Me in Design School.

    1. Talent is one-third of the success equation.
    2. 95 percent of any creative profession is shit work
    3. If everything is equally important, then nothing is very important.
    4. Don’t over-think a problem.
    5. Start with what you know; then remove the unknowns.
    6. Don’t forget your goal.
    7. When you throw your weight around, you usually fall off balance.
    8. The road to hell is paved with good intentions; or, no good deed goes unpunished.
    9. It all comes down to output.
    10. The rest of the world counts.

    That missing INNODB STATUS

    On Thursday I saw something I’d not seen before: an empty InnoDB status. Given the amount of output normally shown, it was certainly a first. It looked like:

    mysql> SHOW ENGINE INNODB STATUS;
    +--------+------+--------+
    | Type   | Name | Status |
    +--------+------+--------+
    | InnoDB |      |        |
    +--------+------+--------+
    1 row in set (0.03 sec)
    

    To answer some of the most obvious questions.

    • Yes, it was a working existing MySQL instance, with InnoDB correctly configured. Indeed we had been benchmarking for several hours.
    • The MySQL server was running; indeed a command selecting data from the mysql schema worked just fine after seeing this (all other tables were InnoDB).
    • Absolutely nothing in the host MySQL error log. (This was the second most disappointing aspect.)
    • The process list showed two queries that had been running for some time; everything was taking > 1 second. (This was the most disappointing.)

    So the problem is, MySQL seemed to have effectively hung when dealing with queries solely on InnoDB tables. Closer investigation found that another application process had filled the /tmp file system. Reclaiming space didn’t cause MySQL and InnoDB to start operating again. Even a shutdown of MySQL failed, with mysqld having to be killed manually.

    For those super inquisitive the version was 5.1.16-ndb-6.2.0-log, and yes it is a Cluster release. I’ve yet to test the problem on a normal 5.1 version and log a bug appropriately if it exists.

    I suspect in our benchmark we definitely need to include some timeout handling so the queries would fail (they were both UPDATEs), but it did have the customer asking why, to which there was no answer.

    Watching Replication in action

    For all those instant GUI people out there, there is an easy way to watch the present status of your MySQL slaves using the watch command.

    $ watch -n 1 -d "mysql -uroot -pxxxx mysql -e 'SHOW SLAVE STATUS\G'"
    

    The watch command provides a view of a file or command output, showing interval updates to this output (the -n <seconds> option). You can also specify a granularity better than one second, for example 0.5. -d also highlights the differences for you. So while you see the following output with your SHOW SLAVE STATUS, on a loaded system you will also see binlog and relay log positions change, and perhaps Seconds_Behind_Master.

    The question is, Why is Seconds_Behind_Master the last column in this display?


    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: localhost
    Master_User: repl
    Master_Port: 10002
    Connect_Retry: 60
    Master_Log_File: master-bin.000006
    Read_Master_Log_Pos: 102
    Relay_Log_File: newyork-relay-bin.000055
    Relay_Log_Pos: 244
    Relay_Master_Log_File: master-bin.000006
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 102
    Relay_Log_Space: 539
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: 0

    Smarter indexing for column LIKE '%string%'

    With my very heavy travel and skilling load I’ve not had time to scratch myself. That hasn’t stopped the brain working overtime on various issues, including the classic problem of finding a pattern in a string starting with a wildcard character. On a recent gig I saw the true classic:


    SELECT columns
    FROM users
    WHERE username LIKE '%str%'
    OR firstname LIKE '%str%'
    OR lastname LIKE '%str%'

    I went through the various options and comments on leading ‘%’, ORs, combined columns, FULLTEXT (which doesn’t work in this case), merge indexing etc. However it perplexed me that nobody has really solved this problem, or at least shared their solutions.

    I have an idea, a theory and while I’d love to prove/disprove it, I simply just don’t have the time. So here are my notes, hopefully somebody can comment positively/negatively, do some research, or encourage me to pursue it more.

    The Problem

    The problem is quite simply the leading wildcard, Der!. So how do you eliminate the wildcard from the search?

    My Idea

    Working with the earlier example, and having a concatenated field of the three components (username, firstname, lastname), my idea is to encode the string into 27 bits: a bit for each alphabetic character (A-Z) found, and bit 27 for any non-alphabetic character other than whitespace.

    This column is then indexed and searched for bitwise matches. Here is a pictorial description.

    String              Bits set (letters present)
    ‘Ronald Bradford’   a b d f l n o r
    Search ‘brad’       a b d r          → all bits present: *** MATCH ***
    Search ‘fred’       d e f r          → ‘e’ bit not set: NO MATCH

    My idea is very simple, the question is, will it actually work.

    The Questions

    The goal is quite obviously to get an Index utilization, to a maximum say of 10%-25% of matching rows. Factors that would affect this include:

    • If the search string is only one character, like a vowel, I can see this not being useful, so an implied criterion for optimal use is at least 2 or 3 characters.
    • I see the effectiveness being lost on large column values; it could be that only up to say 30 characters is optimal, and I expect strings of 100+ characters to be ineffective.
    • It doesn’t support case-sensitive searching or non-ASCII searching.

    The Tests

    • Create a good sized table and distribution of first/last names (an IMDB actors table would work)
    • Create a concatenated column of searchable fields (e.g. details = CONCAT_WS(‘ ‘, firstname, lastname))
    • Create a function to return the bitwise search value (e.g. bitize(str); see the sketch after this list)
    • Create an indexed bitwise column and pre-populate it accordingly (e.g. bitfulldetails)
    • Create BEFORE INSERT/BEFORE UPDATE triggers to populate the indexed bitwise column
    • Test the sucker (of course you will need to include the actual LIKE condition as well in the WHERE clause)
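    As a sketch only, here is how bitize() might be written as a stored function (my own illustrative code for the idea above, not something I have benchmarked):

        DELIMITER $$
        CREATE FUNCTION bitize(str VARCHAR(255)) RETURNS INT UNSIGNED
        DETERMINISTIC
        BEGIN
          DECLARE mask INT UNSIGNED DEFAULT 0;
          DECLARE i INT DEFAULT 1;
          DECLARE c CHAR(1);
          WHILE i <= CHAR_LENGTH(str) DO
            SET c = LOWER(SUBSTRING(str, i, 1));
            IF c BETWEEN 'a' AND 'z' THEN
              SET mask = mask | (1 << (ORD(c) - ORD('a')));  -- one bit per letter
            ELSEIF c <> ' ' THEN
              SET mask = mask | (1 << 26);                   -- bit 27: any other character
            END IF;
            SET i = i + 1;
          END WHILE;
          RETURN mask;
        END$$
        DELIMITER ;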

    A sample query would then be (note the bitwise test must require all of the search string’s bits to be present, not just any of them):


    SELECT columns
    FROM users
    WHERE (bitize('str') & bitfulldetails) = bitize('str')
    AND fulldetails LIKE '%str%'

    So the challenge to all those budding MySQL Gurus, does it seem plausible?

    What is the maximum number of colons ':' that may appear in a valid URL?

    In idle conversation I was asked by MM.

    Question: What is the maximum number of colons ‘:’ that may appear in a valid URL?

    * If you said zero, then you are a victim of browsers, and you have never used anything but a browser.

    * If you said one, then you’re a novice.

    * If you said two, then you have probably seen http://host:port at some time.

    * If you said three, then you would be correct — the elite.

    http://user:pass@host:port/location

    For the record my initial answer was 2.

    CU@UC07


    I’ll be speaking at the upcoming 2007 MySQL Conference & Expo (why they dropped the word User, who knows), this time with Guy Harrison (author of many books including MySQL Stored Procedures). We will be talking on MySQL for Oracle DBAs and Developers.

    Anyway, good friend Paul McCullagh, creator of PBXT, will also be speaking on PrimeBase XT: Design and Implementation of a Transactional Storage Engine. He coined to me in an email “CU at the UC”. I’ve done a further level of refactoring, and added marketing. You can buy the shirt online here. (More colors including black and more products coming; if you want it now, please ask.)

    Using Innodb Primary Keys wisely

    At a customer site recently I came across the following table definition for an InnoDB table: 140 columns, a 3-part primary key totaling 44 bytes, and 2 million rows giving over 900MB in data size. And it had 15 indexes, totaling over 2.3GB in size. Add into the mix a Windows Server 2003 OS, a 640MB innodb_buffer_pool_size setting, and table scans out the wazoo. This all leads to very poor performance.

    It is generally considered best practice for InnoDB to use the shortest primary key possible, and there is a clear reason for this: InnoDB stores the full primary key with every secondary index entry. So if an indexed column was 4 bytes in length, in this example the index row would be 48 bytes (before overhead). Fortunately an easy solution presented itself: because of this index storage requirement, InnoDB will create an internal 6-byte primary key if none exists for a table. I had of course known about this but had never tried it in a production situation. I come from the old school where every table is defined with a primary key.

    So an ALTER TABLE [name] DROP PRIMARY KEY results in a long wait, and a reduction in the index size to 900MB. WOOT! Now, on closer analysis, the primary key is the primary key because it’s the unique requirement for the table. No problem, I just add a unique key to replace the previously dropped primary key. A check of the index size showed 2.3GB again. What the!

    It seems, if you read the fine print of the MySQL documentation for InnoDB table structures, there is an exception clause: if you do not define a PRIMARY KEY for your table, MySQL picks the first UNIQUE index that has only NOT NULL columns and InnoDB uses it as the clustered index.

    In my situation, by adding a Unique Key this was in turn converted internally to the new Primary Key. Drat! So to the Innodb developers out there. I’d like to see a way for the internally generated key to remain in this situation, or at least provide the ability for the designer to choose this capability.

    The only real solution is to physically create an INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY column definition. This may not seem like a big thing, but the customer did not want to make any schema changes. However it was a necessity, simply as the first step to get reasonable performance.
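    A sketch of the change (the table and old key columns are hypothetical stand-ins for the customer’s schema):

        ALTER TABLE big_table
          DROP PRIMARY KEY,
          ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT,
          ADD PRIMARY KEY (id),
          ADD UNIQUE KEY uk_old_pk (col_a, col_b, col_c);  -- the old 3-part unique key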

    This was only the first part of the problem. Solving the full table scans via creative indexing was not possible, and code changes were out of the question for an immediate fix. The real problem was the size of the innodb_buffer_pool_size: it was just too small to handle both the table data and index pages in memory. Perfmon analysis showed the machine simply went into intensive disk I/O every time any queries were run. On a Linux system with just a database and InnoDB tables, it can be recommended that 70-80% of available memory be allocated to the innodb_buffer_pool_size.

    Alas this is 32-bit Windows, and there is an implied 2GB memory limit for any process, so the best one could manage in this situation was 1600MB.
    The long and the short of it was that, even with poor database design, an immediate performance improvement occurred with an optimal primary key and sufficient memory allocated to the correct MySQL buffers. This is only a temporary fix for a greater problem.

    MySQL Camp T-Shirts


    For those that attended the MySQL Camp at Google HQ late last year, you may have seen me with my own T-shirt designs. A number of people inquired about getting them. I’ve finally gotten around to making them available online, so anybody that would like one can order online.

    There are two different shirts. If you want your name on the shirt, you need to make sure you choose the correct one.

    • Early Adopters – For those that were the first 48 that signed up, your name as well as position and company are on the shirt.
    • The Herd – For everybody that registered on the website, your name is on the shirt.

    Ok, I’ve already been asked why 48. This was the number of registrants when I got the shirt made back in Australia, a week or so before the Camp.

    There are also plenty more of my MySQL designs at my MySQL Wear Store.

    For those that also liked the runner-up pin “A mouse is a waste of a perfectly good hand”, you can also get this in its original graphical shirt design at Geek Cool – CLI.

    How's your desktop

    People that know me know that I can’t just do one thing at once. People that have seen me work also know how I can’t just work on a single thing, but work with multiple monitors, desktops and programs everywhere. (Power to my Dell 24″ widescreen LCD with 1920×1200 resolution.) However I must bow when I see another professional of equal skill, though this does border on lunacy.

    I had to print-screen friend and fellow MySQLer Frank’s desktop; it just defied a sense of normality, yet he worked away quite happily. In some ways I’m also amazed that Windoze actually functioned.

    Take a close look at the number of open Firefox windows and putty sessions.