MySQL conference schedule

I am one of the crazy individuals(*) who will be speaking at both the regular O’Reilly MySQL Conference and the IOUG Collaborate conference, both being held in the second week of April. My 4 presentations are:

2011 MySQL Conferences

Next year will mark a significant change for the MySQL community. At least three major conferences will have dedicated MySQL content, which is great for attendees who want the best information on how to use MySQL from the experts in the field.

O’Reilly MySQL Conference & Expo

The 9th Annual MySQL conference will be held at its usual home of recent years. Colin will again be back as committee chair for a 3rd year, and this will be my 6th straight MySQL conference.

Date: April 11 – 14, 2011
Location: Hyatt Regency, Santa Clara, California
Website: There is no website at this time
Call for Papers: There are no details for call for papers
Program Chairs: Colin Charles from Monty Program AB and Brian Aker.

Collaborate 11

Collaborate is a larger conference (4,000-5,000 attendees) that is actually three separate conferences in one, run by the IOUG, OAUG and Quest. The IOUG content generally focuses on Oracle DBAs. Last year was the first with any MySQL sessions, and this year Collaborate will have dedicated MySQL tracks chaired by fellow ACE Director Sheeri Cabral, who is well known for her work in the MySQL community.

Date: April 10 – 14, 2011
Location: Orange County Convention Center West, Orlando, Florida
Website: http://collaborate11.ioug.org/
Call for Papers: Now open. Closes Friday October 1, 2010
Program Chair: Sheeri Cabral

KScope 11

ODTUG Kaleidoscope (Kscope for short) is a conference (1500 attendees) that is very focused on delivering the best content from the top community contributors for the community’s benefit. 2010 was my first Kaleidoscope conference and I felt completely at home. Great people, great events and the best conference food I’ve had in many years.

Following the first dedicated MySQL track in 2010, I will again be the MySQL Program Chair in 2011, with an extended format for the MySQL developer and DBA. The focus will be the best way to develop successful applications with MySQL and will include Architecture, Performance Tuning, Best Practices, Case Studies and Hands-On streams.

Date: June 26 – 30, 2011
Website: http://kscope11.com
Location: Long Beach, California
Call for Papers: Closes Tuesday October 26, 2010
Program Chair: Ronald Bradford – Independent Consultant

Recap

2010 is also not over. MySQL Sunday at OOW promises to be a great event in San Francisco in under 2 weeks. You can still register at a very cheap price of $75 for 4 dedicated tracks of MySQL content. Open SQL Camp, also being organized by Sheeri, will be held in Boston in October and will continue the tradition of a small but focused and free event for the MySQL community.

Upcoming Conferences with dedicated MySQL content

We recently held a dedicated MySQL Track at the ODTUG Kaleidoscope 2010 conference for 4 days. This is the first of many Oracle events that will begin to include dedicated MySQL content.

If you’re attending OSCON 2010 in the next few weeks you will see a number of MySQL presentations.

MySQL will be represented at Open World 2010 in September with MySQL Sunday. Giuseppe has created a great one page summary of speakers. This event is described as technical sessions, an un-conference and a fireside chat with Edward Screven. I’ve seen tickets listed at $50 or $75 for the day.

Open SQL Camp will be held in Germany in August, and Boston in October. This is a great FREE event that includes technical content not just on MySQL but other open source databases and data stores.

You will also find dedicated MySQL tracks in Europe at the German Oracle Users Group (DOAG) conference in November and the United Kingdom Oracle Users Group (UKOUG) in November that I am planning on attending.

In 2011 there is already a lineup of events that will all contain multiple tracks of MySQL content.

For the MySQL community the introduction of various large Oracle conferences may be confusing. From my perspective, I describe the big three as follows:

  • Oracle Open World is targeted towards marketing. This includes product announcements, case studies and first class events.
  • Collaborate is targeted towards deployment and includes 3 different user groups, the IOUG representing the Oracle Database, the Oracle Applications User Group, and the Quest Group.
  • ODTUG Kaleidoscope is targeted towards development. This includes the tools and technologies that developers and DBAs need to do their jobs.

Having just attended Kaleidoscope 2010 as a relative unknown, I left with a great impression of an open, technical and welcoming event. There was a great atmosphere, great events with excellent food for breakfast, lunch and dinner, and I now have a long list of new friends. This conference very much reflected being part of a greater extended family, the experience I have enjoyed at previous MySQL conferences. I’ve already committed to being involved next year.

2010 MySQL Conference Presentations

I have uploaded my three presentations from the 2010 MySQL Users Conference in Santa Clara, California, which was my 5th consecutive year appearing as a speaker.

A full history of my MySQL presentations can be found on the Presenting page.

My acceptance with Oracle as ACE Director

I hinted last week at my acceptance with Oracle before the formal announcement this week at the MySQL Users Conference, not for a job but as an Oracle ACE Director. In today’s State of the MySQL Community keynote by Kaj Arnö, I was named as one of the first three MySQL nominees who are now part of this program.

What exactly is an ACE Director? Using the description from the Oracle website:

Oracle ACEs and Oracle ACE Directors are known for their strong credentials as Oracle community enthusiasts and advocates, with candidates nominated by anyone in the Oracle Technology and Applications communities. The baseline requirements are the same for both designations; however, Oracle ACE Directors work more closely and formally with Oracle in terms of their community activity.

What does this mean to me?

As a significant contributor to the community I now have the opportunity to continue as well as to contribute to how Oracle continues to interact, promote and involve the MySQL community. As stewards our role as an Oracle ACE Director is to be actively involved. I look forward to the challenge to help shape and improve our State of the MySQL Community.

News and References
Welcome, Oracle ACE Directors for MySQL

State of the Dolphin – Opening keynote

Edward Screven – Chief Corporate Architect of Oracle provided the opening keynote at the 2010 MySQL Users Conference.

Overall I was disappointed. The first half was more of an Oracle sales pitch, then we had some product announcements and some 5.5 performance buzz. While a few numbers and features were indeed great to hear, there was a clear lack of information for the MySQL ecosystem, including employees, alumni and various support services. I hope more is unveiled this week.

Some notes from the session:

  • Oracle’s Strategy covers storage, servers, virtual machines, operating system, database, middleware, applications
  • We build a complete technology stack that is “open” and “integrated” based on “open standards”
  • products talk via open standards with the intention for customers to not feel locked in to any technology
  • Examples include apache, java, linux, xen, eclipse, and innodb
  • Unbreakable linux has now over 4,500 customers

After the sales pitch we got down to more about MySQL.

What does MySQL mean to Oracle? We make the Oracle solution more complete as a stack for customers.

What is the investment in MySQL?

  • Make MySQL a better MySQL
  • Develop, promote and support MySQL
  • MySQL community edition

Integration with Oracle Enterprise Manager, Oracle Secure Backup and Oracle Audit Vault infrastructure. This I expected and have blogged about, so I’m glad to see this commitment.

MySQL 5.5 is now in Alpha. Some features are:

  • InnoDB will be default engine
  • Semi sync replication
  • Replication heartbeat
  • Signal
  • Performance Schema

MySQL 5.5 is planned to be faster, with both InnoDB performance improvements and MySQL performance improvements.
MySQL 5.5 sysbench claims: reads 200% faster, writes 364% faster.

MySQL Workbench 5.2 announcement

  • SQL Development
  • Database Administration
  • Data Modelling

MySQL Cluster 7.1 GA announcement

  • Improved Administration
  • Higher Performance
  • Carrier Grade Availability & Performance

MySQL Enterprise Backup announcement

  • Online backup for InnoDB only
  • Formerly InnoDB Hot Backup, with additional features including incremental backups

MySQL Enterprise Monitor 2.2 Beta announcement

In closing the statement was “MySQL lets Oracle be more complete at the database layer”. Is that good for the MySQL Community or better for the Oracle revenue model?

Edward Screven of Oracle to Answer Questions for future of MySQL

For those of you on the O’Reilly MySQL conference list you will no doubt see this email, but for readers here are the important bits.


Oracle Executive Will Speak at O’Reilly MySQL Conference & Expo
Edward Screven to Answer Questions re: Future of MySQL

Sebastopol, CA, February 24, 2010—Wonder about the future of MySQL? Curious about what Oracle plans for the open source database software? Expect answers when Edward Screven, Oracle’s chief corporate architect and leader of the MySQL business, speaks at the O’Reilly MySQL Conference & Expo, scheduled for April 12-15, at the Santa Clara Convention Center and the Hyatt Regency Santa Clara.

Edward Screven reports to CEO Larry Ellison, and he drives technology and architecture decisions across all Oracle products to ensure that product directions are consistent with Oracle’s overall strategy. He’ll discuss the current and future state of MySQL, now part of the Oracle family of products. His presentation will also cover Oracle’s investment in MySQL technology and community, as well as the role that open source in general is playing within heterogeneous customer environments around the world.

I have not found a link yet to provide reference to this.

Europe conference options for MySQL Developers

For those in the US the annual MySQL UC is taking place again in April. For those in Europe we have dedicated room for MySQL and MySQL related products/variants/branches at FOSDEM 2010 being held in Brussels, Belgium on 6-7 Feb.

This conference will feature a full day of talks with a format of 20 minutes presentation and 5 minutes Q&A. More information about submissions can be found at Call for Papers for “MySQL and Friends” Developer Room at FOSDEM 2010 now open!

Other references:

Updated

Wednesday January 6th is the last date for submissions. Extension for FOSDEM MySQL

Transcending Technology Specific Boundaries

I had the pleasure to sit on the Performance Panel at the recent Percona Performance Conference. While the panel contained a number of usual MySQL suspects, one person was not familiar, that being Cary Millsap from Method R.

An expert in optimizing Oracle performance, Cary also gave a session on Day 2 that I attended. While he opened by professing not to be an expert in MySQL, his talk provided valuable foundation knowledge irrespective of whether you use MySQL or another database product.

Having come myself from 7 straight years in system architecture and performance tuning in Ingres, then a further 6 years in Oracle again heavily involved in system architecture and performance tuning, a lot of my experience in the 10 years of providing my own MySQL consulting is drawn from my past RDBMS experiences. In addition much of what I actually provide to clients today is common sense that I don’t see applied.

A summary of the excellent content provided by Cary.

The common technology agnostic problem we need to address is:

  • Users say that everything is slow, but I don’t know where to begin
  • Users are complaining but all the monitoring dials are green

From a user’s perspective, their experience consists simply of two elements.

  1. Task
  2. Time

In general, business people simply don’t care about the “system” except through the specific tasks that make up their pressing business needs. And for these users, performance is all about the time to complete this task.

Throughput can be stated as tasks per time.
Response time is the time taken per task.
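To make these two definitions concrete, here is a minimal sketch. The measurements are hypothetical and the function names are my own, not something from Cary's talk.

```python
def throughput(tasks_completed, elapsed_seconds):
    """Tasks per unit of time."""
    return tasks_completed / elapsed_seconds


def mean_response_time(durations):
    """Average time taken per task."""
    return sum(durations) / len(durations)


# Hypothetical measurements: seconds taken by five executions of one task.
durations = [0.25, 0.25, 0.5, 0.5, 1.0]
# Hypothetical wall-clock window in which those five tasks completed
# (some of them overlapped, so it is shorter than the sum of durations).
wall_clock = 2.0

print(throughput(len(durations), wall_clock))  # tasks per second
print(mean_response_time(durations))           # seconds per task
```

Note that when tasks overlap, throughput is not simply the inverse of response time, which is one reason to measure both.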

Cary also quoted Donald Knuth: “The universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.”

Performance is easy if you stop guessing where your code is slow. A few best practice tips are:

  • You have to insist on seeing where time goes for any task you think is important
  • You need to look at the sequence diagram of the task
  • Identify which individual part takes the most time, then look at the task before that. The fastest way to do something is to not do it at all.
  • To drill down, you need to attack the skew of each part, not the average.
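As an illustration of why the skew matters more than the average, the following sketch uses invented timings for the steps of a single user task. The step names and numbers are assumptions for the example only.

```python
def profile(calls):
    """Return (step, total_seconds, call_count, slowest_call), largest total first."""
    rows = [(step, sum(t), len(t), max(t)) for step, t in calls.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical timings (seconds) for each step of one user task.
calls = {
    "render page": [0.02, 0.02, 0.02],
    "db query": [0.01] * 99 + [4.0],   # one slow call among 100
    "auth check": [0.05, 0.05],
}

for step, total, count, slowest in profile(calls):
    print(f"{step:12s} total={total:.2f}s calls={count:3d} "
          f"mean={total / count:.3f}s max={slowest:.2f}s")
```

In this made-up data the "db query" step has roughly the same mean per call as "auth check", yet a single call accounts for most of the total time. Looking only at the average would hide exactly the call worth attacking.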

In Summary the closing points were:

  • Performance is about time and tasks
  • Not all tasks are created equal
  • Read “The Goal”
  • Don’t guess, you’re probably wrong
  • Measure response time before you optimize anything – Insist on it

Performance is easy when code measures its own time and tasks. This closing statement on instrumentation I completely concur with.
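One possible way for code to measure its own time and tasks is sketched below. This is my own illustration, not code from the talk; the task names are hypothetical and `time.sleep` stands in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# task name -> list of elapsed seconds, one entry per execution
timings = defaultdict(list)


@contextmanager
def timed(task):
    """Record how long the enclosed block takes under the given task name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[task].append(time.perf_counter() - start)


def report():
    """Return (task, calls, total_seconds) rows, largest total first."""
    rows = [(task, len(t), sum(t)) for task, t in timings.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical tasks; the sleeps stand in for real work.
with timed("load_order"):
    time.sleep(0.02)
with timed("send_email"):
    time.sleep(0.001)

for task, calls, total in report():
    print(f"{task}: {calls} call(s), {total:.3f}s")
```

Because the timing is recorded in a `finally` clause, a task that raises an exception is still measured.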

Percona Performance Conference Talk

My final presentation during the 2009 MySQL Conference and Expo week was with the Percona Performance Conference on the topic of The Ideal Performance Architecture. My talk included discussions on Technology, Disk, Memory, Indexes, SQL and Data.

Updated 09/18/09
You can now see video of the event at Percona TV.

MySQL Monitoring 101

At the 2009 MySQL Conference and Expo I presented to a full room on MySQL Monitoring 101.

This presentation focused on the following four goals.

  • Know what to monitor
  • Know how you can monitor
  • Learn practices to diagnose problems
  • Have a foundation of historical information

Updated 09/18/09
You can also find additional materials at:

A change in the MySQL Binary distributions

Yesterday was the surprise announcement of MySQL 5.4 at the 2009 MySQL Conference and Expo. Unfortunately, the supporting information was not forthcoming on the MySQL website. I tried for several hours to download it, but no mirrors were initially available. Today I see some information on the mysql.com home page, and I was finally able to get the binary.

What I found most significant with this new major version release is a change in the binary distribution, as seen on the Download page.

MySQL 5.4 is only available on 3 platforms:

  • Linux (AMD64 / Intel EM64T)
  • Solaris 10 (SPARC, 64-bit)
  • Solaris 10 (AMD64 / Intel EM64T, 64-bit)

I was also surprised that this beta release highlights the emphasis on community contributions (long overdue), yet the community and indeed many employees of Sun/MySQL were simply unaware of this work. This is clearly a change in involving the community. While I applaud the beta status, hopefully a more stable product to start with, its development was done in a very closed company model.

Setting up MySQL on Amazon Web Services (AWS) Presentation

On Tuesday at the MySQL Camp 2009 in Santa Clara I presented Setting up MySQL on Amazon Web Services (AWS).

This presentation assumed you know nothing about AWS, and have no account. With Internet access via a Browser and a valid Credit Card, you can have your own running Web Server on the Internet in under 10 minutes, just point and click.

We also step into some more detail on the online point-and-click console and the supplied command line tools to demonstrate some more advanced usage.

What's happening with InnoDB

I have moved on to InnoDB: Innovative Technologies for Performance and Data Protection by Ken Jacobs at MySQL Conference and Expo.

The talk opened with a brief history lesson: inception in 1994, inclusion in MySQL in 2000 and acquisition by Oracle in 2005. Most of the work was done by one person. InnoDB is based on sound database computer science, using Gray & Reuter’s definitive text on database design.

Some key points in Ken’s discussion:

  • Adaptive Hash indexing for frequent queries on keys.
  • In plugin Adaptive Hash is configurable
  • Insert Buffering – Deferring secondary index writes
  • Fast Index Create – doesn’t require all indexes to be rebuilt
  • Table Compression – Changing the page size

The InnoDB plugin available in 5.1 has a number of new benefits.

  • fast index creation
  • table compression
  • info schema tables
  • new row storage format
  • file format management

All InnoDB 1.0.3 plugin features will be available in MySQL 5.4

The big announcement is a new product – Embedded InnoDB. This has the high performance, reliability and rich functionality of InnoDB, along with a flexible programmatic API. No SQL, no security.

Search at Craigslist

I am now sitting in on MySQL and Search at Craigslist by Jeremy Zawodny at the MySQL Users Conference.

Some of the technical difficulties that required addressing.

  • High churn rate
  • half life can be very short
  • Growth
  • Traffic
  • Need to archive postings, e.g. 100M but be searchable
  • Internationalization and UTF-8

Some of the Craigslist Goals

  • Open Source
  • Easy and approachable
  • be green with energy use

A review of the Internals server configuration

  • Load Balancer (perlbal like)
  • Read Proxy Array (perl+memcached)
  • Web Read Array (apache 1.3 + mod_perl)
  • Object Cache (Perl + memcached)
  • Read DB Cluster (MySQL 5.0.x)
  • Search Cluster (Sphinx)

Clusters of DB servers have good vertical partitioning by Roles. These being

  • Users
  • Classified
  • Forums
  • Stats
  • Archive

Sphinx, a standalone full text search engine, is what is used. It was compared with Apache Solr, but Solr seemed more complex and complicated. The Sphinx configuration:

  • Partitioned based on cities (people search locally)
  • Attributes v Keywords
  • Persistent Connections
  • Minimal stopword list
  • Partition in 2 clusters (1 master, 4 slaves)

The results of implementing Sphinx were:

  • decrease from 25 MySQL boxes to 10 Sphinx boxes
  • no locking
  • 1,000+ qps
  • 50M queries per day
  • Better separation of code

MySQL Users Conference Opening Lines

Opening introduction from Colin Charles got us started. Karen Tegan Padir VP MySQL & Software Infrastructure was the opening keynote.

She comes from a strong tech background and is passionate about open source, the communities and how to make a successful product.

There isn’t a person who goes a day without interacting with a website or hardware system that uses a MySQL database.

The big news was the announcement of MySQL 5.4 – Performance & Scalability. Key features include:

  • InnoDB scalability on 16-way x86 and 64-way CMT servers
  • subquery optimization
  • new query algorithms
  • improved stored procedures, and prepared statements
  • enhanced Information Schema
  • improved DTrace Support

More information at MySQL 5.4 Announcement Details….

Other key points include:

1. Ken Jacobs announces today an Embedded InnoDB with a powerful API (not SQL based). Read more at Innobase Introduces Embedded InnoDB
2. MySQL Cluster 7.0 is also released today. Some benchmarks show 4.3x improvements. New features also include LDAP support.
3. The next release of MySQL Query Analyzer, 2.1 announced.
4. Sun announces a commitment to accept contributions from the community.
5. Community also gets the Monthly Rapid Updates.
6. MySQL Drizzle Project is discussed as a technology incubator.

Partners of the year: Intel, Infobright and Lifeboard.
Application of the year: Zappos, Alcatel-Lucent and Symantec.
Community members of the year: Marc Delisle, Ronald Bradford, Shlomi Noach.

Where is the MySQL in Sun's announcement

I find it surprising that in the official Sun Announcement there is no mention of MySQL, for two reasons. First, this was Sun’s largest single purchase, at $1 billion only 12 months ago. Second, MySQL’s largest competitor is Oracle.

While the Sun website shows the news in grandeur, the MySQL website is noticeably lacking any information about its owner’s acquisition.

On my professional side, as an independent speaker for Sun Microsystems with plans for upcoming webinars and future speaking on “Best Practices in Migrating to MySQL from Oracle”, this news does not benefit my bottom line.

A Drizzle update – Running version 2009.03.970-development

I’ve not looked at compiling and running Drizzle on my server for the past four weeks, so it is well overdue time to check and see how it’s going. I saw in today’s planet.mysql.com post by Eric Day that a new dependency is needed: libdrizzle 0.2.0 is now required by Drizzle, so I started there.

cd ~/bzr
bzr branch lp:libdrizzle
cd libdrizzle
./config/autorun.sh
./configure
make
sudo make install

No problems there, also documented at the Drizzle Wiki. Great to see the docs up to date. I see my old work on starting the compiling page is still relevant. Tested on CentOS 5 and Mac OS X 10.5.

Compiling drizzle was not much more difficult.

cd ~/bzr/drizzle
bzr update
make distclean
./config/autorun.sh
./configure --prefix=/home/drizzle/deploy
make
make install

The problems happened when I started drizzle. Initially I was using bin/drizzled_safe, but it was recommended via IRC#drizzle I stick with sbin/drizzled

sbin/drizzled &
error while loading shared libraries: libprotobuf.so.2: cannot open shared object file: No such file or directory

An investigation of Google Protocol Buffers.

$ protoc --version
libprotoc 2.0.2

I see that protobuf 2.0.3 is now available, but this was not the problem.

I got around the problem by specifying the current library path:

$ LD_LIBRARY_PATH=/usr/local/lib sbin/drizzled &

I corrected this problem by adding /usr/local/lib to the default ld path, both the libdrizzle and libprotobuf libs are located there.

$ echo "/usr/local/lib" | sudo tee /etc/ld.so.conf.d/drizzle.conf
$ sudo ldconfig
$ ls -l /usr/local/lib
total 37240
-rw-r--r-- 1 root root  1194602 Mar 31 17:42 libdrizzle.a
-rwxr-xr-x 1 root root      940 Mar 31 17:42 libdrizzle.la
lrwxrwxrwx 1 root root       19 Mar 31 17:42 libdrizzle.so -> libdrizzle.so.0.0.2
lrwxrwxrwx 1 root root       19 Mar 31 17:42 libdrizzle.so.0 -> libdrizzle.so.0.0.2
-rwxr-xr-x 1 root root  1117979 Mar 31 17:42 libdrizzle.so.0.0.2
-rw-r--r-- 1 root root 12199302 Nov 30 23:32 libprotobuf.a
-rwxr-xr-x 1 root root      836 Nov 30 23:32 libprotobuf.la
lrwxrwxrwx 1 root root       20 Nov 30 23:32 libprotobuf.so -> libprotobuf.so.2.0.0
lrwxrwxrwx 1 root root       20 Aug 27  2008 libprotobuf.so.0 -> libprotobuf.so.0.0.0
-rwxr-xr-x 1 root root  5027949 Aug 27  2008 libprotobuf.so.0.0.0
lrwxrwxrwx 1 root root       20 Nov 30 23:32 libprotobuf.so.2 -> libprotobuf.so.2.0.0
-rwxr-xr-x 1 root root  5586965 Nov 30 23:32 libprotobuf.so.2.0.0
-rw-r--r-- 1 root root  9264068 Nov 30 23:32 libprotoc.a
-rwxr-xr-x 1 root root      852 Nov 30 23:32 libprotoc.la
lrwxrwxrwx 1 root root       18 Nov 30 23:32 libprotoc.so -> libprotoc.so.0.0.0
lrwxrwxrwx 1 root root       18 Nov 30 23:32 libprotoc.so.0 -> libprotoc.so.0.0.0
-rwxr-xr-x 1 root root  3645396 Nov 30 23:32 libprotoc.so.0.0.0
drwxr-xr-x 2 root root     4096 Mar 31 17:42 pkgconfig

Starting

$ sbin/drizzled &
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins.
090331 18:38:08  InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
090331 18:38:08  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
090331 18:38:08 InnoDB Plugin 1.0.3 started; log sequence number 46419
sbin/drizzled: ready for connections.
Version: '2009.03.970-development'  socket: ''  port: 4427  Source distribution

Verifying

$ bin/drizzle -uroot
Welcome to the Drizzle client..  Commands end with ; or \g.
Your Drizzle connection id is 1
Server version: 2009.03.970-development Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
drizzle> select version();
+-------------------------+
| version()               |
+-------------------------+
| 2009.03.970-development |
+-------------------------+
1 row in set (0.00 sec)
drizzle> exit

Sweet! Now to try some testing & benchmarking before the barrage of conferences next month, 2009 MySQL Camp, Percona Performance Conference and MySQL Conference & Expo.

I’m going to check out The Juice Database Benchmark next as a more realistic benchmark to DBT2 and sysbench.

Identifying resource bottlenecks – CPU

One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources, then identify any obvious bottlenecks in these resources.

When dealing with CPU, a quick audit should include identifying the number of CPU cores your server has, and the types of these cores. The key file on Linux systems is /proc/cpuinfo.

The number of cores can be found via the command cat /proc/cpuinfo | grep "^processor" | wc -l

You need to look more closely at the file to determine the type of CPU (e.g. below, the model name shows Intel(R) Xeon(R) CPU X3220 @ 2.40GHz). Combining the number of processors (cores) listed with the physical id and siblings fields helps identify how many CPUs and how many cores per CPU exist.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz
stepping	: 11
cpu MHz		: 2394.051
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips	: 4789.96
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

...
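The fields shown above can also be combined programmatically. The following is a rough sketch that assumes the x86 Linux layout of /proc/cpuinfo, where each processor entry lists physical id before cpu cores as in the output above. The function name is my own, and the sample text is a trimmed, hypothetical four-entry version of that output.

```python
def cpu_topology(cpuinfo_text):
    """Summarise /proc/cpuinfo text: logical processors, sockets, cores per socket."""
    logical = 0
    current = None   # physical id of the processor entry being read
    sockets = {}     # physical id -> "cpu cores" value reported for it
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
            current = None
        elif key == "physical id":
            current = value
            sockets.setdefault(current, 0)
        elif key == "cpu cores" and current is not None:
            sockets[current] = int(value)
    return {
        "logical": logical,
        "sockets": len(sockets),
        "cores_per_socket": sorted(set(sockets.values())),
    }


# Hypothetical sample: four entries reduced to the fields that matter here.
SAMPLE = "\n\n".join(
    f"processor\t: {n}\nphysical id\t: 0\nsiblings\t: 4\ncore id\t\t: {n}\ncpu cores\t: 4"
    for n in range(4)
)

print(cpu_topology(SAMPLE))
# On a Linux host: cpu_topology(open("/proc/cpuinfo").read())
```

Some virtual machines and non-x86 platforms do not report physical id at all, in which case this sketch returns zero sockets and only the logical count is meaningful.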

Other commands that help with identifying CPU/cores include mpstat and top.

$ mpstat -P ALL 5

11:43:43 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
11:43:48 AM  all    0.00    0.00    0.00    0.00    0.05    0.00    0.00   99.95   1033.00
11:43:48 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.40
11:43:48 AM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:43:48 AM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     31.40
11:43:48 AM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      1.00
$ top
0

top - 11:42:09 up 36 days, 13:17,  2 users,  load average: 0.20, 0.24, 0.25
Tasks: 133 total,   1 running, 132 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050776k total,  3825584k used,   225192k free,   397580k buffers
Swap:  1052248k total,      128k used,  1052120k free,  2302408k cached

You can easily identify a CPU bottleneck using the vmstat command.

The following shows an idle system.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
 0  0    128 234088 409636 2474392    0    0     0   100 1085  209  0  0 100  0  0
 0  0    128 233836 409636 2474396    0    0     0     0 1014  184  3  0 97  0  0
 0  0    128 233284 409636 2474396    0    0     0     0 1182  435  2  0 98  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1024  104  1  0 99  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1079  195  0  0 100  0  0
 1  0    128 233168 409644 2474396    0    0     0   232 1021  188  3  0 97  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1111  213  2  0 98  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1005   60  0  0 100  0  0

The key columns (from the man page) are:

CPU – These are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
  • st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.

Procs

  • r: The number of processes waiting for run time.

NOTE: The columns of vmstat may vary between different Linux Operating Systems.

If your system is CPU bound you will observe the following. Look at the id, us, sy and r columns.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    128 275684 397176 2300672    0    0     0     0 1118  427 74  2 25  0  0
 3  0    128 217404 397176 2300672    0    0     0     0 1017  138 74  1 25  0  0
 6  0    128 239584 397176 2300672    0    0     0     0 1086  350 93  2  5  0  0
 4  0    128 269468 397176 2300672    0    0     0     0 1005  229 98  2  0  0  0
 4  0    128 217636 397180 2300668    0    0     0   168 1087  251 99  2  0  0  0
 4  0    128 240576 397180 2300668    0    0     0     0 1006  182 99  2  0  0  0
 4  0    128 270708 397180 2300668    0    0     0     0 1079  338 98  2  0  0  0
 4  0    128 218752 397180 2300684    0    0     0     0 1005  106 99  1  0  0  0
 4  0    128 226316 397180 2300684    0    0     0     0 1077  308 98  2  0  0  0
 4  0    128 198664 397184 2300680    0    0     0    76 1010  250 99  1  0  0  0
 4  0    128 179444 397184 2300680    0    0     0     0 1077  238 100  0  0  0  0
 4  0    128 185396 397184 2300688    0    0     0     0 1006  210 99  1  0  0  0
 4  0    128 199408 397184 2300688    0    0     0     0 1079  336 99  1  0  0  0
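The same check can be automated. The sketch below parses vmstat text and applies a rule of thumb of my own choosing, not an official threshold: a sample counts as busy when idle time is at or below 5% and the run queue is at least the number of cores, and the system is flagged when most samples are busy. The sample rows are copied from the outputs in this post.

```python
def parse_vmstat(text):
    """Turn `vmstat` output into a list of dicts keyed by the column header."""
    header, rows = None, []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "procs":
            continue
        if fields[0] == "r":
            header = fields
        elif header and len(fields) == len(header):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def looks_cpu_bound(rows, cores, idle_threshold=5):
    """Assumed heuristic: most samples have almost no idle time and r >= cores."""
    busy = [r for r in rows if r["id"] <= idle_threshold and r["r"] >= cores]
    return len(busy) > len(rows) / 2


HEADER = """procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
"""
BUSY = HEADER + """ 3  0    128 275684 397176 2300672    0    0     0     0 1118  427 74  2 25  0  0
 6  0    128 239584 397176 2300672    0    0     0     0 1086  350 93  2  5  0  0
 4  0    128 269468 397176 2300672    0    0     0     0 1005  229 98  2  0  0  0
 4  0    128 217636 397180 2300668    0    0     0   168 1087  251 99  2  0  0  0
"""
IDLE = HEADER + """ 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
"""

print(looks_cpu_bound(parse_vmstat(BUSY), cores=4))
print(looks_cpu_bound(parse_vmstat(IDLE), cores=4))
```

Parsing by header name rather than by fixed position helps with the note above that vmstat columns may vary between Linux distributions.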

You should also be wary of a Single CPU Bound process. This is why knowing the number of cores is important. In this example, one CPU is bound.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st

 0  0    128  99592 412544 2477580    0    0     0     0 1017   89  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1090  222  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1019   98  0  0 100  0  0
 1  0    128  99592 412544 2477580    0    0     0     0 1096  347 14  0 86  0  0
 1  0    128  99592 412548 2477576    0    0     0    84 1030  194 25  0 75  0  0
 1  0    128  99592 412548 2477576    0    0     0     0 1094  300 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1012   76 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1096  318 25  0 75  0  0
 1  0    128  73192 412548 2477580    0    0     0     0 1039  273 29  0 70  0  0
 1  0    128  77284 412556 2477572    0    0     0   268 1122  373 25  1 75  0  0
 2  0    128  83592 412556 2477584    0    0     0     0 1036  374 27  1 72  0  0
 0  0    128  56220 412564 2477576    0    0     0   172 1017   84  7  0 94  0  0
 0  0    128  56220 412564 2477576    0    0     0     0 1078  192  0  0 100  0  0

$ mpstat -P ALL 1
12:15:55 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:56 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1072.00
12:15:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00
12:15:56 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:56 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     62.00
12:15:56 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      9.00

12:15:56 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:57 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1021.00
12:15:57 PM    0    0.00    0.00    0.00    0.00    0.00    1.00    0.00   99.00   1001.00
12:15:57 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:57 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     18.00
12:15:57 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00
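
The check done by eye above can also be scripted. The following is a minimal sketch, not part of the original monitoring setup: the function name `busy_cpus` and the 5% idle threshold are assumptions chosen for illustration, and the parser assumes the mpstat column layout shown above (a header row containing `CPU` and `%idle`).

```python
def busy_cpus(mpstat_text, idle_threshold=5.0):
    """Return {cpu_id: idle_pct} for cores whose %idle is below the threshold.

    Hypothetical helper: expects `mpstat -P ALL` style text with a header
    row that names the CPU and %idle columns.
    """
    cpu_col = idle_col = None
    busy = {}
    for line in mpstat_text.splitlines():
        fields = line.split()
        # The header row tells us where the CPU and %idle columns are.
        if "CPU" in fields and "%idle" in fields:
            cpu_col, idle_col = fields.index("CPU"), fields.index("%idle")
            continue
        if cpu_col is None or len(fields) <= max(cpu_col, idle_col):
            continue
        cpu = fields[cpu_col]
        if not cpu.isdigit():  # skip the "all" summary row
            continue
        idle = float(fields[idle_col])
        if idle < idle_threshold:
            busy[int(cpu)] = idle
    return busy


# First one-second sample from the mpstat output above.
SAMPLE = """\
12:15:55 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:56 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1072.00
12:15:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00
12:15:56 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:56 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     62.00
12:15:56 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      9.00
"""

print(busy_cpus(SAMPLE))
```

The "all" row on its own shows 75% idle, which looks healthy. Only the per-core rows reveal that CPU 3 has no idle time left.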

I will detail how to identify Memory, Disk and Network bottlenecks in future posts. You can also find out more at the MySQL User Conference “Monitoring 101 – Simple stuff to save your bacon” session.

Log Buffer #94: a Carnival of the Vanities for DBAs

April 25th, 2008 – by Ronald Bradford

Welcome to the 94th edition of Log Buffer, the weekly review of the database blogosphere. Adding to the list of usual database suspects, I have some more alternative considerations for our readers this week.

We start with Conferences

Still some discussion from last week’s 2008 MySQL Conference & Expo.

Baron “xarpb” Schwartz calls it correctly in Like it or not, it is the MySQL Conference and Expo. Matt Asay of c|net gives us some of his opinions in three posts, Two great posts on MySQL, Back to the future for MySQL and Between two consenting corporations…, in follow-up to last week’s active Slashdot discussion. Many others have also commented, if you have not been following the news released before the opening keynotes.

If you didn’t get a hard copy, Sheeri Kritzer Cabral has published the Pythian EXPLAIN Cheatsheet many attendees received.

Also last week was Collaborate 08 – Technology and Applications IOUG forum for the Oracle Community.

This week we also see the Web 2.0 San Francisco in action, and excitement is also brewing for the PGCon – PostgreSQL Conference for Users and Developers happening in under a month as Robert Treat has Plane tickets booked for PGCon. Postgres was also visible at the MySQL Conference & Expo if you were looking, with a prominent consulting team donning the blue elephant during the event. Wish I’d taken a photo now!

Still more news from Adam Machanic of the Pythian group with SQLTeach Toronto: Almost Here.

Common threads

The 2008 Google Summer of Code announced this week showcases the Open Source databases MySQL (14 projects) and PostgreSQL (6 projects). Kaj Arnö talks more in Fourteen Summer of Code projects accepted 2008. The company PrimeBase Technologies also features strongly with two projects for the Blob Streaming storage engine for MySQL as I detail in Media Blob Streaming getting a Google boost.

MySQL

DTrace Integration with MySQL 5.0 – Chime demo in MySQL Users Conference 2008 by Jenny Chen is an example of Sun’s Open Source contribution to MySQL which I saw as a physical demo last week. Unfortunately, due to the imbalance in actually getting new functionality into Community contributions (actually non-existent in the current or next MySQL version :-(), this functionality is only really for show. Dtrace with MySQL 6.0.5 – on a Mac describes some of this work actually making it into the next, next version. It seems this next Falcon Preview is available but not announced by MySQL generally as I note in Continued confusion in MySQL/Sun release policy.

MySQL Gurus Mark Callaghan and Brian Aker comment respectively here and here on MySQL Heap (Memory) Engine – Dynamic Row Format Support, work submitted by Igor Chernyshev of the eBay Kernel Team (whom I’ve met previously; I was most impressed with his ability to submit MySQL patch work with little previous MySQL kernel knowledge but extensive C++ knowledge). This work also contributed to eBay Wins Application of the Year at MySQL Conference & Expo.

Mark also mentions in his post “How do users get it? There is no community branch into which people can submit changes with a GPL license.“. A topic yours truly has also mentioned regarding the Community contributions, development and release. Perhaps a sign of more benefit to the community soon as Monty mentions.

Baron Schwartz comments on Keith “a.k.a Kevin” Murphy’s work in Spring 2008 issue of MySQL Magazine. With a quick plug also for his upcoming book “High Performance MySQL – Version 2″ (me giving it a plug also now), Baron also has the best published anti-spam sniffer email I’ve seen, and recently updated to his new employer. Check his blog and let me know.

Postgres

Joshua Drake of Command Prompt Inc. The Postgres Company gets excited in Is that performance I smell? Ext2 vs Ext3 on 50 spindles, testing for PostgreSQL and gives us some insight into different settings of two popular file system types. It would be great to see a follow up with a few more filesystem types.

Pabloj “so many trails … so little time” extends his MySQL example to Postgres in Loading data from files. And on Postgres Online Journal, we get An Almost Idiot’s Guide to PostgreSQL YUM giving you a step by step guide of PostgreSQL setup, including the all important “Backing up Old Version”.

Oracle

We get a detailed book chapter from Keith Lake of Oracle OLAP The most powerful, open Analytic Engine in his extensive post on Tuning Guidance for OLAP 10g. David Litchfield brings attention in A New Class of Vulnerability in Oracle: Lateral SQL Injection. The title is sufficient for all Oracle DBAs to review.

Don Seiler gives his experience in Bind Variables and Parallel Queries Do Not Mix when an Oracle bug is discovered after moving the database to 64-bit H/W.
Matching LOB Indexes and Segments by Michael McLaughlin gives us a good CASE/REGEX SQL example exam question, and simple output to monitor the growth of LOBs in your Oracle database.
Additional readings for Oracle folks can be found with Kenneth Downs writing Advanced Table Design: Resolutions and Dan Norris’ Collaborate 08 thoughts gives a concise review of a largely attended Oracle event.

SQL Server

B Esakkiappan’s SQL Thoughts gives us a thorough lesson on SQL Server 2005 Database Transaction logs with Know the Transaction LOG – Part – 1, Part – 2, Part – 3 and Part – 4 Restoring Data.

Paul S. Randal of SQL Skills adds Conference Questions Pot-Pourri: How to create Agent alerts to his writings following many requests after a recent workshop.

In Scalability features I would like to have in SQL Server Michael Zilberstein lists 3 key features including “Active-Active cluster”, “Indexes per partition” and “Bitmap indexes and function based indexes”.

Ingres, Times Ten, Google App Engine and more

Some movement in the Ingres world with Deb Woods of Ingres Technology Blog discussing in Inside the Community – Ingres style…. the Ingres Engineering Summit occurring this week. Attendees included newbies to a 24 year Ingres veteran. That beats my experience in Ingres which now extends 19 years.

We get another very detailed installation description, this time for Times Ten in Install Oracle TimesTen In-Memory Database 7.0.4 on Linux.

Just a few weeks ago, a new database offering hit the market with the Google App Engine. News this week includes Google App Engine Hack-a-thons! being announced with events in New York on May 7th and San Francisco on May 16th. As a developer with an account and an excuse to use it more, I can’t win, being in the right towns on the wrong dates.

OakLeaf Systems this week writes Comparing Google App Engine, Amazon SimpleDB and Microsoft SQL Server Data Services. Another good read just for comparison.

Not in a blog, but in discussion at the recent MySQL Conference, was mSQL. It was interesting to find out that PHP was originally developed for mSQL first, and only used MySQL as the preferred database after some functionality requirement. Interesting what could have been?

In Conclusion

Thanks Dave for the opportunity to contribute to the week in review. Until my chance to charm the readers next time.

I leave you with a photo, and challenge our readers to find another person who would be capable of wearing a t-shirt that states “My free software runs your company”. Michael Widenius, founder and original developer of MySQL, can, and my thanks to you for MySQL, and the Vodka shots at the Conference last week.

Happy Earth Day 2008!

Making business decisions for the community and the enterprise

I was prompted by a few key words from Marten Mickos at the Sun Dinner on Wednesday evening, and a subsequent one on one discussion with Marten, to post my thoughts on some significant news announced this week at the MySQL Conference: the decision to provide what have been termed “Enterprise only features”. It is unfortunate this was not discussed in Marten’s opening keynote, having been exposed the evening before in the Partner’s meeting and hitting the blogosphere before the conference officially started.

MySQL, past, present and future as an Open Source company requires a functional business model to succeed. This includes the funding of resources and the technology progression. It is also necessary in this business climate to build a successful business quickly. How do you do this? Well that’s probably the difference between a successful CEO and an unsuccessful one, and what Marten Mickos has produced is clearly very successful.

I may not necessarily agree with the decisions made, more specifically not understanding at this time the rationale of which features are free and which are likely to be commercial, but I respect the decision made by Marten Mickos. These new feature considerations are in a future release of MySQL, and they are also, I’m sure, not yet set in stone. However, MySQL can not be all things to all people; no software can. It reminds me of the Homer Simpson car, designed to do everything Homer wanted in a car, but it bankrupted the previously successful company due to the views of one individual to solve all their own needs, but not the needs of the majority. Who is affected by this decision, who will benefit, again it’s too early to tell.

Monty indicates this is a MySQL decision, not a Sun decision. This indicates the transition of MySQL to being under the Sun banner of the largest open source company is, well, still in transition. Today it was again confirmed to me that the MySQL database will always be GPL and MySQL will never revoke the functionality that powers the world’s largest websites as free software for the database server.

As an advocate for the MySQL community, I’d like to consider myself one of the pulses, a thought in the MySQL consciousness and even a vocal lobbyist. I am however not interested in being a disruptor in the MySQL ecosystem. A Communications Lesson, I believe, correctly states that the Slashdot headline “Sun to Begin Close Sourcing MySQL” is wrong.

A number of people have posted their comments. Let’s stop bickering about it, and let’s see something positive happen for the benefit of the community. For example, I’ve mentioned previously the lack of differentiation for the Community version, and Mark Callaghan mentions it again in A better (community) HEAP engine, where a worthwhile patch can’t be of benefit to the community in a binary release for the lay person.

When in management, I am responsible for contributing to the success of the company, to play a significant role in the functional business model, to ensure funding for resources and technology progression. In other terms, how can revenue generation be achieved to fund, predominantly, the salaries of staff, including my own. How am I going to do this? This will be the difference between my huge success or not.

The database frontier

Jay’s opening lines regarding the final MySQL Conference keynote speaker were: “I work with a lot of data. I think peta-bytes, maybe exa-bytes”. This was relating to Jacek Becla from the Stanford Linear Accelerator Center, giving his presentation on “The Science and Fiction of Petascale Analytics”.

The goal of the Large Synoptic Survey Telescope (LSST) is the storage of 50+ PB of images and 20+ PB of data.
Let’s just clarify the size. 20 PB of data = 20 years of HD Movies = 2000 years of 128kb MP3
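
Equivalences like these depend on the bitrate and byte convention assumed. The sketch below is a back-of-envelope check and not something from the talk: it assumes decimal petabytes (10^15 bytes), a constant 128 kbit/s MP3 stream and a 365.25 day year, and the function names are made up for illustration. Under those assumptions 20 PB of MP3 audio plays for far longer than 2000 years, so the comparison above is best read as an order-of-magnitude illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def playback_years(total_bytes, bitrate_bits_per_second):
    """Years of continuous playback that total_bytes holds at a fixed bitrate."""
    seconds = total_bytes * 8 / bitrate_bits_per_second
    return seconds / SECONDS_PER_YEAR


def implied_bitrate(total_bytes, years):
    """Bitrate (bits/s) at which total_bytes lasts exactly the given years."""
    return total_bytes * 8 / (years * SECONDS_PER_YEAR)


TWENTY_PB = 20 * 10**15  # assumption: decimal petabytes

# Years of 128 kbit/s MP3 audio in 20 PB.
print(round(playback_years(TWENTY_PB, 128_000)), "years of MP3")

# Bitrate an "HD movie" stream would need for 20 PB to last 20 years.
print(round(implied_bitrate(TWENTY_PB, 20) / 1e6), "Mbit/s for HD")
```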

The next database frontier is obviously building huge databases. What part will MySQL or other relational databases play? Some interesting facts:

  • The digital universe created 161 exabytes of data last year.
  • Google processes 20 petabytes of data per day.

The operational plan for the LSST project is 10 years, only starting in 2014. The timeline:

  • 2009 Choosing technology
  • 2010-2014 Construction
  • 2014-2023 Production

The primary goals are scale, parallelism and fault tolerance.

Q: What does a MySQL fellow do?

A: Maria, an ACID, MVCC engine that plans to be the default non-transactional and default transactional engine for MySQL.

Presently developed by a team of 6 people, with plans to add 2-3 developers, Maria should see its 1.5 release this month.

It was great to hear Monty say “We have a policy of zero MySQL Bugs, like the old MySQL way.”

Maria Version History
1.0 – “Crash Safe” — part of an existing 5.1 branch
1.5 – “Concurrent insert/select” to be merged as part of formal MySQL 6.0 release
2.0 – Transactional and ACID
3.0 – High Concurrency & Online Backup
4.0 – Data Warehousing

The schedule has all of these features available by the next MySQL Conference in Q2 2009.

Some points of note:

  • This is a MyISAM replacement.
  • It was interesting to hear about log file size (the suggestion being big, like 1G), and that the log files are not circular. New log files will be created, and files are purged only when no longer used.
  • There has been a change in default page size, presently defaulting to 8K for both data and indexes.
  • Maria 1.5 does not support INSERT DELAYED, and FULLTEXT and GIS indexes are not crash safe.
  • There are extensive tests in the MySQL Test Suite.
  • Will support READ COMMITTED and REPEATABLE READ (available in 1.5).
  • Every part of the development and process is open, with documentation available (unlike some other storage engines Monty mentions).
  • There is a drop-everything policy on new bugs, to keep Maria as stable as possible.
  • Check out the blog at monty-says.blogspot.com

Tips from the MySQL Conference

What would be great is if people could create a single line (one tip) from each talk and we could aggregate these for an executive summary for tech people.

This was prompted from only a few minutes looking in on Baron Schwartz’s EXPLAIN presentation. What I didn’t know was:

EXPLAIN EXTENDED SELECT …; SHOW WARNINGS; gives the rewritten SQL query
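
For anyone wanting to script this tip, here is a minimal sketch rather than anything shown in the talk. The function name `rewritten_query` is hypothetical, and it assumes a DB-API 2.0 style cursor that returns rows as tuples and is already connected to MySQL with whichever connector you prefer. The `extended_keyword` switch is also an assumption, added because later MySQL releases dropped the EXTENDED keyword and provide the same note after a plain EXPLAIN.

```python
def rewritten_query(cursor, select_sql, extended_keyword=True):
    """Run EXPLAIN on select_sql and return the messages from SHOW WARNINGS.

    cursor: assumed to be a DB-API 2.0 cursor on a MySQL connection that
    returns tuple rows. The rewritten query arrives as a Note-level warning.
    """
    prefix = "EXPLAIN EXTENDED " if extended_keyword else "EXPLAIN "
    cursor.execute(prefix + select_sql)
    cursor.fetchall()  # consume the plan rows before issuing the next statement
    cursor.execute("SHOW WARNINGS")
    # SHOW WARNINGS rows are (Level, Code, Message); keep only the Message.
    return [row[2] for row in cursor.fetchall()]
```

SHOW WARNINGS must run on the same connection straight after the EXPLAIN, since the warning list is per session and is replaced by the next statement that generates messages.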

If only I had time to whip out an application on my Google AppEngine and get twitter feeds with say a mysqlconf keyword. Perhaps we need an all-night BoF hackfest to do it.

PrimeBase PBXT/Blob Streaming BoF – What you missed.

A small but committed group met at 8:30pm to hear more about the plans from PrimeBase Technologies here at the 2008 MySQL Conference. Our discussion started in true MySQL form.

Monty Widenius presents to the group plastic cups and a bottle of Absolut Vodka.
After a shot, Paul starts with “While I can still talk”.
Monty, slams another bottle of Vodka on the table.
We all laugh.

Paul outlined some of the roadmap plans from the existing Alpha release to Beta releases.
He talked about the plans for Synchronous Replication and there was active discussion on various use cases.
There was also discussion and input on Solid State Drive (SSD) Technology which will be tested with PBXT in the coming months.

Scaling Wisdom

The 20 second summary from Scaling MySQL – Up or Out?, from our panel of experts at the 2008 MySQL Conference and Expo.

  • Paul Tuckfield from YouTube — The answer to everything is replication, you just have to rephrase the question.
  • Jeff Rothschild from Facebook — Memory, the source of all problems is your developers.
  • Domas from Wikipedia — You should be afraid that 10 min structural change may answer detailed problems.
  • Farhan Mashraqi from Fotolog — Architect properly, the most optimized schema may not be enough, what is the cost of serving the data, not just the time to run the SQL.
  • John Allspaw from Flickr — There is nothing more permanent than a temporary solution.
  • Monty Taylor from MySQL — You have to know what’s happening on every piece of your technology stack.

What's in a new name

Also in the MySQL Press Releases today but dated for tomorrow is Sun Microsystems Announces MySQL 5.1.

I find the wording clearly a new language from my previous understanding — “pending general availability of MySQL™ 5.1″.

We now see the trademark notice, obviously a Sun influence.
We now have a “pending” GA version. MySQL is obviously very keen to release MySQL 5.1 for GA. This was expected at last year’s MySQL conference. Many in the past year have expected this prior to this year’s conference (we are now being informed late Q2 2008). There was an anticipation there would be two RC versions; this is now the third. So what exactly does “pending” mean? Will 5.1.24 be renamed Production if it passes community acceptance (I say community because it’s not an Enterprise release)? This would be a change from previous naming policy. What’s most likely is, hopefully, they release 5.1.25 as production. It comes back then to why the words “pending general availability”, and not “next release candidate”, which is what it is.

Previously MySQL also made news with the initial RC status of 5.1, moving away from the previous policy.

Standing room only

At Day 1 of the 2008 MySQL Conference and Expo today, our high numbers of attendees (reported at 2,000) have resulted in Standing Room only in a lot of talks. This has got to be excellent PR.

I got to sit in on the Memcached and MySQL session by Brian Aker and at the end I stuck my head in to Best Practices for Database Administrators by and Explain demystified by Baron Schwartz both 2008 MySQL Award winners.

All were full to overflowing.