Beyond MySQL GA: patches, storage engines, forks, and pre-releases – FOSDEM 2010

Kristian Nielsen presented “Beyond MySQL GA: patches, storage engines, forks, and pre-releases”.
This included a history of current products:

Google Patches (5.0 & 5.1) included improvements in :

  • statistics/monitoring
  • lock contention
  • binlog
  • malloc()
  • filesorts
  • innodb I/O and wait statistics
  • SHOW …STATISTICS statements
  • smp scalability
  • I/O scalability
  • semisync replication
  • many more

Percona Patches (5.0) focus on

  • statistics/monitoring
  • performance/scalability
  • buffer pool content/mutexes
  • microslow patch

These have been ported to 5.1 and mainly integrated into XtraDB.

EBay Patches (5.0) have included:

  • variable length memory storage engine
  • pool of threads
  • Virtual columns

XtraDB storage engine (5.1) includes

  • Percona patches
  • Google patches
  • Innodb patches
  • Has XtraBackup for backup

Other engines/patches discussed included:

  • PBXT storage engine – community contribution
  • FederatedX – replacement to Federated
  • Sphinx storage engine
  • Pinba storage engine – Collects PHP statistics
  • Others OQGraph/Spider
  • Galera – Synchronous replication
  • Drizzle

Alternative packaging options for MySQL 5.0 and MySQL 5.1 including Our Delta, Percona and MariaDB.

FOSDEM 2010 MySQL Developer Room Schedule
FOSDEM 2010 Website
Brussels, Belgium
February 7, 2010

Multi-Master Manager for MySQL – FOSDEM 2010

The next presentation by Piotr Biel from Percona was on Multi-Master Manager for MySQL.

The introduction included a discussion of the popular MySQL HA solutions including:

  • MySQL Master-slave replication with failover
  • MMM managed bi-directional replication
  • Heartbeat/SAN
  • Heartbeat/DRBD
  • NDB Cluster

A key problem that was clarified in the talk is the discussion of Multi-Master and this IS NOT master-master. You only write to a single node. With MySQL is this critical because MySQL replication does not manage collision detection.

The MMM Cluster Elements are:

  • monitoring node
  • database nodes

And the Application Components are:

  • mon
  • agent
  • angel

MMM works with 3 layers.

  • Network Layer – uses a virtual IP address, related to servers, not a physical machine
  • Database Layer
  • Application Layer

MMM uses two roles for management with your application.

  • exclusive – also known as the writer
  • balanced – also known as the reader

There are 3 different statuses are used to indicate node state

  • proper operation
  • maintenance
  • fatal errors

The mmm_control is the tool used to manage the cluster including:

  • move roles
  • enable/disable individual nodes
  • view cluster status
  • configure failover

The Implementation challenges require the use of the following MySQL settings to minimize problems.

  • auto_increment_offset/auto_increment_increment
  • log_slave_updates
  • read_only

FOSDEM 2010 MySQL Developer Room Schedule
FOSDEM 2010 Website
Brussels, Belgium
February 7, 2010

10x Performance Improvements in MySQL – A Case Study

The slides for my presentation at FOSDEM 2010 are now available online at slideshare. In this presentation I describe a successful client implementation with the result of 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.

The end result was a page load improvement from 700+ms load time to a a consistent 60ms.

State of phpMyAdmin – FOSDEM 2010

Following the opening keynote “Dolphins, now and beyond”, Marc Delisle presented on “State of phpMyAdmin”.

phpMyAdmin is an DBA administration tool for MySQL available today in 57 different languages. This is found today in many distributions, LAMP stack products and also in cpanel. The product is found at http://phpmyadmin.net.

There are current two versions, the legacy 2.x version to support older php 3.x & 4.x, The current version 3.x is for PHP 5.2 or greater.

The current UI includes some new features including.

  • calendar input for date fields
  • meta data for mime types e.g images, which is great for showing the output as an image, otherwise blob data
  • Relational designer with the able to show and create foreign keys

The New features in 3.3 (currently in beta) include:

  • Replication support including configuring master/slave, start/stop slave.
  • Synchronization model showing structure and data differences between two servers and ability to sync.
  • New export to php array, xslx, mediawiki, new importing features including progress bar.
  • Changes tracking for changes on per instance or per table. Providing change report and export options.

FOSDEM 2010 MySQL Developer Room Schedule
FOSDEM 2010 Website
Brussels, Belgium
February 7, 2010

Dolphins, now & beyond – FOSDEM 2010

I had the honor of opening the day at the MySQL developer room at FOSDEM 2010 where I had a chance to talk about the MySQL product and community, now and what’s happening moving forward.

For those that missed the talk, my slides are available online at Slideshare however slides never due justice to some of the jokes including:

  • What do you consider? the Blue Pill, or the Red Pill
  • Why think two dimensionally, how about the Green Pill
  • Emerging Breeds with performance enhancing modifications

Speaking at MySQL UC 2010

My talk on 10x performance improvements – A case study has just been approved for the 2010 MySQL Conference. This will be my 5th straight year speaking at the MySQL conferences. For those in Europe wanting a sneak peek I am also speaking at FOSDEM 2010 in Brussels on Feb 7th where I’ll be giving an abridged version.

As an independent MySQL consultant, my work generally covers performance tuning and scalability and sometimes database architecture. Often however my work involves a review of a given problem and recommendations (immediate, short and long term). I’m rarely involved in the full implementation and generally do not see the full fruits of the proposed work.

Recently however I was able to work with a client, first in resolving critical performance issues and then in a review of the application architecture and MySQL environment, provide recommendations and also helping internal resources in the successful implementation. The result was a very successful engagement and an ideal case study on a strategy for tackling improving performance of an application using MySQL.

With an existing environment that included a MySQL master, 3 database slaves and 6 web servers this application provided the grounds of a well sized configuration and the need for greater availability and scalability.

The Call for Papers for the 2010 MySQL conference is still open. If you would like to share your experiences with MySQL submit your talk today.


O'Reilly MySQL Conference & Expo 2010

Europe conference options for MySQL Developers

For those in the US the annual MySQL UC is taking place again in April. For those in Europe we have dedicated room for MySQL and MySQL related products/variants/branches at FOSDEM 2010 being held in Brussels, Belgium on 6-7 Feb.

This conference will feature a full day of talks with a format of 20 minutes presentation and 5 minutes Q&A. More information about submissions can be found at Call for Papers for “MySQL and Friends” Developer Room at FOSDEM 2010 now open!

Other references:

Updated

Wednesday January 6th is the last date for submissions. Extension for FOSDEM MySQL

Transcending Technology Specific Boundaries

I had the pleasure to sit on the Performance Panel at the recent Percona Performance Conference. While the panel contained a number of usual MySQL suspects, one person was not familiar, that being Cary Millsap from Method R.

An expert in optimizing Oracle performance, Cary also gave an session on Day 2 that I attended. While he opened professing not to be an expert in MySQL, his talk provided valuable foundation knowledge irrespective of whether you use MySQL or another database product.

Having come myself from 7 straight years in system architecture and performance tuning in Ingres, then a further 6 years in Oracle again heavily involved in system architecture and performance tuning, a lot of my experience in the 10 years of providing my own MySQL consulting is drawn from my past RDBMS experiences. In addition much of what I actually provide to clients today is common sense that I don’t see applied.

A summary of the excellent content provided by Cary.

The common technology agnostic problem we need to address is:

  • Users say that everything is slow, but I don’t know where to begin
  • Users are complaining but all the monitoring dials are green

From a user’s perspective, their experience consist simply of two elements.

  1. Task
  2. Time

In general, business people simply don’t care about the “system” except thought the specific tasks that make up their pressing business needs. And for these users, performance is all about the time to complete this task.

Throughput can be stated as tasks per time.
Response time is the time taken per task.

Cary also quoted Donald Knuth — “The universal experience of programmers who have been using measurements tools has been that there intuitive guesses fail.”

Performance is easy if you stop guessing where your code is slow. A few best practice tips are:

  • You have to insist on seeing where time goes for any task you think is important
  • You need to look at the sequence diagram of the task
  • What individual part takes the most time, then look at the task before that. The fastest way to do something is don’t.
  • To drill down, you need to attack the skew of each part, not the average.

In Summary the closing points were:

  • Performance is about time and tasks
  • Not all tasks are created equal
  • Read “The Goal”
  • Don’t guess, your probably wrong
  • Measure response time before you optimize anything – Insist on it

Performance is easy when code measures it’s own time and tasks. This closing statement on instrumentation I completely concur with.

One advantage of Oracle/Sun/MySQL

This weeks’ announcement Oracle to by Sun was a major talking point at the 2009 MySQL Conference & Expo. While it is too early to even speculate what the future holds with the official MySQL product, for myself a speaker on MySQL topics, Oracle Open World is now a target market.

In addition to many years of providing MySQL for the Oracle DBA Resources I have with the recent closure of call for papers submitted two sessions for consideration.

Integrating MySQL into your Oracle DBA management processes

Most large enterprise organizations use more then one RDBMS product to service business requirements. With the increase in MySQL usage for web based applications such as self-service content, Oracle DBA’s need to understand and appreciate the minimum to ensure performance and availability meets client expectations.

Just how to you integrate MySQL into existing and existing Oracle database infrastructure and management monitoring process?
What are the critical monitoring components? How do these compare to current Oracle Best Practices.
Understand the various end user tools support multiple RDBMS products including Oracle and MySQL.

In this session, DBA’s will leave with the essential knowledge and appreciation of MySQL management.

An overview for evaluating migrating from Oracle to MySQL

MySQL is becoming increasing popular RDBMS for web based applications due to it’s ease of use, availability within the the LAMP stack and large number of open source applications. While implementing MySQL for a new development project may be easy, migrating existing databases, data and applications to use MySQL is not. In this presentation, we will answer questions including:

What are the major challenges to overcome to consider MySQL for some portion of your business?
What are the issues in application portability?
What are ideal applications to consider for migration?
Tools, Products and options for easy migration?
What Oracle features are not supported?

Learn how to read and write to MySQL directly via Oracle Heterogeneous services.

Percona Performance Conference Talk

My final presentation during the 2009 MySQL Conference and Expo week was with the Percona Performance Conference on the topic of The Ideal Performance Architecture. My talk included discussions on Technology, Disk, Memory, Indexes, SQL and Data.

Updated 09/18/09
you can now see video of the event at Percona TV.

MySQL Monitoring 101

At the 2009 MySQL Conference and Expo I presented to a full room on MySQL Monitoring 101.

This presentation focused on the following four goals.

  • Know what to monitor
  • Know how you can monitor
  • Learn practices to diagnose problems
  • Have a foundation of historical information

Updated 09/18/09
You can also find additional materials at:

A change in the MySQL Binary distributions

Yesterday was the surprise announcement of MySQL 5.4 at the 2009 MySQL Conference and Expo. It was unfortunate that the supporting information was not that forthcoming on the MySQL website. I tried for several hours to try and download, but no mirrors were initially available. Today I see some information on the mysql.com home page and finally able to get the binary.

What I found most significant with this new major version release is a change in the binary distribution, as seen on the Download page.

MySQL 5.4 is only available on 3 platforms:

  • Linux (AMD64 / Intel EM64T)
  • Solaris 10 (SPARC, 64-bit)
  • Solaris 10 (AMD64 / Intel EM64T, 64-bit)

I was also surprised that this beta release highlights the emphasis of community contributions (long overdue), yet the community and indeed many employees of Sun/MySQL were simply unaware of this work. This is clearly a change in involving the community. While I applaud the beta status, hopefully a more stable product to start with, it’s development was done in a very closed company model.

Setting up MySQL on Amazon Web Services (AWS) Presentation

On Tuesday at the MySQL Camp 2009 in Santa Clara I presented Setting up MySQL on Amazon Web Services (AWS).

This presentation assumed you know nothing about AWS, and have no account. With Internet access via a Browser and a valid Credit Card, you can have your own running Web Server on the Internet in under 10 minutes, just point and click.

We also step into some more detail online click and point and supplied command line tools to demonstrate some more advanced usage.

What's happening with InnoDB

I have moved on to InnoDB: Innovative Technologies for Performance and Data Protection by Ken Jacobs at MySQL Conference and Expo.

With a brief history lesson of inception from 1994, inclusion in MySQL in 2000 and acquired by Oracle in 2005. Most of the work was done by one person. InnoDB is based on sound database computer science using Gray & Reuters definitive text on database design.

Some key points in Ken’s discussion.

  • Adaptive Hash indexing for frequent queries on keys.
  • In plugin Adaptive Hash is configurable
  • Insert Buffering – Deferring secondary index writes
  • Fast Index Create – doesn’t requires all indexes to be rebuilt
  • Table Compression – Changing the page size

The InnoDB plugin available in 5.1 has a number of new benefits.

  • fast index creation
  • table compression
  • info schema tables
  • new row storage format
  • file format management

All InnoDB 1.0.3 plugin features will be available in MySQL 5.4

The big announcement is a new product – Embedded InnoDB. This has the high performance, reliability and rich functionality of InnoDB, has a flexible programmatic API. No SQL, No security.

Search at Craigslist

I am now sitting in on MySQL and Search at Craigslist by Jeremy Zawodny at MySQL Users Conference

Some of the technical difficulties that required addressing.

  • High churn rate
  • half life can be very short
  • Growth
  • Traffic
  • Need to archive postings, e.g. 100M but be searchable
  • Internationalization and UTF-8

Some of the Craigslist Goals

  • Open Source
  • Easy and approachable
  • be green with energy use

A review of the Internals server configuration

  • Load Balancer (perlbal like)
  • Read Proxy Array (perl+memcached)
  • Web Read Array (apache 1.3 + mod_perl)
  • Object Cache (Perl + memcached)
  • Read DB Cluster (MySQL 5.0.x)
  • Search Cluster (Sphinx)

Clusters of DB servers have good vertical partitioning by Roles. These being

  • Users
  • Classified
  • Forums
  • Stats
  • Archive

Sphinx is a full standalone full text search that is used. Did compare with Apache Solr, but it seemed more complex and complicated. The Sphinx configuration:

  • Partitioned based on cities (people search locally)
  • Attributes v Keywords
  • Persistent Connections
  • Minimal stopword list
  • Partition in 2 clusters (1 master, 4 slaves)

The results of implementing Sphinx were:

  • decrease in 25 MySQL boxes to 10 sphinx boxes
  • no locking
  • 1,000+ qps
  • 50M queries per day
  • Better separation of code

MySQL Users Conference Opening Lines

Opening introduction from Colin Charles got us started. Karen Tegan Padir VP MySQL & Software Infrastructure was the opening keynote.

She comes from a strong tech background and is passionate about open source, the communities and how to make a successful product.

There isn’t a person that doesn’t go a day without interacting with a website or hardware system that uses a MySQL database.

The big news was the announcement of MySQL 5.4 – Performance & Scalability. Key features include.

  • InnoDb scalability 16way x86 and 64 way CMT servers
  • subquery optimization
  • new query algorithms
  • improved stored procedures, and prepared statements
  • enhanced Information Schema
  • improved DTrace Support

More information at MySQL 5.4 Announcement Details….

Other key points includes:

1. Ken Jacobs announces today an Embedded Innodb with a powerful API (not SQL based). Read more at Innobase Introduces Embedded InnoDB
2. MySQLCluster 7.0 is also released today. Some benchmarks 4.3x improvements. New features also include LDAP support.
3. The next release of MySQL Query Analyzer, 2.1 announced.
4. Sun announces a commitment to accept contributions from the community.
5. Community also gets the Monthly Rapid Updates.
6. MySQL Drizzle Project is discussed as a technology incubator.

Partners of the year: Intel, Infobright and Lifeboard.
Appliation of the year: Zappos, Alcatel-lucent and Symantec.
Community members of the year: Marc Delisle, Ronald Bradford, Shlomi Noach.

Where is the MySQL in Sun's announcement

I find it surprising that in the official Sun Announcement there is no mention of MySQL for two reasons. Firstly, this was Sun largest single purchase of $1 billion only 12 months ago. Second, MySQL’s largest competitor is Oracle.

While the Sun website shows the news in grandeur, the MySQL website is noticeably absent in any information of it’s owners’ acquisition.

On my professional side, as an independent speaker for Sun Microsystems with plans for upcoming webinars and future speaking on “Best Practices in Migrating to MySQL from Oracle”, this news does not benefit my bottom line.

A Drizzle update – Running version 2009.03.970-development

I’ve not looked at compiling and running Drizzle on my server for the past four weeks. Well overdue time for a check and see how it’s going. I saw in today’s planet.mysql.com by Eric Day a new dependency is needed. libdrizzle 0.2.0 now in Drizzle is now required, so I started there.

cd ~/bzr
bzr branch lp:libdrizzle
cd libdrizzle
./config/autorun.sh
./configure
make
sudo make install

No problems there, also documented at the Drizzle Wiki. Great to see the docs up to date. I see my old work on starting the compiling page still relevant. Tested on CentOS 5 and Mac OS/X 10.5

Compiling drizzle was not much more difficult.

cd ~/bzr/drizzle
bzr update
make distclean
./config/autorun.sh
./configure --prefix=/home/drizzle/deploy
make
make install

The problems happened when I started drizzle. Initially I was using bin/drizzled_safe, but it was recommended via IRC#drizzle I stick with sbin/drizzled

sbin/drizzled &
error while loading shared libraries: libprotobuf.so.2: cannot open shared object file: No such file or directory

An investigation of Google Proto Buffers.

$ protoc --version
libprotoc 2.0.2

I see that protobuf 2.0.3 is now available, but this was not the problem.

I got around the problem by specifying the current library path:

$ LD_LIBRARY_PATH=/usr/local/lib sbin/drizzled &

I corrected this problem by adding /usr/local/lib to the default ld path, both the libdrizzle and libprotobuf libs are located there.

$ echo "/usr/local/lib" > /etc/ld.so.conf.d/drizzle.conf
$ ldconfig
$ ls -l /usr/local/lib
total 37240
-rw-r--r-- 1 root root  1194602 Mar 31 17:42 libdrizzle.a
-rwxr-xr-x 1 root root      940 Mar 31 17:42 libdrizzle.la
lrwxrwxrwx 1 root root       19 Mar 31 17:42 libdrizzle.so -> libdrizzle.so.0.0.2
lrwxrwxrwx 1 root root       19 Mar 31 17:42 libdrizzle.so.0 -> libdrizzle.so.0.0.2
-rwxr-xr-x 1 root root  1117979 Mar 31 17:42 libdrizzle.so.0.0.2
-rw-r--r-- 1 root root 12199302 Nov 30 23:32 libprotobuf.a
-rwxr-xr-x 1 root root      836 Nov 30 23:32 libprotobuf.la
lrwxrwxrwx 1 root root       20 Nov 30 23:32 libprotobuf.so -> libprotobuf.so.2.0.0
lrwxrwxrwx 1 root root       20 Aug 27  2008 libprotobuf.so.0 -> libprotobuf.so.0.0.0
-rwxr-xr-x 1 root root  5027949 Aug 27  2008 libprotobuf.so.0.0.0
lrwxrwxrwx 1 root root       20 Nov 30 23:32 libprotobuf.so.2 -> libprotobuf.so.2.0.0
-rwxr-xr-x 1 root root  5586965 Nov 30 23:32 libprotobuf.so.2.0.0
-rw-r--r-- 1 root root  9264068 Nov 30 23:32 libprotoc.a
-rwxr-xr-x 1 root root      852 Nov 30 23:32 libprotoc.la
lrwxrwxrwx 1 root root       18 Nov 30 23:32 libprotoc.so -> libprotoc.so.0.0.0
lrwxrwxrwx 1 root root       18 Nov 30 23:32 libprotoc.so.0 -> libprotoc.so.0.0.0
-rwxr-xr-x 1 root root  3645396 Nov 30 23:32 libprotoc.so.0.0.0
drwxr-xr-x 2 root root     4096 Mar 31 17:42 pkgconfig

Starting

$ sbin/drizzled &
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins.
090331 18:38:08  InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
090331 18:38:08  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
090331 18:38:08 InnoDB Plugin 1.0.3 started; log sequence number 46419
sbin/drizzled: ready for connections.
Version: '2009.03.970-development'  socket: ''  port: 4427  Source distribution

Verifying

$ bin/drizzle -uroot
Welcome to the Drizzle client..  Commands end with ; or g.
Your Drizzle connection id is 1
Server version: 2009.03.970-development Source distribution
Type 'help;' or 'h' for help. Type 'c' to clear the buffer.
drizzle> select version();
+-------------------------+
| version()               |
+-------------------------+
| 2009.03.970-development |
+-------------------------+
1 row in set (0.00 sec)
drizzle> exit

Sweet! Now to try some testing & benchmarking before the barrage of conferences next month, 2009 MySQL Camp, Percona Performance Conference and MySQL Conference & Expo.

I’m going to check out The Juice Database Benchmark next as a more realistic benchmark to DBT2 and sysbench.

Identifying resource bottlenecks – CPU

One of the first steps when addressing a MySQL performance tuning problem is to perform a system audit of the physical hardware resources, then identify any obvious bottlenecks in these resources.

When dealing with CPU, a quick audit should include identifying the number of CPU cores your server has, and the types of these cores. The key file on Linux systems is /proc/cpuinfo.

Number of cores can be found via the command cat /proc/cpuinfo | grep “^processor” | wc -l

You need to look more closely at the file to determine the type of CPU (e.g. below the model name shows Intel(R) Xeon(R) CPU X3220 @ 2.40GHz. The combination of knowing the number of processors (cores) listed and physical id and siblings helps identify how many CPUs and how many cores per CPU exist.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz
stepping	: 11
cpu MHz		: 2394.051
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips	: 4789.96
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

...

Other commands that help with identifying CPU/cores include mpstat and top.

$ mpstat -P ALL 5

11:43:43 AM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
11:43:48 AM  all    0.00    0.00    0.00    0.00    0.05    0.00    0.00   99.95   1033.00
11:43:48 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.40
11:43:48 AM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
11:43:48 AM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     31.40
11:43:48 AM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      1.00
$ top
0

top - 11:42:09 up 36 days, 13:17,  2 users,  load average: 0.20, 0.24, 0.25
Tasks: 133 total,   1 running, 132 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4050776k total,  3825584k used,   225192k free,   397580k buffers
Swap:  1052248k total,      128k used,  1052120k free,  2302408k cached

You can easily identify a CPU bottleneck using the vmstat command.

The following shows an idle system.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    128 234088 409632 2474372    0    0     0     0 1081  198  0  0 100  0  0
 0  0    128 234088 409632 2474396    0    0     0     0 1003   59  0  0 100  0  0
 0  0    128 234088 409636 2474392    0    0     0   100 1085  209  0  0 100  0  0
 0  0    128 233836 409636 2474396    0    0     0     0 1014  184  3  0 97  0  0
 0  0    128 233284 409636 2474396    0    0     0     0 1182  435  2  0 98  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1024  104  1  0 99  0  0
 0  0    128 233176 409636 2474396    0    0     0     0 1079  195  0  0 100  0  0
 1  0    128 233168 409644 2474396    0    0     0   232 1021  188  3  0 97  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1111  213  2  0 98  0  0
 0  0    128 233176 409644 2474396    0    0     0     0 1005   60  0  0 100  0  0

The key columns (from the man page are)

CPU – These are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
  • st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.
  • Procs

  • r: The number of processes waiting for run time.

NOTE: The columns of vmstat may vary between different Linux Operating Systems.

If you system is CPU Bound then you will observe this. Look at id,us,sy,r

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    128 275684 397176 2300672    0    0     0     0 1118  427 74  2 25  0  0
 3  0    128 217404 397176 2300672    0    0     0     0 1017  138 74  1 25  0  0
 6  0    128 239584 397176 2300672    0    0     0     0 1086  350 93  2  5  0  0
 4  0    128 269468 397176 2300672    0    0     0     0 1005  229 98  2  0  0  0
 4  0    128 217636 397180 2300668    0    0     0   168 1087  251 99  2  0  0  0
 4  0    128 240576 397180 2300668    0    0     0     0 1006  182 99  2  0  0  0
 4  0    128 270708 397180 2300668    0    0     0     0 1079  338 98  2  0  0  0
 4  0    128 218752 397180 2300684    0    0     0     0 1005  106 99  1  0  0  0
 4  0    128 226316 397180 2300684    0    0     0     0 1077  308 98  2  0  0  0
 4  0    128 198664 397184 2300680    0    0     0    76 1010  250 99  1  0  0  0
 4  0    128 179444 397184 2300680    0    0     0     0 1077  238 100  0  0  0  0
 4  0    128 185396 397184 2300688    0    0     0     0 1006  210 99  1  0  0  0
 4  0    128 199408 397184 2300688    0    0     0     0 1079  336 99  1  0  0  0

You should also be wary of a Single CPU Bound process. This is why knowing the number of cores is important. In this example, one CPU is bound.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st

 0  0    128  99592 412544 2477580    0    0     0     0 1017   89  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1090  222  0  0 100  0  0
 0  0    128  99592 412544 2477580    0    0     0     0 1019   98  0  0 100  0  0
 1  0    128  99592 412544 2477580    0    0     0     0 1096  347 14  0 86  0  0
 1  0    128  99592 412548 2477576    0    0     0    84 1030  194 25  0 75  0  0
 1  0    128  99592 412548 2477576    0    0     0     0 1094  300 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1012   76 25  0 75  0  0
 1  0    128  99592 412548 2477580    0    0     0     0 1096  318 25  0 75  0  0
 1  0    128  73192 412548 2477580    0    0     0     0 1039  273 29  0 70  0  0
 1  0    128  77284 412556 2477572    0    0     0   268 1122  373 25  1 75  0  0
 2  0    128  83592 412556 2477584    0    0     0     0 1036  374 27  1 72  0  0
 0  0    128  56220 412564 2477576    0    0     0   172 1017   84  7  0 94  0  0
 0  0    128  56220 412564 2477576    0    0     0     0 1078  192  0  0 100  0  0
$ mpstat -P ALL 1
12:15:55 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:56 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1072.00
12:15:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00
12:15:56 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:56 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     62.00
12:15:56 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      9.00

12:15:56 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
12:15:57 PM  all   25.00    0.00    0.00    0.00    0.00    0.00    0.00   75.00   1021.00
12:15:57 PM    0    0.00    0.00    0.00    0.00    0.00    1.00    0.00   99.00   1001.00
12:15:57 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
12:15:57 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     18.00
12:15:57 PM    3  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      2.00

I will be detailing identifying bottlenecks of Memory, Disk and Network in future posts. You can also find out more at the MySQL User Conference “Monitoring 101 – Simple stuff to save your bacon” session.

Classic quotes – Community One East

The CommunityOne East 2009 conference has finished up. There were a few classic statements made by the speakers during the day. They included.

“We have a community reception, that’s a long way to say free beer.”

“Google is the dial tone of the Internet, if it’s not there people start freaking out.”

“I am an insom-maniac, a late night hacker.”

“Having a successful catastrophic – Achieving your marketing goals, and your site crashes due to the load.”

“Ruby is a beautiful expressive fun language.”

(talking about cloud providers)“Lock-in, it’s like marriage, it’s not necessarily a bad thing.”

“It wasn’t a red carpet, but it was carpeted.”

Event: CommunityOne East in New York, NY.
Article Author: Ronald Bradford

Your Code, Your Community, Your Cloud… Project Kenai

Following the opening keynote announcement about Kenai I ventured into a talk on Project Kenai.

With today’s economy, the drive is towards efficiency is certainly a key consideration, it was quoted that dedicated hosting servers only run at 30% efficiency.

An overview again of Cloud Computing

  • Economics – Pay as you go,
  • Developer Centric – rapid self provisioning, api-driven, faster deployment
  • Flexibility – standard services, elastic, on demand, multi-tenant

Types of Clouds

  • Public – pay as you go, multi-tenant application and services
  • Private – Cloud computing model run within a company’s own data center
  • Mixed – Mixed user of public and private clouds according to applications

SmugMug was referenced as a Mixed Cloud example.

Cloud Layers

  • Infrastructure as a Services – Basic storage and computer capabilities offer as a service (eg. AWS)
  • Platform as a Service – Developer platform with build-in services. e.g. Google App Engine
  • Software as Service – applications offered on demand over the network e.g salesforce.com

Some issues raised about this layers included.

  • IaaS issues include Service Level, Privacy, Security, Cost of Exit
  • PaaS interesting point, one that is the bane of MySQL performance tuning, that is instrumentation
  • SaaS nothing you need to download, you take the pieces you need, interact with the cloud. More services simply like doing your Tax online.

Sun offers Project Kenai as well as Zembly.

Project Kenai

  • A platform and ecosystem for developers.
  • Freely host open source projects and code.
  • Connect, community, collaborate and Code with peers
  • Eventually easily deploy application/services to “clouds”

Kenai Features

  • Code Repository with SVN, Mercurial, or an external repository
  • Issue tracking with bugzilla, jira
  • collaboration tools such as wiki, forums, mailing lists
  • document hosting
  • your profile
  • administrative role

Within Kenai you can open up to 5 open source projects and various metrics of the respositories, issue trackers, wiki etc.

The benefits were given as the features are integrated into your project, not distributed across different sites. Agile development within the project sees a release every 2 weeks. Integration with NetBeans and Eclipse is underway.

Kenai is targeted as being the core of the next generation of Sun’s collaboration tools. However when I asked for more details about uptake in Sun, it’s only a request, not a requirement for internal teams.

The API’s for the Sun Cloud are at http://kenai.com/projects/suncloudapis.

Event: CommunityOne East in New York, NY.
Presenter: Tori Wieldt, Sun Microsystems
Article Author: Ronald Bradford

Everybody is talking About Clouds

From the opening keynote at CommunityOne East we begin with Everybody is talking About Clouds.

It’s difficult to get a good definition, the opening cloud definition today was Software/Platform/Storage/Database/Infrastructure as a service. Grid Computing, Visualization, Utility Computing, Application Hosting. Basically all the buzz words we currently know.

Cloud computing has the ideals of truly bringing a freedom of choice. For inside or outside of an enterprise, the lower the barrier, time and cost into freedom of choice give opportunities including:

  • Self-service provisioning
  • Scale up, Scale down.
  • Pay for only what you use.

Sun’s Vision has existed since 1984 with “The NETWORK is the Computer”.

Today, Sun’s View includes Many Clouds, Public and Private, Tuned up for different application needs, geographical, political, with a goal of being Open and Compatible.

How do we think into the future for developing and deploying into the cloud? The answer given today was, The Sun Open Cloud Platform which includes the set of core technologies, API’s and protocols that Sun hopes to see uptake among many different providers.

The Sun Cloud Platform

  • Products and Technologies – VirtualBox, Sun xVM, Q-Laser, MySQL
  • Expertise and Services
  • Partners – Zmanda, Rightscale, Kickapps
  • Open Communities – Glashfish, Java, Open Office, Zfs, Netbeans, Eucalyptus

The Sun Cloud includes:

  • Compute Service
  • Storage Service
  • Virtual Data Center
  • Open API – Public, RESTful, Java, Python, Ruby

The public API has been released today and is available under Kenai. It includes two key points:

  • Everything is a resource http GET, POST, PUT etc
  • A single starting point, other URI’s are discoverable.

What was initially showed was CLI interface exmaples, great to see this still is common, a demonstration using drag and drop via a web interface was also given, showing a load balanced, multi-teired, multi server environment. This was started and tested during the presentation.

Then Using Cyberduck (a WebDAV client on Mac OS/X) and being able to access the storage component at storage.network.com directly, then from Open Office you now get options to Get/Save to Cloud ( using TwoGuys.com, Virtual Data Center example document).

Seamless integration between the tools, and the service. That was impressive.

More information at sun.com/cloud. You can get more details also at the Sun Microsystems Unveils Open Cloud PlatformOfficial Press Release.

Event: CommunityOne East in New York, NY.
Article Author: Ronald Bradford

CommunityOne East – An open developer conference

With an opening video from thru-you.com – an individual taking random you-tube video and producing video mashup’s, the CommunityOne East conference in New York, NY beings.

The opening introduction was by Chief Sustainability Officer Dave Douglas. Interesting job title.

His initial discussion was around what is the relationship between technology and society. A plug for his upcoming book “Citizen Engineer” – The responsibilities of a 21st Century Engineer. He quotes “Crisis loves an Innovation” by Jonathan Schwartz, and extends with “Crisis loves a Community”.

He asks us to consider the wider community ecosystem such as schools, towns, governments, NGO’s etc with our usage and knowledge of technology.

Event: CommunityOne East in New York, NY.
Author: Ronald Bradford

Patience and Passion at Web 2.0 NY

Gary Vaynerchuk spoke next at Web 2.0 NY on Building Personal Brand Within the Social Media Landscape.

He was hilarious. His video presentation is available online, to share with others. He is inspirational for new young entrepreneur and I’d love to see him talk at Ultra Light Startups

His talk was simply “Patience and Passion”.

Words of wisdom included.

  • There is no reason to do stuff you hate.
  • You can loose just as much money doing stuff you love.
  • What do you want to do for the rest of my life, do it.
  • Hussle is the most important word. We are building businesses here.
  • You can monetize anything, you need to work hard, be patient and passionate about your business.
  • You have to have a business model, that makes some cash along the way – Freemium.
  • Your goal should be to leave a legacy.
  • You need to build brand equity, in yourself. There is never a bad time when you believe, when you work hard, when you know what you are doing.
  • People are the people that are going to help you, get out there and network, be transparent, be exposed.
  • Previously you had to work hard to build a brand, now you can use any network to become known and more successful.
  • 9-5 is for your job, a few hours with your family, then 7-2am is plenty of time to focus on your dreams.

In closing, you have to do what you love.

Integrity, clarity and responsibility – Web 2.0 NY Keynote

Next on the Web 2.0 keynote speaker list was Maria Thomas of www.etsy.com with her talk The DIY Guide to Growing a Company.

If never heard of Etsy before – Your place to buy & sell all things handmade, interesting site. Most companies start small, and stay small. Only 0.1 of 1% grow to any size (e.g. > $250 Million) Esty, this year has 100M in revenue, Amazon is about $2 billion.

The opening lines included the message “Don’t loose the essence of who you are, and what you want to achieve.” and the term Filotimo.

Filotimo (Greek)

  • Operating with integrity
  • a clarity of purpose
  • a sense of social responsibility

It was an interesting point about qualifications “I got my Internet Degree at Amazon.com, then building digital media business at NPR – National Public Radio.”

Some more quotes from this discussion.

  • Listen to end user, but be clear about when you want to go.
  • Set you goals, communicate them, measure them.
  • Practice Filimato – Keep it human.
  • Go behind the resume, talk to people, be direct, be honest. Believe in employees.
  • Understand that small decisions and impact bigger decisions.
  • Just because your a DIY company doesn’t mean you have build everything.
  • Get products out the door, get them out fast.
  • It takes an effort of will, and a very good process to do this well.
  • The perfect is enemy of good enough.
  • Launch products off at platform and build more quickly.
  • The marketplace has social, it’s personal, it’s playful.
  • Visits to etsy become habit forming, it makes connections to real people with unique products.

What if software was a physical object – NY Web 2.0 Third Keynote

Some points of reference from the next Web 2.0 keynote by Jason Fried of 37 Signals

  • Software business is a great place to be.
  • You can build anything you want. All you really have to it type, it’s not easy to do, it’s just not that hard to do.
  • Change is easy, cost is cheap in relation to physical objects.
  • You can build it anywhere.
  • Software doesn’t have the same kind of feedback as physical objects.
  • Visually we can determine good verses poor design (e.g. a bottle of water or a remote control).
    It doesn’t have edges, size or weight. It just expands, continues to expand and this is bad.
  • What would your software be like if it was physical?
  • When you say yes to too many features you end up with Homer’s car.
  • The goal should be simple, clean, elegant and streamlined.
  • Once you hit bloat, it’s too hard to go back.
  • Listen to customers, but don’t do everything as they say. Think of yourself as a curator.
  • Make your software a collection, not a warehouse.
  • You don’t need to have everything in the world.
  • You need a few solid features.
  • Real work is hard, imaginary work is easy.
  • Tell the people that want to have new features, to build them. Attach real costs to any requests.

The DNA of your company has to able to say no.

Technology changes, humans don't. – Web 2.0 NY Second keynote

I needed a rest from my opening keynote review NY Tech 1995-2008. Opening Web 2.0 Expo NY Keynote but a few siginificant points from The Death of the Grand Gesture by Deb Schultz.

  • An interesting site is Visual Complexity showing graphical representations of many social networks.
  • All the binary communication becomes white noise — Information Overload.
  • “Technology changes, humans don’t” – Deb Schultz

NY Tech 1995-2008. Opening Web 2.0 Expo NY Keynote

Web 2.0 Expo NY keynotes are happening today. Technology in use included CrowdVine which I’d not heard of, and plenty of Twitter feeds such as w2e_NY08.

The opening keynote was Fred Wilson from Union Square Ventures with his presentation New York’s Web Industry From 1995 to 2008: From Nascent to Ascendent .

Some stats, Seed and early stage deals.

  • 1995 230 SF Bay area, 30 in NY
  • 2008 360 SF Bay area, 116 in NY

Fred first asked “New York is not an alley. Call it Broadway, or just New York.”

Here is a summary of his history of New York Web Industry.

  • 1991 – ZDNet
  • 1993 – New York Online Dialup services
  • 1993 – Jupiter Communications online conference
  • 1993 – Prodigy
  • 1994 – Startups such as Pseudo, Total New york, Razorfish.
  • 1994 – Time Warner Pathfinder
  • 1995 -NYIC 55 Broad St. – Technology oriented building
  • 1995 – Seth Godin – Yoyodyne – Permission Marketing
  • 1995 – itraffic, agency.com, NY Times online
  • 1995 – Softbank, Double Click, 24×7, Real Media
  • 1996 – Silicon Alley Reporter
  • 1996 – ivillage, the knot
  • 1996 – Flatiron Partners – good sued for that
  • 1997 – The Silicon Alley Report Radio Show
  • 1997 – mining co.
  • 1997 – Total NY sold to AOL
  • 1997 – Agency rollups razorfish buying 4 companies
  • 1997 – DoubleClick IPO
  • 1998 – Seth Godin moves to Yahoo
  • 1998 – Burn Rate
  • 1998 – Kozmo – We’ll be right over
  • 1998 – was the last year of sanity in the Internet wave
  • 1999 – The start of the boom
  • 1999 the big players came online , all hell breaks loose. 200 startups were funded in 1999, 300 in 2000.
  • 2000 – The Crash & Burn
  • 2000 – f**kedcompany
  • 2000 – Google came to New York. – 86th St Starbucks
  • 2001 – Layoffs, Landlords and bankruptcies
  • 2002 – Rock bottom
  • 2003 – Renewal
  • 2003 – Blogging started gizmodo
  • 2003 – Web 2.0 coined
  • 2003 – del.icio.us was launched from a computer in an apartment
  • 2004 – NY Tech Meetup
  • 2004 – Union Square Ventures $120million raised
  • 2005 – about.com acquired by NY Times
  • 2005 – Etsy
  • 2006 – Google took over port authority building, now with 750 engineers in NY
  • 2008 – Web 2.0 comes to New York City

New York is now 1/3 of Silicon valley, compared to 1/8 of funded Internet companies.

One thing mentioned is a documentary called “We live in Public”. Some of the footage from 1999, is so early Big Brother.

Web 2.0 in NY

I will be attending next week’s Web 2.0 Expo 2008 in New York.

Garys Guide has a schedule of the key events and off site associated event parties.

It will be a bit of a change from the typical MySQL Conferences and recent OSCON Conference I have attended this year.

The Keynote titles gives you an indication of the variety of talks expected.

  • Organizing Chaos: The Growth of Collaborative Filters
  • (Re)making the Internet: Accounting for the Future of Information, Communication and Entertainment Technologies
  • Next Generation of Video Games
  • 10 Things We’ve Learned at 37signals
  • High Order Bit
  • What ManyEyes Knows
  • Arianna Huffington in Conversation with Tim O’Reilly
  • Because We Make You Happy
  • The Real Future of Technology
  • Enterprise Radar
  • The Death of the Grand Gesture
  • It’s Not Information Overload. It’s Filter Failure.
  • Building Personal Brand Within the Social Media Landscape