Top 5 wishes for MySQL

June 21, 2007 by ronald

Note: My views are just that: mine.

1. Real time Query Monitoring

MySQL 5.0 GA provides only 3 ways to look at the queries executed on a server: the Slow Query Log, the General Query Log and the Binary Log. All require a server restart to activate and de-activate. In a production system, it’s sometimes critical to be able to know “what is going on”, and you simply can’t restart the server twice (once to turn on, once to turn off). 5.1 goes some way with Log Tables, which let you direct the General and Slow Logs into tables. The question is, as Kevin Burton listed in his points, when is 5.1 going to be out?

Real time query monitoring also needs to have a granularity of operation better than “server”. There needs to be a capacity to assign this on a per connection basis. A server is being hammered, certain status variables are increasing greatly, and I need to know now what queries are causing this. MySQL provides no means of doing this. MySQL Proxy is a great new idea/project people will start hearing more about, and it goes a long way to helping, but it’s not dynamic, in that I can’t simply turn on logging on a production system without impact to the MySQL server or connections.

The time granularity of real time query monitoring also needs to be in better units; it’s very difficult to find slow running queries > 100ms when the present granularity is seconds. MySQL Proxy, as mentioned, and also Connector/J provide this. (BTW, Connector/J has excellent features in its many connection options if you develop with Java, but it’s yet another output to look at, and when your application server and database servers are on different machine architectures it’s a lot of work to sync.)

I am also behind SHOW PROFILE. I’d like to see it being able to be attached to existing connections and applied to queries, with the output discarded on a time based condition (say < 100ms). Granted, the act of observation slows things down, but the ability to observe, see and use the information that’s around in bits would be a start.
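To make the “discard anything under 100ms” idea concrete, here is a minimal client-side sketch in Python. It is not a MySQL feature and not how SHOW PROFILE works; the SlowQueryRecorder class and its parameters are names I made up for illustration. It simply times a call in milliseconds and keeps the timing only when it crosses a threshold.

```python
import time


class SlowQueryRecorder:
    """Time callables in milliseconds and keep only the slow ones (hypothetical helper)."""

    def __init__(self, threshold_ms=100.0, clock=time.perf_counter):
        self.threshold_ms = threshold_ms
        self.clock = clock   # injectable clock, so the timing logic can be tested
        self.slow = []       # (label, elapsed_ms) for every call kept

    def run(self, label, execute):
        """Run execute() and return its result; record the timing only if slow."""
        start = self.clock()
        result = execute()
        elapsed_ms = (self.clock() - start) * 1000.0
        if elapsed_ms >= self.threshold_ms:
            self.slow.append((label, elapsed_ms))
        return result


if __name__ == "__main__":
    recorder = SlowQueryRecorder(threshold_ms=100.0)
    recorder.run("fast", lambda: sum(range(10)))
    recorder.run("slow", lambda: time.sleep(0.15))  # stands in for a slow query
    for label, ms in recorder.slow:
        print(f"{label}: {ms:.1f} ms")
```

In real use the callable would wrap whatever executes the query in your connector of choice; the point is only that millisecond units and a discard threshold are cheap to have once the timing hook exists.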

2. Consistent Release Cycles

As Stewart pointed out, it’s ridiculous to have 2 years between releases of a product. A consistent cycle is needed. The 5.0 Change Notes show 22 months from first release to GA. The 5.1 Change Notes show 19 months so far, and it’s still not GA.
We talk now about getting new features, but 5.1 is frozen and 5.2 seems lost now in any discussions. 6.0 has a few key features, but I’m sure significant new features will be limited to ensure the exposure of Falcon. So a key new feature may be in 6.1+.

Scope creep, lack of clear planning, test coverage, and user community contribution I feel are all factors. I know that user community testing and contribution is continuing to increase and I applaud the valuable contribution of the community. I wish I could do more myself.

3. Information Schema Extensions

I’ve heard of a Pluggable API for I_S tables in 5.1. Could somebody really confirm? I’ve seen Google doing File System Storage Engines (e.g. for /proc info). I really, really, really wish things like the 5.1 processlist and status/variables tables were backported to 5.0 to start with.

In addition, now that I have started, there is a need for more detailed information on queries, and for extensions of status variables. SHOW PROFILE goes some way internally to indicate what’s going on, but knowing that a certain buffer is being used, and what portion of it per connection, will help in sizing. It’s important for example to know the tmp_table_size actually used in a result set; VARCHAR and COLLATION have a huge effect that people simply don’t consider. As the number of Pluggable storage engines increases, the need to know what is really happening is going to be more important. Some of this may be more in relation to Real Time Query Monitoring, but I feel certain additional information needs to be stored.

4. Online Table Maintenance

It was not until I had to time ALTER TABLE ADD COLUMN|INDEX operations recently that I realized the extent of the time it takes for InnoDB tables (i.e. it takes your database table offline during this time for any OLTP). My tests were taking over an hour (and I was not in the 3 digit GB range for a table). A real uptime system can’t support downtime like this. Traditionally, large scale-out MySQL applications have been developed around this limitation; however, to compete more with Enterprise experiences, and with resources coming from an enterprise background, this is simply not an option. Add the fact you can’t add a datafile to an InnoDB tablespace online (why not!). While speaking of datafiles, I echo Frank’s comment on the limitation that when using innodb_file_per_table, you can’t copy the file between MySQL instances (assuming for example all the same version of H/W, O/S, MySQL).

I really hope that Falcon addresses these issues to provide a transactional storage engine offering with these enterprise uptime features.

5. Published Benchmarks

MySQL does not publish any benchmarks, well at least not that I know about. The first problem is: how long is a piece of string? There are millions of variables, but it would be great if even a number of straightforward cases were proven. People may then have a better indication of baseline systems.

Here is my initial wish list.

Classification of Server Configuration. Let’s say a comparison between 1 CPU (dual core), 2 CPU (dual core) and 4 CPU (dual core) commodity H/W. With the same memory (4GB), disk (local disk), O/S (64 bit to support > 4GB), sample data (20-50GB) and same queries (simple OLTP only), just what are the benefits? Can we get a cost of throughput to $ cost?
Disk Configuration. Just how does local RAID 1/RAID 5/RAID 10 compare with SAN (RAID 5) and SAN (RAID 10)? Ok, there are many types of disks as well as RAID, but start with commodity SATA 7.2K, 8MB cache. In addition, how does a machine with 6 drives (in RAID 1 with OS, Data, Logs split) compare with RAID 5 or RAID 10?
RAM. How do our tests run when we take a 4GB system with 20-50GB of data and now give it 16GB?
Backup/Recovery. How long does it take to backup and restore?
Admin. How long does it take to ALTER TABLE or add a data file? Even something simple like how long it takes to load data into a memory table across H/W has been interesting.

As you can see it turns into a nightmare quickly. We didn’t talk about storage engines like MyISAM/InnoDB, tuning parameters, different O/S etc., but surely something really is better than nothing. If only there was a baseline of data and queries to start with. Surely with the data sources available out there, some enterprising person could create a 20GB, 50GB+ realistic production type data source, and 20-50 OLTP queries, and we have a baseline.

There is talk of the Build Farm by Jay for compiling, let’s get that baseline so we can run some tests across thousands of configurations.
If sufficient work was done by MySQL to get some standard start, then the community might take up the challenge of taking the data/queries/benchmarking framework and test on all the configurations out there, tune to the wazoo and provide back to the MySQL Forge data for everybody to look at.

One day we may know that this type of disk with this type of battery backup in this RAID configuration just isn’t anywhere near as good as 3 other types of options at roughly the same price.

The Rest

There are more, but in keeping with the spirit of 5 knowing that at least 2 people have shown scope creep already I’ll stop. I really want to mention more.

More About Top 5

Jay Pipes started the Top 5 wishes for MySQL recently. Here are the Planet MySQL contributors to date.

Jay Pipes
Marten Mickos – MySQL CEO
Stewart Smith
Kevin Burton
Farhan Mashaqi
Jeremy Cole

It’s almost like a chain letter, so I’ll start it by passing it on to 3 more. My challenge is to: Mike Kruckenberg; my evil (he isn’t really evil) twin, Roland Bouman; and Paul McCullagh.

27 June 2007 Update
Since my posting we have also had:
Antony Curtis
Alan Kasindorf
Jim Winstead
Jonathon Coombes

MySQL – Wikipedia

June 15, 2007 by ronald

I was reading only last week the notes from Wikipedia: Site Internals, Configuration and Code Examples, and Management Issues Tutorial by Domas Mituzas at the recent 2007 MySQL Conference. I didn’t attend this session; as with a lot of sessions, there was too much good stuff on at the same time.

It’s obviously taken a while to catch up on my reading, but with the present MySQL 12 days of Scale-Out I thought I’d complete my notes for all to see.

If you have never used Wikipedia, well, why are you reading this? You should spend an hour there now. Alexa places Wikipedia in the top 10 visited sites on the Internet.

Wikipedia runs on the LAMP stack, powered by the MySQL database. Nothing new here, but how Wikipedia scales is. Some of the interesting points involved how a “Content Delivery Network” was built with components including Squid, Lighttpd, Memcached and LVS to improve caching. Appropriate caching is an important component of a successful scale-out infrastructure. An interesting quote however:

The common mistake is to believe that database is too slow and everything in it has to be cached somewhere else. In scaled out environments reads are very efficient, and difference of time between efficient MySQL query and memcached request is negligible – both may execute in less than 1ms usually.
Sometimes memcached request may be more expensive than database query simply because it has to establish connection to a server, whereas database connection is usually open.

Wikipedia has developed an application Load Balancer. This offers flexibility in efficient database use and is critical to any scale-out infrastructure. Combined with a good Database API and items such as the Pager class, it allows you to write an efficient index-based offset pager (instead of the ugly LIMIT 50 OFFSET 10000 syntax), for example.
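For anybody who hasn’t seen an index-based pager, here is a minimal sketch of the idea. It is not Wikipedia’s Pager class; the page table, its columns and the fetch_page function are hypothetical, and I use Python’s built-in SQLite purely so the example runs anywhere. The same WHERE id > ? ORDER BY id LIMIT ? pattern applies to MySQL: each page starts from the last key seen, so the server never has to read and throw away 10000 rows to honor an OFFSET.

```python
import sqlite3


def fetch_page(conn, after_id, page_size):
    """Return the next page of rows whose id is greater than after_id."""
    return conn.execute(
        "SELECT id, title FROM page WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


def build_demo_db(row_count):
    """Create an in-memory table with row_count rows (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO page (id, title) VALUES (?, ?)",
        [(i, f"Article {i}") for i in range(1, row_count + 1)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db(200)
    last_seen = 0
    while True:
        rows = fetch_page(conn, last_seen, 50)
        if not rows:
            break
        print(rows[0][0], "to", rows[-1][0])
        last_seen = rows[-1][0]   # the next page starts after this key
```

The trade-off is that you can only step to the next page from a known key, not jump to an arbitrary page number, which is usually an acceptable price.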

The main ideology in operating database servers is RAIS: – Redundant Array of Inexpensive/Independent/Instable[sic] Servers

RAID0. Seems to provide additional performance/space. Having half of that capacity with an idea that a failure will degrade performance even more doesn’t sound like an efficient idea. Also, we do notice disk problems earlier. This disk configuration should be probably called AID, instead of RAID0.
innodb_flush_log_at_trx_commit=0, tempted to do innodb_flush_method=nosync. If a server crashes for any reason, there’s not much need to look at its data consistency. Usually that server will need hardware repairs anyway. InnoDB transaction recovery would take half an hour or more. Failing over to another server will usually be much faster. In case of master failure, last properly replicated event is last event for whole environment. No ‘last transaction contains millions’ worries makes the management of such environment much easier – an idea often forgotten by web applications people.

The thing I found interesting in the RAIS mysql-node configuration was slave-skip-errors=0,1213,1158,1053,1007,1062
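To save a trip to the manual, here is a small lookup of what those error numbers mean. The symbolic names are the ones listed in the MySQL server error reference as I know them; the describe function is just a hypothetical helper, and 0 is not an error code at all, so I label it as such.

```python
# Error numbers copied from the slave-skip-errors setting quoted above.
SKIPPED = [0, 1213, 1158, 1053, 1007, 1062]

# Symbolic names from the MySQL server error reference.
ERROR_NAMES = {
    1007: "ER_DB_CREATE_EXISTS",  # can't create database; database exists
    1053: "ER_SERVER_SHUTDOWN",   # server shutdown in progress
    1062: "ER_DUP_ENTRY",         # duplicate entry for a key
    1158: "ER_NET_READ_ERROR",    # error reading communication packets
    1213: "ER_LOCK_DEADLOCK",     # deadlock found when trying to get lock
}


def describe(codes):
    """Pair each error number with its symbolic name."""
    described = []
    for code in codes:
        if code == 0:
            name = "(not an error)"
        else:
            name = ERROR_NAMES.get(code, "(unknown)")
        described.append((code, name))
    return described


if __name__ == "__main__":
    for code, name in describe(SKIPPED):
        print(code, name)
```

Seen this way the list is mostly “the statement already happened or the connection went away” conditions, which fits the RAIS philosophy of failing over rather than repairing.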

However, the greatest tip is “All database interaction is optimized around MySQL’s methods of reading the data.” This includes:

Every query must have appropriate index for reads…
Every query result should be sorted by index, not by filesorts. This means strict and predictable path of data access…
Some fat-big-tables have extended covering indexing just on particular nodes…
Queries prone to hit multiversioning troubles have to be rewritten accordingly…

Wikipedia is also clever in its sharding: a means to implement vertical and horizontal partitioning of data via the application for optimal scale-out. This comes down to designing your application correctly from the start. Wikipedia considers its partitioning via:

data segments
tasks
time
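As a rough illustration of what partitioning in the application by these three dimensions can look like, here is a toy router. Everything in it is an assumption on my part: the cluster names, the segment keys, the task roles and the per-year table naming are all hypothetical and are not Wikipedia’s actual layout.

```python
from datetime import date

# Hypothetical layout: none of these names come from Wikipedia's real setup.
SEGMENT_CLUSTERS = {"en": "cluster-a", "de": "cluster-b"}
DEFAULT_CLUSTER = "cluster-c"
TASK_ROLES = {"write": "master", "read": "slave", "report": "reporting-slave"}


def route(segment, task):
    """Pick a server group by data segment, then a role by task."""
    cluster = SEGMENT_CLUSTERS.get(segment, DEFAULT_CLUSTER)
    return f"{cluster}/{TASK_ROLES[task]}"


def table_for(day):
    """Partition by time: one hypothetical log table per calendar year."""
    return f"access_log_{day.year}"


if __name__ == "__main__":
    print(route("en", "read"))
    print(route("fr", "write"))
    print(table_for(date(2007, 6, 15)))
```

The important design point is that the routing decision lives in one place in the application, so moving a segment to new hardware is a configuration change rather than a code hunt.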

HiveDB (not used by Wikipedia) is an open source framework for horizontally partitioning MySQL systems. Well worth reviewing.

Wikipedia also makes use of compression. This works when your data compresses well, like text. This improves performance; however, analysis on other projects has shown this does place a CPU impact on the server, so it is important to monitor and use appropriately.
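A quick way to check whether your own data is a good candidate is to measure the compression ratio on a sample before committing to it. This sketch uses Python’s zlib with made-up sample data (repetitive text versus random bytes); the compression_ratio function is my own hypothetical helper and says nothing about how Wikipedia compresses its content.

```python
import os
import zlib


def compression_ratio(data, level=6):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Repetitive text compresses very well; random bytes barely compress at all.
TEXT = ("Wikipedia is a free online encyclopedia. " * 200).encode("utf-8")
NOISE = os.urandom(len(TEXT))

if __name__ == "__main__":
    print(f"text:  {compression_ratio(TEXT):.3f}")
    print(f"noise: {compression_ratio(NOISE):.3f}")
```

Real article text will land somewhere between these two extremes, and timing the compress call on a realistic sample gives you a first feel for the CPU cost.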

Another clever approach is to move searching to tools more appropriate for this task, in this case Lucene. As with any scale-out it is important to leverage the power of appropriate tools for maximum benefit.

I have only summarized Domas’ notes. It’s well worth a detailed read.

MySQL – Testing failing non-transactional statements

June 15, 2007 by ronald

I was asked recently to confirm a consistent state of data in a non-transactional MySQL table after a failing statement updating multiple rows did not complete successfully.

Hmmm, this is what I did.

Created a MEMORY table
Populated with some data, and a Primary Key
Updated the Primary Key so that it failed with a Duplicate Key Error after updating only some of the rows (three of the five)
Confirmed that the rows that were updated, were, and the rows that were not updated, were not

DROP TABLE IF EXISTS mem1;
CREATE TABLE mem1(
i1  INT UNSIGNED NOT NULL PRIMARY KEY,
c1 CHAR(10) NOT NULL,
dt TIMESTAMP)
ENGINE=MEMORY;

INSERT INTO mem1(i1,c1) VALUES (1,'a'), (2,'b'), (3,'c'), (4,'d'), (5,'e');
SELECT * FROM mem1;
+----+----+---------------------+
| i1 | c1 | dt                  |
+----+----+---------------------+
|  1 | a  | 2007-06-14 17:26:29 |
|  2 | b  | 2007-06-14 17:26:29 |
|  3 | c  | 2007-06-14 17:26:29 |
|  4 | d  | 2007-06-14 17:26:29 |
|  5 | e  | 2007-06-14 17:26:29 |
+----+----+---------------------+
5 rows in set (0.00 sec)

UPDATE mem1 SET i1 = 9 - i1 - SLEEP(1), c1='x' ORDER BY i1;
ERROR 1062 (23000): Duplicate entry '5' for key 1
SELECT * FROM mem1;
+----+----+---------------------+
| i1 | c1 | dt                  |
+----+----+---------------------+
|  8 | x  | 2007-06-14 17:29:05 |
|  7 | x  | 2007-06-14 17:29:05 |
|  6 | x  | 2007-06-14 17:29:05 |
|  4 | d  | 2007-06-14 17:28:36 |
|  5 | e  | 2007-06-14 17:28:36 |
+----+----+---------------------+
5 rows in set (0.00 sec)

While I was also hoping for the TIMESTAMP column to reflect when each row was modified, it instead reflects when the statement was executed.

This test did however prove the requirements. Simple when you think about it, but it took a few minutes to think about it the first time.
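If you want to reason about why exactly three rows changed, the behavior can be modeled in a few lines of Python. This is only a simulation of a row-by-row update with no rollback, under my assumption that rows are processed in ascending key order as the ORDER BY i1 requests; the function and exception names are hypothetical and no MySQL is involved.

```python
class DuplicateKeyError(Exception):
    """Raised when an update would collide with an existing key."""


def update_without_rollback(rows, new_key, new_value):
    """Apply the change row by row in key order, stopping at the first
    duplicate key and leaving the rows already changed as they are."""
    for old_key in sorted(rows):              # snapshot of keys, ascending
        candidate = new_key(old_key)
        if candidate != old_key and candidate in rows:
            raise DuplicateKeyError(f"Duplicate entry '{candidate}' for key 1")
        del rows[old_key]
        rows[candidate] = new_value


if __name__ == "__main__":
    rows = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
    try:
        update_without_rollback(rows, lambda i1: 9 - i1, "x")
    except DuplicateKeyError as exc:
        print("ERROR:", exc)
    print(rows)
```

Rows 1, 2 and 3 become 8, 7 and 6; row 4 wants to become 5, which still exists, so the statement stops there and rows 4 and 5 keep their original values, matching the SELECT output above.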

Some comments of 'Five months with MySQL Cluster'

June 8, 2007 by ronald

I recently saw the Planet MySQL post Five months with MySQL Cluster by Brian Moon.

Thought I’d add my 5 cents worth (Australians don’t have 1 cent coins any more to make 2 cents worth).

Firstly, it’s great you wrote about your experiences in moving to MySQL Cluster. I think more people should.

Joins

“We used a lot of joins. We learned (later) that joins in the cluster are not a good idea.”

MySQL Cluster’s number one strength is Primary Key lookups. It’s very good you learned that joins (especially say 5-6-7 table joins) are not a good idea; they are not a strength, and certainly not scalable to thousands of queries per second.

Rewrite

“We rewrote our application (basically, our public web site) to use the new cluster and its new table design.”

It’s great you acknowledged that you had to rewrite your application. I’m sure the attitude of people in the industry is: We need more HA, MySQL offers MySQL Cluster as a HA 5x9s solution, let’s just put our existing application on Cluster. This simply does not provide the results people think, and could in theory result in disaster, particularly when choosing H/W, see next point.

I would expand on a few cases of what you re-wrote, this level of education will help educate the masses.

Hardware

“Six new Dell dual-core, dual processor Opterons with a lot of memory”.

First, MySQL Cluster is an in-memory database, so lots of memory is essential. Second, Data Nodes are a single threaded process, so even with 4 cores your H/W will be underutilized as data nodes.

If an organization wants to, say, get two 4 CPU dual core machines (that’s 2 machines × 8 cores), it’s impractical to use them as Data Nodes. Far greater performance, reliability and scalability is obtained by having 8 machines × 2 cores. The issue then becomes power consumption and rack space; this is what hurts MySQL Cluster. It’s important to remember MySQL Cluster was designed to run on low-end commodity hardware, and a lot of it.

“So, we configured the cluster with 4 replicas.”

Interesting. You don’t see many references to more than the default, documented and accepted 2 replicas.

Administration

“MySQL Cluster is a whole new animal.”

Yes it is. Having to run an ALL DUMP 1000, for example, and then parse log files for the “right” strings just to determine memory utilization needs to be improved. You may also want to check out ndbtop. I managed to get an earlier version working, but never really had the time to delve more. Monty also may have some admin stuff of interest buried within his NDB Connectors work.

Conclusions

“What a moron!”

Far from it. I hope your article helps educate the community more about MySQL Cluster. I’m certainly going to reference my responses to your article as “key considerations” when considering MySQL Cluster for existing applications.

I would add that with MySQL Cluster you require all equipment to be within a LAN, even the same switch. This is important, MySQL Cluster does not work in a WAN situation. I’ve seen an example H/W provided for a trial Cluster with some machines in a West Coast data center, and some in an East Coast data center.

You also can’t have a lag slave for example as in a Master/Slave environment.

You didn’t mention any specific sizes for data. I’d be interested to know, particularly growth and how you will manage that?
You also didn’t mention anything about disk? MySQL Cluster may be an in-memory database but it does a lot of disk work, and having appropriate disk is important. People overlook that.
You didn’t mention anything about timings? Like how do backups, for example, compare now to previously?
You didn’t mention version? 5.1 (not GA) is a significant improvement in memory utilization due to true varchar support, saving a lot of memory, but as I said it is not yet production software.

Amp'd Mobile no more

June 4, 2007 by ronald

Announced at the 2007 Conference as MySQL Applications of the Year – #1 in 3G Mobile Entertainment, Amp’d Mobile is no longer the poster boy of US telecommunications.

Amp’d Mobile Implodes: Burns $360 million, Declares Bankruptcy. Wow, that’s news on a Sunday.

Things that irk me!

June 3, 2007 by ronald

As part of my job, I spend a lot of time assisting people when they are driving. But sometimes it can be trying.

People that type commands, make a mistake, then backspace over typed text (like 10-20 characters), only to have to retype the text again. Using bash for example, you can just arrow back, change, then go to the end of the line, saving all that re-typing. And it’s so painful when they are slow typists.
People that have to use copy/paste (i.e. the mouse). They scroll up several pages to find a command, copy, then scroll down to paste, when up arrow a few times in command history gives you the command you want. Even simply removing the scrolling down (it’s not needed to paste) would save 1 of 4 operations.
Using copy/paste in vi, where you’re not in insert mode (so it doesn’t work properly, so they have to exit insert mode, copy, enter insert mode, paste, just to do it again)! People need to learn the power of Yank (Y or yy).

Innodb Monitoring I didn't know

June 2, 2007 by ronald

Ok, so I knew about innodb_table_monitor and innodb_tablespace_monitor. I’ve tried them before, looked at the output and given up, partly because it didn’t serve the purpose I wanted it to at the time, and also because its format was a little cryptic.

What I didn’t know was there are actually 4 monitors via this “create table functionality”. You can also do innodb_monitor which is the same as SHOW INNODB STATUS, and you can also do innodb_lock_monitor .

Another thing I didn’t know is that these commands don’t send the output just once; it’s on a timer. I’ve found the timers to be different. For innodb_monitor you get output every 15 seconds, as well as a nice line giving the period the averages cover, which seems to always say 16 seconds.


=====================================
070601 15:11:25 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 16 seconds

innodb_table_monitor was in this case about 65 seconds (the two samples below are 64 seconds apart)??

===========================================
070601 15:13:33 INNODB TABLE MONITOR OUTPUT
===========================================
 ...

===========================================
070601 15:14:37 INNODB TABLE MONITOR OUTPUT
===========================================
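Rather than eyeballing the timestamps, the interval between samples can be pulled out of the error log with a few lines of Python. This is a minimal sketch that assumes the header lines look exactly like the ones shown above (a yymmdd hh:mm:ss stamp followed by the monitor name and OUTPUT); the function name is hypothetical and other MySQL versions may format the header differently.

```python
import re
from datetime import datetime

# Matches e.g. "070601 15:13:33 INNODB TABLE MONITOR OUTPUT".
HEADER = re.compile(r"^(\d{6} \d{2}:\d{2}:\d{2}) (INNODB(?: \w+)? MONITOR) OUTPUT$")


def monitor_intervals(lines):
    """Map each monitor name to the seconds between its consecutive headers."""
    last_seen = {}
    intervals = {}
    for line in lines:
        match = HEADER.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%y%m%d %H:%M:%S")
        name = match.group(2)
        intervals.setdefault(name, [])
        if name in last_seen:
            intervals[name].append((stamp - last_seen[name]).total_seconds())
        last_seen[name] = stamp
    return intervals


if __name__ == "__main__":
    sample = [
        "070601 15:11:25 INNODB MONITOR OUTPUT",
        "070601 15:13:33 INNODB TABLE MONITOR OUTPUT",
        "070601 15:14:37 INNODB TABLE MONITOR OUTPUT",
    ]
    print(monitor_intervals(sample))
```

Fed the MySQL error log line by line, this gives a per-monitor list of gaps, which makes it easy to see whether the timers really are as irregular as they look.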

The fact it goes to the MySQL error log is annoying, rather than a log for InnoDB Monitoring. It’s much easier to script when you know the content coming into a file and can control it; with the MySQL error log you can’t.

More info at SHOW ENGINE INNODB STATUS and the InnoDB Monitors.

So you have to read the fine print in the MySQL Manual to get these things. If you also read the manual page to completion, which I did after I wrote this initial post you find yet another option innodb_status_file. If this is set for the server, it writes the output of SHOW ENGINE INNODB STATUS to a file every 15 seconds. What about the other outputs to file? Seems a need to be consistent here.

Things to ponder more.

Now where are all those useful tools that parse this output?
Is this output worth parsing and monitoring at any regularity?

Updated: 06/09

Two additional references that help in the understanding of these monitors are:

SHOW INNODB STATUS walk through – July 2006
InnoDB Table and Tablespace Monitors – Jan 2009

Log Buffer #47: a Carnival of the Vanities for DBAs

June 2, 2007 by ronald

June 1st, 2007 – by Ronald Bradford

Welcome to the 47th edition of Log Buffer, the weekly review of database blogs. No time to wait, let’s read more about this week’s database blogging activities.

The PostgreSQL Conference for Users and Developers wrapped up this week and Peter Eisentraut gives us a review including the lightning talks and wrap-up session with a charity auction in PGCon Day 4. Meanwhile Alex Gorbachev is at Miracle Scotland Database Forum – Day One, sounds like from his post there is a lot of drinking and tasting going on.

Tim Hall in Schema Owners and Application Users… starts with “I was trying to explain to a colleague the concept of using application users, rather than logging directly into the schema owner.” As he mentions, it’s an introductory topic however his article gives us a detailed discussion of implementation.

An OTN blog consolidates a number of announcements for Oracle, PHP and OPAL in its concise publication New PHP Doc and Software. Plenty of reading links here.

Paul Moen of Pythian covers an important step in Oracle: Standby Automatic File Management, which follows on from his earlier posts on creating a standby database.

Roland Bouman in MySQL User Defined Function Repository has made an effort to build a central repository of the varied UDFs that exist to extend MySQL SQL Function syntax. His follow up article The MySQL UDF Repository: lib_mysqludf_sys shows just how dangerous these can really be.

Partitioning, a key new feature in MySQL 5.1 can be incorrectly used as Sergey Petrunia has described in Partition pruning tip: Use comparisons of column, not of partitioning function value. While 5.1 is not GA yet, and maybe not for some time Kevin Burton was quick to comment about software functionality about to be deployed with this new functionality. Of course partitioning in MySQL is free, in Oracle it is not as Mathias describes in Rant about partitioning licensing. Not only do you have to purchase the top of the line Enterprise Edition at 40K, you have to purchase an additional option on top of that. I didn’t know that.

Peter Zaitsev of the MySQL Performance Blog discusses Predicting how long data load would take. This is a very common problem, particularly in a direct changeover of a production system to MySQL, and as with Peter’s example of loading 1TB, this is no longer an uncommon problem with MySQL.

Jag Singh from Optimize Data Warehouse uses a practical approach in Data type validation using regular expressions with a procedural language before data loading into a database, or performed within a database. It’s important we don’t lose sight that some things are best done not in the database.

Slashdot starts a flame war with aptly titled 8 Reasons Not To Use MySQL (And 5 To Adopt It) with a reference between Five Compelling Reasons to Use MySQL and Eight Sound Reasons Not to Use MySQL, both published by CIO magazine on the same day. MySQL Performance Blog, and Curt Monash were quick to respond with MySQL – to use or not to use and Whether or not to use MySQL respectively.

Q&A Webinar Part 4 – MySQL Cluster by Ivan Zoratti gives us 33 points of reference with MySQL Cluster. Being involved with MySQL Cluster, it’s interesting to read the types of questions people are asking. Following up, Jonathan Miller, a seasoned veteran with MySQL Cluster, shows in Just when you think you know something that even the experts can be stretched with the new MySQL Cluster Certification Exam. Likewise, when I passed the exam myself recently I was surprised by its complexity.

Andy Campbell in his blog Oracle Stuff I Should Have Known demonstrates he has too much spare time on his hands in understanding terminal colors in a novel but interesting Adding some colour to SQL*Plus. Would have looked better with some blue!

One of the features I promote in MySQL, and that exists in other RDBMS’, but Oracle does not have is native Multirow Inserts. Robert Vollman however provides Oracle’s two verbose alternatives. Only knew about one of them myself. Still they are far from the simplicity other database products have.

Brian Duff of DuffBuff writes If I had five Oracle wishes, they would be…. We all have wish lists and Brian writes, “I was thinking about things in the Oracle-sphere that I’d love to see happen over the next few years”. I won’t spoil his wish list, I think point 4 is a good one.

Using the OUTPUT and OUTPUT INTO clauses in SQL Server 2005 describes the new features for retrieving values that were just inserted/updated/deleted by a DML statement. Vardecimal Storage Format in SP2 discusses another new feature in SQL Server 2005, the new Vardecimal format. The caveat being that this functionality is restricted to the Enterprise and Developer Editions. One using SQL Server should also know about the Swiss-knife features in SQL Server to help you.

Things to take care before installing SQL Server 2005 on Vista operating system or Windows Server 2008?. Nothing more to say here, perhaps you need to read it if you’re a SQL Server user.

Data Geek Gal Beth touches on one of my pet peeves in Data Quality: As Meets the Needs of an Organization. If data has a structure, appropriate rules should be put in place to ensure this is maintained in the database. You will forever be doing data cleansing if you only bother to consider this basic step later. Developers should really learn to be smarter in this area. One point not discussed is, “Where are the validation rules applied, the application or the database?”


I’m not sure if a Log Buffer has had any images before, but I recently stopped at the Oracle HQ in Redwood City to snap this photograph. The irony of the experience not portrayed within the photograph was that I worked for Oracle Corporation in the 1990’s but never visited the HQ, and while I was out taking this photo I was wearing a MySQL shirt. (No honking by any peak hour Oracle workers.)

That’s it for this week, until next week. Clock’s ticking, Frank!

My ‘hourly’ MySQL monitor script Version 0.03

May 30, 2007 by ronald

I realized when I released my very crappy version of My ‘hourly’ MySQL monitor script I really should have included my standard logging.

So I did that the night I wrote my original blog, but never published it. I’ve had need to use it again today, so a few more usability tweaks for parameterization and we are good to go.

Now Version 0.03 includes three files:

hourly.sh
common.sh
mysql.conf

Simple use is:

$ cd /directory
$ vi mysql.conf
# correctly specify MYSQL_AUTHENTICATION
$ chmod +x ./hourly.sh
$ nohup ./hourly.sh &

This gives you the following files

-rw-r--r-- 1 rbradford rbradford  2643 2007-05-29 15:47 mysql.innodbstatus.070529.154757.log
-rw-r--r-- 1 rbradford rbradford   414 2007-05-29 15:47 mysql.processlist.070529.154757.log
-rw-r--r-- 1 rbradford rbradford 12597 2007-05-29 15:47 mysql.status.070529.154757.log
-rw-r--r-- 1 rbradford rbradford 22229 2007-05-29 15:47 mysql.variables.070529.154757.log
-rw-r--r-- 1 rbradford rbradford 13146 2007-05-29 15:47 os.ps.070529.154757.log
-rw-r--r-- 1 rbradford rbradford   390 2007-05-29 15:48 os.vmstat.5.070529.154757.log

By default the files are written to /tmp; you can override this by setting LOG_DIR.

It gives you a pile of output you can easily grep, I’m working on some very simple graphing. One thing I have done is pass the status into Mark Leith’s Aggregating SHOW STATUS Output as well as passed on some feedback that I hope will get integrated into later solutions.

For now, it’s a tool I can implement in a few seconds, run while somebody is showing or demonstrating a system, and I’ve got some meaningful information to look at. Combined with my more in-depth ‘minute’ script, a general-log and taking notes of individual steps in a system walk through, I have all the information I need to analyze a working system very quickly from a purely database level. Still there is lots to do manually, but I’ve got a consistent view of information to review.

My 'hourly' MySQL monitor script

May 27, 2007 by ronald

Following my article Everything fails, Monitor Everything, and some inquiries, I’ve made some small modifications to my initial hourly script. This script is still a quick and dirty trial of what I’m wanting to develop, but in true Guy Kawasaki terms “5. Don’t worry, be crappy”. It works for now, and enables me to determine what works and what doesn’t.

My goals are Data Collection, Data Analysis and Data Presentation. This is the start of Data Collection. So now I get the following files:

os.vmstat.070524.122054.log
os.ps.070524.122054.log
mysql.innodbstatus.070524.122054.log
mysql.processlist.070524.122054.log
mysql.status.070524.122054.log
mysql.tablestatus.070524.122054.log
mysql.tablestatus.vertical.070524.122054.log
mysql.variables.070524.122054.log



#!/bin/sh
#  Name:    hourly
#  Purpose: Script to 'cron' hourly to run for monitoring
#  Author:  Ronald Bradford

error() {
  echo "ERROR: $1"
  exit 1
}

MYSQL_AUTHENTICATION=".mysql.authentication"
[ ! -f "$MYSQL_AUTHENTICATION" ] && error "You must specify MySQL Authentication in $MYSQL_AUTHENTICATION"
[ -z "`which mysqladmin`" ] && error "mysqladmin must be in the PATH"

DATETIME_FORMAT="+%y%m%d.%H%M%S"
DATETIME=`date $DATETIME_FORMAT`
DATABASE="test"

AUTHENTICATION=`cat $MYSQL_AUTHENTICATION`
# run vmstat every second for 1 hour
# normally this is overkill and 5 seconds is acceptable,
# but we need to monitor for any spike

VMSTAT_OPTIONS="1 3600"
LOG_FILE="os.vmstat.$DATETIME.log"
echo "INFO:  Logging vmstat $VMSTAT_OPTIONS to $LOG_FILE"
vmstat $VMSTAT_OPTIONS > $LOG_FILE 2>&1 &

LOG_FILE="os.ps.$DATETIME.log"
echo "INFO:  Logging ps to $LOG_FILE"
ps -ef > $LOG_FILE 2>&1 &

LOG_FILE="mysql.variables.$DATETIME.log"
echo "INFO:  Logging mysqladmin variables to $LOG_FILE"
echo "| date_time                        | $DATETIME |" > $LOG_FILE
mysqladmin $AUTHENTICATION variables >> $LOG_FILE 2>&1 &

LOG_FILE="mysql.tablestatus.vertical.$DATETIME.log"
mysql $AUTHENTICATION $DATABASE -e "SHOW TABLE STATUS\G" > $LOG_FILE 2>&1 &
LOG_FILE="mysql.tablestatus.$DATETIME.log"
mysql $AUTHENTICATION $DATABASE -e "SHOW TABLE STATUS" > $LOG_FILE 2>&1 &

COUNT=0
MAX_COUNT=60
SLEEP_TIME=60
LOG_FILE1="mysql.status.$DATETIME.log"
LOG_FILE2="mysql.processlist.$DATETIME.log"
LOG_FILE3="mysql.innodbstatus.$DATETIME.log"
> $LOG_FILE1
> $LOG_FILE2
> $LOG_FILE3
echo "INFO:  Logging mysqladmin extended-status per $SLEEP_TIME secs for $MAX_COUNT times to $LOG_FILE1"
echo "INFO:  Logging mysqladmin processlist per $SLEEP_TIME secs for $MAX_COUNT times to $LOG_FILE2"
echo "INFO:  Logging mysql show innodb status per $SLEEP_TIME secs for $MAX_COUNT times to $LOG_FILE3"
while [ $COUNT -lt $MAX_COUNT ]
do
  NOW=`date $DATETIME_FORMAT`
  echo "| date_time                        | $NOW |" >> $LOG_FILE1
  echo $NOW >> $LOG_FILE2
  echo $NOW >> $LOG_FILE3
  mysqladmin $AUTHENTICATION extended-status >> $LOG_FILE1
  mysqladmin --verbose $AUTHENTICATION processlist >> $LOG_FILE2
  mysql $AUTHENTICATION $DATABASE -e "SHOW INNODB STATUS\G" >> $LOG_FILE3 2>&1 &
  COUNT=`expr $COUNT + 1`
  sleep $SLEEP_TIME
done
exit 0

So from here, I need to:

Put into my standard sh script framework which provides correct logging, message management and true parameterization.
Additional pre-checks for the correct security requirements
Revised Parameterised settings including database
Host and Instance logging
Additional file parsing for later Data Analysis and Data Presentation.

Cool Photo Printing Site

May 26, 2007 by ronald

I just came across Moo as a link from Fotolog.

This site gives you the option to get a photo-like card on the front, and text on the back. The interesting part is that you can use 100 different photos in a batch of 100, and it’s the size that grabs me. I’ve been looking for something different for my various sites including Admiring Creation and Heavy Horse Day and GeekErr. I’m just now trying to get all my photos ready for printing!

Using Perl with MySQL

May 23, 2007 by ronald

NOTE: Problems presently exist, I’m seeking the expert help of the community and Perl Gurus

I need to do some quick benchmarking. I use MyBench because it’s effective: plug in a query, add some randomness, and 2 minutes later (with a correctly configured Perl/MySQL environment) you have multi-threaded load testing.

However, when the environment you are on is not configured, and you only know the basics for Perl Operation and Installation, (code is just code, that’s the easy part) and the box is not accessible to the outside world say for cpan, it gets more complicated. I’ve attempted to install and configure DBI, DBD::mysql and Time::HiRes but without success.

DBI
DBI was straightforward: download, make and make install all worked without issue.

DBD::mysql
DBD::mysql didn’t need a compile; mysql.pm already existed and make said it was all up to date. However, running mybench then gave the first error.

failed: Can't locate DBD/mysql.pm in @INC

Ok, so it wasn’t installed as ‘root’. Some small Perl Pathing.

PERL5LIB=~/DBD-mysql-4.004/lib;export PERL5LIB

This led to the next error:

failed: Can't locate loadable object for module DBD::mysql in @INC (@INC

Hmmm, a little more complicated. So going back to the compiling part, I realized I could force compile it, and this also confirmed one possible issue, the libmysqlclient library.

perl Makefile.PL
I will use the following settings for compiling and testing:
  cflags (mysql_config) = -I/path/to/mysql/include
  embedded (mysql_config) =
  libs (mysql_config) = -L/path/to/mysql/lib -lmysqlclient -lz -lcrypt -lnsl -lm
..

Both mysql.h and the libmysqlclient library were correctly located and valid, but still no luck.

Moving in parallel, I then found an SA who could install the RPMs (this being RHEL). The problem is that MySQL is not installed via RPM, so the only possible means of installing DBD::mysql is to force no dependencies. This did not prove successful, but it added clues to the problem.

failed: Can't load '/usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi/auto/DBD/mysql/mysql.so' for module DBD::mysql: libmysqlclient.so.14: cannot open shared object file: No such file or directory at /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi/DynaLoader.pm line 230.

So it used the installed DBD::mysql, and it couldn’t find libmysqlclient.so.14 on the expected library path. This gives some confidence that perhaps the earlier build is right, and that some other dependency is missing.

I’ve not found any good resource online to work through this in my environmental situation, but surely this is pain that somebody else has experienced.

Thanks to those friends who have already contributed small parts to getting some way here. However, it just ain’t working, and I need it to just work.

Any input appreciated.

Website of the Day – Slideshare

May 21, 2007 by ronald

I came across an interesting site while reading World’s Best Presentation Contest Winners Announced by Guy Kawasaki called SlideShare.

It’s a happy medium between the bulk of image sites like Flickr and Yahoo Photos and video sites like Revver and YouTube where you can easily add Text to what you are wanting to say in a Slide Show. Interestingly enough, like most Web 2.0 Communities people will come up with ideas you never considered, for example check out Evangeline Lilly where this is effectively a portfolio photo shoot of an actress. Clever.

Everything fails, Monitor Everything

May 20, 2007 by ronald

From the recent MySQL Conference a number of things resonate strongly almost daily with me. These included:

Guy Kawasaki – Don’t let the bozos grind you down.
- Boy, the bozos have ground me down this week. I slept for 16 hrs today, the first day of solid rest in 3 weeks.
Paul Tuckfield – YouTube and his various caching tip insights.
- I’ve seen the promising results of Paul Tuckfield’s comment of pre-fetching for Replication written recently by Farhan.
Rasmus – SSL is not secure – This still really scares me.
- How do I tell rather computer illiterate friends about running multiple browsers, clearing caches, never visiting SSL sites after other sites that are insecure etc?
Everything fails, Monitor Everything – Google

What I’ve been working on most lately, if only briefly, and what I want to be far more prepared for everywhere I go, is Monitor Everything.

It’s so easy on site to just do a vmstat 1 in one session and a mysqladmin -r -i 1 extended-status | grep -v " | 0 " in another. You may observe a trend and make some notes, say 25% CPU, 3000 Selects, 4000 Insert/Updates per second etc, but the problem is that the next day you don’t have actual figures to compare. What was Table_locks_waited yesterday? It seems high today.

I also only found a problem on a site when I graphed the results. I’ll give you a specific example. The average CPU for the system was 55%; the target was 50%. When graphing the CPU, it was plainly obvious something was not right. I could see, with extreme regularity (I counted 12 in one hour), a huge CPU spike lasting a second or two. It was so regular in the graph that it could not be random. After further investigation and testing, it turned out that a job run every 5 minutes on this production server (and not on previous testing servers) took 25% CPU for a second or two, along with a huge number of page faults. Did it affect the overall performance of the system? I don’t know, but it was a significant anomaly that required investigation.
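A graph made this obvious to the eye, but the same check can be scripted against recorded vmstat figures. The Python sketch below is only an illustration with made-up numbers (a 55% baseline and an arbitrary threshold of 70), not the data from that site: it finds where each spike starts and how far apart the spikes are.

```python
def spike_starts(samples, threshold):
    """Return the index of the first sample of each run above threshold."""
    starts = []
    in_spike = False
    for index, value in enumerate(samples):
        above = value > threshold
        if above and not in_spike:
            starts.append(index)
        in_spike = above
    return starts


def gaps(starts):
    """Return the distance between consecutive spike starts."""
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Made-up data: one sample per second for an hour,
# with a 2 second spike every 300 seconds.
cpu = [55] * 3600
for second in range(0, 3600, 300):
    cpu[second] = cpu[second + 1] = 80

starts = spike_starts(cpu, threshold=70)
print(len(starts), set(gaps(starts)))
```

For this made-up series the script prints 12 {300}: twelve spikes, every one exactly 300 seconds after the last, which is the kind of regularity that points at a scheduled job rather than random load.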

So, quite simply, always monitor and record so you can later reference, even if you don’t process the raw figures at the time. The question is then, “What do I monitor”. Answer, monitor everything.

The problem with most monitoring tools, e.g. vmstat and mysqladmin, is the lack of a timestamp for easy comparison. It’s really, really annoying that you can’t add this to the line output. The simple solution is to segment your data into both manageable chunks and consistent chunks.

Manageable chunks can be as easy as creating log files hourly, ensuring they start exactly at the top of the hour. Use a YYYYMMDD.HHMMSS naming standard for all files and you can never go wrong.
Consistent chunks means starting all your manageable monitors (e.g. hourly) at the exact same time, so you can compare.
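Both ideas are easy to script. The following Python sketch is one way to do it, under my own assumptions: chunk_name builds the top-of-the-hour file name, and stamp_lines adds the missing timestamp to each output line given the start time and sampling interval. Both function names are hypothetical, and stamp_lines assumes the header lines have already been removed and that no samples were dropped.

```python
from datetime import datetime, timedelta


def chunk_name(now, source):
    """Name the log file after the top of the current hour."""
    top = now.replace(minute=0, second=0, microsecond=0)
    return f"{top:%Y%m%d.%H%M%S}.{source}.log"


def stamp_lines(lines, start, interval_secs):
    """Prefix each sample line with the time it was (nominally) taken."""
    for index, line in enumerate(lines):
        taken = start + timedelta(seconds=index * interval_secs)
        yield f"{taken:%Y%m%d.%H%M%S} {line}"
```

For example, chunk_name(datetime(2007, 5, 19, 16, 7, 42), "os.vmstat") gives 20070519.160000.os.vmstat.log, so every monitor started within the same hour writes to a file with the same prefix.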

You need to monitor at least the following:

vmstat
mysqladmin extended-status
mysqladmin processlist
mysqladmin variables
mysqladmin -r -i [n] extended-status | grep -v " | 0 "

I haven’t found an appropriate network monitoring tool, but you should also look at that.

The issue here is frequency. Here are some guidelines: vmstat every 5 seconds; extended-status and processlist every 30 seconds; variables every hour. Recording extended-status differences is difficult, but it saves a lot of number crunching later for quick analysis. I do it every second, but not all the time; you need to work out a trigger to enable it, or, say, run it for 30 seconds every 15-30 minutes.
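If you only keep the absolute extended-status samples, the differences can be computed afterwards. This Python sketch does roughly what the -r option plus the grep does: it subtracts the previous sample from the current one and drops anything that did not change. It assumes each sample is already a dict of variable name to string value; status_deltas is a hypothetical name.

```python
def status_deltas(previous, current):
    """Return {variable: change} for counters that moved between two samples."""
    deltas = {}
    for name, value in current.items():
        if name not in previous:
            continue
        try:
            diff = int(value) - int(previous[name])
        except ValueError:
            continue  # non-numeric status values cannot be differenced
        if diff != 0:
            deltas[name] = diff
    return deltas
```

Divide each delta by the number of seconds between the two samples to get a per-second rate.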

So in one hour I could have:

20070519.160000.os.vmstat.log
20070519.160000.mysql.variables.log
20070519.160000.mysql.status.log
20070519.160000.mysql.processlist.log
20070519.160500.mysql.status.increment.log, then 1610, 1620, 1630 etc

I have my own scripts for monitoring under development, and I’ve been revising slowly, particularly to be able to load data into a MySQL database so I can easily use SQL for analysis. One thing I actually do is parse files into CSV for easy loading.

There are two tools out there that I’m reviewing and you should look at. Mark Leith has written an Aggregated SHOW STATUS stat pack, and there is also a tool called mysqlreport. Both go some way towards what I ultimately want.

I haven’t used it yet, but I’ve seen and been very impressed with the simplicity of Munin for graphing. I really need to get some free time to get this operational.

So Monitor Everything and Graph Everything. Plenty of work to do here.

Reading the right MySQL Manual

May 17, 2007 by ronald

I learned an extremely valuable lesson today on a client site. It’s important that users of MySQL read the right version of the manual for the product they are using. It’s very easy to just go to http://dev.mysql.com/doc/, which is what I type in directly, and browse the manual. While the MySQL Manual has separate sections for 4.x, 5.0, 5.1 etc, the 5.0 Manual for example reflects the most current version of MySQL 5.0. You may not be running the most current version; in fact most production systems rarely run the current version.

My specific case was with the Connector/J (JDBC) Reference for 5.0.4. The manual pages reflect the new 5.0.5 or today’s 5.0.6 release, and a particular default is now a different value. With Connector/J the docs are bundled with the version. The MySQL Community Server product does not bundle the manual, and I don’t know where to view instances of the MySQL manual for each specific dot release!

The MySQL Conference recap

May 15, 2007 by ronald

I recently had the opportunity to return and speak at the Brisbane MySQL Users Group. I spent some time talking about MySQL User Conference 2007 Summary and Life as a Consultant. My summary included:

Overview
Keynotes
Marten Mickos – MySQL
Guy Kawasaki
Michael Evans – OLPC
Rasmus Lerdorf – PHP
Paul Tuckfield – YouTube
Community Awards
Product Road Map
Google
Storage Engines
Dorsal Source
What’s Next

One question was posed to me: “What new did MySQL do this year?”, that is, since the last User Conference. MySQL did not seem to make as great an impact at the conference as the successes of the previous year. I had to think for some time to come up with the following list.

MySQL Network Monitoring and Advisory Service for Enterprise Level Customers Read More
MySQL Enterprise and MySQL Community split Read More
MySQL Enterprise Unlimited Read More
Pluggable Storage Engine Architecture continuing to become more viable for engines like PBXT, Nitro and InfoBright
Falcon Alpha Download
Time’s Person of the Year is “You”, with websites like YouTube, Wikipedia, Facebook and Flickr all powered by MySQL. Read More

And most recently:

Open Source Database Vendor Partners with LINBIT to Jointly Promote & Support DRBD for MySQL Enterprise Read More
IBM DB2 as a Certified Storage Engine for MySQL on System i Read More

It’s hard to say if these are big ticket items or not, but it is definitely disappointing that 5.1 GA is still MIA. We stay tuned.

I also managed a much better response than I got from my Conference Presentation opening slide.

“How can you tell an Oracle DBA has touched your MySQL Installation?”

MYSQL_HOME=/home/oracle/products/mysql-version
mysqld_safe --user=oracle &

Reading the MySQL Manual

May 3, 2007 by ronald

I was asked the question today, “How do I show the details of a Stored Procedure in MySQL? The SHOW PROCEDURE ‘name’ didn’t work.”

The obvious answer was SELECT ROUTINE_NAME,ROUTINE_DEFINITION FROM INFORMATION_SCHEMA.ROUTINES, given I like to use the INFORMATION_SCHEMA whenever possible. This led me to wonder what the corresponding SHOW command is. A quick manual search got me to SHOW CREATE PROCEDURE.

What was interesting was not this, but the list of other SHOW commands I didn’t know. I did not know about SHOW MUTEX STATUS, SHOW OPEN TABLES and SHOW PROCEDURE CODE for example.

It pays if you have free time (who ever has that), to re-read the manual, or at least a detailed index regularly. I’m sure there are plenty more helpful tips out there. Now just what does the output of these new commands really do, and does it really help. If only I could get commands to the stuff I really want.

MySQL Cluster Certified

April 28, 2007 by ronald

Jonathon Coombes recently blogged in MySQL Cluster Certified that he passed the MySQL Cluster DBA Certification and was the first Australian to do so. Lucky for him, I passed the exam after my presentation on the second day of the conference. I guess we Australians are leading the world!

As Jonathon said, it was rather hard, certainly more difficult than the other DBA exams, but nothing for an experienced Cluster DBA.

MySQL Conference – YouTube

April 27, 2007 by ronald

MySQL Conference 2007 Day 4 rolled quickly into the second keynote Scaling MySQL at YouTube by Paul Tuckfield.

The introduction by Paul Tuckfield was: “What do I know about anything, I was just the DBA at PayPal, now I’m just the DBA at YouTube. There are only 3 DBAs at YouTube.”

This talk had a number of great performance points, with various caching situations. Very interesting.

Scaling MySQL at YouTube

Top Reasons for YouTube Scalability

The technology stack:

Python
Memcache
MySQL Replication

Caching outside the database is huge.

In a display of the number of hits per day it was said “I can neither confirm nor deny the interpretation will work here (using an Alexa graph)”. This is not the first time I’ve heard this standard “Google” response. They must get lessons from lawyers in what you can say.

Standardizing on DB boxes (but they crash almost daily)

4x2ghz opteron core
16G RAM
12x10k scsi
LSI hardware raid 10
Replication played a big part in fixing
Get a reliable H/W supplier

Replication Lessons

You don’t worry about it when a replica fails.
One thing that sucks: InnoDB doesn’t recover very fast. It does that durability thing, but it takes hours to finish recovering (was it going to finish?)
How many backups can you restore? When you switch to a replica, are you sure it’s right?
Did you test recovery, did you test your backups.
replication was key to trying different H/W permutations to identify incompatible H/W (combinations of controllers/disks)
we got good at re-parenting/promoting replicas, really fast
we built up ways to clone databases as fast as possible
Excellent way to test tuning changes or fixes (powerful place to test things)
Keep “intentional lag”/Stemcell replicas – Stop SQL thread, keeps a server a few hours or a day behind. Say if you drop a table you have a online backup.
When upgrading, always mysqldump then reload, rather than upgrading the database.
Don’t care about CPU’s. I want as much memory as possible, I want as many spindles as possible.
For YouTube 2-3 second lag is acceptable.

If your DB fits in RAM, great; otherwise:

Cache is king
Writes should be cached by raid controller (buffered really) not the OS
Only the db should cache reads (not raid, not Linux buffer cache)

Only DB should cache reads

Hit in db cache means lower caches went unused.
Miss in db cache can only miss in other caches since they’re smaller.
Caching reads is worse than useless. It’s serialized writes.
Avoiding serialization in reads reaps compound benefits under high concurrency

An important lesson learned: do not cache reads in the F/S or the RAID controller.

Caching Lessons
Overcoming Mystery Serialization

Use O_DIRECT
vm.swappiness=1-5
if you’re >80% busy, you’re not doing I/O concurrently; look at other figures, e.g. 80% busy with 8 I/Os, next configuration 80% with only 4 I/Os
Mirror in H/W, stripe in S/W

Scale Out

Writes are parallel to master, but serialized to replicas. We need true horizontal partitioning.
We want true independent masters
EMD – Even More Databases — Extreme Makeover Database
Slave transactions must serialize to preserve commit order (this is why replication is always way slower)
The oracle caching algorithm (that’s a small o) — predicting the future
Replication lags: one IO bound thread. You do know the future, commands are coming up serially.
Write a script to do reads, before updates coming up (because they are cache hits).
The diamond. For go-live, play the shards’ binlogs back to the original master for fallback.
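The pre-fetch idea from the notes above (do the reads before the updates that are coming up, so the single replication thread finds the pages already cached) can be sketched in Python. This is my own naive illustration, not the script used at YouTube: it rewrites a simple UPDATE or DELETE into a SELECT over the same rows using regular expressions, and ignores anything it does not recognise. A real tool would read upcoming statements from the relay log and would need a proper SQL parser.

```python
import re

# Naive patterns: single-table statements with a WHERE clause only.
_UPDATE = re.compile(
    r"^\s*UPDATE\s+(\S+)\s+SET\s+.*?\s+(WHERE\s+.*)$", re.IGNORECASE | re.DOTALL
)
_DELETE = re.compile(
    r"^\s*DELETE\s+FROM\s+(\S+)\s+(WHERE\s+.*)$", re.IGNORECASE | re.DOTALL
)


def prefetch_query(statement):
    """Return a SELECT touching the rows the statement will change, or None."""
    stmt = statement.strip().rstrip(";")
    for pattern in (_UPDATE, _DELETE):
        match = pattern.match(stmt)
        if match:
            table, where = match.groups()
            return f"SELECT * FROM {table} {where}"
    return None
```

Running the generated SELECT on the replica a little ahead of the SQL thread pulls the needed pages into the database cache, so the write that follows should be a cache hit rather than a disk read.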

MySQL Conference – Get Behind Dorsal Source

April 27, 2007 by ronald

In a community session yesterday at MySQL Conference 2007, I first heard about Dorsal Source. A collaboration between Solid DB and Proven Scaling that allows for community people to upload patches to MySQL, get it compiled across multiple platforms, and have a downloadable distribution available on H/W individual contributors will never have access to.

That’s a great idea. I also hope we get the opportunity to get compiling of patches into multiple versions, as well to get builds of a lot of patches together. Personally, I’m running 3 versions just to diagnose one problem. 5.0.36 with a custom binary change, 5.0.37 so I have SHOW PROFILE, and 5.0.33 so I have microslow patch.

With new patches becoming available from the community, I hope I can see builds that combine all known patches that Dorsal Source may have.

I think this is going to be a great project.

MySQL Conference – PHP on Hormones

April 27, 2007 by ronald

MySQL Conference 2007 Day 4 started early again at 8:20 am with PHP on Hormones by the father of PHP, Rasmus Lerdorf.

A very funny man, and one of the most insightful talks of the conference (rather scary actually). Here are some opening comments.

In his own words as Keynote speaker. “I’m here because I’m old”.
PHP 1 from 1994 started after seeing Mosaic in 1993. Because it was just me using it, I could change the language any time.
In 2005 the code looks like this (in comparison to 1995). I’m not sure if this is worth 10 years of development
I wrote PHP to avoid programming
It’s changed to be more OO because people expect that. Universities teach this.
Hey, I was fixing bugs in my sleep. I would wake up, and in my mail box there would be bug fixes to bugs I didn’t even know I had.

Why do people contribute?

Self-interest
self expression
hormones
Improve the world

The slide included a great Chemical equation of “The Neuropeptide oxytocin” — Nature’s trust hormone

People need to attract other people, it makes you feel good, it comes out when you interact with people.

It’s not what people think about you, but rather what they think about themselves.

PHP was my baby, giving up control, just because I started it, doesn’t mean I have a bigger say in it.
Systems that harness network effects and get better the more people use them in a way that caters to their own self-interest. — Web 2.0
Once you build a framework you’re done; the users build the site, they drive the content.
The same people that work on open source projects, are the same people that use websites.
- Self-interest
- self expression
- hormones
- Improve the world

1. Performance
If your site falls apart you’re done.

Benchmark
- http_load
- Callgrind inside valgrind
- XDebug

valgrind --tool=callgrind

Excellent tool to see where time is spent in the code. You have to run a profiler.
Example of using Drupal. It turns out 50% of time was spent in the theme; it had 47 SQL queries, 46 Selects.
Went from 4 per second to 80 per second, without any code changes. Some performance options, and some caching.
Guaranteed you can double the speed of your website by using a profiler.

2. Security
Critical problem areas.

404 pages
Search page
PHP_SELF
$_GET, $_POST, $_COOKIE
$_SERVER
Lots of stupidity in IE (e.g. Always send a charset)

The web is broken you can all go home now.

People are vulnerable because people run older versions of browsers, and their data is not secure, and you can’t secure their data.

What can happen??
9 out of 10 of you have cross-site scripting hole on your site

Remote Greasemonkey
Profile Hacks
JS Trojans

Added a PHP logo to the MySQL User Website, it’s really the PHP website
IBM webpage, on article about security.

Tool to find holes, banks, insurance companies, CIA, even Yahoo where I work.

You know if they have been to bankofamerica.com, you can tell if they are a customer, you can tell if they are logged in, you can then see their cookie credentials.

You don’t know if any sites have these problems.

JS trojan, iframe that captures
reconfigures your wireless router, moves it outside your DMZ, then uses traditional techniques to attack your machine (that you thought was secure inside a firewall)

You should never ever click on a link. It sort of defeats the purpose of the web.

Never use the same browser instance to do personal stuff and browsing.

So what are we doing about this?
There isn’t much we (PHP) can do to secure sites developed.
Built a filter extension in 5.2, back in 5.1.

http://php.net/filter *** YOU MUST IMPLEMENT THIS
filter.default=special_chars

3. APIs are Cool!

Two lines to grab the Atom feed from Flickr of photos just uploaded.
That’s all I have to add to my code.

They really make you want to use the servers. It’s so easy.

API drives passion, drive people to use your site.
You can add a lot of cool things to your sites.

What to do

Avoid Participation Gimmicks
Get their Oxytocin flowing
Solve One Problem
Clean and Intuitive UI
API’s
Make it work

A full copy of the slides can be found at http://talks.php.net/show/mysql07key

MySQL Conference – Google

April 27, 2007 by ronald

MySQL: The Real Grid Database

Introduction

Can’t work on performance problems until we solve the availability
We want MySQL to fix our problems first.

The problem

Deploy a DBMS for a workload with
- too many queries
- too many transactions
- too much data

A well known solution

deploy a grid database

-use many replicas to scale read performance
-shard your data over many master to scale write performance
-sharding is easy, resharding is hard

availability and manageability trump performance

– make it easy to run many servers
– unbeatable aggregate performance

we describe problems that matter to us.

The grid database approach

Deploy a large number of small servers
use highly redundant commodity components
added capacity has a low incremental cost
not much capacity lost when a server fails
support many servers with a few DBAs

Manageability
Make it easy to do the tasks that must be done. Reduce, Reduce.
Make all tasks scriptable
Why does it matter? Support hundreds of servers, spend time solving more interesting problems. You generally have lots of problems to solve.

Underutilize your servers
Require less maintenance
Require less tuning
tolerate load spikes better
tolerate bad query plans better

In a Perfect World
Short running queries
users kill mistaken and runaway queries
accounts never use too many connections
query plans are good
new apps increase database workload by a small amount
only appropriate data is stored in the database

Reality

Long running transactions, create replication delays everywhere
servers with round robin DNS aliases make queries hard to find
applications create more connections when the database is slow
some storage engines use sampling to get query plan statistics
new applications create new database performance problems
applications use the database as long as rows are never deleted
many long running queries on replicas

Solutions

Improve your ability to respond because prevention is impossible
Need tools to make monitoring easier
determine what is happening across servers
determine what happened in the past

Mantra

Monitor everything you can, and archive as long as possible. (vmstat 5 secs, iostat, mysql error logs)
You will need these to reconstruct failures
save as much as possible
script as much as possible

Monitoring Matters

Display what is happening

-which table, account or statements caused most of the load
-many fast queries can be as much a problem as one slow query

Record what is happening

- archive show status counters somewhere
- query data from the archive
- visualise data from the archive

record queries that have been run

— archive show processlist output (do every 30 seconds)
— support queries on this archive

All of this must scale to an environment with many servers
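The point that many fast queries can hurt as much as one slow query is easy to show once queries are archived. Here is a rough Python sketch, using an input format I have assumed (pairs of SQL text and elapsed seconds): it strips literals so that similar statements group together, then ranks the groups by total time. The fingerprint rules are deliberately crude and are my own, not those of any Google tool.

```python
import re
from collections import defaultdict


def fingerprint(sql):
    """Collapse literals and whitespace so similar statements group together."""
    text = re.sub(r"'[^']*'", "?", sql)    # quoted strings
    text = re.sub(r"\b\d+\b", "?", text)   # standalone numbers
    return re.sub(r"\s+", " ", text).strip().lower()


def load_by_statement(entries):
    """Return (fingerprint, count, total_seconds) rows, heaviest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for sql, seconds in entries:
        bucket = totals[fingerprint(sql)]
        bucket[0] += 1
        bucket[1] += seconds
    rows = [(fp, count, total) for fp, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

With 1000 executions of a 10 ms lookup and a single 5 second report query, the lookup comes out on top: roughly 10 seconds in total against 5.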

Monitoring Tools

Display counters and rate change for counters
aggregate values over many servers
visualize and rank results
display results over time

Google mpgrep tools

New Commands
We changed mysql, three new commands
SHOW USER_STATISTICS
SHOW TABLE STATISTICS
SHOW INDEX STATISTICS

Per Account Activity
USER_STATISTICS
seconds executing commands
number of rows fetched and changed
total connections
number of select/updates/other/commits/rollback/binlog bytes written.

TABLE STATISTICS
number of rows fetched/changed

INDEX STATISTICS
display number of rows fetched per index
helps find indexes that are never used
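Finding unused indexes from such statistics is a simple set difference. The Python sketch below assumes two inputs of my own invention, since the talk did not show the output format: a list of every (table, index) pair in the schema, and a dict of rows fetched per (table, index) taken from the index statistics.

```python
def unused_indexes(all_indexes, index_stats):
    """Return the (table, index) pairs that never fetched a row."""
    used = {key for key, rows_fetched in index_stats.items() if rows_fetched > 0}
    return sorted(set(all_indexes) - used)
```

Only trust the result when the statistics cover a long enough period, because an index used solely by a monthly report would look unused after a week.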

available in code.google.com in 4.0, porting to 5.0

MySQL High Availability

Great options
- Cluster
- Replication
- Middleware, e.g. Continuent
- DRBD
We need some features right now
we are committed to innodb and mysql replication

*a lot of application code works on this
*our tools and processes support this

We favor commodity hardware

These are all great features but we are much more limited in what we can use.
Management want to know we don’t lose transactions, not that we lose some transactions.

Desired HA Functionality

Zero transaction loss on failures of a master
minimal downtime on failures of a master
reasonable cost in performance and dollars
fast and automatic failover to local or remote server
no changes to our programming model
- does it support MVCC
replication and reporting are concurrent on a slave

MVCC must have update concurrent with query.

Failures happen everywhere
OS – kernel oom or panic (older 2.4 32 bit systems)
mysqld – caused also by code we added
disk, misdirected write, corrupt write (love innodb checksums)
file system – inconsistent after unplanned hardware reboot (use ext2)
server – bad RAM
lan, switch – lose
Rack – reboot
Data center – power loss, overheating, lightning, fire
People – things get killed or rebooted by mistake ( a typo can take out the wrong server, when names differ by a character or a digit)

ext2 and 4.0 are great, they are the same generation.
Trying not to use RAID, not battery backed raid etc, we try work around with software solutions. We do use RAID 0, but we also try software solution.
When we have the right HA solution, we won’t need RAID.

Mark. “Yes, Google programmers have bugs. Not me personally, it was my predecessor.”

HA Features we want in MySQL
Synchronous replication as an option
a product that watches a master and initiates a failover
archives of the master binlogs stored elsewhere
state stored in the filesystem to be consistent after a crash
. innodb and mysql dictionaries can get out of sync
. replication state on a slave can get out of sync

We could not wait
Features we added to MySQL 4.0.26
We can do things a lot faster
. we have more developers lying around
. Our needs are specific, not a general product solution

Transactional replications for slaves
semi-synchronous replication
mirrored binlogs
fast and automated failover

Transactional Replication
Replication state on a slave is stored in files
slave sql thread commits to storage engines and then updates a file
a crash between the two can make replication state inconsistent
transactional replication
MySQL can solve this in the future by storing replication state in tables

Semi-synchronous replication
Block return from commit on a master until at least one slave has acknowledged receipt of
slave io thread acknowledges receipt after buffering the changes
modified mysql replication protocol to support acknowledgments
configuration options
whether the master uses it
whether a slave uses it
how long the master waits for an acknowledgement

can run a server with some semi-sync replication slaves and some regular replication slaves
this can be worked with any storage engines that supports commit, but we only use innodb

* This is how we guarantee to management for Zero Transaction Loss.

Latency single stream 1ms, multi-stream 10ms. This is acceptable for us.
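To make the semi-synchronous idea concrete, here is a toy Python simulation. It is only a sketch of the behaviour described in these notes, not Google's patch: the Master and Slave classes, the ack_timeout parameter and the delay values are all invented for illustration. The master's commit blocks until at least one slave has buffered the event and acknowledged it, or until the timeout passes.

```python
import queue
import threading
import time


class Slave:
    """Buffers events in a relay log and acknowledges receipt."""

    def __init__(self, delay=0.0):
        self.delay = delay  # simulated network latency in seconds
        self.relay_log = []

    def receive(self, event, acks):
        time.sleep(self.delay)
        self.relay_log.append(event)  # buffer first ...
        acks.put(self)                # ... then acknowledge, before applying


class Master:
    def __init__(self, slaves, ack_timeout):
        self.slaves = slaves
        self.ack_timeout = ack_timeout  # how long to wait for the first ack

    def commit(self, event):
        """Return True once at least one slave has acknowledged the event."""
        acks = queue.Queue()
        for slave in self.slaves:
            threading.Thread(
                target=slave.receive, args=(event, acks), daemon=True
            ).start()
        try:
            acks.get(timeout=self.ack_timeout)
            return True
        except queue.Empty:
            return False  # no slave confirmed in time
```

What the master should do when no acknowledgement arrives in time (keep waiting, fail the commit, or fall back to regular replication) is a policy decision that this sketch leaves to the caller.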

The MySQL Replication Protocol

The current replication protocol is efficient
a slave makes one request

Replication Acknowledgment

Mirrored Binlogs
MySQL does not provide a way to maintain a copy of a master’s binlog on a replica. By copy we mean a file of the same name and equivalent byte for byte.
Hierarchical replication works much better where a slave can disconnect from one replication proxy and reconnect to another without adjusting binlog offsets.
Hot backups taken before a failover are difficult to use after a failover

Mirrored Binlog Implementation
Slave IO threads write their own relay log and a copy of the bin log
all events but the rotate log event are written

After failover, start a new binlog on new master

Fast Failover

Slaves use a hostname, rather than an IP
You can’t enable the binlog dynamically (in 4.0)
Added new SQL statements that:
disconnect users with SUPER privilege
disable new connections
enable the bin log
enable connections from all users

Automatic failover
Something must decide that a master has failed
Something must choose the new master

Q: What keeps you from moving to 5.0?
A: Queries don’t parse (Joins)

Data sets, 8GB servers, 50-100GB’s

Quote – 26 April 2007

April 27, 2007 by ronald

“The web is broken you can all go home now.”

Rasmus Lerdorf – Father of PHP – MySQL Conference 2007

Quote – 25 April 2007

April 26, 2007 by ronald

“Don’t complain, do something about it”

Baron Schwartz – Creator of MySQL Toolkit — MySQL Conference 2007

MySQL Roadmap

April 26, 2007 by ronald

Here are some notes from the MySQL Server Roadmap session at the MySQL Conference 2007.

MySQL: Past and Future

2001: 3.23
2003: 4.0 UNION, Query Cache, Embedded
2004: 4.1 Subqueries
2005: 5.0 Stored Procedures, Triggers, Views
Now: 5.1.17 Partitioning, Events, Row-based replication
2007?: 6.0 Falcon, Performance, Conflict detection
2008?: 6.1 Online Backup, FK Constraints

2007 Timeline

Q1: 5.1 Beta, 5.1 Telco Production Ready, Monitoring Service 1.1, MySQL 6.0 Alpha, Community GA
Q2: MySQL 6.0 Beta, New Connectors GA
Q3: 5.1 RC, 6.0 Beta, MS 2.0, Enterprise Dashboard beta
Q4: 5.1 GA, 6.0 Beta

Where are we today?

We are by far the most popular open source database
The Enterprise world is moving online and MySQL is well-positioned for that trend, But:
- Transactional scalability
- Manageability
- Specific online features

MySQL Server Vision – The Future

Always Online: 24×7, online backup, online analytics, online schema changes
Dynamic Scale-out: online partitioning, add node, replication aides
Reliable: fault-tolerant, easy diagnosis, stable memory, ultimately self-healing
High-performance: interactive web, real-time response, apps, 10,000-100,000 clients
Ease of use: portable, best for development, multiple connectors, easy tuning
Modularity and Ubiquity: storage engines, plug-ins

How can you help?

Bug finding and fixing — Community Quality Contributor
Feature/patch contribution
But, to expedite your patch

The goal: “Be the Best Online Database for Modern Applications”

Quote – 25 April 2007

April 25, 2007 by ronald

“Whatever advice you got, keep it to yourself, you’re not the target market.”

Red Hat & One Laptop Per Child UI Designer to a bunch of suits – MySQL Conference 2007

MySQL Conference – For Oracle DBAs and Developers

April 25, 2007 by ronald

I have just completed my presentation at the MySQL Conference 2007 on MySQL for Oracle DBAs and Developers.

Not mentioned in my slides, but referenced during the presentation was what I consider the most important page to document from the MySQL Manual — 5.2.1. Option and Variable Reference

You can download a PDF copy of my presentation here.

MySQL Conference – Building a Vertical Search Engine in a Day

April 25, 2007 by ronald

Moving into the user sessions on the first day at MySQL Conference 2007, I attended Building a Vertical Search Engine in a Day.

Some of my notes for reference.

Web Crawling 101

Injection List – the seed URLs you are starting from
Fetching the pages
Parsing the content – words and links
Updating the crawl DB
Whitelist
Blacklist
Convergence — avoiding the honey pots
Index
Map-reduce — split a large problem into little pieces, process in parallel, then combine results
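
The map-reduce idea in the last bullet can be shown with a toy word count. This is a minimal sketch and not how Hadoop does it: the function names are made up for illustration, and a thread pool on one machine stands in for the nodes of a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_page(text: str) -> Counter:
    """Map step: count the words of a single page, independently of all others."""
    return Counter(text.lower().split())


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(pages) -> Counter:
    # Split: each page is its own little piece of the problem.
    # Process in parallel: worker threads stand in for cluster nodes here.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_page, pages))
    # Combine: fold the partial results into the final answer.
    return reduce(reduce_counts, partials, Counter())


pages = ["mysql replication", "mysql cluster", "nutch crawl mysql"]
totals = word_count(pages)
print(totals.most_common(1))
```

Because the map step never looks at another page, the pieces can run anywhere and in any order, which is what makes the approach scale out.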

Focused content == vertical crawl

20 Billion Pages out there, a lot of junk
Breadth-first would take years and cost millions of lives

OPIC + Term Vectors = Depth-first

OPIC is “On-line Page Importance Computation”. See: Fixing OPIC Scoring Paper
Pure OPIC means “Fetch well-linked pages first”
We modify it to “fetch pages about MySQL first”
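
Here is a rough sketch of how I understand the idea. In OPIC every page holds some "cash"; when a page is fetched its cash is shared among its outlinks, and the page with the most cash is fetched next. The focused variant below scales the shared cash by how similar the page's term vector is to the topic. The cosine weighting, the in-memory `web` dictionary and all names are my assumptions, not the presenters' implementation, and the cash-history bookkeeping of real OPIC is left out.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def focused_opic_order(web, seeds, topic, max_fetches):
    """Return the fetch order. `web` maps url -> (text, outlinks) and stands in for real fetching."""
    topic_vec = Counter(topic.lower().split())
    cash = {url: 1.0 for url in seeds}   # the injection list starts with equal cash
    fetched = []
    while len(fetched) < max_fetches:
        candidates = [u for u in cash if u not in fetched and u in web]
        if not candidates:
            break
        url = max(candidates, key=lambda u: cash[u])   # richest page first
        text, outlinks = web[url]
        fetched.append(url)
        # Pure OPIC would share cash[url] equally among outlinks; here the
        # amount passed on is scaled by the page's relevance to the topic.
        relevance = cosine(Counter(text.lower().split()), topic_vec)
        for link in outlinks:
            cash[link] = cash.get(link, 0.0) + cash[url] * relevance / len(outlinks)
        cash[url] = 0.0
    return fetched


web = {
    "a": ("mysql home", ["b", "c"]),
    "b": ("mysql replication mysql", ["d"]),
    "c": ("cooking recipes", ["e"]),
    "d": ("mysql cluster", []),
    "e": ("more recipes", []),
}
print(focused_opic_order(web, seeds=["a"], topic="mysql", max_fetches=4))
```

In this toy web the off-topic page passes no cash to its outlink, so the crawl budget is spent on the MySQL pages first.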

Nutch & Hadoop are the technologies used, running on a 4 server cluster. The sample started with www.mysql.com: in 23 loops, 150k pages were fetched and 2M URLs found.

Serving up the results

Generating the index
Setting up Nutch with Tomcat (can also run with Resin). See: Introduction to Nutch, Part 2: Searching; Running Nutch with Mac OSX
Single searcher vs. multiple searchers
Optimizing the user interface

MySQL Conference – RedHat Keynote – One Laptop Per Child

April 25, 2007 by ronald

Our third keynote at MySQL Conference 2007 was titled Building the Ultimate Database Container with RHEL, MySQL, and Virtualization by Michael Evans.

The presentation was on Red Hat & One Laptop Per Child. His initial quote was “Thinking Past Platforms: The Next Challenge for Linux”, by Doc Searls, 2007-04-16 http://www.linuxjournal.com/node/1000210

OLPC

A non-profit idea from Nicholas Negroponte.
Aim is to build & distribute inexpensive laptop systems to primary & secondary school students worldwide.
Sell to young children in developing countries.

In summary, as presented to Red Hat: “Non-profit, run by a professor, we make hardware and sell to governments.”

The overall dynamics have attracted a lot of interesting people in the world.

The ability and goal is to make the device together, bringing all H/W and S/W people together.

The people that get behind this project have the ethos — “I’m willing to jump into this to change the world.”

This is the first time for a new opportunity in the last 10 years.

The Sugar user interface is a completely new experience.

When the UI designer was presenting to a room of head executives: “Whatever advice you got, keep it to yourself, you’re not the target market.”

One key point — No backward compatibility needs.

More information at www.laptop.org and the Wikipedia reference. Some videos on YouTube: Inside One Laptop per Child: Episode one, and Slightly better demo of the OLPC User Interface.

MySQL Conference – The next keynote with Guy Kawasaki

April 25, 2007 by ronald

Without missing a beat at MySQL Conference 2007, we moved from Marten’s keynote to The Art of Innovation by Guy Kawasaki.

Extremely fun and entertaining. His 10 points.

1. Make Meaning

“To change the world”
To a VC, do not say “you want to make money”, that is understood. You will attract the wrong team.

2. Make Mantra

Not a Mission statement (50-60 words long), but 2 or 3 words.
- Wendy’s – “Healthy fast food”
- Nike – “Authentic Athletic Performance”
- FedEx – “Peace of Mind”
- eBay – “Democratize commerce”
Create a mantra — Why do you exist?

If you get stuck try the Dilbert mission statement generator.

3. Jump to the next curve

Not 50% or 100% better, but “Do things 10x better”

4. Roll the DICEE

“Create great stuff”
- Deep: Fanning (Reef) Sandal that opens beer bottles
- Intelligent: BF-104 Flashlight (Panasonic) (takes 3 sizes of batteries)
- Complete: Lexus
- Elegant: Nano (Apple)
- Emotive: Harley Davidson (They generate strong emotions)

5. Don’t worry, be crappy

Get it out there.

6. Polarize people

People love it or hate it.

7. Let a hundred flowers blossom

People that are not your target market are using it.
Take the money, ask the people why they are buying, ask what you can do better.

8. Churn baby, churn.

OK to ship stuff with crappy stuff in it, but it is important to continually revise and improve.

9. Niche thyself

With a nice graph.

Vertical — Ability to provide unique product or service
Horizontal – Value to customer

bottom right — Price
top left — Stupid
bottom left — Dotcom
top right — X. You need to be high and to the right.

Fandango — It’s either Fandango, or Clubbin.
Breitling Emergency – watch
Smart car – park perpendicular
LG Kimchi refrigerator

You need to be like the President of the United States – you need to be high and to the right. Got a great laugh from the crowd.

10. Follow the 10/20/30 rule

As an innovator, you need to pitch for what you want.

The optimum number of slides is 10.
Give the slides in 20 minutes.
Use a 30 point font.

11. Don’t let the bozos grind you down

A bonus to our friends in the community.

“I think there is a world market for five computers”
“This telephone has too many shortcomings to be seriously considered as a means of communication. The device is inherently of no value to us.” –Western Union 1876
“There is no reason why anyone would want a computer in their home.” — Digital Equipment Corp 1977
“It’s too far to drive, and I don’t see how it can be a business.” – Guy Kawasaki – Bozo (The company was Yahoo)

Guy commenting on his lost opportunity with Yahoo — “It only covers the first billion, it’s the second billion that pisses me off.”

Read more about Guy at his website Guy Kawasaki.

The Art of Innovation. If you want a copy of the slides, send an email to [email protected]