Watching a slave catchup

This neat one line command can be of interest when you are rebuilding a MySQL slave and replication is currently catching up.

$ watch --interval=1 --differences 'mysql -uuser -ppassword -e "SHOW SLAVE STATUS\G"'

You will see the standard SHOW SLAVE STATUS output, but the watch command presents an updated view every second, and highlights differences. This can be useful in a background window to keep an eye on those ‘Seconds Behind Master’.

*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.10.10.10
                Master_User: slave
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000626
        Read_Master_Log_Pos: 88159239
             Relay_Log_File: slave-relay.000005
              Relay_Log_Pos: 426677632
      Relay_Master_Log_File: mysql-bin.000621
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error:
               Skip_Counter: 0
        Exec_Master_Log_Pos: 426677495
            Relay_Log_Space: 2714497549
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: 24131

Some Drupal observations

I had the opportunity to review a client’s production Drupal installation recently. This is a new site and traffic is just starting to pick up. Drupal is a popular LAMP stack open source CMS system using the MySQL Database.

Unfortunately I don’t always have the chance to focus on one product when consulting, sometimes the time can be minutes to a few hours. Some observations from looking at Drupal.

Disk footprint

Presently, volume and content is of a low volume, but expecting to ramp up. I do however find 90% of disk volume in one table called ‘watchdog';


+--------------+--------------+--------------+-------------+--------+
| table_schema | total_mb     | data_mb      | index_mb    | tables |
+--------------+--------------+--------------+-------------+--------+
| xxxxx        | 812.95555878 | 745.34520721 | 67.61035156 |    191 |
+--------------+--------------+--------------+-------------+--------+

+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| table_name                                | engine | row_format | table_rows | avg_row_length | total_mb     | data_mb      | index_mb    |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| watchdog                                  | MyISAM | Dynamic    |      63058 |            210 | 636.42242813 | 607.72516251 | 28.69726563 |
| cache_menu                                | MyISAM | Dynamic    |        145 |         124892 |  25.33553696 |  25.32577133 |  0.00976563 |
| search_index                              | MyISAM | Dynamic    |     472087 |             36 |  23.40134048 |  16.30759048 |  7.09375000 |
| comments                                  | MyISAM | Dynamic    |      98272 |            208 |  21.83272934 |  19.58272934 |  2.25000000 |

Investigating the content of the ‘watchdog’ table shows detailed logging. Drilling down just on the key ‘type’ records shows the following.

mysql> select message,count(*) from watchdog where type='page not found' group by message order by 2 desc limit 10;
+--------------------------------------+----------+
| message                              | count(*) |
+--------------------------------------+----------+
| content/images/loadingAnimation.gif  |    17198 |
| see/images/loadingAnimation.gif      |     6659 |
| images/loadingAnimation.gif          |     6068 |
| node/images/loadingAnimation.gif     |     2774 |
| favicon.ico                          |     1772 |
| sites/all/modules/coppa/coppa.js     |      564 |
| users/images/loadingAnimation.gif    |      365 |
| syndicate/google-analytics.com/ga.js |      295 |
| content/img_pos_funny_lowsrc.gif     |      230 |
| content/google-analytics.com/ga.js   |      208 |
+--------------------------------------+----------+
10 rows in set (2.42 sec)

Some 25% of rows is just the reporting one missing file. Correcting this one file cuts down a pile of unnecessary logging.

Repeating Queries

Looking at just 1 random second of SQL logging shows 1200+ SELECT statements.
355 are SELECT changed FROM node

$ grep would_you_rather drupal.1second.log
              7 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              5 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              3 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1
              4 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              6 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
             10 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              8 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1

There is plenty of information regarding monitoring the Slow Queries in MySQL, but I have also promoted that’s it not the slow queries that ultimately slow a system down, but the 1000’s of repeating fast queries.

MySQL of course has the Query Cache to assist, but this is a course grade solution, and a high volume read/write environment this is meaningless.

There is a clear need for either a application level caching, or a database redesign to pull rather then poll this information, however without more in depth review of Drupal I can not make any judgment calls.

Best Practices in Migrating to MySQL

This week I was the invited speaker to give a 4 hr presentation to the Federal Government Sector in Washington DC on “Best Practices in Migrating to MySQL“. This was a followup to my day long “MySQL for the Oracle DBA Bootcamp” which I presented in Washington DC last year. It was good to see a number of attendees from my first DC presentation.

There was good attendance across various government departments and companies providing services to the government sector, as well a variety of job descriptions.

Thanks to Carahsoft and Sun/MySQL for organizing and sponsoring the event. Thanks also to Phil Hildebrand who provided fantastic support during my preparation answering all my SQL Server questions.

Thanks also to Baron Schwartz creator of Maatkit who as my invited guest was nice enough to table a list of attendee questions, which is always a good reference for revising slides and writing more blog posts.

You can find the first of seven sessions online in my presentations section.

Updated
Thanks to Baron Schwartz for his follow-up blog posts Migrating US Government applications from Oracle to MySQL and 50 things to know before migrating Oracle to MySQL.

Strict mode can still throw warnings

MySQL by default is vary lax with data validation. Silent conversions is a concept that is not a common practice in other databases. In MySQL, instead of throwing an error, a warning was thrown and many applications simply did not handle warnings. With the introduction of sql_mode=STRICT_ALL_TABLES (or TRADITIONAL), in MySQL 5, a better level of validation now exists.

My understanding was that Warnings are now thrown as Errors, therefore eliminating the need to do a SHOW WARNINGS to confirm any problems after every query (this is a performance overhead on a high volume system due to the round trip latency).

However I found an instance where MySQL in STRICT Mode still throws warnings, leading to the question, are there any other areas, and does the earlier statement “Warnings are now thrown as Errors” hold true.

Here is my seeding process to showing the problem.

mysql> create table i(i tinyint, unique key( i));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into i values(999);
Query OK, 1 rows affected (0.00 sec)

Using default settings, attempting to INSERT a duplicate row throws an error, using INSERT IGNORE does not.

mysql> insert into i values(999);
ERROR 1062 (23000): Duplicate entry '127' for key 'i'
mysql> insert ignore into i values(999);
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1264 | Out of range value for column 'i' at row 1 |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)

When using a Strict Mode, a recommendation for all new systems, it is generally accepted that warnings are translated into errors, which implies your could should never have to consider checking for warnings.

mysql> truncate table i;
mysql> set sql_mode=strict_all_tables;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into i values(999);
ERROR 1264 (22003): Out of range value for column 'i' at row 1
mysql> insert ignore into i values(999);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1264 | Out of range value for column 'i' at row 1 |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)
mysql> set sql_mode=traditional;
Query OK, 0 rows affected (0.00 sec)

mysql> insert ignore into i values(9990);
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1264 | Out of range value for column 'i' at row 1 |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)

I should caveat this post also with using caution with INSERT IGNORE. This should only be used if you never care about errors which I would never consider as a best practice design approach.

Reducing the MySQL 5.1.30 disk footprint

The current size of a MySQL 5.1.30 installation is around 420M.

$ du -sh .
426M	.

A further breakdown.

$ du -sh *
213M	bin
20K	COPYING
9.8M	docs
8.0K	EXCEPTIONS-CLIENT
436K	include
12K	INSTALL-BINARY
121M	lib
504K	man
4.0K	my.cnf
77M	mysql-test
4.0K	README
20K	scripts
2.3M	share
2.9M	sql-bench
100K	support-files

A means to reduce the footprint by 25% is to delete some unused stuff.

$ rm -rf docs/ mysql-test/ sql-bench/
$ du -sh .
337M	.

It’s no big deal, however it certainly does cut down on verbose output in the backup logs removing the mysql-test directory and files.

Improving your web site compatibility with browsers

Every website page content uses two basic elements, HyperText Markup Language (HTML) and Cascading Style Sheets (CSS). Each of these has various standards, HTML has versions such as 3.2, 4.0, 4.01, and the new XHTML 1.0,1.1, 2.0 along with various version flavors know as strict, transitional & frameset. CSS also has various versions including 1, 2 and 3.

Each browser renders your combined HTML & CSS differently. The look and feel can vary between FireFox, Safari, Chrome, Internet Explorer and the more less common browsers. Indeed each version of a product also renders different. With IE 8 just being released, it’s common versions now are 5.0, 5.5, 6.0 and 7.0. This product alone now has 5 versions that UI designers must test and verify.

To minimize presentation and rendering problems, adhering to the standards can only assist, and greatly benefit the majority of entrepreneurs, designers and developers that are not dedicated resources. There are two excellent online tools from the standards body, the World Wide Web Consortium (W3C) that an easily assist you.

You can also link directly to these sites, so it’s easier to validate your HTML and CSS directly from your relevant webpage. For example, HTML Validation and CSS.

It’s not always possible to meet the standards, and when you are not the full-time developer of your site, it can be time consuming if you don’t check early and regularly.

Brand identity with undesirable domain names

Choosing a domain name for your brand identity is the start. Protecting your domain name by registering for example .net, .org, and the many more extensions is one step in brand identity.

However a recent very unpleasant experience in New York, resulted in realizing some companies also register undesirable domain names. I was one of many unhappy people, mainly tourists as I was showing an Australian friend the sights of New York. We had chosen to use the City Sights NY bus line, but we caught with some 100+ people in a clear “screw the paying customers” experience.

I was really annoyed that my friend, only in New York for 2 days both had to experience this, and missed out on a night tour. I commented, I’m going to register citysightsnysucks.com, and share the full story of our experience, directing people to use Gray Line New York, which clearly by observation were providing the service we clearly did not get.

To my surprise, the domain name was already taken. To my utter surprise, the owner of the domain is the same as citysightsny.com. Did they do this by choice, or did another unhappy person (at least in 2006) register this, only to be perhaps threatened legally to give up the domain.

I would generally recommend in brand identity this approach, especially when select common misspellings and hyphenated versions if applicable can easily lead to a lot of domain names for your brand identity.

$ whois citysightsny.com

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: CITYSIGHTSNY.COM
   Registrar: INTERCOSMOS MEDIA GROUP, INC. D/B/A DIRECTNIC.COM
   Whois Server: whois.directnic.com
   Referral URL: http://www.directnic.com
   Name Server: NS0.DIRECTNIC.COM
   Name Server: NS1.DIRECTNIC.COM
   Status: clientDeleteProhibited
   Status: clientTransferProhibited
   Status: clientUpdateProhibited
   Updated Date: 31-dec-2006
   Creation Date: 28-nov-2004
   Expiration Date: 28-nov-2011

Registrant:
 CitySights New York LLC
 15 Second Ave
 Brooklyn, NY 11215
 US
 718-875-8200x103
Fax:718-875-7056


Domain Name: CITYSIGHTSNY.COM


$ whois citysightsnysucks.com

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: CITYSIGHTSNYSUCKS.COM
   Registrar: INTERCOSMOS MEDIA GROUP, INC. D/B/A DIRECTNIC.COM
   Whois Server: whois.directnic.com
   Referral URL: http://www.directnic.com
   Name Server: NS.PUSHONLINE.NET
   Name Server: NS2.PUSHONLINE.NET
   Status: clientDeleteProhibited
   Status: clientTransferProhibited
   Status: clientUpdateProhibited
   Updated Date: 26-jun-2007
   Creation Date: 11-aug-2006
   Expiration Date: 11-aug-

Registrant:
 CitySights New York LLC
 15 Second Ave
 Brooklyn, NY 11215
 US
 718-875-8200x103
Fax:718-875-7056


Domain Name: CITYSIGHTSNYSUCKS.COM

Technology changes, humans don't. – Web 2.0 NY Second keynote

I needed a rest from my opening keynote review NY Tech 1995-2008. Opening Web 2.0 Expo NY Keynote but a few siginificant points from The Death of the Grand Gesture by Deb Schultz.

  • An interesting site is Visual Complexity showing graphical representations of many social networks.
  • All the binary communication becomes white noise — Information Overload.
  • “Technology changes, humans don’t” – Deb Schultz

A 5.1 QEP nicety – Using join buffer

I was surprised to find yesterday when using MySQL 5.1.26-rc with a client I’m recommending 5.1 to, some information not seen in the EXPLAIN plan before while reviewing SQL Statements.

Using join buffer

+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key          | key_len | ref                    | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | lr    | ALL    | NULL          | NULL         | NULL    | NULL                   |  1084 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ca    | ref    | update_check  | update_check | 4       | XXXXXXXXXXXXXXXXX      |     4 | Using where; Using index                     |
|  1 | SIMPLE      | ce    | ALL    | NULL          | NULL         | NULL    | NULL                   | 13319 | Using where; Using join buffer               |
|  1 | SIMPLE      | co    | eq_ref | PRIMARY       | PRIMARY      | 4       | XXXXXXXXXXXXXXXXX      |     1 | Using where                                  |
+----+-------------+-------+--------+---------------+--------------+---------+------------------------+-------+----------------------------------------------+
4 rows in set (0.00 sec)
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.1.26-rc |
+-----------+
1 row in set (0.00 sec)

Sergey Petrunia of the MySQL Optimizer team writes about this in Use of join buffer is now visible in EXPLAIN.

VirtualBox, compiling Part 2

So I managed to find all dependencies after some trial and error for compiling VirtualBox 1.6.4 under Ubuntu 8.0.4, then finding the Linux build instructions to confirm.

It was not successful however in building, throwing the following error:

kBuild: Compiling dyngen - dyngen.c
kBuild: Linking dyngen
kmk[2]: Leaving directory `/usr/local/VirtualBox-1.6.4/src/recompiler'
kmk[2]: Entering directory `/usr/local/VirtualBox-1.6.4/src/apps'
kmk[2]: pass_bldprogs: No such file or directory
kmk[2]: *** No rule to make target `pass_bldprogs'. Stop.
kmk[2]: Leaving directory `/usr/local/VirtualBox-1.6.4/src/apps'
kmk[1]: *** [pass_bldprogs_before] Error 2
kmk[1]: Leaving directory `/usr/local/Virtu

More searching, I needed to add two more files manually. Read More Here.

A long wait, compiling for 20+ minutes, and a necessary reboot as upgraded images threw another error, I got 1.6.4 running, and able to boot Fedora Core 9 image created under 1.5.6

But the real test, and the need for this version was to install Intrepid.

This also failed with a Kernel panic during boot. More info to see this reported as a Ubuntu Bug and Virtual Box Bug.

More work still needed.

Virtual Box, a world of hurt

I successfully installed Virtual box via a few simply apt-get commands under Ubuntu 8.04 via these instructions.

It started fine, after two small annoying, install this module, add this group messages. I was even able to install Ubuntu Intrepid from .iso. But from here it was down hill.

Attempting to start VM gives the error.

This kernel requires the following features not present on the CPU:
pae
Unable to boot - please use a kernel appropriate for the CPU

Some digging around, and confirmation that the current packaged version of Virtual Box doesn’t support PAE. You think they could tell you before successfully installing an OS. I’m running 1.5.6, I need 1.6.x

$ dpkg -l | grep virtualbox
ii  virtualbox-ose                             1.5.6-dfsg-6ubuntu1                      x86 virtualization solution - binaries
ii  virtualbox-ose-modules-2.6.24-19-generic   24.0.4                                   virtualbox-ose module for linux-image-2.6.24
ii  virtualbox-ose-source                      1.5.6-dfsg-6ubuntu1                      x86 virtualization solution - kernel module

Off to the Virtual Box Downloads to get 1.6.4
Don’t make the same mistake as I did and use the first download link, that’s the commercial version that doesn’t install what you expect, you need the OSE. Of course this is not packaged, it’s only source.

  ./configure
Checking for environment: Determined build machine: linux.x86, target machine: linux.x86, OK.
Checking for kBuild: found, OK.
Checking for gcc: found version 4.2.3, OK.
Checking for as86:
  ** as86 (variable AS86) not found!

Ok, well I go through this step like 4 times, installing one package at a time, I wish they could do a pre-check and give you all missing requirements. I installed bin86, bcc, iasl.

Then I got to the following error.

$ ./configure
...
Checking for libxml2:
  ** not found!

Well it’s installed, all too hard. Throw Virtual Box away for virtualization software. And why am I using it anyway. Because VMWare Server doesn’t work under Ubuntu 8.04 either because of some ancient gcc dependency. Sees I may have to go back to that. I just want a working virtualization people on the most popular Linux distro to install other current distros. It’s not a difficult request.

$ dpkg -l | grep libxml
ii  libxml-parser-perl                         2.34-4.3                                 Perl module for parsing XML files
ii  libxml-twig-perl                           1:3.32-1                                 Perl module for processing huge XML document
ii  libxml2                                    2.6.31.dfsg-2ubuntu1                     GNOME XML library
ii  libxml2-utils                              2.6.31.dfsg-2ubuntu1                     XML utilities
ii  python-libxml2                             2.6.31.dfsg-2ubuntu1                     Python bindings for the GNOME XML library

Choosing MySQL 5.1 over 5.0

I have been asked twice this week what version of MySQL I would choose for a new project.
As with most questions in life the answer is: It Depends?

In general I would now recommend for a new project to select 5.1, and he is why.

  1. If it’s a new project and your not managing existing applications with older versions then 5.1 is slated for General Availability (GA) at some imminent time. Having been at Release Candidate (RC) for quite some time (almost 1 year), many people, both internally and in the community are just waiting for Sun/MySQL to get this version out.
  2. MySQL 5.0 is in maintenance mode, it’s now 3 years old. MySQL is placing (I’m assuming) resourcing energies to current and future releases.
  3. If your looking at releasing a product in the next 3 months for example, you do not want to consider the testing and deployment of a new version (e.g. 5.1) in the next 6-9 months.
  4. Unless your comparing specific performance between 5.1 and 5.0 in your edge cases, for a new project start with 5.1 you should be testing and confirming performance and reliability here. The worse case is you can test in 5.0 of any specific problem.
  5. 5.1 gives you new features of course, partitioning may be of benefit but don’t assume it’s going to be a great improvement unless you applications SQL naturally tended to the MySQL partitioning strengths.
  6. The single biggest benefit is the Pluggable Storage Engine Architecture. This can give you some benefits, and in the case of transaction storage engines that are production ready, Innodb now has a pluggable version, much improved on the MySQL supplied version. There are a long list of other engines under development with relative strengths and weakness, however be wary of versions that require customized builds of MySQL.

There are some concerns where I don’t have answers? For example, if you have MySQL Support , is 5.1 supported? I know a common answer to problems in pre 5.0 versions is, have you tried upgrading to 5.1

Why is not released? This is good question, the answer is obviously a level of quality, however it is generally discussed that 5.1 is of better quality in existing features then 5.0. It is 5.1 specific features you need to be careful of. It’s important that you do read carefully the 5.1 Release Notes to see where bug fixes or compatibility changes are still occuring.

As with any choice in the Open Source world, some level of risk assessment is necessary. If you have good metrics and measurement in place for your system, and you adequately test your software, there is no reason not to now consider 5.1 as a viable alternative for new development.

As I post this I note, I see the yet unreleased 5.1.28 list of bugs still shows issues of concern.

Project Darkstar

It may sound like either a astronomical research project or a Star Wars spin- off, but Project Darkstar is an open source infrastructure from Sun Microsystems that states “simplify the development and operation of massively scalable online games, virtual worlds, and social networking applications.”

The advertising sounds promising like many sites, the emphasis seems to be on gaming throughout the material, interesting they threw in the term “social networking applications” specifically in opening descriptions.

I believe worthy of investigation, if only to see how that solve some classic problems. So, Learn some more, Start your rockets and Participate.

Finding, exposing and referencing good material

I came across www.problogger.net by accident. Like many sites and information these days, you simply don’t find via search engines because your normally searching for something specific. I did find it via several levels of hyperlinks. I really wish there was firefox plugin that would track every site you visited, often I’d like to plot how I got to where I am, but that’s another story.

Considering the author is Australian, a top Plurker and Photographer got me intrigued enough to delve for a few moments. What I found is some good information, such as 10 Ways to Optimize a Popular Post on Your Blog and Is Your Blog a Networking Tool?.

Some more reading, Five Ways That Strategic Bullet Points Make You a Stronger Blogger, which leads to a site that includes articles such as Seal the Deal: 10 Tips for Writing the Ultimate Landing Page, which is exactly what I’m looking for with an upcoming Ad Words campaign, but wasn’t searching at the time.

You never know sometimes where good information comes from.

Ultra light startups NY meeting

I attended the Ultra light startup’s meeting last night for the first time. I found it most productive for the 2 hours of time to see a different approach talking about startups, to see a variety of approaches, concepts, ideas, ventures all at various stages and generally people with different and interesting ideas and goal.

The start included a 1-2 min elevator pitch by every person with a few questions of feedback. Some interesting projects included, Rose Tech Ventures Incubator, Home Shop Technologies, New York City Co-working, Wiki Streets, Robots for Planet Earth, Festival Travel Channel, Sunshine Suites, Peek You and Wiki Pages. Two presenters put forth their ideas,concepts, and intentions with domains registered within the last 2 days.

The main discussion was on Co-Working, a concept I’ve not heard of before. It’s a different approach to the Telecommuting approach, companies moving from attendance based to performance based. Another term mentioned as ROW – Results Only Work environment.

I think some improvements in the “elevator pitch” would be.

  • egg timer for 90 seconds
  • Recommend people have 3 slides, and example layout would be “How I am”, “What we do”, and “What we want”
  • Some tips of not what to put on slides, like for example, more then 5 lines and less then 30 point for example.

References
Facebook Group
Subscribe to mailing list

Using consistent data types for columns

I came across this error recently when trying to modify the data type of a column.

ERROR 1025 (HY000): Error on rename of './sakila/#sql-1d91_5' to './sakila/inventory' (errno: 150)

Not the first time, and not the last time. A common problem with InnoDB tables, is the lack of information, you need to dig deeper with the following command (and appropriate security a well organized security profile will NOT have).

mysql> SHOW ENGINE INNODB STATUS;

...

------------------------
LATEST FOREIGN KEY ERROR
------------------------
080717 20:00:28 Error in foreign key constraint of table sakila/inventory:
there is no index in the table which would contain
the columns as the first columns, or the data types in the
table do not match the ones in the referenced table
or one of the ON ... SET NULL columns is declared NOT NULL. Constraint:
,
  CONSTRAINT "fk_inventory_film" FOREIGN KEY ("film_id") REFERENCES "film" ("film_id") ON UPDATE CASCADE

...

You also need to dig though the output of the command to find this, on a larger system this can be quite a lot of information just to find the details of the error. (It would be nice if there was an easier way.)

The result of this error is the columns in a foreign key relationship need to be of the same data type. This is actually a good thing, and MySQL generally operates much better in joins when the join columns are consistent.

On the assumption that you use surrogate primary keys for tables, this is a candidate for a naming standard for primary keys should be the primary key name is unique column name within your schema.

For example, if you call all your primary key’s ‘id’, your foreign key’s are normally ‘table_id’. While this is a common approach that I promoted myself for many years, it’s easy to read and consistent, actually naming every primary key uniquely provides two great benefits.

First, you can easily identify relationships in your entire schema without even knowing about the schema in detail. Second, you can leverage the benefit of the INFORMATION_SCHEMA to in the case of this post, confirm the data types are consistent for all matching columns, even when Referential Integrity is not used.

So, instead of using ‘id’, use ‘actor_id’, ‘film_id’, ‘user_id’ etc. For any self join keys, I normally prefix with parent, so ‘parent_user_id’ for example if you have a hierarchy within a table.

MySQL involvement in OSCON opening keynote

Before I get to post my OSCON reflection I see I didn’t post this (which I reference).

At OSCON opening keynotes Tim O’Reilly Interviews Monty Widenius & Brian Aker. This provided some interesting answers in a Q & A session. Here is some of the discussion.

TO: So 6 months in. How is it with Sun?
BA: Really rewarding environment. My first question was? You are going to send me free H/W. No H/W has been delivered yet, or access to the masses, still hoping. Sun is a very engineering driven company.
MW. Thanks God we didn’t go public. Starting to do closed sourced components, going public this would have continued.

TO: Sun saved MySQL from public market/ insulated from market.
MW: 6 months in, Sun still trying to figure out what they bought. Sun has made a commitment to open source throughout the organization. Engineers who have been working in closed environments, now seeing this all in public, and opens yourself up to more inputs and exposure.

TO: You have your own projects within sun, how does that affect with the main line of development of MySQL, Monty you with the Maria Storage Engine, Brian you with Drizzle.

TO: What is the Support like in Sun?
BA: My boss got it. We are looking at going after different market area, niche and ecosystem. There is certain direction the main codebase is heading such as enterprise features, oracle like replacements. There is a core set of environments what they aren’t needed. Additional new requirements like a proximity data storage, historically Postgres has been good for this type of GIS data. This is a new type of data store. location/time and proximity of objects.
Sun has given us more free hands to work for best features of MySQL. For Drizzle, to strip it down into more components architecture and extensibility. It’s a micro-kernel there will be an interface for large parts of the code.

TO: What do you think about Google?
BA: Happy opening up more of their data, and trying to turn the world into their own 20% of time project.

TO: What do you think about Amazon?
BA: Interesting position, secretive company. At the beginning how little anybody though of Amazon in a service marketplace.

TO : What do you think about Microsoft?
MW: less and less things are good.
BA: Irrelevant.

TO: What do you think about Apple?
NW: More afraid of Apple then Microsoft
BA: Really want an iPhone, but hoping Google will get Android out and it works.

TO: What are the cool things MySQL can do on the Sun field, and reverse?
BA: Both Sun and MySQL Engineers thought about open source differently. MySQL has created a set of steps of evolution, e.g. employees contributing to open source projects. MySQL’s DNA was very small, it’s interesting how fast this is influencing Sun’s approach
MW: MySQL has become to management driven in previous years, Sun has enabled us to get back to our roots.

Where the happening community people now hang

Eric of Proven Scaling commented on a lack of IRC action in the normal mysql channels today when he visited the #drizzle channel on irc.freenode.net.

ebergen: I'm still in #mysql-dev and #planet.mysql but they are hardly active these days [1:51pm]
rbradfor: ebergen: funny, #drizzle is where the action is. [1:51pm]

There is active movement on the Drizzle project. Why is this? Well, I think most importantly is that there is active contribution from the community, at least 5 different companies and more individuals are pushing code to Drizzle, and it’s being accepted and incorporated. Something you can not say about the MySQL Community branch.

As I write this, there are 35 active people on the #drizzle channel now, and 137 members of the Drizzle Discuss list.

My contribution is as Monty put’s it, “Your the build team”. I am managing the Build Master for Drizzle and my company 42SQL is providing the hosting and support. I’ve even managed to push my first small code changes to the project using the very simple Contributing Code instructions. No fuss, no pain, and I don’t care if it doesn’t get included, but it’s available for all to see and use.

In 2 days we now have 15 build slaves covering Ubuntu 8.04 32 & 64bit, Debian 32 & 64 bit, CentOS 5 64 bit, Gentoo 32 & 64 bit, and Mac OS/X 10.5, with definitely some color at times in the waterfall display.

Jay has a good article on Drizzle Buildbot Now Accepting BuildSlaves.

We need your help! There are plenty of Linux/Unix OS’s out there, and we want to know Drizzle can be compiled as broadly as possible. Most of the contributors of build slaves to date are not names I know well in the MySQL community, which is excellent. What I’d like to see is more names I do know.

It’s easy, just check out Instructions for setting up a BuildSlave for Drizzle.

Installing Buildbot

BuildBot is a system to automate the compile/test cycle required by most software projects to validate code changes.

Here is my environment.

$ uname -a
Linux app.example.com 2.6.18-53.el5 #1 SMP Mon Nov 12 02:14:55 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
$ python
Python 2.4.3 (#1, May 24 2008, 13:57:05)

Here is what I did to get it installed successfully.

CentOS

$ yum install python-devel
$ yum install zope

Ubuntu

$ apt-get install python-dev
$ apt-get install python-zopeinterface
$ cd /tmp
# installation of Twisted
$ wget http://tmrc.mit.edu/mirror/twisted/Twisted/8.1/Twisted-8.1.0.tar.bz2
$ bunzip2 Twisted-8.1.0.tar.bz2
$ tar xvf Twisted-8.1.0.tar
$ cd Twisted-8.1.0
$ sudo python setup.py install
# installation of BuildBot
$ cd /tmp
$ wget http://downloads.sourceforge.net/buildbot/buildbot-0.7.8.tar.gz
$ tar xvfz buildbot-0.7.8.tar.gz
$ cd buildbot-0.7.8
$ sudo python setup.py install


And a confirmation.
$ buildbot --version
Buildbot version: 0.7.8
Twisted version: 8.1.0

You will notice a few dependencies. I found these out from the following errors.

Error causing needing python-devel

$ python setup.py install
....
gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtun
e=generic -D_GNU_SOURCE -fPIC -fPIC -I/usr/include/python2.4 -c conftest.c -o conftest.o
building 'twisted.runner.portmap' extension
creating build/temp.linux-x86_64-2.4
creating build/temp.linux-x86_64-2.4/twisted
creating build/temp.linux-x86_64-2.4/twisted/runner
gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtun
e=generic -D_GNU_SOURCE -fPIC -fPIC -I/usr/include/python2.4 -c twisted/runner/portmap.c -o build/temp.linux-x86_64-2.4/twisted/runner/portmap.o
twisted/runner/portmap.c:10:20: error: Python.h: No such file or directory
twisted/runner/portmap.c:14: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
twisted/runner/portmap.c:31: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
twisted/runner/portmap.c:45: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘PortmapMethods’

Error causing zope to be installed

$ buildbot start /home/buildbot/master/
Traceback (most recent call last):
  File "/usr/bin/buildbot", line 4, in ?
    runner.run()
  File "/usr/lib/python2.4/site-packages/buildbot/scripts/runner.py", line 939, in run
    start(so)
  File "/usr/lib/python2.4/site-packages/buildbot/scripts/startup.py", line 85, in start
    rc = Follower().follow()
  File "/usr/lib/python2.4/site-packages/buildbot/scripts/startup.py", line 6, in follow
    from twisted.internet import reactor
  File "/usr/lib64/python2.4/site-packages/twisted/internet/reactor.py", line 11, in ?
    from twisted.internet import selectreactor
  File "/usr/lib64/python2.4/site-packages/twisted/internet/selectreactor.py", line 17, in ?
    from zope.interface import implements
ImportError: No module named zope.interface

Installation was the easy part, configuration a little more complex.

MySQL Proxy lua scripts from presentation

The following Lua scripts are the examples are from my MySQL Proxy @ OSCON 08 presentation.

analyze_query.lua

MySQL Proxy Analyze Query.

Requires MySQL Proxy Logging Module.

What is released is the Version for MySQL 5.0. A generic version for all MySQL versions is not yet released.

histogram.lua

This script is part of the standard MySQL Proxy examples.

Other Scripts

Additional Lua scripts from MySQL forge are available here.

MySQL Proxy @ OSCON 08

Today I presented with Giuseppe Maxia of Sun Microsystems Inc at OSCON 08 on “MySQL Proxy: From Architecture to Implementation”. I was surprised to find that MySQL has a strong showing with a number of presentations this week.

Our talk covered the basics of MySQL Proxy, what’s coming in future features, and a number of examples of how I have used Proxy in consulting engagements to improve the information retrieval particularly for identifying performance problems.

Download Presentations Slides

The lighter side at O'Reilly OSCON 08

Between the keynotes, general sessions, BoFs that are plenty of events at OSCON 08.

Last night in the BoF time, a selected few enjoyed the relaxed music mode of the Good Company Soul and Blues Review. You can find my photos of what I’ve deemed the O’Reilly Band here.

Lacking the offers of free beer and activities of the larger events, may missed out on knowing about a great time, it was most enjoyable and relaxing. I think they should warm up for the keynotes in future, but then who would run all the AV!

The lights were rather bright, so grayscale actually helped out, so I’ve got a mixture of color and black and white.

larger versions.

The fast paced open source ecosystem

This morning at OSCON 08, Tim O’Reilly’s opening keynote Open Source on the O’Reilly Radar included a slide on Drizzle, giving this new project maximum exposure to the Open Source community.

Drizzle was only officially announced yesterday in Drizzle, Clouds, “What If?” by primary architect Brian Aker. Things move fast. There has been a number of comments from people yesterday including Mark Attwood, Monty Widenus,Monty Taylor,Ronald Bradford, Arjen Lentz, Lewis Cunningham, Jeremy Cole, 451 Group,Matt Asay, Assaf Arkin, SlashDot, Builder.au and MySQL HA.

The Drizzle Launchpad project has reached 5th on a Google Search.

Unfortunately, not all uptake and feedback was positive. The official Wikipedia page for Drizzle was marked for speedy deletion almost instantly, and within a few hours permanently deleted.

The new kid on the block – Drizzle

Before today, Drizzle was known as a light form of rain found in Seattle (among other places). Not any more. If you have not read the news already today, Drizzle, Clouds, “What If?” is the new kid on the RDBMS bock.

Faster, leaner and designed with the original goals of ease-of-use, reliability and performance, Drizzle will make an impact in those organizations that are seeking a viable database storage solution for large scalable applications. The key to Drizzle is several fold. First, the crud has been removed. The first part of Drizzle development is to remove bloat or non functioning software from the MySQL tree. In fact if you monitor the commits, it reads like, this has been removed, these files have been deleted, this code has been refactored, this new library has been introduced. Design decisions that have limited MySQL’s development for years are being simply cast aside.

The current landscape has become more complicated in 2008. You have the official MySQL releases, 5.0 is becoming ancient (being released almost 3 years ago), 5.1 is now clearly a lame duck, no release date for past few years, (the internal joke was 5.1 will be released in Q2, but the year is unspecified). 6.0 is in identity crisis with beta parts in alpha. These versions are moving so slowly, they are moving towards extinction like the dinosaurs. Monty Widenius is working solidly on Maria (and unofficial MySQL 5.1 branch), probably more stable and possibly released before 5.1. MySQL cluster has gone it’s own way, it was shackled by the 5.1 legacy and simply couldn’t wait for a GA product.

Jim Starkey (creator of Falcon, the new 6.0 storage engine) now is working in the clouds with Nimbus DB. Dorsal Source is lying dormant, Proven Scaling has it’s enterprise binaries and now Percona has it’s own patched ports. You have strong patches from Google and eBay that have zero hope of every being introduced into the official MySQL releases, probably until 7.x (5.1 and 6.0 have been frozen for a long time). Innodb from Oracle invested heavily in new features in a 5.1 plugin, announced at the MySQL Conference, broken in 1 day by MySQL releasing a new RC version making it in-compatible. Kickfire and Infobright have their own hacked versions, and Nitro DB I suspect have just given up waiting (now like 2 years).

With Sun’s acquisition now at T+6 months, cash and resources doesn’t appear to have helped with the official product. The single greatest movement in this period is that MySQL is now hosted under Launchpad, enabling anybody to access the source code, and even create branches like Jim Winstead reported. However I doubt you will see this helping code getting in the mainline product, but at least it will be more visible. This was an initiative long before the Sun acquisition, and indeed is against Sun policy of using Mercurial.

So why is Drizzle going to be any different or better?

You start with a committed list of contributors from already 6-7 different organizations. The clear goals of simplification, to make it faster and scale better on multi-core servers echo the work being done. You have developers who work in real world situations, not just coders for many years without experiencing operational use, and you have zero sales and marketing getting in the way. Removal of incomplete or stagnant functionality is key for the alpha version and includes stored procedures, triggers, prepared statements, query cache, extra data types, full-text, timezones etc is just the start.

Being small and nimble will enable Drizzle to develop and release code in much shorter iterations. You will see new developments allowing far greater plugin support via the new modularity approach and far better coding standards, making expert knowledge of how MySQL internals work a lesser requirement to contribute.

Will it fizzle, will it dazzle? Drizzle has the potential to be a stellar product. I’m a supporter and I hope to contribute in some small way.

References


About the Author

Ronald Bradford provides Consulting and Advisory Services in Data Architecture, Performance and Scalability for MySQL Solutions. An IT industry professional for two decades with extensive database experience in MySQL, Oracle and Ingres his expertise covers data architecture, software development, migration, performance analysis and production system implementations. His knowledge from 10 years of consulting across many industry sectors, technologies and countries has provided unique insight into being able to provide solutions to problems. For more information Contact Ronald.

An East Coast option

Within the present MySQL ecosystem, there are limited options for dedicated MySQL Consulting in the US. Outside of the official Sun/MySQL Consulting, Percona and Proven Scaling both based in Silicon valley are the only options generally known and accepted by the MySQL Community.

There is now an east coast option based in New York, and that is Ronald Bradford. Providing expert MySQL Consulting in Architecture, Performance, Scalability, Migration and Knowledge Transfer.

With two decades working in the IT industry, Ronald is well qualified in MySQL having previously provided consulting services for MySQL Inc combining 9 years experience with the product. His consulting experience is not limited to MySQL, having also worked extensively with Oracle, and previously with Ingres. More details of this experience is available at Linked In

This week you will find him on the west coast. If your at OSCON 2008, then please track me down. You can use my Contact Form, email [me] at [this domain], ping me on Twitter, track me on irc://irc.freenode.net (~arabxptyltd) or drop in to my OSCON session at 2:35pm Thursday.

Your data and the cloud

I will be speaking on July 29th in New York at an Entrepreneurs Forum on A Free Panel on Cloud Computing. With a number of experts including Hank Williams of KloudShare, Mike Nolet of AppNexus, and Hans Zaunere of New York PHP fame is should be a great event.

The focus of my presentation will be on “Extending existing applications to leverage the cloud” where I will be discussing both the advantages of the cloud, and the complexities and issues that you will encounter such as data management, data consistency, loss of control, security and latency for example.

Using traditional MySQL based applications I’ll be providing an approach that can lead to your application gaining greater power of cloud computing.


About the Author

Ronald Bradford provides Consulting and Advisory Services in Data Architecture, Performance and Scalability for MySQL Solutions. An IT industry professional for two decades with extensive database experience in MySQL, Oracle and Ingres his expertise covers data architecture, software development, migration, performance analysis and production system implementations. His knowledge from 10 years of consulting across many industry sectors, technologies and countries has provided unique insight into being able to provide solutions to problems. For more information Contact Ronald.

When (n) counts?

I have seen on many engagements the column data type is defined as INT(1).

People have the misconception that this numeric integer data type is of the length of one digit, or one byte. (One digit is 0-9 an one byte is 0-255)

This is incorrect.

Integer

For integer numeric data types in MySQL, that is TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT the (n) has no bearing on the size of data stored within the specific data type. The (n) is simply for display formatting.

In the MySQL Manual 10.2. Numeric Types you read This optional display width is used to display integer values having a width less than the width specified for the column by left-padding them with spaces. The display width does not constrain the range of values that can be stored in the column, nor the number of digits that are displayed for values having a width exceeding that specified for the column.

The following example shows the (n) in this case 3 has no effect on the size of data stored.

DROP TABLE IF EXISTS numeric_int;
CREATE TABLE numeric_int(i INT(3) NOT NULL);
INSERT INTO numeric_int VALUES (1),(22),(333),(444),(55555);
SELECT * FROM numeric_intG
i: 1
i: 22
i: 333
i: 444
i: 55555

Floating Point

When it comes to floating point precision of FLOAT and DOUBLE, the syntax of (m,n) has a different inteperation. The manual states A precision from 0 to 23 results in a four-byte single-precision FLOAT column. A precision from 24 to 53 results in an eight-byte double-precision DOUBLE column.
I will discuss this some more in a different post with some interesting findings.

And MySQL allows a non-standard syntax: FLOAT(M,D) or REAL(M,D) or DOUBLE PRECISION(M,D). Here, “(M,D)” means than values can be stored with up to M digits in total, of which D digits may be after the decimal point. For example, a column defined as FLOAT(7,4) will look like -999.9999 when displayed. MySQL performs rounding when storing values, so if you insert 999.00009 into a FLOAT(7,4) column, the approximate result is 999.0001.

So in the case of FLOAT,DOUBLE the (n) does both affect storage and presentation where it rounds the number as confirmed by the following test. Look a the last 2 rows for the rounding confirmation.

DROP TABLE IF EXISTS numeric_float;
CREATE TABLE numeric_float(f1 FLOAT(10,5)  NOT NULL);
INSERT INTO numeric_float values (1),(2.0),(3.12345),(4.123451),(5.123456);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0
SELECT * FROM numeric_floatG
f1: 1.00000
f1: 2.00000
f1: 3.12345
f1: 4.12345
f1: 5.12346
5 rows in set (0.01 sec)

Fixed Precision

The DECIMAL data type (NUMBER is a synonym) stores numbers to a fixed number of precision. From the manual again When declaring a DECIMAL or NUMERIC column, the precision and scale can be (and usually is) specified; for example: salary DECIMAL(5,2)
In this example, 5 is the precision and 2 is the scale. The precision represents the number of significant digits that are stored for values, and the scale represents the number of digits that can be stored following the decimal point. If the scale is 0, DECIMAL and NUMERIC values contain no decimal point or fractional part.

So in our test:

DROP TABLE IF EXISTS numeric_decimal;
CREATE TABLE numeric_decimal(f1 DECIMAL(10,5)  NOT NULL);
INSERT INTO numeric_decimal values (1),(2.0),(3.12345),(4.123451),(5.123456);
Query OK, 5 rows affected, 2 warnings (0.00 sec)
SELECT * FROM numeric_decimalG
f1: 1.00000
f1: 2.00000
f1: 3.12345
f1: 4.12345
f1: 5.12346

What is also interesting is that with a FLOAT, the rounding of a number greater then (n), produces no warnings, yet when using DECIMAL you will see warnings. These are:

INSERT INTO numeric_decimal values (1),(2.0),(3.12345),(4.123451),(5.123456);
Query OK, 5 rows affected, 2 warnings (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 2

mysql> show warnings;
+-------+------+-----------------------------------------+
| Level | Code | Message                                 |
+-------+------+-----------------------------------------+
| Note  | 1265 | Data truncated for column 'f1' at row 4 |
| Note  | 1265 | Data truncated for column 'f1' at row 5 |
+-------+------+-----------------------------------------+
2 rows in set (0.00 sec)

What is also interesting is that the manual states the following When such a column is assigned a value with more digits following the decimal point than are allowed by the specified scale, the value is converted to that scale. (The precise behavior is operating system-specific, but generally the effect is truncation to the allowable number of digits.)

The number is generally truncated, buy differs per OS. In the case on Mac O/S and Linux it is rounded. The two test environments in this case where:

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| protocol_version        | 10                           |
| version                 | 5.1.23-rc                    |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | i686                         |
| version_compile_os      | apple-darwin9.0.0b5          |
+-------------------------+------------------------------+
5 rows in set (0.01 sec)

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| protocol_version        | 10                           |
| version                 | 5.1.24-rc                    |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | i686                         |
| version_compile_os      | redhat-linux-gnu             |
+-------------------------+------------------------------+
5 rows in set (0.41 sec)

Conclusion

So just to conclude, (n) for Integer types is for display formatting only, (m,n) for floating point will round the number at n places, while in fixed point (m,n) n will round or truncate the number.


About the Author

Ronald Bradford provides Consulting and Advisory Services in Data Architecture, Performance and Scalability for MySQL Solutions. An IT industry professional for two decades with extensive database experience in MySQL, Oracle and Ingres his expertise covers data architecture, software development, migration, performance analysis and production system implementations. His knowledge from 10 years of consulting across many industry sectors, technologies and countries has provided unique insight into being able to provide solutions to problems. For more information Contact Ronald.

References