Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Archive for the ‘Web Development’ Category

Why is my database slow?

Tuesday, May 11th, 2010

Not part of my Don’t Assume series, but when a client states “Why is my database slow”", you need to determine if indeed the database is slow.

Some simple tools come to the rescue here, one is Firebug. If a web page takes 5 seconds to load, but the .htm file takes 400ms, and the 100+ assets being downloaded from one base url, then is the database actually slow? Tuning the database will only improve the 400ms portion of 5,000ms download.

There some very simple tips here. MySQL is my domain expertise and I will not profess to improving the entire stack however perception is everything to a user and you can often do a lot. Some simple points include:

  • Know about blocking assets in your <head> element, e.g. .js files.
  • Streamline .js, .css and images to what’s needed. .e.g. download a 100k image only to resize to a thumbnail via style elements.
  • Sprites. Like many efficient but simple SQL statements, network overhead is your greatest expense.
  • Splitting images to a different domain.
  • Splitting images to multiple domains (e.g. 3 via CNAME only needed.) — Hint: Learn about the protocol
  • Cookieless domains for static assets
  • Lighter web container for static assets (e.g. nginx, lighttpd)
  • Know about caching, expires and etags
  • Stripping out http://ww.domain.com from all your internal links (that one alone saved 12% of HTML page size for a client). You may ask is that really a big deal, well in a high volume site the sooner you can release the socket on your webserver, the sooner you can start serving a different request.

Like tuning a database, some things work better then others, some require more testing then others, and consultants never tell you all the tricks.

References

As with everything in tuning, do your research and also determine what works in your environment and what doesn’t. Two excellent resources to start with are Steve Souders and Best Practices for Speeding Up Your Web Site by Yahoo.

Upcoming book – Expert PHP and MySQL

Wednesday, March 3rd, 2010

This month will see the release of the book Expert PHP and MySQL which I was a co-author of. Initially this will be available for purchase in PDF format from the Wrox website and I am hopeful this will be available in print format for the MySQL Users Conference.

More then just your standard PHP and MySQL there is detailed content on technologies including Memcached, Sphinx, Gearman, MySQL UDFs and PHP extensions. We will be posting more information at www.ExpertPhpandMySQL.com. You can download a PDF version of Chapter 1 Techniques Every Expert Programmer Needs to Know.

The book includes the following content:

  1. Techniques Every Expert Programmer Needs to Know
  2. Advanced PHP Constructs
  3. MySQL Drivers and Storage Engines
  4. Improved Performance through Caching
  5. Memcached MySQL
  6. Advanced MySQL
  7. Extending MySQL with User-defined Functions
  8. Writing PHP Extensions
  9. Full Text Search using SPHINX
  10. Multi-tasking in PHP and MySQL
  11. Rewrite Rules
  12. User Authentication with PHP and MySQL
  13. Understanding the INFORMATION_SCHEMA
  14. Security
  15. Service and Command Lines
  16. Optimization and Debugging

Problems compiling MySQL 5.4

Thursday, June 11th, 2009

Seem’s the year Sun had for improving MySQL, and with an entire new 5.4 branch the development team could not fix the autoconf and compile dependencies that has been in MySQL for all the years I’ve been compiling MySQL. Drizzle has got it right, thanks to the great work of Monty Taylor.

I’m working on the Wafflegrid AWS EC2 AMI’s for Matt Yonkovit and while compiling 5.1 was straight forward under Ubuntu 8.10 Intrepid, compiling 5.4 was more complicated.

For MySQL 5.1 I needed only to do the following:

apt-get install -y build-essential
apt-get install libncurses5-dev
./configure
make
make install

For MySQL 5.4, I elected to use the BUILD scripts (based on Wafflegrid recommendations). That didn’t go far before I needed.

apt-get install -y automake libtool

You then have to go compiling MySQL 5.4 for 10+ minutes to get an abstract error, then you need to consider what dependencies may be missing.
I don’t like to do a blanket apt-get of a long list of proposed packages unless I know they are actually needed.

The error was:

make[1]: Entering directory `/src/mysql-5.4.0-beta/sql'
make[1]: warning: -jN forced in submake: disabling jobserver mode.
/bin/bash ../ylwrap sql_yacc.yy y.tab.c sql_yacc.cc y.tab.h sql_yacc.h y.output sql_yacc.output -- -d --verbose
make -j 6 gen_lex_hash
make[2]: Entering directory `/src/mysql-5.4.0-beta/sql'
rm -f mini_client_errors.c
/bin/ln -s ../libmysql/errmsg.c mini_client_errors.c
make[2]: warning: -jN forced in submake: disabling jobserver mode.
rm -f pack.c
../ylwrap: line 111: -d: command not found
/bin/ln -s ../sql-common/pack.c pack.c
....
make[1]: Leaving directory `/src/mysql-5.4.0-beta/sql'
make: *** [all-recursive] Error 1

What a lovely error ../ylwrap: line 111: -d: command not found

ylwrap is part of yacc, and by default in this instance it’s not even an installed package. I’ve compiled MySQL long enough that it requires yacc, and actually bison but to you think it would hurt if the configure told the user this.

It’s also been some time since I’ve compiled MySQL source, rather focusing on Drizzle. I had forgotten just how many compile warnings MySQL throws. Granted a warning is not an error, but you should not just ignore them in building a quality product.

Announcing Drizzle on EC2

Sunday, April 26th, 2009

I have published the very first sharable Drizzle Amazon Machine Image (AMI) for AWS EC2, based on the good feedback from my discussion at the Drizzle Developer Day on what options we should try.

This first version is a 32bit Developer instance, showcasing Drizzle and all necessary developer tools to build Drizzle from source.

What you will find on drizzle-ami/intrepid-dev32 – ami-b858bfd1

Ubuntu 8.10 Intrepid 32 bit base server installation:

  • build tools
  • drizzle dependencies
  • bzr 1.31.1

From the respective source trees the following software is available:

  • drizzle 2009.04.997
  • libdrizzle 0.0.2
  • gearman 0.0.4
  • memcached 1.2.8
  • libmemcached 0.28

Drizzle has been configured with necessary dependencies for PAM authentication, http_auth, libgearman and MD5 but these don’t seem to be available in the binary distribution.

I will be creating additional AMI’s including 64bit and LAMP ready binary only images.

The following example shows using drizzle on this AMI. Some further work is necessary for full automation, parameters and logging. I’ve raised a number of issues the Drizzle Developers are now hard at work on.

1. Starting Drizzle

ssh ubuntu@ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com
sudo /etc/init.d/drizzle-server.init start &

2. Testing Drizzle (the sakila database has been installed)

$ drizzle
Welcome to the Drizzle client..  Commands end with ; or \g.
Your Drizzle connection id is 4
Server version: 2009.04.997 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

drizzle> select version();
+-------------+
| version()   |
+-------------+
| 2009.04.997 |
+-------------+
1 row in set (0 sec)

drizzle> select count(*) from sakila.film;
+----------+
| count(*) |
+----------+
|     1000 |
+----------+
1 row in set (0 sec)

3. Compiling Drizzle

sudo su - drizzle
ls
deploy  drizzle  libdrizzle  sakila-drizzle
cd drizzle
./configure --help
Description of plugins:

   === HTTP Authentication Plugin ===
  Plugin Name:      auth_http
  Description:      HTTP based authentications
  Supports build:   static and dynamic

   === PAM Authenication Plugin ===
  Plugin Name:      auth_pam
  Description:      PAM based authenication.
  Supports build:   dynamic

   === compression UDFs ===
  Plugin Name:      compression
  Description:      UDF Plugin for compression
  Supports build:   static and dynamic
  Status:           mandatory

   === crc32 UDF ===
  Plugin Name:      crc32
  Description:      UDF Plugin for crc32
  Supports build:   static and dynamic
  Status:           mandatory

   === Error Message Plugin ===
  Plugin Name:      errmsg_stderr
  Description:      Errmsg Plugin that sends messages to stderr.
  Supports build:   dynamic

   === Daemon Example Plugin ===
  Plugin Name:      hello_world
  Description:      UDF Plugin for Hello World.
  Supports build:   dynamic

   === Gearman Logging Plugin ===
  Plugin Name:      logging_gearman
  Description:      Logging Plugin that logs to Gearman.
  Supports build:   dynamic

   === Query Logging Plugin ===
  Plugin Name:      logging_query
  Description:      Logging Plugin that logs all queries.
  Supports build:   static and dynamic
  Status:           mandatory

   === Syslog Logging Plugin ===
  Plugin Name:      logging_syslog
  Description:      Logging Plugin that writes to syslog.
  Supports build:   static and dynamic
  Status:           mandatory

   === MD5 UDF ===
  Plugin Name:      md5
  Description:      UDF Plugin for MD5
  Supports build:   static and dynamic

   === One Thread Per Connection Scheduler ===
  Plugin Name:      multi_thread
  Description:      plugin for multi_thread
  Supports build:   static
  Status:           mandatory

   === Old libdrizzle Protocol ===
  Plugin Name:      oldlibdrizzle
  Description:      plugin for oldlibdrizzle
  Supports build:   static
  Status:           mandatory

   === Pool of Threads Scheduler ===
  Plugin Name:      pool_of_threads
  Description:      plugin for pool_of_threads
  Supports build:   static
  Status:           mandatory

   === Default Signal Handler ===
  Plugin Name:      signal_handler
  Description:      plugin for signal_handler
  Supports build:   static
  Status:           mandatory

   === Single Thread Scheduler ===
  Plugin Name:      single_thread
  Description:      plugin for single_thread
  Supports build:   static
  Status:           mandatory

   === Archive Storage Engine ===
  Plugin Name:      archive
  Description:      Archive Storage Engine
  Supports build:   static
  Status:           mandatory

   === Blackhole Storage Engine ===
  Plugin Name:      blackhole
  Description:      Basic Write-only Read-never tables
  Supports build:   static and dynamic
  Configurations:   max, max-no-ndb

   === CSV Storage Engine ===
  Plugin Name:      csv
  Description:      Stores tables in text CSV format
  Supports build:   static
  Status:           mandatory

   === Memory Storage Engine ===
  Plugin Name:      heap
  Description:      Volatile memory based tables
  Supports build:   static
  Status:           mandatory

   === InnoDB Storage Engine ===
  Plugin Name:      innobase
  Description:      Transactional Tables using InnoDB
  Supports build:   static and dynamic
  Configurations:   max, max-no-ndb
  Status:           mandatory

   === MyISAM Storage Engine ===
  Plugin Name:      myisam
  Description:      Traditional non-transactional MySQL tables
  Supports build:   static
  Status:           mandatory

Report bugs to <http://bugs.launchpad.net/drizzle>.

Setting up MySQL on Amazon Web Services (AWS) Presentation

Wednesday, April 22nd, 2009

On Tuesday at the MySQL Camp 2009 in Santa Clara I presented Setting up MySQL on Amazon Web Services (AWS).

This presentation assumed you know nothing about AWS, and have no account. With Internet access via a Browser and a valid Credit Card, you can have your own running Web Server on the Internet in under 10 minutes, just point and click.

We also step into some more detail online click and point and supplied command line tools to demonstrate some more advanced usage.

Some Drupal observations

Monday, February 23rd, 2009

I had the opportunity to review a client’s production Drupal installation recently. This is a new site and traffic is just starting to pick up. Drupal is a popular LAMP stack open source CMS system using the MySQL Database.

Unfortunately I don’t always have the chance to focus on one product when consulting, sometimes the time can be minutes to a few hours. Some observations from looking at Drupal.

Disk footprint

Presently, volume and content is of a low volume, but expecting to ramp up. I do however find 90% of disk volume in one table called ‘watchdog’;


+--------------+--------------+--------------+-------------+--------+
| table_schema | total_mb     | data_mb      | index_mb    | tables |
+--------------+--------------+--------------+-------------+--------+
| xxxxx        | 812.95555878 | 745.34520721 | 67.61035156 |    191 |
+--------------+--------------+--------------+-------------+--------+

+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| table_name                                | engine | row_format | table_rows | avg_row_length | total_mb     | data_mb      | index_mb    |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| watchdog                                  | MyISAM | Dynamic    |      63058 |            210 | 636.42242813 | 607.72516251 | 28.69726563 |
| cache_menu                                | MyISAM | Dynamic    |        145 |         124892 |  25.33553696 |  25.32577133 |  0.00976563 |
| search_index                              | MyISAM | Dynamic    |     472087 |             36 |  23.40134048 |  16.30759048 |  7.09375000 |
| comments                                  | MyISAM | Dynamic    |      98272 |            208 |  21.83272934 |  19.58272934 |  2.25000000 |

Investigating the content of the ‘watchdog’ table shows detailed logging. Drilling down just on the key ‘type’ records shows the following.

mysql> select message,count(*) from watchdog where type='page not found' group by message order by 2 desc limit 10;
+--------------------------------------+----------+
| message                              | count(*) |
+--------------------------------------+----------+
| content/images/loadingAnimation.gif  |    17198 |
| see/images/loadingAnimation.gif      |     6659 |
| images/loadingAnimation.gif          |     6068 |
| node/images/loadingAnimation.gif     |     2774 |
| favicon.ico                          |     1772 |
| sites/all/modules/coppa/coppa.js     |      564 |
| users/images/loadingAnimation.gif    |      365 |
| syndicate/google-analytics.com/ga.js |      295 |
| content/img_pos_funny_lowsrc.gif     |      230 |
| content/google-analytics.com/ga.js   |      208 |
+--------------------------------------+----------+
10 rows in set (2.42 sec)

Some 25% of rows is just the reporting one missing file. Correcting this one file cuts down a pile of unnecessary logging.

Repeating Queries

Looking at just 1 random second of SQL logging shows 1200+ SELECT statements.
355 are SELECT changed FROM node

$ grep would_you_rather drupal.1second.log
              7 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              5 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              3 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1
              4 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              6 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
             10 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              8 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1

There is plenty of information regarding monitoring the Slow Queries in MySQL, but I have also promoted that’s it not the slow queries that ultimately slow a system down, but the 1000’s of repeating fast queries.

MySQL of course has the Query Cache to assist, but this is a course grade solution, and a high volume read/write environment this is meaningless.

There is a clear need for either a application level caching, or a database redesign to pull rather then poll this information, however without more in depth review of Drupal I can not make any judgment calls.

Javascript Helpers

Monday, October 13th, 2008

Combined with my old favorites of Dynamic Drive, DHTML Goodies and Brain Jar, I’ve added the following to my list of Javascript sources.

Improving your web site compatibility with browsers

Friday, September 26th, 2008

Every website page content uses two basic elements, HyperText Markup Language (HTML) and Cascading Style Sheets (CSS). Each of these has various standards, HTML has versions such as 3.2, 4.0, 4.01, and the new XHTML 1.0,1.1, 2.0 along with various version flavors know as strict, transitional & frameset. CSS also has various versions including 1, 2 and 3.

Each browser renders your combined HTML & CSS differently. The look and feel can vary between FireFox, Safari, Chrome, Internet Explorer and the more less common browsers. Indeed each version of a product also renders different. With IE 8 just being released, it’s common versions now are 5.0, 5.5, 6.0 and 7.0. This product alone now has 5 versions that UI designers must test and verify.

To minimize presentation and rendering problems, adhering to the standards can only assist, and greatly benefit the majority of entrepreneurs, designers and developers that are not dedicated resources. There are two excellent online tools from the standards body, the World Wide Web Consortium (W3C) that an easily assist you.

You can also link directly to these sites, so it’s easier to validate your HTML and CSS directly from your relevant webpage. For example, HTML Validation and CSS.

It’s not always possible to meet the standards, and when you are not the full-time developer of your site, it can be time consuming if you don’t check early and regularly.

Brand identity with undesirable domain names

Wednesday, September 24th, 2008

Choosing a domain name for your brand identity is the start. Protecting your domain name by registering for example .net, .org, and the many more extensions is one step in brand identity.

However a recent very unpleasant experience in New York, resulted in realizing some companies also register undesirable domain names. I was one of many unhappy people, mainly tourists as I was showing an Australian friend the sights of New York. We had chosen to use the City Sights NY bus line, but we caught with some 100+ people in a clear “screw the paying customers” experience.

I was really annoyed that my friend, only in New York for 2 days both had to experience this, and missed out on a night tour. I commented, I’m going to register citysightsnysucks.com, and share the full story of our experience, directing people to use Gray Line New York, which clearly by observation were providing the service we clearly did not get.

To my surprise, the domain name was already taken. To my utter surprise, the owner of the domain is the same as citysightsny.com. Did they do this by choice, or did another unhappy person (at least in 2006) register this, only to be perhaps threatened legally to give up the domain.

I would generally recommend in brand identity this approach, especially when select common misspellings and hyphenated versions if applicable can easily lead to a lot of domain names for your brand identity.

$ whois citysightsny.com

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: CITYSIGHTSNY.COM
   Registrar: INTERCOSMOS MEDIA GROUP, INC. D/B/A DIRECTNIC.COM
   Whois Server: whois.directnic.com
   Referral URL: http://www.directnic.com
   Name Server: NS0.DIRECTNIC.COM
   Name Server: NS1.DIRECTNIC.COM
   Status: clientDeleteProhibited
   Status: clientTransferProhibited
   Status: clientUpdateProhibited
   Updated Date: 31-dec-2006
   Creation Date: 28-nov-2004
   Expiration Date: 28-nov-2011

Registrant:
 CitySights New York LLC
 15 Second Ave
 Brooklyn, NY 11215
 US
 718-875-8200x103
Fax:718-875-7056

Domain Name: CITYSIGHTSNY.COM
$ whois citysightsnysucks.com

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: CITYSIGHTSNYSUCKS.COM
   Registrar: INTERCOSMOS MEDIA GROUP, INC. D/B/A DIRECTNIC.COM
   Whois Server: whois.directnic.com
   Referral URL: http://www.directnic.com
   Name Server: NS.PUSHONLINE.NET
   Name Server: NS2.PUSHONLINE.NET
   Status: clientDeleteProhibited
   Status: clientTransferProhibited
   Status: clientUpdateProhibited
   Updated Date: 26-jun-2007
   Creation Date: 11-aug-2006
   Expiration Date: 11-aug-

Registrant:
 CitySights New York LLC
 15 Second Ave
 Brooklyn, NY 11215
 US
 718-875-8200x103
Fax:718-875-7056

Domain Name: CITYSIGHTSNYSUCKS.COM

To www or not www

Monday, September 22nd, 2008

Domain names historically have been www.example.com, written also with the protocol prefix http://www.example.com, but in reality www. is optional, only example.com is actually needed.

www. is technically a sub-domain and sub-domains incur a small penalty in search engine optimization.

There is no right or wrong. What is important is that you choose one, and the other needs to be a 301 Permanent Redirect to the one you have chosen.

You also need to know that creating a server alias in your web server configuration, for example Apache or Tomcat is not a permanent redirect, in-fact it is technically duplicate content, with two web sites the same also incurring a penalty for search engine rating.

So what do the big players do. Here are a few.

Use www

  • www.google.com
  • www.facebook.com
  • www.cnn.com
  • www.yahoo.com
  • www.myspace.com
  • www.ebay.com
  • www.plurk.com
  • www.amazon.com
  • www.fotolog.com
  • www.linkedin.com

Do not use www

  • digg.com
  • wordpress.com
  • identi.ca

Show duplicate content

  • flickr.com
  • chi.mp
  • corkd.com
  • vimeo.com
  • garysguide.org
  • engineyard.com

Curiously youtube.com uses a 303 redirect, microsoft.com, stumbleupon.com and craigslist.org a 302 redirect.

How do you check? Use a CLI tool such as wget.

$ wget www.google.com
--2008-09-22 19:56:48--  http://www.google.com/
Resolving www.google.com... 72.14.205.99, 72.14.205.103, 72.14.205.104, ...
Connecting to www.google.com|72.14.205.99|:80... connected.
HTTP request sent, awaiting response... 200 OK

$ wget google.com
--2008-09-22 19:57:56--  http://google.com/
Resolving google.com... 64.233.167.99, 64.233.187.99, 72.14.207.99
Connecting to google.com|64.233.167.99|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.google.com/ [following]

$ wget www.facebook.com
--2008-09-22 20:07:59--  http://www.facebook.com/
Resolving www.facebook.com... 69.63.178.12
Connecting to www.facebook.com|69.63.178.12|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

$ wget facebook.com
--2008-09-22 19:59:43--  http://facebook.com/
Resolving facebook.com... 69.63.176.140, 69.63.178.11
Connecting to facebook.com|69.63.176.140|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.facebook.com/ [following]

$ wget digg.com
--2008-09-22 20:10:47--  http://digg.com/
Resolving digg.com... 64.191.203.30
Connecting to digg.com|64.191.203.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15322 (15K) [text/html]

$ wget www.digg.com
--2008-09-22 20:14:06--  http://www.digg.com/
Resolving www.digg.com... 64.191.203.30
Connecting to www.digg.com|64.191.203.30|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://digg.com/ [following]

$ wget twitter.com
--2008-09-22 20:26:18--  http://twitter.com/
Resolving twitter.com... 128.121.146.100
Connecting to twitter.com|128.121.146.100|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2655 (2.6K) [text/html]

$ wget www.twitter.com
--2008-09-22 20:26:41--  http://www.twitter.com/
Resolving www.twitter.com... 128.121.146.100
Connecting to www.twitter.com|128.121.146.100|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://twitter.com/ [following]

Professionally, I prefer shorter and simpler without www.

References: