Ronald Bradford
MySQL Expert

MySQL Expert Ronald Bradford shares valuable input in MySQL Performance Tuning, MySQL Scalability and general MySQL Help from his two decades of working with MySQL, Oracle, Ingres and development technologies.

Archive for the ‘Web’ Category

HiTCHO Top tech tips

Wednesday, May 13th, 2009

I recent visit with old Brisbane friend HiTCHO which I met at the Brisbane MySQL Users Group in 2005, has lead to this cool list of some hardware and software technologies he used that I am now considering or have already implemented or purchased.

Software

  1. xmarks.com – Bookmark-Powered Web Discovery
  2. Pulse – Smart Pen
  3. Quicksilver Mac windows manager
  4. MailPlane – Brings Gmail to your Mac desktop
  5. Evernote – Remember Everything, with Firefox plugin and iPhone App
  6. Textmate – The missing editor for Mac OS/X
  7. Screen flow Professional screencasting Studio
  8. Snoop – A GNU/Linux file descriptor monitoring tool inspired by FreeBSD’s ‘watch’.

Hardware

  1. Drobo – Storage that manages itself
  2. Canon PowerShot SX1. True HD in a Canon compact digital camera.
  3. LiveScribe – Never miss a word

Announcing Drizzle on EC2

Sunday, April 26th, 2009

I have published the very first sharable Drizzle Amazon Machine Image (AMI) for AWS EC2, based on the good feedback from my discussion at the Drizzle Developer Day on what options we should try.

This first version is a 32bit Developer instance, showcasing Drizzle and all necessary developer tools to build Drizzle from source.

What you will find on drizzle-ami/intrepid-dev32 – ami-b858bfd1

Ubuntu 8.10 Intrepid 32 bit base server installation:

  • build tools
  • drizzle dependencies
  • bzr 1.31.1

From the respective source trees the following software is available:

  • drizzle 2009.04.997
  • libdrizzle 0.0.2
  • gearman 0.0.4
  • memcached 1.2.8
  • libmemcached 0.28

Drizzle has been configured with necessary dependencies for PAM authentication, http_auth, libgearman and MD5 but these don’t seem to be available in the binary distribution.

I will be creating additional AMI’s including 64bit and LAMP ready binary only images.

The following example shows using drizzle on this AMI. Some further work is necessary for full automation, parameters and logging. I’ve raised a number of issues the Drizzle Developers are now hard at work on.

1. Starting Drizzle

ssh ubuntu@ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com
sudo /etc/init.d/drizzle-server.init start &

2. Testing Drizzle (the sakila database has been installed)

$ drizzle
Welcome to the Drizzle client..  Commands end with ; or \g.
Your Drizzle connection id is 4
Server version: 2009.04.997 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

drizzle> select version();
+-------------+
| version()   |
+-------------+
| 2009.04.997 |
+-------------+
1 row in set (0 sec)

drizzle> select count(*) from sakila.film;
+----------+
| count(*) |
+----------+
|     1000 |
+----------+
1 row in set (0 sec)

3. Compiling Drizzle

sudo su - drizzle
ls
deploy  drizzle  libdrizzle  sakila-drizzle
cd drizzle
./configure --help
Description of plugins:

   === HTTP Authentication Plugin ===
  Plugin Name:      auth_http
  Description:      HTTP based authentications
  Supports build:   static and dynamic

   === PAM Authenication Plugin ===
  Plugin Name:      auth_pam
  Description:      PAM based authenication.
  Supports build:   dynamic

   === compression UDFs ===
  Plugin Name:      compression
  Description:      UDF Plugin for compression
  Supports build:   static and dynamic
  Status:           mandatory

   === crc32 UDF ===
  Plugin Name:      crc32
  Description:      UDF Plugin for crc32
  Supports build:   static and dynamic
  Status:           mandatory

   === Error Message Plugin ===
  Plugin Name:      errmsg_stderr
  Description:      Errmsg Plugin that sends messages to stderr.
  Supports build:   dynamic

   === Daemon Example Plugin ===
  Plugin Name:      hello_world
  Description:      UDF Plugin for Hello World.
  Supports build:   dynamic

   === Gearman Logging Plugin ===
  Plugin Name:      logging_gearman
  Description:      Logging Plugin that logs to Gearman.
  Supports build:   dynamic

   === Query Logging Plugin ===
  Plugin Name:      logging_query
  Description:      Logging Plugin that logs all queries.
  Supports build:   static and dynamic
  Status:           mandatory

   === Syslog Logging Plugin ===
  Plugin Name:      logging_syslog
  Description:      Logging Plugin that writes to syslog.
  Supports build:   static and dynamic
  Status:           mandatory

   === MD5 UDF ===
  Plugin Name:      md5
  Description:      UDF Plugin for MD5
  Supports build:   static and dynamic

   === One Thread Per Connection Scheduler ===
  Plugin Name:      multi_thread
  Description:      plugin for multi_thread
  Supports build:   static
  Status:           mandatory

   === Old libdrizzle Protocol ===
  Plugin Name:      oldlibdrizzle
  Description:      plugin for oldlibdrizzle
  Supports build:   static
  Status:           mandatory

   === Pool of Threads Scheduler ===
  Plugin Name:      pool_of_threads
  Description:      plugin for pool_of_threads
  Supports build:   static
  Status:           mandatory

   === Default Signal Handler ===
  Plugin Name:      signal_handler
  Description:      plugin for signal_handler
  Supports build:   static
  Status:           mandatory

   === Single Thread Scheduler ===
  Plugin Name:      single_thread
  Description:      plugin for single_thread
  Supports build:   static
  Status:           mandatory

   === Archive Storage Engine ===
  Plugin Name:      archive
  Description:      Archive Storage Engine
  Supports build:   static
  Status:           mandatory

   === Blackhole Storage Engine ===
  Plugin Name:      blackhole
  Description:      Basic Write-only Read-never tables
  Supports build:   static and dynamic
  Configurations:   max, max-no-ndb

   === CSV Storage Engine ===
  Plugin Name:      csv
  Description:      Stores tables in text CSV format
  Supports build:   static
  Status:           mandatory

   === Memory Storage Engine ===
  Plugin Name:      heap
  Description:      Volatile memory based tables
  Supports build:   static
  Status:           mandatory

   === InnoDB Storage Engine ===
  Plugin Name:      innobase
  Description:      Transactional Tables using InnoDB
  Supports build:   static and dynamic
  Configurations:   max, max-no-ndb
  Status:           mandatory

   === MyISAM Storage Engine ===
  Plugin Name:      myisam
  Description:      Traditional non-transactional MySQL tables
  Supports build:   static
  Status:           mandatory

Report bugs to <http://bugs.launchpad.net/drizzle>.

Setting up MySQL on Amazon Web Services (AWS) Presentation

Wednesday, April 22nd, 2009

On Tuesday at the MySQL Camp 2009 in Santa Clara I presented Setting up MySQL on Amazon Web Services (AWS).

This presentation assumed you know nothing about AWS, and have no account. With Internet access via a Browser and a valid Credit Card, you can have your own running Web Server on the Internet in under 10 minutes, just point and click.

We also step into some more detail online click and point and supplied command line tools to demonstrate some more advanced usage.

Twitter Tips

Friday, March 27th, 2009

I have in the past questioned the value of Twitter as an effective business tool, but it continues to defy the trend of inability to bridge the business gap with social media.

Even with still continual growth problems (at least it’s not down as much) Twitter is everywhere I go, see or do. You see it at business events, business cards, meetups even on CNN Headline News. There are so many various differ twitter sites, applications, widgets etc, I’m surprised there isn’t a twitter index just of the twitter related sites.

I have now incorporated Twitter into my professional site and I’m using this micro-blogging approach more to share my professional skills and interests to my growing band of followers. I don’t expect to make the Twitter top list which is headed by CNN Breaking News with 667,353 followers.

Even Lance Armstrong (who rates 9th) used Twitter for press releases this week of his injuries.

For more reading check out How Twitter Makes You A Better Writer and 27 Twitter Applications Your Small Business Can Use Today.

I was surprised to see How to get a job by blogging: Tips for a setting up the kind of professional blog that will get you hired, barely mention Twitter.

Now be sure to add a background appropriate to your Twitter. This one is wicked.

Your Code, Your Community, Your Cloud… Project Kenai

Wednesday, March 18th, 2009

Following the opening keynote announcement about Kenai I ventured into a talk on Project Kenai.

With today’s economy, the drive is towards efficiency is certainly a key consideration, it was quoted that dedicated hosting servers only run at 30% efficiency.

An overview again of Cloud Computing

  • Economics – Pay as you go,
  • Developer Centric – rapid self provisioning, api-driven, faster deployment
  • Flexibility – standard services, elastic, on demand, multi-tenant

Types of Clouds

  • Public – pay as you go, multi-tenant application and services
  • Private – Cloud computing model run within a company’s own data center
  • Mixed – Mixed user of public and private clouds according to applications

SmugMug was referenced as a Mixed Cloud example.

Cloud Layers

  • Infrastructure as a Services – Basic storage and computer capabilities offer as a service (eg. AWS)
  • Platform as a Service – Developer platform with build-in services. e.g. Google App Engine
  • Software as Service – applications offered on demand over the network e.g salesforce.com

Some issues raised about this layers included.

  • IaaS issues include Service Level, Privacy, Security, Cost of Exit
  • PaaS interesting point, one that is the bane of MySQL performance tuning, that is instrumentation
  • SaaS nothing you need to download, you take the pieces you need, interact with the cloud. More services simply like doing your Tax online.

Sun offers Project Kenai as well as Zembly.

Project Kenai

  • A platform and ecosystem for developers.
  • Freely host open source projects and code.
  • Connect, community, collaborate and Code with peers
  • Eventually easily deploy application/services to “clouds”

Kenai Features

  • Code Repository with SVN, Mercurial, or an external repository
  • Issue tracking with bugzilla, jira
  • collaboration tools such as wiki, forums, mailing lists
  • document hosting
  • your profile
  • administrative role

Within Kenai you can open up to 5 open source projects and various metrics of the respositories, issue trackers, wiki etc.

The benefits were given as the features are integrated into your project, not distributed across different sites. Agile development within the project sees a release every 2 weeks. Integration with NetBeans and Eclipse is underway.

Kenai is targeted as being the core of the next generation of Sun’s collaboration tools. However when I asked for more details about uptake in Sun, it’s only a request, not a requirement for internal teams.

The API’s for the Sun Cloud are at http://kenai.com/projects/suncloudapis.

Event: CommunityOne East in New York, NY.
Presenter: Tori Wieldt, Sun Microsystems
Article Author: Ronald Bradford

Everybody is talking About Clouds

Wednesday, March 18th, 2009

From the opening keynote at CommunityOne East we begin with Everybody is talking About Clouds.

It’s difficult to get a good definition, the opening cloud definition today was Software/Platform/Storage/Database/Infrastructure as a service. Grid Computing, Visualization, Utility Computing, Application Hosting. Basically all the buzz words we currently know.

Cloud computing has the ideals of truly bringing a freedom of choice. For inside or outside of an enterprise, the lower the barrier, time and cost into freedom of choice give opportunities including:

  • Self-service provisioning
  • Scale up, Scale down.
  • Pay for only what you use.

Sun’s Vision has existed since 1984 with “The NETWORK is the Computer”.

Today, Sun’s View includes Many Clouds, Public and Private, Tuned up for different application needs, geographical, political, with a goal of being Open and Compatible.

How do we think into the future for developing and deploying into the cloud? The answer given today was, The Sun Open Cloud Platform which includes the set of core technologies, API’s and protocols that Sun hopes to see uptake among many different providers.

The Sun Cloud Platform

  • Products and Technologies – VirtualBox, Sun xVM, Q-Laser, MySQL
  • Expertise and Services
  • Partners – Zmanda, Rightscale, Kickapps
  • Open Communities – Glashfish, Java, Open Office, Zfs, Netbeans, Eucalyptus

The Sun Cloud includes:

  • Compute Service
  • Storage Service
  • Virtual Data Center
  • Open API – Public, RESTful, Java, Python, Ruby

The public API has been released today and is available under Kenai. It includes two key points:

  • Everything is a resource http GET, POST, PUT etc
  • A single starting point, other URI’s are discoverable.

What was initially showed was CLI interface exmaples, great to see this still is common, a demonstration using drag and drop via a web interface was also given, showing a load balanced, multi-teired, multi server environment. This was started and tested during the presentation.

Then Using Cyberduck (a WebDAV client on Mac OS/X) and being able to access the storage component at storage.network.com directly, then from Open Office you now get options to Get/Save to Cloud ( using TwoGuys.com, Virtual Data Center example document).

Seamless integration between the tools, and the service. That was impressive.

More information at sun.com/cloud. You can get more details also at the Sun Microsystems Unveils Open Cloud PlatformOfficial Press Release.

Event: CommunityOne East in New York, NY.
Article Author: Ronald Bradford

CommunityOne East – An open developer conference

Wednesday, March 18th, 2009

With an opening video from thru-you.com – an individual taking random you-tube video and producing video mashup’s, the CommunityOne East conference in New York, NY beings.

The opening introduction was by Chief Sustainability Officer Dave Douglas. Interesting job title.

His initial discussion was around what is the relationship between technology and society. A plug for his upcoming book “Citizen Engineer” – The responsibilities of a 21st Century Engineer. He quotes “Crisis loves an Innovation” by Jonathan Schwartz, and extends with “Crisis loves a Community”.

He asks us to consider the wider community ecosystem such as schools, towns, governments, NGO’s etc with our usage and knowledge of technology.

Event: CommunityOne East in New York, NY.
Author: Ronald Bradford

Hurting the little guy?

Monday, March 16th, 2009

Today I come back from the dentist, if that wasn’t bad enough news, I get an email from Google AdWords titled Your Google AdWords Approval Status.

In the email, all my AdWords campaigns are now disapproved, because of:

SUGGESTIONS:
-> Ad Content: Please remove the following trademark from your ad:
mysql.

Yeah right. I can’t put the word ‘MySQL’ in my ads. How are people to now find me? It would appear that many ads have been pulled not just mine. Is this a proactive measure by Google? is this a complaint from the MySQL trademark holder Sun Microsystems?

I’d like any comment, feedback or suggestions on how one can proceed here.

It reminds me of the days CentOS advertised itself as an “Open source provider of a popular North American Operating System”, or something of that nature.

Some Drupal observations

Monday, February 23rd, 2009

I had the opportunity to review a client’s production Drupal installation recently. This is a new site and traffic is just starting to pick up. Drupal is a popular LAMP stack open source CMS system using the MySQL Database.

Unfortunately I don’t always have the chance to focus on one product when consulting, sometimes the time can be minutes to a few hours. Some observations from looking at Drupal.

Disk footprint

Presently, volume and content is of a low volume, but expecting to ramp up. I do however find 90% of disk volume in one table called ‘watchdog’;


+--------------+--------------+--------------+-------------+--------+
| table_schema | total_mb     | data_mb      | index_mb    | tables |
+--------------+--------------+--------------+-------------+--------+
| xxxxx        | 812.95555878 | 745.34520721 | 67.61035156 |    191 |
+--------------+--------------+--------------+-------------+--------+

+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| table_name                                | engine | row_format | table_rows | avg_row_length | total_mb     | data_mb      | index_mb    |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| watchdog                                  | MyISAM | Dynamic    |      63058 |            210 | 636.42242813 | 607.72516251 | 28.69726563 |
| cache_menu                                | MyISAM | Dynamic    |        145 |         124892 |  25.33553696 |  25.32577133 |  0.00976563 |
| search_index                              | MyISAM | Dynamic    |     472087 |             36 |  23.40134048 |  16.30759048 |  7.09375000 |
| comments                                  | MyISAM | Dynamic    |      98272 |            208 |  21.83272934 |  19.58272934 |  2.25000000 |

Investigating the content of the ‘watchdog’ table shows detailed logging. Drilling down just on the key ‘type’ records shows the following.

mysql> select message,count(*) from watchdog where type='page not found' group by message order by 2 desc limit 10;
+--------------------------------------+----------+
| message                              | count(*) |
+--------------------------------------+----------+
| content/images/loadingAnimation.gif  |    17198 |
| see/images/loadingAnimation.gif      |     6659 |
| images/loadingAnimation.gif          |     6068 |
| node/images/loadingAnimation.gif     |     2774 |
| favicon.ico                          |     1772 |
| sites/all/modules/coppa/coppa.js     |      564 |
| users/images/loadingAnimation.gif    |      365 |
| syndicate/google-analytics.com/ga.js |      295 |
| content/img_pos_funny_lowsrc.gif     |      230 |
| content/google-analytics.com/ga.js   |      208 |
+--------------------------------------+----------+
10 rows in set (2.42 sec)

Some 25% of rows is just the reporting one missing file. Correcting this one file cuts down a pile of unnecessary logging.

Repeating Queries

Looking at just 1 random second of SQL logging shows 1200+ SELECT statements.
355 are SELECT changed FROM node

$ grep would_you_rather drupal.1second.log
              7 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              5 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              3 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1
              4 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              6 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
             10 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              8 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1

There is plenty of information regarding monitoring the Slow Queries in MySQL, but I have also promoted that’s it not the slow queries that ultimately slow a system down, but the 1000’s of repeating fast queries.

MySQL of course has the Query Cache to assist, but this is a course grade solution, and a high volume read/write environment this is meaningless.

There is a clear need for either a application level caching, or a database redesign to pull rather then poll this information, however without more in depth review of Drupal I can not make any judgment calls.

Where is the innovation?

Monday, November 24th, 2008

The 2009 MySQL Conference has closed it’s submissions for papers. This year the motto is “Innovation Everywhere”.

Last weekend’s Open SQL Camp in Charlottesville, Virginia, we had the chance to talk about the movements in the MySQL ecosystem. I was impressed to get the details of the Percona MySQL Patches, but focus is still in 5.0. (Welcome to the Percona team Tom Basil) Our Delta is attempting now to integrate patches into various MySQL branches. There was an opening keynote by Brian Aker from Drizzle, and Drizzle team Jay Pipes and Stewart Smith on hand. It was also announced that MySQL 5.1.30 will be GA, available in early December.

But these are not innovations that are ground breaking. Last year, it was the announcement of KickFire that I found most intriguing regarding innovation.

What is there this year?. The most interesting thing I read last week was Memcached as a L2 Cache for Innodb – The Waffle Grid Project. This is my kind of innovation. It’s sufficiently MySQL, but just adds another dimension with another companion technology. The patch seems relatively simple in concept and code size, and I’m almost prepared to fire up a few EC2’s to take this one for a spin. I’m doubly impressed because the creators are two friends and colleagues that are not hard core kernel hackers, but professionals on the front line dealing with clients daily. Will it be successful, or viable? That is the question about innovation.

Unfortunately I spend more time these days not seeing innovation in MySQL, but in other alternative database solutions in general. Projects like Clustrix, Inc., LucidDB, and Mongo in the 10gen stack.