What is Google's direction?

Tonight over discussion was Android and what is Google’s ultimate direction. Have they lost their way, or are they just planning to explode with so many new things that will revolutionize what and how we do things. With $475,000 first price for Android, they certainly have the money available to invest in new directions.

I arrive home, and find email discussion on The Google Browser – Chrome.

Inquisitive, I take a look, to find the great teaser, nothing by a comic, come back tomorrow for the download link. Is that clever to leak information, have everybody write about it and check back tomorrow?

Naming standards? Singular or Plural

It’s important that for any software application good standards exist. Standards ensure a number of key considerations. Standards are necessary to enforce and provide reproducible software and to provide a level of quality in a team environment, ease of readability and consistency.

If you were going to create a MySQL Naming Standard you have to make a number of key decisions. Generally there is no true right or wrong, however my goals tend towards readability and simplicity. In 2 decades of database design I’ve actually changed my preference between some of these points.

1. Pluralism

Option 1
All database objects are defined in the logical form, that being singular.

For example: box, customer, person, category, user, order, order_line product, post, post_category

Option 2

For database tables & views, objects are defined in plural. For columns, objects are defined singular.

For example: boxes, customers, people, categories, users, orders, order_lines, products, posts, post_categories

Issues
Inconsistency between table name and column name, when using plural. Column names simply are not plural.
When the plural of the name is a completely different spelled word. For example a table of People, and a primary key of PersonId.
When the plural rule is not adding ‘s’, for example replacing ‘y’ with ‘ies’ as with Category.
Strict rule necessary for relationship and intersection tables. Generally, only the last portion is plural.
What about other objects, such as stored procedures for example.

2. Case Sensitivity

Option 1
All database objects should be specified as lowercase only. Words are separated with ‘underscores’.

For example: customer, customer_history, order, order_line, product, product_price, product_price_history

Option 2
All database objects use CamelCase. Words are separated via Case.

For example: Customer, CustomerHistory, Order, OrderLine, Product, ProductPrice, ProductPriceHistory

Issues
Some database products have restrictions here, Oracle for example, UPPERCASES all objects. MySQL allows for both, except for some Operating Systems that does not support Mixed Case properly (e.g. Microsoft)

3.Key names

In a related post I will be discussing natural and surrogate keys. For this purpose, we will assume you are using surrogate keys.

All keys will have a standard name. For Example: id, key, sgn (system global number)

When referencing primary keys and foreign keys, what standard is used?

Option 1
The primary key is the same name across all tables, making it easy to know the primary key of a table.
For foreign keys, the name is prefixed with the table name or appropriate table alias.

For Example: id. The foreign key would be customer_id, order_id, product_id

Option 2
The primary key is defined as unique across the system, and as such the foreign key is the same as the primary key.

For Example: customer_id, order_id, product_id

Issues
When self referencing columns to a table, having a standard is also appropriate, for example parent_product_id.

4. Specific Object Names

Option 1
For the given type of object, have a standard prefix or suffix.

For Example, all tables are prefixed with tbl, all views are suffixed with _view, all columns for a given table are prefixed with table alias.

Option 2
Don’t prefix or suffix an object with it’s object type.

5. Reserved Words

Option 1
Don’t use them. When a word is reserved, find a more description name.

E.g. system_user, order_date,

Option 2
Allow them.

For Example: user, date, group, order

Issues
A number of database systems do not allow the use of reserved words, or historically have not. Sum such as MySQL for example, allow reserved words, but only when additional quoting is used.

6. Abbreviations

This indeed could be a entire topic it’s self. In simplicity, do you use abbreviations other then the most common and everybody knows abbreviations (to the point you don’t know what the abbreviation actually is in real life type abbreviations), or do you not.

I use the example of ‘url’. How many people actually know what url stands for. This is a common abbreviation.

Option 1
For objects, use abbreviations when possible. Don’t use it for key tables, but for child and intersection tables.
For Example: invoice, inv_detail, inv_id,

Option 2
Avoid abbreviations at all costs.
invoice, invoice_detail, invoice_id,

Issues
Unless you have a large schema (e.g. > 500 tables) the use of abbreviations should not be needed given the relative sizes of objects in modern databases.

My Recommendations

There are a lot more considerations then these few examples for naming standards, however as with every design, it’s important to make a start and work towards continual improvement.

  1. All objects are singular (very adamant, people/person, category/categories, sheep/sheep for tables, but not columns – simplicity wins as English is complex for plurals).
  2. All objects are lowercase and use underscore ‘_’ (I really like CamelCase for readability, but for consistency and simplicity, unfortunately lowercase is easier).
  3. All primary key’s are defined as unique across the system.
  4. Don’t use prefixes/suffixes to identity object types.
  5. Never use reserved words.
  6. Don’t use abbreviations except for the most obvious.

At the end of the day, I will work with what standard is in place. What I won’t work with is, when there is no documented, accountable standard.

Websites in review – Week 1

I often come across new websites, quite often by accident, or by indirection in links from looking at other details. The Internet is an amazing place, and one could spend all day reading such a variety information and only touch on just a few specific topics.

I think it’s important to share interesting and new sites, often it’s a referral from others that provide for enjoyable and useful reading. Here are mine in return. Sites which I bookmarked and intended to review again.

99 Designs — www.99designs.com

Need something designed? 99designs connects clients needing design work such as logo designs, business cards or web sites to a thriving community of 17,781 talented designers.

Wufoo — www.wufoo.com

Wufoo strives to be the easiest way to collect information over the Internet.

Our HTML form builder helps you create contact forms, online surveys, and invitations so you can collect the data, registrations and online payments you need without writing a single line of code.

Mako Templates — www.makotemplates.org

Mako is a template library written in Python. It provides a familiar, non-XML syntax which compiles into Python modules for maximum performance. Mako’s syntax and API borrows from the best ideas of many others, including Django templates, Cheetah, Myghty, and Genshi. Conceptually, Mako is an embedded Python (i.e. Python Server Page) language, which refines the familiar ideas of componentized layout and inheritance to produce one of the most straightforward and flexible models available, while also maintaining close ties to Python calling and scoping semantics.

Venture Beat — www.venturebeat.com

I came across this site when reading Developer Analytics: Facebook game Mob Wars making $22,000 a day.

VentureBeat’s mission is to provide news and information about private companies and the venture capital that fuels them. Founder Matt Marshall covered venture capital for the San Jose Mercury News until he left in Sept. 2006 to launch VentureBeat as an independent company. VentureBeat will focus initially on Silicon Valley, and gradually expand to cover innovation hubs around the globe. Its mission in each region will be the same: to provide insider news and data about the entrepreneurial and venture community that is useful to decision makers.

Developer Analytics — www.developeranalytics.com

The previous article referenced this site, with the tagline “the world’s first social media ratings and reporting services.” I like the categories used in reports.

  • Reach – Unique traffic.
  • Audience Profile – Demographics and sociographics breakdown. Interest clouds.
  • Engagement -Impressions. Average page views per user. Return users.
  • Growth -New Users. Churn Rates. Viral Factors.
  • Monetization – CPM and CPA revenue potential.

SXSW Interactive — www.sxsw.com/interactive

I’ve heard of South By South West Conference before, and this is it. SXSW Interactive: March 13-17, 2009 -The Brightest Minds in Emerging Technology

SXSW Interactive The SXSW Interactive Festival features five days of exciting panel content and amazing parties. Attracting digital creatives as well as visionary technology entrepreneurs, the event celebrates the best minds and the brightest personalities of emerging technology. Whether you are a hard-core geek, a dedicated content creator, a new media entrepreneur, or just someone who likes being around an extremely creative community, SXSW Interactive is for you!

Actually sounds rather interesting, and different.

Pulse 2.0 — www.pulse2.com

Pulse 2.0 is a company that is driven by our passion of technology and entrepreneurship. We use Pulse 2.0 to share our thoughts on Web 2.0.

Monetization through Online Advertising

Next week I will be the panel moderator for September 2008 Entrepreneurs Forum on “Monetization through Online Advertising” organized by Ultra Light Startups™. A slightly different approach to my regular speaking schedule, it will be good to observe and interact with our speakers.

The companies represented by the panelists for the evening include:

  • PerformLine – Alex Baydin, Founder and CEO
  • SocialDough – Derek Lee, Founder and CEO
  • Zanox Inc – Rani Nagpal, Vice President of Affiliate Management
  • Blinkx – Max Ramirez, Head of online ad sales

Event Details

Date: Thursday, September 4, 2008
Time: 7:00pm – 9:00pm
Location: Rose Technology Ventures, LLC
Street: 30 East 23rd Street
City/Town: New York, NY

An intestesting approach to free hosting

I came across the OStatic Free hosting service that provide Solaris + Glassfish (Java Container) + MySQL.

They offer “Now you can get free Web hosting on Cloud Computing environment free of charge for up to 12 months.

The catch “accumulate 400 Points for participating on the site“.

A rather novel approach, you get 100 points for registering, but then only 5-15 points per task. The equates to approximately at least 30 tasks you need to perform, such as answering a question, or reviewing somebodies application. That seems like a lot of work?

In this case, Free has definitely a cost and time factor to consider.

VirtualBox, compiling Part 2

So I managed to find all dependencies after some trial and error for compiling VirtualBox 1.6.4 under Ubuntu 8.0.4, then finding the Linux build instructions to confirm.

It was not successful however in building, throwing the following error:

kBuild: Compiling dyngen - dyngen.c
kBuild: Linking dyngen
kmk[2]: Leaving directory `/usr/local/VirtualBox-1.6.4/src/recompiler'
kmk[2]: Entering directory `/usr/local/VirtualBox-1.6.4/src/apps'
kmk[2]: pass_bldprogs: No such file or directory
kmk[2]: *** No rule to make target `pass_bldprogs'. Stop.
kmk[2]: Leaving directory `/usr/local/VirtualBox-1.6.4/src/apps'
kmk[1]: *** [pass_bldprogs_before] Error 2
kmk[1]: Leaving directory `/usr/local/Virtu

More searching, I needed to add two more files manually. Read More Here.

A long wait, compiling for 20+ minutes, and a necessary reboot as upgraded images threw another error, I got 1.6.4 running, and able to boot Fedora Core 9 image created under 1.5.6

But the real test, and the need for this version was to install Intrepid.

This also failed with a Kernel panic during boot. More info to see this reported as a Ubuntu Bug and Virtual Box Bug.

More work still needed.

Interacting with BuildBot using IRC

Using BuildBot for Drizzle has been a great way to help in the verification of the sometimes rapid code changes that are being committed.

Curious why the IRC notifier within BuildBot only joined and exited the #drizzle channel in IRC, some further investigation of the IRC Documentation lead to more information to share.

By default, the following configuration is not much help in any automated notification.

from buildbot.status import words
c['status'].append(words.IRC(host="irc.freenode.net", nick="drizzle_buildbot", channels=["#drizzle"]))

However, within IRC you can query using several commands. My first trials.

rbradfor: drizzle_buildbot: list builders
[3:10pm] drizzle_buildbot: Configured builders: centos5.64.1 centos5.64.1-mt debian4.32.1[offline] debian5.32.1 debian5.32.2 debian5.64.1 doxygen fedora8.32.1[offline] fedora8.64.1 gentoo8.32.1 gentoo8.64.1 osx105.32.1 osx105.32.1-mt osx105.64.1[offline] osx105.64.1-mt[offline] suse11.32.1[offline] ubuntu804.32.1[offline] ubuntu804.32.2[offline] ubuntu804.32.3[offline] ubuntu804.32.4 ubuntu804.32.4-mt ubuntu804.32.5 ubuntu804.32.6[offline] ubuntu804.32.7[offline] ubuntu804
[3:10pm] rbradfor: drizzle_buildbot: status all
[3:10pm] drizzle_buildbot left the chat room. (Excess Flood)
[3:11pm] drizzle_buildbot joined the chat room.
[3:11pm] rbradfor: drizzle_buildbot: notify on
[3:11pm] drizzle_buildbot: The following events are being notified: ['started', 'finished']
[3:13pm] drizzle_buildbot: build #484 of centos5.64.1 started including []
[3:18pm] drizzle_buildbot: build #484 of centos5.64.1 is complete: Success [build successful]  Build details are at http://drizzlebuild.42sql.com/builders/centos5.64.1/builds/484
[3:25pm] rbradfor: drizzle_buildbot: notify off
[3:25pm] drizzle_buildbot: The following events are being notified: []
[3:26pm] rbradfor: drizzle_buildbot: watch centos5.64.1
[3:26pm] drizzle_buildbot: there are no builds currently running
[3:34pm] rbradfor: drizzle_buildbot: notify on failed
[3:34pm] drizzle_buildbot: The following events are being notified: ['failed']
[4:09pm] rbradfor: drizzle_buildbot: help
[4:09pm] drizzle_buildbot: Get help on what? (try 'help foo', or 'commands' for a command list)
[4:09pm] rbradfor: drizzle_buildbot: help commands
[4:09pm] drizzle_buildbot: Usage: commands - List available commands
[4:09pm] rbradfor: drizzle_buildbot: commands
[4:09pm] drizzle_buildbot: buildbot commands: commands, dance, destroy, excited, force, hello, help, join, last, leave, list, notify, source, status, stop, version, watch

The docs list the following commands.

To use the service, you address messages at the buildbot, either normally (botnickname: status) or with private messages (/msg botnickname status). The buildbot will respond in kind.

Some of the commands currently available:

list builders
    Emit a list of all configured builders
status BUILDER
    Announce the status of a specific Builder: what it is doing right now.
status all
    Announce the status of all Builders
watch BUILDER
    If the given Builder is currently running, wait until the Build is finished and then announce the results.
last BUILDER
    Return the results of the last build to run on the given Builder.
join CHANNEL
    Join the given IRC channel
leave CHANNEL
    Leave the given IRC channel
notify on|off|list EVENT
    Report events relating to builds. If the command is issued as a private message, then the report will be sent back as a private message to the user who issued the command. Otherwise, the report will be sent to the channel. Available events to be notified are:

    started
        A build has started
    finished
        A build has finished
    success
        A build finished successfully
    failed
        A build failed
    exception
        A build generated and exception
    successToFailure
        The previous build was successful, but this one failed
    failureToSuccess
        The previous build failed, but this one was successful


help COMMAND
    Describe a command. Use help commands to get a list of known commands.
source
    Announce the URL of the Buildbot's home page.
version
    Announce the version of this Buildbot.

If the allowForce=True option was used, some addtional commands will be available:

force build BUILDER REASON
    Tell the given Builder to start a build of the latest code. The user requesting the build and REASON are recorded in the Build status. The buildbot will announce the build's status when it finishes.
stop build BUILDER REASON
    Terminate any running build in the given Builder. REASON will be added to the build status to explain why it was stopped. You might use this if you committed a bug, corrected it right away, and don't want to wait for the first build (which is destined to fail) to complete before starting the second (hopefully fixed) build.

I don’ want to flood the IRC channel with messages, so delving deeper into the documentation via the following commands gives me more tips.

$ cd buildbot-0.7.8
$ pydoc buildbot.status.words

By defining categories against the IRC notification, and assigning builders to a given category, in theory you will get notifications just for these builders. I didn’t seem to produce the desired results, so for now it needs to be manual interaction until I get additional time to investigate.

b00 = {'name': "centos5.64.1", 'slavename': "centos5_64", 'builddir': "build00", 'factory': f1, 'category': "irc" }
...
from buildbot.status import words
c['status'].append(words.IRC(host="irc.freenode.net", nick="drizzle_buildbot", channels=["#drizzle"], categories=["irc"]))

Virtual Box, a world of hurt

I successfully installed Virtual box via a few simply apt-get commands under Ubuntu 8.04 via these instructions.

It started fine, after two small annoying, install this module, add this group messages. I was even able to install Ubuntu Intrepid from .iso. But from here it was down hill.

Attempting to start VM gives the error.

This kernel requires the following features not present on the CPU:
pae
Unable to boot - please use a kernel appropriate for the CPU

Some digging around, and confirmation that the current packaged version of Virtual Box doesn’t support PAE. You think they could tell you before successfully installing an OS. I’m running 1.5.6, I need 1.6.x

$ dpkg -l | grep virtualbox
ii  virtualbox-ose                             1.5.6-dfsg-6ubuntu1                      x86 virtualization solution - binaries
ii  virtualbox-ose-modules-2.6.24-19-generic   24.0.4                                   virtualbox-ose module for linux-image-2.6.24
ii  virtualbox-ose-source                      1.5.6-dfsg-6ubuntu1                      x86 virtualization solution - kernel module

Off to the Virtual Box Downloads to get 1.6.4
Don’t make the same mistake as I did and use the first download link, that’s the commercial version that doesn’t install what you expect, you need the OSE. Of course this is not packaged, it’s only source.

  ./configure
Checking for environment: Determined build machine: linux.x86, target machine: linux.x86, OK.
Checking for kBuild: found, OK.
Checking for gcc: found version 4.2.3, OK.
Checking for as86:
  ** as86 (variable AS86) not found!

Ok, well I go through this step like 4 times, installing one package at a time, I wish they could do a pre-check and give you all missing requirements. I installed bin86, bcc, iasl.

Then I got to the following error.

$ ./configure
...
Checking for libxml2:
  ** not found!

Well it’s installed, all too hard. Throw Virtual Box away for virtualization software. And why am I using it anyway. Because VMWare Server doesn’t work under Ubuntu 8.04 either because of some ancient gcc dependency. Sees I may have to go back to that. I just want a working virtualization people on the most popular Linux distro to install other current distros. It’s not a difficult request.

$ dpkg -l | grep libxml
ii  libxml-parser-perl                         2.34-4.3                                 Perl module for parsing XML files
ii  libxml-twig-perl                           1:3.32-1                                 Perl module for processing huge XML document
ii  libxml2                                    2.6.31.dfsg-2ubuntu1                     GNOME XML library
ii  libxml2-utils                              2.6.31.dfsg-2ubuntu1                     XML utilities
ii  python-libxml2                             2.6.31.dfsg-2ubuntu1                     Python bindings for the GNOME XML library

Project Darkstar

It may sound like either a astronomical research project or a Star Wars spin- off, but Project Darkstar is an open source infrastructure from Sun Microsystems that states “simplify the development and operation of massively scalable online games, virtual worlds, and social networking applications.”

The advertising sounds promising like many sites, the emphasis seems to be on gaming throughout the material, interesting they threw in the term “social networking applications” specifically in opening descriptions.

I believe worthy of investigation, if only to see how that solve some classic problems. So, Learn some more, Start your rockets and Participate.

MySQL involvement in OSCON opening keynote

Before I get to post my OSCON reflection I see I didn’t post this (which I reference).

At OSCON opening keynotes Tim O’Reilly Interviews Monty Widenius & Brian Aker. This provided some interesting answers in a Q & A session. Here is some of the discussion.

TO: So 6 months in. How is it with Sun?
BA: Really rewarding environment. My first question was? You are going to send me free H/W. No H/W has been delivered yet, or access to the masses, still hoping. Sun is a very engineering driven company.
MW. Thanks God we didn’t go public. Starting to do closed sourced components, going public this would have continued.

TO: Sun saved MySQL from public market/ insulated from market.
MW: 6 months in, Sun still trying to figure out what they bought. Sun has made a commitment to open source throughout the organization. Engineers who have been working in closed environments, now seeing this all in public, and opens yourself up to more inputs and exposure.

TO: You have your own projects within sun, how does that affect with the main line of development of MySQL, Monty you with the Maria Storage Engine, Brian you with Drizzle.

TO: What is the Support like in Sun?
BA: My boss got it. We are looking at going after different market area, niche and ecosystem. There is certain direction the main codebase is heading such as enterprise features, oracle like replacements. There is a core set of environments what they aren’t needed. Additional new requirements like a proximity data storage, historically Postgres has been good for this type of GIS data. This is a new type of data store. location/time and proximity of objects.
Sun has given us more free hands to work for best features of MySQL. For Drizzle, to strip it down into more components architecture and extensibility. It’s a micro-kernel there will be an interface for large parts of the code.

TO: What do you think about Google?
BA: Happy opening up more of their data, and trying to turn the world into their own 20% of time project.

TO: What do you think about Amazon?
BA: Interesting position, secretive company. At the beginning how little anybody though of Amazon in a service marketplace.

TO : What do you think about Microsoft?
MW: less and less things are good.
BA: Irrelevant.

TO: What do you think about Apple?
NW: More afraid of Apple then Microsoft
BA: Really want an iPhone, but hoping Google will get Android out and it works.

TO: What are the cool things MySQL can do on the Sun field, and reverse?
BA: Both Sun and MySQL Engineers thought about open source differently. MySQL has created a set of steps of evolution, e.g. employees contributing to open source projects. MySQL’s DNA was very small, it’s interesting how fast this is influencing Sun’s approach
MW: MySQL has become to management driven in previous years, Sun has enabled us to get back to our roots.

Drizzle needs you


Use MySQL, but want to follow the new kid on the block?
Want to help contribute to Drizzle?

We are seeking help in compiling across different platforms.
Please help us by becoming a buildbot slave.

There are detailed instructions, so now is the time to take a few minutes and help out the project.

The Drizzle Buildbot is hosted and supported by 42SQL.

Installing Buildbot

BuildBot is a system to automate the compile/test cycle required by most software projects to validate code changes.

Here is my environment.

$ uname -a
Linux app.example.com 2.6.18-53.el5 #1 SMP Mon Nov 12 02:14:55 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
$ python
Python 2.4.3 (#1, May 24 2008, 13:57:05)

Here is what I did to get it installed successfully.

CentOS

$ yum install python-devel
$ yum install zope

Ubuntu

$ apt-get install python-dev
$ apt-get install python-zopeinterface
$ cd /tmp
# installation of Twisted
$ wget http://tmrc.mit.edu/mirror/twisted/Twisted/8.1/Twisted-8.1.0.tar.bz2
$ bunzip2 Twisted-8.1.0.tar.bz2
$ tar xvf Twisted-8.1.0.tar
$ cd Twisted-8.1.0
$ sudo python setup.py install
# installation of BuildBot
$ cd /tmp
$ wget http://downloads.sourceforge.net/buildbot/buildbot-0.7.8.tar.gz
$ tar xvfz buildbot-0.7.8.tar.gz
$ cd buildbot-0.7.8
$ sudo python setup.py install


And a confirmation.
$ buildbot --version
Buildbot version: 0.7.8
Twisted version: 8.1.0

You will notice a few dependencies. I found these out from the following errors.

Error causing needing python-devel

$ python setup.py install
....
gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtun
e=generic -D_GNU_SOURCE -fPIC -fPIC -I/usr/include/python2.4 -c conftest.c -o conftest.o
building 'twisted.runner.portmap' extension
creating build/temp.linux-x86_64-2.4
creating build/temp.linux-x86_64-2.4/twisted
creating build/temp.linux-x86_64-2.4/twisted/runner
gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtun
e=generic -D_GNU_SOURCE -fPIC -fPIC -I/usr/include/python2.4 -c twisted/runner/portmap.c -o build/temp.linux-x86_64-2.4/twisted/runner/portmap.o
twisted/runner/portmap.c:10:20: error: Python.h: No such file or directory
twisted/runner/portmap.c:14: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
twisted/runner/portmap.c:31: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘*’ token
twisted/runner/portmap.c:45: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘PortmapMethods’

Error causing zope to be installed

$ buildbot start /home/buildbot/master/
Traceback (most recent call last):
  File "/usr/bin/buildbot", line 4, in ?
    runner.run()
  File "/usr/lib/python2.4/site-packages/buildbot/scripts/runner.py", line 939, in run
    start(so)
  File "/usr/lib/python2.4/site-packages/buildbot/scripts/startup.py", line 85, in start
    rc = Follower().follow()
  File "/usr/lib/python2.4/site-packages/buildbot/scripts/startup.py", line 6, in follow
    from twisted.internet import reactor
  File "/usr/lib64/python2.4/site-packages/twisted/internet/reactor.py", line 11, in ?
    from twisted.internet import selectreactor
  File "/usr/lib64/python2.4/site-packages/twisted/internet/selectreactor.py", line 17, in ?
    from zope.interface import implements
ImportError: No module named zope.interface

Installation was the easy part, configuration a little more complex.

Working with Google App Engine

Yesterday I took a more serious look at Google App Engine, I got a developer account some weeks ago.

After going though the getting started demo some time ago, I chose an idea for a FaceBook Application and started in true eXtreme Programming (XP) style (i.e. What’s the bare minimum required for first iteration). I taught myself some Python and within just a few minutes had some working data being randomly generated totally within the development SDK environment On my MacBook. I was not able to deploy initially via the big blue deploy button, the catch is you have to register the application manually online.

Then it all worked, and hey presto I’ve got my application up at provided domain hosting at appspot.com

Having coming from a truly relational environment, most notably MySQL of recent years I found the Datastore API different in a number of ways.

  • There is no means of Sequences/Auto Increment. There is an internal Unique Key, but it’s a String, not an integer, not enabling me to re-use it.
  • The ListProperty enables the use of Lists in Python (like Arrays) to be easily stored.
  • The ReferenceProperty is used as a foreign key relationship, and then can be more reference within an object hierarchy
  • I really missed an interactive interface. You have no abililty to look at your data, specifically for me I wanted to seek some data, then I wanted to delete some data, but I had to do all this via code.

Having developed a skelaton FaceBook application before in PHP, I figured a Python version would not be that much more work, but here is where I good stumped Information at Hosting a Facebook Application on Google AppEngine leveraging the PyFacebook project didn’t enable me to integrate Google App Engine with FaceBook just yet.

This had me thinking I need to resort to a standalone simply Python Facebook application to confirm the PyFacebook usage. Now my problems started. Under Mac it’s a lot more complex to install and configure Python/Django etc then under Linux. I tried to do it on my dedicated server, but drat Python is at 2.3.4, and it seems 2.5.x is needed.

Still it was a valuable exercise, I dropped the FaceBook goal and just worked on more Google App Engine stuff. Still early days, but it was productive to try out this new technology.

What I need to work on now is how to hold state within Python infrastructure so I can manage a user login and storing and retrieving user data for my sample app.

Beyond Blogs

I was reading today in a printed magazine Business Week the article Beyond Blogs. It’s unusual these days to actually read on paper what we can find on our online world.

What’s interesting is the printed article did actually contain content I didn’t find online. There was a section called “We Didn’t See ‘em coming”, and it’s finally important site mentioned was iTunes. I found the following comment extremely relevant. “…. But we didn’t guess it would become the leading destination for podcast downloads. Contrary to our expectations, podcasts have evolved into a feature of traditional radio, not a rival to it.”

It’s important that with any business model you know, understand and review consistently your competitors. I find many organizations that don’t do this. You need to know your competitor. But as mentioned with iTunes, the designers of podcasts could have easily considered radio to be a competitor initially. One must always evaluate the changing times regularly.

The following are three more quotes of interest.

But in the helter-skelter of the blogosphere, we wrote, something important was taking place: In the 10 minutes it took to set up a blogging account, anyone with an Internet connection could become a global publisher. Some could become stars and gain power.

Like the LAMP stack has done for websites, the cost to entry now to get exposure is very low. The problem is now too much content exists to review, compare and evaluate effectively.

Turned out it wasn’t quite that simple. The magazine article, archived on our Web site, kept attracting readers and blog links. A few professors worked it into their curricula, sending class after class of students to the story. With all this activity, the piece gained high-octane Google juice.“.

I’d not heard of Google juice before.

In relation to Linked In, FaceBook and MySpace, “While only a small slice of the population wants to blog, a far larger swath of humanity is eager to make friends and contacts, to exchange pictures and music, to share activities and ideas. These social connectors are changing the dynamics of companies around the world. Millions of us are now hanging out on the Internet with customers, befriending rivals,…“.

Microsoft, Yahoo and Open Source

There has been plenty of press this week regarding Microsoft making a bid for Yahoo. This week the Wall Street Journal Article From Uncertain Future To Leading Yahoo Bid has prompted me to the following observations. I quote several points:

The bid, he said on the call, is “the next major milestone in Microsoft’s companywide transformation” to incorporate online services.

as Microsoft pushes the bid and, if successful, tries to meld Yahoo with Microsoft.

Microsoft had been negotiating to buy online ad company DoubleClick Inc. but lost that deal to Google, which paid $3.1 billion. Microsoft in May countered, spending $6 billion on online ad company aQuantive Inc.

While Microsoft should continue investing in its own online services, it needed to speed things up through acquisitions.

Once a company had a critical mass of buyers and sellers on its online-ad system, it could hold sway over much of the industry. In computers, Microsoft achieved that position with its Windows operating system. But on the Internet, Google was quickly taking on that role.


The Alexa Ratings has Yahoo as the number 1 real-estate property, outstripping Google. What’s important to realize that Yahoo along with many top traffic websites not only use Open Source, but their business is run on Open Source. At the database, there is MySQL powering Yahoo, Facebook, Wikipedia, YouTube, Fotolog and Flickr for example. Google also uses MySQL within critical components (not the search engine).

One can only hope that if such a bid is successful, much like the Sun acquisition of MySQL , that strong components of the Open Source ideal infects the much larger host.

I was thinking of taking this popular Tux & Microsoft Office image and badging Tux with a Yahoo logo, or perhaps he needs to be planting a big neon sign in the center.

NY Tech Meetup – Idea Virus

On more thing that came from the NY Tech Meetup last night was the Idea Virus. It was handed out on a piece of paper. Here is what it said.

Idea Virus

  1. Think of a novel or awesome idea
  2. Write it on the back of your business card
  3. Spread the viral idea to somebody else
  4. Duplicate and re-propagate ideas that you like

When you go home, leave your accumulated ideas in the Halloween bucket. Check out idea-virus.org to see the results.

So I goto the site, and all I get is a Not quite ready yet… message page. They should have a least put up what I just typed out, and maybe some more “Coming Soon”, or submit to a email or something. 15 mins of work, rather then 2 mins of work would not leave people like me going “Well why bother!”

NY Tech Meetup

Tonight I headed to the NY Tech Meetup organized by the CEO of Meetup and co-founder of Fotolog, the company my friend Frank works for.

This forum provided for quick presentations by new NY high tech ventures and other interesting discussions, then enabling further networking between people.

A Perfect Thing


The first speaker was Steven Levy, mentioned on the site as Newsweek’s tech editor & all-around geek writer extraordinaire. He is the author of “The Perfect Thing”, a story of the Apple iPod. He shared a funny story of a dinner where he was seated with Bill Gates at a Microsoft XP launch in late 2001, in which he had just that week got his initial iPod following the launch. When he gave it to Bill Gates, he observed as he described this mind meld as a votex between Bill’s brain and the iPod while he checked it out, exploring all the menu options, buttons and options. 45 seconds later came the comment of something like, looks great, and it works with a Macintosh.

Urbis

Our second speaker was Steve Spurgat from www.urbis.com. The blub. Urbis is a creative community with three types of users: creative people, those who love and support creative people, and those who have opportunities for creative people. It’s very creative.. Some of the interesting features of this site included:

  • Can pre-define the people that can review your submissions, by various criteria, meaning that your feedback can be restricted.
  • You can specify your specific goals for your submission.
  • You can select the present opportunities for your submission.
  • There is an economy system to see reviews of your own work you must review others

Presently only writing is available, but plans for Music, Art and Film will be available in the next few months. With some 12,000+ members and 13% active, it’s a good start.

There was also discussion of copyright, Urbis being a registered copyright agent complying with government guidelines, and of revenue models including the option for fees from publishers, and the potential of ad copy. A competitor site Trigger Street was also mentioned, started by Kevin Spacey.

One Web 2.0 thing I liked about this site, and the next was that the website was the presentation (no powerpoint), and while talking the home page of the website was displayed and the content was dynamically changing, in this case, reviews being submitted online. A good selling point.

LinkStorms

Scott Kolber of LinkStorms was our next presenter. Described as the next generation of links for the web providing context specific fast links and specific navigation from a button, images, banner ad.

The revenue model is CPM plus a publishers setup, maintenance and support fee structure. Apparently up to 40% click thru rate, which is extraordinary compared to the current stats of < 1% for general banners.

When asked what was different with this model, the answer was "the results. It's a better user experience looking at ads".

You can see it in action at Premiere Magazine – The Departed.

CogMap

Brent Halliburton and his approach to a wikipedia of Organization charts with CogMap certainly got the best response the crowd. A good comedian, Brent made the mistake with a slow Internet connection to demonstrate interactively with an example from the audience and not his own prepared content. It ended up not rendering, then crashing but he managed to turn it around into a plus and the best applause of the night.

His idea provoked a wide range of comment and feedback and when asked why? “Because if your an entrepreneur you do things”. “In the big scheme of things I don’t have all the answers. I just put it out there.”

uPlayMe

David Fishman provided the last presentation of uPlayMe, a Windoze program that provides a slant on the community social networking via enternaintment, specifically when they are actually playing via Windows Media Player for example. It’s designed to help people discover other people with the same interests, or weird interests. Some other sites mentioned in the discussion included Last.fm, Pandora and MOG.

2007 Predictions

We ended with an audience participated 2007 predications. The included:

  • No Predication – (The first person from the Board of Advisors I believe that was specifically asked)
  • IP TV market and integration with the TV
  • Will see a Billion $ organization from the NY community
  • The buzz of radios that can do multiple gigibits of transfer between neighours (yes it sounded weird)
  • Era of the connected home, Computer, TV, Stereo
  • Some political thing at change.net
  • Another political thing, an organic style camp debrief
  • The Term 2.0 will cease being used in 2007
  • Skype will be a source of major innovation
  • NY will produce a billion dollar Internet company

The Hobbyist and the Professional

I first coined this term in February 2006 in a paper titled “Overcoming the Challenges of Establishing Service and Support Channels” for the conference “Implementing Open Source for Optimal Business Performance” View Paper.

I must have referenced this several times, time to give this topic it’s due notice. In summary it targets three areas:

  • the lack of appropriate database design in open source projects
  • the lack of coding standards
  • the lack of sound programming principles

Here are the bullet points from the slide in a presentation.

Hobbyist

  • Download-able software and examples
  • Online tutorials
  • Books like Learn in 24 hours/For Dummies

Professional

  • Formal Qualifications
  • Grounding in sound programming practices
  • Understanding of SDLC principles
  • Worked in team environment

Middle Ground Developer

  • Time to skill verses output productivity
  • Depends on environment and requirements

Stories that impress and motivate you

I’ve worked for two Internet startup companies, both around 2 years each, both now long dead. The first was due to eventual lack of new VC funds, the second gross financial managment in the second year (apparently, when we were told there was no money December one year to pay us, the company that made large profits every month for over the first year, then had made losses every month for the past 12 months, but nobody knew about it. There were 5 Directors from 3 countries and nobody knew. Yeah Right!)


I’ve learnt a lot of non IT street smarts in this time. The first startup took the VC route, and after 3 rounds while I wasn’t involved in the process you pick up things. The single biggest tip here is the Bell-Mason Diagnostic. Here a few introduction references worthy of a quick review (One, Two).

When you take any great idea, and then consider the 4 quadrants and 12 axis you realise you really need to make larger circles of professional contacts.

Zac’s article Valley Boys Run MySQL talks about the new breed of Web entrepreneurs and Web 2.0. In particular check out
Valley Boys Digg.com’s Kevin Rose leads a new brat pack of young entrepreneurs
. This is the new wave of success that works without VC.

For me this recent post on the Meebo Blog really impressed me.

365 days ago

Nineteen releases, eleven (fantastic) team members, and 295,321 gummy bears later we see over a million users log into meebo each day. As a coder, all you hope is that your service will be able help people go about their day-to-day lives. Stability, bugs, and good usability are always top of mind.

1,000,000 users of your Web 2.0 application. This impresses and motivates me. What makes it possible to anybody, is you can get a LAMP stack, a live-cd, cheap hosting and you can turn your idea into something real for next to no cost. Of course, I won’t start on the nightmares out there of great ideas that are very poorly designed. At least the underlying stack can support anything you want to achieve, and MySQL is behind these success stories.

One more thing on Meebo, Check out the meebo map!. I’ve been told that Google has something of a similar nature at the Googleplex. Well I can say I’m very keen to see this, and will be 8 weeks time when the First MySQL Camp is held.



Handling Error Levels in Logging

In reviewing some provided code to a client, I observed a number of actions contray to generally accepted practices regarding logging. This is what I provided as the general programming conventions with regardings to logging.

Using Log4J (http://logging.apache.org/log4j/docs/), which is generally accepted as the benchmark for all Java applications. This provides the following logging levels.

  • FATAL
  • ERROR
  • WARN
  • INFO
  • DEBUG
  • TRACE – from 1.2.12, latest is 1.2.13

A description for what handling should occur per logging level.

FATAL. As the name suggests, all processing should stop. Should logging include a FATAL, the process is a Failure.

ERROR. An error has occured, and this requires attention, and action. Generally processing should stop, however additional post processing, or an alternative path could occur. Should logging include an ERROR, the process is a Failure.

WARN. Something that is unexpected occured, however it doesn’t affect the general processing from succeeding successfully. If a process includes WARN and not FATAL/ERROR it should be considered successful.

INFO. Information Only. On high volume systems, this level of logging may even be turned off. This generally indicates key information values or steps, and can assist when enabled in longer running processes to identify where a process is. You don’t log errors at INFO.

DEBUG. For Debugging Purposes only.

Ok, well that sounds like common sense. Here is what I observed on a client site.

  • Code logs a FATAL, but continues processing
  • A FATAL is logged, yet the calling process reports success
  • An ERROR is logged, yet the calling process reports success.
  • A lot of WARN are logged, and this is misleading, as it appears more information regarding XML elements not processed (We are talking 20+ Warnings per batch process). From what I’ve observed, these don’t require futher action, and should be changed in INFO.
  • Errors are being logged as INFO. A NullPointer RunTime Exception is logged as INFO. If an error provides an Exception argument where a stack trace is printed, it ain’t an INFO message.

Microfox ?

I’ve added Digg to my general lunch time reading web sites. I came across this yesterday. Microsoft invites Firefox development team to Redmond.

Well, isn’t that nice, the big boy opening his pond (including all the sharks) to the little fish. (if you don’t get it, it’s a line from Die Hard 2 about pissing in somebody else’s pond) What’s funny about the article is not the article but some of the comments, here are but a few.

  • Cool. Very cool.
  • they’re going to brainwash them into working for microsoft and their going to cast off the IE team to replace the firefox team
  • Just replace IE 7 with FF 2.0. Save every web developer 100s of hours.
  • “Come into my parlour,” said the spider to the fly
  • DON’T DO IT! My god, has noone seen Braveheart? The Untouchables? The Godfather? Scarface?
  • hey’re going to kidnap the Mozilla developers with a cunning trail of pizza and cola, and force them to work on IE8.
  • ITS A TRAP! http://itsatrap.ytmnd.com/

I like the last one, maybe just because of the cool picture on referenced site.

I see today an official response Firefox welcomes Microsoft’s offer of help

The fast pace of technology in a Web 2.0 world

I had need to goto the Wikipedia this morning to review the terminology of something, and on the front page in Today’s featured article is Mercury. Being a tad curious given I’d heard only on the radio a few hours ago that Pluto was no longer a planet in our Solar System, I drilled down to the bottom to check references to other planets (quicker then searching). So at the bottom I found the following graphic and details of The Solar System Summary.

Well blow me down, they didn’t waste any time there. Pluto is no longer a planet in our Solar System. It is now categorised as a Dwarf Planet.

There are over 100 edits to this page on Dwarf Planet in the past 12 hours, including all the links in Pluto and Solar System correctly referenced.

Now, that’s the power of content by the community, one of the characteristics of Web 2.0.
Of course all of this runs under the LAMP Stack and powered by the MySQL Database. Combined with the fact this is a breed of organisation that didn’t start with large amount of Venture Capital, another trend of the newer generation of popular and successful internet enterprises.

No Spam Today

Huh, tricked you. As if!

However I was looking at my Akismet Spam section in WordPress, the open source software that runs my blog, and it gave me this message.

Caught Spam

You have no spam currently in the queue. Must be your lucky day. :)

Become named in Firefox 2

So, FireFox have come up with a novel idea to promote it’s product. Check out Firefox Day.

The official blurb: Share Firefox with a friend. If your friend downloads Firefox before September 15, you’ll both be immortalized in Firefox 2.

You can even choose how to link your names together on the “Firefox Friends Wall”. Examples like ‘my name’ Informed ‘your name’, or ‘my name’ Empowered ‘your name’, or ‘my name’ Liberated ‘your name’.

Perhaps MySQL can leverage this idea for some what to promote future download!



Compiling MySQL Tutorial 2 – Directly from the source


Should you want to be on the bleeding edge, or in my case, don’t want to download 70MB each day in a daily snapshot (especially when I’m getting build errors), you can use Bit Keeper Free Bit Keeper Client that at least lets you download the MySQL Repository. This client doesn’t allow commits, which is a good thing for those non-gurus in mysql internals (which definitely includes me).

wget http://www.bitmover.com/bk-client.shar
/bin/sh bk-client.shar
cd bk_client-1.1
make

By placing sfioball in your path you can execute.

sfioball bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

This took me about 4 mins, which seemed much quicker then getting a snapshot!

You can then get cracking with my instructions at Compiling MySQL Tutorial 1 – The Baseline.

A good reference in all this compiling is to take a good look at the MySQL Internals Manual. (which I only found out about recently)

If at a later time you want to update your repository to the latest, use the following command.

update bk://mysql.bkbits.net/mysql-5.1 mysql-5.1

Documentation References

5.1 Reference Manual – 2.9. MySQL Installation Using a Source Distribution
5.1 Reference Manual – 2.9.3. Installing from the Development Source Tree
5.1 Reference Manual – 2.13.1.3. Linux Source Distribution Notes
MySQL Internals Manual

Requirements for compiling

To confirm earlier notes on minimum requirements for compiling the following details should be confirmed.

automake --version
autoconf --version
libtool --version
m4 --version
gcc --version
gmake --version
bison -version


My results running Fedora Core 5.

$ automake --version
automake (GNU automake) 1.9.6
$ autoconf --version
autoconf (GNU Autoconf) 2.59
$ libtool --version
ltmain.sh (GNU libtool) 1.5.22 (1.1220.2.365 2005/12/18 22:14:06)
$ m4 --version
GNU M4 1.4.4
$ gcc --version
gcc (GCC) 4.1.1 20060525 (Red Hat 4.1.1-1)
$ gmake --version
GNU Make 3.80
$ bison -version
bison (GNU Bison) 2.1

Compiling MySQL Tutorial 1 – The Baseline

Pre-requisites

This tutorial is aimed at Linux installations that has the standard development tools already installed. The INSTALL file in the source archives provides good details of the software required (e.g. gunzip, gcc 2.95.2+,make).

Create a separate user for this development work.

su -
useradd mysqldev
su - mysqldev

Get an appropiate Snapshot

You can view recent snapshots at http://downloads.mysql.com/snapshots/mysql-5.1/.

The official statement on snapshots from MySQL AB.
Daily snapshot sources are only published, if they compiled successfully (using the BUILD/compile-pentium-debug-max script) and passed the test suite (using make test). If the source tree snapshot fails to compile or does not pass the test suite, the source tarball will not be published.

At the time of producing this the present snapshot was http://downloads.mysql.com/snapshots/mysql-5.1/mysql-5.1.12-beta-nightly-20060801.tar.gz

mkdir -p $HOME/src
cd $HOME/src
SNAPSHOT="mysql-5.1.12-beta-nightly-20060801";export SNAPSHOT
wget http://downloads.mysql.com/snapshots/mysql-5.1/${SNAPSHOT}.tar.gz
tar xfz ${SNAPSHOT}.tar.gz
cd $SNAPSHOT

Compiling

You can for example do the simple way with most C programs:

DEPLOY=${HOME}/deploy; export DEPLOY
./configure --prefix=${DEPLOY}
make
make install

However it’s recommended you use the significant number of pre-configured build scripts found in the BUILD directory. For Example:

./BUILD/compile-pentium-debug --prefix=${DEPLOY}

I always get the following warnings. Not sure what to do about it, but it doesn’t break any future work to date. Surely somebody in the development team knows about them.

../build_win32/include.tcl: no such file or directory
cp: cannot create regular file `../test/logtrack.list': No such file or directory

Testing

The easy, but time consuming part over, let’s test this new beast.

make install

This will deploy to the preconfigured area of $USER/deploy. For multiple versions within this process you should adjust this back in the compile process.

cd $DEPLOY
bin/mysql_install_db
bin/mysqld_safe &
bin/mysql -e "SELECT VERSION()"
bin/mysqladmin -uroot shutdown

Changes

So if you haven’t already sniffed around there are some reasonable changes in the structure. Here are a few points that caught me out:

  • mysql_install_db is now under /bin not /scripts, infact there is no /scripts
  • mysqld is not under /bin but infact under /libexec. A good reason to always used mysqld_safe
  • by default the data directory is /var, normal documentation etc always has this as /data

This is a quick confirm it all works, and I’ve for the purposes of this example ensured that no other MySQL installation is running, and there is no default /etc/my.cnf file. (I always place this in the MySQL installation directory anyway).

Some minonr considerations for improvements with locations, users and multiple instances.

cd $DEPLOY
rm -rf var
bin/mysql_install_db --datadir=${DEPLOY}/data --user=$USER
bin/mysqld_safe --basedir=${DEPLOY} --datadir=${DEPLOY}/data --user=$USER --port=3307 --socket=/tmp/mysqldev.sock &
bin/mysql -P3307 -S/tmp/mysqldev.sock -e "SELECT VERSION()"
bin/mysqladmin -P3307 -S/tmp/mysqldev.sock  -uroot shutdown

Troubleshooting

Now, there is no guarantee that the snapshot is a working one, in the case of mysql-5.1.12-beta-nightly-20060801 in this example, it didn’t work for me? I’ve just reinstalled my OS to Fedora Core 5, and the previous source version (a 5.1.10 snapshot worked ok)
Here are some tips to help out.

The preconfigured BUILD scripts have a nice display option with -n that allows you to see what will happen.

./BUILD/compile-pentium-debug --prefix=${DEPLOY} -n

In my case it gave me this:

testing pentium4 ... ok
gmake -k distclean || true
/bin/rm -rf */.deps/*.P config.cache storage/innobase/config.cache
   storage/bdb/build_unix/config.cache bdb/dist/autom4te.cache autom4te.cache innobase/autom4te.cache;

path=./BUILD
. "./BUILD/autorun.sh"
CC="gcc" CFLAGS="-Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wunused
-DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS  -DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX"
CXX="gcc" CXXFLAGS="-Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wctor-dtor-privacy
-Wnon-virtual-dtor -felide-constructors -fno-exceptions -fno-rtti  -DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INIT_OF_VARS
-DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX" CXXLDFLAGS=""
./configure --prefix=/home/mysqldev/deploy --enable-assembler  --with-extra-charsets=complex  --enable-thread-safe-client
--with-readline  --with-big-tables  --with-debug=full --enable-local-infile
gmake -j 4

In my case this helped with my problem, that is the autorun.sh command is throwing an error in the Innodb configuration.
Part of the compiling is you can turn Innodb off with –without-innodb, but the BUILD scripts don’t accept parameters, so it’s a matter of ignoring the autorun.sh command above, and adding –without-innodb to the configure.

As it turned out, the autorun.sh provided the additional storage engine support.

Details of the problem in this instance.

[mysqldev@lamda mysql-5.1.12-beta-nightly-20060801]$ . "./BUILD/autorun.sh"
Updating Berkeley DB source tree permissions...
../build_win32/include.tcl: no such file or directory
Updating Berkeley DB README file...
Building ../README
autoconf: building aclocal.m4...
autoconf: running autoheader to build config.hin...
autoconf: running autoconf to build configure
Building ../test/logtrack.list
cp: cannot create regular file `../test/logtrack.list': No such file or directory
Building ../build_win32/db.h
configure.in:723: required file `zlib/Makefile.in' not found
Can't execute automake

The good news is today, the MySQL Barcamp was announced, a great place to get some better insite into the gurus of the MySQL internals. At least a place to ask these questions providing they have a beginner group.

My Next Tutorial will take a look at the example provided by Brian Aker at the recent MySQL Users Conference HackFest B: Creating New SHOW Commands.

MySQL Response to Bugs

I’ve read at times people complaining about the response to bugs, and people bag the support of MySQL on the forums at times.

Well today I logged a bug, not the first and I’m sure it’s not the last. See LAST_INSERT_ID() does not return results for a problem in the latest Connector/J 5.0.3 that was released just recently.

Now it took me about 2 hours to log the bug, and probably at least 2 hours of frustration prior to that. The initial frustration 2 hours was unsuccessful debugging of what I was sure was valid code (and it was near midnight last night). The second 2 hours today was testing the problem between two environments, different database versions and different Connector/J versions, and providing a simple reproducable case of said problem.

So the timeline of the Bug in the MySQL Bug Tracking System.

  • [1 Aug 5:40] – Logged Bug
  • [1 Aug 5:58] – Initial feedback, something for me to confirm, bug “in Progress”
  • [1 Aug 6:04] – Confirmation that problem is new JDBC-compliant features, and the issue was a backwards-compatible option
  • [1 Aug 7:14] – Patch submitted to correct the issue
  • [1 Aug 7:28] – My response to initial feedback, as I hadn’t checked my mail for at least an hour, and then I had to go find the JSTL source code and check

So let me summarise that.

For aS3 (Non-critical) Bug, it took 18 mins from logging for MySQL AB to acknowledge and review the problem, 24 mins from logging to get confirmation of compatibility issue, and 94 mins from logging to have a patch submitted that addressed this and probably some more compatibility issues across 7 source files.

Good Job Mark Matthews. Where do I send the comments for the “MySQL AB employee of the month” award.

Mercurial Version Control Software

I got asked (being a Java developer) about what was involved in creating an Eclipse Plugin for Mercurial. Well in true Google style, why invent when somebody probably already has. A quick check finds Mercurial Eclipse by VecTrace.

Now until last week, I’d never heard of Mercurial, so this is really an introduction to somebody that has no idea.

What is Mercurial?

Mercurial is a fast, lightweight Source Control Management system designed for efficient handling of very large distributed projects.

Ok, so big deal, I use CVS. I also use Subversion (SVN) for my Apache contributions, and also for MySQL GUI products. Why do we need another Version Control Product? Mercurial is a Distributed Software Configuration Management Tool. The following is from the Mercurial Wiki

A distributed SCM tool is designed to support a model in which each Repository is loosely coupled to many others. Each Repository contains a complete set of metadata describing one or more projects. These repositories may be located almost anywhere. Individual developers only need access to their own repositories, not to a central one, in order to Commit changes.

Distributed SCMs provide mechanisms for propagating changes between repositories.

Distributed SCMs are in contrast to CentralisedSCMs.

So, clearly a distributed model would well in a large distributed organisation where external factors limit continous access to a single central repository. Low Bandwidth, Poor Internet Connectivity, being on a plane, and travelling are all things that would make a distributed model a more ideal solution. I know I’ve taken my laptop away, and being an “Agile Methodology” developer, I commit often. When you have several days of uncommitted work it goes against the normal operation.

You can get more information at the official website at http://www.selenic.com/mercurial. A few quick links are: Quick Start Guide, Tutorial, Glossary.

Installing Mercurial

su -
cd /src
wget http://www.selenic.com/mercurial/release/mercurial-0.9.tar.gz
tar xvfz mercurial-0.9.tar.gz
cd mercurial-0.9
# NOTE: Requires python 2.3 or better.
python -V
python setup.py install --force

A quick check of the syntax.

$ hg
Mercurial Distributed SCM

basic commands (use "hg help" for the full list or option "-v" for details):

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 init       create a new repository in the given directory
 log        show revision history of entire repository or files
 parents    show the parents of the working dir or revision
 pull       pull changes from the specified source
 push       push changes to the specified destination
 remove     remove the specified files on the next commit
 revert     revert files or dirs to their states as of some revision
 serve      export the repository via HTTP
 status     show changed files in the working directory
 update     update or merge working directory

A more detailed list:

$ hg help
Mercurial Distributed SCM

list of commands (use "hg help -v" to show aliases and global options):

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 archive    create unversioned archive of a repository revision
 backout    reverse effect of earlier changeset
 bundle     create a changegroup file
 cat        output the latest or given revisions of files
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 copy       mark files as copied for the next commit
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 grep       search for a pattern in specified files and revisions
 heads      show current repository heads
 help       show help for a given command or all commands
 identify   print information about the working copy
 import     import an ordered set of patches
 incoming   show new changesets found in source
 init       create a new repository in the given directory
 locate     locate files matching specific patterns
 log        show revision history of entire repository or files
 manifest   output the latest or given revision of the project manifest
 merge      Merge working directory with another revision
 outgoing   show changesets not found in destination
 parents    show the parents of the working dir or revision
 paths      show definition of symbolic path names
 pull       pull changes from the specified source
 push       push changes to the specified destination
 recover    roll back an interrupted transaction
 remove     remove the specified files on the next commit
 rename     rename files; equivalent of copy + remove
 revert     revert files or dirs to their states as of some revision
 rollback   roll back the last transaction in this repository
 root       print the root (top) of the current working dir
 serve      export the repository via HTTP
 status     show changed files in the working directory
 tag        add a tag for the current tip or a given revision
 tags       list repository tags
 tip        show the tip revision
 unbundle   apply a changegroup file
 update     update or merge working directory
 verify     verify the integrity of the repository
 version    output version and copyright information

Mercurial Eclipse Plugin

The plugin is still in it’s early days, but the “FREEDOM” of open source enables me to easily review. After a quick install and review of docs, I shot off an email to the developer, stating why I was looking, and while I have other projects on the go, I asked what I could do to help. It’s only be 2 days and we have already communicated via email several times on various topics. That’s one reason why I really love the Open Source Community. Generally people are very receptive to feedback, comments and especially help.

Within Eclipse

  • Help ->Software Updates-> Find and install…
  • Select “Search for new features to install”, click Next
  • Click “New Remote site…”
  • Enter following details and click Ok
    • Name: MercurialEclipse Beta site
    • URL: http://zingo.homeip.net:8000/eclipse-betaupdate/
  • Follow the prompts to accept the license and download.

So now with Eclipse, on a project you can simply go Right Click -> Team -> Share Project -> Select Mercurial

A Quick Mercurial Tutorial

Of course the quickest way to learn about using Mercurial is to look at an existing product. So taking this plugin project for a spin.

$ cd /tmp
$ hg clone http://zingo.homeip.net:8000/hg/mercurialeclipse com.vectrace.MercurialEclipse
$ hg clone com.vectrace.MercurialEclipse example
$ cd example
# Create some new dummy files
$ touch test.txt
$ touch html/test.html
# View files against respository status
$ hg status
? html/test.html
? test.txt
# Add the new files
$ hg add
adding html/test.html
adding test.txt
# Commit changes
$ hg commit -m "Testing Mercurial"

So other then the second clone command (which enabled me to not mess up the original repository and to test distributed handling next), this is just the same as CVS (checkout, diff, add, commit)

# The Distributed nature involves first Pulling from the "upstream" respository
$ hg pull ../com.vectrace.MercurialEclipse
pulling from ../com.vectrace.MercurialEclipse
searching for changes
no changes found
# Confirm our new file is not in "upstream" respository
$ ls ../com.vectrace.MercurialEclipse/test.txt
ls: ../com.vectrace.MercurialEclipse/test.txt: No such file or directory
# Push local respository changes to "upstream" respository
$ hg push ../com.vectrace.MercurialEclipse
pushing to ../com.vectrace.MercurialEclipse
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 2 changes to 2 files
$ ls ../com.vectrace.MercurialEclipse/test.txt
ls: ../com.vectrace.MercurialEclipse/test.txt: No such file or directory

Hmmm, missed something here. The Quick Start Guide docs seems to want to work back into the original respository, pulling in changes from the “downstream” repository, then executing a merge and commit command. I don’t want to do this just in case I start messing up the original repository. Time for me to create a new standalone repository so I don’t screw anything up. Stay tuned for an updated tutorial.

Of course this is very basic, I need to look into commands used in CVS like login, tagging (versioning), branching, diffing revisions and merging but this is a start.

I do have a concern with the number of distributed respositories, when and how do you know you are in sync with the “master”, indeed what is the “master”. It does merit some investigation to see what management is in place, like identifying all respositories, and comparing for example.

Conclusion

Ok, so now I’ve got a grasp on Mercurial, time to review the Java Code and see what works and what doesn’t in the Eclipse environment. Of course I don’t use Mercurial so what I may consider as functionality required, may be lower priority to those users out there. Any feedback I’m sure will be most appreciated by the original developer.

What is software quality?

Greg Lehey wrote today Is MySQL getting buggier?. The underlying question of his comments is a more fundamental and passionate topic, and especially for me. That is “Software Quality”.

The quintessential question is this. “How do you determine the ‘software quality’ of a product?” And then quickly followed by, “How do you benchmark this with other software products?”

The short answer to second question is simple. You can’t. The reasons why become apparent in addressing the first question. (There’s a mathematical term for this two question situation, another one of the million things to research and remember one day).

15 years ago as part of my masters research I worked on “Improving Software Quality and Software Productivity”. At the time when I started, I found that these were generally considered mutually exclusive. Quite simply, to improve software quality was to decrease productivity, and when you had to improve productivity, quality declined. Unless you had a clear end goal (which you can’t in software, features, cost, satisfaction etc.), it was impossible to measure the total “cost”. It was also impossible to clearly determine impact on key factors, as with software you never have a “control group” working in isolation of other groups trying different things, and being able to do imperial comparison analysis.

In summary, I found that to improve software quality and software productivity you had to reconsider the approach at several different levels.

  1. You had to strive to work towards a simpler business solution, rather then a more complex solution.
  2. You had to leverage converting business knowledge that was held in lengthy and often dated paper documents, into electronically managable content. Some of the benefits included concurrency, consistency and comparative analysis of the knowledge. The early days of CASE repositories were built on this principle, again before they became to complex.
  3. You also had to look at better ways of writing code, and the area I focussed on was code generation. That is “Code writing Code”. (I have a very successful story of 400,000-500,000 Lines of C code. The ODT was a specialised intelligent peer-to-peer selective replication, between a central repository and 32 distributed sites, allowing for independent heterogeneous operations. It provided levels of complexity within replication including two-way master of partial database data, selective row replication between nodes depending on business needs, and collision management. And I did this is1992. A story for another day).

I also found in the business and industry sectors of research, it was near impossible to get a commonality of some components, a higher level of re-use, a higher value of cost savings. There were many factors for this, another topic for another time.

Now that was a long time ago, and a number of things have changed greatly, including my views. These are far more prevalent regarding Open Source.

Some historical general principles in measurement don’t apply today in Open Source. I remember reading recently in the browser wars, that FireFox was “too honest”. Being more open to acknowledge problems/bugs and correcting them, then the Microsoft perspective. How do you compare quality in this situation. (Side Note: I wish there was a search engine that would search all pages that you had viewed, bookmarked etc., rather then searching the WWW for everything. Would make life much easier in finding references like this.)

Open Source does something commercial software can’t do. If you find a bug, and you can’t wait, then fix it yourself, or at worse pay somebody to fix it for you. The scope, functionality and rules are constantly changing, the resourcing and priorities as well. How do you adequately measure something so fluid.

I am a proponent of “Agile Development Methodologies”, having use for many years eXtreme Programming (XP) either rather seriously, or taking small attributes and applying to existing infrastructures. Terms include Test Driven Development (TDD), Feature Driven Development (FDD), Code Coverage, Unit Tests.

Speaking just on Unit Tests. You write a Unit Test before you write your code. It helps to understand what you are trying to achieve before you try to achieve it. It also provides coverage for when the rules change, you write new or changed tests, which allows you to have analysis performed by your code to address problems. I’ve found that writing extensive tests become a self documentation system of the functionality, it highlights what can and can’t occur. Simply ‘grep’ing can offer a clear summarised description of code, even if you have now knowledge of it. In Java you write long method names (some abstract examples would be ‘testFunctionDoesThisAndNotThis”, “testFunctionAcceptsTwoOfThese”, “testFunctionFailsWithValuesOfThis” etc.) I need to provide a more concrete example to better explain.

An Agile approach, also tends to better encapsulate the Web 2.0 principle of considering software as a service, and not a product. This entirely changes the general thought of software development, of the large cost of software development, followed by a maintenance period and cost of the product, followed by a review of potential new products as circumstances change, followed by … (you get the picture)

If you consider an acceptable ongoing cost for providing a service, perhaps pegged against other business revenue/expenses, and then your product was managed by what available resources you had (time, money, people), rather then by functionality requirements you entire perspective changes. You include a high level of user ownership, the release often principle, and the capacity to review and adapt to changing and improving technology and requirements, all these factors change the outcome on how software is development, managed and perceived.

The other thing about all this is, It’s common sense. It’s a simpler approach, and if everybody considered “How can I make this simpler?” first, it would change they way software is developed and perceived.

So getting back onto the topic of Software Quality. I found that in the right circumstances (and there are wrong circumstances), imploring a change in principle towards Agile concepts can have a significant effect on Software Quality. But, Beauty is in the eyes of the beholder. Software Quality is also much the same. What may be acceptable for somebody is unacceptable for somebody else. How then do you measure this?

So after all this, we are left with the question. “How do you determine the ‘software quality’ of a product?” Well, we can only consider a single product in isolation, and we are talking the MySQL server. I should note that you can’t consider all MySQL products in one analysis, different factors including the relative age of the product, the technologies etc. all affect the measurement. You need to consider each product individually.

So why is the number of bugs, or number of open bugs a measure of software quality. What about how big is the product (e.g. lines of code), how complex is the product, how mature is the product, how tested is the product. What we considered the lifespan of critical bugs as a measurement? So what if there were a larger number of critical bugs, but they were fixed immediately is that a good or poor reflection on software quality. If software is in the Web 2.0 “Perpetual Beta”, then are bugs really incomplete functionality.

In software and for organisations, I think an approach could be similar as mentioned by a prospective employer recently. “We know things aren’t the best they could be, but they are better then they were. If our goal could be to improve just 1% each day, then we would be easily 100% better in 6 months.” So with whatever metrics were in place (and you have to have some form of measurement), if there is always a continual improvement, then software quality is always improving.

Differences in syntax between mysql and mysqltest

As I wrote earlier in Using the MySQL Test Suite I found an issue with using the current MySQL Sakila Sample Database as a test with mysqltest.

I was running an older version of 5.1.7 beta so I figured the best course of action was to upgrade to 5.1.11 beta.

Well the problem still exists, and I found that the cause was due to the syntax of the DELIMITER command. Assuming the creation of the schema tables, here is an example of what I found.

Running in an interactive mysql session the following works

DROP TRIGGER ins_film;
DELIMITER ;;
CREATE TRIGGER `ins_film` AFTER INSERT ON `film` FOR EACH ROW BEGIN
    INSERT INTO film_text (film_id, title, description)
        VALUES (new.film_id, new.title, new.description);
  END;;
DELIMITER ;

On a side note, why does DROP TRIGGER not contain the IF EXISTS syntax like DROP DATABASE and DROP TABLE. It’s a shame, it’s a nice SQL extension that MySQL provides.

Now running the same SQL in a mysqltest test (with appropiate create tables) produces the following error

TEST                            RESULT
-------------------------------------------------------
sakila-trigger                 [ fail ]

Errors are (from /opt/mysql/mysql-test/var/log/mysqltest-time) :
mysqltest: At line 53: Extra delimiter "" found
(the last lines may be the most important ones)

It seems, that the DELIMITER command within mysql accepts a line terminator as a statement terminator, while in mysqltest, this is not so.
The solution is to be more strict in the mysqltest, using the following syntax, note the first and list DELIMITER lines are terminated by the appropiate delimiter in use.

DELIMITER //;

CREATE TRIGGER `ins_film` AFTER INSERT ON `film` FOR EACH ROW BEGIN
  INSERT INTO film_text (film_id, title, description)
  VALUES (new.film_id, new.title, new.description);
END//

DELIMITER ;//

Surprisingly, this syntax does then not work in mysql? Is this a bug? I’m note sure, perhaps somebody could let me know.