Google – Is both friend and foe

September 25, 2006 by ronald

This is an interesting article How Google can make – or break – your company. The plight of a small business that overnight lost it’s high free search results and in the coming months say a 20% drop in sales.

The official response “Just as the web changes constantly, Google continues to modify and refine our algorithms to improve the quality of our objective search results.”

The comment made by the small business “It’s like owning a shop on a busy street corner where all the pedestrians suddenly and mysteriously vanish.” That’s gotto hurt.

The article also delves into a lot more, an interesting read.

Asian Air Show 2006 featuring the Airbus A380

September 24, 2006 by ronald

While in Singapore earlier in the Year (Yes, I’m very slack with these photos), I got up close and personal with the new Airbus A380. I wasn’t that lucky to get inside, I think that was reserved for airline flight crews, but still there are a few photos.

I’ve had some media problems with both DVD’s containing photos of my Singapore/Thailand/Malaysia trip, but you can view a few Here on my Flickr.

Updated
7 Nov 2006 Airbus recently announced it’s third major delay, and today FedEx cancelled it’s order. Read more amount the details, and airline orders at Wikipedia.

Flickr. Moving on from Web 2.0.

September 24, 2006 by ronald

On the features of Web 2.0 sites is the design pattern concept of The Perpetual Beta (See Point 6).

Seems Flickr or more likely Yahoo (the owner of Flickr) wants to be more.
Have a close look at the logo.

Google Earth

September 24, 2006 by ronald

Google Earth and Google Maps can give you some great views of our planet. Here is a full-time blog just for Google Earth.

There’s plenty of amazing things to see, check out Top Ten Coolest Things seen with Google Earth. The list includes:

Capsized Cruise Ship Captured in Google Earth / Maps
KC-135 Caught Refueling C-5 Galaxy in Mid-Air in Google Earth!
Huge Scale Model of Disputed Border Region of China Found in Google Earth
Lancaster Bomber Caught Flying in Google Earth
Flying Car? Not Really
Nude Sunbather caught by google earth.
Firefox Crop Circle in Google Earth
Google Earth Las Vegas
Shipwrecks Around the Google Earth
See African Animals in High Resolution in Google Earth

Not to be outdone, check out the Google 3D Warehouse for images like Golden Gate Bridge.

Here are a few more cool things I’ve seen. Hole, More about the hole

Plane Spotting without leaving your desk

September 24, 2006 by ronald

I’ve heard of people that go plane spotting. Watching different planes that come into airports.

I’m lazy, while checking out my home with Wikimapia look what I spotted Here. (Offline image: View)

So for all those plane spotters. What is it? My guess would be an Boeing 737, probably a 737-800.

Speaking of Plane Spotting, I’ve been meaning of putting up my photos of the recent Asian Air Show in Singapore and my up close and personal photos of the Airbus A380.

Arrrrrgggggggggggghhhhhhhhhhhhhhhh! I’ve tried to retrieve my photos from DVD and I’m getting an cp: reading `/media/photos2006/photos/2006/2006_02_23_AirShow/IMG_5939.JPG': Input/output error Error.

Well so much for those photos of Singapore, Bangkok and Malaysia from earlier in the year.
That’s gotta ruin your Sunday.

PS: How did I know what plane it was. Hint: the color gives it away.

PPS: Updated Jan 12 2007. Interesting that the present live view no longer has the plane. My Offline View has it. Good to know that this little area of Brisbane has been updated on Google Maps.

The price for digg success

September 24, 2006 by ronald

I guess much like the Slashdot effect, the Digg effect is both a good thing for your exposure and traffic hits, and a bad thing for those ISP’s watching the traffic. (See Jay’s Slashdot Fame).

In the past week, I’ve gone to a number of digg article sites and they have been unavailable. I never kept details of these IT articles, but here is one the one article not of an IT I look at that I did. It referred to an image, which the host provider adjusted (see image to the right). The host provider was ImageShack. I didn’t read anything in the T & C about being too popular!

Here’s the original Digg Post Gmail ads get a little too personal.

Seems I’m not the only person wanting to see it, I found in the comments a repost of the image Here. Just for future reference my copy is Here

Planet kicked out of solar system after failing test

September 17, 2006 by ronald

PLUTO supporters from all over the world are burning things after the tiny cold planet was booted out of the solar system.

The decision caused uproar in the text book and hanging mobile industries, which will have to recall all their products and re-do them,

The disney company announced it would rename its famous cartoon dog.

Trivia questions, already at calamity points, will need to be rewritten and planetariums are busy papering over their Pluto exhibits, retraining guides and cutting the ends off posters.

From City News Issue 153 31 August 06

Larger Image

Logical Data Modelling (LDM)

September 17, 2006 by ronald

Following my User Group Presentation I was asked by fellow MySQLer Kim about Logical Data Modelling (LDM), in relation to Physical Data Modelling.

Well, first the brain had to work overtime to remember when was the last time I worked on a Logical Data Model. The answer to that is 1996 doing R&D work for Oracle Corporation with their CASE repository tool, Oracle Designer, about version 1.3/1.3.2. I’ve learnt in the past 10 years to purge technical stuff from my brain, leading from the capacity in be able to remember in detail data models, data migration and data cleansing issues of projects even after leaving them 3 years eariler.

As Kim pointed out, he thinks physically, actually directly at the SQL level, then working backwards to produce an appropiate physical model. To think logically is to consider the entities and attributes and relations before considering the physical tables, columns and relationships. So how do you program somebody to think logically? In the case with Kim, he is undertaking formal studies after already grasping the concepts of software development. Generally today more people don’t undertake the formal education and we end up with The Hobbist and the Professional Syndrome.

I guess in summary I’d argue why bother. Does anybody still do traditional logical models? Feedback/comments are welcome. (Professionally I believe there is a place for LDM).
However for the purpose of the exercise lets start with a Physical Representation and present a Logical Representation of that to see the differences.

From this you can see a classic Library example, showing a table of Books and a Table of Authors, and an intersection table (3NF) to indicate that a Book can have one to many Authors, and likewise an Author can have one to many Books.

So how would this have been represented in a Logical Model.

Some things to consider in Logical Modelling.

Attributes really only require a name, and perhaps just a datatype using number, string, date (Jim Starkey would be happy), but nothing specific like SMALLINT, or VARCHAR(50).
An Attributes Mandatory/Optional state may or may not be known
Information is not in 3NF. i.e. relationships can be many to many (in our example, there is no concept of the intersection table

I’m going from memory here, so there is probably more points to consider.

So how do you teach this when you are trying to work backwards, when I learnt this back in 1988 (some 18 years ago), I’d never created a database or used SQL so I didn’t have the history. Arjen had a brilliant idea to consider Logical Modelling after the fact. Views. Consider an end user requirement for Reporting, and how you would represent your model better to an end user (effectively de-normalising these views so users don’t have to know about joins).

In this case you would create a view of Books and a View of Authors. Details such as mandatory/optional isn’t important to the end user report (ie, it’s not like it’s needed to be enforced), and specific datatype details again are not that important. The basics to know how to format a number or a date works.

An interesting approach that worked well in our explaining.

The Text Book

So it was interesting to go back to the text book using C.J. Date’s An Introduction to Database Systems to review definitions.

Logically isn’t really referenced, the term used is conceptual. A conceptual definition is “The right way to do database design is to get the logical design right first, without paying any attention whatsoever at that stage to physical – that is, performance – considerations”

Also for example Third Normal Form we get from Section 12.3 First, Second and Third Normal Forms.

Third normal form (very informal definition): A relvar is in 3NF if and only if the monkey attributes (if any) are both:

a. Mutually independent
b. Irreducibly dependent on the primary key

We explain the terms nonkey attribute and mutually independend (loosely) as follows:

A nonkey attribute is any attribute that does not participate in the primary key of the relvar concerned.
Two or more attributes are mutually independent if none of them is functionally dependent on any combination of the others. Such independence implies that each attributes can be updated independently of the rest.

Man, no wonder many years of experience and having generally seen most cases, enables me to forget this and not feel like I’ve fogotten something.

The Hobbyist and the Professional

September 17, 2006 by ronald

I first coined this term in February 2006 in a paper titled “Overcoming the Challenges of Establishing Service and Support Channels” for the conference “Implementing Open Source for Optimal Business Performance” View Paper.

I must have referenced this several times, time to give this topic it’s due notice. In summary it targets three areas:

the lack of appropriate database design in open source projects
the lack of coding standards
the lack of sound programming principles

Here are the bullet points from the slide in a presentation.

Hobbyist

Download-able software and examples
Online tutorials
Books like Learn in 24 hours/For Dummies

Professional

Formal Qualifications
Grounding in sound programming practices
Understanding of SDLC principles
Worked in team environment

Middle Ground Developer

Time to skill verses output productivity
Depends on environment and requirements

Slow Queries aren't always that bad!

September 16, 2006 by ronald

Well, now I have your attention, Slow Queries are bad (unless it’s a single user system and you don’t care). However there are worse things then slow queries in a large enterprise system.

I’ve been asked in recent weeks a number of questions which has brought this topic to discussion, as well as a current implementation I’m undertaking for a client of a purchased product.

High volume repetitive queries can have a worse effect on your system’s performance. Combined with slow queries that take locks, these queries can have an extreme effect on performance and if you don’t know your application, or have the right tools, it can be initially hard to diagonise easily.

This problem is the classic Wasting CPU cycles problem, seen it before, will see it again.

Here’s a classic example for reference. Using the current product that I’m customising and installing (I’ve not be involved in any development), a typical Customer Information System (CIS).

The system when deployed will have 1.1 Million Customers. Each Customer has a Balance. Now the Customer Balance is not recorded against the customer record, it’s calculated every time it’s required. So, no big deal, well yes. The stored procedure to calculate the balance hit’s 4 different tables, and one of these tables is one of the largest in the system (recording all detailed financial transactions). Multiple batch reports alone that work with large sets of customers all require or use the current balance in some means.

There are only a small set of transactions that affect the balance including Invoice Statements, Payments, Adjustments, Credit Control, Write-Offs . These processes by there own nature are either batch or small online transactions. The calculation of the balance can be applied against the Chunk of customers in a batch, or individual Row for online transactions (See The RAT and the CAT).

Even for the paranoid, the re-calculation of the customer balance for all customers in batch is more efficient after hours, yet only customers where some transaction that has been applied since the last time the re-calculation ran is necessary.

MySQL doesn’t be default have any analysis tools to identify and manage these types of queries. Peter Zaitsev in his article Slow Query Log analyzes tools highlights the issue with a good approach by a source code change to long_query_time as well as some additional scripts.

Hopefully MySQL will consider a more improved approach in future releases that doesn’t require patching the source code. Now it’s unlikely that MySQL will do another SHOW STATUS Gotcha and for example change the unit of measure from seconds to milliseconds, but they could make it a float, so you could specify 0.1 for example.

Web Site – Speed Test

September 16, 2006 by ronald

Want to know your Internet Connection speed in a real world test?
Want a fancy graphical presentation of your internet Speed?

SpeedTest.Net has you covered. As you can see that even provide graphics results to can use on your own site.

So just how much does my Bittorrent download of Stargate and StarGate Atlantis weekly epsiodes affect my link (Azureus states about 20kB/s down and 20kB/s up. Here is the results. Hmmm, seems if affects my link much more then it really states!

Google Doodles

September 16, 2006 by ronald

Sometimes using that Firefox Google Quick Search causes you to miss out on the best part of the Google Search experience, and that is the Google Doodle.

Now while I’m tempted to include one here, I’m respecting the wishes of Google which request you use an official logo and “don’t feed the kangaroo“.

So speaking of the Kangroo, you will need to goto the Google Doodle of the 2000 Summer Olympic Games held in Sydney Australia.

Stories that impress and motivate you

September 16, 2006 by ronald

I’ve worked for two Internet startup companies, both around 2 years each, both now long dead. The first was due to eventual lack of new VC funds, the second gross financial managment in the second year (apparently, when we were told there was no money December one year to pay us, the company that made large profits every month for over the first year, then had made losses every month for the past 12 months, but nobody knew about it. There were 5 Directors from 3 countries and nobody knew. Yeah Right!)

I’ve learnt a lot of non IT street smarts in this time. The first startup took the VC route, and after 3 rounds while I wasn’t involved in the process you pick up things. The single biggest tip here is the Bell-Mason Diagnostic. Here a few introduction references worthy of a quick review (One, Two).

When you take any great idea, and then consider the 4 quadrants and 12 axis you realise you really need to make larger circles of professional contacts.

Zac’s article Valley Boys Run MySQL talks about the new breed of Web entrepreneurs and Web 2.0. In particular check out
Valley Boys Digg.com’s Kevin Rose leads a new brat pack of young entrepreneurs. This is the new wave of success that works without VC.

For me this recent post on the Meebo Blog really impressed me.

365 days ago

Nineteen releases, eleven (fantastic) team members, and 295,321 gummy bears later we see over a million users log into meebo each day. As a coder, all you hope is that your service will be able help people go about their day-to-day lives. Stability, bugs, and good usability are always top of mind.

1,000,000 users of your Web 2.0 application. This impresses and motivates me. What makes it possible to anybody, is you can get a LAMP stack, a live-cd, cheap hosting and you can turn your idea into something real for next to no cost. Of course, I won’t start on the nightmares out there of great ideas that are very poorly designed. At least the underlying stack can support anything you want to achieve, and MySQL is behind these success stories.

One more thing on Meebo, Check out the meebo map!. I’ve been told that Google has something of a similar nature at the Googleplex. Well I can say I’m very keen to see this, and will be 8 weeks time when the First MySQL Camp is held.

MySQL Trigger Features

September 14, 2006 by ronald

Sheeri talked a little about MySQL Triggers in One TRIGGER Fact, Optimizer Rewriting Stuff. While it’s great that MySQL 5.0 has Triggers, there are some things that I feel could become features in future releases.

IF EXISTS

One of the beautiful features that MySQL has is IF EXISTS. This ternary operation that if the object exists performs the operation, of not it does nothing works wonders in reviewing logs for errors. One of the problems with Oracle for example, is the requirement to ignore the ORA errors for non-existent objects.

But this functionality doesn’t exist for Triggers? One must wonder why. I’d like to see this.

mysql> DROP TRIGGER IF EXISTS trigger_name;

MySQL Manual DROP TRIGGER

REPLACE

On feature that simplifies the lack of IF EXISTS functionality using Oracle is REPLACE. The syntax is:

oracle> CREATE OR REPLACE TRIGGER trigger_name...

In this case, this functionality effectively eliminates the need for a DROP IF EXISTS, however I’d not like to see this introduced into MySQL especially it’s an Oracle specific syntax and not ANSI SQL.

Multiple Triggers of same type per table

MySQL only allows one Trigger per type per table. That is, there can only be one BEFORE UPDATE for example. While you may ask the question why you would need this functionality. Here is a typical situation.

You use Triggers to perform some level of business functionality, determining values for optimised (denormalised) columns is a good example. So you need to write an appropiate trigger for that piece of functionality.
You also use Triggers to perform database auditing, that is for every insert/update/delete of data, you record a full copy of the change in an audit database. One way to ensure this is consistent across the entire database is to implement via triggers. So you leverage programming functionality to pre-create triggers for all tables to manage this.

The problem with MySQL occurs in that you have to now merge these triggers for tables that require both. If you want to deploy your application into a test environment, you may wish to not deploy your auditing triggers, but now you have this functionality mixed in with business logic.

Multi Type Triggers

Another cool Oracle feature is the capacity to define a trigger for multiple types. For example:

oracle> CREATE OR REPLACE TRIGGER trigger_name
oracle> BEFORE INSERT OR DELETE OR UPDATE ON table_name ...

In this example, with MySQL you would need to create three seperate triggers.

Other features

While not as important and one would need to consider if necessary in MySQL are some other Oracle provided trigger functionality. This includes:

WHEN CLAUSE trigger restriction
Triggers on DDL Statements such as CREATE, DROP, ALTER
Triggers on Database events such as LOGON, LOGOFF, STARTUP, and SHUTDOWN
INSTEAD OF Triggers (for Views)
STATEMENT based trigger

For more information you can check out the Oracle Documentation on Introduction to Triggers, CREATE TRIGGER, Documentation Search on Triggers

Handling Error Levels in Logging

September 13, 2006 by ronald

In reviewing some provided code to a client, I observed a number of actions contray to generally accepted practices regarding logging. This is what I provided as the general programming conventions with regardings to logging.

Using Log4J (http://logging.apache.org/log4j/docs/), which is generally accepted as the benchmark for all Java applications. This provides the following logging levels.

FATAL
ERROR
WARN
INFO
DEBUG
TRACE – from 1.2.12, latest is 1.2.13

A description for what handling should occur per logging level.

FATAL. As the name suggests, all processing should stop. Should logging include a FATAL, the process is a Failure.

ERROR. An error has occured, and this requires attention, and action. Generally processing should stop, however additional post processing, or an alternative path could occur. Should logging include an ERROR, the process is a Failure.

WARN. Something that is unexpected occured, however it doesn’t affect the general processing from succeeding successfully. If a process includes WARN and not FATAL/ERROR it should be considered successful.

INFO. Information Only. On high volume systems, this level of logging may even be turned off. This generally indicates key information values or steps, and can assist when enabled in longer running processes to identify where a process is. You don’t log errors at INFO.

DEBUG. For Debugging Purposes only.

Ok, well that sounds like common sense. Here is what I observed on a client site.

Code logs a FATAL, but continues processing
A FATAL is logged, yet the calling process reports success
An ERROR is logged, yet the calling process reports success.
A lot of WARN are logged, and this is misleading, as it appears more information regarding XML elements not processed (We are talking 20+ Warnings per batch process). From what I’ve observed, these don’t require futher action, and should be changed in INFO.
Errors are being logged as INFO. A NullPointer RunTime Exception is logged as INFO. If an error provides an Exception argument where a stack trace is printed, it ain’t an INFO message.

The Windoze CLI, or lack there of.

September 8, 2006 by ronald

There are many reasons why I hate Windoze, here is just another example.

I need via the command line, change to a given directory, so I can manually run a command. Not only can I not do a cd to the directory, as shown in the printscreen here, I can’t use a wildcard to try and identify the directory better (perhaps its the addition of uppercase characters or a fullstop at the end of the directory that’s making this OS confused?). Of course I can’t do tab completion either.

Ok, well, just cut and paste what the dir gives you. No, it’s not highlight, right mouse click, Hmmm you have to go, Menu | Edit | Mark, then highlight the text, then Menu | Edit | Copy, then Menu |Edit |Paste just to get a copy of the text, and that still doesn’t work.

Of course Windoze was never designed for programmers, especially those that can achieve far more with a command line interface (CLI for all those Windoze users that have never heard of it), and much more quickly then with having to move a mouse around, drawing one hand away from the keyboard where you do all the other typing. So much for making the world more efficient.

So how to you actually do it, well the only way I know is you have to install more software from XP Power Toys called Open Command Window Here. This allows you to use the GUI Windows Explorer (again a mouse thing) to navigate to the directory, then right click to open a command window at this directory.

So what was the outcome of the actually directory using this comment, see below. The same name I was trying to type in. Go Figure!!!!!!!!!!

Brisbane Users Group – MySQL Hackfest

September 6, 2006 by ronald

Last night we had a number of keen souls at the Brisbane MySQL User Group. I was very impressed to see the majority of people with laptops at hand.

You can download my slides and code examples at my Articles page.

In our hands on Hackfest tutorial we created the new command SHOW USERGROUPS. Before anybody makes a comment, it was stated in the presentation that this command was made a dummy one, and is a poor candidate for two reasons.

The results should be more dynamic, rather then hardcoded into the source tree
USERGROUPS is not an ideal name due to comparisions to Users, Groups, Roles etc

Still it was productive, here was the outcome of our work for the evening.

mysql> SHOW USERGROUPS;
+----------------------+-------------------------+-----------------------------+
| Name                 | Location                | Website                     |
+----------------------+-------------------------+-----------------------------+
| Brisbane Users Group | Brisbane, QLD Australia | http://mysql.meetup.com/84/ |
| Boston Users Group   | Boston, USA             |                             |
+----------------------+-------------------------+-----------------------------+
2 rows in set (0.00 sec)

The trailing slash ‘/’ Hitcho’s contribution

mysql> SHOW AUTHORS;
+--------------------------------+---------------------------------------+----------------------------------------------------------------------+
| Name                           | Location                              | Comment                             |
+--------------------------------+---------------------------------------+----------------------------------------------------------------------+
| Brian (Krow) Aker              | Seattle, WA, USA                      | Architecture, archive, federated, bunch of little stuff :)           |
...
 Brisbane Users Group           | Brisbane, QLD Australia               | New Command SHOW USERGROUPS                             |
+--------------------------------+---------------------------------------+----------------------------------------------------------------------+
75 rows in set (0.05 sec)

I’ll be publishing more beginner tutorials in my MySQL Compiling Category over time.

Update Check out http://www.arabx.com.au/hackfest.diff.htm for a diff of files to produce this output.

Compiling MySQL Tutorial 3 – Debugging Output

September 5, 2006 by ronald

Continuing on from Tutorial 2.

When reviewing the 2.1. C/C++ Coding Guidelines for MySQL, you will see that the MySQL Source uses within the C/C++ code DBUG (Fred Fish’s debug library). This is one of the 3rd party open source products that are documented in the Internals 1.4. The Open-source Directories.

You will also find some usage in the MySQL Manual E.3 The DBUG Package. So enough talk, how do you use this.

DBUG

You get the DBUG output by running a mysqld debug version with the argument –debug. The output for the SHOW AUTHORS commands is:

T@9190304: >dispatch_command
T@9190304: | >alloc_root
..
T@9190304: | query: SHOW AUTHORS
T@9190304: | >mysql_parse
T@9190304: | | >mysql_init_query
T@9190304: | | | >lex_start
T@9190304: | | | | >alloc_root
T@9190304: | | | | | enter: root: 0x9590b38
T@9190304: | | | | | exit: ptr: 0x95c1c70
T@9190304: | | | | <alloc_root
T@9190304: | | | <lex_start
T@9190304: | | | >mysql_reset_thd_for_next_command
T@9190304: | | | <mysql_reset_thd_for_next_command
T@9190304: | | <mysql_init_query
T@9190304: | | >Query_cache::send_result_to_client
T@9190304: | | <query_cache ::send_result_to_client
T@9190304: | | >mysql_execute_command
T@9190304: | | | >mysqld_show_authors
T@9190304: | | | | >alloc_root
T@9190304: | | | | | enter: root: 0x9590b38
T@9190304: | | | | | exit: ptr: 0x95c1c78
T@9190304: | | | | <alloc_root
...
T@9190304: | | | | >send_fields
T@9190304: | | | | | packet_header: Memory: 0x8c2480  Bytes: (4) 01 00 00 01
T@9190304: | | | | | >alloc_root
T@9190304: | | | | >Protocol::write
T@9190304: | | | | <protocol ::write
...
T@9190304: | | | | >send_eof
T@9190304: | | | | | packet_header: Memory: 0x8c2550  Bytes: (4) 05 00 00 51
T@9190304: | | | | | >net_flush
T@9190304: | | | | | | >vio_is_blocking
T@9190304: | | | | | | | exit: 0
T@9190304: | | | | | | <vio_is_blocking
T@9190304: | | | | | | >net_real_write
T@9190304: | | | | | | | >vio_write
T@9190304: | | | | | | | | enter: sd: 23, buf: 0x0x95b9bd0, size: 5107
T@9190304: | | | | | | | | exit: 5107
T@9190304: | | | | | | | <vio_write
T@9190304: | | | | | | <net_real_write
T@9190304: | | | | | <net_flush
T@9190304: | | | | | info: EOF sent, so no more error sending allowed
T@9190304: | | | | <send_eof
T@9190304: | | | <mysqld_show_authors
T@9190304: | | <mysql_execute_command
T@9190304: | | >query_cache_end_of_result
T@9190304: | | <query_cache_end_of_result
...
T@9190304: | <mysql_parse
T@9190304: | info: query ready
T@9190304: | >log_slow_statement
T@9190304: | <log_slow_statement
T@9190304: >dispatch_command

Using some of the debug options you get the following output with –debug=d,info,error,query,general,where:O,/tmp/mysqld.trace

create_new_thread: info: creating thread 3
create_new_thread: info: Thread created
?func: info: handle_one_connection called by thread 3
?func: info: New connection received on socket (23)
?func: info: Host: localhost
?func: info: vio_read returned -1,  errno: 11
thr_alarm: info: reschedule
process_alarm: info: sig: 14  active alarms: 1
?func: info: client_character_set: 8
check_user: info: Capabilities: 238213  packet_length: 16777216  Host: 'localhost'  Login user: 'rbradfor' Priv_user: ''  Using password: no Access: 0  db: '*none*'
send_ok: info: affected_rows: 0  id: 0  status: 2  warning_count: 0
send_ok: info: OK sent, so no more error sending allowed
do_command: info: vio_read returned -1,  errno: 11
thr_alarm: info: reschedule
process_alarm: info: sig: 14  active alarms: 1
do_command: info: Command on socket (23) = 3 (Query)
dispatch_command: query: SHOW AUTHORS
send_eof: info: EOF sent, so no more error sending allowed
dispatch_command: info: query ready
do_command: info: vio_read returned -1,  errno: 11
thr_alarm: info: reschedule
process_alarm: info: sig: 14  active alarms: 1
do_command: info: Command on socket (23) = 1 (Quit)
~THD(): info: freeing security context
end_thread: info: sending a broadcast
end_thread: info: unlocked thread_count mutex

I have found the following format good for comparing with the MySQL Source Code. —debug=d:N:F:L:O,/tmp/mysqld.trace

   49:   sql_parse.cc:  1536: do_command: info: Command on socket (23) = 3 (Query)
   50:     my_alloc.c:   172: alloc_root: enter: root: 0x9068b38
   51:     my_alloc.c:   219: alloc_root: exit: ptr: 0x9099c30
   52:   safemalloc.c:   128: _mymalloc: enter: Size: 16384
   53:   safemalloc.c:   197: _mymalloc: exit: ptr: 0x909cc80
   54:   safemalloc.c:   262: _myfree: enter: ptr: 0x9095bf8
   55:   sql_parse.cc:  1757: dispatch_command: query: SHOW AUTHORS
..
   88:    sql_show.cc:   385: mysqld_show_authors: packet_header: Memory: 0x940590  Bytes: (4)
..
  166:    protocol.cc:   385: send_eof: packet_header: Memory: 0x940550  Bytes: (4)
05 00 00 51
  167:    viosocket.c:   184: vio_is_blocking: exit: 0
  168:    viosocket.c:   105: vio_write: enter: sd: 23, buf: 0x0x9091bd0, size: 5107
  169:    viosocket.c:   117: vio_write: exit: 5107
  170:    protocol.cc:   342: send_eof: info: EOF sent, so no more error sending allowed
  171:     sql_lex.cc:   199: lex_end: enter: lex: 0x9068b58
  172:   sql_parse.cc:  1796: dispatch_command: info: query ready
  173:     my_alloc.c:   331: free_root: enter: root: 0x9068b38  flags: 1
  174:    viosocket.c:   184: vio_is_blocking: exit: 0
  175:    viosocket.c:    36: vio_read: enter: sd: 23, buf: 0x0x9091bd0, size: 4
  176:    viosocket.c:    49: vio_read: vio_error: Got error 11 during read
  177:    viosocket.c:    52: vio_read: exit: -1
  178:   sql_parse.cc:   809: do_command: info: vio_read returned -1,  errno: 11

  190:   sql_parse.cc:  1536: do_command: info: Command on socket (23) = 1 (Quit)
  191:     my_alloc.c:   331: free_root: enter: root: 0x9068b38  flags: 1
  192:      mysqld.cc:  1681: close_connection: enter: fd: socket (23)  error: ''
  193: sql_handler.cc:   628: mysql_ha_flush: enter: tables: (nil)  mode_flags: 0x02

This will make a lot more sense with my upcoming presentation where you will see clear use and interaction between source files such as sql_parse.cc & sql_show.cc.

For more information check out Making Trace Files

Google at "The Simpsons"

September 3, 2006 by ronald

I have a friend that is a DBA with Fotolog. A very novel idea that works well. Worth browsing to see what people take photos of. You never know what you find. I thought this was cool. You can view some of my photos here.

Original Source

A lesson in Perl Package Syntax

September 1, 2006 by ronald

I was given a perl program today, however running on my Fedora Core 5 install gave me the following error.

$ ./nameofcommand
Can't locate Term/ReadKey.pm in @INC (@INC contains: /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.7/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7 /usr/lib/perl5/site_perl/5.8.6
/usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl/5.8.4 /usr/lib/perl5/site_perl/5.8.3 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.7/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.6/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8
/usr/lib/perl5/vendor_perl/5.8.7 /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl/5.8.4
/usr/lib/perl5/vendor_perl/5.8.3 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .)
at ./nameofcommand line 28.

A quick check of the things I know about Perl.

$ rpm -qa | grep perl
perl-Digest-HMAC-1.01-14.2
perl-XML-Parser-2.34-6.1.2.2
perl-DBI-1.50-2.2
perl-Digest-SHA1-2.11-1.2
perl-URI-1.35-2.2
perl-Compress-Zlib-1.41-1.2.2
perl-HTML-Parser-3.51-1.FC5
perl-5.8.8-5
perl-Net-DNS-0.57-1
perl-HTML-Tagset-3.10-2.1
perl-Net-IP-1.25-1.fc5
perl-libwww-perl-5.805-1.1
perl-SGMLSpm-1.03ii-16.2
perl-String-CRC32-1.4-1.FC5

$ perl --version
This is perl, v5.8.8 built for i386-linux-thread-multi

$ cpan Term/ReadKey
Going to read /root/.cpan/sources/authors/01mailrc.txt.gz
CPAN: Compress::Zlib loaded ok
Going to read /root/.cpan/sources/modules/02packages.details.txt.gz
  Database was generated on Thu, 31 Aug 2006 11:32:19 GMT
CPAN: HTTP::Date loaded ok

  There's a new CPAN.pm version (v1.87) available!
  [Current version is v1.7602]
  You might want to try
    install Bundle::CPAN
    reload cpan
  without quitting the current session. It should be a seamless upgrade
  while we are running...

Going to read /root/.cpan/sources/modules/03modlist.data.gz
CPAN: Storable loaded ok
Going to write /root/.cpan/Metadata
Running make for T/T//T/Te/Term/ReadKey
CPAN: LWP::UserAgent loaded ok
Fetching with LWP:

http://cpan.mirrors.ilisys.com.au/authors/id/T/T//T/Te/Term/ReadKey

Fetching with LWP:

http://cpan.mirrors.ilisys.com.au/authors/id/T/T//T/Te/Term/ReadKey.gz

Trying with "/usr/bin/links -source" to get

http://cpan.mirrors.ilisys.com.au/authors/id/T/T//T/Te/Term/ReadKey

CPAN: Digest::MD5 loaded ok

Trying with "/usr/bin/links -source" to get

http://cpan.mirrors.ilisys.com.au/authors/id/T/T//T/Te/Term/CHECKSUMS

'glob' trapped by operation mask at (eval 25) line 1.
 at /usr/lib/perl5/5.8.8/CPAN.pm line 4265
        CPAN::Distribution::MD5_check_file('CPAN::Distribution=HASH(0xaeee4d4)', '/root/.cpan/sources/authors/id/T/T/T/Te/Term/CHECKSUMS') called at /usr/lib/perl5/5.8.8/CPAN.pm line 4246
        CPAN::Distribution::verifyMD5('CPAN::Distribution=HASH(0xaeee4d4)') called at /usr/lib/perl5/5.8.8/CPAN.pm line 3894
        CPAN::Distribution::get('CPAN::Distribution=HASH(0xaeee4d4)') called at /usr/lib/perl5/5.8.8/CPAN.pm line 4463
        CPAN::Distribution::make('CPAN::Distribution=HASH(0xaeee4d4)') called at /usr/lib/perl5/5.8.8/CPAN.pm line 4703
        CPAN::Distribution::test('CPAN::Distribution=HASH(0xaeee4d4)') called at /usr/lib/perl5/5.8.8/CPAN.pm line 4808
        CPAN::Distribution::install('CPAN::Distribution=HASH(0xaeee4d4)') called at /usr/lib/perl5/5.8.8/CPAN.pm line 2137
        CPAN::Shell::rematein('CPAN::Shell', 'install', 'Term/ReadKey') called at /usr/lib/perl5/5.8.8/CPAN.pm line 2164
        CPAN::Shell::install('CPAN::Shell', 'Term/ReadKey') called at /usr/bin/cpan line 202

Time to IM my friend and Perl expert Tony. I can obviously read Perl ok, have even worked with Perl CGI many many years ago, but this will be some configuration syntax I’m just not familar with because I don’t develop in Perl.

In summary,Tony’s (Mr Perl for today) tips we:

perl -e ‘for (@INC) { printf “%d %sn”, $i++, $_} ‘
PERL5LIB env variable
or -I command line switch, syntax is -Idirlist dirlist is a colon-separated list of one or more directories to INC
but it might need term::readkey perl uses keyword :: package or similar

And indeed it was the syntax. Now some enterprising perl person could rather then just saying Can’t locate Term/ReadKey.pm , could add try cpan Term::ReadKey to install this package.

$ cpan Term::ReadKey
CPAN: Storable loaded ok
Going to read /root/.cpan/Metadata
  Database was generated on Thu, 31 Aug 2006 11:32:19 GMT
Running install for module Term::ReadKey
Running make for J/JS/JSTOWE/TermReadKey-2.30.tar.gz
CPAN: LWP::UserAgent loaded ok
Fetching with LWP:

http://cpan.mirrors.ilisys.com.au/authors/id/J/JS/JSTOWE/TermReadKey-2.30.tar.gz

CPAN: Digest::MD5 loaded ok
Fetching with LWP:

http://cpan.mirrors.ilisys.com.au/authors/id/J/JS/JSTOWE/CHECKSUMS

Checksum for /root/.cpan/sources/authors/id/J/JS/JSTOWE/TermReadKey-2.30.tar.gz ok
Scanning cache /root/.cpan/build for sizes
TermReadKey-2.30/
TermReadKey-2.30/genchars.pl
TermReadKey-2.30/Makefile.PL
TermReadKey-2.30/Configure.pm
TermReadKey-2.30/test.pl
TermReadKey-2.30/ReadKey.pm
TermReadKey-2.30/META.yml
TermReadKey-2.30/ReadKey.xs
TermReadKey-2.30/ppport.h
TermReadKey-2.30/MANIFEST
TermReadKey-2.30/README

  CPAN.pm: Going to build J/JS/JSTOWE/TermReadKey-2.30.tar.gz

Checking if your kit is complete...
Looks good
Writing Makefile for Term::ReadKey
cp ReadKey.pm blib/lib/Term/ReadKey.pm
AutoSplitting blib/lib/Term/ReadKey.pm (blib/lib/auto/Term/ReadKey)
/usr/bin/perl -I/usr/lib/perl5/5.8.8 genchars.pl

Writing termio/termios section of cchars.h... Done.
Checking for sgtty...
        Sgtty NOT found.
Writing sgtty section of cchars.h... Done.
/usr/bin/perl /usr/lib/perl5/5.8.8/ExtUtils/xsubpp -noprototypes -typemap /usr/lib/perl5/5.8.8/ExtUtils/typemap  ReadKey.xs > ReadKey.xsc && mv ReadKey.xsc ReadKey.c
gcc -c   -D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables   -DVERSION="2.30" -DXS_VERSION="2.30" -fPIC "-I/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE"   ReadKey.c
In file included from ReadKey.xs:6:
ppport.h:230:1: warning: "PERL_UNUSED_DECL" redefined
In file included from ReadKey.xs:4:
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/perl.h:163:1: warning: this is the location of the previous definition
In file included from ReadKey.xs:375:
cchars.h: In function ‘XS_Term__ReadKey_GetControlChars’:
cchars.h:244: warning: unused variable ‘i’
ReadKey.xs: In function ‘ReadMode’:
ReadKey.xs:770: warning: unused variable ‘Perl___notused’
ReadKey.c: In function ‘XS_Term__ReadKey_SetTerminalSize’:
ReadKey.c:1999: warning: unused variable ‘targ’
Running Mkbootstrap for Term::ReadKey ()
chmod 644 ReadKey.bs
rm -f blib/arch/auto/Term/ReadKey/ReadKey.so
gcc  -shared -L/usr/local/lib ReadKey.o  -o blib/arch/auto/Term/ReadKey/ReadKey.so      
        

chmod 755 blib/arch/auto/Term/ReadKey/ReadKey.so
cp ReadKey.bs blib/arch/auto/Term/ReadKey/ReadKey.bs
chmod 644 blib/arch/auto/Term/ReadKey/ReadKey.bs
Manifying blib/man3/Term::ReadKey.3pm
  /usr/bin/make  -- OK
Running make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-Iblib/lib" "-Iblib/arch" -w test.pl
1 .. 8
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
  /usr/bin/make test -- OK
Running make install
Installing /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/ReadKey.so
Installing /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/ReadKey.bs
Files found in blib/arch: installing files in blib/lib into architecture dependent library tree
Installing /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/autosplit.ix
Installing /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/Term/ReadKey.pm
Installing /usr/share/man/man3/Term::ReadKey.3pm
Writing /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/.packlist
Appending installation info to /usr/lib/perl5/5.8.8/i386-linux-thread-multi/perllocal.pod
  /usr/bin/make install  -- OK

It's makes me cry

August 31, 2006 by ronald

I got home today and sat down to read my home email list. Nothing new. But on a MySQL mailing list, there was an enquiry why performance was slowing in a given application. I didn’t even have to read the situation, nor the problem, it took less then the 200ms mentioned to identify the problem looking at the supplied schema.

In summary, the first table in the schema had a primary key of VARBINARY(255) and a engine type of Innodb. Hold on, wait, it’s a concatenated key of two VARBINARY(255) columns. And I should mention, that primary key was a foreign key in the next table. If this was a home website app with one user, ok well it’s still bad, but this application was having performance problems with reasonable volumes of transactions, it’s not a beginner application. (A recent reference If you don’t know your data, you don’t know your application). Where do people learn this from!

Now I want to work on a large scale out MySQL environment, an organisation or site that’s going places, but the world of even normal every day applications is littered with these basic fundamental design errors. It’s make me cry.

My last contract had many flaws across the business, and the Java code for production applications was just as bad. Check out things like What constitutes a good error message to the user?, Why IT professionals get a bad name (I could easily write one of these a day for a long time, or you could check out The Daily WFT). You think the new contract with an large existing multinational company and world-wide installed Java application I could get an improvement. I’ll leave these stories for another time. I don’t want to have to deal with this.

I’ve got 16 half written blogs on cool MySQL features or interesting topics, but I never seem to get the motivation at the end of the day. I just want to know, how do people in IT get away with this. Where is the Pride in people’s work, where is the desire for people to make what they do a better place. Where is:

An opportunity that’s a challenge, including involving hard work and tight time frames, but a job that provides the rewards of job satisfaction for a productive contribution. An importance on quality, emphasis on continued improvement, and goals of simplicity in complex situations while working in a team environment are also necessary.

Your Ideal Job Requirements?

August 27, 2006 by ronald

I came across in my papers while searching for some taxation information, a list that I made to use as an evaluation in an “ideal job”. This goes with a general comment I made only a few weeks ago to a number of colleagues.

I’m looking for an opportunity that’s a challenge, including involving hard work and tight time frames, but a job that provides the rewards of job satisfaction for a productive contribution. An importance on quality, emphasis on continued improvement, and goals of simplicity in complex situations while working in a team environment are also necessary.

The list I found and made in 2002 was:

Use of Existing Core Skills – Technologies
- Oracle, Unix, Java, HTML/Web
- Open Source Projects (Apache – HTTP/Java/XML, MySQL, PHP, etc)
Use of Existing Core Skills – Experience
- Database Modelling
- Large Systems Design & Development
- Application Performance & Tuning
- Technical Problem Solving
About Newer Technologies
- Encourage/Embrace use
- Not bleeding edge
- Opportunity for learning new relative skills
Work Environment
- Equipment
- Location
- Core working hours
- Work at home options?
Team Environment
- Ability to undertaking varying roles (and not others due to a team)
- Existing working relationship with people
- Across Development/Management/Support teams
Project Environment
- Durations 1 month – 1 year
- Challenging
- Varying
- Problem Solving
Project Methodologies
- Emphasis on Quality Procedures
- Emphasis on Software Testing
- Emphasis on Customer
Remuneration
Other
- Some fun

What’s interesting to know, is that this list includes a lot of points I’d still consider essential for my “ideal job”. I’ll need to consider more a revised list, as it took some time to make this list up. It’s interesting to know that my present contract position, as well as my last position do not meet a great deal of my “ideal job” requirements.

SHOW STATUS Gotcha

August 27, 2006 by ronald

Well, it’s Sunday night so I will put this down to being the weekend. The background to being caught out is a request I made to my local Users Group mailing list for some information on people’s environments because I wanted to some empirical data analysis without having any more knowledge of the systems.

In summary (without the surrounding fan-fare, I was seeking):

SELECT VERSION();
SHOW STATUS;
SHOW VARIABLES;  // Optional

I was however perplexed why my first data point analysis (Read/Write ratio) using the Status values Com_insert, Com_update, Com_delete and Com_select was not always giving me expected results. In particular, a number of server results showed 0 for values while I knew the results came from working MySQL environments.

So, sanity check with good friend Morgan and I get the response to answer the dilemma SHOW STATUS defaults to session based statistics in 5.0+, there’s a million people it’s caught out, and many bugs about “xyz variable not being incremented”.

So I’m really seeking:

SELECT VERSION();
SHOW GLOBAL STATUS;
SHOW GLOBAL VARIABLES;  // Optional

Well, that’s my MySQL trivia point for the day. MySQL between version 4.1 and 5.0 made a major change with the default value. The MySQL 5.0 Manual also states this.

UPDATE 23 Sep 2006

Following Eric’s Show @&!# status again!, I did then read the Bug #19093, and it was great to see Monty comment (that’s open source for you, enabling even founders to mix with the common folk), and I’d tend to agree to his comments. MySQL has attempted to correct something.

Perhaps the best compromise to the problem is to deprecate and drop the SHOW STATUS command, and make the scope component GLOBAL|SESSION compulsory, so SHOW GLOBAL STATUS or SHOW SESSION STATUS are the only valid commands!

Microfox ?

August 25, 2006 by ronald

I’ve added Digg to my general lunch time reading web sites. I came across this yesterday. Microsoft invites Firefox development team to Redmond.

Well, isn’t that nice, the big boy opening his pond (including all the sharks) to the little fish. (if you don’t get it, it’s a line from Die Hard 2 about pissing in somebody else’s pond) What’s funny about the article is not the article but some of the comments, here are but a few.

Cool. Very cool.
they’re going to brainwash them into working for microsoft and their going to cast off the IE team to replace the firefox team
Just replace IE 7 with FF 2.0. Save every web developer 100s of hours.
“Come into my parlour,” said the spider to the fly
DON’T DO IT! My god, has noone seen Braveheart? The Untouchables? The Godfather? Scarface?
hey’re going to kidnap the Mozilla developers with a cunning trail of pizza and cola, and force them to work on IE8.
ITS A TRAP! http://itsatrap.ytmnd.com/

I like the last one, maybe just because of the cool picture on referenced site.

I see today an official response Firefox welcomes Microsoft’s offer of help

The fast pace of technology in a Web 2.0 world

August 25, 2006 by ronald

I had need to goto the Wikipedia this morning to review the terminology of something, and on the front page in Today’s featured article is Mercury. Being a tad curious given I’d heard only on the radio a few hours ago that Pluto was no longer a planet in our Solar System, I drilled down to the bottom to check references to other planets (quicker then searching). So at the bottom I found the following graphic and details of The Solar System Summary.

Well blow me down, they didn’t waste any time there. Pluto is no longer a planet in our Solar System. It is now categorised as a Dwarf Planet.

There are over 100 edits to this page on Dwarf Planet in the past 12 hours, including all the links in Pluto and Solar System correctly referenced.

Now, that’s the power of content by the community, one of the characteristics of Web 2.0.
Of course all of this runs under the LAMP Stack and powered by the MySQL Database. Combined with the fact this is a breed of organisation that didn’t start with large amount of Venture Capital, another trend of the newer generation of popular and successful internet enterprises.

No Spam Today

August 24, 2006 by ronald

Huh, tricked you. As if!

However I was looking at my Akismet Spam section in WordPress, the open source software that runs my blog, and it gave me this message.

Caught Spam

You have no spam currently in the queue. Must be your lucky day.

The RAT and the CAT

August 24, 2006 by ronald

No, it’s not a bedtime story, is a serious system’s design concept and I’m amazing that people don’t know about this.
As I mentioned in If you don’t know your data, you don’t know your application I was doing a Java Code Review, and I found a clear case of a much simplier solution. How simple you ask?

Well, without completing the task 100%, I achieved in less then 1 day (and lets say for the argument 1 more day of work), what is being worked on by somebody else for a week, with an estimate of 2 more weeks to complete. So let’s add 50% to my estimate, that’s a total of 3 days verses 15 days. You do the math. and yes that was last week and that task is still being worked on the same way, even with reference to my working code. Not to mention the code is a similiar magnitude of simplicity, and simplicity means cost savings in support, people so quickly forget that.

So what is this RAT and CAT:

RAT – Row At a Time
CAT – Chunk At a Time

This concept is really rather simple, however it’s amazing how implementations never consider this. Let me give you a simple example.

Your importing some external data (a flat file of 5 columns, and is inturn will be stored in 2 tables in a normalised form). So the RAT way of doing things would entail:

Using the language in question, open the file, read line by line, converting into object for reference.
Now for each row read you
- You get the values (A and B), and then do a select to see if this row already exists in one of the normalised tables (calling in X). If it does, you have the surrogate key, else you insert the row and get the surrogate key
- Now you do the same with the next set of values (C, D, E) which reference the normalised values (A and B), inserting into Y)
- And so on and so on.
- Report exceptions line by line if found

For those now laughing, it’s not really funny, it’s sad that programs are written this way. For those that say, but that’s the only way, well welcome to a different, radical and hard to learn approach.

The Cat way of doing things would entail:

Create a temporary table
Bulk load the data into the temporary table via appropiate command, e.g. mysqlimport or LOAD DATA.
Using one select, insert into X rows from temporary table that are not already present
Using one select, insert into Y rows from the temporart table that are not present, joining to X to get the appropiate surrogate key
Report exceptions via one select of data that wasn’t inserted due to some rules

It’s not rocket science people.

I should mention this is a simple example, and it’s not always possible to do CAT processing for all operations, but generally some portion of batch work can be, and the remaining must be done in a RAT way.

Updated
Some more recent articles including We need more CATs (2009) and The Art of Elimination (2010)

If you don't know your data, you don't know your application.

August 24, 2006 by ronald

The art of data modelling is definitely lost on some [most] people, or they never found it, even though they think they did. Over dinner with good friend Morgan last night we were swapping present stories on the topic.

Morgan wrote recently about I want my 4 bytes back damn it., and interesting example storing an ISBN. Further reference can be found at Getting started with MySQL of a more impractical ISBN example.

Disk is cheap now, so the attitude and poor excuse can be, well a few extra bytes doesn’t matter. Well no! If your a social hacker and have a website with a maximium concurrent connections of 2 maybe, but much like some recent Java Code Reviewing I just performed, just because the system isn’t 24×7, doesn’t give you excuse to be lazy about writing the code not to handle concurrency, thread safety and also as efficient as possible, in this case RAT verses CAT. (I’ll need to write about this, it seemed to go over some of the other professionals even)

I can remember a very specific example some 10 years ago in doing some performance analysis on a site. I’d identified the need for an additional index on a table. Now this table was sized for 200 million rows, and it already had about 70 million. The problem was adding another index required 4GB disk allocation. These things have an effect on sizing, growth and backups.

So the impact on appropiate sizing can clearly have an effect, if it was just one poorly sized column that’s acceptable (just), but normally it’s a pattern that litters a data model.

What’s important to realise is, it’s not just diskspace, it’s also memory. Without really touching on sizing data, I did mention some examples previously in Improving Open Source Databases – WordPress. Here the use of BIGINT(20) for primary keys proved my point. That’s 8 bytes, but unless you have going to have 4 billion categories, it’s a waste. It’s a waste when it’s a foreign key in a table, and it’s a big waste when it’s indexed, and that index is then in memory, and wasting more precious resources.

So how to do identify when the designer of the model has no idea about the intrinsic data value being stored? If you see tables with VARCHAR(255), that’s a clear sign. They have no idea regarding the data, so a default limit is used. Morgan referred to it as “Shooting guns in the dark with your eyes closed”. Books don’t help the cause, I was just skimming High Performance MySQL last night (one of the many freebies from the UC). There on page 82, is a table definition with not one column, but two with varchar(255). Hmmm!

If you see any new applications with VARHAR(255) they are even more lost, because MySQL 5, which has been around quite some time now, supports VARCHAR(65535). Speaking of that, has anybody seen VARCHAR(65535). I’d like to know.

Another example, is in Sheeri’s Top 8 SQL Best Practices Point 4 in regards to storing IP’s effeciently. If you log for example every page hit, this can be your largest table, and moving from varchar(15) to int can save you upto 11 bytes per row alone.

These may just be simple examples, but it’s the principle. When you define a column, consider it’s data, if you don’t know then take the time to do the reasearch and learn.

Become named in Firefox 2

August 22, 2006 by ronald

So, FireFox have come up with a novel idea to promote it’s product. Check out Firefox Day.

The official blurb: Share Firefox with a friend. If your friend downloads Firefox before September 15, you’ll both be immortalized in Firefox 2.

You can even choose how to link your names together on the “Firefox Friends Wall”. Examples like ‘my name’ Informed ‘your name’, or ‘my name’ Empowered ‘your name’, or ‘my name’ Liberated ‘your name’.

Perhaps MySQL can leverage this idea for some what to promote future download!

New Toy

August 13, 2006 by ronald

Got my new toy today, having only ordered it Thursday night, it arrived Monday. Normally Dell stuff comes from Asia, maybe they had some of these on hand locally.

A Dell 2407WFP 24″ LCD monitor with a 1920×1200 resolution.

Hmmm, not bad. I should mention $1,199 Delivered ($300 off normal price)