Digital Tech Trek Digest [#Issue 2024.08]

The One Billion Row Challenge Shows That Java Can Process a One Billion Rows File in Two Seconds

Well, it’s way under 2 seconds for the 1brc. The published results are in, and if you’re good, you can read 1 billion data points of weather data and analyze it. The final best number, as per the article release, is “00:00.323″. Yes, that answer is in milliseconds “Result (m:s.ms)”. Mind-blowing.

ScyllaDB Summit 2024

Last week, I attended this virtual event. All the presentations can be found online. I had never used the product before, so while some new features like Tablets were not as applicable in understanding the full impact, the DynamoDB performance and cost comparisons were very applicable.

So what is ScyllaDB? It is a distributed NoSQL DBaaS that speaks Cassandra protocol (do large companies still use this?), and it speaks AWS DynamoDB protocol. That is really interesting to me. You can choose a Cloud Hosted offering, or if you’re into managing your setup, you can use the Open Source ScyllaDB version available from GitHub. I started at ScyllaDB University to get a grip on the basics. I have yet to try the local Docker Compose setup.

Thanks also to the team for the swag which I received.

Playing a game with your CI/CD pipeline

My friend Sergey has created a game in GitLab called GitTerra. Drop a few lines into your .gitlab-ci.yml, and each build will give you a generated 3D map of a city based on your commit. I look forward to some of his next steps, leveraging potentially different colors for languages or different building structures for artifacts found in your commit.

We raised 11.6M to build Serverless Postgres for Modern SaaS

Congrats to Gwen and her co-founder for getting seed funding for Nile Serverless Postgres for Modern SaaS. Awesome news for an entrepreneur, and I’m very hopeful for the success of Nile.

The Safest Way to Test Postgres Destructive Queries

While I am a user of ElephantSQL serverless PostgreSQL and Neon, Nile and Xata are just a few that are competing in the space. With multiple other products that also speak PostgreSQL protocol, you can easily trial a small product in an RDBMS in the cloud at no cost. PostgreSQL is definitely outdoing MySQL in this space. You have the extensive set of NoSQL Cloud offerings, SycllaDB I just mentioned, and D1 by CloudFlare I have yet to try this branching feature for your database, sounds interesting and I’ve added to my just as long list of products to try, as books to read. Nit: It’s PostgreSQL, not Postgres.

About “Digital Tech Trek Digest”

I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short, committed time window is a summary of what I learned, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS, Micro Newsletter to name a few.

More woes with java version on Ubuntu

Armed with more information on Drizzle JDBC being a JDBC 4.0 implementation (helps to explain my issues in Getting started with Drizzle JDBC) I took the time to read about some other new JDBC 4.0 features.

There was reference to handling chained exceptions, however when trying to get this working for SQLException was more complex on Ubuntu 9.04 then I anticipated.

My first problem was an apparent source level problem.

$ javac ExampleDrizzle.java
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	    ^^^^^^^^^^^^^^^^
Syntax error, 'for each' statements are only available if source level is 1.5

That’s weird, what java version was I running now I’d changed with update-alternatives –config java yesterday.

$ java -version
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)

No issues here, a quick man reference gives me:

-1.5                    set compliance level to 1.5

I try that, and well that fixes one problem, but creates another.

$ javac -1.5 ExampleDrizzle.java
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	                  ^^
Can only iterate over an array or an instance of java.lang.Iterable

Now Class SQLException 1.6 javadocs shows SQLException as implementing the generics Iterable<Throwable>, while 1.5 javadoc does not. I guess I need to use 1.6 then.

$ javac -1.6 ExampleDrizzle.java
Annotation processing got disabled, since it requires a 1.6 compliant JVM
----------
1. ERROR in ExampleDrizzle.java (at line 14)
	for(Throwable e : sx ) {
	                  ^^
Can only iterate over an array or an instance of java.lang.Iterable

Wait a minute, I’m using a 1.6 compliant JVM. Double checking

$ ls -al /etc/alternatives/java*
lrwxrwxrwx 1 root root 36 2009-09-17 18:53 /etc/alternatives/java -> /usr/lib/jvm/java-6-sun/jre/bin/java
lrwxrwxrwx 1 root root 46 2009-09-17 18:53 /etc/alternatives/java.1.gz -> /usr/lib/jvm/java-6-sun/jre/man/man1/java.1.gz
lrwxrwxrwx 1 root root 31 2009-09-17 17:50 /etc/alternatives/javac -> /usr/lib/jvm/java-gcj/bin/javac
lrwxrwxrwx 1 root root 41 2009-09-17 17:50 /etc/alternatives/javac.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javac.1.gz
lrwxrwxrwx 1 root root 33 2009-09-17 17:50 /etc/alternatives/javadoc -> /usr/lib/jvm/java-gcj/bin/javadoc
lrwxrwxrwx 1 root root 43 2009-09-17 17:50 /etc/alternatives/javadoc.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javadoc.1.gz
lrwxrwxrwx 1 root root 31 2009-09-17 17:50 /etc/alternatives/javah -> /usr/lib/jvm/java-gcj/bin/javah
lrwxrwxrwx 1 root root 41 2009-09-17 17:50 /etc/alternatives/javah.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javah.1.gz
lrwxrwxrwx 1 root root 33 2009-09-11 10:06 /etc/alternatives/javap -> /usr/lib/jvm/java-6-sun/bin/javap
lrwxrwxrwx 1 root root 43 2009-09-11 10:06 /etc/alternatives/javap.1.gz -> /usr/lib/jvm/java-6-sun/man/man1/javap.1.gz
lrwxrwxrwx 1 root root 39 2009-09-11 10:06 /etc/alternatives/java_vm -> /usr/lib/jvm/java-6-sun/jre/bin/java_vm
lrwxrwxrwx 1 root root 38 2009-09-11 10:06 /etc/alternatives/javaws -> /usr/lib/jvm/java-6-sun/jre/bin/javaws
lrwxrwxrwx 1 root root 48 2009-09-11 10:06 /etc/alternatives/javaws.1.gz -> /usr/lib/jvm/java-6-sun/jre/man/man1/javaws.1.gz

javac is not using Sun Java 6. I have no idea how that happened, but it explains now the problem, should be checking javac version, not java version.

$ javac -version
Eclipse Java Compiler 0.894_R34x, 3.4.2 release, Copyright IBM Corp 2000, 2008. All rights reserved.

What the? I was writing Java code on this server by hand, but decided last night to install eclipse after the fact. Did this affect this. I’m not certain whether I installed eclipse before or after my work last night.

I try to change the alternatives again.

$ sudo update-alternatives --config java

There are 4 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
*         1    /usr/lib/jvm/java-6-sun/jre/bin/java
          2    /usr/bin/gij-4.3
          3    /usr/bin/gij-4.2
 +        4    /usr/lib/jvm/java-gcj/jre/bin/java

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.

$ javac -version
Eclipse Java Compiler 0.894_R34x, 3.4.2 release, Copyright IBM Corp 2000, 2008. All rights reserved.

That doesn’t work. One needs to know that java and javac operate independently.

$ sudo update-alternatives --config javac

There are 4 alternatives which provide `javac'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/lib/jvm/java-6-sun/bin/javac
          2    /usr/bin/ecj
          3    /usr/bin/gcj-wrapper-4.3
*+        4    /usr/lib/jvm/java-gcj/bin/javac

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/bin/javac' to provide 'javac'.
$ javac -version
javac 1.6.0_16

$ javac ExampleDrizzle.java

Buyer beware with Ubuntu and it’s rather messed up implementation approach toward alternative java JVM’s.

Getting started with Drizzle JDBC

In preparation for some Java work I wanted to configure and test the Drizzle JDBC Driver. Any chance to swing Drizzle into a MySQL discussion is worth the research. What I found was an issue compiling and an issue running on Ubuntu 9.04

You can start by downloading and building the Drizzle JDBC. My first problem was when I tried to build a usable .jar. I got errors in the test cases which caused by default no built .jar to work with. I raised Bug #432146 – org.drizzle.jdbc.MySQLDriverTest Tests fail. As I stated it may not be a real bug, but it seems at present that you require a running MySQL instance as well as a running Drizzle instance. In my case I didn’t have MySQL running, and I think to be fair, I should be able to build a Drizzle driver without MySQL.

Anyway, as per the Wiki Docs I proceeded to package without successful test cases. My next problem was more interesting, and perhaps found earlier from the tests?

I first created a test schema my code was going to use.

$ ~/drizzle/deploy/bin/drizzle
Your Drizzle connection id is 724
Server version: 2009.09.1126 Source distribution (trunk)

drizzle> create schema test_java;
Query OK, 1 row affected (0 sec)
drizzle> exit

I wrote a simple Java program.

$ cat ExampleDrizzle.java
import java.sql.*;

public class ExampleDrizzle {

  public static void main(String args[]) {

    try {
      Class.forName("org.drizzle.jdbc.Driver");
    } catch (Exception e) {
      System.out.println(e.getMessage());
      System.exit(1);
    }

    try {
      Connection con = DriverManager.getConnection("jdbc:drizzle://localhost:4427/test_java");
      Statement st = con.createStatement();
      st.executeUpdate("CREATE TABLE a (id int not null primary key, value varchar(20))");
      st.close();
      con.close();
    } catch (SQLException e) {
      System.out.println(e.getMessage());
    }
  }
}

Compiled.

$ javac ExampleDrizzle.java

Ran.

$ java ExampleDrizzle
org.drizzle.jdbc.Driver not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:mysql-connector-java-5.1.8-bin.jar,file:./], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}

Oops, been a while since using Java. I was amazed I could write the code in vi in the first place.

$ export CLASSPATH=drizzle-jdbc-0.5-SNAPSHOT.jar:.
$ java ExampleDrizzle
17-Sep-09 6:48:45 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol 
INFO: Connected to: localhost:4427
Exception in thread "main" java.lang.NoClassDefFoundError: org.drizzle.jdbc.DrizzleConnection
   at java.lang.Class.initializeClass(libgcj.so.90)
   at org.drizzle.jdbc.Driver.connect(Driver.java:74)
   at java.sql.DriverManager.getConnection(libgcj.so.90)
   at ExampleDrizzle.main(ExampleDrizzle.java:15)
Caused by: java.lang.ClassNotFoundException: java.sql.SQLFeatureNotSupportedException not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:drizzle-jdbc-0.5-SNAPSHOT.jar,file:./], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
   at java.net.URLClassLoader.findClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.ClassLoader.loadClass(libgcj.so.90)
   at java.lang.Class.forName(libgcj.so.90)
   at java.lang.Class.initializeClass(libgcj.so.90)
   ...3 more

Hmmm, that’s disappointing. I thought about it a minute, figured some guidance would be beneficial , so I sought out the best Java person on #drizzle IRC. Getting a name, but no response from an initial inquiry after about a half hour I thought again at the problem. Just what java are you using?

$ java -version
java version "1.5.0"
gij (GNU libgcj) version 4.3.3

$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2009-07-17 12:36 /usr/bin/java -> /etc/alternatives/java

$ sudo find / -name java
[sudo] password for rbradfor:
/usr/lib/java
/usr/lib/ure/share/java
/usr/lib/jvm/java-6-sun-1.6.0.16/bin/java
/usr/lib/jvm/java-6-sun-1.6.0.16/jre/bin/java
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/bin/java
/usr/lib/jvm/java-1.5.0-gcj-4.3-1.5.0.0/jre/bin/java
/usr/bin/java
/usr/include/c++/4.3/gnu/java
/usr/include/c++/4.3/java
/usr/local/include/google/protobuf/compiler/java

$ ls -l /etc/alternatives/j*
...
lrwxrwxrwx   1 root root    33 2009-09-17 17:50 jar -> /usr/lib/jvm/java-gcj/jre/bin/jar
lrwxrwxrwx   1 root root    39 2009-09-17 17:50 jar.1.gz -> /usr/lib/jvm/java-gcj/man/man1/jar.1.gz
lrwxrwxrwx   1 root root    35 2009-09-17 17:50 jarsigner -> /usr/lib/jvm/java-gcj/bin/jarsigner
lrwxrwxrwx   1 root root    45 2009-09-17 17:50 jarsigner.1.gz -> /usr/lib/jvm/java-gcj/man/man1/jarsigner.1.gz
lrwxrwxrwx   1 root root    34 2009-09-17 17:50 java -> /usr/lib/jvm/java-gcj/jre/bin/java
lrwxrwxrwx   1 root root    40 2009-09-17 17:50 java.1.gz -> /usr/lib/jvm/java-gcj/man/man1/java.1.gz
lrwxrwxrwx   1 root root    31 2009-09-17 17:50 javac -> /usr/lib/jvm/java-gcj/bin/javac
lrwxrwxrwx   1 root root    41 2009-09-17 17:50 javac.1.gz -> /usr/lib/jvm/java-gcj/man/man1/javac.1.gz
...

I wonder if I should use the real Sun Java.

$ sudo apt-get install sun-java6-jdk
Reading package lists... Done
Building dependency tree
Reading state information... Done
sun-java6-jdk is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
$ sudo update-alternatives --config java

There are 4 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/lib/jvm/java-6-sun/jre/bin/java
          2    /usr/bin/gij-4.3
          3    /usr/bin/gij-4.2
*+        4    /usr/lib/jvm/java-gcj/jre/bin/java

Press enter to keep the default[*], or type selection number: 1
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.

$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2009-07-17 12:36 /usr/bin/java -> /etc/alternatives/java
$ ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 36 2009-09-17 18:53 /etc/alternatives/java -> /usr/lib/jvm/java-6-sun/jre/bin/java

Yep, it took a minute to discover the update-alternatives command, lucky I didn’t try that manually.

A second try.

$ javac ExampleDrizzle.java
$ java ExampleDrizzle
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol 
INFO: Connected to: localhost:4427
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.drizzle.DrizzleProtocol close
INFO: Closing connection
Sep 17, 2009 6:54:22 PM org.drizzle.jdbc.internal.common.packet.AsyncPacketFetcher run
INFO: Connection closed

$ ~/drizzle/deploy/bin/drizzle test_java
Server version: 2009.09.1126 Source distribution (trunk)

drizzle> show tables;
+---------------------+
| Tables_in_test_java |
+---------------------+
| a                   |
+---------------------+
1 row in set (0 sec)

drizzle> desc a;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int         | NO   | PRI | NULL    |       |
| value | varchar(20) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0 sec)

And I’ve got a working testcase.

CommunityOne East – An open developer conference

With an opening video from thru-you.com – an individual taking random you-tube video and producing video mashup’s, the CommunityOne East conference in New York, NY beings.

The opening introduction was by Chief Sustainability Officer Dave Douglas. Interesting job title.

His initial discussion was around what is the relationship between technology and society. A plug for his upcoming book “Citizen Engineer” – The responsibilities of a 21st Century Engineer. He quotes “Crisis loves an Innovation” by Jonathan Schwartz, and extends with “Crisis loves a Community”.

He asks us to consider the wider community ecosystem such as schools, towns, governments, NGO’s etc with our usage and knowledge of technology.

Event: CommunityOne East in New York, NY.
Author: Ronald Bradford

Project Darkstar

It may sound like either a astronomical research project or a Star Wars spin- off, but Project Darkstar is an open source infrastructure from Sun Microsystems that states “simplify the development and operation of massively scalable online games, virtual worlds, and social networking applications.”

The advertising sounds promising like many sites, the emphasis seems to be on gaming throughout the material, interesting they threw in the term “social networking applications” specifically in opening descriptions.

I believe worthy of investigation, if only to see how that solve some classic problems. So, Learn some more, Start your rockets and Participate.

A summary introduction to Agile

Agile Development Methodology: – Most popular Implementations: Extreme Programming (XP), SCRUM, Crystal

Links

Books Highly Recommended

Extreme Programming Explained Extreme Programming Pocket Book More books on my Library page.

Tutorial – Beginner Web Services

An introduction to using Axis.

What is Axis?

Axis is essentially a SOAP engine — a framework for constructing SOAP processors such as clients, servers, gateways, etc. The current version of Axis is written in Java. But Axis isn’t just a SOAP engine — it also includes:

  • a simple stand-alone server,
  • a server which plugs into servlet engines such as Tomcat,
  • extensive support for the Web Service Description Language (WSDL),
  • emitter tooling that generates Java classes from WSDL.
  • some sample programs, and
  • a tool for monitoring TCP/IP packets.

Pre-Requisites

Installation

su -
cd /opt
wget http://apache.ausgamers.com/ws/axis/1_4/axis-bin-1_4.tar.gz
tar xvfz axis-bin-1_4.tar.gz
ln -s axis-1_4/ axis
echo "AXIS_HOME=/opt/axis;export AXIS_HOME" > /etc/profile.d/axis.sh
. /etc/profile.d/axis.sh
cp -r $AXIS_HOME/webapps/axis $CATALINA_HOME/webapps
catalina.sh stop
catalina.sh start

At this time, you should be able to confirm this installation was initially successful by going to http://localhost:8080/axis/

Installed Axis Options

The default Axis page, gives you a number of options. To confirm the installation, select the Validate Axis Link http://localhost:8080/axis/happyaxis.jsp. If there is anything missing this page will report it. In my case I was missing XML Security, which is optional.

cd /tmp
wget http://xml.apache.org/security/dist/java-library/xml-security-bin-1_3_0.zip
unzip  xml-security-bin-1_3_0.zip
cp xml-security-1_3_0/libs/xmlsec-1.3.0.jar /opt/tomcat/common/lib
catalina.sh stop
catalina.sh start

One of the links from the default home page are http://localhost:8080/axis/servlet/AxisServlet which Lists services.

First Use

One of the nicest parts of AXIS is its “instant Web service” feature called Java Web Service (JWS) — just take a Java file, rename it, and drop it into TOMCAT_HOME/webapps/axis to make all of the (public) methods in the class callable through Web services.

Quote.java

import java.util.HashMap;
import java.util.Map;

public class Quote {
  private HashMap quotes = null;
  public Quote() {
    quotes = new HashMap();
    quotes.put("Groucho Marx", "Time flies like an arrow.  Fruit flies like a banana.");
    quotes.put("Mae West", "When women go wrong, men go right after them.");
    quotes.put("Mark Twain", "Go to Heaven for the climate, Hell for the company.");
    quotes.put("Thomas Edison", "Genius is 1% inspiration, 99% perspiration.");
  }
  public String quote(String name) {
    String quote;
    if (name == null || name.length() == 0
      || (quote = (String) quotes.get(name)) == null) {
      quote = "No quotes.";
  }
  return (quote);
  }
  public int count() {
    return quotes.size();
  }
}
cp Quote.java /opt/tomcat/webapps/axis/Quote.jws

http://localhost:8080/axis/Quote.jws
http://localhost:8080/axis/Quote.jws?wsdl

More details can be found at Getting Started using Web Services with Tomcat and Axis.

What’s Next

In my next Tutorial, I’ll be moving to the practical use of Web Services using WSDL.

References

When is a batch job successful?

Simple enough question, and it’s a simple enough answer. When the batch job/process in question successfully completes what it is designed to do and not in error.

I’m attempting to test, integrate and document some developed code on a client site, and well, I’m disgusted. (as with most things, is an accumulation of a number of things that lead to these frustrations.)

The process is broken down into two parts, lets call these X and Y. Now Y is the most stable part of a long standing product, it’s the API calls to the database. X does some pre-processing, then calls Y, then reports back success/failure.

Simple enough, and these are batch processes run after hours, so operators that don’t have the business knowledge need to know success or failure.

I’ll set aside for the moment that the calling process (which is indeed a shell script wrapper around the Java code) returns a status 0-Success and 1-Failure. This is practically useless because even when X fails, it doesn’t necessary report that (another story, but part of the same frustration)

I’ve extracted a small portion of the XML response that is returned from Y, that is then inteperated by X.

...
 <Status>
  <Code>0</Code>
  <Description>Success</Description>
  <DateTime>2006-09-26T16:03:45</DateTime>
 </Status>
 <Result>
   <OutputFileName>Not.Real.Name.Output.txt</OutputFileName>
 </Result>
...
...
<Severity>
  <Code>FATAL</Code>
    <Description>A fatal error was encountered while processing. See the reason code and description for further details.</Description>
</Severity>
<Reason>
 <Code>50</Code>
 <Description>XML exception: XML parse error on line: 9, position: ....
</Reason>
...

I’m not being told, “Oh, that’s a problem”. I’m been attempted to be convinced that it’s not an error, it is success.

Well, I don’t know from what planet you have lobbed in from, but in by book, FATAL is FATAL. Check out Handling Error Levels in Logging.

What’s the most depressing is I’m expected to hand this over to the customer for testing. My job isn’t actually testing, it’s integration and documentation for the end user, but the level of quality has demanded that I test it onsite before passing on. Well, I’m not going to give this to the customer, which makes it hard when the developers (who are on the same team as me) don’t see this as a problem.

PS: The list of articles of this nature has grown to the point, I’ve created my own “The Daily WFT” category. I’ve had a lot of stories I’ve never written about, perhaps I’ll pen a few more now.

Handling Error Levels in Logging

In reviewing some provided code to a client, I observed a number of actions contray to generally accepted practices regarding logging. This is what I provided as the general programming conventions with regardings to logging.

Using Log4J (http://logging.apache.org/log4j/docs/), which is generally accepted as the benchmark for all Java applications. This provides the following logging levels.

  • FATAL
  • ERROR
  • WARN
  • INFO
  • DEBUG
  • TRACE – from 1.2.12, latest is 1.2.13

A description for what handling should occur per logging level.

FATAL. As the name suggests, all processing should stop. Should logging include a FATAL, the process is a Failure.

ERROR. An error has occured, and this requires attention, and action. Generally processing should stop, however additional post processing, or an alternative path could occur. Should logging include an ERROR, the process is a Failure.

WARN. Something that is unexpected occured, however it doesn’t affect the general processing from succeeding successfully. If a process includes WARN and not FATAL/ERROR it should be considered successful.

INFO. Information Only. On high volume systems, this level of logging may even be turned off. This generally indicates key information values or steps, and can assist when enabled in longer running processes to identify where a process is. You don’t log errors at INFO.

DEBUG. For Debugging Purposes only.

Ok, well that sounds like common sense. Here is what I observed on a client site.

  • Code logs a FATAL, but continues processing
  • A FATAL is logged, yet the calling process reports success
  • An ERROR is logged, yet the calling process reports success.
  • A lot of WARN are logged, and this is misleading, as it appears more information regarding XML elements not processed (We are talking 20+ Warnings per batch process). From what I’ve observed, these don’t require futher action, and should be changed in INFO.
  • Errors are being logged as INFO. A NullPointer RunTime Exception is logged as INFO. If an error provides an Exception argument where a stack trace is printed, it ain’t an INFO message.

Securing a Tomcat Webapp – Part 2

If you wish to password protect your webapp with an Apache .htaccess type authentication model, you require two configuration steps. The first within your WEB-INF/web.xml, add the following replacing rolename appropiately.

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>All Pages</web-resource-name>
        <url-pattern>*.htm</url-pattern>
        <url-pattern>*.html</url-pattern>
   </web-resource-collection>
    <auth-constraint>
       <role-name>rolename</role-name>
    </auth-constraint>
  </security-constraint>

  <!-- Define the Login Configuration for this Application -->
  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Test Application</realm-name>
  </login-config>

  <!-- Security roles referenced by this web application -->
  <security-role>
    <description>
      The role that is required to log in to the Application
    </description>
    <role-name>rolename</role-name>
  </security-role>

Second, within the tomcat $CATALINA_HOME/conf/server.xml, you need to define the Realm used within the appropiate host’s <Engine> definition.

  <Realm className="org.apache.catalina.realm.UserDatabaseRealm" debug="0" resourceName="UserDatabase"/>

This Realm connects with a known resource, which I define with the $CATALINA_HOME/conf/server.xml <GlobalNamingResources> definition.

<Resource name="UserDatabase" auth="Container"
          type="org.apache.catalina.UserDatabase"
          description="User database that can be updated and saved">
</Resource>
<ResourceParams name="UserDatabase">
    <parameter>
        <name>factory</name>
        <value>org.apache.catalina.users.MemoryUserDatabaseFactory</value>
    </parameter>
    <parameter>
        <name>pathname</name>
        <value>conf/custom/users.xml</value>
    </parameter>
</ResourceParams>

NOTE: The use of MemoryRealm has limited uses. Tomcat provides 5 different Realm implementations including JDBC, DataSource,JINDI, Memory and JAAS.

And of course you need to define your user authentication within the appropiately defined users file. In this case conf/custom/users.xml

Securing a Tomcat Webapp

If you require a webapp to always run in https mode using a SSL key, then you need to add the following to your WEB-INF/web.xml configuration.

 <security-constraint>
    <web-resource-collection>
        <web-resource-name>jsp</web-resource-name>
        <url-pattern>*.htm</url-pattern>
        <url-pattern>*.html</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
        <transport-guarantee>CONFIDENTIAL</transport-guarantee>
    </user-data-constraint>
</security-constraint>

Mercurial Version Control Software

I got asked (being a Java developer) about what was involved in creating an Eclipse Plugin for Mercurial. Well in true Google style, why invent when somebody probably already has. A quick check finds Mercurial Eclipse by VecTrace.

Now until last week, I’d never heard of Mercurial, so this is really an introduction to somebody that has no idea.

What is Mercurial?

Mercurial is a fast, lightweight Source Control Management system designed for efficient handling of very large distributed projects.

Ok, so big deal, I use CVS. I also use Subversion (SVN) for my Apache contributions, and also for MySQL GUI products. Why do we need another Version Control Product? Mercurial is a Distributed Software Configuration Management Tool. The following is from the Mercurial Wiki

A distributed SCM tool is designed to support a model in which each Repository is loosely coupled to many others. Each Repository contains a complete set of metadata describing one or more projects. These repositories may be located almost anywhere. Individual developers only need access to their own repositories, not to a central one, in order to Commit changes.

Distributed SCMs provide mechanisms for propagating changes between repositories.

Distributed SCMs are in contrast to CentralisedSCMs.

So, clearly a distributed model would well in a large distributed organisation where external factors limit continous access to a single central repository. Low Bandwidth, Poor Internet Connectivity, being on a plane, and travelling are all things that would make a distributed model a more ideal solution. I know I’ve taken my laptop away, and being an “Agile Methodology” developer, I commit often. When you have several days of uncommitted work it goes against the normal operation.

You can get more information at the official website at http://www.selenic.com/mercurial. A few quick links are: Quick Start Guide, Tutorial, Glossary.

Installing Mercurial

su -
cd /src
wget http://www.selenic.com/mercurial/release/mercurial-0.9.tar.gz
tar xvfz mercurial-0.9.tar.gz
cd mercurial-0.9
# NOTE: Requires python 2.3 or better.
python -V
python setup.py install --force

A quick check of the syntax.

$ hg
Mercurial Distributed SCM

basic commands (use "hg help" for the full list or option "-v" for details):

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 init       create a new repository in the given directory
 log        show revision history of entire repository or files
 parents    show the parents of the working dir or revision
 pull       pull changes from the specified source
 push       push changes to the specified destination
 remove     remove the specified files on the next commit
 revert     revert files or dirs to their states as of some revision
 serve      export the repository via HTTP
 status     show changed files in the working directory
 update     update or merge working directory

A more detailed list:

$ hg help
Mercurial Distributed SCM

list of commands (use "hg help -v" to show aliases and global options):

 add        add the specified files on the next commit
 annotate   show changeset information per file line
 archive    create unversioned archive of a repository revision
 backout    reverse effect of earlier changeset
 bundle     create a changegroup file
 cat        output the latest or given revisions of files
 clone      make a copy of an existing repository
 commit     commit the specified files or all outstanding changes
 copy       mark files as copied for the next commit
 diff       diff repository (or selected files)
 export     dump the header and diffs for one or more changesets
 grep       search for a pattern in specified files and revisions
 heads      show current repository heads
 help       show help for a given command or all commands
 identify   print information about the working copy
 import     import an ordered set of patches
 incoming   show new changesets found in source
 init       create a new repository in the given directory
 locate     locate files matching specific patterns
 log        show revision history of entire repository or files
 manifest   output the latest or given revision of the project manifest
 merge      Merge working directory with another revision
 outgoing   show changesets not found in destination
 parents    show the parents of the working dir or revision
 paths      show definition of symbolic path names
 pull       pull changes from the specified source
 push       push changes to the specified destination
 recover    roll back an interrupted transaction
 remove     remove the specified files on the next commit
 rename     rename files; equivalent of copy + remove
 revert     revert files or dirs to their states as of some revision
 rollback   roll back the last transaction in this repository
 root       print the root (top) of the current working dir
 serve      export the repository via HTTP
 status     show changed files in the working directory
 tag        add a tag for the current tip or a given revision
 tags       list repository tags
 tip        show the tip revision
 unbundle   apply a changegroup file
 update     update or merge working directory
 verify     verify the integrity of the repository
 version    output version and copyright information

Mercurial Eclipse Plugin

The plugin is still in it’s early days, but the “FREEDOM” of open source enables me to easily review. After a quick install and review of docs, I shot off an email to the developer, stating why I was looking, and while I have other projects on the go, I asked what I could do to help. It’s only be 2 days and we have already communicated via email several times on various topics. That’s one reason why I really love the Open Source Community. Generally people are very receptive to feedback, comments and especially help.

Within Eclipse

  • Help ->Software Updates-> Find and install…
  • Select “Search for new features to install”, click Next
  • Click “New Remote site…”
  • Enter following details and click Ok
    • Name: MercurialEclipse Beta site
    • URL: http://zingo.homeip.net:8000/eclipse-betaupdate/
  • Follow the prompts to accept the license and download.

So now with Eclipse, on a project you can simply go Right Click -> Team -> Share Project -> Select Mercurial

A Quick Mercurial Tutorial

Of course the quickest way to learn about using Mercurial is to look at an existing product. So taking this plugin project for a spin.

$ cd /tmp
$ hg clone http://zingo.homeip.net:8000/hg/mercurialeclipse com.vectrace.MercurialEclipse
$ hg clone com.vectrace.MercurialEclipse example
$ cd example
# Create some new dummy files
$ touch test.txt
$ touch html/test.html
# View files against respository status
$ hg status
? html/test.html
? test.txt
# Add the new files
$ hg add
adding html/test.html
adding test.txt
# Commit changes
$ hg commit -m "Testing Mercurial"

So other then the second clone command (which enabled me to not mess up the original repository and to test distributed handling next), this is just the same as CVS (checkout, diff, add, commit)

# The Distributed nature involves first Pulling from the "upstream" respository
$ hg pull ../com.vectrace.MercurialEclipse
pulling from ../com.vectrace.MercurialEclipse
searching for changes
no changes found
# Confirm our new file is not in "upstream" respository
$ ls ../com.vectrace.MercurialEclipse/test.txt
ls: ../com.vectrace.MercurialEclipse/test.txt: No such file or directory
# Push local respository changes to "upstream" respository
$ hg push ../com.vectrace.MercurialEclipse
pushing to ../com.vectrace.MercurialEclipse
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 2 changes to 2 files
$ ls ../com.vectrace.MercurialEclipse/test.txt
ls: ../com.vectrace.MercurialEclipse/test.txt: No such file or directory

Hmmm, missed something here. The Quick Start Guide docs seems to want to work back into the original respository, pulling in changes from the “downstream” repository, then executing a merge and commit command. I don’t want to do this just in case I start messing up the original repository. Time for me to create a new standalone repository so I don’t screw anything up. Stay tuned for an updated tutorial.

Of course this is very basic, I need to look into commands used in CVS like login, tagging (versioning), branching, diffing revisions and merging but this is a start.

I do have a concern with the number of distributed respositories, when and how do you know you are in sync with the “master”, indeed what is the “master”. It does merit some investigation to see what management is in place, like identifying all respositories, and comparing for example.

Conclusion

Ok, so now I’ve got a grasp on Mercurial, time to review the Java Code and see what works and what doesn’t in the Eclipse environment. Of course I don’t use Mercurial so what I may consider as functionality required, may be lower priority to those users out there. Any feedback I’m sure will be most appreciated by the original developer.

Generating an internal SSL Certificate (for tomcat)

How to Generate an internal SSL certificate

Create the self-signed keystore

$ su -
$ URL="your.url.here";export URL
$ cd /opt/tomcat/conf
$ keytool -genkey -alias ${URL} -keyalg RSA -keystore ${URL}.keystore
Enter keystore password:  changeit
What is your first and last name?
  [Unknown]:  your.url.here
What is the name of your organizational unit?
  [Unknown]:  IT
What is the name of your organization?
  [Unknown]:  your.url.here
What is the name of your City or Locality?
  [Unknown]:  Brisbane
What is the name of your State or Province?
  [Unknown]:  QLD
What is the two-letter country code for this unit?
  [Unknown]:  AU
Is CN=your.url.here, OU=IT, O=your.url.here, L=Brisbane, ST=QLD, C=AU correct?
  [no]:  yes

Enter key password for <your.url.here>
        (RETURN if same as keystore password):

Turn the keystore into a X.509 certificate

$ keytool -export -alias ${URL} -keystore ${URL}.keystore -rfc -file ${URL}.cert
Enter keystore password:  changeit
Certificate stored in file <your.url.here.cert>

Delete existing trusted certificate

$ keytool -delete -alias ${URL} -file ${URL}.cert  -keystore /opt/java/jre/lib/security/cacerts  -storepass changeit

Import the certificate into cacerts – JRE trusted certificates

$ keytool -import -alias ${URL} -file ${URL}.cert  -keystore /opt/java/jre/lib/security/cacerts  -storepass changeit
Owner: CN=your.url.here, OU=IT, O=your.url.here, L=Brisbane, ST=QLD, C=AU
Issuer: CN=your.url.here, OU=IT, O=your.url.here, L=Brisbane, ST=QLD, C=AU
Serial number: 44ab628c
Valid from: Wed Jul 05 01:56:12 CDT 2006 until: Tue Oct 03 01:56:12 CDT 2006
Certificate fingerprints:
         MD5:  EC:76:01:04:7F:FC:21:CC:A8:41:AD:86:C8:B2:D5:6D
         SHA1: 2D:FD:7C:56:65:70:36:1B:1D:71:09:41:84:98:E6:8E:89:18:BC:18
Trust this certificate? [no]:  yes
Certificate was added to keystore


If you replaced an existing certificate you will need to restart Tomcat.

Guidelines for managing embedded external project dependencies

I’ve yet to find any Java project that doesn’t have dependancies on some other Open Source external libraries. I’ve yet to find a Java project that manages these external dependencies appropiately for support and integration at an enterprise level.

As with most projects, understanding an applying sound principles that scale will help you at a later date, and generally the cost of implementation is minimual at the start, but of course becomes more expensive when it’s really needed. The classic case is Version Control. For over 10 years, even on small single developer projects, I’ve used Version Control, it should be taught at university as an introduction to good programming design, it would greatly benefit software development and maintenance.

Back onto the topic of hand. Let’s use a moderate Java Web Based application, and for the purposes of this discussion the following Open Source external libraries are incoporated. Log4J, JUnit, Canoo WebTest, MySQL JDBC, Apache Commons (Collections, DHCP, Pool, HTTPClient, Taglibs Mailer). I could continue, but this will suffice for the demonstration.

It’s very easy for your project to include the appropiate jar’s such as (log4j.jar, junit.jar, commons-pool.jar etc), however this is where support and integration with other products fall down.

A Controlled Approach

You need to keep a seperate repository (under source code control of course) of your external libraries, and this becomes the source across all corporate projects. This is to include the following for each library:

  • The actual deployed jar
  • The matching source code of the deployed jar
  • Java Documentation of deployed jar

Versioning

Log4J is an example of an Open Source project that does version their jars, will many other open source projects do not. Why don’t they? Well one reason is to enable people to upgrade easily but simply overriding existing versions, and processes that have specific CLASSPATH’s are not affected. Generally today, implementations of software include all jar’s within a specified directory so I don’t see the problem.
Log4J gives in this example a log4j-1.2.12.jar for deployment purposes. When libraries do not include a version number, the are to be specifically added. This adds another small delemma of standards. The general practice is to use the hyphen ‘-‘. followed by the product version using the full-stop ‘.’, however there are projects that don’t follow this.

Version Recording

So now we have for each external library, an appropiately versioned jar, and matching source and documentation. This is the initial baseline. What’s needed is a simple HTML index that manages this information for use. The Index should include:

  • Product Name
  • Product URL
  • Repository Version
    • Version Number
    • Version Date
    • Download Date
  • Latest Version
    • Version number
    • Date
    • Comment

You may ask, why do you record the Latest Version, when the practice should be to always get the latest version. JUnit is a good example, the present version 4.x, requires a JDK 1.5.x deployment, and if your application is running only 1.4.x, then you can only use older 3.8.x versions.s mentioned earlier,

Management

Having an index of external libraries is one thing, correct use and management is the most important step. Let’s assume we have taken the time to download and document the required libraries from our example, and everything has been deployed into our first project.

Now, a 2/3 month task of checking for updated versions can be scheduled. Withing this process, newer versions can be downloaded and recorded appropiately. In our example, Log4J now has 1.2.13. Updating the external libraries repository is the simple part, the next step is to notify all coporate projects of the new version, and to encourage uptake. This may not always occur in a timely manner, but with at least this baseline in place and when there are issues, standardisation on the known coporate version is the first step.

Dependencies

Within each project libaries, a readme that details which versions of which external jars are included andwhen they were last updated from the repository should be done. Noting this information with the both the external libraries repository and the project repository provides a paper trail. In addition, should there be any exceptions, this is the place this information can be reported.

External Projects

Canoo WebTest is a good example of an external project that also includes other external libraries such as Log4J, JUnit, HtmlUnit, NekoHtml.
Problems arise when these products may use and implement older or unknown versions of libraries.

Internal Projects

Having internal projects that are dependent on other internal projects is nothing new in a large corporate enterprise. The problem arises when a spaghetti of undocumented dependancies causes a management nightmare. Let’s take this real life example.

Product A has included jars of Product B, Product C, Product D and Product E. The Product C actually has a different version of Product D embedded within in. Product D which now is included twice also includes Product E, so there are now three copies of this, all are different in size, and all a binary only with no version numbers, and no corresponding matching source code. Does this sound bad? Well it is. How it ever worked is still amazing.

This mess could have been managed first with Version Control (a basic 101 in software development), and an appropiate management of external libraries, and a similar approach to internal libraries.

An Example

This is a great example to highlight the cost in lack of appropiate management. I was supporting an existing large scale project (1000+ users) (let’s call this Product A), and the integration of a new project (let’s call this Product B) had been passed from the development team for implementation, testing and release. A threaded process, it would simply just hang after some initialisation, no notice, no errors, just nothing. Not withstanding that something should have been better reported for the errors. Due to 7 possible log files between the software application and the application server, nothing was reported, but that’s another topic.

The final result was Product B had introduced the use of org.apache.log4j.Logger.trace(), a new more granular logging then debug(). The appropiate Log4J jar had been included in the product, and this was Version 1.2.12. However, Product A, which was using Product B, was bundled with an earlier version of Log4J, Version 1.2.8, and this version didn’t support this new method.

While it took a few hours of debugging to find this problem, it was made easier because at least these jar’s were version, of the 20-30 jars across products only 3-4 were versioned. Similar problems with QName and XMLBeans unamed jars prior to this took days to resolve (indeed one had to be worked around as it couldn’t be resolved).

A further complication in this process was when Product B was introduced. This was developed and built under Linux, while Product A was still being maintained under Windows. From the experience of Integration is was found that the order of loading within the classloader of a commerical application server differed between operating systems.

What constitutes a good error message to the user?

Today, will go down in my professional history as quite possibly the lowest I would ever think of a software developer. I’ve carefully avoided the term “fellow coder”, speaking of a IT industry sticking by fellow IT people, but not today.

I presently support an existing production system, some 1000+ users that’s been in place around 3 years in which I’ve had no prior involvement until recently. Integration with other external parties within the system have provided an esclation in errors being reported in this external communication, and lack of adequate feedback to the user is another topic. Email is the means of reporting administratively of user errors, again another topic of issue. Within these emails, which are almost impossible to manage due to the limited internal GUI only toolset and lack of access to actual email account files to automate scripting (yet another topic? Do you see a trend here), is some relevent information regarding the transaction, and then most importantly the error message. The thing I need to investigate the problem.

The line reads (removing some stuff):


Error Code: 234567892

Ok, well a little cryptic, but surely you can work out from the external system what this means. Investigation of some more errors, in the mail GUI product, yet another series of open windows (you can’t view messages like a regular mail client with a summary list and a detail panel), provides for a trend in the error messages:


Error Code: 1235492567
Error Code: -434783134
Error Code: 34345199

The trend being there is none. Of course today by mid morning the email error count is into the hundreds, and I’m none the wiser. Well time to closely investigate the code management (as I’ve already contacted the external party, and asked if I can provide some error codes to receive greater information).

The following constitutes the two lines of code taken in the determination of the error messages I’ve shown so far. Note, this code takes the external system response, and then “attempts to determine usefull error content for presentation back to the user”.


errorNo = new Random().nextInt();
error = “Error Code: ” + errorNo;

Now while everybody laughed out loud, including fellow developers, DBA, IT manager, the Business Owners and Users (which can’t read code but could understand the first of these two lines I highlighted), and yes it really was funny in context with a bigger picture, but really it wasn’t funny at all. Some things make my blood boil, and this was one of them. With all the time lost between multitudes of people, users, call centre etc, I’d never felt a stronger conviction to hunt down the developer that wrote this.

The end of the story is after even trolling old CVS repository entries I was unable to piece sufficient information to determine the author. Most likely done pre version control, and then that trail leads to only a few names I’ve heard mentioned before.

I’d like to see somebody top that one!

How I work

My work life is really fragmented at present, so I’ve decided a split approach in answer to Dave Rosenberg’s How I Work–what I have learned so far .

What is your role?
Support Developer providing sole support an internal client web based system (Java, Oracle) with 1000+ users producing >$1m daily revenue. Independent Database Consultant. Specialising in Database Modelling, Large Systems Design and Web Development Technologies. Strong proponent of Agile Development Methodologies (specifically Extreme Programming – XP).
What is your computer setup?
Dell Optiplex GX280 (P4, 1GB, 17″ CRT) running Redhat FC4.
(When I started only 2 months ago I got lumped with a similar speced machine running Windows XP, and upteen windows apps (like 4 different versions of TOAD for example) . This OS was contry to my understanding provided in the interview process.
Primary – Dell Inspiron 5150 (P4/3.2GHz, 1GB, 120GB, 15″ + 21″ CRT)
Secondary – Generic (P3/600MHz, 1GB, lots of HDD ~500GB, 17″ CRT)
Both with very impressive Saitek backlit keyboards (one red, one blue) Great at night. See image at bottom.
What desktop software applications do you use daily?
RedHat Fedora Core 4
Eclipse 3.1, FireFox 1.5, Thunderbird 1.5, WebSphere Application Server 5.1, J2DK 1.4.2, Oracle SQL Developer, Open Office 2, XMMS and SSH client (which I use most)
Due to legacy internal systems and support I also must run under Wine (Internet Exploder), and First Class (email client).
Not to stop there, I also must run under Citrix ICA Client apps (FocalPoint, Heat Call Logging, and Microsoft Word for internal forms that won’t work under OO2.) And before somebody suggests why not try VMWare or other clients, I have tried, but software like Focalpoint can’t install?
CentOS 4.3
FireFox 1.5, ThunderBird 1.5, Gaim, SSH, Skype, Open Office 2.
Maybe not all of these every day, but some combination of each day –> MySQL 4.1, MySQL 5.0, MySQL 5.1, MySQL Workbench, Eclipse 3.1, J2DK 1.4.2, J2DK 5.0, Apache Tomcat 5.0.28, Apache Httpd 2.0.53, JMeter.

Presently also configuring a new laptop drive running Ubuntu Dapper 6.06 RC. For the Record, Beta 2, Flight 6 and Flight 7 all failed.
What websites do you visit every day?
Internal Wiki (all day)
At Lunch Google Sci-Tech News
PlanetMySQL
Google Sci-Tech News
PlanetMySQL (if I haven’t already got to them).
What mobile device or cell phone do you use?
N/A NEC e606 3G phone (which I’ve had for probably 3 years now) operating on a 3G network with video calls, again for 3 years.
Do you use IM?
No. All access is blocked other then an internal Jabber server, that I use rarely, never for communication, just a cut/paste of command or syntax. Speaking of blocked, SSH access to my production servers is blocked, and even reading news like Skypes Call Out is blocked by WebSense. Extensively, however due to current employement policy, I’m very hamstrung unless before/after hours.
I use Gaim as I have AIM, MSN and Google Talk accounts, and Skype. My preference was always AIM, but as clients come and go, I’ve had to accumulate more accounts.
Do you use a VoIP phone?
No. Not at present, however for many years I worked with US clients and used Packet 8. Still have the hard phone somewhere.
I’ve also used Skype talk for one or one or conference calls. Of late in Australia to New Zealand and Singapore. Indeed, the quality to Singapore has been excellent, when living in the US, calls to Singapore on Skype were clearer then my Packet8 phone to US numbers.
Do you have a personal organization/time management theory?
Current contract employees a number of disjointed methods, which in observation just shows so much inefficency, it’s worth documenting just to highlight what not to do. We have daily team meetings (10 mins), each listing your top 2 daily tasks. Weekly we have to also submit weekly goals. Weekly combined meetings with another team where we again give weekly top 2 tasks. We use two seperate systems (with manual double entry) for work identification, one for call centre logging issues, and Bugzilla for software bugs, and enhancements. We also use XPlanner (again duplicating a lot of tasks) for time management.
With all this rigid structure, I am daily given either other work to do, or investigate, and in over 3 months, I would rarely end a week anywhere near where it was so described at the start of the week.
With all the technology possible, I actually do not have any electronic management gadget, never had. I use a combination of notebook, plain paper (usually for daily notes etc, which I discard regularly), a diary, and normally a lot of emails which I normally send to/from home.
Given that email is used so much, I basically use Draft Emails for any electronic notes.

Anything else?
Perhaps there is merit in How I work now, and How I’d like to work now.

Saitek Keyboard

The GWT!


New to the AJAX vertical space is the Google Web Toolkit (GWT) released the the Sun Java One Conference last week.

AJAX (“Asynchronous Javascript and XML”) isn’t new, infact the underlying requirements within AJAX, the DHTML, DOM manipulation and XMLHttpRequest were available in 1997. In fact, I implemented functionality to perform what AJAX does back in the late 90’s, probably starting 1999, using solely Javascript, and some of that is still in use today on at least one of my sites. Of course Google made this functionality popular with it’s use in Google Suggest a few years ago.

So what is GWT? An extract from the Google Web Toolkit Web Page.

Google Web Toolkit (GWT) is a Java software development framework that makes writing AJAX applications like Google Maps and Gmail easy for developers who don’t speak browser quirks as a second language. Writing dynamic web applications today is a tedious and error-prone process; you spend 90% of your time working around subtle incompatibilities between web browsers and platforms, and JavaScript’s lack of modularity makes sharing, testing, and reusing AJAX components difficult and fragile.

GWT lets you avoid many of these headaches while offering your users the same dynamic, standards-compliant experience. You write your front end in the Java programming language, and the GWT compiler converts your Java classes to browser-compliant JavaScript and HTML.

The definition of a Unit Test

A Test is not a Unit Test if:

  • It talks to the database
  • It communicates across a network
  • It touches the filesystem
  • It can’t run the same time as any of your other unit tests
  • You have to do special things to your environment to run it (e.g. editing config files)

Why IT professionals get a bad name

Sometimes you just can’t find words to describe bad code, and if you are forced into maintenance it can be a mindfield. I’m presently supporting an existing deployed Web Based Java application, which I’ve had no involvement with previously, and for lack of any complements it’s absolutely terrible.

How this code was ever written, and of course never reviewed (that’s another talk on software development) one can only shutter. If it was an isolated case, then perhaps, but the code is bloated, and riddled with unnecessary complexity. Basic 101 practices such as Standard CSS, well formed html pages (ie, not with double <html> and <body> tags), and Javascript functions that are not duplicated up the warzoo are things you can discover spending 2 mins looking at just the first screen before you even look at the Java Code.

Here is an example.

tempKeyStr = tempKeyStr.substring(0,2).concat(String.valueOf(Integer.parseInt(tempKeyStr.substring(2).trim())));

Ok, I guess I should give you a hint of the type of values in the tempKeyStr prior to this wonderful transformation. This string contains a 2 alphanumerical character prefix, then optionally a space, then a numeric number (usually 2, 3 or 4 digits). An example may be ‘XX 999′ or ‘AA99′.

This will help you to identify what this lines now does?

Ok, well to save the suspense, the purpose of the above line of code is to trim and squeeze any whitespace from the string. So something like the following would surfice.

tempKeyStr = tempKeyStr.replaceAll(" ","");

Of course refactoring would totally eliminate this line, placing the replaceAll syntax in a previous declaration.
Those observant Java guru’s would also state, but what if this was written 5 years ago, replaceAll is only available since 1.4. Yes, correct, but Pattern has always existed, replaceAll is just a convience wrapper for this, and this type conversion from String to Integer to int to String is unnecessary complexity.

On that note of refactoring, I was sent the following link recently Refactoring to the Max. As the title suggests, sometimes people go to far.

Integrating SVN into Eclipse

Being a CVS Version Control Person, I’ve had to learn Subversion as part of Open Source Contribution. Both MySQL and JMeter use SVN.

Steps for integration of SVN into Eclipse IDE. Refer to Subclipse for more information.

Installation of JMeter via SVN

  • Start Eclipse
  • Help | Software Updates | Find and Install
  • Search For New Features To Install [Next]
  • [New Remote Site]
  • URL: http://subclipse.tigris.org/update_1.0.x
  • [Finish]
  • Tick Subclipse [Next]
  • I accept Terms and Condtions [Next]
  • Finish

Operation

  • Restart Eclipse
  • Window | Open Perspective | Other | SVN Repository Exploring
  • Right Click New | Repository Location
  • URL http://svn.apache.org/repos/asf/jakarta/jmeter/trunk/ [Finish]
  • Right Click on Node, Checkout
  • Check out as project in Workspace [Finish]

Contributing to JMeter

As part of my using JMeter for the purpose of testing a new Transactional storage engine PBXT for MySQL, I’ve been investigating the best approach for handling transactions. Read more about earlier decisions at my earlier post Testing a new MySQL Transactional Storage Engine.

I found that the JMeter JDBC Sampler only supports SELECT and UPDATE Statements, and not calls to stored procedures. This is just one approach I’m considering taking.

Well, I guess it’s time to contribute code to an Apache Project. I’ve modified code and logged bugs before for Tomcat, but this will be my first attempt of modify code and submit.

A summary of what I did (really for my own short term memory):

Now I just have to wait to see if it’s accepted. Regardless, it works for me. And that’s Open Source. FREEDOM

svn checkout http://svn.apache.org/repos/asf/jakarta/jmeter/trunk/ jmeter

$ svn diff JDBCSampler.java > JDBCSampler.java.patch
$ cat  JDBCSampler.java.patch
Index: JDBCSampler.java
===================================================================
--- JDBCSampler.java    (revision 388876)
+++ JDBCSampler.java    (working copy)
@@ -23,6 +23,7 @@
 import java.sql.ResultSetMetaData;
 import java.sql.SQLException;
 import java.sql.Statement;
+import java.sql.CallableStatement;

 import org.apache.avalon.excalibur.datasource.DataSourceComponent;
 import org.apache.jmeter.samplers.Entry;
@@ -45,6 +46,8 @@

        public static final String QUERY = "query";
        public static final String SELECT = "Select Statement";
+       public static final String UPDATE = "Update Statement";
+       public static final String STATEMENT = "Call Statement";

        public String query = "";

@@ -69,6 +72,7 @@
                log.debug("DataSourceComponent: " + pool);
                Connection conn = null;
                Statement stmt = null;
+               CallableStatement cs = null;

                try {

@@ -88,14 +92,19 @@
                                        Data data = getDataFromResultSet(rs);
                                        res.setResponseData(data.toString().getBytes());
                                } finally {
-                                       if (rs != null) {
-                                               try {
-                                                       rs.close();
-                                               } catch (SQLException exc) {
-                                                       log.warn("Error closing ResultSet", exc);
-                                               }
-                                       }
+                                       close(rs);
                                }
+                       // execute stored procedure
+                       } else if (STATEMENT.equals(getQueryType())) {
+                               try {
+                                       cs = conn.prepareCall(getQuery());
+                                       cs.execute();
+                                       String results = "Executed";
+                                       res.setResponseData(results.getBytes());
+                               } finally {
+                                       close(cs);
+                               }
+                       // Insert/Update/Delete statement
                        } else {
                                stmt.execute(getQuery());
                                int updateCount = stmt.getUpdateCount();
@@ -112,20 +121,8 @@
                        res.setResponseMessage(ex.toString());
                        res.setSuccessful(false);
                } finally {
-                       if (stmt != null) {
-                               try {
-                                       stmt.close();
-                               } catch (SQLException ex) {
-                                       log.warn("Error closing statement", ex);
-                               }
-                       }
-                       if (conn != null) {
-                               try {
-                                       conn.close();
-                               } catch (SQLException ex) {
-                                       log.warn("Error closing connection", ex);
-                               }
-                       }
+                       close(stmt);
+                       close(conn);
                }

                res.sampleEnd();
@@ -164,6 +161,38 @@
                return data;
        }

+       public static void close(Connection c) {
+               try {
+                       if (c != null) c.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing Connection", e);
+               }
+       }
+
+       public static void close(Statement s) {
+               try {
+                       if (s != null) s.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing Statement", e);
+               }
+       }
+
+       public static void close(CallableStatement cs) {
+               try {
+                       if (cs != null) cs.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing CallableStatement", e);
+               }
+       }
+
+       public static void close(ResultSet rs) {
+               try {
+                       if (rs != null) rs.close();
+               } catch (SQLException e) {
+                       log.warn("Error closing ResultSet", e);
+               }
+       }
+
        public String getQuery() {
                return query;
        }

$ svn diff JDBCSamplerBeanInfo.java > JDBCSamplerBeanInfo.java.patch
$ cat JDBCSamplerBeanInfo.java.patch
Index: JDBCSamplerBeanInfo.java
===================================================================
--- JDBCSamplerBeanInfo.java    (revision 388876)
+++ JDBCSamplerBeanInfo.java    (working copy)
@@ -50,7 +50,7 @@
                p.setValue(NOT_UNDEFINED, Boolean.TRUE);
                p.setValue(DEFAULT, JDBCSampler.SELECT);
                p.setValue(NOT_OTHER,Boolean.TRUE);
-               p.setValue(TAGS,new String[]{JDBCSampler.SELECT,"Update Statement"});
+               p.setValue(TAGS,new String[]{JDBCSampler.SELECT,JDBCSampler.UPDATE,JDBCSampler.STATEMENT});

                p = property("query");
                p.setValue(NOT_UNDEFINED, Boolean.TRUE);

Update

Good to know somebody read my post, and responded positively. The quickest way for patches is to log a Bugzilla request. Seemed somebody already had, so it was easy for me to just to contribute to Bug #38682

Testing a new MySQL Transactional Storage Engine

As part of my A call to arms! post about a month ago, I’ve had a number of unofficial comments of support. In addition, I’ve also been approached to assist in the completion of a MySQL Transactional support engine. More information on the PBXT engine will be forthcoming soon by it’s creator.

Anyway, I’ve taken on the responsiblity of assisting in testing this new storage engine. This will also give me the excuse of being able to pursue some other ideas about the performance of differing storage engines for differing tables in business circumstances, such as MyIsam verses InnoDB in a highly OLTP environment. Part of testing will be ensure ACID conformance in varying situations and multi-concurrency use. Of course the ability to also do performance and load testing would be a obvious extension.

Considering how I’m going to benchmark is an interesting approach. I of course want to use Java, my choice of language at present. This presents a problem, in another factor towards performance, however by using Java, I’m simulating a more real world environment of a programming overhead and JDBC Connector rather then just raw performance output.

Laying out a plan would include an ability to have an existing database structure and data, be able to bulk define SQL statements and transactions, and parameterise SQL during transactions. I would need to be able to verify the state of database from the transactions, and clearly identify any invalid data. I would also need the ability for handling threads, and of course adequate reporting of my results.

As of MySQL 5.1.4, there is a supported benchmarking tool called mysqlslap in MySQL. I’ve discounted using this because I figured at this early stage, the documentation and exposure of this is of course limiting, and I’m sure I’d still need to perform other development.

Along comes JMeter. Within Java development I use JUnit quite extensively. This is key in the test-driven agile methodology approach of Extreme Programming. In discussion with this problem with a collegue on a new project, I found that JMeter was used for extensive load testing for web applications, but also performed database testing, and provides the support to integrate JUnit tests.

So yesterday I had a quick look at JMeter. The capabilities for defining, reporting and threading are quite complete. It took litterally minutes to install, configure, run an initial test and view results all in a GUI interface. A little more work gave me scripting handling of my initial tests. I’ve posted my initial investigations of JMeter – Performance Testing Software and JMeter and Ant Integration earlier.

With this behind me, I’ve just got to define the approach for more complete transactional tests, explictly confirming the results (I’m hoping to achieve this in custom JUnit tests). If I can solve this, then I can spend the most of time in the defining of adequate tests. Let’s see what the next few days work provides.

JMeter and Ant Integration

Using Ant withJMeter you can achieve remote running and web based reporting.

I got the ant-jmeter.jar and sample results output .xls from Embedding JMeter with Ant. JMeter Ant Task


cd /tmp
wget http://www.programmerplanet.org/ant-jmeter/ant-jmeter.jar
wget http://www.programmerplanet.org/ant-jmeter/jmeter-results-report.xsl
mv ant-meter.jar $ANT_HOME/lib

Within a new project directory, place your saved JMeter Tests (*.jmx) in a loadtests subdirectory, and the downloaded jmeter-results-report.xsl in the project directory.

build.xml

<project name="dbtest" default="dist" basedir=".">

<property name="base.dir" value="."/>
<property name="report.dir" value="report"/>

<taskdef name="jmeter"
        classname="org.programmerplanet.ant.taskdefs.jmeter.JMeterTask"/>
<target name="dist" depends="runtest,testresults" />

<target name="runtest" description="Run jmeter tests">
        <jmeter jmeterhome="/opt/jmeter"
                resultlog="${base.dir}/loadtests/JMeterResults.jtl">
                <testplans dir="${base.dir}/loadtests" includes="*.jmx"/>
        </jmeter>
</target>

<target name="testresults" description="Report Test Results" depends="runtest">
        <delete dir="${report.dir}" quiet="true"/>
        <mkdir dir="${report.dir}" />
        <xslt in="${base.dir}/loadtests/JMeterResults.jtl"
                out="${report.dir}/JMeterResults.html"
                style="${base.dir}/jmeter-results-report.xsl"/>
</target>
</project>

Report output from running ant can be found at report/JMeterResults.html

JMeter – Performance Testing Software

Apache JMeter is a 100% pure Java desktop application designed to load test functional behavior and measure performance. It was originally designed for testing Web Applications but has since expanded to other test functions. Specifically it provides complete support for database testing via JDBC.

Some References: Homepage http://jakarta.apache.org/jmeter/  ·  Wiki Page  ·  User Manual

Initial Installation Steps

$ su -
$ cd /opt
$ wget http://apache.planetmirror.com.au/dist/jakarta/jmeter/binaries/jakarta-jmeter-2.1.1.tgz
$ wget http://apache.planetmirror.com.au/dist/jakarta/jmeter/source/jakarta-jmeter-2.1.1_src.tgz
$ tar xvfz jakarta-jmeter-2.1.1.tgz
$ tar xvfz jakarta-jmeter-2.1.1_src.tgz
$ ln -s jakarta-jmeter-2.1.1 jmeter
$ echo "PATH=/opt/jmeter/bin:$PATH;export PATH" > /etc/profile.d/jmeter.sh
$ . /etc/profile.d/jmeter.sh
$ jmeter &

Adding MySQL Support

cd /tmp
wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-3.1.12.tar.gz/from/http://mysql.planetmirror.com/
tar xvfz mysql-connector-java-3.1.12.tar.gz
cp mysql-connector-java-3.1.12/mysql-connector-java-3.1.12-bin.jar /opt/jakarta-jmeter-2.1.1/lib/

Steps to perform simple MySQL JDBC Test.

1. Launch JMeter
2. Add a new Thread Group (using right click)
3. Define Thread Settings (no messing around 3 threads x 1000 iterations)
4. Add a Sampler JDBC Request
5. Add initial sample SQL query
6. Add a JDBC Connection Configuration
7. Define JDBC Connnection details (I’m using the sakila sample database at this time)
8. Define a Results View
9. Run the sucker

This is just a quick intro to prove it all works, There are quite a lot of reporting output possible, something for my next post.
Click on Image for a larger view.

MySQL Sakila Sample Application

I’m sure you are all aware by now of Mike Hillyer’s MySQL Sakila Sample Database that will be launched at the MySQL Conference. We now have an official MySQL Forum for this as well.

As part of leveraging this existing database, and using this for the basis of my MySQL Conference presentation on MySQL for Oracle Developers, I’ve released the first version of my MySQL Sakila Sample Application at http://sakila.arabx.com.au which I would very much like some feedback on. Please use the official MySQL Forum for any comments, suggestions or complaints. Please Note I am still very much in the planning and design phase.

We also have an Unofficial Wiki that describes a little more of the concept and purpose the Sample Application, and a call for others to get onboard to design and develop their own versions in varying languages.

So the sample application, what does this showcase? For now, work has been on the presentation of data, as we finalise the schema and data. In addition the application has been designed to be more self documenting describing via the top menu options the functions available, and business logic considerations, specific MySQL features and a schema to show the underlying tables in question. Look at Admin or Film for an initial example.

Here is a quick list of the functions of the MySQL Sakila Sample Application. For now there is not user authentication so it’s open for all to view.

  • Home
  • Customers
    • Index (NF), Search/List, Add, View (NF)
  • Rentals
    • Index(NF), New, Search (NF), Return(NF), Overdue(NF), Out(NF)
  • Film
    • Index, Movie List, Actor List, Categories, Languages
  • Reports
    • Index (NF), Top Film Rentals, Top Customer Rentals
  • Admin
    • Index, Staff, Stores, Countries, Cities

(NF) – Not Functional – Please note, as the data hasn’t been finalised, some of the data is my own patch just to display functionality.
More information on what’s available is in the Admin|Documentation page.

XP January Meeting

The Brisbane XP Group met yesterday for a presentation by Dr Paul King of Asert on the book Sustainable Software Development : An Agile Perspective.

I found it a good time to get a collective opinion and review of the techniques and methods we are moving towards in Software Development. Indeed one key point better describing Pair Programming has been added to my upcoming conference presentation Overcoming the Challenges of Establishing Service and Support Channels. I’m hoping Paul makes his notes available as a review of this book, that I will also mention in my presentation.

In Review, this is some of the key points I got from this presentation.

  • Software gradually degrades over time, and will become a maintenance nightmare
  • Successful software will be changed again and again
  • The IT industry has a problem historically with credibility

So the goal is to move towards Sustainability. Some of the points mentioned by Paul were:

  • Continual refinement
  • A working product at all times (not just working software)
  • Value Defect Prevention over Defect Detection
  • Additional investment and emphasis on design

On point I struggle with is Pair Programming. I don’t struggle with the concept, it’s great and really works. The struggle is selling Pair Programming as a core XP Principle. Some good points of discussion lead to a better angle.

  • Pair Programming – should be de-emphasised as a key point
  • By selling Defect Prevention and using Continuous Code Reviews as one method of implementing this
  • Continuous Code Review are achieved with Pair Programming

Much easier. The key point is management understands the term Code Reviews, and if you can show the effect of Defect Prevention on support costs, using Pair Programming, Refactoring and other techniques, your sales pitch will be easier.

Also for reference, the book Software Craftsmanship: The New Imperative was mentioned as a book with similar ideals. A third recommended reading book that was mentioned at the meeting was The Pragmatic Programmer: From Journeyman to Master.





Support for Technology Stacks

As part of my next conference presentation Overcoming the Challenges of Establishing Service and Support Channels I’ve been struggling to find with my professional sources, any quality organisations that provide full support for a technology stack, for example a LAMP stack, or a Java Servlet stack.

Restricted to searching via online, I’ve been impressed by what I’ve found at Spike Source www.spikesource.com. An organisation with an experienced CEO, well known in the Java Industry. They certainly have all the buzz words covered in their product information.

Benefits of their SpikeSource Core Stack.

  • Fully tested and certified
  • Installs in minutes with integrated installer
  • Enterprise-class maintenance and support available
  • Vendor neutral
  • Horizontally and vertically scalable

SpikeSource offers three prebuilt configurations that can have you up and running in around ten minutes. These configurations comprise the following component choices:

  • LAMP Stack – for Websites with dynamic database-driven content written using Perl and PHP.
  • Servlet Stack – for dynamic Websites written using Java-based Web technologies such as servlets.
  • J2EE Stack – for Web applications that separate Web interface and application logic using Java Servlets and Enterprise JavaBeans.

Supported Platforms. What’s of interest here is RHEL, SuSE as well as Fedora Core 3. In line with for example Oracle software running under Linux.

What’s interesting, is they have MySQL 4.1.14 in their spikesource stack (1.6.2), so they are quite some months behind here. Especially now that MySQL 5 has been available 3 months now. Not only just stack technology, their infrastructure supports a large number of open source products and appears to provide infrastructure via a community to enhance the product offerings within this stack. The Spike Developer Zone Components List provides a long list of products.

Their release notes provide good instructions, in particular what configuration was used in the building of the software. For example, here is the MySQL Release Notes, MySQL Quick Start Guide, MySQL Troubleshooting Guide

They talk about testing, where Core Stack Testing provides more details here.

They also claim to provide VMWare Community Virtual Machine that can be run via the free VM Player on any system without having an effect on an existing system. This is indeed impressive, however it doesn’t seem available. There are many other installations available at the VMWare site.

I’m interested to see what else existing in the marketplace for a fully supported technology stack, rather then support of individual components (e.g. RedHat for Linux, MySQL AB for MySQL, JBoss for a servlet container)

In reading comparisions, there is reference also to Source Labs – www.sourcelabs.com. Anybody that can offer recommendations that I can research would be great.

Book Review – Beyond Java

Well the title got me when I decided to purchase this book “Beyond Java – A glimpse at the Future of Programming Languages”, however perhaps it should have been titled “Why to move from Java to Ruby” as the book for a good portion is an explanation of how Ruby solves the problems that Java has and the direction Java is moving. While the book did describe where Java was, and the future limits and what to look at Beyond Java, the high use of Ruby to describe these overwhelmed the book. In fact, only the last chapter of 20 pages gave an comparison of “Contenders” as the chapter title described other then scant descriptions

Initially I lost count of the number of times information regarding C++ was repeated in the book, and how Java got it’s great penetration from the C++ community. I almost put the book down after the first few chapters, it was highly repetitive.

However, given my increasing interest in Ruby I was able to work though this. I could see a Java developer that has already discard Ruby as a fad to put this book down. In fact, as a Ruby reference it provided some good tips, again strengthening my comment of including Ruby in the title.

Overall an interesting read, however for a small book it could have offered a lot more.

On the same topic, some interesting points in the article The Problems With Java.