MySQL & Sun

I have been noticeably absent from commenting on this topic, something I was pinged about yet again today by another colleague.
Did I have nothing to say? No. It’s just that I’ve not been able to say much, or actually do anything, in the past 7 days due to a strong bout of the flu which has kept me in bed nearly 5 days straight (coincidentally aligning perfectly with MySQL’s recent ACM. Go figure that luck).

So what are my thoughts here? (OK, I’m going to waffle a bit with my point of view; for those of you who just want my key technical points, scroll down to the next section heading.)

Overall I believe it’s a good thing, on the surface and at the moment. As mentioned, MySQL was on an IPO path, and I would have liked the option to buy my own MySQL shares, to be part of a company that got to that point (having worked for 2 failed startup Internet companies previously). I think there would have been many a proud MySQL employee, ex-employee, and user that day, and it’s a shame for the pinnacle victory of Open Source, to become something of worth, to see the light at the end of the tunnel, but to be tempted by the fruits en route. MySQL, like every startup, has investors; they have invested in MySQL for quite some time, probably up to 10 years now, and they obviously want their money too.

But the market out there has many larger players, those less understanding of Open Source, more of the corporate giants with larger shares of the $15 Billion in RDBMS revenues, and willing to either run over, or kill off, a smaller, weaker competitor by sheer $ size. There has been some press saying it’s a bad thing, Sun is dead, what a waste of money, etc. If Sun did not purchase MySQL I’m sure there was a line that would have. It doesn’t take a rocket scientist to think of names such as IBM, Microsoft & Oracle, all competitors ironically. One interesting name mentioned was Yahoo. Had not thought of that, but in that line of thought, would Google do good, or evil?

I’ve worked more closely with Sun equipment in 2007. Indeed my longest engagement, with a Californian client recently mentioned in the press as a joint customer of Sun & MySQL, was a very tough one, stressing both Solaris 10 and MySQL 5 (with a custom-built binary). (1) Stressing me very much too, regardless of the New York to San Francisco commute each week for a decent stretch of time. I was exposed to DTrace; I was exposed to zero DTrace skills within MySQL (well, perhaps one, but he would not return phone calls or emails, and that’s just plain unprofessional even to a colleague). I was exposed to the warts of legal not allowing me to get Sun help on DTrace even when that was offered by Sun and was to the clear benefit of MySQL. Luckily cooler heads, weeks (more than we had) and a lot of pain prevailed on said point, but to my point, what a darn unnecessary pain to start with.

I see more of Sun around. I know of other consulting clients (which of course we can’t mention), and other large sites of Community colleagues, e.g. Fotolog, are on Sun, albeit from the MySQL 4.x days. I’ve heard the Sun/MySQL relationship is not that good, that MySQL doesn’t work as well on Sun as it could, and that there are clear performance gains that can be made under the Solaris OS.

And historically, I started working with Sun in the early 90s; it might be good to return after my time with SunOS, before there was Solaris. I think Solaris started at 2.5.

(1) I mention the custom-built binary because many people don’t realize there are actually three paths for getting features into MySQL. Both the Community and Enterprise paths are subject to Engineering and the constraints of “Feature Freeze”, for example. A custom-built binary can include a specific feature or features in a supported version of MySQL just for you. MySQL Professional Services has dedicated staff that work in these areas. Our consulting offerings are listed here.

My Key Points (Some personal)

  • Access to a benchmarking group, that’s gotta help. MySQL Inc., the company, has a number of shortcomings and little movement to excite me, and this is one of them. I’m sure Sun will want to provide MySQL with this capability, and it can only be a key strength. (Sun people reading this, I want to help.)
  • Access to H/W. OK, so I can’t get any benchmark results out of MySQL, and I can’t even get access to several consistent machines to test stuff myself anyway (as if I had that time), but if I got allocated a few days a month of dedicated access to a bank of servers I’d sure use it. Especially in the areas of RAID performance and LVM/SAN Backup & Recovery. (Again Sun people, pick me. I especially want a dedicated server to run multiple performance tests with different RAID configurations, and MOST IMPORTANTLY, to provide figures of how long it takes to recover a degraded system under load. I want the same under a SAN, but that becomes more political.)
  • Access to DTrace resources. I see this tool as key to my next level of internal knowledge of the server, and this will grant me better access to resources. Luckily my MacBook with 10.5 also has DTrace, a recent discovery, so that makes getting a Solaris x86 Parallels VM a secondary priority. (Sun people listening, a Parallels/VMware Solaris is on my wish list.)
  • Access to facilities. I live in New York. While I have no desire to go into an office day by day, to know that I now can, say, have a meeting room, or an area to work undistracted for a few days, etc., is golden, and a plus for a larger company.
  • Access to facilities for community events. This has gotta help our fledgling MySQL User Groups and MySQL Camp events. I really hope it can actually flow on to helping significantly in event organization, taking the MySQL User Conference from its present limitations into something bigger and better.
  • Access to a better expense system. MySQL’s system sucks. Sorry, no other word for it. Scorn me later. I’m at a disadvantage over many other MySQLers: a) I’m one of the few that travel all the time, and b) I’m one of even fewer that live and travel outside my country of origin, and so I’m confined to the limitations of said country’s financial systems. Specifically, I own property in Australia and have a 20K CC there, yet I can’t get a CC here to cover a month’s expenses; actually I can’t get them to cover a week (I already have a few). I can’t use my Australian ones effectively here for two reasons: one, MySQL could never sort out how to cover my “additional expenses” and I was paying out $150 per month on my own dime; two, I had to physically go to a bank and physically transfer money to Australia (all under a magical $10K AUD limit). When you travel every week, and when I had a brutal travel schedule for a while, I was not able to actually get to a bank for three weeks.
  • If Sun doesn’t provide a corporate card, or handle expenses for the two largest ticket items, airfares and accommodation, there will be hell to pay if I’m not reimbursed in a very timely fashion. I have a corporate card with MySQL, a sore point actually; it took 12 months to get one, through no fault of my own.
  • Employment Visa. As I mentioned, my life at MySQL is more complex than others’; I work here in the US under the E3 visa, which is tied to an employer. This will need to change, and that I’m sure is no trivial task.
  • Changed Vacation Time (for the worse). MySQL has a very good Swedish vacation policy, and this goes away. For an employee of under 2 years, I hear the vacation time is about a week less than for 2+ years, and overall vacation time in general is less. They say more US-based holidays and a forced Christmas/New Year break. I’m not the family guy; a day scattered here or there doesn’t work for me, I need blocks of time, the world is out there.
  • Java. My core programming expertise is Java (sh/awk etc. would argue a strong case); it’s my strongest non-scripting language. I’ve missed being able to remain closer to application development, even stack technology, and this may provide some opportunity where consulting engagements are MySQL and development related and I can refresh these skills. Going to another JavaOne again would be a bonus :)
  • Additional Development/Testing Resources. Sun is a company 100 times the size of MySQL. Does this grant us dedicated resources that may be able to help stem the bleeding at MySQL? (Bleeding in time to market for features, and therefore quality, and flexibility for growth.) All things I consider are still going backwards at MySQL. Additional procedures may make things worse, but MySQL I’m sure will remain a key and distinct component for a long time.
  • Community. I’m going to end my points on this. MySQL in my opinion does not do right by the community. I have some specific views I’ll be posting soon on this point. How will community function under Sun? Will it be allowed to thrive, or will its path be dictated as it is now, by a sales and marketing driven focus? There are some things that need to change in MySQL, and this is one of them. Will Marten Mickos, the present CEO and the head of the MySQL group at Sun, see this and have better direction in store? What I’d expect is to see Sales and Marketing be re-organized (this seems most logical), and that then leaves community to return to being a key component alongside Engineering & Services and not a side thought. I came from the community before I joined MySQL, I am active (less so) in the community while at MySQL, I stand up for the community at MySQL (again, more on this soon), and I’ll be part of the community long after I leave MySQL and/or Sun.

My Ideas for MySQL Camp III

Diary: January 21st 2008 – Martin Luther King Day (a day doctors’ practices are closed, BTW).

“I have a dream”, poetic. Actually I have a thumping 5-day-straight headache, but that’s another story.

I have a dream for MySQL Camp III: a 48-hour Global Hackfest. I ran this by Jay over Thanksgiving, to get back more to the purpose of the Camp, for hackers, coders and the very experienced to get together to share their skills, and for those at the top of our respective games to learn just a little more. MySQL Camp II was a success for attendees in general, but of little value to the experts.

I hope to get us to middle-to-advanced ground. Here is an overview.

  • 48-hour event
  • Say 12pm Friday GMT to 12pm Sunday GMT
  • Global meeting points of two or more to work together, sleep, eat and play together (in fact the goal is not to attend alone, you should really try to get to at least the closest person that’s also attending, it’s designed to be a distributed group event)
  • 4 key areas
    • Getting Started. Getting the code, understanding the basics of compiling – Linux, Mac OS X and Windoze. CLI & IDE debugging options
    • A&D. Analysis of problems, selected bugs, existing patches etc., reviewing procedures, documentation requirements, importance of test plans etc. Laying down a plan of what’s going to happen, how long it’s going to take, etc.
    • Doing it. Taking a working developer environment, and a set plan, and executing to completion
    • Reviewing it. Getting an insight into MySQL and how bug fixes, community contributions etc. are submitted, reviewed, proposed and received
  • I would anticipate we would run say 2 or 3 tracks of these 4 points, so we would repeat stuff, perhaps a different problem, but this enables you to get a real grip, as well as cater for the 24hr cycle.
  • I could see the Doing It track being interesting: perhaps, depending on attendees, either a mentoring process where a code guru could instruct some younglings, or multiple teams working in parallel on the same problem, with a little bit of competition.
  • Code baseline version and the list of bugs/features to be looked at are to be pre-determined, so we have a clear structure during the event. This will be in proposal format, and may include, for example, back-porting patches.
  • We will definitely be having some prizes and some fun; it’s going to be one of those work-until-3am weekends regardless of where you live.

I expect to run this format of MySQL Camp twice, the first time to work out any serious problems. I had hoped in November last year to get this in before UC2008, fat chance, but I’m proposing a BOF on the subject at MySQL UC2008.

I had spoken with good friend Farshan Mashraqi, who had done some good Sun webinars recently, and he was seeking a contact to see if Sun could donate the time/bandwidth for the electronic component or even sponsor. (This was clearly discussed late last year.) Seems now this may be easier, or harder, with Sun’s involvement.

I have a lot more details, but I want to get this out there into the world, and get some feedback first.
I’d like people’s feedback. Here are 5 questions to start with.

  1. Do you think it’s a good idea?
  2. Would you attend/participate?
  3. Could you contribute in some way? What?
  4. What one thing appeals the most about the concept?
  5. What one thing appeals the least, is lacking, or is needed in the concept?

Compiling MySQL 5.0.51 under Ubuntu 7.10

I’ve finally decided to work on a number of improvements in instrumentation within the MySQL Server that I’ve wanted since my first year; see What is the optimal thread specific buffer size?. It’s been a while since I’ve compiled from source, and this was my first time under Ubuntu 7.10 (a fresh install). Here are some of the problems, and the solutions, just for others that may experience them.

I should have simply read my own notes from years ago in Compiling MySQL, specifically the pre-requisites list, but it sometimes helps to remember why things are so.

In summary, I needed the following:

apt-get install automake libtool g++ ncurses-dev

Thanks Miademora, I meant to say that but forgot.

First error, “aclocal: not found”

$ ./BUILD/compile-pentium-debug
BUILD/check-cpu: Oops, could not find out what kind of cpu this machine is using.
+ make -k distclean
make: *** No rule to make target `distclean'.
+ true
+ /bin/rm -rf */.deps/*.P config.cache innobase/config.cache bdb/build_unix/config.cache bdb/dist/autom4te.cache autom4te.cache innobase/autom4te.cache
+ path=BUILD
+ . BUILD/autorun.sh
+ aclocal
BUILD/autorun.sh: 1: aclocal: not found
+ die Can't execute aclocal
+ echo Can't execute aclocal
Can't execute aclocal
+ exit 1

This needed automake.

$ apt-get install automake
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  liblzo1
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  autoconf autotools-dev m4
Suggested packages:
  autoconf2.13 autobook autoconf-archive gnu-standards autoconf-doc
  automake1.10-doc
Recommended packages:
  automaken
The following NEW packages will be installed:
  autoconf automake autotools-dev m4
0 upgraded, 4 newly installed, 0 to remove and 24 not upgraded.
Need to get 1085kB of archives.
After unpacking 3899kB of additional disk space will be used.
Do you want to continue [Y/n]? y
WARNING: The following packages cannot be authenticated!
  m4 autoconf autotools-dev automake
Install these packages without verification [y/N]? y
Get:1 http://archive.ubuntu.com gutsy/main m4 1.4.10-0ubuntu2 [207kB]
Get:2 http://archive.ubuntu.com gutsy/main autoconf 2.61-4 [448kB]
Get:3 http://archive.ubuntu.com gutsy/main autotools-dev 20070306.1 [61.6kB]
Get:4 http://archive.ubuntu.com gutsy/main automake 1:1.10+nogfdl-1 [369kB]
Fetched 1085kB in 2s (515kB/s)
Selecting previously deselected package m4.
(Reading database ... 93930 files and directories currently installed.)
Unpacking m4 (from .../m4_1.4.10-0ubuntu2_i386.deb) ...
Selecting previously deselected package autoconf.
Unpacking autoconf (from .../autoconf_2.61-4_all.deb) ...
Selecting previously deselected package autotools-dev.
Unpacking autotools-dev (from .../autotools-dev_20070306.1_all.deb) ...
Selecting previously deselected package automake.
Unpacking automake (from .../automake_1%3a1.10+nogfdl-1_all.deb) ...
Setting up m4 (1.4.10-0ubuntu2) ...

Setting up autoconf (2.61-4) ...

Setting up autotools-dev (20070306.1) ...
Setting up automake (1:1.10+nogfdl-1) ...

Second error, “libtoolize: not found”.

$ ./BUILD/compile-pentium-debug
BUILD/check-cpu: Oops, could not find out what kind of cpu this machine is using.
+ make -k distclean
make: *** No rule to make target `distclean'.
+ true
+ /bin/rm -rf */.deps/*.P config.cache innobase/config.cache bdb/build_unix/config.cache bdb/dist/autom4te.cache autom4te.cache innobase/autom4te.cache
+ path=BUILD
+ . BUILD/autorun.sh
+ aclocal
+ autoheader
+ test -f /usr/bin/glibtoolize
+ libtoolize --automake --force
BUILD/autorun.sh: 1: libtoolize: not found
+ die Can't execute libtoolize
+ echo Can't execute libtoolize
Can't execute libtoolize
+ exit 1

Using a nice Ubuntu feature, you can find the right package by typing the command.

$ libtoolize
The program 'libtoolize' is currently not installed.  You can install it by typing:
sudo apt-get install libtool
bash: libtoolize: command not found

This needed the libtool package.

$ apt-get install libtool
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  liblzo1
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  libc6-dev linux-libc-dev
Suggested packages:
  glibc-doc manpages-dev libtool-doc g77 fortran77-compiler gcj
Recommended packages:
  libltdl3-dev
The following NEW packages will be installed:
  libc6-dev libtool linux-libc-dev
0 upgraded, 3 newly installed, 0 to remove and 24 not upgraded.
Need to get 4277kB of archives.
After unpacking 18.3MB of additional disk space will be used.
Do you want to continue [Y/n]? y
WARNING: The following packages cannot be authenticated!
  linux-libc-dev libc6-dev libtool
Install these packages without verification [y/N]? y
Get:1 http://archive.ubuntu.com gutsy-updates/main linux-libc-dev 2.6.22-14.47 [653kB]
Get:2 http://archive.ubuntu.com gutsy-proposed/main libc6-dev 2.6.1-1ubuntu10 [3287kB]
Get:3 http://archive.ubuntu.com gutsy/main libtool 1.5.24-1ubuntu1 [337kB]
Fetched 4277kB in 4s (900kB/s)
Selecting previously deselected package linux-libc-dev.
(Reading database ... 94193 files and directories currently installed.)
Unpacking linux-libc-dev (from .../linux-libc-dev_2.6.22-14.47_i386.deb) ...
Selecting previously deselected package libc6-dev.
Unpacking libc6-dev (from .../libc6-dev_2.6.1-1ubuntu10_i386.deb) ...
Selecting previously deselected package libtool.
Unpacking libtool (from .../libtool_1.5.24-1ubuntu1_i386.deb) ...
Setting up linux-libc-dev (2.6.22-14.47) ...
Setting up libc6-dev (2.6.1-1ubuntu10) ...
Setting up libtool (1.5.24-1ubuntu1) ...

Third error, “preprocessor “/lib/cpp” fails sanity check”

$ ./BUILD/compile-pentium-debug
...
checking how to run the C++ preprocessor... /lib/cpp
configure: error: C++ preprocessor "/lib/cpp" fails sanity check
See `config.log' for more details.
$ more config.log
configure:5336: gcc -c -Wimplicit -Wreturn-type -Wswitch -Wtrigraphs -Wcomment -
W -Wchar-subscripts -Wformat -Wparentheses -Wsign-compare -Wwrite-strings -Wunus
ed-function -Wunused-label -Wunused-value -Wunused-variable -Woverloaded-virtual
 -Wsign-promo -Wreorder -Wctor-dtor-privacy -Wnon-virtual-dtor  -felide-construc
tors -fno-exceptions -fno-rtti  -DUNIV_MUST_NOT_INLINE -DEXTRA_DEBUG -DFORCE_INI
T_OF_VARS -DSAFEMALLOC -DPEDANTIC_SAFEMALLOC -DSAFE_MUTEX    conftest.cpp >&5
gcc: error trying to exec 'cc1plus': execvp: No such file or directory

This needed the package g++.

$ apt-get install g++
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  liblzo1
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  g++-4.1 libstdc++6-4.1-dev
Suggested packages:
  g++-multilib g++-4.1-multilib gcc-4.1-doc libstdc++6-4.1-doc
The following NEW packages will be installed:
  g++ g++-4.1 libstdc++6-4.1-dev
0 upgraded, 3 newly installed, 0 to remove and 24 not upgraded.
Need to get 3730kB of archives.
After unpacking 13.7MB of additional disk space will be used.
Do you want to continue [Y/n]? y
WARNING: The following packages cannot be authenticated!
  libstdc++6-4.1-dev g++-4.1 g++
Install these packages without verification [y/N]? y
Get:1 http://archive.ubuntu.com gutsy/main libstdc++6-4.1-dev 4.1.2-16ubuntu2 [1129kB]
Get:2 http://archive.ubuntu.com gutsy/main g++-4.1 4.1.2-16ubuntu2 [2600kB]
Get:3 http://archive.ubuntu.com gutsy/main g++ 4:4.1.2-9ubuntu2 [1440B]
Fetched 3730kB in 4s (871kB/s)
Selecting previously deselected package libstdc++6-4.1-dev.
(Reading database ... 95367 files and directories currently installed.)
Unpacking libstdc++6-4.1-dev (from .../libstdc++6-4.1-dev_4.1.2-16ubuntu2_i386.deb) ...
Selecting previously deselected package g++-4.1.
Unpacking g++-4.1 (from .../g++-4.1_4.1.2-16ubuntu2_i386.deb) ...
Selecting previously deselected package g++.
Unpacking g++ (from .../g++_4%3a4.1.2-9ubuntu2_i386.deb) ...
Setting up g++-4.1 (4.1.2-16ubuntu2) ...
Setting up libstdc++6-4.1-dev (4.1.2-16ubuntu2) ...
Setting up g++ (4:4.1.2-9ubuntu2) ...

Fourth error, “checking for termcap functions library… configure: error: No curses/termcap library found”

This needed the package ncurses-dev.
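
For completeness, the matching install step (output not captured this time) is simply:

$ apt-get install ncurses-dev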

This got a successful compile. Again, I should have started with the pre-requisites list. A review of the pre-requisites from previously showed gmake is no longer necessary; the following are.

automake --version
autoconf --version
libtool --version
m4 --version
gcc --version
bison --version

What is the optimal thread specific buffer size?

So you want to know what the join_buffer_size, sort_buffer_size, read_buffer_size and read_rnd_buffer_size values for your application should be? These MySQL thread-specific buffers are variables I can never get right, because there are insufficient metrics, instrumentation or even abstract details in the present MySQL versions. These are important because memory is a resource that you want to maximize towards your database data (the System Global Area), and not towards the Process Global Area, for which there are no limits.
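
As a quick check before any of that, you can at least see what your session is currently configured with from the command line. A minimal sketch, assuming a local server and the mysql client on your path (the pattern also matches other *_buffer_size variables such as key_buffer_size, but the four above are the per-thread ones of interest):

$ mysql -e "SHOW SESSION VARIABLES LIKE '%buffer_size'"
# each of the four above is allocated per thread (Process Global Area),
# so think in terms of expected concurrent threads when budgeting memory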

I’ve wanted to know this answer for some time. I’ve asked many people, including internal MySQL resources, and I’d hoped that when joining MySQL more details would be available, but I’ve never been able to get an answer. I’ve always been meaning to work this out; it’s now 2008, and well, the time has come to do something about it.

The questions I want to answer include:

  • When are these buffers used, i.e. which SQL statements trigger them? OK, you can work this out when you review the SQL + schema (see the sketch after this list), but I want something that works for clients.
  • I want to know if the buffers are fully allocated or incrementally allocated. My understanding is that these 4 buffers are pre-allocated, unlike internal temporary tables, but since reading some of the code I’m no longer completely convinced.
  • I want to know how much of the buffer (if fully allocated) is actually used.
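
On the first question, the closest indicator today is the query plan. A rough sketch, against a hypothetical table t1 with a non-indexed column (both names are made up for illustration):

$ mysql test -e "EXPLAIN SELECT * FROM t1 ORDER BY some_col\G"
# 'Using filesort' in the Extra column means a sort will occur, so
# sort_buffer_size (and possibly read_rnd_buffer_size when rows are
# re-read in sorted order) comes into play; a join with no usable
# index on the inner table is where join_buffer_size is used.

It is a hint at best, which is exactly why I want explicit counters.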

My initial goal is to add the following new status variables.

  • sort_buffer_count, sort_buffer_usage
  • read_buffer_count, read_buffer_usage
  • read_rnd_buffer_count, read_rnd_buffer_usage
  • join_buffer_count, join_buffer_usage

Now, it’s likely there will be some overlap with existing counters, for example Sort_scan, but nothing like introducing some consistency here.

These per-session status variables will allow you, when reviewing individual SQL statements, to see the impact. Ultimately I’d like to add more smarts into MySQL, because things like “gathering all SQL statements” can be difficult and often not possible in production. Combined with leveraging MySQL Proxy, we will get there.
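
The review pattern itself already works today with the existing counters, and the proposed variables would simply slot in alongside them. A minimal sketch (t1 and its column are again hypothetical):

$ mysql test
mysql> FLUSH STATUS;                        -- zero this session's status counters
mysql> SELECT * FROM t1 ORDER BY some_col;  -- the statement under review
mysql> SHOW SESSION STATUS LIKE 'Sort%';    -- Sort_merge_passes, Sort_range, Sort_rows, Sort_scan

With the proposed variables, a SHOW SESSION STATUS LIKE '%buffer%' at that point would also report the allocation count and actual usage for just that one statement.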

I’m looking forward to getting into the bowels of the MySQL source code. It’s going to be a steep curve, as I do little development these days, and even less in C++, but there’s nothing like a challenge, and questions to a lot of close colleagues for small snippets of help.