InnoDB I_S.tables.table_rows out by a factor of 100x

I’ve always believed that the MySQL Information_schema.tables.table_rows figure for Innodb tables to be while approximate, approximately accurate.

Today I found that the figures varied on one table from 10x to 100x wrong.

Before performing an ALTER I always verify sizes for reference.

+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| table_s | table_name | engine | row_format | table_rows | avg_row_length | total_mb   | data_mb    | index_mb  | today      |
+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| XXXXX   | s_a        | InnoDB | Compact    |     208993 |           7475 |  1491.5312 |  1490.0156 |    1.5156 | 2009-09-09 |

mysql> alter table s_a modify col VARCHAR(255);
Query OK, 23471 rows affected (4 min 26.23 sec)

After

+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| XXXXX   | s_a        | InnoDB | Compact    |    2407063 |            629 |  1447.0312 |  1445.5156 |    1.5156 | 2009-09-09 |

I was so caught out by this. The table reported 200k rows, but the alter returned 23k, that’s like 10x out.
I ran my query again, and the second time I got.

+---------+------------+--------+------------+------------+----------------+------------+------------+-----------+------------+
| XXXXX   | s_a        | InnoDB | Compact    |      21813 |          69487 |  1447.0312 |  1445.5156 |    1.5156 | 2009-09-09 |

This is closer to what I’d expected, 22k verses 23k.

I have to say, while I have always treated the data and index size as accurate, I now how little confidence in the table_rows any more.

Focus on what you do best

When you have a great idea for a web application, it can be hard to consider with all the moving parts to focus just on what’s your uniqueness or differentiator from everybody else.

You may want to have control over your forums, comments, chat, photo management etc, i.e. user data, but how much does that help you. Is allocating resources to these features when plenty of completed applications exist distracting you and lengthening your time to market.

I always like to refer to Guy Kawasaki’s quote “Don’t worry, be crappy”. While I don’t necessarily agree with just throwing functionality out to the www, I believe in quality over quantity, you want to ensure that more time is spent in reviewing the input for new or improved features rather then bugs, bugs, bugs.


Ping.fm and Plurk are two new community driven sites that have leveraged the functionality of other sites, these being Get Satisfaction – People-Powered Customer Service for Everything! and Disq Us – Turn your blog comments into a webwide discussion.

There are advantages and disadvantages to this approach. As an smaller web site with a growing community, exposing what you do to a wider audience when using a third party to manage something can greatly help in exposure and associated marketing at no cost. On the down side, you are losing traffic to another site.

You need to ensure you can always get access to your data, and your community contributions. Ensuring adequate API’s for integration and data extraction are key. From a technology perspective, BitKeeper and LaunchPad come to mind. BitKeeper is a closed source, version control system that MySQL used. This was a killer for community contributions, where individual users simply could not contribute, and if they wanted even access to getting source code via the repository had to pay for an appropriate client. SourceForge and Apache are two examples of huge communities where they leverage the power of the community. LaunchPad is the latest kid on the block, but suffers from the fact that while access to applications hosted there are free, the actually LaunchPad code itself is closed. This has caused some issues.

It’s a fine line, and in the genre of software development, the Internet can create copies of anything just about overnight. More and more I hear about companies working in stealth mode rather then open community input and interaction, but that’s a topic for another discussion.

Time Warner Cable Speed

I had my Time Warner cable installed yesterday, a rather painless process. Reported as having Internet speeds of 10MB down and 1/2 MB up, these were confirmed with speedtest.net

Installation – July 5

A day later – Jul 6

Don't use HostMonster

Following a 2-4 day outage from my hosting provider of my dedicated server, I decided to move non critical websites to shared hosting. I have one with 1&1 but I created a second account to share load and act as a backup with www.hostmonster.com after a recommendation from a friend. I was able to move stuff, I was able to get some domains there, but it didn’t last long.

What a disaster. NEVER USE HOSTMONSTER!

Probably about a week after my account was created, they decided to move my account, they didn’t notify me it was going to happen, they just did it. They lost all my files, and did not tell me, after making multiple inquires and phone calls.

Here is some history.

Tue, Jun 10, 2008 at 8:59 PM

Dear Hostmonster Customer,

Hostmonster has started migrating your account (ronaldbr).
Below you will find important migration details.  Please refer to your ticket
number 0 for any specific details.


Although Hostmonster will do everything possible to ensure that your
migration goes quickly and smoothly; it is important to understand
that your account will be moving from one physical server location to
another. During migration the IP address attached to your domain name
will be changed from your old server IP to your new server IP. This will
cause a temporary interruption in email, ftp, and the visibility
of your website.


This window of interruption occurs because most Internet Service Providers
(ISP's) take 24-72 hours to clear their Cache.


Although this window could last approximately 24-72 hours it typically
only lasts 48 hours before your site becomes fully functional again. Your
web browser (IE, Firefox, Netscape) has a Cached version of your site
stored on your local system. In some cases it will help if you clear your
browser cache. For more Information about Cache and clearing your browser
Cache please review our article on:

http://helpdesk.hostmonster.com/kb/index.php?x=&mod_id=2&id=352

If after waiting 48 hours and clearing your browsers Cache, your
website has not begun functioning normally please contact our World
Class U.S. based Support Team by phone:

   Hostmonster Support:

       * Main Line: (866) 573-4678

       * Outside U.S: (801) 494-8462

       Support Questions: Press 2


Important Migration Details:

Your username and cPanel password will remain the same.

   * Migration Date: June 10, 2008

   * Migration Start Time: 06:00 PM MST

   * Migration End Time: (estimate) 04:00 AM MST

   * Old Server IP: 74.220.207.97

   * New Server IP:


Check out the CEO's Blog!

Come see the latest news, information, and updates on Hostmonster. While
you're there tell me how you think our company is doing!

Thank you again for choosing HostMonster.Com!

Matt Heaton (CEO)

http://www.mattheaton.com/

I opened a ticket 24 hrs later at Wed Jun 11 2008 09:26PM.

I got a quick response, but was lied too when told no data was lost.

Wed Jun 11 2008 10:02PM by [email protected]
Dear Customer,
Thanks for contacting us.
We apologize for the trouble you've been having, we are working on the issue the migration has some complications none of your data was lost please allow 24 hours tops for the site to be fully functional again.
Thanks
Corbin
Level 1 Support Engineer
hostmonster.com

I asked for a reason why this was done, and why I wasn’t even notified. I was given a lame response with a “isn’t likely” it will happen again, “but might be” both in the same sentence.

Thu Jun 12 2008 10:13PM by [email protected]
We migrated your site to free up hard-disk space on our server, and I apologize that appropriate notice wasn't given. I have notified my supervisor in an effort to recommend improved communications. It isn't likely that your account will be migrated again for our business needs, but it might be needed in the future.

Thank you for your inquiry.

John Pratt
Support Level 1
HostMonster.com
866.573.4678

Probably my second or third call now, is Friday morning, and I’ve had added to my ticket some lame text that it’s being escalated.

Fri Jun 13 2008 08:32AM by [email protected]
I am reopening this ticket. It was supposedly moved from host97 to host262 but cpanel man. is still showing host97 and tracert shows host97. I've tried going to both host262.hostmonster.com:2082 and host97.hostmonster.com:2082 but it's not logging me in. I've tried to change the password and tried both host262.hostmonster.com:2082 and host97.hostmonster.com:2082 with the new password and won't login either. An L3 says the username broke in the migration but it's a different kind of "broken" since it doesn't have the error message in Cpanel Manager. I got L3 approval to move this ticket to the escalations queue.

Nicholas Martin
Support Level 1
HostMonster.com
866.573.4678

So now, it’s Sunday morning, on the phone again, no information, reason or help forthcoming again. A note on ticket Sat 2pm (not visible to me), apparently all files lost on both old server and new server. When were you like going to tell the customer.

This service is woeful, I want my money back.

Of course when I said I wanted to cancel my account and get my money back (I can’t login remember), I was told I would have to call back when the billing department was open.

Well, my complaint will be going to [email protected] – Supervisor of Host Monster Tech Support.

Learning from a Disaster

As Farhan has already pointed out to us, Disaster is Inevitable – Must shutdown generators. My primary hosting provider The Planet had a serious meltdown, 9,000 servers unavailable, DNS and administration application .

My server was effectively totally unavailable from 3PM Saturday until 10AM Monday, 43 hours in total.

The problem didn’t stop there. Started and verified servers and domains, but like 8 hours later I find that the DNS is wrong on two important domains (I didn’t discover this because I have them in local /etc/hosts) because I moved them to a different IP like 2 weeks ago.

The Planet denied any problems, ticket logged to get them fixed because admin interface was still down.
Wind forward, Service Updates pages states

June 3 – 3:00pm CDT
All DNS Zone files for ns1, ns2, ns5 and ns6 are completely updated as of the information that was available 5:30PM Central Saturday. All DNS servers have been rebooted and BIND has been restarted.
.

What a flat out lie. DNS lookup directly against name server confirms it’s wrong. At Wed 12:00am Wednesday, the admin interface finally gives access to view and update zone files. What the. It indicates and IP address which is should be, but clearly not what it is.

I will so be sending a complaint to [email protected] when I finally get this resolved.

And just to make this all worse, my present professional site ronaldbradford.com is of critical importance. It’s my only exposure for preparing to provide information to potential employers, and it’s used for the preparation of consulting information. I’ve been forced earlier today because of events starting today for NY Internet Week to purchase DNS services at www.easydns.com. That didn’t also got to plan, yet another story.

The next opening keynote – Everything fails, All the Time

Our third keynote this morning was by Dr Werner Vogels – CTO Amazon.com

His second question to the audience “How many of you don’t shop at Amazon?” When one or two people raised their hands he commented, “Can I talk to you later.”

He was here to talk about Amazon as the technology company, forget about shopping. Think about the technology that drives that. There were some large and impressive numbers about Amazon. The one stood out was 3.9M shipments for peak day (didn’t mention which day)

Some points from the session.

  • Some Big News — Persistence Storage for Amazon EC2
  • Amazon started as technology consumer, we are now a technology provider.
  • We were enterprise scale, now we are at web scale.
  • Taking the approach, get big fast, the result was every architectural principle was broken.
  • Moving into Software as a Service
  • Develop -> Test -> Operate
  • There is a box between Test and Operate (undifferentiated heavy lifting)
  • We seem to put data centers in the same places as trailer parks.
  • At Amazon we expect our data centers to fail.
  • We expect our software to tolerate these fails.
  • We want to be able to loss a data center and not have this affect our customers.
  • Amazon, brings data centers down multiple times a year, just to show the software survives.
  • The provider model — Scalable, Cost Effective, Reliable, Simple
  • Addressing Uncertainty.

He used the 365 main power failure example. Large Web 2.0 Web sites were down including: Live Journal, Craigslist, Technorati, TypePad, Yelp and USA Today.

Is Amazon a viable option in these times? Worthy of consideration.

Updated You can view more links including video courtesy of Sheeri Kritzer at MySQL Forge Conference Notes