Opinions, Expertise, Passion.

Information in black and white, and sometimes some color.

Jul
10

Deleting from ARCHIVE tables

I can’t say I’ve used the ARCHIVE storage engine before, but at the NY MySQL Meetup last night there was discussion of the improvements to ARCHIVE in 5.1 and the fact that you cannot DELETE from an ARCHIVE table. A simple test confirmed that this does indeed throw an error.

DROP TABLE IF EXISTS url_log;
CREATE TABLE url_log(
log_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
log_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id INT UNSIGNED NULL,
url VARCHAR(100) NOT NULL,
PRIMARY KEY (log_id))
ENGINE=ARCHIVE;

DELETE FROM url_log;
ERROR 1031 (HY000): Table storage engine for 'url_log' doesn't have this option

However, MySQL 5.1, which is at RC status, includes partitioning. Thinking that one could partition, say, a log table by day of month, do whatever is needed with the data in a partition and then drop the partition, I tried the following test.
NOTE: for the purposes of testing, I used SECOND() rather than DAY() and smaller ranges for simplicity.

DROP TABLE IF EXISTS url_log;
CREATE TABLE url_log(
log_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
log_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id INT UNSIGNED NULL,
url VARCHAR(100) NOT NULL,
PRIMARY KEY (log_id))
ENGINE=ARCHIVE
PARTITION BY RANGE ( SECOND(log_date) ) (
    PARTITION p0 VALUES LESS THAN (10),
    PARTITION p1 VALUES LESS THAN (20),
    PARTITION p2 VALUES LESS THAN (30),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

However this throws an error.

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

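As the error says, a primary key on a partitioned table must include every column used in the partitioning function, so a primary key on the AUTO_INCREMENT log_id alone does not play well with partitioning by log_date. For the purpose of this proof of concept, I’ll drop both.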

DROP TABLE IF EXISTS url_log;
CREATE TABLE url_log(
log_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id INT UNSIGNED NULL,
url VARCHAR(100) NOT NULL)
ENGINE=ARCHIVE
PARTITION BY RANGE ( SECOND(log_date) ) (
    PARTITION p0 VALUES LESS THAN (10),
    PARTITION p1 VALUES LESS THAN (20),
    PARTITION p2 VALUES LESS THAN (30),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

Create a simple stored procedure to randomly generate some data. The procedure is not efficient, using RAND() and SLEEP(), but it does generate rows spread across different seconds.

DROP PROCEDURE IF EXISTS load_url_log;
DELIMITER $$
CREATE PROCEDURE load_url_log (insert_count INT)
BEGIN
  DECLARE i INT DEFAULT 1;

  WHILE i <= insert_count
  DO
     -- SLEEP() pauses for up to a second and returns 0, which is appended to the url
     INSERT INTO url_log(user_id, url)
     VALUES (FLOOR(RAND()*100), CONCAT(REPEAT('x',FLOOR(RAND()*99)),SLEEP(RAND())));
     -- advance the counter, otherwise the loop never ends
     SET i = i + 1;
  END WHILE;
END $$

DELIMITER ;

CALL load_url_log(500);

A quick check shows the distribution of data.

mysql> select distinct(second(log_date)) from url_log;
mysql> select distinct(second(log_date)) from url_log where second(log_date) < 10;
+--------------------+
| (second(log_date)) |
+--------------------+
|                  0 |
|                  1 |
|                  2 |
|                  3 |
|                  4 |
|                  5 |
|                  6 |
|                  7 |
|                  8 |
|                  9 |
+--------------------+
10 rows in set (0.00 sec)
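
To see how many rows landed in each partition, rather than just which seconds are present, a query along these lines against INFORMATION_SCHEMA.PARTITIONS should do it. This is a rough sketch that was not part of the original test; it assumes url_log lives in the current default database, and TABLE_ROWS may only be an estimate for some storage engines.

```sql
-- Row count per partition of url_log in the current database.
-- TABLE_ROWS can be approximate depending on the storage engine.
SELECT PARTITION_NAME,
       PARTITION_DESCRIPTION,
       TABLE_ROWS
  FROM INFORMATION_SCHEMA.PARTITIONS
 WHERE TABLE_SCHEMA = DATABASE()
   AND TABLE_NAME   = 'url_log'
 ORDER BY PARTITION_ORDINAL_POSITION;
```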

And now the purpose of the test. Deleting data via deleting a partition.

mysql> ALTER TABLE url_log DROP PARTITION p0;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select distinct(second(log_date)) from url_log where second(log_date) < 10;
Empty set (0.00 sec)

And it works. Re-creating however did not.

ALTER TABLE url_log ADD PARTITION (PARTITION p0 VALUES LESS THAN (10));
ERROR 1481 (HY000): MAXVALUE can only be used in last partition definition

OK, so in my example I was lazy: I didn’t create specific partitions as I would in the real world, e.g. 31 partitions for days of the month. To simulate this a little better, here is the same table with an explicit upper bound in place of MAXVALUE.

DROP TABLE IF EXISTS url_log;
CREATE TABLE url_log(
log_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id INT UNSIGNED NULL,
url VARCHAR(100) NOT NULL)
ENGINE=ARCHIVE
PARTITION BY RANGE ( SECOND(log_date) ) (
    PARTITION p0 VALUES LESS THAN (10),
    PARTITION p1 VALUES LESS THAN (20),
    PARTITION p2 VALUES LESS THAN (30),
    PARTITION p3 VALUES LESS THAN (60)
);
ALTER TABLE url_log DROP PARTITION p0;
ALTER TABLE url_log ADD PARTITION (PARTITION p0 VALUES LESS THAN (10));
ERROR 1493 (HY000): VALUES LESS THAN value must be strictly increasing for each partition

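Still doesn’t work. A read of the manual explains why: with RANGE partitioning, ADD PARTITION can only add a partition at the high end of the existing ranges. LIST partitioning does not have this restriction.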

DROP TABLE IF EXISTS url_log;
CREATE TABLE url_log(
log_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id INT UNSIGNED NULL,
url VARCHAR(100) NOT NULL)
ENGINE=ARCHIVE
PARTITION BY LIST ( SECOND(log_date) ) (
    PARTITION p0 VALUES IN (0,1,2,3,4,5,6,7,8,9),
    PARTITION p1 VALUES IN (10,11,12,13,14,15,16,17,18,19),
    PARTITION p2 VALUES IN (20,21,22,23,24,25,26,27,28,29),
    PARTITION p3 VALUES IN (30,31,32,33,34,35,36,37,38,39),
    PARTITION p4 VALUES IN (40,41,42,43,44,45,46,47,48,49),
    PARTITION p5 VALUES IN (50,51,52,53,54,55,56,57,58,59)
);

ALTER TABLE url_log DROP PARTITION p0;
ALTER TABLE url_log ADD PARTITION (PARTITION p0 VALUES IN (0,1,2,3,4,5,6,7,8,9));

And that works.

So by creating an ARCHIVE table with 31 LIST partitions, one for each day of the month, you could use ARCHIVE to log data into day partitions; analyze, summarize, log or copy the data from the previous day; and then purge it within 28 days, before the same day of the month comes around again.
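
As a sketch of that idea, and not something tested in this post, the table could look like the following. It assumes DAY() is accepted as the partitioning function in the same way SECOND() was above, on a server where ARCHIVE tables can be partitioned. The partition names d01 to d31 are made up for illustration.

```sql
DROP TABLE IF EXISTS url_log;
CREATE TABLE url_log(
log_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
user_id INT UNSIGNED NULL,
url VARCHAR(100) NOT NULL)
ENGINE=ARCHIVE
PARTITION BY LIST ( DAY(log_date) ) (
    PARTITION d01 VALUES IN (1),  PARTITION d02 VALUES IN (2),
    PARTITION d03 VALUES IN (3),  PARTITION d04 VALUES IN (4),
    PARTITION d05 VALUES IN (5),  PARTITION d06 VALUES IN (6),
    PARTITION d07 VALUES IN (7),  PARTITION d08 VALUES IN (8),
    PARTITION d09 VALUES IN (9),  PARTITION d10 VALUES IN (10),
    PARTITION d11 VALUES IN (11), PARTITION d12 VALUES IN (12),
    PARTITION d13 VALUES IN (13), PARTITION d14 VALUES IN (14),
    PARTITION d15 VALUES IN (15), PARTITION d16 VALUES IN (16),
    PARTITION d17 VALUES IN (17), PARTITION d18 VALUES IN (18),
    PARTITION d19 VALUES IN (19), PARTITION d20 VALUES IN (20),
    PARTITION d21 VALUES IN (21), PARTITION d22 VALUES IN (22),
    PARTITION d23 VALUES IN (23), PARTITION d24 VALUES IN (24),
    PARTITION d25 VALUES IN (25), PARTITION d26 VALUES IN (26),
    PARTITION d27 VALUES IN (27), PARTITION d28 VALUES IN (28),
    PARTITION d29 VALUES IN (29), PARTITION d30 VALUES IN (30),
    PARTITION d31 VALUES IN (31)
);

-- Purge the data logged on the 9th of the month, then re-create the
-- empty partition so it is ready before the next 9th arrives.
ALTER TABLE url_log DROP PARTITION d09;
ALTER TABLE url_log ADD PARTITION (PARTITION d09 VALUES IN (9));
```

Between the DROP and the ADD there is no partition for that day, so an insert dated the 9th would be rejected; running the pair well away from the day being purged avoids this.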

Posted under Databases, MySQL, Professional on 10 Jul 2008
Jul
10

Check your spelling

I’ve been Plurking more lately rather than Twittering. I’d like to offer to help out at Twitter if I could find the right person to talk to.

I’m no English major, but I do like to ensure my spelling is correct (at least for the bulk of the audience). You see grammar problems on sites because English is not the first language of many people, but you should always check your spelling, as per this popup message I got today.

Posted under Web Sites on 10 Jul 2008