Beginner CSV Engine issues

I’ve just started using the CSV engine for a practical application and I’ve come across a few subtle and unexpected issues/limitations.

First, you can’t create any columns in your CSV table nullable.

mysql> create table t1(i INT) ENGINE=CSV;
ERROR 1178 (42000): The storage engine for the table doesn't support nullable columns

RTFM shows this was introduced in 5.1.23. See CSV Limitations

The second and more annoying was creating a CSV table, inserting a sample row (just to check the syntax), then replacing the file ([datadir]/[schema]/[table].CSV) with the same format, but with additional generated rows. This was to no avail when attempting to SELECT from the table.

The solution was to do a REPAIR TABLE [table] in order to see the newly refreshed data.
Futhermore, some more RTFM shows in Repairing and Checking CSV Tables an important caveat to using the CSV engine and loading data directly by file copy.

Warning: Note that during repair, only the rows from the CSV file up to the first damaged row are copied to the new table. All other rows from the first damaged row to the end of the table are removed, even valid rows.

Buyer beware.

Comments

  1. says

    Ronald,
    can you post the sample CSV that you used for your test?
    With CSV, there are a few things that can go wrong. When you inserted the new rows:
    * Which OS were you using?
    * did you stop the server or close all the tables before replacing the CSV file?

    Giuseppe

  2. Matthew Montgomery says

    Ronald,

    Had you tried FLUSH TABLES to force closure of the file handle on the .CSV file after the swap out?

  3. Matthew Montgomery says

    Yep, FLUSH TABLES should be sufficient.

    matt@bouchon:/var/lib/mysql$ sudo cat test/bas.CSV
    “1”
    matt@bouchon:/var/lib/mysql$ sudo vi test/bas.CSV
    matt@bouchon:/var/lib/mysql$ mysql test
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Welcome to the MySQL monitor. Commands end with ; or g.
    Your MySQL connection id is 39
    Server version: 5.0.67-0ubuntu6-log (Ubuntu)

    Type ‘help;’ or ‘h’ for help. Type ‘c’ to clear the buffer.

    mysql> show tables;
    +—————-+
    | Tables_in_test |
    +—————-+
    | bar |
    | bas |
    | foo |
    | t1 |
    | t2 |
    +—————-+
    5 rows in set (0.00 sec)

    mysql> select * from bas;
    +—+
    | a |
    +—+
    | 1 |
    +—+
    1 row in set (0.02 sec)

    mysql> flush tables;
    Query OK, 0 rows affected (0.06 sec)

    mysql> select * from bas;
    +—+
    | a |
    +—+
    | 1 |
    | 2 |
    | 3 |
    | 4 |
    | 5 |
    +—+
    5 rows in set (0.00 sec)

  4. says

    I’m not sure why the nullable restriction surprised you. There is no way to represent NULL in CSV, all data is literal!

  5. Justin Swanhart says

    Toby,

    A CSV file produced with:
    SELECT …
    INTO OUTFILE FIELDS TERMINATED BY ‘,’ ESCAPED BY ‘\’

    Will produce a CSV file that has the value N for null, so it is very possible to store a NULL value in a CSV file.

  6. RLS says

    I need help. I have an excel file saved as csv. I have a column that contains unique tracking numbers and when it is saved, the numbers are blown, when I try and convert them back to their original number it doesn’t remain after being resaved. How can I keep the numbers in the column, while not allowing the cvs save to truncate? This is what it should look like:
    ship_date tracking number Date Rec’d
    1/16/2011 125602955668866
    1/16/2011 1235689948586
    1/16/2011 123456789456123
    1/16/2011 12345678999999900000
    1/16/2011 753159456258951000
    1/16/2011 4567898523584750
    1/16/2011

    And this is what I get:
    ship_date tracking_number Date Rec’d last_name
    1/6/2011 2.09926E+18 1/4/2011
    1/6/2011 2.09926E+18 1/4/2011
    1/6/2011 2.09926E+18 1/4/2011
    1/6/2011 2.09926E+18 1/4/2011
    1/6/2011 2.09926E+18 1/4/2011
    1/6/2011 2.09926E+18 1/4/2011
    1/6/2011 2.09926E+18 1/4/2011

    Please help