Baron received a great response to his 50 things to know before migrating Oracle to MySQL. I’m glad I invited him, as a fellow MySQL colleague, to my presentation to the Federal Government on Best Practices for Migrating to MySQL from Oracle and SQL Server, which provided his inspiration.
Oracle will always be a more feature-rich product than MySQL. There are, however, features that MySQL has that Oracle does not. While I’ve got a draft list of my own, it sits among several hundred incomplete drafts.
One of these features, which I was able to demonstrate to a client, is the ability to specify multiple rows of values in a single INSERT statement. For example:
INSERT INTO t1(c1) VALUES (1), (2), (3), (4), (5);
Instead of
INSERT INTO t1(c1) VALUES(1);
INSERT INTO t1(c1) VALUES(2);
INSERT INTO t1(c1) VALUES(3);
INSERT INTO t1(c1) VALUES(4);
INSERT INTO t1(c1) VALUES(5);
Does it make a difference? What is the performance improvement?
The number one reason for the performance improvement is the elimination of network latency for each individual statement. We ran a number of tests for a specific client example, taking many single-row INSERT statements and combining them into multi-row statements.
We ran tests against localhost and also across a network.
We found that taking several thousand INSERT statements and combining them into statements up to the maximum packet size of 1MB made sense. Overall, this single test showed an 88% improvement, from 11.4 seconds down to 1.4 seconds.
real 0m11.403s
user 0m0.175s
sys 0m0.157s

real 0m1.415s
user 0m0.019s
sys 0m0.012s
In the real-world example, with differences in the volume of queries combined and in system load, the reduction was still 72%.
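When sizing these combined statements, the server’s max_allowed_packet setting caps how large any single statement, and therefore any multi-row INSERT, can be. A minimal sketch of checking and raising it, assuming sufficient privileges (the 16MB value is only an illustration):

SHOW VARIABLES LIKE 'max_allowed_packet';
-- raise the server-wide limit to 16MB (value in bytes); new connections pick up the new value
SET GLOBAL max_allowed_packet = 16 * 1024 * 1024;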
Mark Callaghan says
MySQL has a few things that seem simple at first glance but make it very efficient. This is one. Having the client-server protocol push many rows at a time back to the client (rather than the client pulling 1 row at a time) is another.
Roland Bouman says
Yes – very useful feature.
There are a few popular myths around this feature – some people still think this is MySQL-specific (it is not – Postgres supports it in v >= 8.2; MS SQL Server supports it in v 2008), and some people think it is non-standard (it is in the 2003 version of the SQL standard)
Anyway, for Oracle, I sometimes generate scripts like this:
INSERT INTO t (col1,…,colX)
SELECT v1,…,vX FROM DUAL UNION ALL
SELECT v1,…,vX FROM DUAL UNION ALL
…
SELECT v1,…,vX FROM DUAL
;
Clunky, but it does allow you to throw the data in there in one statement.
Anders Karlsson says
On the other hand, the way Oracle does this with the array interface is even more performant (although it requires a bit more programming). The Oracle C APIs (Pro*C or OCI) are not the easiest to get along with, but once you know them, they can kick butt. I know, I used to teach people how to use them.
Ryan says
If you’re using InnoDB in autocommit mode, you will also see a very large improvement from the multi-row/batched insert statement due to fewer commits. Even more so if you run with innodb_flush_log_at_trx_commit = 1.
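For comparison with single-row statements, a similar saving can be had by wrapping them in one explicit transaction, so InnoDB performs one commit (and one log flush when innodb_flush_log_at_trx_commit = 1) instead of one per row. A minimal sketch, reusing the t1 table from the post:

START TRANSACTION;
INSERT INTO t1(c1) VALUES (1);
INSERT INTO t1(c1) VALUES (2);
INSERT INTO t1(c1) VALUES (3);
-- a single commit, and therefore a single log flush, covers all three rows
COMMIT;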
Anders Karlsson says
Rereading what I wrote before, I realize my typing was worse than usual. Sorry about that.
Brooks Johnson says
The test conditions aren’t exactly clear to me. Were you sending the single inserts as a batch or one by one? If as a batch, was autocommit on or were you committing all the inserts as one?
Shlomi Noach says
Combined with INSERT IGNORE, the multi-insert feature allows for a really nice property:
You can insert 100 rows together, and those that fail a constraint (FOREIGN KEY, UNIQUE) don’t get inserted and no errors are thrown, while all the rest get inserted just fine.
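A minimal sketch of that behaviour, assuming t1 has a UNIQUE constraint on c1 and already contains the value 2:

INSERT IGNORE INTO t1(c1) VALUES (1), (2), (3);
-- the duplicate (2) is skipped with a warning rather than an error; 1 and 3 are inserted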
Olly says
I made extensive use of these bulk inserts, and combined them with “ON DUPLICATE KEY UPDATE `columnname` = VALUES(columnname)”.
You get really huge performance improvements from that.
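A minimal sketch of that combination, using a hypothetical table t2 with a PRIMARY KEY on id and a counter column cnt:

INSERT INTO t2(id, cnt) VALUES (1, 10), (2, 20), (3, 30)
ON DUPLICATE KEY UPDATE cnt = VALUES(cnt);
-- new ids are inserted; ids that already exist have cnt updated to the value supplied in the VALUES list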