The value of multi insert values

Baron got a great amount of response from his 50 things to know before migrating Oracle to MySQL . I’m glad I invited him as a fellow MySQL colleague to my presentation to the Federal Government on Best Practices for Migrating to MySQL from Oracle and SQL Server for his inspiration.

Oracle will always be a more featured product then MySQL. There are however features that MySQL has that Oracle does not. While I’ve got a draft of a list of my own, I have several hundred incomplete drafts.

One of these features I was able to demonstrate to a client is the ability to have multiple VALUES clauses for a single INSERT statement. For example.

INSERT INTO t1(c1) VALUES (1), (2), (3), (4), (5);

Instead of

INSERT INTO t1(c1) VALUES(1);
INSERT INTO t1(c1) VALUES(2);
INSERT INTO t1(c1) VALUES(3);
INSERT INTO t1(c1) VALUES(4);
INSERT INTO t1(c1) VALUES(5);

Does it make a difference? What is the performance improvement?

The number one reason for an improvement in performance is the lack of network latency for each command. We ran a number of tests in a specific example for the client, taking multiple single insert statements, and combining into combined statements.

We ran tests across localhost and also a network test.

It was found that taking several thousand INSERT queries and combined into a maximum of 1M packets made sense. Overall this single test showed a 88% improvement from 11.4 seconds to 1.4 seconds.

real        0m11.403s
user        0m0.175s
sys         0m0.157s

real        0m1.415s
user        0m0.019s
sys         0m0.012s

In the real world example, differences in the volumes of query to combine and system load showed a reduction of 72%

Tagged with: Databases General Linux MySQL Open Source

Related Posts

More CPUs or Newer CPUs

In a CPU-bound database workload, regardless of price, would you scale-up or scale-new? What if price was the driving factor, would you scale-up or scale-new? I am using as a baseline the first available AWS Graviton2 processor for RDS (r6g).

Read more

An Interesting Artifact with AWS RDS Aurora Storage

As part of using public datasets with my own Benchmarking Suite I wanted upsize a dataset for larger volume testing. I have always used the INFORMATION_SCHEMA.TABLES data_length and index_length columns as a sufficiently accurate measurement for actual disk space used.

Read more

How long does it take the ReadySet cache to warm up?

During my setup of benchmarking I run a quick test-sysbench script to ensure my configuration is right before running an hour+ duration test. When pointing to a Readyset cache where I have cached the 5 queries used in the sysbench test, but I have not run any execution of the SQL, throughput went up 10x in 5 seconds.

Read more