Reputation: 5958
I recently wrote a simple Java program that processed some data and inserted it in a MyISAM table. About 35000 rows had to be inserted. I wrote the INSERT statement using INSERT ... SET syntax and executed it for all rows with PreparedStatement.executeBatch()
. So:
String sql = "INSERT INTO my_table"
+ " SET "
+ " my_column_1 = ? "
+ " my_column_2 = ? "
...
+ " my_column_n = ? ";
try(PreparedStatement pst = con.prepareStatement(sql)){
for(Object o : someCollection){
pst.setInt(1, ...);
pst.setInt(2, ...);
...
pst.setInt(n, ...);
pst.addBatch();
}
pst.executeBatch();
}
I tried inserting all rows in a single batch and in bacthes of 1000, but in all cases the execution was VERY slow (about 1 minute per 1000 rows). After some tinkering I found that changing the syntax to INSERT ... VALUES improved the speed dramatically, 100x at the very least (I didn't measure it accurately).
String sql = "INSERT INTO my_table (my_column_1, my_column_2, ... , my_column_n)"
+ " VALUES (?, ?, ... , ?)";
What's going on here? Can it be that the JDBC driver cannot rewrite the batches when using INSERT ... SET? I didn't find any documentation about this. I am creating my connections with options rewriteBatchedStatements=true&useServerPrepStmts=false
.
I first noticed this problem when accessing a database in another host. That is, I have used the INSERT ... SET approach before without any noticeable performance issue in applications that were executing in the same host as the database. So I guess the problem may be that many more statements are sent over the network with INSERT ... SET than with INSERT ... VALUES.
Upvotes: 1
Views: 704
Reputation: 894
If you examine the INSERT ... SET syntax, you'll see it's only meant for inserting a single row. INSERT ... VALUES is meant for inserting multiple rows at one time.
In other words - even though you set rewriteBatchedStatements=true, the JDBC driver can't optimize the SET variation like it can with the VALUES variation because SET is not built for the batch case you have. Use VALUES to compress N inserts into one.
Bonus tip - If you use ON DUPLICATE KEY UPDATE, the JDBC currently can't rewrite those statements either. (edit: This statement is false - my mistake.)
There's an option you can set to verify all of this for yourself (I think it's 'profileSQL').
Upvotes: 1