Reputation: 2840
I'm trying to move a large number of records from one MySQL instance to another inside RDS. They are in different VPCs and different AWS accounts, so I can't create a Data Pipeline that would do the copy for me.
I've written a quick Java program that connects to both the import database and the export database and does the following:
SELECT MAX(primary_key) FROM table
SELECT * FROM table WHERE primary_key > max_from_import LIMIT 1000000
INSERT INTO table (col1....coln) VALUES (?....n?)
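Roughly, one pass of that looks like the sketch below. The class and method names (`ChunkCopy`, `buildInsertSql`, `copyChunk`), the column list, and the literal `primary_key` column are placeholders for illustration, not the actual schema; embedding the max value directly in the SELECT is safe here only because it is a numeric key.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Collections;
import java.util.List;

public class ChunkCopy {

    // Builds the parameterized INSERT used for each copied row,
    // e.g. "INSERT INTO t (a, b) VALUES (?, ?)".
    static String buildInsertSql(String table, List<String> cols) {
        String placeholders = String.join(", ", Collections.nCopies(cols.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", cols)
                + ") VALUES (" + placeholders + ")";
    }

    // One pass: read the destination's high-water mark, pull the next chunk
    // from the source, and insert it row by row (the unbatched baseline).
    static void copyChunk(Connection src, Connection dest,
                          String table, List<String> cols) throws SQLException {
        long max = 0;
        try (Statement s = dest.createStatement();
             ResultSet rs = s.executeQuery("SELECT MAX(primary_key) FROM " + table)) {
            if (rs.next()) {
                max = rs.getLong(1); // 0 when the destination table is empty
            }
        }
        String select = "SELECT * FROM " + table
                + " WHERE primary_key > " + max + " LIMIT 1000000";
        try (Statement s = src.createStatement();
             ResultSet rows = s.executeQuery(select);
             PreparedStatement ins = dest.prepareStatement(buildInsertSql(table, cols))) {
            while (rows.next()) {
                for (int i = 0; i < cols.size(); i++) {
                    ins.setObject(i + 1, rows.getObject(cols.get(i)));
                }
                ins.executeUpdate(); // one round trip per row -- the slow part
            }
        }
    }
}
```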
With this method I'm seeing around 100,000 records imported per hour, but I know from this question that one way to optimize inserts is not to issue a new query each time, but to append more rows to a single insert, i.e.
INSERT INTO table (col1...coln) VALUES (val1...valn), (val1...valn)....(val1...valn);
Does the JDBC driver know to do this, or is there some sort of optimization I can make on my end to improve insert run time?
UPDATE: Both answers recommended using addBatch/executeBatch, as well as disabling auto-commit. Disabling auto-commit gave a slight improvement (10%); batching brought the run time down to less than 50% of the individual inserts.
Upvotes: 0
Views: 1181
Reputation: 8761
First create a JDBC connection to the destination database and set its auto-commit property to false.
Then, in a loop, do the following:
Read N rows (for example, 1000) from the source database and write them to the destination database.
After every few inserts, commit on the destination connection.
Sample code to give you a better idea:
Connection sourceCon = getSourceDbConnection();
Connection destCon = getDestinationDbConnection();
destCon.setAutoCommit(false);
Statement statement = destCon.createStatement();
int i = 0;
String query;
while ((query = getInsertQuery()) != null)
{
    statement.executeUpdate(query);
    i++;
    if (i % 10 == 0)
    {
        destCon.commit();
        i = 0;
    }
}
destCon.commit();
The getInsertQuery function should return a string in the INSERT INTO table (col1...coln) VALUES (val1...valn), (val1...valn)....(val1...valn); format. It should return null once all tables are processed, so the loop terminates.
If you are using prepared statements, you can use the addBatch and executeBatch functions. Inside the loop, add values using addBatch; after every few inserts, call executeBatch.
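Putting the addBatch/executeBatch advice together with the auto-commit change, the loop might look like this sketch. `BatchCopy`, `copyBatched`, `flushDue`, and the 1000-row batch size are illustrative choices, not part of the answer:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class BatchCopy {
    static final int BATCH_SIZE = 1000;

    // True on every BATCH_SIZE-th row: time to send the queued batch and commit.
    static boolean flushDue(int rowsAdded) {
        return rowsAdded % BATCH_SIZE == 0;
    }

    // Streams source rows into the destination, sending them in batches and
    // committing after each executeBatch instead of after every single row.
    static void copyBatched(ResultSet rows, Connection dest,
                            String insertSql, int columnCount) throws SQLException {
        dest.setAutoCommit(false);
        try (PreparedStatement ins = dest.prepareStatement(insertSql)) {
            int added = 0;
            while (rows.next()) {
                for (int i = 1; i <= columnCount; i++) {
                    ins.setObject(i, rows.getObject(i));
                }
                ins.addBatch();             // queue the row client-side
                if (flushDue(++added)) {
                    ins.executeBatch();     // one round trip for the whole batch
                    dest.commit();
                }
            }
            ins.executeBatch();             // flush the final partial batch
            dest.commit();
        }
    }
}
```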
Upvotes: 1
Reputation: 23839
You need to use batch inserts. Internally, Connector/J (the MySQL JDBC driver) can rewrite batched inserts into multi-value INSERT statements, but only if you add rewriteBatchedStatements=true to the JDBC URL, e.g. jdbc:mysql://host/db?rewriteBatchedStatements=true (it is off by default).
(Note that this rewriting is done with client-side prepared statements, which are Connector/J's default mode; server-side prepared statements can be enabled with the useServerPrepStmts=true option.)
The code looks like the following:
try (PreparedStatement stmt = connection.prepareStatement(sql)) {
    for (String value : valueList) {
        stmt.clearParameters();
        stmt.setString(1, value);
        stmt.addBatch();
    }
    stmt.executeBatch();
}
The code above will generate a multi-value insert:
INSERT INTO tablename (field) VALUES (value1), (value2), (value3) ...
Upvotes: 1