tchoedak
tchoedak

Reputation: 97

jdbc java.sql.preparedstatement duplicating rows during insert

I have a program in scala that connects to an oracle database using ojdbc, queries a table, and tries to insert records from the java.sql.resultSet into another table on a separate jdbc connection.

//conn1 to oracle: java.sql.Connection = oracle.jdbc.driver.T4CConnection@698122b2
//conn2 to another non-oracle database: java.sql.Connection = com.snowflake.client.jdbc.SnowflakeConnectionV1@6e4566f1

My attempt at capturing results from an oracle table:

val stmt1 = conn1.createStatement()
stmt1.setFetchSize(3000)
val sql1 = "select userid from nex.users"
val result = stmt1.executeQuery(sql1)

and code for attempting to insert records from result to a separate database and table via jdbc:

val insert_sql = "insert into test.users (userid) values (?)"
val ps = conn2.prepareStatement(insert_sql)
val batchSize = 3000
var count = 0
while (result.next) {
    ps.setInt(1, result.getInt(1))
    ps.addBatch()
    count += 1
    if (count % batchSize == 0) ps.executeBatch()
}

What's stumping me is this is almost the exact same syntax in many examples of using jdbc, but in my second table, I'm seeing 4x the original number of rows from the first table.

select userid, count(*) from test.users group by userid


1 4
2 4
3 4
4 4
5 4
6 4
etc

Upvotes: 0

Views: 1806

Answers (2)

Roger
Roger

Reputation: 2953

Yes, clearBatch is missing.

executeBatch() calls clearBatch() in the end. But there is no guarantee for that will be exactly the same in other implementations.

Also, if needed, I am making a minor-subtle addition to tchoedak's answer :)

ps.executeBatch();
conn2.commit();
ps.clearBatch(); 

Upvotes: 1

tchoedak
tchoedak

Reputation: 97

The issue was that I needed to execute ps.clearBatch() after every execute, otherwise the next batch would get piled on top of the previous batch. When trying this on a large table that would need to call executeBatch more often, the amount of duplicate rows were x times higher. The final code looks similar but with ps.clearBatch() .

val ps = conn2.prepareStatement(insert_sql)
val batchSize = 3000
var count = 0
while (result.next) {
    ps.setInt(1, result.getInt(1))
    ps.addBatch()
    count += 1
    if (count % batchSize == 0) 
        ps.executeBatch()
        ps.clearBatch()
}

Upvotes: 0

Related Questions