Reputation: 601
I've written a class method that inserts "batches" of data into the database via SQL; each row to be inserted comes from a two-dimensional array named "data_values".
However, there will be times when my program receives redundant data, i.e. rows that may already be in the database. Because the table has a primary key, the program breaks when an insert fails due to a duplicate entry.
Is there a way to use a try/catch so that the program will continue uploading data, effectively "skipping" the duplicates? If so, how can I implement it?
Thank you in advance. If I can clarify my question, please let me know.
My current code is here:
public void insertData(ArrayList<String> data_types, String[][] data_values) {
    try {
        c.setAutoCommit(false);
        // creates insert statement
        String insertDataScript = "INSERT INTO " + tableName + " VALUES (";
        for (int q = 0; q < data_types.size() - 1; q++) {
            insertDataScript += "?, ";
        }
        insertDataScript += "?)";
        PreparedStatement stmt = c.prepareStatement(insertDataScript);
        for (int i = 0; i < data_values.length; i++) {
            for (int j = 1; j < data_types.size() + 1; j++) {
                if (data_types.get(j - 1).toLowerCase().equals("double")) {
                    stmt.setDouble(j, Double.valueOf(data_values[i][j - 1]));
                }
                else if (data_types.get(j - 1).toLowerCase().equals("string")) {
                    stmt.setString(j, data_values[i][j - 1]);
                }
                else {
                    System.out.println("Error");
                }
            }
            stmt.addBatch();
        }
        stmt.executeBatch();
        c.commit();
        c.setAutoCommit(true);
        stmt.close();
    }
    catch (Exception e) {
        System.err.println(e.getClass().getName() + ": " + e.getMessage());
        System.exit(0);
    }
}
Upvotes: 0
Views: 1080
Reputation: 57043
Simply change (albeit not really exception handling, but rather exception bypassing)
String insertDataScript = "INSERT INTO "+tableName+" VALUES (";
to
String insertDataScript = "INSERT OR IGNORE INTO "+tableName+" VALUES (";
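For reference, here's a minimal self-contained JDBC sketch of the same idea (it assumes the xerial sqlite-jdbc driver is on the classpath and makes up a table for the demo; the ignored duplicate comes back with an update count of 0 rather than failing the batch) :-

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.Arrays;

    public class InsertOrIgnoreDemo {
        public static void main(String[] args) throws SQLException {
            // in-memory SQLite database, so the demo leaves nothing behind
            try (Connection c = DriverManager.getConnection("jdbc:sqlite::memory:")) {
                try (Statement s = c.createStatement()) {
                    s.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, othercolumn TEXT)");
                }
                c.setAutoCommit(false);
                try (PreparedStatement stmt =
                             c.prepareStatement("INSERT OR IGNORE INTO mytable VALUES (?, ?)")) {
                    for (int id : new int[]{10, 11, 12, 10}) {   // 10 appears twice
                        stmt.setInt(1, id);
                        stmt.setString(2, "x");
                        stmt.addBatch();
                    }
                    // the duplicate is skipped instead of aborting the whole batch
                    System.out.println(Arrays.toString(stmt.executeBatch()));  // e.g. [1, 1, 1, 0]
                }
                c.commit();
                c.setAutoCommit(true);
            }
        }
    }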
Consider the following demo (first the suggested approach, then the equivalent of what you currently have). A few notes:

- rowid has been used for convenience as it's basically a built-in primary key.
- the only reason why the columns have been specified, i.e. (rowid,othercolumn,mydatecolumn), is that rowid is normally hidden. In your case just VALUES (without the preceding columns) will expect values for all columns and thus include the defined primary key column(s).
- the statements are shown/actioned in reverse order, as both can then be run together :-
INSERT OR IGNORE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'x','x'),
(11,'x','x'),
(12,'x','x'),
(13,'x','x'),
(14,'x','x'),
(10,'x','x')
;
INSERT INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(20,'x','x'),
(21,'x','x'),
(22,'x','x'),
(23,'x','x'),
(24,'x','x'),
(20,'x','x')
;
results in :-
INSERT OR IGNORE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'x','x'),
(11,'x','x'),
(12,'x','x'),
(13,'x','x'),
(14,'x','x'),
(10,'x','x')
> Affected rows: 5
> Time: 0.208s
i.e. 5 of the 6 were added; the 6th, a duplicate (according to the primary key), was skipped.
INSERT INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(20,'x','x'),
(21,'x','x'),
(22,'x','x'),
(23,'x','x'),
(24,'x','x'),
(20,'x','x')
> UNIQUE constraint failed: mytable.rowid
> Time: 0.006s
i.e. none of the 6 rows are inserted, due to the single duplicate.
If you wanted the data from the duplicates to be applied then instead of INSERT OR IGNORE, you could use INSERT OR REPLACE.
e.g. the following (run after the above, i.e. all rows are duplicates but with different data):-
INSERT OR REPLACE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'xx','x'),
(11,'x','xx'),
(12,'aa','x'),
(13,'x','aa'),
(14,'x','bb'),
(10,'cc','x')
;
then you get :-
INSERT OR REPLACE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'xx','x'),
(11,'x','xx'),
(12,'aa','x'),
(13,'x','aa'),
(14,'x','bb'),
(10,'cc','x')
> Affected rows: 6
> Time: 0.543s
i.e. now all 6 INSERTs are actioned (only 5 rows end up changed, as the 1st and the last update the same row).
Upvotes: 1
Reputation: 51
My first suggestion would be to deduplicate the data before inserting it into the db. (Edit: totally missed the "already in the db" part, so this probably won't work unless you want to do a query before every insert. Maybe you can use an INSERT IGNORE?)
If you cannot do this because you do not have control over the primary key, or there is no way to ignore duplicates in the insert, then there are ways to catch specific exception types and continue the program instead of calling System.exit. To do that you would probably need smaller prepared statements and the try/catch inside the for loop over data_values.
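For example, a rough sketch of how that could look (a hypothetical rewrite of your insertData; it assumes the same c connection and tableName fields from your class, and trades the batch for row-by-row executes):

    public void insertDataSkippingDuplicates(ArrayList<String> data_types,
                                             String[][] data_values) throws SQLException {
        // build the same "INSERT INTO ... VALUES (?, ..., ?)" statement as before
        StringBuilder sql = new StringBuilder("INSERT INTO " + tableName + " VALUES (");
        for (int q = 0; q < data_types.size() - 1; q++) {
            sql.append("?, ");
        }
        sql.append("?)");
        try (PreparedStatement stmt = c.prepareStatement(sql.toString())) {
            for (String[] row : data_values) {
                try {
                    for (int j = 0; j < data_types.size(); j++) {
                        if (data_types.get(j).equalsIgnoreCase("double")) {
                            stmt.setDouble(j + 1, Double.parseDouble(row[j]));
                        } else if (data_types.get(j).equalsIgnoreCase("string")) {
                            stmt.setString(j + 1, row[j]);
                        }
                    }
                    stmt.executeUpdate();   // one row at a time, no batch
                } catch (SQLException e) {
                    // most likely a duplicate primary key: report it and carry on
                    System.err.println("Skipped row: " + e.getMessage());
                }
            }
        }
    }

The trade-off is that row-by-row execution is slower than a batch, but a single duplicate no longer aborts the whole upload.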
Here is a post talking about catching this type of exception: Catch duplicate key insert exception.
Upvotes: 1