DrXCheng

Reputation: 4132

JDBC insert multiple rows

I am currently using a batch:

String query = "INSERT INTO table (id, name, value) VALUES (?, ?, ?)";
PreparedStatement ps = connection.prepareStatement(query);            
for (Record record : records) {
    ps.setInt(1, record.id);
    ps.setString(2, record.name);
    ps.setInt(3, record.value);
    ps.addBatch();
}
ps.executeBatch();

I am just wondering if the above code is equivalent to the following code. If not, which is faster?

String query = "INSERT INTO table (id, name, value) VALUES ";
for (Record record : records) {
    query += "(" + record.id + ",'" + record.name + "'," + record.value + "),";
}
query = query.substring(0, query.length() - 1); // drop the trailing comma
PreparedStatement ps = connection.prepareStatement(query);
ps.executeUpdate();

Upvotes: 41

Views: 76898

Answers (7)

Donovan Smith

Reputation: 736

I found this question while trying to upload a large CSV to my remote database. Based on what I experienced, the two are not the same. The first method, with batching, is the safer solution, but I also found it to be much slower.

I recognize that this is not the recommended way to implement something that accepts user input, but for my use case, it was the first tractable solution I found.

My use case was this: I had a CSV with 21 million rows that I wanted to upload to my database, and I found that using prepared statement batches was much slower. As far as I can tell, this is because each insert statement is added separately to the batch, so a batch of 1,000 executes 1,000 individual insert statements. Whatever the case, this was taking about 30 seconds per batch for me with batch sizes of 1,000. With 21 million rows, that would have taken about a week. Therefore, I deliberately did something UNSAFE that worked much faster.

When I did it this way, each batch of 10,000 rows took about 0.25 seconds. That works out to about 0.025 ms per row versus roughly 30 ms per row the other way, around 1,000x faster, so I thought I would share it for people who are looking for the same answer I was.

For reference, the CSV file that I was using was downloaded from https://fdc.nal.usda.gov/download-datasets.html.

The readCSV function is from Jeronimo Backes's answer to Slow CSV row parsing and splitting, which uses the univocity-parsers library: https://mvnrepository.com/artifact/com.univocity/univocity-parsers

Again, don't do something like this where you are worried about injection attacks. I used it as a really fast way of reading a CSV and loading it into a MySQL database.

    public static void writeFileToCSV(String filename) {

        PreparedStatement stmt = null;
        Connection con = null;
        Integer insertCount = 0;
        Integer batchSize = 10000;
        Integer numBatchesSent = 0;
        IterableResult<String[], ParsingContext> rows = readCSV(filename);
        
    
        try {
            con = mySqlDatabase.getConnection();
            con.setAutoCommit(true); // I found that this did not significantly change the speed, because there is not one query per row.
    
            Integer numInBatch = 0;
            ArrayList<Object> values = new ArrayList<>();
            String sqlStart = "INSERT IGNORE INTO food_nutrient(id, fdc_id, nutrient_id, amount) VALUES ";
            String sqlEnd = " ;";

            StringBuilder sqlBuilder = new StringBuilder();
            StringBuilder valuesBuilder = new StringBuilder();
    
            Integer lineNum = 0;
    
//This is my manual parsing of the csv columns. There may be a slicker way to do this
            for (String[] nextLine : rows) {

                if (lineNum == 0) { //This ignores the header row of the csv.
                    lineNum++;
                    continue;
                }
                
                Integer id = Integer.parseInt(nextLine[0]);
                Integer fdc_id = Integer.parseInt(nextLine[1]);
                Integer nutrient_id = Integer.parseInt(nextLine[2]);
                Double amount = parseDouble(nextLine[3]);
                if (valuesBuilder.length() > 0) {
                    valuesBuilder.append(',');
                }
                valuesBuilder.append("(");
                valuesBuilder.append(id);
                valuesBuilder.append(',');
    
                if (fdc_id != null) {
                    valuesBuilder.append(fdc_id);
                    valuesBuilder.append(',');
                } else {
                    valuesBuilder.append("NULL,");
                }
                if (nutrient_id != null) {
                    valuesBuilder.append(nutrient_id);
                    valuesBuilder.append(',');
                } else {
                    valuesBuilder.append("NULL,");
                }
                if (amount != null) {
                    valuesBuilder.append(amount);
                    valuesBuilder.append(')');
                } else {
                    valuesBuilder.append("NULL)");
                }
    
    
    
                if (++insertCount % batchSize == 0) {
                    
                    sqlBuilder.append(sqlStart);
                    sqlBuilder.append(valuesBuilder);
                    sqlBuilder.append(sqlEnd);
                    stmt = con.prepareStatement(sqlBuilder.toString());
                    sqlBuilder = new StringBuilder();
                    valuesBuilder = new StringBuilder();
                    stmt.executeUpdate();
                    numBatchesSent++;
    
                    System.out.println("Sent batch "+ numBatchesSent + " with " + batchSize + " new rows.");
                    
                } //: send the batch
            }//: For each row in the csv
    
            //Send the values that were in the last batch.
            sqlBuilder.append(sqlStart);
            sqlBuilder.append(valuesBuilder);
            sqlBuilder.append(sqlEnd);
            stmt = con.prepareStatement(sqlBuilder.toString());
            sqlBuilder = new StringBuilder();
            valuesBuilder = new StringBuilder();
            stmt.executeUpdate();
            
    
        } catch (SQLException ex) {
            System.out.println(ex.getMessage());
            ex.printStackTrace();
        } catch (DataAccessException ex) {
            ex.printStackTrace();
        }
    
        finally {
            // Nothing to flush here: the leftover rows are sent with executeUpdate() above,
            // and addBatch() is never used, so there is no batch to execute.
            try {
                if (stmt != null) {
                    stmt.close();
                }
                if (con != null) {
                    con.close();
                }
                rows.getContext().stop();
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    
    }
    


    public static IterableResult<String[], ParsingContext> readCSV(String filePath) {
        File file = new File(filePath);
        //configure the parser here. By default all values are trimmed
        CsvParserSettings parserSettings = new CsvParserSettings();
        //create the parser
        CsvParser parser = new CsvParser(parserSettings);
        //create an iterable over rows. This will not load everything into memory.
        IterableResult<String[], ParsingContext> rows = parser.iterate(file);
        return rows;
    }
    

  //This just takes care of NumberFormatExceptions that I had been getting.
    private static Double parseDouble(String str) {
        try {
            Double value=Double.parseDouble(str);
            return value;
        } catch (NumberFormatException ex) {
            System.out.println("There was probably a null value");
            return null;
        }
    }

Upvotes: 0

Masoud Bokaei

Reputation: 31

If your record list size is less than or equal to 1,000, the following code is better than both of your versions:

StringBuilder query = new StringBuilder("INSERT INTO table (id, name, value) VALUES ");

if (records.size() <= 1000) {
    
    for (int i = 0; i < records.size(); i++)
        query.append("(?, ?, ?), ");

    query = new StringBuilder(query.substring(0, query.length() - 2)); // remove the trailing ", "

    PreparedStatement ps = connection.prepareStatement(query.toString());

    for (int i = 0; i < records.size(); i++) {
        Record record = records.get(i);
        ps.setInt((i * 3) + 1, record.id);
        ps.setString((i * 3) + 2, record.name);
        ps.setInt((i * 3) + 3, record.value);
    }
    
    ps.executeUpdate();
    
}

This way you still use a PreparedStatement, but you build it dynamically based on the size of your record list, putting multiple value tuples into a single insert query.
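
For longer lists, the same idea can be applied in chunks so each generated statement stays a manageable size. This is only a sketch under the question's assumptions (a Record class with id, name, and value fields, and an open connection); the chunk size of 1,000 is an arbitrary choice here, not a JDBC limit:

// Assumes the enclosing method declares throws SQLException.
int chunkSize = 1000;
for (int start = 0; start < records.size(); start += chunkSize) {
    List<Record> chunk = records.subList(start, Math.min(start + chunkSize, records.size()));

    // Build "INSERT ... VALUES (?, ?, ?), (?, ?, ?), ..." for this chunk only.
    StringBuilder sql = new StringBuilder("INSERT INTO table (id, name, value) VALUES ");
    for (int i = 0; i < chunk.size(); i++) {
        sql.append(i == 0 ? "(?, ?, ?)" : ", (?, ?, ?)");
    }

    try (PreparedStatement ps = connection.prepareStatement(sql.toString())) {
        for (int i = 0; i < chunk.size(); i++) {
            Record record = chunk.get(i);
            ps.setInt((i * 3) + 1, record.id);
            ps.setString((i * 3) + 2, record.name);
            ps.setInt((i * 3) + 3, record.value);
        }
        ps.executeUpdate();
    }
}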

Upvotes: 2

Reimeus

Reputation: 159774

Turn off autocommit

executeBatch will give better performance than individual executeUpdate calls as long as autocommit is set to false:

connection.setAutoCommit(false);  
PreparedStatement ps = connection.prepareStatement(query);            
for (Record record : records) {
    // etc.
    ps.addBatch();
}
ps.executeBatch();
connection.commit(); 

Upvotes: 46

Memin

Reputation: 4090

You can face a serious performance issue if the number of items you want to insert is large. Therefore, it is safer to define a batch size and execute the batch each time that size is reached.

Something like the following example code should work. For the full story of how to use this code efficiently, please see this link.

private static void insertList2DB(List<String> list) {
        final int batchSize = 1000; //Batch size is important.
        Connection conn = getConnection();
        PreparedStatement ps = null;
        try {
            String sql = "INSERT INTO theTable (aColumn) VALUES (?)";
            ps = conn.prepareStatement(sql);

            int insertCount=0;
            for (String item : list) {
                ps.setString(1, item);
                ps.addBatch();
                if (++insertCount % batchSize == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch();

        } catch (SQLException e) {
            e.printStackTrace();
            System.exit(1);
        }
    finally {
        try {
            if (ps != null) {
                ps.close();
            }
            if (conn != null) {
                conn.close();
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
} 

Upvotes: 5

sanjaykumar

Reputation: 29

public void insertBatch(final List<Record> records) {

    String query = "INSERT INTO table (id, name, value) VALUES (?, ?, ?)";


    // Spring JdbcTemplate batchUpdate with a BatchPreparedStatementSetter.
    GenericDAO.getJdbcTemplate().batchUpdate(query, new BatchPreparedStatementSetter() {


        @Override
        public void setValues(PreparedStatement ps, int i) throws SQLException {
              Record record = records.get(i);
              ps.setInt(1, record.id);
              ps.setString(2, record.name);
              ps.setInt(3, record.value);
        }

        @Override
        public int getBatchSize() {
            return records.size();
        }
    });
}

Upvotes: 1

ARNAB2012

Reputation: 400

I think this will do:

String query = "INSERT INTO table (id, name, value) VALUES ";
for (Record record : records)
{
query += "(" + record.id + ",'" + record.name + "'," + record.value + "),";
query = query.substring(1, query.length() - 1);
PreparedStatement ps = connection.prepareStatement(query);
ps.executeUpdate();
}

because you have to execute the query for each record to insert it into the database.

Upvotes: -7

Filipe Fedalto

Reputation: 2540

First of all, with query string concatenation you not only lose the type conversion native to the PreparedStatement setter methods, but you also become vulnerable to malicious code being executed in the database.

Second, PreparedStatements are cached by the database itself, and this alone already gives a very good performance improvement over plain Statements.
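
To illustrate the first point with the question's own table and fields: a name value containing a single quote (or deliberately crafted SQL) is spliced straight into the statement text when you concatenate, while a parameter setter always sends it as data. A minimal sketch reusing the question's Record and connection:

// Unsafe: record.name becomes part of the SQL text, so a quote in it can change the statement.
String unsafe = "INSERT INTO table (id, name, value) VALUES ("
        + record.id + ",'" + record.name + "'," + record.value + ")";

// Safe: values are bound as typed parameters and never interpreted as SQL.
PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO table (id, name, value) VALUES (?, ?, ?)");
ps.setInt(1, record.id);
ps.setString(2, record.name); // a quote in record.name stays inside the parameter
ps.setInt(3, record.value);
ps.executeUpdate();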

Upvotes: 32

Related Questions