basu

Reputation: 87

What is the best practice for storing 50,000+ records in MySQL in a single transaction?

Input: thousands (>10,000) of CSV files, each containing >50,000 entries. Output: store that data in a MySQL database.

Approach taken: read each file and store its data in the database. Below is the code snippet. Please suggest whether this approach is OK or not.

    PreparedStatement pstmt2 = null;
    Stream<String> lines = null;
    int linecnt = 0;
    try {
        pstmt1 = con.prepareStatement(sqlQuery);
        result = pstmt1.executeUpdate();
        con.setAutoCommit(false);
        sqlQuery = "insert into " + tableName
                + " (x,y,z,a,b,c) values (?,?,?,?,?,?)";
        pstmt2 = con.prepareStatement(sqlQuery);
        Path file = Paths.get(filename);

        lines = Files.lines(file, StandardCharsets.UTF_8);
        final int batchsz = 5000;
        for (String line : (Iterable<String>) lines::iterator) {
            pstmt2.setString(1, "somevalue");
            pstmt2.setString(2, "somevalue");
            pstmt2.setString(3, "somevalue");
            pstmt2.setString(4, "somevalue");
            pstmt2.setString(5, "somevalue");
            pstmt2.setString(6, "somevalue");
            pstmt2.addBatch();
            if (++linecnt % batchsz == 0) {
                pstmt2.executeBatch();
            }
        }
        int[] batchResult = pstmt2.executeBatch();
        con.commit();
    } catch (BatchUpdateException e) {
        log.error(Utility.dumpExceptionMessage(e));
    } catch (IOException ioe) {
        log.error(Utility.dumpExceptionMessage(ioe));
    } catch (SQLException e) {
        log.error(Utility.dumpExceptionMessage(e));
    } finally {
        if (lines != null) {
            lines.close();
        }
        try {
            if (pstmt1 != null) {
                pstmt1.close();
            }
            if (pstmt2 != null) {
                pstmt2.close();
            }
        } catch (SQLException e) {
            log.error(Utility.dumpExceptionMessage(e));
        }
    }

Upvotes: 3

Views: 812

Answers (2)

Rodrigo

Reputation: 614

I've used LOAD DATA INFILE in situations like this in the past.

The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed. LOAD DATA INFILE is the complement of SELECT ... INTO OUTFILE. (See Section 14.2.9.1, “SELECT ... INTO Syntax”.) To write data from a table to a file, use SELECT ... INTO OUTFILE. To read the file back into a table, use LOAD DATA INFILE. The syntax of the FIELDS and LINES clauses is the same for both statements. Both clauses are optional, but FIELDS must precede LINES if both are specified.

The IGNORE number LINES option can be used to ignore lines at the start of the file. For example, you can use IGNORE 1 LINES to skip over an initial header line containing column names:

LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test IGNORE 1 LINES;

http://dev.mysql.com/doc/refman/5.7/en/load-data.html
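As a sketch of how this could be driven from the asker's Java code (the table name `test`, column list, and file path below are hypothetical placeholders, not from the question), one can build a LOAD DATA statement per CSV file and execute it through a plain JDBC Statement:

```java
public class LoadDataSql {
    // Build a LOAD DATA LOCAL INFILE statement for one CSV file.
    // Table and column names here are illustrative only.
    static String buildLoadSql(String csvPath, String tableName) {
        return "LOAD DATA LOCAL INFILE '" + csvPath.replace("'", "''") + "'"
             + " INTO TABLE " + tableName
             + " FIELDS TERMINATED BY ','"
             + " LINES TERMINATED BY '\\n'"
             + " IGNORE 1 LINES"          // skip the CSV header row
             + " (x,y,z,a,b,c)";
    }

    public static void main(String[] args) {
        // With a live connection this string would be passed to
        // Statement.execute(); here we just print it.
        System.out.println(buildLoadSql("/tmp/test.csv", "test"));
    }
}
```

Note that LOCAL INFILE must be enabled on both client and server, and the MySQL account needs the FILE privilege for the non-LOCAL form.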

Upvotes: 2

e4c5

Reputation: 53734

As @Rodrigo has already pointed out, LOAD DATA INFILE is the way to go. Java is not really needed at all.

If the format of your CSV is not something that can be directly inserted into the database, your Java code can re-enter the picture. Use it to reorganize/transform the CSV and save it as another CSV file instead of writing it into the database.

You can also use the Java code to iterate through the folder that contains the CSV files, and then execute the system command for each one:

    Runtime r = Runtime.getRuntime();
    Process p = r.exec("mysql -ppassword -u user database -e \"LOAD DATA INFILE ....\"");

You will find that this is much, much faster than running individual SQL queries for each row of the CSV file.
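A minimal sketch of that loop, using ProcessBuilder rather than Runtime.exec to avoid shell-quoting problems (the directory, credentials, and table name below are all hypothetical placeholders):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class BulkLoader {
    // Build the mysql CLI invocation for one CSV file.
    // User, password, database, and table name are illustrative only.
    static List<String> buildCommand(String user, String password,
                                     String db, File csv) {
        List<String> cmd = new ArrayList<>();
        cmd.add("mysql");
        cmd.add("--local-infile=1");
        cmd.add("-u" + user);
        cmd.add("-p" + password);   // no space after -p for the mysql client
        cmd.add(db);
        cmd.add("-e");
        cmd.add("LOAD DATA LOCAL INFILE '" + csv.getAbsolutePath()
                + "' INTO TABLE test FIELDS TERMINATED BY ',' IGNORE 1 LINES");
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        File dir = new File("/path/to/csvdir"); // hypothetical CSV folder
        File[] files = dir.listFiles((d, name) -> name.endsWith(".csv"));
        if (files == null) return;
        for (File f : files) {
            // One mysql process per file; inheritIO surfaces errors on the console.
            Process p = new ProcessBuilder(buildCommand("user", "password",
                    "database", f)).inheritIO().start();
            p.waitFor();
        }
    }
}
```

Passing each argument as a separate list element means file paths with spaces need no extra escaping, unlike the single-string Runtime.exec form.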

Upvotes: 1
