Peter

Reputation: 157

Copy Tables From Source Database to Destination Database On Same Host System (java.lang.OutOfMemoryError)

I need to query a database and copy the result set into another database, which has the same database structure and is also on the same host system.

The following Java function works well (fast and without errors) as long as the query result is small:

public void copyTableData(Connection dbConnOnSrcDB, Connection dbConnOnDestDB,
                          String sqlQueryOnSrcDB, String tableNameOnDestDB) 
  throws SQLException {

    try (

        PreparedStatement prepSqlStatmOnSrcDB = dbConnOnSrcDB.prepareStatement(sqlQueryOnSrcDB);

        ResultSet sqlResultsFromSrcDB = prepSqlStatmOnSrcDB.executeQuery()
    ) {
        ResultSetMetaData sqlMetaResults = sqlResultsFromSrcDB.getMetaData();

        // Collect the column names of the source query
        List<String> columnsOfQuery = new ArrayList<>();

        for (int i = 1; i <= sqlMetaResults.getColumnCount(); i++)
            columnsOfQuery.add(sqlMetaResults.getColumnName(i));

        try (
            PreparedStatement prepSqlStatmOnDestDB = dbConnOnDestDB.prepareStatement(
                "INSERT INTO " + tableNameOnDestDB +
                     " (" + columnsOfQuery.stream().collect(Collectors.joining(", ")) + ") " +
                        "VALUES (" + columnsOfQuery.stream().map(c -> "?").collect(Collectors.joining(", ")) + ")")
        ) {

            while (sqlResultsFromSrcDB.next()) {
                for (int i = 1; i <= sqlMetaResults.getColumnCount(); i++)
                    prepSqlStatmOnDestDB.setObject(i, sqlResultsFromSrcDB.getObject(i));

                prepSqlStatmOnDestDB.addBatch();
            }
            prepSqlStatmOnDestDB.executeBatch();
        }
    }
}

But I also have very large queries whose result sets are in the range of several hundred megabytes.

Problem A: I found out that the OutOfMemoryError below is raised when the following line of code is executed:

ResultSet sqlResultsFromSrcDB = prepSqlStatmOnSrcDB.executeQuery()

Java exception:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
at java.lang.Class.getDeclaredField(Class.java:2068)
at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl$1.run(AtomicReferenceFieldUpdater.java:323)
at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl$1.run(AtomicReferenceFieldUpdater.java:321)
at java.security.AccessController.doPrivileged(Native Method)
at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.<init>(AtomicReferenceFieldUpdater.java:320)
at java.util.concurrent.atomic.AtomicReferenceFieldUpdater.newUpdater(AtomicReferenceFieldUpdater.java:110)
at java.sql.SQLException.<clinit>(SQLException.java:372)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2156)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:300)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:117)
at Application.copyTableData(Application.java:159)
at Application.main(Application.java:585)

Problem B: The copy job takes a very long time. Is there a way to speed up the copy process?

The tables to copy are:

String[] tables = new String[]{
                    "table1",
                    "table1_properties",
                    "table1_addresses",
                    "table2",
                    "table3",
                    "table4",
                    "table5",
                    "table6",
                    "table7",
                    "table8",
                    "table9",
                    "table10"
            };

Function call:

for (String table : tables) {

  getDataFromSrcDB = "SELECT " + table + ".* " +
    "FROM table1 " +
        "FULL JOIN table1_properties " +
            "ON table1_properties.d_id=table1.d_id " +
        "FULL JOIN table1_addresses " +
            "ON table1_addresses.d_id=table1_properties.d_id " +
        "FULL JOIN table2 " +
            "ON table2.p_id=table1_properties.p_id " +
        "FULL JOIN table3 " +
            "ON table3.d_id=table1.d_id " +
        "FULL JOIN table4 " +
            "ON table4.d_id=table1.d_id " +
        "FULL JOIN table5 " +
            "ON table5.d_id=table1.d_id " +
        "FULL JOIN table6 " +
            "ON table6.d_id=table1.d_id " +
        "FULL JOIN table7 " +
            "ON table7.d_id=table1.d_id " +
        "FULL JOIN table8 " +
            "ON table8.id=table4.id " +
        "FULL JOIN table9 " +
            "ON table9.d_id=table1.d_id " +
        "FULL JOIN table10 " +
            "ON table10.a_id=table1_addresses.a_id " +
        "WHERE ST_Intersects(ST_MakeEnvelope(" +
               minLong + "," +
               minLat + "," +
               maxLong + "," +
               maxLat + ",4326), geom :: GEOMETRY) OR " +
        "ST_Intersects(ST_MakeEnvelope(" +
               minLong + "," +
               minLat + "," +
               maxLong + "," +
               maxLat + ",4326), CAST(table3.location AS GEOMETRY))";

   copyTableData(dbConnOnSrcDB, dbConnOnDestDB, getDataFromSrcDB, table);
}

Upvotes: 2

Views: 494

Answers (3)

utrucceh

Reputation: 1116

When you fetch data directly, it fills up RAM until all of the data has been fetched, so you can easily run into an OutOfMemoryError.

If you fetch the data as a stream instead, you can handle an arbitrarily large amount of data: rows are fetched and processed in chunks (sized via fetchSize), and the memory used by already-processed rows can be reclaimed before the next chunk is fetched.
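A minimal sketch of what this looks like with the PostgreSQL JDBC driver (the variable names dbConnOnSrcDB and sqlQueryOnSrcDB are taken from the question; the fetch size of 1000 is an arbitrary example):

// Cursor-based fetching only works when autocommit is off
dbConnOnSrcDB.setAutoCommit(false);

try (PreparedStatement stmt = dbConnOnSrcDB.prepareStatement(sqlQueryOnSrcDB)) {
    // Hint the driver to fetch rows in chunks instead of all at once
    stmt.setFetchSize(1000);

    try (ResultSet rs = stmt.executeQuery()) {
        while (rs.next()) {
            // Process one row here; only the current chunk is held in memory
        }
    }
    dbConnOnSrcDB.commit(); // close the server-side cursor cleanly
}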

Upvotes: 0

Youcef LAIDANI
Youcef LAIDANI

Reputation: 59950

When the size of the batch is huge, you get this error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

I have two solutions.

First Solution

Instead, you can divide the batch into smaller batches, persisting the data every 1_000 rows for example. You also need some configuration, as Mark Rotteveel mentions in the comments and as the documentation explains under Getting results based on a cursor:

By default the driver collects all the results for the query at once. This can be inconvenient for large data sets so the JDBC driver provides a means of basing a ResultSet on a database cursor and only fetching a small number of rows.

So what you should do:

  • The connection to the server must use the V3 protocol.
  • The Connection must not be in autocommit mode.
  • The query given must be a single statement.
  • The fetch size of the Statement must be set to an appropriate value.
  • ...read the details in the documentation.

In this case your code can look like this:

// Note: disable autocommit on the source connection,
// otherwise the driver ignores the fetch size
dbConnOnSrcDB.setAutoCommit(false);

final int batchSize = 1000;
final int fetchSize = 50;
int count = 0;
...
// Set the fetch size on the source statement BEFORE executing the query,
// so the ResultSet is based on a cursor rather than fully materialized
prepSqlStatmOnSrcDB.setFetchSize(fetchSize);
while (sqlResultsFromSrcDB.next()) {
    for (int i = 1; i <= sqlMetaResults.getColumnCount(); i++) {
        prepSqlStatmOnDestDB.setObject(i, sqlResultsFromSrcDB.getObject(i));
    }
    prepSqlStatmOnDestDB.addBatch();
    if (++count % batchSize == 0) {
        prepSqlStatmOnDestDB.executeBatch();
    }
}
prepSqlStatmOnDestDB.executeBatch(); // insert remaining records

Second Solution

Because you are using PostgreSQL, I would use dblink to transfer data from one database to the other.
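For example, run against the destination database (a sketch with hypothetical names: the source database name srcdb and the column list must be adapted to your schema, and the dblink extension must be installed first with CREATE EXTENSION dblink):

// dblink pulls the rows from the source database inside a single
// server-side statement, so no row data passes through the Java heap
try (Statement stmt = dbConnOnDestDB.createStatement()) {
    stmt.executeUpdate(
        "INSERT INTO table1 " +
        "SELECT * FROM dblink('dbname=srcdb', 'SELECT d_id, name FROM table1') " +
        "AS t(d_id integer, name text)");
}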



Upvotes: 4

Avhi

Reputation: 886

There are many ways to achieve this. Here are a few options you can apply:

  1. Read data from the first database, write it to a CSV file, and then read the CSV file back in chunks and write it to the other database (easy to implement, but more coding; see the sketch after this list): https://gauravmutreja.wordpress.com/2011/10/13/exporting-your-database-to-csv-file-in-java/

  2. If you don't need much data manipulation before transferring the data to the other database, you can write a simple database function that reads data from one database and writes it into the other.

  3. Or you can try Spring Batch to do this.
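For option 1, a minimal sketch using the PostgreSQL JDBC driver's CopyManager API (the class name, method name, and temp-file handling here are illustrative, not part of the original answer); COPY streams the rows through the file, so the whole table never sits in the Java heap at once:

import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;

public class CsvTransfer {

    public static void transferViaCsv(Connection src, Connection dest, String table) throws Exception {
        Path csv = Files.createTempFile(table, ".csv");
        try {
            // Export: COPY streams the table to the file row by row
            CopyManager copyOut = src.unwrap(PGConnection.class).getCopyAPI();
            try (Writer w = Files.newBufferedWriter(csv)) {
                copyOut.copyOut("COPY " + table + " TO STDOUT WITH (FORMAT csv)", w);
            }

            // Import: stream the CSV file into the destination table
            CopyManager copyIn = dest.unwrap(PGConnection.class).getCopyAPI();
            try (Reader r = Files.newBufferedReader(csv)) {
                copyIn.copyIn("COPY " + table + " FROM STDIN WITH (FORMAT csv)", r);
            }
        } finally {
            Files.deleteIfExists(csv);
        }
    }
}

For query results rather than whole tables, PostgreSQL also accepts COPY (SELECT ...) TO STDOUT.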

Upvotes: 0
