Tyrovar
Tyrovar

Reputation: 33

Fixing GC Overhead Limit Exceeded Without Increasing Heap Size

I am working on a Java program that will take data from a Sybase database and, using UCanAccess, import it into a Microsoft Access Database. However, I am currently running into a problem, receiving the error “java.lang.OutOfMemoryError: GC overhead limit exceeded”.

To put the situation into context, I am attempting to import approximately 1.3 million records into the Access Database. The program currently encounters the error after approximately 800,000 of these records have been imported, about ten minutes at run time, and long after the ResultSet has been retrieved from the Sybase Database.

I have attempted to modify the heap size, but that causes the program to slow down significantly. Note that this is an ad hoc program to be run multiple times as needed, so the run time should be in the order of minutes or possibly hours, whereas increasing the heap size, based on my observations, would increase the run time to the order of days.

For reference, the error occurs in the main method, during the subroutine called getRecords (the exact line of code that this occurs on varies on a run-by-run basis). I have included the code to the program below, with some minor changes to parts of the code, such as the exact query I am using and the username and password to the access database, so as not to reveal sensitive information.

Is there anything that I can change in the code of my program to ease the load on the garbage collector without increasing the run time beyond a few hours?

EDIT: It appears that I was mistaken as to the default max heap size of Java. When I thought I was increasing the heap size by setting it to 512m, I was unintentionally cutting the heap size in half. When I set the heap size to 2048m instead, I got a java heap space error. I would still like to solve the problem without modifying the heap size, if possible.

EDIT 2: Apparently, I was misled as to a number of records I needed to process. It is double the size I originally thought it was, which indicates that I need to drastically change my approach. Going to go ahead and accept an answer, because that answer did result in large improvements.

getRecords method:

   public static void getRecords(SybaseDatabase sdb, AccessDatabase adb)
    {
        ArrayList<Record> records = new ArrayList<Record>();
        StringBuffer sql = new StringBuffer();
        Record currentRecord = null;
        try{
            Statement sybStat = sdb.connection.createStatement();
            PreparedStatement resetADB = adb.connection.prepareStatement("DELETE FROM Table");
            PreparedStatement accStat = adb.connection.prepareStatement("INSERT INTO Table (A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)");
sql.append(query);//query is a placeholder, as I cannot give out the actual query to the database. I have confirmed that the query itself gives the ResultSet that I am looking for
            ResultSet rs = sybStat.executeQuery(sql.toString());
            resetADB.executeUpdate();
            boolean nextWatch = true;
            Integer i = 1;
            Record r = new Record();
            while(nextWatch)
            {
                for (int j = 0; j < 1000 && nextWatch; j++)
                {
                    nextWatch = rs.next();

                    r.setColumn(i, 0);
                    r.setColumn(rs.getString("B"), 1);
                    r.setColumn(rs.getString("C"), 2);
                    r.setColumn(rs.getString("D"), 3);
                    r.setColumn(rs.getString("E"), 4);
                    r.setColumn(rs.getString("F"), 5);
                    r.setColumn(rs.getString("G"), 6);
                    r.setColumn(rs.getString("H"), 7);
                    r.setColumn(rs.getString("I"), 8);
                    r.setColumn(rs.getString("J"), 9);
                    r.setColumn(rs.getString("K"), 10);
                    r.setColumn(rs.getInt("L"), 11);
                    r.setColumn(rs.getString("M"), 12);
                    r.setColumn(rs.getString("N"), 13);
                    r.setColumn(rs.getString("O"), 14);
                    r.setColumn(rs.getString("P"), 15);

                    records.add(r);
                    i++;
                }

                for(int k = 0; k < records.size(); k++)
                {
                    currentRecord = records.get(k);

                    for(int m = 0; m < currentRecord.getNumOfColumns(); m++)
                    {
                        if (currentRecord.getColumn(m) instanceof String)
                        {
                            accStat.setString(m + 1, "\"" + currentRecord.getColumn(m) + "\"");
                        }
                        else
                        {
                            accStat.setInt(m + 1, Integer.parseInt(currentRecord.getColumn(m).toString()));
                        }
                    }
                    accStat.addBatch();
                }
                accStat.executeBatch();
                accStat.clearBatch();
                records.clear();
            }
            adb.connection.commit();
        }
        catch(Exception e){
            e.printStackTrace();
        }
        finally{

        }   
    }
}

Full code:

import java.util.*;
import java.sql.*;
import com.sybase.jdbc2.jdbc.SybDriver;//This is an external file that is used to connect to the Sybase database. I will not include the full code here for the sake of space but will provide it upon request.

public class SybaseToAccess {
    public static void main(String[] args){
        String accessDBPath = "C:/Users/me/Desktop/Database21.accdb";//This is a placeholder, as I cannot give out the exact file path. However, I have confirmed that it points to the correct file on the system.
        String sybaseDBPath = "{sybServerName}:{sybServerPort}/{sybDatabase}";//See above comment
        try{
            AccessDatabase adb = new AccessDatabase(accessDBPath);
            SybaseDatabase sdb = new SybaseDatabase(sybaseDBPath, "user", "password");

            getRecords(sdb, adb);
        }
        catch(Exception e){
            e.printStackTrace();
        }
        finally{

        }       
    }
    public static void getRecords(SybaseDatabase sdb, AccessDatabase adb)
    {
        ArrayList<Record> records = new ArrayList<Record>();
        StringBuffer sql = new StringBuffer();
        Record currentRecord = null;
        try{
            Statement sybStat = sdb.connection.createStatement();
            PreparedStatement resetADB = adb.connection.prepareStatement("DELETE FROM Table");
            PreparedStatement accStat = adb.connection.prepareStatement("INSERT INTO Table (A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)");
sql.append(query);//query is a placeholder, as I cannot give out the actual query to the database. I have confirmed that the query itself gives the ResultSet that I am looking for
            ResultSet rs = sybStat.executeQuery(sql.toString());
            resetADB.executeUpdate();
            boolean nextWatch = true;
            Integer i = 1;
            Record r = new Record();
            while(nextWatch)
            {
                for (int j = 0; j < 1000 && nextWatch; j++)
                {
                    nextWatch = rs.next();

                    r.setColumn(i, 0);
                    r.setColumn(rs.getString("B"), 1);
                    r.setColumn(rs.getString("C"), 2);
                    r.setColumn(rs.getString("D"), 3);
                    r.setColumn(rs.getString("E"), 4);
                    r.setColumn(rs.getString("F"), 5);
                    r.setColumn(rs.getString("G"), 6);
                    r.setColumn(rs.getString("H"), 7);
                    r.setColumn(rs.getString("I"), 8);
                    r.setColumn(rs.getString("J"), 9);
                    r.setColumn(rs.getString("K"), 10);
                    r.setColumn(rs.getInt("L"), 11);
                    r.setColumn(rs.getString("M"), 12);
                    r.setColumn(rs.getString("N"), 13);
                    r.setColumn(rs.getString("O"), 14);
                    r.setColumn(rs.getString("P"), 15);

                    records.add(r);
                    i++;
                }

                for(int k = 0; k < records.size(); k++)
                {
                    currentRecord = records.get(k);

                    for(int m = 0; m < currentRecord.getNumOfColumns(); m++)
                    {
                        if (currentRecord.getColumn(m) instanceof String)
                        {
                            accStat.setString(m + 1, "\"" + currentRecord.getColumn(m) + "\"");
                        }
                        else
                        {
                            accStat.setInt(m + 1, Integer.parseInt(currentRecord.getColumn(m).toString()));
                        }
                    }
                    accStat.addBatch();
                }
                accStat.executeBatch();
                accStat.clearBatch();
                records.clear();
            }
            adb.connection.commit();
        }
        catch(Exception e){
            e.printStackTrace();
        }
        finally{

        }   
    }
}

class AccessDatabase{
    public Connection connection = null;
    public AccessDatabase(String filePath)
            throws Exception
        {
            String dbString = null;
            dbString   = "jdbc:ucanaccess://" + filePath; 
            connection = DriverManager.getConnection(dbString);
            connection.setAutoCommit(false);
        }
}
class Record{
    ArrayList<Object> columns;
public
    Record(){
        columns = new ArrayList<Object>();
        columns.add("Placeholder1");
        columns.add("Placeholder2");
        columns.add("Placeholder3");
        columns.add("Placeholder4");
        columns.add("Placeholder5");
        columns.add("Placeholder6");
        columns.add("Placeholder7");
        columns.add("Placeholder8");
        columns.add("Placeholder9");
        columns.add("Placeholder10");
        columns.add("Placeholder11");
        columns.add("Placeholder12");
        columns.add("Placeholder13");
        columns.add("Placeholder14");
        columns.add("Placeholder15");
        columns.add("Placeholder16");
    }

    <T> void setColumn(T input, int colNum){
        columns.set(colNum, input);
    }

    Object getColumn(int colNum){
        return columns.get(colNum);
    }

    int getNumOfColumns()
    {
        return columns.size();
    }
}

class SybaseDatabase{
    public Connection connection;

    @SuppressWarnings("deprecation")
    public SybaseDatabase(String filePath, String Username, String Password)
        throws Exception
    {
        SybDriver driver;

        try 
        {
            driver = (SybDriver)Class.forName("com.sybase.jdbc2.jdbc.SybDriver").newInstance();
            driver.setVersion(SybDriver.VERSION_6);
            DriverManager.registerDriver(driver);
        } 
        catch (Exception e) 
        {
            e.printStackTrace(System.err);
        }   

        connection = DriverManager.getConnection("jdbc:sybase:Tds:" + filePath, Username, Password);
    }
}

Upvotes: 0

Views: 9479

Answers (1)

Prim
Prim

Reputation: 2968

If you want to use less memory, you should process less lines in same time but reuse all objects you can reuse (like the PreparedStatement)

First : You use an ArrayList<> in Record with a fixed size. You can just use an array Record[] for that. The principle of ArrayList is to have an array with a dynamic size which you don't need here

Second : don't load all the data from database before handle it, load a few part of data and process it, and continue.

You can do that by extracting the part of your code processing some rows and changing your query by limiting the number of returned rows.

Now, you load 1000 rows (from index 0 to 999), you process and commit them. Then you load 1000 rows (from index 1000 to 1999), you process and commit them. And then you continue. Between each pack of rows, don't keep any reference on precessed data (like on records) to avoid them to be kept in memory (like that they will be garbage-collected when necessary).

If you still have not enought memory, i guess you kept a reference on some objects which are not garbage collected due to that, causing a memory leak problem : your program need more and more memory when processing each data. You can use some tools like the jvisualvm (provided within java) to investigate the use of the memory

Upvotes: 1

Related Questions