Patrick
Patrick

Reputation: 831

Select 100+ millions of rows in HSQLDB

I have to iterate through a table with 100+ millions of records with JDBC on a HSQLDB database and I don't manage to do it in a reasonable time. I use hsqldb v2.4.0.

I tried to use a PreparedStatement with the following query to slice the data:

String select = "SELECT ID, NAME, VALUE FROM MY_TABLE ORDER BY ID OFFSET ? ROWS FETCH ? ROWS ONLY";

The problem is that it takes more and more time as we go through the table. Note that the ID column is indexed.

I tried to put a fetch size but it doesn't work either :

String select = "SELECT ID, NAME, VALUE FROM MY_TABLE";
PreparedStatement selectStatement = connection.prepareStatement(select, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
selectStatement.setFetchSize(5000);

And then I iterate through the ResultSet :

ResultSet result = selectStatement.executeQuery();
while (result.next()) {
    Long id = result.getLong(1);
    // do stuff ...
} 

HSQLDB still tries to fetch all the rows of the table and the returned ResultSetdoesn't fit in memory. Here is the stacktrace :

java.lang.OutOfMemoryError: Java heap space
at org.hsqldb.navigator.RowSetNavigatorData.ensureCapacity(Unknown Source)
at org.hsqldb.navigator.RowSetNavigatorData.add(Unknown Source)
at org.hsqldb.QuerySpecification.buildResult(Unknown Source)
at org.hsqldb.QuerySpecification.getSingleResult(Unknown Source)
at org.hsqldb.QuerySpecification.getResult(Unknown Source)
at org.hsqldb.StatementQuery.getResult(Unknown Source)
at org.hsqldb.StatementDMQL.execute(Unknown Source)
at org.hsqldb.Session.executeCompiledStatement(Unknown Source)
at org.hsqldb.Session.execute(Unknown Source)
at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(Unknown Source)
at org.hsqldb.jdbc.JDBCPreparedStatement.executeQuery(Unknown Source)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at myclass at the line ResultSet result = selectStatement.executeQuery();

Is there a way to achieve this in HSQLDB?

Upvotes: 0

Views: 608

Answers (1)

fredt
fredt

Reputation: 24372

This issue is not related to memory use, as not too much memory is used for this type of SELECT.

The increase in time to select the result is expected. The OFFSET clause in the SELECT indicates how many rows are skipped. As it gets larger, more rows are selected and skipped.

You need to modify your SELECT to:

SELECT ID, NAME, VALUE FROM MY_TABLE WHERE ID > ? ORDER BY ID FETCH ? ROWS ONLY

And you process the result like this, using a running lastID with a PreparedStatement.

long lastID = -1;

// repeat the rest of the code until the result is empty
selectStatement.setLong(1, lastID);
selectStatement.setInt(2, 100000);

ResultSet result = selectStatement.executeQuery();
while (result.next()) {
 Long id = result.getLong(1);
 lastID = id;
 // do stuff ...
}

And

Upvotes: 3

Related Questions