I am trying to read a 30GB CSV file with 60 columns and over 78 million rows into an H2 database on disk.
H2 URL:
jdbc:h2:file:./data/datasource
Simplified code I am using to transfer the file:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import org.springframework.stereotype.Service;

@Service
public class RecordService {
    private final RecordDao recordDao;

    // Constructor injection so that recordDao is never null
    public RecordService(RecordDao recordDao) {
        this.recordDao = recordDao;
    }

    public void readAndWriteToH2() {
        String path = "path/to/csv-file.csv";
        List<RecordEntity> records = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            String line = br.readLine(); // skip header line
            for (int lineCount = 0; (line = br.readLine()) != null; lineCount++) {
                String[] split = line.split(","); // split() takes a String, not a char
                RecordEntity record = new RecordEntity();
                record.prop0 = split[0];
                record.prop1 = split[1];
                record.prop2 = split[2];
                records.add(record);
                if (lineCount > 1_000_000) {
                    lineCount = 0;
                    recordDao.saveAll(records); // Spring/Hibernate .saveAll()
                    records = new ArrayList<>();
                }
            }
            if (!records.isEmpty()) {
                recordDao.saveAll(records); // save the final partial batch
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The problem is that I run out of heap memory after about 16 million records have been written. This means that Hibernate has already executed recordDao.saveAll(records) 15 times successfully before the memory problem occurs. What I don't understand is why there is a memory problem at all. Are the lines read by the BufferedReader not being discarded once used? Is Hibernate holding on to all of the records after recordDao.saveAll(records)?
java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.StringLatin1.newString(StringLatin1.java:752) ~[na:na]
at java.base/java.lang.String.substring(String.java:2839) ~[na:na]
at java.base/java.lang.String.split(String.java:3371) ~[na:na]
at java.base/java.lang.String.split(String.java:3354) ~[na:na]
at java.base/java.lang.String.split(String.java:3447) ~[na:na]
at gp.fake.service.FakeService.setupNationalAddressTableFromCsv(FakeService.java:42) ~[classes/:na]
at gp.fake.control.FakeControl.setupNationalAddressTable(FakeControl.java:20) ~[classes/:na]
at java.base/java.lang.invoke.DirectMethodHandle$Holder.invokeVirtual(DirectMethodHandle$Holder) ~[na:na]
at java.base/java.lang.invoke.LambdaForm$MH/0x0000028f71007800.invoke(LambdaForm$MH) ~[na:na]
at java.base/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder) ~[na:na]
...
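For reference, the batching pattern in question can be isolated from Spring and Hibernate to check that the reading loop itself does not retain rows. The following is only a minimal sketch under stated assumptions: `BatchedCsvReader`, `readInBatches` and the `sink` callback are hypothetical names standing in for the service method and `recordDao.saveAll(records)`, and the naive `split(",")` assumes no quoted commas in the data. If this loop alone stays within the heap, the retained references are more likely on the persistence side, where a JPA persistence context keeps managed entities reachable until it is closed or cleared (for example via `EntityManager.flush()` followed by `EntityManager.clear()` after each batch).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedCsvReader {

    /**
     * Skips the header line, then hands the remaining lines to the sink in
     * batches of at most batchSize rows. Returns the number of data rows read.
     * The sink is a hypothetical stand-in for recordDao.saveAll(...).
     */
    public static long readInBatches(BufferedReader reader, int batchSize,
                                     Consumer<List<String[]>> sink) throws IOException {
        List<String[]> batch = new ArrayList<>(batchSize);
        long total = 0;
        reader.readLine(); // skip header line
        String line;
        while ((line = reader.readLine()) != null) {
            batch.add(line.split(",", -1)); // -1 keeps trailing empty columns
            total++;
            if (batch.size() == batchSize) {
                sink.accept(batch);
                // Start a fresh list so this loop no longer references the flushed rows
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // flush the final partial batch
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory CSV used instead of the real 30GB file
        String csv = "a,b,c\n1,2,3\n4,5,6\n7,8,9\n";
        List<Integer> batchSizes = new ArrayList<>();
        long total = readInBatches(new BufferedReader(new StringReader(csv)), 2,
                rows -> batchSizes.add(rows.size()));
        System.out.println(total + " rows in batches " + batchSizes);
    }
}
```

Keying the flush on `batch.size()` rather than a separate counter keeps every batch at exactly the configured size. A much smaller batch size than one million rows would also lower the peak heap needed per batch, though whether that is enough here depends on what else holds references to the saved entities.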