Reputation: 1466
I've read a lot of articles recently about populating Grails tables from huge data sets, but I seem to have hit a ceiling. My code is as follows:
import groovy.sql.Sql

class LoadingService {
    def sessionFactory
    def dataSource
    def propertyInstanceMap = org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP

    def insertFile(fileName) {
        InputStream inputFile = getClass().classLoader.getResourceAsStream(fileName)
        def pCounter = 1
        def mCounter = 1
        Sql sql = new Sql(dataSource)
        inputFile.splitEachLine(/\n|\r|,/) { line ->
            line.each { value ->
                // skip zero values, just advance the patient counter
                if (value.equalsIgnoreCase('0')) {
                    pCounter++
                    return
                }
                sql.executeInsert("insert into Patient_MRNA (patient_id, mrna_id, value) values (${pCounter}, ${mCounter}, ${value.toFloat()})")
                pCounter++
            }
            // periodically clear the Hibernate session so it doesn't grow unbounded
            if (mCounter % 100 == 0) {
                cleanUpGorm()
            }
            pCounter = 1
            mCounter++
        }
    }

    def cleanUpGorm() {
        sessionFactory.currentSession.clear()
        propertyInstanceMap.get().clear()
    }
}
I have disabled the second-level cache, I'm using assigned ids, and I am explicitly handling this many-to-many relationship through a join domain class rather than hasMany and belongsTo.
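For reference, the join is a plain domain class along these lines (a rough sketch, not my exact class; names inferred from the insert statement above):

class PatientMrna {
    long patientId
    long mrnaId
    float value

    static mapping = {
        table 'Patient_MRNA'
        id generator: 'assigned'   // assigned ids, as mentioned above
        version false
    }
}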
My speed increased monumentally after applying these methods, but after a while the inserts slow to the point of almost stopping, compared to roughly 623,000 rows per minute at the beginning.
Is there some other memory leak I should be aware of, or have I just hit the ceiling for batch inserts in Grails?
To be clear, it takes about 2 minutes to insert the first 1.2 million rows, but then the inserts start to slow down.
Upvotes: 1
Views: 945
Reputation: 1152
I have fought with this in earlier versions of Grails. Back then I resorted to either simply running the batch manually in proper chunks or using another tool for the batch import, such as Pentaho Data Integration (or another ETL tool, or a DIY solution).
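For example, chunking might look roughly like this (a sketch only; PatientMrna, rows, and the chunk size are placeholders to adapt to your data):

// Commit each chunk in its own transaction, then clear the Hibernate
// session so first-level cache growth doesn't slow everything down.
def chunkSize = 10000
rows.collate(chunkSize).each { chunk ->
    PatientMrna.withTransaction {
        chunk.each { r ->
            new PatientMrna(patientId: r.pid, mrnaId: r.mid, value: r.value).save()
        }
    }
    sessionFactory.currentSession.clear()
}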
Upvotes: 1
Reputation: 171
Try doing batch inserts; it's much more efficient:
def updateCounts = sql.withBatch { stmt ->
    stmt.addBatch("insert into TABLENAME ...")
    stmt.addBatch("insert into TABLENAME ...")
    stmt.addBatch("insert into TABLENAME ...")
    // ...
}
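groovy.sql.Sql can also combine batching with a prepared statement, so the SQL is parsed once and the rows are sent in batches (a sketch adapted to the question's table; the batch size of 100 is arbitrary):

sql.withBatch(100, 'insert into Patient_MRNA (patient_id, mrna_id, value) values (?, ?, ?)') { ps ->
    // one addBatch call per row; a batch is flushed every 100 statements
    ps.addBatch(1, 1, 0.52f)
    ps.addBatch(2, 1, 1.33f)
    // ...
}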
Upvotes: 3