Reputation: 1764
I am working on a Java project that imports a huge amount of data from a .csv file into a database. I am interested in understanding the best approach for achieving this.
Can you please share some thoughts from the other end of the spectrum?
Upvotes: 0
Views: 3178
Reputation: 80176
The best option is to use the database's native bulk-loading support when dealing with huge data volumes: SQL*Loader for Oracle, or the COPY command for Postgres.
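For Postgres, the COPY path can even be driven from Java through the pgjdbc driver's CopyManager. Here is a minimal sketch assuming that; the connection URL, credentials, table name (my_table), and file name (data.csv) are placeholders you would replace with your own:

```java
import java.io.FileReader;
import java.io.Reader;
import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class PostgresCopyImport {
    public static void main(String[] args) throws Exception {
        // Connection details, table name, and file name are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             Reader csv = new FileReader("data.csv")) {

            // CopyManager streams the file through the COPY protocol,
            // which is far faster than row-by-row INSERTs.
            CopyManager copyManager = conn.unwrap(PGConnection.class).getCopyAPI();
            long rowsLoaded = copyManager.copyIn(
                    "COPY my_table FROM STDIN WITH (FORMAT csv, HEADER true)", csv);

            System.out.println("Rows loaded: " + rowsLoaded);
        }
    }
}
```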
If you are looking for Java-specific options, then below is my order of preference (sketches of both follow the list):
JDBC: use batch operation support. The limitation is that a failure anywhere in the batch will short-circuit the entire flow.
Hibernate: ORMs are not meant for this. However, you can use StatelessSession together with batch configuration to achieve good performance.
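A minimal JDBC batch sketch, assuming a placeholder table my_table with two columns and a local data.csv file; the batch size and the naive comma splitting are illustrative only:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class JdbcCsvBatchImport {
    private static final int BATCH_SIZE = 1000; // tune for your driver/DB

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             BufferedReader reader = new BufferedReader(new FileReader("data.csv"));
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO my_table (col1, col2) VALUES (?, ?)")) {

            conn.setAutoCommit(false); // commit per batch, not per row

            String line;
            int count = 0;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(","); // naive parsing; use a CSV library for real data
                ps.setString(1, fields[0]);
                ps.setString(2, fields[1]);
                ps.addBatch();

                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch(); // one round trip for the whole batch
                    conn.commit();
                }
            }
            ps.executeBatch(); // flush the final partial batch
            conn.commit();
        }
    }
}
```

And a rough StatelessSession sketch; it assumes you already have a configured SessionFactory, mapped entity classes, and hibernate.jdbc.batch_size set in the Hibernate configuration:

```java
import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;
import org.hibernate.Transaction;

public class HibernateStatelessImport {

    // "records" are instances of your mapped entity class (not shown here).
    public static void importAll(SessionFactory sessionFactory, Iterable<?> records) {
        StatelessSession session = sessionFactory.openStatelessSession();
        Transaction tx = session.beginTransaction();
        try {
            // StatelessSession bypasses the first-level cache and dirty checking,
            // so memory usage stays flat no matter how many rows are inserted.
            for (Object entity : records) {
                session.insert(entity);
            }
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}
```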
Upvotes: 0
Reputation: 3975
In my opinion, such cases (bulk import) should be addressed using database features:
In the case of Oracle, SQL*Loader (as suggested by @Pangea)
In the case of MS SQL Server, BCP (Bulk Copy)
If you are looking at a Java-based approach, then I echo @Pangea. In addition to that, you can break a batch insert down into sub-batches and run them concurrently for better performance.
Ex: if you have 10k records to insert, you can build batches of 200 records each and run 5 batches concurrently.
In this case you need code to track each sub-batch; a sketch is below.
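Here is a rough sketch of that idea using an ExecutorService; the connection URL, table, columns, and CSV loading are placeholders, and a connection pool would be preferable in practice:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ConcurrentBatchImport {
    private static final int SUB_BATCH_SIZE = 200;
    private static final int CONCURRENCY = 5;
    private static final String JDBC_URL = "jdbc:postgresql://localhost:5432/mydb"; // placeholder

    public static void main(String[] args) throws Exception {
        List<String[]> rows = loadCsv("data.csv"); // assume a CSV parser fills this list

        ExecutorService pool = Executors.newFixedThreadPool(CONCURRENCY);
        List<Future<Integer>> results = new ArrayList<>();

        // Split the full data set into sub-batches and submit each as its own task.
        for (int start = 0; start < rows.size(); start += SUB_BATCH_SIZE) {
            List<String[]> subBatch =
                    rows.subList(start, Math.min(start + SUB_BATCH_SIZE, rows.size()));
            results.add(pool.submit(() -> insertSubBatch(subBatch)));
        }

        // Track each sub-batch: a failed Future identifies exactly which chunk to retry.
        int inserted = 0;
        for (Future<Integer> result : results) {
            inserted += result.get();
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("Inserted rows: " + inserted);
    }

    private static int insertSubBatch(List<String[]> subBatch) throws Exception {
        // Each task uses its own connection here; a connection pool is better in practice.
        try (Connection conn = DriverManager.getConnection(JDBC_URL, "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO my_table (col1, col2) VALUES (?, ?)")) {
            conn.setAutoCommit(false);
            for (String[] row : subBatch) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
            return subBatch.size();
        }
    }

    private static List<String[]> loadCsv(String path) {
        // Placeholder: read and split CSV lines here (or use a CSV library).
        return new ArrayList<>();
    }
}
```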
Hope this helps!
Upvotes: 0