Reputation: 590
I am using Spring Boot with Hibernate to build the RESTful API for a simple web app where I read a .csv file and insert each row into a MySQL database table. I can do this successfully, but it takes a very long time for any large .csv file. I understand that it's better to batch my insert statements to reduce the number of transactions, but I don't think that is actually happening with my current code. This is what I am doing now (it works, but very slowly):
CsvUploaderController.java
import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PutMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CsvUploaderController {

    // CsvUploader repository definition:
    // @Repository
    // public interface CsvUploaderRepository extends JpaRepository<CityObj, Integer>
    @Autowired
    CsvUploaderRepository csvUploaderRepository;

    // gets called by the front end with the .csv file data
    @PutMapping("/upload_csv")
    public List<CityObj> uploadCsv(@RequestBody Object data) {
        // unpackCSV converts an ArrayList of Objects to a List of CityObj
        List<CityObj> unpackedCSV = unpackCSV((ArrayList<Object>) data);
        csvUploaderRepository.deleteAll(); // delete all rows currently in the table. Not sure if this is relevant to the issue
        // save all items as rows in my database
        return csvUploaderRepository.saveAll(unpackedCSV); // this is where it takes forever to complete
    }
    ...
}
application.properties:
spring.datasource.url=jdbc:mysql://url.to.my.database
spring.datasource.username=myusername
spring.datasource.password=weirdpassword
spring.datasource.hikari.maximumPoolSize = 5
spring.jpa.properties.hibernate.generate_statistics=true
# I've tried playing around with different values for the batch size; it hasn't helped
spring.jpa.properties.hibernate.jdbc.batch_size=20
# I've also tried enabling this, but I'm not sure what it does and it didn't help
#spring.jpa.properties.hibernate.order_inserts=true
What am I doing wrong? How can I improve the performance of my inserts? Am I understanding saveAll() incorrectly? My problem is very similar to what is described here: Spring boot 2 upgrade - spring boot data jpa saveAll() very slow
Upvotes: 3
Views: 11960
Reputation: 41
Use a JdbcTemplate; it is much faster.
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

import org.springframework.jdbc.core.BatchPreparedStatementSetter;
import org.springframework.jdbc.core.JdbcTemplate;

public int[] batchInsert(List<Book> books) {
    return this.jdbcTemplate.batchUpdate(
            "insert into books (name, price) values (?, ?)",
            new BatchPreparedStatementSetter() {
                @Override
                public void setValues(PreparedStatement ps, int i) throws SQLException {
                    ps.setString(1, books.get(i).getName());
                    ps.setBigDecimal(2, books.get(i).getPrice());
                }

                @Override
                public int getBatchSize() {
                    return books.size();
                }
            });
}
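For context, here is a minimal sketch of how that method might be wired into a Spring-managed bean; the BookJdbcRepository class name and the Book entity are placeholders for illustration, not something from the question:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

// Sketch only: one way to make a JdbcTemplate available to batchInsert().
// BookJdbcRepository is an illustrative name, not from the original post.
@Repository
public class BookJdbcRepository {

    private final JdbcTemplate jdbcTemplate;

    // Spring Boot auto-configures a JdbcTemplate once a DataSource is present
    public BookJdbcRepository(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // batchInsert(...) from above goes here
}

One MySQL-specific caveat: with Connector/J, JDBC batching only collapses the batch into efficient multi-value statements if rewriteBatchedStatements=true is added to the JDBC URL; without it the driver still sends one INSERT per row.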
Upvotes: 4
Reputation: 16131
You could get some performance gains by tweaking the HikariCP configuration for MySQL: https://github.com/brettwooldridge/HikariCP/wiki/MySQL-Configuration
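For example, the prepared-statement settings that page recommends can be passed through Spring Boot like this (illustrative values taken from that wiki, not tuned for your workload):

spring.datasource.hikari.data-source-properties.cachePrepStmts=true
spring.datasource.hikari.data-source-properties.prepStmtCacheSize=250
spring.datasource.hikari.data-source-properties.prepStmtCacheSqlLimit=2048
spring.datasource.hikari.data-source-properties.useServerPrepStmts=true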
As saveAll() internally just loops over the list, you could do the looping manually and flush every 20 entities to reduce the pressure on the persistence context.
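A rough sketch of that manual loop, assuming an injected EntityManager and the CityObj entity from the question (the service class and method name are made up for illustration):

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class CsvBatchSaver { // hypothetical class name

    private static final int BATCH_SIZE = 20; // keep in sync with hibernate.jdbc.batch_size

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional
    public void saveInBatches(List<CityObj> rows) {
        for (int i = 0; i < rows.size(); i++) {
            entityManager.persist(rows.get(i));
            if ((i + 1) % BATCH_SIZE == 0) {
                entityManager.flush(); // push pending inserts as one JDBC batch
                entityManager.clear(); // detach entities so the context stays small
            }
        }
        // any remaining entities are flushed when the transaction commits
    }
}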
Batching done right will of course help.
In the end the fastest method will be plain SQL with JdbcTemplate and multi-value INSERTs like:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
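A sketch of how such a statement could be built and executed with JdbcTemplate; the city table, its columns, and the CityObj getters are assumptions for illustration:

import java.util.ArrayList;
import java.util.List;

import org.springframework.jdbc.core.JdbcTemplate;

// Sketch only: builds one parameterized multi-value INSERT for a chunk of rows.
// Table name, columns, and getters are assumed, not from the question.
public int insertChunk(JdbcTemplate jdbcTemplate, List<CityObj> chunk) {
    StringBuilder sql = new StringBuilder("INSERT INTO city (name, population) VALUES ");
    List<Object> args = new ArrayList<>();
    for (int i = 0; i < chunk.size(); i++) {
        sql.append(i == 0 ? "(?, ?)" : ", (?, ?)");
        args.add(chunk.get(i).getName());       // assumed getter
        args.add(chunk.get(i).getPopulation()); // assumed getter
    }
    return jdbcTemplate.update(sql.toString(), args.toArray());
}

Keeping each chunk to a few hundred rows is a good idea so the statement stays well under MySQL's max_allowed_packet limit.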
Upvotes: 2