Chrisma Daniel

Reputation: 175

Spring Data JPA Hibernate batch insert is slower

I use Spring Data, Spring Boot, and Hibernate as the JPA provider, and I want to improve bulk insert performance.

I followed this guide to set up batch processing:

http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch15.html

Here are my code and my application.properties for the insert batching experiment.

My service:

    @Value("${spring.jpa.properties.hibernate.jdbc.batch_size}")
    private int batchSize;

    @PersistenceContext
    private EntityManager em;

    @Override
    @Transactional(propagation = Propagation.REQUIRED)
    public SampleInfoJson getSampleInfoByCode(String code) {
//        SampleInfo newSampleInfo = new SampleInfo();
//        newSampleInfo.setId(5L);
//        newSampleInfo.setCode("SMP-5");
//        newSampleInfo.setSerialNumber(10L);
//        sampleInfoDao.save(newSampleInfo);
        log.info("starting... inserting...");
        for (int i = 1; i <= 5000; i++) {
            SampleInfo newSampleInfo = new SampleInfo();
//            Long id = (long)i + 4;
//            newSampleInfo.setId(id);
            newSampleInfo.setCode("SMPN-" + i);
            newSampleInfo.setSerialNumber(10L + i);
//            sampleInfoDao.save(newSampleInfo);
            em.persist(newSampleInfo);
            if(i%batchSize == 0){
                log.info("flushing...");
                em.flush();
                em.clear();
            }
        }

The part of application.properties related to batching:

spring.jpa.properties.hibernate.jdbc.batch_size=100
spring.jpa.properties.hibernate.cache.use_second_level_cache=false
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true

Entity class:

@Entity
@Table(name = "sample_info")
public class SampleInfo implements Serializable{

    private Long id;
    private String code;
    private Long serialNumber;

    @Id
    @GeneratedValue(
            strategy = GenerationType.SEQUENCE,
            generator = "sample_info_seq_gen"
    )
    @SequenceGenerator(
            name = "sample_info_seq_gen",
            sequenceName = "sample_info_seq",
            allocationSize = 1
    )
    @Column(name = "id")
    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    @Column(name = "code", nullable = false)
    public String getCode() {
        return code;
    }

    public void setCode(String code) {
        this.code = code;
    }

    @Column(name = "serial_number")
    public Long getSerialNumber() {
        return serialNumber;
    }

    public void setSerialNumber(Long serialNumber) {
        this.serialNumber = serialNumber;
    }
}

Running the service above, batch inserting 5000 rows took 30 to 35 seconds to complete, but if I comment out these lines:

if(i%batchSize == 0){
    log.info("flushing...");
    em.flush();
    em.clear();
}

inserting 5000 rows took only 5 to 7 seconds, faster than batch mode.

Why is it slower when using batch mode?

Upvotes: 2

Views: 11440

Answers (1)

Javasick

Reputation: 2983

That is because the EntityManager does not write data to the database immediately when you call persist(). The data is only sent to the database when flush() is called. When you comment out those lines, the EntityManager decides when to flush based on its flush-mode setting; by calling flush() directly, you tell the EntityManager to execute the pending queries against the database right away.
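To make the deferred-write behaviour described above concrete, here is a minimal toy model in plain Java. It is not Hibernate code: `ToyEntityManager`, its `persist`/`flush`/`commit` methods, and the in-memory "database" list are hypothetical stand-ins, and the assumption that the automatic flush happens once at commit is a simplification. The sketch only shows when queued inserts get executed; it says nothing about how long they take.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of a write-behind persistence context (hypothetical, NOT Hibernate).
 * persist() only queues a row; the row reaches the "database" on flush().
 */
class WriteBehindSketch {

    static class ToyEntityManager {
        private final List<String> pending = new ArrayList<>();   // queued, not yet written
        private final List<String> database = new ArrayList<>();  // rows actually written
        private int flushCount = 0;                                // flushes that wrote something

        /** Queues the row; nothing is written yet. */
        void persist(String row) {
            pending.add(row);
        }

        /** Writes every queued row and empties the queue. */
        void flush() {
            if (pending.isEmpty()) {
                return; // nothing queued, nothing to write
            }
            database.addAll(pending);
            pending.clear();
            flushCount++;
        }

        /** Assumption of this model: commit flushes whatever is still queued. */
        void commit() {
            flush();
        }

        int pendingSize() { return pending.size(); }
        int databaseSize() { return database.size(); }
        int flushCount() { return flushCount; }
    }

    /** Mirrors the loop from the question, with or without the manual flush. */
    static ToyEntityManager run(int rows, int batchSize, boolean manualFlush) {
        ToyEntityManager em = new ToyEntityManager();
        for (int i = 1; i <= rows; i++) {
            em.persist("SMPN-" + i);
            if (manualFlush && i % batchSize == 0) {
                em.flush(); // explicit flush: queued rows are written now
            }
        }
        em.commit(); // remaining rows, if any, are written here
        return em;
    }

    public static void main(String[] args) {
        ToyEntityManager manual = run(5000, 100, true);
        ToyEntityManager deferred = run(5000, 100, false);
        System.out.println("manual flush:   flushes=" + manual.flushCount()
                + ", rows=" + manual.databaseSize());
        System.out.println("deferred flush: flushes=" + deferred.flushCount()
                + ", rows=" + deferred.databaseSize());
    }
}
```

In both runs the same rows end up written; the only difference in this model is how many times, and at which points, the queue is drained.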

Upvotes: 1
