Bran
Bran

Reputation: 13

Why does Spring not always use batch insert/update when configured?

In my Spring JpaRepository based project, I have a method that always loads records one by one from the method that calls saveAll(List list)...when I use a List where I create the objects and add them to the list. Below code takes about 5 minutes...


@Transactional(readOnly = false)
public List<Data> saveAll(List<Data> dataList) throws Exception {
 try {
        for(Data d : dataList){

            if(d.getCol04() == null)
                d.setCol04("TEST");
            else
                d.setCol04(null);

            log.info(new Gson().toJson(d));

        }

        return dataRepository.saveAll(dataList);

    }catch (Exception e) {
        log.error("Unexpected error.", e);
        return null;
    }

}

2019-10-30 09:07:11 DEBUG org.hibernate.loader.Loader - Result set row: 0 2019-10-30 09:07:11 DEBUG org.hibernate.loader.Loader - Result row: EntityKey[ com.test.Data#component[batchSeq,batchId]{batchSeq=866, batchId=6012113450}] 2019-10-30 09:07:11 DEBUG o.h.engine.internal.TwoPhaseLoad - Resolving associations for [com.test.Data#component[batchSeq,batchId]{batchSeq=866, batchId=6012113450}] 2019-10-30 09:07:11 DEBUG o.h.engine.internal.TwoPhaseLoad - Done materializing entity [com.test.Data#component[batchSeq,batchId]{batchSeq=866, batchId=6012113450}] 2019-10-30 09:07:11 DEBUG org.hibernate.loader.Loader - Done entity load 2019-10-30 09:07:11 DEBUG org.hibernate.loader.Loader - Loading entity: [com.test.Data#component[batchSeq,batchId]{batchSeq=867, batchId=6012113450}] 2019-10-30 09:07:11 DEBUG org.hibernate.SQL - select datae0_.batch_seq as batch_seq1_0_0_, datae0_.batch_id as batch_id2_0_0_, datae0_.col01 as col3_0_0_, datae0_.col02 as col4_0_0_, datae0_.col03 as col5_0_0_, datae0_.col04 as col6_0_0_, datae0_.col05 as col7_0_0_, datae0_.col06 as col8_0_0_, datae0_.col07 as col9_0_0_, datae0_.col08 as col10_0_0_, datae0_.col09 as col11_0_0_, datae0_.col10 as col12_0_0_, datae0_.group_id as group_id13_0_0_, datae0_.parent_trans_id as parent_trans_id14_0_0_, datae0_.result_message as result_message15_0_0_, datae0_.result_status as result_status16_0_0_, datae0_.tn as tn17_0_0_, datae0_.tn_quantity_assigned as tn_quantity_assig18_0_0_, datae0_.tn_quantity_requested as tn_quantity_reque19_0_0_, datae0_.xml_data as xml_data20_0_0_ from owner.batch_data datae0_ where datae0_.batch_seq=? and datae0_.batch_id=? 2019-10-30 09:07:12 DEBUG org.hibernate.loader.Loader - Result set row: 0 2019-10-30 09:07:12 DEBUG org.hibernate.loader.Loader - Result row: EntityKey[com.test.Data#component[batchSeq,batchId]{batchSeq=867, batchId=6012113450}] 2019-10-30 09:07:12 DEBUG o.h.engine.internal.TwoPhaseLoad - Resolving associations for [com.test.Data#component[batchSeq,batchId]{batchSeq=867, batchId=6012113450}] 2019-10-30 09:07:12 DEBUG o.h.engine.internal.TwoPhaseLoad - Done materializing entity [com.test.Data#component[batchSeq,batchId]{batchSeq=867, batchId=6012113450}] 2019-10-30 09:07:12 DEBUG org.hibernate.loader.Loader - Done entity load 2019-10-30 09:07:12 DEBUG org.hibernate.loader.Loader - Loading entity: [com.test.Data#component[batchSeq,batchId]{batchSeq=868, batchId=6012113450}] 2019-10-30 09:07:12 DEBUG org.hibernate.SQL - . . . 2019-10-30 09:10:34 DEBUG o.h.e.j.b.internal.AbstractBatchImpl - Reusing batch statement 2019-10-30 09:10:34 DEBUG org.hibernate.SQL - update owner.batch_data set col01=?, col02=?, col03=?, col04=?, col05=?, col06=?, col07=?, col08=?, col09=?, col10=?, group_id=?, parent_trans_id=?, result_message=?, result_status=?, tn=?, tn_quantity_assigned=?, tn_quantity_requested=?, xml_data=? where batch_seq=? and batch_id=? 2019-10-30 09:10:34 DEBUG o.h.e.j.batch.internal.BatchingBatch - Executing batch size: 500 2019-10-30 09:10:36 DEBUG o.s.orm.jpa.JpaTransactionManager - Closing JPA EntityManager [SessionImpl(2081270212)] after transaction 2019-10-30 09:10:36 INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics { 77841799 nanoseconds spent acquiring 1 JDBC connections; 0 nanoseconds spent releasing 0 JDBC connections; 89273975 nanoseconds spent preparing 1501 JDBC statements; 250886005163 nanoseconds spent executing 1500 JDBC statements; 4812997147 nanoseconds spent executing 3 JDBC batches; 0 nanoseconds spent performing 0 L2C puts; 0 nanoseconds spent performing 0 L2C hits; 0 nanoseconds spent performing 0 L2C misses; 5462588455 nanoseconds spent executing 1 flushes (flushing a total of 1500 entities and 0 collections); 0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections) } 2019-10-30 09:10:36 DEBUG o.h.e.j.internal.JdbcCoordinatorImpl - HHH000420: Closing un-released batch 2019-10-30 09:10:36 DEBUG o.s.orm.jpa.JpaTransactionManager - Creating new transaction with name [org.springframework.data.jpa.repository.support.SimpleJpaRepository.findById]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly 2019-10-30 09:10:36 DEBUG o.s.orm.jpa.JpaTransactionManager - Opened new EntityManager [SessionImpl(1953492703)] for JPA transaction 2019-10-30 09:10:36 DEBUG o.s.jdbc.datasource.DataSourceUtils - Setting JDBC Connection [HikariProxyConnection@113673215 wrapping oracle.jdbc.driver.T4CConnection@5f71a000] read-only 2019-10-30 09:10:36 DEBUG o.h.e.t.internal.TransactionImpl - On TransactionImpl creation, JpaCompliance#isJpaTransactionComplianceEnabled == false 2019-10-30 09:10:36 DEBUG o.h.e.t.internal.TransactionImpl - begin

SaveAll(List list) loads in batches when I query findAll(List id) and use the returned list as the argument to saveAll. Below code takes about 8 seconds.


//Working method

@Transactional(readOnly = false)
public List<Data> saveAll() throws Exception {

    try {
        List<Data> dataList2 = dataRepository.findAllByBatchId(6000000L);

        for(Data d : dataList2){

            if(d.getCol04() == null)
                d.setCol04("TEST");
            else
                d.setCol04(null);

            log.info(new Gson().toJson(d));

        }

        return dataRepository.saveAll(dataList2);

    }catch (Exception e) {
        log.error("Unexpected error.", e);
        return null;
    }

}

. . . 2019-10-30 09:00:25 DEBUG o.h.e.j.batch.internal.BatchingBatch - Executing batch size: 500 2019-10-30 09:00:27 DEBUG o.s.orm.jpa.JpaTransactionManager - Closing JPA EntityManager [SessionImpl(2038854767)] after transaction 2019-10-30 09:00:27 INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics { 89300419 nanoseconds spent acquiring 1 JDBC connections; 0 nanoseconds spent releasing 0 JDBC connections; 1037028 nanoseconds spent preparing 2 JDBC statements; 330435552 nanoseconds spent executing 1 JDBC statements; 6423148647 nanoseconds spent executing 3 JDBC batches; 0 nanoseconds spent performing 0 L2C puts; 0 nanoseconds spent performing 0 L2C hits; 0 nanoseconds spent performing 0 L2C misses; 7100764566 nanoseconds spent executing 1 flushes (flushing a total of 1500 entities and 0 collections); 2082345 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections) } 2019-10-30 09:00:27 DEBUG o.h.e.j.internal.JdbcCoordinatorImpl - HHH000420: Closing un-released batch 2019-10-30 09:00:27 DEBUG o.s.w.s.m.m.a.HttpEntityMethodProcessor - Using 'application/json', given [/] and supported [application/json] 2019-10-30 09:00:27 DEBUG o.s.w.s.m.m.a.HttpEntityMethodProcessor - Writing ["{"data":0}"] 2019-10-30 09:00:27 DEBUG o.s.o.j.s.OpenEntityManagerInViewInterceptor - Closing JPA EntityManager in OpenEntityManagerInViewInterceptor

Any ideas why this happens?

Upvotes: 0

Views: 904

Answers (1)

sandeep pandey
sandeep pandey

Reputation: 350

Could this be because of dirty checking feature, In first method when you are saving dataList before saving the data, hibernate might be loading all data in memory to check which data has been modified.This could be reason why it takes time and If Data(entity) is further associated with other entity then this add up on time. Where as in second method as dataList2 is preloaded before saving to database, hence no resolution is done at the time of saving. In order to test this theory you may eager load all data(along with associated entity) and then save, if it takes same time as equal to second method, then we might conclude this. Please also refer this link https://forum.hibernate.org/viewtopic.php?f=1&t=994162&view=previous, it might help

Upvotes: 1

Related Questions