user3444718

Reputation: 1605

Spring Batch: understanding chunk processing

I have 8 records in a table, 6 of which are eligible for the JPA reader when Spring Batch calls read. For testing, I set both the page size and the chunk size to 1. I expected the job to make 6 read calls and then process and write the records one by one. What actually happens is that read is called only 4 times (the logs show "reading page 0...1" and so on) and 4 records are processed, one of which is filtered out because it doesn't match the criteria for writing. The job then updates 3 records and is marked as completed successfully.

As a result, the job has to be run 3 times to process all records. Something is not clear to us. I tried to understand chunk processing, and my impression is that a chunk only aggregates results for the write call; after that, I expect reading and processing to continue.

After this test, we are unsure what chunk size to set for production: if we set it to a large number, the job will require more memory (heap).

Upvotes: 0

Views: 9385

Answers (1)

Mahmoud Ben Hassine

Reputation: 31590

I see the confusion. The pageSize parameter of the JpaPagingItemReader has nothing to do with the chunkSize (or commit-interval) of the chunk-oriented step.

If you take the JpaPagingItemReader and use it outside a chunk-oriented step with a pageSize = 4, it will fetch 4 items at a time (i.e., per page). Those 4 items can then be processed in chunks of 2, for example, which gives two chunks per page. The JpaPagingItemReader reads the first page (a list of 4 items) and then returns one item from that list each time the chunk-oriented step calls read. When the list is exhausted, the next call to read fetches the next page. Here is an example with a pageSize = 4, chunkSize = 2, totalItems = 8 and a chunk listener:

ChunkListener.beforeChunk
Reading page 0
Reading item1
Reading item2
Writing item1
Writing item2
ChunkListener.afterChunk
ChunkListener.beforeChunk
Reading item3
Reading item4
Writing item3
Writing item4
ChunkListener.afterChunk
ChunkListener.beforeChunk
Reading page 1
Reading item5
Reading item6
Writing item5
Writing item6
ChunkListener.afterChunk
ChunkListener.beforeChunk
Reading item7
Reading item8
Writing item7
Writing item8
ChunkListener.afterChunk
ChunkListener.beforeChunk
Reading page 2
Reading item = null
ChunkListener.afterChunk
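
To reproduce a trace of this shape without a database, here is a minimal plain-Java sketch of the interaction. It is not Spring Batch code: PagingChunkDemo, PagingReader and run are hypothetical names invented for illustration, and the reader only imitates the behaviour described above (fetch one page, then hand out one item per read call). Under those assumptions, run(4, 2, 8) should produce log lines in the same order as the trace above.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingChunkDemo {

    /** Hypothetical stand-in for a paging reader: fetches a page, returns one item per read(). */
    static class PagingReader {
        private final int pageSize;
        private final int totalItems;
        private final List<String> log;
        private List<String> page = new ArrayList<>();
        private int indexInPage = 0;
        private int nextPage = 0;

        PagingReader(int pageSize, int totalItems, List<String> log) {
            this.pageSize = pageSize;
            this.totalItems = totalItems;
            this.log = log;
        }

        String read() {
            // Fetch a new page only when the current one has been fully consumed.
            if (indexInPage >= page.size()) {
                log.add("Reading page " + nextPage);
                page = fetchPage(nextPage++);
                indexInPage = 0;
            }
            if (page.isEmpty()) {
                log.add("Reading item = null");
                return null; // null signals the end of the data
            }
            String item = page.get(indexInPage++);
            log.add("Reading " + item);
            return item;
        }

        // Simulates a paged query over items named item1..itemN.
        private List<String> fetchPage(int pageNumber) {
            List<String> result = new ArrayList<>();
            int first = pageNumber * pageSize + 1;
            int last = Math.min(first + pageSize - 1, totalItems);
            for (int i = first; i <= last; i++) {
                result.add("item" + i);
            }
            return result;
        }
    }

    /** Simulates the chunk loop: read up to chunkSize items, then write them together. */
    static List<String> run(int pageSize, int chunkSize, int totalItems) {
        List<String> log = new ArrayList<>();
        PagingReader reader = new PagingReader(pageSize, totalItems, log);
        boolean exhausted = false;
        while (!exhausted) {
            log.add("ChunkListener.beforeChunk");
            List<String> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize) {
                String item = reader.read();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                chunk.add(item);
            }
            for (String item : chunk) {
                log.add("Writing " + item);
            }
            log.add("ChunkListener.afterChunk");
        }
        return log;
    }

    public static void main(String[] args) {
        run(4, 2, 8).forEach(System.out::println);
    }
}
```

Changing the arguments to run shows that the page size only controls how often a "Reading page" line appears, while the chunk size controls how many items are read before each group of "Writing" lines.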

I created a sample app with this configuration so you can play with it and see how things work.

I hope this helps you understand the chunk-oriented processing model when it is used with a paging item reader.

Upvotes: 4
