Reputation: 31
Tried to find if this was asked before but couldn't.
Here is the problem. The following has to be achieved with Spring Batch: there is one file to be read and processed. The item reader is not thread safe. The plan is to have multithreaded homogeneous processors and multithreaded homogeneous writers ingest items read by a single-threaded reader.
Kind of like below:
          ----------> Processor #1 ----------> Writer #1
          |
Reader -------> Processor #2 ----------> Writer #2
          |
          ----------> Processor #3 ----------> Writer #3
Tried AsyncItemProcessor and AsyncItemWriter, but holding a breakpoint in the processor meant the reader was not executed until the breakpoint was released, i.e. processing was effectively single threaded.
A task executor was also tried, as below:
<tasklet task-executor="taskExecutor" throttle-limit="20">
Multiple threads on the reader were launched.
Synchronising the reader also didn't work.
I tried to read about partitioner but it seemed complex.
Is there an annotation to mark the reader as single threaded? Would pushing read data to Global context be a good idea?
Please guide towards a solution.
Upvotes: 3
Views: 6416
Reputation: 123
Came across this with a similar problem at hand.
Here's how I am doing it at the moment. As @mminella suggested, I use a synchronized item reader with the FlatFileItemReader as its delegate. This works with decent performance. The code writes about 4K records per second at the moment, but the speed doesn't depend entirely on the design; other factors contribute as well.
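To illustrate the synchronized-delegate idea in isolation, here is a minimal plain-Java sketch. It deliberately avoids Spring Batch types so it runs on its own: `SimpleReader`, `ListBackedReader`, `SynchronizedReader` and `SynchronizedReaderDemo` are hypothetical names made up for this example, with `SimpleReader` mimicking the "return null when exhausted" contract of an item reader. In a real job you would more likely reach for Spring Batch's own `SynchronizedItemStreamReader` rather than hand-rolling the wrapper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an item reader: returns null when exhausted.
interface SimpleReader<T> {
    T read() throws Exception;
}

// Not thread safe: concurrent callers could corrupt the iterator state.
class ListBackedReader implements SimpleReader<String> {
    private final Iterator<String> lines;

    ListBackedReader(List<String> lines) {
        this.lines = lines.iterator();
    }

    @Override
    public String read() {
        return lines.hasNext() ? lines.next() : null;
    }
}

// Serializes access to the delegate so only one thread reads at a time.
class SynchronizedReader<T> implements SimpleReader<T> {
    private final SimpleReader<T> delegate;

    SynchronizedReader(SimpleReader<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T read() throws Exception {
        return delegate.read();
    }
}

class SynchronizedReaderDemo {
    // Drains the reader from several worker threads and returns every item read.
    static List<String> drain(SimpleReader<String> reader, int threads) {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            // The lambda returns a value, so it is a Callable and may throw
            // the checked exception declared by read().
            pool.submit(() -> {
                String item;
                while ((item = reader.read()) != null) {
                    seen.add(item);
                }
                return null;
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            input.add("line-" + i);
        }
        SimpleReader<String> reader = new SynchronizedReader<>(new ListBackedReader(input));
        System.out.println(drain(reader, 4).size());
    }
}
```

The reads themselves are still serialized, so this only pays off when processing and writing are noticeably more expensive than reading.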
Tried other approaches to increase performance, both kind of failed.
Upvotes: 0
Reputation: 10142
I don't think the Spring Batch API has anything built in for the pattern you are looking for. You would need to write some code yourself to achieve it.
The method ItemWriter.write already takes a List of processed items, sized by your chunk size, so you can divide that List among as many threads as you like. You spawn your own threads and pass a segment of the list to each thread to write.
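A rough sketch of that list-splitting idea, in plain Java so it can run without Spring Batch on the classpath. `SegmentedWriter` and its `segmentWriter` callback are hypothetical names for illustration; in a real job the body of `write` would sit inside your own ItemWriter implementation, and the callback would have to be thread safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class SegmentedWriter<T> {
    private final ExecutorService pool;             // owned and shut down by the caller
    private final int segments;                     // maximum number of slices per chunk
    private final Consumer<List<T>> segmentWriter;  // hypothetical per-slice write; must be thread safe

    SegmentedWriter(ExecutorService pool, int segments, Consumer<List<T>> segmentWriter) {
        this.pool = pool;
        this.segments = segments;
        this.segmentWriter = segmentWriter;
    }

    // Splits items into at most `segments` contiguous slices.
    static <T> List<List<T>> split(List<T> items, int segments) {
        List<List<T>> slices = new ArrayList<>();
        int size = (items.size() + segments - 1) / segments; // ceiling division
        for (int from = 0; from < items.size(); from += size) {
            slices.add(items.subList(from, Math.min(from + size, items.size())));
        }
        return slices;
    }

    // Writes each slice on its own pool thread and waits for all of them.
    void write(List<T> items) {
        List<Future<?>> pending = new ArrayList<>();
        for (List<T> slice : split(items, segments)) {
            pending.add(pool.submit(() -> segmentWriter.accept(slice)));
        }
        for (Future<?> f : pending) {
            try {
                f.get(); // rethrows any failure from the slice
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("segment write failed", e.getCause());
            }
        }
    }
}
```

Because `write` blocks until every slice has finished, a failure in any slice surfaces to the caller, but the work done on the pool threads is not part of the chunk's transaction.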
The problem is with the method ItemProcessor.process(): it processes one item at a time, so you are limited to a single item and there isn't much you can parallelize within it.
So the challenge is to write your own reader that can hand over a list of items to the processor instead of a single item, so that you can process those items in parallel, and the writer will then work on a list of lists.
In all of this setup, you have to remember that the threads you spawn will be outside the read-process-write transaction boundary of Spring Batch, so you will have to take care of that on your own: merging the processed output from all threads, waiting until all threads are complete, and handling any errors. All in all, it's very risky.
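For the processing side, the "wait for all threads, merge the output, handle errors" part could look roughly like the sketch below. It is plain Java with a hypothetical `ParallelBatchProcessor` helper, and it assumes the reader already hands over a whole list of items; it does nothing about transactions, which remain your responsibility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelBatchProcessor {
    // Processes every item of the batch on the pool and returns the results
    // in the same order as the input items.
    static <I, O> List<O> processAll(List<I> batch, Function<I, O> processor, ExecutorService pool) {
        List<Callable<O>> tasks = new ArrayList<>();
        for (I item : batch) {
            tasks.add(() -> processor.apply(item));
        }
        try {
            List<O> results = new ArrayList<>();
            // invokeAll blocks until all tasks are done and returns the
            // futures in the same order as the task list.
            for (Future<O> f : pool.invokeAll(tasks)) {
                results.add(f.get()); // rethrows the first failure encountered
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("item processing failed", e.getCause());
        }
    }
}
```

Failing the whole batch on the first error keeps the behaviour simple; whether that is acceptable, or whether failed items should be skipped or retried individually, depends on the job.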
Making a item reader to return a list instead single object - Spring batch
Upvotes: 1