Preeti

Reputation: 222

Improve Spring Batch job performance

I am in the process of implementing a Spring Batch job for our file upload process. My requirement is to read a flat file, apply business logic, store the records in the DB, and then post a Kafka message.

I have a single chunk-based step that uses a custom reader, processor, and writer. The process works fine but takes a long time for a big file.

It takes 15 mins to process a file with 60K records. I need to reduce it to less than 5 mins, as we will be consuming much bigger files than this.

As per https://docs.spring.io/spring-batch/docs/current/reference/html/scalability.html I understand that making the step multi-threaded would give a performance boost, at the cost of restartability. However, I am using FlatFileItemReader, ItemProcessor, and ItemWriter, and none of them is thread-safe.
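From what I read, the reader at least could be made thread-safe by wrapping it in a SynchronizedItemStreamReader. A sketch of my understanding (here `fileReader` stands in for my existing FlatFileItemReader bean and `Message` is my DTO):

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.support.SynchronizedItemStreamReader;
import org.springframework.context.annotation.Bean;

@Bean
public SynchronizedItemStreamReader<Message> synchronizedReader(FlatFileItemReader<Message> fileReader) {
    // Wraps the non-thread-safe FlatFileItemReader so that read() calls
    // are serialized across the worker threads of a multi-threaded step.
    SynchronizedItemStreamReader<Message> reader = new SynchronizedItemStreamReader<>();
    reader.setDelegate(fileReader);
    return reader;
}
```

But I am not sure if that alone is enough, given the processor and writer are also not thread-safe.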

Any suggestions as to how to improve performance here?

Here is the writer code:

    public void write(List<? extends Message> items) {
        items.forEach(this::process);
    }

    private void process(Message message) {
        if (message == null)
            return;
        try {
            // message is a DTO that has info about success or failure
            if (success) {
                // post Kafka message using Spring Cloud Stream
                // insert record in DB using Spring Data JpaRepository
            } else {
                // insert record in DB using Spring Data JpaRepository
            }
        } catch (Exception e) {
            // throw exception
        }
    }

Best regards, Preeti

Upvotes: 0

Views: 2701

Answers (1)

Rakesh

Reputation: 700

Please refer to the SO threads below, and the GitHub source code they link to, for parallel processing:

Spring Batch multiple process for heavy load with multiple thread under every process

Spring batch to process huge data
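In short, the usual approach in those threads is to turn the chunk step into a multi-threaded step by giving it a TaskExecutor. A rough sketch (Spring Batch 4.x Java config; `synchronizedReader`, `processor`, and `writer` are placeholders for your own beans, and the reader must be thread-safe, e.g. a FlatFileItemReader wrapped in a SynchronizedItemStreamReader):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Bean
public TaskExecutor batchTaskExecutor() {
    // Pool sized to the number of chunks you want in flight concurrently
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(8);
    executor.setMaxPoolSize(8);
    executor.initialize();
    return executor;
}

@Bean
public Step fileUploadStep(StepBuilderFactory stepBuilderFactory) {
    return stepBuilderFactory.get("fileUploadStep")
            .<Message, Message>chunk(500)       // larger chunks reduce per-transaction overhead
            .reader(synchronizedReader())       // thread-safe wrapper around FlatFileItemReader
            .processor(processor())
            .writer(writer())
            .taskExecutor(batchTaskExecutor())  // each chunk is processed on a pool thread
            .throttleLimit(8)                   // match the pool size
            .build();
}
```

Note this trades away restartability, as the docs you linked mention. Separately, your writer saves records one at a time; batching the JPA writes per chunk (e.g. `repository.saveAll(...)` from Spring Data) and tuning the chunk size often gives a big win on its own, even before going multi-threaded.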

Upvotes: 0
