ivish

Reputation: 612

Always read first n lines on spring batch job restart

I am using Spring Batch to read a complex file with multi-line records. The first 3 lines of the file always contain a header with a few common fields, and these fields are used when processing the subsequent records in the file. The job is restartable.

Suppose the input file has 10 records (note that the number of records may differ from the number of lines, since a record can span multiple lines). On the first run the job starts reading the file from line 1, processes the first 5 records, and fails while processing the 6th. Because this first run also parsed the header (the first 3 lines of the file), the application could process the first 5 records successfully. When the failed job is restarted, it resumes from the 6th record and therefore does not read the header this time. Since the application requires certain values from the header, the job fails.

How can I make a restarted job always read the header first and then continue from where it left off (the 6th record in the scenario above)?

Thanks in advance.

Upvotes: 1

Views: 996

Answers (2)

Michael Pralow

Reputation: 6630

I assume the file in question does not change between runs? If so, it is not necessary to re-read the header; my solution builds on this assumption.

if you use one step you can

This should work for a restart as well, because Spring Batch saves the values from the first run and provides the complete ExecutionContext to subsequent runs.

Upvotes: 3

Nenad Bozic

Reputation: 3784

You can build a two-step job where:

The first step reads the first 3 lines as header information and puts everything you need into the job execution context (which is therefore saved in the DB for future executions if the job fails). If this step fails, the header info will be read again on restart; once it passes, you can be sure the header info is always available in the job context.

The second step can use the same file as input, but this time you tell it to skip the first 3 lines and read the rest as is. This way you get restartability on that step, and each time the job fails it will resume where it left off.
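To make the flow of the two steps concrete, here is a minimal plain-Java model of it. It deliberately does not use Spring Batch classes: the `Map` stands in for the persisted job execution context, and the class name, the method names, the `"header"` and `"records.done"` keys, and the `failAtRecord` parameter are all hypothetical names invented for this illustration. It also assumes one line per record to keep the sketch short, which is simpler than the multi-line records in the question. In a real job, the skipping in the second step would probably be configured through the reader (for example the `linesToSkip` property of `FlatFileItemReader`) rather than written by hand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the two-step idea; not Spring Batch code. */
final class TwoStepRestartSketch {

    // From the question: the header is always the first 3 lines.
    static final int HEADER_LINES = 3;

    /** Step 1: store the header lines in the (simulated) persisted job context. */
    static void readHeaderStep(List<String> fileLines, Map<String, Object> jobContext) {
        if (jobContext.containsKey("header")) {
            return; // models a step that already completed and is not re-run on restart
        }
        jobContext.put("header", new ArrayList<>(fileLines.subList(0, HEADER_LINES)));
    }

    /**
     * Step 2: skip the header lines and resume after the last record that was
     * recorded as done. Assumption: one line per record.
     * failAtRecord simulates a failure at that 1-based record number (use -1 for none).
     */
    @SuppressWarnings("unchecked")
    static void processRecordsStep(List<String> fileLines,
                                   Map<String, Object> jobContext,
                                   int failAtRecord,
                                   List<String> sink) {
        List<String> header = (List<String>) jobContext.get("header");
        int done = (Integer) jobContext.getOrDefault("records.done", 0);

        for (int i = HEADER_LINES + done; i < fileLines.size(); i++) {
            int recordNo = i - HEADER_LINES + 1;
            if (recordNo == failAtRecord) {
                throw new IllegalStateException("simulated failure at record " + recordNo);
            }
            // The header value is available even though this run never re-read lines 1-3.
            sink.add(header.get(0) + ":" + fileLines.get(i));
            jobContext.put("records.done", recordNo); // models the saved restart position
        }
    }

    public static void main(String[] args) {
        List<String> file = List.of("H1", "H2", "H3", "r1", "r2", "r3", "r4");
        Map<String, Object> ctx = new HashMap<>();
        List<String> sink = new ArrayList<>();

        readHeaderStep(file, ctx);
        try {
            processRecordsStep(file, ctx, 3, sink); // first run fails at record 3
        } catch (IllegalStateException e) {
            System.out.println("first run: " + e.getMessage());
        }

        readHeaderStep(file, ctx);               // restart: header step is a no-op
        processRecordsStep(file, ctx, -1, sink); // resumes at record 3
        System.out.println(sink);
    }
}
```

The point of the design is that the header lives in state that survives the failed run, so the restarted second step never needs to touch the first 3 lines again.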

Upvotes: 3
