Reputation: 6646
My application is a scheduled job runner with batch configurations.
I can have CSV files with different numbers of rows, but I know that the first row will always be the header:
id,firstName,lastName
1,Viktor,Someone
2,Joe,Smith
3,Rebecca,Harper
How should I set up the chunk size dynamically? The file can contain 5, 10, or even 100,000 rows.
Instead of hard-coding a large chunk size, I am looking for a better, dynamic solution based on the number of rows, not counting the header row.
@Bean
public Step step1() {
    return stepBuilderFactory.get("step1").<Employee, Employee>chunk(100000)
            .reader(reader())
            .writer(writer())
            .build();
}
The reader is a FlatFileItemReader.
Upvotes: 0
Views: 1739
Reputation: 31600
What about the following:
@Bean
public Step step1() throws IOException {
    long lineNumberWithoutHeader;
    // try-with-resources so the underlying file handle is closed
    try (Stream<String> lines = Files.lines(Paths.get("path to your file"))) {
        lineNumberWithoutHeader = lines.count() - 1; // subtract the header row
    }
    int chunkSize = .. // calculate chunk size based on lineNumberWithoutHeader
    return stepBuilderFactory.get("step1").<Employee, Employee>chunk(chunkSize)
            .reader(reader())
            .writer(writer())
            .build();
}
You can refactor the code as needed (inject the file resource or late-bind it from job parameters, extract the calculation logic into a separate method, etc.), but you get the idea.
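As one way to fill in that calculation, here is a sketch of a plain helper (the class name, bounds, and target-chunk-count formula are all illustrative, not from the original post) that clamps the chunk size between a minimum and a maximum, so tiny files still form a chunk and huge files don't produce one enormous transaction:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class ChunkSizeCalculator {

    // Illustrative bounds: tune them to your memory budget and transaction size.
    private static final int MIN_CHUNK = 10;
    private static final int MAX_CHUNK = 1000;

    /** Counts data lines in the CSV file, excluding the header row. */
    static long countDataLines(Path csv) throws IOException {
        try (Stream<String> lines = Files.lines(csv)) {
            return Math.max(0, lines.count() - 1);
        }
    }

    /** Derives a chunk size proportional to the row count, clamped to sane bounds. */
    static int chunkSizeFor(long rowCount) {
        // Aim for roughly 100 chunks per run, but stay within [MIN_CHUNK, MAX_CHUNK].
        long target = rowCount / 100;
        return (int) Math.min(MAX_CHUNK, Math.max(MIN_CHUNK, target));
    }
}
```

With this, a 5-row file gets the minimum chunk size of 10, a 5,000-row file gets 50, and a 100,000-row file is capped at 1,000.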
Another option would be to use a separate step that does the calculation and puts the result in the job execution context, then configure your chunk-oriented step with the value from the execution context.
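That two-step approach could look roughly like the following sketch, using Spring Batch's `ExecutionContextPromotionListener` to promote the computed value and a job-scoped step bean to late-bind it (the bean names, the `"chunkSize"` key, and the `rows / 100` formula are illustrative):

```java
// Inside the existing @Configuration class that already defines reader() and writer().

// Step 1: count the data rows, compute a chunk size, and store it in the step
// execution context; the promotion listener copies it to the job execution context.
@Bean
public Step countStep() {
    return stepBuilderFactory.get("countStep")
            .tasklet((contribution, chunkContext) -> {
                long rows;
                try (Stream<String> lines = Files.lines(Paths.get("path to your file"))) {
                    rows = lines.count() - 1; // skip the header row
                }
                int chunkSize = (int) Math.max(1, rows / 100); // illustrative formula
                chunkContext.getStepContext().getStepExecution()
                        .getExecutionContext().putInt("chunkSize", chunkSize);
                return RepeatStatus.FINISHED;
            })
            .listener(promotionListener())
            .build();
}

@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys(new String[] {"chunkSize"});
    return listener;
}

// Step 2: job-scoped, so the chunk size is late-bound from the job execution context.
@Bean
@JobScope
public Step step1(@Value("#{jobExecutionContext['chunkSize']}") Integer chunkSize) {
    return stepBuilderFactory.get("step1")
            .<Employee, Employee>chunk(chunkSize)
            .reader(reader())
            .writer(writer())
            .build();
}
```

The job would then run `countStep` before `step1`. The key detail is that `step1` must be `@JobScope` (or `@StepScope`), because the chunk size is only known after `countStep` has executed.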
Upvotes: 1