Reputation: 328
I am using Spring Batch to read data from a DB, process it, and do some processing in the writer.
If the chunk size is less than the number of records read by the reader, Spring Batch runs in multiple chunks. I want to do the processing in the writer only once, at the end of all batch process completion, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?
Below is my trigger Spring Batch job code
private void triggerSpringBatchJob() {
    loggerConfig.logDebug(log, " : Triggering product catalog scheduler ");
    JobParametersBuilder builder = new JobParametersBuilder();
    try {
        // Add the current date to the job parameters; without it, re-launching the job
        // fails with "A job instance already exists": JobInstanceAlreadyCompleteException
        builder.addDate("date", new Date());
        jobLauncher.run(processProductCatalog, builder.toJobParameters());
    } catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException
            | JobParametersInvalidException e) {
        e.printStackTrace();
    }
}
Below is my spring batch configuration
@Configuration
@EnableBatchProcessing
public class BatchJobProcessConfiguration {

    @Bean
    @StepScope
    RepositoryItemReader<Tuple> reader(SkuRepository skuRepository,
            ProductCatalogConfiguration productCatalogConfiguration) {
        RepositoryItemReader<Tuple> reader = new RepositoryItemReader<>();
        reader.setRepository(skuRepository);
        // query parameters
        List<Object> queryMethodArguments = new ArrayList<>();
        if (productCatalogConfiguration.getSkuId().isEmpty()) {
            reader.setMethodName("findByWebEligibleAndDiscontinued");
            queryMethodArguments.add(productCatalogConfiguration.getWebEligible()); // for web eligible
            queryMethodArguments.add(productCatalogConfiguration.getDiscontinued()); // for discontinued
            queryMethodArguments.add(productCatalogConfiguration.getCbdProductId()); // for cbd products
        } else {
            reader.setMethodName("findBySkuIds");
            queryMethodArguments.add(productCatalogConfiguration.getSkuId()); // for sku ids
        }
        reader.setArguments(queryMethodArguments);
        reader.setPageSize(1000);
        Map<String, Direction> sorts = new HashMap<>();
        sorts.put("sku_id", Direction.ASC);
        reader.setSort(sorts);
        return reader;
    }

    @Bean
    @StepScope
    ItemWriter<ProductCatalogWriterData> writer() {
        return new ProductCatalogWriter();
    }

    @Bean
    ProductCatalogProcessor processor() {
        return new ProductCatalogProcessor();
    }

    @Bean
    SkipPolicy readerSkipper() {
        return new ReaderSkipper();
    }

    @Bean
    Step productCatalogDataStep(ItemReader<Tuple> itemReader, ProductCatalogWriter writer,
            HttpServletRequest request, StepBuilderFactory stepBuilderFactory, BatchConfiguration batchConfiguration) {
        return stepBuilderFactory.get("processProductCatalog")
                .<Tuple, ProductCatalogWriterData>chunk(batchConfiguration.getBatchChunkSize())
                .reader(itemReader).faultTolerant().skipPolicy(readerSkipper())
                .processor(processor()).writer(writer).build();
    }

    @Bean
    Job productCatalogData(Step productCatalogDataStep, HttpServletRequest request,
            JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory.get("processProductCatalog").incrementer(new RunIdIncrementer())
                .flow(productCatalogDataStep).end().build();
    }
}
Upvotes: 1
Views: 805
Reputation: 31600
I want to do the processing in the writer only once at the end of all batch process completion, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?
"at the end of all batch process completion" is key here. If the requirement is to do some processing after all chunks have been "pre-processed", I would keep it simple and use two steps for that:
A final step would clean up the temporary storage if it is persistent (file, staging table, etc). Otherwise, ie if it is in memory, this is optional.
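Here is a minimal sketch of that two-step approach, using the same pre-5.0 StepBuilderFactory/JobBuilderFactory style as the question (imports omitted, as in the question's snippets). The stagingWriter bean and the tasklet body are hypothetical placeholders, not part of the question's code:

@Bean
Step preProcessingStep(StepBuilderFactory stepBuilderFactory, ItemReader<Tuple> itemReader,
        ProductCatalogProcessor processor, ItemWriter<ProductCatalogWriterData> stagingWriter) {
    // Chunk-oriented step: pre-process items chunk by chunk and persist them to a
    // temporary storage (e.g. a staging table) via the hypothetical stagingWriter
    return stepBuilderFactory.get("preProcessingStep")
            .<Tuple, ProductCatalogWriterData>chunk(1000)
            .reader(itemReader)
            .processor(processor)
            .writer(stagingWriter)
            .build();
}

@Bean
Step finalProcessingStep(StepBuilderFactory stepBuilderFactory) {
    // Tasklet step: executes exactly once, after ALL chunks of the previous step are done
    return stepBuilderFactory.get("finalProcessingStep")
            .tasklet((contribution, chunkContext) -> {
                // read everything back from the temporary storage and do the
                // one-time processing here (placeholder)
                return RepeatStatus.FINISHED;
            })
            .build();
}

@Bean
Job productCatalogJob(JobBuilderFactory jobBuilderFactory, Step preProcessingStep,
        Step finalProcessingStep) {
    // finalProcessingStep only starts once preProcessingStep has fully completed
    return jobBuilderFactory.get("productCatalogJob")
            .incrementer(new RunIdIncrementer())
            .start(preProcessingStep)
            .next(finalProcessingStep)
            .build();
}

An optional cleanup step could be appended with another .next(...) call if the temporary storage is persistent.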
Upvotes: 0