Sanjay Naik

Reputation: 328

Can we get data processed in Spring Batch after batch job is completed?

I am using Spring Batch to read data from a database, process it, and then do some further processing in the writer.

If the batch (chunk) size is smaller than the number of records read by the reader, Spring Batch runs in multiple batches, so the writer is called once per batch. I want to do the processing in the writer only once, at the end of all batch process completion. If this is not possible, I will remove the writer and process the data obtained from the processor after the batch job is completed. Is this possible?

Below is the code that triggers my Spring Batch job:

    private void triggerSpringBatchJob() {
        loggerConfig.logDebug(log, " : Triggering product catalog scheduler ");
        
        JobParametersBuilder builder = new JobParametersBuilder();

        try {

            // Add a date parameter so every run has unique job parameters; without it
            // the launch fails with JobInstanceAlreadyCompleteException ("A job instance already exists").
            builder.addDate("date", new Date());
            jobLauncher.run(processProductCatalog, builder.toJobParameters());

        } catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException
                | JobParametersInvalidException e) {

            e.printStackTrace();

        }
    }

Below is my Spring Batch configuration:


@Configuration
@EnableBatchProcessing
public class BatchJobProcessConfiguration {
    

    
    @Bean
    @StepScope
    RepositoryItemReader<Tuple> reader(SkuRepository skuRepository,
            ProductCatalogConfiguration productCatalogConfiguration) {

        RepositoryItemReader<Tuple> reader = new RepositoryItemReader<>();
        reader.setRepository(skuRepository);
        // query parameters
        List<Object> queryMethodArguments = new ArrayList<>();
        
        
        if (productCatalogConfiguration.getSkuId().isEmpty()) {
            reader.setMethodName("findByWebEligibleAndDiscontinued");
            queryMethodArguments.add(productCatalogConfiguration.getWebEligible()); // for web eligible
            queryMethodArguments.add(productCatalogConfiguration.getDiscontinued()); // for discontinued
            queryMethodArguments.add(productCatalogConfiguration.getCbdProductId()); // for cbd products
        } else {
            reader.setMethodName("findBySkuIds");
            queryMethodArguments.add(productCatalogConfiguration.getSkuId()); // for sku ids
        }

        reader.setArguments(queryMethodArguments);

        reader.setPageSize(1000);
        Map<String, Direction> sorts = new HashMap<>();
        sorts.put("sku_id", Direction.ASC);
        reader.setSort(sorts);

        return reader;
    }

    @Bean
    @StepScope
    ItemWriter<ProductCatalogWriterData> writer() {
        return new ProductCatalogWriter();
    }

    @Bean
    ProductCatalogProcessor processor() {
        return new ProductCatalogProcessor();
    }
    
    @Bean
    SkipPolicy readerSkipper() {
        return new ReaderSkipper();
    }

    @Bean
    Step productCatalogDataStep(ItemReader<Tuple> itemReader, ProductCatalogWriter writer,
            HttpServletRequest request, StepBuilderFactory stepBuilderFactory,BatchConfiguration batchConfiguration) {
        return stepBuilderFactory.get("processProductCatalog").<Tuple, ProductCatalogWriterData>chunk(batchConfiguration.getBatchChunkSize())
                .reader(itemReader).faultTolerant().skipPolicy(readerSkipper()).processor(processor()).writer(writer).build();
    }

    
    @Bean
    Job productCatalogData(Step productCatalogDataStep, HttpServletRequest request,
            JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory.get("processProductCatalog").incrementer(new RunIdIncrementer())
                .flow(productCatalogDataStep).end().build();
    }

}


Upvotes: 1

Views: 805

Answers (1)

Mahmoud Ben Hassine

Reputation: 31600

> I want to do the processing in the writer only once at the end of all batch process completion, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?

"at the end of all batch process completion" is key here. If the requirement is to do some processing after all chunks have been "pre-processed", I would keep it simple and use two steps for that:

  • Step 1: (pre)processes the data as needed and writes it to temporary storage
  • Step 2: does whatever you want with the processed data prepared in the temporary storage

A final step would clean up the temporary storage if it is persistent (file, staging table, etc.). Otherwise, i.e. if it is in memory, this cleanup is optional.
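
To make the idea concrete, here is a minimal plain-Java sketch of the pattern. It is not Spring Batch code: the class `TwoStepSketch`, its methods, and the in-memory `staging` list are all hypothetical names standing in for your steps and your temporary storage. In a real job, each method would roughly correspond to one step, chained on the job builder with `start(...)` and `next(...)`, and the second step would probably be a simple tasklet.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoStepSketch {

    // Step 1: process the input chunk by chunk. The "writer" part only stages
    // the processed items, so it can safely run once per chunk.
    // Returns how many chunks were written. Assumes chunkSize > 0.
    static int stepOne(List<Integer> skuIds, int chunkSize, List<String> staging) {
        int writerCalls = 0;
        for (int start = 0; start < skuIds.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, skuIds.size());
            List<String> processedChunk = new ArrayList<>();
            for (int skuId : skuIds.subList(start, end)) {
                processedChunk.add("sku-" + skuId); // placeholder per-item processing
            }
            staging.addAll(processedChunk); // one write per chunk
            writerCalls++;
        }
        return writerCalls;
    }

    // Step 2: runs exactly once, after every chunk has been staged.
    static String stepTwo(List<String> staging) {
        return String.join(",", staging);
    }

    // Final step: clear the temporary storage.
    static void cleanup(List<String> staging) {
        staging.clear();
    }

    public static void main(String[] args) {
        List<String> staging = new ArrayList<>(); // hypothetical in-memory staging area
        int writerCalls = stepOne(List.of(1, 2, 3, 4, 5), 2, staging);
        System.out.println("chunks written: " + writerCalls);
        System.out.println("final result: " + stepTwo(staging));
        cleanup(staging);
    }
}
```

With five items and a chunk size of 2, step one writes three chunks, while step two still runs only once over the complete staged data.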

Upvotes: 0
