reverse

Reputation: 350

Spring Batch Parallel Processing - File Splitting at runtime

I have a simple Spring Batch job that reads, say, 1 million records from a file and prints them to the console.

Now, I want to deploy this batch on N servers, say N=5.

How can I make sure that the same records are NOT read by every server instance?

In other words, how can I split the records in the file evenly (1 million / 5 per server) so the work is actually shared?

Please help with code examples. Thanks.

Upvotes: 5

Views: 2729

Answers (1)

Niraj Sonawane

Reputation: 11055

As suggested by Michael, you can split the file with a system command and then use MultiResourcePartitioner to process the split files in parallel. This is how I did it:

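    @Bean
    public Partitioner partitioner() {
        MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
        ClassLoader cl = this.getClass().getClassLoader();
        ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver(cl);
        try {
            // One partition is created per matching file.
            Resource[] resources = resolver.getResources("file:" + filePath + "/*.csv");
            partitioner.setResources(resources);
        } catch (IOException e) {
            // getResources declares IOException, so it has to be handled here.
            throw new UncheckedIOException(e);
        }
        // No explicit partition(...) call: the partitioned step invokes it at runtime.
        return partitioner;
    }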

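    @Bean
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        // Raise the core size as well: with the default core size of 1 and the
        // default unbounded queue, the pool never grows toward maxPoolSize.
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(4);
        taskExecutor.afterPropertiesSet();
        return taskExecutor;
    }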

    @Bean
    @Qualifier("masterStep")
    public Step masterStep() {
        return stepBuilderFactory.get("masterStep")
                .partitioner("processData", partitioner())  // name prefix for the worker step executions
                .step(processData())                        // worker step, run once per partition
                .taskExecutor(taskExecutor())
                .listener(pcStressStepListener)
                .build();
    }


    @Bean
    @Qualifier("processData")
    public Step processData() {
        return stepBuilderFactory.get("processData")
                .<pojo, pojo> chunk(5000)
                // null is fine here: the step-scoped proxy resolves fileName per partition
                .reader(reader(null))
                .processor(processor())
                .writer(writer)
                .build();
    }



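    @Bean(name="reader")
    @StepScope
    public FlatFileItemReader<pojo> reader(@Value("#{stepExecutionContext['fileName']}") String filename) {
        // 'fileName' is put into each partition's context by MultiResourcePartitioner.
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setNames(FILE HEADER);   // placeholder: the column names of your file

        BeanWrapperFieldSetMapper<pojo> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(pojo.class);

        DefaultLineMapper<pojo> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(fieldSetMapper);

        FlatFileItemReader<pojo> reader = new FlatFileItemReader<>();
        try {
            reader.setResource(new UrlResource(filename));
        } catch (MalformedURLException e) {
            // UrlResource(String) declares this checked exception.
            throw new IllegalArgumentException("Invalid file URL: " + filename, e);
        }
        reader.setLineMapper(lineMapper);
        return reader;
    }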
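
For the splitting step itself I used a system command (on Unix-like systems `split -l` is the usual tool). If you would rather do it from Java before the job starts, here is a rough sketch using only the JDK. `FileSplitter`, the `part-N.csv` naming and the `linesPerFile` parameter are my own illustrative choices, not part of Spring Batch, and it assumes a UTF-8 file with no header row that needs repeating in every part.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Spring Batch class): splits a text file by line count. */
public class FileSplitter {

    /**
     * Copies the lines of source into files of at most linesPerFile lines each,
     * named part-0.csv, part-1.csv, ... inside targetDir, and returns their paths.
     */
    public static List<Path> split(Path source, Path targetDir, int linesPerFile) throws IOException {
        if (linesPerFile <= 0) {
            throw new IllegalArgumentException("linesPerFile must be positive");
        }
        Files.createDirectories(targetDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            BufferedWriter out = null;
            int linesInCurrent = 0;
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    // Start a new part file on the first line and whenever the current one is full.
                    if (out == null || linesInCurrent == linesPerFile) {
                        if (out != null) {
                            out.close();
                        }
                        Path part = targetDir.resolve("part-" + parts.size() + ".csv");
                        parts.add(part);
                        out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        linesInCurrent = 0;
                    }
                    out.write(line);
                    out.newLine();
                    linesInCurrent++;
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return parts;
    }
}
```

With 1 million records and `linesPerFile = 200000` this would give five files in `targetDir`, which the `*.csv` pattern in the partitioner above then picks up as five partitions.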

Upvotes: 3
