Reputation: 7500
I'm new to Spring Batch and trying to implement a batch job where I
I've looked through this question on StackOverflow, but the main accepted answer was essentially to implement two steps that read twice from the database:
<job id="myJob">
    <step id="step1" next="step2">
        <tasklet>
            <chunk reader="reader" writer="typeAwriter"/>
        </tasklet>
    </step>
    <step id="step2">
        <tasklet>
            <chunk reader="reader" processor="processor" writer="typeBwriter"/>
        </tasklet>
    </step>
</job>
Isn't there a more efficient way of doing this than reading twice from the MySQL database? For example, what if your query is quite large and drags down system performance?
Upvotes: 2
Views: 12905
Reputation: 7500
I'll go ahead and answer my own question. There are multiple ways to do this, but I found that saving properties and objects to the StepExecutionContext first, then promoting them to the JobExecutionContext after the Step completes, works well. It is also documented pretty thoroughly here.
Step 1:
In your Writer / Reader, declare a private StepExecution field. Then, inside your read/write method, get the step context and put your data in as a key/value pair:
ExecutionContext stepContext = this.stepExecution.getExecutionContext();
stepContext.put("someKey", someObject);
Step 2:
Add an ExecutionContextPromotionListener to your step's bean configuration. The ExecutionContextPromotionListener must have its String[] property keys set to the keys you wish to promote to Job scope beyond your step, similar to this implementation from a LinkedIn article:
@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys( new String[] { "entityRef" } );
    return listener;
}
Step 3: You also need to add the StepExecution into your Writer before your step executes:
@BeforeStep
public void saveStepExecution( StepExecution stepExecution ) {
    this.stepExecution = stepExecution;
}
Step 4:
This will give your write() method access to the stepExecution instance, through which it can reach the step context and save your data. For instance, you can write:
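// Spring Batch 4.x signature; 5.x passes a Chunk instead of a List
public void write(List<? extends Object> items) throws Exception {
    // ... write logic
    ExecutionContext stepContext = this.stepExecution.getExecutionContext();
    stepContext.put("keyYouWantToPutIn", theCorrespondingDataObject);
}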
Finally, in your next step, you can retrieve this data (example coming directly from the Spring Batch documentation):
@BeforeStep
public void retrieveInterstepData(StepExecution stepExecution) {
    JobExecution jobExecution = stepExecution.getJobExecution();
    ExecutionContext jobContext = jobExecution.getExecutionContext();
    this.someObject = jobContext.get("someKey");
}
This time, however, notice that it's being accessed from the jobContext as opposed to the stepContext: it's been promoted!
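If the flow is hard to picture, here is a tiny dependency-free model of it. This is not Spring Batch code: SimpleContext, SimplePromoter and PromotionDemo are made-up names for illustration, and the real ExecutionContextPromotionListener does more (by default it only promotes when the step's exit status is COMPLETED, for example). It just shows the idea of stashing a value in step scope, copying the selected keys to job scope after the step, and reading them in the next step.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a step- or job-scoped ExecutionContext.
class SimpleContext {
    private final Map<String, Object> values = new HashMap<>();

    void put(String key, Object value) { values.put(key, value); }
    Object get(String key) { return values.get(key); }
    boolean containsKey(String key) { return values.containsKey(key); }
}

// Hypothetical stand-in for the promotion listener: after a step finishes,
// it copies only the configured keys from the step context to the job context.
class SimplePromoter {
    private final List<String> keys;

    SimplePromoter(List<String> keys) { this.keys = keys; }

    void afterStep(SimpleContext stepContext, SimpleContext jobContext) {
        for (String key : keys) {
            if (stepContext.containsKey(key)) {
                jobContext.put(key, stepContext.get(key));
            }
        }
    }
}

class PromotionDemo {
    // Simulates step 1 stashing data, the promotion, and step 2 reading it back.
    static Object run() {
        SimpleContext jobContext = new SimpleContext();

        // "Step 1": the writer saves what it wants to hand over.
        SimpleContext step1Context = new SimpleContext();
        step1Context.put("someKey", "rows-read-once");
        step1Context.put("scratch", "stays in step scope");

        // After step 1: only "someKey" is promoted to job scope.
        new SimplePromoter(Arrays.asList("someKey")).afterStep(step1Context, jobContext);

        // "Step 2": reads from the job context, not the step context.
        return jobContext.get("someKey");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Keys that are not listed (like "scratch" above) never leave the step, which is why the keys you promote have to match the keys you put.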
Upvotes: 2
Reputation: 164
What you need is a chunk-oriented step rather than a plain tasklet. The ItemReader reads chunks from your database, the processor processes your data, and each item is then handed to an ItemWriter that can write to both the database and a file, so the data is only read once. This is one of many possible strategies. I don't know the details of your business logic, but I think this is enough information to get you going with your own ideas.
<?xml version="1.0" encoding="UTF-8"?>
<job id="customerJob" xmlns="http://xmlns.jcp.org/xml/ns/javaee"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/jobXML_1_0.xsd"
     version="1.0">
    <step id="step1">
        <chunk item-count="5">
            <reader ref="itemReader"/>
            <processor ref="itemProcessor"/>
            <writer ref="itemWriter"/>
        </chunk>
    </step>
</job>
This is the JSR-352 job XML format; Spring Batch has a corresponding approach.
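To illustrate the "one writer, several destinations" idea, here is a rough plain-Java sketch. Spring Batch ships a CompositeItemWriter that takes a list of delegate writers via setDelegates, and that is probably what you would use in practice. The SimpleWriter interface, FanOutWriter and FanOutDemo below are hypothetical stand-ins so the example runs without any dependencies, and the "database" and "file" are just entries in an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal writer contract (the real ItemWriter signature
// depends on the Spring Batch version).
interface SimpleWriter<T> {
    void write(List<? extends T> items);
}

// Passes each chunk to every delegate in order, so a single read
// feeds several destinations.
class FanOutWriter<T> implements SimpleWriter<T> {
    private final List<SimpleWriter<T>> delegates;

    FanOutWriter(List<SimpleWriter<T>> delegates) { this.delegates = delegates; }

    @Override
    public void write(List<? extends T> items) {
        for (SimpleWriter<T> delegate : delegates) {
            delegate.write(items);
        }
    }
}

class FanOutDemo {
    // Writes one chunk through two delegates and returns what each one received.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Stand-ins for a database writer and a file writer.
        SimpleWriter<String> dbWriter = items -> {
            for (String item : items) log.add("db:" + item);
        };
        SimpleWriter<String> fileWriter = items -> {
            for (String item : items) log.add("file:" + item);
        };

        SimpleWriter<String> writer = new FanOutWriter<>(Arrays.asList(dbWriter, fileWriter));
        writer.write(Arrays.asList("a", "b")); // one chunk, read once
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The delegates run one after another on the same chunk, so if the two outputs need different shapes (type A and type B in the question), each delegate would have to do its own conversion.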
Upvotes: 2