Grant Edwards

Reputation: 41

StaxEventItemReader - skip XML fragments processed in previous Job executions

When working with CSV files and restarting a FAILED Job, a StepExecutionListener and its beforeStep(...) method can be used to position the reader within the file. The code could look something like this:

public void beforeStep(StepExecution stepExecution) {

    ExecutionContext executionContext = stepExecution.getExecutionContext();

    if (executionContext.containsKey(getKey(LINES_READ_COUNT))) {

        long lineCount = executionContext.getLong(getKey(LINES_READ_COUNT));

        LineReader reader = getReader();
        Object record = "";
        while (reader.getPosition() < lineCount && record != null) {
            record = readLine();
        }
    }
} // Or something similar

My question is: how do you achieve the same thing when working with a StaxEventItemReader?

My batch_step_execution_context looks something like {"string":"StaxEventItemReader.read.count","int":6}. So in my case the first 5 XML fragments were successfully processed, and upon restarting the Job I would like to start processing from XML fragment number 6 onwards.

Given the config below, how would I position the reader within the XML file?

<batch:job id="reportJob" restartable="true">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="xmlItemReader" writer="cvsFileItemWriter" processor="filterReportProcessor"
                commit-interval="1">
            </batch:chunk>
            <batch:listeners>
                <batch:listener ref="step1Listener" />
            </batch:listeners>
        </batch:tasklet>
    </batch:step>
</batch:job>

<bean id="step1Listener" class="com.mkyong.listeners.Step1Listener" />

<bean id="filterReportProcessor" class="com.mkyong.processor.FilterReportProcessor" />

<bean id="xmlItemReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
    <property name="fragmentRootElementName" value="record" />
    <property name="resource" value="classpath:xml/report.xml" />
    <property name="unmarshaller" ref="reportUnmarshaller" />
</bean>

<!-- Read and map values to object, via jaxb2 -->
<bean id="reportUnmarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    <property name="classesToBeBound">
        <list>
            <value>com.mkyong.model.Report</value>
        </list>
    </property>
</bean>

Environment - spring-batch-core-2.2.0; spring-core-3.2.2

Test Input File

Convert an XML file into a CSV file.

<company>
    <record refId="1001">
        <name>mkyong</name>
        <age>31</age>
        <dob>31/8/1982</dob>
        <income>200,000</income>
    </record>
    <record refId="1002">
        <name>kkwong</name>
        <age>30</age>
        <dob>26/7/1983</dob>
        <income>100,999</income>
    </record>
    <record refId="1003">
        <name>joel</name>
        <age>29</age>
        <dob>21/8/1984</dob>
        <income>1,000,000</income>
    </record>
    <record refId="1004">
        <name>leeyy</name>
        <age>29</age>
        <dob>21/3/1984</dob>
        <income>80,000.89</income>
    </record>
    <record refId="1005">
        <name>Grant</name>
        <age>29</age>
        <dob>21/3/1984</dob>
        <income>80,000.89</income>
    </record>
</company>

Test Run 1

After processing two records in the input file, I forced a RuntimeException.

batch_job_execution --->>  "FAILED";"FAILED";"java.lang.RuntimeException: Get me out of here!

batch_step_execution_context --->> {"string":"StaxEventItemReader.read.count","int":2}

Output CSV file --->> 1001,mkyong,31,31/08/1982,200000
                      1002,kkwong,30,26/07/1983,100999

Test Run 2

Process all remaining data, so I expect only refId="1003", refId="1004" and refId="1005" to be written.

batch_job_execution --->>  "COMPLETED";"COMPLETED";"''";"2015-01-25 16:18:08.587"

batch_step_execution_context --->>  {"string":"StaxEventItemReader.read.count","int":6}


Output CSV file --->> 1001,mkyong,31,31/08/1982,200000
                      1002,kkwong,30,26/07/1983,100999
                      1003,joel,29,21/08/1984,1000000
                      1004,leeyy,29,21/03/1984,80000.89
                      1005,Grant,29,21/03/1984,80000.89

Test Result

Unfortunately, it looks like the StaxEventItemReader reads from the beginning of the file rather than repositioning itself based on the value of StaxEventItemReader.read.count, which was set to 2 after the first test run.

Upvotes: 1

Views: 1129

Answers (1)

Jimmy Praet

Reputation: 2370

You don't need to configure anything; this is already the default behavior of the StaxEventItemReader. When it is opened, it repositions itself using the read count stored in the step execution context.
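
To illustrate what "repositioning from the read count" amounts to, here is a minimal plain-Java sketch that uses only the JDK's StAX API. It is not Spring Batch's own implementation, and the class name FragmentSkipSketch and the method readRemaining are hypothetical names chosen for this example. It assumes the fragment root element is record, as in the question's config, and that the saved count is the number of fragments already read.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class FragmentSkipSketch {

    // Returns the refId of every <record> fragment that follows the first
    // `alreadyRead` fragments. `alreadyRead` plays the role of the saved
    // read count from the step execution context (an assumption of this sketch).
    public static List<String> readRemaining(String xml, int alreadyRead) {
        List<String> refIds = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int seen = 0;
            while (reader.hasNext()) {
                // Only the start tag of a fragment root element is of interest.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "record".equals(reader.getLocalName())) {
                    seen++;
                    // Fragments counted in a previous run are skipped.
                    if (seen > alreadyRead) {
                        refIds.add(reader.getAttributeValue(null, "refId"));
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML input", e);
        }
        return refIds;
    }

    public static void main(String[] args) {
        String xml = "<company>"
                + "<record refId=\"1001\"/><record refId=\"1002\"/>"
                + "<record refId=\"1003\"/><record refId=\"1004\"/>"
                + "<record refId=\"1005\"/></company>";
        // With a saved count of 2, only the last three fragments are returned.
        System.out.println(readRemaining(xml, 2)); // prints [1003, 1004, 1005]
    }
}
```

With a saved count of 2, as in Test Run 1, a restart that picks up the same step execution context would be expected to continue from refId="1003".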

Upvotes: 1
