Reputation: 153
My XML file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<File fileId="123" xmlns="abc:XYZ" > ABC123411/10/20
XBC128911/10/20
BCD456711/23/22
</File>
This is effectively a fixed-length flat file wrapped in an XML element. I need to parse each line of it, for example
ABC123411/10/20
into a Content object.
public class Content {
    // Field types are assumed to be String; adjust as needed.
    private String id;
    private String name;
    private String date;
    // getters
}
For example, the line above should map to:
name: ABC
id: 1234
date: 11/10/20
This is what I'm trying:
<bean id="reader" class="org.springframework.batch.item.xml.StaxEventItemReader" scope="step">
    <property name="resource" value="file:#{jobExecutionContext['source.download.filePath']}" />
    <property name="unmarshaller" ref="jaxb2Marshaller" />
    <property name="fragmentRootElementNames" value="File" />
</bean>
<bean id="jaxb2Marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    <property name="packagesToScan">
        <list>
            <value>com.test.model</value>
        </list>
    </property>
</bean>
and my POJO:
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "File", namespace = "//namespace")
public class TestRecord {
    @XmlValue
    private String data;

    public String getData() {
        return data;
    }
}
This code parses the XML file and stores the whole element text as a single String in TestRecord.data:
ABC123411/10/20
XBC128911/10/20
BCD456711/23/22
With this approach, we still need to write a mapper that splits this string (from TestRecord.data) on newlines, tokenizes each line, and assigns the fields to a Content object.
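For illustration, that manual mapping step might look like the minimal sketch below. It assumes the widths implied by the example record (3 characters for the name, 4 for the id, 8 for the date) and keeps all fields as String. The ContentLineParser class and its nested Content stand-in are hypothetical names, not part of Spring Batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns the raw text held in TestRecord.data into Content objects.
final class ContentLineParser {

    // Minimal stand-in for the Content class (field types assumed to be String).
    static final class Content {
        final String name;
        final String id;
        final String date;

        Content(String name, String id, String date) {
            this.name = name;
            this.id = id;
            this.date = date;
        }
    }

    // Assumed fixed offsets, derived from the example record "ABC123411/10/20".
    private static final int NAME_END = 3;
    private static final int ID_END = 7;
    private static final int RECORD_LENGTH = 15;

    static List<Content> parse(String data) {
        List<Content> result = new ArrayList<>();
        for (String raw : data.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines around the records
            }
            if (line.length() != RECORD_LENGTH) {
                throw new IllegalArgumentException("Unexpected record length: " + line);
            }
            result.add(new Content(
                    line.substring(0, NAME_END),
                    line.substring(NAME_END, ID_END),
                    line.substring(ID_END)));
        }
        return result;
    }

    public static void main(String[] args) {
        String data = " ABC123411/10/20\nXBC128911/10/20\nBCD456711/23/22\n";
        for (Content c : parse(data)) {
            System.out.println(c.name + " " + c.id + " " + c.date);
        }
    }
}
```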
Is there a way to do this purely in XML configuration with the available readers, or is there a better option? Thanks!
Upvotes: 0
Views: 461
Reputation: 31620
I would keep it simple and create a tasklet that transforms this:
<?xml version="1.0" encoding="UTF-8"?>
<File fileId="123" xmlns="abc:XYZ" > ABC123411/10/20
XBC128911/10/20
BCD456711/23/22
</File>
into this:
ABC123411/10/20
XBC128911/10/20
BCD456711/23/22
and then create a chunk-oriented step with a FlatFileItemReader to parse the new file. This would be simpler than trying to find a way to ignore lines, use regular expressions to parse the content, etc.
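As a rough sketch of the transformation itself, the code below uses the JDK's StAX API to pull the text out of the File element and write one record per line. The XmlToFlatFile class and its method names are hypothetical, and wiring it into a tasklet (for example, calling transform from the tasklet's execute method) is left out so that the example depends only on the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper that a tasklet could delegate to.
final class XmlToFlatFile {

    // Element name taken from the sample file in the question.
    private static final String RECORD_CONTAINER = "File";

    // Returns the trimmed, non-blank lines found in the text of the <File> element.
    static List<String> extractRecords(InputStream xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // the sample input needs no DTD
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        try {
            List<String> records = new ArrayList<>();
            while (reader.hasNext()) {
                reader.next();
                if (reader.isStartElement() && RECORD_CONTAINER.equals(reader.getLocalName())) {
                    // getElementText() reads everything up to the matching end tag.
                    for (String line : reader.getElementText().split("\\R")) {
                        String trimmed = line.trim();
                        if (!trimmed.isEmpty()) {
                            records.add(trimmed);
                        }
                    }
                }
            }
            return records;
        } finally {
            reader.close();
        }
    }

    // Writes one record per line to flatFile, ready for a FlatFileItemReader.
    static void transform(Path xmlFile, Path flatFile) throws IOException, XMLStreamException {
        try (InputStream in = Files.newInputStream(xmlFile)) {
            Files.write(flatFile, extractRecords(in), StandardCharsets.UTF_8);
        }
    }
}
```

Because the output contains only record lines, the follow-up step can use a fixed-length tokenizer without skipping or filtering anything.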
Upvotes: 1
Reputation: 76
I successfully extracted the contents using RegexLineTokenizer instead of FixedLengthTokenizer. Setting strict to false prevents it from choking on lines that do not match the pattern, but it will create objects with empty properties for those lines.
@Bean
public static RegexLineTokenizer regexpTokenizer() {
    RegexLineTokenizer tok = new RegexLineTokenizer();
    tok.setRegex("([A-Za-z]{3})(\\d{4})(\\d{2}/\\d{2}/\\d{2})");
    tok.setNames("name", "id", "date");
    tok.setStrict(false);
    return tok;
}
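To see what the pattern itself captures, here is a small plain-Java sketch that applies the same regular expression with java.util.regex. It only illustrates the capture groups and is not a reproduction of how RegexLineTokenizer works internally. The RecordPatternDemo class is a hypothetical name, and the use of Matcher.find() (which accepts a record anywhere in the line) is a choice made for this sketch.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: applies the answer's regex outside of Spring Batch.
final class RecordPatternDemo {

    private static final Pattern RECORD =
            Pattern.compile("([A-Za-z]{3})(\\d{4})(\\d{2}/\\d{2}/\\d{2})");

    // Returns {name, id, date} for the first record found in the line, or empty if none is found.
    static Optional<String[]> tokenize(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2), m.group(3) });
    }

    public static void main(String[] args) {
        String[] lines = {
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
            "<File fileId=\"123\" xmlns=\"abc:XYZ\" > ABC123411/10/20",
            "XBC128911/10/20",
            "</File>"
        };
        for (String line : lines) {
            Optional<String[]> tokens = tokenize(line);
            System.out.println(tokens.isPresent()
                    ? String.join(",", tokens.get())
                    : "(no record)");
        }
    }
}
```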
Here is what that translates to as an XML configuration:
<bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
    <property name="resource" value="/file path" />
    <property name="linesToSkip" value="2" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.RegexLineTokenizer">
                    <property name="names" value="name,id,date" />
                    <property name="regex" value="([A-Za-z]{3})(\d{4})(\d{2}/\d{2}/\d{2})" />
                    <property name="strict" value="false" />
                </bean>
            </property>
            <property name="fieldSetMapper">
                <!-- Map the tokens onto the target object -->
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="prototypeBeanName" value="testRecord" />
                </bean>
            </property>
        </bean>
    </property>
</bean>
Upvotes: 0