marie

Reputation: 11

How to read a CSV file with a varying number of columns with Spring Batch

I have a CSV file that doesn't have a fixed number of columns, like this:

  col1,col2,col3,col4,col5
  val1,val2,val3,val4,val5
  column1,column2,column3
  value1,value2,value3

Is there any way to read this kind of CSV file with Spring Batch?

I tried to do this:

<bean id="ItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">

    <!-- Read a csv file -->
    <property name="resource" value="classpath:file.csv" />

    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <!-- split it -->
            <property name="lineTokenizer">
                <bean
                    class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="names"
                        value="col1,col2,col3,col4,col5,column1,column2,column3" />
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean
                    class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="prototypeBeanName" value="myBean" />
                </bean>
            </property>

        </bean>
    </property>

</bean>

But the result was this error:

IncorrectTokenCountException stack trace

Upvotes: 0

Views: 2695

Answers (2)

Gustavo Passini

Reputation: 2678

Setting the strict flag to false on your DelimitedLineTokenizer, via AbstractLineTokenizer#setStrict(boolean), should do the job.

From the Javadoc:

Public setter for the strict flag. If true (the default) then number of tokens in line must match the number of tokens defined (by Range, columns, etc.) in LineTokenizer. If false then lines with less tokens will be tolerated and padded with empty columns, and lines with more tokens will simply be truncated.

You should change this part of your configuration to:

<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    <property name="names" value="col1,col2,col3,col4,col5,column1,column2,column3" />
    <property name="strict" value="false" />
</bean>

Upvotes: 2

Michael Minella

Reputation: 21493

You can use the PatternMatchingCompositeLineMapper to delegate each line to the appropriate tokenizer and mapper based on a pattern. Each delegate pair would then use a DelimitedLineTokenizer and a FieldSetMapper to map the line accordingly.

You can read more about this in the documentation here: http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/file/mapping/PatternMatchingCompositeLineMapper.html
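
As a rough illustration, a configuration along these lines might work for the sample file in the question. This is a sketch under assumptions, not a tested setup: the bean ids, the two patterns, and the prototype beans `myFiveColumnBean` and `myThreeColumnBean` are hypothetical names. The patterns distinguish lines by their number of commas, since the sample lines share no fixed prefix. Note that `*,*,*` also matches a five-column line, so this relies on the matcher trying the longer, more specific pattern first, which is worth confirming with a quick test against your Spring Batch version.

```xml
<bean id="ItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="classpath:file.csv" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
            <!-- One tokenizer per line pattern; * matches any run of characters, ? a single character -->
            <property name="tokenizers">
                <map>
                    <entry key="*,*,*,*,*" value-ref="fiveColumnTokenizer" />
                    <entry key="*,*,*" value-ref="threeColumnTokenizer" />
                </map>
            </property>
            <!-- The same pattern keys select the FieldSetMapper for each kind of line -->
            <property name="fieldSetMappers">
                <map>
                    <entry key="*,*,*,*,*" value-ref="fiveColumnMapper" />
                    <entry key="*,*,*" value-ref="threeColumnMapper" />
                </map>
            </property>
        </bean>
    </property>
</bean>

<!-- Hypothetical delegates: each tokenizer declares only the columns its lines contain -->
<bean id="fiveColumnTokenizer" class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    <property name="names" value="col1,col2,col3,col4,col5" />
</bean>

<bean id="threeColumnTokenizer" class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
    <property name="names" value="column1,column2,column3" />
</bean>

<!-- Hypothetical mappers: each targets its own prototype-scoped bean -->
<bean id="fiveColumnMapper" class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
    <property name="prototypeBeanName" value="myFiveColumnBean" />
</bean>

<bean id="threeColumnMapper" class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
    <property name="prototypeBeanName" value="myThreeColumnBean" />
</bean>
```

With this layout each line shape maps to its own bean type, so the reader returns a mix of item types and the downstream processor or writer has to handle both.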

Upvotes: 2
