Reputation: 1951
How do I configure Super CSV to skip blank or white-space only lines?
I'm using the CsvListReader and sometimes I'll get a blank line in my data. When this happens, I get an exception to the effect of:
number of CellProcessors must match number of fields
I'd like to simply skip these lines.
Upvotes: 3
Views: 2878
Reputation: 9868
Update: Super CSV 2.1.0 (released April 2013) allows you to supply a CommentMatcher via the preferences that will let you skip lines that are considered comments. There are two built-in matchers you can use, or you can supply your own. In this case you could use new CommentMatches("\\s+") to skip blank lines.
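To make it concrete which lines that pattern would cover, here is a minimal sketch using only java.util.regex. It assumes the matcher tests the pattern against the whole line, the way Matcher.matches() does; the class and method names are made up for illustration. The commented lines show roughly how the pattern would be wired into the preferences when Super CSV 2.1.0 or later is on the classpath.

```java
import java.util.regex.Pattern;

class BlankLinePatternDemo {
    // Same regex as the CommentMatches("\\s+") argument above.
    static final Pattern BLANK = Pattern.compile("\\s+");

    // Assumption: the whole line has to match, as with Matcher.matches().
    static boolean wouldBeSkipped(String line) {
        return BLANK.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(wouldBeSkipped("   "));    // spaces only
        System.out.println(wouldBeSkipped("\t \t"));  // tabs and spaces
        System.out.println(wouldBeSkipped(" a,b "));  // real data with padding
        System.out.println(wouldBeSkipped(""));       // zero-length line

        // With Super CSV on the classpath, the wiring would look roughly like:
        //   CsvPreference prefs = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE)
        //       .skipComments(new CommentMatches("\\s+"))
        //       .build();
    }
}
```

Note that a zero-length line does not match \s+, which is fine here because such lines are skipped by the reader anyway.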
Super CSV only skips lines of zero length (just a line terminator).
It's not a valid CSV file if there are blank lines (see rule 4 of RFC 4180, which states that "Each line should contain the same number of fields throughout the file"). The only time a blank line is valid is if it's part of a multi-line field surrounded by quotes, e.g.
column1,column2
"multi-line field

with a blank line",value2
That being said, it might be possible to make Super CSV a bit more lenient with blank lines (it could ignore them). If you could post a feature request on our SourceForge page, we can investigate this further and potentially add this functionality in a future release.
That doesn't help you right now though!
I haven't done extensive testing on this, but it should work :) You can write your own tokenizer that skips blank lines:
    package org.supercsv.io;

    import java.io.IOException;
    import java.io.Reader;
    import java.util.List;

    import org.supercsv.prefs.CsvPreference;

    public class SkipBlankLinesTokenizer extends Tokenizer {

        public SkipBlankLinesTokenizer(Reader reader, CsvPreference preferences) {
            super(reader, preferences);
        }

        @Override
        public boolean readColumns(List<String> columns) throws IOException {
            boolean moreInput = super.readColumns(columns);
            // keep reading lines if they're blank: no columns at all,
            // or a single column containing only whitespace
            while (moreInput && (columns.size() == 0
                    || (columns.size() == 1 && columns.get(0).trim().isEmpty()))) {
                moreInput = super.readColumns(columns);
            }
            return moreInput;
        }
    }
And just pass this into the constructor of your reader (you'll have to pass the preferences into both the reader and the tokenizer):
    ICsvListReader listReader = null;
    try {
        CsvPreference prefs = CsvPreference.STANDARD_PREFERENCE;
        listReader = new CsvListReader(
            new SkipBlankLinesTokenizer(new FileReader(CSV_FILENAME), prefs),
            prefs);
        ...
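If you want to sanity-check the blank-row condition from readColumns without pulling in the library, it can be lifted into a small standalone helper. This is only a sketch: BlankRowCheck and isBlankRow are hypothetical names, and the method just mirrors the loop condition in the tokenizer above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BlankRowCheck {
    // Mirrors the tokenizer's loop condition: a row counts as blank if it has
    // no columns, or exactly one column that contains only whitespace.
    static boolean isBlankRow(List<String> columns) {
        return columns.size() == 0
                || (columns.size() == 1 && columns.get(0).trim().isEmpty());
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        System.out.println(isBlankRow(none));                    // no columns
        System.out.println(isBlankRow(Arrays.asList("  \t")));   // one whitespace-only column
        System.out.println(isBlankRow(Arrays.asList("value")));  // one real value
        System.out.println(isBlankRow(Arrays.asList("", "")));   // two empty columns, e.g. a "," line
    }
}
```

A line that contains only a delimiter yields two columns, so it is not treated as blank and will still reach the cell processors.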
Hope this helps
Upvotes: 3
Reputation: 41132
I didn't know this library (you should add a Java tag...), but looking at the examples, I see they have readers supporting a variable number of columns per line. An empty line is a special case of this pattern.
Alternatively (though maybe less efficient), you can just catch the exception and carry on reading...
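The catch-and-continue idea looks roughly like the sketch below. It deliberately does not use the Super CSV API: parseRow is a hypothetical stand-in for a strict reader that throws when the field count is wrong, and the naive split on commas ignores quoting, so treat it as an illustration of the control flow only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkipBadRows {
    // Hypothetical strict parser: rejects any line whose field count is wrong.
    static List<String> parseRow(String line, int expectedFields) {
        String[] fields = line.split(",", -1); // naive split, no quote handling
        if (fields.length != expectedFields) {
            throw new IllegalArgumentException(
                    "expected " + expectedFields + " fields but found " + fields.length);
        }
        return Arrays.asList(fields);
    }

    // Reads every line, skipping the ones the strict parser rejects.
    static List<List<String>> readAll(List<String> lines, int expectedFields) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : lines) {
            try {
                rows.add(parseRow(line, expectedFields));
            } catch (IllegalArgumentException e) {
                // blank or malformed line: ignore it and move on to the next one
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a,b", "   ", "", "c,d");
        System.out.println(readAll(lines, 2));
    }
}
```

The downside of this approach is that it also swallows rows that are genuinely malformed, not just blank ones, so you may want to check the offending line before discarding it.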
Upvotes: 0