Reputation: 45
Sample data:
Header1, full_name, header3, header4
20, "Tom, ZZZ", "test", 30
CSVReader csvReader = new CSVReader(reader, ',', '"');
The second row isn't read as expected because there is a stray double quote in the full_name column value.
I want to ignore such cases. Any suggestions would be appreciated.
I am using the openCSV Java API for parsing.
Edit:
I am getting the data from a database. One of the database column values contains a single double quote, and because of that the CSV data ends up malformed.
Upvotes: 3
Views: 6206
Reputation: 6289
univocity-parsers can handle unescaped quotes and is also 4x faster than opencsv. Try this code:
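import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

import java.io.StringReader;
import java.util.List;

public class UnescapedQuoteExample {
    public static void main(String... args) {
        // Three sample rows; the second one contains an unescaped quote (evan"s)
        String input = "" +
                "20, \"bob, XXX\", \"test\", 30\n" +
                "20, \"evan\"s,YYY \", \"test\", 30\n" +
                "20, \"Tom, ZZZ\", \"test\", 30 ";

        CsvParserSettings settings = new CsvParserSettings();
        CsvParser parser = new CsvParser(settings);
        List<String[]> rows = parser.parseAll(new StringReader(input));

        // Print each value enclosed in [ ] to make sure you are getting the expected result
        for (String[] row : rows) {
            for (String value : row) {
                System.out.print("[" + value + "],");
            }
            System.out.println();
        }
    }
}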
This will produce:
[20],[bob, XXX],[test],[30],
[20],["evan"s],[YYY "],[test],[30],
[20],[Tom, ZZZ],[test],[30],
Additionally, you can control how to handle unescaped quotes with one of:
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_DELIMITER);
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.RAISE_ERROR);
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.SKIP_VALUE);
When reading large files, you can use a RowProcessor or iterate over each row like this:
parser.beginParsing(new File("/path/to/your.csv"));
String[] row;
while ((row = parser.parseNext()) != null) {
// process row
}
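Since the question's edit says the rows come from a database, another option, if the export step is under your control, is to escape the quotes when the CSV is written, so the parser never sees a stray one. The following is only a minimal sketch using the quote-doubling convention from RFC 4180; the class and method names (CsvEscape, escapeField, toCsvLine) are hypothetical and not part of opencsv or univocity-parsers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, shown for illustration only
class CsvEscape {

    // Wraps a field in double quotes when needed and doubles any embedded quote
    static String escapeField(String value) {
        if (value == null) {
            return ""; // assumption: NULL database values become empty fields
        }
        boolean needsQuoting = value.contains("\"") || value.contains(",")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuoting) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins already-fetched column values into one CSV line
    static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvEscape::escapeField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The second field contains both a stray quote and a comma
        System.out.println(toCsvLine(Arrays.asList("20", "evan\"s,YYY", "test", "30")));
        // prints: 20,"evan""s,YYY",test,30
    }
}
```

A line written this way has balanced quotes, so a parser that follows the doubled-quote convention should read evan"s,YYY back as a single value.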
Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license).
Upvotes: 2