Andy Jacobs

Reputation: 957

CSVParser not handling escaped delimiters in unquoted strings

I'm using com.opencsv.CSVParser (5.1) in my Java program.

    final CSVParser csvParser =
        new CSVParserBuilder()
            .withSeparator(',')
            .withQuoteChar('"')
            .withEscapeChar('\\')
            .withIgnoreQuotations(true)
            .build();

My input file contains this line:

3,2.48,E #3,String with \, comma in it,0

I expected the 4th field to be "String with , comma in it". Instead, the parser splits the string at the escaped comma into two fields, "String with " and " comma in it". The documentation for withEscapeChar() says:

Sets the character to use for escaping a separator or quote.

And since quoted separators don't need to be escaped, I assumed (hoped) this would allow me to escape separators in non-quoted strings. I've tried this both with and without withIgnoreQuotations.

Am I missing something, or doing something wrong?

Upvotes: 0

Views: 935

Answers (1)

andrewJames

Reputation: 21910

I don't see anything wrong with your code, but I can't parse your data as expected either: I hit the same problem as you. This feels like a bug (which is surprising). If it's not a bug, then the correct usage is too obscure for me.

Alternatively, you can use Apache Commons CSV. The Maven dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.8</version>
</dependency>

Sample code:

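import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

...

private void commonsCsvTest() throws URISyntaxException, IOException {
    Path path = Paths.get(ClassLoader.getSystemResource("csv/escapes.csv").toURI());
    // try-with-resources closes the reader once parsing is done
    try (Reader in = new FileReader(path.toString())) {
        // Treat backslash as the escape character, so "\," is read as a literal comma
        Iterable<CSVRecord> records = CSVFormat.DEFAULT.withEscape('\\').parse(in);
        for (CSVRecord record : records) {
            System.out.println(record.get(3)); // 4th field (zero-based index 3)
        }
    }
}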

Using your data in the input file "escapes.csv", we get the following output:

String with , comma in it

You can change how you read the input file to fit your specific situation.
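If adding another library is not an option, a hand-rolled splitter may also be enough for data like yours. The following is only a minimal sketch, not something taken from opencsv or Commons CSV. The class name EscapedSplitter and the method splitLine are hypothetical names I made up for illustration. It assumes that fields are never quoted, that a record never spans more than one line, and that the escape character followed by any character means "take that character literally".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of opencsv or Commons CSV.
final class EscapedSplitter {

    // Splits one line on `separator`. An `escape` character makes the
    // character after it literal, so an escaped separator stays in the field.
    // Assumption: no quoted fields and no multi-line records.
    static List<String> splitLine(String line, char separator, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                // Drop the escape character and keep the next character as-is.
                current.append(line.charAt(++i));
            } else if (c == separator) {
                // Unescaped separator: the current field is complete.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                // Ordinary character (or a trailing escape with nothing after it).
                current.append(c);
            }
        }
        // The last field has no separator after it.
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "3,2.48,E #3,String with \\, comma in it,0";
        List<String> fields = splitLine(line, ',', '\\');
        System.out.println(fields.get(3)); // prints: String with , comma in it
    }
}
```

This deliberately ignores quoting, so it is only suitable if your file never relies on quoted fields. If it does, a full CSV library is the safer choice.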

Upvotes: 1
