Sunil Kumar B M

Reputation: 2795

Escaping delimiting character in univocity csv parser

I have a scenario where one of the rows in the data has the delimiting character in the content.

5 |0St"|"ring |field[1]


where the delimiting character is | and it is also present in one of the columns as shown above.

My configuration is as below:

quoteChar = "
quoteEscapeChar = \\

But when I try to parse the row, it is splitting the column into two separate columns ("0St" and "ring") and failing.

If I put quotes around the entire column, as shown below, it works fine.

5 |"0St|ring" |field[1]


Is there any setting to specify an escape character for the delimiter?

I'm using univocity 2.5.9.

Any help is appreciated

Upvotes: 2

Views: 904

Answers (1)

Jeronimo Backes

Reputation: 6289

Author of the library here. I believe I already explained the problem in the ticket you opened, but let me try again:

Basically that's NOT how the CSV format works.

If you have a field delimiter in your value (i.e. you have a | between 0St and ring), your entire value MUST be quoted, i.e. you MUST have your value written as "0St|ring" instead of 0St"|"ring.
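If you control the code that produces the file, the fix belongs on the writing side. Here is a minimal sketch in plain Java (no univocity involved) of quoting a value whenever it contains the delimiter. The class and method names are made up for illustration, and it assumes the quote escape is a backslash as in the question; it does not try to handle a value that itself ends in a backslash.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of univocity: wraps a value in quotes
// whenever it contains the delimiter, the quote character, or a line break.
final class PipeFieldQuoting {

    static String quoteIfNeeded(String value, char delimiter, char quote, char quoteEscape) {
        boolean needsQuotes = value.indexOf(delimiter) >= 0
                || value.indexOf(quote) >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        StringBuilder out = new StringBuilder(value.length() + 2);
        out.append(quote);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == quote) {
                out.append(quoteEscape); // escape embedded quotes, e.g. \"
            }
            out.append(c);
        }
        out.append(quote);
        return out.toString();
    }

    // Joins the values into one row, quoting each value only where required.
    static String toRow(char delimiter, char quote, char quoteEscape, String... values) {
        StringJoiner row = new StringJoiner(String.valueOf(delimiter));
        for (String v : values) {
            row.add(quoteIfNeeded(v, delimiter, quote, quoteEscape));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // prints 5|"0St|ring"|field[1]
        System.out.println(toRow('|', '"', '\\', "5", "0St|ring", "field[1]"));
    }
}
```

In practice a CSV writer will normally do this quoting for you; the sketch only shows what the output has to look like for a parser to read the value back as one field.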

Any CSV parser will read 0St"|"ring as 0St" and then treat whatever comes after the | as another value. There is nothing you can do other than write the entire value within quotes.

The ONLY way to get 0St"|"ring to be parsed into a single value (I assume you want to get 0St|ring as a result) is to write your own parsing code to process your data this way.
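If you really cannot change the input, a hand-written splitter along these lines could be a starting point. This is only a rough sketch of one possible approach, not something the library provides: the class name is hypothetical, quotes are allowed to open and close anywhere inside a field, a delimiter between quotes is kept as data, and the quotes themselves are dropped. It assumes a backslash quote escape as in the question, works on a single line, and leaves whitespace untouched.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical non-standard splitter: treats 0St"|"ring as the single value 0St|ring.
final class QuotedDelimiterSplitter {

    static List<String> split(String line, char delimiter, char quote, char quoteEscape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quoteEscape && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                current.append(quote);  // escaped quote becomes a literal quote
                i++;                    // skip the quote that was just consumed
            } else if (c == quote) {
                inQuotes = !inQuotes;   // toggle the quoted state, drop the quote itself
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString()); // unquoted delimiter ends the field
                current.setLength(0);
            } else {
                current.append(c);      // ordinary character, or a quoted delimiter
            }
        }
        fields.add(current.toString()); // last field has no trailing delimiter
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = split("5 |0St\"|\"ring |field[1]", '|', '"', '\\');
        for (String field : fields) {
            System.out.println(field.trim());
        }
    }
}
```

Keep in mind that anything parsed this way is no longer CSV in the usual sense, so other tools reading the same file will still split that value in two.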

Hope this helps.

Upvotes: 1
