jmc

Reputation: 1729

Parsing CSV with a comma in between data without double quotes

We have a working CSV parser for importing files in which text values are enclosed in double quotes. A change to this feature now requires us to remove the double quotes from all data fields.

The problem is that our 'Address' field contains commas, so it is now parsed as several separate data fields. The ways I can think of to deal with this are:

  1. Create an intelligent method that can identify when a comma belongs to a data field
  2. Use the pipe character '|' as the delimiter

Currently, I'd like to go with option number 1.

Is there any library that can do this?

Upvotes: 0

Views: 204

Answers (1)

npinti

Reputation: 52185

As far as I know, enclosing CSV fields in double quotes is standard. The quotes let the parser tell which commas are delimiters and which are part of a field, so your change would make the parser behave in a non-standard way.
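To illustrate the difference, here is a minimal sketch using Python's standard `csv` module (the question does not say which language is in use, so Python is an assumption, and the sample row is made up). With quotes the address stays in one field; without them it is split at every comma.

```python
import csv
import io

# Hypothetical sample rows: same data, with and without quotes around the address.
quoted = 'John Smith,"12 High St, Apt 4, Springfield",555-0101\n'
unquoted = 'John Smith,12 High St, Apt 4, Springfield,555-0101\n'


def parse(text):
    """Parse CSV text with the default dialect (comma delimiter, double-quote quoting)."""
    return list(csv.reader(io.StringIO(text)))


quoted_rows = parse(quoted)      # one row with 3 fields, address intact
unquoted_rows = parse(unquoted)  # one row with 5 fields, address broken apart

print(len(quoted_rows[0]), quoted_rows[0])
print(len(unquoted_rows[0]), unquoted_rows[0])
```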

That being said, how would you know whether to split or keep going? Unless your data has a very rigid pattern, I doubt it is possible to develop a system that reliably guesses where it needs to split.
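As an example of what a "very rigid pattern" could look like, the sketch below assumes that every row has a known number of fields and that exactly one field (the address) may contain commas. The function name `split_with_greedy_field` and the sample data are hypothetical, and Python is assumed only for illustration. Any surplus pieces are merged back into that one field; if two fields can contain commas, this approach can no longer decide where to split.

```python
def split_with_greedy_field(line, n_fields, greedy_index, sep=","):
    """Split an unquoted line, assuming only the field at greedy_index may contain sep.

    n_fields is the expected number of fields per row. Any pieces beyond that
    count are assumed to belong to the greedy field and are joined back together.
    """
    parts = line.rstrip("\r\n").split(sep)
    extra = len(parts) - n_fields
    if extra < 0:
        raise ValueError(f"expected at least {n_fields} fields, got {len(parts)}")
    end = greedy_index + extra + 1
    merged = sep.join(parts[greedy_index:end])
    return parts[:greedy_index] + [merged] + parts[end:]


# Hypothetical layout: name, address, phone (address is the field at index 1).
row = split_with_greedy_field(
    "John Smith,12 High St, Apt 4, Springfield,555-0101",
    n_fields=3,
    greedy_index=1,
)
print(row)
```

This only works while the stated assumptions hold, which is why it should be treated as a fallback rather than a general parser.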

The easier solution would most likely be to simply change the delimiter. Keep in mind that these files are sometimes processed or updated by humans, so you should stick to the most intuitive format.
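A minimal sketch of the delimiter change, again assuming Python's standard `csv` module and made-up sample data: with '|' as the delimiter and quoting disabled, commas inside the address need no special handling. It does assume that '|' never appears inside the data itself.

```python
import csv
import io

# Hypothetical pipe-delimited file contents with an unquoted address.
data = (
    "name|address|phone\n"
    "John Smith|12 High St, Apt 4, Springfield|555-0101\n"
)

# QUOTE_NONE tells the reader to treat quote characters as ordinary text.
reader = csv.reader(io.StringIO(data), delimiter="|", quoting=csv.QUOTE_NONE)
rows = list(reader)

header, first = rows[0], rows[1]
print(dict(zip(header, first)))
```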

Upvotes: 1
