Olly

Reputation: 7758

Overcoming a basic problem with CSV parsing using the FasterCSV gem

I have found a CSV parsing issue with FasterCSV (1.5.0) which seems like a genuine bug, but which I'm hoping there's a workaround for.

Basically, adding a space after the separator (in my case a comma) when the fields are enclosed in quotes generates a MalformedCSVError.

Here's a simple example:

# No quotes on fields -- works fine
FasterCSV.parse_line("one,two,three")
=> ["one", "two", "three"]

# Quotes around fields with no spaces after separators -- works fine
FasterCSV.parse_line("\"one\",\"two\",\"three\"")
=> ["one", "two", "three"]

# Quotes around fields but with a space after the first separator -- fails!
FasterCSV.parse_line("\"one\", \"two\",\"three\"")
=> FasterCSV::MalformedCSVError: Illegal quoting on line 1.

Am I going mad, or is this a bug in FasterCSV?

Upvotes: 15

Views: 8476

Answers (3)

Mike Woodhouse

Reputation: 52316

I had hoped that the :col_sep option might allow a regular expression, but it seems to be used for both reading and writing, which is a shame. The documentation doesn't hold out much hope and your need is probably more immediate than could be satisfied by requesting a change or submitting a patch ;-)

If you're calling #parse_line explicitly, then you could always call

gsub(/,\s*/, ',')

on your input line. That regular expression might need to change significantly if you anticipate the possibility of comma-space within quoted strings. (I'd suggest reposting such a question here with a suitable tag and let the RegEx mavens loose on it should that be the case).
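For the case where comma-space can appear inside quoted strings, one option is to skip the regular expression and walk the line while tracking whether the current character is inside quotes. The sketch below is only an illustration: strip_space_after_separators is a hypothetical helper, not part of FasterCSV, and it assumes a single-character separator and quote character. Like the gsub above, it also drops leading spaces from unquoted fields.

```ruby
# Hypothetical helper (not part of FasterCSV): remove spaces and tabs that
# directly follow a separator, but only while outside a quoted field.
def strip_space_after_separators(line, sep = ',', quote = '"')
  out = String.new
  in_quotes = false
  after_sep = false

  line.each_char do |ch|
    if ch == quote
      # An escaped quote ("") toggles twice, so the state stays correct.
      in_quotes = !in_quotes
      after_sep = false
      out << ch
    elsif !in_quotes && ch == sep
      after_sep = true
      out << ch
    elsif !in_quotes && after_sep && (ch == ' ' || ch == "\t")
      next # whitespace between a separator and the next field: skip it
    else
      after_sep = false
      out << ch
    end
  end

  out
end

cleaned = strip_space_after_separators('"one", "two","three"')
quoted  = strip_space_after_separators('"a, b", "c"')
```

The cleaned string can then be passed to parse_line as usual; the comma-space inside "a, b" is left untouched because it sits inside quotes.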

Upvotes: 2

Ben James

Reputation: 125119

The MalformedCSVError is correct here.

Leading and trailing spaces in CSV are not ignored; they are considered part of the field. So here the second field starts with a space and then contains unescaped double quotes, which is what causes the illegal quoting error.

Maybe this library is just more strict than others you have used.
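A small sketch of that behaviour, assuming Ruby's standard csv library (FasterCSV became the standard CSV class in Ruby 1.9) rather than FasterCSV 1.5.0 itself, so the exact error message may differ:

```ruby
require 'csv' # assumption: stdlib CSV standing in for FasterCSV

# Without quotes, the space after the comma is kept as part of the field.
fields = CSV.parse_line('one, two,three')

# With quotes, the second field begins with a space, so the quote that
# follows is an unescaped quote inside an unquoted field.
outcome =
  begin
    CSV.parse_line('"one", "two","three"')
    :parsed
  rescue CSV::MalformedCSVError
    :malformed
  end
```

Here fields keeps the leading space in its second element, and the quoted line is rejected for the reason given above.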

Upvotes: 14

Robert Massa

Reputation: 4598

Maybe you could set the :col_sep option to ', ' to make it parse files like that.
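A minimal sketch of that idea, assuming Ruby's standard csv library (the successor to FasterCSV); with FasterCSV 1.5.0 the option would be written as :col_sep => ', '. It only helps when every separator in the file is a comma followed by exactly one space:

```ruby
require 'csv' # assumption: stdlib CSV standing in for FasterCSV

# Every separator in this line is comma-plus-space, matching col_sep exactly.
row = CSV.parse_line('"one", "two", "three"', col_sep: ', ')
```

The line in the question mixes ', ' and ',' as separators, so a fixed two-character separator would probably not be enough for it on its own.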

Upvotes: 2
