Reputation: 7758
I have found a CSV parsing issue with FasterCSV (1.5.0) which seems like a genuine bug, but which I'm hoping there's a workaround for.
Basically, adding a space after the separator (in my case a comma) when the fields are enclosed in quotes raises a MalformedCSVError.
Here's a simple example:
# No quotes on fields -- works fine
FasterCSV.parse_line("one,two,three")
=> ["one", "two", "three"]
# Quotes around fields with no spaces after separators -- works fine
FasterCSV.parse_line("\"one\",\"two\",\"three\"")
=> ["one", "two", "three"]
# Quotes around fields but with a space after the first separator -- fails!
FasterCSV.parse_line("\"one\", \"two\",\"three\"")
=> FasterCSV::MalformedCSVError: Illegal quoting on line 1.
Am I going mad, or is this a bug in FasterCSV?
Upvotes: 15
Views: 8476
Reputation: 52316
I had hoped that the :col_sep option might allow a regular expression, but it seems to be used for both reading and writing, which is a shame. The documentation doesn't hold out much hope, and your need is probably more immediate than could be satisfied by requesting a change or submitting a patch ;-)
If you're calling #parse_line explicitly, then you could always call gsub(/,\s*/, ',') on your input line first. That regular expression would need to change significantly if you anticipate comma-space sequences inside quoted strings, since it strips those too. (If that's the case, I'd suggest reposting the question here with a suitable tag and letting the RegEx mavens loose on it.)
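Here is a minimal sketch of that workaround. It assumes Ruby's standard csv library (the successor to FasterCSV, with the same parse_line call) rather than the FasterCSV gem itself, and the helper name normalize_separators is made up for illustration. The second call shows the caveat: whitespace after a comma inside a quoted field is stripped as well.

```ruby
require 'csv'

# Hypothetical helper: drop any whitespace that follows a comma.
# Caveat: this also rewrites ", " inside quoted fields.
def normalize_separators(line)
  line.gsub(/,\s*/, ',')
end

line = '"one", "two","three"'
fields = CSV.parse_line(normalize_separators(line))
# fields is ["one", "two", "three"]

# The caveat in action: the space inside the quoted field is lost.
lossy = CSV.parse_line(normalize_separators('"Smith, John","x"'))
# lossy is ["Smith,John", "x"]
```

If your data can contain commas inside quoted fields, this is probably not safe to use as-is.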
Upvotes: 2
Reputation: 125119
The MalformedCSVError is correct here.
Leading and trailing spaces in CSV are not ignored; they are considered part of the field. So here you have started a field with a space and then included unescaped double quotes in that field, which causes the illegal quoting error.
Maybe this library is just more strict than others you have used.
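To illustrate the point about spaces being part of the field, here is a small sketch using Ruby's standard csv library (an assumption; it is the successor to FasterCSV, and its error class is CSV::MalformedCSVError rather than FasterCSV::MalformedCSVError). With unquoted fields the leading space simply ends up in the data; with a quote after the space, the parser sees a quote in the middle of an unquoted field and raises.

```ruby
require 'csv'

# Unquoted fields: the space after the comma is kept as field data.
unquoted = CSV.parse_line('one, two,three')
# unquoted is ["one", " two", "three"]

# Quoted field preceded by a space: the field starts with " " and then
# contains a bare double quote, so the default (strict) parser rejects it.
error = begin
  CSV.parse_line('"one", "two","three"')
  nil
rescue CSV::MalformedCSVError => e
  e
end
# error holds the exception that was raised
```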
Upvotes: 14
Reputation: 4598
Maybe you could set the :col_sep option to ', ' to make it parse files like that. Note that this only helps if every separator in the file is a comma followed by a space.
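A quick sketch of this idea, assuming Ruby's standard csv library (the successor to FasterCSV, which accepts the same :col_sep option). It works when the spacing is consistent; the second call shows that a bare comma is no longer treated as a separator at all.

```ruby
require 'csv'

# Consistent ", " separators: parses cleanly.
consistent = CSV.parse_line('"one", "two", "three"', col_sep: ', ')
# consistent is ["one", "two", "three"]

# Mixed spacing (unquoted for simplicity): the bare comma stays in the field.
mixed = CSV.parse_line('one, two,three', col_sep: ', ')
# mixed is ["one", "two,three"]
```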
Upvotes: 2