Reputation: 4539
I am finding the CSV parsing in Ruby 1.9.3 to be remarkably fragile, so much so that I am wondering if I am doing something wrong.
If I do the following in irb I get an error:
1.9.3-p125 :011 > require 'csv'
=> true
1.9.3-p125 :012 > a = 'one,two,three, "four, five",six'
=> "one,two,three, \"four, five\",six"
1.9.3-p125 :013 > arr = CSV.parse(a)
CSV::MalformedCSVError: Illegal quoting in line 1.
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `to_a'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `read'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1379:in `parse'
from (irb):13
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'
I've found that the problem is the extra space preceding the "four, five" value. If I remove the space, then it works.
1.9.3-p125 :010 > a = 'one,two,three,"four, five",six'
=> "one,two,three,\"four, five\",six"
1.9.3-p125 :011 > arr = CSV.parse(a)
=> [["one", "two", "three", "four, five", "six"]]
Spaces in front of the other values do not cause a problem. The following parses just fine:
one, two, three,"four, five", six
Is there some parse option I am missing that would make parsing quoted values less fragile?
Upvotes: 2
Views: 2736
Reputation: 35453
This is correct behavior. It's not being fragile.
The comma after "three" ends that field, and the next field starts immediately with the space, so the opening quote lands in the middle of that field.
You can't validly put a quote in the middle of a field (without escaping it).
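To make that concrete, here is a minimal sketch. It first shows that a leading space is kept as part of an unquoted field, then that a quote appearing after that space is rejected, and finally one possible workaround: normalizing the input before parsing. The helper name `parse_lenient_line` is hypothetical and not part of the CSV library, and the regex assumes that a comma followed by whitespace and a quote never occurs inside a quoted value in your data.

```ruby
require 'csv'

# Input from the question: note the space between the comma and the opening quote.
line = 'one,two,three, "four, five",six'

# Leading spaces are kept as part of an unquoted field.
p CSV.parse_line('one, two, three')
# => ["one", " two", " three"]

# A quote that is not the first character of its field is rejected.
begin
  CSV.parse_line(line)
rescue CSV::MalformedCSVError => e
  puts "rejected: #{e.message}"
end

# Hypothetical helper (not part of the CSV library): drop whitespace that sits
# between a comma and an opening quote, then parse as usual.
# Caveat: this would also rewrite the same pattern inside a quoted field.
def parse_lenient_line(text)
  CSV.parse_line(text.gsub(/,[ \t]+"/, ',"'))
end

p parse_lenient_line(line)
# => ["one", "two", "three", "four, five", "six"]
```

If you control whatever produces the data, it is probably cleaner to stop it from emitting the space before the quote than to patch the input like this.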
Upvotes: 3