kevzettler

Reputation: 5213

How do I parse JSON quotes in CSV?

I'm trying to parse CSV in which one field contains JSON, and the JSON contains double quotes:

csv = CSV.parse('example,json=[{"json": "obj"}],endexample')
CSV::MalformedCSVError: Illegal quoting in line 1.
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1887:in `each'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1849:in `loop'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1849:in `shift'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1791:in `each'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1805:in `to_a'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1805:in `read'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1379:in `parse'
  from (irb):13
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/bin/irb:12:in `<main>'

I've read that in CSV a quote inside a quoted field is escaped by doubling it, so I tried something like .gsub('"','""'), but that doesn't help:

csv = CSV.parse('example,json=[{""json"": ""obj""}],endexample')
CSV::MalformedCSVError: Illegal quoting in line 1.
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1887:in `each'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1849:in `loop'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1849:in `shift'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1791:in `each'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1805:in `to_a'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1805:in `read'
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1379:in `parse'
  from (irb):17
  from /Users/kevzettler/.rbenv/versions/1.9.3-p194/bin/irb:12:in `<main>' 

Upvotes: 0

Views: 1461

Answers (1)

Ermin Dedovic

Reputation: 907

From Wikipedia about CSV:

  • Fields containing a line-break, double-quote, and/or commas should be quoted. (If they are not, the file will likely be impossible to process correctly).
  • A (double) quote character in a field must be represented by two (double) quote characters.

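Doubling the quotes is not enough on its own; the field that contains them must also be wrapped in double quotes. Try this: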

csv = CSV.parse('example,"json=[{""json"": ""obj""}]",endexample')
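
For illustration, here is a minimal sketch of both directions, assuming Ruby's standard csv and json libraries. Stripping the "json=" prefix is only an assumption based on the sample line, and the variable names are made up for the example.

```ruby
require 'csv'
require 'json'

# The JSON field is wrapped in double quotes and every inner quote is doubled.
line = 'example,"json=[{""json"": ""obj""}]",endexample'

# parse_line returns one row as an array; the doubled quotes are unescaped.
row = CSV.parse_line(line)

# Assumption: the field always starts with "json=", so drop that prefix
# before handing the rest to the JSON parser.
payload = JSON.parse(row[1].sub(/\Ajson=/, ''))

# If you also control the writing side, let CSV do the quoting for you
# instead of calling gsub by hand.
generated = CSV.generate_line(['example', 'json=[{"json": "obj"}]', 'endexample'])

puts row.inspect
puts payload.inspect
puts generated
```

If the input comes from a source you can't change, a blanket gsub won't be enough, because the surrounding quotes still have to be added around the right field.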

Upvotes: 4
