David

Reputation: 127

How to read a CSV that contains double quotes (")

I have a CSV file that looks like this:

"url","id","role","url","deadline","availability","location","my_type","keywords","source","external_id","area","area (1)"
"https://myurl.com","123456","This a string","https://myurl.com?source=5&param=1","31-01-2020","1","Location´s Place","another_string, my_string","key1, key2, key3","anotherString","145129","Place in Earth",""

It has 13 columns.

The issue is that every field comes back wrapped in \" characters, which I don't want. The read also returns 16 columns instead of 13.

This is what I have done:

csv = CSV.new(File.open('myfile.csv'), quote_char:"\x00", force_quotes:false)
csv.read[1]

Output:

["\"https://myurl.com\"", "\"123456\"", "\"This a string\"", "\"https://myurl.com?source=5&param=1\"", "\"31-01-2020\"", "\"1\"", "\"Location´s Place\"", "\"another_string", " my_string\"", "\"key1", " key2", " key3\"", "\"anotherString\"", "\"145129\"", "\"Place in Earth\"", "\"\""]

Upvotes: 2

Views: 373

Answers (1)

Jörg W Mittag

Reputation: 369478

The file you showed is a standard CSV file. There is nothing special needed. Just delete all those unnecessary arguments:

csv = CSV.new(File.open('myfile.csv'))
csv.read[1]
#=> [
#      "https://myurl.com", 
#      "123456", 
#      "This a string", 
#      "https://myurl.com?source=5&param=1", 
#      "31-01-2020", 
#      "1", 
#      "Location´s Place", 
#      "another_string, my_string", 
#      "key1, key2, key3", 
#      "anotherString", 
#      "145129", 
#      "Place in Earth", 
#      ""
#   ]
  • force_quotes has no effect in your code: it controls whether the CSV library quotes every field when writing CSV. You are reading, not writing, so the option is ignored.
  • quote_char: "\x00" is wrong, since the quote character in the file you posted is " and not NUL. With NUL as the quote character, every " is treated as ordinary data and the commas inside quoted fields split them, which is why you get 16 columns instead of 13.
  • quote_char: '"' would be correct, but is not necessary, since it is the default.
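
To make the difference concrete, here is a minimal sketch that parses a shortened, made-up version of the data from the question, once with the defaults and once with quote_char: "\x00". The inline string and the variable names (data, row, naive) are assumptions for illustration only; they stand in for the real myfile.csv.

```ruby
require "csv"

# Shortened stand-in for the file in the question (hypothetical data).
data = <<~CSV
  "url","my_type","keywords","area (1)"
  "https://myurl.com","another_string, my_string","key1, key2, key3",""
CSV

# Defaults: " is the quote character, so the quotes are stripped and
# commas inside a quoted field stay part of that field.
row = CSV.parse(data)[1]
p row.size  # 4 fields, one per column
p row

# NUL as quote character: " becomes ordinary data, so every comma
# separates fields, including the commas inside the quoted values.
naive = CSV.parse(data, quote_char: "\x00")[1]
p naive.size  # 7 fields, each still carrying its literal quotes
```

For a file on disk, CSV.read('myfile.csv')[1] should behave the same way as the default parse above, since it uses the same default options.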

Upvotes: 4
