Reputation: 37
I'm trying to parse CSV files in Rails. This works great except for files saved in Excel (tested with version 16.26) on both Windows and Mac; CSVs saved in Numbers and Google Sheets work fine. Any character with an accent produces "Encoding::UndefinedConversionError: "\xEF" from ASCII-8BIT to UTF-8".
Excel claims it saves in UTF-8.
I want accented characters to not throw errors when I upload CSVs saved in Excel.
Things I've tried:
Setting the read encoding to bom|utf-8 (as per the BOM link), utf-8, iso-8859-1, utf-16, windows-1252, and ascii-8bit (and cycling through each of these in an array, dropping an encoding out of the array when it fails).
My current code uses ISO8859-1:UTF-8, which is supposed to read the data as ISO8859-1 and then transcode it to UTF-8.
Creating a tempfile, switching it to binmode, and calling CSV.parse(temp.path, encoding: "bom|utf-8"), per the first answer in this thread.
data = CSV.parse(csv, headers: true, header_converters: :symbol, skip_blanks: true, encoding: 'ISO8859-1:UTF-8')
It also works if I take a CSV saved in Excel, re-save it in Google Sheets or Numbers, and then upload it. Unfortunately, Excel is the most common source of the CSVs our users upload.
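For reference, "\xEF" is the first byte of the UTF-8 byte order mark (EF BB BF), and "from ASCII-8BIT to UTF-8" suggests the upload is a binary-tagged string being transcoded rather than relabelled. Below is a minimal sketch of one stdlib-only way to handle that, not a confirmed fix for this case. The helper name normalize_csv_bytes and the sample data are hypothetical, and the Windows-1252 fallback is an assumption about what non-UTF-8 Excel exports contain.

```ruby
require "csv"

# Hypothetical helper: takes the raw uploaded bytes and returns a UTF-8 string.
def normalize_csv_bytes(raw)
  bom = "\xEF\xBB\xBF".b # UTF-8 byte order mark, as raw bytes

  # Work on a binary-tagged copy so no transcoding happens implicitly.
  bytes = raw.dup.force_encoding(Encoding::BINARY)
  bytes = bytes.byteslice(bom.bytesize..-1) if bytes.start_with?(bom)

  # Relabel (not transcode) the bytes as UTF-8 and keep them if they are valid.
  utf8 = bytes.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Assumption: anything that is not valid UTF-8 is Windows-1252.
  bytes.encode("UTF-8", "Windows-1252", invalid: :replace, undef: :replace)
end

# Sample bytes standing in for an upload: a BOM followed by UTF-8 text.
sample = "\xEF\xBB\xBF".b + "name,city\nZoë,Málaga\n".b

csv  = normalize_csv_bytes(sample)
data = CSV.parse(csv, headers: true, header_converters: :symbol, skip_blanks: true)
```

In a controller, the raw bytes would presumably come from something like the uploaded file's read method; that part is not shown because the upload code isn't in the question.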
Upvotes: 2
Views: 2441
Reputation: 37
Solved by using the csvreader gem. The built-in CSV parser sucks in Rails.
Upvotes: 0