Reputation: 682
Update: The original CSV was created in Excel; when I copied the data into a Google Spreadsheet and downloaded a CSV from Drive, it worked fine. I'm guessing there's an encoding issue w/ the Excel CSV? Is there any way to work around this w/ Excel, or do we need to tell our clients to use Google Docs?
I've got a CSV w/ non-ASCII characters (my example is in French, but we support entirely non-Roman languages such as Arabic and Thai as well) that I'm reading via ColdFusion's cffile. The problem is that the output from the read has every accented character converted into the replacement character (�). There was originally no charset specified on the cffile, so I tried adding utf-8 (no change) and utf-16 (everything is converted to something that looks sort of like Chinese).
Anyone know how I can get this data out of the CSV without losing/messing up the characters?
CSV Example:
Smith,Joan,[email protected],Hôpital Jésus
Original cffile:
<cffile action="read" file="#expandedFilePath#" variable="strCSV">
cffile w/ charset added:
<cffile action="read" file="#expandedFilePath#" variable="strCSV" charset="utf-8">
cfdump of strCSV (no charset/utf-8 charset):
Smith,Joan,[email protected],H�pital J�sus
cfdump of strCSV (utf-16 charset):
卭楴栬䩯慮ⱪ潡渮獭楴桀瑥獴潭ⱈ楴慬⁊畳ഊ
Upvotes: 4
Views: 954
Reputation: 2445
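Excel, like most Windows programs, saves plain CSV in the system's ANSI code page, which on a Western-locale Windows install is CP-1252 (not UTF-8, and, importantly, also not ISO-8859-1, which is what most encoding guessers report). On other locales the code page differs (for example windows-1256 for Arabic or windows-874 for Thai). Did you already try: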
<cffile action="read" file="#expandedFilePath#"
variable="strCSV"
charset="windows-1252" />
If this works, can you rely on your inputs to always be default Windows files?
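To see why the symptoms in the question fit a CP-1252 file, here is a small sketch in Python (used only for illustration; it is not part of the ColdFusion code). It assumes the sample field was saved by Excel as CP-1252, and that a BOM-less "utf-16" read is treated as big-endian, which is what the dump in the question suggests.

```python
# Assumption: these are the bytes Excel wrote for the sample field
# on a Western-locale Windows install (CP-1252).
raw = "Hôpital Jésus".encode("cp1252")  # b'H\xf4pital J\xe9sus'

# Read as UTF-8: 0xF4 and 0xE9 are lead bytes with no valid continuation,
# so each one is replaced by U+FFFD, the "weird ? symbol".
as_utf8 = raw.decode("utf-8", errors="replace")

# Read as UTF-16 (big-endian): each pair of single-byte characters is
# fused into one 16-bit code unit, which lands in the CJK ranges.
as_utf16 = raw.decode("utf-16-be", errors="replace")

# Read with the code page the file was actually written in.
as_cp1252 = raw.decode("cp1252")

# CP-1252 is not ISO-8859-1: byte 0x80 is the euro sign in CP-1252
# but an invisible control character in ISO-8859-1.
euro_byte = "€".encode("cp1252")
as_latin1 = euro_byte.decode("latin-1")

print(as_utf8)
print(as_utf16)
print(as_cp1252)
print(repr(as_latin1))
```

Only the last decode of the field round-trips the text, which is why charset="windows-1252" is worth trying before anything else.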
Upvotes: 1