Reputation: 3322
I'm working with a CSV file exported from FileMaker Pro on Mac OS. The CSV seems to be formatted properly. It imports into PHP and into my MySQL database without errors, and it also opens on my Ubuntu machine in LibreOffice Calc. But in all cases I end up with strange characters. The file is supposed to be UTF-8, but I'm not sure. Can anyone help explain what kind of transformation is occurring?
Examples:
... Herald Print., [1880’s?]. First and only edition ...
....excellent relic of this manufacturer’s involvement with....
Edit:
Looking at part of the above:
[1880’s?]
manufacturer’s
lost.
od -ctx1 part.txt
0000000 [ 1 8 8 0 342 200 231 s ? ] \r m a n u
5b 31 38 38 30 e2 80 99 73 3f 5d 0d 6d 61 6e 75
0000020 f a c t u r e r 342 200 231 s \r l o s
66 61 63 74 75 72 65 72 e2 80 99 73 0d 6c 6f 73
0000040 t . \v \r \r
74 2e 20 0b 0d 0d
0000046
Upvotes: 1
Views: 793
Reputation: 57418
The encoding is indeed UTF-8, and your quotation mark is right there:
http://www.tachyonsoft.com/uc0020.htm
The transformation you see appears to be UTF-8 being interpreted as ISO-8859-1, ISO-8859-15, or Latin1 after it is read. Check that MySQL is using UTF-8 as its charset and that the extraction stage keeps the data in UTF-8 (e.g., if you sent the data to a web page advertising ISO-8859-15, you'd see exactly that - a Euro sign followed by garbage).
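To make the transformation concrete, here is a minimal Python sketch that takes the bytes for `[1880’s?]` straight from the `od` dump in the question and decodes them two ways. It assumes the consumer is really using Windows-1252 (`cp1252`), which is what many tools mean by "Latin1": strict ISO-8859-1 maps 0x80 and 0x99 to invisible control characters, so the visible `€` and `™` point to Windows-1252. The variable names are only for illustration.

```python
# Bytes copied from the od dump in the question: "[1880's?]" with a
# curly apostrophe (U+2019) stored as the three bytes e2 80 99.
raw = bytes.fromhex("5b 31 38 38 30 e2 80 99 73 3f 5d")

# Decoding as UTF-8 gives the intended text.
correct = raw.decode("utf-8")

# Decoding the same bytes as Windows-1252 (assumed here to be what the
# consumer does) turns e2 80 99 into three separate characters.
misread = raw.decode("cp1252")

print(correct)
print(misread)

# If the damage is only one layer deep, it can be reversed by undoing
# the wrong decode and redoing the right one.
repaired = misread.encode("cp1252").decode("utf-8")
assert repaired == correct
```

The round trip at the end only works if no step in between replaced or dropped bytes, so fixing the declared charset at the source (the MySQL connection, the web page, or the import dialog) is usually the safer route than repairing text afterwards.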
Upvotes: 0