Reputation: 1269
I have a .csv file which should be in UTF-8 encoding. I exported it from SQL Server Management Studio. However, when I import it into R, it fails on the lines containing ÿ. I use read.csv2 and specify the file encoding "UTF-8-BOM".
Notepad++ displays the ÿ correctly and reports the file as UTF-8. Is this a bug in R's encoding handling, or is ÿ in fact not part of the UTF-8 encoding scheme?
I have uploaded a small tab-delimited .txt file that fails here:
https://www.dropbox.com/s/i2d5yj8sv299bsu/TestData.txt
Thanks
Upvotes: 1
Views: 1081
Reputation: 25837
That is probably part of the BOM (byte order mark) at the beginning of the file. If the editor or parser doesn't recognize the BOM, it treats those bytes as garbage. See https://www.ultraedit.com/support/tutorials-power-tips/ultraedit/unicode.html for more details.
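One way to check whether a BOM is involved is to look at the first bytes of the file directly. The sketch below is only an illustration: detect_bom is a hypothetical helper name, not part of base R, and it only recognizes the UTF-8 and UTF-16 marks. For context, the UTF-8 BOM is the bytes EF BB BF, while the UTF-16 little-endian BOM is FF FE, which shows up as ÿþ when the file is misread in a single-byte encoding such as Latin-1.

```r
# Hypothetical helper (not part of base R): report which byte order mark,
# if any, appears at the very start of a file.
detect_bom <- function(path) {
  # Read at most the first three bytes as raw values.
  head_bytes <- readBin(path, what = "raw", n = 3L)

  # TRUE when the file starts with the given byte sequence.
  starts_with <- function(prefix) {
    length(head_bytes) >= length(prefix) &&
      all(head_bytes[seq_along(prefix)] == prefix)
  }

  if (starts_with(as.raw(c(0xEF, 0xBB, 0xBF)))) {
    "UTF-8-BOM"
  } else if (starts_with(as.raw(c(0xFF, 0xFE)))) {
    "UTF-16LE"
  } else if (starts_with(as.raw(c(0xFE, 0xFF)))) {
    "UTF-16BE"
  } else {
    NA_character_  # no recognized BOM
  }
}

# Demonstration on a throwaway file: FF FE followed by "A" in UTF-16LE.
demo_file <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xFF, 0xFE, 0x41, 0x00)), demo_file)
print(detect_bom(demo_file))
```

If the helper reports "UTF-16LE" for the real file, the file is not UTF-8 at all, and passing that label as the fileEncoding argument of read.csv2 or read.delim may be worth trying instead of "UTF-8-BOM". If it returns NA, the ÿ is probably not coming from a BOM and the cause lies elsewhere.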
Upvotes: 0