Reputation: 382
I have a text file that claims to be UTF-8 encoded. That is, when I call file -I $file it prints $file: text/plain; charset=utf-8. But when I open it with UTF-8 encoding, some characters seem corrupted. The file is supposed to be German, but the special German characters like ö are displayed as Ã¶.
I guessed that the claim to be UTF-8 is wrong and ran the enca script to guess the real encoding. Sadly, enca tells me that the language de (German) is not supported.
Is there another way to fix the file?
Upvotes: 5
Views: 7440
Reputation: 8905
To get a file to read properly in a given encoding, you need three things:
Note that (2) is not strictly necessary, but if the file encoding is detected incorrectly, you will need to manually re-read the file in the correct encoding, for example with :e ++enc=utf-8 for a UTF-8 file that was not detected as such.
See http://vim.wikia.com/wiki/Working_with_Unicode for getting all three of these concepts correct.
Upvotes: 3
Reputation: 201568
The UTF-8 encoded form of “ö” U+00F6 is 0xC3 0xB6, and if these bytes are interpreted as ISO-8859-1 they are “Ã¶” (U+00C3 U+00B6). So either the file is actually being read and interpreted as ISO-8859-1, even though you expect otherwise, or there has been a double encoding: previously, the file or part of it was read as if it were ISO-8859-1 (even though it was UTF-8), and the misinterpreted data was then written out as UTF-8.
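The two cases can be reproduced in a few lines of Python. This is only a sketch to illustrate the byte values above: the helper name undo_double_encoding is made up for this example, and it assumes the misreading really was ISO-8859-1 (in practice it is often Windows-1252, which differs for some characters) and that the whole input was double encoded.

```python
# "ö" (U+00F6) encoded as UTF-8 is the two bytes 0xC3 0xB6.
utf8_bytes = "ö".encode("utf-8")

# Case 1: the bytes are correct UTF-8 but are decoded as ISO-8859-1.
# Each byte becomes one character: U+00C3 and U+00B6.
misread = utf8_bytes.decode("iso-8859-1")

# Case 2: double encoding. The misread text is written out as UTF-8 again,
# so the file now really contains four bytes for what should be "ö".
double_encoded = misread.encode("utf-8")


def undo_double_encoding(data: bytes) -> str:
    """Reverse one round of 'UTF-8 read as ISO-8859-1, saved as UTF-8'.

    Hypothetical helper for illustration. It raises UnicodeEncodeError or
    UnicodeDecodeError if the input does not fit that assumption.
    """
    text = data.decode("utf-8")            # gives the mojibake text
    original = text.encode("iso-8859-1")   # back to the original UTF-8 bytes
    return original.decode("utf-8")        # the intended text


print(utf8_bytes.hex(), double_encoded.hex())
print(undo_double_encoding(double_encoded))
```

If a hex dump of the file shows c3 b6 where an ö should be, the file is fine and the viewer is decoding it wrongly (case 1). If it shows c3 83 c2 b6, the file itself is double encoded (case 2) and needs a repair along these lines.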
Upvotes: 4
Reputation: 5631
You can also check the encoding with :set encoding, and set it accordingly with :set encoding=utf-8. If you still see incorrect characters, that means they were not written to the file as UTF-8 and you'll need to convert them.
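Outside of Vim, such a conversion could look roughly like the following Python sketch. The function name transcode_to_utf8 and the default source encoding are assumptions for illustration; you have to know or guess the real source encoding, since ISO-8859-1 will decode any byte sequence without complaint, even when it is the wrong choice.

```python
def transcode_to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes from an assumed legacy encoding into UTF-8.

    Hypothetical helper: source_encoding is a guess the caller must supply.
    """
    return data.decode(source_encoding).encode("utf-8")


# "schön" stored as ISO-8859-1 uses the single byte 0xF6 for "ö".
legacy = b"sch\xf6n"
converted = transcode_to_utf8(legacy)
print(converted.hex())
```

To apply it to a file, read the file in binary mode, pass the bytes through the function, and write the result to a new file rather than overwriting the original.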
EDIT: If you could share your file, it would help.
Upvotes: 2