Reputation: 1383
If I had a file encoded in an ISO encoding (such as ISO-8859-1) but read it as UTF-8 using Java, would I still get the same text?
Would special characters such as µÃÿ display the same?
Upvotes: 0
Views: 486
Reputation: 1062
In short, no. Outside the ASCII range, the bytes that represent a character in ISO-8859-1 are not the same as the bytes that represent it in UTF-8.
You can always convert a file from ISO-8859-1 to UTF-8, but the reverse is not always possible, because UTF-8 can encode every Unicode character while ISO-8859-1 covers only 256 of them.
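To illustrate the two directions, here is a minimal sketch, assuming "ISO" means ISO-8859-1. The class and method names (`Latin1Utf8Convert`, `latin1ToUtf8`, `fitsInLatin1`) are made up for this example.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class Latin1Utf8Convert {
    // Re-encode ISO-8859-1 bytes as UTF-8; this direction never loses data.
    static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // True only if every character of the text exists in ISO-8859-1.
    static boolean fitsInLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // The bytes of "µÃÿ" as stored in an ISO-8859-1 file.
        byte[] latin1 = {(byte) 0xB5, (byte) 0xC3, (byte) 0xFF};
        System.out.println(latin1ToUtf8(latin1).length);        // 6
        System.out.println(fitsInLatin1("\u00B5\u00C3\u00FF")); // true
        System.out.println(fitsInLatin1("\u20AC"));             // false (euro sign)
    }
}
```

Checking `fitsInLatin1` before converting back avoids silently replacing unsupported characters with `?`.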
My recommendation would be to detect the encoding (see: Java : How to determine the correct charset encoding of a stream) and then to handle each case accordingly.
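One simple heuristic, not necessarily what the linked question recommends, is to try a strict UTF-8 decode and fall back to ISO-8859-1 if it fails. The sketch below assumes the whole file fits in memory; `Utf8Check` and `isValidUtf8` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {
    // Returns true if the bytes form well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Fail on bad input instead of substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

This is only a guess at the encoding: pure ASCII input is valid in both encodings, and some ISO-8859-1 byte sequences happen to be well-formed UTF-8 too.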
Upvotes: 0
Reputation: 179392
No, you would not. UTF-8 does not encode characters beyond U+007F in the same way as ISO-8859-1 (ISO-8859-1 encodes U+0080 through U+00FF as single bytes \x80 to \xff, while UTF-8 uses two bytes for each of those characters).
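The difference can be seen with the characters from the question. This is a small sketch using Unicode escapes so the result does not depend on the source file's own encoding; `EncodingDemo` and `misread` are names invented here.

```java
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    // Encode text as ISO-8859-1, then (wrongly) decode those bytes as UTF-8.
    static String misread(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "\u00B5\u00C3\u00FF"; // µÃÿ
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length); // 3
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);      // 6
        // The lone bytes 0xB5, 0xC3, 0xFF are not valid UTF-8, so the
        // decoder substitutes replacement characters and the text is lost.
        System.out.println(misread(s).equals(s)); // false
    }
}
```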
You have to use an explicit encoding specification when opening the file: new InputStreamReader(new FileInputStream(...), <encoding>)
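A fuller version of that one-liner might look like the following sketch, assuming the file is ISO-8859-1. `ExplicitCharsetRead` and `readAll` are hypothetical names, and the temporary file exists only to make the example self-contained.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExplicitCharsetRead {
    // Read the whole file, decoding its bytes with the given charset.
    static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file.toFile()), charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Create a sample ISO-8859-1 file containing "µÃÿ".
        Path file = Files.createTempFile("latin1-sample", ".txt");
        Files.write(file, new byte[]{(byte) 0xB5, (byte) 0xC3, (byte) 0xFF});

        String text = readAll(file, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals("\u00B5\u00C3\u00FF")); // true
        Files.delete(file);
    }
}
```

Passing the charset explicitly also avoids depending on the platform default, which varies between systems and Java versions.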
Upvotes: 1