Reputation: 6815
I am using Java to read a text file that contains special characters such as the yen sign (¥). I have not specified any encoding/charset when reading the file, and it works fine on Windows. But when I deploy the same code on a Unix machine, ¥ is replaced by '?'. I am now going to specify the charset windows-1252 to avoid the issue. Will windows-1252 work on Unix/Linux boxes? My Unix box's charset is set to 'utf-8'. I am using the code below:
LineIterator iterator = FileUtils.lineIterator(*filename*, "Windows-1252");
Upvotes: 1
Views: 1411
Reputation: 11
If I am understanding your problem correctly, I usually solve this by saving the text file in UTF-8 encoding with a text editor, and then specifying UTF-8 again when opening that file from the Java program.
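A minimal sketch of that approach, assuming the file really has been saved as UTF-8. The class name Utf8RoundTrip, the helper readLines and the temporary file are made up for illustration; only the JDK calls are real.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class Utf8RoundTrip {

    // Hypothetical helper: reads all lines with an explicit charset,
    // so the result does not depend on the platform default encoding.
    static List<String> readLines(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Write a sample file as UTF-8 (stands in for "save as UTF-8" in an editor).
        Path file = Files.createTempFile("yen-sample", ".txt");
        Files.write(file, "Price: \u00A5100\n".getBytes(StandardCharsets.UTF_8));

        // Read it back with the same charset; the yen sign is preserved.
        for (String line : readLines(file)) {
            System.out.println(line);
        }
        Files.delete(file);
    }
}
```

Because the charset is named explicitly on both the write and the read, the same code should behave identically on Windows and Unix.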
Upvotes: 1
Reputation: 328556
The class StandardCharsets gives you a list of encodings / charsets that are "guaranteed to be available on every implementation of the Java platform."
This list doesn't contain the Windows encodings, but for most common Java versions on Windows, Mac and Linux, Cp1252 (windows-1252) is available.
Note that you'll get an UnsupportedCharsetException or UnsupportedEncodingException when it's not available, so the code above is safe (in the sense that it won't produce garbage).
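If you would rather check up front than rely on the exception, something like the following sketch could be run on the target machine. The class name CharsetCheck and the helper decodeYen are hypothetical; Charset.isSupported and Charset.forName are standard JDK methods. It relies on the fact that byte 0xA5 is the yen sign in windows-1252.

```java
import java.nio.charset.Charset;

public class CharsetCheck {

    // Hypothetical helper: decodes the single byte 0xA5 with the named charset.
    // Charset.forName throws UnsupportedCharsetException if the JVM lacks it.
    static String decodeYen(String charsetName) {
        Charset charset = Charset.forName(charsetName);
        return new String(new byte[] {(byte) 0xA5}, charset);
    }

    public static void main(String[] args) {
        String name = "windows-1252";
        if (Charset.isSupported(name)) {
            // On a JVM that ships this charset, 0xA5 should decode to U+00A5.
            System.out.println("decodes to yen sign: " + decodeYen(name).equals("\u00A5"));
        } else {
            System.out.println(name + " is not available on this JVM");
        }
    }
}
```

The availability of the charset depends on the JVM, not on the operating system's locale setting, so a Unix box configured for utf-8 can still decode a windows-1252 file when the charset is named explicitly.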
If you want to be really safe, the common approach is to use only UTF-8 encoded data in your projects.
Upvotes: 2