Reputation: 2886
On the web page, the text reads "Why don't we".
But when I parse the web page and save it to a text file, it shows up like this in Eclipse:
Why don鈥檛 we
More information about my implementation:
The web page is encoded in UTF-8.
I use jSoup to parse it, and the result is saved as a .txt file.
I use FileWriter f = new FileWriter() to write to the file.
UPDATE: I actually solved the display problem in Eclipse by changing Eclipse's encoding to UTF-8.
Upvotes: 0
Views: 66
Reputation: 109613
FileWriter is a convenience class that uses the platform's current default encoding. That is non-portable, and here probably not the encoding you want. Specify the encoding explicitly instead:
// Needs imports from java.io and java.nio.charset.StandardCharsets
BufferedWriter f = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream(file), StandardCharsets.UTF_8));
f.write("\uFEFF"); // Redundant BOM character; may be written to make sure
                   // the text is read as UTF-8
...
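To check that the file itself is fine and only the viewer's encoding was off, a small round trip can help. The following is a minimal sketch, not the asker's actual code: the class name Utf8RoundTrip, the helper methods, and the temp file are made up for illustration, and it uses Files.newBufferedWriter as a shorter alternative to the OutputStreamWriter chain above. It writes the text as UTF-8, reads it back as UTF-8, and then decodes the same bytes with a different charset (ISO-8859-1 here, purely as an example) to show that a mismatched decoder garbles the apostrophe.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are illustrative only.
public class Utf8RoundTrip {

    // Writes text to the file as UTF-8, independent of the platform default.
    public static void writeUtf8(Path file, String text) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    // Reads the whole file back, decoding its bytes as UTF-8.
    public static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // \u2019 is the typographic apostrophe, assumed to be what the page contains.
        String text = "Why don\u2019t we";
        Path file = Files.createTempFile("utf8-demo", ".txt");
        try {
            writeUtf8(file, text);

            // Same charset on both sides: the text survives.
            System.out.println(text.equals(readUtf8(file))); // prints true

            // Same bytes, wrong decoder: the apostrophe's three bytes
            // turn into three unrelated characters.
            String wrong = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            System.out.println(text.equals(wrong)); // prints false
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the first check prints true, the written file is valid UTF-8 and the remaining problem is whatever program opens it, which matches the UPDATE in the question.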
Upvotes: 2