dardy

Java's UTF-8 encoding

I have this code:

BufferedWriter w = Files.newWriter(file, Charsets.UTF_8);
w.newLine();
StringBuilder sb = new StringBuilder();
sb.append("\"").append("éééé").append("\";");
w.write(sb.toString());

But it doesn't work: the resulting file isn't UTF-8 encoded. I also tried this when writing:

w.write(new String(sb.toString().getBytes(Charsets.US_ASCII), "UTF8"));

It made question marks appear everywhere in the file...

I found that there was a bug regarding the recognition of the initial BOM character (http://bugs.java.com/view_bug.do?bug_id=4508058), so I tried using the BOMInputStream class. But bomIn.hasBOM() always returns false, so I guess my problem is not BOM-related.

How can I get my file encoded in UTF-8? Was this problem fixed in Java 8?

Upvotes: 0

Views: 1242

Answers (1)

You're writing UTF-8 correctly in your first example (although you're redundantly creating a String from a String).

The problem is that the viewer or tool you're using to view the file doesn't read the file as UTF-8.
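One way to rule out the viewer is to look at the raw bytes on disk. The following is only a minimal sketch: it uses the JDK's own Files.newBufferedWriter and StandardCharsets instead of the Guava calls from the question, and the class name, method names, and temp file are made up for illustration. In UTF-8, "é" (U+00E9) is encoded as the two bytes C3 A9, so that is what should show up in the file.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original question.
public class Utf8WriteCheck {

    // Writes the text as UTF-8 and returns the raw bytes that ended up on disk.
    static byte[] writeAndReadBack(Path file, String text) throws IOException {
        // try-with-resources closes (and therefore flushes) the writer.
        try (BufferedWriter w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            w.write(text);
        }
        return Files.readAllBytes(file);
    }

    // Formats bytes as space-separated uppercase hex, e.g. "C3 A9".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("utf8-check", ".txt");
        // \u00e9 is 'é'; the escape keeps the example independent of
        // the encoding used to compile this source file.
        byte[] bytes = writeAndReadBack(file, "\u00e9\u00e9");
        System.out.println(toHex(bytes)); // prints "C3 A9 C3 A9"
        Files.delete(file);
    }
}
```

If the hex dump shows C3 A9 for each "é", the file is valid UTF-8 and the remaining suspect is the program used to open it.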

Don't mix in ASCII: encoding the string with US_ASCII replaces every non-ASCII character with a question mark.
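To see where the question marks come from, here is a small sketch of the round trip from the question's second attempt, written with the JDK's StandardCharsets rather than Guava's Charsets. The class and method names are hypothetical. String.getBytes(Charset) substitutes the charset's replacement byte for characters it cannot encode, and for US-ASCII that byte is '?', so the accents are already lost before the bytes are decoded again.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
public class AsciiRoundTrip {

    // Encodes to US-ASCII, then decodes the resulting bytes as UTF-8.
    static String viaAscii(String s) {
        // Characters outside ASCII cannot be encoded and become '?' here.
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        // Decoding as UTF-8 cannot bring the original characters back.
        return new String(ascii, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e9 is 'é'
        System.out.println(viaAscii("\"\u00e9\u00e9\";")); // prints "??";
    }
}
```

Writing sb.toString() directly to a writer opened with UTF-8 avoids this lossy step entirely.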

Upvotes: 1
