Reputation: 33
For some weird reason I can't seem to print ë in Java.
public class Eindopdracht0002test
{
    public static void main(String[] args)
    {
        System.out.println("\u00EB");
    }
}
It's supposed to print "België" (Dutch for Belgium), but it prints "Belgi├½" instead.
Does anyone know how to resolve this?
Upvotes: 1
Views: 491
Reputation: 124235
In UTF-8, ë is written as the two bytes 11000011 10101011 (source: https://unicode-table.com/en/00EB).
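As a quick check of those bits, here is a minimal sketch that encodes a string as UTF-8 and prints each byte in binary. The class name Utf8BitsDemo and the helper utf8Bits are made up for this illustration; only the standard getBytes call does the actual encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BitsDemo {
    // Hypothetical helper: returns the UTF-8 bytes of the text as
    // space-separated 8-bit binary strings.
    static String utf8Bits(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to an unsigned value, then left-pad to 8 binary digits.
            String bits = Integer.toBinaryString(b & 0xFF);
            sb.append(String.format("%8s", bits).replace(' ', '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 11000011 10101011
        System.out.println(utf8Bits("\u00EB"));
    }
}
```

The output contains only ASCII digits, so it should display the same way regardless of the console's code page.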
The Windows console uses code pages, which are 8-bit mappings from byte values to characters (you can check your console's code page with the chcp command). This means that when ë is sent to the output stream (the console) as the bits 11000011 10101011, the console sees two separate characters, which in code page 850 (based on your comments) map to:
├ - 11000011 (195 in decimal)
½ - 10101011 (171 in decimal)

If you don't want to use UTF-8 encoding, you can create a separate Writer and specify a different encoding, which will translate characters to bytes according to that encoding. To do so you can use
OutputStreamWriter(OutputStream out, String charsetName)
which in your case may look like
OutputStreamWriter osw = new OutputStreamWriter(System.out, "cp850");
// needed encoding ------------------------------------------^^^^^
since you want to send characters with the specified encoding to the standard output stream.
To use the println method and ensure it flushes its data automatically, you can wrap the created OutputStreamWriter in
PrintWriter(Writer out, boolean autoFlush)
like
PrintWriter out = new PrintWriter(osw, true);
You can also do both these things in one line:
PrintWriter out = new PrintWriter(new OutputStreamWriter(System.out, "cp850"), true);
Now if you call out.println("\u00EB"); it should recognize the ë character, use the cp850 encoding to locate its mapping (which is 137), and send the proper byte representation (here 10001001) to System.out (the console).
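To tie the pieces together, here is a rough, self-contained sketch of the approach described above. The class name Cp850PrintDemo and the helpers writerFor, encodedBytes and misreadUtf8As are invented for this example. It uses the Charset overload of the OutputStreamWriter constructor instead of the String one, purely to avoid the checked UnsupportedEncodingException, and it assumes your JDK ships the cp850 charset (full JDK builds normally do, stripped-down runtimes may not).

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp850PrintDemo {
    // Hypothetical helper: wraps a byte stream in an auto-flushing PrintWriter
    // that encodes characters with the named charset.
    static PrintWriter writerFor(OutputStream out, String charsetName) {
        return new PrintWriter(
                new OutputStreamWriter(out, Charset.forName(charsetName)), true);
    }

    // Hypothetical helper: returns the unsigned byte values that end up in the
    // stream when the text is written through such a writer.
    static int[] encodedBytes(String text, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintWriter writer = writerFor(buffer, charsetName);
        writer.print(text);
        writer.flush(); // autoFlush only applies to println, so flush explicitly
        byte[] raw = buffer.toByteArray();
        int[] values = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            values[i] = raw[i] & 0xFF;
        }
        return values;
    }

    // Hypothetical helper: simulates a console that receives UTF-8 bytes but
    // interprets them with its own code page.
    static String misreadUtf8As(String text, String consoleCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName(consoleCharset));
    }

    public static void main(String[] args) {
        // Assumes the console really is on code page 850 (check with chcp).
        PrintWriter out = writerFor(System.out, "cp850");
        out.println("Belgi\u00EB");
    }
}
```

With these helpers, encodedBytes("\u00EB", "cp850") yields the single value 137, encodedBytes("\u00EB", "UTF-8") yields 195 and 171, and misreadUtf8As("\u00EB", "cp850") gives back the two characters ├½ from the question. If chcp reports a different code page, pass that name instead of "cp850".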
Upvotes: 2