Reputation: 3848
How can I put a supplementary Unicode character (say, code point U+10400) in a string literal? I have tried using a surrogate pair like this:
String text = "TEST \uD801\uDC00";
System.out.println(text);
but it doesn't seem to work.
UPDATE:
The good news is, the string is constructed properly.
Byte array in UTF-8: 54 45 53 54 20 f0 90 90 80
Byte array in UTF-16: fe ff 00 54 00 45 00 53 00 54 00 20 d8 01 dc 00
But the bad news is that it is not printed properly on my Fedora box: I see a square instead of the expected symbol, because my console doesn't support Unicode properly.
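For anyone who wants to reproduce the byte dumps in the update, here is a minimal sketch. The class name `SurrogatePairBytes` and the `hex` helper are made up for illustration, and wrapping `System.out` in a UTF-8 `PrintStream` is only one possible way to rule out the JVM's default encoding; whether the glyph then shows up still depends on the terminal and its font.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class SurrogatePairBytes {
    // Formats bytes as two-digit lowercase hex values separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "TEST \uD801\uDC00";

        // The surrogate pair encodes to a single 4-byte UTF-8 sequence.
        System.out.println("UTF-8:  " + hex(text.getBytes(StandardCharsets.UTF_8)));
        // Java's "UTF-16" charset writes a big-endian byte order mark first.
        System.out.println("UTF-16: " + hex(text.getBytes(StandardCharsets.UTF_16)));

        // Assumption: the terminal expects UTF-8. This only fixes the bytes
        // Java emits; rendering is still up to the console and its font.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(text);
    }
}
```

If the hex dumps match the ones above but the last line still shows a square, the string and the encoding are fine and the font simply lacks a glyph for U+10400.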
Upvotes: 27
Views: 43877
Reputation: 64622
It is supposed to work using:
int h = 0x10400; // assumed: the code point from the question
System.out.println(
    "text = " + new String(Character.toChars(h))
);
But the output is:
text = ?
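A `?` here most likely comes from the output encoding rather than from the string itself: when the stream's charset cannot represent a character, Java's encoders generally substitute `?`. A quick sketch to check that `Character.toChars` really yields the expected surrogate pair without involving the console (the class and method names are hypothetical, and the code point is assumed to be U+10400 from the question):

```java
// Hypothetical demo class; checks the string contents without printing the glyph.
class SupplementaryFromCodePoint {
    // Builds a String holding exactly one code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        int h = 0x10400; // assumed: the code point from the question
        String s = fromCodePoint(h);

        // Same contents as the escaped literal with the surrogate pair.
        System.out.println(s.equals("\uD801\uDC00"));
        // Code points above U+FFFF need two chars in a Java String.
        System.out.println(Character.isSupplementaryCodePoint(h));
        // Print the two UTF-16 code units as hex, which any console can show.
        System.out.println(Integer.toHexString(s.charAt(0)) + " " + Integer.toHexString(s.charAt(1)));
    }
}
```

Printing the code units as hex keeps the check independent of whatever the terminal can render.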
Upvotes: 5
Reputation:
"Works for me", what exactly is the issue?
public static void main(String[] args) throws Exception {
    int cp = 0x10400;
    String text = "test \uD801\uDC00";
    System.out.println("cp: " + cp);
    System.out.println("found: " + text.codePointAt(5));
    System.out.println("len: " + text.length());
}
Output:
cp: 66560
found: 66560
len: 7
Note that length -- like most String methods -- deals with chars (UTF-16 code units), not Unicode code points, so the supplementary character counts as two. So much for awesome Unicode support :)
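If you need the count in code points rather than chars, `String.codePointCount` and `String.codePoints()` (Java 8+) are one way to get it. A small sketch, with a made-up class name and helper:

```java
// Hypothetical demo class contrasting char count with code point count.
class CodePointCounting {
    // Number of Unicode code points in the whole string.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String text = "test \uD801\uDC00";

        System.out.println("chars: " + text.length());              // 7
        System.out.println("code points: " + countCodePoints(text)); // 6

        // Iterate by code point instead of by char; the surrogate pair
        // is delivered as the single value 0x10400.
        text.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```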
Happy coding.
Upvotes: 20