Nish

Reputation: 1012

Convert a Unicode code point into its actual Chinese symbol in Java

I want to convert a code point given as U+2E93 into its corresponding Chinese symbol in Java. I tried this approach:

String encoding = "UTF-8";
String cp = "U+2E93".substring(2);
int cpVal = Integer.parseInt(cp, 16);
String tempString = Character.toString((char) cpVal);
byte[] bytes = tempString.getBytes(Charset.forName(encoding));
String result = new String(bytes);

This works fine on my local machine, where the default charset is UTF-8, but not on a Linux VM where the default charset is ISO-8859-1.

Upvotes: 1

Views: 1631

Answers (1)

Andreas

Reputation: 159086

Use a Unicode escape sequence.

System.out.println("\u2E93");

If you receive the code point as a string, as shown in the question, do it like this:

Java 11+

String cp = "U+2E93";
int codePoint = Integer.parseInt(cp.substring(2), 16);
String result = Character.toString(codePoint);
System.out.println(result);

Java 5+

String cp = "U+2E93";
int codePoint = Integer.parseInt(cp.substring(2), 16);
String result = new String(new int[] { codePoint }, 0, 1);
System.out.println(result);

Output (from all 3 above)

⺓

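As a side note on why the code in the question depends on the machine: `new String(bytes)` decodes with the platform default charset, so the UTF-8 bytes are misread on a VM whose default is ISO-8859-1. The sketch below simulates both environments by passing the decoding charset explicitly. The class and method names (`CharsetRoundTrip`, `roundTrip`) are made up for illustration and are not from the question or the answer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetRoundTrip {
    // Hypothetical helper: encodes as UTF-8, then decodes with the given
    // charset, mimicking what new String(bytes) does under that default.
    static String roundTrip(String s, Charset decodeWith) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String s = new String(new int[] { 0x2E93 }, 0, 1);

        // Decoding with the same charset restores the original string.
        System.out.println(roundTrip(s, StandardCharsets.UTF_8).equals(s));      // true

        // Decoding the three UTF-8 bytes as ISO-8859-1 yields three
        // unrelated characters instead of one.
        String garbled = roundTrip(s, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(s));  // false
        System.out.println(garbled.length());   // 3
    }
}
```

The byte round trip is not needed to build the string at all, which is why the snippets above skip it. Printing is a separate matter: `System.out` encodes with the platform's own encoding, so the terminal on the VM may still need to be set to UTF-8 for the symbol to display.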
For characters from the supplementary planes, you need to give the UTF-16 surrogate pair when using a string literal:

System.out.println("\uD83D\uDC4D");
String cp = "U+1F44D";
...

Output (from both)

👍
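To illustrate how a supplementary code point ends up as a surrogate pair, here is a minimal sketch that assumes the same "U+XXXX" parsing as the earlier snippets. `SupplementaryDemo` and `fromNotation` are hypothetical names chosen for this example; `Character.toChars` is used here as one way to do the conversion, and it produces the same string as the `int[]` constructor shown earlier.

```java
final class SupplementaryDemo {
    // Hypothetical helper: turns "U+" notation into a String.
    // Assumes the input always starts with the two-character prefix "U+".
    static String fromNotation(String cp) {
        int codePoint = Integer.parseInt(cp.substring(2), 16);
        // toChars returns one char for BMP code points and a
        // high/low surrogate pair for supplementary ones.
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String thumbsUp = fromNotation("U+1F44D");

        // Same content as the surrogate-pair literal.
        System.out.println(thumbsUp.equals("\uD83D\uDC4D"));               // true
        // Two UTF-16 code units, but a single code point.
        System.out.println(thumbsUp.length());                             // 2
        System.out.println(thumbsUp.codePointCount(0, thumbsUp.length())); // 1
    }
}
```

Casting to `char`, as the question does, would not work here: a `char` holds only 16 bits, so the cast would truncate any code point above U+FFFF.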

Upvotes: 3
