Pratap Vhatkar

Reputation: 701

Chinese character UTF-16 Encoding string in Java

I am trying to encode a string in Java using the following code:

String s = "子";
byte[] bytesEncoded = Base64.encodeBase64(s.getBytes("UTF-16"));
String stringEncoded = new String(bytesEncoded);

When I run this code in Eclipse, I get the value /v9bUA==

But some online UTF-16 converters give values like 4E02.

Does anybody know how to convert Chinese characters to UTF-16?

I have already gone through most of the related Stack Overflow questions and still found no answer.

Thanks in advance!

Upvotes: 0

Views: 4093

Answers (2)

Evan Jones

Reputation: 886

The code

String s = "子";
byte[] utf16encodedBytes = s.getBytes("UTF-16");

will give you the string encoded as UTF-16 bytes.

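I think what is confusing you here is that you then encode those bytes to Base64, which represents them in ASCII as /v9bUA==. A value like 4E02 is a hex encoding instead. Note also that Java's "UTF-16" charset prepends a byte order mark (FE FF) when encoding, so for your example the bytes in hex are FEFF5B50. To see the hex encoding for your example you could try: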

String hexEncodedString = DatatypeConverter.printHexBinary(utf16encodedBytes);
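
To make the difference between the encodings concrete, here is a minimal sketch that prints the same character as UTF-16 hex (with and without the byte order mark) and as Base64. The class name Utf16Demo and the helper toHex are made up for illustration. It uses java.util.Base64 (Java 8+) rather than the Commons Codec call from the question, and a hand-written hex loop rather than DatatypeConverter, which is not bundled with newer JDKs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; the names are illustrative, not from the original posts.
class Utf16Demo {

    // Render each byte as two uppercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Base64 of the "UTF-16" (BOM-prefixed) bytes, as in the question.
    static String toBase64Utf16(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_16));
    }

    public static void main(String[] args) {
        // U+5B50 is the character from the question, written as an escape
        // so the result does not depend on the source file encoding.
        String s = "\u5B50";

        // "UTF-16" encodes big-endian and prepends the byte order mark FE FF.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16)));   // FEFF5B50

        // "UTF-16BE" encodes big-endian without a byte order mark.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16BE))); // 5B50

        // Same BOM-prefixed bytes, shown as Base64 instead of hex.
        System.out.println(toBase64Utf16(s));                             // /v9bUA==
    }
}
```

So /v9bUA== and FEFF5B50 describe the same four bytes. An online converter that shows only four hex digits per character is most likely displaying the code unit without the byte order mark.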

Upvotes: 1

Pratap Vhatkar

Reputation: 701

This works fine.

You just need to convert the bytes into their hex representation. Using "UTF-16BE" instead of "UTF-16" also leaves out the byte order mark:

String encodeAsUcs2(String messageContent) throws UnsupportedEncodingException {
  byte[] bytes = messageContent.getBytes("UTF-16BE");

  StringBuilder sb = new StringBuilder();
  for (byte b : bytes) {
    sb.append(String.format("%02X", b));
  }

  return sb.toString();
}
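
As a quick check of the method above, the following sketch calls it on a couple of inputs and decodes the result back. The class name Ucs2HexDemo and the helper decodeFromUcs2 are hypothetical additions for illustration. The sketch also swaps the charset name string for StandardCharsets.UTF_16BE, which avoids the checked UnsupportedEncodingException; that is a convenience assumption, not a change in behavior.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class wrapping the method from the answer above.
class Ucs2HexDemo {

    // Same logic as the answer: UTF-16BE bytes rendered as uppercase hex.
    static String encodeAsUcs2(String messageContent) {
        byte[] bytes = messageContent.getBytes(StandardCharsets.UTF_16BE);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Illustrative inverse (not in the original answer): parse pairs of hex
    // digits back into bytes and decode them as UTF-16BE.
    static String decodeFromUcs2(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of source file encoding.
        System.out.println(encodeAsUcs2("\u5B50"));        // 5B50
        System.out.println(encodeAsUcs2("\u4E2D\u6587"));  // 4E2D6587
        // Round trip: decoding the hex gives back the original string.
        System.out.println(decodeFromUcs2("5B50").equals("\u5B50")); // true
    }
}
```

Each character in the Basic Multilingual Plane comes out as exactly four hex digits, which matches the format the online converters show.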

Upvotes: 1
