Reputation: 187399
I have a Java servlet that receives data from an upstream system via an HTTP GET request. This request includes a parameter named "text" and another named "charset" that indicates how the text parameter was encoded.
If I instruct the upstream system to send me the text TĀ and debug the servlet request params, I see the following:
request.getParameter("charset") == "UTF-16LE"
request.getParameter("text").getBytes() == [0, 84, 1, 0]
The code points (in hex) for the two characters in this string are:
[T] 0054
[Ā] 0100
I cannot figure out how to convert this byte[] back to the String "TĀ". I should mention that I don't entirely trust the charset parameter and suspect the upstream system may actually be using UTF-16BE.
Upvotes: 0
Views: 4092
Reputation: 311050
Why are you calling getBytes() at all? You already have the parameter as a String. Calling getBytes(), without specifying a charset, is just an opportunity to mangle the data.
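If the String you get from getParameter() is already garbled, one possible workaround is to undo the container's decoding and decode again with the real charset. This is only a sketch: it assumes the container decoded the raw parameter bytes as ISO-8859-1 (a common default, but check your container's configuration), and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParamRedecode {
    // Hypothetical helper: turn the String back into the raw bytes using the
    // charset the container is ASSUMED to have used, then decode those bytes
    // with the charset the sender actually used.
    static String redecode(String param, Charset containerCharset, Charset actualCharset) {
        byte[] raw = param.getBytes(containerCharset);
        return new String(raw, actualCharset);
    }

    public static void main(String[] args) {
        // Simulate what the servlet would see if the bytes { 0, 84, 1, 0 }
        // had been decoded as ISO-8859-1 by the container.
        String seen = new String(new byte[] { 0, 84, 1, 0 }, StandardCharsets.ISO_8859_1);
        String fixed = redecode(seen, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_16BE);
        System.out.println(fixed.equals("T\u0100"));
    }
}
```

This works for ISO-8859-1 because it maps every byte value to exactly one character, so the round trip loses nothing. That is not true of most other charsets, so the explicit charset argument to getBytes() matters here.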
Upvotes: 0
Reputation: 6897
Use the String(byteArray, charset) constructor:
byte[] bytes = { 0, 84, 1, 0 };
String string = new String(bytes, "UTF-16BE");
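To check which byte order matches the bytes from the question, you can decode the same array both ways and compare the results. This is a small sketch; the class and method names are only for illustration. It uses the StandardCharsets constants, which avoid the checked UnsupportedEncodingException that the String-name overload declares.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16Decode {
    // Decode raw bytes with an explicitly named charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] bytes = { 0, 84, 1, 0 };

        // Big-endian: 0x00 0x54 -> U+0054 (T), 0x01 0x00 -> U+0100 (Ā)
        String be = decode(bytes, StandardCharsets.UTF_16BE);
        // Little-endian: 0x00 0x54 -> U+5400, 0x01 0x00 -> U+0001
        String le = decode(bytes, StandardCharsets.UTF_16LE);

        System.out.println(be.equals("T\u0100"));      // true
        System.out.println(le.equals("\u5400\u0001")); // true
    }
}
```

Only the big-endian decoding yields the code points 0054 and 0100 listed in the question, which supports the suspicion that the "charset" parameter is wrong for this data.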
Upvotes: 6