Reputation: 7024
I'm stuck while creating a licence manager for an Android app where licence key is generated on desktop server, and verification code runs on android devices. The verification code when executed on desktop produces desired results, but the same code produces a different result on Android.
I debugged the problem and reached the point where the results were getting snapped!
here is a code snippet to demonstrate the difference:
byte[] bytes = {-88, 50, -29, 114, 51, 88, 38, -52, 114, 91, -23, -55, 124, 37, -90, -49, 36, -110, -67, -59, -33, -75, 85, -72, -109, 25, -54, 89, 6, 35, -50, -11, -87, -22, 33, -2, 55, -30, 75, -36, -40, -29, -103, 110, 46, -100, -68, 101, -105, 62, 53, -20, -20, -21, -118, -72, -27, 32, 59, 127, 15, -117, 6, 102};
System.out.println(new String(bytes, "UTF-8").hashCode());
on oracle jdk the result comes out to be
-24892055
but on android phone the result is:
-186036018
Any help will be appreciated.
Upvotes: 1
Views: 606
Reputation: 40593
It's a difference in how Android and Java handle malformed UTF-8. Given the four byte sequence 0xf5 0xa9 0xea 0x21
, Android returns two Unicode replacement characters (0xfffd
). Oracle's class library returns three Unicode replacement characters.
Here's a simpler example that demonstrates the problem.
byte[] bytes = { (byte) 0xf5, (byte) 0xa9, (byte) 0xea, (byte) 0x21 };
String decoded = new String(bytes, "UTF-8");
for (int i = 0; i < decoded.length(); i++) {
System.out.print(Integer.toHexString(decoded.charAt(i)) + " ");
}
Oracle's JVM prints
fffd fffd fffd
Android's dalvikvm prints
fffd fffd
Your best bet is to avoid decoding byte sequences using UTF-8 unless you know that they are in fact UTF-8. I've reported this inconsistency to the Dalvik team to investigate: Android bug 23831.
If you use CharsetDecoder, Android uses icu4c to do the conversion. That returns U+fffd U+fffd U+0021, which also seems correct by my reading of the UTF-8 spec. In future releases, Android's String will match Android's CharsetDecoder 2.
Upvotes: 2
Reputation: 19040
When you call getBytes() you need to specify an ecoding there as well, otherwise you'll get the default encoding from the OS, which could be anything, e.g. showBytes(new String(bytes, "UTF-8").getBytes("UTF-8"));
Upvotes: 2