PC.

Reputation: 7024

Different results on Oracle JRE and Dalvik JVM

I'm stuck while creating a licence manager for an Android app, where the licence key is generated on a desktop server and the verification code runs on Android devices. The verification code produces the desired result when executed on the desktop, but the same code produces a different result on Android.

I debugged the problem and narrowed it down to the point where the results diverge.

Here is a code snippet that demonstrates the difference:

byte[] bytes = {-88, 50, -29, 114, 51, 88, 38, -52, 114, 91, -23, -55, 124, 37, -90, -49, 36, -110, -67, -59, -33, -75, 85, -72, -109, 25, -54, 89, 6, 35, -50, -11, -87, -22, 33, -2, 55, -30, 75, -36, -40, -29, -103, 110, 46, -100, -68, 101, -105, 62, 53, -20, -20, -21, -118, -72, -27, 32, 59, 127, 15, -117, 6, 102};
System.out.println(new String(bytes, "UTF-8").hashCode());

On the Oracle JDK the result is:

-24892055

but on an Android phone the result is:

-186036018

Any help will be appreciated.

Upvotes: 1

Views: 606

Answers (2)

Jesse Wilson

Reputation: 40593

It's a difference in how Android and Oracle's Java handle malformed UTF-8. Given the four-byte sequence 0xf5 0xa9 0xea 0x21, Android returns two Unicode replacement characters (U+FFFD). Oracle's class library returns three Unicode replacement characters.

Here's a simpler example that demonstrates the problem.

byte[] bytes = { (byte) 0xf5, (byte) 0xa9, (byte) 0xea, (byte) 0x21 };
String decoded = new String(bytes, "UTF-8");
for (int i = 0; i < decoded.length(); i++) {
  System.out.print(Integer.toHexString(decoded.charAt(i)) + " ");
}

Oracle's JVM prints

fffd fffd fffd 

Android's dalvikvm prints

fffd fffd

Your best bet is to avoid decoding byte sequences using UTF-8 unless you know that they are in fact UTF-8. I've reported this inconsistency to the Dalvik team to investigate: Android bug 23831.
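
One way to follow that advice, sketched under the assumption that the key material is arbitrary binary data: hash or compare the bytes directly, and convert them to hex when a string is needed. The class and method names (RawBytesKey, fingerprint, toHex) are hypothetical and do not come from the question.

```java
import java.util.Arrays;

public class RawBytesKey {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Hypothetical helper: hash the bytes themselves, so no charset decoder runs.
    static int fingerprint(byte[] data) {
        return Arrays.hashCode(data);
    }

    // Hypothetical helper: lossless text form of arbitrary bytes.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            // High nibble first, then low nibble.
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = { (byte) 0xf5, (byte) 0xa9, (byte) 0xea, (byte) 0x21 };
        System.out.println(toHex(bytes));       // prints f5a9ea21
        System.out.println(fingerprint(bytes)); // depends only on the byte values
    }
}
```

Arrays.hashCode(byte[]) is specified in terms of the element values, so it should give the same number on both runtimes, unlike the hash of a string decoded from malformed input.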

If you use CharsetDecoder, Android uses icu4c to do the conversion. That returns U+fffd U+fffd U+0021, which also seems correct by my reading of the UTF-8 spec. In future releases, Android's String will match Android's CharsetDecoder.
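
A minimal sketch of using CharsetDecoder directly, assuming you would rather reject malformed input than have it silently replaced. StrictUtf8 and its methods are illustrative names; CodingErrorAction.REPORT makes decode() throw a CharacterCodingException instead of substituting U+FFFD.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class StrictUtf8 {
    // Decodes the bytes as UTF-8 and throws if they are not well-formed.
    static String decodeStrict(byte[] data) throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(data)).toString();
    }

    // Convenience check built on decodeStrict.
    static boolean isValidUtf8(byte[] data) {
        try {
            decodeStrict(data);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] malformed = { (byte) 0xf5, (byte) 0xa9, (byte) 0xea, (byte) 0x21 };
        byte[] wellFormed = { 0x61, 0x62, 0x63 }; // "abc"
        System.out.println(isValidUtf8(malformed));  // prints false
        System.out.println(isValidUtf8(wellFormed)); // prints true
    }
}
```

With REPORT, both runtimes should agree on whether the input is acceptable, even if they disagree on how many replacement characters a lenient decode would produce.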

Upvotes: 2

superfell

Reputation: 19040

When you call getBytes() you need to specify an encoding there as well; otherwise you'll get the default encoding from the OS, which could be anything. For example: showBytes(new String(bytes, "UTF-8").getBytes("UTF-8"));
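
A small sketch of that round trip with the charset named on both sides. ExplicitCharset and roundTrip are hypothetical names, and Charset.forName("UTF-8") is used here only to avoid the checked exception thrown by the String-name overloads. It also shows that naming the charset does not make malformed input survive the trip.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class ExplicitCharset {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Decode and re-encode with the same explicitly named charset,
    // so the platform default encoding is never consulted.
    static byte[] roundTrip(byte[] data) {
        return new String(data, UTF8).getBytes(UTF8);
    }

    public static void main(String[] args) {
        byte[] valid = { 0x61, 0x62, 0x63 }; // "abc"
        byte[] malformed = { (byte) 0xf5, (byte) 0xa9, (byte) 0xea, (byte) 0x21 };
        System.out.println(Arrays.equals(valid, roundTrip(valid)));         // prints true
        System.out.println(Arrays.equals(malformed, roundTrip(malformed))); // prints false
    }
}
```

The second result is false because the malformed bytes decode to replacement characters, which re-encode to different bytes.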

Upvotes: 2
