Reputation: 3161
I'm trying to compare two strings hashed with SHA-512, but the results are different. I think it could be an encoding problem. I hope you can help me.
This is my Java code:
MessageDigest digest = java.security.MessageDigest.getInstance("SHA-512");
digest.update(MyString.getBytes());
byte[] messageDigest = digest.digest();

// Create hex string
StringBuffer hexString = new StringBuffer();
for (int i = 0; i < messageDigest.length; i++) {
    String h = Integer.toHexString(0xFF & messageDigest[i]);
    while (h.length() < 2)
        h = "0" + h;
    hexString.append(h);
}
return hexString.toString();
And this is my C# code:
UnicodeEncoding UE = new UnicodeEncoding();
byte[] hashValue;
byte[] message = UE.GetBytes(MyString);
SHA512Managed hashString = new SHA512Managed();
string hex = "";

hashValue = hashString.ComputeHash(message);
foreach (byte x in hashValue)
{
    hex += String.Format("{0:x2}", x);
}
return hex;
Where is the problem? Thanks a lot, guys.
UPDATE
If I don't specify an encoding type, I think it assumes Unicode. This is the result (without specifying anything):
Java SHA: a99951079450e0bf3cf790872336b3269da580b62143af9cfa27aef42c44ea09faa83e1fbddfd1135e364ae62eb373c53ee4e89c69b54a7d4d268cc2274493a8
C# SHA: 70e6eb559cbb062b0c865c345b5f6dbd7ae9c2d39169571b6908d7df04642544c0c4e6e896e6c750f9f135ad05280ed92b9ba349de12526a28e7642721a446aa
Instead, if I specify UTF-16 in Java:
Java SHA (UTF-16): f7a587d55916763551e9fcaafd24d0995066371c41499fcb04614325cd9d829d1246c89af44b98034b88436c8acbd82cd13ebb366d4ab81b4942b720f02b0d9b
It's always different!!!
Upvotes: 7
Views: 6143
Reputation: 1109432
Here,
digest.update(MyString.getBytes());
you should explicitly specify the desired character encoding in the String#getBytes() method. It will otherwise default to the platform default charset, as obtained by Charset#defaultCharset().
Fix it accordingly:
digest.update(MyString.getBytes("UTF-16LE"));
It should at least be the same charset that UnicodeEncoding is using internally.
Unrelated to the concrete problem, Java also has an enhanced for loop and a String#format(), which would make the hex conversion a lot shorter.
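For illustration, the hex conversion from the question could then be written like this (a sketch reusing the messageDigest byte array from the question's code):

StringBuilder hexString = new StringBuilder();
for (byte b : messageDigest) {
    // %02x zero-pads each byte to two lowercase hex digits
    hexString.append(String.format("%02x", b));
}
return hexString.toString();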
Upvotes: 6
Reputation: 111359
The UnicodeEncoding you use in C# corresponds to the little-endian UTF-16 encoding, while "UTF-16" in Java corresponds to the big-endian UTF-16 encoding. Another difference is that C# doesn't output the byte order mark (called the "preamble" in the API) unless you ask for it, while "UTF-16" in Java always generates it. To make the two programs compatible, you can make Java also use the little-endian UTF-16:
digest.update(MyString.getBytes("UTF-16LE"));
Or you could switch to some other well-known encoding, like UTF-8.
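To see the difference concretely (a small demo, not part of the original code), you can print both encodings in Java:

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomDemo {
    public static void main(String[] args) {
        // "UTF-16" prepends a big-endian BOM: prints [-2, -1, 0, 65], i.e. FE FF 00 41
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16)));
        // "UTF-16LE" omits the BOM and matches C#'s new UnicodeEncoding(): prints [65, 0]
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16LE)));
    }
}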
Upvotes: 6
Reputation: 34034
The reason is probably that you did not specify the encoding to use when converting the string to bytes: Java uses the platform default encoding, while UnicodeEncoding seems to use UTF-16.
Edit:
The documentation for UnicodeEncoding says:
This constructor creates an instance that uses the little endian byte order, provides a Unicode byte order mark, and does not throw an exception when an invalid encoding is detected.
Javas "utf-16" however seems to default to big endian byte order. With character encodings its better to be really specific, there is an UnicodeEncoding constructor taking two boolean specifiyng byte order, while in java there is also "utf-16le" and "utf-16be". You could try the following in c#
new UnicodeEncoding(true, false) // big endian, no byte order mark
and in Java:
MyString.getBytes("utf-16be")
Or, even better, use "utf-8" / Encoding.UTF8 in both cases, since it is not affected by byte order.
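Putting that together, a minimal sketch of the complete Java side using UTF-8 (assuming the C# side is switched to Encoding.UTF8 accordingly):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha512Hex {
    // Hashes the input with SHA-512 and returns the lowercase hex digest.
    public static String sha512Hex(String input) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-512");
        byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}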
Upvotes: 3