kinghomer

Reputation: 3161

String SHA-512 Encoding: C# and JAVA result is different

I'm trying to compare the SHA-512 hashes of the same string computed in Java and in C#, but the results are different. I suspect it is an encoding problem. I hope you can help me.

This is my Java code:

    MessageDigest digest = java.security.MessageDigest.getInstance("SHA-512"); 
    digest.update(MyString.getBytes()); 
    byte messageDigest[] = digest.digest();

    // Create Hex String
    StringBuffer hexString = new StringBuffer();
    for (int i = 0; i < messageDigest.length; i++) {
        String h = Integer.toHexString(0xFF & messageDigest[i]);
        while (h.length() < 2)
            h = "0" + h;
        hexString.append(h);
    }
    return hexString.toString();

and, this is my C# code:

        UnicodeEncoding UE = new UnicodeEncoding();
        byte[] hashValue;
        byte[] message = UE.GetBytes(MyString);

        SHA512Managed hashString = new SHA512Managed();
        string hex = "";

        hashValue = hashString.ComputeHash(message);
        foreach (byte x in hashValue)
        {
            hex += String.Format("{0:x2}", x);

        }
        return hex;

Where is the problem? Thanks a lot, guys.

UPDATE

If I don't specify an encoding, I think it assumes Unicode. This is the result without specifying anything:

Java SHA: a99951079450e0bf3cf790872336b3269da580b62143af9cfa27aef42c44ea09faa83e1fbddfd1135e364ae62eb373c53ee4e89c69b54a7d4d268cc2274493a8

C# SHA: 70e6eb559cbb062b0c865c345b5f6dbd7ae9c2d39169571b6908d7df04642544c0c4e6e896e6c750f9f135ad05280ed92b9ba349de12526a28e7642721a446aa

If I specify UTF-16 in Java instead:

Java SHA (UTF-16): f7a587d55916763551e9fcaafd24d0995066371c41499fcb04614325cd9d829d1246c89af44b98034b88436c8acbd82cd13ebb366d4ab81b4942b720f02b0d9b

It's always different!

Upvotes: 7

Views: 6143

Answers (3)

BalusC

Reputation: 1109432

Here,

    digest.update(MyString.getBytes());

you should explicitly specify the desired character encoding in the String#getBytes() method. Otherwise it defaults to the platform default charset, as returned by Charset#defaultCharset().

Fix it accordingly:

    digest.update(MyString.getBytes("UTF-16LE"));

It should at least be the same charset as UnicodeEncoding is internally using.


Unrelated to the concrete problem, Java also has an enhanced for loop and String#format().
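As a rough illustration of both suggestions together, the question's Java method could be sketched as follows. The class name `Sha512Hex` and the method name `sha512Hex` are made up for this example, and the charset is fixed to UTF-16LE on the assumption that the goal is to match the C# code as posted.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha512Hex {
    // Hypothetical helper: hashes the UTF-16LE bytes of the input, which is
    // the byte layout UnicodeEncoding.GetBytes produces on the C# side.
    static String sha512Hex(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_16LE));

            StringBuilder hex = new StringBuilder(hash.length * 2);
            // Enhanced for loop; %02x zero-pads each byte to two hex digits
            // and formats negative byte values as unsigned.
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a standard algorithm name, so this is not expected.
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha512Hex("MyString"));
    }
}
```

Using the `StandardCharsets` constant instead of the string `"UTF-16LE"` avoids the checked `UnsupportedEncodingException`, but either form should select the same charset.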

Upvotes: 6

Joni

Reputation: 111359

The UnicodeEncoding you use in C# corresponds to little-endian UTF-16, while "UTF-16" in Java corresponds to big-endian UTF-16. Another difference is that C# doesn't output the byte order mark (called the "preamble" in the API) unless you ask for it, while "UTF-16" in Java always generates it. To make the two programs compatible, you can make Java use little-endian UTF-16 as well:

    digest.update(MyString.getBytes("UTF-16LE"));

Or you could switch to some other well known encoding, like UTF-8.
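To see the difference in the raw bytes before any hashing happens, a small sketch like the following can help. The class name `Utf16Bytes` and the helper `toHex` are invented for this illustration, and the single-character input "A" is just a convenient sample.

```java
import java.nio.charset.StandardCharsets;

class Utf16Bytes {
    // Hypothetical helper: renders a byte array as space-separated hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String s = "A"; // U+0041

        // "UTF-16" in Java: big-endian with a leading byte order mark.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16)));   // fe ff 00 41

        // Little-endian, no byte order mark.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16LE))); // 41 00

        // Big-endian, no byte order mark.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16BE))); // 00 41
    }
}
```

Since the three byte sequences differ, their SHA-512 digests differ too, which is consistent with the mismatching hashes in the question.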

Upvotes: 6

Jörn Horstmann

Reputation: 34034

The reason is probably that you did not specify the encoding to use when converting the string to bytes. Java uses the platform default encoding, while UnicodeEncoding seems to use UTF-16.

Edit:

The documentation for UnicodeEncoding says

This constructor creates an instance that uses the little endian byte order, provides a Unicode byte order mark, and does not throw an exception when an invalid encoding is detected.

Java's "utf-16", however, seems to default to big-endian byte order. With character encodings it's better to be really specific: there is a UnicodeEncoding constructor taking two booleans that specify the byte order and whether to provide a byte order mark, while in Java there are also "utf-16le" and "utf-16be". You could try the following in C#

    new UnicodeEncoding(true, false) // big endian, no byte order mark

and in Java

    MyString.getBytes("utf-16be")

Or, even better, use "utf-8" / Encoding.UTF8 in both cases, since it is not affected by byte order.
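A minimal sketch of the Java side with the charset passed in explicitly might look like this. The class name `CharsetDigest` and its method are hypothetical, and the `BigInteger` formatting is just one way to get the hex string, not the approach used in the question.

```java
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CharsetDigest {
    // Hypothetical helper: SHA-512 of the input encoded with the given charset.
    static String sha512Hex(String input, Charset charset) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-512")
                    .digest(input.getBytes(charset));
            // BigInteger(1, hash) reads the bytes as an unsigned number;
            // %0128x pads to the 128 hex digits of a 64-byte digest.
            return String.format("%0128x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available", e);
        }
    }

    public static void main(String[] args) {
        String s = "MyString";
        System.out.println("UTF-8:    " + sha512Hex(s, StandardCharsets.UTF_8));
        System.out.println("UTF-16LE: " + sha512Hex(s, StandardCharsets.UTF_16LE));
        System.out.println("UTF-16BE: " + sha512Hex(s, StandardCharsets.UTF_16BE));
    }
}
```

The three lines print different digests for the same string. On the C# side, the UTF-8 line would presumably be matched by hashing `Encoding.UTF8.GetBytes(MyString)`.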

Upvotes: 3
