Mike G

Reputation: 1994

Java MD5 Hash Not Matching .NET Hash

I have a web service written in C# that validates some values. In it I need to check an MD5 hash generated by the calling Java client.

The Java client generates the hash in this manner:

Charset utf8Charset = Charset.forName("UTF-8");

byte[] bytesOfPhrase = phrase.getBytes(utf8Charset);
MessageDigest md = MessageDigest.getInstance("MD5");

byte[] thedigest = md.digest(bytesOfPhrase);
this._AuthenticationToken = new String(thedigest, utf8Charset);

The C# web service generates its hash in this manner:

private static string HashString(string toHash)
{
    MD5CryptoServiceProvider md5Provider = new MD5CryptoServiceProvider();

    byte[] hashedBytes = md5Provider.ComputeHash(_StringEncoding.GetBytes(toHash));
    return Convert.ToBase64String(hashedBytes);
}

I've tried several charsets in the Java code, but none of them produces a string that is anywhere near the C#-produced string. Even with hardcoded parameters that are identical on every call (so the hashes should match), Java still produces an odd string.

Example of a hashed value from C#:

6wM7McddLBjofdFJ3rU6/g==

I'd post the example of the string Java produces, but it has some very odd characters that I do not think I can paste in here.

What am I doing wrong?

Upvotes: 3

Views: 4579

Answers (4)

Jon Skeet

Reputation: 1504092

This is fundamentally broken code:

// Badly broken
byte[] thedigest = md.digest(bytesOfPhrase);
this._AuthenticationToken = new String(thedigest, utf8Charset);

Never, ever, ever try to encode arbitrary binary data by passing it to the String constructor. Always use base64, or hex, or something like that. Apache Commons Codec has a Base64 encoder, or this public domain version has a slightly more pleasant API.

The equivalent C# would be:

// Equally broken
byte[] hashedBytes = md5Provider.ComputeHash(Encoding.UTF8.GetBytes(toHash));
return Encoding.UTF8.GetString(hashedBytes);

What are the chances that the binary data produced by an MD5 digest is actually a valid UTF-8 byte sequence?
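To make that concrete, here is a minimal sketch (the class and method names are made up for illustration) that checks whether a byte array survives a round trip through the String constructor. The sample bytes 0xEB 0x03 0x3B are what the Base64 value shown in the question decodes to at its start. 0xEB announces a three-byte UTF-8 sequence, but 0x03 is not a continuation byte, so the decoder substitutes U+FFFD and the original bytes are lost.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8RoundTrip {
    // Hypothetical helper: decodes the bytes as UTF-8, re-encodes the
    // resulting String, and reports whether the original bytes came back.
    static boolean survivesUtf8RoundTrip(byte[] data) {
        String decoded = new String(data, StandardCharsets.UTF_8);
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(data, reencoded);
    }

    public static void main(String[] args) {
        // Digest-like bytes that are not valid UTF-8.
        byte[] binary = { (byte) 0xEB, 0x03, 0x3B };
        System.out.println(survivesUtf8RoundTrip(binary)); // false

        // Genuine text round-trips without loss.
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(survivesUtf8RoundTrip(text)); // true
    }
}
```

Once the malformed bytes have been replaced, no choice of charset on the receiving side can recover them.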

Two other things to note:

  • You can get hold of an MD5 hash slightly more simply in .NET using the MD5 class:

    byte[] hash;
    using (MD5 md5 = MD5.Create())
    {
        hash = md5.ComputeHash(bytes);
    }
    // Use hash
    

    Note the use of the using statement to dispose of the instance afterwards. My main preference for this is that it's easier to remember, read and type MD5 than MD5CryptoServiceProvider :)

  • You haven't made it clear what _StringEncoding is, but the code should really just use Encoding.UTF8 to match the Java.

Upvotes: 7

t_motooka

Reputation: 565

Your C# code outputs the MD5 hash Base64-encoded, but your Java code does not. A common way to compare two MD5 hashes is to compare their hexadecimal representations (16 bytes -> 32 hex digits).
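A rough sketch of the hex approach on the Java side, assuming the phrase is hashed as UTF-8. The class and method names here are invented for the example, and the C# side would need to emit the same lowercase hex format for the comparison to work.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5Hex {
    // Hypothetical helper: returns the MD5 digest of the UTF-8 bytes of
    // phrase as 32 lowercase hex digits.
    static String md5Hex(String phrase) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(phrase.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes still print as two digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("hello")); // 5d41402abc4b2a76b9719d911017c592
    }
}
```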

Upvotes: 0

Oswald

Reputation: 31685

In C#, you encode the bytes using Base64. In Java, you interpret the bytes as a UTF-8 string.

Upvotes: 0

hrnt

Reputation: 10142

Your C# digest is in Base64, but your Java digest is not. Convert thedigest to Base64 as well.
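Something along these lines should do it, assuming Java 8 or later (for java.util.Base64) and that the C# side really does hash the UTF-8 bytes of the string. The class and method names are just placeholders for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class Md5Base64 {
    // Hypothetical helper: hashes the UTF-8 bytes of phrase with MD5 and
    // returns the digest as Base64 text, the same shape of output that
    // Convert.ToBase64String produces in the C# code.
    static String md5Base64(String phrase) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(phrase.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Base64("hello")); // XUFAKrxLKna5cZ2REBfFkg==
    }
}
```

The result is always 24 ASCII characters for a 16-byte MD5 digest, so it is safe to send over the wire and compare as a plain string.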

Upvotes: 3
