Mark Ma

Reputation: 1362

The byte string is the same in Python and C#, but the MD5 hash is different

I want to generate an MD5 hash from a string, e.g. "咖啡", but the hash I get from Python differs from the one I get from C#. The Python result is the one I want.

C#

// Requires: using System.Security.Cryptography; using System.Text;
string str = "咖啡";
MD5 m = MD5.Create();
// Hash the bytes produced by Encoding.Default
byte[] data = m.ComputeHash(Encoding.Default.GetBytes(str));
StringBuilder sbuilder = new StringBuilder();
for (int i = 0; i < data.Length; i++) {
  sbuilder.Append(data[i].ToString("x2"));
}
// Hex dump of the same bytes that were hashed
byte[] hex = Encoding.Default.GetBytes(str);
StringBuilder hex_builder = new StringBuilder();
foreach (byte a in hex) {
  // AppendFormat, not Append: Append has no format-string overload
  hex_builder.AppendFormat("{0:x2}", a);
}
// MD5 hash
Response.Write(sbuilder.ToString());
// byte string
Response.Write(hex_builder.ToString());

Python

# coding: utf8
# Python 2: a plain string literal is a byte string in the source file's encoding
import hashlib

str = '咖啡'
m = hashlib.md5()
m.update(str)
# MD5 hash
print m.hexdigest()
# byte string
print ' '.join(["%02x" % ord(x) for x in str])

The byte string is e5 92 96 e5 95 a1 in both C# and Python.

MD5 hash:

(C#) a761914f9760af3c112e24f08dea1b16

(Python) 3b7daa58a1fecdf5ba4d94d539fbb4d5

Upvotes: 6

Views: 3639

Answers (2)

Ken Fischer

Reputation: 41

I had the same problem. Encoding the text as "UTF-16LE" before hashing made C# and Python produce the same result.

import hashlib

def getMD5(text):
    # Encode the text as UTF-16LE bytes before hashing
    encoded = text.encode("UTF-16LE")
    md5 = hashlib.md5()
    md5.update(encoded)
    return md5.hexdigest()

Upvotes: 4

Mihai Oprea

Reputation: 2047

The string encodings might differ, so converting the string to byte[] probably yields different values on each side. Try printing the bytes on both sides to see whether they are the same.
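As a rough way to run that check, the sketch below encodes the same text under a few encodings and prints the resulting bytes next to their MD5 digest. It is written for Python 3 (not the Python 2 used in the question), the helper name md5_hex is made up for illustration, and the list of encodings is only a guess at what either side might be using; "gbk" in particular stands in for a legacy system code page and is an assumption.

```python
import hashlib


def md5_hex(text, encoding):
    """Return (hex dump of the encoded bytes, MD5 hex digest of those bytes)."""
    data = text.encode(encoding)
    return data.hex(), hashlib.md5(data).hexdigest()


text = "咖啡"
# Candidate encodings are assumptions; add whichever ones your C# side may use.
for enc in ("utf-8", "utf-16-le", "gbk"):
    raw, digest = md5_hex(text, enc)
    print(enc, raw, digest)
```

MD5 works on bytes, not characters, so two programs only agree on the digest if they agree on the encoding first. If the hex dump printed on one side does not match the other side's dump byte for byte, the digests cannot match either.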

Upvotes: 5
