Razgriz

Reputation: 7343

SHA1 C# result different from Python 3

I have this code in C#:

public static string GenerateSecureHash(string data) {
    HashAlgorithm algorithm = SHA1.Create();
    byte[] hash = algorithm.ComputeHash(Encoding.UTF8.GetBytes(data)); 
    StringBuilder sb = new StringBuilder();
    foreach (byte b in hash) 
        sb.Append(b.ToString("x2"));

    return sb.ToString(); 
}

It generates a SHA-1 hex string from a given data string. I tried porting it to Python 3 with the following code:

def GenerateSecureHash(data: str):
    print('Generate Secure Hash called!')

    enc = str.encode('utf-8') # encode in utf8
    hash = sha1(enc)
    formatted = hash.hexdigest()

    return formatted

But they give different outputs.

For example, if I feed in "testStringHere", here are the outputs:

C#: 9ae24d80c345695120ff1cf9a474e36f15eb71c9
Python: 226cf119b78825f1720cf2ca485c2d85113d68c6

Can anyone point me in the right direction?

Upvotes: 0

Views: 341

Answers (2)

user459872

Reputation: 24562

The issue is here:

enc = str.encode('utf-8')

By doing this you are actually encoding the literal string "utf-8" with the default encoding 'utf-8', not the "testStringHere" string passed in as data. The data argument is never used.

>>> str.encode("utf-8")
b'utf-8'
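
A quick way to confirm this, as a sketch: because the function never reads data, it returns the same digest for every input. The name buggy_hash is hypothetical and only mirrors the function from the question.

```python
from hashlib import sha1

def buggy_hash(data: str) -> str:
    # Mirrors the question's code: `data` is never used.
    enc = str.encode('utf-8')  # equivalent to 'utf-8'.encode() -> b'utf-8'
    return sha1(enc).hexdigest()

# Every input produces the digest of the literal bytes b'utf-8'.
print(buggy_hash("testStringHere") == buggy_hash("anything else"))  # True
print(buggy_hash("testStringHere") == sha1(b"utf-8").hexdigest())   # True
```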

See the documentation of str.encode

>>> help(str.encode)
Help on method_descriptor:

encode(self, /, encoding='utf-8', errors='strict')
    Encode the string using the codec registered for encoding.

    encoding
      The encoding in which to encode the string.
    errors
      The error handling scheme to use for encoding errors.
      The default is 'strict' meaning that encoding errors raise a
      UnicodeEncodeError.  Other possible values are 'ignore', 'replace' and
      'xmlcharrefreplace' as well as any other name registered with
      codecs.register_error that can handle UnicodeEncodeErrors.

You can encode the actual string in either of these ways (inside your function, use data in place of the literal):

enc = str.encode("testStringHere", 'utf-8') # encode in utf8

OR

enc = "testStringHere".encode('utf-8') # encode in utf8

Demo:

>>> from hashlib import sha1
>>> enc = str.encode("testStringHere", 'utf-8')
>>> enc1 = "testStringHere".encode('utf-8')
>>> sha1(enc).hexdigest() == sha1(enc1).hexdigest()
True
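
Applied to the function from the question, the fix might look like the sketch below. The snake_case name generate_secure_hash is an assumption made for illustration; the only real change is calling encode on the data argument instead of on the str class.

```python
from hashlib import sha1

def generate_secure_hash(data: str) -> str:
    """Return the lowercase hex SHA-1 digest of `data` encoded as UTF-8."""
    enc = data.encode('utf-8')    # encode the argument, not the literal 'utf-8'
    return sha1(enc).hexdigest()  # lowercase hex, like C#'s b.ToString("x2")

print(generate_secure_hash("testStringHere"))
# The question reports 9ae24d80c345695120ff1cf9a474e36f15eb71c9 from C# for this input.
```

Since both sides now hash the same UTF-8 bytes and format them as lowercase hex, the Python result should match the C# one.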

Upvotes: 3

emptyflash

Reputation: 1339

I would suspect the issue is the way you're converting the bytes back to a string.

Maybe try something like this answer: https://stackoverflow.com/a/1003289/813503

string result = System.Text.Encoding.UTF8.GetString(byteArray);

Upvotes: 0
