CEGRD

Reputation: 7925

What is the SHA-1 of 255?

Assume you have a SHA-1 implementation that accepts text as input instead of a byte array. (For instance, some JavaScript libraries work like that.)

When you want to apply SHA-1 to text (let's say the text is a password), you first UTF-8 encode it, because the text can contain multi-byte characters. In other words, the integer value of a character in the text can be larger than an 8-bit byte can hold. Since the SHA-1 algorithm works on 8-bit units, it helps to encode the text as UTF-8 first.

My question is this: when you have non-textual binary data, where the value of each byte is between 0 and 255, are you still expected to UTF-8 encode the binary data before you pass it to the SHA-1 algorithm? I know that when the values are between 0 and 127, UTF-8 encoding does not modify the data at all.

However, if the values are between 128 and 255, UTF-8 encoding does modify the data: each such value becomes two bytes.
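
To illustrate that difference, here is a minimal Python sketch. It assumes each byte value is treated as the Unicode code point with the same number, which is roughly what a text-only library would do; the helper name `utf8_of_byte_value` is made up for this example.

```python
def utf8_of_byte_value(value: int) -> bytes:
    """Treat a byte value (0-255) as a code point and UTF-8 encode it."""
    return chr(value).encode("utf-8")


# Values up to 127 stay a single, unchanged byte;
# values from 128 to 255 turn into two bytes.
for value in (0, 65, 127, 128, 255):
    encoded = utf8_of_byte_value(value)
    print(value, encoded.hex(), len(encoded))
```

For 255 this gives the two bytes `c3 bf` rather than the single byte `ff`, so a hash computed over the encoded form sees different input.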

In summary, here is my question: what is the SHA-1 of a single byte containing the value 255 (all 1s)?

With UTF-8 encoding:    730cf30d408ecf51aad876f5c491f837f7ddea4c

Without UTF-8 encoding: 85e53271e14006f0265921d02d4d736cdc580b0b

Which one is the right one?

Upvotes: 1

Views: 443

Answers (1)

Mat

Reputation: 206831

No, don't UTF-8 encode binary data; it makes no sense. If you want a hash of a piece of binary data, you should SHA-1 exactly that data, not some arbitrary transformation of it.

You shouldn't UTF-8 encode strings either unless what you want is the SHA-1 of the UTF-8 representation of that string.
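
As a rough illustration in Python (using the standard `hashlib` module; the variable names are just for this sketch), hashing the raw byte and hashing the UTF-8 encoding of the character U+00FF feed different byte sequences to SHA-1, so the digests differ:

```python
import hashlib

# The binary data itself: one byte with the value 255.
raw = bytes([255])

# The same value treated as a character and UTF-8 encoded: two bytes.
as_text_utf8 = "\u00ff".encode("utf-8")

digest_raw = hashlib.sha1(raw).hexdigest()
digest_utf8 = hashlib.sha1(as_text_utf8).hexdigest()

print("raw byte:     ", raw.hex(), digest_raw)
print("UTF-8 encoded:", as_text_utf8.hex(), digest_utf8)
```

The first digest is the hash of the binary data. The second is the hash of a different, two-byte input, which is only what you want if the thing you are hashing really is the UTF-8 representation of a string.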

Upvotes: 6
