daGrevis

Reputation: 21333

What's the maximum length of a string that can be hashed?

I wonder: what's the maximum length of a string that can be hashed?

For example, hashing Hello, world! with SHA-1 is no problem. But what about a string that's 100,000,000 characters long? Does it even work? Does it somehow increase the likelihood of collisions?

Are there any limits?

Upvotes: 6

Views: 10837

Answers (2)

gbn

Reputation: 432657

Wikipedia lists the maximum message size for SHA-1 as 2^64 − 1 bits. Assuming 16-bit (UTF-16) characters, that works out to 2^60 − 1 characters, or 1,152,921,504,606,846,975 in decimal.
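The arithmetic can be checked with a short sketch. The 16-bit character size is an assumption (it matches the 2^60 − 1 figure); with 8-bit bytes the limit is twice as many units.

```python
# SHA-1 stores the message length in a 64-bit field, so the cap is 2**64 - 1 bits.
MAX_BITS = 2**64 - 1

# Whole 8-bit bytes that fit under the cap.
max_bytes = MAX_BITS // 8

# Whole 16-bit code units that fit (assumed "character" size, e.g. UTF-16).
max_utf16_units = MAX_BITS // 16

print(f"{max_bytes:,} bytes")
print(f"{max_utf16_units:,} 16-bit characters")
```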

Most languages cap string length far lower, commonly at about 2^31 − 1 characters (roughly 2 GB).

Collision probability is governed by the birthday problem; see the "Probability table" section in particular. I'm too lazy to work out the collision probability for SHA-1 over a collection of 100 MB strings...
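For a rough feel, the usual birthday approximation p ≈ 1 − exp(−n(n − 1) / 2^(b+1)) can be evaluated directly. This is only a sketch: it assumes an idealized hash with uniformly distributed b-bit outputs (b = 160 for SHA-1), covers accidental collisions rather than deliberate attacks, and the function name is made up for illustration. Note that the length of each string does not appear in the formula, only how many strings there are.

```python
import math

def birthday_collision_probability(n_items: int, hash_bits: int = 160) -> float:
    """Approximate chance that at least two of n_items share a hash value.

    Assumes an ideal hash whose outputs are uniform over 2**hash_bits values.
    """
    # Number of distinct pairs that could collide.
    pairs = n_items * (n_items - 1) // 2
    # Expected number of colliding pairs; int / int division stays accurate
    # even though both operands are far larger than a float can hold exactly.
    expected = pairs / 2**hash_bits
    # 1 - exp(-expected), computed with expm1 so tiny results are not rounded to 0.
    return -math.expm1(-expected)

# One million strings of any length, hashed with a 160-bit ideal hash.
print(birthday_collision_probability(10**6))
```

Under these assumptions the probability only becomes noticeable once the number of hashed items approaches 2^80.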

Upvotes: 11

user684934

Reputation:

You can hash long inputs. Yes, hash algorithms still work on large inputs, and no, a larger input doesn't increase the collision probability (it just takes longer to hash). Keep in mind that 100 million characters isn't many bytes for a computer, and most hashes in use today are fast: a modern computer would need at most a few seconds to hash a string that long.
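As a minimal sketch, a large input can be hashed in pieces so it never has to sit in memory as one string. The helper names and the 1 MB block size below are arbitrary choices for illustration; `hashlib.sha1` and its `update` method are from Python's standard library.

```python
import hashlib

def sha1_of_chunks(chunks) -> str:
    """Feed an iterable of byte chunks into SHA-1 and return the hex digest."""
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)  # same result as hashing the concatenation in one call
    return h.hexdigest()

def repeated(block: bytes, count: int):
    """Yield the same block `count` times, standing in for a large stream."""
    for _ in range(count):
        yield block

block = b"a" * 1_000_000                       # 1 MB of sample data
digest = sha1_of_chunks(repeated(block, 100))  # 100,000,000 bytes in total
print(digest)
```

The digest is 160 bits regardless of how long the input is, which is why the input length alone has no effect on the collision probability.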

The theoretical limits are so large (for SHA-1, 2^64 − 1 bits) that they never come into play, and the practical limits allow for any reasonable use.

Upvotes: 4
