Reputation: 21333
I wonder... what's the maximum length of a string that can be hashed?
For example, hashing "Hello, world!" with SHA-1 is no problem. But what about a string that's something like 100,000,000 characters long? Does it even work? Does it somehow increase the chance of a collision?
Are there any limits?
Upvotes: 6
Views: 10837
Reputation: 432657
Wikipedia lists the maximum message size for SHA-1 as 2^64 − 1 bits. At 16 bits per character (UTF-16 code units), that works out to 2^60 − 1 characters, or 1,152,921,504,606,846,975 in decimal.
Most languages limit strings to about 2 GB − 1 (2^31 − 1) characters, so you would hit the string limit long before the hash limit.
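To check the arithmetic, here is a small sketch. It assumes 16 bits per character (UTF-16 code units) and a typical string cap of 2^31 − 1 characters; both are assumptions for illustration, and the variable names are mine.

```python
# SHA-1's defined maximum message length, in bits.
MAX_BITS = 2**64 - 1

# Whole bytes and whole 16-bit characters that fit in that many bits.
max_bytes = MAX_BITS // 8
max_utf16_units = MAX_BITS // 16  # assumes 2 bytes per character

# Assumed typical language limit on string length (signed 32-bit index).
typical_string_limit = 2**31 - 1

print(f"max bytes:            {max_bytes:,}")
print(f"max 16-bit chars:     {max_utf16_units:,}")
print(f"typical string limit: {typical_string_limit:,}")
print(f"hash limit is about {max_utf16_units // typical_string_limit:,}x larger")
```

With 8-bit characters the limit would be roughly twice as large (2^61 − 1 characters).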
Collision probability is governed by the birthday problem; see the "Probability table" section in particular. I'm too lazy to work out the collision probability for SHA-1 over a collection of 100 MB strings...
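For a rough idea of the numbers, the usual birthday approximation p ≈ 1 − exp(−n(n−1)/(2N)) can be evaluated directly, with N = 2^160 possible SHA-1 outputs. This is only a sketch: it assumes the hash behaves like a uniformly random function on non-adversarial inputs (it says nothing about deliberately crafted collisions), and the function name is made up for the example. Note that the length of each input does not appear in the formula at all, only the number of items.

```python
import math

def collision_probability(n_items: int, hash_bits: int = 160) -> float:
    """Approximate chance that at least two of n_items random inputs collide.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2N)),
    where N = 2**hash_bits is the number of possible digests.
    """
    n_outputs = 2**hash_bits
    exponent = n_items * (n_items - 1) / (2 * n_outputs)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)

# One billion distinct strings, each of any length.
print(collision_probability(10**9))
# Around 2**80 items the probability becomes substantial.
print(collision_probability(2**80))
```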
Upvotes: 11
Reputation:
You can hash long inputs. Yes, hash algorithms still work on large inputs. No, a larger input doesn't increase the collision probability. (It will take longer to hash, though.) Keep in mind that 100 million characters isn't that many bytes for a computer, and most hashes in use today are fast. A modern computer would take a few seconds at most to hash a string that long.
There are no limits you will run into in practice: SHA-1's defined maximum of 2^64 − 1 bits is far beyond any reasonable use.
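As an illustration, here is a minimal sketch of hashing a 100-million-byte input with Python's standard `hashlib`, feeding it in 1 MB chunks so the whole input never has to sit in memory. The helper name `sha1_of_chunks` and the chunk size are my own choices, not part of any library.

```python
import hashlib
from typing import Iterable

def sha1_of_chunks(chunks: Iterable[bytes]) -> str:
    """Return the SHA-1 hex digest of the concatenation of all chunks."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        # update() may be called repeatedly; the result is the same as
        # hashing all the bytes in a single call.
        hasher.update(chunk)
    return hasher.hexdigest()

# 100 chunks of 1,000,000 bytes each: 100 million bytes in total.
one_megabyte = b"a" * 1_000_000
digest = sha1_of_chunks(one_megabyte for _ in range(100))
print(digest)  # always 40 hex characters, regardless of input size
```

The digest is the same fixed size whether the input is 13 bytes or 100 million, which is why input length alone has no bearing on collision probability.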
Upvotes: 4