How unique are the first 8-12 characters of SHA256 hashes?

Take this hash for example:

ba7816bf 8f01cfea 414140de 5dae2223 b00361a3 96177a9c b410ff61 f20015ad

It's too long for my purposes so I intend to use a small chunk from it, such as:

ba7816bf8f01
ba7816bf

Or similar. My intended use case:

//example.com/video-gallery/lightbox/ba7816bf8f01

I thought I'd SHA256 the URL of the video and use the first few characters as an ad-hoc ID. How many characters from the generated hash should I use to considerably reduce the chance of a collision?

I got the idea from URLs and Hashing by Google.

Upvotes: 12

Views: 10083

Answers (1)

Ry-

Reputation: 224983

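The Wikipedia page on birthday attacks has a table showing how many entries it takes to reach a given chance of collision for a random identifier of a given number of bits. For example, if you expect to store a million documents and want at most a one in a million chance of any collision, you need about 59 bits, so 64 bits (16 hex characters) is enough. By the same approximation, 8 hex characters (32 bits) only keeps the chance that low up to roughly a hundred entries, and 12 hex characters (48 bits) up to roughly twenty thousand.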
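If you would rather compute this for your own numbers than read it off the table, here is a rough sketch using the usual birthday approximation p ≈ 1 − exp(−n(n−1) / 2^(b+1)). The function names are just ones I made up for illustration, and it assumes the truncated hash bits behave like uniformly random bits.

```python
import math

def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that at least two of n_items random IDs of
    the given bit length collide (birthday approximation)."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-n_items * (n_items - 1) / 2 ** (bits + 1))

def bits_needed(n_items: int, max_probability: float) -> int:
    """Smallest bit length whose approximate collision chance for
    n_items stays at or below max_probability."""
    bits = 1
    while collision_probability(n_items, bits) > max_probability:
        bits += 1
    return bits

if __name__ == "__main__":
    # One million documents, one in a million chance of any collision.
    print(bits_needed(10**6, 1e-6))  # 59

    # Each hex character carries 4 bits.
    for hex_chars in (8, 12, 16):
        p = collision_probability(10**6, hex_chars * 4)
        print(f"{hex_chars} hex chars, 1e6 items: p ~ {p:.3g}")
```

Plug in the number of videos you realistically expect and the risk you can tolerate, then round the result up to a whole number of characters.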

Base64 is also a good way to fit more bits into a string of the same length: it takes 1⅓ characters per byte, compared to 2 for hex.
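As a minimal sketch of that idea, assuming Python and its standard `hashlib` and `base64` modules: hash the video URL, keep the first few bytes of the digest, and encode them with the URL-safe Base64 alphabet (which uses `-` and `_` instead of `+` and `/`, so the ID can sit in a path segment). `short_id`, the `n_bytes` default and the example URL are hypothetical choices of mine, not anything from the question.

```python
import base64
import hashlib

def short_id(url: str, n_bytes: int = 9) -> str:
    """Return a short URL-safe ID built from the first n_bytes of
    the SHA-256 digest of url. (Hypothetical helper for illustration.)"""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    encoded = base64.urlsafe_b64encode(digest[:n_bytes])
    # Drop the '=' padding; it carries no information for a fixed n_bytes.
    return encoded.rstrip(b"=").decode("ascii")

if __name__ == "__main__":
    # Placeholder URL, only here to show the call.
    video_url = "https://example.com/videos/some-video.mp4"
    print(short_id(video_url))      # 12 characters carrying 72 bits
    print(short_id(video_url, 6))   # 8 characters carrying 48 bits
```

With `n_bytes=9` you get 12 characters holding 72 bits, where 12 hex characters would hold only 48. Choosing a multiple of 3 bytes avoids padding entirely.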

Upvotes: 9
