ahmet alp balkan
ahmet alp balkan

Reputation: 45196

MD5 etc. as a hash function

Let's say you are planning to design a hash function which will generate keys between 0-256. Will using first 2 digits of MD5-digest be a great idea for a uniform distribution? What do you think on this? Is it expensive to md5() some word (2-10 letters)?

I know it is a rough definition of requirements but it would be great to discuss this.

Upvotes: 4

Views: 1625

Answers (4)

Robert Harvey
Robert Harvey

Reputation: 180787

You could try calculating an 8-bit CRC.

Upvotes: 3

Jonathan Leffler
Jonathan Leffler

Reputation: 753455

You haven't explained how you're going to use the hash, and what you're going to do with the collisions that are inevitable given that you have only 256 output values.

I think even MD5 (which is not cryptographically secure any more) is overkill for the likely applications.

I'd probably go with a CRC (cyclic redundancy check) algorithm that would generate a 16-bit or 32-bit number for you, and would likely give you good enough distribution.

Upvotes: 0

Rafe Kettler
Rafe Kettler

Reputation: 76945

There's no reason to use a cryptographic strength hash for something as simple as generating 3 digit hashes. You're better off using a more simple hash there.

I'm not certain specifically how expensive MD5 is relative to others, but there are plenty of better ways to create a small hash (see this article for some algorithm ideas).

Upvotes: 4

Martin Beckett
Martin Beckett

Reputation: 96119

MD5 is designed to uniformaly spread the input over all the output bytes so it's as good as any other general hash function - sounds like a bit of overkill if you only want 256 values.

Note the output of MD5 is 128bytes (16bytes), it's only the text representation that is hex digits - so there is really no first two digits of MD5 - just use the bottom 8bits.

Upvotes: 1

Related Questions