Malcolm Smith
Malcolm Smith

Reputation: 3580

Does sha-1 ever produce collisions for input messages less than 160bits?

I have a 128bit ID that I want to perform a one way hash on, but I don't want to ever get the same digest for an input message. Does anyone know if sha-1, or an alternative, is guaranteed not to produce collisions for the set of messages less than its output digest size? This is at least theoretically possible...

I also considered using RSA, and discarding the private key to give me a one-way encrypt, but I need to store the result in a 32 char DB field, and the encryption schemes available to me don't produce anything small enough.

Any suggestions of another way of producing a deterministic, non-reversable and collision free transform of the original value are welcome.

Upvotes: 15

Views: 2303

Answers (9)

Will
Will

Reputation: 75615

Your ID is unique and 128 bits.

Your comments explain that you cannot use the ID as-is.

You need it to be unique, not just probably unique. Therefore, you cannot use a hash.

You cannot have both worlds - you cannot have a 1:1 mapping that is not reversible. Its an impossibility.

Encrypting - a bijective operation, so there'll be no collisions - IDs with a secret key will make reversing the ID to determine its original value very hard.

AES has a nice block length of 128 bits which will generate 128 bits output from your 128 bits of input, is faster than old algorithms (!) and is widely available for most platforms and languages. I suggest you use AES for your purpose.

Upvotes: 4

ddyer
ddyer

Reputation: 1788

Hashing is "unlikely" to produce any duplicates, but there are no guarantees. On the other hand, any symmetric encryption scheme will produce 128 bits out for 128 bits in, and guarantee no duplicates.

On the other hand, if you are depending on the hashes being unique, my intuition is you're doing something wrong. If you're using hashes to obfuscate passwords for example, you have to be careful that you don't make the hashed password the de facto password.

Upvotes: 1

President James K. Polk
President James K. Polk

Reputation: 41958

If you want to compute a secret permutation from 128 bits to 128 bits, one simple solution is to use a 128-bit block cipher like AES with a fixed but secret key. You must, of course, be able to keep the key secret forever or you've got nothing.

Upvotes: 7

supercat
supercat

Reputation: 81115

For sufficiently large bit sizes, I think discrete exponentiation is a 1:1 function but reversal is computationally infeasible. I'm not sure how "large" is required though. Code for an unusably slow (but conceptually understandable) implementation:

unsigned long spin_once(unsigned long dat)
{
  if (dat & 1)
    return (dat >> 1);
  return (dat >> 1) ^ SomeMagicNumber;
}

unsigned long hash(unsigned long dat)
{
  unsigned long i,ret;

  if (dat == 0xFFFFFFFF)
    return 0;
  ret = 1;
  for (i=0; i < dat; i++)
    ret = spin_once(ret);
}

This program would take billions of steps to compute the hash for many values of dat, but with trickier code the job can be done in reasonable time. A 32-bit hash is cryptographically worthless, of course, but the approach can be readily extended to any size.

Upvotes: 2

mb14
mb14

Reputation: 22596

According to this article http://www.debian-administration.org/users/dkg/weblog/48,

US gov't federal agencies have been directed to cease all reliance on SHA-1 by the end of 2010

However, as far as I know no collision has been found yet , which means , even people which have tried really hard haven't found any collisions. So it seems reasonable to assume that no collision happen (if you are not dealing with high-security data)

Upvotes: 0

Justin K
Justin K

Reputation: 2694

Isn't the point of a one-way hash that you can't (in general) recover the original data from the hashed value? So why would someone designing a hash function go out of their way to prevent collisions for small inputs?

Instead, it looks like you want to obscure the data, which is fine for some purposes. If it's not practical to use public key encryption, you might as well write your own function.

Upvotes: 0

RHSeeger
RHSeeger

Reputation: 16262

I don't know what hash functions avoid collisions but, if you can't find the answer here, a good starting point might be Perfect Hash Function on Wikipedia. From that page:

A perfect hash function for a set S is a hash function that maps distinct elements in S to distinct integers, with no collisions.

There's a number of links to more information on that page that you may find useful.

That being said, can you say why you need a perfect has function? It may be that there are other ways to accomplish what you want to without needing that property, and someone here may be able to make a suggestion.

Upvotes: 2

quantumSoup
quantumSoup

Reputation: 28122

Does anyone know if sha-1, or an alternative, is guaranteed not to produce collisions

Hash functions were designed not to produce collisions, but nothing is "guaranteed." On the contrary, it is guaranteed that there WILL be collisions, because the message space is practically indefinite, while you have a limited number of possible hashes.

SHA-1 however has been proven to be collision-resistant, and that's the best you can hope for.

Upvotes: 2

Tom Hawtin - tackline
Tom Hawtin - tackline

Reputation: 147124

Cryptographic hashes give a very good approximation of a random number for a given input. So how many random hashes do you need in a room until you get the same 160 bits? It about the square root (disclaimer: I am not a statistician). So you should expect to see clashes at around 80-bits.

I guess practicalities mean you should know when cosmic rays will be a bigger problem than collisions.

Upvotes: 7

Related Questions