Dagrada
Dagrada

Reputation: 1155

Hash UUIDs without requiring ordering

I have two UUIDs. I want to hash them perfectly to produce a single unique value, but with a constraint that f(m,n) and f(n,m) must generate the same hash.

Can anyone help?

Upvotes: 2

Views: 895

Answers (3)

George Garyfallou
George Garyfallou

Reputation: 126

this is what I use for these requirements.

func generateUniqueString(stringsList []string) string {
    // Sort the strings to make order irrelevant
    sort.Strings(stringsList)

    concatenated := strings.Join(stringsList, ",")

    // Compute the hash
    hash := sha256.Sum256([]byte(concatenated))

    // Return the hash as a hexadecimal string
    return fmt.Sprintf("%x", hash)
}

The stringsList is the list of UUIDs as strings.

Hope that helps

Upvotes: 0

Patrick M
Patrick M

Reputation: 10989

To build on user2357112's brilliant solution and boil down the comment chain, let's consider your requirements one by one (and out of order):

  • No collisions

Technically, that's not a hash function. A hash function is about mapping heterogeneous, arbitrary length data inputs into fixed-width, homogenous outputs. The only way to accomplish that if the input is longer than the output is through some data loss. For most applications, this is tolerable because the hash function is only used as a fast lookup key and the code falls back onto the slower, complete comparison of the data. That's why many guides and languages insist that if you implement one, you must implement the other.

Fortunately, you say:

  • Two UUID inputs m and n
  • UUIDs are 128 bits each
  • Output of f(m,n) must be 256 bits or less

Combined your two inputs are exactly 256 bits, which means you do not have to lose any data. If you needed a smaller output, then you would be out of luck. As it is, you can concatenate the two numbers together and generate a perfect, unique representation.

  • f(m,n) and f(n,m) must generate the same hash

To accomplish this final requirement, make a decision on the concatenation order by some intrinsic value of the two UUIDs. The suggested smaller-first works just great. However...

  • The hash does not need to be reversible

If you specifically need irreversible hashing, that's a different question entirely. You could still use the less-than comparison to ensure order independence when feeding to a cryptographically hash function, but you would be hard pressed to find something that guaranteed no collisions even with fixed-width inputs a 256 bit output width.

Upvotes: 2

user2357112
user2357112

Reputation: 280181

Concatenate them with the smaller one first.

Upvotes: 4

Related Questions