Reputation: 581
Given n integer id's, I wish to link all possible sets of up to k id's to a constant value. What I'm looking for is a way to translate sets (e.g. {1, 5}, {1, 3, 5} and {1, 2, 3, 4, 5, 6, 7}) to unique values.
Guarantees:
Additionally, it would be nice (but not obligatory) if "neighboring" sets (drop one id, add one id, swap one id, etc) are easy to reach, as well as "all sets that include at least this set".
Any ideas?
Upvotes: 0
Views: 328
Reputation: 44250
Enumerate using the product of primes.
Now hash(ab) := 6, and hash (abc) := 30
And a nice side effect is that, if "ab" is a subset of "abc", then:
hash(abc) % hash(ab) == 0
and
hash(abc) / hash(ab) == hash(c)
The bad news: You might run into overflow, the 100th prime will probably be around 1000, and 64 bits cannot accomodate 1000**10. This will not affect the functioning as a hash function; only the subset thingy will fail to work. the same method applied to anagrams
The other option is Zobrist-hashing. It is equivalent to the the primes method, but instead of primes you use a fixed set of (random) numbers, and instead of multiplying you use XOR. For a fixed small (it needs << ~70 bits) set like yours, it might be possible to tune the zobrist tables to totally avoid collisions (yielding a perfect hash).
And the final (and simplest) way is to use a (100bit) bitmap, and treat that as a hashvalue (maybe after modulo table size)
And a totally unrelated method is to just build a decision tree on the bits of the bitmap. (the tree would have a maximal depth of k) a related kD tree on bit values
Upvotes: 1
Reputation: 20027
Calculate a 'diff' from each set {1, 6, 87, 89} = {1,5,81,2,0,0,...} {1,2,3,4} = { 1,1,1,1,0,0,0,0... };
Then binary encode each number with a variable length encoding and concatenate the bits.
It's hard to compare the sets (except for the first few equal bits), but because there can't be many large intervals in a set, all possible values just might fit into 64 bits. (slack of 16 bits at least...)
Upvotes: 0
Reputation: 15679
May be not the best solution, but you can do the following:
so if you have {2,5,9,4} first Step->{2,4,5,9}; second->"2459"
This way you will get a unique String from a unique set. If you really need to map them to an integer value, you can hash the string after that.
A second way I can think of is to store them in a java Set and simply map it against a HashMap with set
as keys
Upvotes: 0