Reputation: 93
I have a code that should get unique string(for example, "d86c52ec8b7e8a2ea315109627888fe6228d") from client and return integer more than 2200000000 and less than 5800000000. It's important, that this generated int is not random, it should be one for one unique string. What is the best way to generate it without using DB?
Now it looks like this:
did = "d86c52ec8b7e8a2ea315109627888fe6228d"
min_cid = 2200000000
max_cid = 5800000000
cid = did.hash.abs.to_s.split.last(10).to_s.to_i
if cid < min_cid
cid += min_cid
else
while cid > max_cid
cid -= 1000000000
end
end
Upvotes: 0
Views: 1141
Reputation: 156434
Here's the problem - your range of numbers has only 3.6x10^9 possible values where as your sample unique string (which looks like a hex integer with 36 digits) has 16^32 possible values (i.e. many more). So when mapping your string into your integer range there will be collisions.
The mapping function itself can be pretty straightforward, I would do something such as below (also, consider using only a part of the input string for integer conversion, e.g. the first seven digits, if performance becomes critical):
def my_hash(str, min, max)
range = (max - min).abs
(str.to_i(16) % range) + min
end
my_hash(did, min_cid, max_cid) # => 2461595789
[Edit] If you are using Ruby 1.8 and your adjusted range can be represented as a Fixnum
, just use the hash
value of the input string object instead of parsing it as a big integer. Note that this strategy might not be safe in Ruby 1.9 (per the comment by @DataWraith) as object hash values may be randomized between invocations of the interpreter so you would not get the same hash number for the same input string when you restart your application:
def hash_range(obj, min, max)
(obj.hash % (max-min).abs) + [min, max].min
end
hash_range(did, min_cid, max_cid) # => 3886226395
And, of course, you'll have to decide what to do about collisions. You'll likely have to persist a bucket of input strings which map to the same value and decide how to resolve the conflicts if you are looking up by the mapped value.
Upvotes: 3
Reputation: 35540
You could generate a 32-bit CRC, drop one bit, and add the result to 2.2M. That gives you a max value of 4.3M.
Alternately you could use all 32 bits of the CRC, but when the result is too large, append a zero to the input string and recalculate, repeating until you get a value in range.
Upvotes: 0