How do I convert the output of an MD5 hash into a float between 0 and 1?

Question

In A/B testing, it's fairly common to split users with modulo arithmetic, but that often causes experiments to overlap: if every experiment uses id % 2 == 0 as its split criterion, the same set of users consistently ends up in control (or in the experiment group) across all experiments.

A solution I've heard about is to use hashing. I want to concatenate a user_id with an experiment name, hash the result, and then convert the hash into a float between 0 and 1. I know how to do the hashing (Digest::MD5::hexdigest('test').to_i(16)), but I'm unsure about the next steps for converting it to a float between 0 and 1.

What are the steps?

Huey · Accepted Answer

I figured out the solution by porting the code that's listed here: http://blog.richardweiss.org/2016/12/25/hash-splits.html

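require 'digest'

# to_s lets user_id be an Integer as well as a String
test_id_digest = Digest::MD5.hexdigest(user_id.to_s + experiment_name)
test_id_first_digits = test_id_digest[0..5]   # first six hex digits (24 bits)
test_id_final_int = test_id_first_digits.to_i(16)

# 0xFFFFFF is the largest six-digit hex value, so the result lies in [0, 1]
ab_split = test_id_final_int.to_f / 0xFFFFFF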

The basic idea is to create the digest, take its first six hex digits, convert them to an integer, and divide by 0xFFFFFF, the largest value six hex digits can represent. The result is a float between 0 and 1, inclusive.
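As a rough sketch, the steps above could be wrapped in a pair of helpers like the following. The method names ab_split and variant_for, the ":" separator between the user id and the experiment name, and the 50/50 default share are all assumptions made for illustration; they are not part of the original answer or the linked blog post. The separator is only there so that, for example, user "1" in experiment "2test" does not hash the same string as user "12" in experiment "test".

```ruby
require 'digest'

# Hypothetical helper: maps a (user_id, experiment_name) pair to a float in [0, 1].
# The ":" separator is an assumption added to keep the concatenation unambiguous.
def ab_split(user_id, experiment_name)
  digest = Digest::MD5.hexdigest("#{user_id}:#{experiment_name}")
  # First six hex digits (24 bits) as an integer, scaled by the largest 24-bit value.
  digest[0, 6].to_i(16).to_f / 0xFFFFFF
end

# Hypothetical helper: picks a variant by comparing the float to a threshold.
# treatment_share = 0.5 gives an approximately even split.
def variant_for(user_id, experiment_name, treatment_share = 0.5)
  ab_split(user_id, experiment_name) < treatment_share ? :treatment : :control
end
```

Because the experiment name is part of the hashed string, the same user gets an independent-looking value in each experiment, while repeated calls for the same user and experiment always return the same value. For instance, variant_for(42, "checkout_button") and variant_for(42, "new_onboarding") may differ, but each is stable across calls.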

The referenced blog post goes on to verify the randomness of this approach.
