Reputation: 283053
I'm trying to generate some tokens. They have to be 26 characters along from the alphabet [a-z0-9]
.
The closest solution I've found is part 2 from this answer, but the string won't be uniformly distributed.
If my alphabet was a power of 2 in length, this wouldn't be so hard, but as it stands, I'm not sure how to do this properly.
Specifically, this is what I've got so far:
export function createSessionId() {
const len = 26;
let bytes = new Crypto.randomBytes(len);
let result = new Array(len);
const alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
for(let i=0; i<len; ++i) {
result[i] = alphabet[bytes[i]%alphabet.length];
}
return result.join('');
}
But I'm pretty sure that won't be distributed properly because of the modulo.
Upvotes: 3
Views: 609
Reputation: 6783
Let's see what you got here:
An alphabet of 36 characters, where each character is randomly selected at once by using modulo 36, is indeed not evenly distributed.
You have several options:
Select a whole series of bytes to represent the total string. That means, you need to select enough bytes to represent a number in the range of 0 - 36^26. That way, your value will be evenly distributed (as far as the crypto provider allows for).
If you insist on choosing one digit per time, you want to make sure its value will be distributed evenly, using modulo 36 does not do the job, as you correctly assumed. In this case, you can either
For 2a), the distribution is not perfectly even, but very close to chance, assumed that the crypto provider is fair.
For 2b), the distribution is even. But this will be paid by a higher (and especially unpredictable) runtime when throwing away unneeded results. Of course you can statistically calculate the runtime, but the worst case is an infinite one, if your RNG keeps producing invalid results forever (which is very, very unlikely, but theoretically possible).
My advice is 2a). Take a series of bytes, interpret them as a float value and multiply the result with 36.
Upvotes: 2
Reputation: 283053
Here's my implementation of Psi's answer, 2b.
export function createSessionId() {
const len = 26;
let bytes = Crypto.randomBytes(30); // a few extra in case we're unlucky
let result = new Array(len);
const alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
let i =0;
let j = 0;
for(;;) {
if(i >= bytes.length) {
bytes = Crypto.randomBytes(((len-j)*1.2)|0); // we got unlucky, gather up some more entropy
i = 0;
}
let value = bytes[i++];
if(value >= 252) { // we need a multiple of 36 for an even distribution
continue;
}
result[j++] = alphabet[value % alphabet.length];
if(j >= len) {
break;
}
}
return result.join('');
}
There's less than a 2% chance of needing to reroll (4/255), so I think this ought to be efficient enough.
Hard to unit test something like this, but this passes:
test(createSessionId.name, () => {
let ids = new Set();
let dist = Array.apply(null,new Array(26)).map(() => ({}));
for(let i=0; i<1500; ++i) {
let id = createSessionId();
if(ids.has(id)) {
throw new Error(`Not unique`);
}
ids.add(id);
for(let j=0; j<id.length; ++j) {
dist[j][id[j]] = true;
}
}
for(let i=0; i<26; ++i) {
expect(Object.keys(dist[i]).length).toEqual(36);
}
});
Upvotes: 1