Richard Chen
Richard Chen

Reputation: 25

How to generate random string type primary key, which can auto increase its length?

If my table needs to use string type as its primary key, the length of which is increasable and as short as possible, and when it is available, it should be random in some sense, how can I make that?

For example:
given 26 letters, and the result should be like:

enter image description here

Upvotes: 1

Views: 285

Answers (1)

r3mainer
r3mainer

Reputation: 24567

Assuming you just want a bit of obfuscation rather than proper cryptographic security, I'd suggest using a set of linear congruential generators to transform your integers into non-sequential values that you can then convert into base-26 values where each digit is represented by a letter of the alphabet (e.g., a=0, b=1, ..., z=25).

You'll need a different LCG for strings of each length, but these can be generated quite easily. Also, the input values will have to be adjusted so that, for example, the first two-character string corresponds to an input value of 26. (I'm counting from zero, since this makes the maths a bit more straightforward.)

For example, suppose you start with a value of n=12345. The first thing you need to do is figure out how long the output string needs to be:

n = 12345       # Input value
m = 26          # LCG modulus
k = 1           # Length of output string

while n >= m:
    n -= m
    m *= 26
    k += 1

print(k)        # Should be 3 in this case
print(n)        # Should be 11643 (=12345 - 26 - 26**2)

Next, transform this output value of n with an LCG having a modulus of m=263 (for a 3-character output). For example, you could try a=7541 and c=12127. (Make sure the values you choose correspond to a maximal length sequence according to the Hull–Dobell theorem as described in the Wikipedia article.)

n_enc = (n * 7541 + 12127) % (26**3)    # Should be 2294

In base 26, the mumber 2294 is represented as 3×262 + 10×26 + 6, so the final output will be dkg.

To reverse this process, convert the base-26 string back into an integer, apply the inverse LCG function

n = ((n_enc + 5449) * 3277) % (26**3)   # Should be 11643

and add back on the smaller powers of 26:

while m > 26:
    m //= 26
    n += m

One slight wrinkle in this method is that if the length of your alphabet is not divisible by any squares greater than 1 (e.g., 26 = 2×13 is not divisible by 4, 9 or 16), then the LCG for single-character strings is inevitably going to produce sequential results. You can fix this by using a random permutation of the alphabet to represent the base-26 numbers.

I should also add the standard caveat that random strings of alphabet characters can sometimes spell words that are offensive or inappropriate, so you might want to consider restricting yourself to a disemvowelled alphabet if these strings are going to be visible to users at all.

Upvotes: 1

Related Questions