mauveine
mauveine

Reputation: 33

Making an optimal pseudo random unique code generator for voucher-like behaviour

I'm developing a simple management system for IT tech services. This system has a website where the clients can check the current state of their order by entering their unique code. This code, of course, has to be unique, and hopefully not easy to (manually) bruteforce.

I was thinking of an output somewhere on the lines of "XKF-042", easy to read and write down. The problem arises on the generation of these values: I could use plain random data and generate both pieces of the code, but that forces me to check wether the code already exists or not, which feels like an exponential waste of resources.

A simple answer would be to just begin counting from an arbitrary number, let's say "ABC-001", and add 1, so there is no real need to check if the value already exists. The problem with that is the ease of bruteforcing; anyone could just check ABC-XXX and check the last thousand consecutive orders.

Maths are not my forte, but I know there has to be a more elegant solution to this problem.

I'm thinking about generating every single possible permutation for each side of the code and scramble them, so I have a list of pairs to read from that's seemingly random, and maybe shift the "right side of the code" list every 1000 codes.

EDIT: It's not critical that the codes are impossible to guess; there won't be any personal info on the output besides the order info and the costs. I could use a 4x4 code (like "SJDM-4823") to make it "stronger".

Upvotes: 2

Views: 258

Answers (2)

Peter O.
Peter O.

Reputation: 32898

An article I've written contains advice for unique random identifiers. From your question, it seems you have the following dilemma: Generate random unique identifiers that—

  • Are long enough to be hard to guess, but
  • are short enough to be easy to type by end users.

The advice in that article explains how unique random IDs should be generated (128 bits or longer, using a cryptographic RNG such as random_bytes together with bin2hex in PHP). But for your purposes, the resulting IDs may be too long to be suitable. There are ways to deal with such long IDs, including—

  1. Dividing the ID into memorable chunks (for example: "374528294473" becomes "374-538-294-473"),
  2. Converting the ID to a sequence of memorable words (as in Bitcoin's BIP39), or
  3. Adding a so-called "checksum digit" at the end of the ID to guard against typing mistakes.

You should try (1) or (2) before deciding to use shorter IDs than what the advice in that article recommends.

Also, your application will generally have to check IDs for uniqueness against a database of IDs already generated; this uniqueness check, however, may or may not be a performance bottleneck for your application, and the only way to find out is to try it and see. Alternatively, you can include the ID table's record number (which should be unique for each record) into the ID itself.


There may also be a serious security issue if the order ID is the only thing that grants access to information about that order. Ideally, there should be other forms of authorization, such as allowing only logged-in users or certain logged-in users to access information about the order associated with an order ID. See also this question.

Upvotes: 1

Severin Pappadeux
Severin Pappadeux

Reputation: 20130

Well, I would propose to use LCG, Linear Congruential Generator. It has very nice properties - given right set of constants (Hull–Dobell Theorem) output uniquely covers all 232 space (or 64bit space, but as far as I remember PHP has only 32bit integers). Basically it is 1-to-1 mapper from [0...232) interval to another [0...232) interval. It could be used in two ways

One, IDnext = LCG(IDprev), like typical random numbers generator. Or just feed it from linearly incrementing counter, IDnext = LCG(1, 2, ...). You could convert integer to 8 symbols base-16 string, should be good enough.

Don't have PHP code, have a python one

import numpy as np

class LCG:

    UZERO: np.uint32 = np.uint32(0)
    UONE : np.uint32 = np.uint32(1)

    def __init__(self, seed: np.uint32, a: np.uint32, c: np.uint32) -> None:
        self._seed: np.uint32 = np.uint32(seed)
        self._a   : np.uint32 = np.uint32(a)
        self._c   : np.uint32 = np.uint32(c)

    def next(self) -> np.uint32:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint32:
        return self._seed

    def set_seed(self, seed: np.uint32) -> np.uint32:
        self._seed = seed

    def skip(self, ns: np.int32) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

        nskip: np.uint32 = np.uint32(ns)

        a: np.uint32 = self._a
        c: np.uint32 = self._c

        a_next: np.uint32 = LCG.UONE
        c_next: np.uint32 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next


#%%
np.seterr(over='ignore')

a = np.uint32(1664525)
c = np.uint32(1013904223)
seed = np.uint32(1)

rng = LCG(seed, a, c)

print(rng.next())
print(rng.next())
print(rng.next())

Upvotes: 1

Related Questions