Riley Fitzpatrick
Riley Fitzpatrick

Reputation: 929

How to create randomized user ids with no repeats in MySQL?

I am looking for some insight into efficiency and ease of implementation with a certain goal I have. I have a MySQL table storing user information. When an account is created, they need to get their own unique random 5 digit code from 00000-99999.

Here are the strategies I am considering. Some of them seem difficult (for a beginner like me), so I just want to consult Stack Overflow for any advice.

What I am asking is which (of these or your own strategy) would be best first and foremost in terms of efficiency when creating a new account (don't want to keep users waiting), and secondly, in terms of ease of implementation. Thanks in advance!

Upvotes: 1

Views: 182

Answers (3)

Rick James
Rick James

Reputation: 142366

Build the table with the numbers in random order. (If that is difficult, I will explain.)

BEGIN;
SELECT number FROM tbl LIMIT 1 FOR UPDATE;
DELETE FROM tbl WHERE number = << the value from the SELECT >>
COMMIT;

This will get you the 'next' number, then remove it. The transaction is to avoid two connections from getting the same number.

Upvotes: 0

Uueerdo
Uueerdo

Reputation: 15961

With such a small range, if you must go with one or the other (and not a combination of both), you are probably better off with maintaining a list of unused numbers. Well, really it depends on what you expect your upper limit of use to be in proportion to the total possible values.

If you expect low density, generating a random number and checking it against existing ones is probably your best bet. Your collision rate will be low enough as to not really affect performance.

If you expect high density, keeping the list of unused numbers and making a random selection from that is the better option because the alternative can be downright horrible. Thinking worst case, when only one number remains, using the low density approach would mean you have to generate and throw away half the numbers in the range to have a 50% chance of finding the unused one.

The best approach is to switch approaches as the range fills; at 50% used you'll likely find a usable number within two random number generations; but at some point of fullness (when the remaining numbers are few), it is better to switch over.

But the best approach is to not use random numbers for id values; the examples above are complicated further by needing to deal with race conditions where multiple database clients could be contending for the same unused company id.

Upvotes: 1

APC
APC

Reputation: 146269

The simplest solution would be to define company_id as an automatically incrementing identifier column:

create table USER_DATA (
  company_id int not null auto_increment,
  .... other columns here
  primary key (company_id)
);

This is extremely efficient, which seems to matter to you. Moreover it is a standard approach to generate a series of guaranteed unique numbers, which will make life easier for everybody it the long run. True company_id is monotonically incrementing rather than randomly assigned, but what exactly is the benefit of random IDs? Literally nobody cares whether an ID is a random number.

Upvotes: 1

Related Questions