I am looking for some insight into efficiency and ease of implementation with a certain goal I have. I have a MySQL table storing user information. When an account is created, they need to get their own unique random 5 digit code from 00000-99999. Here are the strategies I am considering. Some of them seem difficult (for a beginner like me), so I just want to consult Stack Overflow for any advice. Populate the company_id column with all 5 digit codes from 00000-99999 in random order. This is what I have done thus far. My issue is that when someone creates an account, I would need to query the database to find the first open row and grab the 5 digit code associated with it. I haven't been able to figure that part out thus far. Populate the company_id column with all 5 digits codes from 00000-99999 in numerical order. When someone creates an account, I would choose a random open row and grab the 5 digit code associated with it. The issue here is choosing a random row that is open. Again, I do not know how I would accomplish this. Create the 5 digit code when the user creates a new account and create a new row in my user_data with that 5 digit code as the company_id column. The issue here is that the program in which a new account is created must be able to access a list of 5 digit codes that are not yet used . I am thinking this may be the best option, by creating an array of 5 digit codes 00000-99999, filling an array with all currently used 5 digit codes from the company_id column, removing all elements in the first array that match elements in the second, and then selecting a random 5 digit code from the altered first array. What I am asking is which (of these or your own strategy) would be best first and foremost in terms of efficiency when creating a new account (don't want to keep users waiting), and secondly, in terms of ease of implementation. Thanks in advance!

Reputation: 929

How to create randomized user ids with no repeats in MySQL?

I am looking for some insight into efficiency and ease of implementation with a certain goal I have. I have a MySQL table storing user information. When an account is created, they need to get their own unique random 5 digit code from 00000-99999.

Here are the strategies I am considering. Some of them seem difficult (for a beginner like me), so I just want to consult Stack Overflow for any advice.

Populate the company_id column with all 5 digit codes from 00000-99999 in random order. This is what I have done thus far. My issue is that when someone creates an account, I would need to query the database to find the first open row and grab the 5 digit code associated with it. I haven't been able to figure that part out thus far.
Populate the company_id column with all 5 digits codes from 00000-99999 in numerical order. When someone creates an account, I would choose a random open row and grab the 5 digit code associated with it. The issue here is choosing a random row that is open. Again, I do not know how I would accomplish this.
Create the 5 digit code when the user creates a new account and create a new row in my user_data with that 5 digit code as the company_id column. The issue here is that the program in which a new account is created must be able to access a list of 5 digit codes that are not yet used. I am thinking this may be the best option, by creating an array of 5 digit codes 00000-99999, filling an array with all currently used 5 digit codes from the company_id column, removing all elements in the first array that match elements in the second, and then selecting a random 5 digit code from the altered first array.

What I am asking is which (of these or your own strategy) would be best first and foremost in terms of efficiency when creating a new account (don't want to keep users waiting), and secondly, in terms of ease of implementation. Thanks in advance!

Upvotes: 1

Answers (3)

Rick James

Reputation: 142366

Build the table with the numbers in random order. (If that is difficult, I will explain.)

BEGIN;
SELECT number FROM tbl LIMIT 1 FOR UPDATE;
DELETE FROM tbl WHERE number = << the value from the SELECT >>
COMMIT;

This will get you the 'next' number, then remove it. The transaction is to avoid two connections from getting the same number.

Upvotes: 0

Uueerdo

Reputation: 15961

With such a small range, if you must go with one or the other (and not a combination of both), you are probably better off with maintaining a list of unused numbers. Well, really it depends on what you expect your upper limit of use to be in proportion to the total possible values.

If you expect low density, generating a random number and checking it against existing ones is probably your best bet. Your collision rate will be low enough as to not really affect performance.

If you expect high density, keeping the list of unused numbers and making a random selection from that is the better option because the alternative can be downright horrible. Thinking worst case, when only one number remains, using the low density approach would mean you have to generate and throw away half the numbers in the range to have a 50% chance of finding the unused one.

The best approach is to switch approaches as the range fills; at 50% used you'll likely find a usable number within two random number generations; but at some point of fullness (when the remaining numbers are few), it is better to switch over.

But the best approach is to not use random numbers for id values; the examples above are complicated further by needing to deal with race conditions where multiple database clients could be contending for the same unused company id.

Upvotes: 1

APC

Reputation: 146269

The simplest solution would be to define company_id as an automatically incrementing identifier column:

create table USER_DATA (
  company_id int not null auto_increment,
  .... other columns here
  primary key (company_id)
);

This is extremely efficient, which seems to matter to you. Moreover it is a standard approach to generate a series of guaranteed unique numbers, which will make life easier for everybody it the long run. True company_id is monotonically incrementing rather than randomly assigned, but what exactly is the benefit of random IDs? Literally nobody cares whether an ID is a random number.

Upvotes: 1

How to create randomized user ids with no repeats in MySQL?

Answers (3)

Related Questions