Reputation: 4335
Let's say you have a random number generator spitting out numbers between 1 and 100 000 000 and you want to store them in a database (MySQL) with the timestamp when they were generaeted. If a number that has previously been seen comes, it is discarded.
What would be the best algorithm to make this happen? SELECT then INSERT as necessary? Is there something more efficient?
Upvotes: 0
Views: 214
Reputation: 30341
If you put a UNIQUE
index on the column with the extracted numbers any INSERT
attempting to duplicate a UNIQUE
key will fail.
Therefore the easiest and most portable version will be (PHP code, but you get the idea):
function extraction() {
do {
$random = generate_random_number();
$result = @mysql_query("INSERT INTO extractions(number) VALUE ($random)");
} while (!$result);
return $random;
}
Upvotes: 0
Reputation: 22905
You can go for a SEQUENCE
:
+
-
You can do a SELECT ...
then INSERT ...
:
+
-
SELECT
and INSERT
and end up with 2 equal numbers;UNIQUE
constraint, then previos situation will lead to an exception;You can choose INSERT ON DUPLICATE KEY UPDATE
, and by now it seems to be the best option (take a look at "INSERT IGNORE" vs "INSERT ... ON DUPLICATE KEY UPDATE"), at least in my view, with the only exception — not portable to other RDBMSes.
P.S. This article is not related to MySQL, but it is worth reading it to get an overview of all the catches that can happen on your way.
Upvotes: 1
Reputation: 1270763
This would normally be solved by creating a unique index on the random number column in the table. You could experiment to see if a b-tree versus a hash has better performance.
If you have lots of memory, you could pre-populate a table with 100,000,000 rows -- all possible values. Then, when you look to see if something is already created, then you only need to see if the time stamp is non-null. However, this would require over a Gbyte of RAM to store the table in memory, and would only be the opimal solution if you are trying to maximize transactions per second.
Upvotes: 0
Reputation: 18368
If you don't need to insert a new random value every time you can use INSERT IGNORE or REPLACE INTO. Otherwise you should SELECT to check and then INSERT.
Upvotes: 1