Reputation: 1201
I have a query like this
INSERT INTO data(data_uuid, ...) VALUES(uuid_v4(), ...);
I have read about on duplicate update but what I am looking for instead of updating the existing column is to reinsert the row with an updated uuid. Is there any constructs that can make life barer-able here?
I keep hitting such an error :
ERROR 1062 (23000): Duplicate entry '0f7018da-4df5-4e22-b995-3b3c3e4e85d3' for key 'data_uuid'
To resolve it temporarily I am padding the UUID however I would like to have cleaner solution that still keeps me with the UUID.
The uuid function looks like this :
CREATE FUNCTION uuid_v4() RETURNS char(36) CHARSET latin1
BEGIN
SET @h1 = LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0');
SET @h2 = LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0');
SET @h3 = LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0');
SET @h6 = LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0');
SET @h7 = LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0');
SET @h8 = LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0');
SET @h4 = CONCAT('4', LPAD(HEX(FLOOR(RAND() * 0x0fff)), 3, '0'));
SET @h5 = CONCAT(HEX(FLOOR(RAND() * 4 + 8)),
LPAD(HEX(FLOOR(RAND() * 0x0fff)), 3, '0'));
RETURN LOWER(CONCAT(
@h1, @h2, '-', @h3, '-', @h4, '-', @h5, '-', @h6, @h7, @h8
));
END
Upvotes: 2
Views: 234
Reputation: 10264
As said in comments, if you are getting collisions with UUID with just a million rows (yes, just) that's because your random uuid generation is using some weak random generator.
Since you now posted your uuid_v4()
function, and it relies on MySQL rand()
, I can explain why your code is failing.
According to mysql docs: http://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
This means that you can't use mysql to generate your uuid, at least not with RAND()
function.
You will need to generate the uuid outside mysql of possible. There's a few libraries for it in many languages:
Always check if the library you choose uses a Cryptographically-safe random generator.
It's possible to generate safe UUID V4 on MySQL side using random_bytes()
function:
This function returns a binary string of len random bytes generated using the random number generator of the SSL library.
So we can update your function to:
CREATE FUNCTION uuid_v4s()
RETURNS CHAR(36)
BEGIN
-- 1th and 2nd block are made of 6 random bytes
SET @h1 = HEX(RANDOM_BYTES(4));
SET @h2 = HEX(RANDOM_BYTES(2));
-- 3th block will start with a 4 indicating the version, remaining is random
SET @h3 = SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3);
-- 4th block first nibble can only be 8, 9 A or B, remaining is random
SET @h4 = CONCAT(HEX(FLOOR(ASCII(RANDOM_BYTES(1)) / 64)+8),
SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3));
-- 5th block is made of 6 random bytes
SET @h5 = HEX(RANDOM_BYTES(6));
-- Build the complete UUID
RETURN LOWER(CONCAT(
@h1, '-', @h2, '-4', @h3, '-', @h4, '-', @h5
));
END
This function should be safe enough to use without care about collisions unless you have a very massive row count.
I've created following test scenario: Insert random UUID v4 as primary key for a table until 40.000.000 rows are created. When a collision is found, the row is updated incrementing collisions
column:
INSERT INTO test (uuid) VALUES (uuid_v4()) ON DUPLICATE KEY UPDATE collisions=collisions+1;
The sum of collisions after 40 million rows with each function is:
+----------+----------------+
| RAND() | RANDOM_BYTES() |
+----------+----------------+
| 55 | 0 |
+----------+----------------+
The number collisions in both scenarios tends to increase as number of rows grows.
Upvotes: 2