Reputation: 43
My objective
I am trying to retrieve multiple random rows that contain only unique userid
but for the type
column to be random - type
can only be 0
or 1
. The table in question will contain less than 1,000 rows at any given time.
My table
CREATE TABLE tbl_message_queue (
userid bigint(20) NOT NULL,
messageid varchar(20) NOT NULL,
`type` int(1) NOT NULL,
PRIMARY KEY (userid,messageid,`type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Sample data
userid | messageid | type
---------------------------------------------------
4353453 | 518423942 | 0
4353453 | 518423942 | 1
2342934 | 748475435 | 0
2342934 | 748475435 | 1
7657529 | 821516543 | 0
7657529 | 821516543 | 1
0823546 | 932843285 | 0
0823546 | 932843285 | 1
What to rule out
Using ORDER BY RAND()
isn't feasible as over at least 18,000 of these types of queries are executed by applications at any given moment and are causing high load. Using SELECT DISTINCT
or GROUP BY
is (obviously) more efficient and will always pick unique userid
but type
will always equal to 0
with an acceptable load.
The common method is to create an id
column but I'm looking for an alternative way only. The group primary key cannot change as it is required and deeply integrated into our application, however the structure of each column can be altered.
Thanks.
Upvotes: 4
Views: 113
Reputation: 57408
My understanding of your question is that for each userid
you have two entries, but want to extract only one, at random.
To achieve this, you ought to generate a random value between 0 and 1 for each unique userid
, and then JOIN this list with the starting list:
SELECT a.* FROM tbl_message_queue AS a
JOIN ( SELECT userid, FLOOR(2*RAND()) AS type
FROM tbl_message_queue GROUP BY userid ) AS b
ON ( a.userid = b.userid AND a.type = b.type );
But if an ORDER BY RAND()
does not work for you, maybe we should compromise.
In the above sequence, any two userids will be uncorrelated -- i.e., the fact that user A gets type 0 tells you nothing about what user B will turn up with.
Depending on the use case, a less random (but "apparently random") sequence could be obtained with two queries:
SELECT @X := FLOOR(2*RAND()), @Y := POW(2,FLOOR(2+14*RAND()))-1;
SELECT * FROM tbl_message_queue WHERE (((userid % @Y) & 1) XOR type XOR @X);
This way, you can get what seems a random extraction. What really happens is that the userids are correlated, and you only have some couple dozens different extractions possible. But using only simple operators, and no JOINs, this query is very fast.
Upvotes: 2