Reputation: 582
I want to extract random rows and I want the number of rows to variate every time I run
How can I make that happen?
This is my table
CREATE TABLE sbb_data (
scope_node VARCHAR(25),
site VARCHAR(20),
name VARCHAR(40),
is_deleted INT,
is_connected INT
);
SELECT name FROM sbb_data;
Output
SBB 1
SBB 2
SBB 3
SBB 4
SBB 5
SBB 6
SBB 7
SBB 8
SBB 9
SBB 10
SBB 11
When I run the following query
SELECT name FROM sbb_data ORDER BY RAND() LIMIT 0,5;
I get something like this
SBB 8
SBB 9
SBB 2
SBB 10
SBB 4
But the number of rows output is always 5
I want it to variate between 0,5
Upvotes: 0
Views: 3032
Reputation: 86706
Pick 5 rows at random, then assign them a new random number; pick rows where the new random number is less than 0.5 (picking, on average, half the numbers, but possibly none, possibly all over them).
SELECT *
FROM (SELECT * FROM sbb_data ORDER BY RAND() LIMIT 0,5) rand5
WHERE RAND() < 0.5
For example: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=c80a90b08e77e5e6448991d589f39df4
Upvotes: 1
Reputation: 1229
SET @nrRows = floor(rand()*6);
PREPARE STMT FROM 'SELECT * FROM sbb_data ORDER BY RAND() LIMIT ?';
EXECUTE STMT USING @nrRows;
Basically what you are asking is to limit based on a variable (@nrRows
), which should take values between [0, 5].
Upvotes: -1
Reputation: 61
SELECT name FROM
( SELECT name,RAND() as rand FROM sbb_data order by RAND() ) as tab
where tab.rand <0.5 LIMIT 5
insert a random value in the query and according to this value the number of rows will change tab.rand <0.5
and LIMIT 5
will limit the max number of row to 5 .
and changing the condition to tab.rand <0.2
will make geting number of rows <5 more probable
Upvotes: 1
Reputation: 1269703
One way you can do this is to insert a bit more randomness:
SELECT name
FROM sbb_data CROSS JOIN
(SELECT COUNT(*) as cnt FROM sbb_data) x
WHERE RAND() < 5.0 / cnt
LIMIT 5;
This selects each row with a probability where "5" is the expected number -- however it could be more than or less than 5. The limit 5
limits this to 5.
As written, this would select 5 rows more often than other numbers. You might want 4.0 / cnt
or 3.0 / cnt
for a more even distribution.
Also, on even modestly sized data, this is much faster than doing an order by rand()
.
Upvotes: 2