Herker
Herker

Reputation: 582

How to ORDER BY RAND() and set LIMIT randomly?

I want to extract random rows and I want the number of rows to variate every time I run
How can I make that happen?

This is my table

CREATE TABLE sbb_data (
    scope_node VARCHAR(25), 
    site VARCHAR(20), 
    name VARCHAR(40),  
    is_deleted INT, 
    is_connected INT
);

SELECT name FROM sbb_data;

Output

SBB 1
SBB 2
SBB 3
SBB 4
SBB 5
SBB 6
SBB 7
SBB 8
SBB 9
SBB 10
SBB 11

When I run the following query

SELECT name FROM sbb_data ORDER BY RAND() LIMIT 0,5;

I get something like this

SBB 8
SBB 9
SBB 2
SBB 10
SBB 4

But the number of rows output is always 5

I want it to variate between 0,5

Upvotes: 0

Views: 3032

Answers (4)

MatBailie
MatBailie

Reputation: 86706

Pick 5 rows at random, then assign them a new random number; pick rows where the new random number is less than 0.5 (picking, on average, half the numbers, but possibly none, possibly all over them).

SELECT *
  FROM (SELECT * FROM sbb_data ORDER BY RAND() LIMIT 0,5) rand5
 WHERE RAND() < 0.5

For example: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=c80a90b08e77e5e6448991d589f39df4

Upvotes: 1

Ergis
Ergis

Reputation: 1229

SET @nrRows = floor(rand()*6);

PREPARE STMT FROM 'SELECT * FROM sbb_data ORDER BY RAND() LIMIT ?';

EXECUTE STMT USING @nrRows;

Basically what you are asking is to limit based on a variable (@nrRows), which should take values between [0, 5].

Upvotes: -1

hakim
hakim

Reputation: 61

SELECT name FROM 
( SELECT name,RAND() as rand FROM sbb_data order by RAND() ) as tab 
where tab.rand <0.5 LIMIT 5

insert a random value in the query and according to this value the number of rows will change tab.rand <0.5 and LIMIT 5 will limit the max number of row to 5 .

and changing the condition to tab.rand <0.2 will make geting number of rows <5 more probable

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269703

One way you can do this is to insert a bit more randomness:

SELECT name
FROM sbb_data CROSS JOIN
     (SELECT COUNT(*) as cnt FROM sbb_data) x
WHERE RAND() < 5.0 / cnt
LIMIT 5;

This selects each row with a probability where "5" is the expected number -- however it could be more than or less than 5. The limit 5 limits this to 5.

As written, this would select 5 rows more often than other numbers. You might want 4.0 / cnt or 3.0 / cnt for a more even distribution.

Also, on even modestly sized data, this is much faster than doing an order by rand().

Upvotes: 2

Related Questions