Fahad Sadah
Fahad Sadah

Reputation: 2418

Select random row(s)

In MySQL I can select X random rows with:

SELECT * FROM table ORDER BY RAND() LIMIT X

This does not work in SQLite. Is there an equivalent?

Upvotes: 70

Views: 40661

Answers (7)

david serero
david serero

Reputation: 190

This work amazing for big tables and getting N rows

WITH row_count AS (
    SELECT COUNT(*) AS total_rows FROM YOURTABLENAME
),
random_ids(rowid, count) AS (
    SELECT ABS(RANDOM() % (SELECT total_rows FROM row_count)) + 1 AS rowid, 1 AS count
    UNION ALL
    SELECT ABS(RANDOM() % (SELECT total_rows FROM row_count)) + 1, count + 1
    FROM random_ids
    WHERE count < 10 -- Replace 10 with the number of random rowids you want
)
SELECT * 
FROM YOURTABLENAME
WHERE rowid IN (SELECT rowid FROM random_ids);

Upvotes: 0

Donnie
Donnie

Reputation: 46913

SELECT * FROM table ORDER BY RANDOM() LIMIT X

Upvotes: 63

Max Shenfield
Max Shenfield

Reputation: 4267

The accepted answer works, but requires a full table scan per query. This will get slower and slower as your table grows large, making it risky for queries that are triggered by end-users.

The following solution takes advantage of indexes to run in O(log(N)) time.

SELECT * FROM table
WHERE rowid > (
  ABS(RANDOM()) % (SELECT max(rowid) FROM table)
)
LIMIT 1;

To break it down

  • SELECT max(rowid) FROM table - Returns the largest valid rowid for the table. SQLite is able to use the index on rowid to run this efficiently.
  • ABS(RANDOM()) % ... - Return a random number between 0 and max(rowid) - 1). SQLite's random function generates a number between -9223372036854775808 and +9223372036854775807. The ABS makes sure its positive, and the modulus operator gates it between max(rowid) - 1.
  • rowid > ... - Rather than using =, use > in case the random number generated corresponds to a deleted row. Using strictly greater than ensures that we return a row with a row id between 1 (greater than 0) and max(rowid) (great than max(rowid) - 1). SQLite uses the primary key index to efficiently return this result as well.

This also works for queries with WHERE clauses. Apply the WHERE clause to both the output and the SELECT max(rowid) subquery. I'm not sure which conditions this will run efficiently, however.

Note: This was derived from an answer in a similar question.

Upvotes: 5

Evhz
Evhz

Reputation: 9246

This one solves the negative RANDOM integers, and keeps good performance on large datasets:

SELECT * FROM table LIMIT 1 OFFSET abs(random() % (select count(*) from table));

where:
abs(random() % n ) Gives you a positive integer in range(0,n)

Upvotes: 3

Davor Josipovic
Davor Josipovic

Reputation: 5504

All answers here are based on ORDER BY. This is very inefficient (i.e. unusable) for large sets because you will evaluate RANDOM() for each record, and then ORDER BY which is a resource expensive operation.

An other approach is to place abs(CAST(random() AS REAL))/9223372036854775808 < 0.5 in the WHERE clause to get in this case for example 0.5 hit chance.

SELECT *
FROM table
WHERE abs(CAST(random() AS REAL))/9223372036854775808 < 0.5

The large number is the maximum absolute number that random() can produce. The abs() is because it is signed. Result is a uniformly distributed random variable between 0 and 1.

This has its drawbacks. You can not guarantee a result and if the threshold is large compared to the table, the selected data will be skewed towards the start of the table. But in some carefully designed situations, it can be a feasible option.

Upvotes: 3

Ali
Ali

Reputation: 22317

For a much better performance use:

SELECT * FROM table WHERE id IN (SELECT id FROM table ORDER BY RANDOM() LIMIT x)

SQL engines first load projected fields of rows to memory then sort them, here we just do a random sort on id field of each row which is in memory because it's indexed, then separate X of them, and find the whole row using these X ids.

So this consume less RAM and CPU as table grows!

Upvotes: 107

bkaid
bkaid

Reputation: 52073

SELECT * FROM table ORDER BY RANDOM() LIMIT 1

Upvotes: 10

Related Questions