munich
munich

Reputation: 530

Select a random row based on a requirement from 50k rows table. Too slow. How can I speed it up?

I have a table with about 50k rows. There's a column called status. Its value can be either 0 or 1. I need to grab a random row that has status = 0.

SELECT * FROM table WHERE status = 0 ORDER BY RAND() LIMIT 1 is obviously too slow.

So, what I did so far was, get the count of the amount of rows that have status = 0 using SELECT COUNT(status) FROM table WHERE status = 0

Now I now that I have 209 rows that have status = 0, but how can I work on these rows?

I'd would like to order these 209 rows, and make my php script get a random row number between the numbers 0 and 209, but I'm not sure how I can achieve that..

Upvotes: 0

Views: 101

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269963

Did you try this?

SELECT t.*
FROM table t
WHERE status = 0
ORDER BY RAND()
LIMIT 1;

Sorting 209 should be quite fast.

EDIT:

Getting a random value efficiently is challenging. If you have an index on table(status), you can try an approach like this:

SELECT t.*
FROM table t cross join
     (select count(*) as allcnt from table where status = 0) const
WHERE status = 0 and
      RAND() < 10/allcnt
ORDER BY RAND()
LIMIT 1;

The rand() in the where clause is quite fast (RAND() is fast, sorting is slow). It should greatly reduce the number of rows to an expected value of 10 -- which means that you are almost certain to get a row (99.99+%, I think) when you run it. This is also simpler than the variable method.

The variable method involves a subquery, which incurs its own overhead for reading and writing the derived table. The idea is to enumerate the rows and then choose one index randomly:

SELECT t.*
FROM (select t.*, @rn := @rn + 1 as rn
      from table t cross join
           (select @rn := 0, @rand := rand()) const
      where status = 0
     ) t cross join
     (select count(*) as allcnt from table where status = 0) const
WHERE floor(1 + @rand * const.allcnt) = rn;

Upvotes: 1

arieljuod
arieljuod

Reputation: 15838

I know this need two queries, but you can give it a try and benchmark it: get the total count of the rows that matches status = 0, get a random number from PHP between 0 and that number-1, do a select query with limit 1 and an offset of the random value generated by php.

Upvotes: 0

Related Questions