DMande
DMande

Reputation: 341

query order by rand() too slow

i have a large table in a database called offers(over 300.000 rows).

when i execute the below query it takes over 3 secs.

$sql = "SELECT * FROM `offers` WHERE (`start_price` / `price` >= 2) ORDER BY RAND() LIMIT 1"; 

Table offers

`id` int(11) NOT NULL,
`title` text NOT NULL,
`description` text NOT NULL,
`image` text NOT NULL,
`price` float NOT NULL,
`start_price` float NOT NULL,
`brand` text NOT NULL

is there any way to make it faster? i want to select one random row (start_price / price >= 2)

Upvotes: 4

Views: 1623

Answers (3)

Craig van Tonder
Craig van Tonder

Reputation: 7687

One option to make this faster is to ensure that you leverage indexing:

How does database indexing work?

http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html

So in this case ensure that you have an index for start_price together with price and in that exact order.

Another way is to optimise the coalition that is in use for the database and tables, so choose utf8mb4 over utf8 and if sorting/localisation is not being an issue for you and you want to be completely anal then general_ci over unicode_ci:

What's the difference between utf8_general_ci and utf8_unicode_ci

Despite the MyISAM storage engine delivering faster read speeds (http://www.rackspace.com/knowledge_center/article/mysql-engines-myisam-vs-innodb) I have found that there are various tweaks available to the InnoDB storage engine that can speed things up more so than I was able to achieve using MyISAM:

https://dba.stackexchange.com/questions/5666/possible-to-make-mysql-use-more-than-one-core?lq=1

So something like the following would be another option:

[mysqld] // Don't play here unless you have read and understand what is going on
innodb_read_io_threads=64
innodb_write_io_threads=64
innodb_buffer_pool_size=2G

Yet another option is to take a look at alternate storage engines: https://www.percona.com/software/mysql-database/percona-server/benchmarks

You could also see the other answers for refactoring of your query :)

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269693

I think your problem is that your query requires a full table scan for the WHERE clause. The order by does make things worse -- depending on the volume that pass the filter.

You might consider storing this number in the table and adding an index to it:

alter table offers add column start_to_price float;

update offers
    set start_to_price = start_price / price;

create index idx_offers_s2p on offers(start_to_price);

Then, your query might be fast:

SELECT o.*
FROM `offers` o 
WHERE start_to_price >= 2
ORDER BY RAND()
LIMIT 1;

If performance is still a problem, then I would be likely to use a where clause first:

SELECT o.*
FROM `offers` o CROSS JOIN
     (select COUNT(*) as cnt from offers where start_to_price >= 2) oo
WHERE rand() <= 10 / cnt
ORDER BY RAND()
LIMIT 1;

This pulls about 10 rows at random and then chooses one of them.

If these don't work, then there are other solutions that get progressively more complicated.

Upvotes: 4

Kickstart
Kickstart

Reputation: 21513

There are alternatives. The one I have used is described here:-

http://jan.kneschke.de/projects/mysql/order-by-rand/

Essentially you generate a random number that is between your min and max id, and then join that against your result set (using >=), with a limit of 1. So you get a result set starting from a random point in your full results and then just grab the first record.

Down side is that if you id fields are not equally distributed then it isn't truly random

Quick example code, assuming your offers table has a unique key called id:-

SELECT offers.* 
FROM offers 
INNER JOIN 
(
    SELECT RAND( ) * ( MAX( Id ) - MIN( Id ) ) + MIN( Id ) AS Id
    FROM offers
    WHERE (`start_price` / `price` >= 2)
) AS r2
ON offers.Id >= r2.Id
WHERE (`start_price` / `price` >= 2) 
ORDER BY offers.Id LIMIT 1

Upvotes: 0

Related Questions