Vahagn

Reputation: 105

How to fix slow random row selection with MySQL

I have a project in PHP + MySQL (a table with over 2,000,000 rows). Please look at this PHP code.

<?php
for ($i = 0; $i < 20; $i++) {
    $start = rand(1, 19980);
    $select_images_url_q = "SELECT * FROM photo_gen WHERE folder='$folder' LIMIT $start,2";
    $result_select = mysql_query($select_images_url_q);
    while ($row = mysql_fetch_array($result_select)) {
        echo '<li class="col-lg-2 col-md-3 col-sm-3 col-xs-4" style="height:150px">
                  <img class="img-responsive" src="http://static.gif.plus/'.$folder.'/'.$row['code'].'_s.gif">
              </li>';
    }
}
?>

This code runs very slowly because of the $start = rand(1,19980); offset. Please help me write this SELECT query using a MySQL random function instead. Thank you.

Upvotes: 1

Views: 291

Answers (3)

Scott Arciszewski

Reputation: 34093

Depending on what your code is doing with $folder, you may be vulnerable to SQL injection.

For better security, consider moving to PDO or MySQLi and using prepared statements. I wrote a library called EasyDB to make it easier for developers to adopt better security practices.
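As a rough sketch (assuming a PDO connection and the table layout from the question; the DSN and credentials are placeholders), a prepared-statement version of that query could look like this:

// Sketch: bind $folder instead of interpolating it into the SQL string.
// DSN and credentials here are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=something', 'username', 'password');

$start = mt_rand(1, 19980);
// The LIMIT offset is cast to int and concatenated; it is not user input.
$stmt = $pdo->prepare(
    'SELECT * FROM photo_gen WHERE folder = ? LIMIT ' . (int) $start . ', 2'
);
$stmt->execute(array($folder));

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // Render $row['code'] as in the question's loop.
}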

The fast, sane, and efficient way to select N distinct random elements from a database is as follows:

  1. Get the number of rows that match your condition (i.e. WHERE folder = ?).
  2. Generate a random number between 0 and this number.
  3. Select a row with a given offset like you did.
  4. Add the ID of the row you just selected to an ever-growing exclusion list, and decrement the row count before the next iteration.

An example that uses EasyDB is as follows:

// Connect to the database here:
$db = \ParagonIE\EasyDB\Factory::create(
    'mysql:host=localhost;dbname=something',
    'username',
    'putastrongpasswordhere'
); 

// Maintain an array of previous record IDs in $exclude
$exclude = array();
$count = $db->single('SELECT count(id) FROM photo_gen WHERE folder = ?', array($folder));

// Select _up to_ 40 values. If we have less than 40 in the folder, stop
// when we've run out of photos to load:
$max = $count < 40 ? $count : 40;

// The loop:
for ($i = 0; $i < $max; ++$i) {
    // The maximum value will decrease each iteration, which makes
    // sense given that we are excluding one more result each time
    $r = mt_rand(0, ($count - $i - 1));

    // Dynamic query
    $qs = "SELECT * FROM photo_gen WHERE folder = ?";

    // We add AND id NOT IN (2,6,7,19, ...) to prevent duplicates:
    if ($i > 0) {
        $qs .= " AND id NOT IN (" . implode(', ', $exclude) . ")";
    }
    $qs .= "ORDER BY id ASC LIMIT ".$r.", 1";

    $row = $db->row($qs, $folder);

    /**
     * Now you can operate on $row here. Feel free to copy the
     * contents of your while($row=...) loop in place of this comment.
     */

    // Prevent duplicates
    $exclude []= (int) $row['id'];
}

Gordon's answer suggests using ORDER BY RAND(), which is in general a bad idea and can make your queries very slow. Furthermore, although he says you shouldn't need to worry about there being fewer than 40 rows (presumably because of the probability involved), this will fail in edge cases.

A quick note about mt_rand(): it's a biased and predictable random number generator with only about 4 billion possible seeds. If you want better results, look into random_int() (PHP 7 only, although I'm working on a compatibility layer for PHP 5 projects; see the linked answer for more information).
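As a minimal sketch (assuming PHP 7 or a polyfill that provides random_int()), swapping the generator inside the loop above is a small change:

// Use random_int() (a CSPRNG) when available, otherwise fall back to mt_rand().
if (function_exists('random_int')) {
    $r = random_int(0, $count - $i - 1);
} else {
    $r = mt_rand(0, $count - $i - 1);
}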

Upvotes: 2

Gordon Linoff

Reputation: 1269583

Actually, even though the table has 2+ million rows, I'm guessing that a given folder has many fewer. Hence, this should be reasonable with an index on photo_gen(folder):

SELECT *
FROM photo_gen 
WHERE folder = '$folder'
ORDER BY rand()
LIMIT 40;
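A sketch of creating that index once (assuming an existing PDO connection $pdo; the index name idx_folder is arbitrary):

// Run once: add the index on photo_gen(folder) so the WHERE clause can use it.
$pdo->exec("ALTER TABLE photo_gen ADD INDEX idx_folder (folder)");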

If a folder can still have tens or hundreds of thousands of examples, I would suggest a slight variation:

SELECT pg.*
FROM photo_gen pg CROSS JOIN
     (SELECT count(*) AS cnt FROM photo_gen WHERE folder = '$folder') AS c
WHERE pg.folder = '$folder' AND
      rand() < 500 / c.cnt
ORDER BY rand()
LIMIT 40;

The WHERE expression should pass roughly 500 rows (subject to the vagaries of sample variation); for example, if the folder has 50,000 matching rows, each one passes with probability 500/50,000 = 1%. There is a really high confidence that at least 40 will survive, so you don't need to worry about it. The final ORDER BY rand() only has to sort those ~500 rows, so it should be fast.

There are definitely other methods, but they are complicated by the where clause. The index is probably the key thing you need for improved performance.

Upvotes: 1

Vlad DX

Reputation: 4730

It's better to compose your SQL query (as a string in PHP) once and then execute it just once.
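A minimal sketch of that idea (assuming a PDO connection $pdo and the table layout from the question): choose the random offsets up front, build one UNION ALL statement, and run a single query:

// Sketch: build the whole statement once, then execute it once.
$parts = array();
for ($i = 0; $i < 20; $i++) {
    $start = mt_rand(1, 19980);
    // Parentheses are required so each LIMIT applies to its own SELECT.
    $parts[] = '(SELECT * FROM photo_gen WHERE folder = ? LIMIT ' . $start . ', 2)';
}
$sql = implode(' UNION ALL ', $parts);

$stmt = $pdo->prepare($sql);
$stmt->execute(array_fill(0, 20, $folder));

foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // Render each $row['code'] as in the question's loop.
}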

Or, if it fits your case, you could use the approach from this question: Select n random rows from SQL Server table

Upvotes: 0
