JohnathanKong

Reputation: 1307

Random Result Pagination

I currently have a listing of data that is displayed in a randomly generated order. Right now there is no pagination, so it is easy to generate the random order on every page load. As the data grows, though, this gets slower, so the obvious solution is pagination. The issue I have with pagination is that I cannot re-randomize the order every time the page loads, so my manager and I have come to the conclusion that a list will have to be pre-generated ahead of time and re-generated every x amount of time. Now the question is how to store this generated list. There are four options that we've come up with:

  1. Session (takes up RAM on the server)
  2. Cookies (more data is transmitted; think thousands of integer IDs sent to each user)
  3. Flat file (implementation would take a bit longer than the rest, though not an extreme amount)
  4. Database (a cron job would run every x amount of time and do a mass update on all records; our worry is that if there is too much data, the update might slow the system down for people hitting the server at the same time)

If there are any other solutions which seem better than pre-generated time-based lists, I'd love to hear about them.

Thanks.

UPDATE: An answer that I really liked, but which was deleted for some reason, mentioned using a SEED: I can store the seed instead of a whole list of IDs, which cuts down my data storage and simplifies everything. I just tested the solution, and it works almost flawlessly. The only problem is with LIMIT: I don't want to generate all the data every time, I just want to use LIMIT offset, count, but when I combine that with a seed the numbers always reset, as they should. Does anyone have any suggestions for that?
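
For illustration, the query shape I mean is something like this (the table and column names are placeholders):

SELECT id, title
FROM items
ORDER BY RAND(42)   -- 42 = the stored per-visitor seed; same seed => same order on every request
LIMIT 20, 10;       -- page 3 at 10 rows per page: skip 20 rows, take 10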

Hopefully that made sense. It made more sense in my head than it does typed out.

Upvotes: 6

Views: 4567

Answers (6)

Irfandi D. Vendy

Reputation: 1004

Here's my working solution based on Kevin's answer:

public function galleries(){
   // On the first page (no 'page' parameter, or page == 1), generate a fresh seed
   if (!Input::has('page') || Input::get('page') == 1){
      $seed = rand(1, 10);             // random seed between 1 and 10
      Session::put('seed', $seed);     // store it in the session as 'seed'
   }

   // Pass the stored seed to RAND() so page 2 is ordered by the same random sequence as page 1
   $orgs = Organization::orderByRaw("RAND(".Session::get('seed').")")->paginate(4);
   return redirect('galleries')->with('orgs', $orgs);
}

Cheers.

Edit: this is from a Laravel project, but you get the idea.

Upvotes: 0

Nick Zulu

Reputation: 331

Using a seed is the best solution, as said above...

By using the following logic you can 'mess up' the data for each separate visitor (using his/her IP) while keeping the pagination results stable for a good amount of navigation time...

//generate an individual seed per visitor
$ip    = $_SERVER['REMOTE_ADDR'];
$hour  = date("H");                       // current hour
$day   = date("j");                       // day of the month
$month = date("n");                       // month
$ip    = str_replace(".", "", $ip);       // strip the dots so the IP reads as one long number
$seed  = ($ip + $hour + $day + $month);   // combine into a numeric seed

//query ($amount and $offset come from your pagination logic)
$query = "SELECT * FROM my_table ORDER BY RAND($seed) LIMIT $amount OFFSET $offset";

Pros: fast and easy, no extra load, no cron needed, unique randomness per visitor.

Cons: the data gets reshuffled if the hour (most likely), the day, or the month changes during navigation.

Upvotes: 0

Massimo

Reputation: 573

I prefer the flat file. Have a look at this caching class taken from OpenCart:

<?php
// Simple file-based cache (from OpenCart). Each entry is stored as
// cache.<key>.<expiry-timestamp>; expired files are purged on construction.
final class Cache {
  private $expire = 3600; // lifetime in seconds

  public function __construct() {
    // Delete any cache file whose expiry timestamp has passed
    $files = glob(DIR_CACHE . 'cache.*');
    if ($files) {
      foreach ($files as $file) {
        $time = substr(strrchr($file, '.'), 1); // expiry timestamp is the last extension
        if ($time < time()) { unlink($file); }
      }
    }
  }

  public function get($key) {
    $files = glob(DIR_CACHE . 'cache.' . $key . '.*');
    if ($files) {
      foreach ($files as $file) {
        $handle = fopen($file, 'r');
        $cache = fread($handle, filesize($file));
        fclose($handle);
        return unserialize($cache);
      }
    }
  }

  public function set($key, $value) {
    $this->delete($key); // drop any previous entry for this key
    $file = DIR_CACHE . 'cache.' . $key . '.' . (time() + $this->expire);
    $handle = fopen($file, 'w');
    fwrite($handle, serialize($value));
    fclose($handle);
  }

  public function delete($key) {
    $files = glob(DIR_CACHE . 'cache.' . $key . '.*');
    if ($files) {
      foreach ($files as $file) {
        unlink($file);
      }
    }
  }
}
?>

It's really simple to use and it works well: run the query once, save the data into the file, and reuse it. Here's an example:

$cache = new Cache();
$data = $cache->get('my_query_key');
if (!$data) {
  // Run the query and put the rows into an array (so shuffle() can be used :P)
  $result = mysql_query('SELECT * FROM items');
  $data = array();
  while ($row = mysql_fetch_assoc($result)) { $data[] = $row; }
  $cache->set('my_query_key', $data);
}
shuffle($data);

The only thing: rumor has it there can be problems when a cached file grows past 100 KB, but I use it, it works very well, and it has never given me a problem. Ah, and in this case there's no need to use RAND() in the query. :P
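
One thing to watch: shuffle() reorders the array on every request, so page 2 would be cut from a different order than page 1. A rough sketch of one way around that, seeding the shuffle from the session (the page size and 'page' parameter are my assumptions):

session_start();
if (!isset($_SESSION['shuffle_seed'])) {
    $_SESSION['shuffle_seed'] = rand();   // one seed per visitor
}
srand($_SESSION['shuffle_seed']);         // same seed => shuffle() produces the same order
shuffle($data);

// slice out the requested page
$perPage  = 10;
$page     = isset($_GET['page']) ? max(1, (int)$_GET['page']) : 1;
$pageRows = array_slice($data, ($page - 1) * $perPage, $perPage);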

I wrote this post without checking the syntax, beware ^^

Upvotes: 0

geocar

Reputation: 9305

Use a subselect. This way you can still use the RAND() trick, but OFFSET won't mess you up.

select * from (select * from items order by rand(3) asc) as b limit 5 offset @x;

Alternate hint: Upgrade your MySQL. This was an old bug.


Example:

mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 1;
+-------+
| id    |
+-------+
| 24621 | 
| 25214 | 
| 27119 | 
| 24672 | 
| 25585 | 
+-------+
5 rows in set (0.01 sec)

mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 2;
+-------+
| id    |
+-------+
| 25214 | 
| 27119 | 
| 24672 | 
| 25585 | 
| 27718 | 
+-------+
5 rows in set (0.01 sec)

mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 3;
+-------+
| id    |
+-------+
| 27119 | 
| 24672 | 
| 25585 | 
| 27718 | 
| 25457 | 
+-------+
5 rows in set (0.00 sec)

mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 4;
+-------+
| id    |
+-------+
| 24672 | 
| 25585 | 
| 27718 | 
| 25457 | 
| 27112 | 
+-------+
5 rows in set (0.01 sec)

mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 5;
+-------+
| id    |
+-------+
| 25585 | 
| 27718 | 
| 25457 | 
| 27112 | 
| 24779 | 
+-------+
5 rows in set (0.02 sec)

Upvotes: 0

Kevin

Reputation: 13097

MySQL's RAND() accepts a seed as an optional argument. With a seed, it returns the same randomized result set each time.

What you could do is generate a random seed in PHP on the first page request and pass it along to each page using a query string.
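
A minimal sketch of that idea (the table name, page size, and parameter names are placeholders):

// First request: create a seed; later pages carry it in the query string (?page=2&seed=123456)
$perPage = 10;
$page    = isset($_GET['page']) ? max(1, (int)$_GET['page']) : 1;
$seed    = isset($_GET['seed']) ? (int)$_GET['seed'] : rand(1, 1000000);

$offset = ($page - 1) * $perPage;
// Same seed on every page => RAND() yields the same ordering, so LIMIT pages through it
// (if LIMIT misbehaves with the seeded RAND(), see the subselect in geocar's answer)
$query = sprintf("SELECT * FROM items ORDER BY RAND(%d) LIMIT %d, %d", $seed, $offset, $perPage);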

Edit: Sorry, I didn't realize the solution was posted already but was deleted.

Upvotes: 4

anonymous coward

Reputation: 12814

Use #4, possibly storing only the IDs of the data to retrieve, in the order they should be retrieved.
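
A sketch of what that could look like (the table and column names are made up; the cron job rebuilds the ordering table every x amount of time):

-- Ordering table, rebuilt by the cron job
CREATE TABLE random_order (
  position INT NOT NULL PRIMARY KEY,
  item_id  INT NOT NULL
);

-- Rebuild: number the ids in a freshly shuffled order
TRUNCATE TABLE random_order;
SET @pos := 0;
INSERT INTO random_order (position, item_id)
SELECT (@pos := @pos + 1), id
FROM (SELECT id FROM items ORDER BY RAND()) AS shuffled;

-- A page read becomes a cheap indexed range scan
SELECT i.*
FROM random_order ro
JOIN items i ON i.id = ro.item_id
WHERE ro.position BETWEEN 21 AND 30   -- page 3 at 10 rows per page
ORDER BY ro.position;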

Better than that, if possible (since this is similar to a scaling issue), is to pre-generate each page's data. For example, if the list is only viewed through a browser, just pre-generate x static HTML pages (or just the table/list portion), as sketched below.
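
For instance, a cron script along these lines could write the static pages (the paths, page size, and markup are made up for illustration):

// Cron script: regenerate static HTML fragments for every page of the random list
$perPage = 10;
$result  = mysql_query('SELECT id, title FROM items ORDER BY RAND()');
$rows    = array();
while ($row = mysql_fetch_assoc($result)) { $rows[] = $row; }

foreach (array_chunk($rows, $perPage) as $i => $pageRows) {
    $html = "<ul>\n";
    foreach ($pageRows as $row) {
        $html .= '<li>' . htmlspecialchars($row['title']) . "</li>\n";
    }
    $html .= "</ul>\n";
    file_put_contents('/var/www/cache/page-' . ($i + 1) . '.html', $html);
}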

I know that this probably sounds ridiculous without further explanation, but think about one of these options.

Also, if the server is taking that hard a hit during generation of the results, the generation should be separated from the web server, and possibly run against a clone/replicated slave of the original database.

Upvotes: 0
