Reputation: 1307
I currently have a listing of data that is displayed in a randomly generated order. Right now there is no pagination, so it is easy for me to generate a randomly ordered list. As my list of data grows, it gets slower because of all the data, so the obvious solution is pagination. The issue I have with pagination is that I cannot randomly generate the order every time the page loads, so my manager and I have come to the conclusion that a list will have to be pre-generated ahead of time and re-generated every X amount of time. Now the issue is: how do we store this generated list? There are four options that we've come up with:
If there are any other solutions which seem better than pre-generated time-based lists, I'd love to hear about them.
Thanks.
UPDATE: An answer that I really liked, but which was deleted for some reason, mentioned using a SEED. That way I can store the seed instead of a list of IDs, which would cut down my data storage and simplify everything. I just tested the solution, and it works almost flawlessly. The only problem is that when I use LIMIT, everything screws up. Does anyone have any suggestions for that? I don't want to have to generate all the data every time; I just want to use LIMIT * , . If I use this with a seed though, the numbers always reset, as they should.
Hopefully that made sense. It made more sense as I was thinking about it than how it turned out typed.
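To illustrate what I mean (the table and column names here are just placeholders), the query I'd like to run for each page is something along these lines:
// same seed for every page of the same list; only the offset changes per page
$seed     = 42;                        // stored somewhere between requests
$per_page = 20;
$offset   = ($page - 1) * $per_page;
$sql = "SELECT * FROM items ORDER BY RAND($seed) LIMIT $per_page OFFSET $offset";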
Upvotes: 6
Views: 4567
Reputation: 1004
Here's my working solution based on Kevin's answer:
public function galleries(){
    // Regenerate the seed only on the first page (no ?page= parameter, or page = 1),
    // so every later page reuses the same seed from the session.
    if (!Input::has('page') || Input::get('page') == 1){
        $seed = rand(1, 10); //generate a random number between 1 and 10
        Session::put('seed', $seed); //then store it in the session as 'seed'
    }
    //put the seed value into RAND(); this way the second page generates random rows with the same seed as the first page
    $orgs = Organization::orderByRaw("RAND(".Session::get('seed').")")->paginate(4);
    //return the view directly; redirecting back to 'galleries' from this route would loop
    return view('galleries')->with('orgs', $orgs);
}
Cheers.
Edit: this is based on a Laravel project, but you get the idea.
Upvotes: 0
Reputation: 331
Using a seed is the best solution, as said above...
By using the following logic you can 'mess up' your data for each separate visitor (using his/her IP) as well as keep your pagination results stable for a good amount of navigation time...
//generate an individual seed per visitor
$ip    = $_SERVER['REMOTE_ADDR'];
$hour  = date("H");
$day   = date("j");
$month = date("n");
$ip    = str_replace(".", "", $ip);     //strip the dots so the IP becomes a plain number
$seed  = ($ip + $hour + $day + $month); //the seed rolls over when the hour/day/month changes
//query
$query = "SELECT * FROM my_table ORDER BY RAND($seed) LIMIT $amount OFFSET $offset";
Pros: fast and easy, no extra load, no cron needed, unique randomness per visitor.
Cons: the data gets re-shuffled if the hour (most likely), day or month changes during navigation.
Upvotes: 0
Reputation: 573
I prefer a random file. Take a look at this caching class, taken from OpenCart:
<?php
final class Cache {
    private $expire = 3600; // cache lifetime in seconds

    public function __construct() {
        // purge expired cache files (the expiry timestamp is the last part of the filename)
        $files = glob(DIR_CACHE . 'cache.*');
        if ($files) {
            foreach ($files as $file) {
                $time = substr(strrchr($file, '.'), 1);
                if ($time < time()) {
                    unlink($file);
                }
            }
        }
    }

    public function get($key) {
        $files = glob(DIR_CACHE . 'cache.' . $key . '.*');
        if ($files) {
            foreach ($files as $file) {
                $handle = fopen($file, 'r');
                $cache = fread($handle, filesize($file));
                fclose($handle);
                return unserialize($cache);
            }
        }
    }

    public function set($key, $value) {
        $this->delete($key);
        $file = DIR_CACHE . 'cache.' . $key . '.' . (time() + $this->expire);
        $handle = fopen($file, 'w');
        fwrite($handle, serialize($value));
        fclose($handle);
    }

    public function delete($key) {
        $files = glob(DIR_CACHE . 'cache.' . $key . '.*');
        if ($files) {
            foreach ($files as $file) {
                unlink($file);
            }
        }
    }
}
?>
It's really simple to use and it works really well: you run your query, save the data into the file, and then shuffle it in PHP. Here's an example.
$cache = new Cache();
$data = $cache->get('my_query_key');
if (!$data) {
    // I do my query and put it into an array (because I can use shuffle :P)
    $result = mysql_query('SELECT * FROM items');
    $data = array();
    while ($row = mysql_fetch_assoc($result)) {
        $data[] = $row;
    }
    $cache->set('my_query_key', $data);
}
shuffle($data);
The only thing is that there can be a problem when saving files larger than about 100kb, but that's just a rumour; I use it and it works very well and has never given me a problem. Ah, and in this case there's no need to use RAND() in the query. :P
I wrote this post without checking the syntax, beware ^^
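Note that calling shuffle($data) on every request re-randomizes the order between page loads. To keep the order stable across pages, a minimal sketch (the page size and the extra cache key are made up) is to shuffle once, cache the shuffled array, and slice each page out of it with array_slice:
// assumes $data already holds the cached query result from the example above
$per_page = 20; // assumed page size
$page = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$shuffled = $cache->get('items_shuffled'); // hypothetical second cache key
if (!$shuffled) {
    $shuffled = $data;
    shuffle($shuffled);                       // shuffle once...
    $cache->set('items_shuffled', $shuffled); // ...and cache the shuffled order
}
// every page slices from the same cached order until the cache expires
$page_rows = array_slice($shuffled, ($page - 1) * $per_page, $per_page);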
Upvotes: 0
Reputation: 9305
Use a subselect. This way you can still use the RAND() trick, but OFFSET won't mess you up.
select * from (select * from items order by rand(3) asc) as b limit 5 offset @x;
Alternate hint: Upgrade your MySQL. This was an old bug.
Example:
mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 1;
+-------+
| id |
+-------+
| 24621 |
| 25214 |
| 27119 |
| 24672 |
| 25585 |
+-------+
5 rows in set (0.01 sec)
mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 2;
+-------+
| id |
+-------+
| 25214 |
| 27119 |
| 24672 |
| 25585 |
| 27718 |
+-------+
5 rows in set (0.01 sec)
mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 3;
+-------+
| id |
+-------+
| 27119 |
| 24672 |
| 25585 |
| 27718 |
| 25457 |
+-------+
5 rows in set (0.00 sec)
mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 4;
+-------+
| id |
+-------+
| 24672 |
| 25585 |
| 27718 |
| 25457 |
| 27112 |
+-------+
5 rows in set (0.01 sec)
mysql> select * from (select id from items order by rand(3) asc) as b limit 5 offset 5;
+-------+
| id |
+-------+
| 25585 |
| 27718 |
| 25457 |
| 27112 |
| 24779 |
+-------+
5 rows in set (0.02 sec)
Upvotes: 0
Reputation: 13097
MySQL's RAND() accepts a seed as an optional argument. Using a seed, it will return the same randomized result set each time.
What you could do is generate a random seed in PHP on the first page request and pass it along to each page using a query string.
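A rough sketch of that idea (plain PHP; the table name, page size and seed range are just placeholders):
// on the first request pick a seed; afterwards reuse the one from the query string
$seed     = isset($_GET['seed']) ? (int) $_GET['seed'] : mt_rand(1, 1000000);
$page     = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$per_page = 20;
$offset   = ($page - 1) * $per_page;
// same seed => same ordering on every page; only the offset changes
$sql = "SELECT * FROM items ORDER BY RAND($seed) LIMIT $per_page OFFSET $offset";
// pagination links have to carry the seed along, e.g.:
$next_link = "?seed=$seed&page=" . ($page + 1);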
Edit: Sorry, I didn't realize the solution was posted already but was deleted.
Upvotes: 4
Reputation: 12814
Use #4, possibly only storing the IDs of the data to retrieve, in the order they should be retrieved.
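For example (a sketch only; the order table and column names are made up, and it uses the same old mysql_* calls as the other answers), the pre-generated order can live in its own table that a cron job rebuilds every X minutes, and each page is just a join against it:
// rebuild step, run every X minutes:
//   CREATE TABLE random_order (
//     position INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
//     item_id  INT NOT NULL
//   );
mysql_query("TRUNCATE random_order");
mysql_query("INSERT INTO random_order (item_id) SELECT id FROM items ORDER BY RAND()");
// serving a page is then a cheap query with a stable order:
$per_page = 20;
$offset   = ($page - 1) * $per_page;
$result = mysql_query(
    "SELECT i.* FROM random_order r
     JOIN items i ON i.id = r.item_id
     ORDER BY r.position
     LIMIT $per_page OFFSET $offset"
);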
Better than that, if possible (since this is similar to a scaling issue), is to pre-generate each page's data. For example, if it's only viewed through a browser, just go ahead and pre-generate x number of static HTML pages (or just the table/list portion).
I know that this probably sounds ridiculous without further explanation, but think about one of these options.
Also, if the server is taking that hard of a hit during generation of the results, it needs to be separated from the web server, and possibly the report/generation run on a clone/replicated slave of the original database.
Upvotes: 0