Reputation: 2184
I have the following data set:
update (id, update_time)
1 <a timestamp>
2 <a timestamp>
item_update (update_id, item_id)
1 327
1 328
1 496
2 345
2 477
What I want to do is display them as follows in a page:
On March 6, 2012: 327 328 496 On July 7, 2012: 345 477
More or less. The thing is that the data set is expected to get pretty big. I can't have all that in a single page. However, if I just limit them to an arbitrary number, then the groups get cut off in the middle. If I limit them by update, then I might get too few in one page and too many in another.
How can I limit the result set so that I get as many groups (updates) as possible without exceeding a certain amount of items?
Upvotes: 1
Views: 148
Reputation: 23125
You can use this solution:
SELECT b.update_time, a.item_id
FROM item_update a
INNER JOIN updte b ON a.update_id = b.id
LEFT JOIN item_update c ON a.update_id = c.update_id
AND a.item_id <= c.item_id
GROUP BY a.update_id, a.item_id
HAVING COUNT(1) <= 2
The 2
in the HAVING
clause is the number of items to limit per group (limited via n highest item_id
per group). You can fiddle around with this solution using the demo:
Notice if you change the 2
to a 3
, you'll get one more result for March 6th.
If you want your results to be displayed hierarchically, you can do this:
SELECT a.val
FROM
(
SELECT CONCAT('- - - -> ', a.item_id) AS val,
CONCAT(b.update_time, a.item_id) AS orderfactor
FROM item_update a
INNER JOIN updte b ON a.update_id = b.id
LEFT JOIN item_update c ON a.update_id = c.update_id AND a.item_id <= c.item_id
GROUP BY a.update_id, a.item_id
HAVING COUNT(1) <= 2
UNION ALL
SELECT DATE_FORMAT(update_time, 'On %M %e, %Y:'),
update_time
FROM updte
) a
ORDER BY a.orderfactor
Upvotes: 0
Reputation: 6221
The following query
select a.id, floor(ifnull(sum(b.update_count),0) / 50) as start_page
from updte as a
left join
(select update_id, case when count(1) > 50 then 50 else count(1) end as update_count
from item_update group by update_id) as b
on a.id > b.update_id
group by a.id
will give you the page number that each update_id
group should start on (assuming that we want to show no more then 50 items on page except when there is more than 50 items in group). Knowing start_page
for each update_id
you can display only relevant update_id
from item_update
.
Upvotes: 1
Reputation: 2184
Found an answer, although it's not pretty.
Basically, I have to first find all the unique update_id
within item_update, as follows:
(SELECT * FROM
(SELECT update_id
FROM item_update
WHERE update_id < :my_last_id_from_previous_page
ORDER BY update_id DESC
LIMIT 0, :limit
) as inner_query
GROUP BY update_id) as outer_query
That resulting table gives me all the groups that have to be returned, so with that the rest is a piece of cake:
SELECT item_id, update.id, update_time FROM outer_query
INNER JOIN item_update ON (item_update.update_id = outer_query.update_id)
INNER JOIN update ON (item_update.update_id = update.id)
Upvotes: 0
Reputation: 828
I'm assuming that you won't want to split a group, then you can simply
select update_id, count(*) cnt from item_update group by update_id;
This will give you counts for each update_id. Then it's just a matter of looping through the update_ids and until your cumulative reaches (or passes) your page max and then firing another query for all those update_ids...
There may be edge cases though where one update group has 1 item and the next has 1000... there it would probably make sense to allow your pagination to break on groups...
page1
group1
item1
item2
item3
group2
item1
item2
page2
group2
item3
item4
etc...
then it's just a matter of using the limit clause with sorting...
Upvotes: 1