Reputation: 11848
Let's say I have a site like Stackoverflow with posts that people can reply to, and I would like to have an interface for sorting the posts by reply count
This will eventually have infinite scroll pagination, so showing 10 results at a time.
Here's an example query for that:
SELECT *, (SELECT COUNT(*)
FROM post_reply pr
WHERE pr.convo_id = post.convo_id) as replies
FROM post
ORDER BY replies
LIMIT 10;
This works, but it is prohibitively slow. I have hundreds of thousands of posts and this causes the query to take >30s to complete.
An index would improve the speed, but I have no idea how to implement an index on a subquery.
A materialized view could also work, but updating the materialized view every time someone replies to a post seems prohibitively slow as well.
Is there a good solution to this problem?
Upvotes: 0
Views: 138
Reputation: 5950
You could change order of queries and first generate list of posts by reply count and then get post
columns. This should use primary key (I'm assuming post.convo_id
is one) and potentially be faster, tho I do not guarantee it will.
SELECT post.*, sub.replies
FROM (SELECT pr.convo_id, COUNT(*) AS replies
FROM post_reply pr
GROUP BY pr.convo_id
ORDER BY replies --maybe DESC if you want top reply count first
LIMIT 10
) AS sub
JOIN post USING(convo_id);
Upvotes: 1
Reputation: 1271141
You cannot really speed up this query.
You can change the data model and use a lot of infrastructure to get a faster sort. The idea is:
post_reply_count
to the posts
table. + 1
for insert
, - 1
for delete
. And whatever is appropriate for update
.This adds overhead. But if you really need a speed response to this query, you may not have a choice.
Upvotes: 1