Ryan Peschel
Ryan Peschel

Reputation: 11848

How to efficiently sort by the results of a subquery?

Let's say I have a site like Stackoverflow with posts that people can reply to, and I would like to have an interface for sorting the posts by reply count

This will eventually have infinite scroll pagination, so showing 10 results at a time.

Here's an example query for that:

SELECT *, (SELECT COUNT(*) 
           FROM post_reply pr 
           WHERE pr.convo_id = post.convo_id) as replies 
FROM post 
ORDER BY replies 
LIMIT 10;

This works, but it is prohibitively slow. I have hundreds of thousands of posts and this causes the query to take >30s to complete.

An index would improve the speed, but I have no idea how to implement an index on a subquery.

A materialized view could also work, but updating the materialized view every time someone replies to a post seems prohibitively slow as well.

Is there a good solution to this problem?

Upvotes: 0

Views: 138

Answers (2)

Łukasz Kamiński
Łukasz Kamiński

Reputation: 5950

You could change order of queries and first generate list of posts by reply count and then get post columns. This should use primary key (I'm assuming post.convo_id is one) and potentially be faster, tho I do not guarantee it will.

SELECT post.*, sub.replies
  FROM (SELECT pr.convo_id, COUNT(*) AS replies
          FROM post_reply pr
         GROUP BY pr.convo_id
         ORDER BY replies --maybe DESC if you want top reply count first
         LIMIT 10
       ) AS sub
  JOIN post USING(convo_id);

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271141

You cannot really speed up this query.

You can change the data model and use a lot of infrastructure to get a faster sort. The idea is:

  1. Add a column post_reply_count to the posts table.
  2. Add an index on this column.
  3. Keep this column up-to-date using triggers -- + 1 for insert, - 1 for delete. And whatever is appropriate for update.
  4. Use this column in your query.

This adds overhead. But if you really need a speed response to this query, you may not have a choice.

Upvotes: 1

Related Questions