Strange behavior when doing where and order by query in postgres

Question

Background: A large table, 50M+, all column in query is indexed.

when I do a query like this:

select * from table where A=? order by id DESC limit 10;

In statement, A, id are both indexed.
Now confusing things happen:

the more rows where returned, the less time whole sql cost
the less rows where returned, the more time whole sql cost

I have a guess here: postgres do the order by first, and then where , so it cost more time to find 10 row in the orderd index when target rowset is small(like find 10 particular sand on beach); opposite， if target rowset is large, it's easy to find the first 10.

Is it right? Or there are some other reason for this?

Final question: How to optimize this situation?

jjanes · Accepted Answer

It can either use the index on A to apply the selectivity, then sort on "id" and apply the limit. Or it can read them already in order using the index on "id", then filter out the ones that meet the A condition until it finds 10 of them. It will choose the one it thinks is faster, and sometimes it makes the wrong choice.

If you had a multi-column index, on (A,id) it could use that one index to do both things, get the selectivity on A and still fetch the already in order by "id", at the same time.

Strange behavior when doing where and order by query in postgres

Answers (2)

Related Questions