Reputation: 20570
SELECT * FROM table1
WHERE (col1, col2) IN (($1, $2), ($3, $4))
ORDER BY col3
LIMIT 10;
Output of EXPLAIN ANALYZE
:
Limit (cost=59174.75..59174.77 rows=10 width=113) (actual time=3632.627..3632.661 rows=10 loops=1)
-> Sort (cost=59174.75..59180.22 rows=2188 width=113) (actual time=3632.623..3632.634 rows=10 loops=1)
Sort Key: col3
Sort Method: top-N heapsort Memory: 27kB
-> Nested Loop (cost=2.62..59127.46 rows=2188 width=113) (actual time=0.234..3561.309 rows=38347 loops=1)
...........
Total runtime: 3632.818 ms
But when I remove the order:
SELECT * FROM table1 WHERE (col1, col2) IN (($1, $2), ($3, $4)) LIMIT 10;
Limit (cost=2.62..272.85 rows=10 width=105) (actual time=0.258..1.143 rows=10 loops=1)
-> Nested Loop (cost=2.62..59127.46 rows=2188 width=105) (actual time=0.255..1.115 rows=10 loops=1)
........
Total runtime: 1.306 ms
btree index on (col1, col2)
and a btree index on col3
. WHERE (col1, col2) IN (($1, $2), ($3, $4)) ORDER BY col3 LIMIT 10;
. (Look ups are always with an IN clause and then order.)Note: Is it possible to create an index on (col1, col2, col3)? That would look up using (col1, col2)
and have col3
already ordered ...
Upvotes: 1
Views: 1383
Reputation: 656391
Yes. You got your answer in the question already.
For the given query, a multicolumn index on (col1, col2, col3)
should be perfect. Just try it.
More about order of columns in a multicolumn B-tree index under this related question on dba.SE:
Is a composite index also good for queries on the first field?
In addition, if you don't actually need all columns from table1
, only put the required ones in the SELECT
list instead of *
to gain performance.
As for your added requirement:
WHERE (col1, col2) IN (($1, $2), ($3, $4))
is equivalent to:
WHERE (col1 = $1 AND col2 = $2 OR
col1 = $3 AND col2 = $4)
This reduces the effectiveness of the index over (col1, col2, col3)
, since Postgres cannot just fetch the pre-sorted list from the index. It depends. The fewer items in your IN
list and the more rows with the same col3
per (col1, col2)
there are, the more you can gain from said index.
You'll have to test. Create the index additionally, be sure your server is configured reasonably, statistics are up to date (ANALYZE
) and your cost settings are reasonable and then EXPLAIN
will show what Postgres picks. Be sure to run a set of queries that represents your use cases. In the end, remove indexes that don't get used.
The sort step seems to be the expensive part. Try this alternative query: one UNION ALL
leg per item in your IN
list. This makes an offer to Postgres that it can't refuse: the special index fits the legs of this query perfectly. And the final sort step is cheap for only a few IN
items.
(
SELECT *
FROM table1
WHERE col1 = $1 AND col2 = $3
ORDER BY col3
LIMIT 10
)
UNION ALL
(
SELECT *
FROM table1
WHERE col1 = $3 AND col2 = $4
ORDER BY col3
LIMIT 10
)
... UNION ALL ...
ORDER BY col3
LIMIT 10
Note that all parentheses are required to allow ORDER BY
and LIMIT
for each leg in addition to the final ORDER BY
and LIMIT
.
Upvotes: 2