Yuri Yaryshev
Yuri Yaryshev

Reputation: 1041

Index scan for multicolumn comparison - non-uniform index column ordering

This question is closely related to Enforcing index scan for multicolumn comparison

The solution there is perfect, but seems to works only if all index columns have same ordering. This question is different because column b is desc here, and this fact stops from using row-syntax to solve the same problem. This is why I'm looking for another solution.

Suppose index is built for 3 columns (a asc, b DESC, c asc), I want Postgres to:

  1. find key [a=10, b=20, c=30] in that B-tree,
  2. scan next 10 entries and return them.

If the index has only one column the solution is obvious:

select * from table1 where a >= 10 order by a limit 10

But if there are more columns the solution becomes much more complex. For 3 columns:

select * from table1
where a > 10 or (a = 10 and (b < 20 or b = 20 and c <= 30))
order by a, b DESC, c
limit 10;

How can I tell Postgres that I want this operation?

And can I be sure that even for those complex queries for 2+ columns the optimizer will always understand that he should perform range-scan? Why?

Upvotes: 2

Views: 314

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 659367

Strictly speaking, your index on (a ASC, b DESC, c ASC) can still be used, but only based on the leading expression a. See:

It's usefulness is limited and Postgres will only use it if the predicate on a alone is selective enough (less than roughly 5% of all rows have a >= 10). (Or possibly to profit from an index-only scans where possible.) But all index tuples qualifying on only a have to be read and you will see a FILTER step in the query plan to discard non-qualifying rows - both adding additional cost. An index on just (a) typically does a better job as it's smaller and cheaper to maintain.

I have tried and failed in the past to make full use of an index with non-uniform sort order (ASC | DESC) like you display for ROW value comparison. I am pretty certain it's not possible. Think of it: Postgres compares whole row values, which can either be greater or smaller, but not both at the same time.

There are workarounds for datatypes with a defined negator (like - for numeric data types). See the solution provided by "The Impaler"! The trick is to invert values and wrap it in an expression index to get uniform sort order for all index expressions after all - which is currently the only way to tap into the full potential of row comparison. Be sure to make both WHERE conditions and ORDER BY match the special index.

Upvotes: 1

The Impaler
The Impaler

Reputation: 48875

PostgreSQL implements tuples very thoroughly, (unlike half implementations found in Oracle, DB2, SQL Server, etc.). You can write your condition using "tuples inequality", as in:

select * 
from table1
where (a, -b, c) >= (10, -20, 30)
order by a, -b, c
limit 10

Please note that since the second column is in descending order, you must "invert" its value during the comparison. That's why it's expressed as -b and also, -20. This can be tricky for non-numeric columns such as dates, varchars, LOBs, etc.

Finally, the use of an index is still possible with the -b column value if you create an ad-hoc index, such as:

create index ix1 on table1 (a, (-b), c);

However, you can never force PostgreSQL to use an index. SQL is a declarative language, not an imperative one. You can entice it to do it by keeping table stats up to date, and also by selecting a small number of rows. If your LIMIT is too big, PostgreSQL may be inclined to use a full table scan instead.

Upvotes: 2

Related Questions