Joe23
Joe23

Reputation: 5782

Multi-Column index not used for index-only scan, but partial index is

My question is why a multi-column index is not used for a index only scan, when a partial index with equivalent information (I think) is.

The table:

CREATE TABLE test 
(
  id      INT,
  descr   TEXT,
  flag    BOOLEAN
);

INSERT INTO test
SELECT GENERATE_SERIES(1,100000) AS id,
       MD5(RANDOM()::TEXT) AS descr,
       (RANDOM() < 0.1) AS flag;

SELECT *
FROM test LIMIT 10;

A content sample:

id  descr   flag
1   81978ceb5514461fbad9af1152ad78f6    true
2   cc0aee68ba3e0095cc74d53e8da55fef    false
3   689a76e5897d565638f8ddd2d2019b7a    true
4   9df03bc2969a6af88cd1d6e0423d0f4c    true
5   318983766d11f831e9f0df34606dc908    false
6   198102bb71640a16f28263b7fb56ba2e    false
7   9bef7320389db46a8ad88ffa611e81b5    false
8   c1f0d637ee0a985aa7d768a78d2d97b1    false
9   781b4064f721ae3879d95579264b0aba    false
10  c4582890bb1e9af430e0f36b50f5e88c    false

The query I need to run is:

SELECT id
FROM test
WHERE flag;

Now if I use a partial index. The query (eventually) gets executed as an index only scan:

CREATE INDEX i1 
  ON test (id) WHERE flag;

QUERY PLAN
Index Only Scan using i1 on test  (cost=0.29..354.95 rows=9911 width=4) (actual time=0.120..6.268 rows=9911 loops=1)
  Heap Fetches: 9911
  Buffers: shared hit=834 read=29
Planning time: 0.806 ms
Execution time: 6.922 ms

What I do not understand is: why a multi column index of the following form is never used for an index-only scan?

CREATE INDEX i2 
  ON test (flag, id);

QUERY PLAN
Bitmap Heap Scan on test  (cost=189.10..1122.21 rows=9911 width=4) (actual time=0.767..5.986 rows=9911 loops=1)
  Filter: flag
  Heap Blocks: exact=834
  Buffers: shared hit=863
  ->  Bitmap Index Scan on i2  (cost=0.00..186.62 rows=9911 width=0) (actual time=0.669..0.669 rows=9911 loops=1)
        Index Cond: (flag = true)
        Buffers: shared hit=29
Planning time: 0.090 ms
Execution time: 6.677 ms

Can't the query visit all consecutive leaves of the btree where flag is True to determine all ids?

(Note that tuple visibility probably is not the problem since the index-only scan is used with the partial index i1.)

My Postgres version is: PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.3.1-14ubuntu2) 5.3.1 20160413, 64-bit

Upvotes: 2

Views: 273

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 246808

Without being able to determine the exact cause of the behavior on your system, it is probably caused by bad statistics or an inaccurate visibility map that mislead the PostgreSQL optimizer.

VACUUM (ANALYZE) test;

will do two things:

  1. It will update the visibility map, which is necessary for PostgreSQL to make an informed decision if an index only scan can be performed or not.

  2. It will collect statistics on the table, which cause PostgreSQL to estimate the number of result rows accurately so that it can choose the best plan.

Upvotes: 1

Related Questions