tdma
tdma

Reputation: 192

Postgresql not using multi-column indexes (btree_gin)

I am having issues making postgres use my multi-column indexes for full search using the btree_gin extension. This is for a search page for articles. The idea behind the use of btree_gin is to be able to get the 'id' field for sorting and magazine_id as a filter:

CREATE INDEX idx_gin_search ON article USING gin(id, magazine_id, search_vector_full) WITH (fastupdate = off);

Postgres decides to use a btree index on magazine instead and then filter (=slow):

Executed SQL
SELECT ••• FROM article WHERE (( (article.search_vector) @@    
(plainto_tsquery('pg_catalog.english', 'interesting'))) AND    
article.magazine_id = 7) ORDER BY article.id ASC LIMIT 36
Time 13.4780406952 ms

QUERY PLAN
Limit  (cost=2021.87..2021.96 rows=36 width=384) (actual time=9.782..9.787 rows=36 loops=1)
  ->  Sort  (cost=2021.87..2027.49 rows=2248 width=384) (actual time=9.781..9.784 rows=36 loops=1)
    Sort Key: id
    Sort Method: top-N heapsort  Memory: 53kB
    ->  Index Scan using idx_magazine_id on article (cost=0.29..1952.53 rows=2248 width=384) (actual time=0.035..8.924 rows=2249 loops=1)
          Index Cond: (magazine_id = 7)
          Filter: (search_vector @@ '''interesting'''::tsquery)
          Rows Removed by Filter: 11413
Planning time: 4.600 ms
Execution time: 9.860 ms

Then, what I find even understand less, is that it also refuses to use this simple btree index on the LIST page for articles, where they are just listed x per page in descending order:

CREATE INDEX idx_btree_listing ON article USING btree(id DESC, magazine_id);

Again it doesn't use the multi-column index:

Executed SQL
SELECT ••• FROM article WHERE article.magazine_id = 7
ORDER BY article.id DESC LIMIT 36
Time 1.4750957489 ms

QUERY PLAN
Limit  (cost=0.29..7.48 rows=36 width=384) (actual time=0.034..0.115 rows=36 loops=1)
->  Index Scan Backward using idx_magazine_id on article  (cost=0.29..2729.56 rows=13662 width=384) (actual time=0.031..0.107 rows=36 loops=1)
    Filter: (magazine_id = 7)
    Planning time: 1.354 ms
    Execution time: 0.207 ms

EDIT: The above is a development setup with less records and only 1 magazine, hence the fast speed. Here is a log produced by auto_explain on the production server:

duration: 230.629 ms  plan:
SELECT article.id, article.title, article.date, article.content FROM article WHERE article.magazine_id = 7 ORDER BY article.id DESC LIMIT 36

Limit  (cost=0.42..43.67 rows=36 width=306) (actual time=229.876..229.995 rows=36 loops=1)
    ->  Index Scan Backward using idx_magazine_id on article (cost=0.42..239539.22 rows=199379 width=306) (actual time=229.866..229.968 rows=36 loops=1)
    Filter: (article.magazine_id = 7)
    Rows Removed by Filter: 116414

I would be grateful anyone could give me tips for my further debugging on this one.

Upvotes: 3

Views: 2282

Answers (1)

Eelke
Eelke

Reputation: 21993

The first column in your multi column index is id. You do not filter on id so postgres won't use that index. You do not have to filter on all columns in the index but the columns on which you do filter have to be the first n columns in the index.

Try experimenting with variations of the index you have like moving id to the end or ommiting id from the index.

Upvotes: 2

Related Questions