Reputation: 1747
I have a table "offer_facts" with many columns including product_dimension_id ( foreign key for product dimension table) and source_name ( varchar ). Both of these columns are indexed . At the moment approximately there are 120K rows in this table. This table is constantly growing ( around 20K per day).
Below is the query and the output I get .
SELECT version() "PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit"
EXPLAIN (ANALYZE, BUFFERS) SELECT DISTINCT product_dimension_id from offer_facts WHERE source_name='customer_conti'
And the output is
HashAggregate (cost=36619.24..36621.37 rows=213 width=4) (actual time=2654.272..2655.064 rows=399 loops=1)
Group Key: product_dimension_id
Buffers: shared hit=2697 read=17687
-> Seq Scan on offer_facts (cost=0.00..35425.82 rows=477367 width=4) (actual time=0.021..1525.361 rows=479880 loops=1)
Filter: ((source_name)::text = 'customer_conti'::text)
Rows Removed by Filter: 723466
Buffers: shared hit=2697 read=17687
Planning time: 0.201 ms
Execution time: 2655.778 ms
I am not sure why it is doing Seq Scan and not Index Scan .
I have created index with
CREATE INDEX idx_offer_facts_dimensions ON offer_facts USING btree (source_name COLLATE pg_catalog."default", shop_dimension_id, time_dimension_id, date_dimension_id, source_dimension_id, product_dimension_id);
and I have vaccuumed and analyzed the table.
Upvotes: 0
Views: 198
Reputation: 1269753
This is your query:
SELECT DISTINCT product_dimension_id
FROM offer_facts
WHERE source_name = 'customer_conti' ;
The optimal index is a composite index: offer_facts(source_name, product_dimension_id)
. Individual indexes on each column are not as useful. This query can make use of an index scan; Postgres should be smart enough to find that execution path.
Upvotes: 1