Reputation: 43
I have a product
table and it has an indexed column manufacturer_id
. The table has 30M rows. I want to check if a specific row exists in the table. Queries I used:
select exists(select 1 from product where manufacturer_id = '0');
select 1 from product where manufacturer_id = '0' fetch first row only;
select 1 from product where manufacturer_id = '0' limit 1;
Each query takes 4 minutes. There are 12M rows with manufacturer_id = '0'
in the table. However, if I run the same query with manufacturer_id = '1234567'
(which has 2 rows in the table), it returns the result in 250 milliseconds.
I would expect that checking row existence doesn't take long. Or, am I mistaken and it is normal to take that long given the table volume? If it is not normal, how could I improve the performance?
Execution plan:
Result (cost=0.07..0.08 rows=1 width=1) (actual time=0.015..0.016 rows=1 loops=1)
Buffers: shared hit=1
InitPlan 1 (returns $0)
-> Seq Scan on product (cost=0.00..914306.95 rows=12306868 width=0) (actual time=0.014..0.014 rows=1 loops=1)
Filter: ((manufacturer_id)::text = '0'::text)
Buffers: shared hit=1
Planning Time: 0.093 ms
Execution Time: 0.036 ms
Create Index statement: (this is the only index other than the primary key)
create index product_manufacturer_id
on product (manufacturer_id);
manufacturer_id: varchar
type and indexed
column.
Postgres Version: 12
Upvotes: 1
Views: 2521
Reputation: 1270873
You want an index on manufacturer_id
:
create index idx_product_manufacturer_id on product(manufacturer_id)
You may want additional keys after manufacturer_id
to support other queries. But manufacturer_id
needs to be the first column
Also, you need to be careful about types. So, you want the comparison to use the same type -- double quotes mean that the comparison is to a string not an integer. And even if they are both strings, the index may not be used if the collations are not the same.
Without an index, Postgres has to scan the table to find a match. Apparently, '1234567'
shows up quickly in whatever order Postgres is scanning. But '0'
does not.
An index fixes this problem. And it will also be helpful for joining to the manufacturers
table.
Upvotes: 2