Sagar
Sagar

Reputation: 5596

Multiple index available in where clause, which one Postgres chooses?

I have table

CREATE TABLE IF NOT EXISTS employeedb.employee_tbl
(
    id                         integer                  NOT NULL GENERATED ALWAYS AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
    dm_company_id              integer                  NOT NULL,
    customer_employee_id       text                     NOT NULL),
    personal_mobile            text,
    CONSTRAINT employee_pkey PRIMARY KEY (id),
    CONSTRAINT employee_id_company_constraint UNIQUE (dm_company_id, customer_employee_id)
 

and index on personal_mobile like

CREATE INDEX company_id_personal_mobile_index ON employeedb.employee_tbl (dm_company_id, personal_mobile COLLATE "C");

Now when I hit following query

explain analyse select * from employeedb.employee_tbl where dm_company_id = 2011 and employee_tbl.personal_mobile like '+1%';

I always see employee_id_company_constraint index is used.

Index Scan using employee_id_company_constraint on employee_tbl  (cost=0.14..8.16 rows=1 width=641) (actual time=0.056..0.093 rows=2 loops=1)
  Index Cond: (dm_company_id = 2011)
  Filter: (personal_mobile ~~ '+1%'::text)
  Rows Removed by Filter: 28
Planning Time: 0.951 ms
Execution Time: 0.173 ms

Why Postgres won't use company_id_personal_mobile_index which is more efficient than employee_id_company_constraint for above query ?

Upvotes: 0

Views: 50

Answers (1)

jjanes
jjanes

Reputation: 44287

There is an important piece of information we can't see from your plan, which is how many rows it thinks will be removed by the filter. That vital data is just not reported.

You can run this query to probe what the planner thinks is going on:

explain analyse select * from employeedb.employee_tbl where dm_company_id = 2011;

We would need to know how many rows it expects to find from this simplified query, and how many it actually finds (but presumably it will actually find 30, which is the 2 found + the 28 filtered from your current plan).

If it thinks that dm_company_id = 2011 will only match 1 row, then it doesn't make sense (according to the planners current way of thinking) to use the more complicated index to rule that single row in or out, it should be faster to just test that one row and see.

If my theory here is the correct, then the question becomes, why does it expect to find 1 row when the real answer is 30? Maybe your stats are out of date, maybe something else is going on. An ANALYZE should fix the first, the 2nd will require more investigation--which isn't warranted until my theory is confirmed and the ANALYZE fails to correct the problem.

Upvotes: 3

Related Questions