Reputation: 36
I have a large (~110 million rows) table on PostgreSQL 12.3 whose relevant fields can be described by the following DDL:
CREATE TABLE tbl
(
item1_id integer,
item2_id integer,
item3_id integer,
item4_id integer,
type_id integer
)
One of the queries we execute often is:
SELECT type_id, item1_id, item2_id, item3_id, item4_id
FROM tbl
WHERE
type_id IS NOT NULL
AND item1_id IN (1, 2, 3)
AND (
item2_id IN (4, 5, 6)
OR item2_id IS NULL
)
AND (
item3_id IN (7, 8, 9)
OR item3_id IS NULL
)
AND (
item4_id IN (10, 11, 12)
OR item4_id IS NULL
)
Although we have indexes for each of the individual columns, the query is still relatively slow (a couple of seconds). Hoping to optimize this, I created the following index:
CREATE INDEX tbl_item_ids
ON public.tbl USING btree
(item1_id ASC, item2_id ASC, item3_id ASC, item4_id ASC)
WHERE type_id IS NOT NULL;
Unfortunately the query performance barely improved. EXPLAIN tells me this is because, although an index scan is done with the newly created index, only item1_id is used in the Index Cond, whereas all the other conditions are applied at the table level (i.e. as a plain Filter).
I'm not sure why the index is not used in its entirety (or at least for more than the item1_id column). Is there an obvious reason for this? Is there a way I can restructure the index or the query itself to help with performance?
Upvotes: 2
Views: 524
Reputation: 246268
A multi-column index can only be used for more than the first column if the condition on the first column is an equality comparison (=). An IN (or = ANY) condition does not qualify.
So you will be better off with individual indexes on each column, which PostgreSQL can combine with a bitmap OR (a BitmapOr node in the plan).
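As a sketch of that setup (the partial `WHERE type_id IS NOT NULL` predicate is an assumption carried over from the question's query; drop it if other queries need these indexes too):

```sql
-- One single-column index per filtered column; the planner can scan
-- several of them and combine the results via BitmapOr/BitmapAnd.
CREATE INDEX tbl_item1_idx ON tbl (item1_id) WHERE type_id IS NOT NULL;
CREATE INDEX tbl_item2_idx ON tbl (item2_id) WHERE type_id IS NOT NULL;
CREATE INDEX tbl_item3_idx ON tbl (item3_id) WHERE type_id IS NOT NULL;
CREATE INDEX tbl_item4_idx ON tbl (item4_id) WHERE type_id IS NOT NULL;
```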
You should try to avoid OR in the WHERE condition, perhaps by writing
WHERE coalesce(item2_id, -1) IN (-1, 4, 5, 6)
where -1 is a value that doesn't occur in the data. Then you could use an index on the coalesce expression.
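Putting that together, a sketch (the sentinel -1 is an assumption; substitute any value that can never appear in these columns, and note the query must repeat the exact same coalesce expression for the index to match):

```sql
-- Expression indexes: NULL is folded into the sentinel -1,
-- so the "... OR col IS NULL" branch disappears from the query.
CREATE INDEX tbl_item2_coalesce_idx ON tbl (coalesce(item2_id, -1));
CREATE INDEX tbl_item3_coalesce_idx ON tbl (coalesce(item3_id, -1));
CREATE INDEX tbl_item4_coalesce_idx ON tbl (coalesce(item4_id, -1));

-- Rewritten query using the identical expressions:
SELECT type_id, item1_id, item2_id, item3_id, item4_id
FROM tbl
WHERE type_id IS NOT NULL
  AND item1_id IN (1, 2, 3)
  AND coalesce(item2_id, -1) IN (-1, 4, 5, 6)
  AND coalesce(item3_id, -1) IN (-1, 7, 8, 9)
  AND coalesce(item4_id, -1) IN (-1, 10, 11, 12);
```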
Upvotes: 2