Reputation: 1486
I'm researching now on creating indexes for our tables.
I found out about multicolumn indexes but I'm not sure about the impact. Example:
We have SQLs on findById
, findByIdAndStatus
, findByResult
.
It says that the most used on WHERE
should be listed first in the columns list. But I was wondering if it'll have a huge impact if I create an index on different combinations where clauses.
This: (creating one index for all)
CREATE INDEX CONCURRENTLY ON Students (id, status, result)
vs.
This: (creating different indexes on different queries)
CREATE INDEX CONCURRENTLY ON Students (id)
CREATE INDEX CONCURRENTLY ON Students (status)
CREATE INDEX CONCURRENTLY ON Students (result)
Thank you so much in advance!
Upvotes: 15
Views: 18618
Reputation: 246163
The rule of thumb you read is wrong.
A better rule is: create such an index only if it is useful and gets used often enough that it is worth the performance hit on data modification that comes with every index.
A multi-column B-tree index on (a, b, c)
is useful in several cases:
if the query looks like this:
SELECT ... FROM tab
WHERE a = $1 AND b = $2 AND c <operator> $3
where <operator>
is an operator supported by the index and $1
, $2
and $3
are constants.
if the query looks like this:
SELECT ... FROM tab
WHERE a = $1 AND b = $2
ORDER BY c;
or like this
SELECT ... FROM tab
WHERE a = $1
ORDER BY b, c;
Any decorations in the ORDER BY
clause must be reflected in the CREATE INDEX
statement. For example, for ORDER BY b, c DESC
the index must be created on (a, b, c DESC)
or (a, b DESC, c)
(indexes can be read in both directions).
if the query looks like this:
SELECT c
FROM tab
WHERE a = $1 AND b <operator> $2;
If the table is newly VACUUM
ed, this can get you an index only scan, because all required information is in the index.
In recent PostgreSQL versions, such an index in better created as
CREATE INDEX ON tab (a, b) INCLUDE (c);
Upvotes: 6
Reputation: 4567
Creating one index for all and creating different indexes will have completely different impact on the queries.
You can use EXPLAIN to see if indexes are getting used for the queries.
This video is really good to know about DB indexes.
Index CREATE INDEX CONCURRENTLY ON Students (id, status, result) will be used only and only if query uses id, (id,status) or (id, status and result) in WHERE clause. a query with status in Where will not use this index at all.
Indexes are basically balanced binary trees. A multicolumn index will index rows by id, then rows ordered by id's are further indexes by status and then with result and so on. You can see that in this index, the ordering via status is not present at all. It is only available on rows indexed by the id's first.
Do have the look at video, it explains all this pretty well.
Upvotes: 16