mengmeng
mengmeng

Reputation: 1486

Creating multicolumn indexes in PostgreSQL

I'm researching now on creating indexes for our tables.

I found out about multicolumn indexes but I'm not sure about the impact. Example:

We have SQLs on findById, findByIdAndStatus, findByResult.

It says that the most used on WHERE should be listed first in the columns list. But I was wondering if it'll have a huge impact if I create an index on different combinations where clauses.

This: (creating one index for all)

CREATE INDEX CONCURRENTLY ON Students (id, status, result)

vs.

This: (creating different indexes on different queries)

CREATE INDEX CONCURRENTLY ON Students (id)

CREATE INDEX CONCURRENTLY ON Students (status)

CREATE INDEX CONCURRENTLY ON Students (result)

Thank you so much in advance!

Upvotes: 15

Views: 18618

Answers (2)

Laurenz Albe
Laurenz Albe

Reputation: 246163

The rule of thumb you read is wrong.
A better rule is: create such an index only if it is useful and gets used often enough that it is worth the performance hit on data modification that comes with every index.

A multi-column B-tree index on (a, b, c) is useful in several cases:

  1. if the query looks like this:

    SELECT ... FROM tab
    WHERE a = $1 AND b = $2 AND c <operator> $3
    

    where <operator> is an operator supported by the index and $1, $2 and $3 are constants.

  2. if the query looks like this:

    SELECT ... FROM tab
    WHERE a = $1 AND b = $2
    ORDER BY c;
    

    or like this

    SELECT ... FROM tab
    WHERE a = $1
    ORDER BY b, c;
    

    Any decorations in the ORDER BY clause must be reflected in the CREATE INDEX statement. For example, for ORDER BY b, c DESC the index must be created on (a, b, c DESC) or (a, b DESC, c) (indexes can be read in both directions).

  3. if the query looks like this:

    SELECT c
    FROM tab
    WHERE a = $1 AND b <operator> $2;
    

    If the table is newly VACUUMed, this can get you an index only scan, because all required information is in the index.

    In recent PostgreSQL versions, such an index in better created as

    CREATE INDEX ON tab (a, b) INCLUDE (c);
    

Upvotes: 6

mahoriR
mahoriR

Reputation: 4567

Creating one index for all and creating different indexes will have completely different impact on the queries.

  1. You can use EXPLAIN to see if indexes are getting used for the queries.

  2. This video is really good to know about DB indexes.

  3. Index CREATE INDEX CONCURRENTLY ON Students (id, status, result) will be used only and only if query uses id, (id,status) or (id, status and result) in WHERE clause. a query with status in Where will not use this index at all.

Indexes are basically balanced binary trees. A multicolumn index will index rows by id, then rows ordered by id's are further indexes by status and then with result and so on. You can see that in this index, the ordering via status is not present at all. It is only available on rows indexed by the id's first.

Do have the look at video, it explains all this pretty well.

Upvotes: 16

Related Questions