Reputation: 1162
I have a large amount of data (500+ mil rows) in a table that I need to filter/query in real-time. I haven't been able to get satisfactory performance OR predictable query plans using regular b-tree indexes. I thought that using a BRIN would help a lot, but because our data cannot be inserted in any controlled physical ordering that I need to query by, I have set up a MATERIALIZED VIEW
to select the data (including joined data) and sort it in a specific order. Something along the lines of...
CREATE MATERIALIZED VIEW my_view AS
SELECT a.one, b.two, b.three, c.four, c.five, c.six
FROM a, b, c WHERE ...joins
ORDER BY b.three, b.two, a.one, c.four;
I then created the index based on multiple columns, since all specified columns will always be used in the single query this view is meant for.
CREATE INDEX my_view_idx ON my_view
USING BRIN (three, two, one, four) WITH (pages_per_range = 64);
I ordered the columns (both in the table and in the BRIN) based on selectivity, meaning b.three
will filter out 80% of the records (ie. only 20% of records will match), b.two
will filter out 70%, etc.
Was ordering the BRIN columns the same as the physical sorting of the table necessary? I cannot find any resources that describe this. The closest thing I could find was from: https://www.postgresql.org/docs/10/indexes-multicolumn.html ...
A multicolumn BRIN index can be used with query conditions that involve any subset of the index's columns. Like GIN and unlike B-tree or GiST, index search effectiveness is the same regardless of which index column(s) the query conditions use.
... but that doesn't describe column ordering, only inclusion in a query.
I could experiment (and have been, with surprisingly good results), but it's a slow process as it takes 2+ hours to materialize the view and build the index, so I would love to have some sort of factual basis for my guessing to avoid wasting lots of time.
Upvotes: 8
Views: 1971
Reputation: 11
I think the order of columns in BRIN index doesn't matter - according to the same doc: https://www.postgresql.org/docs/10/indexes-multicolumn.html
Like GIN and unlike B-tree or GiST, index search effectiveness is the same regardless of which index column(s) the query conditions use.
Looks like the order is only important for B-tree and GiST.
Upvotes: 1