Query performance does not improve when storing sorted data

Question

I am trying to improve the performance of a database using the CLUSTER command. For testing, I created copies of the original tables and sorted the data in them by the foreign keys on which the most frequent lookups are made. There are big improvements for low-selectivity queries. For example, for queries like select * from table1 t1 join table2 t2 on t1.a = t2.a where t1.a = 1; performance improved by a factor of 10 to 1000.
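For reference, the test setup described above could look roughly like the following sketch. The names table1_sorted and table1_sorted_a_idx are hypothetical and used only for illustration; the real tables and indexes are named differently.

```sql
-- Hypothetical names: table1_sorted and table1_sorted_a_idx are illustrative only.

-- Copy the original table so the test does not touch production data.
CREATE TABLE table1_sorted AS
SELECT * FROM table1;

-- CLUSTER needs an index that defines the physical order.
CREATE INDEX table1_sorted_a_idx ON table1_sorted (a);

-- Physically rewrite the table in index order (takes an exclusive lock).
CLUSTER table1_sorted USING table1_sorted_a_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE table1_sorted;
```

Note that CLUSTER is a one-time operation: rows inserted or updated afterwards are not kept in sorted order.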

However, for queries with high selectivity the performance gain is either zero or negative. For example, a query like select a, sum(b) from table1 group by a; takes the same amount of time as before. And in some cases, when several related tables are read in full, without filtering, the execution time more than doubles. At the same time, the estimated query cost for queries against the tables with sorted data has dropped severalfold.
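To compare estimated cost with actual run time, the two versions of the query can be run under EXPLAIN (ANALYZE, BUFFERS). This is only a sketch; table1_sorted is a hypothetical name for the clustered copy.

```sql
-- table1_sorted is a hypothetical name for the clustered copy of table1.

-- Original table: shows estimated cost, actual time, and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT a, sum(b) FROM table1 GROUP BY a;

-- Clustered copy: the same query, for a side-by-side comparison.
EXPLAIN (ANALYZE, BUFFERS)
SELECT a, sum(b) FROM table1_sorted GROUP BY a;
```

The cost figures in the plan are planner estimates in arbitrary units, while the "actual time" figures are measured, so the two can move independently.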

I tried to collect more detailed statistics: for the foreign key columns I executed alter table ... alter column ... set statistics 10000; After that, the estimated query cost usually decreases by another factor of 10 to 100, but this does not lead to any actual speedup. The structure of the query plan (for example, the table access methods or the indexes used) does not change. Only the cost changes.
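A minimal sketch of the statistics step, again with the hypothetical table name table1_sorted and assuming the column is a. The pg_stats view exposes a correlation value per column, which is how the planner sees how closely the physical row order follows the column's sort order.

```sql
-- table1_sorted and column a are assumptions made for illustration.

-- Raise the per-column statistics target (10000 is the maximum).
ALTER TABLE table1_sorted ALTER COLUMN a SET STATISTICS 10000;

-- The new target only takes effect once the table is analyzed again.
ANALYZE table1_sorted;

-- correlation near 1 or -1 means the rows are stored almost in column order;
-- values near 0 mean the physical order is unrelated to the column.
SELECT tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE tablename IN ('table1', 'table1_sorted')
  AND attname = 'a';
```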

Should I even expect a performance improvement for highly selective queries? If so, how do I tell PostgreSQL that the data is sorted?
Should grouping be faster if the data in the table is sorted?
Why do some queries run slower after changing the order of data in the tables?

P.S.: For certain reasons, I can't attach the query plans at the moment.
