Andrew
Andrew

Reputation: 404

Postgres 10: query time range partitions with limit leads to all partitions scan

For simplicity I have a table with 2 columns customer_id which is int and created_at which is timestamp. I created range based partitions for the created_at so each partition holds single day. Also I have an index (customer_id, created_at DESC). Then I run the query:

SELECT * FROM my_table WHERE customer_id = 123 ORDER BY created_at DESC LIMIT 10

And that query runs really slow. EXPLAIN ANALYZE says that postgres is actually selecting rows from all the partitions matching WHERE clause, then order them and only then returns top 10. Is that intended behavior? What is the better way to get top 10 records in such case? It feels like retry with different created_at values until I reach required amount of records is a bit wrong approach. But I might be wrong about that.

Upvotes: 1

Views: 73

Answers (1)

CodeForFood
CodeForFood

Reputation: 117

PostgreSQL has to scan each partition to check if there are rows for customer_id = 123, then it applies the ORDER BY created_at DESC and the LIMIT 10. The issue is that PostgreSQL isn't fully utilizing the partitioning to stop the scan once it finds enough rows.

To improve performance, you can try the following strategies:

Use constraint exclusion and partition pruning

SHOW constraint_exclusion;

If it's set to OFF, enable it:

SET constraint_exclusion = partition;

Optimize the Index

You can try a more targeted index. A composite index on just customer_id and a separate index on created_at might help:

CREATE INDEX idx_customer_created_at ON my_table (customer_id, created_at DESC);

Upvotes: 2

Related Questions