Reputation: 404
For simplicity I have a table with 2 columns customer_id
which is int and created_at
which is timestamp. I created range based partitions for the created_at
so each partition holds single day. Also I have an index (customer_id, created_at DESC). Then I run the query:
SELECT * FROM my_table WHERE customer_id = 123 ORDER BY created_at DESC LIMIT 10
And that query runs really slow. EXPLAIN ANALYZE
says that postgres is actually selecting rows from all the partitions matching WHERE clause, then order them and only then returns top 10. Is that intended behavior? What is the better way to get top 10 records in such case? It feels like retry with different created_at
values until I reach required amount of records is a bit wrong approach. But I might be wrong about that.
Upvotes: 1
Views: 73
Reputation: 117
PostgreSQL has to scan each partition to check if there are rows for customer_id = 123
, then it applies the ORDER BY created_at DESC and the LIMIT 10. The issue is that PostgreSQL isn't fully utilizing the partitioning to stop the scan once it finds enough rows.
To improve performance, you can try the following strategies:
Use constraint exclusion and partition pruning
SHOW constraint_exclusion;
If it's set to OFF, enable it:
SET constraint_exclusion = partition;
Optimize the Index
You can try a more targeted index. A composite index on just customer_id and a separate index on created_at might help:
CREATE INDEX idx_customer_created_at ON my_table (customer_id, created_at DESC);
Upvotes: 2