Reputation: 494
Let's say I have a results
table:
result_id | attr_id | user_id | value | timestamp |
---|---|---|---|---|
1 | 1 | 1 | 100 | 2024-02-10 14:30:15.248087+00 |
2 | 2 | 1 | 111 | 2024-02-10 10:30:15.248087+00 |
3 | 1 | 1 | 122 | 2024-02-09 14:30:15.248087+00 |
4 | 2 | 1 | 162 | 2024-02-08 10:30:15.248087+00 |
5 | 1 | 2 | 119 | 2024-02-10 14:30:15.248087+00 |
6 | 2 | 2 | 128 | 2024-02-10 10:30:15.248087+00 |
7 | 1 | 2 | 137 | 2024-02-09 14:30:15.248087+00 |
8 | 2 | 2 | 146 | 2024-02-08 10:30:15.248087+00 |
I need to calculate percentile for every row over partition by user_id
and attr_id
preceding current row and only in 10 days interval. I can calculate standard deviation in this way because it supports partial mode:
SELECT
stddev(value) OVER (
PARTITION BY user_id, attr_id
ORDER BY timestamp ASC RANGE BETWEEN '10 days'::interval PRECEDING AND CURRENT ROW
EXCLUDE CURRENT ROW
) AS stddev_efficiency
FROM results;
Is there a way to calculate percentile in postgresql/timescaleDB in accordance with the described requirements?
Upvotes: 0
Views: 78
Reputation: 44305
Presumably the problem is with percentile_cont (and friends) being an "ordered set aggregate" and so being ineligible to be used with/as a window function.
You can find a discussion of work-arounds for that issue on the PostgreSQL wiki. Basically you would implement your own aggregate function which is not an "ordered set aggregate" and so keeps the state in memory the normal way instead, but then can be used as window functions. You will probably need to do this in C to get good performance. You might need to deal with memory issues as well, but there isn't enough information here to advice further on that.
Upvotes: 0