Reputation: 559
If I have a PostgreSQL table that has columns for datetime and for an array of items, such as:
| time | items |
| -------------------------- | ---------------------- |
| 2020-12-06 11:31:38.000 | {item1, item2} |
| 2020-12-06 11:48:11.304 | {item1} |
| 2020-12-06 11:48:48.654 | {item1, item2, item3} |
| 2020-12-06 11:49:50.355 | {item2} |
| 2020-12-06 11:55:31.842 | {item1, item2} |
How can I query the table to aggregate the count of a specific item in equidistant time intervals?
For example, I'd like to count the occurrences of item1 in 5-minute intervals, so that the query result looks like this:
| start_time | end_time | item1 count |
| -------------------------- | ----------------------------------- | --------------- |
| 2020-12-06 11:30:00.000 | 2020-12-06 11:34:59.999 | 1 |
| 2020-12-06 11:35:00.000 | 2020-12-06 11:39:59.999 | 0 |
| 2020-12-06 11:40:00.000 | 2020-12-06 11:44:59.999 | 0 |
| 2020-12-06 11:45:00.000 | 2020-12-06 11:49:59.999 | 2 |
| 2020-12-06 11:50:00.000 | 2020-12-06 11:54:59.999 | 0 |
| 2020-12-06 11:55:00.000 | 2020-12-06 11:59:59.999 | 1 |
I'm having a tough time figuring out which query achieves this most efficiently. I've been thinking that Postgres' date_trunc or grid might help with this, but I'm really not sure how to approach the problem. Any suggestions?
Upvotes: 2
Views: 1910
Reputation: 73
For Postgres >= 14, you should probably use date_bin. As for 'item1', if you don't care that it might be listed multiple times in a single row and just want to count the rows where it's present, a simple filter will suffice. Note that this returns only intervals that contain at least one row, so completely empty intervals are not listed:
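select
date_bin('5 minutes', time, timestamp '2001-01-01') as start_time -- date_bin requires an origin as its third argument
, count(*) filter (where 'item1' = any(items)) as item1_count
from items -- table name assumed; "table" itself is a reserved word
group by 1
order by 1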
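For versions before 14, where date_bin is not available, the same bucketing can be approximated with date_trunc, which the question already mentions. This is only a sketch: it assumes the table is named items, that time is a timestamp column, and that items is a text array. Like the query above, it returns only intervals that contain at least one row.

```sql
-- Sketch for PostgreSQL < 14 (no date_bin).
-- Assumed schema: items("time" timestamp, items text[]).
select
    -- Truncate to the hour, then add back the minutes rounded down to a multiple of 5.
    date_trunc('hour', time)
      + floor(extract(minute from time) / 5) * interval '5 minutes' as start_time,
    -- Count rows whose array contains 'item1' (duplicates within one row count once).
    count(*) filter (where 'item1' = any(items)) as item1_count
from items
group by 1
order by 1;
```

This only works for bucket widths that divide an hour evenly, such as 5, 10, or 15 minutes.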
Upvotes: 0
Reputation: 1269445
You can use generate_series() to generate the timestamps. Then unnest, filter, and aggregate:
select gs.ts, count(i.time) as num_item1
from generate_series('2020-12-06 11:30:00.000'::timestamp, '2020-12-06 11:55:00.000', interval '5 minute') gs(ts) left join
(items i join lateral
unnest(i.items) item
on item = 'item1'
)
on i.time >= gs.ts and i.time < gs.ts + interval '5 minute'
group by gs.ts
order by 1;
Here is a db<>fiddle.
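If you only need to count rows that contain the item (rather than every occurrence inside the array), a variant without unnest might look like the following. It is a sketch under the same assumptions as above: a table named items with a timestamp column time and a text array column items. The end_time column is an exclusive upper bound, which avoids having to write 59.999.

```sql
-- Sketch: zero-filled 5-minute buckets, counting rows whose array contains 'item1'.
-- Assumed schema: items("time" timestamp, items text[]).
select gs.ts                        as start_time,
       gs.ts + interval '5 minutes' as end_time,   -- exclusive upper bound
       count(i.time)                as item1_count -- 0 when no row matched the bucket
from generate_series(timestamp '2020-12-06 11:30:00',
                     timestamp '2020-12-06 11:55:00',
                     interval '5 minutes') as gs(ts)
left join items i
       on i.time >= gs.ts
      and i.time <  gs.ts + interval '5 minutes'
      and 'item1' = any(i.items)    -- membership test instead of unnest
group by gs.ts
order by gs.ts;
```

Keeping the item filter inside the join condition, rather than in a where clause, is what preserves the empty intervals in the result.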
Upvotes: 1