Benedict Lewis
Benedict Lewis

Reputation: 2813

Calculate difference between two rows, averaged by day

I have a table of items, which have a status and creation day

+----+-----------+------------+
| id | status    | created    |
+----+-----------+------------+
| 1  | PROCESSED | 2018-12-01 |
+----+-----------+------------+
| 2  | PROCESSED | 2018-12-01 |
+----+-----------+------------+
| 3  | ABORTED   | 2018-12-01 |
+----+-----------+------------+

There is a corresponding item status table, which is updated when the status changes

+----+---------+-----------+------------------+
| id | item_id | status    | created          |
+----+---------+-----------+------------------+
| 1  | 1       | RECEIVED  | 2018-12-01 10:00 |
+----+---------+-----------+------------------+
| 2  | 1       | PROCESSED | 2018-12-01 12:00 |
+----+---------+-----------+------------------+
| 3  | 2       | RECEIVED  | 2018-12-01 11:00 |
+----+---------+-----------+------------------+
| 4  | 2       | PROCESSED | 2018-12-01 12:00 |
+----+---------+-----------+------------------+
| 5  | 3       | RECEIVED  | 2018-12-01 13:00 |
+----+---------+-----------+------------------+
| 6  | 3       | ABORTED   | 2018-12-01 13:30 |
+----+---------+-----------+------------------+

I want to produce a report which shows the average time for an item to be processed, grouped by day, excluding items which have been aborted. (the time for an item to be processed is the time difference between RECEIVED and PROCESSED)

Something like this (duration in seconds):

+------------+------------------+
| day        | avg_duration     |
+------------+------------------+
| 2018-12-01 | 5400             |
+------------+------------------+

From other questions, I've identified that this can likely be solved using table partitioning, but haven't been able to write a working query. What's the best way to do this?

Upvotes: 2

Views: 76

Answers (2)

klin
klin

Reputation: 121504

Use the boolean aggregate bool_and() to filter out aborted items:

select date, avg(duration)
from (
    select created::date as date, item_id, extract(epoch from max(created)- min(created)) as duration
    from item_status
    group by created::date, item_id
    having bool_and(status <> 'ABORTED')
    ) s
group by date

    date    | avg  
------------+------
 2018-12-01 | 5400
(1 row) 

Upvotes: 2

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

This requires 2 levels of aggregation, once on item and date and then on date.

select dt_created,avg(diff) as avg_diff
from (select item_id
            ,created::date as dt_created
            ,max(case when status = 'PROCESSED' then created end) - max(case when status = 'RECEIVED' then created end) as diff 
      from item_statuses 
      group by item_id,created::date
      having count(case when status = 'ABORTED' then 1 end) = 0
    ) t
group by dt_Created

Upvotes: 1

Related Questions