Calculating time with datetime by groups

Question

I have two tables, Tickets and Tasks. When a ticket is registered it appears in the Tickets table, and every action performed on the ticket is saved in the Tasks table. The Tickets table holds information such as who created the ticket and its start and end dates (if it is closed). The Tasks table looks like this:

ID  Ticket_ID   Task_type_ID    Task_type   Group_ID    Submit_Date
1   120         1               Opened      3           2016-12-09 11:10:22.000
2   120         2               Assign      4           2016-12-09 12:10:22.000
3   120         3               Paused      4           2016-12-09 12:30:22.000
4   120         4               Unpause     4           2016-12-10 10:30:22.000
5   120         2               Assign      6           2016-12-12 10:30:22.000
6   120         2               Assign      7           2016-12-12 15:30:22.000
7   120         5               Modify      NULL        2016-12-13 15:30:22.000
8   120         6               Closed      NULL        2016-12-13 16:30:22.000

I would like to calculate how long each group took to complete its task. The start time is when the ticket was assigned to a given group, and the end time is when that group finishes its task (by assigning the ticket elsewhere or closing it). The paused time (task_type_ID 3 to 4) should not be included. Also, when a ticket is assigned to another group, the new group ID appears in the previous task/row. If the ticket goes through multiple groups, the query should calculate how long the ticket was in the hands of each group. I know it is complicated, but maybe someone has an idea that I can start building from.

GMB · Accepted Answer

This is a fairly sophisticated gaps-and-islands problem.

Here is one approach:

select distinct 
    ticket_id, 
    group_id, 
    sum(sum(datediff(minute, submit_date, lead_submit_date))) 
        over(partition by ticket_id, group_id) elapsed_minutes
from (
    select
        t.*,
        row_number()      over(partition by ticket_id order by submit_date) rn1,
        row_number()      over(partition by ticket_id, group_id order by submit_date) rn2,
        lead(submit_date) over(partition by ticket_id order by submit_date) lead_submit_date
    from mytable t
) t
where task_type <> 'Paused' and group_id is not null
group by ticket_id, group_id, rn1 - rn2

In the subquery, we assign row numbers to records within two different partitions (by tickets vs by ticket and group), and recover the date of the next record with lead().

We can then use the difference between the row numbers to build groups of "adjacent" records (where the ticket stays in the same group), while filtering out the periods when the ticket was paused. Aggregating by that difference gives the time spent in each such stretch.

The final step is to compute the overall time spent in each group: this handles the case where a ticket is assigned to the same group more than once during its lifecycle (that does not show in your sample data, but the description in the question makes it sound like it may happen). We could do this with another level of aggregation, but I went for a window sum and distinct, which avoids adding one more level of nesting to the query.
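For comparison, the "another level of aggregation" variant could look roughly like the sketch below. It assumes the same mytable as the query above; the derived-table alias islands and the column name island_minutes are made up for illustration. The inner aggregation sums each island, and the outer one adds up the islands per ticket and group.

```sql
-- Sketch only: same logic as the query above, with an explicit outer
-- GROUP BY instead of a window sum plus DISTINCT.
select
    ticket_id,
    group_id,
    sum(island_minutes) as elapsed_minutes       -- total over all islands of a group
from (
    select
        ticket_id,
        group_id,
        -- minutes spent in one uninterrupted stretch ("island")
        sum(datediff(minute, submit_date, lead_submit_date)) as island_minutes
    from (
        select
            t.*,
            row_number()      over(partition by ticket_id order by submit_date) rn1,
            row_number()      over(partition by ticket_id, group_id order by submit_date) rn2,
            lead(submit_date) over(partition by ticket_id order by submit_date) lead_submit_date
        from mytable t
    ) t
    where task_type <> 'Paused' and group_id is not null
    group by ticket_id, group_id, rn1 - rn2
) islands                                         -- hypothetical alias
group by ticket_id, group_id
order by ticket_id, group_id;
```

Both forms should return the same rows; which one reads better is mostly a matter of taste.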

Running the subquery on its own may make the logic easier to follow.
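A minimal way to try that, assuming SQL Server: the script below inlines the sample rows from the question as a CTE named mytable, so no real table is needed, and runs only the subquery. The columns island_key and minutes_to_next are extra columns added here for illustration and are not part of the original query.

```sql
-- Sample rows from the question, inlined so the script is self-contained.
-- ISO 8601 literals (with the 'T' separator) keep the cast independent of
-- the session's language and date format settings.
with mytable as (
    select id, ticket_id, task_type, group_id,
           cast(submit_date as datetime) as submit_date
    from (values
        (1, 120, 'Opened',  3,    '2016-12-09T11:10:22'),
        (2, 120, 'Assign',  4,    '2016-12-09T12:10:22'),
        (3, 120, 'Paused',  4,    '2016-12-09T12:30:22'),
        (4, 120, 'Unpause', 4,    '2016-12-10T10:30:22'),
        (5, 120, 'Assign',  6,    '2016-12-12T10:30:22'),
        (6, 120, 'Assign',  7,    '2016-12-12T15:30:22'),
        (7, 120, 'Modify',  null, '2016-12-13T15:30:22'),
        (8, 120, 'Closed',  null, '2016-12-13T16:30:22')
    ) v (id, ticket_id, task_type, group_id, submit_date)
)
select
    s.*,
    rn1 - rn2 as island_key,                                        -- constant within one island
    datediff(minute, submit_date, lead_submit_date) as minutes_to_next
from (
    select
        t.*,
        row_number()      over(partition by ticket_id order by submit_date) rn1,
        row_number()      over(partition by ticket_id, group_id order by submit_date) rn2,
        lead(submit_date) over(partition by ticket_id order by submit_date) lead_submit_date
    from mytable t
) s
order by submit_date;
```

On the sample data, island_key comes out as 0, 1, 1, 1, 4, 5, 6, 6 for IDs 1 to 8, so the three rows of group 4 share one island. The Paused row (ID 3) carries the 1320 paused minutes in minutes_to_next, which is why the outer query filters it out.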

For your sample data, the query yields:

ticket_id | group_id | elapsed_minutes
--------: | -------: | --------------:
      120 |        3 |              60
      120 |        4 |            2900
      120 |        6 |             300
      120 |        7 |            1440
