Reputation: 35
I have a dataset that contains information regarding a specific type of event, includes starttime
, endtime
, and duration
(in minutes). I would like to group this data by date
(will be extracted from starttime
), but I have a row that represents a event go across several days, for example, index
33, starttime
2020-07-13 10:19:54 UTC, endtime
2020-07-15 13:13:21 UTC, and duration
3053. This row should be broke to three rows, for instance:
starttime endtime duration
2020-07-13 10:19 2020-07-13 23:59 821
2020-07-14 00:00 2020-07-14 23:59 1440
2020-07-15 00:00 2020-07-15 13:13 793
To be clear, I am trying to use these 3 rows to replace the original row, so when I group by date
, the number will be correct.
How do I do this in BigQuery?
Upvotes: 1
Views: 495
Reputation: 1269563
Use genereate_date_array()
and unnest()
:
select t.ind,
greatest(start_ts, timestamp(dt)) as start_ts,
least(timestamp(date_add(dt, interval 1 day)), end_ts)
from (select 32 as ind, timestamp('2020-07-13 10:19:54 UTC') as start_ts, timestamp('2020-07-13 13:13:21 UTC') as end_ts union all
select 33 as ind, timestamp('2020-07-13 10:19:54 UTC') as start_ts, timestamp('2020-07-15 13:13:21 UTC') as end_ts
) t cross join
unnest(generate_date_array(date(start_ts), date(end_ts))) dt ;
Note: This gives the ending timestamp as midnight rather than one minute or one second before. That way, the intermediate days have 24 hours. Of course, you can subtracts a second or minute, but that seems less accurate to me.
Upvotes: 1