Reputation: 55
I have the following data which I want to group into seconds per day in BigQuery.
Source Table:
+--------------+---------------------+---------------------+ | ComputerName | StartDatetime | EndDatetime | +--------------+---------------------+---------------------+ | Computer1 | 2020-06-10T21:01:28 | 2020-06-10T21:20:19 | +--------------+---------------------+---------------------+ | Computer1 | 2020-06-10T22:54:01 | 2020-06-11T05:21:48 | +--------------+---------------------+---------------------+ | Computer2 | 2020-06-08T09:11:54 | 2020-06-10T11:36:27 | +--------------+---------------------+---------------------+
I want to be able to visualise the data in the following way
+------------+--------------+------------------+ | Date | ComputerName | Runtime(Seconds) | +------------+--------------+------------------+ | 2020-10-10 | Computer1 | 5089 | +------------+--------------+------------------+ | 2020-10-11 | Computer1 | 19308 | +------------+--------------+------------------+ | 2020-10-08 | Computer2 | 53285 | +------------+--------------+------------------+ | 2020-10-09 | Computer2 | 86400 | +------------+--------------+------------------+ | 2020-10-10 | Computer2 | 41787 | +------------+--------------+------------------+
I am not too sure of the way I should approach this. Some input would be greatly appreciated.
Upvotes: 2
Views: 72
Reputation: 172993
Another option for BigQuery Standard SQL
Straightforward, "little silly" and almost logic-less option of just "stupidly" counting seconds in respective days - still looks like an option to me
#standardSQL
select Date, ComputerName,
countif(second >= timestamp(StartDatetime) and second < timestamp(EndDatetime)) as Runtime_Seconds
from `project.dataset.table`,
unnest(generate_date_array(date(StartDatetime), date(EndDatetime))) Date,
unnest(generate_timestamp_array(timestamp(Date + 1), timestamp(Date), interval -1 second)) second with offset
where offset > 0
group by Date, ComputerName
if applied to sample data from your question - output is
Upvotes: 2
Reputation: 172993
Below is for BigQuery Standard SQL
#standardSQL
select Date, ComputerName,
sum(datetime_diff(
least(datetime (Date + 1), EndDatetime),
greatest(datetime(Date), StartDatetime),
second
)) as Runtime_Seconds
from `project.dataset.table`,
unnest(generate_date_array(date(StartDatetime), date(EndDatetime))) Date
group by Date, ComputerName
if to apply to sample data in your question - as in below example
#standardSQL
with `project.dataset.table` as (
select 'Computer1' ComputerName, datetime '2020-06-10T21:01:28' StartDatetime, datetime '2020-06-10T21:20:19' EndDatetime union all
select 'Computer1', '2020-06-10T22:54:01', '2020-06-11T05:21:48' union all
select 'Computer2', '2020-06-08T09:11:54', '2020-06-10T11:36:27'
)
select Date, ComputerName,
sum(datetime_diff(
least(datetime (Date + 1), EndDatetime),
greatest(datetime(Date), StartDatetime),
second
)) as Runtime_Seconds
from `project.dataset.table`,
unnest(generate_date_array(date(StartDatetime), date(EndDatetime))) Date
group by Date, ComputerName
output is
Upvotes: 2
Reputation: 1269853
This is an interval overlap problem. You can solve this by splitting each time period into separate days and then looking at the overlap for each day:
with t as (
select 'Computer1' as computername, datetime '2020-06-10T21:01:28' as startdatetime, datetime '2020-06-10T21:20:19' as enddatetime union all
select 'Computer1' as computername, datetime '2020-06-10T22:54:01' as startdatetime, datetime '2020-06-11T05:21:48' as enddatetime union all
select 'Computer2' as computername, datetime '2020-06-08T09:11:54' as startdatetime, datetime '2020-06-10T11:36:27' as enddatetime
)
select dte, t.computername,
sum(case when enddatetime >= dte and
startdatetime < date_add(dte, interval 1 day)
then datetime_diff(least(date_add(dte, interval 1 day), enddatetime),
greatest(dte, startdatetime),
second)
end) as runtime_seconds
from (select t.*,
generate_date_array(date(t.startdatetime), date(t.enddatetime), interval 1 day) gda
from t
) t cross join
unnest(gda) dte
group by dte, t.computername;
Upvotes: 2