user6794223
user6794223

Reputation: 192

Query that counts total records per day and total records with same time timestamp and id per day in Bigquery

I have timeseries data like this:

time id value
2018-04-25 22:00:00 UTC A 1
2018-04-25 23:00:00 UTC A 2
2018-04-25 23:00:00 UTC A 2.1
2018-04-25 23:00:00 UTC B 1
2018-04-26 23:00:00 UTC B 1.3

How do i write a query to produce an output table with these columns:

So the output of our query should be:

date records records_conflicting_time_id
2018-04-25 4 2
2018-04-26 1 0

Getting records is easy, i just truncate the time to get date and then group by date. But i'm really struggling to produce a column that counts the number of records where id + time is not unique over that date...

Upvotes: 0

Views: 821

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Consider below approach

select date(time) date, 
  sum(cnt) records, 
  sum(if(cnt > 1, cnt, 0)) conflicting_records
from (
  select time, id, count(*) cnt
  from your_table
  group by time, id
)
group by date              

if applied to sample data in your question - output is

enter image description here

Upvotes: 1

Max Zolotenko
Max Zolotenko

Reputation: 1132

with YOUR_DATA as
  (
              select  cast('2018-04-25 22:00:00 UTC' as timestamp) as `time`, 'A' as id, 1.0 as value
    union all select  cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.0 as value
    union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'A' as id, 2.1 as value
    union all select cast('2018-04-25 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.0 as value
    union all select cast('2018-04-26 23:00:00 UTC' as timestamp) as `time`, 'B' as id, 1.3 as value
  )
select  cast(timestamp_trunc(t1.`time`, day) as date) as `date`,
        count(*) as records,
        case when count(*)-count(distinct cast(t1.`time` as string) || t1.id) = 0 then 0
          else count(*)-count(distinct cast(t1.`time` as string) || t1.id)+1
        end as records_conflicting_time_id
from    YOUR_DATA t1
group by cast(timestamp_trunc(t1.`time`, day) as date)
;

Upvotes: 0

Related Questions