Xenon Drey
Xenon Drey

Reputation: 171

Bigquery grouping message based on time range

I have data like this:

id type time      message
-- ---- --------  -------
1  in   10:12:00  A
1  in   10:13:00  B
1  in   10:14:00  C
1  in   10:20:00  D
1  in   10:22:00  E
1  out  10:23:00  F
1  in   10:24:00  G
1  in   10:25:00  H
1  in   10:35:00  I

and I want to in every 6 minutes, the message that has type='in' is group into an array, something like this:

id  type time      message
--  ---- --------  -------
1   in   10:12:00  [A,B,C]
1   in   10:20:00  [D,E]
1   out  10:24:00  F
1   in   10:24:00  [G,H]
1   in   10:35:00  [I]

I'm planning to use LEAD() but it's really hard to get multiple rows within time 6 minutes time range. How can I do this in Biquery (#Standart SQL)?

Upvotes: 1

Views: 84

Answers (2)

Sergey Geron
Sergey Geron

Reputation: 10172

Try this:

with mytable as (
  select 1 as id, 'in' as type, time '10:12:00' as time, 'A' as message union all
  select 1, 'in', '10:13:00', 'B' union all
  select 1, 'in', '10:14:00', 'C' union all
  select 1, 'in', '10:20:00', 'D' union all
  select 1, 'in', '10:22:00', 'E' union all
  select 1, 'out', '10:23:00', 'F' union all
  select 1, 'in', '10:24:00', 'G' union all
  select 1, 'in', '10:25:00', 'H' union all
  select 1, 'in', '10:35:00', 'I'
)
select 
  id,
  type,
  time(extract(hour from time), DIV(extract(minute from time), 6) * 6, 0) as time,
  string_agg(message) as message
from mytable
where type = 'in'
group by 1, 2, 3
union all
select *
from mytable
where type = 'out'
order by time

enter image description here

If you need arrays specifically:

with mytable as (
  select 1 as id, 'in' as type, time '10:12:00' as time, 'A' as message union all
  select 1, 'in', '10:13:00', 'B' union all
  select 1, 'in', '10:14:00', 'C' union all
  select 1, 'in', '10:20:00', 'D' union all
  select 1, 'in', '10:22:00', 'E' union all
  select 1, 'out', '10:23:00', 'F' union all
  select 1, 'in', '10:24:00', 'G' union all
  select 1, 'in', '10:25:00', 'H' union all
  select 1, 'in', '10:35:00', 'I'
)
select
  id,
  type,
  time(extract(hour from time), DIV(extract(minute from time), 6) * 6, 0) as time,
  array_agg(message) as message
from mytable
where type = 'in'
group by 1, 2, 3
union all
select id, type, time, array_agg(message)
from mytable
where type = 'out'
group by 1, 2, 3
order by time

enter image description here

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Consider below approach

select id, type, min(time) time,  
  string_agg(message) message
from (
  select *, 
    div(time_diff(time, min(time) over(partition by id, type), minute), 6) as grp
  from `project.dataset.table` 
  where type = 'in'   
)
group by id, type, grp
union all 
select *
from `project.dataset.table`
where type = 'out'
order by time            

if applied to sample data in your question - output is

enter image description here

Upvotes: 2

Related Questions