Raj
Raj

Reputation: 2398

Grouping consecutive records

Here is my records(Input). The field2 always starts at 100 after which it can take any value above 100.

field1   field2
===============
val1     100
val2     110
------------
val3     100
val4     110
val3     130
val3     140
------------
val1     100

I would need to group consecutive records that starts with 100 after which it can be any number that is not 100. For the example above the output I need is

field 1                    field2
===================================================
(val1, val2)                (100, 110)
(val3, val4, val3, val3)    (100, 110, 130, 140)
(val1)                      (100)

How do I achieve this

Upvotes: 1

Views: 120

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173171

You must have some field that can be used to define the order.
In below examples I assume it is id field

Below should do what you expect

#standardSQL
SELECT 
  CONCAT('(',STRING_AGG(field1 ORDER BY id), ')') AS field1,
  CONCAT('(',STRING_AGG(CAST(field2 AS STRING) ORDER BY id), ')') AS field2
FROM (
  SELECT 
    id, field1, field2,
    COUNTIF(field2 = 100) OVER (ORDER BY id) AS grp
  FROM yourTable
) t
GROUP BY grp
ORDER BY MIN(id)   

you can test/try it with below dummy data from your question

#standardSQL
WITH yourTable AS (
  SELECT 1 AS id, 'val1' AS field1, 100 AS field2 UNION ALL
  SELECT 2 AS id, 'val2' AS field1, 110 AS field2 UNION ALL
  SELECT 3 AS id, 'val3' AS field1, 100 AS field2 UNION ALL
  SELECT 4 AS id, 'val4' AS field1, 110 AS field2 UNION ALL
  SELECT 5 AS id, 'val3' AS field1, 130 AS field2 UNION ALL
  SELECT 6 AS id, 'val3' AS field1, 140 AS field2 UNION ALL
  SELECT 7 AS id, 'val1' AS field1, 100 AS field2 
)
SELECT 
  CONCAT('(',STRING_AGG(field1 ORDER BY id), ')') AS field1,
  CONCAT('(',STRING_AGG(CAST(field2 AS STRING) ORDER BY id), ')') AS field2
FROM (
  SELECT 
    id, field1, field2,
    COUNTIF(field2 = 100) OVER (ORDER BY id) AS grp
  FROM yourTable
) 
GROUP BY grp
ORDER BY MIN(id) 

output is

field1                  field2   
------                  ------   
(val1,val2)             (100,110)    
(val3,val4,val3,val3)   (100,110,130,140)    
(val1)                  (100)    

Wondering if that is possible(without using the order column)?

If you really have only those two fields in your table - you most likely out of luck and need to revisit logic of populating this table with the additional field to be used for ordering (as a timeline)

As a totally last resort - you can try below example where such column is generated on fly - but please understand there is absolutely no guarantee in getting order that you expect

#standardSQL
WITH yourTable AS (
  SELECT 'val1' AS field1, 100 AS field2 UNION ALL
  SELECT 'val2' AS field1, 110 AS field2 UNION ALL
  SELECT 'val3' AS field1, 100 AS field2 UNION ALL
  SELECT 'val4' AS field1, 110 AS field2 UNION ALL
  SELECT 'val3' AS field1, 130 AS field2 UNION ALL
  SELECT 'val3' AS field1, 140 AS field2 UNION ALL
  SELECT 'val1' AS field1, 100 AS field2 
),
tempTable AS (
  SELECT field1, field2, ROW_NUMBER() OVER() AS id  
  FROM yourTable
)
SELECT 
  CONCAT('(',STRING_AGG(field1 ORDER BY id), ')') AS field1,
  CONCAT('(',STRING_AGG(CAST(field2 AS STRING) ORDER BY id), ')') AS field2
FROM (
  SELECT 
    id, field1, field2,
    COUNTIF(field2 = 100) OVER (ORDER BY id) AS grp
  FROM tempTable
) 
GROUP BY grp
ORDER BY MIN(id) 

output is same - but again - no guarantee in it!!!

field1                  field2   
------                  ------   
(val1,val2)             (100,110)    
(val3,val4,val3,val3)   (100,110,130,140)    
(val1)                  (100)    

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270873

Let me assume that you have a column that specifies the ordering. Then you can identify groups by counting the number of "100" records before each record and then using array_agg():

select array_agg(field1 order by id) as field1s,
       array_agg(field2 order by id) as field2s
from (select t.*,
             sum(case when field2 = 100 then 1 else 0 end) over (order by id) as grp
      from t
     ) t
group by grp;

Note: The solution in MySQL would look very, very different. But, it would still start with select.

Upvotes: 2

Related Questions