user1474992
user1474992

Reputation: 807

Sequence grouping in TSQL

I'm trying to group data in sequence order. Say I have the following table:

| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 1 | C |
| 1 | B |

I need the SQL query to output the following:

| 1 | A | 1 |
| 1 | A | 1 |
| 1 | B | 2 |
| 1 | B | 2 |
| 1 | C | 3 |
| 1 | B | 4 |

The last column is a group number that is incremented in each group. The important thing to note is that rows 3, 4 and 5 contain the same data which should be grouped into 2 groups not 1.

Upvotes: 1

Views: 1730

Answers (3)

Mikser
Mikser

Reputation: 21

For MSSQL2008:

Suppose you have a SampleStatuses table:

Status  Date
A       2014-06-11
A       2014-06-14
B       2014-06-25
B       2014-07-01
A       2014-07-06
A       2014-07-19
B       2014-07-21
B       2014-08-13
C       2014-08-19

you write the following:

;with
cte as (
    select top 1 RowNumber, 1 as GroupNumber, [Status], [Date] from SampleStatuses order by RowNumber
    union all
    select c1.RowNumber,
        case when c2.Status <> c1.Status then c2.GroupNumber + 1 else c2.GroupNumber end as  GroupNumber, c1.[Status], c1.[Date] 
    from cte c2 join SampleStatuses c1 on c1.RowNumber = c2.RowNumber + 1       
)
select * from cte;

you get this result:

RowNumber   GroupNumber Status  Date
1           1           A       2014-06-11
2           1           A       2014-06-14
3           2           B       2014-06-25
4           2           B       2014-07-01
5           3           A       2014-07-06
6           3           A       2014-07-19
7           4           B       2014-07-21
8           4           B       2014-08-13
9           5           C       2014-08-19

Upvotes: 2

Phil
Phil

Reputation: 497

This will give you rankings on your columns. It will not give you 1,2,3 however. It will give you 1,3,6 etc based on how many in each grouping

select 
a,
b,
rank() over (order by a,b)
 from
table1

See this SQLFiddle for a clearer idea of what I mean: http://sqlfiddle.com/#!3/0f201/2/0

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269953

The normal way you would do what you want is the dense_rank function:

select key, val,
       dense_rank() over (order by key, val)
from t

However, this does not address the problem of separating the last groups.

To handle this, I have to assume there is an "id" column. Tables, in SQL, do not have an ordering, so I need the ordering. If you are using SQL Server 2012, then you can use the lag() function to get what you need. Use the lag to see if the key, val pair is the same on consecutive rows:

with t1 as (
     select id, key, val,
            (case when key = lead(key, 1) over (order by id) and
                       val = lead(val, 1) over (order by id)
                  then 1
                  else 0
             end) as SameAsNext
     from t
    )
select id, key, val,
       sum(SameAsNext) over (order by id) as GroupNum
from t

Without SQL Server 2012 (which has cumulative sums), you have to do a self-join to identify the beginning of each group:

select t.*,
from t left outer join
     t tprev
     on t.id = t2.id + 1 and t.key = t2.key and t.val = t2.val
where t2.id is null

With this, assign the group as the minimum id using a join:

select t.id, t.key, t.val,
       min(tgrp.id) as GroupId
from t left outer join
     (select t.*,
      from t left outer join
           t tprev
           on t.id = t2.id + 1 and t.key = t2.key and t.val = t2.val
      where t2.id is null
    ) tgrp
    on t.id >= tgrp.id

If you want these to be consecutive numbers, then put them in a subquery and use dense_rank().

Upvotes: 1

Related Questions