w0051977
w0051977

Reputation: 15807

Add a group id column using row_number

Please see the DDL below:

create table #Test (reference int identity,id int,dataset varchar(10),primary key (reference))
insert into #Test (id,dataset) values (1,'System1')
insert into #Test (id,dataset) values (2,'System10')
insert into #Test (id,dataset) values (9,'System3')
insert into #Test (id,dataset) values (0,'')
insert into #Test (id,dataset) values (8,'System6')
insert into #Test (id,dataset) values (19,'System7')
insert into #Test (id,dataset) values (0,'')
insert into #Test (id,dataset) values (55,'System10')
insert into #Test (id,dataset) values (62,'System4')

I am looking for an output like this:

enter image description here

An ID of zero separates the groups. I have tried this so far:

select * from (
select  row_number() over (order by reference) as groupid, #test.* from #test

However, the groupid does not reset when a zero is reached in the ID column.

I realize this is bad design. The table was created by an external company. I am just querying it.

Upvotes: 4

Views: 2149

Answers (3)

Praveen ND
Praveen ND

Reputation: 560

In SQL Server 2012, the simplest way by using CTE.You can use this:

WITH CTE AS
(
  SELECT
      COALESCE(SUM(CASE WHEN ID = 0 THEN 1 END) OVER (ORDER BY reference),0)+ 1 AS GroupID,
      ID AS ID,
      dataset AS Dataset
  FROM 
     #Test 
)
SELECT * FROM CTE
WHERE ID !=0

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269963

In SQL Server 2008, the simplest way is to use outer apply. I would do this as:

SELECT cume.GroupID, t.*
FROM #Test t OUTER APPLY
      (SELECT COUNT(*) as GroupID
       FROM #Test t2
       WHERE t2.reference <= t.reference AND t2.id = 0
      ) cume
WHERE t.id <> 0;

Note that the OP doesn't seem to want rows with 0s in the output. It is safe to exclude them from the outer query.

One alternative method would use recursive CTEs, but that might be troublesome on large tables.

Upvotes: 0

Giorgos Betsos
Giorgos Betsos

Reputation: 72175

Try this:

SELECT *,
       COALESCE(SUM(CASE WHEN id = 0 THEN 1 END) 
                OVER (ORDER BY reference),0) + 1 AS GroupID
FROM #Test

The query uses SUM OVER with an ORDER BY clause, to calculate a running total of the number of 0 values. If we add 1 to this running total we obtain the required GroupID value.

Note: This version of SUM is available from SQL Server 2012 onwards.

Edit:

You can simulate SUM OVER with an ORDER BY clause in older versions of SQL Server using OUTER APPLY as in the query below:

SELECT t1.*,
       x.cnt + 1 AS GroupID
FROM #Test t1
OUTER APPLY (SELECT COUNT(cASE WHEN id = 0 THEN 1 END) AS cnt
             FROM #Test t2
             WHERE t2.reference <= t1.reference) AS x

Upvotes: 5

Related Questions