iamshazi

Reputation: 87

T-SQL to create an ID column

I'm using SQL Server 2008 R2 and I have the following dataset:

+---------+--------------+--------------+----------+------------+------------+
| Dossier | refmouvement | refadmission | refunite |  datedeb   |  datefin   |
+---------+--------------+--------------+----------+------------+------------+
| P001234 |         2567 |         1234 |      227 | 2012-01-01 | 2012-01-02 |
| P001234 |         2568 |         1234 |      227 | 2012-01-02 | 2012-01-03 |
| P001234 |         2569 |         1234 |      224 | 2012-01-03 | 2012-01-06 |
| P001234 |         2570 |         1234 |      232 | 2012-01-06 | 2012-01-10 |
| P001234 |         2571 |         1234 |      232 | 2012-01-10 | 2012-01-15 |
| P001234 |         2572 |         1234 |      232 | 2012-01-15 | 2012-01-20 |
| P001234 |         2573 |         1234 |      232 | 2012-01-20 | 2012-01-25 |
| P001234 |         2574 |         1234 |      224 | 2012-01-25 | 2012-01-29 |
| P001234 |         2575 |         1234 |      227 | 2012-01-29 | 2012-02-05 |
| P001234 |         2576 |         1234 |      227 | 2012-02-05 | 2012-02-10 |
| P001234 |         2577 |         1234 |      232 | 2012-02-10 | 2012-02-15 |
| P001234 |         2578 |         1234 |      201 | 2012-02-15 | 2012-02-26 |
+---------+--------------+--------------+----------+------------+------------+

This dataset is ordered by datedeb (the start date).

As you can see, the rows are contiguous: each row's datefin equals the next row's datedeb.

I need to create an ID column that increments every time refunite changes when the rows are read in datedeb order, so that consecutive rows with the same refunite share an ID, like this:

+----+---------+--------------+--------------+----------+------------+------------+
| ID | Dossier | refmouvement | refadmission | refunite |  datedeb   |  datefin   |
+----+---------+--------------+--------------+----------+------------+------------+
|  1 | P001234 |         2567 |         1234 |      227 | 2012-01-01 | 2012-01-02 |
|  1 | P001234 |         2568 |         1234 |      227 | 2012-01-02 | 2012-01-03 |
|  2 | P001234 |         2569 |         1234 |      224 | 2012-01-03 | 2012-01-06 |
|  3 | P001234 |         2570 |         1234 |      232 | 2012-01-06 | 2012-01-10 |
|  3 | P001234 |         2571 |         1234 |      232 | 2012-01-10 | 2012-01-15 |
|  3 | P001234 |         2572 |         1234 |      232 | 2012-01-15 | 2012-01-20 |
|  3 | P001234 |         2573 |         1234 |      232 | 2012-01-20 | 2012-01-25 |
|  4 | P001234 |         2574 |         1234 |      224 | 2012-01-25 | 2012-01-29 |
|  5 | P001234 |         2575 |         1234 |      227 | 2012-01-29 | 2012-02-05 |
|  5 | P001234 |         2576 |         1234 |      227 | 2012-02-05 | 2012-02-10 |
|  6 | P001234 |         2577 |         1234 |      232 | 2012-02-10 | 2012-02-15 |
|  7 | P001234 |         2578 |         1234 |      201 | 2012-02-15 | 2012-02-26 |
+----+---------+--------------+--------------+----------+------------+------------+

I can't work out which RANK(), ROW_NUMBER() or DENSE_RANK() call, or which combination of them, would achieve this. I have searched everywhere without finding anything; maybe I'm not using the right keywords.

Any help will be appreciated.

Thanks.

Here's the code that I've tried so far:

SELECT
    ROW_NUMBER() OVER (ORDER BY t1.[datedeb]) AS [ID1]
   ,DENSE_RANK() OVER (PARTITION BY t1.[refunite] ORDER BY t1.[datedeb]) AS [ID2]
   ,t1.[Dossier]
   ,t1.[refmouvement]
   ,t1.[refadmission]
   ,t1.[refunite]
   ,t1.[datedeb]
   ,t1.[datefin]
   ,t2.[refmouvement] AS [prev_refmouvement]
   ,t2.[refunite] AS [prev_refunite]
FROM [sometable] t1
LEFT OUTER JOIN [sometable] t2  /* self join to the previous row */
     ON t2.[datefin] = t1.[datedeb]
        AND t1.[refadmission] = t2.[refadmission]
ORDER BY
   t1.[datedeb]

This is what it returns:

+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
| ID1 | ID2 | Dossier | refmouvement | refadmission | refunite |  datedeb   |  datefin   | prev_refmouvement | prev_refunite |
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
|   1 |   1 | P001234 |         2567 |         1234 |      227 | 2012-01-01 | 2012-01-02 | NULL              | NULL          |
|   2 |   2 | P001234 |         2568 |         1234 |      227 | 2012-01-02 | 2012-01-03 | 2567              | 227           |
|   3 |   1 | P001234 |         2569 |         1234 |      224 | 2012-01-03 | 2012-01-06 | 2568              | 227           |
|   4 |   1 | P001234 |         2570 |         1234 |      232 | 2012-01-06 | 2012-01-10 | 2569              | 224           |
|   5 |   2 | P001234 |         2571 |         1234 |      232 | 2012-01-10 | 2012-01-15 | 2570              | 232           |
|   6 |   3 | P001234 |         2572 |         1234 |      232 | 2012-01-15 | 2012-01-20 | 2571              | 232           |
|   7 |   4 | P001234 |         2573 |         1234 |      232 | 2012-01-20 | 2012-01-25 | 2572              | 232           |
|   8 |   2 | P001234 |         2574 |         1234 |      224 | 2012-01-25 | 2012-01-29 | 2573              | 232           |
|   9 |   3 | P001234 |         2575 |         1234 |      227 | 2012-01-29 | 2012-02-05 | 2574              | 224           |
|  10 |   4 | P001234 |         2576 |         1234 |      227 | 2012-02-05 | 2012-02-10 | 2575              | 227           |
|  11 |   5 | P001234 |         2577 |         1234 |      232 | 2012-02-10 | 2012-02-15 | 2576              | 227           |
|  12 |   1 | P001234 |         2578 |         1234 |      201 | 2012-02-15 | 2012-02-26 | 2577              | 232           |
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+

Shaz

Upvotes: 3

Views: 282

Answers (3)

muhmud

Reputation: 4604

with sometable as (
    select *
    from (
        values ('P001234', 2567, 1234, 227, cast('2012-01-01' as date), cast('2012-01-02' as date)),
        ('P001234', 2568, 1234, 227, cast('2012-01-02' as date), cast('2012-01-03' as date)),
        ('P001234', 2569, 1234, 224, cast('2012-01-03' as date), cast('2012-01-06' as date)),
        ('P001234', 2570, 1234, 232, cast('2012-01-06' as date), cast('2012-01-10' as date)),
        ('P001234', 2571, 1234, 232, cast('2012-01-10' as date), cast('2012-01-15' as date)),
        ('P001234', 2572, 1234, 232, cast('2012-01-15' as date), cast('2012-01-20' as date)),
        ('P001234', 2573, 1234, 232, cast('2012-01-20' as date), cast('2012-01-25' as date)),
        ('P001234', 2574, 1234, 224, cast('2012-01-25' as date), cast('2012-01-29' as date)),
        ('P001234', 2575, 1234, 227, cast('2012-01-29' as date), cast('2012-02-05' as date)),
        ('P001234', 2576, 1234, 227, cast('2012-02-05' as date), cast('2012-02-10' as date)),
        ('P001234', 2577, 1234, 232, cast('2012-02-10' as date), cast('2012-02-15' as date)),
        ('P001234', 2578, 1234, 201, cast('2012-02-15' as date), cast('2012-02-26' as date))
    ) t (Dossier, refmouvement, refadmission, refunite, datedeb, datefin)
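), pos as (
    -- forward:  NULL if there is no following row, the following row's datedeb
    --           if its refunite differs, otherwise this row's own datedeb
    -- backward: the same test against the preceding row
    select d.*, (case when d2.refunite is null then null 
                        when d2.refunite != d.refunite then d2.datedeb 
                        else d.datedeb end) as forward, 
                (case when d3.refunite is null then null 
                        when d3.refunite != d.refunite then d3.datedeb 
                        else d.datedeb end) as backward
    from sometable d
    -- d2 is the following row, d3 the preceding row within the same admission
    left outer join sometable d2 on d.refadmission = d2.refadmission and d.datefin = d2.datedeb
    left outer join sometable d3 on d.refadmission = d3.refadmission and d.datedeb = d3.datefin
)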
select dense_rank() over (order by isnull((select min(datedeb) 
                                            from pos 
                                            where refadmission = t.refadmission 
                                            and refunite = t.refunite 
                                            and datedeb > t.datedeb 
                                            and datedeb = backward
                                            and ((t.datedeb = t.backward and t.datedeb = t.forward) 
                                                    or t.datedeb != t.backward or t.backward is null)
                                            and datedeb != forward), datedeb)) as ID, 
       Dossier, refmouvement, refadmission, refunite, datedeb, datefin
from pos t
order by datedeb

Upvotes: 1

larsts

Reputation: 451

You could, of course, define multiple CTEs in a single WITH clause, which eliminates the table variable. Based on Bogdan Sahlean's answer, you could rewrite it like this:

WITH CTEHelper AS 
    (SELECT ROW_NUMBER() OVER(ORDER BY datedeb) AS RowNum,
            refunite, 
            datedeb
    FROM    dbo.Sometable),
CTERecursive AS (
        SELECT  crt.RowNum,
                crt.refunite,
                crt.datedeb,
                1 AS Id -- Starting rank
        FROM    CTEHelper crt
        WHERE   crt.RowNum = 1
        UNION ALL
        SELECT  crt.RowNum,
                crt.refunite,
                crt.datedeb,
                CASE WHEN prev.refunite = crt.refunite THEN prev.Id ELSE prev.Id + 1 END
        FROM    CTEHelper crt INNER JOIN CTERecursive prev ON crt.RowNum = prev.RowNum + 1
    )
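SELECT  crt.Id, 
        s.*
FROM    CTERecursive crt
    JOIN dbo.Sometable s ON s.refunite = crt.refunite AND s.datedeb = crt.datedeb
-- OPTION (MAXRECURSION 0); -- Uncomment to lift the default limit of 100 recursion levels on larger tables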

Upvotes: 2

Bogdan Sahlean

Reputation: 1

DECLARE @Results TABLE(
    RowNum INT PRIMARY KEY,
    refunite INT NOT NULL,
    datedeb DATETIME NOT NULL
);

INSERT  @Results (RowNum, refunite, datedeb)
SELECT  ROW_NUMBER() OVER(ORDER BY datedeb) AS RowNum,
        refunite, 
        datedeb
FROM    dbo.MyTable;

WITH CTERecursive
AS (
    SELECT  crt.RowNum,
            crt.refunite,
            crt.datedeb,
            1 AS Rnk -- Starting rank
    FROM    @Results crt
    WHERE   crt.RowNum = 1
    UNION ALL
    SELECT  crt.RowNum,
            crt.refunite,
            crt.datedeb,
            CASE WHEN prev.refunite = crt.refunite THEN prev.Rnk ELSE prev.Rnk + 1 END
    FROM    @Results crt INNER JOIN CTERecursive prev ON crt.RowNum = prev.RowNum + 1
)
SELECT  *
FROM    CTERecursive
-- OPTION (MAXRECURSION 1000); -- Uncomment and adjust if the table needs more than the default 100 recursion levels

Results:

RowNum      refunite    datedeb                 Rnk
----------- ----------- ----------------------- ---
1           227         2012-01-01 00:00:00.000 1
2           227         2012-01-02 00:00:00.000 1
3           224         2012-01-03 00:00:00.000 2
4           232         2012-01-06 00:00:00.000 3
5           232         2012-01-10 00:00:00.000 3
6           232         2012-01-15 00:00:00.000 3
7           232         2012-01-20 00:00:00.000 3
8           224         2012-01-25 00:00:00.000 4
9           227         2012-01-29 00:00:00.000 5
10          227         2012-02-05 00:00:00.000 5
11          232         2012-02-10 00:00:00.000 6
12          201         2012-02-15 00:00:00.000 7
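
For comparison, here is a minimal non-recursive sketch of the same numbering using the common "gaps and islands" trick of subtracting two ROW_NUMBER() values. It is not taken from the answers above. The CTE names numbered and islands and the columns grp and island_start are made up for illustration, the sample rows are inlined so the query runs on its own, and it assumes datedeb values are unique. It only uses features that should be available in SQL Server 2008 R2 (no LAG).

```sql
-- Sketch only: replace the first CTE with the real table.
-- Assumption: datedeb is unique, so ROW_NUMBER() ordering is deterministic.
WITH sometable AS (
    SELECT *
    FROM (VALUES
        ('P001234', 2567, 1234, 227, CAST('2012-01-01' AS date), CAST('2012-01-02' AS date)),
        ('P001234', 2568, 1234, 227, CAST('2012-01-02' AS date), CAST('2012-01-03' AS date)),
        ('P001234', 2569, 1234, 224, CAST('2012-01-03' AS date), CAST('2012-01-06' AS date)),
        ('P001234', 2570, 1234, 232, CAST('2012-01-06' AS date), CAST('2012-01-10' AS date)),
        ('P001234', 2571, 1234, 232, CAST('2012-01-10' AS date), CAST('2012-01-15' AS date)),
        ('P001234', 2572, 1234, 232, CAST('2012-01-15' AS date), CAST('2012-01-20' AS date)),
        ('P001234', 2573, 1234, 232, CAST('2012-01-20' AS date), CAST('2012-01-25' AS date)),
        ('P001234', 2574, 1234, 224, CAST('2012-01-25' AS date), CAST('2012-01-29' AS date)),
        ('P001234', 2575, 1234, 227, CAST('2012-01-29' AS date), CAST('2012-02-05' AS date)),
        ('P001234', 2576, 1234, 227, CAST('2012-02-05' AS date), CAST('2012-02-10' AS date)),
        ('P001234', 2577, 1234, 232, CAST('2012-02-10' AS date), CAST('2012-02-15' AS date)),
        ('P001234', 2578, 1234, 201, CAST('2012-02-15' AS date), CAST('2012-02-26' AS date))
    ) t (Dossier, refmouvement, refadmission, refunite, datedeb, datefin)
), numbered AS (
    -- Within a run of consecutive rows with the same refunite, both row
    -- numbers grow by 1 per row, so their difference stays constant.
    SELECT s.*,
           ROW_NUMBER() OVER (ORDER BY datedeb)
         - ROW_NUMBER() OVER (PARTITION BY refunite ORDER BY datedeb) AS grp
    FROM sometable s
), islands AS (
    -- (refunite, grp) identifies one run; tag each row with the run's first datedeb.
    SELECT n.*,
           MIN(datedeb) OVER (PARTITION BY refunite, grp) AS island_start
    FROM numbered n
)
-- Number the runs in chronological order.
SELECT DENSE_RANK() OVER (ORDER BY island_start) AS ID,
       Dossier, refmouvement, refadmission, refunite, datedeb, datefin
FROM islands
ORDER BY datedeb;
```

On the sample rows this should give the IDs 1, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6, 7, the same as the Rnk column above. If the table holds several admissions and the IDs should restart for each one, adding refadmission to every PARTITION BY (and a PARTITION BY refadmission to the first ROW_NUMBER() and the DENSE_RANK()) is one way to do it.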

Upvotes: 3
