CiucaS
CiucaS

Reputation: 2128

SQL Group by dates

I've read some topics about group by sequence, it's almost what I need, but I could not figure out an solution for my problem.

I have a table like this:

PlanificatorPozitieID JalonID     DataStart               DataFinal               
--------------------- ----------- ----------------------- ----------------------- 
26                    46          2012-05-21 00:00:00.000 2012-05-31 00:00:00.000 
28                    48          2012-06-01 00:00:00.000 2012-06-01 00:00:00.000 
27                    60          2012-06-02 00:00:00.000 2012-06-02 00:00:00.000 
29                    60          2012-06-07 00:00:00.000 2012-06-08 00:00:00.000 
37                    60          2012-06-08 00:00:00.000 2012-06-10 00:00:00.000 
30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-06-24 00:00:00.000 
32                    65          2012-06-23 00:00:00.000 2012-07-01 00:00:00.000 
33                    66          2012-07-02 00:00:00.000 2012-07-02 00:00:00.000 
34                    66          2012-07-02 00:00:00.000 2012-07-05 00:00:00.000 
36                    66          2012-07-06 00:00:00.000 2012-07-10 00:00:00.000 


Desired output:

PlanificatorPozitieID JalonID     DataStart               DataFinal               
--------------------- ----------- ----------------------- ----------------------- 
26                    46          2012-05-21 00:00:00.000 2012-05-31 00:00:00.000 
28                    48          2012-06-01 00:00:00.000 2012-06-01 00:00:00.000 
27                    60          2012-06-02 00:00:00.000 2012-06-02 00:00:00.000 
29                    60          2012-06-07 00:00:00.000 2012-06-10 00:00:00.000 
30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-07-01 00:00:00.000 
33                    66          2012-07-02 00:00:00.000 2012-07-05 00:00:00.000 
36                    66          2012-07-06 00:00:00.000 2012-07-10 00:00:00.000 

So I have to group by JalonID, but the group by should only be made if the DataFinal >= DataStart. I want to get the timeperiod of each JalonID, but I want to get only the periods that has no pause time.

Home I made myself clear.

select MIN(pp.DataStart) as DataStart, MAX(pp.DataFinal) as DataFinal, pp.JalonID FROM #PlanPozitii pp
GROUP BY pp.JalonID 

But this query doesn't satisfy my condition to group by periods that are continuous.

As for clarify. Take the follow example

30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-06-24 00:00:00.000 
32                    65          2012-06-23 00:00:00.000 2012-07-01 00:00:00.000

2012-06-13 00:00:00.000 < 2012-06-18 00:00:00.000 so no group would take place between PlanificatorPozitieID 30 and 31. But 2012-06-24 00:00:00.000 > 2012-06-23 00:00:00.000 so now there would be a group between PlanificatorPozitieID 31 and 32.

So from these 3 rows we will have get two rows.

30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-07-01 00:00:00.000 





DECLARE @YourTable TABLE(PlanificatorPozitieID INT, JalonID INT,DataStart DATETIME, DataFinal DATETIME)
INSERT INTO @YourTable VALUES
(39,1223,'2015-02-16 00:00:00.000','2015-02-20 00:00:00.000'),
(43,1223,'2015-02-19 00:00:00.000','2015-02-24 00:00:00.000'),
(40,1223,'2015-02-23 00:00:00.000','2015-02-27 00:00:00.000'),
(42,1223,'2015-03-09 00:00:00.000','2015-03-13 00:00:00.000')
;WITH cte AS
(
SELECT  a.PlanificatorPozitieID, 
        a.JalonID, 
        a.DataStart, 
        COALESCE(b.DataFinal,a.datafinal) AS [DataFinal],
        ROW_NUMBER() OVER (PARTITION BY  a.JalonID ORDER BY DATEDIFF(dd,a.datastart, COALESCE(b.DataFinal,a.datafinal))) [rn],
        COUNT(*) OVER (PARTITION BY  a.JalonID) [cnt]
FROM    @YourTable a
        LEFT JOIN  @YourTable b 
        ON  a.JalonID = b.JalonID AND 
            b.DataStart BETWEEN a.DataStart AND a.DataFinal AND  
            a.PlanificatorPozitieID <> b.PlanificatorPozitieID AND 
            DATEDIFF(dd,a.DataStart,a.DataFinal) < DATEDIFF(dd,a.DataStart,b.DataFinal)
)
SELECT * 
FROM   cte 
WHERE  rn= 1 OR rn=cnt

Result :

PlanificatorPozitieID JalonID     DataStart               DataFinal               rn                   cnt
--------------------- ----------- ----------------------- ----------------------- -------------------- -----------
40                    1223        2015-02-23 00:00:00.000 2015-02-27 00:00:00.000 1                    4
43                    1223        2015-02-19 00:00:00.000 2015-02-27 00:00:00.000 4                    4

Expected result:

PlanificatorPozitieID JalonID     DataStart               DataFinal               
--------------------- ----------- ----------------------- ----------------------- 
39                    1223        2015-02-16 00:00:00.000 2015-02-27 00:00:00.000 
42                    1223        2015-03-09 00:00:00.000 2015-03-13 00:00:00.000 

Upvotes: 3

Views: 109

Answers (3)

CiucaS
CiucaS

Reputation: 2128

I've found a solutin, it's not very efficient as it uses 2 cursors. But It works in case anyone needs an example

DROP TABLE #DateTEst
DROP TABLE #pozitii
drop table #PozitiiJaloaneStandard
CREATE TABLE #DateTest (JalonStandardID int,DataStart datetime,DataFinal datetime)


INSERT INTO #DateTest VALUES (1,'2015-05-05','2015-05-08')
INSERT INTO #DateTest VALUES (1,'2015-05-09','2015-05-13')
INSERT INTO #DateTest VALUES (1,'2015-05-12','2015-05-15')
INSERT INTO #DateTest VALUES (1,'2015-05-16','2015-05-18')
INSERT INTO #DateTest VALUES (1,'2015-05-14','2015-05-19')
INSERT INTO #DateTest VALUES (2,'2015-05-05','2015-05-06')
INSERT INTO #DateTest VALUES (2,'2015-05-06','2015-05-07')
INSERT INTO #DateTest VALUES (2,'2015-05-06','2015-05-09')
INSERT INTO #DateTest VALUES (3,'2015-05-05','2015-05-07')
INSERT INTO #DateTest VALUES (3,'2015-05-08','2015-05-10')
INSERT INTO #DateTest VALUES (4,'2015-05-05','2015-05-08')
INSERT INTO #DateTest VALUES (5,'2015-05-07','2015-05-07')
INSERT INTO #DateTest VALUES (5,'2015-05-08','2015-05-08')
INSERT INTO #DateTest VALUES (5,'2015-05-09','2015-05-12')
INSERT INTO #DateTest VALUES (5,'2015-05-11','2015-05-12')
INSERT INTO #DateTest VALUES (6,'2015-05-05','2015-05-20')
INSERT INTO #DateTest VALUES (6,'2015-05-15','2015-05-18')




CREATE TABLE #Pozitii (DataStart datetime, DataFinal datetime)

CREATE TABLE #PozitiiJaloaneStandard ( JalonStandardID int, DataStart datetime, DataFinal datetime)



Declare @JalonStandarID int
DEclare @PlanificatorPozitieID int
Declare @DataStartPozitie datetime
Declare @DataFinalPozitie datetime 


DEclare @DataStartMin datetime
Declare @PozitieMinima int
Declare @DataFinalMin datetime


Declare @DataStartJalonStandard datetime
Declare @DataFinalJalonStandard datetime


Declare Crs_JaloaneStandard Cursor For
Select JalonStandardID
From #DateTest
ORDER BY JalonStandardID
Open Crs_JaloaneStandard
Fetch Next From Crs_JaloaneStandard Into 
@JalonStandarID
While @@Fetch_Status = 0 
    Begin
            INSERT INTO #Pozitii
            SELECT pp.DataStart,pp.DataFinal            
            FROM #DateTest pp
            WHERE pp.JalonStandardID = @JalonStandarID
            GROUP BY  pp.DataStart,pp.DataFinal



            SELECT @DataStartMin = MIN(DataStart) FROM #Pozitii

            SELECT   @DataFinalMin = DataFinal FROM 
             #Pozitii WHERE DataStart = @DataStartMin



            Declare Crs_Pozitii Cursor For
            SELECT 
            p.DataStart,p.DataFinal
            FROM #Pozitii p
            ORDER by p.DataStart ASC
            Open Crs_Pozitii
            Fetch Next From Crs_Pozitii Into
            @DataStartPozitie,@DataFinalPozitie
            while @@FETCH_STATUS = 0 
            begin

                    if (@DataFinalMin > @DataStartPozitie) and (@DataFinalMin <= @DataFinalPozitie )
                    begin 
                        set @DataFinalMin = @DataFinalPozitie
                    end
                    if (@DataFinalMin <= @DataStartPozitie) begin
                         INSERT INTO #PozitiiJaloaneStandard VALUES (@JalonStandarID,@DataStartMin,@DataFinalMin)
                         set @DataFinalMin = @DataFinalPozitie
                         set @DataStartMin = @DataStartPozitie      
                         print @DataStartPozitie
                         print @DataFinalPozitie                    
                    end



            Fetch Next From Crs_Pozitii Into 
            @DataStartPozitie,@DataFinalPozitie
            End 

            INSERT INTO #PozitiiJaloaneStandard VALUES (@JalonStandarID,@DataStartMin,@DataFinalMin)

            DELETE FROM #Pozitii
            Close Crs_Pozitii
            Deallocate Crs_Pozitii

    Fetch Next From Crs_JaloaneStandard Into 
    @JalonStandarID
    End

Close Crs_JaloaneStandard
Deallocate Crs_JaloaneStandard




SELECT * FROM #PozitiiJaloaneStandard
GROUP BY JalonStandardID,DataStart,DataFinal

Upvotes: 0

Deep
Deep

Reputation: 3202

I don't know if it will work fine with actual data because I haven't tested it rigorously, but here is a solution :

DECLARE @YourTable TABLE(PlanificatorPozitieID INT, JalonID INT,DataStart DATETIME, DataFinal DATETIME)
INSERT INTO @YourTable VALUES
(26,46,'2012-05-21 00:00:00.000','2012-05-31 00:00:00.000'), 
(28,48,'2012-06-01 00:00:00.000','2012-06-01 00:00:00.000'), 
(27,60,'2012-06-02 00:00:00.000','2012-06-02 00:00:00.000'), 
(29,60,'2012-06-07 00:00:00.000','2012-06-08 00:00:00.000'), 
(37,60,'2012-06-08 00:00:00.000','2012-06-10 00:00:00.000'), 
(30,65,'2012-06-10 00:00:00.000','2012-06-13 00:00:00.000'), 
(31,65,'2012-06-18 00:00:00.000','2012-06-24 00:00:00.000'), 
(32,65,'2012-06-23 00:00:00.000','2012-07-01 00:00:00.000'), 
(33,66,'2012-07-02 00:00:00.000','2012-07-02 00:00:00.000'), 
(34,66,'2012-07-02 00:00:00.000','2012-07-05 00:00:00.000'), 
(36,66,'2012-07-06 00:00:00.000','2012-07-10 00:00:00.000')

;WITH cte AS
(
SELECT  a.PlanificatorPozitieID, 
        a.JalonID, 
        a.DataStart, 
        COALESCE(b.DataFinal,a.datafinal) AS [DataFinal],
        ROW_NUMBER() OVER (PARTITION BY  a.JalonID ORDER BY DATEDIFF(dd,a.datastart, COALESCE(b.DataFinal,a.datafinal))) [rn],
        COUNT(*) OVER (PARTITION BY  a.JalonID) [cnt]
FROM    @YourTable a
        LEFT JOIN  @YourTable b 
        ON  a.JalonID = b.JalonID AND 
            b.DataStart BETWEEN a.DataStart AND a.DataFinal AND  
            a.PlanificatorPozitieID <> b.PlanificatorPozitieID AND 
            DATEDIFF(dd,a.DataStart,a.DataFinal) < DATEDIFF(dd,a.DataStart,b.DataFinal)
)
SELECT * 
FROM   cte 
WHERE  rn= 1 OR rn=cnt

Upvotes: 2

Raybarg
Raybarg

Reputation: 720

https://msdn.microsoft.com/en-us/library/hh231256.aspx

I was looking to LAG() function which makes it possible to access previous rows in query. My thought was to calculate the difference between current row and previous row to get a column to group by with;

with a as (
select 1 as ID, 1 as A, 2 as B union all
select 1 as ID, 2 as A, 3 as B union all
select 1 as ID, 3 as A, 4 as B union all
select 1 as ID, 4 as A, 5 as B union all
select 1 as ID, 6 as A, 7 as B
) 
select ID, A, B, A-LAG(B,1,0) OVER (order by ID) as koe from a where B > A

If you run that query, you get result as;

ID          A           B           koe
----------- ----------- ----------- -----------
1           1           2           1
1           2           3           0
1           3           4           0
1           4           5           0
1           6           7           1

Imagine A is DataStart and B is DataFinal, and calculated koe is difference, as you can see it works on all but first row... first row gets difference between nonexisting row (so its 0). But this is the direction I would start trying.

Upvotes: -1

Related Questions