Merin Nakarmi
Merin Nakarmi

Reputation: 3418

T-SQL code to find maximum and minimum date over a series of changes

I have a table called DimWorkerCode. It has a column called WorkerCode. This is our business key here. Changes that can happen to a WorkerCode is UnitCode and WindowsID as shown in figure below.

WorkerCode table

I want to ignore WindowsID, and just select WorkerCode, Unitcode and StartDate which will be minimum StartDate and EndDate which will be maximum EndDate.

I tried this query:

SELECT 
    WorkerCode, UnitCode, 
    MIN(StartDate) AS StartDate, 
    MAX(ISNULL(EndDate, '9999/12/31')) AS EndDate 
FROM 
    dbo.DimWorkerCode 
GROUP BY 
    WorkerCode, UnitCode

and got this result set:

enter image description here

But I am expecting result something like this.

Result

How can I do it in T-SQL? Help please.

Upvotes: 0

Views: 63

Answers (3)

Merin Nakarmi
Merin Nakarmi

Reputation: 3418

I got the idea from Zhorov and modifying it to meet my exact requirement.

CREATE TABLE DimWorkerCode (
   ID int,
   WorkerCode varchar(4), 
   UnitCode varchar(4),
   WindowID int,
   StartDate date, 
   EndDate date
);

INSERT INTO DimWorkerCode
   (ID, WorkerCode, UnitCode, WindowID, StartDate, EndDate)
VALUES
   (1,  'AA01', 'AA00', 2,  '2007-01-01', '2008-01-01'),
   (2,  'AA01', 'AA00', 5,  '2008-01-01', '2008-01-01'),
   (3,  'AA01', 'AA00', 3,  '2009-01-01', '2010-01-01'),
   (4,  'AA01', 'XYZ0', 9,  '2010-01-01', '2011-01-01'),
   (5,  'AA01', 'XYZ0', 12, '2011-01-01', '2012-01-01'),
   (6,  'AA01', 'AA00', 13, '2012-01-01', '2013-01-01'),
   (7,  'AA01', 'AA00', 24, '2013-01-01', '2014-01-01'),
   (8,  'AA01', 'AA00', 17, '2014-01-01', '2015-01-01'),
   (9,  'AA01', 'AA00', 18, '2015-01-01', '2016-01-01'),
   (10, 'AA01', 'AA00', 22, '2016-01-01', NULL)

Here it goes

WITH CTE AS
(
SELECT 
    ID, 
    WorkerCode, LAG(WorkerCode, 1, WorkerCode) OVER (ORDER BY ID) AS PrevWorkerCode, 
    UnitCode,   LAG(UnitCode, 1, UnitCode) OVER (ORDER BY ID) AS PrevUnitCode,
    StartDate, 
    ISNULL(EndDate , '9999/12/31') AS EndDate
    FROM DimWorkerCode
)
,
ChangedCTE AS
(
    SELECT *, IIF(WorkerCode = PrevWorkerCode AND UnitCode = PrevUnitCode, 0, 1) AS Changed FROM CTE
)
,
GroupedCTE AS 
(
    SELECT *, SUM(Changed) OVER(ORDER BY ID) AS GroupID FROM ChangedCTE
)
,
MinMaxCTE As
(
SELECT MAX(WorkerCode) AS WorkerCode, MAX(UnitCode) AS UnitCode, MIN(StartDate) AS StartDate, MAX(EndDate) AS EndDate FROM GroupedCTE GROUP BY GroupID
)

SELECT WorkerCode, UnitCode, StartDate, IIF(EndDate = '9999-12-31', NULL, EndDate) AS EndDate FROM MinMaxCTE

Output:

enter image description here

Upvotes: 0

Zhorov
Zhorov

Reputation: 29993

If I understand your question correctly and you want to get min and max dates, next approach may help. The important part here is to define groups (each new group begins when WorkerCode or Unitcode are changed).

Table:

CREATE TABLE DimWorkerCode (
   ID int,
   WorkerCode varchar(4), 
   UnitCode varchar(4),
   WindowID int,
   StartDate date, 
   EndDate date
)
INSERT INTO DimWorkerCode
   (ID, WorkerCode, UnitCode, WindowID, StartDate, EndDate)
VALUES
   (1,  'AA01', 'AA00', 2,  '2007-01-01', '2008-01-01'),
   (2,  'AA01', 'AA00', 5,  '2008-01-01', '2008-01-01'),
   (3,  'AA01', 'AA00', 3,  '2009-01-01', '2010-01-01'),
   (4,  'AA01', 'XYZ0', 9,  '2010-01-01', '2011-01-01'),
   (5,  'AA01', 'XYZ0', 12, '2011-01-01', '2012-01-01'),
   (6,  'AA01', 'AA00', 13, '2012-01-01', '2013-01-01'),
   (7,  'AA01', 'AA00', 24, '2013-01-01', '2014-01-01'),
   (8,  'AA01', 'AA00', 17, '2014-01-01', '2015-01-01'),
   (9,  'AA01', 'AA00', 18, '2015-01-01', '2016-01-01'),
   (10, 'AA01', 'AA00', 22, '2016-01-01', NULL)

Statement:

;WITH ChangeCTE AS (
   SELECT 
      *,
      CASE 
         WHEN (UnitCode = LAG(UnitCode) OVER (ORDER BY ID)) AND (WorkerCode = LAG(WorkerCode) OVER (ORDER BY ID)) THEN 0 
         ELSE 1 
      END AS Change
   FROM DimWorkerCode
), GroupCTE AS (
   SELECT 
      *,
      SUM(Change) OVER (ORDER BY ID) AS GroupID
   FROM ChangeCTE
)
SELECT 
   MAX(WorkerCode) AS WorkerCode, 
   MAX(UnitCode) AS UnitCode, 
   MIN(StartDate) AS StartDate, 
   MAX(ISNULL(EndDate, '9999/12/31')) AS EndDate 
FROM GroupCTE 
GROUP BY GroupID

Output:

WorkerCode  UnitCode    StartDate           EndDate
AA01        AA00        01/01/2007 00:00:00 01/01/2010 00:00:00
AA01        XYZ0        01/01/2010 00:00:00 01/01/2012 00:00:00
AA01        AA00        01/01/2012 00:00:00 31/12/9999 00:00:00

Upvotes: 1

adam.stowe
adam.stowe

Reputation: 56

The issue is that you are trying to pull some data without grouping on it (enddate for min(startdate).

I'm not certain this is the best solution, but it should work. Using Row_Number(), we're listing the records by min(startdate) and max(enddate) without grouping them; then you pull the records at the beginning of both listings.

select
    WorkerCode,
    UnitCode,
    StartDate,
    EndDate
from
    (
        select      
            WorkerCode,
            UnitCode,
            StartDate,
            EndDate
            row_number() over (partition by WorkerCode, UnitCode order by StartDate) as MinStartDateRow,
            row_number() over (partition by WorkerCode, UnitCode order by EndDate desc) as MaxEndDateRow
        from
            dbo.DimWorkerCode
    ) x
where
    MinStartDateRow = 1
    or MaxEndDateRow = 1

Upvotes: 2

Related Questions