Reputation: 3418
I have a table called DimWorkerCode
. It has a column called WorkerCode
. This is our business key here. Changes that can happen to a WorkerCode
is UnitCode
and WindowsID
as shown in figure below.
I want to ignore WindowsID
, and just select WorkerCode
, Unitcode
and StartDate
which will be minimum StartDate
and EndDate
which will be maximum EndDate
.
I tried this query:
SELECT
WorkerCode, UnitCode,
MIN(StartDate) AS StartDate,
MAX(ISNULL(EndDate, '9999/12/31')) AS EndDate
FROM
dbo.DimWorkerCode
GROUP BY
WorkerCode, UnitCode
and got this result set:
But I am expecting result something like this.
How can I do it in T-SQL? Help please.
Upvotes: 0
Views: 63
Reputation: 3418
I got the idea from Zhorov and modifying it to meet my exact requirement.
CREATE TABLE DimWorkerCode (
ID int,
WorkerCode varchar(4),
UnitCode varchar(4),
WindowID int,
StartDate date,
EndDate date
);
INSERT INTO DimWorkerCode
(ID, WorkerCode, UnitCode, WindowID, StartDate, EndDate)
VALUES
(1, 'AA01', 'AA00', 2, '2007-01-01', '2008-01-01'),
(2, 'AA01', 'AA00', 5, '2008-01-01', '2008-01-01'),
(3, 'AA01', 'AA00', 3, '2009-01-01', '2010-01-01'),
(4, 'AA01', 'XYZ0', 9, '2010-01-01', '2011-01-01'),
(5, 'AA01', 'XYZ0', 12, '2011-01-01', '2012-01-01'),
(6, 'AA01', 'AA00', 13, '2012-01-01', '2013-01-01'),
(7, 'AA01', 'AA00', 24, '2013-01-01', '2014-01-01'),
(8, 'AA01', 'AA00', 17, '2014-01-01', '2015-01-01'),
(9, 'AA01', 'AA00', 18, '2015-01-01', '2016-01-01'),
(10, 'AA01', 'AA00', 22, '2016-01-01', NULL)
Here it goes
WITH CTE AS
(
SELECT
ID,
WorkerCode, LAG(WorkerCode, 1, WorkerCode) OVER (ORDER BY ID) AS PrevWorkerCode,
UnitCode, LAG(UnitCode, 1, UnitCode) OVER (ORDER BY ID) AS PrevUnitCode,
StartDate,
ISNULL(EndDate , '9999/12/31') AS EndDate
FROM DimWorkerCode
)
,
ChangedCTE AS
(
SELECT *, IIF(WorkerCode = PrevWorkerCode AND UnitCode = PrevUnitCode, 0, 1) AS Changed FROM CTE
)
,
GroupedCTE AS
(
SELECT *, SUM(Changed) OVER(ORDER BY ID) AS GroupID FROM ChangedCTE
)
,
MinMaxCTE As
(
SELECT MAX(WorkerCode) AS WorkerCode, MAX(UnitCode) AS UnitCode, MIN(StartDate) AS StartDate, MAX(EndDate) AS EndDate FROM GroupedCTE GROUP BY GroupID
)
SELECT WorkerCode, UnitCode, StartDate, IIF(EndDate = '9999-12-31', NULL, EndDate) AS EndDate FROM MinMaxCTE
Output:
Upvotes: 0
Reputation: 29993
If I understand your question correctly and you want to get min
and max
dates, next approach may help. The important part here is to define groups (each new group begins when WorkerCode
or Unitcode
are changed).
Table:
CREATE TABLE DimWorkerCode (
ID int,
WorkerCode varchar(4),
UnitCode varchar(4),
WindowID int,
StartDate date,
EndDate date
)
INSERT INTO DimWorkerCode
(ID, WorkerCode, UnitCode, WindowID, StartDate, EndDate)
VALUES
(1, 'AA01', 'AA00', 2, '2007-01-01', '2008-01-01'),
(2, 'AA01', 'AA00', 5, '2008-01-01', '2008-01-01'),
(3, 'AA01', 'AA00', 3, '2009-01-01', '2010-01-01'),
(4, 'AA01', 'XYZ0', 9, '2010-01-01', '2011-01-01'),
(5, 'AA01', 'XYZ0', 12, '2011-01-01', '2012-01-01'),
(6, 'AA01', 'AA00', 13, '2012-01-01', '2013-01-01'),
(7, 'AA01', 'AA00', 24, '2013-01-01', '2014-01-01'),
(8, 'AA01', 'AA00', 17, '2014-01-01', '2015-01-01'),
(9, 'AA01', 'AA00', 18, '2015-01-01', '2016-01-01'),
(10, 'AA01', 'AA00', 22, '2016-01-01', NULL)
Statement:
;WITH ChangeCTE AS (
SELECT
*,
CASE
WHEN (UnitCode = LAG(UnitCode) OVER (ORDER BY ID)) AND (WorkerCode = LAG(WorkerCode) OVER (ORDER BY ID)) THEN 0
ELSE 1
END AS Change
FROM DimWorkerCode
), GroupCTE AS (
SELECT
*,
SUM(Change) OVER (ORDER BY ID) AS GroupID
FROM ChangeCTE
)
SELECT
MAX(WorkerCode) AS WorkerCode,
MAX(UnitCode) AS UnitCode,
MIN(StartDate) AS StartDate,
MAX(ISNULL(EndDate, '9999/12/31')) AS EndDate
FROM GroupCTE
GROUP BY GroupID
Output:
WorkerCode UnitCode StartDate EndDate
AA01 AA00 01/01/2007 00:00:00 01/01/2010 00:00:00
AA01 XYZ0 01/01/2010 00:00:00 01/01/2012 00:00:00
AA01 AA00 01/01/2012 00:00:00 31/12/9999 00:00:00
Upvotes: 1
Reputation: 56
The issue is that you are trying to pull some data without grouping on it (enddate for min(startdate).
I'm not certain this is the best solution, but it should work. Using Row_Number(), we're listing the records by min(startdate) and max(enddate) without grouping them; then you pull the records at the beginning of both listings.
select
WorkerCode,
UnitCode,
StartDate,
EndDate
from
(
select
WorkerCode,
UnitCode,
StartDate,
EndDate
row_number() over (partition by WorkerCode, UnitCode order by StartDate) as MinStartDateRow,
row_number() over (partition by WorkerCode, UnitCode order by EndDate desc) as MaxEndDateRow
from
dbo.DimWorkerCode
) x
where
MinStartDateRow = 1
or MaxEndDateRow = 1
Upvotes: 2