Reputation: 741
I need to update a foreign key in table 1 with the correct entry from table 2. The correct foreign key is the table 2 entry whose effective date is the latest one that the period start date falls after, that is, the period start date falls after that entry's effective date but before the next effective date in table 2. If there are multiple entries in table 2 with the same effective date, then use the modified date column as a tie breaker and pick the most recent one. Here is the basic table structure (all dates are in Date format):
Table 1
pK1  PeriodStartDate          pK2
1    2016-04-01 00:00:00.000
2    2016-07-01 00:00:00.000
Table 2
pK2  EffectiveFrom            ModifiedDate
3    2016-03-01 00:00:00.000  2016-04-01 00:00:00.000
4    2016-05-01 00:00:00.000  2016-06-01 00:00:00.000
5    2016-05-01 00:00:00.000  2016-06-02 00:00:00.000
So in the above example table 1 would look like this:
pK1  PeriodStartDate          pK2
1    2016-04-01 00:00:00.000  3
2    2016-07-01 00:00:00.000  5
This is because row 1 falls between March 1st and May 1st (from table 2). Row 2 falls after the last effective date, but as there are two entries with the same effective date we choose the most recently modified one.
I'm not sure of the solution. I was trying something like this:
UPDATE table1
SET pK2 = table2.pK2
FROM table2
WHERE PeriodStartDate > (SELECT FIRST(table2.EffectiveFrom) FROM table2)
I'm just not sure how to find an entry that is bounded by another row (and then use another column as the tie breaker).
Upvotes: 0
Views: 155
Reputation: 7392
First off, you need to apply a row_number() over Table2, partitioned on the PeriodStart (the question's EffectiveFrom column, named PeriodStart in the script below) and ordered by the ModifiedDate (desc). Call this MaxModified; within each PeriodStart, 1 is always the most recently modified record.
pK2  PeriodStart              ModifiedDate             MaxModified
3    2016-03-01 00:00:00.000  2016-04-01 00:00:00.000  1
5    2016-05-01 00:00:00.000  2016-06-02 00:00:00.000  1
4    2016-05-01 00:00:00.000  2016-06-01 00:00:00.000  2
Then, for only the rows where MaxModified=1, you add a new "id" so we can line up each start date with the next row's start date (our end date). This is also done with the row_number() function, ordered by the PeriodStart.
pK2  PeriodStart              ModifiedDate             MaxModified  myID
3    2016-03-01 00:00:00.000  2016-04-01 00:00:00.000  1            1
5    2016-05-01 00:00:00.000  2016-06-02 00:00:00.000  1            2
Then we take that result and join it to itself offset by one row to get an end date value for each original row.
pK2  PeriodStart              ModifiedDate             MaxModified  myID  PeriodEnd
3    2016-03-01 00:00:00.000  2016-04-01 00:00:00.000  1            1     2016-05-01 00:00:00.000
5    2016-05-01 00:00:00.000  2016-06-02 00:00:00.000  1            2     NULL
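As a side note, the numbering and self-join in the last two steps can likely be collapsed into one window function. The following is only a sketch, assuming SQL Server 2012 or later (where LEAD() is available) and the same sample rows as the question; it swaps the self-join for LEAD() and is not part of the original approach.

```sql
-- Sample rows from the question; PeriodStart stands in for EffectiveFrom
DECLARE @Table2 TABLE (pK2 INT, PeriodStart DATETIME, ModifiedDate DATETIME);

INSERT INTO @Table2
VALUES (3,'2016-03-01','2016-04-01'),
       (4,'2016-05-01','2016-06-01'),
       (5,'2016-05-01','2016-06-02');

WITH OrderedList AS
(
    -- 1 = most recently modified row for each PeriodStart
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY PeriodStart ORDER BY ModifiedDate DESC) AS MaxModified
    FROM @Table2
)
-- LEAD() runs after the WHERE filter, so it only sees the surviving rows
-- and returns the next PeriodStart as PeriodEnd (NULL for the last row)
SELECT pK2, PeriodStart, ModifiedDate,
       LEAD(PeriodStart) OVER(ORDER BY PeriodStart) AS PeriodEnd
FROM OrderedList
WHERE MaxModified = 1;
```

For the sample data this should give the same two rows as the table above: pK2 3 with a PeriodEnd of 2016-05-01 and pK2 5 with a NULL PeriodEnd.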
Once we have that, it's a simple matter of joining on the start/end dates to get our pK2 value.
Full script...
-- Sample data from the question (Table2.EffectiveFrom is named PeriodStart here)
DECLARE @Table1 TABLE (pK1 INT, PeriodStart DATETIME, pK2 INT)
DECLARE @Table2 TABLE (pK2 INT, PeriodStart DATETIME, ModifiedDate DATETIME)

INSERT INTO @Table1
VALUES (1,'2016-04-01',NULL),
       (2,'2016-07-01',NULL)

INSERT INTO @Table2
VALUES (3,'2016-03-01','2016-04-01'),
       (4,'2016-05-01','2016-06-01'),
       (5,'2016-05-01','2016-06-02')

;WITH OrderedList AS
(
    -- Rank rows sharing a PeriodStart; 1 = most recently modified
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY PeriodStart ORDER BY ModifiedDate DESC) AS MaxModified
    FROM @Table2
), X AS
(
    -- Keep one row per PeriodStart and number them chronologically
    SELECT *,
           ROW_NUMBER() OVER(ORDER BY PeriodStart) AS myID
    FROM OrderedList
    WHERE MaxModified=1
), Y AS
(
    -- Pair each row with the next row's PeriodStart as its PeriodEnd (NULL for the last row)
    SELECT L.*, R.PeriodStart AS PeriodEnd
    FROM X L
    LEFT JOIN X R ON L.myID=R.myID-1
)
-- A NULL PeriodEnd means the range is open-ended, so future period starts still match
UPDATE T SET pK2=Y.pK2
FROM @Table1 T
LEFT JOIN Y ON T.PeriodStart >= Y.PeriodStart
           AND (Y.PeriodEnd IS NULL OR T.PeriodStart < Y.PeriodEnd)

SELECT *
FROM @Table1
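If the intermediate start/end ranges are not needed for anything else, a shorter variant may be worth considering. This is a sketch of a different technique (OUTER APPLY with TOP (1)), not the approach described above; it assumes the same sample tables and the same "on or after the effective date" matching rule, and the aliases S and C are just illustrative names.

```sql
-- Same sample data as the question (Table2.EffectiveFrom named PeriodStart)
DECLARE @Table1 TABLE (pK1 INT, PeriodStart DATETIME, pK2 INT);
DECLARE @Table2 TABLE (pK2 INT, PeriodStart DATETIME, ModifiedDate DATETIME);

INSERT INTO @Table1
VALUES (1,'2016-04-01',NULL),
       (2,'2016-07-01',NULL);

INSERT INTO @Table2
VALUES (3,'2016-03-01','2016-04-01'),
       (4,'2016-05-01','2016-06-01'),
       (5,'2016-05-01','2016-06-02');

-- For each Table1 row, pick the Table2 row with the latest PeriodStart
-- that is not after the row's own PeriodStart; ties on PeriodStart are
-- broken by the most recent ModifiedDate
UPDATE T
SET pK2 = C.pK2
FROM @Table1 T
OUTER APPLY
(
    SELECT TOP (1) S.pK2
    FROM @Table2 S
    WHERE S.PeriodStart <= T.PeriodStart
    ORDER BY S.PeriodStart DESC, S.ModifiedDate DESC
) C;

SELECT *
FROM @Table1;
```

With the sample rows this assigns pK2 3 to row 1 and pK2 5 to row 2. Because it is an OUTER APPLY, a Table1 row that precedes every Table2 date ends up with a NULL pK2, which mirrors the LEFT JOIN behaviour of the full script.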
Upvotes: 1