Reputation: 61
Which is the best practice to link tables which have unique relation for a determined period of time?
Example 1
Tab_Civil_Status:
ID | NAME | ID_STATUS |
---|---|---|
1 | Jenny | 1 |
2 | Mike | 2 |
Tab_Civil_Status_Desc:
ID_STATUS | STATUS_DESC |
---|---|
1 | Single |
2 | Married |
Tab_Civil_Events:
PERSON-ID | EVENT | REF_Date | new-Status |
---|---|---|---|
2 | Wedding | 10/5/2017 | 2 |
A query to Tab_Civil_Status
should provide the below:
Case A:
Select * FROM `Tab_Civil_Status` […] Where REF_Date = 1/1/2020
ID | NAME | ID_STATUS |
---|---|---|
1 | Jenny | 1 |
2 | Mike | 2 |
Case B:
Select * FROM `Tab_Civil_Status` […] Where REF_Date = 1/1/2016
ID | NAME | ID_STATUS |
---|---|---|
1 | Jenny | 1 |
2 | Mike | 1 |
Example 2
Tab_Orders:
ID | Order_Code | Product_Id |
---|---|---|
1 | a | 1 |
2 | b | 2 |
Tab_Products:
Product_Id | Product_Name |
---|---|
1 | Apple |
2 | Meta |
Tab_Products_Events:
Product_Id | EVENT | REF_Date | Old_Name | New_Name |
---|---|---|---|---|
2 | Change_Name | 1/12/2020 | Meta |
These are simplified sample, what I am looking for is the best practice to structure tables and queries for similar cases; the logic to apply rather than the actual code to use.
a - Is the best practice to have an "Event Table" as in the above cases tracking the changes? or there are better ways?
I know there might be sevaral ways to reach the same scope, another sample is the below
Tab_Products_Events_Sample_2
:
Product_Id | EVENT | REF_Date | Name |
---|---|---|---|
2 | Given _Name | 1/12/2010 | |
2 | Change_Name | 1/12/2019 | Meta |
b - is there a more efficient query than the below? Also based on the above point
@Selected_date = '1/1/2019'
SELECT
o.[Order_Code],
ifnull(e.[Old_Name],p.[Product_Name]) as p_name
FROM Tab_Orders as o
LEFT JOIN Tab_Products as p
ON o.[Product_Id] = p.[Product_Id]
CROSS APPLY (
SELECT top 1 t.[Old_Name]
FROM Tab_Products_Events as t
WHERE o.[Product_Id] = t.[Product_Id]
and t.[REF_Date] >= @Selected_date
ORDER BY t.[REF_Date] desc
) e
result
Order_Code | p_name |
---|---|
a | Apple |
b |
Thanks
Upvotes: 1
Views: 902
Reputation: 37313
Note: my answer is based on the first example
1st approach: generating date ranges
You should first extract the date ranges for each person status using a similar query:
SELECT
PERSON_ID, new_Status, REF_Date,
ISNULL(LEAD(REF_Date) OVER(PARTITION BY PERSON_ID ORDER BY REF_DATE), GETDATE()) End_DATE
FROM
Tab_Civil_Events
As an example, consider the following sample data:
CREATE TABLE Tab_Civil_Events
(
PERSON_ID INT,
EVENT VARCHAR(50),
REF_Date DATETIME,
new_Status INT
)
INSERT INTO Tab_Civil_Events(PERSON_ID, EVENT, REF_Date, new_Status)
VALUES (2, 'Wedding', '10/5/2017', 2),
(2, 'Divorce', '10/5/2018', 1),
(2, 'Wedding', '10/5/2019', 2),
(2, 'Divorce', '10/5/2020', 1)
SELECT
PERSON_ID, new_Status, REF_Date,
ISNULL(LEAD(REF_Date) OVER (PARTITION BY PERSON_ID ORDER BY REF_DATE),
GETDATE()) End_DATE
FROM
Tab_Civil_Events
This will result in the following table:
PERSON_ID | new_Status | REF_Date | End_DATE |
---|---|---|---|
2 | 2 | 2017-10-05T00:00:00Z | 2018-10-05T00:00:00Z |
2 | 1 | 2018-10-05T00:00:00Z | 2019-10-05T00:00:00Z |
2 | 2 | 2019-10-05T00:00:00Z | 2020-10-05T00:00:00Z |
2 | 1 | 2020-10-05T00:00:00Z | 2022-01-29T09:47:52.7Z |
The second step is to create a common table expression to query the data. For example:
DECLARE @QueryDate DATETIME = '20190501'
;WITH CTE_1 AS
(
SELECT
PERSON_ID,
new_Status,
REF_Date,
ISNULL(LEAD(REF_Date) OVER (PARTITION BY PERSON_ID ORDER BY REF_DATE),
GETDATE()) End_DATE
FROM
Tab_Civil_Events
)
SELECT
CTE_1.PERSON_ID,
dsc.STATUS
FROM
CTE_1
INNER JOIN
Tab_Civil_Status_Desc dsc ON CTE_1.new_Status = dsc.ID
WHERE
@QueryDate BETWEEN REF_Date AND End_DATE
Using the sample data I provided previously, this query will give the following result:
PERSON_ID | STATUS |
---|---|
2 | Single |
While changing the date value to 20191101
SET @QueryDate = '20191101'
Will result in the following values:
PERSON_ID | STATUS |
---|---|
2 | Married |
Make sure that you have created the appropriate indexes to support your queries.
For example:
CREATE NONCLUSTERED INDEX IX_PersonId_RefDate
ON Tab_Civil_Events(PERSON_ID, REF_DATE)
INCLUDE(new_status);
2nd approach: retrieving the nearest event date
You should first retrieve the maximum event dates for each person having a value smallest than the given date:
SELECT
PERSON_ID,
MAX(Ref_Date) AS event_date
FROM
Tab_Civil_Events
WHERE
Ref_Date <= @QueryDate
GROUP BY
PERSON_ID
Next, you should join this query with the events table to get the status information as follows:
DECLARE @QueryDate DATETIME = '20190501'
SELECT
tbl.PERSON_ID, dsc.STATUS
FROM
Tab_Civil_Events tbl
INNER JOIN
Tab_Civil_Status_Desc dsc ON tbl.new_Status = dsc.ID
INNER JOIN
(SELECT
PERSON_ID,
MAX(Ref_Date) AS event_date
FROM
Tab_Civil_Events
WHERE
Ref_Date <= @QueryDate
GROUP BY
PERSON_ID) t1 ON tbl.PERSON_ID = t1.PERSON_ID
AND t1.event_date = tbl.ref_date
3rd approach: temporal tables
I am posting this part after you mentioned that you can change the way the data is stored.
Using SQL Server 2016 or a newer version, you can use temporal tables to simplify querying your data.
As mentioned in the official documentation:
SQL Server 2016 introduced support for temporal tables (also known as system-versioned temporal tables) as a database feature that brings built-in support for providing information about data stored in the table at any point in time rather than only the data that is correct at the current moment in time.
Temporal tables are implemented as a pair of tables, a current table and a history table. Within each of these tables, start and end dates are stored and used to define the period of validity for each row.
You can refer to the following resources to learn more about this type of data structure:
Upvotes: 1