sqlenthusiast
sqlenthusiast

Reputation: 185

Update record with IsActive

I am trying to build a dimension table which will identify current and historical data, by using the fields 'IsActive' and 'EffectiveDate'.

IsActive of 1 = Active.

EffectiveDate = Date the record was ingested.

Scenario: I have an existing record with the current view of the employee, however if any information about the employee changes, instead of updating the existing record, I would like to create a new line item with the updated record becoming active and the previous record becoming inactive.

ID Name Surname Age IsActive EffectiveDate
1 John Doe 54 1 2021-01-01

When a change to the employee data is done, I would like to update the table as follows:

ID Name Surname Age IsActive EffectiveDate
1 John Doe 54 0 2021-01-01
2 John Doe 64 1 2021-06-25

I am using the combination of 'Name', 'Surname', and 'Age' to identify unique records. The reason I am using these three fields is because there is no other information provided to uniquely identify an employee.

I would really appreciate any assistance.

Upvotes: 0

Views: 3809

Answers (2)

anon
anon

Reputation:

You could do this yourself, but I would give Temporal Tables a try first, this is pretty much exactly what they were designed for. You just need to learn slightly different query syntax to find the rows that were "active" at a point in time or during a range.

In the meantime I can share a very simple example of how to do this yourself, but I agree with the comment that Name + Surname + Age is a terrible primary key, because all three of those could change for any given employee (and the age absolutely will change - this is why we usually store Birthdate instead). Why don't we pretend that we assign each employee, somewhere, a unique but meaningless ID (the kind of thing that would go on their access badge or the data in the HR database) just to keep the example simple.

Let's say your table is:

CREATE TABLE dbo.EmployeeHistoryStuff
(
  ID            bigint       NOT NULL IDENTITY(1,1) PRIMARY KEY,
  EmployeeID    int, 
  Name          nvarchar(50), 
  Surname       nvarchar(50),
  Age           tinyint,
  IsActive      bit          NOT NULL DEFAULT (1),
  EffectiveDate datetime2(0) NOT NULL DEFAULT sysutcdatetime()
);

And we can add a few sample rows:

INSERT dbo.EmployeeHistoryStuff(EmployeeID, Name, Surname, Age) 
VALUES(1, N'Aaron', N'Bertrand', 29),
      (2, N'Teemu', N'Selanne',  31),
      (3, N'Bobby', N'Orr',      62),
      (4, N'Wayne', N'Gretzky',  55);

Now we can create a trigger that intercepts any updates any previous rows to IsActive = 1 and inserts a new row:

CREATE TRIGGER dbo.InsteadOfEmployeeHistoryStuff
ON dbo.EmployeeHistoryStuff
INSTEAD OF UPDATE
AS
BEGIN
  SET NOCOUNT ON;
  DECLARE @now datetime2(0) = sysutcdatetime();
  
  UPDATE old SET IsActive = 0
    FROM inserted AS i
    INNER JOIN dbo.EmployeeHistoryStuff AS old
    ON i.EmployeeID = old.EmployeeID
    WHERE old.IsActive = 1;
  
  INSERT dbo.EmployeeHistoryStuff(EmployeeID, Name, Surname, Age, EffectiveDate)
    SELECT EmployeeID, Name, Surname, Age, DATEADD(SECOND, 1, @now)
      FROM inserted;
END
GO

Now, if we perform an update because some employees had a birthday:

 UPDATE dbo.EmployeeHistoryStuff SET Age += 1 WHERE EmployeeID IN (1,2);
ID EmployeeID Name Surname Age IsActive EffectiveDate
1 1 Aaron Bertrand 29 False 2021-10-22 13:37:24
2 2 Teemu Selanne 31 False 2021-10-22 13:37:24
3 3 Bobby Orr 62 True 2021-10-22 13:37:24
4 4 Wayne Gretzky 55 True 2021-10-22 13:37:24
5 1 Aaron Bertrand 30 True 2021-10-22 13:37:30
6 2 Teemu Selanne 32 True 2021-10-22 13:37:30

Then Wayne wants to get out of the limelight:

 UPDATE dbo.EmployeeHistoryStuff SET Surname = N'Schmetzky' WHERE EmployeeID = 4;
ID EmployeeID Name Surname Age IsActive EffectiveDate
1 1 Aaron Bertrand 29 False 2021-10-22 13:37:24
2 2 Teemu Selanne 31 False 2021-10-22 13:37:24
3 3 Bobby Orr 62 True 2021-10-22 13:37:24
4 4 Wayne Gretzky 55 False 2021-10-22 13:37:24
5 1 Aaron Bertrand 30 True 2021-10-22 13:37:30
6 2 Teemu Selanne 32 True 2021-10-22 13:37:30
7 4 Wayne Schmetzky 55 True 2021-10-22 13:37:35

Then Wayne changes his mind:

UPDATE dbo.EmployeeHistoryStuff SET Surname = N'Gretzky' WHERE EmployeeID = 4;
ID EmployeeID Name Surname Age IsActive EffectiveDate
1 1 Aaron Bertrand 29 False 2021-10-22 13:37:24
2 2 Teemu Selanne 31 False 2021-10-22 13:37:24
3 3 Bobby Orr 62 True 2021-10-22 13:37:24
4 4 Wayne Gretzky 55 False 2021-10-22 13:37:24
5 1 Aaron Bertrand 30 True 2021-10-22 13:37:30
6 2 Teemu Selanne 32 True 2021-10-22 13:37:30
7 4 Wayne Schmetzky 55 False 2021-10-22 13:37:35
8 4 Wayne Gretzky 55 True 2021-10-22 13:37:40
9 4 Wayne Gretzky 55 True 2021-10-22 13:37:40

My timings are a bit off here, I'll come back and fix in a bit.

Upvotes: 1

NiharikaMoola
NiharikaMoola

Reputation: 5074

You can use the Merge statement as one of the options, to update records in Target table (dim) when matched with Source table (raw table) based on condition.

Here to update the existing record you can’t use ‘Age’ column to define uniqueness in your combination, as your new dataset has the change of data in ‘Age’ column.

Below are my repro details:

  1. Existing dimension employee data:

enter image description here

  1. Source data:

enter image description here

  1. Code:

     Create procedure sp_employee
     AS   
    
     SET NOCOUNT ON;  
    
     BEGIN
    
    --update Active flag of existing data
    
     MERGE dim_employee AS Target
     USING employee_data AS Source
     ON Source.Name = Target.Name
     and Source.Surname = Target.Surname
     WHEN MATCHED THEN UPDATE SET
     Target.IsActive = 0;
    
    --Insert new data (assuming the table has only changed data set)
     Insert into dim_employee
     select Name, Surname, Age, IsActive, EffectiveDate from employee_data;
    
    End
    GO
    
  2. If you have old records in the source table along with the new change set, you can use the below code to insert only the change dataset.

     Insert into dim_employee
     select Name, Surname, Age, IsActive, EffectiveDate from employee_data
     except
     select distinct a.Name, a.Surname, a.Age, a.IsActive, a.EffectiveDate from employee_data a
     inner join dim_employee b on a.Name = b.Name and a.Surname = b.Surname and a.Age = b.Age;
    
  3. Execute stored procedure

     Exec sp_employee;
    
  4. After executing stored procedure: Here IsActive column data for ‘John Doe’ got updated to 0 and a new record for ‘John Doe’ got inserted from employee_data table.

enter image description here

Upvotes: 0

Related Questions