Reputation: 20935
I am performing ETL on a set of Office, Employee, and Location tables by following the standard practice of bringing all the data into staging tables, first by way of Extract packages, and then performing Transform-Load on each of the staging tables to get the data into the respective destination table(s).
In each of my Transform-Load SSIS packages, I am performing CUD (Create, Update, Delete) by using MERGE JOIN and Conditional Splits.
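For readers unfamiliar with that pattern, here is a minimal sketch in plain Python (not SSIS) of what a MERGE JOIN plus Conditional Split effectively does: full-outer-join the staging rows to the destination rows on a business key, then route each row to a Create, Update, or Delete output. The table contents and the `Code` key are illustrative, not from the actual schema.

```python
def classify_cud(staging, destination, key):
    """Return (creates, updates, deletes) given two lists of row dicts,
    simulating a full outer join on `key` followed by a conditional split."""
    stage_by_key = {row[key]: row for row in staging}
    dest_by_key = {row[key]: row for row in destination}

    # In staging but not destination -> Create.
    creates = [row for k, row in stage_by_key.items() if k not in dest_by_key]
    # In both, but with differing values -> Update.
    updates = [row for k, row in stage_by_key.items()
               if k in dest_by_key and row != dest_by_key[k]]
    # In destination but no longer in staging -> Delete.
    deletes = [row for k, row in dest_by_key.items() if k not in stage_by_key]
    return creates, updates, deletes

# Hypothetical sample data, just to show the three routes.
staging = [{"Code": "NYC", "City": "New York"}, {"Code": "SEA", "City": "Seattle"}]
destination = [{"Code": "NYC", "City": "NYC"}, {"Code": "LAX", "City": "Los Angeles"}]
creates, updates, deletes = classify_cud(staging, destination, "Code")
```

In SSIS the same three routes come out of the Conditional Split as separate data paths feeding an OLE DB Destination, an OLE DB Command (or staged update), and a delete step respectively.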
This works fine when the data in the staging table is 1-to-1 with the corresponding real table. In the scenario below (see image), the SampleLocation table is 1-to-1 with the StageSampleLocation table.
The trouble I am having is deciding how to handle a situation where the Staging table has data that will go into the Foreign key table(s).
If you take a look at the following database diagram...
The data from StageSampleOffice goes into SampleOffice for fields that have Office data. In addition to office data, StageSampleOffice has Person data: in this example, the OfficeManagerName field will need to be looked up in the FK table SamplePerson. If the name doesn't exist in the SamplePerson table, it will need to be inserted into SamplePerson first, and the PersonId PK value for that person will be retrieved and stored as the FK value in the row for the imported Office in the SampleOffice table, in my Data Flow Task.
Similarly, for the address info in StageSampleOffice, the details will need to be looked up in the SampleLocation FK table, and if the address doesn't exist, a new one needs to be inserted with the corresponding values from StageSampleOffice. Once that is done, the LocationId for the address will be stored as the FK in the SampleOffice table.
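The lookup-or-insert step described above is a get-or-create pattern. Here is a minimal sketch in plain Python (outside SSIS) of resolving OfficeManagerName to a PersonId, inserting a new SamplePerson row first when the name is unknown; the in-memory dict stands in for the SamplePerson table, and the field names are illustrative.

```python
class PersonLookup:
    """Simulates looking up SamplePerson by name, inserting when missing."""

    def __init__(self):
        self._by_name = {}   # Name -> PersonId (stand-in for the table)
        self._next_id = 1    # stand-in for the identity column

    def get_or_create(self, name):
        """Return the PersonId for `name`, inserting a new row if needed."""
        if name not in self._by_name:
            self._by_name[name] = self._next_id   # INSERT + capture new PK
            self._next_id += 1
        return self._by_name[name]

persons = PersonLookup()
office_rows = [
    {"OfficeName": "HQ", "OfficeManagerName": "Ada"},
    {"OfficeName": "Branch", "OfficeManagerName": "Ada"},
    {"OfficeName": "Lab", "OfficeManagerName": "Grace"},
]
# Replace the manager name with the resolved FK before loading SampleOffice.
for row in office_rows:
    row["OfficeManagerPersonId"] = persons.get_or_create(row.pop("OfficeManagerName"))
```

In SSIS the equivalent building block is a Lookup transformation whose no-match output feeds an insert (with the new surrogate key retrieved afterwards), which is exactly why pre-loading the FK tables, as proposed below, simplifies the main data flow.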
As you can see, data for SampleLocation and SamplePerson could come into the system from 2 or more sources. In the example above, for SampleLocation I get a Location data file that only has the addresses, and I also get addresses as part of Office records from various office types, which come in through the StageSampleOffice table.
I have already separated the Extract workflows from the Transform-Load workflows. I have 1 extract package per staging table, which essentially reads data from the source (flat file or table), truncates the staging table, and imports everything as-is into the staging table.
I am thinking that for the Person data in StageSampleOffice, I will first insert the data into the StageSamplePerson table (not shown in diagram) and then execute the Transform-Load package for SamplePerson that will do the Create or Update for those Persons. Similarly, for the Location data in StageSampleOffice, I will first insert the data into the StageSampleLocation table (not shown in diagram) and then execute the Transform-Load package for SampleLocation that will do the Create or Update for those Locations. This way, all the FK rows will be present in the respective tables when the flow returns to my main package that does the Transform-Load for the SampleOffice table.
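The plan above amounts to ordering the packages so that each Transform-Load runs only after the packages for its FK tables have finished. A minimal sketch of that ordering logic, with hypothetical package names (in SSIS this would be Execute Package Tasks wired in sequence, not Python):

```python
def execution_order(depends_on):
    """Return package names ordered so every package follows its FK
    dependencies (a simple depth-first topological sort; assumes no cycles)."""
    order = []

    def visit(pkg):
        if pkg in order:
            return
        for dep in depends_on.get(pkg, []):
            visit(dep)
        order.append(pkg)

    for pkg in depends_on:
        visit(pkg)
    return order

# Hypothetical package names: the Office load depends on both FK loads.
depends_on = {
    "Load_SampleOffice": ["Load_SamplePerson", "Load_SampleLocation"],
    "Load_SamplePerson": [],
    "Load_SampleLocation": [],
}
order = execution_order(depends_on)
```

With this ordering, the SampleOffice data flow can use plain Lookups against SamplePerson and SampleLocation, since every FK row is guaranteed to exist by the time it runs.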
Is this a good idea, or is there a better way?
Thank you!
Upvotes: 1
Views: 544
Reputation: 31775
Seems like "6 of one, half-dozen of the other" to me.
Either way you're eventually checking every person and every location to see if it is already in the final destination table and doing an insert or not.
Whether you "pre-condense" them in the staging tables or not, the workload will be the same. I would go with the approach that seems more intuitive to you, because that will be the one that you will find more maintainable in the future.
Upvotes: 1