AswinRajaram

Reputation: 1622

SSIS incremental data load error

I am trying to perform an incremental insert from a staging table (cust_reg_dim_stg) into the warehouse table (dim_cust_reg). This is the query I am using.

 insert into dim_cust_reg WITH(TABLOCK)
(
    channel_id
    ,cust_reg_id
    ,cust_id
    ,status
    ,date_created
    ,date_activated
    ,date_archived
    ,custodian_id
    ,reg_type_id
    ,reg_flags
    ,acc_name
    ,acc_number
    ,sr_id
    ,sr_type
    ,as_of_date
    ,ins_timestamp
    )
select channel_id
    ,cust_reg_id
    ,cust_id
    ,status
    ,date_created
    ,date_activated
    ,date_archived
    ,custodian_id
    ,reg_type_id
    ,reg_flags
    ,acc_name
    ,acc_number
    ,sr_id
    ,sr_type
    ,as_of_date
    ,getdate() ins_timestamp

from umpdwstg..cust_reg_dim_stg stg with(nolock)
join lookup_channel ch with(nolock) on stg.channel_name = ch.channel_name

where not exists
(select * from dim_cust_reg dest
    where dest.cust_reg_id=stg.cust_reg_id 
    and dest.sr_id=stg.sr_id
    and dest.channel_id=ch.channel_id )

The staging table has no channel_id column, so channel_id is looked up from a channel lookup table (lookup_channel). When I run this query I get the following error.

 Violation of PRIMARY KEY constraint 'PK__dim_cust__4A293521A789A5FA'. 
 Cannot insert duplicate key in object 'dbo.dim_cust_reg'.

What is wrong with the query? channel_id, sr_id and cust_reg_id form the unique key combination. There seems to be no data error.

Upvotes: 1

Views: 94

Answers (2)

Rupesh Kumar

Reputation: 136

There are two areas you will need to troubleshoot.

The first is this join:

join lookup_channel ch with(nolock) on stg.channel_name = ch.channel_name 

The channel_name on an incoming staging row may differ from the channel recorded for the matching row in the destination dimension.

Or the cause may be this condition inside the NOT EXISTS subquery:

and dest.sr_id=stg.sr_id
    and dest.channel_id=ch.channel_id

Here, again, the incoming channel_id may differ from the one already stored in the destination for the same record. My suggestion is to leave channel_id out of the comparison temporarily and troubleshoot from there. Once the data is loaded into the target, you can tell whether channel_id was the cause of the error.
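
As a rough sketch of that troubleshooting, the two queries below use only the tables and columns from the question. They assume, without knowing your actual key definition, that the primary key may not include channel_id, and that lookup_channel may hold more than one row per channel_name. Neither assumption is confirmed by the question, so treat the results as hints only.

```sql
-- 1) Does any channel_name appear more than once in lookup_channel?
--    If so, the join returns the matching staging rows more than once.
SELECT ch.channel_name,
       COUNT(*) AS lookup_rows
FROM lookup_channel ch
GROUP BY ch.channel_name
HAVING COUNT(*) > 1;

-- 2) Staging rows whose (cust_reg_id, sr_id) already exists in the
--    destination under a different channel_id. The NOT EXISTS in the
--    question lets these rows through, so they fail if the primary key
--    does not actually include channel_id.
SELECT stg.cust_reg_id,
       stg.sr_id,
       ch.channel_id   AS incoming_channel_id,
       dest.channel_id AS existing_channel_id
FROM umpdwstg..cust_reg_dim_stg stg
JOIN lookup_channel ch
  ON stg.channel_name = ch.channel_name
JOIN dim_cust_reg dest
  ON dest.cust_reg_id = stg.cust_reg_id
 AND dest.sr_id = stg.sr_id
WHERE dest.channel_id <> ch.channel_id;
```

If both queries come back empty, the channel lookup is probably not the culprit.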

Happy troubleshooting!

Upvotes: 1

Rajesh Bhat

Reputation: 811

If the staging table (cust_reg_dim_stg) already contains duplicate entries, the SELECT will return both of those records and try to insert both into dim_cust_reg. The NOT EXISTS check only compares against rows already in the destination, so it does not catch duplicates within the same batch. Use DISTINCT in the SELECT statement.
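
To check whether this is what is happening, something like the following could help. It is only a sketch built from the names in the question. Note that DISTINCT removes only rows that are identical in every selected column, so if the duplicates differ in, say, status, a ROW_NUMBER() filter is one alternative. Ordering by as_of_date to keep the latest row is my assumption, not something stated in the question, and it also assumes as_of_date comes from the staging table.

```sql
-- Keys that occur more than once in the rows about to be inserted.
SELECT ch.channel_id,
       stg.cust_reg_id,
       stg.sr_id,
       COUNT(*) AS row_count
FROM umpdwstg..cust_reg_dim_stg stg
JOIN lookup_channel ch
  ON stg.channel_name = ch.channel_name
GROUP BY ch.channel_id, stg.cust_reg_id, stg.sr_id
HAVING COUNT(*) > 1;

-- Keep one row per key (assumed rule: the most recent as_of_date wins).
WITH ranked AS (
    SELECT ch.channel_id,
           stg.cust_reg_id,
           stg.sr_id,
           stg.as_of_date,
           ROW_NUMBER() OVER (
               PARTITION BY ch.channel_id, stg.cust_reg_id, stg.sr_id
               ORDER BY stg.as_of_date DESC
           ) AS rn
    FROM umpdwstg..cust_reg_dim_stg stg
    JOIN lookup_channel ch
      ON stg.channel_name = ch.channel_name
)
SELECT channel_id, cust_reg_id, sr_id, as_of_date
FROM ranked
WHERE rn = 1;
```

The second query lists only the key columns and as_of_date; for the real load you would carry the remaining columns through the CTE and keep the NOT EXISTS filter from the question.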

Upvotes: 0
