Dan Rubio

Reputation: 4907

My PostgreSQL after insert trigger fails to insert into another table when ON CONFLICT encounters a duplicate

I am slowly working through a feature that imports large CSV files. There is a chance that the contents of an uploaded file will trigger a uniqueness conflict. I've combed Stack Overflow for similar resources, but I still can't get my trigger to insert into another table when a duplicate entry is found. The code below is what I have currently implemented for this process. This is implemented in a Rails app, but the underlying SQL is the following.

When a user uploads a file, the following happens when it's processed.


CREATE TEMP TABLE codes_temp ON COMMIT DROP AS SELECT * FROM codes WITH NO DATA;

create or replace function log_duplicate_code() 
  returns trigger 
  language plpgsql 
  as 
$$
begin
    insert into duplicate_codes(id, campaign_id, code_batch_id, code, code_id, created_at, updated_at)
    values (gen_random_uuid(), excluded.campaign_id, excluded.code_batch_id, excluded.code, excluded.code_id, now(), now());
    return null;
end; 
$$;

create trigger log_duplicate_code 
after insert on codes 
for each row execute procedure log_duplicate_code();

INSERT INTO codes SELECT * FROM codes_temp ct 
    ON CONFLICT (campaign_id, code) 
DO update set updated_at = excluded.updated_at;

DROP TRIGGER log_duplicate_code ON codes;

When I try to run this process, nothing happens at all. If I upload a CSV file with the value CODE01 and then upload CODE01 again, the duplicate_codes table doesn't get populated, and I don't understand why. No error is raised either, so it seems like DO UPDATE ... is doing something. What am I missing here?

I also have some questions that come to mind even if this were to work as intended, since I am uploading millions of these codes:

1) Should my trigger be a statement-level trigger instead of a row-level trigger, for scalability?

2) What if someone else uploads another file with millions of codes while the first is still processing? I have my code wrapped in a transaction. Would a new, separate trigger be created, and would it conflict with the one already running?

####### EDIT #1 #######

Thanks to Adriens' comment I see that the OLD record is not available in an AFTER INSERT trigger. I updated my code to use EXCLUDED and I receive the following error from the trigger.

ERROR:  missing FROM-clause entry for table "excluded" (PG::UndefinedTable)
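For reference: EXCLUDED is only visible inside the ON CONFLICT clause of the INSERT statement itself; inside a plpgsql trigger function the incoming row is exposed as NEW. A sketch of the same trigger body rewritten with NEW (note this alone still would not log duplicates, because rows that hit the conflict take the DO UPDATE path and never fire the AFTER INSERT trigger):

```sql
-- Sketch: EXCLUDED replaced by NEW, the record that is actually
-- visible inside a row-level trigger function.
create or replace function log_duplicate_code()
  returns trigger
  language plpgsql
as $$
begin
    insert into duplicate_codes(id, campaign_id, code_batch_id, code,
                                code_id, created_at, updated_at)
    values (gen_random_uuid(), NEW.campaign_id, NEW.code_batch_id,
            NEW.code, NEW.code_id, now(), now());
    return null;
end;
$$;
```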

Finally, here are the S.O. posts I've used to try to tailor my code, but I just can't seem to make it work.

####### EDIT #2 #######

I have a little more context on to how this is implemented.

When the CSV is loaded, a staging table called codes_temp is created and then dropped at the end of the transaction. This table has no unique constraints. From what I've read, only the actual table I want to insert the codes into needs the unique constraint for the conflict to be raised.

In my INSERT statement, DO UPDATE SET updated_at = excluded.updated_at; doesn't trigger a unique constraint error. As of right now, I don't know whether it should. I borrowed this logic from the S.O. question "postgresql log into another table with on conflict"; it seemed to me like I had to update something if I specify the DO UPDATE SET clause.

Last, the correct criterion for codes in the database is the following.

For example, these are example entries in my codes table:

id, campaign_id, code
1,  1,           CODE01
2,  1,           CODE02
3,  1,           CODE03

If any of these codes appears again in a later upload, it should not be inserted into the codes table; it needs to be inserted into the duplicate_codes table instead, because it was already uploaded before.

id, campaign_id, code
1,  1,           CODE01
2,  1,           CODE02
3,  1,           CODE03

As for the codes_temp table, I don't have any unique constraints on it, so there are no criteria there for selecting the right row.

postgresql log into another table with on conflict

Postgres insert on conflict update using other table

Postgres on conflict - insert to another table

How to do INSERT INTO SELECT and ON DUPLICATE UPDATE in PostgreSQL 9.5?

Upvotes: 0

Views: 520

Answers (1)

Adrian Klaver

Reputation: 19665

Seems to me something like:

INSERT INTO 
   codes 
SELECT 
   distinct on(campaign_id, code) * 
FROM 
   codes_temp ct 
ORDER BY
  campaign_id, code, id DESC;
   

Assuming id is assigned sequentially, the above would select the most recent row for each (campaign_id, code) pair into codes.
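As a self-contained illustration of the DISTINCT ON behaviour (with made-up values standing in for the real codes_temp):

```sql
-- Hypothetical staging contents with an in-file duplicate.
WITH codes_temp(id, campaign_id, code) AS (
    VALUES (1, 1, 'CODE01'),
           (2, 1, 'CODE01'),  -- duplicate within the same file
           (3, 1, 'CODE02')
)
SELECT DISTINCT ON (campaign_id, code) *
FROM codes_temp
ORDER BY campaign_id, code, id DESC;
-- One row per (campaign_id, code); the highest id wins:
--  id | campaign_id | code
--   2 |           1 | CODE01
--   3 |           1 | CODE02
```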

Then:


INSERT INTO 
    duplicate_codes 
SELECT
    * 
FROM 
   codes_temp AS ct
LEFT JOIN 
   codes 
ON 
   ct.id = codes.id
WHERE 
   codes.id IS NULL;

The above would take the rows in codes_temp that were not selected into codes and insert them into the duplicates table.
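If codes may already contain rows from earlier uploads, a variant of the same idea is to let ON CONFLICT DO NOTHING decide which rows get in, capture them with RETURNING, and anti-join against that. A sketch, assuming (as in the query above) that duplicate_codes accepts the same column list as codes_temp:

```sql
-- Insert the deduplicated rows, skipping any that already exist in
-- codes; capture the ids that actually went in.
WITH inserted AS (
    INSERT INTO codes
    SELECT DISTINCT ON (campaign_id, code) *
    FROM codes_temp
    ORDER BY campaign_id, code, id DESC
    ON CONFLICT (campaign_id, code) DO NOTHING
    RETURNING id
)
-- Everything from the staging table that did not make it into codes
-- (in-file duplicates and pre-existing codes alike) is logged.
INSERT INTO duplicate_codes
SELECT ct.*
FROM codes_temp ct
LEFT JOIN inserted i ON ct.id = i.id
WHERE i.id IS NULL;
```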

Obviously not tested on your data set. I would create a small test data set that has uniqueness conflicts and test with that.

Upvotes: 1
