Postgres: upsert a row and update a primary key column

Question

Suppose I have two tables in my Postgres database:

create table transactions
(
    id bigint primary key,
    doc_id bigint not null,
    -- lots of other columns...
    amount numeric not null
);

-- same columns
create temporary table updated_transactions
(
    id bigint primary key,
    doc_id bigint not null,
    -- lots of other columns...
    amount numeric not null
);

Both tables have just a primary key, and no unique indexes.

I need to upsert rows from updated_transactions into transactions using the following rules:

id column values in transactions and updated_transactions don't match
other columns like doc_id, etc (except of the amount) should match
when a matching row is found, update both amount and id columns
when a matching row is not found, insert it

id values in updated_transactions are taken from a sequence. A business object just populates updated_transactions and then merges the new or updated rows from it into transactions using an upsert query. So my old unchanged transactions keep their ids intact, and the updated ones are assigned new ids.

In MSSQL and Oracle, it would be a merge statement similar to this:

merge into transactions t
using updated_transactions ut on t.doc_id = ut.doc_id, ...
when matched then
    update set t.id = ut.id, t.amount = ut.amount
when not matched then
    insert (t.id, t.doc_id, ..., t.amount)
    values (ut.id, ut.doc_id, ..., ut.amount);

In PostgreSQL, I suppose it should be something like this:

insert into transactions(id, doc_id, ..., amount)
select coalesce(t.id, ut.id), ut.doc_id, ... ut.amount
from updated_transactions ut
left join transactions t on t.doc_id = ut.doc_id, ....
    on conflict
    on constraint transactions_pkey
    do update
    set amount = excluded.amount, id = excluded.id

The problem is with the do update clause: excluded.id is an old value from transactions table, while I need a new value from updated_transactions.

ut.id value is inaccessible for the do update clause, and the only thing I can use is the excluded row. But the excluded row has only coalesce(t.id, ut.id) expression which returns old id values for the existing rows.

Is it possible to update both id and amount columns using the upsert query?

yallie · Accepted Answer

Looks like the task can be accomplished using writable CTEs instead of the plain upsert.

First, I'll post the easier version of the query that answers the original question as it was asked. This solution assumes that doc_id, unit_id columns address a candidate key, but doesn't require a unique index on these columns.

Test data:

create temp table transactions
(
    id bigint primary key,
    doc_id bigint,
    unit_id bigint,
    amount numeric
);

create temp table updated_transactions
(
    id bigint primary key,
    doc_id bigint,
    unit_id bigint,
    amount numeric
); 

insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 2, 15), (3, 1, 3, 10);

insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 2, 15), (8, 1, 4, 20);

The query to merge updated_transactions into transactions:

with new_values as
(
    select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount 
    from updated_transactions ut
    left join transactions t 
        on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id
),
updated as
(
    update transactions tr
    set id = nv.new_id, amount = nv.amount
    from new_values nv
    where id = nv.old_id
    returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);

The results:

select * from transactions

-- id | doc_id | unit_id | amount
------+--------+---------+-------
--  3 |   1    |    3    |  10    -- not changed
--  6 |   1    |    1    |  11    -- updated
--  7 |   1    |    2    |  15    -- updated 
--  8 |   1    |    4    |  20    -- inserted

In my real application doc_id, unit_id aren't always unique, so they don't represent a candidate key. To match the rows I take into account the row number, calculated for the rows sorted by their ids. So here's my second solution.

Test data:

-- the tables are the same as above
insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 1, 15), (3, 1, 3, 10);

insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 1, 15), (8, 1, 4, 20);

The merge query:

with trans as
(
    select id, doc_id, unit_id, amount,
        row_number() over(partition by doc_id, unit_id order by id) row_num
    from transactions
),
updated_trans as
(
    select id, doc_id, unit_id, amount,
        row_number() over(partition by doc_id, unit_id order by id) row_num
    from updated_transactions
),
new_values as
(
    select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount 
    from updated_trans ut
    left join trans t 
        on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id and t.row_num = ut.row_num
),
updated as
(
    update transactions tr
    set id = nv.new_id, amount = nv.amount
    from new_values nv
    where id = nv.old_id
    returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);

The results:

select * from transactions;

-- id | doc_id | unit_id | amount
------+--------+---------+-------
--  3 |   1    |    3    | 10     -- not changed
--  6 |   1    |    1    | 11     -- updated
--  7 |   1    |    1    | 15     -- updated
--  8 |   1    |    4    | 20     -- inserted

References:

Postgres: upsert a row and update a primary key column

Answers (2)

Related Questions