Reputation: 2530
Suppose I have two tables in my Postgres database:
create table transactions
(
id bigint primary key,
doc_id bigint not null,
-- lots of other columns...
amount numeric not null
);
-- same columns
create temporary table updated_transactions
(
id bigint primary key,
doc_id bigint not null,
-- lots of other columns...
amount numeric not null
);
Both tables have just a primary key, and no unique indexes.
I need to upsert rows from updated_transactions
into transactions
using the following rules:
transactions
and updated_transactions
don't matchdoc_id
, etc (except of the amount
) should matchamount
and id
columnsid
values in updated_transactions
are taken from a sequence.
A business object just populates updated_transactions
and then merges the
new or updated rows from it into transactions
using an upsert query.
So my old unchanged transactions keep their id
s intact, and the updated ones
are assigned new id
s.
In MSSQL and Oracle, it would be a merge
statement similar to this:
merge into transactions t
using updated_transactions ut on t.doc_id = ut.doc_id, ...
when matched then
update set t.id = ut.id, t.amount = ut.amount
when not matched then
insert (t.id, t.doc_id, ..., t.amount)
values (ut.id, ut.doc_id, ..., ut.amount);
In PostgreSQL, I suppose it should be something like this:
insert into transactions(id, doc_id, ..., amount)
select coalesce(t.id, ut.id), ut.doc_id, ... ut.amount
from updated_transactions ut
left join transactions t on t.doc_id = ut.doc_id, ....
on conflict
on constraint transactions_pkey
do update
set amount = excluded.amount, id = excluded.id
The problem is with the do update
clause: excluded.id
is an old value
from transactions
table, while I need a new value from updated_transactions
.
ut.id
value is inaccessible for the do update
clause, and the only thing I can
use is the excluded
row. But the excluded
row has only coalesce(t.id, ut.id)
expression which returns old id
values for the existing rows.
Is it possible to update both id
and amount
columns using the upsert query?
Upvotes: 0
Views: 5353
Reputation: 2530
Looks like the task can be accomplished using writable CTEs instead of the plain upsert.
First, I'll post the easier version of the query that answers the original question as it was asked. This solution assumes that doc_id, unit_id
columns address a candidate key, but doesn't require a unique index on these columns.
Test data:
create temp table transactions
(
id bigint primary key,
doc_id bigint,
unit_id bigint,
amount numeric
);
create temp table updated_transactions
(
id bigint primary key,
doc_id bigint,
unit_id bigint,
amount numeric
);
insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 2, 15), (3, 1, 3, 10);
insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 2, 15), (8, 1, 4, 20);
The query to merge updated_transactions
into transactions
:
with new_values as
(
select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount
from updated_transactions ut
left join transactions t
on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id
),
updated as
(
update transactions tr
set id = nv.new_id, amount = nv.amount
from new_values nv
where id = nv.old_id
returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);
The results:
select * from transactions
-- id | doc_id | unit_id | amount
------+--------+---------+-------
-- 3 | 1 | 3 | 10 -- not changed
-- 6 | 1 | 1 | 11 -- updated
-- 7 | 1 | 2 | 15 -- updated
-- 8 | 1 | 4 | 20 -- inserted
In my real application doc_id, unit_id
aren't always unique, so they don't represent a candidate key. To match the rows I take into account the row number, calculated for the rows sorted by their id
s. So here's my second solution.
Test data:
-- the tables are the same as above
insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 1, 15), (3, 1, 3, 10);
insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 1, 15), (8, 1, 4, 20);
The merge query:
with trans as
(
select id, doc_id, unit_id, amount,
row_number() over(partition by doc_id, unit_id order by id) row_num
from transactions
),
updated_trans as
(
select id, doc_id, unit_id, amount,
row_number() over(partition by doc_id, unit_id order by id) row_num
from updated_transactions
),
new_values as
(
select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount
from updated_trans ut
left join trans t
on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id and t.row_num = ut.row_num
),
updated as
(
update transactions tr
set id = nv.new_id, amount = nv.amount
from new_values nv
where id = nv.old_id
returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);
The results:
select * from transactions;
-- id | doc_id | unit_id | amount
------+--------+---------+-------
-- 3 | 1 | 3 | 10 -- not changed
-- 6 | 1 | 1 | 11 -- updated
-- 7 | 1 | 1 | 15 -- updated
-- 8 | 1 | 4 | 20 -- inserted
References:
Upvotes: 2
Reputation: 5930
Create unique index on those columns you use as key and pass its name in your upsert expression, so that it uses it instead of pkey.
Then it will insert row if no matches were found, using ID from updated_transactions
. If it finds match, then you can use excluded.id to get ID from updated_transactions
.
I think that left join transactions
is redundant.
So it would look kinda like this:
insert into transactions(id, doc_id, ..., amount)
select ut.id, ut.doc_id, ... ut.amount
from updated_transactions ut
on conflict
on constraint transactions_multi_column_unique_index
do update
set amount = excluded.amount, id = excluded.id
Upvotes: 2