Reputation: 42813

Use multiple conflict_target in ON CONFLICT clause

I have two separately unique columns in a table: col1, col2. Both have a unique index (col1 is unique and so is col2).

I need INSERT ... ON CONFLICT ... DO UPDATE syntax, and update other columns in case of a conflict, but I can't use both columns as conflict_target.

It works:

INSERT INTO table
...
ON CONFLICT ( col1 ) 
DO UPDATE 
SET 
-- update needed columns here

But how to do this for several columns, something like this:

...
ON CONFLICT ( col1, col2 )
DO UPDATE 
SET 
....

Currently using Postgres 9.5.

Upvotes: 176

Answers (14)

yugandhar

Reputation: 630

The simpler solution is using MERGE statement Let's say we have table with col1 and col2 are different unique keys, one example - col1 is email and col2 is username, we want both to be unique.

Let's say we have age and zipcode are non unique columns and id is primary key (auto serial), and for the following example

col1(email) value is [email protected],
col2(username) is exampleuser,
col3(age) is 25,
col4(zipcode) is 94041.

MERGE INTO table as target
USING (VALUES('[email protected]', 'exampleuser', 25, 94041)) AS 
       source(email, username, age, zipcode)
ON target.email = source.email OR target.username = source.username
WHEN MATCHED THEN
    UPDATE SET age = source.age, zipcode = source.zipcode
WHEN NOT MATCHED THEN
    INSERT (email, username, age, zipcode) 
    VALUES (source.email, source.username, source.age, source.zipcode)
RETURNING merge_action() as insert_or_update, source.id;

You can observe 4 steps

create source with given values
match with all unique keys (combined with OR condition)
if any unique keys match, do UPDATE statement
for not matching, do INSERT statement
(Optional) return either INSERT or UPDATE based on the action taken, along with id.

Upvotes: 2

BPS

Reputation: 1678

I ran into this problem and landed on a different solution.

In my original table, I had two unique columns, id and name. In my application, the name is allowed to change but the id is not.

First, I created a temporary table that is essentially a copy of the original table except without any constraints.

create temp table if not exists staging_table
(
    id              uuid           not null,
    gen             integer        not null generated always as identity,
    name            varchar(50)    not null,
    description     varchar(110)   not null default '',
    ...
)
    on commit drop;

I added the gen column just to make sure rows were unique. You may not need that.

When I want to do an upsert into table, I first do an insert into that temporary table:

insert into staging_table (id, name, description, ...)
VALUES (?, ?, ?, ...)
on conflict do nothing;

followed by a merge:

merge into table as t
using staging_table as s
on (t.id = s.id or t.name = s.name) and ...
when matched then
    update
    set name        = s.name,
        description = s.description,
        ...
when not matched then
    insert (id, name, description, ...)
    values (s.id, s.name, s.description, ...);
commit;

Be sure not to commit this transaction between those statements or the rows in the staging table will be lost.

Upvotes: 0

Erwin Brandstetter

Reputation: 658202

Clean solution

This achieves what the question asks for:

CREATE OR REPLACE FUNCTION f_upsert_double(_uniq1 int, _uniq2 int, _val text)
  RETURNS void 
  LANGUAGE plpgsql AS
$func$
BEGIN
   LOOP
      -- covers **all** unique violations
      INSERT INTO double_uniq
             ( uniq1,  uniq2,  val) 
      VALUES (_uniq1, _uniq2, _val) 
      ON     CONFLICT DO NOTHING;

      EXIT WHEN FOUND;

      -- updating *two* rows is a corner case possibility!
      -- take care to cover **all** possible unique violations
      UPDATE double_uniq
      SET    val = _val
      WHERE  uniq1 = _uniq1
      OR     uniq2 = _uniq2;

      EXIT WHEN FOUND;
   END LOOP;
END
$func$;

Call:

SELECT f_upsert_double(1, 2, 'foo');

fiddle

Can easily be adapted to return a resulting ID or whatever.

Works for single-row UPSERT.

It handles all race conditions that could develop with concurrent writes around the two UNIQUE constraints gracefully. The loop guarantees that either INSERT or UPDATE succeeds eventually. In reality, the function will hardly ever loop at all.
Here is a discussion and a step-by-step explanation why the loop is necessary:

Race conditions between INSERT ON CONFLICT DO NOTHING and SELECT

Never raises an exception for either UNIQUE constraint, so no need for an EXCEPTION clause. That's important because an EXCEPTION clause in a PL/pgSQL code block butchers performance and results in problems like this one:

How to improve bulk inserts to Postgres DB

So it's also as cheap as it gets.

Note: address all possible unique violations in the UPDATE (from all PRIMARY KEY, UNIQUE and EXCLUDE constraints), or you can get an endless loop!

Is SELECT or INSERT in a function prone to race conditions?

Dirty problem

That said, in most cases you should not have to do this to begin with. It's a questionable DB design and workflow.
Typically, the situation arises when only one of (col1, col2) can have a value, while the other one must be null. Then there is a clean solution with two mutually exclusive partial unique indexes.

Or, better yet, merge the two UNIQUE constraints into one using the NULLS NOT DISTINCT in Postgres 15 or later. See:

UPSERT based on UNIQUE constraint with NULL values

Error "no unique or exclusion constraint matching ON CONFLICT" for composite keys with overlapping columns

Upvotes: 2

Martin Gerhardy

Reputation: 2000

If you are using postgres 9.5 or later, you can use the EXCLUDED space.

Example taken from What's new in PostgreSQL 9.5:

INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;

Upvotes: 1

Lars Blumberg

Reputation: 21421

It seems to be the case that for the OP col1 and col2 are independently unique. So that a combined unique constrained will NOT solve the insert/on conflict problem, as suggested in other answers.

One solution to this is a CTE which firstly only inserts a new row with col1 set and assigns a random value for col2 (avoiding a constraint violation on col2). And only then it updates col2 to its final value. Now this is what happens:

If col1 is duplicated, the first clause's conflict will hit and not insert a new row. Otherwise a new row is inserted with col1 set to its final value and col2 set to a temporary and unique value.
If col2 is duplicated, the second clause's conflict will hit and simply store the final value of :col2. Either for the existing row or for the newly created row from the insert clause.

Given that col1 and col2 are both strings/texts, we can use gen_random_uuid()::text to generate a unique string for the first insert clause:

with
    row as (
        insert into table(col1, col2)
        values(:col1, gen_random_uuid()::text)
        on conflict(col1) do set col1 = :col1
        returning col1
    )
update table
set col2 = :col2
from row
where row.col1 = :col1
returning col1, col2

with :col1 and :col2 being variables passed to the query.

Now if your column isn't a string/text you can come up with other unique values such as temporarily assigning -1 to col2, knowing that all your values are otherwise positive.

For when you have more than 2 independently unique columns, above approach scales in so far as you can move more (and even all) column updates into the CTE clause:

with
    row as (
        insert into table(col1, col2, ..., colN)
        values(
            :col1,
            gen_random_uuid()::text,
            ...
            gen_random_uuid()::text,
        )
        on conflict(col1) do set col1 = :col1
        returning col1
    ),
    col2_update as (
        update table
        set col2 = :col2
        from row
        where row.col1 = :col1
    ),
    ...
    colN_update as (
        update table
        set colN = :colN
        from row
        where row.col1 = :col1
    )
select col1, col2, ..., colN from row

Upvotes: 2

Anatolii Stepaniuk

Reputation: 2713

ON CONFLICT ( col1, col2 )
DO UPDATE 
SET

works fine. but you should not update col1, col2 in the SET section.

Upvotes: 1

MakerBen

Reputation: 47

I get I am late to the party but for the people looking for answers I found this: here

INSERT INTO tbl_Employee 
VALUES (6,'Noor')
ON CONFLICT (EmpID,EmpName)
DO NOTHING;

Upvotes: -1

user2625834

Reputation: 5

ON CONFLICT is very clumsy solution, run

UPDATE dupes SET key1=$1, key2=$2 where key3=$3    
if rowcount > 0    
  INSERT dupes (key1, key2, key3) values ($1,$2,$3);

works on Oracle, Postgres and all other database

Upvotes: -8

Niko Dunk

Reputation: 511

Kind of hacky but I solved this by concatenating the two values from col1 and col2 into a new column, col3 (kind of like an index of the two) and compared against that. This only works if you need it to match BOTH col1 and col2.

INSERT INTO table
...
ON CONFLICT ( col3 ) 
DO UPDATE 
SET 
-- update needed columns here

Where col3 = the concatenation of the values from col1 and col2.

Upvotes: -1

e4c5

Reputation: 53774

A sample table and data

CREATE TABLE dupes(col1 int primary key, col2 int, col3 text,
   CONSTRAINT col2_unique UNIQUE (col2)
);

INSERT INTO dupes values(1,1,'a'),(2,2,'b');

Reproducing the problem

INSERT INTO dupes values(3,2,'c')
ON CONFLICT (col1) DO UPDATE SET col3 = 'c', col2 = 2

Let's call this Q1. The result is

ERROR:  duplicate key value violates unique constraint "col2_unique"
DETAIL:  Key (col2)=(2) already exists.

What the documentation says

conflict_target can perform unique index inference. When performing inference, it consists of one or more index_column_name columns and/or index_expression expressions, and an optional index_predicate. All table_name unique indexes that, without regard to order, contain exactly the conflict_target-specified columns/expressions are inferred (chosen) as arbiter indexes. If an index_predicate is specified, it must, as a further requirement for inference, satisfy arbiter indexes.

This gives the impression that the following query should work, but it does not because it would actually require a together unique index on col1 and col2. However such an index would not guarantee that col1 and col2 would be unique individually which is one of the OP's requirements.

INSERT INTO dupes values(3,2,'c') 
ON CONFLICT (col1,col2) DO UPDATE SET col3 = 'c', col2 = 2

Let's call this query Q2 (this fails with a syntax error)

Why?

Postgresql behaves this way is because what should happen when a conflict occurs on the second column is not well defined. There are number of possibilities. For example in the above Q1 query, should postgresql update col1 when there is a conflict on col2? But what if that leads to another conflict on col1? how is postgresql expected to handle that?

A solution

A solution is to combine ON CONFLICT with old fashioned UPSERT.

CREATE OR REPLACE FUNCTION merge_db(key1 INT, key2 INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE dupes SET col3 = data WHERE col1 = key1 and col2 = key2;
        IF found THEN
            RETURN;
        END IF;

        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently, or key2
        -- already exists in col2,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO dupes VALUES (key1, key2, data) ON CONFLICT (col1) DO UPDATE SET col3 = data;
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            BEGIN
                INSERT INTO dupes VALUES (key1, key2, data) ON CONFLICT (col2) DO UPDATE SET col3 = data;
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                -- Do nothing, and loop to try the UPDATE again.
            END;
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

You would need to modify the logic of this stored function so that it updates the columns exactly the way you want it to. Invoke it like

SELECT merge_db(3,2,'c');
SELECT merge_db(1,2,'d');

Upvotes: 87

KajMagnus

Reputation: 11686

You can typically (I would think) generate a statement with only one on conflict that specifies the one and only constraint that is of relevance, for the thing you are inserting.

Because typically, only one constraint is the "relevant" one, at a time. (If many, then I'm wondering if something is weird / oddly-designed, hmm.)

Example:
_{(License: Not CC0, only CC-By)}

// there're these unique constraints:
//   unique (site_id, people_id, page_id)
//   unique (site_id, people_id, pages_in_whole_site)
//   unique (site_id, people_id, pages_in_category_id)
// and only *one* of page-id, category-id, whole-site-true/false
// can be specified. So only one constraint is "active", at a time.

val thingColumnName = thingColumnName(notfificationPreference)

val insertStatement = s"""
  insert into page_notf_prefs (
    site_id,
    people_id,
    notf_level,
    page_id,
    pages_in_whole_site,
    pages_in_category_id)
  values (?, ?, ?, ?, ?, ?)
  -- There can be only one on-conflict clause.
  on conflict (site_id, people_id, $thingColumnName)   <—— look
  do update set
    notf_level = excluded.notf_level
  """

val values = List(
  siteId.asAnyRef,
  notfPref.peopleId.asAnyRef,
  notfPref.notfLevel.toInt.asAnyRef,
  // Only one of these is non-null:
  notfPref.pageId.orNullVarchar,
  if (notfPref.wholeSite) true.asAnyRef else NullBoolean,
  notfPref.pagesInCategoryId.orNullInt)

runUpdateSingleRow(insertStatement, values)

And:

private def thingColumnName(notfPref: PageNotfPref): String =
  if (notfPref.pageId.isDefined)
    "page_id"
  else if (notfPref.pagesInCategoryId.isDefined)
    "pages_in_category_id"
  else if (notfPref.wholeSite)
    "pages_in_whole_site"
  else
    die("TyE2ABK057")

The on conflict clause is dynamically generated, depending on what I'm trying to do. If I'm inserting a notification preference, for a page — then there can be a unique conflict, on the site_id, people_id, page_id constraint. And if I'm configuring notification prefs, for a category — then instead I know that the constraint that can get violated, is site_id, people_id, category_id.

So I can, and fairly likely you too, in your case?, generate the correct on conflict (... columns ), because I know what I want to do, and then I know which single one of the many unique constraints, is the one that can get violated.

Upvotes: 0

Paul A Jungwirth

Reputation: 24551

ON CONFLICT requires a unique index* to do the conflict detection. So you just need to create a unique index on both columns:

t=# create table t (id integer, a text, b text);
CREATE TABLE
t=# create unique index idx_t_id_a on t (id, a);
CREATE INDEX
t=# insert into t values (1, 'a', 'foo');
INSERT 0 1
t=# insert into t values (1, 'a', 'bar') on conflict (id, a) do update set b = 'bar';
INSERT 0 1
t=# select * from t;
 id | a |  b  
----+---+-----
  1 | a | bar

* In addition to unique indexes, you can also use exclusion constraints. These are a bit more general than unique constraints. Suppose your table had columns for id and valid_time (and valid_time is a tsrange), and you wanted to allow duplicate ids, but not for overlapping time periods. A unique constraint won't help you, but with an exclusion constraint you can say "exclude new records if their id equals an old id and also their valid_time overlaps its valid_time."

Upvotes: 130

Vladimir Voznesensky

Reputation: 31

Create a constraint (foreign index, for example).

OR/AND

Look at existing constraints (\d in psq).
Use ON CONSTRAINT(constraint_name) in the INSERT clause.

Upvotes: -1

Peter Krauss

Reputation: 13952

In nowadays is (seems) impossible. Neither the last version of the ON CONFLICT syntax permits to repeat the clause, nor with CTE is possible: not is possible to breack the INSERT from ON CONFLICT to add more conflict-targets.

Upvotes: 12

Use multiple conflict_target in ON CONFLICT clause

Answers (14)

Clean solution

Dirty problem

A sample table and data

Reproducing the problem

What the documentation says

Why?

A solution

Related Questions