Why is there a difference on UPDATE query results when a `UNIQUE INDEX` is involved?

Question

I stumbled into Why would I get a duplicate key error when updating a row? so I tried a few things on https://extendsclass.com/postgresql-online.html.

Given the following schema:

create table scientist (id integer PRIMARY KEY, firstname varchar(100), lastname varchar(100));
        insert into scientist (id, firstname, lastname) values (1, 'albert', 'einstein');
        insert into scientist (id, firstname, lastname) values (2, 'isaac', 'newton');
        insert into scientist (id, firstname, lastname) values (3, 'marie', 'curie');
        select * from scientist;

CREATE UNIQUE INDEX fl_idx ON scientist(firstname, lastname);

when I run this query:

UPDATE scientist AS c SET
    firstname = new_values.F,
    lastname = new_values.L
FROM (
    SELECT * FROM
        UNNEST(
            ARRAY[1, 1]::numeric[],
            ARRAY['one', 'v']::text[],
            ARRAY['three', 'f']::text[]
        ) AS T(
            I,
            F,
            L
        )
    ) AS new_values
WHERE c.id = new_values.I
RETURNING c.id, c.firstname, c.lastname;

I get back:

id  firstname   lastname
1   one          three

whereas if I don't create the index (CREATE UNIQUE INDEX fl_idx ON scientist(firstname, lastname);) I get:

id  firstname   lastname
1   v           f

So I am not sure why the UNIQUE INDEX affects the result and why there isn't a duplicate key value violates unique constraint exception when I change my UNNEST to (similar to what happens on the SO question I mentioned above) since the id is a PRIMARY KEY:

UNNEST(
    ARRAY[1, 1]::numeric[],
    ARRAY['one', 'one']::text[],
    ARRAY['three', 'three']::text[]
)

The postgres version I run the above queries was: PostgreSQL 11.11 (Debian 11.11-0+deb10u1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

Lukasz Szozda · Accepted Answer

From Update:

When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.

In your case you have two matches for 1, so the choice is completely dependent in which order rows are read.

Here example when index is present for both runs and results are different:

db<>fiddle demo 1
db<>fiddle demo 2

Do you know why I don't get the "duplicate key value violates unique constraint" error?

There is no duplicate key neither on column id nor pair first_name/last_name after update is performed.

Scenario 1:

+-----+------------+----------+
| id  | firstname  | lastname |
+-----+------------+----------+
|  2  | isaac      | newton   |
|  3  | marie      | curie    |
|  1  | v          | f        |
+-----+------------+----------+

Scenario 2:

+-----+------------+----------+
| id  | firstname  | lastname |
+-----+------------+----------+
|  2  | isaac      | newton   |
|  3  | marie      | curie    |
|  1  | one        | three    |
+-----+------------+----------+

EDIT:

Using "UPSERT" and trying to insert/update row twice:

INSERT INTO scientist (id,firstname, lastname)
VALUES (1, 'one', 'three'), (1, 'v', 'f')
ON CONFLICT (id) 
DO UPDATE SET firstname = excluded.firstname;

-- ERROR:  ON CONFLICT DO UPDATE command cannot affect row a second time

Why is there a difference on UPDATE query results when a `UNIQUE INDEX` is involved?

Answers (1)

Related Questions