MaximeF

Reputation: 5403

How to use an aggregation key with null values with UPSERT?

I have a problem using UPSERT in PostgreSQL 9.5.

I have a table with 50 columns, and my aggregation key contains 20 columns, 15 of which can be null.

So this is my table:

CREATE TABLE public.test
(
  id serial NOT NULL,
  stamp timestamp without time zone,
  foo_id integer,
  bar_id integer,
  ...
  CONSTRAINT id_pk PRIMARY KEY (id),
  CONSTRAINT test_agg_key_unique UNIQUE (stamp, foo_id, bar_id, ...)
);

Afterwards I will create a unique expression index on my aggregation key. But I need to create a unique constraint first, because not all of the key columns are NOT NULL:

alter table public.test ADD CONSTRAINT test_agg_key_unique UNIQUE (stamp, foo_id, bar_id, ...);

Then:

CREATE UNIQUE INDEX test_agg_key ON public.test (COALESCE(stamp, '1980-01-01 01:01:01'), COALESCE(foo_id, -1), COALESCE(bar_id, -1), ...);

And now I can execute my UPSERT:

INSERT INTO public.test as t (id, stamp, foo_id, bar_id, ...)
VALUES (RANDOM_ID, '2016-01-01 01:01:01', 1, 1, ...)
ON CONFLICT (stamp, foo_id, bar_id, ...)
  do update set another_column = t.another_column  + 1
    where t.stamp = '2016-01-01 01:01:01' and t.foo_id = 1 and t.bar_id= 1 and ...;

So if the aggregation key already exists it will update the row; otherwise it will insert a new row. But when I use the same query with one or more null values, I receive this exception:

ERROR:  duplicate key value violates unique constraint "test_agg_key_unique"

Because of this exception, DO UPDATE is never called.
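The root cause is that SQL treats NULLs as distinct from each other in unique constraints, so two rows that differ only by NULLs in the key never conflict. A minimal sketch using Python's sqlite3 (whose unique constraints handle NULLs the same way as PostgreSQL; table and column names are illustrative):

```python
import sqlite3

# NULLs are considered distinct in unique constraints, so two rows that
# differ only by NULL in a key column both insert without a conflict.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (stamp TEXT, foo_id INTEGER, UNIQUE (stamp, foo_id))")
conn.execute("INSERT INTO t VALUES ('2016-01-01', NULL)")
conn.execute("INSERT INTO t VALUES ('2016-01-01', NULL)")  # no violation: NULL <> NULL
count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(count)  # -> 2
```

This is why the ON CONFLICT target built on the plain columns never matches a row whose key contains a NULL.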

Another good example of this: https://dba.stackexchange.com/questions/151431/postgresql-upsert-issue-with-null-values

Upvotes: 1

Views: 550

Answers (3)

MaximeF

Reputation: 5403

After reading the question linked above, I found the solution.

Thank you Erwin Brandstetter: https://dba.stackexchange.com/a/151438/107395

Solution:

So I need to create an index with all the keys, adding COALESCE for each column that can be null.

If the column is text, use COALESCE(test_field, ''); if it is a number, use COALESCE(test_field, -1):

CREATE UNIQUE INDEX test_upsert_solution_idx
    ON test_upsert (name, status, COALESCE(test_field, ''), COALESCE(test_field2, '')...);

Then, in the UPSERT, remove the WHERE clause in DO UPDATE and add the same COALESCE expressions to ON CONFLICT:

INSERT INTO test_upsert as tu(name, status, test_field, identifier, count) 
VALUES ('test', 1, null, 'ident', 11)
ON CONFLICT (name, status, COALESCE(test_field, '')) 
 DO UPDATE  -- match expr. index
  SET count = COALESCE(tu.count + EXCLUDED.count, EXCLUDED.count, tu.count);
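The effect of the COALESCE expression index can be sketched with sqlite3 as well, which also supports unique expression indexes (names here are illustrative, not the original schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, test_field TEXT)")
# Unique expression index folding NULL to '' -- the idea behind the fix:
# NULL and NULL now map to the same indexed key, so they do conflict.
conn.execute("CREATE UNIQUE INDEX t_idx ON t (name, COALESCE(test_field, ''))")
conn.execute("INSERT INTO t VALUES ('test', NULL)")
try:
    conn.execute("INSERT INTO t VALUES ('test', NULL)")  # same coalesced key
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
print(duplicate_allowed)  # -> False
```

With this index in place, PostgreSQL's ON CONFLICT clause can use the coalesced expressions as its arbiter and route the NULL-keyed row to DO UPDATE instead of raising a duplicate-key error.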

Upvotes: 1

klin

Reputation: 121794

The only way I can see is to use a trigger to make the columns practically not nullable, while formally they remain nullable.

Test table:

create table test
(
    id serial not null,
    stamp timestamp without time zone,
    foo_id integer,
    bar_id integer,
    another_column integer,
    constraint id_pk primary key (id),
    constraint test_agg_key_unique unique (stamp, foo_id, bar_id)
);

Trigger:

create or replace function before_insert_on_test()
returns trigger language plpgsql as $$
begin
    new.stamp:= coalesce(new.stamp, '1980-01-01 01:01:01');
    new.foo_id:= coalesce(new.foo_id, -1);
    new.bar_id:= coalesce(new.bar_id, -1);
    return new;
end $$;

create trigger before_insert_on_test
before insert on test
for each row
execute procedure before_insert_on_test();

You do not need an additional unique index:

insert into test values (default, null, 1, null, 0)
on conflict (stamp, foo_id, bar_id) do
    update set another_column = test.another_column + 1
returning *;

 id |        stamp        | foo_id | bar_id | another_column 
----+---------------------+--------+--------+----------------
  1 | 1980-01-01 01:01:01 |      1 |     -1 |              0

insert into test values (default, null, 1, null, 0)
on conflict (stamp, foo_id, bar_id) do
    update set another_column = test.another_column + 1
returning *;

 id |        stamp        | foo_id | bar_id | another_column 
----+---------------------+--------+--------+----------------
  1 | 1980-01-01 01:01:01 |      1 |     -1 |              1

Note that you do not need a WHERE clause, as the UPDATE concerns only the conflicting row.


Update: Alternative solution

The problem stems from the fact that a composite unique index comprising nullable columns is generally not a good idea. You should abandon this approach and rest all the logic on a trigger.

Remove unique indexes and create the trigger:

create or replace function before_insert_on_test()
returns trigger language plpgsql as $$
declare
    found_id integer;
begin
    select id
    from test
    where
        coalesce(stamp, '1980-01-01 01:01:01') = coalesce(new.stamp, '1980-01-01 01:01:01')
        and coalesce(foo_id, -1) = coalesce(new.foo_id, -1)
        and coalesce(bar_id, -1) = coalesce(new.bar_id, -1)
    into found_id;
    if found then
        update test
        set another_column = another_column + 1
        where id = found_id;
        return null;  -- abandon insert
    end if;
    return new;
end $$; 

create trigger before_insert_on_test
before insert on test
for each row
execute procedure before_insert_on_test();

Simply use INSERT, without ON CONFLICT.
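The trigger's check-then-update-else-insert logic can be sketched outside the database as well. A minimal Python/sqlite3 rendition of the same COALESCE-normalized lookup (the table layout follows the answer; the `upsert` helper is illustrative, not part of the original):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE test (
    id INTEGER PRIMARY KEY, stamp TEXT, foo_id INTEGER,
    bar_id INTEGER, another_column INTEGER)""")

def upsert(stamp, foo_id, bar_id):
    # Same comparison the BEFORE INSERT trigger performs: fold NULLs to
    # sentinel values on both sides before matching.
    row = conn.execute(
        """SELECT id FROM test
           WHERE COALESCE(stamp, '1980-01-01 01:01:01') = COALESCE(?, '1980-01-01 01:01:01')
             AND COALESCE(foo_id, -1) = COALESCE(?, -1)
             AND COALESCE(bar_id, -1) = COALESCE(?, -1)""",
        (stamp, foo_id, bar_id)).fetchone()
    if row:  # conflict: bump the counter instead of inserting
        conn.execute("UPDATE test SET another_column = another_column + 1 WHERE id = ?",
                     (row[0],))
    else:
        conn.execute("INSERT INTO test (stamp, foo_id, bar_id, another_column) "
                     "VALUES (?, ?, ?, 0)", (stamp, foo_id, bar_id))

upsert(None, 1, None)
upsert(None, 1, None)
result = conn.execute("SELECT count(*), max(another_column) FROM test").fetchone()
print(result)  # -> (1, 1): one row, updated once
```

Unlike the trigger, this version is not safe under concurrency; in PostgreSQL the trigger runs inside the insert and keeps the logic in one place.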

You can try to speed up the trigger with a (non-unique) index:

create index on test(coalesce(stamp, '1980-01-01 01:01:01'), coalesce(foo_id, -1), coalesce(bar_id, -1));

Upvotes: 2

Nickan

Reputation: 466

Handle the columns with null values by giving them a default value with COALESCE (isnull does not exist in PostgreSQL), like:

INSERT INTO public.test as t (id, stamp, foo_id, bar_id, ...)
VALUES (RANDOM_ID, '2016-01-01 01:01:01', 1, 1, ...)
ON CONFLICT (stamp, foo_id, bar_id, ...)
do update set another_column = COALESCE(t.another_column, 0) + 1
where t.stamp = '2016-01-01 01:01:01' and t.foo_id = 1 and t.bar_id= 1 and ...;

Upvotes: 0
