manuel.menendez
manuel.menendez

Reputation: 305

postgres - avoid creating duplicate null columns

I have this table schema on Postgres:

> \d+ users_types_brands

                   Table "public.users_types_brands"
     Column     |            Type             | Collation | Nullable |                    Default                     | Storage | Stats target | Description 
----------------+-----------------------------+-----------+----------+------------------------------------------------+---------+--------------+-------------
 id             | integer                     |           | not null | nextval('users_types_brands_id_seq'::regclass) | plain   |              | 
 inserted_at    | timestamp without time zone |           |          | now()                                          | plain   |              | 
 updated_at     | timestamp without time zone |           |          | now()                                          | plain   |              | 
 users_types_id | bigint                      |           |          |                                                | plain   |              | 
 brand_id       | bigint                      |           | not null |                                                | plain   |              | 
 tasks_type_id  | integer                     |           |          |                                                | plain   |              | 
Indexes:
    "users_types_brands_pkey" PRIMARY KEY, btree (id)
    "users_types_brands_users_types_id_brand_id_tasks_type_id_index" UNIQUE, btree (users_types_id, brand_id, tasks_type_id)
Foreign-key constraints:
    "users_types_brands_users_types_id_fkey" FOREIGN KEY (users_types_id) REFERENCES users_types(id)
Access method: heap

Right now the table looks like this:

my_db=# select * from users_types_brands;
 id |        inserted_at         |         updated_at         | users_types_id | brand_id | tasks_type_id 
----+----------------------------+----------------------------+----------------+----------+---------------
 12 | 2021-10-24 16:43:12.244026 | 2021-10-24 16:43:12.244026 |              2 |      112 |             8
 14 | 2021-10-24 17:03:12.012874 | 2021-10-24 17:03:12.012874 |              2 |      111 |             9
(2 rows)
                   

Of course, I can't insert a row like this:

my_db=# insert into users_types_brands (users_types_id, brand_id, tasks_type_id) values (2, 112, 8);
ERROR:  duplicate key value violates unique constraint "users_types_brands_users_types_id_brand_id_tasks_type_id_index"
DETAIL:  Key (users_types_id, brand_id, tasks_type_id)=(2, 112, 8) already exists.

But I can do this several times:

my_db=# insert into users_types_brands (users_types_id, brand_id) values (2, 112);
INSERT 0 1

And obtain this:

my_db=# select * from users_types_brands;
 id |        inserted_at         |         updated_at         | users_types_id | brand_id | tasks_type_id 
----+----------------------------+----------------------------+----------------+----------+---------------
 12 | 2021-10-24 16:43:12.244026 | 2021-10-24 16:43:12.244026 |              2 |      112 |             8
 14 | 2021-10-24 17:03:12.012874 | 2021-10-24 17:03:12.012874 |              2 |      111 |             9
 16 | 2021-10-24 17:15:58.295428 | 2021-10-24 17:15:58.295428 |              2 |      112 |              
 17 | 2021-10-24 17:16:36.99971  | 2021-10-24 17:16:36.99971  |              2 |      112 |              
(4 rows)

Now, according to the business rules, tasks_type_id can be null 👍

But how can I avoid creating repeated rows like the last two? One null tasks_type_id is OK, but not two or more.

Has anyone faced this before?

Upvotes: 0

Views: 801

Answers (2)

ncank
ncank

Reputation: 956

There are two basic solutions to this problem but both have their own cons.

1.Using a partial index, as Belayer pointed out. The downside is for non-null values you will require another partial index since this one will ignore non-null values and cover only the rows with nulls.

CREATE UNIQUE INDEX "index_for_nulls" ON "table" ( "field_a", "field_b" ) WHERE "field_c" IS NULL;
CREATE UNIQUE INDEX "index_for_non_nulls" ON "table" ( "field_a", "field_b", "field_c" ) WHERE "field_c" IS NOT NULL;

2.Using COALESCE in the index definition to avoid null values. This way index will cover all rows but planner won't use the full index if you don't use the exact statement which is defined in the index

CREATE UNIQUE INDEX "index" ON "table" ( "field_a", "field_b", ( COALESCE( "field_c", -1 ) );

Upvotes: 1

Belayer
Belayer

Reputation: 14934

You can create a Partial Unique Index. It will allow a single row with the same users_types_id and brand_id and null tasks_type_id, but only a single one. (See Demo)

create unique index tasks_type_id_just_1_unique
    on users_types_brands (users_types_id, brand_id)
  where tasks_type_id is null;

Upvotes: 2

Related Questions