PostgreSQL: Integrity constraint between 2 tables

Question

I am using PostgreSQL and have the 2 following tables:

create table anomaly (
    id          bigint          not null unique,
    description text            not null,
    zone1       varchar(100)    not null,
    language1   varchar(100)    not null,
    is_translation_anomaly  boolean not null,
    primary key(id)
);

create table translation_anomaly (
    id          bigint          not null unique,
    zone2       varchar(100)    not null,
    language2   varchar(100)    not null,
    primary key(id),
    foreign key(id) references anomaly(id)
);

A translation anomaly (translation_anomaly) is a specification of an anomaly. I want to implement the two following integrity constraints, without the use of triggers or stored procedures:

zone1 has to be different than zone2
language1 has to be different than language2

I have taken a look at both this and this questions, but couldn't get anything from the answers that I could apply in my case. I'm new with SQL, so I apologize if I missed something in these answers that I should have picked up.

I am using PostgreSQL 9.4.10.

Thank you in advance!

Basil Bourque · Accepted Answer

I suspect your table design is not properly normalized. I don’t know the semantics of your business problem, so I can only guess.

Caveat: Code here is untested; syntax may not be correct.

Normalize

If the child table represents translations as in localizations, then all the values including the original should be present in the child table.

For example, if you were intending to store English wording in the parent table while the child table holds French and Arabic, don’t do that. All three (English, French, Arabic) should be present in the child table.

By the way, every table needs its own unique identifier. So your child table needs two fields, one is its own identifier (primary key), and another column holding the identifier of its parent (foreign key).

Notice in the following code how we:

Removed the zone and language columns from the first table.
Added an identifier column to the child table of translation_anomaly.

CREATE TABLE anomaly
(
    id                     BIGINT  NOT NULL UNIQUE,
    description            TEXT    NOT NULL,
    is_translation_anomaly BOOLEAN NOT NULL,
    PRIMARY KEY (id)
)
;

CREATE TABLE translation_anomaly
(
    id         BIGINT       NOT NULL UNIQUE,
    anomaly_id BIGINT       NOT NULL,
    zone       VARCHAR(100) NOT NULL,
    language   VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (anomaly_id) REFERENCES anomaly (id)
)
;

Multi-column UNIQUE constraint

Define a unique index for the combination of columns anomaly_id plus zone to enforce your first rule.
Ditto for language plus anomaly_id to enforce your second rule.

See Question, In Postgresql, force unique on combination of two columns.

By the way, no need to declare NOT NULL or UNIQUE on a column that is marked with a PRIMARY KEY constraint. See the doc page Constraints.

Adding a primary key will automatically create a unique B-tree index on the column or group of columns listed in the primary key, and will force the column(s) to be marked NOT NULL.

Notice in the following code that we:

Added a pair of UNIQUE constraints
Removed the redundant NOT NULL & UNIQUE from the declaration of identifier columns.

CREATE TABLE anomaly
(
    id                     BIGINT,
    description            TEXT    NOT NULL,
    is_translation_anomaly BOOLEAN NOT NULL,
    PRIMARY KEY (id)
)
;

CREATE TABLE translation_anomaly
(
    id         BIGINT,
    anomaly_id BIGINT       NOT NULL,
    zone       VARCHAR(100) NOT NULL,
    language   VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (anomaly_id) REFERENCES anomaly (id),
    UNIQUE (zone, anomaly_id),
    UNIQUE (language, anomaly_id)
)
;

If your true rule is that there be only one language per zone for any one anomaly, then make a triple column UNIQUE constraint: UNIQUE ( language , zone , anomaly_id ).

CREATE TABLE anomaly
(
    id                     BIGINT,
    description            TEXT    NOT NULL,
    is_translation_anomaly BOOLEAN NOT NULL,
    PRIMARY KEY (id)
)
;

CREATE TABLE translation_anomaly
(
    id         BIGINT,
    anomaly_id BIGINT       NOT NULL,
    zone       VARCHAR(100) NOT NULL,
    language   VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (anomaly_id) REFERENCES anomaly (id),
    UNIQUE (zone, language, anomaly_id)
)
;

By the way, you may want the database to handle the sequence of numbers assigned as identifiers. The modern approach in Postgres 10 and later is to make the identifiers identity columns (see post by Eisentraut, and post by depesz ). But as you are using 9.4, the legacy approach uses SERIAL as a pseudo data type. Apparently you want 64-bit integer numbers, so use BIGSERIAL/SERIAL8.

Another off-topic issue is reserved words. Over a thousand words are reserved in various SQL databases. So you can quite easily have a collision between your identifier names and system key words & reserved words. In your case, at least LANGUAGE is a reserved word which used as the name of a column. In my own work, I use a trailing underscore to avoid such collisions. The SQL Standard explicitly promises to never use a trailing underscore on any key/reserved word.

In the following code, note that we:

Use BIGSERIAL pseudo-type for id columns.
Append an underscore _ on all identifiers.

Looks like:

CREATE TABLE anomaly_
(
    id_                     BIGSERIAL,
    description_            TEXT    NOT NULL,
    is_translation_anomaly_ BOOLEAN NOT NULL,
    PRIMARY KEY (id_)
)
;

CREATE TABLE translation_anomaly_
(
    id_         BIGSERIAL,
    anomaly_id_ BIGINT       NOT NULL,
    zone_       VARCHAR(100) NOT NULL,
    language_   VARCHAR(100) NOT NULL,
    PRIMARY KEY (id_),
    FOREIGN KEY (anomaly_id_) REFERENCES anomaly_ (id_),
    UNIQUE (zone_, language_, anomaly_id_)
)
;

Other issues to consider include: Naming your constraints, and using domains or enumerated types for your zone_ & language_ columns.

PostgreSQL: Integrity constraint between 2 tables

Answers (2)

Normalize

Multi-column UNIQUE constraint

Related Questions