Matt
Matt

Reputation: 484

Unique constraints on their own or as a primary key?

Is there any benefit to using a table schema like this:

CREATE TABLE review (
    review_id SERIAL PRIMARY KEY,
    account_id INT REFERENCES account(account_id) NOT NULL, 
    product_id INT REFERENCES product(product_id) NOT NULL, 
    rating SMALLINT NOT NULL, 
    comment TEXT,
    UNIQUE (account_id, product_id)
);

Or should the constraint itself be the primary key, like this:

CREATE TABLE review (
    CONSTRAINT review_pkey (account_id, product_id) PRIMARY KEY,
    account_id INT REFERENCES account(account_id) NOT NULL, 
    product_id INT REFERENCES product(product_id) NOT NULL, 
    rating SMALLINT NOT NULL, 
    comment TEXT,
);

Upvotes: 0

Views: 593

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 659367

Besides religion, habits, personal preferences and convenience with certain client tools, there are other good reasons for an additional surrogate PK as demonstrated in your first example.

If you are going to reference that table with foreign keys from other tables:

  • Referencing table(s) only need to include the single surrogate PK columns as FK reference, which is smaller, faster and simpler. If referencing table(s) have many rows and review does not, a single instance may already outweigh the additional cost to review. Else, multiple instances may.
    For small lookup tables that are referenced in many rows, even consider a smallserial surrogate PK - if that actually helps. See:

  • Typically, there will be an index on FK columns of referencing tables, too. Your example with two integer is most favorable for the multicolumn PK / FK as it keeps index size to a minimum. A B-tree index on two integer columns is no bigger than one on a single integer (8 bytes typically is the minimum "payload" for index tuples). Other, bigger data types would make additional difference.

  • If review receives many updates to one of the columns (account_id, product_id), those will cascade to all referencing tables based on those two columns. Multiplies write costs, bloats multiple tables and indexes. If it cascades to wide rows or many referencing rows, costs may increase substantially. All of this may be avoided with a surrogate PK - if the relational design is actually supposed to work that way.

If review is involved in many queries with joins, joining on two columns instead of just one is more tedious to write and slightly more expensive. Again, more so for bigger data types.

That said, if you have none of the above (or similar), look to Laurenz' answer.

Weigh actual costs, not religious beliefs.

Upvotes: 1

Laurenz Albe
Laurenz Albe

Reputation: 248305

The second version is clearly preferable, because it requires one less column and one less index, and there is no down side.

The column is obvious, the indexes aren't, because you forgot to add them: You need indexes on all the foreign key columns so that deletes on the referenced tables can be fast. With the artificial primary key, you need indexes on review_id, account_id and product_id, while without you can do with the indexes on (account_id, product_id) and product_id.

The only people who will advocate the first solution are people who hold a religious belief that every table has to have an artificially generated numerical primary key, no matter what. In reality, the combination of the two artificially generated keys from the referenced tables is just as good.

Upvotes: 2

Related Questions