Using single column table

Question

I'm creating a database to store the events of mobile apps recovered from multiple sources. Problem is that rows from the event table don't have much meaning to the user as it's mostly a succession of integers. Forcing them to make multiple joins or multiple queries.

CREATE TABLE source (
    id              serial PRIMARY KEY,
    value           string NOT NULL
); 

CREATE TABLE application (
    id              serial PRIMARY KEY,
    value           string NOT NULL
);

CREATE TABLE platform (
    id              serial PRIMARY KEY,
    value           string NOT NULL
);

CREATE TABLE country (
    id              serial PRIMARY KEY,
    value           string NOT NULL
);

CREATE TABLE event (
    id              serial PRIMARY KEY,
    source_id       integer REFERENCES source(id),
    application_id  integer REFERENCES application(id),
    platform_id     integer REFERENCES platform(id),
    country_id      integer REFERENCES country(id),

    ...

    updated_at      date NOT NULL,
    value           decimal(100, 2) NOT NULL
);

I thought of directly using the value of the "secondary" tables as a primary key (as it's unique and not null) that I would reference in the event table. It would look like that:

CREATE TABLE source (
    value              string PRIMARY KEY
); 

CREATE TABLE application (
    value              string PRIMARY KEY
);

CREATE TABLE platform (
    value              string PRIMARY KEY
);

CREATE TABLE country (
    value              string PRIMARY KEY
);

CREATE TABLE event (
    id              serial PRIMARY KEY,
    source          string REFERENCES source(value),
    application     string REFERENCES application(value),
    platform        string REFERENCES platform(value),
    country         string REFERENCES country(value),

    ...

    updated_at      date NOT NULL,
    value           decimal(100, 2) NOT NULL
);

I think it might also be good this way as I don't really see an added value at using a surrogate key in this situation. Also prevents me from using views which might have slower performances as it executes a query every time I use the view in a query.

What do you think of this option ?

Gordon Linoff · Accepted Answer

"real" systems usually use surrogate keys. There are multiple reasons why:

Integers are more efficient for indexes, because they are fixed length.
Integers are more efficient for foreign key references, because they are only four bytes (strings are often larger).
String values may change and then referring tables need to be updated.
Auto generated primary keys contain other information, such as the order of insertion.
End users do not directly access tables. If such functionality is needed, then views are fine.

There is nothing per se wrong with using strings. But in practice, they are not used for this purpose.

Using single column table

Answers (1)

Related Questions