Reputation: 513
id
column for foreign key? Performance / storage? How significant? Is it frowned in the industry?Example: this is the schema for my sample problem:
In my schema sometimes I use id
column as the foreign key and sometimes use some other data column.
In vehicle_detail
table I use a unique
size
column as a foreign key from vehicle_size
table and unique
color
column as the foreign key from vehicle_color
table.
But in vehicle_user
I used the user_identifier_id
as a foreign key which refers to id
primary key column in user_identifier
table.
Which is the correct way?
On a side note, I don't have id
columns for the garage_level
, garage_spaceid
, vehicle_garage_status
and vehicle_parking_status
tables because they only have one column which is the primary key and the data they store is just at most 15 rows in each table and it is probably never going to change. Should I still have an id
column in those ?
Upvotes: 1
Views: 4836
Reputation: 3547
OP question is about "correct way".
I will try to provide some kind of summary from existing comments and answers, general DO
and general DONT
for FKs.
Literally from Laurenz Albe answer and it was noted in comments
It was noted by Adrian Klavier in comments.
There is no such general rule that PK or unique constraint must be defined on a single column.
So the question title itself must be corrected: "Which column(s) for foreign key: id or any other column(s) and why?"
Let's talk about "why".
Is there a cost associated with using any other unique column other than id column for foreign key? Performance / storage? How significant? Is it frowned in the industry?
General DO: Analyze requirements, use logic, use math (arithmetics is enough usually). There is no a single database design that's always good for all cases. Always ask yourself: "Can it be improved?". Never be content with design of existing FKs, if requirements changed or DBMS changed or storage options changed - revise design.
General DONT: Don't think that there is a single correct rule for all cases. Don't think: "if that worked in that database/table than it will work for this case too".
Let me illustrate this points with a common example.
id uuid
fieldinteger
(4 bytes) + date
(4 bytes)id
of uuid
type (16 bytes)id
id
fieldIt this a correct design or not?
Let's use math:
int
+date
: it's 4+4=8 bytesSo it looks like additional 16 bytes per each row + indexes costs is a mistake.
And that's a very common mistake especially in combination of MSSQL + CLUSTERED indexes on random uuids
Is it always a mistake?
No.
Consider latter cases.
Suppose that you have a distributed system:
ServerA
, ServerB
, ServerC
are sources of dataHeadServer
- is data aggregatorserverA
-ServerC
could be duplicated: the same record could exists on several instancesserverA
and data for tables with FKs from serverB
-serverC
In such case existence of PK on id uuid
is justified:
Suppose that you have an API to access data for external consumers.
There is a good unique constraint on:
client_id
: incrementing integer in range 1..100000invoice_date
: dates '20100101'..'20210901'And a surrogate key on id
with random uuids.
You can create external API in forms:
/server/invoice/{client_id}/{invoice_date}
/server/invoice/{id}
From security POV /{id}
is superior by reasons:
uuid
value existence of otherint
, entityB on bigint' and entityC on
int+
byte+
date`In such case surrogate key not only justified but becames essential.
I hope that I was clear in explanation of main correct principle: "There is no such thing as a universal correct principle".
An additional advice: avoid CASCADE UPDATE/DELETEs: Although it depends on DBMS you use.
But in general :
Thank you for your attention.
I hope this helps somebody.
Upvotes: 1
Reputation: 64
A foreign key is a field or a column that is used to establish a link between two tables. A FOREIGN KEY is a column (or collection of columns) in one table, that refers to the PRIMARY KEY in another table.
There is no rule that it should refer to a id
column but the column it refers to should be the primary key. In real scenarios, it usually refers to Id
column as in most cases it is the primary key in the tables.
Upvotes: 1
Reputation: 135
A foreign key is basically a column of a different table(it is always of a different table, since that is the role it serves). It is used to join/ get data from a different table. Think of it like say school is a database and there are many different table for different aspects of student.
say by using Admission number 1234, from accounts table you can get the fees and sports table you can get the sports he play.
Now there is no rule that foreign key should be id column, you can keep it whatever you want. But,to use foreign key you should have a matching column in both tables therefore usually id column is only used. As I stated in the above example the only common thing in say sports table and accounts table would be admission number.
admn_no | sports |
+---------+------------+
| 1234 | basketball
+---------+---------+
| admn_no | fees |
+---------+---------+
| 1234 | 1000000 |
+---------+---------+
Now say using the query\
select * from accounts join sports using (admn_no);
you will get:
+---------+---------+------------+
| admn_no | fees | sports |
+---------+---------+------------+
| 1234 | 1000000 | basketball |
+---------+---------+------------+
PS: sorry for bad formatting
Upvotes: 1
Reputation: 246578
A foreign key has to target a primary key or unique constraint. It is normal to reference the primary key, because you typically want to reference an individual row in another table, and the primary key is the identifier of a table row.
From a technical point of view, it does not matter whether a foreign key references the primary key or another unique constraint, because in PostgreSQL both are implemented in the same way, using a unique index.
As to your concrete examples, there is nothing wrong with having the unique size
column of vehicle_size
be the target of a foreign key, although it begs the question why you didn't make size
the primary key and omit the id
column altogether. There is no need for each table to have an id
column that is the automatically generated numeric primary key, except that there may be ORMs and other software that expect that.
Upvotes: 4