Do I have to create a surrogate key if I want to save space?

Question

Let's say I have a very large table with owners of cars like so:

OWNERSHIP
owner    | car
---------------
steven   | audi
bernahrd | vw
dieter   | vw
eike     | vw
robert   | audi
... one hundred million rows ...

If I refactor it to this:

OWNERSHIP
owner    | car <-foreign key TYPE.car_type
---------------
steven   | audi
bernahrd | vw
dieter   | vw
eike     | vw
robert   | audi
...


TYPE
car_type      |
---------------
audi
vw

Do I gain anything in space or speed this way, or do I need to create an INTEGER surrogate key on car_type for that?

Tometzky · Accepted Answer

Using two tables with a string foreign key will of course use more space than using one table: every car name is still stored in each OWNERSHIP row, and now once more in TYPE. How much more depends on how many types of cars you have.

You should use an integer car_id:

Integer keys save space if a significant percentage of car names repeat.
Even more so if you need to index the car column, as an integer index is much smaller than a string index.
Comparing integers is also faster than comparing strings, so searching by car should be faster too.
A smaller table means a bigger part of it fits in cache, so accessing it should also be faster.
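
To make the suggestion concrete, here is a minimal sketch of the integer-key layout. The thread does not name a database, so this assumes SQLite driven from Python's sqlite3 module purely for illustration. The table and column names follow the question; car_id is the key suggested above, and the index name ownership_car_id_idx is made up for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database, used only to illustrate the schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE type (
    car_id   INTEGER PRIMARY KEY,    -- integer surrogate key
    car_type TEXT NOT NULL UNIQUE    -- each car name is stored once
);
CREATE TABLE ownership (
    owner  TEXT NOT NULL,
    car_id INTEGER NOT NULL REFERENCES type(car_id)
);
-- Hypothetical index name; the index holds integers rather than strings.
CREATE INDEX ownership_car_id_idx ON ownership(car_id);
""")

# Sample rows taken from the question.
rows = [("steven", "audi"), ("bernahrd", "vw"), ("dieter", "vw"),
        ("eike", "vw"), ("robert", "audi")]

for owner, car in rows:
    # Register the car name if it is new, then store only its integer id.
    conn.execute("INSERT OR IGNORE INTO type (car_type) VALUES (?)", (car,))
    conn.execute(
        "INSERT INTO ownership (owner, car_id) "
        "SELECT ?, car_id FROM type WHERE car_type = ?",
        (owner, car),
    )

# Searching by car name: resolve the name once in the small table,
# then match on the integer column in the large one.
vw_owners = [r[0] for r in conn.execute(
    "SELECT o.owner FROM ownership o "
    "JOIN type t ON t.car_id = o.car_id "
    "WHERE t.car_type = ? ORDER BY o.owner",
    ("vw",),
)]
type_count = conn.execute("SELECT COUNT(*) FROM type").fetchone()[0]
```

The price of this layout is a join (or an id lookup) whenever the car name is needed. Whether it pays off depends on how long and how repetitive the names are, as noted above.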
