labyrinth
labyrinth

Reputation: 13936

Does a foreign key reduce redundancy if all the columns are referenced?

I've read that one of the benefits of normalization is to reduce redundancy in the DB. But I'm wondering, if you end up referencing all the columns in the target table?

For example, if I have a Video table that references a Genre table, the Genre table might very likely have a single column with a dozen fairly static values like 'Horror' 'Sci-Fi' 'Romance' etc.

In a case like this, does it save any space to separate the two, or is the only benefit making it so you can update all referencing rows from one place?

Upvotes: 2

Views: 1212

Answers (10)

nvogel
nvogel

Reputation: 25526

Normalization has nothing to do with saving space. It's about eliminating potential anomalies that can occur as a result of certain kinds of redundancy. Since normalization defines the logical level only it would be quite possible for a normalized database to be physically larger or physically smaller than a denormalized or un-normalized one.

It is true that normalization generally makes designs that ought to translate efficiently into storage - but that's really down to the features of the DBMS rather than something implicit in normalization.

Upvotes: 0

gbn
gbn

Reputation: 432230

Data modification anomalies

  • What if you add a new genre?
  • Is Sci-Fi the same as SciFi?
  • Is Sci-Fi the same as Sci-fi?

It gets worse if you another table, say, "Books" that have the same Genres.

Upvotes: 0

Jason
Jason

Reputation: 1325

Saving space is one benefit to separating the 2 tables. Like it was said before, putting a Genre_ID in place of an actual value such as "Horror" or "Adventure" will save space.

In my opinion, the better part of doing this to to enforce integrity. If you put in the text values in the Video table, what prevents you from changing the value accidentally? Now some rows may have "Adventure" or "Action/Adventure" and so on. By having 2 tables and referencing with a foreign key, you're going to have better control over what values can be a genre.

In summary, don't worry about the fact that you reference all the columns, especially if a table has very few columns. If you decide to add an ID field, or just keep the 1 column table as a list of "acceptable values", your goal should be to enforce integrity first, and save space or I/O costs second.

Upvotes: 2

Stephanie Page
Stephanie Page

Reputation: 3893

Right, space saving is ONE of the benefits, not the only one.

In the case you mentioned, no, you'll save no space if you use that one column as the PK which is fine.

You could abstract that table with a autonumber/sequence and use that as the PK, and make the current column the candidate key (so it stays unique).

But leaving your design exactly as you've outlined, the benefit is in consistency. You'll have only those 12 values... you'll not accidentally enter a value for "Horrer" or "PSY-Fi"

Upvotes: 3

JOTN
JOTN

Reputation: 6317

In cases like that, your key values plus their indexes can be considerably larger than the data itself. Another model of doing simple codes like that is to have a table of codes and then an insert and update check constraint to validate them. That also avoids a join in order to get the genre data out. Which way you do it is kind of a toss up and would depend on what your application queries tend to be.

Upvotes: 0

Matt Phillips
Matt Phillips

Reputation: 11519

What happens if you want to ensure that your rows in the Video table have valid/predetermined entries for Genre? If you don't have a foreign key constraint you would need an enum for that column in the Video table and then you would have to change the schema every time you add a new Genre instead of just adding a new row to a Genre table.

Upvotes: 0

MAW74656
MAW74656

Reputation: 3539

But you've hit the problem yourself there:

single column with a dozen fairly static values like 'Horror' 'Sci-Fi' 'Romance' etc.

With surrogate keys and normalized tables, you only have "Horror" stored once in database, but its ID number is stored in several places (a simple number is smaller than the text most of the time, and does save space). Not only does it increase the maintainability of the database, but it does indeed save raw space.

Upvotes: 0

tster
tster

Reputation: 18237

Yes, it will save space if you have a surrogate key (int) which you use in the video table instead of the varchar(20) or whatever the genra would be.

Upvotes: 0

iDevlop
iDevlop

Reputation: 25262

You will save also because 'Horror' takes 12 bytes in Unicode, while GenreId can be a Byte or char(1).

Upvotes: -1

Raj More
Raj More

Reputation: 48016

I would use surrogate keys (Autonumber, Identity, etc) and use that for the foreign key join instead of the actual value.

The idea is more about data quality than reducing space.

In most db's an INT will be smaller than Varchar2 (20)

Upvotes: 0

Related Questions