Irina
Irina

Reputation: 1598

Database Normalization - Field depending on another non-key field

The following is part of a table definition in SQL Server:

CREATE TABLE User     
[UserId] INT NOT NULL IDENTITY(1,1),
[EatsFruit] BIT NOT NULL DEFAULT '0',
[FavoriteFruit] NVARCHAR(50) DEFAULT NULL,

As you can imagine, UserId is the primary key. I used a simpler example here to explain my question related to the "fruit" fields.

The field EatsFruit will be either 1 or 0, depending on whether the user eats fruit or not. If EatsFruit contains a 1, then the FavoriteFruit field will include the user's favorite fruit. If EatsFruit is 0, then FavoriteFruit is not relevant and it would have to contain N/A or some similar value.

I am wondering what the best way to model this is, and whether it needs to be normalized.

Since the FavoriteFruit field depends on the content of EatsFruit, should it be separated in a different table containing UserId and FavoriteFruit? This would be cleaner, because an entry for a certain user would not appear unless the user actually eats fruits (and the content of FavoriteFruit would always be relevant). However, since the primary key of the new table would also be UserId, doesn't this mean that FavoriteFruit really depends on the UserId and should not have been separated from the main table in the first place?

What would be best practices here? Thank you so much!

Upvotes: 0

Views: 70

Answers (2)

SteveJ
SteveJ

Reputation: 3323

From a pure normalization perspective, you don't want to have a field that is potentially taking up space with useless information, as you would in your example when the user doesn't eat fruit. Furthermore, you really don't want Favorite Fruit to be an NVarchar as "Melon" and "Watermelon" are different things (or are they), Or what about an entry of "Aple" on accident.

Were it me, I would have a Fruit Table and a FavoriteFruit Association table, The FavoriteFruit table would have the fruit id and the user id. If the user didn't have a favorite fruit, no space is used. Also, I would ask whether I could get rid of "EatsFruit" and simply check for an entry in the FavoriteFruits table.

That said, the way you have is set up, though maybe playing a little loose, is not an unpardonable sin.

Cheers.

Upvotes: 2

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48207

The moment you start saving conditionals NULL in your table you know that data need to be normalized.

Imagine you have a field anual_bonus in your Employee table but only managers get a bonus. You will have a lot of nulls in that field what will be a waste.

For this case I would have

User:

  user_id

EatFruit:

  user_id
  favorite_fruit_id (can be null if user eat fruit but doesnt have favorite?)

Fruits

  fruit_id
  fruit_name

so to get user who like fruits you do

SELECT user.*
FROM user
LEFT JOIN EatFruit
      on user.user_id = EatFruit.user_id
WHERE EatFruit.user_id IS NOT NULL

Upvotes: 0

Related Questions