Reputation: 11
I'm studying for an exam at the moment and I need a hand thoroughly understand how to convert a table into 3NF. I understand from Unormalised to 1NF, and i'm getting stuck on 1NF to 2NF. I've been given this example from a tutorial.
filmNo | fTitle | dirNo | director | actorNo | aName | role | timeOnScreen
F1100 | Happy Days | D101 | Jim Alan | A1020 | Sheila Toner | Jean Simpson | 15.45
| | D101 | Jim Alan | A1222 | Peter Watt | Tom Kinder | 25.38
| | D101 | Jim Alan | A1020 | Sheila Toner | Silvia Simpson| 22.56
F1109 | Snake Bite | D076 | Sue Ramsay | A1567 | Steve Mcdonald| Tim Rosey | 19.56
| | D076 | Sue Ramsay | A1222 | Peter Watt | Archie Bold | 10.44
So this table to 1NF is quite easy but it's getting to 2NF and 3NF that i'm struggling with. I'm getting lost on determining the dependencies on the columns. Am I correct in saying that role and timeOnScreen are dependent on Actor, but also on the film? How would this convert to 2NF. I think from 2NF I'd be able to go to 3NF. But I'd really like to go through the steps to do this so I can fully understand it for my exam.
Upvotes: 0
Views: 2353
Reputation: 18408
"I'm getting lost on determining the dependencies on the columns".
First and foremost, dependencies between "columns" in a relation schema should be derived from business rules which should be given. Determining the dependencies from a data sample is just a guessing game, always fallible and never to be relied upon. If your exercise requires you to do that, try and make the best of it, but forget about the approach as soon as your exam is over.
A functional dependency (which is the only type of dependency you need to consider if nothing beyond 3NF is targeted) is a rule "AB->CD" to the effect that for any valid value of our relation schema, IF you take the relational projection over {ABCD} of that relation value, then the resulting relation value will be one in which any combination of {AB} values will appear at most once. That's where the name "functional" dependency derives from : "AB->CD" expresses that both C and D values are mathematical functions of the combinations of AB values. Conversely, it expresses that the AB combination is a determinant for finding single C and D values. Applying that to your sample should allow you to find some reasonable FDs that your exercise is wrongly expecting you to guess.
Upvotes: 2