Reputation: 151
From https://www.comp.nus.edu.sg/~lingtw/onf.pdf:
STOCK
Agent Company Product
a1 c1 p1
a1 c2 p1
a1 c1 p3
a1 c2 p4
a2 c1 p1
a2 c1 p2
a3 c2 p4
Why is there no MVD (multivalued dependency) in the above relation?
Is it because of the interrelationships between the attributes?
Why is the above relation in 4NF and not in 5NF?
Upvotes: -1
Views: 301
Reputation: 42
4NF and 5NF are somewhat abstract, especially when you try to understand them the mathematical way. When you use a common sense approach you might be surprised how banal things can be.
In order to know if a number of projections together constitute a join dependency you go through the procedure of first taking these projection on the original table and then joining the resulting tables back together. If this yields a table that is the same as the original there is a JD for these components.
For example: (using the JD notation on page 70 of the slides)
* { AC(Agent, Company), AP(Agent, Product), CP(Company Product) }
is a JD with 3 components and we have seen in another answer, where I went through the procedure, that it holds.
While
* { AC(Agent, Company), AP(Agent, Product) }
is a JD with 2 components and we have seen that it does not hold.
The JD with 2 components says the same as (each of) the pair of MVDs
Agent ->> Company
Agent ->> Product
You can also write them more succinctly as
Agent ->> Company | Product
A FD (for example X -> Y
) says that for a given value of X you will always see the same value of Y.
So if you find a first row where X is 5 and Y is 'f', and you look further for a second X with value 5, then Y will again have the value 'f'.
If it hasn't there is no FD.
You can understand MVDs intuitively in the same way but this time Y has several values spread over several rows.
So for X ->> Y
the X could be 5 and you find a Y with (say) 'f' on one row, with 'j' on another row, and with 'm' on still another row.
Then you expect other 5s to also have 'f', 'j', 'm'.
I do not find this easy to apply. I prefer JDs with 2 components.
You talk about the MVDs agent ->> company
, company ->> product
, and agent ->> product
.
For 4NF the middle one is too much. 5NF is not about MVDs, it is about JDs with 3 components.
In your initial question you asked:
Why is it stated that there is no MVD in the above example? Is it because of the interrelationships between the attributes?
I went through the JD procedure for 2 components and showed the JD does not hold and therefore there are also no MVDs. Consequently it is okay for 4NF; it is in 4NF.
So no, it is not "because of the interrelationships between the attributes"; I do not even know what "interrelationships" are. I only know about FDs, MVDs and JDs. There are no MVDs because the procedure tells me so.
Why is the above relation in 4NF and not in 5NF?
I also went through the procedure with 3 components and showed the JD holds, an so the relation is not in 5NF. Again I know this because I went through the procedure that allows me to test (on the sample data) the hypothesis that a JD exists.
On a more intuitive level:
The slides state (in point 4, on page 67)
If an agent sells a product and he represents the company making that product, then he sells that product for that company.
That is the business rule that is the cause of tuple forcing. This is wat makes certain combinations (of company and product) compulsory. This is what we want to detect in the data because it causes the JD (with 3 components). It is typical for a relation that is not in 5NF to have a business rule that forces certain rows to exist when others exist. So when agent a1 sells product p1 (maybe of another company, let us say c2) and company c1 happens to make that product then a row must exist for (a1 c1 p1) since (a1 c2 p1) also exists.
To recap:
A sells P AP
A represents C AC
C makes P CP
implies
A sells P made by C STOCK
so the following JD holds:
* { AP(Agent, Product), AC(Agent, Company), CP(Company Product) }
Upvotes: 1
Reputation: 42
Multivalued dependencies (MVD) apply to 4NF, join dependencies (JD) to 5NF.
A table has a problem with 4NF when it has 2 multivalued columns that are independent of each other and all combinations of the values have been stored. However, the problem is never stated in terms of multivalued attributes; instead it is stated in terms of MVDs. I find it easier when it is defined in terms of a JD. In that case you will have a JD with 2 components.
Let us apply this on your example. We first need to make the projections on
STOCK(Agent Company Product)
a1 c1 p1
a1 c2 p1
a1 c1 p3
a1 c2 p4
a2 c1 p1
a2 c1 p2
a3 c2 p4
First we take Agent, Company:
AC(Agent Company)
a1 c1
a1 c2
a2 c1
a3 c2
We left out column Product and then eliminate duplicates.
Second we take Agent, Product:
AP(Agent Product)
a1 p1
a1 p3
a1 p4
a2 p1
a2 p2
a3 p4
We join them back and get
ACP1
a1 c1 p1
a1 c1 p3
a1 c1 p4
a1 c2 p1
a1 c2 p3
a1 c2 p4
a2 c1 p1
a2 c1 p2
a3 c2 p4
ACP1 is not the same as the original table STOCK, and therefore there is no JD with the 2 components. Hence there are no MVDs an thus no problem with 4NF. It is in 4NF.
There is a problem with 5NF when a JD exists with 3 or more components. In the example, since there are only 3 columns, the only other projection that we can do is on Company, Product:
CP(Company Product)
c1 p1
c2 p1
c1 p3
c2 p4
c1 p2
We now need to join the 3 projections back and see if we get the original table. That is ACP1. We now need to join it with CP. This yields
ACP2
a1 c1 p1
a2 c1 p1
a1 c2 p1
a1 c1 p3
a1 c2 p4
a3 c2 p4
a2 c1 p2
Since in ACP1 we already have all the columns, the effect of joining with CP is not to add a column; instead it is to filter out certain rows.
ACP2 is indeed the same as STOCK. The fact that the rows are in a different order is not important; We get exactly the same rows. So we have a JD, and therefore a problem with 5NF; STOCK is not in 5NF.
The solution is we split it up into 3 tables AC, AP and CP.
Upvotes: 0
Reputation: 42
Right.
The mathematical formulation is unwieldy. What you should understand, but the formal definition makes it hard to understand is : There are 3 kinds of JDs : 1) trivial JDs, 2) JDs implied by keys, and 3) the other JDs.
The first 2 are not a problem and are therefor allowed; the third is the problem and you should get rid of those by splitting up the table.
1) trivial means that one of the components in the JD is the entire heading of the original table. (see page 70 of the slides) 2) implied by keys means that each component contains a key; that is an entire key, not just part of a key. (see page 71) 3) other JDs are problematic: they are the tuple-forcing ones.
"AP, AC and CP are in 5NF because all these projections don't have any JD, is that right?".
That is right: you would need een join dependency with 3 components but you have only 2 attributes in each of these relations. So it just isn't possible. Mind also that if you include both attributes as one of the components, the JD becomes trivial.
Upvotes: 0