Functional dependencies in database

Question

I have the following set of functional dependencies on the relation schema r(A, B, C, D, E, F ) :

A -> BCD
BC -> DE
B -> D
D -> A

Can anyone show with explanation how to find the candidate keys for this relation ?

willeM_ Van Onsem · Accepted Answer

Short answer: The candidate keys are (A,F), (B,F) and (D,F).

Like this Wikipedia article says:

The set of all candidate keys can be computed e.g. from the set of functional dependencies. To this end we need to define the attribute closure α⁺ for an attribute set α. The set α⁺ contains all attributes that are functionally implied by α.

It is quite simple to find a single candidate key. We start with a set α of attributes and try to remove successively each attribute. If after removing an attribute the attribute closure stays the same, then this attribute is not necessary and we can remove it permanently. We call the result minimize(α). If α is the set of all attributes, then minimize(α) is a candidate key.

So now we only need to put this into practice. Let's start with all attributes α₀=(A,B,C,D,E,F). Now we can look whether removing A generates a problem. With α'₀=(B,C,D,E,F), α'₀⁺ is (B,C,D,E,F,A) (since D→ A holds). Now by kicking out A permanently and trying to remove B, we will end up with a candidate key.

α₁=(B,C,D,E,F). Can we throw out B? Yes because α'₁=(C,D,E,F) will result in α'₁⁺=(A,B,C,D,E,F) (since D→A and A→BCD).
α₂=(C,D,E,F). Can we throw out C? Yes because α'₂=(D,E,F) will result in α'₂⁺=(A,B,C,D,E,F) (since D→A and A→BCD).
α₃=(D,E,F). Can we throw out D? No because α'₃=(E,F) will result in α'₃⁺=(E,F).
α₃=(D,E,F). Can we throw out E? Yes because α'₃=(D,F) will result in α'₃⁺=(A,B,C,D,E,F) (since D→A; A→BCD; and BC→DE).
α₄=(D,F). Can we throw out F? No because α'₄=(D) will result in α'₄⁺=(A,B,C,D,E) (since D→A; A→BCD; and BC→DE).

So now we generated minimize(α₀)=α₄=(D,F). We could use a brute force approach where in each iteration we iterate over all possible keys we can remove. But this will cost exponential time to generate.

The Wikipedia article however includes a way to generate all candidate keys polynomial in the number of keys and the functional dependencies. The algorithm is defined as:

function find_candidate_keys(A, F)
/* A is the set of all attributes and F is the set of functional dependencies */
K[0] := minimize(A);
n := 1; /* Number of Keys known so far */
i := 0; /* Currently processed key */
while i < n do
  foreach α → β ∈ F do
    /* Build a new potential key from the previous known key and the current FD */
    S := α ∪ (K[i] − β);
    /* Search whether the new potential key is part of the already known keys */ 
    found := false;
    for j := 0 to n-1 do
      if K[j] ⊆ S then found := true;
    /* If not, add if 
    if not found then
      K[n] := minimize(S);
      n := n + 1;
  i := i + 1
return K

So if we run the algorithm, we first have to calculate minimize(A), but the nice thing is: we already did that above. So K[0] = (D,F), n=1 and i=0.

Now we take the while loop and start iterating over all functional dependencies.

for A→ BCD. So now we construct a key (A,F). We check if there is already a subset defined as key (which is not the case). Now we minimize it, like: (A,F)→(A,F). So we add a new key (A,F).
for BC→DE. So now we construct a key (B,C,F). We check if there is already a subset defined as key (which is not the case). Now we minimize it, like (B,C,F)→(B,F)→(B,F). So we add (B,F).
for B→D. So now we construct a key (B,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for D→A. So now we construct a key (D,F). We check if there is already a subset defined as key (this is the case). We don't add this one.

This is the end of the first iteration. So K is now K=[(D,F),(A,F),(B,F)]. n=3 and now i=1. So for K[i]=(A,F) we now iterate:

for A→ BCD. So now we construct a key (A,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for BC→DE. So now we construct a key (B,C,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for B→D. So now we construct a key (A,B,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for D→A. So now we construct a key (D,F). We check if there is already a subset defined as key (this is the case). We don't add this one.

This is the end of the second iteration. So K is now K=[(D,F),(A,F),(B,F)]. n=3 and now i=2. So for K[i]=(B,F) we now iterate:

for A→ BCD. So now we construct a key (A,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for BC→DE. So now we construct a key (B,C,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for B→D. So now we construct a key (B,F). We check if there is already a subset defined as key (this is the case). We don't add this one.
for D→A. So now we construct a key (B,D,F). We check if there is already a subset defined as key (this is the case). We don't add this one.

So at the end K=[(D,F),(A,F),(B,F)]. These are all candidate keys.

Functional dependencies in database

Answers (1)

Related Questions