Projecting Functional Dependencies

Projecting Functional Dependencies

When we learn design of relation schemas, we shall also have need to answer the following question about FD's. Assume we have a relation R with some FD's F, and we "project" R by removing certain attributes from the schema. Consider S is the relation that results from R if we remove the elements corresponding to the dropped attributes, in all R's tuples. Since S is a set, duplicate tuples are replaced by one copy. What FD's hold in S?

The answer is acquired in principle by computing all FD's that:

a) Follow from F, and
b) Involve only attributes of S.

A Complete Set of Inference Rules

As there may be a large number of such FD's, and many of them may be unneeded (i.e., they follow from other such FD's), we are free to simplify that set of FD's if we wish. However, generally, the calculation of the FD's for S is in the worst case exponential in the number of attributes of S.

Example : Assume R(A, B, C, D) has FD's A → B, B → C, and C → D. Assume also that we wish to project out the attribute B, leaving a relation S(A,C, D). Technically, to find the FD's for S, we need to take the closure of all eight subsets of {A, C, D}, using the full set of FD's, including those involving B. However, there are some clear simplifications we can make.

●  Closing the empty set and the set of all attributes cannot yield a nontrivial FD. 

●  If we already know that the closure of some set X is all attributes, then we cannot discover any new FD's by closing supersets of X.

Therefore, we may start with the closures of the singleton sets, and then move on to the doubleton sets if required. For each closure of a set X, we add the FD X → E for each attribute E that is in X+ and in the schema of S, but not in X.

First, {A}+ = {A, B, C, D). Thus, A → C and A → D hold in S. Note that A → B is true in R, but makes no sense in S because B is not an attribute of S.

Next, we suppose {C}+ = {C,D}, from which we get the additional FD C → D for S. Since {D}+ = {D}, we can add no more FD's, and are done with the singletons.

Since {A}+ contains all attributes of S, there is no point in considering any superset of {A}. The reason is that whatever FD we could discover, for instance AC → D, follows by the rule for augmenting left sides from one of the FD's we already discovered for S by considering A alone as the left side. Therefore, the only doubleton whose closure we need to take is {C,D}+ = {C,D}. This observation allows us to add nothing. We are done with the closures, and the FD's we have discovered are A → C, A → D, and C → D.

If we wish, we can observe that A → D follows from the other two by transitivity. Thus a simpler, equivalent set of FD's for S is A → C and C → D.