Why the Closure Algorithm Works

Why the Closure Algorithm Works

In this section, we shall show why the closure algorithm correctly decides whether or not a FD A1A2 . . . An → B follows from a given set of FD's S. There are two parts to the proof:

1.  We must prove that the closure algorithm does not claim too much. That is, we must show that if A1A2 . . . An → B is asserted by the closure test (i.e., B is in {A1, A2, . . . , An}+), then A1A2 . . . An → B holds in any relation that satisfies all the FD's in S.

2.   We must prove that the closure algorithm does not fail to discover a FD that truly follows from the set of FD's S.

Why the Closure Algorithm Claims only True FD's

We can prove by induction on the number of times that we apply the growing operation of step 2 that for every attribute D in X, the FD A1A2 . . . An → D holds (in the special case where D is among the As, this FD is trivial). That is, every relation R satisfying all of the FD's in S also satisfies A1A2 . . . An → D. 

BASIS:  The basis case is when there are zero steps. Then D must be one of A1, A2 . . . , An,   and surely A1A2 . . . An → D holds in any relation, because it is a trivial FD.

INDUCTION:  For the induction, suppose D was added when we used the FD B1B2 . . . Bm → D. We know by the inductive hypothesis that R satisfies A1A2 . . . An → Bi for all i = 1,2, . . . , m. Put another way, any two tuples of R that agree on all of A1, A2, . . . , An also agree on all of B1, B2 , . . , Bm . Since R satisfies B1B2 . . . Bm → D, we also know that these two tuples agree on D. Thus, R satisfies A1A2 . . . An → D.           

Why the Closure Algorithm Discovers All True FD's

Suppose A1A2 . . . An → B were a FD that the closure algorithm says does not follow from set S. That is, the closure of {A1, A2, . . . , An} using set of FD's S does not include B. We must show that FD A1A2 . . . An → B really doesn't follow from S. That is, we must show that there is at least one relation instance that satisfies all the FD's in S, and yet does not satisfy A1A2 . . . An → B.

This instance I is actually quite simple to construct; it is shown in the following figure. I has only two tuples t and s. The two tuples agree in all the attributes of {A1, A2, . . . , An}+ , and they disagree in all the other attributes. We must show first that I satisfies all the FD's of S, and then it does not satisfy A1A2 . . . An → B.
An instance I satisfying S

Suppose there were some FD C1C2 . . . Ck → D in set S that instance I does not satisfy. Since I has only two tuples, t and s, those must be the two tuples that violate C1C2 . . . Ck → D. That is, t and s agree in all the attributes of {C1, C2, . . . , Ck}, yet disagree on D. If we examine the above figure, we see that all of C1, C2, . . . , Ck must be among the attributes of {A1, A2, . . . , An}+ , because those are the only attributes on which t and s agree. Likewise, D must be among the other attributes, because only on those attributes do t and s disagree.

But then we did not compute the closure correctly. C1C2 . . . Ck → D should have been applied when X was {A1, A2, . . . , An} to add D to X. We conclude that C1C2 . . . Ck → D cannot exist; i.e., instance I satisfies S.

Second, we must show that I does not satisfy A1A2 . . . An → B. However, this part is easy. Surely, A1, A2, . . . , An are among the attributes on which t and s agree. Also, we know that B is not in {A1, A2, . . . , An}+ , so B is one of the attributes on which t and s disagree. Thus, I does not satisfy A1A2 . . . An → B. We conclude that the closure algorithm asserts neither too few nor too many FD's; it asserts exactly those FD's that do follow from S.



Tags