Recovering Information from a Decomposition

Recovering Information from a Decomposition

Let us now turn our attention to the question of why the decomposition algorithm of "Decomposition into BCNF" preserves the information that was included in the original relation. The idea is that if we follow this algorithm, then the projections of the original tuples can be "joined" again to produce all and only the original tuples.

To make the situation simpler, let us look at a relation R(A, B, C) and a FD B → C, which we suppose is a BCNF violation. It is possible, for instance, that as in Example (b) of "Decomposition into BCNF", there is a transitive dependency chain, with another FD A → B. In that case, {A} is the only key, and the left side of B → C clearly is not a superkey. Another possibility is that B → C is the only nontrivial FD, in which case the only key is {A, B}. Again, the left side of B → C is not a superkey. In either case, the required decomposition based on the FD B → C separates the attributes into schemas {A, B} and {B, C}.

Let t be a tuple of R. We may write t = (a, b, c), where a, b, and c are the components of t for attributes A, B, and C, respectively. Tuple t projects as (a, b) for the relation with schema {A, B} and as (b, c) for the relation with schema {B, C}.

It is possible to join a tuple from {A, B} with a tuple from {B, C}, provided they agree in the B component. Particularly, (a, b) joins with (b, c) to give us the original tuple t = (a, b, c) back again. That is, regardless of what tuple t we started with, we can always join its projections to get t back.

However, getting back those tuples we started with is not adequate to assure that the original relation R is truly represented by the decomposition.  What might happen if there were two tuples of R, say t = (a, b, c) and v = (d, b, e)?  When we project t onto {A, B} we get u = (a, b), and when we project v onto {B,C} we get w = (b, e), as suggested by the following figure.

Tuples u and w join, since they agree on their B components. The resulting tuple is x = (a, b, e). Is it possible that x is a bogus tuple? That is, could (a,b,e) not be a tuple of R?

Joining two tuples from projected relations

Since we suppose the FD B → C for relation R, the answer is "no". Recall that this FD says any two tuples of R that agree in their B components must also agree in their C components. Since t and v agree in their B components (they both have b there), they also agree on their C components. That means c = e;  i.e., the two values we supposed were different are really the same. Therefore, (a, b, e) is really (a, b, c); that is, x = t.

Since t is in R, it must be that x is in R. Put another way, as long as FD B → C holds, the joining of two projected tuples cannot make a bogus tuple. Rather, every tuple produced by joining is guaranteed to be a tuple of R.

This argument works in general. We assumed A, B, and C were each single attributes, but the same argument would apply if they were any sets of attributes. That is, we take any BCNF-violating FD, let B be the attributes on the left side, let C be the attributes on the right but not the left, and let A be the attributes on neither side. We may conclude:

●  If we decompose a relation according to the method of "Decomposition into BCNF", then the original relation can be recovered exactly by joining the tuples of the new relations in all possible ways.
 
If we decompose relations in a way that is not based on a FD, then we might not be able to recover the original relation. Here is an example.

Example : Assume we have the relation R(A, B, C) as above, but that the FD B → C does not hold. Then R might consist of the two tuples


The projections of R onto the relations with schemas {A,B} and {B,C}
      
are     


and


Respectively. Since all four tuples share the same B-value, 2, each tuple of one relation joins with both tuples of the other relation. Therefore, when we try to reconstruct R by joining, we get    



That is, we get "too much"; we get two bogus tuples, (1,2,5) and (4,2,3) that were not in the original relation R.


Tags