Boyce-Codd Normal Form

Boyce-Codd Normal Form

The objective of decomposition is to replace a relation by several that do not exhibit anomalies. There is, it turns out, a simple condition under which the anomalies discussed above can be guaranteed not to exist. This condition is called Boyce-Codd normal form, or BCNF.

●  A relation R is in BCNF if and only if: whenever there is a nontrivial FD A1A2 . . . An → B for R, it is the case that {A1, A2, . . . , An} is a superkey for R

That is, the left side of every nontrivial FD must be a superkey. Remember that a superkey need not be minimal. Thus, a corresponding statement of the BCNF condition is that the left side of every nontrivial FD must include a key.

When we find a BCNF-violating FD, we occasionally wind up with a simpler complete decomposition into BCNF relations if we enlarge the right side of the FD to include the right sides of all the other FDs that have the same left side, whether or not they are BCNF violations. The following is an alternative definition of BCNF in which we look for a set of FDs with common left side, at least one of which is nontrivial and violates the BCNF condition.

●  Relation R is in BCNF if an only if: whenever there is a nontrivial FD A1A2 . . . An → B1B2 . . . Bm for R, it is the case that {A1,A2, . . . , An} is a superkey for R.

This requirement is equivalent to the original BCNF condition. Recall that the FD A1A2 . . . An → B1B2 . . . Bm is shorthand for the set of FDs A1A2 . . . An → Bi for i = 1,2, . . . , m. Since there must be at least one Bi that is not among the As (or else A1A2 . . . An → B1B2 . . . Bm would be trivial), A1A2 . . . An → Bi is a BCNF violation according to our original definition.

Example (a) : Relation Movies, as in "Design of Relational Database Schemas / Anomalies" Figure 1, is not in BCNF. To see why, we first need to determine what sets of attributes are keys. We argued in "Keys of Relations" Example, why {title, year, starName} is a key. In this way, any set of attributes containing these three is a superkey. The same arguments we followed in "Keys of Relations" Example, can be used to explain why no set of attributes that does not include all three of title, year and starName could be a superkey. Therefore, we state that {title, year, starName} is the only key for Movies.

However, consider the FD

        title year  length filmType studioName

which holds in Movies according to our discussion in "Keys of Relations" Example.

Unluckily, the left side of the above FD is not a superkey. Particularly, we know that title and year do not functionally determine the sixth attribute, starName. Therefore, the existence of this FD violates the BCNF condition and tell us Movies is not in BCNF. Furthermore, according to the original definition of BCNF, where a single attribute on the right side was required, we can offer any of the three FDs, such as title year → length, as a BCNF violation.

Example (b) : On the other hand, Movies1 of "Decomposing Relations" Figure 2,  is in BCNF. Since

        title year length filmType studioName

holds in this relation, and we have argued that neither title nor year by itself functionally determines any of the other attributes, the only key for Movies1 is {title, year}. Moreover, the only nontrivial FDs must have at least title and year on the left side, and therefore their left sides must be superkeys. Thus, Movies1 is in BCNF.

Example (c) : We claim that any two-attributes relation is in BCNF. We need to study the possible nontrivial FDs with a single attribute on the right. There are not too many cases to consider, so let us consider them in turn. In what follows, suppose that the attributes are A and B.

1.  There are no nontrivial FDs. Then surely the BCNF condition must hold, because only a nontrivial FD can violate this condition. By the way, note that {A,B} is the only key in this case.

2.  A → B holds, but B → A does not hold. In this case, A is the only key, and each nontrivial FD contains A on the left (in fact the left can only be A). Thus there is no violation of the BCNF condition.

3.  B → A holds, but A → B does not hold. This case is symmetric to case (2).

4.  Both A → B and B → A hold. Then both A and B are keys. Surely any FD has at least one of these on the left, so there can be no BCNF violation.

It is worth noticing from case (4) above that there may be more than one key for a relation. Moreover, the BCNF condition only requires that some key be contained in the left side of any nontrivial FD, not that all keys are contained in the left side. Also examine that a relation with two attributes, each functionally determining the other, is not completely implausible. For instance, a company may assign its employees unique employee IDs and also record their Social Security numbers. A relation with attributes empID and ssNo would have each attribute functionally determining the other. Put another way, each attribute is a key, since we dont expect to find two tuples that agree on either attribute.



Tags