Design of Relational Database Schemas / Anomalies

Design of Relational Database Schemas / Anomalies

Design of Relational Database Schemas

Careless selection of a relational database schema can lead to problems. For instance, Example (b) of "Combining Relations" showed what happens if we try to combine the relation for a many-many relationship with the relation for one of its entity sets. The major problem we identified is redundancy, where a fact is repeated in more than one tuple. This problem is seen in Figure 2 of "Functional Dependencies", which we reproduce here as Figure 1; the length and film-type for Star Wars and Wayne's World are each repeated, once for each star of the movie.

In this section, we shall tackle the problem of design of good relation schemas in the following stages:

1. We first explore in more detail the problems that arise when our schema is flawed.
2. Then, we introduce the idea of "decomposition", breaking a relation schema (set of attributes) into two smaller schemas.
3. Next, we introduce "Boyce-Codd normal form", or "BCNF", a condition on a relation schema that removes these problems.
4. These points are tied together when we explain how to assure the BCNF condition by decomposing relation schemas.

The relation Movies exhibiting anomalies


Problems such as redundancy that take place when we try to cram too much into a single relation are called anomalies. The major kinds of anomalies that we meet are:

1. Redundancy. Information may be repeated without need in various tuples. Examples are the length and film type for movies as in Figure 1.

2. Update Anomalies. We may alter information in one tuple but leave the same information unchanged in another. For example, if we found that Star Wars was really 125 minutes long, we might carelessly change the length in the first tuple of Figure 1 but not in the second or third tuples. True, we might argue that one should never be so careless. But we shall see that it is possible to redesign relation Movies so that the risk of such mistakes does not exist.

3. Deletion Anomalies. If a set of values becomes vacant, we may lose other information as a side effect. For instance, should we delete Emilio Estevez from the set of stars of Mighty Ducks, then we have no more stars for that movie in the database. The last tuple for Mighty Ducks in the relation Movies would vanish, and with it information that it is 104 minutes long and in color.