Decomposing Relations

Decomposing Relations

The accepted way to remove these anomalies is to decompose relations. Decomposition of R involves splitting the attributes of R to make schemas of two new relations. Our decomposition rule also includes a way of populating those relations with tuples by "projecting" the tuples of R.  After explaining the decomposition process, we shall explain how to pick a decomposition that removes anomalies.

Given a relation R with schema {A1, A2, . . . , An}, we may decompose R into two relations S and T with schemas {B1, B2, . . . , Bm} and {C1, C2, . . . , Ck} respectively, such that

1.   {A1, A2, . . . , An} = {B1, B2, . . . , Bm}  U  {C1, C2, . . . , Ck}

2.  The tuples in relation S are the projections onto {B1, B2, . . . , Bm} of all the tuples in R. That is for each tuple t in the current instance of R, take the components of t in the attributes B1, B2, . . . , Bm. These components form a tuple, and this tuple belongs in the current instance of S. However, relations are sets, and the same tuple of S could result from projecting two different tuples of R. If so, we put into the current instance of S only one copy of each tuple.

3.  Similarly, the tuples in relation T are the projections, onto set of attributes {C1, C2, . . . , Ck}, of the tuples in the current instance of R.

Example : Let us decompose the Movies relation of Figure 1. First, we shall decompose the schema. Our choice, whose merit will be seen in "Boyce-Codd Normal Form", is to use

1.  A relation called Movies1, whose schema is all the attributes except for starName.

2.  A relation called Movies2, whose schema consists of the attributes title, year, and starName.

The relation Movies exhibiting anomalies

Now, let us show the process of decomposing relation instances by decomposing the sample data of Figure 1. First, let us construct the projection onto the Movies1 schema:

       {title, year, length, filmType, studioName}

The first three tuples of Figure 1, each have the same components in these five attributes:

       (Star Wars, 1977, 124, color, Fox)

The fourth tuple yields a different tuple for the first five components, and the fifth and sixth tuple each yield the same five-component tuple. The resulting relation for Movies1 is shown in Figure 2.

The relation Movies1

Next, consider the projection of Figure 1 onto the schema of Movies2. Each of the six tuples of that figure differ in at least one of the attributes title, year, and starName, so the result is the relation shown in Figure 3.
The relation Movies2

Notice how this decomposition removes the anomalies we mentioned in "Design of Relational Database Schemas / Anomalies". The redundancy has been removed; for instance, the length of each film appears only once, in relation Movies1. The risk of an update anomaly is gone. For example, since we only have to change the length of Star Wars in one tuple of Movies1, we cannot wind up with two different lengths for that movie.

Finally, the risk of a deletion anomaly is gone. If we delete all the stars for Mighty Ducks, say, that deletion makes the movie disappear from Movies2. But all the other information about the movie can still be found in Movies1.

It might appear that Movies2 still has redundancy, since the title and year of a movie can appear many times. However, these two attributes form a key for movies, and there is no more concise way to represent a movie. Furthermore, Movies2 does not offer an opportunity for an update anomaly. For example, one might suppose that if we changed to 2003 the year in the Carrie Fisher tuple, but not the other two tuples for Star Wars, then there would be an update irregularity. However, there is nothing in our assumed FDs that prevents there being a different movie named Star Wars in 2003, and Carrie Fisher may star in that one as well. Thus, we do not want to prevent changing the year in one Star Wars tuple, nor is such a change essentially incorrect.