*Decomposing Relations*

The accepted way to remove these anomalies is to decompose relations. Decomposition of R involves splitting the attributes of R to make schemas of two new relations. Our decomposition rule also includes a way of populating those relations with tuples by "projecting" the tuples of R. After explaining the decomposition process, we shall explain how to pick a decomposition that removes anomalies.

Given a relation R with schema {A_{1}, A_{2}, . . . , A_{n}}, we may decompose R into two relations S and T with schemas {B_{1}, B_{2}, . . . , B_{m}} and {C_{1}, C_{2}, . . . , C_{k}} respectively, such that

1. {A_{1}, A_{2}, . . . , A_{n}} = {B_{1}, B_{2}, . . . , B_{m}} U {C_{1}, C_{2}, . . . , C_{k}}

2. The tuples in relation S are the projections onto {B_{1}, B_{2}, . . . , B_{m}} of all the tuples in R. That is for each tuple t in the current instance of R, take the components of t in the attributes B_{1}, B_{2}, . . . , B_{m}. These components form a tuple, and this tuple belongs in the current instance of S. However, relations are sets, and the same tuple of S could result from projecting two different tuples of R. If so, we put into the current instance of S only one copy of each tuple.

3. Similarly, the tuples in relation T are the projections, onto set of attributes {C_{1}, C_{2}, . . . , C_{k}}, of the tuples in the current instance of R.**Example :** Let us decompose the Movies relation of Figure 1. First, we shall decompose the schema. Our choice, whose merit will be seen in "Boyce-Codd Normal Form", is to use

1. A relation called Movies1, whose schema is all the attributes except for starName.

2. A relation called Movies2, whose schema consists of the attributes title, year, and starName.

Now, let us show the process of decomposing relation instances by decomposing the sample data of Figure 1. First, let us construct the projection onto the Movies1 schema:

**{title, year, length, filmType, studioName}**

The first three tuples of Figure 1, each have the same components in these five attributes:

**(Star Wars, 1977, 124, color, Fox)**

The fourth tuple yields a different tuple for the first five components, and the fifth and sixth tuple each yield the same five-component tuple. The resulting relation for Movies1 is shown in Figure 2.

Next, consider the projection of Figure 1 onto the schema of Movies2. Each of the six tuples of that figure differ in at least one of the attributes title, year, and starName, so the result is the relation shown in Figure 3.

Notice how this decomposition removes the anomalies we mentioned in "Design of Relational Database Schemas / Anomalies". The redundancy has been removed; for instance, the length of each film appears only once, in relation Movies1. The risk of an update anomaly is gone. For example, since we only have to change the length of Star Wars in one tuple of Movies1, we cannot wind up with two different lengths for that movie.

Finally, the risk of a deletion anomaly is gone. If we delete all the stars for Mighty Ducks, say, that deletion makes the movie disappear from Movies2. But all the other information about the movie can still be found in Movies1.

It might appear that Movies2 still has redundancy, since the title and year of a movie can appear many times. However, these two attributes form a key for movies, and there is no more concise way to represent a movie. Furthermore, Movies2 does not offer an opportunity for an update anomaly. For example, one might suppose that if we changed to 2003 the year in the Carrie Fisher tuple, but not the other two tuples for Star Wars, then there would be an update irregularity. However, there is nothing in our assumed FDs that prevents there being a different movie named Star Wars in 2003, and Carrie Fisher may star in that one as well. Thus, we do not want to prevent changing the year in one Star Wars tuple, nor is such a change essentially incorrect.

### Tags

- anomalies
- tuples
- redundancy
- Scrolling Cursors
- Protecting Against Concurrent Updates
- Modifications by Cursor
- Cursors
- Triggers in SQL
- Schema-Level Constraints and Triggers
- Tuple-Based CHECK Constraints
- Keys Declared With UNIQUE
- Constraints and Triggers
- Modifying Views
- View Definitions
- Introduction to Selection of Indexes
- Default Values / Indexes
- Simple Table Declarations
- Deletion / Updates
- Database Modifications
- Grouping / HAVING Clauses
- Full-Relation Operations
- Union, Intersection, and Difference of Queries
- Tuple Variables
- Disambiguating Attributes
- Queries Involving More Than One Relation
- Null Values and Comparisons Involving NULL
- Projection in SQL
- The Database Language SQL
- Additional Constraint Examples
- Extending the Projection Operator
- Grouping
- Extended Operators of Relational Algebra
- Selection on Bags / Product of Bags / Joins of Bags
- Union, Intersection, and Difference of Bags
- Relational Operations on Bags
- Dependent and Independent Operations
- Selection / Cartesian Product
- Set Operations on Relations
- An Algebra of Relational Operations
- Relational Algebra
- Information Integration Via Semistructured Data
- References
- Nested Relations
- The Object-Relational Model
- Representing ODL Relationships
- Representing Other Type Constructors
- Nonatomic Attributes in Classes
- Relationships in ODL / Inverse Relationships
- Reasoning About Multivalued Dependencies
- Definition of Multivalued Dependencies
- Multivalued Dependencies
- Boyce-Codd Normal Form
- Design of Relational Database Schemas / Anomalies
- Why the Closure Algorithm Works
- Trivial Functional Dependencies
- Rules About Functional Dependencies
- Keys of Relations
- Using Null Values to Combine Relations - Comparison of Approaches
- Converting Subclass Structures to Relations
- From E/R Diagrams to Relational Designs
- Relation Instances
- Equivalent Representations of a Relation
- Tuples / Domains
- Converting Multiway Relationships to Binary
- Multiway Relationships
- Database System Implementation
- Database Design
- Multimedia Data
- Relational Database Systems