Representing Set-Valued Attributes

Record structures are not the most complicated kind of attribute that can appear in ODL class definitions. Values can also be built using the type constructors Set, Bag, List, Array, and Dictionary from "Types in ODL". Each presents its own problems when migrating to the relational model. We shall discuss in detail only the Set constructor, which is the most common.

One approach to representing a set of values for an attribute A is to make one tuple for each value. That tuple contains the appropriate values for all the other attributes besides A. Let us first see an example where this approach works well, and then we shall see a pitfall.

Stars with a set of addresses

Example (a): Assume that class Star were defined so that for each star we could record a set of addresses, as in Figure (a). Assume next that Carrie Fisher also has a beach home, but the other two stars mentioned in "Nonatomic Attributes in Classes" Figure (b) each have only one home. Then we may create two tuples with name attribute equal to "Carrie Fisher", as shown in Figure (b). Other tuples remain as they were in "Nonatomic Attributes in Classes" Figure (b).
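The one-tuple-per-value idea can be sketched in a few lines of Python. This is only an illustration: the dictionary `stars`, the function `flatten`, and the address values are hypothetical stand-ins, since the figures themselves are not reproduced here.

```python
# Hypothetical data: each star maps to a set of (street, city) addresses.
stars = {
    "Carrie Fisher": {("123 Maple St.", "Hollywood"), ("5 Locust Ln.", "Malibu")},
    "Mark Hamill": {("456 Oak Rd.", "Brentwood")},
    "Harrison Ford": {("789 Palm Dr.", "Beverly Hills")},
}


def flatten(objects):
    """Return one (name, street, city) tuple per address in each star's set."""
    return sorted(
        (name, street, city)
        for name, addresses in objects.items()
        for street, city in addresses
    )


stars_relation = flatten(stars)
for row in stars_relation:
    print(row)
```

With this data, the star with two addresses contributes two tuples and every other star contributes one, which matches the situation described in the example.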

Allowing a set of addresses

Unfortunately, this technique of replacing objects with one or more set-valued attributes by collections of tuples, one for each combination of values for these attributes, can lead to unnormalized relations, of the type explained in "Design of Relational Database Schemas / Anomalies". In fact, even one set-valued attribute can lead to a BCNF violation, as the next example shows.

Atomic Values

Stars with a set of addresses and a birthdate

Example (b): Assume that we add birthdate as an attribute in the definition of the Star class; that is, we use the definition shown in Figure (c). We have added to Figure (a) the attribute birthdate of type Date, which is one of ODL's atomic types. The birthdate attribute can be an attribute of the Stars relation, whose schema now becomes:

Stars (name, street, city, birthdate)

Let us make another change to the data of Figure (b). Since a set of addresses can be empty, let us suppose that Harrison Ford has no address in the database. Then the revised relation is shown in Figure (d). Two bad things have happened:

1. Carrie Fisher's birthdate has been repeated in each tuple, causing redundancy. Note that her name is also repeated, but that repetition is not true redundancy, because without the name appearing in each tuple we could not know that both addresses were associated with Carrie Fisher.

2. Because Harrison Ford has an empty set of addresses, we have lost all information about him. This situation is an instance of a deletion anomaly that we explained in "Design of Relational Database Schemas / Anomalies".
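Both problems can be reproduced with a small Python sketch. The names `stars` and `flatten` and all birthdate and address values are hypothetical placeholders chosen for illustration, not data taken from the figures.

```python
# Hypothetical Star objects; birthdates and addresses are placeholders.
stars = {
    "Carrie Fisher": {
        "birthdate": "1999-09-09",
        "addresses": {("123 Maple St.", "Hollywood"), ("5 Locust Ln.", "Malibu")},
    },
    "Mark Hamill": {
        "birthdate": "1988-08-08",
        "addresses": {("456 Oak Rd.", "Brentwood")},
    },
    "Harrison Ford": {
        "birthdate": "1977-07-07",
        "addresses": set(),  # empty set of addresses
    },
}


def flatten(objects):
    """Return one (name, street, city, birthdate) tuple per address."""
    return sorted(
        (name, street, city, star["birthdate"])
        for name, star in objects.items()
        for street, city in star["addresses"]
    )


relation = flatten(stars)
names_in_relation = {row[0] for row in relation}
fisher_birthdates = [row[3] for row in relation if row[0] == "Carrie Fisher"]

print(fisher_birthdates)                     # the same birthdate, once per address
print("Harrison Ford" in names_in_relation)  # False: no address, so no tuple
```

The star with an empty address set produces no tuple at all, so the birthdate stored in the object has nowhere to go in the flattened relation.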

Adding birthdates

Although name is a key for the class Star, our need to have several tuples for one star to represent all their addresses means that name is not a key for the relation Stars. In fact, the key for that relation is {name, street, city}. Therefore, the functional dependency

name → birthdate

is a BCNF violation, because its left side, name, is not a superkey of Stars. This fact explains why the anomalies mentioned above can occur.
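As a rough check of this reasoning, the sketch below tests a sample instance of Stars. The helper names `project`, `fd_holds`, and `is_unique` and the sample rows are assumptions made for illustration. Note that an instance can only be consistent with a functional dependency or a key; the dependency and the key themselves are statements about the schema.

```python
ATTRS = ("name", "street", "city", "birthdate")

# Hypothetical instance of Stars(name, street, city, birthdate).
rows = [
    ("Carrie Fisher", "123 Maple St.", "Hollywood", "1999-09-09"),
    ("Carrie Fisher", "5 Locust Ln.", "Malibu", "1999-09-09"),
    ("Mark Hamill", "456 Oak Rd.", "Brentwood", "1988-08-08"),
]


def project(row, attrs):
    """Pick the components of a row that belong to the given attributes."""
    return tuple(row[ATTRS.index(a)] for a in attrs)


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but disagree on rhs."""
    seen = {}
    for row in rows:
        key, value = project(row, lhs), project(row, rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_unique(rows, attrs):
    """True if no two rows agree on all of attrs in this instance."""
    projected = [project(row, attrs) for row in rows]
    return len(set(projected)) == len(projected)


fd_ok = fd_holds(rows, ["name"], ["birthdate"])
name_unique = is_unique(rows, ["name"])
full_unique = is_unique(rows, ["name", "street", "city"])
print(fd_ok, name_unique, full_unique)
```

In this instance the dependency from name to birthdate is satisfied while name alone does not identify a tuple, which is the combination that makes the dependency a BCNF violation.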

There are several options for handling set-valued attributes that appear in a class declaration along with other attributes, set-valued or not. First, we may simply place all attributes, set-valued or not, in the schema for the relation, then use the normalization techniques of "Design of Relational Database Schemas / Anomalies" and "Multivalued Dependencies" to remove the resulting BCNF and 4NF violations. Note that a set-valued attribute in conjunction with a single-valued attribute leads to a BCNF violation, as in Example (b). Two set-valued attributes in the same class declaration will lead to a 4NF violation.
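To see why two set-valued attributes cause trouble, consider a sketch in which a star has a set of addresses and, purely as an assumption for illustration, a second set-valued attribute `phones`. Making one tuple for each combination of values pairs every address with every phone number.

```python
from itertools import product

# Hypothetical values; `phones` is an invented second set-valued attribute.
addresses = {("123 Maple St.", "Hollywood"), ("5 Locust Ln.", "Malibu")}
phones = {"555-0100", "555-0101", "555-0102"}

# One tuple per combination of an address and a phone number.
rows = sorted(
    ("Carrie Fisher", street, city, phone)
    for (street, city), phone in product(addresses, phones)
)

print(len(rows))  # 2 addresses x 3 phones
for row in rows:
    print(row)
```

The addresses and phone numbers are independent of each other, yet each address is repeated once per phone number and vice versa. This independent pairing is the kind of redundancy that 4NF is meant to rule out.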

The second approach is to separate out each set-valued attribute as if it were a many-many relationship between the objects of the class and the values that appear in the sets. We shall discuss this approach for relationships in "Representing ODL Relationships".
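One possible reading of this second approach, sketched with Python's standard `sqlite3` module, keeps the single-valued attributes in one relation and gives the set-valued attribute a relation of its own. The relation name `StarAddresses` and all data values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Stars (
        name TEXT PRIMARY KEY,
        birthdate TEXT
    );
    -- Hypothetical relation holding one row per (star, address) pair.
    CREATE TABLE StarAddresses (
        name TEXT REFERENCES Stars(name),
        street TEXT,
        city TEXT,
        PRIMARY KEY (name, street, city)
    );
    """
)

conn.executemany(
    "INSERT INTO Stars VALUES (?, ?)",
    [
        ("Carrie Fisher", "1999-09-09"),
        ("Mark Hamill", "1988-08-08"),
        ("Harrison Ford", "1977-07-07"),  # no addresses, but still recorded
    ],
)
conn.executemany(
    "INSERT INTO StarAddresses VALUES (?, ?, ?)",
    [
        ("Carrie Fisher", "123 Maple St.", "Hollywood"),
        ("Carrie Fisher", "5 Locust Ln.", "Malibu"),
        ("Mark Hamill", "456 Oak Rd.", "Brentwood"),
    ],
)

star_names = [r[0] for r in conn.execute("SELECT name FROM Stars ORDER BY name")]
fisher_address_count = conn.execute(
    "SELECT COUNT(*) FROM StarAddresses WHERE name = ?", ("Carrie Fisher",)
).fetchone()[0]
print(star_names, fisher_address_count)
```

Under this layout each birthdate is stored once, and a star with an empty set of addresses still has a tuple in Stars.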