Outerjoins

Outerjoins

An attribute of the join operator is that it is possible for certain tuples to be "dangling"; that is, they fail to match any tuple of the other relation in the common attributes. Dangling tuples do not have any trace in the result of the join, so the join may not represent the data of the original relations completely. In cases where this behavior is undesirable, a variation on the join, called "outerjoin", has been proposed and appears in several commercial systems.

We shall examine the "natural" case first, where the join is on equated values of all attributes in common to the two relations. The outerjoin  is formed by starting with , and adding any dangling tuples from R or S. The added tuples must be padded with a special null symbol, , in all the attributes that they do not possess but that appear in the join result.    

Example 1: In Figure 1 we see two relations U and V. Tuple (1,2,3) of U  joins with both (2,3,10) and (2,3,11) of V, so these three tuples are not dangling. On the other hand, the other three tuples - (4,5,6) and (7,8,9) of U and (6,7,12) of V - are dangling. That is, for none of these three tuples is there a tuple of the other relation that agrees with it on both the B and C components. Thus, in , the three dangling tuples are padded with in the attributes that they do not have: attribute D for the tuples of U and attribute A for the tuple of V.

There are many variants of the basic (natural) outerjoin idea. The left outerjoin  is like the outerjoin, but only dangling tuples of the left argument R are padded with and added to the result. The right outerjoin  is like the outerjoin, but only the dangling tuples of the right argument S are padded with and added to the result.

Example 2: If U and V are as in Figure 1, then  is:


     Outerjoin of relations


and
 

Furthermore, all three natural outerjoin operators have theta-join analogs, where first a theta-join is taken and then those tuples that failed to join with any tuple of the other relation, when the condition of the theta-join was applied, are padded with and added to the result. We use  to denote a theta-outerjoin with condition C. This operator can also be modified with L or R to indicate left- or right-outerjoin.

Example 3: Let U and V be the relations of Figure 1, and look at . Tuples (4,5,6) and (7,8,9) of U each satisfy the condition with both of the tuples (2,3,10) and (2,3,11) of V. In this way, none of these four tuples are dangling in this theta-join. On the other hand, the two other tuples - (1,2,3)  of U and (6,7,12) of V - are dangling. They therefore appear, padded, in the result shown in Figure 2.

   Result of a theta-outerjoin



Tags