Union, Intersection, and Difference of Bags

When we take the union of two bags, we add the number of occurrences of each tuple. That is, if R is a bag in which tuple t appears n times, and S is a bag in which tuple t appears m times, then in the bag R ∪ S tuple t appears n + m times. Note that either n or m (or both) can be 0.

When we intersect two bags R and S, in which tuple t appears n and m times, respectively, tuple t appears min(n, m) times in R ∩ S. When we calculate R - S, the difference of bags R and S, tuple t appears max(0, n - m) times in R - S. That is, if t appears in R more times than it appears in S, then in R - S tuple t appears the number of times it appears in R, minus the number of times it appears in S. On the other hand, if t appears at least as many times in S as it appears in R, then t does not appear at all in R - S. Intuitively, each occurrence of t in S "cancels" one occurrence in R.
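The three definitions above can be sketched directly in Python. This is a minimal illustration, assuming a bag is represented as a `collections.Counter` that maps each tuple to its number of occurrences; the function names `bag_union`, `bag_intersection`, and `bag_difference` are chosen here for illustration and do not come from the text.

```python
from collections import Counter


def bag_union(r, s):
    """Bag union: a tuple with n copies in r and m in s gets n + m copies."""
    # Counter returns 0 for a missing tuple, which covers the n = 0 or m = 0 case.
    return Counter({t: r[t] + s[t] for t in r.keys() | s.keys()})


def bag_intersection(r, s):
    """Bag intersection: each tuple gets min(n, m) copies."""
    result = Counter()
    for t in r.keys() & s.keys():
        count = min(r[t], s[t])
        if count > 0:
            result[t] = count
    return result


def bag_difference(r, s):
    """Bag difference r - s: each tuple gets max(0, n - m) copies."""
    result = Counter()
    for t in r:
        count = max(0, r[t] - s[t])
        if count > 0:  # tuples cancelled completely are left out of the result
            result[t] = count
    return result
```

Python's `Counter` also provides the operators `+`, `&`, and `-`, which add counts, take the minimum of counts, and subtract counts while keeping only positive results, so they follow the same bag semantics as the functions above.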

Example (a): Let R be the relation of "Relational Operations on Bags" Figure (a), that is, a bag in which tuple (1,2) appears three times and (3,4) appears once. Let S be the bag

    A   B
    1   2
    3   4
    3   4
    5   6

Then the bag union R ∪ S is the bag in which (1,2) appears four times (three times for its occurrences in R and once for its occurrence in S), (3,4) appears three times, and (5,6) appears once.

The bag intersection R ∩ S is the bag

    A   B
    1   2
    3   4

with one occurrence each of (1,2) and (3,4). That is, (1,2) appears three times in R and once in S, and min(3,1) = 1, so (1,2) appears once in R ∩ S. Likewise, (3,4) appears min(1,2) = 1 time in R ∩ S. Tuple (5,6), which appears once in S but zero times in R, appears min(0,1) = 0 times in R ∩ S.
 
The bag difference R - S is the bag

    A   B
    1   2
    1   2

To see why, notice that (1,2) appears three times in R and once in S, so in R - S it appears max(0,3 - 1) = 2 times. Tuple (3,4) appears once in R and twice in S, so in R - S it appears max(0,1 - 2) = 0 times. No other tuple appears in R, so there can be no other tuples in R - S.

As another example, the bag difference S - R is the bag

    A   B
    3   4
    5   6

Tuple (3,4) appears once because that is the difference in the number of times it appears in S minus the number of times it appears in R. Tuple (5,6) appears once in S - R for the same reason. The resulting bag happens to be a set in this case.
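The results of Example (a) can be checked with a short Python sketch. It assumes, for illustration only, that each bag is stored as a `collections.Counter` keyed by tuple, and it relies on the built-in `Counter` operators `+`, `&`, and `-`. The contents of R and S are taken from the occurrence counts stated in the example.

```python
from collections import Counter

# R: (1,2) three times and (3,4) once.
R = Counter({(1, 2): 3, (3, 4): 1})
# S: (1,2) once, (3,4) twice, (5,6) once.
S = Counter({(1, 2): 1, (3, 4): 2, (5, 6): 1})

union = R + S          # counts are added:      (1,2): 4, (3,4): 3, (5,6): 1
intersection = R & S   # minimum of the counts: (1,2): 1, (3,4): 1
r_minus_s = R - S      # max(0, n - m):         (1,2): 2
s_minus_r = S - R      # max(0, m - n):         (3,4): 1, (5,6): 1

# S - R happens to be a set: no tuple occurs more than once.
s_minus_r_is_set = all(count == 1 for count in s_minus_r.values())
```

Note that `Counter`'s `|` operator takes the maximum of the two counts rather than their sum, so it does not compute the bag union described here; `+` is the operator that does.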

Projection of Bags

We have already illustrated the projection of bags. As we saw in "Relational Operations on Bags" Example (b), each tuple is processed independently during the projection. If R is the bag of "Relational Operations on Bags" Figure (b) and we calculate the bag-projection π_{A,B}(R), then we get the bag of "Relational Operations on Bags" Figure (a).

If the removal of one or more attributes during the projection causes the same tuple to be created from several different tuples, these duplicate tuples are not removed from the result of a bag-projection. Thus, the three tuples (1, 2, 5), (1, 2, 7), and (1, 2, 8) of the relation R from "Relational Operations on Bags" Figure (b) each give rise to the same tuple (1, 2) after projection onto attributes A and B. In the bag result, there are three occurrences of tuple (1, 2), while in the set-projection this tuple appears only once.
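The contrast between bag-projection and set-projection can be sketched in Python as follows. The helper names `bag_project` and `set_project` are hypothetical, and attributes are addressed by position (A = 0, B = 1, C = 2). The text names only three tuples of Figure (b); the C value of the fourth tuple, `(3, 4, 6)`, is an assumption made so that the projection yields the (3,4) tuple of Figure (a).

```python
from collections import Counter


def bag_project(rows, indices):
    """Project each row onto the given positions, keeping duplicates."""
    return Counter(tuple(row[i] for i in indices) for row in rows)


def set_project(rows, indices):
    """Project each row onto the given positions, eliminating duplicates."""
    return {tuple(row[i] for i in indices) for row in rows}


# The three tuples named in the text, plus one tuple whose C value is assumed.
R = [(1, 2, 5), (1, 2, 7), (1, 2, 8), (3, 4, 6)]

bag_result = bag_project(R, (0, 1))   # (1, 2) occurs three times, (3, 4) once
set_result = set_project(R, (0, 1))   # (1, 2) and (3, 4) occur once each
```

Because each row is processed on its own, the bag version never has to compare rows with one another, whereas the set version must detect and discard duplicates.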

