Extended Operators of Relational Algebra

"An Algebra of Relational Operations" presented the classical relational algebra, and "Relational Operations on Bags" introduced the modifications required to treat relations as bags of tuples rather than sets. The ideas of these two sections form the basis of most modern query languages. However, languages such as SQL have several other operations that have proved quite important in applications. Thus, a full treatment of relational operations must include a number of other operators, which we introduce in this section. The additions are:

1. The duplicate-elimination operator δ turns a bag into a set by eliminating all but one copy of each tuple.  

2. Aggregation operators, such as sums or averages, are not operations of relational algebra, but are used by the grouping operator. Aggregation operators apply to attributes (columns) of a relation. For example, the sum of a column produces the one number that is the sum of all the values in that column.

3. Grouping of tuples according to their value in one or more attributes has the effect of partitioning the tuples of a relation into "groups". Aggregation can then be applied to columns within each group, giving us the ability to express a number of queries that are impossible to express in the classical relational algebra. The grouping operator γ is an operator that combines the effect of grouping and aggregation.

4. The sorting operator τ turns a relation into a list of tuples, sorted according to one or more attributes. This operator should be used sensibly, because other relational-algebra operators apply to sets or bags, but never to lists. Thus, τ only makes sense as the final step of a series of operations.

5. Extended projection gives additional power to the operator π. In addition to projecting out some columns, in its generalized form π can perform computations involving the columns of its argument relation to produce new columns.

6. The outerjoin operator is a variant of the join that avoids losing dangling tuples. In the result of the outerjoin, dangling tuples are "padded" with the null value, so the dangling tuples can be represented in the output.
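As a rough illustration of items 3 and 4, the following Python sketch models a bag as a list of tuples, partitions it by one attribute, sums another attribute within each group, and sorts only as the last step. The function name `group_and_sum`, the positional indexing of attributes, and the sample data are assumptions made for this illustration; they are not notation from the text.

```python
from collections import defaultdict

def group_and_sum(bag, group_idx, agg_idx):
    """Partition tuples by the value at group_idx, then SUM the value at agg_idx per group."""
    groups = defaultdict(list)
    for t in bag:
        groups[t[group_idx]].append(t[agg_idx])
    # One output tuple per group: (grouping value, sum over that group)
    return [(key, sum(values)) for key, values in groups.items()]

# Hypothetical bag with attributes (A, B)
R = [(1, 2), (3, 4), (1, 2), (1, 2)]

grouped = group_and_sum(R, group_idx=0, agg_idx=1)

# Sorting yields a list rather than a set or bag, so it is applied last
result = sorted(grouped)
print(result)  # [(1, 6), (3, 4)]
```

Grouping and aggregation happen together in one function here, which mirrors how γ combines the two effects in a single operator.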

Duplicate Elimination

Sometimes, we need an operator that converts a bag to a set. For that purpose, we use δ(R) to return the set consisting of one copy of every tuple that appears one or more times in relation R.

Example 1: If R is the relation

A  B
1  2
3  4
1  2
1  2

from "Relational Operations on Bags" Figure (a), then δ(R) is

A  B
1  2
3  4

Note that the tuple (1,2), which appeared three times in R, appears only once in δ(R).
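The effect of δ can be sketched in a few lines of Python, with a bag modeled as a list of tuples. The function name `delta` and the sample bag are assumptions for this illustration; the bag was chosen so that (1,2) appears three times, as in the note above.

```python
def delta(bag):
    """Return one copy of every tuple that appears one or more times in the bag."""
    seen = set()
    result = []
    for t in bag:
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result

# Hypothetical bag in which (1, 2) appears three times
R = [(1, 2), (3, 4), (1, 2), (1, 2)]
print(delta(R))  # [(1, 2), (3, 4)]
```

The sketch keeps tuples in first-occurrence order only for readable output. A set has no inherent order, so `set(R)` would represent the same result.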

Aggregation Operators

There are various operators that apply to sets or bags of atomic values. These operators are used to summarize or "aggregate" the values in one column of a relation, and thus are referred to as aggregation operators. The standard operators of this type are:

1. SUM produces the sum of a column with numerical values.

2. AVG produces the average of a column with numerical values.

3. MIN and MAX, applied to a column with numerical values, produce the smallest or largest value, respectively. When applied to a column with character-string values, they produce the lexicographically (alphabetically) first or last value, respectively.

4. COUNT produces the number of (not necessarily distinct) values in a column. Equivalently, COUNT applied to any attribute of a relation produces the number of tuples of that relation, including duplicates.

Example 2: Consider the relation

A  B
1  2
3  4
1  2
1  2

Some examples of aggregations on the attributes of this relation are:

1. SUM(B) = 2 + 4 + 2 + 2 = 10.

2. AVG(A) = (1 + 3 + 1 + 1)/4 = 1.5.

3. MIN(A) = 1.

4. MAX(B) = 4.

5. COUNT(A) = 4.
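The five results above can be checked with a short Python sketch that treats each column as a list of values, duplicates included. The variable names are assumptions for this illustration, and the string example at the end is hypothetical data, not part of the relation.

```python
# The relation of Example 2 as a bag of (A, B) tuples
R = [(1, 2), (3, 4), (1, 2), (1, 2)]

# Extract each column, keeping duplicates, since aggregation works on bags
A = [t[0] for t in R]
B = [t[1] for t in R]

sum_b = sum(B)            # SUM(B)
avg_a = sum(A) / len(A)   # AVG(A)
min_a = min(A)            # MIN(A)
max_b = max(B)            # MAX(B)
count_a = len(A)          # COUNT(A): number of tuples, duplicates included

print(sum_b, avg_a, min_a, max_b, count_a)  # 10 1.5 1 4 4

# MIN and MAX on strings: Python compares by Unicode code point, which
# matches alphabetical order for these same-case ASCII strings
names = ["banana", "apple", "cherry"]  # hypothetical column of strings
print(min(names), max(names))  # apple cherry
```

COUNT is computed on the full column rather than on its distinct values, so the three copies of (1,2) each contribute to the total.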