An Algebra of Relational Operations

An Algebra of Relational Operations

In order to start our study of operations on relations, we shall learn about a special algebra, called "relational algebra", that comprises some simple but powerful ways to construct new relations from given relations. When the given relations are stored data, then the constructed relations can be answers to queries about this data.

The development of an algebra for relations has a history, which we shall follow roughly in our presentation. In the beginning, relational algebra was proposed by T. Codd as an algebra on sets of tuples (i.e., relations) that could be used to express usual queries about those relations. It consisted of five operations on sets:  union, set difference, and Cartesian product, with which you might already be familiar, and two unusual operations - selection and projection. To these, several operations that can be described in terms of these were added: varieties of "join" are the most important.

When DBMS's that used the relational model were first developed, their query languages largely implemented the relational algebra. Though, for efficiency purposes, these systems regarded relations as bags, not sets. That is, unless the user asked unambiguously that duplicate tuples be condensed into one (i.e., that "duplicates be eliminated"), relations were allowed to include duplicates. Therefore, in "Relational Operations on Bags", we shall study the same relational operations on bags and see the changes necessary.

Another change to the algebra that was necessitated by commercial implementations of the relational model is that many other operations are required. Most important is a way of performing aggregation, e.g., finding the average value of some column of a relation. We shall study these additional operations in "Extended Operators of Relational Algebra".

Basics of Relational Algebra

An algebra, generally, comprises operators and atomic operands. For example, in the algebra of arithmetic, the atomic operands are variables like x and constants like 15. The operators are the usual arithmetic ones: addition. subtraction, multiplication, and division. Any algebra allows us to build expressions by applying operators to atomic operands and/or other expressions of the algebra. Generally, parentheses are required to group operators and their operands. For example, in arithmetic we have expressions such as (x + y) * z  or  ((x+7) / (y-3)) + z .

Relational algebra is another instance of an algebra. Its atomic operands are:

1. Variables that stand for relations.
2. Constants, which are finite relations.
As we mentioned, in the classical relational algebra, all operands and the results of expressions are sets. The operations of the traditional relational algebra fall into four broad classes:

a) The usual set operations - union, intersection, and difference - applied to relations.

b) Operations that remove parts of a relation: "selection" removes some rows (tuples), and "projection" removes some columns.

c) Operations that combine the tuples of two relations, including "Cartesian product", which pairs the tuples of two relations in all possible ways, and many kinds of "join" operations, which selectively pair tuples from two relations.

d) An operation called "renaming" that does not affect the tuples of a relation, but changes the relation schema, i.e., the names of the attributes and/or the name of the relation itself.

We shall usually refer to expressions of relational algebra as queries. While we don't yet have the symbols required to show many of the expressions of relational algebra, you should be familiar with the operations of group (a), and thus recognize (R U S) as an example of an expression of relational algebra. R and S are atomic operands standing for relations, whose sets of tuples are unknown. This query asks for the union of whatever tuples are in the relations named R and S.