Subqueries

Subqueries

In SQL, one query can be used in many ways to help in the evaluation of another. A query that is part of another is called a subquery. Subqueries can have subqueries, and so on, down as many levels as we desire. We already saw one example of the use of subqueries; in "Union, Intersection, and Difference of Queries" we built a union, intersection, or difference query by connecting two subqueries to form the whole query. There are a number of other ways that subqueries can be used:

1. Subqueries can return a single constant, and this constant can be compared with another value in a WHERE clause.

2. Subqueries can return relations that can be used in many ways in WHERE clauses.

3. Subqueries can have their relations appear in FROM clauses, just like any stored relation can.

Subqueries that Produce Scalar Values

An atomic value that can appear as one component of a tuple is referred to as a scalar. A select-from-where expression can produce a relation with any number of attributes in its schema, and there can be any number of tuples in the relation. On the other hand, often we are only interested in values of a single attribute. Moreover, sometimes we can deduce from information about keys, or from other information, that there will be only a single value produced for that attribute.

If so, we can use this select-from-where expression, surrounded by parentheses, as if it were a constant. Particularly, it may appear in a WHERE clause any place we would expect to find a constant or an attribute representing a component of a tuple. For instance, we may compare the result of such a subquery to a constant or attribute.

Example 1 : Let us recall "Queries Involving More Than One Relation" Example 1, where we asked for the producer of Star Wars. We had to query the two relations



because only the former has movie title information and only the latter has producer names. The information is linked by "certificate numbers". These numbers uniquely identify producers. The query we developed is:



There is another way to look at this query. We need the Movie relation only to get the certificate number for the producer of Star Wars. Once we have it, we can query the relation MovieExec to find the name of the person with this certificate. The first problem, getting the certificate number, can be written as a subquery, and the result, which we expect will be a single value, can be used in the "main" query to achieve the same effect as the query above. This query is shown in Figure 1.

Finding the producer of Star Wars by using a nested subquery

Lines (4) through (6) of Figure 1 are the subquery. Looking only at this simple query by itself, we see that the result will be a unary relation with attribute producerC#, and we expect to find only one tuple in this relation. The tuple will look like (12345), that is, a single component with some integer, perhaps 12345 or whatever George Lucas certificate number is. If zero tuples or more than one tuple is produced by the subquery of lines (4) through (6). It is a run-time error.

Having executed this subquery, we can then execute lines (1) through (3) of Figure 1, as if the value 12345 replaced the entire  subquery. That is, the "main" query is executed as if it were



The result of this query should be George Lucas.

Conditions Involving Relations

There are a number of SQL operators that we can apply to a relation R and produce a boolean result. Usually, the relation R will be the result of a select-from-where subquery. Some of these operators - IN, ALL, and ANY - will be explained first in their simple form where a scalar value s is involved. In this situation, the relation R is required to be a one-column relation. Here are the definitions of the operators:

1. EXISTS R is a condition that is true if and only if R is not empty.

2. s IN R is true if and only if s is equal to one of the values in R. Similarly, s NOT IN R is true if and only if s is equal to no value in R. Here, we assume R is a unary relation. We shall discuss extensions to the IN and NOT IN operators where R has more than one attribute in its schema and s is a tuple in "Conditions Involving Tuples".

3. s > ALL R is true if and only if s is greater than every value in unary relation R. Likewise, the > operator could be replaced by any of the other five comparison operators, with the analogous meaning: s stands in the stated relationship to every tuple in R. For instance, s <> ALL R is the same as s NOT IN R.

4. s > ANY R is true if and only if s is greater than at least one value in unary relation R. Likewise, any of the other five comparisons could be used in place of >, with the meaning that s stands in the stated relationship to at least one tuple of R. For instance, s = ANY R is the same as s IN R.

The EXISTS, ALL, arid ANY operators can be negated by putting NOT in front of the entire expression, just like any other boolean-valued expression. Therefore, NOT EXISTS R is true if and only if R is empty. NOT s > ALL R is true if and only if s is not the maximum value in R, and NOT s > ANY R is true if and only if s is the minimum value in R. We shall see many examples of the use of these operators shortly.


Tags