The Database Language SQL

The Database Language SQL

The most frequently used relational DBMSs query and modify the database through a language called SQL (often pronounced "sequel"). SQL stands for "Structured Query Language". The portion of SQL that supports queries has capabilities very close to that of relational algebra, as extended in "Extended Operators of Relational Algebra". On the other hand, SQL also contains statements for modifying the database (e.g., inserting and deleting tuples from relations) and for declaring a database schema. Thus, SQL serves as both a data-manipulation language and as a data-definition language. SQL also standardizes many other database commands, covered in "Constraints and Triggers" and "System Aspects of SQL".

There are many different dialects of SQL. First, there are three major standards. There is ANSI (American National Standards Institute) SQL and an updated standard adopted in 1992, called SQL-92 or SQL2. The recent SQL-99 (previously referred to as SQL3) standard extends SQL2 with object-relational features and a number of other new capabilities. Then, there are versions of SQL produced by the principal DBMS vendors. These all contain the capabilities of the original ANSI standard. They also conform to a large extent to the more recent SQL2, although each has its variations and extensions beyond SQL2, containing some of the features in the SQL-99 standard.

In this and the next two sections we shall emphasize the use of SQL as a query language. This section focuses on the generic (or "ad-hoc") query interface for SQL. That is, we consider SQL as a stand-alone query language, where we sit at a terminal and ask queries about a database or request database modifications, such as insertion of new tuples into a relation. Query answers are displayed for us at our terminal.

The next section discusses constraints and triggers, as another way of exerting user control over the content of the database. "System Aspects of SQL" covers database related programming in conventional programming languages. Our discussion of SQL in this and the next two sections will conform to the SQL-99 standard, emphasizing features found in almost all commercial systems as well as the previous standards.

The purpose of this section and the following two sections is to provide the reader with a sense of what SQL is about, more at the level of a "tutorial" than a "manual". Therefore, we focus on the most frequently used features only. The references mention places where more of the details of the language and its dialects can be found.

Simple Queries in SQL

Perhaps the simplest form of query in SQL asks for those tuples of some one relation that satisfy a condition. Such a query is similar to a selection in relational algebra. This simple query, like almost all SQL queries, uses the three keywords, SELECT, FROM and WHERE that characterize SQL.

Example database schema repeated

Example 1 : In this and subsequent examples, we shall use the database schema explained in "Relational Algebra". To review, these relation schemas are the ones shown in Figure 1. We shall see in "Defining a Relation Schema in SQL" how to express schema information in SQL, but for the moment, suppose that each of the relations and domains (data types) mentioned in "Relational Algebra" apply to their SQL counterparts.

As our first query, let us ask about the relation

Movie(title, year, length, inColor, studioName, producerC#)

for all movies produced by Disney Studios in 1990. In SQL, we say

FROM Movie
WHERE StudioName = Disney AND year = 1990;

This query exhibits the characteristic select-from-where form of most SQL queries.

● The FROM clause gives the relation or relations to which the query refers. In our example, the query is about the relation Movie.

A Trick for Reading and Writing Queries

● The WHERE clause is a condition, much like a selection-condition in relational algebra. Tuples must satisfy the condition in order to match the query. Here, the condition is that the studioName attribute of the tuple has the value Disney and the year attribute of the tuple has the value 1990. All tuples meeting both stipulations satisfy the condition; other tuples do not.

The SELECT clause tells which attributes of the tuples matching the condition are produced as part of the answer. The * in this example indicates that the entire tuple is produced. The result of the query is the relation consisting of all tuples produced by this process.

One way to understand this query is to consider each tuple of the relation mentioned in the FROM clause. The condition in the WHERE clause is applied to the tuple. More specifically, any attributes mentioned in the WHERE clause are replaced by the value in the tuples component for that attribute. The condition is then evaluated, and if true, the components appearing in the SELECT clause are produced as one tuple of the answer. Therefore, the result of the query is the Movie tuples for those movies produced by Disney in 1990, for example,

Pretty Woman.

In detail, when the SQL query processor encounters the Movie tuple

(here, 999 is the imaginary certificate number for the producer of the movie), the value Disney is substituted for attribute studioName and value 1990 is substituted for attribute year in the condition of the WHERE clause, because these are the values for those attributes in the tuple in question. The WHERE clause thus becomes

WHERE Disney = Disney AND 1990 = 1990

Since this condition is evidently true, the tuple for Pretty Woman passes the test of the WHERE clause and the tuple becomes part of the result of the query.