System Aspects of SQL

System Aspects of SQL

We now consider the question of how SQL fits into a complete programming environment. In "SQL in a Programming Environment" we see how to embed SQL in programs that are written in an ordinary programming language, such as C. A critical issue is how we move data between SQL relations and the variables of the surrounding, or "host," language.

"Procedures Stored in the Schema" considers another way to combine SQL with general-purpose programming: persistent stored modules, which are pieces of code stored as part of a database schema and executable on command from the user. "The SQL Environment" covers additional system issues; such as support for a client-server model of computing.

A third programming approach is a "call-level interface," where we program in some conventional language and use a library of functions to access the database. In "Using a Call-Level Interface" we discuss the SQL-standard library called SQL/CLI, for making calls from C programs. Then, in "Java Database Connectivity" we meet Java's JDBC (database connectivity), which is an alternative call-level interface.

Then, "Transactions in SQL" introduces us to the "transaction," an atomic unit of work. Many database applications, such as banking, require that operations on the data appear atomic, or indivisible, even though a large number of concurrent operations may be in progress at once. SQL provides features to allow us to specify transactions, and SQL systems have mechanisms to make sure that what we call a transaction is really executed atomically. Finally, "Security and User Authorization in SQL" discusses how SQL controls unauthorized access to data, and how we can tell the SQL system what accesses are authorized.

SQL in a Programming Environment

To this point, we have used the generic SQL interface in our examples. That is, we have assumed there is an SQL interpreter, which accepts and executes the sorts of SQL queries and commands that we have learned. Although provided as an option by almost all DBMS's, this mode of operation is actually rare.

The Languages of the SQL Standard

In fact, most SQL statements are part of some larger piece of software. A more realistic view is that there is a program in some conventional host language such as C, but some of the steps in this program are actually SQL statements. In this section we shall describe one way SQL can be made to operate within a conventional program.

A sketch of a typical programming system that involves SQL statements is in Figure 1. There, we see the programmer writing programs in a host language, but with some special "embedded" SQL statements that are not part of the host language. The entire program is sent to a preprocessor, which changes the embedded SQL statements into something that makes sense in the host language. The representation of the SQL could be as simple as a call to a function that takes the SQL statement as a character-string argument and executes that SQL statement.

The preprocessed host-language program is then compiled in the usual manner. The DBMS vendor usually provides a library that supplies the necessary function definitions. Thus, the functions that implement SQL can be executed, and the whole program behaves as one unit. We also show in Figure 1 the possibility that the programmer writes code directly in the host language, using these function calls as required. This approach often referred to as a call-level interface or CLI, will be discussed in "Using a Call-Level Interface".

The Impedance Mismatch Problem

The basic problem of connecting SQL statements with those of a conventional programming language is impedance mismatch, the fact that the data model of SQL differs so much from the models of other languages. As we know, SQL uses the relational data model at its core. However, C and other common programming languages use a data model with integers, reals, arithmetic, characters, pointers, record structures, arrays, and so on. Sets are not represented directly in C or these other languages, while SQL does not use pointers, loops and branches, or many other common programming-language constructs. As a result, jumping or passing data between SQL and other languages is not straightforward, and a mechanism must be devised to allow the development of programs that use both SQL and another language.

Processing programs with SQL statements embedded

One might first suppose that it is preferable to use a single language; either do all computation in SQL or forget SQL and do all computation in a conventional language. However, we can quickly dispense with the idea of omitting SQL when there are database operations involved. SQL systems greatly aid the programmer in writing database operations that can be executed efficiently, yet that can be expressed at a very high level. SQL takes from the programmer's shoulders the need to understand how data is organized in storage or how to exploit that storage structure to operate efficiently on the database.

However, there are many important things that SQL cannot do at all. For instance, one cannot write an SQL query to compute the factorial of a number n [n! = n x (n - 1) x . . . x2x1], something that is an easy exercise in C or similar languages. As another example, SQL cannot format its output directly into a convenient form such as a graphic. Thus, real database programming requires both SQL and a conventional language; the latter is often referred to as the host language.

The SQL/Host Language Interface

The transfer of information between the database, which is accessed only by SQL statements, and the host-language program is through variables of the host language that can be read or written by SQL statements. All such shared variables are prefixed by a colon when they are referred to within an SQL statement, but they appear without the colon in host-language statements.

When we wish to use an SQL statement within a host-language program, we warn that SQL code is coming with the keywords EXEC SQL in front of the statement. A typical system will preprocess those statements and replace them by suitable function calls in the host language, making use of an SQL-related library of functions.

A special variable, called SQLSTATE in the SQL standard, serves to connect the host-language program with the SQL execution system. The type of SQLSTATE is an array of five characters.  Each time a function of the SQL library is called, a code is put in the variable SQLSTATE that indicates any problems found during that call. The SQL standard also specifies a large number of five-character codes and their meanings.

For instance, '00000' (five zeroes) indicates that no error condition occurred, and '02000' indicates that a tuple requested as part of the answer to an SQL query could not be found. We shall see that the latter code is very important, since it allows us to create a loop in the host-language program that examines tuples from some relation one-at-a-time and to break the loop after the last tuple has been examined. The value of SQLSTATE can be read by the host-language program and a decision made on the basis of the value found there.

The DECLARE Section

To declare shared variables, we place their declarations between two embedded SQL statements:

      . . .   

What appears between them is called the declare section. The form of variable declarations in the declare section is whatever the host language requires. Moreover, it only makes sense to declare variables to have types that both the host language and SQL can deal with, such as integers, reals, and character strings or arrays.

Example 1 : The following statements might appear in a C function that updates the Studio relation:

     char studioName [50], studioAddr [256];
     char SQLSTATE[6] ;

The first and last statements are the required beginning and end of the declare section. In the middle is a statement declaring two variables studioName and studioAddr. These are both character arrays and, as we shall see, they can be used to hold a name and address of a studio that are made into a tuple and inserted into the Studio relation. The third statement declares SQLSTATE to be a six-character array.