Default Values / Indexes

Default Values / Indexes

Default Values

When we create or change tuples, we sometimes do not have values for all components. For instance, we mentioned in "Simple Table Declarations" Example 2 that when we add a column to a relation schema, the existing tuples do not have a known value, and it was suggested that NULL could be used in place of a "real" value. Or, we suggested in "Database Modifications" Example 2 that we could insert new tuples into the Studio relation knowing only the studio name and not the address or president's certificate number. Again, it would be necessary to use some value that says "I don't know" in place of real values for the latter two attributes.

To address these problems, SQL provides the NULL value, which becomes the value of any component whose value is not specified, with the exception of certain situations where the NULL value is not permitted. However, there are times when we would prefer to use another choice of default value, the value that appears in a column if no other value is known.

Generally, any place we declare an attribute and its data type, we may add the keyword DEFAULT and an appropriate value. That value is either NULL or a constant. Certain other values that are provided by the system, such as the current time, may also be options.

Example 1 : Let us examine "Simple Table Declarations" Example 1. We might wish to use the character? as the default for an unknown gender, and we might also wish to use the earliest possible date. DATE '0000-00-00' for an unknown birthdate. We could replace lines (4) and (5) of "Simple Table Declarations" Figure 1 by:

4)     gender CHAR(1) DEFAULT '?',
5)     birthdate DATE DEFAULT DATE '0000-00-00'

As another example, we could have declared the default value for new attribute phone to be 'unlisted' when we added this attribute in "Simple Table Declarations" Example 2. The alteration statement would then look like:

ALTER TABLE MovieStar ADD phone CHAR(16)  DEFAULT 'unlisted' ;


An index on an attribute A of a relation is a data structure that makes it efficient to find those tuples that have a fixed value for attribute A. Indexes generally help with queries in which their attribute A is compared with a constant, for instance A = 3, or even A ≤ 3. The technology of implementing indexes on large relations is of central importance in the implementation of DBMS's.

When relations are very large, it becomes expensive to scan all the tuples of a relation to find those (perhaps very few) tuples that match a given condition. For instance, consider the first query we examined:

FROM Movie
WHERE studioName = 'Disney' AND year = 1990;

from "The Database Language SQL" Example 1. There might be 10,000 Movie tuples, of which only 200 were made in 1990.

The naive way to implement this query is to get all 10,000 tuples and test the condition of the WHERE clause on each. It would be much more efficient if we had some way of getting only the 200 tuples from the year 1990 and testing each of them to see if the studio was Disney. It would be even more efficient if we could obtain directly only the 10 or so tuples that satisfied both the conditions of the WHERE clause - that the studio be Disney and the year be 1990; see the discussion of "multiattribute indexes" below.

Although the creation of indexes is not part of any SQL standard up to and including SQL-99, most commercial systems have a way for the database designer to say that the system should create an index on a certain attribute for a certain relation. The following syntax is typical. Assume we want to have an index on attribute year for the relation Movie. Then we say:

CREATE INDEX YearIndex ON Movie(year) ;

The result will be that an index whose name is YearIndex will be created on attribute year of the relation Movie. Henceforth, SQL queries that specify a year may be executed by the SQL query processor in such a way that only those tuples of Movie with the specified year are ever examined: there is a resulting decrease in the time needed to answer the query.

Often, a DBMS allows us to build a single index on multiple attributes. This type of index takes values for several attributes and efficiently finds the tuples with the given values for these attributes.

Example 2 : Since title and year form a key for Movie, we might expect it to be common that values for both these attributes will be specified, or neither will. The following is a typical declaration of an index on these two attributes:

CREATE INDEX KeyIndex ON Movie(title, year) ;

Since (title, year) is a key, then when we are given a title and year, we know the index will find only one tuple, and that will be the desired tuple. On the contrary, if the query specifies both the title and year, but only YearIndex is available, then the best the system can do is retrieve all the movies of that year and check through them for the given title.

If, as is often the case, the key for the multiattribute index is really the concatenation of the attributes in some order, then we can even use this index to find all the tuples with a given value in the first of the attributes. Thus, part of the design of a multiattribute index is the choice of the order in which the attributes are listed. For example, if we were more likely to specify a title than a year for a movie, then we would prefer to order the attributes as above; if a year were more likely to be specified, then we would ask for an index on (year, title).

If we wish to delete the index, we simply use its name in a statement like: