Comparison of Strings

Comparison of Strings

Two strings are equal if they are the same sequence of characters. SQL allows declarations of different types of strings, for instance fixed-length arrays of characters and variable-length lists of characters. If so, we can expect reasonable

Representing Bit Strings

coercions among string types. For instance, a string like foo might be stored as a fixed-length string of length 10, with 7 "pad" characters, or it could be stored as a variable-length string. We would expect values of both types to be equal to each other and also equal to the constant string 'foo'.

When we compare strings by one of the less than operators, such as < or >=, we are asking whether one precedes the other in lexicographic order (i.e., in dictionary order, or alphabetically). That is, if a1a2 . . . an and b1b2 . . . bm are two strings, then the first is "less than" the second if either a1 < b1, or if a1 = b1 and a2 < b2 ,or if a1 = b1, a2 = b2, and a3 < b3, and so on. We also say a1a2.. .an < b1b2. . . bm if n < m and a1a2. . . an, = b1b2. . . bn; that is, the first string is a proper prefix of the second. For example, 'fodder'  < 'foo' , because the first two characters of each string are the same, fo, and the third character of fodder precedes the third character of foo. Also, 'bar' < 'bargain' because the former is a proper prefix of the latter. As with equality, we may expect reasonable coercion among different string types.

SQL also provides the capability to compare strings on the basis of a simple pattern match. An alternative form of comparison  expression is

s LIKE p

where s is a string and p is a pattern, that is, a string with the optional use of the two special characters % and - Ordinary  characters in p match only themselves in s. But % in p can match any sequence of 0 or more characters in s. and - in p matches any one character in s. The value of this expression is true if and only if string s matches pattern p. Likewise, s NOT LIKE p is true if and only if string s does not match pattern p.

Example 1: We remember a movie "Star something" and we remember that the something has four letters. What could this movie be? We call retrieve all such names with the query:

SELECT title
FROM Movie
WHERE title LIKE  'Star_ _ _ _' ;

This query asks if the title attribute of a movie has a value that is nine characters long, the first five characters being Star and a blank. The last four characters may be anything, since any sequence of four characters matches the four - symbols. The result of the query is the set of complete matching titles, such as Star Wars and Star Trek.

Example 2 : Let us search for all movies with a possessive ('s) in their titles. The desired query is

SELECT title
FROM Movie
WHERE title LIKE '%' 's%' ;

To understand this pattern, we must first observe that the apostrophe, being the character that surrounds strings in SQL, cannot also represent itself. The convention taken by SQL is that two consecutive apostrophes in a string represent a single apostrophe and do not end the string. Thus, ' 's in a pattern is matched by a single apostrophe followed by an s.

The two % characters on either side of the 's match any strings whatsoever. Thus, any title with s as a substring will match the  pattern, and the answer to this query will include films such as Logan's Run or Alice's Restaurant.