*Relational Operations on Bags*

While a set of tuples (i.e., a relation) is a simple, natural model of data as it might appear in a database, commercial database systems rarely, if ever, are based entirely on sets. In some situations, relations as they appear in database systems are allowed to have duplicate tuples. Remember that if a "set" is permitted to have multiple occurrences of a member, then that set is called a bag or multiset. In this section, we shall examine relations that are bags rather than sets; that is, we shall allow the same tuple to appear more than once in a relation. When we refer to a "set", we mean a relation without duplicate tuples; a "bag" means a relation that may (or may not) have duplicate tuples.**Example (a) :** The relation in Figure (a) is a bag of tuples. In it, the tuple (1,2) appears three times and the tuple (3,4) appears once. If Figure (a) were a set-valued relation, we would have to eliminate two occurrences of the tuple (1,2). In a bag-valued relation, we do allow multiple occurrences of the same tuple, but like sets, the order of tuples does not matter.

### Why Bags?

When we think about implementing relations efficiently, we can see a number of ways that allowing relations to be bags rather than sets can speed up operations on relations. We mentioned at the beginning of "An Algebra of Relational Operations" how allowing the result to be a bag could speed up the union of two relations. For another example, when we do a projection, allowing the resulting relation to be a bag (even when the original relation is a set) lets us work with each tuple separately. If we want a set as the result, we need to compare each projected tuple with all the other projected tuples, to make sure that each projection appears only once. Nevertheless, if we can accept a bag as the result, then we simply project each tuple and add it to the result; no comparison with other projected tuples is required.**Example (b) :**The bag of Figure (a) could be the result of projecting the relation shown in Figure (b) onto attributes A and B, on the condition we allow the result to be a bag and do not eliminate the duplicate occurrences of (1,2).

Had we used the ordinary projection operator of relational algebra, and therefore eliminated duplicates, the result would be only:

Note that the bag result, although larger, can be calculated more quickly, since there is no need to compare each tuple (1,2) or (3,4) with previously generated tuples.

In addition, if we are projecting a relation in order to take an aggregate (discussed in "Extended Operators of Relational Algebra"), such as "Find the average value of A in Figure (b)", we could not use the set model to think of the relation projected onto attribute A. As a set, the average value of A is 2, because there are only two values of A - 1 and 3 - in Figure (b), and their average is 2. On the other hand, if we treat the A-column in Figure (b) as a bag {1,3,1,1}, we get the correct average of A, which is 1.5, among the four tuples of Figure (b).

### Tags

- tuples
- database
- multiset
- attributes
- relational algebra
- Scrolling Cursors
- Protecting Against Concurrent Updates
- Modifications by Cursor
- Cursors
- Instead-Of Triggers
- Triggers in SQL
- Schema-Level Constraints and Triggers
- Tuple-Based CHECK Constraints
- Constraints on Attributes and Tuples
- Declaring Foreign-Key Constraints
- Keys Declared With UNIQUE
- Constraints and Triggers
- Interpreting Queries Involving Views
- Modifying Views
- View Definitions
- Introduction to Selection of Indexes
- Default Values / Indexes
- Simple Table Declarations
- Defining a Relation Schema in SQL
- Deletion / Updates
- Database Modifications
- Grouping / HAVING Clauses
- Full-Relation Operations
- Natural Joins / Outerjoins
- Subqueries in FROM Clauses
- Conditions Involving Tuples
- Subqueries
- Union, Intersection, and Difference of Queries
- Interpreting Multirelation Queries
- Tuple Variables
- Disambiguating Attributes
- Queries Involving More Than One Relation
- Null Values and Comparisons Involving NULL
- Selection in SQL
- Projection in SQL
- The Database Language SQL
- Additional Constraint Examples
- Referential Integrity Constraints
- Constraints on Relations
- Extending the Projection Operator
- Grouping
- Extended Operators of Relational Algebra
- Selection on Bags / Product of Bags / Joins of Bags
- Union, Intersection, and Difference of Bags
- A Linear Notation for Algebraic Expressions
- Dependent and Independent Operations
- Renaming
- Combining Operations to Form Queries
- Selection / Cartesian Product
- Set Operations on Relations
- An Algebra of Relational Operations
- Relational Algebra
- Attribute Lists
- Information Integration Via Semistructured Data
- Semistructured Data Representation
- Object-Oriented Versus Object-Relational
- Nested Relations
- The Object-Relational Model
- What If There Is No Key
- Representing ODL Relationships
- Representing Other Type Constructors
- Representing Set-Valued Attributes
- Nonatomic Attributes in Classes
- Declaring Keys in ODL
- Extents
- Subclasses in ODL / Multiple Inheritance in ODL
- Types in ODL
- Methods in ODL
- Multiplicity of Relationships
- Relationships in ODL / Inverse Relationships
- Attributes in ODL
- Introduction to ODL
- The Type System
- Review of Object-Oriented Concepts
- Decomposition into Fourth Normal Form
- Reasoning About Multivalued Dependencies
- Definition of Multivalued Dependencies
- Multivalued Dependencies
- Third Normal Form
- Boyce-Codd Normal Form
- Decomposing Relations
- Projecting Functional Dependencies
- Closing Sets of Functional Dependencies
- The Transitive Rule
- Why the Closure Algorithm Works
- Computing the Closure of Attributes
- Trivial Functional Dependencies
- The Splitting/Combining Rule
- Rules About Functional Dependencies
- Keys of Relations
- Using Null Values to Combine Relations - Comparison of Approaches
- An Object-Oriented Approach
- Converting Subclass Structures to Relations
- Handling Weak Entity Sets
- Combining Relations
- From E/R Relationships to Relations
- From Entity Sets to Relations
- From E/R Diagrams to Relational Designs
- Relation Instances
- Equivalent Representations of a Relation
- Tuples / Domains
- Attributes / Schemas
- The Relational Data Model
- Summary of The Entity-Relationship Data Model
- Weak Entity Set Notation
- Requirements for Weak Entity Sets
- Representing Keys in the E/R Model
- Keys in the E/R Model
- The Modeling of Constraints
- Picking the Right Kind of Element
- Choosing the Right Relationships
- Design Principles
- Subclasses in the E/R Model
- Converting Multiway Relationships to Binary
- Attributes on Relationships
- Multiway Relationships
- Elements of the E/R Model
- Information Integration Overview
- Database System Implementation
- Database Programming
- Database Design
- The Query Processor
- Storage and Buffer Management
- Data-Definition Language Commands
- Multimedia Data
- Smaller and Smaller Systems
- Relational Database Systems
- Corporate Records
- Early Database Management Systems
- The Evolution of Database Systems
- The Worlds of Database Systems