Skip to main content

Relations, Tuples, Attributes, Domains

What This Concept Is

A relation is a set of tuples drawn from fixed domains. Informally: a table. Formally: if D_1, D_2, ..., D_n are sets (the domains), a relation R over those domains is any subset of D_1 x D_2 x ... x D_n. Each member of R is a tuple. Each position has an attribute name, and the list of (attribute, domain) pairs is the schema of R.

Four consequences follow from "a relation is a set":

  • no two tuples in a relation are identical
  • tuples have no order
  • attributes are identified by name, not position (once a schema is fixed)
  • every attribute value is atomic with respect to the model: the database does not peek inside it

When you write CREATE TABLE employee(id INT, name TEXT, salary NUMERIC), you are declaring a schema: attributes id, name, salary drawn from domains INT, TEXT, NUMERIC. Rows you insert are tuples. The value of employee at any moment is the set of all its rows.

Why It Matters Here

Every idea in the rest of this module relies on this definition:

  • keys, functional dependencies, and normal forms are defined over schemas
  • SQL's duplicate-handling rules (SELECT DISTINCT, set operators) exist because real tables are usually multisets, and SQL has to reconcile that with the set-theoretic model
  • relational algebra is closed: every operator takes relations and returns relations, which is why queries compose
  • NULLs are the model's single biggest deviation from clean set theory, and most SQL surprises trace back to them

If you skip this definition, you will mistake SQL's behavior for the model's behavior.

Concrete Example

Suppose Employee has schema (id: INT, name: TEXT, dept: TEXT, salary: NUMERIC) with current value:

 id | name    | dept  | salary
----+---------+-------+--------
1 | Alice | Eng | 120000
2 | Bob | Eng | 95000
3 | Carol | Sales | 110000
  • schema: {id, name, dept, salary}
  • tuple: (1, 'Alice', 'Eng', 120000)
  • attribute dept is drawn from domain TEXT
  • cardinality (number of tuples) is 3; arity (number of attributes) is 4
  • attempting to insert (1, 'Alice', 'Eng', 120000) a second time is a violation: the relation is a set

In PostgreSQL, real tables are multisets (duplicates allowed unless a UNIQUE or PRIMARY KEY constraint forbids them). SQL exposes this gap: SELECT dept FROM employee returns three rows including 'Eng' twice; the relation of departments is two rows; SELECT DISTINCT dept recovers the set.

Common Confusion / Misconception

"A row is a tuple is an array." No. A tuple is an unordered mapping from attribute names to values. Position is only a convenience for printing. Swap the order of columns in CREATE TABLE and the relation is identical.

"SQL tables are relations." Almost. SQL tables are bags (multisets), and SQL queries default to bag semantics. The relational model is set-based, and SQL deviates because real systems need to count duplicates. This is why UNION removes duplicates but UNION ALL does not: UNION gives you the set, UNION ALL gives you the bag.

"A NULL is just another value." No. NULL is the model's "unknown" marker. Any arithmetic or comparison with NULL yields NULL (three-valued logic: true, false, unknown). WHERE x = NULL is never true; you must write WHERE x IS NULL.

How To Use It

Before you write any SQL:

  1. Write the schema as Rel(attr_1: domain_1, attr_2: domain_2, ...).
  2. State in one sentence what one tuple represents in the real world ("one row of employee is one active employment relationship at this company").
  3. Name which attributes must be present (no NULLs) and which domains are restrictive (e.g., salary > 0).
  4. If you cannot produce a one-sentence tuple interpretation, your schema is not finished yet.

Check Yourself

  1. Why can two rows in a relation never be identical even though SQL tables sometimes hold duplicates?
  2. If R(a, b) has 4 tuples and S(c, d) has 5 tuples, what is the cardinality of R x S?
  3. In a three-valued logic table, what does NULL AND FALSE evaluate to? What about NULL OR TRUE?

Mini Drill or Application

Given this English description, write the schema formally and state what one tuple represents:

"Our system tracks, for every student, every course they have enrolled in this semester, along with the grade once the course is finished."

  • Propose at least two schema shapes (one denormalized, one normalized).
  • For each, write one sentence describing what a tuple means.
  • Identify which attributes should forbid NULL, and which are allowed to be NULL and why.

Read This Only If Stuck