Relations, Tuples, Attributes, Domains
What This Concept Is
A relation is a set of tuples drawn from fixed domains. Informally: a table. Formally: if D_1, D_2, ..., D_n are sets (the domains), a relation R over those domains is any subset of D_1 x D_2 x ... x D_n. Each member of R is a tuple. Each position has an attribute name, and the list of (attribute, domain) pairs is the schema of R.
Four consequences follow from "a relation is a set":
- no two tuples in a relation are identical
- tuples have no order
- attributes are identified by name, not position (once a schema is fixed)
- every attribute value is atomic with respect to the model: the database does not peek inside it
When you write CREATE TABLE employee(id INT, name TEXT, salary NUMERIC), you are declaring a schema: attributes id, name, salary drawn from domains INT, TEXT, NUMERIC. Rows you insert are tuples. The value of employee at any moment is the set of all its rows.
Why It Matters Here
Every idea in the rest of this module relies on this definition:
- keys, functional dependencies, and normal forms are defined over schemas
- SQL's duplicate-handling rules (
SELECT DISTINCT, set operators) exist because real tables are usually multisets, and SQL has to reconcile that with the set-theoretic model - relational algebra is closed: every operator takes relations and returns relations, which is why queries compose
- NULLs are the model's single biggest deviation from clean set theory, and most SQL surprises trace back to them
If you skip this definition, you will mistake SQL's behavior for the model's behavior.
Concrete Example
Suppose Employee has schema (id: INT, name: TEXT, dept: TEXT, salary: NUMERIC) with current value:
id | name | dept | salary
----+---------+-------+--------
1 | Alice | Eng | 120000
2 | Bob | Eng | 95000
3 | Carol | Sales | 110000
- schema:
{id, name, dept, salary} - tuple:
(1, 'Alice', 'Eng', 120000) - attribute
deptis drawn from domainTEXT - cardinality (number of tuples) is 3; arity (number of attributes) is 4
- attempting to insert
(1, 'Alice', 'Eng', 120000)a second time is a violation: the relation is a set
In PostgreSQL, real tables are multisets (duplicates allowed unless a UNIQUE or PRIMARY KEY constraint forbids them). SQL exposes this gap: SELECT dept FROM employee returns three rows including 'Eng' twice; the relation of departments is two rows; SELECT DISTINCT dept recovers the set.
Common Confusion / Misconception
"A row is a tuple is an array." No. A tuple is an unordered mapping from attribute names to values. Position is only a convenience for printing. Swap the order of columns in CREATE TABLE and the relation is identical.
"SQL tables are relations." Almost. SQL tables are bags (multisets), and SQL queries default to bag semantics. The relational model is set-based, and SQL deviates because real systems need to count duplicates. This is why UNION removes duplicates but UNION ALL does not: UNION gives you the set, UNION ALL gives you the bag.
"A NULL is just another value." No. NULL is the model's "unknown" marker. Any arithmetic or comparison with NULL yields NULL (three-valued logic: true, false, unknown). WHERE x = NULL is never true; you must write WHERE x IS NULL.
How To Use It
Before you write any SQL:
- Write the schema as
Rel(attr_1: domain_1, attr_2: domain_2, ...). - State in one sentence what one tuple represents in the real world ("one row of
employeeis one active employment relationship at this company"). - Name which attributes must be present (no NULLs) and which domains are restrictive (e.g.,
salary > 0). - If you cannot produce a one-sentence tuple interpretation, your schema is not finished yet.
Check Yourself
- Why can two rows in a relation never be identical even though SQL tables sometimes hold duplicates?
- If
R(a, b)has 4 tuples andS(c, d)has 5 tuples, what is the cardinality ofR x S? - In a three-valued logic table, what does
NULL AND FALSEevaluate to? What aboutNULL OR TRUE?
Mini Drill or Application
Given this English description, write the schema formally and state what one tuple represents:
"Our system tracks, for every student, every course they have enrolled in this semester, along with the grade once the course is finished."
- Propose at least two schema shapes (one denormalized, one normalized).
- For each, write one sentence describing what a tuple means.
- Identify which attributes should forbid NULL, and which are allowed to be NULL and why.