Skip to main content

Loop Invariants Prove What the Loop Maintains

What This Concept Is

A loop invariant is a predicate that is true before the loop runs, stays true after every iteration, and, together with the exit condition, implies the loop's postcondition. It is the iterative analog of induction.

A loop-invariant proof has three parts (CLRS §2.1):

  1. Initialization. The invariant holds just before the first iteration.
  2. Maintenance. If the invariant holds before an iteration, it still holds after that iteration.
  3. Termination. When the loop exits, the invariant plus the exit condition give exactly the postcondition.

Termination as used here is the consequence step; the separate question "does the loop stop?" is the decreasing-measure argument of concept 7. Together, invariant + variant = a full correctness proof.

An invariant is chosen, not discovered. It is a stronger statement than "the result I want," specifically crafted to be true in the middle of the loop where the work is only partly done. Writing the right invariant is the hardest and most useful step.

Why It Matters Here

Iterative algorithms are easy to test and hard to trust. A loop invariant turns an informal "I think this works because..." into a verified claim. Insertion sort, binary search, selection sort, Horner's method, partition, two-pointer scans, sliding windows, and every useful loop in this module can be analyzed this way.

Testing checks a finite set of inputs. A loop invariant proves the whole family. When you later maintain a mission-critical piece of code, the invariant is the documented contract that lets someone else modify it without breaking it; in S7 it becomes the fitness-function contract the architecture enforces.

Concrete Example

Example 1 (insertion sort).

def insertion_sort(A):
for j in range(1, len(A)):
key = A[j]
i = j - 1
while i >= 0 and A[i] > key:
A[i + 1] = A[i]
i -= 1
A[i + 1] = key
return A

Outer-loop invariant (before iteration $j$): $A[0 \ldots j-1]$ consists of the original elements at those positions, now in sorted order.

  • Initialization. Before $j = 1$, $A[0 \ldots 0]$ is one element, trivially sorted and unchanged.
  • Maintenance. Iteration $j$ inserts $A[j]$ into its correct place among $A[0 \ldots j-1]$, shifting larger elements right. After the iteration, $A[0 \ldots j]$ is the original elements at those positions, sorted.
  • Termination. The outer loop exits with $j = \text{len}(A)$, so $A[0 \ldots \text{len}(A)-1]$ is the full array, sorted. Exactly the postcondition.

The inner while loop has its own invariant (that $A[i+1 \ldots j]$ are all greater than key), which combines with the outer invariant to prove the shift step correctly positions key.

Example 2 (binary search).

def bsearch(A, x):
lo, hi = 0, len(A) - 1
while lo <= hi:
mid = (lo + hi) // 2
if A[mid] == x: return mid
if A[mid] < x: lo = mid + 1
else: hi = mid - 1
return -1

Invariant: if $x$ appears in $A$, its index is in the closed interval $[\text{lo}, \text{hi}]$.

  • Initialization. The interval is $[0, n-1]$, the whole array.
  • Maintenance. After comparing A[mid], the half that cannot contain $x$ is discarded; the invariant still holds on the remaining interval.
  • Termination. When lo > hi, the interval is empty; combined with the invariant, $x$ is not in $A$ and the function returns -1. If A[mid] == x, we exit early with the correct index.

The invariant is the reason binary search is trustworthy: it is exactly strong enough that termination produces the right answer.

Common Confusion / Misconceptions

  • "The invariant is just what the loop is supposed to do." No. "The array will be sorted at the end" is false in the middle of insertion sort and therefore useless as an invariant. "$A[0 \ldots j-1]$ is sorted" is true in the middle and exactly strong enough.
  • "Initialization is obvious, skip it." Skipping initialization turns the induction into a self-referential circle. A loop whose maintenance holds but whose initialization fails is exactly a loop that produces systematic wrong answers.
  • "My invariant implies the postcondition only sometimes." Then it is too weak. Strengthen it until "invariant + exit condition" mechanically imply the postcondition.
  • "Invariants are a tool for textbooks." They are the documentation that survives refactoring. A five-line invariant at the top of a hand-tuned inner loop will save you on the day it has to be sped up or ported.
  • "The invariant should mention every variable." A good invariant is minimal: it mentions exactly the variables whose relationship carries the loop's contract. Every other variable becomes noise that makes the maintenance step hard to verify.

How To Use It

For any loop you care about:

  1. State the loop's postcondition in one sentence (what must be true when it exits).
  2. Ask what is true about the variables at the top of each iteration when the work is only partly done.
  3. Strengthen that sentence until the maintenance step - showing one iteration preserves it - becomes trivial.
  4. Verify initialization (before the first iteration the invariant holds).
  5. Verify maintenance (one iteration preserves it).
  6. Verify termination (invariant + exit condition = postcondition).
  7. If any step fails, either the invariant is wrong (revise) or the loop is wrong (fix the loop).

Treat writing the invariant as an active design step, not documentation. If the invariant is ugly, the loop often is too.

Transfer / Where This Shows Up Later

  • S1 M5 (problem solving) framed "define what done looks like." An invariant is the iterative version: define what the middle looks like.
  • S4 (systems) analyzes lock-free data structures by specifying atomic invariants that hold before/after each compare-and-swap; same skeleton, different domain.
  • S5 (OS) expresses scheduler invariants ("the ready queue is sorted by effective priority") that survive every interrupt.
  • S6 (databases) expresses transaction-log replay as a sequence of operations preserving a ledger invariant; recovery is "re-establish the invariant after a crash."
  • S7 (architecture) writes architectural invariants ("no module in orders depends on billing.client") and makes them enforceable by fitness functions (S7 M5, concept 13). The predicate is the same; the domain is components instead of array slots.
  • S8 (distributed systems) states safety properties (agreement, validity) as invariants over a message log; consensus algorithms are invariant-maintenance proofs.

Concept 6 (recursive correctness by strong induction) is the recursive counterpart; concept 7 (termination) is the orthogonal "does it stop?" question.

Within this semester: S2 M2 uses loop invariants for partition (Lomuto and Hoare) and for in-place heapification; S2 M3 uses BFS/DFS invariants ("after iteration $i$, all vertices at distance $\le i$ are discovered"); S2 M5 verifies union-find operations via invariants on the parent array.

Check Yourself

  1. State a loop invariant for binary search that makes the termination step obvious.
  2. Why does "the array will be sorted at the end" fail as a loop invariant for insertion sort?
  3. What goes wrong if the maintenance step holds but initialization does not?
  4. Give an invariant for the partition step of quicksort (Lomuto scheme): what does it say about the two sub-ranges and the pivot?
  5. Why does an empty array pass any correctly-written loop's invariant trivially? What bug pattern does that mask?
  6. In Hoare-logic notation, an invariant sits between ${\text{pre}}$ and ${\text{post}}$; the loop itself is ${I}, \text{while}, B, \text{do}, S, {I \land \neg B}$. Write the three proof obligations this skeleton imposes on $I$.
  7. You strengthen an invariant to prove maintenance; later the stronger invariant fails at initialization. What does this tell you?

Mini Drill or Application

Write a loop invariant and prove correctness for each:

  1. Linear search returning the index of $x$ in $A$ or $-1$.
  2. A loop computing sum = 0; for i in range(n): sum += A[i]. Invariant describes sum at step $i$.
  3. The partition step of quicksort using Lomuto or Hoare scheme (state the invariant over the two partitions).
  4. Horner's evaluation of a polynomial: result = 0; for c in reversed(coeffs): result = result * x + c.
  5. Two-pointer algorithm for "given a sorted array, does any pair sum to $T$?" State the invariant that A[lo] + A[hi] = T is possible only if lo <= ... <= hi.
  6. Implementation drill. Write pseudocode for maximum_subarray(A) using Kadane's algorithm. State invariants for (a) the running best suffix and (b) the running maximum overall. Verify initialization, maintenance, termination. Complexity derivation: one pass, $\Theta(1)$ work per iteration, total $\Theta(n)$ time, $\Theta(1)$ space.

Read This Only If Stuck