Skip to main content

Race Conditions and the Critical Section

What This Concept Is

A race condition exists when the final outcome of a program depends on the interleaving of threads, in a way the programmer did not intend.

A critical section is a segment of code that accesses shared state and must not be interleaved with another thread's critical section on the same state.

The classical critical-section problem asks for a protocol that guarantees three properties simultaneously:

  • Mutual exclusion. At most one thread executes its critical section at a time.
  • Progress. If no thread is in the critical section and some threads want to enter, the selection of the next entrant cannot be postponed forever.
  • Bounded waiting. There is a bound on how many times other threads can enter their critical sections after a thread has requested entry and before the request is granted.

Any protocol that achieves only one or two of these is a broken protocol.

Why It Matters Here

These three properties are the correctness specification for every mutual-exclusion mechanism in this module. Spinlocks, mutexes, semaphores, and monitors are all candidate implementations. If a proposed lock fails any of mutual exclusion, progress, or bounded waiting, you can reject it without looking at the code.

Progress and bounded waiting are the subtle ones. A lock that serializes access but lets the same thread monopolize the critical section forever has mutual exclusion but no fairness. A lock that requires both competitors to back off and retry may violate progress.

Concrete Example

Consider a bank-account transfer:

if (accounts[from].balance >= amount) {
accounts[from].balance -= amount;
accounts[to].balance += amount;
}

This is a critical section on the two accounts. The interleaving that breaks it:

StepThread A (transfer 100 from X)Thread B (transfer 100 from X)
1read X.balance = 150
2read X.balance = 150
3check 150 >= 100: ok
4check 150 >= 100: ok
5X.balance = 50
6X.balance = 50

The account had enough money for one transfer, not two, yet both succeeded. The invariant "balance never goes below 0 in a disallowed way" depends on the transfer code behaving as a single atomic step with respect to other transfers from the same account.

Common Confusion / Misconception

"A race is when two threads write at the same time."

A race is when at least one access is a write and the timing of accesses is unsynchronized, regardless of whether the access is a single instruction. Two reads never race. A read racing against a write is still a race, and it can return a torn or logically stale value.

Another misconception: "Making the critical section small is always better." Making it small reduces contention but makes it easier to accidentally move an invariant-preserving step outside the lock. Correctness first, then shrink.

A third misconception: "Mutual exclusion is the whole specification." It is not. A broken protocol that grants the critical section to the same thread forever technically excludes every other thread, so mutual exclusion holds, but progress fails and the other threads wait forever. The three properties together rule out both correctness bugs (concurrent access) and fairness bugs (indefinite waiting).

Data Races vs Race Conditions

These two terms are sometimes used interchangeably and sometimes distinguished carefully:

  • A data race is the narrow, formal notion: two unsynchronized memory accesses to the same location from different threads, at least one a write. In C11 and C++11 this is undefined behavior by the language standard. Tools like ThreadSanitizer detect data races directly by instrumenting memory accesses.
  • A race condition is the broader semantic notion: the program's behavior depends on scheduling even when individual operations are atomic. A bank-transfer protocol that reads balance, computes new balance, writes it back can be a race condition even if every load and store is itself atomic.

Data races are a strict subset of race conditions. Fixing data races (using atomics or locks on every shared access) does not automatically fix race conditions (which require holding a lock across the whole read-modify-write).

How To Use It

When you identify shared state:

  1. Find every code path that reads or writes it.
  2. Group those paths into critical sections that must be serialized.
  3. For each group, decide which invariant must hold at the boundary of the critical section.
  4. Protect the critical section with a mechanism that guarantees mutual exclusion, progress, and bounded waiting.
  5. Verify: can another thread observe state between the steps inside the critical section? If yes, the boundaries are wrong.

Check Yourself

  1. Give one example of a race that is not about two simultaneous writes.
  2. Why is progress not implied by mutual exclusion?
  3. Why is "just make the critical section as small as possible" incomplete advice?

Mini Drill or Application

Take the transfer code above. Write, in plain English, the exact invariant that must be true at the start and at the end of the critical section. Then write the smallest interleaving that violates the invariant if no synchronization is used. Then describe the smallest fix.

Read This Only If Stuck