Module 3: Concurrency & Synchronization
Primary text: Operating Systems: Three Easy Pieces (OSTEP), concurrency section (chapters 26-34)
Selective support: Operating System Concepts (Silberschatz) for rigorous formal treatment of the critical-section problem, monitors, and deadlock; Unix Network Programming (Stevens) for pthreads and IPC synchronization details
This guide is the primary teacher. You do not need to read the source books front-to-back to complete this module. You do need to develop a physical intuition for what happens when multiple threads touch the same memory, and an operational fluency with locks, condition variables, semaphores, and the standard coordination patterns.
Scope of This Module
This module is not a collection of synchronization API recipes. It is where you stop trusting that code "does what it says" and learn to reason about schedules.
What it covers in depth:
- threads and shared state, why
counter++is not one operation, and what a race condition actually is - critical sections, mutual exclusion, progress, and bounded waiting as the classical correctness conditions
- the hardware foundation: atomic instructions, cache coherence, memory visibility
- spinlocks versus blocking mutexes and when each is the right tool
- lock implementation from test-and-set and compare-and-swap up through queue-based locks
- lock granularity and its effect on correctness, contention, and scalability
- condition variables and the wait-signal pattern, with correct while-loop usage
- semaphores as counting and binary primitives and how they relate to mutexes and condition variables
- monitors as a higher-level language-integrated synchronization pattern
- classic problems: producer-consumer, readers-writers with fairness, dining philosophers
- deadlock: necessary conditions, prevention, detection, and recovery
- lock-free data structures, compare-and-swap loops, ABA, and memory ordering models
- async/await and event-loop concurrency as a contrasting single-threaded model
What it deliberately does not try to finish here:
- distributed consensus (later semester)
- transactional memory as a full topic
- formal model checking with TLA+ or Spin beyond brief motivation
- kernel-level scheduler internals (Module 1 covered scheduling)
This is an in-depth systems module. If your code happens to work on your laptop but you cannot explain the schedule that would break it, you are not done.
Before You Start
Answer these closed-book before starting the main path:
- Why is
counter = counter + 1not a single step from the hardware's perspective? - Two threads share a linked list. Why is "holding a pointer to a node" not the same as "owning that node"?
- If you protect every access to a shared variable with the same mutex, what correctness property have you guaranteed?
- Why can a program that never prints any wrong output still have a data race?
- What does it mean for a write performed on CPU 0 to be "visible" on CPU 1?
Diagnostic Interpretation
4-5 solid answers
- You are ready for the full path.
2-3 solid answers
- Continue, but expect extra time in the hardware foundation and memory ordering clusters.
0-1 solid answers
- Revisit Semester 4 Module on systems programming (processes, memory layout, pointers) and Module 1 of this semester (processes and scheduling). Concurrency only makes sense once those are stable.
What This Module Is For
Concurrency is where single-threaded reasoning stops working. Throughout your career you will repeatedly be asked:
- this code is correct when one thread runs it. Why does it corrupt memory under load?
- this server is fast on 1 core. Why does it get slower on 8 cores?
- this
if (cache == null) cache = load()looks harmless. Why does it sometimes return half-initialized objects? - this test passes 999 times and fails once. What is the minimal schedule that breaks it?
- where exactly is the race, and what is the smallest change that fixes it without introducing a deadlock?
This module builds the concurrency reasoning needed for:
- multithreaded servers and any backend that handles parallel requests
- operating system kernels, drivers, and runtime internals
- high-performance data structures and lock-free algorithms
- asynchronous I/O, event-driven servers, and reactive systems
- debugging the class of bugs that do not show up in logs
You are learning to read schedules, not just code.
Concept Map
How To Use This Module
Work the clusters in order. Each cluster's mastery check must pass before the next cluster makes sense.
Cluster 1: The Race Condition Problem
| Order | Concept | Type | Focus |
|---|---|---|---|
| 1 | Threads, Shared State, and Non-Atomic Operations | PRIMARY | Why counter++ decomposes into load-modify-store and what "shared" actually means |
| 2 | Race Conditions and the Critical Section | PRIMARY | Interleavings, the critical-section problem, and the mutual-exclusion / progress / bounded-waiting conditions |
| 3 | The Hardware Foundation: Atomic Instructions and Cache Coherence | PRIMARY | Why correctness rests on instructions the CPU promises to complete indivisibly |
Cluster mastery check: Can you write the interleaving that breaks a naive shared counter, and explain in one sentence what the CPU would have to guarantee for the fix to work?
Cluster 2: Locks and Mutual Exclusion
| Order | Concept | Type | Focus |
|---|---|---|---|
| 4 | Spinlocks vs Blocking Mutexes | PRIMARY | Spin cost vs sleep cost, when each fits, and the danger of spinning under the kernel |
| 5 | Test-and-Set, Compare-and-Swap, and Lock Implementations | PRIMARY | Building locks from hardware primitives, ticket locks, and queue-based locks |
| 6 | Lock Granularity: Coarse vs Fine-Grained | PRIMARY | The correctness-contention trade-off, per-bucket locks, and where fine grain becomes lock soup |
Cluster mastery check: Given a data structure, can you design a locking scheme and justify the granularity choice in terms of both correctness and contention?
Cluster 3: Coordination Primitives
| Order | Concept | Type | Focus |
|---|---|---|---|
| 7 | Condition Variables and the Wait-Signal Pattern | PRIMARY | Why wait must release the mutex atomically, and why predicate checks are loops not ifs |
| 8 | Semaphores: Counting, Binary, and Their Use Cases | PRIMARY | P/V, counting semaphores as resource pools, binary semaphores vs mutexes |
| 9 | Monitors and Higher-Level Synchronization | SUPPORTING | Monitors, Mesa vs Hoare semantics, and why modern languages integrate mutex + condvar |
Cluster mastery check: Can you build the same bounded-buffer three ways (mutex + condvar, two semaphores, monitor) and say what changes in each?
Cluster 4: Classic Concurrency Problems
| Order | Concept | Type | Focus |
|---|---|---|---|
| 10 | Producer-Consumer / Bounded Buffer | PRIMARY | Classic synchronization pattern, empty/full conditions, and common broken variants |
| 11 | Readers-Writers with Fairness | PRIMARY | Starvation of writers, reader-preferring vs writer-preferring vs fair schemes |
| 12 | Dining Philosophers and Deadlock Analysis | PRIMARY | Cyclic wait, resource ordering, and the first-principles deadlock fix |
Cluster mastery check: Given a classic problem, can you identify which correctness property is at risk (safety, liveness, fairness) and which primitive enforces it?
Cluster 5: Lock-Free and Modern Concurrency
| Order | Concept | Type | Focus |
|---|---|---|---|
| 13 | Deadlock: Necessary Conditions, Prevention, Detection, Recovery | PRIMARY | Coffman conditions, prevention strategies, detection cycles, and recovery trade-offs |
| 14 | Lock-Free Data Structures and Memory Ordering Models | PRIMARY | CAS loops, ABA, acquire/release semantics, and sequential consistency |
| 15 | Async/Await and Event-Loop Concurrency Models | SUPPORTING | Cooperative vs preemptive concurrency, where races go in single-threaded async, and when to reach for it |
Cluster mastery check: Can you explain why a lock-free stack is not automatically faster than a mutex-protected one, and name at least one memory-ordering hazard you must reason about?
Then work these practice pages:
| Order | Practice path | Focus |
|---|---|---|
| 1 | Race Condition and Locking Clinic | Spot races, write interleavings, and apply mutual exclusion |
| 2 | Coordination Primitive Workshop | Condition variables, semaphores, and monitor-style design |
| 3 | Deadlock and Lock-Free Reasoning Lab | Deadlock Coffman diagnosis, classic problems, and lock-free hazards |
| 4 | Code Katas | Thread-safe bounded queue, readers-writers, deadlock demo, lock-free stack |
Use Module Quiz after the concept and practice path. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.
Learning Objectives
By the end of this module you should be able to:
- Read a piece of concurrent code and write out the specific interleaving that would break it.
- Explain why a correct concurrent program needs mutual exclusion, progress, and bounded waiting together, not just one of them.
- Implement a spinlock from
test_and_setand describe why it is inappropriate as a general-purpose user-space lock. - Build a correct bounded buffer three different ways and say what each version trades.
- Choose lock granularity for a data structure and justify the choice in terms of contention.
- Diagnose a deadlock by identifying which Coffman condition a fix removes.
- Solve the readers-writers problem with an explicit fairness policy.
- Explain the ABA problem and at least one way to avoid it.
- Distinguish sequential consistency, acquire/release, and relaxed memory ordering at a usage level.
- Explain where races can and cannot occur in a single-threaded async/await program.
Outputs
- a concurrency notebook with at least 15 solved schedule-reasoning problems
- one working C + pthreads implementation of a thread-safe bounded queue with tests
- one working readers-writers implementation with an explicit fairness policy and a note on starvation behavior
- one deliberately buggy C + pthreads program that demonstrates a race, a commit of the fix, and a written analysis of the interleaving
- one deadlock demo plus a prevention-strategy write-up naming the Coffman condition broken
- one lock-free stack implementation with a CAS loop and a written note on ABA
- one mistake log naming at least 10 concurrency errors (for example
forgot while loop on cond wait,inconsistent lock order,double-checked lock without atomics,priority inversion ignored) - one short memo mapping Module 3 tools to future work (network services, databases, runtime schedulers)
Completion Standard
You have completed Module 3 when all of these are true:
- you can exhibit the schedule that breaks an unsynchronized shared counter and the schedule that fixes it
- you can implement a correct bounded buffer with mutex + condvar without looking it up
- you can explain why
waitreleases the mutex and why the predicate is rechecked in awhile - you can implement readers-writers and say whose starvation risk each variant accepts
- you can recognize a cyclic wait from a program and break it with either ordering or hold-and-wait removal
- you can write a lock-free stack and identify its ABA hazard
- you can say what
memory_order_acquireandmemory_order_releasemean and when they are enough
If the code runs but you cannot produce an interleaving that would break an earlier version of it, the module is not complete.
Reading Policy
- Concept pages are the main path.
- Local book chunks are selective reinforcement. Open one chunk for one specific gap; do not read whole chapters by default.
Read only if stuckmeans try the concept page, self-check, and drill first.Optional deep divemeans additional nuance (for example memory-model formalism or specific lock variants), not required progression.- Because concurrency bugs look fine in logs, written interleaving analyses are required, not optional enrichment.
Suggested Weekly Flow
| Day | Work |
|---|---|
| 1 | Concepts 1-3 and one written race-condition interleaving |
| 2 | Concepts 4-6 and one spinlock-from-test-and-set implementation |
| 3 | Concepts 7-9 and one bounded buffer with mutex + condvar |
| 4 | Concepts 10-12 and one readers-writers implementation |
| 5 | Concepts 13-15 and one deadlock demo + lock-free stack sketch |
| 6 | Practice pages 1-2 and targeted local-book reinforcement |
| 7 | Practice pages 3-4, quiz, and mistake-log cleanup |
Reference
If you need exact links into the local chunked books, use Reference and Selective Reading.
Rich Learning Pages
Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread