Module Quiz
Complete this quiz after finishing all concept and practice pages.
Current Module Questions
Question 1: Non-Atomic Operations
Why does counter++ need synchronization when two threads call it, even though counter is a single int?
Answer: counter++ decomposes into three hardware steps: load counter -> register, add 1, store register -> counter. Two threads can interleave so both load the same original value, both increment to the same new value, and both store it, losing one update. Only a read-modify-write done as one atomic instruction (or guarded by a lock) prevents this.
Question 2: Race Diagnosis from a Trace
You observe this log from two threads incrementing a shared counter that starts at 100:
T1: load counter -> 100
T2: load counter -> 100
T1: add 1, store 101
T2: add 1, store 101
What value is in memory at the end, and what concurrency bug is this?
Answer: counter = 101. This is a lost update: two +1 operations executed but only one was recorded. It is a data race on the load-modify-store sequence; the fix is a mutex around the increment or an atomic fetch-add.
Question 3: Race Diagnosis from a Trace (Deadlock)
You observe this waits-for state for two threads:
Thread A: holds m1, waiting for m2
Thread B: holds m2, waiting for m1
Which Coffman condition is satisfied, and what is the smallest change that would prevent this failure going forward?
Answer: Circular wait (all four conditions are satisfied, but circular wait is the one that is easiest and most practical to break). Define a total order over mutexes (for example, by address), and require every code path that needs more than one mutex to acquire them in that order. That alone eliminates the cycle.
Question 4: Critical-Section Properties
What are the three correctness properties of a mutual-exclusion protocol, and why does mutual exclusion alone not suffice?
Answer: Mutual exclusion, progress, and bounded waiting. Mutual exclusion alone can be satisfied by "nobody ever enters the critical section," which violates progress. Progress alone can be satisfied by round-robin without synchronization, which violates mutual exclusion. Bounded waiting rules out systems that admit indefinite overtaking even while satisfying the other two.
Question 5: Spinlock vs Mutex
A critical section takes 100 ns and runs on a 4-core machine with frequent contention. Spinlock or mutex, and why?
Answer: Spinlock. The critical section is shorter than a context switch (1-10 us), so sleeping and waking costs more than spinning. Preconditions: the holder must not block, fault, or be preempted inside the section.
Question 6: Memory Orderings
What does atomic_store_explicit(&flag, 1, memory_order_release) paired with atomic_load_explicit(&flag, memory_order_acquire) guarantee?
Answer: It establishes a happens-before relationship: any write the storing thread did before the release store becomes visible to the loading thread once it observes the stored value via the acquire load. In particular, plain (non-atomic) writes before the release are safe to read after the acquire.
Question 7: Condition Variable Usage
Explain why while (!predicate) cond_wait(&cv, &m); must use while, not if.
Answer: Pthreads condition variables have Mesa semantics: when a waiter is signaled, it must re-contend for the mutex and by the time it re-acquires it another thread may have consumed the resource. Spurious wakeups (waking without any signal) are also explicitly permitted. if would continue past the check with the predicate false; while rechecks and re-waits if necessary.
Question 8: Semaphore vs Mutex
Why is a binary semaphore not a safe substitute for a mutex?
Answer: A mutex tracks ownership: only the locking thread may unlock, and it supports features like recursion, error checking, and priority inheritance. A binary semaphore is ownerless -- any thread can post, which means a bug elsewhere can "unlock" a critical section it never entered, and priority-inversion protection is absent.
Question 9: Bounded Buffer Signal Choice
In a producer-consumer buffer with two condition variables (not_full, not_empty), why is signal sufficient and when would you need broadcast?
Answer: Each state change (a put enables one consumer; a get enables one producer) unblocks exactly one waiter of a specific kind, so signal is enough. broadcast is required when a single state change can unblock multiple waiters with different predicates (for example, on shutdown, to wake every blocked producer and consumer) or when multiple waiters share one condvar but have distinct predicates.
Question 10: Readers-Writers Fairness
A reader-preferring rwlock protects a config map. Writes are rare; reads arrive continuously at 10k/sec. A deploy that should take milliseconds has been waiting 30 seconds. Diagnose.
Answer: Writer starvation. The reader-preferring policy lets new readers enter even while a writer is waiting. Under sustained read load, no quiet interval appears for the writer. Fix: use a writer-preferring or FIFO-fair rwlock, or switch to a plain mutex if reads are short.
Question 11: Dining Philosophers
Name two Coffman conditions the resource-ordering fix can break, and explain which one it actually breaks.
Answer: The fix targets circular wait. "Hold and wait" is still present (a philosopher holds the lower-numbered fork while waiting for the higher-numbered one), but no cycle can form because any cycle would require some philosopher to wait for a lower-numbered fork while holding a higher-numbered one, which the ordering forbids.
Question 12: ABA Problem
Explain the ABA problem in a lock-free stack and one way to fix it.
Answer: Thread 1 reads top = A. It is preempted. Thread 2 pops A (so top = B), pops B (so top = C), and pushes A back (reusing memory). Thread 1's CAS top: A -> old_A->next succeeds because top is again A, but old_A->next now points into freed/reused memory. Fix: pair the pointer with a monotonically increasing counter in a double-word CAS (tagged pointers), or defer reuse with hazard pointers or epoch-based reclamation.
Question 13: Memory Ordering (relaxed)
When is memory_order_relaxed safe for an atomic counter, and when is it wrong?
Answer: Safe when the counter's value is the only thing communicated, no other state depends on the ordering of its update relative to other memory (for example, pure statistics: number of requests seen). Wrong when readers of the counter rely on other writes being visible after the counter changes -- for example, when the counter functions as a sequence number that publishes data written to another location.
Question 14: Async Races
Can a single-threaded asyncio program have a data race? Can it have a concurrency bug?
Answer: A data race at the CPU-instruction level: no, because only one coroutine runs between await points. A concurrency bug: yes. Logical races at await boundaries can cause lost updates (two coroutines reading a cache miss, both fetching, both writing) and ordering bugs. Coordinate with asyncio.Lock, in-flight futures, or sequence-aware structures.
Question 15: Lock Granularity
A hash table with one global mutex tops out at one core of throughput under 50/50 read-write load. You shard into 64 buckets with one mutex each. The throughput rises but not 64x. Why?
Answer: Real workloads have hot keys, contention on the allocator and the shared CPU cache, cross-bucket operations (resize, rebalance), and non-uniform access distribution. Amdahl's law applies: the serial fraction and the constant per-lock cost limit speedup. Measured scaling is usually sub-linear even with good sharding.
Interleaved Review Questions
Prior Module Question 1
Why is a page table walk expensive enough to justify a TLB?
Answer: A page walk requires several memory accesses (typically 4 levels on x86-64), each potentially a cache miss. A TLB hit resolves the translation in one cycle. TLBs turn per-memory-access cost from tens of cycles into one.
Prior Module Question 2
What is a process, and how does it differ from a thread?
Answer: A process is an address space plus one or more threads of execution. Threads inside a process share the address space and file descriptors; processes have independent address spaces and communicate through IPC.
Prior Module Question 3
State the scheduling goal trade-off between throughput and response time.
Answer: Throughput maximizes work done per unit time; response time minimizes latency from request to first progress. Schedulers that batch aggressively maximize throughput at the cost of response time; interactive schedulers do the reverse.
Prior Module Question 4
Why does demand paging rely on the principle of locality?
Answer: Programs tend to access a small subset of their address space over short windows. Demand paging exploits this by keeping only recently-used pages in physical memory, relying on locality to keep the fault rate low.
Prior Module Question 5
What is the difference between user mode and kernel mode, and why does it matter for correctness?
Answer: Kernel mode executes privileged instructions and accesses all memory; user mode cannot. System calls are the only legal transition. This boundary is what enforces process isolation, page protection, and device access control.
Self-Assessment and Remediation
Mastery Level (90-100% correct):
- Ready to advance with confidence. Concurrency intuition is solid.
Proficient Level (75-89% correct):
- Review only the missed concept pages and redo Kata 1 (bounded queue) and Kata 3 (deadlock) from scratch.
Developing Level (60-74% correct):
- Rework Practice pages 1 and 3. Pay particular attention to Mesa semantics, Coffman conditions, and memory orderings.
Insufficient Level (<60% correct):
- Return to the concept sequence. The interleaving habit (concepts 1-2) must be stable before anything else makes sense. Spend a full week rereading concept pages 1-6 and redoing every drill.