Module Quiz

Complete this quiz after finishing all concept and practice pages.

Current Module Questions

Question 1: Why Translate At All

Name three capabilities that virtual memory gives the OS, none of which is achievable when programs emit physical addresses directly.

Answer: Isolation between processes (each process's pointers cannot name another process's memory), relocation of physical pages without informing the process (the kernel can evict, migrate, compress, or share frames transparently), and over-commitment / abstraction (processes see a large, clean, private virtual address space even when physical memory is small or shared).

Question 2: Paging Math

A process uses 48-bit virtual addresses with 4 KiB pages. How many bits are the offset, how many are the VPN, and how many PTEs would a single flat page table hold per process?

Answer: Offset = 12 bits (4 KiB = 2^12). VPN = 36 bits. Flat table entries = 2^36 = ~6.87 * 10^10, times 8 bytes per PTE = 512 GiB per process. That is why multi-level tables exist.

Question 3: Page-Table Walk Walkthrough

Walk a 4-level (9+9+9+9+12) x86-64 translation for virtual address 0x0000_5555_5555_6000. Give the five fields.

Answer:

0x0000_5555_5555_6000 in binary, focus on bits 47-0.
bits 47-39 (PML4) = 0_1010_1010 = 0xAA
bits 38-30 (PDPT) = 1_0101_0101 = 0x155
bits 29-21 (PD) = 0_1010_1010 = 0xAA
bits 20-12 (PT) = 1_0101_0101 = 0x155
bits 11-0 (offset) = 0x000

(Exact numbers depend on careful bit alignment; the key is that each level is a 9-bit index into a 512-entry table, and the offset is 12 bits.)

Question 4: Walk Cost

What is the worst-case memory-access cost of a TLB miss on a 4-level page-table walk, and how does a 2 MiB huge page change it?

Answer: Up to 4 DRAM reads (one per level) plus the final data read. With a 2 MiB huge page the walk stops at the PD level, so worst case is 3 DRAM reads. With a 1 GiB page it stops at PDPT: 2 DRAM reads.

Question 5: TLB Miss vs Page Fault

Explain the difference between a TLB miss and a page fault.

Answer: A TLB miss is a cache miss on the translation itself; the page table is present in memory and the MMU performs the walk with no OS involvement, costing a few DRAM accesses. A page fault means the PTE is absent or protection-violating; the MMU raises an exception, the kernel runs the fault handler, potentially reads a page from disk, installs a PTE, and returns. Misses are frequent and cheap; faults are rare and expensive.

Question 6: Minor vs Major Faults

Give one example of each, and describe their order-of-magnitude cost difference.

Answer: Minor fault example: first write to a fresh anonymous mapping (the kernel zeroes a frame and installs a PTE, no I/O). Major fault example: read of a page from a file-backed mapping that is not in the page cache (the kernel reads from disk). Minor faults cost microseconds; major faults cost tens to hundreds of microseconds on SSDs and milliseconds on spinning disks. Often 100x-10000x slower.

Question 7: Replacement Simulation

Reference string 1 2 3 4 1 2 5 1 2 3 4 5, 3 frames. Give the fault count under FIFO, LRU, and OPT.

Answer:

FIFO: 9 faults. Order of residents after each reference, with evictions.
LRU: 10 faults.
OPT: 7 faults.

(Counts can shift slightly based on tie-breaking; the key is OPT < FIFO <= LRU on this string.)

Bonus: With 4 frames, FIFO yields 10 faults (Belady's anomaly: more frames, more FIFO faults on this string).

Question 8: Belady's Anomaly

State Belady's anomaly in one sentence.

Answer: For FIFO page replacement, increasing the number of frames can increase the fault count for some reference strings.

Question 9: Working Set and Thrashing

Define the working set W(t, tau) and explain why a system exhibits a sharp cliff rather than a slope when memory pressure increases.

Answer: W(t, tau) is the set of pages a process has referenced in the interval [t - tau, t]. The working-set size is |W(t, tau)|. While the sum of working sets fits in RAM, faults are rare and performance is smooth. Once the sum exceeds RAM, every eviction throws out a page another process is about to need; faults cascade. Because faults cost 100x-10000x more than hits, the transition is catastrophic rather than gradual.

Question 10: Kernel Allocators

Why does the kernel use both the buddy allocator and the slab allocator?

Answer: The buddy allocator hands out contiguous runs of 2^k page frames, well-matched to page-frame allocation and large DMA buffers, but it rounds requests up by powers of two, which is wasteful for small objects. The slab allocator runs on top of the buddy allocator: it takes page-sized chunks and carves them into per-type caches of equally-sized objects (inodes, sockets, task_struct, ...), with O(1) allocation and free and near-zero internal fragmentation within a cache. Neither is sufficient alone.

Question 11: User Allocator Choice

A long-running multithreaded service on 16 cores shows RSS of 24 GiB with 4 GiB live objects and 40% CPU time in __lll_lock_wait. Diagnose and propose a fix.

Answer: __lll_lock_wait inside libc indicates contention on the glibc ptmalloc arena locks. Combined with a large RSS-vs-live gap, this is classic ptmalloc arena accumulation and fragmentation. Fix candidates: set MALLOC_ARENA_MAX=2 or 4 (reducing arena count trades some contention for lower RSS), or switch to jemalloc or tcmalloc via LD_PRELOAD. jemalloc is usually the best default for long-lived services because of its aggressive page-release behavior.

Question 12: Fragmentation Types

Give one concrete example of internal fragmentation and one of external fragmentation.

Answer:

Internal: a 40-byte request served from a 48-byte size class wastes 8 bytes (~17%) per allocation; the slack is inside an allocated block.
External: a heap has free chunks of 1, 2, 1, 3, 1, 2 KiB (10 KiB total) but cannot satisfy a 5 KiB request. Or at the buddy-allocator level: plenty of order-0 free pages but zero at order 9, so a 2 MiB contiguous request fails.

Question 13: Copy-on-Write

A Redis server with 20 GiB of data forks a child to do a background save. What is each process's RSS immediately after fork? What does writing to a page cost?

Answer: Immediately after fork, both processes show 20 GiB RSS, but almost all pages are shared read-only across the two with a refcount of 2. Each writable page-table entry in both processes is marked read-only. On a write to a page, the MMU raises a protection fault; the kernel allocates a new frame, copies the page, installs it as read-write in the writing process's PTE, and decrements the refcount. Only the pages written by either process diverge; the rest stay shared until exec or exit.

Question 14: `mmap` Flavors

You need: (a) a 1 GiB zeroed scratch buffer, (b) shared memory between a parent and a child, (c) to read a 10 GiB file where you want the kernel to cache hot parts, (d) a persistent on-disk data structure that you want to manipulate in-memory with writes visible to other processes. Choose the right mmap flavor for each.

Answer:

(a) MAP_PRIVATE | MAP_ANONYMOUS (zero-filled anonymous private).
(b) MAP_SHARED | MAP_ANONYMOUS (or memfd_create + MAP_SHARED) so parent and child see each other's writes.
(c) MAP_PRIVATE over the file read-only, using the page cache via major faults on cold pages.
(d) MAP_SHARED over the file read-write; writes propagate to the file and to other mappers.

Question 15: Huge Pages Trade-off

A workload's perf stat shows a very high dTLB-load-misses rate. You turn on transparent huge pages globally. Performance improves on some workloads and degrades on others. Explain one case each.

Answer:

Improves: an in-memory database reads randomly across a 10 GiB buffer pool. With 4 KiB pages the working set vastly exceeds TLB reach; with 2 MiB pages TLB reach grows 512x and miss rate collapses.
Degrades: a fork-heavy service (e.g., Redis BGSAVE) triggers CoW on 2 MiB pages. Each divergence copies 2 MiB instead of 4 KiB (512x amplification). Also, khugepaged compaction sweeps can stall latency-sensitive threads. This is why some databases recommend disabling THP.

Interleaved Review Questions

Prior Module Question 1 (S5 Module 1: Processes & Scheduling)

Why does a context switch on x86-64 not require flushing the TLB on modern kernels?

Answer: Because the CPU supports PCID / ASID tagging: each TLB entry carries an address-space identifier, so the TLB can hold translations for many processes simultaneously. A context switch changes CR3 (and the current PCID); the TLB does not need to be flushed, only filtered by the new PCID on probes.

Prior Module Question 2 (S4 Systems Programming: Process Memory Layout)

Name the four traditional segments of a Unix process address space and whether each grows.

Answer: Text (code, fixed size, typically read-only), data (initialized globals, fixed size), BSS (uninitialized globals, fixed size and zero-filled), heap (grows up via brk/sbrk or mmap), stack (grows down toward the heap). Modern layouts also include loaded shared libraries and mmaped regions in the middle.

Prior Module Question 3 (S2 Module 1: Algorithm Analysis)

The buddy allocator's "split down to order k, then coalesce on free" sequence: what is the asymptotic cost per allocation, and why?

Answer: O(log N) where N is the maximum supported order, because the walk up and down the order free lists is bounded by the number of orders. Coalescing is also O(log N) because each coalescing step doubles the block size, so at most log N steps occur.

Prior Module Question 4 (S1 Module 1: Proofs / Induction)

Prove by induction that a buddy allocator that serves requests of exactly power-of-two sizes cannot produce external fragmentation (all free memory is always coalescible into usable blocks for some size class).

Answer: Base case: the allocator starts with one block of order K (the maximum). Inductive step: assume at some point every maximal free run of memory is aligned on its own size boundary and can be coalesced with its buddy if the buddy is also free. Any allocation of a power-of-two size splits blocks but preserves the alignment invariant. Any free reunites buddies if both free. Therefore the invariant holds. In particular, the allocator never holds unusable gaps smaller than its smallest size class.

Prior Module Question 5 (S1 Module 3: Probability)

If each of 100 threads independently hits one of 8 glibc arenas uniformly at random, what is the probability that all eight arenas are contended (each used by at least one thread)?

Answer: Occupancy problem. P(all 8 used) = 1 - P(at least one arena unused). Using inclusion-exclusion, or the standard surjection formula:

P = sum_{k=0}^{8} (-1)^k C(8,k) ((8-k)/8)^100 = 8! * S(100, 8) / 8^100

where S(n,k) is the Stirling number of the second kind. Numerically this is essentially 1 (with 100 threads over 8 arenas, all arenas being hit is almost certain).

Self-Assessment and Remediation

Mastery Level (90-100% correct):

Ready to advance with confidence. You can walk a translation and diagnose a real memory problem.

Proficient Level (75-89% correct):

Review only the missed concept pages and redo two problems of each missed type. Prioritize any questions about TLB cost and replacement policies.

Developing Level (60-74% correct):

Rework Cluster 2 (Page Tables and the TLB) and Cluster 3 (Page Replacement and Swapping). Redo Practice 1 and Practice 2. The gap is almost always page-table walk math or replacement-policy simulation.

Insufficient Level (<60% correct):

Return to the concept sequence from the start. The likely issue is that you are treating virtual memory as abstract rather than as a hardware + kernel system. Re-read Cluster 1 and redo the bit-split and page-table walk katas before advancing.

Current Module Questions​

Question 1: Why Translate At All​

Question 2: Paging Math​

Question 3: Page-Table Walk Walkthrough​

Question 4: Walk Cost​

Question 5: TLB Miss vs Page Fault​

Question 6: Minor vs Major Faults​

Question 7: Replacement Simulation​

Question 8: Belady's Anomaly​

Question 9: Working Set and Thrashing​

Question 10: Kernel Allocators​

Question 11: User Allocator Choice​

Question 12: Fragmentation Types​

Question 13: Copy-on-Write​

Question 14: mmap Flavors​

Question 15: Huge Pages Trade-off​

Interleaved Review Questions​

Prior Module Question 1 (S5 Module 1: Processes & Scheduling)​

Prior Module Question 2 (S4 Systems Programming: Process Memory Layout)​

Prior Module Question 3 (S2 Module 1: Algorithm Analysis)​

Prior Module Question 4 (S1 Module 1: Proofs / Induction)​

Prior Module Question 5 (S1 Module 3: Probability)​

Self-Assessment and Remediation​

Current Module Questions

Question 1: Why Translate At All

Question 2: Paging Math

Question 3: Page-Table Walk Walkthrough

Question 4: Walk Cost

Question 5: TLB Miss vs Page Fault

Question 6: Minor vs Major Faults

Question 7: Replacement Simulation

Question 8: Belady's Anomaly

Question 9: Working Set and Thrashing

Question 10: Kernel Allocators

Question 11: User Allocator Choice

Question 12: Fragmentation Types

Question 13: Copy-on-Write

Question 14: `mmap` Flavors

Question 15: Huge Pages Trade-off

Interleaved Review Questions

Prior Module Question 1 (S5 Module 1: Processes & Scheduling)

Prior Module Question 2 (S4 Systems Programming: Process Memory Layout)

Prior Module Question 3 (S2 Module 1: Algorithm Analysis)

Prior Module Question 4 (S1 Module 1: Proofs / Induction)

Prior Module Question 5 (S1 Module 3: Probability)

Self-Assessment and Remediation