Module 2: Memory Management & Virtual Memory

Primary text: Operating Systems: Three Easy Pieces (OSTEP), Part I chapters on virtualization of memory
Selective support: Operating System Concepts chapters 9-10 for alternate exposition, the Linux kernel memory-management admin guide, and LWN articles on allocators and huge pages

This guide is the primary teacher. You do not need to read OSTEP front-to-back to complete this module. You do need to become operationally strong at reading a page-table walk, predicting TLB and page-fault behavior, and reasoning about allocators in both the kernel and in userland.

Scope of This Module

This module is not a tour of memory hardware trivia. It is where virtual memory becomes a system you can reason about when something is slow, faulting, or leaking.

What it covers in depth:

physical versus virtual addresses and why translation exists at all
base-and-bounds and segmentation as the first attempts, and why they lost
paging as the modern default: page size, page number, page offset, page-table entry
single-level, multi-level, and inverted page tables with their space/time trade-offs
the TLB as the caching fix that makes paging affordable, plus miss costs
demand paging, minor versus major faults, and the page-fault control flow
page-replacement policies (OPT, FIFO, LRU, clock, second-chance) and their failure modes
thrashing and the working-set intuition for "when a machine falls off a cliff"
kernel allocators: buddy system and slab, and why the kernel needs both
userland allocators: dlmalloc, ptmalloc, jemalloc, tcmalloc and the arenas / thread-cache / size-class ideas
internal versus external fragmentation, and how allocators trade one for the other
copy-on-write and what makes fork + exec cheap
mmap, anonymous mappings, shared mappings, and file-backed mappings
huge pages, NUMA, and the modern wrinkles that change cost models

What it deliberately does not try to finish here:

concurrency primitives inside the allocator (Module 3 covers locks and lock-free structures)
the full page-cache and writeback story (Module 4 covers file systems and I/O)
hardware microarchitecture (cache hierarchy, prefetch) beyond how it touches TLB and page size
security topics like ASLR, W^X, KASLR, and side channels beyond a passing mention

This is an in-depth systems module. If you can quote definitions but cannot walk a virtual address to its page frame on paper, you are not done.

Before You Start

Answer these closed-book before starting the main path:

Why does the OS keep a process from using the raw physical address it wants?
Given a 48-bit virtual address and 4 KiB pages, how many bits are the offset and how many are the page number?
What is a TLB miss, and why is it not the same thing as a page fault?
Why is LRU a theoretical benchmark rather than a policy a real kernel actually implements?
What does fork copy, and what does it not copy, on a modern Linux system?

Diagnostic Interpretation

4-5 solid answers

You are ready for the full path.

2-3 solid answers

Continue, but expect extra time in Cluster 2 (page tables and TLB) and Cluster 4 (allocators).

0-1 solid answers

Revisit Module 1 of this semester (processes and scheduling) and the Semester 4 systems-programming material on pointers and the process address space before advancing. You need those before paging becomes readable.

What This Module Is For

Virtual memory is the abstraction that makes every other part of a real system tractable. Later work repeatedly asks questions like:

why did my service's p99 latency double after a memory upgrade?
why does this program spend 30% of its time in the kernel on TLB shootdowns?
why does malloc return quickly but my resident-set size climb for an hour?
why does fork on a 40 GiB process take milliseconds?
why does mmap of a large file look free until the first read of each page?

This module builds the memory reasoning needed for:

debugging page-fault storms, swap death, and NUMA imbalance
reading perf, vmstat, /proc/meminfo, pmap, and smem output without guessing
choosing and tuning allocators for latency-sensitive or memory-sensitive workloads
understanding GC, JIT, and runtime-level memory use through the lens of the virtual-memory subsystem underneath
designing data-intensive systems where memory layout, not CPU, is the bottleneck

You are learning to reason about memory without handwaving.

Concept Map

How To Use This Module

Work in order. The later clusters only make sense if the earlier clusters are stable.

Cluster 1: Address Spaces and Translation

Order	Concept	Type	Focus
1	Physical vs Virtual Addresses: Why We Need Translation	PRIMARY	Why the OS refuses to let programs use raw physical addresses
2	Base/Bounds and Segmentation: The First Answer	SUPPORTING	The earliest translation hardware and why it lost to paging
3	Paging: The Modern Answer	PRIMARY	Fixed-size pages, page numbers, offsets, and page-table entries

Cluster mastery check: Can you take a 48-bit virtual address with 4 KiB pages, split it into VPN and offset, and explain what the MMU does with each half?

Cluster 2: Page Tables and the TLB

Order	Concept	Type	Focus
4	Single-Level and Multi-Level Page Tables	PRIMARY	Why flat tables are infeasible, how radix trees fix it
5	Inverted Page Tables and Their Trade-offs	SUPPORTING	When one table per frame beats one table per process
6	The TLB and Its Caching Role	PRIMARY	How a 64-entry cache decides whether paging is tolerable

Cluster mastery check: Can you walk a 4-level page-table translation on paper and explain exactly what a TLB miss costs relative to a TLB hit?

Cluster 3: Page Replacement and Swapping

Order	Concept	Type	Focus
7	Demand Paging and Minor/Major Faults	PRIMARY	Pages appear on demand, and faults come in two costs
8	Replacement Policies: FIFO, LRU, Approximations	PRIMARY	OPT as a benchmark, why real kernels use clock and variants
9	Thrashing and Working-Set Theory	PRIMARY	The working set, and why swap death happens suddenly

Cluster mastery check: Given a reference string, can you simulate OPT, FIFO, and LRU, and explain Belady's anomaly without looking it up?

Cluster 4: Allocators and Fragmentation

Order	Concept	Type	Focus
10	Kernel Memory Allocators: Buddy System, Slab	PRIMARY	How the kernel gets page frames and small objects without fragmenting itself
11	Userland Allocators: dlmalloc, ptmalloc, jemalloc, tcmalloc	PRIMARY	Arenas, size classes, thread caches, and why choice matters
12	Internal vs External Fragmentation	SUPPORTING	Where the wasted bytes go, and how allocators trade one for the other

Cluster mastery check: Can you explain, in one paragraph each, what jemalloc does that ptmalloc does not, and why the kernel uses both buddy and slab rather than one or the other?

Cluster 5: Practical Virtual Memory Effects

Order	Concept	Type	Focus
13	Copy-on-Write and `fork`'s Efficiency	PRIMARY	Why `fork` on a 40 GiB process takes milliseconds
14	`mmap`, Anonymous Mappings, Shared Mappings	PRIMARY	The unified interface under allocation, shared memory, and file I/O
15	Large Pages, NUMA, and Modern Wrinkles	SUPPORTING	Huge pages, THP, NUMA policy, and why new hardware keeps moving the costs

Cluster mastery check: Can you describe one measurable real-system effect of each of these: CoW after fork, a file-backed mmap first-read, and a workload that benefits from 2 MiB huge pages?

Then work these practice pages:

Order	Practice path	Focus
1	Page Fault and TLB Measurement Lab	`perf stat`, minor vs major faults, TLB miss cost under varying stride
2	Page Replacement Simulator Workshop	Build and compare OPT, FIFO, LRU, and clock on real reference strings
3	Allocator Comparison Clinic	Benchmark `glibc`, `jemalloc`, `tcmalloc` on contention and RSS
4	Code Katas	Short timed drills on page-table math, `mmap`, and allocation patterns

Use Module Quiz after the concept and practice path. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.

Learning Objectives

By the end of this module you should be able to:

Explain why the OS refuses to expose physical memory directly and state the three things translation buys us.
Walk a multi-level page-table lookup for a 48-bit virtual address with 4 KiB pages, naming each index and entry.
Describe what a TLB hit, a TLB miss, a minor page fault, and a major page fault each cost, in order of magnitude.
Simulate FIFO, LRU, and clock on a reference string and explain Belady's anomaly.
Define the working set and explain why thrashing appears as a cliff rather than a slope.
Explain the buddy system and slab allocator, and why the kernel uses both.
Compare ptmalloc, jemalloc, and tcmalloc on contention, fragmentation, and RSS bloat behavior.
Distinguish internal and external fragmentation with a worked example of each.
Explain copy-on-write with a concrete fork/exec timeline.
Choose between malloc, anonymous mmap, shared mmap, and file-backed mmap for a given workload and justify.

Outputs

a page-table notebook with at least 8 walked translations (different levels, different page sizes) and a 1 GiB / 2 MiB / 4 KiB comparison table
one TLB and page-fault measurement report using perf stat on a stride-varying program, with the cost curve and a one-page interpretation
one page-replacement simulator in code, supporting OPT, FIFO, LRU, and clock, validated against at least three reference strings from OSTEP Chapter 22
one allocator-comparison report comparing glibc default, jemalloc, and tcmalloc on a multithreaded micro-benchmark, with RSS, throughput, and contention notes
one mmap playbook: at least four worked scenarios (anonymous, file-backed read, file-backed MAP_SHARED, huge-page-backed) with expected /proc/$pid/maps output
one CoW timeline from fork to divergence, with /proc/$pid/smaps readings before and after the first write
one mistake log naming at least 12 recurring errors such as confused TLB miss with page fault, assumed LRU in Linux, forgot anonymous vs file-backed, blamed the allocator for a page-fault problem, or ignored NUMA locality
one short memo mapping Module 2 reasoning into the concurrency (Module 3), file-systems (Module 4), and later distributed-systems modules

Completion Standard

You have completed Module 2 when all of these are true:

you can translate a virtual address by hand using a multi-level page table
you can state where the MMU, the TLB, the page-fault handler, and the allocator each sit in the translation and allocation path
you can simulate replacement policies and explain why OPT is only a yardstick
you can tell when a performance problem is really a TLB problem, a page-fault problem, an allocator problem, or a NUMA problem, using evidence
you can explain copy-on-write and mmap in plain language and predict their cost profile
you can pick between allocators for a real workload and defend the choice with at least one measurement
you can read /proc/meminfo, /proc/$pid/maps, and /proc/$pid/smaps without looking things up

If you can quote definitions but cannot debug a memory problem with perf and /proc output in front of you, the module is not complete.

Reading Policy

Concept pages are the main path.
Local book chunks are selective reinforcement, not a second syllabus.
Read only if stuck means try the concept page, self-check, and drill first.
Optional deep dive means additional nuance or exercise volume, not required progression.
External references are validated but optional; treat them as alternate exposition when a concept page does not land.

Suggested Weekly Flow

Day	Work
1	Concepts 1-3, draw an address-space diagram and do one paging math drill
2	Concepts 4-6, walk two multi-level translations by hand and estimate TLB miss cost
3	Concepts 7-9, run Practice 1 (fault and TLB measurement) to halfway
4	Concepts 10-12 and Practice 2 (page-replacement simulator)
5	Concepts 13-15 and Practice 3 (allocator comparison) halfway
6	Finish Practice 1-3, start Practice 4 katas
7	Practice 4 katas, quiz, mistake-log cleanup

Reference

If you need exact links into the local chunked books, use Reference and Selective Reading.

Build Your Own X — elective

The Operating System tutorial (Phil Opp's "Writing an OS in Rust") covers paging, frame allocation, and kernel heap in directly-runnable form. This is the deepest possible engagement with this module. See Build Your Own X overview.

Rich Learning Pages

Scope of This Module​

Before You Start​

Diagnostic Interpretation​

What This Module Is For​

Concept Map​

How To Use This Module​

Cluster 1: Address Spaces and Translation​

Cluster 2: Page Tables and the TLB​

Cluster 3: Page Replacement and Swapping​

Cluster 4: Allocators and Fragmentation​

Cluster 5: Practical Virtual Memory Effects​

Learning Objectives​

Outputs​

Completion Standard​

Reading Policy​

Suggested Weekly Flow​

Reference​

Rich Learning Pages​