Inverted Page Tables and Their Trade-offs

What This Concept Is

A standard page table is indexed by virtual address: one table per process, sized by the virtual address space.

An inverted page table flips the relationship: one global table, indexed by physical frame number, with one entry per page frame in the machine. Each entry records which (process, virtual page) currently owns that frame, or that the frame is free.

Translation is no longer "use the VPN as an index." It is "search the inverted table for an entry matching (current_pid, VPN)." To make that tractable, the inverted table is usually fronted by a hash table that maps (pid, VPN) to a bucket in the inverted table.

Historical systems that used inverted tables: PowerPC, some IBM mainframes, UltraSPARC, Itanium.

Why It Matters Here

The inverted page table answers a real question: with very large virtual address spaces, multi-level tables per process still use nontrivial memory, and that memory scales with the number of processes times the size of each used subtree. The inverted table puts a strict cap: one table, sized by physical memory, shared across all processes.

The trade is:

Standard (per-process, forward) page tables cost memory proportional to each process's active address space, but translation is a simple indexed walk (fast, TLB-friendly).
Inverted page tables cost memory proportional to physical RAM (cheap when RAM is small relative to virtual spaces), but translation requires a hash lookup (slower on TLB miss, harder to hardware-accelerate).

Modern general-purpose CPUs picked the forward multi-level tree because the TLB already hides walk cost and because implementing the hash probe efficiently in hardware is harder than a fixed-depth walk. But the inverted idea is alive in hashed / clustered page tables for specific workloads.

Concrete Example

Imagine a machine with 256 MiB of physical memory and 4 KiB pages: 65,536 page frames. An inverted page table has 65,536 entries, each roughly 16 bytes: about 1 MiB total, regardless of how many processes run.

A forward 4-level table on the same machine running 100 processes, each with a modest address space, would eat several MiB in page-table memory alone.

Translation example. Process pid=42 touches VPN 0xABCDE. The MMU (with hash support) computes hash(42, 0xABCDE) mod N to find a bucket, follows a chain of inverted-table entries until one matches (42, 0xABCDE). That entry's index is the PFN.

Common Confusion / Misconception

"Inverted tables are always smaller." Only when physical memory is small relative to aggregate virtual memory. On modern servers with large RAM and only a few hundred processes, a forward tree per process can easily beat a flat inverted table in total memory.

"Hashed page tables and inverted page tables are the same thing." Close but not identical. A hashed page table is indexed by (pid, VPN) -> PFN through a hash table; the table's size is decoupled from physical memory and is a pure software choice. An inverted table strictly has one entry per physical frame. Many real implementations blend the two.

"Sharing pages is easier." Actually harder in pure inverted tables: two processes sharing a frame would want two logical mappings, but an inverted table has only one entry per frame. Implementations store a list of mappers per frame, reintroducing complexity.

How To Use It

Use the inverted-table viewpoint when you are reasoning about:

hashed page tables in Linux on PowerPC or IBM POWER
the per-frame reverse-map (rmap) in Linux, which is an inverted index layered on top of the forward tables so the page-reclamation code can find all mappers of a given frame when it needs to unmap them
any system where memory per process is bounded by the machine, not by the virtual address space

If you meet a data structure keyed on physical frame rather than virtual page, ask: is this the translation structure itself, or an inverted lookup used only for reclaim and sharing?

Check Yourself

Why does the inverted-table size depend on physical memory and not on the size of the virtual address space?
Why is translation typically slower with an inverted table, even with hashing?
Why is page sharing trickier to represent in a pure inverted table?
Why does Linux still keep a reverse-map (rmap) for each page even though it uses forward multi-level page tables?

Mini Drill or Application

Machine with 4 GiB RAM, 4 KiB pages. Roughly how many entries does a pure inverted table have? If each entry is 32 bytes, total table size?
Forward 4-level table on the same machine, with 200 processes each touching 100 MiB. Estimate page-table memory and compare.
Under what access pattern is an inverted table's hash-probe translation cost worse than a 4-level forward walk?
Describe what the Linux kernel's rmap solves that cannot be solved by the forward page tables alone.

What This Concept Is​

Why It Matters Here​

Concrete Example​

Common Confusion / Misconception​

How To Use It​

Check Yourself​

Mini Drill or Application​

Read This Only If Stuck​