Physical vs Virtual Addresses: Why We Need Translation

What This Concept Is

Every memory access in a modern user program uses a virtual address. That address does not directly name a byte of DRAM. It is translated, on every access, into a physical address by the memory-management unit (MMU) using tables the OS maintains.

Translation exists so the OS can provide three things that a raw physical-memory model cannot:

Isolation. One process cannot even name memory that belongs to another.
Relocation. The kernel can move a process's pages around without the process knowing.
Over-commitment and abstraction. A process sees a large, clean, private address space even if physical memory is small, fragmented, or shared.

The OS never lets unprivileged code emit a physical address to the bus. Every load and store goes through translation first.

Why It Matters Here

Almost every later idea in this module is a specialization of translation:

page tables are the data structure that makes translation work
the TLB is the cache that makes translation fast
page faults are the escape hatch when translation cannot complete
mmap and copy-on-write are features the OS can cheaply offer because translation is already there
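
As a rough illustration of the last item, the sketch below maps two anonymous pages, one shared and one private, then forks. It assumes a Unix-like system where Python's os.fork and mmap.MAP_ANONYMOUS are available; the function name shared_vs_private is made up for this example.

```python
import mmap
import os


def shared_vs_private():
    """Fork once and report what the parent sees in each mapping afterwards."""
    # Shared mapping: parent and child translate to the same physical page.
    shared = mmap.mmap(-1, mmap.PAGESIZE,
                       flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)
    # Private mapping: a write after fork gives the writer its own copy.
    private = mmap.mmap(-1, mmap.PAGESIZE,
                        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    shared[0:1] = b"P"
    private[0:1] = b"P"

    pid = os.fork()
    if pid == 0:
        # Child: write to both mappings, then exit without cleanup handlers.
        shared[0:1] = b"C"
        private[0:1] = b"C"
        os._exit(0)

    os.waitpid(pid, 0)
    result = (shared[0:1], private[0:1])
    shared.close()
    private.close()
    return result
```

The parent should see the child's write in the shared page but not in the private one. In both cases the kernel only had to arrange which physical page each process's translation points at.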

If you think of memory as a flat array the program talks to directly, the rest of the module will not make sense. You are looking at an abstraction with hardware cost and software control.

Concrete Example

In a Linux process on x86-64, a printf call might load from virtual address 0x00007f1b4a23c080. Assuming 4 KiB pages, the MMU splits that into a virtual page number (0x7f1b4a23c) and a 12-bit offset (0x080), walks the per-process page table in DRAM, finds a page frame number such as 0x1a09c, and emits the physical address 0x1a09c080 on the bus. The process never sees the physical address.
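
The arithmetic in that example can be checked with a few lines of Python. This is a minimal sketch that assumes 4 KiB pages (a 12-bit offset); the helper names split_vaddr and make_paddr are illustrative and not part of any real API.

```python
PAGE_SHIFT = 12                       # assumption: 4 KiB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # low 12 bits select the byte in the page


def split_vaddr(vaddr):
    """Return (virtual page number, offset within the page)."""
    return vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK


def make_paddr(frame_number, offset):
    """Combine a page frame number with the unchanged page offset."""
    return (frame_number << PAGE_SHIFT) | offset


vpn, offset = split_vaddr(0x00007F1B4A23C080)
paddr = make_paddr(0x1A09C, offset)
print(hex(vpn), hex(offset), hex(paddr))
```

Only the page number is translated. The offset passes through untouched, which is why the low three hex digits of the virtual and physical addresses match.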

Two processes can both hold a char * with the value 0x7ffdc0000000 and point at completely different physical bytes. A debugger printing pointers in one process tells you nothing about the other.
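
One way to observe this is to fork a process and compare a buffer's address and contents on both sides. The sketch below assumes a Unix-like system with os.fork; the function name same_address_different_data is hypothetical, and ctypes is used only to get at a raw address from Python.

```python
import ctypes
import os


def same_address_different_data():
    """Return (parent_addr, parent_value, child_addr, child_value)."""
    buf = ctypes.c_int(1)
    addr = ctypes.addressof(buf)      # a virtual address in this process

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: same virtual address, but the write lands in the child's
        # own copy of the page, not the parent's.
        os.close(read_fd)
        buf.value = 2
        report = f"{ctypes.addressof(buf)} {buf.value}"
        os.write(write_fd, report.encode())
        os._exit(0)

    # Parent: collect the child's report, then look at its own copy.
    os.close(write_fd)
    child_addr, child_value = map(int, os.read(read_fd, 64).split())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return addr, buf.value, child_addr, child_value
```

Both processes report the same numeric address, yet the parent still reads 1 while the child reads 2. The pointer value alone does not identify the data.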

When the kernel decides to swap one process's page out to disk and bring another's in, the virtual addresses inside each process stay the same. Only the translations change.

Common Confusion / Misconception

"Virtual memory means swapping to disk." That is one consequence, not the definition. A system with enough RAM to never swap still uses virtual memory, because it still uses translation for isolation and relocation.

"The pointer I see in the debugger is the real address." Only if you are running without an MMU or with translation disabled (e.g., bare-metal firmware). On mainstream systems even kernel pointers are virtual addresses. In any userspace process, every pointer you see is a virtual address that means nothing outside that process.

"Translation must be slow if it happens on every access." It would be, except the TLB caches translations so that common cases skip the page-table walk. Cluster 2 covers that.

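The effect of the TLB can be sketched with a toy model. This is a simplified simulation, not how any real MMU is implemented: ToyMMU, its dict-based page table, and the walks counter are all invented for illustration, and 4 KiB pages are assumed.

```python
PAGE_SHIFT = 12                       # assumption: 4 KiB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class ToyMMU:
    """Toy translator: a dict stands in for the page table, another for the TLB."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> frame number
        self.tlb = {}                 # cached translations
        self.walks = 0                # how many times the page table was consulted

    def translate(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK
        frame = self.tlb.get(vpn)
        if frame is None:
            # TLB miss: do the (expensive) walk, then cache the result.
            self.walks += 1
            frame = self.page_table[vpn]   # KeyError stands in for a page fault
            self.tlb[vpn] = frame
        return (frame << PAGE_SHIFT) | offset


def count_walks(num_accesses=1024):
    """Touch num_accesses consecutive bytes starting at the base of one page."""
    mmu = ToyMMU({0x10: 0x2A})
    base = 0x10 << PAGE_SHIFT
    for i in range(num_accesses):
        mmu.translate(base + i)
    return mmu.walks
```

In this model, 1024 consecutive byte accesses within one page cost a single walk; every later access to that page is served from the cached translation.
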
How To Use It

Whenever you look at a memory-related bug or performance surprise:

Ask whose address space the address you are looking at lives in.
Ask whether the address is virtual or physical (in userland code, it is always virtual).
Ask what the mapping says: is this address mapped at all, mapped read-only, mapped to a file, mapped but not yet faulted in?
Only after those three questions should you reason about values at that address.
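
On Linux, the third question can be answered for your own process by reading /proc/self/maps. The sketch below is Linux-specific and assumes the usual "start-end perms offset dev inode path" line layout; find_mapping is a hypothetical helper, not a standard library function.

```python
import ctypes


def find_mapping(addr, maps_path="/proc/self/maps"):
    """Return (start, end, perms, path) for the mapping containing addr, or None."""
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split(maxsplit=5)
            start_text, end_text = fields[0].split("-")
            start, end = int(start_text, 16), int(end_text, 16)
            if start <= addr < end:
                # The path column is empty for anonymous mappings.
                path = fields[5].strip() if len(fields) > 5 else ""
                return start, end, fields[1], path
    return None  # the kernel has no translation set up for this address


buf = ctypes.c_int(7)
print(find_mapping(ctypes.addressof(buf)))   # a readable, writable mapping
print(find_mapping(0))                       # address 0 is normally unmapped
```

A result of None corresponds to "not mapped at all"; the perms field (for example r--p versus rw-p) answers the read-only question, and a non-empty path indicates a file-backed mapping. This listing does not show whether a mapped page has been faulted in yet.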

Pointer arithmetic, memcpy, out-of-bounds reads, and cache behavior all live downstream of whether the OS even agreed to translate the address.

Check Yourself

Name three services the OS gives you that become impossible if programs use raw physical addresses.
Why do two processes showing the same pointer value in gdb not imply they are looking at the same data?
What does the kernel have to keep per-process to make translation work?

Mini Drill or Application

For each situation, write one sentence naming the virtual and physical objects involved:

A process calls malloc(4096) and writes one byte at the returned pointer.
Two processes forked from the same parent both read the same element of a large array inherited from the parent.
A process calls mmap(MAP_SHARED) on a file and another process does the same on the same file.
A process runs off the end of its stack into unmapped memory.
