Aside Data Structure Thepagetable

This page is a generated reference surface for selective reading. It exists to keep the learner apps guide-first while still preserving source access.

Learning objectives

  • Explain the main ideas and vocabulary in Aside Data Structure Thepagetable.
  • Work through the source examples for Aside Data Structure Thepagetable without depending on raw chunk order.
  • Use Aside Data Structure Thepagetable as selective reference when learner modules point back to Ostep.

Prerequisites

  • None curated yet.

Module targets

  • module-02-memory-management-virtual-memory

AI companion modes

  • Explain simply
  • Socratic tutor
  • Quiz me
  • Challenge my understanding
  • Diagnose my confusion
  • Generate extra practice
  • Revision mode
  • Connect forward / backward

Source-of-truth note

This unit is anchored to Ostep and the source chapter "Aside Data Structure Thepagetable". Use external resources only to clarify, extend, or modernize details without replacing the chapter's conceptual spine.

External enrichment

No chapter-specific enrichment resources are curated yet. Add them in the unit manifest when a source clearly improves learning.

Source provenance

  • Primary source: Ostep
  • Source chapter: Aside Data Structure Thepagetable
  • Raw source file: 084-aside-datastructure-thepagetable.md

Merged source

Aside Data Structure Thepagetable

ASIDE: DATA STRUCTURE -- THE PAGE TABLE

One of the most important data structures in the memory management subsystem of a modern OS is the page table. In general, a page table stores virtual-to-physical address translations, thus letting the system know where each page of an address space actually resides in physical memory. Because each address space requires such translations, in general there is one page table per process in the system. The exact structure of the page table is either determined by the hardware (older systems) or can be more flexibly managed by the OS (modern systems).
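To make the idea concrete, here is a minimal sketch of the simplest possible page table: a flat array with one entry per virtual page. The sizes (a 64KB address space with 1KB pages) and all names (`pte_t`, `page_table_t`, `pt_map`, `pt_translate`) are assumptions chosen for illustration, not a layout used by any particular hardware or OS. Real page table entries also carry bits such as protection and present flags, which are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for illustration: 64KB virtual address space, 1KB pages. */
#define PAGE_SIZE   1024u
#define OFFSET_BITS 10u   /* log2(PAGE_SIZE) */
#define NUM_PAGES   64u   /* 64KB / 1KB */

/* Hypothetical page table entry: a valid bit plus a physical frame number. */
typedef struct {
    bool     valid;
    uint32_t pfn;
} pte_t;

/* One page table per process: a plain array indexed by virtual page number. */
typedef struct {
    pte_t entries[NUM_PAGES];
} page_table_t;

/* Record that virtual page vpn resides in physical frame pfn. */
void pt_map(page_table_t *pt, uint32_t vpn, uint32_t pfn) {
    if (vpn >= NUM_PAGES) {
        return; /* ignore out-of-range pages in this sketch */
    }
    pt->entries[vpn].valid = true;
    pt->entries[vpn].pfn = pfn;
}

/* Translate a virtual address to a physical address.
 * Returns false if the address is out of range or the page is unmapped. */
bool pt_translate(const page_table_t *pt, uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr >> OFFSET_BITS;
    uint32_t offset = vaddr & (PAGE_SIZE - 1u);

    if (vpn >= NUM_PAGES || !pt->entries[vpn].valid) {
        return false;
    }
    *paddr = (pt->entries[vpn].pfn << OFFSET_BITS) | offset;
    return true;
}
```

A caller would zero-initialize a `page_table_t`, install mappings with `pt_map`, and then call `pt_translate` for each address it wants to resolve. The array is indexed directly by virtual page number, so a lookup is a single array access, at the cost of one entry for every page in the address space.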

prompt> gcc -o array array.c -Wall -O

prompt> ./array

Of course, to truly understand what memory accesses this code snippet (which simply initializes an array) will make, we'll have to know (or assume) a few more things. First, we'll have to disassemble the resulting binary (using objdump on Linux, or otool on a Mac) to see what assembly instructions are used to initialize the array in a loop. Here is the resulting assembly code:

0x1024 movl $0x0,(%edi,%eax,4)
0x1028 incl %eax
0x102c cmpl $0x03e8,%eax
0x1030 jne 0x1024

The code, if you know a little x86, is actually quite easy to understand [2].

The first instruction moves the value zero (shown as $0x0) into the virtual memory address of the location of the array; this address is computed by taking the contents of %edi and adding %eax multiplied by four to it.

Thus, %edi holds the base address of the array, whereas %eax holds the array index (i); we multiply by four because the array is an array of integers, each of size four bytes.

The second instruction increments the array index held in %eax, and the third instruction compares the contents of that register to the hex value 0x03e8, or decimal 1000. If the comparison shows that the two values are not yet equal (which is what the jne instruction tests), the fourth instruction jumps back to the top of the loop.
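The four instructions described above can be mirrored in C. This is a sketch of equivalent logic, not the original array.c: the names `ARRAY_LEN`, `element_address`, and `zero_array` are hypothetical, and the `do`/`while` shape is chosen only because it follows the instruction order (store, increment, compare, jump back).

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_LEN 1000u /* 0x03e8, the value compared against %eax */

/* Address written by movl $0x0,(%edi,%eax,4): base + index * 4.
 * base plays the role of %edi, index the role of %eax. */
uintptr_t element_address(uintptr_t base, uint32_t index) {
    return base + (uintptr_t)index * sizeof(int32_t);
}

/* C-level equivalent of the four-instruction loop. */
void zero_array(int32_t *array) {
    uint32_t i = 0;           /* array index, held in %eax */
    do {
        array[i] = 0;         /* movl $0x0,(%edi,%eax,4) */
        i++;                  /* incl %eax */
    } while (i != ARRAY_LEN); /* cmpl $0x03e8,%eax ; jne back to the store */
}
```

Because the comparison happens after the store, the loop body always runs at least once, and the store executes exactly 1000 times before the jump is no longer taken.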

To understand which memory accesses this instruction sequence makes (at both the virtual and physical levels), we'll have to assume something about where in virtual memory the code snippet and array are found, as well as the contents and location of the page table.

For this example, we assume a virtual address space of size 64KB (unrealistically small). We also assume a page size of 1KB.
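These two sizes fix how a virtual address splits into a virtual page number (VPN) and an offset: 64KB / 1KB gives 64 virtual pages, so a 16-bit address has 6 VPN bits and 10 offset bits. The sketch below works through that arithmetic; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define ADDRESS_SPACE_SIZE (64u * 1024u) /* 64KB, as assumed in the text */
#define PAGE_SIZE          (1u * 1024u)  /* 1KB, as assumed in the text */
#define NUM_VIRTUAL_PAGES  (ADDRESS_SPACE_SIZE / PAGE_SIZE) /* 64 pages */
#define OFFSET_BITS        10u           /* log2(1024) */
#define VPN_BITS           6u            /* log2(64) */

/* Upper bits of the virtual address select the page. */
uint32_t vpn_of(uint32_t vaddr) {
    return vaddr >> OFFSET_BITS;
}

/* Lower bits select the byte within that page. */
uint32_t offset_of(uint32_t vaddr) {
    return vaddr & (PAGE_SIZE - 1u);
}
```

Taking the first instruction address exactly as printed in the listing, 0x1024 (decimal 4132), `vpn_of` returns 4 and `offset_of` returns 36.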

[2] We are cheating a little bit here, assuming each instruction is four bytes in size for simplicity; in actuality, x86 instructions are variable-sized.

[Figure not preserved in extraction; surviving axis labels: PageTable[39], 1224, 1174]