Thrashing and Working-Set Theory

What This Concept Is

Thrashing is a state where the system spends more time servicing page faults than doing useful work. Throughput collapses, the CPUs sit mostly in iowait instead of running user code, and response time explodes.

Peter Denning's working-set model gives a vocabulary for when thrashing happens:

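  • W(t, tau) is the set of pages a process has referenced in the last tau seconds, that is, during the interval (t - tau, t]. (Denning measures tau in the process's own execution time, not wall-clock time.)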
  • The working-set size |W(t, tau)| is the amount of memory the process actually needs right now to run without constant faults.
  • A system survives as long as the sum of working sets fits in physical memory: sum_i |W_i(t, tau)| <= total frames.
  • Once that inequality breaks, every process's eviction victim is someone else's working-set member. Faults cascade.
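
The definition and the fit condition above can be sketched in a few lines of Python. This is a minimal illustration, not a measurement tool: the trace is a made-up list of page numbers, the window tau is counted in references instead of seconds (an assumption made for simplicity), and the names working_set and fits_in_memory are hypothetical.

```python
def working_set(trace, t, tau):
    """Pages referenced in the last tau references before position t.

    trace is a list of page numbers. Both t and tau are counted in
    references here (an assumption; the text above uses seconds).
    """
    start = max(0, t - tau)
    return set(trace[start:t])


def fits_in_memory(working_set_sizes, total_frames):
    """True when the sum of working-set sizes fits in physical memory."""
    return sum(working_set_sizes) <= total_frames


# Made-up trace: the process loops over pages 1-3, then moves to pages 7-8.
trace = [1, 2, 3, 1, 2, 3, 7, 8, 7, 8, 7, 8]

early = working_set(trace, t=6, tau=4)    # looks at trace[2:6]
late = working_set(trace, t=12, tau=4)    # looks at trace[8:12]

print(sorted(early), len(early))
print(sorted(late), len(late))
# Treat the two sets as belonging to two processes sharing 4 frames.
print(fits_in_memory([len(early), len(late)], total_frames=4))
```

The working set changes as the process moves between phases, which is why the fit condition has to hold at every moment and not only on average.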

The sharpness of the transition is what makes thrashing so recognizable: performance does not degrade gracefully. It falls off a cliff.

Why It Matters Here

Every real memory pressure problem is a variant of "working sets no longer fit." The vocabulary helps you answer:

  • why adding 10% more load suddenly doubled latency when previous increments cost nothing
  • why a coresident memory-heavy tenant took down another one
  • why scheduling more processes on a memory-tight system makes it slower
  • why disabling or tuning swap changes the thrashing regime (it does not remove the underlying problem, only how it manifests)

Linux has mechanisms to detect and react: PSI (pressure stall information), the OOM killer, and memory cgroup (memcg) limits, including the cgroup v2 memory controller. Under all of them, the working-set framing tells you what they are really doing.

Concrete Example

Cliff example. Ten processes, each with working-set size ~600 MiB, run on a 6 GiB machine. The total working set is ~5.9 GiB, just under RAM (ignoring the kernel's own memory for simplicity); the system runs fine. Add an eleventh process of 600 MiB: the total working set is now ~6.4 GiB. Pages are evicted before their owners are done with them, so each fault steals a page that another process will need again shortly. Fault rates explode, CPU time shifts to roughly 5% user and 95% iowait, and throughput collapses.
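
The cliff can be reproduced in a toy simulation. The sketch below is a deliberately simplified model, not how Linux reclaims memory: it assumes strict global LRU replacement, processes that sweep their working sets cyclically in round-robin order, and a scale of one simulated page per 10 MiB (so 60 pages per process and 614 frames for 6 GiB). The function name fault_rate and its parameters are hypothetical.

```python
from collections import OrderedDict


def fault_rate(num_procs, ws_pages, frames, sweeps=20):
    """Fraction of references that fault under strict global LRU.

    Each process touches its ws_pages pages in cyclic order, and the
    processes take turns one reference at a time (round-robin).
    """
    lru = OrderedDict()           # resident pages, least recent first
    faults = refs = 0
    for step in range(sweeps * ws_pages):
        page = step % ws_pages
        for pid in range(num_procs):
            key = (pid, page)
            refs += 1
            if key in lru:
                lru.move_to_end(key)          # hit: mark most recent
            else:
                faults += 1                   # miss: bring the page in
                lru[key] = True
                if len(lru) > frames:
                    lru.popitem(last=False)   # evict least recently used
    return faults / refs


# 1 simulated page ~ 10 MiB: 600 MiB -> 60 pages, 6 GiB -> 614 frames.
print(fault_rate(num_procs=10, ws_pages=60, frames=614))
print(fault_rate(num_procs=11, ws_pages=60, frames=614))
```

Under these assumptions the ten-process run pays only cold-start faults (5% of references over 20 sweeps), while the eleven-process run faults on every reference. Cyclic access is the worst case for LRU, so real workloads fall off less abruptly, but the shape of the transition is the same.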

Measurement view with vmstat 1. Columns to watch:

  • r (runnable processes) -- may stay roughly constant
  • b (blocked processes) -- rises as processes wait on disk
  • si/so (memory swapped in / out per second, in KiB by default) -- nonzero and sustained is bad
  • us/sy/wa -- wa climbs, us falls
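
For scripted monitoring, the columns above can be pulled out of vmstat output with a short parser. This is a minimal sketch: it reads column names from the header line instead of assuming fixed positions, and the helper names parse_vmstat, watch_columns, and sample_vmstat are hypothetical.

```python
import subprocess

WATCH = ("r", "b", "si", "so", "us", "sy", "wa")


def parse_vmstat(text):
    """Turn vmstat output into a list of {column: int} rows.

    Column names come from the header line, so the parser does not
    depend on the exact column set of a given vmstat version.
    """
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":                  # header: r b swpd free ...
            header = fields
        elif header and fields[0].isdigit() and len(fields) == len(header):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def watch_columns(row):
    """Keep only the columns discussed above."""
    return {name: row[name] for name in WATCH if name in row}


def sample_vmstat(interval=1, count=5):
    """Run vmstat and return parsed rows (requires vmstat on PATH)."""
    out = subprocess.run(
        ["vmstat", str(interval), str(count)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vmstat(out)
```

A typical use is `for row in sample_vmstat()[1:]: print(watch_columns(row))`. The first row is skipped because vmstat's first report is an average since boot, not a live sample.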

PSI. On Linux with /proc/pressure/memory:

some avg10=45.23 avg60=30.12 avg300=15.22 total=1234567890
full avg10=12.11 avg60=8.91 avg300=3.45 total=234567890

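some tracks the fraction of time at least one task is stalled on memory; full tracks the fraction of time all non-idle tasks are stalled at once. The avg10, avg60, and avg300 fields are percentages over the last 10, 60, and 300 seconds; total is the cumulative stall time in microseconds. A sustained some above 10-20% is often a warning signal; a sustained full above 0 means real stalls.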
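
The file format is simple enough to parse directly. The sketch below is one possible reading of it, using the sample lines shown above as input; the function names are hypothetical, and the thresholds in is_warning are the rule of thumb from this section (applied to avg60 as a stand-in for "sustained"), not values defined by the kernel.

```python
def parse_psi(text):
    """Parse the contents of a /proc/pressure/<resource> file.

    Returns {"some": {"avg10": float, ..., "total": int}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read and parse the live file (Linux with PSI enabled)."""
    with open(path) as f:
        return parse_psi(f.read())


def is_warning(psi, some_threshold=10.0):
    """Rule of thumb from the text: some >= ~10% or any full stall."""
    return psi["some"]["avg60"] >= some_threshold or psi["full"]["avg60"] > 0


sample = (
    "some avg10=45.23 avg60=30.12 avg300=15.22 total=1234567890\n"
    "full avg10=12.11 avg60=8.91 avg300=3.45 total=234567890\n"
)
psi = parse_psi(sample)
print(psi["some"]["avg10"], psi["full"]["avg10"], is_warning(psi))
```

On a real machine, replace the sample with read_memory_pressure() and poll it at an interval shorter than the averaging window you care about.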

Common Confusion / Misconception

"Thrashing means swapping." Swapping is a common symptom, but a system with swap disabled can still thrash on file-backed page reclaim (continually reading pages from files because the page cache is being stripped).

"Adding more processes should only slow the system linearly." Only until the working-set budget is exceeded. After that it is superlinear and often catastrophic.

"OOM killer is random." The OOM killer targets processes by a scoring heuristic; it is a last-resort response to thrashing, not a tuning knob. Its job is to restore the system to a state where the remaining working sets fit.

How To Use It

When you suspect memory pressure:

  1. Measure. vmstat 1, /proc/pressure/memory, sar -B, pidstat -r.
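  2. Identify whether it is swap thrashing or page-cache reclaim thrashing: sustained si/so points to swap, while heavy disk reads and major faults with si/so near zero, alongside a busy kswapd, point to page-cache reclaim.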
  3. Estimate working-set sizes per process: pmap -X, smem, or /proc/$pid/smaps (look at Referenced; Rss and Pss give an upper bound).
  4. Decide: can the machine fit them? If not, the options are: buy RAM, reduce the number of resident processes, reduce each process's working set (e.g., bounded caches), or move the workload off this box.

Thrashing is rarely solved by tuning alone. It is usually solved by reducing demand.

Check Yourself

  1. Write a one-sentence definition of the working set W(t, tau).
  2. Why does a thrashing machine look overloaded (high load average, high iowait) while doing almost no user-mode work?
  3. What is the difference between swapping and page-cache reclaim, and why can both produce thrashing?
  4. Why does adding swap sometimes delay an OOM kill but not fix thrashing?
  5. If all processes have a fixed working-set size, how do you decide the minimum amount of RAM needed to avoid thrashing?

Mini Drill or Application

  1. Estimate the working-set size of a Python program that builds a dict of 10 million entries, each a 100-byte string. Peak RSS? Steady-state working set (re-reading only recent entries)?
  2. Suppose RAM is 4 GiB and you have 5 processes, each with a 1 GiB working set. Predict qualitative throughput. Now repeat with 4 processes.
  3. Read /proc/pressure/memory on your dev machine under normal load. Record the three averages. What would you expect them to show during a stress --vm run that exceeds RAM?
  4. Give two reasons why an application designed for a strict memory budget (Java heap cap, Go GOMEMLIMIT, database buffer cap) is easier to reason about than one that lets the OS push back via faults.

Read This Only If Stuck