Skip to main content

MLFQ and How Linux CFS Builds on It

What This Concept Is

Multi-Level Feedback Queue (MLFQ) is a family of scheduling policies that learns about jobs from their behavior rather than requiring the kernel to know their burst in advance. The classic rules:

  1. Multiple priority queues Q0 (highest) to Qn (lowest).
  2. A job at a higher priority runs before any lower-priority job.
  3. Within the same priority, round-robin with a per-level quantum.
  4. A job that uses its whole quantum drops one level (it is "CPU-bound enough to wait").
  5. A job that yields before the quantum expires keeps its level (it is "I/O-bound, reward it").
  6. Periodically, boost all jobs back to Q0 to prevent starvation and adapt to changing behavior.

The result approximates SJF without knowing burst times: short, interactive jobs stay high priority, CPU-bound background jobs settle low.

Linux CFS (Completely Fair Scheduler) generalizes the same idea. Instead of discrete queues it tracks virtual runtime (vruntime) per task, always picks the task with smallest vruntime, and weights the accumulation by nice value. CFS is to MLFQ roughly what continuous probability is to discrete -- same question, finer resolution.

Why It Matters Here

MLFQ is the first scheduler you meet that is actually usable in a general-purpose OS. It also introduces the key insight: schedulers should infer workload type, not ask for it. CFS is the production incarnation of that insight; understanding MLFQ first makes CFS readable instead of magic.

Concrete Example

Three jobs arrive at t = 0. MLFQ with Q0 quantum = 10 ms, Q1 = 20, Q2 = 40, boost every 100 ms:

  • A: long CPU burst of 60 ms
  • B: interactive, runs 1 ms then sleeps for I/O, repeatedly
  • C: long CPU burst of 40 ms, arrives at t = 30

Trace:

t=0   Q0 has A,B. Pick A.
t=10 A used full 10ms quantum -> drop to Q1. Pick B (Q0).
t=11 B yielded before quantum -> stay in Q0, wait for I/O.
t=11 Only ready job is A in Q1. Run A with q=20ms.
t=30 C arrives in Q0. A has used 19 of 20 -> at t=31 A drops to Q2.
t=31 Run C (Q0). After 10ms -> drop to Q1.
...
t=100 BOOST: A,C both go back to Q0.

Without boost, once A drops to Q2 it could starve behind an endless stream of interactive tasks. With boost, it periodically gets promoted and makes progress.

Common Confusion / Misconception

"MLFQ is just priorities + round-robin." No -- the feedback is the point. Without rules 4 and 5 it is static priority. Without rule 6 (boost) it starves. Every one of the six rules is a lesson about failure modes of earlier attempts. (OSTEP introduces them one at a time exactly to make each failure felt.)

"CFS has a run queue like MLFQ." CFS has a red-black tree keyed on vruntime; "who runs next" is the leftmost node. Concept 14 digs into this; here the only thing to remember is "smallest vruntime wins, weighted by nice."

How To Use It

When reading or writing a scheduler:

  1. Identify what the scheduler observes (burst usage, I/O yield, sleep time).
  2. Identify how observations change state (priority drop, vruntime credit).
  3. Identify the anti-starvation mechanism (boost, minimum fair share, bandwidth control).
  4. Identify the picking rule (highest-non-empty-queue RR; leftmost red-black node).

These four questions frame every real scheduler.

Check Yourself

  1. What does rule 6 (boost) fix that rules 1-5 alone cannot?
  2. Why does MLFQ approximate SJF without knowing burst times?
  3. How does CFS's virtual-runtime picking relate to MLFQ's "drop a level" rule?

Mini Drill or Application

Given three jobs and MLFQ with two queues (q0=5, q1=10, boost at t=30), trace the schedule:

  • A arrives t=0, CPU burst 20.
  • B arrives t=2, repeating {run 2, sleep 3} forever until killed at t=40.
  • C arrives t=15, CPU burst 10.

Write in one paragraph how CFS would handle the same workload differently.

Read This Only If Stuck