Containers, cgroups, and Scheduling Isolation

What This Concept Is

A container on Linux is a process tree confined by:

  • namespaces (PID, mount, network, UTS, IPC, user) -- control what it sees.
  • cgroups (control groups v2) -- control what resources it can consume.

For scheduling, the relevant cgroup v2 interface files (exposed by the cpu and cpuset controllers) are:

  • cpu.weight -- proportional CPU share within a hierarchy. Default 100, range 1-10000. Two sibling groups with weights 100 and 200 share contested CPU 1:2.
  • cpu.max -- absolute CPU bandwidth cap: "QUOTA PERIOD", both in microseconds. "50000 100000" means 50 ms of CPU time per 100 ms window -> half a core's worth, which may be spread across several CPUs.
  • cpuset.cpus -- which logical CPUs this group may run on (affinity-by-group).
  • cpu.pressure -- pressure-stall info: how much time tasks in this group spent stalled waiting for CPU.
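
To make the "QUOTA PERIOD" arithmetic concrete, here is a minimal sketch that turns a cpu.max line into "cores worth" of CPU. The function name cpu_max_to_cores is hypothetical; the only assumption about the file is the two-field format described above, with the literal quota "max" meaning no cap.

```python
from typing import Optional


def cpu_max_to_cores(line: str) -> Optional[float]:
    """Convert a cpu.max line ("QUOTA PERIOD", microseconds) to cores worth of CPU.

    Returns None when the quota is the literal string "max" (no cap).
    """
    quota, period_us = line.split()
    if quota == "max":
        return None
    return int(quota) / int(period_us)


if __name__ == "__main__":
    print(cpu_max_to_cores("50000 100000"))   # 0.5
    print(cpu_max_to_cores("200000 100000"))  # 2.0
    print(cpu_max_to_cores("max 100000"))     # None
```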

CFS extends its fair-share model to groups: at each level of the hierarchy, sibling groups compete for CPU in proportion to cpu.weight; within each group, individual tasks compete again under CFS. This is "hierarchical CFS" (CONFIG_FAIR_GROUP_SCHED).
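
The hierarchical picture can be sketched as a product of per-level shares. This is an idealized model, not kernel code: it assumes every listed entity is continuously runnable and ignores multi-CPU load-balancing effects. The function names and the weights in the example are illustrative.

```python
from fractions import Fraction
from typing import List, Tuple


def share_at_level(own_weight: int, busy_sibling_weights: List[int]) -> Fraction:
    """Fraction of the parent's CPU one entity gets when it and all siblings are busy."""
    return Fraction(own_weight, own_weight + sum(busy_sibling_weights))


def effective_share(levels: List[Tuple[int, List[int]]]) -> Fraction:
    """Multiply the per-level shares from the top of the hierarchy down to the task.

    Each element is (own weight, weights of busy siblings at that level).
    """
    share = Fraction(1)
    for own_weight, siblings in levels:
        share *= share_at_level(own_weight, siblings)
    return share


if __name__ == "__main__":
    # A group with weight 100 next to a busy sibling with weight 200,
    # containing two equally weighted busy tasks.
    print(effective_share([(100, [200]), (1, [1])]))  # 1/6
```

In this model the group gets 1/3 of the contested CPU and each of its two tasks gets half of that.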

Why It Matters Here

Almost every production workload runs in a container -- docker run, kubectl apply, or a systemd service with Slice=. The observable behavior of your scheduler is filtered through cgroup limits, so "my process is slow but CPU looks idle" is very often a cgroup throttling story rather than a kernel bug.

Concrete Example

Two containers on a 4-core host:

  • Container A: cpu.weight = 100, unlimited cpu.max.
  • Container B: cpu.weight = 300, cpu.max = "200000 100000" (2 cores worth).

Case 1 -- only A is busy, using 400% (all 4 cores). B has no work. Fine, no contention.

Case 2 -- both busy, each wanting 400%:

  • Raw weight ratio A:B = 100:300 = 1:3.
  • But B is capped at 200%. So B gets 200% (its cap), and A gets 200% (the remainder).
  • Final split: A: 200%, B: 200%, even though B's weight is 3x. The cap was the binding constraint, not the weight.

Case 3 -- both busy, caps removed:

  • Weight split: A: 100%, B: 300%. B gets three times A.

These three cases are where most "why is my pod throttled?" confusion lives. The behavior is completely determined by (weight, max) × (offered load).
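
The three cases can be reproduced with a small "water-filling" model: hand out capacity in proportion to weight, pin any group that hits its demand or cap, and redistribute what is left. This is a steady-state sketch under the assumption that every group can use all the CPUs it is given; it is not how the kernel computes anything, and the names (allocate, the dict keys) are hypothetical.

```python
from typing import Dict, Optional


def allocate(capacity: float, groups: Dict[str, dict]) -> Dict[str, float]:
    """Split `capacity` (in cores) among groups by weight, respecting demand and cap.

    groups maps name -> {"weight": int, "demand": float, "cap": Optional[float]}.
    """
    # The most a group can usefully receive: its demand, further limited by its cap.
    limit = {}
    for name, g in groups.items():
        cap: Optional[float] = g.get("cap")
        limit[name] = g["demand"] if cap is None else min(g["demand"], cap)

    alloc = {name: 0.0 for name in groups}
    active = {name for name in groups if limit[name] > 0}
    remaining = capacity
    eps = 1e-9

    while active and remaining > eps:
        total_weight = sum(groups[n]["weight"] for n in active)
        fair = {n: remaining * groups[n]["weight"] / total_weight for n in active}
        # Groups whose weighted share would meet or exceed their limit get pinned.
        saturated = {n for n in active if fair[n] >= limit[n] - eps}
        if not saturated:
            # Nobody is limited: the weighted split is final.
            for n in active:
                alloc[n] = fair[n]
            break
        for n in saturated:
            alloc[n] = limit[n]
            remaining -= limit[n]
        active -= saturated
    return alloc


if __name__ == "__main__":
    a = {"weight": 100, "demand": 4.0, "cap": None}
    b_capped = {"weight": 300, "demand": 4.0, "cap": 2.0}
    print(allocate(4.0, {"A": a, "B": dict(b_capped, demand=0.0)}))  # Case 1
    print(allocate(4.0, {"A": a, "B": b_capped}))                    # Case 2
    print(allocate(4.0, {"A": a, "B": dict(b_capped, cap=None)}))    # Case 3
```

Allocations are in cores, so 2.0 corresponds to 200% in the cases above.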

Kubernetes sets:

  • resources.requests.cpu -> cpu.weight (proportional share under contention)
  • resources.limits.cpu -> cpu.max (hard cap, enforced even with CPU idle)

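A pod with requests: 100m, limits: 500m gets cpu.max = "50000 100000" and a small cpu.weight. The exact weight depends on the runtime: the request is first converted to cgroup v1-style cpu.shares (100m -> 102), and the long-standing conversion cpu.weight = 1 + ((cpu.shares - 2) * 9999) / 262142 then yields 4. Newer runtimes may use a different mapping, so read the file on the node to confirm.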
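
As a rough illustration of the mapping, the sketch below converts millicores to cpu.shares, cpu.weight, and cpu.max. It assumes the linear shares-to-weight conversion that has long been used on cgroup v2 nodes and a 100 ms period; runtimes differ and newer ones may apply another formula, so treat the exact numbers as assumptions and check the files on a real node. The function names are hypothetical.

```python
def milli_cpu_to_shares(milli_cpu: int) -> int:
    """cgroup v1-style shares: 1024 per full CPU, with a floor of 2."""
    return max(2, milli_cpu * 1024 // 1000)


def shares_to_weight(shares: int) -> int:
    """Assumed linear map from shares [2, 262144] to cpu.weight [1, 10000]."""
    return 1 + ((shares - 2) * 9999) // 262142


def milli_cpu_to_cpu_max(milli_cpu: int, period_us: int = 100_000) -> str:
    """cpu.max string for a CPU limit, assuming a 1000 us minimum quota."""
    quota_us = max(1000, milli_cpu * period_us // 1000)
    return f"{quota_us} {period_us}"


if __name__ == "__main__":
    request_m, limit_m = 100, 500
    shares = milli_cpu_to_shares(request_m)
    print(shares, shares_to_weight(shares))  # 102 4
    print(milli_cpu_to_cpu_max(limit_m))     # 50000 100000
```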

Common Confusion / Misconception

"CPU limits only kick in under contention." False for cpu.max. cpu.max is a hard throttle: once the group has used its quota within the current period (100 ms by default), its tasks are parked until the next period begins, even if other cores are idle. This is the source of most "my container is slow but the host has headroom" incidents.
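
A toy model shows how much latency this adds. It assumes the work starts at a period boundary, parallelizes perfectly across its threads, and has idle cores available for every thread; real workloads are messier. The function name wall_time_ms and the example numbers are made up for illustration.

```python
from typing import Optional


def wall_time_ms(cpu_needed_ms: float, threads: int,
                 quota_ms: Optional[float], period_ms: float = 100.0) -> float:
    """Wall-clock time to burn `cpu_needed_ms` of CPU under a quota/period cap.

    quota_ms=None models an uncapped group.
    """
    if quota_ms is None:
        return cpu_needed_ms / threads
    # CPU time the group can consume in one period: the quota, or less if
    # there are too few threads to use it all.
    per_period = min(quota_ms, threads * period_ms)
    wall = 0.0
    remaining = cpu_needed_ms
    while remaining > per_period:
        remaining -= per_period  # quota exhausted, tasks parked until next period
        wall += period_ms
    return wall + remaining / threads


if __name__ == "__main__":
    # 200 ms of CPU work on 4 threads, cap of 50 ms per 100 ms period.
    print(wall_time_ms(200, 4, None))  # 50.0
    print(wall_time_ms(200, 4, 50))    # 312.5
```

In this model the capped run takes over six times longer than the uncapped one while the cores sit idle for most of each period.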

"requests == guarantee." cpu.weight (from requests) guarantees proportional share under contention, not an absolute minimum. On an idle host your pod can use more; under load it gets at least its share.

How To Use It

Debugging a "slow inside the container" complaint:

  1. cat /sys/fs/cgroup/<path>/cpu.max -> is there a hard cap?
  2. cat /sys/fs/cgroup/<path>/cpu.stat -> look at nr_throttled and throttled_usec -- did the kernel actually throttle, and for how long?
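  3. cat /sys/fs/cgroup/<path>/cpu.pressure -> nonzero avg10/avg60 values on the "some" line mean tasks wanted CPU but had to wait for it.
  4. perf sched record, then perf sched latency (inside the container if perf is permitted there, otherwise from the host) -> per-task wait times.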

If nr_throttled > 0 and latency is bad, raise the limit or widen the period. Do not "add CPU to the node" until this is ruled out.
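
The checks above are easy to script. The sketch below parses the text of cpu.stat and cpu.pressure and reports the fraction of periods in which the group was throttled. The sample contents are invented for illustration; on a real system you would read the files from the cgroup directory instead. The helper names are hypothetical.

```python
from typing import Dict

# Made-up sample contents, in the format the kernel uses for these files.
SAMPLE_CPU_STAT = """\
usage_usec 1200000
user_usec 900000
system_usec 300000
nr_periods 1000
nr_throttled 250
throttled_usec 4000000
"""
SAMPLE_CPU_PRESSURE = "some avg10=12.50 avg60=8.00 avg300=2.10 total=123456\n"


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse 'key value' lines from cpu.stat into a dict of integers."""
    return {key: int(value)
            for key, value in (line.split() for line in text.splitlines() if line.strip())}


def parse_pressure(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI lines such as 'some avg10=... avg60=... avg300=... total=...'."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *pairs = line.split()
        result[kind] = {k: float(v) for k, v in (p.split("=") for p in pairs)}
    return result


def throttled_fraction(stat: Dict[str, int]) -> float:
    """Share of enforcement periods in which the group hit its quota."""
    periods = stat.get("nr_periods", 0)
    return stat.get("nr_throttled", 0) / periods if periods else 0.0


if __name__ == "__main__":
    stat = parse_cpu_stat(SAMPLE_CPU_STAT)
    print(throttled_fraction(stat))                            # 0.25
    print(parse_pressure(SAMPLE_CPU_PRESSURE)["some"]["avg10"])  # 12.5
```

A high throttled fraction together with nonzero pressure points at the cap rather than at the node's capacity.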

Check Yourself

  1. What is the difference between cpu.weight and cpu.max in cgroup v2?
  2. Why can a container be throttled on a host with idle cores?
  3. How does Kubernetes' requests map to cgroup cpu.weight?

Mini Drill or Application

On a Linux host with cgroup v2 (most modern distros):

  1. Create two cgroups A and B under a parent test, set cpu.weight = 100 for A, 300 for B.
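  2. Start enough CPU-bound loops in each group to saturate every core (for example, one loop per core in each group); otherwise the groups never contend and the weights have no visible effect. Measure each group's CPU share.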
  3. Set A's cpu.max = "100000 100000" (one core) and retest.
  4. Record and explain the ratios, noting whether the cap or the weight was the binding constraint. Compare to the cases in the Concrete Example.
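
One way to set up the drill is sketched below. It assumes cgroup v2 is mounted at /sys/fs/cgroup, that you run it as root, and that the parent group test holds no processes of its own (child groups only get cpu.* files once "+cpu" is enabled in the parent's cgroup.subtree_control). The function names are hypothetical, and starting the loops and moving their PIDs into cgroup.procs is left to you.

```python
import os
from typing import List, Tuple


def drill_writes(root: str = "/sys/fs/cgroup") -> List[Tuple[str, str]]:
    """Ordered (file, value) writes for steps 1 and 2 of the drill."""
    parent = f"{root}/test"
    return [
        (f"{root}/cgroup.subtree_control", "+cpu"),    # expose cpu.* in test/
        (f"{parent}/cgroup.subtree_control", "+cpu"),  # expose cpu.* in test/A and test/B
        (f"{parent}/A/cpu.weight", "100"),
        (f"{parent}/B/cpu.weight", "300"),
    ]


def cap_write(root: str = "/sys/fs/cgroup") -> List[Tuple[str, str]]:
    """Step 3: cap group A at one core's worth of CPU."""
    return [(f"{root}/test/A/cpu.max", "100000 100000")]


def apply(writes: List[Tuple[str, str]]) -> None:
    """Create missing cgroup directories and write each value in order."""
    for path, value in writes:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(value)


if __name__ == "__main__":
    # Dry run: print the plan instead of touching the system.
    for path, value in drill_writes() + cap_write():
        print(f"{value!r} -> {path}")
```

To run it for real, call apply(drill_writes()), start the loops, write each loop's PID to test/A/cgroup.procs or test/B/cgroup.procs, and later call apply(cap_write()) for step 3.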

Read This Only If Stuck