Module 3: Container Orchestration

Primary source: kubernetes.io/docs
Selective support: The Linux Command Line (for shell, process, and permission context), plus local semester book chunks only where they sharpen the Linux-side background

This guide is the primary teacher. You do not need to read the Kubernetes documentation cover to cover to complete this module. You do need a solid operational grasp of what a container actually is on Linux, how Kubernetes turns a declarative spec into a running workload, and how a real cluster is operated day to day.


Scope of This Module

This module is not "learn kubectl." It is where a container stops being a box of magic and becomes a small set of Linux kernel features, and where Kubernetes stops being a YAML pile and becomes a reconciliation loop over resources.

What it covers in depth:

  • what a container actually is: mnt, pid, net, uts, ipc, user, and cgroup namespaces plus cgroups v2 for resource limits
  • OCI images, layered filesystems, image manifests, and the runtime contract
  • the split between a high-level engine (Docker), a runtime daemon (containerd, CRI-O), and the OCI runtime (runc)
  • the Kubernetes control plane: kube-apiserver, etcd, kube-scheduler, kube-controller-manager, and the cloud-controller-manager
  • workload objects: Pods, ReplicaSets, Deployments, and how they compose
  • declarative reconciliation as observe -> diff -> act, not as imperative commands
  • the cluster networking model: every pod gets an IP; no NAT between pods
  • Services (ClusterIP, NodePort, LoadBalancer, headless), kube-proxy, and cluster DNS
  • Ingress and the Gateway API: L7 routing into the cluster
  • configuration injection: ConfigMaps, Secrets, and the env/volume patterns
  • storage: Volumes, PersistentVolumes, PersistentVolumeClaims, and StorageClasses
  • stateful workloads: StatefulSets and headless services for identity
  • day-2 operations: resource requests/limits, QoS classes, HPA, Pod Security Standards, RBAC, and a kubectl-based troubleshooting loop

What it deliberately does not try to finish here:

  • operator and CRD development in depth
  • multi-cluster federation
  • service mesh internals (covered in a later track)
  • a full production security review -- that is Module 5

This is an in-depth foundation module. If you can run kubectl apply -f but cannot explain what reconciles that spec or what isolates the workload from its host, you are not done.


Before You Start

Answer these closed-book before starting the main path:

  1. What is the kernel-level difference between "process in a VM" and "process in a container"?
  2. If you docker run an image, what pieces of state exist afterward: where is the image, where is the container filesystem, where is the running process in the process tree?
  3. What does "declarative" buy you over "imperative" in infrastructure?
  4. If you update a Deployment's image tag, what mechanism actually rolls out new pods and stops the old ones?
  5. Why does a pod not keep the same IP after a restart, and what absorbs that churn for clients?

Diagnostic Interpretation

4-5 solid answers

  • You are ready for the full path.

2-3 solid answers

  • Continue, but expect extra time in Clusters 1 and 2.

0-1 solid answers

  • Revisit Semester 5, Module 3 (processes, namespaces) and Semester 9, Module 1 (cloud compute, load balancers). Containers inherit their isolation story from Linux and their networking story from cloud-style L4/L7 routing.

What This Module Is For

Kubernetes is the dominant orchestrator for containerized server workloads. Later work repeatedly asks questions like:

  • where exactly is this application running, and how is it isolated from the host?
  • what does the cluster do when a node dies in the middle of a rollout?
  • how do two microservices find each other without anyone hard-coding an IP?
  • what happens to the database pod when its disk lives on network storage and the node is replaced?
  • when the cluster is misbehaving, what is the 30-second triage order?

This module builds the orchestration reasoning needed for:

  • CI/CD pipelines that deploy to Kubernetes (Module 4)
  • cloud security and observability work that depends on pod identity, RBAC, and logs (Module 5)
  • the capstone, where you must actually operate a real deployment

You are learning to reason about a running cluster as a reconciler, not as a shell.

Local Cluster First

Use kind, minikube, or k3d as the default execution environment for this module. Managed Kubernetes (EKS/GKE/AKS) is not the starting point; it is the optional sandbox after you can prove the manifest and operating behavior locally.

Before creating a managed cluster, show evidence that you can:

  • create and delete a local cluster from a script or README command;
  • apply the same Deployment, Service, Ingress/Gateway, ConfigMap, Secret, HPA, and RBAC manifests locally;
  • debug scheduling, image-pull, DNS, service-selector, ingress, and RBAC failures with kubectl describe, events, and logs;
  • map any local-only substitutions (hostPath volumes, local ingress controller, fake load balancer, local Postgres) to their managed-cloud equivalents;
  • tear the local cluster down cleanly, then recreate it from version-controlled files.
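The create/delete requirement above can be met with a very small script. The following is a minimal sketch, assuming kind is the chosen tool and is already installed; the cluster name module3 and the helper names are hypothetical, not something this guide prescribes.

```python
import shutil
import subprocess

CLUSTER_NAME = "module3"  # hypothetical name, chosen only for illustration


def kind_command(action: str, name: str = CLUSTER_NAME) -> list[str]:
    """Build the kind CLI invocation that creates or deletes a named cluster."""
    if action not in ("create", "delete"):
        raise ValueError(f"unsupported action: {action}")
    return ["kind", action, "cluster", "--name", name]


def run(action: str, name: str = CLUSTER_NAME) -> int:
    """Run the command if kind is on PATH and return its exit code."""
    command = kind_command(action, name)
    if shutil.which("kind") is None:
        # Do not guess at an install location; just report what would run.
        print("kind not found on PATH; would run:", " ".join(command))
        return 127
    return subprocess.run(command, check=False).returncode
```

Calling run("create") and later run("delete") from a version-controlled script is one way to show the cluster is disposable. The same shape should carry over to minikube or k3d with their own commands.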

Only move to EKS/GKE/AKS when the managed control plane itself is the lesson, and then apply the semester budget, alert, least-privilege, short-lived-resource, and teardown rules.


How To Use This Module

Work in order. Later clusters assume that the earlier pieces (what isolates a container, what a reconciler is, what a pod's IP means) are stable.

Cluster 1: What a Container Actually Is

Order | Concept | Type | Focus
1 | Namespaces and cgroups | PRIMARY | The seven namespace types and cgroups v2 as the real mechanism
2 | OCI images, layers, and the runtime | PRIMARY | What an image is on disk, layered filesystems, and the image-to-runtime contract
3 | Docker vs containerd vs CRI-O | SUPPORTING | Where the line is between engine, runtime daemon, OCI runtime, and CRI

Cluster mastery check: Can you explain, at the kernel level, what docker run alpine sh actually does to the host?
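One way to make the namespace part of that answer concrete is to look at the symlinks under /proc/<pid>/ns, where each link target has the form kind:[inode]. Two processes share a namespace when the inode numbers match. The sketch below is a hedged illustration, assuming a Linux host with /proc mounted; the function names are hypothetical.

```python
import os
import re

NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target: str) -> tuple[str, int]:
    """Split a /proc/<pid>/ns symlink target such as 'pid:[4026531836]'."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid: str = "self") -> dict[str, int]:
    """Return {entry name: namespace inode} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result: dict[str, int] = {}
    if not os.path.isdir(ns_dir):
        return result  # not Linux, or /proc is not mounted
    for entry in sorted(os.listdir(ns_dir)):
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except (OSError, ValueError):
            continue  # unreadable entry or unexpected link format
        result[entry] = inode
    return result


def shared_namespaces(a: dict[str, int], b: dict[str, int]) -> list[str]:
    """Entries where two processes point at the same namespace inode."""
    return sorted(name for name in a if name in b and a[name] == b[name])
```

Comparing namespaces_of() for your shell with namespaces_of(pid) for a containerized process (run as a user allowed to read that process's /proc entry) shows which namespaces the runtime actually unshared.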

Cluster 2: Kubernetes Foundations

Order | Concept | Type | Focus
4 | The control plane | PRIMARY | api-server, etcd, scheduler, controller-manager, plus the node-side kubelet
5 | Pods, ReplicaSets, Deployments | PRIMARY | What each object does and how they compose
6 | The declarative reconciliation loop | PRIMARY | Observe -> diff -> act, with spec vs status

Cluster mastery check: Can you trace a single kubectl apply through the api-server, etcd, scheduler, kubelet, and runtime?
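The observe -> diff -> act loop in this cluster can be sketched without a real cluster at all. The following is a toy model, not how any Kubernetes controller is implemented: the "cluster" is a plain list of pod names, the spec is a dict, and the naming scheme is invented for illustration.

```python
def diff(desired: int, observed: list[str]) -> int:
    """Positive: pods to create. Negative: pods to delete. Zero: converged."""
    return desired - len(observed)


def reconcile_once(spec: dict, pods: list[str], next_id: int) -> tuple[list[str], int]:
    """One observe -> diff -> act pass over an in-memory toy cluster."""
    observed = list(pods)                      # observe: read current state
    delta = diff(spec["replicas"], observed)   # diff: compare spec to status
    if delta > 0:                              # act: create the missing pods
        for _ in range(delta):
            observed.append(f"{spec['name']}-{next_id}")
            next_id += 1
    elif delta < 0:                            # act: remove the surplus pods
        observed = observed[:delta]
    return observed, next_id
```

The point of the sketch is that the controller never receives a "start three pods" command. It is handed a desired count and repeatedly closes the gap, so a crashed pod and a scale-up are handled by the same code path.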

Cluster 3: Networking and Services

Order | Concept | Type | Focus
7 | Cluster networking model | PRIMARY | Flat pod network, no NAT between pods, CNI plugins
8 | Services, kube-proxy, cluster DNS | PRIMARY | Service types, virtual IPs, EndpointSlices, CoreDNS
9 | Ingress and the Gateway API | PRIMARY | L7 ingress, ingress controllers, and the successor API

Cluster mastery check: Can you draw the packet path from an external client to a specific container in a specific pod, and name who runs each hop?
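One hop in that path is the step from a Service to the pods behind it, which is driven by label selectors. The sketch below is a simplified model of equality-based selector matching, assuming pods are plain dicts with hypothetical ip, ready, and labels keys; it ignores set-based selectors and everything kube-proxy does afterward.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Equality-based selector: every selector pair must appear in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector: dict[str, str], pods: list[dict]) -> list[str]:
    """IPs of ready pods whose labels satisfy the Service selector."""
    if not selector:
        # Simplification: a Service without a selector gets no automatic endpoints.
        return []
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"] and matches(selector, pod["labels"])
    ]
```

A selector with a typo matches nothing, which is the usual cause of a Service with no endpoints: the Service and the pods are both healthy, and they simply do not refer to each other.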

Cluster 4: Configuration and State

Order | Concept | Type | Focus
10 | ConfigMaps, Secrets, environment injection | PRIMARY | Decoupling config from image
11 | Volumes, PersistentVolumes, StorageClasses | PRIMARY | Ephemeral volumes, PV/PVC lifecycle, dynamic provisioning
12 | StatefulSets and headless services | PRIMARY | Stable identity, ordered rollout, per-pod storage

Cluster mastery check: Can you explain why a Deployment of Postgres is wrong and what a StatefulSet gives you instead?
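Part of the answer is that a StatefulSet gives each replica a predictable name, DNS record, and volume claim. The sketch below derives those names from the documented patterns (pod name is the StatefulSet name plus an ordinal, and per-pod DNS goes through the governing headless Service). The object names pg, pg-headless, db, and data used in the usage note are hypothetical, and cluster.local is assumed as the cluster domain.

```python
def pod_names(statefulset: str, replicas: int) -> list[str]:
    """Pods are named <statefulset>-<ordinal>, with ordinals starting at 0."""
    return [f"{statefulset}-{ordinal}" for ordinal in range(replicas)]


def pod_dns(statefulset: str, service: str, namespace: str, replicas: int,
            cluster_domain: str = "cluster.local") -> list[str]:
    """Per-pod DNS names served through the governing headless Service."""
    return [
        f"{name}.{service}.{namespace}.svc.{cluster_domain}"
        for name in pod_names(statefulset, replicas)
    ]


def pvc_names(claim_template: str, statefulset: str, replicas: int) -> list[str]:
    """Each pod gets its own claim: <volumeClaimTemplate name>-<pod name>."""
    return [f"{claim_template}-{name}" for name in pod_names(statefulset, replicas)]
```

For example, pod_dns("pg", "pg-headless", "db", 2) yields pg-0.pg-headless.db.svc.cluster.local and pg-1.pg-headless.db.svc.cluster.local. A Deployment offers none of this: its pods get generated names and can only share a single claim.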

Cluster 5: Operating a Cluster

Order | Concept | Type | Focus
13 | Resource requests/limits, QoS, HPA | PRIMARY | Scheduler math, eviction order, autoscaling
14 | Security contexts, pod security, RBAC | PRIMARY | Workload identity, PSS, role-based access
15 | Observability and kubectl workflow | SUPPORTING | A 30-second triage and a 5-minute debug loop

Cluster mastery check: Given a failing deployment in a cluster you have never seen before, can you produce a written triage plan and start executing it within 30 seconds?
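Eviction order in this cluster depends on the pod's QoS class, which Kubernetes derives from the requests and limits you set. The sketch below is a simplified classifier, assuming each container is a dict with optional requests and limits maps; it compares quantities as plain strings, so it would treat "500m" and "0.5" as different even though Kubernetes considers them equal.

```python
RESOURCES = ("cpu", "memory")


def qos_class(containers: list[dict]) -> str:
    """Classify a pod as Guaranteed, Burstable, or BestEffort."""
    any_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # A missing request defaults to the limit when a limit is set.
        requests = {**limits, **container.get("requests", {})}
        for resource in RESOURCES:
            if resource in requests or resource in limits:
                any_set = True
            # Guaranteed needs a limit for cpu and memory, each equal to its request.
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

Under node memory pressure, BestEffort pods are the first candidates for eviction and Guaranteed pods the last, which is why a pod with no requests at all is a risky default for anything you care about.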

Then work these practice pages:

Order | Practice path | Focus
1 | Container Fundamentals Lab | Namespaces, cgroups, and image layers at the shell
2 | Kubernetes Primitives Workshop | Pods, Deployments, reconciliation, and spec vs status
3 | Services and Storage Clinic | Services, DNS, Ingress, volumes, and StatefulSets
4 | K8s Katas | Focused YAML-writing drills

Take the Module Quiz after the concept and practice path. Use the Reference and Selective Reading page and the Learning Resources page only for targeted reinforcement.


Learning Objectives

By the end of this module you should be able to:

  1. Explain what isolates a container from the host, naming the seven namespace types and the role of cgroups v2.
  2. Describe the OCI image format and the runtime contract, and explain what each layer contributes.
  3. Distinguish Docker, containerd, CRI-O, runc, and the Kubernetes CRI, and say which piece talks to which.
  4. Draw the control plane, name each component, and trace a single kubectl apply end-to-end.
  5. Write a valid Deployment, Service, Ingress, ConfigMap, Secret, PV/PVC, and StatefulSet from memory.
  6. Explain the reconciliation loop in the explicit form observe -> diff -> act, using spec and status.
  7. Explain the pod network model and why every pod gets an IP, including the role of CNI and kube-proxy.
  8. Configure resource requests and limits, reason about QoS classes, and set up a basic HPA.
  9. Apply Pod Security Standards, write a minimal RBAC policy, and use securityContexts correctly.
  10. Produce a reproducible kubectl-based troubleshooting workflow for a broken deployment.
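For objective 8, it helps to know the arithmetic a basic HPA performs: the desired replica count is the current count scaled by the ratio of the observed metric to the target, rounded up, with no change when the ratio is within a tolerance of 1.0. The sketch below follows that documented formula; the 0.1 tolerance matches the documented default, while the min/max defaults and the function name are hypothetical.

```python
import math


def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """desired = ceil(current * current_metric / target_metric), then clamp."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to target: do not scale
    return max(min_replicas, min(max_replicas, math.ceil(current * ratio)))
```

With 4 replicas averaging 90% CPU utilization against a 60% target, the ratio is 1.5 and the result is 6 replicas. Utilization here is measured against the container's CPU request, which is one reason requests must be set before an HPA is useful.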

Outputs

  • a running local cluster (kind, minikube, or k3d) with at least one Deployment + Service + Ingress working end-to-end
  • a cluster notebook with at least 20 kubectl sessions and their written interpretation
  • one namespaces.md walkthrough where you use unshare, nsenter, lsns, and systemd-cgls to produce a minimal container by hand
  • one manifest library containing at least one valid Deployment, Service (ClusterIP and LoadBalancer), Ingress, ConfigMap, Secret, PVC, and StatefulSet
  • one troubleshooting runbook with at least 8 named failure modes (ImagePullBackOff, CrashLoopBackOff, Pending due to insufficient cpu, OOMKilled, CreateContainerConfigError, readiness probe failing, service has no endpoints, DNS lookup failing)
  • one RBAC + PSS memo explaining a real set of permissions for a CI service account
  • one short memo explaining which Module 3 tools carry into Modules 4 and 5
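Several of the runbook's named failure modes can be read straight off a pod's status object. The sketch below is one hedged way to do that, assuming the input is the parsed JSON of a single pod (for example from kubectl get pod <name> -o json); the function name and the returned labels are hypothetical, and it only covers the modes visible in pod status, not Service or DNS failures.

```python
WAITING_REASONS = {"ImagePullBackOff", "CrashLoopBackOff", "CreateContainerConfigError"}


def failure_mode(pod: dict) -> str:
    """Map a pod object (parsed JSON) to one of the runbook's failure names."""
    status = pod.get("status", {})
    statuses = status.get("containerStatuses", [])

    # OOMKilled first: a container that keeps getting OOM-killed usually also
    # shows CrashLoopBackOff, and the kill reason is the more specific signal.
    for cs in statuses:
        for key in ("state", "lastState"):
            if cs.get(key, {}).get("terminated", {}).get("reason") == "OOMKilled":
                return "OOMKilled"

    for cs in statuses:
        reason = cs.get("state", {}).get("waiting", {}).get("reason")
        if reason in WAITING_REASONS:
            return reason

    # An unschedulable pod carries the scheduler's explanation in a condition.
    for cond in status.get("conditions", []):
        if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
            return "Pending: " + cond.get("message", "unschedulable")

    if statuses and any(not cs.get("ready", False) for cs in statuses):
        return "not ready (check the readiness probe)"
    return "no known failure mode matched"
```

This does not replace kubectl describe and the event stream. It is only a way to check that each runbook entry names a signal you can actually locate in the API object.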

Completion Standard

You have completed Module 3 when all of these are true:

  • you can explain docker run in terms of kernel primitives
  • you can write a minimal Deployment + Service + Ingress from memory
  • you can read any YAML manifest and explain what controller will reconcile it
  • you can draw the packet path from client to container and name each hop
  • you can pick between a Deployment and a StatefulSet and justify the choice
  • you can set requests, limits, and an HPA without looking up the fields
  • you can write a minimal RBAC Role and RoleBinding without copy-paste
  • you can debug a broken pod in under five minutes using only kubectl: describe, logs, events, and exec

If you can produce YAML but cannot explain which controller reconciles it and what the kernel-level effect is, the module is not complete.


Reading Policy

  • Concept pages are the main path.
  • kubernetes.io/docs is the official reference. Open it at the specific page for a specific gap, not as background reading.
  • "See also (external)" links exist to resolve a specific confusion, not to reroute the whole learning flow.
  • Because this is an in-depth operational module, practice must include a real running cluster, not only YAML on paper.

Suggested Weekly Flow

Day | Work
1 | Concepts 1-3 and the Container Fundamentals Lab (by-hand container)
2 | Concepts 4-6 and a running kind/minikube cluster with a Deployment applied
3 | Concepts 7-9 and the Services and Storage Clinic up to Ingress
4 | Concepts 10-12 and finish the Services and Storage Clinic with a StatefulSet
5 | Concepts 13-15 and one full troubleshooting runbook entry
6 | Practice pages 1-2 cleanup and kubectl katas
7 | Practice pages 3-4, quiz, and cluster teardown

Reference

If you need exact links into kubernetes.io, use Reference and Selective Reading.


Rich Learning Pages

Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread