Pods, ReplicaSets, Deployments
What This Concept Is
These three objects are the main way applications run in Kubernetes. They are layered:
- A Pod is the smallest deployable unit. It is one or more containers that share a network namespace (one IP), share IPC, and can share volumes. A Pod is scheduled to a single node and runs there until it dies. Pods are ephemeral.
- A ReplicaSet ensures that N Pods matching a label selector are running. It creates replacement Pods when existing ones disappear. It does not do rolling updates.
- A Deployment manages ReplicaSets. A single Deployment produces a new ReplicaSet per template change and implements the rolling update, pause, rollback, and history behaviors.
You rarely create Pods or ReplicaSets directly. You write a Deployment; Kubernetes creates a ReplicaSet; the ReplicaSet creates Pods.
Why It Matters Here
Almost every mistake in the rest of this module comes from losing track of which controller owns which state:
- "My pods keep getting recreated when I delete them." -- A ReplicaSet owns them.
- "My rollout is stuck at 50%." -- The Deployment's maxSurge/maxUnavailable are interacting with a bad readiness probe.
- "My two Deployments keep fighting over the same Pods." -- Two Deployments with overlapping selectors.
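One way to keep two Deployments from claiming the same Pods is to give each a selector the other cannot match. The sketch below assumes a second Deployment called web-canary and a label key called track; both names are hypothetical and only illustrate the idea.

```yaml
# Sketch: non-overlapping selectors for two Deployments.
# "web-canary" and the "track" label are hypothetical names for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web, track: stable }
  template:
    metadata:
      labels: { app: web, track: stable }
    spec:
      containers:
        - name: app
          image: nginx:1.27
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-canary
spec:
  replicas: 1
  selector:
    # With only app: web here, this selector would overlap the one above.
    matchLabels: { app: web, track: canary }
  template:
    metadata:
      labels: { app: web, track: canary }
    spec:
      containers:
        - name: app
          image: nginx:1.28
```

A Service that selects only app: web would still route to both sets of Pods, which is usually the point of a canary.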
Concrete Example
A minimal Deployment:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
            - name: app
              image: nginx:1.27
              ports:
                - containerPort: 80
              resources:
                requests: { cpu: "100m", memory: "128Mi" }
                limits: { cpu: "500m", memory: "256Mi" }
When applied, three kinds of objects exist:
- A Deployment named web.
- A ReplicaSet named web-<hash> owned by the Deployment.
- Three Pods named web-<hash>-<rand> owned by the ReplicaSet.
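The ownership chain is recorded on each object under metadata.ownerReferences. The excerpt below is a sketch of what one of the Pods might look like; the hash suffixes and the uid are placeholders, not values captured from a real cluster.

```yaml
# Illustrative excerpt of a Pod owned by a ReplicaSet.
# The name suffixes and the uid are placeholders, not real values.
apiVersion: v1
kind: Pod
metadata:
  name: web-5d4f8c7b9-abcde          # web-<hash>-<rand>
  labels:
    app: web
    pod-template-hash: 5d4f8c7b9     # added by the Deployment controller
  ownerReferences:
    - apiVersion: apps/v1
      kind: ReplicaSet
      name: web-5d4f8c7b9            # web-<hash>
      uid: 00000000-0000-0000-0000-000000000000
      controller: true               # this owner is the managing controller
      blockOwnerDeletion: true
```

The ReplicaSet carries a matching entry whose kind is Deployment and whose name is web.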
Change image: to nginx:1.28 and reapply. The Deployment creates a new ReplicaSet web-<hash2> with 0 Pods, then scales it up and the old one down in steps bounded by maxSurge and maxUnavailable (with the defaults and replicas: 3, that works out to one Pod at a time). If the new Pods' readiness probes fail, the old ReplicaSet stays alive and the rollout stalls -- that is a feature, not a bug.
Common Confusion / Misconception
"A Pod is a container."
A Pod is a namespace bundle that contains one or more containers. The canonical example is a main container plus a sidecar (logging agent, service mesh proxy). They share localhost, /tmp if you mount an emptyDir, and the lifetime of the Pod.
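A minimal sketch of the main-container-plus-sidecar pattern, written as a fragment of a Deployment's Pod template. The sidecar name, its image, and the log path are assumptions chosen for illustration.

```yaml
# Sketch: two containers in one Pod sharing an emptyDir volume.
# The sidecar name, image, and log path are assumptions for illustration.
spec:
  template:
    spec:
      containers:
        - name: app
          image: nginx:1.27
          volumeMounts:
            - name: logs
              mountPath: /var/log/nginx
        - name: log-tailer
          image: busybox:1.36
          # Follows the file the main container writes into the shared volume.
          command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
          volumeMounts:
            - name: logs
              mountPath: /var/log/nginx
      volumes:
        - name: logs
          emptyDir: {}               # lives exactly as long as the Pod
```

Both containers are scheduled together, share one IP, and are removed together when the Pod goes away.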
A second confusion: "Deleting a Pod restarts it." Deleting a Pod owned by a ReplicaSet produces a new Pod with a new name and a new IP. Nothing is "restarted"; the controller observes the count is too low and creates a replacement.
A third confusion: "Deployments provide high availability." They provide replica count and rolling updates. For true HA, you additionally need topologySpreadConstraints, PodDisruptionBudgets, and multi-zone nodes. A Deployment with replicas: 3 and all Pods on the same node is not HA.
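A sketch of the two additions named above, applied to the web Deployment. It assumes the nodes carry the well-known topology.kubernetes.io/zone label; the PodDisruptionBudget name web-pdb and the minAvailable value are illustrative choices.

```yaml
# Sketch: spread the web Pods across zones and guard them during voluntary disruptions.
# Assumes nodes are labeled with topology.kubernetes.io/zone.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                                # zones may differ by at most one Pod
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: app
          image: nginx:1.27
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                                     # hypothetical name
spec:
  minAvailable: 2                                   # node drains may not evict below this
  selector:
    matchLabels:
      app: web
```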
How To Use It
Write a Deployment for every stateless workload. Let the ReplicaSet be implicit. Never write a kind: Pod except for debugging (kubectl run --rm -it --image=...).
Inspect the chain:
    kubectl get deploy web
    kubectl get rs -l app=web
    kubectl get pods -l app=web -o wide
    kubectl rollout status deployment/web
    kubectl rollout history deployment/web
Update Strategies
A Deployment has two update strategies:
- RollingUpdate (default): create new Pods and delete old ones in an overlapping sequence governed by maxSurge (how many extra Pods can exist above the target replica count) and maxUnavailable (how many Pods can be unavailable at the same time). Both default to 25%.
- Recreate: delete all old Pods first, then create new ones. Accepts downtime. Useful when two versions cannot coexist (database schema incompatibility, legacy file locks).
    spec:
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 0
maxUnavailable: 0 with maxSurge: 1 gives a safer, slower rollout: always over-provision by one, never below the desired count.
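Percentage values are resolved against the replica count: maxSurge rounds up and maxUnavailable rounds down. The sketch below works that out for a hypothetical 10-replica Deployment and, as a second document, shows the Recreate alternative.

```yaml
# Sketch: how percentage bounds resolve for a hypothetical 10-replica Deployment.
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%          # 2.5 rounds up to 3: at most 13 Pods during the rollout
      maxUnavailable: 25%    # 2.5 rounds down to 2: at least 8 Pods stay available
---
# Alternative: accept downtime when two versions cannot coexist.
spec:
  strategy:
    type: Recreate
```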
Readiness probes interact critically with rolling updates: a new Pod is not considered "available" until its readiness probe passes. If readiness is always false, the rollout stops at the maxUnavailable boundary and waits -- which is exactly what you want.
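A sketch of readiness gating on the web Deployment. The probe path and timings are assumptions; progressDeadlineSeconds is shown at its default to make the waiting behavior visible.

```yaml
# Sketch: readiness gating for the web Deployment (probe path and timings are assumed).
spec:
  # Default value. Once exceeded, the rollout is reported as failed
  # (Progressing=False); it is not rolled back automatically.
  progressDeadlineSeconds: 600
  template:
    spec:
      containers:
        - name: app
          image: nginx:1.28
          readinessProbe:
            httpGet: { path: /, port: 80 }
            periodSeconds: 5
            failureThreshold: 2
```

If the new Pods never become ready, kubectl rollout status deployment/web eventually exits with an error while the old Pods keep serving.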
Check Yourself
- What is the difference between a Pod and a container?
- Which controller is responsible for replacing a deleted Pod?
- Why does changing a Deployment's image: create a new ReplicaSet instead of editing the old one?
Liveness, Readiness, and Startup Probes
Probes are defined per container. There are three types and they do different things:
- Liveness -- "should the kubelet restart this container?" Failing liveness kills the container (respecting restartPolicy).
- Readiness -- "should this Pod receive traffic?" Failing readiness removes the Pod from Service EndpointSlices but does not restart it.
- Startup -- "is the app still initializing?" Until the startup probe succeeds, liveness and readiness probes are held off; useful for slow-starting apps that would otherwise be killed during boot.
A worked example:
    livenessProbe:
      httpGet: { path: /healthz, port: 8080 }
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet: { path: /ready, port: 8080 }
      periodSeconds: 5
      failureThreshold: 2
    startupProbe:
      httpGet: { path: /started, port: 8080 }
      periodSeconds: 5
      failureThreshold: 30 # up to 150s to start
Mixing these up is the most common cause of false CrashLoopBackOff: a liveness probe starts before the app is ready, kills the container, and the cycle repeats. A startup probe prevents this.
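Probes do not have to be HTTP. The sketch below expresses the same liveness and readiness roles with a TCP check and an exec check; the port and the /tmp/ready file are assumptions for illustration.

```yaml
# Sketch: the same probe roles using non-HTTP handlers.
# The port and the /tmp/ready file are assumptions for illustration.
livenessProbe:
  tcpSocket: { port: 8080 }            # passes if a TCP connection can be opened
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  exec:
    command: ["cat", "/tmp/ready"]     # passes if the command exits 0
  periodSeconds: 5
  failureThreshold: 2
```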
Mini Drill or Application
Apply the Deployment above. Run kubectl get rs,pods -l app=web. Now change the image tag and reapply. Watch the ReplicaSets with kubectl get rs -l app=web -w and write down exactly how many Pods existed at each step of the rollout. Explain what maxSurge and maxUnavailable let the Deployment do.
Pod Lifecycle Phases You Must Recognize
Every Pod moves through a small set of phases that the API server reports in status.phase:
| Phase | Meaning | Typical debug move |
|---|---|---|
| Pending | Accepted, not yet scheduled or not yet fully launched | kubectl describe -> scheduling events / image pull |
| Running | At least one container is running (or starting/restarting) | check ready sub-status and probe events |
| Succeeded | All containers terminated with exit 0 and will not restart | only meaningful for Jobs and batch work |
| Failed | All containers terminated; at least one with non-zero exit | look at Last State + exitCode |
| Unknown | Node communication lost | likely kubelet/network failure -- node-side logs |
Containers inside the Pod have their own lifecycle (Waiting, Running, Terminated) with reasons like CrashLoopBackOff, ImagePullBackOff, OOMKilled. Most "why is my Pod stuck" questions resolve into a phase + container reason combination.
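A sketch of how a phase and a container reason appear together in a Pod's status, in the shape returned by kubectl get pod <name> -o yaml. The values are made up for illustration and were not captured from a real cluster.

```yaml
# Illustrative status excerpt; all values are made up.
status:
  phase: Running                     # the Pod phase stays Running while a container crash-loops
  containerStatuses:
    - name: app
      ready: false
      restartCount: 7
      state:
        waiting:
          reason: CrashLoopBackOff   # current container state
      lastState:
        terminated:
          reason: OOMKilled          # why the previous run ended
          exitCode: 137
```

Reading state and lastState together usually answers both "what is it doing now" and "why did it die last time".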
Read This Only If Stuck
- Linux Command Line: Sending signals to processes with kill -- Pod termination uses SIGTERM then SIGKILL after terminationGracePeriodSeconds.
- Linux Command Line: Viewing processes dynamically with top -- kubectl top is the cluster analogue of this workflow.
- Kubernetes: Pods -- canonical reference for the Pod object and shared namespace semantics.
- Kubernetes: Pod Lifecycle -- phases, conditions, probes, and termination order in detail.
- Kubernetes: Deployments -- maxSurge, maxUnavailable, revision history, rollout strategies.
- Kubernetes: ReplicaSets -- why you rarely write one directly and what it actually does.
- Kubernetes: Configure Liveness, Readiness, and Startup Probes -- probe types, HTTP vs exec vs TCP, timing parameters.
- Kubernetes: Pod Disruption Budgets -- how a Deployment gets actual HA during voluntary disruptions.
- Kubernetes: Well-Known Labels, Annotations, and Taints -- canonical keys controllers and schedulers watch for.