Services, kube-proxy, and Cluster DNS

What This Concept Is

A Service is a stable virtual address for a changing set of Pods. Because Pod IPs churn, clients never target Pods directly. They target a Service; something routes the connection to one of the selected Pods.

Four common Service types:

Type | What it gives you | Typical use
-----|-------------------|------------
ClusterIP (default) | A cluster-internal virtual IP. No external exposure. | East-west traffic between microservices
NodePort | Exposes the Service on a port on every node. | Low-effort external access, development
LoadBalancer | Provisions a cloud L4 load balancer pointing at the NodePort on each node. | External ingress when you do not want L7 features
Headless (clusterIP: None) | No virtual IP; DNS returns the Pod IPs directly. | StatefulSets, gossip protocols, custom clients
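
As a sketch of the last row, a headless Service is an ordinary Service manifest with clusterIP set to None. The name web-headless and the app: web selector are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-headless      # hypothetical name
spec:
  clusterIP: None         # headless: no virtual IP is allocated
  selector:
    app: web              # assumed Pod label
  ports:
    - name: http
      port: 80
      targetPort: 80
```

A DNS lookup for this Service returns one A record per ready Pod instead of a single virtual IP, so the client does its own backend selection.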

Two components actually implement a Service:

  • kube-proxy runs on every node and programs the kernel (iptables, ipvs, or nftables) so that traffic destined for the Service IP is DNAT'd to one of the Pod IPs.
  • CoreDNS runs as a Deployment inside the cluster and serves DNS names of the form service.namespace.svc.cluster.local based on Service and EndpointSlice objects.

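An EndpointSlice object, populated by the EndpointSlice controller, lists the Pod IPs behind a Service together with each endpoint's readiness. kube-proxy watches both Services and EndpointSlices to build its rules, and normally forwards new connections only to endpoints marked ready.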

Why It Matters Here

Every inter-service call in a cluster uses this path. "Service has no endpoints" is one of the most common outage causes: the Service exists and DNS resolves, but kube-proxy has no backend to forward to because no Pod's readiness probe passed, so clients get connection refused while the application logs show nothing. If you cannot trace this chain, you cannot debug it.

Concrete Example

A Service in front of the web Deployment from Cluster 2:

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
    - name: http
      port: 80
      targetPort: 80

From any pod in the same namespace:

curl http://web             # short name
curl http://web.default.svc.cluster.local # fully qualified

DNS resolves web to a ClusterIP (say 10.96.42.17). The Pod opens a TCP connection to 10.96.42.17:80. kube-proxy's rules on the node DNAT the connection to one of the Ready Pod IPs (e.g. 10.244.1.8:80). The container receives the packet in its network namespace.

Inspect:

kubectl get svc web
kubectl get endpointslices -l kubernetes.io/service-name=web
kubectl -n kube-system get pods -l k8s-app=kube-dns

If the endpointslices output is empty, or lists no endpoint marked ready, no Ready Pod matches the selector and the Service is effectively a black hole.
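
For orientation, an EndpointSlice for the web Service might look roughly like the sketch below. The object name, node name, and Pod name are hypothetical; the address reuses the example Pod IP from above.

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: web-abc12                       # hypothetical generated name
  namespace: default
  labels:
    kubernetes.io/service-name: web     # ties the slice to the Service
addressType: IPv4
ports:
  - name: http
    protocol: TCP
    port: 80                            # the resolved targetPort
endpoints:
  - addresses:
      - "10.244.1.8"
    conditions:
      ready: true                       # false while the readiness probe fails
    nodeName: node-1                    # hypothetical node
    targetRef:
      kind: Pod
      name: web-7d4b9c-xk2lp            # hypothetical Pod name
      namespace: default
```

When debugging, the conditions.ready field of each endpoint is the value to look at, for example with kubectl get endpointslices -o yaml.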

Common Confusion / Misconception

"targetPort is the Pod's port on the cluster network."

targetPort is the port inside the Pod's net namespace (i.e. the port the container listens on). port is the Service's own port (what clients dial). nodePort is only for NodePort/LoadBalancer types. All three can differ.
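
A minimal sketch with all three fields set to different values may make the distinction concrete. The Service name and the port numbers 8080 and 30080 are assumptions for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport      # hypothetical name
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - name: http
      port: 80            # Service port: what in-cluster clients dial on the ClusterIP
      targetPort: 8080    # port the container listens on inside the Pod (assumed)
      nodePort: 30080     # port opened on every node; default allowed range is 30000-32767
```

With this manifest, an in-cluster client dials web-nodeport:80, an external client dials any node's IP on 30080, and both end up at port 8080 in the Pod.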

A second confusion: "A Pod is automatically added to a Service when it matches the label selector." It is listed as a ready endpoint in the EndpointSlice only when it is Ready. A Pod that has reached Running but whose readiness probe has not yet succeeded is intentionally kept out of rotation. This is why a readiness probe that never returns success causes "Service has no endpoints."
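
The readiness gate can be sketched with a Deployment whose Pods carry the label the Service selects on. The image, replica count, and probe settings below are assumptions, not values from the original Deployment.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                # must match the Service selector
    spec:
      containers:
        - name: web
          image: nginx:1.27     # assumed image
          ports:
            - name: http
              containerPort: 80
          readinessProbe:
            httpGet:
              path: /           # a path that returns 404 keeps the Pod Running but not Ready
              port: http
            periodSeconds: 5
            failureThreshold: 3
```

An httpGet probe counts any status from 200 up to (but not including) 400 as success, so a 404 path is a simple way to reproduce the "no endpoints" symptom on purpose.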

A third confusion: "kube-proxy load-balances." For L4, yes -- kube-proxy in iptables mode does random selection and in ipvs mode supports several algorithms. But it does not do L7 load balancing, and it provides no session affinity by default (sessionAffinity defaults to None), so consecutive connections from one client can land on different Pods.
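
If connection-level stickiness is wanted, the Service API has an opt-in field for it. The following is a sketch under the assumption that the same web Service is being modified; the timeout shown is the API default.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  sessionAffinity: ClientIP       # default is None
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800       # API default, 3 hours
  ports:
    - name: http
      port: 80
      targetPort: 80
```

This is still L4 behavior: affinity is keyed on the client's source IP, not on cookies or HTTP headers.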

How To Use It

Draw the DNS-to-packet path end to end: client Pod, DNS lookup in CoreDNS, ClusterIP, kube-proxy DNAT on the node, backend Pod IP.

Debugging order:

  1. Does the Service exist? kubectl get svc
  2. Does it have endpoints? kubectl get endpointslices -l kubernetes.io/service-name=web
  3. Does DNS resolve? kubectl run -it --rm dns --image=nicolaka/netshoot -- dig web.default.svc.cluster.local
  4. Does the TCP connection succeed? curl -v from a debug pod to the ClusterIP directly.

kube-proxy Modes

kube-proxy has three modes, chosen per-cluster:

  • iptables (most common default): programs a chain of DNAT rules per Service; selection of a backend Pod is random.
  • ipvs: uses the kernel's IPVS virtual server; supports load-balancing algorithms (rr, lc, dh), scales to larger cluster sizes with less CPU per rule update.
  • nftables (newer): like iptables but on the kernel's nftables framework; gradually replacing iptables mode.

All three are L4 only. None of them provide TLS, retries, HTTP-aware routing, or per-request load balancing. For any of those you need an L7 proxy (service mesh sidecar, Ingress controller, Gateway implementation).
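
The mode is set in kube-proxy's configuration file. A minimal sketch, assuming ipvs with round robin is the desired setup (where this file lives, for example in a ConfigMap, depends on how the cluster was installed):

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"            # "iptables", "ipvs", or "nftables"
ipvs:
  scheduler: "rr"       # round robin; "lc" and "dh" are other schedulers named above
```

The ipvs block is only consulted when mode is ipvs; in the other modes backend selection is not configurable this way.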

Check Yourself

  1. What is the difference between port, targetPort, and nodePort?
  2. What makes a Pod appear in a Service's EndpointSlice?
  3. Which component programs the kernel to DNAT Service IPs to Pod IPs?

DNS Search Domains

The kubelet writes search-domain suffixes into each Pod's /etc/resolv.conf, and CoreDNS answers the resulting queries. A Pod in namespace prod typically has:

search prod.svc.cluster.local svc.cluster.local cluster.local

So curl http://api from a Pod in prod tries api.prod.svc.cluster.local first, then api.svc.cluster.local, then api.cluster.local, and finally the bare name api, which goes to the upstream resolver. Cross-namespace traffic needs at least api.other-ns: that name misses on the first search suffix and matches on the second as api.other-ns.svc.cluster.local. Many accidental outages during a refactor happen because code assumed the short name would still resolve from the new namespace.

Mini Drill or Application

Apply a Deployment and a ClusterIP Service. Intentionally break the readiness probe (set httpGet.path to a path that returns 404). Run:

kubectl get endpointslices -l kubernetes.io/service-name=<svc>
kubectl describe svc <svc>
curl http://<svc> # from a debug pod

Write down exactly what each of the three observations shows, and tie each back to the kube-proxy + readiness relationship.

externalTrafficPolicy and Client IPs

For NodePort and LoadBalancer Services, externalTrafficPolicy controls a tradeoff:

  • Cluster (default) -- any node accepts traffic and may forward it to a Pod on another node via SNAT. Even traffic balancing, but the client's real IP is lost.
  • Local -- only nodes that have a Ready Pod for this Service accept traffic; the client IP is preserved. External load balancers need health checks to skip empty nodes.

Picking the wrong mode is a common source of "why is every log line tagged with a node IP" surprises. Applications that must see the original client IP (for audit or rate limiting) should set Local; architects who want uniform traffic across nodes and have a reverse proxy stamping X-Forwarded-For tend to stay on Cluster.
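
A sketch of a LoadBalancer Service that preserves the client IP, assuming the same app: web Pods; the name web-public is hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-public                  # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local      # default is Cluster
  selector:
    app: web
  ports:
    - name: http
      port: 80
      targetPort: 80
```

With Local on a LoadBalancer Service, Kubernetes also allocates a healthCheckNodePort that the external load balancer can probe to find the nodes that currently host a Ready Pod.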

Debugging "Service has no endpoints": a Checklist

Keep this short list where you can see it.

  1. kubectl get svc <name> -- does it exist and have a selector?
  2. kubectl get endpointslices -l kubernetes.io/service-name=<svc> -- zero ready endpoints is the symptom you are hunting for.
  3. kubectl get pods -l <selector> --show-labels -- does any pod match?
  4. For each matching pod: kubectl get pod <p> -o jsonpath='{.status.conditions}' -- is Ready: true?
  5. If pods are Running but not Ready: kubectl describe pod <p> -- readiness probe output and events.
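  6. If pods are Ready but the Service still does not work: a named targetPort that is not declared on the container leaves the Pod out of the EndpointSlice; a numeric targetPort mismatch or wrong protocol leaves the endpoint listed but connections fail.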

Ninety percent of "Service doesn't work" cases fall out at step 4 (readiness probe) or step 6 (port typo).
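
The named-port case from step 6 can be sketched as a Service and a Pod that agree on the port name. The port name http-web, the Pod name, and the image are assumptions for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http-web      # a string here refers to a container port *name*
---
apiVersion: v1
kind: Pod
metadata:
  name: web-0                   # hypothetical Pod name
  labels:
    app: web
spec:
  containers:
    - name: web
      image: nginx:1.27         # assumed image
      ports:
        - name: http-web        # must match targetPort exactly
          containerPort: 80
          protocol: TCP         # must match the Service port's protocol
```

If the container declared the port as http instead of http-web, the Pod could be Ready and still not be usable as an endpoint for this Service port.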

Read This Only If Stuck