Module 5: Network Protocols & Sockets: Case Studies
These case studies make the network inspectable: DNS, TCP, UDP, TLS, QUIC, sockets, backlog, and packet-level debugging.
Case Study 1: TCP Handshake Backlog Saturation
Scenario: During a traffic spike, clients time out before the application sees requests. The server process is healthy, but connection queues are saturated.
Source anchor: RFC 9293 TCP defines TCP connection establishment, state, and reliability behavior.
Module concepts: TCP handshake, listen backlog, SYN, accept queue, timeout.
Wrong Approach
"If the app logs no request, the network is fine."
Better Approach
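Trace connection setup in order, noting that the kernel completes the handshake without the application:
- client sends SYN
- server kernel replies with SYN-ACK
- client sends ACK, which completes the handshake
- kernel places the completed connection in the accept queue
- application calls accept() and takes the connection off the queue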
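The split between the kernel's handshake and the application's accept() can be observed directly. The following is a minimal sketch, assuming a loopback interface is available; the function name `handshake_completes_before_accept` and the backlog value are illustrative choices, not part of the module.

```python
import socket


def handshake_completes_before_accept(backlog: int = 8) -> bool:
    """Return True if connect() succeeds while the server has not yet called accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        server.listen(backlog)         # the kernel now queues completed handshakes
        client.settimeout(2.0)
        try:
            # SYN, SYN-ACK, and ACK are exchanged by the kernel here;
            # the server application has not run any accept() yet.
            client.connect(server.getsockname())
        except OSError:
            return False
        # Only now does the application take the connection off the accept queue.
        server.settimeout(2.0)
        conn, _peer = server.accept()
        conn.close()
        return True
    finally:
        client.close()
        server.close()
```

If the function returns True, the client saw an established connection before the application was involved, which is why an empty request log says little about the state of the network or the kernel queues.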
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| larger listen backlog | absorbs bursts | more kernel memory per queued connection; capped by system limits |
| faster accept path | drains queue sooner | app design work |
| load shedding before saturation | protects service | dropped connections |
Failure Mode
Incoming handshakes complete faster than the application can drain the accept path, so kernel queues saturate and clients time out before useful work starts.
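A toy model makes the saturation point concrete. This sketch assumes a fixed number of arrivals and accept() calls per tick and a bounded queue; `simulate_accept_queue` and its parameters are hypothetical, and a real kernel's overflow behavior (dropping, ignoring, or resetting) depends on the operating system and its settings.

```python
def simulate_accept_queue(arrivals_per_tick, accepts_per_tick, backlog, ticks):
    """Return (connections still queued, connections that overflowed the queue)."""
    queued = 0
    overflowed = 0
    for _ in range(ticks):
        # New completed handshakes arrive; the queue holds at most `backlog`.
        for _ in range(arrivals_per_tick):
            if queued < backlog:
                queued += 1
            else:
                overflowed += 1
        # The application drains as many connections as it can this tick.
        queued -= min(queued, accepts_per_tick)
    return queued, overflowed


# Arrivals outpace the accept path: the queue fills and then overflows.
print(simulate_accept_queue(10, 4, backlog=16, ticks=5))   # (12, 18)
# The accept path keeps up: nothing queues and nothing overflows.
print(simulate_accept_queue(10, 10, backlog=16, ticks=5))  # (0, 0)
```

In this model a larger backlog only postpones the first overflow; the steady-state loss is set by the gap between the arrival rate and the accept rate.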
Project / Capstone Connection
Use this for API servers, chat systems, or game backends that need to survive bursty connection spikes without blaming the wrong layer.
Required Artifact
Draw the handshake and identify where backlog saturation drops or delays work.
Case Study 2: UDP Metrics Lost Under Load
Scenario: A metrics agent sends UDP datagrams. Under network pressure, some metrics disappear. The dashboard has gaps.
Source anchor: UDP's minimal datagram service is specified in RFC 768.
Module concepts: UDP, datagram, loss, application-level reliability.
Wrong Approach
"UDP is faster TCP."
Better Approach
Use UDP only when loss is acceptable or handled:
- acceptable: high-volume sampled metrics
- not acceptable: payment events, audit logs
- mitigation: sequence numbers, batching, retries, or a move to TCP/QUIC
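The sequence-number mitigation can be sketched without any network at all. The example below assumes a hypothetical 4-byte big-endian sequence header in front of each metric payload; the header layout and the function names are illustrative, not a standard format.

```python
import struct
from typing import List, Tuple

# Hypothetical wire format: 4-byte unsigned big-endian sequence number, then payload.
HEADER = struct.Struct("!I")


def encode(seq: int, payload: bytes) -> bytes:
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


def decode(datagram: bytes) -> Tuple[int, bytes]:
    """Split a datagram into (sequence number, payload)."""
    (seq,) = HEADER.unpack_from(datagram)
    return seq, datagram[HEADER.size:]


def missing_sequences(received: List[bytes], first: int, last: int) -> List[int]:
    """List the sequence numbers in [first, last] that never arrived."""
    seen = {decode(datagram)[0] for datagram in received}
    return [seq for seq in range(first, last + 1) if seq not in seen]


sent = [encode(i, b"cpu=0.5") for i in range(1, 6)]
# Datagram 2 is lost and datagrams 4 and 5 arrive out of order.
received = [sent[0], sent[2], sent[4], sent[3]]
print(missing_sequences(received, 1, 5))  # [2]
```

With this in place the receiver can at least report which samples the dashboard is missing; whether to request a retry or accept the gap remains a per-message decision.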
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| raw UDP | low overhead | loss and reordering |
| UDP with app reliability | selective control | added protocol work |
| TCP or QUIC | built-in reliability | higher transport complexity |
Failure Mode
Datagrams drop under load or congestion, and the application has no sequencing or retry model to detect what the dashboard missed.
Project / Capstone Connection
This fits telemetry, multiplayer, and event-stream capstones where transport choice depends on message semantics rather than a generic speed claim.
Required Artifact
Write a transport decision: message type, loss tolerance, ordering, retry, and backpressure.
Case Study 3: TIME_WAIT Misdiagnosed As A Leak
Scenario: A server shows many TIME_WAIT sockets. A learner thinks sockets are leaking.
Source anchor: RFC 9293 covers TCP connection states and close behavior.
Module concepts: TCP close, TIME_WAIT, socket lifecycle, port reuse.
Wrong Approach
Kill processes to remove TIME_WAIT.
Better Approach
Understand close state:
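- active closer: the side that sends the first FIN enters TIME_WAIT
- purpose: absorb delayed segments from the old connection so they cannot be mistaken for data on a new connection using the same address and port pair, and allow the final ACK to be resent if the peer retransmits its FIN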
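The close sequence can be written as a small transition table to check which side ends up in TIME_WAIT. This is a simplified sketch of the common close paths described in RFC 9293: it omits simultaneous close (the CLOSING state), and the event names are hypothetical labels chosen for this example.

```python
# (current state, event) -> next state, for the two ordinary close paths only.
TRANSITIONS = {
    # Active closer: the application closes first and sends FIN.
    ("ESTABLISHED", "app_close"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    # Passive closer: the peer's FIN arrives first.
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "app_close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active = walk(["app_close", "recv_ack", "recv_fin", "timeout_2msl"])
passive = walk(["recv_fin", "app_close", "recv_ack"])
print(active)   # ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED
print(passive)  # ESTABLISHED, CLOSE_WAIT, LAST_ACK, CLOSED
```

Only the active path passes through TIME_WAIT, so a host that closes many short-lived connections first will show many such sockets even though nothing is leaking.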
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| treat TIME_WAIT as leak | immediate simple story | wrong remediation |
| understand active closer role | correct diagnosis | requires TCP state reasoning |
| tune connection reuse carefully | better port usage | misuse can break correctness |
Failure Mode
Normal TCP close behavior leaves many sockets in TIME_WAIT after active closes, which looks alarming until connection-state purpose is understood.
Project / Capstone Connection
Use this when load-testing clients, proxies, or API gateways that open and close many short-lived TCP connections.
Required Artifact
Draw TCP close states for client-active close and server-active close.
Case Study 4: Head-Of-Line Blocking And QUIC
Scenario: HTTP/2 multiplexes requests over one TCP connection. Packet loss stalls delivery of later stream data because TCP preserves byte order.
Source anchor: RFC 9000 QUIC specifies QUIC as a UDP-based secure transport with multiplexed streams.
Module concepts: TCP byte stream, HTTP/2 multiplexing, QUIC, stream-level loss recovery.
Wrong Approach
"Multiplexing removes all head-of-line blocking."
Better Approach
Distinguish layers:
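- HTTP/2 over TCP: application streams are multiplexed, but TCP still delivers one ordered byte stream, so a lost segment holds back every stream
- QUIC: streams are multiplexed over a UDP-based transport, and a lost packet delays only the streams whose data it carried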
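The layer distinction can be illustrated with a toy delivery model. The sketch assumes packets are sent one per time unit, exactly one packet is lost and arrives later at `retransmit_time`, and congestion control is ignored; `delivery_times` and its parameters are hypothetical names for this example.

```python
def delivery_times(streams, lost_index, retransmit_time, per_stream_ordering):
    """Return the time at which each packet's data can reach the application.

    streams[i] is the stream that packet i belongs to. With connection-wide
    ordering (TCP-like) a packet waits for every earlier packet; with
    per-stream ordering (QUIC-like) it waits only for earlier packets
    on its own stream.
    """
    arrival = [retransmit_time if i == lost_index else i
               for i in range(len(streams))]
    delivered = []
    for i, stream in enumerate(streams):
        blockers = [arrival[j] for j in range(i + 1)
                    if not per_stream_ordering or streams[j] == stream]
        delivered.append(max(blockers))
    return delivered


streams = ["A", "B", "A", "B"]
# Packet 0 (stream A) is lost and its retransmission arrives at time 10.
print(delivery_times(streams, 0, 10, per_stream_ordering=False))  # [10, 10, 10, 10]
print(delivery_times(streams, 0, 10, per_stream_ordering=True))   # [10, 1, 10, 3]
```

Under connection-wide ordering stream B waits for a loss that only affected stream A; under per-stream ordering B proceeds on time while A still waits for its retransmission.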
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| HTTP/1.1 multiple connections | simple mental model | extra connection overhead |
| HTTP/2 over TCP | efficient multiplexing | transport HOL remains |
| HTTP/3 over QUIC | stream-level recovery benefits | newer operational stack |
Failure Mode
One lost TCP segment stalls later bytes for all HTTP/2 streams on that connection, even though the application believes requests are multiplexed independently.
Project / Capstone Connection
This is useful for web delivery or media capstones choosing between HTTP/2 and HTTP/3 based on observed packet loss and latency.
Required Artifact
Compare HTTP/1.1, HTTP/2 over TCP, and HTTP/3/QUIC under one lost packet.
Case Study 5: Socket Server Architecture Choice
Scenario: A toy server handles one client at a time. Under 1,000 idle clients, it stalls. The team debates fork-per-connection, thread-per-connection, and event-driven epoll.
Source anchor: The epoll(7) manual page (man7.org) covers scalable readiness notification; socket(2) documents the sockets API.
Module concepts: iterative server, forked server, threaded server, event loop, fd readiness.
Wrong Approach
Use one architecture for every server.
Better Approach
Match workload:
- few CPU-heavy clients: process or thread pool
- many idle sockets: event loop with readiness notification
- blocking work: offload to workers
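The "many idle sockets" case can be sketched with Python's standard `selectors` module, which uses epoll where the platform provides it. This is a minimal single-threaded echo loop, assuming a loopback interface and small payloads; `serve_ready` and `demo` are hypothetical helper names, and a production server would also buffer writes and offload blocking work.

```python
import selectors
import socket


def serve_ready(sel, server):
    """Handle every socket that is ready right now; return the event count."""
    handled = 0
    for key, _mask in sel.select(timeout=0.5):
        sock = key.fileobj
        if sock is server:
            # Drain the accept queue until the kernel reports it is empty.
            while True:
                try:
                    conn, _peer = server.accept()
                except BlockingIOError:
                    break
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)  # fine for tiny payloads; real servers buffer writes
            else:
                sel.unregister(sock)
                sock.close()
        handled += 1
    return handled


def demo(idle_clients=50):
    """Connect idle clients plus one active client; return (connections, echo reply)."""
    total = idle_clients + 1
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(total)  # the OS may cap this value
    server.setblocking(False)
    sel = selectors.DefaultSelector()
    sel.register(server, selectors.EVENT_READ)
    clients = [socket.create_connection(server.getsockname(), timeout=2.0)
               for _ in range(total)]
    try:
        for _ in range(100):  # bounded wait until every connection is accepted
            if len(sel.get_map()) - 1 >= total:
                break
            serve_ready(sel, server)
        clients[0].sendall(b"ping")  # only one client is active; the rest stay idle
        for _ in range(100):
            if serve_ready(sel, server):
                break
        return len(sel.get_map()) - 1, clients[0].recv(4096)
    finally:
        for sock in clients + [k.fileobj for k in list(sel.get_map().values())]:
            sock.close()
        sel.close()


print(demo(50))
```

The idle connections cost one registered file descriptor each and no thread, and the loop only does work for the socket that actually became readable.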
Tradeoff Table
| Choice | Gain | Cost |
|---|---|---|
| fork per connection | isolation | high overhead |
| thread per connection | straightforward blocking model | scale limits |
| event loop with workers | handles many idle fds | more coordination |
Failure Mode
An architecture that works for a few active clients collapses when thousands of mostly idle sockets each hold a process or thread, so scheduling and per-connection memory overhead crowd out useful work.
Project / Capstone Connection
Apply this when choosing the first server architecture for messaging, gateway, or multiplayer capstones that mix idle connections with bursts of CPU work.
Required Artifact
Write a server architecture comparison for 1,000 idle clients and 100 CPU-heavy clients.
Source Map
| Source | Use it for |
|---|---|
| RFC 9293 TCP | TCP state, handshake, reliability |
| RFC 768 UDP | UDP datagram semantics |
| RFC 9000 QUIC | QUIC transport behavior |
| epoll(7) | readiness-based event loops |
| socket(2) | sockets API |
Completion Standard
- At least three artifacts are completed.
- At least one artifact includes TCP state transitions.
- At least one artifact compares TCP/UDP/QUIC or server architectures.