Build Your Own Web Server (HTTP from scratch)
"HTTP is a text protocol over TCP. That's it. Everything else is convention." -- every web-protocols class
A web server is the cleanest possible introduction to API design, networking, concurrency models, and protocol design -- all at once. By the end you have an HTTP/1.1 server that serves static files, handles dynamic endpoints, and supports keep-alive connections.
1. Overview & motivation
A web server's job:
- Listen on a TCP port.
- Accept connections.
- Parse HTTP request: method, path, headers, body.
- Route to a handler (static file, dynamic endpoint).
- Write HTTP response: status, headers, body.
- Optionally keep the connection open for the next request.
What you can only learn by building one:
- Why HTTP is "just" text -- and why that simplicity made it the dominant protocol.
- Why concurrency models (thread-per-connection, event-loop, fiber/goroutine, async/await) are a fundamental design choice and how they trade off.
- Why keep-alive matters for performance (TCP handshake amortization).
- Why chunked transfer encoding exists.
- Why HTTP/2 and HTTP/3 were necessary upgrades.
2. Where this fits in the degree
- Phase: Architecture
- Semester: 7 (Architecture and DDD)
- Modules deepened: Module 4 (API design & contract evolution) -- HTTP is a contract. Module 1 (architecture fundamentals) -- concurrency model choices are architectural.
Cross-phase relevance:
- Sits on top of the Network Stack tutorial (TCP) -- though OS sockets are fine.
- Foundation for a Front-end Framework demo target.
3. Prerequisites
- A general-purpose language (C, Go, Rust, Node.js, Python all fine).
- TCP sockets at the API level (socket/bind/listen/accept).
- HTTP at the user level: you've used curl and a browser.
4. Theory & research
Required reading
- RFC 9110 -- HTTP Semantics (rfc-editor.org/rfc/rfc9110) and RFC 9112 -- HTTP/1.1 (rfc-editor.org/rfc/rfc9112). These are the current normative specs; they replace the old RFC 7230/7231/etc.
- MDN HTTP overview -- developer.mozilla.org/en-US/docs/Web/HTTP. The accessible reference.
Strongly recommended
- Roy Fielding, "Architectural Styles and the Design of Network-based Software Architectures" (PhD thesis, 2000) -- the REST chapter. Free PDF. Read once.
- Daniel Stenberg, "HTTP/3 Explained" -- http3-explained.haxx.se. Free book by the cURL maintainer.
Sources for concurrency patterns
- Dan Kegel, "The C10K problem" (kegel.com/c10k.html) -- historical. Why event loops and threads matter.
- "Node.js Design Patterns" (Casciaro & Mammino) -- chapters on async patterns.
5. Curated tutorial list (from BYO-X)
- C#: Writing a Web Server from Scratch
- Node.js: Build Your Own Web Server From Scratch In JavaScript -- build-your-own.org/webserver -- comprehensive
- Node.js: Let's code a web server from scratch with NodeJS Streams
- Node.js: lets-build-express -- re-implementing Express
- PHP: Writing a webserver in pure PHP
- Python: A Simple Web Server -- Greg Wilson's chapter in "500 Lines or Less"
- Python: Let's Build A Web Server -- Ruslan Spivak -- recommended primary (3-part series that builds up to a concurrent server)
- Python: Web application from scratch
- Python: Building a basic HTTP Server from scratch in Python
- Python: Implementing a RESTful Web API with Python & Flask -- different angle (using a framework)
- Ruby: Building a simple websockets server from scratch in Ruby
6. Recommended primary path
Ruslan Spivak, "Let's Build A Web Server" (Python, 3 parts).
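- Part 1: Single-connection, one-request server.
- Part 2: A minimal WSGI server that can host framework applications.
- Part 3: Concurrent server using fork-per-connection, including zombie reaping with SIGCHLD.
This progression, from a one-request server to a forking concurrent server, is an excellent teaching artifact for how HTTP servers are structured. Read all three parts. Implement each. The select-based event loop in Milestone 5 goes one step beyond the series.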
After Spivak: James Smith, "Build Your Own Web Server" (Node.js) (build-your-own.org/webserver) is a full book covering streaming, HTTP/2, WebSocket. Substantial.
For Go: the standard net/http source is readable and exemplary.
7. Implementation milestones
Milestone 1: TCP echo server (warmup)
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # allow quick restarts
s.bind(('127.0.0.1', 8080))
s.listen(5)
while True:
    conn, addr = s.accept()
    data = conn.recv(4096)  # one read, echoed straight back
    conn.sendall(data)
    conn.close()
Evidence: telnet localhost 8080; whatever you type is echoed.
Milestone 2: Parse one HTTP request, return one response
Parse the request line, headers, body. Return a hardcoded 200 OK Hello, world!.
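def handle(conn):
    # Buffer until the blank line that ends the headers arrives.
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:  # client closed before sending a full request
            conn.close()
            return
        data += chunk
    headers_end = data.index(b"\r\n\r\n")
    request = data[:headers_end].decode("iso-8859-1")
    lines = request.split("\r\n")
    method, path, version = lines[0].split()
    # Header names are case-insensitive; whitespace after the colon is optional.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    body = b"Hello, world!"
    response = (
        f"HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(body)}\r\n"  # length in bytes, not characters
        f"Content-Type: text/plain\r\n"
        f"\r\n"
    ).encode() + body
    conn.sendall(response)
    conn.close()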
Evidence: curl http://localhost:8080/foo returns Hello, world! with correct headers. curl -v shows the protocol.
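Separating parsing from socket I/O makes the parser testable from raw byte fixtures. The following is a minimal sketch under that assumption; the names Request and parse_request are illustrative and not part of any source this tutorial cites, and the function expects the whole request to be in memory already.

```python
from typing import NamedTuple


class Request(NamedTuple):
    method: str
    path: str
    version: str
    headers: dict
    body: bytes


def parse_request(raw: bytes) -> Request:
    """Parse one complete HTTP/1.1 request held entirely in `raw`."""
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete request: no blank line after headers")
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {lines[0]!r}")
    method, path, version = parts
    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        # Reject lines without a colon or with whitespace around the name.
        if not colon or not name or name != name.strip():
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are case-insensitive; store them lowercased.
        headers[name.lower()] = value.strip()
    length = int(headers.get("content-length", "0"))
    body = rest[:length]
    if len(body) < length:
        raise ValueError("incomplete request: body shorter than Content-Length")
    return Request(method, path, version, headers, body)
```

A handler can then catch ValueError and answer 400 Bad Request instead of crashing, which is the behavior the malformed-input test later asks for.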
Milestone 3: Static file serving
Route based on path. For /index.html, return the file contents with Content-Type: text/html.
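def handle(conn):
    # ... parse request ...
    if method == "GET":
        try:
            # NOTE: still open to ../ traversal; see the security note below.
            with open(f"./www{path}", "rb") as f:
                body = f.read()
            # guess_mime is a helper you write, e.g. around mimetypes.guess_type
            status, content_type = "200 OK", guess_mime(path)
        except (FileNotFoundError, IsADirectoryError):
            body, status, content_type = b"Not Found", "404 Not Found", "text/plain"
    else:
        body, status, content_type = b"Method Not Allowed", "405 Method Not Allowed", "text/plain"
    # ... send response ...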
Security note: do not allow ../ path traversal. Resolve and check.
Evidence: Browser navigates to http://localhost:8080/index.html and renders the page.
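The "resolve and check" advice in the security note can be sketched roughly as follows. The names resolve_safe and WWW_ROOT are hypothetical, and the ./www root is assumed from the milestone code above; treat this as one reasonable approach, not a complete hardening.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote

WWW_ROOT = Path("./www")  # assumed document root, matching the milestone code


def resolve_safe(url_path: str, root: Path = WWW_ROOT) -> Optional[Path]:
    """Map a request path to a file under `root`, or None if it escapes."""
    # Drop any query string, then decode %2e%2e and friends before checking.
    decoded = unquote(url_path.split("?", 1)[0])
    if "\x00" in decoded:
        return None  # embedded NUL bytes are never a legitimate file name
    root = root.resolve()
    candidate = (root / decoded.lstrip("/")).resolve()
    # resolve() collapses ".." and follows symlinks, so the containment
    # check below sees the path that open() would actually use.
    if candidate != root and root not in candidate.parents:
        return None
    return candidate
```

The handler would call resolve_safe(path) and respond 404 (or 403) when it returns None, opening the returned Path instead of formatting the request path into a string.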
Milestone 4: Concurrency (process per connection)
Spawn a child process per accepted connection.
import os

while True:
    conn, addr = s.accept()
    if os.fork() == 0:
        s.close()       # child doesn't need the listening socket
        handle(conn)
        os._exit(0)
    conn.close()        # parent doesn't need this connection
    # need to reap zombies via SIGCHLD or wait
Evidence: Multiple parallel curl requests are served simultaneously. Compare wall time of 100 sequential vs 100 concurrent requests.
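The zombie-reaping comment in the accept loop can be made concrete with a SIGCHLD handler. This is a minimal sketch for POSIX systems; reap_children and install_reaper are illustrative names, not taken from any cited source.

```python
import os
import signal


def reap_children(signum=None, frame=None):
    """Collect every exited child without blocking the parent."""
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children exist at all
        if pid == 0:
            return  # children exist, but none have exited yet


def install_reaper():
    # Several children can exit while a single SIGCHLD is delivered,
    # which is why reap_children loops instead of calling waitpid once.
    signal.signal(signal.SIGCHLD, reap_children)
```

Calling install_reaper() once before the accept loop should be enough. Since Python 3.5, a system call interrupted by a signal whose handler returns normally is retried automatically, so accept() does not need its own EINTR handling.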
Milestone 5: Event loop (single-threaded async I/O)
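import selectors

sel = selectors.DefaultSelector()
s.setblocking(False)
# `accept` is a callback you define: it accepts the connection and
# registers the new socket with its own read callback.
sel.register(s, selectors.EVENT_READ, accept)
while True:
    events = sel.select()
    for key, _ in events:
        callback = key.data
        callback(key.fileobj)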
Per-connection state (parse buffer, response buffer) lives in a struct registered with the selector.
This is the model used by nginx, Node.js, Tokio.
Evidence: Single-threaded server handles 10k concurrent connections. Use wrk or ab to benchmark.
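To show where per-connection state fits, here is a fuller sketch of the same loop. It stores a Connection object as the selector's data instead of a callback, which is a design assumption of this example; the names Connection, make_listener, and serve_once are hypothetical, and the server answers every request with a fixed response and then closes.

```python
import selectors
import socket

BODY = b"Hello, world!"
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: " + str(len(BODY)).encode() + b"\r\n"
    b"Connection: close\r\n"
    b"\r\n" + BODY
)


class Connection:
    """Per-connection state: bytes read so far and bytes still to send."""

    def __init__(self, sock):
        self.sock = sock
        self.inbuf = b""
        self.outbuf = b""


def make_listener(sel, host="127.0.0.1", port=8080):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(128)
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, None)  # None marks the listener
    return listener


def close_conn(sel, conn):
    sel.unregister(conn.sock)
    conn.sock.close()


def accept_client(sel, listener):
    try:
        sock, _addr = listener.accept()
    except BlockingIOError:
        return  # the client went away before we accepted it
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, Connection(sock))


def on_readable(sel, conn):
    try:
        data = conn.sock.recv(4096)
    except BlockingIOError:
        return  # nothing to read after all; wait for the next event
    except ConnectionError:
        data = b""
    if not data:
        close_conn(sel, conn)
        return
    conn.inbuf += data
    if b"\r\n\r\n" in conn.inbuf:
        # Headers are complete: queue the response and wait for writability.
        conn.outbuf = RESPONSE
        sel.modify(conn.sock, selectors.EVENT_WRITE, conn)


def on_writable(sel, conn):
    try:
        sent = conn.sock.send(conn.outbuf)
    except BlockingIOError:
        return
    except ConnectionError:
        close_conn(sel, conn)
        return
    conn.outbuf = conn.outbuf[sent:]  # keep whatever the kernel did not take
    if not conn.outbuf:
        close_conn(sel, conn)


def serve_once(sel, timeout=None):
    """Run one iteration of the event loop."""
    for key, mask in sel.select(timeout):
        if key.data is None:
            accept_client(sel, key.fileobj)
        elif mask & selectors.EVENT_READ:
            on_readable(sel, key.data)
        elif mask & selectors.EVENT_WRITE:
            on_writable(sel, key.data)
```

A main program would create sel = selectors.DefaultSelector(), call make_listener(sel), and then call serve_once(sel) in a while True loop. Nothing in the handlers blocks, so one slow client cannot stall the others.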
Milestone 6: Keep-alive
By default, HTTP/1.1 connections are kept alive. After sending a response, don't close -- wait for the next request on the same connection.
Trickier: the parser must know when one request ends. Use Content-Length. Watch for chunked encoding (Transfer-Encoding: chunked).
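Evidence: curl -v http://localhost:8080/a http://localhost:8080/b fetches both URLs in one invocation; the verbose output should report that the existing connection is being reused for the second request.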
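Knowing where one request ends comes down to framing the byte stream. The sketch below splits complete requests off a receive buffer using Content-Length only; pop_request is a hypothetical helper, and chunked request bodies are deliberately not handled here.

```python
def pop_request(buf: bytes):
    """Split one complete request off the front of `buf`.

    Returns (request_bytes, remaining), or (None, buf) if more data is needed.
    """
    head_end = buf.find(b"\r\n\r\n")
    if head_end == -1:
        return None, buf  # headers not complete yet
    body_start = head_end + 4
    length = 0
    # Skip the request line, then look for Content-Length among the headers.
    for line in buf[:head_end].split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    total = body_start + length
    if len(buf) < total:
        return None, buf  # body not complete yet
    return buf[:total], buf[total:]
```

On a keep-alive connection the server appends each recv result to the buffer and calls pop_request in a loop, answering every complete request it yields. Leftover bytes stay in the buffer because they may be the start of the next request.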
Milestone 7: Routing and dynamic endpoints
Add a router: map (method, path-pattern) to handlers. Support path parameters (/users/:id).
@route("GET", "/users/:id")
def get_user(req, params):
    return json_response({"id": params["id"], "name": "alice"})
Evidence: curl http://localhost:8080/users/42 returns {"id": "42", "name": "alice"}.
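One way the route decorator and json_response used above could be implemented is sketched below. ROUTES and dispatch are names invented for this example, responses are modeled as (status, headers, body) tuples by assumption, and literal parts of a pattern are not regex-escaped, which a real router would want to do.

```python
import json
import re

ROUTES = []  # list of (method, compiled_pattern, handler)


def route(method, pattern):
    """Register a handler for `method` and a pattern such as /users/:id."""
    # Turn each ":name" segment into a named group matching one path segment.
    regex = re.sub(r":(\w+)", r"(?P<\1>[^/]+)", pattern)
    compiled = re.compile(f"^{regex}$")

    def decorator(handler):
        ROUTES.append((method, compiled, handler))
        return handler

    return decorator


def json_response(payload, status="200 OK"):
    body = json.dumps(payload).encode()
    return status, {"Content-Type": "application/json"}, body


def dispatch(method, path, req=None):
    """Find the first matching route and call its handler."""
    path_matched = False
    for route_method, compiled, handler in ROUTES:
        match = compiled.match(path)
        if not match:
            continue
        path_matched = True
        if route_method == method:
            return handler(req, match.groupdict())
    if path_matched:
        # The path exists, but not for this method.
        return json_response({"error": "method not allowed"}, "405 Method Not Allowed")
    return json_response({"error": "not found"}, "404 Not Found")


@route("GET", "/users/:id")
def get_user(req, params):
    return json_response({"id": params["id"], "name": "alice"})
```

Path parameters arrive as strings, which is why the evidence above shows "42" in quotes.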
Milestone 8: Middleware
Logging, compression, authentication, CORS -- all as composable middleware. Express-style design:
app.use(logger)
app.use(compression)
app.use(auth)
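The Express-style snippet above is JavaScript-flavored; in Python the same composition can be sketched with functions that wrap the next handler. The App class, the request dictionary shape, and the auth rule here are all assumptions made for illustration.

```python
import time


class App:
    def __init__(self, handler):
        self.handler = handler  # innermost request handler
        self.middleware = []

    def use(self, mw):
        self.middleware.append(mw)

    def __call__(self, req):
        # Wrap from the inside out so the first middleware added runs first.
        call = self.handler
        for mw in reversed(self.middleware):
            call = mw(call)
        return call(req)


def logger(next_handler):
    def wrapped(req):
        start = time.perf_counter()
        status, headers, body = next_handler(req)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f'{req["method"]} {req["path"]} -> {status} ({elapsed_ms:.1f} ms)')
        return status, headers, body

    return wrapped


def auth(next_handler):
    def wrapped(req):
        # Hypothetical rule: reject any request without an Authorization header.
        if "authorization" not in req["headers"]:
            return "401 Unauthorized", {}, b"unauthorized"
        return next_handler(req)

    return wrapped
```

Because logger is added before auth, it wraps auth and therefore also logs the 401 responses that auth short-circuits. Order of app.use calls is part of the design.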
Milestone 9 (optional): HTTP/2 or WebSocket
- HTTP/2 -- binary protocol, multiplexed streams, server push. Substantially different from HTTP/1.1.
- WebSocket -- handshakes via HTTP, then a stateful framed connection. RFC 6455.
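The HTTP part of the WebSocket handshake is small enough to sketch. RFC 6455 has the server prove it understood the upgrade by hashing the client's Sec-WebSocket-Key together with a fixed GUID; the function names below are illustrative, and frame parsing after the handshake is not shown.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(key: str) -> bytes:
    """Build the 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(key)}\r\n"
        "\r\n"
    ).encode("ascii")
```

With the sample key from the RFC, dGhlIHNhbXBsZSBub25jZQ==, the accept value is s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which makes a convenient golden test.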
8. Tests & evidence
| Test | How |
|---|---|
| Basic request/response | curl returns expected body and headers |
| Static files | All MIME types correctly served |
| Status codes | 200, 301, 404, 500 all correctly emitted |
| Concurrency | 1,000 simultaneous connections; no dropped requests |
| Keep-alive | curl --next reuses connection |
| Malformed input | Garbage request returns 400, not crash |
| Long body | POST with multi-MB body parsed correctly |
| Slow client | Client that sends one byte at a time eventually completes |
| Benchmark | wrk -t12 -c400 -d30s http://localhost:8080/ -- requests/sec, latency p95/p99 |
The strongest evidence: a wrk benchmark comparing your server to nginx or Caddy on the same static-file workload. You'll be slower; document by how much.
9. Common pitfalls
- Assuming one recv returns a complete request. recv returns whatever has been received so far. You must buffer until you see \r\n\r\n (end of headers), then read Content-Length more bytes.
- Forgetting Content-Length. Without it (or chunked encoding), a client can only find the end of the body when the server closes the connection, so on a keep-alive connection it waits forever. Always include it.
- Path traversal vulnerability. ../../../etc/passwd must not return your /etc/passwd. Resolve the path and check it's under your serving directory.
- Slowloris attack. Slow clients holding connections open. Add read timeouts.
- Forgetting to reap zombies in the fork model. Children become zombies until reaped. Use signal(SIGCHLD, SIG_IGN) or waitpid with WNOHANG.
- Mixing blocking and event-loop code. A single blocking call in an event-loop server stops everything.
- Not handling EAGAIN/EWOULDBLOCK. In nonblocking mode, recv returns these instead of blocking. Handle by yielding back to the event loop.
- HTTP parsing is fiddly. Don't reinvent -- but don't blindly use a library either. Write enough to understand the corner cases (empty headers, header folding, \r\n vs \n).
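For the blocking (thread or process per connection) models, the buffering and Slowloris points above can be combined into one read helper. This is a sketch under assumed limits: read_head, MAX_HEADER_BYTES, and HEADER_DEADLINE are hypothetical names and values, not recommendations from any cited source.

```python
import socket
import time

MAX_HEADER_BYTES = 16 * 1024  # assumed cap on request line plus headers
HEADER_DEADLINE = 10.0        # assumed total seconds allowed for the headers


def read_head(conn, deadline=HEADER_DEADLINE, max_bytes=MAX_HEADER_BYTES):
    """Read until the end of the headers, enforcing a deadline and a size cap."""
    give_up_at = time.monotonic() + deadline
    buf = b""
    while b"\r\n\r\n" not in buf:
        remaining = give_up_at - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("client too slow sending headers")
        # A per-recv timeout alone is not enough: a client trickling one byte
        # at a time would reset it forever. Bound the total time instead.
        conn.settimeout(remaining)
        try:
            chunk = conn.recv(4096)
        except socket.timeout:
            raise TimeoutError("client too slow sending headers") from None
        if not chunk:
            raise ConnectionError("client closed before finishing headers")
        buf += chunk
        if len(buf) > max_bytes:
            raise ValueError("request headers too large")
    return buf
```

The caller would map TimeoutError to 408 Request Timeout, ValueError to 431 Request Header Fields Too Large (or a plain 400), and simply close on ConnectionError.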
10. Extensions
- Reverse proxy mode. Forward requests to upstream servers.
- TLS / HTTPS. Use openssl or your language's TLS stack. Add https:// support.
- HTTP/2. RFC 9113. Stream multiplexing, header compression (HPACK), server push.
- WebSocket. RFC 6455. After handshake, framed bidirectional protocol.
- Logging and metrics. Access logs in common format; Prometheus metrics endpoint.
- Compression -- gzip/brotli response bodies based on Accept-Encoding.
- Range requests -- for video streaming and large file resumption.
- HTTP/3. UDP-based, QUIC transport. Substantially harder.
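Of the extensions above, range requests are a good first one because the core is a small parsing function. The sketch below handles only a single byte range; parse_range is a hypothetical name, and for simplicity it returns None both for headers it does not understand and for ranges that cannot be satisfied, leaving the choice between a full 200 response and 416 to the caller.

```python
from typing import Optional, Tuple


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) byte offsets for a single-range header."""
    unit, eq, spec = header.partition("=")
    if unit.strip() != "bytes" or not eq or "," in spec:
        return None  # unknown unit or multiple ranges: not handled here
    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the final N bytes of the resource.
            count = int(last)
            if count <= 0:
                return None
            start, end = max(size - count, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)  # clamp ranges that run past the end
    if start < 0 or start > end:
        return None
    return start, end
```

For a 1000-byte file and Range: bytes=0-499, the server would answer 206 Partial Content with Content-Range: bytes 0-499/1000 and a 500-byte body.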
11. Module integration
| Module | What the web server deepens |
|---|---|
| Sem 7 Module 4 -- API design | HTTP is a contract; designing routes, status codes, headers, errors. |
| Sem 7 Module 1 -- Architecture fundamentals | Concurrency model is an architectural decision with real tradeoffs. |
| Network Stack tutorial | If you've built TCP from scratch, you can run your web server on top of your own stack. |
| Front-end Framework tutorial | Natural pairing -- your framework needs a backend. |
12. Portfolio framing
What to publish:
- Source organized as parser/, router/, server/, middleware/.
- A performance comparison: wrk benchmarks against nginx, with charts.
- A README with:
  - Concurrency model used and tradeoffs.
  - Supported HTTP features and limitations.
  - Security considerations (path traversal, slowloris).
Reviewer entry points:
- parser/http.go -- request parsing.
- server/loop.go -- the accept loop.
- router/router.go -- request dispatch.
- README must include: benchmark numbers, concurrency model rationale, what's missing.
A working web server is a solid portfolio piece, especially with benchmark numbers showing principled trade-offs against industry-standard servers.
13. Local source backbone
Use Build Your Own Web Server From Scratch in Node.js (build-your-own/web-server-node-james-smith) as the advanced Node.js/server-systems expansion after the basic HTTP server works.
| Local chunks | Use them for | Add to this project |
|---|---|---|
| 001-004 | Network programming, HTTP/1.0 prototype, TCP sockets, event-loop basics | Add a TCP/event-loop design note before HTTP parsing. |
| 005-008 | Promises/events, async functions, socket writes, dynamic buffers | Replace naive recv loops with explicit buffering and backpressure notes. |
| 009-012 | HTTP grammar, headers, methods, parsing, basic server loop | Add parser tests from raw byte fixtures. |
| 013-014 | Large bodies, dynamic content, streaming, producers/consumers, chunked encoding | Add streaming response and request-body limit milestones. |
| 015-017 | tcpdump inspection, static files, resource ownership, buffer lifetime | Add wire-capture evidence and file-resource cleanup tests. |
| 018-021 | Range requests, caching, compression, stream API | Add Range, ETag/cache validation, and gzip/br streaming extensions. |
| 022-025 | WebSocket upgrade, queues, flow control, server task design | Add WebSocket as a final extension with bounded queues and backpressure. |
Extra checkpoints from the book chunks
- Wire checkpoint: include one tcpdump or Wireshark capture explaining request bytes and response bytes.
- Parser checkpoint: parse fragmented requests where headers arrive across multiple TCP reads.
- Streaming checkpoint: serve a large file without reading it fully into memory.
- Backpressure checkpoint: show what happens when the client reads slowly and how the server avoids unbounded buffering.
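For the streaming checkpoint, chunked transfer encoding lets the server send a body whose total length it has not computed up front. The generator below is a minimal sketch; iter_chunked and its default chunk size are assumptions for illustration.

```python
import io


def iter_chunked(fileobj, chunk_size=64 * 1024):
    """Yield a file's contents framed as HTTP/1.1 chunked transfer coding."""
    while True:
        block = fileobj.read(chunk_size)
        if not block:
            break
        # Each chunk: size in hexadecimal, CRLF, the data, CRLF.
        yield f"{len(block):x}\r\n".encode("ascii") + block + b"\r\n"
    # A zero-length chunk terminates the body.
    yield b"0\r\n\r\n"


# Example framing of an 11-byte body in chunks of at most 5 bytes.
frames = list(iter_chunked(io.BytesIO(b"hello world"), chunk_size=5))
```

The response headers would carry Transfer-Encoding: chunked instead of Content-Length. In a blocking server, calling conn.sendall(frame) for each frame holds at most one chunk in memory and stalls naturally when the client reads slowly; in an event-loop server, the next frame should only be pulled once the previous one has been fully written.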
14. Deep project spec
Project contract
Build an HTTP server from raw sockets or a minimal runtime socket API. The server must define supported HTTP version, request framing, header parsing, body limits, response serialization, concurrency model, static-file behavior, error responses, and timeout/backpressure policy.
Source-backed reading map
| Source ID | Use for | Required output |
|---|---|---|
build-your-own/web-server-node-james-smith | TCP byte streams, event loop, dynamic buffers, HTTP grammar, streaming, static files, WebSocket upgrade | parser fixtures, wire capture, streaming/backpressure tests |
Milestone map
| Milestone | Deliverable | Tests | Failure case |
|---|---|---|---|
| TCP loop | accept/read/write connections | socket smoke tests | client disconnect mid-request |
| Request parser | method, path, version, headers, body boundary | raw-byte fixtures | fragmented headers |
| Router/response | deterministic handlers | status/header/body fixtures | unknown route |
| Static files | safe path mapping and content type | file-serving tests | path traversal rejected |
| Concurrency | thread, process, event loop, or async model | many-client smoke test | slow client does not block all |
| Streaming | large response without full buffering | memory/stream test | client closes during stream |
| HTTP extensions | range/cache/chunked/WebSocket as chosen | protocol fixtures | unsupported feature returns clear error |
Test matrix
| Test type | Required examples |
|---|---|
| Golden | raw request bytes to parsed request and serialized response |
| Integration | curl/client tests for routes, static files, errors |
| Negative | malformed request line, huge header, path traversal |
| Concurrency | slow reader, many short connections, large file stream |
| Wire | tcpdump/Wireshark capture explained in notes |
Design notes required
- http-subset.md: supported HTTP version, methods, headers, body handling.
- concurrency.md: event loop/thread model and resource ownership.
- security.md: path normalization, header limits, body limits, timeouts.
- backpressure.md: output buffers, slow clients, streaming behavior.
Portfolio evidence
Publish curl transcripts, raw parser fixtures, one packet capture, a concurrency/latency benchmark, and a design note comparing the server to a production server it intentionally does not replace.
Source
This tutorial draws from the BYO-X catalog "Web Server" section. Ruslan Spivak's "Let's Build A Web Server" series and the modern HTTP RFCs are the canonical primary sources.