Build Your Own Network Stack (TCP/IP)
"TCP is described in RFC 793. It's worth reading. It's also wrong in a few places that we discovered the hard way." — every TCP implementer
Building a TCP/IP stack from scratch — even a minimal one — is the deepest possible engagement with what "the internet" actually is. By the end you have a stack that can speak ARP, IP, ICMP, and TCP well enough to handshake with a real server and exchange data.
1. Overview & motivation
A network stack has layers:
    [Application]  HTTP, your-protocol       ← you produce/consume bytes
    [Transport]    TCP, UDP                  ← you implement
    [Internet]     IP, ICMP                  ← you implement
    [Link]         Ethernet, ARP             ← you implement
    [Physical]     Cable, wifi, TAP device   ← OS provides
You won't touch the physical layer. You'll use Linux's TUN/TAP driver in TAP mode (TAP carries whole Ethernet frames; TUN carries bare IP packets) to send and receive raw Ethernet frames from user space. Your code does everything from Ethernet on up.
What you can only learn by building one:
- Why the OSI model and the real Internet model don't line up, and why the layering that is actually implemented is the one that matters.
- Why TCP is a real piece of engineering — sliding windows, congestion control, retransmission timers, reordering.
- Why byte order (network = big-endian) matters and why htonl/ntohl exist.
- Why every packet is a tiny formatted string of bytes, and you read RFCs by laying out structs.
- Why the three-way handshake is the cleanest, most-photographed protocol in CS.
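The byte-order point above can be made concrete with a small sketch. It writes and reads big-endian fields one byte at a time and shows that htons produces the same bytes on the wire. The helper names (get_be16, put_be32, and so on) are hypothetical, chosen for this illustration only.

```c
#include <arpa/inet.h>  /* htons, htonl, ntohs, ntohl */
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: read big-endian fields at any byte offset. */
uint16_t get_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

uint32_t get_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helpers: write big-endian fields, most significant byte first. */
void put_be16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

void put_be32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Same wire bytes via htons: convert to network order, then copy raw bytes. */
void put_be16_htons(uint8_t *p, uint16_t v) {
    uint16_t n = htons(v);
    memcpy(p, &n, sizeof n);
}
```

Byte-wise helpers like these work at any offset and on any host, which is convenient when a field is not naturally aligned inside a packet buffer.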
2. Where this fits in the degree
- Phase: Systems
- Semester: 5 (OS and Networking)
- Modules deepened: Module 5 (network protocols & sockets) — this is the module's apex project.
Cross-phase relevance:
- Foundation for the BitTorrent Client tutorial — once TCP feels mundane, peer-to-peer becomes accessible.
- Direct connection to the Operating System tutorial — a kernel needs a network stack.
- Sets up the Web Server tutorial — you'll write the HTTP layer on top of a real socket.
3. Prerequisites
- Strong C or Rust. This project is in C in nearly every tutorial.
- Linux (TUN/TAP devices are Linux-specific; macOS has alternatives, Windows is impractical).
- tcpdump / Wireshark literacy. You will live in Wireshark.
- Familiarity with hex dumps and byte layout.
4. Theory & research
Required reading
- RFC 793 — Transmission Control Protocol (rfc-editor.org/rfc/rfc793). The original TCP spec. 1981. Read it cover-to-cover. Has known errata; see RFC 9293 (2022) for an updated, consolidated version. ⭐
- RFC 791 — Internet Protocol. IP header layout.
- RFC 768 — User Datagram Protocol. Trivial; useful for warmup.
- RFC 826 — Ethernet Address Resolution Protocol (ARP).
- Beej's Guide to Network Programming — beej.us/guide/bgnet. Free, classic. Useful for the userland sockets API perspective. ⭐
Strongly recommended
- Stevens, Fenner, Rudoff, Unix Network Programming, Volume 1 — the canonical book on sockets programming.
- Peterson & Davie, Computer Networks: A Systems Approach — undergraduate textbook with strong protocol coverage.
- Kurose & Ross, Computer Networking: A Top-Down Approach — alternative textbook, more accessible.
- Saminiir, "Let's code a TCP/IP stack" — saminiir.com/lets-code-tcp-ip-stack-1-ethernet-arp/. 5-part blog series, exactly this project. ⭐ recommended primary.
For the protocol details
- TCP/IP Illustrated, Volume 1: The Protocols (W. Richard Stevens; second edition with Kevin Fall). A packet-by-packet walk through TCP. Old but unmatched.
- Wireshark Wiki — wiki.wireshark.org — protocol references for every protocol you'll touch.
5. Curated tutorial list (from BYO-X)
- C: Beej's Guide to Network Programming — beej.us — sockets programming not stack-building, but essential background
- C: Let's code a TCP/IP stack — Saminiir's 5-part series ⭐ recommended primary
- C / Python: Build your own VPN/Virtual Switch — Build a VPN with a TUN/TAP device
- Ruby: How to build a network stack in Ruby
6. Recommended primary path
Saminiir's "Let's code a TCP/IP stack". Five parts:
- Ethernet & ARP.
- IPv4 & ICMPv4.
- TCP basics.
- TCP data flow & socket API.
- TCP retransmission.
Each part has working code and a tcpdump trace showing the stack at work. C, ~3,000 lines total.
After Saminiir, the natural next step is implementing congestion control (slow start, congestion avoidance, fast retransmit). The RFC 5681 spec is a short read.
For ARM/embedded systems, look at lwIP (savannah.nongnu.org/projects/lwip) — small, focused, real production stack. Read it after you've built your own.
7. Implementation milestones
Milestone 1: TUN/TAP device
On Linux, open /dev/net/tun and configure it as a TAP device. From the OS's perspective, this is a new network interface. Anything written to the file descriptor is delivered as a "received" frame on that interface; anything the OS sends on the interface arrives at the file descriptor.
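    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>

    // Open the TUN/TAP clone device and attach to the TAP interface named dev (e.g. "tap0").
    int tun_alloc(const char *dev) {
        struct ifreq ifr;
        int fd, err;
        if ((fd = open("/dev/net/tun", O_RDWR)) < 0) return -1;
        memset(&ifr, 0, sizeof(ifr));
        ifr.ifr_flags = IFF_TAP | IFF_NO_PI;       // Ethernet frames, no packet-info prefix
        strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);  // memset above keeps the name NUL-terminated
        if ((err = ioctl(fd, TUNSETIFF, (void *)&ifr)) < 0) { close(fd); return err; }
        return fd;
    }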
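Then, while your program is running and holding the descriptor open, configure the host side of the link in a shell (as root). The host gets its own address on the subnet; 10.0.0.1 here is an arbitrary choice. It must differ from 10.0.0.5, which your stack claims for itself: if the kernel owned 10.0.0.5, it would answer pings itself and nothing would reach your fd.
    ip addr add 10.0.0.1/24 dev tap0
    ip link set tap0 up
Evidence: ping 10.0.0.5 from the host delivers an ARP request to your fd.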
Milestone 2: Ethernet & ARP
Parse the Ethernet header (14 bytes: dest MAC, src MAC, ethertype). For ARP requests (ethertype 0x0806) whose target IP is yours, reply with your MAC.
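    #include <stdint.h>

    struct eth_hdr {
        uint8_t  dest[6];      // destination MAC
        uint8_t  src[6];       // source MAC
        uint16_t ethertype;    // big-endian: 0x0800 = IPv4, 0x0806 = ARP
    } __attribute__((packed));

    // ARP body for Ethernet/IPv4 (RFC 826); all 16-bit fields are big-endian
    struct arp_hdr {
        uint16_t htype, ptype; // hardware type (1 = Ethernet), protocol type (0x0800 = IPv4)
        uint8_t  hlen, plen;   // address lengths: 6 and 4
        uint16_t opcode;       // 1 = request, 2 = reply
        uint8_t  sender_mac[6];
        uint8_t  sender_ip[4];
        uint8_t  target_mac[6];
        uint8_t  target_ip[4];
    } __attribute__((packed));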
Evidence: ping 10.0.0.5 now triggers ARP responses (visible in tcpdump). The host can resolve your IP to your MAC.
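One possible shape for the ARP responder is sketched below. It rewrites a request into a reply in place and returns the number of bytes to write back to the TAP descriptor. The function name arp_make_reply and the constants are assumptions made for this example; the structs repeat the ones above so the block stands alone. It handles only Ethernet/IPv4 requests and does no ARP caching.

```c
#include <arpa/inet.h>  /* htons, ntohs */
#include <stdint.h>
#include <string.h>

#define ETHERTYPE_ARP 0x0806
#define ARP_REQUEST   1
#define ARP_REPLY     2

struct eth_hdr {
    uint8_t  dest[6];
    uint8_t  src[6];
    uint16_t ethertype;
} __attribute__((packed));

struct arp_hdr {
    uint16_t htype, ptype;
    uint8_t  hlen, plen;
    uint16_t opcode;
    uint8_t  sender_mac[6];
    uint8_t  sender_ip[4];
    uint8_t  target_mac[6];
    uint8_t  target_ip[4];
} __attribute__((packed));

/* Hypothetical helper: turn an ARP request for my_ip into a reply, in place.
 * Returns the frame length to send, or 0 if the frame is not a request for us. */
size_t arp_make_reply(uint8_t *frame, size_t len,
                      const uint8_t my_mac[6], const uint8_t my_ip[4]) {
    if (len < sizeof(struct eth_hdr) + sizeof(struct arp_hdr)) return 0;
    struct eth_hdr *eth = (struct eth_hdr *)frame;
    struct arp_hdr *arp = (struct arp_hdr *)(frame + sizeof *eth);

    if (ntohs(eth->ethertype) != ETHERTYPE_ARP) return 0;
    if (ntohs(arp->htype) != 1 || ntohs(arp->ptype) != 0x0800) return 0;
    if (arp->hlen != 6 || arp->plen != 4) return 0;
    if (ntohs(arp->opcode) != ARP_REQUEST) return 0;
    if (memcmp(arp->target_ip, my_ip, 4) != 0) return 0;

    /* The requester becomes the target; we become the sender. */
    memcpy(arp->target_mac, arp->sender_mac, 6);
    memcpy(arp->target_ip,  arp->sender_ip, 4);
    memcpy(arp->sender_mac, my_mac, 6);
    memcpy(arp->sender_ip,  my_ip, 4);
    arp->opcode = htons(ARP_REPLY);

    /* Address the Ethernet frame back to the requester. */
    memcpy(eth->dest, arp->target_mac, 6);
    memcpy(eth->src,  my_mac, 6);
    return sizeof *eth + sizeof *arp;
}
```

In a receive loop, a nonzero return value would be passed straight to write() on the TAP descriptor.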
Milestone 3: IPv4 & ICMP echo
Parse IP header (20 bytes minimum, then options). Validate checksum. Respond to ICMP echo requests with echo replies.
The checksum is one's-complement of the one's-complement sum of 16-bit words. It feels weird until you've implemented it.
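    #include <stdint.h>
    #include <string.h>

    // Internet checksum (RFC 1071) over len bytes. The result is in the same byte
    // order as the data, so store it into the header without htons().
    uint16_t checksum(const void *data, int len) {
        const uint8_t *p = data;
        uint32_t sum = 0;
        uint16_t word;
        while (len > 1) { memcpy(&word, p, 2); sum += word; p += 2; len -= 2; }  // memcpy avoids unaligned reads
        if (len) { word = 0; memcpy(&word, p, 1); sum += word; }                  // odd trailing byte, zero-padded
        while (sum >> 16) sum = (sum >> 16) + (sum & 0xffff);                     // fold carries back in
        return (uint16_t)~sum;
    }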
Evidence: ping 10.0.0.5 succeeds. You see ping replies in tcpdump.
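To validate a received header, a common trick is to run the checksum over the whole header including the stored checksum field: a valid header yields zero. The sketch below shows this with a checksum variant that works on big-endian words byte by byte, so its result is a plain host-order number. The names inet_checksum and ipv4_header_ok are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum computed on big-endian 16-bit words, one byte at a time,
 * so the result does not depend on host endianness or buffer alignment.
 * The return value is in host byte order. */
uint16_t inet_checksum(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)                       /* odd trailing byte: pad with zero */
        sum += (uint32_t)(data[i] << 8);
    while (sum >> 16)                  /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* A received IPv4 header is valid when the checksum over all of it,
 * including the stored checksum field, comes out as zero. */
int ipv4_header_ok(const uint8_t *hdr, size_t len) {
    if (len < 20) return 0;
    size_t ihl = (size_t)(hdr[0] & 0x0f) * 4;   /* header length in bytes */
    if ((hdr[0] >> 4) != 4 || ihl < 20 || ihl > len) return 0;
    return inet_checksum(hdr, ihl) == 0;
}
```

When building an outgoing header, zero the checksum field first, compute the checksum, then store the result high byte first at offsets 10 and 11.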
Milestone 4: TCP three-way handshake
TCP state machine starts at LISTEN. Receiving SYN → respond with SYN-ACK → wait for ACK → ESTABLISHED.
    client                     server (you)
      | --- SYN seq=1000 --->    |   LISTEN
      |                          |   SYN_RCVD
      | <-- SYN-ACK seq=5000     |
      |             ack=1001     |
      | --- ACK ack=5001 --->    |   ESTABLISHED
This is the most photographed protocol in CS. Get every bit right.
Evidence: nc 10.0.0.5 8080 connects. Your stack logs each transition. Wireshark shows a clean SYN/SYN-ACK/ACK exchange.
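The passive-open path in the diagram can be sketched as a small transition function. This is a simplified illustration, not a complete implementation: it ignores RST, retransmitted SYNs, TCP options, and SYN timeouts. The struct conn type, the on_segment function, and the flag constants are names assumed for this example.

```c
#include <stdint.h>

typedef enum { CLOSED, LISTEN, SYN_RCVD, ESTABLISHED } state_t;

#define TCP_SYN 0x02
#define TCP_ACK 0x10

struct conn {
    state_t  state;
    uint32_t irs;      /* initial receive sequence number (the client's ISN) */
    uint32_t iss;      /* initial send sequence number (ours) */
    uint32_t rcv_nxt;  /* next sequence number we expect from the client */
    uint32_t snd_nxt;  /* next sequence number we will send */
};

/* Feed one incoming segment to the connection.
 * Returns the flags to send in response (0 means send nothing). */
uint8_t on_segment(struct conn *c, uint8_t flags, uint32_t seq, uint32_t ack,
                   uint32_t our_isn) {
    switch (c->state) {
    case LISTEN:
        /* Only a bare SYN opens a connection. */
        if ((flags & (TCP_SYN | TCP_ACK)) != TCP_SYN) return 0;
        c->irs     = seq;
        c->rcv_nxt = seq + 1;       /* the SYN consumes one sequence number */
        c->iss     = our_isn;
        c->snd_nxt = our_isn + 1;   /* so does our own SYN */
        c->state   = SYN_RCVD;
        return TCP_SYN | TCP_ACK;   /* reply with seq=iss, ack=rcv_nxt */
    case SYN_RCVD:
        /* The final ACK must acknowledge exactly our SYN. */
        if ((flags & TCP_ACK) && !(flags & TCP_SYN) &&
            seq == c->rcv_nxt && ack == c->snd_nxt)
            c->state = ESTABLISHED;
        return 0;
    default:
        return 0;
    }
}
```

With the numbers from the diagram, a SYN with seq=1000 and our_isn=5000 leaves rcv_nxt=1001 and snd_nxt=5001, and only an ACK carrying ack=5001 completes the handshake.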
Milestone 5: TCP data transfer (with sliding window)
Implement send buffer, receive buffer, sequence numbers, acknowledgements. Respect the receiver's advertised window. Handle out-of-order packets (reorder buffer).
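    struct tcb {              // transmission control block (field names follow RFC 793)
        uint32_t snd_una;     // oldest unacknowledged sequence number
        uint32_t snd_nxt;     // next sequence number to send
        uint32_t snd_wnd;     // send window advertised by the peer
        uint32_t rcv_nxt;     // next sequence number expected from the peer
        uint32_t rcv_wnd;     // receive window we advertise
        uint32_t iss, irs;    // initial send / receive sequence numbers
        state_t  state;       // connection state enum (LISTEN, SYN_RCVD, ...), defined elsewhere
        // send and receive buffers go here
    };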
Evidence: nc 10.0.0.5 8080 connection can exchange data in both directions. Test with files of varying sizes.
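Two window checks sit at the heart of this milestone: how many bytes you may still send, and whether an incoming segment overlaps the receive window. A minimal sketch of both follows; the function names usable_window and seg_in_window are assumptions, and the second function follows the four-case segment acceptability test in RFC 793.

```c
#include <stdint.h>

/* Bytes we may still put on the wire: the peer's advertised window minus the
 * data already in flight. Unsigned subtraction keeps this correct across
 * sequence-number wraparound. */
uint32_t usable_window(uint32_t snd_una, uint32_t snd_nxt, uint32_t snd_wnd) {
    uint32_t in_flight = snd_nxt - snd_una;
    return in_flight >= snd_wnd ? 0 : snd_wnd - in_flight;
}

/* Is a segment covering [seq, seq+len) acceptable for the receive window
 * [rcv_nxt, rcv_nxt+rcv_wnd)? A segment is acceptable if its first or its
 * last byte falls inside the window. */
int seg_in_window(uint32_t rcv_nxt, uint32_t rcv_wnd,
                  uint32_t seq, uint32_t len) {
    if (rcv_wnd == 0)
        return len == 0 && seq == rcv_nxt;    /* only a bare ACK is acceptable */
    if (len == 0)
        return (uint32_t)(seq - rcv_nxt) < rcv_wnd;
    return (uint32_t)(seq - rcv_nxt) < rcv_wnd ||
           (uint32_t)(seq + len - 1 - rcv_nxt) < rcv_wnd;
}
```

An acceptable segment may still start before rcv_nxt or extend past the window, so the receiver trims it to the window before copying data into the reorder buffer.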
Milestone 6: Retransmission
Set a timer when sending. If ACK doesn't arrive in time, retransmit. Implement exponential backoff and a maximum retry count.
For correctness, use Karn's algorithm: don't update RTT estimates from retransmitted segments.
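A retransmission timer needs a timeout value. The sketch below shows one way to compute it, using the smoothed-RTT estimator from RFC 6298 (not mentioned above, but the standard reference for this calculation) together with Karn's rule and exponential backoff. The struct and function names are assumptions, times are in milliseconds, the clock-granularity term from the RFC is left out, and the 60-second cap is a choice made for this example.

```c
#include <stdint.h>

#define RTO_MIN_MS 1000    /* RFC 6298 lower bound */
#define RTO_MAX_MS 60000   /* assumed cap for this sketch */

struct rto_state {
    int32_t  srtt;     /* smoothed RTT in ms; -1 until the first sample */
    int32_t  rttvar;   /* RTT variance estimate in ms */
    uint32_t rto;      /* current retransmission timeout in ms */
};

void rto_init(struct rto_state *r) {
    r->srtt = -1;
    r->rttvar = 0;
    r->rto = RTO_MIN_MS;   /* initial RTO of 1 second */
}

/* Called when an ACK covers a timed segment. Karn's rule: if the segment was
 * ever retransmitted, the ACK is ambiguous, so the sample is discarded. */
void rto_on_ack(struct rto_state *r, int32_t rtt_ms, int was_retransmitted) {
    if (was_retransmitted) return;
    if (r->srtt < 0) {                         /* first measurement */
        r->srtt = rtt_ms;
        r->rttvar = rtt_ms / 2;
    } else {
        int32_t err = r->srtt - rtt_ms;
        if (err < 0) err = -err;
        r->rttvar = (3 * r->rttvar + err) / 4; /* beta = 1/4, uses the old srtt */
        r->srtt = (7 * r->srtt + rtt_ms) / 8;  /* alpha = 1/8 */
    }
    uint32_t rto = (uint32_t)(r->srtt + 4 * r->rttvar);
    if (rto < RTO_MIN_MS) rto = RTO_MIN_MS;
    if (rto > RTO_MAX_MS) rto = RTO_MAX_MS;
    r->rto = rto;
}

/* Called when the retransmission timer fires: double the timeout, up to the cap. */
void rto_on_timeout(struct rto_state *r) {
    r->rto = r->rto * 2 > RTO_MAX_MS ? RTO_MAX_MS : r->rto * 2;
}
```

The maximum retry count lives outside this estimator: the caller counts timer expirations per segment and aborts the connection once the limit is reached.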
Evidence: Drop packets deliberately (use tc netem on Linux) and observe correct retransmission.
Milestone 7: Connection teardown
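Implement the FIN handshake and the states it passes through: FIN_WAIT_1, FIN_WAIT_2, CLOSING, and TIME_WAIT when you close first; CLOSE_WAIT and LAST_ACK when the peer closes first.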
The TIME_WAIT state is the most misunderstood part of TCP — required to prevent old segments from polluting new connections. RFC 793 explains why; many tutorials gloss over it.
Evidence: Connections close cleanly. After many connections, your stack's own connection table has no entries stuck in CLOSE_WAIT, and netstat on the client host shows none lingering in FIN_WAIT_2.
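The teardown states form a small transition table, sketched below. The event names and the teardown_next function are assumptions for this illustration; a real implementation also sends the FIN and ACK segments on each transition and runs the 2*MSL timer in TIME_WAIT. A segment carrying both a FIN and the ACK of our FIN can be fed in as two events, ACK first.

```c
typedef enum {
    ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT,
    CLOSE_WAIT, LAST_ACK, CLOSED
} state_t;

typedef enum {
    EV_APP_CLOSE,        /* the local application called close() */
    EV_RECV_FIN,         /* the peer's FIN arrived */
    EV_RECV_ACK_OF_FIN,  /* the peer acknowledged our FIN */
    EV_2MSL_TIMEOUT      /* the TIME_WAIT timer expired */
} event_t;

/* Return the next state for a teardown event; unknown events leave the state alone. */
state_t teardown_next(state_t s, event_t ev) {
    switch (s) {
    case ESTABLISHED:
        if (ev == EV_APP_CLOSE) return FIN_WAIT_1;        /* we send FIN */
        if (ev == EV_RECV_FIN)  return CLOSE_WAIT;        /* we ACK the peer's FIN */
        break;
    case FIN_WAIT_1:
        if (ev == EV_RECV_ACK_OF_FIN) return FIN_WAIT_2;
        if (ev == EV_RECV_FIN)        return CLOSING;     /* simultaneous close */
        break;
    case FIN_WAIT_2:
        if (ev == EV_RECV_FIN) return TIME_WAIT;
        break;
    case CLOSING:
        if (ev == EV_RECV_ACK_OF_FIN) return TIME_WAIT;
        break;
    case TIME_WAIT:
        if (ev == EV_2MSL_TIMEOUT) return CLOSED;
        break;
    case CLOSE_WAIT:
        if (ev == EV_APP_CLOSE) return LAST_ACK;          /* we send our FIN */
        break;
    case LAST_ACK:
        if (ev == EV_RECV_ACK_OF_FIN) return CLOSED;
        break;
    case CLOSED:
        break;
    }
    return s;
}
```

Note that CLOSE_WAIT only ends when the local application closes, which is why a stack or application that forgets to close leaves connections parked there.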
Milestone 8 (optional): Sockets API
Wrap your stack in socket(), bind(), listen(), accept(), read(), write(). Now an application can use your stack.
Milestone 9 (optional): Congestion control
Slow start, congestion avoidance, fast retransmit, fast recovery (RFC 5681). The algorithm that keeps the internet from melting.
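As a starting point, the window arithmetic for slow start, congestion avoidance, and the reaction to a retransmission timeout can be sketched as follows, based on the formulas in RFC 5681. Fast retransmit and fast recovery are left out. The struct cc type and the function names are assumptions for this example.

```c
#include <stdint.h>

struct cc {
    uint32_t cwnd;      /* congestion window, in bytes */
    uint32_t ssthresh;  /* slow-start threshold, in bytes */
    uint32_t smss;      /* sender maximum segment size, in bytes */
};

/* An ACK arrived that acknowledges bytes_acked bytes of new data. */
void cc_on_ack(struct cc *c, uint32_t bytes_acked) {
    if (c->cwnd < c->ssthresh) {
        /* Slow start: grow by at most one SMSS per ACK. */
        c->cwnd += bytes_acked < c->smss ? bytes_acked : c->smss;
    } else {
        /* Congestion avoidance: about one SMSS per round trip. */
        uint32_t inc = (uint32_t)((uint64_t)c->smss * c->smss / c->cwnd);
        c->cwnd += inc ? inc : 1;
    }
}

/* The retransmission timer expired with flight_size bytes unacknowledged. */
void cc_on_timeout(struct cc *c, uint32_t flight_size) {
    uint32_t half = flight_size / 2;
    c->ssthresh = half > 2 * c->smss ? half : 2 * c->smss;
    c->cwnd = c->smss;   /* restart from a loss window of one segment */
}
```

The sender then limits data in flight to the smaller of cwnd and the peer's advertised window.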
8. Tests & evidence
| Test | How |
|---|---|
| Ethernet/ARP | ping 10.0.0.5 resolves via ARP |
| ICMP | ping 10.0.0.5 succeeds with full RTT report |
| TCP handshake | nc 10.0.0.5 8080 connects; Wireshark shows clean exchange |
| TCP data | Round-trip a 1 MB file via nc |
| Out-of-order | Reorder packets in a test harness; data still delivered correctly |
| Packet loss | Drop 10% of packets; data still delivered |
| Connection close | Many connections close cleanly; no leaks |
| Interop | A real client (Python, curl) can talk to your stack |
The strongest evidence: a Wireshark capture showing your stack having a clean conversation with a real client.
9. Common pitfalls
- Network byte order. All multi-byte fields in network headers are big-endian. htons, htonl, ntohs, ntohl are your friends.
- Struct packing. __attribute__((packed)) is required, or the compiler inserts padding.
- Checksum miscalculation. One bit wrong → packet dropped silently. Verify each layer with Wireshark.
- TCP sequence numbers wrap around. Use modular comparison ((int32_t)(a - b) < 0).
- Forgetting the pseudo-header for TCP/UDP checksum. The TCP checksum includes a 12-byte "pseudo-header" with IPs and length. Easy to miss.
- TIME_WAIT. Don't shortcut it. Two reasons: ensure the final ACK is delivered; let stray segments expire.
- State machine completeness. The full TCP state diagram has 11 states. Skipping LAST_ACK or CLOSE_WAIT will produce hung connections.
- Reading the wrong RFC. RFC 793 has errata. Modern reference: RFC 9293 (2022).
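Two of the pitfalls above are easy to show in code: wraparound-safe sequence comparison and the TCP pseudo-header. The sketch below is one way to write both for IPv4; the function names seq_lt, seq_leq, and tcp_checksum are assumptions for this example, and the checksum is returned in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Wraparound-safe sequence comparisons. Valid as long as a and b are less
 * than 2^31 apart, which TCP's window limits guarantee. */
int seq_lt(uint32_t a, uint32_t b)  { return (int32_t)(a - b) < 0; }
int seq_leq(uint32_t a, uint32_t b) { return (int32_t)(a - b) <= 0; }

/* Add the big-endian 16-bit words of data to a running sum. */
static uint32_t sum_words(uint32_t sum, const uint8_t *data, size_t len) {
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (i < len)
        sum += (uint32_t)(data[i] << 8);   /* odd trailing byte, zero-padded */
    return sum;
}

/* TCP checksum over the 12-byte IPv4 pseudo-header followed by the TCP
 * header and payload. segment points at the TCP header. */
uint16_t tcp_checksum(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                      const uint8_t *segment, size_t seg_len) {
    uint8_t pseudo[12];
    for (int i = 0; i < 4; i++) {
        pseudo[i]     = src_ip[i];
        pseudo[4 + i] = dst_ip[i];
    }
    pseudo[8]  = 0;                        /* reserved zero byte */
    pseudo[9]  = 6;                        /* IP protocol number for TCP */
    pseudo[10] = (uint8_t)(seg_len >> 8);  /* TCP length: header + data */
    pseudo[11] = (uint8_t)seg_len;

    uint32_t sum = sum_words(0, pseudo, sizeof pseudo);
    sum = sum_words(sum, segment, seg_len);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

As with the IP header, a received segment is valid when the checksum over pseudo-header and segment, with the stored checksum left in place, comes out as zero.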
10. Extensions
- UDP. Trivial after TCP. Add it for completeness.
- Real socket API. socket()/bind()/listen()/... so an unmodified application can use your stack.
- Selective acknowledgment (SACK): RFC 2018.
- TCP Reno or CUBIC congestion control — the algorithms used in modern OSes.
- IPv6. New header format, different addressing. Same TCP underneath.
- TLS — way out of scope, but conceptually the next protocol you'd add.
11. Module integration
| Module | What the network stack deepens |
|---|---|
| Sem 4 Module 1 — C fundamentals | Large C project with strict correctness requirements. |
| Sem 4 Module 2 — Memory & pointers | Byte-level packet manipulation. Pointer-into-buffer is the dominant pattern. |
| Sem 5 Module 3 — Concurrency | Multiple connections, timers, retransmission queues — all concurrent. |
| Sem 5 Module 5 — Network protocols | The whole module. |
| BitTorrent Client tutorial | After implementing TCP, BitTorrent's protocol feels straightforward. |
| Operating System tutorial | Your OS needs a network stack. Same code can be ported to your kernel. |
| Web Server tutorial | HTTP layer sits on TCP. |
12. Portfolio framing
What to publish:
- C source organized as src/{ethernet,arp,ip,icmp,tcp}.c.
- A Makefile. A run script that sets up the TAP device.
- A README with a Wireshark screenshot showing a clean TCP handshake.
- A list of which RFCs you implemented and which you skipped.
What to keep private:
- None — this is portfolio-grade.
Reviewer entry points:
- src/tcp.c: the state machine.
- tests/handshake.pcap: a captured exchange.
- README must include: which RFCs are implemented, scope limitations, and the handshake screenshot.
This is a serious portfolio project. "I implemented TCP/IP from scratch" is a sentence that gets attention.
Source
This tutorial draws from the BYO-X catalog "Network Stack" entry. RFC 793 and Saminiir's 5-part series are the canonical primary sources.