Build Your Own Network Stack (TCP/IP)

"TCP is described in RFC 793. It's worth reading. It's also wrong in a few places that we discovered the hard way." — every TCP implementer

Building a TCP/IP stack from scratch — even a minimal one — is the deepest possible engagement with what "the internet" actually is. By the end you have a stack that can speak ARP, IP, ICMP, and TCP well enough to handshake with a real server and exchange data.


1. Overview & motivation

A network stack has layers:

[Application]   HTTP, your-protocol        ← you produce/consume bytes
[Transport]     TCP, UDP                   ← you implement
[Internet]      IP, ICMP                   ← you implement
[Link]          Ethernet, ARP              ← you implement
[Physical]      Cable, Wi-Fi, TAP device   ← OS provides

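You won't touch the physical layer. Instead you'll use a TAP device (the Ethernet-level half of Linux's TUN/TAP driver; a TUN device carries bare IP packets with no Ethernet header) to send and receive raw Ethernet frames from user space. Your code does everything from Ethernet on up.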

What you can only learn by building one:

  • Why the seven-layer OSI model doesn't map cleanly onto the real internet stack, and why the layering that is actually implemented is the one that matters.
  • Why TCP is a real piece of engineering — sliding windows, congestion control, retransmission timers, reordering.
  • Why byte order (network = big-endian) matters and why htonl/ntohl exist.
  • Why every packet is a tiny formatted string of bytes, and you read RFCs by laying out structs.
  • Why the three-way handshake is the cleanest, most-photographed protocol in CS.

2. Where this fits in the degree

  • Phase: Systems
  • Semester: 5 (OS and Networking)
  • Modules deepened: Module 5 (network protocols & sockets) — this is the module's apex project.

Cross-phase relevance:


3. Prerequisites

  • Strong C or Rust. This project is in C in nearly every tutorial.
  • Linux (the /dev/net/tun interface used here is Linux-specific; macOS has alternatives, Windows is impractical).
  • tcpdump / Wireshark literacy. You will live in Wireshark.
  • Familiarity with hex dumps and byte layout.

4. Theory & research

Required reading

  • RFC 793 — Transmission Control Protocol (rfc-editor.org/rfc/rfc793). The original TCP spec. 1981. Read it cover-to-cover. Has known errata; see RFC 9293 (2022) for an updated, consolidated version. ⭐
  • RFC 791 — Internet Protocol. IP header layout.
  • RFC 768 — User Datagram Protocol. Trivial; useful for warmup.
  • RFC 826 — Ethernet Address Resolution Protocol (ARP).
  • Beej's Guide to Network Programming (beej.us/guide/bgnet). Free, classic. Useful for the userland sockets API perspective. ⭐
  • Stevens, Fenner, Rudoff, Unix Network Programming, Volume 1 — the canonical book on sockets programming.
  • Peterson & Davie, Computer Networks: A Systems Approach — undergraduate textbook with strong protocol coverage.
  • Kurose & Ross, Computer Networking: A Top-Down Approach — alternative textbook, more accessible.
  • Saminiir, "Let's code a TCP/IP stack" (saminiir.com/lets-code-tcp-ip-stack-1-ethernet-arp/). 5-part blog series, exactly this project. ⭐ recommended primary.

For the protocol details

  • TCP/IP Illustrated, Volume 1 (Stevens; second edition by Fall and Stevens): a packet-by-packet walk through TCP. Old but unmatched.
  • Wireshark Wiki (wiki.wireshark.org): protocol references for every protocol you'll touch.

5. Curated tutorial list (from BYO-X)

  • C: Beej's Guide to Network Programming (beej.us). Sockets programming, not stack-building, but essential background.
  • C: Let's code a TCP/IP stack (Saminiir's 5-part series). Recommended primary.
  • C / Python: Build your own VPN/Virtual Switch. Build a VPN with a TUN/TAP device.
  • Ruby: How to build a network stack in Ruby.

Saminiir's "Let's code a TCP/IP stack". Five parts:

  1. Ethernet & ARP.
  2. IPv4 & ICMPv4.
  3. TCP basics.
  4. TCP data flow & socket API.
  5. TCP retransmission.

Each part has working code and a tcpdump trace showing the stack at work. C, ~3,000 lines total.

After Saminiir, the natural next step is implementing congestion control (slow start, congestion avoidance, fast retransmit). The RFC 5681 spec is a short read.

For ARM/embedded systems, look at lwIP (savannah.nongnu.org/projects/lwip) — small, focused, real production stack. Read it after you've built your own.


7. Implementation milestones

Milestone 1: TUN/TAP device

On Linux, open /dev/net/tun, configure as a TAP device. From the OS's perspective, this is a new network interface. Anything written to the file descriptor is delivered as a "received" packet on that interface; anything sent on the interface arrives at the file descriptor.

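#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Open the TAP device named dev (e.g. "tap0"). Returns the fd, or a negative value on error. */
int tun_alloc(char *dev) {
    struct ifreq ifr;
    int fd, err;
    if ((fd = open("/dev/net/tun", O_RDWR)) < 0) return -1;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;        /* Ethernet frames, no packet-info prefix */
    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);   /* memset above keeps the name NUL-terminated */
    if ((err = ioctl(fd, TUNSETIFF, (void *)&ifr)) < 0) { close(fd); return err; }
    return fd;
}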

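Then, while your program is running (the interface disappears when the fd is closed, and both the program and these commands need root or CAP_NET_ADMIN), in a shell:

ip addr add 10.0.0.1/24 dev tap0
ip link set tap0 up

Here 10.0.0.1 is the host's address on the virtual link, not your stack's. Your stack claims 10.0.0.5 in its own code. If you assign 10.0.0.5 to tap0 itself, the host kernel answers pings to that address locally and nothing reaches your fd.

Evidence: ping 10.0.0.5 from the host delivers an ARP request to your fd.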
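A minimal sketch of the receive side, assuming the fd came from tun_alloc above. The names read_frame, frame_loop, ETH_HDR_LEN, and ETH_FRAME_MAX are hypothetical, and the 1514-byte buffer assumes the default 1500-byte MTU.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define ETH_HDR_LEN   14     /* dest MAC + src MAC + ethertype */
#define ETH_FRAME_MAX 1514   /* header + 1500-byte payload (assumed default MTU) */

/* Read one frame from fd into buf (capacity cap).
 * Returns the frame length, 0 for a runt frame that should be ignored,
 * or -1 on a read error or end of file. */
ssize_t read_frame(int fd, uint8_t *buf, size_t cap) {
    assert(cap >= ETH_HDR_LEN);
    ssize_t n;
    do {
        n = read(fd, buf, cap);          /* a TAP fd hands over one whole frame per read */
    } while (n < 0 && errno == EINTR);   /* retry if a signal interrupted the call */
    if (n <= 0) return -1;
    if (n < ETH_HDR_LEN) return 0;       /* too short to hold an Ethernet header */
    return n;
}

/* Hypothetical main loop: print the size and ethertype of every frame. */
void frame_loop(int fd) {
    uint8_t buf[ETH_FRAME_MAX];
    ssize_t n;
    while ((n = read_frame(fd, buf, sizeof buf)) >= 0) {
        if (n == 0) continue;
        unsigned ethertype = ((unsigned)buf[12] << 8) | buf[13];   /* big-endian on the wire */
        printf("frame: %zd bytes, ethertype 0x%04x\n", n, ethertype);
    }
}
```

Calling frame_loop(tun_alloc("tap0")) and then pinging should print frames with ethertype 0x0806 (ARP), which is the evidence this milestone asks for.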

Milestone 2: Ethernet & ARP

Parse the Ethernet header (14 bytes: dest MAC, src MAC, ethertype). For ARP requests whose target IP is yours, reply with your MAC.

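#include <stdint.h>

/* Multi-byte fields are big-endian on the wire; convert with ntohs/htons. */
struct eth_hdr {
    uint8_t  dest[6];
    uint8_t  src[6];
    uint16_t ethertype;       /* 0x0800 = IPv4, 0x0806 = ARP */
} __attribute__((packed));

struct arp_hdr {              /* layout for IPv4 over Ethernet */
    uint16_t htype, ptype;    /* hardware type (1 = Ethernet), protocol type (0x0800 = IPv4) */
    uint8_t  hlen, plen;      /* address lengths: 6 and 4 */
    uint16_t opcode;          /* 1 = request, 2 = reply */
    uint8_t  sender_mac[6];
    uint8_t  sender_ip[4];
    uint8_t  target_mac[6];
    uint8_t  target_ip[4];
} __attribute__((packed));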

Evidence: ping 10.0.0.5 now triggers ARP responses (visible in tcpdump). The host can resolve your IP to your MAC.
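The request-to-reply step can be sketched as follows. This is one possible shape, not Saminiir's code: arp_make_reply and the ET_ and ARP_OP_ constants are hypothetical names, the structs repeat the ones above so the block stands alone, and only IPv4-over-Ethernet ARP is handled.

```c
#include <arpa/inet.h>   /* htons, ntohs */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ET_IPV4        0x0800
#define ET_ARP         0x0806
#define ARP_OP_REQUEST 1
#define ARP_OP_REPLY   2

struct eth_hdr { uint8_t dest[6], src[6]; uint16_t ethertype; } __attribute__((packed));
struct arp_hdr {
    uint16_t htype, ptype;
    uint8_t  hlen, plen;
    uint16_t opcode;
    uint8_t  sender_mac[6], sender_ip[4], target_mac[6], target_ip[4];
} __attribute__((packed));

/* If frame holds an ARP request for my_ip, write the reply frame into out
 * (at least 42 bytes) and return its length; otherwise return 0. */
size_t arp_make_reply(const uint8_t *frame, size_t len,
                      const uint8_t my_mac[6], const uint8_t my_ip[4],
                      uint8_t *out) {
    const size_t need = sizeof(struct eth_hdr) + sizeof(struct arp_hdr);
    struct eth_hdr eth;
    struct arp_hdr arp;
    assert(frame != NULL && out != NULL);
    if (len < need) return 0;
    memcpy(&eth, frame, sizeof eth);              /* copy out: frame may be unaligned */
    memcpy(&arp, frame + sizeof eth, sizeof arp);
    if (ntohs(eth.ethertype) != ET_ARP) return 0;
    if (ntohs(arp.htype) != 1 || ntohs(arp.ptype) != ET_IPV4) return 0;
    if (arp.hlen != 6 || arp.plen != 4) return 0;
    if (ntohs(arp.opcode) != ARP_OP_REQUEST) return 0;
    if (memcmp(arp.target_ip, my_ip, 4) != 0) return 0;   /* not asking about us */

    memcpy(eth.dest, arp.sender_mac, 6);          /* unicast back to the requester */
    memcpy(eth.src, my_mac, 6);
    arp.opcode = htons(ARP_OP_REPLY);
    memcpy(arp.target_mac, arp.sender_mac, 6);    /* requester becomes the target */
    memcpy(arp.target_ip, arp.sender_ip, 4);
    memcpy(arp.sender_mac, my_mac, 6);            /* we become the sender */
    memcpy(arp.sender_ip, my_ip, 4);
    memcpy(out, &eth, sizeof eth);
    memcpy(out + sizeof eth, &arp, sizeof arp);
    return need;
}
```

Writing the returned bytes to the TAP fd is what makes the reply show up in tcpdump. A fuller stack would also record the sender's IP and MAC in an ARP cache before replying.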

Milestone 3: IPv4 & ICMP echo

Parse IP header (20 bytes minimum, then options). Validate checksum. Respond to ICMP echo requests with echo replies.

The checksum is one's-complement of the one's-complement sum of 16-bit words. It feels weird until you've implemented it.

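#include <stdint.h>

/* Internet checksum over len bytes of 2-byte-aligned data. The result is
 * already in wire byte order: store it in the header as-is, without htons(). */
uint16_t checksum(void *data, int len) {
    uint32_t sum = 0;
    uint16_t *p = data;
    while (len > 1) { sum += *p++; len -= 2; }    /* add up the 16-bit words */
    if (len) {                                    /* odd trailing byte, zero-padded */
        uint16_t last = 0;
        *(uint8_t *)&last = *(uint8_t *)p;        /* correct on either host endianness */
        sum += last;
    }
    sum = (sum >> 16) + (sum & 0xffff);           /* fold the carries back in */
    sum += sum >> 16;
    return ~sum;
}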

Evidence: ping 10.0.0.5 succeeds. You see ping replies in tcpdump.
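To make the checksum and the echo reply concrete, here is a hedged sketch that works on raw bytes. inet_csum is a byte-wise variant of the checksum (an assumption chosen so the example is independent of host endianness and alignment), and icmp_echo_reply is a hypothetical helper name. For the widely used sample IPv4 header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7, inet_csum returns 0xb861.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes, read as big-endian 16-bit words.
 * Returns the checksum in host order; store it high byte first. */
uint16_t inet_csum(const uint8_t *p, size_t len) {
    uint32_t sum = 0;
    assert(p != NULL || len == 0);
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;          /* pad the odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);        /* end-around carry */
    return (uint16_t)~sum;
}

/* Turn an ICMP echo request (type 8) into an echo reply (type 0) in place.
 * icmp points at the ICMP header; len covers header plus payload.
 * Returns 0 on success, -1 if the message is not a valid echo request. */
int icmp_echo_reply(uint8_t *icmp, size_t len) {
    if (len < 8) return -1;                        /* type, code, checksum, id, seq */
    if (icmp[0] != 8 || icmp[1] != 0) return -1;
    if (inet_csum(icmp, len) != 0) return -1;      /* a valid message checks to zero */
    icmp[0] = 0;                                   /* type 0 = echo reply */
    icmp[2] = icmp[3] = 0;                         /* zero the field before recomputing */
    uint16_t c = inet_csum(icmp, len);
    icmp[2] = (uint8_t)(c >> 8);
    icmp[3] = (uint8_t)(c & 0xff);
    return 0;
}
```

The identifier, sequence number, and payload are left untouched, which is what lets ping match the reply to its request. The IP header still needs its source and destination swapped and its own checksum recomputed before the frame goes out.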

Milestone 4: TCP three-way handshake

TCP state machine starts at LISTEN. Receiving SYN → respond with SYN-ACK → wait for ACK → ESTABLISHED.

client                                  server (you)
  | ------ SYN seq=1000 ------------> |   LISTEN
  |                                   |   SYN_RCVD
  | <----- SYN-ACK seq=5000 --------- |
  |        ack=1001                   |
  | ------ ACK ack=5001 ------------> |   ESTABLISHED

This is the most photographed protocol in CS. Get every bit right.

Evidence: nc 10.0.0.5 8080 connects. Your stack logs each transition. Wireshark shows a clean SYN/SYN-ACK/ACK exchange.
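The passive-open half of the handshake can be sketched as a small transition function. This is an illustration under simplifying assumptions, not a complete implementation: hs_conn, hs_reply, hs_input, and the TCPF_ flag names are hypothetical, and cases such as sending RST in response to a stray ACK in LISTEN are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCPF_SYN 0x02   /* flag bits as they appear in the TCP header */
#define TCPF_RST 0x04
#define TCPF_ACK 0x10

enum hs_state { HS_LISTEN, HS_SYN_RCVD, HS_ESTABLISHED };

struct hs_conn {
    enum hs_state state;
    uint32_t iss;       /* our initial send sequence number */
    uint32_t irs;       /* peer's initial sequence number */
    uint32_t snd_nxt;   /* next sequence number we will send */
    uint32_t rcv_nxt;   /* next sequence number we expect */
};

struct hs_reply { bool send; uint8_t flags; uint32_t seq, ack; };

/* Feed one incoming segment to the connection; returns the segment to send back, if any. */
struct hs_reply hs_input(struct hs_conn *c, uint8_t flags, uint32_t seq, uint32_t ack) {
    struct hs_reply r = { false, 0, 0, 0 };
    assert(c != NULL);
    switch (c->state) {
    case HS_LISTEN:
        if ((flags & (TCPF_SYN | TCPF_ACK | TCPF_RST)) != TCPF_SYN) break;  /* want a bare SYN */
        c->irs = seq;
        c->rcv_nxt = seq + 1;          /* the peer's SYN consumes one sequence number */
        c->snd_nxt = c->iss + 1;       /* and so does ours */
        c->state = HS_SYN_RCVD;
        r = (struct hs_reply){ true, TCPF_SYN | TCPF_ACK, c->iss, c->rcv_nxt };
        break;
    case HS_SYN_RCVD:
        if (flags & TCPF_RST) { c->state = HS_LISTEN; break; }   /* peer gave up */
        if ((flags & TCPF_ACK) && ack == c->snd_nxt && seq == c->rcv_nxt)
            c->state = HS_ESTABLISHED;
        break;
    case HS_ESTABLISHED:
        break;                         /* data transfer is handled elsewhere */
    }
    return r;
}
```

With iss set to 5000, feeding in SYN seq=1000 yields a SYN-ACK with seq=5000 and ack=1001, and the following ACK with ack=5001 moves the connection to ESTABLISHED, matching the exchange in the diagram.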

Milestone 5: TCP data transfer (with sliding window)

Implement send buffer, receive buffer, sequence numbers, acknowledgements. Respect the receiver's advertised window. Handle out-of-order packets (reorder buffer).

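struct tcb {                              // transmission control block
    uint32_t snd_una, snd_nxt, snd_wnd;   // oldest unacked byte, next byte to send, peer's advertised window
    uint32_t rcv_nxt, rcv_wnd;            // next byte expected, window we advertise
    uint32_t iss, irs;                    // initial send / receive sequence numbers
    state_t  state;                       // your enum of TCP states (LISTEN, SYN_RCVD, ESTABLISHED, ...)
    // buffers
};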

Evidence: nc 10.0.0.5 8080 connection can exchange data in both directions. Test with files of varying sizes.
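The window arithmetic behind those fields can be sketched as three small helpers. The names tcb_win, snd_usable, rcv_in_window, and snd_on_ack are hypothetical, the struct holds only the window-related fields, and the window update is simplified (RFC 793 additionally tracks SND.WL1 and SND.WL2 to reject stale window updates).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tcb_win {
    uint32_t snd_una;   /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;   /* next sequence number to send */
    uint32_t snd_wnd;   /* window the peer advertised */
    uint32_t rcv_nxt;   /* next sequence number expected */
    uint32_t rcv_wnd;   /* window we advertise */
};

/* Bytes we may still send: the peer's window minus what is already in flight. */
uint32_t snd_usable(const struct tcb_win *t) {
    uint32_t in_flight = t->snd_nxt - t->snd_una;   /* unsigned subtraction survives wraparound */
    return in_flight >= t->snd_wnd ? 0 : t->snd_wnd - in_flight;
}

/* True if seq falls inside the receive window [rcv_nxt, rcv_nxt + rcv_wnd). */
bool rcv_in_window(const struct tcb_win *t, uint32_t seq) {
    return (uint32_t)(seq - t->rcv_nxt) < t->rcv_wnd;
}

/* Apply a cumulative ACK. Returns true if it acknowledged new data. */
bool snd_on_ack(struct tcb_win *t, uint32_t ack, uint32_t wnd) {
    assert(t != NULL);
    uint32_t acked = ack - t->snd_una;
    uint32_t in_flight = t->snd_nxt - t->snd_una;
    if (acked == 0 || acked > in_flight) return false;   /* duplicate, or acks data never sent */
    t->snd_una = ack;                                    /* slide the left edge forward */
    t->snd_wnd = wnd;                                    /* simplified window update */
    return true;
}
```

A sender loop would transmit at most snd_usable() bytes and then wait for an ACK. A receiver would buffer any segment for which rcv_in_window() holds but seq differs from rcv_nxt, which is the reorder buffer mentioned above.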

Milestone 6: Retransmission

Set a timer when sending. If ACK doesn't arrive in time, retransmit. Implement exponential backoff and a maximum retry count.

For correctness, use Karn's algorithm: don't update RTT estimates from retransmitted segments.

Evidence: Drop packets deliberately (use tc netem on Linux) and observe correct retransmission.
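One way to compute the timeout is the estimator from RFC 6298, which the text above does not cite; the sketch below follows its formulas as an assumption about how you might fill this milestone in. The names rto_est, rto_init, rto_sample, and rto_backoff are hypothetical, times are in milliseconds, and a 1 ms clock granularity is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RTO_MIN_MS   1000u   /* lower bound of 1 second */
#define RTO_MAX_MS  60000u   /* upper bound; any cap of at least 60 seconds is allowed */
#define RTO_INIT_MS  1000u   /* timeout before the first measurement */
#define CLOCK_G_MS      1u   /* assumed clock granularity */

struct rto_est { uint32_t srtt, rttvar, rto; bool have_sample; };

static uint32_t clamp_rto(uint32_t v) {
    if (v < RTO_MIN_MS) return RTO_MIN_MS;
    if (v > RTO_MAX_MS) return RTO_MAX_MS;
    return v;
}

void rto_init(struct rto_est *e) {
    assert(e != NULL);
    e->srtt = e->rttvar = 0;
    e->rto = RTO_INIT_MS;
    e->have_sample = false;
}

/* Feed one RTT measurement. Karn's algorithm: a sample taken from a
 * retransmitted segment is ambiguous, so it is discarded. */
void rto_sample(struct rto_est *e, uint32_t rtt_ms, bool was_retransmitted) {
    if (was_retransmitted) return;
    if (!e->have_sample) {
        e->srtt = rtt_ms;
        e->rttvar = rtt_ms / 2;
        e->have_sample = true;
    } else {
        uint32_t err = e->srtt > rtt_ms ? e->srtt - rtt_ms : rtt_ms - e->srtt;
        e->rttvar = (3 * e->rttvar + err) / 4;     /* beta = 1/4, uses the old srtt */
        e->srtt = (7 * e->srtt + rtt_ms) / 8;      /* alpha = 1/8 */
    }
    uint32_t k = 4 * e->rttvar;
    e->rto = clamp_rto(e->srtt + (k > CLOCK_G_MS ? k : CLOCK_G_MS));
}

/* The retransmission timer fired: double the timeout (exponential backoff). */
void rto_backoff(struct rto_est *e) {
    e->rto = clamp_rto(e->rto * 2);
}
```

For example, a first sample of 400 ms gives srtt=400, rttvar=200, and rto=1200; a second 400 ms sample narrows rttvar to 150 and brings rto down to the 1000 ms floor. The maximum retry count would sit in the caller, which gives up after some number of rto_backoff() calls.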

Milestone 7: Connection teardown

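FIN handshake. The side that closes first passes through FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT; the side that closes second passes through CLOSE_WAIT and LAST_ACK; a simultaneous close adds CLOSING.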

The TIME_WAIT state is the most misunderstood part of TCP — required to prevent old segments from polluting new connections. RFC 793 explains why; many tutorials gloss over it.

Evidence: Connections close cleanly. After many connections, your stack's connection table holds no entries stuck in CLOSE_WAIT, and netstat on the host shows no peer sockets stuck in FIN_WAIT_2.
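The teardown states can be sketched as a transition table. This is a simplified model with hypothetical names (close_step, the ST_ and EV_ constants): it treats the ACK of our FIN and the peer's FIN as separate events, so a segment carrying both is fed in as two events.

```c
#include <assert.h>

enum tcp_state {
    ST_ESTABLISHED, ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_CLOSING,
    ST_TIME_WAIT, ST_CLOSE_WAIT, ST_LAST_ACK, ST_CLOSED
};

enum tcp_event {
    EV_APP_CLOSE,        /* local application calls close: we send FIN */
    EV_RCV_FIN,          /* peer's FIN arrives: we ACK it */
    EV_RCV_ACK_OF_FIN,   /* peer acknowledges our FIN */
    EV_2MSL_TIMEOUT      /* TIME_WAIT timer (twice the maximum segment lifetime) expires */
};

/* Return the state that follows s when event ev occurs. */
enum tcp_state close_step(enum tcp_state s, enum tcp_event ev) {
    assert(s <= ST_CLOSED);
    switch (s) {
    case ST_ESTABLISHED:
        if (ev == EV_APP_CLOSE) return ST_FIN_WAIT_1;        /* active close */
        if (ev == EV_RCV_FIN)   return ST_CLOSE_WAIT;        /* passive close */
        break;
    case ST_FIN_WAIT_1:
        if (ev == EV_RCV_ACK_OF_FIN) return ST_FIN_WAIT_2;
        if (ev == EV_RCV_FIN)        return ST_CLOSING;      /* simultaneous close */
        break;
    case ST_FIN_WAIT_2:
        if (ev == EV_RCV_FIN) return ST_TIME_WAIT;
        break;
    case ST_CLOSING:
        if (ev == EV_RCV_ACK_OF_FIN) return ST_TIME_WAIT;
        break;
    case ST_CLOSE_WAIT:
        if (ev == EV_APP_CLOSE) return ST_LAST_ACK;          /* our FIN goes out */
        break;
    case ST_LAST_ACK:
        if (ev == EV_RCV_ACK_OF_FIN) return ST_CLOSED;
        break;
    case ST_TIME_WAIT:
        if (ev == EV_2MSL_TIMEOUT) return ST_CLOSED;
        break;
    case ST_CLOSED:
        break;
    }
    return s;   /* event is not relevant in this state */
}
```

A connection stuck in CLOSE_WAIT shows up here as EV_APP_CLOSE never being delivered: the peer has closed, but the local side never sends its own FIN.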

Milestone 8 (optional): Sockets API

Wrap your stack in socket(), bind(), listen(), accept(), read(), write(). Now an application can use your stack.

Milestone 9 (optional): Congestion control

Slow start, congestion avoidance, fast retransmit, fast recovery (RFC 5681). The algorithm that keeps the internet from melting.
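The four mechanisms reduce to a handful of updates on two variables, cwnd and ssthresh. The sketch below follows the RFC 5681 rules as a rough guide; the struct and function names are hypothetical, SMSS is assumed to be 1460 bytes, and the initial window is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMSS 1460u   /* assumed sender maximum segment size, in bytes */

struct cc { uint32_t cwnd, ssthresh; };

void cc_init(struct cc *c, uint32_t initial_window) {
    assert(c != NULL && initial_window > 0);
    c->cwnd = initial_window;
    c->ssthresh = UINT32_MAX;                  /* start arbitrarily high */
}

/* A new ACK arrived covering `acked` previously unacknowledged bytes. */
void cc_on_new_ack(struct cc *c, uint32_t acked) {
    assert(c->cwnd > 0);
    if (c->cwnd < c->ssthresh) {
        c->cwnd += acked < SMSS ? acked : SMSS;   /* slow start: cwnd += min(N, SMSS) */
    } else {
        uint32_t inc = SMSS * SMSS / c->cwnd;     /* congestion avoidance: about one SMSS per RTT */
        c->cwnd += inc ? inc : 1;                 /* round a zero increment up to 1 byte */
    }
}

/* max(FlightSize / 2, 2 * SMSS) */
static uint32_t half_flight(uint32_t flight) {
    uint32_t h = flight / 2;
    return h > 2 * SMSS ? h : 2 * SMSS;
}

/* Retransmission timer expired: collapse to one segment and slow-start again. */
void cc_on_timeout(struct cc *c, uint32_t flight) {
    c->ssthresh = half_flight(flight);
    c->cwnd = SMSS;
}

/* Third duplicate ACK: fast retransmit the missing segment, enter fast recovery. */
void cc_on_third_dupack(struct cc *c, uint32_t flight) {
    c->ssthresh = half_flight(flight);
    c->cwnd = c->ssthresh + 3 * SMSS;          /* three segments have left the network */
}

/* Each further duplicate ACK during fast recovery inflates the window. */
void cc_on_extra_dupack(struct cc *c) { c->cwnd += SMSS; }

/* The ACK that ends fast recovery deflates the window back to ssthresh. */
void cc_on_recovery_ack(struct cc *c) { c->cwnd = c->ssthresh; }
```

The sender then limits data in flight to the smaller of cwnd and the peer's advertised window. The flight argument is the number of bytes sent but not yet acknowledged, i.e. snd_nxt minus snd_una in the tcb from Milestone 5.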


8. Tests & evidence

Test              How
Ethernet/ARP      ping 10.0.0.5 resolves via ARP
ICMP              ping 10.0.0.5 succeeds with full RTT report
TCP handshake     nc 10.0.0.5 8080 connects; Wireshark shows clean exchange
TCP data          Round-trip a 1 MB file via nc
Out-of-order      Reorder packets in a test harness; data still delivered correctly
Packet loss       Drop 10% of packets; data still delivered
Connection close  Many connections close cleanly; no leaks
Interop           A real client (Python, curl) can talk to your stack

The strongest evidence: a Wireshark capture showing your stack having a clean conversation with a real client.


9. Common pitfalls

  • Network byte order. All multi-byte fields in network headers are big-endian. htons, htonl, ntohs, ntohl are your friends.
  • Struct packing. Mark wire-format structs __attribute__((packed)); otherwise the compiler is free to insert padding and the struct no longer matches the bytes on the wire.
  • Checksum miscalculation. One bit wrong → packet dropped silently. Verify each layer with Wireshark.
  • TCP sequence numbers wrap around. Use modular comparison ((int32_t)(a - b) < 0).
  • Forgetting the pseudo-header for the TCP/UDP checksum. Over IPv4 the checksum also covers a 12-byte "pseudo-header": source IP, destination IP, a zero byte, the protocol number, and the TCP/UDP length. Easy to miss.
  • TIME_WAIT. Don't shortcut it. Two reasons: ensure the final ACK is delivered; let stray segments expire.
  • State machine completeness. The full TCP state diagram has 11 states. Skipping LAST_ACK or CLOSE_WAIT will produce hung connections.
  • Reading the wrong RFC. RFC 793 has errata. Modern reference: RFC 9293 (2022).
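Two of the pitfalls above, sequence-number wraparound and the pseudo-header, fit in a short sketch. seq_lt is the modular comparison quoted in the list; tcp_csum and csum_add are hypothetical helper names, and the code assumes TCP over IPv4 (protocol number 6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if sequence number a comes before b, modulo 2^32. */
bool seq_lt(uint32_t a, uint32_t b) {
    return (int32_t)(a - b) < 0;
}

/* Add len bytes to a running one's-complement sum, as big-endian 16-bit words. */
static uint32_t csum_add(uint32_t sum, const uint8_t *p, size_t len) {
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;      /* pad the odd byte with zero */
    return sum;
}

/* TCP checksum over the IPv4 pseudo-header plus the whole segment
 * (TCP header and payload). To compute, zero the checksum field first;
 * to verify, leave it in place and expect a result of 0. */
uint16_t tcp_csum(const uint8_t src_ip[4], const uint8_t dst_ip[4],
                  const uint8_t *seg, size_t seg_len) {
    uint8_t pseudo[12];
    assert(seg_len <= 0xffff);                 /* the length field is 16 bits */
    memcpy(pseudo, src_ip, 4);
    memcpy(pseudo + 4, dst_ip, 4);
    pseudo[8]  = 0;                            /* zero byte */
    pseudo[9]  = 6;                            /* protocol number for TCP */
    pseudo[10] = (uint8_t)(seg_len >> 8);      /* TCP length, big-endian */
    pseudo[11] = (uint8_t)(seg_len & 0xff);
    uint32_t sum = csum_add(0, pseudo, sizeof pseudo);
    sum = csum_add(sum, seg, seg_len);
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);    /* end-around carry */
    return (uint16_t)~sum;
}
```

The pseudo-header is never transmitted; it only takes part in the sum. For UDP the same layout applies with protocol number 17.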

10. Extensions

  • UDP. Trivial after TCP. Add it for completeness.
  • Real socket API. socket()/bind()/listen()/... so an unmodified application can use your stack.
  • Selective acknowledgment (SACK) — RFC 2018.
  • TCP Reno or CUBIC congestion control — the algorithms used in modern OSes.
  • IPv6. New header format, different addressing. Same TCP underneath.
  • TLS — way out of scope, but conceptually the next protocol you'd add.

11. Module integration

Module                                What the network stack deepens
Sem 4 Module 1: C fundamentals        Large C project with strict correctness requirements.
Sem 4 Module 2: Memory & pointers     Byte-level packet manipulation. Pointer-into-buffer is the dominant pattern.
Sem 5 Module 3: Concurrency           Multiple connections, timers, retransmission queues, all concurrent.
Sem 5 Module 5: Network protocols     The whole module.
BitTorrent Client tutorial            After implementing TCP, BitTorrent's protocol feels straightforward.
Operating System tutorial             Your OS needs a network stack. Same code can be ported to your kernel.
Web Server tutorial                   HTTP layer sits on TCP.

12. Portfolio framing

What to publish:

  • C source organized as src/{ethernet,arp,ip,icmp,tcp}.c.
  • A Makefile. A run script that sets up the TAP device.
  • A README with a Wireshark screenshot showing a clean TCP handshake.
  • A list of which RFCs you implemented and which you skipped.

What to keep private:

  • None — this is portfolio-grade.

Reviewer entry points:

  • src/tcp.c — the state machine.
  • tests/handshake.pcap — a captured exchange.
  • README must include: which RFCs are implemented, scope limitations, and the handshake screenshot.

This is a serious portfolio project. "I implemented TCP/IP from scratch" is a sentence that gets attention.


Source

This tutorial draws from the BYO-X catalog "Network Stack" entry. RFC 793 and Saminiir's 5-part series are the canonical primary sources.