Module 4: Systems-Level Programming
Primary texts: The C Programming Language (K&R) Chapter 8 for the UNIX system-interface core; canonical Linux man pages (man7.org); Beej's Guides to Network Programming and IPC for sockets and pipes.
Selective support: Code (Petzold) Chapters 25-26 for the conceptual picture of peripherals and operating systems; Computer Organization and Design Chapter 5.4 (virtual memory) and Chapter 7 (parallel processing) for hardware context.
Local support: Computer Systems: A Programmer's Perspective (CSAPP, Bryant & O'Hallaron) is now available in the Semester 4 local chunked books. Use it as the second source for fork/exec, virtual memory, signals, and concurrency, after K&R and the man pages.
This guide is the primary teacher. You do not need to read the source books front-to-back to complete this module. You do need to become operationally comfortable crossing the kernel boundary: asking the operating system to create processes, open files, map memory, start threads, and move bytes over sockets -- and being able to read strace output when the program misbehaves.
Scope of This Module
This module is where C stops being a standalone language and starts being the way you talk to a UNIX-like operating system.
What it covers in depth:
- the process abstraction and the role of system calls as the kernel boundary
- creating processes with `fork`, replacing them with `exec`, and reaping them with `wait`
- exit codes, signals (`SIGINT`, `SIGTERM`, `SIGCHLD`, `SIGSEGV`), and what "zombie" means
- file descriptors as a unified handle for files, pipes, sockets, and devices
- the low-level I/O syscalls `open`, `read`, `write`, `close`, `lseek`
- `dup`/`dup2`, `pipe`, and how a shell implements `|` and `<`/`>`
- `mmap` for memory-mapped files and how it relates to virtual memory
- shared memory segments between cooperating processes
- custom allocators built from `sbrk`/`mmap`: free lists and arenas
- POSIX threads and the `pthread_create`/`pthread_join` lifecycle
- mutexes, condition variables, and the classic producer-consumer pattern
- C11 atomics, memory ordering, and what "sequentially consistent" means in practice
- TCP sockets in C -- both client and server -- using the BSD socket API
- interactive debugging with `gdb`: breakpoints, watchpoints, backtraces, core dumps
- tracing and profiling with `strace`, `ltrace`, `perf`, and `valgrind --tool=callgrind`
What it deliberately does not try to finish here:
- full operating-system internals (Semester 5, Module 1)
- networking protocols above the socket API (Semester 5 covers IP, TCP, HTTP in depth)
- distributed systems concerns (Semester 6)
- kernel programming or device drivers
This is a "system-call fluency" module. If you can recite that fork returns twice but cannot write a fork program from memory and predict its output, you are not done.
Before You Start
Answer these closed-book before starting the main path:
- What does a C program have to do differently to read a file than to read from `stdin`?
- What is the difference between a process and a thread, in one sentence?
- When a program prints to `stdout`, which function call eventually talks to the kernel?
- If two threads both increment the same `int` a million times, why might the final value not be `2,000,000`?
- What does the `&` at the end of `./server &` in a shell actually do?
Diagnostic Interpretation
4-5 solid answers
- You have the prerequisite picture. Go straight in.
2-3 solid answers
- Continue, but slow down in Cluster 1 (processes and syscalls) and Cluster 4 (concurrency). These are the clusters where "I understand pointers" is not enough.
0-1 solid answers
- Revisit Module 2 (memory/pointers) before starting Cluster 3 (allocators). Revisit Module 1 stdio examples before Cluster 2. Low-level I/O assumes you already read files with `fopen`.
What This Module Is For
By the end of Semester 4 you should be able to read a program's source and narrate what the kernel does for it. This module is where that picture becomes concrete.
After this module you should be able to:
- write a miniature shell that forks, redirects with pipes, and waits for children
- explain, to a new teammate, why `read` can return fewer bytes than requested
- write a multi-threaded program that does not race, and defend the proof
- write a TCP echo server and client from memory
- reproduce a bug under `gdb`, set a watchpoint on the offending variable, and capture a core dump
These are the operational skills that separate "I can code" from "I can build systems."
Concept Map
How To Use This Module
Work in order. Cluster 1 establishes the kernel boundary; every later cluster depends on it.
Cluster 1: Processes and System Calls
| Order | Concept | Type | Focus |
|---|---|---|---|
| 1 | What a Process Is and How Syscalls Cross the Kernel Boundary | PRIMARY | The process abstraction, user vs kernel mode, the syscall trap |
| 2 | fork, exec, wait: Creating and Managing Processes | PRIMARY | Why fork returns twice; the fork+exec pattern; reaping children |
| 3 | Exit Codes, Signals, and Process Lifetime | PRIMARY | exit vs _exit, signal delivery, zombies and orphans |
Cluster mastery check: Can you write a C program that forks a child, execs /bin/ls, waits for it, and prints its exit status -- without looking anything up?
Cluster 2: File Descriptors and I/O
| Order | Concept | Type | Focus |
|---|---|---|---|
| 4 | File Descriptors as a Unified I/O Handle | PRIMARY | Small-integer handles, the per-process FD table, FD 0/1/2 |
| 5 | open, read, write, close, lseek | PRIMARY | Byte-oriented I/O, partial reads, the file offset |
| 6 | dup, pipe, and Building Shell Redirection | PRIMARY | Rewiring FD 0/1 before exec; building `\|` |
Cluster mastery check: Can you diagram what the FD table looks like on both sides of pipe() + fork() + dup2(..., STDOUT_FILENO)?
Cluster 3: Memory Management in Practice
| Order | Concept | Type | Focus |
|---|---|---|---|
| 7 | mmap and Memory-Mapped Files | PRIMARY | Virtual memory as a file view; MAP_PRIVATE vs MAP_SHARED |
| 8 | Shared Memory Between Processes | SUPPORTING | shm_open, MAP_SHARED, and why shared memory needs sync |
| 9 | Writing a Custom Allocator: Free Lists and Arenas | SUPPORTING | K&R's storage allocator; bump allocators; fragmentation |
Cluster mastery check: Can you explain why mmap'ing a 4 GB file on a 64-bit system does not actually allocate 4 GB of RAM?
Cluster 4: Concurrency Primitives
| Order | Concept | Type | Focus |
|---|---|---|---|
| 10 | POSIX Threads: pthread_create, pthread_join | PRIMARY | Thread lifecycle, shared address space, stack vs heap |
| 11 | Mutexes, Condition Variables, and Producer-Consumer | PRIMARY | Mutual exclusion, waiting for a predicate, the canonical pattern |
| 12 | Atomics and Memory Ordering at the C Level | SUPPORTING | stdatomic.h, memory_order_*, when "just use a mutex" wins |
Cluster mastery check: Can you write the producer-consumer solution from memory and name, specifically, the race condition each line prevents?
Cluster 5: Sockets and Debugging Tools
| Order | Concept | Type | Focus |
|---|---|---|---|
| 13 | TCP Sockets in C: The BSD Socket API | PRIMARY | Client and server lifecycles; accept as an FD factory |
| 14 | Debugging with gdb: Breakpoints, Watchpoints, Core Dumps | PRIMARY | Interactive debugging; post-mortem from a core file |
| 15 | Profiling and Tracing: perf, strace, ltrace, valgrind --tool=callgrind | SUPPORTING | Choosing the right tool for "slow," "wrong," or "leaking" |
Cluster mastery check: Given a running program that suddenly hangs, can you name three independent tools you could reach for and what question each answers?
Then work these practice pages:
| Order | Practice path | Focus |
|---|---|---|
| 1 | Processes and Pipes Lab | Forking, exec, redirection; build a mini shell pipeline |
| 2 | File I/O and mmap Workshop | cat/wc from scratch; mmap'd word counter |
| 3 | Concurrency and Debugging Clinic | Producer-consumer; debugging a planted race with gdb |
| 4 | Code Katas | Mini shell, cat, wc, producer-consumer, tiny HTTP, gdb hunt |
Use Module Quiz after the concept and practice path. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.
Learning Objectives
By the end of this module you should be able to:
- Describe what a process is, what a system call is, and what happens on the CPU when user code invokes one.
- Write a C program that forks a child, runs an external command via `execvp`, and collects the exit status with `waitpid`.
- Explain exit codes, signals, and how to install a `SIGINT` handler without introducing an async-signal-safety bug.
- Use the raw I/O syscalls (`open`, `read`, `write`, `close`, `lseek`) to implement `cat` and `wc` that match the output of the system versions for typical inputs.
- Build a shell pipeline using `pipe` and `dup2` and explain why unused FDs must be closed in both parent and child.
- Memory-map a file with `mmap` and explain the difference between `MAP_SHARED` and `MAP_PRIVATE`.
- Write a minimal custom allocator over a large `mmap`'d region.
- Write a producer-consumer program in POSIX threads, identify each line's role in preventing races, and argue its correctness.
- Write a TCP echo server and client from memory, including error handling for partial `send`/`recv`.
- Reproduce a crash under `gdb`, set a watchpoint, and read a core dump to identify the failing line.
Outputs
- one annotated `fork`/`exec`/`wait` program with predicted vs actual output
- one custom `cat` and one custom `wc` that match the system tools on ASCII inputs
- one miniature shell that supports a single pipe (`ls | wc -l`) with correct FD closing
- one `mmap`-based file tool (either a word counter or a grep-like search)
- one custom allocator over an `mmap`'d arena with freelist coalescing, with a short writeup
- one producer-consumer program plus a line-by-line race analysis
- one TCP echo server and matching client
- one `gdb` session transcript from a planted bug, from `break`/`run`/`print` to root cause
- one `strace` or `perf` output annotated with "this syscall line is where the time goes"
- a mistake log with tags such as `forgot to close FD in parent`, `used exit instead of _exit in signal handler`, `partial read not looped`, `mutex not held when signalling`, or `endianness forgotten in sockaddr_in`.
Completion Standard
You have completed Module 4 when all of these are true:
- you can write `fork`+`exec`+`wait` from memory and predict the output before running
- you can draw the FD table through `pipe`+`fork`+`dup2` and say what closes when
- you can state, for each concurrency primitive, the specific race condition it prevents
- you can point at a line of `strace` output and say which C line produced it
- you can attach `gdb` to a running process and explain what you are doing to a teammate
If you can describe the syscall boundary but cannot demonstrate it at the keyboard, the module is not complete.
Reading Policy
- Concept pages are the main path.
- K&R Chapter 8 is the only book chapter that directly mirrors the module; treat it as co-teacher for Clusters 1-3.
- The Linux man pages (`man7.org`) and Beej's guides are primary external references -- they are the canonical sources professional engineers read.
- `Read This Only If Stuck` links are specific chunks; open them only after the concept page and a retrieval attempt.
- CSAPP is now in the local Semester 4 library; where a concept page points you there, prefer the local chunk before opening a secondary external tutorial.
Suggested Weekly Flow
| Day | Work |
|---|---|
| 1 | Concepts 1-3 and a fork/exec/wait demo program |
| 2 | Concepts 4-6 and rewrite cat |
| 3 | Practice 1 (processes and pipes lab), extend to `ls \| wc -l` |
| 4 | Concepts 7-9 and an mmap'd word counter |
| 5 | Concepts 10-12 and the producer-consumer program |
| 6 | Concepts 13-15, practice pages 2-3 |
| 7 | Code katas (mini shell, tiny HTTP, gdb hunt), quiz, mistake-log cleanup |
Reference
If you need exact links into the local chunked books or external canonical docs, use Reference and Selective Reading.
The Shell tutorial is fork/exec/pipe/dup2 in action. For the next step up, the Container Runtime tutorial extends the same primitives with namespaces and cgroups. See Build Your Own X overview.
Rich Learning Pages
Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread