
Module 4: Systems-Level Programming

Primary texts: The C Programming Language (K&R) Chapter 8 for the UNIX system-interface core; canonical Linux man pages (man7.org); Beej's Guides to Network Programming and IPC for sockets and pipes.

Selective support: Code (Petzold) Chapters 25-26 for the conceptual picture of peripherals and operating systems; Computer Organization and Design Chapter 5.4 (virtual memory) and Chapter 7 (parallel processing) for hardware context.

Local support: Computer Systems: A Programmer's Perspective (CSAPP, Bryant & O'Hallaron) is now available in the Semester 4 local chunked books. Use it as the best second source for fork/exec, virtual memory, signals, and concurrency after K&R and the man pages.

This guide is the primary teacher. You do not need to read the source books front-to-back to complete this module. You do need to become operationally comfortable crossing the kernel boundary: asking the operating system to create processes, open files, map memory, start threads, and move bytes over sockets -- and being able to read strace output when the program misbehaves.


Scope of This Module

This module is where C stops being a standalone language and starts being the way you talk to a UNIX-like operating system.

What it covers in depth:

  • the process abstraction and the role of system calls as the kernel boundary
  • creating processes with fork, replacing them with exec, and reaping them with wait
  • exit codes, signals (SIGINT, SIGTERM, SIGCHLD, SIGSEGV), and what "zombie" means
  • file descriptors as a unified handle for files, pipes, sockets, and devices
  • the low-level I/O syscalls open, read, write, close, lseek
  • dup/dup2, pipe, and how a shell implements | and </>
  • mmap for memory-mapped files and how it relates to virtual memory
  • shared memory segments between cooperating processes
  • custom allocators built from sbrk/mmap: free lists and arenas
  • POSIX threads and the pthread_create/pthread_join lifecycle
  • mutexes, condition variables, and the classic producer-consumer pattern
  • C11 atomics, memory ordering, and what "sequentially consistent" means in practice
  • TCP sockets in C -- both client and server -- using the BSD socket API
  • interactive debugging with gdb: breakpoints, watchpoints, backtraces, core dumps
  • tracing and profiling with strace, ltrace, perf, and valgrind --tool=callgrind

What it deliberately does not try to finish here:

  • full operating-system internals (Semester 5, Module 1)
  • networking protocols above the socket API (Semester 5 covers IP, TCP, HTTP in depth)
  • distributed systems concerns (Semester 6)
  • kernel programming or device drivers

This is a "system-call fluency" module. If you can recite that fork returns twice but cannot write a fork program from memory and predict its output, you are not done.


Before You Start

Answer these closed-book before starting the main path:

  1. What does a C program have to do differently to read a file than to read from stdin?
  2. What is the difference between a process and a thread, in one sentence?
  3. When a program prints to stdout, which function call eventually talks to the kernel?
  4. If two threads both increment the same int a million times, why might the final value not be 2,000,000?
  5. What does the & at the end of ./server & in a shell actually do?

Diagnostic Interpretation

4-5 solid answers

  • You have the prerequisite picture. Go straight in.

2-3 solid answers

  • Continue, but slow down in Cluster 1 (processes and syscalls) and Cluster 4 (concurrency). These are the clusters where "I understand pointers" is not enough.

0-1 solid answers

  • Revisit Module 2 (memory/pointers) before starting Cluster 3 (allocators). Revisit the Module 1 stdio examples before Cluster 2: low-level I/O assumes you can already read files with fopen.

What This Module Is For

By the end of Semester 4 you should be able to read a program's source and narrate what the kernel does for it. This module is where that picture becomes concrete.

After this module you should be able to:

  • write a miniature shell that forks, redirects with pipes, and waits for children
  • explain, to a new teammate, why read can return fewer bytes than requested
  • write a multi-threaded program that does not race, and defend the proof
  • write a TCP echo server and client from memory
  • reproduce a bug under gdb, set a watchpoint on the offending variable, and capture a core dump

These are the operational skills that separate "I can code" from "I can build systems."
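
The point about read returning fewer bytes than requested can be made concrete with a pair of loop helpers. This is a minimal sketch, not a canonical implementation: the names read_full and write_all are hypothetical, and the choice to retry on EINTR is an assumption that suits blocking descriptors.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read until `count` bytes have arrived or EOF is hit.
 * Returns the number of bytes read (short only at EOF), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n == 0)
            break;                  /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;          /* a short read is normal: keep going */
    }
    return (ssize_t)done;
}

/* Write all `count` bytes, looping over short writes.
 * Returns count on success, or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    size_t done = 0;
    while (done < count) {
        ssize_t n = write(fd, (const char *)buf + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A single read on a pipe, socket, or terminal returns whatever is available at that moment, so any code that needs an exact byte count has to loop like this.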


Concept Map


How To Use This Module

Work in order. Cluster 1 establishes the kernel boundary; every later cluster depends on it.

Cluster 1: Processes and System Calls

Order | Concept | Type | Focus
1 | What a Process Is and How Syscalls Cross the Kernel Boundary | PRIMARY | The process abstraction, user vs kernel mode, the syscall trap
2 | fork, exec, wait: Creating and Managing Processes | PRIMARY | Why fork returns twice; the fork+exec pattern; reaping children
3 | Exit Codes, Signals, and Process Lifetime | PRIMARY | exit vs _exit, signal delivery, zombies and orphans

Cluster mastery check: Can you write a C program that forks a child, execs /bin/ls, waits for it, and prints its exit status -- without looking anything up?
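
After attempting the check yourself, you can compare against a sketch like the one below. It shows one possible shape of the fork + exec + wait core. The helper name run_and_wait is hypothetical, and the convention of exiting with 127 when exec fails is borrowed from common shell behaviour rather than required by the API.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at argv[0] in a child process and wait for it.
 * Returns the child's exit status (0-255), or -1 if the child could not
 * be created or reaped, or was killed by a signal. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();             /* returns twice: 0 in child, child's PID in parent */
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execv(argv[0], argv);       /* only returns if exec failed */
        perror("execv");
        _exit(127);                 /* _exit: do not flush the parent's stdio buffers */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                      /* terminated by a signal */
}
```

Calling it with an argument vector such as { "/bin/ls", NULL } and printing the return value covers the check. Without the waitpid call the child would stay a zombie until the parent exits.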

Cluster 2: File Descriptors and I/O

Order | Concept | Type | Focus
4 | File Descriptors as a Unified I/O Handle | PRIMARY | Small-integer handles, the per-process FD table, FD 0/1/2
5 | open, read, write, close, lseek | PRIMARY | Byte-oriented I/O, partial reads, the file offset
6 | dup, pipe, and Building Shell Redirection | PRIMARY | Rewiring FD 0/1 before exec; building `

Cluster mastery check: Can you diagram what the FD table looks like on both sides of pipe() + fork() + dup2(..., STDOUT_FILENO)?
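
The FD rewiring in that check can be traced in code. The sketch below captures a child's standard output through a pipe. The function name capture_stdout is hypothetical, and the comments mark which descriptor each side must close.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its stdout redirected into a pipe and copy up to `cap`
 * bytes of that output into buf. Returns the byte count, or -1 on error. */
ssize_t capture_stdout(char *const argv[], char *buf, size_t cap)
{
    int p[2];                       /* p[0] = read end, p[1] = write end */
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();             /* both processes now hold p[0] and p[1] */
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: make FD 1 refer to the write end, then drop the two
         * original pipe FDs so only FD 1 keeps the pipe open. */
        if (dup2(p[1], STDOUT_FILENO) < 0)
            _exit(126);
        close(p[0]);
        close(p[1]);
        execv(argv[0], argv);
        _exit(127);
    }

    /* Parent: close the write end, otherwise read() never sees EOF
     * because the parent itself still counts as a writer. */
    close(p[1]);
    size_t done = 0;
    while (done < cap) {
        ssize_t n = read(p[0], buf + done, cap - done);
        if (n == 0)
            break;                  /* every write end is closed */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        done += (size_t)n;
    }
    close(p[0]);
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;                           /* reap the child */
    return (ssize_t)done;
}
```

If the parent forgets close(p[1]), the read loop blocks forever after the child exits. That is the "forgot to close FD in parent" mistake in its most common form.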

Cluster 3: Memory Management in Practice

Order | Concept | Type | Focus
7 | mmap and Memory-Mapped Files | PRIMARY | Virtual memory as a file view; MAP_PRIVATE vs MAP_SHARED
8 | Shared Memory Between Processes | SUPPORTING | shm_open, MAP_SHARED, and why shared memory needs sync
9 | Writing a Custom Allocator: Free Lists and Arenas | SUPPORTING | K&R's storage allocator; bump allocators; fragmentation

Cluster mastery check: Can you explain why mmap'ing a 4 GB file on a 64-bit system does not actually allocate 4 GB of RAM?
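
A small mmap-based line counter illustrates the idea behind that check: the mmap call only reserves address space, and the kernel brings pages in as the loop first touches them. This is a sketch under assumptions. The function name count_newlines_mmap is hypothetical, and the early return for empty files is there because mmap rejects a zero length.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count '\n' bytes in a file by mapping it read-only.
 * Returns the count, or -1 if the file cannot be opened or mapped. */
long count_newlines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap fails with EINVAL for length 0 */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long lines = 0;
    for (size_t i = 0; i < len; i++)    /* first touch of each page faults it in */
        if (p[i] == '\n')
            lines++;

    munmap((void *)p, len);
    return lines;
}
```

The file is never copied into a user buffer with read. The loop walks the mapping like an ordinary array, and pages that are never touched are never loaded.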

Cluster 4: Concurrency Primitives

Order | Concept | Type | Focus
10 | POSIX Threads: pthread_create, pthread_join | PRIMARY | Thread lifecycle, shared address space, stack vs heap
11 | Mutexes, Condition Variables, and Producer-Consumer | PRIMARY | Mutual exclusion, waiting for a predicate, the canonical pattern
12 | Atomics and Memory Ordering at the C Level | SUPPORTING | stdatomic.h, memory_order_*, when "just use a mutex" wins

Cluster mastery check: Can you write the producer-consumer solution from memory and name, specifically, the race condition each line prevents?
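
For comparison after your own attempt, here is one common shape of the bounded-buffer producer-consumer, with the race each key line guards against noted in comments. Everything named here (struct queue, queue_put, queue_get, produce_and_sum, the capacity of 8) is an illustrative assumption, not the module's reference solution.

```c
#include <pthread.h>

#define CAP 8

struct queue {
    int items[CAP];
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

static void queue_put(struct queue *q, int v)
{
    pthread_mutex_lock(&q->lock);               /* prevents concurrent updates of head/count */
    while (q->count == CAP)                     /* while, not if: handles spurious wakeups */
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % CAP] = v;
    q->count++;
    pthread_cond_signal(&q->not_empty);         /* signalled with the mutex held */
    pthread_mutex_unlock(&q->lock);
}

static int queue_get(struct queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)                       /* re-check the predicate after every wakeup */
        pthread_cond_wait(&q->not_empty, &q->lock);
    int v = q->items[q->head];
    q->head = (q->head + 1) % CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return v;
}

struct job { struct queue *q; int n; };

static void *producer(void *arg)
{
    struct job *j = arg;
    for (int i = 1; i <= j->n; i++)
        queue_put(j->q, i);
    return NULL;
}

/* One producer thread pushes 1..n; the caller consumes and sums them.
 * Returns the sum, or -1 if the thread could not be created. */
long produce_and_sum(int n)
{
    struct queue q = { .head = 0, .count = 0 };
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);

    struct job j = { &q, n };
    pthread_t t;
    long sum = -1;
    if (pthread_create(&t, NULL, producer, &j) == 0) {
        sum = 0;
        for (int i = 0; i < n; i++)
            sum += queue_get(&q);
        pthread_join(t, NULL);
    }

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.lock);
    return sum;
}
```

Compile with -pthread. pthread_cond_wait atomically releases the mutex while sleeping and reacquires it before returning, which is why the predicate check and the wait must both sit inside the locked region.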

Cluster 5: Sockets and Debugging Tools

Order | Concept | Type | Focus
13 | TCP Sockets in C: The BSD Socket API | PRIMARY | Client and server lifecycles; accept as an FD factory
14 | Debugging with gdb: Breakpoints, Watchpoints, Core Dumps | PRIMARY | Interactive debugging; post-mortem from a core file
15 | Profiling and Tracing: perf, strace, ltrace, valgrind --tool=callgrind | SUPPORTING | Choosing the right tool for "slow," "wrong," or "leaking"

Cluster mastery check: Given a running program that suddenly hangs, can you name three independent tools you could reach for and what question each answers?

Then work these practice pages:

Order | Practice path | Focus
1 | Processes and Pipes Lab | Forking, exec, redirection; build a mini shell pipeline
2 | File I/O and mmap Workshop | cat/wc from scratch; mmap'd word counter
3 | Concurrency and Debugging Clinic | Producer-consumer; debugging a planted race with gdb
4 | Code Katas | Mini shell, cat, wc, producer-consumer, tiny HTTP, gdb hunt

Use Module Quiz after the concept and practice path. Use Reference and Selective Reading and Learning Resources only for targeted reinforcement.


Learning Objectives

By the end of this module you should be able to:

  1. Describe what a process is, what a system call is, and what happens on the CPU when user code invokes one.
  2. Write a C program that forks a child, runs an external command via execvp, and collects the exit status with waitpid.
  3. Explain exit codes, signals, and how to install a SIGINT handler without introducing an async-signal-safety bug.
  4. Use the raw I/O syscalls (open, read, write, close, lseek) to implement cat and wc that match the output of the system versions for typical inputs.
  5. Build a shell pipeline using pipe and dup2 and explain why unused FDs must be closed in both parent and child.
  6. Memory-map a file with mmap and explain the difference between MAP_SHARED and MAP_PRIVATE.
  7. Write a minimal custom allocator over a large mmap'd region.
  8. Write a producer-consumer program in POSIX threads, identify each line's role in preventing races, and argue its correctness.
  9. Write a TCP echo server and client from memory, including error handling for partial send/recv.
  10. Reproduce a crash under gdb, set a watchpoint, and read a core dump to identify the failing line.
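
Objective 3 is easier to picture with a concrete handler. The sketch below shows one conservative pattern: the handler only sets a flag of type volatile sig_atomic_t and the main loop polls it. The names on_sigint, install_sigint_handler, and sigint_seen are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* The only state the handler touches. sig_atomic_t can be written
 * safely from a signal handler; volatile stops the compiler from
 * caching it in the main loop. */
static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;     /* no printf, malloc, or exit here: none are async-signal-safe */
}

/* Install the handler with sigaction. Returns 0 on success, -1 on error. */
int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;    /* no SA_RESTART: blocking calls fail with EINTR */
    return sigaction(SIGINT, &sa, NULL);
}

/* Poll from normal (non-handler) code, e.g. once per main-loop iteration. */
int sigint_seen(void)
{
    return got_sigint != 0;
}
```

The real work, such as flushing output or closing sockets, happens in ordinary code after sigint_seen returns true, where any library function is safe to call.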

Outputs

  • one annotated fork/exec/wait program with predicted vs actual output
  • one custom cat and one custom wc that match the system tools on ASCII inputs
  • one miniature shell that supports a single pipe (ls | wc -l) with correct FD closing
  • one mmap-based file tool (either a word counter or a grep-like search)
  • one custom allocator over an mmap'd arena with freelist coalescing, with a short writeup
  • one producer-consumer program plus a line-by-line race analysis
  • one TCP echo server and matching client
  • one gdb session transcript from a planted bug, from break/run/print to root cause
  • one strace or perf output annotated with "this syscall line is where the time goes"
  • a mistake log with tags such as "forgot to close FD in parent", "used exit instead of _exit in signal handler", "partial read not looped", "mutex not held when signalling", or "endianness forgotten in sockaddr_in"
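
The last mistake tag, endianness forgotten in sockaddr_in, is worth seeing once in code. This is a minimal sketch with a hypothetical helper name, make_loopback_addr, that fills in an IPv4 address structure with the port and address converted to network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Build an IPv4 address for 127.0.0.1:port, ready for bind() or connect(). */
struct sockaddr_in make_loopback_addr(unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                    /* host to network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    return addr;
}
```

On a little-endian machine, assigning the port without htons stores its bytes in the wrong order, so a server meant for port 8080 (0x1F90) ends up on port 36895 (0x901F) and clients cannot find it.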

Completion Standard

You have completed Module 4 when all of these are true:

  • you can write fork + exec + wait from memory and predict the output before running
  • you can draw the FD table through pipe + fork + dup2 and say what closes when
  • you can state, for each concurrency primitive, the specific race condition it prevents
  • you can point at a line of strace output and say which C line produced it
  • you can attach gdb to a running process and explain what you are doing to a teammate

If you can describe the syscall boundary but cannot demonstrate it at the keyboard, the module is not complete.


Reading Policy

  • Concept pages are the main path.
  • K&R Chapter 8 is the only book chapter that directly mirrors the module; treat it as co-teacher for Clusters 1-3.
  • The Linux man pages (man7.org) and Beej's guides are primary external references -- they are the canonical sources professional engineers read.
  • Read This Only If Stuck links are specific chunks; open them only after the concept page and a retrieval attempt.
  • CSAPP is now in the local Semester 4 library; where a concept page points you there, prefer the local chunk before opening a secondary external tutorial.

Suggested Weekly Flow

Day | Work
1 | Concepts 1-3 and a fork/exec/wait demo program
2 | Concepts 4-6 and rewrite cat
3 | Practice 1 (processes and pipes lab), extend to `ls
4 | Concepts 7-9 and an mmap'd word counter
5 | Concepts 10-12 and the producer-consumer program
6 | Concepts 13-15, practice pages 2-3
7 | Code katas (mini shell, tiny HTTP, gdb hunt), quiz, mistake-log cleanup

Reference

If you need exact links into the local chunked books or external canonical docs, use Reference and Selective Reading.


Build Your Own X — elective

The Shell tutorial is fork/exec/pipe/dup2 in action. For the next step up, the Container Runtime tutorial extends the same primitives with namespaces and cgroups. See Build Your Own X overview.

Rich Learning Pages

Worked Examples | Guided Labs | Case Studies | Mistake Clinic | Reading Guide | Capstone Thread