fork, exec, wait: Creating and Managing Processes

What This Concept Is

UNIX creates processes by splitting and replacing, not by constructing from scratch.

  • fork() makes a near-identical copy of the calling process. Both processes continue running from the point right after fork. The call returns twice: once in the parent (returning the child's pid) and once in the child (returning 0).
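  • execvp(file, argv) replaces the current process's program image with a new one, searching PATH for file when the name contains no slash. The pid stays the same, and so do open file descriptors and the working directory; the old code, data, heap, and stack are discarded and rebuilt from the executable file. On success, execvp never returns.
  • waitpid(pid, &status, options), with options set to 0, blocks the parent until the named child terminates. It returns the child's pid and stores the termination status in status.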
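
The "returns twice" behavior is easiest to see by printing from both sides of the fork. The following is a minimal sketch; fork_demo is a hypothetical helper name, and the exit code 7 is an arbitrary value chosen only so the parent has something recognizable to read back.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once, report what each side saw, and return
   the child's exit code in the parent (-1 on error). */
int fork_demo(void) {
    fflush(NULL);                 /* empty stdio buffers so the child does not inherit pending output */
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        /* Child: fork returned 0. */
        fprintf(stderr, "child:  fork returned 0, my pid is %d\n", (int)getpid());
        _exit(7);                 /* arbitrary exit code for the demo */
    }

    /* Parent: fork returned the child's pid. */
    fprintf(stderr, "parent: fork returned %d, my pid is %d\n", (int)pid, (int)getpid());

    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return -1; }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_demo() from main prints one line from each process, in whichever order the scheduler picks, and returns 7 in the parent.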

The reason UNIX separates fork from exec is flexibility. Between the fork and the exec, the child is still running your code, and you can set up file descriptors, change directory, change credentials, or ignore signals -- so the new program starts in exactly the environment you want. This is how shells implement redirection (Concept 6).
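
As one illustration of that window between fork and exec, the sketch below changes the working directory in the child only, then execs. run_in_dir is a hypothetical helper, and 126 is just an arbitrary nonzero code picked here to distinguish a setup failure from an exec failure.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with `dir` as its working directory.
   Returns the child's exit code, or -1 on fork/wait errors. */
int run_in_dir(const char *dir, char *const argv[]) {
    fflush(NULL);
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        /* Still our code, but running in the child: the parent's cwd is untouched. */
        if (chdir(dir) < 0) { perror("chdir"); _exit(126); }
        execvp(argv[0], argv);
        perror("execvp");         /* only reached if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return -1; }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_in_dir("/", argv) with argv set to {"pwd", NULL} should make the child print / while the parent stays in its original directory.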

Why It Matters Here

Every multi-process program in UNIX -- shells, servers, containers, build systems -- is built from this triad. If fork / exec / wait is not automatic to you, shells and pipelines (Cluster 2) and most of the debugging in Cluster 5 will not make sense.

Concrete Example

A minimal program that runs ls -l (found via PATH) in a child and waits for it:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); exit(1); }

    if (pid == 0) {
        /* Child: replace this process image with ls. */
        char *argv[] = {"ls", "-l", NULL};
        execvp("ls", argv);
        perror("execvp"); /* only reached if exec failed */
        _exit(127);
    }

    /* Parent: block until the child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); exit(1); }

    if (WIFEXITED(status))
        printf("child exited with %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("child killed by signal %d\n", WTERMSIG(status));
    return 0;
}

Key things to notice. The child uses _exit(127) after perror, not exit, because exit would run atexit handlers and flush stdio buffers that the child copied from the parent at fork time, so any output still buffered in the parent could be written twice. The parent inspects the child's status with the WIFEXITED/WEXITSTATUS macros rather than comparing status directly, because the bit layout is not portable.

Common Confusion / Misconception

"fork returns one value, which tells me whether I am the parent or the child." Technically yes, but the confusion is deeper: many beginners write code assuming the parent runs first, or assuming the child finishes before the parent's next line. Neither is guaranteed. After fork, both processes are ready to run and the scheduler chooses. If you rely on a particular order, you have a race.

Another trap: "If I exec a program, my open files are lost." Not by default -- open file descriptors are inherited by the new program unless you opened them with O_CLOEXEC or called fcntl(fd, F_SETFD, FD_CLOEXEC). This is exactly why shells can do cmd > file.txt -- the shell opens file.txt, dups it onto fd 1, and then execs cmd with fd 1 already pointing at the file.
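
The cmd > file.txt mechanism described above can be sketched as follows. run_redirected is a hypothetical helper, the open and dup2 happen in the forked child (so the parent's stdout is untouched), and the file mode 0644 and exit code 126 are assumptions made for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with stdout redirected to `path`,
   roughly what a shell does for `cmd > path`.
   Returns the child's exit code, or -1 on fork/wait errors. */
int run_redirected(char *const argv[], const char *path) {
    fflush(NULL);
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); _exit(126); }
        if (dup2(fd, STDOUT_FILENO) < 0) { perror("dup2"); _exit(126); }
        if (fd != STDOUT_FILENO) close(fd);   /* fd 1 now refers to the file */
        execvp(argv[0], argv);                /* fd 1 survives the exec */
        perror("execvp");
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return -1; }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The exec'd program needs no knowledge of the redirection; it simply writes to fd 1 as usual. Had the file been opened with O_CLOEXEC and not dup'd, it would have been closed at exec instead.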

A third trap: forgetting to wait. A child that has exited but has not been reaped is a zombie -- it still takes an entry in the kernel's process table. On a long-running parent, leaking zombies eventually exhausts the process table.

How To Use It

When designing a program that runs a subprocess, apply this sequence:

  1. Decide what the child should look like before exec (FDs open, cwd, signal handlers, environment).
  2. fork.
  3. In the child: set up that state. If anything fails, _exit with a nonzero code.
  4. execvp (or a sibling) with the full argv. Check for failure.
  5. In the parent: waitpid the child you just forked. Use a loop (while waitpid(...) < 0 && errno == EINTR) to survive signals.
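
The five steps above can be sketched as a single function. run_and_wait is a hypothetical helper; returning 128 plus the signal number for a signal-killed child follows a common shell convention and is an assumption of this example, not something waitpid does for you.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with argv and wait for it.
   Returns the exit code, 128+signal if killed by a signal, -1 on error. */
int run_and_wait(char *const argv[]) {
    fflush(NULL);                       /* keep buffered output out of the child */
    pid_t pid = fork();                 /* step 2 */
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        /* Step 3: child-side setup (fds, cwd, signals) would go here;
           on failure, _exit with a nonzero code. */
        execvp(argv[0], argv);          /* step 4 */
        perror("execvp");               /* only reached if exec failed */
        _exit(127);
    }

    /* Step 5: wait for exactly the child we forked, retrying on EINTR. */
    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r < 0) { perror("waitpid"); return -1; }

    if (WIFEXITED(status)) return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
    return -1;
}
```

Step 1 does not appear in the code because it is a design decision: whatever state you decided on is what goes into the marked spot in the child.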

Check Yourself

  1. Why does fork return twice? In which process is the return value zero?
  2. Why must the child use _exit rather than exit after a failed exec?
  3. What is a zombie, and what creates one?

Mini Drill or Application

Extend the example above. Do all four:

  1. Add a fprintf(stderr, "parent pid=%d child pid=%d\n", ...) before waitpid and run it a few times. Observe the ordering of the parent's message and the child's ls output.
  2. Remove the waitpid call and add sleep(30) at the end of the parent. In another terminal, run ps -el | grep ls and find the zombie. It will appear with state Z (on Linux, ps also marks it <defunct>).
  3. Add O_CLOEXEC to an open you make before the fork, and verify (by passing the fd number as an argument to the program you exec) that the exec'd program cannot read it. The descriptor is closed at exec, not at fork.
  4. State, in one sentence, why UNIX separates fork and exec instead of exposing only a single spawn primitive.

Read This Only If Stuck