Threads vs Processes: Shared vs Isolated State

What This Concept Is

A process has its own address space, open-file table, signal state, credentials, and PID. A thread is a schedulable execution context within a process. The table below shows which state the threads of one process share, and which state separate processes share:

| State | Shared across threads? | Shared across processes? |
| --- | --- | --- |
| Address space (code, heap, globals) | Yes | No |
| Stack | No (each thread has its own) | No |
| File descriptor table | Yes | No (but copied on fork) |
| Signal handlers | Yes | No |
| Signal mask | No (per-thread) | No |
| Current directory, umask | Yes | No |
| PID | Same | Different |
| TID (thread id) | Different | Different |
| Registers (PC, SP) | Different | Different |
| Credentials, effective UID | Usually shared | No |
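The address-space and PID/TID rows can be checked directly. The following is a minimal sketch, assuming a POSIX system and Python 3.8 or later (for threading.get_native_id); the names counter, seen, and bump are made up for the illustration.

```python
import os
import threading

counter = 0   # a global, so it lives in this process's address space
seen = {}     # (pid, tid) pairs recorded by whoever runs bump()

def bump(label):
    global counter
    counter += 1
    seen[label] = (os.getpid(), threading.get_native_id())

# Thread: same address space and PID as the main thread, different TID.
t = threading.Thread(target=bump, args=("thread",))
t.start()
t.join()
after_thread = counter   # the thread's write is visible here

# Process: fork gives the child a private copy of the address space.
pid = os.fork()
if pid == 0:
    bump("child")        # modifies the child's copy only
    os._exit(0)
os.waitpid(pid, 0)
after_fork = counter     # the parent's copy is untouched

main_ids = (os.getpid(), threading.get_native_id())
print("counter after thread:", after_thread, "| after fork:", after_fork)
print("main (pid, tid):", main_ids, "| thread (pid, tid):", seen["thread"])
```

The thread's increment shows up in the parent, while the forked child's increment (and its entry in seen) never does.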

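On Linux, clone() is the underlying primitive. Its flags (CLONE_VM, CLONE_FILES, CLONE_SIGHAND, CLONE_THREAD, among others) select what the new task shares with its creator. pthread_create passes the flags for "thread-like" sharing; fork passes none of them, so the child gets its own copy of everything, with memory copied lazily via copy-on-write.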
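The effect of sharing versus copying the file descriptor table (what CLONE_FILES controls) can be observed without calling clone() directly. This is a sketch, assuming a POSIX system where Python threads are ordinary pthreads; is_open is a hypothetical helper.

```python
import os
import threading

def is_open(fd):
    """Return True if fd is a valid descriptor in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False

r, w = os.pipe()

# fork: the child gets a copy of the descriptor table,
# so closing w in the child leaves the parent's w alone.
pid = os.fork()
if pid == 0:
    os.close(w)
    os._exit(0)
os.waitpid(pid, 0)
open_after_child_close = is_open(w)

# thread: the descriptor table is shared,
# so closing w in the thread closes it for the whole process.
t = threading.Thread(target=os.close, args=(w,))
t.start()
t.join()
open_after_thread_close = is_open(w)

os.close(r)
print(open_after_child_close, open_after_thread_close)
```

The first check reports the descriptor as still open and the second reports it closed, matching the "File descriptor table" row above.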

Why It Matters Here

"Thread" and "process" are not genres of thing; they are points on a spectrum of "how much state do you share." The decision is an engineering tradeoff:

  • Sharing buys cheap communication (just read a shared variable) and cheap switching (on x86, no CR3 reload and no TLB flush when switching between threads of the same process).
  • Isolation buys crash containment (a segfault in one process doesn't take down the others) and security (a compromised process cannot directly read or write another process's memory).

The choice affects scheduling cost (Concept 11), synchronization strategy (Module 3), and failure modes (observability).

Concrete Example

Same problem, two shapes:

A. Multi-process web server (prefork, e.g., Apache mpm_prefork).

  • Master forks N worker processes.
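  • Workers share read-only code pages. Heap pages start out as copy-on-write copies of the master's and become private as soon as a worker writes to them, so workers do not see each other's heap.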
  • A worker segfault kills itself, master respawns it; other workers keep serving.
  • Communication between workers: shared memory segments or a pipe.
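The crash-and-respawn behaviour of shape A can be sketched in a few lines. This is an illustration under stated assumptions, not how Apache implements it: worker and spawn are hypothetical names, the "request loop" is empty, and the crash is simulated by the worker sending itself SIGSEGV.

```python
import os
import resource
import signal

def worker(crash):
    """Stand-in for a request loop; optionally dies with SIGSEGV."""
    if crash:
        resource.setrlimit(resource.RLIMIT_CORE, (0, 0))  # no core file
        signal.signal(signal.SIGSEGV, signal.SIG_DFL)     # default action: die
        os.kill(os.getpid(), signal.SIGSEGV)
    os._exit(0)

def spawn(crash=False):
    pid = os.fork()
    if pid == 0:
        worker(crash)   # never returns in the child
    return pid

# Master: start three workers, the first of which will crash.
pending = [spawn(crash=(i == 0)) for i in range(3)]
crashed = clean = 0
while pending:
    pid = pending.pop(0)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        crashed += 1
        pending.append(spawn())   # respawn a healthy replacement
    else:
        clean += 1

print("crashed:", crashed, "clean exits:", clean)
```

The master keeps running through the worker's segfault and only learns about it from the exit status, which is the containment property the bullet describes.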

B. Multi-threaded web server (e.g., the threads inside each Apache mpm_worker child process, or a Java app server).

  • One process, N threads, one address space, one connection table.
  • Segfault in one thread crashes the whole process.
  • Communication between threads: shared data structures under a lock.
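Shape B's "shared data structures under a lock" can be sketched as follows. The hits dictionary and serve function are hypothetical stand-ins for a shared connection table and a request handler.

```python
import threading

hits = {"total": 0}        # one structure, visible to every thread
lock = threading.Lock()

def serve(n):
    """Pretend to handle n requests, recording each in the shared table."""
    for _ in range(n):
        with lock:         # the update is a read-modify-write, so guard it
            hits["total"] += 1

threads = [threading.Thread(target=serve, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("requests recorded:", hits["total"])
```

No pipes or shared-memory segments are needed because every thread already sees hits; the price is that every access has to go through the lock.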

Tradeoff summary:

| Question | Multi-process | Multi-threaded |
| --- | --- | --- |
| Context switch cost | Higher (address space change) | Lower |
| Crash blast radius | Contained | Whole process |
| Data sharing | Explicit (shm, pipes) | Implicit (just read) |
| Security boundary | Yes | No |
| Debuggability | Easier (isolated) | Harder (race conditions) |

Common Confusion / Misconception

"Threads are just lightweight processes, use them when you need speed." That framing understates the safety cost of threads and oversells their speed. If the units of work do not share state, processes can be faster in practice, because they need no locking and cannot suffer false sharing of cache lines.

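"fork is always slow." With copy-on-write, fork copies page tables rather than page contents, so it is usually cheap as long as the child promptly execs or touches only a small working set. The cost grows with the size of the parent's address space, so forking a very large process can still be noticeably slow.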

How To Use It

When choosing, ask the questions in this order:

  1. Do the units of work share significant mutable state? If no -> prefer processes.
  2. Is fault isolation important? If yes -> prefer processes.
  3. Is the switch rate high and cache-sensitive? If yes -> threads can help.
  4. Is portability across non-POSIX platforms needed? If yes -> threads (via a threading runtime) are more portable; Windows, for example, has no fork.

Async (epoll, futures) is a third option for many I/O-bound workloads and should be considered before either.

Check Yourself

  1. Two threads in the same process each have their own stack. Why must they?
  2. Why does a signal sent with kill target a process, while pthread_kill targets a thread?
  3. What does CLONE_VM without CLONE_FILES produce?

Mini Drill or Application

  1. Write a program that creates 10 threads, each running a counter loop, and sums their results.
  2. Rewrite it as 10 processes using fork and pipes.
  3. Measure total wall time and max RSS for each.
  4. Write one paragraph explaining the ranking and what would change on a 128-core machine.
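As a starting point for steps 2 and 3, here is one possible shape for the process version with measurement attached. It is a sketch, assuming a POSIX system; ITERATIONS, count, and run_processes are made-up names, and the units of ru_maxrss differ by platform (kilobytes on Linux, bytes on macOS).

```python
import os
import resource
import struct
import time

N_WORKERS = 10
ITERATIONS = 100_000   # assumed loop length; pick your own

def count(n):
    total = 0
    for _ in range(n):
        total += 1
    return total

def run_processes():
    readers = []
    for _ in range(N_WORKERS):
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:
            os.close(r)
            # Send the result back as one 8-byte integer.
            os.write(w, struct.pack("q", count(ITERATIONS)))
            os._exit(0)
        os.close(w)                 # parent keeps only the read end
        readers.append((pid, r))
    total = 0
    for pid, r in readers:
        total += struct.unpack("q", os.read(r, 8))[0]
        os.close(r)
        os.waitpid(pid, 0)
    return total

start = time.perf_counter()
total = run_processes()
wall = time.perf_counter() - start
# Peak RSS of the largest child that has been waited for.
child_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss

print("total:", total, "wall seconds:", round(wall, 4), "max child RSS:", child_rss)
```

For the thread version, resource.RUSAGE_SELF is the comparable measurement, since all threads live in the one process.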

Read This Only If Stuck