POSIX Threads: pthread_create, pthread_join

What This Concept Is

A thread is an independently scheduled flow of execution inside the same process. All threads in a process share:

  • the code segment
  • the heap (anything reachable from malloc)
  • open file descriptors and other per-process kernel state
  • global and static variables

Each thread has its own:

  • stack (with local variables)
  • CPU register state
  • thread-local storage (__thread / thread_local)

Two POSIX calls define the basic lifecycle:

int pthread_create(pthread_t *tid, const pthread_attr_t *attr,
                   void *(*start)(void *), void *arg);
int pthread_join(pthread_t tid, void **retval);

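pthread_create spawns a new thread that starts executing start(arg). pthread_join blocks until that thread finishes and, if retval is not NULL, stores the value the thread returned from start (or passed to pthread_exit) in *retval. Both calls return 0 on success and an error number on failure; they do not set errno. A thread that is never joined (and is not detached) leaks its resources, similar to a zombie process.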

Compile and link with -pthread, which enables thread support in the compiler and links the threading runtime.

Why It Matters Here

Processes are isolated; threads are not. That is the single most important thing to remember about them. Two threads sharing a pointer is trivial -- both just point at the same malloc'd buffer. Two processes sharing a pointer requires shared memory (Concept 8) and careful layout.

Threads give you genuine parallelism on multi-core machines and cheap asynchronous execution on one core (e.g., one thread per connection). The cost is: you must reason about concurrent access to shared state on every line where it happens. Concepts 11 and 12 supply the tools.

Concrete Example

Summing an array in two threads:

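#include <pthread.h>
#include <stdio.h>

/* One worker's slice of the array, plus a slot for its partial sum. */
typedef struct { int *a; size_t lo, hi; long sum; } Range;

static void *worker(void *p) {
    Range *r = p;
    long s = 0;
    for (size_t i = r->lo; i < r->hi; i++) s += r->a[i];
    r->sum = s;          /* each worker writes only its own Range */
    return NULL;
}

int main(void) {
    enum { N = 1000000 };
    static int data[N];  /* static: keeps the large array off main's stack */
    for (int i = 0; i < N; i++) data[i] = i;

    Range r1 = { data, 0, N / 2, 0 };
    Range r2 = { data, N / 2, N, 0 };

    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, &r1);
    pthread_create(&t2, NULL, worker, &r2);

    /* r1 and r2 are locals of main; that is safe here because main
       joins both workers before it returns. */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* The total exceeds 32 bits, so this assumes a 64-bit long. */
    printf("sum = %ld\n", r1.sum + r2.sum);
}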

This program has no race. Each thread writes only its own Range.sum, and the main thread reads them only after join (which establishes a happens-before ordering). The shared array data is read-only during the workers' lifetime.

Now picture the naive version where both threads update a single shared long total directly. That would race -- each total += s compiles to a load/add/store sequence, and the two threads' sequences can interleave so that one update is lost. Concepts 11 and 12 fix that properly.

Common Confusion / Misconception

"Creating a thread is like calling a function." No. pthread_create returns immediately, possibly before start has executed at all. The new thread runs independently. If main returns before the thread finishes, the thread is killed: returning from main is like calling exit, which terminates every thread in the process, detached or not. Join the thread first, or end the main thread with pthread_exit, which lets the remaining threads run to completion.

"Sharing state between threads is just writing to the same variable." Yes -- and exactly that casualness is the bug. Any variable that two or more threads touch, where at least one touch is a write, must be protected by a mutex, an atomic operation, or a happens-before boundary like pthread_join or a condition variable.

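Another trap: passing a pointer to a local variable as the thread argument, then returning from the enclosing function. The local's storage is gone once the function returns, while the thread may still be reading from it. Allocate thread arguments on the heap or in a structure that outlives the thread. (The array-sum example passes &r1 and &r2, which are locals, but that is safe only because main joins both workers before it returns.)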

How To Use It

A disciplined threading design:

  1. Identify shared state. Label each variable: private to one thread, read-only after init, shared and mutable.
  2. For each shared and mutable variable, pick a synchronization strategy (mutex, atomic, channel).
  3. Own the lifecycle: every thread created is either joined or explicitly detached.
  4. Pass arguments by value, by stable pointer (heap or long-lived struct), or by index into a read-only array. Never by pointer-to-local.
  5. Error-check pthread_create -- it can fail under resource limits.

Check Yourself

  1. What do two threads in the same process share, and what do they each have their own copy of?
  2. Why is passing &local from the parent function often a bug?
  3. What happens to a running thread if main returns without joining it?

Mini Drill or Application

Do all four:

  1. Compile and run the two-thread sum above with -pthread. Measure speedup over a single-thread version on a large array.
  2. Modify the workers to both increment a single shared long total without a mutex. Run 20 times. Record the distribution of wrong answers.
  3. Write a program that spawns 10 worker threads, each sleeping a random amount and returning its own identifier (for example, a worker index you pass in as the argument). Collect all return values via pthread_join.
  4. Explain, in one sentence, why pthread_join establishes a happens-before relationship that makes reading the worker's output safe.

Read This Only If Stuck