Chapter 27: Thread Creation

This page is a generated reference surface for selective reading. It exists to keep the learner apps guide-first while still preserving source access.

Learning objectives

Explain the main ideas and vocabulary in Thread Creation.
Work through the source examples for Thread Creation without depending on raw chunk order.
Use Thread Creation as selective reference when learner modules point back to Ostep.

Prerequisites

Earlier prerequisite concepts leading into Chapter 27: Thread Creation.

Module targets

module-03-concurrency-synchronization

AI companion modes

Explain simply
Socratic tutor
Quiz me
Challenge my understanding
Diagnose my confusion
Generate extra practice
Revision mode
Connect forward / backward

Source-of-truth note

This unit is anchored to Ostep and the source chapter "Chapter 27: Thread Creation". Use external resources only to clarify, extend, or modernize details without replacing the chapter's conceptual spine.

External enrichment

No chapter-specific enrichment resources are curated yet. Add them in the unit manifest when a source clearly improves learning.

Source provenance

Primary source: Ostep
Source chapter 27: Chapter 27: Thread Creation
Raw source file: 124-27-1-thread-creation.md
Raw source file: 125-27-2-thread-completion.md
Raw source file: 126-27-3-locks.md
Raw source file: 127-27-4-condition-variables.md

Merged source

Thread Creation

27 Interlude: Thread API

This chapter briefly covers the main portions of the thread API. Each part will be explained further in the subsequent chapters, as we show how to use the API. More details can be found in various books and online sources [B89, B97, B+96, K+96]. We should note that the subsequent chapters introduce the concepts of locks and condition variables more slowly, with many examples; this chapter is thus better used as a reference.

CRUX: HOW TO CREATE AND CONTROL THREADS

What interfaces should the OS present for thread creation and control? How should these interfaces be designed to enable ease of use as well as utility?

27.1 Thread Creation

The first thing you have to be able to do to write a multi-threaded program is to create new threads, and thus some kind of thread creation interface must exist. In POSIX, it is easy:

#include <pthread.h>

int pthread_create(pthread_t *thread,
                   const pthread_attr_t *attr,
                   void *(*start_routine)(void *),
                   void *arg);

This declaration might look a little complex (particularly if you haven't used function pointers in C), but actually it's not too bad. There are four arguments: thread, attr, start_routine, and arg. The first, thread, is a pointer to a structure of type pthread_t; we'll use this structure to interact with this thread, and thus we need to pass it to pthread_create() in order to initialize it.

The second argument, attr, is used to specify any attributes this thread might have. Some examples include setting the stack size or perhaps information about the scheduling priority of the thread. An attribute is initialized with a separate call to pthread_attr_init(); see the manual page for details. However, in most cases, the defaults will be fine; in this case, we will simply pass the value NULL in.

The third argument is the most complex, but is really just asking: which function should this thread start running in? In C, we call this a function pointer, and this one tells us the following is expected: a function name (start_routine), which is passed a single argument of type void * (as indicated in the parentheses after start_routine), and which returns a value of type void * (i.e., a void pointer).

If this routine instead required an integer argument, instead of a void pointer, the declaration would look like this:

int pthread_create(..., // first two args are the same
void * (*start_routine)(int),
int arg);

If instead the routine took a void pointer as an argument, but returned an integer, it would look like this:

int pthread_create(..., // first two args are the same
int (*start_routine)(void *),
void * arg);

Finally, the fourth argument, arg, is exactly the argument to be passed to the function where the thread begins execution. You might ask: why do we need these void pointers? Well, the answer is quite simple: having a void pointer as an argument to the function start_routine allows us to pass in any type of argument; having it as a return value allows the thread to return any type of result.

Let's look at an example in Figure 27.1. Here we just create a thread that is passed two arguments, packaged into a single type we define ourselves (myarg_t). The thread, once created, can simply cast its argument to the type it expects and thus unpack the arguments as desired.

And there it is! Once you create a thread, you really have another live executing entity, complete with its own call stack, running within the same address space as all the currently existing threads in the program.

The fun thus begins!

#include <stdio.h>
#include <pthread.h>

typedef struct __myarg_t {
    int a;
    int b;
} myarg_t;

void *mythread(void *arg) {
    myarg_t *m = (myarg_t *) arg;   // unpack the argument structure
    printf("%d %d\n", m->a, m->b);
    return NULL;
}

int main(int argc, char *argv[]) {
    pthread_t p;
    int rc;

    myarg_t args;
    args.a = 10;
    args.b = 20;
    rc = pthread_create(&p, NULL, mythread, &args);
    ...
}

Figure 27.1: Creating a Thread

Thread Completion

27.2 Thread Completion

The example above shows how to create a thread. However, what happens if you want to wait for a thread to complete? You need to do something special in order to wait for completion; in particular, you must call the routine pthread_join().

int pthread_join(pthread_t thread, void **value_ptr);

This routine takes two arguments. The first is of type pthread_t, and is used to specify which thread to wait for. This variable is initialized by the thread creation routine (when you pass a pointer to it as an argument to pthread_create()); if you keep it around, you can use it to wait for that thread to terminate.

The second argument is a pointer to the return value you expect to get back. Because the routine can return anything, it is defined to return a pointer to void; because the pthread_join() routine changes the value of the passed-in argument, you need to pass in a pointer to that value, not just the value itself.

Let's look at another example (Figure 27.2). In the code, a single thread is again created, and passed a couple of arguments via the myarg_t structure. To return values, the myret_t type is used. Once the thread is finished running, the main thread, which has been waiting inside of the pthread_join() routine (footnote 1), then returns, and we can access the values returned from the thread, namely whatever is in myret_t.

#include <stdio.h>
#include <pthread.h>
#include <assert.h>
#include <stdlib.h>

// Malloc(), Pthread_create(), and Pthread_join() are the checking
// wrappers described in footnote 1; they are defined elsewhere.

typedef struct __myarg_t {
    int a;
    int b;
} myarg_t;

typedef struct __myret_t {
    int x;
    int y;
} myret_t;

void *mythread(void *arg) {
    myarg_t *m = (myarg_t *) arg;
    printf("%d %d\n", m->a, m->b);
    myret_t *r = Malloc(sizeof(myret_t));   // heap storage outlives the thread
    r->x = 1;
    r->y = 2;
    return (void *) r;
}

int main(int argc, char *argv[]) {
    int rc;
    pthread_t p;
    myret_t *m;

    myarg_t args;
    args.a = 10;
    args.b = 20;
    Pthread_create(&p, NULL, mythread, &args);
    Pthread_join(p, (void **) &m);
    printf("returned %d %d\n", m->x, m->y);
    return 0;
}

Figure 27.2: Waiting for Thread Completion

Footnote 1: Note that we use wrapper functions here; specifically, we call Malloc(), Pthread_join(), and Pthread_create(), which just call their similarly-named lower-case versions and make sure the routines did not return anything unexpected.

A few things to note about this example. First, oftentimes we don't have to do all of this painful packing and unpacking of arguments. For example, if we just create a thread with no arguments, we can pass NULL in as an argument when the thread is created. Similarly, we can pass NULL into pthread_join() if we don't care about the return value.

Second, if we are just passing in a single value (e.g., an int), we don't have to package it up as an argument. Figure 27.3 shows an example. In this case, life is a bit simpler, as we don't have to package arguments and return values inside of structures.

Third, we should note that one has to be extremely careful with how values are returned from a thread. In particular, never return a pointer which refers to something allocated on the thread's call stack. If you do, what do you think will happen? (think about it!) Here is an example of a dangerous piece of code, modified from the example in Figure 27.2.

void *mythread(void *arg) {
    myarg_t *m = (myarg_t *) arg;
    printf("%d %d\n", m->a, m->b);
    myret_t r; // ALLOCATED ON STACK: BAD!
    r.x = 1;
    r.y = 2;
    return (void *) &r;
}

#include <stdint.h>   // intptr_t: an integer type wide enough to hold a pointer

void *mythread(void *arg) {
    intptr_t value = (intptr_t) arg;       // the pointer itself carries the number
    printf("%ld\n", (long) value);
    return (void *) (value + 1);
}

int main(int argc, char *argv[]) {
    pthread_t p;
    void *ret;                             // join stores a full pointer here
    Pthread_create(&p, NULL, mythread, (void *) 100);
    Pthread_join(p, &ret);
    printf("returned %ld\n", (long) (intptr_t) ret);
    return 0;
}

Figure 27.3: Simpler Argument Passing to a Thread

In the dangerous version, the variable r is allocated on the stack of mythread. However, when mythread returns, r is automatically deallocated (that's why the stack is so easy to use, after all!), and thus, passing back a pointer to a now-deallocated variable will lead to all sorts of bad results. Certainly, when you print out the values you think you returned, you'll probably (but not necessarily!) be surprised. Try it and find out for yourself! (See footnote 2.)

Finally, you might notice that the use of pthread_create() to create a thread, followed by an immediate call to pthread_join(), is a pretty strange way to create a thread. In fact, there is an easier way to accomplish this exact task; it's called a procedure call. Clearly, we'll usually be creating more than just one thread and waiting for it to complete, otherwise there is not much purpose to using threads at all.

We should note that not all code that is multi-threaded uses the join routine. For example, a multi-threaded web server might create a number of worker threads, and then use the main thread to accept requests and pass them to the workers, indefinitely. Such long-lived programs thus may not need to join. However, a parallel program that creates threads to execute a particular task (in parallel) will likely use join to make sure all such work completes before exiting or moving on to the next stage of computation.

Locks

27.3 Locks

Beyond thread creation and join, probably the next most useful set of functions provided by the POSIX threads library are those for providing mutual exclusion to a critical section via locks. The most basic pair of routines to use for this purpose is the following:

int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);

Footnote 2: Fortunately, the compiler gcc will likely complain when you write code that returns a pointer to a stack-allocated variable, which is yet another reason to pay attention to compiler warnings.

The routines should be easy to understand and use. When you have a region of code you realize is a critical section, and thus needs to be protected by locks in order to operate as desired, you surround it with calls to these two routines. You can probably imagine what the code looks like:

pthread_mutex_t lock;

pthread_mutex_lock(&lock);
x = x + 1; // or whatever your critical section is
pthread_mutex_unlock(&lock);

The intent of the code is as follows: if no other thread holds the lock when pthread_mutex_lock() is called, the thread will acquire the lock and enter the critical section. If another thread does indeed hold the lock, the thread trying to grab the lock will not return from the call until it has acquired the lock (implying that the thread holding the lock has released it via the unlock call). Of course, many threads may be stuck waiting inside the lock acquisition function at a given time; only the thread with the lock acquired, however, should call unlock.

Unfortunately, this code is broken, in two important ways. The first problem is a lack of proper initialization. All locks must be properly initialized in order to guarantee that they have the correct values to begin with and thus work as desired when lock and unlock are called.

With POSIX threads, there are two ways to initialize locks. One way to do this is to use PTHREAD_MUTEX_INITIALIZER, as follows:

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

Doing so sets the lock to the default values and thus makes the lock usable. The dynamic way to do it (i.e., at run time) is to make a call to pthread_mutex_init(), as follows:

int rc = pthread_mutex_init(&lock, NULL);
assert(rc == 0); // always check success!

The first argument to this routine is the address of the lock itself, whereas the second is an optional set of attributes. Read more about the attributes yourself; passing NULL in simply uses the defaults. Either way works, but we usually use the dynamic (latter) method. Note that a corresponding call to pthread_mutex_destroy() should also be made when you are done with the lock; see the manual page for all of the details.

The second problem with the code above is that it fails to check error codes when calling lock and unlock. Just like virtually any library routine you call in a UNIX system, these routines can also fail! If your code doesn't properly check error codes, the failure will happen silently, which in this case could allow multiple threads into a critical section. Minimally, use wrappers, which assert that the routine succeeded (e.g., as in Figure 27.4); more sophisticated (non-toy) programs, which can't simply exit when something goes wrong, should check for failure and do something appropriate when the lock or unlock does not succeed.

// Use this to keep your code clean but check for failures
// Only use if exiting program is OK upon failure
void Pthread_mutex_lock(pthread_mutex_t *mutex) {
    int rc = pthread_mutex_lock(mutex);
    assert(rc == 0);
}

Figure 27.4: An Example Wrapper

The lock and unlock routines are not the only routines within the pthreads library to interact with locks. In particular, here are two more routines which may be of interest:

int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_timedlock(pthread_mutex_t *mutex,
                            const struct timespec *abs_timeout);

These two calls are used in lock acquisition. The trylock version returns failure if the lock is already held; the timedlock version of acquiring a lock returns after a timeout or after acquiring the lock, whichever happens first. Thus, the timedlock with a timeout of zero degenerates to the trylock case. Both of these versions should generally be avoided; however, there are a few cases where avoiding getting stuck (perhaps indefinitely) in a lock acquisition routine can be useful, as we'll see in future chapters (e.g., when we study deadlock).

Condition Variables

27.4 Condition Variables

The other major component of any threads library, and certainly the case with POSIX threads, is the presence of a condition variable. Condition variables are useful when some kind of signaling must take place between threads, if one thread is waiting for another to do something before it can continue. Two primary routines are used by programs wishing to interact in this way:

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_signal(pthread_cond_t *cond);

To use a condition variable, one must also have a lock that is associated with this condition. When calling either of the above routines, this lock should be held.

The first routine, pthread_cond_wait(), puts the calling thread to sleep, and thus waits for some other thread to signal it, usually when something in the program has changed that the now-sleeping thread might care about. A typical usage looks like this:

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

Pthread_mutex_lock(&lock);
while (ready == 0)
    Pthread_cond_wait(&cond, &lock);
Pthread_mutex_unlock(&lock);

In this code, after initialization of the relevant lock and condition (footnote 3), a thread checks to see if the variable ready has yet been set to something other than zero. If not, the thread simply calls the wait routine in order to sleep until some other thread wakes it.

The code to wake a thread, which would run in some other thread, looks like this:

Pthread_mutex_lock(&lock);
ready = 1;
Pthread_cond_signal(&cond);
Pthread_mutex_unlock(&lock);

A few things to note about this code sequence. First, when signaling (as well as when modifying the global variable ready), we always make sure to have the lock held. This ensures that we don't accidentally introduce a race condition into our code.

Second, you might notice that the wait call takes a lock as its second parameter, whereas the signal call only takes a condition. The reason for this difference is that the wait call, in addition to putting the calling thread to sleep, releases the lock when putting said caller to sleep. Imagine if it did not: how could the other thread acquire the lock and signal it to wake up? However, before returning after being woken, pthread_cond_wait() re-acquires the lock, thus ensuring that any time the waiting thread is running between the lock acquire at the beginning of the wait sequence and the lock release at the end, it holds the lock.

One last oddity: the waiting thread re-checks the condition in a while loop, instead of a simple if statement. We'll discuss this issue in detail when we study condition variables in a future chapter, but in general, using a while loop is the simple and safe thing to do. Although it rechecks the condition (perhaps adding a little overhead), there are some pthread implementations that could spuriously wake up a waiting thread; in such a case, without rechecking, the waiting thread will continue thinking that the condition has changed even though it has not. It is safer thus to view waking up as a hint that something might have changed, rather than an absolute fact.

Note that sometimes it is tempting to use a simple flag to signal between two threads, instead of a condition variable and associated lock. For example, we could rewrite the waiting code above to look more like this:

while (ready == 0)
    ; // spin

The associated signaling code would look like this:

ready = 1;

Don't ever do this, for the following reasons. First, it performs poorly in many cases (spinning for a long time just wastes CPU cycles). Second, it is error prone. As recent research shows [X+10], it is surprisingly easy to make mistakes when using flags (as above) to synchronize between threads; in that study, roughly half the uses of these ad hoc synchronizations were buggy! Don't be lazy; use condition variables even when you think you can get away without doing so.

If condition variables sound confusing, don't worry too much (yet); we'll be covering them in great detail in a subsequent chapter. Until then, it should suffice to know that they exist and to have some idea how and why they are used.

Footnote 3: Note that one could use pthread_cond_init() (and the corresponding pthread_cond_destroy() call) instead of the static initializer PTHREAD_COND_INITIALIZER. Sound like more work? It is.

27.5 Compiling and Running

All of the code examples in this chapter are relatively easy to get up and running. To compile them, you must include the header pthread.h in your code. On the link line, you must also explicitly link with the pthreads library, by adding the -pthread flag.

For example, to compile a simple multi-threaded program, all you have to do is the following:

prompt> gcc -o main main.c -Wall -pthread

As long as main.c includes the pthreads header, you have now successfully compiled a concurrent program. Whether it works or not, as usual, is a different matter entirely.

27.6 Summary

We have introduced the basics of the pthread library, including thread creation, building mutual exclusion via locks, and signaling and waiting via condition variables. You don't need much else to write robust and efficient multi-threaded code, except patience and a great deal of care!

We now end the chapter with a set of tips that might be useful to you when you write multi-threaded code (see the aside on the following page for details). There are other aspects of the API that are interesting; if you want more information, type man -k pthread on a Linux system to see over one hundred APIs that make up the entire interface. However, the basics discussed herein should enable you to build sophisticated (and hopefully, correct and performant) multi-threaded programs. The hard part with threads is not the APIs, but rather the tricky logic of how you build concurrent programs. Read on to learn more.
