Chapter 31: Semaphores As Condition Variables

This page is a generated reference surface for selective reading. It exists to keep the learner apps guide-first while still preserving source access.

Learning objectives

Explain the main ideas and vocabulary in Semaphores As Condition Variables.
Work through the source examples for Semaphores As Condition Variables without depending on raw chunk order.
Use Semaphores As Condition Variables as selective reference when learner modules point back to Ostep.

Prerequisites

Earlier prerequisite concepts leading into Chapter 31: Semaphores As Condition Variables.

Module targets

module-03-concurrency-synchronization

AI companion modes

Explain simply
Socratic tutor
Quiz me
Challenge my understanding
Diagnose my confusion
Generate extra practice
Revision mode
Connect forward / backward

Source-of-truth note

This unit is anchored to Ostep and the source chapter "Chapter 31: Semaphores As Condition Variables". Use external resources only to clarify, extend, or modernize details without replacing the chapter's conceptual spine.

External enrichment

No chapter-specific enrichment resources are curated yet. Add them in the unit manifest when a source clearly improves learning.

Source provenance

Primary source: Ostep
Source chapter 31: Chapter 31: Semaphores As Condition Variables
Raw source file: 148-31-3-semaphores-as-condition-variables.md
Raw source file: 149-31-4-the-producer-consumer-bounded-buffer-problem.md
Raw source file: 150-31-5-reader-writer-locks.md
Raw source file: 151-31-6-the-dining-philosophers.md

Merged source

Semaphores As Condition Variables

31.3 Semaphores As Condition Variables

Semaphores are also useful when a thread wants to halt its progress waiting for a condition to become true. For example, a thread may wish to wait for a list to become non-empty, so it can delete an element from it.

In this pattern of usage, we often find a threadwaitingfor something to happen, and a different thread making that something happen and then signalingthat it has happened, thus waking the waiting thread. Because the waiting thread (or threads) is waiting for someconditionin the program to change, we are using the semaphore as acondition variable.

void *
child(void *arg) {
printf("child\n");
sem_post(&s); // signal here: child is done
return NULL;
}

9 10 int

main(int argc, char *argv[]) {

12 sem_init(&s, 0, X); // what should X be?

printf("parent: begin\n");
pthread_t c;
Pthread_create(c, NULL, child, NULL);
sem_wait(&s); // wait here for child
printf("parent: end\n");
return 0;
}

Figure 31.6:A Parent Waiting For Its Child

A simple example is as follows. Imagine a thread creates another thread and then wants to wait for it to complete its execution (Figure 31.6). When this program runs, we would like to see the following:

parent: begin child parent: end

The question, then, is how to use a semaphore to achieve this effect; as it turns out, the answer is relatively easy to understand. As you can see in the code, the parent simply callssemwait()and the childsempost() to wait for the condition of the child finishing its execution to become true. However, this raises the question: what should the initial value of this semaphore be?

(Again, think about it here, instead of reading ahead)

The answer, of course, is that the value of the semaphore should be set to is 0. There are two cases to consider. First, let us assume that the parent creates the child but the child has not run yet (i.e., it is sitting in a ready queue but not running). In this case (Figure 31.7, page 6), the parent will call semwait()before the child has calledsempost(); we'd like the parent to wait for the child to run. The only way this will happen is if the value of the semaphore is not greater than 0; hence, 0 is the initial value.

The parent runs, decrements the semaphore (to -1), then waits (sleeping).

When the child finally runs, it will callsempost(), increment the value of the semaphore to 0, and wake the parent, which will then return from semwait()and finish the program.

The second case (Figure 31.8, page 6) occurs when the child runs to completion before the parent gets a chance to callsemwait(). In this case, the child will first callsempost(), thus incrementing the value of the semaphore from 0 to 1. When the parent then gets a chance to run, it will callsemwait()and find the value of the semaphore to be 1; the parent will thus decrement the value (to 0) and return fromsemwait() without waiting, also achieving the desired effect.

Value Parent State Child State 0 create(Child) Running (Child exists; is runnable) Ready 0 callsemwait() Running Ready

-1 decrement sem Running Ready

-1 (sem<0)→sleep Sleeping Ready

-1 Switch→Child Sleeping child runs Running

-1 Sleeping callsempost() Running 0 Sleeping increment sem Running 0 Ready wake(Parent) Running 0 Ready sempost()returns Running 0 Ready Interrupt; Switch→Parent Ready 0 semwait()returns Running Ready

Figure 31.7:Thread Trace: Parent Waiting For Child (Case 1)

Value Parent State Child State 0 create(Child) Running (Child exists; is runnable) Ready 0 Interrupt; Switch→Child Ready child runs Running 0 Ready callsempost() Running 1 Ready increment sem Running 1 Ready wake(nobody) Running 1 Ready sempost()returns Running 1 parent runs Running Interrupt; Switch→Parent Ready 1 callsemwait() Running Ready 0 decrement sem Running Ready

0 (sem>=0)→awake Running Ready

0 semwait()returns Running Ready

Figure 31.8:Thread Trace: Parent Waiting For Child (Case 2)

The Producer Consumer Bounded Buffer Problem

31.4 The Producer/Consumer (Bounded Buffer) Problem

The next problem we will confront in this chapter is known as theproducer/consumerproblem, or sometimes as thebounded bufferproblem

[D72]. This problem is described in detail in the previous chapter on condition variables; see there for details.

First Attempt

Our first attempt at solving the problem introduces two semaphores,empty andfull, which the threads will use to indicate when a buffer entry has been emptied or filled, respectively. The code for the put and get routines is in Figure 31.9, and our attempt at solving the producer and consumer problem is in Figure 31.10.

In this example, the producer first waits for a buffer to become empty in order to put data into it, and the consumer similarly waits for a buffer

to become filled before using it. Let us first imagine thatMAX=1(there is

only one buffer in the array), and see if this works.

Imagine again there are two threads, a producer and a consumer. Let us examine a specific scenario on a single CPU. Assume the consumer gets to run first. Thus, the consumer will hit line c1 in the figure above, calling semwait(&full). Because full was initialized to the value 0,

int fill = 0;
int use = 0;
void put(int value) {
buffer[fill] = value; // line f1
fill = (fill + 1) % MAX; // line f2
}
int get() {
int tmp = buffer[use]; // line g1
use = (use + 1) % MAX; // line g2
return tmp;
}

Figure 31.9:The Put And Get Routines 1 sem_t empty; 2 sem_t full; 3

void *producer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&empty); // line P1
put(i); // line P2
sem_post(&full); // line P3
}
}
void *consumer(void *arg) {
int i, tmp = 0;
while (tmp != -1) {
sem_wait(&full); // line C1
tmp = get(); // line C2
sem_post(&empty); // line C3
printf("%d\n", tmp);
}
}
int main(int argc, char *argv[]) {

24 // ...

25 sem_init(&empty, 0, MAX); // MAX buffers are empty to begin with...

26 sem_init(&full, 0, 0); // ... and 0 are full 27 // ...

Figure 31.10:Adding The Full And Empty Conditions the call will decrement full(to -1), block the consumer, and wait for another thread to callsempost()onfull, as desired.

Assume the producer then runs. It will hit line P1, thus calling the semwait(&empty)routine. Unlike the consumer, the producer will continue through this line, because empty was initialized to the value

MAX (in this case, 1). Thus, empty will be decremented to 0 and the producer will put a data value into the first entry of buffer (line P2). The producer will then continue on to P3 and callsempost(&full), changing the value of the full semaphore from -1 to 0 and waking the consumer (e.g., move it from blocked to ready).

In this case, one of two things could happen. If the producer continues to run, it will loop around and hit line P1 again. This time, however, it would block, as the empty semaphore's value is 0. If the producer instead was interrupted and the consumer began to run, it would call semwait(&full)(line c1) and find that the buffer was indeed full and thus consume it. In either case, we achieve the desired behavior.

You can try this same example with more threads (e.g., multiple producers, and multiple consumers). It should still work.

Let us now imagine thatMAXis greater than 1 (sayMAX= 10). For this

example, let us assume that there are multiple producers and multiple consumers. We now have a problem: a race condition. Do you see where it occurs? (take some time and look for it) If you can't see it, here's a hint:

look more closely at the put() and get() code.

OK, let's understand the issue. Imagine two producers (Pa and Pb) both calling into put() at roughly the same time. Assume producer Pa gets

to run first, and just starts to fill the first buffer entry (fill = 0 at line f1).

Before Pa gets a chance to increment the fill counter to 1, it is interrupted.

Producer Pb starts to run, and at line f1 it also puts its data into the 0th element of buffer, which means that the old data there is overwritten!

This is a no-no; we don't want any data from the producer to be lost.

A Solution: Adding Mutual Exclusion

As you can see, what we've forgotten here ismutual exclusion. The filling of a buffer and incrementing of the index into the buffer is a critical section, and thus must be guarded carefully. So let's use our friend the binary semaphore and add some locks. Figure 31.11 shows our attempt.

Now we've added some locks around the entire put()/get() parts of the code, as indicated by theNEW LINEcomments. That seems like the right idea, but it also doesn't work. Why? Deadlock. Why does deadlock occur? Take a moment to consider it; try to find a case where deadlock arises. What sequence of steps must happen for the program to deadlock?

Avoiding Deadlock

OK, now that you figured it out, here is the answer. Imagine two threads, one producer and one consumer. The consumer gets to run first. It acquires the mutex (line c0), and then callssemwait()on the full semaphore (line c1); because there is no data yet, this call causes the consumer to block and thus yield the CPU; importantly, though, the consumer still holds the lock.

A producer then runs. It has data to produce and if it were able to run, it would be able to wake the consumer thread and all would be good.

Unfortunately, the first thing it does is call semwait()on the binary mutex semaphore (line p0). The lock is already held. Hence, the producer is now stuck waiting too.

3 sem_t mutex; 4

void *producer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&mutex); // line p0 (NEW LINE)
sem_wait(&empty); // line p1
put(i); // line p2
sem_post(&full); // line p3
sem_post(&mutex); // line p4 (NEW LINE)
}
}
void *consumer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&mutex); // line c0 (NEW LINE)
sem_wait(&full); // line c1
int tmp = get(); // line c2
sem_post(&empty); // line c3
sem_post(&mutex); // line c4 (NEW LINE)
printf("%d\n", tmp);
}
}
int main(int argc, char *argv[]) {

29 // ...

30 sem_init(&empty, 0, MAX); // MAX buffers are empty to begin with...

31 sem_init(&full, 0, 0); // ... and 0 are full 32 sem_init(&mutex, 0, 1); // mutex=1 because it is a lock (NEW LINE) 33 // ...

Figure 31.11:Adding Mutual Exclusion (Incorrectly)

There is a simple cycle here. The consumerholds the mutex and is waitingfor the someone to signal full. The producer couldsignalfull but iswaitingfor the mutex. Thus, the producer and consumer are each stuck waiting for each other: a classic deadlock.

Finally, A Working Solution

To solve this problem, we simply must reduce the scope of the lock. Figure 31.12 shows the final working solution. As you can see, we simply move the mutex acquire and release to be just around the critical section; the full and empty wait and signal code is left outside. The result is a simple and working bounded buffer, a commonly-used pattern in multithreaded programs. Understand it now; use it later. You will thank us for years to come. Or at least, you will thank us when the same question is asked on the final exam.

3 sem_t mutex; 4

void *producer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&empty); // line p1
sem_wait(&mutex); // line p1.5 (MOVED MUTEX HERE...)
put(i); // line p2
sem_post(&mutex); // line p2.5 (... AND HERE)
sem_post(&full); // line p3
}
}
void *consumer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&full); // line c1
sem_wait(&mutex); // line c1.5 (MOVED MUTEX HERE...)
int tmp = get(); // line c2
sem_post(&mutex); // line c2.5 (... AND HERE)
sem_post(&empty); // line c3
printf("%d\n", tmp);
}
}
int main(int argc, char *argv[]) {

29 // ...

30 sem_init(&empty, 0, MAX); // MAX buffers are empty to begin with...

31 sem_init(&full, 0, 0); // ... and 0 are full 32 sem_init(&mutex, 0, 1); // mutex=1 because it is a lock 33 // ...

Figure 31.12:Adding Mutual Exclusion (Correctly)

Reader Writer Locks

31.5 Reader-Writer Locks

Another classic problem stems from the desire for a more flexible locking primitive that admits that different data structure accesses might require different kinds of locking. For example, imagine a number of concurrent list operations, including inserts and simple lookups. While inserts change the state of the list (and thus a traditional critical section makes sense), lookups simplyreadthe data structure; as long as we can guarantee that no insert is on-going, we can allow many lookups to proceed concurrently. The special type of lock we will now develop to support this type of operation is known as areader-writer lock[CHP71]. The code for such a lock is available in Figure 31.13.

The code is pretty simple. If some thread wants to update the data structure in question, it should call the new pair of synchronization operations: rwlockacquirewritelock(), to acquire a write lock, and rwlockreleasewritelock(), to release it. Internally, these simply use thewritelocksemaphore to ensure that only a single writer can ac- 2 sem_t lock; // binary semaphore (basic lock) 3 sem_t writelock; // used to allow ONE writer or MANY readers 4 int readers; // count of readers reading in critical section

} rwlock_t;
void rwlock_init(rwlock_t *rw) {
rw->readers = 0;
sem_init(&rw->lock, 0, 1);
sem_init(&rw->writelock, 0, 1);
}
void rwlock_acquire_readlock(rwlock_t *rw) {
sem_wait(&rw->lock);
rw->readers++;
if (rw->readers == 1)
sem_wait(&rw->writelock); // first reader acquires writelock
sem_post(&rw->lock);
}
void rwlock_release_readlock(rwlock_t *rw) {
sem_wait(&rw->lock);
rw->readers--;
if (rw->readers == 0)
sem_post(&rw->writelock); // last reader releases writelock
sem_post(&rw->lock);
}
void rwlock_acquire_writelock(rwlock_t *rw) {
sem_wait(&rw->writelock);
}
void rwlock_release_writelock(rwlock_t *rw) {
sem_post(&rw->writelock);
}

Figure 31.13:A Simple Reader-Writer Lock quire the lock and thus enter the critical section to update the data structure in question.

More interesting is the pair of routines to acquire and release read locks. When acquiring a read lock, the reader first acquires lockand then increments the readersvariable to track how many readers are currently inside the data structure. The important step then taken within rwlockacquirereadlock()occurs when the first reader acquires the lock; in that case, the reader also acquires the write lock by calling semwait()on thewritelocksemaphore, and then finally releasing thelockby callingsempost().

Thus, once a reader has acquired a read lock, more readers will be allowed to acquire the read lock too; however, any thread that wishes to acquire the write lock will have to wait untilall readers are finished; the last one to exit the critical section callssempost()on "writelock" and thus enables a waiting writer to acquire the lock.

This approach works (as desired), but does have some negatives, espe-

TIP: SIMPLEANDDUMBCANBEBETTER(HILL'SLAW)

You should never underestimate the notion that the simple and dumb approach can be the best one. With locking, sometimes a simple spin lock works best, because it is easy to implement and fast. Although something like reader/writer locks sounds cool, they are complex, and complex can mean slow. Thus, always try the simple and dumb approach first.

This idea, of appealing to simplicity, is found in many places. One early source is Mark Hill's dissertation [H87], which studied how to design caches for CPUs. Hill found that simple direct-mapped caches worked better than fancy set-associative designs (one reason is that in caching, simpler designs enable faster lookups). As Hill succinctly summarized his work: "Big and dumb is better." And thus we call this similar advice

Hill's Law.

cially when it comes to fairness. In particular, it would be relatively easy for readers to starve writers. More sophisticated solutions to this problem exist; perhaps you can think of a better implementation? Hint: think about what you would need to do to prevent more readers from entering the lock once a writer is waiting.

Finally, it should be noted that reader-writer locks should be used with some caution. They often add more overhead (especially with more sophisticated implementations), and thus do not end up speeding up performance as compared to just using simple and fast locking primitives [CB08]. Either way, they showcase once again how we can use semaphores in an interesting and useful way.

The Dining Philosophers

31.6 The Dining Philosophers

One of the most famous concurrency problems posed, and solved, by

Dijkstra, is known as thedining philosopher's problem[D71]. The problem is famous because it is fun and somewhat intellectually interesting;

however, its practical utility is low. However, its fame forces its inclusion here; indeed, you might be asked about it on some interview, and you'd really hate your OS professor if you miss that question and don't get the job. Conversely, if you get the job, please feel free to send your OS professor a nice note, or some stock options.

The basic setup for the problem is this (as shown in Figure 31.14): assume there are five "philosophers" sitting around a table. Between each pair of philosophers is a single fork (and thus, five total). The philosophers each have times where they think, and don't need any forks, and times where they eat. In order to eat, a philosopher needs two forks, both the one on their left and the one on their right. The contention for these forks, and the synchronization problems that ensue, are what makes this a problem we study in concurrent programming.

f2 P1

P2 f1 f3 P0

P3 f0 f4 P4

Figure 31.14:The Dining Philosophers

Here is the basic loop of each philosopher:

while (1) {
think();
getforks();
eat();
putforks();
}

The key challenge, then, is to write the routines getforks()and putforks()such that there is no deadlock, no philosopher starves and never gets to eat, and concurrency is high (i.e., as many philosophers can eat at the same time as possible).

Following Downey's solutions [D08], we'll use a few helper functions to get us towards a solution. They are:

int left(int p) { return p; }
int right(int p) { return (p + 1) % 5; }
When philosopherpwishes to refer to the fork on their left, they simply call left(p). Similarly, the fork on the right of a philosopherpis referred to by calling right(p); the modulo operator therein handles
the one case where the last philosopher (p=4) tries to grab the fork on

their right, which is fork 0.

We'll also need some semaphores to solve this problem. Let us assume we have five, one for each fork:semt forks[5].

sem_wait(forks[left(p)]);
sem_wait(forks[right(p)]);
}
void putforks() {
sem_post(forks[left(p)]);
sem_post(forks[right(p)]);
}

Figure 31.15:Thegetforks()Andputforks()Routines

Broken Solution

We attempt our first solution to the problem. Assume we initialize each semaphore (in theforksarray) to a value of 1. Assume also that each philosopher knows its own number (p). We can thus write thegetforks() andputforks()routine as shown in Figure 31.15.

The intuition behind this (broken) solution is as follows. To acquire the forks, we simply grab a "lock" on each one: first the one on the left, and then the one on the right. When we are done eating, we release them.

Simple, no? Unfortunately, in this case, simple means broken. Can you see the problem that arises? Think about it.

The problem isdeadlock. If each philosopher happens to grab the fork on their left before any philosopher can grab the fork on their right, each will be stuck holding one fork and waiting for another, forever. Specifically, philosopher 0 grabs fork 0, philosopher 1 grabs fork 1, philosopher 2 grabs fork 2, philosopher 3 grabs fork 3, and philosopher 4 grabs fork 4; all the forks are acquired, and all the philosophers are stuck waiting for a fork that another philosopher possesses. We'll study deadlock in more detail soon; for now, it is safe to say that this is not a working solution.

A Solution: Breaking The Dependency

The simplest way to attack this problem is to change how forks are acquired by at least one of the philosophers; indeed, this is how Dijkstra himself solved the problem. Specifically, let's assume that philosopher 4 (the highest numbered one) acquires the forks in adifferent order. The code to do so is as follows:

void getforks() {
if (p == 4) {
sem_wait(forks[right(p)]);
sem_wait(forks[left(p)]);
} else {
sem_wait(forks[left(p)]);
sem_wait(forks[right(p)]);
}
}

Because the last philosopher tries to grab right before left, there is no situation where each philosopher grabs one fork and is stuck waiting for another; the cycle of waiting is broken. Think through the ramifications of this solution, and convince yourself that it works.

3 pthread_cond_t cond; 4 pthread_mutex_t lock;

} Zem_t;

6 7 // only one thread can call this

void Zem_init(Zem_t *s, int value) {
s->value = value;
Cond_init(&s->cond);
Mutex_init(&s->lock);
}
void Zem_wait(Zem_t *s) {
Mutex_lock(&s->lock);
while (s->value <= 0)
Cond_wait(&s->cond, &s->lock);
s->value--;
Mutex_unlock(&s->lock);
}
void Zem_post(Zem_t *s) {
Mutex_lock(&s->lock);
s->value++;
Cond_signal(&s->cond);
Mutex_unlock(&s->lock);
}

Figure 31.16:Implementing Zemaphores With Locks And CVs

There are other "famous" problems like this one, e.g., the cigarette smoker's problemor the sleeping barber problem. Most of them are just excuses to think about concurrency; some of them have fascinating names. Look them up if you are interested in learning more, or just getting more practice thinking in a concurrent manner [D08].

31.7 How To Implement Semaphores

Finally, let's use our low-level synchronization primitives, locks and condition variables, to build our own version of semaphores called ...

(drum roll here) ... Zemaphores. This task is fairly straightforward, as you can see in Figure 31.16.

As you can see from the figure, we use just one lock and one condition variable, plus a state variable to track the value of the semaphore. Study the code for yourself until you really understand it. Do it!

One subtle difference between our Zemaphore and pure semaphores as defined by Dijkstra is that we don't maintain the invariant that the value of the semaphore, when negative, reflects the number of waiting threads; indeed, the value will never be lower than zero. This behavior is easier to implement and matches the current Linux implementation.

Curiously, building locks and condition variables out of semaphores

TIP: BECAREFULWITHGENERALIZATION

The abstract technique of generalization can thus be quite useful in systems design, where one good idea can be made slightly broader and thus solve a larger class of problems. However, be careful when generalizing; as Lampson warns us "Don't generalize; generalizations are generally wrong" [L83].

One could view semaphores as a generalization of locks and condition variables; however, is such a generalization needed? And, given the difficulty of realizing a condition variable on top of a semaphore, perhaps this generalization is not as general as you might think.

is a much trickier proposition. Some highly experienced concurrent programmers tried to do this in the Windows environment, and many different bugs ensued [B04]. Try it yourself, and see if you can figure out why building condition variables out of semaphores is more challenging than it might appear.

Learning objectives​

Prerequisites​

Module targets​

AI companion modes​

Source-of-truth note​

External enrichment​

Source provenance​

Merged source​

Semaphores As Condition Variables​