Better, But Still Broken: While, Not If

Let's improve the solution to the producer/consumer problem from the last lesson.

In the last lesson, we saw how the solution we came up with for the producer/consumer problem did not work correctly for more than one consumer due to the race condition. Fortunately, fixing it is easy (see the code excerpt below): change the if to a while.

// count, put(), and get() are assumed to be defined as in the last lesson.
int loops; // must initialize somewhere...
cond_t cond;
mutex_t mutex;

void *producer(void *arg) {
    int i;
    for (i = 0; i < loops; i++) {
        Pthread_mutex_lock(&mutex);           // p1
        while (count == 1)                    // p2: re-check after every wakeup
            Pthread_cond_wait(&cond, &mutex); // p3
        put(i);                               // p4
        Pthread_cond_signal(&cond);           // p5
        Pthread_mutex_unlock(&mutex);         // p6
    }
    return NULL;
}

void *consumer(void *arg) {
    int i;
    for (i = 0; i < loops; i++) {
        Pthread_mutex_lock(&mutex);           // c1
        while (count == 0)                    // c2: re-check after every wakeup
            Pthread_cond_wait(&cond, &mutex); // c3
        int tmp = get();                      // c4
        Pthread_cond_signal(&cond);           // c5
        Pthread_mutex_unlock(&mutex);         // c6
        printf("%d\n", tmp);
    }
    return NULL;
}

Why does this work?

Think about why this works: consumer $T_{c1}$ now wakes up and (with the lock held) immediately re-checks the state of the shared variable (c2). If the buffer is empty at that point, the consumer simply goes back to sleep (c3). The corresponding if in the producer (p2) is also changed to a while.

Thanks to Mesa semantics, a simple rule to remember with condition variables is to always use while loops. Sometimes you don’t have to re-check the condition, but it is always safe to do so; just do it and be happy.
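The rule above can be illustrated in isolation with a minimal sketch. It uses the plain POSIX calls rather than the lesson's capitalized wrappers, and the names `ready`, `waiter`, and `wait_for_flag_demo` are made up for this example. The waiter only proceeds once it has seen the condition hold while holding the lock, no matter how many times it is woken.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t flag_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  flag_cond = PTHREAD_COND_INITIALIZER;
static int ready = 0; // shared state the waiter depends on (hypothetical)

static void *waiter(void *arg) {
    int *seen = arg;
    pthread_mutex_lock(&flag_lock);
    // With Mesa semantics a wakeup is only a hint that the state may have
    // changed, so the condition is re-checked every time wait returns.
    while (!ready)
        pthread_cond_wait(&flag_cond, &flag_lock); // releases the lock while asleep
    assert(ready); // holds here: the check was made with the lock held
    *seen = ready;
    pthread_mutex_unlock(&flag_lock);
    return NULL;
}

// Hypothetical driver: starts one waiter, sets the flag, and signals.
// Returns the value the waiter observed, or -1 if the thread could not start.
int wait_for_flag_demo(void) {
    pthread_t t;
    int seen = 0;

    pthread_mutex_lock(&flag_lock);
    ready = 0;
    pthread_mutex_unlock(&flag_lock);

    if (pthread_create(&t, NULL, waiter, &seen) != 0)
        return -1;

    pthread_mutex_lock(&flag_lock);
    ready = 1;                       // change the state first...
    pthread_cond_signal(&flag_cond); // ...then signal, all under the lock
    pthread_mutex_unlock(&flag_lock);

    pthread_join(t, NULL);
    return seen;
}
```

If the signal happens before the waiter ever reaches the loop, the waiter sees `ready` already set and skips the wait entirely, so no wakeup is lost in either ordering.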

However, this code still has a bug, the second of two problems mentioned in the last lesson. Can you see it? It has something to do with the fact that there is only one condition variable. Try to figure out what the problem is, before reading ahead. DO IT!

(pause for you to think, or close your eyes…)

Let’s confirm you figured it out correctly. The problem occurs when two consumers run first ($T_{c1}$ and $T_{c2}$) and both go to sleep (c3). Then, the producer runs, puts a value in the buffer, and wakes one of the consumers (say $T_{c1}$). The producer then loops back (releasing and reacquiring the lock along the way) and tries to put more data in the buffer; because the buffer is full, the producer instead waits on the condition (thus sleeping). Now, one consumer is ready to run ($T_{c1}$), and two threads are sleeping on a condition ($T_{c2}$ and $T_p$). We are about to cause a problem: things are getting exciting!

The consumer $T_{c1}$ then wakes by returning from wait() (c3), re-checks the condition (c2), and finding the buffer full, consumes the value (c4). This consumer then, critically, signals on the condition (c5), waking only one thread that is sleeping. However, which thread should it wake?

Because the consumer has emptied the buffer, it clearly should wake the producer. However, if it wakes the consumer $T_{c2}$ (which is definitely possible, depending on how the wait queue is managed), we have a problem. Specifically, the consumer $T_{c2}$ will wake up and find the buffer empty (c2), and go back to sleep (c3) without signaling anyone. The producer $T_p$, which has a value to put into the buffer, is left sleeping. The other consumer thread, $T_{c1}$, loops back, finds the buffer empty, and also goes back to sleep. All three threads are left sleeping, a clear bug; see the figure below for the brutal step-by-step of this terrible calamity.

Signaling is clearly needed but must be more directed. A consumer should not wake other consumers, only producers, and vice-versa.
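One way to direct the signals, sketched here as an assumption rather than as the lesson's own code, is to use two condition variables: producers sleep on one and consumers on the other, so a consumer's signal can only ever reach a producer and vice versa. The sketch uses plain POSIX calls and a single-slot buffer; the names `empty_cv`, `fill_cv`, `total`, and `run_two_cv_demo` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  empty_cv = PTHREAD_COND_INITIALIZER; // producers wait here
static pthread_cond_t  fill_cv  = PTHREAD_COND_INITIALIZER; // consumers wait here
static int  buffer;     // single-slot buffer
static int  count = 0;  // 0 = empty, 1 = full
static long total = 0;  // sum of consumed values, protected by buf_lock

static void put(int v) { assert(count == 0); buffer = v; count = 1; }
static int  get(void)  { assert(count == 1); count = 0; return buffer; }

static void *producer(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&buf_lock);
        while (count == 1)                        // buffer full: wait for a consumer
            pthread_cond_wait(&empty_cv, &buf_lock);
        put(i);
        pthread_cond_signal(&fill_cv);            // wakes a consumer, never a producer
        pthread_mutex_unlock(&buf_lock);
    }
    return NULL;
}

static void *consumer(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&buf_lock);
        while (count == 0)                        // buffer empty: wait for the producer
            pthread_cond_wait(&fill_cv, &buf_lock);
        total += get();
        pthread_cond_signal(&empty_cv);           // wakes a producer, never a consumer
        pthread_mutex_unlock(&buf_lock);
    }
    return NULL;
}

// Hypothetical driver: one producer, two consumers. Each consumer takes
// per_consumer items; the producer makes exactly enough for both.
// Error checking on pthread_create is omitted for brevity.
long run_two_cv_demo(int per_consumer) {
    pthread_t p, c1, c2;
    int produced = 2 * per_consumer;

    total = 0;
    count = 0;
    pthread_create(&c1, NULL, consumer, &per_consumer);
    pthread_create(&c2, NULL, consumer, &per_consumer);
    pthread_create(&p,  NULL, producer, &produced);
    pthread_join(p,  NULL);
    pthread_join(c1, NULL);
    pthread_join(c2, NULL);
    return total;
}
```

In the failing scenario above, the consumer's signal at c5 would go to `empty_cv`, where only the producer can be sleeping, so the second consumer cannot absorb the wakeup meant for the producer.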
