Lecture 12: Semaphores

Note: Reading these lecture notes is not a substitute for watching the lecture. I frequently go off script, and you are responsible for understanding everything I talk about in lecture unless I specify otherwise.

Deadlock with the Dining Philosophers

The Dining Philosophers problem is phrased as follows:

Five philosophers sit around a table with an enormous plate of spaghetti on it. There is a fork in between each philosopher (for a total of five forks). Each philosopher thinks for some time, then grabs both forks surrounding him – one on his left, one on his right – and starts eating, then puts them back and goes back to thinking.

Illustration courtesy of Roz Cyrus.

Here is some code that models each philosopher as a thread, and each fork as a mutex (such that only one philosopher can be holding the fork at a time):

This code suffers from deadlock. Consider this possibility: At the very same time, every philosopher grabs the fork to the left of him. Then, when each philosopher goes to grab the fork to his right, it will have already been taken by the philosopher to his right. (This can still happen even on a single-CPU machine: each philosopher grabs the fork to their left, then their time slice ends and the scheduler switches to a different philosopher.)
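To make that unlucky interleaving concrete, here is a minimal sketch that forces it to happen. It is not the lecture's code: the function name countPhilosophersWithBothForks and the two atomic counters are made up for illustration, and the right fork is taken with try_lock() instead of lock() so that the program reports the failure rather than hanging forever.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: returns how many philosophers managed to pick up
// their right fork once every philosopher already holds a left fork.
int countPhilosophersWithBothForks(size_t numPhilosophers) {
  assert(numPhilosophers >= 2);  // with one philosopher, left and right are the same fork
  std::vector<std::mutex> forks(numPhilosophers);
  std::atomic<size_t> holdingLeft{0};  // philosophers that have grabbed their left fork
  std::atomic<size_t> triedRight{0};   // philosophers that have attempted their right fork
  std::atomic<int> gotBoth{0};

  std::vector<std::thread> philosophers;
  for (size_t i = 0; i < numPhilosophers; i++) {
    philosophers.emplace_back([&, i] {
      std::mutex &left = forks[i];
      std::mutex &right = forks[(i + 1) % numPhilosophers];
      left.lock();
      holdingLeft++;
      // Force the bad interleaving: wait until everyone holds a left fork.
      while (holdingLeft < numPhilosophers) std::this_thread::yield();
      // A plain right.lock() here would block forever.
      if (right.try_lock()) {
        gotBoth++;
        right.unlock();
      }
      triedRight++;
      // Hold the left fork until everyone has tried, so no late attempt succeeds.
      while (triedRight < numPhilosophers) std::this_thread::yield();
      left.unlock();
    });
  }
  for (std::thread &p : philosophers) p.join();
  return gotBoth;
}
```

Calling countPhilosophersWithBothForks(5) returns 0: every right fork is already some neighbor's left fork, so nobody can make progress.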

Let’s fix this. Crucially, we can notice that even though there are 5 philosophers and 5 forks, at most two of them can be eating at the same time (since in that scenario, four forks would be in use). The deadlock only arises when all five philosophers try to start eating at once and each ends up holding a single fork. We need some way to limit how many philosophers can attempt to start eating at the same time, so that at least one of them is always able to get both forks.

Semaphores

A semaphore is another kind of synchronization primitive used to synchronize threads. You can think of a semaphore as a bucket of balls: threads can call semaphore.signal() to place a ball in the bucket, and they can call semaphore.wait() to wait for at least one ball to be in the bucket, then take one out.

semaphore.wait():
- If a ball is already in the bucket, the thread will take the ball and immediately return without blocking
- If there are no balls in the bucket, the thread will wait for one to be available, then take it and return.
semaphore.signal():
- Adds a ball to the bucket
- Never blocks, because a thread calling this function isn’t waiting for anything.
semaphore::semaphore(int initialCount):
- The constructor initializes the semaphore with some initial number of balls in the bucket.
Note that a semaphore isn’t a data structure and it doesn’t actually store anything. Internally, it just stores an integer counter to remember how many things are in the bucket, but there is no data being placed in / taken out of the bucket.
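To show how little is inside the bucket, here is a minimal sketch of a semaphore built from a mutex and a condition variable. This is an assumption about one reasonable way to write it, not the course's actual semaphore class: the class name Semaphore and the available() helper are hypothetical, and available() exists only so the counter can be inspected.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Sketch only: a counter guarded by a mutex, plus a condition variable
// that lets waiting threads sleep until the counter becomes positive.
class Semaphore {
 public:
  explicit Semaphore(int initialCount = 0) : count(initialCount) {
    assert(initialCount >= 0);  // the bucket can't start with a negative number of balls
  }

  // Wait for at least one ball to be in the bucket, then take it.
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return count > 0; });
    count--;
  }

  // Add a ball to the bucket and wake one waiting thread, if any.
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    count++;
    cv.notify_one();
  }

  // Hypothetical helper for inspection; not part of the interface above.
  int available() {
    std::lock_guard<std::mutex> lock(m);
    return count;
  }

 private:
  int count;
  std::mutex m;
  std::condition_variable cv;
};
```

For example, after constructing Semaphore s(2) and calling s.wait() twice, the bucket is empty, and a third s.wait() would block until some thread calls s.signal().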

Last class, we introduced a mutex as a way to ensure only one thread is doing something at a time, and we mentioned how you can think of a mutex as a “rubber chicken” used to ensure that in a heated discussion, everyone must hold the chicken before talking, so that only one person can be talking at a time. If it helps, you can conceptualize a semaphore as a bucket of rubber chickens: a thread needs to take a chicken from the bucket before it can do anything, and then puts it back when it’s done. Unlike a mutex, there can be multiple chickens.

There are two main patterns in which semaphores are used:

- They can be used to limit how many things are happening at a time. We create a bucket of balls/“rubber chickens.” Before doing something, a thread will take a chicken, and afterwards, put it back.
- They can be used to coordinate handoff between threads (where some threads are “producers,” and other threads are “consuming” data that was produced). We can start with an empty bucket. When a producer generates something that needs to be handled, it will put a ball in the bucket. A consumer will wait for a ball to be placed in the bucket, and then it will know that there is data to be handled.

Using semaphores for rate limiting: Internet Cafe

Let’s imagine we’re modeling an Internet cafe, where there is a fixed number of computers, and there are several people wishing to use those computers. For our example, let’s say there are 4 computers and 10 people trying to use them.

Here is a toy example that models the people as threads:
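The original listing is not reproduced in these notes, so what follows is a minimal sketch of what such a toy example might look like. The function name runCafe, the 50 ms "browsing" delay, and the print mutex are all assumptions made for illustration.

```cpp
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// Sketch: every person is a thread, and nothing limits how many of them
// use a computer at once. Returns how many people finished.
int runCafe(int numPeople) {
  std::mutex printLock;  // keeps output lines from interleaving; also guards `finished`
  int finished = 0;
  std::vector<std::thread> people;
  for (int id = 1; id <= numPeople; id++) {
    people.emplace_back([id, &printLock, &finished] {
      {
        std::lock_guard<std::mutex> lg(printLock);
        std::cout << id << " started using a computer." << std::endl;
      }
      // Pretend to browse for a while.
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
      {
        std::lock_guard<std::mutex> lg(printLock);
        std::cout << id << " finished using the computer." << std::endl;
        finished++;
      }
    });
  }
  for (std::thread &p : people) p.join();
  return finished;
}
```

Calling runCafe(10) starts all ten threads right away; the exact order of the printed lines depends on the scheduler.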

But there is no synchronization here, so all 10 people start using the computers at once, which would be physically impossible:

1 started using a computer.
2 started using a computer.
3 started using a computer.
4 started using a computer.
5 started using a computer.
6 started using a computer.
7 started using a computer.
8 started using a computer.
9 started using a computer.
10 started using a computer.
1 finished using the computer.
4 finished using the computer.
3 finished using the computer.
7 finished using the computer.
8 finished using the computer.
5 finished using the computer.
10 finished using the computer.
9 finished using the computer.
6 finished using the computer.
2 finished using the computer.

We can fix this by using a semaphore to limit how many threads are “using the computer” at a time. We create a bucket of 4 balls; before using a computer, a thread needs to acquire a ball, and it puts it back when it is finished.
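As a rough sketch of that fix, the code below gives each thread a wait() before "using the computer" and a signal() afterwards, and records the largest number of simultaneous users. The Semaphore class here is a stand-in written for this example (not the course's semaphore class), and peakUsers and the 20 ms delay are likewise hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in semaphore: a counter guarded by a mutex and a condition variable.
class Semaphore {
 public:
  explicit Semaphore(int initialCount = 0) : count(initialCount) {}
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return count > 0; });
    count--;
  }
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    count++;
    cv.notify_one();
  }
 private:
  int count;
  std::mutex m;
  std::condition_variable cv;
};

// Returns the largest number of people that were using a computer at once.
int peakUsers(int numPeople, int numComputers) {
  assert(numComputers > 0);
  Semaphore computers(numComputers);  // one ball per computer
  std::mutex statsLock;               // guards current and peak
  int current = 0, peak = 0;
  std::vector<std::thread> people;
  for (int i = 0; i < numPeople; i++) {
    people.emplace_back([&] {
      computers.wait();  // take a ball: claims one computer
      {
        std::lock_guard<std::mutex> lg(statsLock);
        current++;
        peak = std::max(peak, current);
      }
      // Pretend to browse for a while.
      std::this_thread::sleep_for(std::chrono::milliseconds(20));
      {
        std::lock_guard<std::mutex> lg(statsLock);
        current--;
      }
      computers.signal();  // put the ball back for the next person
    });
  }
  for (std::thread &p : people) p.join();
  return peak;
}
```

With 10 people and 4 computers, peakUsers(10, 4) never reports more than 4, because a fifth thread blocks in wait() until someone calls signal().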

Fixing the Dining Philosophers with semaphores

We can fix our deadlock by only allowing up to four philosophers to grab forks at a time. This way, at least one philosopher is guaranteed to be able to grab both forks; as soon as they finish eating and put the forks down, other philosophers will be able to eat.

We can create a semaphore that is initialized with four “rubber chickens” in the bucket. Before a philosopher can start eating, they first have to grab a rubber chicken (waiting for one to be available in the bucket if there are none). When they’re done, they put it back in the bucket for someone else to grab.

Cplayground here:

int main(int argc, const char *argv[]) {
  mutex forks[kNumForks];
  thread philosophers[kNumPhilosophers];
  semaphore chickens(4);
  for (size_t i = 0; i < kNumPhilosophers; i++) {
    mutex &left = forks[i];
    mutex &right = forks[(i + 1) % kNumPhilosophers];
    philosophers[i] = thread([i, &left, &right, &chickens](){
      philosopher(i, left, right, chickens);
    });
  }
  for (thread& p: philosophers) p.join();
  return 0;
}

static void philosopher(size_t id, mutex& left, mutex& right, semaphore& chickens) {
  for (size_t i = 0; i < kNumMeals; i++) {
    think(id);
    chickens.wait();
    eat(id, left, right);
    chickens.signal();
  }
}

Using semaphores to coordinate handoff between threads

As mentioned earlier, we can also use semaphores to coordinate handoff in which one thread “gives” something to another thread.

In this example, we create an empty bucket. A producer thread places balls in the bucket, and a different thread takes them out. (If this were a real program, the producer thread would place some data somewhere in shared memory, then place a ball in the bucket to wake up the consumer thread. The consumer thread, waking up, would know that data is available, and would read from that place in shared memory.)

Try stepping through the code in cplayground here:

void produce(int creationCount, semaphore &s) {
    for (int i = 0; i < creationCount; i++) {
        cout << oslock << "Now creating " << i << endl << osunlock;
        s.signal();
    }
}

void consume(int consumeCount, semaphore &s) {
    for (int i = 0; i < consumeCount; i++) {
        s.wait();
        cout << oslock << "Now consuming " << i << endl << osunlock;
    }
}

int main(int argc, const char *argv[]) {
    semaphore zeroSemaphore(0); // can omit (0), since default initializes to 0
    int numIterations = 5;
    
    // The thread routines must match the functions defined above: produce and consume.
    thread thread_waited_on(produce, numIterations, ref(zeroSemaphore));
    thread waiting_thread(consume, numIterations, ref(zeroSemaphore));
    
    thread_waited_on.join();
    waiting_thread.join();
    return 0;
}

Reading/writing to a ring buffer

A ring buffer is a fixed-size array. Data is written to the buffer normally, but when a writer reaches the end of the buffer, it wraps around to the beginning and starts writing data at the beginning of the array.

Ring buffers are commonly used for threads to send data to each other. A producer can write data into the buffer, and a consumer will wait for data to be added. However, the producer must also take care to avoid wrapping to the beginning of the array and overwriting data that hasn’t yet been read by the consumer.

To coordinate access, we can use two semaphores:

- One semaphore, which we’ll call full, is initialized to 0. Conceptually, this semaphore is a bucket of bytes that have been added to the buffer. The producer signals this semaphore after adding data to the buffer, and the consumer waits on this semaphore before consuming data.
- Another semaphore, which we’ll call empty, is initialized to the size of the ring buffer (8 in our example). Conceptually, this semaphore is a bucket of free spaces in the buffer. The consumer signals this semaphore after reading data to indicate that more space is available for writing, and the producer waits on this semaphore before writing (to ensure that it doesn’t overwrite data that hasn’t yet been read).
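Putting the two semaphores together, a writer and a reader might be sketched as follows. The names full and empty and the buffer size of 8 come from the description above; the Semaphore class is a stand-in written for this example, and sendThroughRingBuffer is a hypothetical wrapper that exists only so the result can be checked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Stand-in semaphore: a counter guarded by a mutex and a condition variable.
class Semaphore {
 public:
  explicit Semaphore(int initialCount = 0) : count(initialCount) {}
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return count > 0; });
    count--;
  }
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    count++;
    cv.notify_one();
  }
 private:
  int count;
  std::mutex m;
  std::condition_variable cv;
};

const size_t kBufferSize = 8;

// Sends message one byte at a time from a writer thread to a reader thread
// through an 8-byte ring buffer, and returns what the reader received.
std::string sendThroughRingBuffer(const std::string &message) {
  char buffer[kBufferSize];
  Semaphore full(0);                               // bytes waiting to be read
  Semaphore empty(static_cast<int>(kBufferSize));  // free slots available for writing
  std::string received;

  std::thread writer([&] {
    for (size_t i = 0; i < message.size(); i++) {
      empty.wait();                          // wait for a free slot
      buffer[i % kBufferSize] = message[i];  // wrap around at the end of the array
      full.signal();                         // announce one more readable byte
    }
  });
  std::thread reader([&] {
    for (size_t i = 0; i < message.size(); i++) {
      full.wait();                           // wait for a byte to be available
      received += buffer[i % kBufferSize];
      empty.signal();                        // hand the slot back to the writer
    }
  });
  writer.join();
  reader.join();
  return received;
}
```

Even for messages much longer than 8 bytes, the reader gets the bytes in order: the writer can never get more than kBufferSize slots ahead, so it never overwrites a byte the reader hasn't consumed yet.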