Lab 5 Solutions

These questions were written by Jerry Cain and Ryan Eberhardt.

Before the end of lab, be sure to fill out the lab checkoff sheet here!

Problem 1: Threading Short Answer Questions

Here are some short answer questions about the specifics of threads and the directives that help threads communicate.

Problem 2: Debugging Myth Busters

The myth cluster is composed of several individual computers, numbered myth51-myth66. When you run ssh myth.stanford.edu, a load balancer evaluates how busy each machine is, and it directs you to the machine with the least load. In order to do this, the load balancer needs to be able to efficiently survey how busy each myth machine is.

We have implemented some starter code that sequentially logs into each myth machine and counts the number of running processes. You can get the starter code by cloning the lab5 folder:

$ git clone /usr/class/cs110/repos/lab5/shared lab5
$ cd lab5
$ make

We have provided a basic sequential (no threads) implementation in myth-buster-sequential.cc. The countMythProcesses function builds a map mapping myth numbers (e.g. 56 for myth56) to the number of processes running on that machine:

static const int kMinMythMachine = 51;
static const int kMaxMythMachine = 66;
static map<int, int> countMythProcesses() {
    map<int, int> processCounts;
    for (int mythNum = kMinMythMachine; mythNum <= kMaxMythMachine; mythNum++) {
        // SSH into the myth machine and count the processes it's running
        int numProcesses = getNumProcesses(mythNum);
        // numProcesses may be zero if a myth machine is currently down/dead.
        // Skip those.
        if (numProcesses >= 0) {
            cout << "myth" << mythNum << " has this many processes: " << numProcesses << endl;
            processCounts[mythNum] = numProcesses;
        }
    }
    return processCounts;
}

This code works, but is much too slow, taking 12.5 seconds to run:

🍉 time ./myth-buster-sequential
myth51 has this many processes: 368
myth52 has this many processes: 347
myth53 has this many processes: 318
myth54 has this many processes: 353
myth55 has this many processes: 366
myth56 has this many processes: 325
myth57 has this many processes: 401
myth58 has this many processes: 363
myth59 has this many processes: 420
myth60 has this many processes: 332
myth61 has this many processes: 367
myth62 has this many processes: 324
myth63 has this many processes: 355
myth64 has this many processes: 337
myth65 has this many processes: 355
myth66 has this many processes: 332
Machine with fewest processes: myth53
Number of processes on least loaded machine: 318
./myth-buster-sequential  0.30s user 0.17s system 3% cpu 12.445 total

We want to implement a parallel version of countMythProcesses that uses threads to speed up this process. You should use multiple threads so that you can connect to multiple myth machines in parallel, with the following requirements:

We have implemented a buggy threaded countMythProcesses in myth-buster-threaded.cc. Try running ./myth-buster-threaded, observe the broken behavior, and then try to fix each bug. Try to avoid figuring out the bugs from simply reading and staring at the code. Instead, see if you can use print statements or gdb to figure out what the program is doing incorrectly, and then update the code to fix the issue. We want you to practice figuring out what a program is doing (incorrectly) when it’s not immediately obvious what’s wrong with the code, since this is a skill that will help you the most in your assignments and in your own work after this class.

As an additional, optional exercise after you are finished debugging, see if you can implement your own threaded version from scratch! You can cp myth-buster-sequential.cc myth-buster-threaded.cc to replace our threaded implementation and get a blank canvas for your own implementation.

There were six bugs. Here they are, listed in the order I think you would have found them if you were debugging based on observed symptoms (instead of reading/analyzing the code):

Here is our solution with all bugs fixed:

static int countForMachine(int mythNum, semaphore &concurrentConnections) {
    semaphore_guard sg(concurrentConnections);
    return getNumProcesses(mythNum);
}

static map<int, int> countMythProcesses() {
    map<int, int> processCounts;
    vector<thread> threads;
    mutex processCountsMutex;
    semaphore concurrentConnections(5);
    for (int mythNum = kMinMythMachine; mythNum <= kMaxMythMachine; mythNum++) {
        threads.push_back(thread([mythNum, &processCounts, &processCountsMutex, &concurrentConnections](){
            int numProcesses = countForMachine(mythNum, concurrentConnections);
            // numProcesses may be zero if a myth machine is currently down/dead.
            // Skip those.
            if (numProcesses >= 0) {
                cout << oslock << "myth" << mythNum << " has this many processes: " << numProcesses
                    << endl << osunlock;
                lock_guard<mutex> lg(processCountsMutex);
                processCounts[mythNum] = numProcesses;
            }
        }));
    }
    for (thread &t : threads) {
        t.join();
    }
    return processCounts;
}

Questions for discussion, after the bugs are fixed