Lab 4: Assignment 4 Redux, Multithreading
This handout was adapted from Jerry Cain’s Spring 2018 offering.
I’ve created a Slack channel for Lab 4 discussion (aptly named #lab4), and all students (but particularly remote students) are encouraged to share their ideas there.
Problem 1: Assignment 2 Revisited
Here is a collection of short answer questions drilling your understanding of `subprocess` and `farm`. It's not uncommon for students to get working solutions to assignments and still not be entirely clear why they work. These questions are here to force you to think big picture and understand the systems concepts I feel are important.
- Your Assignment 2 implementation of `subprocess` required two pipes: one to foster a parent-to-child communication channel, and a second to foster a child-to-parent communication channel. Clearly explain why a single pipe shouldn't be used to support both communication channels.
- You've seen `dprintf` in lecture and in the `assign3` handout, and it presumably contributed to many `farm` implementations. Explain why there's a `dprintf` function, but there's no analogous `dscanf` function. Hint: Think about why `dprintf(fd, "%s %d\n", str, i)` would be easy to manage whereas `dscanf(fd, "%s %d\n", str, &i)` wouldn't be. Read the first few lines of the `man` pages for the traditional `fprintf` and `fscanf` functions to understand how they operate.
- Consider the implementation of `spawnAllWorkers` below. Even though it rarely causes problems, the highlighted line technically contains a race condition. Briefly describe the race condition, and explain how to fix it.

```cpp
struct worker {
  worker() {}
  worker(char *argv[], const set<int>& openfds = set<int>())
    : sp(subprocess(argv, true, false, openfds)), available(false) {}
  subprocess_t sp;
  bool available;
};

static const size_t kNumCPUs = sysconf(_SC_NPROCESSORS_ONLN);
static vector<worker> workers(kNumCPUs);
static size_t numWorkersAvailable;
static const char *kWorkerArguments[] = { "./factor.py", "--self-halting", NULL };

static void spawnAllWorkers() {
  cout << "There are this many CPUs: " << kNumCPUs << ", numbered 0 through "
       << kNumCPUs - 1 << "." << endl;
  set<int> openfds;
  for (size_t i = 0; i < workers.size(); i++) {
    workers[i] = worker(const_cast<char **>(kWorkerArguments), openfds); // highlighted line
    openfds.insert(workers[i].sp.supplyfd);
    assignToCPU(workers[i].sp.pid, i); // implementation omitted, irrelevant
  }
}

int main(int argc, char *argv[]) {
  signal(SIGCHLD, markWorkersAsAvailable); // markWorkersAsAvailable is correct
  spawnAllWorkers();
  // other functions, assume all correct
  return 0;
}
```
While implementing the `farm` program for `assign2`, you were expected to implement a `getAvailableWorker` function to effectively block `farm` until at least one worker was available. My own solution relied on a helper function I called `waitForAvailableWorker`, which I present below. After analyzing my own solution, answer the following questions:

- Assuming no signals are blocked at the time `waitForAvailableWorker` is called, clearly identify when `SIGCHLD` is blocked and when it is not.
- Had I accidentally passed in `&additions` to the `sigsuspend` call instead of `&existing`, the farm could have deadlocked. Explain why.
- Had I accidentally omitted the `sigaddset` call and not blocked `SIGCHLD`, `farm` could have deadlocked. Explain how.
- In past quarters, I saw a bunch of students who lifted the block on `SIGCHLD` before the two lines in bold instead of after. As it turns out, executing `numWorkersAvailable--` after the block is lifted can cause problems, but executing `workers[i].available = false` actually can't. Explain why the placement of the `--` is more sensitive to race conditions than the Boolean assignment is.
```cpp
static sigset_t waitForAvailableWorker() {
  sigset_t existing, additions;
  sigemptyset(&additions);
  sigaddset(&additions, SIGCHLD);
  sigprocmask(SIG_BLOCK, &additions, &existing);
  while (numWorkersAvailable == 0) sigsuspend(&existing);
  return existing;
}

static size_t getAvailableWorker() {
  sigset_t existing = waitForAvailableWorker();
  size_t i;
  for (i = 0; !workers[i].available; i++);
  assert(i < workers.size());
  numWorkersAvailable--;                     // first of the two lines in bold
  workers[i].available = false;              // second of the two lines in bold
  sigprocmask(SIG_SETMASK, &existing, NULL); // restore original block set
  return i;
}
```
Problem 2: Threading Short Answer Questions
Here are some more short answer questions about the specifics of threads and the directives that help threads communicate.
- Is `i--` thread safe on a single core machine? Why or why not?
- What's the difference between a `mutex` and a `semaphore` with an initial value of 1? Can one be substituted for the other?
- What is the `lock_guard` class used for, and why is it useful?
- What is busy waiting? Is it ever a good idea? Does your answer to the good-idea question depend on the number of CPUs your multithreaded application has access to?
- As it turns out, the semaphore's constructor allows a negative number to be passed in, as with `semaphore s(-11)`. Identify a scenario where -11 might be a sensible initial value.
- What would the implementation of `semaphore::signal(size_t increase = 1)` need to look like if we wanted to allow a `semaphore`'s encapsulated value to be promoted by the `increase` amount? Note that `increase` defaults to 1, so this version could just replace the standard `semaphore::signal` that's officially exported by the `semaphore` abstraction.
- What's the multiprocessing equivalent of the `mutex`?
- What's the multiprocessing equivalent of the `condition_variable_any`?
- The `semaphore` implementation we produced in lecture is repeated below. Many students have asked whether the very last line of `semaphore::signal` could have called `notify_one` instead of `notify_all`. Does it matter? Explain.
```cpp
semaphore::semaphore(int value) : value(value) {}

void semaphore::wait() {
  lock_guard<mutex> lg(m);
  cv.wait(m, [this]{ return value > 0; });
  value--;
}

void semaphore::signal() {
  lock_guard<mutex> lg(m);
  value++;
  if (value == 1) cv.notify_all();
}
```
Problem 3: Threads vs Processes
To provide some answers about the pros and cons of threads and processes, we’ll lead you through a collection of short answer questions about multiprocessing and multithreading that focuses more on the big picture.
- What does it mean when we say that a process has a private address space?
- What are the advantages of a private address space?
- What are the disadvantages?
- What programming directives have we used in prior discussion section handouts to circumvent address space privacy?
- When architecting a larger program like `farm` or `stsh` that relies on multiprocessing, what did we need to do to exchange information across process boundaries?
- Can a process be used to execute multiple executables? Restated, can it `execvp` twice to run multiple programs?
- Threads are often called lightweight processes. In what sense are they processes? And why the lightweight distinction?
- Threads are often called virtual processes as well. In what sense are threads an example of virtualization?
- Threads running within the same process all share the same address space. What are the advantages and disadvantages of allowing threads to access pretty much all of virtual memory?
Each thread within a larger process is given a thread id, also called a tid. In fact, the thread id concept is just an extension of the process id. For singly threaded processes, the pid and the main thread’s tid are precisely the same. If a process with pid 12345 creates three additional threads beyond the main thread (and no other processes or threads within other processes are created), then the tid of the main thread would be 12345, and the thread ids of the three other threads would be 12346, 12347, and 12348.
- What are the advantages of leveraging the pid abstraction for thread ids?
- What happens if you pass a thread id that isn't a process id to `waitpid`?
- What happens if you pass a thread id to `sched_setaffinity`?
- What are the advantages of requiring that a thread always be assigned to the same CPU?
In some situations, the decision to rely on multithreading instead of multiprocessing is dictated solely by whether the code to be run apart from the main thread is available in executable or in library form. But in other situations, we have a choice.
- Why might you prefer multithreading over multiprocessing if both are reasonably good options?
- Why might you prefer multiprocessing over multithreading if both are reasonably good options?
- What happens if a thread within a larger process calls `fork`?
- What happens if a thread within a larger process calls `execvp`?
And some final questions about how `farm` and `stsh` could be implemented:

- Assume you know nothing about multiprocessing but needed to emulate the functionality of `farm`. How could you use multithreading to design an equally parallel solution without ever relying on multiprocessing?
- Could multithreading contribute to the implementation of `stsh` in any meaningful way? Why or why not?