Lab 3: Synchronization

These questions were written by Jerry Cain and Ryan Eberhardt.

Note: you probably won’t have time to go through every question in this handout. Work with your lab CA to prioritize the questions you feel most confused about, and try the rest as extra practice if you have time!

Before the end of lab, be sure to fill out the lab checkoff sheet here!

Problem 1: Process scheduling

Recall from lecture that the process scheduler is a component of the operating system that decides whether a running process should continue running and, if not, what process should run next. This scheduler maintains three different data structures to help manage the selection process: the set of running processes, the ready queue, and the blocked set.
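
To make those three structures concrete, here is a toy sketch in C++ (purely illustrative, with hypothetical names; a real kernel’s scheduler is far more involved). A process leaves the running set when it blocks on something like I/O or waitpid, rejoins the ready queue once the event it was waiting on occurs, and is pulled from the front of the ready queue when a CPU frees up.

#include <queue>
#include <set>
#include <sys/types.h>   // pid_t

struct ToyScheduler {
  std::set<pid_t> running;    // processes currently executing on a CPU
  std::queue<pid_t> ready;    // runnable processes waiting for a turn on a CPU
  std::set<pid_t> blocked;    // processes waiting on some event (I/O, waitpid, etc.)

  void block(pid_t pid)   { running.erase(pid); blocked.insert(pid); }  // e.g. process reads from an empty pipe
  void unblock(pid_t pid) { blocked.erase(pid); ready.push(pid); }      // e.g. data arrives in that pipe
  void schedule() {                                                     // a CPU frees up: run the next ready process
    if (!ready.empty()) { running.insert(ready.front()); ready.pop(); }
  }
};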

Problem 2: Racing children

In this problem, we want you to implement a raceChildren function that runs two programs simultaneously and prints out which one finished first. For example, if the first program is sleep 3 and the second is sleep 4, your program should print First child exited first!.

You can find the starter code here. Be careful to properly reap any child processes you create.
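
If you’d like a rough idea of the overall shape before diving in, here’s a minimal sketch of one possible approach. The starter code defines the actual function signature; here we assume, hypothetically, that the two programs arrive as NULL-terminated argument vectors.

#include <cstdio>      // printf
#include <cstdlib>     // exit
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // fork, execvp

static void raceChildren(char *first[], char *second[]) {  // hypothetical signature
  pid_t firstChild = fork();
  if (firstChild == 0) {
    execvp(first[0], first);
    exit(1);   // only reached if execvp fails
  }
  pid_t secondChild = fork();
  if (secondChild == 0) {
    execvp(second[0], second);
    exit(1);
  }
  pid_t finished = waitpid(-1, NULL, 0);  // -1 means "any child": reaps whichever finishes first
  if (finished == firstChild) printf("First child exited first!\n");
  else printf("Second child exited first!\n");
  waitpid(-1, NULL, 0);  // reap the other child so it doesn't linger as a zombie
}

With this sketch, passing argument vectors for sleep 3 and sleep 4 would print First child exited first! after about three seconds, and both children get reaped.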

Problem 3: Analyzing parallel mergesort

Before starting, go ahead and clone the lab3 folder, which contains a mergesort program that uses many processes to sort an array in parallel.

$ git clone /usr/class/cs110/repos/lab3/shared lab3
$ cd lab3
$ make

Consider the architecturally interesting portion of the mergesort executable, which launches 128 peer processes to cooperatively sort an array of 128 randomly generated numbers. The implementations of createSharedArray and freeSharedArray are omitted for the time being.

static bool shouldKeepMerging(size_t start, size_t reach, size_t length) {
  return start % reach == 0 && reach <= length;
}

static void repeatedlyMerge(int numbers[], size_t length, size_t start) {
  int *base = numbers + start;
  for (size_t reach = 2; shouldKeepMerging(start, reach, length); reach *= 2) {
    raise(SIGTSTP);  // stop here so the parent can delay this merge until the neighboring run is fully sorted
    inplace_merge(base, base + reach/2, base + reach);
  }
  exit(0);
}

static void createMergers(int numbers[], pid_t workers[], size_t length) {
  for (size_t start = 0; start < length; start++) {
    workers[start] = fork();
    if (workers[start] == 0) 
      repeatedlyMerge(numbers, length, start);
  }
}

static void orchestrateMergers(int numbers[], pid_t workers[], size_t length) {
  size_t step = 1;
  while (step <= length) {
    // wait until each worker still in play has either stopped or exited
    for (size_t start = 0; start < length; start += step) 
      waitpid(workers[start], NULL, WUNTRACED);
    step *= 2;
    // resume the workers responsible for the next, wider round of merges
    for (size_t start = 0; start < length; start += step) 
      kill(workers[start], SIGCONT);
  }
}

static void mergesort(int numbers[], size_t length) {
  pid_t workers[length];
  createMergers(numbers, workers, length);
  orchestrateMergers(numbers, workers, length);
}

static const size_t kNumElements = 128;
int main(int argc, char *argv[]) {
  for (size_t trial = 1; trial <= 10000; trial++) {
    int *numbers = createSharedArray(kNumElements);    
    mergesort(numbers, kNumElements);
    assert(is_sorted(numbers, numbers + kNumElements));
    freeSharedArray(numbers, kNumElements);
  }
  return 0;
}
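
To see how orchestrateMergers sequences the workers, here’s a short trace assuming a hypothetical length of 8 instead of 128:

- step = 1: the parent waits on all eight workers (WUNTRACED makes waitpid return for stopped children as well as exited ones). Workers 1, 3, 5, and 7 have nothing to merge and exit immediately; workers 0, 2, 4, and 6 stop themselves just before their first merge. The parent then doubles step to 2 and continues workers 0, 2, 4, and 6.
- step = 2: workers 0, 2, 4, and 6 each merge a neighboring pair of elements; 2 and 6 then exit while 0 and 4 stop again. The parent waits on all four, doubles step to 4, and continues workers 0 and 4.
- step = 4: workers 0 and 4 each merge two sorted runs of 2 into a sorted run of 4; 4 exits and 0 stops. The parent waits on both, doubles step to 8, and continues worker 0.
- step = 8: worker 0 merges the two sorted halves into a fully sorted array of 8 and exits. The parent reaps it; the final kill(..., SIGCONT) has no live process left to signal, and the loop ends.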

The program presented above is a nod to concurrent programming and to the question of whether parallelism can reduce the asymptotic running time of an algorithm. (I use the term “asymptotic running time” rather loosely; it generally refers to behavior as the size of the input approaches infinity, and we don’t usually have infinite resources. But for the purposes of this problem, pretend that we do.) We’ll lead you through a series of short questions – some easy, some less easy – to test your multiprocessing and signal chops and to understand why the “asymptotic” running time of an algorithm can sometimes be improved in a parallel programming world.

For reasons I’ll discuss shortly, the program above works because the address stored in the numbers variable is cloned across the 128 fork calls, and that virtual address maps to the same set of physical addresses in all 128 processes (which is different from what usually happens).

The program successfully sorts any array of length 128 by relying on 128 independent processes. (Again, assume infinite resources, so assume we have as many CPU cores as we do elements in the array.) In a nutshell, the above program works because every worker operates on the very same physical array, and the parent uses SIGTSTP, SIGCONT, and waitpid to ensure each merge starts only after the two smaller sorted runs feeding into it are in place.
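
If you want to convince yourself of that claim, here’s a minimal, self-contained demonstration (not part of the lab code) that an int placed in a MAP_SHARED | MAP_ANONYMOUS mapping really is backed by the same physical memory in parent and child, so the parent can observe a write the child makes after the fork:

#include <cassert>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
  int *shared = static_cast<int *>(mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                                        MAP_SHARED | MAP_ANONYMOUS, -1, 0));
  *shared = 0;
  pid_t pid = fork();
  if (pid == 0) {       // child: write through the shared mapping and exit
    *shared = 110;
    return 0;
  }
  waitpid(pid, NULL, 0);
  assert(*shared == 110);   // parent sees the child's write
  munmap(shared, sizeof(int));
  return 0;
}

Had shared been allocated with malloc or on the stack, the child’s write would have landed in its own private copy of the page and the assert would fail.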

For this lab problem, we want to lead you through a series of short answer questions to verify a deeper understanding of how the entire mergesort system works. Truth be told, the mergesort algorithm we’ve implemented is of more theoretical interest than practical use. But it’s still a neat example of parallel programming that feels much more relevant and real-world than the toy examples I’ve been presenting in lecture.

Use the following short answer questions to guide the discussion.

The createSharedArray function (defined in memory.h and memory.cc in your lab3 repo) sets aside space for an array of 128 (or, more generally, length) integers and seeds it with random numbers. It does so using the mmap function you’ve seen in Assignment 1.

static int *createSharedArray(size_t length) {
    int *numbers =
        static_cast<int *>(mmap(NULL, length * sizeof(int), PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0));
    
    static RandomGenerator rgen;
    for (size_t i = 0; i < length; i++) 
        numbers[i] = rgen.getNextInt(kMinValue, kMaxValue);
    return numbers;
}

The mmap function takes the place of malloc here, because it sets up space not in the heap, but in a brand new segment of virtual memory. MAP_ANONYMOUS means the segment isn’t backed by a file, and MAP_SHARED means the segment is shared with, rather than copied into, any child processes subsequently created via fork, so all of the workers can see and touch the very same physical memory.
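
For completeness, the matching freeSharedArray (whose actual implementation lives in memory.cc) presumably just unmaps the segment; a minimal sketch would look something like this:

#include <cstddef>      // size_t
#include <sys/mman.h>   // munmap

static void freeSharedArray(int *numbers, size_t length) {
  munmap(numbers, length * sizeof(int));   // release the shared segment set up by mmap
}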

Fun/extra problem 4: The /proc directory

Note: This problem isn’t important, but I think it’s fascinating. It demonstrates how the /proc virtual filesystem can be explored to view information about running processes on your machine.
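
As a taste of what’s there: every process gets a directory under /proc named after its pid, with files like status, cmdline, and fd describing it, and /proc/self is a symlink to the current process’s own directory. Here’s a tiny, hypothetical C++ example (not part of the lab) that prints its own status file:

#include <fstream>
#include <iostream>
#include <string>

int main() {
  std::ifstream status("/proc/self/status");   // /proc/self points at this very process
  std::string line;
  while (std::getline(status, line))
    std::cout << line << '\n';                 // fields like Name, State, Pid, VmSize, ...
  return 0;
}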