Lab 7: Proxy and MapReduce

This handout was adapted from Jerry Cain’s Spring 2018 offering.

Getting started

Before starting, go ahead and clone the lab7 folder and descend into it. There you’ll create custom mapper and reducer executables that, when driven by my own solution to Assignment 6, process several MBs of data to produce a custom result that might be of interest to the wine industry.

$ git clone /usr/class/cs110/repos/lab7/shared lab7
$ cd lab7
$ make

Problem 1: proxy Thought Questions

Problem 2: Using MapReduce

For this problem, you’re going to flesh out the implementation of mapper and reducer executables (analogous to the word-count-mapper.cc and word-count-reducer.cc files that ship with your assign8 repos) that process CSV file chunks housing, um, wine ratings. You’ll rely on these mapper and reducer executables to generate output files that map country names to the average rating of all of the wines produced within each country. (By the way, I downloaded the data files for this problem from www.kaggle.com, which is chock-full of all kinds of awesome data sets that you can download yourself and play with.)
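To make the country-to-average idea concrete, here is a minimal sketch of the two computations involved. It is not the lab's starter code or solution. The function names extractCountryAndRating and averageByCountry are hypothetical, and the CSV layout (country in the first column, rating in the second, no quoted commas) is an assumption made purely for illustration; the real data files may be laid out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mapper-side helper: pulls a (country, rating) pair out of one CSV row.
// ASSUMPTION: country is column 0 and rating is column 1, with no quoted commas.
// Returns false for rows that can't be parsed (e.g. a header row).
bool extractCountryAndRating(const std::string& row, std::string& country, double& rating) {
  std::istringstream iss(row);
  std::string ratingField;
  if (!std::getline(iss, country, ',') || !std::getline(iss, ratingField, ',')) return false;
  if (country.empty()) return false;
  try {
    rating = std::stod(ratingField);
  } catch (const std::exception&) {
    return false; // non-numeric or out-of-range rating field
  }
  return true;
}

// Hypothetical reducer-side helper: collapses (country, rating) pairs into
// one average rating per country.
std::map<std::string, double>
averageByCountry(const std::vector<std::pair<std::string, double>>& pairs) {
  std::map<std::string, std::pair<double, std::size_t>> totals; // country -> (sum, count)
  for (const auto& p : pairs) {
    totals[p.first].first += p.second;
    totals[p.first].second++;
  }

  std::map<std::string, double> averages;
  for (const auto& entry : totals) {
    assert(entry.second.second > 0); // every key was inserted alongside a rating
    averages[entry.first] = entry.second.first / entry.second.second;
  }
  return averages;
}
```

Under these assumptions, the row "Italy,90,Barolo" yields the pair ("Italy", 90), and the pairs ("Italy", 90), ("France", 88), ("Italy", 86) average out to 88 for each country. In the actual executables, the mapper would write its pairs to an output file and the reducer would read grouped values back in, in whatever format the MapReduce framework expects.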

Here’s the series of steps I’d like you to work through for this problem: