Assignment 4: Stanford Shell
Most of this assignment was written by Jerry Cain, with some additions by Ryan Eberhardt.
Kudos to Randal Bryant and Dave O’Hallaron of Carnegie Mellon for assignment inspiration and for parts of this handout. Huge thanks to Truman Cranor for writing the command line parser using tools you’ll learn in CS143, which you should all take someday because it’s an amazing class.
You’ve all been using shells to drive your conversations with UNIX systems
since the first day you logged into a myth machine. It’s high time we uncover
the shell’s magic by building one, almost from scratch, to support process
control, job lists, signals, pipelines, and I/O redirection, all while managing
the interprocess concurrency problems that make a shell’s implementation a
genuinely advanced systems programming project. There’s lots of neat code to
write, and with your smarts and my love to guide you, I’m confident you can
pull it off.
Due Date: Friday, July 30th, 2021 at 11:59 p.m. PT.
If you submit by Thursday 11:59 p.m., we’ll give you a 3% bonus.
All coding should be done on a myth cluster machine, as that’s where we’ll be
testing all assign4 submissions. You should clone the git repository we’ve
set up for you by typing:
git clone /usr/class/cs110/repos/assign4/$USER assign4
Doing so will create an assign4 directory within your AFS space, and you can
descend into that assign4 directory and code there. There’s a working
sanitycheck for this assignment, and your repo includes a soft link to a
fully functional solution. When in doubt about how something should work, just
run my solution (which can be found at ./samples/stsh_soln) to see what it
does and imitate the solution.
Opening notes
I personally think this is the hardest assignment of the quarter (others might disagree), but it is also my favorite assignment of the quarter. This is how real shells work – you can clone bash’s source code or zsh’s source code and see very similar patterns to what you’re writing in this assignment.
The starter code is a functional shell, which you are asked to extend. It’s worth compiling the starter code and playing with that shell to see what it can currently do. It’s also worth running the sample solution and spending some time with that to see what you’re asked to build. As you’re working on this project, test very often to ensure you haven’t broken anything!
It won’t take you nearly as long to figure out the starter code as it did with
Assignment 2, but you should still expect to spend some time figuring out
pipeline and command and friends.
You should know your way around your own shell before building this one. You should be familiar with the following things:
-
You can create a pipeline of processes with the pipe (|) operator. If you want to see how many files you have whose names contain “std,” you can do this:
ls -l | grep std | wc -l
The output of the first process is piped to the stdin of the second, and the output of the second process is piped to the stdin of the third.
-
You can run a command in the background by adding a trailing ampersand:
sleep 5 &
The shell will immediately print a prompt, not waiting for that command to finish. However, it keeps track of the child processes’ state in order to maintain a job list (which you can see by typing jobs).
-
You can interrupt a running foreground process by pressing ctrl+c. (I’m sure you know this.) Usually (but not always), that kills the process.
-
You can suspend a running foreground process by pressing ctrl+z. This sends SIGTSTP to any foreground processes, temporarily stopping them until they’re sent SIGCONT. To resume the last suspended process, you can type fg in your shell (or bg to continue the process, but relegate it to the background).
Builtin stsh Commands
Your Assignment 4 shell needs to support a collection of builtin commands that should execute without creating any new processes. You might want to run the sample solution and play around to see how to interact with these builtins.
The required builtins are:
- quit, which exits the shell and abandons any jobs that were still running. If there are any extraneous arguments after the quit, just ignore them.
- exit, which does the same thing as quit. Extraneous arguments? Just ignore them.
- fg, which prompts a stopped job to continue in the foreground or brings a running background job into the foreground. fg takes a single job number (e.g. fg 3). If the provided argument isn’t a number, there’s no such job with that number, or the number of arguments isn’t correct, then throw an STSHException around an actionable error message and allow your shell to carry on as if you never typed anything.
- bg, which prompts a stopped job to continue in the background. bg takes a single job number (e.g. bg 3). If the provided argument isn’t a number, there’s no such job with that number, or the number of arguments isn’t correct, then throw an STSHException around an actionable error message and allow your shell to carry on as if you never typed anything.
- slay, which is used to terminate a single process (which may have many sibling processes as part of a larger pipeline). slay takes either one or two numeric arguments. If only one number is provided, it’s assumed to be the pid of the process to be killed. If two numbers are provided, the first number is assumed to be the job number, and the second is assumed to be a process index within the job. So, slay 12345 would terminate (in the SIGKILL sense of terminate) the process with pid 12345. slay 2 0 would terminate the first process in the pipeline making up job 2. slay 13 8 would terminate the 9th process in the pipeline of processes making up job 13. If there are any issues (e.g., the arguments aren’t numbers, or they are numbers but identify some nonexistent process, or the argument count is incorrect), then throw an STSHException around an actionable error message and allow your shell to carry on as if you never typed anything.
- halt, which has nearly the same story as slay, except that its one or two numeric arguments identify a single process that should be halted (but not terminated) using SIGTSTP.
- cont, which has the same story as slay and halt, except that its one or two numeric arguments identify a single process that should continue via SIGCONT. When you prompt a single process to continue, you’re asking that it do so in the background.
- jobs, which prints the job list to the console. If there are any additional arguments, then just ignore them.
quit, exit, and jobs are already implemented for you. You’re responsible
for implementing the others and ensuring the job list is appropriately updated.
Getting Started
Inspect the stsh.cc file we give you. This is the only file you should need
to modify. The core main function you’re provided looks like this:
int main(int argc, char *argv[]) {
  pid_t stshpid = getpid();
  installSignalHandlers();
  rlinit(argc, argv); // configures stsh-readline library so readline works properly
  while (true) {
    string line;
    if (!readline(line)) break;
    if (line.empty()) continue;
    try {
      pipeline p(line);
      bool builtin = handleBuiltin(p);
      if (!builtin) createJob(p); // createJob is initially defined as a wrapper around cout << p;
    } catch (const STSHException& e) {
      cerr << e.what() << endl;
      if (getpid() != stshpid) exit(0); // if exception is thrown from child process, kill it
    }
  }
  return 0;
}
The readline function prompts the user to enter a command, and a pipeline
record is constructed around it. readline and pipeline (which is different
from the pipeline function you implemented for Assignment 2) are implemented
via a suite of files in the stsh-parser subdirectory, and for the most part
you can ignore those implementations. You should, however, be familiar with
the definitions of the command and pipeline types, and they are right here:
const size_t kMaxCommandLength = 32;
const size_t kMaxArguments = 32;
struct command {
  char command[kMaxCommandLength + 1]; // '\0' terminated
  char *tokens[kMaxArguments + 1]; // NULL-terminated array, C strings are all '\0' terminated
};
struct pipeline {
  std::string input;  // the empty string if no input redirection file to first command
  std::string output; // the empty string if no output redirection file from last command
  std::vector<command> commands;
  bool background;
  pipeline(const std::string& str); // constructor that parses an input string
  ~pipeline();
};
Check out what the initial version of stsh is capable of before you add any
new code.
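Since you’ll soon be handing these command records to execvp, here is a minimal sketch of turning one into an argv array. It assumes that tokens holds only the arguments that follow the executable name (check the parser headers to confirm), and buildArgv is a hypothetical helper name that does not exist in the starter code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const size_t kMaxCommandLength = 32;
const size_t kMaxArguments = 32;

// Same layout as the command struct shown above.
struct command {
  char command[kMaxCommandLength + 1];
  char *tokens[kMaxArguments + 1];
};

// Hypothetical helper: fills argv with the executable name, then every token,
// then a terminating NULL. argv must have room for kMaxArguments + 2 entries.
// Returns the number of non-NULL entries written.
size_t buildArgv(command& cmd, char *argv[]) {
  assert(argv != NULL);
  size_t argc = 0;
  argv[argc++] = cmd.command;
  // Assumption: tokens excludes the executable name and is NULL-terminated.
  for (size_t i = 0; i < kMaxArguments && cmd.tokens[i] != NULL; i++) {
    argv[argc++] = cmd.tokens[i];
  }
  argv[argc] = NULL;
  return argc;
}
```

Because the array sizes are compile-time constants, a caller can declare char *argv[kMaxArguments + 2] as a local variable and pass it straight to execvp after calling the helper.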
Milestones
The best approach to implementing anything this complex is to invent a collection of milestones that advance you toward your final goal. Never introduce more than a few lines of code before compiling and confirming that the lines you added do what you expect. I repeat: Never introduce more than a few lines of code before compiling, testing, and confirming that the additional lines do what you expect. View everything you add as a slight perturbation to a working system that slowly evolves into the final product. Try to understand every single line you add, why it’s needed, and why it belongs where you put it.
Here is a sequence of milestones I’d like you to work through in order to get started:
-
Descend into the stsh-parser directory, read through the stsh-readline.h and stsh-parse.h header files for data type definitions and function/method prototypes, type make, and play with the stsh-parse-test to gain a sense of what readline and the pipeline constructor do for you. In general, the readline function is like getline, except that you can use your up and down arrows to scroll through your history of inputs (neat!). The pipeline record defines a bunch of fields that store all of the various commands that chain together to form a pipeline. For example, the text cat < /usr/include/stdio.h | wc > output.txt would be split into two commands (one for the cat and a second for the wc) and populate the vector<command> in the pipeline with information about each of them. The input and output fields would each be nonempty, and the background field would be false.
-
Add code to Stsh::createJob to get a pipeline of just one command (e.g. sleep 5) to run in the foreground until it’s finished. You’ll need to construct an argv array on the stack, copying in the command and tokens from the first command in the pipeline. Rely on a call to waitpid to stall stsh until the foreground job finishes. Ignore the job list, and don’t worry about background jobs, pipelining, or redirection. Don’t worry about programs like emacs just yet. Focus on these executables instead: ls, date, sleep, as their execution is simple and predictable.
Testing suggestion: Try running sleep 3. It should run for 3 seconds, and then the stsh> prompt should reappear after the sleep.
-
Read through stsh-job-list.h, stsh-job.h, and stsh-process.h to learn how to add a new foreground job to the job list, and how to add a process to that job. Add code that does exactly that to the stsh.cc file, right after you successfully fork off a new process. After your waitpid call returns, remove the job from the job list by setting the process’s state to kTerminated and calling STSHJobList::synchronize. If it helps, inline cout << joblist; lines in strategically chosen locations to confirm your new job is being added after fork and being removed after waitpid.
Testing suggestion: Add cout << "Joblist after adding process:" << endl << joblist; after adding the new process to the job list (before waitpid), and add cout << "Joblist before return:" << endl << joblist; to the very end of createJob:
stsh> sleep 3
Joblist after adding process:
[1] 1965986 Running sleep 3
Joblist before return:
stsh>
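The fork/exec/wait sequence behind a single foreground command can be sketched roughly as follows. This is only an illustration: runForeground is a hypothetical name, the job list calls are left out because their exact signatures live in the headers named above, and the exit code 127 for a failed execvp is this sketch’s choice rather than anything the starter code requires.

```cpp
#include <cassert>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs argv[0] with the given arguments, blocks until it
// finishes, and returns the raw status reported by waitpid.
int runForeground(char *const argv[]) {
  assert(argv != NULL && argv[0] != NULL);
  pid_t pid = fork();  // assumed to succeed, as the handout allows
  if (pid == 0) {
    execvp(argv[0], argv);
    // execvp only returns if it failed (e.g. a misspelled executable name).
    perror(argv[0]);
    _exit(127);
  }
  // In stsh, this is roughly where the new job and process would be recorded.
  int status = 0;
  waitpid(pid, &status, 0);
  // ...and this is where the process would be marked as terminated.
  return status;
}
```

The returned status can be inspected with WIFEXITED and WEXITSTATUS, which is a convenient way to confirm that the child really ran to completion before the prompt reappears.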
-
Establish the process group ID of the job to be the PID of the process by investigating the setpgid system call. Every process runs in one process group, and group membership is inherited on fork(), so right now, stsh and all its children are running in the same group. However, it’s conventional to run all of the processes of a job in their own group, separate from the parent shell or any other jobs. This makes job control easier, since you can send signals to an entire group if you want to pause/resume/kill a job, and generally makes it easier to identify groups of processes working together.
After fork(), in both the parent and the child (see Tips and Tidbits below), you should use setpgid to add the child to its own process group (e.g. a child with pid 123 should be added to group 123). Right now, the child will be the only process in the group, but we’ll add more processes later on when you implement pipelines of multiple commands.
Testing suggestion: Open two terminals logged into the same myth machine (you can echo $HOST on one terminal you’re logged into, then ssh to that same host from the other). On one terminal, run ./stsh and run sleep 100 from within stsh. On the other, run the following ps command:
$ ps o pid,ppid,pgid,stat,user,command -u $USER | grep "PGID\|sleep\|stsh" | grep -v grep
    PID    PPID    PGID STAT USER COMMAND
1274737 1274629 1274737 S+   rebs ./stsh
1274738 1274737 1274738 S    rebs sleep 100
Note that stsh and sleep are in different process groups (PGID column), and the PGID of sleep matches its PID.
-
Add the ability to kill or pause a job by pressing ctrl+c or ctrl+z on the keyboard. If the shell receives SIGINT (ctrl+c) or SIGTSTP (ctrl+z) while a foreground job is running, it should forward the signal to the foreground process group. You can send a signal to a group using the killpg(pgid, signal) syscall. Although this will work the same as using kill since there is only one process in the group, it’s important to use killpg for later, when we have jobs with several commands (e.g. cat words.txt | sort | uniq | wc -l).
To do this, you’ll need to block SIGINT and SIGTSTP to prevent the shell from being killed or stopped. You’ll also need to replace your waitpid call from milestone 2. Now, instead of simply waiting for the child to terminate, we need to wait for it to terminate or for SIGINT/SIGTSTP to come in, so that we can forward those signals to the child.
The logic should look something like this:
while the foreground job is running (see STSHJobList::hasForegroundJob):
    wait for SIGINT, SIGTSTP, or SIGCHLD
    if SIGINT or SIGTSTP came in:
        forward the signal using killpg()
    if SIGCHLD came in:
        use waitpid to get the status of the child
        update the joblist with the status
At this point, you should also add extra flags to waitpid so that you can pick up on child processes that stop/continue. You should also be able to handle child processes that terminate unexpectedly (e.g. segfault).
Some important notes:
- You may find STSHJob::getGroupID to be helpful.
- Signal handling configuration is inherited by child processes. You will need to unblock signals (SIG_UNBLOCK) to ensure that child processes can properly handle them.
- When you forward SIGINT or SIGTSTP to a child, you should not assume that those signals will kill/stop the child, since the child could ignore those signals (e.g. vim does not exit when you press ctrl+c). You should only update the job list when waitpid tells you that a child stopped, exited, or continued.
Note: We’ve uploaded a video overviewing some pieces of this milestone here. It may be helpful if you’re having some trouble putting this together.
Testing suggestions:
Try running sleep 5 in your shell, and press ctrl+c. Then run jobs to print the job list. The list should be empty, since sleep terminated. Try it again with ctrl+z. The job list should show sleep as Stopped.
stsh> sleep 5
^Cstsh> jobs
stsh> sleep 5
^Zstsh> jobs
[2] 1279583 Stopped sleep 5
stsh>
If this doesn’t work, think about all the steps involved in the process, and try to find ways to observe what is happening and confirm which steps are happening correctly:
- Is your shell receiving the SIGINT/SIGTSTP? You can check this with some print statements.
- If so, does it seem to be sending the signal to the right place? You can print out the killpg arguments and return value to make sure that call is working.
- If so, does the child process seem to be receiving the signal and handling it using the default behavior? Run the ps command from the previous milestone. If you sent SIGINT, is the child gone from the list, or marked a zombie (Z in the STAT column)? If you sent SIGTSTP, is the child stopped (T in the STAT column)?
- If so, what is your waitpid call doing? Is it returning the right PID? What are you seeing from WIFEXITED/WIFSIGNALED/WIFSTOPPED/WIFCONTINUED?
- If your waitpid call is working fine, is the problem with updating the joblist? Try printing the joblist at different points to see.
Also ensure that your code works for children that do not exit gracefully. You can test this with ./fpe 3, which sleeps for 3 seconds and then crashes due to a floating point exception:
stsh> ./fpe 3
stsh> jobs
stsh>
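The pseudocode in this milestone, combined with the process group setup from the previous one, might translate into something like the sketch below. It is not the reference solution: runAndForward is a hypothetical name, the job list is replaced by simply returning the status, and the function stops waiting as soon as the child terminates or stops.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: launches argv in its own process group, forwards
// SIGINT/SIGTSTP to that group, and returns the waitpid status once the child
// has terminated or stopped.
int runAndForward(char *const argv[]) {
  assert(argv != NULL && argv[0] != NULL);
  sigset_t signals;
  sigemptyset(&signals);
  sigaddset(&signals, SIGINT);
  sigaddset(&signals, SIGTSTP);
  sigaddset(&signals, SIGCHLD);
  // Blocked signals wait in the pending set until sigwait retrieves them.
  sigprocmask(SIG_BLOCK, &signals, NULL);

  pid_t pid = fork();
  if (pid == 0) {
    setpgid(0, 0);                             // child joins a group named after itself
    sigprocmask(SIG_UNBLOCK, &signals, NULL);  // the blocked set is inherited, so undo it
    execvp(argv[0], argv);
    _exit(127);
  }
  setpgid(pid, pid);  // repeated in the parent so the group exists whichever side runs first

  while (true) {
    int sig = 0;
    sigwait(&signals, &sig);
    if (sig == SIGINT || sig == SIGTSTP) {
      killpg(pid, sig);  // forward, but make no assumption about the effect
      continue;
    }
    // SIGCHLD: ask waitpid what actually happened.
    int status = 0;
    pid_t changed = waitpid(pid, &status, WUNTRACED | WCONTINUED | WNOHANG);
    if (changed <= 0 || WIFCONTINUED(status)) continue;
    return status;  // exited, killed by a signal, or stopped
  }
}
```

Note that the signals stay blocked in the caller after the function returns, which matches the milestone’s advice that the shell itself should never be killed or stopped by ctrl+c or ctrl+z.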
-
Make sure that if SIGINT or SIGTSTP comes in while no foreground job is running (e.g. we’re displaying the shell prompt and waiting for user input), nothing happens. The shell should not exit or stop. Also, if you press ctrl+c at the shell prompt and then run sleep 5, sleep should sleep for a full 5 seconds without immediately quitting due to SIGINT. If you have a print statement when calling killpg, that print statement should not appear.
Keep in mind that if SIGINT arrives while it is blocked, it will be added to the pending set and delivered when sigwait is called. If the user presses ctrl+c while the shell prompt is displayed, that will add SIGINT to the pending set, and then SIGINT will be delivered in createJob. We don’t want that to happen. To avoid this, we can clear SIGINT and SIGTSTP from the pending set before starting the sigwait loop:
// Tell the OS we want to completely ignore SIGINT/SIGTSTP. If these were
// already in the pending set, they will be removed.
signal(SIGINT, SIG_IGN);
signal(SIGTSTP, SIG_IGN);
// Allow SIGINT/SIGTSTP to come in again. Assuming these signals are still
// blocked, they will be added to the pending set when they come in, and any
// calls to `sigwait` will retrieve them.
signal(SIGINT, SIG_DFL);
signal(SIGTSTP, SIG_DFL);
Note: We’ve uploaded a video overviewing some pieces of this milestone here. It may be helpful if you’re having some trouble putting this together.
Testing suggestion: Start your shell, press ctrl+c, and then run sleep 3. It should sleep for a full 3 seconds. Separately, try running sleep 3 and pressing ctrl+c while it is running. Make sure it still exits as you expect. Test ctrl+z as well. Try this several times, and in different orders, to make sure your code is robust.
-
Implement the fg builtin, which takes a stopped process (stopped presumably because it was running in the foreground at the time you pressed ctrl+z) and prompts it to continue, or takes a process running in the background and brings it into the foreground. The fg builtin takes a job number, translates that job number to a process group ID, and forwards a SIGCONT on to the process group via a call to killpg(groupID, SIGCONT). Again, right now, process groups consist of just one process, but once you start to support pipelines, you’ll want fg to bring the entire job into the foreground, which killpg can help with. After sending SIGCONT, update the job state to kForeground, and then wait for the job to stop/terminate or for SIGINT/SIGTSTP to come in, same as you did in createJob. Be sure to decompose and avoid copy/paste.
Of course, if the argument passed to fg isn’t a number, or it is but it doesn’t identify a real job, then you should throw an STSHException that’s wrapped around a clear error message saying so. You will find the parseNumber function in stsh-parse-utils to be helpful.
Testing suggestion: Run ./spin 5, press ctrl+z, run jobs to see the job number (in square brackets), run fg jobNum, and make sure that spin runs for another few seconds.
stsh> ./spin 5
^Zstsh> jobs
[1] 2013794 Stopped ./spin 5
stsh> fg 1
<sleeps for 4 more seconds before showing shell prompt....>
stsh> jobs
stsh>
Try pressing ctrl+z and running fg several times.
We recommend testing this with ./spin instead of sleep. When sleep 5 starts running, it calculates the time 5 seconds in the future and sleeps until that time. This means that if you press ctrl+z and then take 5+ seconds to type fg jobNum, sleep may exit immediately instead of sleeping for several more seconds. By contrast, ./spin 5 will spin on the CPU for 5 whole seconds, so if you pause it in the middle, it will still spin for the remaining time when it comes back.
-
Add support for background jobs. The pipeline constructor already searches for trailing &'s and records whether or not the pipeline should be run in the background in the pipeline struct. A background job should be run exactly the same as a foreground job, except you should pass kBackground to joblist.addJob(), and you should not use the sigwait loop to wait for the job to finish. (It’s running in the background, after all.) Also, when a pipeline is started in the background your shell should print out a job summary that’s consistent with the following output:
stsh> sleep 10 | sleep 10 | sleep 10 &
[1] 27684 27685 27686
(There are no supplied functions that construct this output; you’ll have to print the job ID in square brackets and loop over the processes to print the PIDs. Also, you’re only handling a single process right now, so there will only be one PID in the output, but this is what it should look like when you implement multiple processes later on.)
This introduces a complication: if we aren’t waiting for the child process to exit, then when do we update the job list? Well, the only time it’s important for the job list to be updated is when we are printing it, which happens when handling the jobs builtin. An additional complication: we can have multiple background jobs running at the same time, so we might have any number of child processes that we need to update the status of.
We can handle this! Before printing the job list, call waitpid repeatedly to get any status updates for child processes. You can’t call waitpid normally as you did earlier in the assignment, because if all child processes are running and haven’t had any state changes, then waitpid will block and we won’t print out the job list until a child stops/continues/terminates. That’s not good. But if we add the WNOHANG flag to waitpid, then waitpid will check if child processes have had state changes without waiting for them. If some children have had state changes (i.e. we want to update the job list), waitpid will return the PID of one such child; if no children have changed state, it will immediately return 0 without blocking. With this in mind, we can write something like the following:
while true:
    call waitpid with WNOHANG and any other additional flags
    if waitpid returned 0 or -1, there are no more updates, so break out of the loop
    update the child's status in the job list
Be sure to consolidate any redundant code with the waitpid code you previously wrote for createJob and fg. You should only have one waitpid call in all of your code. With background processes involved, it’s crucial that you call waitpid on pid -1 with WNOHANG even in createJob and fg, because otherwise, SIGCHLD signals generated from background processes might cause you to call waitpid when waiting for the foreground job to finish, and if you haven’t done this properly, your shell will block on waitpid and ignore any incoming SIGINT/SIGTSTP signals.
Testing suggestion: Run sleep 5 & twice. The stsh> prompt should print immediately. Run jobs, and you should see the jobs running. (There should be no delays in seeing the jobs output.) Wait 5 seconds, then run jobs again, and the jobs should be gone.
stsh> sleep 5 &
[1] 38070
stsh> sleep 5 &
[2] 38071
stsh> jobs
[1] 38070 Running sleep 5
[2] 38071 Running sleep 5
stsh> jobs
stsh>
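The non-blocking polling loop described in this milestone could be sketched as below. The ChildState enum, the ChildUpdate struct, and the collectChildUpdates function are all hypothetical stand-ins so the sketch compiles on its own; in stsh you would update the real job list instead of filling a vector.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical stand-in for whatever state the real job list tracks.
enum class ChildState { Running, Stopped, Terminated };

struct ChildUpdate {
  pid_t pid;
  ChildState state;
};

// Hypothetical helper: collects every state change that is available right
// now, without ever blocking.
std::vector<ChildUpdate> collectChildUpdates() {
  std::vector<ChildUpdate> updates;
  while (true) {
    int status = 0;
    pid_t pid = waitpid(-1, &status, WNOHANG | WUNTRACED | WCONTINUED);
    if (pid <= 0) break;  // 0: no child has changed state; -1: no children remain
    ChildState state = ChildState::Terminated;  // exited normally or killed by a signal
    if (WIFSTOPPED(status)) {
      state = ChildState::Stopped;
    } else if (WIFCONTINUED(status)) {
      state = ChildState::Running;
    }
    updates.push_back({pid, state});
  }
  return updates;
}
```

Because the loop passes -1 as the pid, one call picks up changes from foreground and background children alike, which is what makes it possible to funnel createJob, fg, and jobs through a single waitpid call site.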
-
Add support for the bg command, which is almost identical to fg but continues a job in the background. You should unify as much code as possible with fg.
-
Add support for slay, halt, and cont, which send SIGKILL, SIGTSTP, and SIGCONT to a single process (as opposed to fg and bg, which send signals to an entire group). Be sure to unify as much code as possible, and be sure to guard against errors in user input.
You do not need to update the job list for these builtins. It’s unnecessary, and keep in mind that sending SIGTSTP isn’t even guaranteed to stop the process. Instead, let the jobs builtin take care of updating the list.
Testing suggestion: Run ./spin 20 (or 30, or 60) and test out each builtin, ensuring it behaves as expected:
stsh> ./spin 20 &
[1] 38295
stsh> halt 1 0
stsh> jobs
[1] 38295 Stopped ./spin 20
stsh> cont 1 0
stsh> jobs
[1] 38295 Running ./spin 20
stsh> slay 1 0
stsh> jobs
stsh>
If the jobs output is not what you expect, that could be a bug in your jobs builtin, or it could be a bug in slay/halt/cont. Try adding print statements, especially around your waitpid calls, to make sure that your jobs builtin is picking up on all state changes. You can also use the ps command from milestone 4 to make sure that the child process states are changing as they’re signaled.
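Since slay, halt, and cont all accept the same one-or-two-number argument shapes, the argument handling can be shared. The sketch below shows one way to do that, under loud assumptions: ProcessTarget, parseWholeNumber, and parseProcessTarget are hypothetical names, std::invalid_argument stands in for STSHException (whose constructor isn’t shown in this handout), and the hand-rolled number check stands in for the provided parseNumber, whose signature you should look up in stsh-parse-utils.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical record of which process a slay/halt/cont command refers to.
struct ProcessTarget {
  bool byPid;    // true: pid is meaningful; false: job and index are meaningful
  size_t pid;
  size_t job;
  size_t index;  // 0-based position within the job's pipeline
};

// Hypothetical stand-in for parseNumber: accepts only strings made entirely of digits.
static size_t parseWholeNumber(const std::string& text) {
  if (text.empty() || text.find_first_not_of("0123456789") != std::string::npos) {
    throw std::invalid_argument("\"" + text + "\" is not a number.");
  }
  return std::stoul(text);
}

// tokens is a NULL-terminated array holding the arguments after the builtin name.
ProcessTarget parseProcessTarget(char *const tokens[]) {
  size_t count = 0;
  while (tokens[count] != NULL) count++;
  if (count == 1) {
    return {true, parseWholeNumber(tokens[0]), 0, 0};
  }
  if (count == 2) {
    return {false, 0, parseWholeNumber(tokens[0]), parseWholeNumber(tokens[1])};
  }
  throw std::invalid_argument("Expected a pid, or a job number and a process index.");
}
```

With the target resolved to a single pid (directly, or by looking up the job and index in the job list), the three builtins differ only in which signal they pass to kill.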
The following are additional milestones you need to hit on your way to a fully
functional stsh. Each of these bullet points represents something larger.
-
Add support for foreground jobs whose leading process requires control of the terminal (e.g. cat, more, emacs, vi, and other executables requiring elaborate control of the console). You should investigate the tcsetpgrp function as a way of transferring terminal control to a process group, and update your solution to call it from the first child process in a pipeline. getpgid may be helpful. Note: you must block, handle, or ignore SIGTTOU in the child process before calling tcsetpgrp. The good news is that we’ve already provided this to you in the starter code, so you don’t have to worry about it. If tcsetpgrp(STDIN_FILENO, pgid) succeeds, then it will return 0. If it fails, it will return -1 and you should throw an STSHException.
After a foreground job falls out of the foreground (e.g. it exits or stops), stsh should take control back so that it can prompt the user for more input.
You’ll also need to update your fg builtin so that stsh gives terminal control to the job before resuming it, and takes control back after it is no longer in the foreground.
Note: if you are trying to run stsh in the cplayground debugger, you should know that tcsetpgrp doesn’t work well there, so you may need to comment out your tcsetpgrp calls to debug anything there.
-
Add support for pipelines consisting of two processes (i.e. binary pipelines, e.g. cat /usr/include/stdio.h | wc). Make sure that the standard output of the first is piped to the standard input of the second, and that each of the two processes is part of the same process group, using a process group ID that is identical to the pid of the leading process. You needn’t do much error checking: You can assume that all system calls succeed, with the exception of execvp, which may fail because of user error (misspelled executable name, file isn’t an executable, lack of permissions, etc.). You might want to include more error checking if it helps you triage bugs, assert the truth of certain expectations during execution, and arrive at a working product more quickly, but do all that because it’s good for you and not because you’re trying to make us happy. (Hint: the conduit user program we dropped in your repo starts to become useful as soon as you deal with nontrivial pipelines. Try typing echo 12345 | ./conduit --delay 1 in the standard shell to see what happens, and try to replicate the behavior in stsh.)
Note that before, createJob and fg only needed to wait for one process to finish, so you could get by with a single waitpid call, but now we have multiple processes. Remember that signals are not queued, so if two child processes finish at the same time, sigwait might only return a single SIGCHLD signal. This means that you must call waitpid in a loop with WNOHANG, the same as you did in milestone 8. (Really, this code should be decomposed into a single function.)
Also, we highly recommend using pipe2 with O_CLOEXEC instead of calling pipe:
int fds[2];
pipe2(fds, O_CLOEXEC);
The O_CLOEXEC flag is short for “close on exec.” When the child processes call execvp, the file descriptors created by pipe2 will be automatically closed. This means you don’t need to worry about closing pipe file descriptors in the child (although closing them isn’t an error). You still need to close the file descriptors in the parent.
You only need to call tcsetpgrp in the first process of the pipeline, although calling it in all of the children isn’t wrong.
Testing suggestions: Try running echo 12345 | ./conduit --delay 1; the characters 1, 2, 3, 4, and 5 should appear, with a one-second delay in between each character. Also try ctrl+c/z and the builtins from earlier milestones to make sure everything is working properly with two processes. In particular, you may want to verify that both children are being added to the same process group.
See the Testing Resources section for recommendations on tools that might help you debug problems here.
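A two-process pipeline built on pipe2 might be wired up roughly like the sketch below. It is deliberately simplified and makes several assumptions: runBinaryPipeline is a hypothetical name, terminal control and the job list are omitted, and it waits on each child with a plain blocking waitpid rather than the sigwait/WNOHANG loop that stsh itself needs.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `first | second` in one process group and returns
// the waitpid status of `second`.
int runBinaryPipeline(char *const first[], char *const second[]) {
  assert(first != NULL && first[0] != NULL);
  assert(second != NULL && second[0] != NULL);
  int fds[2];
  pipe2(fds, O_CLOEXEC);  // fds[0] is the read end, fds[1] is the write end

  pid_t leader = fork();
  if (leader == 0) {
    setpgid(0, 0);                // the leading process names the group
    dup2(fds[1], STDOUT_FILENO);  // the duplicate does not inherit close-on-exec
    execvp(first[0], first);      // the original pipe descriptors close here
    _exit(127);
  }
  setpgid(leader, leader);

  pid_t follower = fork();
  if (follower == 0) {
    setpgid(0, leader);           // join the leader's group
    dup2(fds[0], STDIN_FILENO);
    execvp(second[0], second);
    _exit(127);
  }
  setpgid(follower, leader);

  // The parent must close both ends, or `second` never sees end-of-file.
  close(fds[0]);
  close(fds[1]);

  int status = 0;
  waitpid(leader, NULL, 0);
  waitpid(follower, &status, 0);
  return status;
}
```

The dup2 calls are what make O_CLOEXEC safe here: the descriptors installed as standard input and output survive execvp, while the originals returned by pipe2 disappear automatically in each child.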
-
Once you get your head around pipelines of one and two processes, work on getting arbitrarily long pipeline chains to do the right thing. So, if the user types in echo 12345 | ./conduit --delay 1 | ./conduit | ./conduit, four processes are created, each with their own pid, and all in a process group whose process group id is that of the leading process (in this case, the one running echo). echo's standard out feeds the standard in of the first conduit, whose standard out feeds into the standard in of the second conduit, which pipes its output to the standard input of the last conduit, which at last prints to the console. Be sure to minimize code duplication with the previous milestone.
Note that you’ll need to create multiple pipes here, which will require storing a variable number of file descriptors. There are many ways to do this, and you can do it however you like. Our solution uses a vector<array<int, 2>> to store this (see std::array).
Testing suggestion: See the Testing Resources section for recommendations on tools that might help you debug problems here. Also, you should go over the basic milestones again, testing functionality like ctrl+c/ctrl+z and the builtins, ensuring that everything you implemented works properly with multiple processes.
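Generalizing to any number of stages mostly comes down to bookkeeping: stage i reads from pipe i - 1 and writes to pipe i. The sketch below illustrates that indexing with a vector<array<int, 2>>. As before, runPipeline is a hypothetical name, commands are passed as vectors of strings purely to keep the example short, and blocking waitpid calls stand in for the real sigwait/WNOHANG loop.

```cpp
#include <array>
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: runs commands[0] | commands[1] | ... in one process
// group and returns the waitpid status of the last command.
int runPipeline(const std::vector<std::vector<std::string>>& commands) {
  assert(!commands.empty());
  // One pipe sits between each pair of neighboring commands.
  std::vector<std::array<int, 2>> pipes(commands.size() - 1);
  for (auto& p : pipes) pipe2(p.data(), O_CLOEXEC);

  std::vector<pid_t> pids;
  for (size_t i = 0; i < commands.size(); i++) {
    pid_t pid = fork();
    if (pid == 0) {
      setpgid(0, pids.empty() ? 0 : pids[0]);  // first child leads, the rest join
      if (i > 0) dup2(pipes[i - 1][0], STDIN_FILENO);
      if (i + 1 < commands.size()) dup2(pipes[i][1], STDOUT_FILENO);
      std::vector<char *> argv;
      for (const std::string& word : commands[i]) {
        argv.push_back(const_cast<char *>(word.c_str()));
      }
      argv.push_back(NULL);
      execvp(argv[0], argv.data());  // every O_CLOEXEC pipe descriptor closes here
      _exit(127);
    }
    pids.push_back(pid);
    setpgid(pid, pids[0]);
  }

  // The parent closes every pipe end so readers can reach end-of-file.
  for (auto& p : pipes) {
    close(p[0]);
    close(p[1]);
  }

  int status = 0;
  for (pid_t pid : pids) waitpid(pid, &status, 0);  // the last iteration's status is kept
  return status;
}
```

A single-command pipeline falls out naturally, since the pipes vector is then empty and neither dup2 branch runs.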
-
Finally, add support for input and output redirection via < and > (e.g. cat < /usr/include/stdio.h | wc > output.txt). The names of input and output redirection files are surfaced by the pipeline constructor, and if there is a nonempty string in the input and/or output fields of the pipeline record, that’s your signal that input redirection, output redirection, or both are needed. Any open calls should be made in the child, not the parent. If the file you’re writing to doesn’t exist, create it (O_CREAT), and go with 0644 (with the leading zero) as the octal constant to establish the rw- r-- r-- permission. If the output file you’re redirecting to already exists, then truncate it using the O_TRUNC flag. Note that input redirection always impacts where the leading process draws its input from and that output redirection influences where the caboose process publishes its output. Sometimes those two processes are the same, and sometimes they are different. Type man 2 open for the full skinny on the open system call and a reminder of what flags can be bitwise-OR’ed together for the second argument.
Testing suggestion: You’re done implementing a fully-fledged shell! Try out pipelines of varying lengths that use input/output redirection. Be sure to use the tools in Testing Resources to ensure you don’t have any leaked file descriptors.
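For a single command, the redirection logic could look something like the following sketch. The name runRedirected is hypothetical, the exit code 126 for a failed open is this sketch’s own convention, and the empty-string checks mirror the input and output fields of the pipeline record.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs argv with stdin and/or stdout redirected. An empty
// path means "leave that stream alone". Returns the waitpid status.
int runRedirected(char *const argv[], const std::string& input, const std::string& output) {
  assert(argv != NULL && argv[0] != NULL);
  pid_t pid = fork();
  if (pid == 0) {
    // All open calls happen in the child, as the handout asks.
    if (!input.empty()) {
      int fd = open(input.c_str(), O_RDONLY);
      if (fd == -1) _exit(126);
      dup2(fd, STDIN_FILENO);
      close(fd);
    }
    if (!output.empty()) {
      // Create the file if needed (rw- r-- r--) and discard any old contents.
      int fd = open(output.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
      if (fd == -1) _exit(126);
      dup2(fd, STDOUT_FILENO);
      close(fd);
    }
    execvp(argv[0], argv);
    _exit(127);
  }
  int status = 0;
  waitpid(pid, &status, 0);
  return status;
}
```

In a longer pipeline the same two branches apply, but the input branch belongs to the leading process and the output branch to the last one, so they may end up in different children.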
Shell Driver
Note: You don’t have to understand this section or how to use the shell driver. However, it will be useful if you want to write your own tests. (That’s useful for quickly testing the shell each time you make a change, alerting you if you’ve broken anything unexpected.) The sanitycheck tests are not thorough.
The stsh-driver
program (there’s a copy of it in your repo) executes stsh
as a child process, sends it commands and signals as directed by a trace file,
and allows the shell to print to standard output and error as it normally
would. The stsh
process is driven by the stsh-driver
, which is why we call
stsh-driver
a driver.
Go ahead and type ./stsh-driver -h
to learn how to use it:
$ ./stsh-driver -h
Usage: ./stsh-driver [-hv] -t <trace> -s <shell> [-a <args>]
Options:
-h Print this message
-v Output more information
-t <trace> Trace file
-s <shell> Version of stsh to test
-a <args> Arguments to pass through to stsh implementation
We’ve also provided several trace files that you can feed to the driver to
test your stsh
. If you drill into your repo’s samples
symlink, you’ll
arrive at /usr/class/cs110/samples/assign4
, which includes not only a copy of
my own stsh
solution, but also a directory of shared trace files called
scripts
. Within scripts
, you’ll see simple
, intermediate
, and
advanced
subdirectories, each of which contains one or more trace files you
can use for testing.
Run the shell driver on your own shell using trace file bg-spin.txt
by typing
this:
./stsh-driver -t ./samples/scripts/simple/bg-spin.txt -s ./stsh -a "--suppress-prompt --no-history"
(The -a "--suppress-prompt --no-history" argument tells stsh not to emit a prompt or use the fancy readline history stuff, since those confuse the sanitycheck and autograder scripts.)
Similarly, to compare your results with those generated by my own solution, you
can run the driver on the ./samples/stsh_soln shell by typing:
./stsh-driver -t ./samples/scripts/simple/bg-spin.txt -s ./samples/stsh_soln -a "--suppress-prompt --no-history"
The neat thing about the trace files is that they generate the same output you
would have gotten had you run your shell interactively (except for an initial
comment identifying the output as something generated via stsh-driver
). For
example:
$ ./stsh-driver -t ./samples/scripts/advanced/simple-pipeline-1.txt -s ./samples/stsh_soln -a "--suppress-prompt --no-history"
# Trace: simple-pipeline-1
# ------------------------
# Exercises support for pipes via a foreground pipeline with
# just two processes.
stsh> /bin/echo abc | ./conduit --count 3
aaabbbcccdddeeefffggghhhiiijjj
The process IDs listed as part of a trace’s output will be different from run to run, but otherwise your output should be exactly the same as that generated by my solution.
The trace files can contain regular shell commands as well as a few special extra commands (check out the provided trace files for some examples):
- # - anything starting with # is considered a comment and ignored
- TSTP - sends SIGTSTP to the shell
- INT - sends SIGINT to the shell
- QUIT - sends SIGQUIT to the shell
- KILL - sends SIGKILL to the shell
- SLEEP X - sleeps for X seconds
- WAIT - waits for shell to exit
- CLOSE - closes shell’s stdin
Testing resources
Here is a cplayground with the starter code. This may be helpful for debugging file descriptor wiring, and there’s a chance it might help for debugging signals.
Note: it appears tcsetpgrp
doesn’t play well with gdb as used by
cplayground, so you’ll need to comment out any tcsetpgrp
error checking.
In order to enable the debugger, you’ll need to set a breakpoint somewhere even
if you don’t actually need to step through the code line-by-line. I just set a
breakpoint on the last line of main (return 0;
), then opened the Open Files
tab. Here’s an example output showing a pipeline of 3 processes with output
redirection to a file:
I’ve also written a script that you can run directly on myth. The display isn’t as nice as cplayground, but you can run it on your code without needing to copy anything into cplayground.
Open two terminals logged into the same myth machine. Then:
- Run wget https://reberhardt.com/cs110/inspect-fds.py to download the script
- Run stsh in one terminal, and start a pipeline you’d like to see the file descriptors for
- In the other terminal, run python inspect-fds.py <name or PID of your program> (e.g. python inspect-fds.py stsh)
Here’s the same example (sleep 10 | ./conduit | cat > out.txt
) shown using
inspect-fds.py
:
The stsh process has stdin/out/err pointing at the terminal, with no leaked file descriptors. sleep has stdout going into a pipe that ./conduit is reading from; ./conduit has stdout going into a pipe that cat is reading from; and cat's output is going into out.txt.
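For a quick check from inside your own code, one option is to list the descriptors the current process has open. The sketch below is only an illustration and rests on an assumption: it relies on the Linux-specific /proc/self/fd directory. openDescriptors is a hypothetical helper, not part of the starter code or of inspect-fds.py.

```cpp
#include <cstdlib>     // std::atoi
#include <dirent.h>    // opendir, readdir, closedir, dirfd
#include <set>
#include <unistd.h>

// Hypothetical helper: returns the set of file descriptors currently open in
// this process, as reported by /proc/self/fd (Linux only). Returns an empty
// set if the directory cannot be opened.
std::set<int> openDescriptors() {
  std::set<int> fds;
  DIR *dir = opendir("/proc/self/fd");
  if (dir == nullptr) return fds;
  int self = dirfd(dir);  // the directory stream itself occupies a descriptor
  while (dirent *entry = readdir(dir)) {
    if (entry->d_name[0] == '.') continue;  // skip "." and ".."
    int fd = std::atoi(entry->d_name);
    if (fd != self) fds.insert(fd);         // don't report our own listing fd
  }
  closedir(dir);
  return fds;
}
```

Calling this in a child just before execvp and printing the result to stderr would show whether anything beyond descriptors 0, 1, and 2 is still open at that point.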
Tips and Tidbits
- Chapters 1 and 2 in the Stanford version of the B&O reader (or Chapters 8 and 10 in the full textbook) are good references for the material in this assignment.
- Your implementation should be in C++ unless there’s no way to avoid it. By that, I mean you should use C++ strings unless you interface with a system call that requires C strings, use cout instead of printf, and so forth. Understand that when you redefine where STDOUT_FILENO directs its text, that impacts where printf (which you’re not using) and cout (which you are) place that text.
- We have reasonably low expectations on error checking for this assignment,
but we do have some. I want you to focus on how to leverage system
directives to get a fully functional shell working, but I don’t want you to
examine the return value of every system call when it’s more or less
impossible for them to fail. In general, your error checking should guard
against user error – attempts to invoke a nonexistent executable, providing
out-of-range arguments to command-line built-ins, and so forth. In some
cases – and I point out those cases in this handout – you do need to
check the return value of a system call or two, because sometimes system call
“failure” (the air quotes are intentional) isn’t really a failure at all.
You’ve seen situations where waitpid returns -1 even though everything was fine, and that happens with a few other system calls as well.
- You should match the sample solution’s error messages for anything tested by sanitycheck. You don’t need to match other error messages.
- All unused file descriptors should be closed.
- When creating a pipeline – whether it consists of a single process, two processes, or many processes – you need to ensure that all of the pipeline processes are in the same process group, but in a different process group than the stsh instance is. To support this, you should investigate setpgid to see how all of the sibling processes in a pipeline can be added to a new process group whose id is the same as the pid of the leading process. So, if a pipeline consists of four processes with pids 4004, 4005, 4007, and 4008, they would all be placed in a process group with an ID of 4004. By doing this, you can send a signal to every process in a group using the killpg function, where the first argument is the process group id (e.g. killpg(4004, SIGTSTP)). In order to avoid some race conditions, you should call setpgid in the parent and in each of the children. Why does the parent need to call it? Because it needs to ensure the process group exists before it advances on to add other processes in the pipeline to the same group. Why do child processes need to call it? Because if the child relies on the parent to do it, the child may execvp (and invalidate its own pid as a valid setpgid argument) before the parent gets around to it. Race conditions: deep stuff.
- You do not need to support pipelines or redirection for any of the builtins.
In principle, you should be able to, say, redirect the output of
jobs to a file, or pipe the output to another process, but you don’t need to worry about this for Assignment 4. Our tests will never mix builtins with pipes and redirection.
- Make sure pipelines of multiple commands work even when the processes spawned
on their behalf ignore standard in and standard out, e.g.
sleep 10 | sleep 10 | sleep 10 | sleep 10 > empty-output.txt should run just fine – the entire pipeline lasts for about 10 seconds, and empty-output.txt is created or truncated and ends up being 0 bytes.
- We will be testing your submission against many more traces than the ones we provide, so be sure to test thoroughly to verify your shell is unbreakable!
- You don’t need to guard against the possibility of the child process executing and exiting before the parent has a chance to add it to the job list.
- You don’t need to worry about waiting for all children to exit before exiting
the shell. On exit, it would be a good idea to iterate over the processes in
the job list, kill each of them, and call
waitpid(pid, NULL, 0)
on them to ensure they are properly cleaned up, but we aren’t requiring you to do this.
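The process group advice in the tips above can be sketched as follows. This is a minimal illustration under stated assumptions, not the sample solution: spawnGroup is a hypothetical helper, each child simply runs sleep 30 instead of a real pipeline command, and no pipes are wired up. The point is only that setpgid is called on both sides of the fork.

```cpp
#include <cstddef>      // std::size_t
#include <signal.h>     // killpg (used by callers of this helper)
#include <sys/wait.h>   // waitpid (used by callers of this helper)
#include <unistd.h>     // fork, setpgid, execvp, _exit
#include <vector>

// Hypothetical helper: forks `count` children and places all of them in a new
// process group whose id is the pid of the first child. Returns the pids.
std::vector<pid_t> spawnGroup(std::size_t count) {
  std::vector<pid_t> pids;
  pid_t pgid = 0;  // 0 tells setpgid "use the target's own pid as the group id"
  for (std::size_t i = 0; i < count; i++) {
    pid_t pid = fork();
    if (pid == -1) break;
    if (pid == 0) {
      // Child: join the group (or create it, if this is the leader) before
      // execvp, in case the parent has not gotten around to it yet.
      setpgid(0, pgid);
      char *const argv[] = {const_cast<char *>("sleep"),
                            const_cast<char *>("30"), nullptr};
      execvp(argv[0], argv);
      _exit(127);  // only reached if execvp failed
    }
    if (pgid == 0) pgid = pid;  // the first child leads the group
    // Parent: make the same call so the group is guaranteed to exist before
    // the next child is forked. If the child has already called execvp, this
    // fails, which is harmless because the child made the call itself first.
    setpgid(pid, pgid);
    pids.push_back(pid);
  }
  return pids;
}
```

With the group in place, a single killpg(pids[0], SIGTSTP) or killpg(pids[0], SIGKILL) reaches every process in the pipeline without touching the shell itself.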
Submitting your work
Once you’re done, you should test all of your work as you normally would and
then run the infamous submissions script by typing ./tools/submit
.