Lecture 8: execvp and pipes

Note: Reading these lecture notes is not a substitute for watching the lecture. I frequently go off script, and you are responsible for understanding everything I talk about in lecture unless I specify otherwise.

Lifecycle of a process

Job control

Processes start, cycle on and off the CPU, and eventually terminate, but they can also be paused at arbitrary points by job control signals. This is useful in a variety of circumstances: for example, maybe you’re running a long, CPU-intensive program, and you want to pause it so you can run a short CPU-intensive program in the meantime. Or, as another example, macOS will send “pause” signals to programs when it starts running out of (physical) memory, prompting you to close some apps before resuming them. Job control is sometimes even used programmatically to synchronize between processes; for example, Process A will pause itself to wait for Process B to catch up, and then Process B will signal Process A to continue when it’s ready.

We’ll talk more about signals next week, so don’t worry much about the details of how this works, but in summary, you can send SIGSTOP to pause a process and SIGCONT to continue the process.

On the command line:

# Pause PID 1234
kill -STOP 1234
# Resume PID 1234
kill -CONT 1234

Or, programmatically:

// Pause PID 1234
kill(1234, SIGSTOP);
// Resume PID 1234
kill(1234, SIGCONT);

Job control and waitpid

waitpid can also be used to observe when a program changes job control states (e.g. stops or continues due to SIGSTOP or SIGCONT). This is accomplished through the third flags parameter:

Specifying WUNTRACED will cause waitpid to return information about processes that have terminated or stopped. (It’s not a great name in my opinion, but has some historical legacy behind it.)
Specifying WCONTINUED will cause waitpid to return information about processes that have terminated or continued.
Specifying WUNTRACED | WCONTINUED will cause waitpid to return information about any state change: it will return when a process stops, continues, or terminates/exits.

Using job control for synchronization

Job control signals are handy when two processes need to synchronize at multiple points throughout their execution. A child process will commonly self-halt by sending itself SIGSTOP to wait for the parent to reach the synchronization point, and the parent will call waitpid with WUNTRACED to wait for the child to self-halt. Then, the parent can send SIGCONT to wake up the child, and both can continue on their way.

Here’s an example of this pattern at work with a single synchronization point:

Here’s an example that uses this pattern multiple times to go back-and-forth between the parent and child:

Loading executables: execvp

The execvp system call is used to start running an executable from a binary on disk. It completely wipes the virtual address space of the process that calls it, replacing all the segments with new segments from the executable file.

int execvp(const char *path, char *const argv[]);

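path is the name of the executable we want to run (if it contains no slash, execvp searches for it in the directories listed in the PATH environment variable), and argv is a NULL-terminated array of strings (which ends up being the argv passed to main in the new executable!).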

If execvp succeeds, it never returns: the old program gets cannibalized, so there’s nothing to return to. It only returns (with -1) if it couldn’t find the specified executable to run, or if there was some other error (e.g. a permissions error).

Implementing `system`

int system(const char *command) is a standard library function that runs a command and returns the status code of the command that ran. We’re going to implement it (named mysystem so as to not conflict with the stdlib definition) in the context of implementing a super basic shell.

Here is an implementation of the shell using the standard library system function:

Note: This is an example of a read-eval-print loop (REPL). That’s a term you may see crop up in various places in software engineering – now you know what it means!

We want to implement our own version of the system function, which we will call mysystem. Here’s a full implementation:

int mysystem(char *command) {
    pid_t pid = fork();
    if (pid == 0) {
        char *argv[] = {command, NULL};
        execvp(argv[0], argv);
        // If we get here, there was an error
        printf("Command not found: %s\n", command);
        exit(0);    // DANGER: what if we did "return 0" here instead?
    }

    // At this point, we want to wait for the child process to exit, and get
    // its return code. (The child process *is* the executable we ran)
    int status;
    waitpid(pid, &status, 0);
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    } else {
        return -1;
    }
}

It turns out this doesn’t work for commands like make clean, because execvp tries to find a binary called make clean (including the space) when, in reality, we want to find a binary called make and pass it an argument clean. As a sort of ugly fix, we can invoke /bin/sh to do the tokenization for us:

char *argv[] = {"/bin/sh", "-c", command, NULL};

This isn’t a great solution; our shell is invoking another shell to finally invoke the program we want. This is how the real system works, though. You’ll implement a much more robust and efficient shell in Assignment 4.

File descriptor/file entry/vnode tables revisited

Refresh your memory on the three tables involved with managing file descriptors; you can find a reference here.

In addition to supporting regular files, Unix has a lot of “virtual files”: resources that are not really files, but which we make look like files in order to use file-related abstractions. For example, there are standard in/out/error files linked to your terminal; if you write to the stdout file, the text appears on your screen. File descriptors 0, 1, and 2 point to stdin, stdout, and stderr files, respectively.

The file descriptor table is cloned on fork and preserved across execvp boundaries. (On fork, reference counts in the file entry table are incremented, since the child’s descriptors point to the same file entries as the parent’s.) Consequently, on fork, a child process inherits the stdout linked to the terminal, and if it calls execvp, the new executable can still write to the terminal.

Pipes

Another type of “virtual file” is a pipe. Pipes are one of several mechanisms for interprocess communication that we’ll study this quarter. They allow processes to exchange free-form data during process execution.

The pipe syscall creates two new file descriptors that are linked to each other: a read end and a write end. Anything you write to the write end can be read back from the read end, almost as if the two file descriptors were linked like a cup-and-string phone (though data only flows in one direction). This is a minimal example of pipes at work:

Because the file descriptor table is cloned on fork, both the parent and the child end up with descriptors for the same pipe, so we can have one process write, and have the other process read:

Here, we’ve hardcoded for the child to read exactly 6 bytes, but what if we wanted the parent to send the child a message, and the child doesn’t know in advance how big the message is?

To do this, it is very common for the child to read in a while (true) loop until all of the data has been read. We rely on two important properties of read:

If you try reading from an empty pipe, read will block (i.e. take you off the CPU) until at least one byte is available in the pipe.
If the pipe is empty and all write-oriented file descriptors into the pipe have been closed, then it is impossible for any process to add more data into the pipe. If you were reading from an actual file on disk, this would be the same as having read all the bytes from the file, and now being at the end of file (EOF). In this situation, read returns 0 without blocking to indicate “hey, I didn’t read any bytes, and I also didn’t wait for any new bytes to come in because you’ve reached the end of the ‘file’.”

Implementing `subprocess`

Similar to system, subprocess launches an executable in a child process. However, instead of waiting for the child process to exit, it returns immediately, so that the parent can communicate with the child while it runs. It returns the PID of the child, as well as a file descriptor; if the parent writes to this file descriptor, the child will be able to read that data by reading from stdin.

typedef struct {
    pid_t pid;
    int infd;
} subprocess_t;

subprocess_t subprocess(const char *command);

// Example demonstrating how to use subprocess. We start "sort," feed it 4
// words (by writing to the input file descriptor, which is wired to stdin of
// "sort"), then close the input file descriptor (equivalent to pressing ctrl+D
// on the keyboard -- tells "sort" we're done feeding it words).
int main(int argc, char *argv[]) {
    subprocess_t sp = subprocess("/usr/bin/sort");
    const char *words[] = {"pen", "pineapple", "apple", "pen"};
    for (size_t i = 0; i < sizeof(words) / sizeof(words[0]); i++) {
        dprintf(sp.infd, "%s\n", words[i]);
    }
    close(sp.infd);

    int status;
    pid_t pid = waitpid(sp.pid, &status, 0);
    return pid == sp.pid && WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

subprocess_t subprocess(const char *command) {
    int fds[2];
    pipe(fds);
    subprocess_t process = { fork(), fds[1] };
    if (process.pid == 0) {
        close(fds[1]);  // The child isn't writing to the pipe, so we can close
                        // the write end
        dup2(fds[0], STDIN_FILENO); // Rewires fd 0 to point to the read end of
                        // the pipe. The read end of the pipe now has 2 file
                        // descriptors pointing to it.
        close(fds[0]);  // Now that STDIN_FILENO points to the read end of the
                        // pipe, we can close this extra file descriptor.

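        // Start the target executable:
        char *argv[] = {"/bin/sh", "-c", (char *) command, NULL};
        execvp(argv[0], argv);
        // If we get here, execvp failed. Exit so the child does not fall
        // through and keep running the parent's code.
        exit(1);
    }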

    // The parent isn't reading from the pipe, so we can close the read end:
    close(fds[0]);

    return process;
}

CS 110