Lecture 17

Note: Reading these lecture notes is not a substitute for watching the lecture. I frequently go off script, and you are responsible for understanding everything I talk about in lecture unless I specify otherwise.

Systems classes following CS 110

Principles of Computer Systems

You may already feel familiar with a lot of these concepts, but I want to put them in concrete terms and talk about how they have shown up in our discussions throughout the quarter. We often take these ideas for granted, but it's worth pausing to examine them explicitly.

Abstraction

Abstraction is about defining interfaces and focusing on what a component does rather than how it does it. We can define interfaces and use them without needing to know how everything works under the hood, and we can support multiple implementations that follow the same interface.
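As a small, hypothetical sketch (not from the lecture), here is one way to express "one interface, many implementations" in C++ with an abstract base class; the names are invented for illustration:

```cpp
#include <iostream>
#include <string>

// The interface: callers only care about what a Sink does, not how it does it.
class Sink {
public:
    virtual ~Sink() = default;
    virtual void writeLine(const std::string& line) = 0;
};

// One implementation: print to the terminal.
class ConsoleSink : public Sink {
public:
    void writeLine(const std::string& line) override {
        std::cout << line << std::endl;
    }
};

// Another implementation: silently discard everything.
class NullSink : public Sink {
public:
    void writeLine(const std::string&) override {}
};

// Client code depends only on the interface, so either implementation works.
void logStartup(Sink& sink) {
    sink.writeLine("starting up...");
}

int main() {
    ConsoleSink console;
    NullSink quiet;
    logStartup(console); // prints a line
    logStartup(quiet);   // prints nothing
    return 0;
}
```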

For example: Do you actually know how writing to stdout causes characters to appear on your terminal window? Probably not, but we have defined an interface where you can write to file descriptor 1 and those bytes will be printed to your terminal. You can use this abstraction without even understanding how it works; furthermore, different operating systems implement things differently, but we can use the same interface regardless of what operating system we’re using.
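As a minimal sketch (assuming a POSIX system), the interface is just "write these bytes to file descriptor 1"; everything beneath that is hidden from us:

```cpp
#include <unistd.h>   // write, STDOUT_FILENO
#include <cstring>    // strlen

int main() {
    // We rely only on the interface: send these bytes to file descriptor 1.
    // How the kernel and terminal emulator turn them into pixels on screen
    // is an implementation detail we never have to see.
    const char* message = "Hello, terminal!\n";
    write(STDOUT_FILENO, message, strlen(message));
    return 0;
}
```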

Other abstractions we’ve used in this class:

Modularity and Layering

Modularity: as soon as code starts getting complicated, we break it into smaller, manageable pieces, each with a clear set of responsibilities.

Layering is a special form of modularity in which we stack modules on top of each other, with each layer built using only the layer below it.

You’ve seen layering since your CS 106 days. For example, a stack is a data structure layered on top of a vector, and a vector is a data structure layered on top of an array.
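As an illustrative sketch (this is not how the actual STL is implemented), a stack can be layered directly on top of a vector, with every stack operation delegating to the layer below:

```cpp
#include <vector>
#include <cassert>

// A stack layered on top of a vector: each operation is implemented
// in terms of the layer below.
template <typename T>
class Stack {
public:
    void push(const T& value) { elems.push_back(value); }
    void pop() { elems.pop_back(); }
    const T& top() const { return elems.back(); }
    bool empty() const { return elems.empty(); }

private:
    std::vector<T> elems; // the lower layer (itself layered on an array)
};

int main() {
    Stack<int> s;
    s.push(1);
    s.push(2);
    assert(s.top() == 2);
    s.pop();
    assert(s.top() == 1);
    return 0;
}
```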

Some layering we have seen in this class:

Naming and name resolution

We need names to refer to system resources. (How else would you address a process or an open file? That's what pids and file descriptors are for.) We also need name resolution systems to convert human-friendly names into machine-friendly ones, the way DNS converts host names into IP addresses.
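As a hedged sketch (assuming a POSIX system; the host name is just an example), the standard getaddrinfo call resolves a human-friendly host name into machine-friendly addresses:

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <cstdio>
#include <cstring>

int main() {
    // Resolve a human-friendly name into machine-friendly IPv4 addresses.
    struct addrinfo hints, *results;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        // IPv4 only, to keep the example short
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo("www.example.com", "http", &hints, &results) != 0) {
        fprintf(stderr, "name resolution failed\n");
        return 1;
    }

    for (struct addrinfo* p = results; p != nullptr; p = p->ai_next) {
        char buf[INET_ADDRSTRLEN];
        struct sockaddr_in* addr = (struct sockaddr_in*) p->ai_addr;
        inet_ntop(AF_INET, &addr->sin_addr, buf, sizeof(buf));
        printf("www.example.com -> %s\n", buf);
    }

    freeaddrinfo(results);
    return 0;
}
```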

Caching

A cache is a component – sometimes implemented in hardware, sometimes in software – that stores data so that future requests can be handled more quickly.

We see caching all over in the storage hierarchy:

There are also TLB caches, DNS caches, and web caches (like the one you built in the proxy assignment).
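The core idea is the same everywhere: check the cache before doing the expensive work, and remember the result for next time. Here is a toy, hypothetical sketch of a software cache (the function names are invented for illustration):

```cpp
#include <map>
#include <string>
#include <iostream>

// A toy software cache: remember the results of an expensive lookup so
// repeated requests for the same key are served quickly from memory.
static std::map<std::string, std::string> cache;

std::string expensiveLookup(const std::string& key) {
    // Stand-in for a slow operation (disk read, network fetch, etc.).
    return "value-for-" + key;
}

std::string cachedLookup(const std::string& key) {
    auto it = cache.find(key);
    if (it != cache.end()) return it->second;  // cache hit: fast path
    std::string value = expensiveLookup(key);  // cache miss: do the slow work
    cache[key] = value;                        // remember it for next time
    return value;
}

int main() {
    std::cout << cachedLookup("index.html") << std::endl; // miss
    std::cout << cachedLookup("index.html") << std::endl; // hit
    return 0;
}
```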

Virtualization

Virtualization is about making many hardware resources look like one, or making one hardware resource look like many.

Making many hardware resources look like one:

Making one hardware resource look like many:
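As a small sketch of the second kind, virtual memory makes one physical memory look like many private address spaces. Assuming a POSIX system, after a fork both processes see a variable at the same virtual address, yet each has its own independent copy:

```cpp
#include <unistd.h>    // fork
#include <sys/wait.h>  // waitpid
#include <cstdio>

int main() {
    // Parent and child print the same virtual address for `value`,
    // but the child's write does not affect the parent's copy: each
    // process gets its own private view of memory.
    int value = 1;
    pid_t pid = fork();
    if (pid == 0) {
        value = 42; // changes only the child's copy
        printf("child:  &value = %p, value = %d\n", (void*) &value, value);
        return 0;
    }
    waitpid(pid, nullptr, 0);
    printf("parent: &value = %p, value = %d\n", (void*) &value, value);
    return 0;
}
```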

Concurrency

Concurrency is about multiple threads or processes making progress at the same time. We've seen concurrency even across clusters of machines in MapReduce, and even signal and interrupt handlers are a form of concurrency. Some programming languages (e.g. Erlang) are designed so thoroughly around message-passing concurrency that shared-memory data races are impossible.
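As a small, hypothetical sketch (not from the lecture) of why synchronization matters when threads share state, two threads increment a shared counter, and a mutex keeps the increments from interleaving and losing updates:

```cpp
#include <thread>
#include <mutex>
#include <iostream>

int main() {
    int counter = 0;
    std::mutex counterLock;

    // Without the lock, the read-modify-write inside counter++ could
    // interleave across threads and lose updates (a race condition).
    auto work = [&] {
        for (int i = 0; i < 100000; i++) {
            std::lock_guard<std::mutex> lg(counterLock);
            counter++;
        }
    };

    std::thread t1(work), t2(work);
    t1.join();
    t2.join();
    std::cout << "counter = " << counter << std::endl; // always 200000
    return 0;
}
```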

Client-server request-response

Request/response is a good way to organize functionality into modules with a clear set of responsibilities. You were already familiar with this pattern from functions and libraries of functions: a caller issues a request (a function call) and gets back a response (a return value). In this class, we've seen the same pattern extended to system calls, to multiprocessing, and to network requests.
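As a hedged sketch (assuming a POSIX system; example.com is just a placeholder host), a client sends one request over a TCP connection and reads back the server's response:

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
    // Resolve the server's name, connect, send one request, read one response.
    struct addrinfo hints, *results;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("example.com", "http", &hints, &results) != 0) return 1;

    int fd = socket(results->ai_family, results->ai_socktype, results->ai_protocol);
    if (fd < 0 || connect(fd, results->ai_addr, results->ai_addrlen) < 0) return 1;
    freeaddrinfo(results);

    // The request: the client states what it wants...
    const char* request = "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n";
    write(fd, request, strlen(request));

    // ...and the response comes back over the same connection.
    char buf[1024];
    ssize_t count;
    while ((count = read(fd, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, count, stdout);
    }
    close(fd);
    return 0;
}
```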