Lab 1: Filesystems and System Calls
These questions were written by Jerry Cain and Ryan Eberhardt.
Before the end of lab, be sure to fill out the lab checkoff sheet here!
Getting started
Before starting, go ahead and clone the lab1
folder, which is necessary for
problems 3 and 4:
$ git clone /usr/class/cs110/repos/lab1/shared lab1
$ cd lab1
$ make
Problem 1: Direct, Singly Indirect, and Doubly Indirect Block Numbers
Assume blocks are 512 bytes in size, block numbers are four-byte int
s,
and that inodes include space for 6 block numbers. The first three contain
direct block numbers, the next two contain singly indirect block numbers, and
the final one contains a doubly indirect block number. (Note: This is
different from the scheme we’ve discussed in class.)
- What’s the maximum file size?
- How large does a file need to be before the relevant inode requires the first singly indirect block number be used?
- How large does a file need to be before the relevant inode requires the first doubly indirect block number be used?
- Draw as detailed an inode as you can if it’s to represent a regular file that’s 2049 bytes in size.
Problem 2: Filesystems short-answer questions
- Consider the Unix V6 filesystem, which uses 512-byte blocks and 32-byte inodes. Which sector should we read in order to get inode 256? If that sector is an array of inodes, which index should we go to in order to get inode 256? What about inode 345?
- Unlike the Unix V6 filesystem we learned about in class, the ext2/3/4 family
of filesystems (commonly used on Linux) use variable-length directory
entries:
struct ext3_dir_entry { uint32_t inode_number; uint16_t name_length; uint16_t file_type; char[] file_name; };
What is the benefit to designing directory entries this way? What are some drawbacks? (Consider what happens when deleting files.)
- In lecture, we described how soft/symbolic links are just another kind of file, with the link destination stored as file contents. However, this can make accessing files via symbolic links quite slow, since we need to find the symlink inode, read the file contents, find the destination file inode, and read its contents. What optimization might make symbolic links faster when the target file path is short?
Problem 3: Debugging practice
In this problem, we’ll look at a program called search
, which works similar
to the built in find
command that you may or may not have used in the past.
search
takes two arguments: a directory to search in, and a file to search
for. For example, we can search for hi
in the current directory, and search
will recurse through all child directories, printing any matching files:
🍉 ./search . hi.txt
./samples/a/b/c/hi.txt
./samples/hi.txt
./samples/subdir/hi.txt
./samples/subdir2/hi.txt
We have implemented search
for you in the lab1
directory. However, there
are three bugs that prevent the implementation from working.
First, take a few minutes to read through the code and make sure you understand
what it’s trying to do. The program works by recursing through subdirectories
using the listMatches
function. Here are some example recursive calls:
listMatches(".", 1, "hi.txt")
listMatches("./samples", 9, "hi.txt")
listMatches("./samples/a", 11, "hi.txt")
listMatches("./samples/a/b", 13, "hi.txt")
listMatches("./samples/a/b/c", 15, "hi.txt")
listMatches("./samples/subdir", 16, "hi.txt")
listMatches("./samples/subdir2", 17, "hi.txt")
There are a few new functions used in this code, such as stat
(which gets
information from a file’s inode), opendir
(which opens a directory for
reading), and readdir
(which reads a directory entry from an open directory).
You won’t need these functions for the rest of this class, so don’t spend much
time worrying about them; you just need to understand what they are being used
for.
Once you have a feel for how the code is supposed to work, try using print statements and/or gdb to follow its execution and get an idea of what it is actually doing. Again, there are three bugs you’ll need to fix in order to make it work!
To be clear, we don’t want you to read the code and find what’s wrong just by looking at it. Sometimes you can find bugs that way, but many times it’s more productive to debug by building a mental model of how the program is misbehaving, and then use that information to figure out where the bug is.
Problem 4: valgrind
and orphaned file descriptors
This problem may be useful in your assignments to show you how to ensure you aren’t leaking file descriptors.
Here’s a very short exercise to enhance your understanding of valgrind
and
what it can do for you. Start by opening open-fds.cc
in the lab1 directory,
and trace through the code to keep tabs on what file descriptors are created,
properly closed, and orphaned. Then run valgrind ./open-fds
to confirm that
there aren’t any memory leaks or errors (how could there be?), but then run
valgrind --track-fds=yes ./open-fds
to get information about the file
descriptors that were (intentionally) left open. Without changing the logic of
the program, insert as many close statements as necessary so that all file
descriptors (including 0, 1, and 2) are properly donated back. (In practice,
you do not have to close file descriptors 0, 1, and 2.)