Ownership (continued) and Error Handling

Congratulations on making it through the first week of spring quarter! We hope you are doing well, no matter where you are.

Ownership in C

In the survey results, many people indicated feeling confused about Rust’s ownership model.

I would like to make the claim that Rust code actually does the same thing as good C code! If you read production C code, you’ll find notions of ownership embedded in the comments.

Here’s some code from Open vSwitch (software for networking switches):

/* Get status of the virtual port (ex. tunnel, patch).
 *
 * Returns '0' if 'port' is not a virtual port or has no errors.
 * Otherwise, stores the error string in '*errp' and returns positive errno
 * value. The caller is responsible for freeing '*errp' (with free()).
 *
 * This function may be a null pointer if the ofproto implementation does
 * not support any virtual ports or their states.
 */
int (*vport_get_status)(const struct ofport *port, char **errp);

This comment indicates that this function is allocating memory and returning “ownership” of that memory to the function that called vport_get_status. As the caller ends up with ownership, it is responsible for freeing the memory.

Here’s another example from ffmpeg, a popular media transcoding library:

/**
 * @note Any old dictionary present is discarded and replaced with a copy of the new one. The
 * caller still owns val is and responsible for freeing it.
 */
int av_opt_set_dict_val(void *obj, const char *name, const AVDictionary *val, int search_flags);

This example shows a sort of borrow in C. The caller owns an AVDictionary and passes it to av_opt_set_dict_val, which uses it to set an option in obj, but the caller maintains ownership and is still responsible for freeing the memory.

We also see ownership being transferred in more complicated ways. For example, see this function from the Linux kernel:

/**
 * iscsi_boot_create_target() - create boot target sysfs dir
 * @boot_kset: boot kset
 * @index: the target id
 * @data: driver specific data for target
 * @show: attr show function
 * @is_visible: attr visibility function
 * @release: release function
 *
 * Note: The boot sysfs lib will free the data passed in for the caller
 * when all refs to the target kobject have been released.
 */
struct iscsi_boot_kobj *
iscsi_boot_create_target(struct iscsi_boot_kset *boot_kset, int index,
			 void *data,
			 ssize_t (*show) (void *data, int type, char *buf),
			 umode_t (*is_visible) (void *data, int type),
			 void (*release) (void *data))
{
	return iscsi_boot_create_kobj(boot_kset, &iscsi_boot_target_attr_group,
				      "target%d", index, data, show, is_visible,
				      release);
}
EXPORT_SYMBOL_GPL(iscsi_boot_create_target);

Here, the caller passes ownership of some data to iscsi_boot_create_target; in doing so, the caller is no longer expected to free that memory. The iscsi_boot_create_target function then passes ownership to the sysfs library through a chain of other functions not shown here, so that the memory can be freed at the appropriate time.
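To make the comparison concrete, here is a minimal sketch of how the three conventions from the comments above (returning ownership, borrowing, and handing ownership away) might look as Rust function signatures. The function names and types are made up for illustration and have nothing to do with the real C projects; the point is only that the signature carries the information that the C comments had to spell out.

```rust
// All names here are hypothetical; they only mirror the ownership
// conventions described in the C comments above.

/// Returns ownership: the caller receives the String and is responsible for
/// it, similar to '*errp' in vport_get_status.
fn make_error_message(code: i32) -> String {
    format!("port error {}", code)
}

/// Borrows: the caller keeps ownership of `options`, similar to `val` in
/// av_opt_set_dict_val.
fn count_options(options: &[String]) -> usize {
    options.len()
}

/// Takes ownership: `data` is moved into this function, so the caller can no
/// longer use it, similar to `data` in iscsi_boot_create_target.
fn consume_data(data: Vec<u8>) -> usize {
    data.len()
}

fn main() {
    let message = make_error_message(5);
    println!("{}", message);
    // `message` is freed automatically when main ends.

    let options = vec![String::from("fast"), String::from("quiet")];
    println!("{} options", count_options(&options));
    println!("still ours: {:?}", options);

    let data = vec![1u8, 2, 3];
    println!("{} bytes consumed", consume_data(data));
    // println!("{:?}", data); // would not compile: `data` was moved
}
```

If you uncomment the last line, the compiler rejects the program, which is the check that the C versions leave to the reader of the comment.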

Sometimes, ownership gets complicated. Many sufficiently complex codebases have their own free functions that do additional cleanup on resources; some even use their own custom memory allocators. In this example (also from Open vSwitch), port_query_by_name initializes memory and returns ownership of that memory to the caller, but the memory can’t be freed by calling free(); instead, ofproto_port_destroy() must be used.

/* Looks up a port named 'devname' in 'ofproto'.  On success, returns 0 and
 * initializes '*port' appropriately. Otherwise, returns a positive errno
 * value.
 *
 * The caller owns the data in 'port' and must free it with
 * ofproto_port_destroy() when it is no longer needed. */
int (*port_query_by_name)(const struct ofproto *ofproto,
                          const char *devname, struct ofproto_port *port);
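For comparison, here is a rough sketch of how Rust might express "this value needs its own cleanup function." It uses the standard Drop trait, which these notes have not covered yet, so treat it as a preview. The Port type, query_port function, and the counter are all hypothetical and are not a translation of the Open vSwitch code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter so we can observe that cleanup ran.
static PORTS_DESTROYED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical resource that needs custom cleanup.
struct Port {
    name: String,
}

impl Drop for Port {
    // Runs automatically when a Port goes out of scope; the caller cannot
    // forget it or pick the wrong cleanup function.
    fn drop(&mut self) {
        println!("cleaning up port {}", self.name);
        PORTS_DESTROYED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Returns ownership of a new Port to the caller.
fn query_port(name: &str) -> Port {
    Port { name: name.to_string() }
}

fn main() {
    {
        let port = query_port("eth0");
        println!("using port {}", port.name);
    } // `port` goes out of scope here, so drop() runs

    println!("ports destroyed: {}", PORTS_DESTROYED.load(Ordering::SeqCst));
}
```

The design choice here is that the cleanup logic lives with the type rather than in a comment, so whoever ends up owning the value gets the right cleanup for free.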

Ownership is extra complicated when structs are involved. It’s common to see cases where one function is responsible for freeing a struct, but a different function is responsible for freeing buffers that struct pointed to. For example, this code from Linux:

/**
 * dvb_unregister_frontend() - Unregisters a DVB frontend
 *
 * @fe: pointer to &struct dvb_frontend
 *
 * Stops the frontend kthread, calls dvb_unregister_device() and frees the
 * private frontend data allocated by dvb_register_frontend().
 *
 * NOTE: This function doesn't frees the memory allocated by the demod,
 * by the SEC driver and by the tuner. In order to free it, an explicit call to
 * dvb_frontend_detach() is needed, after calling this function.
 */
int dvb_unregister_frontend(struct dvb_frontend *fe);

Or, this code from Miller (a command-line data processing utility):

static void mapper_count_similar_free(mapper_t* pmapper, context_t* _) {
	mapper_count_similar_state_t* pstate = pmapper->pvstate;
	slls_free(pstate->pgroup_by_field_names);

	// lhmslv_free will free the keys: we only need to free the void-star values.
	for (lhmslve_t* pa = pstate->pcounts_by_group->phead; pa != NULL; pa = pa->pnext) {
		unsigned long long* pcount = pa->pvvalue;
		free(pcount);
	}
	lhmslv_free(pstate->pcounts_by_group);

    ...
}

Rust is challenging to learn not because the language itself is challenging, but because ownership (as a universal concept) is challenging, and because the Rust compiler forces you to figure out all possible ownership issues before it allows you to run your code. It’s not enough to have figured out which function will free some buffer if you also have a struct that maintains a pointer to the buffer and outlives the buffer; what if someone accidentally tries to reference the buffer using that pointer after the buffer has been freed? The Rust compiler forces you to think through all these edge cases up front.

In C, the most confusing ownership-related problems happen when there aren’t even any comments to explain what is happening. Recently, I was working on developing a feature for C Playground that generates diagrams of open file tables as you execute code. In building this, I had to implement an extension to the Linux kernel. However, when I deployed the new code and people started using it, I encountered a problem where the server hosting C Playground would occasionally (and randomly) completely halt – I couldn’t even SSH in. After a week of debugging, I discovered this was an ownership problem. I was looping over processes in a linked list, and I was properly taking ownership of each process struct while I was using it, but I hadn’t realized that I also needed to call a special function to take ownership of the entire linked list. Because I didn’t call that function, I would (very rarely, but sometimes) hit an issue where I would try to get the next process in the linked list, but the linked list had been deallocated by a different function, and I ended up reading garbage memory. Since this was kernel code, that was enough to lock up the entire machine. This would not have happened in Rust; the compiler would have forced the kernel developers to explicitly declare ownership semantics, and then it would have forced me to write my code in a way that respected those semantics.

What’s the compiler doing?

Last week, we also saw some confusion from people wondering about the performance implications of passing ownership vs passing borrowed references.

To clarify: the compiler performs ownership checks on your behalf at compile time. However, when it compiles your code to assembly, the generated assembly looks very similar to assembly generated from C code.

Will it compile?

Here are a few examples designed to refine your understanding of Rust’s ownership model. For each of these examples, think about:

Ownership and mutability

Here’s our first example:

fn main() {
    let s = String::from("hello");
    s.push_str(" world");
}

This fails, because s is immutable by default, and push_str would mutate the string.

In C and C++ (and most other languages), you need to explicitly designate variables as immutable using const. In C++, a const string cannot be mutated, and attempting to do so would also produce a compiler error. In C, the const keyword is a real mess… const char* gets parsed as (const char)*, meaning you’re not allowed to modify the destination buffer, but you could reassign that variable to point to a different string. You can also write char* const, which does the opposite: you can’t reassign the variable to point to a different string, but you can modify the string buffer. If you want to get a truly immutable string, you have to use the type const char* const (or char const* const), which is just silly. (Demo here) Also, const was introduced later in C’s development, so const gets used very inconsistently throughout the standard library, and it’s not uncommon to need to insert dubious casts to get your code to compile.

To fix, we need to use the mut keyword:

fn main() {
    let mut s = String::from("hello");
    s.push_str(" world");
}

Let’s take a look at passing variables to functions (a common sticking point on week 1’s exercises). Does this code compile?

fn om_nom_nom(s: String) {
    println!("{}", s);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(s);
}

This works! What if we add another om_nom_nom call?

fn om_nom_nom(param: String) {
    println!("{}", param);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

The compiler complains about ownership here. Let’s break this down:

Important note for understanding: I think a lot of people look at this demo and think, oh my gosh, that’s annoying. Why does the compiler have to make things so complicated? However, I would argue that this only looks silly because it doesn’t have any mallocs or frees, and because the code is so short. Let’s look at how we might have done this in C (keeping in mind that String is a heap-allocated buffer).

We could have written the code like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
    free(s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
    free(s);
}

Or like this:

void om_nom_nom(char* s) {
    printf("%s\n", s);
}

int main() {
    char* s = strdup("hello");
    om_nom_nom(s);
    om_nom_nom(s);
}

Of these four possibilities, only one works without memory errors. Keep in mind that this is a trivial example, and real systems code is far more complex. 100+ line functions aren’t rare, and it’s not uncommon to have memory that is allocated in one place and freed 9 hours and 2,000 lines of code later. It’s extremely important to maintain some notion of ownership, i.e. some notion of who is responsible for cleaning up resources.
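The one C variant that works is the one where main keeps ownership and om_nom_nom only looks at the string. Here is a minimal sketch of that same arrangement in Rust, where the function borrows instead of taking ownership. I changed the parameter type to &str for this sketch; a &String is converted to &str automatically at the call site.

```rust
/// Borrows the string; the caller keeps ownership and can reuse it.
fn om_nom_nom(s: &str) {
    println!("{}", s);
}

fn main() {
    let s = String::from("hello");
    om_nom_nom(&s);
    om_nom_nom(&s); // fine: `s` was only borrowed, never moved
    // `s` is freed exactly once, automatically, when main ends.
}
```

Unlike the C versions, there is no free to misplace: the compiler knows main owns the string and inserts the cleanup at the end of main.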

Exceptions to ownership

What if we pass a u32 (unsigned int) instead of a String? Is this always an issue?

fn om_nom_nom(param: u32) {
    println!("{}", param);
}

fn main() {
    let x = 1;
    om_nom_nom(x);
    om_nom_nom(x);
}

This actually works fine! As mentioned in last Thursday’s lecture, the type u32 implements the Copy trait, which changes what happens when a value is assigned to a variable or passed as a parameter. We will talk more about traits next week, but for now, just know that if a type implements the Copy trait, then it is copied on assignment and when passed as a parameter.

This is probably pretty confusing. How are you supposed to anticipate whether the compiler will copy a value when you pass it, or whether it will use ownership semantics? Unfortunately, you kind of just need to know about the types you’re using. The good news is that the vast majority of types aren’t tricky like this and use normal ownership semantics. Only primitive types + a handful of others use copy semantics.
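Here is a small sketch contrasting the two behaviors. Point and Label are hypothetical types I made up for this example, and the #[derive(...)] lines use trait syntax we haven’t covered yet; for now, read them as "Point opts in to being copied, Label does not."

```rust
/// Hypothetical type that opts in to copy semantics.
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

/// Hypothetical type that owns a heap buffer, so it uses ownership semantics.
#[derive(Debug, Clone)]
struct Label {
    text: String,
}

fn take_point(p: Point) -> i32 {
    p.x + p.y
}

fn take_label(l: Label) -> usize {
    l.text.len()
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", take_point(p));
    println!("{}", take_point(p)); // fine: `p` was copied, not moved

    let l = Label { text: String::from("hi") };
    println!("{}", take_label(l.clone())); // explicit copy of the heap data
    println!("{}", take_label(l)); // `l` is moved here
    // println!("{:?}", l); // would not compile: `l` was moved
}
```

Note that duplicating a Label requires an explicit clone() call, so an expensive heap copy never happens silently.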

References

Let’s talk about borrowing. How does this code look to you?

fn main() {
    let s = String::from("hello");
    let s1 = &s;
    let s2 = &s;
    println!("{} {}", s, s1);
}

This code works fine because s, s1, and s2 are all immutable. Remember, you can have as many read-only pointers to something as you want, as long as no one can change what is being pointed to. (We want to avoid the scenario where chaos ensues because people are making sneak edits to the Google doc while others are trying to read it over.)

What if we bring mutable references into the mix?

fn main() {
    let s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {}", s, s1);
}

This fails to compile because s is immutable, and on the next line, we try to borrow a mutable reference to s. If this were allowed, we could modify the string using s1, even though it was supposed to be immutable.

Let’s fix that by declaring s as mutable:

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    let s2 = &s;
    println!("{} {} {}", s, s1, s2);
}

This fails again, but for a different reason.

Let’s remove the second borrow. Does this work?

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{} {}", s, s1);
}

Here’s the compiler error:

error[E0502]: cannot borrow `s` as immutable because it is also borrowed as mutable
 --> src/main.rs:4:23
  |
3 |     let s1 = &mut s;
  |              ------ mutable borrow occurs here
4 |     println!("{} {}", s, s1);
  |                       ^  -- mutable borrow later used here
  |                       |
  |                       immutable borrow occurs here

The compiler is saying “hey, you borrowed s here, into s1. Now you’re trying to use s, but you haven’t gotten the value back yet. I can’t give you the value back yet, because s1 is still going to be used (as the second thing being printed in that println).”

How about this code?

fn main() {
    let mut s = String::from("hello");
    let s1 = &mut s;
    println!("{}", s1);
    println!("{}", s);
}

Unlike the previous example, this actually works. After the first println, Rust sees that s1 will not be used again, so it “returns” the borrowed value back to s. Then, when we try to use s, everything checks out. 👌

Here’s a question we got from the week 1 survey:

“One thing that’s confusing is why sometimes I need to &var and other times I can just use var: for example, set.contains(&var), but set.insert(var) – why?”

Can you answer this question based on your understanding of references now?

When inserting an item into a set, we want to transfer ownership of that item into the set; that way, the item will exist as long as the set exists. (It would be bad if you added a string to the set, and then someone freed the string while it was still a member of the set.) However, when trying to see if the set contains an item, we want to retain ownership, so we only pass a reference.
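As a quick sketch of that answer, here are two hypothetical wrapper functions (remember and knows are my names, not part of the standard library) whose signatures mirror what HashSet’s insert and contains need from the caller:

```rust
use std::collections::HashSet;

/// Stores `item` in the set, so it must take ownership of it.
fn remember(set: &mut HashSet<String>, item: String) -> bool {
    set.insert(item)
}

/// Only looks at `item`, so borrowing is enough.
fn knows(set: &HashSet<String>, item: &str) -> bool {
    set.contains(item)
}

fn main() {
    let mut set = HashSet::new();
    let var = String::from("hello");

    println!("before insert: {}", knows(&set, &var));
    remember(&mut set, var);
    // println!("{}", var); // would not compile: `var` now belongs to the set

    let probe = String::from("hello");
    println!("after insert: {}", knows(&set, &probe));
    println!("probe is still ours: {}", probe);
}
```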

Error handling

I’d like to argue that this (made-up) code has a security vulnerability:

// Imagine this is code for a network server that has just received and is
// processing a packet of data.
size_t len = packet.length;
void *buf = malloc(len);
memcpy(buf, packet.data, len);
// Do stuff with buf
// ...
free(buf);

Can you identify the weakness here?

The problem with this code is that it doesn’t do error handling. If malloc fails to allocate memory (most likely because there isn’t enough memory available), it returns NULL and sets errno to indicate the error. If this happens, the memcpy in our code will cause a segfault. Someone capable of sending large packets could take down the server (known as a “Denial of Service attack”).

This may seem dumb, but such small mistakes have caused real problems. There are two core issues here:

Handling nulls

NULL values appear all over the place, and not just in the context of errors; for example, you can pass NULL as the second argument to waitpid to indicate that you’re not interested in getting extra information about how a process changed state. NULL seems innocuous at first glance, but Tony Hoare, who invented the null reference in 1965, went on to call null references his “billion-dollar mistake” because of the problems they have caused. You can find a long list of security vulnerabilities caused by NULL here. Most of them are denial of service vulnerabilities, but some of them result in complete disclosure of information or remote code execution (e.g. this critical Linux kernel vulnerability).

Why are null pointers so dangerous? I would argue that the biggest issue is the cognitive burden they add for a programmer. If you are implementing an API, you must remember which parameters might be NULL and add code to handle these edge cases. If you are consuming an API, you must ensure you never pass NULL when it might not be expected, and you must also check return values when NULL might be returned. If the above links are any indication, humans simply can’t do this reliably.

To solve this problem, we might want some way to indicate to the compiler when a value might be NULL, so that the compiler can then ensure code using those values is equipped to handle NULL.

Rust does this with the Option type. A value of type Option<T> can either be None or Some(value of type T). For example, here’s a function that sometimes returns a String:

fn feeling_lucky() -> Option<String> {
    if get_random_num() > 10 {
        Some(String::from("I'm feeling lucky!"))
    } else {
        None
    }
}

How can we use this returned value? Option has some pretty good documentation that you may want to check out, but here are a few things you can do:
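As a rough sketch (not a substitute for the documentation), here are some common ways to work with an Option. To keep the example deterministic, this version of feeling_lucky takes the number as a parameter instead of calling get_random_num(); that change and the describe helper are my assumptions for illustration.

```rust
/// Variant of feeling_lucky that takes the "random" number as a parameter
/// (an assumption made so the example is deterministic).
fn feeling_lucky(num: u32) -> Option<String> {
    if num > 10 {
        Some(String::from("I'm feeling lucky!"))
    } else {
        None
    }
}

/// Hypothetical helper: handles both cases explicitly with match.
fn describe(num: u32) -> String {
    match feeling_lucky(num) {
        Some(message) => format!("got: {}", message),
        None => String::from("got nothing"),
    }
}

fn main() {
    // Check which case we have without looking inside.
    if feeling_lucky(3).is_none() {
        println!("not lucky this time");
    }

    // Supply a default to use when the value is None.
    let message = feeling_lucky(3).unwrap_or(String::from("fallback"));
    println!("{}", message);

    // Handle both cases with match.
    println!("{}", describe(42));

    // Run code only in the Some case.
    if let Some(message) = feeling_lucky(42) {
        println!("lucky: {}", message);
    }
}
```

In every case, the compiler will not let you reach the String without first saying what happens when it is missing.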

Under the hood, Option is an enum (an Option enumerates two possible forms: None or Some). We’ll spend more time on enums in ~2 weeks, but if you are curious, there is excellent documentation here.

Handling errors

C has an absolutely garbage system for handling errors. (It’s not really anything that’s built into the language; it’s more a convention on top of the language that people widely adopted, because there wasn’t anything better.) The system is typically this:

As you might expect, a lot of issues have arisen from failing to check for errors, or from having done it incorrectly. For example, this critical kernel vulnerability allowed attackers over the network to execute arbitrary code with kernel privileges. The kernel had a set of functions that returned 0 on success and -1 on error, but could also return NET_XMIT_CN (defined to be 2) to indicate network congestion (not a failure condition, but potentially useful for the caller to know about). Under congestion conditions, the code calling those functions saw a nonzero return code, assumed network failure, and freed a bunch of related memory. However, since the network hadn’t actually failed, that memory was used again later: a use-after-free that led to remote code execution.

The fix for this critical vulnerability? A one-line change:

--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -149,7 +149,7 @@ static int iwch_l2t_send(struct t3cdev *tdev, struct sk_buff *skb, struct l2t_en
 	error = l2t_send(tdev, skb, l2e);
 	if (error < 0)
 		kfree_skb(skb);
-	return error;
+	return error < 0 ? error : 0;
 }

Most other languages (including C++) use exceptions to manage error conditions. Exceptions work pretty well! However, they have some drawbacks. What are some downsides you can think of?

In my opinion, the biggest disadvantage is that failure modes are hard to spot. Any code can throw any exception at pretty much any time, and exceptions bubble up the stack until they’re handled (or until they reach the bottom of the stack, at which point the program will crash). That means you can call one function, and it might fail with an exception that was thrown by a totally unrelated helper function that is twelve function calls away. If you’ve worked on a large project, you have probably experienced cases where you deployed code you were happy with, but then it crashed in production, and you thought, whoops, I need to handle this exception too.

This problem compounds when you have a large codebase that is constantly evolving. Someone could modify a helper function to throw a new exception, and the code would compile fine, even if users of that function hadn’t added any new code to handle the new exception.

Exceptions are especially bad when combined with manual memory management (i.e. in C++). In handling an exception, you can forget to free memory, or accidentally double-free memory that was already freed before the exception was thrown, or you can end up with half-baked data structures if you caught an exception while initializing a struct and attempt to recover improperly.

Rust takes a different, two-pronged approach to error handling:

Here’s how you can use Result:

fn poke_toddler() -> Result<&'static str, &'static str> {
    if get_random_num() > 10 {
        Ok("Hahahaha!")
    } else {
        Err("Waaaaahhh!")
    }
}

fn main() {
    match poke_toddler() {
        Ok(message) => println!("Toddler said: {}", message),
        Err(cry) => println!("Toddler cried: {}", cry),
    }
}

Similar to how Rust’s compiler forces you to think about ownership up front and address any possible lifetime issues, usage of Result for error handling also forces you to think about failure modes and address all possible error conditions (at least, all error conditions that the compiler is able to find – it can’t find logic bugs in your program). This can be very annoying, but if your program compiles, you can be much more confident that you won’t hit an unexpected exception!
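To connect this back to the congestion bug described earlier, here is a sketch of how the three outcomes (sent, congested, failed) might be modeled so that "congested" cannot be mistaken for "failed." Everything here is hypothetical: the type names, the function names, and the made-up parameters are mine and are not the kernel’s API.

```rust
/// Hypothetical outcome of a send that succeeded.
#[derive(Debug, PartialEq)]
enum SendStatus {
    Sent,
    Congested, // delivered, but the network is busy
}

/// Hypothetical error type for a send that really failed.
#[derive(Debug, PartialEq)]
struct SendFailed;

/// Pretends to send a packet. The parameters are made up so that the example
/// is deterministic.
fn send_packet(link_up: bool, queue_len: usize) -> Result<SendStatus, SendFailed> {
    if !link_up {
        Err(SendFailed)
    } else if queue_len > 8 {
        Ok(SendStatus::Congested)
    } else {
        Ok(SendStatus::Sent)
    }
}

/// Decides whether the caller should free its packet buffer.
fn should_free_buffer(result: &Result<SendStatus, SendFailed>) -> bool {
    match result {
        Ok(SendStatus::Sent) => false,
        Ok(SendStatus::Congested) => false, // congestion is not failure
        Err(SendFailed) => true,
    }
}

fn main() {
    let outcome = send_packet(true, 20);
    println!("{:?}: free buffer? {}", outcome, should_free_buffer(&outcome));

    let outcome = send_packet(false, 0);
    println!("{:?}: free buffer? {}", outcome, should_free_buffer(&outcome));
}
```

Because match must cover every case, deleting the Congested arm is a compile error rather than a silent misreading of a return code.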

You will also commonly see the unwrap and expect methods being used:

// Panic if the baby cries:
let ok_message = poke_toddler().unwrap();
// Same thing, but print a more descriptive panic message:
let ok_message = poke_toddler().expect("Toddler cried :(");

If the Result was Ok, unwrap() returns the success value; otherwise, it causes a panic. expect() does the same thing, but prints the supplied error message when it panics.

It’s a good idea to avoid unwrap and expect when possible, but often, you’ll encounter errors where the most reasonable thing to do is to panic. For example, if you are writing a command-line program to process text from stdin, and reading input fails, then there isn’t really much you can do to handle that error, and it would be reasonable to panic.

use std::io;

// Read line from stdin
let mut line = String::new();
io::stdin().read_line(&mut line).expect("Failed to read from stdin");

This isn’t a Rust class, and we aren’t going to grade you based on your usage of panics vs Result. However, we do want you to think about the implications of calling unwrap/expect vs doing error checking and returning Result.
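To illustrate the trade-off, here is a small sketch of the same logic written both ways. The function names are hypothetical. The first version uses the ? operator, which we haven’t introduced above: it returns early with the error if the Result is Err, and otherwise gives you the success value.

```rust
use std::num::ParseIntError;

/// Parses two numbers and adds them, handing any parse error back to the
/// caller instead of panicking.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?; // on Err, return it to our caller
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

/// Same logic, but panics on bad input.
fn add_strs_or_panic(a: &str, b: &str) -> i32 {
    let x: i32 = a.trim().parse().expect("first argument was not a number");
    let y: i32 = b.trim().parse().expect("second argument was not a number");
    x + y
}

fn main() {
    for (a, b) in [("2", "40"), ("2", "forty")] {
        match add_strs(a, b) {
            Ok(sum) => println!("{} + {} = {}", a, b, sum),
            Err(e) => println!("could not add {} and {}: {}", a, b, e),
        }
    }

    println!("{}", add_strs_or_panic("1", "2"));
}
```

The Result version lets the caller decide what to do about bad input; the panicking version makes that decision for every caller, which is only reasonable when no caller could do anything better.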

What should I take away from this?

Error handling in Rust (and in general) is a big topic, and we could spend several lectures on it if we had the time. Sadly, we don’t. As such, we just want to give you the foundation for understanding how Rust approaches error handling, and to expose you to the motivation for why Rust decided to take this approach.

Armin will be implementing a linked list in lecture on Thursday to contextualize ownership mechanics and error handling. You’ll be able to see a practical example worked out in code; if you have questions, bring them to lecture or post on Slack!