Ownership (continued) and Error Handling
Congratulations on making it through the first week of spring quarter! We hope you are doing well, no matter where you are.
Ownership in C
In the survey results, many people indicated feeling confused about Rust’s ownership model.
I would like to make the claim that Rust code actually does the same thing as good C code! If you read production C code, you’ll find notions of ownership embedded in the comments.
Here’s some code from Open vSwitch (software for networking switches):
/* Get status of the virtual port (ex. tunnel, patch).
*
* Returns '0' if 'port' is not a virtual port or has no errors.
* Otherwise, stores the error string in '*errp' and returns positive errno
* value. The caller is responsible for freeing '*errp' (with free()).
*
* This function may be a null pointer if the ofproto implementation does
* not support any virtual ports or their states.
*/
int (*vport_get_status)(const struct ofport *port, char **errp);
This comment indicates that this function is allocating memory and returning “ownership” of that memory to the function that called vport_get_status. As the caller ends up with ownership, it is responsible for freeing the memory.
Here’s another example from ffmpeg, a popular media transcoding library:
/**
* @note Any old dictionary present is discarded and replaced with a copy of the new one. The
* caller still owns val is and responsible for freeing it.
*/
int av_opt_set_dict_val(void *obj, const char *name, const AVDictionary *val, int search_flags);
This example shows a sort of borrow in C. The caller owns an AVDictionary and passes it to av_opt_set_dict_val, which uses it to set an option in obj, but the caller maintains ownership and is still responsible for freeing the memory.
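In Rust, the contracts that these two comments describe would live in the function signatures instead. Here is a minimal sketch; the names Port, vport_status, Options, and set_dict are made up for illustration and are not part of Open vSwitch or ffmpeg.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a virtual port; not a real Open vSwitch type.
struct Port {
    error: String,
}

// Returning `String` by value hands ownership of the text to the caller.
// This plays the role of "the caller is responsible for freeing '*errp'",
// except that Rust frees it automatically when the caller's variable
// goes out of scope.
fn vport_status(port: &Port) -> String {
    port.error.clone()
}

// Hypothetical stand-in for the object whose option is being set.
struct Options {
    dict: HashMap<String, String>,
}

// `val` is borrowed (`&`): the function may read it and copy from it,
// but the caller still owns it afterwards.
fn set_dict(obj: &mut Options, val: &HashMap<String, String>) {
    // The old dictionary is dropped and replaced with a copy of the new one.
    obj.dict = val.clone();
}

fn demo() -> (String, usize, usize) {
    let port = Port { error: String::from("tunnel down") };
    let err = vport_status(&port); // we now own `err`

    let mut val = HashMap::new();
    val.insert(String::from("codec"), String::from("h264"));
    let mut obj = Options { dict: HashMap::new() };
    set_dict(&mut obj, &val);

    // `val` is still usable here because it was only borrowed.
    (err, obj.dict.len(), val.len())
}
```

The ownership rules that the C comments ask the reader to remember are checked by the compiler here: forgetting the & on val would move the dictionary, and any later use of val in demo would fail to compile.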
We also see ownership being transferred in more complicated ways. For example, see this function from the Linux kernel:
/**
* iscsi_boot_create_target() - create boot target sysfs dir
* @boot_kset: boot kset
* @index: the target id
* @data: driver specific data for target
* @show: attr show function
* @is_visible: attr visibility function
* @release: release function
*
* Note: The boot sysfs lib will free the data passed in for the caller
* when all refs to the target kobject have been released.
*/
struct iscsi_boot_kobj *
iscsi_boot_create_target(struct iscsi_boot_kset *boot_kset, int index,
void *data,
ssize_t (*show) (void *data, int type, char *buf),
umode_t (*is_visible) (void *data, int type),
void (*release) (void *data))
{
return iscsi_boot_create_kobj(boot_kset, &iscsi_boot_target_attr_group,
"target%d", index, data, show, is_visible,
release);
}
EXPORT_SYMBOL_GPL(iscsi_boot_create_target);
Here, the caller passes ownership of some data to iscsi_boot_create_target; in doing so, the caller is no longer expected to free that memory. The iscsi_boot_create_target function then passes ownership to the sysfs library through a chain of other functions not shown here, so that the memory can be freed at the appropriate time.
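A rough Rust analogue of this kind of hand-off is passing a value (rather than a reference) to a function. The sketch below uses hypothetical types (TargetData, BootKset, create_target) that only mimic the shape of the kernel code.

```rust
// Hypothetical types for illustration; this is not the kernel's API.
struct TargetData {
    name: String,
}

struct BootKset {
    targets: Vec<TargetData>,
}

// Taking `data` by value moves ownership into the function, which then
// moves it into `kset`. The caller can no longer use (or free) it.
fn create_target(kset: &mut BootKset, data: TargetData) -> usize {
    kset.targets.push(data);
    kset.targets.len() - 1 // index of the newly created target
}

fn demo() -> (usize, String) {
    let mut kset = BootKset { targets: Vec::new() };
    let data = TargetData { name: String::from("target0") };
    let index = create_target(&mut kset, data);
    // Using `data` here would be a compile error: it has been moved.
    // The memory is freed when `kset` is dropped at the end of this function.
    (index, kset.targets[index].name.clone())
}
```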
Sometimes, ownership gets complicated. Many sufficiently complex codebases have their own free functions that do additional cleanup on resources; some even use their own custom memory allocators. In this example (also from Open vSwitch), port_query_by_name initializes memory and returns ownership of that memory to the caller, but the memory can’t be freed by calling free(); instead, ofproto_port_destroy() must be used.
/* Looks up a port named 'devname' in 'ofproto'. On success, returns 0 and
* initializes '*port' appropriately. Otherwise, returns a positive errno
* value.
*
* The caller owns the data in 'port' and must free it with
* ofproto_port_destroy() when it is no longer needed. */
int (*port_query_by_name)(const struct ofproto *ofproto,
const char *devname, struct ofproto_port *port);
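In Rust, a type-specific cleanup function like this is usually expressed by implementing the Drop trait, so the caller cannot pick the wrong free function or forget to call it. The following is only a sketch: OfprotoPort and port_query_by_name are hypothetical Rust stand-ins, and the shared counter exists purely so the example can observe when cleanup runs.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical port type; `destroyed` counts how many times cleanup ran.
struct OfprotoPort {
    name: String,
    destroyed: Rc<Cell<u32>>,
}

// Plays the role of ofproto_port_destroy(): custom cleanup that the
// compiler calls automatically when the owner goes out of scope.
impl Drop for OfprotoPort {
    fn drop(&mut self) {
        self.destroyed.set(self.destroyed.get() + 1);
    }
}

// Returns an owned port; the caller is responsible for it from here on.
fn port_query_by_name(devname: &str, destroyed: Rc<Cell<u32>>) -> OfprotoPort {
    OfprotoPort { name: devname.to_string(), destroyed }
}

fn demo() -> (u32, u32) {
    let destroyed = Rc::new(Cell::new(0));
    let before;
    {
        let port = port_query_by_name("eth0", Rc::clone(&destroyed));
        println!("found port {}", port.name);
        before = destroyed.get(); // cleanup has not run yet
    } // `port` goes out of scope here, so `drop` runs exactly once.
    (before, destroyed.get())
}
```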
Ownership is extra complicated when structs are involved. It’s common to see cases where one function is responsible for freeing a struct, but a different function is responsible for freeing buffers that struct pointed to. For example, this code from Linux:
/**
* dvb_unregister_frontend() - Unregisters a DVB frontend
*
* @fe: pointer to &struct dvb_frontend
*
* Stops the frontend kthread, calls dvb_unregister_device() and frees the
* private frontend data allocated by dvb_register_frontend().
*
* NOTE: This function doesn't frees the memory allocated by the demod,
* by the SEC driver and by the tuner. In order to free it, an explicit call to
* dvb_frontend_detach() is needed, after calling this function.
*/
int dvb_unregister_frontend(struct dvb_frontend *fe);
Or, this code from Miller (a command-line data processing utility):
static void mapper_count_similar_free(mapper_t* pmapper, context_t* _) {
mapper_count_similar_state_t* pstate = pmapper->pvstate;
slls_free(pstate->pgroup_by_field_names);
// lhmslv_free will free the keys: we only need to free the void-star values.
for (lhmslve_t* pa = pstate->pcounts_by_group->phead; pa != NULL; pa = pa->pnext) {
unsigned long long* pcount = pa->pvvalue;
free(pcount);
}
lhmslv_free(pstate->pcounts_by_group);
...
}
Rust is challenging to learn not because the language itself is challenging, but because ownership (as a universal concept) is challenging, and because the Rust compiler forces you to figure out all possible ownership issues before it allows you to run your code. It’s not enough to have figured out which function will free some buffer if you also have a struct that maintains a pointer to the buffer and outlives the buffer; what if someone accidentally tries to reference the buffer using that pointer after the buffer has been freed? The Rust compiler forces you to think through all these edge cases up front.
In C, the most confusing ownership-related problems happen when there aren’t even any comments to explain what is happening. Recently, I was working on developing a feature for C Playground that generates diagrams of open file tables as you execute code. In building this, I had to implement an extension to the Linux kernel. However, when I deployed the new code and people started using it, I encountered a problem where the server hosting C Playground would occasionally (and randomly) completely halt – I couldn’t even SSH in. After a week of debugging, I discovered this was an ownership problem. I was looping over processes in a linked list, and I was properly taking ownership of each process struct while I was using it, but I hadn’t realized that I also needed to call a special function to take ownership of the entire linked list. Because I didn’t call that function, I would (very rarely, but sometimes) hit an issue where I would try to get the next process in the linked list, but the linked list had been deallocated by a different function, and I ended up reading garbage memory. Since this was kernel code, that was enough to lock up the entire machine. This would not have happened in Rust; the compiler would have forced the kernel developers to explicitly declare ownership semantics, and then it would have forced me to write my code in a way that respected those semantics.
What’s the compiler doing?
Last week, we also saw some confusion from people wondering about the performance implications of passing ownership vs passing borrowed references.
To clarify: the compiler performs ownership checks on your behalf at compile time. However, when it compiles your code to assembly, the generated assembly looks very similar to assembly generated from C code.
- When you pass ownership of memory, you’re really just passing a pointer to that memory location. However, the compiler will insert a free call at the end of that value’s lifetime.
  - I should note that this is not strictly true; sometimes the compiler will need to copy memory in order to transfer ownership. However, for the purposes of our class, this approximation works.
- When you pass a reference, you are also just passing a pointer. The compiler generates code to automatically dereference the pointer for you.
- When you explicitly copy something, the memory will be copied.
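One way to convince yourself of this is to look at where a String’s heap buffer lives before and after each kind of pass. The sketch below (function names are made up for this example) compares the buffer address when borrowing, cloning, and moving.

```rust
// Takes ownership of the String and reports where its heap buffer lives.
fn heap_addr_after_move(s: String) -> usize {
    s.as_ptr() as usize
} // `s` is dropped (its buffer freed) here

// Borrows the String and reports where its heap buffer lives.
fn heap_addr_via_ref(s: &String) -> usize {
    s.as_ptr() as usize
}

// Returns (borrow and move saw the same buffer, clone got a new buffer).
fn demo() -> (bool, bool) {
    let s = String::from("hello");
    let original = s.as_ptr() as usize;

    // Passing a reference: same buffer.
    let via_ref = heap_addr_via_ref(&s);

    // Explicit copy: a second buffer is allocated while the first is alive.
    let copy = s.clone();
    let clone_differs = copy.as_ptr() as usize != original;

    // Passing ownership: the heap buffer itself is not copied.
    let after_move = heap_addr_after_move(s);

    (original == via_ref && original == after_move, clone_differs)
}
```

Only the small pointer/length/capacity record that makes up a String may get copied during a move; the text on the heap stays where it is.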
Will it compile?
Here are a few examples designed to refine your understanding of Rust’s ownership model. For each of these examples, think about:
- Will it compile?
- If not, why not? What could go wrong in an equivalent C or C++ program that does compile?
Ownership and mutability
Here’s our first example:
fn main() {
let s = String::from("hello");
s.push_str(" world");
}
This fails, because s is immutable by default, and push_str would mutate the string.
In C and C++ (and most other languages), you need to explicitly designate variables as immutable using const. In C++, trying to mutate a const string is also a compiler error. In C, the const keyword is a real mess… const char* gets parsed as (const char)*, meaning you’re not allowed to modify the destination buffer, but you could reassign that variable to point to a different string. You can also write char* const, which does the opposite: you can’t reassign the variable to point to a different string, but you can modify the string buffer. If you want to get a true immutable string, you have to use the type const char* const (or char const* const), which is just silly. (Demo here) Also, const was introduced later in C’s development, so const gets used very inconsistently throughout the standard library, and it’s not uncommon to need to insert dubious casts to get your code to compile.
To fix, we need to use the mut keyword:
fn main() {
let mut s = String::from("hello");
s.push_str(" world");
}
Let’s take a look at passing variables to functions (a common sticking point on week 1’s exercises). Does this code compile?
fn om_nom_nom(s: String) {
println!("{}", s);
}
fn main() {
let s = String::from("hello");
om_nom_nom(s);
}
This works! What if we add another om_nom_nom call?
fn om_nom_nom(param: String) {
println!("{}", param);
}
fn main() {
let s = String::from("hello");
om_nom_nom(s);
om_nom_nom(s);
}
The compiler complains about ownership here. Let’s break this down:
- On the first line of main, s owns the string.
- On the next line, ownership gets transferred to the param parameter of om_nom_nom.
- When om_nom_nom returns, param goes out of scope, and ownership of the string hasn’t been transferred anywhere else, so the string is “dropped” and the string’s memory is freed.
- Back in main, on the third line, we try to use s again. However, we previously gave s away (and in fact s has already been destroyed). The compiler complains with an error explaining this:
error[E0382]: use of moved value: `s`
 --> src/main.rs:8:14
  |
6 |     let s = String::from("hello");
  |         - move occurs because `s` has type `std::string::String`, which does not implement the `Copy` trait
7 |     om_nom_nom(s);
  |                - value moved here
8 |     om_nom_nom(s);
  |                ^ value used here after move
error: aborting due to previous error
Important note for understanding: I think a lot of people look at this demo and think, oh my gosh, that’s annoying. Why does the compiler have to make things so complicated? However, I would argue that this only looks silly because it doesn’t have any mallocs or frees, and because the code is so short. Let’s look at how we might have done this in C (keeping in mind that String is a heap-allocated buffer).
We could have written the code like this:
void om_nom_nom(char* s) {
printf("%s\n", s);
}
int main() {
char* s = strdup("hello");
om_nom_nom(s);
om_nom_nom(s);
free(s);
}
Or like this:
void om_nom_nom(char* s) {
printf("%s\n", s);
free(s);
}
int main() {
char* s = strdup("hello");
om_nom_nom(s);
om_nom_nom(s);
}
Or like this:
void om_nom_nom(char* s) {
printf("%s\n", s);
free(s);
}
int main() {
char* s = strdup("hello");
om_nom_nom(s);
om_nom_nom(s);
free(s);
}
Or like this:
void om_nom_nom(char* s) {
printf("%s\n", s);
}
int main() {
char* s = strdup("hello");
om_nom_nom(s);
om_nom_nom(s);
}
Of these four possibilities, only one works without memory errors. Keep in mind that this is a trivial example, and real systems code is far more complex. 100+ line functions aren’t rare, and it’s not uncommon to have memory that is allocated in one place and freed 9 hours and 2000k lines of code later. It’s extremely important to maintain some notion of ownership, i.e. some notion of who is responsible for cleaning up resources.
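Back in Rust, the double call can be made to compile by deciding explicitly who owns the string. The sketch below shows two options: lend the string out with a reference, or hand each call its own copy. (The name om_nom_nom_owned and the feed_twice wrapper are just for this example.)

```rust
// Borrowing version: the function only reads the string, so it takes a
// reference and the caller keeps ownership.
fn om_nom_nom(param: &String) {
    println!("{}", param);
}

// Owning version, as in the example above: the string is freed when
// this function returns.
fn om_nom_nom_owned(param: String) {
    println!("{}", param);
}

fn feed_twice() -> String {
    let s = String::from("hello");

    // Option 1: lend `s` out twice. Nothing is freed by the callee.
    om_nom_nom(&s);
    om_nom_nom(&s);

    // Option 2: give each call its own copy and keep the original.
    om_nom_nom_owned(s.clone());
    om_nom_nom_owned(s.clone());

    s // still owned here; ownership passes to whoever called feed_twice
}
```

The first option corresponds to the one correct C version above, where main keeps ownership and frees the buffer exactly once.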
Exceptions to ownership
What if we pass a u32 (unsigned int) instead of a String? Is this always an issue?
fn om_nom_nom(param: u32) {
println!("{}", param);
}
fn main() {
let x = 1;
om_nom_nom(x);
om_nom_nom(x);
}
This actually works fine! As mentioned in last Thursday’s lecture, the type u32 implements the Copy trait, which changes what happens when a value is assigned to a variable or passed as a parameter. We will talk more about traits next week, but for now, just know that if a type implements the Copy trait, then it is copied on assignment and when passed as a parameter.
This is probably pretty confusing. How are you supposed to anticipate whether the compiler will copy a value when you pass it, or whether it will use ownership semantics? Unfortunately, you kind of just need to know about the types you’re using. The good news is that the vast majority of types aren’t tricky like this and use normal ownership semantics. Only primitive types + a handful of others use copy semantics.
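For your own types, you can see which semantics apply by checking whether the type opts in to Copy. The sketch below uses a made-up Point struct that derives Copy, next to a String, which does not implement it.

```rust
// A small struct made only of Copy fields can opt in to copy semantics.
#[derive(Clone, Copy)]
struct Point {
    x: u32,
    y: u32,
}

fn sum_point(p: Point) -> u32 {
    p.x + p.y
}

fn string_len(s: String) -> usize {
    s.len()
}

fn demo() -> (u32, u32, usize) {
    let p = Point { x: 1, y: 2 };
    let a = sum_point(p);
    let b = sum_point(p); // fine: `p` was copied, not moved

    let s = String::from("hello");
    let n = string_len(s.clone()); // String is not Copy; copy explicitly
    let _ = string_len(s);         // this call moves `s`
    // Using `s` here would be a compile error.

    (a, b, n)
}
```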
References
Let’s talk about borrowing. How does this code look to you?
fn main() {
let s = String::from("hello");
let s1 = &s;
let s2 = &s;
println!("{} {}", s, s1);
}
This code works fine because s, s1, and s2 are all immutable. Remember, you can have as many read-only pointers to something as you want, as long as no one can change what is being pointed to. (We want to avoid the scenario where chaos ensues because people are making sneak edits to the Google doc while others are trying to read it over.)
What if we bring mutable references into the mix?
fn main() {
let s = String::from("hello");
let s1 = &mut s;
let s2 = &s;
println!("{} {}", s, s1);
}
This fails to compile because s is immutable, and on the next line, we try to borrow a mutable reference to s. If this were allowed, we could modify the string using s1, even though it was supposed to be immutable.
Let’s fix that by declaring s as mutable:
fn main() {
let mut s = String::from("hello");
let s1 = &mut s;
let s2 = &s;
println!("{} {} {}", s, s1, s2);
}
This fails again, but for a different reason.
- We first declare s as mutable. 👍
- We borrow a mutable reference to s. 👍
- We try to borrow an immutable reference to s. However, there already exists a mutable reference to s. Rust doesn’t allow any other references to exist while a mutable reference is still in use. Otherwise, the mutable reference could be used to change (potentially reallocate) memory when code using the other references least expects it.
Let’s remove the second borrow. Does this work?
fn main() {
let mut s = String::from("hello");
let s1 = &mut s;
println!("{} {}", s, s1);
}
- We first declare s as mutable. 👍
- We borrow a mutable reference to s. 👍
- We try to use s. However, the value has been “borrowed out” to s1 and hasn’t been “returned” yet. As such, we can’t use s.
Here’s the compiler error:
error[E0502]: cannot borrow `s` as immutable because it is also borrowed as mutable
--> src/main.rs:4:23
|
3 | let s1 = &mut s;
| ------ mutable borrow occurs here
4 | println!("{} {}", s, s1);
| ^ -- mutable borrow later used here
| |
| immutable borrow occurs here
The compiler is saying, “Hey, you borrowed s here, into s1. Now you’re trying to use s, but you haven’t gotten the value back yet. I can’t give you the value back yet, because s1 is still going to be used (as the second thing being printed in that println).”
How about this code?
fn main() {
let mut s = String::from("hello");
let s1 = &mut s;
println!("{}", s1);
println!("{}", s)
}
Unlike the previous example, this actually works. After the first println, Rust sees that s1 will not be used again, so it “returns” the borrowed value back to s. Then, when we try to use s, everything checks out. 👌
Here’s a question we got from the week 1 survey:
“One thing that’s confusing is why sometimes I need to &var and other times I can just use var: for example, set.contains(&var), but set.insert(var) – why?”
Can you answer this question based on your understanding of references now?
When inserting an item into a set, we want to transfer ownership of that item into the set; that way, the item will exist as long as the set exists. (It would be bad if you added a string to the set, and then someone freed the string while it was still a member of the set.) However, when trying to see if the set contains an item, we want to retain ownership, so we only pass a reference.
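Here is that difference as a small sketch using the standard library’s HashSet (the variable names are arbitrary):

```rust
use std::collections::HashSet;

fn demo() -> (bool, bool, usize) {
    let mut set: HashSet<String> = HashSet::new();
    let var = String::from("hello");
    let probe = String::from("hello");

    // insert takes the value itself: ownership of `var` moves into the set,
    // so the string lives as long as the set does.
    let inserted = set.insert(var);
    // Using `var` here would be a compile error.

    // contains only needs to look at the value, so it takes a reference.
    let found = set.contains(&probe);

    // `probe` is still ours to use afterwards.
    (inserted, found, probe.len())
}
```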
Error handling
I’d like to argue that this (made-up) code has a security vulnerability:
// Imagine this is code for a network server that has just received and is
// processing a packet of data.
size_t len = packet.length;
void *buf = malloc(len);
memcpy(buf, packet.data, len);
// Do stuff with buf
// ...
free(buf);
Can you identify the weakness here?
The problem with this code is that it doesn’t do error handling. If malloc fails to allocate memory (most likely because there isn’t enough memory available), it returns NULL and sets errno to indicate the error. If this happens, the memcpy in our code will cause a segfault. Someone capable of sending large packets could take down the server (known as a “Denial of Service attack”).
This may seem dumb, but such small mistakes have caused real problems. There are two core issues here:
- The use of NULL in place of a real value
- The lack of a proper error handling system
Handling nulls
NULL values appear all over the place, and not just in the context of errors; for example, you can pass NULL as the second argument to waitpid to indicate that you’re not interested in getting extra information about how a process changed state. NULL seems innocuous at first glance, but Tony Hoare, who invented the null reference in 1965, went on to call null references his “billion-dollar mistake” because of the problems they have caused. You can find a long list of security vulnerabilities caused by NULL here. Most of them are denial of service vulnerabilities, but some of them result in complete disclosure of information or remote code execution (e.g. this critical Linux kernel vulnerability).
Why are null pointers so dangerous? I would argue that the biggest issue is the cognitive burden they add for a programmer. If you are implementing an API, you must remember which parameters might be NULL and add code to handle these edge cases. If you are consuming an API, you must ensure you never pass NULL when it might not be expected, and you must also check return values when NULL might be returned. If the above links are any indication, humans simply can’t do this reliably.
To solve this problem, we might want some way to indicate to the compiler when a value might be NULL, so that the compiler can then ensure code using those values is equipped to handle NULL.
Rust does this with the Option type. A value of type Option<T> can either be None or Some(value of type T). For example, here’s a function that sometimes returns a String:
fn feeling_lucky() -> Option<String> {
if get_random_num() > 10 {
Some(String::from("I'm feeling lucky!"))
} else {
None
}
}
How can we use this returned value? Option has some pretty good documentation that you may want to check out, but here are a few things you can do:
- You can call the is_some() or is_none() methods to check whether the Option is “null.” For example:
  if feeling_lucky().is_none() {
      println!("Not feeling lucky :(");
  }
- You can call unwrap_or(default) to get the value of the option, using default as the default value if the option is None.
  let message = feeling_lucky().unwrap_or(String::from("Not lucky :("));
- Most idiomatically, you can use a match expression. match is kind of like a switch statement in other languages, but is much more powerful. As this class isn’t focusing on the Rust language itself, we won’t spend much time talking about match, but we would encourage you to check out this excellent introduction here.
  match feeling_lucky() {
      Some(message) => {
          println!("Got message: {}", message);
      },
      None => {
          println!("No message returned :-/");
      },
  }
Under the hood, Option is an enum (an Option enumerates two possible forms: None or Some). We’ll spend more time on enums in ~2 weeks, but if you are curious, there is excellent documentation here.
Handling errors
C has an absolutely garbage system for handling errors. (It’s not really anything that’s built into the language; it’s more a convention on top of the language that people widely adopted, because there wasn’t anything better.) The system is typically this:
- If a function might encounter an error, its return type is made to be int (or sometimes void*).
- If the function is successful, it returns 0. Otherwise, if an error is encountered, it returns -1. (If the function is returning a pointer, it returns a valid pointer in the success case, or NULL if an error occurs.)
- The function that encountered the error sets the global variable errno to be an integer indicating what went wrong. If the caller sees that the function returned -1 or NULL, it can check errno to see what error was encountered. You can see about half the possible errno codes here.
As you might expect, a lot of issues have arisen from failing to check for errors, or from having done it incorrectly. For example, this critical kernel vulnerability allowed attackers over the network to execute arbitrary code with kernel privileges. The kernel had a set of functions that returned 0 on success and -1 on error, but could also return NET_XMIT_CN (defined to be 2) to indicate network congestion (not a failure condition, but potentially useful for the caller to know about). Under congestion conditions, the code calling those functions saw a nonzero return code, assumed network failure, and freed a bunch of related memory. However, since the network hadn’t actually failed, that memory was used again later: a use-after-free that led to remote code execution.
The fix for this critical vulnerability? A one-line change:
--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -149,7 +149,7 @@ static int iwch_l2t_send(struct t3cdev *tdev, struct sk_buff *skb, struct l2t_en
error = l2t_send(tdev, skb, l2e);
if (error < 0)
kfree_skb(skb);
- return error;
+ return error < 0 ? error : 0;
}
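For contrast, here is a sketch of how the three outcomes (sent, congested, failed) might be modeled so that they can’t be confused with one another. Everything here is hypothetical: SendOutcome, SendError, this l2t_send signature, and the queue threshold are invented for illustration and have nothing to do with the real driver.

```rust
// Success comes in two flavors; neither one is an error.
#[derive(Debug, PartialEq)]
enum SendOutcome {
    Sent,
    Congested,
}

// A real failure is a separate type entirely.
#[derive(Debug, PartialEq)]
struct SendError;

// Hypothetical send function: `link_up` and `queue_len` stand in for
// whatever network state the real code would inspect.
fn l2t_send(link_up: bool, queue_len: usize) -> Result<SendOutcome, SendError> {
    if !link_up {
        Err(SendError)
    } else if queue_len > 8 {
        Ok(SendOutcome::Congested)
    } else {
        Ok(SendOutcome::Sent)
    }
}

// Decides whether the caller should free the packet's memory.
fn should_free_packet(result: &Result<SendOutcome, SendError>) -> bool {
    match result {
        Ok(SendOutcome::Sent) => false,
        // Congestion is not a failure: the packet is still in use.
        Ok(SendOutcome::Congested) => false,
        Err(_) => true,
    }
}
```

With an integer return code, “nonzero” silently lumps congestion in with failure. Here, the caller has to name the Congested case (or explicitly wildcard it) before the match compiles.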
Most other languages (including C++) use exceptions to manage error conditions. Exceptions work pretty well! However, they have some drawbacks. What are some downsides you can think of?
In my opinion, the biggest disadvantage is that failure modes are hard to spot. Any code can throw any exception at pretty much any time, and exceptions bubble up the stack until they’re handled (or until they reach the bottom of the stack, at which point the program will crash). That means you can call one function, and it might fail with an exception that was thrown by a totally unrelated helper function that is twelve function calls away. If you’ve worked on a large project, you have probably experienced cases where you deployed code you were happy with, but then it crashed in production, and you thought, whoops, I need to handle this exception too.
This problem compounds when you have a large codebase that is constantly evolving. Someone could modify a helper function to throw a new exception, and the code would compile fine, even if users of that function hadn’t added any new code to handle the new exception.
Exceptions are especially bad when combined with manual memory management (i.e. in C++). In handling an exception, you can forget to free memory, or accidentally double-free memory that was already freed before the exception was thrown, or you can end up with half-baked data structures if you caught an exception while initializing a struct and attempt to recover improperly.
Rust takes a different, two-pronged approach to error handling:
- If an unrecoverable error occurs – one where you think, crap, this program is a dumpster fire… – you should panic. Panics terminate the program immediately and are not meant to be caught. (Side note: it’s technically possible to catch and recover from panics, but doing so really defeats the philosophy of error handling in Rust, so it’s not advised.)
  To panic, use the panic! macro with an error message:
  if sad_times() {
      panic!("Sad times!");
  }
- If it’s possible for a recoverable error to occur, you should return a Result. If you return Result<T, E>, you can either return Ok(value of type T) or Err(value of type E).
Here’s how you can use Result:
fn poke_toddler() -> Result<&'static str, &'static str> {
if get_random_num() > 10 {
Ok("Hahahaha!")
} else {
Err("Waaaaahhh!")
}
}
fn main() {
match poke_toddler() {
Ok(message) => println!("Toddler said: {}", message),
Err(cry) => println!("Toddler cried: {}", cry),
}
}
Similar to how Rust’s compiler forces you to think about ownership up front and address any possible lifetime issues, usage of Result for error handling also forces you to think about failure modes and address all possible error conditions (at least, all error conditions that the compiler is able to find – it can’t find logic bugs in your program). This can be very annoying, but if your program compiles, you can be much more confident that you won’t hit an unexpected exception!
You will also commonly see the unwrap and expect methods being used:
// Panic if the baby cries:
let ok_message = poke_toddler().unwrap();
// Same thing, but print a more descriptive panic message:
let ok_message = poke_toddler().expect("Toddler cried :(");
If the Result was Ok, unwrap() returns the success value; otherwise, it causes a panic. expect() does the same thing, but prints the supplied error message when it panics.
It’s a good idea to avoid unwrap and expect when possible, but often, you’ll encounter errors where the most reasonable thing to do is to panic. For example, if you are writing a command-line program to process text from stdin, and reading input fails, then there isn’t really much you can do to handle that error, and it would be reasonable to panic.
use std::io;
// Read line from stdin
let mut line = String::new();
io::stdin().read_line(&mut line).expect("Failed to read from stdin");
This isn’t a Rust class, and we aren’t going to grade you based on your usage of panics vs Result. However, we do want you to think about the implications of calling unwrap/expect vs doing error checking and returning Result.
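To make that trade-off concrete, here is a sketch of the same operation written both ways. The port-parsing scenario and the function names are made up for this example; str::parse and ParseIntError are from the standard library.

```rust
use std::num::ParseIntError;

// Returns a Result: the caller decides what a bad input means.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

// Panics on bad input: simpler to call, but a typo takes down the program.
fn parse_port_or_panic(text: &str) -> u16 {
    text.trim().parse::<u16>().expect("Invalid port number")
}

// A caller that handles both cases explicitly.
fn describe(text: &str) -> String {
    match parse_port(text) {
        Ok(port) => format!("port {}", port),
        Err(e) => format!("bad input: {}", e),
    }
}
```

For a quick command-line tool, parse_port_or_panic may be perfectly reasonable. For a long-running server that parses input from the network, returning the Result lets one bad request fail without crashing everything else.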
What should I take away from this?
Error handling in Rust (and in general) is a big topic, and we could spend several lectures on it if we had the time. Sadly, we don’t. As such, we just want to give you the foundation for understanding how Rust approaches error handling, and to expose you to the motivation for why Rust decided to take this approach.
Armin will be implementing a linked list in lecture on Thursday to contextualize ownership mechanics and error handling. You’ll be able to see a practical example worked out in code; if you have questions, bring them to lecture or post on Slack!