Object Oriented Rust

Overview

Today we will be coding up a linked list together in Rust! This will be the most sophisticated piece of Rust code you’ve probably seen so far and it will help solidify your understanding of ownership, borrowing, and lifetimes while teaching you new concepts. Specifically, you will learn:

That’s a lot of new stuff! I’ll try to get through as much as possible during our 50 minute lecture. You’ll get to see these same concepts on the Week 2 exercises. Please ask questions in the Slack if you find any of this confusing!

The Goal

We want to build a LinkedList that stores unsigned integers (u32s). Later in the course we’ll discuss how to deal with generics so we can have a single linked list implementation that can store any type, but for the purposes of this lecture, u32s are all that matter. We want to support constant-time insertion and deletion at the front of the list (LIFO, i.e. a stack). Furthermore, we want our list to keep track of its size and be able to display itself.

Defining our Struct

How should we represent our linked list type? Well, it’s like the old saying goes: “behind every great linked list is a great node implementation.” So let’s start there. If you were writing a C++ linked list, your Node would contain two things: (1) the value stored in that node and (2) a pointer to the next Node. Usually the pointer to the next Node is a pointer to a chunk of heap memory. The word we’re using today for “pointer to a chunk of heap memory” in Rust is going to be Box (there are other kinds of pointers to heap memory in Rust, but we will talk about those later!). Box behaves much like std::unique_ptr does in C++. Once the Box goes out of scope, the compiler automatically inserts a call to drop, which deallocates the heap memory. That leaves no room for the programmer to malloc the wrong amount of memory, free at the wrong time, or forget to free altogether! Amazing!
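To make the Box behavior above concrete, here is a minimal sketch. The function name boxed_double is made up for illustration; only Box::new and the * dereference operator come from the standard library.

```rust
// Hypothetical helper (not part of the lecture code): puts a value on the
// heap, reads it back through the Box, and lets the Box clean up after itself.
fn boxed_double(x: u32) -> u32 {
    let b: Box<u32> = Box::new(x); // heap allocation happens here
    let doubled = *b * 2;          // dereference the Box like a pointer
    doubled
    // `b` goes out of scope at the closing brace, so its heap memory is
    // freed automatically; there is no explicit free call to forget.
}

fn main() {
    println!("{}", boxed_double(21));
}
```

Note that nothing in this function ever frees memory by hand; the cleanup is tied to the scope of b.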

Like Vec, Box has you specify what type is contained within it using the angle bracket (<>) syntax. For example, Box<u32> would be a pointer to a heap-allocated unsigned 32-bit integer. So the following seems like a reasonable candidate for our Node representation:

struct Node {
    value: u32,
    next: Box<Node>,
}

If you squint your eyes, it looks almost like C/C++. It’s just that Rust does that cute little quirky thing where it puts the type after the variable name. And if we want to indicate that this is the last node in the list, we simply make next a NULL-pointer… Hang on! We can’t have NULL-pointers in Rust! Box can’t make one for you… so what do we do? Maybe we should include an additional field in our struct that acts as a flag for whether our Node is a valid Node with a meaningful value in it or just a marker for the end of the list… Well, that’s kinda wasteful and neither clean nor idiomatic. The truth is that we don’t want next to be just a Box… we want it to be a Box or just nothing. That is, we actually want next to be an Option<Box<Node>>! The following should therefore be our struct definition:

struct Node {
    value: u32,
    next: Option<Box<Node>>, // no null pointers in Rust!
}
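As a quick sanity check of this definition, the sketch below wires up a two-node list by hand, using None to mark the end. The helper two_node_sum is hypothetical and exists only to show how Some and None take the place of a null check.

```rust
struct Node {
    value: u32,
    next: Option<Box<Node>>, // None marks the end of the list
}

// Hypothetical helper: builds the list 1 -> 2 by hand and sums its values.
fn two_node_sum() -> u32 {
    let tail = Node { value: 2, next: None };
    let head = Node { value: 1, next: Some(Box::new(tail)) };

    // Instead of a null check, we match on the Option.
    let rest = match &head.next {
        Some(node) => node.value,
        None => 0,
    };
    head.value + rest
}

fn main() {
    println!("{}", two_node_sum());
}
```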

Our actual LinkedList struct needs to maintain a pointer to the front of the list and the size of the list. Since we also need to express the situation that the list is empty, we similarly store the head of the list as an Option<Box<Node>> instead of just a Box<Node>:

pub struct LinkedList {
    head: Option<Box<Node>>,
    size: usize,
}

You’ll notice that we tacked on a pub to identify the struct as public so we can use it in different files/modules (we’ll talk about modules/privacy later in more detail, don’t worry about it too much now). Intuitively, this is because we may want to expose our linked list interface to other pieces of code but we don’t want to expose our node interface.

We’re now ready to start implementing methods on these objects!

new

We implement methods on our structs using the impl keyword:

impl Node {
    fn new(value: u32, next: Option<Box<Node>>) -> Node {
        Node {value: value, next: next}
    }
}

So far this doesn’t look very different from the function syntax you’re accustomed to. What is new is the struct initialization syntax. This is more or less the same as the struct definition syntax except instead of the type on the right hand side of the colon, you have the value you wish to place in that struct member. In Rust, it’s common practice to call your constructor new.
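One small variation worth knowing: when a local variable has the same name as the field it initializes, Rust lets you drop the repetition (the field init shorthand). The sketch below is equivalent to the constructor above, just written with that shorthand.

```rust
struct Node {
    value: u32,
    next: Option<Box<Node>>,
}

impl Node {
    fn new(value: u32, next: Option<Box<Node>>) -> Node {
        // Shorthand for Node { value: value, next: next }
        Node { value, next }
    }
}

fn main() {
    let n = Node::new(7, None);
    println!("{} {}", n.value, n.next.is_none());
}
```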

Similarly, we begin our LinkedList implementation as follows:

impl LinkedList {
    pub fn new() -> LinkedList {
        LinkedList {head: None, size: 0}
    }
}

Our linked list is not very exciting yet, but that’s about to change.

is_empty and get_size

Ok, before it gets too exciting, let’s reap some low-hanging fruit so we can acquaint ourselves with more of Rust’s object oriented syntax. Let’s start by implementing a getter for the size.

pub fn get_size(&self) -> usize {
    self.size
}

So this looks like most of the functions you’ve seen before, but now there’s this mysterious self parameter all of a sudden. Like in Python, the self parameter refers to the current object that get_size operates on. Unlike Python, this self needs to situate itself within our notions of ownership and borrowing (like any other parameter we’d pass into a function). You’ll notice here that we’re using a shared/immutable borrow of self, as indicated by the &. This is essentially a promise that we won’t mutate any state associated with our struct (this isn’t entirely correct since there is such a thing as interior mutability in Rust, but we’ll talk about that later; I’m just mentioning it here for completeness. If that confuses you more, forget you even read this). We access struct members using the . operator (just like in C/C++). Hang on though… isn’t &self a pointer/reference? It sure is, and the following code does exactly the same thing as the snippet before:

pub fn get_size(&self) -> usize {
    (*self).size
}

When you use the . operator on a pointer, Rust will dereference the pointer as many times as it needs to in order to get to the actual struct. So it’s considered more idiomatic to not manually dereference as we did above and use the . operator directly.
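To see the “as many times as it needs to” part in action, here is a small sketch with a reference to a reference. Counter and both functions are made-up names for illustration.

```rust
// Hypothetical struct standing in for LinkedList.
struct Counter {
    size: usize,
}

// The . operator looks through both layers of reference for us.
fn size_auto(c: &&Counter) -> usize {
    c.size
}

// The same access with every dereference spelled out.
fn size_manual(c: &&Counter) -> usize {
    (**c).size
}

fn main() {
    let counter = Counter { size: 3 };
    let r = &counter;
    println!("{} {}", size_auto(&r), size_manual(&r));
}
```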

Not surprisingly, we can implement is_empty as follows:

pub fn is_empty(&self) -> bool {
    self.size == 0
}

Yes, we could have also checked to see if the head was None. We can even call our handy-dandy get_size function if we’d like:

pub fn is_empty(&self) -> bool {
    self.get_size() == 0
}

Make sense? You may now be wondering how we’d take a mutable borrow to self and if you guessed &mut self you’re 100% right and we’ll see that next.

push

So now things are going to get really exciting because we can start putting things in our list! The implementation below seems like a reasonable attempt:

pub fn push(&mut self, value: u32) {
    let new_node = Box::new(Node::new(value, self.head));
    self.head = Some(new_node);
    self.size += 1;
}

This has a lot going for it – we’re incrementing the size by 1, that looks good. We are allocating a new node on the heap using the Box::new function. However, the borrow checker is clearly less than happy with us:

error[E0507]: cannot move out of borrowed content
  --> src/main.rs:49:50
   |
49 |         let new_node = Box::new(Node::new(value, self.head));
   |                                                  ^^^^ cannot move out of borrowed content

Recall that our Node::new function takes ownership of the second parameter. It kinda needs to here since we don’t want to store references to an Option<Box<Node>>: how would we know that the reference will live long enough? Ok, so how do we take an &mut self and take ownership of something inside of it? We’ll have to introduce a new function: take(). When you call take() on a mutable reference to an Option, it will give you ownership over the data contained within the Option and leave None in its place. Therefore, we must write push as follows:

pub fn push(&mut self, value: u32) {
    let new_node = Box::new(Node::new(value, self.head.take()));
    self.head = Some(new_node);
    self.size += 1;
}
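If take() still feels a bit magical, the sketch below isolates it from the linked list. Holder and steal are hypothetical names; Option::take is the real standard library method.

```rust
// Hypothetical struct: one optional, non-Copy field.
struct Holder {
    item: Option<String>,
}

// We only have a mutable borrow of the Holder, so we cannot move `h.item`
// out directly. take() hands us the contents and leaves None behind.
fn steal(h: &mut Holder) -> Option<String> {
    h.item.take()
}

fn main() {
    let mut h = Holder { item: Some(String::from("hi")) };
    let first = steal(&mut h);  // we now own the String
    let second = steal(&mut h); // nothing left to take
    println!("{:?} {:?} {:?}", first, second, h.item);
}
```

The Holder is always left in a valid state (it holds None), which is why the borrow checker is happy with this and not with a plain move.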

pop

The following is one possible implementation of pop. (The ? syntax unwraps the Option if it’s not None and binds the contents to node; otherwise, the function returns None right away.)

pub fn pop(&mut self) -> Option<u32> {
    let node = self.head.take()?;
    self.head = node.next;
    self.size -= 1;
    Some(node.value)
}
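Putting the pieces so far together, here is a self-contained sketch of the stack that you can compile and run. The method pop_explicit is not part of the lecture code; it is a hypothetical twin of pop that spells out with a match what the ? operator does for us.

```rust
struct Node {
    value: u32,
    next: Option<Box<Node>>,
}

pub struct LinkedList {
    head: Option<Box<Node>>,
    size: usize,
}

impl LinkedList {
    pub fn new() -> LinkedList {
        LinkedList { head: None, size: 0 }
    }

    pub fn get_size(&self) -> usize {
        self.size
    }

    pub fn push(&mut self, value: u32) {
        let new_node = Box::new(Node { value, next: self.head.take() });
        self.head = Some(new_node);
        self.size += 1;
    }

    pub fn pop(&mut self) -> Option<u32> {
        let node = self.head.take()?; // returns None early if the list is empty
        self.head = node.next;
        self.size -= 1;
        Some(node.value)
    }

    // Hypothetical variant: the same logic without the ? operator.
    pub fn pop_explicit(&mut self) -> Option<u32> {
        let node = match self.head.take() {
            Some(node) => node,
            None => return None,
        };
        self.head = node.next;
        self.size -= 1;
        Some(node.value)
    }
}

fn main() {
    let mut list = LinkedList::new();
    list.push(1);
    list.push(2);
    list.push(3);
    // Last in, first out: 3 comes back first.
    println!("{:?}", list.pop());
    println!("{:?}", list.pop_explicit());
    println!("size is now {}", list.get_size());
}
```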

display

Iterating over linked lists in Rust is not too different from what you’d do in C/C++. Note here how we use a match expression to check whether the Option holds a node.

pub fn display(&self) {
    let mut current: &Option<Box<Node>> = &self.head;
    let mut result = String::new();
    loop {
        match current {
            Some(node) => {
                result = format!("{} {}", result, node.value);
                current = &node.next;
            },
            None => break,
        }
    }
    println!("{}", result);
}

You might not have seen format! before – it’s a macro that works just like println! but it spits out a String instead of printing to standard out.
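Here is format! on its own, used the same way display uses it. The function join_values is a hypothetical stand-in that walks a slice instead of a list. One thing this makes visible: because the first call formats an empty string followed by a space, the result starts with a leading space.

```rust
// Hypothetical helper: accumulates values into a String the same way
// display does, but over a slice rather than a linked list.
fn join_values(values: &[u32]) -> String {
    let mut result = String::new();
    for v in values {
        // format! returns a new String instead of printing it.
        result = format!("{} {}", result, v);
    }
    result
}

fn main() {
    let s = join_values(&[1, 2, 3]);
    println!("{:?}", s); // the {:?} form shows the quotes and the leading space
}
```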

Enhancing our Code with Traits (if time permits)

How do I tell Rust’s type system that my LinkedList type has special powers? For instance, how can I tell Rust that my list knows how to print/display itself? We can do this by implementing the Display trait:

use std::fmt;

impl fmt::Display for LinkedList {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut current: &Option<Box<Node>> = &self.head;
        let mut result = String::new();
        loop {
            match current {
                Some(node) => {
                    result = format!("{} {}", result, node.value);
                    current = &node.next;
                },
                None => break,
            }
        }
        write!(f, "{}", result)
    }
}

This way, we no longer need to call the display function. If we have a list named list and we want to print it, we can simply say println!("{}", list) and Rust will know what to do because we implemented the Display trait (this is similar to interfaces in Java if you’ve seen those before). We will talk about traits in far more detail next week. This is just a sneak peek.
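For a version you can paste into a file and run, here is a sketch of the whole thing with Display wired up. It is a variation on the code above rather than a copy: instead of building an intermediate String, it writes each value straight into the formatter with write!, which should produce the same text. Implementing Display also gives us to_string() for free.

```rust
use std::fmt;

struct Node {
    value: u32,
    next: Option<Box<Node>>,
}

pub struct LinkedList {
    head: Option<Box<Node>>,
    size: usize,
}

impl LinkedList {
    pub fn new() -> LinkedList {
        LinkedList { head: None, size: 0 }
    }

    pub fn get_size(&self) -> usize {
        self.size
    }

    pub fn push(&mut self, value: u32) {
        let new_node = Box::new(Node { value, next: self.head.take() });
        self.head = Some(new_node);
        self.size += 1;
    }
}

impl fmt::Display for LinkedList {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut current = &self.head;
        loop {
            match current {
                Some(node) => {
                    // Write directly into the formatter; ? forwards any error.
                    write!(f, " {}", node.value)?;
                    current = &node.next;
                }
                None => break,
            }
        }
        Ok(())
    }
}

fn main() {
    let mut list = LinkedList::new();
    list.push(1);
    list.push(2);
    list.push(3);
    println!("{} (size {})", list, list.get_size());
    let as_string = list.to_string(); // available because of Display
    println!("{:?}", as_string);
}
```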

We can also override the default drop implementation by implementing the Drop trait for our type as follows:

impl Drop for LinkedList {
    fn drop(&mut self) {
        let mut current = self.head.take();
        while let Some(mut node) = current {
            current = node.next.take();
        }
    }
}

(adapted from here)

It is favorable to override drop in this way for efficiency reasons: the default drop works recursively, node by node, so a very long list can overflow the stack; read more here. This also showcases the fancy while let Some syntax; the same kind of pattern works in if statements as if let. We could have used this above in our implementation of display/fmt.
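As a sketch of both points, the pared-down list below (no size field, to keep it short) uses while let for traversal in a hypothetical sum method and keeps the iterative Drop from above. The list length of 100,000 in main is an arbitrary choice for illustration.

```rust
struct Node {
    value: u32,
    next: Option<Box<Node>>,
}

pub struct LinkedList {
    head: Option<Box<Node>>,
}

impl LinkedList {
    pub fn new() -> LinkedList {
        LinkedList { head: None }
    }

    pub fn push(&mut self, value: u32) {
        let new_node = Box::new(Node { value, next: self.head.take() });
        self.head = Some(new_node);
    }

    // Hypothetical method: the loop/match traversal rewritten with while let.
    pub fn sum(&self) -> u64 {
        let mut total = 0u64;
        let mut current = &self.head;
        while let Some(node) = current {
            total += u64::from(node.value);
            current = &node.next;
        }
        total
    }
}

impl Drop for LinkedList {
    fn drop(&mut self) {
        // Unlink one node at a time so each node is dropped with next == None
        // and no chain of nested drop calls builds up.
        let mut current = self.head.take();
        while let Some(mut node) = current {
            current = node.next.take();
        }
    }
}

fn main() {
    let mut list = LinkedList::new();
    for i in 0..100_000u32 {
        list.push(i);
    }
    println!("{}", list.sum());
    // `list` is dropped here, one node per loop iteration.
}
```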

We can also define traits on our LinkedList that allow us to iterate over them with ease. We might see this as an example next week when we cover traits in more depth.

Ok, that was a lot of new material in 50 minutes! If it doesn’t make sense yet don’t worry. It’ll sink in with more practice and as you read and write more code.