Software! Math! Data! The blog of R. Sean Bowman
June 04 2016

Resource “ownership” in software is tricky. Poorly thought-out ownership can lead to issues with memory management, lifetime management, and other problems that make you want to bang your head on the desk. In a language like C or C++, you were traditionally on your own, responsible for keeping track of ownership and lifetimes without help from the compiler or language itself. But these days, things are changing for the better.

Ownership in C++ and Rust

My colleague recently taught me a cool trick. I want to remember it (and share it!) so I’m writing it down. But it strikes me that his very simple, elegant solution to this problem is related to the type system of Rust and its cousins. So it seems like there’s really something going on here. Hmm.

For our purposes “ownership” refers to some object, function, or bit of code having access to a resource. Ownership doesn’t have to be exclusive; several objects may hold a reference to some other object, for example. We have a system that wants to pass an object from one part of the system to another, but we have some constraints (or at least desires):

  • The objects should have exactly one owner. Perhaps the objects aren’t thread safe and we’re moving them across thread boundaries. Whatever the reason, I believe that this requirement is not uncommon.

  • We’d like a way to explicitly pass ownership of the objects.

  • Further, it would be nice if this passing of ownership were reified in the code in such a way that it’s obvious that the caller is giving up access to the object. At the very least, this reminds us (as programmers) not to use the object any more.

It turns out that this is pretty much the way Rust works by default:

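struct X {
    val: i32,
}

fn foo(x: X) {
    // print stuff from x, say
    println!("foo got val {}", x.val);
}

fn main() {
    let x = X { val: 0 };
    foo(x); // x is moved into foo here
    println!("hey val {}", x.val); // compile error: x was moved
}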

This code will not compile because ownership of x is transferred to foo, and the last line then tries to print from an object we have already relinquished. The compiler catches this at compile time. Wow! That really solves the problem above, because not only can we “reify” the transfer (ownership is transferred by default, even), the compiler won’t even let us use the object in a potentially unsafe way.

C++, rvalue references, and so on

The situation is a bit different in C++, but way better than a decade or so ago. My colleague’s solution to the problem above was to create an API in which objects are passed by rvalue reference. I don’t want to go into the details about what exactly these are; here’s one good explanation and of course Scott Meyers also has a very good explanation. For our purposes, these are a device that lets us differentiate lvalues from rvalues. Here’s what I mean:

void foo(X&& x) {
    // do stuff with x
}

How does this function behave from a caller’s perspective? Here’s the cool thing: we can call it with a temporary. We can call it by applying std::move to an object of type X. But we can’t pass it an lvalue! Note that this satisfies many of the criteria above: temporaries can’t be held by several threads at once because, you know, they’re temporary. If we use std::move to pass an object, we have to do so deliberately, and in the process we remind ourselves that we are in fact relinquishing control of the object. Finally, if we try to pass a named object to the other side without std::move, keeping control of it ourselves, we won’t be able to (because it won’t compile).

Here’s a summary of what we can do, supposing that X is a type with an operator+:

X y{1};
foo(X{7}); // legal; this creates a temporary
foo(X{1} + y); // legal; this is a temporary as well
foo(y); // NOT ALLOWED! y is an lvalue, so this does not compile
foo(std::move(y)); // legal; we explicitly give up y
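To check these rules without tripping over a real compile error, here’s a minimal self-contained sketch. It assumes C++17 (for std::is_invocable_v), and the struct X, its operator+, the int return value of foo, and the helper call_all_legal_forms are all hypothetical stand-ins made up for illustration.

```cpp
// Requires C++17 (std::is_invocable_v, static_assert without a message).
#include <type_traits>
#include <utility>

// Hypothetical stand-in for X: a single int plus an operator+.
struct X {
    int val;
};

X operator+(const X& a, const X& b) { return X{a.val + b.val}; }

// The sink from the text, except that it returns the value it received
// so a caller can observe which object arrived.
int foo(X&& x) { return x.val; }

using Foo = decltype(&foo);

// Temporaries and std::move'd objects are rvalues, so these calls are accepted.
static_assert(std::is_invocable_v<Foo, X>);
static_assert(std::is_invocable_v<Foo, X&&>);

// Named objects are lvalues, so these calls are rejected.
static_assert(!std::is_invocable_v<Foo, X&>);
static_assert(!std::is_invocable_v<Foo, const X&>);

// Hypothetical helper that exercises the three legal call forms.
int call_all_legal_forms() {
    X y{1};
    int total = 0;
    total += foo(X{7});          // temporary
    total += foo(X{1} + y);      // the result of operator+ is a temporary
    total += foo(std::move(y));  // explicit hand-off of y
    return total;
}
```

The static_asserts turn “this call would not compile” into something the compiler confirms quietly, so the file builds while still documenting that lvalues are rejected.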

For me, it’s a big deal that we’re starting to get legitimate ownership support from the language and compiler itself. This frees us up to worry about other problems. Ownership issues aren’t going away any time soon, even if you drink the Kool-Aid of immutability everywhere (or some more limited doctrine). With lots and lots of cores becoming the norm, reasoning about code being as difficult as ever, and performance always lurking as a concern, it’s important that we offload to software some of the work we’ve traditionally done by convention and standard.
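One caveat about how far that support goes in C++: std::move is only a cast, and unlike Rust, the compiler will not reject later uses of an object we have moved from. Here’s a small sketch of what that looks like, assuming C++14 (for std::make_unique). The Job type and the functions consume and caller_is_empty_after_handoff are hypothetical names invented for this example. It uses std::unique_ptr because the standard guarantees that a moved-from unique_ptr is null, which makes the leftover state easy to observe.

```cpp
#include <memory>
#include <utility>

// Hypothetical payload type for illustration.
struct Job {
    int id;
};

// A sink that really takes ownership: it moves the pointer into a local,
// so the Job is destroyed when consume returns.
int consume(std::unique_ptr<Job>&& job) {
    std::unique_ptr<Job> mine = std::move(job);
    return mine->id;
}

// Hypothetical helper: hands a Job to consume and reports whether the
// caller's pointer is empty afterwards.
bool caller_is_empty_after_handoff() {
    auto job = std::make_unique<Job>(Job{42});
    consume(std::move(job));
    // Unlike Rust, C++ still lets us name `job` here. A moved-from
    // unique_ptr is guaranteed to be null, so this comparison is well
    // defined, but dereferencing `job` at this point would be a bug.
    return job == nullptr;
}
```

So the rvalue-reference API makes the hand-off visible at the call site, but remembering not to touch the object afterwards is still up to us (or to a linter).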

Spatiotemporal metaphors for lifetime

Speaking of ownership and lifetime, one of Rust’s ancestors is a cool language called Cyclone. It was built on C and included lots of features intended to make the language safer while still maintaining performance. One of those features was “regions”, a way of allocating memory in pools that could be freed at specific points in the program. There were “lexical regions” that acted like stack allocated memory, but also more complex regions that were programmer controlled. It strikes me that the spatial metaphor (“region”) makes a lot of sense in many cases: I find myself writing lots of C++ these days that uses unnamed scopes. (Anonymous scopes? I’m not sure what the term is…)

void bar(int i) {
    // something here
    {
        RaiiThingamajob _(x, y, z);
        // do stuff
    } // RaiiThingamajob dtor called here
    // possibly do more stuff
}

It’s clear here that the lifetime of the RaiiThingamajob is that of the scope it’s in, which is to say that it lasts as long as the region of code between the brackets. We think of executing the program as proceeding in time line by line (possibly with jumps), but there is the other structure of physical regions delimited in C++ by curly brackets. It makes sense that we could choose either a spatial or temporal point of view in thinking about lifetime issues.
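To watch the scope boundary in action, here’s a runnable sketch along the same lines. ScopeLogger is a hypothetical RAII type made up for this example; it records a line in a log when it is constructed and another when it is destroyed, and this version of bar returns that log so the ordering can be inspected.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical RAII type that records when it is constructed and destroyed.
class ScopeLogger {
public:
    ScopeLogger(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("enter " + name_);
    }
    ~ScopeLogger() { log_.push_back("leave " + name_); }

    // Copying a guard would record the same scope twice, so forbid it.
    ScopeLogger(const ScopeLogger&) = delete;
    ScopeLogger& operator=(const ScopeLogger&) = delete;

private:
    std::vector<std::string>& log_;
    std::string name_;
};

std::vector<std::string> bar() {
    std::vector<std::string> log;
    log.push_back("before block");
    {
        ScopeLogger guard(log, "block");
        log.push_back("inside block");
    }  // guard's destructor runs here, at the closing brace
    log.push_back("after block");
    return log;
}
```

The “leave block” entry lands between “inside block” and “after block”, which is the closing brace doing its job.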

Now, you might object to some of the above on the grounds that “region” refers to a chunk of memory in a region based memory management scheme, and, you know, you’d be right that it does. But I still think there’s a relationship there, particularly in light of Cyclone’s “lexical” regions, which really do correspond to blocks of code. In any case, whether the “region” refers to a block of code or a block of memory, it’s a spatial metaphor for resource lifetime.
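For a rough C++ analogue of a lexical region (this is not Cyclone’s mechanism, just a similar shape), here’s a sketch using the C++17 std::pmr facilities. The function sum_in_region is a hypothetical name for illustration. The monotonic_buffer_resource plays the part of the region: everything allocated from it is released in one go when the enclosing block ends.

```cpp
// Requires C++17 (<memory_resource>).
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical helper: builds the numbers 1..n inside a block-scoped
// "region" and returns their sum.
std::size_t sum_in_region(std::size_t n) {
    std::size_t total = 0;
    {
        // The region: a block of memory whose lifetime is this block of code.
        std::pmr::monotonic_buffer_resource region;
        std::pmr::vector<std::size_t> values(&region);
        for (std::size_t i = 1; i <= n; ++i) {
            values.push_back(i);
        }
        for (std::size_t v : values) {
            total += v;
        }
    }  // values is destroyed first, then region releases all its memory
    return total;
}
```

Here the block of code and the block of memory really do coincide, which is the overlap between the two senses of “region” described above.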

The idea of regions got mixed up in a big pot and stirred together with a bunch of other stuff and out came Rust. They’re not exactly comparable, but Rust’s lifetimes are a bit like Cyclone’s regions. They tell us “how long” a resource will live, which might (but doesn’t have to) correspond to some particular section of code. Thus the spatial metaphor used in Cyclone became a temporal one in Rust. Rust’s lifetime system is intimately connected to ownership, since it’s used to prove that a resource is never released before it’s done being used by anything that might have access to it.

Conclusion

Issues of ownership, lifetime, and resource management are at the core of software development. Traditionally, they were handled in the heads of programmers, by using conventions, following informal rules, and talking at the water cooler often. Thankfully these ideas are beginning to show up in the languages and compilers we use in a way that allows us to enforce stronger and more interesting invariants than we could before.

There’s also a theme in computer science of time and space: time-space tradeoffs, temporal and spatial locality of reference, and so on. Cyclone used a spatial metaphor for its lifetime management; Rust uses a temporal one. I have a feeling that there’s something deeper lurking here and I just don’t quite understand it yet… As always, there’s much more to the story than I have time to learn and write about, but, as always, very interesting stuff!

Addenda

Please note that I’ve left out so many important and cool things in this post, things like linear and affine types. These have been studied in the programming language and logic community for decades. Languages like ATS have supported fascinating mechanisms to try to address lifetime and ownership issues. I’m sure I’ve left a ton of other stuff out, too!

I’d like to thank my colleague for sharing his knowledge on the C++ trick above – thanks!
