Software! Math! Data! The blog of R. Sean Bowman
April 02 2016

This post is my self-flagellation for spending quite a bit of time tracking down a problem in some code I was writing. The error was tricky, but I should have known better, and afterward I made the connection with some other tricky bits I knew about in different languages. Let’s look at the problem, which involves environments and closures (and probably other stuff, too!).

Some Python

The first place I saw this problem was in Python: suppose you create a bunch of functions in a loop, something like this:

def loopy(n):
    funcs = []
    for i in range(n):
        funcs.append(lambda x: x*i)
    return funcs

Nothing too weird about that code, as long as you’re used to creating functions and so forth. The problem, as it were, or rather the thing that is counterintuitive at first, is that you end up with a list of functions that all behave the same. They all compute their argument times n - 1 (the last value i takes) instead of their argument times i, where i is the function’s index in the list.

It’s not too hard to see why upon reflection: the lambda there creates a closure, capturing its environment (and the variable i in particular). But i is mutated, and by the time we get around to calling any of our functions, the value of i is n - 1, the last value it took in the loop, not whatever it was when the function was created.
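
To see the effect concretely, here is a small sketch that builds the functions in a loop and then calls every one of them. The name loopy_late and the values 4 and 10 are just picked for illustration.

```python
def loopy_late(n):
    """Build n lambdas in a loop; each closes over the same variable i."""
    funcs = []
    for i in range(n):
        # i is looked up when the lambda is called, not when it is created
        funcs.append(lambda x: x * i)
    return funcs


funcs = loopy_late(4)
results = [f(10) for f in funcs]
print(results)  # [30, 30, 30, 30]: every function sees the final i, which is 3
```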

There are several solutions to the problem. Python evaluates default arguments once, at the time the function is defined, so binding i as a default argument captures its current value, and this works the way we expected the above to:

def loopy(n):
    funcs = []
    for i in range(n):
        funcs.append(lambda x, i=i: x*i)
    return funcs
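
Two other fixes in the same spirit, sketched under the assumption that you would rather not add an extra parameter to the lambda: a small factory function, or functools.partial. The names make_multiplier, loopy_factory and loopy_partial are made up for this example.

```python
from functools import partial
from operator import mul


def make_multiplier(i):
    """Return a function that multiplies by i; i is a fresh local on every call."""
    return lambda x: x * i


def loopy_factory(n):
    funcs = []
    for i in range(n):
        funcs.append(make_multiplier(i))
    return funcs


def loopy_partial(n):
    funcs = []
    for i in range(n):
        # partial stores the value i has right now, at the time partial() is called
        funcs.append(partial(mul, i))
    return funcs


print([f(10) for f in loopy_factory(4)])  # [0, 10, 20, 30]
print([f(10) for f in loopy_partial(4)])  # [0, 10, 20, 30]
```

Both avoid the trap for the same reason the default argument does: the value is pinned down once per iteration instead of being looked up later.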

And of course here it would be better to just go completely functional:

funcs = list(map(lambda i: lambda x: x * i, range(n)))

(Call me old school, but I still love map, filter, and reduce. They’re my toys, and I’m not giving them away. Comprehensions have their place, but they’re no substitute for the trifecta of awesomeness.)
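
One thing worth checking if you do reach for a comprehension: a bare lambda inside it has exactly the same late-binding behavior as the loop, while the nested-lambda form does not. A minimal sketch, with n = 4 chosen arbitrarily:

```python
n = 4

# All four lambdas share the comprehension's single variable i,
# so they all see its final value.
shared = [lambda x: x * i for i in range(n)]

# Each call of the outer lambda gets its own parameter i,
# so each inner lambda closes over a different value.
separate = list(map(lambda i: lambda x: x * i, range(n)))

shared_results = [f(10) for f in shared]
separate_results = [f(10) for f in separate]
print(shared_results)    # [30, 30, 30, 30]
print(separate_results)  # [0, 10, 20, 30]
```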

Some JavaScript

The same situation obtains in JavaScript:

function loopy(n) {
    var funcs = [];
    for (var i = 0; i < n; ++i) {
        funcs.push(x => x*i);
    }
    return funcs;
}

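We end up with a list of functions that all behave the same, because var gives the whole function a single i and the loop mutates it; by the time any of them is called, i is n. Here I’m in a little less familiar territory, and the first thing I came up with to fix the problem is this: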

function loopy(n) {
    var funcs = [];
    for (var i = 0; i < n; ++i) {
        funcs.push((i => (x => x*i))(i));
    }
    return funcs;
}
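
For comparison, the immediately-invoked wrapper translates directly back to Python. This is only a sketch to show that the trick is not specific to one language; loopy_iife is a made-up name.

```python
def loopy_iife(n):
    funcs = []
    for i in range(n):
        # The outer lambda is called right away with the current i, so the
        # inner lambda closes over that call's own parameter, not the loop's i.
        funcs.append((lambda i: (lambda x: x * i))(i))
    return funcs


iife_results = [f(10) for f in loopy_iife(4)]
print(iife_results)  # [0, 10, 20, 30]
```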

Again, using a functional style is probably better. Using lodash or something similar, we can write

var funcs = _.range(n).map(i => (x => x * i));

The actual problem: C++

So it turns out that I know to avoid creating closures in loops in Python. I’ve never encountered that in JavaScript, so who knows. Maybe I’d make that mistake there. In any case, I was writing in C++, using std::thread, when I made this mistake. I had something like the following code:

const std::size_t n_threads = 4;
std::vector<std::thread> threads;
std::vector<std::size_t> counter(n_threads, 0);

for (std::size_t i = 0; i < n_threads; ++i) {
    threads.emplace_back([&]() {
        // do stuff here, probably in a loop
        counter[i]++;
    });
}

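Do you see the problem? It’s the same as above: in C++, the [&]() {...} syntax means create an anonymous function, capturing the variables in its environment by reference (thus the & in the capture list, as it’s called). Thus i is captured by reference, and since the loop keeps mutating it, each thread sees whatever value i happens to have when the thread gets around to reading it, not the value it had when the thread was created. Several threads can end up incrementing the same element of counter, or indexing one past its end once i reaches n_threads. It’s actually worse than the Python and JavaScript versions: reading i while the loop increments it is a data race, and i no longer exists once the loop ends, so the behavior is undefined. The solution is especially simple in C++: we just tell the compiler to capture i by value.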

for (std::size_t i = 0; i < n_threads; ++i) {
    threads.emplace_back([&, i]() {
        // do stuff here, probably in a loop
        counter[i]++;
    });
}
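
The same shape in Python, as a rough sketch rather than a translation of my actual code: instead of letting the thread body close over the loop variable, pass the index in explicitly. The names N_THREADS and work are made up here.

```python
import threading

N_THREADS = 4
counter = [0] * N_THREADS


def work(index):
    # index is this thread's own copy of the loop value
    counter[index] += 1


threads = []
for i in range(N_THREADS):
    # args=(i,) hands the current value of i to the thread when it is created,
    # much like listing i by value in the C++ capture list.
    t = threading.Thread(target=work, args=(i,))
    threads.append(t)
    t.start()

# Wait for every thread to finish before reading the results.
for t in threads:
    t.join()

print(counter)  # [1, 1, 1, 1]
```

Each thread touches only its own slot, so there is nothing shared to race on.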

Aaargh. I nearly pounded my head into the desk over that one, and it’s the exact thing that I would raise an eyebrow over if I saw it in Python.

The takeaway

So what’s the takeaway, if there is one? I guess it’s important to know exactly what you’re closing over, how you intend to use it, what the lifetimes of the various things are, and so forth. Maybe it’s better to be explicit about what you’re using, to write it all down in the capture list. Truthfully I was too lazy to do it in this case, and it bit me. Maybe (maybe!) I’ll be more careful in the future.
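
Python has no capture list, but you can at least ask a function what it closed over. A small sketch using the __closure__ and __code__ attributes on the loop-built lambdas from the Python section:

```python
def loopy(n):
    funcs = []
    for i in range(n):
        funcs.append(lambda x: x * i)
    return funcs


funcs = loopy(3)

# The names each lambda closes over
free_vars = funcs[0].__code__.co_freevars
print(free_vars)  # ('i',)

# Each lambda has one closure cell (for i) ...
cells = [f.__closure__[0] for f in funcs]
# ... and it is the very same cell object for all of them.
same_cell = all(c is cells[0] for c in cells)
print(same_cell)  # True

# The shared cell holds the last value i took in the loop.
print(cells[0].cell_contents)  # 2
```

Seeing one shared cell instead of three separate values is the whole bug in a nutshell.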

In the meantime, I hope that you don’t create this problem for yourself! Happy coding, and good luck.
