Software! Math! Data! The blog of R. Sean Bowman
The blog of R. Sean Bowman
June 14 2015

In the previous post, we looked a some advantages of STM in the context of a bank account transaction. Let’s tackle a famous problem having to do with concurrent systems, the Dining Philosopher’s Problem. Recall that in this problem a group of philosophers sit at a round table. Between each of them is a fork. In order to eat their meal, the philosophers need both forks, so each tries to grab one fork and then the other. If they can do so, they eat for a bit and put both forks down.

The problem is that since there are only as many forks as philosophers, it’s possible that each person grabs the fork on their left and then waits for the fork on their right to become available. Everybody is waiting on everybody else! (These philosophers aren’t very smart, it seems.) This situation is called deadlock, a condition of the system where no progress can be made. It’s bad, people spend lots of time trying to avoid it and prove that it can’t happen, but unfortunately is a tricky problem to solve.

How can we solve this problem with STM? We’ll create a Philosopher class first:

using Fork = int;

class StmPhilosopher {
    int eaten;
    Fork& left_fork, &right_fork;

    bool get_fork(Fork& f) {
        if (f == 0) {
            f = 1;
            return true;
        } else {
            return false;
        }
    }

public:
    string name;

    StmPhilosopher(string name, Fork& left, Fork& right) :
        eaten(0), left_fork(left), right_fork(right),
        name(name) {}

    bool get_left_fork() {
        return get_fork(left_fork);
    }

    bool get_right_fork() {
        return get_fork(right_fork);
    }

    void release_forks() {
        left_fork = 0;
        right_fork = 0;
    }

    void eat(int amount) {
        eaten += amount;
    }
};

We need to initialize the philosophers with the correct forks, in this case represented by integers. Note that the last fork wraps around the table; we use modular arithmetic to ensure that the last philosopher holds forks[n_threads - 1] as well as forks[0].

constexpr int n_threads = 5;
constexpr int n_courses = 100000;

int main(void) {
    thread thds[n_threads];
    Fork forks[n_threads] = {0};
    vector<StmPhilosopher> stm_philosophers;
    string names[] = {"Alice", "Bob", "Charlie",
                      "Danielle", "Evan"};

    for (int i = 0; i < n_threads; ++i)
        stm_philosophers.push_back(
            StmPhilosopher(names[i], forks[i],
                forks[(i+1) % n_threads]));

    for (int i = 0; i < n_threads; ++i)
        t = thread(thread_eat_stm_func,
                   std::ref(stm_philosophers[i]),
                   n_courses);

    for (auto& t : thds)
        t.join();

    return 0;
}

All that remains is to write the function thread_eat_stm_func which is executed by the threads. Here it is:

void thread_eat_stm_func(StmPhilosopher& p, int n_courses) {
    for (int i = 0; i < n_courses; ++i) {
        auto eaten = false;
        do {
            __transaction_atomic {
                if (p.get_left_fork() && p.get_right_fork()) {
                    p.eat(1);
                    eaten = true;
                    p.release_forks();
                }
            }
        } while (!eaten);
    }
}

Again, the solution to the goofy problem of each diner reaching for the fork on their left is simply to ensure that the operation of grabbing the two forks is atomic. If a diner can’t get both forks right away, they try again until eventually they’re able to eat. Note how simple the solution is: it doesn’t require acquiring resources in any order, there are no locks to forget to handle correctly, and there is exactly one use of an atomic transaction.

I’d like to start looking at more traditional locking in future posts, so let’s see a solution to this problem using locks. Edsger Dijkstra, who invented some modern concurrency constructs like semaphores, originally came up with the problem, and I believe that his solution looked something like this:

class LockingPhilosopher {
    int eaten;
    mutex& left_m;
    mutex& right_m;

public:
    string name;

    LockingPhilosopher(string name, mutex& left_m,
        mutex& right_m) : eaten(0),
        left_m(left_m), right_m(right_m), name(name) {}

    void get_left_fork() {
        left_m.lock();
    }
    void get_right_fork() {
        right_m.lock();
    }
    void release_forks() {
        left_m.unlock();
        right_m.unlock();
    }
    void eat(int amount) {
        eaten += amount;
    }
};

Note that now forks are represented by mutexes. Dijkstra’s insight was to acquire the locks in a specific order: each diner grabs the lower numbered fork first, and then the higher numbered one. If \(n-1\) of the diners pick up the lowest numbered fork, then the remaining diner cannot pick up any fork. By the pigeon hole principle, some diner is able to eat, and an argument along those lines shows that there is never a deadlock. Here is the initialization code and thread function:

int main(void) {
    thread thds[n_threads];
    mutex mtx[n_threads];
    vector<LockingPhilosopher> philosophers;
    string names[] = {"Alice", "Bob", "Charlie",
                      "Danielle", "Evan"};

    for (int i = 0; i < n_threads - 1; ++i)
        philosophers.push_back(
            LockingPhilosopher(names[i], mtx[i],
                               mtx[(i+1) % n_threads]));
    philosophers.push_back(
        LockingPhilosopher(names[n_threads - 1], mtx[0],
                           mtx[n_threads - 1]));

    for (int i = 0; i < n_threads; ++i)
        thds[i] = thread(thread_eat_lock_func,
                        std::ref(philosophers[i]),
                        n_courses,
                        std::ref(logger));
    for (auto& t : thds)
        t.join();
    return 0;
}

void thread_eat_lock_func(LockingPhilosopher& p,
                          int n_courses) {
    for (int i = 0; i < n_courses; ++i) {
        p.get_left_fork();
        p.get_right_fork();
        p.eat(1);
        p.release_forks();
    }
}

Note that we arranged for the left fork the be the lower numbered one (in terms of its index in the array of mutexes), so grabbing the left fork first is always correct. Note how many locks there are in this version of the problem, and consider how it may be difficult to keep track of the order in which they’re acquired. In my opinion, the STM version is easier to understand.

I haven’t talked much about performance yet, because, frankly, the STM performance isn’t that great right now. (Of course, such things are often due to operator error!) However, this is one case where the STM version performs about as fast as the one using std::mutex. With some hardware support and a little more thought, it seems reasonable to bet that STM performance will be more than good enough considering the benefits it offers.

Approx. 941 words, plus code