How would you implement your own reader/writer lock in C++11?

jack picture jack · Aug 20, 2012 · Viewed 46.3k times · Source

I have a set of data structures I need to protect with a readers/writer lock. I am aware of boost::shared_lock, but I would like to have a custom implementation using std::mutex, std::condition_variable and/or std::atomic so that I can better understand how it works (and tweak it later).

Each data structure (moveable, but not copyable) will inherit from a class called Commons which encapsulates the locking. I'd like the public interface to look something like this:

class Commons {
public:
    void read_lock();
    bool try_read_lock();
    void read_unlock();

    void write_lock();
    bool try_write_lock();
    void write_unlock();
};

...so that it can be publicly inherited by some:

class DataStructure : public Commons {};

I'm writing scientific code and can generally avoid data races; this lock is mostly a safeguard against the mistakes I'll probably make later. Thus my priority is low read overhead so I don't hamper a correctly-running program too much. Each thread will probably run on its own CPU core.

Could you please show me (pseudocode is ok) a readers/writer lock? What I have now is supposed to be the variant that prevents writer starvation. My main problem so far has been the gap in read_lock between checking if a read is safe to actually incrementing a reader count, after which write_lock knows to wait.

void Commons::write_lock() {
    write_mutex.lock();
    reading_mode.store(false);
    while(readers.load() > 0) {}
}

void Commons::try_read_lock() {
    if(reading_mode.load()) {
        //if another thread calls write_lock here, bad things can happen
        ++readers; 
        return true;
    } else return false;
}

I'm kind of new to multithreading, and I'd really like to understand it. Thanks in advance for your help!

Answer

fgp picture fgp · Sep 30, 2012

Here's pseudo-code for a ver simply reader/writer lock using a mutex and a condition variable. The mutex API should be self-explanatory. Condition variables are assumed to have a member wait(Mutex&) which (atomically!) drops the mutex and waits for the condition to be signaled. The condition is signaled with either signal() which wakes up one waiter, or signal_all() which wakes up all waiters.

read_lock() {
  mutex.lock();
  while (writer)
    unlocked.wait(mutex);
  readers++;
  mutex.unlock();
}

read_unlock() {
  mutex.lock();
  readers--;
  if (readers == 0)
    unlocked.signal_all();
  mutex.unlock();
}

write_lock() {
  mutex.lock();
  while (writer || (readers > 0))
    unlocked.wait(mutex);
  writer = true;
  mutex.unlock();
}

write_unlock() {
  mutex.lock();
  writer = false;
  unlocked.signal_all();
  mutex.unlock();
}

That implementation has quite a few drawbacks, though.

Wakes up all waiters whenever the lock becomes available

If most of the waiters are waiting for a write lock, this is wastefull - most waiters will fail to acquire the lock, after all, and resume waiting. Simply using signal() doesn't work, because you do want to wake up everyone waiting for a read lock unlocking. So to fix that, you need separate condition variables for readability and writability.

No fairness. Readers starve writers

You can fix that by tracking the number of pending read and write locks, and either stop acquiring read locks once there a pending write locks (though you'll then starve readers!), or randomly waking up either all readers or one writer (assuming you use separate condition variable, see section above).

Locks aren't dealt out in the order they are requested

To guarantee this, you'll need a real wait queue. You could e.g. create one condition variable for each waiter, and signal all readers or a single writer, both at the head of the queue, after releasing the lock.

Even pure read workloads cause contention due to the mutex

This one is hard to fix. One way is to use atomic instructions to acquire read or write locks (usually compare-and-exchange). If the acquisition fails because the lock is taken, you'll have to fall back to the mutex. Doing that correctly is quite hard, though. Plus, there'll still be contention - atomic instructions are far from free, especially on machines with lots of cores.

Conclusion

Implementing synchronization primitives correctly is hard. Implementing efficient and fair synchronization primitives is even harder. And it hardly ever pays off. pthreads on linux, e.g. contains a reader/writer lock which uses a combination of futexes and atomic instructions, and which thus probably outperforms anything you can come up with in a few days of work.