The main reason for using atomics over mutexes is that mutexes are expensive, but since the default memory order for atomics is memory_order_seq_cst, isn't this just as expensive?
Question: Can a concurrent program using locks be as fast as a concurrent lock-free program?
If so, it may not be worth the effort unless I want to use memory_order_acq_rel for atomics.
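To make the comparison concrete, here is roughly what I have in mind (a minimal sketch; the data/ready names are just for illustration): the first producer relies on the default memory_order_seq_cst, the second only asks for release/acquire ordering.

```cpp
#include <atomic>

std::atomic<int>  data{0};
std::atomic<bool> ready{false};

void producer_seq_cst() {
    data.store(42);           // defaults to memory_order_seq_cst
    ready.store(true);        // defaults to memory_order_seq_cst
}

void producer_acq_rel() {
    data.store(42, std::memory_order_relaxed);
    ready.store(true, std::memory_order_release);   // weaker than seq_cst
}

int consumer() {
    while (!ready.load(std::memory_order_acquire)) { /* spin */ }
    return data.load(std::memory_order_relaxed);     // guaranteed to see 42
}
```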
Edit: I may be missing something, but lock-based can't be faster than lock-free, because each lock will have to be a full memory barrier too. With lock-free, however, it's possible to use techniques that are less restrictive than memory barriers.
So back to my question: is lock-free any faster than lock-based in the new C++11 standard with the default memory order?
Is "lock-free >= lock-based when measured in performance" true? Let's assume 2 hardware threads.
Edit 2: My question is not about progress guarantees, and maybe I'm using "lock-free" out of context.
Basically, when you have 2 threads with shared memory, and the only guarantee you need is that if one thread is writing then the other thread can't read or write, my assumption is that a simple atomic compare_and_swap operation would be much faster than locking a mutex.
If one thread never even touches the shared memory, you will end up locking and unlocking over and over for no reason, whereas with atomic operations each access is a single atomic instruction.
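Something like this is what I'm picturing (a sketch with made-up names): the lock-based version pays for a lock and unlock on every increment, while the atomic version is a single read-modify-write, or an explicit compare-and-swap loop.

```cpp
#include <atomic>
#include <mutex>

long counter_plain = 0;
std::mutex counter_mutex;

std::atomic<long> counter_atomic{0};

// Lock-based: every increment locks and unlocks, even with no contention.
void increment_locked() {
    std::lock_guard<std::mutex> guard(counter_mutex);
    ++counter_plain;
}

// Lock-free: a single atomic read-modify-write...
void increment_atomic() {
    counter_atomic.fetch_add(1, std::memory_order_relaxed);
}

// ...or an explicit compare-and-swap loop.
void increment_cas() {
    long expected = counter_atomic.load(std::memory_order_relaxed);
    while (!counter_atomic.compare_exchange_weak(expected, expected + 1,
                                                 std::memory_order_relaxed)) {
        // on failure, expected is reloaded with the current value; retry
    }
}
```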
Regarding the comments: a spin-lock and a mutex lock behave very differently when there is very little contention.
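For reference, this is the kind of spin-lock I mean (a bare-bones sketch, not a production lock): under low contention the lock is usually acquired with one test_and_set, whereas a mutex may involve the kernel when it has to block.

```cpp
#include <atomic>

// Minimal spin-lock built on std::atomic_flag.
class spin_lock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (flag.test_and_set(std::memory_order_acquire)) {
            // busy-wait; burns CPU if the lock is held for long
        }
    }
    void unlock() {
        flag.clear(std::memory_order_release);
    }
};
```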
Lock-free programming is about progress guarantees: from strongest to weakest, those are wait-free, lock-free, obstruction-free, and blocking.
A guarantee comes at a price: the more guarantees you want, the more you pay. Generally, a blocking algorithm or data structure (with a mutex, say) has the greatest liberties, and thus is potentially the fastest. A wait-free algorithm at the other extreme must use atomic operations at every step, which may be much slower.
Obtaining a lock is actually rather cheap, so you should never worry about that without a deep understanding of the subject. Moreover, blocking algorithms with mutexes are much easier to read, write and reason about. By contrast, even the simplest lock-free data structures are the result of long, focused research, each of them worth one or more PhDs.
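To give a flavour of what even a "simple" lock-free structure looks like, here is a sketch of just the push operation of a classic Treiber-style lock-free stack. Pop is deliberately omitted, because handling it correctly (the ABA problem, memory reclamation) is exactly where the research effort goes.

```cpp
#include <atomic>
#include <utility>

template <typename T>
class lock_free_stack {
    struct node {
        T     value;
        node* next;
    };
    std::atomic<node*> head{nullptr};

public:
    // Lock-free push: retries the CAS until the new node is linked in.
    void push(T value) {
        node* n = new node{std::move(value), head.load(std::memory_order_relaxed)};
        while (!head.compare_exchange_weak(n->next, n,
                                           std::memory_order_release,
                                           std::memory_order_relaxed)) {
            // on failure, n->next is updated to the current head; retry
        }
    }
    // pop() omitted: doing it safely is the hard part.
};
```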
In a nutshell, lock- or wait-free algorithms trade worst-case latency for mean latency and throughput. Everything is slower, but nothing is ever very slow. This is a very special characteristic that is only useful in very specific situations (like real-time systems).