As demonstrated in this answer I recently posted, I seem to be confused about the utility (or lack thereof) of volatile in multi-threaded programming contexts.
My understanding is this: any time a variable may be changed outside the flow of control of a piece of code accessing it, that variable should be declared volatile. Signal handlers, I/O registers, and variables modified by another thread all constitute such situations.
So, if you have a global int foo, and foo is read by one thread and set atomically by another thread (probably using an appropriate machine instruction), the reading thread sees this situation in the same way it sees a variable tweaked by a signal handler or modified by an external hardware condition, and thus foo should be declared volatile (or, for multithreaded situations, accessed with a memory-fenced load, which is probably a better solution).
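For concreteness, here is a rough sketch of the pattern I have in mind (the names foo, writer_thread and reader_thread are just placeholders I made up):

```cpp
// Sketch of my current understanding: declare foo volatile because another
// thread may change it outside this code's flow of control.
volatile int foo = 0;

void writer_thread() {
    foo = 1;   // set "atomically", e.g. a single aligned word store on this platform
}

void reader_thread() {
    while (foo == 0) {
        // spin until the other thread's change is observed
    }
    // ... proceed once foo has changed
}
```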
How and where am I wrong?
The problem with volatile in a multithreaded context is that it doesn't provide all the guarantees we need. It does have a few of the properties we need, but not all of them, so we can't rely on volatile alone. However, the primitives we'd have to use for the remaining properties also provide the ones that volatile does, so it is effectively unnecessary.
For thread-safe accesses to shared data, we need a guarantee that:

- the read/write actually happens (that the compiler won't just keep the value in a register and defer updating memory until much later), and
- no reordering takes place. Assume that we use a volatile variable as a flag to indicate whether or not some data is ready to be read. In our code, we simply set the flag after preparing the data, so all looks fine. But what if the instructions are reordered so the flag is set first?

volatile does guarantee the first point. It also guarantees that no reordering occurs between different volatile reads/writes: all volatile memory accesses will occur in the order in which they're specified. That is all we need for what volatile is intended for, manipulating I/O registers or memory-mapped hardware. But it doesn't help us in multithreaded code, where the volatile object is often only used to synchronize access to non-volatile data. Those accesses can still be reordered relative to the volatile ones.
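Here is a sketch of that failure mode (hypothetical names, and deliberately broken code):

```cpp
int data = 0;                 // ordinary, non-volatile shared data
volatile bool ready = false;  // volatile flag used as a "data is ready" signal

void producer() {
    data = 42;                // prepare the data
    ready = true;             // volatile write; but the non-volatile write to data
                              // above may be reordered past it by the compiler or CPU
}

void consumer() {
    while (!ready) {          // volatile read: guaranteed to actually hit memory
    }
    int value = data;         // may still observe the old value of data
    (void)value;
}
```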
The solution to preventing reordering is to use a memory barrier, which indicates both to the compiler and the CPU that no memory access may be reordered across this point. Placing such barriers around our volatile variable access ensures that even non-volatile accesses won't be reordered across the volatile one, allowing us to write thread-safe code.
However, memory barriers also ensure that all pending reads/writes are executed when the barrier is reached, so the barrier effectively gives us everything we need by itself, making volatile unnecessary. We can just remove the volatile qualifier entirely.
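As an illustration, here is one way to express such barriers with C++11 fences (a sketch only; on pre-C++11 toolchains you would use compiler- or platform-specific barrier intrinsics instead, and the names are the same hypothetical ones as above):

```cpp
#include <atomic>

int data = 0;                    // ordinary shared data, no volatile needed
std::atomic<bool> ready{false};  // flag accessed with relaxed operations

void producer() {
    data = 42;
    std::atomic_thread_fence(std::memory_order_release);  // barrier: writes above
                                                           // cannot move below it
    ready.store(true, std::memory_order_relaxed);
}

void consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
    }
    std::atomic_thread_fence(std::memory_order_acquire);   // barrier: reads below
                                                            // cannot move above it
    int value = data;            // guaranteed to see 42
    (void)value;
}
```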
Since C++11, atomic variables (std::atomic<T>) give us all of the relevant guarantees.
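A minimal sketch of the same flag/data pattern with std::atomic (defaulting to sequentially consistent operations):

```cpp
#include <atomic>

int data = 0;
std::atomic<bool> ready{false};  // default operations are sequentially consistent

void producer() {
    data = 42;
    ready = true;                // atomic store; earlier writes become visible to
                                 // any thread that observes the flag as true
}

void consumer() {
    while (!ready) {             // atomic load
    }
    int value = data;            // guaranteed to see 42
    (void)value;
}
```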