Why is a pthread mutex considered "slower" than a futex?

Jason picture Jason · Jun 15, 2011 · Viewed 20.2k times · Source

Why are POSIX mutexes considered heavier or slower than futexes? Where is the overhead coming from in the pthread mutex type? I've heard that pthread mutexes are based on futexes, and when uncontested, do not make any calls into the kernel. It seems then that a pthread mutex is merely a "wrapper" around a futex.

Is the overhead simply in the function-wrapper call and the need for the mutex function to "setup" the futex (i.e., basically the setup of the stack for the pthread mutex function call)? Or are there some extra memory barrier steps taking place with the pthread mutex?

Answer

ninjalj picture ninjalj · Jun 16, 2011

Futexes were created to improve the performance of pthread mutexes. NPTL uses futexes, LinuxThreads predated futexes, which I think is where the "slower" consideration comes. NPTL mutexes may have some additional overhead, but it shouldn't be much.

Edit: The actual overhead basically consists on:

  • selecting the correct algorithm for the mutex type (normal, recursive, adaptive, error-checking; normal, robust, priority-inheritance, priority-protected), where the code heavily hints to the compiler that we are likely using a normal mutex (so it should convey that to the CPU's branch prediction logic),
  • and a write of the current owner of the mutex if we manage to take it which should normally be fast, since it resides in the same cache-line as the actual lock which we have just taken, unless the lock is heavily contended and some other CPU accessed the lock between the time we took it and when we attempted to write the owner (this write is unneeded for normal mutexes, but needed for error-checking and recursive mutexes).

So, a few cycles (typical case) to a few cycles + a branch misprediction + an additional cache miss (very worst case).