I have read many times, here and elsewhere on the net, that mutexes are slower than critical sections, semaphores, or insert-your-preferred-synchronisation-method-here. But I have never seen a paper, study, or benchmark to back up this claim.
So where does this idea come from? Is it a myth or reality? Are mutexes really slower?
The book "Multithreading Applications in Win32" by Jim Beveridge and Robert Wiener says: "It takes almost 100 times longer to lock an unowned mutex than it does to lock an unowned critical section because the critical section can be done in user mode without involving the kernel."
And MSDN says: "critical section objects provide a slightly faster, more efficient mechanism for mutual-exclusion synchronization."
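
The two quotes disagree on how big the gap is, so it is worth measuring on your own machine. Below is a minimal sketch of an uncontended lock/unlock micro-benchmark. The names `ns_per_pair`, `LockTimings` and `compare_locks` are made up for illustration, not taken from the book or MSDN. On Windows it times a kernel mutex (`WaitForSingleObject`/`ReleaseMutex`) against a critical section (`EnterCriticalSection`/`LeaveCriticalSection`). On other platforms it falls back to `std::mutex` for both numbers just so the code compiles, which does not reproduce the Win32 comparison.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#ifdef _WIN32
#include <windows.h>
#endif

// Average cost in nanoseconds of one uncontended lock/unlock pair.
template <typename Lock, typename Unlock>
double ns_per_pair(Lock lock, Unlock unlock, long iterations) {
    assert(iterations > 0);
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        lock();
        unlock();
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Hypothetical result type for this sketch.
struct LockTimings {
    double mutex_ns;             // kernel mutex (Win32) or std::mutex (fallback)
    double critical_section_ns;  // CRITICAL_SECTION (Win32) or std::mutex (fallback)
};

LockTimings compare_locks(long iterations) {
    LockTimings t{};
#ifdef _WIN32
    // Unnamed, initially unowned kernel mutex.
    HANDLE hMutex = CreateMutexW(NULL, FALSE, NULL);
    assert(hMutex != NULL);
    CRITICAL_SECTION cs;
    InitializeCriticalSection(&cs);

    t.mutex_ns = ns_per_pair(
        [&] { WaitForSingleObject(hMutex, INFINITE); },
        [&] { ReleaseMutex(hMutex); },
        iterations);
    t.critical_section_ns = ns_per_pair(
        [&] { EnterCriticalSection(&cs); },
        [&] { LeaveCriticalSection(&cs); },
        iterations);

    DeleteCriticalSection(&cs);
    CloseHandle(hMutex);
#else
    // Assumption: fallback only so the sketch builds off Windows.
    // Both fields time the same std::mutex, so no gap is expected here.
    std::mutex m;
    t.mutex_ns = ns_per_pair([&] { m.lock(); }, [&] { m.unlock(); }, iterations);
    t.critical_section_ns =
        ns_per_pair([&] { m.lock(); }, [&] { m.unlock(); }, iterations);
#endif
    return t;
}
```

Call `compare_locks(1000000)` from `main` and print the two fields. This only measures the single-threaded, uncontended case that the book quote talks about. Under contention a critical section also ends up waiting in the kernel, so the gap can shrink, and the numbers will vary with Windows version and hardware.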