Ok, so I am reading about synchronization, and I read through various algorithms such as spinlocks, semaphores, and mutex to avoid race condition.
However, these algorithms can't prevent race condition in SMP when multiple proceses access the data exactly at the same time.
For example, suppose thread 1 in processor A runs lock(mutex1); withdraw(1000); unlock(mutex1);
and thread 2 in processor B runs lock(mutex1); deposit(1000); deposit(1000); unlock(mutex1);
When both threads run EXACTLY AT THE SAME TIME, both threads will be in critical section simultaneously.
The only solution (should be in hardware level) would be making each processors run slightly off to each other, but it defeats the purpose of parallelism.
Is there any hardware level support to avoid these situation where multiple processors try to acquire the lock at the exactly same time?
(this is not a problem of atomicity, but rather problem of exact parallelism, and I wonder how SMP deals with it).
The whole point of a mutex is that even if two cores try to acquire it at the same time, one of them will be blocked until the other one releases it. A mutex that permitted two cores to hold that mutex at the same time would be completely broken and useless for its sole, intended purpose.
Somewhere in hardware there is a bus arbitrator that permits only one core to control the bus that links those two cores. If either core already has the memory holding the mutex in a private cache, that core will win. Otherwise, whichever one gets the bus first will win.
The bus arbitrator may work in many ways, but typically it will rotate. So if the cores are 0, 1, 2, and 3 and core 2 had the bus last, the bus will next go to core 3 if it wants it, otherwise core 0 if it wants it, otherwise core 1 if it wants it, otherwise back to core 2. Depending on exactly which bus is involved (whether it's a fight between the two core's L2 caches or over the memory itself or whatever) some of the cores may contend as a unit against other core groups and then sub-contend for which specific core gets it first.
It may be that one core already has control over the bus and so it will win outright. Typically, the arbitrator allows a core to continuously hold the bus so long as it continues to want to use it for a few transactions to avoid wasteful handoffs that don't allow the core to make forward progress.
The exact details can vary depending on a large number of factors including how the cores are arranged, which cores have the lock in their caches in what states, who had the bus last and whether the bus arbitrator uses timeslices, round-robin, or some other mechanism, and so on. But any implementation that didn't guarantee that one and only one core winds up getting the lock would be considered horribly broken.