What's the algorithm behind sleep()?

Leonel picture Leonel · Oct 6, 2008 · Viewed 10.5k times · Source

Now there's something I always wondered: how is sleep() implemented ?

If it is all about using an API from the OS, then how is the API made ?

Does it all boil down to using special machine-code on the CPU ? Does that CPU need a special co-processor or other gizmo without which you can't have sleep() ?

The best known incarnation of sleep() is in C (to be more accurate, in the libraries that come with C compilers, such as GNU's libc), although almost every language today has its equivalent, but the implementation of sleep in some languages (think Bash) is not what we're looking at in this question...

EDIT: After reading some of the answers, I see that the process is placed in a wait queue. From there, I can guess two alternatives, either

  1. a timer is set so that the kernel wakes the process at the due time, or
  2. whenever the kernel is allowed a time slice, it polls the clock to check whether it's time to wake a process.

The answers only mention alternative 1. Therefore, I ask: how does this timer behave ? If it's a simple interrupt to make the kernel wake the process, how can the kernel ask the timer to "wake me up in 140 milliseconds so I can put the process in running state" ?

Answer

user3458 picture user3458 · Oct 7, 2008

The "update" to question shows some misunderstanding of how modern OSs work.

The kernel is not "allowed" a time slice. The kernel is the thing that gives out time slices to user processes. The "timer" is not set to wake the sleeping process up - it is set to stop the currently running process.

In essence, the kernel attempts to fairly distribute the CPU time by stopping processes that are on CPU too long. For a simplified picture, let's say that no process is allowed to use the CPU more than 2 milliseconds. So, the kernel would set timer to 2 milliseconds, and let the process run. When the timer fires an interrupt, the kernel gets control. It saves the running process' current state (registers, instruction pointer and so on), and the control is not returned to it. Instead, another process is picked from the list of processes waiting to be given CPU, and the process that was interrupted goes to the back of the queue.

The sleeping process is simply not in the queue of things waiting for CPU. Instead, it's stored in the sleeping queue. Whenever kernel gets timer interrupt, the sleep queue is checked, and the processes whose time have come get transferred to "waiting for CPU" queue.

This is, of course, a gross simplification. It takes very sophisticated algorithms to ensure security, fairness, balance, prioritize, prevent starvation, do it all fast and with minimum amount of memory used for kernel data.