waitpid, wnohang, wuntraced. How do I use these

8this picture 8this · Nov 3, 2015 · Viewed 49.1k times · Source

I am a bit confused. As I understand, waitpid with a pid of -1 means that I wait for all child's to finish but if I add an option to the waitpid of WNOHANG, that options says to exit immediately if none have finished...These seems extremely confusing.

Why would I tell the computer to wait for child processes to finish and then immediately afterwards tell it to exit immediately if none of the childs have finished?

Can someone explain this option and the WUNTRACED options? I don't know what it means to be traced.

Answer

Jemenake picture Jemenake · Jan 18, 2016

You usually use WNOHANG and WUNTRACED in different cases.

Case 1: Suppose you have a process which spawns off a bunch of children and needs to do other stuff while the children are running. These children sometimes exit or are killed, but the kernel will hold onto their exit status until some other process claims it via wait() or waitpid(). So, your parent process needs to call wait()/waitpid() on occasion to let the kernel rid itself of the remains of the child. But we don't want wait()/waitpid() to block, because, in this case, our process has other things that it needs to do. We just want to collect the status of a dead process if there are any. That's what WNOHANG is for. It prevents wait()/waitpid() from blocking so that your process can go on with other tasks. If a child died, its pid will be returned by wait()/waitpid() and your process can act on that. If nothing died, then the returned pid is 0.

Case 2: Suppose your parent process, instead, wants to do nothing while children are running. You don't want to just have it do some thumb-twidling for-loop, so you use a normal wait()/waitpid() without WNOHANG. Your process is taken out of the execution queue until one of the children dies. But what if one of your children is stopped via a SIGSTOP? Your child is no longer working on the task you have set it to, but the parent is still waiting. So, you've got a deadlock, in a sense, unless the child is continued by some means external to your parent and that child. WUNTRACED allows your parent to be returned from wait()/waitpid() if a child gets stopped as well as exiting or being killed. This way, your parent has a chance to send it a SIGCONT to continue it, kill it, assign its tasks to another child, whatever.