waitpid with execl used in child returns -1 with ECHILD?

kapilddit picture kapilddit · Mar 27, 2014 · Viewed 14k times · Source

When do I need to use waitpid if I am using execl in a child process which may take time to finish?

When I use waitpid in the parent, it tells me that the child is running as the return value from waitpid is 0. However, if I call waitpid after some time in another function, it returns -1 with errno set to ECHILD. When should I use waitpid if I am not sure whether the child has completed or not?

//pid_t Checksum_pid = fork();
Checksum_pid = fork();

if (Checksum_pid == 0)
{
    execl(path, name, argument as needed, ,NULL);
    exit(EXIT_SUCCESS);     
}
else if (Checksum_pid > 0)
{ 
    pid_t returnValue = waitpid(Checksum_pid, &childStatus, WNOHANG);

    if ( returnValue > 0)
    {
        if (WIFEXITED(childStatus))
        {
            printf("Exit Code: _ WEXITSTATUS(childStatus)") ;               
        }
    }
    else if ( returnValue == 0)
    {
        //Send positive response with routine status running (0x03)
        printf("Child process still running") ; 
    }
    else
    {
        if ( errno == ECHILD )
        {
             printf(" Error ECHILD!!") ;
        } 
        else if ( errno == EINTR )
        {
            // other real errors handled here.
            printf(" Error EINTR!!") ;
        }
        else
        {
            printf("Error EINVAL!!") ;
        }       
    }
}
else
{   
    /* Fork failed. */
    printf("Fork Failed") ;
}

Answer

luen picture luen · Jul 6, 2014

If the signal SIGCHLD is ignored (which is the default) and you skip WNOHANG as mentioned by Jonathan, waitpid will hang until the child exits and then fail with code ECHILD. I encountered this myself in a scenario where the parent process simply should wait for the child to finish and it seemed a bit overkill to register a handler for SIGCHLD. The natural follow-up question would be "Is it is OK to treat an ECHILD as an expected event in this case, or am I always supposed to write a handler for SIGCHLD?", but for that I don't have an answer.

From the documentation for waitpid(2):

ECHILD

(for waitpid() or waitid()) The process specified by pid (waitpid()) or idtype and id (waitid()) does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN. See also the Linux Notes section about threads.)

and further down

POSIX.1-2001 specifies that if the disposition of SIGCHLD is set to SIG_IGN or the SA_NOCLDWAIT flag is set for SIGCHLD (see sigaction(2)), then children that terminate do not become zombies and a call to wait() or waitpid() will block until all children have terminated, and then fail with errno set to ECHILD. (The original POSIX standard left the behavior of setting SIGCHLD to SIG_IGN unspecified. Note that even though the default disposition of SIGCHLD is "ignore", explicitly setting the disposition to SIG_IGN results in different treatment of zombie process children.) Linux 2.6 conforms to this specification. However, Linux 2.4 (and earlier) does not: if a wait() or waitpid() call is made while SIGCHLD is being ignored, the call behaves just as though SIGCHLD were not being ignored, that is, the call blocks until the next child terminates and then returns the process ID and status of that child.

See also this answer.