Understanding C's fork() through a simple example

Locke McDonnell · Mar 7, 2013 · Viewed 39.5k times
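#include <stdio.h>
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* fork */

int num = 0;

int main(int argc, char *argv[]) {
    pid_t pid;
    pid = fork();
    printf("%d", num);      /* num is still 0 in both processes */
    if (pid == 0) {         /* child */
        num = 1;
    } else if (pid > 0) {   /* parent */
        num = 2;
    }
    printf("%d", num);      /* 1 in the child, 2 in the parent, 0 if fork failed */
    return 0;
}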

I'm having trouble understanding why the possible outputs would be 0102 or 0012 or 0201 or 0021.

Here is what I think it should be producing. Whichever of the child or parent runs first, it hits the first printf before num has been modified, so a 0 comes first. Next comes either 1 or 2. Then the other process executes, so it starts with 0 again (copied from the parent), followed by either 1 or 2 again. So the possible outputs should be:

0101 or 0102 or 0201 or 0202

Answer

Samuel Edwin Ward · Mar 7, 2013

In both the parent and the child, num is 0 for the first printf. In both the parent and the child, 0 is printed followed by the other value. In the parent process, the other value is 2. In the child process, the other value is 1.

However, the important thing to note is that although each process must print zero before its other number, there is no restriction on the order of the two processes' printing relative to each other.

Here's a real-life analogy: Suppose my coworker and I each leave work at the same time, stop at the grocery store, and then arrive home. We know I was at the store before I was at my home, and we know he was at the grocery store before he was at his home. But we don't know who was at the grocery store first, or who was at home first. We could each arrive at the grocery store around the same time and then each arrive home around the same time, or maybe he's delayed and I get to the grocery store and home before he even gets to the store.

Something that won't happen is the printing of 1 or 2 more than once. Although after fork returns we have two processes running conceptually at once, and the timing of their events relative to each other is unspecified, the order of events in each process is well defined. Each process will set num to either 1 or 2 before printing it again, and because fork is defined to return 0 in the child and the child's pid in the parent, they will each set it to different values.

There is, actually, another reasonable output: 00. If fork is unable to create a new process, it returns -1. In this case the program will print 0, both the if and the else if conditions will fail because -1 is neither 0 nor greater than 0, num is not changed, and the program prints 0 again.

If you want to learn a lot about the definition of the ordering of effects in C programs, the key words to search for are "sequence points". In this program it's fairly straightforward (aside from the fact that we have two copies running at once), but it can sometimes be less obvious.