In many programs and man pages of Linux, I have seen code using fork()
. Why do we need to use fork()
and what is its purpose?
fork()
is how you create new processes in Unix. When you call fork
, you're creating a copy of your own process that has its own address space. This allows multiple tasks to run independently of one another as though they each had the full memory of the machine to themselves.
Here are some example usages of fork
:
fork
to run the programs you invoke from the command line.fork
to create multiple server processes, each of which handles requests in its own address space. If one dies or leaks memory, others are unaffected, so it functions as a mechanism for fault tolerance.fork
to handle each page within a separate process. This will prevent client-side code on one page from bringing your whole browser down.fork
is used to spawn processes in some parallel programs (like those written using MPI). Note this is different from using threads, which don't have their own address space and exist within a process.fork
indirectly to start child processes. For example, every time you use a command like subprocess.Popen
in Python, you fork
a child process and read its output. This enables programs to work together.Typical usage of fork
in a shell might look something like this:
int child_process_id = fork();
if (child_process_id) {
// Fork returns a valid pid in the parent process. Parent executes this.
// wait for the child process to complete
waitpid(child_process_id, ...); // omitted extra args for brevity
// child process finished!
} else {
// Fork returns 0 in the child process. Child executes this.
// new argv array for the child process
const char *argv[] = {"arg1", "arg2", "arg3", NULL};
// now start executing some other program
exec("/path/to/a/program", argv);
}
The shell spawns a child process using exec
and waits for it to complete, then continues with its own execution. Note that you don't have to use fork this way. You can always spawn off lots of child processes, as a parallel program might do, and each might run a program concurrently. Basically, any time you're creating new processes in a Unix system, you're using fork()
. For the Windows equivalent, take a look at CreateProcess
.
If you want more examples and a longer explanation, Wikipedia has a decent summary. And here are some slides here on how processes, threads, and concurrency work in modern operating systems.