I read in a paper that the underlying system call to create processes and threads is actually the same, and thus the cost of creating processes over threads is not that great.
EDIT:
Quoting article:
Replacing pthreads with processes is surprisingly inexpensive, especially on Linux where both pthreads and processes are invoked using the same underlying system call.
Processes are usually created with `fork`, and threads (lightweight processes) are usually created with `clone` nowadays. However, anecdotally, there exist 1:N thread models, too, which don't do either.
Both `fork` and `clone` map to the same kernel function, `do_fork` (called `kernel_clone` in recent kernels), internally. This function can create a lightweight process that shares the address space with the old one, or a separate process (and many other options), depending on what flags you feed to it. The `clone` syscall is more or less a direct forwarding of that kernel function (and is used by the higher-level threading libraries), whereas `fork` wraps `do_fork` into the functionality of the 50-year-old traditional Unix function.
The important difference is that `fork` guarantees that a complete, separate copy of the address space is made. This, as Basil points out correctly, is done with copy-on-write nowadays and is therefore not nearly as expensive as one would think.
When you create a thread, it just reuses the original address space and the same memory.
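To make the difference concrete, here is a minimal sketch in Python, assuming a Unix-like system where `os.fork` is available. The helper names `parent_value_after_fork` and `parent_value_after_thread` are made up for this illustration. A write made by a forked child stays in the child's private copy, while a write made by a thread is visible to the creator.

```python
import os
import threading


def parent_value_after_fork():
    """Return what the parent sees after a forked child modifies its copy."""
    data = {"value": 0}
    pid = os.fork()
    if pid == 0:
        # Child: this write goes to the child's own (copy-on-write) memory.
        data["value"] = 42
        os._exit(0)  # leave immediately without running the parent's cleanup
    os.waitpid(pid, 0)  # parent: wait until the child has finished
    return data["value"]


def parent_value_after_thread():
    """Return what the creator sees after a thread modifies shared data."""
    data = {"value": 0}

    def worker():
        # Thread: same address space, so this changes the creator's object.
        data["value"] = 42

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return data["value"]


if __name__ == "__main__":
    print("after fork:  ", parent_value_after_fork())    # 0
    print("after thread:", parent_value_after_thread())  # 42
```

The child calls `os._exit` rather than returning so that it does not continue running the parent's code after the demonstration.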
However, one should not assume that creating processes is generally "lightweight" on Unix-like systems because of copy-on-write. It is somewhat less heavy than, for example, under Windows, but it's nowhere near free.
One reason is that although the actual pages are not copied, the new process still needs a copy of the page table. This can be several kilobytes to megabytes of memory for processes that use larger amounts of memory.
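As a rough back-of-the-envelope check of those numbers, the sketch below estimates the size of the lowest-level page tables only. The constants are assumptions, not something taken from the article: 4 KiB pages and 8-byte page-table entries, as on typical x86-64 setups. It ignores the upper table levels and huge pages, and `leaf_page_table_bytes` is just an illustrative name.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
PTE_SIZE = 8      # assumption: 8-byte page-table entries (x86-64 style)


def leaf_page_table_bytes(mapped_bytes):
    """Estimate bytes of leaf page-table entries for a given mapped size."""
    pages = -(-mapped_bytes // PAGE_SIZE)  # ceiling division
    return pages * PTE_SIZE


if __name__ == "__main__":
    for mib in (1, 1024, 16 * 1024):
        size = leaf_page_table_bytes(mib * 1024 * 1024)
        print(f"{mib} MiB mapped -> about {size // 1024} KiB of leaf page tables")
```

Under these assumptions, 1 MiB of mapped memory needs about 2 KiB of entries and 1 GiB needs about 2 MiB, which matches the "kilobytes to megabytes" range above.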
Another reason is that although copy-on-write is invisible and a clever optimization, it is not free, and it cannot do magic. When data is modified by either process, which inevitably happens, the affected pages fault, and the kernel has to copy each of them at that point.
Redis is a good example where you can see that `fork` is anything but lightweight (it uses `fork` to do background saves).
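If you want to get a feel for this on your own machine, here is a rough sketch that times `os.fork` in a process that has touched a given amount of memory first. It assumes a Unix-like system and 4 KiB pages. `time_fork` is a hypothetical helper, not anything Redis itself uses, and the numbers it prints will vary a lot between systems, so treat them as an indication only.

```python
import os
import time

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def time_fork(touched_bytes):
    """Return seconds the parent spends in os.fork() with memory touched."""
    buf = bytearray(touched_bytes)
    # Write one byte per page so every page is really mapped before forking.
    for offset in range(0, touched_bytes, PAGE_SIZE):
        buf[offset] = 1

    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit right away, we only care about fork itself
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)  # reap the child so no zombie is left behind
    return elapsed


if __name__ == "__main__":
    for mib in (1, 64, 256):
        seconds = time_fork(mib * 1024 * 1024)
        print(f"{mib:4d} MiB touched: fork took {seconds * 1e6:.0f} microseconds")
```

This only measures the call in the parent. It does not include the later copy-on-write faults, which show up when either process starts writing to the shared pages.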