Faster forking of large processes on Linux?

timday picture timday · Apr 28, 2010 · Viewed 17.7k times · Source

What's the fastest, best way on modern Linux of achieving the same effect as a fork-execve combo from a large process ?

My problem is that the process forking is ~500MByte big, and a simple benchmarking test achieves only about 50 forks/s from the process (c.f ~1600 forks/s from a minimally sized process) which is too slow for the intended application.

Some googling turns up vfork as having being invented as the solution to this problem... but also warnings about not to use it. Modern Linux seems to have acquired related clone and posix_spawn calls; are these likely to help ? What's the modern replacement for vfork ?

I'm using 64bit Debian Lenny on an i7 (the project could move to Squeeze if posix_spawn would help).

Answer

tmm1 picture tmm1 · Mar 1, 2011

On Linux, you can use posix_spawn(2) with the POSIX_SPAWN_USEVFORK flag to avoid the overhead of copying page tables when forking from a large process.

See Minimizing Memory Usage for Creating Application Subprocesses for a good summary of posix_spawn(2), its advantages and some examples.

To take advantage of vfork(2), make sure you #define _GNU_SOURCE before #include <spawn.h> and then simply posix_spawnattr_setflags(&attr, POSIX_SPAWN_USEVFORK)

I can confirm that this works on Debian Lenny, and provides a massive speed-up when forking from a large process.

benchmarking the various spawns over 1000 runs at 100M RSS
                            user     system      total        real
fspawn (fork/exec):     0.100000  15.460000  40.570000 ( 41.366389)
pspawn (posix_spawn):   0.010000   0.010000   0.540000 (  0.970577)