Replacing fork() and exec() in the Linux Kernel
Jonathan Corbet wrote:
fork() is a relatively expensive system call; it must copy the entire process state (including memory) for the child process. Many optimizations have been made over the years, but a fork is still a fundamentally costly operation. To make things worse, a fork() call is often immediately followed by an exec(), which will discard all of that memory that was so carefully copied for the child. […]
Chen’s patch set takes an interesting approach to optimize the fork() and exec() pattern. It is focused on applications that repeatedly launch processes running the same executable; imagine, for example, a program that must run Git repeatedly to obtain information about the contents of a repository. In such cases, the program could establish a template to accelerate those invocations, spreading the setup cost across multiple operations.
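For readers who have not written it by hand, here is a minimal sketch of the classic fork()-then-exec() pattern the quote describes, using Python's standard `os` wrappers around the system calls. It does not show the proposed template mechanism, whose interface is not covered here; the helper name `spawn_and_wait` and the 127 exit code for a failed exec are illustrative assumptions, and the usage example assumes `git` is on the PATH.

```python
import os


def spawn_and_wait(argv):
    """Run argv in a child process via fork() + exec() and return its exit code.

    Hypothetical helper for illustration only; not part of any patch set.
    """
    pid = os.fork()  # the kernel duplicates the calling process here
    if pid == 0:
        # Child: immediately replace the process image that was just copied.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed; 127 is an assumed convention borrowed from shells.
            os._exit(127)
    # Parent: wait for the child to finish, then decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative value means killed by a signal


if __name__ == "__main__":
    # Every iteration pays the full fork() + exec() setup cost again,
    # which is the repeated work the quoted article is talking about.
    for _ in range(3):
        spawn_and_wait(["git", "--version"])
```

Each call copies the parent's state only to throw it away in the child, so a program that launches the same executable many times repeats that setup on every invocation.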
It’s cool to see innovation happening even in system calls that have been around since the earliest days of UNIX. From fork(2):
HISTORY
The fork() function appeared in Version 1 AT&T UNIX.
The _Fork() function appeared in FreeBSD 13.1.
Process spawning happens a lot on a running system, so optimizing it has the potential for enormous gains.