What is Child Process in Node.js? When to Use fork and spawn?

Issue

There is a piece of advice for anyone working with Node.js: "never block the event loop." Blocking means keeping the event loop from moving on to the other tasks waiting to be processed. Node.js has only one thread to execute JavaScript code, so a task that takes a long time becomes a serious bottleneck on the main thread. Imagine a scenario where every subsequent request must wait for the previous one to complete before processing begins. It's truly dreadful.
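To make the effect concrete, here is a minimal sketch that measures how late a 0 ms timer fires while the main thread is busy. The helper names blockFor and measureTimerDelay are made up for this illustration and are not part of Node.js.

```javascript
// Hypothetical helper: keeps the main thread busy for `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait: no callback can run while this loop is spinning
  }
}

// Hypothetical helper: schedules a 0 ms timer, then blocks the thread.
// Resolves with how long the timer actually had to wait.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    blockFor(blockMs); // synchronous work that runs before the timer can fire
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`a 0 ms timer fired after about ${delay} ms`);
});
```

The timer callback cannot run until the synchronous loop has finished, which is exactly what happens to incoming requests behind a long-running task.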

Knowing this, Node.js provides some solutions. Instead of calling synchronous functions, switch to asynchronous ones; for instance, when reading a file, readFile is preferred over readFileSync. readFile is asynchronous: the read is carried out outside the main thread (in libuv's thread pool) and the callback runs once the data is ready. Conversely, readFileSync is synchronous and blocks the main thread until the whole file has been read. Additionally, if a task is CPU-intensive, this is when you need to know about the built-in child_process module in Node.
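The difference between the two calls can be observed with a small sketch. It assumes a throwaway file created in the system temp directory; compareReads is a name invented for this example.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: reads the same file both ways and records
// the order in which the two reads finish.
function compareReads() {
  // Throwaway file, so the sketch does not depend on any existing path.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'read-demo-'));
  const file = path.join(dir, 'sample.txt');
  fs.writeFileSync(file, 'hello');

  const order = [];
  return new Promise((resolve, reject) => {
    // Asynchronous: the read is handed off, and the callback runs later.
    fs.readFile(file, 'utf8', (err) => {
      fs.rmSync(dir, { recursive: true, force: true }); // clean up the temp file
      if (err) return reject(err);
      order.push('readFile callback');
      resolve(order);
    });

    // Synchronous: the main thread waits here until the file has been read.
    fs.readFileSync(file, 'utf8');
    order.push('readFileSync returned');
  });
}

compareReads().then((order) => console.log(order.join(' -> ')));
// prints "readFileSync returned -> readFile callback"
```

Although readFile is called first, its callback only runs after the current synchronous code has finished, so the main thread stays free in the meantime.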

child_process is a module that has been present in Node.js since its early versions. Later, Node.js added the worker_threads module, which serves a similar purpose but runs code in threads inside the same process and has a more user-friendly API. I have an article about What are Worker Threads? Do you know when to use Worker Threads in Node.js? that you can refer to. However, for the scope of this article, let's temporarily forget about worker_threads and focus on what child_process is and how it is used.

What is Child Process?

Child process is a Node.js module that allows the creation of independent child processes to perform specific tasks. It enables Node.js to run multiple tasks concurrently and fully utilize server power. When a child process is created, it runs independently of the parent process and can communicate with the parent via streams, events, etc. The created child processes have independent resources, helping to minimize the impact on other processes when handling heavy tasks or in case of errors.

To visualize, a Node.js application upon startup is a process with its own V8 Engine instance. To keep a heavy task from blocking the event loop, the best way is to create another process to handle it. That process runs independently of the parent, does the work, and returns the result to the parent through a communication channel as mentioned above.

Depending on how a child process is created, it executes tasks in different ways. There are two common ways to create a child process: spawn and fork. spawn simply creates a process to execute a specific command. fork starts a new Node.js process, with its own V8 Engine instance, to run a JavaScript module; despite the name, it does not "clone" the parent's memory the way the POSIX fork() system call does. For more details, let's go through each method to understand what they actually are.

Spawn

spawn is a method for creating a new child process. When using spawn, we can pass the child process parameters, options, and arguments necessary to execute the command or executable file.

child_process.spawn(command[, args][, options])

When a child process is created using spawn, it can operate independently of the parent process and can exchange data with the parent process through pipe or stream. We can also manage the child process by monitoring events to know when it completes or encounters an error.

Example of how to use spawn:

const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});

On line 2, we create a child process that executes the ls command with the arguments -lh and /usr. In other words, it is equivalent to the command:

$ ls -lh /usr

Then, we use on to listen for events from the child process and receive its data in the parent process. In the above example, on is "listening" to 3 events: data on the child's stdout (normal output), data on its stderr (error output), and close, which fires once the child process has exited and its streams have been closed.
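The example above does not listen for the error event, which the child emits when the process cannot be started at all (for instance, when the command does not exist). A minimal sketch of a promise-based wrapper that covers this case could look like the following; run is a name chosen for this illustration, and the Node.js binary itself (process.execPath) is used as the command only so that the sketch does not depend on ls being installed.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: runs a command and collects its output.
function run(command, args = []) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    let stderr = '';

    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });

    // 'error' is emitted when the process could not be spawned.
    child.on('error', reject);

    // 'close' is emitted after the process has ended and its streams are closed.
    child.on('close', (code) => resolve({ code, stdout, stderr }));
  });
}

run(process.execPath, ['-e', 'console.log("hello from the child")'])
  .then(({ code, stdout }) => {
    console.log(`exit code ${code}, stdout: ${stdout.trim()}`);
  })
  .catch((err) => {
    console.error(`could not start the child: ${err.message}`);
  });
```

Note that a non-zero exit code still resolves the promise here; whether that should count as a failure depends on the command being run.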

If you notice, spawn can also run a node command:

spawn('node', ['index.js']);

This way you can run a .js file in a new process, or, more conveniently, use fork as shown in the section below.

Fork

fork is also a method for creating a new child process; it is a special case of spawn, in other words, fork is just a function built on top of spawn. The child process runs the specified JavaScript module in its own Node.js instance. The code must live in a file whose path is passed to fork as modulePath; a function cannot be passed directly.

child_process.fork(modulePath[, args][, options])

The fork function creates a new Node.js process with its own V8 engine (which makes fork resource-intensive), its own memory, and a different process ID. Nothing is actually "cloned" from the parent's state. This child process can perform tasks independently of the parent process and can communicate with the parent through an IPC (Inter-Process Communication) channel provided by Node.js.

fork is a good fit for sharing workloads and handling heavy, CPU-bound tasks without affecting the performance of the parent.

For example, you have a fibonacci.js file used to calculate the Fibonacci sequence:

function fibonacci(n) {
  if (n < 2) {
    return n;
  } else {
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

process.on('message', (msg) => {
  const result = fibonacci(msg);
  process.send(result);
});

Then, create a child_process to handle the call to fibonacci() in a separate process.

const { fork } = require('child_process');

const child = fork('fibonacci.js');

child.on('message', (result) => {
  console.log(`Fibonacci: ${result}`);
  child.disconnect(); // close the IPC channel so both processes can exit
});

child.send(10);
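In a real application the parent side is often wrapped in a promise so that the result can be awaited. The sketch below is one possible way to do that and is not taken from the Node.js documentation; fibonacciInChild is a made-up name, and the child module is written to a temporary file only to keep the example in a single file (normally it would be the fibonacci.js file shown above).

```javascript
const { fork } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Source of the child module, equivalent to fibonacci.js above.
const childSource = `
function fibonacci(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
}
process.on('message', (n) => {
  process.send(fibonacci(n));
});
`;

// Hypothetical helper: computes fibonacci(n) in a forked process.
function fibonacciInChild(n) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
  const modulePath = path.join(dir, 'fibonacci.js');
  fs.writeFileSync(modulePath, childSource);

  return new Promise((resolve, reject) => {
    const child = fork(modulePath);

    child.once('message', (result) => {
      resolve(result);
      child.disconnect(); // close the IPC channel so the child can exit
    });
    child.once('error', reject);
    child.once('exit', (code) => {
      fs.rmSync(dir, { recursive: true, force: true }); // remove the temp module
      // Has no effect if the promise was already resolved by a message.
      reject(new Error(`child exited with code ${code} before replying`));
    });

    child.send(n);
  });
}

fibonacciInChild(10).then((result) => {
  console.log(`Fibonacci: ${result}`); // fibonacci(10) is 55
});
```

Each call starts a fresh process, which is simple but costly; for many small jobs it is usually better to keep one child alive and send it several messages.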

When to Use Child Process as well as Fork or Spawn?

First of all, the choice of using child_process depends on the problem being solved. Creating a child process is expensive, so it's not about creating as many as possible to speed up your application. On the contrary, too many child processes can quickly deplete server resources, and communication between processes adds its own cost.
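One common way to keep this cost under control is to cap how many child processes run at the same time, for example at the number of CPU cores. The following is a minimal sketch under that assumption; mapWithLimit and squareInChild are names invented for this example, and the trivial squaring task merely stands in for real CPU-heavy work.

```javascript
const { spawn } = require('node:child_process');
const os = require('node:os');

// Hypothetical helper: runs `task` over `items`, with at most `limit`
// tasks in flight at any moment. Results keep the order of `items`.
async function mapWithLimit(items, limit, task) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Hypothetical task: squares a number in a separate Node.js process.
function squareInChild(n) {
  return new Promise((resolve, reject) => {
    const script = 'const n = Number(process.argv[process.argv.length - 1]);' +
      'process.stdout.write(String(n * n));';
    const child = spawn(process.execPath, ['-e', script, String(n)]);
    let out = '';
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(Number(out));
      else reject(new Error(`child exited with code ${code}`));
    });
  });
}

// Never run more children at once than there are CPU cores.
const maxChildren = Math.max(1, os.cpus().length);

mapWithLimit([1, 2, 3, 4, 5, 6], maxChildren, squareInChild)
  .then((squares) => console.log(squares));
```

The limit is a starting point rather than a rule; the right number depends on how heavy each task is and on what else the server is running.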

Node.js handles asynchronous I/O very well; if the application is I/O-bound, it is usually better to tune the libuv worker pool (for example, through the UV_THREADPOOL_SIZE environment variable) than to create many child processes to handle asynchronous I/O. You can refer to the article Distinguishing Between I/O Tasks and CPU-Intensive Tasks to learn how to differentiate I/O tasks from CPU-intensive tasks.

In cases where the application requires more CPU computation, child_process is more suitable. Choosing correctly between the two methods of creating child processes mentioned above then helps keep resource costs down.

For example, if the server has installed software, commands, bash scripts, etc., and you want to call them from Node.js, use spawn. It simply creates a process to execute the command in spawn and returns the result, making it fast and efficient.

fork, on the other hand, is suitable when the task is JavaScript code in a file. fork starts a new Node.js process with its own V8 Engine, and the forked module can load the modules (node_modules) of your application. Furthermore, since it is an independent process, if it crashes, it won't take down the parent process.
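The isolation can be seen in a small sketch where the child deliberately throws an uncaught error. runCrashingChild is a hypothetical name, and the crashing module is written to a temporary file only to keep the example self-contained.

```javascript
const { fork } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: forks a module that crashes immediately and
// reports how the child ended.
function runCrashingChild() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'crash-demo-'));
  const modulePath = path.join(dir, 'crash.js');
  fs.writeFileSync(modulePath, 'throw new Error("boom in the child");');

  return new Promise((resolve, reject) => {
    // The child's stdio is ignored so its stack trace does not clutter the
    // parent's output; fork still requires exactly one 'ipc' entry.
    const child = fork(modulePath, [], {
      stdio: ['ignore', 'ignore', 'ignore', 'ipc'],
    });

    child.once('error', reject);
    child.once('exit', (code, signal) => {
      fs.rmSync(dir, { recursive: true, force: true }); // remove the temp module
      resolve({ code, signal });
    });
  });
}

runCrashingChild().then(({ code }) => {
  // An uncaught exception makes the child exit with code 1.
  console.log(`child exited with code ${code}; the parent is still running`);
});
```

The parent only learns about the crash through the exit event, so it can log it, retry, or start a replacement child.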

Conclusion

child_process is a module in Node.js that allows the creation of independent child processes to perform specific tasks, aiming to prevent blocking the event loop. There are various ways to create a child process through the child_process module, among which are the two methods spawn and fork. spawn is used to run a specific command, while fork starts a new Node.js process, with its own V8 Engine, to run a JavaScript file. Depending on the problem, it's essential to choose the appropriate way to use the child process to avoid wasting resources and to enhance application performance.

References:

Child process | Node.js v18.20.8 Documentation
