What is Child Process in Node.js? When to Use fork and spawn?

The Issue

There is a piece of advice that every Node.js developer should remember: "never block the event loop." Blocking here means keeping the event loop so busy with one task that it cannot move on to its other work. Node.js has only one thread to run JavaScript code, so a task that takes a considerable amount of time becomes a serious bottleneck on the main thread. In other words, no other request gets a response until that task is complete.

Node.js gives us several ways around this issue. Instead of calling synchronous functions, we should call their asynchronous counterparts, for example readFile instead of readFileSync. Furthermore, if a task requires a lot of CPU time, such as image processing or video processing, there is another solution: the child_process module built into Node.js.
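To make the difference concrete, here is a minimal sketch that reads the same file both ways and records the order in which things happen. The temporary file name and the order array are only there for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// A throwaway file so the sketch is self-contained (hypothetical name).
const file = path.join(os.tmpdir(), 'event-loop-demo.txt');
fs.writeFileSync(file, 'hello from disk');

const order = [];

// Asynchronous: the read happens in the background and the callback runs
// later, so the code below it keeps going in the meantime.
fs.readFile(file, 'utf8', (err, content) => {
  if (err) throw err;
  order.push('readFile callback');
  console.log(order);
});

order.push('code after readFile');

// Synchronous: nothing else can run until the whole file has been read.
const text = fs.readFileSync(file, 'utf8');
order.push('readFileSync returned');
```

The readFile callback is always the last entry, because it can only run after the current synchronous code has finished.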

We can say that child_process is the earliest solution that Node.js came up with. Later, Node.js also added the worker_threads module, which can solve the issue of blocking the event loop as well. I have written an article, What are Worker threads? Do you know when to use Worker threads in Node.js?, where you can read about the concept and how to use them. However, in this article, let's temporarily forget about Worker threads and focus on understanding what Child process is and how it is used.

What is Child Process?

Child process is a module in Node.js that allows creating independent child processes to perform specific tasks. It enables Node.js to run multiple tasks in parallel and make use of more than one CPU core on the server. A child process runs independently from the parent process and can communicate with the parent through streams, events, etc. Each child process has its own resources, which minimizes the impact on other processes when it handles heavy tasks or encounters errors.

To better understand, when an application written in Node.js is started, it runs as a single process with its own V8 engine. To prevent the event loop from being blocked, one of the best ways is to create another process to handle heavy tasks. That process runs independently from the parent process, processes the tasks, and returns the result to the parent process through a communication channel as mentioned above.

Depending on how the child process is created, it may perform different tasks. There are two commonly used methods to create child processes: spawn and fork. fork starts a new Node.js process, with its own V8 engine, to run a JavaScript module. Despite the name, it does not clone the parent process. spawn simply creates a process to execute a command. To dive into more details, let's go through each method to understand what they actually are.

spawn

spawn is a method to create a new child process. With spawn, we can pass parameters, options, and necessary arguments to the child process to execute a command or an executable file.

child_process.spawn(command[, args][, options])

When a child process is created with spawn, it can work independently from the parent process, and it can exchange data with the parent through pipes or streams. We can also manage the child process by monitoring events to know when it completes or encounters errors.

Here is a simple example of using spawn:

const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});

In the second line, we are creating a child process that executes the ls command with the arguments -lh and /usr. In other words, it is equivalent to the following command:

$ ls -lh /usr

Then, we use on to listen to events from the child process and receive data in the parent process. In the example above, on listens to 3 events: data arriving on the child's standard output (stdout), data arriving on its standard error (stderr), and close, which fires once the child process has ended and its streams are closed.
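One more event is worth handling: error, which the child emits when the process could not be started at all, for example because the command does not exist. The sketch below wraps spawn in a promise to cover both outcomes; run is a hypothetical helper name, not part of the child_process API.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: runs a command and collects everything it prints.
function run(command, args = []) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    let stderr = '';

    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });

    // 'error' fires when the process could not be started,
    // for example when the command is not installed.
    child.on('error', reject);

    // 'close' fires after the process has ended and its streams are closed.
    child.on('close', (code) => resolve({ code, stdout, stderr }));
  });
}

run('ls', ['-lh', '/usr'])
  .then(({ code, stdout }) => console.log(`exit code ${code}\n${stdout}`))
  .catch((err) => console.error(`failed to start: ${err.message}`));
```

Note that a command that starts but fails (a non-zero exit code) still resolves here; only a failure to start the process rejects.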

If you pay attention, you can see that spawn can run a node command as well:

spawn('node', ['index.js']);

You can run a .js file in a new process using the above method, or even simpler, use fork to simplify its usage as shown in the following section.
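As a rough sketch of the same idea, the snippet below starts a second Node.js process and reads back its process ID. It uses process.execPath (the path of the Node.js binary currently running) instead of the bare node command, and the -e flag with a one-line script instead of index.js. getChildPid is a hypothetical helper name.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: starts another Node.js process that prints its own
// process ID, and resolves with that ID.
function getChildPid() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', 'console.log(process.pid)']);
    let output = '';

    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject);
    child.on('close', () => resolve(Number(output.trim())));
  });
}

getChildPid().then((pid) => {
  // The two IDs differ because the script ran in a separate process.
  console.log(`parent pid: ${process.pid}, child pid: ${pid}`);
});
```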

fork

fork is also a method to create a new child process. It is a special case of spawn, or in other words, fork is just a function built on top of spawn. The child process runs the specified JavaScript module in its own Node.js instance. The code must live in a file, because fork takes a module path, not a function.

child_process.fork(modulePath[, args][, options])

The fork function starts a new Node.js process. Despite the name, it does not copy the parent's memory or state: the child gets a brand new V8 engine (which is what makes fork resource-intensive), an independent environment, and a different process ID. This child process can perform tasks independently from the parent process and can communicate with the parent through an Inter-Process Communication (IPC) channel provided by Node.js.

With fork, we can use child processes to share the workload and handle heavy tasks without affecting the performance of the parent process.

For example, if you have a simple fibonacci.js file like this:

function fibonacci(n) {
  if (n < 2) {
    return n;
  } else {
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

process.on('message', (msg) => {
  const result = fibonacci(msg);
  process.send(result);
});

Then, create a child process to call the fibonacci() function in a separate process.

const { fork } = require('child_process');

const child = fork('fibonacci.js');

child.on('message', (result) => {
  console.log(`Fibonacci: ${result}`);
});

child.send(10);
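One detail of the example above: the child keeps running after it replies, because the open IPC channel keeps it alive. A possible way to handle this, sketched below under the assumption that the child sends exactly one reply, is to wrap the exchange in a promise and disconnect once the result arrives. runInChild is a hypothetical helper name.

```javascript
const { fork } = require('child_process');

// Hypothetical helper: sends one input to a forked module and resolves
// with the first message the child sends back.
function runInChild(modulePath, input) {
  return new Promise((resolve, reject) => {
    const child = fork(modulePath);

    child.once('message', (result) => {
      resolve(result);
      // Closing the IPC channel lets the child exit once it is idle.
      child.disconnect();
    });

    // The process could not be started, or a message could not be sent.
    child.on('error', reject);

    // If the child exits before replying, report it instead of hanging.
    // After a successful reply this reject call has no effect.
    child.once('exit', (code) => {
      reject(new Error(`child exited with code ${code} before replying`));
    });

    child.send(input);
  });
}

// Usage, assuming fibonacci.js from the example above is in the working directory:
// runInChild('fibonacci.js', 10).then((result) => console.log(`Fibonacci: ${result}`));
```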

When to Use Child Process, fork, or spawn?

First of all, it must be said that the choice to use child processes depends on the problem you are trying to solve. Creating a process is resource-intensive, so creating more child processes does not necessarily mean that your application will perform faster. On the contrary, it can quickly deplete server resources, and communication between processes adds its own cost.

Node.js handles asynchronous I/O very well. If your application involves a lot of I/O, consider tuning the worker pool (thread pool) in libuv instead of creating multiple child processes to handle asynchronous I/O tasks. You can refer to the article Distinguishing I/O Tasks and Deep CPU Tasks to learn how to differentiate I/O tasks from deep CPU tasks.

In the case where the application requires high CPU computation, child processes are indeed suitable. At this point, you need to choose between the two methods described above to keep resource costs down.
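As an illustration, several built-in asynchronous APIs (file system calls, dns.lookup, crypto.pbkdf2, zlib) already run on the libuv thread pool, so they do not block the event loop and need no child process. The pool has 4 threads by default and can be resized with the UV_THREADPOOL_SIZE environment variable, set when the process starts (for example UV_THREADPOOL_SIZE=8 node app.js, where app.js is a placeholder). The sketch below runs four hashes at once; hashPassword, the salt, and the iteration count are illustrative assumptions, not recommendations.

```javascript
const crypto = require('crypto');

// Hypothetical helper: crypto.pbkdf2 does its work on the libuv thread pool,
// so the event loop stays free while the hash is being computed.
function hashPassword(password) {
  return new Promise((resolve, reject) => {
    // 'demo-salt' and 100000 iterations are placeholder values for this sketch.
    crypto.pbkdf2(password, 'demo-salt', 100000, 64, 'sha512', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

// With the default pool size of 4, these four hashes can run in parallel.
Promise.all(['a', 'b', 'c', 'd'].map((p) => hashPassword(p))).then((hashes) => {
  console.log(`computed ${hashes.length} hashes`);
});
```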

For example, if you have installed software, commands, bash scripts, etc., on the server and want to call them from Node.js, then use spawn. It simply creates a process to execute a command and is both fast and efficient.

fork, on the other hand, is suitable when the "High CPU" task lives in a JavaScript file. fork starts a new Node.js process with its own V8 engine, and the code it runs has full access to the modules (node_modules) in your application. Moreover, because it is an independent process, a crash in the child process does not bring down your application.

Talking about heavy tasks may seem abstract, so let's take a specific example: image processing. Resizing, filtering, adjusting brightness, etc., all require high CPU computation. You can find many command-line applications (CLI) to install on your machine, or search for image processing libraries for Node.js. If using a CLI is more convenient, use spawn; if using a library, use fork.

Conclusion

Child process is a module in Node.js that allows creating independent child processes to perform specific tasks, in order to prevent blocking the event loop. There are several ways to create a child process using the child_process module, among which are the two methods spawn and fork. spawn is used to run a specific command, while fork starts a new Node.js process, with its own V8 engine, to run a JavaScript file. Depending on the problem, we should choose the appropriate use of child processes to avoid wasting resources and improve the performance of our application.

Author

Hello, my name is Hoai - a developer who tells stories through writing ✍️ and creating products 🚀. With many years of programming experience, I have contributed to various products that bring value to users at my workplace as well as to myself. My hobbies include reading, writing, and researching... I created this blog with the mission of delivering quality articles to the readers of 2coffee.dev. Follow me through these channels: LinkedIn, Facebook, Instagram, Telegram.