What are Worker threads? Do you know when to use them in Node.js?

Problem

Worker threads were first introduced in Node.js 10.5.0 as an experimental API, which became stable in the Node.js 12 LTS release.

Worker threads let you run JavaScript code on a separate thread, in parallel with the main thread. So, how does this work and what benefits does it bring? Keep reading to find out.

CPU-intensive tasks

You might already know that Node.js excels at handling asynchronous I/O tasks. When it comes to I/O, people often think about tasks like reading/writing data to files, making HTTP requests, and so on.

Synchronous tasks are a different story: a complex computation on a large dataset runs on the main thread and can turn it into a serious bottleneck.

Imagine a synchronous computation that takes 10 seconds to process. This means the main thread will be blocked for 10 seconds to handle that request before it can process subsequent requests, which is detrimental to the server's responsiveness.
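The blocking effect described above can be observed with a small sketch. It schedules a 0 ms timer and then keeps the main thread busy; the helper names blockFor and measureTimerDelay are made up for this illustration, and the busy-wait loop only stands in for a real computation.

```javascript
// Hypothetical helper: keeps the current thread busy for roughly `ms` milliseconds.
const blockFor = (ms) => {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Busy-wait: nothing else can run on this thread in the meantime.
  }
};

// Resolves with how long a "0 ms" timer actually had to wait.
const measureTimerDelay = (blockMs) =>
  new Promise((resolve) => {
    const scheduledAt = Date.now();
    // Ask for the callback to run as soon as possible.
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    // The synchronous work below keeps the event loop from firing the timer.
    blockFor(blockMs);
  });

measureTimerDelay(200).then((delay) => {
  console.log(`0 ms timer fired after about ${delay} ms`);
});
```

The timer callback cannot run until the synchronous loop returns, which is the same thing that happens to incoming requests on a busy server.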

A classic example of such a computation is the Fibonacci sequence. The Fibonacci sequence is an infinite sequence of natural numbers that begins with 0 and 1, and each subsequent element is the sum of the two preceding ones. A Fibonacci function in JavaScript can be written like this:

const fibonacci = (n) => {
  var i;
  var fib = [];

  fib[0] = 0;
  fib[1] = 1;
  for (i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }
  return fib;
}

Try calling the fibonacci(999999) function, and your main thread might take more than a second to compute the result.
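One detail worth keeping in mind when experimenting with large inputs: JavaScript numbers are double-precision floats, so the values in the array eventually overflow to Infinity. The sketch below finds where that happens and shows a BigInt variant as one possible alternative; fibonacciBig is a name introduced only for this illustration.

```javascript
// Same loop as the article's fibonacci function (written with const/let here).
const fibonacci = (n) => {
  const fib = [0, 1];
  for (let i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }
  return fib;
};

const fib = fibonacci(2000);
// Index of the first entry that no longer fits in a double-precision number.
const firstOverflow = fib.findIndex((value) => !Number.isFinite(value));
console.log(firstOverflow, fib[firstOverflow]);

// Hypothetical variant: BigInt keeps exact values at the cost of slower arithmetic.
const fibonacciBig = (n) => {
  let a = 0n;
  let b = 1n;
  for (let i = 0; i < n; i++) {
    [a, b] = [b, a + b];
  }
  return a; // the n-th Fibonacci number, exactly
};

console.log(fibonacciBig(100).toString());
```

For the purposes of this article the overflow does not matter, since the loop still does the same amount of work and blocks the thread for just as long.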

What are Worker threads?

worker_threads is a Node.js module that lets you run JavaScript code in parallel with the main thread. Each worker runs independently, and the main thread and its workers communicate by passing messages with postMessage(). For more in-depth knowledge, refer to the Worker threads section of the official Node.js documentation.

Why do we need Worker threads?

As mentioned earlier, we may need Worker threads to handle cases where we have large or complex data computations to prevent blocking the main thread.

The main thread sends a request to a worker to execute JavaScript code. After completion, the worker informs the main thread by calling postMessage(). The main thread receives the data from the worker and continues processing that request.

By moving complex JavaScript computations away from the main thread, subsequent requests can be processed normally without any blocking.
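The request and reply flow described above can be sketched in a single file. The worker source is passed inline with the eval option purely to keep the sketch self-contained, and sumInWorker is a hypothetical helper, not something the module provides.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) is used only so the sketch fits in one file.
const workerSource = `
  const { parentPort } = require('worker_threads');
  // Wait for a request from the main thread, compute, and reply.
  parentPort.on('message', (n) => {
    let sum = 0;
    for (let i = 1; i <= n; i++) {
      sum += i;
    }
    parentPort.postMessage(sum);
    parentPort.close(); // nothing more to do, so let the worker exit
  });
`;

// Hypothetical helper: asks a fresh worker to sum the integers from 1 to n.
const sumInWorker = (n) =>
  new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.postMessage(n); // send the request to the worker
  });

sumInWorker(1000).then((sum) => console.log(sum));
```

While the worker is summing, the main thread is idle and free to handle other events.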

The cost of creating a worker

Before Worker threads were introduced in version 10.5, there were other ways to run JavaScript code outside the main thread, such as Cluster and Child Process. Both of them rely on separate processes rather than threads.

By default, a Node.js application runs its JavaScript on a single thread. Cluster takes advantage of the remaining CPU capacity by starting several worker processes, each with its own main thread, that share the same server port. On a server with 4 cores and 8 threads, you would typically start up to 8 workers, one per CPU thread. Incoming requests are distributed in a round-robin fashion or using another algorithm, depending on the platform and configuration.

Child Process is a lower-level solution than Cluster, which is itself built on top of it. It creates a separate process with its own dedicated event loop and main thread, so each process uses more system resources. Communication between processes is also relatively complex because each process has its own memory, so data has to be serialized and sent as messages.

Worker threads were introduced to address the resource usage concerns with Child Process. Instead of creating a new process, worker threads create a new thread within the process of the running application. This helps minimize resource usage because creating a thread needs fewer resources than creating a process. Each worker still runs its own isolated JavaScript context and exchanges data through messages, but because all threads live in the same process, they can also transfer ArrayBuffer instances or share memory through SharedArrayBuffer, which makes communication cheaper than between processes.
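As a rough illustration of memory shared between threads in one process, the sketch below hands a SharedArrayBuffer to a worker and reads the worker's writes from the main thread. The helper name countInWorker and the inline worker source are assumptions made for this example.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) is used only so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // View over the same memory the main thread allocated; nothing is copied.
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1);
  }
  parentPort.postMessage('done');
`;

// Hypothetical helper: lets a worker increment a shared counter, then reads it.
const countInWorker = (increments) =>
  new Promise((resolve, reject) => {
    const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
    const counter = new Int32Array(shared);
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { shared, increments },
    });
    // The worker's writes are visible here because the memory is shared.
    worker.once('message', () => resolve(Atomics.load(counter, 0)));
    worker.once('error', reject);
  });

countInWorker(1000).then((value) => console.log(value));
```

Only the short 'done' message travels through postMessage() here; the counter itself is never copied between threads.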

To visualize this, you can refer to the comparison diagram between Child Process and Worker threads:

[Figure: The cost of creating a worker]

However, both Child Process and Worker threads have a resource cost, so avoid creating more of them than you actually need.
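One way to keep that cost down is to start a worker once and send it several tasks, instead of creating a new worker for each task. The following is a minimal sketch under that assumption; createReusableWorker and the id-based message format are invented for this example and are not part of the worker_threads API.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) is used only so the sketch fits in one file.
const workerSource = `
  const { parentPort } = require('worker_threads');
  // Stay alive and answer every request, tagging replies with the request id.
  parentPort.on('message', ({ id, n }) => {
    let a = 0;
    let b = 1;
    for (let i = 0; i < n; i++) {
      [a, b] = [b, a + b];
    }
    parentPort.postMessage({ id, result: a });
  });
`;

// Hypothetical helper: wraps one long-lived worker behind a promise-based API.
const createReusableWorker = () => {
  const worker = new Worker(workerSource, { eval: true });
  const pending = new Map(); // request id -> resolve function
  let nextId = 0;

  worker.on('message', ({ id, result }) => {
    pending.get(id)(result);
    pending.delete(id);
  });

  return {
    // Send a task to the already-running worker instead of spawning a new one.
    run: (n) =>
      new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        worker.postMessage({ id, n });
      }),
    close: () => worker.terminate(),
  };
};

const main = async () => {
  const fibWorker = createReusableWorker();
  const results = await Promise.all([10, 20, 30].map((n) => fibWorker.run(n)));
  console.log(results);
  await fibWorker.close();
};

main().catch(console.error);
```

The three tasks share a single thread, so they run one after another inside the worker; the thread startup cost is paid only once.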

How to use Worker threads?

The Node.js documentation provides a simple example of how to implement a single worker in its Worker threads section.

In this article, I'll provide an example of a simple implementation of a worker that calculates the Fibonacci sequence in a separate thread.

First, create a file named main.js:

const { Worker } = require('worker_threads');

// Runs worker.js in a new thread and resolves with the first message it posts.
const runService = (workerData) => {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./worker.js', { workerData });

    worker.on('message', resolve);
    worker.on('error', reject);
    // A non-zero exit code means the worker stopped abnormally.
    worker.on('exit', (code) => {
      if (code !== 0) {
        reject(new Error(`Worker stopped with exit code ${code}`));
      }
    });
  });
};

const run = async () => {
  const result = await runService(999999);
  console.log(result);
};

run().catch(console.error);

Next, create a worker.js file:

const { parentPort, workerData } = require('worker_threads');

const fibonacci = (n) => {
  var i;
  var fib = [];

  fib[0] = 0;
  fib[1] = 1;
  for (i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }

  parentPort.postMessage(fib);
}

fibonacci(workerData);

Then run the main.js file with node main.js. The result is printed once the worker finishes. The computation itself is not faster than before, but the main thread stays free while it runs.

To explain the code: calling new Worker in the main file creates a worker that runs the code in worker.js. The second argument to new Worker is an options object, and its workerData property passes data from the main thread to the worker. After the worker finishes processing, it calls parentPort.postMessage() to send the result back to the main thread.
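To check that the main thread really stays responsive while a worker computes, a timer can be left running during the computation. This is a single-file sketch: the inline worker source stands in for worker.js, and runWithHeartbeat is a hypothetical helper. The worker sends back only the array length to keep the message small.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source (eval: true) stands in for a separate worker.js file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const fib = [0, 1];
  for (let i = 2; i <= workerData; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }
  // Send back only the length to keep the message small.
  parentPort.postMessage(fib.length);
`;

// Hypothetical helper: counts timer ticks on the main thread while a worker runs.
const runWithHeartbeat = (n) =>
  new Promise((resolve, reject) => {
    let ticks = 0;
    // This timer keeps firing while the worker is busy.
    const heartbeat = setInterval(() => {
      ticks++;
    }, 1);
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', (length) => {
      clearInterval(heartbeat);
      resolve({ length, ticks });
    });
    worker.once('error', (err) => {
      clearInterval(heartbeat);
      reject(err);
    });
  });

runWithHeartbeat(999999).then(({ length, ticks }) => {
  console.log(`worker produced ${length} values; main thread ticked ${ticks} times`);
});
```

If the same loop ran on the main thread, the interval callback could not fire until the loop had finished.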

In practice, it is advisable to follow established conventions rather than hand-rolling worker management in every project. One option is a community-built package that is quick to adopt, such as node-worker-threads-pool from npm, which manages a pool of workers for you.

For example, reimplementing the Fibonacci code with the package makes it shorter and easier to read:

const { StaticPool } = require('node-worker-threads-pool');

const fibonacci = (n) => {
  var i;
  var fib = [];

  fib[0] = 0;
  fib[1] = 1;
  for (i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }
  return fib;
}

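// A fixed pool of 4 workers, each able to run the fibonacci task.
const staticPool = new StaticPool({
  size: 4,
  task: fibonacci,
});

staticPool
  .exec(999999)
  .then(console.log)
  .finally(() => staticPool.destroy()); // stop the workers so the process can exit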

Conclusion

Worker threads are a Node.js module that lets you run JavaScript code in parallel with the main thread. Use worker threads when you have synchronous code that takes a significant amount of processing time. This frees the main thread to keep handling subsequent requests instead of being blocked until the computation finishes.

The resource cost of creating a worker is lower compared to Child Process, but both can be "expensive," so be cautious when creating too many of them.

Implementing worker threads has become easier with the help of community-supported packages available on npm, such as the node-worker-threads-pool npm package.

Author

Hello, my name is Hoai - a developer who tells stories through writing ✍️ and creating products 🚀. With many years of programming experience, I have contributed to various products that bring value to users at my workplace as well as to myself. My hobbies include reading, writing, and researching... I created this blog with the mission of delivering quality articles to the readers of 2coffee.dev. Follow me on LinkedIn, Facebook, Instagram, and Telegram.


Comments (6)

Anonymous, 1 year ago
The node-worker-threads-pool library only makes a function run on a separate thread; it does not run on multiple CPUs at the same time.

Đình Trung, 1 year ago
Are worker threads the same as worker pools, or different?
  Xuân Hoài Tống, 1 year ago
  They are different. I have a series of articles about the Node.js architecture; reading it should make this easier to understand.

Trần Huy Hoàng, 1 year ago
Apart from computing the Fibonacci sequence above, are there any real-world use cases?
  Xuân Hoài Tống, 1 year ago
  A tough question. You can apply it in any case where your "work" takes long enough to block the main thread, for example image or video processing.

Tiến Đức, 1 year ago
In that case, when should I use worker threads and when should I use child process?
  Xuân Hoài Tống, 1 year ago
  @Phương I find that worker threads and child processes are used in much the same way. However, workers came later and use fewer resources than child processes, so they are the more recommended option.
  Nguyễn Minh Phương, 1 year ago
  In my humble opinion, use child process when you need to scale up your project without reaching for other tools such as docker, and use worker threads when your project has to process data by looping over something like a million elements, to avoid blocking the event loop.
  Xuân Hoài Tống, 1 year ago
  I will probably write a separate article about this question when I have time, but in short, worker threads are the more popular choice.

Tùng Nguyễn, 2 years ago
Node is actually multi-threaded in libuv, so why create worker threads at all?
  Xuân Hoài Tống, 2 years ago
  The commenter below is right, Tùng. Node can be multi-threaded in the background, but there is only one main thread, and it runs JavaScript code synchronously.
  Văn Thành Phan, 2 years ago
  But the main thread still has to wait more if there is only one main thread, doesn't it?

Nhí Nhố Tí, 2 years ago
Absolutely wonderful, really nice 😍