What are Worker threads? Do you know when to use them in node.js?

What are Worker threads? Do you know when to use them in node.js?

Daily short news for you
  • Privacy Guides is a non-profit project aimed at providing users with insights into privacy rights, while also recommending best practices or tools to help reclaim privacy in the world of the Internet.

    There are many great articles here, and I will take the example of three concepts that are often confused or misrepresented: Privacy, Security, and Anonymity. While many people who oppose privacy argue that a person does not need privacy if they have 'nothing to hide.' 'This is a dangerous misconception, as it creates the impression that those who demand privacy must be deviant, criminal, or wrongdoers.' - Why Privacy Matters.

    » Read more
  • There is a wonderful place to learn, or if you're stuck in the thought that there's nothing left to learn, then the comments over at Hacker News are just for you.

    Y Combinator - the company behind Hacker News focuses on venture capital investments for startups in Silicon Valley, so it’s no surprise that there are many brilliant minds commenting here. But their casual discussions provide us with keywords that can open up many new insights.

    Don't believe it? Just scroll a bit, click on a post that matches your interests, check out the comments, and don’t forget to grab a cup of coffee next to you ☕️

    » Read more
  • Just got played by my buddy Turso. The server suddenly crashed, and checking the logs revealed a lot of errors:

    Operation was blocked LibsqlError: PROXY_ERROR: error executing a request on the primary

    Suspicious, I went to the Turso admin panel and saw the statistics showing that I had executed over 500 million write commands!? At that moment, I was like, "What the heck? Am I being DDoSed? But there's no way I could have written 500 million."

    Turso offers users free monthly limits of 1 billion read requests and 25 million write requests, yet I had written over 500 million. Does that seem unreasonable to everyone? 😆. But the server was down, and should I really spend money to get it back online? Roughly calculating, 500M would cost about $500.

    After that, I went to the Discord channel seeking help, and very quickly someone came in to assist me, and just a few minutes later they informed me that the error was on their side and had restored the service for me. Truly, in the midst of misfortune, there’s good fortune; what I love most about this service is the quick support like this 🙏

    » Read more

Problem

Worker threads were first introduced in Node.js version 10.5 and their API was still in the experimental stage until it received stable release in version 12LTS.

Worker threads provide a solution to run JavaScript code on a separate thread parallel to the main thread. So, how does this work and what benefits does it bring? Keep reading to find out.

CPU-intensive tasks

You might already know that node.js excels at handling asynchronous I/O tasks. When it comes to I/O, people often think about tasks like reading/writing data to files, making HTTP requests, and so on.

However, for synchronous tasks, such as complex computations on a large dataset, it can cause a serious bottleneck in the main thread.

Imagine a synchronous computation that takes 10 seconds to process. This means the main thread will be blocked for 10 seconds to handle that request before it can process subsequent requests, which is detrimental to the server's responsiveness.

A classic example of such a computation is the Fibonacci sequence. The Fibonacci sequence is an infinite sequence of natural numbers that begins with 0 and 1, and each subsequent element is the sum of the two preceding ones. A Fibonacci function in JavaScript can be written like this:

const fibonacci = (n) => {
  var i;
  var fib = [];

  fib[0] = 0;
  fib[1] = 1;
  for (i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }
  return fib;
}

Try calling the fibonacci(999999) function, and your main thread might take more than a second to compute the result.

What are Worker threads?

Worker threads are a module in node.js that allows you to run JavaScript code parallel to the main thread. Each worker runs independently, but they can communicate with each other through postMessage(). For more in-depth knowledge, you can refer to the full documentation on Worker threads at Worker threads.

Why do we need Worker threads?

As mentioned earlier, we may need Worker threads to handle cases where we have large or complex data computations to prevent blocking the main thread.

The main thread sends a request to a worker to execute JavaScript code. After completion, the worker informs the main thread by calling postMessage(). The main thread receives the data from the worker and continues processing that request.

By moving complex JavaScript computations away from the main thread, subsequent requests can be processed normally without any blocking.

The cost of creating a worker

Before Worker threads were introduced in version 10.15, there were other ways to run JavaScript code on a separate thread, such as Cluster and Child Process.

Cluster maximizes the use of CPU threads to create more main threads since by default, a node.js project runs on a single thread. Using Cluster, if you have a server with 4 cores and 8 threads, the maximum number of main threads created is 8, which matches the number of CPU threads. Incoming requests will be distributed in a round-robin fashion or using another algorithm.

Child Process is a different solution compared to Cluster. It creates a separate process with its own dedicated event loop and main thread, resulting in a higher usage of system resources for each process. However, communication between processes is relatively complex because each process has its own memory.

Worker threads were introduced to address resource usage concerns with Child Process. Instead of creating a new process, worker threads create a new thread within the process of the running application. This helps minimize resource usage because the resources needed to create a thread are less than those needed to create a process. Furthermore, threads share resources, making communication between them relatively easy.

To visualize this, you can refer to the comparison diagram between Child Process and Worker threads:

worker threads diagram

However, both Child Process and Worker threads have resource costs, so it's important to carefully consider creating too many of them.

How to use Worker threads?

The documentation of node.js provides a simple example of how to implement a single worker, which you can see at Worker threads.

In this article, I'll provide an example of a simple implementation of a worker that calculates the Fibonacci sequence in a separate thread.

First, create a file named main.js:

const { Worker } = require('worker_threads');

const runService = (workerData) => {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./worker.js', { workerData });

    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0)
        reject(new Error(`stopped with  ${code} exit code`));
    });
  })
}

const run = async () => {
  const result = await runService(999999);
  console.log(result);
}

run().catch(console.log);

Next, create a worker.js file:

const { parentPort, workerData } = require('worker_threads');

const fibonacci = (n) => {
  var i;
  var fib = [];

  fib[0] = 0;
  fib[1] = 1;
  for (i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }

  parentPort.postMessage(fib);
}

fibonacci(workerData);

Then, run the main.js file, and you will see the result of the Fibonacci sequence almost instantaneously.

To explain the code, when you call new Worker in the main file, it creates a worker that contains the code from the worker.js file. new Worker takes the second parameter workerData to pass data from the main thread to the worker. After the worker finishes processing, it calls the postMessage function to communicate the result back to the main thread.

In practical implementations of Worker threads, it is advisable to adhere to agreed-upon principles for consistency. One such principle is to use community-built packages that offer high compatibility and quick implementation, like the node-worker-threads-pool npm package.

For example, to reimplement the Fibonacci code using the package, the code becomes shorter and easier to read:

const { StaticPool } = require('node-worker-threads-pool');

const fibonacci = (n) => {
  var i;
  var fib = [];

  fib[0] = 0;
  fib[1] = 1;
  for (i = 2; i <= n; i++) {
    fib[i] = fib[i - 2] + fib[i - 1];
  }
  return fib;
}

const staticPool = new StaticPool({
  size: 4,  
  task: fibonacci,  
});

staticPool.exec(999999).then(console.log);

Conclusion

Worker threads are a module in node.js that allows you to run JavaScript code parallel to the main thread. Use worker threads when you have synchronous code that takes a significant amount of processing time. This way, you can free up the main thread to handle subsequent requests without being blocked for a certain period of time.

The resource cost of creating a worker is lower compared to Child Process, but both can be "expensive," so be cautious when creating too many of them.

Implementing worker threads has become easier with the help of community-supported packages available on npm, such as the node-worker-threads-pool npm package.

Premium
Hello

The secret stack of Blog

As a developer, are you curious about the technology secrets or the technical debts of this blog? All secrets will be revealed in the article below. What are you waiting for, click now!

As a developer, are you curious about the technology secrets or the technical debts of this blog? All secrets will be revealed in the article below. What are you waiting for, click now!

View all

Subscribe to receive new article notifications

or
* The summary newsletter is sent every 1-2 weeks, cancel anytime.

Comments (6)

Leave a comment...
Avatar
Ẩn danh1 year ago
Thư viện node-worker-threads-pool này chỉ giúp function chạy trên luồng riêng chứ không chạy trên multi CPU cùng lúc.
Reply
Avatar
Đình Trung1 year ago
worker threads có giống/khác gì với worker pools không a?
Reply
Avatar
Xuân Hoài Tống1 year ago
Khác bạn ạ, mình có series bài viết nói về kiến trúc Node.js bạn có thể tìm đọc lại sẽ dễ hiểu hơn.
Avatar
Trần Huy Hoàng1 year ago
Ngoài tính dãy fibo trên kia ra thì có ứng dụng thực tế nào nữa ko bạn?
Reply
Avatar
Xuân Hoài Tống1 year ago
Một câu hỏi khó, bạn có thể áp dụng trong bất kì trường hợp nào mà "công việc" của bạn đủ lâu để chặn luồng chính, ví dụ như xử lý hình ảnh, video chẳng hạn.
Avatar
Tiến Đức2 years ago
nếu vậy thì khi nào dùng worker threads khi nào dùng child process vậy ạ
Reply
Avatar
Xuân Hoài Tống2 years ago
@Phương mình thì thấy worker thread và child process có cách dùng tương đương nhau. Có điều worker được ra sau và nó dùng ít tài nguyên hơn so với child nên nó được khuyên dùng hơn.
Avatar
Nguyễn Minh Phương2 years ago
theo ngu kiến của e thì khi nào bác cần scale up project của bác mà ko muốn dùng đến các công cụ khác như docker, .. thì dùng child process, còn khi nào bác cần xử lý dữ liệu data trong project phải lặp lên đến cả triệu element(đại loại thế) thì nên dùng worker threads để tránh block event loop
Avatar
Xuân Hoài Tống2 years ago
Câu hỏi này chắc có thời gian mình sẽ viết một bài riêng, nhưng để mà nói ngắn gọn là worker threads được yêu thích sử dụng hơn&nbsp;
Avatar
Tùng Nguyễn2 years ago
Thực ra node là đa luồng ở libuv vậy thì tại sao lại phải tạo ra worker thread để làm gì?
Reply
Avatar
Xuân Hoài Tống2 years ago
Bác ở dưới nói đúng rồi đấy b Tùng, node có thể đa luồng ở background nhưng luồng chính chỉ có một và nó xử lý đồng bộ mã của js
Avatar
Văn Thành Phan2 years ago
Nhưng luồng chính vẫn phải wait nhiều hơn nếu chỉ 1 main thread chứ bác
Avatar
Nhí Nhố Tí2 years ago
Quá tuyệt vời quá nai xừ 😍
Reply