Node.js Architecture - Single thread, call stack, synchronous and asynchronous I/O in Node.js

Issue

Most programming languages execute code synchronously by default. This means that the code is executed sequentially, from top to bottom, and each statement finishes before the next one starts.

An example in Go: a simple program that prints the words "Hello" and "World" with a 2-second delay between them:

package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Printf("Hello")
    time.Sleep(2 * time.Second)
    fmt.Printf("World")
}

However, in JavaScript (and therefore Node.js), functions can be either synchronous or asynchronous. An asynchronous function does not return its result immediately; the result becomes available at some later point in time, and in the meantime Node.js continues running the code that follows.

Another example is a Node.js program that seems to print the words "Hello" and "World" in order, but the result is "World Hello" instead:

setTimeout(function() {
  console.log("Hello");
}, 0);

console.log("World");

Because setTimeout is an asynchronous function, Node.js "takes note" of the callback to run later instead of running it right away, even with a delay of 0. The callback only runs after the currently executing code has finished, which is why "Hello" is printed after "World".
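
To make the ordering easier to observe, here is a minimal sketch that records each step in an array. The `order` array and the loop are illustrative additions, not part of the original example.

```javascript
// Records the order in which each step runs (the array name is illustrative).
const order = [];

setTimeout(() => {
  order.push("Hello"); // runs only after the synchronous code below has finished
  console.log(order.join(" ")); // prints "World Hello"
}, 0);

// A synchronous loop: the timer callback cannot interrupt it.
let sum = 0;
for (let i = 0; i < 1e6; i++) {
  sum += i;
}

order.push("World");
```

Even though the timer is due after 0 ms, its callback has to wait until the loop and the final push are done.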

So how does this "asynchronous" behavior affect Node.js? Let's find out in this article!

Single Thread

You may have heard that JavaScript or Node.js is single-threaded. This means that your JavaScript code runs in a single thread. If that's the case, then tasks such as file I/O, making external HTTP requests, etc. would have to be done sequentially, which would result in slow response times!

Imagine you are writing an API server with an endpoint that contains JavaScript code, and each request takes an average of 5 seconds to complete. When the second request comes, it has to wait for 5 seconds before it can continue processing!?

This is only partly true. It holds when all the code in the handler is synchronous, but when asynchronous functions are involved, the second request does not have to wait the full 5 seconds. This is because Node.js handles asynchronous functions using the Event Loop with the help of the Event Queue and the Thread Pool. To avoid overwhelming you with too many concepts, let me introduce them one by one.
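
As a rough sketch of that idea, the snippet below simulates two requests whose slow part is asynchronous. `handleRequest` and the 100 ms delay are hypothetical stand-ins for a real handler and real I/O.

```javascript
// Records when each simulated request starts and ends.
const events = [];

// Hypothetical handler: the slow part is an asynchronous wait, not blocking code.
const handleRequest = (id) =>
  new Promise((resolve) => {
    events.push(`start ${id}`);
    setTimeout(() => {
      events.push(`end ${id}`);
      resolve(id);
    }, 100);
  });

// The second request starts right away; it does not wait for the first to end.
const allDone = Promise.all([handleRequest(1), handleRequest(2)]).then(() => {
  console.log(events.join(", ")); // start 1, start 2, end 1, end 2
  return events;
});
```

Both requests are in progress at the same time, so the total wait is close to one delay rather than two.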

In Node.js, there are synchronous and asynchronous functions. Basic statements and operations such as if-else, switch-case, loops, and JSON.parse are executed synchronously. Some Node.js built-ins are synchronous as well, such as readFileSync and gzipSync. Asynchronous functions include readFile, gzip, HTTP requests made through the http module, and most third-party libraries that interact with files, databases, etc.
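
The difference between the two flavors can be seen with the zlib built-ins mentioned above. This is a small sketch; the input data is made up for the demonstration.

```javascript
const zlib = require("zlib");

// Arbitrary sample data for the demonstration.
const input = Buffer.from("hello ".repeat(1000));

// Synchronous: the call returns the compressed Buffer directly,
// and nothing else runs until it has finished.
const compressedSync = zlib.gzipSync(input);
console.log("sync result ready:", compressedSync.length, "bytes");

// Asynchronous: the call returns immediately and the result
// arrives later through the callback.
zlib.gzip(input, (err, compressedAsync) => {
  if (err) throw err;
  console.log("async result ready:", compressedAsync.length, "bytes");
});

console.log("this line runs before the async result is ready");
```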

Here is a general diagram of the components in Node.js:

[Figure: Node.js components]

You can see that Node.js consists of three main components: Chrome's V8, the Node.js Standard Library, and LibUv. V8 is the engine that executes JavaScript code. The Node.js Standard Library provides capabilities that V8 alone does not, such as file operations and HTTP requests. The third component, LibUv, provides the Event Loop and the Thread Pool used to handle asynchronous operations.

Call Stack

The call stack keeps track of the functions that are currently being executed. Each time a function is called, it is pushed onto the call stack, and it is removed when it returns. At any given time, only the function on top of the stack is running.

To illustrate this further, let's take a look at an example of the code that converts Celsius to Fahrenheit:

const add = (a, b) => a + b;
const multiply = (a, b) => a * b;

const addCofficient = (val) => multiply(val, 1.8);
const addConst = (val) => add(val, 32);

const convertCtoF = (val) => {
  let result = val;
  result = addCofficient(result);
  result = addConst(result);
  return result;
};

convertCtoF(100);

In the above example, we call the convertCtoF function, which calls addCofficient and addConst; these in turn call multiply and add, respectively.

The execution order of these functions in the call stack is described in the following diagram:

[Figure: The call stack while the program executes]

We can see that convertCtoF is pushed onto the call stack first, followed by addCofficient. Since addCofficient calls multiply, multiply is pushed on top of the stack. A function is popped off the stack as soon as it returns, so multiply finishes first, then addCofficient, and the same happens for addConst and add before convertCtoF itself returns. The last function pushed is the first one to finish: this is the First In Last Out (FILO) principle, also known as Last In First Out (LIFO), which is why the structure is called a stack.
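
The push and pop order can be reproduced with a hand-rolled stack. This is only an illustration: `traced`, `stack`, and `log` are hypothetical helpers, and the real call stack is managed by the engine, not by an array.

```javascript
// A hand-rolled stack that mirrors what the engine does (for illustration only).
const stack = [];
const log = [];

// Wraps a function so that entering pushes its name and returning pops it.
const traced = (name, fn) => (...args) => {
  stack.push(name);
  log.push(`push ${name} -> [${stack.join(", ")}]`);
  const result = fn(...args);
  stack.pop();
  log.push(`pop  ${name} -> [${stack.join(", ")}]`);
  return result;
};

const add = traced("add", (a, b) => a + b);
const multiply = traced("multiply", (a, b) => a * b);
const addCofficient = traced("addCofficient", (val) => multiply(val, 1.8));
const addConst = traced("addConst", (val) => add(val, 32));
const convertCtoF = traced("convertCtoF", (val) => addConst(addCofficient(val)));

const fahrenheit = convertCtoF(100);
console.log(log.join("\n"));
console.log(fahrenheit); // 212
```

The log shows the stack growing to three entries when multiply runs and shrinking back to empty once convertCtoF returns.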

If an exception occurs during execution, the error comes with a stack trace that shows where it happened. Because functions are pushed onto the call stack in the order they are called, the trace can list exactly which calls led to the error.

For example, let's modify the addConst function by changing the second parameter in the add function to a variable that doesn't exist in the program:

const addConst = (val) => add(val, number);

When running the program, an error will be thrown, including the cause and the location of the error:

ReferenceError: number is not defined
   at addConst:5:32
   at convertCtoF:10:12
   at eval:14:1

=> This means that number is not defined. The error was thrown in addConst at line 5, column 32, which was called from convertCtoF at line 10, column 12, which in turn was called from the top level of the program at line 14, column 1.
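
The same information can be read in code from the error object. The sketch below uses a shortened version of the program and relies on the `stack` property, which V8 (and so Node.js) provides; its exact text format is not standardized.

```javascript
const add = (a, b) => a + b;
const addConst = (val) => add(val, number); // `number` is deliberately undefined
const convertCtoF = (val) => addConst(val * 1.8);

let frames = [];
try {
  convertCtoF(100);
} catch (err) {
  console.log(`${err.name}: ${err.message}`);
  // Each "at ..." line of err.stack is one call stack frame, innermost first.
  frames = err.stack.split("\n").slice(1).map((line) => line.trim());
  console.log(frames.slice(0, 2).join("\n"));
}
```

The first frame names addConst and the second names convertCtoF, matching the order in which they sat on the call stack.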

You may have heard that JavaScript runs on a single thread, but if that's the case, wouldn't it be slow? Or some may say that Node.js is fundamentally multi-threaded!? It sounds contradictory, doesn't it? On one hand, we say that JavaScript is single-threaded, and on the other hand, we say that Node.js is based on JavaScript and yet it is multi-threaded. So what is the truth? Is Node.js single-threaded or not?

The answer is yes: your JavaScript code runs on a single thread. However, Node.js cleverly hands time-consuming tasks over to another component (LibUv), which can process them on multiple threads!

I/O Tasks

I/O tasks in Node.js typically refer to operations such as file read/write or network-related activities like making HTTP requests. In a real-world server program, I/O tasks are common and you probably use them frequently. These tasks take a considerable amount of time to process because they are related to factors such as file size, network bandwidth, or server processing speed.

In Node.js, I/O consists of two types: synchronous and asynchronous.

Synchronous I/O

Let's consider an example of reading files:

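const fs = require("fs");

const pdf = fs.readFileSync("file.pdf"); // blocks until the whole file has been read
console.log("pdf size", pdf.length); // a Buffer exposes its size in bytes as .length
const doc = fs.readFileSync("file.doc");
console.log("doc size", doc.length);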

Reading a file is a time-consuming task. readFileSync is a synchronous function, so file.pdf is read completely and its size is printed before the reading of file.doc even starts. The processing time for the two tasks is described in the following diagram:

[Figure: The file reading process]

We can see that the time to read file.pdf is 3ms, file.doc is 3ms, and the total time we have to wait for all the tasks to complete is 6ms.

6ms is fast in this example, but imagine the files growing so large that each one takes 30 seconds to read. The call stack would be blocked: while a file is being read, no other code can run. The code runs strictly in the order Read file -> Print -> Read file -> Print.
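
The effect of a blocked call stack can be sketched without any files. Here `blockFor` is a hypothetical busy-wait that stands in for a slow synchronous call, and the 10 ms and 200 ms values are arbitrary.

```javascript
const start = Date.now();
let timerDelay = null;

// Ask for a callback after 10 ms.
setTimeout(() => {
  timerDelay = Date.now() - start;
  console.log(`timer fired after ~${timerDelay} ms`); // at least 200 ms, not 10 ms
}, 10);

// Hypothetical stand-in for a slow synchronous call, such as readFileSync on a huge file.
const blockFor = (ms) => {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait: the call stack never empties during this loop
  }
};

blockFor(200);
console.log("synchronous work finished");
```

The timer was due after 10 ms, but its callback cannot run until the synchronous work releases the call stack.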

Asynchronous I/O

Now let's modify the above code a bit by replacing the readFileSync function with readFile:

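const fs = require("fs");

const pdf = fs.promises.readFile("file.pdf"); // returns a pending Promise, not the file contents
console.log("pdf size", pdf.length);
const doc = fs.promises.readFile("file.doc");
console.log("doc size", doc.length);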

readFile is an asynchronous function. An asynchronous function does not immediately return a result, but instead returns it at some point in time. If you run the code above, you will see a result that looks like this:

pdf size undefined
doc size undefined

That's because pdf and doc do not hold the file contents yet. Each one is a Promise that is still pending when console.log runs, so reading a size from it yields undefined.

To solve this problem for asynchronous functions, callbacks are a useful method. In simple terms, a callback is a function that is called after an asynchronous function has a result.

It may be challenging to imagine, but let me give you an example that is easy to understand. I will modify the above code a bit:

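const fs = require("fs");

// fs.promises.readFile returns a Promise that resolves to a Buffer.
fs.promises.readFile("file.pdf")
  .then(pdf => console.log("pdf size", pdf.length));

fs.promises.readFile("file.doc")
  .then(doc => console.log("doc size", doc.length));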

then registers a callback on the Promise returned by the asynchronous function; the callback runs once the result is available. Alternatively, the callback-style fs.readFile accepts the callback as its last argument, like this:

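const fs = require("fs");

fs.readFile("file.pdf", function (err, pdf) {
  if (err) throw err; // the first argument reports a failure, if any
  console.log("pdf size", pdf.length);
});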

By replacing readFileSync with readFile, the total waiting time is significantly reduced because the files are read almost in parallel in a place called the Thread Pool. Please refer to the following diagram for a clearer understanding:

[Figure: Reading files asynchronously]

These are the benefits that asynchronous I/O brings to Node.js. Time-consuming I/O tasks are offloaded to the Thread Pool, so they do not occupy the call stack and hold up the rest of the program.
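
A self-contained way to try this is sketched below. It creates two throwaway files in the system temp directory (the file contents and sizes are made up) and starts both reads before waiting for either.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Create two small throwaway files so the example can run anywhere.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "async-read-"));
const pdfPath = path.join(dir, "file.pdf");
const docPath = path.join(dir, "file.doc");
fs.writeFileSync(pdfPath, "a".repeat(3000));
fs.writeFileSync(docPath, "b".repeat(2000));

// Both reads start immediately; Promise.all resolves once both have finished.
const sizes = Promise.all([
  fs.promises.readFile(pdfPath),
  fs.promises.readFile(docPath),
]).then(([pdf, doc]) => {
  console.log("pdf size", pdf.length); // 3000
  console.log("doc size", doc.length); // 2000
  return [pdf.length, doc.length];
});
```

Promise.all is one possible way to wait for several asynchronous results; chaining separate then calls, as shown earlier, works as well.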

The difference is easy to notice if you have worked with languages where I/O calls are typically written synchronously, such as PHP or Golang: there, a database query blocks until its result is returned. In Node.js, the query is asynchronous, and you have to use a callback or a Promise to receive the result once it is ready.

mysql.query("select * from user where id = 1", function (err, result) {
  if (err) throw err;
  console.log("user", result);
});
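
If you prefer Promises over callbacks, a callback-style function can be wrapped with util.promisify. In the sketch below, `query` is a hypothetical stand-in for a database driver function and returns made-up data; it is not the API of any particular MySQL library.

```javascript
const { promisify } = require("util");

// Hypothetical stand-in for a driver's callback-style query function.
const query = (sql, callback) => {
  setTimeout(() => callback(null, [{ id: 1, name: "example" }]), 10);
};

// promisify turns an (args..., callback(err, value)) function into one that returns a Promise.
const queryAsync = promisify(query);

const main = async () => {
  const result = await queryAsync("select * from user where id = 1");
  console.log("user", result[0]);
  return result;
};

const done = main();
```

The await keyword pauses only the main function while the query is pending; other code remains free to run in the meantime.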

Conclusion

Node.js is a single-threaded environment, which means that at any given time, only one piece of code is being executed. However, this doesn't make Node.js slow, as it uses asynchronous functions with the help of the Event Loop.

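Node.js is made up of three main components: V8, which handles JavaScript code execution; the Node.js Standard Library, which provides capabilities such as file operations and HTTP requests; and LibUv, which provides the Event Loop for handling asynchronous operations.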

So how does Node.js handle asynchronous tasks? I will explain this in more detail in the next article.
