New Trojan Source attack impacts compilers for most programming languages
Academics from the University of Cambridge in the United Kingdom have published details today about a theoretical attack that can be used to insert malicious code inside legitimate apps via their comment fields.
The attack, nicknamed Trojan Source, relies on the usage of bidirectional control characters inside source code comments.
Also known as BiDi characters, these are Unicode control characters that are used inside a line of text to signal the shift from an LTR (left-to-right) mode to RTL (right-to-left) or vice versa.
In practice, these characters are meant solely for software applications and are invisible to the human eye, as they are only used to embed text of a different reading direction inside large blocks of text (such as inserting Arabic or Hebrew strings inside large blocks of Latin text).
The Cambridge research team said it discovered that most code compilers and code editors don't have protocols to handle BiDi characters or signal their presence inside source code comments.
The researchers said attackers could insert BiDi control characters into comments, where human reviewers cannot see them. When the file is compiled, text that appeared to sit in a comment is treated as executable code, or (security-related) code ends up inside a comment, opening applications to attacks or negating security checks.
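To make the detection side of this concrete, here is a minimal Python sketch of a scanner that flags BiDi control characters in a block of source text. It is an illustration only, not the researchers' tooling or any vendor's fix: the function name `find_bidi_controls` is hypothetical, and the character set is assumed to be the nine Unicode directional formatting characters (embeddings, overrides, isolates, and their terminators).

```python
import unicodedata

# Assumption: these nine Unicode directional formatting characters are the
# ones worth flagging. They are written as escapes so this file itself
# contains no invisible characters.
BIDI_CONTROLS = {
    "\u202A",  # LEFT-TO-RIGHT EMBEDDING
    "\u202B",  # RIGHT-TO-LEFT EMBEDDING
    "\u202C",  # POP DIRECTIONAL FORMATTING
    "\u202D",  # LEFT-TO-RIGHT OVERRIDE
    "\u202E",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def find_bidi_controls(source: str):
    """Return a list of (line_number, column, unicode_name) tuples,
    one per BiDi control character found. Lines and columns are 1-based."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits


if __name__ == "__main__":
    # A harmless sample: a comment that hides two control characters.
    sample = "x = 1\nif admin: # \u202e check \u2066\n    pass\n"
    for line_no, col, name in find_bidi_controls(sample):
        print(f"line {line_no}, column {col}: {name}")
```

A check like this could run in a pre-commit hook or a CI step, so that a reviewer is warned even when the editor or code hosting service renders the characters invisibly.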
"We've verified that this attack works against C, C++, C#, JavaScript, Java, Rust, Go, and Python, and suspect that it will work against most other modern languages," Ross Anderson, one of the two researchers behind the Trojan Source technique, explained in a blog post published earlier today.
In addition to code compilers, Anderson and his colleague Nicholas Boucher said that several code editors and source code hosting services were also found to be vulnerable [see table below].
Besides the BiDi-related attack, the two researchers also found that source code compilers are vulnerable to a second issue, known as a homoglyph attack, in which classic Latin letters are replaced with lookalike characters from other Unicode scripts (alphabets).
Researchers said this second attack could be used to create two different functions that look the same in the eyes of a code reviewer but are actually different from one another.
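The homoglyph problem can be illustrated with a short Python sketch. It compares two identifiers that render almost identically but differ in one code point, then lists the non-ASCII characters in the lookalike. The helper `describe_non_ascii` and the sample names are hypothetical, chosen for illustration. Flagging every non-ASCII character is a deliberately blunt heuristic that would also flag legitimate non-Latin identifiers.

```python
import unicodedata


def describe_non_ascii(identifier: str):
    """Return (index, code_point, unicode_name) for every non-ASCII
    character in the identifier. Indexes are 0-based."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(identifier)
        if ord(ch) > 127
    ]


# Two names that look alike on screen but are different strings.
latin_name = "sayHello"
lookalike_name = "say\u041dello"  # Cyrillic capital EN in place of Latin H

if __name__ == "__main__":
    print(latin_name == lookalike_name)        # the two names are not equal
    print(describe_non_ascii(latin_name))      # nothing to report
    print(describe_non_ascii(lookalike_name))  # reports the Cyrillic letter
```

Because the two strings are distinct, a compiler treats them as two separate functions, while a reviewer reading the rendered code is likely to see a single name.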
Anderson and Boucher argued that an attacker could use a dependency or a plugin to define the homoglyph function outside the app's main codebase and add malicious code to a project without the maintainer's knowledge.
Since many of today's coding processes rely on contributions from multiple developers, the Cambridge researchers argued that it is important for compilers and code editors to detect BiDi and homoglyph characters and warn human code reviewers that non-standard Unicode glyphs are being used in source code, which is typically written in the Latin character set.
The two researchers said they gave all the affected parties a 99-day embargo to fix the two attacks in their tools before they published details about the Trojan Source attack earlier today.
At the time of writing, the team behind the official Rust compiler has released a security update to fix both attacks — tracked as CVE-2021-42574 (the BiDi attack) and CVE-2021-42694 (the homoglyph attack). Other fixes are expected in the coming days.
Additional details about the Trojan Source attacks are available on a dedicated website. Proof-of-concept code is available on GitHub.
Liran Tal, a security researcher at DevSecOps firm Snyk, published an npm package and ESLint plugin to detect Trojan Source attacks in JavaScript-based projects.
Catalin Cimpanu is a cybersecurity reporter who previously worked at ZDNet and Bleeping Computer, where he became a well-known name in the industry for his constant scoops on new vulnerabilities, cyberattacks, and law enforcement actions against hackers.