// Update: 05.11.2021
Files can now be decoded as well as encoded: a file containing supported bidi characters can be translated back into a template file with bidi placeholders (LRO, ...).
Generate malicious files that exploit the recently published bidi attack, a vulnerability discovered in the Unicode Specification that affects many interpreters and compilers.
Quoted from cve.mitre.org:
An issue was discovered in the Bidirectional Algorithm in the Unicode Specification through 14.0. It permits the visual reordering of characters via control sequences, which can be used to craft source code that renders different logic than the logical ordering of tokens ingested by compilers and interpreters. Adversaries can leverage this to encode source code for compilers accepting Unicode such that targeted vulnerabilities are introduced invisibly to human reviewers.
See the report about the Bidirectional Algorithm from unicode.org:
https://www.unicode.org/reports/tr9/tr9-42.html
See the original paper from the University of Cambridge:
https://www.trojansource.codes/trojan-source.pdf
python3 codegen.py [-h] [-m MODE] [-i INFILE] [-o OUTFILE] [-u] [-a]
arg | long arg | param | description |
---|---|---|---|
-h | --help | none | show this help message and exit |
-i | --infile | INFILE | Input file (template) containing unicode placeholders |
-o | --outfile | OUTFILE | Output file to store the final code |
-u | --uctable | none | Supported bidi-related characters |
-a | --about | none | Print about text |
-m | --mode | MODE | Use e\|ncode to convert a template to malicious code, or d\|ecode for the reverse |
This translates a template file containing bidi placeholders into a file with the actual bidi characters. All examples are taken from the referenced PDF. To run them, execute codegen.py with the required arguments:
python3 codegen.py -m encode -i infile.xyz -o outfile.xyz
and then run or compile outfile.xyz.
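The encoding step can be pictured with a minimal sketch. It is not the code of codegen.py: the `{NAME}` placeholder syntax, the `encode` function, and the sample template string are assumptions made for illustration only. The code points themselves are the bidi control characters defined in the Unicode Bidirectional Algorithm.

```python
import re

# Bidi control characters as defined by the Unicode Bidirectional Algorithm.
BIDI = {
    "LRE": "\u202a", "RLE": "\u202b", "PDF": "\u202c",
    "LRO": "\u202d", "RLO": "\u202e",
    "LRI": "\u2066", "RLI": "\u2067", "FSI": "\u2068", "PDI": "\u2069",
}

# ASSUMPTION: placeholders are written as the abbreviation in curly braces,
# e.g. {RLO}. The real template syntax of codegen.py may differ.
PLACEHOLDER = re.compile(r"\{(" + "|".join(BIDI) + r")\}")

def encode(template: str) -> str:
    """Replace every known {NAME} placeholder with its bidi control character."""
    return PLACEHOLDER.sub(lambda m: BIDI[m.group(1)], template)

# Hypothetical template line, loosely modelled on the style of the paper's examples.
template = "if access_level != 'user{RLO} {LRI}// check if admin{PDI} {LRI}':"
encoded = encode(template)

# ascii() makes the otherwise invisible control characters visible as escapes.
print(ascii(encoded))
```

Unknown names such as `{XYZ}` are left untouched in this sketch, so ordinary braces in source code are not disturbed.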
This will translate a file containing bidi characters to a file with the corresponding bidi placeholders. NOTE: The output cannot be run, as it's only a template.
python3 codegen.py -m decode -i infile.xyz -o outfile.xyz
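Decoding is the inverse mapping. The sketch below is an illustration under assumptions, not the actual implementation: the `decode` function and the `{NAME}` placeholder syntax are hypothetical, while the code points are the standard bidi control characters.

```python
# Map each bidi control character back to its abbreviation.
BIDI_NAMES = {
    "\u202a": "LRE", "\u202b": "RLE", "\u202c": "PDF",
    "\u202d": "LRO", "\u202e": "RLO",
    "\u2066": "LRI", "\u2067": "RLI", "\u2068": "FSI", "\u2069": "PDI",
}

# ASSUMPTION: placeholders are rendered as {NAME}; codegen.py may use another form.
TRANSLATION = {ord(ch): "{" + name + "}" for ch, name in BIDI_NAMES.items()}

def decode(text: str) -> str:
    """Replace every bidi control character with a visible {NAME} placeholder."""
    return text.translate(TRANSLATION)

# A line that hides an RLO and an LRI becomes readable again.
suspicious = "level = 'user\u202e \u2066// admin\u2069'"
print(decode(suspicious))
```

Because every control character becomes plain ASCII text, the result is safe to inspect in any editor, which matches the note that the decoded output is a template and not runnable code.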
To create your own templates, set placeholders (listed by python3 codegen.py -u) where you want the special characters to appear. See the examples for a first impression of what a template can look like.
The following table (taken from the original Cambridge report) shows the characters currently supported by this script. ~ https://www.trojansource.codes/trojan-source.pdf