From 435f6fd05819ba1cb8af9c329fcb1c02dbabe25c Mon Sep 17 00:00:00 2001
From: Aaron Taylor <ataylor@subgeniuskitty.com>
Date: Mon, 3 May 2021 14:16:27 -0700
Subject: [PATCH] Added a `brainstorming.md` file with misc ideas for future
 obfuscation projects.

---
 brainstorming.md | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 brainstorming.md

diff --git a/brainstorming.md b/brainstorming.md
new file mode 100644
index 0000000..66d9f33
--- /dev/null
+++ b/brainstorming.md
@@ -0,0 +1,47 @@
+# Overview #
+
+A few thoughts for use in future obfuscated programs.
+
+
+# Digraphs, Trigraphs and Syntax Highlighting #
+
+Trigraphs now betray their presence by requiring compiler flags (e.g. GCC's
+`-trigraphs`), making any direct benefit for obfuscation dubious.
+
+However, many syntax highlighters do not understand trigraphs and therefore
+highlight the source code incorrectly. In the following snippet, the trigraph
+`??/` is converted to `\`, which escapes the newline and creates a two-line
+comment. Highlighters that miss this conversion frequently display the
+`exit(0);` line as code rather than as part of the comment.
+
+    // Should I exit early?????/
+    exit(0);
+
+As long as syntax highlighting is kept sane elsewhere in an obfuscated program,
+the reader may gradually come to trust it, perhaps allowing an instance or two
+of trigraph-induced highlighting failure to slip past unnoticed.
+
+Of course, readers may run the equivalent of a search-and-replace, condensing
+trigraphs to their single-character equivalents. Since the C preprocessor does
+the same replacement before any other processing, even inside string literals,
+this is safe. Digraphs, on the other hand, are recognized during tokenization,
+so a simple search-and-replace by the reader is not necessarily a safe
+transformation of the source code. Is it possible to include
+two important digraphs hidden amongst frivolous usage, such that
+
+  - one digraph breaks syntax highlighting in a useful way, like the example
+    demonstrated above, and
+
+  - the other digraph isn't a real digraph, rather being something which breaks
+    the program if digraphs are converted with a simple search-and-replace?
+
+One possible 'false' digraph would embed the characters inside another token,
+such as a string literal, where the compiler leaves them untouched; perhaps a
+multi-part string split across multiple lines? If a naive search-and-replace
+would turn that string into something that breaks the program, the reader
+may avoid converting digraphs before reading the source, despite knowing they
+are there, and thus be tricked into believing their syntax highlighter's lies.
+
+I suppose that leads to the natural question: Do people typically do a
+search-and-replace for digraphs when reading obfuscated code, or do they use a
+more language-aware method?
-- 
2.20.1