git.subgeniuskitty.com - vvhitespace/.git/blob

While considering methods for further obfuscating Whitespace code, I realized that the Whitespace language reference didn’t enforce any particular alignment or grouping for code execution. In theory this means one could create a sequence of Whitespace trits which represents different sequences of commands depending on exactly where execution begins.

One implementation of this idea is the creation of a ‘hidden’ label. For example, the command sequence

PUSH 0; DROP; PUSH 46

assembles as

SSSSN SNN SSSTSTTTSN

which can be visually regrouped as

SSSSNSN NSSSTSTTTSN

and contains the MARK label0 command (i.e. NSSSTSTTTSN) used in the next set of examples.

Additionally, since PUSH 0; DROP is effectively a NOP, ‘hijacking’ the code at this location allows one to insert their own integer on the stack in place of the PUSH 46 command, sneakily substituting it as an input to any downstream processing.

I decided to investigate the behavior of specific Whitespace interpreters, discovering that they broke down into two methods for locating labels.

Method 1 Scan from the start of the file for the first occurence of the mark-label bytestring and jump.

Examples:
```
whitespacers/c: (c) meth0dz
```
Method 2 Scan from the start of the file, looking for a mark-label bytestring, but ‘parsing’ one bytestring at a time, and jumping to the first ‘standalone’ mark-label bytestring. Note that this is different than executing the program, particularly when user-input commands are present.

Examples:
```
whitespacers/haskell: (c) 2003 Edwin Brady
whitespacers/ruby: (c) 2003 by Wayne E. Conrad
whitespacers/perl: (c) 2003 Micheal Koelbl
threeifbywhiskey/satan
```

Both of these methods can be broken using valid Whitespace code:

Type A: No ‘standalone’ label exists. This breaks Method 2.

By programmer’s intent, this should print a ! before infinite . lines.
Type B: Hidden label before ‘standalone’ label. This breaks Method 1.

By programmer’s intent, this should print an infinite chain of . lines.

This is the Type A program:

SSSTSSSSTN  | Push +33 (ASCII '!')
NSNSTSTTTSN | JMP > 0101110 (label0)
NSSTTTTN    | MARK: 1111 (label2)
SSSSN       | PUSH 0
SNN         | DROP
SSSTSTTTSN  | Push +46 (ASCII '.')
TNSS        | Output character
SSSTSTSN    | Push +10 (ASCII '\n')
TNSS        | Output character
NSNTTTTN    | JMP > 1111 (label2)

Append this to turn it into the Type B program:

NSSSTSTTTSN | MARK: 0101110 (label0) (2nd time)
NSNTTTTN    | JMP > 1111 (label2)

VVhitespace avoids this ambiguity by marking label definitions with a vertical tab [VTab] immediately before the label.

Old label: NSS  TSTS N
New label: NSSV TSTS N

Since Whitespace ignores [VTab] as a comment character, and since the Whitespace VM is a superset of the VVhitespace VM, all valid VVhitespace programs are also valid Whitespace programs, though the task of locating a suitable Whitespace interpreter is left for the reader.

Quoting from the original Whitespace tutorial which I used as language reference:

The programmer is free to push arbitrary width integers onto the stack.

I have yet to find a Whitespace interpreter which successfully implements that statement.

Since I wanted to implement bitwise logic functions, this broad definition posed a problem. For example, how many 1s should be in the output of the expression NOT(0)? Should the expression NOT(0) == NOT(0) always be true?

VVhitespace sidesteps the problem by declaring all integers to be 64-bits wide.