# Overview #

This document is a complete specification for the VVhitespace language.

Since Whitespace and VVhitespace are closely related, this document is heavily
copied/adapted from the "Whitespace Tutorial" published by e.c.brady@dur.ac.uk.

# Reference #

The only lexical tokens in the VVhitespace language are

* Tab (ASCII 9),
* Line Feed (ASCII 10),
* Vertical Tab (ASCII 11), and
* Space (ASCII 32).

The language itself is an imperative, stack-based language. Each command
consists of a series of tokens, beginning with the Instruction Modification
Parameter (IMP). The IMPs are listed in the table below.

| IMP          | Meaning            |
| :----------- | :----------------- |
| [Space]      | Stack Manipulation |
| [Tab][Space] | Arithmetic         |
| [Tab][Tab]   | Heap access        |
| [LF]         | Flow Control       |
| [Tab][LF]    | I/O                |

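No IMP in the table is a prefix of another, so the category of a command can be decided by reading at most two tokens. The following Python sketch illustrates this; the token names (`"Space"`, `"Tab"`, `"LF"`) and the `read_imp` helper are hypothetical, chosen here only to mirror the bracket notation of this document.

```python
# Hypothetical token names mirroring the bracket notation of this document.
IMPS = {
    ("Space",): "Stack Manipulation",
    ("Tab", "Space"): "Arithmetic",
    ("Tab", "Tab"): "Heap access",
    ("LF",): "Flow Control",
    ("Tab", "LF"): "I/O",
}


def read_imp(tokens):
    """Return (category, tokens consumed) for the IMP that starts `tokens`."""
    for length in (1, 2):
        prefix = tuple(tokens[:length])
        if prefix in IMPS:
            return IMPS[prefix], length
    raise ValueError(f"no IMP matches {list(tokens[:2])!r}")
```

For instance, `read_imp(["Tab", "LF", "Space", "Tab"])` identifies an I/O command and reports that two tokens were consumed, leaving `["Space", "Tab"]` as the command itself.
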
The virtual machine on which programs run has a stack and a heap, both of
fixed, implementation-defined size and both designed around a 64-bit word. The
programmer may push words onto and pop words from the stack, and may access
the heap on a per-word basis as a permanent store of variables and data
structures.

Many commands require numbers or labels as parameters.

Numbers are integers ranging from `-(2^63)` to `+(2^63)-1` and are represented
in sign-magnitude format as a sign, either [Space] for positive or [Tab] for
negative, followed by a series of [Space] (zero) and [Tab] (one) digits
terminated by an [LF]. Note that positive and negative zero are considered
equivalent. As an example, the decimal number +42 has the following
representation.

    [Space] [Tab][Space][Tab][Space][Tab][Space] [LF]

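The encoding can be sketched in Python as follows. The helpers `encode_number` and `decode_number` are hypothetical and work on the bracket notation used in this document rather than on raw characters. Encoding zero as a single [Space] digit is an assumption, since the text does not say whether an empty digit list is allowed.

```python
import re


def encode_number(value):
    """Encode an integer as sign, binary digits, and a terminating [LF]."""
    if not -(2**63) <= value <= 2**63 - 1:
        raise ValueError("value does not fit the documented range")
    sign = "[Space]" if value >= 0 else "[Tab]"
    digits = format(abs(value), "b")  # magnitude in binary, e.g. "101010"
    bits = "".join("[Tab]" if d == "1" else "[Space]" for d in digits)
    return sign + bits + "[LF]"


def decode_number(text):
    """Decode bracket notation back to an integer; spacing is ignored."""
    tokens = re.findall(r"\[(Space|Tab|LF)\]", text)
    if len(tokens) < 2 or tokens[-1] != "LF" or "LF" in tokens[:-1]:
        raise ValueError("expected a sign, digits, and a terminating [LF]")
    sign = -1 if tokens[0] == "Tab" else 1
    magnitude = 0
    for token in tokens[1:-1]:
        magnitude = magnitude * 2 + (1 if token == "Tab" else 0)
    return sign * magnitude
```

Here `encode_number(42)` reproduces the representation shown above, and `decode_number("[Tab][Space][LF]")` and `decode_number("[Space][Space][LF]")` both give 0, matching the rule that positive and negative zero are equivalent.
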
Labels consist of an [LF]-terminated list of up to sixteen spaces and tabs. The
program must not begin with a label. There is only one global namespace, so all
labels must be unique. Labels are left-padded with [Space] up to sixteen
characters; the following two labels are interchangeable.

    [Tab][Space][Tab] [LF]
    [Space][Tab][Space][Tab] [LF]

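The padding rule can be illustrated with a short Python sketch. The `normalize_label` helper and the token names are hypothetical; the label is given without its terminating [LF].

```python
LABEL_WIDTH = 16  # maximum label length stated in this document


def normalize_label(tokens):
    """Left-pad a label (a list of "Space"/"Tab" names) to sixteen tokens."""
    if len(tokens) > LABEL_WIDTH:
        raise ValueError("a label holds at most sixteen spaces and tabs")
    if any(token not in ("Space", "Tab") for token in tokens):
        raise ValueError("a label contains only spaces and tabs")
    padding = ["Space"] * (LABEL_WIDTH - len(tokens))
    return padding + list(tokens)
```

After normalization, `["Tab", "Space", "Tab"]` and `["Space", "Tab", "Space", "Tab"]` compare equal, which is why the two labels above are interchangeable.
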
## Stack Manipulation (IMP: [Space]) ##

Stack manipulation is one of the more common operations, hence the shortness of
the IMP [Space]. There are four stack instructions.

| Command     | Params | Meaning                             |
| :---------- | :----- | :---------------------------------- |
| [Space]     | Number | Push the number onto the stack      |
| [LF][Space] | ---    | Duplicate the top item on the stack |
| [LF][Tab]   | ---    | Swap the top two items on the stack |
| [LF][LF]    | ---    | Discard the top item on the stack   |

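A minimal Python sketch of the four stack instructions, modelling the stack as a list whose last element is the top. The function names are illustrative only, and neither the fixed stack size nor the 64-bit word limit is modelled.

```python
def push(stack, number):
    stack.append(number)


def duplicate(stack):
    stack.append(stack[-1])


def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def discard(stack):
    stack.pop()


def demo():
    """Push 1 and 2, swap them, duplicate the top, then discard it."""
    stack = []
    push(stack, 1)
    push(stack, 2)
    swap(stack)       # top of stack is now 1
    duplicate(stack)  # two copies of 1 on top
    discard(stack)    # back to one copy
    return stack
```
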
## Arithmetic (IMP: [Tab][Space]) ##

Arithmetic commands operate on the top two items on the stack, and replace them
with the result of the operation. The first item pushed is considered to be
left of the operator.

| Command        | Params | Meaning          |
| :------------- | :----- | :--------------- |
| [Space][Space] | ---    | Addition         |
| [Space][Tab]   | ---    | Subtraction      |
| [Space][LF]    | ---    | Multiplication   |
| [Tab][Space]   | ---    | Integer Division |
| [Tab][Tab]     | ---    | Modulo           |

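The operand order matters for the non-commutative operations. The Python sketch below, with a hypothetical `binary_op` helper, shows it for subtraction. Division and modulo are left out because the text does not specify how they round for negative operands, and overflow of the 64-bit word is not modelled.

```python
import operator


def binary_op(stack, op):
    """Replace the top two stack items with op(first pushed, last pushed)."""
    right = stack.pop()  # pushed last
    left = stack.pop()   # pushed first, so it sits left of the operator
    stack.append(op(left, right))


def demo():
    """Push 7, then 3, then subtract: the result is 7 - 3."""
    stack = [7, 3]
    binary_op(stack, operator.sub)
    return stack
```
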
## Heap Access (IMP: [Tab][Tab]) ##

Heap access commands look at the stack to find the address of items to be
stored or retrieved. To store an item, push the address then the value and run
the store command. To retrieve an item, push the address and run the retrieve
command, which will place the value stored in the location at the top of the
stack.

| Command | Params | Meaning  |
| :------ | :----- | :------- |
| [Space] | ---    | Store    |
| [Tab]   | ---    | Retrieve |

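The store and retrieve sequences can be sketched in Python as follows. Several points here are assumptions rather than statements of the specification: the heap is modelled as a dictionary instead of a fixed-size store, both commands are assumed to remove their operands from the stack, and the function names are hypothetical.

```python
def store(stack, heap):
    """Pop a value, then an address, and write the value at that address."""
    value = stack.pop()    # pushed second
    address = stack.pop()  # pushed first
    heap[address] = value


def retrieve(stack, heap):
    """Pop an address and push the value stored there (assumed behaviour)."""
    address = stack.pop()
    stack.append(heap[address])


def demo():
    """Store 99 at address 5, then read it back onto the stack."""
    stack, heap = [], {}
    stack.extend([5, 99])  # address first, then value
    store(stack, heap)
    stack.append(5)
    retrieve(stack, heap)
    return stack, heap
```
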
## Flow Control (IMP: [LF]) ##

Flow control operations are also common. Labels mark subroutines as well as the
targets of conditional and unconditional jumps, by which loops can be
implemented. Every program must end with [LF][LF][LF] so that the interpreter
can exit cleanly.

| Command              | Params | Meaning                          |
| :------------------- | :----- | :------------------------------- |
| [Space][Space][VTab] | Label  | Mark a location in the program   |
| [Space][Tab]         | Label  | Call a subroutine                |
| [Space][LF]          | Label  | Jump unconditionally to a label  |
| [Tab][Space]         | Label  | Jump to label if TOS is zero     |
| [Tab][Tab]           | Label  | Jump to label if TOS is negative |
| [Tab][LF]            | ---    | Return from subroutine           |
| [LF][LF]             | ---    | End the program                  |

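How marks, calls, jumps, and returns interact can be sketched with a toy Python interpreter over already-parsed commands. The `(opcode, argument)` representation and the opcode names are hypothetical. The conditional jumps are omitted because the text above does not say whether they remove the tested value from the stack.

```python
def run(program):
    """Run (opcode, argument) pairs; return the non-flow opcodes visited."""
    labels = {}
    for index, (opcode, argument) in enumerate(program):
        if opcode == "mark":
            if argument in labels:
                raise ValueError(f"duplicate label {argument!r}")
            labels[argument] = index
    calls = []  # return addresses pushed by "call"
    trace = []
    pc = 0
    while True:
        opcode, argument = program[pc]
        pc += 1
        if opcode == "mark":
            continue
        elif opcode == "call":
            calls.append(pc)
            pc = labels[argument]
        elif opcode == "jump":
            pc = labels[argument]
        elif opcode == "return":
            pc = calls.pop()
        elif opcode == "end":
            return trace
        else:
            trace.append(opcode)  # stand-in for any other command


def demo():
    program = [
        ("call", "sub"),
        ("after", None),
        ("end", None),
        ("mark", "sub"),
        ("inside", None),
        ("return", None),
    ]
    return run(program)
```

In `demo`, the call transfers control to the marked subroutine, and the return resumes at the command following the call, so the subroutine body is visited before the command after the call. A program that lacks an `end` command runs off the end of the list in this sketch, which is one way to see why the explicit end command is required.
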
## I/O (IMP: [Tab][LF]) ##

Finally, we need to be able to interact with the user. There are I/O
instructions for reading and writing numbers and individual characters.

The read instructions take from the top of the stack the heap address at which
to store the result.

| Command        | Params | Meaning                                           |
| :------------- | :----- | :------------------------------------------------ |
| [Space][Space] | ---    | Output the character at the TOS                   |
| [Space][Tab]   | ---    | Output the number at the TOS                      |
| [Tab][Space]   | ---    | Read character and store at location given by TOS |
| [Tab][Tab]     | ---    | Read number and store at location given by TOS    |
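
Putting the tables together, the Python sketch below assembles a three-command program that pushes a number, outputs it, and ends. The helper names are hypothetical, and the result is derived only from the tables in this document; it has not been checked against any particular interpreter.

```python
SPACE, TAB, LF = "[Space]", "[Tab]", "[LF]"


def number(value):
    """Sign, binary magnitude, and terminating [LF], in bracket notation."""
    sign = SPACE if value >= 0 else TAB
    bits = "".join(TAB if d == "1" else SPACE for d in format(abs(value), "b"))
    return sign + bits + LF


def assemble_print(value):
    """Return the program 'push value; output number; end' as one string."""
    commands = [
        SPACE + SPACE + number(value),  # Stack Manipulation: push value
        TAB + LF + SPACE + TAB,         # I/O: output the number at the TOS
        LF + LF + LF,                   # Flow Control: end the program
    ]
    return "".join(commands)


def to_source(notation):
    """Convert bracket notation to the actual whitespace characters."""
    return (notation.replace(SPACE, " ")
                    .replace(TAB, "\t")
                    .replace(LF, "\n"))
```

Calling `to_source(assemble_print(42))` yields a string made up solely of spaces, tabs, and line feeds, which could be written to a file as program text.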