# Overview #

This document is a complete specification for the VVhitespace language.

Since Whitespace and VVhitespace are closely related, this document is heavily
copied/adapted from the "Whitespace Tutorial" published by e.c.brady@dur.ac.uk.

# Reference #

The only lexical tokens in the VVhitespace language are

* Tab (ASCII 9),
* Line Feed (ASCII 10),
* Vertical Tab (ASCII 11), and
* Space (ASCII 32).

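As a rough illustration of the token set above, the following Python sketch maps the four code points to the bracketed names used throughout this document. The `tokenize` helper is hypothetical, and skipping every other character is an assumption made for the example; this specification does not state how non-token characters are treated.

```python
# Map the four token code points to the names used in this document.
TOKEN_NAMES = {9: "[Tab]", 10: "[LF]", 11: "[VTab]", 32: "[Space]"}


def tokenize(source: str) -> list[str]:
    """Return the token names found in source, in order.

    Assumption: characters outside the token set are skipped.
    """
    return [TOKEN_NAMES[ord(ch)] for ch in source if ord(ch) in TOKEN_NAMES]


print("".join(tokenize("a \t\n\x0b")))  # [Space][Tab][LF][VTab]
```
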
The language itself is an imperative, stack-based language. Each command
consists of a series of tokens, beginning with the Instruction Modification
Parameter (IMP). The IMPs are listed in the table below.

| IMP          | Meaning            |
| :----------- | :----------------- |
| [Space]      | Stack Manipulation |
| [Tab][Space] | Arithmetic         |
| [Tab][Tab]   | Heap Access        |
| [LF]         | Flow Control       |
| [Tab][LF]    | I/O                |

The virtual machine on which programs run has a stack and a heap, both of
fixed, implementation-defined size and both designed around a 64-bit word. The
programmer may push and pop words from the stack in addition to accessing the
heap on a per-word basis as a permanent store of variables and data structures.

Many commands require numbers or labels as parameters.

Numbers are integers ranging from `-(2^63)` to `+(2^63)-1` and are represented
in sign-magnitude format as a sign, either [Space] for positive or [Tab] for
negative, followed by a series of [Space] (zero) and [Tab] (one) digits
terminated by an [LF]. Note that positive and negative zero are considered
equivalent. As an example, the decimal number +42 (binary 101010) has the
following representation.

    [Space] [Tab][Space][Tab][Space][Tab][Space] [LF]

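The number format described above can be sketched in Python as follows. The function names `encode_number` and `decode_number` are hypothetical, and emitting a single [Space] digit for zero is an assumption of this sketch rather than something the specification requires.

```python
def encode_number(value: int) -> str:
    """Encode value as a sign, binary digits, and an [LF] terminator."""
    if not -(2**63) <= value <= 2**63 - 1:
        raise ValueError("value is outside the documented range")
    sign = "\t" if value < 0 else " "   # [Tab] is negative, [Space] is positive
    digits = format(abs(value), "b")    # magnitude in binary, most significant bit first
    return sign + digits.replace("0", " ").replace("1", "\t") + "\n"


def decode_number(tokens: str) -> int:
    """Decode a sign, digits, and [LF] sequence back into an integer."""
    if not tokens.endswith("\n"):
        raise ValueError("number is not terminated by [LF]")
    sign, digits = tokens[0], tokens[1:-1]
    magnitude = int(digits.replace(" ", "0").replace("\t", "1") or "0", 2)
    return -magnitude if sign == "\t" else magnitude


print(repr(encode_number(42)))  # ' \t \t \t \n'
```
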
Labels consist of an [LF]-terminated list of up to sixteen spaces and tabs. The
program must not begin with a label. There is only one global namespace, so all
labels must be unique. Labels are left-padded with [Space] up to sixteen
characters; the following two labels are interchangeable.

    [Tab][Space][Tab] [LF]
    [Space][Tab][Space][Tab] [LF]

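One way to picture the padding rule is to normalize every label to its full sixteen-character form before comparing. The Python sketch below does this for the two example labels; `normalize_label` and `LABEL_WIDTH` are hypothetical names, and the label is passed without its terminating [LF].

```python
LABEL_WIDTH = 16


def normalize_label(label: str) -> str:
    """Left-pad a label body (without its [LF]) with [Space] to sixteen characters."""
    if len(label) > LABEL_WIDTH or set(label) - {" ", "\t"}:
        raise ValueError("labels hold at most sixteen spaces and tabs")
    return label.rjust(LABEL_WIDTH, " ")


# The two example labels compare equal once padded.
print(normalize_label("\t \t") == normalize_label(" \t \t"))  # True
```
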
## Stack Manipulation (IMP: [Space]) ##

Stack manipulation is one of the more common operations, hence the shortness of
the IMP [Space]. There are four stack instructions.

| Command     | Params | Meaning                             |
| :---------- | :----- | :---------------------------------- |
| [Space]     | Number | Push the number onto the stack      |
| [LF][Space] | ---    | Duplicate the top item on the stack |
| [LF][Tab]   | ---    | Swap the top two items on the stack |
| [LF][LF]    | ---    | Discard the top item on the stack   |

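The four stack instructions can be modelled with a Python list whose last element is the top of the stack. This is only a sketch: the function names are hypothetical, and the fixed stack size and 64-bit word mentioned earlier are not enforced.

```python
def push(stack: list[int], number: int) -> None:
    stack.append(number)


def duplicate(stack: list[int]) -> None:
    stack.append(stack[-1])  # copy the top item


def swap(stack: list[int]) -> None:
    stack[-1], stack[-2] = stack[-2], stack[-1]  # exchange the top two items


def discard(stack: list[int]) -> None:
    stack.pop()  # drop the top item


stack: list[int] = []
push(stack, 1)
push(stack, 2)
swap(stack)       # [2, 1]
duplicate(stack)  # [2, 1, 1]
discard(stack)    # [2, 1]
print(stack)      # [2, 1]
```
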
## Arithmetic (IMP: [Tab][Space]) ##

Arithmetic commands operate on the top two items on the stack and replace them
with the result of the operation. The first item pushed is considered to be
left of the operator.

| Command        | Params | Meaning          |
| :------------- | :----- | :--------------- |
| [Space][Space] | ---    | Addition         |
| [Space][Tab]   | ---    | Subtraction      |
| [Space][LF]    | ---    | Multiplication   |
| [Tab][Space]   | ---    | Integer Division |
| [Tab][Tab]     | ---    | Modulo           |

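The operand order matters for the non-commutative commands. The Python sketch below, using a hypothetical `apply_binary` helper, shows it for subtraction only. Division and modulo are left out on purpose because this document does not state their rounding or sign behaviour, and 64-bit overflow is not modelled.

```python
import operator


def apply_binary(stack: list[int], op) -> None:
    """Replace the top two items with op(left, right)."""
    right = stack.pop()  # pushed last
    left = stack.pop()   # pushed first, so it sits left of the operator
    stack.append(op(left, right))


stack = [7, 2]  # 7 was pushed first, then 2
apply_binary(stack, operator.sub)
print(stack)  # [5], that is 7 - 2
```
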
## Heap Access (IMP: [Tab][Tab]) ##

Heap access commands look at the stack to find the address of items to be
stored or retrieved. To store an item, push the address then the value and run
the store command. To retrieve an item, push the address and run the retrieve
command, which will place the value stored in the location at the top of the
stack.

| Command | Params | Meaning  |
| :------ | :----- | :------- |
| [Space] | ---    | Store    |
| [Tab]   | ---    | Retrieve |

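The store and retrieve sequence can be sketched in Python as below. Several details are assumptions of this sketch and not statements of the specification: the heap is modelled as a dictionary, both commands remove their operands from the stack, and the names `store` and `retrieve` are hypothetical.

```python
def store(stack: list[int], heap: dict[int, int]) -> None:
    value = stack.pop()    # pushed second
    address = stack.pop()  # pushed first
    heap[address] = value


def retrieve(stack: list[int], heap: dict[int, int]) -> None:
    address = stack.pop()
    stack.append(heap[address])  # the stored value ends up on top of the stack


stack: list[int] = []
heap: dict[int, int] = {}

stack.extend([3, 42])  # push address 3, then value 42
store(stack, heap)
stack.append(3)        # push address 3
retrieve(stack, heap)
print(stack)  # [42]
```
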
## Flow Control (IMP: [LF]) ##

Flow control operations are also common. Labels mark subroutines as well as
the targets of conditional and unconditional jumps, by which loops can be
implemented. Programs must be ended by means of [LF][LF][LF] so that the
interpreter can exit cleanly.

| Command              | Params | Meaning                          |
| :------------------- | :----- | :------------------------------- |
| [Space][Space][VTab] | Label  | Mark a location in the program   |
| [Space][Tab]         | Label  | Call a subroutine                |
| [Space][LF]          | Label  | Jump unconditionally to a label  |
| [Tab][Space]         | Label  | Jump to label if TOS is zero     |
| [Tab][Tab]           | Label  | Jump to label if TOS is negative |
| [Tab][LF]            | ---    | Return from subroutine           |
| [LF][LF]             | ---    | End the program                  |

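To show how IMPs, commands, and parameters combine, the Python sketch below assembles a three-command program: push +42, output the number at the TOS (a command from the I/O section), and end the program with [LF][LF][LF]. The constant names are hypothetical, and the result has not been run through a VVhitespace implementation.

```python
SPACE, TAB, LF = " ", "\t", "\n"

PUSH = SPACE + SPACE                    # IMP [Space], command [Space]
PRINT_NUMBER = TAB + LF + SPACE + TAB   # IMP [Tab][LF], command [Space][Tab]
END = LF + LF + LF                      # IMP [LF], command [LF][LF]

# +42: sign [Space], digits 101010, terminator [LF]
PLUS_42 = SPACE + TAB + SPACE + TAB + SPACE + TAB + SPACE + LF

program = PUSH + PLUS_42 + PRINT_NUMBER + END

# Print the program using the bracketed names from this document.
names = {SPACE: "[Space]", TAB: "[Tab]", LF: "[LF]"}
print("".join(names[ch] for ch in program))
```
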
## I/O (IMP: [Tab][LF]) ##

Finally, we need to be able to interact with the user. There are I/O
instructions for reading and writing numbers and individual characters.

The read instructions take the heap address in which to store the result from
the top of the stack.

| Command        | Params | Meaning                                           |
| :------------- | :----- | :------------------------------------------------ |
| [Space][Space] | ---    | Output the character at the TOS                   |
| [Space][Tab]   | ---    | Output the number at the TOS                      |
| [Tab][Space]   | ---    | Read character and store at location given by TOS |
| [Tab][Tab]     | ---    | Read number and store at location given by TOS    |