# Overview #

This document is a complete specification for the VVhitespace language.

Since Whitespace and VVhitespace are closely related, this document is heavily
copied/adapted from the "Whitespace Tutorial" published by e.c.brady@dur.ac.uk.

# Reference #

The only lexical tokens in the VVhitespace language are

  * Tab (ASCII 9),
  * Line Feed (ASCII 10),
  * Vertical Tab (ASCII 11), and
  * Space (ASCII 32).

The language itself is an imperative, stack-based language. Each command
consists of a series of tokens, beginning with the Instruction Modification
Parameter (IMP). These are listed in the table below.

| IMP          | Meaning            |
| :----------- | :----------------- |
| [Space]      | Stack Manipulation |
| [Tab][Space] | Arithmetic         |
| [Tab][Tab]   | Heap Access        |
| [LF]         | Flow Control       |
| [Tab][LF]    | I/O                |

The virtual machine on which programs run has a stack and a heap, both of
fixed, implementation-defined size and both designed around a 64-bit word. The
programmer may push and pop words from the stack in addition to accessing the
heap on a per-word basis as a permanent store of variables and data structures.

Many commands require numbers or labels as parameters.

Numbers are integers ranging from `-(2^63)` to `+(2^63)-1` and are represented
in sign-magnitude format as a sign, either [Space] for positive or [Tab] for
negative, followed by a series of [Space] (zero) and [Tab] (one) digits
terminated by a [LF]. Note that positive and negative zero are considered
equivalent. As an example, the decimal number +42 has the following
representation.

    [Space] [Tab][Space][Tab][Space][Tab][Space] [LF]

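The number format above can be sketched in Python. This is an illustrative helper, not part of any VVhitespace tool: the names `encode_number` and `decode_number` are hypothetical, and emitting a single [Space] digit for zero is an assumption, since the text does not say how many digits zero requires.

```python
SPACE, TAB, LF = " ", "\t", "\n"

def encode_number(n: int) -> str:
    """Encode n as a sign token, binary magnitude digits, and a terminating LF."""
    if not -(2**63) <= n <= 2**63 - 1:
        raise ValueError("number does not fit the documented 64-bit range")
    sign = TAB if n < 0 else SPACE
    # Most significant digit first, matching the +42 example: 42 == 0b101010.
    digits = format(abs(n), "b").replace("0", SPACE).replace("1", TAB)
    return sign + digits + LF

def decode_number(tokens: str) -> int:
    """Inverse of encode_number; negative zero decodes to plain zero."""
    if not tokens.endswith(LF):
        raise ValueError("number must be terminated by LF")
    sign, digits = tokens[0], tokens[1:-1]
    bits = digits.replace(SPACE, "0").replace(TAB, "1")
    magnitude = int(bits, 2) if bits else 0
    return -magnitude if sign == TAB else magnitude
```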
Labels consist of an [LF]-terminated list of up to sixteen spaces and tabs. The
program must not begin with a label. There is only one global namespace so all
labels must be unique. Labels are left-padded with [Space] up to sixteen
characters; the following two labels are interchangeable.

    [Tab][Space][Tab] [LF]
    [Space][Tab][Space][Tab] [LF]

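The left-padding rule can be illustrated with a short Python sketch. The function name `normalize_label` is hypothetical, and rejecting malformed labels with `ValueError` is an assumption about how a tool might behave, not something the text specifies.

```python
SPACE, TAB, LF = " ", "\t", "\n"
MAX_LABEL_LEN = 16

def normalize_label(label: str) -> str:
    """Return the label body left-padded with Space to sixteen characters."""
    body = label[:-1] if label.endswith(LF) else label
    if len(body) > MAX_LABEL_LEN:
        raise ValueError("label is longer than sixteen characters")
    if set(body) - {SPACE, TAB}:
        raise ValueError("label may contain only Space and Tab")
    return body.rjust(MAX_LABEL_LEN, SPACE)

# The two example labels from the text normalize to the same value.
short_form = TAB + SPACE + TAB + LF
long_form = SPACE + TAB + SPACE + TAB + LF
same = normalize_label(short_form) == normalize_label(long_form)
```

Comparing normalized forms is one way a tool could enforce the uniqueness requirement of the single global namespace.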
## Stack Manipulation (IMP: [Space]) ##

Stack manipulation is one of the more common operations, hence the shortness of
the IMP [Space]. There are four stack instructions.

| Command     | Params | Meaning                             |
| :---------- | :----- | :---------------------------------- |
| [Space]     | Number | Push the number onto the stack      |
| [LF][Space] | ---    | Duplicate the top item on the stack |
| [LF][Tab]   | ---    | Swap the top two items on the stack |
| [LF][LF]    | ---    | Discard the top item on the stack   |

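As a rough model of the four stack instructions, the sketch below uses a Python list whose last element is the top of the stack. The function names are hypothetical, and the fixed stack size and 64-bit word limits described earlier are deliberately not modeled.

```python
def push(stack: list, number: int) -> None:
    """Push the number onto the stack."""
    stack.append(number)

def duplicate(stack: list) -> None:
    """Duplicate the top item on the stack."""
    stack.append(stack[-1])

def swap(stack: list) -> None:
    """Swap the top two items on the stack."""
    stack[-1], stack[-2] = stack[-2], stack[-1]

def discard(stack: list) -> None:
    """Discard the top item on the stack."""
    stack.pop()

demo = []
push(demo, 1)
push(demo, 2)
swap(demo)       # top two items exchanged
duplicate(demo)  # top item copied
discard(demo)    # copy removed again
```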
## Arithmetic (IMP: [Tab][Space]) ##

Arithmetic commands operate on the top two items on the stack, and replace them
with the result of the operation. The first item pushed is considered to be
left of the operator. The modulo command will always return a non-negative
result.

| Command        | Params | Meaning          |
| :------------- | :----- | :--------------- |
| [Space][Space] | ---    | Addition         |
| [Space][Tab]   | ---    | Subtraction      |
| [Space][LF]    | ---    | Multiplication   |
| [Tab][Space]   | ---    | Integer Division |
| [Tab][Tab]     | ---    | Modulo           |

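The operand order can be made concrete with a small Python sketch. The function name and the operation keywords are hypothetical. Two behaviours here are assumptions rather than statements from this document: integer division uses Python's floor division (the rounding direction is not specified above), and modulo is computed against the absolute value of the right operand so that the result is never negative. Overflow of the 64-bit word is not modeled.

```python
def arithmetic(stack: list, op: str) -> None:
    """Replace the top two stack items with the result of `op`."""
    right = stack.pop()  # pushed last, so right of the operator
    left = stack.pop()   # pushed first, so left of the operator
    if op == "add":
        result = left + right
    elif op == "sub":
        result = left - right
    elif op == "mul":
        result = left * right
    elif op == "div":
        result = left // right      # assumption: floor division
    elif op == "mod":
        result = left % abs(right)  # assumption: keeps the result non-negative
    else:
        raise ValueError("unknown operation: " + op)
    stack.append(result)

demo = [7, 2]
arithmetic(demo, "sub")  # computes 7 - 2, not 2 - 7
```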
## Heap Access (IMP: [Tab][Tab]) ##

Heap access commands look at the stack to find the address of items to be
stored or retrieved. To store an item, push the address then the value and run
the store command. To retrieve an item, push the address and run the retrieve
command, which will place the value stored at that address on the top of the
stack.

| Command | Params | Meaning  |
| :------ | :----- | :------- |
| [Space] | ---    | Store    |
| [Tab]   | ---    | Retrieve |

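The push order for store and retrieve is easy to get backwards, so here is a minimal Python sketch. The names `store` and `retrieve` and the use of a `dict` for the heap are illustrative choices. Whether the commands consume their operands, and what a retrieve from a never-written address returns, are not stated above: this sketch pops the operands and raises `KeyError` for unwritten addresses.

```python
def store(stack: list, heap: dict) -> None:
    """Pop the value (pushed last) and the address (pushed first), then store."""
    value = stack.pop()
    address = stack.pop()
    heap[address] = value

def retrieve(stack: list, heap: dict) -> None:
    """Pop the address and push the value stored there."""
    address = stack.pop()
    stack.append(heap[address])

stack, heap = [], {}
stack.append(100)  # address
stack.append(42)   # value
store(stack, heap)
stack.append(100)  # address to read back
retrieve(stack, heap)
```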
## Flow Control (IMP: [LF]) ##

Flow control operations are also common. Labels mark subroutines as well as
the targets of conditional and unconditional jumps, by which loops can be
implemented. Programs must end with [LF][LF][LF] so that the interpreter can
exit cleanly.

| Command              | Params | Meaning                          |
| :------------------- | :----- | :------------------------------- |
| [Space][Space][VTab] | Label  | Mark a location in the program   |
| [Space][Tab]         | Label  | Call a subroutine                |
| [Space][LF]          | Label  | Jump unconditionally to a label  |
| [Tab][Space]         | Label  | Jump to label if TOS is zero     |
| [Tab][Tab]           | Label  | Jump to label if TOS is negative |
| [Tab][LF]            | ---    | Return from subroutine           |
| [LF][LF]             | ---    | End the program                  |

## I/O (IMP: [Tab][LF]) ##

Finally, we need to be able to interact with the user. There are I/O
instructions for reading and writing numbers and individual characters.

The read instructions take the heap address in which to store the result from
the top of the stack.

| Command        | Params | Meaning                                           |
| :------------- | :----- | :------------------------------------------------ |
| [Space][Space] | ---    | Output the character at the TOS                   |
| [Space][Tab]   | ---    | Output the number at the TOS                      |
| [Tab][Space]   | ---    | Read character and store at location given by TOS |
| [Tab][Tab]     | ---    | Read number and store at location given by TOS    |
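To show how an IMP, a command, and a parameter combine into a program, the Python sketch below assembles the token sequence for "push 42, output the number, end the program" from the tables in this document. The constant names and the `visible` helper are hypothetical, and the result has not been run against any VVhitespace implementation.

```python
SPACE, TAB, LF, VTAB = " ", "\t", "\n", "\v"

PUSH = SPACE + SPACE                # IMP [Space], command [Space]
OUTPUT_NUMBER = TAB + LF + SPACE + TAB  # IMP [Tab][LF], command [Space][Tab]
END = LF + LF + LF                  # IMP [LF], command [LF][LF]

# +42: sign [Space], digits 101010 as Tab/Space, terminated by [LF].
FORTY_TWO = SPACE + TAB + SPACE + TAB + SPACE + TAB + SPACE + LF

program = PUSH + FORTY_TWO + OUTPUT_NUMBER + END

def visible(tokens: str) -> str:
    """Render tokens in the bracketed notation used by this document."""
    names = {SPACE: "[Space]", TAB: "[Tab]", LF: "[LF]", VTAB: "[VTab]"}
    return "".join(names[token] for token in tokens)
```

Calling `visible(program)` gives a readable listing of the otherwise invisible source, which can help when checking a program by hand.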