Commit | Line | Data |
---|---|---|
15637ed4 RG |
1 | .Go 8 "INTERNAL" |
2 | .PP | |
3 | You don't need to know the material in this section to use \*E. | |
4 | You only need it if you intend to modify \*E. | |
5 | .PP | |
6 | You should also check out the CFLAGS, TERMCAP, ENVIRONMENT VARIABLES, | |
7 | VERSIONS, and QUESTIONS & ANSWERS sections of this manual. | |
8 | .NH 2 | |
9 | The temporary file | |
10 | .PP | |
11 | The temporary file is divided into blocks of 1024 bytes each. | |
12 | The functions in "blk.c" maintain a cache of the five most recently used blocks, | |
13 | to minimize file I/O. | |
14 | .PP | |
15 | When \*E starts up, the file is copied into the temporary file | |
16 | by the function \fBtmpstart()\fR in "tmp.c". | |
17 | Small amounts of extra space are inserted into the temporary file to | |
18 | ensure that no text lines cross block boundaries. | |
19 | This speeds up processing and simplifies storage management. | |
20 | The extra space is filled with NUL characters. | |
21 | The input file must not contain any NULs, to avoid confusion. | |
22 | This also limits lines to a length of 1023 characters or less. | |
23 | .PP | |
24 | The data blocks aren't necessarily stored in sequence. | |
25 | For example, it is entirely possible that the data block containing | |
26 | the first lines of text will be stored after the block containing the | |
27 | last lines of text. | |
28 | .PP | |
29 | In RAM, \*E maintains two lists: one that describes the "proper" | |
30 | order of the disk blocks, and another that records the line number of | |
31 | the last line in each block. | |
32 | When \*E needs to fetch a given line of text, it uses these tables | |
33 | to locate the data block which contains that line. | |
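The lookup described above might be sketched roughly as follows. This is only an illustration: the names blktab, lastline, blkorder, and find_block() are hypothetical, not taken from the \*E source, and the real code may well search the tables differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-RAM tables (names are illustrative, not from the source).
 * lastline[i] is the line number of the last line in the i-th block, counted
 * in "proper" text order; blkorder[i] says which physical block of the
 * temporary file holds that i-th block. */
struct blktab {
    size_t nblks;
    const long *lastline;
    const unsigned short *blkorder;
};

/* Return the physical block number that holds line `lnum` (1-based),
 * or -1 if the line lies past the end of the text. */
long find_block(const struct blktab *t, long lnum)
{
    size_t i;

    assert(t != NULL && lnum >= 1);
    for (i = 0; i < t->nblks; i++) {
        /* the first block whose last line is >= lnum contains lnum */
        if (lnum <= t->lastline[i])
            return (long)t->blkorder[i];
    }
    return -1;
}
```

A linear scan is used here for clarity; since the last-line table is sorted, a binary search would also work.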
34 | .PP | |
35 | Before each change is made to the file, these lists are copied. | |
36 | The copies can be used to "undo" the change. | |
37 | Also, the first list | |
38 | -- the one that lists the data blocks in their proper order -- | |
39 | is written to the first data block of the temp file. | |
40 | This list can be used during file recovery. | |
41 | .PP | |
42 | When blocks are altered, they are rewritten to a \fIdifferent\fR block in the file, | |
43 | and the order list is updated accordingly. | |
44 | The original block is left intact, so that "undo" can be performed easily. | |
45 | \*E will eventually reclaim the original block, when it is no longer needed. | |
46 | .NH 2 | |
47 | Implementation of Editing | |
48 | .PP | |
49 | There are three basic operations which affect text: | |
50 | .ID | |
51 | \(bu delete text - delete(from, to) | |
52 | \(bu add text - add(at, text) | |
53 | \(bu yank text - cut(from, to) | |
54 | .DE | |
55 | .PP | |
56 | To yank text, all text between two text positions is copied into a cut buffer. | |
57 | The original text is not changed. | |
58 | To copy the text into a cut buffer, | |
59 | you need only remember which physical blocks contain the cut text, | |
60 | the offset into the first block of the start of the cut, | |
61 | the offset into the last block of the end of the cut, | |
62 | and what kind of cut it was. | |
63 | (Cuts may be either character cuts or line cuts; | |
64 | the kind of a cut affects the way it is later "put".) | |
65 | Yanking is implemented in the function \fBcut()\fR, | |
66 | and pasting is implemented in the function \fBpaste()\fR. | |
67 | These functions are defined in "cut.c". | |
68 | .PP | |
69 | To delete text, you must modify the first and last blocks, and | |
70 | remove any reference to the intervening blocks in the header's list. | |
71 | The text to be deleted is specified by two marks. | |
72 | This is implemented in the function \fBdelete()\fR. | |
73 | .PP | |
74 | To add text, you must specify | |
75 | the text to insert (as a NUL-terminated string) | |
76 | and the place to insert it (as a mark). | |
77 | The block into which the text is to be inserted may need to be split into | |
78 | as many as four blocks, with new intervening blocks needed as well... | |
79 | or it could be as simple as modifying a single block. | |
80 | This is implemented in the function \fBadd()\fR. | |
81 | .PP | |
82 | There is also a \fBchange()\fR function, | |
83 | which generally just calls delete() and add(). | |
84 | For the special case where a single character is being replaced by another | |
85 | single character, though, change() will optimize things somewhat. | |
86 | The add(), delete(), and change() functions are all defined in "modify.c". | |
87 | .PP | |
88 | The \fBinput()\fR function reads text from a user and inserts it into the file. | |
89 | It makes heavy use of the add(), delete(), and change() functions. | |
90 | It inserts characters one at a time, as they are typed. | |
91 | .PP | |
92 | When text is modified, an internal file-revision counter, called \fBchanges\fR, | |
93 | is incremented. | |
94 | This counter is used to detect when certain caches are out of date. | |
95 | (The "changes" counter is also incremented when we switch to a different file, | |
96 | and also in one or two similar situations -- all related to invalidating caches.) | |
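One way a cache could use the "changes" counter is to remember the counter's value when the cached data was computed, and to recompute whenever the two differ. The sketch below assumes exactly that; struct cache, slow_compute(), and cached_value() are hypothetical names for the example, and only the counter name comes from the text.

```c
#include <assert.h>
#include <stddef.h>

static long changes;      /* the file-revision counter named in the text */
static int  recomputed;   /* counts slow-path runs, for demonstration only */

/* Hypothetical cache entry: `stamp` records the value of `changes`
 * at the time `value` was computed. */
struct cache {
    long stamp;
    int  value;
};

/* Stand-in for some expensive computation over the text. */
static int slow_compute(void)
{
    recomputed++;
    return 42;
}

/* Return the cached value, recomputing it only if the text has
 * changed since the value was stored. */
int cached_value(struct cache *c)
{
    assert(c != NULL);
    if (c->stamp != changes) {
        c->value = slow_compute();
        c->stamp = changes;
    }
    return c->value;
}
```

Initializing stamp to a value the counter never takes (such as -1) forces the first call to compute.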
97 | .NH 2 | |
98 | Marks and the Cursor | |
99 | .PP | |
100 | Marks are places within the text. | |
101 | They are represented internally as 32-bit values which are split | |
102 | into two bitfields: | |
103 | a line number and a character index. | |
104 | Line numbers start with 1, and character indexes start with 0. | |
105 | Lines can be up to 1023 characters long, so the character index is 10 bits | |
106 | wide and the line number fills the remaining 22 bits in the long int. | |
107 | .PP | |
108 | Since line numbers start with 1, | |
109 | it is impossible for a valid mark to have a value of 0L. | |
110 | 0L is therefore used to represent unset marks. | |
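The packing might look something like the sketch below. It assumes the line number occupies the upper 22 bits and the character index the lower 10; the text does not say which field is which. The names make_mark(), mark_line(), mark_idx(), and MARK_UNSET are made up for the example, and an unsigned long is used so that the shifts are well defined.

```c
#include <assert.h>

#define IDX_BITS   10
#define IDX_MASK   ((1UL << IDX_BITS) - 1)   /* 0x3ff: indexes 0..1023 */
#define MARK_UNSET 0UL                       /* no valid mark has line 0 */

/* Combine a line number (starting at 1) and a character index
 * (starting at 0) into a single mark value. */
static unsigned long make_mark(unsigned long line, unsigned long idx)
{
    assert(line >= 1 && idx <= IDX_MASK);
    return (line << IDX_BITS) | idx;
}

/* Extract the line number from a mark. */
static unsigned long mark_line(unsigned long m)
{
    return m >> IDX_BITS;
}

/* Extract the character index from a mark. */
static unsigned long mark_idx(unsigned long m)
{
    return m & IDX_MASK;
}
```

With this layout, two marks can be compared as plain integers to find out which comes first in the text.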
111 | .PP | |
112 | When you do the "delete text" change, any marks that were part of | |
113 | the deleted text are unset, and any marks that were set to points | |
114 | after it are adjusted. | |
115 | Marks are adjusted similarly after new text is inserted. | |
116 | .PP | |
117 | The cursor is represented as a mark. | |
118 | .NH 2 | |
119 | Colon Command Interpretation | |
120 | .PP | |
121 | Colon commands are parsed, and the command name is looked up in an array | |
122 | of structures which also contain a pointer to the function that implements | |
123 | the command, and a description of the arguments that the command can take. | |
124 | If the command is recognized and its arguments are legal, | |
125 | then the function is called. | |
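A table-driven dispatcher of this general shape is sketched below. Everything in it is an assumption made for illustration: the structure layout, the ARG_* flags, the two sample commands, and run_colon() are not taken from the \*E source, and the sketch matches full command names only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical argument-description flags; the text does not say how
 * the real table encodes the arguments a command can take. */
#define ARG_NONE  0x00u
#define ARG_RANGE 0x01u   /* command accepts a line range */
#define ARG_FILE  0x02u   /* command accepts a file name */

struct cmdentry {
    const char *name;                 /* command name */
    int (*fn)(const char *args);      /* function that implements it */
    unsigned argflags;                /* kinds of arguments allowed */
};

/* Dummy handlers that just return a distinguishing value. */
static int do_write(const char *args) { (void)args; return 1; }
static int do_quit(const char *args)  { (void)args; return 2; }

static const struct cmdentry cmdtab[] = {
    { "write", do_write, ARG_RANGE | ARG_FILE },
    { "quit",  do_quit,  ARG_NONE },
};

/* Look up `name`; call its function only if the command is known and
 * every kind of argument in `given` is allowed.  Returns the handler's
 * result, or -1 if the command is unknown or its arguments are illegal. */
int run_colon(const char *name, const char *args, unsigned given)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof cmdtab / sizeof cmdtab[0]; i++) {
        if (strcmp(cmdtab[i].name, name) != 0)
            continue;
        if (given & ~cmdtab[i].argflags)
            return -1;    /* an argument kind this command can't take */
        return cmdtab[i].fn(args);
    }
    return -1;
}
```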
126 | .PP | |
127 | Each function performs its task; this may cause the cursor to be | |
128 | moved to a different line, or whatever. | |
129 | .NH 2 | |
130 | Screen Control | |
131 | .PP | |
132 | In input mode or visual command mode, | |
133 | the screen is redrawn by a function called \fBredraw()\fR. | |
134 | This function is called in the getkey() function before each keystroke is | |
135 | read in, if necessary. | |
136 | .PP | |
78ed81a3 | 137 | Redraw() writes to the screen via a package which looks like the "curses" |
15637ed4 RG |
138 | library, but isn't. |
139 | It is actually much simpler. | |
140 | Most curses operations are implemented as macros which copy characters | |
141 | into a large I/O buffer, which is then written with a single large | |
142 | write() call as part of the refresh() operation. | |
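The buffering idea can be sketched as follows, assuming a POSIX system. The names outbuf, outlen, sk_addch(), and sk_refresh() are hypothetical and chosen so they don't look like the real macros; the buffer size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define OUTSIZE 4096              /* arbitrary size for this sketch */
static char   outbuf[OUTSIZE];    /* pending screen output */
static size_t outlen;             /* number of bytes pending */

/* Flush everything buffered so far with a single write() call.
 * Sketch only: short writes and errors are ignored. */
static void sk_refresh(int fd)
{
    assert(outlen <= OUTSIZE);
    if (outlen > 0) {
        ssize_t n = write(fd, outbuf, outlen);
        (void)n;
        outlen = 0;
    }
}

/* Append one character to the buffer, flushing first if it is full. */
#define sk_addch(fd, ch) \
    do { \
        if (outlen == OUTSIZE) \
            sk_refresh(fd); \
        outbuf[outlen++] = (char)(ch); \
    } while (0)
```

Because adding a character is just a store into an array, many screen operations cost almost nothing until the flush.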
143 | .PP | |
144 | (Note: Under MS-DOS, the pseudo-curses macros check to see whether you're | |
145 | using the pcbios interface. If you are, then the macros call functions | |
146 | in "pc.c" to implement screen updates.) | |
147 | .PP | |
148 | The low-level functions which modify text (namely add(), delete(), and change()) | |
149 | supply redraw() with clues to help redraw() decide which parts of the | |
150 | screen must be redrawn. | |
151 | The clues are given via a function called \fBredrawrange()\fR. | |
152 | .PP | |
153 | Most EX commands use the pseudo-curses package to perform their output, | |
154 | like redraw(). | |
155 | .PP | |
156 | There is also a function called \fBmsg()\fR which uses the same syntax as printf(). | |
157 | In EX mode, msg() writes the message to the screen and automatically adds a | |
158 | newline. | |
159 | In VI mode, msg() writes the message on the bottom line of the screen | |
160 | with the "standout" character attribute turned on. | |
161 | .NH 2 | |
162 | Options | |
163 | .PP | |
164 | For each option available through the ":set" command, | |
165 | \*E contains a character array variable, named "o_\fIoption\fR". | |
166 | For example, the "lines" option uses a variable called "o_lines". | |
167 | .PP | |
168 | For boolean options, the array has a dimension of 1. | |
169 | The first (and only) character of the array will be NUL if the | |
170 | variable's value is FALSE, and some other value if it is TRUE. | |
171 | To check the value, just dereference the array name, | |
172 | as in "if (*o_autoindent)". | |
173 | .PP | |
174 | For number options, the array has a dimension of 3. | |
175 | The array is treated as three unsigned one-byte integers. | |
176 | The first byte is the current value of the option. | |
177 | The second and third bytes are the lower and upper bounds of that | |
178 | option. | |
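The boolean and number layouts described above might be declared and used as sketched below. The initial values are made up for the example, the arrays are declared unsigned char to make the "unsigned one-byte integers" reading explicit, and the OPT_* macros and set_numopt() are hypothetical helpers, not functions from "opts.c".

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative declarations; the values are invented for this sketch. */
unsigned char o_autoindent[1] = { 0 };       /* boolean: NUL means FALSE */
unsigned char o_lines[3] = { 25, 2, 50 };    /* value, lower, upper bound */

#define OPT_VALUE(o) ((o)[0])   /* current value */
#define OPT_LOWER(o) ((o)[1])   /* lower bound */
#define OPT_UPPER(o) ((o)[2])   /* upper bound */

/* Hypothetical helper: store a new value only if it lies within the
 * option's bounds.  Returns 1 on success, 0 if the value is rejected. */
int set_numopt(unsigned char *opt, int value)
{
    assert(opt != NULL);
    if (value < OPT_LOWER(opt) || value > OPT_UPPER(opt))
        return 0;
    opt[0] = (unsigned char)value;
    return 1;
}
```

Keeping the bounds next to the value lets one routine validate every number option without a separate table of limits.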
179 | .PP | |
180 | For string options, the array usually has a dimension of about 60, | |
181 | but this may vary. | |
182 | The option's value is stored as a normal NUL-terminated string. | |
183 | .PP | |
184 | All of the options are declared in "opts.c". | |
185 | Most are initialized to their default values; | |
186 | the \fBinitopts()\fR function is used to perform any environment-specific | |
187 | initialization. | |
188 | .NH 2 | |
189 | Portability | |
190 | .PP | |
191 | To improve portability, \*E collects as many of the system-dependent | |
192 | definitions as possible into the "config.h" file. | |
193 | This file begins with some preprocessor instructions which attempt to | |
194 | determine which compiler and operating system you have. | |
195 | After that, it conditionally defines some macros and constants for your system. | |
196 | .PP | |
197 | One of the more significant macros is \fBttyread()\fR. | |
198 | This macro is used to read raw characters from the keyboard, possibly | |
199 | with timeout. | |
200 | For UNIX systems, this basically reads bytes from stdin. | |
201 | For MSDOS, TOS, and OS9, ttyread() is a function defined in curses.c. | |
202 | There is also a \fBttywrite()\fR macro. | |
203 | .PP | |
204 | The \fBtread()\fR and \fBtwrite()\fR macros are versions of read() and write() that are | |
205 | used for text files. | |
206 | On UNIX systems, these are equivalent to read() and write(). | |
207 | On MS-DOS, these are also equivalent to read() and write(), | |
208 | since DOS libraries are generally clever enough to convert newline characters | |
209 | automatically. | |
210 | For Atari TOS, though, the MWC library is too stupid to do this, | |
211 | so we had to do the conversion explicitly. | |
212 | .PP | |
213 | Other macros may substitute index() for strchr(), or bcopy() for memcpy(), | |
214 | or map the "void" data type to "int", or whatever. | |
215 | .PP | |
216 | The file "tinytcap.c" contains a set of functions that emulate the termcap | |
217 | library for a small set of terminal types. | |
218 | The terminal-specific info is hard-coded into this file. | |
219 | It is only used for systems that don't support real termcap. | |
220 | Another alternative for screen control can be seen in | |
221 | the "curses.h" and "pc.c" files. | |
222 | Here, macros named VOIDBIOS and CHECKBIOS are used to indirectly call | |
223 | functions which perform low-level screen manipulation via BIOS calls. | |
224 | .PP | |
225 | The stat() function must be able to come up with UNIX-style major/minor/inode | |
226 | numbers that uniquely identify a file or directory. | |
227 | .PP | |
228 | Please try to keep your changes localized, | |
229 | and wrap them in #if/#endif pairs, | |
230 | so that \*E can still be compiled on other systems. | |
231 | And PLEASE let me know about it, so I can incorporate your changes into | |
232 | my latest-and-greatest version of \*E. |