.Go 8 "INTERNAL"
.PP
You don't need to know the material in this section to use \*E.
You only need it if you intend to modify \*E.
.PP
You should also check out the CFLAGS, TERMCAP, ENVIRONMENT VARIABLES,
VERSIONS, and QUESTIONS & ANSWERS sections of this manual.
.NH 2
The temporary file
.PP
The temporary file is divided into blocks of 1024 bytes each.
The functions in "blk.c" maintain a cache of the five most recently used blocks,
to minimize file I/O.
.PP
When \*E starts up, the file is copied into the temporary file
by the function \fBtmpstart()\fR in "tmp.c".
Small amounts of extra space are inserted into the temporary file to
ensure that no text lines cross block boundaries.
This speeds up processing and simplifies storage management.
The extra space is filled with NUL characters.
The input file must not contain any NULs, to avoid confusion.
The block layout also limits lines to at most 1023 characters.
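.PP
As a rough illustration of that packing rule, the sketch below appends lines to
1024-byte blocks and moves to a fresh block whenever the next line would not fit.
It is not the code in "tmp.c": the name pack_line() and its arguments are invented here,
and it assumes that the 1023-character limit counts the newline and leaves at least
one NUL in every block.

```c
#include <assert.h>
#include <string.h>

#define BLKSIZE 1024    /* block size stated in the text */

/* Append one newline-terminated line to zero-filled block storage.
 * blocks  : array of nblocks buffers of BLKSIZE bytes each, initially all NUL
 * cur/used: index of the block being filled and the bytes already used in it
 * Returns 0 on success, -1 if the line is too long or no block is left.
 * (Hypothetical helper, for illustration only.) */
static int pack_line(char blocks[][BLKSIZE], int nblocks,
                     int *cur, int *used, const char *line, int len)
{
    assert(len > 0 && line[len - 1] == '\n');

    if (len > BLKSIZE - 1)
        return -1;              /* longer than 1023 bytes: cannot be stored */

    if (*used + len > BLKSIZE - 1) {
        /* The line would cross the boundary, so start a new block.
         * The unused tail of the old block simply stays NUL. */
        (*cur)++;
        *used = 0;
    }
    if (*cur >= nblocks)
        return -1;

    memcpy(blocks[*cur] + *used, line, (size_t)len);
    *used += len;
    return 0;
}
```

.PP
Packing two 700-byte lines this way puts the second line at the start of the second block
and leaves the last 324 bytes of the first block as NUL padding.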
.PP
The data blocks aren't necessarily stored in sequence.
For example, it is entirely possible that the data block containing
the first lines of text will be stored after the block containing the
last lines of text.
.PP
In RAM, \*E maintains two lists: one that describes the "proper"
order of the disk blocks, and another that records the line number of
the last line in each block.
When \*E needs to fetch a given line of text, it uses these tables
to locate the data block which contains that line.
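.PP
The lookup can be pictured roughly as follows.
This is only a sketch: the structure and function names are made up for the example,
and the binary search is my own choice (the real code may well scan the table linearly).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory tables, one entry per logical block. */
struct blktables {
    int nblocks;
    const unsigned short *order;  /* order[i]: physical block number of logical block i */
    const long *lastline;         /* lastline[i]: number of the last line in logical block i */
};

/* Return the physical block that holds line lnum (1-based), or -1 if
 * there is no such line.  *first receives the number of the first line
 * stored in that block. */
static int find_block(const struct blktables *t, long lnum, long *first)
{
    int lo, hi;

    assert(t != NULL && first != NULL);
    if (t->nblocks == 0 || lnum < 1 || lnum > t->lastline[t->nblocks - 1])
        return -1;

    /* Find the first logical block whose last line is >= lnum. */
    lo = 0;
    hi = t->nblocks - 1;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (t->lastline[mid] < lnum)
            lo = mid + 1;
        else
            hi = mid;
    }
    *first = (lo == 0) ? 1 : t->lastline[lo - 1] + 1;
    return t->order[lo];
}
```

.PP
With order = {3, 1, 2} and lastline = {10, 25, 40}, line 11 is found in physical block 1,
whose first line is line 11.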
.PP
Before each change is made to the file, these lists are copied.
The copies can be used to "undo" the change.
Also, the first list
-- the one that lists the data blocks in their proper order --
is written to the first data block of the temp file.
This list can be used during file recovery.
.PP
When blocks are altered, they are rewritten to a \fIdifferent\fR block in the file,
and the order list is updated accordingly.
The original block is left intact, so that "undo" can be performed easily.
\*E will eventually reclaim the original block, when it is no longer needed.
.NH 2
Implementation of Editing
.PP
There are three basic operations which affect text:
.ID
\(bu delete text - delete(from, to)
\(bu add text - add(at, text)
\(bu yank text - cut(from, to)
.DE
.PP
To yank text, all text between two text positions is copied into a cut buffer.
The original text is not changed.
To copy the text into a cut buffer,
you need only remember which physical blocks contain the cut text,
the offset into the first block of the start of the cut,
the offset into the last block of the end of the cut,
and what kind of cut it was.
(Cuts may be either character cuts or line cuts;
the kind of a cut affects the way it is later "put".)
Yanking is implemented in the function \fBcut()\fR,
and pasting is implemented in the function \fBpaste()\fR.
These functions are defined in "cut.c".
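.PP
The bookkeeping for a cut buffer might look something like the structure below.
The field and function names are assumptions made for this example, not the ones used in "cut.c",
and "end" is taken here to mean the offset just past the last byte of the cut.

```c
#include <assert.h>
#include <stddef.h>

enum cutkind { CUT_CHARS, CUT_LINES };  /* character cut or line cut */

/* Hypothetical descriptor: the text itself stays in the temp file. */
struct cutbuf {
    const unsigned short *blocks;  /* physical blocks that contain the cut text */
    int nblocks;                   /* number of entries in blocks[] */
    int start;                     /* offset of the cut's first byte in the first block */
    int end;                       /* offset just past the cut's last byte in the last block */
    enum cutkind kind;             /* affects how the text is later "put" */
};

/* Number of bytes described by the cut.  blklen[i] is the number of
 * text bytes in c->blocks[i], excluding NUL padding. */
static long cut_length(const struct cutbuf *c, const int *blklen)
{
    long total;
    int i;

    assert(c != NULL && blklen != NULL && c->nblocks >= 1);
    if (c->nblocks == 1)
        return c->end - c->start;

    total = blklen[0] - c->start;           /* tail of the first block */
    for (i = 1; i < c->nblocks - 1; i++)
        total += blklen[i];                 /* whole intervening blocks */
    return total + c->end;                  /* head of the last block */
}
```

.PP
Since only block numbers and offsets are recorded, yanking a large region copies no text at all.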
.PP
To delete text, you must modify the first and last blocks, and
remove any reference to the intervening blocks in the header's list.
The text to be deleted is specified by two marks.
This is implemented in the function \fBdelete()\fR.
.PP
To add text, you must specify
the text to insert (as a NUL-terminated string)
and the place to insert it (as a mark).
The block into which the text is to be inserted may need to be split into
as many as four blocks, with new intervening blocks needed as well...
or it could be as simple as modifying a single block.
This is implemented in the function \fBadd()\fR.
.PP
There is also a \fBchange()\fR function,
which generally just calls delete() and add().
For the special case where a single character is being replaced by another
single character, though, change() will optimize things somewhat.
The add(), delete(), and change() functions are all defined in "modify.c".
.PP
The \fBinput()\fR function reads text from a user and inserts it into the file.
It makes heavy use of the add(), delete(), and change() functions.
It inserts characters one at a time, as they are typed.
.PP
When text is modified, an internal file-revision counter, called \fBchanges\fR,
is incremented.
This counter is used to detect when certain caches are out of date.
(The "changes" counter is also incremented when we switch to a different file,
and also in one or two similar situations -- all related to invalidating caches.)
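.PP
One way such a counter can guard a cache is sketched below.
Only the name "changes" comes from the text; the cache structure, cache_get(), and the
demo function are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

static long changes = 0;    /* file-revision counter, bumped on every modification */

/* A cached value together with the revision it was computed for. */
struct cache {
    long stamp;             /* value of "changes" when the cache was filled; -1 = never */
    int  value;
};

/* Return the cached value, recomputing it only if the text has
 * changed since the cache was last filled. */
static int cache_get(struct cache *c, int (*compute)(void))
{
    assert(c != NULL && compute != NULL);
    if (c->stamp != changes) {
        c->value = compute();
        c->stamp = changes;
    }
    return c->value;
}

/* Stand-in for an expensive computation; it counts how often it runs. */
static int demo_calls = 0;
static int demo_compute(void)
{
    return ++demo_calls;
}
```

.PP
Calling cache_get() twice in a row runs demo_compute() once;
after "changes" is incremented, the next call runs it again.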
.NH 2
Marks and the Cursor
.PP
Marks are places within the text.
They are represented internally as 32-bit values which are split
into two bitfields:
a line number and a character index.
Line numbers start with 1, and character indexes start with 0.
Lines can be up to 1023 characters long, so the character index is 10 bits
wide and the line number fills the remaining 22 bits in the long int.
.PP
Since line numbers start with 1,
it is impossible for a valid mark to have a value of 0L.
0L is therefore used to represent unset marks.
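.PP
The packing can be sketched with a few helpers.
The names below are hypothetical, and the example uses an unsigned 32-bit type
so that a 22-bit line number can be shifted without touching a sign bit;
the text above only says that a long int is used.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MARK;                      /* stand-in for the 32-bit mark type */

#define IDX_BITS   10                       /* character index: 0..1023 */
#define IDX_MASK   ((1u << IDX_BITS) - 1u)
#define MAXLINE    ((1u << 22) - 1u)        /* largest line number that fits in 22 bits */
#define MARK_UNSET ((MARK)0)                /* no valid mark is 0, since lines start at 1 */

/* Build a mark from a 1-based line number and a 0-based character index. */
static MARK mark_make(uint32_t line, uint32_t idx)
{
    assert(line >= 1 && line <= MAXLINE);
    assert(idx <= IDX_MASK);
    return (line << IDX_BITS) | idx;
}

/* Extract the line number (upper 22 bits). */
static uint32_t mark_line(MARK m)
{
    return m >> IDX_BITS;
}

/* Extract the character index (lower 10 bits). */
static uint32_t mark_idx(MARK m)
{
    return m & IDX_MASK;
}
```

.PP
A convenient side effect of putting the line number in the high bits is that
marks compare in text order with an ordinary integer comparison.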
.PP
When you do the "delete text" change, any marks that were part of
the deleted text are unset, and any marks that were set to points
after it are adjusted.
Marks are adjusted similarly after new text is inserted.
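.PP
For the simple case where whole lines are deleted, the adjustment might look like the sketch below.
It assumes the 22-bit/10-bit layout described earlier; the function name is invented,
and deletions that start or end in the middle of a line need extra care that is not shown.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MARK;                 /* 22-bit line number, 10-bit character index */
#define IDX_BITS   10
#define MARK_UNSET ((MARK)0)

/* Adjust mark m after lines first..last (1-based, inclusive) have been
 * deleted.  Hypothetical helper, whole-line deletions only. */
static MARK mark_after_linedelete(MARK m, uint32_t first, uint32_t last)
{
    uint32_t line = m >> IDX_BITS;

    assert(first >= 1 && first <= last);
    if (m == MARK_UNSET || line < first)
        return m;                      /* unset, or before the deleted text: unchanged */
    if (line <= last)
        return MARK_UNSET;             /* inside the deleted text: unset the mark */

    /* After the deleted text: move up by the number of lines removed. */
    return m - ((last - first + 1) << IDX_BITS);
}
```

.PP
Deleting lines 5 through 9 moves a mark at line 20, column 7 to line 15, column 7,
and unsets a mark anywhere on line 7.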
.PP
The cursor is represented as a mark.
.NH 2
Colon Command Interpretation
.PP
Colon commands are parsed, and the command name is looked up in an array
of structures which also contain a pointer to the function that implements
the command, and a description of the arguments that the command can take.
If the command is recognized and its arguments are legal,
then the function is called.
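.PP
A table-driven dispatcher of this general shape is sketched below.
Everything in it is illustrative: the structure layout, the flag names, and the two toy
commands are assumptions, and the real parser also has to deal with addresses and
abbreviated command names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARG_BANG 0x01   /* command accepts a trailing '!' */
#define ARG_TEXT 0x02   /* command accepts an argument string */

/* One table entry: name, implementing function, and legal arguments. */
struct cmdinfo {
    const char *name;
    int (*fn)(const char *args, int bang);
    unsigned argflags;
};

/* Toy commands; the return values only identify which path ran. */
static int cmd_quit(const char *args, int bang)  { (void)args; return bang ? 11 : 10; }
static int cmd_write(const char *args, int bang) { (void)bang; return *args ? 21 : 20; }

static const struct cmdinfo cmdtable[] = {
    { "quit",  cmd_quit,  ARG_BANG },
    { "write", cmd_write, ARG_BANG | ARG_TEXT },
};

/* Look up a command by its full name; NULL if unknown. */
static const struct cmdinfo *cmd_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof cmdtable / sizeof cmdtable[0]; i++)
        if (strcmp(cmdtable[i].name, name) == 0)
            return &cmdtable[i];
    return NULL;
}

/* Check the arguments against the table entry, then call the function.
 * Returns -1 for an unknown command or illegal arguments. */
static int cmd_run(const char *name, const char *args, int bang)
{
    const struct cmdinfo *c;

    assert(name != NULL && args != NULL);
    c = cmd_lookup(name);
    if (c == NULL)
        return -1;
    if (bang && !(c->argflags & ARG_BANG))
        return -1;
    if (*args && !(c->argflags & ARG_TEXT))
        return -1;
    return c->fn(args, bang);
}
```

.PP
Here cmd_run("quit", "x", 0) is rejected before cmd_quit() is ever called,
because the table says that "quit" takes no argument string.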
.PP
Each function performs its task; this may have side effects,
such as moving the cursor to a different line.
.NH 2
Screen Control
.PP
In input mode or visual command mode,
the screen is redrawn by a function called \fBredraw()\fR.
This function is called in the getkey() function before each keystroke is
read in, if necessary.
.PP
The redraw() function writes to the screen via a package which looks like the "curses"
library, but isn't.
It is actually much simpler.
Most curses operations are implemented as macros which copy characters
into a large I/O buffer, which is then written with a single large
write() call as part of the refresh() operation.
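.PP
The buffering idea can be sketched as follows, for a POSIX system.
The names out_addch(), out_refresh(), and the buffer size are made up for the example
and are not the macros in "curses.h".

```c
#include <assert.h>
#include <unistd.h>

#define OUTSIZE 2048                /* hypothetical buffer size */

static char  outbuf[OUTSIZE];       /* pending screen output */
static char *outptr = outbuf;       /* next free byte in outbuf */
static int   outfd  = 1;            /* file descriptor of the terminal */

/* Write out everything buffered so far and empty the buffer.
 * Normally this is one write(); the loop only covers short writes. */
static void out_flush(void)
{
    char *p = outbuf;

    assert(outptr >= outbuf && outptr <= outbuf + OUTSIZE);
    while (p < outptr) {
        ssize_t n = write(outfd, p, (size_t)(outptr - p));
        if (n <= 0)
            break;                  /* give up on error */
        p += n;
    }
    outptr = outbuf;
}

/* Copy one character into the buffer, flushing first if it is full. */
#define out_addch(ch) \
    do { \
        if (outptr == outbuf + OUTSIZE) \
            out_flush(); \
        *outptr++ = (char)(ch); \
    } while (0)

/* Like refresh(): make the buffered output appear on the screen. */
#define out_refresh() out_flush()
```

.PP
Because out_addch() is a macro that usually does nothing more than store one byte,
drawing a full screen costs one system call rather than one per character.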
.PP
(Note: Under MS-DOS, the pseudo-curses macros check to see whether you're
using the pcbios interface. If you are, then the macros call functions
in "pc.c" to implement screen updates.)
.PP
The low-level functions which modify text (namely add(), delete(), and change())
supply redraw() with clues to help redraw() decide which parts of the
screen must be redrawn.
The clues are given via a function called \fBredrawrange()\fR.
.PP
Most EX commands use the pseudo-curses package to perform their output,
like redraw().
.PP
There is also a function called \fBmsg()\fR which uses the same syntax as printf().
In EX mode, msg() writes the message to the screen and automatically adds a
newline.
In VI mode, msg() writes the message on the bottom line of the screen
with the "standout" character attribute turned on.
.NH 2
Options
.PP
For each option available through the ":set" command,
\*E contains a character array variable, named "o_\fIoption\fR".
For example, the "lines" option uses a variable called "o_lines".
.PP
For boolean options, the array has a dimension of 1.
The first (and only) character of the array will be NUL if the
variable's value is FALSE, and some other value if it is TRUE.
To check the value, just dereference the array name,
as in "if (*o_autoindent)".
.PP
For number options, the array has a dimension of 3.
The array is treated as three unsigned one-byte integers.
The first byte is the current value of the option.
The second and third bytes are the lower and upper bounds of that
option.
.PP
For string options, the array usually has a dimension of about 60
but this may vary.
The option's value is stored as a normal NUL-terminated string.
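.PP
The three layouts can be illustrated like this.
The initial values, the string option, and the set_number() helper are invented for the example,
and the number option is declared as unsigned char here to make the
"unsigned one-byte integers" reading explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char UCHAR;

/* Hypothetical declarations in the style described above. */
static char  o_autoindent[1] = { 0 };           /* boolean: NUL means FALSE */
static UCHAR o_lines[3]      = { 25, 2, 50 };   /* number: value, lower bound, upper bound */
static char  o_shell[60]     = "/bin/sh";       /* string: NUL-terminated */

/* Store a new value in a number option if it lies within the bounds
 * kept in the second and third bytes.  Returns 0 on success, -1 if the
 * value is out of range (the option is then left unchanged). */
static int set_number(UCHAR opt[3], int value)
{
    assert(opt != NULL);
    if (value < opt[1] || value > opt[2])
        return -1;
    opt[0] = (UCHAR)value;
    return 0;
}
```

.PP
With these declarations, "if (*o_autoindent)" tests the boolean, "*o_lines" yields the
current number, and "o_shell" can be passed to any function that expects a C string.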
.PP
All of the options are declared in "opts.c".
Most are initialized to their default values;
the \fBinitopts()\fR function is used to perform any environment-specific
initialization.
.NH 2
Portability
.PP
To improve portability, \*E collects as many of the system-dependent
definitions as possible into the "config.h" file.
This file begins with some preprocessor instructions which attempt to
determine which compiler and operating system you have.
After that, it conditionally defines some macros and constants for your system.
.PP
One of the more significant macros is \fBttyread()\fR.
This macro is used to read raw characters from the keyboard, possibly
with timeout.
For UNIX systems, this basically reads bytes from stdin.
For MSDOS, TOS, and OS9, ttyread() is a function defined in curses.c.
There is also a \fBttywrite()\fR macro.
.PP
The \fBtread()\fR and \fBtwrite()\fR macros are versions of read() and write() that are
used for text files.
On UNIX systems, these are equivalent to read() and write().
On MS-DOS, these are also equivalent to read() and write(),
since DOS libraries are generally clever enough to convert newline characters
automatically.
For Atari TOS, though, the MWC library is too stupid to do this,
so we had to do the conversion explicitly.
.PP
Other macros may substitute index() for strchr(), or bcopy() for memcpy(),
or map the "void" data type to "int", and so on.
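.PP
A fragment in that spirit is sketched below.
The NEED_BSD_STRINGS switch is a made-up name, not one that "config.h" actually tests,
and the two helper functions exist only to show that the rest of the code can
be written once, in terms of strchr() and memcpy().

```c
#include <assert.h>
#include <stddef.h>

#ifndef NEED_BSD_STRINGS            /* hypothetical switch, normally set per system */
# define NEED_BSD_STRINGS 0
#endif

#if NEED_BSD_STRINGS
  /* Older BSD libraries: map the newer names onto index() and bcopy().
   * Note that bcopy() takes its source first and returns nothing. */
# include <strings.h>
# define strchr(s, c)     index((s), (c))
# define memcpy(d, s, n)  bcopy((s), (d), (n))
#else
# include <string.h>
#endif

/* Count the occurrences of ch in s, using only strchr(). */
static int count_char(const char *s, int ch)
{
    int n = 0;

    assert(s != NULL && ch != '\0');
    while ((s = strchr(s, ch)) != NULL) {
        n++;
        s++;
    }
    return n;
}

/* Copy n bytes; memcpy() is used as a statement so that the bcopy()
 * form, which has no return value, works too. */
static void copy_bytes(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}
```

.PP
Keeping the system test in one #if/#endif pair means that the functions below it
never need to mention a particular operating system.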
.PP
The file "tinytcap.c" contains a set of functions that emulate the termcap
library for a small set of terminal types.
The terminal-specific info is hard-coded into this file.
It is only used for systems that don't support real termcap.
Another alternative for screen control can be seen in
the "curses.h" and "pc.c" files.
Here, macros named VOIDBIOS and CHECKBIOS are used to indirectly call
functions which perform low-level screen manipulation via BIOS calls.
.PP
The stat() function must be able to come up with UNIX-style major/minor/inode
numbers that uniquely identify a file or directory.
.PP
Please try to keep your changes localized,
and wrap them in #if/#endif pairs,
so that \*E can still be compiled on other systems.
And PLEASE let me know about them, so I can incorporate your changes into
my latest-and-greatest version of \*E.