BSD 4_2 development
[unix-history] / usr / lisp / ch12.n
CommitLineData
1c9c8e44
C
1." $Header: ch12.n 1.2 83/07/23 12:41:32 layer Exp $
2.Lc Liszt\ -\ the\ lisp\ compiler 12
3.sh 2 "General strategy of the compiler" \n(ch 1
4.pp
5The purpose of the lisp compiler, Liszt, is to create an object module which
6when brought into the lisp system using
7.i fasl
8will have the same effect as bringing in the corresponding lisp coded source
9module with
10.i load
11with one important exception,
12functions will be defined as sequences of machine language instructions, instead
13of lisp S-expressions.
14Liszt is not a function compiler, it is a
15.i file
16compiler.
17Such a file can contain more than function definitions; it can
18contain other lisp S-expressions which are evaluated
19at load time.
20These other S-expressions will also be stored in the object
21module produced by Liszt and will be evaluated at fasl time.
22.pp
23As is almost universally true of Lisp compilers, the main pass of Liszt
24is written in Lisp.
25A subsequent pass is the assembler, for which we use the
26standard UNIX assembler.
27.sh 2 "Running the compiler"
28.pp
29The compiler is normally run in this manner:
30.br
31% \fBliszt foo\fP
32.br
33will compile the file foo.l or foo (the preferred way to indicate a lisp
34source file is to end the file name with `.l').
35The result of the compilation will be placed in the file foo.o if no
36fatal errors were detected.
37All messages which Liszt generates go to the standard output.
38Normally each function name is printed before it is compiled (the \-q
39option suppresses this).
40.sh 2 "Special forms"
41.pp
42Liszt makes one pass over the source file.
43It processes each form in this way:
44.sh 3 macro\ expansion
45.pp
46If the form is a macro invocation (i.e it is a list whose car is a symbol
47whose function binding is a macro), then that macro invocation is expanded.
48This is repeated until the top level form is not a macro invocation.
49When Liszt begins, there are already some macros defined, in fact some
50functions (such as defun) are actually macros.
51The user may define his own macros as well.
52For a macro to be used it must be defined in the Lisp system
53in which Liszt runs.
54.sh +0 classification
55.pp
56After all macro expansion is done, the form is classified according to its
57.i car
58(if the form is not a list, then it is classified as an
59.i other ).
60.sh +1 "eval-when"
61.pp
62The form of eval-when is
63\fI(eval-when\ (time1\ time2\ ...)\ form1\ form2\ ...)\fP
64where the time\fIi\fP are one of
65.i eval ,
66.i compile ,
67or
68.i load .
69The compiler examines the form\fIi\fP in sequence and the action taken
70depends on what is in the time list.
71If
72.i compile
73is in the list then the compiler will invoke
74.i eval
75on each form\fIi\fP as it examines it.
76If
77.i load
78is in the list then the compile will recursively call itself to compile
79each form\fIi\fP as it examines it.
80Note that if
81.i compile
82and
83.i load
84are in the time list, then the compiler will both evaluate and compile
85each form.
86This is useful if you need a function to be defined in the compiler
87at both compile time (perhaps to aid macro expansion) and at run time
88(after the file is
89.i fasl ed
90in).
91.sh +0 "declare"
92.pp
93Declare is used to provide information about functions and variables to
94the compiler.
95It is (almost) equivalent to \fI(eval-when\ (compile)\ ...)\fP.
96You may declare functions to be one of three types: lambda (*expr),
97nlambda (*fexpr), lexpr (*lexpr).
98The names in parenthesis are the Maclisp names and are accepted by the
99compiler as well (and not just when the compiler is in Maclisp mode).
100Functions are assumed to be lambdas until they are declared otherwise
101or are defined differently.
102The compiler treats calls to lambdas and lexprs equivalently, so you needn't
103worry about declaring lexprs either.
104It is important to declare nlambdas or define them before calling them.
105Another attribute you can declare for a function is localf which
106makes the function `local'.
107A local function's name is
108known only to the functions defined
109within the file itself. The
110advantage of a local function is that is can be entered
111and exited very quickly and it can have the same name as a function in
112another file and there will be no name conflict.
113.pp
114Variables may be declared special or unspecial.
115When a special variable is lambda bound (either in a lambda,
116prog or do expression), its old value is stored away on a stack for the
117duration of the lambda, prog or do expression.
118This takes time and is often not necessary.
119Therefore the default classification for variables is unspecial.
120Space for unspecial variables is dynamically allocated on a stack.
121An unspecial variable can only be accessed from within the function
122where it is created by its presence in a lambda, prog or do
123expression variable list.
124It is possible to declare that all variables are special as will be
125shown below.
126.pp
127You may declare any number of things in each declare statement.
128A sample declaration is
129.ft I
130.nf
131(declare
132\ \ \ \ \ (lambda func1 func2)
133\ \ \ \ \ (*fexpr func3)
134\ \ \ \ \ (*lexpr func4)
135\ \ \ \ \ (localf func5)
136\ \ \ \ \ (special var1 var2 var3)
137\ \ \ \ \ (unspecial var4))
138.fi
139.ft R
140.pp
141You may also declare all variables to be special with
142\fI(declare\ (specials\ t))\fP.
143You may declare that macro definitions should be compiled as well as
144evaluated at compile time by \fI(declare\ (macros\ t))\fP.
145In fact, as was mentioned above, declare is much like
146\fI(eval-when\ (compile)\ ...)\fP.
147Thus if the compiler sees \fI(declare\ (foo\ bar))\fP
148and foo is defined, then it will evaluate \fI(foo\ bar)\fP.
149If foo is not defined then an undefined declare attribute warning will
150be issued.
151.sh +0 "(progn 'compile \fRform1 form2 ... formn\fB)\fP"
152.pp
153When the compiler sees this it simply compiles form1 through formn as if
154they too were seen at top level.
155One use for this is to allow a macro at top-level to
156expand into more than one function definition for the compiler to compile.
157.sh +0 "include/includef"
158.pp
159.i Include
160and
161.i includef
162cause another file to be read and compiled by
163the compiler. The result is the same as if the included file were
164textually inserted into the original file. The only difference
165between
166.i include
167and
168.i includef
169is that include doesn't evaluate its
170argument and includef does. Nested includes are allowed.
171.sh +0 "def"
172.pp
173A def form is used to define a function. The macros
174.i defun
175and
176.i defmacro
177expand to a def form.
178If the function being defined is a lambda, nlambda or lexpr then
179the compiler converts the lisp definition to a sequence of machine
180language instructions.
181If the function being defined is a macro, then the compiler will evaluate
182the definition, thus defining the macro withing the running Lisp compiler.
183Furthermore, if the variable
184.i macros
185is set to a non nil value, then the macro definition will also be translated
186to machine language and thus will be defined when the object file is
187fasled in.
188The variable
189.i macros
190is set to t by
191\fI(declare\ (macros\ t))\fP.
192.pp
193When a function or macro definition is compiled, macro expansion is
194done whenever possible.
195If the compiler can determine that a form would be evaluated if this
196function were interpreted then it will macro expand it.
197It will not macro expand arguments to a nlambda unless the characteristics
198of the nlambda is known (as is the case with
199.i cond).
200The map functions (
201.i map ,
202.i mapc ,
203.i mapcar ,
204and so on)
205are expanded to a
206.i do
207statement.
208This allows the first argument to the map function to be a lambda
209expression which references local variables of the function being
210defined.
211.sh +0 "other forms"
212.pp
213All other forms are simply stored in the object file and are evaluated
214when the file is
215.i fasl ed
216in.
217.sh 2 "Using the compiler"
218.pp
219The previous section describes exactly what the compiler does with its
220input.
221Generally you won't have to worry about all that detail as files which
222work interpreted will work compiled.
223Following is a list of steps you should follow to insure that a file
224will compile correctly.
225.ip [1]
226Make sure all macro definitions precede their use in functions or other
227macro definitions.
228If you want the macros to be around when you
229.i fasl
230in the object file you should include this statement at the beginning
231of the file: \fI(declare\ (macros\ t))\fP
232.ip [2]
233Make sure all nlambdas are defined or declared before they are used.
234If the compiler comes across a call to a
235function which has not been defined in the current file,
236which does not currently have a function binding,
237and whose type has not been declared then it will assume that the function
238needs its arguments evaluated
239(i.e. it is a lambda or lexpr) and will generate code
240accordingly.
241This means that you do not have to declare nlambda functions like
242.i status
243since they have an nlambda function binding.
244.ip [3]
245Locate all variables which are used for communicating values between
246functions.
247These variables must be declared special at the beginning of a file.
248In most cases there won't be many special declarations but if you
249fail to declare a variable special that should be, the compiled code
250could fail in mysterious ways.
251Let's look at a common problem, assume that a file contains just
252these three lines:
253.sp 2v
254.ft I
255(def aaa (lambda (glob loc) (bbb loc)))
256.br
257(def bbb (lambda (myloc) (add glob myloc)))
258.br
259(def ccc (lambda (glob loc) (bbb loc)))
260.sp 2v
261.ft R
262We can see that if we load in these two definitions then (aaa 3 4) is
263the same as (add 3 4) and will give us 7.
264Suppose we compile the file containing these definitions.
265When Liszt compiles aaa, it will assume that both glob and loc are local
266variables and will allocate space on the temporary stack for their values
267when aaa is called.
268Thus the values of the local variables glob and loc
269will not affect the values of the symbols glob and loc in the Lisp system.
270Now Liszt moves on to function bbb.
271Myloc is assumed to be local.
272When it sees the add statement, it find a reference to a variable called
273glob.
274This variable is not a local variable to this function and therefore
275glob must refer to the value of the symbol glob.
276Liszt will automatically declare glob to be special and it will print
277a warning to that effect.
278Thus subsequent uses of glob will always refer to the symbol glob.
279Next Liszt compiles ccc and treats glob as a special and loc
280as a local.
281When the object file is
282.i fasl 'ed
283in, and (ccc 3 4) is evaluated,
284the symbol glob will be lambda bound to 3
285bbb will be called and will return 7.
286However (aaa 3 4) will fail since when bbb is called, glob will be unbound.
287What should be done here is to put
288\fI(declare\ (special\ glob)\fP
289at the beginning of the file.
290.ip [4]
291Make sure that all calls to
292.i arg
293are within the lexpr whose arguments they reference.
294If \fIfoo\fP is a compiled lexpr and it calls \fIbar\fP then \fIbar\fP cannot
295use \fIarg\fP to get at \fIfoo\fP's arguments.
296If both
297.i foo
298and
299.i bar
300are interpreted this will work however.
301The macro
302.i listify
303can be used to put all of some of a lexprs arguments in a list which
304then can be passed to other functions.
305.sh 2 "Compiler options"
306.pp
307The compiler recognizes a number of options which are described below.
308The options are typed anywhere on the command line preceded by a minus sign.
309The entire command line is scanned and all options recorded before any action
310is taken. Thus
311.br
312% liszt -mx foo
313.br
314% liszt -m -x foo
315.br
316% liszt foo -mx
317.br
318are all equivalent.
319Before scanning the command line for options, liszt looks for in the
320environment for the variable LISZT, and if found scans its value
321as if it was a string of options.
322The meaning of the options are:
323.ip \fBC\fP
324The assembler language output of the compiler is commented.
325This is useful when debugging the compiler and is not normally done since
326it slows down compilation.
327.ip \fBI\fP
328The next command line argument is taken as a filename, and loaded prior
329to compilation.
330.ip \fBe\fP
331Evaluate the next argument on the command line before starting compilation.
332For example
333.br
334% liszt -e '(setq foobar "foo string")' foo
335.br
336will evaluate the above s-expression. Note that the shell requires
337that the arguments be surrounded by single quotes.
338.ip \fBi\fP
339Compile this program in interlisp compatibility mode.
340This is not implemented yet.
341.ip \fBm\fP
342Compile this program in Maclisp mode.
343The reader syntax will be changed to the Maclisp syntax and a file of
344macro definitions will be loaded in (usually named /usr/lib/lisp/machacks).
345This switch brings us sufficiently close to Maclisp to allow us to compile
346Macsyma, a large Maclisp program.
347However Maclisp is a moving target and we can't guarantee that this switch
348will allow you to compile any given program.
349.ip \fBo\fP
350Select a different object or assembler language file name.
351For example
352.br
353% liszt foo -o xxx.o
354.br
355will compile foo and into xxx.o instead of the default foo.o, and
356.br
357% liszt bar -S -o xxx.s
358.br
359will compile to assembler language into xxx.s instead of bar.s.
360.ip \fBp\fP
361place profiling code at the beginning of each non-local function.
362If the lisp system is also created with profiling in it, this allows
363function calling frequency to be determined (see \fIprof(1)\fP)
364.ip \fBq\fP
365Run in quiet mode.
366The names of functions being compiled and various
367"Note"'s are not printed.
368.ip \fBQ\fP
369print compilation statistics and warn of strange constructs.
370This is the inverse of the \fBq\fP switch and is the default.
371.ip \fBr\fP
372place bootstrap code at the beginning of the object file, which when
373the object file is executed will cause a lisp system to be invoked
374and the object file \fIfasl\fPed in.
375This is known as `autorun' and is described below.
376.ip \fBS\fP
377Create an assembler language file only.
378.br
379% liszt -S foo
380.br
381will create the file assembler language file foo.s and will not attempt
382to assemble it.
383If this option is not specified, the assembler language file will be put
384in the temporary disk area under a automatically generated name based on
385the lisp compiler's process id.
386Then if there are no compilation errors, the assembler will be invoked to
387assemble the file.
388.ip \fBT\fP
389Print the assembler language output on the standard output file.
390This is useful when debugging the compiler.
391.ip \fBu\fP
392Run in UCI-Lisp mode.
393The character syntax is changed to that of UCI-Lisp and a UCI-Lisp compatibility
394package of macros is read in.
395.ip \fBw\fP
396Suppress warning messages.
397.ip \fBx\fP
398Create an cross reference file.
399.br
400% liszt -x foo
401.br
402not only compiles foo into foo.o but also generates the file foo.x\ .
403The file foo.x is lisp readable and lists for each function all functions
404which that function could call.
405The program lxref reads one or more of these ".x" files and produces a
406human readable cross reference listing.
407.sh 2 autorun
408.pp
409The object file
410which liszt writes does not contain all the functions necessary
411to run the lisp program which was compiled.
412In order to use the object file, a lisp system must be started and
413the object file
414.i fasl ed
415in.
416When the -r switch is given to liszt, the object file created will
417contain a small piece of bootstrap code at the beginning, and the
418object file will be made executable.
419Now, when the name of the object file is given to the UNIX command
420interpreter (shell) to run, the bootstrap code at the beginning
421of the object file will cause a lisp system to be started and
422the first action the lisp system will take is to
423.i fasl
424in the object file which started it.
425In effect the object file has created an environment in which it can run.
426.pp
427Autorun is an alternative to
428.i dumplisp .
429The advantage of autorun is that the object file which starts the whole
430process is typically small, whereas the minimum
431.i dumplisp ed
432file is very large (one half megabyte).
433The disadvantage of autorun is that the file must be
434.i fasl ed
435into a lisp each time it is used whereas the file which
436.i dumplisp
437creates can be run as is.
438liszt itself is a
439.i dumplisp ed
440file since it is used so often and is large enough that
441too much time would be wasted
442.i fasl ing
443it in each time it was used.
444The lisp cross reference program, lxref, uses
445.i autorun
446since it is a small and rarely used program.
447.pp
448In order to have the program
449.i fasl ed
450in begin execution
451(rather than starting a lisp top level),
452the value of the symbol user-top-level should be set to the name of the
453function to get control. An example of this is shown next.
454.Eb
455\fIwe want to replace the unix date program with one written in lisp.\fP
456
457% \fBcat lispdate.l\fP
458(de\kBfun mydate nil
459 \h'|\nBu'\kA(patom "The date is ")
460 \h'|\nAu'\kB(patom (status ctime))
461 \h'|\nBu'\kA(terpr)
462 \h'|\nAu'(exit 0))
463(se\kAtq user-top-level 'mydate)
464
465% \fBliszt -r lispdate\fP
466Compilation begins with Lisp Compiler 5.2
467source: lispdate.l, result: lispdate.o
468mydate
469%Note: lispdate.l: Compilation complete
470%Note: lispdate.l: Time: Real: 0:3, CPU: 0:0.28, GC: 0:0.00 for 0 gcs
471%Note: lispdate.l: Assembly begins
472%Note: lispdate.l: Assembly completed successfully
4733.0u 2.0s 0:17 29%
474
475\fI We change the name to remove the ".o", (this isn't necessary) \fP
476% \fBmv lispdate.o lispdate\fP
477
478\fI Now we test it out \fP
479% \fBlispdate\fP
480The date is Sat Aug 1 16:58:33 1981
481%
482.Ee
483.sh 2 "pure literals"
484.pp
485Normally the quoted lisp objects (literals) which appear in functions are
486treated as constants.
487Consider this function:
488.br
489.ft I
490
491(de\kCf foo
492 \h'|\nCu'(lambda nil (cond \kA(\kB(not (eq 'a (car (setq x '(a b)))))
493 \h'|\nBu'(print 'impossible!!))
494 \h'|\nAu'(t (rplaca x 'd)))))
495
496.ft P
497.br
498At first glance it seems that the first cond clause will never be
499true, since the \fIcar\fP of \fI(a\ b)\fP should always be
500.i a .
501However if you run this function twice, it will print 'impossible!!' the
502second time.
503This is because the following clause modifies the 'constant' list \fI(a\ b)\fP
504with the \fIrplaca\fP function.
505Such modification of literal lisp objects can cause programs to behave
506strangely as the above example shows, but more importantly it can cause
507garbage collection problems if done to compiled code.
508When a file is \fIfasl\fPed in, if the
509symbol $purcopylits is non nil, the literal lisp data is put
510in 'pure' space, that is it put in space which needn't be looked at
511by the garabage collector. This reduces the work the garbage collector
512must do but it is dangerous since if the literals are modified to point
513to non pure objects, the marker may not mark the non pure objects.
514If the symbol $purcopylits is nil then the literal lisp data is put in
515impure space and the compiled code will act like the interpreted
516code when literal data is modified.
517The default value for $purcopylits is t.
518.sh 2 "transfer tables"
519.pp
520A transfer table is setup by
521.i fasl
522when the object file is loaded in.
523There is one entry in the transfer table for each function which is
524called in that object file.
525The entry for a call to the function
526.i foo
527has two parts whose contents are:
528.ip [1]
529function address \-
530This will initially point to the internal function
531.i qlinker .
532It may some time in the future point to the function
533.i foo
534if certain conditions are satisfied (more on this below).
535.ip [2]
536function name \-
537This is a pointer to the symbol
538.i foo .
539This will be used by
540.i qlinker.
541.sp 2v
542.lp
543When a call is made to the function
544.i foo
545the call will actually be made to the address in the
546transfer table entry and will end up in the
547.i qlinker
548function.
549.i Qlinker
550will determine that
551.i foo
552was the function being called by locating the function name
553entry in the transfer table\*[\(dg\*].
554.(f
555\*[\(dg\*]\fIQlinker\fP does this by tracing back the call stack until it
556finds the \fIcalls\fP machine instruction which called it. The address
557field of the \fIcalls\fP contains the address of the transfer table entry.
558.)f
559If the function being called is not compiled then
560.i qlinker
561just calls
562.i funcall
563to perform the function call.
564If
565.i foo
566is compiled and if \fI(status\ translink)\fP is non nil, then
567.i qlinker
568will modify the function address part of the transfer table to point directly
569to the function
570.i foo .
571Finally
572.i qlinker
573will call
574.i foo
575directly .
576The next time a call is made to
577.i foo
578the call will go directly to
579.i foo
580and not through
581.i qlinker .
582This will result in a substantial speedup in compiled code to compiled code
583transfers.
584A disadvantage is that no debugging information is left on the stack,
585so
586.i showstack
587and
588.i baktrace
589are useless.
590Another disadvantage is that if you redefine a compiled function either
591through loading in a new version or interactively defining it, then
592the old version may still be called from compiled code if the fast linking
593described above has already been done.
594The solution to these problems is to use \fI(sstatus\ translink\ value)\fP.
595If value is
596.ip \fInil\fP
597All transfer tables will be cleared, i.e. all function
598addresses will be set to point to
599.i qlinker .
600This means that the next time a function is called
601.i qlinker
602will be called and will look at the current definition.
603Also, no fast links will be set up since \fI(status\ translink)\fP
604will be nil.
605The end result is that
606.i showstack
607and
608.i baktrace
609will work and the function definition at the time of call will always be used.
610.ip \fIon\fP
611This causes the lisp system to go through all transfer tables and set up
612fast links wherever possible.
613This is normally used after you have
614.i fasl ed
615in all of your files.
616Furthermore since \fI(status\ translink)\fP is not nil,
617.i qlinker
618will make new fast links if the situation arises (which isn't likely unless
619you
620.i fasl
621in another file).
622.ip \fIt\fP
623This or any other value not previously mentioned will just make
624\fI(status\ translink)\fP be non nil, and as a result fast links will
625be made by
626.i qlinker
627if the called function is compiled.
628.sh +0 "Fixnum functions"
629.pp
630The compiler will generate inline arithmetic code for fixnum only functions.
631Such functions include \(pl, \(mi, *, /, \\, 1\(pl and 1\-.
632The code generated will be much faster than using \fIadd\fP, \fIdifference\fP,
633etc.
634However it will only work if the arguments to and results of the functions
635are fixnums.
636No type checking is done.