document distributed with 4.3BSD

author Nick Cuccia <cuccia@ucbvax.Berkeley.EDU>

Wed, 30 Apr 1986 10:11:40 +0000 (02:11 -0800)

committer Nick Cuccia <cuccia@ucbvax.Berkeley.EDU>

Wed, 30 Apr 1986 10:11:40 +0000 (02:11 -0800)
author Nick Cuccia <cuccia@ucbvax.Berkeley.EDU>
Wed, 30 Apr 1986 10:11:40 +0000 (02:11 -0800)
committer Nick Cuccia <cuccia@ucbvax.Berkeley.EDU>
Wed, 30 Apr 1986 10:11:40 +0000 (02:11 -0800)
diff --git a/usr/src/old/lisp/PSD.doc/franz.n b/usr/src/old/lisp/PSD.doc/franz.n

index d2135ac..a7aba6a 100644 (file)
--- a/usr/src/old/lisp/PSD.doc/franz.n
+++ b/usr/src/old/lisp/PSD.doc/franz.n
@@ -2,5 +2,1638 @@
  .\" All rights reserved.  The Berkeley software License Agreement
  .\" specifies the terms and conditions for redistribution.
  .\"
  .\" All rights reserved.  The Berkeley software License Agreement
  .\" specifies the terms and conditions for redistribution.
  .\"
-.\"    @(#)franz.n     5.1 (Berkeley) %G%
+.\"    @(#)franz.n     6.1 (Berkeley) %G%
  .\"
  .\"
+."  franz.n                            -[Thu Jun 17 11:01:27 1982 by jkf]-
+." this file will contain a description of the franz lisp system.
+." sort of a systems manual.
+.de Fr
+F\s-2RANZ\s0 L\s-2ISP\s0
+..
+.tp
+.(l C
+.sz +2
+\fBThe Franz Lisp System\fP
+.sz -2
+.sp 2v
+\fIby
+\fIJohn K. Foderaro\fP
+.)l
+.sp 3v
+.tl ''Abstract''
+.br
+This document describes the 
+.Fr
+system written at The University of California 
+at Berkeley.
+Included are descriptions of the memory management, interpreter
+and compiler, the conventions used within the C coded kernel
+and those within the compiled code.
+This does not duplicate the information found in The Franz Lisp Manual.
+.++ C '\s-2Draft\s+2'\fBThe\ Franz\ Lisp\ System\fP'%'
+.+c Characteristics\ of\ \F\s-2RANZ\ \s0L\s-2ISP\s0
+.ls 1
+.pp
+There is no formal standard for lisp systems, thus each lisp
+system is almost guaranteed to be different from any other lisp system.
+In this section we will examine the design decisions which most often
+characterize a lisp system.
+This
+focus does  not imply that all other parts of the language are generally
+agreed upon.
+In fact, there is nothing sacred to lisp system designers.
+For example, one usually identifies the symbol 
+.i nil 
+with the empty list,
+and one usually thinks of lisp as a dynamically scoped language.
+Both of these conventions are are not true in the lisp dialect 
+.i NIL
+currently being written at MIT.
+As another example, one imagines that a lisp system must 
+use  garbage collection to reclaim storage.
+The lisp dialect
+.i ZetaLisp
+that is running on Lisp Machines doesn't
+normally use a garbage collector
+because of the way in which it allocates its address space.
+It is faster to reboot the machine than to do a garbage collection.
+In most lisp dialects the lisp expressions are written in 
+parenthesised form.  In 
+.i Portable
+.i Standard
+.i Lisp
+(PSL) at the University of Utah, programs are written 
+primarily 
+in an Algol-like
+syntax.
+.pp
+Thus some of the discussion to follow indicates not so much 
+.i unique
+charateristics of 
+.Fr
+but instead, how decisions were made.
+.sh 2 Scoping\ and\ Binding
+.pp
+The 
+.Fr
+interpreter
+uses 
+.i dynamic 
+.i scoping , 
+that is, after A calls B, B can access all of the
+variables that  A allocated
+as well
+as those that A can access, provided B doesn't allocate variables of the same
+names.
+There are two popular ways of implementing dynamic scoping: 
+.i deep
+.i binding
+and 
+.i shallow
+.i binding .
+Note that we will use the terms variable allocation and lambda
+binding interchangeably in this report.
+The former term is less language-specific than the latter.
+.sh +1 Deep\ Binding
+.pp
+In a deep binding implementation, when a variable is  allocated,
+the name of the variable and a space for its value are pushed on the
+top of a stack.
+When a program asks for the value of a variable, 
+the lisp system must search down the stack for the first occurrence of
+the variable.
+This system offers great flexibility for the 
+programmer since  the state of his program 
+can be described by the contents of the stack.
+It is thus possible to save the context of a program by just saving
+the stack or in some cases just a pointer to
+a place in  the stack.
+The problem with deep binding is that it is time-consuming to search
+down the stack for the value of a variable, and, 
+as a result, most systems 
+modify the deep binding model by giving variables a global value
+cell and allowing programs to set and access the global cell.
+Some implementations of Interlisp use deep binding.
+.sh +0 Shallow\ Binding
+.pp
+In a shallow binding implementation, each variable has a global value cell
+which contains the current value of the variable.
+When a variable is allocated inside a program, its name 
+and old value are pushed on a stack called a
+.i bindstack .
+The variable is then given a new value. 
+Throughout the lifetime of the allocation, the current value of the
+variable can be found in the global value cell associated with the variable.
+When leaving the context of the variable's allocation, the old value
+is restored from the 
+.i bindstack .
+A shallow binding scheme makes it much cheaper to access 
+the values of variables, however it is much more difficult to 
+save the state and to restore it. 
+.pp
+.Fr
+and most other lisps which are not derived from Interlisp
+use shallow binding.
+Some versions of Interlisp use shallow binding.
+.sh -1 Lisp\ Compiler
+.pp
+Dynamic scoping often is not necessary and it is never cheap, even
+in a shallow binding implementation.
+Thus the 
+.Fr  
+compiler prefers to do lexical scoping rather than dynamic
+scoping.
+If the user does not specify otherwise,  when a 
+function is compiled, all variables 
+allocated will be 
+put on a local stack and will 
+.i not 
+be accessible by any
+function that this function calls.
+This
+convention results in faster code being generated by
+the compiler, but if the
+user is not careful, it can result in compiled and interpreted code
+working differently.
+In some of the new lisp designs, such as 
+.i NIL 
+and
+.i Common
+.i Lisp
+the semantics of compiled and interpreted code have been
+unified by
+transferring the compiler semantics (lexical scoping) to the interpreter.
+This is a rather large step
+if dynamic scoping is no
+longer an option, and it is not clear whether the resulting
+language should be called 'lisp'.
+.sh +0 Data\ Types
+.pp
+A complete list of data types in 
+.Fr 
+can be found in the first chapter of the 
+.Fr
+Manual.
+The most important ones are described below.
+.pp
+Lisp is a list processing language and the basic data type is the list
+cell. 
+A list cell also goes by the name of cons cell, pair, or dotted pair (dtpr).
+It contains pointers to two lisp objects, and these pointers can
+be accessed via the 
+.i car
+and
+.i cdr 
+functions.
+Each pointer requires four (8-bit) bytes and thus the list cell is 
+eight bytes large.
+The cdr pointer is stored first since this makes it easier 
+to 'cdr down a list' using the addressing modes of the VAX.
+.pp
+The symbol cell is used to implement variables.  It contains  a pointer
+to a string that is the print name, a cell to store a value, a cell to
+store the function definition, a cell to store a pointer to  the property
+list and one pointer that the list system uses if it stores
+the symbol in the oblist.
+.sh +0 Memory\ Management
+.pp
+A lisp system must be able to determine the type of an
+object at run time.
+The method used to determine the type influences the
+way storage is managed and garbage collection is done.
+Next, we will look at three possible places
+to store the type information.
+.sh +1 Typed\ data
+.pp
+The type of the data object could be placed in the object itself, much
+as Pascal stores the type of a variant record in the record itself.
+This could result is a large amount of storage being used to store
+type information.
+No lisp system that we know of uses this method exclusively.
+.sh +0 Typed\ pointers
+.pp
+Every reference to a lisp object could contain the type of the object
+referenced.
+This is a good idea in a machine like an IBM 370 where only
+part of  machine word is used by the addressing hardware.
+Lisps that use typed pointers generally use a
+.i heap
+memory management scheme and a 
+.i copying 
+garbage collector.
+In a heap scheme, 
+all memory is divided by a pointer (called the
+heap pointer) separating lisp
+data and free space.  
+When a new lisp object is needed, it is formed at the
+dividing line and then  
+the heap pointer is incremented
+past the new object.
+Garbage collection is done by maintaining two separate spaces for lisp
+data, only one of which is used at any one time.
+When one fills up, all active lisp objects are copied to the 
+other space, leaving the first space totally free. 
+Subsequent allocations are done from the second space, until it fills up,
+at which point 
+all active data in the second space is copied to the first space.
+The advantage of the copying garbage collector is that the data space
+is compacted, which will improve virtual memory behavior.
+The disadvantage is that
+half the data space is always unused.
+.pp
+PSL on a PDP-10 uses this scheme, as does Lisp 370 on 
+an IBM 370.
+PSL  and NIL on the VAX will also use this scheme.  
+Since the VAX hardware 
+uses the entire word for address 
+calculation, 
+PSL and NIL 
+must mask off the type bits
+of a pointer before using it to reference memory.
+.sh +0 Typed\ pages
+.pp
+The final alternative is to allocate data objects by pages rather than
+individually.
+Each page contains only one type of object and there is a table that 
+keeps track of the type of data object in each page.
+The address of an object tells which page the object is on and 
+the page number tells which type the object is.
+Since a whole page of objects is allocated when only one 
+object is necessary,
+the rest of the objects in the page are put on a free list.
+This method of allocation is sometimes referred to as
+.i bibop
+for BIg Bag Of Pages.
+The garbage collector for bibop is usually
+a traditional
+.i mark
+and
+.i sweep
+algorithm.
+All active data objects are marked  and
+then each page is swept linearly with
+the free cells being put on the free list and the used cells 
+left alone.
+The advantage of mark and sweep over a copying garbage collector is that
+the mark phase is cheaper because data objects do not have to be copied.
+Also, all of memory can be used since there is no requirement for two spaces.
+The disadvantage is that a sweep phase is required in a mark and sweep
+but one is not required in a copying garbage collector.
+The sweep phase is expensive because it has to alter most data pages 
+while building a free list.
+This operation 
+can be expensive in a virtual memory environment.  
+Alternatives to the sweep phase have been proposed in
+[Foderaro+Fateman Characterization of Vax Macsyma].
+.pp
+.Fr
+uses a bibop memory allocator and a mark and sweep garbage collector,
+as does Maclisp (on a PDP-10).
+The reason that
+.Fr 
+uses bibop is primarily due to the VAX architecture,
+which makes typed pointers
+expensive to implement.
+Also, typed pointers would make it difficult to pass lisp values
+to  programs written in other languages, some of which may not have
+the ability to extract the address of a lisp object from a typed pointer.
+.sh -1 Construction
+.pp
+The
+.Fr
+system is built by adding lisp code to a C-coded kernel.
+The C-coded kernel is a working lisp interpreter, although it has few of
+the functions a lisp system needs.
+The lisp code provides most of the functions, the top level and error
+handlers.
+The lisp compiler is also written in lisp, 
+but is usually 
+run as a separate process.
+.sh +0 Unique\ features
+.pp
+.Fr
+can dynamically load and execute functions defined in the other languages
+which run on the VAX.
+It uses two different dynamic loaders.
+One is part of the lisp system and is used for lisp only, it is called 
+.i fasl
+(for fast loader).
+The other is the Unix loader, ld, which is used for loading in 
+C, Fortran or Pascal code.
+Loading code with fasl
+is much faster than with ld
+since fasl benefits from 
+the simplicity of a lisp object file.
+.+c Data\ Structures
+.sh 0 _
+.nr $1 \n(ch
+.pp
+In this chapter we shall examine the data structures which are most 
+important to the 
+.Fr
+system.
+First, though we will see how the lisp system uses the address space.
+.sh 2 Physical\ layout
+.pp
+As a Unix process, lisp has three address regions: text, data and stack.
+Text begins at location 0 and continues up to 0x3b73c (0x means hex number).
+The data segment begins on the next page boundary and continues
+up to a limit set by the configuration parameters (currently 0x2fd000).
+Lisp does not begin running with such a large data segment,  rather it grows
+when necessary.
+The stack segment begins at address 0x7fffffff and grows downward to a 
+maximum size of one-half megabyte.
+.pp
+The text segment stores the instructions for the C kernel as well as read-only
+data.
+The read-only data for lisp are 
+the symbol nil, the i/o ports, and the small integer
+table.
+The symbol nil is stored at location 0 which makes it very easy to test
+whether a value is nil.
+The problem with storing nil in read-only space is that a special case
+must be made for nil's property list, which is the only thing in the
+nil symbol that the user is permitted to alter.
+.pp
+The fixnums -1024 through 1023 are stored sequentially, with 0 being
+stored at 0x1400.
+.Fr
+doesn't have any 'immediate' lisp data, that is data whose value is 
+stored as part of the reference to the data. 
+But, by storing some of the fixnums in a known place, we can achieve
+some of the benefits of immediate data: 
+A program  can use 
+.i eq
+as a test for a fixnum in the range -1024 to 1023.
+In the majority of cases, when asked
+to allocate a fixnum, the system can  return a pointer into this table
+and bypass the memory allocator.
+.sh +0 Typetable
+.pp
+.Fr
+uses the typed pages (or bibop)
+method of type determination.
+The 
+.i typetable
+is an array of bytes (
+.i chars 
+in  C lingo).
+This table describes the type of all pages, from page 0 where nil is stored,
+up to the end of data space.
+Thus there are many entries that describe the types of the pages which
+make up the C kernel.
+In order to reduce the wasted space in the typetable,
+we could have made the typetable begin typing pages at the start of data
+space and  make a special case of the pages that contain nil and
+the small integer table.
+However, the
+effect of this change would be that type checks 
+(which are done in-line in compiled code) would
+be longer and slower.  Since type checking happens quite
+frequently, we felt it was better to waste the space in the
+typetable in order to save execution time  and instruction space.
+.pp
+Each page on a VAX is 512 bytes, and thus to find the type of an 
+object given the address of it, 
+it is only necessary to shift the address right 9 bits
+and index the typetable array offset by one.
+The offset by one is required because the value -4, which is an illegal
+address, is used as a sentinel value to indicate an illegal value.
+Thus when we take the type of -4 we will be indexing the table by -1
+and we want to retrieve the first byte in the table (which contains
+the value UNBOUND).
+The C macro which retrieves the type is this (from file h/global.h):
+.(b I
+
+#define        TYPE(a1)        ((typetable+1)[(int)(a1) >> 9])
+
+.)b
+This is code generated by the lisp compiler to check if the type
+code of an argument (stored at 0(r10)) is a symbol (which is type
+code 1):
+.(b I
+
+ashl   $-9,0(r10),r0
+cmpb   _typetable+1[r0],$1
+
+.)b
+.pp
+The type codes which are stored in the typetable are these:
+.ts 2i 4i 6i
+.(b I
+UNBO     -1\tSTRNG     0\tATOM     1\tINT       2
+DTPR      3\tDOUB      4\tBCD      5\tPORT      6
+ARRAY     7\tOTHER     8\tSDOT     9\tVALUE    10
+HUNK2    11\tHUNK4    12\tHUNK8   13\tHUNK16   14
+HUNK32   15\tHUNK64   16\tHUNK128 17
+.)b
+The names given above are the C kernel internal names.
+ATOM is symbol, INT is fixnum, DTPR is list cell, 
+DOUB is flonum, BCD is binary,
+SDOT is bignum and all the HUNKn types are just known as hunk to
+the user.
+.sh +0 Stacks
+.pp
+.Fr
+uses three stacks: the 
+.i C 
+.i runtime 
+stack, the 
+.i namestack
+and the
+.i bindstack .
+The C runtime stack is the stack part of the address space 
+and the other two stacks are stored in the data space.
+.sh +1 C\ runtime\ stack
+The C-coded kernel uses this
+stack in the same way as a typical  C program
+use the stack:
+storing return addresses
+and non-lisp-object arguments to  subroutines,
+saving registers,
+and storing  local variables within a function.
+The lisp system stores 
+.i catch
+.i frames
+on this stack as well (to be described later).
+.pp
+The lisp system assumes that there are no lisp data on the stack and
+thus  the use of this stack 
+by a program is unrestricted.
+As will be discussed later on, the 
+.b calls
+and 
+.b ret
+instructions are used for most subroutine calls and returns.
+These instructions expect the stack to look a certain way.
+.sh +0 Namestack
+.pp
+The namestack contains only lisp data.
+It is used to pass arguments to functions and to  hold
+local (lexically scoped) data within lisp functions.
+It is also used as a temporary storage spot for lisp data
+which must be protected from garbage collection.
+.pp
+A slight digression on the garbage collector:
+The person who writes code for the lisp system must always be aware of
+his greatest enemy: the garbage collector.  
+Whenever a function is called that could possibly allocate more lisp
+data, one must assume that it 
+when it attempts to allocate space,  the garbage collector
+will be invoked and that it will take away everything that isn't protected
+from garbage collection.
+The objects that are protected from garbage collection are: the
+namestack, the bindstack, the oblist (table of interned symbols),
+and the compiler literals.  Objects that are only 
+referred to by values in registers or
+the C runtime stack will not be seen by the mark phase of the
+garbage collector and will be swept up during the sweep phase.
+.pp
+Back to the namestack:
+The first free element on the namestack is pointed to by the
+C variable 
+.i np .
+This variable is always stored in the VAX register r6.
+Another pointer into the namestack is the C variable
+.i lbot .
+It is always stored in VAX register r7.
+Its use will be described in the section on calling conventions.
+.sh +0 Bindstack
+.pp
+The bindstack is a stack of lisp object pairs: symbol and
+value.
+It is used to implement shallow binding.
+When a symbol is lambda-bound, the symbol and its current value are put
+on this stack.
+Then the symbol can be given a new value.
+When the context of the lambda binding is over, the symbol and value pair
+are popped from the stack and the symbol is given its old value.
+The C variable
+.i bnp
+points to the first free entry on the bindstack.
+In the C code, the following macros lambda-bind a symbol to a value
+and restore the old value:
+.(b
+#define PUSHDOWN(atom,value)\e
+       {bnp->atm=(atom);\e
+         bnp++->val=(atom)->a.clb;\e
+        (atom)->a.clb=value;\e
+       if(bnp>bnplim) binderr();}
+
+#define POP\e
+       {--bnp;  bnp->atm->a.clb=bnp->val;}
+.)b
+.sh -1 Bitmap
+.pp
+The bitmap is used in garbage collection to hold
+the mark bits for the mark and sweep garbage collector.
+As its names implies, it is viewed as a collection of bits.
+The minimum size of a lisp data object is 4 bytes, thus
+there are 128 of those on a VAX page of 512 bytes.
+Since there are 8 bits in a byte, it takes 16 bytes to obtain 128 bits.
+Therefore the size of the bitmap in bytes is 16 
+times the maximum number of
+pages.
+Like the typetable, the bitmap  keeps track of marks from the
+bottom of memory, not the bottom of data space.
+The bitmap and the typetable are static structures.
+It is their size, which is determined when the C kernel is built,
+which determines the size to which the data segment can grow.
+.sh +0 Transfer\ Tables
+.pp
+Transfer tables are used by compiled lisp code as a transfer vector to
+other functions.
+A transfer table consists of pairs of entries: symbol and pointer to
+function.
+When a compiled lisp function  calls another (non-local) function, it
+calls indirectly through an entry in the transfer table.
+Depending on the setting of certain status variables, the call may
+bring control into a function linkage routine or directly to
+the function desired (if that function is a compiled lisp or C function).
+.pp
+Transfer tables serve a number of useful purposes.
+.np
+They allow compiled code to call interpreted code
+or compiled code using the same calling sequence.
+.np
+They allow the selection of which function to call to be determined
+at runtime based on the name of the function.
+In most other languages, this selection process is done at either
+compile or load time.
+.np
+They allow the user to run in a debugging mode where all calls
+from compiled code go through the interpreter.
+Once control reaches the interpreter, it is easier to debug.
+.np
+They allow the user to run in a production mode, where all calls
+from compiled to other compiled code are done without function
+lookup overhead.
+.np
+They allow the user to switch between debugging and production modes
+at runtime with one function call.
+.pp
+Transfer tables will be described further  in the section on 
+the lisp compiler.
+.sh +0 Catch\ frames
+.pp
+Lisp has a number of forms of non-local transfers.
+Among them are 
+.i throw ,
+.i error ,
+.i return 
+and
+.i go .
+If a program is willing to accept a non-local transfer, it puts a 
+.i catch
+.i frame
+on the stack which describes which type of transfer it 
+accepts.
+The catch frame describes the current state of the registers,
+the variables np, lbot, and bnp, and also contains entries that describe
+what kinds of non-local transfers the function will accept.
+After creating a catch frame, the program  goes on to evaluate forms.
+Should the evaluation of one of those forms result in a non-local
+transfer to the catch frame that this program has set up, the 
+system will restore the namestack and bindstack to the way
+they were when the program put the catch frame on the stack (by using
+np and bnp) and will return control to the program (setting 
+the variable retval to describe why it returned).
+This is described further in the 
+section on interpreter conventions.
+.pp
+The C variable 
+.i errp
+points to the most recent catch frame, and then each catch frame points
+to the previous catch frame.
+.sh +0 oblist
+.pp
+Normally when symbols are created they are
+.i interned ,
+that is they are put in a hash table called an oblist (or obarray).
+The purpose of the oblist is to insure that if you type a
+symbol's name to the reader, you will always get  the  same symbol.
+Also it protects the symbol from garbage collection.
+The oblist is simply a hash table with buckets, with a hash link being
+part of the symbol data structure.
+Currently the hash table is not accessible to a lisp program, but with
+a little work it could be.
+.+c Interpreter
+.sh 0 _
+.nr $1 \n(ch
+.pp
+The interpreter is composed of these parts:
+.ip \fIcore:\fP
+The functions eval, apply and funcall.
+.ip \fIstack\ management:\fP
+The code to create catch frames and handle non-local transfers.
+.ip \fImemory\ management:\fP
+The code to allocate and garbage collect memory.
+.ip \fIthe\ functions:\fP
+The user callable functions that expect lisp arguments and return
+lisp values.
+.pp
+In the above list, the first three are written mainly in C (with a little
+assembler) and the last is written mainly in Lisp with a little 
+bit in C
+and a very small amount in assembler.
+.pp
+The core functions are the center of activity  in the interpreter.
+The 
+.i eval 
+function given a lisp expression  to evaluate.  
+It must determine if it is a simple expression such as a symbol or number
+whose value eval can determine immediately, or if it is an function
+calling expression.
+If the form is a function call, eval must determine if the arguments should
+be evaluated and if so eval must recursively call itself to evaluate the
+arguments and then stack them.
+If the function called is to be interpreted, eval must lambda-bind the
+formal variables of the function to the arguments just stacked.
+If the function being called is compiled, eval just calls the function and
+lets the function do the lambda binding if it wants to.
+.pp
+The 
+.i apply 
+function is passed two lisp objects: 
+a function to call (either a symbol or a functional object)
+and a list of arguments already evaluated.
+It will do lambda binding if the function to call is to
+be interpreted.
+.pp
+The 
+.i funcall
+function is passed any number of lisp objects,
+the first of which is the function to call, and the rest are the
+arguments to the function 
+which have already been evaluated and stacked.
+Funcall will  do lambda binding if the function to call is
+to be interpreted.
+.pp
+When compiled lisp code calls a function 
+which must be interpreted, it enters the interpreter through the
+funcall function.
+The interpreter may go to compiled code through either eval, apply or
+funcall, though most often it goes through eval.
+.sh 2 Conventions
+.pp
+These are the conventions used in interpreted and compiled code.
+.sh +1 C\ conventions
+.pp
+Our conventions are extensions of the C calling conventions, so we
+describe here the conventions for C.
+The VAX has 16 general
+purpose registers.
+Registers r12 through r15 are reserved
+for use by the hardware because we use the
+.i calls
+subroutine call.
+Registers r0-r5 can be used by a program without saving them first.
+The result of a function call is returned in r0.
+Registers r11-r6 are allocated  (from high to low)
+by the C compiler when a 
+.i register
+declaration is made in the C code.
+Registers r11-r6 must be
+saved upon entry and restored upon exit if they are used within a function.
+On the VAX it is very easy to preserve registers
+since it is done automatically
+by the hardware if the
+.i calls
+(or 
+.i callg ) 
+instruction is used.
+The first short word (two bytes) of a subroutine is a 
+register save mask which
+tells which registers should be saved (on the C runtime stack) upon
+entry and restored when a 
+.i ret
+instruction is done.
+.sh +0 np\ and\ lbot
+.pp
+Earlier we  mentioned that the C variables np and lbot are 
+always stored in
+registers r6 and r7.
+It is very difficult to globally assign a variable to a register
+in the C language (no external register declarations
+are permitted)
+and thus we have to filter
+the output of the  C compiler and convert all occurrences of 'np' to 'r6'
+and all occurrences of 'lbot' to 'r7'.
+This is only half the job, though.
+We also must modify  the register save masks for those routines which 
+alter the value of np or lbot to insure that those registers get 
+saved and restored upon function entry and exit.
+This is done in the C code by putting 
+.(b C
+Savestack(n)
+.)b
+at the beginning of a routine which will alter np or lbot and which
+also allocates n register variables.
+Also in that routine, before the function returns, we put 
+.(b C
+Restorestack()
+.)b
+This is not really necessary in the VAX, but it is there for other 
+machine implementations which don't have a 
+.i ret
+function which restores registers.
+The calls to Savestack are recognized by a filter which processes
+the output
+of the C compiler and fixes up 
+the save masks.
+.sh +0 Function\ calling
+.pp
+The following text describes what the conventions are for 
+calling
+compiled lisp functions, whether they were written in lisp or C.
+We look at it from the viewpoint of the called function.
+.pp
+Upon entry
+to the compiled functon,
+lbot points to the first argument and np points to the
+first free spot on the namestack.
+If np equals lbot then there are no arguments.
+Recall that np will be in r6 and lbot in r7.
+The function is free to alter registers r0 through r5 and should return
+the result in r0.
+The function may alter registers r6 through r11 as long as their 
+original values
+are restored when the function returns.
+The value of np should always 
+point to the first free entry in the namestack.
+This is all that is required.
+The extra conventions followed by 
+the lisp compiler 
+in the code it generates are described in the next chapter.
+.sh +0 Catch\ frames
+.pp
+A catch frame saves the state of execution that a program
+might  want to return to at some later time.
+A catch frame  is stored on the C runtime stack, with the most recent 
+frame pointed to by the C global variable errp.
+The information saved is 
+.ip \fIargs\fP
+- one or two optional arguments. The lisp compiler always 
+stacks two 
+arguments since it must know exactly how large the frame is.
+.ip \fIclass\fP
+- the class describes what type of frame this is (described below).
+.ip \fIretaddr\fP
+- address to return to if returning from this frame
+.ip \fIolderrp\fP
+- pointer to next older catch frame on the stack
+.ip \fIbnp\fP
+- value of bnp (bindstack pointer) when the frame was created
+.ip \fInp\fP
+- value of np when the frame was created
+.ip \fIlbot\fP
+- value of lbot when the frame was created. 
+.ip \fIsystem\ dependent\fP
+- the rest of the information stacked depends on the particular machine.
+In the case of the VAX, registers r13 through r8 are stacked. (r14 and
+r15 are the stack pointer and program counter; they are not saved since
+they can be calculated from the other information).
+.pp
+The information in a catch frame is put on the C runtime stack
+in the precise order given above, and the variable errp points not
+at the beginning of the entire frame, but to the lbot entry.
+Thus errp\ +\ 4 points to np.
+The classes of frames are described below.
+Each class is defined as a constant in the C code (h/frame.h) whose value
+is a small integer. 
+.ip F_PROG\ [1]
+- takes no arguments.  This frame marks the beginning of a piece of
+code which can accept a 
+.i return
+or a 
+.i go .
+The interpreter uses this to implement
+.i prog
+and 
+.i do
+and for its primative break loop.
+The lisp compiler does not use catch frames for progs since it only
+permits 
+.i returns
+and
+.i gos
+to occur within
+.i progs 
+or 
+.i dos
+and thus it can determine how to do the non-local transfer
+at compile time.
+.ip F_CATCH\ [2]
+- this takes one or two arguments and is used to implement both
+.i catch
+and 
+.i errset .
+In both cases 
+the first argument is the tag which is being caught,
+which in the case of an 
+.i errset
+is symbol ER%all.
+An 
+.i errset
+also
+supplies a second argument which is non nil if the error message
+should be printed.
+.ip F_RESET\ [3]
+- in the C kernel, the reset function 
+is implemented as a non local transfer to a F_RESET catchframe.
+When the lisp system is built, the reset function is redefined to
+do a 
+.i throw.
+Thus this type of catch frame is rarely used.
+.ip F_EVAL\ [4]
+- this has one argument, the form being evaluated. 
+Since stacking this
+on every eval would be expensive,
+this type of catch frame is only stacked if a \fI(*rset\ t)\fP
+has been done and if the value of the symbol 
+.i *rset
+is non nil.
+The form being evaluated is stacked  so that 
+the necessary information for the
+.i evalframe
+function is available and so that 
+.i freturn
+can return from the context of any pending evaluation.
+.ip F_FUNCALL\ [5]
+- this is much like F_EVAL, 
+except the one argument it takes is the
+name of the function to call.  
+It has the same restrictions on when it is stacked as  F_EVAL
+and for the same reasons.
+.pp
+In C, a frame is pushed on the stack with Pushframe, with a call
+of one of these forms:
+.(b
+errp = Pushframe(class);
+errp = Pushframe(class,arg1);
+errp = Pushframe(class,arg1,arg2);
+.)b
+After the call the C variable 
+.i retval
+contains the value which describes why this function returned.
+You must remember that it is possible for this one function call to
+return more than once! 
+The reasons it can return are these (from h/frame.h):
+.ip C_INITIAL\ [0]
+This is the initial call to set up the frame.
+.ip C_GO\ [1]
+This will only happen for F_PROG frames. 
+In this case, 
+the C variable lispretval contains a lisp symbol which is the tag
+to which control should be transferred.
+If the tag cannot be found in this prog or do body, the
+tag should be again thrown up the stack to a next higher prog or do.
+.ip C_RET\ [2]
+This will only happen for F_PROG frames.
+In this case, lispretval contains the value to return from this prog.
+.ip C_THROW\ [3]
+This will only happen for F_CATCH frames.
+In this case lispretval contains the value to return from this catch.
+.ip C_RESET\ [4]
+This will only happen for F_RESET frames.
+It simply means that a reset is being done.
+.ip C_FRETURN\ [5]
+This will only happen for F_EVAL and F_FRETURN frames.
+The variable lispretval contains the value to return from this
+call to 
+.i eval
+or
+.i funcall .
+.pp
+The call to Pushframe is turned into a simple subroutine call (using
+the 
+.i jsb
+instruction instead of the
+.i calls )
+by the filters which alter the code coming out of the C compiler.
+Thus it may be useful to see what stacking a catch frame really looks
+like.  
+Here is what the lisp compiler generates 
+to stack the frame for \fI(*catch\ 'tag\ x)\fP
+.(b
+movl   0(r8),r0        #move 'tag to r0
+pushl  $0              # dummy argument
+pushl  r0              # put tag argument on C runtime stack
+pushl  $2              # push F_CATCH
+jsb    _qpushframe     # call Pushframe
+movl   r0,_errp        # move result into errp
+.)b
+.pp
+Every function which does a Pushframe() must also do a Popframe()
+before it returns to its caller.
+This simply removes the frame from the stack.
+In C it is written:
+.br
+.tl ''errp = Popframe()''
+in the code generated by the lisp compiler it looks like:
+.(b
+movl   _errp,r1        # put current errp in r1
+movl   12(r1),_errp    # put previous errp in errp
+addl3  $32,r1,sp       # pop frame from stack
+.)b
+.pp
+Non-local transfers are done with the Inonlocalgo
+function. This function always takes three arguments. 
+The
+first is the return type (one of the types mentioned
+above which begin with 'C_'). It will be assigned to retval.
+The second argument is the value to be assigned to lispretval,
+except in the case of a C_THROW, where the second argument
+is the tag to throw to and the third argument is the value
+to assign to lispretval.
+This function never returns. 
+If it doesn't find a catch frame
+which matches what it is looking for, it signals an error.
+The function is called with 
+.i calls
+and the arguments are stacked on the C runtime stack, not the 
+namestack.
+.+c Liszt:\ The\ Lisp\ Compiler
+.sh 0 _
+.nr $1 \n(ch
+.pp
+The purpose of compiling a lisp function is to create a representation of 
+the function which can be evaluated in less time and perhaps take
+up less space.
+There are two approaches to lisp compilers.
+One is to convert the functions to a compact form, often called 
+.i bytecodes
+which can be rapidly interpreted.
+Each bytecode represents a primitive operation in the lisp system.
+This approach is simple to implement but there is an time penalty
+in using an  interpreter.
+The other approach is to compiled to  machine code.
+In general, this is not as portable as the bytecode  approach but the result
+generally runs faster.
+There are two ways of compiling to  machine code. 
+One is to place arguments to functions in registers. 
+If there are more arguments than registers, the arguments are put on
+a stack and a special type of call is made.
+This method is used in Maclisp.
+The other method is to assume a stack model, in which
+arguments to a function are placed on a stack.
+This is what the 
+.Fr
+compiler Liszt does.
+The stack model made it much easier to write parts of the lisp
+system in the C langauge.
+It also simplifies the garbage collector since the mark phase doesn't
+have to locate and peruse all saved registers looking for lisp data.
+.sh 2 File\ Compiler
+.pp
+When a file of lisp expressions is loaded,
+the 
+.i load 
+function repeatedly reads and evaluates the forms in the
+file.
+Some of these evaluations may result in functions being defined, 
+and others may use the functions just defined or previously defined.
+The job of the lisp compiler is to create an object file, which,
+when read in 
+by 
+.i fasl,
+acts just like it was 
+.i load ed
+in, except when a function is defined it is defined in machine
+code, not as a lisp expression.
+This is quite a bit different from what compilers do in other languages
+and it is done this way to make it easier to 
+switch between compiled and
+interpreted code.
+.sh +0 Differences\ between\ compiled\ and\ interpreted
+.pp
+There are some differences, though, between compiled and interpreted code.
+Local variables in compiled code are lexically scoped (put on the
+namestack and inaccessible to other functions) unless the variable
+has been declared 
+.i special.
+The declaration:
+.(b
+\fI(declare (special x y))\fP
+.)b
+declares both x and y to be special from this point in the file on.
+The declaration 
+.(b
+\fI(declare (specials t))\fP
+.)b
+declares all variables to be special.
+.pp
+Lisp has a very powerful macro definition system.
+The compiler will macro expand all it can, whereas the interpreter
+expands macros when necessary but never replaces a macro call with
+its expansion.
+Thus if you redefine a macro, the compiled code that uses it will
+still work the same way, but the interpreted code will use the 
+new definition. 
+Also, when compiling a file, macro definitions must be
+done before any call on the macro  is encountered during compiling.
+In the interpreter, 
+macros can be defined or redefined anytime up until
+they are used.
+Thus the interpreter is far freer about macro definitions than 
+the compiler.
+This could cause programs which work interpreted to
+fail compiled.
+.sh +0 Top\ level\ algorithm
+.pp
+The top level algorithm of the compiler is simply this: 
+.np 
+read a lisp expression from the input file
+.np 
+macro expand the top level of the form as much as possible
+.np 
+if the form is a function definition (a list beginning with
+the symbol 'def')
+then compile it.
+.np 
+if the form is not a function definition then put it on a list of 
+forms that will be evaluated when the file is 
+.i fasl ed
+in.
+.np
+if not at end of file, go to step 1.
+.pp
+In reality, step 3 is a bit more complex.
+If the definition is of a macro, then the form will be evaluated
+immediately, thus adding the macro definition to the compiler's 
+environment.
+If the lisp variable 
+.i macros 
+is t then the macro will also be compiled.
+There are also some forms like \fI(progn\ 'compile\ ...)\fP
+and \fI(eval-when\ ()\ )\fP which determine when the 
+forms get compiled and evaluated.  
+See the lisp manual for details.
+.sh +0 Expression\ Compilation
+.pp
+Just as the interpreter is centered around the 
+.i eval
+function, the lisp compiler is centered around the function
+.i d-exp
+whose job it is to compile the expression passed to it.
+The lisp compiler is one pass.
+Before a call to d-exp returns,
+all the compiled code for a form has been computed and written
+to a file.
+One reason that the lisp compiler can be a single pass compiler
+is that the assembler which reads the compiler's output is 
+a two pass assembler.
+.sh +1 global\ state\ variables
+.pp
+There are a number of variables that maintain the state of
+the compilation process.
+These variables are declared special and are thus dynamically scoped
+and visible to any function within the compiler.
+When d-exp 
+is called their meanings are:
+.ip \fIv-form\fP
+- contains the expression to be compiled.
+.ip \fIg-loc\fP
+- tells where the result of evaluating this expression should be put.
+If g-loc is nil, then the value returned is unimportant and shouldn't
+be put anywhere.
+.ip \fIg-cc\fP
+- controls conditional branches.
+If g-cc is non nil, then it is a list cell whose value has
+either a non-null car or non-null cdr but not both. 
+If the car is non-nil then
+if  the  value  of  the expression held in 
+v-form  is  non-nil, a branch should be done to the symbol
+stored  in  \fI(car\ g-cc)\fP. 
+If the cdr is non-nil  then if the value of v-form
+is  nil,  a branch should be done to the symbol stored in 
+\fI(cdr\ g-cc)\fP. Since at
+every  conditional  
+branch  control should  either jump or continue, the car and cdr will
+never both be specified.
+.ip   \fIg-ret\fP
+- is non nil if the result of evaluating v-form will be returned as the
+value of the function we are evaluating. 
+This is used to allow liszt to detect
+when tail recursion removal is possible.
+.ip  \fIg-locs\fP
+- maintains current information about the state of the stacks:
+the bindstack (for specials), the namestack (for locals) and the
+C runtime stack (for catch frames)
+The form of g-locs is a stack of tokens stored as a list.
+The meaning of each token is as follows:
+.br
+\fInil\fP - this represents an unnamed object on the namestack.  This happens
+when calling functions, when the arguments are stacked prior
+to a function call.
+.br
+\fI<symbol>\fP - the given symbol is the name of a local variable on the namestack.
+.br
+\fI(prog . <n>)\fP - prog statement which binds <n> special variables
+.br
+\fI(do . <n>)\fP - head of a do statement which binds <n> special variables
+.br
+\fI(catcherrset . 0)\fP - catch or errset frame on stack
+.br
+\fI(lambda . <n>)\fP - lambda expression which binds <n> special variables
+.ip \fIg-labs\fP
+- this is a stack of labels. 
+There is one entry in g-labs for every entry which is a list
+in g-locs.
+the forms of the entries are:
+.br
+\fInil\fP - no labels in this form
+.br
+\fI(endlab (real . gen) (real2 . gen2) ...)\fP - endlab is the label to go to
+to get out of this body.  After this label will be code
+to unbind specials and pop off locals.
+The 'real' labels are those found in the code. the gen
+labels are those generated and put in the assembler code.
+.sh +0 Function\ compilation
+.pp
+The action d-exp takes when compiling an expression depends on the
+type of expression.
+For atoms (symbols and numbers) the compilation is very simple, it
+is just a matter of putting the value where g-loc specifies and 
+then jumping if specified by g-cc.
+When the expression is a list, d-exp first macro
+expands the form and then looks at the first element of 
+the list (if the list has not macro expanded to an atom).
+If the first element is a symbol  then the list is
+is a function call.
+If the function is one of the functions which liszt knows how to open compile
+then liszt will call the particular routine designated to open
+compile this function.
+There are two classes of functions within liszt that
+do open compiling.
+The first class, the fl-expr class, are distinguished by names which
+begin with c-.
+These functions always generate
+code which returns the result in r0.
+They are not equipped to obey the contents of g-loc and g-cc.
+Thus d-exp, after calling one of these functions, must do what
+is necessary to obey g-loc and g-cc.
+The other class of open compiling functions, the fl-exprcc class (whose
+names begin with cc-),
+handle g-loc and g-cc.
+As a result these are usually much more complex and generate better code.
+.sh -1 Register\ Conventions
+.pp
+The register conventions used by liszt are an extension of those
+used by the C code.
+.ip \fIr0\fP
+- return value placed here
+.ip \fIr1,r2,r3,r4\fP
+- scratch registers.
+When long strings of 
+.i car's
+or
+.i cdr's
+are done (such as
+.i caddadr )
+these registers are used in a least-recently-used fashion to access down
+the list. 
+The compiler keeps track of the values in these registers but must
+assume that they are garbage after a function is called.
+.ip \fIr5\fP
+- fixnum accumulator.
+There a number of functions which work on fixnum's only and these
+usually put their result in r5.
+The assembler code function 
+.i qnewint
+which returns a pointer to a cell containing a fixnum value (after
+checking if it is in the fixnum table), expects its argument to be
+in r5.
+.ip \fIr6\fP
+np
+.ip \fIr7\fP
+lbot.
+When calling a function, this register is set just before the function
+call, and after the function call its value is used to reset the value
+of np in order to pop the arguments off the namestack.
+.ip \fIr8\fP
+the literal table pointer.
+The compiler generates a table of all the literal lisp data
+which the compiled code might access.
+Upon function entry a pointer to the base of this table is put in r8.
+For example, if we compile \fI(setq\ x\ 'foo)\fP then we will
+need a pointer to the lisp symbol
+.i foo
+and if the symbol
+.i x
+as been declared special we will also need a pointer to 
+.i x .
+.ip \fIr10\fP
+- upon function entry the value of lbot (r7) is put in r10.  
+This allows us to reference the arguments to our function while
+using lbot to call other function.
+.sh +0 Addresses
+There are four types of addresses used internally in Franz: 
+symbolic, intermediate addresses (iadr), extended intermediate (eiadr)
+and vax assembler format.
+.sh +1 Symbolic
+.pp
+These are the names of lisp objects, such as `a', `foo', 34,
+or 3.234234.
+A name is either a local variable, a special variable or a
+number.  A number is either a small fixnum, large fixnum,
+bignum or floating point number.
+.sh +0 Intermediate\ address\ (IADR)
+.pp
+This type of address is generated from a symbolic address by
+.i d-loc, 
+.i d-loclit 
+and the routines 
+.i d-simple
+and 
+.i d-rsimple 
+which
+call them. The forms are
+.ip \fINil\fP
+- the location or value of nil.
+.ip \fIT\fP   
+- the location or value of t.
+.ip \fIreg\fP
+- register r0, which is where the result of function calls
+are stored.
+.ip \fIstack\fP
+- the address pointed to by np with np incremented after
+the value is stored.  (i.e  (r6)+)
+.ip \fIunstack\fP
+- the address one word below np (np is decremented before
+accessing. (i.e. -(r6))
+.ip \fI<symbol>\fP
+- this is just <symbol>.  This allows strange forms to be
+represented directly.
+.ip \fI(stack\ n)\fP
+- The n'th value on the namestack for this function.
+The first value 0(r10) is (stack 1).
+.ip \fI(vstack\ n)\fP
+- The cdr of the n'th value on the namestack.  
+That is, (stack 1) is *0(r10)
+.ip \fI(bind\ n)\fP
+- The value of the n'th value in 
+the literal table.  If 
+this refers to a symbol, then this is the value
+of the symbol.
+If this refers to a list then this
+this is the cdr of the list (although this is a rare
+use). (bind 1) corresponds to *0(r8).
+.ip \fI(lbind\ n)\fP
+- The n'th value in the literal table.  If 
+this refers to 
+object foo then this is the address of foo
+in memory.
+.ip \fI(fixnum\ n)\fP
+- This is the address of small fixnum n in memory.
+A small fixnum is in the range -1024 to 1023.
+.ip \fI(immed\ n)\fP
+- The is the immediate constant n. 
+.sh +0 extended\ intermediate\ address\ (EIADR)
+.pp
+This address is generated from an IADR by e-cvt.  It 
+symbolically represents a vax address.  
+.ip \fI<atom>\fP
+- This corresponds to <atom>.
+.ip \fI(n\ <atom>)\fP
+- This corresponds to n(<atom>)
+(stack n+1) and (lbind n+1) are converted to these forms.
+.ip \fI(n\ <atom1>\ <atom2>)\fP
+- This corresponds to n(<atom1>)[<atom2>]
+There is currently no IADR which generates this.
+.ip \fI(*\ n\ <atom>)\fP
+-This corresponds to *n(<atom>)
+(vstack n+1) and (bind n+1) are converted to these forms.
+.ip \fI(+\ <atom>)\fP
+- This corresponds to (<atom>)+.
+stack is converted to this form.
+.ip \fI(-\ <atom>)\fP
+- This corresponds to -(<atom>)
+unstack is converted to this form.
+.ip \fI($\ <numb>)\fP
+- This corresponds to $<atom>
+(immed <numb>) is converted to this form.
+.ip \fI(#\ <numb>)\fP
+- This corresponds to $<loc> where <loc> equals 
+the base of the fixnums (0x1400) plus 4 * <numb>
+(fixnum <numb>) is converted to this form
+.ip \fI(*\ #\ <numb>)\fP
+- This corresponds to $<numb>.  It is generated
+by d-rsimple occasionally when you ask for the
+cdr of a number (which you do sometimes when you
+are compiling fixnum operators).
+.sh +0 Vax\ Unix\ assembler\ addresses
+.pp
+The addresses are printed from a EIADR by e-cvtas.  The conversions
+are shown in the above section.
+.sh -1 Function\ calling\ convention
+.sh +1 Function\ linkages
+.pp
+The function associated with a symbol is stored in the function 
+definition slot of the symbol.  If the function slot contains a
+list then the function is to be interpreted and its 'car' will 
+be lambda, nlambda, lexpr or macro.  If the function is compiled then 
+the function definition slot will contain a binary object. 
+A binary object 
+is composed of two independent parts: the discipline and the entry.
+The discipline is one of:
+.ip \fIlambda\fP
+- a function whose arguments should be evaluated.
+.ip \fInlambda\fP
+- a function whose arguments should not be evaluated but
+which should be passed as a list
+.ip  \fImacro\fP
+- a function which should be passed the unevaluated form
+being evaluated and whose result should be evaluated.
+.ip \fI\"subroutine\"\fP
+- a foreign function subroutine
+.ip  \fI\"integer-function\"\fP
+- a foreign function returning an integer
+.ip \fI\"real-function"\fP
+- a foreign function returning a flonum.
+.pp   
+A lexpr is not in the list as there is no difference to the caller
+between compiled lambda's and compiled lexprs.
+.sh +0 Transfer\ tables
+Most calls from compiled code to other lisp functions go through
+transfer tables.  A transfer table entry is a pair: symbol, address
+of routine. 
+When another lisp function is called it uses the 
+.i calls 
+instruction which will
+indirectly jump through the address in the transfer table.  This
+address may point to the desired function or it may point to the
+interface routine.  If a call ends up in the interface routine, 
+then that routine will trace back through the call stack and eventually
+reach the tranfer table entry that it was 'called through'. Since the
+transfer table entry contains a symbol which names the function
+to be called, the interface routine can determine which routine
+was to have been called.  If that routine is compiled then the 
+interface routine can modify the transfer table so that a call
+through that table entry will go directly to the desired function.
+If the routine to be called is interpreted, or if the user has
+requested that transfer linkages should be disabled, then the
+interface routine will go through the 'funcall' function 
+in the interpreter to continue the call.  
+.sh +0 calling\ sequence\ in\ the\ compiler:
+.pp
+When transfer tables are used
+.(b
+       \fBforeach\fP arg
+        \fBcompute\fP arg and stack result using (r6)+
+        for example: movl r0,(r6)+
+       movab -n(r6),r7    where n = 4*number of args to fcn
+       calls  $0,*trantb+m      where m is the index of the function
+                                in the transfer table.
+       movl r7,r6              restore r6
+.)b
+.pp
+The compiler supports local functions, which are function accessible
+only within one file.
+Because they are not accessible from C code, we can use a very fast
+call and return sequence when calling them.
+To call a local function
+.(b
+       \fBforeach\fP arg
+        \fBcompute\fP and stack using (r6)+
+       jsb     LOCALNAME       go directly to the function, it will 
+                               restore r6 before it returns.
+.)b
+.pp
+The beginning of each function looks as follows. 
+First for a non-lexpr function called in the standard way
+(topsim is the label jumped to when tail merging, it will be unique
+for each function; the brackets indicate the optional code which
+exists if the -p switch is given to liszt);
+.(b
+       .word   0x5c0           # save r6, r7, r8, r10
+    [   movab  mcounts,r0      # if profiling, mcounts replaced by fasl
+       jsb     mcount  ]
+       movab   linker,r8       # set up r8
+       movl    r7,r10          # set up oldlbot
+       movab   n(r10),r6       # n = 4*Number of args expected.
+topsim:
+.)b
+.pp
+Now for lexprs:
+.(b
+       .word   0x5c0
+    [   movab  mcounts,r0      # if profiling. [mcounts replaced by fasl]
+       jsb     mcount  ]
+       movab   linker,r8       # set up r8
+       subl3   $4,r7,-(sp)     # address one word below bottom of args
+topsim:
+       movl    r6,r10          # first free addr to arg base
+       subl3   r7,r6,r0        # number of args * 4 into r0
+       movab   0x1400(r0),(r6)+ # stack boxed number of args
+       movl    0(r10),-(sp)    # also store on stack so user can't clobber
+.)b
+.pp
+And finally for local functions:
+.(b
+       movl    r10,-(sp)       # save r10
+       movab   -n(r6),r10      # set up arg base using arg top
+topsim:
+.)b
+.sh -1 Assembler\ file\ format
+.pp
+The liszt compiler generates a file which is in Unix assembler format.
+The Unix assembler converts that file into an object file which fasl
+then reads.
+Although the object file generated is a standard Unix object
+file (as defined in /usr/include/a.out.h), it is not of a 
+format that  the Unix ld loader can understand.
+This is because the requirements  of a lisp object file are different
+from an object file of other languages.
+The 
+run time semantics of lisp and the fact that lisp data must be protected
+from garbage collection are the principal differences.
+The unconventional object file created by the unix assembler 
+is a result of the
+unconventional assembler input file.
+Next we will look at what must be put in the assembler file and how
+it is put there.
+.pp
+The assembler file must contain
+.ip \fIinstructions\fP
+- vax assembler code for the compiled functions. 
+If there aren't any functions compiled, this can be empty.
+.ip \fIliterals\fP
+- lisp data which is referred to by compiled code
+.ip \fItransfer\ table\fP
+- the names of the functions which correspond to the calls through
+the transfer table.
+The instructions simply say 'call indirect through the nth transfer
+table entry'.
+.ip \fIfunction\ names\fP
+- the names of the functions which are being 
+defined.
+.ip \fIload\ time\ forms\fP
+- in order to mimic the 
+.i load
+function, fasl has to be able to evaluate lisp expressions at fasl 
+time, so we must be able to store lisp expressions in the assembler
+file and indicate when they should be evaluated.
+.pp
+Based on the requirements above, the form of the assembler file
+is as described below.
+The assembler  builds two segments: text and data.
+We put all of our information in the text segment.
+The compiler places some annotation strings in the data segment
+so that the object file can be identified, however the data segment
+is ignored by fasl.
+The format is
+.ip \fIcompiled\ instructions\fP
+The instructions for each compiled  (non-local) function begins with
+.(b
+    .globl F00003
+    F00003:
+   .word 0x5c0
+.)b
+The globl declaration and the fact that the symbol name begins
+with a F will tell fasl that this is the beginning of a lisp
+function.
+The symbols beginning with F must be unique within a file
+but may be duplicated in other files.
+The lisp  name of the function will appear later.
+Next the instructions for the function are given.
+Only a small fixed set of external symbols may be referenced.
+The list is found in the code for nfasl.c and the common
+ones will be described below (soon).
+Labels should be given a name beginning with L and should
+be unique within the file.
+.ip \fItable\ sizes\fP
+somewhere in the file there should be a pair of 'set' assembler
+pseudo ops which describe the sizes of the literal table and
+transfer table.
+The form is this
+.(b
+    .set linker_size  4
+    .set trans_size 3
+.)b
+where linker_size is the number of entries in the literal table
+which will be required and trans_size is the number of entries
+in the transfer table which will be required.
+Those tables will be built by  fasl.
+.ip \fIbinder\ table\fP
+- this table describes  the order that the functions should be
+defined and forms evaluated.
+It is a table of longwords  beginning at the symbol bind_org
+and ending when a -1 entry is seen.
+The meaning of the entries will be described below.
+.ip \fIlisp\ expression\ table\fP
+- this is a table of null terminated strings beginning at the 
+symbol lit_org and ending at the symbol lit_end.
+Each string is read by the lisp read function (using the
+raw readtable).
+The first linker_size expressions are put in the literal table.
+The next trans_size expressions are the names  of the
+functions to put in the transfer table.
+The remaining expressions correspond to the entries in the
+binder table.
+The binder entries are processed, one by one. 
+Provided that the binder entry is not -1,  an expression
+is read from the lisp expression table.
+Then if the binder table entry is 0, that expression is the name of
+a lambda type lisp function.
+A binary object is created, the discipline is set to lambda and
+the function address is set to the lexically next function 
+defined in the
+file.
+If the binder entry is 1 then this is an nlambda function, and
+if the entry is 2 then this is a macro function.
+If the entry is 99 then the expression just read should be evaluated
+by the lisp function eval.
+.pp
+The lisp compiler normally puts the assembler format output file in /tmp
+and removes it when it is done. 
+The -S switch will tell liszt to write the 
+assembler file in the same place as
+the source file, and with the same root name but a .s extension.
+The -C file will tell the lisp compiler to comment the file as it
+generates it, making it easier to understand what is going on.
+Assume that the following is file foo.l:
+.(b
+(defun foo (x) (bar y) (baz k))
+(print (foo 3))
+(def test (nlambda (l) (print 'hithere) (foo 3)))
+.)b
+we now compile it with -SC
+.(b
+    .globl     F00007  #(fcn lambda foo)
+    F00007:
+    .word      0x5c0
+    movab      linker,r8
+    movl       r7,r10
+    movab      4(r10),r6
+    L00008:
+    movl       *0(r8),(r6)+    #(calling bar)
+           #(from y to stack)
+    movab      -4(r6),r7
+    calls      $0,*trantb+0
+    movl       r7,r6
+    movl       *4(r8),(r6)+    #(calling baz)
+           #(from k to stack)
+    movab      -4(r6),r7
+    calls      $0,*trantb+8
+    movl       r7,r6
+    ret
+    .globl     F00009  #(fcn nlambda test)
+    F00009:
+    .word      0x5c0
+    movab      linker,r8
+    movl       r7,r10
+    movab      4(r10),r6
+    L00010:
+    movl       8(r8),(r6)+     #(calling print)
+           #(from 'hithere to stack)
+    movab      -4(r6),r7
+    calls      $0,*trantb+16
+    movl       r7,r6
+    movl       $5132,(r6)+     #(calling foo)
+           #(from (fixnum 3) to stack)
+    movab      -4(r6),r7
+    calls      $0,*trantb+24
+    movl       r7,r6
+    ret
+    bind_org:
+    .set linker_size,  3
+    .set trans_size,   4
+    .long      0
+    .long      99
+    .long      1
+    .long -1
+    lit_org:
+    .asciz "y"
+    .asciz "k"
+    .asciz "hithere"
+    .asciz "bar"
+    .asciz "baz"
+    .asciz "print"
+    .asciz "foo"
+    .asciz "foo"
+    .asciz "(print (foo 3))"
+    .asciz "test"
+    lit_end:
+    .data  # this is just for documentation 
+    .asciz "@(#)Compiled by Lisp Compiler 7.1 on Sun Feb 21 17:51:54 1982"
+    .asciz "@(#)decl.l 1.5     2/10/82"
+    .asciz "@(#)array.l        1.1     9/25/81"
+    .asciz "@(#)datab.l        1.1     9/25/81"
+    .asciz "@(#)expr.l 1.1     9/25/81"
+    .asciz "@(#)io.l   1.1     9/25/81"
+    .asciz "@(#)funa.l 1.3     2/10/82"
+    .asciz "@(#)funb.l 1.2     2/10/82"
+    .asciz "@(#)func.l 1.2     2/10/82"
+    .asciz "@(#)tlev.l 1.4     10/24/81"
+    .asciz "@(#)fixnum.l       1.6     10/21/81"
+    .asciz "@(#)util.l 1.2     10/7/81"
+.)b
+.sh +0 functions\ callable\ from\ compiled\ lisp\ code
+.sh +0 Object\ File\ Format
+.sh +0 Fasl
+
+
+
+
+
author	Nick Cuccia <cuccia@ucbvax.Berkeley.EDU>
	Wed, 30 Apr 1986 10:11:40 +0000 (02:11 -0800)
committer	Nick Cuccia <cuccia@ucbvax.Berkeley.EDU>
	Wed, 30 Apr 1986 10:11:40 +0000 (02:11 -0800)