.\" @(#)efl 6.1 (Berkeley) 4/29/86 .\" .EH 'PS2:6-%''The Programming Language EFL' .OH 'The Programming Language EFL''PS2:6-%' .\".ds [[ \fR\z[\h'.15m'[\fP .\".ds ]] \fR\z]\h'.15m']\fP .ND "4 June 1979" .\". .TM 79-1273-6 39199 39199-11 .TL The Programming Language EFL .AU "MH 2C-570" 2059 Stuart I. Feldman .AI .MH .OK Fortran Preprocessors Ratfor .AB .PP EFL is a clean, general purpose computer language intended to encourage portable programming. It has a uniform and readable syntax and good data and control flow structuring. EFL programs can be translated into efficient Fortran code, so the EFL programmer can take advantage of the ubiquity of Fortran, the valuable libraries of software written in that language, and the portability that comes with the use of a standardized language, without suffering from Fortran's many failings as a language. It is especially useful for numeric programs. The EFL language permits the programmer to express complicated ideas in a comprehensible way, while permitting access to the power of the Fortran environment. EFL can be viewed as a descendant of B. W. Kernighan's Ratfor [1]; the name originally stood for `Extended Fortran Language'. The current version of the EFL compiler is written in portable C. .AE .CS 35 0 35 0 0 1 .SH .ds ~ \\v'.25m'\\s+2~\\s-2\\v'-.25m' .if n .ls 2 .EQ delim @@ .EN .NH 1 INTRODUCTION .NH 2 Purpose .PP EFL is a clean, general purpose computer language intended to encourage portable programming. It has a uniform and readable syntax and good data and control flow structuring. EFL programs can be translated into efficient Fortran code, so the EFL programmer can take advantage of the ubiquity of Fortran, the valuable libraries of software written in that language, and the portability that comes with the use of a standardized language, without suffering from Fortran's many failings as a language. It is especially useful for numeric programs. 
Thus, the EFL language permits the programmer to express complicated ideas in a comprehensible way, while permitting access to the power of the Fortran environment. .NH 2 History .PP EFL can be viewed as a descendant of B. W. Kernighan's Ratfor [1]; the name originally stood for `Extended Fortran Language'. A. D. Hall designed the initial version of the language and wrote a preliminary version of a compiler. I extended and modified the language and wrote a full compiler (in C) for it. The current compiler is much more than a simple preprocessor: it attempts to diagnose all syntax errors, to provide readable Fortran output, and to avoid a number of niggling restrictions. To achieve this goal, a sizable two-pass translator is needed. .NH 2 Notation .PP In examples and syntax specifications, .B boldface type is used to indicate literal words and punctuation, such as \fBwhile\fR. Words in .I italic type indicate an item in a category, such as an .I expression. A construct surrounded by double brackets represents a list of one or more of those items, separated by commas. Thus, the notation .DS C \fI\*([[ item \*(]]\fR .DE could refer to any of the following: .DS B .I item item\fB, \fIitem \fIitem\fB, \fIitem\fB, \fIitem\fR .DE .PP The reader should have a fair degree of familiarity with some procedural language. There will be occasional references to Ratfor and to Fortran which may be ignored if the reader is unfamiliar with those languages. .bp .NH 1 LEXICAL FORM .NH 2 Character Set .PP The following characters are legal in an EFL program: .KS .TS center; ll. \fIletters \fBa b c d e f g h i j k l m\fI \fBn o p q r s t u v w x y z\fI digits \fB0 1 2 3 4 5 6 7 8 9\fI white space \fIblank tab\fI quotes \fB\' "\fI sharp \fB#\fI continuation \fB\(ru\fI braces \fB{ }\fI parentheses \fB( )\fI other \fB, ; : . + \- \(** /\fI \fB= < > & \*~ | $\fI .TE .KE Letter case (upper or lower) is ignored except within strings, so `\fBa\fR' and `\fBA\fR' are treated as the same character. 
All of the examples below are printed in lower case. An exclamation mark (`\fB!\fR') may be used in place of a tilde (`\fB\*~\fR'). Square brackets (`[' and `]') may be used in place of braces (`{' and `}'). .NH 2 Lines .PP EFL is a line-oriented language. Except in special cases (discussed below), the end of a line marks the end of a token and the end of a statement. The trailing portion of a line may be used for a comment. There is a mechanism for diverting input from one source file to another, so a single line in the program may be replaced by a number of lines from the other file. Diagnostic messages are labeled with the line number of the file on which they are detected. .NH 3 White Space .PP Outside of a character string or comment, any sequence of one or more spaces or tab characters acts as a single space. Such a space terminates a token. .NH 3 Comments .PP A comment may appear at the end of any line. It is introduced by a sharp (#) character, and continues to the end of the line. (A sharp inside of a quoted string does not mark a comment.) The sharp and succeeding characters on the line are discarded. A blank line is also a comment. Comments have no effect on execution. .NH 3 Include Files .PP It is possible to insert the contents of a file at a point in the source text, by referencing it in a line like .DS C .B include joe .R .DE No statement or comment may follow an .B include on a line. In effect, the .B include line is replaced by the lines in the named file, but diagnostics refer to the line number in the included file. \fBInclude\fRs may be nested at least ten deep. .NH 3 Continuation .PP Lines may be continued explicitly by using the underscore (\fB_\fR) character. If the last character of a line (after comments and trailing white space have been stripped) is an underscore, the end of line and the initial blanks on the next line are ignored. Underscores are ignored in other contexts (except inside of quoted strings). 
Thus .DS B 1_000_000_ 000 .DE equals @10 sup 9@. .PP There are also rules for continuing lines automatically: the end of line is ignored whenever it is obvious that the statement is not complete. To be specific, a statement is continued if the last token on a line is an operator, comma, left brace, or left parenthesis. (A statement is not continued just because of unbalanced braces or parentheses.) Some compound statements are also continued automatically; these points are noted in the sections on executable statements. .NH 3 Multiple Statements on a Line .PP A semicolon terminates the current statement. Thus, it is possible to write more than one statement on a line. A line consisting only of a semicolon, or a semicolon following a semicolon, forms a null statement. .NH 2 Tokens .PP A program is made up of a sequence of tokens. Each token is a sequence of characters. A blank terminates any token other than a quoted string. End of line also terminates a token unless explicit continuation (see above) is signaled by an underscore. .NH 3 Identifiers .PP An identifier is a letter or a letter followed by letters or digits. The following is a list of the reserved words that have special meaning in EFL. They will be discussed later. .KF .TS center; lll . .B array exit precision automatic external procedure break false read call field readbin case for real character function repeat common go return complex goto select continue if short debug implicit sizeof default include static define initial struct dimension integer subroutine do internal true double lengthof until doubleprecision logical value else long while end next write equivalence option writebin .R .TE .KE The use of these words is discussed below. These words may not be used for any other purpose. .NH 3 Strings .PP A character string is a sequence of characters surrounded by quotation marks. 
If the string is bounded by single-quote marks ( \fB\'\fR ), it may contain
double quote marks ( \fB"\fR ), and vice versa.
A quoted string may not be broken across a line boundary.
.DS
.B
\'hello there\'
"ain\'t misbehavin\'"
.R
.DE
.NH 3
Integer Constants
.PP
An integer constant is a sequence of one or more digits.
.DS B
.B
0
57
123456
.R
.DE
.NH 3
Floating Point Constants
.PP
A floating point constant contains a dot and/or an exponent field.
An
.I "exponent field"
is a letter
.B d
or
.B e
followed by an optionally signed integer constant.
If @I@ and @J@ are integer constants and @E@ is an exponent field,
then a floating constant has one of the following forms:
.DS B
.I
\fB.\fPI
I\fB.\fP
I\fB.\fPJ
IE
I\fB.\fPE
\fB.\fPIE
I\fB.\fPJE
.R
.DE
.NH 3
Punctuation
.PP
Certain characters are used to group or separate objects in the language.
These are
.TS
center;
ll.
parentheses	( )
braces	{ }
comma	,
semicolon	;
colon	:
end-of-line
.TE
The end-of-line is a token (statement separator) when the line is neither
blank nor continued.
.NH 3
Operators
.PP
The EFL operators are written as sequences of one or more non-alphanumeric
characters.
.DS B
+ \- \(** / \(**\(**
< <= > >= == \*~=
&& |\|| & |
+= \-= \(**= /= \(**\(**=
&&= |\||= &= |=
\-> . $
.DE
A dot (`\fB.\fR') is an operator when it qualifies a structure element name,
but not when it acts as a decimal point in a numeric constant.
There is a special mode (see the Atavisms section) in which some of the
operators may be represented by a string consisting of a dot, an identifier,
and a dot (\fIe.g., \fB.lt.\fR ).
.NH 2
Macros
.PP
EFL has a simple macro substitution facility.
An identifier may be defined to be equal to a string of tokens;
whenever that name appears as a token in the program, the string replaces it.
A macro name is given a value in a .B define statement like .DS define count n += 1 .DE Any time the name .B count appears in the program, it is replaced by the statement .DS C .B n += 1 .R .DE A .B define statement must appear alone on a line; the form is .DS C \fBdefine \fIname \fIrest-of-line\fR .DE Trailing comments are part of the string. .NH 1 PROGRAM FORM .NH 2 Files .PP A .I file is a sequence of lines. A file is compiled as a single unit. It may contain one or more procedures. Declarations and options that appear outside of a procedure affect the succeeding procedures on that file. .NH 2 Procedures .PP Procedures are the largest grouping of statements in EFL. Each procedure has a name by which it is invoked. (The first procedure invoked during execution, known as the .I main procedure, has the null name.) Procedure calls and argument passing are discussed in Section 8. .NH 2 Blocks .PP Statements may be formed into groups inside of a procedure. To describe the scope of names, it is convenient to introduce the ideas of .I block and of .I "nesting level." The beginning of a program file is at nesting level zero. Any options, macro definitions, or variable declarations there are also at level zero. The text immediately following a .B procedure statement is at level 1. After the declarations, a left brace marks the beginning of a new block and increases the nesting level by 1; a right brace drops the level by 1. (Braces inside declarations do not mark blocks.) (See Section 7.2). An .B end statement marks the end of the procedure, level 1, and the return to level 0. A name (variable or macro) that is defined at level @k@ is defined throughout that block and in all deeper nested levels in which that name is not redefined or redeclared. Thus, a procedure might look like the following: .DS B .ta .7i 1.4i 2.1i 2.8i .B # block 0 procedure george real x x = 2 . . . if(x > 2) { # new block integer x # a different variable do x = 1,7 write(,x) . . . 
	}	# end of block
end	# end of procedure, return to block 0
.DE
.NH 2
Statements
.PP
A statement is terminated by end of line or by a semicolon.
Statements are of the following types:
.DS B
Option
Include
Define
.sp .3
Procedure
End
.sp .3
Declarative
Executable
.DE
The
.B option
statement is described in Section 10.
The
.B include,
.B define,
and
.B end
statements have been described above;
they may not be followed by another statement on a line.
Each procedure begins with a
.B procedure
statement and finishes with an
.B end
statement; these are discussed in Section 8.
Declarations describe types and values of variables and procedures.
Executable statements cause specific actions to be taken.
A block is an example of an executable statement;
it is made up of declarative and executable statements.
.NH 2
Labels
.PP
An executable statement may have a
.I label
which may be used in a branch statement.
A label is an identifier followed by a colon, as in
.DS B
.B
.ta 1i
	read(, x)
	if(x < 3) goto error
	 . . .
error:	fatal("bad input")
.R
.DE
.NH 1
DATA TYPES AND VARIABLES
.PP
EFL supports a small number of basic (scalar) types.
The programmer may define objects made up of variables of basic type;
other aggregates may then be defined in terms of previously defined aggregates.
.NH 2
Basic Types
.PP
The basic types are
.DS B
\fBlogical
\fBinteger
\fBfield(\fIm\|\fB:\fIn\|\fB)
\fBreal
\fBcomplex
\fBlong real
\fBlong complex
\fBcharacter(\fIn\|\fB)
.R
.DE
A logical quantity may take on the two values true and false.
An integer may take on any whole number value in some machine-dependent range.
A field quantity is an integer restricted to a particular closed interval
@([m:n])@.
A `real' quantity is a floating point approximation to a real or rational
number.
A long real is a more precise approximation to a rational.
(Real quantities are represented as single precision floating point numbers;
long reals are double precision floating point numbers.)
A complex quantity is an approximation to a complex number, and is represented as a pair of reals. A character quantity is a fixed-length string of @n@ characters. .NH 2 Constants .PP There is a notation for a constant of each basic type. .LP A logical may take on the two values .DS B .B true false .R .DE An integer or field constant is a fixed point constant, optionally preceded by a plus or minus sign, as in .DS B .B 17 \-94 +6 0 .R .DE A long real (`double precision') constant is a floating point constant containing an exponent field that begins with the letter .B d. A real (`single precision') constant is any other floating point constant. A real or long real constant may be preceded by a plus or minus sign. The following are valid .B real constants: .DS B .B 17.3 \-.4 7.9e\-6 @(~=~7.9 times 10 sup -6 )@ 14e9 @(~=~1.4 times 10 sup 10 )@ .R .DE The following are valid .B "long real" constants .DS B .B 7.9d\-6 @(~=~7.9 times 10 sup -6 )@ 5d3 .R .DE .LP A character constant is a quoted string. .NH 2 Variables .PP A variable is a quantity with a name and a location. At any particular time the variable may also have a value. (A variable is said to be .I undefined before it is initialized or assigned its first value, and after certain indefinite operations are performed.) Each variable has certain attributes: .NH 3 Storage Class .PP The association of a name and a location is either transitory or permanent. Transitory association is achieved when arguments are passed to procedures. Other associations are permanent (static). (A future extension of EFL may include dynamically allocated variables.) .NH 3 Scope of Names .PP The names of common areas are global, as are procedure names: these names may be used anywhere in the program. All other names are local to the block in which they are declared. .NH 3 Precision .PP Floating point variables are either of normal or .B long precision. This attribute may be stated independently of the basic type. 
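.PP
As a brief sketch using the type names listed under Basic Types,
the precision attribute is written in front of the type name.
The variable names here are hypothetical and serve only as illustration.
.DS B
.B
real x	# normal precision
long real y	# long precision
long complex z	# long precision complex
.R
.DE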
.NH 2 Arrays .PP It is possible to declare rectangular arrays (of any dimension) of values of the same type. The index set is always a cross-product of intervals of integers. The lower and upper bounds of the intervals must be constants for arrays that are local or .B common. A formal argument array may have intervals that are of length equal to one of the other formal arguments. An element of an array is denoted by the array name followed by a parenthesized comma-separated list of integer values, each of which must lie within the corresponding interval. (The intervals may include negative numbers.) Entire arrays may be passed as procedure arguments or in input/output lists, or they may be initialized; all other array references must be to individual elements. .NH 2 Structures .PP It is possible to define new types which are made up of elements of other types. The compound object is known as a .I structure; its constituents are called .I members of the structure. The structure may be given a name, which acts as a type name in the remaining statements within the scope of its declaration. The elements of a structure may be of any type (including previously defined structures), or they may be arrays of such objects. Entire structures may be passed to procedures or be used in input/output lists; individual elements of structures may be referenced. The uses of structures will be detailed below. The following structure might represent a symbol table: .DS B .B .ta .7i 1.4i 2.1i struct tableentry { character(8) name integer hashvalue integer numberofelements field(0:1) initialized, used, set field(0:10) type } .DE .NH 1 EXPRESSIONS .PP Expressions are syntactic forms that yield a value. 
An expression may have any of the following forms, recursively applied: .DS B .I primary \fB(\fI expression \fB)\fI unary-operator expression expression binary-operator expression .DE In the following table of operators, all operators on a line have equal precedence and have higher precedence than operators on later lines. The meanings of these operators are described in sections 5.3 and 5.4. .DS B .B \-> . \(**\(** \(** / \fIunary\fB + \- ++ \-\- + \- < <= > >= == \*~= & && | |\|| $ = += \-= \(**= /= \(**\(**= &= |= &&= |\||= .R .DE Examples of expressions are .DS B .B a > greater than >= @>=@ greater than or equal .TE .KE Since the complex numbers are not ordered, the only relational operators that may take complex operands are \fB==\fR and \fB\*~=\fR . The character collating sequence is not defined. .NH 3 Assignment Operators .PP All of the assignment operators are right associative. The simple form of assignment is .DS C \fIbasic-left-side \fB= \fIexpression\fR .DE A .I basic-left-side is a scalar variable name, array element, or structure member of basic type. This statement computes the expression on the right side, and stores that value (possibly after coercing the value to the type of the left side) in the location named by the left side. The value of the assignment expression is the value assigned to the left side after coercion. .PP There is also an assignment operator corresponding to each binary arithmetic and logical operator. In each case, @a ~op = ~ b@ is equivalent to @a ~=~ a ~ op~ b@. (The operator and equal sign must not be separated by blanks.) Thus, .B n+=2 adds 2 to n. The location of the left side is evaluated only once. .NH 2 Dynamic Structures .PP EFL does not have an address (pointer, reference) type. However, there is a notation for dynamic structures, .DS B \fIleftside \fB\-> \fIstructurename\fR .DE This expression is a structure with the shape implied by .I structurename but starting at the location of .I leftside. 
In effect, this overlays the structure template at the specified location.
The
.I leftside
must be a variable, array, array element, or structure member.
The type of the
.I leftside
must be one of the types in the structure declaration.
An element of such a structure is denoted in the usual way using the dot
operator.
Thus,
.DS C
.B
place(i) \-> st.elt
.R
.DE
refers to the
.B elt
member of the
.B st
structure starting at the @i sup th@ element of the array
.B place.
.NH 2
Repetition Operator
.PP
Inside of a list, an element of the form
.DS C
\fIinteger-constant-expression \fB$\fI constant-expression\fR
.DE
is equivalent to the appearance of the
.I constant-expression
a number of times equal to the value of the
.I integer-constant-expression.
Thus,
.DS C
.B
(3, 3$4, 5)
.R
.DE
is equivalent to
.DS C
.B
(3, 4, 4, 4, 5)
.R
.DE
.NH 2
Constant Expressions
.PP
If an expression is built up out of operators (other than functions) and
constants, the value of the expression is a constant, and may be used anywhere
a constant is required.
.NH 1
DECLARATIONS
.PP
Declaration statements describe the meaning, shape, and size of named objects
in the EFL language.
.NH 2
Syntax
.PP
A declaration statement is made up of attributes and variables.
Declaration statements are of two forms:
.DS B
.I
attributes variable-list
attributes { declarations }
.R
.DE
In the first case, each name in the
.I variable-list
has the specified attributes.
In the second, each name in the declarations also has the specified attributes.
A variable name may appear in more than one variable list,
so long as the attributes are not contradictory.
Each name of a nonargument variable may be accompanied by an initial value
specification.
The
.I declarations
inside the braces are one or more declaration statements.
Examples of declarations are
.DS B
.B
integer k=2
.sp .5
long real b(7,3)
.sp .5
common(cname) {
	integer i
	long real array(5,0:3) x, y
	character(7) ch
	}
.R
.DE
.ne 1i
.NH 2
Attributes
.NH 3
Basic Types
.PP
The following are basic types in declarations:
.DS
.B
logical
integer
field(@m:n@)
character(@k@)
real
complex
.R
.DE
In the above, the quantities @k@, @m@, and @n@ denote integer constant
expressions with the properties @k>0@ and @n>m@.
.NH 3
Arrays
.PP
The dimensionality may be declared by an
.B array
attribute
.EQ C
bold array( b sub 1 , ..., b sub n bold )
.EN
Each of the @b sub i@ may either be a single integer expression or a pair of
integer expressions separated by a colon.
The pair of expressions forms a lower and an upper bound;
the single expression is an upper bound with an implied lower bound of 1.
The number of dimensions is equal to @n,@ the number of bounds.
All of the integer expressions must be constants.
An exception is permitted only if all of the variables associated with an
array declarator are formal arguments of the procedure;
in this case, each bound must have the property that @upper - lower + 1@
is equal to a formal argument of the procedure.
(The compiler has limited ability to simplify expressions,
but it will recognize important cases such as
.B "(0:n\-1)" .)
The upper bound for the last dimension @(b sub n )@ may be marked by an
asterisk ( \fB\(**\fR ) if the size of the array is not known.
The following are legal @bold array@ attributes:
.DS B
.B
array(5)
array(5, 1:5, \-3:0)
array(5, \(**)
array(0:m\-1, m)
.R
.DE
.NH 3
Structures
.PP
A structure declaration is of the form
.DS B
\fBstruct \fIstructname \fB{ \fI declaration statements \fB}\fR
.DE
The
.I structname
is optional; if it is present, it acts as if it were the name of a type in
the rest of its scope.
Each name that appears inside the
.I declarations
is a
.I member
of the structure, and has a special meaning when used to qualify any variable
declared with the structure type.
A name may appear as a member of any number of structures, and may also be the name of an ordinary variable, since a structure member name is used only in contexts where the parent type is known. The following are valid structure attributes .DS B .B struct xx { integer a, b real x(5) } struct { xx z(3); character(5) y } .R .DE The last line defines a structure containing an array of three @bold xx 's@ and a character string. .NH 3 Precision .PP Variables of floating point (@bold real@ or @bold complex@) type may be declared to be @bold long@ to ensure they have higher precision than ordinary floating point variables. The default precision is \fBshort\fR. .NH 3 Common .PP Certain objects called .I common\ areas have external scope, and may be referenced by any procedure that has a declaration for the name using a .DS C \fBcommon ( \fI commonareaname \fB)\fR .DE attribute. All of the variables declared with a particular \fBcommon\fR attribute are in the same block; the order in which they are declared is significant. Declarations for the same block in differing procedures must have the variables in the same order and with the same types, precision, and shapes, though not necessarily with the same names. .NH 3 External .PP If a name is used as the procedure name in a procedure invocation, it is implicitly declared to have the .B external attribute. If a procedure name is to be passed as an argument, it is necessary to declare it in a statement of the form .DS B \fBexternal \*([[ \fIname \fB\*(]]\fR .DE If a name has the external attribute and it is a formal argument of the procedure, then it is associated with a procedure identifier passed as an actual argument at each call. If the name is not a formal argument, then that name is the actual name of a procedure, as it appears in the corresponding .B procedure statement. 
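.PP
The following sketch illustrates both cases.
The procedure names
.B apply
and
.B half
are hypothetical, and
.B half
is assumed to be a real-valued function of one real argument that is defined
elsewhere; the result type of
.B f
is assumed to follow the default typing rules.
In
.B apply,
the name
.B f
is a formal argument, so it stands for whatever procedure the caller passes.
In the main procedure,
.B half
is the actual name of a procedure.
.DS B
.B
procedure apply(f, x)	# f is a formal argument
external f
real x
x = f(x)	# invokes the procedure passed by the caller
end
.sp .3
procedure	# main program
external half	# declared because half is passed as an argument
real y
y = 3
apply(half, y)
write(, y)
end
.R
.DE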
.NH 2 Variable List .PP The elements of a variable list in a declaration consist of a name, an optional dimension specification, and an optional initial value specification. The name follows the usual rules. The dimension specification is the same form and meaning as the parenthesized list in an .B array attribute. The initial value specification is an equal sign (\fB=\fR) followed by a constant expression. If the name is an array, the right side of the equal sign may be a parenthesized list of constant expressions, or repeated elements or lists; the total number of elements in the list must not exceed the number of elements of the array, which are filled in column-major order. .NH 2 The Initial Statement .PP An initial value may also be specified for a simple variable, array, array element, or member of a structure using a statement of the form .DS B \fBinitial \*([[ \fIvar \fB= \fIval \*(]]\fR .DE The @var@ may be a variable name, array element specification, or member of structure. The right side follows the same rules as for an initial value specification in other declaration statements. .NH 1 EXECUTABLE STATEMENTS .PP Every useful EFL program contains executable statements \(em otherwise it would not do anything and would not need to be run. Statements are frequently made up of other statements. Blocks are the most obvious case, but many other forms contain statements as constituents. .PP To increase the legibility of EFL programs, some of the statement forms can be broken without an explicit continuation. A square (\fR\(sq\fP) in the syntax represents a point where the end of a line will be ignored. .NH 2 Expression Statements .NH 3 Subroutine Call .PP A procedure invocation that returns no value is known as a subroutine call. Such an invocation is a statement. Examples are .DS B .B work(in, out) run(\|) .R .DE .PP Input/output statements (see Section 7.7) resemble procedure invocations but do not yield a value. If an error occurs the program stops. 
.NH 3 Assignment Statements .PP An expression that is a simple assignment (\fB=\fR) or a compound assignment (\fB+=\fR etc.) is a statement: .DS B .B a = b a = sin(x)/6 x \(**= y .R .DE .NH 2 Blocks .PP A block is a compound statement that acts as a statement. A block begins with a left brace, optionally followed by declarations, optionally followed by executable statements, followed by a right brace. A block may be used anywhere a statement is permitted. A block is not an expression and does not have a value. An example of a block is .DS B .B { integer i # this variable is unknown outside the braces .sp .3 big = 0 do i = 1,n if(big < a(i)) big = a(i) } .R .DE .NH 2 Test Statements .PP Test statements permit execution of certain statements conditional on the truth of a predicate. .NH 3 If Statement .PP The simplest of the test statements is the .B if statement, of form .DS C \fBif ( \fIlogical-expression\fB ) \fR\(sq\fP \fIstatement\fR .DE The logical expression is evaluated; if it is true, then the .I statement is executed. .NH 3 If-Else .PP A more general statement is of the form .DS B \fBif ( \fIlogical-expression \fB) \fR\(sq\fP \fI statement-1 \fR\(sq\fP \fBelse \fR\(sq\fP \fI statement-2 \fR .DE If the expression is .B true then .I statement-1 is executed, otherwise .I statement-2 is executed. Either of the consequent statements may itself be an .B if-else so a completely nested test sequence is possible: .DS B .B if(x .gt. >= .ge. == .eq. \*~= .ne. & .and. | .or. && .andand. |\|| .oror. \*~ .not. true .true. false .false. .TE .R .DE In this mode, no structure element may be named .B lt, .B le, etc. The readable forms in the left column are always recognized. 
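.PP
As a hedged illustration, assuming the mode is switched on with the
.B dots
option described in the Compiler Options section,
the two
.B if
statements below express the same test.
The procedure and variable names are hypothetical.
.DS B
.B
option dots=yes
.sp .3
procedure check(n, ok)
integer n
logical ok
if(n .ge. 1 .and. n .le. 10)	# dotted forms
	ok = .true.
if(n >= 1 & n <= 10)	# readable forms
	ok = true
end
.R
.DE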
.NH 2 Complex Constants .PP A complex constant may be written as a parenthesized list of real quantities, such as .DS C .B (1.5, 3.0) .R .DE The preferred notation is by a type coercion, .DS C .B complex(1.5, 3.0) .R .DE .NH 2 Function Values .PP The preferred way to return a value from a function in EFL is the @bold return ( value )@ construct. However, the name of the function acts as a variable to which values may be assigned; an ordinary @bold return@ statement returns the last value assigned to that name as the function value. .NH 2 Equivalence .PP A statement of the form .EQ C bold equivalence ~ v sub 1 ,~ v sub 2 ,~ ...,~ v sub n .EN declares that each of the @v sub i@ starts at the same memory location. Each of the @v sub i@ may be a variable name, array element name, or structure member. .NH 2 Minimum and Maximum Functions .PP There are a number of non-generic functions in this category, which differ in the required types of the arguments and the type of the return value. They may also have variable numbers of arguments, but all the arguments must have the same type. .DS .TS center; ccc lll . Function Argument Type Result Type _ .B amin0 integer real amin1 real real min0 integer integer min1 real integer dmin1 long real long real amax0 integer real amax1 real real max0 integer integer max1 real integer dmax1 long real long real .R .TE .DE .NH 1 COMPILER OPTIONS .PP A number of options can be used to control the output and to tailor it for various compilers and systems. The defaults chosen are conservative, but it is sometimes necessary to change the output to match peculiarities of the target environment. .PP Options are set with statements of the form .DS C \fBoption \fI\*([[ \fIopt \fI\*(]]\fR .DE where each .I opt is of one of the forms .DS B .I optionname optionname \fB= \fIoptionvalue .R .DE The .I optionvalue is either a constant (numeric or string) or a name associated with that option. The two names .B yes and .B no apply to a number of options. 
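.PP
As a sketch of the second form, the statement below sets three options at once.
The option names are taken from the subsections that follow;
the values are chosen only for illustration.
.DS C
.B
option dots=yes, ftnin=5, ftnout=6
.R
.DE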
.NH 2
Default Options
.PP
Each option has a default setting.
It is possible to change the whole set of defaults to those appropriate for a particular environment by using the
.B system
option.
At present, the only valid values are
.B system=unix
and
.B system=gcos.
.NH 2
Input Language Options
.PP
The
.B dots
option determines whether the compiler recognizes
.B .lt.
and similar forms.
The default setting is
.B no.
.NH 2
Input/Output Error Handling
.PP
The
.B ioerror
option can be given three values:
.B none
means that none of the I/O statements may be used in expressions, since there is no way to detect errors.
The implementation of the
.B ibm
form uses ERR= and END= clauses.
The implementation of the
.B fortran77
form uses IOSTAT= clauses.
.NH 2
Continuation Conventions
.PP
By default, continued Fortran statements are indicated by a character in column 6 (Standard Fortran).
The option
.B "continue=column1"
puts an ampersand (\fB&\fR) in the first column of the continued lines instead.
.NH 2
Default Formats
.PP
If no format is specified for a datum in an iolist for a
.B read
or
.B write
statement, a default is provided.
The default formats can be changed by setting certain options:
.DS
.TS
center;
cc
ll.
Option	Type
_
\fBiformat\fR	integer
\fBrformat\fR	real
\fBdformat\fR	long real
\fBzformat\fR	complex
\fBzdformat\fR	long complex
\fBlformat\fR	logical
.TE
.DE
The associated value must be a Fortran format, such as
.DS C
.B
option rformat=f22.6
.R
.DE
.NH 2
Alignments and Sizes
.PP
In order to implement
.B character
variables, structures, and the
.B sizeof
and
.B lengthof
operators, it is necessary to know how much space various Fortran data types require, and what boundary alignment properties they demand.
The relevant options are
.DS
.B
.TS
center;
ccc
lll.
Fortran Type	Size Option	Alignment Option
_
integer	isize	ialign
real	rsize	ralign
long real	dsize	dalign
complex	zsize	zalign
logical	lsize	lalign
.R
.TE
.DE
The sizes are given in terms of an arbitrary unit; the alignment is given in the same units.
The option
.B charperint
gives the number of characters per
.B integer
variable.
.NH 2
Default Input/Output Units
.PP
The options
.B ftnin
and
.B ftnout
are the numbers of the standard input and output units.
The default values are
.B ftnin=5
and
.B ftnout=6.
.NH 2
Miscellaneous Output Control Options
.PP
Each Fortran procedure generated by the compiler will be preceded by the value of the
.B procheader
option.
.PP
No Hollerith strings will be passed as subroutine arguments if
.B hollincall=no
is specified.
.PP
The Fortran statement numbers normally start at 1 and increase by 1.
It is possible to change the increment value by using the
.B deltastno
option.
.ta .5i 1i 1.5i 2i 2.5i 3.0i
.NH 1
EXAMPLES
.PP
In order to show the flavor of programming in EFL, we present a few examples.
They are short, but show some of the convenience of the language.
.NH 2
File Copying
.PP
The following short program copies the standard input to the standard output, provided that the input is a formatted file containing lines no longer than a hundred characters.
.DS
.B
procedure	# main program
character(100) line
while( read( , line) == 0 )
	write( , line)
end
.R
.DE
Since
.B read
returns zero until the end of file (or a read error), this program keeps reading and writing until the input is exhausted.
.NH 2
Matrix Multiplication
.PP
The following procedure multiplies the @m times n@ matrix a by the @n times p@ matrix b to give the @m times p@ matrix c.
The calculation obeys the formula @c sub ij ~=~ sum a sub ik b sub kj@.
.DS
.ta .7i 1.4i 2.1i 2.8i
.B
procedure matmul(a,b,c, m,n,p)
integer i, j, k, m, n, p
long real a(m,n), b(n,p), c(m,p)
.sp .3
do i = 1,m
	do j = 1,p
		{
		c(i,j) = 0
		do k = 1,n
			c(i,j) += a(i,k) \(** b(k,j)
		}
end
.R
.DE
.NH 2
Searching a Linked List
.PP
Assume we have a list of pairs of numbers @(x,y)@.
The list is stored as a linked list sorted in ascending order of @x@ values.
The following procedure searches this list for a particular value of @x@ and returns the corresponding @y@ value.
.DS
.B
.ta .7i 1.4i 2.1i 2.8i
define LAST	0
define NOTFOUND	\-1
integer procedure val(list, first, x)
# list is an array of structures.
# Each structure contains a thread index value, an x, and a y value.
.sp .3
struct
	{
	integer nextindex
	integer x, y
	} list(\(**)
.sp .3
integer first, p, x
for(p = first , p\*~=LAST && list(p).x<=x , p = list(p).nextindex)
	if(list(p).x == x)
		return( list(p).y )
return(NOTFOUND)
end
.R
.DE
The search is a single
.B for
loop that begins with the head of the list and examines items until either the list is exhausted (p==LAST) or until it is known that the specified value is not on the list (list(p).x > x).
The two tests in the conjunction must be performed in the specified order to avoid using an invalid subscript in the
.B list(p)
reference.
Therefore, the
.B &&
operator is used.
The next element in the chain is found by the iteration statement
.B "p=list(p).nextindex".
.NH 2
Walking a Tree
.PP
As an example of a more complicated problem, let us imagine we have an expression tree stored in a common area, and that we want to print out an infix form of the tree.
Each node is either a leaf (containing a numeric value) or it is a binary operator, pointing to a left and a right descendant.
In a recursive language, such a tree walk would be implemented by the following simple pseudocode:
.DS
.I
if this node is a leaf
	print its value
otherwise
	print a left parenthesis
	print the left node
	print the operator
	print the right node
	print a right parenthesis
.R
.DE
In a nonrecursive language like EFL, it is necessary to maintain an explicit stack to keep track of the current state of the computation.
The following procedure calls a procedure
.B outch
to print a single character and a procedure
.B outval
to print a value.
.DS
.ta .7i 1.4i 2.1i 2.8i
.B
procedure walk(first)	# print out an expression tree
.sp .5
integer first	# index of root node
integer currentnode
integer stackdepth
common(nodes) struct
	{
	character(1) op
	integer leftp, rightp
	real val
	} tree(100)	# array of structures
.sp .5
struct
	{
	integer nextstate
	integer nodep
	} stackframe(100)
.sp .5
define NODE tree(currentnode)
define STACK stackframe(stackdepth)
.sp .5
# nextstate values
define DOWN 1
define LEFT 2
define RIGHT 3
.DE
.DS
.B
# initialize stack with root node
stackdepth = 1
STACK.nextstate = DOWN
STACK.nodep = first
.DE
.DS
.B
while( stackdepth > 0 )
	{
	currentnode = STACK.nodep
	select(STACK.nextstate)
		{
		case DOWN:
			if(NODE.op == " ")	# a leaf
				{
				outval( NODE.val )
				stackdepth \-= 1
				}
			else	{	# a binary operator node
				outch( "(" )
				STACK.nextstate = LEFT
				stackdepth += 1
				STACK.nextstate = DOWN
				STACK.nodep = NODE.leftp
				}
.sp .5
		case LEFT:
			outch( NODE.op )
			STACK.nextstate = RIGHT
			stackdepth += 1
			STACK.nextstate = DOWN
			STACK.nodep = NODE.rightp
.sp .5
		case RIGHT:
			outch( ")" )
			stackdepth \-= 1
		}
	}
end
.DE
.NH 1
PORTABILITY
.PP
One of the major goals of the EFL language is to make it easy to write portable programs.
The output of the EFL compiler is intended to be acceptable to any Standard Fortran compiler (unless the
.B fortran77
option is specified).
.NH 2
Primitives
.PP
Certain EFL operations cannot be implemented in portable Fortran, so a few machine-dependent procedures must be provided in each environment.
.NH 3
Character String Copying
.PP
The subroutine
.B ef1asc
is called to copy one character string to another.
If the target string is shorter than the source, the final characters are not copied.
If the target string is longer, its end is padded with blanks.
The calling sequence is
.DS B
subroutine ef1asc(a, la, b, lb)
integer a(\(**), la, b(\(**), lb
.DE
and it must copy the first
.B lb
characters from
.B b
to the first
.B la
characters of
.B a.
.NH 3
Character String Comparisons
.PP
The function
.B ef1cmc
is invoked to determine the order of two character strings.
The declaration is
.DS B
integer function ef1cmc(a, la, b, lb)
integer a(\(**), la, b(\(**), lb
.DE
The function returns a negative value if the string
.B a
of length
.B la
precedes the string
.B b
of length
.B lb.
It returns zero if the strings are equal, and a positive value otherwise.
If the strings are of differing length, the comparison is carried out as if the end of the shorter string were padded with blanks.
.NH 1
ACKNOWLEDGMENTS
.PP
A. D. Hall originated the EFL language and wrote the first compiler for it; he also gave inestimable aid when I took up the project.
B. W. Kernighan and W. S. Brown made a number of useful suggestions about the language and about this report.
N. L. Schryer has acted as willing, cheerful, and severe first user and helpful critic of each new version and facility.
J. L. Blue, L. C. Kaufman, and D. D. Warner made very useful contributions by making serious use of the compiler, and noting and tolerating its misbehaviors.
.NH 1
REFERENCE
.IP 1.
B. W. Kernighan, ``Ratfor \(em A Preprocessor for a Rational Fortran'', Bell Laboratories Computing Science Technical Report #55
.bp
.SH
APPENDIX A.
Relation Between EFL and Ratfor
.PP
There are a number of differences between Ratfor and EFL, since EFL is a defined language while Ratfor is the union of the special control structures and the language accepted by the underlying Fortran compiler.
Ratfor running over Standard Fortran is almost a subset of EFL.
Most of the features described in the Atavisms section are present to ease the conversion of Ratfor programs to EFL.
.PP
There are a few incompatibilities:
The syntax of the
.B for
statement is slightly different in the two languages: the three clauses are separated by semicolons in Ratfor, but by commas in EFL.
(The initial and iteration statements may be compound statements in EFL because of this change.)
The input/output syntax is quite different in the two languages, and there is no FORMAT statement in EFL.
There are no ASSIGN or assigned GOTO statements in EFL.
.PP
The major linguistic additions are character data, factored declaration syntax, block structure, assignment and sequential test operators, generic functions, and data structures.
EFL permits more general forms for expressions, and provides a more uniform syntax.
(One need not worry about the Fortran/Ratfor restrictions on subscript or DO expression forms, for example.)
.SH
APPENDIX B. COMPILER
.SH
B.1. Current Version
.PP
The current version of the EFL compiler is a two-pass translator written in portable C.
It implements all of the features of the language described above except for
.B "long complex"
numbers.
Versions of this compiler run under the
.SM GCOS
and
.UX
operating systems.
.SH
B.2. Diagnostics
.PP
The EFL compiler diagnoses all syntax errors.
It gives the line and file name (if known) on which the error was detected.
Warnings are given for variables that are used but not explicitly declared.
.SH
B.3. Quality of Fortran Produced
.PP
The Fortran produced by EFL is quite clean and readable.
To the extent possible, the variable names that appear in the EFL program are used in the Fortran code.
The bodies of loops and test constructs are indented.
Statement numbers are consecutive.
Few unneeded GOTO and CONTINUE statements are used.
It is considered a compiler bug if incorrect Fortran is produced (except for escaped lines).
The following is the Fortran procedure produced by the EFL compiler for the matrix multiplication example (Section 11.2):
.DS B
.B
\0\0\0\0\0\0subroutine\0matmul(a,\0b,\0c,\0m,\0n,\0p)
\0\0\0\0\0\0integer\0m,\0n,\0p
\0\0\0\0\0\0double\0precision\0a(m,\0n),\0b(n,\0p),\0c(m,\0p)
\0\0\0\0\0\0integer\0i,\0j,\0k
\0\0\0\0\0\0do\0\03\0i\0=\01,\0m
\0\0\0\0\0\0\0\0\0do\0\02\0j\0=\01,\0p
\0\0\0\0\0\0\0\0\0\0\0\0c(i,\0j)\0=\00
\0\0\0\0\0\0\0\0\0\0\0\0do\0\01\0k\0=\01,\0n
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0c(i,\0j)\0=\0c(i,\0j)+a(i,\0k)*b(k,\0j)
\0\0\01\0\0\0\0\0\0\0\0\0\0\0continue
\0\0\02\0\0\0\0\0\0\0\0continue
\0\0\03\0\0\0\0\0continue
\0\0\0\0\0\0end
.R
.DE
The following is the procedure for the tree walk (Section 11.4):
.DS B
.B
\0\0\0\0\0\0subroutine\0walk(first)
\0\0\0\0\0\0integer\0first
\0\0\0\0\0\0common\0/nodes/\0tree
\0\0\0\0\0\0integer\0tree(4,\0100)
\0\0\0\0\0\0real\0tree1(4,\0100)
\0\0\0\0\0\0integer\0staame(2,\0100),\0stapth,\0curode
\0\0\0\0\0\0integer\0const1(1)
\0\0\0\0\0\0equivalence\0(tree(1,1),\0tree1(1,1))
\0\0\0\0\0\0data\0const1(1)/4h\0\0\0\0/
c\0print\0out\0an\0expression\0tree
c\0index\0of\0root\0node
c\0array\0of\0structures
c\0\0\0nextstate\0values
c\0\0\0initialize\0stack\0with\0root\0node
\0\0\0\0\0\0stapth\0=\01
\0\0\0\0\0\0staame(1,\0stapth)\0=\01
\0\0\0\0\0\0staame(2,\0stapth)\0=\0first
\0\0\01\0\0if\0(stapth\0.le.\00)\0goto\0\09
\0\0\0\0\0\0\0\0\0curode\0=\0staame(2,\0stapth)
\0\0\0\0\0\0\0\0\0goto\0\07
\0\0\02\0\0\0\0\0\0\0\0if\0(tree(1,\0curode)\0.ne.\0const1(1))\0goto\03
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0call\0outval(tree1(4,\0curode))
c\0a\0leaf
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth-1
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0goto\0\04
\0\0\03\0\0\0\0\0\0\0\0\0\0\0call\0outch(1h()
c\0a\0binary\0operator\0node
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\02
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth+1
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\01
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0staame(2,\0stapth)\0=\0tree(2,\0curode)
\0\0\04\0\0\0\0\0\0\0\0goto\0\08
\0\0\05\0\0\0\0\0\0\0\0call\0outch(tree(1,\0curode))
\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\03
\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth+1
\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\01
\0\0\0\0\0\0\0\0\0\0\0\0staame(2,\0stapth)\0=\0tree(3,\0curode)
\0\0\0\0\0\0\0\0\0\0\0\0goto\0\08
\0\0\06\0\0\0\0\0\0\0\0call\0outch(1h))
\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth-1
\0\0\0\0\0\0\0\0\0\0\0\0goto\0\08
\0\0\07\0\0\0\0\0\0\0\0if\0(staame(1,\0stapth)\0.eq.\03)\0goto\0\06
\0\0\0\0\0\0\0\0\0\0\0\0if\0(staame(1,\0stapth)\0.eq.\02)\0goto\0\05
\0\0\0\0\0\0\0\0\0\0\0\0if\0(staame(1,\0stapth)\0.eq.\01)\0goto\0\02
\0\0\08\0\0\0\0\0continue
\0\0\0\0\0\0\0\0\0goto\0\01
\0\0\09\0\0continue
\0\0\0\0\0\0end
.R
.DE
.SH
APPENDIX C. CONSTRAINTS ON THE DESIGN OF THE EFL LANGUAGE
.PP
Although Fortran can be used to simulate any finite computation, there are realistic limits on the generality of a language that can be translated into Fortran.
The design of EFL was constrained by the implementation strategy.
Certain of the restrictions are petty (six character external names), but others are sweeping (lack of pointer variables).
The following paragraphs describe the major limitations imposed by Fortran.
.SH
C.1. External Names
.PP
External names (procedure and COMMON block names) must be no longer than six characters in Fortran.
Further, an external name is global to the entire program.
Therefore, EFL can support block structure within a procedure, but it can have only one level of external name if the EFL procedures are to be compilable separately, as are Fortran procedures.
.SH
C.2.
Procedure Interface
.PP
The Fortran standards, in effect, permit arguments to be passed between Fortran procedures either by reference or by copy-in/copy-out.
This indeterminacy of specification shows through into EFL.
A program that depends on the method of argument transmission is illegal in either language.
.PP
There are no procedure-valued variables in Fortran: a procedure name may only be passed as an argument or be invoked; it cannot be stored.
Fortran (and EFL) would be noticeably simpler if a procedure variable mechanism were available.
.SH
C.3. Pointers
.PP
The most grievous problem with Fortran is its lack of a pointer-like data type.
The implementation of the compiler would have been far easier if certain hard cases could have been handled by pointers.
Further, the language could have been simplified considerably if pointers were accessible in Fortran.
(There are several ways of simulating pointers by using subscripts, but they founder on the problems of external variables and initialization.)
.SH
C.4. Recursion
.PP
Fortran procedures are not recursive, so it was not practical to permit EFL procedures to be recursive.
(Recursive procedures with arguments can be simulated only with great pain.)
.SH
C.5. Storage Allocation
.PP
The definition of Fortran does not specify the lifetime of variables.
It would be possible but cumbersome to implement stack or heap storage disciplines by using COMMON blocks.