.ds [[ \fR\z[\h'.15m'[\fP
.ds ]] \fR\z]\h'.15m']\fP
. .TM 79-1273-6 39199 39199-11
The Programming Language EFL
EFL is a clean, general purpose computer language intended to encourage portable programming.
It has a uniform and readable syntax and good data and control flow structuring.
EFL programs can be translated into efficient Fortran code,
so the EFL programmer can take advantage of the ubiquity of Fortran,
the valuable libraries of software written in that language, and the portability
that comes with the use of a standardized language,
without suffering from Fortran's many failings as a language.
It is especially useful for numeric programs.
The EFL language permits the programmer to express
complicated ideas in a comprehensible way,
while permitting access to the power of the Fortran environment.
EFL can be viewed as a descendant of B. W. Kernighan's Ratfor [1];
the name originally stood for `Extended Fortran Language'.
The current version of the EFL compiler is written in C.
.ds ~ \\v'.25m'\\s+2~\\s-2\\v'-.25m'
A. D. Hall designed the initial version of the language and wrote a preliminary version of a compiler.
I extended and modified the language and wrote a full compiler (in C) for it.
The current compiler is much more than a simple preprocessor:
it attempts to diagnose all syntax errors, to provide readable Fortran output,
and to avoid a number of niggling restrictions. To achieve this goal, a sizable two-pass translator is needed.
In examples and syntax specifications,
boldface type is used to indicate literal words and punctuation, such as \fBwhile\fR.
Words in italic type indicate an item in a category, such as an \fIexpression\fR.
A construct surrounded by double brackets represents a list of one or more of those items, separated by commas.
Thus, the notation
\*([[ \fIitem\fR \*(]]
could refer to any of the following:
\fIitem\fR
\fIitem\fB, \fIitem\fR
\fIitem\fB, \fIitem\fB, \fIitem\fR
The reader should have a fair degree of familiarity with some procedural language.
There will be occasional references to Ratfor and to Fortran
which may be ignored if the reader is unfamiliar with those languages.
The following characters are legal in an EFL program:
\fIletters \fBa b c d e f g h i j k l m\fI
\fBn o p q r s t u v w x y z\fI
digits \fB0 1 2 3 4 5 6 7 8 9\fI
white space \fIblank tab\fI
other \fB, ; : . + \- \(** /\fI
Letter case (upper or lower) is ignored except within strings,
so `\fBa\fR' and `\fBA\fR' are treated as the same character.
All of the examples below are printed in lower case.
An exclamation mark (`\fB!\fR') may be used in place of a tilde (`\fB\*~\fR').
Square brackets (`[' and `]') may be used in place of braces (`{' and `}').
EFL is a line-oriented language.
Except in special cases (discussed below),
the end of a line marks the end of a token and the end of a statement.
The trailing portion of a line may be used for a comment.
There is a mechanism for diverting input from one source file to another,
so a single line in the program may be replaced by a number of lines from the other file.
Diagnostic messages are labeled with the line number of the file on which they are detected.
Outside of a character string or comment,
any sequence of one or more spaces or tab characters acts as a single space.
Such a space terminates a token.
A comment may appear at the end of any line.
It is introduced by a sharp (#) character,
and continues to the end of the line.
(A sharp inside of a quoted string does not mark a comment.)
The sharp and succeeding characters on the line are discarded.
A blank line is also a comment.
Comments have no effect on execution.
It is possible to insert the contents of a file at a point in the source text,
by referencing it in a line like
\fBinclude \fIfilename\fR
No statement or comment may follow an \fBinclude\fR on a line.
In effect, the \fBinclude\fR
line is replaced by the lines in the named file,
but diagnostics refer to the line number in the included file.
\fBInclude\fRs may be nested at least ten deep.
Lines may be continued explicitly by using the underscore (\fB_\fR) character.
If the last character of a line (after comments and trailing white space have been stripped) is an underscore,
the end of line and the initial blanks on the next line are ignored.
Underscores are ignored in other contexts (except inside of quoted strings).
There are also rules for continuing lines automatically:
the end of line is ignored whenever it is obvious that the statement is not complete.
To be specific, a statement is continued if the last token on a line is an operator, comma,
left brace, or left parenthesis.
(A statement is not continued just because of unbalanced braces or parentheses.)
Some compound statements are also continued automatically;
these points are noted in the sections on executable statements.
Multiple Statements on a Line
A semicolon terminates the current statement.
Thus, it is possible to write more than one statement on a line.
A line consisting only of a semicolon, or a semicolon following a semicolon, forms a null statement.
A program is made up of a sequence of tokens.
Each token is a sequence of characters.
A blank terminates any token other than a quoted string.
End of line also terminates a token unless explicit continuation (see above) is signaled by an underscore.
An identifier is a letter or a letter followed by letters or digits.
The following is a list of the reserved words that have special meaning in EFL.
They will be discussed later.
automatic external procedure
character function repeat
dimension integer subroutine
doubleprecision logical value
equivalence option writebin
These words may not be used for any other purpose.
A character string is a sequence of characters surrounded by quotation marks.
If the string is bounded by single-quote marks ( \fB\'\fR ), it may contain double
quote marks ( \fB"\fR ), and vice versa.
A quoted string may not be broken across a line boundary.
An integer constant is a sequence of one or more digits.
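A floating point constant contains a dot and/or an exponent field.
An exponent field is a letter \fBd\fR or \fBe\fR
followed by an optionally signed integer constant.
If \fII\fR and \fIJ\fR
are integer constants and \fIE\fR
is an exponent field, then a floating constant has one of the following forms:
\fB.\fII\fR
\fII\fB.\fR
\fII\fB.\fIJ\fR
\fIIE\fR
\fII\fB.\fIE\fR
\fB.\fIIE\fR
\fII\fB.\fIJE\fR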
Certain characters are used to group or separate objects in the language.
The end-of-line is a token (statement separator)
when the line is neither blank nor continued.
The EFL operators are written as sequences of one or more
non-alphanumeric characters.
A dot (`\fB.\fR') is an operator when it qualifies a structure element name,
but not when it acts as a decimal point in a numeric constant.
There is a special mode (see the Atavisms section)
in which some of the operators may be represented by a string consisting of a dot, an identifier, and a dot.
EFL has a simple macro substitution facility.
An identifier may be defined to be equal to a string of tokens;
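whenever that name appears as a token in the program,
the string replaces it.
A macro name is given a value in a \fBdefine\fR statement.
Any time the name
appears in the program, it is replaced by the string given in its definition.
A \fBdefine\fR
statement must appear alone on a line; the form is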
\fBdefine \fIname \fIrest-of-line\fR
Trailing comments are part of the string.
A file is compiled as a single unit.
It may contain one or more procedures.
Declarations and options that appear outside of a procedure
affect the succeeding procedures on that file.
Procedures are the largest grouping of statements in EFL.
Each procedure has a name by which it is invoked.
(The first procedure invoked during execution is known as the \fImain\fR procedure.)
Procedure calls and argument passing are discussed in Section 8.
Statements may be formed into groups inside of a procedure.
To describe the scope of names, it is convenient to introduce the ideas of \fIblock\fR and of \fInesting level\fR.
The beginning of a program file is at nesting level zero.
Any options, macro definitions,
or variable declarations there are also at level zero.
The text immediately following a \fBprocedure\fR statement is at level 1.
After the declarations,
a left brace marks the beginning of a new block and increases the nesting level by 1;
a right brace drops the level by 1.
(Braces inside declarations do not mark blocks.)
An \fBend\fR
statement marks the end of the procedure, level 1, and the return to level 0.
A name (variable or macro) that is defined at level \fIk\fR
is defined throughout that block and in all deeper nested levels in which that name is not
redefined or redeclared.
Thus, a procedure might look like the following:
integer x # a different variable
end # end of procedure, return to block 0
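The scope rule above (an inner block may declare a different variable with the same name, and the outer one is untouched) has a close analogue in C, the language the EFL compiler is written in. The following is a minimal sketch in C, not EFL; the function name `outer_after_inner_block` and the values are hypothetical and chosen only for illustration.

```c
/* A C analogue of EFL nesting levels: the inner block declares its own x. */
int outer_after_inner_block(void)
{
    int x = 2;          /* declared at the procedure level */
    {                   /* a left brace opens a new block, one level deeper */
        int x = 7;      /* a different variable; it hides the outer x */
        x += 1;         /* changes only the inner x */
        (void)x;        /* silence "unused value" diagnostics */
    }                   /* right brace: the inner x is no longer defined */
    return x;           /* the outer x still holds its original value */
}
```

Calling `outer_after_inner_block()` returns the value of the outer variable, which the inner declaration never touched.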
A statement is terminated by end of line or by a semicolon.
Statements are of the following types:
option, include, define, procedure, end, declarative, and executable.
The \fBoption\fR
statement is described in Section 10.
The \fBinclude\fR and \fBdefine\fR
statements have been described above;
they may not be followed by another statement on a line.
Each procedure begins with a \fBprocedure\fR
statement and finishes with an \fBend\fR
statement; these are discussed in Section 8.
Declarations describe types and values of variables and procedures.
Executable statements cause specific actions to be taken.
A block is an example of an executable statement; it is made up
of declarative and executable statements.
An executable statement may have a \fIlabel\fR
which may be used in a branch statement.
A label is an identifier followed by a colon, as in
error: fatal("bad input")
EFL supports a small number of basic (scalar) types.
The programmer may define objects made up of variables of basic type;
other aggregates may then be defined in terms of previously defined aggregates.
The basic types are
\fBlogical
integer
field(\fIm\|\fB:\fIn\|\fB)
real
complex
long real
long complex
character(\fIn\|\fB)\fR
A logical quantity may take on the two values true and false.
An integer may take on any whole number value in some machine-dependent range.
A field quantity is an integer restricted to a particular closed interval (\fIm\fR through \fIn\fR).
A `real' quantity is a floating point approximation to a real or rational number.
A long real is a more precise approximation to a rational.
(Real quantities are represented as single precision floating point numbers;
long reals are double precision floating point numbers.)
A complex quantity is an approximation to a complex number, and is represented as a pair of reals.
A character quantity is a fixed-length string of @n@ characters.
There is a notation for a constant of each basic type.
A logical may take on the two values \fBtrue\fR and \fBfalse\fR.
An integer or field constant is a fixed point constant,
optionally preceded by a plus or minus sign, as in \fB17\fR, \fB\-94\fR, \fB+6\fR, or \fB0\fR.
A long real (`double precision') constant is a floating point constant containing an exponent field that
begins with the letter \fBd\fR.
A real (`single precision') constant is any other floating point constant.
A real or long real constant may be preceded by a plus or minus sign.
7.9e\-6 @(~=~7.9 times 10 sup -6 )@
14e9 @(~=~1.4 times 10 sup 10 )@
7.9d\-6 @(~=~7.9 times 10 sup -6 )@
A character constant is a quoted string.
A variable is a quantity with a name and a location.
At any particular time the variable may also have a value.
(A variable is said to be \fIundefined\fR
before it is initialized or assigned its first value,
and after certain indefinite operations are performed.)
Each variable has certain attributes:
The association of a name and a location is either \fItransitory\fR or \fIpermanent\fR.
Transitory association is achieved when arguments are passed to procedures.
Other associations are permanent (static).
(A future extension of EFL may include dynamically allocated variables.)
The names of common areas and of procedures are global;
these names may be used anywhere in the program.
All other names are local to the block in which they are declared.
Floating point variables are either of normal or \fBlong\fR precision.
This attribute may be stated independently of the basic type.
It is possible to declare rectangular arrays (of any dimension) of values of the same type.
The index set is always a cross-product of intervals of integers.
The lower and upper bounds of the intervals must be constants for arrays that are local or \fBcommon\fR.
A formal argument array may have intervals that are of length equal to one of the other formal arguments.
An element of an array is denoted by the array name followed by a parenthesized comma-separated list of integer values,
each of which must lie within the corresponding interval.
(The intervals may include negative numbers.)
Entire arrays may be passed as procedure arguments or in input/output lists,
or they may be initialized;
all other array references must be to individual elements.
It is possible to define new types which are made up of elements of other types.
The compound object is known as a \fIstructure\fR;
its constituents are called \fImembers\fR of the structure.
The structure may be given a name,
which acts as a type name in the remaining statements within the scope of its declaration.
The elements of a structure may be of any type
(including previously defined structures),
or they may be arrays of such objects.
Entire structures may be passed to procedures or be used in input/output lists;
individual elements of structures may be referenced.
The uses of structures will be detailed below.
The following structure might represent a symbol table:
field(0:1) initialized, used, set
Expressions are syntactic forms that yield a value.
An expression may have any of the following forms, recursively applied:
\fIprimary
\fB(\fI expression \fB)\fI
unary-operator expression
expression binary-operator expression\fR
In the following table of operators,
all operators on a line have equal precedence
and have higher precedence than operators on later lines.
The meanings of these operators are described in sections 5.3 and 5.4.
\(** / \fIunary\fB + \- ++ \-\-
= += \-= \(**= /= \(**\(**= &= |= &&= |\||=
Examples of expressions are
\-(a + sin(x)) / (5+cos(x))\(**\(**2
Primaries are the basic elements of expressions, as follows:
Constants are described in Section 4.2.
Scalar variable names are primaries.
They may appear on the left or the right side of an assignment.
Unqualified names of aggregates (structures or arrays)
may only appear as procedure arguments and in input/output lists.
An element of an array is denoted by the array name followed by a parenthesized list of subscripts,
one integer value for each declared dimension.
A structure name followed by a dot followed by the name of a member of that structure constitutes a reference to
the element named.
If that element is itself a structure, the reference may be further qualified.
A procedure is invoked by an expression of one of the forms
\fIprocedurename \fB( )\fR
\fIprocedurename \fB( \fIexpression\fB )\fR
\fIprocedurename \fB( \fIexpression-1\fB, \fI...\fB, \fIexpression-n \fB)\fR
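The \fIprocedurename\fR
is either the name of a variable declared \fBexternal\fR,
the name of a
function known to the EFL compiler (see Section 8.5),
or the actual name of a procedure, as it appears in a \fBprocedure\fR statement.
If a \fIprocedurename\fR is declared \fBexternal\fR
and is an argument of the current procedure,
it is associated with the procedure name passed as actual argument;
otherwise it is the actual name of a procedure.
Each \fIexpression\fR
in the above is called an \fIactual argument\fR.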
Examples of procedure invocations are
f(x)
work()
g(x, y+3, 'xx')
When one of these procedure invocations is to be performed,
each of the actual argument expressions is first evaluated.
The types, precisions, and bounds of actual and formal arguments should agree.
If an actual argument is a variable name, array element, or structure member,
the called procedure is permitted to use the corresponding formal argument as the left side
of an assignment or in an input list;
otherwise it may only use the value.
After the formal and actual arguments are associated,
control is passed to the first executable statement of the procedure.
When a \fBreturn\fR
statement is executed in that procedure,
or when control reaches the \fBend\fR
statement of that procedure,
the function value is made available as the value of the procedure invocation.
The type of the value is determined by the attributes of the \fIprocedurename\fR
that are declared or implied in the calling procedure,
which must agree with the attributes declared for the function in its procedure.
In the special case of a generic function,
the type of the result is also affected by the type of the argument.
See Section 8 for details.
The EFL input/output syntactic forms
may be used as integer primaries that have a non-zero value
if an error occurs during the input or output.
An expression of one precision or type may be converted to another by an expression of the form
\fIattributes \fB( \fIexpression \fB)\fR
The \fIattributes\fR
permitted are precision and basic types.
Attributes are separated by white space.
An arithmetic value of one type may be coerced to any other arithmetic type;
a character expression of one length may be coerced to a character expression of another length;
logical expressions may not be coerced to a nonlogical type.
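A value of \fBcomplex\fR or \fBlong complex\fR
type may be constructed from two integer or real quantities
by passing two expressions (separated by a comma) in the coercion.
Examples and equivalent values are
integer(5.3) = 5
long real(5) = 5.0d0
complex(5,3) = 5+3i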
Most conversions are done implicitly,
since most binary operators permit operands of different arithmetic types.
Explicit coercions are of most use when it is necessary to convert the type of an actual argument
to match that of the corresponding formal parameter in a procedure call.
There is a notation which yields the amount of memory required to store a datum
or an item of specified type:
\fBsizeof ( \fIleftside\fB )
\fBsizeof ( \fIattributes\fB )
In the first case, \fIleftside\fR
can denote a variable, array, array element, or structure member.
The value of \fBsizeof\fR
is an integer, which gives the size in arbitrary units.
If the size is needed in terms of the size of some specific unit, this
can be computed by division:
sizeof(x) / sizeof(integer)
yields the size of the variable \fBx\fR in integer words.
The distance between consecutive elements of an array may not equal \fBsizeof\fR
because certain data types require final padding on some machines.
The \fBlengthof\fR
operator gives this larger value, again in arbitrary units.
\fBlengthof ( \fIleftside\fB )
\fBlengthof ( \fIattributes\fB )
An expression surrounded by parentheses is itself an expression.
A parenthesized expression must be evaluated before an expression of which it is a part is evaluated.
All of the unary operators in EFL are prefix operators.
The result of a unary operator has the same type as its operand.
Unary \fB+\fR has no effect.
Unary \fB\-\fR
yields the negative of its operand.
The prefix operator \fB++\fR adds one to its operand.
The prefix operator \fB\-\-\fR
subtracts one from its operand.
The value of either expression is the result of the addition or subtraction.
For these two operators, the operand must be a scalar,
array element, or structure member of arithmetic type.
(As a side effect, the operand value is changed.)
The only logical unary operator is complement (\fB\*~\fR).
This operator is defined by the equations
\fB\*~ true = false
\*~ false = true\fR
Most EFL operators have two operands, separated by the operator.
Because the character set must be limited,
some of the operators are denoted by strings of two or three special characters.
All binary operators except exponentiation are left associative.
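The binary arithmetic operators are
\fB+\fR addition
\fB\-\fR subtraction
\fB\(**\fR multiplication
\fB/\fR division
\fB\(**\(**\fR exponentiation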
Exponentiation is right associative:
a\(**\(**b\(**\(**c = a\(**\(**(b\(**\(**c) = @a sup {(b sup c )}@
The operations have the conventional meanings:
@8 \(**\(** 2 ~=~ 8 sup 2 ~=~ 64@.
The type of the result of a binary operation
\fIA op B\fR
is determined by the types of its operands:
.TS
box;
l | c s s s s
l | l l l l l.
Type of A	Type of B
\^	integer	real	long real	complex	long complex
_
integer	integer	real	long real	complex	long complex
real	real	real	long real	complex	long complex
long real	long real	long real	long real	long complex	long complex
complex	complex	complex	long complex	complex	long complex
long complex	long complex	long complex	long complex	long complex	long complex
.TE
If the type of an operand differs from the type of the result,
the calculation is done as if the operand were first coerced to the type of the result.
If both operands are integers, the result is of type integer, and is computed exactly.
(Quotients are truncated toward zero, so @8/3 ~=~ 2@.)
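Two of the arithmetic rules above can be checked in C, which shares them: integer quotients truncate toward zero (guaranteed since C99), and the right-associative grouping of exponentiation can be written out by hand. This is a sketch in C rather than EFL; the helper names `ipow`, `int_quotient`, `power_right`, and `power_left` are hypothetical, and `ipow` handles only small non-negative exponents without any overflow checking.

```c
/* Integer quotient; C99 and later truncate toward zero, as EFL does. */
int int_quotient(int a, int b)
{
    return a / b;
}

/* base raised to a non-negative integer power by repeated multiplication. */
long ipow(long base, unsigned exp)
{
    long result = 1;
    while (exp-- > 0)
        result *= base;
    return result;
}

/* a**b**c grouped from the right, as in EFL: a ** (b ** c). */
long power_right(long a, unsigned b, unsigned c)
{
    return ipow(a, (unsigned)ipow((long)b, c));
}

/* The other grouping, (a ** b) ** c, shown only for comparison. */
long power_left(long a, unsigned b, unsigned c)
{
    return ipow(ipow(a, b), c);
}
```

With operands 2, 3, and 2 the two groupings differ, which is why the associativity rule matters.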
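The two binary logical operations in EFL, \fBand\fR and \fBor\fR,
are defined by the truth tables:
.TS
box;
l l | l l.
A	B	A and B	A or B
_
false	false	false	false
false	true	false	true
true	false	false	true
true	true	true	true
.TE
Each of these operators comes in two forms.
In one form, the order of evaluation is specified.
The expression \fBa && b\fR
is evaluated by first evaluating \fBa\fR;
if it is false then the expression is false and \fBb\fR is not evaluated;
otherwise the expression has the value of \fBb\fR.
The expression \fBa |\|| b\fR
is evaluated by first evaluating \fBa\fR;
if it is true then the expression is true and \fBb\fR is not evaluated;
otherwise the expression has the value of \fBb\fR.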
The other forms of the operators
(\fB&\fR for \fBand\fR and \fB|\fR for \fBor\fR)
do not imply an order of evaluation.
With the latter operators,
the compiler may speed up the code by
evaluating the operands in any order.
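C's `&&` and `||` follow the same ordered-evaluation rule as the first form of the EFL operators, so the rule can be observed by counting how often the right operand is evaluated. This is a C sketch, not EFL; the names `right_operand`, `and_then_evaluations`, and `or_else_evaluations` are hypothetical. The unordered forms are not modeled here, since by definition no particular order can be relied upon.

```c
/* Counts how many times the right operand has been evaluated. */
static int calls;

/* Stands in for an arbitrary right operand with a visible side effect. */
static int right_operand(int value)
{
    calls++;
    return value;
}

/* Evaluates a && b and reports how many times b was evaluated (0 or 1). */
int and_then_evaluations(int a, int b)
{
    calls = 0;
    (void)(a && right_operand(b));   /* b is skipped when a is false */
    return calls;
}

/* Evaluates a || b and reports how many times b was evaluated (0 or 1). */
int or_else_evaluations(int a, int b)
{
    calls = 0;
    (void)(a || right_operand(b));   /* b is skipped when a is true */
    return calls;
}
```

The right operand is evaluated only when the left operand does not already decide the result.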
There are six relations between arithmetic quantities.
These operators are not associative.
< @<@ less than
<= @<=@ less than or equal to
== @=@ equal to
\*~= @!=@ not equal to
> @>@ greater than
>= @>=@ greater than or equal
Since the complex numbers are not ordered, the only relational operators that may take complex operands
are \fB==\fR and \fB\*~=\fR.
The character collating sequence is not defined.
All of the assignment operators are right associative.
The simple form of assignment is
\fIbasic-left-side \fB= \fIexpression\fR
The \fIbasic-left-side\fR
is a scalar variable name, array element, or structure member of basic type.
This statement computes the expression on the right side, and stores that value
(possibly after coercing the value to the type of the left side)
in the location named by the left side.
The value of the assignment expression is the value assigned to the left side after coercion.
There is also an assignment operator corresponding to each binary arithmetic and logical operator.
In each case, \fIa op\fB=\fI b\fR is equivalent to \fIa \fB=\fI a op b\fR.
(The operator and equal sign must not be separated by blanks.)
The location of the left side is evaluated only once.
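The remark that the location of the left side is evaluated only once matters when computing that location has a side effect. C's compound assignment has the same property, so it can serve as an illustration. This is a C sketch, not EFL; the function name `bump_once` is hypothetical, and the postfix `++` used for the side effect is a C operator (the EFL `++` described in this manual is a prefix operator).

```c
/* Adds 2 to a[*i] and advances *i as a side effect of the subscript.
   Because the left side of += is evaluated once, *i advances by exactly 1. */
int bump_once(int a[], int *i)
{
    a[(*i)++] += 2;     /* subscript expression evaluated a single time */
    return *i;          /* index after the assignment */
}
```

If the left side were evaluated twice, as in a naive expansion to `a[(*i)++] = a[(*i)++] + 2`, the index would advance twice and the wrong element would be read or written.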
EFL does not have an address (pointer, reference) type.
However, there is a notation for dynamic structures,
\fIleftside \fB\-> \fIstructurename\fR
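This expression is a structure with the shape implied by \fIstructurename\fR
but starting at the location of \fIleftside\fR.
In effect, this overlays the structure template at the specified location.
The \fIleftside\fR
must be a variable, array, array element, or structure member.
The type of the \fIleftside\fR
must be one of the types in the structure declaration.
An element of such a structure is denoted in the usual way using the dot operator.
Thus, \fIleftside \fB\-> \fIstructurename\fB.\fImember\fR refers to that member of the
structure starting at the
location of \fIleftside\fR.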
Inside of a list, an element of the form
\fIinteger-constant-expression \fB$\fI constant-expression\fR
is equivalent to the appearance of the \fIconstant-expression\fR
a number of times equal to the first expression.
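The repetition rule can be read as a simple expansion: each list element carries a count and a value, and the value is written out that many times. The following C sketch models that expansion under stated assumptions; it is not part of EFL or its compiler. The type `struct repeated` and the function `expand` are hypothetical names, a plain list element is modeled as a count of 1, and an output buffer that is too small is reported by returning 0.

```c
#include <stddef.h>

/* Models one list element of the form  count $ value. */
struct repeated {
    int count;
    int value;
};

/* Expands n items into out[]; returns the number of values written,
   or 0 if the expansion would not fit in cap elements. */
size_t expand(const struct repeated *items, size_t n, int *out, size_t cap)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        for (int k = 0; k < items[i].count; k++) {
            if (used == cap)
                return 0;               /* buffer too small */
            out[used++] = items[i].value;
        }
    }
    return used;
}
```

For instance, a list holding 3, then 4 repeated three times, then 5 expands to the five values 3, 4, 4, 4, 5.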
If an expression is built up out of operators (other than functions) and constants,
the value of the expression is a constant, and may be used anywhere a constant is required.
Declaration statements describe the meaning, shape, and size of named
objects in the EFL language.
A declaration statement is made up of attributes and variables.
Declaration statements are of two forms:
\fIattributes variable-list\fR
\fIattributes \fB{ \fIdeclarations \fB}\fR
In the first case, each name in the \fIvariable-list\fR
has the specified attributes.
In the second, each name in the declarations also has the specified attributes.
A variable name may appear in more than one variable list,
so long as the attributes are not contradictory.
Each name of a nonargument variable may be accompanied by an initial value specification.
The \fIdeclarations\fR
inside the braces are one or more declaration statements.
Examples of declarations are
long real array(5,0:3) x, y
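The following are basic types in declarations:
\fBlogical
integer
field(\fIm\|\fB:\fIn\|\fB)
character(\fIk\|\fB)
real
complex\fR
In the above, the quantities @k@, @m@, and @n@ denote integer constant expressions with the properties
@k > 0@ and @n > m@.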
The dimensionality may be declared by an \fBarray\fR attribute
@bold array( b sub 1 , ..., b sub n bold )@
Each of the @b sub i@
may either be a single integer expression or a pair of integer expressions separated by a colon.
The pair of expressions forms a lower and an upper bound; the single expression is an upper bound with
an implied lower bound of 1.
The number of dimensions is equal to @n@, the number of bounds.
All of the integer expressions must be constants.
An exception is permitted only if all of the variables associated with an
array declarator are formal arguments of the procedure; in this case, each bound
must have the property that @upper ~-~ lower ~+~ 1@
is equal to a formal argument of the procedure.
(The compiler has limited ability to simplify expressions, but it will recognize
important cases such as \fB(0:n\-1)\fR.)
The upper bound for the last dimension
may be marked by an asterisk
if the size of the array is not known.
The following are legal @bold array@ attributes:
A structure declaration is of the form
\fBstruct \fIstructname \fB{ \fI declaration statements \fB}\fR
The \fIstructname\fR
is optional; if it is present, it acts as if it were the name of a type in the rest of its scope.
Each name that appears inside the \fIdeclarations\fR is a \fImember\fR
of the structure, and has a special meaning when used to qualify any variable declared with the structure type.
A name may appear as a member of any number of structures,
and may also be the name of an ordinary variable,
since a structure member name is used only in contexts where the parent type is known.
The following are valid structure attributes
struct { xx z(3); character(5) y }
The last line defines a structure containing an array of three @bold xx 's@
and a character string.
Variables of floating point
(@bold real@ or @bold complex@) type may be declared to be \fBlong\fR
to ensure they have higher precision than ordinary floating point variables.
Certain objects called \fIcommon areas\fR have external scope,
and may be referenced by any procedure that has a declaration for the name using a
\fBcommon ( \fI commonareaname \fB)\fR
attribute.
All of the variables declared with a particular \fBcommon\fR attribute are in the same
block; the order in which they are declared is significant.
Declarations for the same block in differing procedures must have the variables in the same order and with the
same types, precision, and shapes, though not necessarily with the same names.
If a name is used as the procedure name in a procedure invocation,
it is implicitly declared to have the \fBexternal\fR attribute.
If a procedure name is to be passed as an argument, it is necessary to declare
it in a statement of the form
\fBexternal \*([[ \fIname \fB\*(]]\fR
If a name has the external attribute and it is a formal argument of the procedure,
then it is associated with a procedure identifier passed as an actual argument at each call.
If the name is not a formal argument, then that name is the actual name
of a procedure, as it appears in the corresponding \fBprocedure\fR statement.
The elements of a variable list in a declaration consist of a name,
an optional dimension specification,
and an optional initial value specification.
The name follows the usual rules.
The dimension specification is the same form and meaning as the parenthesized list in an \fBarray\fR attribute.
The initial value specification is an equal sign (\fB=\fR) followed by a constant expression.
If the name is an array, the right side of the equal sign may be a parenthesized list of constant expressions,
or repeated elements or lists; the total number of elements in the list must not exceed the number of elements of the
array, which are filled in column-major order.
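Column-major order means the first subscript varies fastest as the initial value list is consumed. The following C sketch makes the index arithmetic explicit for a 2 by 3 array; it is an illustration in C, not EFL, and it uses zero-based subscripts where an EFL array would default to a lower bound of 1. The names `ROWS`, `COLS`, and `fill_column_major` are hypothetical.

```c
enum { ROWS = 2, COLS = 3 };

/* Copies up to ROWS*COLS values from list into a so that list item k
   lands at a[k % ROWS][k / ROWS]: the first subscript varies fastest.
   A shorter list leaves the remaining elements unchanged. */
void fill_column_major(int a[ROWS][COLS], const int *list, int nlist)
{
    for (int k = 0; k < nlist && k < ROWS * COLS; k++)
        a[k % ROWS][k / ROWS] = list[k];
}
```

With the list 1 through 6, the first column receives 1 and 2, the second 3 and 4, and the third 5 and 6.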
An initial value may also be specified for a simple variable,
array, array element, or member of a structure
using a statement of the form
\fBinitial \*([[ \fIvar \fB= \fIval \*(]]\fR
The @var@ may be a variable name, array element specification, or member of structure.
The right side follows the same rules as for an initial value specification
in other declaration statements.
Every useful EFL program contains executable statements \(em
otherwise it would not do anything and would not need to be run.
Statements are frequently made up of other statements.
Blocks are the most obvious case,
but many other forms contain statements as constituents.
To increase the legibility of EFL programs,
some of the statement forms can be broken without an explicit continuation.
A square (\fR\(sq\fP) in the syntax represents a point where the end of a line will be ignored.
A procedure invocation that returns no value is known as a subroutine call.
Such an invocation is a statement.
Input/output statements (see Section 7.7)
resemble procedure invocations
but do not yield a value.
An expression that is a simple assignment (\fB=\fR) or
a compound assignment (\fB+=\fR etc.) is a statement:
A block is a compound statement that acts as a statement.
A block begins with a left brace,
optionally followed by declarations,
optionally followed by executable statements,
followed by a right brace.
A block may be used anywhere a statement is permitted.
A block is not an expression and does not have a value.
integer i # this variable is unknown outside the braces
Test statements permit execution of certain statements conditional on the truth of a predicate.
The simplest of the test statements is the \fBif\fR statement, of the form
\fBif ( \fIlogical-expression\fB ) \fR\(sq\fP \fIstatement\fR
The logical expression is evaluated;
if it is true, the \fIstatement\fR is executed.
A more general statement is of the form
\fBif ( \fIlogical-expression \fB) \fR\(sq\fP \fI statement-1 \fR\(sq\fP \fBelse \fR\(sq\fP \fI statement-2 \fR
If the expression is true then \fIstatement-1\fR is executed, otherwise \fIstatement-2\fR is executed.
Either of the consequent statements may itself be an \fBif\fR-\fBelse\fR,
so a completely nested test sequence is possible.
An \fBelse\fR
applies to the nearest preceding un-\fBelse\fRd \fBif\fR.
A more common use is as a sequential test.
A multiway test on the value of a quantity is succinctly stated as a \fBselect\fR
statement, which has the general form
\fBselect( \fIexpression\fB ) \fR\(sq\fP \fIblock\fR
Inside the block
two special types of labels are recognized.
A prefix of the form
\fBcase \fI\*([[ constant \*(]] \fB:\fR
marks the statement to which control is passed if the \fIexpression\fR
in the select has a value equal to one of the case constants.
If the expression equals none of these constants, but there is
a label \fBdefault\fR inside the select,
a branch is taken to that point;
otherwise the statement following the right brace is executed.
Once execution begins at a \fBcase\fR or \fBdefault\fR
label, it continues until the next \fBcase\fR or \fBdefault\fR label is encountered.
The \fBelse\fR-\fBif\fR
example above is better written as a \fBselect\fR.
Note that control does not `fall through' to the next case.
The loop forms provide the best way of repeating a statement
or sequence of operations.
The simplest (\fBwhile\fR) form is theoretically sufficient, but it is very convenient to have
the more general loops available, since each expresses a mode of control
that arises frequently in practice.
The \fBwhile\fR construct has the form
\fBwhile ( \fIlogical-expression\fB ) \fR\(sq\fP \fIstatement\fR
The expression is evaluated; if it is true, the statement is executed, and then the test is performed again.
If the expression is false, execution proceeds to the next statement.
The \fBfor\fR
statement is a more elaborate looping construct.
It has the form
\fBfor ( \fIinitial-statement \fB, \fR\(sq\fP \fIlogical-expression \fB, \fR\(sq\fP \fI iteration-statement \fB) \fR\(sq\fP \fIbody-statement\fR
Except for the behavior of the \fBnext\fR
statement (see Section 7.6.3), this construct is equivalent to
\fIinitial-statement
\fBwhile ( \fIlogical-expression\fB )
	{
	\fIbody-statement
	iteration-statement
	\fB}\fR
This form is useful for general arithmetic iterations, and for various pointer-type operations.
The sum of the integers from 1 to 100 can be computed by the fragment
n = 0
for(i = 1, i <= 100, i += 1)
	n += i
Alternatively, the computation could be done by the single statement
for( { n = 0 ; i = 1 } , i<=100 , { n += i ; ++i } )
Note that the body of the \fBfor\fR
loop is a null statement in this case.
An example of following a linked list will be given later.
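The stated equivalence between the for form and an initial statement followed by a while loop can be checked with the sum of 1 to 100 used above. This is a C sketch of the same two shapes, not EFL; note that C separates the three parts of `for` with semicolons where EFL uses commas. The function names `sum_with_for` and `sum_with_while` are hypothetical.

```c
/* Sum of 1..limit written as a for loop. */
int sum_with_for(int limit)
{
    int n = 0;
    for (int i = 1; i <= limit; i += 1)
        n += i;
    return n;
}

/* The same computation as: initial statement, then while with the
   body statement followed by the iteration statement. */
int sum_with_while(int limit)
{
    int n = 0;
    int i = 1;              /* initial-statement */
    while (i <= limit) {    /* logical-expression */
        n += i;             /* body-statement */
        i += 1;             /* iteration-statement */
    }
    return n;
}
```

Both functions return the same value for any limit, since the while version is just the for version with its three parts moved into place.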
The statement
\fBrepeat \fR\(sq\fP \fIstatement\fR
executes the statement, then does it again, without any termination test.
Obviously, a test inside the \fIstatement\fR
is needed to stop the loop.
The \fBwhile\fR
loop performs a test before each iteration.
The statement
\fBrepeat \fR\(sq \fIstatement \fR\(sq \fBuntil ( \fIlogical-expression \fB)\fR
executes the \fIstatement\fR,
then evaluates the logical;
if the logical is
true the loop is complete;
otherwise control returns to the \fIstatement\fR.
Thus, the body is always executed at least once.
The \fBuntil\fR
refers to the nearest preceding \fBrepeat\fR
that has not been paired with an \fBuntil\fR.
In practice, this appears to be the least frequently used looping construct.
The simple arithmetic progression is a very common one in numerical applications.
EFL has a special loop form for ranging over an ascending arithmetic sequence:
\fBdo \fIvariable \fB= \fIexpression-1, expression-2, expression-3\fR
	\fIstatement\fR
The variable is first given the value \fIexpression-1\fR.
The statement is executed, then \fIexpression-3\fR
is added to the variable.
The loop is repeated until the variable exceeds \fIexpression-2\fR.
If \fIexpression-3\fR
and the preceding comma are omitted, the increment is taken to be 1.
The loop above is equivalent to
t2 = expression-2
t3 = expression-3
for(variable = expression-1 , variable <= t2 , variable += t3)
	statement
(The compiler translates EFL \fBdo\fR statements into Fortran \fBDO\fR statements,
which are in turn usually compiled into excellent code.)
The \fIdo variable\fR
may not be changed inside of the loop,
and \fIexpression-1\fR must not exceed \fIexpression-2\fR.
The sum of the first hundred positive integers could be computed by
n = 0
do i = 1, 100
	n += i
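The equivalent for form captures the limit and the increment in temporaries before the loop starts, so changing the original limit inside the body does not alter the number of trips. The C sketch below illustrates that point under stated assumptions: it is C rather than EFL, the step is assumed to be positive, and the function name `trips_with_changing_limit` is hypothetical.

```c
/* Counts the iterations of a do-style loop whose limit and step are
   captured once in t2 and t3, even though the body overwrites limit. */
int trips_with_changing_limit(int first, int limit, int step)
{
    int t2 = limit;     /* expression-2, evaluated once */
    int t3 = step;      /* expression-3, evaluated once */
    int trips = 0;

    for (int i = first; i <= t2; i += t3) {
        limit = 0;      /* has no effect on the loop: t2 holds the bound */
        trips++;
    }
    (void)limit;        /* silence "set but not used" diagnostics */
    return trips;
}
```

Starting at 1 with a limit of 10 and a step of 3, the variable takes the values 1, 4, 7, and 10, giving four trips.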
Most of the need for branch statements in programs can be
averted by using the loop and test constructs,
but there are programs where they are very useful.
The most general, and most dangerous, branching statement is the simple unconditional
\fBgoto \fIlabel\fR
After executing this statement, the next statement performed is the one following the given label.
Inside of a
\fBselect\fR,
the case labels of that block may be used as labels.
(If two
\fBselect\fR
statements are nested,
the case labels of the outer
\fBselect\fR
are not accessible from the inner one.)
A safer statement is
\fBbreak\fR,
which transfers control to the statement following the current
\fBselect\fR
or loop form.
A statement of this sort is almost always needed in a
\fBrepeat\fR
loop.
More general forms permit controlling a branch out of more than one construct.
The statement
\fBbreak 3\fR
transfers control to the statement following the third loop and/or
\fBselect\fR
surrounding the statement.
It is possible to specify which type of construct
(\fBfor\fR, \fBwhile\fR, \fBrepeat\fR, \fBdo\fR, or \fBselect\fR)
is to be counted.
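The statement
\fBbreak while\fR
breaks out of the first surrounding
\fBwhile\fR
statement.
Either of the statements
\fBbreak 3 for\fR
\fBbreak for 3\fR
will transfer to the statement after the third enclosing
\fBfor\fR
loop.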
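The
\fBnext\fR
statement causes the first surrounding loop statement to go on to the next iteration:
the next operation performed is the
test of a
\fBwhile\fR,
the iteration statement of a
\fBfor\fR,
the body of a
\fBrepeat\fR,
the test of a
\fBrepeat...until\fR,
or the increment of a
\fBdo\fR.
Elaborations similar to those for
\fBbreak\fR
are available.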
The last statement of a procedure is followed by a return of control to the caller.
If it is desired to effect such a return from any other point in the procedure, a
\fBreturn\fR
statement may be executed.
Inside a function procedure, the function value is specified as an argument of the statement:
\fBreturn ( \fIexpression \fB)
EFL has two input statements (\fBread\fR and \fBreadbin\fR),
two output statements (\fBwrite\fR and \fBwritebin\fR),
and three control statements (\fBendfile\fR, \fBrewind\fR, and \fBbackspace\fR).
These forms may be used either as a primary with an
integer value or as a statement.
If an exception occurs when one of these forms is used as a statement,
the result is undefined but will probably be treated as a fatal error.
If they are used in a context where they return a value,
they return
zero if no exception occurs.
For the input forms, a negative value indicates end-of-file and
a positive value an error.
The input/output part of EFL very strongly reflects the facilities of Fortran.
Each I/O statement refers to a `unit',
identified by a small positive integer.
Two special units are defined by EFL, the
.I "standard input unit"
and the
.I "standard output unit."
These particular units are assumed if no unit is specified in an I/O transmission statement.
The data on the unit are organized into
.I records.
These records may be read or written in a fixed sequence,
and each transmission moves an integral number of records.
Transmission proceeds from the first record until the
.I "end of file."
The
\fBreadbin\fR and \fBwritebin\fR
statements transmit data in a machine-dependent but swift manner.
The statements are of the form
\fBwritebin( \fIunit \fB, \fIbinary-output-list \fB)\fR
\fBreadbin( \fIunit \fB, \fIbinary-input-list \fB)\fR
Each statement moves one unformatted record between storage and the device.
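\fIUnit\fR
is an integer expression.
A
\fIbinary-output-list\fR
is an
\fIiolist\fR
(see below) without any format specifiers.
A
\fIbinary-input-list\fR
is an
\fIiolist\fR
without format specifiers in which each of the expressions
is a variable name, array element, or structure member.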
The
\fBread\fR and \fBwrite\fR
statements
transmit data in the form of lines of characters.
Each statement moves one or more records (lines).
Numbers are translated into decimal notation.
The exact form of the lines is determined by format specifications,
whether provided explicitly in the statement
or implicitly.
The syntax of the statements is
\fBwrite( \fIunit \fB,\fI formatted-output-list \fB)\fR
\fBread( \fIunit \fB,\fI formatted-input-list \fB)\fR
The lists are of the same form as for binary I/O,
except that the lists may include format specifications.
If the
\fIunit\fR
is omitted, the standard input or output unit is used.
An
\fIiolist\fR
specifies a set of values to be written or a set of variables into which
values are to be read.
\fIdo-specification \fB{ \fIiolist \fB}\fR
\fIioexpression \fB:\fI format-specifier
\fB:\fI format-specifier\fR
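The
\fIdo-specification\fR
looks just like a
\fBdo\fR
statement, and has a similar effect:
the values in the braces are transmitted repeatedly until the
\fBdo\fR
execution is complete.
The following are permissible
\fIformat-specifier\fRs.
The quantities
@w@, @d@, and @k@ must be integer constant expressions.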
\fBi(\fIw\fB)\fR integer with \fIw\fR digits
\fBf(\fIw\fB,\fId\fB)\fR floating point number of \fIw\fR characters,
\fId\fR of them to the right of the decimal point.
\fBe(\fIw\fB,\fId\fB)\fR floating point number of \fIw\fR characters,
\fId\fR of them to the right of the decimal point,
with the exponent field marked with the letter \fBe\fR
\fBl(\fIw\fB)\fR logical field of width \fIw\fR characters,
the first of which is \fBt\fR or \fBf\fR
(the rest are blank on output, ignored on input)
standing for \fBtrue\fR and \fBfalse\fR respectively
\fBc\fR character string of width equal to the length of the datum
\fBc(\fIw\fB)\fR character string of width \fIw\fR
\fBs(\fIk\fB)\fR skip \fIk\fR lines
\fBx(\fIk\fB)\fR skip \fIk\fR spaces
" ... " use the characters inside the string as a Fortran format
If no format is specified for an item in a formatted input/output statement,
a default form is chosen.
If an item in a list is an array name,
then the entire array is transmitted as a sequence of elements,
each with its own format.
The elements are transmitted in column-major order,
the same order used for array initializations.
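Column-major order means the first subscript varies fastest. A small Python sketch of the resulting element sequence for a two-dimensional array is given below; the nested-list representation and the helper name column_major are assumptions made for illustration.

```python
def column_major(array):
    """Return the elements of a rectangular 2-D list in column-major order."""
    rows = len(array)
    cols = len(array[0]) if rows else 0
    # Outer loop over columns, inner loop over rows: the row index varies fastest.
    return [array[i][j] for j in range(cols) for i in range(rows)]


a = [[11, 12, 13],
     [21, 22, 23]]
order = column_major(a)
```

For the 2 by 3 array above, the whole first column is sent before any element of the second column.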
The three input/output control statements
\fBbackspace\fR, \fBrewind\fR, and \fBendfile\fR
look like ordinary procedure calls,
but may be used either as statements or as integer expressions.
The
\fBbackspace\fR
statement
causes the specified unit to back up,
so that the next
read will re-read the previous record,
and the next write will over-write it.
The
\fBrewind\fR
statement
moves the device to its beginning,
so that the next input statement will read the first record.
The
\fBendfile\fR
statement
causes the file to be marked so that the record most recently written will be the last record on the file,
and any attempt to read past is an error.
Procedures are the basic unit of an EFL program,
and provide the means of segmenting a program into separately compilable
Each procedure begins with a statement of one of the forms
\fIattributes \fBprocedure \fIprocedurename
\fIattributes \fBprocedure \fIprocedurename \fB( )\fR
\fIattributes \fBprocedure \fIprocedurename \fB( \fI\*([[ name \*(]] \fB) \fR
The first case specifies the main procedure, where execution begins.
In the two other cases, the
\fIattributes\fR
may specify precision and type,
or they may be omitted entirely.
The precision and type of the procedure may be declared in an ordinary declaration statement.
If no type is declared, then the procedure is called a
\fIsubroutine\fR
and no value may be returned for it.
Otherwise, the procedure is a function and a value of the declared type is returned for each call.
Each
\fIname\fR
inside the parentheses in the last form above is called a
\fIformal argument\fR
of the procedure.
Each procedure terminates with a statement
\fBend\fR
When a procedure is invoked,
the actual arguments are evaluated.
If an actual argument is the name of a variable, an array element,
or a structure member,
that entity becomes associated with the formal argument,
and the procedure may reference the values in the object,
and assign to it.
Otherwise,
the value of the actual is associated with the formal argument,
but the procedure may not attempt to change the value of that formal argument.
If the value of one of the arguments is changed in the procedure,
it is not permitted that the corresponding actual argument be associated
with another formal argument or with a
\fBcommon\fR
element that is referenced in the procedure.
Execution and Return Values
After actual and formal arguments have been associated,
control passes to the first executable statement of the procedure.
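Control returns to the invoker
either when the
\fBend\fR
statement of the procedure is reached or when a
\fBreturn\fR
statement is executed.
If the procedure is a function
(has a declared type), and a
\fBreturn(\fIvalue\fB)\fR
is executed, the value
is coerced to the correct type and precision and returned.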
A number of functions are known to EFL, and need not be declared.
The compiler knows the types of these functions.
Some of them are
\fIgeneric\fR;
i.e., they name a family of functions that differ in the types of their arguments and return values.
The compiler chooses which element of the set to invoke based upon the attributes of the actual arguments.
Minimum and Maximum Functions
The generic functions are
\fBmin\fR and \fBmax\fR.
The
\fBmin\fR
calls return the value of their smallest argument;
the
\fBmax\fR
calls return the value of their largest argument.
These are the only functions that may take different numbers of arguments in different calls.
If any of the arguments are
\fBlong real\fR
then the result is
\fBlong real\fR.
Otherwise, if any of the arguments are
\fBreal\fR
then the result is
\fBreal\fR;
otherwise all the arguments and the result must be
\fBinteger\fR.
The
\fBabs\fR
function is a generic function that returns the magnitude of its argument.
For
integer and real arguments the type of the result is identical to the type of the argument;
for complex arguments the type of the result is the real of the same precision.
The following generic functions take arguments of
\fBreal\fR, \fBlong real\fR, or \fBcomplex\fR
type and return a result of the same type:
exp exponential function (@e sup x@).
log natural (base \fIe\fP) logarithm
log10 common (base 10) logarithm
sqrt square root function (@sqrt x@).
In addition, the following functions accept only
\fBreal\fR or \fBlong real\fR
arguments:
\fBatan\fR @atan(x) = tan sup -1 x@
\fBatan2\fR @atan2(x,y) = tan sup -1 x over y@
The
\fBsign\fR
function takes two arguments of identical type;
@bold sign (x,y) ~=~ sgn(y) |x|@.
The
\fBmod\fR
function yields the remainder of its first argument when divided by its second.
These functions accept integer and real arguments.
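The two-argument sign function and the remainder function can be illustrated with a short Python sketch (the remainder function is written mod here). Two points are assumptions of the sketch rather than statements from this report: sgn(0) is treated as +1, and the remainder is computed with the quotient truncated toward zero, as is conventional in Fortran.

```python
import math


def sign(x, y):
    """Magnitude of x carrying the sign of y; sgn(0) is taken as +1 (assumed)."""
    return abs(x) if y >= 0 else -abs(x)


def mod(a, b):
    """Remainder of a divided by b, quotient truncated toward zero (assumed)."""
    remainder = math.fmod(a, b)
    if isinstance(a, int) and isinstance(b, int):
        return int(remainder)
    return remainder
```

Under the truncation assumption the remainder takes the sign of the first argument, which differs from Python's own % operator for negative operands.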
Certain facilities are included in the EFL language to ease the conversion of old
Fortran or Ratfor programs to EFL.
In order to make use of nonstandard features of the local Fortran compiler,
it is occasionally necessary to pass a particular line through to the EFL compiler output.
A line that begins with a percent sign (`\fB%\fR')
is copied through to the output, with the percent sign removed but no other change.
Inside of a procedure, each escape line is treated as an executable statement.
If a sequence of lines constitutes a continued Fortran statement, they should be enclosed in braces.
A subroutine call may be preceded by the keyword
\fBcall\fR.
The following keywords are recognized as synonyms of EFL keywords:
\fBdouble precision\fR	\fBlong real\fR
\fBsubroutine\fR	\fBprocedure\fR \fI(untyped)\fR
Standard statement labels are identifiers.
A numeric (positive integer constant) label is also permitted;
the colon is optional following a numeric label.
If a name is used but does not appear in a declaration,
the EFL compiler gives a warning and assumes a declaration for it.
If it is used in the context of a procedure invocation, it is assumed to be a procedure name;
otherwise it is assumed to be a local variable defined at nesting level 1 in the current procedure.
The assumed type is determined by the first letter of the name.
The association of letters and types may be given in an
\fBimplicit\fR
statement, with syntax
\fBimplicit ( \fIletter-list\fB ) \fI type \fR
where a
\fIletter-list\fR
is a list of individual letters or ranges (pair of letters separated by a minus sign).
If no
\fBimplicit\fR
statement appears, the following rules are assumed:
implicit (a\-h, o\-z) real
implicit (i\-n) integer
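The first-letter rule can be sketched in Python as a lookup over letter ranges. The function name implicit_type and the tuple layout of the rules are hypothetical; the default table assumes the usual Fortran convention that names beginning with i through n are integer and all others real.

```python
def implicit_type(name, rules=None):
    """Return the assumed type of an undeclared name from its first letter."""
    if rules is None:
        # Assumed defaults: i-n integer, a-h and o-z real.
        rules = [("i", "n", "integer"), ("a", "h", "real"), ("o", "z", "real")]
    first = name[0].lower()
    for low, high, type_name in rules:
        if low <= first <= high:
            return type_name
    raise ValueError("no implicit rule covers " + repr(name))
```

Passing a different rules list plays the role of an explicit statement that overrides the defaults.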
Fortran contains an indexed multi-way branch; this facility may be used in EFL
by the computed \fBgoto\fR:
\fBgoto ( \fI\*([[ label \*(]] \fB), \fIexpression\fR
The expression must be of type integer and be positive but be no larger than the number of labels in the list.
Control is passed to the statement marked by the label whose position in the list is equal to the expression.
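The selection rule amounts to one-based indexing into the label list. A hedged Python sketch follows; computed_goto is a hypothetical helper that returns the chosen label instead of transferring control, and raising an error for an out-of-range value is a choice made for the sketch (the text only says such values are not allowed).

```python
def computed_goto(labels, expression):
    """Return the label whose 1-based position in labels equals expression."""
    if not 1 <= expression <= len(labels):
        # Out-of-range values violate the stated requirement.
        raise ValueError("expression out of range for label list")
    return labels[expression - 1]


target = computed_goto(["first", "second", "third"], 2)
```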
In unconditional and computed \fBgoto\fR
statements, it is permissible to separate the \fBgo\fR and \fBto\fR words, as in
\fBgo to \fIlabel\fR
Fortran uses a restricted character set,
and represents certain operators by multi-character sequences.
There is an option (\fBdots=on\fR; see Section 10.2) which forces the compiler to recognize the forms
in the second column below:
In this mode, no structure element may be named
The readable forms in the left column are always recognized.
A complex constant may be written as a parenthesized list of real quantities, such as
The preferred notation is by a type coercion,
The preferred way to return a value from a function in EFL is the
\fBreturn(\fIvalue\fB)\fR
construct.
However, the name of the function acts as a variable to which values may be assigned;
an ordinary
\fBreturn\fR
statement returns the last value assigned to that name as the function value.
A statement of the form
@bold equivalence ~ v sub 1 ,~ v sub 2 ,~ ...,~ v sub n@
declares that each of the @v sub i@ starts at the same memory location.
Each of the @v sub i@ may be a variable name, array element name, or structure member.
Minimum and Maximum Functions
There are a number of non-generic functions in this category,
which differ in the required types of the arguments and the type of the return value.
They may also have variable numbers of arguments, but all the arguments must have the same type.
Function Argument Type Result Type
dmin1 long real long real
dmax1 long real long real
A number of options can be used to control the output
and to tailor it for various compilers and systems.
The defaults chosen are conservative, but it is sometimes necessary to change the output to match peculiarities of the
Options are set with statements of the form
\fBoption \fI\*([[ \fIopt \fI\*(]]\fR
optionname \fB= \fIoptionvalue
is either a constant (numeric or string) or
a name associated with that option.
apply to a number of options.
Each option has a default setting.
It is possible to change the whole set of defaults to those appropriate
for a particular environment
At present, the only valid values are
option determines whether the compiler recognizes
and similar forms. The default setting is
Input/Output Error Handling
option can be given three values:
means that none of the I/O statements may be used in expressions, since there is no way to detect errors.
The implementation of the
form uses ERR= and END= clauses.
The implementation of the
form uses IOSTAT= clauses.
By default, continued Fortran statements are indicated by a character in column 6 (Standard Fortran).
puts an ampersand (\fB&\fR) in the first column of the continued lines instead.
If no format is specified for a datum in an
statement, a default is provided.
The default formats can be changed by setting certain options
\fBzdformat\fR long complex
The associated value must be a Fortran format, such as
variables, structures, and the
operators, it is necessary to know how much space various Fortran data types require,
and what boundary alignment properties they demand.
Fortran Type Size Option Alignment Option
The sizes are given in terms of an arbitrary unit;
the alignment is given in the same units.
gives the number of characters per
Default Input/Output Units
are the numbers of the standard input and output units.
Miscellaneous Output Control Options
Each Fortran procedure generated by the compiler will be preceded by the value of the
No Hollerith strings will be passed as subroutine arguments if
The Fortran statement numbers normally start at 1 and increase by 1.
It is possible to change the increment value by using the
.ta .5i 1i 1.5i 2i 2.5i 3.0i
In order to show the flavor of programming in EFL,
we present a few examples.
They are short, but show some of the convenience of the language.
The following short program copies the standard input to the standard output,
provided that the input is a formatted file containing
lines no longer than a hundred characters.
procedure	# main program
character(100) line
while( read( , line) == 0 )
	write( , line)
end
Since
\fBread\fR
returns zero
until the end of file (or a read error),
this program keeps reading and writing until the input is exhausted.
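The following procedure multiplies the
@m times n@ matrix a by the @n times p@ matrix b
to give the @m times p@ matrix c.
The calculation obeys the formula
@c sub ij ~=~ sum from k=1 to n a sub ik b sub kj@.
procedure matmul(a,b,c, m,n,p)
integer i, j, k, m, n, p
long real a(m,n), b(n,p), c(m,p)
do i = 1,m
	do j = 1,p
		{
		c(i,j) = 0
		do k = 1,n
			c(i,j) += a(i,k) \(** b(k,j)
		}
end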
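For comparison, the same triple loop can be written in Python. This is a sketch only: it uses zero-based nested lists and returns the product instead of filling an output argument, both of which are departures from the procedure above.

```python
def matmul(a, b, m, n, p):
    """Multiply the m-by-n matrix a by the n-by-p matrix b (nested lists)."""
    c = [[0.0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            # c[i][j] accumulates the sum over k of a[i][k] * b[k][j].
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2, 2, 2)
```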
Assume we have a list of pairs of numbers @(x,y)@.
The list is stored as a linked list sorted in ascending order of @x@ values.
The following procedure searches this list for a particular value of @x@
and returns the corresponding @y@ value.
integer procedure val(list, first, x)
# list is an array of structures.
# Each structure contains a thread index value, an x, and a y value.
for(p = first , p\*~=LAST && list(p).x<=x , p = list(p).nextindex)
The search is a single
\fBfor\fR
loop that begins with the head of the list
and examines items until either the list is exhausted
or until it is known that the specified value is not on the list.
The two tests in the conjunction must
be performed in the specified order
to avoid using an invalid subscript in the
\fBlist(p)\fR
reference.
The next element in the chain is found by the iteration statement
.B "p=list(p).nextindex".
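The search and the importance of the test order can be sketched in Python, whose `and` operator also evaluates its operands left to right and stops early. The dictionary representation of the list, the sentinel value 0 for LAST, and the NOTFOUND result are assumptions made for this illustration.

```python
LAST = 0        # assumed sentinel index marking the end of the chain
NOTFOUND = -1   # hypothetical result when x is not on the list


def val(nodes, first, x):
    """Search a chain sorted by ascending x.

    nodes maps an index to a tuple (nextindex, x, y).
    """
    p = first
    # The end-of-list test comes first, so nodes[p] is never
    # evaluated with p == LAST, which is not a valid index.
    while p != LAST and nodes[p][1] <= x:
        if nodes[p][1] == x:
            return nodes[p][2]
        p = nodes[p][0]
    return NOTFOUND


nodes = {1: (3, 10, 100), 3: (2, 20, 200), 2: (LAST, 30, 300)}
```

Because the chain is sorted, the loop can stop as soon as it meets an x larger than the one sought.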
As an example of a more complicated problem, let us imagine we have
an expression tree stored in a common area,
and that we want to print out an infix form of the tree.
Each node is either a leaf (containing a numeric value)
or it is a binary operator, pointing to a left and a right descendant.
In a recursive language,
such a tree walk would be implemented by the following simple pseudocode:
print a right parenthesis
In a nonrecursive language like EFL, it is necessary to maintain an explicit stack
to keep track of the current state of the computation.
to print a single character
procedure walk(first) # print out an expression tree
integer first # index of root node
} tree(100) # array of structures
define NODE tree(currentnode)
define STACK stackframe(stackdepth)
# initialize stack with root node
currentnode = STACK.nodep
if(NODE.op == " ") # a leaf
else { # a binary operator node
STACK.nodep = NODE.rightp
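The explicit-stack technique can be sketched compactly in Python. The sketch is illustrative only: the tuple layout of a node, the three numeric states, and the choice to return a string instead of calling output procedures are all assumptions, although the three states mirror the three points at which the walk must resume.

```python
def walk(tree, first):
    """Return the infix form of the expression tree rooted at index first.

    tree maps a node index to (op, leftp, rightp, value);
    a leaf is marked by op == " ".
    """
    out = []
    stack = [[1, first]]              # each frame is [state, node index]
    while stack:
        frame = stack[-1]
        state, node = frame
        op, leftp, rightp, value = tree[node]
        if state == 1:
            if op == " ":             # a leaf: emit the value and pop
                out.append(str(value))
                stack.pop()
            else:                     # an operator: open paren, go left
                out.append("(")
                frame[0] = 2
                stack.append([1, leftp])
        elif state == 2:              # left done: emit operator, go right
            out.append(op)
            frame[0] = 3
            stack.append([1, rightp])
        else:                         # right done: close paren and pop
            out.append(")")
            stack.pop()
    return "".join(out)


tree = {
    1: ("+", 2, 3, None),
    2: (" ", 0, 0, 1),
    3: ("*", 4, 5, None),
    4: (" ", 0, 0, 2),
    5: (" ", 0, 0, 3),
}
infix = walk(tree, 1)
```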
One of the major goals of the EFL language is to make it easy to write portable programs.
The output of the EFL compiler is intended to be acceptable to any Standard Fortran
compiler.
Certain EFL operations cannot be implemented in portable Fortran,
so a few machine-dependent procedures must be provided in each environment.
The subroutine
\fBef1asc\fR
is called to copy one character string to another.
If the target string is shorter than the source,
the final characters are not copied.
If the target string is longer, its end is padded with blanks.
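The truncate-or-pad rule is easy to state in Python. The sketch below works on ordinary Python strings, not on Fortran integer arrays holding characters, and copy_string is a hypothetical name.

```python
def copy_string(source, target_length):
    """Fit source into target_length characters.

    A longer source loses its final characters; a shorter one is
    padded on the right with blanks.
    """
    return source[:target_length].ljust(target_length)
```

The result always has exactly target_length characters, whatever the length of the source.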
subroutine ef1asc(a, la, b, lb)
integer a(\(**), la, b(\(**), lb
and it must copy the first
\fBlb\fR
characters from
\fBb\fR
to the first
\fBla\fR
characters of
\fBa\fR.
Character String Comparisons
The function
\fBef1cmc\fR
is invoked to determine the order of two character strings.
integer function ef1cmc(a, la, b, lb)
integer a(\(**), la, b(\(**), lb
The function returns a negative value if the string
\fBa\fR
of length
\fBla\fR
precedes the string
\fBb\fR
of length
\fBlb\fR.
It returns zero if the strings are equal, and a positive value otherwise.
If the strings are of differing length, the comparison is carried out
as if the end of the shorter string were padded with blanks.
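A blank-padded comparison of this kind can be sketched in Python as follows. The sketch assumes that a negative result means the first string sorts earlier, and it uses Python's code-point ordering, whereas the collating sequence of a real implementation is machine dependent; compare_strings is a hypothetical name.

```python
def compare_strings(a, b):
    """Compare two strings as if the shorter were padded with blanks.

    Returns -1, 0, or 1 (negative when a sorts first; an assumption).
    """
    width = max(len(a), len(b))
    padded_a = a.ljust(width)
    padded_b = b.ljust(width)
    if padded_a < padded_b:
        return -1
    if padded_a > padded_b:
        return 1
    return 0
```

Trailing blanks therefore never affect the outcome.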
A. D. Hall originated the EFL language and wrote the first compiler for it;
he also gave inestimable aid when I took up the project.
B. W. Kernighan and W. S. Brown made a number of useful suggestions about the language and about this report.
N. L. Schryer has acted as willing, cheerful, and severe first user
and helpful critic of each new version and facility.
J. L. Blue, L. C. Kaufman, and D. D. Warner
made very useful contributions by making serious use of the compiler,
and noting and tolerating its misbehaviors.
1.	B. W. Kernighan,
``Ratfor \(em A Preprocessor for a Rational Fortran'',
Bell Laboratories Computing Science Technical Report #55.
APPENDIX A. Relation Between EFL and Ratfor
There are a number of differences between Ratfor and EFL,
since EFL is a defined language while Ratfor is
the union of the special control structures and the language accepted by the underlying Fortran compiler.
Ratfor running over Standard Fortran is almost a subset of EFL.
Most of the features described in the Atavisms section are present to ease
the conversion of Ratfor programs to EFL.
There are a few incompatibilities:
The syntax of the
\fBfor\fR
statement is slightly different in the two languages:
the three clauses are separated by semicolons in Ratfor,
but by commas in EFL.
(The initial and iteration statements may be compound statements in EFL because of this change.)
The input/output syntax is quite different in the two languages,
and there is no FORMAT statement in EFL.
There are no ASSIGN or assigned GOTO statements in EFL.
The major linguistic additions are
factored declaration syntax,
assignment and sequential test operators,
EFL permits more general forms for expressions,
and provides a more uniform syntax.
(One need not worry about the Fortran/Ratfor restrictions
on subscript or DO expression forms, for example.)
The current version of the EFL compiler is a two-pass translator written in
It implements all of the features of the language described above except for
Versions of this compiler run under the
The EFL compiler diagnoses all syntax errors.
It gives the line and file name (if known) on which the error was detected.
Warnings are given for variables that are used but not explicitly declared.
B.3. Quality of Fortran Produced
The Fortran produced by EFL is quite clean and readable.
To the extent possible, the variable names that appear in the EFL program are used in the Fortran code.
The bodies of loops and test constructs are indented.
Statement numbers are consecutive.
Few unneeded GOTO and CONTINUE statements are used.
It is considered a compiler bug if incorrect Fortran is produced
(except for escaped lines).
The following is the Fortran procedure produced by the EFL compiler for
the matrix multiplication example (Section 11.2):
\0\0\0\0\0\0subroutine\0matmul(a,\0b,\0c,\0m,\0n,\0p)
\0\0\0\0\0\0integer\0m,\0n,\0p
\0\0\0\0\0\0double\0precision\0a(m,\0n),\0b(n,\0p),\0c(m,\0p)
\0\0\0\0\0\0integer\0i,\0j,\0k
\0\0\0\0\0\0do\0\03\0i\0=\01,\0m
\0\0\0\0\0\0\0\0\0do\0\02\0j\0=\01,\0p
\0\0\0\0\0\0\0\0\0\0\0\0c(i,\0j)\0=\00
\0\0\0\0\0\0\0\0\0\0\0\0do\0\01\0k\0=\01,\0n
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0c(i,\0j)\0=\0c(i,\0j)+a(i,\0k)*b(k,\0j)
\0\0\01\0\0\0\0\0\0\0\0\0\0\0continue
\0\0\02\0\0\0\0\0\0\0\0continue
\0\0\03\0\0\0\0\0continue
The following is the procedure for the tree walk (Section 11.4):
\0\0\0\0\0\0subroutine\0walk(first)
\0\0\0\0\0\0integer\0first
\0\0\0\0\0\0common\0/nodes/\0tree
\0\0\0\0\0\0integer\0tree(4,\0100)
\0\0\0\0\0\0real\0tree1(4,\0100)
\0\0\0\0\0\0integer\0staame(2,\0100),\0stapth,\0curode
\0\0\0\0\0\0integer\0const1(1)
\0\0\0\0\0\0equivalence\0(tree(1,1),\0tree1(1,1))
\0\0\0\0\0\0data\0const1(1)/4h\0\0\0\0/
c\0print\0out\0an\0expression\0tree
c\0\0\0initialize\0stack\0with\0root\0node
\0\0\0\0\0\0staame(1,\0stapth)\0=\01
\0\0\0\0\0\0staame(2,\0stapth)\0=\0first
\0\0\01\0\0if\0(stapth\0.le.\00)\0goto\0\09
\0\0\0\0\0\0\0\0\0curode\0=\0staame(2,\0stapth)
\0\0\0\0\0\0\0\0\0goto\0\07
\0\0\02\0\0\0\0\0\0\0\0if\0(tree(1,\0curode)\0.ne.\0const1(1))\0goto\03
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0call\0outval(tree1(4,\0curode))
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth-1
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0goto\0\04
\0\0\03\0\0\0\0\0\0\0\0\0\0\0call\0outch(1h()
c\0a\0binary\0operator\0node
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\02
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth+1
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\01
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0staame(2,\0stapth)\0=\0tree(2,\0curode)
\0\0\04\0\0\0\0\0\0\0\0goto\0\08
\0\0\05\0\0\0\0\0\0\0\0call\0outch(tree(1,\0curode))
\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\03
\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth+1
\0\0\0\0\0\0\0\0\0\0\0\0staame(1,\0stapth)\0=\01
\0\0\0\0\0\0\0\0\0\0\0\0staame(2,\0stapth)\0=\0tree(3,\0curode)
\0\0\0\0\0\0\0\0\0\0\0\0goto\0\08
\0\0\06\0\0\0\0\0\0\0\0call\0outch(1h))
\0\0\0\0\0\0\0\0\0\0\0\0stapth\0=\0stapth-1
\0\0\0\0\0\0\0\0\0\0\0\0goto\0\08
\0\0\07\0\0\0\0\0\0\0\0if\0(staame(1,\0stapth)\0.eq.\03)\0goto\0\06
\0\0\0\0\0\0\0\0\0\0\0\0if\0(staame(1,\0stapth)\0.eq.\02)\0goto\0\05
\0\0\0\0\0\0\0\0\0\0\0\0if\0(staame(1,\0stapth)\0.eq.\01)\0goto\0\02
\0\0\08\0\0\0\0\0continue
\0\0\0\0\0\0\0\0\0goto\0\01
APPENDIX C. CONSTRAINTS ON THE DESIGN OF THE EFL LANGUAGE
Although Fortran can be used to simulate any finite computation,
there are realistic limits on the generality of a language that can be
translated into Fortran.
The design of EFL was constrained by the implementation strategy.
Certain of the restrictions are petty (six character external names),
but others are sweeping (lack of pointer variables).
The following paragraphs describe the major limitations imposed by Fortran.
External names (procedure and COMMON block names)
must be no longer than six characters in Fortran.
Further, an external name is global to the entire program.
Therefore, EFL can support block structure within a procedure,
but it can have only one level of external name if the
EFL procedures are to be compilable separately,
as are Fortran procedures.
The Fortran standards, in effect, permit arguments to be passed between
Fortran procedures either by reference or by copy-in/copy-out.
This indeterminacy of specification shows through into EFL.
A program that depends on the method of argument transmission is
illegal in either language.
There are no procedure-valued variables in Fortran: a procedure name may
only be passed as an argument or be invoked; it cannot be stored.
Fortran (and EFL) would be noticeably simpler if a procedure variable mechanism
were available.
The most grievous problem with Fortran is its lack of a pointer-like
data type.
The implementation of the compiler would have been far easier if certain hard
cases could have been handled by pointers.
Further, the language could have been simplified considerably if pointers were
accessible in Fortran.
(There are several ways of simulating pointers by using subscripts,
but they founder on the problems of external variables and initialization.)
Fortran procedures are not recursive,
so it was not practical to permit EFL procedures to be recursive.
(Recursive procedures with arguments can be simulated only with great pain.)
The definition of Fortran does not specify the lifetime of variables.
It would be possible but cumbersome to implement stack or heap
storage disciplines by using COMMON blocks.