.if \n(xx .bp
.if !\n(xx \{\
.so tmac.p \}
.ND
.nr H1 0
.af H1 A
.NH
Appendix to Wirth's Pascal Report
.PP
This section is an appendix to
the definition of the Pascal language in Niklaus Wirth's
.I "Pascal Report"
and, with that Report, precisely defines the
Berkeley
implementation.
This appendix includes a summary of extensions to the language,
gives the ways in which the undefined specifications were resolved,
gives limitations and restrictions of the current implementation,
and lists the added functions and procedures available.
It concludes with a list of differences with the commonly available
Pascal 6000\-3.4 implementation,
and some comments on standard and portable Pascal.
.NH 2
Extensions to the language Pascal
.PP
This section defines non-standard language constructs available in
.UP .
The
.B s
standard Pascal option of the translators
.PI
and
.PC
can be used to detect these extensions in programs which are to be transported.
.SH
String padding
.PP
.UP
will pad constant strings with blanks in expressions and as
value parameters to make them as long as is required.
The following is a legal
.UP
program:
.LS
\*bprogram\fP x(output);
\*bvar\fP z : \*bpacked\fP \*barray\fP [ 1 .. 13 ] \*bof\fP char;
\*bbegin\fP
    z := 'red';
    writeln(z)
\*bend\fP.
.LE
The padded blanks are added on the right.
Thus the assignment above is equivalent to:
.LS
z := 'red          '
.LE
which is standard Pascal.
.SH
Octal constants, octal and hexadecimal write
.PP
Octal constants may be given as a sequence of octal digits followed
by the character `b' or `B'.
The forms
.LS
write(a:n \*boct\fP)
.LE
and
.LS
write(a:n \*bhex\fP)
.LE
cause the internal representation of
expression
.I a,
which must be Boolean, character, integer, pointer, or a user-defined enumerated
type,
to be written in octal or hexadecimal respectively.
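.PP
As an illustrative sketch (the program name, the constant name and the
choice of field widths are assumptions made for this example and are
not part of the definition), an octal constant and both write forms
might be combined as follows:
.LS
\*bprogram\fP octdemo(output);
\*bconst\fP mask = 777b;             { octal constant, 511 decimal }
\*bbegin\fP
    writeln(mask);               { decimal }
    writeln(mask:11 \*boct\fP);       { internal representation in octal }
    writeln(mask:8 \*bhex\fP)         { internal representation in hexadecimal }
\*bend\fP.
.LE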
.SH
Assert statement
.PP
An
.B assert
statement causes a
.I Boolean
expression to be evaluated
each time the statement is executed.
A runtime error results if the expression evaluates to
.I false .
The
.B assert
statement is treated as a comment if run-time tests are disabled.
The syntax for
.B assert
is:
.LS
\*bassert\fP <expr>
.LE
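.PP
For example, the following sketch (the program and variable names are
invented for illustration) guards a division.
The expression is written in parentheses, which the syntax above permits:
.LS
\*bprogram\fP mean(input, output);
\*bvar\fP sum, n : integer;
\*bbegin\fP
    read(sum, n);
    \*bassert\fP (n > 0);        { runtime error here if n is not positive }
    writeln(sum \*bdiv\fP n)
\*bend\fP.
.LE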
.SH
Enumerated type input-output
.PP
Enumerated types may be read and written.
On output the string name associated with the enumerated
value is output.
If the value is out of range,
a runtime error occurs.
On input an identifier is read and looked up
in a table of names associated with the
type of the variable, and
the appropriate internal value is assigned to the variable being
read.
If the name is not found in the table
a runtime error occurs.
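.PP
A minimal sketch of this behavior, with a type and program name
chosen only for illustration:
.LS
\*bprogram\fP colors(input, output);
\*btype\fP color = (red, green, blue);
\*bvar\fP c : color;
\*bbegin\fP
    read(c);        { accepts the identifier red, green or blue }
    writeln(c)      { writes the name of the value just read }
\*bend\fP.
.LE
Given the input `green' this program should write `green';
any input name other than the three listed causes the runtime error
described above.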
.SH
Structure returning functions
.PP
An extension has been added which allows functions
to return arbitrary sized structures rather than just
scalars as in the standard.
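.PP
As a sketch of what this permits (the type and function names are
illustrative assumptions, not predefined identifiers),
a function may return a record:
.LS
\*btype\fP complex = \*brecord\fP re, im : real \*bend\fP;

\*bfunction\fP cadd(a, b : complex) : complex;
\*bvar\fP c : complex;
\*bbegin\fP
    c.re := a.re + b.re;
    c.im := a.im + b.im;
    cadd := c           { the whole record is the function result }
\*bend\fP;
.LE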
.SH
Separate compilation
.PP
The compiler
.PC
has been extended to allow separate compilation of programs.
Procedures and functions declared at the global level
may be compiled separately.
Type checking of calls to separately compiled routines is performed
at load time to ensure that the program as a whole
is consistent.
See section 5.10 for details.
.NH 2
Resolution of the undefined specifications
.SH
File name \- file variable associations
.PP
Each Pascal file variable is associated with a named
.UX
file.
Except for
.I input
and
.I output,
which are
exceptions to some of the rules, a name can become associated
with a file in any of three ways:
.IP "\ \ \ \ \ 1)" 10
If a global Pascal file variable appears in the
.B program
statement
then it is associated with the
.UX
file of the same name.
.IP "\ \ \ \ \ 2)"
If a file was reset or rewritten using the
extended two-argument form of
.I reset
or
.I rewrite
then the given name
is associated.
.IP "\ \ \ \ \ 3)"
If a file which has never had a
.UX
name associated
is reset or rewritten without specifying a name
via the second argument, then a temporary name
of the form `tmp.x'
is associated with the file.
Temporary names start with
`tmp.1' and continue by incrementing the last character in the
.SM
USASCII
.NL
ordering.
Temporary files are removed automatically
when their scope is exited.
.SH
The program statement
.PP
The syntax of the
.B program
statement is:
.LS
\*bprogram\fP <id> ( <file id> { , <file id> } ) ;
.LE
The file identifiers (other than
.I input
and
.I output )
must be declared as variables of
.B file
type in the global declaration part.
.SH
The files input and output
.PP
The formal parameters
.I input
and
.I output
are associated with the
.UX
standard input and output and have a
somewhat special status.
The following rules must be noted:
.IP "\ \ \ \ \ 1)" 10
The program heading
.B must
contain the formal parameter
.I output.
If
.I input
is used, explicitly or implicitly, then it must
also be declared here.
.IP "\ \ \ \ \ 2)"
Unlike all other files, the
Pascal files
.I input
and
.I output
must not be defined in a declaration,
as their declaration is automatically:
.LS
\*bvar\fP input, output: text
.LE
.IP "\ \ \ \ \ 3)"
The procedure
.I reset
may be used on
.I input.
If no
.UX
file name has ever been associated with
.I input,
and no file name is given, then an attempt will be made
to `rewind'
.I input.
If this fails, a run time
error will occur.
.I Rewrite
calls to output act as for any other file, except that
.I output
initially has no associated file.
This means that a simple
.LS
rewrite(output)
.LE
associates a temporary name with
.I output.
.SH
Details for files
.PP
If a file other than
.I input
is to be read,
then reading must be initiated by a call to the
procedure
.I reset
which causes the Pascal system to attempt to open the
associated
.UX
file for reading.
If this fails, then a runtime error occurs.
Writing of a file other than
.I output
must be initiated by a
.I rewrite
call,
which causes the Pascal system to create the associated
.UX
file and
to then open the file for writing only.
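.PP
The following sketch (the program, variable and file names are
assumptions made for the example) copies one text file to another
using the two-argument forms of
.I reset
and
.I rewrite :
.LS
\*bprogram\fP copy(output);
\*bvar\fP src, dst : text;
    c : char;
\*bbegin\fP
    reset(src, 'data.in');       { open an existing file for reading }
    rewrite(dst, 'data.out');    { create the file, open for writing }
    \*bwhile\fP \*bnot\fP eof(src) \*bdo\fP \*bbegin\fP
        \*bwhile\fP \*bnot\fP eoln(src) \*bdo\fP \*bbegin\fP
            read(src, c);
            write(dst, c)
        \*bend\fP;
        readln(src);
        writeln(dst)
    \*bend\fP
\*bend\fP.
.LE
Because neither file appears in the program statement,
each takes its name from the second argument of the call.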
.SH
Buffering
.PP
The buffering for
.I output
is determined by the value of the
.B b
option
at the end of the
.B program
statement.
If it has its default value 1,
then
.I output
is
buffered in blocks of up to 512 characters,
flushed whenever a writeln occurs
and at each reference to the file
.I input.
If it has the value 0,
.I output
is unbuffered.
Any value of
2 or more gives block buffering without line or
.I input
reference flushing.
All other output files are always buffered in blocks of 512 characters.
All output buffers are flushed when the files are closed at scope exit,
whenever the procedure
.I message
is called, and can be flushed using the
built-in procedure
.I flush.
.PP
An important point for an interactive implementation is the definition
of `input\(ua'.
If
.I input
is a teletype, and the Pascal system reads a character at the beginning
of execution to define `input\(ua', then no prompt could be printed
by the program before the user is required to type some input.
For this reason, `input\(ua' is not defined by the system until its definition
is needed, reading from a file occurring only when necessary.
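.PP
As a sketch of the consequence (the names are illustrative),
the prompt in the program below should appear before the program waits
for input: nothing is read until
.I readln
needs it, and with the default
.B b
value the reference to
.I input
flushes
.I output :
.LS
\*bprogram\fP ask(input, output);
\*bvar\fP n : integer;
\*bbegin\fP
    write('How many? ');     { no writeln, so still buffered here }
    readln(n);               { reference to input flushes the prompt }
    writeln('Got ', n)
\*bend\fP.
.LE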
.SH
The character set
.PP
Seven bit
.SM USASCII
is the character set used on
.UX .
The standard Pascal
symbols `and', `or', `not', `<=', `>=', `<>',
and the uparrow `\(ua' (for pointer qualification)
are recognized.\*(dg
.FS
\*(dgOn many terminals and printers, the up arrow is represented
as a circumflex `^'.
These are not distinct characters, but rather different graphic
representations of the same internal codes.
.FE
Less portable are the
synonyms tilde `~'
for
.B not ,
`&' for
.B and ,
and `|' for
.B or .
.PP
Upper and lower case are considered to be distinct.\*(st
.FS
\*(stThe proposed standard for Pascal considers them to be the same.
.FE
Keywords and built-in
.B procedure
and
.B function
names are
composed of all lower case letters.
Thus the identifiers GOTO and GOto are distinct both from each other and
from the keyword
\*bgoto\fP.
The standard type `boolean' is also available as `Boolean'.
.PP
Character strings and constants may be delimited by the character
`\''
or by the character `#';
the latter is sometimes convenient when programs are to be transported.
Note that the `#' character has special meaning
.up
when it is the first character on a line \- see
.I "Multi-file programs"
below.
.SH
The standard types
.PP
The standard type
.I integer
is conceptually defined as
.LS
\*btype\fP integer = minint .. maxint;
.LE
.I Integer
is implemented with 32 bit twos complement arithmetic.
Predefined constants of type
.I integer
are:
.LS
\*bconst\fP maxint = 2147483647; minint = -2147483648;
.LE
.PP
The standard type
.I char
is conceptually defined as
.LS
\*btype\fP char = minchar .. maxchar;
.LE
Built-in character constants are `minchar' and `maxchar', `bell' and `tab';
ord(minchar) = 0, ord(maxchar) = 127.
.PP
The type
.I real
is implemented using 64 bit floating point arithmetic.
The floating point arithmetic is done in `rounded' mode, and
provides approximately 17 digits of precision
with numbers as small as 10 to the negative 38th power and as large as
10 to the 38th power.
.SH
Comments
.PP
Comments can be delimited by either `{' and `}' or by `(*' and `*)'.
If the character `{' appears in a comment delimited by `{' and `}',
a warning diagnostic is printed.
A similar warning will be printed if the sequence `(*' appears in
a comment delimited by `(*' and `*)'.
The restriction implied by this warning is not part of standard Pascal,
but detects many otherwise subtle errors.
.SH
Option control
.PP
Options of the translators may be controlled
in two distinct ways.
A number of options may appear on the command line invoking the translator.
These options are given as one or more strings of letters preceded by the
character `\-' and cause the default setting of
each given option to be changed.
This method of communication of options is expected to predominate
for
.UX .
Thus the command
.LS
% \*bpi \-l \-s foo.p\fR
.LE
translates the file foo.p with the listing option enabled (as it normally
is off), and with only standard Pascal features available.
.PP
If more control over the portions of the program where options are enabled is
required, then option control in comments can and should be used.
The
format for option control in comments is identical to that used in Pascal
6000\-3.4.
One places the character `$' as the first character of the comment
and follows it by a comma separated list of directives.
Thus an equivalent to the command line example given above would be:
.LS
{$l+,s+ listing on, standard Pascal}
.LE
as the first line of the program.
The `l'
option is more appropriately specified on the command line,
since it is extremely unlikely in an interactive environment
that one wants a listing of the program each time it is translated.
.PP
Directives consist of a letter designating the option,
followed either by a `+' to turn the option on, or by a `\-' to turn the
option off.
The
.B b
option takes a single digit instead of
a `+' or `\-'.
.SH
Notes on the listings
.PP
The first page of a listing
includes a banner line indicating the version and date of generation of
.PI
or
.PC .
It also
includes the
.UX
path name supplied for the source file and the date of
last modification of that file.
.PP
Within the body of the listing, lines are numbered consecutively and
correspond to the line numbers for the editor.
Currently, two special
kinds of lines may be used to format the listing:
a line consisting of a form-feed
character, control-l, which causes a page
eject in the listing, and a line with
no characters which causes the line number to be suppressed in the listing,
creating a truly blank line.
These lines thus correspond to `eject' and `space' macros found in many
assemblers.
Non-printing characters are printed as the character `?' in the listing.\*(dg
.FS
\*(dgThe character generated by a control-i indents
to the next `tab stop'.
Tab stops are set every 8 columns in
.UX .
Tabs thus provide a quick way of indenting in the program.
.FE
.SH
The standard procedure write
.PP
If no minimum field length parameter is specified
for a
.I write,
the following default
values are assumed:
.KS
.TS
center tab(@);
l n.
integer@10
real@22
Boolean@length of `true' or `false'
char@1
string@length of the string
oct@11
hex@8
.TE
.KE
The end of each line in a text file should be explicitly
indicated by `writeln(f)', where `writeln(output)' may be written
simply as `writeln'.
For
.UX ,
the built-in function `page(f)' puts a single
.SM ASCII
form-feed character on the output file.
For programs which are to be transported the filter
.I pcc
can be used to interpret carriage control, as
.UX
does not normally do so.
.NH 2
Restrictions and limitations
.SH
Files
.PP
Files cannot be members of files or members of dynamically
allocated structures.
.SH
Arrays, sets and strings
.PP
The calculations involving array subscripts and set elements
are done with 16 bit arithmetic.
This
restricts the types over which arrays and sets may be defined.
The lower bound of such a range must be greater than or equal to
\-32768, and the upper bound less than 32768.
In particular, strings may have any length from 1 to 65535 characters,
and sets may contain no more than 65535 elements.
.SH
Line and symbol length
.PP
There is no intrinsic limit on the length of identifiers.
Identifiers
are considered to be distinct if they differ
in any single position over their entire length.
There is a limit, however, on the maximum input
line length.
This limit is quite generous, currently exceeding 160
characters.
.SH
Procedure and function nesting and program size
.PP
At most 20 levels of
.B procedure
and
.B function
nesting are allowed.
There is no fundamental, translator defined limit on the size of the
program which can be translated.
The ultimate limit is supplied by the
hardware and thus, on the \s-2PDP\s0-11,
by the 16 bit address space.
If
one runs up against the `ran out of memory' diagnostic the program may yet
translate if smaller procedures are used, as a lot of space is freed
by the translator at the completion of each
.B procedure
or
.B function
in the current
implementation.
.PP
On the \s-2VAX\s0-11, there is an implementation defined limit
of 65536 bytes per variable.
There is no limit on the number of variables.
.SH
Overflow
.PP
There is currently no checking for overflow on arithmetic operations at
run-time on the \s-2PDP\s0-11.
Overflow checking is performed on the \s-2VAX\s0-11 by the hardware.
.br
.ne 15
.NH 2
Added types, operators, procedures and functions
.SH
Additional predefined types
.PP
The type
.I alfa
is predefined as:
.LS
\*btype\fP alfa = \*bpacked\fP \*barray\fP [ 1..10 ] \*bof\fP \*bchar\fP
.LE
.PP
The type
.I intset
is predefined as:
.LS
\*btype\fP intset = \*bset of\fP 0..127
.LE
In most cases the context of an expression involving a constant
set allows the translator to determine the type of the set, even though the
constant set itself may not uniquely determine this type.
In the
cases where it is not possible to determine the type of the set from
local context, the expression type defaults to a set over the entire base
type unless the base type is integer\*(dg.
.FS
\*(dgThe current translator makes a special case of the construct
`if ... in [ ... ]' and enforces only the more lax restriction
on 16 bit arithmetic given above in this case.
.FE
In the latter case the type defaults to the current
binding of
.I intset,
which must be ``type set of (a subrange of) integer'' at that point.
.PP
Note that if
.I intset
is redefined via:
.LS
\*btype\fP intset = \*bset of\fP 0..58;
.LE
then the default integer set is the implicit
.I intset
of
Pascal 6000\-3.4.
.SH
Additional predefined operators
.PP
The relationals `<' and `>' of proper set
inclusion are available.
With
.I a
and
.I b
sets, note that
.LS
(\*bnot\fR (\fIa\fR < \fIb\fR)) <> (\fIa\fR >= \fIb\fR)
.LE
As an example consider the sets
.I a
= [0,2]
and
.I b
= [1].
The only relation true between these sets is `<>'.
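.PP
The example can be checked with a short program such as the following
sketch (the program name is illustrative; the sets are those given above):
.LS
\*bprogram\fP setrel(output);
\*bvar\fP a, b : intset;
\*bbegin\fP
    a := [0, 2];
    b := [1];
    writeln(\*bnot\fP (a < b));    { true: a is not a proper subset of b }
    writeln(a >= b);          { false: b is not contained in a }
    writeln(a <> b)           { true: the one relation that holds }
\*bend\fP.
.LE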
.SH
Non-standard procedures
.IP argv(i,a) 25
where
.I i
is an integer and
.I a
is a string variable
assigns the (possibly truncated or blank padded)
.I i \|'th
argument
of the invocation of the current
.UX
process to the variable
.I a .
The range of valid
.I i
is
.I 0
to
.I argc\-1 .
.IP date(a)
assigns the current date to the alfa variable
.I a
in the format `dd mmm yy ', where `mmm' is the first
three characters of the month, e.g. `Apr'.
.IP flush(f)
writes the output buffered for Pascal file
.I f
into the associated
.UX
file.
.IP halt
terminates the execution of the program with
a control flow backtrace.
.IP linelimit(f,x)\*(dd
.FS
\*(ddCurrently ignored by pdp-11
.X .
.FE
with
.I f
a textfile and
.I x
an integer expression
causes
the program to be abnormally terminated if more than
.I x
lines are
written on file
.I f .
If
.I x
is less than 0 then no limit is imposed.
.IP message(x,...)
causes the parameters, which have the format of those
to the
built-in
.B procedure
.I write,
to be written unbuffered on the diagnostic unit 2,
almost always the user's terminal.
.IP null
a procedure of no arguments which does absolutely nothing.
It is useful as a place holder,
and is generated by
.XP
in place of the invisible empty statement.
.IP remove(a)
where
.I a
is a string causes the
.UX
file whose
name is
.I a,
with trailing blanks eliminated, to be removed.
.IP reset(f,a)
where
.I a
is a string causes the file whose name
is
.I a
(with blanks trimmed) to be associated with
.I f
in addition
to the normal function of
.I reset.
.IP rewrite(f,a)
is analogous to `reset' above.
.IP stlimit(i)
where
.I i
is an integer sets the statement limit to be
.I i
statements.
Specifying the
.B p
option to
.I pc
disables statement limit counting.
.IP time(a)
causes the current time in the form `\ hh:mm:ss\ ' to be
assigned to the alfa variable
.I a.
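.PP
As an illustrative sketch (the program name and the 40 character
buffer length are assumptions of the example), the following lists
the arguments of the invocation and then the date and time:
.LS
\*bprogram\fP args(output);
\*bvar\fP i : integer;
    arg : \*bpacked\fP \*barray\fP [ 1 .. 40 ] \*bof\fP char;
    d, t : alfa;
\*bbegin\fP
    \*bfor\fP i := 0 \*bto\fP argc - 1 \*bdo\fP \*bbegin\fP
        argv(i, arg);        { truncated or blank padded to 40 }
        writeln(i:2, ' ', arg)
    \*bend\fP;
    date(d);
    time(t);
    writeln(d, t)
\*bend\fP.
.LE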
.SH
Non-standard functions
.IP argc 25
returns the count of arguments when the Pascal program
was invoked.
.I Argc
is always at least 1.
.IP card(x)
returns the cardinality of the set
.I x,
i.e. the
number of elements contained in the set.
.IP clock
returns an integer which is the number of central processor
milliseconds of user time used by this process.
.IP expo(x)
yields the integer valued exponent of the floating-point
representation of
.I x ;
expo(\fIx\fP) = entier(log2(abs(\fIx\fP))).
.IP random(x)
where
.I x
is a real parameter, evaluated but otherwise
ignored, invokes a linear congruential random number generator.
Successive seeds are generated as (seed*a + c) mod m and
the new random number is a normalization of the seed to the range 0.0 to 1.0;
a is 62605, c is 113218009, and m is
536870912.
The initial seed
is 7774755.
.IP seed(i)
where
.I i
is an integer sets the random number generator seed
to
.I i
and returns the previous seed.
Thus seed(seed(i))
has no effect except to yield value
.I i.
.IP sysclock
an integer function of no arguments returns the number of central processor
milliseconds of system time used by this process.
.IP undefined(x)
a Boolean function.
Its argument is a real number and
it always returns false.
.IP wallclock
an integer function of no arguments returns the time
in seconds since 00:00:00 GMT January 1, 1970.
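.PP
A sketch of how several of these functions might be combined
(the program name, the seed value 12345 and the loop count are
arbitrary choices for the example): setting the seed makes a run
repeatable, and
.I clock
is sampled before and after the loop to time it.
.LS
\*bprogram\fP rnd(output);
\*bvar\fP i, old, start : integer;
    sum : real;
\*bbegin\fP
    old := seed(12345);      { returns the previous seed }
    start := clock;
    sum := 0.0;
    \*bfor\fP i := 1 \*bto\fP 1000 \*bdo\fP
        sum := sum + random(0.0);    { argument is ignored }
    writeln('mean ', sum / 1000);
    writeln('user ms ', clock - start)
\*bend\fP.
.LE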
.NH 2
Remarks on standard and portable Pascal
.PP
It is occasionally desirable to prepare Pascal programs which will be
acceptable at other Pascal installations.
While certain system dependencies are bound to creep in,
judicious design and programming practice can usually eliminate
most of the non-portable usages.
Wirth's
.I "Pascal Report"
concludes with a standard for implementation and program exchange.
.PP
In particular, the following differences may cause trouble when attempting
to transport programs between this implementation and Pascal 6000\-3.4.
Using the
.B s
translator option may serve to indicate many problem areas.\*(dg
.FS
\*(dgThe
.B s
option does not, however, check that identifiers differ
in the first 8 characters.
.I Pi
and
.PC
also do not check the semantics of
.B packed .
.FE
.SH
Features not available in Berkeley Pascal
.IP
Segmented files and associated functions and procedures.
.IP
The function
.I trunc
with two arguments.
.IP
Arrays whose indices exceed the capacity of 16 bit arithmetic.
.SH
Features available in Berkeley Pascal but not in Pascal 6000\-3.4
.IP
The procedures
.I reset
and
.I rewrite
with file names.
.IP
The functions
.I argc,
.I seed,
.I sysclock,
and
.I wallclock.
.IP
The procedures
.I argv,
.I flush,
and
.I remove.
.IP
.I Message
with arguments other than character strings.
.IP
.I Write
with keyword
.B hex .
.IP
The
.B assert
statement.
.IP
Reading and writing of enumerated types.
.IP
Allowing functions to return structures.
.IP
Separate compilation of programs.
.IP
Comparison of records.
.SH
Other problem areas
.PP
Sets and strings are more general in
.UP ;
see the restrictions given in
the
Jensen-Wirth
.I "User Manual"
for details on the 6000\-3.4 restrictions.
.PP
The character set differences may cause problems,
especially the use of the function
.I chr,
characters as arguments to
.I ord,
and comparisons of characters,
since the character set ordering
differs between the two machines.
.PP
The Pascal 6000\-3.4 compiler uses a less strict notion of type equivalence.
In
.UP ,
types are considered identical only if they are represented
by the same type identifier.
Thus, in particular, unnamed types are unique
to the variables/fields declared with them.
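.PP
The following program sketches the consequence (all identifiers are
invented for the example).
Under the rule just stated the first assignment is acceptable and
the second would be rejected, so the program is deliberately not legal:
.LS
\*bprogram\fP equiv(output);
\*btype\fP vec = \*barray\fP [ 1 .. 10 ] \*bof\fP integer;
\*bvar\fP a, b : vec;
    c : \*barray\fP [ 1 .. 10 ] \*bof\fP integer;
    d : \*barray\fP [ 1 .. 10 ] \*bof\fP integer;
\*bbegin\fP
    a := b;      { same type identifier vec on both sides }
    c := d       { two distinct unnamed types: a type clash }
\*bend\fP.
.LE
Declaring
.I c
and
.I d
with the named type removes the clash.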
.PP
Pascal 6000\-3.4 doesn't recognize our option
flags, so it is wise to
put the control of
.UP
options at the end of option lists or, better
yet, restrict the option list length to one.
.PP
For Pascal 6000\-3.4 the ordering of files in the program statement has
significance.
It is desirable to place
.I input
and
.I output
as the first two files in the
.B program
statement.
919and
920.I output
921as the first two files in the
922.B program
923statement.