.if \n(xx .bp
.if !\n(xx \{\
.so tmac.p \}
.ND
.nr H1 0
.af H1 A
.NH
Appendix to Wirth's Pascal Report
.PP
This section is an appendix to
the definition of the Pascal language in Niklaus Wirth's
.I "Pascal Report"
and, with that Report, precisely defines the
.UX
implementation.
This appendix includes a summary of extensions to the language,
gives the ways in which the undefined specifications were resolved,
gives limitations and restrictions of the current implementation,
and lists the added functions and procedures available.
It concludes with a list of differences from the commonly available
Pascal 6000\-3.4 implementation,
and some comments on standard and portable Pascal.
.NH 2
Extensions to the language Pascal
.PP
This section defines non-standard language constructs available in
.UP .
The
.B s
standard Pascal option of the translator
.PI
can be used to detect these extensions in programs which are to be transported.
.SH
String padding
.PP
.UP
will pad constant strings with blanks in expressions and as
value parameters to make them as long as is required.
The following is a legal
.UP
program:
.LS
\*bprogram\fP x(output);
\*bvar\fP z : \*bpacked\fP \*barray\fP [ 1 .. 13 ] \*bof\fP char;
\*bbegin\fP
    z := 'red';
    writeln(z)
\*bend\fP.
.LE
The padded blanks are added on the right.
Thus the assignment above is equivalent to:
.LS
z := 'red          '
.LE
which is standard Pascal.
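.PP
Padding of a value parameter can be sketched in the same way.
The program below is an illustration only; the names
.I pad,
.I name
and
.I show
are invented for the example.
.LS
\*bprogram\fP pad(output);
\*btype\fP name = \*bpacked\fP \*barray\fP [ 1 .. 8 ] \*bof\fP char;
\*bprocedure\fP show(n : name);
\*bbegin\fP
    writeln('[', n, ']')
\*bend\fP;
\*bbegin\fP
    show('red')
\*bend\fP.
.LE
Under the rule above the constant `red' should be extended with five
blanks on the right before it is passed to
.I show.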
.SH
Octal constants, octal and hexadecimal write
.PP
Octal constants may be given as a sequence of octal digits followed
by the character `b' or `B'.
The forms
.LS
write(a:n \*boct\fP)
.LE
and
.LS
write(a:n \*bhex\fP)
.LE
cause the internal representation of
expression
.I a,
which must be Boolean, character, integer, pointer, or a user-defined enumerated
type,
to be written in octal or hexadecimal respectively.
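.PP
As a sketch of these forms, the following program, whose names are chosen
only for illustration, assigns an octal constant and writes it in
decimal, octal and hexadecimal.
The field widths 11 and 8 are simply the defaults listed
for oct and hex later in this appendix; the exact appearance of
the output is not shown because it is not specified here.
.LS
\*bprogram\fP octdemo(output);
\*bvar\fP i : integer;
\*bbegin\fP
    i := 17b;
    writeln(i);
    writeln(i:11 \*boct\fP);
    writeln(i:8 \*bhex\fP)
\*bend\fP.
.LE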
.SH
Assert statement
.PP
An
.B assert
statement causes a
.I Boolean
expression to be evaluated
each time the statement is executed.
A runtime error results if the expression evaluates to
.I false .
The
.B assert
statement is treated as a comment if run-time tests are disabled.
The syntax for
.B assert
is:
.LS
\*bassert\fP <expr>
.LE
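.PP
A typical use, sketched here with invented names, guards a
computation against an argument it cannot handle:
.LS
\*bprogram\fP check(output);
\*bvar\fP n, total : integer;
\*bbegin\fP
    n := 4;
    total := 100;
    \*bassert\fP (n <> 0);
    writeln(total \*bdiv\fP n)
\*bend\fP.
.LE
If
.I n
were zero and run-time tests were enabled, the
.B assert
would cause a runtime error before the division was attempted.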
.br
.ne 8
.NH 2
Resolution of the undefined specifications
.SH
File name \- file variable associations
.PP
Each Pascal file variable is associated with a named
.UX
file.
Except for
.I input
and
.I output,
which are
exceptions to some of the rules, a name can become associated
with a file in any of three ways:
.IP "\ \ \ \ \ 1)" 10
If a global Pascal file variable appears in the
.B program
statement
then it is associated with the
.UX
file of the same name.
.IP "\ \ \ \ \ 2)"
If a file was reset or rewritten using the
extended two-argument form of
.I reset
or
.I rewrite
then the given name
is associated.
.IP "\ \ \ \ \ 3)"
If a file which has never had a
.UX
name associated
is reset or rewritten without specifying a name
via the second argument, then a temporary name
of the form `tmp.x'
is associated with the file.
Temporary names start with
`tmp.1' and continue by incrementing the last character in the
.SM
USASCII
.NL
ordering.
Temporary files are removed automatically
when their scope is exited.
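.PP
The three cases can be sketched in one program.
The file names and identifiers below are assumptions made for the
illustration, and the program presumes that a
.UX
file named `data' exists in the current directory.
.LS
\*bprogram\fP copy(output, data);
\*bvar\fP data, keep, scratch : text;
    c : char;
\*bbegin\fP
    reset(data);                { case 1: the file `data' }
    rewrite(keep, 'copy.out');  { case 2: the given name }
    rewrite(scratch);           { case 3: a temporary name such as tmp.1 }
    \*bwhile\fP \*bnot\fP eof(data) \*bdo\fP \*bbegin\fP
        \*bwhile\fP \*bnot\fP eoln(data) \*bdo\fP \*bbegin\fP
            read(data, c);
            write(keep, c);
            write(scratch, c)
        \*bend\fP;
        readln(data);
        writeln(keep);
        writeln(scratch)
    \*bend\fP
\*bend\fP.
.LE
When the program finishes, `copy.out' should remain, while the temporary
file behind
.I scratch
is removed.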
.SH
The program statement
.PP
The syntax of the
.B program
statement is:
.LS
\*bprogram\fP <id> ( <file id> { , <file id> } ) ;
.LE
The file identifiers (other than
.I input
and
.I output )
must be declared as variables of
.B file
type in the global declaration part.
.SH
The files input and output
.PP
The formal parameters
.I input
and
.I output
are associated with the
.UX
standard input and output and have a
somewhat special status.
The following rules must be noted:
.IP "\ \ \ \ \ 1)" 10
The program heading
.B must
contain the formal parameter
.I output.
If
.I input
is used, explicitly or implicitly, then it must
also be declared here.
.IP "\ \ \ \ \ 2)"
Unlike all other files, the
Pascal files
.I input
and
.I output
must not be defined in a declaration,
as they are automatically declared as:
.LS
\*bvar\fP input, output: text
.LE
.IP "\ \ \ \ \ 3)"
The procedure
.I reset
may be used on
.I input.
If no
.UX
file name has ever been associated with
.I input,
and no file name is given, then an attempt will be made
to `rewind'
.I input.
If this fails, a run time
error will occur.
.I Rewrite
calls to output act as for any other file, except that
.I output
initially has no associated file.
This means that a simple
.LS
rewrite(output)
.LE
associates a temporary name with
.I output.
.SH
Details for files
.PP
If a file other than
.I input
is to be read,
then reading must be initiated by a call to the
procedure
.I reset
which causes the Pascal system to attempt to open the
associated
.UX
file for reading.
If this fails, then a runtime error occurs.
Writing of a file other than
.I output
must be initiated by a
.I rewrite
call,
which causes the Pascal system to create the associated
.UX
file and
to then open the file for writing only.
.SH
Buffering
.PP
The buffering for
.I output
is determined by the value of the
.B b
option
at the end of the
.B program
statement.
If it has its default value 1,
then
.I output
is
buffered in blocks of up to 512 characters,
flushed whenever a writeln occurs
and at each reference to the file
.I input.
If it has the value 0,
.I output
is unbuffered.
Any value of
2 or more gives block buffering without line or
.I input
reference flushing.
All other output files are always buffered in blocks of 512 characters.
All output buffers are flushed when the files are closed at scope exit,
whenever the procedure
.I message
is called, and can be flushed using the
built-in procedure
.I flush.
.PP
An important point for an interactive implementation is the definition
of `input\(ua'.
If
.I input
is a teletype, and the Pascal system reads a character at the beginning
of execution to define `input\(ua', then no prompt could be printed
by the program before the user is required to type some input.
For this reason, `input\(ua' is not defined by the system until its definition
is needed, reading from a file occurring only when necessary.
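.PP
The effect of the
.B b
option on prompting can be sketched as follows.
The program is illustrative only; it selects block buffering with a
comment directive, an assumed placement of the
.B b
option ahead of the
.B program
statement, and therefore flushes the prompt explicitly.
.LS
{$b2 block buffering without automatic flushing}
\*bprogram\fP ask(input, output);
\*bvar\fP n : integer;
\*bbegin\fP
    write('How many? ');
    flush(output);
    readln(n);
    writeln(n)
\*bend\fP.
.LE
With the default value 1 the call to
.I flush
would be unnecessary, since
.I output
is flushed at each reference to
.I input.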
.SH
The character set
.PP
Seven bit
.SM USASCII
is the character set used on
.UX .
The standard Pascal
symbols `and', `or', `not', `<=', `>=', `<>',
and the uparrow `\(ua' (for pointer qualification)
are recognized.\*(dg
.FS
\*(dgOn many terminals and printers, the up arrow is represented
as a circumflex `^'.
These are not distinct characters, but rather different graphic
representations of the same internal codes.
.FE
Less portable are the
synonyms tilde `~'
for
.B not ,
`&' for
.B and ,
and `|' for
.B or .
.PP
Upper and lower case are considered distinct.
Keywords and built-in
.B procedure
and
.B function
names are
composed of all lower case letters.
Thus the identifiers GOTO and GOto are distinct both from each other and
from the keyword
\*bgoto\fP.
The standard type `boolean' is also available as `Boolean'.
.PP
Character strings and constants may be delimited by the character
`\''
or by the character `#';
the latter is sometimes convenient when programs are to be transported.
Note that the `#' character has special meaning
.up
when it is the first character on a line \- see
.I "Multi-file programs"
below.
.SH
The standard types
.PP
The standard type
.I integer
is conceptually defined as
.LS
\*btype\fP integer = minint .. maxint;
.LE
.I Integer
is implemented with 32 bit twos complement arithmetic.
Predefined constants of type
.I integer
are:
.LS
\*bconst\fP maxint = 2147483647; minint = -2147483648;
.LE
.PP
The standard type
.I char
is conceptually defined as
.LS
\*btype\fP char = minchar .. maxchar;
.LE
Built-in character constants are `minchar' and `maxchar', `bell' and `tab';
ord(minchar) = 0, ord(maxchar) = 127.
.PP
The type
.I real
is implemented using 64 bit floating point arithmetic.
The floating point arithmetic is done in `rounded' mode, and
provides approximately 17 digits of precision
with numbers as small as 10 to the negative 38th power and as large as
10 to the 38th power.
.SH
Comments
.PP
Comments can be delimited by either `{' and `}' or by `(*' and `*)'.
If the character `{' appears in a comment delimited by `{' and `}',
a warning diagnostic is printed.
A similar warning will be printed if the sequence `(*' appears in
a comment delimited by `(*' and `*)'.
The restriction implied by this warning is not part of standard Pascal,
but detects many otherwise subtle errors.
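.PP
The kind of error this warning catches can be sketched with a fragment
in which a closing brace has been forgotten; the statements are
invented for the illustration.
.LS
{ set the initial count
count := 0;
{ set the initial total }
total := 0;
.LE
Here the first comment extends to the first `}', so the assignment to
.I count
is silently swallowed.
The second `{' lies inside that comment and draws the warning.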
.SH
Option control
.PP
Options of the translator may be controlled
in two distinct ways.
A number of options may appear on the command line invoking the translator.
These options are given as one or more strings of letters preceded by the
character `\-' and cause the default setting of
each given option to be changed.
This method of communication of options is expected to predominate
for
.UX .
Thus the command
.LS
% \*bpi \-ls foo.p\fR
.LE
translates the file foo.p with the listing option enabled (as it normally
is off), and with only standard Pascal features available.
.PP
If more control over the portions of the program where options are enabled is
required, then option control in comments can and should be used.
The
format for option control in comments is identical to that used in Pascal
6000\-3.4.
One places the character `$' as the first character of the comment
and follows it by a comma separated list of directives.
Thus an equivalent to the command line example given above would be:
.LS
{$l+,s+ listing on, standard Pascal}
.LE
as the first line of the program.
The `l'
option is more appropriately specified on the command line,
since it is extremely unlikely in an interactive environment
that one wants a listing of the program each time it is translated.
.PP
Directives consist of a letter designating the option,
followed either by a `+' to turn the option on, or by a `\-' to turn the
option off.
The
.B b
option takes a single digit instead of
a `+' or `\-'.
.SH
Notes on the listings
.PP
The first page of a listing
includes a banner line indicating the version and date of generation of
.PI .
It also
includes the
.UX
path name supplied for the source file and the date of
last modification of that file.
.PP
Within the body of the listing, lines are numbered consecutively and
correspond to the line numbers for the editor.
Currently, two special
kinds of lines may be used to format the listing:
a line consisting of a form-feed
character, control-l, which causes a page
eject in the listing, and a line with
no characters which causes the line number to be suppressed in the listing,
creating a truly blank line.
These lines thus correspond to `eject' and `space' macros found in many
assemblers.
Non-printing characters are printed as the character `?' in the listing.\*(dg
.FS
\*(dgThe character generated by a control-i indents
to the next `tab stop'.
Tab stops are set every 8 columns in
.UX .
Tabs thus provide a quick way of indenting in the program.
.FE
.SH
Multi-file programs
.PP
It is also possible to prepare programs whose parts are placed in more
than one file.
The files other than the main one are called
.B include
files and have names ending with `.i'.
The contents of an \*binclude\fR file are referenced through a pseudo-statement
of the form:
.LS
#\*binclude\fR "file.i"
.LE
The `#' character must be the first character on the line.
The file name may be delimited with `"' or `\'' characters.
Nested
.B include s
are possible up to 10 deep.
More details are given in sections 5.9 and 5.10.
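.PP
As a sketch, a main file might take a constant from an
.B include
file.
The file name `limits.i' and the identifiers are assumptions made for
the illustration.
The file `limits.i' would contain:
.LS
\*bconst\fP limit = 100;
.LE
and the main file:
.LS
\*bprogram\fP main(output);
#\*binclude\fR "limits.i"
\*bbegin\fP
    writeln(limit)
\*bend\fP.
.LE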
.SH
The standard procedure write
.PP
If no minimum field length parameter is specified
for a
.I write,
the following default
values are assumed:
.KS
.TS
center;
l n.
integer	10
real	22
Boolean	10
char	1
string	length of the string
oct	11
hex	8
.TE
.KE
The end of each line in a text file should be explicitly
indicated by `writeln(f)', where `writeln(output)' may be written
simply as `writeln'.
For
.UX ,
the built-in procedure `page(f)' puts a single
.SM ASCII
form-feed character on the output file.
For programs which are to be transported the filter
.I pcc
can be used to interpret carriage control, as
.UX
does not normally do so.
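.PP
The default field lengths can be seen in a short sketch; the values written are
arbitrary and the program name is invented.
.LS
\*bprogram\fP widths(output);
\*bbegin\fP
    writeln(42);        { integer in a field of 10 }
    writeln(42:3);      { integer in a field of 3 }
    writeln(true);      { Boolean in a field of 10 }
    writeln('x');       { char in a field of 1 }
    writeln('abc')      { string in a field of 3 }
\*bend\fP.
.LE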
.NH 2
Restrictions and limitations
.SH
Files
.PP
Files cannot be members of files or members of dynamically
allocated structures.
.SH
Arrays, sets and strings
.PP
The calculations involving array subscripts and set elements
are done with 16 bit arithmetic.
This
restricts the types over which arrays and sets may be defined.
The lower bound of such a range must be greater than or equal to
\-32768, and the upper bound less than 32768.
In particular, strings may have any length from 1 to 32767 characters,
and sets may contain no more than 32767 elements.
.SH
Line and symbol length
.PP
There is no intrinsic limit on the length of identifiers.
Identifiers
are considered to be distinct if they differ
in any single position over their entire length.
There is a limit, however, on the maximum input
line length.
This limit is quite generous, currently exceeding 160
characters.
.SH
Procedure and function nesting and program size
.PP
At most 20 levels of
.B procedure
and
.B function
nesting are allowed.
There is no fundamental, translator defined limit on the size of the
program which can be translated.
The ultimate limit is supplied by the
hardware and the fact that the \s-2PDP\s0-11 has a 16 bit address space.
If
one runs up against the `ran out of memory' diagnostic the program may yet
translate if smaller procedures are used, as a lot of space is freed
by the translator at the completion of each
.B procedure
or
.B function
in the current
implementation.
.SH
Overflow
.PP
There is currently no checking for overflow on arithmetic operations at
run-time.
.br
.ne 15
.NH 2
Added types, operators, procedures and functions
.SH
Additional predefined types
.PP
The type
.I alfa
is predefined as:
.LS
\*btype\fP alfa = \*bpacked\fP \*barray\fP [ 1..10 ] \*bof\fP \*bchar\fP
.LE
.PP
The type
.I intset
is predefined as:
.LS
\*btype\fP intset = \*bset of\fP 0..127
.LE
In most cases the context of an expression involving a constant
set allows the translator to determine the type of the set, even though the
constant set itself may not uniquely determine this type.
In the
cases where it is not possible to determine the type of the set from
local context, the expression type defaults to a set over the entire base
type unless the base type is integer\*(dg.
.FS
\*(dgThe current translator makes a special case of the construct
`if ... in [ ... ]' and enforces only the more lax restriction
on 16 bit arithmetic given above in this case.
.FE
In the latter case the type defaults to the current
binding of
.I intset,
which must be ``type set of (a subrange of) integer'' at that point.
.PP
Note that if
.I intset
is redefined via:
.LS
\*btype\fP intset = \*bset of\fP 0..58;
.LE
then the default integer set is the implicit
.I intset
of
Pascal 6000\-3.4.
.SH
Additional predefined operators
.PP
The relationals `<' and `>' of proper set
inclusion are available.
With
.I a
and
.I b
sets, note that
.LS
(\*bnot\fR (\fIa\fR < \fIb\fR)) <> (\fIa\fR >= \fIb\fR)
.LE
As an example consider the sets
.I a
= [0,2]
and
.I b
= [1].
The only relation true between these sets is `<>'.
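.PP
The example can be written out as a program, given here as a sketch
with an arbitrarily chosen set type and program name.
.LS
\*bprogram\fP setrel(output);
\*bvar\fP a, b : \*bset of\fP 0 .. 7;
\*bbegin\fP
    a := [0,2];
    b := [1];
    writeln(a < b);
    writeln(a >= b);
    writeln(a <> b)
\*bend\fP.
.LE
Neither set contains the other, so the first two comparisons are
false and only the last line written should be true.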
.SH
Non-standard procedures
.IP argv(i,a) 25
where
.I i
is an integer and
.I a
is a string variable
assigns the (possibly truncated or blank padded)
.I i \|'th
argument
of the invocation of the current
.UX
process to the variable
.I a .
The range of valid
.I i
is
.I 0
to
.I argc\-1 .
.IP date(a)
assigns the current date to the alfa variable
.I a
in the format `dd mmm yy ', where `mmm' is the first
three characters of the month, e.g. `Apr'.
.IP flush(f)
writes the output buffered for Pascal file
.I f
into the associated
.UX
file.
.IP halt
terminates the execution of the program with
a control flow backtrace.
.IP linelimit(f,x)\*(dd
.FS
\*(ddCurrently ignored by
.X .
.FE
with
.I f
a textfile and
.I x
an integer expression
causes
the program to be abnormally terminated if more than
.I x
lines are
written on file
.I f .
If
.I x
is less than 0 then no limit is imposed.
.IP message(x,...)
causes the parameters, which have the format of those
to the
built-in
.B procedure
.I write,
to be written unbuffered on the diagnostic unit 2,
almost always the user's terminal.
.IP null
a procedure of no arguments which does absolutely nothing.
It is useful as a place holder,
and is generated by
.XP
in place of the invisible empty statement.
.IP remove(a)
where
.I a
is a string causes the
.UX
file whose
name is
.I a,
with trailing blanks eliminated, to be removed.
.IP reset(f,a)
where
.I a
is a string causes the file whose name
is
.I a
(with blanks trimmed) to be associated with
.I f
in addition
to the normal function of
.I reset.
.IP rewrite(f,a)
is analogous to `reset' above.
.IP stlimit(i)
where
.I i
is an integer sets the statement limit to be
.I i
statements.
Specifying the
.B p
option to
.I pc
disables statement limit counting.
.IP time(a)
causes the current time in the form `\ hh:mm:ss\ ' to be
assigned to the alfa variable
.I a.
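.PP
Several of these procedures can be combined in a short sketch that
echoes its arguments and the current date and time.
The program name is invented, and each argument is truncated or
padded to the ten characters of an
.I alfa.
.LS
\*bprogram\fP echo(output);
\*bvar\fP i : integer;
    a : alfa;
\*bbegin\fP
    \*bfor\fP i := 0 \*bto\fP argc - 1 \*bdo\fP \*bbegin\fP
        argv(i, a);
        writeln(a)
    \*bend\fP;
    date(a);
    writeln(a);
    time(a);
    writeln(a);
    message('echo: done')
\*bend\fP.
.LE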
.SH
Non-standard functions
.IP argc 25
returns the count of arguments when the Pascal program
was invoked.
.I Argc
is always at least 1.
.IP card(x)
returns the cardinality of the set
.I x,
i.e. the
number of elements contained in the set.
.IP clock
returns an integer which is the number of central processor
milliseconds of user time used by this process.
.IP expo(x)
yields the integer valued exponent of the floating-point
representation of
.I x ;
expo(\fIx\fP) = entier(log2(abs(\fIx\fP))).
.IP random(x)
where
.I x
is a real parameter, evaluated but otherwise
ignored, invokes a linear congruential random number generator.
Successive seeds are generated as (seed*a + c) mod m and
the new random number is a normalization of the seed to the range 0.0 to 1.0;
a is 62605, c is 113218009, and m is
536870912.
The initial seed
is 7774755.
.IP seed(i)
where
.I i
is an integer sets the random number generator seed
to
.I i
and returns the previous seed.
Thus seed(seed(i))
has no effect except to yield value
.I i.
.IP sysclock
an integer function of no arguments returns the number of central processor
milliseconds of system time used by this process.
.IP undefined(x)
a Boolean function.
Its argument is a real number and
it always returns false.
.IP wallclock
an integer function of no arguments returns the time
in seconds since 00:00:00 GMT January 1, 1970.
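.PP
A sketch using the random number generator and the timing functions
follows; the names are invented.
Seeding from
.I wallclock
is a choice made for the illustration, not a requirement.
.LS
\*bprogram\fP dice(output);
\*bvar\fP i, old : integer;
\*bbegin\fP
    old := seed(wallclock);     { old holds the previous seed }
    \*bfor\fP i := 1 \*bto\fP 5 \*bdo\fP
        writeln(random(1.0));   { the argument is ignored }
    writeln(clock, sysclock)
\*bend\fP.
.LE
If the call to
.I seed
is omitted, the initial seed of 7774755 remains in effect, so that every run
should produce the same sequence.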
.NH 2
Remarks on standard and portable Pascal
.PP
It is occasionally desirable to prepare Pascal programs which will be
acceptable at other Pascal installations.
While certain system dependencies are bound to creep in,
judicious design and programming practice can usually eliminate
most of the non-portable usages.
Wirth's
.I "Pascal Report"
concludes with a standard for implementation and program exchange.
.PP
In particular, the following differences may cause trouble when attempting
to transport programs between this implementation and Pascal 6000\-3.4.
Using the
.B s
translator option may serve to indicate many problem areas.\*(dg
.FS
\*(dgThe
.B s
option does not, however, check that identifiers differ
in the first 8 characters.
.I Pi
also does not check the semantics of
.B packed .
.FE
.SH
Features not available in UNIX Pascal
.IP
Formal parameters which are
.B procedure
or
.B function .
.IP
Segmented files and associated functions and procedures.
.IP
The function
.I trunc
with two arguments.
.IP
Arrays whose indices exceed the capacity of 16 bit arithmetic.
.SH
Features available in UNIX Pascal but not in Pascal 6000\-3.4
.IP
The procedures
.I reset
and
.I rewrite
with file names.
.IP
The functions
.I argc,
.I seed,
.I sysclock,
and
.I wallclock.
.IP
The procedures
.I argv,
.I flush,
and
.I remove.
.IP
.I Message
with arguments other than character strings.
.IP
.I Write
with keyword
.B hex .
.IP
The
.B assert
statement.
.SH
Other problem areas
.PP
Sets and strings are more general in
.UP ;
see the restrictions given in
the
Jensen-Wirth
.I "User Manual"
for details on the 6000\-3.4 restrictions.
.PP
The character set differences may cause problems,
especially the use of the function
.I chr,
characters as arguments to
.I ord,
and comparisons of characters,
since the character set ordering
differs between the two machines.
.PP
The Pascal 6000\-3.4 compiler uses a less strict notion of type equivalence.
In
.UP ,
types are considered identical only if they are represented
by the same type identifier.
Thus, in particular, unnamed types are unique
to the variables/fields declared with them.
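.PP
The consequence can be sketched with a declaration fragment; the
identifiers are invented.
.LS
\*btype\fP vector = \*barray\fP [ 1 .. 10 ] \*bof\fP integer;
\*bvar\fP a, b : vector;
    c : \*barray\fP [ 1 .. 10 ] \*bof\fP integer;
    d : \*barray\fP [ 1 .. 10 ] \*bof\fP integer;
.LE
Here
.I a
and
.I b
share the type identifier
.I vector,
so `a := b' is acceptable.
The variables
.I c
and
.I d
have distinct unnamed types, so under the rule above
`c := d' would be rejected, although the declarations look the same.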
.PP
Pascal 6000\-3.4 doesn't recognize our option
flags, so it is wise to
put the control of
.UP
options to the end of option lists or, better
yet, restrict the option list length to one.
.PP
For Pascal 6000\-3.4 the ordering of files in the program statement has
significance.
It is desirable to place
.I input
and
.I output
as the first two files in the
.B program
statement.