.na
.ce
C Changes

1. Long integers

The compiler implements 32-bit integers.
The associated type keyword is `long'.
The word can act rather like an adjective in that
`long int' means a 32-bit integer and `long float'
means the same as `double.'
But plain `long' is a long integer.
Essentially all operations on longs are implemented except that
assignment-type operators do not have values, so
l1+(l2=+l3) won't work.
Neither will l1 = l2 = 0.

Long constants are written with a terminating `l' or `L'.
E.g. "123L" or "0177777777L" or "0X56789abcdL".
The last is a hex constant, which could also have been short;
it is marked by starting with "0X".
Every fixed decimal constant larger than 32767 is taken to
be long, and so are octal or hex constants larger than
0177777 (0Xffff, or 0xFFFF if you like).
A warning is given in such a case since this is actually
an incompatibility with the older compiler.
Where the constant is just used as an initializer or
assigned to something it doesn't matter.
If it is passed to a subroutine
then the routine will not get what it expected.

When a short and a long integer are
operands of an arithmetic operator,
the short is converted to long (with sign extension).
This is true also when a short is assigned to a long.
When a long is assigned to a short integer it
is truncated at the high order end with no notice
of possible loss of significant digits.
This is true as well when a long is added to a pointer
(which includes its usage as a subscript).
The conversion rules for expressions involving
doubles and floats mixed with longs
are the same as those for short integers,
.ul
mutatis mutandis.

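The short/long conversions just described can be sketched in present-day C.
This is an illustration only, under stated assumptions: `int16_t' and
`int32_t' from <stdint.h> stand in for the 16-bit int and the 32-bit long,
and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: int16_t models the 16-bit int, int32_t models long. */

/* Short to long: the value is preserved by sign extension. */
int32_t widen(int16_t s)
{
    return (int32_t)s;
}

/* Long to short: keep only the low-order 16 bits, with no notice.
 * Going through uint16_t keeps the arithmetic well defined. */
int16_t narrow(int32_t l)
{
    uint16_t low = (uint16_t)l;          /* l modulo 65536 */
    return (low < 32768u) ? (int16_t)low : (int16_t)((int32_t)low - 65536);
}
```

For instance narrow(65541) yields 5 and narrow(40000) yields -25536;
in neither case is anything diagnosed.
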
A point to note is that constant expressions involving
longs are not evaluated at compile time,
and may not be used where constants are expected.
Thus

        long x {5000L*5000L};

is illegal;

        long x {5000*5000};

is legal but wrong because the high-order part is lost;
but both

        long x 25000000L;

and

        long x 25.e6;

are correct
and have the same meaning
because the double constant is converted to long at compile time.

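What "the high-order part is lost" amounts to can be worked out in
present-day C. The sketch assumes that a 16-bit multiplication behaves like
a 32-bit product cut down to its low 16 bits; the names `mul16' and `mul32'
are hypothetical.

```c
#include <stdint.h>

/* Assumption: a 16-bit int product is modelled by keeping the
 * low 16 bits of a product computed in 32 bits. */
int16_t mul16(int16_t a, int16_t b)
{
    uint16_t low = (uint16_t)((int32_t)a * (int32_t)b);
    return (low < 32768u) ? (int16_t)low : (int16_t)((int32_t)low - 65536);
}

/* The same product carried out in long arithmetic keeps every digit. */
int32_t mul32(int32_t a, int32_t b)
{
    return a * b;
}
```

Here mul16(5000, 5000) gives 30784, which is 25000000 modulo 65536,
while mul32(5000, 5000) gives the intended 25000000.
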
2. Unsigned integers

A new fundamental data type with keyword `unsigned' is
available. It may be used alone:

        unsigned u;

or as an adjective with `int'

        unsigned int u;

with the same meaning. There are not yet (and possibly never will be)
unsigned longs or chars. The meaning of an unsigned variable is
that of an integer modulo 2^n, where n is 16 on the PDP-11. All
operators whose operands are unsigned produce results consistent
with this interpretation except division and remainder where the
divisor is larger than 32767; then the result is incorrect. The
dividend in an unsigned division may however have any value (i.e.
up to 65535) with correct results. Right shifts of unsigned
quantities are guaranteed to be logical shifts.

When an ordinary integer and an unsigned integer are combined
then the ordinary integer is mapped into an integer mod 2^16 and
the result is unsigned. Thus, for example `u = -1' results in
assigning 65535 to u. This is mathematically reasonable, and
also happens to involve no run-time overhead.

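The modulo-2^16 interpretation can be tried out in present-day C.
The sketch assumes that `uint16_t' from <stdint.h> behaves like the 16-bit
unsigned described here; the function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: uint16_t models the 16-bit `unsigned' of the PDP-11. */

/* Assigning an ordinary integer maps it into the range 0..65535. */
uint16_t to_unsigned(int i)
{
    return (uint16_t)i;
}

/* Addition wraps around modulo 2^16. */
uint16_t add_mod(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Right shifts are logical: zeros enter from the left. */
uint16_t shift_right(uint16_t u, int n)
{
    return (uint16_t)(u >> n);
}
```

So to_unsigned(-1) is 65535, add_mod(65535, 1) is 0,
and shift_right(0x8000, 15) is 1 rather than a sign-filled value.
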
When an unsigned integer is assigned to a plain integer, an
(undiagnosed) overflow occurs when the unsigned integer exceeds
2^15-1.

It is intended that unsigned integers be used in contexts where
previously character pointers were used (artificially and
nonportably) to represent unsigned integers.

3. Block structure.

A sequence of declarations may now appear at the beginning of any
compound statement in {}. The variables declared thereby are
local to the compound statement. Any declarations of the same
name existing before the block was entered are pushed down for
the duration of the block. Just as in functions, as before, auto
variables disappear and lose their values when the block is left;
static variables retain their values. Also according to the same
rules as for the declarations previously allowed at the start of
functions, if no storage class is mentioned in a declaration the
default is automatic.

Implementation of inner-block declarations is such that there is
no run-time cost associated with using them.

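A small sketch of inner-block declarations, written in present-day C
(which puts `=' before an initializer). The function names are
hypothetical and serve only to show the scope rules.

```c
/* Hypothetical function; illustrates an inner-block declaration. */
int block_demo(void)
{
    int x = 1;              /* outer x */
    int seen_inside;

    {
        int x = 2;          /* pushes down the outer x for this block */
        seen_inside = x;
    }

    /* the outer x is visible again and still holds 1 */
    return seen_inside * 10 + x;
}

/* A static declared in an inner block keeps its value between calls. */
int count_calls(void)
{
    int result;

    {
        static int n = 0;
        n = n + 1;
        result = n;
    }
    return result;
}
```

block_demo() returns 21: the inner block saw 2, and the outer x was
untouched. Successive calls of count_calls() return 1, 2, 3 and so on.
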
4. Initialization (part 1)

This compiler properly handles initialization of structures
so the construction

        struct { char name[8]; char type; float val; } x
                { "abc", 'a', 123.4 };

compiles correctly.
In particular it is recognized that the string is supposed
to fill an 8-character array, the `a' goes into a character,
and that the 123.4 must be rounded and placed in a single-precision
cell.
Structures of arrays, arrays of structures, and the like all work;
a more formal description of what is done follows.

<initializer> ::= <element>

<element> ::= <expression> | <element> , <element> |
                { <element> } | { <element> , }

An element is an expression or a comma-separated sequence of
elements possibly enclosed in braces.
In a brace-enclosed
sequence, a comma is optional after the last element.
This very
ambiguous definition is parsed as described below.
"Expression"
must of course be a constant expression within the previous
meaning of the Act.

An initializer for a non-structured scalar is an element with
exactly one expression in it.

An "aggregate" is a structure or an array.
If the initializer
for an aggregate begins with a left brace, then the succeeding
comma-separated sequence of elements initialize the members of
the aggregate.
It is erroneous for the number of members in the
sequence to exceed the number of elements in the aggregate.
If
the sequence has too few members the aggregate is padded.

If the initializer for an aggregate does not begin with a left
brace, then the members of the aggregate are initialized with
successive elements from the succeeding comma-separated sequence.
If the sequence terminates before the aggregate is filled the
aggregate is padded.

The "top level" initializer is the object which initializes an
external object itself, as opposed to one of its members.
The
top level initializer for an aggregate must begin with a left
brace.

If the top-level object being initialized is an array and if its
size is omitted in the declaration, e.g. "int a[]", then the size
is calculated from the number of elements which initialize it.

Short of complete assimilation of this description, there are two
simple approaches to the initialization of complicated objects.
First, observe that it is always legal to initialize any object
with a comma-separated sequence of expressions.
The members of
every structure and array are stored in a specified order, so the
expressions which initialize these members may if desired be laid
out in a row to successively, and recursively, initialize the
members.

Alternatively, the sequences of expressions which initialize
arrays or structures may uniformly be enclosed in braces.

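The two approaches can be compared in a short sketch. It is written in
present-day C, which requires `=' before the initializer and pads with
zeros; the compiler described here omits the `='. The names `item',
`pair', `padded' and `pairs' are hypothetical.

```c
/* Present-day C writes `=' before the initializer.
 * All names below are hypothetical. */
struct item {
    char name[8];
    char type;
    float val;
};

/* Top-level initializer for an aggregate, enclosed in braces. */
struct item x = { "abc", 'a', 123.4f };

/* Size omitted: calculated from the number of initializers (here 4). */
int a[] = { 1, 2, 3, 4 };

/* Too few initializers: the remaining elements are padded (with zeros). */
int padded[5] = { 1, 2 };

/* Array of structures with the inner braces left out: the expressions
 * are laid out in a row and fill the members in storage order. */
struct pair { int first; int second; };
struct pair pairs[2] = { 1, 2, 3, 4 };
```

The last declaration has the same effect as the uniformly braced form
{ { 1, 2 }, { 3, 4 } }; some present-day compilers warn about the
missing inner braces but accept it.
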
5. Initialization (part 2)

Declarations, whether external, at the head of functions, or
in inner blocks may have initializations whose syntax is the same
as previous external declarations with initializations. The only
restrictions are that automatic structures and arrays may not be
initialized (they can't be assigned either); nor, for the moment
at least, may external variables when declared inside a function.

The declarations and initializations should be thought of as
occurring in lexical order so that forward references in
initializations are unlikely to work. E.g.,

        { int a a;
          int b c;
          int c 5;
          ...
        }

Here a is initialized by itself (and its value is thus
undefined); b is initialized with the old value of c (which is
either undefined or any c declared in an outer block).

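The case of "any c declared in an outer block" can be sketched in
present-day C (with `=' before each initializer). The function name is
hypothetical, and the self-initialized a of the example is left out
because reading it would be undefined.

```c
/* Hypothetical function; the outer c plays the role of
 * "any c declared in an outer block". */
int lexical_order(void)
{
    int c = 9;              /* outer c */
    int b_seen;

    {
        int b = c;          /* inner c not yet declared: this is the outer c */
        int c = 5;          /* from here on, c means the inner variable */
        b_seen = b * 10 + c;
    }
    return b_seen;          /* 9 * 10 + 5 */
}
```

lexical_order() returns 95: b picked up the outer value 9, not the 5
that appears one line later.
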
6. Bit fields

A declarator inside a structure may have the form

        <declarator> : <constant>

which specifies that the object declared is stored in a field
whose width in bits is specified by the constant.
If several such things are stacked up next to each other
then the compiler allocates the fields from right to left,
going to the next word
when the new field will not fit.
The declarator may also have the form

        : <constant>

which allocates an unnamed field to simplify accurate
modelling of things like hardware formats where there are unused
fields.
Finally,

        : 0

means to force the next field to start on a word boundary.

The types of bit fields can be only "int" or "char".
The only difference between the two
is in the alignment and length restrictions:
no int field can be longer than 16 bits, nor any char longer
than 8 bits.
If a char field will not fit into the current character,
then it is moved up to the next character boundary.

Both int and char fields
are taken to be unsigned (non-negative)
integers.

Bit-field variables are not quite full-class citizens.
Although most operators can be applied to them,
including assignment operators,
they do not have addresses (i.e. there are no bit pointers)
so the unary & operator cannot be applied to them.
For essentially this reason there are no arrays of bit field
variables.

There are two woes in the implementation:
addition (=+) applied to fields
can result in an overflow into the next field;
it is not possible to initialize bit fields.

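A sketch of the three declarator forms in present-day C follows.
Several things in it are assumptions rather than properties of the
compiler described here: the structure and its field names are
hypothetical, fields are declared `unsigned int' because a present-day
compiler does not take plain int fields to be unsigned, and the
allocation order is whatever the present-day compiler chooses.

```c
/* A hypothetical device register layout; names and widths are
 * assumptions for illustration only. */
struct devreg {
    unsigned int unit  : 3;   /* named field, 3 bits wide */
    unsigned int       : 4;   /* unnamed field: unused bits */
    unsigned int ready : 1;
    unsigned int       : 0;   /* force the next field onto a new boundary */
    unsigned int count : 8;
};

/* Store v into the 3-bit field and read it back. A value too large
 * for the field is reduced modulo 8 and leaves its neighbours alone. */
unsigned int unit_after_store(unsigned int v)
{
    struct devreg r = { 0 };

    r.unit = v;
    return r.unit;
}
```

unit_after_store(5) returns 5 and unit_after_store(9) returns 1.
A present-day compiler confines the excess to the field itself, unlike
the overflow into the next field noted above.
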
7. Macro preprocessor

The preprocessor handles `define' statements with formal arguments.
The line

        #define macro(a1,...,an) ...a1...an...

is recognized by the presence of a left parenthesis
following the defined name.
When the form

        macro(b1,...,bn)

is recognized in normal C program text,
it is replaced by the definition, with the corresponding
.ul
bi
actual argument string substituted for the corresponding
.ul
ai
formal arguments.
Both actual and formal arguments are separated by
commas not included in parentheses; the formal arguments
have the syntax of names.

Macro expansions are no longer surrounded by spaces.
Lines in which a replacement has taken place are rescanned until
no macros remain.

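A brief sketch of macros with formal arguments. The macro and function
names are hypothetical; the extra parentheses in the definitions are
ordinary caution, since the actual argument strings are substituted
textually.

```c
/* Hypothetical macros with formal arguments. */
#define max(a, b)   ((a) > (b) ? (a) : (b))
#define square(x)   ((x) * (x))

int add(int p, int q)
{
    return p + q;
}

int macro_demo(void)
{
    /* The comma inside add(1, 2) is enclosed in parentheses, so max
     * receives two actual arguments, not three. The nested square()
     * is expanded as well. */
    return max(add(1, 2), square(2));
}
```

macro_demo() returns 4, the larger of 3 and 4, and square(1 + 2)
yields 9 because the argument string is parenthesized in the definition.
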
The preprocessor has a rudimentary conditional facility.
A line of the form

        #ifdef name

is ignored if
`name' is defined to the preprocessor
(i.e. was the subject of a `define' line).
If name is not defined then all lines through
a line of the form

        #endif

are ignored.
A corresponding
form is

        #ifndef name
        ...
        #endif

which ignores the intervening lines if `name' is defined.
The name `unix' is predefined and replaced by itself
to aid writers of C programs which are expected to be transported
to other machines with C compilers.

In connection with this, there is a new option to the cc command:

        cc -Dname

which causes `name' to be defined to the preprocessor (and replaced by
itself).
This can be used together with conditional preprocessor
statements to select variant versions of a program at compile time.

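Selecting a variant at compile time might look like the sketch below.
The names VARIANT_B, VARIANT_NAME and variant() are hypothetical; the
`define' line at the top stands in for what `cc -DVARIANT_B' would
supply from the command line.

```c
/* VARIANT_B is a hypothetical name; it could be supplied as
 * `cc -DVARIANT_B' instead of being defined here. */
#define VARIANT_B

#ifdef VARIANT_B
#define VARIANT_NAME "b"      /* kept: VARIANT_B is defined */
#endif

#ifndef VARIANT_B
#define VARIANT_NAME "a"      /* ignored: VARIANT_B is defined */
#endif

const char *variant(void)
{
    return VARIANT_NAME;
}
```

With VARIANT_B defined, variant() returns "b"; removing the define
(and not passing -DVARIANT_B) selects "a" instead.
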
The previous two facilities (macros with arguments, conditional
compilation)
were actually available in the 6th Edition system, but
undocumented.
New in this release of the cc command is the ability to
nest `include' files.
Preprocessor include lines may have the new form

        #include <file>

where the angle brackets replace double quotes.
In this case, the file name is prepended with a standard prefix,
namely `/usr/include'.
It is intended that commonly-used include files be placed
in this directory;
the convention reduces the dependence on system-specific
naming conventions.
The standard prefix can be replaced by
the cc command option `-I':

        cc -Iotherdirectory

8. Registers

A formal argument may be given the storage class `register.'
When this occurs the save sequence copies it
from the place
the caller left it into a fast register;
all usual restrictions on its use are the same
as for ordinary register variables.

Now any variable inside a function may be declared `register;'
if the type is unsuitable, or if
there are more than three register declarations,
then the compiler makes it `auto' instead.
The restriction that the & operator may not be applied
to a register remains.

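A minimal sketch of a register formal argument and a register local,
in present-day C. The routine name is hypothetical, and whether either
variable really ends up in a register is up to the compiler.

```c
/* Hypothetical routine: a formal argument and a local both declared
 * `register'. The declaration is a request, not a guarantee. */
int sum_to(register int n)
{
    register int total = 0;

    while (n > 0) {
        total += n;
        n--;
    }
    /* &n or &total would be rejected: a register has no address */
    return total;
}
```

sum_to(4) returns 10. Apart from the ban on &, the variables are used
exactly as auto variables would be.
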
9. Mode declarations

A declaration of the form

        typedef type-specifier declarator ;

makes the name given in the declarator into the equivalent
of a keyword specifying the type which the name would have
in an ordinary declaration.
Thus

        typedef int *iptr;

makes `iptr' usable in declarations of pointers to integers;
subsequently the declarations

        iptr ip;
.br
        int *ip;

would mean the same thing.
Type names introduced in this way
obey the same scope rules as ordinary variables.
The facility is new, experimental, and probably buggy.

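That the two declarations mean the same thing can be checked with a
short sketch; the function name and the variables other than `iptr'
are hypothetical.

```c
typedef int *iptr;          /* iptr now names the type "pointer to int" */

/* Hypothetical function: both pointers have the same type, so one
 * can be assigned to the other without conversion. */
int typedef_demo(void)
{
    int v = 41;
    iptr ip = &v;           /* declared through the typedef name */
    int *plain = ip;        /* declared the ordinary way */

    *ip = *ip + 1;          /* both names refer to the same integer */
    return *plain;
}
```

typedef_demo() returns 42, the value stored through ip and read back
through plain.
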
10. Restrictions

The compiler is somewhat stickier about
some constructions that used to be accepted.

One difference is that external declarations made inside
functions are remembered to the end of the file,
that is even past the end of the function.
The most frequent problem that this causes is that
implicit declaration of a function as an integer in one
routine,
and subsequent explicit declaration
of it as another type,
is not allowed.
This turned out to affect
several source programs
distributed with the system.

It is now required that all forward references to labels
inside a function be the subject of a `goto.'
This has turned out to affect mainly people who
pass a label to the routine `setexit.'
In fact a routine is supposed to be passed here,
and why a label worked I do not know.

In general this compiler makes it more difficult
to use label variables.
Think of this as a contribution to structured programming.

The compiler now checks multiple declarations of the same name
more carefully for consistency.
It used to be possible to declare the same name to
be a pointer to different structures;
this is caught.
So too are declarations of the same array as having different
sizes.
The exception is that array declarations with empty brackets
may be used in conjunction with a declaration with a specified size.
Thus

        int a[];
        int a[50];

is acceptable (in either order).

An external array all of whose definitions
involve empty brackets is diagnosed as `undefined'
by the loader;
it used to be taken as having 1 element.
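
The pairing of an empty-bracket declaration with a sized one still
works in present-day C, as the sketch below suggests. The names `table'
and `table_elements' are hypothetical, and the behaviour shown is that
of a present-day compiler, not necessarily of the one described here.

```c
/* File-scope declarations of the same array, one with empty brackets
 * and one with a size. The name `table' is hypothetical. */
int table[];
int table[50];

/* After the sized declaration the array is known to have 50 elements. */
unsigned long table_elements(void)
{
    return (unsigned long)(sizeof table / sizeof table[0]);
}
```

table_elements() returns 50; declaring the array a third time with a
different size would be rejected as inconsistent.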