Commit | Line | Data |
---|---|---|
05841234 TL |
1 | .na |
2 | .ce | |
3 | C Changes | |
4 | ||
5 | 1. Long integers | |
6 | ||
7 | The compiler implements 32-bit integers. | |
8 | The associated type keyword is `long'. | |
9 | The word can act rather like an adjective in that | |
10 | `long int' means a 32-bit integer and `long float' | |
11 | means the same as `double.' | |
12 | But plain `long' is a long integer. | |
13 | Essentially all operations on longs are implemented except that | |
14 | assignment-type operators do not have values, so | |
15 | l1+(l2=+l3) won't work. | |
16 | Neither will l1 = l2 = 0. | |
17 | ||
18 | Long constants are written with a terminating `l' or `L'. | |
19 | E.g. "123L" or "0177777777L" or "0X56789abcdL". | |
20 | The last is a hex constant, which could also have been short; | |
21 | it is marked by starting with "0X". | |
22 | Every fixed decimal constant larger than 32767 is taken to | |
23 | be long, and so are octal or hex constants larger than | |
24 | 0177777 (0Xffff, or 0xFFFF if you like). | |
25 | A warning is given in such a case since this is actually | |
26 | an incompatibility with the older compiler. | |
27 | Where the constant is just used as an initializer or | |
28 | assigned to something it doesn't matter. | |
29 | If it is passed to a subroutine | |
30 | then the routine will not get what it expected. | |
31 | ||
32 | When a short and a long integer are | |
33 | operands of an arithmetic operator, | |
34 | the short is converted to long (with sign extension). | |
35 | This is true also when a short is assigned to a long. | |
36 | When a long is assigned to a short integer it | |
37 | is truncated at the high order end with no notice | |
38 | of possible loss of significant digits. | |
39 | This is true as well when a long is added to a pointer | |
40 | (which includes its usage as a subscript). | |
41 | The conversion rules for expressions involving | |
42 | doubles and floats mixed with longs | |
43 | are the same as those for short integers, | |
44 | .ul | |
45 | mutatis mutandis. | |
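The short/long conversion rules above can be sketched in present-day C. This is an illustration only: it assumes the fixed-width types of <stdint.h> stand in for the 16-bit int and 32-bit long described here, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Short to long: the value is preserved, so the sign bit is extended. */
int32_t widen(int16_t s)
{
    return (int32_t)s;
}

/* Long to short: only the low-order 16 bits survive, with no diagnostic.
   The mask is applied to unsigned values so the result is well defined. */
uint16_t low_word(int32_t l)
{
    return (uint16_t)((uint32_t)l & 0xFFFFu);
}

void check_conversions(void)
{
    assert(widen(-2) == -2);                    /* sign extension */
    assert(low_word(65537L) == 1);              /* high-order part dropped */
    assert(low_word(0x12345678L) == 0x5678);
}
```

The truncation is silent in the sketch too: nothing distinguishes 65537 from 1 once the high-order word is gone.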
46 | ||
47 | A point to note is that constant expressions involving | |
48 | longs are not evaluated at compile time, | |
49 | and may not be used where constants are expected. | |
50 | Thus | |
51 | ||
52 | long x {5000L*5000L}; | |
53 | ||
54 | is illegal; | |
55 | ||
56 | long x {5000*5000}; | |
57 | ||
58 | is legal but wrong because the high-order part is lost; | |
59 | but both | |
60 | ||
61 | long x 25000000L; | |
62 | ||
63 | and | |
64 | ||
65 | long x 25.e6; | |
66 | ||
67 | are correct | |
68 | and have the same meaning | |
69 | because the double constant is converted to long at compile time. | |
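The loss of the high-order part in 5000*5000 can be checked by hand. The following is a sketch in present-day C, assuming a 16-bit word is modelled by uint16_t; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Product as a 16-bit machine would keep it: low-order 16 bits only. */
uint16_t product16(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}

/* Product carried out in 32 bits. */
int32_t product32(int32_t a, int32_t b)
{
    return a * b;
}

void check_products(void)
{
    assert(product32(5000, 5000) == 25000000L);
    assert(product16(5000, 5000) == 30784);     /* 25000000 mod 65536 */
    assert((int32_t)25.e6 == 25000000L);        /* double constant converted */
}
```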
70 | ||
71 | 2. Unsigned integers | |
72 | ||
73 | A new fundamental data type, with keyword `unsigned', is | |
74 | available. It may be used alone: | |
75 | ||
76 | unsigned u; | |
77 | ||
78 | or as an adjective with `int' | |
79 | ||
80 | unsigned int u; | |
81 | ||
82 | with the same meaning. There are not yet (and possibly never will be) | |
83 | unsigned longs or chars. The meaning of an unsigned variable is | |
84 | that of an integer modulo 2^n, where n is 16 on the PDP-11. All | |
85 | operators whose operands are unsigned produce results consistent | |
86 | with this interpretation except division and remainder where the | |
87 | divisor is larger than 32767; then the result is incorrect. The | |
88 | dividend in an unsigned division may however have any value (i.e. | |
89 | up to 65535) with correct results. Right shifts of unsigned | |
90 | quantities are guaranteed to be logical shifts. | |
91 | ||
92 | When an ordinary integer and an unsigned integer are combined | |
93 | then the ordinary integer is mapped into an integer mod 2^16 and | |
94 | the result is unsigned. Thus, for example `u = -1' results in | |
95 | assigning 65535 to u. This is mathematically reasonable, and | |
96 | also happens to involve no run-time overhead. | |
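The arithmetic modulo 2^16 can be tried out in present-day C. This is a sketch only, assuming uint16_t models the 16-bit unsigned of the PDP-11; the helper names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Map an ordinary integer into the integers mod 2^16. */
uint16_t to_unsigned16(int value)
{
    return (uint16_t)value;
}

/* Right shift of an unsigned quantity: zeros enter from the left. */
uint16_t shift_right(uint16_t u, unsigned count)
{
    return (uint16_t)(u >> count);
}

void check_unsigned(void)
{
    assert(to_unsigned16(-1) == 65535u);
    assert((uint16_t)(to_unsigned16(-1) + 1) == 0);   /* wraps mod 2^16 */
    assert(shift_right(0x8000u, 15) == 1);            /* logical shift */
}
```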
97 | ||
98 | When an unsigned integer is assigned to a plain integer, an | |
99 | (undiagnosed) overflow occurs when the unsigned integer exceeds | |
100 | 2^15-1. | |
101 | ||
102 | It is intended that unsigned integers be used in contexts where | |
103 | previously character pointers were used (artificially and | |
104 | nonportably) to represent unsigned integers. | |
105 | ||
106 | 3. Block structure | |
107 | ||
108 | A sequence of declarations may now appear at the beginning of any | |
109 | compound statement in {}. The variables declared thereby are | |
110 | local to the compound statement. Any declarations of the same | |
111 | name existing before the block was entered are pushed down for | |
112 | the duration of the block. Just as in functions, as before, auto | |
113 | variables disappear and lose their values when the block is left; | |
114 | static variables retain their values. Also according to the same | |
115 | rules as for the declarations previously allowed at the start of | |
116 | functions, if no storage class is mentioned in a declaration the | |
117 | default is automatic. | |
118 | ||
119 | Implementation of inner-block declarations is such that there is | |
120 | no run-time cost associated with using them. | |
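A small sketch of the block rules, written in present-day C (which kept them); the function names are made up for the illustration.

```c
#include <assert.h>

/* The inner x pushes down the outer x for the duration of the block. */
int shadowed(void)
{
    int x = 1;
    int inner;
    {
        int x = 2;
        inner = x;
    }
    return 10 * x + inner;      /* outer x is visible again here */
}

/* A static declared in an inner block keeps its value between calls. */
int next_ticket(void)
{
    int ticket;
    {
        static int count = 0;
        ticket = ++count;
    }
    return ticket;
}

void check_blocks(void)
{
    int first = next_ticket();

    assert(shadowed() == 12);
    assert(next_ticket() == first + 1);
}
```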
121 | ||
122 | 4. Initialization (part 1) | |
123 | ||
124 | This compiler properly handles initialization of structures | |
125 | so the construction | |
126 | ||
127 | struct { char name[8]; char type; float val; } x | |
128 | { "abc", 'a', 123.4 }; | |
129 | ||
130 | compiles correctly. | |
131 | In particular it is recognized that the string is supposed | |
132 | to fill an 8-character array, the `a' goes into a character, | |
133 | and that the 123.4 must be rounded and placed in a single-precision | |
134 | cell. | |
135 | Structures of arrays, arrays of structures, and the like all work; | |
136 | a more formal description of what is done follows. | |
137 | ||
138 | <initializer> ::= <element> | |
139 | ||
140 | <element> ::= <expression> | <element> , <element> | | |
141 | { <element> } | { <element> , } | |
142 | ||
143 | An element is an expression or a comma-separated sequence of | |
144 | elements possibly enclosed in braces. | |
145 | In a brace-enclosed | |
146 | sequence, a comma is optional after the last element. | |
147 | This very | |
148 | ambiguous definition is parsed as described below. | |
149 | "Expression" | |
150 | must of course be a constant expression within the previous | |
151 | meaning of the Act. | |
152 | ||
153 | An initializer for a non-structured scalar is an element with | |
154 | exactly one expression in it. | |
155 | ||
156 | An "aggregate" is a structure or an array. | |
157 | If the initializer | |
158 | for an aggregate begins with a left brace, then the succeeding | |
159 | comma-separated sequence of elements initialize the members of | |
160 | the aggregate. | |
161 | It is erroneous for the number of members in the | |
162 | sequence to exceed the number of elements in the aggregate. | |
163 | If | |
164 | the sequence has too few members the aggregate is padded. | |
165 | ||
166 | If the initializer for an aggregate does not begin with a left | |
167 | brace, then the members of the aggregate are initialized with | |
168 | successive elements from the succeeding comma-separated sequence. | |
169 | If the sequence terminates before the aggregate is filled the | |
170 | aggregate is padded. | |
171 | ||
172 | The "top level" initializer is the object which initializes an | |
173 | external object itself, as opposed to one of its members. | |
174 | The | |
175 | top level initializer for an aggregate must begin with a left | |
176 | brace. | |
177 | ||
178 | If the top-level object being initialized is an array and if its | |
179 | size is omitted in the declaration, e.g. "int a[]", then the size | |
180 | is calculated from the number of elements which initialized it. | |
181 | ||
182 | Short of complete assimilation of this description, there are two | |
183 | simple approaches to the initialization of complicated objects. | |
184 | First, observe that it is always legal to initialize any object | |
185 | with a comma-separated sequence of expressions. | |
186 | The members of | |
187 | every structure and array are stored in a specified order, so the | |
188 | expressions which initialize these members may if desired be laid | |
189 | out in a row to successively, and recursively, initialize the | |
190 | members. | |
191 | ||
192 | Alternatively, the sequences of expressions which initialize | |
193 | arrays or structures may uniformly be enclosed in braces. | |
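The two approaches can be compared side by side. The sketch below is in present-day C, which requires the `=' that the declarations above omit and which pads with zeros (an assumption about the padding value; the text above does not say). The array names are invented for the example.

```c
#include <assert.h>

struct item { char name[8]; char type; float val; };

struct item x = { "abc", 'a', 123.4 };

/* The same two-dimensional array, once laid out in a row and once braced. */
int flat[2][3]   = { 1, 2, 3, 4, 5, 6 };
int braced[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };

int padded[4] = { 1, 2 };        /* too few members: the rest is padded */
int counted[] = { 1, 2, 3, };    /* size taken from the initializer */

void check_initializers(void)
{
    assert(x.name[2] == 'c' && x.name[3] == '\0' && x.type == 'a');
    assert(flat[1][0] == 4 && braced[1][0] == 4);
    assert(padded[2] == 0 && padded[3] == 0);
    assert(sizeof counted / sizeof counted[0] == 3);
}
```

A current compiler may warn about the missing braces in the row form, but it accepts it.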
194 | ||
195 | 5. Initialization (part 2) | |
196 | ||
197 | Declarations, whether external, at the head of functions, or | |
198 | in inner blocks may have initializations whose syntax is the same | |
199 | as previous external declarations with initializations. The only | |
200 | restrictions are that automatic structures and arrays may not be | |
201 | initialized (they can't be assigned either); nor, for the moment | |
202 | at least, may external variables when declared inside a function. | |
203 | ||
204 | The declarations and initializations should be thought of as | |
205 | occurring in lexical order so that forward references in | |
206 | initializations are unlikely to work. E.g., | |
207 | ||
208 | { int a a; | |
209 | int b c; | |
210 | int c 5; | |
211 | ... | |
212 | } | |
213 | ||
214 | Here a is initialized by itself (and its value is thus | |
215 | undefined); b is initialized with the old value of c (which is | |
216 | either undefined or any c declared in an outer block). | |
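The well-defined half of this example, b taking the value of a c declared in an outer block, can be sketched in present-day C as follows; the function name is hypothetical.

```c
#include <assert.h>

int lexical_order(void)
{
    int c = 7;              /* c of the outer block */
    {
        int b = c;          /* inner c not yet declared: outer c is used */
        int c = 5;

        return 10 * b + c;
    }
}

void check_lexical_order(void)
{
    assert(lexical_order() == 75);
}
```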
217 | ||
218 | 6. Bit fields | |
219 | ||
220 | A declarator inside a structure may have the form | |
221 | ||
222 | <declarator> : <constant> | |
223 | ||
224 | which specifies that the object declared is stored in a field | |
225 | the number of bits in which is specified by the constant. | |
226 | If several such things are stacked up next to each other | |
227 | then the compiler allocates the fields from right to left, | |
228 | going to the next word | |
229 | when the new field will not fit. | |
230 | The declarator may also have the form | |
231 | ||
232 | : <constant> | |
233 | ||
234 | which allocates an unnamed field to simplify accurate | |
235 | modelling of things like hardware formats where there are unused | |
236 | fields. | |
237 | Finally, | |
238 | ||
239 | : 0 | |
240 | ||
241 | means to force the next field to start on a word boundary. | |
242 | ||
243 | The types of bit fields can be only "int" or "char". | |
244 | The only difference between the two | |
245 | is in the alignment and length restrictions: | |
246 | no int field can be longer than 16 bits, nor any char longer | |
247 | than 8 bits. | |
248 | If a char field will not fit into the current character, | |
249 | then it is moved up to the next character boundary. | |
250 | ||
251 | Both int and char fields | |
252 | are taken to be unsigned (non-negative) | |
253 | integers. | |
254 | ||
255 | Bit-field variables are not quite full-class citizens. | |
256 | Although most operators can be applied to them, | |
257 | including assignment operators, | |
258 | they do not have addresses (i.e. there are no bit pointers) | |
259 | so the unary & operator cannot be applied to them. | |
260 | For essentially this reason there are no arrays of bit field | |
261 | variables. | |
262 | ||
263 | There are two woes in the implementation: | |
264 | addition (=+) applied to fields | |
265 | can result in an overflow into the next field; | |
266 | it is not possible to initialize bit fields. | |
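The overflow into a neighbouring field is easy to picture with a hand-written model. The sketch below is an assumption-laden illustration in present-day C, not the compiler's own code: it packs two hypothetical fields from right to left into a 16-bit word and performs the addition on the whole word without masking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, allocated from right to left in a 16-bit word:
   field a occupies bits 0-2, field b occupies bits 3-7. */
enum { A_SHIFT = 0, A_WIDTH = 3, B_SHIFT = 3, B_WIDTH = 5 };

unsigned get_field(uint16_t word, int shift, int width)
{
    return (word >> shift) & ((1u << width) - 1u);
}

uint16_t set_field(uint16_t word, int shift, int width, unsigned value)
{
    uint16_t mask = (uint16_t)(((1u << width) - 1u) << shift);

    return (uint16_t)((word & ~mask) | ((value << shift) & mask));
}

/* Addition carried out on the whole word, without masking:
   a carry out of the field spills into its left-hand neighbour. */
uint16_t add_unmasked(uint16_t word, int shift, unsigned n)
{
    return (uint16_t)(word + (n << shift));
}

void check_fields(void)
{
    uint16_t w = 0;

    w = set_field(w, A_SHIFT, A_WIDTH, 7);
    w = set_field(w, B_SHIFT, B_WIDTH, 1);
    assert(get_field(w, A_SHIFT, A_WIDTH) == 7);
    assert(get_field(w, B_SHIFT, B_WIDTH) == 1);

    w = add_unmasked(w, A_SHIFT, 1);                /* like a =+ 1 */
    assert(get_field(w, A_SHIFT, A_WIDTH) == 0);    /* a wrapped */
    assert(get_field(w, B_SHIFT, B_WIDTH) == 2);    /* b was disturbed */
}
```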
267 | ||
268 | 7. Macro preprocessor | |
269 | ||
270 | The preprocessor handles `define' statements with formal arguments. | |
271 | The line | |
272 | ||
273 | #define macro(a1,...,an) ...a1...an... | |
274 | ||
275 | is recognized by the presence of a left parenthesis | |
276 | following the defined name. | |
277 | When the form | |
278 | ||
279 | macro(b1,...,bn) | |
280 | ||
281 | is recognized in normal C program text, | |
282 | it is replaced by the definition, with the corresponding | |
283 | .ul | |
284 | bi | |
285 | actual argument string substituted for the corresponding | |
286 | .ul | |
287 | ai | |
288 | formal arguments. | |
289 | Both actual and formal arguments are separated by | |
290 | commas not included in parentheses; the formal arguments | |
291 | have the syntax of names. | |
292 | ||
293 | Macro expansions are no longer surrounded by spaces. | |
294 | Lines in which a replacement has taken place are rescanned until | |
295 | no macros remain. | |
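A short sketch of macros with arguments, in present-day C; the macro and function names are invented for the example.

```c
#include <assert.h>

#define MAX(a, b)      ((a) > (b) ? (a) : (b))
#define MAX3(a, b, c)  MAX(MAX(a, b), c)    /* rescanned until no macros remain */

int add(int p, int q)
{
    return p + q;
}

int largest(int a, int b, int c)
{
    return MAX3(a, b, c);
}

void check_macros(void)
{
    /* The comma inside add(1, 2) is within parentheses,
       so it does not separate macro arguments. */
    assert(MAX(add(1, 2), 2) == 3);
    assert(largest(1, 5, 3) == 5);
}
```

The parentheses around each formal argument in the definition are a precaution: the actual argument strings are substituted as text.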
296 | ||
297 | The preprocessor has a rudimentary conditional facility. | |
298 | A line of the form | |
299 | ||
300 | #ifdef name | |
301 | ||
302 | is ignored if | |
303 | `name' is defined to the preprocessor | |
304 | (i.e. was the subject of a `define' line). | |
305 | If name is not defined then all lines through | |
306 | a line of the form | |
307 | ||
308 | #endif | |
309 | ||
310 | are ignored. | |
311 | A corresponding | |
312 | form is | |
313 | ||
314 | #ifndef name | |
315 | ... | |
316 | #endif | |
317 | ||
318 | which ignores the intervening lines if `name' is defined. | |
319 | The name `unix' is predefined and replaced by itself | |
320 | to aid writers of C programs which are expected to be transported | |
321 | to other machines with C compilers. | |
322 | ||
323 | In connection with this, there is a new option to the cc command: | |
324 | ||
325 | cc -Dname | |
326 | ||
327 | which causes `name' to be defined to the preprocessor (and replaced by | |
328 | itself). | |
329 | This can be used together with conditional preprocessor | |
330 | statements to select variant versions of a program at compile time. | |
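Selecting a variant might look like the sketch below. BIG_BUFFERS and BUF_SIZE are hypothetical names chosen for the illustration; compiling with cc -DBIG_BUFFERS would select the larger value.

```c
#include <assert.h>

/* BIG_BUFFERS is a hypothetical name, defined (or not) on the cc command line. */
#ifdef BIG_BUFFERS
#define BUF_SIZE 4096
#endif

#ifndef BIG_BUFFERS
#define BUF_SIZE 512
#endif

int buffer_size(void)
{
    return BUF_SIZE;
}

void check_variant(void)
{
#ifdef BIG_BUFFERS
    assert(buffer_size() == 4096);
#endif
#ifndef BIG_BUFFERS
    assert(buffer_size() == 512);
#endif
}
```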
331 | ||
332 | The previous two facilities (macros with arguments, conditional | |
333 | compilation) | |
334 | were actually available in the 6th Edition system, but | |
335 | undocumented. | |
336 | New in this release of the cc command is the ability to | |
337 | nest `include' files. | |
338 | Preprocessor include lines may have the new form | |
339 | ||
340 | #include <file> | |
341 | ||
342 | where the angle brackets replace double quotes. | |
343 | In this case, the file name is prepended with a standard prefix, | |
344 | namely `/usr/include'. | |
345 | It is intended that commonly-used include files be placed | |
346 | in this directory; | |
347 | the convention reduces the dependence on system-specific | |
348 | naming conventions. | |
349 | The standard prefix can be replaced by | |
350 | the cc command option `-I': | |
351 | ||
352 | cc -Iotherdirectory | |
353 | ||
354 | 8. Registers | |
355 | ||
356 | A formal argument may be given the storage class `register.' | |
357 | When this occurs the save sequence copies it | |
358 | from the place | |
359 | the caller left it into a fast register; | |
360 | all usual restrictions on its use are the same | |
361 | as for ordinary register variables. | |
362 | ||
363 | Now any variable inside a function may be declared `register;' | |
364 | if the type is unsuitable, or if | |
365 | there are more than three register declarations, | |
366 | then the compiler makes it `auto' instead. | |
367 | The restriction that the & operator may not be applied | |
368 | to a register remains. | |
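A sketch of a register formal argument alongside a register local, in present-day C (where the keyword is only a hint); the function name is made up.

```c
#include <assert.h>

/* Both the formal argument and the local are register candidates. */
long sum_to(register int n)
{
    register long total = 0;

    while (n > 0)
        total += n--;
    return total;               /* &n and &total would be rejected */
}

void check_registers(void)
{
    assert(sum_to(100) == 5050L);
    assert(sum_to(0) == 0L);
}
```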
369 | ||
370 | 9. Mode declarations | |
371 | ||
372 | A declaration of the form | |
373 | ||
374 | typedef type-specifier declarator ; | |
375 | ||
376 | makes the name given in the declarator into the equivalent | |
377 | of a keyword specifying the type which the name would have | |
378 | in an ordinary declaration. | |
379 | Thus | |
380 | ||
381 | typedef int *iptr; | |
382 | ||
383 | makes `iptr' usable in declarations of pointers to integers; | |
384 | subsequently the declarations | |
385 | ||
386 | iptr ip; | |
387 | .br | |
388 | int *ip; | |
389 | ||
390 | would mean the same thing. | |
391 | Type names introduced in this way | |
392 | obey the same scope rules as ordinary variables. | |
393 | The facility is new, experimental, and probably buggy. | |
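That the two declarations mean the same thing can be seen in a small sketch, given here in present-day C; the function and variable names are hypothetical.

```c
#include <assert.h>

typedef int *iptr;

int bump(int start)
{
    int value = start;
    iptr ip = &value;       /* declared through the typedef */
    int *plain = ip;        /* same type, so no conversion is needed */

    *plain += 1;
    return *ip;
}

void check_typedef(void)
{
    assert(bump(41) == 42);
}
```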
394 | ||
395 | 10. Restrictions | |
396 | ||
397 | The compiler is somewhat stickier about | |
398 | some constructions that used to be accepted. | |
399 | ||
400 | One difference is that external declarations made inside | |
401 | functions are remembered to the end of the file, | |
402 | that is, even past the end of the function. | |
403 | The most frequent problem that this causes is that | |
404 | implicit declaration of a function as an integer in one | |
405 | routine, | |
406 | and subsequent explicit declaration | |
407 | of it as another type, | |
408 | is not allowed. | |
409 | This turned out to affect | |
410 | several source programs | |
411 | distributed with the system. | |
412 | ||
413 | It is now required that all forward references to labels | |
414 | inside a function be the subject of a `goto.' | |
415 | This has turned out to affect mainly people who | |
416 | pass a label to the routine `setexit.' | |
417 | In fact a routine is supposed to be passed here, | |
418 | and why a label worked I do not know. | |
419 | ||
420 | In general this compiler makes it more difficult | |
421 | to use label variables. | |
422 | Think of this as a contribution to structured programming. | |
423 | ||
424 | The compiler now checks multiple declarations of the same name | |
425 | more carefully for consistency. | |
426 | It used to be possible to declare the same name to | |
427 | be a pointer to different structures; | |
428 | this is caught. | |
429 | So too are declarations of the same array as having different | |
430 | sizes. | |
431 | The exception is that array declarations with empty brackets | |
432 | may be used in conjunction with a declaration with a specified size. | |
433 | Thus | |
434 | ||
435 | int a[]; | |
436 | int a[50]; | |
437 | ||
438 | is acceptable (in either order). | |
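The pairing of an empty-bracket declaration with a sized one still works in present-day C, as this sketch suggests; the array and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

int table[];        /* size left open */
int table[50];      /* supplies the size */

size_t table_length(void)
{
    return sizeof table / sizeof table[0];
}

void check_table(void)
{
    assert(table_length() == 50);
    assert(table[49] == 0);     /* external objects start out zero */
}
```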
439 | ||
440 | An external array all of whose definitions | |
441 | involve empty brackets is diagnosed as `undefined' | |
442 | by the loader; | |
443 | it used to be taken as having 1 element. |