Copyright (C) 1987 Free Software Foundation, Inc.
Contributed by Michael Tiemann (tiemann@mcc.com)

This file is part of GNU CC.

GNU CC is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 1, or (at your option)
any later version.

GNU CC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU CC; see the file COPYING.  If not, write to
the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.


This file contains implementation notes on the GNU C Compiler for
the National Semiconductor 32032 chip (and 32000 family).

The 32032 machine description and configuration file for this compiler
are, within the NS32000 family, primarily machine independent.
However, since this release still depends on vendor-supplied
assemblers and linkers, the compiler must obey the existing
conventions of the actual machine to which this compiler is targeted.
In this case, the actual machine which this compiler was targeted to
is a Sequent Balance 8000, running DYNIX 2.1.

The assembler for DYNIX 2.1 (and DYNIX 3.0, alas) does not cope with
the full generality of the addressing mode REGISTER RELATIVE.
Specifically, it generates incorrect code for operands of the
following form:

	sym(rn)

where `rn' is one of the general registers.  Correct code is generated
for operands of the form

	sym(pn)

where `pn' is one of the special processor registers (sb, fp, or sp).

An equivalent operand can be generated by the form

	sym[rn:b]

although this addressing mode is about twice as slow on the 32032.

The more efficient addressing mode, sym(rn), is selected by defining
the constant SEQUENT_ADDRESS_BUG to 0.  It is currently defined to be 1.
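
The following C fragment is only a sketch of the choice that
SEQUENT_ADDRESS_BUG makes; it is not taken from ns32k.md or tm.h.
The helper name format_sym_plus_reg and its buffer interface are
assumptions made for illustration.  With the constant at 1 it prints
the indexed form that the DYNIX assembler handles correctly, and with
0 it prints the register relative form.

```c
#include <assert.h>
#include <stdio.h>

#ifndef SEQUENT_ADDRESS_BUG
#define SEQUENT_ADDRESS_BUG 1	/* the default stated in this file */
#endif

/* Hypothetical helper (not part of GNU CC): write the assembler operand
   for "symbol plus general register" into BUF.  Returns the number of
   characters that snprintf reports.  */
int
format_sym_plus_reg (char *buf, size_t size, const char *sym, int regno)
{
  /* The 32000 family has eight general registers, r0 through r7.  */
  assert (regno >= 0 && regno <= 7);
#if SEQUENT_ADDRESS_BUG
  /* Work around the assembler: slower, but assembled correctly.  */
  return snprintf (buf, size, "%s[r%d:b]", sym, regno);
#else
  /* Register relative: faster, but mis-assembled by DYNIX 2.1/3.0.  */
  return snprintf (buf, size, "%s(r%d)", sym, regno);
#endif
}
```

Called with the symbol "_x" and register 2 under the default setting,
this sketch produces the operand text _x[r2:b].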

Another bug in the assembler makes it impossible to compute with
explicit addresses.  In order to compute with a symbolic address, it
is necessary to load that address into a register using the "addr"
instruction.  For example, it is not possible to say

	cmpd _p,@_x

Rather, one must say

	addr _x,rn
	cmpd _p,rn


The ns32032 chip has a number of known bugs.  Any attempt to make the
compiler unaware of these deficiencies will surely bring disaster.
The current list of known bugs is as follows (list provided by Richard
Stallman):

1) instructions with two overlapping operands in memory
(unlikely in C code, perhaps impossible).

2) floating point conversion instructions with constant
operands (these may never happen, but I'm not certain).

3) operands crossing a page boundary.  These can be prevented
by setting the flag in tm.h that requires strict alignment.

4) Scaled indexing in an insn following an insn that has a read-write
operand in memory.  This can be prevented by placing a no-op in
between.  I, Michael Tiemann, do not understand what exactly is meant
by `read-write operand in memory'.  If this is referring to the special
TOS mode, for example "addd 5,tos" then one need not fear, since this
will never be generated.  However, if this includes "addd 5,-4(fp)"
then there is room for disaster.  The Sequent compiler does not insert
a no-op for code involving the latter, and I have been informed that
Sequent is aware of this list of bugs, so I must assume that it is not
a problem.

5) The 32032 cannot shift by 32 bits.  It shifts modulo the word size
of the operand.  Therefore, for 32-bit operations, 32-bit shifts are
interpreted as zero bit shifts.  32-bit shifts have been removed from
the compiler, but future hackers must be careful not to reintroduce
them.

6) The ns32032 is a very slow chip; however, some instructions are
still very much slower than one might expect.  For example, it is
almost always faster to double a quantity by adding it to itself than
by shifting it by one, even if that quantity is deep in memory.  The
MOVM instruction has a 20-cycle setup time, after which it moves data
at about the speed that normal moves would.  It is also faster to use
address generation instructions than shift instructions for left
shifts less than 4.  I do not claim that I generate optimal code for all
given patterns, but where I did escape from National's "clean
architecture", I did so because the timing specification from the data
book says that I will win if I do.  I suppose this is called the
"performance gap".

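To make item 5 concrete, here is a small C model of the shift
behavior described there.  It is a sketch under the assumption that
the count of a 32-bit shift is simply reduced modulo 32; the function
names are invented for this example and nothing here is taken from
the compiler sources.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the behavior described in item 5: the count is taken
   modulo 32, so a count of 32 acts like a count of 0.  */
uint32_t
shift_left_modulo (uint32_t x, unsigned count)
{
  return x << (count % 32);
}

/* What a programmer usually expects: a count of 32 or more shifts
   every bit out and leaves zero.  The explicit test also avoids the
   undefined behavior C assigns to shifts by the full width.  */
uint32_t
shift_left_checked (uint32_t x, unsigned count)
{
  return count >= 32 ? 0 : x << count;
}
```

The two functions agree for counts 0 through 31 and differ at 32,
where the modulo model returns the operand unchanged.  This is why
code that needs a shift by the full word size must not be emitted as
a single shift instruction.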

Signed bitfield extraction has not been implemented.  It is not
provided by the NS32032, and while it is most certainly possible to do
better than the standard shift-left/shift-right sequence, it is also
quite hairy.  Also, since signed bitfields do not yet exist in C, this
omission seems relatively harmless.
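
For reference, the standard shift-left/shift-right sequence mentioned
above can be sketched in C as follows.  The function name and the
bit numbering (bit 0 is the least significant) are assumptions made
for this example, and the code relies on the right shift of a
negative value being arithmetic, which C leaves implementation
defined but which GNU CC provides.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the WIDTH-bit field whose lowest bit is bit POS of WORD,
   sign-extended to 32 bits.  Requires 1 <= WIDTH and
   POS + WIDTH <= 32, so both shift counts stay in the range 0..31.  */
int32_t
extract_signed (uint32_t word, unsigned pos, unsigned width)
{
  assert (width >= 1 && pos + width <= 32);
  /* Shift left until the top bit of the field sits in bit 31.  */
  uint32_t left = word << (32 - pos - width);
  /* Shift right arithmetically to bring the field down to bit 0,
     copying its top bit into all the higher positions.  */
  return (int32_t) left >> (32 - width);
}
```

For example, the 4-bit field at bits 4 through 7 of 0xF0 is all ones
and extracts as -1, while the same field of 0x70 extracts as 7.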


Zero extractions could be better implemented if it were possible in
GCC to provide sized zero extractions: i.e. a byte zero extraction
would be allowed to yield a byte result.  The current implementation
of GCC manifests 68000-ist thinking, where bitfields are extracted
into a register, and automatically sign/zero extended to fill the
register.  See comments in ns32k.md around the "extzv" insn for more
details.


It should be noted that while the NS32000 family was designed to
provide odd-aligned addressing capability for multi-byte data (also
provided by the 68020, but not by the 68000 or 68010), many machines
do not opt to take advantage of this.  For example, on the Sequent,
although there is no advantage to long-word aligning word data, shorts
must be int-aligned in structs.  This is an example of another
machine-specific machine dependency.


Because the ns32032 has a coherent byte-order/bit-order
architecture, many instructions which would be different for
68000-style machines fold into the same instruction for the 32032.
The classic case is push effective address, where it does not matter
whether one is pushing a long, word, or byte address.  They all will
push the same address.
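
One reading of this point is that, with the low-order byte stored
first, the low byte, low word, and full long of a datum all begin at
the same address.  The C sketch below models such a layout in a byte
array so that it behaves the same on any host; the function names are
invented for this example and the interpretation is an assumption,
not something stated in the machine description.

```c
#include <assert.h>
#include <stdint.h>

/* Store VALUE into BUF with the low-order byte first.  */
void
store_le32 (unsigned char buf[4], uint32_t value)
{
  for (int i = 0; i < 4; i++)
    buf[i] = (unsigned char) (value >> (8 * i));
}

/* Read a WIDTH-byte quantity (1, 2, or 4 bytes) starting at ADDR,
   again taking the low-order byte first.  */
uint32_t
load_le (const unsigned char *addr, unsigned width)
{
  assert (width == 1 || width == 2 || width == 4);
  uint32_t v = 0;
  for (unsigned i = 0; i < width; i++)
    v |= (uint32_t) addr[i] << (8 * i);
  return v;
}
```

After storing 0x11223344, reading 1, 2, or 4 bytes from the same
starting address gives 0x44, 0x3344, and 0x11223344: each is the low
part of the same value.  On a 68000-style layout the low byte would
instead sit three bytes past the start of the long, so the address
would depend on the operand size.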


The macro FUNCTION_VALUE_REGNO_P is probably not sufficient; what is
needed is FUNCTION_VALUE_P, which also takes a MODE parameter.  In
this way it will be possible to determine more exactly whether a
register is really a function value register, or just one that happens
to look right.