Initial commit of OpenSPARC T2 architecture model.
[OpenSPARC-T2-SAM] / obp / obp / fm / lib / decomp.fth
CommitLineData
920dae64
AT
1\ ========== Copyright Header Begin ==========================================
2\
3\ Hypervisor Software File: decomp.fth
4\
5\ Copyright (c) 2006 Sun Microsystems, Inc. All Rights Reserved.
6\
7\ - Do no alter or remove copyright notices
8\
9\ - Redistribution and use of this software in source and binary forms, with
10\ or without modification, are permitted provided that the following
11\ conditions are met:
12\
13\ - Redistribution of source code must retain the above copyright notice,
14\ this list of conditions and the following disclaimer.
15\
16\ - Redistribution in binary form must reproduce the above copyright notice,
17\ this list of conditions and the following disclaimer in the
18\ documentation and/or other materials provided with the distribution.
19\
20\ Neither the name of Sun Microsystems, Inc. or the names of contributors
21\ may be used to endorse or promote products derived from this software
22\ without specific prior written permission.
23\
24\ This software is provided "AS IS," without a warranty of any kind.
25\ ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
26\ INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A
27\ PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED. SUN
28\ MICROSYSTEMS, INC. ("SUN") AND ITS LICENSORS SHALL NOT BE LIABLE FOR
29\ ANY DAMAGES SUFFERED BY LICENSEE AS A RESULT OF USING, MODIFYING OR
30\ DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES. IN NO EVENT WILL SUN
31\ OR ITS LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR
32\ FOR DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE
33\ DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY,
34\ ARISING OUT OF THE USE OF OR INABILITY TO USE THIS SOFTWARE, EVEN IF
35\ SUN HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
36\
37\ You acknowledge that this software is not designed, licensed or
38\ intended for use in the design, construction, operation or maintenance of
39\ any nuclear facility.
40\
41\ ========== Copyright Header End ============================================
42id: @(#)decomp.fth 2.17 04/02/02 10:01:54
43purpose:
44copyright: Copyright 1999-2004 Sun Microsystems, Inc. All Rights Reserved
45copyright: Copyright 1985-1994 Bradley Forthware, Inc.
46copyright: Use is subject to license terms.
47
48\
49\ The decompiler.
50\ This program is based on the F83 decompiler by Perry and Laxen,
51\ but it has been heavily modified:
52\ Structured decompilation of conditionals
53\ Largely machine independent
54\ Prints the name of the definer for child words instead of the
55\ definer's DOES> clause.
56\ "Smart" decompilation of literals.
57
58\ A Forth decompiler is a utility program that translates
59\ executable forth code back into source code. For many compiled languages,
60\ decompilation is very hard or impossible. Decompilation of threaded
61\ code is relatively easy.
62\ It was written with modifiability in mind, so if you add your
63\ own special compiling words, it will be easy to change the
64\ decompiler to include them. This code is implementation
65\ dependant, and will not necessarily work on other Forth system.
66\ However, most of the machine dependencies have been isolated into a
67\ separate file "decompiler.m.f".
68\ To invoke the decompiler, use the word SEE <name> where <name> is the
69\ name of a Forth word. Alternatively, (SEE) will decompile the word
70\ whose acf is on the stack.
71
72: (in-dictionary?) ( adr -- flag ) origin here within ;
73\ defer in-dictionary?
74' (in-dictionary?) is in-dictionary?
75
76: probably-cfa? ( possible-acf -- flag )
77 dup dup acf-aligned = over in-dictionary? and if
78 colon-cf?
79 else
80 drop false
81 then
82;
83
84\ Given an ip, scan backwards until you find the acf. This assumes
85\ that the ip is within a colon definition, and it is not absolutely
86\ guaranteed to work, but in practice it usually does.
87
88: find-cfa ( ip -- acf )
89 begin
90 dup in-dictionary? 0= if drop ['] lose exit then
91 #talign - dup probably-cfa?
92 until
93;
94
95\needs iscreate create iscreate
96\needs (wlit) create (wlit)
97
98start-module
99decimal
100
101only forth also hidden also forth definitions
102defer (see)
103
104hidden definitions
105headerless
106\ Like ." but goes to a new line if needed.
107: cr". ( adr len -- ) dup ?line type ;
108: .." ( -- ) [compile] " compile cr". ; immediate
109
110\ Positional case defining word
111\ Subscripts start from 0
112
113\ Support routines:
114\ Report out of range error
115: out ( #subscript pfa -- )
116 cr ." subscript out of range on " dup body> .name
117 ." max is " ? ." tried " . quit
118;
119
120\ Convert subscript # to address of token within body
121: maptoken ( #subscript pfa -- token-adr )
122 2dup @ u< if na1+ swap ta+ else out then
123;
124
125\ forth definitions
126\
127\ headers
128
129\ Positional case defining word
130: case: ( n -- )
131 create , ]
132 does> ( #subscript pfa -- ) \ executes #'th word
133 maptoken token@ execute
134;
135
136: tassociative: ( n -- )
137 create ,
138 does> ( xt pfa -- index )
139 dup @ ( xt pfa cnt )
140 dup 2swap ( cnt cnt xt pfa )
141 na1+ ( cnt cnt xt table-addr )
142 rot 0 do ( cnt xt table-addr )
143 2dup token@ = if
144 \ Clear stack and return index that matched
145 3drop i 0 0 leave
146 then
147 ta1+ ( cnt xt table-addr' )
148 loop ( index xt table-addr' )
149 2drop
150;
151
152\ hidden definitions \ We are already -- still -- in
153 \ hidden definitions headerless
154headers transient
155: #entries ( associative-acf -- n ) >body @ ;
156resident headerless
157
158: nulldis ( apf -- ) drop ." <no disassembler>" ;
159defer disassemble ' nulldis is disassemble
160
161\ headerless \ We are already -- still -- in
162 \ hidden definitions headerless
163
164\ Breaks is a list of places in a colon definition where control
165\ is transferred without there being a branch nearby.
166\ Each entry has two items: the address and a number which indicates
167\ what kind of branch target it is (either a begin, for backward branches,
168\ a then, for forward branches, or an exit.
169
170h# 40 /n* buffer: breaks
171variable end-breaks
172
173variable break-type variable break-addr variable where-break
174: next-break ( -- break-address break-type )
175 -1 break-addr ! ( prime stack)
176 end-breaks @ breaks ?do
177 i 2@ over break-addr @ u< if
178 break-type ! break-addr ! i where-break !
179 else
180 2drop
181 then
182 /n 2* +loop
183 break-addr @ -1 <> if -1 -1 where-break @ 2! then
184;
185: forward-branch? ( ip-of-branch-token -- f )
186 dup >target u<
187;
188
189\ Bare-if? checks to see if the target address on the stack was
190\ produced by an IF with no ELSE. This is used to decide whether
191\ to put a THEN at that target address. If the conditional branch
192\ to this target is part of an IF ELSE THEN, the target address
193\ for the THEN is found from the ELSE. If the conditional branch
194\ to this target was produced by a WHILE, there is no THEN.
195
196 \ Support function for IF and WHILE
197 \ Prepare for further examination of the token preceding
198 \ the target of the current branch; it might be a branch...
199 \ Leave its IP as well as its CFA on the stack
200: >next-branch? ( ip-of-branch-target -- ip' possible-branch-acf )
201 /branch - /token - dup token@ ( ip' possible-branch-acf )
202;
203
204: bare-if? ( ip-of-branch-target -- f )
205 >next-branch? ( ip' possible-branch-acf )
206 dup ['] branch = \ unconditional branch means else or repeat
207 if drop drop false exit then ( ip' acf )
208 ['] ?branch = \ cond. forw. branch is for an IF THEN with null body
209 if forward-branch? else drop true then
210;
211
212\ While? decides if the conditional branch at the current ip is
213\ for a WHILE as opposed to an IF. It finds out by looking at the
214\ target for the conditional branch; if there is a backward branch
215\ just before the target, it is a WHILE.
216: while? ( ip-of-?branch -- f )
217 >target >next-branch? ( ip' possible-branch-acf )
218 ['] branch = if \ looking for the uncond. branch from the REPEAT
219 forward-branch? 0= \ if the branch is forward, it's an IF .. ELSE
220 else
221 drop false
222 then
223
224;
225defer indent
226: (indent) ( -- )
227 #out @ lmargin @ > if cr then
228 lmargin @ #out @ - spaces
229;
230' (indent) is indent
231
232: +indent ( -- ) 3 lmargin +! indent ;
233: -indent ( -- ) -3 lmargin +! indent ;
234: <indent ( -- ) -3 lmargin +! indent 3 lmargin +! ;
235
236: .begin ( -- ) .." begin " +indent ;
237: .then ( -- ) -indent .." then " ;
238
239\ Extent holds the largest known extent of the current word, as determined
240\ by branch targets seen so far. This is used to decide if an exit should
241\ terminate the decompilation, or whether it is "protected" by a conditional.
242variable extent extent off
243: +extent ( possible-new-extent -- ) extent @ umax extent ! ;
244: +branch ( ip-of-branch -- next-ip ) ta1+ /branch + ;
245: .endof ( ip -- ip' ) .." endof" indent +branch ;
246: .endcase ( ip -- ip' ) indent .." endcase" indent ta1+ ;
247
248: add-break ( break-address break-type -- )
249 end-breaks @ breaks h# 40 /n* + >= ( adr,type full? )
250 abort" Decompiler internal table overlow" ( adr,type )
251 end-breaks @ breaks > if ( adr,type )
252 over end-breaks @ /n 2* - >r r@ 2@ ( adr,type adr prev-adr,type )
253 ['] .endof = -rot = and if ( adr,type )
254 r@ 2@ 2swap r> 2! ( prev-adr,type )
255 else ( adr,type )
256 r> drop ( adr,type )
257 then ( adr,type )
258 then ( adr,type )
259 end-breaks @ 2! /n 2* end-breaks +! ( )
260;
261: ?add-break ( break-address break-type -- )
262 over ( break-address break-type break-address )
263 end-breaks @ breaks ?do
264 dup i 2@ drop = ( found? ) if
265 drop 0 leave
266 then
267 /n 2* +loop ( break-address break-type not-found? )
268
269 if add-break else 2drop then
270;
271
272: scan-of ( ip-of-(of -- ip' )
273 dup >target dup +extent ( ip next-of )
274 /branch - /token - ( ip endof-addr )
275 dup ['] .endof add-break ( ip endof-addr )
276 ['] .endcase ?add-break
277 +branch
278;
279: scan-branch ( ip-of-?branch -- ip' )
280 dup dup forward-branch? if
281 >target dup +extent ( branch-target-address)
282 dup bare-if? if ( ip ) \ is this an IF branch?
283 ['] .then add-break
284 else
285 drop
286 then
287 else
288 >target ['] .begin add-break
289 then
290 +branch
291;
292
293: scan-unnest ( ip -- ip' | 0 )
294 dup extent @ u>= if drop 0 else ta1+ then
295;
296: scan-;code ( ip -- ip' | 0 ) does-ip? 0= if drop 0 then ;
297: .;code ( ip -- ip' )
298 does-ip? if
299 .." does> "
300 else
301 0 lmargin ! indent .." ;code " cr disassemble 0
302 then
303;
304: .branch ( ip -- ip' )
305 dup forward-branch? if
306 <indent .." else" indent
307 else
308 -indent .." repeat "
309 then
310 +branch
311;
312: .?branch ( ip -- ip' )
313 dup forward-branch? if
314 dup while? if
315 <indent .." while" indent
316 else
317 .." if " +indent
318 then
319 else
320 -indent .." until "
321 then
322 +branch
323;
324
325: .do ( ip -- ip' ) .." do " +indent +branch ;
326: .?do ( ip -- ip' ) .." ?do " +indent +branch ;
327: .loop ( ip -- ip' ) -indent .." loop " +branch ;
328: .+loop ( ip -- ip' ) -indent .." +loop " +branch ;
329: .of ( ip -- ip' ) .." of " +branch ;
330
331\ first check for word being immediate so that it may be preceded
332\ by [compile] if necessary
333: check-[compile] ( acf -- acf )
334 dup immediate? if .." [compile] " then
335;
336
337: .cword ( ip -- ip' ) \ Display run-time word, e.g. (is) sans '()'
338 dup token@ ?cr ( ip acf )
339 >name name>string ( ip adr len )
340 swap 1+ swap 2 - type space ( ip ) \ Remove parentheses
341 ta1+
342;
343: .word ( ip -- ip' ) dup token@ check-[compile] ?cr .name ta1+ ;
344\ : skip-word ( ip -- ip' ) ta1+ ;
345alias skip-word ta1+
346: .inline ( ip -- ip' ) ta1+ dup unaligned-@ n. na1+ ;
347: skip-inline ( ip -- ip' ) ta1+ na1+ ;
348: .wlit ( ip -- ip' ) ta1+ dup unaligned-w@ 1- . wa1+ ;
349: .llit ( ip -- ip' ) ta1+ dup unaligned-l@ 1- . la1+ ;
350: .dlit ( ip -- ip' ) ta1+ dup 2@ swap (ud.) type ." . " 2 na+ ;
351: skip-wlit ( ip -- ip' ) ta1+ wa1+ ;
352: skip-llit ( ip -- ip' ) ta1+ la1+ ;
353: skip-dlit ( ip -- ip' ) ta1+ 2 na+ ;
354: skip-branch ( ip -- ip' ) +branch ;
355: .quote ( ip -- ip' ) .word .word ;
356\ : skip-quote ( ip -- ip' ) ta1+ ta1+ ;
357: skip-2-tokens ( ip -- ip' ) ta1+ ta1+ ;
358alias skip-quote skip-2-tokens
359: .compile ( ip -- ip' ) ." compile " ta1+ .word ;
360\ : skip-compile ( ip -- ip' ) ta1+ ta1+ ;
361alias skip-compile skip-2-tokens
362: skip-string ( ip -- ip' ) ta1+ +str ;
363: .(') ( ip -- ip' ) ta1+ .." ['] " dup token@ .name ta1+ ;
364\ : skip-(') ( ip -- ip' ) ta1+ ta1+ ;
365alias skip-(') skip-2-tokens
366: .is ( ip -- ip' ) ." is " ta1+ dup token@ .name ta1+ ;
367: .string-tail ( ip -- ip' ) dup count type +str ." "" " ;
368: .string ( ip -- ip' ) .cword .string-tail ;
369: .pstring ( ip -- ip' ) ?cr ." p"" " ta1+ .string-tail ;
370
371\ Use this version of .branch if the structured conditional code is not used
372\ : .branch ( ip -- ip' ) .word dup <w@ . /branch + ;
373
374: .unnest ( ip -- ip' )
375 dup extent @ u>= if
376 0 lmargin ! indent .." ; " drop 0
377 else
378 .." exit " ta1+
379 then
380;
381
382: dummy ;
383
384\ classify each word in a definition
385\ Common constant for sizing the three classes:
386headers
387transient d# 34 constant #decomp-classes resident
388headerless
389
390#decomp-classes tassociative: execution-class ( token -- index )
391]
392 ( 0 ) (lit) ( 1 ) ?branch
393 ( 2 ) branch ( 3 ) (loop)
394 ( 4 ) (+loop) ( 5 ) (do)
395 ( 6 ) compile ( 7 ) (.")
396 ( 8 ) (abort") ( 9 ) (;code)
397 ( 10 ) unnest ( 11 ) (")
398 ( 12 ) (?do) ( 13 ) (does>)
399 ( 14 ) exit ( 15 ) (wlit)
400 ( 16 ) (') ( 17 ) (of)
401 ( 18 ) (endof) ( 19 ) (endcase)
402 ( 20 ) ("s) ( 21 ) (is-defer)
403 ( 22 ) (dlit) ( 23 ) (llit)
404 ( 24 ) (is-user) ( 25 ) (is-const)
405 ( 26 ) dummy ( 27 ) dummy
406 ( 28 ) dummy ( 29 ) dummy
407 ( 30 ) dummy ( 31 ) dummy
408 ( 32 ) dummy ( 33 ) dummy
409[
410
411\ Print a word that has been classified by execution-class
412#decomp-classes 1+ case: .execution-class ( ip index -- ip' )
413 ( 0 ) .inline ( 1 ) .?branch
414 ( 2 ) .branch ( 3 ) .loop
415 ( 4 ) .+loop ( 6 ) .do
416 ( 6 ) .compile ( 7 ) .string
417 ( 8 ) .string ( 9 ) .;code
418 ( 10 ) .unnest ( 11 ) .string
419 ( 12 ) .?do ( 13 ) .;code
420 ( 14 ) .unnest ( 15 ) .wlit
421 ( 16 ) .(') ( 17 ) .of
422 ( 18 ) .endof ( 19 ) .endcase
423 ( 20 ) .pstring ( 21 ) .is
424 ( 22 ) .dlit ( 23 ) .llit
425 ( 24 ) .is ( 25 ) .is
426 ( 26 ) dummy ( 27 ) dummy
427 ( 28 ) dummy ( 29 ) dummy
428 ( 30 ) dummy ( 31 ) dummy
429 ( 32 ) dummy ( 33 ) dummy
430 ( default ) .word
431[
432
433\ Determine the control structure implications of a word
434\ that has been classified by execution-class
435#decomp-classes 1+ case: do-scan
436 ( 0 ) skip-inline ( 1 ) scan-branch
437 ( 2 ) scan-branch ( 3 ) skip-branch
438 ( 4 ) skip-branch ( 6 ) skip-branch
439 ( 6 ) skip-compile ( 7 ) skip-string
440 ( 8 ) skip-string ( 9 ) scan-;code
441 ( 10 ) scan-unnest ( 11 ) skip-string
442 ( 12 ) skip-branch ( 13 ) scan-;code
443 ( 14 ) scan-unnest ( 15 ) skip-wlit
444 ( 16 ) skip-(') ( 17 ) scan-of
445 ( 18 ) skip-branch ( 19 ) skip-word
446 ( 20 ) skip-string ( 21 ) skip-word
447 ( 22 ) skip-dlit ( 23 ) skip-llit
448 ( 24 ) skip-word ( 25 ) skip-word
449 ( 26 ) dummy ( 27 ) dummy
450 ( 28 ) dummy ( 29 ) dummy
451 ( 30 ) dummy ( 31 ) dummy
452 ( 32 ) dummy ( 33 ) dummy
453 ( default ) skip-word
454[
455
456: install-decomp ( literal-acf display-acf skip-acf -- )
457 rot ( disp-acf skip-acf lit-acf )
458 ['] dummy \ target
459 ['] execution-class >body na1+ \ search-start
460 dup [ #decomp-classes ] literal ta+ \ search-limit
461 tsearch if token!
462 ['] dummy ['] do-scan (patch
463 ['] dummy ['] .execution-class (patch
464 else 3drop where ." Can't install-decomp. Tables full." cr
465 then
466;
467
468\ Scan the parameter field of a colon definition and determine the
469\ places where control is transferred.
470: scan-pf ( apf -- )
471 dup extent ! ( apf )
472 breaks end-breaks ! ( apf )
473 begin ( adr )
474 dup token@ execution-class do-scan ( adr' )
475 dup 0= ( adr' flag )
476 until ( adr )
477 drop
478;
479
480forth definitions
481headers
482: .token ( ip -- ip' | 0 ) dup token@ execution-class .execution-class ;
483
484\ Decompile the parameter field of colon definition
485: .pf ( apf -- )
486 dup scan-pf next-break 3 lmargin ! indent ( apf )
487 begin ( adr )
488 ?cr break-addr @ over = if ( adr )
489 begin ( adr )
490 break-type @ execute ( adr )
491 next-break break-addr @ over <> ( adr done? )
492 until ( adr )
493 else ( adr )
494 .token ( adr' )
495 then ( adr' )
496 dup 0= exit? if nullstring throw then ( adr' )
497 until drop ( )
498;
499headerless
500hidden definitions
501
502: .immediate ( acf -- ) immediate? if .." immediate" then ;
503
504: .definer ( acf definer-acf -- acf ) .name dup .name ;
505
506: dump-body ( pfa -- )
507 push-hex
508 dup @ n. 2 spaces 8 emit.ln
509 pop-base
510;
511\ Display category of word
512: .: ( acf definer -- ) .definer space space >body .pf ;
513: .constant ( acf definer -- ) over >data ? .definer drop ;
514: .2constant ( acf definer -- ) over >data dup ? na1+ ? .definer drop ;
515: .vocabulary ( acf definer -- ) .definer drop ;
516: .code ( acf definer -- ) .definer >code disassemble ;
517: .variable ( acf definer -- )
518 over >data n. .definer .." value = " >data ?
519;
520: .create ( acf definer -- )
521 over >body n. .definer .." value = " >body dump-body
522;
523: .user ( acf definer -- )
524 over >body ? .definer .." value = " >data ?
525;
526: .defer ( acf definer -- )
527 .definer .." is " cr >data token@ (see)
528;
529: .alias ( acf definer -- )
530 .definer >body token@ .name
531;
532: .value ( acf definer -- )
533 swap >data ? .definer
534;
535
536
537\ Decompile a word whose type is not one of those listed in
538\ definition-class. These include does> and ;code words which
539\ are not explicitly recognized in definition-class.
540: .other ( acf definer -- )
541 .definer >body ." (Body: " dump-body ." ) " cr
542;
543
544
545\ Classify a word based on its acf
546headers transient
547alias isalias noop
548create iscreate
5490 0 2constant is2cons
550
551: wt, \ name ( -- ) \ Compile name's word type
552 ' word-type token,
553;
554resident headerless
555
556d# 10 tassociative: word-types
557 ( 0 ) wt, here ( 1 ) wt, bl
558 ( 2 ) wt, #user ( 3 ) wt, base
559 ( 4 ) wt, emit ( 5 ) wt, iscreate
560 ( 6 ) wt, forth ( 7 ) wt, isalias
561 ( 8 ) wt, limit ( 9 ) wt, is2cons
562
563d# 11 tassociative: definition-class
564]
565 ( 0 ) : ( 1 ) constant
566 ( 2 ) variable ( 3 ) user
567 ( 4 ) defer ( 5 ) create
568 ( 6 ) vocabulary ( 7 ) alias
569 ( 8 ) value ( 9 ) 2constant
570 ( 10 ) code
571[
572
573d# 12 case: .definition-class
574 ( 0 ) .: ( 1 ) .constant
575 ( 2 ) .variable ( 3 ) .user
576 ( 4 ) .defer ( 5 ) .create
577 ( 6 ) .vocabulary ( 7 ) .alias
578 ( 8 ) .value ( 9 ) .2constant
579 ( 10) .code ( 11) .other
580[
581
582: definer ( acf-of-child -- acf-of-defining-word )
583 dup code? if drop ['] code exit then ( acf )
584 dup word-type word-types ( acf index )
585 dup [ ' word-types #entries ] literal = if ( acf index )
586 drop word-type find-cfa ( definer )
587 else ( acf index )
588 nip ['] definition-class >body maptoken token@ ( definer )
589 then
590;
591
592\ top level of the decompiler SEE
593: ((see ( acf -- )
594 d# 64 rmargin !
595 dup dup definer dup definition-class .definition-class
596 .immediate
597 ??cr
598;
599headers
600' ((see is (see)
601
602forth definitions
603
604: see \ name ( -- )
605 ' ['] (see) catch if drop then
606;
607only forth also definitions
608end-module