12 point everything
[unix-history] / usr / src / usr.bin / gprof / PSD.doc / postp.me
CommitLineData
98e01bad 1\" @(#)postp.me 1.6 %G%
30a96129
PK
2.EQ
3delim ##
98e01bad 4gsize 12
30a96129 5.EN
819c1fed
PK
6.sh 1 "Post Processing"
7.pp
8Having gathered the arcs of the call graph and timing information
9for an execution of the program,
10we are interested in attributing the time for each routine to the
11routines that call it.
12We construct a dynamic call graph with arcs from caller to callee,
13and propagate time from descendants to ancestors
14by topologically sorting the call graph.
28f2ab56
PK
15Time propagation is performed from the leaves of the
16call graph toward the roots, according to the order
17assigned by a topological numbering algorithm.
18The topological numbering ensures that
19all edges in the graph go from higher numbered nodes to lower
20numbered nodes.
21If we propagate time from nodes in the
22reverse order assigned by the algorithm,
23execution time can be propagated from descendants to ancestors
24after a single traversal of each arc in the call graph.
25Each parent receives some fraction of a child's time.
26Thus time is charged to the
27caller in addition to being charged to the callee.
28.(b
30a96129
PK
29.TS
30center;
31c s c c s.
28f2ab56 328 9
30a96129
PK
33
34
35.T&
36c c c c c.
37 3 7
38
39
402 5 6
41
42
43 1 4
44.TE
b8b92e41
PK
45.ce 1
46Topological ordering
47.ce 0
28f2ab56 48.)b
bf3fc937 49.pp
30a96129
PK
50Let #C sub e# be the number of calls to some routine,
51#e#, and #C sub e sup r# be the number of
52calls from a caller #r# to a callee #e#.
53Since we are assuming each call to a routine takes the
54average amount of time for all calls to that routine,
55the caller is accountable for
56#C sub e sup r# \(sl #C sub e#
57of the time spent by the callee.
58Let the #S sub e# be the #selftime# of a routine, #e#.
59The selftime of a routine can be determined from the
60timing information gathered during profiled program execution.
61The total time, #T sub r#, we wish to account to a routine
62#r#, is then given by the recurence equation:
63.EQ
64T sub r ~ = ~ {S sub r} ~ + ~
65 sum from {r ~ roman CALLS ~ e}
66 {T sub e \(mu {{C sub e sup r} over {C sub e}}}
67.EN
68where #r ~ roman CALLS ~ e# is a relation indicating all routines
69#e# called by a routine #r#.
70This relation is easily available from the call graph.
71.pp
819c1fed
PK
72However, if the execution contains recursive calls,
73the call graph has cycles that
74cannot be topologically sorted.
bf3fc937 75In these cases, we discover strongly-connected
819c1fed
PK
76components in the call graph,
77treat each such component as a single node,
78and then sort the resulting graph.
79We use a variation of Tarjan's strongly-connected
80components algorithm
81that discovers strongly-connected components as it is assigning
82topological order numbers [Tarjan72].
819c1fed
PK
83.pp
84Time propagation within strongly connected
85components is a problem.
86For example, a self-recursive routine
87(a trivial cycle in the call graph)
88is accountable for all the time it
89uses in all its recursive instantiations.
90In our scheme, this time should be
91shared among its call graph parents.
92The arcs from a routine to itself are of interest,
93but do not participate in time propagation.
28f2ab56
PK
94Thus the simple equation for time propagation
95does not work within strongly connected components.
819c1fed
PK
96Time is not propagated from one member of a cycle to another,
97since, by definition, this involves propagating time from a routine
98to itself.
99In addition, children of one member of a cycle
100must be considered children of all members of the cycle.
101Similarly, parents of one member of the cycle must inherit
102all members of the cycle as descendants.
103It is for these reasons that we collapse connected components.
104Our solution collects all members of a cycle together,
105summing the time and call counts for all members.
106All calls into the cycle are made to share the total
107time of the cycle, and all descendants of the cycle
108propagate time into the cycle as a whole.
109Calls among the members of the cycle
110do not propagate any time,
111though they are listed in the call graph profile.
28f2ab56 112.(b
30a96129
PK
113.TS
114center;
115c s c c s.
28f2ab56 116o o
30a96129
PK
117
118
119.T&
120c c c c c.
b8b92e41 121 o o
30a96129
PK
122
123
b8b92e41 124o o o
30a96129
PK
125
126
b8b92e41
PK
127 o o
128.TE
129.ce 1
130Cycle to be collapsed.
28f2ab56
PK
131.)b
132.(b
b8b92e41
PK
133.TS
134center;
135c s c c s.
28f2ab56 1367 8
b8b92e41
PK
137
138
139.T&
140c c c c c.
141 6 6
142
143
1442 4 5
145
146
147 1 3
30a96129 148.TE
b8b92e41
PK
149.ce 1
150Topological numbering after cycle collapsing.
151.ce 0
28f2ab56 152.)b
819c1fed
PK
153.pp
154Since the technique described above only collects the
155dynamic call graph,
156and the program typically does not call every routine
157on each execution,
158different executions can introduce different cycles in the
159dynamic call graph.
160Since cycles often have a significant effect on time propagation,
161it is desirable to incorporate the static call graph so that cycles
162will have the same members regardless of how the program runs.
163.pp
164The static call graph can be constructed from the source text
165of the program.
166However, discovering the static call graph from the source text
167would require two moderately difficult steps:
168finding the source text for the program
169(which may not be available),
170and scanning and parsing that text,
171which may be in any one of a number of languages.
172In our programming system,
173the static calling information is also contained in the
174executable version of the program,
175which we already have available,
176and which is in language-independent form.
177One can examine the instructions
178in the object program,
179looking for calls to routines, and note which
180routines can be called.
181This technique allows us to add arcs to those already in the
182dynamic call graph.
183If a statically discovered arc already exists in the dynamic call
184graph, no action is required.
185Statically discovered arcs that do not exist in the dynamic call
186graph are added to the graph with a traversal count of zero.
187Thus they are never responsible for any time propagation.
188However, they may affect the structure of the graph.
189Since they may complete strongly connected components,
190the static call graph construction is
191done before topological ordering.