1\documentclass[10pt]{article}
2
3\usepackage{rqdefs} \usepackage{rqfullpage} \usepackage{utopia}
4\usepackage{rqcode}
5
6%% Example ltoh commands (start with %-ltoh-)
7%-ltoh- title := The RS trace format, aka RST
8%-ltoh- :{}:\gold:<font color=gold>:</>:
9%-ltoh- :comm:\salsa:<strong>salsa</strong>::
10
11\rqfullpageD
12
13\begin{document}
14
15\rqtitle{The RST Trace format}
16\rqsubtitle{AAD Tools, last updated \today}
17
18\toc
19
An HTML version of this document is available at
\rqlink{http://ppgweb.eng/archperf/rstFormat.html}.
22
23The master workspace for RST is \textff{/import/archperf/ws/rstf}. This
24workspace contains source to generate PS and PDF versions of this
25document. A (possibly out of date) PostScript version of this document
26is available at \texttt{/home/quong/proj/rstf/rstFormat.ps}.
27
28\section{Other Useful links}
29
30\begin{tabular}{|l|l|} \hline
31 What & URL \\ \hline
32 List of known traces &
33 \rqlink{http://traces.eng/} \\ \hline
34 ArchTools Trace FAQ &
35 \rqlink{http://ppgweb.eng/archperf/traceFAQ.html} \\ \hline
36 The RST trace format &
37 \rqlink{http://ppgweb.eng/archperf/rstFormat.html} \\ \hline
38 The atrace tool/format &
39 \rqlink{http://muskoka.eng/\rqtilde{}bmc/atrace/} \\ \hline
40 Converting atrace to RST &
41 \rqlink{http://ppgweb.eng/\rqtilde{}quong/atrace2rst.html} \\ \hline
42 Converting bustraces to RST &
43 \rqlink{http://ppgweb.eng/\rqtilde{}quong/bustrace.html} \\ \hline
44 Instruction Trace validation &
45 \rqlink{http://ppgweb.eng/archperf/trace-validation2002.html} \\ \hline
46 Blaze web page &
47 \rqlink{http://ppgweb.eng/archperf/blaze.html} \\ \hline
48 Blaze user guide &
49 \rqlink{http://ppgweb.eng/\rqtilde{}quong/blaze-userguide.html} \\ \hline
50 Blaze TPCC trace &
51 \rqlink{http://ppgweb.eng/\rqtilde{}quong/blaze-tpcc-try500.html} \\ \hline
52\end{tabular}
53
54The trace FAQ answers many questions on how to analyze/process RST
55traces. Unfortunately, there are many issues that could reasonably
56belong in the trace FAQ page or this page. We (RQ) have tried to limit
57this web page to RST format definition issues, but the need for examples
58causes this web page to overlap with the trace FAQ.
59
60\section{What is RST?}
61
RST is short for RS Trace format, a format for computer architecture
traces. RS stands for "Really Simple" or "Russell's Simple", depending
on who you talk to (or "to whom you talk" if you hate dangling
participles). An RST trace consists of fixed-length records (24 bytes)
in which the first byte of each record, known as the \textsl{rtype},
specifies the type. These two properties ensure that an RST trace is
easy to decode both now and in the future. In particular, if your old
analyzer sees a new \textsl{rtype} it does not understand, your code can
simply skip that record, ensuring forward compatibility.
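
The skip-what-you-do-not-understand rule can be sketched as follows.
This is a minimal illustration, not code from the RST workspace: the
rtype codes \textss{kDemoInstr} and \textss{kDemoTrap} are made-up
placeholders (the real codes live in \textff{rstf.h}), and only the
24-byte record size and the rtype-in-first-byte convention come from the
description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed RST record size, per the text above.
constexpr std::size_t kRecordBytes = 24;

// Hypothetical rtype codes for illustration only; the real values are
// defined in rstf.h and are NOT reproduced here.
constexpr std::uint8_t kDemoInstr = 1;
constexpr std::uint8_t kDemoTrap  = 2;

struct Tally {
    std::size_t instrs  = 0;
    std::size_t traps   = 0;
    std::size_t skipped = 0;  // records whose rtype we do not understand
};

// Walk a raw byte buffer one 24-byte record at a time, dispatching on the
// first byte (the rtype). Unknown rtypes are counted and skipped.
// A trailing partial record (fewer than 24 bytes) is ignored.
inline Tally tally_records(const std::vector<std::uint8_t>& bytes) {
    Tally t;
    for (std::size_t off = 0; off + kRecordBytes <= bytes.size();
         off += kRecordBytes) {
        switch (bytes[off]) {
            case kDemoInstr: ++t.instrs;  break;
            case kDemoTrap:  ++t.traps;   break;
            default:         ++t.skipped; break;
        }
    }
    return t;
}
```

Because every record has the same size, the loop never needs to know the
layout of a record type it skips; it only advances by 24 bytes.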
71
There are many different kinds of RST record types, including
73\begin{rqitemize}{0em}
74 \item instructions
75 \item events (traps, interrupts)
76 \item MMU state (changes to the TLB, PA-VA diffs)
77 \item string data
78 \item internal processor state (register dumps)
 \item high-level events (process/context switches, thread switch)
80 \item markers (timestamp, current CPU)
81 \item state (cache/memory state)
82 \item define your own
83\end{rqitemize}
84
85Here is an example of 4 records in an RST trace: pavadiff, instr, instr,
86and trap. Again, each record is the same size and the \textsl{rtype}
87indicates the record type.
88
\begin{rqcode}{\small}
  +================+
  |rtype=PAVADIFF\_T|
  |  i/d contexts  |
  |  PA-VA for PC  |
  |  PA-VA for EA  |
  +================+
  |rtype=INSTR\_T   |
  |  flags+instr   |
  |    PC (VA)     |
  |    EA (VA)     |
  +================+
  |rtype=INSTR\_T   |
  |  flags+instr   |
  |    PC (VA)     |
  |    EA (VA)     |
  +================+
  |rtype=TRAP\_T    |
  |   trap type    |
  |   trap level   |
  |                |
  +================+
\end{rqcode}
112
113
114\section{Why should I use RST?}
115
Because RST is simple, flexible, extensible and supported. It has
provisions for MP traces, traps, events (snooping, DMA, etc.), VA/PA,
time stamps, descriptive strings, and trace patching. Adding new types
of data to an RST trace is as simple as defining new rtypes. In short,
RST was designed to handle all kinds of trace data for the next 5 years.
121
122There are numerous tools based on RST. There is an RST compressor,
123rstzip; used with gzip, we typically find compression rates of 18-40X
124and have seen compression of 200X.
125
The \textss{rst-snapper} reads RST. The \textss{rstgen} tool is a shade
analyzer which generates RST. The RST tracer module
(\textss{rstracer.so}) for \rqhttp{blaze}{} generates rich RST traces.
Additionally, snaps for the new (4/2000) version of Aztecs, the
cycle-accurate simulator for Cheetah/Jubatus, use RST instructions.
131
132\section{What tools exist for RST?}
133
134The following tools exist for handling RST traces. All binaries exist
135in \textss{/import/archperf/bin}.
136
137\begin{tabular}{|l|l|} \hline
138 Tool & Description \\ \hline
139 \textss{trv.sh} & view (un/compressed) RST trace \\ \hline
140 \textss{trconv} & view RST trace in ASCII format \\ \hline
141 \textss{rstFilter} & process an RST trace producing a new RST trace \\ \hline
142 \textss{atr2rst.sh} & script to \rqhttp{convert atrace to
143 RST}{atrace2rst.html} \\ \hline
144 \textss{rstzip2} & De/Compressor tailored for MP RST instr traces \\ \hline
145 \textss{rstzip} & De/Compressor tailored for RST instr traces (deprecated) \\ \hline
146 \textss{rstsnap} & snap an RST trace for Aztecs \\ \hline
147 \textss{rstgen} & generate an RST trace from shade (used for Spec2K)
148\\ \hline
149 \textss{rstracer.so} & Blaze module to dump/generate RST traces \\ \hline
150\end{tabular}
151
152\subsection{RST Compression}
153
A significant, but often overlooked, benefit of RST is an excellent
compression algorithm. Typically, RST traces compress to about 1 byte
per instruction. The compressor was implemented by Kelvin Fong.
157
158Unfortunately, there are two incompatible RST compression formats, V1
159and V2, requiring different de/compressors, \textss{rstzip} and
160\textss{rstzip2}, respectively. Version 2 RST compression supports MP
161traces and is the preferred compression method for all RST instruction
162traces after Aug 2001. Most RST traces before Jun 2001 were compressed
163with V1 compression (rstzip).
164
165\begin{tabular}{|l|l|l|}
166 Format & Suffix & Description \\
167 V1 & \textss{rz.gz} \textss{rsz} \textss{rsz.gz} & Original 1P format \\
168 V2 & \textss{rz2.gz} & Support for MP and value tracing \\
169\end{tabular}
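
Since the two formats need different tools, a script or analyzer front
end usually has to pick the decompressor from the file name. Here is a
small sketch of that choice. It assumes the suffixes in the table are
preceded by a dot (as in \textss{trace45.rz2.gz}); the function names
are ours and are not part of any RST tool.

```cpp
#include <string>

// True if `name` ends with `suffix`.
inline bool ends_with(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(),
                        suffix) == 0;
}

// Returns the tool named in the table above for this file suffix,
// or an empty string when the suffix is not one of the listed ones.
// Assumption: suffixes are matched with a leading dot.
inline std::string decompressor_for(const std::string& filename) {
    if (ends_with(filename, ".rz2.gz")) {
        return "rstzip2";  // V2: MP and value tracing
    }
    if (ends_with(filename, ".rz.gz") || ends_with(filename, ".rsz") ||
        ends_with(filename, ".rsz.gz")) {
        return "rstzip";   // V1: original 1P format
    }
    return "";
}
```

For example, \textss{decompressor\_for("try8-t6.rsz")} yields
\textss{rstzip}, while an uncompressed \textss{file.rst} yields the
empty string.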
170
171\subsection{Viewing an RST trace}
172
Use \textss{trv.sh} or \textss{trconv}. This code was updated
on 6/28/2000, so some flags may have changed.
175
176\begin{rqenumerate}{0em}
177 \item An example run is \textss{trv.sh -n 1000 -s 200 trace45.rz2.gz | less}.
178
179 \item An example run is \textss{trconv -n 1000 -s 20 file.rst}.
180
181 \item There is on-line help for \textss{trconv}.
182
 \item In the \texttt{INSTR}, \texttt{PAVADIFF} and \texttt{PHYSADDR}
records, there is an \textss{ea\_valid} field which indicates whether the
corresponding \textss{ea\_xxxx} field contains valid data. If
\textss{ea\_valid} is false (0) then the \textss{ea\_xxxx} field must be
ignored.
188
189Most of my programs (atr2rst, the blaze rstracer) which generate RST
190traces put a bogus, easily recognizable value in \textss{ea\_xxxx} if
191\textss{ea\_valid} is false.
192\end{rqenumerate}
193
194\section{Getting started analyzing a trace}
195
In a typical trace analyzer (e.g. a cache simulator) two record types
suffice for most of what you want to do. The \textss{INSTR\_T} record
gives you the instruction word, PC and optional EA. The
\textss{PAVADIFF\_T} record lets you translate the VAs into PAs. For
both the I and D references, the last seen \textss{PAVADIFF\_T} contains
the current (PA-VA) difference values and the effective context used.
202
203Let's consider an example. Consider the following output from
204\textss{trconv}:
205
206\begin{verbatim}
207 8896 pavadiff: cpuid=0icontext=0 dcontext=0 pc_pa_va=0xffffffffff400000 ea_pa_va=0xffffffffff400000 ea_valid=1
208 8897 instr : cpuid=0 p [0x0000000001085be0] lduw [%g2 + 0xf0], %g2 [0x00000000014000f0]
209 8898 instr : cpuid=0 p [0x0000000001085be4] srl %g2, 0xb, %g2
210 8899 instr : cpuid=0 p [0x0000000001085be8] subcc %g2, 0, %g0
211 8900 instr : cpuid=0 p [0x0000000001085bec] bpe,a,pt %icc, 0x1085c00 T [0x0000000001085c00]
212 8901 pavadiff: cpuid=0icontext=0 dcontext=0 pc_pa_va=0xffffffffff400000 ea_pa_va=0xfffffd5fe5c08000 ea_valid=1
213 8902 instr : cpuid=0 p [0x0000000001085bf0] lduh [%g7 + 0x188], %g2 [0x000002a100225ec8]
214 8903 instr : cpuid=0 p [0x0000000001085c00] sra %g2, 0, %i1
215 8904 instr : cpuid=0 p [0x0000000001085c04] call 0x1032420 T [0x0000000001032420]
216 8905 instr : cpuid=0 p [0x0000000001085c08] restore %g0, %g0, %g0
217\end{verbatim}
218
At record 8897, we have an LDUW instruction with PC=0x0000000001085be0.
Thus the PC PA is 0x0000000001085be0 + 0xffffffffff400000, which wraps
modulo $2^{64}$ to 0x0000000000485be0. The PAVADIFF also indicates the
I-context is 0, meaning we are in privileged mode and hence executing
kernel code. (We could also determine this as the PC VA is in the kernel
text region.) The EA of the load, 0x00000000014000f0, is in record 8897
itself. To derive the PA of the LDUW data access, we add the
\textss{PAVADIFF\_T::ea\_pa\_va} field to that EA.
226
227At record 8898, we have an SRL instruction with PC=0x0000000001085be4.
228We continue to use the values from the last \textss{PAVADIFF\_T} to get
229the PC PA.
230
At record 8901, we see another \textss{PAVADIFF\_T} record, because the
LDUH load in the next record, 8902, accesses data on a different page.
This \textss{PAVADIFF\_T} record contains the necessary
\textss{ea\_pa\_va} value for the following LDUH instruction. In
this case, the PC (PA-VA) value has not changed.
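
The arithmetic in this example is plain 64-bit unsigned addition: a
"negative" (PA-VA) difference is stored as its two's-complement value,
so the sum wraps around to the right PA. The sketch below checks the
three translations from the \textss{trconv} listing at compile time.
The helper name \textss{va\_to\_pa} is ours; only the input values come
from the listing.

```cpp
#include <cstdint>

// PA = VA + (PA-VA), computed modulo 2^64. Unsigned wraparound is well
// defined in C++, so a difference such as 0xffffffffff400000 behaves
// like subtracting 0xc00000.
constexpr std::uint64_t va_to_pa(std::uint64_t va, std::uint64_t pa_va_diff) {
    return va + pa_va_diff;
}

// Record 8897: PC of the LDUW, using pc_pa_va from record 8896.
static_assert(va_to_pa(0x0000000001085be0ULL, 0xffffffffff400000ULL) ==
                  0x0000000000485be0ULL, "PC PA of record 8897");

// Record 8897: EA of the LDUW, using ea_pa_va from record 8896.
static_assert(va_to_pa(0x00000000014000f0ULL, 0xffffffffff400000ULL) ==
                  0x00000000008000f0ULL, "EA PA of record 8897");

// Record 8902: EA of the LDUH, using ea_pa_va from record 8901.
static_assert(va_to_pa(0x000002a100225ec8ULL, 0xfffffd5fe5c08000ULL) ==
                  0x00000000e5e2dec8ULL, "EA PA of record 8902");
```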
236
The dependence on only two record types and the use of the last PAVADIFF
record data makes processing simple. There is C and C++ starter source
code for RST readers in \textff{/import/archperf/ws/rstf}. Here is the
main loop of \textff{readRST.C}, which prints out the VA and PA of each
instruction.
243
\begin{verbatim}
  ...
  for (long long ix = 0; ix < nrecords; ix++) {
    long long recidx = ix + skip;        // record number within the whole trace
    rstf_unionT * up = &rec[ix];
    uint8_t rtype = up->context.rtype;   // rtype is the first byte of every record
    if (rtype == CONTEXT_T) {
      icontext = (up->context.traplevel > 0) ? 0 : up->context.primD;
      dcontext = up->context.primD;
    } else if (rtype == PAVADIFF_T) {
      // remember the latest (PA-VA) values; they apply to later instrs
      rstf_pavadiffT * pv = & up->pavadiff;
      icontext = pv->icontext;
      dcontext = pv->dcontext;
      pava_pc = pv->pc_pa_va;
      if (pv->ea_valid) {
        pava_ea = pv->ea_pa_va;
      }
    } else if (rtype == INSTR_T) {
      rstf_instrT * ip = & up->instr;
      short ih = ip->ihash;
      if ( ih == IH_LDX ) {   // shade V5 ihash values
        // instr is a LDX
      }
      iw = ip->instr;
      pc = ip->pc_va;
      pc_pa = pc + pava_pc;   // PA = VA + (PA-VA)
      fprintf(out, "%4lld PC V/P %llx / %llx (IW=0x%8x)",
              recidx, pc, pc_pa, iw);
      if (ip->ea_valid) {
        ea = ip->ea_va;
        ea_pa = ea + pava_ea;
        fprintf(out, "(EA V/P %llx / %llx)", ea, ea_pa);
      }
      fprintf(out, "\n");
    }
  }
\end{verbatim}
282
283
284\section{Versions of RST}
285
\begin{verbatim}
  2.0   6/??/2001  Mostly the same as V1.10. Has official MP support
  1.10  5/09/2001  MP support via cpuid field; better utility fn API
  1.9   4/20/2001  PREG_T: add cpuid, rename asiReg
  1.8   3/27/2001  unixcommand(), rstf_snprintf(), stdized rstf_headerT,magic
  1.7   3/26/2001  Add RECNUM_T for rst-snapper
  1.6   3/15/2001  Add support for MP (cpu-id to pavadiff, more TLB info)
  1.5   9/18/2000  Fixed Shade V6 record types (thanks Kelvin)
  1.4   9/9/2000   Added icontext and dcontext to PAVADIFF_T rec
  1.4   9/?/2000   Added major, minor numbers to HEADER_T rec
  1.3   8/25/2000  Added PATCH_T type.
  1.2   8/22/2000  Added STATUS_T type.
\end{verbatim}
299
300\subsection{Where do I get RST code?}
301
The \textff{rstf.h} header file is at
\textff{/import/archperf/include/rstf/rstf.h}.
Various other RST binaries exist in \textff{/import/archperf/bin/}.

Source code for various RST utilities is in the Code Manager WS
\textff{/import/archperf/ws/rstf/}.
308
309\labsec{Canonical}
310\section{Detailed specification of the RST trace type}
311
We present a precise specification for common record types, as a
reference for various analyzers. Compounding matters, different trace
sources produce slightly differing values even for common cases. In
particular, the \textss{ea\_va} field for branches has several
interpretations. Historical note: In my mind, when defining the
\textss{INSTR\_T} record, there was no chance for ambiguity. I was
wrong.
319
320\subsection{Analyzing a canonical RST trace}
321
322The following table shows where/how to get information from an RST
323trace. The notation \textss{ttt::fff} means to look at field
324\textss{fff} in the \textem{last} record of type \textss{ttt}. The
325notation \textss{ppp?xxx:yyy} means use value \textss{xxx} if predicate
326\textss{ppp} is true, else use \textss{yyy}; if \textss{yyy} is missing
327then there is no valid data value.
328
329\begin{tabular}{|l|l|} \hline
330 Value of interest & Where / how \\ \hline
331 IW (instr) & \textss{I-TLB miss} ? 0x0 : \textss{INSTR\_T::instr} \\ \hline
332 PC VA & \textss{INSTR\_T::pc\_va} \\ \hline
333 PC PA & \textss{INSTR\_T::pc\_va} + \textss{PAVADIFF\_T::pc\_pa\_va}
334(known wrong if I-TLB miss) \\ \hline
335 ld/st VA & \textss{INSTR\_T::ea\_valid ? INSTR\_T::ea\_va} \\ \hline
336 ld/st PA & \textss{INSTR\_T::ea\_valid ? (INSTR\_T::ea\_va +
337PAVADIFF\_T::ea\_pa\_va)} (PA invalid on D-TLB miss or non-translating ASI) \\ \hline
338 PC I context & \textss{PAVADIFF\_T::icontext} \\ \hline
339 ld/st D context & \textss{PAVADIFF\_T::dcontext} \\ \hline
340 ld/st ASI goes to mem ? & examine ASI; translating+bypass ASI's go to memory \\ \hline
341 instr is CTI ? & decode IW or look at \textss{INSTR\_T::ihash} \\ \hline
342 CTI is taken ? & \textss{INSTR\_T::bt} \\ \hline
343 taken CTI target & \textss{INSTR\_T::bt ? INSTR\_T::ea\_va} \\ \hline
344 instr is annulled ? & \textss{INSTR\_T::an} \\ \hline
345 \hline
346 ld/st ASI & \textss{Immed-ASI ? IW : PREG\_T::asireg} (Decode IW to
347determine Immed-ASI)\\ \hline
 trap level & \textss{saw TRAP\_T ? (TRAP\_T::tl + 1) :
PREG\_T::trap\_lvl} \\ \hline
 enter trap & Get a \textss{TRAP\_T} record and/or
\textss{INSTR\_T::tr} \\ \hline
 trap type & \textss{TRAP\_T::ttype} \\ \hline
 exit trap & Get DONE / RETRY IW and/or get \textss{PREG\_T} \\ \hline
 system call & \textss{TRAP\_T::syscall} \\ \hline
 TLB demap & \textss{TLB\_T::demap} is 1 \\ \hline
356\end{tabular}
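
The \textss{ppp?xxx:yyy} entries with a missing \textss{yyy} map
naturally onto an optional value. The sketch below shows two rows of the
table that way. The structs are simplified stand-ins that borrow the
field names used above; they are assumptions for illustration and do not
reproduce the actual \textff{rstf.h} layouts.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified mirrors of the fields named in the table.
// These are NOT the rstf.h record layouts.
struct DemoInstr {
    std::uint64_t pc_va;
    std::uint64_t ea_va;
    bool ea_valid;
    bool bt;  // CTI taken
    bool an;  // annulled
};

struct DemoPavadiff {
    std::uint64_t pc_pa_va;
    std::uint64_t ea_pa_va;
};

// "ld/st PA": INSTR_T::ea_valid ? (ea_va + PAVADIFF_T::ea_pa_va) : no value.
// The caller is assumed to have checked that the instr is a ld/st with a
// translating ASI and no D-TLB miss; this helper does not check that.
inline std::optional<std::uint64_t> ldst_pa(const DemoInstr& i,
                                            const DemoPavadiff& last) {
    if (!i.ea_valid) return std::nullopt;
    return i.ea_va + last.ea_pa_va;
}

// "taken CTI target": INSTR_T::bt ? ea_va : no value.
inline std::optional<std::uint64_t> taken_cti_target(const DemoInstr& i) {
    if (!i.bt) return std::nullopt;
    return i.ea_va;
}
```

Here \textss{last} stands for the most recent \textss{PAVADIFF\_T}
record, matching the \textss{ttt::fff} notation.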
357
The \textss{PREG\_T} (privileged register) record is the new name for
the badly named \textss{CONTEXT\_T} record, which encodes various
hardware values. The \textss{CONTEXT\_T} name is deprecated as of RST
format V2.05. Making things all the more galling, the
\textss{CONTEXT\_T/PREG\_T} record \textem{should not be used to
detect context switches}.
364
365\begin{tabular}{|l|l|} \hline
366 Register & Where / how \\ \hline
367 PSTATE & \textss{CONTEXT\_T::pstate} \\ \hline
368 ASI reg & \textss{CONTEXT\_T::asireg} \\ \hline
369 I-MMU primary context & \textss{CONTEXT\_T::primA} (non-existent for
370SPARC) \\ \hline
371 I-MMU secondary context & \textss{CONTEXT\_T::secA} (non-existent for
372SPARC) \\ \hline
373 D-MMU primary context & \textss{CONTEXT\_T::primD} \\ \hline
374 D-MMU secondary context & \textss{CONTEXT\_T::secD} \\ \hline
375\end{tabular}
376
377\subsection{Table of common cases for INSTR and PAVADIFF records}
378
379To help clarify the above information, the following table lists the
380value from \textss{INSTR\_T} and \textss{PAVADIFF\_T} fields for the
381various common cases. We use the nomenclature \texttt{undef}=undefined,
382\textss{valid}=expected value, N/A=not used or not applicable, and
\texttt{impdep}=implementation dependent. In particular, pavadiff
records are emitted on demand and an N/A entry will not trigger a
pavadiff record.
386
387\begin{tabular}{|l|l|l|l|l|l|}
388 & \mc{3}{|c|}{\textss{INSTR\_T}} & \mc{2}{|c|}{\textss{PAVADIFF\_T}} \\
389Case & \texttt{IW} & \texttt{pc\_va} & \texttt{ea\_va} & \texttt{pavaPC} & \texttt{pavaEA} \\
390ITLB miss & norec & norec & norec & norec & norec \\
391non-mem, non-CTI & valid & PC VA & undef & valid & valid \\
392memop good & valid & PC VA & EA VA & valid & valid \\
393memop DTLB miss & valid & PC VA & EA VA & valid & N/A \\
394CTI taken & valid & PC VA & target PC & valid & N/A \\
395CTI non-taken & valid & PC VA & \textem{impdep} & valid & N/A \\
396annul ITLB miss & undef & PC VA & norec & norec & norec \\
397annul instr & valid & PC VA & N/A & valid & N/A \\
398\end{tabular}
399
400\subsection{Ordering of simultaneous records}
401
When several RST records apply to a given event, we recommend the
following ordering. Records in the same group can be ordered
arbitrarily. Our guiding strategy is to make it easy
for an RST trace consumer to process the information. Generally, we try
to put the \textss{INSTR\_T} record last, unless there is a
\textss{REGVAL\_T} record containing values produced by the instr.
408
409\begin{tabular}{|l|l|}
410 Group 1 & \textss{CPU\_T} \\
411 Group 2 & \textss{PREG\_T}, \textss{REGVAL\_T} with postInstr=0 \\
412 Group 3 & \textss{TRAP\_T}, \textss{TRAPEXIT\_T} \\
413 Group 4 & \textss{PAVADIFF\_T} \\
414 Group 5 & \textss{INSTR\_T} \\
415 Group last & \textss{REGVAL\_T} with postInstr=1 \\
416\end{tabular}
417
418If \textss{REGVAL\_T::postInstr} is set, then the values are those
419present after the instruction has executed.
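
A trace producer could apply the recommended ordering with a stable sort
keyed on the group number, as sketched below. The \textss{Kind} tags and
the \textss{DemoRec} struct are hypothetical stand-ins for real RST
records, and mapping "Group last" to 6 is our choice.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical tags for the record kinds named in the table;
// these are not the rstf.h rtype codes.
enum class Kind { Cpu, Preg, Regval, Trap, TrapExit, Pavadiff, Instr };

struct DemoRec {
    Kind kind;
    bool post_instr;  // only meaningful for Kind::Regval
};

// Group number from the table; "Group last" is mapped to 6.
inline int group_of(const DemoRec& r) {
    switch (r.kind) {
        case Kind::Cpu:      return 1;
        case Kind::Preg:     return 2;
        case Kind::Regval:   return r.post_instr ? 6 : 2;
        case Kind::Trap:
        case Kind::TrapExit: return 3;
        case Kind::Pavadiff: return 4;
        case Kind::Instr:    return 5;
    }
    return 6;  // not reached for the kinds above
}

// Order the records that belong to one event. A stable sort keeps the
// producer's original order within a group, which the text allows to
// be arbitrary.
inline void order_simultaneous(std::vector<DemoRec>& recs) {
    std::stable_sort(recs.begin(), recs.end(),
                     [](const DemoRec& a, const DemoRec& b) {
                         return group_of(a) < group_of(b);
                     });
}
```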
420
421\swallow{
422arch specific
423 algorithm for I context changes in MM
424 TLB field
425 ihash fn
426}
427
428\subsection{Common errors}
429
430To detect a context switch, examine \textss{PAVADIFF\_T::icontext},
431\textsl{do not use a \textss{CONTEXT\_T} record}.
432\textss{PAVADIFF\_T::icontext} and \textss{PAVADIFF\_T::dcontext} give
433the effective I and D contexts being used, which is correct to the best
434of my knowledge.
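
A minimal sketch of this check follows. It assumes the I-context fits in
16 bits and treats any change of \textss{PAVADIFF\_T::icontext} as a
switch; the class name is ours. Whether transitions to and from context
0 (kernel) should count is a policy decision left to the analyzer.

```cpp
#include <cstdint>
#include <optional>

// Tracks the effective I-context across PAVADIFF records.
// Assumption: the context value fits in 16 bits.
class ContextSwitchDetector {
public:
    // Feed the icontext of each PAVADIFF record in trace order.
    // Returns true when it differs from the previously seen value.
    // The first record only establishes a baseline and returns false.
    bool on_pavadiff(std::uint16_t icontext) {
        const bool changed = last_.has_value() && *last_ != icontext;
        last_ = icontext;
        return changed;
    }

private:
    std::optional<std::uint16_t> last_;
};
```

For an MP trace one detector per CPU would be needed, since each CPU has
its own stream of PAVADIFF records.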
435
On an ASI LD/ST using an immediate ASI, do not use the
\textss{CONTEXT\_T::asireg} field, as this field contains the contents
of the ASI register, not the immediate ASI encoded in the instruction word.
439
The \textss{TLB\_T::valid} bit is meaningless. To determine if a TLB
line is valid, look instead at the valid bit in the TLB TTE data.
442
443\subsection{Clarification and the blaze RST tracer}
444
445We cover several ambiguous corner cases in this section. We also
446describe what the blaze V1.x and V2.x RST tracer does in these cases.
447
448\begin{rqitemize}{0em}
449 \item {}[Annulled instr - I-TLB hit]
450 The VA PC, PA PC and IW are all valid. The blaze RST tracer emits an
451 instruction record with the annulled bit set.
452
453 \item {}[Annulled instr - with ITLB miss] If there is an I-TLB miss on
454 an annulled instruction, \textsl{only the VA PC is valid}. Because we
455 cannot determine the PC PA, we cannot even fetch the IW.
456
(The blaze RST tracer) On an annulled instr that misses the I-TLB, we
emit an IW=0 (illegal trap) and blithely let the previous PAVADIFF
remain. While this PC PAVADIFF is technically wrong, we deemed it
less intrusive than generating a PAVADIFF just to flag that we have an
unknown PC PA. (I had tried emitting a special PAVADIFF, but have since
retracted this approach.) Additionally, trace analyzers that do care
about the PC PA of an annulled instruction should be smart enough to
suppress the I-TLB miss.
465
466 \item {}[Memory ops with a D-TLB miss]
467
468 A memop that has a D-TLB miss will appear twice in the trace. The
469 sequence will be $(i)$ memop-try-1 + DTLB miss, $(ii)$ D-TLB miss handler
470 with possible complications like TSB miss and/or page fault, $(iii)$
471 memop try-2 which succeeds. On a D-TLB miss, the
472 PA EA will be unknown on the first try.
473
(The blaze RST tracer) On a D-TLB miss, we first emit an RST trap
record indicating the D-TLB miss and then emit the instr record for the
memop. The RST instruction record for the memop has its trap bit set.
That's it. In particular, we do not emit a PAVADIFF record, as we rely
on the trace consumer to detect the DTLB miss and squash the memop.
479
480 \item {}[EA for untaken branches]
481If a CTI instr is not taken (an untaken branch), we know we shall fall
482through to next PC. In this case, \textss{INSTR\_T::ea\_valid} is 0,
483and \textss{INSTR\_T::ea\_va} \textsl{is unspecified}. For blaze and
484atrace-based RST traces, \textss{INSTR\_T::ea\_va = PC +8}; for
485shade-based RST traces, \textss{INSTR\_T::ea\_va = taken-target-PC}.
486
487 \item {} [PSTATE.AM = 1] If the AM bit is one, all virtual addresses
488are limited to 32 bits. The \textss{INSTR\_T::ea\_va} field must
489contain a 32-bit value; namely the upper 32 bits of the 64-bit
490\textss{ea\_va} field must be zero.
491
492 \item {}[PA EA only applies to mem ops]
493Although \textss{INSTR\_T::ea\_va} holds both $(i)$ memory addresses and
494$(ii)$ CTI target, you must use \textss{PAVADIFF\_T::ea\_pa\_va} value
495only for memory operations. The RST spec forbids using PAVADIFF to get
496the PA of a CTI/branch target, because $(a)$ you can get the PA PC from
497the actual target instruction itself and $(b)$ at the time of the
498CTI/branch, the PA PC may not be known as we may incur an I-MMU miss.
499
500 \item {}[TLB demap operation] On a TLB demap operation, we record the
501 VA and context of the TLB entry that is demapped. We do not record
502 the TTE\_data of the entry being demapped. (There was a bug where the
503 \textss{TLB\_T::demap} was not being set.)
504
505 \item {}[LD/ST ASI to non-memory (e.g. to an MMU register)] For all
506loads and stores, even load/store ASI instructions,
507\textss{INSTR\_T::ea\_valid = 1} and \textss{INSTR\_T::ea\_va} holds
the virtual address. Some ld/st ASIs have meaningful virtual
addresses, e.g. \textss{ASI\_UDB\_INTR\_W} or
\textss{ASI\_DTLB\_DATA\_TAG\_REG}, so the \textss{INSTR\_T::ea\_va}
must contain the effective address.
512
513For non-translating ASI's, the downstream trace analyzer must not
514generate a PA, which puts the burden of knowing whether to generate a PA
515on the trace analyzer. The ASI's obey the following breakdown, where
516[aa,bb] is the range inclusive of aa and bb, namely $aa \le x \le bb$.
517
518\begin{tabular}{|l|l|} \hline
519 Range & How to get PA from VA \\ \hline
520 {}[0x04,0x11] [0x18,0x19] [0x24,0x2c] & Translate via MMU \\
521 {}[0x70,0x73] [0x78,0x79] [0x80,0xff] & Translate via MMU \\ \hline
522 {}[0x14,0x15] [0x1c,0x1d] & Bypass. PA=VA \\ \hline
523 {}[0x45,0x6f] [0x76,0x77] [0x7e,0x7f] & Non-translating (no PA) \\ \hline
524\end{tabular}
525
The following table lists some common translating ASI's.
527
528\begin{tabular}{|l|l|}
529 Value & ASI name \\
530 0X04 & NUCLEUS \\
531 0X0C & NUCLEUS\_LITTLE \\
532 0X10 & AS\_IF\_USER\_PRIMARY \\
533 0X11 & AS\_IF\_USER\_SECONDARY \\
534 0X80 & PRIMARY (the default ASI for all loads) \\
535 0X81 & SECONDARY \\
536 0X82 & PRIMARY\_NO\_FAULT \\
537 0X83 & SECONDARY\_NO\_FAULT \\
538 0X88 & PRIMARY\_LITTLE \\
539 0X89 & SECONDARY\_LITTLE \\
540 0XE0 & BLK\_COMMIT\_PRIMARY \\
541 0XE1 & BLK\_COMMIT\_SECONDARY \\
542 0XF0 & BLK\_PRIMARY \\
543 0XF1 & BLK\_SECONDARY \\
544\end{tabular}
545
546\end{rqitemize}
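
Since the burden of deciding whether to generate a PA falls on the trace
analyzer, the ASI range table above can be turned into a small
classifier. The sketch below is one way to write it. The names
\textss{AsiKind}, \textss{classify\_asi} and \textss{asi\_ea\_pa} are
ours, and ASI values that fall in none of the listed ranges are reported
as \textss{Unlisted} rather than guessed at.

```cpp
#include <cstdint>
#include <optional>

enum class AsiKind { Translating, Bypass, NonTranslating, Unlisted };

// Inclusive range test, matching the [aa,bb] notation of the table.
constexpr bool in_range(std::uint8_t x, std::uint8_t lo, std::uint8_t hi) {
    return lo <= x && x <= hi;
}

// Classify an ASI according to the range table above.
constexpr AsiKind classify_asi(std::uint8_t asi) {
    if (in_range(asi, 0x04, 0x11) || in_range(asi, 0x18, 0x19) ||
        in_range(asi, 0x24, 0x2c) || in_range(asi, 0x70, 0x73) ||
        in_range(asi, 0x78, 0x79) || asi >= 0x80) {
        return AsiKind::Translating;      // translate via MMU
    }
    if (in_range(asi, 0x14, 0x15) || in_range(asi, 0x1c, 0x1d)) {
        return AsiKind::Bypass;           // PA = VA
    }
    if (in_range(asi, 0x45, 0x6f) || in_range(asi, 0x76, 0x77) ||
        in_range(asi, 0x7e, 0x7f)) {
        return AsiKind::NonTranslating;   // no PA
    }
    return AsiKind::Unlisted;             // not covered by the table
}

// PA of a ld/st EA, or no value when the ASI does not yield one.
// `ea_pa_va` is the value from the last PAVADIFF record.
inline std::optional<std::uint64_t> asi_ea_pa(std::uint8_t asi,
                                              std::uint64_t ea_va,
                                              std::uint64_t ea_pa_va) {
    switch (classify_asi(asi)) {
        case AsiKind::Translating: return ea_va + ea_pa_va;
        case AsiKind::Bypass:      return ea_va;
        default:                   return std::nullopt;
    }
}
```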
547
548\labsec{PAVA}
549\subsection{VA to PA translation}
550
The RST format is designed to capture both VAs and PAs. There are
several overlapping ways to specify the necessary information. As of
11/2001, the de facto standard is the \texttt{PAVADIFF} method and as
such, you may safely assume \textss{PAVADIFF\_T} records always exist.
(As of 3/2001, blaze, atrace and shade based RST traces all use
PAVADIFF\_T records).
557
\textsl{PAVADIFF method:} A standard method is to use \texttt{PAVADIFF}
records, which capture the (PA-VA) values for the I-TLB and D-TLB. In
a \texttt{PAVADIFF} record, the \textss{pc\_pa\_va} field contains the
difference of (PA-VA) for the PC of the next \texttt{INSTR} record. If
the INSTR is a \texttt{load} or \texttt{store} (but not a
\texttt{branch/call/jump}), then the \textss{ea\_pa\_va} field of the
\texttt{PAVADIFF} record holds the (PA-VA) for the EA, which is how the
D-TLB would translate that EA.

As of RST V1.9 (4/2001), there is a separate \texttt{PAVADIFF} record
for each CPU. The CPU ID is contained in \texttt{PAVADIFF\_T::cpuid}.
Here is how the trace might look.
569
570{\footnotesize
571\begin{verbatim}
572 3 pavadiff: context=571 cpu=0 pc_pa_va=0x00000002c0800000 ((ea_pa_va=0xffffffffffffffff)) ea_valid=0
573 4 instr : u [0x000000010050f144] srl %i1, 0, %i0
574 5 instr : u [0x000000010050f148] or %g3, %g2, %g2
575 6 instr : u [0x000000010050f14c] sll %i0, 2, %g3
576 7 pavadiff: context=571 pc_pa_va=0x00000002c0800000 ea_pa_va=0x00000002c0800000 ea_valid=1
577 8 instr : u [0x000000010050f150] ldsw [%g3 + %g2], %g3 [0x000000010050e390]
578 9 instr : u [0x000000010050f154] jmpl %g3 + %g2, %g0 T
579 10 instr : u [0x000000010050f158] nop
580\end{verbatim}
581}
582
Note that for a control-transfer instruction (e.g. \texttt{br/call/jmpl})
which jumps to a target PC, \texttt{targPC}, you must "wait" until you
see the target INSTR record to determine the PA for \texttt{targPC}. If
\texttt{targPC} is on a different page with a different (PA-VA) value
than the current PC, there will be a PAVADIFF record before the target
instruction.
589
590\begin{rqcode}{ }
591 PAVADIFF pc\_pa\_va=diffaa
592 ...
593 PCaa br targPC // targPC is on a different page
594 PCaa+4 delay slot instr
595
596 PAVADIFF pc\_pa\_va=diffbb // new value for PA-VA for targPC.
597 targPC target instruction // PA of targPC = (targPC + diffbb)
598\end{rqcode}
599
There are two common ways of using PAVADIFF records. With
\textsl{on-change}, I generate PAVADIFF records only when the (PA-VA)
values change for either the PC or the EA. If the \textss{ea\_valid}
field is false (0), then the previous \textss{ea\_pa\_va} value is still
assumed to be correct. One caveat: it is possible for the (PA-VA) values
to be the same for different (TLB) pages, in which case you may not see
a PAVADIFF record even when we cross pages.
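
The on-change policy can be sketched from the producer's side as
follows. This is an illustration under our own assumptions, not the
blaze tracer's code: the class name and interface are hypothetical, and
an instruction with no memory EA is modelled by passing an empty
optional.

```cpp
#include <cstdint>
#include <optional>

// Decides, per instruction, whether a PAVADIFF record must precede it.
class OnChangePavadiff {
public:
    // `pc_diff` is the (PA-VA) of the instruction's PC.
    // `ea_diff` is the (PA-VA) of its EA, or empty when the instruction
    // has no EA to translate (ea_valid would be 0 in that case).
    // Returns true if a PAVADIFF record should be emitted first.
    bool update(std::uint64_t pc_diff,
                std::optional<std::uint64_t> ea_diff) {
        bool emit = !pc_.has_value() || *pc_ != pc_diff;
        if (ea_diff.has_value() &&
            (!ea_.has_value() || *ea_ != *ea_diff)) {
            emit = true;
        }
        pc_ = pc_diff;
        if (ea_diff.has_value()) {
            ea_ = ea_diff;  // otherwise the previous EA diff stays current
        }
        return emit;
    }

private:
    std::optional<std::uint64_t> pc_;
    std::optional<std::uint64_t> ea_;
};
```

Note that two pages with equal (PA-VA) values produce no record here,
which is exactly the caveat mentioned above.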
607
Another possible use of PAVADIFF records is on an \textsl{every-instr}
basis, in which a PAVADIFF record precedes every instr, which nearly
doubles the trace size. Nobody in their right mind does this as of
1/2001.
612
\textsl{TLB method:} The conceptually preferred (but practically
difficult) method is to have \texttt{TLB} records which describe all the
necessary mappings before they are used. Whenever a TLB line is
changed, a corresponding TLB record appears in the trace. Also, in an
RST trace from blaze, the entire TLB is dumped at the beginning of the
trace. As of RST V1.9, a TLB record contains the TLB unit (e.g. Cheetah
has two I-TLB units) and the CPU to which it applies.
620
The TLB method for translation is the most compact, because TLB records
should be relatively infrequent in a trace. Despite this compactness,
TLB records cannot be used universally for VA-to-PA translation. To get
PAs, your analyzer program must simulate a TLB, which has proven to be
difficult, slow and hence extremely unpopular. And in many cases (e.g.
when the trace is from atrace or shade), TLB information is unavailable,
so TLB records will be missing.
628
Here is sample output from an RST trace from the blaze \textss{rstracer}
module. Records 9-4104 (roughly 2048 I-TLB + 2048 D-TLB) are the
initial TLB values. At record 4649, we replace I-TLB entry 1090. Here
\texttt{type=0} means I-TLB. The \texttt{demap=0} field means that this
entry is being added to the TLB, which also replaces any previous entry.
634
635{\footnotesize
636\begin{verbatim}
637 1 string : string=date=00-06-26
638 2 string : string=host=bigc
639 3 string : string=ramsize=1024M
640
641 4 string : string=tlbsize=2048
642 5 string : string=nwins=8
643 6 string : string=cpufreq=600000000
644 7 string : string=mpsteps=200
645 8 cpu : cpu=0 timestamp=0x45a318a1df
646 9 tlb : demap=0 type=0 valid=0 index=0 state=0x0000 context=0x0000 tag=0x00000000ffd00000 data=0xa000000000e00064 pa=0xe00000
647 10 tlb : demap=0 type=0 valid=0 index=1 state=0x0000 context=0x0000 tag=0x00000000ffd10000 data=0xa000000000e10064 pa=0xe10000
648 ...
649 ...
650 4103 tlb : demap=0 type=1 valid=1 index=2046 state=0x0000 context=0x110f tag=0x00000003a68b310f data=0xe00010000c000032 pa=0x
651c000000
652 4104 tlb : demap=0 type=1 valid=0 index=2047 state=0x0000 context=0x1115 tag=0x00000003a9fc7115 data=0xe000000008c00032 pa=0x
6538c00000
654 4105 cpu : cpu=1077781320 timestamp=0x45a318a1df
655 4106 context : asi=0x0082 last_context=0x0120 trap_lvl=0x00 trap_type=0x00 pstate=0x0012 primA=0x0000 secA=0x0000 primD=0x0120 se
656cD=0x0120
657 4107 instr : u [0x000000010087ec18] add %i1, %o2, %o0
658 4108 instr : u [0x000000010087ec1c] ldub [%o2 + %o4], %g3 [0x0000000101b7a7ba]
659 4109 instr : u [0x000000010087ec20] subcc %g3, %g2, %g0
660 4110 instr : u [0x000000010087ec24] bple,a,pn %icc, 0x10087ec30 T [0x000000010087ec30]
661 ...
662 ...
663 4648 instr : p [0x0000000010000cb4] nop an
664 4649 tlb : demap=0 type=0 valid=0 index=1090 state=0x0000 context=0x0120 tag=0x0000000100b2a120 data=0x8000000003756020 pa=0x
6653756000
666 4650 instr : p [0x0000000010000cb8] stxa %g5, [%g0 + %g0]0x54
667 4651 context : asi=0x0082 last_context=0x0000 trap_lvl=0x00 trap_type=0x00 pstate=0x0012 primA=0x0000 secA=0x0000 primD=0x0120 se
668cD=0x0120
669 4652 instr : p [0x0000000010000cbc] retry T [0x0000000100b2bed0]
670 4653 instr : p [0x0000000100b2bed0] or %g0, %o0, %g2
671\end{verbatim}
672}
673
674\textsl{PHYSADDR method:} The last brute force method is to put a
675PHYSADDR record before every instr record. The PHYSADDR record contains
676the PA for the following PC and the EA, if appropriate.
677
678\subsection{VA to PA translation historical notes}
679
680For many months, I (RQ) was convinced the TLB method was the correct way
681to handle VA-PA translation. How difficult could simulating a TLB be?
682The PAVADIFF was meant to be a stop-gap, until correct TLB simulators
683were written. I was wrong. Very wrong.
684
In retrospect, the use of PAVADIFF records has greatly simplified RST
trace processing. Even now (03/2002), two years after the initial
discussion, finding a correct TLB simulator (e.g. one that agrees with
the PAVADIFF records) remains elusive. Special "thanks" to the MM team,
especially Sudi K, for steadfastly being unable to use TLB records,
forcing PAVADIFF records to become the standard.
691
692\subsection{Underlying philosophy}
693
694You should be able to glean most of what you want to know from just
695\textss{INSTR\_T}, \textss{PAVADIFF\_T} and \textss{TRAP\_T} records.
696
697((to be finished))
698
The \textss{TLB\_T} records, which let you do your own VA to PA
translation, were extremely unpopular and have been superseded in
practice by \textss{PAVADIFF\_T} records.
702
703\textss{CONTEXT\_T} records were originally meant to be much more useful.
704
705\section{The blaze RST trace}
706
While each individual RST record type is fairly unambiguous, how the
records are put together is implementation dependent.
709
710In a \textss{TRAP\_T} record, the HW state is that \textbf{before} the
711trap is taken. Thus, a trap from user code will show \textss{TL=0}.
712
713If executing an instr causes a trap, say a D-TLB miss, you will see the
714instr (with the \textss{tr} bit set) and then trap. If fetching an
715instr causes a trap (e.g. IMMU miss), you will not see the instruction
716until after the trap returns.
717
718If there is a write to the \textss{PSTATE} or the \textss{TL} registers,
719the new values are shown in a \textss{CONTEXT\_T} record.
720
721\subsection{Information in the RST header}
722
723As of 4/2001 (V1.4), the blaze \textss{rstracer} module spits out
724copious information about the configuration. Before that, a more modest
725modicum of information was spit out. As of Version 1.4, we get the
726following series of records at the start of a trace.
727
728\begin{verbatim}
729 $ trv.sh -n 24 /import/arch-trace03/blaze/tpcc-try8/try8-t6.rsz
730RST trace format (stdin)
731================
732 User/ Branch
733 Rec # Type Priv PC Disassembly Taken EA
734 0 header : majorVer=1 minorVer=8 RST Header v1.8
735 1 strdesc : "Blaze [ rstracer.so ]"
736 2 strdesc : "rstracer=V1.4"
737 6 strdesc : "rstracer [compiled against Blz3.48 - Excal 5.8 RW MP ||Disk API=[Trace,Timing]]"
738 8 strdesc : "date=2001-04-05_01:51:56"
739 9 strdesc : "host=bigc"
740 10 strdesc : "<blazeinfo>"
741 13 strdesc : "blz::version=3.49 - Excal 5.8 RW MP ||Disk API=[Trace,Timing]"
742 14 strdesc : "blz::ncpus=1"
743 15 strdesc : "blz::ram=1024M "
744 16 strdesc : "blz::tlbsize=2048"
745 17 strdesc : "blz::mmutype=spitfire"
746 18 strdesc : "blz::cpufreq=200000000"
747 19 strdesc : "blz::sysfreq=10000000"
748 20 strdesc : "blz::diskdelay=800000"
749 21 strdesc : "blz::nwins=8"
750 22 strdesc : "blz::mpsteps=2"
751 23 strdesc : "</blazeinfo>"
752\end{verbatim}
753
754\section{Where is the source for RST?}
755
756The current source is in \textss{/import/archperf/pkgs/rstf/latest/}.
757
758\begin{tabular}{|l|l|} \hline
759 file & Description \\ \hline
760 \rqhttp{\textss{rstf.h}}{file:/import/archperf/pkgs/rstf/latest/rstf.h} & the RST
761format \\ \hline
762 \rqhttp{\textss{rstf.c}}{file:/import/archperf/pkgs/rstf/latest/rstf.c} & a few utility routines and some test code \\ \hline
763\end{tabular}
764
765\section{I want to process an RST trace, where do I start?}
766
A simple, sample C++ skeleton to read an RST trace file is at
\textss{/import/archperf/pkgs/rstf/latest/readRST.C}.

A simple, sample ANSI C skeleton to read an RST trace file is at
\textss{/import/archperf/pkgs/rstf/latest/readRST-ansiC.c}. I "thank"
Anders who found the C++ skeleton impenetrable, and so spent several
hours doing numerous moronic things getting this code to work.
774
775Finally, the file \textss{/import/archperf/ws/rstf/rstFilter.h}
776contains a more realistic (i.e. complicated) example of RST processing
777in which we read an RST trace, adding/modifying/deleting records, and
778generate a new RST trace. This code double buffers both input and
779output to guarantee that we can always access/modify the previous K
780records at both the input and output. (In contrast, if you use a single
781buffer and you just happen to fill (flush) the input (output) buffer,
782you cannot access or modify the previous record).
783
784\subsection{The actual RST code}
785
Here are the corresponding record definitions directly from
\textss{rstf.h}. The code on this web page may be a bit out of date, so
check the source \rqlink{/import/archperf/pkgs/rstf/latest/rstf.h}.
789
790\begin{verbatim}
791typedef struct {
792 uint8_t rtype; /* value = INSTR_T */
793 unsigned notused : 1; /* not used */
794 unsigned ea_valid : 1; /* ea_va field is valid */
795 unsigned tr : 1; /* trap occured 1=yes */
796 unsigned notused2 : 1; /* not used */
797 unsigned pr : 1; /* priviledged or user 1=priv */
798 unsigned bt : 1; /* branch/trap taken, cond-move/st done, like Shade6 */
799 unsigned an : 1; /* 1=annulled (instr was not executed) */
800 unsigned reservedCompress : 1; /* used by rstzip compression */
801 uint16_t ihash; /* ihash value (optional) */
802 uint32_t instr; /* instruction word (opcode, src, dest) */
803 uint64_t pc_va; /* VA */
804 uint64_t ea_va; /* Eff addr VA */
805} rstf_instrT;
806
807typedef struct {
808 uint8_t rtype; /* value = PAVADIFF_T */
809 unsigned ea_valid : 1; /* does ea_pa contain a valid address */
810 unsigned cpuid : 7;
811 uint16_t notused16; /* (deprecated) context used for these diffs */
812 uint16_t icontext; /* I-context used for these diffs */
813 uint16_t dcontext; /* only valid if ea_valid is true, */
814 uint64_t pc_pa_va; /* (PA-VA) of PC */
815 uint64_t ea_pa_va; /* (PA-VA) of EA for ld/st (not branches), if ea_valid is true */
816} rstf_pavadiffT;
817
818typedef struct {
819 uint8_t rtype; /* value = TRAP_T */
820 unsigned is_async : 1 ; /* asynchronous trap ? */
821 unsigned unused : 3 ; /* unused */
822 unsigned tl : 4 ; /* trap level in the trap handler */
823 uint16_t ttype; /* trap type for V9, only 9 bits matter */
824
825 uint16_t pstate; /* Pstate register in the trap, only 9 bits */
826 uint16_t syscall; /* If a system call, the syscall # */
827
828 uint64_t pc;
829 uint64_t npc;
830} rstf_trapT;
831\end{verbatim}
832
833\section{System calls}
834
Depending on the tracing harness, system call information may be present
in the trace. E.g. in \textss{RST/blaze}, system call information is
present.
838
839A system call consists of a (software) trap instruction to trap TRNUM,
840with the \texttt{\%g1} register containing the system call number.
841There is one trap number for 32-bit and a second trap for 64-bit system
842calls.
843
\begin{tabular}{|l|l|} \hline
 TRNUM & system call \\ \hline
 \texttt{0x108} & 32-bit system call \\ \hline
 \texttt{0x140} & 64-bit system call \\ \hline
\end{tabular}
849
850See the C header file \textff{/usr/include/sys/syscall.h} for the system
851call numbers. Thus \textss{2=fork}, \textss{5=open}, and
852\textss{173=pread}. The header file \textff{/usr/include/sys/trap.h}
853has the 32-bit trap number. (I forgot where I found the 64-bit system
854call trap.)
855
Thus, a system call will appear as a \textss{TRAP\_T} record with the
\textss{ttype} field set to either \textss{0x108} or \textss{0x140} and
the \textss{syscall} field holding the system call index. For example, in
this blaze TPCC trace snippet (\textss{t5sds}), the trap record at 86034
indicates a system call (pread) is being made at instruction record
86035.
862
863\begin{verbatim}
864 86032 instr : cpuid=0 u [0xffffffff7dfa34d8] stx %o0, [%sp + 0x87f] [0xffffffff7fff0910]
865 86033 instr : cpuid=0 u [0xffffffff7dfa34dc] or %g0, 0xad, %g1
866 86034 trap : cpuid=0 is_async=0 async==0 tl=0 ttype=0x140 pstate=0x012 syscall=0x00ad
867 86035 instr : cpuid=0 p [0xffffffff7dfa34e0] ta %icc, %g0 + 0x40 T [0x0000000001002800] tr
868\end{verbatim}
869
870\section{Trace format design}
871
872\subsection{How do I encode state (such as warmed cache state) in RST?}
873
In short, do not do this. RST is designed for capturing a dynamic
sequence of events (instructions, TLB activity, etc) from a computer
system.

If you need to store heterogeneous information in a single trace, create a
\rqhttp{unatrace}{http://smeeng.eng/\rqtilde{}quong/unawrap.html}, which
is a general purpose trace \textsl{wrapper} format. Aztecs snaps, which
consist of [cache + TLB + branch predictor warming + RST instruction
traces], use the unawrap format.
883
884\subsection{Design tradeoffs in RST}
885
Any trace format must be a balance of the following design tradeoffs,
because not all properties can be achieved simultaneously. We evaluate
RST against various criteria.
889
890\begin{tabularx}{\linewidth}{|l|l|l|X|} \hline
891 Goal & RST grade & Conflicts with & Description \\ \hline
 Simple & A & Size & Trace format should be easy to use. RST uses a
fixed size record so it is easy to skip N records. RST has a common
rtype byte so decoding a record is very easy. \\ \hline
 Size & D & Simplicity & Information density should be high, as traces
are often very large. RST requires about 30 bytes per instruction
(PA+VA, PC+EA, TLB, trap events). We believe a separate compression
phase can be used to reduce the RST size (use of a beta quality
compressor and gzip reduced the size of RST by approx 5-10X). \\ \hline
 Flexible & A & Size & A trace should be able to hold different types
of data. A trace format which uses a single fixed record type severely
restricts flexibility, because every record must have a field for every
type of data. We avoid this in RST by having different record types.
 \\ \hline
905\end{tabularx}
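
As an illustration of the ``Simple'' row, here is a minimal sketch of
skipping to record N and reading its rtype. It assumes an uncompressed
RST file made only of the 24-byte records described in this document;
the function names are hypothetical and this is not the code in
\textss{readRST-ansiC.c}.

```c
#include <assert.h>
#include <stdio.h>

enum { RST_RECORD_BYTES = 24 };   /* fixed record size stated in this document */

/* Position fp at record number recno (0-based). Returns 0 on success.
 * Plain fseek() takes a long offset, which limits the reachable file
 * size on systems where long is 32 bits. */
static int rst_seek_record(FILE *fp, long recno)
{
    assert(fp != NULL && recno >= 0);
    return fseek(fp, recno * RST_RECORD_BYTES, SEEK_SET);
}

/* Read the record at the current position into rec and return its
 * rtype byte, or -1 at end of file or on a short read. */
static int rst_next_rtype(FILE *fp, unsigned char rec[RST_RECORD_BYTES])
{
    assert(fp != NULL);
    if (fread(rec, 1, RST_RECORD_BYTES, fp) != RST_RECORD_BYTES)
        return -1;
    return rec[0];                /* rtype is the first byte of every record */
}
```

Because every record has the same size, no record before N has to be
decoded to find where N starts.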
906
Other RST design notes. (1) The RST trace instruction record was
designed to hold an instruction word (32-bit), instruction address
(64-bit PC), memory effective address (64-bit EA) and other overhead
such as the \textss{rtype} byte. This led to the 24-byte record size.
911
912\section{Patching for Aztecs}
913
914\begin{verbatim}
 1418 instr : u [0x000000010048e9b8] add %g4, 1, %g4 tr
 1419 patch : isbegin=1 rewindrecs=0 id=1 length=2 descr=atrPCdAZ
 1420 instr : u [0x000000010048e9bc] jmpl %g2 + 0, %g1 T [0x0000000078404780]
 1421 instr : u [0x000000010048e9c0] nop
 1422 patch : isbegin=0 rewindrecs=0 id=1 length=2 descr=atrPCdAZ
 1423 context : asi=0x0000 last_context=0x0000 trap_lvl=0x00 trap_type=0x00 pstate=0x0000 primA=0x0000 secA=0x0000 primD=0x0000 secD=0x0000
 1424 pavadiff: context=0 pc_pa_va=0x00000003673c0000 ((ea_pa_va=0xffffffffffffffff)) ea_valid=0
 1425 instr : u [0x0000000078404780] save %sp, -0xb0, %sp
924\end{verbatim}
925
926\section{FAQ}
927
928\subsection{There is a TRAP record and a tr bit in the instruction record. What is the difference?}
929
930The trap record contains many values including the trap type, trap
931level, PC, NPC, pstate register and the system call number (\%g1
932register) on a syscall trap.
933
934The \textss{tr} bit in the instruction simply indicates if a trap
935occurred during this instruction. The tr bit is necessary to clearly
936distinguish when a trap occurs.
937
938\section{The rstf workspace}
939
940\subsection{Purpose}
941
942\begin{rqenumerate}{0em}
943 \item The main purpose of this WS is to define the RST file format in
944 \textff{rstf.h}.
945 Some secondary and/or deprecated definitions are in
946 \textff{rstf\_*.h}
947
948 \item A secondary purpose is to define common RST utilities/code, including
949 starter code and RST-to-RST filters.
950\end{rqenumerate}
951
952\subsection{Guidance on updating this workspace}
953
The file \textff{rstf.h} defines the RST file format. The file format
consists of the rtype definitions and the fields within each record.
\textbf{Many} other programs use \textff{rstf.h}. So....
957
958\begin{rqitemize}{0em}
959 \item Try to avoid changing this file if possible.
960 In the last 12 months (10/2001-10/2002), I have bumped the minor
961 version once.
962 \item Avoid breaking backward compatibility \textbf{AT ALL COSTS}.
963 There is considerable data in \texttt{rstf} 2.04-2.06 format.
964 \item The safest changes involve adding new rtypes or adding more constants
965 to existing enumerations. E.g. filling out the register constants
966 in the \textss{REGVAL\_T:regtype[]}
967 \item There is a Java port of \texttt{rstf}, in the (to be released
968 12/2002) \textss{jrst} workspace. A Perl script in \textss{jrst}
969 "parses" \textff{rstf.h} and makes undocumented assumptions about
970 the way \textff{rstf.h} looks. Please try to conform to the
971 existing style in the typedefs and enums.
972
973 \item I (RQ) have tried to be stingy in using \textss{rtype} values.
974 I have unofficially reserved bits 7 and 6 of the rtype as a hedge for
975 (two rounds of) sweeping changes to RST in the distant future if it
 comes to that. Thus, I strongly recommend only using rtypes from
 2-63.
978\end{rqitemize}
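
The last recommendation is easy to check mechanically. The sketch below
is hypothetical (nothing like it is claimed to exist in
\textss{rstf.h}); it tests that a proposed new rtype value leaves
bits 7 and 6 clear and falls in the 2-63 range.

```c
#include <assert.h>
#include <stdint.h>

/* Recommended range for new rtype values, per the guidance above.
 * These constant names are made up for this example. */
enum { RST_RTYPE_MIN = 2, RST_RTYPE_MAX = 63 };

/* The upper bound must not touch the unofficially reserved bits 7 and 6. */
static_assert((RST_RTYPE_MAX & 0xC0) == 0,
              "recommended range must leave bits 7 and 6 clear");

/* Returns 1 if rtype is inside the recommended range, 0 otherwise. */
static int rst_rtype_is_recommended(uint8_t rtype)
{
    return (rtype & 0xC0) == 0 && rtype >= RST_RTYPE_MIN;
}
```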
979
If you must change \textff{rstf.h}, bump the version number in
\textff{rstf.h}.
982
983\subsection{Version numbers}
984
Many programs or code snippets have version numbers. The big rule about
version numbers is that given an RST trace and full knowledge about the
history of the programs involved in producing the trace, you must (or
should) be able to determine what idiosyncrasies exist in that trace.
Note, you do \textem{not} know in advance which versions of the programs
were involved in producing the trace.
991
992As an example, you are given the trace \textss{try8-t24.rz.gz} from
9936/2001, which was produced by \textss{blaze V3} and \textss{rstracer}.
994You are given the phone numbers of all the developers involved in
995tracing at Sun, so you can obtain the history of all programs involved.
996What are the issues, if any, of this trace from a data format and
997correctness standpoint? First you have to determine which components
998(or programs) were involved in this trace. Running \textss{trv.sh -n
99940} on this trace we see
1000
1001\begin{flushleft}
1002 0 header : majorVer=1 minorVer=10 RST Header v1.10\\
1003 4 strdesc : " rstracer=V1.8"\\
1004 8 strdesc : " compiled against Blz 3.64 - Excal 5.8 LL RW MP ||Disk API=[Trace,Timing]"\\
100524 strdesc : "blz::version=3.65 - Excal 5.8 LL RW MP ||Disk API=[Trace,Timing]"\\
1006\end{flushleft}
1007
Thus this is an RSTF v1.10 trace and \textss{blaze V3.65} and the
1009\textss{rstracer V1.8} were involved. You call up their developers and
1010get the details of these programs from the dawn of time until now and
1011have an understanding of the trace issues.
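
Pulling these version numbers out of the strdesc strings can be
automated. The following is a minimal sketch, assuming the strdesc text
has the ``key followed by major.minor'' shape seen in the
\textss{trv.sh} output above; \textss{rst\_find\_version} is a
hypothetical helper, not part of the rstf workspace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Search a strdesc string for key and parse the "major.minor" number
 * that immediately follows it. Returns 1 on success, 0 if the key is
 * absent or no version number follows it. */
static int rst_find_version(const char *strdesc, const char *key,
                            int *major, int *minor)
{
    assert(strdesc != NULL && key != NULL);
    assert(major != NULL && minor != NULL);
    const char *p = strstr(strdesc, key);
    if (p == NULL)
        return 0;
    return sscanf(p + strlen(key), "%d.%d", major, minor) == 2;
}
```

For example, the key \textss{rstracer=V} applied to the strdesc
\textss{" rstracer=V1.8"} yields major 1 and minor 8, and the key
\textss{blz::version=} applied to the blaze strdesc yields 3 and 65.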
1012
1013Thus, here are the strong recommendations regarding version numbers and
1014traces.
1015
1016\begin{rqenumerate}{0em}
1017 \item An RST trace must contain the version numbers of all programs
1018 involved in producing the trace. In the case of
1019 \textss{try8-t24.rz.gz}, this trace has the version numbers of
1020 \textss{rstf}, \textss{rstracer} and \textss{blaze}.
1021
 \item The version number of each component (or program) must indicate
 the state of that component. I.e. if something is changed, the
 version number of that component must be changed.
1025
1026 \item There must be a record of known bugs for each component for each
1027 version number.
1028\end{rqenumerate}
1029
1030Here are some examples of version numbers.
1031
1032\begin{tabularx}{\linewidth}{|l|X|}
 Code & Description/philosophy of version numbers \\
 rstf & Version of the RST Format records. Should not change often. \\
 & The first record in an RSTF trace must define the version number.
 If a new version of RSTF breaks backward compatibility (e.g. the
 format for PAVADIFF changes), increment
 the major version. And this should happen once every never. \\
1039 rstFilter & updated when a filter is added or updated. Update freely. \\
1040 rstracer & (in rstracer WS)
1041 Reflects which version of the rst tracer. The version indicates
1042 what bugs/idiosyncrasies exist. Note that the RST trace produced
1043 by \textss{rstracer} contains both the rstracer version number and
1044 the RSTF version num.
1045\end{tabularx}
1046
1047\subsection{Basic programs and scripts in the rstf workspace}
1048
1049The master workspace for RST is \textff{/import/archperf/ws/rstf}.
1050It should be open to all to do a bringover, aka world bringover-able.
1051If you need to do a putback to this workspace, talk to someone in Arch
1052Tools, say \textss{lren@eng}.
1053
1054\subsubsection{trv.sh}
1055
1056Look at RST files (compressed or not) in ASCII. (Replaces rstunzip and
1057trconv). Runs a PAGER ( \textss{more} or \textss{less} ) if output is a
1058terminal. \textem{Use this program}.
1059
1060\subsubsection{rstFilter.C}
1061
Implements many (30+) RST-to-RST filters (read stdin/file, write
stdout). Typically you need to use several filters in a row. All error
messages go to stderr. This code offers generic double-buffering on
both the input and the output, making it "easier" (hah) to do
transformations that must look at several records.
1067
1068\subsubsection{runRSTFilt.sh}
1069
1070Convenient shell script driver for running \textss{rstFilter}. Use
1071this.
1072
1073\begin{rqcode}{ }
1074 // by hand
1075rstFilter -a filter1 input-file | rstFilter -a filter2 | rstFilter -a
1076filter3 > output
1077
1078 // using runRSTFilt.sh
1079runRSTFilt -a 'filter1 filter2 filter3' > output
1080 // Same as above but generate ASCII dumps of all intermediate files, too
1081runRSTFilt -u -a 'filter1 filter2 filter3' > output
1082
 // E.g. to clean up the raw output from atrace2rst [Atrace->RST] files,
1084runRSTFilt.sh -a 'ihash addBrTarg' [-u] raw.rst > clean.rst
1085\end{rqcode}
1086
1087\subsubsection{atr2rst.sh, atrace2rst.C and dumpatr}
1088
The script \textss{atr2rst.sh} runs \texttt{atrace2rst} and does some
post processing to clean up the RST. The 64-bit executable
\textss{atrace2rst} converts an atrace to raw RST. The post processing
adds ihash values and branch targets among other things.
\textss{Dumpatr} is a hard link to atrace2rst; it is the same as running
\textss{atrace2rst -a}.
1095
1096\subsubsection{snapForAztecs.sh}
1097
1098Generate snaps suitable for aztecs. Snaps the RST file and then runs a
1099horrific combination of RST filters on the result and then compresses
1100the results. Even the author does not want to look at this script.
1101
1102\section{History}
1103
The RST format and this document were started and then maintained by R
Quong through 11/2002.
1106
1107\end{document}