eval 'exec $PERL_CMD -S $0 ${1+"$@"}'
  if 0; # not running under some shell

use strict;
use Pod::Usage;
use File::Temp qw(tempfile);
use Midas::MMU::TTEFormat;

# Copy this file to a temporary file, expanding each %%TTE DATA line into
# a list of =item entries, then hand the result to pod2usage.
my ($ofh, $filename) = tempfile(".midasformat.XXXX", UNLINK=>1);

my $src = $0;
open(SRC, "<$src") || die "Can't open '$src': $!\n";
while(<SRC>) {
  my $line = $_;
  if($line =~ /^\%\%TTE\s*DATA\s*(\S+)\s*(\S+)/) {
    my ($mmu, $format) = ($1, $2);
    if(exists $TSB_DATA_FORMAT{$mmu}{$format}) {
      foreach my $key (keys %{$TSB_DATA_FORMAT{$mmu}{$format}}) {
        my $rec = $TSB_DATA_FORMAT{$mmu}{$format}{$key};
        # A single-bit field prints as "N", a wider one as "hi:lo".
        my $bits = ($rec->{hi} == $rec->{lo}) ?
          "$rec->{hi}" :"$rec->{hi}:$rec->{lo}";
        printf $ofh "=item TTE_%-6s Bits %5s ", $key, $bits;
        if(defined $rec->{descr}) {
          print $ofh "$rec->{descr}\n";
          print $ofh "\n";
        }
        print $ofh "\n";
      }
    }
  } else {
    print $ofh $line;
  }
}
close SRC;

pod2usage(-verbose => 2, -exitval => 0, -input => $filename);
exit(0);

__END__

=head1 NAME

midasformat - Format for diags recognized by B

=head1 DESCRIPTION

This document describes the format of diags that can be assembled by B. Note that this is not a guide for writing diags, since it makes no assumptions about the contents of project-standard include files or boot code.

=head2 Source Languages

B can assemble diags in the following formats:

=over 4

=item Augmented Assembly

The main supported source language, denoted with a .s extension on the source file, is an augmented SPARC assembly file. By "augmented", we refer to several directives used to program the MMU. These directives are interpreted by B, and their presence means that the raw diag, even though it ends in .s, would not be acceptable input to a standard assembler.

An assembly diag should be one file, though it may #include others.

A diag may also contain a perl script, used by the simulation framework to do postprocessing on the diag's output. B scans a diag for the symbol "__PERL__" on a line by itself.
If it finds such a symbol, then everything after the __PERL__ line is taken to be a perl script and is not assembled.

=item PAL (perl augmented language)

A diag that ends in a .pal extension is assumed to be written in PAL (perl augmented language). The diag is run through the PAL preprocessor, and the output of PAL should be an augmented assembly file as described above. B takes care of running the PAL preprocessing phase if the diag ends in .pal. Since PAL and assembly diags are treated identically after PAL preprocessing, the remainder of this document discusses only assembly diags.

=item C

The top-level diag file is always an augmented assembly file (or a PAL script that generated an augmented assembly file), but there are directives to include C programs in a diag. See the L section.

=item Object files (.o) and Library files (.a)

As with C files, raw object (.o) files and static library (.a) files may be included in a diag. See the L section.

=item Linked Executables

These can be included as-is in a diag. See the L section.

=back

=head2 Output Files

The following output files are generated by B (the names of the output files can be reconfigured, but these are the default names):

=over 4

=item F

The F file is the main output of B. It contains the initial contents of physical memory in a verilog memory-image format. The format consists of lines containing "@<physical_address>", followed by lines of data to write at that address.

=item F

This is an event file generated for vera. Any assembly line that contains a comment that begins with "$EV" will generate an entry in this file. I

=item F

This file contains 4 columns: symbol_name, virtual_address, real_address, and physical_address. It exists to support the simulation framework so it can look up addresses for symbol names. If an address is inappropriate (such as the real address for an unmapped section), it is represented by 'X'.

=item F

This is a perl script extracted from the diag (i.e., the code after the "__PERL__" directive). It is used by the simulation framework to do diag-specific post-processing.

=item F

This is the linked executable built by B (the * is the name of the application, if there is a non-default application). It is not used by the simulation framework, but it can be useful, for instance, for disassembly.

=back

=head1 BUILD PROCESS

This section is an overview of how a diag is processed to produce F.

=head2 Preprocessing

Diags are run through several preprocessing steps that enable complex macro environments to perform diag setup. Such macro environments are not part of B and are therefore beyond the scope of this document.

=over 4

=item Split perl and assembly

The first preprocessing step is to split the diag into assembly and perl parts. B assumes that the diag is entirely assembly unless it encounters the symbol "__PERL__" on a line by itself. If it sees this symbol, then everything after it is assumed to be a perl script. The outputs of this phase are files in the build directory: F and, if necessary, F.

=item B

The next step is to run the diag through the C preprocessor. Most diags will #include their boot code and perhaps some project-specific definitions. Diags can also #define symbols before including the boot code to configure its operation. Note that this is a Sun, not a GNU, preprocessor, which means that GNU extensions to cpp (such as preprocessor directives where the '#' isn't in column 1) cannot be used. For information on the default include path, consult the B man page. The output of this stage is a file in the build directory called F.

=item B

After B, the diag is processed by B. This allows macro preprocessing that is substantially more powerful than what is possible with B. The content of these macros is project-specific. The version of B used by B is special in that it was compiled to allow 64-bit arithmetic via the C directive.
The output of this stage is a file in the build directory called F.

=item Sectioning

A diag is made up of sections. Each section may contain text, data, bss, or other segments. The defining characteristic of a section is that each segment is contiguous in the virtual address space (i.e., the text segment is contiguous and so is the data segment, but they need not be contiguous with each other). A section begins with a C
directive at the beginning of a line. The C
line defines the section's name and optionally some parameters. Any data or code appearing after a C
directive belongs to that section, until the next C
directive is encountered. Any code or data before the first C
directive is part of the first section. If a SECTION line appears for a section that has previously been defined, the meaning is to append to the existing section. It is illegal to have SECTION-line arguments for any but the first definition of a section. Note that it simply appends to the existing section, so be sure to begin your appended version with a .text or .data assembly directive, as appropriate.

Sections are linked at a specific, user-defined address. Linker scripts require that each section be in a separate file. The sectioning phase, therefore, extracts the C
directives from the assembly file and writes "pure" assembly files for each section. By "pure", we mean that these files have no B-specific directives and can therefore be assembled directly. The output of this phase is a series of files in the build directory, one for each section. Their names are FnumE.EsecnameE.s>. The sectioning phase also produces the file 'diag.midas', which contains all of the midas directives. Midas then parses this file, and leaves the others to the assembler.

=back

=head2 Assembly

Each section written by the sectioning phase above is assembled via the GNU assembler. The output is a .o file.

=head2 Link executable

All object files are linked, using the GNU loader. Each section is linked at the virtual address defined in its section header. The output of this phase is F.

=head2 PostProcessing

After the diag is linked, the following postprocessing is done:

=over 4

=item Generate F

Generation of F is done a section at a time, not by simply disassembling F. The reason is that F should represent the initial contents of physical memory, which may or may not be a simple dump of the text and data segments for each section. For most diags, this will simply be a hex dump of F (with the sections linked at the appropriate physical addresses, as defined by the section header). How exactly the MMU constructs F is controlled by the section header, which is described in detail in the next section. Generation of F is handled by B.

=item Generate F

To generate the symbol table, the F file is examined to find virtual addresses for each symbol. The MMU is then used to do the virtual-to-physical translation and write the F file. Generation of F is handled by B.

=item Generate F

The F file is generated by examining the diag source for comments containing C<$EV>. These are then cross-referenced with F to produce F.

=back

=head1 DIAG FORMAT

A diag consists of applications and sections within those applications. Diags may also contain TSBs.
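
The way sections are collected during the sectioning phase can be illustrated with a small Perl sketch. It is not the tool's own code: C<split_sections> is a hypothetical helper, the directive is assumed to look like "SECTION name [args]" at the start of a line (as in the ".MAIN" example in this document), and continuation lines ending in a backslash are ignored for brevity. It follows the two rules stated above: code before the first SECTION line joins the first section, and a repeated SECTION name appends to the existing section.

```perl
use strict;
use warnings;

# Group the lines of a diag by section name (hypothetical helper).
# Assumes "SECTION <name> [args]" at the start of a line; backslash
# continuation lines are not handled in this sketch.
sub split_sections {
  my (@lines) = @_;
  my (%body, @order, @prelude, $current);
  foreach my $line (@lines) {
    if ($line =~ /^SECTION\s+(\S+)/) {
      $current = $1;
      if (!exists $body{$current}) {
        push @order, $current;
        # Lines seen before the first SECTION line join the first section;
        # for later sections the prelude is already empty.
        $body{$current} = [ splice(@prelude) ];
      }
      next;   # the directive itself is not part of the "pure" assembly
    }
    if (defined $current) { push @{ $body{$current} }, $line }
    else                  { push @prelude, $line }
  }
  return (\@order, \%body);
}

my @diag = (
  "! shared prelude\n",
  "SECTION .MAIN TEXT_VA=0x20000000\n",
  ".text\n",
  "SECTION .TRAPS TEXT_VA=0x0\n",
  "nop\n",
  "SECTION .MAIN\n",      # second appearance: appends to .MAIN
  ".data\n",
);
my ($order, $body) = split_sections(@diag);
print "$_: ", scalar @{ $body->{$_} }, " line(s)\n" for @$order;
```

Keeping the section order in a separate array preserves the order of first appearance, which a plain hash would not.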

=head2 APPLICATION

An application begins with:

  APPLICATION [FILE=]

An application defines a single linked executable. All SECTIONs that follow are linked into this application. A linked executable is just an intermediate file in F generation, so an APPLICATION directive affects only relocation and the scope of labels.

All diags have at least one application, even if none is defined, since all diags are treated as if their first line were:

  APPLICATION default

If the optional filename is given, then that file is taken as the linked executable to use. The link path is searched to find a file by this name. This is how you can include a linked executable that was not generated by B.

=head3 goldfinger_cmd blocks

There is currently no way to specify address translations and F contents for applications that midas does not generate. As the tool matures, I plan to invent an interface. In the meantime, you can include a goldfinger_cmd block. Such a block begins a line with "goldfinger_cmd" and an open curly-brace. It ends with the closing curly-brace. The contents of the block are not interpreted by B at all. They are simply copied into F inside the currently open application. An example:

  goldfinger_cmd {
    BLOCK .main_text_0
      SECTION_NAME = ".MAIN";
      SEGMENT_NAME = "text";
      LINK_SECTION = "sec7.maint";
      SRC_FILE = "diag.m4";
      SRC_LINE = 5398;
      COMPRESS = 0;
      VA = 0x0000000020000000;
      RA = 0x0130000000;
      PA = 0x1130000000;
      IN_IMAGE = 1;
      BLOCK_TSB part_0_i_ctx_nonzero_ps0_tsb
        page_size = 8192;
        va_index_bits = 21 : 13;
        tag_addr_bits = 63 : 13;
        data_addr_bits = 39 : 13;
        tag_base = 0x0000000000000044;
        data_base = 0x8000000000000440;
      END BLOCK_TSB
    END BLOCK
  }

Note that until the tool matures, the B-B interface may change, so this syntax is deprecated, but it can be useful in a pinch.

=head2 SECTION Definitions

A SECTION defines a region of the diag that may contain up to 3 segments: text, data, and bss.
Each of these segments is contiguous in the virtual address space (but not necessarily in the real or physical address spaces). Note that the B terminology is different from the B terminology. Each segment in B terminology corresponds to an B section.

  SECTION [section_args]

When a section directive is encountered, all assembly code (and data) that follows is placed in that section, until the next SECTION directive is encountered. The C
header affects the text and data segments that follow it, until another C
header is reached. As a special case, all code and data in the assembly file before the first C
header belongs to the first section.

A SECTION line may be split across multiple lines of input by escaping the newline with a \.

The section_args should define the virtual addresses at which to link the various segments. This is done by a comma-separated list such as:

  SECTION .MAIN TEXT_VA=0x20000000, DATA_VA=0x60000000, \
          BSS_VA=0x68030000

Any of the virtual addresses may be omitted, but if they are, that segment will not be included in the link. The *_VA symbols are all case-insensitive. The addresses themselves are assumed to be 64-bit decimal numbers, unless they start with 0x (in which case they are 64-bit hex numbers).

See the section on L<"ADDRESS TRANSLATIONS"> for details on how address translations can be specified for the segments in a section. Note that unless address translations are specified, there is no physical address to place segments in the F file!

=head2 TSB OBJECT DEFINITIONS

A TSB object is declared with the following syntax:

  MIDAS_TSB [args]

This defines a TSB with the specified name, which is initialized by the config register <register>. It will be instantiated in the memory image if any attr block tries to use it.

All MMU types get a base address and TSB size from the config register as defined in their PRM. Niagara-2, in addition, parses a page size (same meaning as the page_size optional argument below) and a sun4u/sun4v setting, which will be used instead of the global default for ttefmt. Note that if you provide the optional arguments page_size and/or ttefmt for Niagara-2, the optional arguments will override the config register.

The optional args can be:

=over 4

=item link=<name>

Use the specified name as a link area. This is used in the case of TSB collisions to hold a linked list. There must be a MIDAS_TSB_LINK declaration by this name.

=item force_ctx_zero

If this is specified, then any entries added to this TSB will have context bits of zero, regardless of how they are specified in the attr blocks.

=item page_size=<codedSize>

This defines a default page size for all entries that are added to the TSB. This will be used if no TTE_Size_Ptr values are given for the entries. The coded size is the same encoding used in the TTE_Size field.

=item way=<way>

If a TSB is split (which only applies to the "ultra2" and "niagara" MMU types), this is specified to midas by creating two TSBs that have the same value of the config register with the split bit set. Midas treats each half of the TSB separately. This makes it easy for diags to control which half of the split TSB gets each translation. The way definition on the TSB line tells midas which half of the split TSB applies to this definition. The only legal settings are "way=0" and "way=1". If way is set to zero, the TSB is configured just as if it were not split. If way is set to one, then the base address is modified internally so that it starts after the way=0 TSB would end. The way setting is ignored if the TSB is not split or if the MMU type does not support split TSBs.

It is the responsibility of the diag writer to make sure that the two halves of a split TSB are configured in a compatible fashion (both sides having the split bit on and the same base address).

=item ttefmt=<format>

Sets the format for this TSB to the specified format (either sun4u or sun4v). This setting will be used instead of the default value of -ttefmt.

=back

=head2 TSB_LINK OBJECT DEFINITIONS

A TSB_LINK object is an area used to store linked lists in the case of collisions in the TSB. Multiple TSBs can share a TSB_LINK. The syntax is:

  MIDAS_TSB_LINK

This declares a TSB_LINK object that will start at the specified PA. It will be instantiated in the memory image if any TSB that uses it is instantiated.

=head2 ADDRESS TRANSLATIONS

Address translations are created by attr_ blocks. The name of the block defines the segment on which the block operates. The syntax is:

  attr_ { name|section=, =, =, = ... }

The <segment_name> may be "text", "data", or "bss".
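
As a rough illustration of how the body of such a block could be read, here is a minimal Perl sketch. It is not taken from the tool: C<parse_attr_body> is a hypothetical helper, and it assumes only what this section describes, namely key=value pairs separated by commas and/or newlines, case-insensitive key names, and a value of 1 for a key that appears without "=value". It does not try to tell TSB names apart from other valueless keys.

```perl
use strict;
use warnings;

# Parse the text between the braces of an attr block into a hash
# (hypothetical helper). Keys are lower-cased because key names are
# case-insensitive; a key with no "=value" is stored with the value 1.
sub parse_attr_body {
  my ($text) = @_;
  my %attr;
  foreach my $item (split /[,\n]+/, $text) {
    $item =~ s/^\s+|\s+$//g;        # trim surrounding whitespace
    next if $item eq '';
    my ($key, $value) = split /\s*=\s*/, $item, 2;
    $attr{lc $key} = defined $value ? $value : 1;
  }
  return \%attr;
}

my $attr = parse_attr_body(<<'END');
  Name = .TRAPS,
  RA = 0x120000,
  PA = 0x1000120000,
  part_0_i_ctx_zero_ps0_tsb,
  TTE_V=1, TTE_Size=0
END
printf "%-28s %s\n", $_, $attr->{$_} for sort keys %$attr;
```

Values are kept as strings here, so an address such as 0x120000 would still need converting before any arithmetic is done on it.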

Each attr block must specify which SECTION it belongs to. It does this by setting name= or section= inside the block. This means the attr block itself may appear anywhere in the diag, not necessarily lexically inside the section. The blocks are matched to the sections by name, which is case-insensitive.

The contents of the block are a list of key=value pairs (name|section= just being a special case). These pairs can be separated by commas and/or newlines. Key names are case-insensitive. A TSB name may appear as a key with no value. If any other key appears with no =value, the value is assumed to be 1.

An example of an attr block is:

  attr_text {
    Name = .TRAPS,
    RA = 0x120000,
    PA = 0x1000120000,
    part_0_i_ctx_zero_ps0_tsb,
    TTE_Context=0, TTE_V=1, TTE_Size=0, TTE_NFO=0, TTE_IE=0,
    TTE_Soft2=0, TTE_Diag=0, TTE_Soft=0, TTE_L=0, TTE_CP=1,
    TTE_CV=0, TTE_E=0, TTE_P=1, TTE_W=1
  }

An attr block has two purposes: setting up TSB mappings and writing to F. It therefore needs to contain enough information to:

=over 4

=item Select a subset of the segment

An attr block need not define the same translation for an entire segment, and it may define a subset of the segment on which to operate.

=item Physical address

This defines where to write the segment (or segment subset) in the F.

=item Define TSB parameters

These include a list of TSBs that should contain translations for this section, an RA (real address) that should be included in the TSB, and TTE elements. The exact details may be processor-specific and are controlled by the mmu type. Note that for MMUs that do not have two-level address translation (i.e., "ultra2"), there is no RA, so PA is used for TSBs instead.

=back

=head3 Selecting a subset

Selecting a subset consists of defining a starting and stopping virtual address for the block.

=head4 Defining the starting virtual address

=over 4

=item start_label

If the key C exists it must be a label inside the segment. It is used as the beginning of the attr block.
It must be a page-aligned address unless the block is not being entered into a TSB.

=item VA

The attr block may explicitly define a starting virtual address using the tag C. It is an error if this virtual address is not a page-aligned address within the segment (if the block is not writing a TSB entry the alignment constraint is relaxed). For this reason, the start_label syntax is the preferred one for most diags.

=item I

If neither VA nor start_label is specified for an attr block, the starting VA for the segment is used.

=back

=head4 Defining the ending virtual address

There are three ways to define the ending address for an attr block.

=over 4

=item end_label

If C is defined, it must be a label inside the segment (and of course, it must appear after the starting VA). The C definition ends at the address of the label. The address need not be page-aligned.

=item end_va

The attr block may explicitly define an ending virtual address. It is an error if this address is not part of this segment. If the special attribute C is used (see below), then end_va may be specified past the end of the segment. The C attribute implies C (i.e., data is written to the TSB but not to the memory image).

=item I

If neither end_label nor end_va is specified, then the attr block lasts until the end of the section.

=back

=head3 Physical address

An attr block must have one of the following keys to define the physical address. The physical address is used to write F.

=over 4

=item PA

The physical address is specified with the tag "PA". It should be set to an address, and the subset of the segment will be written to the F file at that physical address. It is an error to write to the same physical address twice in the same diag.

=item tsbonly

This special key tells the attr block not to write anything to F. It can be used if you want to create TSB entries but do not want to overwrite something in F. If the key "PA" is included, it is used only for symbol table generation.

=item uninitialized

This is exactly the same as tsbonly, except that normally the "end_va" key is checked to make sure that it is contained in the segment. Using uninitialized instead of tsbonly suspends that check.

=item hypervisor (or bypass on "ultra2" MMU)

The special tag "hypervisor" tells the attr block to bypass both VA to RA translation and RA to PA translation. The segment will be completely unmapped. Therefore, it will not generate any TSB mappings, and it will write to F at PA=VA (actually, it generates a PA from as many low bits of the VA as will fit in a PA). It is used for segments where the MMU is off.

For mmus that have only one level of address translation (i.e., "ultra2"), the key "hypervisor" does not exist, but "bypass" has the same meaning.

=item compressimage

This does not control the address, but it does affect how F creation is done. If compressimage is given in an attr block then lines of zeros are suppressed in F generation. Each aligned 32-byte chunk is compared against 0. If all bits are 0, then it is not written to mem.image. If the global -env_zero switch is enabled (on by default in Niagara-1), then a backdoor mechanism is used to initialize the memory to zero in the environment. Otherwise, it is left uninitialized. If the environment does not initialize all memory to zero, then this can actually change the meaning of mem.image, since it makes zeroed memory uninitialized, rather than initialized to zero.

If the flag -nocompress_image is given to B, then no blocks are compressed, regardless of compressimage tags.

=back

=head3 TSB parameters

The following parameters should be set in each attr block (unless it contains the "hypervisor" key described above, or the "bypass" key for "ultra2"):

=over 4

=item RA

This defines the real address (middle of the 3-address scheme). It is the address to be written to the TSB data. It must be page-aligned.
In the "ultra2" MMU, there is no RA, so the PA does double duty: PA is used both for mem.image generation and for TSB data.

=item bypass

This directive means to bypass VA to RA translation. In an MMU with two levels of address translation (like niagara), it simply sets RA=VA (actually, as many low bits of VA as will fit). It is an error to specify both RA and bypass. In an MMU with one level of translation ("ultra2"), it means to bypass all address translation, so its function is similar to the hypervisor directive described above.

=item <tsb_name>

Any tsb names that are listed (and there may be more than one) will cause the attr block to add the subset to those TSBs. The tsb_names must be defined somewhere in the diag with a MIDAS_TSB directive.

=item notsb

Tells the attr block not to do any TSB generation. If RA is provided, it is simply used in the symbol table.

=back

Unless notsb is defined or the section is completely unmapped (bypass for "ultra2" or hypervisor for other MMUs), the attr block will be writing TSB entries. The following parameters are used to set the appropriate bits of the TSB entry. How exactly the TSB entries are formed is mmu-specific. Check the PRM for your processor. The default value for TTE_V (valid) is 1. The default value for all other fields is 0.

=head3 MMU-Specific TTE Settings

The fields in the TSB tag and data depend on the MMU type and the currently configured ttefmt setting (sun4u or sun4v). The ttefmt can be controlled by the TSBs (which is always the case in Niagara-2) or by the global default set by -ttefmt. The MMU-specific settings are described below. The default for each TTE setting is 0, except for TTE_V (valid), which defaults to 1. All TTE tags are case-insensitive.

=head4 Ultrasparc II MMU "ultra2"

The ultra2 MMU type supports only the sun4u TTE data format.

=over 4

=item TTE_G Tag: 1 bit Global

=item TTE_Context Tag:13 bits Context

%%TTE DATA ultra2 sun4u

=back

=head4 Niagara MMU "niagara"

The niagara MMU supports both the sun4u and sun4v TTE data formats. For sun4u, the following fields are valid:

=over 4

=item TTE_Context Tag:13 bits Context

%%TTE DATA niagara sun4u

=back

The following fields are valid for sun4v:

=over 4

=item TTE_Context Tag:13 bits Context

%%TTE DATA niagara sun4v

=back

=head4 Niagara2 MMU "niagara2"

The niagara2 MMU supports both the sun4u and sun4v TTE data formats. For sun4u, the following fields are valid:

=over 4

=item TTE_Context Tag:13 bits Context

%%TTE DATA niagara2 sun4u

=back

The following fields are valid for sun4v:

=over 4

=item TTE_Context Tag:13 bits Context

%%TTE DATA niagara2 sun4v

=back

# =head4 X MMU "X"

# The X MMU supports only the sun4v data format.

# =over 4

# =item TTE_Context Tag:13 bits Context

# %%TTE DATA X sun4v

# =back

There are a few special cases to be aware of:

=over 4

=item TTE_Size

This is the size bits field to use in the TSB entry. It is also used to calculate the number of pages the attr block will create. The attr block will create as many pages as it needs to span the section. The TTE_Size field controls the size of these pages. See the PRM for the exact coding of page size in TTE_Size. For Niagara and Niagara-2, the encoding is:

=over 4

=item 0 -> 8 kB

=item 1 -> 64 kB

=item 2 -> 512 kB (illegal on Niagara and Niagara-2)

=item 3 -> 4 MB

=item 5 -> 256 MB

=item 6 -> 2 GB (illegal on Niagara and Niagara-2)

=item 7 -> 16 GB (illegal on Niagara and Niagara-2)

=back

The "ultra2" MMU only has a 2-bit size field, so it supports page sizes 0-3 (which are also 8 kB - 4 MB).

Note that when the above sections state that VA, RA, and PA must be page-aligned when adding them to a TSB, this is where the page size comes from.

=item TTE_Size_Ptr

The page size is used in the formula to calculate a TSB pointer.
The page size used for pointer calculation is controlled by a hardware register, but B needs to set up the TSBs statically. By default, the attr block will use TTE_Size when it computes the TSB index (or the TSB page_size parameter/Niagara-2 TSB config, if one is defined), as well as for the uses above. If you set TTE_Size_Ptr, however, it will use this as the page size setting when computing the TSB index. Use this whenever you wish to have a different setting for TTE_Size than the hardware will have in its config register.

=back

=head2 High Level Languages

The output of compilers of high-level languages may be inserted into midas.

=head3 Object files

The midas directive:

  MIDAS_OBJ FILE=

may be placed inside any section. That object file will be linked with the assembly output for the section and will share its attr blocks. No special interpretation is done on the contents of the object file - the text, data, and bss segments are simply linked in with the output of the assembler for that section.

The search path for .o files is controlled by the -L switch. The default path is the starting directory, and <diag_root>/verif/diag.

=head3 Library files

The midas directive:

  MIDAS_LIB FILE=

works just like a MIDAS_OBJ directive, except that it includes a library in the link. Note that the library must be static (i.e., a .a file, and not a .so file), because there is no runtime linker in the diag environment. Other than the file format, the difference between a library file and an object file is that an object file will include all text/data/bss from the file, but linking with a library file will cause only those symbols that are actually used to be included.

Library files will also search the same link path as object files.

=head3 C files

Similar to object files, C language files may be included with the directive:

  MIDAS_CC FILE= [OUTPUT=] [ARGS=-O2]

The OUTPUT and ARGS tags are optional, but the FILE tag is mandatory.
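
The tag rules for this directive can be sketched in Perl as follows. This is an illustration only, not the tool's parser: C<parse_midas_cc> is a hypothetical helper, tag names are assumed to be written in upper case as in the syntax line above, and ARGS is taken to run to the end of the line, as this section describes.

```perl
use strict;
use warnings;

# Split a MIDAS_CC line into its tags (hypothetical helper).
# Returns nothing if the line is not a MIDAS_CC directive.
sub parse_midas_cc {
  my ($line) = @_;
  return unless $line =~ /^\s*MIDAS_CC\s+(.*)$/;
  my $rest = $1;
  my %tag;
  # ARGS lasts until the end of the line, so peel it off first.
  if ($rest =~ s/\bARGS=(.*)$//) {
    $tag{ARGS} = $1;
  }
  # What remains is a list of whitespace-separated TAG=value pairs.
  while ($rest =~ /(\w+)=(\S+)/g) {
    $tag{$1} = $2;
  }
  die "MIDAS_CC requires a FILE tag\n" unless defined $tag{FILE};
  return \%tag;
}

my $cc = parse_midas_cc("MIDAS_CC FILE=hello.c OUTPUT=hello.o ARGS=-O2 -S");
print "$_ => $cc->{$_}\n" for sort keys %$cc;
```

Removing ARGS before scanning the other tags keeps a compiler option that happens to contain "=" from being mistaken for a tag.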

The search path for the C source file is controlled by the -C switch, and the default is the starting directory, then <diag_root>/verif/diag/c.

The ARGS tag (which must appear last, since the args last until the end of the line) supplies arguments to gcc. You must not use the -o or -c switches, since midas will provide its own. If "-S" is supplied to gcc through the ARGS tag, then gcc -S will be used to create an assembly file in the Sectioning phase, which will then be assembled like all the other assembly files in the Assembly phase. If -S is not present, then nothing is done in the Sectioning phase (except for finding the C file and copying it to the build directory). Rather, gcc is used to compile the C file directly to an object file during the assembly phase. The object file, generated either by gcc or by assembling the output of gcc -S, is then linked in with the rest of the section. Once the object file is generated, it is treated just as a MIDAS_OBJ directive.

=head4 Stack

Compiled C code expects the system to set up a stack for it before it runs. Template files are provided for this purpose. Be sure to use them or have some solution for setting up the stack if you are working with compiled code.

=head1 AUTHOR

=head1 SEE ALSO

B(1), tre_perldoc Midas, B(1), B(1).