TAR(5) 1985 TAR(5)
NAME
tar - tape archive file format
DESCRIPTION
Tar (the tape archive command) dumps several files into
one, in a medium suitable for transportation.
A ``tar tape'' or file is a series of blocks. Each block is
of size TBLOCK. A file on the tape is represented by a
header block which describes the file, followed by zero or
more blocks which give the contents of the file. At the end
of the tape are two blocks filled with binary zeros, as an
end-of-file indicator.
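The block layout above can be sketched in C. This is a minimal
illustration, not code from tar itself: the function names
data_blocks and is_zero_block are hypothetical, and the sketch assumes
that the last data block of a file is padded out to a full TBLOCK.

```c
#include <stddef.h>

#define TBLOCK 512

/* Number of TBLOCK-sized data blocks that follow the header block of a
 * file of `size` bytes.  A zero-length file has no data blocks.
 * Assumption: a partial final block still occupies a whole block. */
unsigned long data_blocks(unsigned long size)
{
    return (size + TBLOCK - 1) / TBLOCK;
}

/* Return 1 if the block is filled with binary zeros, as the two blocks
 * at the end of the tape are; return 0 otherwise. */
int is_zero_block(const char block[TBLOCK])
{
    size_t i;

    for (i = 0; i < TBLOCK; i++)
        if (block[i] != 0)
            return 0;
    return 1;
}
```

A reader built on this sketch would treat two consecutive blocks for
which is_zero_block returns 1 as the end-of-file indicator.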
The blocks are grouped for physical I/O operations. Each
group of n blocks (where n is set by the b keyletter on the
tar(1) command line; the default is 20 blocks) is written with
a single system call; on nine-track tapes, the result of
this write is a single tape record. The last group is
always written at the full size, so blocks after the two
zero blocks contain random data. On reading, the specified
or default group size is used for the first read, but if
that read returns less than a full tape block, the reduced
block size is used for further reads.
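The arithmetic behind the grouping can be sketched as follows. The
function names record_bytes and blocks_written are hypothetical and
exist only for this illustration; the sketch assumes that `used'
counts every header block, data block, and the two zero blocks.

```c
#define TBLOCK 512

/* Bytes transferred by one write call (one tape record on nine-track
 * tape) when blocks are grouped `nblock` at a time. */
unsigned long record_bytes(unsigned long nblock)
{
    return nblock * TBLOCK;
}

/* Total number of blocks that reach the tape when `used` blocks are
 * meaningful.  The last group is written at full size, so the count is
 * rounded up to a multiple of `nblock`; the extra blocks hold
 * unspecified data. */
unsigned long blocks_written(unsigned long used, unsigned long nblock)
{
    unsigned long groups = (used + nblock - 1) / nblock;

    return groups * nblock;
}
```

With the default of 20 blocks, record_bytes(20) is 10240, and an
archive that needs only 4 blocks still occupies 20.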
The header block looks like:
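#define TBLOCK  512     /* size of every block, in bytes */
#define NAMSIZ  100     /* size of the name and linkname fields */
union hblock {
        char dummy[TBLOCK];             /* pads the header to a full block */
        struct header {
                char name[NAMSIZ];      /* file name, null-terminated */
                char mode[8];           /* file mode */
                char uid[8];            /* user number of the owner */
                char gid[8];            /* group number of the owner */
                char size[12];          /* file size in bytes */
                char mtime[12];         /* modification time */
                char chksum[8];         /* sum of the header bytes */
                char linkflag;          /* null, `1' (link), `2' (symlink) */
                char linkname[NAMSIZ];  /* name linked to, if any */
        } dbuf;
};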
Name is a null-terminated string. The other fields are
zero-filled octal numbers in ASCII. Each field (of width w)
contains w-2 digits, a space, and a null, except size and
mtime, which do not contain the trailing null, and chksum,
which has a null followed by a space. Name is the name of
the file, as specified on the tar command line. Files
dumped because they were in a directory which was named in
the command line have the directory name as prefix and
/filename as suffix. Mode is the file mode, with the top
bit masked off. Uid and gid are the user and group numbers
which own the file. Size is the size of the file in bytes.
Links and symbolic links are dumped with this field specified
as zero. Mtime is the modification time of the file at
the time it was dumped. Chksum is an octal ASCII value
which represents the sum of all the bytes in the header
block. When calculating the checksum, the chksum field is
treated as if it were all blanks. Linkflag is a null byte if the
file is ``normal'' or a special file, ASCII `1' if it is a
hard link, and ASCII `2' if it is a symbolic link. The name
linked-to, if any, is in linkname, with a trailing null.
Unused fields of the header are binary zeros (and are
included in the checksum).
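The checksum rule and the octal field encoding can be sketched as
below. This is an illustration under stated assumptions, not the code
used by tar: the function names header_checksum, store_checksum, and
parse_octal are hypothetical, and header bytes are summed as unsigned
values, which this page does not specify.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TBLOCK 512
#define NAMSIZ 100

union hblock {
    char dummy[TBLOCK];
    struct header {
        char name[NAMSIZ];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char chksum[8];
        char linkflag;
        char linkname[NAMSIZ];
    } dbuf;
};

/* Sum every byte of the header block, counting the chksum field as if
 * it were all blanks.  Assumption: bytes are summed as unsigned. */
unsigned int header_checksum(const union hblock *hb)
{
    size_t lo = offsetof(struct header, chksum);
    size_t hi = lo + sizeof hb->dbuf.chksum;
    unsigned int sum = 0;
    size_t i;

    for (i = 0; i < TBLOCK; i++) {
        if (i >= lo && i < hi)
            sum += ' ';
        else
            sum += (unsigned char)hb->dummy[i];
    }
    return sum;
}

/* Store the checksum as six zero-filled octal digits, a null, and then
 * a space, the layout described for chksum. */
void store_checksum(union hblock *hb)
{
    unsigned int sum = header_checksum(hb);

    /* 7 bytes: six digits plus the terminating null. */
    snprintf(hb->dbuf.chksum, 7, "%06o", sum);
    hb->dbuf.chksum[7] = ' ';
}

/* Decode an octal ASCII field of `width` bytes: skip leading blanks,
 * then stop at the first byte that is not an octal digit. */
unsigned long parse_octal(const char *field, size_t width)
{
    unsigned long value = 0;
    size_t i = 0;

    while (i < width && field[i] == ' ')
        i++;
    for (; i < width && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned long)(field[i] - '0');
    return value;
}
```

Because the chksum field is always counted as blanks, header_checksum
returns the same value before and after store_checksum is called, so a
reader can verify a header by comparing header_checksum against
parse_octal of the stored field.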
The first time a given i-node number is dumped, it is dumped
as a regular file. The second and subsequent times, it is
dumped as a link instead. Upon retrieval, if a link entry
is retrieved, but not the file it was linked to, an error
message is printed and the tape must be manually re-scanned
to retrieve the linked-to file.
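The first-time/subsequent-time rule can be sketched with a small table
of i-node numbers already dumped. Everything here is hypothetical and
for illustration only: the names first_name, struct seen, and
MAXLINKS, and the fixed table size. The sketch keys on the i-node
number alone, as the text does; a real archiver would likely need to
distinguish file systems as well.

```c
#include <string.h>

#define NAMSIZ   100
#define MAXLINKS 64     /* hypothetical table size for this sketch */

struct seen {
    unsigned long ino;      /* i-node number already dumped */
    char name[NAMSIZ];      /* name under which it was dumped */
};

static struct seen table[MAXLINKS];
static int nseen;

/* If i-node `ino` was dumped before, return the name it was dumped
 * under, so the caller can write a link entry with that linkname.
 * Otherwise remember `name` (if it fits) and return NULL, so the
 * caller dumps the file as a regular file. */
const char *first_name(unsigned long ino, const char *name)
{
    int i;

    for (i = 0; i < nseen; i++)
        if (table[i].ino == ino)
            return table[i].name;

    if (nseen < MAXLINKS && strlen(name) < NAMSIZ) {
        table[nseen].ino = ino;
        strcpy(table[nseen].name, name);
        nseen++;
    }
    return NULL;
}
```

On extraction the dependency runs the other way: a link entry is only
useful if the entry named in its linkname has also been retrieved.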
The encoding of the header is designed to be portable across
machines.
SEE ALSO
tar(1)
BUGS
Names or linknames longer than NAMSIZ produce error reports
and cannot be dumped.
Printed 7/27/90 November 2