COMPRESS(1) UNIX Programmer's Manual COMPRESS(1)

NAME
compress, uncompress, zcat - compress and expand data

SYNOPSIS
compress [ -f ] [ -v ] [ -c ] [ -b bits ] [ name ... ]
uncompress [ -f ] [ -v ] [ -c ] [ name ... ]
zcat [ name ... ]

DESCRIPTION
Compress reduces the size of the named files using adaptive
Lempel-Ziv coding. Whenever possible, each file is replaced
by one with the extension .Z, while keeping the same ownership,
modes, and access and modification times. If no files are
specified, the standard input is compressed to the standard
output. Compressed files can be restored to their original
form using uncompress or zcat.

The -f option forces compression of name, even if it
does not actually shrink or the corresponding name.Z file
already exists. If -f is not given, the user is prompted as to
whether an existing name.Z file should be overwritten, except
when compress is run in the background under /bin/sh.

The -c (``cat'') option makes compress/uncompress write to
the standard output; no files are changed. The nondestructive
behavior of zcat is identical to that of uncompress -c.

Compress uses the modified Lempel-Ziv algorithm popularized
in "A Technique for High Performance Data Compression",
Terry A. Welch, IEEE Computer, vol. 17, no. 6 (June 1984),
pp. 8-19. Common substrings in the file are first replaced
by 9-bit codes 257 and up. When code 512 is reached, the
algorithm switches to 10-bit codes and continues to use more
bits until the limit specified by the -b flag is reached
(default 16). Bits must be between 9 and 16. The default
can be changed in the source to allow compress to be run on
a smaller machine.
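
The code-width rule described above can be sketched in Python. This is a simplified illustration, not the compress source: the name lzw_codes is hypothetical, code 256 is assumed to be reserved, the result is a list of (code, width) pairs rather than a packed .Z file, and the exact step at which the real program widens its codes may differ slightly.

```python
def lzw_codes(data: bytes, maxbits: int = 16):
    """Return (code, width_in_bits) pairs for a simplified LZW pass."""
    if not 9 <= maxbits <= 16:
        raise ValueError("bits must be between 9 and 16")
    # Codes 0-255 stand for the single bytes themselves.
    table = {bytes([i]): i for i in range(256)}
    next_code = 257        # assumption: 256 is reserved, so strings start at 257
    width = 9              # codes start out 9 bits wide
    limit = 1 << maxbits   # no new codes are assigned at or beyond this value
    out = []
    prefix = b""
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in table:
            prefix = candidate          # keep extending the current match
            continue
        out.append((table[prefix], width))
        if next_code < limit:
            table[candidate] = next_code
            next_code += 1
            # Widen once the next code no longer fits in the current width,
            # e.g. move to 10 bits when code 512 is reached.
            if next_code >= (1 << width) and width < maxbits:
                width += 1
        prefix = bytes([byte])
    if prefix:
        out.append((table[prefix], width))
    return out
```

With a limit of 9 bits the table simply stops growing at code 511; with a larger limit, later codes are emitted at 10 bits and up.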

After the bits limit is attained, compress periodically
checks the compression ratio. If it is increasing, compress
continues to use the existing code dictionary. However, if
the compression ratio decreases, compress discards the table
of substrings and rebuilds it from scratch. This allows the
algorithm to adapt to the next "block" of the file.
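
The periodic check can be pictured with a small Python sketch. The function name check_ratio and its arguments are hypothetical, and treating an unchanged ratio the same as a decreasing one is an assumption; the manual only describes the increasing and decreasing cases.

```python
def check_ratio(bytes_in: int, bytes_out: int, best_ratio: float):
    """Return (reset_table, new_best_ratio) for one periodic check.

    bytes_in and bytes_out are the running input and output sizes;
    best_ratio is the ratio remembered from the previous check.
    """
    ratio = bytes_in / bytes_out if bytes_out else float("inf")
    if ratio > best_ratio:
        # Still improving: keep the existing code dictionary.
        return False, ratio
    # Not improving: discard the table and start measuring afresh.
    return True, 0.0
```

A caller would rebuild the substring table whenever the first element of the result is true.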

Note that the -b flag is omitted for uncompress, since the
bits parameter specified during compression is encoded
within the output, along with a magic number to ensure that
neither decompression of random data nor recompression of
compressed data is attempted.
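
As an illustration, the header check might look like the Python sketch below. The byte values are not given in this manual page; they are taken as an assumption from the widely distributed compress source (magic bytes 0x1f 0x9d, then a byte whose low five bits hold the bits value and whose high bit marks block mode). The name read_header is hypothetical.

```python
MAGIC = b"\x1f\x9d"   # assumed magic number at the start of a .Z file

def read_header(blob: bytes) -> dict:
    """Check the magic number and extract the bits value from the header."""
    if len(blob) < 3 or blob[:2] != MAGIC:
        raise ValueError("not in compressed format")
    flags = blob[2]
    return {
        "maxbits": flags & 0x1F,            # bits limit used when compressing
        "block_mode": bool(flags & 0x80),   # assumed flag: table resets allowed
    }
```

A decompressor built this way can refuse the file if maxbits exceeds what it can handle, which corresponds to the "compressed with xx bits" diagnostic.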
Printed 7/9/88 May 11, 1986 1
The amount of compression obtained depends on the size of
the input, the number of bits per code, and the distribution
of common substrings. Typically, text such as source code
or English is reduced by 50-60%. Compression is generally
much better than that achieved by Huffman coding (as used in
pack), or adaptive Huffman coding (compact), and takes less
time to compute.

The -v option prints the percentage reduction of each file.
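
The reported figure corresponds to the share of the input saved. A minimal Python sketch, assuming the straightforward formula (the name percent_reduction is hypothetical, and the rounding done by the real program may differ):

```python
def percent_reduction(original_size: int, compressed_size: int) -> float:
    """Percentage of the input saved by compression."""
    if original_size <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size

def format_report(original_size: int, compressed_size: int) -> str:
    # Mirrors the shape of the "Compression: xx.xx%" diagnostic.
    return f"Compression: {percent_reduction(original_size, compressed_size):.2f}%"
```

A negative result means the output is larger than the input, the case in which the file is left unchanged unless -f is given.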

If an error occurs, the exit status is 1; otherwise, if the last
file was not compressed because it would have become larger, the
status is 2; otherwise the status is 0.
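
A calling script can branch on these three values. The sketch below uses Python's subprocess module; the helper names run_compress and describe_status are hypothetical, and a compress binary is assumed to be on the search path.

```python
import subprocess

# Meanings taken from the exit status paragraph above.
STATUS_MEANING = {
    0: "success",
    1: "error",
    2: "last file not compressed (it would have grown)",
}

def describe_status(status: int) -> str:
    """Translate a compress exit status into a short description."""
    return STATUS_MEANING.get(status, "unexpected status")

def run_compress(path: str) -> str:
    """Run compress on one file and describe the outcome."""
    result = subprocess.run(["compress", path])
    return describe_status(result.returncode)
```

Status 2 is not a failure in the usual sense, so a script may want to treat it separately from status 1.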

DIAGNOSTICS
Usage: compress [-fvc] [-b maxbits] [file ...]
     Invalid options were specified on the command line.
Missing maxbits
     Maxbits must follow -b.
file: not in compressed format
     The file specified to uncompress has not been
     compressed.
file: compressed with xx bits, can only handle yy bits
     File was compressed by a program that could deal
     with more bits than the compress code on this
     machine. Recompress the file with smaller bits.
file: already has .Z suffix -- no change
     The file is assumed to be already compressed.
     Rename the file and try again.
file: filename too long to tack on .Z
     The file cannot be compressed because its name is
     longer than 12 characters. Rename and try again.
     This message does not occur on BSD systems.
file already exists; do you wish to overwrite (y or n)?
     Respond "y" if you want the output file to be
     replaced; "n" if not.
uncompress: corrupt input
     A SIGSEGV violation was detected, which usually means
     that the input file is corrupted.
Compression: xx.xx%
     Percentage of the input saved by compression.
     (Relevant only for -v.)
-- not a regular file: unchanged
     When the input file is not a regular file (e.g. a
     directory), it is left unaltered.
-- has xx other links: unchanged
     The input file has links; it is left unchanged. See
     ln(1) for more information.
-- file unchanged
     No savings are achieved by compression. The input
     remains unchanged.

BUGS
Although compressed files are compatible between machines
with large memory, -b12 should be used for file transfer to
architectures with a small process data space (64KB or less,
as exhibited by the DEC PDP series, the Intel 80286, etc.).

Compress should be more flexible about the existence of the
`.Z' suffix.