.\" Copyright (c) 1986 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" James A. Woods, derived from original work by Spencer Thomas
.\" and Joseph Orost.
.\"
.\" Redistribution and use in source and binary forms are permitted
.\" provided that the above copyright notice and this paragraph are
.\" duplicated in all such forms and that any documentation,
.\" advertising materials, and other materials related to such
.\" distribution and use acknowledge that the software was developed
.\" by the University of California, Berkeley. The name of the
.\" University may not be used to endorse or promote products derived
.\" from this software without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
.\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
.\" WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
.\"
.\" @(#)compress.1 6.6 (Berkeley) %G%
.\"
.TH COMPRESS 1 ""
.UC 6
.SH NAME
compress, uncompress, zcat \- compress and expand data
.SH SYNOPSIS
.PU
.ll +8
.B compress
[
.B \-f
] [
.B \-v
] [
.B \-c
] [
.B \-b
.I bits
] [
.I "name \&..."
]
.ll -8
.br
.B uncompress
[
.B \-f
] [
.B \-v
] [
.B \-c
] [
.I "name \&..."
]
.br
.B zcat
[
.I "name \&..."
]
.SH DESCRIPTION
.I Compress
reduces the size of the named files using adaptive Lempel-Ziv coding.
Whenever possible,
each file is replaced by one with the extension
.B "\&.Z,"
while keeping the same ownership, modes, and access and modification times.
If no files are specified, the standard input is compressed to the
standard output.
Compressed files can be restored to their original form using
.I uncompress
or
.I zcat.
.PP
The
.B \-f
option forces compression of
.IR name ,
even if it does not actually shrink
or the corresponding
.IR name .Z
file already exists.
Except when run in the background under
.IR /bin/sh ,
if
.B \-f
is not given, the user is prompted as to whether an existing
.IR name .Z
file should be overwritten.
.PP
The
.B \-c
(``cat'') option makes
.I compress/uncompress
write to the standard output; no files are changed.
The nondestructive behavior of
.I zcat
is identical to that of
.I uncompress
.B \-c.
.PP
.I Compress
uses the modified Lempel-Ziv algorithm popularized in
"A Technique for High Performance Data Compression",
Terry A. Welch,
.I "IEEE Computer,"
vol. 17, no. 6 (June 1984), pp. 8-19.
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the
limit specified by the
.B \-b
flag is reached (default 16).
.I Bits
must be between 9 and 16. The default can be changed in the source to allow
.I compress
to be run on a smaller machine.
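.PP
The code assignment described above can be sketched in Python.
This is an illustration under stated assumptions, not the
.I compress
source: the names
.I code_width
and
.I lzw_codes
are hypothetical, code 256 is simply left unused, and the codes are
returned as a list rather than packed into a bit stream.
.PP
.nf
```python
def code_width(code, maxbits=16):
    """Bits used to write a code, following the text: 9 bits up to
    code 511, one more bit each time a power of two is reached,
    and never more than maxbits."""
    if not 9 <= maxbits <= 16:
        raise ValueError("bits must be between 9 and 16")
    return min(maxbits, max(9, code.bit_length()))


def lzw_codes(data, maxbits=16):
    """Return the LZW codes for a bytes object (not bit-packed)."""
    if not 9 <= maxbits <= 16:
        raise ValueError("bits must be between 9 and 16")
    # Codes 0-255 stand for the single bytes themselves.
    table = {bytes([i]): i for i in range(256)}
    next_code = 257        # substrings get codes 257 and up
    limit = 1 << maxbits   # assumption: stop adding codes at the limit
    codes = []
    prefix = b""
    for value in data:
        candidate = prefix + bytes([value])
        if candidate in table:
            prefix = candidate           # keep extending the match
        else:
            codes.append(table[prefix])  # emit longest known substring
            if next_code < limit:
                table[candidate] = next_code
                next_code += 1
            prefix = bytes([value])
    if prefix:
        codes.append(table[prefix])
    return codes
```
.fi
.PP
With this sketch, the six bytes "ababab" produce the codes 97, 98, 257, 257:
the repeated substring "ab" is replaced by code 257 once it has been seen.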
.PP
After the
.I bits
limit is attained,
.I compress
periodically checks the compression ratio. If it is increasing,
.I compress
continues to use the existing code dictionary. However,
if the compression ratio decreases,
.I compress
discards the table of substrings and rebuilds it from scratch. This allows
the algorithm to adapt to the next "block" of the file.
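.PP
The periodic check can be pictured with the following Python sketch.
The function name
.I check_ratio
and its arguments are hypothetical, and the treatment of an unchanged
ratio (the table is kept) is an assumption, since the text above only
covers the increasing and decreasing cases.
.PP
.nf
```python
def check_ratio(bytes_in, bytes_out, previous_ratio):
    """Return (reset_table, ratio_to_remember) for one periodic check.

    bytes_in and bytes_out are the running input and output sizes.
    """
    if bytes_out <= 0:
        raise ValueError("output size must be positive")
    ratio = bytes_in / bytes_out
    if ratio < previous_ratio:
        # Compression got worse: discard the table and start over.
        return True, 0.0
    # Ratio is holding or improving: keep the existing dictionary.
    return False, ratio
```
.fi
.PP
A caller would invoke this at each checkpoint, clear its substring table
when the first value is true, and pass the second value back in at the
next checkpoint.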
.PP
Note that the
.B \-b
flag is omitted for
.I uncompress,
since the
.I bits
parameter specified during compression
is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
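.PP
As an illustration, the Python sketch below reads such a header.
It assumes the three-byte layout commonly documented for .Z files
(magic bytes 0x1f 0x9d, then a flag byte whose low five bits hold
.I bits
and whose high bit marks table-reset mode); this manual page does not
state those values, and the name
.I parse_header
is hypothetical.
.PP
.nf
```python
MAGIC = bytes([0x1F, 0x9D])  # assumed first two bytes of a .Z file
BITS_MASK = 0x1F             # assumed: low five bits give the code limit
BLOCK_MASK = 0x80            # assumed: high bit marks table-reset mode


def parse_header(header):
    """Return (maxbits, block_mode) from the first three bytes."""
    if len(header) < 3 or header[:2] != MAGIC:
        raise ValueError("not in compressed format")
    flags = header[2]
    return flags & BITS_MASK, bool(flags & BLOCK_MASK)
```
.fi
.PP
Checking the magic number first is what lets a decompressor refuse
random data, and lets a compressor recognize input that is already
compressed.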
.PP
.ne 8
The amount of compression obtained depends on the size of the
input, the number of
.I bits
per code, and the distribution of common substrings.
Typically, text such as source code or English
is reduced by 50\-60%.
Compression is generally much better than that achieved by
Huffman coding (as used in
.IR pack ),
or adaptive Huffman coding
.RI ( compact ),
and takes less time to compute.
.PP
The
.B \-v
option prints the percentage reduction of each file.
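.PP
The percentage reduction is the share of the input saved by compression.
A minimal Python sketch of the arithmetic follows; the function names are
hypothetical, and the output string imitates the "Compression: xx.xx%"
diagnostic rather than reproducing the program's exact formatting.
.PP
.nf
```python
def reduction_percent(bytes_in, bytes_out):
    """Percentage of the input saved: 100 * (in - out) / in."""
    if bytes_in <= 0:
        raise ValueError("input size must be positive")
    return 100.0 * (bytes_in - bytes_out) / bytes_in


def format_reduction(bytes_in, bytes_out):
    """Render the figure with two decimal places."""
    return "Compression: %.2f%%" % reduction_percent(bytes_in, bytes_out)
```
.fi
.PP
A 1000-byte input that compresses to 450 bytes gives "Compression: 55.00%",
which falls in the 50\-60% range quoted above for text.
A negative figure means the output was larger than the input.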
.PP
If an error occurs, the exit status is 1; otherwise,
if the last file was not compressed because it became larger, the status
is 2; otherwise the status is 0.
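.PP
The order of these tests matters: an error takes precedence over the
"became larger" case.
The Python sketch below restates the rule; the function name and its two
flags are hypothetical and only mirror the wording of this page.
.PP
.nf
```python
def exit_status(error_occurred, last_file_grew):
    """Exit status as described in the text: 1, then 2, then 0."""
    if error_occurred:
        return 1      # any error wins
    if last_file_grew:
        return 2      # last file left uncompressed: it became larger
    return 0          # normal completion
```
.fi
.PP
A calling script can therefore treat status 2 as a harmless outcome and
reserve failure handling for status 1.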
.SH "DIAGNOSTICS"
Usage: compress [\-fvc] [\-b maxbits] [file ...]
.in +8
Invalid options were specified on the command line.
.in -8
Missing maxbits
.in +8
Maxbits must follow
.BR \-b \.
.in -8
.IR file :
not in compressed format
.in +8
The file specified to
.I uncompress
has not been compressed.
.in -8
.IR file :
compressed with
.I xx
bits, can only handle
.I yy
bits
.in +8
.I File
was compressed by a program that could deal with
more
.I bits
than the compress code on this machine.
Recompress the file with smaller
.IR bits \.
.in -8
.IR file :
already has .Z suffix -- no change
.in +8
The file is assumed to be already compressed.
Rename the file and try again.
.in -8
.IR file :
filename too long to tack on .Z
.in +8
The file cannot be compressed because its name is longer than
12 characters.
Rename and try again.
This message does not occur on BSD systems.
.in -8
.I file
already exists; do you wish to overwrite (y or n)?
.in +8
Respond "y" if you want the output file to be replaced; "n" if not.
.in -8
uncompress: corrupt input
.in +8
A SIGSEGV violation was detected, which usually means that the input file is
corrupted.
.in -8
Compression:
.I "xx.xx%"
.in +8
Percentage of the input saved by compression.
(Relevant only for
.BR \-v \.)
.in -8
-- not a regular file: unchanged
.in +8
When the input file is not a regular file
(e.g. a directory), it is
left unaltered.
.in -8
-- has
.I xx
other links: unchanged
.in +8
The input file has links; it is left unchanged. See
.IR ln "(1)"
for more information.
.in -8
-- file unchanged
.in +8
No savings are achieved by
compression. The input file is left unchanged.
.in -8
.SH "BUGS"
Although compressed files are compatible between machines with large memory,
.BR \-b \12
should be used for file transfer to architectures with
a small process data space (64KB or less, as exhibited by the DEC PDP
series, the Intel 80286, etc.)
.PP
.I compress
should be more flexible about the existence of the `.Z' suffix.