.\" @(#)compress.1 6.5 (Berkeley) %G%
.\"
.TH COMPRESS 1 ""
.UC 6
.SH NAME
compress, uncompress, zcat \- compress and expand data
.SH SYNOPSIS
.PU
.ll +8
.B compress
[
.B \-f
] [
.B \-v
] [
.B \-c
] [
.B \-b
.I bits
] [
.I "name \&..."
]
.ll -8
.br
.B uncompress
[
.B \-f
] [
.B \-v
] [
.B \-c
] [
.I "name \&..."
]
.br
.B zcat
[
.I "name \&..."
]
.SH DESCRIPTION
.I Compress
reduces the size of the named files using adaptive Lempel-Ziv coding.
Whenever possible,
each file is replaced by one with the extension
.B "\&.Z,"
while keeping the same ownership, modes, and access and modification times.
If no files are specified, the standard input is compressed to the
standard output.
Compressed files can be restored to their original form using
.I uncompress
or
.IR zcat .
.PP
The
.B \-f
option will force compression of
.IR name ,
even if it does not actually shrink
or the corresponding
.IR name .Z
file already exists.
Except when run in the background under
.IR /bin/sh ,
if
.B \-f
is not given the user is prompted as to whether an existing
.IR name .Z
file should be overwritten.
.PP
The
.B \-c
(``cat'') option makes
.I compress/uncompress
write to the standard output; no files are changed.
The nondestructive behavior of
.I zcat
is identical to that of
.I uncompress
.B \-c.
.PP
.I Compress
uses the modified Lempel-Ziv algorithm popularized in
"A Technique for High Performance Data Compression",
Terry A. Welch,
.I "IEEE Computer,"
vol. 17, no. 6 (June 1984), pp. 8-19.
Common substrings in the file are first replaced by 9-bit codes 257 and up.
When code 512 is reached, the algorithm switches to 10-bit codes and
continues to use more bits until the
limit specified by the
.B \-b
flag is reached (default 16).
.I Bits
must be between 9 and 16. The default can be changed in the source to allow
.I compress
to be run on a smaller machine.
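.PP
The growth of the code width described above can be illustrated with a
short Python sketch.
It is not taken from the
.I compress
source: the helper name code_width is hypothetical, and the sketch follows
the wording of this page (codes below 512 use 9 bits, code 512 is the first
to use 10 bits), so the real program may time the switch slightly differently.
.nf
```python
def code_width(code, maxbits=16):
    """Return the number of bits used to emit `code` (hypothetical helper).

    Widths start at 9 bits and grow by one each time the code value
    reaches the next power of two, up to the -b limit `maxbits`.
    """
    if not 9 <= maxbits <= 16:
        raise ValueError("bits must be between 9 and 16")
    if not 0 <= code < (1 << maxbits):
        raise ValueError("code does not fit in maxbits")
    width = 9
    while code >= (1 << width):
        width += 1
    return width
```
.fi
With the default limit, code 511 is written with 9 bits and code 512 with
10 bits; with a limit of 12, no code of 4096 or above can be emitted.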
.PP
After the
.I bits
limit is attained,
.I compress
periodically checks the compression ratio. If it is increasing,
.I compress
continues to use the existing code dictionary. However,
if the compression ratio decreases,
.I compress
discards the table of substrings and rebuilds it from scratch. This allows
the algorithm to adapt to the next "block" of the file.
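.PP
The periodic check can be sketched in Python as follows.
This is an illustration of the rule stated above, not the actual
implementation: the class name RatioMonitor, the way the ratio is computed
(input bytes divided by output bytes), and the choice to treat an unchanged
ratio like a decrease are all assumptions.
.nf
```python
class RatioMonitor:
    """Decide when to discard the code table (illustrative only)."""

    def __init__(self):
        self.best_ratio = 0.0

    def should_reset(self, bytes_in, bytes_out):
        """Return True if the table should be rebuilt at this checkpoint."""
        if bytes_out <= 0:
            return False
        ratio = bytes_in / bytes_out
        if ratio > self.best_ratio:
            # Ratio is still improving: keep the existing dictionary.
            self.best_ratio = ratio
            return False
        # Ratio did not improve (assumed to count as a decrease):
        # forget the history so the next block starts from scratch.
        self.best_ratio = 0.0
        return True
```
.fi
A caller would invoke should_reset at each checkpoint once the
.I bits
limit has been reached, and clear its substring table whenever the result
is true.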
.PP
Note that the
.B \-b
flag is omitted for
.I uncompress,
since the
.I bits
parameter specified during compression
is encoded within the output, along with
a magic number to ensure that neither decompression of random data nor
recompression of compressed data is attempted.
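.PP
As an illustration of how a reader might recover these values, the Python
sketch below inspects the start of a compressed stream.
The byte values are assumptions that this page does not state: a two-byte
magic number of 0x1f 0x9d, followed by one byte whose low five bits hold
.I bits
and whose high bit flags the adaptive table reset.
The function name parse_header is hypothetical.
.nf
```python
MAGIC = bytes([0x1F, 0x9D])   # assumed magic number
BIT_MASK = 0x1F               # assumed: low five bits hold the bits value
BLOCK_MASK = 0x80             # assumed: flag for adaptive table resets


def parse_header(data):
    """Return (bits, block_mode) from the first three bytes of `data`."""
    if len(data) < 3 or data[:2] != MAGIC:
        raise ValueError("not in compressed format")
    flags = data[2]
    return flags & BIT_MASK, bool(flags & BLOCK_MASK)
```
.fi
Under these assumptions a third byte of 0x90 would mean 16-bit codes with
table resets enabled.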
.PP
.ne 8
The amount of compression obtained depends on the size of the
input, the number of
.I bits
per code, and the distribution of common substrings.
Typically, text such as source code or English
is reduced by 50\-60%.
Compression is generally much better than that achieved by
Huffman coding (as used in
.IR pack ),
or adaptive Huffman coding
.RI ( compact ),
and takes less time to compute.
.PP
The
.B \-v
option prints the percentage reduction of each file.
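.PP
The reported figure can be read as the share of the input that was saved.
A minimal Python sketch of that arithmetic follows; the function names are
hypothetical and the exact rounding used by
.I compress
is not specified here.
.nf
```python
def percent_saved(bytes_in, bytes_out):
    """Percentage of the input saved; negative if the output grew."""
    if bytes_in <= 0:
        raise ValueError("input size must be positive")
    return 100.0 * (bytes_in - bytes_out) / bytes_in


def format_saved(bytes_in, bytes_out):
    """Format the saving in an xx.xx% style (assumed two decimals)."""
    return "{:.2f}%".format(percent_saved(bytes_in, bytes_out))
```
.fi
For example, a 1000-byte file that compresses to 450 bytes is reduced by
55 percent.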
.PP
If an error occurs, the exit status is 1;
otherwise, if the last file was not compressed because it would have
become larger, the status is 2;
otherwise the status is 0.
.SH "DIAGNOSTICS"
Usage: compress [\-fvc] [\-b maxbits] [file ...]
.in +8
Invalid options were specified on the command line.
.in -8
Missing maxbits
.in +8
Maxbits must follow
.BR \-b \.
.in -8
.IR file :
not in compressed format
.in +8
The file specified to
.I uncompress
has not been compressed.
.in -8
.IR file :
compressed with
.I xx
bits, can only handle
.I yy
bits
.in +8
.I File
was compressed by a program that could deal with
more
.I bits
than the compress code on this machine.
Recompress the file with smaller
.IR bits \.
.in -8
.IR file :
already has .Z suffix -- no change
.in +8
The file is assumed to be already compressed.
Rename the file and try again.
.in -8
.IR file :
filename too long to tack on .Z
.in +8
The file cannot be compressed because its name is longer than
12 characters.
Rename and try again.
This message does not occur on BSD systems.
.in -8
.I file
already exists; do you wish to overwrite (y or n)?
.in +8
Respond "y" if you want the output file to be replaced; "n" if not.
.in -8
uncompress: corrupt input
.in +8
A SIGSEGV violation was detected, which usually means that the input file is
corrupted.
.in -8
Compression:
.I "xx.xx%"
.in +8
Percentage of the input saved by compression.
(Relevant only for
.BR \-v \.)
.in -8
-- not a regular file: unchanged
.in +8
When the input file is not a regular file
(e.g. a directory), it is
left unaltered.
.in -8
-- has
.I xx
other links: unchanged
.in +8
The input file has links; it is left unchanged. See
.IR ln "(1)"
for more information.
.in -8
-- file unchanged
.in +8
No savings are achieved by
compression. The input is left unchanged.
.in -8
.SH "BUGS"
Although compressed files are compatible between machines with large memory,
.BR \-b 12
should be used for file transfer to architectures with
a small process data space (64KB or less, as exhibited by the DEC PDP
series, the Intel 80286, etc.).
.PP
.I compress
should be more flexible about the existence of the `.Z' suffix.