Annotation of src/usr.bin/compress/compress.1, Revision 1.25
1.25 ! jmc 1: .\" $OpenBSD: compress.1,v 1.24 2003/08/02 17:45:15 millert Exp $
1.1 deraadt 2: .\" $NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
3: .\"
4: .\" Copyright (c) 1986, 1990, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" James A. Woods, derived from original work by Spencer Thomas
9: .\" and Joseph Orost.
10: .\"
11: .\" Redistribution and use in source and binary forms, with or without
12: .\" modification, are permitted provided that the following conditions
13: .\" are met:
14: .\" 1. Redistributions of source code must retain the above copyright
15: .\" notice, this list of conditions and the following disclaimer.
16: .\" 2. Redistributions in binary form must reproduce the above copyright
17: .\" notice, this list of conditions and the following disclaimer in the
18: .\" documentation and/or other materials provided with the distribution.
1.16 millert 19: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 20: .\" may be used to endorse or promote products derived from this software
21: .\" without specific prior written permission.
22: .\"
23: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
24: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
25: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
26: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
27: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
28: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
29: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
30: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
31: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
32: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
33: .\" SUCH DAMAGE.
34: .\"
35: .\" @(#)compress.1 8.2 (Berkeley) 4/18/94
36: .\"
37: .Dd April 18, 1994
38: .Dt COMPRESS 1
1.7 aaron 39: .Os
1.1 deraadt 40: .Sh NAME
41: .Nm compress ,
1.18 deraadt 42: .Nm uncompress ,
43: .Nm gzip ,
44: .Nm gunzip
1.1 deraadt 45: .Nd compress and expand data
46: .Sh SYNOPSIS
47: .Nm compress
1.14 mickey 48: .Op Fl LV
49: .Nm compress
1.21 millert 50: .Op Fl cdfghlOnNqrtv123456789
1.1 deraadt 51: .Op Fl b Ar bits
1.18 deraadt 52: .Op Fl S Ar suffix
1.4 mickey 53: .Op Fl o Ar filename
1.1 deraadt 54: .Op Ar
55: .Nm uncompress
1.21 millert 56: .Op Fl cfhlnNqrtv
1.18 deraadt 57: .Op Fl o Ar filename
58: .Op Ar
59: .Pp
60: .Nm gzip
61: .Op Fl LV
62: .Nm gzip
1.21 millert 63: .Op Fl cdfghlnNOqrtv123456789
1.18 deraadt 64: .Op Fl b Ar bits
65: .Op Fl S Ar suffix
66: .Op Fl o Ar filename
67: .Op Ar
68: .Nm gunzip
1.21 millert 69: .Op Fl cfhnNqrltv
1.4 mickey 70: .Op Fl o Ar filename
1.7 aaron 71: .Op Ar
1.18 deraadt 72: .Pp
1.12 mickey 73: .Nm zcat
1.20 millert 74: .Op Fl fhqr
1.12 mickey 75: .Op Ar
1.1 deraadt 76: .Sh DESCRIPTION
1.9 aaron 77: The
1.18 deraadt 78: .Nm compress
79: and
80: .Nm gzip
81: utilities
82: reduce the size of the named files using adaptive Lempel-Ziv coding.
1.19 jmc 83: They are functionally identical, but use different algorithms for compression.
84: If invoked as
1.18 deraadt 85: .Nm gzip
1.19 jmc 86: or
87: .Nm compress Fl g
88: the deflate mode of compression is chosen by default;
89: otherwise the older method of compression
90: .Pq compress mode
91: is used.
92: .Pp
1.1 deraadt 93: Each
94: .Ar file
95: is renamed to the same name plus the extension
1.17 jmc 96: .Dq .Z ,
1.14 mickey 97: or
1.17 jmc 98: .Dq .gz
1.14 mickey 99: (in deflate mode).
1.1 deraadt 100: As many of the modification time, access time, file flags, file mode,
101: user ID, and group ID as allowed by permissions are retained in the
102: new file.
103: If compression would not reduce the size of a
104: .Ar file ,
1.17 jmc 105: the file is ignored (unless
1.14 mickey 106: .Fl f
107: is used).
1.1 deraadt 108: .Pp
1.9 aaron 109: The
1.6 aaron 110: .Nm uncompress
1.18 deraadt 111: and
112: .Nm gunzip
113: utilities restore compressed files to their original form, renaming the
1.24 millert 114: files by removing the extension (or by using the stored name if the
115: .Fl N
116: flag is specified).
117: When decompressing, the following extensions are recognized:
118: .Dq .Z ,
119: .Dq -Z ,
120: .Dq _Z ,
121: .Dq .gz ,
122: .Dq -gz ,
123: .Dq _gz ,
124: .Dq .tgz ,
125: .Dq -tgz ,
126: .Dq _tgz ,
127: .Dq .taz ,
128: .Dq -taz ,
129: and
130: .Dq _taz .
1.25 ! jmc 131: Extensions ending in
1.24 millert 132: .Dq tgz
133: and
134: .Dq taz
135: are not removed when decompressing; instead they are converted to
136: .Dq tar .
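The extension handling described above can be sketched in Python. This is an illustration only, not the utility's actual code: the helper name decompressed_name is hypothetical, and it assumes the separator character is kept when "tgz" or "taz" is converted to "tar".

```python
# Hypothetical helper: derive the output name used when decompressing.
# Assumption: the separator ('.', '-' or '_') is preserved when a
# "tgz"/"taz" extension is converted to "tar".
SEPARATORS = (".", "-", "_")
PLAIN_EXTENSIONS = ("Z", "gz")      # removed entirely
TAR_EXTENSIONS = ("tgz", "taz")     # converted to "tar"


def decompressed_name(name):
    """Return the decompressed file name, or None if no known extension."""
    for sep in SEPARATORS:
        for ext in TAR_EXTENSIONS:
            suffix = sep + ext
            if name.endswith(suffix) and len(name) > len(suffix):
                # Keep the separator, swap the extension for "tar".
                return name[:-len(ext)] + "tar"
        for ext in PLAIN_EXTENSIONS:
            suffix = sep + ext
            if name.endswith(suffix) and len(name) > len(suffix):
                # Strip separator and extension together.
                return name[:-len(suffix)]
    return None
```

For example, decompressed_name("notes.txt.Z") gives "notes.txt", while a name without a recognized extension gives None.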
1.12 mickey 137: .Pp
138: The
139: .Nm zcat
1.13 mickey 140: command is equivalent in functionality to
1.12 mickey 141: .Nm uncompress
1.13 mickey 142: .Fl c .
1.1 deraadt 143: .Pp
144: If renaming the files would cause files to be overwritten and the standard
145: input device is a terminal, the user is prompted (on the standard error
146: output) for confirmation.
147: If prompting is not possible or confirmation is not received, the files
148: are not overwritten.
149: .Pp
150: If no files are specified, the standard input is compressed or uncompressed
151: to the standard output.
1.9 aaron 152: If either the input or output files are not regular files, the checks for
1.1 deraadt 153: reduction in size and file overwriting are not performed, the input file is
154: not removed, and the attributes of the input file are not retained.
155: .Pp
156: The options are as follows:
157: .Bl -tag -width Ds
1.14 mickey 158: .It Fl V
1.23 jmc 159: Display the program version
160: .Pq RCS IDs of the source files
161: and exit.
1.6 aaron 162: .It Fl b Ar bits
1.1 deraadt 163: Specify the
164: .Ar bits
1.23 jmc 165: code limit
166: .Pq see below .
1.1 deraadt 167: .It Fl c
168: Compressed or uncompressed output is written to the standard output.
1.17 jmc 169: No files are modified (force
1.14 mickey 170: .Nm zcat
171: mode).
1.4 mickey 172: .It Fl d
1.14 mickey 173: Decompress the source files instead of compressing them (force
174: .Nm uncompress
175: mode).
1.1 deraadt 176: .It Fl f
177: Force compression of
178: .Ar file ,
179: even if it is not actually reduced in size.
180: Additionally, files are overwritten without prompting for confirmation.
1.4 mickey 181: .It Fl g
1.14 mickey 182: Use the deflate scheme, which reportedly provides better compression rates (force
1.17 jmc 183: .Nm gzip
1.14 mickey 184: mode).
1.18 deraadt 185: This flag need not be specified when invoked as
186: .Nm gzip .
1.20 millert 187: .It Fl h
188: Print a short help message.
1.21 millert 189: .It Fl l
190: List information for the specified compressed files.
191: The following information is listed:
1.23 jmc 192: .Bl -tag -width "compression ratio"
1.21 millert 193: .It compressed size
1.23 jmc 194: Size of the compressed file.
1.21 millert 195: .It uncompressed size
1.23 jmc 196: Size of the file when uncompressed.
1.21 millert 197: .It compression ratio
1.23 jmc 198: Ratio of the difference between the compressed and uncompressed
1.21 millert 199: sizes to the uncompressed size.
200: .It uncompressed name
1.23 jmc 201: Name the file will be saved as when uncompressing.
1.21 millert 202: .El
203: .Pp
204: If the
205: .Fl v
206: option is specified, the following additional information is printed:
1.23 jmc 207: .Bl -tag -width "compression method"
1.21 millert 208: .It compression method
1.23 jmc 209: Name of the method used to compress the file.
1.21 millert 210: .It crc
1.23 jmc 211: 32-bit CRC
212: .Pq cyclic redundancy code
213: of the uncompressed file.
1.21 millert 214: .It "time stamp"
1.23 jmc 215: Date and time corresponding to the last data modification time
1.21 millert 216: (mtime) of the compressed file (if the
217: .Fl n
218: option is specified, the time stamp stored in the compressed file
219: is printed instead).
220: .El
221: .It Fl n
222: When compressing, do not save the original file name and time stamp.
223: This information is saved by default when the deflate scheme is used.
224: When uncompressing, do not restore the original file name and time stamp.
225: By default, the uncompressed file inherits the time stamp of the
1.24 millert 226: compressed version and the uncompressed file name is generated from
227: the name of the compressed file as described above.
1.21 millert 228: .It Fl N
229: When compressing, save the original file name and time stamp in the
230: compressed file.
231: This information is saved by default when the deflate scheme is used.
232: When uncompressing or listing, use the time stamp and file name stored
233: in the compressed file, if any, for the uncompressed version.
1.14 mickey 234: .It Fl 1...9
1.19 jmc 235: Use the deflate scheme with a compression factor of
236: .Fl 1
237: to
238: .Fl 9 .
239: Compression factor
240: .Fl 1
241: is the fastest, but provides a poorer level of compression.
242: Compression factor
243: .Fl 9
244: provides the best level of compression, but is relatively slow.
245: The default is
246: .Fl 6 .
247: This option implies
248: .Fl g .
1.4 mickey 249: .It Fl O
1.14 mickey 250: Use the old compression method.
1.6 aaron 251: .It Fl o Ar filename
1.4 mickey 252: Set the output file name.
1.14 mickey 253: .It Fl S Ar suffix
254: Set suffix for compressed files.
1.4 mickey 255: .It Fl t
1.6 aaron 256: Test the integrity of each file, leaving any files intact.
1.15 millert 257: .It Fl r
258: Recursive mode,
259: .Nm
260: will descend into specified directories.
1.4 mickey 261: .It Fl q
1.14 mickey 262: Be quiet, suppress all messages.
1.1 deraadt 263: .It Fl v
1.14 mickey 264: Print the percentage reduction of each file and other information.
1.1 deraadt 265: .El
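As a worked example of the compression ratio listed by the -l option, the definition above can be written as a small Python function. The function name is hypothetical, and two points are assumptions rather than documented behaviour: the difference is taken as uncompressed minus compressed, and an empty file is reported as a ratio of 0.

```python
def compression_ratio(compressed_size, uncompressed_size):
    """Ratio of the size difference to the uncompressed size.

    Assumptions: the difference is uncompressed - compressed, and an
    empty input yields 0.0 instead of dividing by zero.
    """
    if uncompressed_size == 0:
        return 0.0
    return (uncompressed_size - compressed_size) / uncompressed_size
```

Under these assumptions, a 1000-byte file that compresses to 400 bytes has a ratio of 0.6, which corresponds to a 60% reduction.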
266: .Pp
1.14 mickey 267: In normal mode,
1.8 aaron 268: .Nm
1.19 jmc 269: uses a modified Lempel-Ziv algorithm
270: .Pq LZW .
1.1 deraadt 271: Common substrings in the file are first replaced by 9-bit codes 257 and up.
272: When code 512 is reached, the algorithm switches to 10-bit codes and
273: continues to use more bits until the
274: limit specified by the
275: .Fl b
1.9 aaron 276: flag is reached.
1.6 aaron 277: .Ar bits
1.23 jmc 278: must be between 9 and 16
279: .Pq the default is 16 .
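The growth of the code width can be illustrated with a short Python sketch. It is a simplification of the description above, not the implementation: the function name code_width and the parameter next_code (the next unassigned code) are assumptions made for the example.

```python
def code_width(next_code, maxbits=16):
    """Return the code width in bits once next_code has been reached.

    Simplified model: codes start at 9 bits, and the width grows by one
    bit each time the next code no longer fits, up to the -b limit.
    """
    if not 9 <= maxbits <= 16:
        raise ValueError("bits must be between 9 and 16")
    width = 9
    while width < maxbits and next_code >= (1 << width):
        width += 1
    return width
```

With the default limit, code_width(511) is 9 and code_width(512) is 10; with maxbits=12 the width never exceeds 12, however many codes are requested.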
1.1 deraadt 280: .Pp
281: After the
282: .Ar bits
283: limit is reached,
1.8 aaron 284: .Nm
1.1 deraadt 285: periodically checks the compression ratio.
286: If it is increasing,
1.8 aaron 287: .Nm
1.1 deraadt 288: continues to use the existing code dictionary.
289: However, if the compression ratio decreases,
1.8 aaron 290: .Nm
1.11 aaron 291: discards the table of substrings and rebuilds it from scratch.
292: This allows the algorithm to adapt to the next
1.8 aaron 293: .Dq block
294: of the file.
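The periodic ratio check can be sketched as follows. This is a minimal model under stated assumptions: the class name RatioMonitor is hypothetical, the ratio is taken as input bytes divided by output bytes (so a larger value means better compression), and the point at which checkpoints occur is left to the caller.

```python
class RatioMonitor:
    """Decide at each checkpoint whether the code table should be reset.

    Assumption: ratio = bytes_in / bytes_out, so larger is better.
    """

    def __init__(self):
        self.best_ratio = 0.0

    def checkpoint(self, bytes_in, bytes_out):
        """Return True if the table should be discarded and rebuilt."""
        if bytes_out <= 0:
            raise ValueError("bytes_out must be positive")
        ratio = bytes_in / bytes_out
        if ratio >= self.best_ratio:
            # Compression is holding or improving: keep the dictionary.
            self.best_ratio = ratio
            return False
        # Compression got worse: start over for the next block.
        self.best_ratio = 0.0
        return True
```

A caller would invoke checkpoint() at intervals once the bits limit is reached and clear its table of substrings whenever the result is True.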
1.1 deraadt 295: .Pp
1.18 deraadt 296: .Nm gzip
1.19 jmc 297: uses a slightly different version of the Lempel-Ziv algorithm
298: .Pq LZ77 .
299: Common substrings are replaced by pointers to previous strings,
300: and are found using a hash table.
301: Unique substrings are emitted as a string of literal bytes,
302: and compressed using Huffman trees.
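The effect of the deflate compression factor can be explored with Python's standard zlib module, which implements the same deflate method. This is a sketch for experimentation and does not invoke the utilities described here; the helper name deflate_sizes is hypothetical.

```python
import zlib


def deflate_sizes(data, levels=(1, 6, 9)):
    """Return {level: compressed size in bytes} for the given levels."""
    sizes = {}
    for level in levels:
        compressed = zlib.compress(data, level)
        # Round trip to confirm the data is recovered unchanged.
        if zlib.decompress(compressed) != data:
            raise RuntimeError("round trip failed")
        sizes[level] = len(compressed)
    return sizes


sample = b"the quick brown fox jumps over the lazy dog\n" * 200
sizes = deflate_sizes(sample)
```

Highly repetitive input such as the sample above shrinks considerably at every level; the difference between levels is more visible on larger, less regular data.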
1.18 deraadt 303: .Pp
1.1 deraadt 304: The
305: .Fl b
306: flag is omitted for
1.3 deraadt 307: .Nm uncompress
1.18 deraadt 308: or
309: .Nm gunzip
1.1 deraadt 310: since the
311: .Ar bits
312: parameter specified during compression
313: is encoded within the output, along with
314: a magic number to ensure that neither decompression of random data nor
315: recompression of compressed data is attempted.
316: .Pp
317: The amount of compression obtained depends on the size of the
318: input, the number of
319: .Ar bits
320: per code, and the distribution of common substrings.
1.23 jmc 321: Typically, text such as source code or English is reduced by 50 \- 60% using
1.19 jmc 322: .Nm
1.23 jmc 323: and by 60 \- 70% using
1.19 jmc 324: .Nm gzip .
1.1 deraadt 325: Compression is generally much better than that achieved by Huffman
326: coding (as used in the historical command pack), or adaptive Huffman
327: coding (as used in the historical command compact), and takes less
328: time to compute.
329: .Pp
330: The
1.8 aaron 331: .Nm
1.18 deraadt 332: and
333: .Nm gzip
334: utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
1.5 denny 335: more files were not compressed because they would have grown in
1.6 aaron 336: size (and
337: .Fl f
1.9 aaron 338: was not specified).
1.22 millert 339: .Sh RETURN VALUES
340: The
341: .Nm
342: utility exits with one of the following values:
343: .Pp
344: .Bl -tag -width flag -compact
345: .It Li 0
346: The file was compressed successfully.
347: .It Li 1
348: An error occurred.
349: .It Li 2
350: A warning occurred.
1.23 jmc 351: .El
1.1 deraadt 352: .Sh SEE ALSO
353: .Rs
354: .%A Welch, Terry A.
355: .%D June, 1984
356: .%T "A Technique for High Performance Data Compression"
357: .%J "IEEE Computer"
358: .%V 17:6
359: .%P pp. 8-19
360: .Re
1.19 jmc 361: .Pp
362: .Bl -tag -width 12n -compact
1.23 jmc 363: .It RFC 1950
364: ZLIB Compressed Data Format Specification.
365: .It RFC 1951
366: DEFLATE Compressed Data Format Specification.
367: .It RFC 1952
368: GZIP File Format Specification.
1.19 jmc 369: .El
1.5 denny 370: .Sh STANDARDS
371: The
1.8 aaron 372: .Nm
1.5 denny 373: utility is compliant with the
374: .St -p1003.2-92
375: specification.
1.19 jmc 376: .Pp
377: The
378: .Nm gzip
379: and
380: .Nm gunzip
381: utilities are extensions.
1.1 deraadt 382: .Sh HISTORY
383: The
384: .Nm
385: command appeared in
386: .Bx 4.3 .
1.6 aaron 387: The deflate compression support was added in
1.4 mickey 388: .Ox 2.1 .