Annotation of src/usr.bin/compress/compress.1, Revision 1.31
1.31 ! jmc 1: .\" $OpenBSD: compress.1,v 1.30 2005/07/22 08:38:46 jmc Exp $
1.1 deraadt 2: .\" $NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
3: .\"
4: .\" Copyright (c) 1986, 1990, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" James A. Woods, derived from original work by Spencer Thomas
9: .\" and Joseph Orost.
10: .\"
11: .\" Redistribution and use in source and binary forms, with or without
12: .\" modification, are permitted provided that the following conditions
13: .\" are met:
14: .\" 1. Redistributions of source code must retain the above copyright
15: .\" notice, this list of conditions and the following disclaimer.
16: .\" 2. Redistributions in binary form must reproduce the above copyright
17: .\" notice, this list of conditions and the following disclaimer in the
18: .\" documentation and/or other materials provided with the distribution.
1.16 millert 19: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 20: .\" may be used to endorse or promote products derived from this software
21: .\" without specific prior written permission.
22: .\"
23: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
24: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
25: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
26: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
27: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
28: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
29: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
30: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
31: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
32: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
33: .\" SUCH DAMAGE.
34: .\"
35: .\" @(#)compress.1 8.2 (Berkeley) 4/18/94
36: .\"
37: .Dd April 18, 1994
38: .Dt COMPRESS 1
1.7 aaron 39: .Os
1.1 deraadt 40: .Sh NAME
41: .Nm compress ,
1.18 deraadt 42: .Nm uncompress ,
43: .Nm gzip ,
1.31 ! jmc 44: .Nm gunzip ,
! 45: .Nm zcat ,
! 46: .Nm gzcat
1.1 deraadt 47: .Nd compress and expand data
48: .Sh SYNOPSIS
49: .Nm compress
1.14 mickey 50: .Op Fl LV
51: .Nm compress
1.30 jmc 52: .Op Fl 123456789cdfghlNnOqrtv
1.1 deraadt 53: .Op Fl b Ar bits
1.30 jmc 54: .Op Fl o Ar filename
1.18 deraadt 55: .Op Fl S Ar suffix
1.1 deraadt 56: .Op Ar
57: .Nm uncompress
1.30 jmc 58: .Op Fl cfhlNnqrtv
1.18 deraadt 59: .Op Fl o Ar filename
60: .Op Ar
61: .Pp
62: .Nm gzip
63: .Op Fl LV
64: .Nm gzip
1.30 jmc 65: .Op Fl 123456789cdfghlNnOqrtv
1.18 deraadt 66: .Op Fl b Ar bits
1.30 jmc 67: .Op Fl o Ar filename
1.18 deraadt 68: .Op Fl S Ar suffix
69: .Op Ar
70: .Nm gunzip
1.30 jmc 71: .Op Fl cfhlNnqrtv
1.4 mickey 72: .Op Fl o Ar filename
1.7 aaron 73: .Op Ar
1.18 deraadt 74: .Pp
1.12 mickey 75: .Nm zcat
1.28 jmc 76: .Op Fl fghqr
77: .Op Ar
78: .Nm gzcat
1.20 millert 79: .Op Fl fhqr
1.12 mickey 80: .Op Ar
1.1 deraadt 81: .Sh DESCRIPTION
1.9 aaron 82: The
1.18 deraadt 83: .Nm compress
84: and
85: .Nm gzip
86: utilities
87: reduce the size of the named files using adaptive Lempel-Ziv coding.
1.19 jmc 88: They are functionally identical, but use different algorithms for compression.
89: If invoked as
1.18 deraadt 90: .Nm gzip
1.19 jmc 91: or
1.28 jmc 92: .Nm compress Fl g ,
1.19 jmc 93: the deflate mode of compression is chosen by default;
94: otherwise the older method of compression
95: .Pq compress mode
96: is used.
97: .Pp
1.1 deraadt 98: Each
99: .Ar file
100: is renamed to the same name plus the extension
1.17 jmc 101: .Dq .Z ,
1.14 mickey 102: or
1.17 jmc 103: .Dq .gz
1.14 mickey 104: (in deflate mode).
1.1 deraadt 105: As many of the modification time, access time, file flags, file mode,
106: user ID, and group ID as allowed by permissions are retained in the
107: new file.
108: If compression would not reduce the size of a
109: .Ar file ,
1.17 jmc 110: the file is ignored (unless
1.14 mickey 111: .Fl f
112: is used).
1.1 deraadt 113: .Pp
1.9 aaron 114: The
1.6 aaron 115: .Nm uncompress
1.18 deraadt 116: and
117: .Nm gunzip
118: utilities restore compressed files to their original form, renaming the
1.24 millert 119: files by removing the extension (or by using the stored name if the
120: .Fl N
121: flag is specified).
122: When decompressing, the following extensions are recognized:
123: .Dq .Z ,
124: .Dq -Z ,
125: .Dq _Z ,
126: .Dq .gz ,
127: .Dq -gz ,
128: .Dq _gz ,
129: .Dq .tgz ,
130: .Dq -tgz ,
131: .Dq _tgz ,
132: .Dq .taz ,
133: .Dq -taz ,
134: and
135: .Dq _taz .
1.25 jmc 136: Extensions ending in
1.24 millert 137: .Dq tgz
138: and
139: .Dq taz
140: are not removed when decompressing; instead they are converted to
141: .Dq tar .
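The extension handling described above can be sketched in Python as follows.
This is an illustration only, not the code the utilities use: the function name
output_name is hypothetical, and it assumes that for the tgz and taz forms the
separator character is kept and only the extension is rewritten to tar.

```python
from typing import Optional

# Separators and extensions taken from the list in the text above.
SEPARATORS = (".", "-", "_")


def output_name(name: str) -> Optional[str]:
    """Return the decompressed file name, or None if no known extension matches."""
    for sep in SEPARATORS:
        for ext in ("tgz", "taz"):
            if name.endswith(sep + ext):
                # Assumption: keep the separator, swap the extension for "tar".
                return name[: -len(ext)] + "tar"
        for ext in ("gz", "Z"):
            if name.endswith(sep + ext):
                # Plain compressed extensions are removed along with the separator.
                return name[: -(len(ext) + 1)]
    return None


if __name__ == "__main__":
    for candidate in ("archive.tgz", "notes.txt.gz", "data_Z", "README"):
        print(candidate, "->", output_name(candidate))
```

A name with no recognized extension yields None here; how the real utilities
report that case is not covered by this sketch.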
1.12 mickey 142: .Pp
143: The
144: .Nm zcat
1.13 mickey 145: command is equivalent in functionality to
1.12 mickey 146: .Nm uncompress
1.13 mickey 147: .Fl c .
1.28 jmc 148: The
149: .Nm gzcat
150: command is equivalent in functionality to
151: .Nm gunzip
152: .Fl c .
1.1 deraadt 153: .Pp
154: If renaming the files would cause files to be overwritten and the standard
155: input device is a terminal, the user is prompted (on the standard error
156: output) for confirmation.
157: If prompting is not possible or confirmation is not received, the files
158: are not overwritten.
159: .Pp
160: If no files are specified, the standard input is compressed or uncompressed
161: to the standard output.
1.9 aaron 162: If either the input or output files are not regular files, the checks for
1.1 deraadt 163: reduction in size and file overwriting are not performed, the input file is
164: not removed, and the attributes of the input file are not retained.
165: .Pp
166: The options are as follows:
167: .Bl -tag -width Ds
1.30 jmc 168: .It Fl 1...9
169: Use the deflate scheme with a compression factor of
170: .Fl 1
171: to
172: .Fl 9 .
173: Compression factor
174: .Fl 1
175: is the fastest, but provides a poorer level of compression.
176: Compression factor
177: .Fl 9
178: provides the best level of compression, but is relatively slow.
179: The default is
180: .Fl 6 .
181: This option implies
182: .Fl g .
1.6 aaron 183: .It Fl b Ar bits
1.1 deraadt 184: Specify the
185: .Ar bits
1.23 jmc 186: code limit
187: .Pq see below .
1.1 deraadt 188: .It Fl c
189: Compressed or uncompressed output is written to the standard output.
1.17 jmc 190: No files are modified (force
1.14 mickey 191: .Nm zcat
1.28 jmc 192: or
193: .Nm gzcat
1.14 mickey 194: mode).
1.4 mickey 195: .It Fl d
1.14 mickey 196: Decompress the source files instead of compressing them (force
197: .Nm uncompress
198: mode).
1.1 deraadt 199: .It Fl f
200: Force compression of
201: .Ar file ,
202: even if it is not actually reduced in size.
203: Additionally, files are overwritten without prompting for confirmation.
1.27 tedu 204: If the input data is not in a format recognized by
205: .Nm
206: and if the option
207: .Fl c
208: is also given, copy the input data without change
1.29 jmc 209: to the standard output: let
1.27 tedu 210: .Nm zcat
1.28 jmc 211: or
212: .Nm gzcat
1.27 tedu 213: behave as
1.28 jmc 214: .Xr cat 1 .
1.4 mickey 215: .It Fl g
1.14 mickey 216: Use the deflate scheme, which reportedly provides better compression rates (force
1.17 jmc 217: .Nm gzip
1.14 mickey 218: mode).
1.18 deraadt 219: This flag need not be specified when invoked as
220: .Nm gzip .
1.20 millert 221: .It Fl h
222: Print a short help message.
1.21 millert 223: .It Fl l
224: List information for the specified compressed files.
225: The following information is listed:
1.23 jmc 226: .Bl -tag -width "compression ratio"
1.21 millert 227: .It compressed size
1.23 jmc 228: Size of the compressed file.
1.21 millert 229: .It uncompressed size
1.23 jmc 230: Size of the file when uncompressed.
1.21 millert 231: .It compression ratio
1.23 jmc 232: Ratio of the difference between the compressed and uncompressed
1.21 millert 233: sizes to the uncompressed size.
234: .It uncompressed name
1.23 jmc 235: Name the file will be saved as when uncompressing.
1.21 millert 236: .El
237: .Pp
238: If the
239: .Fl v
240: option is specified, the following additional information is printed:
1.23 jmc 241: .Bl -tag -width "compression method"
1.21 millert 242: .It compression method
1.23 jmc 243: Name of the method used to compress the file.
1.21 millert 244: .It crc
1.23 jmc 245: 32-bit CRC
246: .Pq cyclic redundancy code
247: of the uncompressed file.
1.21 millert 248: .It "time stamp"
1.23 jmc 249: Date and time corresponding to the last data modification time
1.21 millert 250: (mtime) of the compressed file (if the
251: .Fl n
252: option is specified, the time stamp stored in the compressed file
253: is printed instead).
254: .El
1.30 jmc 255: .It Fl N
256: When compressing, save the original file name and time stamp in the
257: compressed file.
258: This information is saved by default when the deflate scheme is used.
259: When uncompressing or listing, use the time stamp and file name stored
260: in the compressed file, if any, for the uncompressed version.
1.21 millert 261: .It Fl n
262: When compressing, do not save the original file name and time stamp.
263: This information is saved by default when the deflate scheme is used.
264: When uncompressing, do not restore the original file name and time stamp.
265: By default, the uncompressed file inherits the time stamp of the
1.24 millert 266: compressed version and the uncompressed file name is generated from
267: the name of the compressed file as described above.
1.4 mickey 268: .It Fl O
1.14 mickey 269: Use the old compression method.
1.6 aaron 270: .It Fl o Ar filename
1.4 mickey 271: Set the output file name.
1.30 jmc 272: .It Fl q
273: Be quiet: suppress all messages.
274: .It Fl r
275: Recursive mode:
276: .Nm
277: will descend into specified directories.
1.14 mickey 278: .It Fl S Ar suffix
279: Set suffix for compressed files.
1.4 mickey 280: .It Fl t
1.6 aaron 281: Test the integrity of each file, leaving all files intact.
1.30 jmc 282: .It Fl V
283: Display the program version
284: .Pq RCS IDs of the source files
285: and exit.
1.1 deraadt 286: .It Fl v
1.14 mickey 287: Print the percentage reduction of each file and other information.
1.1 deraadt 288: .El
289: .Pp
1.14 mickey 290: In normal mode,
1.8 aaron 291: .Nm
1.19 jmc 292: uses a modified Lempel-Ziv algorithm
293: .Pq LZW .
1.1 deraadt 294: Common substrings in the file are first replaced by 9-bit codes 257 and up.
295: When code 512 is reached, the algorithm switches to 10-bit codes and
296: continues to use more bits until the
297: limit specified by the
298: .Fl b
1.9 aaron 299: flag is reached.
1.6 aaron 300: .Ar bits
1.23 jmc 301: must be between 9 and 16
302: .Pq the default is 16 .
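The growth of the code width described above can be illustrated with a
simplified LZW encoder in Python.
This is a sketch under stated assumptions, not the compress implementation:
the function name lzw_codes is hypothetical, code 256 is simply left unused,
no dictionary reset is performed, and the exact point at which the real program
widens its codes may differ by a code or two.

```python
from typing import List, Tuple


def lzw_codes(data: bytes, max_bits: int = 16) -> List[Tuple[int, int]]:
    """Return (code, width_in_bits) pairs for data, starting with 9-bit codes."""
    if not 9 <= max_bits <= 16:
        raise ValueError("bits must be between 9 and 16")
    # Codes 0..255 stand for single bytes; new substrings get codes 257 and up.
    table = {bytes([i]): i for i in range(256)}
    next_code = 257
    width = 9
    out: List[Tuple[int, int]] = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the known substring
            continue
        out.append((table[current], width))
        if next_code < (1 << max_bits):
            table[candidate] = next_code
            next_code += 1
            # Once code 2**width has been assigned, one more bit is needed.
            if next_code > (1 << width) and width < max_bits:
                width += 1
        current = bytes([byte])
    if current:
        out.append((table[current], width))
    return out


if __name__ == "__main__":
    print(lzw_codes(b"abababab"))
```

With max_bits set to 9 the table stops growing at code 511 and every code stays
9 bits wide, which mirrors the limit set by the bits argument.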
1.1 deraadt 303: .Pp
304: After the
305: .Ar bits
306: limit is reached,
1.8 aaron 307: .Nm
1.1 deraadt 308: periodically checks the compression ratio.
309: If it is increasing,
1.8 aaron 310: .Nm
1.1 deraadt 311: continues to use the existing code dictionary.
312: However, if the compression ratio decreases,
1.8 aaron 313: .Nm
1.11 aaron 314: discards the table of substrings and rebuilds it from scratch.
315: This allows the algorithm to adapt to the next
1.8 aaron 316: .Dq block
317: of the file.
1.1 deraadt 318: .Pp
1.18 deraadt 319: .Nm gzip
1.19 jmc 320: uses a slightly different version of the Lempel-Ziv algorithm
321: .Pq LZ77 .
322: Common substrings are replaced by pointers to previous strings,
323: and are found using a hash table.
324: Substrings with no earlier match are emitted as literal bytes,
325: and the literals and pointers are then encoded with Huffman codes.
1.18 deraadt 326: .Pp
1.1 deraadt 327: The
328: .Fl b
329: flag is omitted for
1.3 deraadt 330: .Nm uncompress
1.18 deraadt 331: or
332: .Nm gunzip
1.1 deraadt 333: since the
334: .Ar bits
335: parameter specified during compression
336: is encoded within the output, along with
337: a magic number to ensure that neither decompression of random data nor
338: recompression of compressed data is attempted.
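A decompressor can use the leading bytes to decide what it is looking at.
The Python sketch below is illustrative only: the function name sniff is
hypothetical, and the byte values are not given in this page but come from the
commonly documented formats (0x1f 0x9d for compress output, with the bits value
in the low five bits of the third byte, and 0x1f 0x8b for gzip per RFC 1952).

```python
import gzip
from typing import Optional, Tuple

COMPRESS_MAGIC = b"\x1f\x9d"  # assumed from the commonly documented .Z format
GZIP_MAGIC = b"\x1f\x8b"      # from RFC 1952


def sniff(header: bytes) -> Optional[Tuple[str, Optional[int]]]:
    """Classify a file header as ("compress", bits), ("gzip", None), or None."""
    if header[:2] == COMPRESS_MAGIC and len(header) >= 3:
        # Low five bits of the third byte carry the maximum code width.
        return ("compress", header[2] & 0x1F)
    if header[:2] == GZIP_MAGIC:
        return ("gzip", None)
    return None


if __name__ == "__main__":
    print(sniff(b"\x1f\x9d\x90"))             # a 16-bit compress header
    print(sniff(gzip.compress(b"hello")[:3]))  # header produced by Python's gzip module
    print(sniff(b"plain text"))
```

Input that matches neither magic number is reported as None, which corresponds
to the case where decompression of arbitrary data is refused.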
339: .Pp
340: The amount of compression obtained depends on the size of the
341: input, the number of
342: .Ar bits
343: per code, and the distribution of common substrings.
1.23 jmc 344: Typically, text such as source code or English is reduced by 50 \- 60% using
1.19 jmc 345: .Nm
1.23 jmc 346: and by 60 \- 70% using
1.19 jmc 347: .Nm gzip .
1.1 deraadt 348: Compression is generally much better than that achieved by Huffman
349: coding (as used in the historical command pack), or adaptive Huffman
350: coding (as used in the historical command compact), and takes less
351: time to compute.
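The percentage reduction can be measured directly.
The following Python sketch uses the standard zlib module, which implements
deflate but wraps it in the zlib container (RFC 1950) rather than the gzip file
format, so the sizes only approximate what gzip would write.
The helper name percent_reduction and the sample text are assumptions made for
the illustration.

```python
import zlib


def percent_reduction(original: bytes, compressed: bytes) -> float:
    """Difference between the sizes as a percentage of the original size."""
    if not original:
        raise ValueError("original data must not be empty")
    return 100.0 * (len(original) - len(compressed)) / len(original)


# Hypothetical sample input; real text is far less repetitive than this.
sample = b"The quick brown fox jumps over the lazy dog.\n" * 200

if __name__ == "__main__":
    for level in (1, 6, 9):
        packed = zlib.compress(sample, level)
        assert zlib.decompress(packed) == sample  # round trip must be lossless
        print(f"level {level}: {len(packed)} bytes, "
              f"{percent_reduction(sample, packed):.1f}% smaller")
```

Because the sample repeats one sentence, the reduction it shows is much higher
than the typical figures quoted above for source code or English text.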
352: .Pp
353: The
1.8 aaron 354: .Nm
1.18 deraadt 355: and
356: .Nm gzip
357: utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
1.5 denny 358: more files were not compressed because they would have grown in
1.6 aaron 359: size (and
360: .Fl f
1.9 aaron 361: was not specified).
1.22 millert 362: .Sh RETURN VALUES
363: The
364: .Nm
365: utility exits with one of the following values:
366: .Pp
367: .Bl -tag -width flag -compact
368: .It Li 0
369: The file was compressed successfully.
370: .It Li 1
371: An error occurred.
372: .It Li 2
373: A warning occurred.
1.23 jmc 374: .El
1.1 deraadt 375: .Sh SEE ALSO
1.29 jmc 376: .Xr compress 3
377: .Pp
1.1 deraadt 378: .Rs
379: .%A Welch, Terry A.
380: .%D June, 1984
381: .%T "A Technique for High Performance Data Compression"
382: .%J "IEEE Computer"
383: .%V 17:6
384: .%P pp. 8-19
385: .Re
1.19 jmc 386: .Pp
387: .Bl -tag -width 12n -compact
1.23 jmc 388: .It RFC 1950
389: ZLIB Compressed Data Format Specification.
390: .It RFC 1951
391: DEFLATE Compressed Data Format Specification.
392: .It RFC 1952
393: GZIP File Format Specification.
1.19 jmc 394: .El
1.5 denny 395: .Sh STANDARDS
396: The
1.8 aaron 397: .Nm
1.5 denny 398: utility is compliant with the
399: .St -p1003.2-92
400: specification.
1.19 jmc 401: .Pp
402: The
403: .Nm gzip
404: and
405: .Nm gunzip
406: utilities are extensions.
1.1 deraadt 407: .Sh HISTORY
408: The
409: .Nm
410: command appeared in
411: .Bx 4.3 .
1.6 aaron 412: The deflate compression support was added in
1.4 mickey 413: .Ox 2.1 .
1.26 millert 414: Full
415: .Nm gzip
416: compatibility was added in
417: .Ox 3.4 .
418: The
419: .Sq g
420: in this version of
421: .Nm gzip
422: stands for
423: .Dq gratis .