Annotation of src/usr.bin/compress/compress.1, Revision 1.28
1.28 ! jmc 1: .\" $OpenBSD: compress.1,v 1.27 2003/09/05 04:46:35 tedu Exp $
1.1 deraadt 2: .\" $NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
3: .\"
4: .\" Copyright (c) 1986, 1990, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" James A. Woods, derived from original work by Spencer Thomas
9: .\" and Joseph Orost.
10: .\"
11: .\" Redistribution and use in source and binary forms, with or without
12: .\" modification, are permitted provided that the following conditions
13: .\" are met:
14: .\" 1. Redistributions of source code must retain the above copyright
15: .\" notice, this list of conditions and the following disclaimer.
16: .\" 2. Redistributions in binary form must reproduce the above copyright
17: .\" notice, this list of conditions and the following disclaimer in the
18: .\" documentation and/or other materials provided with the distribution.
1.16 millert 19: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 20: .\" may be used to endorse or promote products derived from this software
21: .\" without specific prior written permission.
22: .\"
23: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
24: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
25: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
26: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
27: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
28: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
29: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
30: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
31: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
32: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
33: .\" SUCH DAMAGE.
34: .\"
35: .\" @(#)compress.1 8.2 (Berkeley) 4/18/94
36: .\"
37: .Dd April 18, 1994
38: .Dt COMPRESS 1
1.7 aaron 39: .Os
1.1 deraadt 40: .Sh NAME
41: .Nm compress ,
1.18 deraadt 42: .Nm uncompress ,
43: .Nm gzip ,
44: .Nm gunzip
1.1 deraadt 45: .Nd compress and expand data
46: .Sh SYNOPSIS
47: .Nm compress
1.14 mickey 48: .Op Fl LV
49: .Nm compress
1.21 millert 50: .Op Fl cdfghlOnNqrtv123456789
1.1 deraadt 51: .Op Fl b Ar bits
1.18 deraadt 52: .Op Fl S Ar suffix
1.4 mickey 53: .Op Fl o Ar filename
1.1 deraadt 54: .Op Ar
55: .Nm uncompress
1.21 millert 56: .Op Fl cfhlnNqrtv
1.18 deraadt 57: .Op Fl o Ar filename
58: .Op Ar
59: .Pp
60: .Nm gzip
61: .Op Fl LV
62: .Nm gzip
1.21 millert 63: .Op Fl cdfghlnNOqrtv123456789
1.18 deraadt 64: .Op Fl b Ar bits
65: .Op Fl S Ar suffix
66: .Op Fl o Ar filename
67: .Op Ar
68: .Nm gunzip
1.21 millert 69: .Op Fl cfhlnNqrtv
1.4 mickey 70: .Op Fl o Ar filename
1.7 aaron 71: .Op Ar
1.18 deraadt 72: .Pp
1.12 mickey 73: .Nm zcat
1.28 ! jmc 74: .Op Fl fghqr
! 75: .Op Ar
! 76: .Nm gzcat
1.20 millert 77: .Op Fl fhqr
1.12 mickey 78: .Op Ar
1.1 deraadt 79: .Sh DESCRIPTION
1.9 aaron 80: The
1.18 deraadt 81: .Nm compress
82: and
83: .Nm gzip
84: utilities
85: reduce the size of the named files using adaptive Lempel-Ziv coding.
1.19 jmc 86: They are functionally identical, but use different algorithms for compression.
87: If invoked as
1.18 deraadt 88: .Nm gzip
1.19 jmc 89: or
1.28 ! jmc 90: .Nm compress Fl g ,
1.19 jmc 91: the deflate mode of compression is chosen by default;
92: otherwise the older method of compression
93: .Pq compress mode
94: is used.
95: .Pp
1.1 deraadt 96: Each
97: .Ar file
98: is renamed to the same name plus the extension
1.17 jmc 99: .Dq .Z ,
1.14 mickey 100: or
1.17 jmc 101: .Dq .gz
1.14 mickey 102: (in deflate mode).
1.1 deraadt 103: As many of the modification time, access time, file flags, file mode,
104: user ID, and group ID as allowed by permissions are retained in the
105: new file.
106: If compression would not reduce the size of a
107: .Ar file ,
1.17 jmc 108: the file is ignored (unless
1.14 mickey 109: .Fl f
110: is used).
1.1 deraadt 111: .Pp
1.9 aaron 112: The
1.6 aaron 113: .Nm uncompress
1.18 deraadt 114: and
115: .Nm gunzip
116: utilities restore compressed files to their original form, renaming the
1.24 millert 117: files by removing the extension (or by using the stored name if the
118: .Fl N
119: flag is specified).
120: When decompressing, the following extensions are recognized:
121: .Dq .Z ,
122: .Dq -Z ,
123: .Dq _Z ,
124: .Dq .gz ,
125: .Dq -gz ,
126: .Dq _gz ,
127: .Dq .tgz ,
128: .Dq -tgz ,
129: .Dq _tgz ,
130: .Dq .taz ,
131: .Dq -taz ,
132: and
133: .Dq _taz .
1.25 jmc 134: Extensions ending in
1.24 millert 135: .Dq tgz
136: and
137: .Dq taz
138: are not removed when decompressing; instead they are converted to
139: .Dq tar .
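The renaming rule above can be sketched in a few lines of Python. This is an illustration of the rule as the page words it, not the code used by the utilities; the function name decompressed_name is hypothetical, and keeping the separator character when a tgz or taz suffix becomes tar is an assumption.

```python
# Suffixes listed in the manual page; the separator may be ".", "-", or "_".
SEPARATORS = (".", "-", "_")
PLAIN = ("Z", "gz")            # removed entirely on decompression
TAR_ALIASES = ("tgz", "taz")   # rewritten to "tar" (separator kept: assumption)


def decompressed_name(name):
    """Return the output file name, or None if no known suffix matches."""
    for sep in SEPARATORS:
        for ext in TAR_ALIASES:
            if name.endswith(sep + ext):
                # Keep everything up to and including the separator.
                return name[: -len(ext)] + "tar"
        for ext in PLAIN:
            if name.endswith(sep + ext):
                # Drop the separator and the extension.
                return name[: -len(sep + ext)]
    return None


if __name__ == "__main__":
    for example in ("notes.Z", "dump_gz", "src.tgz", "plain.txt"):
        print(example, "->", decompressed_name(example))
```

Matching is case-sensitive here because the page lists the suffixes in a fixed case.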
1.12 mickey 140: .Pp
141: The
142: .Nm zcat
1.13 mickey 143: command is equivalent in functionality to
1.12 mickey 144: .Nm uncompress
1.13 mickey 145: .Fl c .
1.28 ! jmc 146: The
! 147: .Nm gzcat
! 148: command is equivalent in functionality to
! 149: .Nm gunzip
! 150: .Fl c .
1.1 deraadt 151: .Pp
152: If renaming the files would cause files to be overwritten and the standard
153: input device is a terminal, the user is prompted (on the standard error
154: output) for confirmation.
155: If prompting is not possible or confirmation is not received, the files
156: are not overwritten.
157: .Pp
158: If no files are specified, the standard input is compressed or uncompressed
159: to the standard output.
1.9 aaron 160: If either the input or output files are not regular files, the checks for
1.1 deraadt 161: reduction in size and file overwriting are not performed, the input file is
162: not removed, and the attributes of the input file are not retained.
163: .Pp
164: The options are as follows:
165: .Bl -tag -width Ds
1.14 mickey 166: .It Fl V
1.23 jmc 167: Display the program version
168: .Pq RCS IDs of the source files
169: and exit.
1.6 aaron 170: .It Fl b Ar bits
1.1 deraadt 171: Specify the
172: .Ar bits
1.23 jmc 173: code limit
174: .Pq see below .
1.1 deraadt 175: .It Fl c
176: Compressed or uncompressed output is written to the standard output.
1.17 jmc 177: No files are modified (force
1.14 mickey 178: .Nm zcat
1.28 ! jmc 179: or
! 180: .Nm gzcat
1.14 mickey 181: mode).
1.4 mickey 182: .It Fl d
1.14 mickey 183: Decompress the source files instead of compressing them (force
184: .Nm uncompress
185: mode).
1.1 deraadt 186: .It Fl f
187: Force compression of
188: .Ar file ,
189: even if it is not actually reduced in size.
190: Additionally, files are overwritten without prompting for confirmation.
1.27 tedu 191: If the input data is not in a format recognized by
192: .Nm
193: and if the option
194: .Fl c
195: is also given, copy the input data without change
196: to the standard output: let
197: .Nm zcat
1.28 ! jmc 198: or
! 199: .Nm gzcat
1.27 tedu 200: behave as
1.28 ! jmc 201: .Xr cat 1 .
1.4 mickey 202: .It Fl g
1.14 mickey 203: Use the deflate scheme, which reportedly provides better compression rates (force
1.17 jmc 204: .Nm gzip
1.14 mickey 205: mode).
1.18 deraadt 206: This flag need not be specified when invoked as
207: .Nm gzip .
1.20 millert 208: .It Fl h
209: Print a short help message.
1.21 millert 210: .It Fl l
211: List information for the specified compressed files.
212: The following information is listed:
1.23 jmc 213: .Bl -tag -width "compression ratio"
1.21 millert 214: .It compressed size
1.23 jmc 215: Size of the compressed file.
1.21 millert 216: .It uncompressed size
1.23 jmc 217: Size of the file when uncompressed.
1.21 millert 218: .It compression ratio
1.23 jmc 219: Ratio of the difference between the compressed and uncompressed
1.21 millert 220: sizes to the uncompressed size.
221: .It uncompressed name
1.23 jmc 222: Name the file will be saved as when uncompressing.
1.21 millert 223: .El
224: .Pp
225: If the
226: .Fl v
227: option is specified, the following additional information is printed:
1.23 jmc 228: .Bl -tag -width "compression method"
1.21 millert 229: .It compression method
1.23 jmc 230: Name of the method used to compress the file.
1.21 millert 231: .It crc
1.23 jmc 232: 32-bit CRC
233: .Pq cyclic redundancy code
234: of the uncompressed file.
1.21 millert 235: .It "time stamp"
1.23 jmc 236: Date and time corresponding to the last data modification time
1.21 millert 237: (mtime) of the compressed file (if the
238: .Fl n
239: option is specified, the time stamp stored in the compressed file
240: is printed instead).
241: .El
242: .It Fl n
243: When compressing, do not save the original file name and time stamp.
244: This information is saved by default when the deflate scheme is used.
245: When uncompressing, do not restore the original file name and time stamp.
246: By default, the uncompressed file inherits the time stamp of the
1.24 millert 247: compressed version and the uncompressed file name is generated from
248: the name of the compressed file as described above.
1.21 millert 249: .It Fl N
250: When compressing, save the original file name and time stamp in the
251: compressed file.
252: This information is saved by default when the deflate scheme is used.
253: When uncompressing or listing, use the time stamp and file name stored
254: in the compressed file, if any, for the uncompressed version.
1.14 mickey 255: .It Fl 1...9
1.19 jmc 256: Use the deflate scheme with a compression factor of
257: .Fl 1
258: to
259: .Fl 9 .
260: Compression factor
261: .Fl 1
262: is the fastest, but provides a poorer level of compression.
263: Compression factor
264: .Fl 9
265: provides the best level of compression, but is relatively slow.
266: The default is
267: .Fl 6 .
268: This option implies
269: .Fl g .
1.4 mickey 270: .It Fl O
1.14 mickey 271: Use the old compression method (compress mode).
1.6 aaron 272: .It Fl o Ar filename
1.4 mickey 273: Set the output file name.
1.14 mickey 274: .It Fl S Ar suffix
275: Set suffix for compressed files.
1.4 mickey 276: .It Fl t
1.6 aaron 277: Test the integrity of each file, leaving all files intact.
1.15 millert 278: .It Fl r
279: Recursive mode:
280: .Nm
281: will descend into specified directories.
1.4 mickey 282: .It Fl q
1.14 mickey 283: Be quiet: suppress all messages.
1.1 deraadt 284: .It Fl v
1.14 mickey 285: Print the percentage reduction of each file and other information.
1.1 deraadt 286: .El
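As a worked illustration of the compression ratio listed under the l option above (the difference between the uncompressed and compressed sizes, divided by the uncompressed size), a minimal Python sketch follows. The function name is hypothetical, and returning 0 for an empty file is an assumption rather than documented behaviour.

```python
def compression_ratio(compressed_size, uncompressed_size):
    """Fraction of the original size saved by compression."""
    if uncompressed_size == 0:
        # Assumption: avoid dividing by zero and report no saving.
        return 0.0
    return (uncompressed_size - compressed_size) / uncompressed_size


if __name__ == "__main__":
    # A 1000-byte file that compresses to 400 bytes saves 60% of its size.
    print(f"{compression_ratio(400, 1000):.1%}")
```

A negative result means the file grew, which is the case in which the file is left alone unless the f option is given.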
287: .Pp
1.14 mickey 288: In normal mode,
1.8 aaron 289: .Nm
1.19 jmc 290: uses a modified Lempel-Ziv algorithm
291: .Pq LZW .
1.1 deraadt 292: Common substrings in the file are first replaced by 9-bit codes 257 and up.
293: When code 512 is reached, the algorithm switches to 10-bit codes and
294: continues to use more bits until the
295: limit specified by the
296: .Fl b
1.9 aaron 297: flag is reached.
1.6 aaron 298: .Ar bits
1.23 jmc 299: must be between 9 and 16
300: .Pq the default is 16 .
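The growth of the code width described above can be sketched as follows. This is a simplification that follows the wording of this page (9-bit codes up to 511, 10-bit codes from 512, and so on up to the limit); the function name code_width is hypothetical, and the real implementation may switch widths at a slightly different point.

```python
def code_width(code, max_bits=16):
    """Number of bits used to emit `code` under the rule described above."""
    if not 9 <= max_bits <= 16:
        raise ValueError("bits must be between 9 and 16")
    # Codes below 512 fit in 9 bits; each further doubling needs one more bit.
    width = max(9, code.bit_length())
    # Once the limit is reached the width stops growing.
    return min(width, max_bits)


if __name__ == "__main__":
    for code in (257, 511, 512, 1023, 1024, 65535):
        print(code, code_width(code))
```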
1.1 deraadt 301: .Pp
302: After the
303: .Ar bits
304: limit is reached,
1.8 aaron 305: .Nm
1.1 deraadt 306: periodically checks the compression ratio.
307: If it is increasing,
1.8 aaron 308: .Nm
1.1 deraadt 309: continues to use the existing code dictionary.
310: However, if the compression ratio decreases,
1.8 aaron 311: .Nm
1.11 aaron 312: discards the table of substrings and rebuilds it from scratch.
313: This allows the algorithm to adapt to the next
1.8 aaron 314: .Dq block
315: of the file.
1.1 deraadt 316: .Pp
1.18 deraadt 317: .Nm gzip
1.19 jmc 318: uses a slightly different version of the Lempel-Ziv algorithm
319: .Pq LZ77 .
320: Common substrings are replaced by pointers to previous strings,
321: and are found using a hash table.
322: Unique substrings are emitted as a string of literal bytes,
323: and compressed using Huffman coding.
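To see the deflate scheme and its compression factors in action, the sketch below uses Python's standard zlib module. Note the assumptions: zlib is a separate implementation of deflate, not the code in this utility, and zlib.compress produces the ZLIB container (RFC 1950) rather than a gzip file (RFC 1952). The sample data is made up.

```python
import zlib

# Highly repetitive sample input, so back-references pay off.
data = b"the quick brown fox jumps over the lazy dog\n" * 200

fast = zlib.compress(data, 1)   # comparable in spirit to factor 1
best = zlib.compress(data, 9)   # comparable in spirit to factor 9

# Both streams decompress to exactly the original bytes.
assert zlib.decompress(fast) == data
assert zlib.decompress(best) == data

print("input:", len(data), "level 1:", len(fast), "level 9:", len(best))
```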
1.18 deraadt 324: .Pp
1.1 deraadt 325: The
326: .Fl b
327: flag is omitted for
1.3 deraadt 328: .Nm uncompress
1.18 deraadt 329: or
330: .Nm gunzip
1.1 deraadt 331: since the
332: .Ar bits
333: parameter specified during compression
334: is encoded within the output, along with
335: a magic number to ensure that neither decompression of random data nor
336: recompression of compressed data is attempted.
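A decompressor can recover this information from the first bytes of its input, as the Python sketch below illustrates. The byte values are not taken from this page: they are the commonly documented compress header (magic bytes 0x1f 0x9d, with the maximum code width in the low five bits of the third byte) and the gzip magic bytes from RFC 1952, and should be treated as assumptions here. The function name identify is hypothetical.

```python
COMPRESS_MAGIC = b"\x1f\x9d"   # assumption: classic compress (.Z) magic
GZIP_MAGIC = b"\x1f\x8b"       # gzip magic as given in RFC 1952


def identify(header):
    """Classify the leading bytes of a file as (format, bits or None)."""
    if header[:2] == COMPRESS_MAGIC and len(header) >= 3:
        # Low five bits of the third byte: maximum code width used.
        return ("compress", header[2] & 0x1F)
    if header[:2] == GZIP_MAGIC:
        return ("gzip", None)
    return ("unknown", None)


if __name__ == "__main__":
    print(identify(b"\x1f\x9d\x90"))
    print(identify(b"\x1f\x8b\x08"))
    print(identify(b"plain text"))
```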
337: .Pp
338: The amount of compression obtained depends on the size of the
339: input, the number of
340: .Ar bits
341: per code, and the distribution of common substrings.
1.23 jmc 342: Typically, text such as source code or English is reduced by 50 \- 60% using
1.19 jmc 343: .Nm
1.23 jmc 344: and by 60 \- 70% using
1.19 jmc 345: .Nm gzip .
1.1 deraadt 346: Compression is generally much better than that achieved by Huffman
347: coding (as used in the historical command pack), or adaptive Huffman
348: coding (as used in the historical command compact), and takes less
349: time to compute.
350: .Pp
351: The
1.8 aaron 352: .Nm
1.18 deraadt 353: and
354: .Nm gzip
355: utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
1.5 denny 356: more files were not compressed because they would have grown in
1.6 aaron 357: size (and
358: .Fl f
1.9 aaron 359: was not specified).
1.22 millert 360: .Sh RETURN VALUES
361: The
362: .Nm
363: utility exits with one of the following values:
364: .Pp
365: .Bl -tag -width flag -compact
366: .It Li 0
367: The file was compressed successfully.
368: .It Li 1
369: An error occurred.
370: .It Li 2
371: A warning occurred.
1.23 jmc 372: .El
1.1 deraadt 373: .Sh SEE ALSO
374: .Rs
375: .%A Welch, Terry A.
376: .%D June, 1984
377: .%T "A Technique for High Performance Data Compression"
378: .%J "IEEE Computer"
379: .%V 17:6
380: .%P pp. 8-19
381: .Re
1.19 jmc 382: .Pp
383: .Bl -tag -width 12n -compact
1.23 jmc 384: .It RFC 1950
385: ZLIB Compressed Data Format Specification.
386: .It RFC 1951
387: DEFLATE Compressed Data Format Specification.
388: .It RFC 1952
389: GZIP File Format Specification.
1.19 jmc 390: .El
1.5 denny 391: .Sh STANDARDS
392: The
1.8 aaron 393: .Nm
1.5 denny 394: utility is compliant with the
395: .St -p1003.2-92
396: specification.
1.19 jmc 397: .Pp
398: The
399: .Nm gzip
400: and
401: .Nm gunzip
402: utilities are extensions.
1.1 deraadt 403: .Sh HISTORY
404: The
405: .Nm
406: command appeared in
407: .Bx 4.3 .
1.6 aaron 408: The deflate compression support was added in
1.4 mickey 409: .Ox 2.1 .
1.26 millert 410: Full
411: .Nm gzip
412: compatibility was added in
413: .Ox 3.4 .
414: The
415: .Sq g
416: in this version of
417: .Nm gzip
418: stands for
419: .Dq gratis .