Annotation of src/usr.bin/compress/compress.1, Revision 1.23
1.23 ! jmc 1: .\" $OpenBSD: compress.1,v 1.22 2003/07/20 13:25:52 millert Exp $
1.1 deraadt 2: .\" $NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
3: .\"
4: .\" Copyright (c) 1986, 1990, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" James A. Woods, derived from original work by Spencer Thomas
9: .\" and Joseph Orost.
10: .\"
11: .\" Redistribution and use in source and binary forms, with or without
12: .\" modification, are permitted provided that the following conditions
13: .\" are met:
14: .\" 1. Redistributions of source code must retain the above copyright
15: .\" notice, this list of conditions and the following disclaimer.
16: .\" 2. Redistributions in binary form must reproduce the above copyright
17: .\" notice, this list of conditions and the following disclaimer in the
18: .\" documentation and/or other materials provided with the distribution.
1.16 millert 19: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 20: .\" may be used to endorse or promote products derived from this software
21: .\" without specific prior written permission.
22: .\"
23: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
24: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
25: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
26: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
27: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
28: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
29: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
30: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
31: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
32: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
33: .\" SUCH DAMAGE.
34: .\"
35: .\" @(#)compress.1 8.2 (Berkeley) 4/18/94
36: .\"
37: .Dd April 18, 1994
38: .Dt COMPRESS 1
1.7 aaron 39: .Os
1.1 deraadt 40: .Sh NAME
41: .Nm compress ,
1.18 deraadt 42: .Nm uncompress ,
43: .Nm gzip ,
44: .Nm gunzip
1.1 deraadt 45: .Nd compress and expand data
46: .Sh SYNOPSIS
47: .Nm compress
1.14 mickey 48: .Op Fl LV
49: .Nm compress
1.21 millert 50: .Op Fl cdfghlOnNqrtv123456789
1.1 deraadt 51: .Op Fl b Ar bits
1.18 deraadt 52: .Op Fl S Ar suffix
1.4 mickey 53: .Op Fl o Ar filename
1.1 deraadt 54: .Op Ar
55: .Nm uncompress
1.21 millert 56: .Op Fl cfhlnNqrtv
1.18 deraadt 57: .Op Fl o Ar filename
58: .Op Ar
59: .Pp
60: .Nm gzip
61: .Op Fl LV
62: .Nm gzip
1.21 millert 63: .Op Fl cdfghlnNOqrtv123456789
1.18 deraadt 64: .Op Fl b Ar bits
65: .Op Fl S Ar suffix
66: .Op Fl o Ar filename
67: .Op Ar
68: .Nm gunzip
1.21 millert 69: .Op Fl cfhnNqrltv
1.4 mickey 70: .Op Fl o Ar filename
1.7 aaron 71: .Op Ar
1.18 deraadt 72: .Pp
1.12 mickey 73: .Nm zcat
1.20 millert 74: .Op Fl fhqr
1.12 mickey 75: .Op Ar
1.1 deraadt 76: .Sh DESCRIPTION
1.9 aaron 77: The
1.18 deraadt 78: .Nm compress
79: and
80: .Nm gzip
81: utilities
82: reduce the size of the named files using adaptive Lempel-Ziv coding.
1.19 jmc 83: They are functionally identical, but use different algorithms for compression.
84: If invoked as
1.18 deraadt 85: .Nm gzip
1.19 jmc 86: or
87: .Nm compress Fl g
88: the deflate mode of compression is chosen by default;
89: otherwise the older method of compression
90: .Pq compress mode
91: is used.
92: .Pp
1.1 deraadt 93: Each
94: .Ar file
95: is renamed to the same name plus the extension
1.17 jmc 96: .Dq .Z ,
1.14 mickey 97: or
1.17 jmc 98: .Dq .gz
1.14 mickey 99: (in deflate mode).
1.1 deraadt 100: As many of the modification time, access time, file flags, file mode,
101: user ID, and group ID as allowed by permissions are retained in the
102: new file.
103: If compression would not reduce the size of a
104: .Ar file ,
1.17 jmc 105: the file is ignored (unless
1.14 mickey 106: .Fl f
107: is used).
1.1 deraadt 108: .Pp
1.9 aaron 109: The
1.6 aaron 110: .Nm uncompress
1.18 deraadt 111: and
112: .Nm gunzip
113: utilities restore compressed files to their original form, renaming the
1.9 aaron 114: files by removing the
1.1 deraadt 115: .Dq .Z
1.14 mickey 116: or
117: .Dq .gz
1.1 deraadt 118: extension.
1.12 mickey 119: .Pp
120: The
121: .Nm zcat
1.13 mickey 122: command is equivalent in functionality to
1.12 mickey 123: .Nm uncompress
1.13 mickey 124: .Fl c .
1.1 deraadt 125: .Pp
126: If renaming the files would cause files to be overwritten and the standard
127: input device is a terminal, the user is prompted (on the standard error
128: output) for confirmation.
129: If prompting is not possible or confirmation is not received, the files
130: are not overwritten.
131: .Pp
132: If no files are specified, the standard input is compressed or uncompressed
133: to the standard output.
1.9 aaron 134: If either the input or output files are not regular files, the checks for
1.1 deraadt 135: reduction in size and file overwriting are not performed, the input file is
136: not removed, and the attributes of the input file are not retained.
137: .Pp
138: The options are as follows:
139: .Bl -tag -width Ds
1.14 mickey 140: .It Fl V
1.23 ! jmc 141: Display the program version
! 142: .Pq RCS IDs of the source files
! 143: and exit.
1.6 aaron 144: .It Fl b Ar bits
1.1 deraadt 145: Specify the
146: .Ar bits
1.23 ! jmc 147: code limit
! 148: .Pq see below .
1.1 deraadt 149: .It Fl c
150: Compressed or uncompressed output is written to the standard output.
1.17 jmc 151: No files are modified (force
1.14 mickey 152: .Nm zcat
153: mode).
1.4 mickey 154: .It Fl d
1.14 mickey 155: Decompress the source files instead of compressing them (force
156: .Nm uncompress
157: mode).
1.1 deraadt 158: .It Fl f
159: Force compression of
160: .Ar file ,
161: even if it is not actually reduced in size.
162: Additionally, files are overwritten without prompting for confirmation.
1.4 mickey 163: .It Fl g
1.14 mickey 164: Use the deflate scheme, which reportedly provides better compression ratios (force
1.17 jmc 165: .Nm gzip
1.14 mickey 166: mode).
1.18 deraadt 167: This flag need not be specified when invoked as
168: .Nm gzip .
1.20 millert 169: .It Fl h
170: Print a short help message.
1.21 millert 171: .It Fl l
172: List information for the specified compressed files.
173: The following information is listed:
1.23 ! jmc 174: .Bl -tag -width "compression ratio"
1.21 millert 175: .It compressed size
1.23 ! jmc 176: Size of the compressed file.
1.21 millert 177: .It uncompressed size
1.23 ! jmc 178: Size of the file when uncompressed.
1.21 millert 179: .It compression ratio
1.23 ! jmc 180: Ratio of the difference between the compressed and uncompressed
1.21 millert 181: sizes to the uncompressed size.
182: .It uncompressed name
1.23 ! jmc 183: Name the file will be saved as when uncompressing.
1.21 millert 184: .El
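.Pp
.\" Worked example added for illustration; the sizes are hypothetical.
For example, a hypothetical 1000-byte file that compresses to 400 bytes
has a compression ratio of (1000 \- 400) / 1000, or 60%.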
185: .Pp
186: If the
187: .Fl v
188: option is specified, the following additional information is printed:
1.23 ! jmc 189: .Bl -tag -width "compression method"
1.21 millert 190: .It compression method
1.23 ! jmc 191: Name of the method used to compress the file.
1.21 millert 192: .It crc
1.23 ! jmc 193: 32-bit CRC
! 194: .Pq cyclic redundancy code
! 195: of the uncompressed file.
1.21 millert 196: .It "time stamp"
1.23 ! jmc 197: Date and time corresponding to the last data modification time
1.21 millert 198: (mtime) of the compressed file (if the
199: .Fl n
200: option is specified, the time stamp stored in the compressed file
201: is printed instead).
202: .El
203: .It Fl n
204: When compressing, do not save the original file name and time stamp.
205: This information is saved by default when the deflate scheme is used.
206: When uncompressing, do not restore the original file name and time stamp.
207: By default, the uncompressed file inherits the time stamp of the
208: compressed version and the uncompressed file name is generated by
209: stripping the
210: .Dq .Z
211: or
212: .Dq .gz
213: extension from the compressed file name.
214: .It Fl N
215: When compressing, save the original file name and time stamp in the
216: compressed file.
217: This information is saved by default when the deflate scheme is used.
218: When uncompressing or listing, use the time stamp and file name stored
219: in the compressed file, if any, for the uncompressed version.
1.14 mickey 220: .It Fl 1...9
1.19 jmc 221: Use deflate scheme with compression factor of
222: .Fl 1
223: to
224: .Fl 9 .
225: Compression factor
226: .Fl 1
227: is the fastest, but provides a poorer level of compression.
228: Compression factor
229: .Fl 9
230: provides the best level of compression, but is relatively slow.
231: The default is
232: .Fl 6 .
233: This option implies
234: .Fl g .
1.4 mickey 235: .It Fl O
1.14 mickey 236: Use the older compression method (compress mode).
1.6 aaron 237: .It Fl o Ar filename
1.4 mickey 238: Set the output file name.
1.14 mickey 239: .It Fl S Ar suffix
240: Set the suffix for compressed files.
1.4 mickey 241: .It Fl t
1.6 aaron 242: Test the integrity of each file, leaving all files intact.
1.15 millert 243: .It Fl r
244: Recursive mode:
245: .Nm
246: will descend into specified directories.
1.4 mickey 247: .It Fl q
1.14 mickey 248: Be quiet: suppress all messages.
1.1 deraadt 249: .It Fl v
1.14 mickey 250: Print the percentage reduction of each file and other information.
1.1 deraadt 251: .El
252: .Pp
1.14 mickey 253: In normal mode,
1.8 aaron 254: .Nm
1.19 jmc 255: uses a modified Lempel-Ziv algorithm
256: .Pq LZW .
1.1 deraadt 257: Common substrings in the file are first replaced by 9-bit codes 257 and up.
258: When code 512 is reached, the algorithm switches to 10-bit codes and
259: continues to use more bits until the
260: limit specified by the
261: .Fl b
1.9 aaron 262: flag is reached.
1.6 aaron 263: .Ar bits
1.23 ! jmc 264: must be between 9 and 16
! 265: .Pq the default is 16 .
1.1 deraadt 266: .Pp
267: After the
268: .Ar bits
269: limit is reached,
1.8 aaron 270: .Nm
1.1 deraadt 271: periodically checks the compression ratio.
272: If it is increasing,
1.8 aaron 273: .Nm
1.1 deraadt 274: continues to use the existing code dictionary.
275: However, if the compression ratio decreases,
1.8 aaron 276: .Nm
1.11 aaron 277: discards the table of substrings and rebuilds it from scratch.
278: This allows the algorithm to adapt to the next
1.8 aaron 279: .Dq block
280: of the file.
1.1 deraadt 281: .Pp
1.18 deraadt 282: .Nm gzip
1.19 jmc 283: uses a different variant of the Lempel-Ziv algorithm
284: .Pq LZ77 .
285: Common substrings are replaced by pointers to previous strings,
286: and are found using a hash table.
287: Unique substrings are emitted as a string of literal bytes,
288: and compressed using Huffman coding.
1.18 deraadt 289: .Pp
1.1 deraadt 290: The
291: .Fl b
292: flag is omitted for
1.3 deraadt 293: .Nm uncompress
1.18 deraadt 294: or
295: .Nm gunzip
1.1 deraadt 296: since the
297: .Ar bits
298: parameter specified during compression
299: is encoded within the output, along with
300: a magic number to ensure that neither decompression of random data nor
301: recompression of compressed data is attempted.
302: .Pp
303: The amount of compression obtained depends on the size of the
304: input, the number of
305: .Ar bits
306: per code, and the distribution of common substrings.
1.23 ! jmc 307: Typically, text such as source code or English is reduced by 50 \- 60% using
1.19 jmc 308: .Nm
1.23 ! jmc 309: and by 60 \- 70% using
1.19 jmc 310: .Nm gzip .
1.1 deraadt 311: Compression is generally much better than that achieved by Huffman
312: coding (as used in the historical command pack), or adaptive Huffman
313: coding (as used in the historical command compact), and takes less
314: time to compute.
315: .Pp
316: The
1.8 aaron 317: .Nm
1.18 deraadt 318: and
319: .Nm gzip
320: utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
1.5 denny 321: more files were not compressed because they would have grown in
1.6 aaron 322: size (and
323: .Fl f
1.9 aaron 324: was not specified).
1.22 millert 325: .Sh RETURN VALUES
326: The
327: .Nm
328: utility exits with one of the following values:
329: .Pp
330: .Bl -tag -width flag -compact
331: .It Li 0
332: The file was compressed successfully.
333: .It Li 1
334: An error occurred.
335: .It Li 2
336: A warning occurred.
1.23 ! jmc 337: .El
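.Sh EXAMPLES
.\" Illustrative sketch only: the file names below are hypothetical, and
.\" the commands use only options documented in this page.
The following invocations are a sketch of typical use;
the file names are placeholders.
.Pp
Compress
.Pa file.txt
in deflate mode at the highest compression factor,
printing the percentage reduction:
.Pp
.Dl $ gzip -9v file.txt
.Pp
Write the uncompressed contents of
.Pa file.txt.gz
to the standard output, leaving the compressed file in place:
.Pp
.Dl $ gunzip -c file.txt.gz
.Pp
List the sizes, compression ratio, and uncompressed name of a compressed file:
.Pp
.Dl $ gzip -l file.txt.gz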
1.1 deraadt 338: .Sh SEE ALSO
339: .Rs
340: .%A Welch, Terry A.
341: .%D June, 1984
342: .%T "A Technique for High Performance Data Compression"
343: .%J "IEEE Computer"
344: .%V 17:6
345: .%P pp. 8-19
346: .Re
1.19 jmc 347: .Pp
348: .Bl -tag -width 12n -compact
1.23 ! jmc 349: .It RFC 1950
! 350: ZLIB Compressed Data Format Specification.
! 351: .It RFC 1951
! 352: DEFLATE Compressed Data Format Specification.
! 353: .It RFC 1952
! 354: GZIP File Format Specification.
1.19 jmc 355: .El
1.5 denny 356: .Sh STANDARDS
357: The
1.8 aaron 358: .Nm
1.5 denny 359: utility is compliant with the
360: .St -p1003.2-92
361: specification.
1.19 jmc 362: .Pp
363: The
364: .Nm gzip
365: and
366: .Nm gunzip
367: utilities are extensions.
1.1 deraadt 368: .Sh HISTORY
369: The
370: .Nm
371: command appeared in
372: .Bx 4.3 .
1.6 aaron 373: The deflate compression support was added in
1.4 mickey 374: .Ox 2.1 .