Annotation of src/usr.bin/compress/compress.1, Revision 1.30
1.30 ! jmc 1: .\" $OpenBSD: compress.1,v 1.29 2003/10/01 08:43:17 jmc Exp $
1.1 deraadt 2: .\" $NetBSD: compress.1,v 1.5 1995/03/26 09:44:34 glass Exp $
3: .\"
4: .\" Copyright (c) 1986, 1990, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" James A. Woods, derived from original work by Spencer Thomas
9: .\" and Joseph Orost.
10: .\"
11: .\" Redistribution and use in source and binary forms, with or without
12: .\" modification, are permitted provided that the following conditions
13: .\" are met:
14: .\" 1. Redistributions of source code must retain the above copyright
15: .\" notice, this list of conditions and the following disclaimer.
16: .\" 2. Redistributions in binary form must reproduce the above copyright
17: .\" notice, this list of conditions and the following disclaimer in the
18: .\" documentation and/or other materials provided with the distribution.
1.16 millert 19: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 20: .\" may be used to endorse or promote products derived from this software
21: .\" without specific prior written permission.
22: .\"
23: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
24: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
25: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
26: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
27: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
28: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
29: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
30: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
31: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
32: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
33: .\" SUCH DAMAGE.
34: .\"
35: .\" @(#)compress.1 8.2 (Berkeley) 4/18/94
36: .\"
37: .Dd April 18, 1994
38: .Dt COMPRESS 1
1.7 aaron 39: .Os
1.1 deraadt 40: .Sh NAME
41: .Nm compress ,
1.18 deraadt 42: .Nm uncompress ,
43: .Nm gzip ,
44: .Nm gunzip
1.1 deraadt 45: .Nd compress and expand data
46: .Sh SYNOPSIS
47: .Nm compress
1.14 mickey 48: .Op Fl LV
49: .Nm compress
1.30 ! jmc 50: .Op Fl 123456789cdfghlNnOqrtv
1.1 deraadt 51: .Op Fl b Ar bits
1.30 ! jmc 52: .Op Fl o Ar filename
1.18 deraadt 53: .Op Fl S Ar suffix
1.1 deraadt 54: .Op Ar
55: .Nm uncompress
1.30 ! jmc 56: .Op Fl cfhlNnqrtv
1.18 deraadt 57: .Op Fl o Ar filename
58: .Op Ar
59: .Pp
60: .Nm gzip
61: .Op Fl LV
62: .Nm gzip
1.30 ! jmc 63: .Op Fl 123456789cdfghlNnOqrtv
1.18 deraadt 64: .Op Fl b Ar bits
1.30 ! jmc 65: .Op Fl o Ar filename
1.18 deraadt 66: .Op Fl S Ar suffix
67: .Op Ar
68: .Nm gunzip
1.30 ! jmc 69: .Op Fl cfhlNnqrtv
1.4 mickey 70: .Op Fl o Ar filename
1.7 aaron 71: .Op Ar
1.18 deraadt 72: .Pp
1.12 mickey 73: .Nm zcat
1.28 jmc 74: .Op Fl fghqr
75: .Op Ar
76: .Nm gzcat
1.20 millert 77: .Op Fl fhqr
1.12 mickey 78: .Op Ar
1.1 deraadt 79: .Sh DESCRIPTION
1.9 aaron 80: The
1.18 deraadt 81: .Nm compress
82: and
83: .Nm gzip
84: utilities
85: reduce the size of the named files using adaptive Lempel-Ziv coding.
1.19 jmc 86: They are functionally identical, but use different algorithms for compression.
87: If invoked as
1.18 deraadt 88: .Nm gzip
1.19 jmc 89: or
1.28 jmc 90: .Nm compress Fl g ,
1.19 jmc 91: the deflate mode of compression is chosen by default;
92: otherwise the older method of compression
93: .Pq compress mode
94: is used.
95: .Pp
1.1 deraadt 96: Each
97: .Ar file
98: is renamed to the same name plus the extension
1.17 jmc 99: .Dq .Z ,
1.14 mickey 100: or
1.17 jmc 101: .Dq .gz
1.14 mickey 102: (in deflate mode).
1.1 deraadt 103: As many of the modification time, access time, file flags, file mode,
104: user ID, and group ID as allowed by permissions are retained in the
105: new file.
106: If compression would not reduce the size of a
107: .Ar file ,
1.17 jmc 108: the file is ignored (unless
1.14 mickey 109: .Fl f
110: is used).
1.1 deraadt 111: .Pp
1.9 aaron 112: The
1.6 aaron 113: .Nm uncompress
1.18 deraadt 114: and
115: .Nm gunzip
116: utilities restore compressed files to their original form, renaming the
1.24 millert 117: files by removing the extension (or by using the stored name if the
118: .Fl N
119: flag is specified).
120: When decompressing, the following extensions are recognized:
121: .Dq .Z ,
122: .Dq -Z ,
123: .Dq _Z ,
124: .Dq .gz ,
125: .Dq -gz ,
126: .Dq _gz ,
127: .Dq .tgz ,
128: .Dq -tgz ,
129: .Dq _tgz ,
130: .Dq .taz ,
131: .Dq -taz ,
132: and
133: .Dq _taz .
1.25 jmc 134: Extensions ending in
1.24 millert 135: .Dq tgz
136: and
137: .Dq taz
138: are not removed when decompressing; instead they are converted to
139: .Dq tar .
1.12 mickey 140: .Pp
141: The
142: .Nm zcat
1.13 mickey 143: command is equivalent in functionality to
1.12 mickey 144: .Nm uncompress
1.13 mickey 145: .Fl c .
1.28 jmc 146: The
147: .Nm gzcat
148: command is equivalent in functionality to
149: .Nm gunzip
150: .Fl c .
1.1 deraadt 151: .Pp
152: If renaming the files would cause files to be overwritten and the standard
153: input device is a terminal, the user is prompted (on the standard error
154: output) for confirmation.
155: If prompting is not possible or confirmation is not received, the files
156: are not overwritten.
157: .Pp
158: If no files are specified, the standard input is compressed or uncompressed
159: to the standard output.
1.9 aaron 160: If either the input or output files are not regular files, the checks for
1.1 deraadt 161: reduction in size and file overwriting are not performed, the input file is
162: not removed, and the attributes of the input file are not retained.
163: .Pp
164: The options are as follows:
165: .Bl -tag -width Ds
1.30 ! jmc 166: .It Fl 1...9
! 167: Use the deflate scheme with a compression factor of
! 168: .Fl 1
! 169: to
! 170: .Fl 9 .
! 171: Compression factor
! 172: .Fl 1
! 173: is the fastest, but provides a poorer level of compression.
! 174: Compression factor
! 175: .Fl 9
! 176: provides the best level of compression, but is relatively slow.
! 177: The default is
! 178: .Fl 6 .
! 179: This option implies
! 180: .Fl g .
1.6 aaron 181: .It Fl b Ar bits
1.1 deraadt 182: Specify the
183: .Ar bits
1.23 jmc 184: code limit
185: .Pq see below .
1.1 deraadt 186: .It Fl c
187: Compressed or uncompressed output is written to the standard output.
1.17 jmc 188: No files are modified (force
1.14 mickey 189: .Nm zcat
1.28 jmc 190: or
191: .Nm gzcat
1.14 mickey 192: mode).
1.4 mickey 193: .It Fl d
1.14 mickey 194: Decompress the source files instead of compressing them (force
195: .Nm uncompress
196: mode).
1.1 deraadt 197: .It Fl f
198: Force compression of
199: .Ar file ,
200: even if it is not actually reduced in size.
201: Additionally, files are overwritten without prompting for confirmation.
1.27 tedu 202: If the input data is not in a format recognized by
203: .Nm
204: and if the option
205: .Fl c
206: is also given, copy the input data without change
1.29 jmc 207: to the standard output: let
1.27 tedu 208: .Nm zcat
1.28 jmc 209: or
210: .Nm gzcat
1.27 tedu 211: behave as
1.28 jmc 212: .Xr cat 1 .
1.4 mickey 213: .It Fl g
1.14 mickey 214: Use the deflate scheme, which reportedly provides better compression rates (force
1.17 jmc 215: .Nm gzip
1.14 mickey 216: mode).
1.18 deraadt 217: This flag need not be specified when invoked as
218: .Nm gzip .
1.20 millert 219: .It Fl h
220: Print a short help message.
1.21 millert 221: .It Fl l
222: List information for the specified compressed files.
223: The following information is listed:
1.23 jmc 224: .Bl -tag -width "compression ratio"
1.21 millert 225: .It compressed size
1.23 jmc 226: Size of the compressed file.
1.21 millert 227: .It uncompressed size
1.23 jmc 228: Size of the file when uncompressed.
1.21 millert 229: .It compression ratio
1.23 jmc 230: Ratio of the difference between the compressed and uncompressed
1.21 millert 231: sizes to the uncompressed size.
232: .It uncompressed name
1.23 jmc 233: Name the file will be saved as when uncompressing.
1.21 millert 234: .El
235: .Pp
236: If the
237: .Fl v
238: option is specified, the following additional information is printed:
1.23 jmc 239: .Bl -tag -width "compression method"
1.21 millert 240: .It compression method
1.23 jmc 241: Name of the method used to compress the file.
1.21 millert 242: .It crc
1.23 jmc 243: 32-bit CRC
244: .Pq cyclic redundancy code
245: of the uncompressed file.
1.21 millert 246: .It "time stamp"
1.23 jmc 247: Date and time corresponding to the last data modification time
1.21 millert 248: (mtime) of the compressed file (if the
249: .Fl n
250: option is specified, the time stamp stored in the compressed file
251: is printed instead).
252: .El
1.30 ! jmc 253: .It Fl N
! 254: When compressing, save the original file name and time stamp in the
! 255: compressed file.
! 256: This information is saved by default when the deflate scheme is used.
! 257: When uncompressing or listing, use the time stamp and file name stored
! 258: in the compressed file, if any, for the uncompressed version.
1.21 millert 259: .It Fl n
260: When compressing, do not save the original file name and time stamp.
261: This information is saved by default when the deflate scheme is used.
262: When uncompressing, do not restore the original file name and time stamp.
263: By default, the uncompressed file inherits the time stamp of the
1.24 millert 264: compressed version and the uncompressed file name is generated from
265: the name of the compressed file as described above.
1.4 mickey 266: .It Fl O
1.14 mickey 267: Use the old compression method.
1.6 aaron 268: .It Fl o Ar filename
1.4 mickey 269: Set the output file name.
1.30 ! jmc 270: .It Fl q
! 271: Be quiet: suppress all messages.
! 272: .It Fl r
! 273: Recursive mode:
! 274: .Nm
! 275: will descend into specified directories.
1.14 mickey 276: .It Fl S Ar suffix
277: Set suffix for compressed files.
1.4 mickey 278: .It Fl t
1.6 aaron 279: Test the integrity of each file, leaving all files intact.
1.30 ! jmc 280: .It Fl V
! 281: Display the program version
! 282: .Pq RCS IDs of the source files
! 283: and exit.
1.1 deraadt 284: .It Fl v
1.14 mickey 285: Print the percentage reduction of each file and other information.
1.1 deraadt 286: .El
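As a worked illustration of the compression ratio listed by the -l option (the difference between the uncompressed and compressed sizes, divided by the uncompressed size), here is a minimal Python sketch. The function name and the handling of empty input are assumptions made for illustration; they are not taken from the compress source.

```python
def compression_ratio(compressed_size: int, uncompressed_size: int) -> float:
    """Return the fraction of the uncompressed size saved by compression."""
    if uncompressed_size == 0:
        # Assumption: report 0.0 for empty input instead of dividing by zero.
        return 0.0
    return (uncompressed_size - compressed_size) / uncompressed_size


if __name__ == "__main__":
    # A 1000-byte file that compresses to 400 bytes.
    print(f"{compression_ratio(400, 1000):.1%}")  # prints 60.0%
```

A negative result means the file grew, which is the case in which the file is left alone unless -f is given.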
287: .Pp
1.14 mickey 288: In normal mode,
1.8 aaron 289: .Nm
1.19 jmc 290: uses a modified Lempel-Ziv algorithm
291: .Pq LZW .
1.1 deraadt 292: Common substrings in the file are first replaced by 9-bit codes 257 and up.
293: When code 512 is reached, the algorithm switches to 10-bit codes and
294: continues to use more bits until the
295: limit specified by the
296: .Fl b
1.9 aaron 297: flag is reached.
1.6 aaron 298: .Ar bits
1.23 jmc 299: must be between 9 and 16
300: .Pq the default is 16 .
1.1 deraadt 301: .Pp
302: After the
303: .Ar bits
304: limit is reached,
1.8 aaron 305: .Nm
1.1 deraadt 306: periodically checks the compression ratio.
307: If it is increasing,
1.8 aaron 308: .Nm
1.1 deraadt 309: continues to use the existing code dictionary.
310: However, if the compression ratio decreases,
1.8 aaron 311: .Nm
1.11 aaron 312: discards the table of substrings and rebuilds it from scratch.
313: This allows the algorithm to adapt to the next
1.8 aaron 314: .Dq block
315: of the file.
1.1 deraadt 316: .Pp
1.18 deraadt 317: .Nm gzip
1.19 jmc 318: uses a slightly different version of the Lempel-Ziv algorithm
319: .Pq LZ77 .
320: Common substrings are replaced by pointers to previous strings,
321: and are found using a hash table.
322: Unique substrings are emitted as a string of literal bytes,
323: and compressed using Huffman coding.
1.18 deraadt 324: .Pp
1.1 deraadt 325: The
326: .Fl b
327: flag is omitted for
1.3 deraadt 328: .Nm uncompress
1.18 deraadt 329: or
330: .Nm gunzip
1.1 deraadt 331: since the
332: .Ar bits
333: parameter specified during compression
334: is encoded within the output, along with
335: a magic number to ensure that neither decompression of random data nor
336: recompression of compressed data is attempted.
337: .Pp
338: The amount of compression obtained depends on the size of the
339: input, the number of
340: .Ar bits
341: per code, and the distribution of common substrings.
1.23 jmc 342: Typically, text such as source code or English is reduced by 50 \- 60% using
1.19 jmc 343: .Nm
1.23 jmc 344: and by 60 \- 70% using
1.19 jmc 345: .Nm gzip .
1.1 deraadt 346: Compression is generally much better than that achieved by Huffman
347: coding (as used in the historical command pack), or adaptive Huffman
348: coding (as used in the historical command compact), and takes less
349: time to compute.
350: .Pp
351: The
1.8 aaron 352: .Nm
1.18 deraadt 353: and
354: .Nm gzip
355: utilities exit with 0 on success, 1 if an error occurred, or 2 if one or
1.5 denny 356: more files were not compressed because they would have grown in
1.6 aaron 357: size (and
358: .Fl f
1.9 aaron 359: was not specified).
1.22 millert 360: .Sh RETURN VALUES
361: The
362: .Nm
363: utility exits with one of the following values:
364: .Pp
365: .Bl -tag -width flag -compact
366: .It Li 0
367: The file was compressed successfully.
368: .It Li 1
369: An error occurred.
370: .It Li 2
371: A warning occurred.
1.23 jmc 372: .El
1.1 deraadt 373: .Sh SEE ALSO
1.29 jmc 374: .Xr compress 3
375: .Pp
1.1 deraadt 376: .Rs
377: .%A Welch, Terry A.
378: .%D June, 1984
379: .%T "A Technique for High Performance Data Compression"
380: .%J "IEEE Computer"
381: .%V 17:6
382: .%P pp. 8-19
383: .Re
1.19 jmc 384: .Pp
385: .Bl -tag -width 12n -compact
1.23 jmc 386: .It RFC 1950
387: ZLIB Compressed Data Format Specification.
388: .It RFC 1951
389: DEFLATE Compressed Data Format Specification.
390: .It RFC 1952
391: GZIP File Format Specification.
1.19 jmc 392: .El
1.5 denny 393: .Sh STANDARDS
394: The
1.8 aaron 395: .Nm
1.5 denny 396: utility is compliant with the
397: .St -p1003.2-92
398: specification.
1.19 jmc 399: .Pp
400: The
401: .Nm gzip
402: and
403: .Nm gunzip
404: utilities are extensions.
1.1 deraadt 405: .Sh HISTORY
406: The
407: .Nm
408: command appeared in
409: .Bx 4.3 .
1.6 aaron 410: The deflate compression support was added in
1.4 mickey 411: .Ox 2.1 .
1.26 millert 412: Full
413: .Nm gzip
414: compatibility was added in
415: .Ox 3.4 .
416: The
417: .Sq g
418: in this version of
419: .Nm gzip
420: stands for
421: .Dq gratis .