Annotation of src/usr.bin/compress/gzip.1, Revision 1.5
1.5 ! jmc 1: .\" $OpenBSD: gzip.1,v 1.4 2007/04/04 16:26:33 jmc Exp $
1.1 jmc 2: .\"
3: .\" Copyright (c) 1986, 1990, 1993
4: .\" The Regents of the University of California. All rights reserved.
5: .\"
6: .\" This code is derived from software contributed to Berkeley by
7: .\" James A. Woods, derived from original work by Spencer Thomas
8: .\" and Joseph Orost.
9: .\"
10: .\" Redistribution and use in source and binary forms, with or without
11: .\" modification, are permitted provided that the following conditions
12: .\" are met:
13: .\" 1. Redistributions of source code must retain the above copyright
14: .\" notice, this list of conditions and the following disclaimer.
15: .\" 2. Redistributions in binary form must reproduce the above copyright
16: .\" notice, this list of conditions and the following disclaimer in the
17: .\" documentation and/or other materials provided with the distribution.
18: .\" 3. Neither the name of the University nor the names of its contributors
19: .\" may be used to endorse or promote products derived from this software
20: .\" without specific prior written permission.
21: .\"
22: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
23: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
25: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
26: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
28: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
29: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
30: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
31: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
32: .\" SUCH DAMAGE.
33: .\"
34: .\" @(#)compress.1 8.2 (Berkeley) 4/18/94
35: .\"
1.5 ! jmc 36: .Dd $Mdocdate$
1.1 jmc 37: .Dt GZIP 1
38: .Os
39: .Sh NAME
40: .Nm gzip ,
41: .Nm gunzip ,
42: .Nm gzcat
43: .Nd compress and expand data (deflate mode)
44: .Sh SYNOPSIS
45: .Nm gzip
46: .Op Fl 123456789cdfghLlNnOqrtVv
47: .Op Fl b Ar bits
48: .Op Fl o Ar filename
49: .Op Fl S Ar suffix
50: .Op Ar
51: .Nm gunzip
52: .Op Fl cfhlNnqrtv
53: .Op Fl o Ar filename
54: .Op Ar
55: .Nm gzcat
56: .Op Fl fghqr
57: .Op Ar
58: .Sh DESCRIPTION
59: The
60: .Nm
61: utility
62: reduces the size of the named files using adaptive Lempel-Ziv coding,
63: in deflate mode.
64: If invoked as
65: .Nm gzip -O ,
66: the compress mode of compression is chosen;
67: see
68: .Xr compress 1
69: for more information.
70: Each file is renamed to the same name plus the extension
71: .Dq .gz .
72: As many of the modification time, access time, file flags, file mode,
73: user ID, and group ID as allowed by permissions are retained in the
74: new file.
75: If compression would not reduce the size of a file,
76: the file is ignored (unless
77: .Fl f
78: is used).
79: .Pp
80: The
81: .Nm gunzip
82: utility restores compressed files to their original form, renaming the
83: files by removing the extension (or by using the stored name if the
84: .Fl N
85: flag is specified).
86: It has the ability to restore files compressed by both
87: .Nm
88: and
89: .Xr compress 1 ,
90: recognising the following extensions:
91: .Dq .Z ,
92: .Dq -Z ,
93: .Dq _Z ,
94: .Dq .gz ,
95: .Dq -gz ,
96: .Dq _gz ,
97: .Dq .tgz ,
98: .Dq -tgz ,
99: .Dq _tgz ,
100: .Dq .taz ,
101: .Dq -taz ,
102: and
103: .Dq _taz .
104: Extensions ending in
105: .Dq tgz
106: and
107: .Dq taz
108: are not removed when decompressing; instead, they are converted to
109: .Dq tar .
110: .Pp
111: The
112: .Nm gzcat
113: command is equivalent in functionality to
114: .Nm gunzip
115: .Fl c .
116: .Pp
117: If renaming the files would cause files to be overwritten and the standard
118: input device is a terminal, the user is prompted (on the standard error
119: output) for confirmation.
120: If prompting is not possible or confirmation is not received, the files
121: are not overwritten.
122: .Pp
123: If no files are specified, the standard input is compressed or uncompressed
124: to the standard output.
125: If either the input or output files are not regular files, the checks for
126: reduction in size and file overwriting are not performed, the input file is
127: not removed, and the attributes of the input file are not retained.
128: .Pp
1.3 millert 129: By default, when compressing, the original file name and time stamp
1.4 jmc 130: are stored in the compressed file.
1.3 millert 131: When uncompressing, this information is not used.
132: Instead, the uncompressed file inherits the time stamp of the
133: compressed version and the uncompressed file name is generated from
134: the name of the compressed file as described above.
135: These defaults may be overridden by the
136: .Fl N
137: and
138: .Fl n
139: flags, described below.
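
To make the stored name and time stamp concrete, the following Python sketch reads both fields back from a gzip member header laid out as in RFC 1952. The helper names make_member and stored_name_and_mtime are hypothetical, and the archive is produced with Python's gzip module rather than this utility, so treat it as an illustration of the file format only.

```python
import gzip
import io
import struct

FEXTRA, FNAME = 0x04, 0x08  # flag bits defined in RFC 1952


def make_member(data: bytes, name: str, mtime: int) -> bytes:
    """Hypothetical helper: build a gzip member in memory with the given header fields."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf, mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


def stored_name_and_mtime(blob: bytes):
    """Return (name, mtime) from a gzip header; name is None when none is stored."""
    magic, method, flags, mtime = struct.unpack("<HBBI", blob[:8])
    if magic != 0x8B1F or method != 8:
        raise ValueError("not a deflate-compressed gzip member")
    pos = 10  # fixed part of the header: magic, CM, FLG, MTIME, XFL, OS
    if flags & FEXTRA:
        (xlen,) = struct.unpack("<H", blob[pos:pos + 2])
        pos += 2 + xlen  # skip the optional extra field
    name = None
    if flags & FNAME:
        end = blob.index(b"\x00", pos)  # the name is zero-terminated
        name = blob[pos:end].decode("latin-1")
    return name, mtime


print(stored_name_and_mtime(make_member(b"hello\n", "notes.txt", 1000000000)))
```

A member written with an empty name carries no name field at all, which corresponds to the case where the header holds no original file name.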
140: .Pp
1.1 jmc 141: The options are as follows:
142: .Bl -tag -width Ds
143: .It Fl 1...9
144: Use the deflate scheme, with a compression factor of
145: .Fl 1
146: to
147: .Fl 9 .
148: Compression factor
149: .Fl 1
150: is the fastest, but provides a poorer level of compression.
151: Compression factor
152: .Fl 9
153: provides the best level of compression, but is relatively slow.
154: The default is
155: .Fl 6 .
156: This option implies
157: .Fl g .
158: .It Fl b Ar bits
159: Specify the
160: .Ar bits
161: code limit
162: .Pq see below .
163: .It Fl c
164: Compressed or uncompressed output is written to the standard output.
165: No files are modified (force
166: .Nm gzcat
167: mode).
168: .It Fl d
169: Decompress the source files instead of compressing them (force
170: .Nm gunzip
171: mode).
172: .It Fl f
173: Force compression of
174: .Ar file ,
175: even if it is not actually reduced in size.
176: Additionally, files are overwritten without prompting for confirmation.
177: If the input data is not in a format recognized by
178: .Nm
179: and if the option
180: .Fl c
181: is also given, copy the input data without change
182: to the standard output: let
183: .Nm gzcat
184: behave as
185: .Xr cat 1 .
186: .It Fl g
187: Use the deflate scheme, which reportedly provides better compression rates
188: (the default).
189: .It Fl h
190: Print a short help message.
191: .It Fl L
192: Print the license.
193: .It Fl l
194: List information for the specified compressed files.
195: The following information is listed:
196: .Bl -tag -width "compression ratio"
197: .It compressed size
198: Size of the compressed file.
199: .It uncompressed size
200: Size of the file when uncompressed.
201: .It compression ratio
202: Ratio of the difference between the compressed and uncompressed
203: sizes to the uncompressed size.
204: .It uncompressed name
205: Name the file will be saved as when uncompressing.
206: .El
207: .Pp
208: If the
209: .Fl v
210: option is specified, the following additional information is printed:
211: .Bl -tag -width "compression method"
212: .It compression method
213: Name of the method used to compress the file.
214: .It crc
215: 32-bit CRC
216: .Pq cyclic redundancy code
217: of the uncompressed file.
218: .It "time stamp"
219: Date and time corresponding to the last data modification time
220: (mtime) of the compressed file (if the
221: .Fl N
222: option is specified, the time stamp stored in the compressed file
223: is printed instead).
224: .El
225: .It Fl N
226: When uncompressing or listing, use the time stamp and file name stored
227: in the compressed file, if any, for the uncompressed version.
228: .It Fl n
1.3 millert 229: When compressing, do not store the original file name and time stamp
230: in the
231: .Nm
232: header.
1.1 jmc 233: .It Fl O
234: Use old compression method
235: (force
236: .Xr compress 1
237: mode).
238: .It Fl o Ar filename
239: Set the output file name.
240: .It Fl q
241: Be quiet: suppress all messages.
242: .It Fl r
243: Recursive mode:
244: .Nm
245: will descend into specified directories.
246: .It Fl S Ar suffix
247: Set the suffix for compressed files.
248: .It Fl t
249: Test the integrity of each file, leaving all files intact.
250: .It Fl V
251: Display the program version
252: .Pq RCS IDs of the source files
253: and exit.
254: .It Fl v
255: Print the percentage reduction of each file and other information.
256: .El
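
The compression ratio reported in the listing, and the effect of the numeric compression factors, can be sketched in Python as follows. This uses the zlib module, which implements deflate but wraps the data in a zlib header rather than a gzip one, so the byte counts will differ slightly from what this utility reports. The function name compression_ratio and the sample text are assumptions made for the example.

```python
import zlib


def compression_ratio(uncompressed_size: int, compressed_size: int) -> float:
    """(uncompressed - compressed) / uncompressed, as described for the listing."""
    if uncompressed_size == 0:
        return 0.0
    return (uncompressed_size - compressed_size) / uncompressed_size


text = b"the quick brown fox jumps over the lazy dog\n" * 200

for level in (1, 6, 9):
    packed = zlib.compress(text, level)
    assert zlib.decompress(packed) == text  # the round trip is lossless
    ratio = compression_ratio(len(text), len(packed))
    print(f"level {level}: {len(text)} -> {len(packed)} bytes, {ratio:.1%} reduction")
```

Higher levels spend more time searching for matches; how much smaller the output gets depends on the input.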
257: .Pp
258: .Nm
259: uses a modified Lempel-Ziv algorithm
260: .Pq LZ77 .
261: Common substrings are replaced by pointers to previous strings,
262: and are found using a hash table.
263: Unique substrings are emitted as a string of literal bytes,
264: and compressed using Huffman coding.
265: In compress mode, when code 512 is reached, the algorithm switches to 10-bit codes and
266: continues to use more bits until the
267: limit specified by the
268: .Fl b
269: flag is reached.
270: .Ar bits
271: must be between 9 and 16
272: .Pq the default is 16 .
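
The growth of the code width described above can be sketched as a small Python function. It assumes codes start at 9 bits and widen as soon as the next code no longer fits; the exact point at which a real implementation switches may differ by one code, and the name code_width is hypothetical.

```python
def code_width(next_code: int, max_bits: int = 16) -> int:
    """Bits used to emit next_code, capped at the limit given by the bits argument."""
    if not 9 <= max_bits <= 16:
        raise ValueError("bits must be between 9 and 16")
    # Codes below 512 fit in 9 bits; each doubling of the code space adds one bit.
    return min(max_bits, max(9, next_code.bit_length()))


for code in (256, 511, 512, 1023, 1024, 65535):
    print(code, code_width(code))
```

With a smaller limit the width simply stops growing, for example code_width(1024, max_bits=10) stays at 10.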
273: .Pp
274: After the
275: .Ar bits
276: limit is reached,
277: .Nm
278: periodically checks the compression ratio.
279: If it is increasing,
280: .Nm
281: continues to use the existing code dictionary.
282: However, if the compression ratio decreases,
283: .Nm
284: discards the table of substrings and rebuilds it from scratch.
285: This allows the algorithm to adapt to the next
286: .Dq block
287: of the file.
288: .Pp
289: The
290: .Fl b
291: flag is omitted for
292: .Nm gunzip
293: since the
294: .Ar bits
295: parameter specified during compression
296: is encoded within the output, along with
297: a magic number to ensure that neither decompression of random data nor
298: recompression of compressed data is attempted.
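
As an illustration of the magic number check, the Python sketch below tells the two formats apart from their first bytes. It relies on the gzip magic 1f 8b from RFC 1952 and on the traditional compress header, where the magic is 1f 9d and the low five bits of the third byte hold the maximum code width. The function name identify is an assumption, and this is not a description of how this utility performs the check internally.

```python
import gzip


def identify(header: bytes):
    """Return (format, bits); bits is only known for compress-format data."""
    if header[:2] == b"\x1f\x8b":
        return ("gzip", None)
    if header[:2] == b"\x1f\x9d" and len(header) >= 3:
        return ("compress", header[2] & 0x1F)  # low five bits: maximum code width
    return ("unknown", None)


print(identify(gzip.compress(b"hello")))
print(identify(b"\x1f\x9d\x90"))
print(identify(b"plain text"))
```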
299: .Pp
300: The amount of compression obtained depends on the size of the
301: input, the number of
302: .Ar bits
303: per code, and the distribution of common substrings.
304: Typically, text such as source code or English is reduced by 60 \- 70% using
305: .Nm .
306: Compression is generally much better than that achieved by Huffman
307: coding (as used in the historical command pack), or adaptive Huffman
308: coding (as used in the historical command compact), and takes less
309: time to compute.
310: .Pp
311: The
312: .Nm gzip ,
313: .Nm gunzip ,
314: and
315: .Nm gzcat
316: utilities exit with 0 on success; 1 if an error occurred;
317: or 2 if a warning occurred.
1.2 ray 318: .Sh ENVIRONMENT
319: .Bl -tag -width Ds
320: .It Ev GZIP
321: Options which are passed to
322: .Nm ,
323: .Nm gunzip ,
324: and
325: .Nm gzcat
326: automatically.
327: .El
1.1 jmc 328: .Sh SEE ALSO
329: .Xr compress 1 ,
330: .Xr gzexe 1 ,
331: .Xr gzsig 1 ,
332: .Xr zdiff 1 ,
333: .Xr zforce 1 ,
334: .Xr zmore 1 ,
335: .Xr znew 1 ,
336: .Xr compress 3
337: .Pp
338: .Bl -tag -width 12n -compact
339: .It RFC 1950
340: ZLIB Compressed Data Format Specification.
341: .It RFC 1951
342: DEFLATE Compressed Data Format Specification.
343: .It RFC 1952
344: GZIP File Format Specification.
345: .El
346: .Sh HISTORY
347: .Nm gzip
348: compatibility was added to
349: .Xr compress 1
350: in
351: .Ox 3.4 .
352: The
353: .Sq g
354: in this version of
355: .Nm gzip
356: stands for
357: .Dq gratis .