Annotation of src/usr.bin/file/magic.5, Revision 1.2
1.2 ! deraadt 1: .\" @(#)$OpenBSD: magic.5,v 1.1.1.1 1995/10/18 08:45:09 deraadt Exp $
1.1 deraadt 2: .TH MAGIC 5 "Public Domain"
3: .\" install as magic.4 on USG, magic.5 on V7 or Berkeley systems.
4: .SH NAME
5: magic \- file command's magic number file
6: .SH DESCRIPTION
7: The
8: .IR file (1)
9: command identifies the type of a file using,
10: among other tests,
11: a test for whether the file begins with a certain
12: .IR "magic number" .
13: The file
14: .B /etc/magic
15: specifies what magic numbers are to be tested for,
16: what message to print if a particular magic number is found,
17: and additional information to extract from the file.
18: .PP
19: Each line of the file specifies a test to be performed.
20: A test compares the data starting at a particular offset
21: in the file with a 1-byte, 2-byte, or 4-byte numeric value or
22: a string. If the test succeeds, a message is printed.
23: The line consists of the following fields:
24: .IP offset \w'message'u+2n
25: A number specifying the offset, in bytes, into the file of the data
26: which is to be tested.
27: .IP type
28: The type of the data to be tested. The possible values are:
29: .RS
30: .IP byte \w'message'u+2n
31: A one-byte value.
32: .IP short
33: A two-byte value (on most systems) in this machine's native byte order.
34: .IP long
35: A four-byte value (on most systems) in this machine's native byte order.
36: .IP string
37: A string of bytes.
38: .IP date
39: A four-byte value interpreted as a UNIX date (seconds since the epoch).
40: .IP beshort
41: A two-byte value (on most systems) in big-endian byte order.
42: .IP belong
43: A four-byte value (on most systems) in big-endian byte order.
44: .IP bedate
45: A four-byte value (on most systems) in big-endian byte order,
46: interpreted as a UNIX date.
47: .IP leshort
48: A two-byte value (on most systems) in little-endian byte order.
49: .IP lelong
50: A four-byte value (on most systems) in little-endian byte order.
51: .IP ledate
52: A four-byte value (on most systems) in little-endian byte order,
53: interpreted as a UNIX date.
54: .RE
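As a rough illustration of the byte-order types listed above, the following Python sketch decodes the same four bytes under several of them. It is not how file(1) is implemented. The names FORMATS and read_value are hypothetical, and the sketch assumes signed values with fixed sizes of 2 and 4 bytes, which the list above only promises "on most systems".

```python
import struct

# Hypothetical mapping from magic type names to struct format codes.
# "=" selects the native byte order, ">" big-endian, "<" little-endian.
# Assumption: values are signed, with fixed sizes of 1, 2 and 4 bytes.
FORMATS = {
    "byte": "=b",
    "short": "=h",
    "long": "=i",
    "beshort": ">h",
    "belong": ">i",
    "leshort": "<h",
    "lelong": "<i",
}

def read_value(data, offset, type_name):
    """Decode one value of the given type at the given byte offset."""
    (value,) = struct.unpack_from(FORMATS[type_name], data, offset)
    return value

data = bytes([0x12, 0x34, 0x56, 0x78])
print(hex(read_value(data, 0, "beshort")))  # 0x1234
print(hex(read_value(data, 0, "leshort")))  # 0x3412
print(hex(read_value(data, 0, "belong")))   # 0x12345678
```

The date variants would decode like the corresponding long types, with the result then treated as a time stamp.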
55: .PP
56: The numeric types may optionally be followed by
57: .B &
58: and a numeric value,
59: to specify that the value is to be AND'ed with the
60: numeric value before any comparisons are done. Prepending a
61: .B u
62: to the type indicates that ordered comparisons should be unsigned.
63: .IP test
64: The value to be compared with the value from the file. If the type is
65: numeric, this value
66: is specified in C form; if it is a string, it is specified as a C string
67: with the usual escapes permitted (e.g. \en for new-line).
68: .IP
69: Numeric values
70: may be preceded by a character indicating the operation to be performed.
71: It may be
72: .BR = ,
73: to specify that the value from the file must equal the specified value,
74: .BR < ,
75: to specify that the value from the file must be less than the specified
76: value,
77: .BR > ,
78: to specify that the value from the file must be greater than the specified
79: value,
80: .BR & ,
81: to specify that the value from the file must have set all of the bits
82: that are set in the specified value,
83: .BR ^ ,
84: to specify that at least one of the bits set in the specified value
85: must be clear in the value from the file, or
86: .BR x ,
87: to specify that any value will match. If the character is omitted,
88: it is assumed to be
89: .BR = .
90: .IP
91: Numeric values are specified in C form; e.g.
92: .B 13
93: is decimal,
94: .B 013
95: is octal, and
96: .B 0x13
97: is hexadecimal.
98: .IP
99: For string values, the byte string from the
100: file must match the specified byte string.
101: The operators
102: .BR = ,
103: .B <
104: and
105: .B >
106: (but not
107: .BR & )
108: can be applied to strings.
109: The length used for matching is that of the string argument
110: in the magic file. This means that a line can match any non-empty string, and
111: then presumably print that string, by using the test
112: .B >\e0
113: (because all strings are greater than the null string).
114: .IP message
115: The message to be printed if the comparison succeeds. If the string
116: contains a
117: .IR printf (3S)
118: format specification, the value from the file (with any specified masking
119: performed) is printed using the message as the format string.
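The comparison rules for the test field can be sketched in Python as follows. This is an illustration of the semantics described above, not the code used by file(1). The helper names numeric_test and string_test are hypothetical, and the mask argument stands in for the optional & value that may follow a numeric type.

```python
def numeric_test(file_value, op, test_value, mask=None):
    """Apply one numeric comparison (hypothetical helper)."""
    if mask is not None:
        file_value &= mask  # the type's & mask is applied before comparing
    if op == "=":
        return file_value == test_value
    if op == "<":
        return file_value < test_value
    if op == ">":
        return file_value > test_value
    if op == "&":  # every bit set in test_value is also set in the file value
        return (file_value & test_value) == test_value
    if op == "^":  # at least one bit set in test_value is clear in the file value
        return (file_value & test_value) != test_value
    if op == "x":  # any value matches
        return True
    raise ValueError("unknown operator: " + op)

def string_test(data, offset, op, test_string):
    """Compare only as many bytes as the test string holds."""
    chunk = data[offset:offset + len(test_string)]
    if op == "=":
        return chunk == test_string
    if op == "<":
        return chunk < test_string
    if op == ">":
        return chunk > test_string
    raise ValueError("operator not valid for strings: " + op)

print(numeric_test(0x1234, "=", 0x34, mask=0xFF))  # True
print(numeric_test(0x10, "^", 0x11))               # True
print(string_test(b"%PDF-1.4", 0, "=", b"%PDF"))   # True
print(string_test(b"hello", 0, ">", b"\0"))        # True
```

Because only the length of the test string is compared, the last call succeeds for any data whose first byte is not a null byte.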
120: .PP
121: Some file formats contain additional information which is to be printed
122: along with the file type. A line which begins with the character
123: .B >
124: indicates additional tests and messages to be printed. The number of
125: .B >
126: on the line indicates the level of the test; a line with no
127: .B >
128: at the beginning is considered to be at level 0.
129: Each line at level
130: .IB n \(pl1
131: is under the control of the line at level
132: .I n
133: most closely preceding it in the magic file.
134: If the test on a line at level
135: .I n
136: succeeds, the tests specified in all the subsequent lines at level
137: .IB n \(pl1
138: are performed, and the messages printed if the tests succeed. The next
139: line at level
140: .I n
141: terminates this.
142: If the first character following the last
143: .B >
144: is a
145: .B (
146: then the string after the parenthesis is interpreted as an indirect offset.
147: That means that the number after the parenthesis is used as an offset in
148: the file. The value at that offset is read, and is used again as an offset
149: in the file. Indirect offsets are of the form:
150: .BI ( x [.[bsl]][+\-][ y ]).
151: The value of
152: .I x
153: is used as an offset in the file. A byte, short or long is read at that offset
154: depending on the
155: .B [bsl]
156: type specifier. To that number the value of
157: .I y
158: is added and the result is used as an offset in the file. The default type
159: if one is not specified is long.
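The indirect offset computation described above can be sketched in Python as follows. The names POINTER_FORMATS and resolve_indirect are hypothetical, and the sketch assumes the value read at x is unsigned, uses the native byte order, and has a fixed size of 1, 2 or 4 bytes. None of this is confirmed as the behaviour of file(1).

```python
import struct

# Assumed struct codes for the [bsl] specifiers: unsigned, native byte order,
# with fixed 1-, 2- and 4-byte sizes.
POINTER_FORMATS = {"b": "=B", "s": "=H", "l": "=I"}

def resolve_indirect(data, x, type_char="l", y=0):
    """Return the final offset: the value stored at offset x, plus y."""
    (pointer,) = struct.unpack_from(POINTER_FORMATS[type_char], data, x)
    return pointer + y

# Build a buffer whose 2-byte value at offset 4 holds the number 16.
data = bytes(4) + struct.pack("=H", 16) + bytes(20)
print(resolve_indirect(data, 4, "s", 2))  # 18
```

The type defaults to "l" here to mirror the statement that long is used when no type is specified.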
160: .SH BUGS
161: The formats
162: .IR long ,
163: .IR belong ,
164: .IR lelong ,
165: .IR short ,
166: .IR beshort ,
167: .IR leshort ,
168: .IR date ,
169: .IR bedate ,
170: and
171: .I ledate
172: are system-dependent; perhaps they should be specified as a number
173: of bytes (2B, 4B, etc),
174: since the files being recognized typically come from
175: a system on which the lengths are invariant.
176: .PP
177: There is (currently) no support for specified-endian data to be used in
178: indirect offsets.
179: .SH SEE ALSO
180: .IR file (1)
181: \- the command that reads this file.
182: .\"
183: .\" From: guy@sun.uucp (Guy Harris)
184: .\" Newsgroups: net.bugs.usg
185: .\" Subject: /etc/magic's format isn't well documented
186: .\" Message-ID: <2752@sun.uucp>
187: .\" Date: 3 Sep 85 08:19:07 GMT
188: .\" Organization: Sun Microsystems, Inc.
189: .\" Lines: 136
190: .\"
191: .\" Here's a manual page for the format accepted by the "file" made by adding
192: .\" the changes I posted to the S5R2 version.
193: .\"
194: .\" Modified for Ian Darwin's version of the file command.