
Annotation of src/usr.bin/file/file.1, Revision 1.2

1.1       deraadt     1: .TH FILE 1 "Copyright but distributable"
1.2     ! deraadt     2: .\" $Id: file.1,v 1.8 1995/10/27 23:33:18 christos Exp $
1.1       deraadt     3: .SH NAME
                      4: file
                      5: \- determine file type
                      6: .SH SYNOPSIS
                      7: .B file
                      8: [
                      9: .B \-vczL
                     10: ]
                     11: [
                     12: .B \-f
                     13: namefile ]
                     14: [
                     15: .B \-m
1.2     ! deraadt    16: magicfiles ]
1.1       deraadt    17: file ...
                     18: .SH DESCRIPTION
                     19: .I File
                     20: tests each argument in an attempt to classify it.
                     21: There are three sets of tests, performed in this order:
                     22: filesystem tests, magic number tests, and language tests.
                     23: The
                     24: .I first
                     25: test that succeeds causes the file type to be printed.
                     26: .PP
                     27: The type printed will usually contain one of the words
                     28: .B text
                     29: (the file contains only ASCII characters and is
                     30: probably safe to read on an ASCII terminal),
                     31: .B executable
                     32: (the file contains the result of compiling a program
                     33: in a form understandable to some \s-1UNIX\s0 kernel or another),
                     34: or
                     35: .B data
                     36: meaning anything else (data is usually `binary' or non-printable).
                     37: Exceptions are well-known file formats (core files, tar archives)
                     38: that are known to contain binary data.
                     39: When modifying the file
                     40: .I /etc/magic
                     41: or the program itself,
                     42: .B "preserve these keywords" .
                     43: People depend on knowing that all the readable files in a directory
                     44: have the word ``text'' printed.
                     45: Don't do as Berkeley did \- change ``shell commands text''
                     46: to ``shell script''.
                     47: .PP
                     48: The filesystem tests are based on examining the return from a
                     49: .IR stat (2)
                     50: system call.
                     51: The program checks to see if the file is empty,
                     52: or if it's some sort of special file.
                     53: Any known file types appropriate to the system you are running on
                     54: (sockets, symbolic links, or named pipes (FIFOs) on those systems that
                     55: implement them)
                     56: are intuited if they are defined in
                     57: the system header file
                     58: .BR sys/stat.h  .
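
The filesystem tests described above can be sketched in a few lines. This is an illustration only, not the program's actual code: the function name filesystem_test and the returned strings are assumptions made for the example, and Python's os.lstat stands in for the stat(2) call.

```python
import os
import stat


def filesystem_test(path):
    """Return a type string for special or empty files, else None.

    Hypothetical helper: the name and the strings it returns are
    assumptions for this sketch, not the output of the real program.
    """
    st = os.lstat(path)  # lstat, so a symbolic link is reported as a link
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None  # an ordinary non-empty file: fall through to later tests
```

A return value of None means that no filesystem test succeeded, so the magic number tests would be tried next.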
                     59: .PP
                     60: The magic number tests are used to check for files with data in
                     61: particular fixed formats.
                     62: The canonical example of this is a binary executable (compiled program)
                     63: .B a.out
                     64: file, whose format is defined in
                     65: .B a.out.h
                     66: and possibly
                     67: .B exec.h
                     68: in the standard include directory.
                     69: These files have a `magic number' stored in a particular place
                     70: near the beginning of the file that tells the \s-1UNIX\s0 operating system
                     71: that the file is a binary executable, and which of several types thereof.
                     72: The concept of `magic number' has been applied by extension to data files.
                     73: Any file with some invariant identifier at a small fixed
                     74: offset into the file can usually be described in this way.
                     75: The information in these files is read from the magic file
                     76: .IR /etc/magic .
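
A minimal sketch of a magic number test follows. The table is an assumption made for illustration: it holds three well-known signatures written as Python tuples, and is neither in the magic(5) format nor taken from /etc/magic. The function name magic_test is likewise hypothetical.

```python
# Illustrative table only: (offset, expected bytes, description).
# These entries are assumptions for the sketch, not lines from /etc/magic.
MAGIC = [
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"%!", "PostScript document text"),
    (257, b"ustar", "POSIX tar archive"),
]


def magic_test(data):
    """Return the description of the first matching entry, else None."""
    for offset, expected, description in MAGIC:
        # Compare the bytes at the fixed offset with the invariant identifier.
        if data[offset:offset + len(expected)] == expected:
            return description
    return None
```

The first matching entry wins in this sketch, so the order of the table matters.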
                     77: .PP
                     78: If an argument appears to be an
                     79: .SM ASCII
                     80: file,
                     81: .I file
                     82: attempts to guess its language.
                     83: The language tests look for particular strings (cf. \fInames.h\fP)
                     84: that can appear anywhere in the first few blocks of a file.
                     85: For example, the keyword
                     86: .B .br
                     87: indicates that the file is most likely a troff input file,
                     88: just as the keyword
                     89: .B struct
                     90: indicates a C program.
                     91: These tests are less reliable than the previous
                     92: two groups, so they are performed last.
                     93: The language test routines also test for some miscellany
                     94: (such as
                     95: .I tar
                     96: archives) and determine whether an unknown file should be
                     97: labelled as `ascii text' or `data'.
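
The language tests can be pictured roughly as below. This is a sketch under stated assumptions: the two keywords come from the text above, but the table layout, the 1024-byte limit standing in for "the first few blocks", the returned strings and the name language_test are all invented for the example.

```python
# Hypothetical keyword table; the real list lives in names.h.
KEYWORDS = [
    (b".br", "troff input text"),
    (b"struct", "C program text"),
]


def language_test(data, limit=1024):
    """Guess a language from keywords near the start of the data.

    The limit of 1024 bytes is an assumption standing in for
    "the first few blocks of a file".
    """
    head = data[:limit]
    if any(byte > 0x7F for byte in head):
        return "data"  # non-ASCII bytes: not a text file
    words = set(head.split())  # whitespace-separated tokens
    for keyword, description in KEYWORDS:
        if keyword in words:
            return description
    return "ascii text"  # ASCII, but no keyword recognised
```

Because a single keyword decides the answer, a C source file that happens to contain ".br" as a word would be misreported here, which shows why these tests are the least reliable of the three groups.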
                     98: .SH OPTIONS
                     99: .TP 8
                    100: .B \-v
                    101: Print the version of the program and exit.
                    102: .TP 8
1.2     ! deraadt   103: .B \-m list
        !           104: Specify an alternate list of files containing magic numbers.
        !           105: This can be a single file, or a colon-separated list of files.
1.1       deraadt   106: .TP 8
                    107: .B \-z
                    108: Try to look inside compressed files.
                    109: .TP 8
                    110: .B \-c
                    111: Cause a checking printout of the parsed form of the magic file.
                    112: This is usually used in conjunction with
                    113: .B \-m
                    114: to debug a new magic file before installing it.
                    115: .TP 8
                    116: .B \-f namefile
                    117: Read the names of the files to be examined from
                    118: .I namefile
                    119: (one per line)
                    120: before the argument list.
                    121: Either
                    122: .I namefile
                    123: or at least one filename argument must be present;
                    124: to test the standard input, use ``-'' as a filename argument.
                    125: .TP 8
                    126: .B \-L
                    127: Follow symbolic links, as the like-named option in
                    128: .IR ls (1)
                    129: does (on systems that support symbolic links).
                    130: .SH FILES
                    131: .I /etc/magic
                    132: \- default list of magic numbers
1.2     ! deraadt   133: .SH ENVIRONMENT
        !           134: The environment variable
        !           135: .B MAGIC
        !           136: can be used to set the default magic number files.
1.1       deraadt   137: .SH SEE ALSO
                    138: .IR magic (5)
                    139: \- description of magic file format.
                    140: .br
                    141: .IR strings (1), " od" (1)
                    142: \- tools for examining non-textfiles.
                    143: .SH STANDARDS CONFORMANCE
                    144: This program is believed to exceed the System V Interface Definition
                    145: of FILE(CMD), as near as one can determine from the vague language
                    146: contained therein.
                    147: Its behaviour is mostly compatible with the System V program of the same name.
                    148: This version knows more magic, however, so it will produce
                    149: different (albeit more accurate) output in many cases.
                    150: .PP
                    151: The one significant difference
                    152: between this version and System V
                    153: is that this version treats any white space
                    154: as a delimiter, so that spaces in pattern strings must be escaped.
                    155: For example,
                    156: .br
                    157: >10    string  language impress\       (imPRESS data)
                    158: .br
                    159: in an existing magic file would have to be changed to
                    160: .br
                    161: >10    string  language\e impress      (imPRESS data)
                    162: .br
                    163: In addition, in this version, if a pattern string contains a backslash,
                    164: it must be escaped.  For example
                    165: .br
                    166: 0      string          \ebegindata     Andrew Toolkit document
                    167: .br
                    168: in an existing magic file would have to be changed to
                    169: .br
                    170: 0      string          \e\ebegindata   Andrew Toolkit document
                    171: .br
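
The two escaping rules above can be sketched as a small field reader. It assumes only what this section states: white space ends the pattern, and a backslash makes the next character literal. Other escapes that a real magic file may use are not handled, and the name read_pattern is hypothetical.

```python
def read_pattern(text):
    """Return (pattern, rest) for the pattern field at the start of text.

    Simplified sketch: a backslash takes the following character
    literally, and unescaped white space ends the field.
    """
    out = []
    i = 0
    while i < len(text) and not text[i].isspace():
        if text[i] == "\\" and i + 1 < len(text):
            i += 1  # skip the backslash, keep the escaped character
        out.append(text[i])
        i += 1
    return "".join(out), text[i:].lstrip()
```

Applied to the text "language\ impress (imPRESS data)", the sketch yields the pattern "language impress" and leaves "(imPRESS data)" as the message.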
                    172: .PP
                    173: SunOS releases 3.2 and later from Sun Microsystems include a
                    174: .IR file (1)
                    175: command derived from the System V one, but with some extensions.
                    176: My version differs from Sun's only in minor ways.
                    177: It includes the extension of the `&' operator, used as,
                    178: for example,
                    179: .br
                    180: >16    long&0x7fffffff >0              not stripped
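
As a rough illustration of the `&' line above: the value at offset 16 is ANDed with the mask and the result is compared with 0. The sketch below assumes a 4-byte unsigned value and takes the byte order as a parameter; the name masked_long_test and its signature are inventions for the example.

```python
import struct


def masked_long_test(data, offset, mask, threshold, byteorder="="):
    """True if the 4-byte value at offset, ANDed with mask, exceeds threshold.

    byteorder is a struct prefix: "=" native, "<" little-endian,
    ">" big-endian.  Treating the value as unsigned is an assumption.
    """
    if len(data) < offset + 4:
        return False  # the file is too short to hold the field
    (value,) = struct.unpack_from(byteorder + "I", data, offset)
    return (value & mask) > threshold
```

With offset 16, mask 0x7fffffff and threshold 0, the test succeeds when any of the low 31 bits of the field is set, whatever the top bit holds.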
                    181: .SH MAGIC DIRECTORY
                    182: The magic file entries have been collected from various sources,
                    183: mainly USENET, and contributed by various authors.
                    184: Christos Zoulas (address below) will collect additional
                    185: or corrected magic file entries.
                    186: A consolidation of magic file entries
                    187: will be distributed periodically.
                    188: .PP
                    189: The order of entries in the magic file is significant.
                    190: Depending on what system you are using, the order in which
                    191: they are put together may be incorrect.
                    192: If your old
                    193: .I file
                    194: command uses a magic file,
                    195: keep the old magic file around for comparison purposes
                    196: (rename it to
                    197: .IR /etc/magic.orig ).
                    198: .SH HISTORY
                    199: There has been a
                    200: .I file
                    201: command in every UNIX since at least Research Version 6
                    202: (man page dated January, 1975).
                    203: The System V version introduced one significant change:
                    204: the external list of magic number types.
                    205: This slowed the program down slightly but made it a lot more flexible.
                    206: .PP
                    207: This program, based on the System V version,
                    208: was written by Ian Darwin without looking at anybody else's source code.
                    209: .PP
                    210: John Gilmore revised the code extensively, making it better than
                    211: the first version.
                    212: Geoff Collyer found several inadequacies
                    213: and provided some magic file entries.
                    214: The program has undergone continued evolution since.
                    215: .SH AUTHOR
                    216: Written by Ian F. Darwin, UUCP address {utzoo | ihnp4}!darwin!ian,
                    217: Internet address ian@sq.com,
                    218: postal address: P.O. Box 603, Station F, Toronto, Ontario, CANADA M4Y 2L8.
                    219: .PP
                    220: Altered by Rob McMahon, cudcv@warwick.ac.uk, 1989, to extend the `&' operator
                    221: from simple `x&y != 0' to `x&y op z'.
                    222: .PP
                    223: Altered by Guy Harris, guy@auspex.com, 1993, to:
                    224: .RS
                    225: .PP
                    226: put the ``old-style'' `&'
                    227: operator back the way it was, because 1) Rob McMahon's change broke the
                    228: previous style of usage, 2) the SunOS ``new-style'' `&' operator,
                    229: which this version of
                    230: .I file
                    231: supports, also handles `x&y op z', and 3) Rob's change wasn't documented
                    232: in any case;
                    233: .PP
                    234: put in multiple levels of `>';
                    235: .PP
                    236: put in ``beshort'', ``leshort'', etc. keywords to look at numbers in the
                    237: file in a specific byte order, rather than in the native byte order of
                    238: the process running
                    239: .IR file .
                    240: .RE
                    241: .PP
                    242: Changes by Ian Darwin and various authors including
                    243: Christos Zoulas (christos@ee.cornell.edu), 1990-1992.
                    244: .SH LEGAL NOTICE
                    245: Copyright (c) Ian F. Darwin, Toronto, Canada,
                    246: 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993.
                    247: .PP
                    248: This software is not subject to and may not be made subject to any
                    249: license of the American Telephone and Telegraph Company, Sun
                    250: Microsystems Inc., Digital Equipment Inc., Lotus Development Inc., the
                    251: Regents of the University of California, The X Consortium or MIT, or
                    252: The Free Software Foundation.
                    253: .PP
                    254: This software is not subject to any export provision of the United States
                    255: Department of Commerce, and may be exported to any country or planet.
                    256: .PP
                    257: Permission is granted to anyone to use this software for any purpose on
                    258: any computer system, and to alter it and redistribute it freely, subject
                    259: to the following restrictions:
                    260: .PP
                    261: 1. The author is not responsible for the consequences of use of this
                    262: software, no matter how awful, even if they arise from flaws in it.
                    263: .PP
                    264: 2. The origin of this software must not be misrepresented, either by
                    265: explicit claim or by omission.  Since few users ever read sources,
                    266: credits must appear in the documentation.
                    267: .PP
                    268: 3. Altered versions must be plainly marked as such, and must not be
                    269: misrepresented as being the original software.  Since few users
                    270: ever read sources, credits must appear in the documentation.
                    271: .PP
                    272: 4. This notice may not be removed or altered.
                    273: .PP
                    274: A few support files (\fIgetopt\fP, \fIstrtok\fP)
                    275: distributed with this package
                    276: are by Henry Spencer and are subject to the same terms as above.
                    277: .PP
                    278: A few simple support files (\fIstrtol\fP, \fIstrchr\fP)
                    279: distributed with this package
                    280: are in the public domain; they are so marked.
                    281: .PP
                    282: The files
                    283: .I tar.h
                    284: and
                    285: .I is_tar.c
                    286: were written by John Gilmore from his public-domain
                    287: .I tar
                    288: program, and are not covered by the above restrictions.
                    289: .SH BUGS
                    290: There must be a better way to automate the construction of the Magic
                    291: file from all the glop in Magdir. What is it?
                    292: Better yet, the magic file should be compiled into binary (say,
                    293: .IR ndbm (3)
                    294: or, better yet, fixed-length ASCII strings
                    295: for use in heterogeneous network environments) for faster startup.
                    296: Then the program would run as fast as the Version 7 program of the same name,
                    297: with the flexibility of the System V version.
                    298: .PP
                    299: .I File
                    300: uses several algorithms that favor speed over accuracy;
                    301: thus it can be misled about the contents of ASCII files.
                    302: .PP
                    303: The support for ASCII files (primarily for programming languages)
                    304: is simplistic and inefficient, and requires recompilation to update.
                    305: .PP
                    306: There should be an ``else'' clause to follow a series of continuation lines.
                    307: .PP
                    308: The magic file and keywords should have regular expression support.
                    309: Their use of ASCII TAB as a field delimiter is ugly and makes
                    310: it hard to edit the files, but is entrenched.
                    311: .PP
                    312: It might be advisable to allow upper-case letters in keywords
                    313: to distinguish, e.g., troff commands from man page macros.
                    314: Regular expression support would make this easy.
                    315: .PP
                    316: The program doesn't grok \s-2FORTRAN\s0.
                    317: It should be able to figure out \s-2FORTRAN\s0 by seeing some keywords which
                    318: appear indented at the start of a line.
                    319: Regular expression support would make this easy.
                    320: .PP
                    321: The list of keywords in
                    322: .I ascmagic
                    323: probably belongs in the Magic file.
                    324: This could be done by using some keyword like `*' for the offset value.
                    325: .PP
                    326: Another optimisation would be to sort
                    327: the magic file so that we can just run down all the
                    328: tests for the first byte, first word, first long, etc, once we
                    329: have fetched it.  Complain about conflicts in the magic file entries.
                    330: Make a rule that the magic entries sort based on file offset rather
                    331: than position within the magic file?
                    332: .PP
                    333: The program should provide a way to give an estimate
                    334: of ``how good'' a guess is.
                    335: We end up removing guesses (e.g. ``From '' as first 5 chars of file) because
                    336: they are not as good as other guesses (e.g. ``Newsgroups:'' versus
                    337: ``Return-Path:'').  Still, if the others don't pan out, it should be
                    338: possible to use the first guess.
                    339: .PP
                    340: This program is slower than some vendors' file commands.
                    341: .PP
                    342: This manual page, and particularly this section, is too long.
                    343: .SH AVAILABILITY
                    344: You can obtain the original author's latest version by anonymous FTP
                    345: on
                    346: .B tesla.ee.cornell.edu
                    347: as the file
                    348: .BR /pub/file-X.YY.tar.gz