Annotation of src/usr.bin/file/file.1, Revision 1.5

1.5     ! ian         1: .\" $OpenBSD: file.1,v 1.4 1997/02/09 23:58:20 millert Exp $
1.4       millert     2: .TH FILE 1 "Copyrighted but distributable"
1.1       deraadt     3: .SH NAME
                      4: file
                      5: \- determine file type
                      6: .SH SYNOPSIS
                      7: .B file
                      8: [
                      9: .B \-vczL
                     10: ]
                     11: [
                     12: .B \-f
                     13: namefile ]
                     14: [
                     15: .B \-m
1.2       deraadt    16: magicfiles ]
1.1       deraadt    17: file ...
                     18: .SH DESCRIPTION
1.4       millert    19: This manual page documents version 3.22 of the
                     20: .B file
                     21: command.
                     22: .B File
1.1       deraadt    23: tests each argument in an attempt to classify it.
                     24: There are three sets of tests, performed in this order:
                     25: filesystem tests, magic number tests, and language tests.
                     26: The
                     27: .I first
                     28: test that succeeds causes the file type to be printed.
                     29: .PP
                     30: The type printed will usually contain one of the words
                     31: .B text
1.4       millert    32: (the file contains only
                     33: .SM ASCII
                     34: characters and is probably safe to read on an
                     35: .SM ASCII
                     36: terminal),
1.1       deraadt    37: .B executable
                     38: (the file contains the result of compiling a program
                     39: in a form understandable to some \s-1UNIX\s0 kernel or another),
                     40: or
                     41: .B data
                     42: meaning anything else (data is usually `binary' or non-printable).
                     43: Exceptions are well-known file formats (core files, tar archives)
                     44: that are known to contain binary data.
                     45: When modifying the file
                     46: .I /etc/magic
                     47: or the program itself,
                     48: .BR "preserve these keywords" .
                     49: People depend on knowing that all the readable files in a directory
                     50: have the word ``text'' printed.
                     51: Don't do as Berkeley did \- change ``shell commands text''
                     52: to ``shell script''.
                     53: .PP
                     54: The filesystem tests are based on examining the return from a
1.4       millert    55: .BR stat (2)
1.1       deraadt    56: system call.
                     57: The program checks to see if the file is empty,
                     58: or if it's some sort of special file.
                     59: Any known file types appropriate to the system you are running on
                     60: (sockets, symbolic links, or named pipes (FIFOs) on those systems that
                     61: implement them)
                     62: are intuited if they are defined in
                     63: the system header file
1.4       millert    64: .IR sys/stat.h  .
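The filesystem tests described above can be sketched roughly as follows. This is a minimal illustration in Python, not the program's own code: the function name filesystem_test, the follow_symlinks parameter and the returned strings are all assumptions chosen for the example.

```python
import os
import stat

def filesystem_test(path, follow_symlinks=False):
    """Classify a path from stat data alone (hypothetical helper).

    Returns a description, or None when the file is a regular,
    non-empty file and the later tests have to decide.
    """
    # lstat() reports on a symbolic link itself; stat() follows it.
    st = os.stat(path) if follow_symlinks else os.lstat(path)
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link to " + os.readlink(path)
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None  # fall through to the magic number tests
```

Returning None for an ordinary non-empty file mirrors the rule that the first test to succeed decides the type: nothing is printed here, so the next group of tests runs.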
1.1       deraadt    65: .PP
                     66: The magic number tests are used to check for files with data in
                     67: particular fixed formats.
                     68: The canonical example of this is a binary executable (compiled program)
1.4       millert    69: .I a.out
1.1       deraadt    70: file, whose format is defined in
1.4       millert    71: .I a.out.h
1.1       deraadt    72: and possibly
1.4       millert    73: .I exec.h
1.1       deraadt    74: in the standard include directory.
                     75: These files have a `magic number' stored in a particular place
                     76: near the beginning of the file that tells the \s-1UNIX\s0 operating system
                     77: that the file is a binary executable, and which of several types thereof.
                     78: The concept of `magic number' has been applied by extension to data files.
                     79: Any file with some invariant identifier at a small fixed
                     80: offset into the file can usually be described in this way.
                     81: The information in these files is read from the magic file
                     82: .IR /etc/magic .
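As a rough illustration of a magic number test, the sketch below compares bytes at a fixed offset against a small table. The table MAGIC and the function magic_test are hypothetical and stand in for entries that would normally be read from the magic file; the three signatures shown are common ones, picked only for the example.

```python
# Hypothetical table of (offset, magic bytes, description).
# A real run would take these from the magic file instead.
MAGIC = [
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"%PDF-", "PDF document"),
    (257, b"ustar", "tar archive"),
]

def magic_test(data):
    """Return the description of the first matching entry, else None."""
    for offset, magic, description in MAGIC:
        # Slicing past the end of data yields a short (non-matching) slice.
        if data[offset:offset + len(magic)] == magic:
            return description
    return None
```

Entries are tried in table order, so an earlier entry wins when two could match.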
                     83: .PP
                     84: If an argument appears to be an
                     85: .SM ASCII
                     86: file,
1.4       millert    87: .B file
1.1       deraadt    88: attempts to guess its language.
1.4       millert    89: The language tests look for particular strings (cf
                     90: .IR names.h )
1.1       deraadt    91: that can appear anywhere in the first few blocks of a file.
                     92: For example, the keyword
                     93: .B .br
1.4       millert    94: indicates that the file is most likely a
                     95: .BR troff (1)
                     96: input file, just as the keyword
1.1       deraadt    97: .B struct
                     98: indicates a C program.
                     99: These tests are less reliable than the previous
                    100: two groups, so they are performed last.
                    101: The language test routines also test for some miscellany
                    102: (such as
1.4       millert   103: .BR tar (1)
1.1       deraadt   104: archives) and determine whether an unknown file should be
                    105: labelled as `ascii text' or `data'.
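The language tests can be pictured as a keyword scan over the start of the file. The sketch below is an assumption-laden illustration: the table KEYWORDS holds only the two examples named above, the 4096-byte limit stands in for "the first few blocks", and the returned labels are invented for the example.

```python
# Hypothetical keyword table using the two examples from the text.
KEYWORDS = [
    (".br", "troff input text"),
    ("struct", "C program text"),
]

def language_test(data, limit=4096):
    """Guess a language from whitespace-separated tokens (sketch only)."""
    head = data[:limit]
    # Treat NUL bytes or bytes above 0x7f as a sign of non-ASCII data.
    if b"\0" in head or any(b > 0x7F for b in head):
        return "data"
    tokens = head.decode("ascii").split()
    for keyword, description in KEYWORDS:
        if keyword in tokens:
            return description
    return "ascii text"
```

A single keyword is weak evidence, which is why a scan of this kind is less reliable than the other two groups of tests.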
                    106: .SH OPTIONS
                    107: .TP 8
                    108: .B \-v
                    109: Print the version of the program and exit.
                    110: .TP 8
1.2       deraadt   111: .B \-m list
                    112: Specify an alternate list of files containing magic numbers.
                    113: This can be a single file, or a colon-separated list of files.
1.1       deraadt   114: .TP 8
                    115: .B \-z
                    116: Try to look inside compressed files.
                    117: .TP 8
                    118: .B \-c
                    119: Cause a checking printout of the parsed form of the magic file.
                    120: This is usually used in conjunction with
                    121: .B \-m
                    122: to debug a new magic file before installing it.
                    123: .TP 8
                    124: .B \-f namefile
                    125: Read the names of the files to be examined from
                    126: .I namefile
                    127: (one per line)
                    128: before the argument list.
                    129: Either
                    130: .I namefile
                    131: or at least one filename argument must be present;
                    132: to test the standard input, use ``-'' as a filename argument.
                    133: .TP 8
                    134: .B \-L
                    135: Follow symbolic links, as the like-named option in
1.4       millert   136: .BR ls (1)
1.1       deraadt   137: does (on systems that support symbolic links).
                    138: .SH FILES
                    139: .I /etc/magic
                    140: \- default list of magic numbers
1.2       deraadt   141: .SH ENVIRONMENT
                    142: The environment variable
                    143: .B MAGIC
                    144: can be used to set the default magic number files.
1.1       deraadt   145: .SH SEE ALSO
1.4       millert   146: .BR magic (5)
1.1       deraadt   147: \- description of magic file format.
                    148: .br
1.4       millert   149: .BR strings (1), " od" (1)
1.1       deraadt   150: \- tools for examining non-textfiles.
                    151: .SH STANDARDS CONFORMANCE
                    152: This program is believed to exceed the System V Interface Definition
                    153: of FILE(CMD), as near as one can determine from the vague language
                    154: contained therein.
                    155: Its behaviour is mostly compatible with the System V program of the same name.
                    156: This version knows more magic, however, so it will produce
                    157: different (albeit more accurate) output in many cases.
                    158: .PP
                    159: The one significant difference
                    160: between this version and System V
                    161: is that this version treats any white space
                    162: as a delimiter, so that spaces in pattern strings must be escaped.
                    163: For example,
                    164: .br
                    165: >10    string  language impress\       (imPRESS data)
                    166: .br
                    167: in an existing magic file would have to be changed to
                    168: .br
                    169: >10    string  language\e impress      (imPRESS data)
                    170: .br
                    171: In addition, in this version, if a pattern string contains a backslash,
                    172: it must be escaped.  For example
                    173: .br
                    174: 0      string          \ebegindata     Andrew Toolkit document
                    175: .br
                    176: in an existing magic file would have to be changed to
                    177: .br
                    178: 0      string          \e\ebegindata   Andrew Toolkit document
                    179: .br
                    180: .PP
                    181: SunOS releases 3.2 and later from Sun Microsystems include a
1.4       millert   182: .BR file (1)
1.1       deraadt   183: command derived from the System V one, but with some extensions.
                    184: My version differs from Sun's only in minor ways.
                    185: It includes the extension of the `&' operator, used as,
                    186: for example,
                    187: .br
                    188: >16    long&0x7fffffff >0              not stripped
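To make the `&' extension concrete, the sketch below evaluates a test of the form `long&mask >value' on a byte string. The function masked_long_test and its parameters are hypothetical; reading the long as a signed 4-byte quantity and the byteorder argument are assumptions of this example, not statements about the program's internals.

```python
import struct

def masked_long_test(data, offset, mask, threshold, byteorder="="):
    """Return True when (long at offset) & mask > threshold (sketch).

    byteorder uses struct notation: "=" native, ">" big, "<" little.
    """
    if len(data) < offset + 4:
        return False  # not enough bytes to read a 4-byte long
    (value,) = struct.unpack_from(byteorder + "l", data, offset)
    return (value & mask) > threshold
```

With offset 16, mask 0x7fffffff and threshold 0, this corresponds to the example entry above: the top bit is masked off before the comparison, so a value with only that bit set does not match.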
                    189: .SH MAGIC DIRECTORY
                    190: The magic file entries have been collected from various sources,
                    191: mainly USENET, and contributed by various authors.
                    192: Christos Zoulas (address below) will collect additional
                    193: or corrected magic file entries.
                    194: A consolidation of magic file entries
                    195: will be distributed periodically.
                    196: .PP
                    197: The order of entries in the magic file is significant.
                    198: Depending on what system you are using, the order that
                    199: they are put together may be incorrect.
                    200: If your old
1.4       millert   201: .B file
1.1       deraadt   202: command uses a magic file,
                    203: keep the old magic file around for comparison purposes
                    204: (rename it to
                    205: .IR /etc/magic.orig ).
                    206: .SH HISTORY
                    207: There has been a
1.4       millert   208: .B file
                    209: command in every \s-1UNIX\s0 since at least Research Version 6
1.1       deraadt   210: (man page dated January, 1975).
                    211: The System V version introduced one significant change:
                    212: the external list of magic number types.
                    213: This slowed the program down slightly but made it a lot more flexible.
                    214: .PP
                    215: This program, based on the System V version,
                    216: was written by Ian Darwin without looking at anybody else's source code.
                    217: .PP
                    218: John Gilmore revised the code extensively, making it better than
                    219: the first version.
                    220: Geoff Collyer found several inadequacies
                    221: and provided some magic file entries.
                    222: The program has undergone continued evolution since.
                    223: .SH AUTHOR
1.5     ! ian       224: Written by Ian F. Darwin, ian@darwinsys.com.
1.1       deraadt   225: .PP
                    226: Altered by Rob McMahon, cudcv@warwick.ac.uk, 1989, to extend the `&' operator
                    227: from simple `x&y != 0' to `x&y op z'.
                    228: .PP
                    229: Altered by Guy Harris, guy@auspex.com, 1993, to:
                    230: .RS
                    231: .PP
                    232: put the ``old-style'' `&'
                    233: operator back the way it was, because 1) Rob McMahon's change broke the
                    234: previous style of usage, 2) the SunOS ``new-style'' `&' operator,
                    235: which this version of
1.4       millert   236: .B file
1.1       deraadt   237: supports, also handles `x&y op z', and 3) Rob's change wasn't documented
                    238: in any case;
                    239: .PP
                    240: put in multiple levels of `>';
                    241: .PP
                    242: put in ``beshort'', ``leshort'', etc. keywords to look at numbers in the
                    243: file in a specific byte order, rather than in the native byte order of
                    244: the process running
1.4       millert   245: .BR file .
1.1       deraadt   246: .RE
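The effect of the byte-order keywords can be illustrated with a short sketch. The mapping FORMATS and the helper read_number are hypothetical; the sketch reads unsigned values for simplicity, which is an assumption of the example.

```python
import struct

# Keyword -> struct format with an explicit byte order.
FORMATS = {
    "beshort": ">H",  # big-endian 16-bit
    "leshort": "<H",  # little-endian 16-bit
    "belong": ">L",   # big-endian 32-bit
    "lelong": "<L",   # little-endian 32-bit
}

def read_number(data, offset, keyword):
    """Read an unsigned number at offset in the keyword's byte order."""
    (value,) = struct.unpack_from(FORMATS[keyword], data, offset)
    return value
```

Because the byte order is fixed by the keyword, the same bytes give the same number whatever machine runs the test, which a native-order read does not guarantee.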
                    247: .PP
                    248: Changes by Ian Darwin and various authors including
1.4       millert   249: Christos Zoulas (christos@deshaw.com), 1990-1997.
1.1       deraadt   250: .SH LEGAL NOTICE
1.5     ! ian       251: This program is distributed under the terms of the accompanying
        !           252: license file LEGAL.NOTICE.
1.1       deraadt   253: .PP
                    254: A few support files (\fIgetopt\fP, \fIstrtok\fP)
                    255: distributed with this package
                    256: are by Henry Spencer and are subject to the same terms as above.
                    257: .PP
                    258: A few simple support files (\fIstrtol\fP, \fIstrchr\fP)
                    259: distributed with this package
                    260: are in the public domain; they are so marked.
                    261: .PP
                    262: The files
                    263: .I tar.h
                    264: and
                    265: .I is_tar.c
                    266: were written by John Gilmore from his public-domain
1.4       millert   267: .B tar
1.1       deraadt   268: program, and are not covered by the above restrictions.
                    269: .SH BUGS
                    270: There must be a better way to automate the construction of the Magic
                    271: file from all the glop in Magdir. What is it?
                    272: Better yet, the magic file should be compiled into binary (say,
1.4       millert   273: .BR ndbm (3)
                    274: or, better yet, fixed-length
                    275: .SM ASCII
                    276: strings for use in heterogeneous network environments) for faster startup.
1.1       deraadt   277: Then the program would run as fast as the Version 7 program of the same name,
                    278: with the flexibility of the System V version.
                    279: .PP
1.4       millert   280: .B File
1.1       deraadt   281: uses several algorithms that favor speed over accuracy,
1.4       millert   282: thus it can be misled about the contents of
                    283: .SM ASCII
                    284: files.
                    285: .PP
                    286: The support for
                    287: .SM ASCII
                    288: files (primarily for programming languages)
1.1       deraadt   289: is simplistic, inefficient and requires recompilation to update.
                    290: .PP
                    291: There should be an ``else'' clause to follow a series of continuation lines.
                    292: .PP
                    293: The magic file and keywords should have regular expression support.
1.4       millert   294: Their use of
                    295: .SM "ASCII TAB"
                    296: as a field delimiter is ugly and makes
1.1       deraadt   297: it hard to edit the files, but is entrenched.
                    298: .PP
                    299: It might be advisable to allow upper-case letters in keywords
1.4       millert   300: for e.g.,
                    301: .BR troff (1)
                    302: commands vs man page macros.
1.1       deraadt   303: Regular expression support would make this easy.
                    304: .PP
                    305: The program doesn't grok \s-2FORTRAN\s0.
                    306: It should be able to figure \s-2FORTRAN\s0 by seeing some keywords which
                    307: appear indented at the start of line.
                    308: Regular expression support would make this easy.
                    309: .PP
                    310: The list of keywords in
                    311: .I ascmagic
                    312: probably belongs in the Magic file.
                    313: This could be done by using some keyword like `*' for the offset value.
                    314: .PP
                    315: Another optimisation would be to sort
                    316: the magic file so that we can just run down all the
                    317: tests for the first byte, first word, first long, etc, once we
                    318: have fetched it.  Complain about conflicts in the magic file entries.
                    319: Make a rule that the magic entries sort based on file offset rather
                    320: than position within the magic file?
                    321: .PP
                    322: The program should provide a way to give an estimate
                    323: of ``how good'' a guess is.
                    324: We end up removing guesses (e.g. ``From '' as first 5 chars of file) because
                    325: they are not as good as other guesses (e.g. ``Newsgroups:'' versus
                    326: ``Return-Path:'').  Still, if the others don't pan out, it should be
                    327: possible to use the first guess.
                    328: .PP
                    329: This program is slower than some vendors' file commands.
                    330: .PP
                    331: This manual page, and particularly this section, is too long.
                    332: .SH AVAILABILITY
                    333: You can obtain the original author's latest version by anonymous FTP
1.5     ! ian       334: at
        !           335: .I ftp://ftp.astron.com/pub/file/
        !           336: with a name like
        !           337: .IR file-X.YY.tar.gz .