Annotation of src/usr.bin/file/file.1, Revision 1.9

1.9     ! aaron       1: .\" $OpenBSD: file.1,v 1.8 2000/03/06 02:38:19 aaron Exp $
1.8       aaron       2: .\" $FreeBSD: src/usr.bin/file/file.1,v 1.16 2000/03/01 12:19:39 sheldonh Exp $
                      3: .Dd July 30, 1997
                      4: .Dt FILE 1
                      5: .Os
                      6: .Sh NAME
                      7: .Nm file
                      8: .Nd determine file type
                      9: .Sh SYNOPSIS
                     10: .Nm file
                     11: .Op Fl vczL
                     12: .Op Fl f Ar namefile
                     13: .Op Fl m Ar magicfiles
                     14: .Ar file Op Ar ...
                     15: .Sh DESCRIPTION
1.4       millert    16: This manual page documents version 3.22 of the
1.8       aaron      17: .Nm
1.4       millert    18: command.
1.8       aaron      19: .Nm
1.1       deraadt    20: tests each argument in an attempt to classify it.
                     21: There are three sets of tests, performed in this order:
                     22: filesystem tests, magic number tests, and language tests.
1.8       aaron      23: The first test that succeeds causes the file type to be printed.
                     24: .Pp
1.1       deraadt    25: The type printed will usually contain one of the words
1.8       aaron      26: .Dq text
1.4       millert    27: (the file contains only
1.8       aaron      28: .Tn ASCII
1.4       millert    29: characters and is probably safe to read on an
1.8       aaron      30: .Tn ASCII
1.4       millert    31: terminal),
1.8       aaron      32: .Dq executable
1.1       deraadt    33: (the file contains the result of compiling a program
1.8       aaron      34: in a form understandable to some
                     35: .Ux
                     36: kernel or another),
1.1       deraadt    37: or
1.8       aaron      38: .Dq data
                     39: meaning anything else (data is usually binary or non-printable).
                     40: .Pp
1.1       deraadt    41: Exceptions are well-known file formats (core files, tar archives)
                     42: that are known to contain binary data.
                     43: When modifying the file
1.8       aaron      44: .Pa /etc/magic
1.6       aaron      45: or the program itself,
1.8       aaron      46: .Em "preserve these keywords" .
                     47: .Pp
1.1       deraadt    48: People depend on knowing that all the readable files in a directory
1.8       aaron      49: have the word
                     50: .Dq text
                     51: printed.
                     52: Don't do as Berkeley did and change
                     53: .Dq shell commands text
                     54: to
                     55: .Dq shell script .
                     56: .Pp
1.1       deraadt    57: The filesystem tests are based on examining the return from a
1.8       aaron      58: .Xr stat 2
1.1       deraadt    59: system call.
                     60: The program checks to see if the file is empty,
                     61: or if it's some sort of special file.
                     62: Any known file types appropriate to the system you are running on
                     63: (sockets, symbolic links, or named pipes (FIFOs) on those systems that
                     64: implement them)
                     65: are intuited if they are defined in
                     66: the system header file
1.9     ! aaron      67: .Aq Pa sys/stat.h .
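
The filesystem tests described above can be sketched roughly as follows. This is a minimal illustration in Python, not the program's actual C code: the function name filesystem_test and the returned strings are assumptions made for this example, and only the idea of classifying a file from its stat(2) data comes from the text.

```python
import os
import stat


def filesystem_test(path, follow_symlinks=False):
    """Return a description if the stat data alone classifies the file, else None.

    Hypothetical helper; the descriptive strings are illustrative only.
    """
    st = os.stat(path) if follow_symlinks else os.lstat(path)
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link to " + os.readlink(path)
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None  # not classified here; fall through to the magic number tests
```

A None result stands for "no filesystem test succeeded", at which point the magic number tests would take over.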
1.8       aaron      68: .Pp
1.1       deraadt    69: The magic number tests are used to check for files with data in
                     70: particular fixed formats.
                     71: The canonical example of this is a binary executable (compiled program)
1.8       aaron      72: .Pa a.out
1.6       aaron      73: file, whose format is defined in
1.8       aaron      74: .Aq Pa a.out.h
1.1       deraadt    75: and possibly
1.8       aaron      76: .Aq Pa exec.h
1.1       deraadt    77: in the standard include directory.
1.8       aaron      78: These files have a
                     79: .Dq magic number
                     80: stored in a particular place
                     81: near the beginning of the file that tells the
                     82: .Ux
                     83: operating system
1.1       deraadt    84: that the file is a binary executable, and which of several types thereof.
1.8       aaron      85: .Pp
                     86: The concept of magic number has been applied by extension to data files.
1.1       deraadt    87: Any file with some invariant identifier at a small fixed
                     88: offset into the file can usually be described in this way.
                     89: The information in these files is read from the magic file
1.8       aaron      90: .Pa /etc/magic .
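
The idea of an invariant identifier at a small fixed offset can be sketched as a table lookup. The table below is a hypothetical in-memory stand-in for the magic file, with three widely known signatures chosen for this example; the real program reads its entries from the magic file, and the names MAGIC_TABLE and magic_test are assumptions.

```python
# Hypothetical stand-in for the magic file: (offset, expected bytes, description).
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF executable or object"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "POSIX tar archive"),
]


def magic_test(data):
    """Return the description of the first table entry that matches, else None."""
    for offset, expected, description in MAGIC_TABLE:
        # Slicing past the end of the data yields a short slice, never an error.
        if data[offset:offset + len(expected)] == expected:
            return description
    return None
```

Entries are tried in table order, which mirrors the point made later in this page that the order of entries in the magic file is significant.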
                     91: .Pp
1.1       deraadt    92: If an argument appears to be an
1.8       aaron      93: .Tn ASCII
1.1       deraadt    94: file,
1.8       aaron      95: .Nm
1.1       deraadt    96: attempts to guess its language.
1.4       millert    97: The language tests look for particular strings (cf
1.8       aaron      98: .Pa names.h )
1.1       deraadt    99: that can appear anywhere in the first few blocks of a file.
                    100: For example, the keyword
1.8       aaron     101: .Em .br
1.4       millert   102: indicates that the file is most likely a
1.8       aaron     103: .Xr troff 1
1.6       aaron     104: input file, just as the keyword
1.8       aaron     105: .Li struct
1.1       deraadt   106: indicates a C program.
                    107: These tests are less reliable than the previous
                    108: two groups, so they are performed last.
                    109: The language test routines also test for some miscellany
1.6       aaron     110: (such as
1.8       aaron     111: .Xr tar 1
1.1       deraadt   112: archives) and determine whether an unknown file should be
1.8       aaron     113: labelled as
                    114: .Dq ASCII text
                    115: or
                    116: .Dq data .
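
The language tests can be sketched as a keyword search over the start of the file. This is a simplified illustration: the two keywords are the ones named above, while the table name KEYWORDS, the 4096-byte window, the set of control characters accepted as text, and the returned strings are all assumptions made for this example.

```python
# Keywords taken from the examples in the text; descriptions are illustrative.
KEYWORDS = [
    (b".br", "troff input text"),
    (b"struct", "C program text"),
]


def language_test(data, window=4096):
    """Guess a language from the first `window` bytes, or fall back to text/data."""
    head = data[:window]
    # Treat the file as ASCII text only if every byte is printable or common whitespace.
    if not all(0x20 <= b <= 0x7E or b in b"\t\n\r\f" for b in head):
        return "data"
    for keyword, description in KEYWORDS:
        if keyword in head:
            return description
    return "ASCII text"
```

Because a keyword may appear anywhere in the window, a C file that happens to contain ".br" would be misreported here, which illustrates why these tests are less reliable than the others.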
                    117: .Pp
                    118: The options are as follows:
                    119: .Bl -tag -width indent
                    120: .It Fl v
1.1       deraadt   121: Print the version of the program and exit.
1.8       aaron     122: .It Fl m Ar list
                    123: Specify an alternate
                    124: .Ar list
                    125: of files containing magic numbers.
1.2       deraadt   126: This can be a single file, or a colon-separated list of files.
1.8       aaron     127: .It Fl z
1.1       deraadt   128: Try to look inside compressed files.
1.8       aaron     129: .It Fl c
1.1       deraadt   130: Cause a checking printout of the parsed form of the magic file.
1.6       aaron     131: This is usually used in conjunction with
1.8       aaron     132: .Fl m
1.1       deraadt   133: to debug a new magic file before installing it.
1.8       aaron     134: .It Fl f Ar namefile
1.6       aaron     135: Read the names of the files to be examined from
1.8       aaron     136: .Ar namefile
1.6       aaron     137: (one per line)
1.1       deraadt   138: before the argument list.
1.6       aaron     139: Either
1.8       aaron     140: .Ar namefile
1.1       deraadt   141: or at least one filename argument must be present;
1.8       aaron     142: to test the standard input, use
                    143: .Dq -
                    144: as a filename argument.
                    145: .It Fl L
                    146: Cause symlinks to be followed, as the like-named option in
                    147: .Xr ls 1
1.1       deraadt   148: (on systems that support symbolic links).
1.8       aaron     149: .El
                    150: .Sh FILES
                    151: .Bl -tag -width /etc/magic -compact
                    152: .It Pa /etc/magic
                    153: default list of magic numbers
                    154: .El
                    155: .Sh ENVIRONMENT
                    156: The following environment variables affect the execution of
                    157: .Nm file :
                    158: .Pp
                    159: .Bl -tag -width indent
                    160: .It Ev MAGIC
                    161: Default magic number files.
                    162: .El
                    163: .Sh SEE ALSO
                    164: .Xr hexdump 1 ,
                    165: .Xr od 1 ,
                    166: .Xr strings 1 ,
                    167: .Xr magic 5
                    168: .Sh STANDARDS CONFORMANCE
1.1       deraadt   169: This program is believed to exceed the System V Interface Definition
                    170: of FILE(CMD), as near as one can determine from the vague language
1.6       aaron     171: contained therein.
1.1       deraadt   172: Its behaviour is mostly compatible with the System V program of the same name.
                    173: This version knows more magic, however, so it will produce
1.6       aaron     174: different (albeit more accurate) output in many cases.
1.8       aaron     175: .Pp
1.6       aaron     176: The one significant difference
1.1       deraadt   177: between this version and System V
1.8       aaron     178: is that this version treats any white space
1.1       deraadt   179: as a delimiter, so that spaces in pattern strings must be escaped.
                    180: For example,
1.8       aaron     181: .Pp
                    182: >10     string  language impress\       (imPRESS data)
                    183: .Pp
1.1       deraadt   184: in an existing magic file would have to be changed to
1.8       aaron     185: .Pp
                    186: >10     string  language\e impress      (imPRESS data)
                    187: .Pp
1.1       deraadt   188: In addition, in this version, if a pattern string contains a backslash,
1.9     ! aaron     189: it must be escaped.
        !           190: For example,
1.8       aaron     191: .Pp
                    192: 0       string          \ebegindata     Andrew Toolkit document
                    193: .Pp
1.1       deraadt   194: in an existing magic file would have to be changed to
1.8       aaron     195: .Pp
                    196: 0       string          \e\ebegindata   Andrew Toolkit document
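
The two escaping rules above can be sketched as a small field splitter. This is a simplified model and not the program's parser: the function name split_magic_line is hypothetical, and treating a backslash as "take the next character literally" is an assumption that covers only the two cases shown here.

```python
def split_magic_line(line):
    """Split a magic line into [offset, type, test, message].

    Any unescaped white space ends one of the first three fields; a backslash
    makes the following character (a space or another backslash) literal.
    """
    fields, current, i = [], "", 0
    while i < len(line) and len(fields) < 3:
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            current += line[i + 1]  # escaped character is kept verbatim
            i += 2
        elif ch.isspace():
            if current:
                fields.append(current)
                current = ""
            i += 1
        else:
            current += ch
            i += 1
    if current:
        fields.append(current)
    fields.append(line[i:].strip())  # the rest of the line is the message
    return fields
```

With this model, the pattern in the first corrected entry is read as the single string "language impress", and the pattern in the second as a backslash followed by "begindata".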
                    197: .Pp
1.1       deraadt   198: SunOS releases 3.2 and later from Sun Microsystems include a
1.8       aaron     199: .Xr file 1
1.1       deraadt   200: command derived from the System V one, but with some extensions.
                    201: My version differs from Sun's only in minor ways.
1.8       aaron     202: It includes the extension of the
                    203: .Ql &
                    204: operator, used as,
1.1       deraadt   205: for example,
1.8       aaron     206: .Pp
                    207: >16     long&0x7fffffff >0              not stripped
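
The entry above can be read as: take the long at offset 16, mask it with 0x7fffffff, and succeed if the result is greater than zero. A minimal sketch of that reading follows; the function name masked_long_test is hypothetical, and the choice of a 4-byte signed value in native byte order is an assumption made for this example.

```python
import struct


def masked_long_test(data, offset=16, mask=0x7FFFFFFF):
    """Return True if (long at offset) & mask is greater than zero."""
    if len(data) < offset + 4:
        return False  # not enough data to read the value
    # "=l" is a 4-byte signed integer in the native byte order.
    (value,) = struct.unpack_from("=l", data, offset)
    return (value & mask) > 0
```

The mask discards the top bit, so a value with only that bit set does not match.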
                    208: .Sh MAGIC DIRECTORY
1.1       deraadt   209: The magic file entries have been collected from various sources,
                    210: mainly USENET, and contributed by various authors.
1.8       aaron     211: .An Christos Zoulas
                    212: (address below) will collect additional
1.1       deraadt   213: or corrected magic file entries.
1.6       aaron     214: A consolidation of magic file entries
1.1       deraadt   215: will be distributed periodically.
                    216: The order of entries in the magic file is significant.
                    217: Depending on what system you are using, the order that
                    218: they are put together may be incorrect.
                    219: If your old
1.8       aaron     220: .Nm
1.1       deraadt   221: command uses a magic file,
                    222: keep the old magic file around for comparison purposes
1.6       aaron     223: (rename it to
1.8       aaron     224: .Pa /etc/magic.orig ) .
                    225: .Sh HISTORY
1.6       aaron     226: There has been a
1.8       aaron     227: .Nm
                    228: command in every
                    229: .Ux
                    230: since at least Research Version 6
1.1       deraadt   231: (man page dated January, 1975).
                    232: The System V version introduced one significant change:
                    233: the external list of magic number types.
                    234: This slowed the program down slightly but made it a lot more flexible.
1.8       aaron     235: .Pp
1.1       deraadt   236: This program, based on the System V version,
1.8       aaron     237: was written by
                    238: .An Ian Darwin
                    239: without looking at anybody else's source code.
                    240: .Pp
                    241: .An John Gilmore
                    242: revised the code extensively, making it better than
1.1       deraadt   243: the first version.
1.8       aaron     244: .An Geoff Collyer
                    245: found several inadequacies
1.1       deraadt   246: and provided some magic file entries.
                    247: The program has undergone continued evolution since.
1.8       aaron     248: .Sh AUTHORS
                    249: Written by
                    250: .An Ian F. Darwin Aq ian@sq.com ,
                    251: UUCP address {utzoo | ihnp4}!darwin!ian,
                    252: postal address: P.O. Box 603, Station F, Toronto, Ontario, CANADA M4Y 2L8.
                    253: .Pp
                    254: Altered by
                    255: .An Rob McMahon Aq cudcv@warwick.ac.uk ,
                    256: 1989, to extend the
                    257: .Ql &
                    258: operator from simple
                    259: .Dq x&y != 0
                    260: to
                    261: .Dq x&y op z .
                    262: .Pp
                    263: Altered by
                    264: .An Guy Harris Aq guy@auspex.com ,
                    265: 1993, to:
                    266: .Bl -item -offset indent
                    267: .It
                    268: put the
                    269: .Dq old-style
                    270: .Ql &
                    271: operator back the way it was, because
                    272: .Bl -enum -offset indent
                    273: .It
                    274: Rob McMahon's change broke the
                    275: previous style of usage,
                    276: .It
                    277: The SunOS
                    278: .Dq new-style
                    279: .Ql &
                    280: operator, which this version of
                    281: .Nm
                    282: supports, also handles
                    283: .Dq x&y op z ,
                    284: .It
                    285: Rob's change wasn't documented in any case;
                    286: .El
                    287: .It
                    288: put in multiple levels of
                    289: .Ql > ;
                    290: .It
                    291: put in
                    292: .Dq beshort ,
                    293: .Dq leshort ,
                    294: etc. keywords to look at numbers in the
1.1       deraadt   295: file in a specific byte order, rather than in the native byte order of
                    296: the process running
1.8       aaron     297: .Nm file .
                    298: .El
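
The effect of the byte-order keywords can be sketched as follows. Only the keyword names beshort and leshort come from the text; the function name read_short is hypothetical, and reading the value as an unsigned 16-bit quantity is an assumption made for this example.

```python
import struct

# Map each keyword to a struct format with an explicit byte order.
SHORT_FORMATS = {
    "beshort": ">H",  # big-endian 16-bit value
    "leshort": "<H",  # little-endian 16-bit value
}


def read_short(data, offset, kind):
    """Read a 16-bit value at `offset` in the byte order named by `kind`."""
    (value,) = struct.unpack_from(SHORT_FORMATS[kind], data, offset)
    return value
```

The same two bytes yield different numbers under each keyword, independent of the byte order of the machine running the code.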
                    299: .Pp
                    300: Changes by
                    301: .An Ian Darwin
                    302: and various authors including
                    303: .An Christos Zoulas Aq christos@deshaw.com ,
                    304: 1990-1992.
                    305: .Sh LEGAL NOTICE
                    306: Copyright (c) Ian F. Darwin, Toronto, Canada,
                    307: 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993.
                    308: .Pp
                    309: This software is not subject to and may not be made subject to any
                    310: license of the American Telephone and Telegraph Company, Sun
                    311: Microsystems Inc., Digital Equipment Inc., Lotus Development Inc., the
                    312: Regents of the University of California, The X Consortium or MIT, or
                    313: The Free Software Foundation.
                    314: .Pp
                    315: This software is not subject to any export provision of the United States
                    316: Department of Commerce, and may be exported to any country or planet.
                    317: .Pp
                    318: Permission is granted to anyone to use this software for any purpose on
                    319: any computer system, and to alter it and redistribute it freely, subject
                    320: to the following restrictions:
                    321: .Bl -enum -offset indent
                    322: .It
                    323: The author is not responsible for the consequences of use of this
                    324: software, no matter how awful, even if they arise from flaws in it;
                    325: .It
                    326: The origin of this software must not be misrepresented, either by
1.9     ! aaron     327: explicit claim or by omission.
        !           328: Since few users ever read sources,
1.8       aaron     329: credits must appear in the documentation;
                    330: .It
                    331: Altered versions must be plainly marked as such, and must not be
1.9     ! aaron     332: misrepresented as being the original software.
        !           333: Since few users ever read sources, credits must appear in the documentation;
1.8       aaron     334: .It
                    335: This notice may not be removed or altered.
                    336: .El
                    337: .Pp
                    338: A few support files
                    339: .Pf ( Fn getopt ,
                    340: .Fn strtok )
1.1       deraadt   341: distributed with this package
1.8       aaron     342: are by
                    343: .An Henry Spencer
                    344: and are subject to the same terms as above.
                    345: .Pp
                    346: A few simple support files
                    347: .Pf ( Fn strtol ,
                    348: .Fn strchr )
1.1       deraadt   349: distributed with this package
                    350: are in the public domain; they are so marked.
1.8       aaron     351: .Pp
1.1       deraadt   352: The files
1.8       aaron     353: .Pa tar.h
1.1       deraadt   354: and
1.8       aaron     355: .Pa is_tar.c
                    356: were written by
                    357: .An John Gilmore
                    358: from his public-domain
                    359: .Nm tar
1.1       deraadt   360: program, and are not covered by the above restrictions.
1.8       aaron     361: .Sh BUGS
1.1       deraadt   362: There must be a better way to automate the construction of the Magic
1.8       aaron     363: file from all the glop in Magdir.
                    364: What is it?
1.1       deraadt   365: Better yet, the magic file should be compiled into binary (say,
1.8       aaron     366: .Xr ndbm 3
1.4       millert   367: or, better yet, fixed-length
1.8       aaron     368: .Tn ASCII
1.4       millert   369: strings for use in heterogeneous network environments) for faster startup.
1.1       deraadt   370: Then the program would run as fast as the Version 7 program of the same name,
                    371: with the flexibility of the System V version.
1.8       aaron     372: .Pp
                    373: .Nm
1.1       deraadt   374: uses several algorithms that favor speed over accuracy;
1.4       millert   375: thus it can be misled about the contents of
1.8       aaron     376: .Tn ASCII
1.4       millert   377: files.
1.8       aaron     378: .Pp
1.4       millert   379: The support for
1.8       aaron     380: .Tn ASCII
1.4       millert   381: files (primarily for programming languages)
1.1       deraadt   382: is simplistic, inefficient and requires recompilation to update.
1.8       aaron     383: .Pp
                    384: There should be an
                    385: .Dq else
                    386: clause to follow a series of continuation lines.
                    387: .Pp
1.1       deraadt   388: The magic file and keywords should have regular expression support.
1.4       millert   389: Their use of
1.8       aaron     390: .Tn ASCII TAB
1.4       millert   391: as a field delimiter is ugly and makes
1.1       deraadt   392: it hard to edit the files, but is entrenched.
1.8       aaron     393: .Pp
1.1       deraadt   394: It might be advisable to allow upper-case letters in keywords
1.4       millert   395: for, e.g.,
1.8       aaron     396: .Xr troff 1
1.4       millert   397: commands vs man page macros.
1.1       deraadt   398: Regular expression support would make this easy.
1.8       aaron     399: .Pp
1.1       deraadt   400: The program doesn't grok \s-2FORTRAN\s0.
1.6       aaron     401: It should be able to figure \s-2FORTRAN\s0 by seeing some keywords which
1.1       deraadt   402: appear indented at the start of line.
                    403: Regular expression support would make this easy.
1.8       aaron     404: .Pp
1.6       aaron     405: The list of keywords in
1.8       aaron     406: .Em ascmagic
1.1       deraadt   407: probably belongs in the Magic file.
1.8       aaron     408: This could be done by using some keyword like
                    409: .Ql *
                    410: for the offset value.
                    411: .Pp
                    412: Another optimization would be to sort
1.1       deraadt   413: the magic file so that we can just run down all the
                    414: tests for the first byte, first word, first long, etc., once we
1.9     ! aaron     415: have fetched it.
        !           416: Complain about conflicts in the magic file entries.
1.1       deraadt   417: Make a rule that the magic entries sort based on file offset rather
                    418: than position within the magic file?
1.8       aaron     419: .Pp
1.6       aaron     420: The program should provide a way to give an estimate
1.8       aaron     421: of
                    422: .Dq how good
                    423: a guess is.
                    424: We end up removing guesses (e.g.,
                    425: .Dq "From "
                    426: as first 5 chars of file) because
                    427: they are not as good as other guesses (e.g.,
                    428: .Dq Newsgroups:
                    429: versus
                    430: .Qq Return-Path: ) .
                    431: Still, if the others don't pan out, it should be
1.6       aaron     432: possible to use the first guess.
1.8       aaron     433: .Pp
                    434: This program is slower than some vendors'
                    435: .Nm
                    436: commands.
                    437: .Pp
1.1       deraadt   438: This manual page, and particularly this section, is too long.
1.8       aaron     439: .Sh AVAILABILITY
1.1       deraadt   440: You can obtain the original author's latest version by anonymous FTP
1.8       aaron     441: on
                    442: .Em ftp.deshaw.com
                    443: in the directory
                    444: .Pa /pub/file/file-X.YY.tar.gz
                    445: