Annotation of src/usr.bin/file/file.1, Revision 1.19

1.19    ! ian         1: .\" $OpenBSD: file.1,v 1.18 2003/02/15 09:44:42 jmc Exp $
1.8       aaron       2: .\" $FreeBSD: src/usr.bin/file/file.1,v 1.16 2000/03/01 12:19:39 sheldonh Exp $
1.18      jmc         3: .\"
1.19    ! ian         4: .\" Copyright (c) Ian F. Darwin 1986-1995.
        !             5: .\" Software written by Ian F. Darwin and others;
        !             6: .\" maintained 1995-present by Christos Zoulas and others.
        !             7: .\"
        !             8: .\" Redistribution and use in source and binary forms, with or without
        !             9: .\" modification, are permitted provided that the following conditions
        !            10: .\" are met:
        !            11: .\" 1. Redistributions of source code must retain the above copyright
        !            12: .\"    notice immediately at the beginning of the file, without modification,
        !            13: .\"    this list of conditions, and the following disclaimer.
        !            14: .\" 2. Redistributions in binary form must reproduce the above copyright
        !            15: .\"    notice, this list of conditions and the following disclaimer in the
        !            16: .\"    documentation and/or other materials provided with the distribution.
        !            17: .\" 3. All advertising materials mentioning features or use of this software
        !            18: .\"    must display the following acknowledgement:
        !            19: .\"    This product includes software developed by Ian F. Darwin and others.
        !            20: .\" 4. The name of the author may not be used to endorse or promote products
        !            21: .\"    derived from this software without specific prior written permission.
        !            22: .\"
        !            23: .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
        !            24: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        !            25: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
        !            26: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
        !            27: .\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        !            28: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
        !            29: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
        !            30: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
        !            31: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
        !            32: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
        !            33: .\" SUCH DAMAGE.
1.18      jmc        34: .\"
1.8       aaron      35: .Dd July 30, 1997
                     36: .Dt FILE 1
                     37: .Os
                     38: .Sh NAME
                     39: .Nm file
                     40: .Nd determine file type
                     41: .Sh SYNOPSIS
                     42: .Nm file
1.17      millert    43: .Op Fl vbczL
1.8       aaron      44: .Op Fl f Ar namefile
                     45: .Op Fl m Ar magicfiles
                     46: .Ar file Op Ar ...
                     47: .Sh DESCRIPTION
1.4       millert    48: This manual page documents version 3.22 of the
1.8       aaron      49: .Nm
1.4       millert    50: command.
1.8       aaron      51: .Nm
1.1       deraadt    52: tests each argument in an attempt to classify it.
                     53: There are three sets of tests, performed in this order:
                     54: filesystem tests, magic number tests, and language tests.
1.8       aaron      55: The first test that succeeds causes the file type to be printed.
                     56: .Pp
1.1       deraadt    57: The type printed will usually contain one of the words
1.8       aaron      58: .Dq text
1.4       millert    59: (the file contains only
1.8       aaron      60: .Tn ASCII
1.4       millert    61: characters and is probably safe to read on an
1.8       aaron      62: .Tn ASCII
1.4       millert    63: terminal),
1.8       aaron      64: .Dq executable
1.1       deraadt    65: (the file contains the result of compiling a program
1.8       aaron      66: in a form understandable to some
                     67: .Ux
                     68: kernel or another),
1.1       deraadt    69: or
1.8       aaron      70: .Dq data
                     71: meaning anything else (data is usually binary or non-printable).
                     72: .Pp
1.1       deraadt    73: Exceptions are well-known file formats (core files, tar archives)
                     74: that are known to contain binary data.
                     75: When modifying the file
1.8       aaron      76: .Pa /etc/magic
1.6       aaron      77: or the program itself,
1.8       aaron      78: .Em "preserve these keywords" .
                     79: .Pp
1.1       deraadt    80: People depend on knowing that all the readable files in a directory
1.8       aaron      81: have the word
                     82: .Dq text
                     83: printed.
                     84: Don't do as Berkeley did and change
                     85: .Dq shell commands text
                     86: to
                     87: .Dq shell script .
                     88: .Pp
1.1       deraadt    89: The filesystem tests are based on examining the return from a
1.8       aaron      90: .Xr stat 2
1.1       deraadt    91: system call.
                     92: The program checks to see if the file is empty,
                     93: or if it's some sort of special file.
                     94: Any known file types appropriate to the system you are running on
                     95: (sockets, symbolic links, or named pipes (FIFOs) on those systems that
                     96: implement them)
                     97: are intuited if they are defined in
                     98: the system header file
1.9       aaron      99: .Aq Pa sys/stat.h .
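As a rough illustration of the filesystem tests described above, the following Python sketch classifies a path using only the result of lstat(2). It is not the code used by file itself; the function name fs_test and the returned strings are assumptions made for this example.

```python
import os
import stat


def fs_test(path):
    """Classify a path from lstat(2) alone (hypothetical helper).

    Returns a description, or None when the file is an ordinary
    non-empty file that needs the magic or language tests.
    """
    st = os.lstat(path)  # lstat, so symbolic links are not followed
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISREG(mode) and st.st_size == 0:
        return "empty"
    return None
```

A result of None means that no filesystem test matched, so the later test groups would be tried next.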
1.8       aaron     100: .Pp
1.1       deraadt   101: The magic number tests are used to check for files with data in
                    102: particular fixed formats.
                    103: The canonical example of this is a binary executable (compiled program)
1.8       aaron     104: .Pa a.out
1.6       aaron     105: file, whose format is defined in
1.8       aaron     106: .Aq Pa a.out.h
1.1       deraadt   107: and possibly
1.8       aaron     108: .Aq Pa exec.h
1.1       deraadt   109: in the standard include directory.
1.8       aaron     110: These files have a
                    111: .Dq magic number
                    112: stored in a particular place
                    113: near the beginning of the file that tells the
                    114: .Ux
                    115: operating system
1.1       deraadt   116: that the file is a binary executable, and which of several types thereof.
1.8       aaron     117: .Pp
                    118: The concept of magic number has been applied by extension to data files.
1.1       deraadt   119: Any file with some invariant identifier at a small fixed
                    120: offset into the file can usually be described in this way.
                    121: The information identifying these files is read from the magic file
1.8       aaron     122: .Pa /etc/magic .
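The idea of an invariant identifier at a small fixed offset can be sketched in a few lines of Python. The table below is a hypothetical, hard-coded stand-in for the magic file (the real entries are read from /etc/magic), and the names MAGIC_TABLE and magic_test are assumptions made for this example.

```python
# Hypothetical in-memory table: (offset, magic bytes, description).
# The real program reads its tests from the magic file instead.
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF executable or object"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "POSIX tar archive"),
]


def magic_test(data):
    """Return the description of the first entry whose bytes match at
    its offset, or None if no entry matches."""
    for offset, magic, description in MAGIC_TABLE:
        if data[offset:offset + len(magic)] == magic:
            return description
    return None
```

As in the magic file, the first matching entry wins, so the order of the table matters.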
                    123: .Pp
1.1       deraadt   124: If an argument appears to be an
1.8       aaron     125: .Tn ASCII
1.1       deraadt   126: file,
1.8       aaron     127: .Nm
1.1       deraadt   128: attempts to guess its language.
1.4       millert   129: The language tests look for particular strings (cf.
1.8       aaron     130: .Pa names.h )
1.1       deraadt   131: that can appear anywhere in the first few blocks of a file.
                    132: For example, the keyword
1.8       aaron     133: .Em .br
1.4       millert   134: indicates that the file is most likely a
1.8       aaron     135: .Xr troff 1
1.6       aaron     136: input file, just as the keyword
1.8       aaron     137: .Li struct
1.1       deraadt   138: indicates a C program.
                    139: These tests are less reliable than the previous
                    140: two groups, so they are performed last.
                    141: The language test routines also test for some miscellany
1.6       aaron     142: (such as
1.8       aaron     143: .Xr tar 1
1.1       deraadt   144: archives) and determine whether an unknown file should be
1.8       aaron     145: labelled as
                    146: .Dq ASCII text
                    147: or
                    148: .Dq data .
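A much simplified sketch of the language tests, in Python, might look like the following. The keyword table holds only the two examples given above; the block size, the number of blocks scanned, the returned strings, and the plain substring search are all assumptions of this sketch and are cruder than what the program actually does.

```python
# Hypothetical keyword table in the spirit of names.h; the two entries
# are the examples mentioned in the text.
KEYWORDS = [
    (b".br", "troff input text"),
    (b"struct", "C program text"),
]

BLOCK_SIZE = 512  # assumed block size
MAX_BLOCKS = 4    # assumed meaning of "the first few blocks"


def language_test(data):
    """Guess a language from keywords near the start of the file."""
    head = data[:BLOCK_SIZE * MAX_BLOCKS]
    # Anything outside 7-bit ASCII, or a NUL byte, is treated as data.
    if b"\0" in head or any(byte > 0x7F for byte in head):
        return "data"
    for keyword, description in KEYWORDS:
        if keyword in head:
            return description
    return "ASCII text"
```

Because this sketch matches substrings rather than whole words, a file that merely contains the word "structure" would also be reported as C, which illustrates why these tests are the least reliable group.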
                    149: .Pp
                    150: The options are as follows:
1.11      aaron     151: .Bl -tag -width Ds
1.8       aaron     152: .It Fl v
1.1       deraadt   153: Print the version of the program and exit.
1.8       aaron     154: .It Fl m Ar list
                    155: Specify an alternate
                    156: .Ar list
                    157: of files containing magic numbers.
1.2       deraadt   158: This can be a single file, or a colon-separated list of files.
1.8       aaron     159: .It Fl z
1.1       deraadt   160: Try to look inside compressed files.
1.17      millert   161: .It Fl b
                    162: Do not prepend filenames to output lines (brief mode).
1.8       aaron     163: .It Fl c
1.1       deraadt   164: Cause a checking printout of the parsed form of the magic file.
1.6       aaron     165: This is usually used in conjunction with
1.8       aaron     166: .Fl m
1.1       deraadt   167: to debug a new magic file before installing it.
1.8       aaron     168: .It Fl f Ar namefile
1.6       aaron     169: Read the names of the files to be examined from
1.8       aaron     170: .Ar namefile
1.6       aaron     171: (one per line)
1.1       deraadt   172: before the argument list.
1.6       aaron     173: Either
1.8       aaron     174: .Ar namefile
1.1       deraadt   175: or at least one filename argument must be present;
1.8       aaron     176: to test the standard input, use
                    177: .Dq -
                    178: as a filename argument.
                    179: .It Fl L
                    180: Cause symlinks to be followed, as with the like-named option in
                    181: .Xr ls 1
1.1       deraadt   182: (on systems that support symbolic links).
1.8       aaron     183: .El
                    184: .Sh ENVIRONMENT
                    185: .Bl -tag -width indent
1.13      smart     186: .It Ev MAGIC
1.8       aaron     187: Default magic number files.
                    188: .El
1.12      aaron     189: .Sh FILES
                    190: .Bl -tag -width /etc/magic -compact
                    191: .It Pa /etc/magic
                    192: default list of magic numbers
                    193: .El
1.8       aaron     194: .Sh SEE ALSO
                    195: .Xr hexdump 1 ,
                    196: .Xr od 1 ,
                    197: .Xr strings 1 ,
                    198: .Xr magic 5
                    199: .Sh STANDARDS CONFORMANCE
1.1       deraadt   200: This program is believed to exceed the System V Interface Definition
                    201: of FILE(CMD), as near as one can determine from the vague language
1.6       aaron     202: contained therein.
1.1       deraadt   203: Its behaviour is mostly compatible with the System V program of the same name.
                    204: This version knows more magic, however, so it will produce
1.6       aaron     205: different (albeit more accurate) output in many cases.
1.8       aaron     206: .Pp
1.6       aaron     207: The one significant difference
1.1       deraadt   208: between this version and System V
1.8       aaron     209: is that this version treats any white space
1.1       deraadt   210: as a delimiter, so that spaces in pattern strings must be escaped.
                    211: For example,
1.8       aaron     212: .Pp
                    213: >10     string  language impress\       (imPRESS data)
                    214: .Pp
1.1       deraadt   215: in an existing magic file would have to be changed to
1.8       aaron     216: .Pp
                    217: >10     string  language\e impress      (imPRESS data)
                    218: .Pp
1.1       deraadt   219: In addition, in this version, if a pattern string contains a backslash,
1.9       aaron     220: it must be escaped.
                    221: For example
1.8       aaron     222: .Pp
                    223: 0       string          \ebegindata     Andrew Toolkit document
                    224: .Pp
1.1       deraadt   225: in an existing magic file would have to be changed to
1.8       aaron     226: .Pp
                    227: 0       string          \e\ebegindata   Andrew Toolkit document
                    228: .Pp
1.1       deraadt   229: SunOS releases 3.2 and later from Sun Microsystems include a
1.8       aaron     230: .Xr file 1
1.1       deraadt   231: command derived from the System V one, but with some extensions.
                    232: My version differs from Sun's only in minor ways.
1.8       aaron     233: It includes the extension of the
                    234: .Ql &
                    235: operator, used as,
1.1       deraadt   236: for example,
1.8       aaron     237: .Pp
                    238: >16     long&0x7fffffff >0              not stripped
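The entry above can be read as: take the long at offset 16, AND it with 0x7fffffff, and report "not stripped" if the result is greater than 0. A minimal Python sketch of that evaluation follows; it assumes a 4-byte long in native byte order, and the names OPS and masked_long_test are hypothetical.

```python
import operator
import struct

# Comparison operators supported by this sketch.
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def masked_long_test(data, offset, mask, op, operand):
    """Read a 4-byte integer at offset (native byte order), AND it
    with mask, and compare the result with operand using op."""
    if len(data) < offset + 4:
        return False  # not enough data for the test to apply
    (value,) = struct.unpack_from("=l", data, offset)
    return OPS[op](value & mask, operand)
```

With this helper, the entry corresponds to masked_long_test(data, 16, 0x7fffffff, ">", 0).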
                    239: .Sh MAGIC DIRECTORY
1.1       deraadt   240: The magic file entries have been collected from various sources,
                    241: mainly USENET, and contributed by various authors.
1.8       aaron     242: .An Christos Zoulas
                    243: (address below) will collect additional
1.1       deraadt   244: or corrected magic file entries.
1.6       aaron     245: A consolidation of magic file entries
1.1       deraadt   246: will be distributed periodically.
                    247: The order of entries in the magic file is significant.
                    248: Depending on what system you are using, the order that
                    249: they are put together may be incorrect.
                    250: If your old
1.8       aaron     251: .Nm
1.1       deraadt   252: command uses a magic file,
                    253: keep the old magic file around for comparison purposes
1.6       aaron     254: (rename it to
1.8       aaron     255: .Pa /etc/magic.orig ) .
                    256: .Sh HISTORY
1.6       aaron     257: There has been a
1.8       aaron     258: .Nm
                    259: command in every
                    260: .Ux
1.16      mickey    261: since at least Research Version 4
                    262: (man page dated November, 1973).
1.1       deraadt   263: The System V version introduced one significant change:
                    264: the external list of magic number types.
                    265: This slowed the program down slightly but made it a lot more flexible.
1.8       aaron     266: .Pp
1.10      ian       267: This program, based on the System V version, was written by
                    268: .An Ian F. Darwin Aq ian@darwinisys.com
1.8       aaron     269: without looking at anybody else's source code.
                    270: .Pp
                    271: .An John Gilmore
                    272: revised the code extensively, making it better than
1.1       deraadt   273: the first version.
1.8       aaron     274: .An Geoff Collyer
                    275: found several inadequacies
1.1       deraadt   276: and provided some magic file entries.
1.8       aaron     277: .Pp
                    278: Altered by
                    279: .An Rob McMahon Aq cudcv@warwick.ac.uk ,
                    280: 1989, to extend the
                    281: .Ql &
                    282: operator from simple
                    283: .Dq x&y != 0
                    284: to
                    285: .Dq x&y op z .
                    286: .Pp
                    287: Altered by
                    288: .An Guy Harris Aq guy@auspex.com ,
                    289: 1993, to:
                    290: .Bl -item -offset indent
                    291: .It
                    292: put the
                    293: .Dq old-style
                    294: .Ql &
                    295: operator back the way it was, because
                    296: .Bl -enum -offset indent
                    297: .It
                    298: Rob McMahon's change broke the
                    299: previous style of usage,
                    300: .It
                    301: The SunOS
                    302: .Dq new-style
                    303: .Ql &
                    304: operator, which this version of
                    305: .Nm
                    306: supports, also handles
                    307: .Dq x&y op z ,
                    308: .It
                    309: Rob's change wasn't documented in any case;
                    310: .El
                    311: .It
                    312: put in multiple levels of
                    313: .Ql > ;
                    314: .It
                    315: put in
                    316: .Dq beshort ,
                    317: .Dq leshort ,
                    318: etc. keywords to look at numbers in the
1.1       deraadt   319: file in a specific byte order, rather than in the native byte order of
                    320: the process running
1.8       aaron     321: .Nm file .
                    322: .El
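The difference between the byte-order keywords and the native-order one can be shown with a short Python sketch. The function name read_short is hypothetical, and reading the value as unsigned is an assumption made here for simplicity.

```python
import struct

# struct format for each keyword: big-endian, little-endian, native.
SHORT_FORMATS = {"beshort": ">H", "leshort": "<H", "short": "=H"}


def read_short(data, offset, kind):
    """Read a 2-byte unsigned integer at offset in the byte order
    implied by the keyword kind."""
    (value,) = struct.unpack_from(SHORT_FORMATS[kind], data, offset)
    return value
```

For the two bytes 0x12 0x34, "beshort" yields 0x1234 and "leshort" yields 0x3412 on every machine, whereas plain "short" depends on the machine running the test.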
                    323: .Pp
1.10      ian       324: Currently maintained by
                    325: .An Christos Zoulas Aq christos@zoulas.com .
1.8       aaron     326: .Sh LEGAL NOTICE
1.10      ian       327: Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
                    328: Covered by the standard Berkeley Software Distribution copyright; see the file
                    329: LEGAL.NOTICE in the distribution.
1.8       aaron     330: .Pp
1.1       deraadt   331: The files
1.8       aaron     332: .Pa tar.h
1.1       deraadt   333: and
1.8       aaron     334: .Pa is_tar.c
                    335: were written by
                    336: .An John Gilmore
                    337: from his public-domain
                    338: .Nm tar
1.10      ian       339: program.
1.8       aaron     340: .Sh BUGS
1.1       deraadt   341: There must be a better way to automate the construction of the Magic
1.8       aaron     342: file from all the glop in Magdir.
                    343: What is it?
1.1       deraadt   344: Better yet, the magic file should be compiled into binary (say,
1.8       aaron     345: .Xr ndbm 3
1.4       millert   346: or, better yet, fixed-length
1.8       aaron     347: .Tn ASCII
1.4       millert   348: strings for use in heterogeneous network environments) for faster startup.
1.1       deraadt   349: Then the program would run as fast as the Version 7 program of the same name,
                    350: with the flexibility of the System V version.
1.8       aaron     351: .Pp
                    352: .Nm
1.15      pjanzen   353: uses several algorithms that favor speed over accuracy;
1.4       millert   354: thus it can be misled about the contents of
1.8       aaron     355: .Tn ASCII
1.4       millert   356: files.
1.8       aaron     357: .Pp
1.4       millert   358: The support for
1.8       aaron     359: .Tn ASCII
1.4       millert   360: files (primarily for programming languages)
1.1       deraadt   361: is simplistic, inefficient and requires recompilation to update.
1.8       aaron     362: .Pp
                    363: There should be an
                    364: .Dq else
                    365: clause to follow a series of continuation lines.
                    366: .Pp
1.1       deraadt   367: The magic file and keywords should have regular expression support.
1.4       millert   368: Their use of
1.8       aaron     369: .Tn ASCII TAB
1.4       millert   370: as a field delimiter is ugly and makes
1.1       deraadt   371: it hard to edit the files, but is entrenched.
1.8       aaron     372: .Pp
1.1       deraadt   373: It might be advisable to allow upper-case letters in keywords
1.4       millert   374: for, e.g.,
1.8       aaron     375: .Xr troff 1
1.4       millert   376: commands vs man page macros.
1.1       deraadt   377: Regular expression support would make this easy.
1.8       aaron     378: .Pp
1.1       deraadt   379: The program doesn't grok \s-2FORTRAN\s0.
1.6       aaron     380: It should be able to figure out \s-2FORTRAN\s0 by seeing some keywords that
1.1       deraadt   381: appear indented at the start of line.
                    382: Regular expression support would make this easy.
1.8       aaron     383: .Pp
1.6       aaron     384: The list of keywords in
1.8       aaron     385: .Em ascmagic
1.1       deraadt   386: probably belongs in the Magic file.
1.8       aaron     387: This could be done by using some keyword like
                    388: .Ql *
                    389: for the offset value.
                    390: .Pp
                    391: Another optimization would be to sort
1.1       deraadt   392: the magic file so that we can just run down all the
                    393: tests for the first byte, first word, first long, etc., once we
1.9       aaron     394: have fetched it.
                    395: Complain about conflicts in the magic file entries.
1.1       deraadt   396: Make a rule that the magic entries sort based on file offset rather
                    397: than position within the magic file?
1.8       aaron     398: .Pp
1.6       aaron     399: The program should provide a way to give an estimate
1.8       aaron     400: of
                    401: .Dq how good
                    402: a guess is.
                    403: We end up removing guesses (e.g.,
                    404: .Dq "From "
                    405: as first 5 chars of file) because
                    406: they are not as good as other guesses (e.g.,
                    407: .Dq Newsgroups:
                    408: versus
                    409: .Qq Return-Path: ) .
                    410: Still, if the others don't pan out, it should be
1.6       aaron     411: possible to use the first guess.
1.8       aaron     412: .Pp
                    413: This program is slower than some vendors'
                    414: .Nm
                    415: commands.
                    416: .Pp
1.1       deraadt   417: This manual page, and particularly this section, is too long.
1.8       aaron     418: .Sh AVAILABILITY
1.1       deraadt   419: You can obtain the original author's latest version by anonymous FTP
1.8       aaron     420: on
1.15      pjanzen   421: .Em ftp.astron.com
1.8       aaron     422: in the directory
                    423: .Pa /pub/file/file-X.YY.tar.gz .