Annotation of src/usr.bin/file/file.1, Revision 1.23

1.23    ! jaredy      1: .\" $OpenBSD: file.1,v 1.22 2004/10/14 20:56:57 jaredy Exp $
1.8       aaron       2: .\" $FreeBSD: src/usr.bin/file/file.1,v 1.16 2000/03/01 12:19:39 sheldonh Exp $
1.18      jmc         3: .\"
1.19      ian         4: .\" Copyright (c) Ian F. Darwin 1986-1995.
                      5: .\" Software written by Ian F. Darwin and others;
                      6: .\" maintained 1995-present by Christos Zoulas and others.
1.20      jmc         7: .\"
1.19      ian         8: .\" Redistribution and use in source and binary forms, with or without
                      9: .\" modification, are permitted provided that the following conditions
                     10: .\" are met:
                     11: .\" 1. Redistributions of source code must retain the above copyright
                     12: .\"    notice immediately at the beginning of the file, without modification,
                     13: .\"    this list of conditions, and the following disclaimer.
                     14: .\" 2. Redistributions in binary form must reproduce the above copyright
                     15: .\"    notice, this list of conditions and the following disclaimer in the
                     16: .\"    documentation and/or other materials provided with the distribution.
1.20      jmc        17: .\"
1.19      ian        18: .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
                     19: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
                     20: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
                     21: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
                     22: .\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
                     23: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
                     24: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
                     25: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
                     26: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
                     27: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
                     28: .\" SUCH DAMAGE.
1.18      jmc        29: .\"
1.23    ! jaredy     30: .Dd December 4, 2004
1.8       aaron      31: .Dt FILE 1
                     32: .Os
                     33: .Sh NAME
                     34: .Nm file
                     35: .Nd determine file type
                     36: .Sh SYNOPSIS
                     37: .Nm file
1.23    ! jaredy     38: .Op Fl bckLNnrsvz
        !            39: .Op Fl F Ar separator
1.8       aaron      40: .Op Fl f Ar namefile
                     41: .Op Fl m Ar magiclist
1.23    ! jaredy     42: .Bk -words
        !            43: .Ar file ...
        !            44: .Ek
        !            45: .Nm file
        !            46: .Op Fl m Ar magiclist
        !            47: .Fl C
1.8       aaron      48: .Sh DESCRIPTION
1.22      jaredy     49: The
1.8       aaron      50: .Nm
1.22      jaredy     51: utility
1.1       deraadt    52: tests each argument in an attempt to classify it.
                     53: There are three sets of tests, performed in this order:
                     54: filesystem tests, magic number tests, and language tests.
1.8       aaron      55: The first test that succeeds causes the file type to be printed.
                     56: .Pp
1.1       deraadt    57: The type printed will usually contain one of the words
1.8       aaron      58: .Dq text
1.4       millert    59: (the file contains only
1.8       aaron      60: .Tn ASCII
1.4       millert    61: characters and is probably safe to read on an
1.8       aaron      62: .Tn ASCII
1.4       millert    63: terminal),
1.8       aaron      64: .Dq executable
1.1       deraadt    65: (the file contains the result of compiling a program
1.8       aaron      66: in a form understandable to some
                     67: .Ux
                     68: kernel or another),
1.1       deraadt    69: or
1.8       aaron      70: .Dq data
                     71: meaning anything else (data is usually binary or non-printable).
                     72: .Pp
1.1       deraadt    73: Exceptions are well-known file formats (core files, tar archives)
                     74: that are known to contain binary data.
                     75: When modifying the file
1.8       aaron      76: .Pa /etc/magic
1.6       aaron      77: or the program itself,
1.8       aaron      78: .Em "preserve these keywords" .
                     79: .Pp
1.1       deraadt    80: People depend on knowing that all the readable files in a directory
1.8       aaron      81: have the word
                     82: .Dq text
                     83: printed.
                     84: Don't do as Berkeley did and change
                     85: .Dq shell commands text
                     86: to
                     87: .Dq shell script .
                     88: .Pp
1.1       deraadt    89: The filesystem tests are based on examining the return from a
1.8       aaron      90: .Xr stat 2
1.1       deraadt    91: system call.
                     92: The program checks to see if the file is empty,
                     93: or if it's some sort of special file.
                     94: Any known file types appropriate to the system you are running on
                     95: (sockets, symbolic links, or named pipes (FIFOs) on those systems that
                     96: implement them)
                     97: are intuited if they are defined in
                     98: the system header file
1.9       aaron      99: .Aq Pa sys/stat.h .
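
The filesystem tests described above can be pictured with a minimal Python sketch.
It is an illustration only, not the program's own code: the function name
filesystem_test and the strings it returns are assumptions made for this example.
It inspects the result of an lstat call and reports special files and empty files,
returning None when the later test groups should take over.

```python
import os
import stat


def filesystem_test(path):
    """Classify a path from its stat information alone.

    Returns a description, or None if the file is an ordinary
    non-empty file that the later tests should examine.
    The returned strings are illustrative, not file's real output.
    """
    st = os.lstat(path)  # lstat, so a symbolic link is reported as a link
    mode = st.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    if st.st_size == 0:
        return "empty"
    return None
```

A None result here corresponds to falling through to the magic number tests.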
1.8       aaron     100: .Pp
1.1       deraadt   101: The magic number tests are used to check for files with data in
                    102: particular fixed formats.
                    103: The canonical example of this is a binary executable (compiled program)
1.8       aaron     104: .Pa a.out
1.6       aaron     105: file, whose format is defined in
1.8       aaron     106: .Aq Pa a.out.h
1.1       deraadt   107: and possibly
1.8       aaron     108: .Aq Pa exec.h
1.23    ! jaredy    109: in the standard include directory and is explained in
        !           110: .Xr a.out 5 .
1.8       aaron     111: These files have a
                    112: .Dq magic number
                    113: stored in a particular place
                    114: near the beginning of the file that tells the
                    115: .Ux
                    116: operating system
1.1       deraadt   117: that the file is a binary executable, and which of several types thereof.
1.8       aaron     118: .Pp
                    119: The concept of magic number has been applied by extension to data files.
1.1       deraadt   120: Any file with some invariant identifier at a small fixed
                    121: offset into the file can usually be described in this way.
                    122: The information identifying these files is read from the magic file
1.8       aaron     123: .Pa /etc/magic .
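
As a rough illustration of the idea of an invariant identifier at a fixed offset,
the following Python sketch compares bytes at given offsets against a small table.
The table SIGNATURES and the function magic_test are hypothetical names for this
example; the three entries are well-known signatures chosen for illustration and
are not copied from /etc/magic, whose real format is described in magic(5).

```python
# Illustrative table only: (offset, signature bytes, description).
# These entries are assumptions for the sketch, not the contents of /etc/magic.
SIGNATURES = [
    (0, b"\x7fELF", "ELF executable"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "tar archive"),
]


def magic_test(data):
    """Return the description of the first signature found at its offset."""
    for offset, signature, description in SIGNATURES:
        if data[offset:offset + len(signature)] == signature:
            return description
    return None  # no fixed-offset identifier matched
```

The first matching entry wins, which is one reason the order of entries matters.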
                    124: .Pp
1.1       deraadt   125: If an argument appears to be an
1.8       aaron     126: .Tn ASCII
1.1       deraadt   127: file,
1.8       aaron     128: .Nm
1.1       deraadt   129: attempts to guess its language.
1.4       millert   130: The language tests look for particular strings (cf
1.8       aaron     131: .Pa names.h )
1.1       deraadt   132: that can appear anywhere in the first few blocks of a file.
                    133: For example, the keyword
1.8       aaron     134: .Em .br
1.4       millert   135: indicates that the file is most likely a
1.8       aaron     136: .Xr troff 1
1.6       aaron     137: input file, just as the keyword
1.8       aaron     138: .Li struct
1.1       deraadt   139: indicates a C program.
                    140: These tests are less reliable than the previous
                    141: two groups, so they are performed last.
                    142: The language test routines also test for some miscellany
1.6       aaron     143: (such as
1.8       aaron     144: .Xr tar 1
1.1       deraadt   145: archives) and determine whether an unknown file should be
1.8       aaron     146: labelled as
                    147: .Dq ASCII text
                    148: or
                    149: .Dq data .
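
The language tests can be sketched along the same lines.
This is a simplified Python illustration under stated assumptions: the keyword
table stands in for the one in names.h, the 4096-byte window is an arbitrary
stand-in for "the first few blocks", and the descriptions are examples that
keep the word "text" as the convention above requires.

```python
# Hypothetical keyword table; the real list lives in names.h.
KEYWORDS = [
    (b".br", "troff input text"),
    (b"struct", "C program text"),
]


def looks_like_ascii_text(data):
    """True if every byte is printable ASCII or common whitespace."""
    return all(0x20 <= b <= 0x7E or b in b"\t\n\r\f" for b in data)


def language_test(data, window=4096):
    """Guess a language from keywords near the start of the data.

    The window size is an assumption made for this sketch.
    """
    head = data[:window]
    if not looks_like_ascii_text(head):
        return "data"
    for keyword, description in KEYWORDS:
        if keyword in head:
            return description
    return "ASCII text"
```

Because a keyword may match anywhere in the window, a guess of this kind is
easily fooled, which is why this group of tests runs last.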
                    150: .Pp
                    151: The options are as follows:
1.11      aaron     152: .Bl -tag -width Ds
1.17      millert   153: .It Fl b
                    154: Do not prepend filenames to output lines (brief mode).
1.23    ! jaredy    155: .It Fl C
        !           156: For each magic number file, write a
        !           157: .Pa magic.mgc
        !           158: output file that contains a preparsed (compiled) version of it.
1.8       aaron     159: .It Fl c
1.1       deraadt   160: Cause a checking printout of the parsed form of the magic file.
1.6       aaron     161: This is usually used in conjunction with
1.8       aaron     162: .Fl m
1.1       deraadt   163: to debug a new magic file before installing it.
1.23    ! jaredy    164: .It Fl F Ar separator
        !           165: Use the specified string as the separator between the filename and
        !           166: the file result returned.
        !           167: Defaults to
        !           168: .Sq \&: .
1.8       aaron     169: .It Fl f Ar namefile
1.6       aaron     170: Read the names of the files to be examined from
1.8       aaron     171: .Ar namefile
1.6       aaron     172: (one per line)
1.1       deraadt   173: before the argument list.
1.6       aaron     174: Either
1.8       aaron     175: .Ar namefile
1.1       deraadt   176: or at least one filename argument must be present;
1.8       aaron     177: to test the standard input, use
1.23    ! jaredy    178: .Sq -
1.8       aaron     179: as a filename argument.
1.23    ! jaredy    180: .It Fl k
        !           181: Don't stop at the first match, keep going.
1.8       aaron     182: .It Fl L
                    183: Cause symlinks to be followed, as the like-named option in
1.23    ! jaredy    184: .Xr ls 1
1.1       deraadt   185: (on systems that support symbolic links).
1.23    ! jaredy    186: .It Fl m Ar magiclist
        !           187: Specify an alternate list,
        !           188: .Ar magiclist ,
        !           189: of files containing magic numbers.
        !           190: This can be a single file or a colon-separated list of files.
        !           191: If a compiled magic file is found alongside, it will be used instead.
        !           192: .It Fl N
        !           193: Don't pad filenames so that they align in the output.
        !           194: .It Fl n
        !           195: Force
        !           196: .Em stdout
        !           197: to be flushed after checking each file.
        !           198: This is only useful if checking a list of files.
        !           199: It is intended to be used by programs that want filetype output from a
        !           200: pipe.
        !           201: .It Fl r
        !           202: Don't translate unprintable characters to
        !           203: .Sq \e Ns Em ooo
        !           204: (raw mode).
        !           205: Normally
        !           206: .Nm
        !           207: translates unprintable characters to their octal representation.
        !           208: .It Fl s
        !           209: Normally,
        !           210: .Nm
        !           211: only attempts to read and determine the type of argument files which
        !           212: .Xr stat 2
        !           213: reports are ordinary files.
        !           214: This prevents problems, because reading special files may have peculiar
        !           215: consequences.
        !           216: Specifying the
        !           217: .Fl s
        !           218: option causes
        !           219: .Nm
        !           220: to also read argument files which are block or character special files.
        !           221: This is useful for determining the filesystem types of the data in raw
        !           222: disk partitions, which are block special files.
        !           223: This option also causes
        !           224: .Nm
        !           225: to disregard the file size as reported by
        !           226: .Xr stat 2 ,
        !           227: since on some systems it reports a zero size for raw disk partitions.
        !           228: .It Fl v
        !           229: Print the version of the program and exit.
        !           230: .It Fl z
        !           231: Try to look inside files that have been run through
        !           232: .Xr compress 1 .
1.8       aaron     233: .El
                    234: .Sh ENVIRONMENT
                    235: .Bl -tag -width indent
1.13      smart     236: .It Ev MAGIC
1.23    ! jaredy    237: Default magic number files, separated by colon characters.
        !           238: .Nm
        !           239: adds
        !           240: .Dq .mgc
        !           241: to the value of this variable as appropriate.
1.8       aaron     242: .El
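
One possible reading of how a colon-separated magic list is resolved is sketched
below in Python.
It is an assumption based only on the text above (a compiled file found
alongside is used instead, and ".mgc" is added as appropriate); the function
name resolve_magic_files and its exists parameter are hypothetical, and the
program's actual lookup may differ.

```python
import os


def resolve_magic_files(magiclist, exists=os.path.exists):
    """Split a colon-separated list and prefer compiled files.

    For each name, use "<name>.mgc" when it exists, else the name itself.
    The exists parameter is only there so the lookup can be replaced
    with a stub when experimenting.
    """
    resolved = []
    for name in magiclist.split(":"):
        if not name:
            continue  # skip empty components such as a trailing colon
        compiled = name + ".mgc"
        resolved.append(compiled if exists(compiled) else name)
    return resolved
```

For example, with a list of two files where only the first has a compiled
counterpart, the first entry resolves to the ".mgc" file and the second is
used as given.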
1.12      aaron     243: .Sh FILES
                    244: .Bl -tag -width /etc/magic -compact
                    245: .It Pa /etc/magic
                    246: default list of magic numbers
                    247: .El
1.8       aaron     248: .Sh SEE ALSO
1.23    ! jaredy    249: .Xr compress 1 ,
1.8       aaron     250: .Xr hexdump 1 ,
1.23    ! jaredy    251: .Xr ls 1 ,
1.8       aaron     252: .Xr od 1 ,
                    253: .Xr strings 1 ,
1.23    ! jaredy    254: .Xr a.out 5 ,
1.8       aaron     255: .Xr magic 5
                    256: .Sh STANDARDS CONFORMANCE
1.1       deraadt   257: This program is believed to exceed the System V Interface Definition
                    258: of FILE(CMD), as near as one can determine from the vague language
1.6       aaron     259: contained therein.
1.1       deraadt   260: Its behaviour is mostly compatible with the System V program of the same name.
                    261: This version knows more magic, however, so it will produce
1.6       aaron     262: different (albeit more accurate) output in many cases.
1.8       aaron     263: .Pp
1.6       aaron     264: The one significant difference
1.1       deraadt   265: between this version and System V
1.8       aaron     266: is that this version treats any white space
1.1       deraadt   267: as a delimiter, so that spaces in pattern strings must be escaped.
                    268: For example,
1.8       aaron     269: .Pp
                    270: >10     string  language impress\       (imPRESS data)
                    271: .Pp
1.1       deraadt   272: in an existing magic file would have to be changed to
1.8       aaron     273: .Pp
                    274: >10     string  language\e impress      (imPRESS data)
                    275: .Pp
1.1       deraadt   276: In addition, in this version, if a pattern string contains a backslash,
1.9       aaron     277: it must be escaped.
                    278: For example
1.8       aaron     279: .Pp
                    280: 0       string          \ebegindata     Andrew Toolkit document
                    281: .Pp
1.1       deraadt   282: in an existing magic file would have to be changed to
1.8       aaron     283: .Pp
                    284: 0       string          \e\ebegindata   Andrew Toolkit document
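
The two escaping rules above can be expressed as a short Python sketch for
converting a System V pattern string.
The function name escape_pattern is hypothetical, and the sketch handles only
backslashes and space characters; other white space is left untouched.

```python
def escape_pattern(pattern):
    """Escape a magic pattern string for this version of file.

    Backslashes are doubled first, so that the backslashes added
    in front of spaces are not themselves doubled afterwards.
    """
    escaped = pattern.replace("\\", "\\\\")
    return escaped.replace(" ", "\\ ")
```

Applied to the examples above, a space gains a leading backslash and a
leading backslash in the pattern becomes two.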
                    285: .Pp
1.1       deraadt   286: SunOS releases 3.2 and later from Sun Microsystems include a
1.20      jmc       287: .Nm file
1.1       deraadt   288: command derived from the System V one, but with some extensions.
                    289: My version differs from Sun's only in minor ways.
1.8       aaron     290: It includes the extension of the
                    291: .Ql &
                    292: operator, used as,
1.1       deraadt   293: for example,
1.8       aaron     294: .Pp
                    295: >16     long&0x7fffffff >0              not stripped
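
The effect of the masked comparison in the entry above can be sketched in Python.
This is an illustration, not the program's code: the function name
masked_long_test is hypothetical, and the sketch assumes a 4-byte value read
in the byte order of the host.

```python
import struct


def masked_long_test(data, offset=16, mask=0x7FFFFFFF):
    """Read a 4-byte value at offset, apply the mask, and test for > 0.

    Mirrors the shape of an entry such as "long&0x7fffffff >0".
    Native byte order and a 4-byte size are assumptions of this sketch.
    """
    (value,) = struct.unpack_from("=L", data, offset)
    return (value & mask) > 0
```

With this mask the top bit is ignored, so a value with only that bit set does
not satisfy the test.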
                    296: .Sh MAGIC DIRECTORY
1.1       deraadt   297: The magic file entries have been collected from various sources,
                    298: mainly USENET, and contributed by various authors.
1.8       aaron     299: .An Christos Zoulas
                    300: (address below) will collect additional
1.1       deraadt   301: or corrected magic file entries.
1.6       aaron     302: A consolidation of magic file entries
1.1       deraadt   303: will be distributed periodically.
                    304: The order of entries in the magic file is significant.
                    305: Depending on what system you are using, the order that
                    306: they are put together may be incorrect.
                    307: If your old
1.8       aaron     308: .Nm
1.1       deraadt   309: command uses a magic file,
                    310: keep the old magic file around for comparison purposes
1.6       aaron     311: (rename it to
1.8       aaron     312: .Pa /etc/magic.orig ) .
                    313: .Sh HISTORY
1.6       aaron     314: There has been a
1.8       aaron     315: .Nm
                    316: command in every
                    317: .Ux
1.16      mickey    318: since at least Research Version 4
                    319: (man page dated November, 1973).
1.1       deraadt   320: The System V version introduced one significant change:
                    321: the external list of magic number types.
                    322: This slowed the program down slightly but made it a lot more flexible.
1.8       aaron     323: .Pp
1.10      ian       324: This program, based on the System V version, was written by
                    325: .An Ian F. Darwin Aq ian@darwinisys.com
1.8       aaron     326: without looking at anybody else's source code.
                    327: .Pp
                    328: .An John Gilmore
                    329: revised the code extensively, making it better than
1.1       deraadt   330: the first version.
1.8       aaron     331: .An Geoff Collyer
                    332: found several inadequacies
1.1       deraadt   333: and provided some magic file entries.
1.23    ! jaredy    334: Contributions to the
        !           335: .Ql &
        !           336: operator by
        !           337: .An Rob McMahon Aq cudcv@warwick.ac.uk ,
        !           338: 1989.
        !           339: .Pp
        !           340: .An Guy Harris Aq guy@auspex.com
        !           341: made many changes from 1993 to the present.
        !           342: .Pp
        !           343: Primary development and maintenance from 1990 to the present by
        !           344: .An Christos Zoulas Aq christos@zoulas.com .
1.8       aaron     345: .Pp
                    346: Altered by
1.23    ! jaredy    347: .An Chris Lowth Aq chris@lowth.com ,
        !           348: 2000, to handle the
        !           349: .Fl i
        !           350: option, which outputs MIME type strings using an alternative magic file
        !           351: and internal logic.
1.8       aaron     352: .Pp
                    353: Altered by
1.23    ! jaredy    354: .An Eric Fischer Aq enf@pobox.com ,
        !           355: July, 2000, to identify character codes and attempt to identify the
        !           356: languages of non-ASCII files.
        !           357: .Pp
        !           358: The list of contributors to the
        !           359: .Dq magdir
        !           360: directory (source for the
        !           361: .Pa /etc/magic
        !           362: file) is too long to include here.
        !           363: You know who you are; thank you.
1.8       aaron     364: .Sh LEGAL NOTICE
1.10      ian       365: Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999.
                    366: Covered by the standard Berkeley Software Distribution copyright; see the file
                    367: LEGAL.NOTICE in the distribution.
1.8       aaron     368: .Pp
1.1       deraadt   369: The files
1.8       aaron     370: .Pa tar.h
1.1       deraadt   371: and
1.8       aaron     372: .Pa is_tar.c
                    373: were written by
                    374: .An John Gilmore
                    375: from his public-domain
                    376: .Nm tar
1.23    ! jaredy    377: program, and are not covered by the above license.
1.8       aaron     378: .Sh BUGS
1.1       deraadt   379: There must be a better way to automate the construction of the Magic
1.8       aaron     380: file from all the glop in Magdir.
                    381: What is it?
1.1       deraadt   382: Better yet, the magic file should be compiled into binary (say,
1.8       aaron     383: .Xr ndbm 3
1.4       millert   384: or, better yet, fixed-length
1.8       aaron     385: .Tn ASCII
1.4       millert   386: strings for use in heterogeneous network environments) for faster startup.
1.1       deraadt   387: Then the program would run as fast as the Version 7 program of the same name,
                    388: with the flexibility of the System V version.
1.8       aaron     389: .Pp
                    390: .Nm
1.15      pjanzen   391: uses several algorithms that favor speed over accuracy;
1.4       millert   392: thus it can be misled about the contents of
1.8       aaron     393: .Tn ASCII
1.4       millert   394: files.
1.8       aaron     395: .Pp
1.4       millert   396: The support for
1.8       aaron     397: .Tn ASCII
1.4       millert   398: files (primarily for programming languages)
1.1       deraadt   399: is simplistic, inefficient and requires recompilation to update.
1.8       aaron     400: .Pp
                    401: There should be an
                    402: .Dq else
                    403: clause to follow a series of continuation lines.
                    404: .Pp
1.1       deraadt   405: The magic file and keywords should have regular expression support.
1.4       millert   406: Their use of
1.8       aaron     407: .Tn ASCII TAB
1.4       millert   408: as a field delimiter is ugly and makes
1.1       deraadt   409: it hard to edit the files, but is entrenched.
1.8       aaron     410: .Pp
1.1       deraadt   411: It might be advisable to allow upper-case letters in keywords
1.4       millert   412: for, e.g.,
1.8       aaron     413: .Xr troff 1
1.4       millert   414: commands vs man page macros.
1.1       deraadt   415: Regular expression support would make this easy.
1.8       aaron     416: .Pp
1.1       deraadt   417: The program doesn't grok \s-2FORTRAN\s0.
1.6       aaron     418: It should be able to recognize \s-2FORTRAN\s0 by seeing some keywords which
1.1       deraadt   419: appear indented at the start of a line.
                    420: Regular expression support would make this easy.
1.8       aaron     421: .Pp
1.6       aaron     422: The list of keywords in
1.8       aaron     423: .Em ascmagic
1.1       deraadt   424: probably belongs in the Magic file.
1.8       aaron     425: This could be done by using some keyword like
                    426: .Ql *
                    427: for the offset value.
                    428: .Pp
                    429: Another optimization would be to sort
1.1       deraadt   430: the magic file so that we can just run down all the
                    431: tests for the first byte, first word, first long, etc, once we
1.9       aaron     432: have fetched it.
                    433: Complain about conflicts in the magic file entries.
1.1       deraadt   434: Make a rule that the magic entries sort based on file offset rather
                    435: than position within the magic file?
1.8       aaron     436: .Pp
1.6       aaron     437: The program should provide a way to give an estimate
1.8       aaron     438: of
                    439: .Dq how good
                    440: a guess is.
                    441: We end up removing guesses (e.g.,
1.20      jmc       442: .Dq From\ \&
1.8       aaron     443: as first 5 chars of file) because
                    444: they are not as good as other guesses (e.g.,
                    445: .Dq Newsgroups:
                    446: versus
                    447: .Qq Return-Path: ) .
                    448: Still, if the others don't pan out, it should be
1.6       aaron     449: possible to use the first guess.
1.8       aaron     450: .Pp
                    451: This program is slower than some vendors'
                    452: .Nm
                    453: commands.
                    454: .Pp
1.1       deraadt   455: This manual page, and particularly this section, is too long.
1.8       aaron     456: .Sh AVAILABILITY
1.1       deraadt   457: You can obtain the original author's latest version by anonymous FTP
1.8       aaron     458: on
1.15      pjanzen   459: .Em ftp.astron.com
1.8       aaron     460: in the directory
1.20      jmc       461: .Pa /pub/file/file-X.YY.tar.gz .