Annotation of src/usr.bin/join/join.1, Revision 1.22
1.22 ! jmc 1: .\" $OpenBSD: join.1,v 1.21 2009/02/08 17:15:10 jmc Exp $
1.9 aaron 2: .\"
1.3 michaels 3: .\" Copyright (c) 1990, 1993
4: .\" The Regents of the University of California. All rights reserved.
1.1 deraadt 5: .\"
6: .\" This code is derived from software contributed to Berkeley by
7: .\" the Institute of Electrical and Electronics Engineers, Inc.
8: .\"
9: .\" Redistribution and use in source and binary forms, with or without
10: .\" modification, are permitted provided that the following conditions
11: .\" are met:
12: .\" 1. Redistributions of source code must retain the above copyright
13: .\" notice, this list of conditions and the following disclaimer.
14: .\" 2. Redistributions in binary form must reproduce the above copyright
15: .\" notice, this list of conditions and the following disclaimer in the
16: .\" documentation and/or other materials provided with the distribution.
1.12 millert 17: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 18: .\" may be used to endorse or promote products derived from this software
19: .\" without specific prior written permission.
20: .\"
21: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
22: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
23: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
24: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
25: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
26: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
27: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
28: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
29: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
30: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
31: .\" SUCH DAMAGE.
32: .\"
1.3 michaels 33: .\" @(#)join.1 8.3 (Berkeley) 4/28/95
1.1 deraadt 34: .\"
1.22 ! jmc 35: .Dd $Mdocdate: February 8 2009 $
1.1 deraadt 36: .Dt JOIN 1
37: .Os
38: .Sh NAME
39: .Nm join
40: .Nd relational database operator
41: .Sh SYNOPSIS
42: .Nm join
1.17 jmc 43: .Op Fl 1 Ar field
44: .Op Fl 2 Ar field
1.1 deraadt 45: .Oo
46: .Fl a Ar file_number | Fl v Ar file_number
47: .Oc
48: .Op Fl e Ar string
49: .Op Fl o Ar list
50: .Op Fl t Ar char
51: .Ar file1
52: .Ar file2
53: .Sh DESCRIPTION
1.6 aaron 54: The
55: .Nm
56: utility performs an
57: .Dq equality join
58: on the specified files
1.1 deraadt 59: and writes the result to the standard output.
1.6 aaron 60: The
61: .Dq join field
62: is the field in each file by which the files are compared.
1.1 deraadt 63: The first field in each line is used by default.
64: There is one line in the output for each pair of lines in
65: .Ar file1
66: and
67: .Ar file2
68: which have identical join fields.
1.4 millert 69: Each output line consists of the join field, the remaining fields from
1.1 deraadt 70: .Ar file1
1.4 millert 71: and then the remaining fields from
1.1 deraadt 72: .Ar file2 .
73: .Pp
74: The default field separators are tab and space characters.
75: In this case, multiple tabs and spaces count as a single field separator,
76: and leading tabs and spaces are ignored.
77: The default output field separator is a single space character.
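The default behaviour described above can be sketched as follows. This is a minimal illustration, not part of the manual source; the file names names and roles and their contents are hypothetical sample data.

```shell
# Hypothetical sample data; file names and contents are illustrative only.
tmp=$(mktemp -d)
printf '1 alice\n2 bob\n3 carol\n' > "$tmp/names"
printf '1 admin\n3 staff\n4 guest\n' > "$tmp/roles"

# Join on the first field of each file (the default join field).
# Only keys present in both files (1 and 3) produce an output line:
# the join field, then the rest of file1's line, then the rest of file2's.
basic=$(join "$tmp/names" "$tmp/roles")
printf '%s\n' "$basic"

rm -rf "$tmp"
```

Keys 2 and 4 appear in only one of the two files, so they yield no output unless the -a or -v options are given.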
78: .Pp
79: Many of the options use file and field numbers.
1.6 aaron 80: Both file numbers and field numbers are 1-based, i.e., the first file on
1.1 deraadt 81: the command line is file number 1 and the first field is field number 1.
1.8 aaron 82: .Pp
1.17 jmc 83: When the default field delimiter characters are used, the files to be joined
84: should be ordered in the collating sequence of
85: .Xr sort 1 ,
86: using the
87: .Fl b
88: option, on the fields on which they are to be joined; otherwise
89: .Nm
90: may not report all field matches.
91: When the field delimiter characters are specified by the
92: .Fl t
93: option, the collating sequence should be the same as
94: .Xr sort 1
95: without the
96: .Fl b
97: option.
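One way to meet the ordering requirement is to sort both inputs on the join field before joining. The sketch below assumes hypothetical files qty and colour, and sets LC_ALL=C so that sort and join agree on the collating sequence; that setting is a choice made for this example, not something the text above requires.

```shell
# Use one locale for both sort and join so their collation agrees
# (an assumption made for this example).
LC_ALL=C
export LC_ALL

# Hypothetical unsorted sample data.
tmp=$(mktemp -d)
printf 'pear 3\napple 1\nfig 2\n' > "$tmp/qty"
printf 'fig purple\npear green\napple red\n' > "$tmp/colour"

# Default delimiters are in use, so sort with -b on the join field (field 1).
sort -b -k 1,1 "$tmp/qty" > "$tmp/qty.sorted"
sort -b -k 1,1 "$tmp/colour" > "$tmp/colour.sorted"

sorted=$(join "$tmp/qty.sorted" "$tmp/colour.sorted")
printf '%s\n' "$sorted"

rm -rf "$tmp"
```

Joining the unsorted files directly may miss matches, which is the failure mode the paragraph above warns about.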
98: .Pp
99: If one of the arguments
100: .Ar file1
101: or
102: .Ar file2
103: is
104: .Sq - ,
105: the standard input is used.
106: .Pp
1.8 aaron 107: The options are as follows:
1.10 aaron 108: .Bl -tag -width Ds
1.17 jmc 109: .It Fl 1 Ar field
110: Join on the
111: .Ar field Ns 'th
112: field of
113: .Ar file1 .
114: .It Fl 2 Ar field
115: Join on the
116: .Ar field Ns 'th
117: field of
118: .Ar file2 .
1.1 deraadt 119: .It Fl a Ar file_number
120: In addition to the default output, produce a line for each unpairable
121: line in file
122: .Ar file_number .
123: .It Fl e Ar string
124: Replace empty output fields with
125: .Ar string .
126: .It Fl o Ar list
1.5 aaron 127: Specifies the fields that will be output from each file for
1.1 deraadt 128: each line with matching join fields.
129: Each element of
130: .Ar list
131: has the form
1.9 aaron 132: .Dq file_number.field ,
1.1 deraadt 133: where
134: .Ar file_number
135: is a file number and
136: .Ar field
1.14 otto 137: is a field number,
138: or the form
139: .Dq 0
140: (zero),
141: representing the join field.
1.6 aaron 142: The elements of list must be either comma
143: .Pq Ql \&,
144: or whitespace separated.
1.17 jmc 145: (The latter requires quoting to protect it from the shell; a simpler
1.1 deraadt 146: approach is to use multiple
147: .Fl o
148: options.)
149: .It Fl t Ar char
150: Use character
151: .Ar char
152: as a field delimiter for both input and output.
153: Every occurrence of
154: .Ar char
155: in a line is significant.
156: .It Fl v Ar file_number
157: Do not display the default output, but display a line for each unpairable
158: line in file
159: .Ar file_number .
160: The options
161: .Fl v Ar 1
162: and
163: .Fl v Ar 2
164: may be specified at the same time.
165: .El
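The options listed above can be combined as in the following sketch. The files names, roles, left and right and their contents are hypothetical sample data chosen for illustration.

```shell
# Hypothetical sample data; both files are already sorted on field 1.
tmp=$(mktemp -d)
printf '1 alice\n2 bob\n3 carol\n' > "$tmp/names"
printf '1 admin\n3 staff\n4 guest\n' > "$tmp/roles"

# -a 1: default output plus the unpairable lines from file 1.
with_unpaired=$(join -a 1 "$tmp/names" "$tmp/roles")

# -v 2: only the unpairable lines from file 2.
only_unpaired=$(join -v 2 "$tmp/names" "$tmp/roles")

# -o selects output fields (0 is the join field); -e fills empty ones.
filled=$(join -a 1 -e NONE -o 0,1.2,2.2 "$tmp/names" "$tmp/roles")

# -t sets the delimiter for both input and output.
printf 'a:1\nb:2\n' > "$tmp/left"
printf 'a:x\nb:y\n' > "$tmp/right"
colon=$(join -t : "$tmp/left" "$tmp/right")

printf '%s\n' "$with_unpaired" "$only_unpaired" "$filled" "$colon"

rm -rf "$tmp"
```

Giving the -o list as a single comma-separated argument avoids the shell quoting that a whitespace-separated list would need.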
1.22 ! jmc 166: .Sh EXIT STATUS
1.17 jmc 167: .Ex -std join
1.11 aaron 168: .Sh SEE ALSO
169: .Xr awk 1 ,
170: .Xr comm 1 ,
1.16 hugh 171: .Xr lam 1 ,
1.11 aaron 172: .Xr paste 1 ,
173: .Xr sort 1 ,
174: .Xr uniq 1
1.13 jmc 175: .Sh STANDARDS
1.15 otto 176: The
177: .Nm
1.19 jmc 178: utility is compliant with the
1.21 jmc 179: .St -p1003.1-2008
1.19 jmc 180: specification.
1.15 otto 181: .Pp
182: In the absence of the
183: .Fl o
184: option,
185: historical versions of
186: .Nm
187: wrote non-matching lines without reordering the fields.
188: The current version writes the join field first, followed by the
189: remaining fields.
190: .Pp
191: For compatibility with historical versions of
1.1 deraadt 192: .Nm join ,
193: the following options are available:
194: .Bl -tag -width Fl
195: .It Fl a
196: In addition to the default output, produce a line for each unpairable line
1.17 jmc 197: in both
198: .Ar file1
199: and
200: .Ar file2 .
201: .It Fl j Ar field
202: Join on the
203: .Ar field Ns 'th
204: field of both
205: .Ar file1
206: and
207: .Ar file2 .
1.1 deraadt 208: .It Fl j1 Ar field
209: Join on the
210: .Ar field Ns 'th
1.17 jmc 211: field of
212: .Ar file1 .
1.1 deraadt 213: .It Fl j2 Ar field
214: Join on the
215: .Ar field Ns 'th
1.17 jmc 216: field of
217: .Ar file2 .
1.1 deraadt 218: .It Fl o Ar list ...
219: Historical implementations of
1.6 aaron 220: .Nm
1.1 deraadt 221: permitted multiple arguments to the
222: .Fl o
223: option.
1.6 aaron 224: These arguments were of the form
225: .Dq file_number.field_number
226: as described for the current
1.1 deraadt 227: .Fl o
228: option.
1.6 aaron 229: This has obvious difficulties in the presence of files named
230: .Dq 1.2 .
1.1 deraadt 231: .El
232: .Pp
1.15 otto 233: These options are available only so that historical shell scripts do not require
1.1 deraadt 234: modification; they should not be used.
1.15 otto 235: .Sh HISTORY
236: A
1.6 aaron 237: .Nm
1.15 otto 238: utility appeared in
239: .At v7 .