Annotation of src/usr.bin/mandoc/mdoc.3, Revision 1.15
1.15 ! schwarze 1: .\" $Id: mdoc.3,v 1.14 2010/12/22 00:33:25 schwarze Exp $
1.1 kristaps 2: .\"
1.15 ! schwarze 3: .\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
1.11 schwarze 4: .\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
1.1 kristaps 5: .\"
6: .\" Permission to use, copy, modify, and distribute this software for any
1.2 schwarze 7: .\" purpose with or without fee is hereby granted, provided that the above
8: .\" copyright notice and this permission notice appear in all copies.
1.1 kristaps 9: .\"
1.2 schwarze 10: .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
11: .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
12: .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
13: .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
14: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
15: .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
16: .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
1.4 schwarze 17: .\"
1.15 ! schwarze 18: .Dd $Mdocdate: December 22 2010 $
1.2 schwarze 19: .Dt MDOC 3
1.1 kristaps 20: .Os
21: .Sh NAME
1.7 schwarze 22: .Nm mdoc ,
1.1 kristaps 23: .Nm mdoc_alloc ,
24: .Nm mdoc_endparse ,
1.7 schwarze 25: .Nm mdoc_free ,
26: .Nm mdoc_meta ,
1.1 kristaps 27: .Nm mdoc_node ,
1.7 schwarze 28: .Nm mdoc_parseln ,
1.1 kristaps 29: .Nm mdoc_reset
30: .Nd mdoc macro compiler library
31: .Sh SYNOPSIS
1.7 schwarze 32: .In mandoc.h
1.5 schwarze 33: .In mdoc.h
1.1 kristaps 34: .Vt extern const char * const * mdoc_macronames;
35: .Vt extern const char * const * mdoc_argnames;
1.15 ! schwarze 36: .Ft int
! 37: .Fo mdoc_addspan
! 38: .Fa "struct mdoc *mdoc"
! 39: .Fa "const struct tbl_span *span"
! 40: .Fc
1.1 kristaps 41: .Ft "struct mdoc *"
1.9 schwarze 42: .Fo mdoc_alloc
43: .Fa "struct regset *regs"
44: .Fa "void *data"
45: .Fa "mandocmsg msgs"
46: .Fc
1.1 kristaps 47: .Ft int
1.7 schwarze 48: .Fn mdoc_endparse "struct mdoc *mdoc"
1.1 kristaps 49: .Ft void
50: .Fn mdoc_free "struct mdoc *mdoc"
1.7 schwarze 51: .Ft "const struct mdoc_meta *"
52: .Fn mdoc_meta "const struct mdoc *mdoc"
53: .Ft "const struct mdoc_node *"
54: .Fn mdoc_node "const struct mdoc *mdoc"
1.1 kristaps 55: .Ft int
1.9 schwarze 56: .Fo mdoc_parseln
57: .Fa "struct mdoc *mdoc"
58: .Fa "int line"
59: .Fa "char *buf"
60: .Fc
1.1 kristaps 61: .Ft int
1.7 schwarze 62: .Fn mdoc_reset "struct mdoc *mdoc"
1.1 kristaps 63: .Sh DESCRIPTION
64: The
65: .Nm mdoc
1.4 schwarze 66: library parses lines of
1.1 kristaps 67: .Xr mdoc 7
1.7 schwarze 68: input
69: into an abstract syntax tree (AST).
1.1 kristaps 70: .Pp
71: In general, applications initiate a parsing sequence with
72: .Fn mdoc_alloc ,
1.4 schwarze 73: parse each line in a document with
1.1 kristaps 74: .Fn mdoc_parseln ,
75: close the parsing session with
76: .Fn mdoc_endparse ,
77: operate over the syntax tree returned by
1.4 schwarze 78: .Fn mdoc_node
1.1 kristaps 79: and
80: .Fn mdoc_meta ,
81: then free all allocated memory with
82: .Fn mdoc_free .
83: The
84: .Fn mdoc_reset
85: function may be used in order to reset the parser for another input
1.7 schwarze 86: sequence.
1.1 kristaps 87: .Ss Types
1.6 schwarze 88: .Bl -ohang
1.1 kristaps 89: .It Vt struct mdoc
1.13 schwarze 90: An opaque type.
1.1 kristaps 91: Its values are only used privately within the library.
92: .It Vt struct mdoc_node
1.7 schwarze 93: A parsed node.
1.4 schwarze 94: See
1.1 kristaps 95: .Sx Abstract Syntax Tree
96: for details.
97: .El
98: .Ss Functions
1.6 schwarze 99: .Bl -ohang
1.15 ! schwarze 100: .It Fn mdoc_addspan
! 101: Add a table span to the parsing stream.
! 102: Returns 0 on failure, 1 on success.
1.1 kristaps 103: .It Fn mdoc_alloc
1.7 schwarze 104: Allocates a parsing structure.
105: The
1.1 kristaps 106: .Fa data
1.7 schwarze 107: pointer is passed to
108: .Fa msgs .
109: Returns NULL on failure.
110: If non-NULL, the pointer must be freed with
1.1 kristaps 111: .Fn mdoc_free .
112: .It Fn mdoc_reset
1.7 schwarze 113: Reset the parser for another parse routine.
114: After its use,
1.1 kristaps 115: .Fn mdoc_parseln
1.7 schwarze 116: behaves as if invoked for the first time.
117: If it returns 0, memory could not be allocated.
1.1 kristaps 118: .It Fn mdoc_free
1.7 schwarze 119: Free all resources of a parser.
120: The pointer is no longer valid after invocation.
1.1 kristaps 121: .It Fn mdoc_parseln
1.7 schwarze 122: Parse a NUL-terminated line of input.
123: This line should not contain the trailing newline.
124: Returns 0 on failure, 1 on success.
125: The input buffer
1.1 kristaps 126: .Fa buf
127: is modified by this function.
128: .It Fn mdoc_endparse
1.7 schwarze 129: Signals that the parse is complete.
130: Note that if
1.1 kristaps 131: .Fn mdoc_endparse
132: is called subsequent to
133: .Fn mdoc_node ,
1.7 schwarze 134: the resulting tree is incomplete.
135: Returns 0 on failure, 1 on success.
1.1 kristaps 136: .It Fn mdoc_node
1.7 schwarze 137: Returns the first node of the parse.
138: Note that if
1.1 kristaps 139: .Fn mdoc_parseln
140: or
141: .Fn mdoc_endparse
142: returns 0, the tree will be incomplete.
143: .It Fn mdoc_meta
1.7 schwarze 144: Returns the document's parsed meta-data.
145: If this information has not yet been supplied or
1.1 kristaps 146: .Fn mdoc_parseln
147: or
148: .Fn mdoc_endparse
149: returns 0, the data will be incomplete.
150: .El
151: .Ss Variables
1.6 schwarze 152: .Bl -ohang
1.1 kristaps 153: .It Va mdoc_macronames
154: An array of string-ified token names.
155: .It Va mdoc_argnames
156: An array of string-ified token argument names.
157: .El
158: .Ss Abstract Syntax Tree
1.4 schwarze 159: The
1.1 kristaps 160: .Nm
161: functions produce an abstract syntax tree (AST) describing input in a
1.7 schwarze 162: regular form.
163: It may be reviewed at any time with
1.1 kristaps 164: .Fn mdoc_node ;
165: however, if called before
166: .Fn mdoc_endparse ,
167: or after
1.4 schwarze 168: .Fn mdoc_endparse
1.1 kristaps 169: or
170: .Fn mdoc_parseln
1.4 schwarze 171: fails, it may be incomplete.
1.1 kristaps 172: .Pp
173: This AST is governed by the ontological
174: rules dictated in
175: .Xr mdoc 7
1.4 schwarze 176: and derives its terminology accordingly.
1.1 kristaps 177: .Qq In-line
178: elements described in
179: .Xr mdoc 7
1.4 schwarze 180: are described simply as
1.1 kristaps 181: .Qq elements .
182: .Pp
1.4 schwarze 183: The AST is composed of
1.1 kristaps 184: .Vt struct mdoc_node
185: nodes with block, head, body, element, root and text types as declared
186: by the
187: .Va type
1.7 schwarze 188: field.
189: Each node also provides its parse point (the
1.1 kristaps 190: .Va line ,
191: .Va sec ,
192: and
193: .Va pos
194: fields), its position in the tree (the
195: .Va parent ,
196: .Va child ,
1.10 schwarze 197: .Va nchild ,
1.4 schwarze 198: .Va next
1.1 kristaps 199: and
1.4 schwarze 200: .Va prev
1.10 schwarze 201: fields) and some type-specific data, in particular, for nodes generated
202: from macros, the generating macro in the
203: .Va tok
204: field.
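.Pp
As a rough illustration of how the tree-position fields can be used,
the following sketch counts nodes of one type by walking children and siblings depth first.
It does not use the real
.Vt struct mdoc_node
from mdoc.h: the
.Vt struct demo_node
type, the
.Vt enum demo_type
values, and the helper functions are hypothetical stand-ins that only borrow the field names listed above,
and it assumes that
.Va child
points at the first child.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mdoc_node; NOT the definition in mdoc.h. */
enum demo_type {
	DEMO_ROOT, DEMO_BLOCK, DEMO_HEAD, DEMO_BODY, DEMO_ELEM, DEMO_TEXT
};

struct demo_node {
	enum demo_type	  type;
	struct demo_node *parent;
	struct demo_node *child;	/* assumed: first child */
	struct demo_node *next;		/* next sibling */
	struct demo_node *prev;		/* previous sibling */
	int		  nchild;	/* number of direct children */
};

/* Link n as the last child of parent, keeping nchild in sync. */
static void
demo_append(struct demo_node *parent, struct demo_node *n)
{
	struct demo_node *last;

	assert(parent != NULL && n != NULL);
	n->parent = parent;
	n->next = NULL;
	n->prev = NULL;
	if (parent->child == NULL) {
		parent->child = n;
	} else {
		last = parent->child;
		while (last->next != NULL)
			last = last->next;
		last->next = n;
		n->prev = last;
	}
	parent->nchild++;
}

/* Count the nodes of the given type in the subtree rooted at n. */
static int
demo_count(const struct demo_node *n, enum demo_type type)
{
	const struct demo_node *c;
	int total;

	if (n == NULL)
		return 0;
	total = (n->type == type) ? 1 : 0;
	for (c = n->child; c != NULL; c = c->next)
		total += demo_count(c, type);
	return total;
}
```

A real front-end would start such a walk from the node returned by
.Fn mdoc_node
and inspect the
.Va type
and
.Va tok
fields instead of the stand-in enumeration.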
1.1 kristaps 205: .Pp
206: The tree itself is arranged according to the following normal form,
207: where capitalised non-terminals represent nodes.
208: .Pp
1.6 schwarze 209: .Bl -tag -width "ELEMENTXX" -compact
1.1 kristaps 210: .It ROOT
211: \(<- mnode+
212: .It mnode
213: \(<- BLOCK | ELEMENT | TEXT
214: .It BLOCK
1.8 schwarze 215: \(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]]
1.1 kristaps 216: .It ELEMENT
217: \(<- TEXT*
218: .It HEAD
1.10 schwarze 219: \(<- mnode*
1.1 kristaps 220: .It BODY
1.10 schwarze 221: \(<- mnode* [ENDBODY mnode*]
1.1 kristaps 222: .It TAIL
1.10 schwarze 223: \(<- mnode*
1.1 kristaps 224: .It TEXT
1.7 schwarze 225: \(<- [[:printable:],0x1e]*
1.1 kristaps 226: .El
227: .Pp
228: Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of
1.8 schwarze 229: the BLOCK production: these refer to punctuation marks.
1.7 schwarze 230: Furthermore, although a TEXT node will generally have a non-zero-length
231: string, in the specific case of
1.1 kristaps 232: .Sq \&.Bd \-literal ,
233: an empty line will produce a zero-length string.
1.8 schwarze 234: Multiple body parts are only found in invocations of
235: .Sq \&.Bl \-column ,
236: where a new body introduces a new phrase.
1.11 schwarze 237: .Ss Badly-nested Blocks
238: The ENDBODY node is available to end the formatting associated
239: with a given block before the physical end of that block.
240: It has a non-null
1.10 schwarze 241: .Va end
242: field, is of the BODY
243: .Va type ,
244: has the same
245: .Va tok
246: as the BLOCK it is ending, and has a
247: .Va pending
248: field pointing to that BLOCK's BODY node.
249: It is an indirect child of that BODY node
250: and has no children of its own.
251: .Pp
252: An ENDBODY node is generated when a block ends while one of its child
253: blocks is still open, as in the following example:
254: .Bd -literal -offset indent
255: \&.Ao ao
256: \&.Bo bo ac
257: \&.Ac bc
258: \&.Bc end
259: .Ed
260: .Pp
261: This example results in the following block structure:
262: .Bd -literal -offset indent
263: BLOCK Ao
264:     HEAD Ao
265:     BODY Ao
266:         TEXT ao
267:         BLOCK Bo, pending -> Ao
268:             HEAD Bo
269:             BODY Bo
270:                 TEXT bo
271:                 TEXT ac
272:                 ENDBODY Ao, pending -> Ao
273:                 TEXT bc
274: TEXT end
275: .Ed
276: .Pp
1.11 schwarze 277: Here, the formatting of the
278: .Sq \&Ao
279: block extends from TEXT ao to TEXT ac,
280: while the formatting of the
281: .Sq \&Bo
282: block extends from TEXT bo to TEXT bc.
283: It renders as follows in
1.10 schwarze 284: .Fl T Ns Cm ascii
285: mode:
1.11 schwarze 286: .Pp
1.10 schwarze 287: .Dl <ao [bo ac> bc] end
1.11 schwarze 288: .Pp
289: Support for badly-nested blocks is only provided for backward
1.10 schwarze 290: compatibility with some older
291: .Xr mdoc 7
292: implementations.
1.11 schwarze 293: Using badly-nested blocks is
294: .Em strongly discouraged :
295: the
296: .Fl T Ns Cm html
297: and
298: .Fl T Ns Cm xhtml
299: front-ends are unable to render them in any meaningful way.
300: Furthermore, behaviour when encountering badly-nested blocks is not
301: consistent across troff implementations, especially when using multiple
302: levels of badly-nested blocks.
1.1 kristaps 303: .Sh EXAMPLES
304: The following example reads lines from stdin and parses them, operating
1.4 schwarze 305: on the finished parse tree with
1.1 kristaps 306: .Fn parsed .
1.6 schwarze 307: This example performs only minimal error checking and does not free memory upon failure.
308: .Bd -literal -offset indent
1.9 schwarze 309: struct regset regs;
1.1 kristaps 310: struct mdoc *mdoc;
1.3 schwarze 311: const struct mdoc_node *node;
1.1 kristaps 312: char *buf;
313: size_t alloc_len;
314: ssize_t len;
315: int line = 1;
316:
1.9 schwarze 317: bzero(&regs, sizeof(struct regset));
1.12 schwarze 318: mdoc = mdoc_alloc(&regs, NULL, NULL);
1.6 schwarze 319: buf = NULL;
320: alloc_len = 0;
1.1 kristaps 321:
1.6 schwarze 322: while ((len = getline(&buf, &alloc_len, stdin)) >= 0) {
323: 	if (len && buf[len - 1] == '\en')
324: 		buf[len - 1] = '\e0';
325: 	if ( ! mdoc_parseln(mdoc, line, buf))
326: 		errx(1, "mdoc_parseln");
327: 	line++;
1.1 kristaps 328: }
329:
330: if ( ! mdoc_endparse(mdoc))
1.6 schwarze 331: 	errx(1, "mdoc_endparse");
1.1 kristaps 332: if (NULL == (node = mdoc_node(mdoc)))
1.6 schwarze 333: 	errx(1, "mdoc_node");
1.1 kristaps 334:
335: parsed(mdoc, node);
336: mdoc_free(mdoc);
337: .Ed
1.7 schwarze 338: .Pp
1.13 schwarze 339: To compile this, execute
340: .Pp
1.14 schwarze 341: .Dl % cc main.c libmdoc.a libmandoc.a
1.13 schwarze 342: .Pp
343: where
1.7 schwarze 344: .Pa main.c
1.13 schwarze 345: is the example file.
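.Pp
Since
.Fn mdoc_parseln
expects a line without its trailing newline, the stripping step can be isolated in a small helper.
The sketch below is one possible way to do this; the name
.Fn strip_newline
is hypothetical and not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libmdoc: remove one trailing
 * newline from a buffer of the given length and return the new length.
 * A final input line without a newline is left untouched.
 */
static size_t
strip_newline(char *buf, size_t len)
{
	assert(buf != NULL);
	if (len > 0 && buf[len - 1] == '\n') {
		buf[len - 1] = '\0';
		len--;
	}
	return len;
}
```

It would be called with the length returned by getline once that value is known to be non-negative.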
1.1 kristaps 346: .Sh SEE ALSO
347: .Xr mandoc 1 ,
348: .Xr mdoc 7
349: .Sh AUTHORS
350: The
351: .Nm
1.7 schwarze 352: library was written by
1.6 schwarze 353: .An Kristaps Dzonsons Aq kristaps@bsd.lv .