.\" CVS revision 1.1, Sun Sep 23 07:13:07 2018 UTC, by schwarze
.\" (branch MAIN, tag HEAD): add my EuroBSDCon 2018 slides
.\"
.\" Copyright (c) 2018 Ingo Schwarze <schwarze@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this presentation for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE PRESENTATION IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
.\" WARRANTIES WITH REGARD TO THIS PRESENTATION INCLUDING ALL IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS.  IN NO EVENT SHALL THE
.\" AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
.\" DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA
.\" OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
.\" TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
.\" PERFORMANCE OF THIS PRESENTATION.
.\"
.\" --------------------------------------------------------------------
.\"
.\" These slides use the mm and gpresent groff macros.
.\" For example, on OpenBSD, install these ports:
.\" groff, gpresent, ghostscript.
.\"
.\" Build instructions:
.\" groff -st -mm -mpresent talk.roff > talk.pps
.\" presentps -l talk.pps > talk.ps
.\" ps2pdf talk.ps
.\"
.\" --- global mm configuration settings -------------------------------
.nr Pi 3
.\" --- global gpresent configuration settings -------------------------
.DEFCOLOR Kea1 0 0.8 0.48
.DEFCOLOR Kea2 0 0.5 0.3
.TITLECOLOR Kea1
.SUBTITLEFORMAT C
.SUBTITLECOLOR Kea2
.FOOTERSIZE 2
.\" We don't want a header line for the title page,
.\" so we have to start it before setting up headers.
.TITLE "Better documentation"
.\" === gpresent header setup ==========================================
.\" --- define gpresent extension registers ----------------------------
.nr gpe_page_tot 1
.nr gpe_page_sec 0
.af gpe_page_sec I
.nr gpe_dur_min 0
.nr gpe_dur_sec 0
.af gpe_dur_sec 02
.nr gpe_time_tsec 14*60+2*60
.nr gpe_time_hour 14
.nr gpe_time_min 2
.af gpe_time_min 02
.nr gpe_time_sec 0
.af gpe_time_sec 02
.
.\" --- macro to start a new section -----------------------------------
.de GPE_SECTION
.ds gpe_title_sec \\$1
.nr gpe_page_sec 0
..
.\" --- macro to prepare a new page ------------------------------------
.de GPE_NEXT
.ds gpe_next \\$1
.SK
..
.\" --- gpresent page header callback ----------------------------------
.de HEADER
.nr gpe_page_tot +1
.nr gpe_page_sec +1
.sp 0.5v
.ds gpe_middle page \\n[gpe_page_tot]: \\*[gpe_title_sec] \\n[gpe_page_sec]
.tl 'Ingo Schwarze: Better documentation \(em Web & LibreSSL'\
\\h'2c'\\*[gpe_middle]'\
Bucuresti, September 22, 2018'
.sp -0.5v
.\" horizontal line below the page header
\l'\\n(.lu'\h'-\\n(.lu'
.br
..
.\" --- initialize the first section before completing the title page --
.GPE_SECTION INTRO
.\" === define some gpresent extension macros ==========================
.\" --- line length ----------------------------------------------------
.\" To return to full line length after temporarily reducing it.
.nr gpe_ll \n(.l
.\" --- emphasis -------------------------------------------------------
.de GPE_EM
.COLOR red
\\$1\c
.COLOR P
\&\\$2
..
.de GPE_OK
\Z'\\$1'\m[red]\l'\w"\\$1"u\(mi'\m[]
\m[Kea2]\(OK\m[]\\$2
..
.\" --- small text -----------------------------------------------------
.de GPE_SM
.S -4
\\$1
.S P
..
.\" --- quoted literals ------------------------------------------------
.de GPE_QL
\(lq\f(CW\\$1\fP\(rq\\$2
..
.\" --- tweak -mpresent macros -----------------------------------------
.\" Reduce vertical spacing by the given argument.
.\" Can only be called right after .SP with a larger argument.
.de GPE_SPM
.sp -\\$1
.nr line*ac\\n[.z] -\\$1
.nr line*lp\\n[.z] \\n[.d]
..
.am TITLE
.GPE_SPM 0.1i
..
.am SUBTITLE
.GPE_SPM 0.05i
..
.de INACTIVE_TITLE
.TITLECOLOR violet
.TITLE \\$*
.TITLECOLOR Kea1
..
.\" --- title page -----------------------------------------------------
.\" The main title line has already been printed.
.sp -1v
.SUBTITLE "on the web and for LibreSSL"
.SUBTITLE "EuroBSDCon, Bucuresti, September 22, 2018"
.SUBTITLE "Ingo Schwarze <schwarze@openbsd.org>"
.sp -0.5v
.PSPIC Images/YellowBelliedMarmot2.eps
.ce
.GPE_SM "Yellow Bellied Marmot, Kananaskis, Alberta, Canada (July 2015)"
.\" === gpresent footer setup ==========================================
.\" We don't want a footer line for the title page,
.\" so we have to set it up after completing the title page.
.SK
.\" --- macros to start a new page -------------------------------------
.\" arg: time for this page in seconds
.de GPE_TIME
.nr gpe_dur_min \\$1/60
.nr gpe_dur_sec \\$1%60
.nr gpe_time_tsec +\\$1
.nr gpe_time_hour \\n[gpe_time_tsec]/3600
.nr gpe_time_min \\n[gpe_time_tsec]%3600/60
.nr gpe_time_sec \\n[gpe_time_tsec]%60
.ie '\\$2'' .ds gpe_source NEW
.el .ds gpe_source BSDCan 2018 p. \\$2
..
.\" --- gpresent page footer callback ----------------------------------
.de FOOTER
.ps 18
.vs 20
.sp -2v
\l'\\n(.lu'\h'-\\n(.lu'
.br
.tl '\s-6\\n[gpe_dur_min]:\s-2\\n[gpe_dur_sec]\s+2 \(->\
 \\n[gpe_time_hour]:\\n[gpe_time_min]:\s-2\\n[gpe_time_sec]\
 \\*[gpe_source]\s+8''\\m[Kea2]\\*[gpe_next]\ \ \(->\\m[]'
.ps
.vs
..
.\" The INTRO section was already started above, before the title page.
.TITLE "The context of this talk"
.ce
I occasionally present updates on documentation tools at BSD conferences.
.SUBTITLE "General reminders about documentation"
.BL
.LI
.GPE_EM "Without documentation, code is unusable,"
.br
.GPE_EM "and bad documentation is about as bad as bad code."
.LI
Documentation must be correct, complete, concise, all in one place,
.br
marked up for display and search, easy to read, and easy to write.
.LI
All BSD projects use the
.GPE_EM "mdoc(7)"
markup language because it is by far the best language available:
concise, simple, providing the right amount of semantic markup.
Thanks to Cynthia Livingston.
.GPE_SM "(USENIX, UC Berkeley CSRG 1990 to 1994)"
.P
.GPE_SM "texinfo(1) and DocBook are excessively complicated,\
 ill-designed, and unmaintained, DocBook also buggy as hell and\
 sluggish; man(7), perlpod(1), and markdown provide no semantic markup."
.LI
All BSD projects use the
.GPE_EM "mandoc(1)"
toolbox because it is functional,
.br
free (no GPL), lightweight (no C++ or XML), portable, small, and fast.
.br
Five input formats, five output formats, two converters,
.br
very powerful searching, all integrated.
.LI
See my presentation at EuroBSDCon 2015 regarding how mandoc(1) became
.br
the standard toolbox.
.GPE_SM "(and those at BSDCan 2011, 2014, and 2015, too)"
.LE
.GPE_TIME 80 2
.GPE_NEXT "Table of contents"
.TITLE "The plan for this talk"
.ce 2
.GPE_EM "Three selected topics"
from the many things that were done with mandoc in 2016\(en2018
.sp
.mk
.BVL 1c
.LI "1. Use the strength of mdoc(7):"
Manual pages on the
.GPE_EM web .
.LI "2. Easily cope with language design from hell:"
The
.GPE_EM markdown
output mode.
.LI "3. Document the inscrutable:"
.GPE_EM LibreSSL
and API design.
.sp
.LI "4. Other progress with mandoc in 2016\(en2018:"
.BL
.LI
Why we deleted SQLite
.br
from the OpenBSD base system.
.LI
Improvements for manual pages in ports.
.LI
Aim for perfection: the small things matter.
.LE
.LI "5. Summary:"
Completed and open tasks.
Mandoc adoption.
.LE
.rt
.PSPIC -R Images/Paris17_SacreCoeurClocher.eps
.rj
.GPE_SM "Paris, Sacr\('e-C\(oeur (2017)"
.GPE_TIME 60 3
.GPE_SECTION MAN.CGI
.GPE_NEXT "What about manual pages on the web?"
.TITLE "Choice of HTML elements"
About ten important improvements to HTML/CSS output and man.cgi(8)
.br
were implemented in 2016\(en2018 alone.
.BL
.LI
Traditionally, man.cgi(8) emitted almost no tags except
.GPE_QL <b>
and
.GPE_QL <i> ;
look at the HTML source code of the NetBSD online manual pages (even today).
.br
That's bad because HTML is intended to provide content,
marked up for its function in the respective context \(em
not presentation, which is the job of CSS.
.LI
Mandoc now selects adequate HTML elements depending on mdoc(7) macros,
for example:
.mk
.GPE_SM "(2017 Jan 19 to 2018 May 8)"
.BL \n(Pi compact
.LI
\&.Fl .Cm .Ic .Fo .In .Fd .Cd \(->
.GPE_QL <code>
.LI
\&.Dv .Ev .Er \(->
.GPE_QL <code>
.LI
\&.Ar .Fa .Ft .Vt .Va \(->
.GPE_QL <var>
.LI
\&.Xr .Sx .Lk .Mt \(->
.GPE_QL <a>
.LI
\&.Sh \(->
.GPE_QL <h1>
.LI
\&.Ss \(->
.GPE_QL <h2>
.LI
\&.Rs \(->
.GPE_QL <cite>
.LE
.LI
Some mdoc(7) macros do not have corresponding HTML elements,
.br
so they use generic HTML elements, for example:
.GPE_SM "(2018 May 20)"
.BL \n(Pi compact
.LI
\&.Pa .St .An \(->
.GPE_QL <span>
.LI
\&.Nd \(->
.GPE_QL <div>
.LE
.LE
.rt
.sp -0.5v
.PSPIC -R Images/Ottawa18_Diefenbaker.eps
.rj
.GPE_SM "Ottawa, Foreign Affairs (2018)"
.GPE_TIME 90
.GPE_NEXT "How to decide on presentation?"
.TITLE "Select presentation by class"
.sp -0.5v
.BL
.LI
You see that mdoc(7) is much more specific than HTML:
many mdoc(7) macros end up with the same HTML elements,
even some that require quite different presentation,
for example .Fl (wants bold) and .Ev (typewriter) \(->
.GPE_QL <code> .
.LI
Consequently, the information from the mdoc(7) macro
has to be saved in a
.GPE_QL class=
attribute, and the CSS code has to select the presentation
.GPE_EM "by class"
rather than by element type.
.LI
Also note that browsers render
.GPE_QL <code>
in typewriter font by default, but many macros resulting in
.GPE_QL <code>
require bold font instead.
.LI
Both of the previous items require
.I always
having CSS or the presentation will look very wrong.
If and
.GPE_SM "(since 2018 May 1)"
.I "only if"
no external
.GPE_EM stylesheet
is provided with the mandoc(1)
.GPE_QL "-O style="
command line option, embed a minimal stylesheet in the
.GPE_QL <head>
element using the
.GPE_QL <style>
element.
.LE
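.ig
Speaker note: the selection of element and class described above can be
sketched in a few lines of C.  This is a minimal sketch, not the mandoc
source: the macro-to-element pairs are taken from the previous slide, while
the table layout and the names elem_map, html_elem and html_open_tag are
assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Mapping taken from the slides; the table layout is hypothetical. */
struct elem_map {
	const char *macro;	/* mdoc(7) macro name, reused as the class */
	const char *elem;	/* HTML element to emit */
};

static const struct elem_map elem_maps[] = {
	{ "Fl", "code" }, { "Ev", "code" }, { "Ar", "var" },
	{ "Xr", "a" }, { "Sh", "h1" }, { "Ss", "h2" },
	{ "Rs", "cite" }, { "Pa", "span" }, { "Nd", "div" },
};

/* Return the HTML element for a macro, or NULL if it is not listed. */
const char *
html_elem(const char *macro)
{
	size_t i;

	for (i = 0; i < sizeof(elem_maps) / sizeof(elem_maps[0]); i++)
		if (strcmp(elem_maps[i].macro, macro) == 0)
			return elem_maps[i].elem;
	return NULL;
}

/*
 * Format an opening tag such as <code class="Fl"> into buf.
 * The macro name is kept in the class attribute so that CSS can
 * tell .Fl and .Ev apart even though both use <code>.
 * Returns the snprintf(3) result, or -1 for an unlisted macro.
 */
int
html_open_tag(char *buf, size_t bufsz, const char *macro)
{
	const char *elem = html_elem(macro);

	if (elem == NULL)
		return -1;
	return snprintf(buf, bufsz, "<%s class=\"%s\">", elem, macro);
}
```

With tags of this shape, a stylesheet can select by class alone, for
example making the Fl class bold while leaving the Ev class in typewriter
font.
..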
.SUBTITLE "Other HTML improvements"
.sp -0.5v
.BL
.LI
Fixed several HTML syntax violations \(em still not perfect.
.br
.GPE_SM "(2018 May 8 to 28:\
 bad element nesting, duplicate \f(CWid=\fP attributes, ...)"
.LI
HTML output line break and indentation logic.
.GPE_SM "(2017 Jan 18)"
.br
Human-readable HTML code matters \(em today just as much as 30 years ago.
.LI
C code cleanup in the HTML formatter, print_otag() reorg.
.GPE_SM "(2017 Jan 16\(en29)"
.LE
.GPE_TIME 120
.GPE_NEXT "Improvements of the presentation?"
.TITLE "Baby steps towards responsive design"
.BL
.LI
Replacement of hard-coded HTML
.GPE_QL "style="
attributes with CSS where possible, for example for
.GPE_QL ".Bl \-compact" .
.GPE_SM "(2017 July 14)"
.LI
In particular, no more fixed widths in HTML
.GPE_QL "style="
attributes.
.br
.GPE_SM "(2018 May 28 to June 25)"
.LE
.ll 13c
.mk
.BL
.LI
Use a CSS
.GPE_QL @media
query to adapt indentations to the physical screen size.
.br
.GPE_SM "(2018 May 26)"
.LI
Emit the
.GPE_QL "meta name=viewport"
element to work around the sad fact that essentially all mobile browsers
are broken in so far as they assume a fixed width of about 1000px for all
pages, even those readily adapting to whatever the physical screen size
actually is.
.sp 0.5v
There is no standard way to fix this that browsers actually support,
and even more sadly, this is an HTML workaround for something
that is purely a CSS problem.
.br
.GPE_SM "(2018 May 18)"
.LE
.ll \n[gpe_ll]u
.rt
.sp -1v
.PSPIC -R Images/Brantford18_TurtleCrossings.eps 8c
.rj
.GPE_SM "SC Johnson Rail Trail, Brantford, Ontario, Canada (2018)"
.GPE_TIME 150
.GPE_NEXT "Was the CSS also improved in other ways?"
.TITLE "Other CSS improvements"
.BL
.LI
Use real macro names as
.GPE_QL class=
attributes.
.GPE_SM "(2017 Jan 19 and 20)"
.br
That makes them much easier to understand, saving the reader from
having to learn yet another syntax.
.LI
Use CSS
.GPE_EM "child selectors"
where appropriate rather than assigning
yet another separate class to the child.
.GPE_SM "(2018 May 8)"
.LI
Avoid overqualified CSS selectors, keep them minimal by
.GPE_EM "omitting the element"
when the class alone is sufficient.
.GPE_SM "(2018 May 28 and July 23)"
.LI
Consistently use
.GPE_QL em
units throughout.
.GPE_SM "(2018 May 26 and July 23)"
.LI
Tricky CSS implementation of having .Bl -tag heads and bodies on the
same line when they fit.
.GPE_SM "(2017 Jan 24, with help from many)"
.br
This is still fragile, more help is welcome.
.LE
.mk
.PSPIC Images/Toronto18_Skyline.eps
.rt
.sp 5.5v
.in +15.6c
.S -4
Toronto,
.br
Ontario,
.br
Canada
.br
(2018)
.br
.S P
.in -15.6c
.GPE_TIME 90
.GPE_NEXT "How was the user interface improved?"
.TITLE "Concise resource identifiers and deep linking"
.BL
.LI
When using the search form, redirect to concise URIs of the form:
.br
https://man.openbsd.org/[manpath/][arch/]name[.sec]
.br
The optional parts are omitted whenever possible.
.GPE_SM "(2017 Mar 15)"
.LI
Deep linking into manual pages:
.GPE_SM "(2017 Mar 15)"
.br
To almost the same places as the less(1) :t tags on the terminal.
.br
Implemented with
.GPE_QL "id="
attributes.
.LI
Both together result in concise, human-readable URIs like:
.GPE_EM "https://man.openbsd.org/mmap.2#MAP_STACK"
.LI
Dotted underline in HTML+CSS output, hover to cut and paste the URI.
.LI
In man(7), only for .SH, .SS, and .UR due to lack of semantic information.
.LE
.ll 13.5c
.SUBTITLE "Other content improvements"
.BL
.LI
The HTML
.GPE_QL <title>
element now shows the name and section number of the manual page.
.br
.GPE_SM "(2017 Mar 15)"
.LI
Preserve leading comments, usually containing author, Copyright,
and license information.
.br
.GPE_SM "(2018 Apr 11)"
.LE
.ll \n[gpe_ll]u
.rt
.sp -1v
.PSPIC -R Images/Nassagaweya18.eps
.rj
.GPE_SM "Marsh near Nassagaweya Line 6,\
 Regional Municipality of Halton, Ontario (2018)"
.GPE_TIME 150
.GPE_NEXT "How are semantic functions communicated?"
.TITLE "Making semantic functions visible"
.BL
.LI
The mdoc(7) source code contains rich information
about what individual words represent, but that information
was never shown to the user in the past.
.LI
In HTML output,
.GPE_EM tooltips
now show the semantic function of marked-up content.
.GPE_SM "(2017 Mar 13)"
.LI
That is useful because it may occasionally help in understanding the text,
because it helps to develop the ability to use apropos(1) semantic search
effectively, and because slowly becoming familiar with the macro keys
also helps to lower the entry barrier for users who consider sending
patches to manual pages.
.LI
Implemented with
.GPE_QL "title="
attributes for now.
.br
That is bad because it confuses screen readers,
so it needs to be improved.
.br
John Gardner already explained better techniques to me.
.LE
.PSPIC Images/Guelph18_Squirrel.eps
.ce
.GPE_SM "American Red Squirrel, Guelph Lake Conservation Area,\
 Ontario, Canada (June 2018)"
.GPE_TIME 60
.GPE_NEXT "How do we help high-traffic websites?"
.TITLE "Bulk conversion of manual pages"
.BL
.LI
Traffic on man.openbsd.org is very moderate, the machine is mostly idle
and can easily render each requested page in real time.
.LI
By contrast, traffic on manpages.debian.org is so massive that their
service went offline due to overload trying to render in real time
with groff.
.LI
So they decided
.AL 1 \n(Li 1
.LI
to use mandoc rather than groff for higher speed, and
.LI
to pre-render and serve static pages
instead of rendering on demand.
.LE
.LI
Even the pre-rendering needs to be fast because they want to do it
as often as possible (at least daily) and they have several tens
of thousands of manual pages.
.LI
So, do not fork mandoc for each and every manual page;
instead, run a mandocd(8) that receives pairs of input/output
file descriptors.
.br
.GPE_SM "(2017 Feb 4, designed and mostly implemented by Michael Stapelberg)"
.LI
Can for example be driven with our new catman(8) implementation.
.LI
Both
.GPE_EM "manpages.debian.org"
and Arch Linux now use the mandoc formatter
.br
for their official online manuals.
.LI
GitHub staff are currently considering support for manual pages using mandoc:
.br
https://github.com/github/markup/pull/1196
.LE
.GPE_TIME 240
.GPE_SECTION MARKDOWN
.GPE_NEXT "What about markdown?"
.TITLE "Markdown output format"
Spare people from having to maintain two copies of documentation
and allow using mdoc(7) even when a project policy requires markdown.
.P
New output mode implemented in just two weeks, part-time.
.GPE_SM "(2017 Mar 3 to 17)"
.SUBTITLE "How simple it has become to implement a new mandoc output mode"
.TS
nl.
1600	lines of C code grand total
_
110	head matter: license, includes, protos, flag defines
140	mdoc macro dispatch table
 40	main function: header, main loop over mdoc nodes, footer (straightforward)
 60	node driver (incl. text line and roff request formatting)
 20	markdown stack handler (for blockquote (>), code blocks (tab), lists)
 60	spacing and outflags handling
 70	input escape character handling
190	output character escaping
850	mdoc node handlers (mostly straightforward)
.TE
.sp 3v
.GPE_SM "\h'8m' Cime de Caron 3193m, Vanoise, France (2017)"
.sp -8v
.PSPIC -R Images/Vanoise17_CaronStation.eps
.rj
.GPE_TIME 70 32
.GPE_NEXT "Was anything difficult?"
.TITLE "Markdown output format \(em the most difficult parts"
.BVL 1cm
.LI "Markdown output character escaping:"
Totally horrific due to context sensitivity, 190 lines, 12% of the code.
.LI "Markdown block nesting:"
Somewhat complicated due to context sensitivity,
interacts with output character escaping,
touches about a dozen places in the code,
about 90 lines of code (6%).
.LI "Horizontal spacing in the output:"
Somewhat difficult in many output formatters;
about 160 lines of code (10%).
.LE
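.ig
Speaker note: to illustrate why output character escaping is context
sensitive, here is a deliberately simplified C sketch.  It is not the
mandoc code: the name md_escape, the set of escaped characters and the
single "beginning of line" rule are assumptions chosen for illustration,
and real markdown needs many more cases.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src to dst, putting a backslash in front
 * of characters that markdown would otherwise treat as markup.
 * Even this toy version needs context: '#' is only escaped when it
 * is the first character of a line.
 * Returns the length written, or (size_t)-1 if dst is too small.
 */
size_t
md_escape(char *dst, size_t dstsz, const char *src)
{
	size_t n = 0;
	int bol = 1;	/* at the beginning of a line? */
	int esc;

	if (dstsz == 0)
		return (size_t)-1;
	for (; *src != '\0'; src++) {
		esc = strchr("*_`\\[]<", *src) != NULL ||
		    (bol && *src == '#');
		/* Room for this character, its escape, and the NUL? */
		if (n + (esc ? 2 : 1) >= dstsz)
			return (size_t)-1;
		if (esc)
			dst[n++] = '\\';
		dst[n++] = *src;
		bol = *src == '\n';
	}
	dst[n] = '\0';
	return n;
}
```

An identifier such as long_var_name gets both underscores escaped, while
a '#' in the middle of a line is copied unchanged.
..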
.sp 2
.GPE_EM "The bulk of the code (50%) is"
.br
.GPE_EM "straightforward mdoc node handling."
.mk
.sp 2
All difficult parts
.br
.GPE_SM "(except horizontal spacing, which is always somewhat tricky)"
.br
are due to quirks of the markdown language,
.br
none are intrinsic difficulties
.br
of writing a mandoc(1) output module.
.br
.rt
.sp 0.5v
.PSPIC -R Images/Stockholm15_JaervaKrog.eps
.rj
.GPE_SM "\fIIs there a way?\fP\
 Stockholm, J\(:arva Krog (during EuroBSDCon 2015)"
.GPE_TIME 90 33
.GPE_NEXT "Is markdown a good language?"
.TITLE "Markdown: how a markup language should not be designed"
.BVL 1cm
.LI "Lack of expressiveness:"
Goal: easy writing like in plain-text email;
yet e.g. no syntax for definition lists.
.LI "Context sensitivity:"
Almost every token can take different meanings depending on where it appears.
.LI "Ambiguity:"
Enclosing in asterisks/underscores:
long_var_name, **bold***italic***bold**
.LI "Mixup of semantic and presentational markup:"
No way to switch off filling without <code> tags.
.br
.S -4
Could be improved, but HTML output from markdown
is now fixed by tradition and people's CSS.
.br
.S P
.LI "Lack of independence:"
Allows and requires embedded HTML, but with crippling restrictions.
.br
.S -4
In unfilled text: no char refs, no flow-level elements, no native formatting.
.br
In indented text: no paragraph breaks, no block-level HTML.
.br
.S P
.LI "Syntax inspired by Whitespace:"
Two trailing blanks mean a line break.
.LI "Lack of both standardization and extensibility:"
Bad because it lacks features, so everybody adds their own, incompatible ones.
.LE
.sp
.mk
.GPE_EM "Do not use markdown:"
.S -4
.br
Use mdoc(7) to maintain your source documents
and mandoc(1) to convert them when needed.
.S P
.br
.rt
.sp -0.5v
.rj
See my essay on undeadly.org for more details.
.GPE_TIME 120 34
.GPE_SECTION LIBRESSL
.GPE_NEXT "What about LibreSSL?"
.mk
.TITLE "LibreSSL motivation"
.BL
.LI
Forking LibreSSL from OpenSSL was triggered
.br
by the CVE-2014-0160 "Heartbleed" vulnerability.
.LI
But the real reason was that inspecting the codebase
.br
revealed a general
.GPE_EM "neglect of basic security"
practices,
.br
not just one single vulnerability.
.LI
A second reason for forking was
.br
frequent failure of the OpenSSL team
.br
to cooperate when patches were sent.
.LI
Once forking was decided, everything happened very quickly, see below.
.LI
Initial focus was on
.GPE_EM deleting
needless code, and on preparing the code for audit;
later on code
.GPE_EM auditing ,
improving robustness and security, and the new
.GPE_EM libtls .
.LE
.TS
lll.
April 7	Damien Miller	cherrypicks the heartbleed fix from OpenSSL
April 8	Ted Unangst	"exploit mitigation countermeasures" mail on tech@
April 13	Miod Vallat	imports OpenSSL 1.0.1g
April 13	Bob Beck	first Valhalla commit, cryptlib.c rev. 1.16
April 14	Joel Sing	starts applying KNF to the OpenSSL code
April 15	Ted Unangst	removes FIPS mode support
May 17	Bob Beck	first talk on LibreSSL during BSDCan 2014
.TE
.rt
.PSPIC -R Images/LibreSSL.eps
.GPE_TIME 40 15
.GPE_NEXT "What was the task with respect to documentation?"
.TITLE "LibreSSL documentation: the task"
.BL
.LI
Just like the OpenSSL code was way below OpenBSD quality standards,
.br
so was the documentation.
.mk
.LI
Incomplete, generally sloppy,
.br
and an inferior markup language.
.LI
Just like the code needed reformatting
.br
before audit (KNF, style(9)),
.br
so did the manual pages.
.LI
But: the code could remain in C,
.br
.GPE_SM "(even though BoringSSL switched to C++)"
.br
while the manual pages had to change
.br
markup language: perlpod(1) \(-> mdoc(7).
.LI
Semi-automatic code reformatting
.br
allowed the check "no object change";
.br
no such check was possible
.br
for manual pages \(-> manual work.
.LI
Long delays because the manual work
.br
required is very substantial.
.LE
.rt
.sp
.PSPIC -R Images/Paris17_SacreCoeurTours.eps
.rj
.GPE_SM "Paris, Sacr\('e-C\(oeur (2017)"
.GPE_TIME 40 16
.GPE_NEXT "Which tool was used?"
.TITLE "LibreSSL manual page conversion: the tool"
.BL
.LI
Tool used:
.GPE_EM "pod2mdoc(1)" ,
a small, self-contained C program.
.LI
Started by Kristaps Dzonsons
.GPE_SM "(2014 Mar 20 to Apr 7)"
\(em
.br
by chance, exactly during the two weeks before heartbleed.
.LI
Some steps forward during the next two years, with long pauses.
.LI
Main difficulty: convert
.GPE_EM "presentational to semantic"
markup, requires heuristics.
.LI
Real work started during g2k14:
some code improvements by schwarze@,
.br
first 83 pages converted by bentley@.
.GPE_SM "(2014 Jul 11 to 19)"
.LI
The Ft/Fo/Fa/Fc heuristic formatter for the SYNOPSIS.
.GPE_SM "(2014 Oct 22)"
.mk
.LI
Use ohash(3) to improve markup,
.br
convert another 45 pages.
.br
.GPE_SM "(2015 Feb 12 to 23)"
.LI
Complete conversion during l2k16:
.br
the last 130 pages.
.GPE_SM "(2016 Nov 2 to 6)"
.\" 2016 Nov 2 DES DH DSA EC ERR
.\" 2016 Nov 3 EVP HMAC MD5 config PEM PKCS RAND
.\" 2016 Nov 4 RSA X509
.\" 2016 Nov 6 copyediting by jmc@
.LI
The tool was already in good shape
.br
since 2015, almost no more changes
.br
were needed during l2k16.
.\" 2016 Nov 3 write function prototypes without args using .Fn rather than .Fo
.LE
.rt
.sp 0.5v
.PSPIC -R Images/Paris17_LouvreFlorePeniche.eps
.rj
.GPE_SM "Paris, Louvre, Pavillon Flore, vue du Pont Senghor (2017)"
.GPE_TIME 40 17
.GPE_NEXT "What needed to be done by hand?"
.TITLE "Work done by hand during the conversion"
.BL
.LI
Almost always: missing macros,
.GPE_EM "no markup in the original" .
.LI
Often: macros not automatically recognized (phys to sem, Vt, Fn...).
.LI
Occasionally: fix markup that was poor in the original.
.LI
Occasionally: unusually difficult markup, like for callback functions.
.LI
A few technicalities, like removing useless character escapes.
.LI
.GPE_EM "All required reading of the complete text by a human."
.LE
.SUBTITLE "While here, do initial cleanup of content"
.mk
.ll 14c
.BL
.LI
Delete inapplicable or useless text.
.LI
Apply wording tweaks.
.LI
Improve SEE ALSO sections.
.LE
.P
It is technically desirable to keep content changes separate,
but that would be a waste of effort:
both cleanups require a human to read the complete text.
.P
After that, linting and semi-automatic checks were done:
for example, several functions in libcrypto
.br
were documented in more than one manual page.
.br
.ll \n[gpe_ll]u
.rt
.PSPIC -R Images/Paris17_LouvreCourMarly.eps
.rj
.GPE_SM "Paris, Louvre, Cour Marly (2017)"
.GPE_TIME 40 18
.GPE_NEXT "Did OpenSSL have better documentation?"
.TITLE "Initial synchronization with OpenSSL"
.BL
.LI
Systematically work through all OpenSSL manual pages.
.LI
Add Copyright and license to each file.
.GPE_SM "(2016 Nov 10 to Dec 10)"
.LI
Bring in bug fixes from OpenSSL.
.LI
Add missing pages from OpenSSL to our tree.
.GPE_SM "(2016 Nov 10 to Dec 11)"
.LI
Remove inapplicable stuff.
.GPE_SM "(e.g. 2016 Sep 5 CMS)"
.LI
While reading the text again, watch out for various kinds of bugs.
.LI
Reorg overview pages: BIO BN DH DSA EC RSA ssl
.GPE_SM "(2016 Dec 6 to 11)"
.LI
Fix d2i* pages.
.GPE_SM "(2016 Dec 24 to 2017 Jan 6)"
.br
All in one page in OpenSSL.
Ideally, such pages should say something about the actual format
but at least clearly specify what the object is.
.LE
.PSPIC Images/Paris17_LesDocks.eps
.ce
.GPE_SM "Paris, Les Docks, Quai d'Austerlitz (2017)"
.GPE_TIME 40 19
.GPE_NEXT "What needs to be done regularly?"
.TITLE "Maintenance tasks"
.BL
.LI
At irregular intervals, evaluate all changes in OpenSSL since the last sync.
.br
.GPE_SM "(syncs started 2016 Dec 10; 2017 Mar 25, Aug 19;\
 2018 Feb 12, Mar 29, ...)"
.LI
Merge changes that make sense and apply to our code.
.LE
.SUBTITLE "Maintenance related to local code changes"
.BL
.LI
Split tls_init(3).
.GPE_SM "(2017 Jan 25)"
.LI
Start merging OpenSSL-1.1 interfaces.
.GPE_SM "(2018 Feb 10, with jsing@ and tb@)"
.LI
Start constification.
.GPE_SM "(2018 May 1, with tb@)"
.LE
.SUBTITLE "Diverse major quality improvements"
.BL
.LI
Write HISTORY sections.
.GPE_SM "(2018 Mar 20 to 27)"
.LI
Systematic checks of openssl(1).
.GPE_SM "(started 2018 Mar 30, incomplete)"
.LI
Rewrite ENGINE manuals.
.GPE_SM "(2018 Apr 14)"
.LE
.P
In several cases, reading code in order to document it
revealed bugs that got fixed, for example
in X509_NAME_add_entry(3).
.GPE_SM "(2018 Apr 4)"
.GPE_TIME 40 20
.GPE_NEXT "Anything completely new?"
.TITLE "Pages written from scratch: the problem"
In a few cases, missing pages were written as soon as the lack was noticed;
.br
earliest example: BN_set_negative(3).
.GPE_SM "(2016 Nov 5)"
.P
Not in general, both because too many pages are missing,
so it would have derailed and blocked the rest of the work,
and because in some cases, functions are intentionally undocumented,
and identifying these cases is non-trivial.
.P
Some classes of functions that could be safely documented
without the risk of accidentally exposing internals:
.PSPIC Images/Paris17_Cite.eps
.ce
.GPE_SM "Paris, vue du Montmartre vers la cit\('e (2017)"
.GPE_TIME 40 21
.GPE_NEXT "What \fIcould\fP be done?"
.TITLE "Pages written from scratch: progress"
.BL
.LI
Functions referenced elsewhere in the manuals:
.br
e.g. 11 new pages in libssl
.GPE_SM "(2016 Dec 6 to 10)"
.\" 2016 Dec 6 SSL_SESSION_new(3) SSL_SESSION_print(3)
.\" 2016 Dec 7 SSL_dup_CA_list(3) SSL_dup(3)
.\"   SSL_copy_session_id(3) SSL_renegotiate(3)
.\" 2016 Dec 10 SSL_get_version(3) SSL_get_certificate(3)
.\"   SSL_get_state(3) SSL_num_renegotiations(3) SSL_get_shared_ciphers(3)
.br
X509_STORE_load_locations(3)
.GPE_SM "(2017 Jan 6)"
.br
OPENSSL_sk_new(3) and STACK_OF(3)
.GPE_SM "(2018 Mar 1)"
.LI
ASN1 and X509 constructor manuals.
.GPE_SM "(2016 Dec 12 to 2017 Jan 4)"
.br
Public objects always require at least constructor documentation.
.br
Explain not just what the constructor technically does, but what the
meaning of the constructed objects is, with refs to STANDARDS etc.
.LI
Functions analogous to documented functions,
.br
e.g.  SSL_set_tmp_ecdh(3).
.GPE_SM "(2017 Aug 12)"
.LI
Public functions with semantics that substantially differ from OpenSSL,
.br
e.g.  ASN1_STRING_TABLE_add(3).
.GPE_SM "(2017 Aug 20)"
.LI
Pages where OpenSSL manuals are seriously misleading,
.br
e.g. X509_check_private_key(3).
.GPE_SM "(2017 Aug 20)"
.LI
Only one systematic effort so far to get a particularly important
sub-library completely documented \(em but even that is still incomplete:
two new pages
.\" 2017 Jan 29 BN_set_flags(3)
.\" 2017 Jan 30 get_rfc3526_prime_8192(3)
and two new functions in existing pages in BN documentation.
.\" 2017 Jan 25 BN_asc2bn(3) ERR_load_BN_strings(3)
.GPE_SM "(2017 Jan 25 to 30)"
.LE
.GPE_TIME 40 22
.GPE_NEXT "What is still missing?"
.TITLE "The current state of affairs: what is still missing"
Much better than OpenSSL (all improvements merged, large numbers
of additional bugs fixed, many substantial content improvements,
many new pages) \(em but:
.BL
.LI
Many public functions lack manual pages in LibreSSL.
Write them from scratch or add comments saying
why they are intentionally undocumented.
.br
Many months of full-time work (at least)...
.LI
Almost all existing pages need basic copy-editing, they are in general
wordy, imprecise, and incomplete.
Fixing them usually requires comparing the text to the code,
but the code is often contorted, so reading it is time-consuming.
.br
Many months of full-time work (at least)...
.LI
Complete the first pass through the openssl(1) manual page.
Lack of motivation due to low code quality, it's basically
a quick and dirty testing tool, and it is of considerable size.
Probably about a week of full-time work, maybe more...
.mk
.br
.ll 14c
.LI
Routine syncs with OpenSSL.
These will become a serious problem in the near future
when OpenSSL changes their license and becomes non-free (Apache 2).
.LI
And of course, the usual ongoing maintenance when code is added, deleted,
or changed in the future, like for any other code and documentation.
.LE
.ll \n[gpe_ll]u
.rt
.sp 0.5v
.PSPIC -R Images/Paris17_SacreCoeurChoir.eps
.rj
.GPE_SM "Paris, Sacr\('e-C\(oeur (2017)"
.GPE_TIME 70 23
.GPE_NEXT "What did we learn from LibreSSL?"
.TITLE "Lessons learnt about LibreSSL and API design (1)"
.BL
.LI
.GPE_EM "Use standard POSIX functions."
If you want, provide fallback implementations for defective operating
systems.
If you can't, accept the truism that running defective operating systems
implies limited functionality.
Never design your API to cater for the worst possible system you can find.
That makes everybody suffer from idiosyncrasy and bloat.
For example, the \m[Kea2]BIO\m[] abomination was originally designed
to deal with shortcomings of Microsoft Windows.
.LI
.GPE_EM "Avoid wrappers"
around POSIX functions.
Exception: in application code (but not in a library), a wrapper around
malloc(3) that calls err(3) on failure is OK and can make the main
code more readable and less prone to errors.
.br
Bad example: \m[Kea2]OPENSSL_malloc(3)\m[], \m[Kea2]CRYPTO_malloc(3)\m[] \(em
both exist!
.mk
.br
.ll 13c
.LI
.GPE_EM "Follow C library semantics"
as much as possible, lest you cause misunderstandings, bugs,
and the need for confusing warnings in the documentation.
For example, \m[Kea2]ASN1_STRING_cmp(3)\m[] ought to do
lexicographical ordering like strcmp(3), but it does not.
As a last resort, if you must be different for some reason,
use a clearly different name.
.LE
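.ig
Speaker note: the one wrapper called acceptable above, an application-level
wrapper around malloc(3) that calls err(3) on failure, might look like the
following minimal sketch.  The name xmalloc is an assumption and is not
taken from the slides; err(3) comes from <err.h>, which the BSDs and glibc
provide.

```c
#include <err.h>
#include <stdlib.h>

/*
 * Hypothetical application-level wrapper: allocate or die.
 * Fine in a program, but not in a library, because a library
 * has to leave the error handling to its caller.
 * Callers are expected to pass a size greater than zero.
 */
void *
xmalloc(size_t size)
{
	void *p;

	if ((p = malloc(size)) == NULL)
		err(1, NULL);	/* report the errno message and exit */
	return p;
}
```

The main code can then use the returned pointer without a NULL check at
every call site.
..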
.ll \n[gpe_ll]u
.rt
.sp
.PSPIC -R Images/Paris16_NotreDame.eps
.rj
.GPE_SM "Paris, Notre Dame (2016)"
.GPE_TIME 90 24
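.ig
Speaker note: the point about following C library semantics can be made
concrete with a small sketch.  The function below is NOT the code of
ASN1_STRING_cmp(3); it is a hypothetical comparison, assuming a
length-first ordering, that merely shows how a function with a cmp-style
name can disagree with strcmp(3).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical comparison written only for illustration: it orders
 * by length first and looks at the bytes only when the lengths
 * agree.  That is cheap, but it is not lexicographical.
 */
int
len_first_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
	if (alen != blen)
		return alen < blen ? -1 : 1;
	return memcmp(a, b, alen);
}
```

For the strings "b" and "aa", strcmp(3) sorts "aa" first, whereas the
sketch sorts "b" first because it is shorter; a caller who expects
strcmp(3) semantics will be surprised.
..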
.GPE_NEXT "Anything about interface size?"
.TITLE "Lessons learnt about LibreSSL and API design (2)"
.BL
.LI
.GPE_EM "Minimize the number of public API functions."
With too many functions, some functions will likely remain undocumented
for lack of time to write the text.
With too many functions, users will lose their way
and fail to find the function they need.
It is harder to keep a large number of functions consistent
than a small number.
Particularly bad example: \m[Kea2]PEM_read_bio_PrivateKey(3)\m[].
.LI
In particular, avoid families of nearly identical functions
differing only in one minor aspect.
They are almost impossible to document properly: it is very hard
to keep the documentation from becoming both vague and repetitive.
.br
Just design your API in some different way.
.br
Bad offenders:
see \m[Kea2]CRYPTO_set_ex_data(3)\m[], \m[Kea2]BIO_set_ex_data(3)\m[].
.LI
In particular, if the prototypes agree with each other, make them
one function, with the behaviour controlled by one of the arguments.
.br
Bad example: \m[Kea2]DES_ofb64_encrypt(3)\m[] and its companions.
.LI
Be wary: even when doing so, you can still create absurdly large
APIs; for a particularly bad example, see \m[Kea2]EVP_EncryptInit(3)\m[].
For complex tasks, avoiding such failure is not trivial and
requires careful and disciplined design.
.LI
Never create families of functions differing only in one type
and one component of their name.
.GPE_EM "Never generate function names from preprocessor macros."
.br
Such functions are almost impossible to document.
.br
Particularly bad examples: \m[Kea2]STACK_OF(3)\m[], \m[Kea2]lh_new(3)\m[].
.LE
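.ig
A minimal sketch of the advice to merge functions with identical
prototypes into one function controlled by an argument.  All names are
hypothetical: instead of a pair like hex_encode_upper() and
hex_encode_lower(), a single hex_encode() takes an enum argument.

```c
#include <assert.h>
#include <stddef.h>

enum hex_case {
	HEX_LOWER,
	HEX_UPPER
};

/*
 * Hypothetical helper: write the hexadecimal representation of
 * len bytes at src into dst as a NUL-terminated string.
 * Returns 0 on success or -1 if dst is too small.
 */
int
hex_encode(char *dst, size_t dstsz, const unsigned char *src, size_t len,
    enum hex_case hc)
{
	const char	*digits;
	size_t		 i;

	assert(dst != NULL && (src != NULL || len == 0));

	/* Need two characters per byte plus the terminating NUL. */
	if (dstsz == 0 || len > (dstsz - 1) / 2)
		return -1;

	digits = hc == HEX_UPPER ? "0123456789ABCDEF" : "0123456789abcdef";
	for (i = 0; i < len; i++) {
		dst[2 * i] = digits[src[i] >> 4];
		dst[2 * i + 1] = digits[src[i] & 0x0f];
	}
	dst[2 * len] = 0;
	return 0;
}
```

One function means one manual page entry and one place to explain
the buffer size rule.
..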
.GPE_TIME 60 25
.GPE_NEXT "Anything about objects?"
.TITLE "Lessons learnt about LibreSSL and API design (3)"
.BL
.LI
.GPE_EM "Minimize the number of objects"
in object oriented code.
Each object needs at least one manual page, usually more than one
if it is non-trivial.
Constructor manuals usually add a lot of volume with relatively
little useful content.
.LI
Avoid representing the same logical entity
on two different API levels.
For example, ASN.1 objects ought to be either represented as NIDs
throughout (with utility functions to retrieve names etc.) or as
objects (structs) throughout.
OpenSSL provides both, also resulting in duplicate sets of accessors
and in duplication in several other interfaces.
.br
See \m[Kea2]OBJ_nid2obj(3)\m[] for the ugly consequences.
.LI
.GPE_EM "Avoid redundant interfaces."
For example, if your objects have child objects, either provide
copying accessors only and require users to
.I always
free the retrieved copies after use,
or provide only accessors that return references and require users
to copy them themselves when needed.
Providing both causes confusion, invites bugs, and bloats the interface
and the documentation.
.LI
.GPE_EM "Avoid cryptic naming conventions"
like \m[Kea2]get0, get1, add0, add1\m[].
They require users to learn additional rules and make understanding
harder than necessary.
.br
If you feel tempted, your interface is getting too complicated.
.LI
Be consistent about whether copies are deep or shallow; inconsistency is a major source
of confusion and bugs.
OpenSSL is inconsistent and rarely even documents it.
.LE
.GPE_TIME 30 26
.GPE_NEXT "Even more about objects?"
.TITLE "Lessons learnt about LibreSSL and API design (4)"
.BL
.LI
.GPE_EM "Never provide public accessors that can break invariants" ,
like
\m[Kea2]ASN1_STRING_length_set(3)\m[].
They are a sure sign of failed interface design.
.LI
.GPE_EM "Think twice before using callbacks" ;
they make interfaces and documentation significantly more complex
and massively obstruct call tree analysis during auditing.
If you must use them, typedef the public callback prototypes.
.br
A SYNOPSIS becomes inscrutable when callback prototypes appear
verbatim as function arguments or return values.
Bad offender: \m[Kea2]BIO_meth_get_read(3)\m[].
.LI
.GPE_EM "Avoid object flags radically changing the behaviour"
of the object, for example ASN1_OBJECT_FLAG_DYNAMIC or BN_FLG_CONSTTIME.
They cause surprising behaviour, invite bugs, and complicate
the documentation substantially.
Look at \m[Kea2]ASN1_OBJECT_free(3)\m[] and \m[Kea2]BN_set_flags(3)\m[]
for particularly bad examples.
.mk
.br
.ll 12c
.LI
Never define types that are essentially typedefs for
.GPE_QL "void *" .
They are confusing and merely cause a false sense of type safety.
If you really must sacrifice type safety, use
.GPE_QL "void *"
directly.
.br
Bad counter-example: \m[Kea2]ASN1_VALUE\m[].
.LE
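.ig
A minimal sketch of the advice to typedef public callback prototypes.
All names (read_cb, struct reader, reader_set_cb, reader_get_cb,
fill_zero) are hypothetical and only serve as an illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One typedef documents the callback prototype in a single place. */
typedef int (*read_cb)(void *ctx, char *buf, size_t len);

struct reader {
	read_cb	 cb;
	void	*ctx;
};

/* With the typedef, setter and getter prototypes stay readable. */
void
reader_set_cb(struct reader *r, read_cb cb)
{
	assert(r != NULL);
	r->cb = cb;
}

read_cb
reader_get_cb(const struct reader *r)
{
	assert(r != NULL);
	return r->cb;
}

/*
 * Without the typedef, the same getter would have to be declared as:
 * int (*reader_get_cb(const struct reader *r))(void *, char *, size_t);
 */

/* Trivial sample callback: fill the buffer with zero bytes. */
int
fill_zero(void *ctx, char *buf, size_t len)
{
	(void)ctx;
	memset(buf, 0, len);
	return (int)len;
}
```

The commented-out declaration shows the kind of prototype that makes
a SYNOPSIS hard to read.
..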
.ll \n[gpe_ll]u
.sp 0.5v
.GPE_SM "Novi Beograd, Zapadna kapija (West gate) during EuroBSDCon 2016"
.br
.rt
.PSPIC -R Images/Beograd16_ZapadnaKapija.eps
.GPE_TIME 30 27
.GPE_NEXT "Any additional pitfalls?"
.TITLE "Lessons learnt about LibreSSL and API design (5)"
.BL
.LI
.GPE_EM "Avoid functions that show radically different behaviour"
depending on input arguments.
For example, a function ought to either fill in provided storage
or allocate new storage, but not decide itself based on whether
it was given a NULL pointer.
A function ought to either operate on NUL-terminated strings or on
fixed-length char buffers, but not decide itself based on whether
it was given a length of -1.
Radical behaviour changes cause confusion, invite bugs, and force
wordy and unwieldy documentation.
.br
Particularly bad offender: \m[Kea2]ASN1_item_d2i(3)\m[].
.LI
.GPE_EM "Good naming is vital for comprehensibility."
Absolutely avoid wrong names;
.br
they confuse users, invite bugs,
and make correct documentation sound wrong.
.mk
.br
.ll 13.5c
For example, the type \m[Kea2]ASN1_STRING_TABLE\m[] is really a
.IR "table entry" ,
not a complete
.IR table .
.br
For example, the type \m[Kea2]ASN1_TYPE\m[] does not contain a type at all,
but a value of arbitrary type, so it should be called ASN1_VALUE_ANY
or similar.
.br
For example, \m[Kea2]X509_check_private_key(3)\m[] compares the
.I public
key components only.
.LI
Limit function arguments to reasonable numbers.
.LE
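.ig
A minimal sketch of the first item on this slide: rather than one
function that either fills provided storage or allocates, depending on
whether it gets a NULL pointer, provide two functions that each do one
thing.  The names name_copy() and name_dup() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Fill caller-provided storage; never allocates.
 * Returns 0 on success or -1 if dst is too small.
 */
int
name_copy(char *dst, size_t dstsz, const char *src)
{
	size_t	 len;

	assert(dst != NULL && src != NULL);
	len = strlen(src);
	if (len >= dstsz)
		return -1;
	memcpy(dst, src, len + 1);
	return 0;
}

/*
 * Always allocates; the caller must free(3) the result.
 * Returns NULL if memory allocation fails.
 */
char *
name_dup(const char *src)
{
	size_t	 len;
	char	*dst;

	assert(src != NULL);
	len = strlen(src);
	if ((dst = malloc(len + 1)) == NULL)
		return NULL;
	memcpy(dst, src, len + 1);
	return dst;
}
```

Each function has exactly one ownership rule, so each can be
documented in one sentence.
..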
.ll \n[gpe_ll]u
.rt
.sp 2v
.PSPIC -R Images/Cambridge16_KingsCollege.eps
.rj
.GPE_SM "\fIRural countryside:\fP Cambridge, Kings College (2016)"
.GPE_TIME 40 28
.GPE_NEXT "Do these lessons ever end?"
.TITLE "Lessons learnt about LibreSSL and API design (6)"
.BL
.LI
.GPE_EM "Get syntax and semantics right when first adding a function."
.br
Changing syntax or semantics in a later release not only necessitates
changes to application programs but also complicates documentation.
Adding a function constitutes a huge responsibility and is not to be
done lightly.
.LI
.GPE_EM "Keep logging and error reporting as simple as possible."
.br
I'm not aware of any other subject area so prone to overengineering.
.br
\m[Kea2]ERR(3)\m[], ERR_get_error(3), ERR_error_string(3),
ERR_print_errors(3), ERR_GET_LIB(3), ERR_put_error(3),
ERR_set_mark(3), ERR_load_strings(3), ERR_load_crypto_strings(3), ...
.LI
.GPE_EM "Keep configuration and initialization simple."
.br
They are also quite prone to feature creep and spaghetti code.
See \m[Kea2]OPENSSL_config(3)\m[] and the functions mentioned there
for a bad example.
.LE
.mk
.PSPIC Images/BrockeQol.eps
.rt
.sp 2.5v
.in +17.5c
.S -4
John Brocke,
.br
Qol/Voice,
.br
1987
.sp
(Calgary,
.br
Glenbow Museum,
.br
2015)
.br
.S P
.in -17.5c
.GPE_TIME 80 29
.GPE_SECTION DATABASE
.GPE_NEXT "What's wrong with SQLite?"
.TITLE "The problem with SQLite"
.BL
.LI
Frequent releases: on average 5 feature
and 6 bugfix releases per year.
.GPE_SM "(2011\(en2017)"
.ig
2017: 3.16-3.21 = 6 + 6 bugfix releases
2016: 3.10-3.15 = 6 + 9 bugfix releases
2015: 3.8.8-3.9 = 5 + 8 bugfix releases
2014: 3.8.3-3.8.7 = 5 + 8 bugfix releases
2013: 3.7.16-3.8.2 = 5 + 5 bugfix releases
2012: 3.7.10-3.7.15 = 6 + 3 bugfix releases
2011: 3.7.5-3.7.9 = 5 + 4 bugfix releases
2010: 3.6.22-3.7.4 = 7 + 2 bugfix releases
2009: 3.6.8-3.6.21 = 14 + 3 bugfix releases
2008: 3.5.5-3.6.7 = 13 + 2 bugfix releases
2007: 3.3.9-3.5.4 = 17
..
.LI
Extensive changes in each release:
about +15k \-5k LOC to audit per year
.br
in 2015\(en2017, counting the directory src/ only.
That's half the volume
.br
of the complete mandoc codebase \(em but every year anew!
.ig
3.22 +3810 -1286
3.21 +2754 -2615
3.20 +3426 -1341
3.19 +1272 -640
3.18 +2348 -665
3.17 +996 -547
3.16 +3166 -1683
3.15 +2996 -1466
3.14 +2571 -1215
3.13 +3100 -1263
3.12 +2289 -1551
3.11 +2170 -1445
3.10 +4062 -1717
3.9 +2552 -1598
3.8.11 +3102 -4704
3.8.10 +1326 -727
3.8.9 +2282 -1336
3.8.7 +3779 -1814
..
.LI
.GPE_EM "That volume of changes is impossible to audit."
.LI
Also, the coding style is so radically different from OpenBSD that
.br
people would be unwilling to audit the code even if they had the time.
.mk
.br
.ll 11c 
.LI
Forking would be an absurd waste:
.br
SQLite upstream provides
.br
excellent maintenance and quality.
.LI
So it is the
.GPE_EM "ideal software for ports" :
.br
trustworthy and stable
.br
without us checking
.br
and continuously re-checking
.br
all the details.
.LI
But unfortunately,
.br
mandoc used it in base...
.LE
.ll \n[gpe_ll]u
.rt
.sp 0.5v
.PSPIC -R Images/Nantes16_FosseDuChateau.eps
.rj
.GPE_SM "\fIThat's massive:\fP Nantes, Fosse du Chateau (during p2k16)"
.GPE_TIME 60 4
.GPE_NEXT "What was the root of the problem?"
.INACTIVE_TITLE "The root cause of the problem"
.ce
.GPE_EM "Schematic, superficial approach to architecture."
.SUBTITLE Antipattern:
.BL
.LI
What kind of task is at hand? \(em Database.
.GPE_SM "(not completely wrong)"
.mk
.LI
Which is the simplest and highest quality standard toolkit
.br
for that kind of problem? \(em SQLite.
.GPE_SM "(completely true)"
.LI
So reuse that as a dependency.
.GPE_SM "(wrong, premature conclusion)"
.LE
.SUBTITLE "Three errors:"
.AL
.LI
.GPE_EM Requirements
were not evaluated in detail, but only summarily \(->
.br
selection of excessively powerful, large, heavyweight tools.
.LI
.GPE_EM "Integration costs"
were not evaluated \(->
the tool may save time and code lines for the functionality itself,
but may require wasting effort on glue code.
.LI
.GPE_EM "Maintenance costs"
were not evaluated \(em
.br
and it turned out that maintenance was impossible,
see the previous slide \(->
.br
that was the fatal error which forced us to redo the work.
.LE
.rt
.sp 0.5v
.PSPIC -R Images/SQLite.eps 5c
.GPE_TIME 0 5
.GPE_NEXT "What do we actually need in mandoc?"
.INACTIVE_TITLE "The actual requirements"
.BVL 1cm
.LI "Frequent Reading"
Keep that in mind for the design.
.LI "Only specific query types"
\&... whereas a strength of SQL is flexibility of queries
and good performance
.br
on any kind of query, even those that were never anticipated.
.LI "Limited database size when reading"
\&... whereas a strength of a full-scale database
is scaling to huge database sizes.
.LI "Rare writing, only at system update and pkg_add times"
\&... whereas a strength of a full-scale database
is efficient writing.
.LI "Limited database size when writing"
\&... whereas a strength of a full-scale database
is scaling write performance.
.LI "Linear search is good enough"
\&... whereas a strength of a full-scale database
is optimized searching,
.br
for example using index and hashing techniques.
.LE
.P
.S -4
The most common forms of queries are apropos(1) substring
and regular expression searches.
.br
Even the simplest search optimizations like binary searching
or hashing are hopeless with that.
.br
Speed optimization for man(1) lookups would be possible but useless:
they are very fast anyway.
.br
.S P
.GPE_TIME 0 6
.GPE_NEXT "How did we solve this?"
.INACTIVE_TITLE "The solution for mandoc: searching pages"
Dedicated database format from scratch with minimal structural overhead.
.P
A very fast and simple way to repeatedly read
.br
the same parts of a file of moderate size from disk:
.P
Use the kernel's buffer cache by simply
.GPE_EM "mmap(2)" ing
the file into RAM.
.P
Keep the data typically searched together contiguous, as close together
as possible, to minimize the number of pages actually faulted into RAM.
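.ig
A minimal sketch of the mmap(2) approach described above, assuming a
regular, non-empty file.  The helper name map_file() is hypothetical
and is not claimed to match the mandoc sources.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a whole file read-only into memory.
 * On success, store the file size in *len and return the mapping;
 * the caller releases it with munmap(2).  Return NULL on failure.
 */
const char *
map_file(const char *path, size_t *len)
{
	struct stat	 st;
	void		*p;
	int		 fd;

	assert(path != NULL && len != NULL);

	if ((fd = open(path, O_RDONLY)) == -1)
		return NULL;
	if (fstat(fd, &st) == -1 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}

	/* Pages are only faulted in when they are actually accessed. */
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);

	/* The mapping remains valid after closing the descriptor. */
	close(fd);
	if (p == MAP_FAILED)
		return NULL;
	*len = (size_t)st.st_size;
	return p;
}
```

Repeated runs then read from the kernel's buffer cache without any
caching code in the program itself.
..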
.P
For example, the list of one-line descriptions (used for apropos)
looks like this:
.sp
.VERBON 18 14
ACME client\e0convert addresses into file names and line numbers.\e0
format floppy disks\e0apply a command to a set of arguments\e0...
.VERBOFF
.mk
.ll 14c
.sp
.SUBTITLE "Trivial lookup algorithm"
.BL
.LI
Do a linear search for the substring.
.LI
When the N-th title matches, ...
.LI
\&... access the data struct for the N-th page.
.LE
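.ig
The trivial lookup algorithm above can be sketched as follows, assuming
the descriptions are stored as consecutive NUL-terminated strings and
the block ends with a NUL byte.  The function name find_entry() is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: linear substring search in a block of
 * consecutive NUL-terminated strings.
 * Returns the zero-based index N of the first matching entry,
 * or -1 if no entry matches or the block is not NUL-terminated.
 */
long
find_entry(const char *base, size_t len, const char *needle)
{
	const char	*cp, *end;
	long		 n;

	assert(base != NULL && needle != NULL);

	/* Refuse data that would make strlen(3) run past the end. */
	if (len == 0 || base[len - 1] != 0)
		return -1;

	end = base + len;
	n = 0;
	for (cp = base; cp < end; cp += strlen(cp) + 1) {
		if (strstr(cp, needle) != NULL)
			return n;
		n++;
	}
	return -1;
}
```

The returned index N is then used to access the N-th page record.
..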
.ll \n[gpe_ll]u
.rt
.sp 0.5v
.PSPIC -R Images/Nantes16_NotreDameAnneaux.eps
.rj
.GPE_SM "\fILookup:\fP Nantes, Notre Dame, vue par les Anneaux (2016)"
.GPE_TIME 0 7
.GPE_NEXT "How do we access search results?"
.INACTIVE_TITLE "The solution for mandoc: access search results"
For information retrieval, the file contains a list of records of pointers;
.br
in C code, each record can be accessed as a struct.
.P
For example, the record for the ACME client page
simply contains the pointers:
.BL
.LI
to the page name in the page name list ("acme-client")
.LI
to the section in the section list ("1")
.LI
to the architecture in the architecture list ("")
.LI
to the one line description ("ACME client")
.LI
to the filename ("man1/acme-client.1")
.LE
.sp
So, a command like
.GPE_QL "apropos acme" :
.AL
.LI
Maps and linearly searches the name and description lists.
.LI
Looks up the N-th record in the table of manual pages.
.LI
Follows the pointers to the name and description (already cached)...
.LI
\&... and to the section (will be faulted into RAM at this point).
.LI
Assembles the line "acme-client(1) \(em ACME client" and prints it.
.LE
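.ig
A minimal sketch of following the pointers in one page record and
assembling the output line.  The record layout shown here (five 32-bit
byte offsets from the start of the file image) and the names
struct page_rec, rec_str() and format_entry() are assumptions for
illustration only; the real format is documented in mandoc.db(5).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: byte offsets into the mapped file image. */
struct page_rec {
	uint32_t	 name;	/* page name */
	uint32_t	 sect;	/* section */
	uint32_t	 arch;	/* architecture, may be empty */
	uint32_t	 desc;	/* one-line description */
	uint32_t	 file;	/* file name */
};

/* Return the string at offset off, or NULL if it is not inside the image. */
static const char *
rec_str(const char *base, size_t len, uint32_t off)
{
	if (off >= len || memchr(base + off, 0, len - off) == NULL)
		return NULL;
	return base + off;
}

/*
 * Write "name(sect) - desc" into buf.
 * Returns 0 on success, -1 on invalid offsets or a too small buffer.
 */
int
format_entry(const char *base, size_t len, const struct page_rec *rec,
    char *buf, size_t bufsz)
{
	const char	*name, *sect, *desc;
	int		 n;

	assert(base != NULL && rec != NULL && buf != NULL);

	name = rec_str(base, len, rec->name);
	sect = rec_str(base, len, rec->sect);
	desc = rec_str(base, len, rec->desc);
	if (name == NULL || sect == NULL || desc == NULL)
		return -1;

	n = snprintf(buf, bufsz, "%s(%s) - %s", name, sect, desc);
	return (n < 0 || (size_t)n >= bufsz) ? -1 : 0;
}
```

Validating every offset before use keeps a corrupt database file
from causing reads outside the mapping.
..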
.GPE_TIME 0 8
.GPE_NEXT "What does the sequence of operations look like?"
.INACTIVE_TITLE "The solution for mandoc: program execution"
The ktrace(1) output is very short and clean:
.AL
.LI
let ld.so(1) load libc, libz, libutil
.LI
pledge(2)
.LI
open(2) the file mandoc.db(5)
.LI
mmap(2)
.LI
search (not even visible in ktrace, purely userland)
.LI
access(2) to validate the resulting file name
.LI
retrieve the record (not even visible in ktrace, purely userland)
.LI
write(2) the result line
.mk
.LE
.sp
Similarly for semantic searches:
.P
One table for each macro key.
.P
The full format is documented
.br
in mandoc.db(5).
.br
.rt
.PSPIC -R Images/Nantes16_AdminArbre.eps
.rj
.GPE_SM "Nantes, Maison de l'Administration Nouvelle (2016)"
.GPE_TIME 0 9
.GPE_NEXT "Summary about the solution"
.INACTIVE_TITLE "The solution for mandoc: summary"
.BL
.LI
This may sound like "heavily optimized for performance".
.LI
But no, quite to the contrary, it is actually heavily optimized
.br
for simplicity and readability of code.
.LI
Performance is merely a by-product
of choosing a well-adapted data format
.br
and the simplest possible algorithms.
.LE
.PSPIC Images/Nantes16_SteAnneCale3.eps
.ce
.GPE_SM "Nantes, Ste. Anne et la Grue Titan Jaune vue de la Cale 3 (2016)"
.GPE_TIME 0 10
.GPE_NEXT "How does it perform?"
.INACTIVE_TITLE "Performance of the new mandoc.db(5)"
.mk
.BL
.LI
Half the database size.
.br
For example for OpenBSD 6.3
.br
/usr/share/man/: 4.35 \(-> 2.05 MB
.LI
Double lookup speed.
.br
For example, for
.br
.GPE_QL "man \-w pledge"
.br
on my notebook:
.br
from disk: 2.7ms \(-> 1.2ms
.br
from buffer cache: 2.1ms \(-> 0.9ms 
.LI
Small increase in the database rebuild time.
.br
For example for /usr/share/man/ on my notebook: 4.1s \(-> 5.1s
.br
We don't care that much about rebuild times...
.br
Doubling it would be unfortunate, but 25% is not an issue.
.LI
Substantially slower database update:
.br
For example, add one page to /usr/share/man/ on my notebook:
7.5ms \(-> 330ms
.br
That is about 44 times longer (because SQLite efficiently inserts data,
while the new makewhatis(8) just reads in the whole file, manipulates
the data structures in memory, and writes out the whole file again) \(em
but, honestly, how does less than half a second matter when
doing system updates or installing new packages?
.LE
.rt
.sp -0.5v
.PSPIC -R Images/Nantes16_Fonderies.eps
.rj 2
.GPE_SM "Nantes, Jardin des Fonderies,"
.br
.GPE_SM "Rue Louis Joxe (2016)"
.GPE_TIME 0 11
.GPE_NEXT "How did the source code change?"
.INACTIVE_TITLE "Source code"
.sp -1v
.TS
allbox center;
lnnnl.
	SQLite	new code	change
database *.c LOC	205,000	1,570	\-99.2%
database *.c size	7,200kB	44kB	\-99.4%
database *.h LOC	10,700	225	\-97.9%
database *.h size	508kB	12kB	\-97.6%
makewhatis glue LOC	365	125	\-65%	now 2350 LOC
apropos glue LOC	510	475	\-7%	now 850 LOC
.TE
.sp
makewhatis(8):
Practically no change to the bulk of the code (file system iteration,
mdoc(7) node handling, man(7) and *.cat page handling).
.P
apropos(1), man(1):
Search and lookup code is now substantially different,
.br
but
.GPE_EM "not larger" ,
even though it still does macro specific searches
and still supports logical operations like "and", "or", and parentheses.
.mk
.P
.ll 12c
Both:
the new code is much
.GPE_EM "easier to read"
\(em
.br
constructing command strings
.br
of one programming language (SQL)
.br
in another programming language (C)
.br
is not the kind of code you want to audit.
.br
.ll \n[gpe_ll]u
.rt
.sp -0.5v
.PSPIC -R Images/Nantes16_RondPoint.eps
.rj
.GPE_SM "\fIPlus beau:\fP Nantes, Rond-Point du Pont Willy Brandt (2016)"
.GPE_TIME 0 12
.GPE_NEXT "What are the benefits of the new mandoc.db(5)?"
.TITLE "Immediate benefits of the new mandoc.db(5)"
.sp -0.5v
.BL
.LI
Designed and implemented a new database backend from scratch
.br
using only mmap(2) and POSIX C.
.LI
Deleted SQLite from the OpenBSD base system.
.GPE_SM "(sthen@ 2016 Sep 23)"
.LI
200,000 fewer lines of code to maintain.
.LI
15,000 fewer lines of new code to audit per year.
.LI
Half the database sizes, double lookup speed.
.LI
300 fewer lines of very ugly glue code in mandoc itself.
.GPE_SM "(2016 Aug 1)"
.LE
.mk
.ll 12c
.SUBTITLE Costs
.BL
.LI
Much slower update time
.br
(but still below half a second).
.LI
Slightly slower rebuild time
.br
(5 instead of 4 seconds).
.LI
Had to write slightly below
.br
2000 lines of new code \(em
.br
now two years old,
.br
needs almost no maintenance.
.LE
.ll \n[gpe_ll]u
.rt
.sp -0.5v
.PSPIC -R Images/Nantes16_AdminPanneau.eps
.rj
.GPE_SM "Nantes, Maison de l'Administration Nouvelle (2016)"
.GPE_TIME 60 13
.GPE_NEXT "What did we learn from the SQLite removal?"
.INACTIVE_TITLE "Generic lessons from the SQLite removal"
.mk
.BL
.LI
.GPE_EM "Do not blindly use standard tools."
.LI
Evaluate specific requirements before
.br
selecting tools, do not fall for buzzwords.
.LI
Evaluate integration costs before deciding.
.LI
Try to estimate maintenance cost
.br
before committing to a tool.
.LI
Seriously consider using the
.br
.GPE_EM "POSIX C library as your main toolkit:"
.br
It is surprisingly powerful.
.P
.S -4
In a long-term project that sees substantial use,
the additional effort required for using pure C
.br
is sometimes justified by the resulting simplicity,
self-containedness, and maintainability,
.br
even when performance is *not* the main concern.
.br
.S P
.LI
Quality is multi-dimensional,
so even if a tool is excellent in many respects,
.br
it may not be good enough.
.LI
Quality does not imply adequacy, so even if something is excellent
with respect to
.I all
its goals and the goals cover all your needs,
it may not be good enough.
.LI
"Light" is a relative statement and may not be light enough.
.LE
.P
Of course, all this applies to long-term maintenance of public
software used by many people \(em
.I not
to private, quick
and dirty sysadmin scripts.
.br
.rt
.sp -0.5v
.PSPIC -R Images/Nantes16_NotreDameJaune.eps
.rj 2
.GPE_SM "Nantes, Notre Dame, vue de"
.br
.GPE_SM "la Grue Titan Jaune (2016)"
.GPE_TIME 0 14
.GPE_SECTION PORTS
.GPE_NEXT "What about mandoc in ports?"
.TITLE "Better mandoc support for third-party manual pages"
.BL
.LI
In 2016, about 200 ports remained whose manual pages mandoc
could not handle because they used unimplemented low-level
roff(7) features.
.LI
A parser unification performed in 2015\(en2017
now allows generating syntax tree nodes in the roff parser,
facilitating implementation of about two dozen additional
low-level roff(7) requests and escape sequences.
.GPE_SM "(2017 May 4 to June 18)"
.LI
In addition to that, several improvements were implemented
in the tbl(7) formatter, most notably filling text inside table columns.
.GPE_SM "(2017 June 7 to 27)"
.LI
As a result, the number of OpenBSD ports that
.GPE_EM "still require groff"
was reduced from over 200 to just 25 \(em or
.GPE_EM "0.25% of ports."
.GPE_SM "(2017 May 7 to June 29)"
.LI
Mandoc can now handle the manual pages of GNU troff itself.
.GPE_SM "(2018 Aug 9 to 25)"
.LE
.PSPIC Images/Paris17_LouvrePyramide.eps 14c
.ce
.GPE_SM "Paris, Louvre, La Pyramide (2017)"
.GPE_TIME 100
.GPE_NEXT "What about mandoc in ports?"
.mk
.INACTIVE_TITLE "Mandoc in ports: how it began"
Initially, all manuals in ports
.br
were formatted with groff(1)
.br
at package build time
.br
(USE_GROFF Makefile variable).
.sp
Ports were switched to
.br
install source manuals instead
.br
after checking that the manuals
.br
worked well with mandoc(1).
.sp 2v
.SUBTITLE "Three reasons to get rid of USE_GROFF"
.AL
.LI
Installing source manuals
.GPE_EM "allows using semantic searching"
with apropos(1) \(em though so far, that mostly applies to mdoc(7)
manuals and doesn't make much of a difference for man(7) manuals.
.LI
Avoiding dependencies simplifies optimization
of bulk builds for speed.
.LI
Getting rid of USE_GROFF altogether
.br
would take one complication out of the ports build infrastructure.
.LE
.rt
.sp
.PSPIC -R Images/Paris17_InstitutDeFrance.eps
.rj
.GPE_SM "Paris, Institut de France, Quai de Conti (2017)"
.GPE_TIME 0 35
.GPE_NEXT "Did we get rid of USE_GROFF?"
.INACTIVE_TITLE "Mandoc in ports: how it evolved"
.mk
.TS
rnnn.
	USE_GROFF	all ports
Oct 20, 2010	3080	5750	54%
BSDCan 2011	2850	6150	46%
BSDCan 2012	2516	6967	36%
BSDCan 2013	2180	7790	28%
BSDCan 2014	1210	8260	15%
\m[red]BSDCan 2015	215	8569	2.5%\m[]
BSDCan 2016	199	8858	2.2%
BSDCan 2017	76	8898	0.85%
BSDCan 2018	28	8930	0.3%
.TE
.sp 2.5v
.BL
.LI
Progress in 2011\(en2014 mostly by straightforward implementation
.br
of missing low-level features.
.LI
State in early 2015: USE_GROFF remained in about 250 ports.
.LI
First list of reasons for all remaining USE_GROFFs
.br
drafted by naddy@ before p2k15.
.LI
Explicit effort to reduce the list during p2k15 only fixed 22 out of 250
.br
because the top remaining reasons (ta 60, ti 50, \eh 30)
were almost unfixable.
.LE
.rt
.sp 0.5v
.PSPIC -R Images/Nantes16_PontTabarty.eps
.rj
.GPE_SM "\fIObstacle:\fP Nantes, Boulevard Maurice Bertin, vue vers le Pont \('Eric Tabarly (2016)"
.GPE_TIME 0 36
.GPE_NEXT "Which was the latest leap forward?"
.INACTIVE_TITLE "Parser unification"
Originally, mandoc(1) only had mdoc(7) and man(7) parsers,
no roff(7) handling.
.P
Partial roff(7) request handling was added in 2010,
.br
but purely in the form of a preprocessor.
.GPE_SM "(see my BSDCan 2011 talk)"
.P
But several roff(7) requests operate in a way similar to macros
.br
and require syntax tree nodes for representation:
.br
producing output (.br .mc .sp) or
changing formatter state (.ce .ft .ll .po .rj .ta .ti)
.SUBTITLE "Extensive reorganization"
.BL \n(Pi 1
.LI
Unified types:
enum roff_type, struct roff_node, roff_meta, roff_man
.br
.GPE_SM "(2015 Apr 2 to 18)"
.\" enum roff_type (2015 Apr 2)
.\" struct roff_node (2015 Apr 2)
.\" struct roff_meta (2015 Apr 2)
.\" struct roff_man (2015 Apr 18)
.LI
Unified node handling library in roff.c.
.GPE_SM "(2015 Apr 19)"
.LI
Unified way to use the ohash library.
.GPE_SM "(2015 Oct 13)"
.LI
Separate the validation phase from parsing.
.GPE_SM "(2015 Oct 20)"
.LI
Unified token IDs: enum roff_tok, roff_name[].
.GPE_SM "(2017 Apr 24)"
.LI
Use ohash for all request and macro tables.
.GPE_SM "(2017 Apr 29)"
.LE
.P
.GPE_EM "Syntax tree nodes can now be generated on the roff(7) level."
.P
Framework for terminal and HTML output from these roff nodes.
.GPE_SM "(2017 May 4)"
.GPE_TIME 0 37
.GPE_NEXT "Which features did this make possible?"
.INACTIVE_TITLE "New low-level roff(7) features"
Generating nodes on the roff(7) level allowed implementing
many new roff(7) requests and escape sequences,
(ab)used by man(7) pages in ports.
.sp 0.5v
.BVL 1cm
.LI "Requests moved from the man(7) parser to the roff(7) parser:"
\&.br .ft
.GPE_SM "(2017 May 4)"
\&.ll .sp
.GPE_SM "(2017 May 5)"
.LI "New requests changing state visible to the formatters:"
\&.ta
.GPE_SM "(2017 May 7)"
\&.ti
.GPE_SM "(2017 May 8)"
\&.ce
.GPE_SM "(2017 June 6)"
\&.rj .po
.GPE_SM "(2017 June 14)"
.LI "New requests changing preprocessor state only:"
\&.ec .eo
.GPE_SM "(2017 June 3)"
\&.rn
.GPE_SM "(2017 June 6)"
\&.als
.GPE_SM "(2017 June 14)"
\&.am
.GPE_SM "(2017 June 18)"
.br
\en auto-increment and .nr step size
.GPE_SM "(2018 Apr 9)"
.br
\&.nop
.GPE_SM "(2018 Aug 9)"
\e#
.GPE_SM "(2018 Aug 18)"
\&.while
.GPE_SM "(2018 Aug 24)"
\&.char
.GPE_SM "(2018 Aug 25)"
.LI "New requests and escapes producing output:"
\&.mc
.GPE_SM "(2017 June 4)"
\eh
.GPE_SM "(2017 June 1 and 14)"
\el
.GPE_SM "(2017 June 2)"
\ep
.GPE_SM "(2017 June 13)"
.br
\e[charNNN]
.GPE_SM "(2018 Aug 10)"
\e*(.T
.GPE_SM "(2018 Aug 16)"
.LI "New state inspectors:"
\en[an\-margin]
.GPE_SM "(2017 June 13)"
\&.if d conditional
.GPE_SM "(2017 June 14)"
.br
\&.if c conditional
.GPE_SM "(2018 Aug 19)"
\en(.$
.GPE_SM "(2018 Aug 20)"
\e\e$@
.GPE_SM "(2018 Aug 21)"
.LI "New man(7) macros:"
\&.DT
.GPE_SM "(2017 May 7)"
\&.MT .ME
.GPE_SM "(bentley@ 2017 June 25)"
.br
\&.TQ
.GPE_SM "(2018 Aug 16)"
\&.SY .YS
.GPE_SM "(2018 Aug 17)"
.LE
.GPE_TIME 0 38
.GPE_NEXT "Was there anything to do besides new requests?"
.INACTIVE_TITLE "Improvements for tbl(7)"
Rather tricky improvements in the tbl(7) formatter.
.BL
.LI
.GPE_EM "Filling text inside table columns."
.GPE_SM "(2017 June 7 and 12)"
.LI
Table option "allbox".
.GPE_SM "(2017 June 12)"
.LI
Layout specifier "w" (column width).
.GPE_SM "(2017 June 8)"
.LI
Specification of column spacing in the table layout.
.GPE_SM "(2017 June 27)"
.LI
Various improvements regarding data contained in unspecified output cells
.br
and regarding horizontal and vertical lines.
.GPE_SM "(2017 June 16 and Aug 19)"
.LI
Improvements to horizontal alignment in table cells.
.GPE_SM "(2018 Aug 18 and 19)"
.LE
.PSPIC Images/Paris17_LouvrePyramide.eps 16c
.ce
.GPE_SM "Paris, Louvre, La Pyramide (2017)"
.GPE_TIME 0 39
.GPE_NEXT "How much progress did all this allow?"
.INACTIVE_TITLE "Benefit for ports"
Once the framework was in place, we reduced the number of OpenBSD ports
.br
that still USE_GROFF from over 200 to just 25 in just two months:
.br
.mk
.TS
lnl.
date	ports	main new feature (ab)used in these ports
_
2017 May 7/8	22	tabulator settings: .ta .DT
2017 May 8	42	temporary indent: .ti
2017 June 1/2	42	horizontal spacing: \eh \el
2017 June 3	7	escape control: .ec .eo
2017 June 4	4	margin notes: .mc
2017 June 4	3	centering: .ce
2017 June 4	2	renaming macros and strings: .rn
2017 June 12	8	tbl(7) improvements
2017 June 13	6	\en[an\-margin], \ep, .if d
2017 June 14\(en29	21	various
.TE
.sp
.SUBTITLE "Non-English manual pages"
.BL
.LI
Let pkg_add(1) run makewhatis(8) in /usr/local/man/\fIlang\fP/.
.GPE_SM "(2017 May 15)"
.LI
Paragraph in the porting guide:
.br
standard directories, always UTF-8, use iconv(1) if needed.
.GPE_SM "(2017 June 1)"
.LE
.rt
.sp 1.5
.PSPIC -R Images/Beograd16_CrkvaSvetogMarkaSE.eps
.rj
.GPE_SM "Beograd, Crkva Svetog Marka (2016)"
.GPE_TIME 0 40
.GPE_SECTION MAINTENANCE
.GPE_NEXT "Any additional progress?"
.TITLE "Additional highlights from mandoc maintenance"
.BL
.LI
Cross-community cooperation:
Since 2011, I contributed almost 50 patches to GNU troff (groff),
and since early 2018, I have also been a GNU troff committer.
.LI
Use pledge(2) in man.cgi(8).
.GPE_SM "(semarie@ 2017 Feb 22)"
.LI
The mandoc regression suite is now portable.
.GPE_SM "(2017 Feb 17 and July 18)"
.LI
Complete integration of mdoclint(1) into mandoc -Tlint.
.br
.GPE_SM "(2017 April 27 to July 3, with wiz@)"
.LI
Deleted the MLINKS from the OpenBSD base system.
.GPE_SM "(jmc@ 2016 March 30)"
.LI
Even better ctags(1)-style internal searching with less(1) :t.
.GPE_SM "(2016 Nov 8)"
.LE
.mk
.ll 12c
.BL
.LI
Enable full makewhatis(8) by default in OpenBSD.
.GPE_SM "(2017 Apr 15)"
.LI
Complete eqn(7) lexer rewrite, many parser improvements,
and considerable improvements in both the HTML and terminal formatters.
.GPE_SM "(2017 June 21 to Aug 23)"
.LI
PostScript output is now substantially smaller.
.GPE_SM "(espie@ 2017 Nov 1)"
.LI
Portable releases:
1.14.1
.GPE_SM "(2017 Feb 21),"
.br
.\" 1.14.2
.\" GPE_SM "(2017 July 28),"
1.14.3
.GPE_SM "(2017 Aug 5),"
1.14.4
.GPE_SM "(2018 Aug 8)"
.LE
.ll \n[gpe_ll]u
.rt
.PSPIC -R Images/Beograd16_Predsednishtvo.eps
.rj
.GPE_SM "Beograd, Predsednishtvo (2016)"
.GPE_TIME 80
.GPE_NEXT "Is mandoc an isolated project?"
.INACTIVE_TITLE "Cooperation across communities"
Coordinating with other implementations of the languages
and utilities in question:
.SUBTITLE "groff = GNU troff"
.BL
.LI
ASCII output of special characters (
.GPE_QL "tty\-char.tmac" ):
.br
focus on meaning rather than graphical shape.
.GPE_SM "(2017 Aug 22)"
.LI
Rewrite groff_mdoc(7) .Lk macro.
.GPE_SM "(2017 Apr 10 to 13 and 2018 Jan 12)"
.LI
3 minor features,
.\" [man] Print volume headers like mdoc.  2011 Dec 1
.\" [mdoc] Implement `.%C'. 2013 July 24
.\" Support `Mdocdate' CVS keyword substitution. 2014 Oct 13
2 formatting improvements,
.\" mdoc %T: use typographic quotes 2017 Feb 16
.\" mdoc \e*[Lq], \e*[Rq]: map to \e[lq], \e[rq] 2017 Feb 16
7 bugfixes,
.\" [mdoc] Make `Fl' correctly restore fonts.  2012 July 17
.\" tmac/an\-old.tmac (TP): Do not clobber line length
.\"   after double call to `.TP'.  2013 July 16
.\" man: Correctly reset margins.  2014 Mar 13
.\" tmac/doc\-common\-u (Dd): Avoid warning `unbalanced .el request'. 2015 Mar 7
.\" unicode_to_glyph_list double entries: 2015 Jan 14
.\" Prevent mdoc(7) Bl with trailing \-width or \-offset
.\"   from picking up old args.  2015 Apr 11
.\" Simplify behaviour of .Bl \-tag 2016 Oct 8
4 string table updates,
.\" [mdoc] Synchronize string tables with the mandoc(1) utility.  2011 Oct 23
.\" [mdoc] * tmac/doc\-syms: Fix meaning of XBD acronym.  2012 Jan 25
.\" Add `.At III' and `.St \-iso8601'.  2014 Oct 12
.\" Update operating system release numbers.  2014 Oct 12
5 documentation improvements,
.\" 2017 Aug 28 2x  Apr 29  2014 Sep 23 Feb 16
16 build system fixes:
.\" 2018 Jan 13  2017 Aug 15  2014 Sep 17 Sep 9 Oct 9 Jun 21
.\" Mar 16 Mar 11 6x  2013 Mar 17  2012 Mar 3  2011 Oct 17 
.br
grand total about
.GPE_EM "40 contributions"
in 2011\(en2018.
.LE
.mk
.ll 13c
.SUBTITLE "man\-db = GPLv2+ manual page viewer"
.BL
.LI
For compatibility with the
.br
man(1) implementation of man\-db,
.br
interpret names containing slashes
.br
as absolute or relative file names,
.br
even without the
.GPE_QL "\-l"
option.
.br
.GPE_SM "(2018 Apr 19)"
.LE
.ll \n[gpe_ll]u
.rt
.PSPIC -R Images/Beograd16_NarodnaSkupshtinaSE.eps
.rj
.GPE_SM "Beograd, Narodna Skupshtina (2016)"
.GPE_TIME 0 41
.GPE_NEXT "But that's more or less all that was done, right?"
.INACTIVE_TITLE "All those small things: infrastructure"
.BVL 1cm
.LI "Keeping up with newly developed security features:"
.GPE_EM "pledge(2)"
for man(1), mandoc(1), apropos(1), makewhatis(8).
.GPE_SM "(2016 Nov 15)"
.br
pledge(2) for man.cgi(8).
.GPE_SM "(semarie@ 2017 Feb 22)"
.LI "Moving ahead with regression testing:"
New portable version of the mandoc
.GPE_EM "regression suite" ,
.br
including more than 1000 of the existing test cases.
.GPE_SM "(2017 Feb 17)"
.br
Run it iteratively rather than recursively in portable mode.
.GPE_SM "(2017 July 18)"
.LI "Further improving diagnostic functionalities:"
Most prone to overengineering, feature creep, and code sprawl;
.br
consider groff, OpenSSL, SQLite, errno(2), even mandoc(1) itself.
.\" Stricter validation of the NAME section.
.\" GPE_SM "(2017 Jan 7)"
.\" Links to self.
.\" GPE_SM "(2017 April 27)"
.br
Integration of
.GPE_EM "mdoclint(1)" .
.GPE_SM "(2017 April 27 to July 3, with wiz@)"
.\" Start deleting:
.\" GPE_SM "(2017 April 28)"
.br
Message levels
.GPE_QL "\-W style"
.GPE_SM "(2017 May 16 to July 6)"
and
.GPE_QL "\-W base" .
.GPE_SM "(2017 June 24)"
.LI "Getting rid of MLINKS."
Fewer files in the distribution tarballs, simpler Makefiles.
.GPE_SM "(jmc@ 2016 March 30)"
.LI "Constant bugfixing and usability improvements."
.LI "Portable releases:"
1.14.1
.GPE_SM "(2017 Feb 21),"
.\" 1.14.2
.\" GPE_SM "(2017 July 28),"
1.14.3
.GPE_SM "(2017 Aug 5)"
.LE
.GPE_TIME 0 42
.GPE_NEXT "But there are surely no additional new features, right?"
.INACTIVE_TITLE "All those small things: features"
Further improving
.GPE_EM "search" :
.BL
.LI
Support more than one tag entry for the same search term,
tag leading presentational macros in .It, and some more improvements
to ctags(1)-style internal searching with less(1) :t.
.GPE_SM "(2016 Nov 8)"
.LI
Enable full makewhatis(8) by default in OpenBSD.
.GPE_SM "(2017 Apr 15)"
.LE
.P
Improving
.GPE_EM "eqn(7)" :
.BL \n(Pi 1
.LI
Lexer: complete rewrite \(em simpler and fixing several bugs.
.GPE_SM "(2017 June 26)"
.LI
Parser: recognize well-known function names.
.GPE_SM "(2017 June 21)"
.LI
Parser: quoted words are not parsed.
.GPE_SM "(2017 June 21)"
.LI
Parser: better font selection.
.GPE_SM "(2017 June 21)"
.LI
Parser: implement operator precedence.
.GPE_SM "(2017 July 5)"
.LI
HTML formatter: use <mi>, <mn>, <mo> in MathML.
.GPE_SM "(2017 June 22)"
.LI
Terminal formatter: output disambiguation with parentheses.
.GPE_SM "(2017 July 7)"
.LI
Terminal formatter: better horizontal spacing.
.GPE_SM "(2017 Aug 23)"
.LI
Use in erf(3) and lgamma(3).
.GPE_SM "(2017 Aug 26)"
.LE
.P
Occasional performance enhancements:
.BL
.LI
For example, substantially smaller
.GPE_EM PostScript
output.
.GPE_SM "(espie@ 2017 Nov 1)"
.LE
.GPE_TIME 0 43
.GPE_SECTION SUMMARY
.GPE_NEXT "Did we reach our goals?"
.TITLE "Done..."
.S -1
.TS
lrrr.
project	announced	completed	presented
_
install manual page sources	BSDCan 2011	June 23, 2011	BSDCan 2014
implement \-mdoc \-Tman	BSDCan 2011	Nov 19, 2012	BSDCan 2014
apropos(1), makewhatis(8)	BSDCan 2011	April 14, 2014	BSDCan 2014
replace man.cgi(8)	BSDCan 2011	July 12, 2014	EuroBSD 2014
\m[red]pod2mdoc(1) for LibreSSL	BSDCan 2011	Nov 6, 2016	BSDCan 2018\m[]
implement \-man \-Tmdoc	BSDCan 2011	abandoned
_
integrate preconv(1)	BSDCan 2014	Oct 30, 2014	BSDCan 2015
\m[red]unify parsers, better roff(7)	BSDCan 2014	in progress	BSDCan 2018\m[]
docbook2mdoc(1) for man(7)	BSDCan 2014	stalled
pod2mdoc(1) for Perl manuals	BSDCan 2014	not yet started
_
default to \-Tlocale	EuroBSD 2014	Dec 2, 2014	BSDCan 2015
replace man(1)	EuroBSD 2014	Dec 14, 2014	BSDCan 2015
_
.\" split ERROR level (\-Wunsupp)	BSDCan 2015	Jan 26, 2015	dto.
use less(1) tags	BSDCan 2015	July 17, 2015	EuroBSD 2015
\m[red]delete most MLINKS	BSDCan 2015	March 30, 2016	BSDCan 2018\m[]
use texi2mdoc(1) in practice	BSDCan 2015	not yet started
_
\m[red]better less(1) tags	EuroBSD 2015	mostly done	BSDCan 2018\m[]
_
\m[red]more HTML/CSS improvements	BSDCan 2018	mostly done	EuroBSD 2018\m[]
.TE
.S P
.GPE_TIME 30 44
.GPE_NEXT "Which tasks are still open?"
.TITLE "Future directions (1)"
You might think mandoc is finished,
but there is still surprisingly much to do:
.BVL 1c
.LI "Structure \(em privilege separation:"
Do parsing and formatting in different processes.
.br
Stricter pledge(2).
Use unveil(2).
.LI "Parsers \(em continue unification:"
Represent escape sequences as AST nodes
and improve width measurements.
.br
Retain more roff(7) requests as AST nodes,
and document AST invariants.
.br
Provide an mdoc(7) output mode (normalization).
.LI "Parsers \(em mdoc(7):"
Better support pages that apply to multiple, but not all, architectures.
.br
Disentangle filling and font control for displays.
.mk
.LI "Parsers \(em man(7):"
Add minimal heuristics
.br
for better linking and markup.
.LI "Parsers \(em tbl(7):"
Support macros inside tables.
.LI "Parsers \(em roff(7):"
A few requests that are abused in
.br
practice are still unimplemented.
.LE
.rt
.sp 0.5v
.PSPIC -R Images/Vanoise17_Casse.eps
.rj
.GPE_SM "La Grande Casse 3855m, Vanoise, France"
.GPE_TIME 30 45
.GPE_NEXT "Which tasks are still open?"
.TITLE "Future directions (2)"
.BVL 1c
.LI "Formatters \(em HTML:"
.GPE_OK "Use more fitting HTML elements for some macros" ,
.GPE_OK "improve CSS style" ,
.GPE_OK "further reduce style= attributes" ,
.GPE_OK "solve the problem of duplicate anchors" ,
avoid title= attributes for tooltips,
solve the remaining HTML syntax violations.
.LI "Formatters \(em PostScript and PDF:"
Better font support;
espie@ started work, but I ran out of time to support him.
.mk
.LI "Foreign formats \(em perlpod(1):"
Use pod2mdoc(1) for Perl manual pages.
.LI "Foreign formats \(em texinfo(5):"
Use texi2mdoc(1) in practice.
.LI "Foreign formats \(em man(7):"
Support man(7) to mdoc(7)
.br
semi-automatic migrations
.br
with doclifter(1) and docbook2mdoc(1).
.LI "Manual pages:"
Write the missing LibreSSL manuals
.br
and clean up the existing ones.
.br
That's a huge problem due to the volume.
.LE
.P
.ce
For many minor issues, see the mandoc TODO list.
.br
.rt
.sp 1.5v
.PSPIC -R Images/Vanoise17_LaDaille.eps
.rj
.GPE_SM "La Daille 1800m, Val d'Is\(`ere, Vanoise, France"
.GPE_TIME 20 46
.GPE_NEXT "What was suggested during the conference?"
.TITLE "Future directions (3)"
.SUBTITLE "Suggested by attendees of EuroBSDCon 2018"
.BL
.LI
Table of contents at the top of HTML (and perhaps PS/PDF) pages.
.br
Only if there are at least two (or three?) non-standard sections in a page.
.br
Only if a new option
.GPE_QL "\-O toc"
is given.
.LI
Support
.GPE_QL "\-O man"
with two arguments, typically using the first for a local tree
(like the release pages on mandoc.bsd.lv)
and the second for a remote tree (e.g. man.openbsd.org).
Link to the first one if it contains the page pointed to,
or to the second one otherwise.
.br
Probable syntax:
.GPE_QL "\-O man=first;second"
.br
Suggested by kristaps@.
.LI
Use Unicode character U+221A for sqrt() in eqn(7) output.
.LE
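.ig
A minimal sketch of the lookup order suggested for the two-argument
form of -O man, kept in an ignored block so that the slide layout is
unchanged.
Only the rule "use the first tree if it contains the page, otherwise
the second" comes from the slide.
The function name pick_man_link, the assumed local directory layout,
and the remote URL shape are illustrative assumptions, not mandoc code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of mandoc: build a link for the
 * manual page name(sec).  Prefer the local tree if it contains
 * the page; fall back to the remote tree otherwise.
 * Assumed local layout: <local>/man<sec>/<name>.<sec>
 * Assumed remote shape: <remote>/<name>.<sec>
 * Returns 1 for a local link, 0 for a remote one, -1 on truncation.
 */
int
pick_man_link(const char *local, const char *remote,
    const char *name, const char *sec, char *buf, size_t bufsz)
{
	int	 len;

	assert(local != NULL && remote != NULL);
	assert(name != NULL && sec != NULL && buf != NULL);

	/* First choice: the local tree, if the page is readable there. */
	len = snprintf(buf, bufsz, "%s/man%s/%s.%s", local, sec, name, sec);
	if (len < 0 || (size_t)len >= bufsz)
		return -1;
	if (access(buf, R_OK) == 0)
		return 1;

	/* Second choice: the remote tree, without checking existence. */
	len = snprintf(buf, bufsz, "%s/%s.%s", remote, name, sec);
	if (len < 0 || (size_t)len >= bufsz)
		return -1;
	return 0;
}
```

A caller would split the proposed "first;second" argument at the
semicolon and pass the two halves as local and remote.
Checking the local tree with access(2) keeps the sketch free of
dependencies beyond the POSIX C library.
..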
.PSPIC Images/EuroBSDCon2018.eps
.GPE_TIME 0
.GPE_NEXT "Who uses mandoc?"
.TITLE "Adoption of mandoc"
.BVL 1cm
.LI "Default formatter and viewer"
OpenBSD
.GPE_SM "(schwarze@),"
Alpine Linux
.GPE_SM "(Sabogal),"
Void Linux
.GPE_SM "(leah@)"
.LI "Default formatter and search tool, but weaker viewer"
FreeBSD
.GPE_SM "(bapt@)"
.LI "Default formatter, but using weaker view and search tools"
NetBSD
.GPE_SM "(christos@),"
illumos
.GPE_SM "(yuripv@)"
.LI "Included by default, but outdated and not used by default"
DragonFly BSD, Minix 3
.LI "Official ports and packages"
Debian
.GPE_SM "(stapelberg@),"
Ubuntu, Gentoo, pkgsrc
.GPE_SM "(wiz@)"
.LI "Unofficial ports and packages"
Arch, Slackware, Crux
.GPE_SM "(juef@),"
MacPorts, macOS Homebrew
.LE
.SUBTITLE "Mandoc for official online web pages"
man.openbsd.org,
manpages.debian.org,
and manpage links on wiki.archlinux.org
.GPE_TIME 60 47
.GPE_NEXT "What are the most important conclusions?"
.TITLE "Conclusions about API design"
.BL
.LI
Writing documentation is an excellent way to understand the
quality of an API.
.LI
.GPE_EM "Bad API design can make documentation almost impossible."
.LI
Typical problems:
too many functions,
cryptic conventions,
misleading names,
wrappers,
redundancy,
inconsistent and surprising semantics,
excessively complicated logging and initialization,
incompatible API changes in the past and resulting compatibility hacks.
.mk
.LI
About the worst API design error ever:
.br
function names autogenerated with macros.
.br
Completely impossible to properly document.
.LI
Callbacks are very hard to document.
.br
Avoid them as much as possible.
.br
If you cannot avoid them, typedef the prototypes.
.LI
Gratuitous use of typedef.
.br
Never
.GPE_QL "typedef struct foo FOO" .
.br
Never
.GPE_QL "typedef struct foo *foo_p" .
.br
Never
.GPE_QL "typedef int64_t myint" .
.br
Such typedefs seriously obfuscate documentation.
.LE
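.ig
A minimal sketch of the typedef advice on this slide, kept in an
ignored block so that the slide layout is unchanged.
All names (struct foo, foo_visit_fn, foo_each, foo_add) are
hypothetical and only illustrate the two rules: spell out struct and
integer types, and typedef only the prototype of an unavoidable
callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical API for illustration; none of these names are real. */

/* Spell out the struct: the manual can show "struct foo *" verbatim. */
struct foo {
	int64_t	 total;		/* plain int64_t, no "myint" alias */
};

/*
 * An unavoidable callback: typedef the prototype once, so that
 * the manual can describe foo_visit_fn in a single place.
 */
typedef int64_t foo_visit_fn(struct foo *, int64_t);

/* Call fn for each of the n values; return the last result, or 0. */
int64_t
foo_each(struct foo *f, const int64_t *vals, size_t n, foo_visit_fn *fn)
{
	int64_t	 rc = 0;
	size_t	 i;

	assert(f != NULL && fn != NULL);
	for (i = 0; i < n; i++)
		rc = fn(f, vals[i]);
	return rc;
}

/* Example callback matching foo_visit_fn: keep a running total. */
int64_t
foo_add(struct foo *f, int64_t val)
{
	f->total += val;
	return f->total;
}
```

With these declarations, a synopsis can list foo_each() with its real
argument types, and the callback contract is documented once under
foo_visit_fn instead of being repeated for every function taking it.
..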
.rt
.sp 0.5v
.PSPIC -R Images/Beograd16_UlicaSvetogorska.eps
.rj
.GPE_SM "\fIPromote confusion:\fP Beograd, Ulica Svetogorska (2016)"
.GPE_TIME 90 48
.GPE_NEXT "More conclusions..."
.TITLE "Conclusions related to documentation tools"
.BL
.LI
Mandoc has been the standard BSD toolkit for manuals on the
.GPE_EM "command line"
.br
since about EuroBSDCon 2015.
.LI
Now, it is also becoming the standard formatter for the
.GPE_EM "web" .
.br
It has extensive support for semantic markup and hyperlinking.
.LI
Mandoc now covers well above 99% of
.GPE_EM "ports"
due to improved roff(7) support.
.LI
Never try to write
.GPE_EM "markdown"
documentation by hand.
Generate it from mdoc(7).
.LE
.SUBTITLE "Conclusions for any kind of free software project"
.BL
.LI
.GPE_EM "Do not prematurely introduce dependencies" ,
not even on widely-available,
high quality libraries.
Evaluate requirements, integration costs, and maintenance costs first.
Seriously consider using the POSIX C library as your main toolkit.
.LI
.GPE_EM "Worrying about performance is vastly overvalued."
Good performance is a by-product of pursuing
.I other
goals, mainly simplicity.
Simply choose well-adapted data structures
and the most straightforward algorithms,
and you usually get performance that is more than good enough,
plus excellent maintainability.
.LI
The success of a long-term software project depends on a good
balance of features vs. maintenance.
In case of doubt,
.GPE_EM "attention to maintenance details"
(code quality and cleanup, usability details, diagnostics,
bugfixes, regression testing, ...)
matters more than adding ever more bells and whistles.
.LE
.GPE_TIME 30 49
.GPE_NEXT "Who contributed to all this?"
.TITLE "Thanks!"
These slides only list new contributions since EuroBSDCon 2015.
.br
For complete acknowledgements since the start of the mandoc project,
.br
see my EuroBSDCon 2015 slides.
.BL
.LI
.GPE_EM "Anthony Bentley"
(OpenBSD) for about ten code contributions
.\" implement stylesheet unification
.\" work on pledge(2) for makewhatis(8)
.\" implement man(7) .MT .ME
.\" 4 usability patches to cgi.c and html.c
.\" manual page patch
.\" groff patch to improve formatting of quotes
.\" joint work fixing two japanese ports
.\" maintenance of the pod2mdoc port
and dozens of bug reports, useful suggestions, and discussions.
.LI
.GPE_EM "Christian Weisgerber"
(OpenBSD) for lots of work on the USE_GROFF removal
and several bug reports and suggestions.
.LI
.GPE_EM "Michael Stapelberg"
(Debian) for designing and writing most of mandocd(8) and catman(8)
and for more than a dozen patches, bug reports, and suggestions.
.LI
.GPE_EM "Marc Espie"
(OpenBSD) for implementing smaller PostScript output
and many useful suggestions and discussions.
.LI
.GPE_EM "Baptiste Daroussin"
(FreeBSD) for writing soelim(1), for makewhatis(8) performance testing,
and for some bug reports and suggestions.
.LI
.GPE_EM "Jason McIntyre"
(OpenBSD) for countless useful discussions,
suggestions, and bug reports, repeated testing,
and lots of copy-editing in the LibreSSL manuals.
.LI
.GPE_EM "Thomas Klausner"
(NetBSD) for extensive cooperation on the mdoclint(1) integration
and for several bug reports and useful suggestions.
.LI
.GPE_EM "Joel Sing"
(OpenBSD) for lots of feedback regarding LibreSSL manual pages.
.LE
.GPE_TIME 60
.GPE_NEXT "Who contributed to all this?"
.TITLE "Thanks!"
.BL
.LI
Kristaps Dzonsons (bsd.lv) for implementing Mac OS X sandbox_init(3)
support and for useful discussions.
.LI
Sebastien Marie (OpenBSD) for contributing code and ideas
to pledge(2) support and some useful discussions.
.LI
Jonathan Gray and Theo B\(:uhler (OpenBSD) for a bug fix patch
and more than twenty bug reports and suggestions each,
many based on afl(1) findings.
.LI
Alexander Bluhm (OpenBSD) for several significant improvements
to the mandoc regression suite.
.LI
Todd Miller (OpenBSD) for important help with process group handling,
for checking many patches, and for useful discussions and suggestions.
.LI
John Gardner for extensive suggestions regarding HTML output.
.LI
Carsten Kunze (Heirloom troff), Werner Lemberg, Bertrand Garrigues,
Branden Robinson, Peter Schaffter, and Ralph Corderoy (GNU troff)
for help getting patches committed to groff(1)
and for useful discussions.
.LI
Theo de Raadt (OpenBSD)
for several bug reports, useful discussions, and suggestions,
and for checking several patches.
.LI
Florian Obser, J\('er\('emie Courr\(`eges-Anglas, Martin Natano,
Philipp Guenther (OpenBSD), Ed Maste (FreeBSD), and Peter Bray
for patches, bug reports, and useful discussions.
.bp
.LI
Michael McConville (OpenBSD), Kamil Rytarowski (NetBSD),
Andreas Voegele, Fabian Raetz, Max Fillinger, and Tiago Silva
for patches.
.LI
The OpenCSW.org team for access to the Solaris test cluster.
.LI
Anton Lindqvist (OpenBSD) for suggesting a new feature \" <title>
and multiple bug reports.
.LI
Reyk Floeter, \" -T markdown
Paul Irofti (OpenBSD), \" TIOCGWINSZ
Vsevolod Stakhov (FreeBSD), \" -T markdown
Colin Watson (Debian), \" / implies -l
Jean-Yves Migeon, \" header.html, footer.html
Mike Williams, \" %%DocumentMedia
Nate Bargmann, \" / implies -l
and Thomas Guettler \" id=
for suggesting new features.
.LI
T. J. Townsend,
Ted Unangst (OpenBSD),
Christos Zoulas,
Sevan Janiyan (NetBSD),
Yuri Pankov (illumos),
Leah Neukirchen (Void),
Svyatoslav Mishyn (Crux),
Jan Stary,
and Markus Waldeck
for multiple bug reports and useful suggestions.
.LI
Kurt Jaeger (FreeBSD) for reporting multiple missing features.
.LI
Antoine Jacoutot (OpenBSD) for suggesting multiple
usability improvements.
.LI
Stuart Henderson (OpenBSD)
for bug reports and for checking multiple ports patches.
.LI
Aaron M. Ucko, Bdale Garbee (Debian),
Daniel Sabogal (Alpine),
Daniel Levai,
and Rafael Neves
for suggesting portability improvements.
.LI
James Turner (OpenBSD)
and Ulrich Sp\(:orlein (FreeBSD)
for release testing.
.bp
.LI
Brad Smith,
Daniel Dickman,
David Coppa,
Dmitrij Czarkoff,
Igor Sobrado,
Ken Westerback,
Martijn van Duren,
Mike Belopuhov,
Tim van der Molen (OpenBSD),
Takeshi Nakayama (NetBSD),
Alexander Kuleshov,
Andy Bradford,
Gabriel Guzman,
George Brown,
Gonzalo Tornaria,
Jeremy Mates,
Jerome Ibanes,
Jesper Wallin,
Lorenzo Beretta,
Mark Patruck,
Maxim Belooussov,
Michal Mazurek,
Michael Reed,
Pavan Maddamsetti,
Peter Bui,
Raf Czlonka,
Reiner Herrmann,
Sean Levy,
Serguey Parkhomovsky,
Shane Kerr,
Steffen Nurpmeso,
Tony Sim,
Will Backman,
and Wolfgang Mueller
for bug reports.
.LI
Mark Kettenis,
Otto Moerbeek,
Tobias Stoeckmann,
and Tom Cosgrove
for checking patches in the base system.
.LI
Jasper Lievisse Adriaanse,
Klemens Nanni,
Markus Friedl,
Masahiko Yasuoka,
Matthias Kilian,
Pierre-Emmanuel Andre,
Rafael Sadowski,
Sebastian Reitenbach,
Todd Fries (OpenBSD),
Greg Steuck,
and Jung Lee
for checking patches in the ports tree.
.LI
Alexander Hall,
Andrew Fresh,
Brent Cook,
Doug Hogan,
Kent Spillner,
Nicholas Marriott,
Peter Hessler,
Stefan Sperling,
Vadim Zhukov (OpenBSD),
Abhinav Upadhyay,
Joerg Sonnenberger (NetBSD),
Dag-Erling Sm\(/orgrav (FreeBSD),
Benny Lofgren,
David Dahlberg,
and Laura Morales
for useful discussions.
.LE
.P
.S -6
All photographs (except the project logos)
are (C) Copyright 2015-2018 Ingo Schwarze
.br
and are available under the same ISC license as these slides,
see the roff source file.
.br
.S P
.ds gpe_next The end.
.GPE_TIME 0 50