Conversion of Open Group's troff sources to POSIX man pages
===========================================================

This directory contains files and scripts that are used to convert the
POSIX manual pages to 'man' format, suitable for release by the Linux
man-pages project.

1. Necessary data:
==================

* obtainable from The Open Group
  - directory with the troff sources [1]
  - file ,xref.5 containing information on cross-references
  - file _strings.def containing information on references to other
    standards
* obtainable online
  - the HTML version of the standard [2]

[1] The troff sources are not part of this repository, and must be
    obtained by contacting The Open Group.
[2] As at November 2020, the HTML version of the standard can be
    downloaded from https://pubs.opengroup.org/onlinepubs/9699919799/.

The directory of troff sources contains four directories: "Builtins",
"Commands", "Functions", "Headers". (Some of these contain subdirectories
with "LEGACY" interfaces.) The directories contain .mm and .h files
written in groff_mm with extensions by The Open Group. Upon request one
can also obtain a file defining their custom macros, but this file is not
necessary for the scripts.

A relevant line in ,xref.5 could look like

    gropdf-info:href workdir page 104 Section 3.441

It contains a label ("workdir"), the page number and the section number.

A line in _strings.def might look like

    .ds Z5 ISO\ POSIX\(hy1 standard

This tells us how to translate the escape sequence \*(Z5 .

The HTML version of the standard can be obtained at

    http://pubs.opengroup.org/onlinepubs/9699919799/download/index.html

The relevant files for the scripts are basedefs/V1_chap*.html,
functions/V2_chap*.html, utilities/V3_chap*.html and
xrat/V4_*_chap*.html. These are the parts of the standard we do not have
the sources for.

2. Procedure to generate the man pages
======================================

Change your directory to the directory containing the conversion scripts.
Type

    ./,xref.1.awk < ,xref.5 > ,xref.1
    ./,xref.py /path/to/HTML_version_of_standard > ,xref

to generate ,xref and

    sed -f _strings.sed _strings.def > _strings

to generate _strings.

With this done you can start generating individual man pages. To generate
all pages use:

    ./posix.py 0p /path/to/troff_sources/Headers/*.h
    ./posix.py 1p /path/to/troff_sources/Built-Ins/*.mm
    ./posix.py 1p /path/to/troff_sources/Commands/*.mm
    ./posix.py 3p /path/to/troff_sources/Functions/*.mm

You can now find the converted pages in your current working directory.

Clean up:

    rm ,xref ,xref.1 _strings

3. Description of the included scripts
======================================

,xref.1.awk takes ,xref.5 from its standard input, strips irrelevant lines
and transforms lines of the form

    gropdf-info:href whitespace page 103 Section 3.436

to

    whitespace Section 3.436

,xref.1.py expects ,xref.1 generated from ,xref.1.awk in the current
working directory and the path to the HTML version of the standard as its
first argument. It extracts section, table and figure names for parts of
the standard we do not have sources for, adds them to the xrefs and writes
them to standard output.

For example, inside
/path/to/HTML_version_of_standard/basedefs/V1_chap03.html it finds a line

    class; see also White Space.

and therefore outputs

    whitespace Section 3.436, White Space

to ,xref.

The sed script _strings.sed does a simple conversion of lines of the form

    .ds Z5 ISO\ POSIX\(hy1 standard

to

    \*(Z5 ISO\ POSIX\(hy1 standard

The main script is posix.py. It takes the name of the man section as its
first argument and the names of the pages to be converted as its other
arguments. Furthermore, it expects the data files ,xref and _strings in
its current working directory. It outputs converted man pages to its
current working directory.

Notes:

A final processing of the xrefs happens in posix.py: first, the section
names for cross-references internal to the current page are added;
second, the references to other man pages are correctly formatted.

The order of the entries in ,xref is used to deduce the right section
number. This could also be achieved by carefully examining the source
directory.

The code in posix.py to get the indentation right by inserting ".RS ..."
and ".RE" in the right places is very hacky and might fail on pages with
a slightly more complex structure than the current ones.
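
The two line-level conversions described in section 3 (done by ,xref.1.awk
and _strings.sed) can be sketched in Python as shown below. This is only
an illustration and not the code of the actual awk and sed scripts: the
regular expressions, the function names convert_xref_line and
convert_strings_line, and the single-space separator in the output are
assumptions based solely on the example lines in this README.

```python
import re

# Assumed shape of a relevant ,xref.5 line, taken from the README example:
#   gropdf-info:href <label> page <number> Section <section number>
XREF_RE = re.compile(
    r"^gropdf-info:href\s+(\S+)\s+page\s+\d+\s+Section\s+(\S+)\s*$"
)

# Assumed shape of a _strings.def line: ".ds", a two-character string
# name, then the replacement text.
DS_RE = re.compile(r"^\.ds\s+(\S\S)\s+(.*)$")


def convert_xref_line(line):
    """Return '<label> Section <number>', or None for irrelevant lines."""
    match = XREF_RE.match(line)
    if match is None:
        return None
    label, number = match.groups()
    return f"{label} Section {number}"


def convert_strings_line(line):
    """Turn '.ds XX text' into '\\*(XX text', or None if it does not match."""
    match = DS_RE.match(line)
    if match is None:
        return None
    name, text = match.groups()
    return f"\\*({name} {text}"


if __name__ == "__main__":
    print(convert_xref_line(
        "gropdf-info:href whitespace page 103 Section 3.436"))
    print(convert_strings_line(r".ds Z5 ISO\ POSIX\(hy1 standard"))
```

Run on the two example lines from this README, the sketch prints
"whitespace Section 3.436" and "\*(Z5 ISO\ POSIX\(hy1 standard". Lines
that do not match the assumed patterns yield None, which mirrors the
"strips irrelevant lines" behaviour described for ,xref.1.awk.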