The HWEB system for literate programming

Note that this entire document describes a fantasy system which does not as yet exist—and may never exist, if I keep writing myself into corners with the implementation.

The structure of an HWEB program is very simple: each program consists of a series of code blocks, with interspersed comment blocks. The htangle compiler rearranges the code blocks into the order expected by a C compiler; the hweave compiler formats the comment blocks in HTML, with the code blocks interwoven and formatted according to syntax.

The HWEB language

The HWEB processors are essentially line-oriented and case-insensitive, as far as the HWEB language is concerned; obviously, the compilers don't have any problem with poorly formatted code blocks!

HWEB understands the following command set:

- maintenance notes - Notes to the programmer relating to the actual implementation of the program are enclosed in single hyphens. The hyphen must be the only thing on its line.
= usage/algorithm notes = Notes to the user, or to the reader who wishes to get a clear understanding of the algorithms involved without necessarily understanding the details of coding, may be enclosed in single equals signs. The equals sign must be the only thing on its line.
-= and =- The two-character sequences -= and =- are shorthand for the same two characters on separate lines.
Begin <block name>.
C code
End <block name>.
A code block; the block must have a unique name, which does not need to be spelled out in its entirety as long as it has a unique prefix. The block name (and angle brackets) may be completely omitted from the End. command. If an HWEB program contains two Begin. commands naming the same block, it is in error and will be flagged by htangle.
More <block name>.
C code
End <block name>.
An addition to an existing code block. The above proviso regarding unique prefixes applies here as well. The new material is appended to the end of any existing code block, or starts a new code block if no code block of this name exists yet. (Thus More. is a less-error-checked version of Begin.)
Heredoc <block name> flag
text
flag
Note the absence of the period . in this command. Creates a "here document," references to which expand in the tangled code to a C string literal (with necessary escape codes) representing the entire quoted text.
>block name< A reference to the block or here-document named block name; expands to the complete text of that block or here-document. A block may not include any references to itself (thus disallowing recursion). The text is re-scanned after expansion as many times as necessary to expand all references within it.
>Prologue< The special section Prologue contains the useful C macros steq and stneq for string equality and inequality, NELEM for the number of elements in an array, SQR for the square of a number, and the useful C functions do_error and do_help. The do_help function can be customized using the special Help and Manual code blocks.
>Help< and >Manual< The special code block Help begins as the empty C string "" and may be added to with More <Help>. commands at any point in the program. The same applies to Manual. The do_help function is defined inside Prologue as follows:
void do_help(int man) {
    if (man) puts(
      >Manual<
    ); else puts(
      >Help<
    ); exit(EXIT_FAILURE);
}

The HWEB documentation syntax

While the original WEB system is based on the TeX typesetting system, the hweave compiler is designed to emit simple hypertext in HTML format. Code block references are hyperlinked to their definitions; comment blocks are formatted as paragraphs delimited by blank lines in the input.

The HWEB system treats text enclosed in vertical pipes as text to be set in a fixed-width font; thus, the input

  =
  This is an example of the |HWEB| system.
  =
comes out looking like this:

This is an example of the HWEB system.

The pipe delimiter is overloaded in the following way: If the pipe-delimited string happens to be the name of a code block, then it is formatted and hyperlinked accordingly; thus, one may write
  -
  This section describes the input facilities of the
  program; for output, see |Output facili...|.
  -
and have the phrase S6. Output facilities appear properly hyperlinked in the HTML documentation (given, of course, that the section in question turns out to be Section 6).

The forward-slash delimiter is also recognized as indicating italics in comment blocks; the dollar-sign delimiter indicates a math mode similar to that of TeX, but with several key differences: