This file defines the PlDoc wiki parser, which parses both comments and
wiki text files. The original version of this SWI-Prolog wiki format was
largely modeled after Twiki (http://twiki.org/). The current version is
extended to take many aspects from markdown, in particular the doxygen
refinement thereof.
- See also
- - http://www.stack.nl/~dimitri/doxygen/manual/markdown.html
wiki_lines_to_dom(+Lines:lines, +Args:list(atom), -Term) is det- Translate a Wiki text into an HTML term suitable for html//1
from the html_write library.
wiki_codes_to_dom(+String, +Args, -DOM) is det- Translate a plain text into a DOM term.
- Arguments:
- - String: Plain text. Either a string or a list of codes.
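A minimal usage sketch of wiki_codes_to_dom/3, assuming the module is loaded from library(pldoc/doc_wiki). The sample text and the helper name demo_dom/1 are illustrative only and not part of the library; the exact shape of the resulting DOM is not shown because it depends on the library version.

```prolog
:- use_module(library(pldoc/doc_wiki)).

% demo_dom(-DOM) is a hypothetical helper: it parses a small wiki
% text, given as a list of codes, into a DOM term.
demo_dom(DOM) :-
    Text = `A paragraph with *bold* text.\n\n  * first item\n  * second item\n`,
    wiki_codes_to_dom(Text, [], DOM).
```

The second argument is left empty here; it is the list of argument names that the parser may use when marking up variables in running text.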
wiki_structure(+Lines:lines, +BaseIndent, -Blocks:list(block)) is det[private]- Get the structure in terms of block-level elements: paragraphs,
lists and tables. This processing uses a mixture of layout and
punctuation.
take_block(+Lines, +BaseIndent, ?Block, -RestLines) is semidet[private]- Take a block-structure from the input. Defined block elements
are lists, table, hrule, section header and paragraph.
ruler(+Line) is semidet[private]- True if Line contains 3 ruler chars and otherwise spaces.
list_item(+Lines, ?Type, ?Indent, -LI0, -LIT, -RestLines) is det[private]- Create a list-item. Naturally this should produce a single item,
but DL lists produce two items, so we create the list of items
as a difference list.
- To be done
- - Pass base-indent
rest_list_item(+Lines, +Type, +Indent, -RestItem, -RestLines) is det[private]- Extract the remainder (after the first line) of a list item.
take_blocks_at_indent(+Lines, +Indent, -Pars, -RestLines) is det[private]- Process paragraphs and verbatim blocks (==..==) in bullet-lists.
rest_list(+Lines, +Type, +Indent, -Items, -ItemTail, -RestLines) is det[private]
list_item_prefix(?Type, +Line, -Rest) is det[private]
split_dt(+LineAfterDollar, -DT, -Rest)[private]- First see whether the entire line is the item. This allows
creating items holding : by using $ <tokens> :\n
ul_to_dl(+UL, -DL) is semidet[private]- Translate a UL list into a DL list if all entries are of the
form "* <term> nl, <description>" and at least one <description>
is non-empty, or all items are of the form
[[PredicateIndicator]].
term_item(+LI, -DLItem, ?Tail) is semidet[private]- If LI is of the form <Term> followed by a newline, return it as
dt-dd tuple. The <dt> item contains a term
\term(Text, Term, Bindings).
row(-Cells)// is det[private]
rest_table(+Lines, +Indent, -Rows, -RestLines)[private]
- column_alignment(-Alignment) is semidet[private]
- Process an alignment line.
rest_par(+Lines, -Par, +BaseIndent, +MaxI0, -MaxI, -RestLines) is det[private]- Take the rest of a paragraph. Paragraphs are ended by a blank
line or the start of a list-item. The latter is a bit dubious.
Why not a general block-level object? The current definition
allows for writing lists without a blank line between the items.
section_header(+Lines, -Section, -RestLines) is semidet[private]- Get a section line from the input.
twiki_section_line(+Tokens, -Section) is semidet[private]- Extract a section using the Twiki conventions. The section may
be preceded by [Word], in which case we generate an anchor name
Word for the section.
md_section_line(+Tokens, -Section) is semidet[private]- Handle markdown section lines starting with #
strip_ws_tokens(+Tokens, -Stripped)[private]- Strip leading and trailing whitespace from a token list. Note
that the whitespace is already normalised.
strip_leading_ws(+Tokens, -Stripped) is det[private]- Strip leading whitespace from a token list.
tags(+Lines:lines, -Tags) is semidet[private]- If the first line is a @tag, read the remainder of the lines to
a list of \tag(Name, Value) terms.
collect_tags(+IndentedLines, -Tags) is semidet[private]- Create a list Order-tag(Tag,Tokens) for each @tag encountered.
Order is the desired position as defined by tag_order/2.
- To be done
- - Tag content is often poorly aligned. We now find the
alignment of subsequent lines and assume the first line is
aligned with the remaining lines.
tag_name(+String, -Tag:atom, -Order:int) is semidet[private]- True if String denotes a known tag name.
renamed_tag(+DeprecatedTag:atom, -Tag:atom, -Warn) is semidet[private]- Declaration for deprecated tags.
tag_order(+Tag:atom, -Order:int) is semidet[private]- Both declares the known tags and their expected order. Currently
the tags are forced into this order without warning. Future
versions may issue a warning if the order is inconsistent.
combine_tags(+Tags:list(tag(Key,Value)), -Tags:list) is det[private]- Creates the final tag-list. Tags is a list of
- \params(list(param(Name, Descr)))
- \tag(Name, list(Descr))
Descr is a list of tokens.
wiki_faces(+Structure, +ArgNames, -HTML) is det[private]- Given the wiki structure, analyse the content of the paragraphs,
list items and table cells and apply font faces and links.
structure_term(+Term, -Functor, -Content) is semidet[private]
- structure_term(-Term, +Functor, +Content) is det[private]
- (Un)pack a term describing structure, so we can process Content
and re-pack the structure.
verbatim_term(?Term) is det[private]- True if Term must be passed verbatim.
matches(:Goal, -Input, -Last)//[private]- True when Goal runs successfully on the DCG input and Input
is the list of matched tokens.
wiki_faces(-WithFaces, +ArgNames)// is nondet[private]
wiki_faces(-WithFaces, +ArgNames, +Options)// is nondet[private]- Apply font-changes and automatic links to running text. The
faces are applied after discovering the structure (paragraphs,
lists, tables, keywords).
- Arguments:
- - Options: is a dict, minimally containing depth
- prolog:doc_wiki_face(-Out, +VarNames)// is semidet[multifile]
- prolog:doc_wiki_face(-Out, +VarNames, +Options0)// is semidet[multifile]
- Hook that can be used to provide processing for
additional inline wiki constructs. The DCG list is a list of
tokens. Defined tokens are:
- w(Atom)
- Recognised word (alphanumerical)
- Atom
- Single character atom representing punctuation marks or the
atom ' ' (space), representing white-space.
The Out variable is input for the backends defined in
doc_latex.pl and doc_html.pl. Roughly, these are terms similar
to what html//1 from library(http/html_write) accepts.
- wiki_face_simple(-Out, +ArgNames, +Options)[private]
- Skip simple (non-markup) wiki.
code_words(-Words)//[private]- True when Words is the content as it appears in `code`,
where `` is mapped to `.
eq_code_words(-Words)//[private]- Stuff that can be between single =. This is limited to
- Start and end must be a word
- In between may be the following punctuation chars: .-:/
(notably dealing with file names and
identifiers in various external languages).
code_face(+Text, +Term, +Vars, -Code) is det[private]- Deal with `... code ...` sequences. Text is the matched
text, Term is the parsed Prolog term and Code is the resulting
intermediate code.
- emphasis_seq(-Out, +ArgNames, +Options) is semidet[private]
- Recognise emphasis sequences
emphasis_term(+Emphasis, +Tokens, -Term) is det[private]
emphasis_before(-Before)// is semidet[private]
emphasis_start(-Emphasis)// is semidet[private]
emphasis_end(+Emphasis)// is semidet[private]- Primitives for Doxygen emphasis handling.
- arg_list(-Atoms) is nondet[private]
- Atoms is a token-list for a Prolog argument list. An
argument-list is a sequence of tokens '(' ... ')'.
- bug
- - the current implementation does not deal correctly with
brackets that are embedded in quoted strings.
term_face(+Text, +Term, +Vars, -Face, +Options) is semidet[private]- Process embedded Prolog-terms. Currently processes Alias(Arg)
terms that refer to files. Future versions will also provide
pretty-printing of Prolog terms.
image_label(-Label)//[private]- Match File[;param=value[,param=value]*]
- file_options(-Options) is det[private]
- Extracts additional processing options for files. The format is
;name="value",name2=value2,... Spaces are not allowed.
wiki_link(-Link, +Options)// is semidet[private]- True if we can find a link to a file or URL. Links are described
as one of:
- filename
-
A filename defined using autolink_file/2 or
autolink_extension/2
- <url-protocol>://<rest-url>
-
A fully qualified URL
- '<' URL '>'
-
Be more relaxed on the URL specification.
- prolog:url_expansion_hook(+Term, -HREF, -Label) is semidet[multifile]
- This hook is called after recognising <Alias:Rest>, where
Term is of the form Alias(Rest). If it succeeds, it must bind
HREF to an atom or string representing the link target and Label
to an html//1 expression for the label.
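As an illustration of the hook contract described above, the following sketch defines a clause for a hypothetical `rfc` alias, so that wiki text containing <rfc:2616> would link to the RFC editor site. The alias name, the URL pattern and the label text are assumptions made for this example; only the hook name and argument order come from the documentation.

```prolog
:- multifile prolog:url_expansion_hook/3.

% Hypothetical alias: Term arrives as rfc(Rest).
% HREF is bound to an atom and Label to an atom, which is a
% valid html//1 expression.
prolog:url_expansion_hook(rfc(Number), HREF, Label) :-
    format(atom(HREF), 'https://www.rfc-editor.org/rfc/rfc~w', [Number]),
    format(atom(Label), 'RFC ~w', [Number]).
```

The clause fails for any other alias, which leaves such terms to other hook clauses or the default processing.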
file_name(-Name:atom, -Ext:atom)// is semidet[private]- Matches a filename. A filename is defined as a sequence
<segment>{/<segment>}.<ext>.
- resolve_file(+Name, -FileOptions, ?RestOptions, +Options) is det[private]
- Find the actual file based on the pldoc_file global variable. If
present and the file is resolvable, add an option
absolute_path(Path) that reflects the current location of the
file.
arity(-Arity:int)// is semidet[private]- True if the next token can be interpreted as an arity. That is,
it denotes a non-negative integer of at most 20. Although Prolog
allows for higher arities, we assume 20 is a fair maximum for
user-created predicates that are documented.
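The arity rule above can be sketched as a plain predicate. This is not the library's DCG rule: sketch_arity/2 is a hypothetical name, and it assumes the candidate is available as an atom such as '2'.

```prolog
% sketch_arity(+Atom, -Arity): succeeds if Atom reads as an
% integer in the range 0..20 (the documented maximum).
sketch_arity(Atom, Arity) :-
    atom_number(Atom, Arity),
    integer(Arity),
    between(0, 20, Arity).
```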
symbol_string(-String)// is nondet[private]- Accept a sequence of Prolog symbol characters, starting with the
shortest (empty) match.
prolog_symbol_char(?Char)[private]- True if Char is classified by Prolog as a symbol char.
autolink_extension(?Ext, ?Type) is nondet- True if Ext is a filename extension that creates automatic links
in the documentation.
autolink_file(?File, -Type) is nondet- Files to which we automatically create links, regardless of the
extension.
citations(-List)//[private]- Parse @cite1[;@cite2]* into a list of citations.
section_comment_header(+Lines, -Header, -RestLines) is semidet- Processes /** <section> comments. Header is a term
\section(Type, Title), where Title is an atom holding the
section title and Type is an atom holding the text between <>.
- Arguments:
- - Lines: List of Indent-Codes.
- - Header: DOM term of the format \section(Type, Title),
where Type is an atom from <type> and Title is
a string holding the title.
tokenize_lines(+Lines:lines, -TokenLines) is det[private]- Convert Indent-Codes into Indent-Tokens
line_tokens(-Tokens:list)// is det[private]- Create a list of tokens, where each token is either a ' ' to
denote spaces, a term w(Word) denoting a word or an atom
denoting a punctuation character. Underscores (_) appearing
inside an alphanumerical string are considered part of the word.
E.g., "hello_world_" tokenizes into [w(hello_world), '_'].
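The tokenization rule can be sketched with a small DCG over character codes. This is an independent illustration under stated assumptions, not the library's line_tokens//1: all predicate names are hypothetical, runs of white space are collapsed into a single ' ' token, and an underscore is kept inside a word only when an alphanumerical character follows it.

```prolog
% sketch_tokens(-Tokens)//: tokenize a list of character codes.
sketch_tokens([T|Ts]) --> sketch_token(T), !, sketch_tokens(Ts).
sketch_tokens([]) --> [].

% A run of white space becomes one ' ' token.
sketch_token(' ') -->
    [C], { code_type(C, space) }, !, sketch_skip_spaces.
% A word starts with an alphanumerical character.
sketch_token(w(Word)) -->
    [C], { code_type(C, alnum) }, !,
    sketch_word_rest(Cs),
    { atom_codes(Word, [C|Cs]) }.
% Anything else is a single-character punctuation atom.
sketch_token(Punct) --> [C], { char_code(Punct, C) }.

sketch_skip_spaces --> [C], { code_type(C, space) }, !, sketch_skip_spaces.
sketch_skip_spaces --> [].

% An underscore belongs to the word only if followed by alnum.
sketch_word_rest([C|T]) -->
    [C], { code_type(C, alnum) }, !, sketch_word_rest(T).
sketch_word_rest([0'_, C|T]) -->
    [0'_, C], { code_type(C, alnum) }, !, sketch_word_rest(T).
sketch_word_rest([]) --> [].
```

Running `phrase(sketch_tokens(Ts), `hello_world_`)` on this sketch yields the token list from the example above, with the trailing underscore as a separate punctuation atom.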
verbatim(+Lines, +EnvIndent, -Pre, -RestLines) is det[private]- Extract a verbatim environment. The returned Pre is of the
format pre(Attributes, String). The indentation of the leading
fence is subtracted from the indentation of the verbatim lines.
Two types of fences are supported: the traditional == and
the Doxygen ~~~ (minimum 3 ~ characters), optionally
followed by {.ext} to indicate the language.
Verbatim environment is delimited as
...,
verbatim(Lines, Pre, Rest)
...,
In addition, a verbatim environment may simply be indented. The
restrictions are described in the documentation.
tilde_fence_ext(-Ext)// is semidet[private]- Detect `{.prolog} (Doxygen) or `{prolog} (GitHub)
indented_verbatim_body(+Lines, +Indent, -CodeLines, -RestLines)[private]- Takes more verbatim lines. The input ends with the first line
that is indented less than Indent. There cannot be more than one
consecutive empty line in the verbatim body.
valid_verbatim_opening(+Line) is semidet[private]- Tests that line does not look like a list item or table.
lines_code_text(+Lines, +Indent, -Codes) is det[private]- Extract the actual code content from a list of line structures.
pre_indent(+Indent)// is det[private]- Insert Indent leading spaces. Note we cannot use tabs as these
are not expanded by the HTML <pre> element.
summary_from_lines(+Lines:lines, -Summary:list(codes)) is det- Produce a summary for Lines. Similar to JavaDoc, the summary is
defined as the first sentence of the documentation. In addition,
a sentence is also ended by an empty line or the end of the
comment.
skip_empty_lines(+LinesIn, -LinesOut) is det[private]- Remove empty lines from the start of the input. Note that
this is used both to process character and token data.
indented_lines(+Text:list(codes), +Prefixes:list(codes), -Lines:list) is det- Extract a list of lines from Text, without leading blanks or
characters from Prefixes. Each line is a term Indent-Codes, where
Indent specifies the line_position of the real text of the line.
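A short usage sketch for indented_lines/3, assuming the module is loaded from library(pldoc/doc_wiki). The comment text, the single `%` prefix and the helper name demo_indented_lines/1 are illustrative assumptions.

```prolog
:- use_module(library(pldoc/doc_wiki)).

% demo_indented_lines(-Lines) is a hypothetical helper: it splits
% a %-style comment into Indent-Codes pairs.
demo_indented_lines(Lines) :-
    Text = `%  First line.\n%    Indented more.\n`,
    indented_lines(Text, [`%`], Lines).
```

Each element of Lines should be a pair whose first component is the line position at which the real text starts, so the second line is expected to carry a larger Indent than the first.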
end_of_comment//[private]- Succeeds if we hit the end of the comment.
- bug
- - %*/ will be seen as the end of the comment.
take_prefix(+Prefixes:list(codes), +Indent0:int, -Indent:int)// is det[private]- Get the leading characters from the input and compute the
line-position at the end of the leading characters.
string_update_linepos(+Codes, +Pos0, -Pos) is det[private]- Update line-position after adding Codes at Pos0.
update_linepos(+Code, +Pos0, -Pos) is det[private]- Update line-position after adding Code.
- To be done
- - Currently assumes tab-width of 8.
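The line-position bookkeeping can be sketched as follows. This is a simplified stand-in rather than the library code: the predicate names are hypothetical, only tabs and ordinary characters are handled, and the tab width of 8 follows the note above.

```prolog
:- use_module(library(apply)).

% sketch_update_linepos(+Code, +Pos0, -Pos)
% A tab advances to the next multiple of 8; any other code
% advances the position by one column.
sketch_update_linepos(0'\t, Pos0, Pos) :-
    !,
    Pos is (Pos0 + 8) - (Pos0 mod 8).
sketch_update_linepos(_, Pos0, Pos) :-
    Pos is Pos0 + 1.

% sketch_string_linepos(+Codes, +Pos0, -Pos)
% Fold the single-code update over a list of codes.
sketch_string_linepos(Codes, Pos0, Pos) :-
    foldl(sketch_update_linepos, Codes, Pos0, Pos).
```

For the codes of "a", a tab and "b" starting at column 0, the position moves to 1, then to 8, then to 9.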
take_line(-Line:codes)// is det[private]- Take a line from the input. Line does not include the
terminating \r or \n character(s), nor trailing whitespace.
normalise_indentation(+LinesIn, -LinesOut) is det[private]- Re-normalise the indentation, such that the left-most line is at
zero. Note that we skip empty lines in the computation.
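The normalisation step can be sketched over Indent-Codes pairs. The predicate names are hypothetical, and the sketch assumes an empty line is represented with an empty code list; the library's own representation may differ.

```prolog
:- use_module(library(apply)).
:- use_module(library(lists)).

% sketch_normalise_indentation(+Lines0, -Lines)
% Shift all lines left so the left-most non-empty line is at 0.
sketch_normalise_indentation(Lines0, Lines) :-
    findall(I, ( member(I-Codes, Lines0), Codes \== [] ), Indents),
    (   min_list(Indents, Min)          % fails if all lines are empty
    ->  maplist(sketch_shift_left(Min), Lines0, Lines)
    ;   Lines = Lines0
    ).

% Empty lines may have a smaller indent, so clamp at zero.
sketch_shift_left(Min, I0-Codes, I-Codes) :-
    I is max(0, I0 - Min).
```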
strip_leading_par(+Dom0, -Dom) is det- Remove the leading paragraph for environments where a paragraph
is not required.
ws// is det[private]- Eagerly skip layout characters
nl//[private]- Get end-of-line
peek(H)//[private]- True if next token is H without eating it.
tokens(-Tokens:list)// is nondet[private]
tokens(+Max, -Tokens:list)// is nondet[private]- Defensively take tokens from the input. Backtracking takes more
tokens. Do not include structure terms.
tokens_no_whitespace(-Tokens:list(atom))// is nondet[private]- Defensively take tokens from the input. Backtracking takes more
tokens. Tokens cannot include whitespace. Word tokens are
returned as their represented words.
limit(+Count, :Rule)//[private]- As limit/2, but for grammar rules.