| Did you know ... | Search Documentation: | 
|  | xpath.pl -- Select nodes in an XML DOM | 
The library xpath.pl provides predicates to select nodes from an XML DOM tree as produced by library(sgml) based on descriptions inspired by the XPath language.
The predicate xpath/3 selects a sub-structure of the DOM non-deterministically based on an XPath-like specification. Not all selectors of XPath are implemented, but the ability to mix xpath/3 calls with arbitrary Prolog code provides a powerful tool for extracting information from XML parse-trees.
 xpath_chk(+DOM, +Spec, ?Content) is semidet
 xpath_chk(+DOM, +Spec, ?Content) is semidet xpath(+DOM, +Spec, ?Content) is nondet
 xpath(+DOM, +Spec, ?Content) is nondet//Term/Term
The Terms above are of type callable. The functor specifies
the element name. The element name '*' refers to any element.
The name self refers to the top-element itself and is often
used for processing matches of an earlier xpath/3 query. A term
NS:Term refers to an XML name in the namespace NS. Optional
arguments specify additional constraints and functions. The
arguments are processed from left to right. Defined conditional
argument values are:
lastlast - IntExprlast-1 is the element directly preceding the last one.index(Integer).lastindex(last).last - IntExprindex(last-IntExpr).Defined function argument values are:
selfcontenttexttext(As)atom or string.normalize_spacetext, but uses normalize_space/2 to normalise
white-space in the outputnumber@Attributenumber, but subsequently transform the value
into an integer using the round/1 function.number, but subsequently transform the value
into a float using the float/1 function.@href and
@href(atom) are equivalent. The SGML parser
can return attributes as strings using the
attribute_value(string) option.In addition, the argument-list can be conditions:
content = content defines that the content
of the element is the atom content.
The functions lower_case and upper_case can be applied
to Right (see example below).contains(Haystack, Needle)h3 element inside a div element, where the div
element itself contains an h2 child with a strong
child.
//div(h2/strong)/h3
This is equivalent to the conjunction of XPath goals below.
..., xpath(DOM, //(div), Div), xpath(Div, h2/strong, _), xpath(Div, h3, Result)
Examples:
Match each table-row in DOM:
xpath(DOM, //tr, TR)
Match the last cell of each tablerow in DOM. This example illustrates that a result can be the input of subsequent xpath/3 queries. Using multiple queries on the intermediate TR term guarantee that all results come from the same table-row:
xpath(DOM, //tr, TR), xpath(TR, /td(last), TD)
Match each href attribute in an <a> element
xpath(DOM, //a(@href), HREF)
Suppose we have a table containing rows where each first column is the name of a product with a link to details and the second is the price (a number). The following predicate matches the name, URL and price:
product(DOM, Name, URL, Price) :-
    xpath(DOM, //tr, TR),
    xpath(TR, td(1), C1),
    xpath(C1, /self(normalize_space), Name),
    xpath(C1, a(@href), URL),
    xpath(TR, td(2, number), Price).
Suppose we want to select books with genre="thriller" from a
tree containing elements <book genre=...>
thriller(DOM, Book) :-
    xpath(DOM, //book(@genre=thiller), Book).
Match the elements <table align="center"> and <table
align="CENTER">:
    //table(@align(lower) = center)
Get the width and height of a div element as a number,
and the div node itself:
    xpath(DOM, //div(@width(number)=W, @height(number)=H), Div)
Note that div is an infix operator, so parentheses must be
used in cases like the following:
    xpath(DOM, //(div), Div)
 sub_dom(-Index, -Count, +Name, -Sub, +DOM) is nondet[private]
 sub_dom(-Index, -Count, +Name, -Sub, +DOM) is nondet[private] count_named_elements(+Content, +Name, -Count) is det[private]
 count_named_elements(+Content, +Name, -Count) is det[private] nth_element(?N, +Name, -Element, +Content:list) is nondet[private]
 nth_element(?N, +Name, -Element, +Content:list) is nondet[private] modifiers(+Modifiers, +I, +Clen, +DOM, -Value)[private]
 modifiers(+Modifiers, +I, +Clen, +DOM, -Value)[private] process_equality(+Left, +Right) is semidet[private]
 process_equality(+Left, +Right) is semidet[private]For example the XPath expression in [1], and the equivalent Prolog expression in [2], would both match the HTML element in [3].
[1] //table[align=lower-case(center)] [2] //table(@align=lower_case(center)) [3] <table align="CENTER">
 text_of_dom(+DOM, +As, -Text:atom) is det[private]
 text_of_dom(+DOM, +As, -Text:atom) is det[private]