This library provides high-performance C-based primitives for
manipulating URIs. We chose a C-based implementation because it is much
faster at raw character manipulation, and URI handling primitives are
used in time-critical parts of RDF processing. This implementation is
based on RFC-3986:
http://labs.apache.org/webarch/uri/rfc/rfc3986.html
URI processing in this library is rather liberal. That is, we split
URIs into components according to the rules, but we do not validate the
components themselves. Percent-decoding for IRIs is also liberal: it
first tries UTF-8, then ISO-Latin-1, and finally accepts %-characters
verbatim. Earlier experience has shown that strict enforcement of the
URI syntax rejects many URIs that other web-document processing tools
accept.
This library provides explicit support for URN URIs.
 uri_components(+URI, -Components) is det
- uri_components(-URI, +Components) is det
- Break a URI into its 5 basic components according to the
RFC-3986 regular expression:
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
 12            3  4          5       6  7        8 9 
If the scheme is urn, it is broken into its scheme, NID
(Namespace Identifier) and NSS (Namespace Specific String).
 
- Arguments:
- Components is one of:
- uri_components(Scheme, Authority, Path, Search, Fragment)
- If a URI is parsed, i.e., using mode (+,-), components that
are not found are left uninstantiated (variable). See
uri_data/3 for accessing this structure.
- urn_components(Scheme, NID, NSS, Search, Fragment)
- Here Scheme is always urn. Otherwise the same comments
as for uri_components/5 apply.
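The two modes of uri_components/2 can be illustrated with a minimal sketch. The helper name split_uri/6 and the example.org URIs are assumptions made for illustration; they are not part of the library.

```prolog
:- use_module(library(uri)).

% split_uri(+URI, -Scheme, -Authority, -Path, -Search, -Fragment)
% Hypothetical helper: parse URI in mode (+,-) and expose the five
% basic components. Components that are absent stay unbound.
split_uri(URI, Scheme, Authority, Path, Search, Fragment) :-
    uri_components(URI,
                   uri_components(Scheme, Authority, Path, Search, Fragment)).

% ?- split_uri('http://www.example.org/a/b?x=1#top', S, A, P, Q, F).
% S = http, A = 'www.example.org', P = '/a/b', Q = 'x=1', F = top.
%
% Mode (-,+): unbound components are omitted from the result.
% ?- uri_components(URI, uri_components(http, 'example.org', '/x', _, _)).
% URI = 'http://example.org/x'.
```

Because missing components are left as variables, callers typically test them with var/1 or nonvar/1 before use.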
 
 
 uri_data(+Field, +Components, -Data) is semidet
- uri_data(-Field, +Components, -Data) is nondet
- Provide access to the uri_components or urn_components structure.
The Field scheme is always present. Other fields depend on the
scheme. The urn scheme provides nid and nss. Other schemes
provide authority, path, search and fragment.
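As a small sketch of field access, the hypothetical helper uri_path/2 below (not part of the library) parses a URI and extracts a single field through uri_data/3 instead of matching on the structure directly.

```prolog
:- use_module(library(uri)).

% uri_path(+URI, -Path)
% Hypothetical helper: extract only the path component of URI.
uri_path(URI, Path) :-
    uri_components(URI, Components),
    uri_data(path, Components, Path).

% ?- uri_path('http://example.org/a/b?x=1', Path).
% Path = '/a/b'.
```

Going through uri_data/3 keeps calling code independent of the argument order of the components term.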
 uri_data(+Field, +Components, +Data, -NewComponents) is det
- NewComponents is the same as Components with Field set to Data.
- Errors
- - domain_error(uri_field, Field) if Field is invalid.
- - instantiation_error if Field or Components is unbound.
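A typical use of uri_data/4 is to change one component and rebuild the URI. The sketch below assumes a helper named set_fragment/3, which is illustrative only.

```prolog
:- use_module(library(uri)).

% set_fragment(+URI0, +Fragment, -URI)
% Hypothetical helper: URI is URI0 with its fragment set to Fragment.
set_fragment(URI0, Fragment, URI) :-
    uri_components(URI0, Components0),               % parse
    uri_data(fragment, Components0, Fragment, Components),
    uri_components(URI, Components).                 % rebuild

% ?- set_fragment('http://example.org/doc', sec2, URI).
% URI = 'http://example.org/doc#sec2'.
```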
 
 uri_normalized(+URI, -NormalizedURI:atom) is det
- NormalizedURI is the normalized form of URI. Normalization is
syntactic and involves the following steps:
- 6.2.2.1. Case Normalization
- 6.2.2.2. Percent-Encoding Normalization
- 6.2.2.3. Path Segment Normalization
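One use of syntactic normalization is comparing URIs that differ only in dot segments, case or percent-encoding. The following sketch assumes a helper named same_resource/2; it is an illustration, not a library predicate, and it only establishes syntactic equivalence.

```prolog
:- use_module(library(uri)).

% same_resource(+URI1, +URI2)
% Hypothetical helper: true if both URIs have the same normalized form.
same_resource(URI1, URI2) :-
    uri_normalized(URI1, Normalized),
    uri_normalized(URI2, Normalized).

% Path segment normalization removes "." and ".." segments, so these
% two URIs are considered the same:
% ?- same_resource('http://example.org/a/./b/../c',
%                  'http://example.org/a/c').
% true.
```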
 
 iri_normalized(+IRI, -NormalizedIRI) is det
- NormalizedIRI is the normalized form of IRI. Normalization is
syntactic and involves the following steps:
- 6.2.2.1. Case Normalization
- 6.2.2.3. Path Segment Normalization
 
- See also
- - This is similar to uri_normalized/2, but does not do
normalization of %-escapes.
 
 uri_normalized_iri(+URI, -NormalizedIRI) is det
- As uri_normalized/2, but percent-encoding is translated into IRI
Unicode characters. The translation is liberal: valid UTF-8
sequences of %-encoded bytes are mapped to the Unicode
character. Other %XX-sequences are mapped to the corresponding
ISO-Latin-1 character and sole % characters are left untouched.
- See also
- - uri_iri/2.
 
 uri_is_global(+URI) is semidet
- True if URI has a scheme. The semantics is the same as the code
below, but the implementation is more efficient as it does not need
to parse the other components, nor needs to bind the scheme. The
condition to demand a scheme of more than one character is added to
avoid confusion with DOS path names.
uri_is_global(URI) :-
        uri_components(URI, Components),
        uri_data(scheme, Components, Scheme),
        nonvar(Scheme),
        atom_length(Scheme, Len),
        Len > 1.
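The effect of the scheme-length condition can be sketched with a small classifier. The name uri_kind/2 is an assumption made for this example.

```prolog
:- use_module(library(uri)).

% uri_kind(+URI, -Kind)
% Hypothetical helper: Kind is global if URI has a scheme of more
% than one character and local otherwise.
uri_kind(URI, Kind) :-
    (   uri_is_global(URI)
    ->  Kind = global
    ;   Kind = local
    ).

% ?- uri_kind('http://example.org/', K).   % K = global.
% ?- uri_kind('c:/windows', K).            % K = local (DOS path name).
% ?- uri_kind('/a/b', K).                  % K = local.
```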
 uri_resolve(+URI, +Base, -GlobalURI:atom) is det
- Resolve a possibly local URI relative to Base. This implements
http://labs.apache.org/webarch/uri/rfc/rfc3986.html#relative-transform
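As an illustration, the sketch below resolves a list of relative references against one base. The helper names resolve_all/3 and resolve_against/3 are assumptions; the base and the expected results are taken from the reference resolution examples in RFC 3986, section 5.4.1.

```prolog
:- use_module(library(uri)).
:- use_module(library(apply)).

% resolve_all(+Base, +Relatives, -Globals)
% Hypothetical helper: resolve each relative reference against Base.
resolve_all(Base, Relatives, Globals) :-
    maplist(resolve_against(Base), Relatives, Globals).

% Argument order adapted so that Base can be passed as a closure
% argument to maplist/3.
resolve_against(Base, Relative, Global) :-
    uri_resolve(Relative, Base, Global).

% ?- resolve_all('http://a/b/c/d;p?q', [g, '../g'], Globals).
% Globals = ['http://a/b/c/g', 'http://a/b/g'].
```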
 uri_normalized(+URI, +Base, -NormalizedGlobalURI:atom) is det
- NormalizedGlobalURI is the normalized global version of URI.
Behaves as if defined by:
uri_normalized(URI, Base, NormalizedGlobalURI) :-
        uri_resolve(URI, Base, GlobalURI),
        uri_normalized(GlobalURI, NormalizedGlobalURI).
 iri_normalized(+IRI, +Base, -NormalizedGlobalIRI:atom) is det
- NormalizedGlobalIRI is the normalized global version of IRI.
This is similar to uri_normalized/3, but does not do %-escape
normalization.
 uri_normalized_iri(+URI, +Base, -NormalizedGlobalIRI:atom) is det
- NormalizedGlobalIRI is the normalized global IRI of URI. Behaves
as if defined by:
uri_normalized_iri(URI, Base, NormalizedGlobalIRI) :-
        uri_resolve(URI, Base, GlobalURI),
        uri_normalized_iri(GlobalURI, NormalizedGlobalIRI).
 uri_query_components(+String, -Query:atom) is det
- uri_query_components(-String, +Query) is det
- Perform encoding and decoding of a URI query string. Query is a
list of fully decoded (Unicode) Name=Value pairs. In mode (-,+),
query elements of the forms Name(Value) and Name-Value are also
accepted to enhance interoperability with the option and pairs
libraries. E.g.
?- uri_query_components(QS, [a=b, c('d+w'), n-'VU Amsterdam']).
QS = 'a=b&c=d%2Bw&n=VU%20Amsterdam'.
?- uri_query_components('a=b&c=d%2Bw&n=VU%20Amsterdam', Q).
Q = [a=b, c='d+w', n='VU Amsterdam'].
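In practice the encoded query string is usually attached to a URI. The sketch below combines uri_query_components/2 with uri_components/2 and uri_data/4; the helper name make_url/3 and the example.org address are assumptions made for illustration.

```prolog
:- use_module(library(uri)).

% make_url(+Base, +Params, -URL)
% Hypothetical helper: URL is Base with its query component replaced
% by the encoded form of Params, a list of Name=Value pairs.
make_url(Base, Params, URL) :-
    uri_components(Base, Components0),
    uri_query_components(Search, Params),            % mode (-,+): encode
    uri_data(search, Components0, Search, Components),
    uri_components(URL, Components).

% ?- make_url('http://example.org/find', [q='VU Amsterdam'], URL).
% URL = 'http://example.org/find?q=VU%20Amsterdam'.
```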
 uri_authority_components(+Authority, -Components) is det
- uri_authority_components(-Authority:atom, +Components) is det
- Break down the authority component of a URI. The fields of the
structure Components can be accessed using uri_authority_data/3.
This predicate deals with IPv6 addresses written as [ip],
returning the ip as host, without the enclosing []. When
constructing an authority string and the host contains :, the
host is embraced in []. If [] is not used correctly, the
behavior should be considered poorly defined. If there is no
balancing `]` or the host part does not end with `]`, these
characters are considered normal characters and part of the
(invalid) host name.
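A minimal sketch of extracting the host from a full URI follows. The helper name uri_host/2 is an assumption; it fails if the URI has no authority or the authority has no host.

```prolog
:- use_module(library(uri)).

% uri_host(+URI, -Host)
% Hypothetical helper: Host is the host part of the authority of URI.
uri_host(URI, Host) :-
    uri_components(URI, Components),
    uri_data(authority, Components, Authority),
    nonvar(Authority),                       % URI has an authority
    uri_authority_components(Authority, AuthComponents),
    uri_authority_data(host, AuthComponents, Host),
    nonvar(Host).                            % authority has a host

% ?- uri_host('http://alice@example.org:8080/x', Host).
% Host = 'example.org'.
% ?- uri_host('http://[::1]:8080/', Host).   % IPv6: brackets removed
% Host = '::1'.
```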
 uri_authority_data(+Field, ?Components, ?Data) is semidet
- Provide access to the uri_authority structure. Defined field names
are: user, password, host and port.
 uri_encoded(+Component, +Value, -Encoded:atom) is det
- uri_encoded(+Component, -Value:atom, +Encoded) is det
- Encoded is the URI encoding for Value. When encoding
(Value->Encoded), Component specifies the URI component where the
value is used. It is one of query_value, fragment, path or segment.
Besides alphanumerical characters, the following
characters are passed verbatim (the set is split in logical groups
according to RFC3986).
- query_value, fragment
- 
"-._~" | "!$'()*,;" | "@" | "/?"
- path
- 
"-._~" | "!$&'()*,;=" | "@" | "/"
- segment
- 
"-._~" | "!$&'()*,;=" | "@"
 
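The difference between the path and segment components matters when a single path segment may itself contain a slash. The sketch below assumes a helper named encode_segments/2 (not part of the library) that encodes each segment separately and then joins the results.

```prolog
:- use_module(library(uri)).
:- use_module(library(apply)).

% encode_segments(+Segments, -Path)
% Hypothetical helper: encode every element of Segments as a single
% path segment and join the results with "/".
encode_segments(Segments, Path) :-
    maplist(uri_encoded(segment), Segments, Encoded),
    atomic_list_concat(Encoded, /, Path).

% ?- encode_segments([docs, 'a/b c'], Path).
% Path = 'docs/a%2Fb%20c'.
%
% With component path, "/" is passed verbatim instead:
% ?- uri_encoded(path, 'a/b c', Encoded).
% Encoded = 'a/b%20c'.
```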
 uri_iri(+URI, -IRI:atom) is det
- uri_iri(-URI:atom, +IRI) is det
- Convert between a URI, encoded in US-ASCII and an IRI. An IRI is
a fully expanded Unicode string. Unicode strings are first
encoded into UTF-8, after which %-encoding takes place.
- Errors
- - syntax_error(Culprit) in mode (+,-) if URI is not a
legally percent-encoded UTF-8 string.
 
 uri_file_name(+URI, -FileName:atom) is semidet
- uri_file_name(-URI:atom, +FileName) is det
- Convert between a URI and a local file_name. This protocol is
covered by RFC 1738. Please note that file-URIs use absolute
paths. The mode (-, +) translates a possible relative path into
an absolute one.
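Both directions can be sketched with a round trip. The helper name file_uri_roundtrip/3 is an assumption, and the results shown in the comment assume a POSIX system; on Windows the file names and URIs look different.

```prolog
:- use_module(library(uri)).

% file_uri_roundtrip(+File, -URI, -File2)
% Hypothetical helper: convert File to a file URI and back again.
file_uri_roundtrip(File, URI, File2) :-
    uri_file_name(URI, File),      % mode (-,+): file name to URI
    uri_file_name(URI, File2).     % mode (+,-): URI to file name

% ?- file_uri_roundtrip('/tmp/foo.txt', URI, File).
% On a POSIX system:
% URI = 'file:///tmp/foo.txt', File = '/tmp/foo.txt'.
```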
 uri_edit(+Actions, +URI0, -URI) is det
- Modify a URI according to Actions. Actions is either a single
action or a (nested) list of actions. Defined primitive actions
are:
- scheme(+Scheme)
- Set the Scheme of the URI (typically http, https, etc.)
- user(+User)
- Add/set the user of the authority component.
- password(+Password)
- Add/set the password of the authority component.
- host(+Host)
- Add/set the host (or ip address) of the authority component.
- port(+Port)
- Add/set the port of the authority component.
- path(+Path)
- Set/extend the path component. If Path is not absolute it
is taken relative to the path of URI0.
- search(+KeyValues)
- Extend the Key=Value pairs of the current search (query)
component. New values replace existing values. If KeyValues
is written as =(KeyValues) the current search component is
ignored. KeyValues is a list whose elements are one of
Key=Value, Key-Value or `Key(Value)`.
- fragment(+Fragment)
- Set the Fragment of the uri.
- nid(+NID)
- Set the Namespace Identifier for a URN URI.
- nss(+NSS)
- Set the Namespace Specific String for a URN URI.
 
Components can be removed by using a variable as value, except
for path, which can be reset using path(/), and query, which can
be dropped using query(=([])).
 
- Arguments:
- URI0 is either a valid URI or a variable to start fresh.
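As a hedged sketch of combining several actions, the helper below switches a URI to https and adds a search parameter in one call. The name secure_search/3 and the example.org address are assumptions made for illustration, and the code requires a version of the library that provides uri_edit/3.

```prolog
:- use_module(library(uri)).

% secure_search(+URI0, +Query, -URI)
% Hypothetical helper: URI is URI0 with the scheme set to https and
% the pair q=Query added to (or replaced in) the search component.
secure_search(URI0, Query, URI) :-
    uri_edit([ scheme(https),
               search([q=Query])
             ],
             URI0, URI).

% ?- secure_search('http://example.org/find', prolog, URI).
```

Passing the actions as a list keeps the edit declarative; the same list can be built up dynamically or nested, as described above.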