Flexible ordering and equivalence based on character table
This package was developed as part of the GRASP project, where it is used for browsing lexical and ontology information, which is normally stored in ‘dictionary’ order rather than in the more conventional alphabetical order based on character codes. To achieve programmable ordering, the table package defines ‘order tables’. An order table has one entry for each character in the character set (256 for extended ASCII). It maps each character onto its ‘order number’, and maps some characters onto special codes.
The default (exact) table maps all character codes onto themselves. The default case_insensitive table maps all uppercase characters onto their corresponding lowercase character. The tables iso_latin_1 and iso_latin_1_case_insensitive map the ISO Latin-1 letters with diacritics onto their plain counterpart.
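The idea of an order table can be illustrated outside Prolog. The following Python sketch is not the package's implementation; it only models a table as a 256-entry list indexed by character code. The function names are hypothetical, and the use of Unicode decomposition to find the plain counterpart of a letter with a diacritic is an assumption about which characters are folded.

```python
import unicodedata

TABLE_SIZE = 256  # one entry per character code (extended ASCII / ISO Latin-1)


def exact_table():
    """Model of the 'exact' table: every code maps onto itself."""
    return list(range(TABLE_SIZE))


def case_insensitive_table():
    """Model of 'case_insensitive': ASCII uppercase maps onto lowercase."""
    table = exact_table()
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = code + 32  # 'A' (65) -> 'a' (97), and so on
    return table


def iso_latin_1_table():
    """Model of 'iso_latin_1': letters with diacritics map onto plain letters.

    Assumption: a character is folded only if its canonical decomposition
    starts with an ASCII letter followed by combining marks.
    """
    table = exact_table()
    for code in range(192, TABLE_SIZE):
        decomposed = unicodedata.normalize("NFD", chr(code))
        if len(decomposed) > 1 and ord(decomposed[0]) < 128:
            table[code] = ord(decomposed[0])
    return table


def sort_key(table, text):
    """Translate a Latin-1 string into its list of order numbers."""
    return [table[code] for code in text.encode("latin-1")]


words = ["Zebra", "apple"]
by_code = sorted(words, key=lambda w: sort_key(exact_table(), w))
by_letter = sorted(words, key=lambda w: sort_key(case_insensitive_table(), w))
```

With the exact table, "Zebra" sorts before "apple" because the code of "Z" is lower than the code of "a"; with the case-insensitive table the order is reversed.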
To support dictionary ordering, the following special categories are defined:
ignore | Characters of the ignore set are simply discarded from the input. |
break | Characters of the break set are treated as word breaks, and each non-empty sequence of them is considered equal. A word break precedes a normal character. |
tag | Characters of type tag indicate the start of a ‘tag’ that should not be considered in ordering, unless both strings are the same up to the tag. |
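The effect of the three categories can be sketched as a comparison key. The Python code below is an illustration under stated assumptions, not the package's algorithm: the helper names are hypothetical, a word break is encoded as -1 so that it precedes every normal character, and the text after a tag character is compared by plain character code.

```python
IGNORE, BREAK, TAG = "ignore", "break", "tag"


def make_table(ignore="", breaks="", tags=""):
    """Build a 256-entry table; special entries hold a category name."""
    table = list(range(256))
    for ch in ignore:
        table[ord(ch)] = IGNORE
    for ch in breaks:
        table[ord(ch)] = BREAK
    for ch in tags:
        table[ord(ch)] = TAG
    return table


def dictionary_key(table, text):
    """Return (main_key, tag_key) for a Latin-1 string.

    Tuples compare the main key first, so the tag part only matters
    when both strings are the same up to the tag.
    """
    codes = text.encode("latin-1")
    main = []
    in_break = False
    for index, code in enumerate(codes):
        entry = table[code]
        if entry == IGNORE:
            continue  # discarded from the input
        if entry == TAG:
            # Everything after the tag character is secondary.
            return (main, list(codes[index + 1:]))
        if entry == BREAK:
            if not in_break:
                main.append(-1)  # one marker per run of break characters
            in_break = True
        else:
            main.append(entry)
            in_break = False
    return (main, [])


table = make_table(ignore="-", breaks=" _", tags="(")
```

Under these assumptions "e-mail" and "email" produce the same key, "ice  cream" equals "ice_cream", "ice cream" sorts before "iceberg", and "bank(money)" and "bank(river)" differ only in their tag part.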
The following predicates are defined to manage and use these tables:
case_insensitive | Map all upper- to lowercase characters. |
iso_latin_1 | Start with an ISO-Latin-1 table. |
iso_latin_1_case_insensitive | Start with a case-insensitive ISO-Latin-1 table. |
copy(+Table) | Copy all entries from Table. |
tag(+ListOfCodes) | Add these characters to the set of ‘tag’ characters. |
ignore(+ListOfCodes) | Add these characters to the set of ‘ignore’ characters. |
break(+ListOfCodes) | Add these characters to the set of ‘break’ characters. |
+Code1 = +Code2 | Map Code1 onto Code2. |
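How such options could combine into a table is sketched below in Python. This is a model only: the registry TABLES, the function new_table and the tuple encoding of options (with ("map", Code1, Code2) standing in for Code1 = Code2) are hypothetical, the options are assumed to be applied in list order, and the two ISO-Latin-1 options are left out for brevity.

```python
# Hypothetical registry of named tables, so that "copy" has a source.
TABLES = {"exact": list(range(256))}


def new_table(name, options):
    """Create a table called `name` by applying `options` in order."""
    table = list(range(256))  # start from the exact mapping
    for option in options:
        # A bare string is an option without arguments.
        kind, *args = option if isinstance(option, tuple) else (option,)
        if kind == "case_insensitive":
            for code in range(ord("A"), ord("Z") + 1):
                table[code] = code + 32
        elif kind == "copy":
            table = list(TABLES[args[0]])  # copy, never share, the entries
        elif kind in ("tag", "ignore", "break"):
            for code in args[0]:
                table[code] = kind  # special code instead of an order number
        elif kind == "map":
            code1, code2 = args
            table[code1] = code2
        else:
            raise ValueError(f"unsupported option in this sketch: {kind}")
    TABLES[name] = table
    return table


dictionary = new_table("dictionary", [
    "case_insensitive",
    ("ignore", [ord("-")]),
    ("break", [ord(" "), ord("_")]),
    ("map", ord("y"), ord("i")),
])
tagged = new_table("tagged", [("copy", "dictionary"), ("tag", [ord("(")])])
```

Because copy duplicates the entries, adding "(" as a tag character to the second table leaves the first one unchanged.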
break, ignore or tag.
<, = or >.