uniname.pl -- Unicode character names

This library relates Unicode code points to their formal Unicode
character names (the Name property of UnicodeData.txt). It ships its
own compact UCD-derived table (about 360 KB) and is independent of
library(unicode) and library(unicode_security).
Algorithmic name ranges (Hangul syllables, CJK and Tangut ideographs and
the various PREFIX-<hex> families) are synthesised from the code
point and carry no per-code-point storage; the remaining ~34,600 names
are stored as a shared word table plus a packed token stream. See
etc/gen_uniname.pl in the package directory to regenerate the table
on a Unicode-version bump.
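
To illustrate what "synthesised from the code point" means for one of the algorithmic ranges, the following is a minimal sketch of Hangul syllable name generation as specified by the Unicode Standard (arithmetic decomposition into leading consonant, vowel and trailing consonant, then concatenation of the jamo short names). The predicate hangul_syllable_name/2 and the jamo_l/1, jamo_v/1 and jamo_t/1 tables are hypothetical names chosen for this example; they are not part of the library, and the library's actual implementation may differ.

```prolog
:- use_module(library(lists)).          % nth0/3

% hangul_syllable_name(+CodePoint, -Name) is semidet.
% Hypothetical helper, not exported by uniname.pl.
% Fails for code points outside U+AC00..U+D7A3.
hangul_syllable_name(CodePoint, Name) :-
    integer(CodePoint),
    SIndex is CodePoint - 0xAC00,       % offset into the syllable block
    SIndex >= 0,
    SIndex < 11172,                     % 19 * 21 * 28 syllables
    LIndex is SIndex // 588,            % 588 = 21 vowels * 28 trailing
    VIndex is (SIndex mod 588) // 28,
    TIndex is SIndex mod 28,
    jamo_l(Ls), nth0(LIndex, Ls, L),
    jamo_v(Vs), nth0(VIndex, Vs, V),
    jamo_t(Ts), nth0(TIndex, Ts, T),
    atomic_list_concat(['HANGUL SYLLABLE ', L, V, T], Name).

% Jamo short names: 19 leading consonants, 21 vowels,
% 28 trailing consonants (index 0 means "no trailing consonant").
jamo_l(['G','GG','N','D','DD','R','M','B','BB','S','SS','',
        'J','JJ','C','K','T','P','H']).
jamo_v(['A','AE','YA','YAE','EO','E','YEO','YE','O','WA','WAE',
        'OE','YO','U','WEO','WE','WI','YU','EU','YI','I']).
jamo_t(['','G','GG','GS','N','NJ','NH','D','L','LG','LM','LB',
        'LS','LT','LP','LH','M','B','BS','S','SS','NG','J','C',
        'K','T','P','H']).
```

With this sketch, the goal hangul_syllable_name(0xAC00, N) gives N = 'HANGUL SYLLABLE GA'. Because the name is computed from three small tables, none of the 11,172 syllable names needs its own table entry.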

unicode_name(?CodePoint:integer, ?Name:atom) is nondet

    - unicode_name(+CodePoint, -Name) is semidet: the name of
      CodePoint, failing when it has none (control, surrogate,
      private-use or unassigned code points).
    - unicode_name(-CodePoint, +Name) is semidet: the (unique)
      code point with the given name.
    - unicode_name(-CodePoint, -Name) is nondet: enumerate every
      named code point on backtracking.

    Name is an atom of the formal Unicode name in upper case, e.g.

        ?- unicode_name(0'A, N).
        N = 'LATIN CAPITAL LETTER A'.

        ?- unicode_name(C, 'EURO SIGN').
        C = 8364.

        ?- unicode_name(0xAC00, N).
        N = 'HANGUL SYLLABLE GA'.
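
Since the (+CodePoint, -Name) mode fails for code points without a name, callers that need a printable label for every code point have to supply a fallback. The sketch below shows one way to do that. It assumes the library can be loaded as library(uniname), which is inferred from the file name and not stated above; code_point_label/2 and hex_label/2 are hypothetical helpers, not part of the library.

```prolog
% Assumption: the module is loadable under this name.
:- use_module(library(uniname)).

% code_point_label(+CodePoint, -Label) is det.
% Hypothetical helper: the formal name if there is one,
% otherwise a 'U+XXXX' style label.
code_point_label(CodePoint, Label) :-
    (   unicode_name(CodePoint, Name)
    ->  Label = Name
    ;   hex_label(CodePoint, Label)
    ).

% hex_label(+CodePoint, -Label) is det.
% Upper-case hexadecimal, zero-padded to at least four digits.
hex_label(CodePoint, Label) :-
    format(atom(Hex), '~`0t~16R~4|', [CodePoint]),
    atom_concat('U+', Hex, Label).
```

Given the documented behaviour, code_point_label(0'A, L) should yield the formal name, while a control character such as 0x0A should fall through to the hexadecimal label.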