Lexikon Standards
Populating a namespace per Specs on the PSAID is fairly easy, but for advanced features, one also needs posted namespace specs, here called the Public Namespace Documentation (PND).
Increasing the power of PSAID vocabularies seems to make increased
demands on PND pages. This page summarizes the rules as they now
appear. For more detail, please contact Lexikos Corporation, who
will try to co-ordinate these standards pending passage to an agreeable
public group.
Compact Sense and Image Codes
Chronological sense-numbers can dramatically shorten SAID strings. The
time tag and all following characters get replaced by a digit unique to the
namespace. This makes the SAID easier to read, pronounce, remember and type.
The Image code is similar to a sense-number, but adds one extra digit
(or more if the PND so states). An image code idiomatically names a blank node
for a typed but unnamed individual by extending the sense-code which precedes it.
This helps embed prototypes into a namespace, for example.
So that users can recover the missing SAID data, the PND must cite a public URL
from which a CSV text file holding it can be downloaded. This file is basically a sense-registration log, holding all normal characters for each namespace SAID, comma-separated into time tag, shape, axioms and "extension facets".
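As a rough illustration of how such a log might be read, here is a minimal Python sketch. The column order follows the description above (time tag, shape, axioms, extension facets). Everything else is an assumption made for the example: the sample values are invented, the function name `load_sense_log` is hypothetical, and the mapping of a sense-number to a row's 1-based chronological position in the log is a guess that a real PND would have to confirm or correct.

```python
import csv
import io

# Invented sample log. Only the column order comes from the text:
# time tag, shape, axioms, extension facets.
SAMPLE_LOG = """\
20050314,A,x1y2,f1.f2
20050602,B,x3y4,
20060120,A,x5y6,f3
"""

FIELDS = ("time_tag", "shape", "axioms", "extension_facets")


def load_sense_log(text):
    """Map assumed sense-numbers (1, 2, ...) to the fields of each log row.

    Assumption: a sense-number is the row's 1-based position in the log.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    return {n: dict(zip(FIELDS, row)) for n, row in enumerate(rows, start=1)}


log = load_sense_log(SAMPLE_LOG)
# Recover the characters that the compact sense-number 2 stands for.
print(log[2])
```

A real client would fetch the text from the URL cited in the PND rather than from an inline string.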
Extension Facets
Each of these is a short alphanumeric string denoting extra
facets modeling the subject of its SAID. These follow after the
upper ontology axioms, and stay separated from them and one another by
periods. Two broad types of extension facet are defined so
far. If a namespace adopts them, its PND must cite their public specs and
give the relative order of the facets (so they can be unpacked properly).
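The unpacking step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the facet names in `FACET_ORDER` and the sample values are invented, the function `unpack_facets` is hypothetical, and the sketch assumes that trailing facets may be omitted, which the PND of a real namespace would need to state.

```python
def unpack_facets(facet_string, facet_order):
    """Split a period-separated facet string and label each value.

    facet_order is the relative order of facets, as a PND would publish it.
    """
    values = facet_string.split(".") if facet_string else []
    if len(values) > len(facet_order):
        raise ValueError("more facets present than the PND order defines")
    # Assumption: trailing facets may be absent, so zip stops early.
    return dict(zip(facet_order, values))


# Hypothetical facet order for some namespace (names are invented).
FACET_ORDER = ("substance_code", "roget_category")

print(unpack_facets("C417.R532n", FACET_ORDER))
print(unpack_facets("C417", FACET_ORDER))
```

Without the published order, the two strings between the periods could not be told apart, which is why the PND has to state it.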
Axiom-specific facets - Many SAIDs may need these because the general axioms are so general.
They work fine for basic discriminants (the first 10 Questions in a
game of 20), but extensions that add semantic detail may still be needed. A namespace that focuses on
"physical substances", for example, might add to each such SAID one of 27 million CAS Registry
codes to better identify its subject. For more depth in
its "situation" SAIDs, it might instead tag on some extension facet from the Lexikos Situation ontology.
Universal facets - Very broad extra discriminants are really hard to identify, but other UOs do exist whose IDs may help. The Lexikos Scanner, for example, uses a core-English
lexikon holding several thousand of the most common roots, hand-tagged (see MEANS) with the 1,000 or so category codes in Roget's Fourth International Thesaurus.
In a PSAID facet, these can greatly refine the Realm of most content terms in a paragraph.
This data works best if bundled with part-of-speech, so
a character for that gets included.
In practice, such extensions could go into totally separate namespaces, so users can mix in or
ignore them at application levels. Such freedom carries high integration
costs, however, and extra complexity. A namespace author may eliminate these for the user
community by pre-bundling such descriptors as extended SAIDs at assembly time, when the merging is relatively easy to arrange.
Namespace Web Services
The namespace in a legal PSAID must resolve to specs, but lexikon publishers
may comply in several ways. One is to offer links to multiple pages, each focused on SAIDs with specific
signatures of axioms. If some of those pages are dynamic (executable), they can DO things for their related SAIDs, such as:
Translation - This downloads the axioms and
extension facets of any SAID or sense code in
another formal language, such as WORDS, LTM or OWL. It is just a
variation on the CSV file cited earlier, but it can help end-users (or
their scripts) convert to some language another tool demands. For WORDS
and LTM at least, Lexikos PSIs now perform such a service
on request for each of the PSIDs in CTM, our Case Frame Thesaurus. OWL Abstract Syntax is also planned.
Variations - Via extra arguments, a user may request content changes
on such translations. An example is a WORDS model
for some verb-like SAID, modified by extra arguments which define its
mood, tense and complements in some English clause.
The return would model the contextual (argument-influenced) clause meaning, not
the (baseline) verb sense.
In the biochemical realm, a similar variation might be the OWL model of
some interaction SAID, modified by the contextual presence
of some catalyst(s) or the atypical ionization in some participant(s).
Expansion - Both types of services could work by adjusting JSP templates. Expansions could work similarly,
but would graft in additional content
present in neither the SAID signature nor the extra arguments.
Here, such input strings get used as keys into a database (or other web
services) which can return virtually any kind of data, in any format or amount the
namespace authority is willing and able to provide. For a SAID with a
CAS code, for example, they might return OWL models of molecular
structure or lists of vendors who might supply the substance.
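The keyed lookup behind an expansion service might look like the following Python sketch. It is not a description of any Lexikos service: the in-memory dictionary `EXPANSION_DB` stands in for whatever database or web service a namespace authority runs, the function `expand`, the facet name `cas_code` and the `fields` argument are hypothetical, and the payload is a placeholder. The one real datum is 7732-18-5, the CAS Registry Number for water.

```python
# Stand-in for the namespace authority's database (contents are placeholders).
EXPANSION_DB = {
    "7732-18-5": {  # CAS Registry Number for water
        "label": "water",
        "vendors": ["(placeholder vendor)"],
    },
}


def expand(facets, extra_args=None):
    """Use a facet of the SAID as a key and return the stored extra content.

    facets: unpacked extension facets of one SAID (hypothetical names).
    extra_args: optional request arguments; 'fields' narrows the result.
    """
    record = EXPANSION_DB.get(facets.get("cas_code"))
    if record is None:
        return None  # nothing on file for this SAID
    wanted = (extra_args or {}).get("fields")
    if wanted:
        return {k: v for k, v in record.items() if k in wanted}
    return dict(record)


print(expand({"cas_code": "7732-18-5"}, {"fields": ["label"]}))
```

The returned content comes from the store, not from the SAID or the arguments, which is what separates an expansion from a translation or a variation.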
These examples may remind you of web
services generally. The main difference, perhaps, is that today most such services get arguments which are semantically
barren, typically just numbers and strings. Under Lexikon
Standards, namespace services get SAID arguments which
(to their publishers at least) denote highly robust, faceted descriptions built by local experts. That can make a large difference in the behavior
and output of services, even if the programming languages in which those services are written do not change.
Semantics arise in a system's vocabulary, not in its reasoning abilities.