What's in CONTEXT?

MODELER is a framework designed to efficiently transliterate English grammar forms into isomorphic semantic nets, whose meaning is formalized by the ISO-standard Topic Map paradigm.  This conversion is usually done in bulk, with operator guidance, through an interactive UI resembling that of a spelling corrector.

An active topic map called CONTEXT records the created semantic net using topics and associations, driven by our NLP utilities for discourse models and self-expanding semantic lexicons.  Input paragraphs are in everyday English, except for two non-trivial requirements, the first of which will be eased in version 2.0:
  1. Grammar forms are parenthesized.  (Our parser will later automate this.)
  2. Each word or idiom used must be defined by a lexicon in CONTEXT.
CONTEXT is configured by our fixed base of scripted CONCEPT topics, plus those in your personal extension lexicons.  The combination defines a flexible ontology letting MODELER map any legal WORDS expression into new Topic characteristics.  Each expression is a character string - one of two kinds:
  • TERM - any related spelling (root, inflection) for a CONCEPT.   A hashmap of these spellings indexes all loaded lexicons, and it lets MODELER find the expected CONCEPTS for any spelling in its domain-specific vocabulary:
    • Entering an isolated TERM returns a list of all related CONCEPTS
    • Within a FORM, however, exactly one CONCEPT is picked (by the operator, if necessary)

  • FORM - a grammatical list of TERMs and sub-FORMs in parentheses.  Operators and scripts both use FORMS to access CONTEXT.  Every English grammar structure is defined, possibly in a simplified version.  Each FORM type in WORDS simulates its equivalent in English:
    • Noun phrases (NPs) find or build a topic for each subject cited
    • Modifying phrases adjust characteristics of these NP topics
    • Sentences associate the NP topics using case frame templates
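
As a minimal sketch of the two expression kinds above, the Python below builds a hashmap from every spelling to its candidate CONCEPTs and parses a parenthesized FORM into nested lists. The lexicon contents, the CONCEPT labels, and the function names (build_term_index, lookup_term, parse_form) are assumptions made for illustration; they are not MODELER's actual data or API, and multi-word idioms are not handled.

```python
from collections import defaultdict

# Hypothetical lexicon: CONCEPT label -> related spellings (root, inflections).
LEXICON = {
    "DOG/noun": ["dog", "dogs"],
    "DOG/verb": ["dog", "dogs", "dogged", "dogging"],
    "BARK/verb": ["bark", "barks", "barked", "barking"],
}


def build_term_index(lexicon):
    """Index every spelling to the list of CONCEPTs it may denote."""
    index = defaultdict(list)
    for concept, spellings in lexicon.items():
        for spelling in spellings:
            index[spelling.lower()].append(concept)
    return dict(index)


def lookup_term(index, term):
    """Return all candidate CONCEPTs for an isolated TERM (empty if unknown)."""
    return index.get(term.lower(), [])


def parse_form(text):
    """Parse a parenthesized FORM into nested lists of TERM strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[0] collects the top-level items
    for token in tokens:
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')' in FORM")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '(' in FORM")
    return stack[0]


if __name__ == "__main__":
    index = build_term_index(LEXICON)
    print(lookup_term(index, "dog"))         # an isolated TERM: every candidate
    print(parse_form("((the dog) barked)"))  # a sentence FORM holding an NP
```

In this sketch an isolated TERM returns every candidate, while choosing a single CONCEPT inside a FORM is left to a later step, such as the operator prompt described above.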
The topics in CONTEXT hold WORDS scripts that make MODELER flexible and semi-intelligent.   Used as needed, they help it find or build the one contextual topic which best represents the referent of its input expression under our simulated rules of discourse.  Two kinds of topic may qualify:
  • CONCEPT - models a TERM sense using part-of-speech codes from our PSI-backed Lexicon, plus semantic codes which are the 1,000+ categories in Roget's Thesaurus.   Combined, these symbols can uniquely identify any common English word sense:
    • All other characteristics must extend these unique semantics
    • So too must those of all IMAGES it dynamically instantiates
    • A CONCEPT author ensures this by adding scripts and other data

  • IMAGE - models a FORM meaning by building a structure of Topics.  The CONCEPT for its head word guides this process as a dynamic template - like a kind of scripted blueprint.  Each paragraph of such IMAGES is then charted into XTM or RDF streams, meant for use downstream by applications, inference engines or intelligent agents:
    • In any format, a WORDS semantic net should seem sensible
    • If it does not, some CONCEPT is malfunctioning and needs repair
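
The idea that a part-of-speech code combined with a Roget category identifies a word sense can be sketched as a composite key. Everything named below (SenseKey, Concept, ConceptRegistry, the "N" code and the category numbers) is hypothetical and chosen only for illustration; the numbers are placeholders, not real Roget categories, and this is not MODELER's actual representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SenseKey:
    """Hypothetical composite key for one word sense."""
    term: str    # root spelling
    pos: str     # part-of-speech code (placeholder values)
    roget: int   # Roget category number (placeholder values)


@dataclass
class Concept:
    """A CONCEPT topic: a unique sense key plus author-supplied scripts."""
    key: SenseKey
    scripts: list = field(default_factory=list)


class ConceptRegistry:
    """Stores CONCEPTs and rejects two that claim the same sense key."""

    def __init__(self):
        self._by_key = {}

    def add(self, concept):
        if concept.key in self._by_key:
            raise ValueError(f"duplicate sense: {concept.key}")
        self._by_key[concept.key] = concept

    def senses(self, term):
        """Return every registered CONCEPT for a root spelling."""
        return [c for k, c in self._by_key.items() if k.term == term]


if __name__ == "__main__":
    registry = ConceptRegistry()
    # Two senses of "bank" share a part of speech but differ in category.
    registry.add(Concept(SenseKey("bank", "N", 101)))
    registry.add(Concept(SenseKey("bank", "N", 202)))
    print(len(registry.senses("bank")))
```

Rejecting duplicate keys mirrors the requirement that each CONCEPT's other characteristics extend a unique set of semantics rather than redefine an existing sense.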
Aided by public groups working on case frames and conceptual graphs, Lexikos will adjust MODELER's fixed base of CONCEPTS and help its users add new ones, for private use or open-source publication.
 
MODELER's world knowledge, linguistic range and ease of use should improve steadily as these refinements occur, but at all points the same simple goal applies: the subject of each IMAGE topic, under the Topic Map paradigm, should be the same as the intuitive referent, to an English reader, of the grammar FORM that was interpreted to build it.

If this goal is met, then MODELER will understand the FORM as a human would. That is the formal Q/A test we want it to pass, as often as possible.



Lexikos Corporation
Boston & Knoxville
Email: Dan@Lexikos.com