|
We are expanding this suite of English text-processing tools and migrating it into web apps.
Their names (left side) link to functional summaries below.
Each tool uses a fixed codebase to process syntax, plus an optional semantic module
(right side). The core memory manager (yellow) maintains a set of application-specific
Topic Maps, which enable both new (green) modules:
|
English-like Syntax
|
Semantics via Metadata
|
|
TRANSCRIBER
|
Phrase Structure PARSER
- Grammar Rules
- Rule Application
- Syntax Structures
|
Paragraph INTERPRETER
- Interface to Context
- Queries on Semantics
- Chart Production
|
|
MODELER
|
Training via WORDS
- 1-to-1 Web Dialogs
- Editing of Lexicons
- Speech Act Scripts
|
Topics in CONTEXT
- Discourse Models
- Expected Concepts
- Semantic Constraints
|
|
SCANNER
|
Standard Root LEXICON
- Morphology
- Syntactic Features
- Multitoken Names
|
Case Frame THESAURUS
- Preposition Models
- Verb Complements
- Inglish Associations
|
TRANSCRIBER - analyzes text structures in high-speed memory
The PARSER (yours or ours)
structurally diagrams sentences and phrases.
To confirm or refine such analyses, it may use the other system modules to
find a model for each topic, then test the plausibility of the semantic
interactions those models imply.
Our INTERPRETER is a suite of support utilities for this kind of
confirmation and plausibility testing. It can operate as a standalone
graph-handling suite called AGENCY, which
includes a semantic search engine for triples, structured database
feeds, and rules that react to triple structures. Even without a
parser, the modules below can also pass it quasi-English inputs
expressed in situational
case frames.
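To make the idea of triple search and triple-driven rules concrete, here is a minimal Python sketch. It is not AGENCY's actual interface: the class TripleStore, its methods, and the sample "gives" rule are all hypothetical names chosen for illustration, and a triple is assumed to be a plain (subject, predicate, object) tuple of strings.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# A pattern has the same shape as a triple; None acts as a wildcard.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]
Action = Callable[[Triple], Iterable[Triple]]


def _matches(pattern: Pattern, triple: Triple) -> bool:
    """True when every non-wildcard slot of the pattern equals the triple's slot."""
    return all(p is None or p == v for p, v in zip(pattern, triple))


class TripleStore:
    """Toy in-memory graph (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.rules: List[Tuple[Pattern, Action]] = []

    def match(self, pattern: Pattern) -> List[Triple]:
        """Search: return all stored triples that fit the pattern, sorted."""
        return sorted(t for t in self.triples if _matches(pattern, t))

    def add_rule(self, pattern: Pattern, action: Action) -> None:
        """Register a rule that reacts to newly added triples fitting the pattern."""
        self.rules.append((pattern, action))

    def add(self, triple: Triple) -> None:
        """Store a triple, then fire each rule whose pattern it fits."""
        if triple in self.triples:
            return  # already known; avoids re-firing rules endlessly
        self.triples.add(triple)
        for pattern, action in self.rules:
            if _matches(pattern, triple):
                for derived in action(triple):
                    self.add(derived)


store = TripleStore()
# Assumed sample rule: whoever "gives" something is recorded as an agent.
store.add_rule((None, "gives", None), lambda t: [(t[0], "is_a", "agent")])
store.add(("Mary", "gives", "book"))
store.add(("book", "is_a", "object"))

agents = store.match((None, "is_a", "agent"))
about_mary = store.match(("Mary", None, None))
```

Here the rule fires as soon as the "gives" triple arrives, so the derived fact is available to later searches without any parser being involved.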
MODELER - manages topic models in a mid-term memory
This is a listener for WORDS, a topic-modeling language
that can express the cyclic nodes found in any syntax analysis. In standalone mode,
it can also interactively define or adjust custom vocabulary extensions.
Either way, CONTEXT models the result under
our ontology for Idealized English, which
covers a discourse model for handling anaphora, plus a powerful set of open-source
semantic constraint patterns that can formally associate or annotate topics to denote
whatever the WORDS expressions say about them.
SCANNER - finds needed lexical data in long-term memory files
Using a LEXICON, this utility maps
an English paragraph
(posted or read) into a stream of lexical data supporting linguistic analysis.
Each data token shows a root, any inflection, part-of-speech, and detailed binary
linguistic features supporting tests for complements, etc.
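The shape of such a token stream might be sketched in Python as follows. This is an assumption-laden illustration, not the SCANNER's real data format: the Token fields, the TOY_LEXICON entries, the feature names, and the scan function are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Token:
    """One lexical data token (hypothetical layout)."""
    surface: str                 # the word as it appeared in the text
    root: str                    # root (citation) form
    inflection: str              # e.g. "past", "plural", or "" if none
    pos: str                     # part of speech
    features: FrozenSet[str] = frozenset()  # binary features that are "on"


# Assumed toy lexicon: lowercase surface form -> (root, inflection, pos, features).
TOY_LEXICON: Dict[str, Tuple[str, str, str, FrozenSet[str]]] = {
    "mary": ("Mary", "", "noun", frozenset({"proper", "animate"})),
    "gave": ("give", "past", "verb", frozenset({"ditransitive"})),
    "books": ("book", "plural", "noun", frozenset({"count"})),
}


def scan(paragraph: str) -> List[Token]:
    """Map a paragraph into a stream of tokens, one per word."""
    stream: List[Token] = []
    for word in re.findall(r"[A-Za-z']+", paragraph):
        entry = TOY_LEXICON.get(word.lower())
        if entry is None:
            # Words missing from the lexicon are passed through as unknowns.
            stream.append(Token(word, word.lower(), "", "unknown"))
        else:
            root, inflection, pos, features = entry
            stream.append(Token(word, root, inflection, pos, features))
    return stream


stream = scan("Mary gave books.")
# A complement test can then be a simple feature check on the verb token.
takes_two_objects = "ditransitive" in stream[1].features
```

Keeping the features as a set of names means a test for complements reduces to a membership check, which is one simple way to represent binary features.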
The THESAURUS option adds to each token broad-coverage (surface) semantic models for
case frames of common verbs and prepositions, contextually ranked and filtered. To
provide for specific vocabulary expected in your own application, it also helps you
interactively build up extension lexicons from training texts.
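One way to picture contextual ranking and filtering of case frames is the Python sketch below. It is only a guess at the general idea, not the THESAURUS algorithm: the CaseFrame fields, the sample frames for "run", and the overlap-count scoring in rank_frames are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass
class CaseFrame:
    """A surface semantic model for one sense of a verb (hypothetical layout)."""
    verb: str
    sense: str
    roles: Dict[str, str]        # role name -> preposition marker ("" = direct)
    topics: FrozenSet[str]       # topics in which this sense is expected


# Assumed sample frames; not taken from any real thesaurus.
FRAMES: List[CaseFrame] = [
    CaseFrame("run", "move fast", {"agent": "", "path": "along"},
              frozenset({"sport", "race"})),
    CaseFrame("run", "operate", {"agent": "", "theme": "", "platform": "on"},
              frozenset({"machine", "software"})),
    CaseFrame("run", "manage", {"agent": "", "theme": ""},
              frozenset({"business", "software"})),
]


def rank_frames(verb: str, context_topics: FrozenSet[str],
                frames: List[CaseFrame], min_score: int = 1) -> List[CaseFrame]:
    """Rank a verb's frames by topic overlap with the context; drop weak ones."""
    scored = [(len(f.topics & context_topics), f) for f in frames if f.verb == verb]
    kept = [(score, f) for score, f in scored if score >= min_score]
    # Stable sort: frames with equal scores keep their original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in kept]


ranked = rank_frames("run", frozenset({"software", "machine"}), FRAMES)
senses = [f.sense for f in ranked]
```

In this toy context the "operate" sense shares two topics, "manage" shares one, and "move fast" shares none and is filtered out.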
|