| We are expanding this suite of very fast English text "understanding" tools and migrating it into web apps.
Their names (left side) link to functional summaries below.
Each tool uses a fixed codebase for syntax, plus related
semantic extensions (right side). Our core memory manager
(yellow) uses a cohesive set of application-specific Topic Maps to define, guide, and document
both new (green) modules:
|
English-like Syntax
|
Semantics via Metadata
|
| TRANSCRIBER |
Phrase Structure PARSER
- Grammar Rules
- Rule Application
- Syntax Structures
|
Case Frame THESAURUS
- Preposition Models
- Verb Complements
- Inglish Associations
|
| MODELER |
Agent Training in WORDS
- 1-to-1 Web Dialogs
- Editing of Lexicons
- Speech Act Scripts
|
Topics in CONTEXT
- Discourse Models
- Semantic Constraints
- Chart Production
|
| LEXICON |
Tokenizing SCANNER
- Lexical Morphology
- Syntactic Features
- Multitoken Names
|
Token INTERPRETER
- Expected Concepts
- Driven by a Context
- Sets Token Meaning
|
TRANSCRIBER - analyzes text structures in high-speed memory
The PARSER (yours or ours) diagrams the structure of sentences and phrases.
Ours is fast (semi-deterministic) and comes with a decent grammar, packaged to find
clause structures unambiguously. Overall parsing complexity rises
quickly if your parser demands plausibility tests of implied semantic
interactions; such tests can usually only prune the set of "final" situation
models emitted by our multi-stage analysis.
Our THESAURUS option lets SYNTAX structures of interest be mapped from such
surface semantics into DISCOURSE models that track anaphora. Logic processes
using Common Logic or OWL can use such data to model what was said in a text.
Warning: modeling what was said may cost 50 times more than finding what
was discussed, which the simpler tools below can do.
MODELER - manages topic models within a mid-term memory
This is a listener for WORDS, a topic-modeling language
that can express the cyclic nodes found in any syntax analysis. Used in
standalone mode, it can also interactively define the custom vocabulary
extensions your domain needs.
Either way, CONTEXT models the result per our ontology for Idealized English.
That ontology covers a discourse model for handling anaphora, plus a very powerful
set of open-source semantic constraint patterns that can formally associate or
annotate topics denoting whatever subjects the WORDS expressions have
declared.
LEXICON - holds needed lexical data within long-term memory files
Our SCANNER utility maps an English paragraph (posted or read) into a stream
of world-class lexical data supporting linguistic analysis.
Each token shows its root, its inflection if any, its part of speech, and
powerful binary linguistic features that can guide a parser to handle
each word's expected complement patterns properly. Under "MEANS",
it assigns Roget's category options as candidate meanings.
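As a rough illustration of the per-token record described above, the sketch below models one token in Python. The class name `Token`, its field names, the sample feature flags, and the sample category labels are all assumptions made for illustration; they are not the SCANNER's actual output format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical per-token record; field names are illustrative only."""
    text: str                 # surface form as it appeared in the paragraph
    root: str                 # root form of the word
    inflection: str | None    # e.g. "past" or "plural"; None if uninflected
    pos: str                  # part of speech
    # Binary linguistic features, keyed by an invented feature name.
    features: dict[str, bool] = field(default_factory=dict)
    # Candidate meanings, as category labels (the "MEANS" entry).
    means: list[str] = field(default_factory=list)

    def is_ambiguous(self) -> bool:
        # More than one candidate meaning still needs resolving downstream.
        return len(self.means) > 1


# A hand-built example; the values are invented, not real SCANNER output.
gave = Token(
    text="gave",
    root="give",
    inflection="past",
    pos="verb",
    features={"transitive": True, "takes_indirect_object": True},
    means=["Giving", "Transfer"],
)
```

A record like this keeps every candidate meaning attached to the token, so a later stage can narrow the list without rescanning the text.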
Our INTERPRETER culls lexical ambiguity from inputs very early using
contextual expectations, optionally aided by curation. Our
standalone Agency framework for semantic text analysis, search, summarization,
alerting, and word-learning uses this as its core. Any other NLU options above,
if added, therefore run deterministically, boosting accuracy and functionality
without degrading net analysis speed.
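The idea of culling ambiguity with contextual expectations can be sketched as a simple filter: keep only the candidate meanings that the current context expects, and leave the token unresolved when none match. The function name `interpret`, the set-based context, and the sample category labels are assumptions for illustration, not the INTERPRETER's actual algorithm.

```python
from __future__ import annotations


def interpret(candidates: list[str], expected: set[str]) -> list[str]:
    """Keep the candidate meanings that the context expects.

    Hypothetical sketch: if the context expects none of the candidates,
    the ambiguity is left unresolved and all candidates are returned.
    """
    kept = [meaning for meaning in candidates if meaning in expected]
    return kept if kept else list(candidates)


# Invented example: a word with two candidate categories, read in a
# context that expects finance-related concepts.
candidates = ["Money", "Land"]
finance_context = {"Money", "Purchase", "Debt"}
resolved = interpret(candidates, finance_context)
print(resolved)  # ['Money']
```

Resolving meanings this early means later stages see a single reading per token where the context allows it, which is one plausible way to keep downstream analysis deterministic.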
|