CASE FRAME THEORY
In 1990, I wrote a design paper surveying NLP literature
on case frame role types. I have since lost its electronic format, but these scans of the hard copy show its pages in
order. The content is still highly relevant:
- Roleplay - Its Introduction - read this first, for terminology, etc.
- Case Role Types, '90-'04 - recently added notes
- Case Frame Type Models - recently added notes
- Case Frame Instances - recently added notes
Case frames "work" by recasting the contextual semantic content of an
English paragraph into associations. Using standard case role
types is essential. As fixed reference points, they let
inference-engine rule sets be written that can reason about the
SITUATION and topics depicted by each clause.
Page [2] is pretty dense. It shows the Lexikos type hierarchy of case
frame role types, which I assembled by just merging public surveys of
what other NLP shops were using. A similar list from John Sowa, called
Thematic Roles, forms the basis of his Conceptual Graphs paradigm.
The red handwriting adds my recent notes on how CG case role types would
fit into the Lexikos schema. Most are obvious. A half dozen cited at the
bottom of the page took some extra analysis, but eventually fell into place.
We are now working to expand a version formalized for Topic Maps.
Under any schema, the 2-3 dozen role types act as "semantic
primitives". Each one names and semantically models a substring of words expected in a
clause. These substrings are called verb complements and denote topics. Syntactically, most look like noun phrases, and are often marked by leading prepositions.
Each
verb sense requires only 2-4 complements, much like arguments in method
calls. Others are optional, and need not be present in surface
text. Each case included puts fixed semantic constraints on its
topic; and each verb sense can add more, written by using the case role
types as variable names.
Net effect - before words denoting any specific topic get examined, the parsing system ALREADY KNOWS a lot about it, simply by tapping semantic constraints - which ultimately come from linguistic universals (role types) and the verb models (association types) stored within its vocabulary.
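The way a verb sense constrains its complements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the role names (AGENT, THEME, RECIPIENT, TIME, LOCATION), the "give" template, and the topic types "animate" and "object" are hypothetical and are not taken from the Lexikos schema or the scanned pages.

```python
# Hypothetical verb-sense template: role names and topic types are invented
# for illustration and do not come from the Lexikos schema.
GIVE = {
    "name": "give",
    "required": ("AGENT", "THEME", "RECIPIENT"),   # must appear in the clause
    "optional": ("TIME", "LOCATION"),              # may be absent from surface text
    "constraints": {"AGENT": "animate", "RECIPIENT": "animate"},
}

def check_clause(sense, fillers):
    """Return a list of problems found when mapping fillers onto a verb sense.

    `fillers` maps a case role name to a (topic, topic_type) pair.
    """
    problems = []
    for role in sense["required"]:
        if role not in fillers:
            problems.append(f"missing required role {role}")
    for role, (topic, topic_type) in fillers.items():
        if role not in sense["required"] and role not in sense["optional"]:
            problems.append(f"unexpected role {role}")
        expected = sense["constraints"].get(role)
        if expected is not None and topic_type != expected:
            problems.append(f"{role} should be {expected}, got {topic_type}")
    return problems

ok = check_clause(GIVE, {
    "AGENT": ("Mary", "animate"),
    "THEME": ("book", "object"),
    "RECIPIENT": ("John", "animate"),
})
bad = check_clause(GIVE, {
    "AGENT": ("rock", "object"),
    "THEME": ("book", "object"),
})
```

Here `ok` is an empty list, while `bad` reports both the missing RECIPIENT and the AGENT that violates its constraint. The point is that the template already says a good deal about each topic before any specific words are examined.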
Page [3] begins with a few partial verb templates
in a frame-like notation.
They would model expected association types, and could be cast as
"definitions" of relationships in a semantic lexikon. A
fair number of these may be needed to define the working vocabulary of
any particular application system.
Notice how MENTAL SITUATION and PHYSICAL DOING EVENT have
two very different signatures of case roles. That difference shows up
directly in English paragraphs. This effect is just what
PropBank
and similar NLP groups are trying to formally document in bulk,
for all verb and preposition senses in our language.
Because they are numerous, clever use of type trees
and data inheritance is actually *required* to
make this job tractable - and it's not nearly done yet. But once it is, the patterns
for all legal English clauses will be cataloged. Until
then, smaller, domain-specific subsets can be quickly built by
using Lextender.
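To see why type trees and inheritance keep the catalog manageable, here is a minimal Python sketch. MENTAL SITUATION and PHYSICAL DOING EVENT are named on page [3], but the parent type SITUATION as their root, the child type CUTTING EVENT, and every role list below are assumptions made up for this example.

```python
# Hypothetical type tree: each entry names its parent and the roles it adds.
# The role lists are illustrative guesses, not the signatures on page [3].
FRAME_TYPES = {
    "SITUATION":            {"parent": None, "roles": ["TIME", "LOCATION"]},
    "MENTAL SITUATION":     {"parent": "SITUATION", "roles": ["EXPERIENCER", "CONTENT"]},
    "PHYSICAL DOING EVENT": {"parent": "SITUATION", "roles": ["AGENT", "PATIENT", "INSTRUMENT"]},
    "CUTTING EVENT":        {"parent": "PHYSICAL DOING EVENT", "roles": []},
}

def role_signature(frame_type):
    """Collect roles from the root of the type tree down to frame_type."""
    chain = []
    while frame_type is not None:
        chain.append(frame_type)
        frame_type = FRAME_TYPES[frame_type]["parent"]
    signature = []
    for name in reversed(chain):          # root first, most specific type last
        for role in FRAME_TYPES[name]["roles"]:
            if role not in signature:
                signature.append(role)
    return signature

mental = role_signature("MENTAL SITUATION")
cutting = role_signature("CUTTING EVENT")
```

A new verb sense such as the hypothetical CUTTING EVENT needs no role list of its own; it inherits the whole signature of its parent, so only the differences have to be written down.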
Each role of each template on page [3] would get
domain-specific constraints (not shown here) on the topics expected to
fill roles
in any matching English clause.
Reasoning code can then exploit those constraints, plus the actual
words used, to infer new facts on each topic - and what the whole
clause means.
Starting at the bottom of [3] are English clauses that illustrate this. Nearby are instances of matching SITUATIONs.
Under case frame theory, each merely restates the clause to "say the
same thing" in a more formal notation. Only the clause representation
has changed - neither its topics nor its meaning.
At the bottom of [4] is the payoff from all the above - models of
the topics filling those case frame roles, which have been automatically
accumulating new data from the inverse slots in the three frames above,
to which each links.
Note how each topic is modeled by its
own type, but also by the roles it plays in various SITUATION instances, which gradually build up as external clauses
get mapped into the internal case frame semantic net. Note too that each model is not only domain-specific, but also specific to the current discourse context.
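The accumulation of topic data through inverse slots can be sketched roughly as follows. The class name `SemanticNet`, its methods, and the sample topics and roles are all hypothetical; this is one possible arrangement, not the notation used on page [4].

```python
from collections import defaultdict

class SemanticNet:
    """Toy store of SITUATION instances plus inverse links from each topic."""

    def __init__(self):
        self.situations = {}                    # situation id -> frame
        self.topic_types = {}                   # topic -> its own type
        self.roles_played = defaultdict(list)   # topic -> [(role, situation id)]

    def add_topic(self, topic, topic_type):
        self.topic_types[topic] = topic_type

    def add_situation(self, sid, frame_type, fillers):
        self.situations[sid] = {"type": frame_type, "roles": dict(fillers)}
        for role, topic in fillers.items():
            # Inverse slot: the topic records which role it plays, and where.
            self.roles_played[topic].append((role, sid))

    def topic_model(self, topic):
        return {
            "type": self.topic_types.get(topic),
            "plays": list(self.roles_played.get(topic, [])),
        }

net = SemanticNet()
net.add_topic("Mary", "PERSON")
net.add_situation("s1", "PHYSICAL DOING EVENT", {"AGENT": "Mary", "PATIENT": "door"})
net.add_situation("s2", "MENTAL SITUATION", {"EXPERIENCER": "Mary", "CONTENT": "s1"})
mary = net.topic_model("Mary")
```

After the two clauses are mapped in, `mary` holds the topic's own type together with the AGENT and EXPERIENCER roles it plays in `s1` and `s2`, and it would keep growing as further clauses arrive.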
Page [4] shows roughly how NL understanding works. Other ways
to explain it exist, but even non-technical people can often follow this
one. But the story is not over, and in fact it has a significant final
chapter:
how can case frames help the hearer react to such NL inputs, after they are put into canonical forms?
One way is to let the hearer answer queries about which topics were
talked about, and what was said about them. Here, standard case
roles help a lot, by making the queries easier to write.
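As a rough illustration of why standard role names make queries easy to write, consider this Python sketch. The frame layout, the functions `query` and `mentions`, and the sample GIVE and SEE frames are assumptions for the example only; a real Topic Map engine would have its own query language.

```python
# Hypothetical frame instances; the types and roles are invented.
frames = [
    {"type": "GIVE", "roles": {"AGENT": "Mary", "THEME": "book", "RECIPIENT": "John"}},
    {"type": "SEE",  "roles": {"EXPERIENCER": "John", "THEME": "Mary"}},
    {"type": "GIVE", "roles": {"AGENT": "John", "THEME": "pen", "RECIPIENT": "Mary"}},
]

def query(frames, frame_type=None, **role_patterns):
    """Return frames whose type and role fillers match the given pattern."""
    hits = []
    for frame in frames:
        if frame_type is not None and frame["type"] != frame_type:
            continue
        if all(frame["roles"].get(role) == topic
               for role, topic in role_patterns.items()):
            hits.append(frame)
    return hits

def mentions(frames, topic):
    """List (frame type, role) pairs for every role the topic fills."""
    return [(frame["type"], role)
            for frame in frames
            for role, filler in frame["roles"].items()
            if filler == topic]

given_by_mary = query(frames, frame_type="GIVE", AGENT="Mary")
about_mary = mentions(frames, "Mary")
```

Because every frame uses the same role vocabulary, "what did Mary give?" and "what was said about Mary?" are each a single call, with no verb-specific code.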
But first, we must import the semantic net for - let us say - some page
of text - into a suitable query system. This takes a chart.
Using MODELER to store topics, roles and
associations lets paragraphs be exported in an XTM chart and passed to
any of several Topic Map storage "engines" that can store and query the
semantic content - which is the same as the input page's. Some TM engines
can also export a chart in RDF, thus adding the page to the Semantic Web.
One can also export a chart formatted for a business rules engine,
which it can import as bulk facts (AKA assertions). The semantic net so loaded will
then be processed by logic modules of IF-THEN rules, each of which is
written to match the case roles in the frame instances being loaded,
and/or the type of the frame itself.
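The idea of IF-THEN rules that match on frame type and case roles can be sketched in plain Python. This is not how a business rules engine is implemented or configured; the function `run_rules`, the rule `giving_implies_having`, and the GIVE and HAVE frames with their roles are hypothetical stand-ins.

```python
def run_rules(facts, rules):
    """Apply each rule to every fact until no rule adds anything new."""
    facts = list(facts)
    changed = True
    while changed:
        changed = False
        for fact in list(facts):            # iterate over a snapshot
            for rule in rules:
                for new_fact in rule(fact):
                    if new_fact not in facts:
                        facts.append(new_fact)   # inferred facts become inputs
                        changed = True
    return facts

def giving_implies_having(fact):
    # IF a GIVE frame has a RECIPIENT and a THEME,
    # THEN assert that the recipient HAS the theme.
    roles = fact["roles"]
    if fact["type"] == "GIVE" and "RECIPIENT" in roles and "THEME" in roles:
        yield {"type": "HAVE",
               "roles": {"OWNER": roles["RECIPIENT"], "THEME": roles["THEME"]}}

loaded = [{"type": "GIVE",
           "roles": {"AGENT": "Mary", "THEME": "book", "RECIPIENT": "John"}}]
knowledge = run_rules(loaded, [giving_implies_having])
```

The rule mentions only the frame type and role names, so it fires for any GIVE clause regardless of which topics fill the roles.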
Example: to import such SITUATION and topic models into
this free Jess download,
the format shown on page [4] is very close to what is required - frames in
a LISP-like ASCII notation that makes all the case role names very prominent.
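For a feel of what such a notation might look like, here is a small Python sketch that prints a frame as a parenthesized list with the role names leading each slot. The function `to_sexpr` and the output layout are assumptions; they have not been checked against the format on page [4] or against what Jess actually accepts.

```python
def to_sexpr(frame):
    """Render a frame as a one-line LISP-like string with role names up front."""
    def atom(text):
        # Keep multi-word names as a single token.
        return str(text).replace(" ", "-")

    slots = " ".join(f"({atom(role)} {atom(topic)})"
                     for role, topic in frame["roles"].items())
    head = atom(frame["type"])
    return f"({head} {slots})" if slots else f"({head})"

# Hypothetical frame instance; roles and topics are invented.
frame = {"type": "PHYSICAL DOING EVENT",
         "roles": {"AGENT": "Mary", "PATIENT": "door"}}
text = to_sexpr(frame)
# text == "(PHYSICAL-DOING-EVENT (AGENT Mary) (PATIENT door))"
```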
Each rule set would be application
specific. Each match triggers a script that could then use each
clause case frame as a whole to do many useful sorts of things - like
infer new implied facts
(inferences) and add them to its knowledge base as if they were new
inputs; or query the TM storage engine; or drive the system's UI; or
expand some "stream-of-consciousness" debugging log.
But it can also call any external application that English
inputs might usefully drive - such as robotic or equipment controls; or
the indexer for an image-storage jukebox; or perhaps a CRM application
in your customer support center.
The specific rules used must depend on what intelligent agents
and behaviors you want to create, and - of course - what paragraphs you
have available to drive them. Case Frame theory cannot help, as it
only covers NL understanding in the UI module.
The input text, and the back end scripting, remain up to you.
Please contact Dan Corwin with any questions or comments.