What is a PSI-lexikon?

A PSI-lexikon is a formally structured text file. Lexikos supports it as a useful module of named definitions, convenient for expanding ontologies, thesauri, gazetteers, or any vocabulary of semantic web metadata.

Each such file holds human-readable comments on its purpose and change history, followed by a sequentially numbered list of PSI-trees.

As metadata modules, PSI-lexikons are expressive, compact, flexible, easily read by people and software, and easy to create or edit. Some specific practical benefits of using them are explained in more depth below.

BENEFIT 1: PSI PAGES

A legal PSI-lexikon can be neatly displayed on demand as a web page of standards-compliant PSIs. A simple web publication process for specs or usage is thus one practical benefit of using PSI-lexikons.

Each PSI-lexikon displays under the logic of a JSP, enhanced by a PSI-lexikon Java package. It reads and parses the PSI-lexikon file, which embeds metadata definitions flexibly into a list of explicitly typed, related PSI-tree templates.

The formatting rules add just enough structure to let the JSP easily control web-display services for the PSI-trees. Using scriptlets, it requests both the services and the data to embed in HTML fragments. Everything stays modular, flexible, and easy to adapt for special project needs.

Any number of PSI-lexikons can be combined into an ontology, so some services focus on hyperlinks. They can link one PSI-tree display to another, even across web page boundaries. Such links make PSI documentation simpler to write, and let browser-based readers more easily follow it.

BENEFIT 2: LTM, XTM, RDF

A greater benefit arises from embedded PSI-trees that authors can treat like templates for complex structures of topics, occurrences, and/or associations. Since they share a common source definition, such structures always emerge consistently with the PSI-lexikon's latest public PSI page.

Each typed template works like a macro. The Java support package expands it into LTM, and adds proper PSIs for merging control. If imported into a TM engine, the native powers of that engine will convert it further into ISO-standard XTM, or (in some cases) into equivalent RDF.
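
As a rough illustration of the macro idea, the sketch below expands one topic-type template into LTM. The class and method names, the PSI base address, and the choice of a `superclass-subclass` association to record subtypes are all assumptions made for this example; none of them comes from the actual Lexikos package.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: expands a topic-type template into LTM text. */
final class TopicTypeExpander {

    // Assumed PSI base; a real lexikon would supply its own published address.
    static final String PSI_BASE = "http://psi.example.org/demo/#";

    /** Emits one LTM topic line: [id = "name" @"subject indicator"]. */
    static String topic(String id, String name) {
        return "[" + id + " = \"" + name + "\" @\"" + PSI_BASE + id + "\"]";
    }

    /** Expands a type and its immediate subtypes into LTM topics and associations. */
    static String expand(String typeId, String typeName, Map<String, String> subtypeNamesById) {
        StringBuilder ltm = new StringBuilder();
        ltm.append(topic(typeId, typeName)).append('\n');
        for (Map.Entry<String, String> sub : subtypeNamesById.entrySet()) {
            ltm.append(topic(sub.getKey(), sub.getValue())).append('\n');
            // Assumed convention: each subtype link becomes one association.
            ltm.append("superclass-subclass(")
               .append(typeId).append(" : superclass, ")
               .append(sub.getKey()).append(" : subclass)\n");
        }
        return ltm.toString();
    }

    public static void main(String[] args) {
        Map<String, String> subtypes = new LinkedHashMap<>();
        subtypes.put("car", "Car");
        subtypes.put("truck", "Truck");
        System.out.print(expand("vehicle", "Vehicle", subtypes));
    }
}
```

Each subtype costs two LTM lines here (a topic and an association), which hints at why a deep hierarchy grows so quickly when recast as LTM.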

An ability to dump any of these output forms consistent with PSI documentation can speed up R&D work nicely, and reduce costly errors due to version changes. This is especially true for complex patterns like hierarchies, whose PSI-tree templates may expand dozens of times when recast as LTM.

An author declares a PSI-tree to be a template by naming its kind on its first line within the PSI-lexikon file. This declaration tells the support code what kind of template follows, warns it about extra formatting or display rules, special include files to load, etc.

Effectively, the kind-declaration dictates which logic module gets used to expand or display each template. Each human reader of the PSI pages or raw file gets the same textual warning, plus links to general-purpose web specs explaining the ontological meaning and special syntax for each kind of template their local system can support.
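
The dispatch described above could be sketched as a lookup from the declared kind to a logic module. The `KindDispatcher` class, its registry, and the assumption that the kind is simply the first token on the template's first line are illustrative guesses, not the real file syntax or package design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: picks an expansion module from a template's declared kind. */
final class KindDispatcher {

    // Registry of kind -> logic module; each site could register its own kinds.
    private final Map<String, Function<String, String>> modules = new HashMap<>();

    void register(String kind, Function<String, String> module) {
        modules.put(kind, module);
    }

    /** Assumes the first token on the template's first line names its kind. */
    static String kindOf(String template) {
        String firstLine = template.trim().split("\\R", 2)[0];
        return firstLine.split("\\s+")[0];
    }

    /** Routes the template to the module registered for its kind. */
    String expand(String template) {
        String kind = kindOf(template);
        Function<String, String> module = modules.get(kind);
        if (module == null) {
            throw new IllegalArgumentException("Unsupported template kind: " + kind);
        }
        return module.apply(template);
    }

    public static void main(String[] args) {
        KindDispatcher dispatcher = new KindDispatcher();
        dispatcher.register("T",
                template -> "(LTM for a topic-type template would be generated here)");
        System.out.println(dispatcher.expand("T vehicle\n  car\n  truck"));
    }
}
```

Keeping the registry separate from the modules means a new kind of template needs only one extra `register` call, which fits the open-ended approach to template kinds.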

BENEFIT 3: OPEN SPECS

PSI-lexikons work well in practice because their legal kinds of templates remain open ended. These can usually be expanded or adjusted by just reconfiguring our Java support code, and/or twiddling its reformatting macros. Centralized upgrading is a key advantage to working under the PSI-lexikon framework.

Its rules effectively let each site add new kinds of PSI-tree notations as desired. The process is fast and limited mostly by imagination, so each related project can freely invent its own new kinds of templates to aid authors' productivity.

So far, only these specific kinds of PSI-tree templates are actively supported:

  • *A* - Specs an association type, plus the types of its roles
  • *T* - Specs a topic type, plus its immediate subtypes
  • *O* - Specs an occurrence type whose expected value is a URI
  • *D* - Specs an occurrence type meant to hold a resourceData string
  • *E* - Specs an enumeration of legal values for a role or occurrence

Our current template-support package has many flaws and limits, but a simple web demo does exist to show it working. (Contact me for a tour.) The specific LTM streams it generated went to a recent Lexikos client, who seemed quite pleased with them.

PSI-lexikon files let me build his ontology quickly. Better yet, they made it far easier to adjust and refine, as some requested changes could be centrally added by recoding template definitions, rather than by modifying all the PSI-lexikon files. This benefit grows quickly with the scale of the project.

BENEFIT 4: CONSTRAINTS

The stream from a compiled PSI-lexikon also includes constraints for each type being defined. In particular, the web demo noted above now specifies:

  • What role types are required for each association type
  • What occurrence types are required for each topic type
  • The cardinality of each role and occurrence type
  • The domain of each role and occurrence type
  • The range of each role and occurrence type

Today such core constraints emerge in pseudo-code. Changes to the central logic alone could recast them into any formal constraint language desired, including OWL or a future TMCL.

By default, however, we plan to embed all such constraints directly into the LTM emitted for each constrained type. This lets everything be automatically translated into XTM or RDF, so each application can use the constraints it requires at run time, in any way it requires.

Interestingly, because the PSI-lexikon support package itself understands the constraints, it can do limited validation of PSI-tree templates even as it works. Maximizing the ease and utility of this is a goal for its future improvement.
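
A minimal sketch of such limited validation follows. It assumes each role constraint carries a minimum and maximum cardinality, and that an association is summarized as a count of players per role type. The class, field, and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: checks role cardinalities for one association type. */
final class RoleConstraintCheck {

    /** Cardinality bounds for one role type; a negative max means unbounded. */
    static final class RoleConstraint {
        final String roleType;
        final int min;
        final int max;

        RoleConstraint(String roleType, int min, int max) {
            this.roleType = roleType;
            this.min = min;
            this.max = max;
        }
    }

    /** Returns a list of violations; an empty list means the counts are legal. */
    static List<String> validate(List<RoleConstraint> constraints,
                                 Map<String, Integer> playersPerRole) {
        List<String> problems = new ArrayList<>();
        for (RoleConstraint c : constraints) {
            int n = playersPerRole.getOrDefault(c.roleType, 0);
            if (n < c.min) {
                problems.add(c.roleType + ": needs at least " + c.min + ", found " + n);
            }
            if (c.max >= 0 && n > c.max) {
                problems.add(c.roleType + ": allows at most " + c.max + ", found " + n);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<RoleConstraint> employment = List.of(
                new RoleConstraint("employer", 1, 1),
                new RoleConstraint("employee", 1, -1));
        System.out.println(validate(employment, Map.of("employer", 2)));
    }
}
```

Returning every violation, rather than stopping at the first, suits an authoring tool where one pass should report all problems in a template.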

BENEFIT 5: DEFAULTS

Embedding constraints also eases the creation of default (prototype) instances for every defined association or topic type. This is generally a run-time need, but it must be supported by constraints and other default data on each type.

PSI-lexikons can help by ensuring such default data exists within the ontology. For each defined occurrence type, a prototype value spec is needed that meets all known (domain-specific) constraints. Iff this spec is given, then PSI-lexikon support code can be centrally used to generate the LTM for prototype topics citing it.
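
One way to picture this is sketched below. It assumes each occurrence type supplies a single prototype value as a plain string, and that a prototype topic's id is formed by appending `-prototype` to the id of its type. Both conventions, and the `PrototypeWriter` class itself, are invented here for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: emits LTM for a prototype instance of a topic type. */
final class PrototypeWriter {

    /** Builds a prototype topic plus one resource-data occurrence per default value. */
    static String prototypeLtm(String typeId, Map<String, String> defaultsByOccType) {
        String protoId = typeId + "-prototype";   // assumed naming convention
        StringBuilder ltm = new StringBuilder();
        // LTM topic typed by the topic type it exemplifies: [id : type]
        ltm.append("[").append(protoId).append(" : ").append(typeId).append("]\n");
        for (Map.Entry<String, String> e : defaultsByOccType.entrySet()) {
            // LTM occurrence with inline data: {topic, occurrence-type, [[data]]}
            ltm.append("{").append(protoId).append(", ").append(e.getKey())
               .append(", [[").append(e.getValue()).append("]]}\n");
        }
        return ltm.toString();
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("size", "medium");
        defaults.put("color", "red");
        System.out.print(prototypeLtm("widget", defaults));
    }
}
```

Only occurrence types that actually supply a prototype value appear in the map, so a type without one simply contributes no occurrence line.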

Our *E* template in particular needs such logic. We originally defined it to create the LTM for whole-part trees, a critically important large structure in modeling, but then deferred finishing it pending resolution of this open design issue:

  1. Some whole-part trees involve types. They help one to model complex things like machines or human bodies in terms of their expected parts.
  2. Others involve named instances, such as the regions of earth's surface, or the sectors of our economy, or the people within some named group.
  3. At other times, enumerated values model aspects not part of any whole, such as the primary colors, or the small/medium/large range for sizes.

To handle each case properly from a modeling standpoint, we must split up *E* into three variations. Case 2 especially will work better if LTM for prototypes becomes computable regardless of what types become involved.

Such future features will be formalized as programming time and funding permit. Meanwhile, these specs are a work in progress, selectively published for your comments. Please contact me with reactions, questions, suggestions for improvements, or demos. Thanks.

- Dan Corwin, CTO



Lexikos Corporation
Boston & Knoxville
Email: Dan@Lexikos.com