XML Topic Maps for SHORE

by Andreas Hess

Introduction

XML Topic Maps for SHORE is a tool for using XML topic maps together with our Semantic Hypertext Object Repository, SHORE. The main purpose of this document is to explain the metamodelling capabilities of SHORE by comparing them with the widely known topic maps.

It is assumed that the reader of this document is familiar with XML Topic Maps. For a description of Topic Maps see www.topicmaps.org.

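XML Topic Maps for SHORE consists of two tools:

  1. xtm2mdl, which extracts a SHORE metamodel from a topic map
  2. xtm2xml, which converts a topic map into XML documents for SHORE

Both are based on tools from the TM4J project: its merge tool, implemented in Java, preprocesses topic maps, and a set of XSL style sheets converts topic maps into the SHORE file formats.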

Topic Maps and SHORE Meta Models

A topic can be both an instance of another topic and a class that has other topics as its instances. SHORE, in contrast, makes a distinction between object types and objects that are instances of object types. Object types are defined in SHORE's metamodel document; objects are defined in freely definable document types. The same holds for associations: SHORE makes a distinction between relation types and relations.
To process an XML topic map for SHORE one needs to perform two steps:

  1. Extract all the classes (topics that are referenced by other topics and associations by instanceOf) into a SHORE metamodel
  2. Convert the topic map into a set of XML documents with markup that is recognized by SHORE.

We describe the first step, which is performed by xtm2mdl, first, using Ontopia's opera topic map for the examples:

For each topic that is a class (that is referenced by other topics by instanceOf) introduce a SHORE object type. Example:

<topic id="artform">
<subjectIdentity>
<subjectIndicatorRef xlink:href="http://psi.ontopia.net/opera/#artform"/>
</subjectIdentity>
<baseName>
<baseNameString>Art form</baseNameString>
</baseName>
</topic>

<topic id="opera">
  <instanceOf><topicRef xlink:href="#artform"/></instanceOf>
  <!-- rest omitted -->
</topic>

For this, in SHORE's metamodel an object type (Objekttyp) artform is defined (as a subtype of another object type topic) to appear in a document type topicMap:

Objekttyp artform
ist ein topic
ist definiert in topicMap

(Most of SHORE's metamodel language is still German.) One can see that SHORE supports inheritance between object types. In topic maps any topic can have instances; in SHORE only the leaves of the inheritance tree, that is, object types that have no subtypes, can have instances. To solve this we introduce another object type _artform, derived from artform, that is used to instantiate objects for artform:

Objekttyp _artform
ist ein artform
ist definiert in topicMap

In SHORE documents artforms are defined as instances of the object type _artform while in the SHORE meta model subtypes of artform like opera are derived from the object type artform.
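The class-extraction rule and the leaf-type workaround described above can be sketched in a few lines of Python. This is only an illustration, not the XSLT that xtm2mdl actually uses. The function names are hypothetical, the sample assumes the XTM 1.0 and XLink namespaces are declared on the topicMap element, and every class is simply derived from topic (a simplification, since a class that is itself an instance of another class would be derived from that class instead).

```python
import xml.etree.ElementTree as ET

# Namespaces assumed to be declared on the <topicMap> root element.
XTM = "{http://www.topicmaps.org/xtm/1.0/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def class_topic_ids(root):
    """Return the ids of topics that other topics reference via instanceOf."""
    ids = set()
    for topic in root.iter(XTM + "topic"):
        for ref in topic.findall(f"{XTM}instanceOf/{XTM}topicRef"):
            href = ref.get(XLINK_HREF, "")
            if href:
                # Keep only the fragment, e.g. "#artform" -> "artform".
                ids.add(href.rsplit("#", 1)[-1])
    return sorted(ids)


def object_type_definitions(class_id, parent="topic", doc_type="topicMap"):
    """Emit the object type plus its instantiable '_' leaf type."""
    base = f"Objekttyp {class_id}\nist ein {parent}\nist definiert in {doc_type}"
    leaf = f"Objekttyp _{class_id}\nist ein {class_id}\nist definiert in {doc_type}"
    return base + "\n\n" + leaf


SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="artform"/>
  <topic id="opera">
    <instanceOf><topicRef xlink:href="#artform"/></instanceOf>
  </topic>
</topicMap>"""

for class_id in class_topic_ids(ET.fromstring(SAMPLE)):
    print(object_type_definitions(class_id))
```

For the sample, only artform is referenced by instanceOf, so only artform and its leaf type _artform are emitted.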
Topics can be instances of more than one other topic, so one could say that topic maps support something like multiple inheritance. SHORE supports a similar concept but only between object types. A SHORE object is always an instance of exactly one object type. So for a topic like

<topic id="boito">
<instanceOf><topicRef xlink:href="#composer"/></instanceOf>
<instanceOf><topicRef xlink:href="#librettist"/></instanceOf>
  <!-- rest omitted -->
</topic>

which is a composer as well as a librettist, we need a SHORE object type that inherits from both, so that the object boito can be defined as an instance of it:

Objekttyp _composer_and_librettist
ist ein composer
ist ein librettist
ist definiert in topicMap
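As a rough sketch, such a combined object type could be generated from the list of classes a topic is an instance of. The function name is hypothetical, and both the naming scheme for more than two classes and the use of document order are assumptions extrapolated from the single example above.

```python
def combined_object_type(class_ids, doc_type="topicMap"):
    """Build one SHORE object type that inherits from all given classes."""
    # Assumed naming scheme: "_" + class ids joined by "_and_".
    name = "_" + "_and_".join(class_ids)
    lines = [f"Objekttyp {name}"]
    lines += [f"ist ein {class_id}" for class_id in class_ids]
    lines.append(f"ist definiert in {doc_type}")
    return "\n".join(lines)


print(combined_object_type(["composer", "librettist"]))
```

Called with composer and librettist, this reproduces the definition of _composer_and_librettist shown above.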

For each topic that is referenced by an association with instanceOf, we define a SHORE relation type (Beziehungstyp):

<topic id="written-by">
<subjectIdentity>
<subjectIndicatorRef xlink:href="http://psi.ontopia.net/literature/#written-by"/>
</subjectIdentity>
<baseName>
<baseNameString>Written by/wrote</baseNameString>
</baseName>
<baseName>
<scope><topicRef xlink:href="#writer"/></scope>
<baseNameString>Wrote</baseNameString>
</baseName>
<baseName>
<scope><topicRef xlink:href="#work"/></scope>
<baseNameString>Written by</baseNameString>
</baseName>
<baseName>
<scope><topicRef xlink:href="#librettist"/></scope>
<baseNameString>Wrote libretto for</baseNameString>
</baseName>
<baseName>
<scope><topicRef xlink:href="#opera"/></scope>
<baseNameString>Libretto written by</baseNameString>
</baseName>
</topic>

<association>
<instanceOf><topicRef xlink:href="opera-template.xtmp#written-by"/></instanceOf>
<scope><topicRef xlink:href="ontopsi.xtmm#music"/></scope>
<member>
<roleSpec><topicRef xlink:href="#opera"/></roleSpec>
<topicRef xlink:href="#la-tilda"/>
</member>
<member>
<roleSpec><topicRef xlink:href="#librettist"/></roleSpec>
<topicRef xlink:href="#zanardini"/>
</member>
</association>

Because there is an association that is an instance of written-by, we define a corresponding relation type in SHORE's meta model:

Beziehungstyp written-by
alias Written_by_wrote
von 1 bis * topic
nach 1 bis * topic
ist definiert in topicMap

In this case the instance of written-by is a binary association with two members. In topic maps the members of an association can appear in any order:

<association>
<instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
<scope><topicRef xlink:href="ontopsi.xtmm#literature"/></scope>
<member>
<roleSpec><topicRef xlink:href="#work"/></roleSpec>
<topicRef xlink:href="#marion-de-lorme"/>
</member>
<member>
<roleSpec><topicRef xlink:href="#writer"/></roleSpec>
<topicRef xlink:href="#hugo"/>
</member>
</association>

<association>
<instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
<scope><topicRef xlink:href="ontopsi.xtmm#literature"/></scope>
<member>
<roleSpec><topicRef xlink:href="#writer"/></roleSpec>
<topicRef xlink:href="#brazier"/>
</member>
<member>
<roleSpec><topicRef xlink:href="#work"/></roleSpec>
<topicRef xlink:href="#catherine-ou-la-croix-dor"/>
</member>
</association>

In topic maps this poses no problem because every member has a role spec, so the order of members does not matter. Unlike associations in topic maps, SHORE relations are directed: the role of an object in a relation is expressed by whether the object is the start point or the end point. To handle this, a specialized relation type is defined for each order of roles that appears in the topic map, as a subset (Teilmenge) of the relation type that corresponds to the association as a whole.

Beziehungstyp written-by_work_writer
alias Written_by
von 1 bis * topic
nach 1 bis * topic
ist Teilmenge von written-by
ist definiert in topicMap

Beziehungstyp written-by_writer_work
alias Wrote
von 1 bis * topic
nach 1 bis * topic
ist Teilmenge von written-by
ist definiert in topicMap

Beziehungstyp written-by_opera_librettist
alias Libretto_written_by
von 1 bis * topic
nach 1 bis * topic
ist Teilmenge von written-by
ist definiert in topicMap

The alias feature of SHORE that is used here is comparable to the base name of a topic. As the alias of a relation type we use the base name scoped by the role that forms the start point of the relation.
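The derivation of the subset relation types can be illustrated with a short Python sketch. It is not the actual xtm2mdl implementation: the function names are hypothetical, the namespaces are assumed to be declared on the topicMap element, and the alias normalization (runs of non-word characters become underscores) is inferred from the examples above rather than documented.

```python
import re
import xml.etree.ElementTree as ET

XTM = "{http://www.topicmaps.org/xtm/1.0/}"
HREF = "{http://www.w3.org/1999/xlink}href"


def fragment(ref):
    """Return the id after '#' in a topicRef, e.g. 'a.xtm#work' -> 'work'."""
    return ref.get(HREF, "").rsplit("#", 1)[-1]


def role_orders(root):
    """Map each association type to the distinct role orders of its binary instances."""
    orders = {}
    for assoc in root.iter(XTM + "association"):
        type_ref = assoc.find(f"{XTM}instanceOf/{XTM}topicRef")
        role_refs = assoc.findall(f"{XTM}member/{XTM}roleSpec/{XTM}topicRef")
        roles = tuple(fragment(ref) for ref in role_refs)
        if type_ref is None or len(roles) != 2:
            continue  # n-ary associations need separate handling
        seen = orders.setdefault(fragment(type_ref), [])
        if roles not in seen:
            seen.append(roles)
    return orders


def scoped_names(topic):
    """Map each scoping role to the base name given in that scope."""
    names = {}
    for base in topic.findall(XTM + "baseName"):
        scope = base.find(f"{XTM}scope/{XTM}topicRef")
        text = base.findtext(XTM + "baseNameString")
        if scope is not None and text:
            names[fragment(scope)] = text
    return names


def subset_relation_type(type_id, roles, names, doc_type="topicMap"):
    """Emit the directed subset relation type for one order of roles."""
    start, end = roles
    lines = [f"Beziehungstyp {type_id}_{start}_{end}"]
    if start in names:
        # Assumed normalization: non-word runs become "_".
        lines.append("alias " + re.sub(r"\W+", "_", names[start]).strip("_"))
    lines += [
        "von 1 bis * topic",
        "nach 1 bis * topic",
        f"ist Teilmenge von {type_id}",
        f"ist definiert in {doc_type}",
    ]
    return "\n".join(lines)


SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="written-by">
    <baseName><scope><topicRef xlink:href="#writer"/></scope>
      <baseNameString>Wrote</baseNameString></baseName>
    <baseName><scope><topicRef xlink:href="#work"/></scope>
      <baseNameString>Written by</baseNameString></baseName>
  </topic>
  <association>
    <instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
    <member><roleSpec><topicRef xlink:href="#work"/></roleSpec>
      <topicRef xlink:href="#marion-de-lorme"/></member>
    <member><roleSpec><topicRef xlink:href="#writer"/></roleSpec>
      <topicRef xlink:href="#hugo"/></member>
  </association>
  <association>
    <instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
    <member><roleSpec><topicRef xlink:href="#writer"/></roleSpec>
      <topicRef xlink:href="#brazier"/></member>
    <member><roleSpec><topicRef xlink:href="#work"/></roleSpec>
      <topicRef xlink:href="#catherine-ou-la-croix-dor"/></member>
  </association>
</topicMap>"""

root = ET.fromstring(SAMPLE)
names = scoped_names(root.find(XTM + "topic"))
for type_id, orders in role_orders(root).items():
    for roles in orders:
        print(subset_relation_type(type_id, roles, names), end="\n\n")
```

The sketch keeps the first-seen order of role sequences, so each distinct order yields exactly one subset relation type no matter how many associations use it.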

While SHORE relations represent binary associations, an association in topic maps can have more than two members. Such an association can be represented by an additional object that stands for the association, together with relations from this object to the objects involved in the various roles of the association.

Objekttyp killed-by_association
ist ein topic
ist definiert in topicMap

Beziehungstyp role_cause-of-death_in_killed-by
von 1 bis * killed-by_association
nach 1 bis * topic
ist definiert in topicMap

Beziehungstyp role_perpetrator_in_killed-by
von 1 bis * killed-by_association
nach 1 bis * topic
ist definiert in topicMap

Beziehungstyp role_victim_in_killed-by
von 1 bis * killed-by_association
nach 1 bis * topic
ist definiert in topicMap
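The definitions above follow a regular pattern that can be sketched as a small generator. The function name is hypothetical, and sorting the roles alphabetically is an assumption made only to reproduce the order of the listing above.

```python
def reify_association(type_id, roles, doc_type="topicMap"):
    """Emit an association object type plus one relation type per role."""
    obj_type = f"{type_id}_association"
    blocks = [f"Objekttyp {obj_type}\nist ein topic\nist definiert in {doc_type}"]
    for role in sorted(roles):  # assumed ordering
        blocks.append(
            f"Beziehungstyp role_{role}_in_{type_id}\n"
            f"von 1 bis * {obj_type}\n"
            "nach 1 bis * topic\n"
            f"ist definiert in {doc_type}"
        )
    return "\n\n".join(blocks)


print(reify_association("killed-by", ["victim", "perpetrator", "cause-of-death"]))
```

Each relation type starts at the association object and ends at a topic, so an association with n members becomes one object plus n binary relations.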

Accessing the generated XML Files

After importing the meta model generated by xtm2mdl and also the topic map itself, one can access the contents of the topic map in the repository. Let's assume we are interested in the operas: choose the object type opera from the objects menu.

[Figure: choose opera from menu]

What you get is a list of the operas that SHORE has extracted from the converted topic map. By the way - you could have used the object type _opera as well. The objects found are actually instances of _opera.

[Figure: list of operas]

Let's choose the opera Mefistofele from the list. What you get is the SHORE document that xtm2xml has produced for this topic. The XML file is converted on the fly to HTML by the SHORE server.

[Figure: opera Mefistofele]

The document that xtm2xml generates for each topic consists mainly of the occurrences of the topic plus elements for navigation. In SHORE, each object is defined somewhere in a document. For topics, the place chosen for this definition is the text "About <topicname>". This is the regular starting point for navigating relations that start or end at an object. In contrast to the static solution of the tm4web project, whose XSLT was re-used for xtm2xml, associations are not displayed in the document; they are delivered dynamically by the server on a navigation request. For convenience, a second starting point for navigating relations is included in the document: the "ALL ASSOCIATIONS" line.

When you click on one of the two hyperlinks, the server retrieves all relations for the opera Mefistofele and the browser displays them in the lower frame.

[Figure: navigation]

The list consists of two parts: relations that end at the current object and relations that start at the current object.
The rest is standard SHORE navigation. A more detailed description can be found in the user's manual.

N-ary associations are a special case. An example is the "killed-by" association in the opera topic map. The roles in this association are victim, perpetrator and cause of death. In the opera Tosca the character Floria Tosca stabs the character Baron Scarpia.

Later she kills herself by jumping from a roof.

As described above for the metamodel, we split n-ary associations into a set of relations that start at an additional object that represents the association. If the current object is Floria Tosca, navigation looks this way:

[Figure: associations with Floria Tosca]

Among other relations one can see relations starting from the two artificially inserted association objects:

  1. The first represents Tosca's suicide (she is perpetrator and victim at the same time)
  2. The second represents Tosca's killing of Baron Scarpia (she is only perpetrator)

If we follow the second relation, the artificially inserted association object becomes the current object.

[Figure: Tosca association]

From here you can follow the relation to Baron Scarpia to learn for example, that he was the chief of police.

Like every object, the association object is defined in exactly one document. For association objects it was decided to use the document of the last association member, in this case the "cause of death" object.

Installation

Prerequisite is an installed SHORE system. Let's assume that you've installed OpenSHORE in the directory C:\Programme\OpenSHORE.

SHORE is a client / server system. The ZIP File includes both parts - client side and server side software.

The most likely case at the moment is that you run the SHORE server on your client machine. In this case just extract the ZIP file to the directory where you've installed SHORE, e.g. C:\Programme\OpenSHORE. The directory structure in the ZIP file matches the structure already present. If you installed SHORE to another directory, you need to adapt the LIB variable in xtm2mdl.cmd and xtm2xml.cmd in the client subdirectory.

Also make sure that the LIB directory is on your PATH so that xtm2mdl.cmd, xtm2xml.cmd and merge.cmd will be found.

If client and server are not on the same machine, or if you want to have the client part outside the server directory structure, you can install parts of the ZIP file separately. For the administrative tasks of steps 1-5 on the client you need the content of parser\lib\xtm and the content of the client directory.

The Opera example from Ontopia is a bit more difficult: the topic map opera.xtm uses mergeMap to include the opera-template, geography, history, and ontopsi topic maps and therefore needs merging as the very first step:
merge ontopia_opera\*.xtm* > opera_merged.xtm

Then you can start with step 1 from the description above. Be aware that, because of the size of the topic map, step 1 and step 7 can take a while: depending on your machine, a couple of minutes. Step 7 will split the topic map into more than 1000 XML files.

FAQ

Q: Why is merging always the first step?
A: xtm2mdl and xtm2xml always call the merge tool of the TM4J project as the first step. merge does some useful preprocessing even if no other topic maps are merged; for example, in the result of merge all topics have an ID.

Possible future extensions

One could think about a mechanism that extends the SHORE meta model "on the fly" when a new document type, such as a topic map, is imported for the first time. Why?

SHORE was developed for situations where one has many documents that are instances of a few document types. The document types and their content are described in the meta model; the documents are just imported into the repository. This is the reason why steps 1-5 described above are typically tasks of an administrator and only steps 6 and 7 are performed by normal SHORE users.

It may turn out that the assumption "many documents, few document types" does not hold for topic maps. If this is the case, one reason might be that topic maps do not differentiate between instances and classes. It may turn out that most topic maps are like the opera example: a big XML-coded database that defines its own structure.

In such a situation the seven-step procedure described above will not be very handy, and an "on the fly" extension mechanism for SHORE's meta model would be helpful.

Of course there is another, easier way to process XML topic maps with SHORE: if one models just the meta structure of topic maps (e.g. topics, occurrences, ... as the only object types, with relation types like hasInstance, hasOccurrence, ...), it is possible to process all topic maps with one single meta model. Such a SHORE application could also be helpful in special cases, but navigation on this meta structure would be very ugly.
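To illustrate this generic alternative, the following Python sketch flattens any topic map into relations over a fixed meta structure. Only the relation type names hasInstance and hasOccurrence come from the text above; the function name, the triple representation, the restriction to resourceRef occurrences, and the example URL are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XTM = "{http://www.topicmaps.org/xtm/1.0/}"
HREF = "{http://www.w3.org/1999/xlink}href"


def generic_relations(root):
    """Return (source, relation type, target) triples over the fixed meta structure."""
    triples = []
    for topic in root.iter(XTM + "topic"):
        topic_id = topic.get("id")
        for ref in topic.findall(f"{XTM}instanceOf/{XTM}topicRef"):
            class_id = ref.get(HREF, "").rsplit("#", 1)[-1]
            triples.append((class_id, "hasInstance", topic_id))
        for ref in topic.findall(f"{XTM}occurrence/{XTM}resourceRef"):
            triples.append((topic_id, "hasOccurrence", ref.get(HREF, "")))
    return triples


SAMPLE = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="opera">
    <instanceOf><topicRef xlink:href="#artform"/></instanceOf>
  </topic>
  <topic id="tosca">
    <instanceOf><topicRef xlink:href="#opera"/></instanceOf>
    <occurrence><resourceRef xlink:href="http://example.org/tosca"/></occurrence>
  </topic>
</topicMap>"""

for triple in generic_relations(ET.fromstring(SAMPLE)):
    print(triple)
```

The set of relation types stays the same for every topic map, which is why one meta model suffices. It also shows the drawback: a user navigates via hasInstance instead of domain-specific relations.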

Known Bugs

Acknowledgements

This product includes software developed by the TM4J Project (http://sourceforge.net/projects/tm4j).
TM4J makes use of the following software, so ...
This product includes software developed by the Apache Software Foundation (http://www.apache.org/).
This product includes the "jargs" software, Copyright (c) 2001, Stephen Purcell (http://jargs.sourceforge.net/).
This product includes the "mango" library developed by Jez Higgins (http://www.jezuk.co.uk/).