SIG:Ontologies

Introduction
In May and June 2004, there was a discussion on the TEI mailing list about prosopographical tags. This led to a suggestion that detailed information about persons (physical and legal), dates, events, places, objects etc. and their interpretation could be marked up outside the text, and that this could be connected to ongoing ontology work, e.g. in the museum community, such as the Conceptual Reference Model (CIDOC-CRM). The result was the establishment of an Ontologies SIG at the 4th annual members meeting of TEI in October 2004.

The SIG runs a mailing list on this topic. To join, send a message to tei-ontology-sig-request@lists.sourceforge.net with the word SUBSCRIBE in the header or the body of the email.

The SIG is convened by Øyvind Eide and Christian-Emil Ore, both of the Unit for Digital Documentation at the University of Oslo. Ore is also the chair of The International Committee for Documentation of the International Council of Museums (ICOM-CIDOC).

Meeting on October 23 2004
The meeting was held during the 4th annual members meeting of TEI in Baltimore in October 2004.

As a motivation for the conveners' interest in the topic, a short introduction to the CIDOC-CRM was given, together with a description of how the Unit for Digital Documentation uses the standard. This was combined with a lively discussion about TEI and ontologies in general, and about the CIDOC-CRM standard in particular.

The conveners promised to do the following work in 2004 and 2005:


 * Set up a mailing list (2004)
 * Create a simple web page (2004)
 * Locate people interested in participation
 * Communicate with ICOM/CIDOC
 * Do TEI-CRM work on some of our data
 * Report on our work to the SIG list
 * Report progress at the next TEI meeting

In addition, we hope others will report on their work to the mailing list.

It was also agreed that the following tasks should be done by someone, but responsibility was not assigned to anyone in particular:


 * Identify all elements in TEI with ontological relevance
 * Define mappings from these elements to CRM and to other ontological systems
 * Identify missing elements in both standards
 * Find the borders between TEI and ontologies in order to avoid duplication
 * Investigate the use of METS as a way to connect TEI to CRM
 * Feed results back into TEI P5 based on the aforementioned tasks
 * Suggest more complete solutions on the integration/connection issue

Open meeting on June 15 2005
The TEI Ontologies SIG held an open meeting during the ACH/ALLC Conference in Victoria, Canada, on June 15, 2005. Six people participated.

Introduction
In the minutes from the last meeting, two groups of tasks were set up:

1. Tasks the conveners promised to do in 2004 and 2005

2. Tasks we agreed should be done, but with no appointed person responsible

The tasks in group 1 are all ongoing or finished. Of the tasks in group 2, only the identification of elements in TEI with "ontological relevance" (more on this expression below) has been started.

Motivation
As a motivation for his interest in this work, Øyvind Eide described a project in which an analysis of relationships between person name elements in a TEI document was modeled in CIDOC-CRM. This work was discussed by the group, together with some aspects of the CIDOC-CRM model.

The group agreed on the general idea that there are things, such as relationships between persons, that should be modeled outside TEI, but with a possibility for links to TEI documents.

Identification of TEI elements
This item was presented to the group as "Identifying TEI elements of special ontological interest". This initiated a discussion, because the group rejected the wording. Several other phrases were suggested to replace "special ontological interest", among them "extra-textual (ontological) interest" and "references to the physical world". None of these truly covers what we want to express, though.

Nevertheless, there was a common general understanding of what kind of elements we were talking about: elements such as names, dates and performances of plays are within the scope of this SIG, while elements such as italics, stanzas and paragraphs are outside it.

Time did not permit any actual investigation into which elements in TEI are of interest.

The on-going work in the FRBR/CIDOC-CRM Harmonization Group was discussed. It was argued that the TEI community should be involved in this work.

2005-08-01 Øyvind Eide

Meeting on October 28 2005
The meeting was held in Sofia on the morning of October 28, 2005. Six people participated.

With reference to the last meeting in Victoria, there was a discussion about the scope of the SIG. We have not been able to arrive at a precise definition, and will have to make do with the more practical description of our scope as "elements such as persons, places, dates and events".

The SIG agreed that the example presented on the first day of the TEI meeting by Conal Tuohy of the New Zealand Electronic Text Centre is a good illustration of the type of work that interests this SIG.

The group discussed whether we should propose a generalized event element in TEI, similar to the name element. We did not reach a conclusion on this, and it should be discussed further.

We also discussed different ways to connect a TEI document to ontological information. A TEI document can carry the ontology either by reference or by inclusion. We will continue to examine practical implementations of such connections.

Plans for 2006
We feel that we should now be able to do practical work to see how the ideas discussed in the SIG hold up when implemented. The conveners will work on this, and will publish results both on the SIG list and the WIKI, and through other channels.

There was also interest in exploring ways to include Topic Maps in the TEI header.

The conveners will publish examples of our work on the WIKI, as will the New Zealand Electronic Text Centre. We hope this can lead to a best practice document.

The SIG will probably have a meeting at the Digital Humanities conference in Paris in 2006.

2005-12-09 Øyvind Eide

Meeting on October 28 2006
Eleven people were present at (parts of) the meeting.

The last year
Activity on the list and the WIKI has been quite limited over the last year, and only the conveners have posted anything in either place. There is a visible interest in the topic of the SIG in many areas, though.

The annual conference of The International Committee for Documentation of the International Council of Museums (ICOM-CIDOC) was held in Gothenburg on September 10-14. During the meetings of the Documentation Standards Group, liaison with other parties was discussed and the work in this SIG came up. The group expressed support for the work done by the Unit for Digital Documentation at the University of Oslo on the relationship between CIDOC-CRM and TEI, and hoped the work would continue. This support extends, of course, to the work being done by other members of this list and participants at the SIG meetings as well. CIDOC does not have the means to offer financial support, but moral support is also good!

General discussion
The background of the SIG was presented together with a discussion of the difference between TEI and CIDOC-CRM.

The problem of authority control came up, and its relation to the possible use of FRBR and CIDOC-CRM in connection with TEI was discussed. There is a need for authority services, and the development and use of such services is related to the ontology work.

The cut-off point between TEI and conceptual systems/ontologies is an interesting problem that has to be worked on continuously. How much information should be stored in TEI, and how much in other formats with pointers to the TEI document?

Ontological information systems will often want to point to well-defined access points in TEI.
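As an illustration of what pointing to an access point could look like in practice, the following is a minimal Python sketch. It assumes, hypothetically, that the access points are elements carrying xml:id attributes and that an external system stores pointers of the form document#id. The sample document, the record layout and the function names are invented for this example and are not taken from any SIG work.

```python
import xml.etree.ElementTree as ET

# Attribute name that ElementTree uses for xml:id after parsing.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample document; names and ids are illustrative only.
TEI_SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p xml:id="p1">Letter sent by <persName xml:id="pers1">Anna Hansen</persName>
    from <placeName xml:id="place1">Bergen</placeName>.</p>
  </body></text>
</TEI>"""


def index_access_points(tei_xml):
    """Return a dict from xml:id value to element for every identified element."""
    root = ET.fromstring(tei_xml)
    return {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID) is not None}


def resolve(pointer, index):
    """Resolve a 'document#id' pointer against the index; None if not found."""
    _, _, fragment = pointer.partition("#")
    return index.get(fragment)


# Hypothetical records held outside the TEI document, each pointing into it.
external_records = [
    {"entity": "person/42", "source": "letter.xml#pers1"},
    {"entity": "place/7", "source": "letter.xml#place1"},
]

index = index_access_points(TEI_SAMPLE)
resolved = {}
for record in external_records:
    element = resolve(record["source"], index)
    resolved[record["entity"]] = "".join(element.itertext()) if element is not None else None

for entity, text in resolved.items():
    print(entity, "->", text)
```

The point of the sketch is only that stable identifiers in the TEI document let the external system keep its own model while still citing the exact place in the text it is based on.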

CIDOC-CRM
The draft CIDOC-CRM mapping of some TEI P5 elements, which is being prepared by the Unit for Digital Documentation (EDD), will be published on the mailing list and the WIKI this year. The possibility of using the equiv element to express such a mapping will be examined. Øyvind Eide and Christian-Emil Ore are responsible for this.

In 2007 we will also work on constructing parts of a system for automatically building CIDOC-CRM-compatible models from TEI documents and the TEI-to-CIDOC-CRM mapping.
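To make the idea of such a system more concrete, here is a minimal Python sketch of one possible first step: walking a TEI document and emitting simple statements for elements that a mapping table associates with CIDOC-CRM classes. The mapping table below is an assumption made for illustration and is not the SIG's draft mapping. It is also a deliberate simplification, since a name in the text is not the same thing as the person or place it refers to. The sample document, the subject identifiers and the function name are likewise invented.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Illustrative mapping only (an assumption, not the SIG's draft mapping).
TEI_TO_CRM = {
    "persName": "E21_Person",
    "placeName": "E53_Place",
    "date": "E52_Time-Span",
}

# The same table keyed by namespace-qualified tag, as ElementTree reports tags.
QUALIFIED = {f"{{{TEI_NS}}}{name}": crm for name, crm in TEI_TO_CRM.items()}

# Invented sample document.
TEI_SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName>Anna Hansen</persName> arrived in
    <placeName>Bergen</placeName> on
    <date when="1850-05-01">1 May 1850</date>.</p>
  </body></text>
</TEI>"""


def tei_to_statements(tei_xml, doc_uri):
    """Return (subject, predicate, object) tuples for every mapped TEI element."""
    root = ET.fromstring(tei_xml)
    mapped = (el for el in root.iter() if el.tag in QUALIFIED)
    statements = []
    for number, element in enumerate(mapped, start=1):
        # Hypothetical identifier scheme: one new entity per mapped element.
        subject = f"{doc_uri}#entity{number}"
        # Collapse internal whitespace so labels are single-line strings.
        label = " ".join("".join(element.itertext()).split())
        statements.append((subject, "rdf:type", "crm:" + QUALIFIED[element.tag]))
        statements.append((subject, "rdfs:label", label))
    return statements


statements = tei_to_statements(TEI_SAMPLE, "letter.xml")
for statement in statements:
    print(statement)
```

A real system would also have to decide when two names refer to the same entity and how to represent events and relationships, which this sketch does not attempt.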

FRBR
Gregory Crane at the Perseus Project will do some work on the relationship between FRBR and TEI in 2007 and report to the list and the WIKI.

TEI
Instances of persons have just been added to TEI at element level, and places may be on the way (as opposed to names of persons and places). There are event elements, but only specific ones (persEvent, and event in transcriptions of speech). Should a general event element be added as well?

And maybe a general object element?

There has been some work on different practical methods to relate ontological information to TEI documents. More experiments are welcome.

Meetings in 2007
The next meeting will be held at the TEI meeting in October 2007. There may also be an interim meeting at the Digital Humanities conference in June 2007 if potential participants are interested.

2006-11-21 Øyvind Eide

Meeting on November 3 2007
The meeting was held in College Park, Maryland, on the morning of November 3, 2007, as part of the annual TEI meeting. Six people were present.

Report from last year's work
There has been very little activity on the mailing list, but some relevant papers have been presented:


 * A draft mapping of TEI elements to CIDOC-CRM was published in January.


 * A paper called Expressing Complex Associations in Medieval Historical Documents: The Henry III Fine Rolls Project by Arianna Ciula, Paul Spence, José Miguel Vieira and Gautier Poupeau was presented at the Digital Humanities conference in June.


 * A poster called From TEI to a CIDOC-CRM Conforming Model by Øyvind Eide and Christian-Emil Ore was presented at the Digital Humanities conference in June.


 * A paper called Mapping from TEI to CIDOC-CRM: Will the New TEI Elements Make any Difference? by Øyvind Eide and Christian-Emil Ore was presented at the TEI meeting.

Two interesting projects have been noted:


 * The Henry III Fine Rolls Project.


 * The Perseus project is combining TEI documents with museum data using CIDOC-CRM and FRBR, as described in the article Named Entity Identification and Cyberinfrastructure by Alison Babeu, David Bamman, Gregory Crane, Robert Kummer and Gabriel Weaver.

The work on FRBR that was planned last year has not yet been completed.

Discussion at the meeting
The new CIDOC Co-reference Working Group was presented. The group will work on solutions for co-referencing particulars in various cultural heritage information systems, including TEI documents.

The draft mapping to CIDOC-CRM of selected elements from TEI P5 was discussed, without any conclusions. It was agreed that further discussions should take place on the mailing list, though.

A mapping to CIDOC-CRM of one of the examples in chapter 13 was described and discussed. It was agreed that guidelines on how to create TEI documents that are easy to map to ontologies such as the CIDOC-CRM would be a valuable outcome.

The Henry III Fine Rolls Project and its use of an ontology in combination with TEI were described and discussed. This approach really helps the project, but we think it will be even more of an advance where data from several projects can be combined, using the ontology as the connection layer.

Plans for the next year
There will be a SIG meeting at the TEI meeting in London in October/November 2008. There may also be a meeting at the Digital Humanities conference in Oulu in June 2008 if there is any interest in it.

There should be some discussion on the mailing list about the issues raised at this meeting. We also agreed to try to locate funding for our work.

Progress on the Co-reference work and on the Henry III Fine Rolls Project will be reported on the mailing list.

There is interest in working on the mapping of TEI to FRBRoo, and we hope some results will be presented on the mailing list. We will also continue to work on mappings to CIDOC-CRM, especially of TEI examples, and report on the list as well as elsewhere.

Work should start on the development of guidelines for how to create TEI documents that easily may be mapped to ontologies such as the CIDOC-CRM, and results should be reported to the mailing list.

2007-11-26 Øyvind Eide

Meeting on June 25 2008
Four people met briefly during the Digital Humanities conference in Oulu. The 2008 work of the SIG was described, including some papers at the Digital Humanities conference, the planned session at the TEI meeting and the full-day SIG meeting planned during the TEI meeting.

The idea of a full-week workshop to get some more work done was put forward. We hope to be able to arrange such an event, but will need funding.

Meeting on November 8 2008
The meeting was held in London from 9.30 to 16.45 on November 8, as part of the TEI annual meeting. Eleven people were present in the morning session, but not everyone stayed for the afternoon session.

Report from last year's work
With regard to the plans that were agreed upon at the last meeting, there has been progress in some of the areas. Due to limited interest, there was no real meeting at the DH conference in June, but a few people met and discussed our work over lunch.

We have not been able to locate any specific funds for our work, but some of the members continue to do relevant work funded otherwise. No progress has been reported on the TEI to FRBR work, but there has been work on the TEI to CIDOC-CRM mapping. It is also interesting to note that the work of harmonising FRBR and CIDOC-CRM (FRBRoo) has been finalised and is in the process of being accepted by the relevant bodies.

Several papers have been published with reference to our work:

Øyvind Eide and Christian-Emil Ore: TEI and cultural heritage ontologies. Paper at the Digital Humanities 2008 conference.

Arianna Ciula and Tamara Lopez: Reflecting on a dual: Henry III fine rolls print and web. Paper at the Digital Humanities 2008 conference.

Ciula, Arianna and Vieira, José Miguel (2007), ‘Implementing an RDF/OWL Ontology on Henry the III Fine Rolls’, paper given at OWLED 2007, Innsbruck.

Øyvind Eide: The Exhibition Problem. A Real-life Example with a Suggested Solution. Literary and Linguistic Computing 2008 23: 27-37; doi:10.1093/llc/fqm040

There was also a session at the 2008 TEI meeting organised by this SIG. The session was titled "Connecting TEI to External Conceptual Models as a Method for Information Integration" and included the following papers:


 * Gregory Crane: Fourth Generation Collections: TEI, FRBR, and Canonical Text Services


 * Arianna Ciula and José Miguel Vieira: Complementing and extending TEI documents with an ontology: Henry III Fine Rolls project case study


 * Øyvind Eide and Achille Felicetti: Creating hybrid TEI/CIDOC-CRM documents for semantic browsing of information about 3D digital objects

Mapping of standards
Based on the responses to invitations on the mailing list to suggest topics for this meeting, as well as on the interests of the participants present, the discussion concentrated on the relationship between TEI and CIDOC-CRM. The meeting therefore started with an introduction to CIDOC-CRM by Christian-Emil Ore, followed by a discussion based on the paper by Ore and Eide listed above, "TEI and cultural heritage ontologies". Several items from the paper were discussed in detail. The meeting agreed that the paper will be circulated on the mailing list in December this year so that the members of the SIG can discuss the different items. This will be a closed circulation, as the paper is going to be printed in LLC in 2009.

The goal of this work is to make the following available:


 * A mapping from CRM to TEI++. TEI++ is TEI with the necessary extensions to make the mapping possible.


 * A general mapping from TEI to CRM. Being general, this mapping will be less detailed than what will be possible from specific groups of documents. We will also look into more specific mappings.

Based on these results, some items will be entered in the Sourceforge system as suggestions for changes to TEI, in order to include TEI++ in the TEI guidelines. Some items regarding CIDOC-CRM will be discussed in the CIDOC CRM SIG (web ref).

Mapping of documents
To try mappings of real documents at the meeting, we had been provided with a set of documents by Sebastian Rahtz. We went through a fragment from one of the documents (link to a page on our site as well). Based on this example, we created the following model (link to GraphML version):

Note that this is work in progress and is still open for change. There are some dotted objects in the figure. They are NOT based on the TEI file, but are included to illustrate some other possibilities.

An RDF version of this example will be included here shortly.

Papers
Some papers relevant to the work of this SIG have been presented lately. The following is a short list; please add more!

Arianna Ciula, Paul Spence, José Miguel Vieira and Gautier Poupeau: Expressing Complex Associations in Medieval Historical Documents: The Henry III Fine Rolls Project

[http://www.edd.uio.no/artiklar/teknikk_informatikk/CIDOC2006/EIDE_HOLMEN_Reading_Gray_Literature.pdf Eide, Ø and Holmen, J: Reading Gray Literature as Texts. Semantic Mark-up of Museum Acquisition Catalogues]

[http://www.allc-ach2006.colloques.paris-sorbonne.fr/DHs.pdf Eide, Ø and Ore, CE: TEI, CIDOC - CRM and a Possible Interface between the Two. Page 62-65]

Øyvind Eide and Christian-Emil Ore: From TEI to a CIDOC-CRM Conforming Model (poster)

Øyvind Eide and Christian-Emil Ore: Mapping from TEI to CIDOC-CRM: Will the New TEI Elements Make any Difference?

Draft CIDOC-CRM mapping of TEI
The draft mapping, together with a presentation of this SIG. Please send comments to oyvind.eide@edd.uio.no.