Minutes Oxford 02-2010

Genetic Editions meeting 2010-02-25

Notes taken by Sebastian Rahtz.

This meeting took place at Oxford University Computing Services on 25/26 February 2010. Present were Malte Rehbein, Elena Pierazzo, Lou Burnard, Fotis Jannidis, Gregor Middell, James Cummings and Sebastian Rahtz.

Targets for genetic editions meeting

The interaction between this SIG and the TEI Technical Council was discussed. The Council meets in April, and it was agreed that at least part of the proposals which affect the schema of the TEI (i.e. new and changed elements) should be presented there for agreement. This could be done in a modular way, but with dependencies (i.e. start with low-level changes and move up to larger components such as document?). It is clear that some proposed changes/additions would be to existing chapters, while other parts might need a new chapter. The Genetic document as it stands could be maintained as an ongoing tutorial/exemplar (comparable to Lite and Tite), or completely subsumed into the Guidelines. It was agreed that the meeting would attempt to present at least some concrete changes to the Council for April.

Principles

Straight into document models, and concepts of stages, layers, physical depth of material etc. Pages torn from diaries and later found: how are they recorded? How do we model surfaces which are lost, damaged, refound etc.? It should be possible to use a combination of gap, damage etc. This needs a discussion of the problem in the first part of section 3.1. It is not clear whether damage is a specialization of zone, or whether the two concepts are orthogonal. This affects the content model of zone - in fact, of line, because it is regarded as mandatory that text is always described as one or more lines.

The very important point of principle as to whether one should be allowed to encode any analytical elements within line caused a long discussion. Do we have just a seg wrapper (and allow all the children of seg), or nothing, or a new element segment, or a by-default-empty model.linePart class to let future generations hang themselves? Do we need hi to indicate colour, size etc.? Probably. Finally agreed that the content model of line should be cut back to hiLike, transcriptional, editorial, gLike and global, plus something very like, or itself, seg (but without all the baggage inside it).
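The agreed restriction on line can be pictured as a simple membership check. The sketch below is only an illustration: the class names come from the discussion above, but the element lists are small, incomplete samples chosen for the example, and the names `LINE_CONTENT_CLASSES`, `allowed_in_line` and `disallowed_children` are hypothetical. It is not a schema and makes no claim about the final content model.

```python
# Illustrative only: class names are taken from the minutes; the member
# lists are incomplete samples, not the full TEI class memberships.
LINE_CONTENT_CLASSES = {
    "hiLike": {"hi"},
    "transcriptional": {"add", "del", "unclear", "damage"},
    "editorial": {"choice", "abbr", "expan"},
    "gLike": {"g"},
    "global": {"gap", "note", "milestone", "lb", "pb"},
    # "something very like, or itself, seg"
    "segLike": {"seg"},
}


def allowed_in_line(element_name):
    """Return True if the element belongs to one of the sampled classes."""
    return any(element_name in members
               for members in LINE_CONTENT_CLASSES.values())


def disallowed_children(child_names):
    """Return the child element names that the cut-back model would reject."""
    return [name for name in child_names if not allowed_in_line(name)]


if __name__ == "__main__":
    # A paragraph or a table inside a line would be rejected under this sketch.
    print(disallowed_children(["hi", "del", "p", "seg", "table"]))
```

The point of the sketch is that the restriction is a whitelist of classes, so anything not in one of the named classes stays out of line unless a later decision adds it.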

Did the argument end there? No.

What about constructs like tables, formulae and lists? Are they part of the document or the interpretation? Tables seem to be special, as they model a system of organizing writing on paper which is different from lines. Agreed to allow table as a sibling of line. Unfortunately, this has the same problem as seg: cell can contain almost anything!

Manuscript and dossier levels

revisionCampaign and stages. How to encode the identified timeline(s) for the document? We need to have some explicit structure, which cannot simply be derived from the document. The TEI timeline element is only a skeleton to which parts of the text point, and cannot have its own content. A possible evolution element might reverse things and point from itself into the text, but also have its own content. Relationship to graph? An "evolution" documents a particular path through a graph. Problems over sub-sequences of stages. Stop trying to be so detailed for the moment.

In the simplest case of grouping, we used to say use mod; otherwise, if that falls over, point from individual changelets to the overall stage model. But why bother, because it fails with overlapping sequences anyway, so let's drop mod and let each event point explicitly to a change (remembering that change can be nested, so you can be as granular as you like in the linking). Forget about seq.

What about stage="0"? Lord knows. The proposal is instead to add a new attribute instant, with a boolean value: yes means so close to the current stage that it's not worth recording the difference, while no means some stage later than the current stage.
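The linking scheme settled on above (each event points explicitly to one change, and change can be nested) can be sketched as a small data model. This is a minimal illustration under stated assumptions, not a proposal for the encoding itself: the names `Change`, `Event`, `index_parents` and `enclosing_changes`, and all identifiers in the sample data, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Change:
    """A change that may contain finer-grained changes (hypothetical model)."""
    ident: str
    children: List["Change"] = field(default_factory=list)


@dataclass
class Event:
    """One modification in the transcription, pointing at exactly one change."""
    description: str
    change_ref: str


def index_parents(root: Change,
                  parent: Optional[str] = None,
                  table: Optional[Dict[str, Optional[str]]] = None
                  ) -> Dict[str, Optional[str]]:
    """Map every change identifier to the identifier of its parent change."""
    if table is None:
        table = {}
    table[root.ident] = parent
    for child in root.children:
        index_parents(child, root.ident, table)
    return table


def enclosing_changes(event: Event,
                      parents: Dict[str, Optional[str]]) -> List[str]:
    """List the change an event points to, followed by all its ancestors."""
    chain = []
    current: Optional[str] = event.change_ref
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain


# Sample data: one top-level change with nested, finer-grained changes.
top = Change("c1", [Change("c1.1"), Change("c1.2", [Change("c1.2.1")])])
parents = index_parents(top)
deletion = Event("deletion", "c1.2.1")

if __name__ == "__main__":
    print(enclosing_changes(deletion, parents))
```

Because every event carries its own pointer, two events that sit next to each other in the transcription can belong to different changes, which is the overlapping case that grouping with a wrapper element could not handle.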

Dateable objects: is stage the same as period in att.dateable? Apparently yes. Both are pointers to a location defining a period of time; stage is slightly more general. So the relationship of att.staged and att.dateable needs to be resolved. The SIG believes that transcriptional objects should be stageable but not dateable.

Discussing rewrite

Now discussing spanning modification, like a strikethrough across a whole page. Proposed to resurrect mod for this, to mark some spanning-type modification to an arbitrary part of a surface; so mod has to be allowed at any level within a surface. Remove the rider "which are considered as belonging to the same revision campaign", because we want to get rid of that idea.

Discussion of headings from Gregor (see next section). I don't think I quite got a sense of the discussion :-}

Action plan

Revise the current draft of the document over the next 4 weeks, complete by 2nd April. From that, prepare an executive list of requests to Council by 19th April for submission. Everyone to edit the document on Sourceforge now - ask Lou or Sebastian for commit rights if needed.

Jobs for people to concentrate on:

Gregor: genetic

Fotis: change

Elena: document level. She also has to talk about virtual documents.

Where does all this stuff go in the Guidelines? Each of these bits can be done as a separate TEI enhancement:

Section 3: improves the current transcriptional chapter. stage belongs in the names/people/places/dates chapter, probably.

Section 1/4: dossiers and documents is a new concept, goes in a new section.

Section 5: improves the critical apparatus/editions chapter.

Discussion points from Gregor (for the record)

Theoretical framework (1)

Status vs. Process: The theoretical framework does not address the question of what is supposed to be encoded on the documentary level: the text as it is given on the manuscript, or the process of its writing? That decision has practical consequences, e.g. whether a fixation of a textual passage is transcribed once (because the text is identical) or twice (because it has been written twice).

Aspects of Genetic Editions (2)

The topological description: Besides the transcription/layout of the text, to what degree should we offer means of encoding for non-textual artifacts? It seems that in trying to classify the different objects on a manuscript, we observe 4 classes: 1.) purely textual inscriptions, 2.) purely graphical artifacts (e.g. drawings), 3.) graphical elements with a stable type/token relationship and 4.) graphical elements where this is not the case. We should discuss whether this classification is sound and what encoding guidelines we can give for each class.

(Textual) Alterations: Alterations (additions, deletions, substitutions etc.) could be divided into those that act on the text itself (characters added/removed), and those that act on markup (underlining undone/removed, paragraph boundary removed etc.). We should therefore handle *textual* alterations on the text level, not on the documentary level, and rethink the concept of alteration/variance on the document level.

Transcription of a document (3.1)

Lines: Lines should be optionally typed (att.typed), e.g. for normal vs. interlinear lines or counted vs. uncounted lines etc.

Topological annotations: Zones should not be the only target for the annotation of topological information (coordinates, rotation etc.). Such information could also make sense on a line, word or even character level. Thus it would be necessary to make arbitrary segments of character data addressable and to bundle topological information into a separate class/model (e.g. att.topological), so it can be attached to elements that partition the transcription at the preferred level of granularity.

Textual alterations (3.2)

Befund vs. Deutung: A deletion encoded uniquely on the text level (<del/>) can be expressed in different ways on the document level. For example the passage can be struck through (probably the default case) or it can be made invisible by placing a patch over it or ... 1.) We should differentiate between both levels in our draft. 2.) Do we need different tags for each level, or could we make the semantics of existing tags depend on their usage context (<del/> in <ge:line/> vs. <del/> in <p/>)?

Additions and rewritings (3.2.1)

<ge:rewrite/>: As noted above, this element flags a passage as being rewritten, thereby freeing the transcriber from typing the same text twice. As soon as it comes to the genetic analysis of the passage though, one might want to address the two acts of writing separately.

Deletions and mark as used (3.2.2)

Befund vs. Deutung: What might be indicated via the same expression on the document level (e.g. a passage being struck through) might be interpreted differently on the textual level (e.g. text deleted vs. text marked as used). The <ge:used/> element therefore would be bound to the textual level of markup. On the documentary level then, a redefined <del/> or <delSpan/> would suffice.

Metamarks (3.2.3)

Should the function attribute be specified more thoroughly? What is clear is that metamarks are not part of the text (the "meta" aspect). By contrast, what kind of "markup" they represent (in the end: their semantics and their interpretation) is only explained by example in our draft and might need clarification.

Transpositions (3.2.4)

Arbitrary segments of text could be transposed. Could we promote a "model.global" element like <milestone/> to be a spanning element, so that such segments can be addressed?

Substitution (3.2.5)

Is the distinction between a substitution and a grouping of changes (aka "revision campaign") clear? Could a substitution be reformulated as a grouping of an addition and a deletion, or do we lose specific semantics of "substitution" in this case?

Undoing alterations (3.2.6)

Can any kind of markup be undone? del and add for example have clearly defined semantics on the textual level, so their reverse effect is properly defined as well. Extending the notion of undoing to a potentially open set of markup trades clarity of what is expressed for flexibility in expression.

Revision campaigns (3.3)

<change/>: In the Faust-Edition we would like to group revision campaigns, so that we can start with smaller campaigns on a single page, for example motivated by adherence to a rhyme scheme, and then work our way up to larger groupings by assembling smaller campaigns, maybe because they have been executed with the same writing material. Could we encode this properly by nesting <change/> elements?

Collation and Critical Apparatus (5)

How do we express variance in/alteration of markup? For example, how does one express the removal of a paragraph boundary? In the Faust-Edition we currently use the inline-apparatus construct, with each reading representing one markup alternative. This has the obvious drawback of necessarily doubling the affected segment and any markup that might go unaltered.