Minutes Oxford 02-2010
Genetic Editions meeting 2010-02-25
notes taken by Sebastian Rahtz
This meeting took place at Oxford University Computing Services
on 25/26 February 2010. Present were
Malte Rehbein, Elena Pierazza, Lou Burnard, Fotis Jannidis,
Gregor Middell, James Cummings and Sebastian Rahtz.
Targets for genetic editions meeting
The interaction between this SIG and the TEI Technical
Council was discussed. The Council meets in April, and it
was agreed that at least the part of the proposals which
affects the schema of the TEI (i.e. new and changed elements)
should be presented there for agreement. This could be done
in a modular way, but with dependencies (i.e. start with
low-level changes and move up to larger components such as
document?). It is clear that some proposed
changes/additions would be to existing chapters, while other parts
might need a new chapter. The Genetic document as it stands
could be maintained as an ongoing tutorial/exemplar
(comparable to Lite and Tite), or completely subsumed into
the Guidelines. It was agreed that the meeting would attempt
to present at least some concrete changes to the Council in
April.
Principles. Straight into document models, and concepts of stages, layers, physical depth of material etc. Pages torn from diaries and later found: how are they recorded? How do we model surfaces which are lost, damaged, refound etc.? It should be possible to use a combination of gap, damage etc. This needs a discussion of the problem in the first part of section 3.1. It is not clear whether damage is a specialization of zone, or whether the two concepts are orthogonal. This affects the content model of zone, and in fact of line, because it is regarded as mandatory that text is always described as one or more lines.
The very important point of principle as to whether one
should be allowed to encode any analytical elements within
line caused a long discussion. Do we have just a
seg wrapper (and allow all the children of
seg), or nothing, or a new element
segment, or a by-default-empty
model.linePart class to let future generations hang
themselves? Do we need hi to indicate colour, size
etc.? Probably. Finally agreed that the content model of
line should be cut back to hiLike, transcriptional,
editorial, gLike and global, plus something very like seg, or
seg itself (but without all the baggage
inside it).
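To make the agreed restriction concrete, here is a minimal sketch of a check on the direct children of a line. The element names listed for each class are illustrative stand-ins chosen for this example, not the actual TEI class definitions, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative membership only: each name stands in for a whole class.
# These sets are assumptions for the sketch, not the TEI's definitions.
ALLOWED_IN_LINE = {
    "hi",          # stands in for the hiLike class
    "add", "del",  # stand in for the transcriptional class
    "choice",      # stands in for the editorial class
    "g",           # stands in for the gLike class
    "milestone",   # stands in for the global class
    "seg",         # the cut-down seg wrapper
}


def disallowed_children(line_xml: str) -> list:
    """Return the tag names of direct children of a line outside the allowed set."""
    line = ET.fromstring(line_xml)
    return [child.tag for child in line if child.tag not in ALLOWED_IN_LINE]


ok_line = '<line>a <del>quick</del> <add>slow</add> <hi rend="red">fox</hi></line>'
bad_line = "<line>a <persName>Faust</persName> appears</line>"

print(disallowed_children(ok_line))
print(disallowed_children(bad_line))
```

An analytical element such as persName is reported as disallowed, which is the effect the cut-back content model is meant to have.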
Did the argument end there? No.
What about constructs like tables, formulae and lists? Are they part of the document or the interpretation? Tables seem to be special, as they model a system of organizing writing on paper which is different from lines. Agreed to allow table as a sibling of line. Unfortunately, this has the same problem as seg: cell can contain almost anything!
Manuscript and dossier levels
revisionCampaign and stages. How do we encode the
identified timeline(s) for the document? We need to have some
explicit structure, which cannot simply be derived from the
document. The TEI timeline element is only a
skeleton to which parts of the text point, and cannot have
its own content. A possible evolution element might
reverse things and point from itself into the text, but also
have its own content. Relationship to graph? An
"evolution" documents a particular path through a
graph. Problems over sub-sequences of stages. Stop trying to
be so detailed for the moment. In the simplest case of
grouping, we used to say use mod; otherwise, if that falls over,
point from individual changelets to the overall stage model. But
why bother, because it fails with overlapping sequences
anyway, so let's drop mod and let each event point
explicitly to a change (remembering that change
can be nested, so you can be as granular as you like in the
linking). Forget about seq. What about stage="0"?
Lord knows. The proposal is instead to add a new attribute, instant,
with a boolean value: yes means so close to the current
stage that it's not worth recording the difference, while
no means some stage later than the current
stage.
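The idea of letting each event point explicitly to a (possibly nested) change can be sketched as follows. This is only an illustration under assumptions: the wrapper elements doc and changes, the pointer attribute named change, the plain id attribute (instead of xml:id) and the function names are all hypothetical and were not agreed at the meeting.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: nested change elements plus events pointing at them.
SAMPLE = """
<doc>
  <changes>
    <change id="campaign1">
      <change id="c1a"/>
      <change id="c1b"/>
    </change>
    <change id="campaign2"/>
  </changes>
  <line><del change="#c1a">old</del><add change="#c1b">new</add></line>
  <line><add change="#campaign2">later</add></line>
</doc>
"""


def change_paths(root):
    """Map each change id to the ids from the outermost change down to itself."""
    paths = {}

    def walk(node, prefix):
        for child in node.findall("change"):
            path = prefix + [child.get("id")]
            paths[child.get("id")] = path
            walk(child, path)

    walk(root.find("changes"), [])
    return paths


def resolve_events(root):
    """List (tag, text, change path) for every element carrying a change pointer."""
    paths = change_paths(root)
    events = []
    for el in root.iter():
        pointer = el.get("change")
        if pointer is not None:
            events.append((el.tag, el.text, paths[pointer.lstrip("#")]))
    return events


root = ET.fromstring(SAMPLE)
for event in resolve_events(root):
    print(event)
```

Because the path runs from the outermost change inwards, an event linked to a fine-grained change is automatically also a member of every enclosing change, so no separate grouping element is needed in this sketch.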
Dateable objects: is stage the same as period in att.dateable? Apparently yes. Both are pointers to a location defining a period of time; stage is slightly more general. So the relationship between att.staged and att.dateable needs to be resolved. The SIG believes that transcriptional objects should be stageable but not dateable.
Discussing rewrite
Now discussing spanning modifications, like a strikethrough across a whole page. Proposed to resurrect mod for this, to mark some spanning-type modification to an arbitrary part of a surface; so mod has to be allowed at any level within a surface. Remove the rider "which are considered as belonging to the same revision campaign", because we want to get rid of that idea.
Discussion of headings from Gregor (see next section). I
don't think I quite got a sense of the discussion :-}
Action plan. Revise the current draft of the document over the next 4 weeks, complete by 2nd April. From that, prepare an executive list of requests to Council by 19th April for submission. Everyone to edit the document on Sourceforge now - ask Lou or Sebastian for commit rights if needed.
Jobs for people to concentrate on
Gregor: genetic
Fotis: change
Elena: document level. She also has to talk about virtual documents
Where does all this stuff go in the Guidelines? Each of these bits can be done as a separate TEI enhancement.
Section 3: improves the current transcriptional chapter. stage belongs in names/people/places/dates chapter, probably
Section 1/4: dossiers and documents is a new concept, goes in new section
Section 5: improves the critical apparatus/editions chapter
Discussion points from Gregor
(for the record)
Theoretical framework (1)
Status vs. Process: The theoretical framework does not address the question of what is supposed to be encoded on the documentary level: the text as it is given on the manuscript, or the process of its writing? That decision has practical consequences, e.g. whether a fixation of a textual passage is transcribed once (because the text is identical) or twice (because it has been written twice).
Aspects of Genetic Editions (2)
The topological description: Besides the transcription/layout of the text, to what degree should we offer means of encoding for non-textual artifacts? In trying to classify the different objects on a manuscript, we seem to observe 4 classes: 1.) purely textual inscriptions, 2.) purely graphical artifacts (e.g. drawings), 3.) graphical elements with a stable type/token relationship and 4.) graphical elements where this is not the case. We should discuss whether this classification is sound and what encoding guidelines we can give for each class.
(Textual) Alterations: Alterations (additions, deletions, substitutions etc.) could be divided into those that act on the text itself (characters added/removed) and those that act on markup (underlining undone/removed, paragraph boundary removed etc.). We should therefore handle *textual* alterations on the text level, not on the documentary level, and rethink the concept of alteration/variance on the document level.
Transcription of a document (3.1)
Lines: Lines should be optionally typed (att.typed), e.g. for normal vs. interlinear lines or counted vs. uncounted lines etc.
Topological annotations: Zones should not be the only target for the annotation of topological information (coordinates, rotation etc.). Such information could also make sense on a line, word or even character level. Thus it would be necessary to make arbitrary segments of character data addressable and to bundle topological information into a separate class/model (e.g. att.topological), so it can be attached to elements that partition the transcription at the preferred level of granularity.
Textual alterations (3.2)
Befund vs. Deutung: A deletion encoded uniquely on the text level (<del/>) can be expressed in different ways on the document level. For example, the passage can be struck through (probably the default case) or it can be made invisible by placing a patch over it or ... 1.) We should differentiate between both levels in our draft. 2.) Do we need different tags for each level, or could we make the semantics of existing tags depend on their usage context (<del/> in <ge:line/> vs. <del/> in <p/>)?
Additions and rewritings (3.2.1)
<ge:rewrite/>: As noted above, this element flags a passage as being rewritten, thereby freeing the transcriber from typing the same text twice. As soon as it comes to the genetic analysis of the passage though, one might want to address the two acts of writing separately.
Deletions and mark as used (3.2.2)
Befund vs. Deutung: What might be indicated via the same expression on the document level (e.g. a passage being struck through) might be interpreted differently on the textual level (e.g. text deleted vs. text marked as used). The <ge:used/> element therefore would be bound to the textual level of markup. On the documentary level then, a redefined <del/> or <delSpan/> would suffice.
Metamarks (3.2.3)
Should the function attribute be specified more thoroughly? What is clear is that metamarks are not part of the text (the "meta" aspect). On the other hand, what kind of "markup" they represent (in the end: their semantics and their interpretation) is only explained by example in our draft and might need clarification.
Transpositions (3.2.4)
Arbitrary segments of text could be transposed. Could we promote a "model.global" element like <milestone/> to be a spanning element, so that such segments can be addressed?
Substitution (3.2.5)
Is the distinction between a substitution and a grouping of changes (a.k.a. "revision campaign") clear? Could a substitution be reformulated as a grouping of an addition and a deletion, or do we lose specific semantics of "substitution" in this case?
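As a small illustration of the question, the sketch below derives the text before and after revision from a line encoded once with a subst wrapper and once with a bare deletion and addition. The sample markup and the function name reading are assumptions made for this example; it does not settle whether the wrapper carries extra semantics.

```python
import xml.etree.ElementTree as ET


def reading(line_xml, drop):
    """Flatten a line to plain text, skipping every element whose tag is `drop`."""

    def collect(node):
        parts = [node.text or ""]
        for child in node:
            if child.tag != drop:
                parts.append(collect(child))
            # Text following a child belongs to the parent, so always keep it.
            parts.append(child.tail or "")
        return "".join(parts)

    return collect(ET.fromstring(line_xml))


# Hypothetical encodings of the same alteration.
with_subst = "<line>the <subst><del>quick</del><add>slow</add></subst> fox</line>"
without_subst = "<line>the <del>quick</del><add>slow</add> fox</line>"

for encoding in (with_subst, without_subst):
    before = reading(encoding, drop="add")  # state before the revision
    after = reading(encoding, drop="del")   # state after the revision
    print(before, "->", after)
```

Both encodings yield the same two readings here, so on this view the only thing the wrapper adds is the explicit statement that the deletion and the addition belong together as one act.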
Undoing alterations (3.2.6)
Can any kind of markup be undone? <del/> and <add/>, for example, have
clearly defined semantics on the textual level, so their reverse
effect is properly defined as well. Extending the notion of undoing to
a potentially open set of markup trades clarity of what is expressed
for flexibility in expression.
Revision campaigns (3.3)
<change/>: In the Faust-Edition we would like to group revision campaigns, so that we can start with smaller campaigns on a single page, for example motivated by adherence to a rhyme scheme, and then work our way up to larger groupings by assembling smaller campaigns, maybe because they have been executed with the same writing material. Could we encode this properly by nesting <change/> elements?
Collation and Critical Apparatus (5)
How do we express variance in/alteration of markup? For example, how does one express the removal of a paragraph boundary? In the Faust-Edition we currently use the inline-apparatus construct with each reading representing one markup alternative. This has the obvious drawback of necessarily doubling the affected segment and the markup that might go unaltered.