Encoding Positions and Items

Aims
In our edition’s textual encoding we try to record, if possible, all the processes which occur during the genesis of a handwritten text. As part of that we want to take account of the collaborative nature of Goethe’s mode of work. Both aspects are of general interest when dealing with modern manuscripts.

The headline “positions and items” reflects our view that the distinction between positions and items is relevant for an adequate encoding of these processes (we do not intend to encode textual positions for their own sake). We are especially interested in your opinion on practical solutions to the following problems, which we outline by means of the terms “position” and “item”.

The distinction between positions and items is based on the fact that most texts can be viewed under two aspects:
 * the aspect of being a sequence of tokens, and
 * the aspect of being a sequence of positions which can be singled out independently of the specific token occupying them, e.g. in syntactical or metrical terms.

Usually the sequence of positions remains stable, and there is a one-to-one matching of positions and items: each position is occupied by exactly one item (a linguistic token). The distinction between “positions” and “items” becomes indispensable in the case of positions which lack an item (see below). Without the term “position” we could hardly address this part of the text directly.

The distinction can also be applied to the markup which records alterations of the text. <add> and <del> record an addition or deletion of an item (for example a word) and, along with it, the addition or deletion of a specific position. In contrast, <subst> indicates that one item replaces another while the position itself remains unaffected. Whether or not alterations affect the sequence of positions is not made explicit in the definitions of the P5 “Guidelines”. However, in the examples given there, mere additions and deletions are to be interpreted as accompanied by the emergence or disappearance of the respective position, whereas this does not apply to <subst>.
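The reading of additions, deletions and substitutions described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the word-level <w> wrapping, the three sample lines and the function name count_positions are hypothetical and not part of our transcripts, and a substitution is treated as concerning exactly one position.

```python
import xml.etree.ElementTree as ET

# Hypothetical word-level transcriptions; each <w> holds one item.
ADDITION = "<l><w>a</w><add><w>b</w></add></l>"
DELETION = "<l><w>a</w><del><w>b</w></del></l>"
SUBSTITUTION = "<l><subst><del><w>a</w></del><add><w>b</w></add></subst></l>"


def count_positions(xml, stage):
    """Count the positions of a line 'before' or 'after' its alterations."""
    def walk(elem):
        if elem.tag == "w":
            return 1
        if elem.tag == "subst":
            # One item replaces another; the position persists.
            # Assumption of this sketch: a substitution covers one position.
            return 1
        if elem.tag == "add" and stage == "before":
            return 0  # the position did not exist yet
        if elem.tag == "del" and stage == "after":
            return 0  # the position has disappeared
        return sum(walk(child) for child in elem)

    return walk(ET.fromstring(xml))


for name, xml in [("addition", ADDITION),
                  ("deletion", DELETION),
                  ("substitution", SUBSTITUTION)]:
    print(name, count_positions(xml, "before"), count_positions(xml, "after"))
```

Under these assumptions the addition goes from one position to two, the deletion from two to one, and the substitution keeps a single position throughout.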

Two or more items concur to occupy one position: A simple case (solved)
Sometimes two or more items concur to occupy one and the same position, as in the present case, where one verse line is followed by seven attempts at the next verse line, of which only the first two are shown here:



The problem of indicating their common belonging to one position is solved on the basis of, as you can see in the following detail of a textual transcript:



Positions without any item (solved)
Sometimes a position is not occupied by any item at all, as you see in this example:



In the manuscript, such unoccupied positions are marked with dots, each of which represents a verse line.

In the textual transcript, unoccupied positions like these are marked with :



The unoccupied position can also be recorded in the following form:

       

Both expressions should be interpreted as equivalent.

We can use the attributes  and   or surrounding structural elements like   if we know the extent of the positions concerned. However, there are many cases where we only know, for example, that several lines or part of a speech are missing. We are looking for a unified way of expressing this.

Open problem: The author assigns an item to a previously unoccupied position: “ ”
In the following example the author assigned items to previously unoccupied positions:



Just as in the previous example, the unoccupied position had been marked with two dots, each of which represents a verse line.

In a way this process may seem like an addition. However, <add> is used to mark the addition of an item and, simultaneously, of a position. That does not fit here:
 * 1) The position was already there before the authorial intervention took place.
 * 2) <add> implies an alteration of the text. It implies that the author changed his opinion about the shape of the text. However, the text was not altered. It was merely completed. That is: the author realised his original plan.

Therefore it would be more justifiable to encode these two new verse lines like any other, without further comment. Instead, we would like to propose a new non-empty element “ ” (with an -like syntax but different semantics) that would be applied in the following way:  Hier iſt die Auſſicht frey   Der Geiſt erhoben. 

Open problem: a position is cleared: “ ”
In the following example a wording of a verse line is dismissed whereas the position is left intact, because it is part of a stanza pattern:



In other words: The item would have to be substituted, but is not.

We may be inclined to use <del>, because we are used, with good reason, to interpreting a strikeout as a deletion. However:
 * 1) <del> would be interpreted as a deletion of a position.
 * 2) In fact, as we said above, the position is left intact.

Therefore, we propose a new non-empty element, e.g. “ ” (with an -like syntax but different semantics) that would be applied in the following way: “ ”.

Considered alterations
An alteration is considered by the author, but not carried out:



Apparently, the author was undecided about who is speaking: the “Schatzmeister” (treasurer) or the “Marschalk” (constable)?

Proposed Alterations
Another person, e.g. one of the author’s consultants, proposes an alteration:



The consultant prefers “Gespenst” (ghost) to the mere repetition of “Gespinst” (figment) in “Gespinst-Gespinnsten”.

Accepted Proposals
In the following example a consultant of the author had proposed a substitution (in pencil). The author later accepted the proposal (by inking over it):



An item is questioned
Passages may be marked for the purpose of a prospective deletion or substitution:



In this case, the original “Und” has become inappropriate as a result of the addition of two verse lines in pencil, the first of which also begins with “Und”.

Such an act is to be considered an alteration of the text, because an item loses its previously unchallenged validity.

A combined example


In this case the last three phenomena appear together:
 * 1) A consultant of the author questions “Anteus” (underlining in pencil).
 * 2) The consultant proceeds to propose a substitution (in pencil in the left margin).
 * 3) Eventually the author accepts the proposal (by inking over it).

Possible approaches
If we want to do justice to the collaborative nature of the working process, we have to find a proper way to record the various steps which lead up to a final wording.

Starting point
There are at least two different approaches: we can record the above-mentioned alterations either
 * 1) via elements or
 * 2) via attributes.

With regard to considered alterations, a proposal has been made in the “Encoding Model for Genetic Editions” (cf. “3.2.4 Alternative Readings”). Applied to the above-mentioned example, the textual encoding would look like this:



In this way of encoding,  is supposed to indicate that both items eventually concur to occupy one and the same position.  is supposed to record the authorial intervention. Both are expressed by means of elements, not attributes.

We encountered two problems with this proposal:
 * 1)  indicates that a position was occupied more than once already before an intervention took place.
 * 2)  requires a   as its correlate.

The conflict underlying these two problems seems to be the following: Should we prioritize the alteration which is considered or the fact that it is only considered? serves the first,  serves the second.

If we want to record the alteration in question, we should record it completely. In the above-mentioned case, we would have to record a substitution:

Schatzmeiſter. Marſchalk.

But how to record the mentioned fact that this substitution is only considered?

Expression via attributes
First we recorded the fact that an alteration is only considered by means of an attribute:

 Schatzmeiſter. Marſchalk.

meaning “revision type”.

Then we were made aware that a text automatically derived from this encoding would read “Marſchalk”, because “Schatzmeister” is unambiguously abandoned. However, there may be cases where the original wording is to be preferred from an editorial point of view, although its deletion was taken into consideration by the author.

Perhaps it is possible to circumvent this problem by teaching a machine to include or ignore <add>s and <del>s depending on specific attribute values with which they could be provided:


 * 1) “ ” with  or the consultant’s name as value   as value for the considered alterations (which might be regarded as a special kind of proposed alterations)
 * 2) “ ” with  as the normal value
 * 3) “ ” with  or   as value.
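How a machine might include or ignore additions and deletions depending on attribute values can be sketched as follows. This is a minimal illustration, not a settled proposal: the attribute name status, its values (considered, proposed, accepted, carried-out), the function name reading_text and the sample encoding of the speaker example are all hypothetical choices made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute name and values, chosen only for this sketch.
ATTR = "status"
DEFAULT = "carried-out"                  # assumed value when the attribute is absent
EFFECTIVE = {"carried-out", "accepted"}  # alterations that change the derived text

# Hypothetical encoding of the considered substitution discussed above.
SAMPLE = (
    "<speaker><subst>"
    '<del status="considered">Schatzmeiſter.</del>'
    '<add status="considered">Marſchalk.</add>'
    "</subst></speaker>"
)


def reading_text(elem, effective=EFFECTIVE):
    """Derive a text, applying only alterations whose status is effective."""
    applied = elem.get(ATTR, DEFAULT) in effective
    # Drop the content of an applied deletion or of a non-applied addition.
    if (elem.tag == "del" and applied) or (elem.tag == "add" and not applied):
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(reading_text(child, effective))
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
original = reading_text(root)
considered = reading_text(root, EFFECTIVE | {"considered"})
```

With the default policy the derived text keeps the original speaker name; passing a policy that also treats considered alterations as effective yields the alternative. The editorial decision thus stays outside the encoding and is expressed as a set of attribute values.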

It is impossible to deal with the above-mentioned act of questioning by means of an attribute alone, because in many cases we cannot infer from it a specific alteration that could be provided with an attribute (especially in prose texts we may not even know whether the position itself or only its item is concerned). We would have to use (abuse?)  as a carrier for the attribute   and one of the above-mentioned values.

However, if the act of questioning is to be considered an alteration, and if we want to encode alterations consistently, we should record this act in the same way as an addition or a deletion. That is, we should look for an element whose name fits the nature of the act.

Expression via elements
In the above-mentioned “Encoding Model for Genetic Editions” a special non-empty element has been introduced to record the fact that an alteration is to be regarded as undone:  (cf. 3.2.7 Undoing alterations). This element “points to the element the effect of which is being cancelled”. Does an encoding of this kind enable us to automatically derive a text that contains, for example, the content of a <del> (because the deletion concerned has been undone)? If it does, the following elements could work in a syntactically analogous way:
 * 1) “ ” with  and @target
 * 2) “ ” with  and @target
 * 3) “ ” with  and @target
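Whether a text can be derived automatically from a pointer-based encoding can be explored with a small sketch. It assumes, purely for illustration, an element named undo whose @target holds one or more “#id” references to the cancelled alteration, written here as an empty pointer for brevity; the sample line and the function name derive_text are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical line: a deletion that is later cancelled by a pointing element.
UNDONE = '<l>Hypothetical <del xml:id="d1">restored </del>line<undo target="#d1"/></l>'
NOT_UNDONE = '<l>Hypothetical <del xml:id="d1">restored </del>line</l>'


def derive_text(xml):
    """Derive a text, ignoring every alteration that an undo element points to."""
    root = ET.fromstring(xml)
    cancelled = {
        ref.lstrip("#")
        for undo in root.iter("undo")
        for ref in undo.get("target", "").split()
    }

    def walk(elem):
        is_cancelled = elem.get(XML_ID) in cancelled
        if elem.tag == "del" and not is_cancelled:
            return ""  # the deletion stands: drop its content
        if elem.tag == "add" and is_cancelled:
            return ""  # the addition was undone: drop its content
        out = elem.text or ""
        for child in elem:
            out += walk(child) + (child.tail or "")
        return out

    return walk(root)


with_undo = derive_text(UNDONE)
without_undo = derive_text(NOT_UNDONE)
```

Under these assumptions the pointer is resolved in a first pass and the cancelled deletion is then treated as if it had never been made, so its content reappears in the derived text. Analogous pointing elements for considered, proposed or accepted alterations could be resolved in the same two-pass manner.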

Again, the act of questioning would be dealt with in a different way, because there is no given element to point to. Instead of an empty “ ” which points to a specially allocated, we prefer a non-empty “ ”.