Prosopography

Section 13.3 Prosopography of the TEI Guidelines discusses how the elements and attributes from the TEI P5 namesdates module can be used to mark up prosopographical data.

This wiki page attempts to summarise a number of issues in recording prosopographical data, and aims to provide a (sometimes personal) starting point for discussion and sharing of encoding experiences and insights. Rather than a documentation of best practice, it focuses on possible approaches for encoding some prosopographical phenomena, and tries to find sensible ones.

Unknown dates vs. lacking dates
Birth and death dates can only be recorded if they are known. Yet omitting the <death> element from a description may lead to ambiguity. Consider the following example: <person> <persName>Elvis Presley</persName> <birth when="1935">1935</birth> </person>

Does this indicate:
 * that the person is still living?
 * that the person has died but the date of death is unknown?

(Obviously, this ambiguity does not arise for birth dates, since it can be assumed that every person has been born.)

To resolve this ambiguity, it could be assumed that a <person> element without a <death> element describes a living person. When the person is known to have died, a <death> statement would then be required. If only the birth date is known, it can be used as a lower bound for the possible date of death: <person> <persName>Elvis Presley</persName> <birth when="1935">1935</birth> <death notBefore="1935" precision="low">?</death> </person> (source: Sebastian Rahtz on TEI-L)

Note how the @notBefore attribute is used to specify this lower bound, in combination with the @precision attribute to indicate the precision of this statement. One could equally infer an upper bound by assuming that nobody lives longer than 120 years, and document this with the @notAfter attribute on the <death> element. The status of this conjecture and its certainty could be indicated by the @evidence and @cert attributes, respectively: <person> <persName>Corey Adams</persName> <birth when="1835">1835</birth> <death notBefore="1835" notAfter="1955" evidence="conjecture" cert="high">?</death> </person> (source: Syd Bauman on TEI-L)

The missing data could be represented by, e.g., a question mark as the text content of the <death> element.

When neither the birth date nor the death date of a deceased person is known, no precise dates (or approximations) can be given in attributes. Still, it could make sense to include <birth> and <death> elements whose text content indicates this knowledge gap: <person> <persName>Elvis Presley</persName> <birth>?</birth> <death>?</death> </person>

Circa Dates
This issue has been raised on the TEI-L mailing list. Currently, TEI P5 only offers implicit means to indicate that a date is only an approximation:

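<person> <persName>Aischylos</persName> <birth when="-0525" precision="low">ca. 525 b.C.</birth> <death when="-0456" precision="low">ca. 456 b.C.</death> </person>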

...with the @precision attribute indicating the low precision of these statements. (source: Gabriel Bodard on TEI-L)

Yet, depending on the requirements of the encoding, an ideal representation of circa dates can become quite complex, and it touches on broader issues concerning the accurate representation of uncertainty. See the following resources:
 * Tim Finney: 'Uncertainty in text, markup and beyond'
 * feature request by Tim Finney
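
One modest way to make a circa date more explicit with the existing attributes is to spell out the assumed margin as bounds. The sketch below reuses the Aischylos example; the margin of five years either side is purely an assumption made for illustration, not something stated in the sources.

```xml
<!-- Sketch only: the margin of 5 years either side is an assumption -->
<person>
  <persName>Aischylos</persName>
  <!-- ca. 525 b.C. read as "between 530 and 520 b.C." -->
  <birth notBefore="-0530" notAfter="-0520">ca. 525 b.C.</birth>
  <!-- ca. 456 b.C. read as "between 461 and 451 b.C." -->
  <death notBefore="-0461" notAfter="-0451">ca. 456 b.C.</death>
</person>
```

This keeps the original wording as text content while making the interpretation machine-readable, at the cost of committing to a margin the source may not justify.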

Alternative Dates
Sometimes there are several candidates for the date of birth or death of a person. When the candidate dates are consecutive, one could encode them as a date range with @notBefore and @notAfter, as in this case with two possible birth dates: <person> <persName>Duncan, Isadora</persName> <birth notBefore="1877" notAfter="1878">[1877 or] 1878</birth> <death when="1927">1927</death> </person>

(source: Sebastian Rahtz on TEI-L)

When, on the other hand, the candidate dates are not consecutive, multiple <birth> or <death> elements can be provided, with an @exclude attribute indicating that they are mutually exclusive alternatives: <person> <persName>Vanhautte, Delphien</persName> <birth when="1869-09-02"> <placeName>Ardooie</placeName> 02.09.1869 </birth> <death when="1944-02-03" xml:id="d19440203" exclude="#d19441203"> <placeName>Edewalle (Handzame)</placeName> 03.02/12(?).1944 </death> <death when="1944-12-03" xml:id="d19441203" exclude="#d19440203"> <placeName>Edewalle (Handzame)</placeName> 03.02/12(?).1944 </death> </person> (source: Syd Bauman on TEI-L)

Only a period is known
Sometimes no exact dates of birth or death are known, while the lifetime can be approximated as a period. As Sebastian Rahtz remarked on TEI-L, the period can be approximated by @notBefore and @notAfter attributes on the <birth> and <death> elements, while the ambiguity can be recorded in a <floruit> element: <person> <persName>Regina van Alisa</persName> <birth notBefore="0201"/> <death notAfter="0300"/> <floruit>3rd cent</floruit> </person>

In his response on TEI-L, Gabriel Bodard adds a refinement. If the period does not literally apply to the birth and death dates, and in the previous example rather conveys that this person lived the better part of her life in the 3rd century (while she might have been born before its start or have died after its end), the most accurate representation would invert the @notBefore and @notAfter attributes on <birth> and <death>, respectively: <person> <persName>Regina van Alisa</persName> <birth notAfter="0300"/> <death notBefore="0201"/> <floruit>3rd cent</floruit> </person>

Pseudonyms
Section 13.3.2.2 Personal States of the TEI Guidelines describes how multiple <persName> elements are allowed for a person, possibly categorised with a @type attribute. For pseudonyms, the value 'pseudo' can be used. Hence, the guitarist commonly known as Slash could be described as follows: <person> <persName> <surname>Hudson</surname> <forename>Saul</forename> </persName> <persName type="pseudo">Slash</persName> </person>

This may raise an issue regarding the 'canonicity' of either name. In this case, a bibliographic entry would probably use the pseudonym 'Slash', whereas for less famous people the actual name may be preferred. This 'preference' could be recorded implicitly by the order of the <persName> elements, or with the @sort attribute: <person> <persName sort="2"> <surname>Hudson</surname> <forename>Saul</forename> </persName> <persName type="pseudo" sort="1">Slash</persName> </person>

This could then mean that the pseudonym is the 'canonical' name for this person.

Another question is whether the actual name should then be typed as @type='real'?
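
If that route were taken, the encoding might look like the sketch below. The value 'real' is hypothetical here: it only illustrates the question and is not a value prescribed by the Guidelines.

```xml
<person>
  <!-- type="real" is a hypothetical value, used only to illustrate the question -->
  <persName type="real">
    <surname>Hudson</surname>
    <forename>Saul</forename>
  </persName>
  <persName type="pseudo">Slash</persName>
</person>
```

Typing both names explicitly would avoid relying on the absence of @type to signal the actual name, but it also obliges a project to define what counts as 'real'.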

Alternative Names
A related issue arises when people are commonly known by a name other than their real name. For example, Joe English, a Belgian painter, is actually called Joseph English. I wonder whether a regularisation with <reg> would be the most suitable solution to encode this?

<persName> <choice> <orig>English, Joe A.M.</orig> <reg> <surname>English</surname> <forename>Joseph A.M.</forename> </reg> </choice> </persName>

Mixing Prose and Structure
The TEI P5 Guidelines define the contents of <person> as either a collection of highly structured biographical 'fields' (using dedicated elements like <birth>, <death>, ...), or a collection of loose prose paragraphs (<p>). Especially when transforming existing biographical records to TEI P5, having to choose one or the other can be unsatisfactory:
 * splitting up all biographical prose into biographical fields may be too demanding
 * on the other hand, choosing to retain the description in prose paragraphs rules out the specification of basic biographical data in specific fields (<persName>, <birth>, <death>) inside <person>

One (tentative) solution could be to extract record-like information into biographical fields, while grouping the prose descriptions as paragraphs inside a <note> element:

<person xml:id="FJ"> <persName> <choice> <orig>Fabricius, Jan</orig> <reg> <surname>Fabricius</surname> <forename>Jan</forename> </reg> </choice> </persName> <birth when="1871-09-30"> <placeName>Assen</placeName> 30.09.1871 </birth> <death when="1964-11-23"> <placeName>Wimborne, Dorset</placeName> 23.11.1964 </death> <note> <p>Journalist, Nederlands toneelschrijver. Vader van de schrijver Johan Fabricius. Hoofdredacteur van verschillende bladen, zowel in Nederlands-Indië als in Nederland. In 1938 vestigde hij zich in Engeland. Hoewel Fabricius op latere leeftijd ook enkele romans en drie delen memoires heeft gepubliceerd, ligt zijn betekenis in zijn toneelwerk. Zijn technische vaardigheid en meesterschap in het schrijven van een dialoog, maken zijn stukken goed speelbaar. Kind van een tijd die het realisme beoefende (Heijermans, Hauptmann), kon hij toch niet de romantiek geheel laten varen. Het handelingsverloop is vaak gewild; er wordt dan bewust aangestuurd op een aandoenlijke werking, zodat sommige van zijn stukken het predikaat 'melodramatisch' verdienen. Gedreven door een sterk gevoel voor rechtvaardigheid zocht hij zijn gegevens in de problematiek van milieus die hem vertrouwd waren: het boerenland van Groningen en Drente, de planterswereld van Indië. Fabricius' werk werd in Nederland en ook in het buitenland met succes opgevoerd. Tot zijn bekendste werken behoren Eenzaam (1907) en Dolle Hans (1916). Sommige stukken zijn vertaald in het Frans, Duits, Engels, Noors, Deens, Fries en Gronings. <ptr target="http://www.dbnl.org/auteurs/borkv/Fabr004.htm"/> </p> </note> </person>

This also makes it possible to group other types of information into further notes, distinguished by appropriate values for their @type attributes.
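
As a sketch of what such typed notes might look like for the Jan Fabricius record, the fragment below separates the prose biography from the pointer to its source. The @type values "biography" and "source" are hypothetical, and the prose is cut down to its first sentence.

```xml
<person>
  <persName>Fabricius, Jan</persName>
  <birth when="1871-09-30">30.09.1871</birth>
  <death when="1964-11-23">23.11.1964</death>
  <!-- the type values "biography" and "source" are hypothetical -->
  <note type="biography">
    <p>Journalist, Nederlands toneelschrijver.</p>
  </note>
  <note type="source">
    <p><ptr target="http://www.dbnl.org/auteurs/borkv/Fabr004.htm"/></p>
  </note>
</person>
```

A project adopting this approach would probably want to document its list of @type values, so that notes can be selected reliably in processing.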

Traits, States, and Events
The Guidelines use traits, states, and events as the basic categories for prosopography. Traits differ from states in being generally immutable. However, the distinction between states and traits may be unnecessary: almost every member of model.persTraitLike (age, faith, langKnowledge, nationality, sex, socecStatus, trait) can change. Dropping "trait" leaves a simple yet expressive model in which entities are described by states and state transitions (i.e. events).
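
A minimal sketch of this states-and-events view is given below, using the generic <state> and <event> elements. The person, the dates, and the @type values are all invented for illustration; nothing here describes a real individual.

```xml
<!-- Hypothetical person, dates and type values, invented for illustration -->
<person xml:id="px">
  <persName>N.N.</persName>
  <!-- a property usually treated as a trait, encoded as a dated state -->
  <state type="nationality" to="1920">
    <label>Dutch</label>
  </state>
  <!-- the transition between the two states, encoded as an event -->
  <event type="naturalisation" when="1920">
    <label>Becomes a British citizen</label>
  </event>
  <state type="nationality" from="1920">
    <label>British</label>
  </state>
</person>
```

The same pattern would work for faith, socio-economic status, or age, which is the sense in which a separate trait category adds little.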

Relations
At the moment, relations are defined by attributes such as name, active and passive:

<relation name="parent" active="#p1 #p2" passive="#p3 #p4"/>

An alternative would be to use RDF-like subject, predicate, object triples:

<relation subject="#p1" predicate="child-of" object="#p2"/>

If the relation element could be placed inside the element which describes the subject (e.g. a person element) then the subject attribute would be implicit.
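
For comparison, the triple above can be approximated with the existing attributes by reading @active as the subject, @name as the predicate and @passive as the object. The sketch below uses placeholder persons; the ids, the names, and the relation name "child" are hypothetical.

```xml
<listPerson>
  <!-- placeholder persons; ids and names are hypothetical -->
  <person xml:id="p1"><persName>Person One</persName></person>
  <person xml:id="p2"><persName>Person Two</persName></person>
  <listRelation>
    <!-- reads: p1 (active, the subject) is a child of p2 (passive, the object) -->
    <relation name="child" active="#p1" passive="#p2"/>
  </listRelation>
</listPerson>
```

What the existing model does not offer is the implicit subject: here the <relation> sits in a <listRelation> next to the persons rather than inside the <person> it describes.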

Projects
This is a list of projects which have significant prosopography components. Feel free to add your project here. If possible, include links to the XML resources you have created.


 * The Map of Early Modern London (XML personography)
 * The Colonial Despatches of BC and Vancouver Island (XML personography)
 * The William Still Digital History Project