Future changes to Best Practices for TEI in Libraries

This page describes changes to the Best Practices for TEI in Libraries which will be made at some point but which did not hold up release of version 3.0.

These items will be migrated into the GitHub issue tracker.

milestone element
-- Is this TEI-conformant? Is there a better way to do this in any case?


 * This is not TEI conformant, unless you think that "*****" is a valid way of naming something (here this particular milestone). I would suggest


 * Or possibly


 * Or you could use <space> if you have followed Tite in adding it. (LouBurnard)


 * I agree with Lou that “typography” is not really a unit, and stars are not really a label for the unit, whatever it is. I’m not fond of <space> for this purpose, but perhaps there are convincing arguments I haven’t considered. I quite like
 * except that I think, in general, these units are structural. So I would prefer "undetermined" as the value of unit=. Syd Bauman
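
One possible reading of this thread, offered as a minimal sketch rather than a settled recommendation: keep the milestone element but move the stars out of the attributes that name the unit. Only unit="undetermined" comes from the discussion above; the rend value and the surrounding paragraphs are assumptions for illustration.

```xml
<!-- Sketch only: unit="undetermined" follows Syd's stated preference;
     rend="stars" is a hypothetical value, not one defined by the BP. -->
<p>Last paragraph before the ornamental break.</p>
<milestone unit="undetermined" rend="stars"/>
<p>First paragraph after the break.</p>
```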

deprecate key in favor of ref using a URN
The TEI Council, after discussing feature request 2919640, decided at its April 2010 meeting in Dublin to change the Guidelines to recommend use of @ref with a URN in place of @key. Revisions to P5 are forthcoming.

If we work this revision into the BP, we will probably need to revise our section on "key and ref" to something like:

Shakespeare, William, 1564-1616

Not sure whether use of <taxonomy> is still warranted.
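
As a rough illustration of @ref with a URN, the heading above could be encoded along these lines. This is a sketch under assumptions: urn:example: is the URN namespace reserved for examples (RFC 6963), and the identifier that follows it is hypothetical, not one the Council or the BP has specified.

```xml
<!-- Sketch only: the URN is a placeholder in the reserved "example" namespace. -->
<persName ref="urn:example:names:shakespeare-william-1564-1616">Shakespeare, William, 1564-1616</persName>
```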

The TEI Header

 * consider whether various MARC 1xx and 7xx subfields could be broken out into components of persName. If so, we'll change recommendations for persName@type.
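
A sketch of what breaking out the subfields might look like, assuming the personal name (MARC 100 $a) is split into surname and forename and the dates (100 $d) go into a date element. The mapping is an assumption for discussion only, and surname and forename require the namesdates module.

```xml
<!-- Sketch only: this subfield-to-element mapping is hypothetical. -->
<persName>
  <surname>Shakespeare</surname>, <forename>William</forename>, <date>1564-1616</date>
</persName>
```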

list of elements deleted and changed by our ODDs
Syd's list of elements is now on a different wiki page because this page is meant for things to be addressed once version 3.0 is complete. (Which is now entirely out of date. —Syd)

Pending Issues Discussed
Should we have a place in the header to indicate an identifier for an outside metadata record for the item? Examples:

 * record number for the source document in the local catalog
 * record number for the source document in WorldCat
 * record number for this TEI document in the local catalog
 * record number for this TEI document in WorldCat

Having such a link would allow a delivery system to provide an unambiguous link to this full metadata without relying on matching other information in the header like a title, ISBN, or call number. (Kshawkin)

Yes, I think we should. How about the spot where the TEI Guidelines recommend putting the code for the classification of the text (in some scheme), <classCode> inside <classDecl>, or is that too much of a stretch? (Syd)


 * During the call on 2009-02-10, Syd said he no longer thinks use of classCode (and a corresponding classDecl) is a good idea. Instead, he suggested we create a new element, otherDesc, to contain elements from outside the TEI namespace for metadata not covered by the TEI header. The GBP could specify how this element is used. (Kshawkin)

NOTE: we talked about this during our conf call on 2009-02-10; we decided to have a sub-group conference call on 2009-02-17 to talk in more detail about this. Emcaulay


 * We didn't get to this on 2009-02-17, so we postponed to 2009-03-03. However, few people showed up, so we postponed again.  As Syd put it, there are two issues to consider here:


 * A. What mechanism should we use to point from the TEI header to metadata located outside the TEI document? (For example, how do you identify a MARC, METS, or MODS record that provides additional metadata about the TEI document and/or the source document?)


 * B. Should we provide a recommendation on storing non-TEI metadata within the TEI document (using a different element namespace)? For example, should we allow Dublin Core elements anywhere in the TEI header?

Email discussions in late March 2009 and early April 2009 with Syd, Melanie, Kevin, Michelle and Glen did not reach a conclusion. Tentative plans for the future would do this sort of thing when an element has the @ref attribute:

<persName ref="[. . .]">Welles, Gideon, 1802-1878.</persName>

except that in your example there's no @type or other method for describing the relationship between the content of <persName> and the value of @ref. P5 says that @ref "provides an explicit means of locating a full definition for the entity being named by means of one or more URIs", but we are looking for a typology of some sort for these links and need a place to indicate the type of link.

And we'd do this when there's no @ref:

<sourceDesc xml:id="sourceDesc_1"> [. . .] </sourceDesc>

for which you'd find elsewhere in the document:

<link type="MARCsource" target="#sourceDesc_1 http://mirlyn.lib.umich.edu/F/?func=direct&amp;doc_number=000601789&amp;local_base=MIU01_PUB"/>

link elements might be grouped together in one of these places:


 * TEI/teiHeader/profileDesc/creation/ab/linkGrp
 * 1st child of last child of
 * TEI/text/back/div[@type='editorial']/linkGrp
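
For the last location in this list, the grouping might look roughly like the following. It is a sketch only: the type value on linkGrp is hypothetical, and the link is the one given earlier with its ampersands escaped so the fragment is well-formed.

```xml
<!-- Sketch only: type="externalMetadata" is an assumed label, not a BP value. -->
<back>
  <div type="editorial">
    <linkGrp type="externalMetadata">
      <link type="MARCsource"
            target="#sourceDesc_1 http://mirlyn.lib.umich.edu/F/?func=direct&amp;doc_number=000601789&amp;local_base=MIU01_PUB"/>
    </linkGrp>
  </div>
</back>
```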

UPDATE: We'll probably use in various header elements: see https://sourceforge.net/tracker/index.php?func=detail&aid=2493417&group_id=106328&atid=644065. In any case, we'll need to tell people how much metadata to include in TEI header if they will also have external, possibly canonical, metadata sources.

While at the Swinburne conference with John I happened to look over his shoulder while he was encoding, and I noticed his use of <relatedItem> in the <biblStruct>: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-relatedItem.html. You may have come across this already in your break-out meetings, but at a quick glance, the application seems to fit our needs: <relatedItem> contains or references some other bibliographic item which is related to the present one in some specified manner, for example as a constituent or alternative version of it. John was using it in conjunction with an embedded <note> to provide additional context. <ptr> and <ref> are also valid within <relatedItem>. (Mdalmau)


 * this element is designed to reference a bibliographic item, not another metadata record or a single piece of metadata. So I don't think it's quite right. (Kshawkin)

Metadata Working Group
Per discussion at the SIG on Libraries meeting in Ann Arbor (2009-11), Syd put out a call for members of a working group to look at the relationship between the TEI header and other sources of metadata. This group will formulate recommendations for the TEI Council and/or for the text of the Best Practices. However, as decided a few times in the past, release of the Best Practices will not be held up for this group's work to finish because of the large scope of the work.

Proposal from IDP project meeting
http://listserv.brown.edu/archives/cgi-bin/wa?A2=ind1106&L=TEI-L&T=0&F=&S=&P=15216

extent element
Instead of

<extent>viii, [7]-215 p 20 cm.</extent>

we might use elements from the msDescription module and do it this way:

<extent>viii, [7]-215 p <height quantity="20" unit="cm"/></extent>

Had a width been given for this item in the catalog record, in the header it would have been:

<extent>viii, [7]-215 p <width quantity="30" unit="cm"/> <height quantity="20" unit="cm"/></extent>

appInfo and application
Consider inclusion of these elements in the header. Lou wrote, "There is at least one proposal forthcoming for further work on defining the scope and usage of these elements". Wait for these proposals to make their way into P5 before revisiting this question.

children of editorialDecl
Per a change in P5, editorialDecl can now have mixed content of p elements and specialized elements. Revisit our decision to put all content into p elements and come up with recommended uses for the specialized elements.
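
To make the question concrete, here is a minimal sketch of an editorialDecl that mixes a general paragraph with one specialized element. The choice of hyphenation and the wording of both paragraphs are assumptions for illustration, not a proposed recommendation.

```xml
<!-- Sketch only: hyphenation is one of several specialized children
     (others include correction, normalization and quotation). -->
<editorialDecl>
  <p>Text was keyed from page images and proofread once.</p>
  <hyphenation eol="none">
    <p>Soft end-of-line hyphens have been removed and the words rejoined.</p>
  </hyphenation>
</editorialDecl>
```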

indicating interviewers and interviewees
Section “Level 4 Oral History” currently recommends that the speaking participants in the interview be identified as authors of the document, or in profileDesc/particDesc/list/item/name. In the example encoding, interviewees and interviewers are differentiated only as text inside the given <item>. This strikes me as sloppy. It may well be appropriate to permit this encoding, in case someone is digitizing hundreds of interviews and has OCRed metadata that lists the participants this way. But certainly a best practice would make use of <listPerson>, and explicitly indicate "interviewer", "interviewee", "thirdParty" (or whatever) on the role= of <person>, no? (Syd)
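
A minimal sketch of the listPerson approach suggested here. The role values are the ones named in the comment above; the xml:id values and the personal names are placeholders.

```xml
<!-- Sketch only: names and identifiers are invented for illustration. -->
<profileDesc>
  <particDesc>
    <listPerson>
      <person xml:id="spk1" role="interviewer">
        <persName>Jane Doe</persName>
      </person>
      <person xml:id="spk2" role="interviewee">
        <persName>John Roe</persName>
      </person>
    </listPerson>
  </particDesc>
</profileDesc>
```

Utterances in the transcript could then point at these entries with who="#spk1" and who="#spk2", which keeps the role information in one place.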

encoding <pb>s within <note>s
Give guidance on encoding of <pb/>s within <note>s. Should these be encoded or omitted, with the <note> element appearing in the XML "within" the page on which it began? (Note that the BP allows for local practice in gathering all notes for a given section to a div at the end of the section.)

If encoding such <pb/>s will be optional or required, there will generally be two instances of <pb n="X"/> every time a note crosses a page boundary. Should we recommend use of @sameAs or @corresp to indicate that you are encoding the same page break twice? See this thread on TEI-L.
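
To show what the doubled page break would look like, here is a sketch that uses @sameAs on the copy inside the note. Which of the two breaks carries the xml:id and which points to it is an arbitrary choice made for this example, and the page numbers and identifiers are invented.

```xml
<!-- Sketch only: the note's pb points at the pb in the running text. -->
<p>Main text on page 12, with a note attached.<note place="bottom" xml:id="n1">The note
    begins on page 12 <pb n="13" sameAs="#pb13"/> and ends at the foot of page 13.</note>
  The main text continues on page 12
  <pb n="13" xml:id="pb13"/>
  and carries on to page 13.</p>
```

Using @corresp instead would only change the attribute name; the open question is which of the two the BP should recommend, if either.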

bibliographies and other lists of works cited
We give no guidance on encoding bibliographies, lists of works cited, lists of references, and other such things in documents. In TEI, you would use listBibl for this. We need to decide:


 * 1) Whether to include this in the BP and at which level
 * 2) Whether to use bibl, biblStruct, or biblFull inside the list.  (I strongly argue for bibl for faithfully representing source documents.)
 * 3) Since listBibl can include a head, the listBibl element could be a sibling of a div in TEI.  However, this would go against the BP, which says to use divs for everything.  So should listBibl go inside a div?  If so, should the head be a child of listBibl (as in TEI) or of the div (to be consistent with other parts of the BP)?

(Kshawkin 12:45, 14 July 2011 (EDT))
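
For discussion of point 3, this sketch shows one of the options: listBibl inside a div, with the head as a child of the div. The type value and the entries are placeholders, and nothing here settles which arrangement the BP should adopt.

```xml
<!-- Sketch only: option with head on the div, consistent with other BP divs. -->
<div type="bibliography">
  <head>Works Cited</head>
  <listBibl>
    <bibl>Author, A. <title>Title of the first work</title>. Place, Year.</bibl>
    <bibl>Author, B. <title>Title of the second work</title>. Place, Year.</bibl>
  </listBibl>
</div>
```

The alternative is to drop the wrapping div and move the head inside listBibl, as plain TEI allows.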

providing stylesheets
providing stylesheets for header-MARC and header-MODS


 * From MARC to TEI Header
 * Black Mesa Technologies has undertaken this work for us (gratis)!
 * From Tite to Level 3.5 (Syd recommendation)

providing tools
providing tools for workflows

mention RDA
The section "Determining Data Values for the TEI Header" mentions AACR2 and ISBD(ER). Perhaps also mention RDA?

how to record non-ASCII characters
People often want to know what to do with non-ASCII characters. There are generally four options:


 * insert them as they are (and assume the files will be manipulated by Unicode-aware software and that people can distinguish these from similar characters)
 * insert decimal numeric character references
 * insert hexadecimal numeric character references
 * insert mnemonic entity references (which need to be declared)

We should give advice on this.
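
The four options can be compared side by side in a small stand-alone document. This is a sketch only: it uses a bare list outside the TEI namespace, and the word and the entity name eacute are chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE list [
  <!-- The mnemonic form only works if the entity is declared,
       here in the internal DTD subset. -->
  <!ENTITY eacute "&#xE9;">
]>
<list>
  <item>café</item>          <!-- 1: the character itself (U+00E9) -->
  <item>caf&#233;</item>     <!-- 2: decimal numeric character reference -->
  <item>caf&#xE9;</item>     <!-- 3: hexadecimal numeric character reference -->
  <item>caf&eacute;</item>   <!-- 4: mnemonic entity reference -->
</list>
```

All four items parse to the same string, so the choice mainly affects how readable and portable the source file is for people and tools that handle it before parsing.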

add <distinct> to Level 4
Consider adding distinct to Level 4.