Talk:Best Practices for TEI in Libraries

From TEIWiki
Revision as of 21:03, 14 June 2010 by Mdalmau (talk | contribs) (Add Tite as Level 3.5: Status update; moved draft text to main document)

The following are revisions to make to the BP before making an official "release". There is a separate list of Future changes to Best Practices for TEI in Libraries.

Test ODDs and schemas derived from them

Test Syd's ODDs and schemas derived from them: http://bauman.zapto.org/~syd/temp/BestPractices/ . Just go to that URL, download the .rng files, and create a new XML document based on the schema. See if it allows you to insert all the elements you expect to be able to insert. Syd has been asked to make the following changes:

  • in header ODD, allow only a structured <publicationStmt>
  • lib1.rng: <oXygen/> says "Errors encountered: Probably no start pattern found".
  • The only allowed child of front, body, or back *at any level* should be a div.
  • note should not be allowed in Level 1 or Level 2
  • ab should be the only child allowed of any div (in both Level 1 and Level 2). This element seems to be missing from the schema.
  • floatingText is missing in Level 3 or Level 4 schemas.
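
As a point of comparison when testing, here is a minimal sketch of the body structure that the constraints above would permit at Level 1 or Level 2, assuming the final schemas allow nothing else there. The type value and the text content are placeholders, not taken from the Best Practices.

```xml
<text>
  <body>
    <!-- div is the only allowed child of front, body, or back -->
    <div type="volume">
      <!-- ab is the only allowed child of div; note is not allowed at Level 1 or 2 -->
      <ab>Placeholder text of the volume.</ab>
    </div>
  </body>
</text>
```

If a schema rejects a fragment like this, or accepts a <p> or <note> in place of the <ab>, that is worth reporting to Syd.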

Possible additional tweaks to the ODDs based on email threads. Need to determine whether to move to Future changes to Best Practices for TEI in Libraries:

  • Use of attribute(s) on sourceDesc/biblStruct/monogr/imprint/date is now required, and with attribute values mapping to the Dates fixed fields (per email sent by Kevin on 1/16/2010)
  • normalizing AACR2 dates for machine processing, to be shared with Thutmose project (per email sent by Kevin on 1/16/2010)
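
A rough sketch of what a required attribute on the imprint date might look like. The attributes when, notBefore, and notAfter are standard P5 dating attributes, but the mapping from the AACR2 date strings shown here is only an illustrative assumption, not the mapping agreed with the Thutmose project. Titles, places, and publishers are placeholders.

```xml
<sourceDesc>
  <biblStruct>
    <monogr>
      <title>Placeholder title</title>
      <imprint>
        <pubPlace>Placeholder place</pubPlace>
        <publisher>Placeholder publisher</publisher>
        <!-- Known single year: transcribed AACR2 form kept as content,
             normalized value carried in the attribute -->
        <date when="1865">[1865]</date>
      </imprint>
    </monogr>
  </biblStruct>
  <biblStruct>
    <monogr>
      <title>Another placeholder title</title>
      <imprint>
        <!-- Uncertain decade: one possible (assumed) normalization as a range -->
        <date notBefore="1860" notAfter="1869">[186-?]</date>
      </imprint>
    </monogr>
  </biblStruct>
</sourceDesc>
```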

Pending Review: Use of any P5 attributes

We need to figure out how much guidance we will give on use of attributes on elements within the body of a TEI document.

  • Prose changed to reflect recommended/required attributes in the body of a text
    • This text may change again after the list has been generated (it's purposefully vague)
    • May add an appendix of all attributes identified
  • Lisa is compiling a list of frequently used attributes, see List of attributes suggested to include in BPG, and those excluded
  • The group will need to review the list for completeness
  • Syd will constrain the ODDs accordingly

Resolved: Publication Statement

As of 5/26/2010, allow structured and unstructured publication statements, per Kevin: "In cataloging, I believe you don't state publication information if something is unpublished rather than saying something like "unpublished", but I don't have AACR2 with me at the moment to verify. However, we've been trying to give people the option of creating TEI headers conforming to the BP by hand (not from MARC source) in case they want to, and I agree that the TEI way to do this is with a <p> element. So I now favor allowing either the structured or unstructured publicationStmt in the ODD but saying in the prose that people should only use the unstructured one for a statement such as your example. Does this sound okay to others?"

  • Irrelevant since we are talking about fileDesc publicationStmt; stick with structured statement
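
For reference, the two forms under discussion look roughly like this in P5. All content values are placeholders, and the wording of the unstructured statement is hypothetical rather than the example Kevin was replying to.

```xml
<!-- Structured form: specific child elements -->
<publicationStmt>
  <publisher>Placeholder library or publisher</publisher>
  <pubPlace>Placeholder place</pubPlace>
  <date when="2010">2010</date>
  <availability>
    <p>Placeholder availability statement.</p>
  </availability>
</publicationStmt>

<!-- Unstructured form: prose only -->
<publicationStmt>
  <p>Placeholder prose statement about publication.</p>
</publicationStmt>
```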

Resolved: Direction of pointing between note references and notes themselves

Decide whether to change back to having <ref> point to <note> instead of <note> point to <ref>, as Syd recommended. See this ticket:

https://sourceforge.net/tracker/?func=detail&aid=2796148&group_id=106328&atid=644062

and this change to the Guidelines:

http://tei.svn.sourceforge.net/viewvc/tei/trunk/P5/Source/Guidelines/en/CO-CoreElements.xml?r1=6937&r2=6936&pathrev=6937

or, for the full story, see Kevin's email from Nov. 6 and previous quoted messages.
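
To make the two options concrete, here is a small sketch of each direction of pointing. The xml:id values and the note text are invented for illustration.

```xml
<!-- Option A: the reference points to the note -->
<ab>Some running text<ref target="#n1">1</ref> continues here.</ab>
<note xml:id="n1">Placeholder text of the note.</note>

<!-- Option B: the note points back to the reference -->
<ab>Some running text<ref xml:id="r1">1</ref> continues here.</ab>
<note target="#r1">Placeholder text of the note.</note>
```

In Option A the identifier lives on the note and the pointer on the reference; Option B reverses this. Where these elements may appear depends on the encoding level, since note is being excluded from Levels 1 and 2.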

Pending Review: meeting element

Decide whether to include <meeting> in sourceDesc/biblStruct/monogr/ and/or in titleStmt. (Per a change on 2010-01-15 in SourceForge, meeting is now allowed in titleStmt.) As Kevin discussed in an email sent on Oct. 12, the name of a meeting is usually included in a MARC record, but it's not distinguished from an author or editor in the same way TEI divides up the world. The essential question is: if you digitize a volume of conference proceedings, is the name of the meeting, as opposed to the title of the volume, really important enough to warrant inclusion in the TEI header? If so, we need to wrestle with the questions Kevin brought up on Oct. 11.

  • June 2, 2010 call: add meeting element to the header; Kevin will contact Renee about MARC mappings
    • I've added meeting to the header in fileDesc and sourceDesc and made it optional because if a conference is not named in the item being cataloged but could still be inferred or assumed, this name would not be included in the catalog record and therefore would not be available for inclusion in meeting. By making it optional, we won't require human intervention. (Kshawkin)
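
A minimal sketch of an optional meeting in titleStmt, relying on the 2010-01-15 change mentioned above. The title and conference name are placeholders.

```xml
<titleStmt>
  <title>Placeholder title of the proceedings volume</title>
  <!-- Optional: included only when the meeting is named in the catalog record -->
  <meeting>Placeholder name of the conference (placeholder place, placeholder year)</meeting>
</titleStmt>
```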

Resolved: appInfo and application

Decide whether to include <appInfo> and <application> in our header recommendations. In email discussions, Syd saw them as useful, but Lisa didn't think we need them.

There is at least one proposal forthcoming for further work on defining the scope and usage of these elements, which have not yet reached the degree of stability desirable for inclusion in a BP document, imho. (LouBurnard)
  • June 2, 2010 call: Due to the instability of these elements, we will table their inclusion for now and revisit after TEI Council determines a ruling :-).
  • Should I move this to "Future work" page so we don't lose sight?
    • I've written it up there. When we clean up this page, this section can be removed.

Add history of version 3 to Appendix A

Say that the text was written between April 2008 and ___ 2010 (the release date).

  • June 2, 2010 call: Michelle will draft

Draft of History Prose

In April 2008, select members from the TEI Consortium Libraries Special Interest Group (SIG) and the DLF-sponsored TEI Task Force partnered to update the best practices. The revision was prompted by the release of P5, the newest version of the TEI, and the desire to create a true library-centric customization of the TEI. The group convened for a DLF-sponsored meeting at the Spring Forum in Minneapolis, Minnesota to tackle the revision work. Those in attendance were:

  • Syd Bauman (Brown University)
  • Michelle Dalmau (Indiana University)
  • Matthew Gibson (University of Virginia)
  • Kevin Hawkins (University of Michigan)
  • Lisa McAulay (University of California, Los Angeles)
  • Chris Powell (University of Michigan)
  • Jenn Riley (Indiana University)
  • Andrew Rouner (Washington University in St. Louis)
  • Melanie Schlosser (Ohio State University)
  • Natasha Smith (University of North Carolina, Chapel Hill)
  • Perry Trolard (Washington University in St. Louis)
  • Perry Willett (then University of Michigan, now California Digital Library)
  • Glen Worthey (Stanford University)

Work continued through conference calls, in which Renee McBride (University of North Carolina, Chapel Hill) and Richard Wisneski (Case Western Reserve University) also participated, and at a DLF-sponsored meeting that took place as part of the DLF Spring Forum in Raleigh, North Carolina on May 6, 2009.

In April 2009, a year after the revision work began, the significantly revamped best practices soon to be known as "Best Practices for TEI in Libraries" (version 3) were disseminated for public comment. At DLF that year, a Birds-of-a-Feather session (http://www.diglib.org/forums/spring2009/2009springprogram.htm) entitled "TEI Text Encoding in Libraries" was held to gather in-person public feedback. Comments received at the in-person meeting, from the TEILIB-L listserv, through a survey, and by direct email were gathered and prioritized at the DLF meeting. Renee McBride (University of North Carolina, Chapel Hill) agreed to map header elements to MARC elements, and Vitus Tang (Stanford University) provided valuable comments. In addition to addressing most of the comments received, it was resolved that Syd Bauman will generate an ODD specification (One Document Does it All; schema, prose documentation, etc.) for levels 1-4, further ensuring interoperability of texts encoded according to these best practices.

The revised best practices contain updated versions of the widely adopted encoding 'levels' - from fully automated conversion to content analysis and scholarly encoding. They also contain a substantially revised section on the TEI Header, designed to support interoperability between text collections and the use of complementary metadata schemas such as MARC. They also explore the relationship between METS and TEI and the relationship between these best practices and the new vendor specification, TEI Tite.

The new best practices also reflect an organizational shift. Originally authored by the DLF-sponsored TEI Task Force, the current revision work is a partnership between members of the Task Force and the TEI SIG on Libraries. As a result of this partnership, responsibility for the best practices will migrate to the SIG, allowing closer work with the TEI Consortium as a whole, and a stronger basis for advocating for the needs of libraries in future TEI releases.

Add Tite as Level 3.5

  • Dependent on ongoing Tite revisions; need confirmation from Dan O'Donnell/Perry Trolard
  • Adding level 3.5 on hold
  • Updated clarification between Tite and the BP added to the main prose

This was strongly recommended by Daniel Pitti in Ann Arbor because he felt certain that administrators and funders would be confused about the difference between TEI Tite and the Best Practices ("don't the libraries already have a TEI customization?"); in fact, Kevin has known this same confusion to arise among TEI Council members. While we have a section of the BP discussing its relationship to Tite, by having a Level 3.5, we can be more explicit about mapping between the two.

Mapping clarification from Kevin: Instead of actually mapping elements, Daniel wanted us to simply proclaim use of Tite as one of a number of appropriate encoding levels for libraries.

Naturally we will not be able to describe Tite the way we do other levels -- by simply saying "all the elements in the previous levels, plus the following". Tite uses different element names of all sorts. There's no point in having Syd make an ODD for Tite since one already exists. So what Kevin envisions here is a sort of "sidebar" about Tite, inserted between Levels 3 and 4 that discusses Tite in a bit more detail than we currently have in the beginning of the BP, with particular discussion of mapping between the two.

We recently had some discussion about the merits of this, so maybe we won't do it in the end. But if we do, we'll need a draft of this new sidebar. Two paragraphs are already written for you (the brief discussion of the relationship between Tite and the BP), and you can pull more information from Tite's discussion of an earlier version of the Best Practices.

Would someone be willing to write a first draft of all of this?

Can we just use what's written here (rather than link to it) and modify it accordingly for our level 3.5? http://www.tei-c.org/release/doc/tei-p5-exemplars/html/tei_tite.doc.html#tei-in-lib-bpg Didn't Kevin write this anyway? If not, whose permission do we need?
I'm pretty sure Perry Trolard wrote this section. Shouldn't be a problem to use it. However, we should carefully check all the assertions since things have likely changed since Tite was last revised. Another round of Tite revisions is supposed to be forthcoming, so perhaps wait on this.

Revise section on hyphenation

Revise the section on hyphenation per outcome of the discussion on TEI-L and perhaps also on how this is handled in the ongoing Tite revisions.

  • Kevin and Syd seem to agree to follow the main P5 Guidelines
    • Is this dependent on Tite revisions or P5 Guidelines revisions (this section appeared under the Tite heading ...)?
      • No dependence on Tite unless they make drastic changes to their handling of hyphenation. Kevin is discussing revisions he's making to the BP with TEI Council to check that they match forthcoming changes to P5.

Acknowledgments

Tools Dev

Need to add Michael to this list and I think we should especially highlight Syd's ODD work so I propose we add the following section above the copy-editors:

The individuals who have contributed complementary tools to this document are:

  • Syd Bauman (Brown University): ODD specifications for levels 1-4
  • Michael Sperberg-McQueen (Black Mesa Technologies LLC): Thutmose stylesheets for MARC-to-TEI Header mappings
Great idea. (Kshawkin)

Editors

Do we want to highlight Kevin and Michelle as editors of this document? Who else?

I vote yes, to acknowledge these especially heroic efforts. (Gworthey)