Best Practices for TEI in Libraries
[http://www.tei-c.org/wiki/index.php/SIG:Libraries TEI in Libraries: Home]
See http://purl.oclc.org/NET/teiinlibraries for the latest published version of the Best Practices.
  
==Introduction and History==
See also [[Future changes to Best Practices for TEI in Libraries]].
    <p>The Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange (referred to as
 
    the <i>TEI Guidelines</i>) were first published in 1994 and represent a tremendous
 
    achievement in electronic text standards by providing a highly sophisticated structure for encoding
 
    electronic text. Digital librarians have benefited greatly from the standardization provided by these
 
    guidelines, and the potential for interoperability and long-term preservation of digital collections
 
    facilitated by their wide adoption.</p>
 
  
    <p>In 1998, the Digital Library Federation (DLF) sponsored the [http://www.lib.umich.edu/lit/dlps/history/teidlf/index.html TEI and XML in Digital Libraries Workshop] at the
 
    Library of Congress to discuss the use of the <i>TEI Guidelines</i> in libraries for electronic text,
 
    and to create a set of best practices for librarians implementing them. From this workshop, three working
 
    groups were formed, the members of which represented some of the largest and most mature digital library
 
    programs in the U.S. Group 2 was charged with developing a set of recommendations for libraries using the
 
    TEI Guidelines in electronic text encoding. This group included the following representatives from six libraries:
 
    </p>
 
<ul>
 
      <li>LeeEllen Friedland, Library of Congress</li>
 
      <li>Nancy Kushigian, University of California, Davis</li>
 
      <li>Christina Powell, University of Michigan</li>
 
      <li>David Seaman, University of Virginia</li>
 
      <li>Natalia Smith, University of North Carolina at Chapel Hill</li>
 
      <li>Perry Willett, Indiana University (chair)</li>
 
    </ul>
 
    <p>At the ALA Midwinter Meeting (January 1999), the DLF task force revised a draft set of best practices, called
 
    TEI Text Encoding in Libraries: Guidelines for Best Practices (referred to as <i>TEI in Libraries
 
      Guidelines</i>). The revised recommendations were circulated to the conference working group in May
 
    1999 and presented at the joint annual meeting of the Association for Computers and the Humanities and
 
    Association for Literary and Linguistic Computing in June 1999. [http://www.diglib.org/standards/tei-old.htm Version 1.0] was circulated for comments in
 
    August 1999. These guidelines were endorsed by the DLF, and have been used by many digital libraries,
 
    including those of the task force members, as a model for their own local best practices. Libraries,
 
    museums and end-users have benefitted from a set of best practices for electronic text in a number of
 
    ways, including better interoperability between electronic text collections, better documented practices
 
    among digital libraries, and a starting point for discussion of best practices with commercial publishers
 
    regarding electronic text creation.</p>
 
 
    <p>Written in 1998, this first iteration of <i>TEI in Libraries Guidelines</i> made no mention of XML,
 
    XSLT, or any of the other powerful tools that have now become common parlance and practice in creating
 
    digital documents and collections. Based on these important changes in markup technology, it came to the
 
    attention of the DLF and members of the original Task Force that the <i>TEI in Libraries Guidelines</i>
 
    required substantial revision. In 2002, the TEI Consortium published a new edition of the complete
 
      <i>TEI Guidelines</i> that conformed to XML specifications. In order to remain useful, the <i>TEI in
 
      Libraries Guidelines</i> had to be updated to reflect these developments.</p>
 
 
    <p>Furthermore, librarians need more guidance than the original <i>TEI in Libraries Guidelines</i>
 
    provided. There are many library-specific encoding issues which need to be addressed and documented to
 
    ensure consistency. The intention of this document is to provide recommended paths of encoding for these
 
    issues.</p>
 
 
    <p>In addition, these library guidelines have the potential to be much more useful if they can serve as a
 
    training document from which librarians can learn about text encoding and how to address particular encoding
 
    challenges. To fulfill this role, the guidelines require more examples and detailed explanations,
 
    giving documentation of the use of TEI in a library context. Librarians also need a set of standards
 
    and best practices for vendors and publishers who create electronic text for digital libraries, so that
 
    these collections adhere to the same archival standards as locally-created electronic text collections.
 
    With detailed guidelines that could serve as an encoding specification, librarians might encourage
 
    vendors to follow the principles in these standards, to facilitate the long-term preservation of
 
    commercially published electronic text collections, and more readily allow for cross-collection
 
    searching.</p>
 
 
    <p>In order to facilitate the evolution of this document, another DLF-sponsored Task Force&mdash;some
 
    of the representatives of which were on the original Task Force&mdash;met on October 24-25, 2003 at
 
    the Cosmos Club in Washington, D.C.: </p>
 
    <ul>
 
      <li>Richard Gartner, Oxford University Library</li>
 
      <li>Matthew Gibson, University of Virginia Library</li>
 
      <li>Kirk Hastings, California Digital Library</li>
 
      <li>Christina Powell, University of Michigan</li>
 
      <li>Merrilee Proffitt, RLG</li>
 
      <li>David Seaman, Digital Library Federation</li>
 
      <li>Natalia Smith, University of North Carolina at Chapel Hill</li>
 
      <li>Perry Willett, Indiana University (chair)</li>
 
    </ul>
 
    <p>These representatives met to revise the original <i>TEI in Libraries Guidelines</i> in order
 
    that they: </p>
 
    <ul>
 
      <li>reflect changes occurring within the text encoding world generally and within the TEI
 
      community specifically</li>
 
      <li>further illuminate the different levels of encoding by offering clearer and more robust
 
      examples.</li>
 
    </ul>
 
    <p>After producing [http://www.diglib.org/standards/tei2/tei20.htm Version 2.0] of the Guidelines, this group (with some changes in membership) met again
 
    at the Cosmos Club on February 13-14, 2006. Those in attendance were: </p>
 
    <ul>
 
      <li>Syd Bauman, The TEI Consortium</li>
 
      <li>Richard Gartner, Oxford University Library (by phone)</li>
 
      <li>Matthew Gibson, Virginia Foundation for the Humanities (chair)</li>
 
      <li>Merrilee Proffitt, RLG</li>
 
      <li>Chris Powell, The University of Michigan</li>
 
      <li>David Seaman, Digital Library Federation</li>
 
      <li>Natasha Smith, University of North Carolina at Chapel Hill</li>
 
      <li>Perry Willett, The University of Michigan</li>
 
    </ul>
 
    <p>This group continues to meet, and based upon their discussions and resolutions, the following
 
    guidelines will continue to be updated and enhanced.</p>
 
 
==General Overview and Comments==
 
 
    <p>These recommendations are for libraries using the XML version of the TEILite DTD (teixlite).
 
    There are many different library text digitization projects, for different purposes. With this in mind,
 
    the Task Force has attempted to make these recommendations as inclusive as possible by developing a
 
    series of encoding levels. These levels are meant to allow for a range of practice, from wholly automated
 
    text creation and encoding, to encoding that requires expert content knowledge, analysis, and editing.</p>
 
 
    <p>Recommendations for Levels 1-4 are intended for projects wishing to create encoded electronic text with
 
    structural markup, but minimal semantic or content markup. Also, the encoding levels are cumulative:
 
    encoding requirements at each level incorporate the requirements of lower levels. Levels 1-4 allow the
 
    conversion and encoding of texts to be performed without the assistance of deep content knowledge and can
 
    be enriched with more markup at any time. Level 5, in contrast, requires scholarly analysis. </p>
 
 
    <p>These recommendations are concerned with the text portion of a TEI-encoded document. While there are
 
    modest requirements for including certain information about encoding level in the TEI header, a separate
 
    set of recommendations, now integrated into this document, was developed to address the relationship of TEI header
 
    contents to MARC-format bibliographic data (see the [[#The TEI Header|&lt;teiHeader&gt; section]]).</p>
 
 
==General Recommendations==
 
 
    <p>Note: all recommendations that follow are based on P4:2004, with the exception of the facs= attribute of &lt;pb>, which comes from the P5 module for the transcription of primary sources ('transcr').</p>
 
 
    <ul>
 
    <li>The encoding level (as described in this document) should be recorded in the
 
      &lt;editorialDecl&gt; in the TEI header, along with an explanation of any deviation from the
 
      recommendations.</li>
 
 
    <li>Electronic text at all levels of encoding should begin with the transcription of the first word on
 
      the first leaf of the original work. It may be impractical or undesirable to transcribe and encode
 
      certain features of the text, such as publisher's advertisements or indexes, but if at all possible,
 
      they should be included as links to page images. Any omissions of material found in the original work
 
      should be noted in the &lt;editorialDecl&gt; in the TEI header.</li>
 
    <li>File naming should follow ISO 9660 conventions: 8-character filenames, 3-character extensions,
 
      using A-Z, a-z, 0-9, underscores and hyphens. The rationale behind this suggestion is that when moving
 
      texts across different platforms (DOS for instance), some systems will truncate beyond the eighth
 
      character.</li>
 
    <li>We recommend the use of numbered divisions throughout the electronic text, always beginning with
 
      &lt;div1&gt; (we prefer that &lt;div0&gt; not be used since the TEI Guidelines do not
 
      make it available in &lt;front&gt; or &lt;back&gt; matter). Numbered divisions present
 
      advantages to search and indexing software by explicitly communicating the hierarchical level of the
 
      section described. Texts at all levels should include at least one &lt;div1&gt;.</li>
 
    <li>Page breaks should be encoded using the &lt;pb&gt; element,
 
which should mark the top of a page (i.e., the text of page
 
seven should immediately follow &lt;pb n="7"/&gt;), and should
 
always be contained within a division. E.g., a page break that
 
occurs between chapters 2 and 3 should be encoded near the top
 
of the &lt;div> that holds chapter 3 (rather than near the bottom
 
of the &lt;div> that holds chapter 2).</li>
 
  </ul>
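
    <p>The file-naming and page-break recommendations above can be checked mechanically. The following Python sketch
    is one possible approach and is not part of these recommendations. The function names is_iso9660_name and
    page_breaks_outside_divisions are hypothetical. The sketch assumes that the naming rule means a base name of one to
    eight characters plus an extension of one to three characters, and that only numbered divisions
    (&lt;div1&gt; through &lt;div7&gt;) count as divisions.</p>

```python
import re
import xml.etree.ElementTree as ET

# Assumed reading of the rule: base name of 1-8 characters, a dot, and an
# extension of 1-3 characters, drawn from A-Z, a-z, 0-9, underscore, hyphen.
_NAME = re.compile(r"[A-Za-z0-9_-]{1,8}\.[A-Za-z0-9_-]{1,3}")


def is_iso9660_name(filename: str) -> bool:
    """Return True if filename fits the 8.3 pattern described above."""
    return _NAME.fullmatch(filename) is not None


def page_breaks_outside_divisions(tei_xml: str) -> list:
    """Return the n values of <pb> elements with no numbered <div> ancestor."""
    root = ET.fromstring(tei_xml)
    outside = []

    def walk(element, in_division):
        # Assumption: only div1..div7 are treated as divisions here.
        is_div = re.fullmatch(r"div[1-7]", element.tag) is not None
        if element.tag == "pb" and not in_division:
            outside.append(element.get("n"))
        for child in element:
            walk(child, in_division or is_div)

    walk(root, False)
    return outside


# Illustrative fragment: page 7 starts between two chapters and has been
# placed between the divisions instead of inside the chapter that follows.
sample = """<text><body>
<div1 type="chapter"><pb n="6"/><p>End of chapter 2.</p></div1>
<pb n="7"/>
<div1 type="chapter"><p>Start of chapter 3.</p></div1>
</body></text>"""

print(is_iso9660_name("bleak01.xml"))
print(page_breaks_outside_divisions(sample))
```

    <p>A page break reported by the second function would be moved to the top of the division that follows it.</p>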
 
 
==The TEI Header==
 
 
===Introduction===
 
At the TEI and XML in Digital Libraries Workshop that was held at the Library of Congress in July 1998, several working groups
 
were formed to consider various aspects of the Text Encoding Initiative.  Group 1 was charged to recommend some best practices
 
for TEI header content and to review the relationship between the Text Encoding Initiative header and MARC.  To this end,
 
representatives of the University of Virginia Library and the University of Michigan Library gathered in Ann Arbor in early
 
October 1998 to develop a recommended practice guide.  This work was assisted by similar efforts that had taken place in the United
 
Kingdom under the auspices of the Oxford Text Archive the previous year.
 
 
The following section is based on a draft of those recommended practices. It was submitted to various constituencies for
 
comment.  In mid-2008, Melanie Schlosser and Kevin Hawkins heavily revised this section for discussion by the TEI SIG on Libraries and the DLF TEI Task Force.
 
 
===Working Assumptions===
 
<p>A TEI header can serve many publics.  Headers can be created in a text center and reflect the center's standards, or they can
 
serve as the basis for other types of metadata system records produced by other agencies.  Headers can function in detached
 
form as records in a catalog, as a title page inherent to the document, or as a source for index displays.</p>
 
 
<p>In addition, a header may describe a collection of documents, a single item, or a portion of an item.  Variances in TEI header
 
content can result from making different choices of what is being described.</p>
 
 
<p>A TEI header may not have a one-to-one correspondence with a MARC record.  One TEI header may have multiple MARC analytic records,
 
or one MARC record may be used to describe a collection of TEI documents with individual headers.</p>
 
 
<p>A TEI header serves several purposes.  It may contain a history of how the file has been treated.  It can extend the
 
information of a classic catalog record.  The Text Center and/or cataloging agency can act as the gatekeeper for creators by
 
providing standards for content.</p>
 
 
===Cataloging a TEI Document===
 
 
<p>Does the TEI header act as the electronic title page or as a catalog record?  Is it integral to the document it describes or
 
independent?  Depending on the community being served, the TEI elements will reflect the interest of that community.  Nonetheless,
 
it is possible to describe a set of "best practices" that will produce compatible content while accommodating this variety of
 
purposes.  Compatibility of content encourages a more understandable set of results when information about assorted items is
 
displayed as a set of search results, a contents list, or an index, and it allows for more reasonable conversion of content
 
information from TEI elements to elements of other metadata sets when this action seems advisable.</p>
 
 
<p>It is a traditional practice of librarianship to agree upon where in a document and in what order of preference one should look
 
to identify the title, author, etc., of that document.  This permits a certain consistency in terminology and allows for a certain
 
amount of authentication of content.  We recommend the following preferences to those who create headers and to those who attempt
 
to use headers to create traditional catalog records that are compliant with AACR2 and ISBD(ER) rules.</p>
 
 
<p>As a member of the academic community, the header creator/editor has a responsibility to verify, whenever humanly possible, the
 
intellectual source for an electronic document that presents itself without any information regarding its source or authorship.</p>
 
 
====Chief Sources of Information for Several Types of Electronic Resources Are:====
 
<ul> <li>1. For an electronic document with a digitized title page (without a header), prefer
 
<ul>
 
<li>a) Chief source of information = information coded as title page</li>
 
<li>b) Use added information from an originating paper document if absolutely certain it is the source</li>
 
</ul>
 
 
</li>
 
<li>2.  Electronic document with header (without a title page)
 
<ul>
 
<li>a) Chief source of information = supplied and verified header <ref name="verify">Verified means that the cataloger/editor has established for him/herself that the information represented as title information is an accurate representation of content.</ref></li>
 
 
<li>b) Use information from paper document if absolutely certain it is the source</li>
 
</ul>
 
 
</li>
 
 
<li>3.  Electronic document with header and title page
 
<ul>
 
<li>a) Chief source of information = supplied header (if verified)<ref name="verify"/></li>
 
<li>b) If header is not verified, use title page as chief source</li>
 
<li>c) Use information from paper document if absolutely certain it is the source</li>
 
</ul>
 
 
</li>
 
<li>4.  If neither header nor title page is present and there is no evidence of a source document, the header creator
 
<ul>
 
<li>a) May assign a title and author if appropriate</li>
 
<li>b) Enclose the information in brackets, using the standard English language convention for editorial interjections</li>
 
</ul>
 
</li>
 
<li>5.  If neither header nor title page is present but the header creator has satisfactory evidence of an originating source, that
 
document should be used as the chief source of information for the title and author of the header.  If the source cannot be fully
 
verified as to edition, authorship, etc., this fact should be clearly indicated in a note in the &lt;fileDesc&gt;.
 
</li>
 
 
</ul>
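
<p>The order of preference above can be read as a small decision procedure. The Python sketch below is an
illustration only. The Document type, its field names, and the returned labels are hypothetical. One case is not
covered by the list (a header that cannot be verified and no title page); the sketch assumes such a document falls
through to cases 5 and 4.</p>

```python
from typing import NamedTuple


class Document(NamedTuple):
    has_header: bool
    header_verified: bool
    has_title_page: bool
    has_source_evidence: bool  # satisfactory evidence of an originating source


def chief_source(doc: Document) -> str:
    """Return a label for the chief source of information (cases 1-5 above)."""
    if doc.has_header and doc.header_verified:
        return "header"  # cases 2a and 3a
    if doc.has_title_page:
        return "title page"  # cases 1a and 3b
    # Assumption: an unverified header without a title page is treated
    # like a document that has neither.
    if doc.has_source_evidence:
        return "originating source"  # case 5
    return "supplied by header creator, in brackets"  # case 4


print(chief_source(Document(True, False, True, False)))
```

<p>In every case, information from an originating paper document may be added only if it is certain that the
paper document is the source.</p>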
 
 
 
===Element Recommendations for the &lt;teiHeader&gt;===
 
 
<table border="1" width="80%">
 
<tr>
 
<td colspan="1">&lt;teiHeader type="____"&gt;</td>
 
<td colspan="1">
 
Standards which apply to the header, e.g., &lt;teiHeader type="ISBD(ER)"&gt;, &lt;teiHeader
 
type="AACR2"&gt;
 
</td>
 
</tr>
 
</table>
 
 
===&lt;fileDesc&gt;===
 
<table border="1" width="80%">
 
 
<tr>
 
<td colspan="1">
 
&lt;title type="____"&gt;</td>
 
<td colspan="1">
 
Only uniform title and main title for the TEI document, not the source document, should be entered here, e.g., &lt;title type="uniform"&gt; or &lt;title type="main"&gt;.  See &lt;sourceDesc&gt; for other title forms for documents where a user might seek the documents under titles other than those assigned.  Where a title is supplied, the title should be enclosed in square brackets using standard English-language conventions for editorial insertion.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;author&gt;</td>
 
<td colspan="1">
 
Author of original source (electronic or print) should be entered into the &lt;author&gt; element before the &lt;respStmt&gt;. Whenever possible, establish or use the form of the name from a national name authority file.
 
</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;editor&gt;</td>
 
<td colspan="1">
 
Editor of original source (electronic or print) should be entered into the &lt;editor&gt; element <b>before</b> the &lt;respStmt&gt;. Whenever possible, establish or use the form of the name from a national name authority file.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;respStmt&gt;</td>
 
<td colspan="1">
 
The editor (also compiler, illustrator) of the TEI document, not the source document, should be entered here.  Whenever possible, establish or use the form of the name from a national name authority file.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;editionStmt&gt;</td>
 
 
<td colspan="1">
 
This element should be used sparingly as there are currently no standards as to when versions become editions. Users should refer to the TEI guidelines.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
 
&lt;extent&gt;</td>
 
<td colspan="1">
 
Use the standard text "ca.**** kilobytes".</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
 
&lt;publicationStmt&gt;</td>
 
<td colspan="1">
 
Use the child elements below rather than &lt;p&gt; for a prose description.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;publisher&gt;</td>
 
<td colspan="1">
 
The publisher is whoever has collected the TEI document, not the source document, and has made decisions concerning it.</td>
 
</tr>
 
 
<tr>
 
 
<td colspan="1">
 
 
&lt;distributor&gt;</td>
 
<td colspan="1">
 
The distributor is whoever makes the TEI document, not the source document, available.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;idno&gt;</td>
 
<td colspan="1">
 
Any unique identification number determined by the publisher. Use type="OCLC-___" to point to the catalog record for the TEI document, if applicable. The institution's three-letter MARC code follows "OCLC-" in the attribute value. Alternatively, use an ISO 15511 code in the type attribute to specify the holding institution.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;availability status="___"&gt;&lt;p&gt;</td>
 
<td colspan="1">
 
Describe any rights statement for the TEI document. Use a standard scheme, such as the code for a Creative Commons license, for the value of the status attribute if possible.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;date&gt;</td>
 
<td colspan="1">
 
Refers to the date of the first publication of the TEI document. <b>For most purposes, the year date (yyyy) will be adequate. If greater precision is required, enter dates as yyyymmdd.</b></td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;seriesStmt&gt;&lt;title&gt;</td>
 
 
<td colspan="1">
 
Whenever possible, establish or use the name from a national name authority file authorized for the electronic, locally created series.
 
</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;notesStmt&gt;</td>
 
 
<td colspan="1">
 
Optional, for indicating questionable attributions for title, author, etc.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;sourceDesc&gt;</td>
 
<td colspan="1">
 
In order to effectively represent the source(s) when many documents are represented by the TEI header, we see the need for structured elements that minimally allow us to identify parent-child and component relationships.  In the absence of these structures, we suggest that multiple source descriptions be employed with relationships described in free text. Relationships also could be useful in other portions of the TEI header. The cataloger may need to do research to establish the original source.</td>
 
</tr>
 
 
<tr>
 
 
<td colspan="1">
 
&lt;bibl&gt; or
 
&lt;biblFull&gt;</td>
 
<td colspan="1">
 
Use &lt;bibl&gt; with the child elements listed below if metadata for the source document is automatically generated from a MARC record. It is suggested that child elements appear in the following order for ease of display according to ISBD: &lt;author&gt;, &lt;title type="245a"&gt;, &lt;title type="245b"&gt;, &lt;respStmt&gt;, &lt;edition&gt;, &lt;pubPlace&gt;, &lt;publisher&gt;, &lt;date&gt;, &lt;extent&gt;, &lt;title type="series"&gt;, &lt;note&gt;, &lt;idno type="isbn-13"&gt;, &lt;idno type="isbn-10"&gt;, &lt;idno type="OCLC-___"&gt;. Use &lt;biblFull&gt; for an exhaustive description created by hand.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;title&gt;</td>
 
<td colspan="1">
 
Title of the source document. It is possible to have multiple &lt;title&gt; elements.  Alternative titles (cover, running, spine titles) should be entered in separate &lt;title&gt; elements in &lt;biblFull&gt;.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;author&gt;</td>
 
<td colspan="1">
 
Author of the source document. It is possible to have multiple &lt;author&gt; elements. Whenever possible, establish or use the form of the name from a national name authority file.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
 
&lt;editor&gt;</td>
 
<td colspan="1">
 
Editor of the source document. It is possible to have multiple &lt;editor&gt; elements.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
 
&lt;respStmt&gt;</td>
 
<td colspan="1">
 
If using &lt;bibl&gt; and generating from a MARC record, use this for the statement of responsibility (MARC field 245, subfield c).</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
 
&lt;editionStmt&gt;</td>
 
<td colspan="1">
 
Edition statement in the source document (if present).</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;extent&gt;</td>
 
 
<td colspan="1">
 
Enter physical description for the original source. If using &lt;bibl&gt; and generating from a MARC record, use this for MARC field 300.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;publisher&gt;</td>
 
<td colspan="1">
 
Don't repeat field.  Enter multiple publishers divided by semicolons.</td></tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;pubPlace&gt;</td>
 
<td colspan="1">
 
 
Don't repeat field.  Enter multiple publication places divided by semicolons.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;date&gt;</td>
 
 
<td colspan="1">
 
Imprint date for the original source.  '''For most purposes, the year date (yyyy) will be adequate. If greater precision is required, enter dates as yyyymmdd.'''</td>
 
</tr>
 
 
<tr>
 
 
<td colspan="1">
 
&lt;idno&gt;</td>
 
<td colspan="1">
 
In this location, &lt;idno&gt; refers to identification numbers for the source document.  They can be used to indicate the source's location in an individual institution's collection.  If a formal standard location system is being used, indicate the nature of the system, e.g.,  &lt;idno type="LC call number"&gt;. If using &lt;bibl&gt; and generating from a MARC record, use type="isbn-13" and type="isbn-10" if applicable. Use type="LC call number" (or appropriate abbreviation) for a call number. Use type="OCLC-___" to point to the catalog record for the source document. The institution’s three-letter MARC code follows "OCLC-" in the attribute value. Alternatively, use an ISO 15511 code in the type attribute to specify the holding institution.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;seriesStmt&gt; or &lt;series&gt;</td>
 
 
<td colspan="1">
 
Whenever possible, establish or use the form of the name from a national name authority file.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;note&gt;</td>
 
 
<td colspan="1">
 
Notes about the source document. If using &lt;bibl&gt; and generating from a MARC record, use for the value of MARC field 500.</td>
 
</tr>
 
</table>
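
<p>The fixed text forms recommended above for &lt;extent&gt; and &lt;date&gt; in the &lt;fileDesc&gt; are simple
to generate. The following Python sketch is illustrative only. The function names are hypothetical. It assumes a
kilobyte of 1024 bytes rounded to the nearest whole number, and a space after "ca."; the recommendation itself
specifies neither.</p>

```python
from datetime import date


def extent_statement(size_in_bytes: int) -> str:
    """Format a file size as the standard extent text 'ca. N kilobytes'.

    Assumption: 1 kilobyte = 1024 bytes, rounded, with a minimum of 1.
    """
    kilobytes = max(1, round(size_in_bytes / 1024))
    return f"ca. {kilobytes} kilobytes"


def header_date(d: date, full: bool = False) -> str:
    """Return yyyy by default, or yyyymmdd when greater precision is required."""
    if full:
        return f"{d.year:04d}{d.month:02d}{d.day:02d}"
    return f"{d.year:04d}"


print(extent_statement(250000))
print(header_date(date(2008, 6, 16)))
print(header_date(date(2008, 6, 16), full=True))
```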
 
 
===&lt;encodingDesc&gt;===
 
<table border="1" width="80%">
 
 
<tr>
 
<td colspan="1">
 
&lt;projectDesc&gt;</td>
 
<td colspan="1">Enter a description of the purpose for which the electronic file was encoded.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;editorialDecl&gt;</td>
 
<td colspan="1">
 
Enter general and specific statements about how the electronic file has been treated.  Record here editorial decisions made
 
during encoding.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;refsDecl&gt;</td>
 
<td colspan="1">
 
&lt;refsDecl&gt; could be used for administrative metadata, e.g., pagination and page sequencing.</td>
 
</tr>
 
 
<tr>
 
 
<td colspan="1">
 
 
&lt;classDecl&gt; &lt;taxonomy id="____"&gt; &lt;bibl&gt;</td>
 
<td colspan="1">
 
If used, identify the appropriate taxonomy definitions or descriptive sources in the &lt;taxonomy&gt; element followed by id, e.g.,
 
&lt;taxonomy id="LCSH"&gt; &lt;bibl&gt;Library of Congress Subject Headings&lt;/bibl&gt; &lt;/taxonomy&gt;, &lt;taxonomy id="AAT"&gt; &lt;bibl&gt;Art &amp; Architecture Thesaurus&lt;/bibl&gt; &lt;/taxonomy&gt;.</td>
 
</tr>
 
</table>
 
 
===&lt;profileDesc&gt;===
 
<table border="1" width="80%">
 
 
<tr>
 
<td colspan="1">
 
 
&lt;langUsage&gt; &lt;language id="____"&gt;</td>
 
<td colspan="1">
 
Use the ISO 639-2 standard (which is the same as the MARC language codes) for the id attribute. The content of the element should be the name of the language given in ISO 639-2.</td>
 
 
</tr>
 
 
 
<tr>
 
<td colspan="1">
 
&lt;classCode scheme="___"&gt;</td>
 
<td colspan="1">
 
True classification numbers as opposed to call numbers can be entered here.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;keywords&gt;<br />
 
 
&lt;term&gt;
 
</td>
 
<td colspan="1">
 
Use for uncontrolled terms.</td>
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;keywords scheme="____"&gt;<br />
 
&lt;term&gt;
 
</td>
 
<td colspan="1">
 
Use for controlled vocabulary as specified in &lt;encodingDesc&gt; taxonomy id. Example: scheme="LCSH"</td>
 
 
</tr>
 
</table>
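
<p>Because a controlled &lt;keywords&gt; element refers to a taxonomy declared in the &lt;encodingDesc&gt;, the
two can drift apart. The Python sketch below shows one way to find scheme values with no matching declaration. It is
an illustration, not a required check. The function name undeclared_schemes is hypothetical, and the sample header
is an incomplete fragment with invented subject terms.</p>

```python
import xml.etree.ElementTree as ET


def undeclared_schemes(header_xml: str) -> set:
    """Return scheme values on <keywords> with no matching <taxonomy id="...">."""
    header = ET.fromstring(header_xml)
    declared = {t.get("id") for t in header.iter("taxonomy")}
    used = {
        k.get("scheme")
        for k in header.iter("keywords")
        if k.get("scheme") is not None  # uncontrolled keywords carry no scheme
    }
    return used - declared


# Fragment for illustration only: <fileDesc> is omitted and the terms are invented.
sample_header = """<teiHeader>
  <encodingDesc>
    <classDecl>
      <taxonomy id="LCSH"><bibl>Library of Congress Subject Headings</bibl></taxonomy>
    </classDecl>
  </encodingDesc>
  <profileDesc>
    <langUsage><language id="eng">English</language></langUsage>
    <textClass>
      <keywords scheme="LCSH"><term>Whaling -- Fiction</term></keywords>
      <keywords scheme="AAT"><term>scrimshaws</term></keywords>
      <keywords><term>sea stories</term></keywords>
    </textClass>
  </profileDesc>
</teiHeader>"""

print(undeclared_schemes(sample_header))
```

<p>In this fragment the AAT scheme is used but never declared, so a &lt;taxonomy id="AAT"&gt; would need to be
added to the &lt;classDecl&gt;.</p>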
 
 
===&lt;revisionDesc&gt;===
 
<table border="1" width="80%">
 
 
<tr>
 
<td colspan="1">
 
&lt;change&gt;
 
</td>
 
<td colspan="1">
 
Create a &lt;change&gt; element to record each significant change to the TEI document.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;date&gt;
 
</td>
 
<td colspan="1">
 
The date of the change in ISO 8601 form (YYYY-MM-DD).</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;respStmt&gt; &lt;name&gt;
 
</td>
 
<td colspan="1">
 
Initials or other identifying information about the person or process making the change.</td>
 
 
</tr>
 
 
<tr>
 
<td colspan="1">
 
&lt;item&gt;
 
</td>
 
<td colspan="1">
 
Prose description of the change.</td>
 
 
</tr>
 
</table>
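
<p>A &lt;change&gt; entry with the four parts listed above can be produced programmatically. The following Python
sketch is one possible way to do so. The function name make_change and the sample values are hypothetical. Whether
&lt;respStmt&gt; also needs a &lt;resp&gt; child depends on the DTD in use and is not checked here.</p>

```python
import xml.etree.ElementTree as ET
from datetime import date


def make_change(when: date, who: str, what: str) -> ET.Element:
    """Build a <change> element with <date>, <respStmt><name>, and <item>."""
    change = ET.Element("change")
    # date.isoformat() yields the ISO 8601 form YYYY-MM-DD.
    ET.SubElement(change, "date").text = when.isoformat()
    resp_stmt = ET.SubElement(change, "respStmt")
    ET.SubElement(resp_stmt, "name").text = who
    ET.SubElement(change, "item").text = what
    return change


# Sample values are invented for illustration.
entry = make_change(date(2008, 6, 16), "ABC", "Corrected OCR errors in chapter 3.")
print(ET.tostring(entry, encoding="unicode"))
```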
 
 
===Minimal TEI Header Recommendation===
 
<blockquote>
 
<div class="indent">&lt;teiHeader&gt;</div>
 
<div class="lindent">&lt;fileDesc&gt;</div>
 
 
<div class="lindent2">&lt;titleStmt&gt;</div>
 
<div class="lindent3">&lt;title&gt;&lt;/title&gt;</div>
 
<div class="lindent3">&lt;author&gt;&lt;/author&gt;</div>
 
<div class="lindent2">&lt;/titleStmt&gt;
 
 
</div>
 
<div class="lindent2">&lt;publicationStmt&gt;
 
</div>
 
<div class="lindent3">&lt;publisher&gt;&lt;/publisher&gt;
 
 
</div>
 
<div class="lindent3">&lt;pubPlace&gt;&lt;/pubPlace&gt;
 
</div>
 
<div class="lindent3">&lt;idno&gt;&lt;/idno&gt;
 
</div>
 
<div class="lindent3">&lt;availability&gt;&lt;p&gt;&lt;/p&gt;&lt;/availability&gt;
 
</div>
 
<div class="lindent3">&lt;date&gt;&lt;/date&gt;
 
 
</div>
 
<div class="lindent2">&lt;/publicationStmt&gt;
 
 
</div>
 
<div class="lindent2">&lt;sourceDesc&gt;
 
</div>
 
<div class="lindent3">&lt;biblFull&gt;
 
</div>
 
<div class="lindent4">&lt;titleStmt&gt;
 
</div>
 
<div class="lindent5">&lt;title&gt;&lt;/title&gt;
 
 
</div>
 
<div class="lindent5">&lt;author&gt;&lt;/author&gt;
 
</div>
 
<div class="lindent4">&lt;/titleStmt&gt;
 
</div>
 
<div class="lindent4">&lt;publicationStmt&gt;
 
</div>
 
<div class="lindent5">&lt;publisher&gt;&lt;/publisher&gt;
 
</div>
 
<div class="lindent5">&lt;pubPlace&gt;&lt;/pubPlace&gt;
 
 
</div>
 
<div class="lindent5">&lt;date&gt;&lt;/date&gt;
 
</div>
 
<div class="lindent4">&lt;/publicationStmt&gt;
 
</div>
 
<div class="lindent3">&lt;/biblFull&gt;
 
</div>
 
<div class="lindent2">&lt;/sourceDesc&gt;
 
</div>
 
<div class="lindent">&lt;/fileDesc&gt;
 
</div>
 
 
<div class="indent">&lt;/teiHeader&gt;
 
</div>
 
</blockquote>
 
 
Repeat the &lt;biblFull&gt; field, as appropriate, if there is more than one source for the electronic item. [http://www.diglib.org/standards/teiheaders.html See some examples in context …]
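
A header can be compared against the minimal recommendation above with a short script. The Python sketch below is an illustration under stated assumptions: the name missing_elements is hypothetical, the header is assumed to carry no XML namespace, and an element with no text content is treated as missing, which is stricter than the recommendation itself states.

```python
import xml.etree.ElementTree as ET

# Paths, relative to <teiHeader>, of the elements in the minimal header above.
REQUIRED_PATHS = [
    "fileDesc/titleStmt/title",
    "fileDesc/titleStmt/author",
    "fileDesc/publicationStmt/publisher",
    "fileDesc/publicationStmt/pubPlace",
    "fileDesc/publicationStmt/idno",
    "fileDesc/publicationStmt/availability/p",
    "fileDesc/publicationStmt/date",
    "fileDesc/sourceDesc/biblFull/titleStmt/title",
    "fileDesc/sourceDesc/biblFull/titleStmt/author",
    "fileDesc/sourceDesc/biblFull/publicationStmt/publisher",
    "fileDesc/sourceDesc/biblFull/publicationStmt/pubPlace",
    "fileDesc/sourceDesc/biblFull/publicationStmt/date",
]


def missing_elements(header_xml: str) -> list:
    """Return the required paths that are absent or have no text content."""
    header = ET.fromstring(header_xml)
    missing = []
    for path in REQUIRED_PATHS:
        element = header.find(path)
        # Assumption: an empty element counts as missing.
        if element is None or not "".join(element.itertext()).strip():
            missing.append(path)
    return missing
```

Only the first &lt;biblFull&gt; is examined, so a header that repeats &lt;biblFull&gt; for several sources would need a loop over each of them.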
 
 
===Recommended Additions to the teixlite DTD===
 
<ul>
 
<li>Add elements under &lt;name&gt; to allow for the variable ordering and alphabetization of names:
 
&lt;forename&gt;, &lt;surname&gt;, &lt;birthdate&gt;, &lt;deathdate&gt;, &lt;title&gt;, etc.</li>
 
 
<li>In order to effectively represent the source(s) when many documents are
 
represented by the TEI header, we see the need for structured elements that, minimally,
 
allow us to identify parent-child and component relationships.</li>
 
</ul>
 
 
 
===Acknowledgments and Bibliography===
 
<ul>
 
<li>
 
This guide was prepared by Judy Ahronheim, Thomas Champagne, Lynn Marko, Kelly Webster, and Chris Wilcox of the University of
 
Michigan Library and Jackie Shieh of the University of Virginia Library in October 1998.  The source documents were the cataloging
 
guides prepared by those two institutions ([http://www.lib.virginia.edu/cataloging/manual/chapters/chapxiib.html Virginia] and [http://www.lib.umich.edu/ts/sections/electronicresourcesdev.html Michigan]).  In addition,
 
documentation from the Oxford Text Archive, Arts and Humanities Data Service of the United Kingdom also was made available to assist
 
in this effort.
 
 
</li>
 
 
</ul>
 
 
==Encoding Levels==
 
===LEVEL 1: Fully Automated Conversion and Encoding===
 
 
'''Purpose''': To create electronic text with the primary purpose of keyword searching and linking to page images. The primary advantage in using the teixlite DTD at this level is that a TEI header is attached to the text file.
 
 
    <p>
 
'''Rationale''': The text is subordinate to the page image, and is not intended to
 
      stand alone as an electronic text (without page images).</p>
 
 
    <p>Texts at Level 1 can be created and encoded by fully automated means, using uncorrected OCR of page
 
      images ("dirty OCR"), by exporting from existing electronic text files, or even by including no text
 
      at all. Only those elements that are necessary to divide the text from the header and facilitate linking to
 
      page images are used. Encoding is performed automatically based on artifacts of the OCR or other
 
      document creation process (page breaks, for example) and metadata collected during the imaging or
 
      preparation process. This encoding is both minimal and reliable, and does not typically require
 
      extensive review of each page of each text.</p>
 
 
    <p>Level 1 texts are not intended to be adequate for textual analysis; they are more likely to be suited
 
      to the goals of a preservation unit or mass digitization initiative. Though their encoding is minimal,
 
      Level 1 texts are fully valid XML texts. In addition to taking advantage of the TEI header, using the
 
      teixlite DTD--with the consistency suggested by these guidelines--allows Level 1 texts to be compatible
 
      with more richly encoded teixlite texts (that also follow these guidelines) for
 
      searching, for example. Further encoding based on document structures or content analysis can be added
 
      to a Level 1 text at any time.</p>
 
 
    <p>
 
      '''Level 1 is most suitable for projects with the following characteristics''':
 
    </p>
 
 
    <ul>
 
      <li>a large volume of material is to be made available online quickly</li>
 
      <li>a digital image of each page is desired</li>
 
      <li>no manual intervention will be performed in the text creation process</li>
 
      <li>the material is of interest to a large community of users who wish to read texts that allow
 
      keyword searching</li>
 
      <li>sophisticated search and display capabilities based on the structure of the text are not necessary</li>
 
 
      <li>extensibility is desired; that is, one desires to keep open the option for a higher level of
 
      encoding to be added at a later date</li>
 
    </ul>
 
 
   
 
      <table border="1" width="80%">
 
      <tr>
 
        <td colspan="1">&lt;div1&gt;</td>
 
        <td colspan="1">If no type= attribute is specified, a type= value of
 
        "section" should be presumed.</td>
 
      </tr>
 
 
      <tr>
 
        <td colspan="1">&lt;p&gt;</td>
 
        <td colspan="1">At least one "container" element per div is required (while &lt;ab&gt; is another option for this case, the Task Force suggests using &lt;p&gt; in order that the document be open to being extended to other encoding levels).</td>
 
      </tr>
 
      <tr>
 
 
        <td colspan="1">&lt;pb&gt;</td>
 
        <td colspan="1">Required in Level 1. Page images can be linked to the text by specifying a jpeg or other image file as the value of the facs= attribute. Page numbers can be supplied with the n= attribute to record the number that is on the page. The Task Force sees the use of METS here as having a tremendous advantage. METS/TEI page turning documentation will be included in the near future.</td>
 
      </tr>
 
      </table>
 
   
 
====Basic Structural Example:====
 
<pre>
 
<TEI.2 id="someid">
 
  <teiHeader>
 
    [Source and processing information goes here]
 
  </teiHeader>
 
  <text>
 
    <body>
 
      <div1>
 
        <p>
 
          <pb id="p00000001" n="1"/>
 
          [main body of the unmarked up plain text begins here]
 
          <pb id="p00000002" n="2"/>
 
          [more plain text goes here with appropriate page breaks interspersed] ...
 
          <pb id="p00000145" n="145"/>
 
          [more plain text]
 
          <pb id="p00000146" n="146"/>
 
          [text ends here]
 
        </p>
 
      </div1>
 
    </body>
 
  </text>
 
</TEI.2>
 
</pre>
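
As an illustration of the linking described in the table above, a page break that both records the printed page number and points to its page image might be encoded as follows. This is a sketch only: the image file name is hypothetical, and it assumes that the DTD in use permits the facs= attribute on &lt;pb&gt;.

<pre>
<pb id="p00000001" n="1" facs="00000001.jpg"/>
</pre>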
 
 
[http://www.diglib.org/standards/level1example.html See an example in context…]
 
 
===LEVEL 2: Minimal Encoding===
 
'''Purpose''': To create electronic text for full-text searching, linking to page images, and identifying simple structural hierarchy to improve navigation.
 
 
'''Rationale''': The text is subordinate to the page image, though navigational markers (textual divisions, headings) are captured. The text could stand alone as electronic text (without page images) if the accuracy of its contents is suitable to its intended use and it is not necessary to display low-level typographic or structural information. Level 2 requires a set of elements more granular than those of Level 1, including bibliographic or structural information below the monographic or volume level.
 
 
Texts at Level 2 can be created and encoded by automated means, based on the typographic elements in the electronic file (for example, bold centered text at the top of the page surrounded by whitespace indicates a new chapter heading, and thus a new division), but such automated identification is unlikely to be fully reliable across a large body of material. Level 2 encoding therefore requires some human intervention to identify each textual division and heading. Level 2 texts do not require any special knowledge or manual intervention below the section level.
 
 
Level 2 texts are not intended to be displayed separately from their page images. Level 2 encoding of sections and headings provides greater navigational possibilities than Level 1 encoding, and enables searching to be restricted within particular textual divisions (for example, searching for two phrases within the same chapter).
 
 
'''Level 2 is most suitable for projects with the following characteristics''':
 
 
* a large volume of material is to be made available online quickly
 
* a digital image of each page is desired
 
* the material is of interest to a large community of users who wish to read texts that allow keyword searching
 
* rudimentary search and display capabilities based on the large structures of the text are desired
 
* each text will be checked to ensure that divisions and headers are properly identified
 
* extensibility is desired; that is, one desires to keep open the option for a higher level of encoding to be added at a later date
 
 
All elements specified in Level 1 plus the following:
 
 
<table border="1" width="80%">
 
<tr>
 
<td colspan="1">&lt;front&gt;, &lt;back&gt;</td>
 
<td colspan="1">Optional</td>
 
</tr>
 
<tr>
 
<td colspan="1">&lt;div1&gt;</td>
 
<td colspan="1">type="section" is the default attribute value. It is recommended that the n= attribute be
 
included to record the sequence of divisions.</td>
 
</tr>
 
<tr>
 
<td colspan="1">&lt;head&gt;</td>
 
<td colspan="1">Required if present</td>
 
</tr>
 
<tr>
 
<td colspan="1">&lt;p&gt;</td>
 
<td colspan="1">One "container" element per div is required.</td>
 
</tr>
 
</table>
 
 
====Basic Structural Example:====
 
 
<pre>
 
<TEI.2 id="someid">
 
  <teiHeader> [Source and processing information goes here] </teiHeader>
 
  <text id="someotherid">
 
    <front> [titlepage information, table of contents, prefaces, etc.] </front>
 
    <body>
 
      <div1 type="chapter" n="1">
 
        <head>Chapter 1</head>
 
        <p>[text of Chapter 1 goes here interspersed with <pb> elements pointing to page
 
        images]</p>
 
      </div1>
 
      <div1 type="chapter" n="2">
 
        <head>Chapter 2</head>
 
        <p>[text of Chapter 2 goes here interspersed with <pb> elements pointing to page
 
        images]</p>
 
      </div1>
 
      <div1 type="chapter" n="3">

        <head>Chapter 3</head>

        <p>[text of Chapter 3 goes here interspersed with <pb> elements pointing to page

        images]</p>

      </div1>

      <div1 type="chapter" n="4">

        <head>Chapter 4</head>

        <p>[text of Chapter 4 goes here interspersed with <pb> elements pointing to page

        images]</p>

      </div1>
 
    </body>
 
    <back> [optional text of backmatter, appendices, etc.] </back>
 
  </text>
 
</TEI.2>
 
</pre>
 
 
[http://www.diglib.org/standards/level2example.html See an example in context…]
 
 
===LEVEL 3: Simple Analysis===
 
 
    <p>
 
'''Purpose''': To create text that can stand alone as electronic text and identifies
 
      hierarchy and typography without content analysis being of primary importance.</p>
 
 
    <p>
 
'''Rationale''': Level 3 texts can be created from scratch or by the relatively easy
 
      conversion of existing HTML or word-processing documents. Encoding offers the advantage of the TEI
 
      header, interoperability with other TEI collections, and extensibility to higher levels of encoding.
 
      Level 3 generally requires some human editing, but the features to be encoded are determined by the
 
      appearance of the text and not specialized content analysis.</p>
 
 
    <p>Level 3 texts identify front and back matter, and all paragraph breaks. The finer granularity of
 
      encoding these features, as well as figures, notes, and all changes of typography, allows a range of
 
      options for display, delivery, and searching. For example, one has the option of identifying and,
 
      therefore, specifying the display characteristics of different typographic styles, and regularizing the
 
      display and placement of note text.</p>
 
 
    <p>Level 3 texts can stand alone as text without page images and, therefore, can be uploaded, downloaded
 
      and delivered quickly, and require less storage space than digital collections with page images.
 
      However, the simple level of structural analysis and absence of specialized content analysis reflected
 
      in Level 3 encoding may make it desirable for some, depending on project priorities, to include page
 
      images in order to provide users with a fuller set of resources.</p>
 
 
    <p>
 
      '''Level 3 is most suitable for projects with the following characteristics:'''
 
    </p>
 
 
    <ul>
 
      <li>the material is of interest to a large community of users who wish to read texts that allow
 
      keyword searching</li>
 
      <li>some sophistication of display, delivery, and searching based on structure of the text is desired</li>
 
      <li>each text will be checked to ensure that encoding decisions have been made appropriately</li>
 
      <li>the users of the texts may have limited storage or display capabilities</li>
 
      <li>the creator of the texts has limited or no ability to provide content expertise to analyze, tag,
 
      or review texts</li>
 
 
      <li>extensibility is desired; that is, one desires to keep open the option for a higher level of
 
      encoding to be added at a later date</li>
 
    </ul>
 
 
    <p>All elements specified in Levels 1 and 2, plus the following:</p>
 
 
   
 
      <table border="1" width="80%">
 
      <tr>
 
        <td colspan="1">&lt;front&gt;, &lt;back&gt;</td>
 
 
        <td colspan="1">Required if present.</td>
 
      </tr>
 
      <tr>
 
        <td colspan="1">&lt;p&gt;</td>
 
        <td colspan="1">Required for paragraph breaks in prose; may be used for stanzas using &lt;lb&gt; for
 
        line breaks in verse.</td>
 
      </tr>
 
 
      <tr>
 
        <td colspan="1">&lt;list&gt; and &lt;item&gt;</td>
 
        <td colspan="1">May be used in this level to indicate ordered and unordered list structures.</td>
 
      </tr>
 
      <tr>
 
        <td colspan="1">&lt;table&gt;, &lt;row&gt;, and &lt;cell&gt;</td>
 
 
        <td colspan="1">May be used to indicate table structures.</td>
 
      </tr>
 
      <tr>
 
        <td colspan="1">&lt;figure&gt;</td>
 
        <td colspan="1">Required to indicate figures other than page images.</td>
 
      </tr>
 
      <tr>
 
 
        <td colspan="1">&lt;hi&gt;</td>
 
        <td colspan="1">Required to indicate changes in typeface; rend attribute is optional.</td>
 
      </tr>
 
      <tr>
 
        <td colspan="1">&lt;note&gt;</td>
 
        <td colspan="1">All notes must be encoded. It is also recommended that notes that extend beyond one page be combined into one &lt;note&gt; element. Marginal notes, without reference, should occur at the beginning of the paragraph to which they refer, with the value of the place attribute as "margin".</td>
 
 
      </tr>
 
      <tr>
 
        <td colspan="1">&lt;lb&gt;</td>
 
        <td colspan="1">May be used to force line breaks.</td>
 
      </tr>
 
      </table>
 
   
 
 
    <p>NOTE ON &lt;note&gt;:</p>
 
 
    <p>It may be desirable to move footnotes from their original location in the text. If left at the bottom
 
      of a page, a note may become included in another paragraph or section of the encoded text, and thus
 
      separated from its reference. There are options for placement of footnotes if they are moved:</p>
 
 
    <ul>
 
      <li>Inline. The note is inserted at the point of reference. An n attribute records the value of the
 
      note reference if there is one.</li>
 
      <li>End-of-Division. The notes are moved to the end of the division in which they occur.</li>
 
    </ul>
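
    <p>The two placements might be encoded as sketched below. The note text and identifiers are
      invented for illustration, the place= values are assumptions, and the end-of-division form
      assumes the &lt;anchor&gt; and target= convention described under General Guidelines for
      Attribute Usage.</p>

    <p>Inline, with the note at the point of reference:</p>

<pre>
<p>[text before the reference]<note place="foot" n="1">[text of the note]</note>
[text after the reference]</p>
</pre>

    <p>End-of-Division, with an anchor marking the reference and the note following the last
      paragraph of the division:</p>

<pre>
<div1 type="chapter" n="1">
  <p>[text before the reference]<anchor id="n1" n="1"/> [text after the reference]</p>
  <note place="end" target="n1" n="1">[text of the note]</note>
</div1>
</pre>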
 
 
    <p>
 
 
      '''Basic Structural Example forthcoming ...'''
 
    </p>
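
    <p>Pending that example, the following is a minimal sketch of how a Level 3 text might be
      structured, assembled only from the elements listed in the table above. It is an illustration
      and not the Task Force's example; the identifiers, type= and rend= values, and placeholder
      content are assumptions.</p>

<pre>
<TEI.2 id="someid">
  <teiHeader> [Source and processing information goes here] </teiHeader>
  <text>
    <front>
      <div1 type="preface">
        <head>Preface</head>
        <p>[text of the preface]</p>
      </div1>
    </front>
    <body>
      <div1 type="chapter" n="1">
        <head>Chapter 1</head>
        <p>[first paragraph, with a <hi rend="font(bold)">typeface change</hi> and a
        reference<note place="foot" n="1">[text of the note]</note> recorded inline]</p>
        <p><figure><figDesc>[description of an illustration]</figDesc></figure></p>
        <list>
          <item>[first list item]</item>
          <item>[second list item]</item>
        </list>
      </div1>
    </body>
    <back>
      <div1 type="appendix">
        <head>Appendix</head>
        <p>[text of the appendix]</p>
      </div1>
    </back>
  </text>
</TEI.2>
</pre>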
 
   
 
===LEVEL 4: Basic Content Analysis===
 
<p>'''Purpose''': To create text that can stand alone as electronic text, identifies
 
      hierarchy and typography, specifies function of textual and structural elements, and describes the
 
      nature of the content and not merely its appearance. This level is not meant to encode or identify all
 
      structural, semantic or bibliographic features of the text.</p>
 
 
    <p>
 
'''Rationale''': Greater description of function and content allows for:</p>
 
 
    <ul>
 
      <li>flexibility of display and delivery</li>
 
      <li>sophisticated searching within specified textual and structural elements</li>
 
      <li>combining the broadest range of uses and audiences</li>
 
 
    </ul>
 
 
    <p>Texts encoded at Level 4 are able to stand alone as part of a library collection, and do not require
 
      page images in order for them to be read by students, scholars and general readers. This level of TEI
 
      encoding allows them to be displayed or printed in a variety of ways suitable for classroom or scholarly
 
      use.</p>
 
 
    <p>Level 4 texts contain elements and attributes that describe content. For example, lines of verse are
 
      tagged with &lt;l&gt;; the &lt;p&gt; element is reserved for true paragraphs. Features of
 
      the text that may contribute to meaning, such as indentation of verse lines and typographic change, are
 
      preserved. These are textual features that are not encoded at lower levels and that allow the text to be
 
      used and understood fully independent of images.</p>
 
 
    <p>The ability to stand alone as text means that Level 4 texts are more nimble and robust for exercises
 
      such as format repurposing and textual analysis.</p>
 
 
    <p>Finally, functionally accurate encoding in Level 4 texts
 
      allows them to be searched or displayed in sophisticated ways.
 
      For example, a searcher could limit his or her search in a
 
      dramatic text to stage directions or in a verse text to only
 
      first lines. In a political tract published by subscription, a
 
      search could be confined to names that appear in lists, thus
 
      limiting a search to names of people who subscribed to a
 
      particular volume. This ability to limit searches becomes more
 
      significant as textbases become larger, and thus is of great
 
      importance to the library community as it attempts to build into
 
      the initial design and implementation of textbases the features
 
      needed to enhance interoperability.</p>
 
 
    <p>
 
      '''Level 4 is most suitable for projects with the following characteristics''':
 
    </p>
 
 
    <ul>
 
      <li>sophisticated search and retrieval capabilities are desired</li>
 
 
      <li>the texts will be used for textual analysis</li>
 
      <li>extensibility is desired; that is, one desires to keep open the option for a higher level of
 
      encoding to be added by the scholarly community at a later date</li>
 
      <li>the users of the texts may have limited storage or display capabilities</li>
 
    </ul>
 
 
   
 
====General Level 4 Recommendations====
 
<ul>
 
<li>Typographically distinct text should be encoded as appropriate,
 
e.g. with &lt;term&gt;, &lt;q&gt;, &lt;gloss&gt;, &lt;mentioned&gt;, &lt;soCalled&gt;, &lt;foreign&gt;,
 
        &lt;title&gt;, or &lt;emph&gt;. Any ambiguous emphasized text should be
 
        encoded as &lt;hi&gt; (e.g. &lt;hi rend="bold"&gt;).</li>
 
      <li>It is recommended that the &lt;sic&gt; element be used to
 
        indicate typographic errors, with corrections (if any) noted
 
        as the value of the corr= attribute.</li>
 
      <li>&lt;titlePage&gt; should include the verso if present, divided by &lt;pb
 
        n="verso"/&gt;. Tables of contents, errata, subscription lists, "other titles by the same author"
 
        should be included in a separate numbered division, as a &lt;list&gt; with &lt;item&gt;s.
 
        Frontispieces should be encoded as a &lt;figure&gt;, within a separate numbered division and
 
        &lt;p&gt;.</li>
 
      </ul>
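
<p>For illustration, the first two recommendations might be applied as in the sketch below. The sentence is invented, and lang="lat" assumes the ISO 639-2 language codes recommended under General Guidelines for Attribute Usage.</p>

<pre>
<p>The <sic corr="political">politicall</sic> tract, printed <foreign lang="lat">anno
Domini</foreign> 1642, was reviewed in <title>[title of a periodical]</title> and is
<emph>not</emph> to be confused with the <hi rend="font(bold)">Second Edition</hi>.</p>
</pre>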
 
 
[http://www.diglib.org/standards/level4genrec.html See some examples in context…]
 
 
====Level 4 Prose====
 
<ul>
 
<li>Letters that occur within the text body provide some challenges. It is recommended that quoted
 
        letters that occur as part of a text (and not collections of letters themselves) be encoded within
 
        &lt;q&gt;&lt;text&gt;&lt;body&gt;&lt;div1 type="letter"&gt;, with
 
        &lt;opener&gt;, &lt;dateline&gt;, &lt;salute&gt;, &lt;signed&gt;,
 
        &lt;closer&gt; included as appropriate.</li>
 
 
      <li>Quotations that do not occur inline, but are set off typographically in some way, should be
 
        encoded as &lt;q&gt;.</li>
 
      <li>Notes are to be encoded as described in Level 3.</li>
 
      <li>&lt;argument&gt;, &lt;opener&gt;, &lt;epigraph&gt;,
 
        &lt;closer&gt;, &lt;trailer&gt;, &lt;add&gt;, &lt;del&gt;,
 
        &lt;unclear&gt; as appropriate.</li>
 
 
      </ul>
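
<p>A quoted letter embedded in a prose text might be encoded along the following lines. This is a sketch that follows the nesting recommended above; the placeholder content and the particular choice of opener and closer elements are illustrative assumptions.</p>

<pre>
<p>[narrative text introducing the letter]</p>
<q>
  <text>
    <body>
      <div1 type="letter">
        <opener>
          <dateline>[place and date of writing]</dateline>
          <salute>[salutation]</salute>
        </opener>
        <p>[body of the letter]</p>
        <closer>
          <salute>[complimentary close]</salute>
          <signed>[signature]</signed>
        </closer>
      </div1>
    </body>
  </text>
</q>
<p>[narrative text resumes]</p>
</pre>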
 
 
[http://www.diglib.org/standards/level4prose.html See some examples in context…]
 
 
====Level 4 Drama====
 
<ul>
 
      <li>Cast lists should be encoded as &lt;list&gt;s, with &lt;item&gt;s.</li>
 
 
      <li>Speeches are encoded as &lt;sp&gt;, with speakers identified within
 
        &lt;speaker&gt; elements; stage directions are encoded as &lt;stage&gt; and enclose
 
        block level content describing scenery, etc.</li>
 
      </ul>
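
<p>The following sketch shows how these recommendations might be combined in a short dramatic passage. The type= values and all placeholder content are assumptions made for illustration.</p>

<pre>
<div1 type="castlist">
  <head>[heading of the cast list]</head>
  <list>
    <item>[name and description of first character]</item>
    <item>[name and description of second character]</item>
  </list>
</div1>
<div1 type="act" n="1">
  <head>Act I</head>
  <stage>[description of the scenery]</stage>
  <sp>
    <speaker>[name of first speaker]</speaker>
    <p>[prose speech]</p>
  </sp>
  <stage>[entrance of a second character]</stage>
  <sp>
    <speaker>[name of second speaker]</speaker>
    <l>[first line of a verse speech]</l>
    <l>[second line of a verse speech]</l>
  </sp>
</div1>
</pre>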
 
 
 
[http://www.diglib.org/standards/level4drama.html See some examples in context…]
 
 
====Level 4 Oral History====
 
      <ul>
 
      <li>Speakers in interviews can be identified in the &lt;teiHeader&gt; in several ways.
 
<ul>
 
<li>In the &lt;profileDesc&gt;, in the &lt;particDesc&gt; in a &lt;list&gt;, with &lt;name&gt; inside of &lt;item&gt;s.</li>
 
 
<li>As a list of author &lt;name&gt;s within &lt;fileDesc&gt;&lt;titleStmt&gt;</li>
 
</ul>
 
</li>
 
<li>In either method, use an id= on the &lt;name&gt; element to uniquely identify the individual participant</li>
 
 
      <li>Questions and answers from interviewees and interviewers are encoded as &lt;sp&gt;,
 
with speakers identified within &lt;speaker&gt; elements with a who= attribute the value of which
 
corresponds to the id= in the list of interview participants.</li>
 
      </ul>
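
<p>A sketch of the first method follows, with the participant list in the header and the speeches pointing back to it. The identifiers and placeholder content are invented. The sketch places who= on &lt;sp&gt;, where the TEI Guidelines define that attribute; this is an assumption about the intended reading of the recommendation above.</p>

<p>In the TEI header:</p>

<pre>
<profileDesc>
  <particDesc>
    <list>
      <item><name id="int1">[name of interviewer]</name></item>
      <item><name id="resp1">[name of interviewee]</name></item>
    </list>
  </particDesc>
</profileDesc>
</pre>

<p>In the transcript:</p>

<pre>
<sp who="int1">
  <speaker>[name of interviewer]</speaker>
  <p>[text of the question]</p>
</sp>
<sp who="resp1">
  <speaker>[name of interviewee]</speaker>
  <p>[text of the answer]</p>
</sp>
</pre>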
 
[http://www.diglib.org/standards/level4oh.html See some examples in context…]
 
 
====Level 4 Verse====
 
      <ul>
 
      <li>All verse, even poems without separate stanzas or verse paragraphs, should be contained within a
 
        line group element &lt;lg&gt;. This will assist with automated processing and retrieval.</li>
 
      <li>It is common to see informal divisions within poems, marked by a string of asterisks or periods.

        These should be encoded as &lt;milestone/&gt; elements with unit="typography" and an n= value

        that records the characters used and how many times they occur, e.g. &lt;milestone unit="typography"

        n="******"/&gt;.</li>
 
      <li>For &lt;l&gt;, it is recommended that indentation be recorded and that the rend attribute be

        used to do this.</li>
 
      </ul>
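
<p>Taken together, the verse recommendations might produce markup like the sketch below. The type="stanza" value and the placeholder lines are assumptions; the indent() notation follows the rendition ladders described under General Guidelines for Attribute Usage.</p>

<pre>
<lg type="stanza">
  <l>[first line]</l>
  <l rend="indent(1)">[second line, indented one tab stop]</l>
  <l>[third line]</l>
  <l rend="indent(1)">[fourth line, indented one tab stop]</l>
</lg>
<milestone unit="typography" n="******"/>
<lg type="stanza">
  <l>[first line after the informal division]</l>
</lg>
</pre>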
 
[http://www.diglib.org/standards/level4verse.html See some examples in context…]
 
====Level 4 Front and Back Matter====
 
      <ul>
 
      <li>It is recommended that all prefaces, tables of contents, afterwords, appendices, endnotes and
 
        apparatus be encoded. For publisher's advertisements, indexes, and glossaries or other front or back
 
        matter that isn't considered of primary importance to the text, there are three options:<ul>
 
        <li>Fully transcribe and encode</li>
 
        <li>Link to page images (may include an unencoded transcription)</li>
 
        <li>Omit, and note the omission in &lt;editorialDecl&gt;</li>
 
        </ul>
 
</li>
 
      </ul>
 
[http://www.diglib.org/standards/level4fandb.html See some examples in context…]
 
 
===LEVEL 5: Scholarly Encoding Projects===
 
 
    <p>Level 5 texts are those that require subject knowledge, and encode semantic, linguistic, prosodic or
 
      other elements beyond a basic structural level.</p>
 
   
 
 
 
 
==General Guidelines for Attribute Usage==
 
 
    <p>Some general advice on the use of particular attributes
 
    follows.</p>
 
    <ul>
 
      <li><b>type=</b>: Constructing a list of acceptable attribute values
 
      for type that could find wide agreement is impossible. Instead,
 
      it is recommended that projects describe the type= attribute
 
      values used in their texts in the project ODD file or other
 
      documentation and that this list be made available to people
 
      using the texts. See <i>ABC for Book Collectors</i> by John
 
      Carter (7th edition, New Castle, DE:Oak Knoll Books, 1995) for
 
      a list of standard names and definitions of bibliographic
 
      features of printed books. For those elements where type is not
 
      required, such as &lt;head&gt; and &lt;title&gt;, use the
 
      attribute values for subtitles and additional titles, but not
 
      main titles.<br /> Example: &lt;div1 type="volume"&gt;</li>
 
 
      <li><b>n=</b>: Sometimes an n= (number) attribute can be used by itself, for instance in the case of

      page breaks:<br /> Example: &lt;pb n="456"/&gt;</li>
 
     
 
      <li><b>id=</b>: If an element must be uniquely identified so

      that it can be referenced from another specific location in

      one or more texts, use an id= attribute. The value of this

      attribute must be unique within a document, must be composed

      only of alphanumeric characters, dots, hyphens, and

      underscores, and must start with a

      letter.<br /> Example: &lt;note id="n5" n="5"&gt;</li>
 
 
      <li><b>target=</b>: follows the same syntactic rules as the id=
 
      attribute value. In fact, target= and id= are often used in
 
      conjunction with one another as in the case of footnotes where
 
      the &lt;anchor id="n5" n="5"&gt; is at a specific place in the
 
      text and is referred to by the &lt;note target="n5" n="5"&gt;
 
      which contains the actual information of the footnote itself
 
      elsewhere.</li>
 
 
      <li><b>entity=</b>: The entity= attribute is simply the way a

      figure indirectly points or refers to the actual image

      file.<br /> Example:

&lt;!ENTITY TwaFifrn SYSTEM  "TwaFifrn.jpg" NDATA jpeg&gt;<br />

&lt;!-- ... --&gt;<br />

&lt;figure entity="TwaFifrn"&gt;</li>
 
 
      <li><b>rend=</b>: The rend= attribute becomes difficult to use when it is desirable to record

      more than one rendition feature. With this in mind, it is recommended that projects employ the

      following adaptation of "rendition ladders", a concept developed at the [http://www.wwp.brown.edu/ Brown University Women Writers Project]. This system allows

multiple renditional features to be included in one rend= value. Rendition ladders consist of categories of

      renditional features, with the value of each feature enclosed in parentheses.<br /> rend= should only be

used to override a default value. For instance, if all text encoded as &lt;hi&gt; is defined as being

      rendered in italics, there is no reason to encode text as &lt;hi rend="font(italics)"&gt;. Combining renditional features would result in a tag with attributes such as &lt;l rend="font(italics)align(right)"&gt;. The categories and their values are:
 
      <ul>
 
      <li>font <br />italics, bold, fsc (full and smallcaps), smallcap, underlined, gothic</li>
 
      <li>align <br />right, left, center, block</li>
 
      <li>indent <br />Values in parentheses should indicate the number of tabstops to be indented, e.g.,
 
      &lt;l rend="indent(1)"&gt;</li>
 
</ul>
 
      </li>
 
      <li><b>lang=</b>: Use ISO 639-2 three-character language codes. Note that this recommendation differs slightly from that of the TEI P5 Guidelines, which recommend BCP 47 language tags.</li>
 
    </ul>
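
    <p>The attributes above can be combined, as in the following sketch. The identifiers, type=
      values, and placeholder text are invented for illustration; "fre" is the ISO 639-2
      bibliographic code for French.</p>

<pre>
<div1 type="chapter" n="3" id="ch3">
  <head>Chapter 3</head>
  <pb n="456"/>
  <p>[text before the reference]<anchor id="n5" n="5"/> [text continues with a phrase
  in <foreign lang="fre">[French phrase]</foreign>]</p>
  <lg>
    <l rend="font(italics)align(right)">[a line set in italics and aligned right]</l>
    <l rend="indent(2)">[a line indented two tab stops]</l>
  </lg>
  <note target="n5" n="5">[text of the footnote]</note>
</div1>
</pre>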
 
References: <references/>
 
 
[[Category:SIG:Libraries]]
 
[[Category:Customization]]

Latest revision as of 00:43, 14 May 2013
