Talk:Best Practices for TEI in Libraries
@type and @key on persName and orgName
In the best practices document, the author element is described as follows:
One or more author elements (one name per element) are used to encode the name for the personal author or corporate body responsible for the creation of the source document, even if this creator is not the main entry in the catalog record. Use <persName> or <orgName> when applicable. Whenever possible, establish or use the form of the name from a national name authority file.
Since the forms of names used in name authority files have a rigid form that doesn't look like a name in the TEI sense (strings like "Welles, Gideon, 1802-1878" offend the sensibilities of XML folks), during the Raleigh meeting we decided that name authority records given in the header should have a type attribute, similar to that used on <title>. So on fileDesc/titleStmt/author and fileDesc/sourceDesc/biblStruct/monogr/author the following values of @type would be allowed:
marc100 marc110
We would not allow marc111 or marc130 because these MARC fields, while used for main entries in cataloging, are not authors in the TEI sense. As explained in the description of the author element, this element should be used for personal authors or corporate bodies, not necessarily main entries.
We also decided to recommend use of @key, as in the "Level 4 Name Tagging" section, to reference authority file records. In "Level 4 Name Tagging", it says, "the key attribute points to the unique key in the database table or, as with the ref attribute, the key attribute can point to the xml:id value in the external file".
However, @key does not take an IDREF data type (as it was called in the old days), and once I tried to create examples, I realized we don't want to be in the business of adding a <taxonomy> for each authority record referenced elsewhere in the header. So I think what we want is this:
<author><persName type="marc100" key="lccn-n78-95332">Shakespeare, William, 1564-1616</persName></author>
<author><orgName type="marc110" key="lccn-n50-63455">National Organization for Women</orgName></author>
<author>(unknown)</author>
plus this elsewhere in the header:
<taxonomy xml:id="lccn"><bibl>Library of Congress Control Number</bibl></taxonomy>
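To show how the two pieces might sit together, here is a minimal sketch of a header. It assumes that the "lccn-" prefix in @key is read, by local convention only, as naming the taxonomy whose xml:id is "lccn". Placing the taxonomy under encodingDesc/classDecl and the placeholder title are illustrative choices, not something we have agreed on.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <!-- Placeholder title, for illustration only -->
      <title>(title of the work)</title>
      <author>
        <!-- @type names the MARC field; @key carries prefix + authority record number -->
        <persName type="marc100" key="lccn-n78-95332">Shakespeare, William, 1564-1616</persName>
      </author>
    </titleStmt>
    <!-- publicationStmt and sourceDesc omitted for brevity -->
  </fileDesc>
  <encodingDesc>
    <classDecl>
      <!-- Declared once per authority source, not once per authority record -->
      <taxonomy xml:id="lccn">
        <bibl>Library of Congress Control Number</bibl>
      </taxonomy>
    </classDecl>
  </encodingDesc>
</teiHeader>
```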
Sound right?
The TEI Header and Other Metadata Schemas
I added a paragraph to the section called The TEI Header and Other Metadata Schemas explaining why you can't link to outside metadata the way you'd like to. Does it sound right to everyone?
rendition in header?
I removed the mention of rendition ladders and replaced the recommendation and examples with CSS. Should we add the rendition element to the header? P5 talks about using this element to define local styles in terms of CSS, but we want to put CSS right inline. So is there any point in using the rendition element in the header?
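For comparison, a minimal sketch of the header-based approach: P5 lets a rendition element inside tagsDecl hold a CSS declaration, which the body then references through @rendition. The xml:id value and the sample sentence are made up for illustration.

```xml
<!-- In the header: encodingDesc/tagsDecl -->
<tagsDecl>
  <rendition xml:id="r-italic" scheme="css">font-style: italic;</rendition>
</tagsDecl>

<!-- In the body: point at the declaration instead of repeating the CSS -->
<p>The ship <hi rendition="#r-italic">Monitor</hi> sailed at dawn.</p>
```

The trade-off is an extra level of indirection in exchange for not repeating the same CSS string at every occurrence.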
hyphenation
One of the comments we received on the Best Practices draft was a request to recommend a specific way of handling end-of-line (and end-of-page) hyphenation within the body. We have not discussed this at all in the Best Practices!
We should add a new section under "General Recommendations" explaining that end-of-line hyphens should be encoded as in Tite (to ensure that texts can be converted to Level 3 automatically). See http://www.tei-c.org/release/doc/tei-p5-exemplars/html/tei_tite.doc.html#e-o-l . End-of-page hyphens should be handled the same way, with the pb tag intervening, e.g., obfus{U+00AD}<pb n="33" facs="00000037.tif"/>cation
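A sketch of both cases, assuming the soft hyphen is entered as the numeric character reference &#xAD; so that it stays visible in the source. The surrounding sentence is invented, and the lb element in the first paragraph is an assumption that only applies to texts that record line breaks.

```xml
<!-- End-of-line hyphen: soft hyphen (U+00AD) before the line break -->
<p>a deliberate obfus&#xAD;<lb/>cation of the facts</p>

<!-- End-of-page hyphen: same character, with the pb element intervening -->
<p>a deliberate obfus&#xAD;<pb n="33" facs="00000037.tif"/>cation of the facts</p>
```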
P5 offers the hyphenation element in the header to document your method. We could give the following in the header element recommendations:
<hyphenation eol="hard"><p>End-of-line hyphenation silently removed where appropriate.</p></hyphenation>
However, if we use this element, we would need to find another way to encode all of the prose description that is currently in p elements inside the editorialDecl. What should we do?
For the record, if I weren't trying to ensure this compatibility, I would have distinguished three cases:
- hyphenation that occurs at a line or page break but would never occur normally: use U+00AD
- hyphenation that occurs at a line or page break but might have occurred normally: use U+002D
- hyphenation that should always be present, such as "re-creation" (to create again, as opposed to leisure activity): use U+2011
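The three cases can be sketched side by side as follows. The characters are written as numeric character references so they can be told apart in the source; "well-known" is a made-up example for the second case, and the lb elements assume line breaks are recorded.

```xml
<!-- Case 1: hyphen exists only because of the break: U+00AD SOFT HYPHEN -->
<p>obfus&#xAD;<lb/>cation</p>

<!-- Case 2: hyphen at a break that might also occur mid-line: U+002D HYPHEN-MINUS -->
<p>well-<lb/>known</p>

<!-- Case 3: hyphen that must always be present: U+2011 NON-BREAKING HYPHEN -->
<p>re&#x2011;creation</p>
```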
[pw]: Two points. 1) I'm not sure I see the distinction between the last two examples. Why would you use two different characters? 2) For the first example, particularly for end-of-line in prose where line breaks aren't otherwise recorded, why include this at all? It would mess up searching, at least in some systems (e.g., DLXS).
profileDesc/langUsage/language as empty element?
I added the language element to the header. The ident= attribute is required, but the element may be empty (despite no such examples given in P5). It seems entirely redundant to me to have content for this element. Should we prescribe having no content for it (and just a value for ident=)?
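A sketch of the two forms in question, using "en" as an example value for ident:

```xml
<profileDesc>
  <langUsage>
    <!-- Empty element: the language is carried by @ident alone -->
    <language ident="en"/>
    <!-- With content, the same information is stated twice: -->
    <!-- <language ident="en">English</language> -->
  </langUsage>
</profileDesc>
```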
idno in sourceDesc
I substantially rewrote the description of the idno element in sourceDesc. Does this sound right to everyone? In Raleigh people said they wanted examples of local identifiers, but I don't see where this adds anything useful. Thoughts?
workflow descriptions and slightly revised rationales
I have created a "Workflow" section after the "Rationale" section for Levels 1-4, expanding on the brief description of workflows that we already had in place. I also revised some rationales, especially Level 3. Everything look okay?
Level 3 and Tite
I've explained the relationship between the Best Practices and Tite vis-a-vis Level 3 and Level 4. As Tite explains, there are a few elements missing from Level 3 which are included in Tite. Should we just include these in Level 3 so that elements in the body of each map exactly?