Talk:Best Practices for TEI in Libraries
Introduction
1) Definition of level 5 encoding currently reads:
"The text is generated either through corrected OCR or keyboarding, but the tagging requires substantial human intervention by encoders with subject knowledge. "
I suggest instead:
"The text is generated either through corrected OCR or keyboarding, and the tagging requires substantial human intervention by encoders with subject knowledge."
because corrected OCR, keyboarding, and expert tagging ALL require substantial human intervention (though the first two, of course, don't require subject knowledge, and perhaps that is the point of the original phrasing).
2) "If a library uses TEI Tite to outsource its encoding, it should find conversion of TEI Tite files to be trivial: to Level 3 with some loss of granularity and to Level 4 with the addition of some markup, which amounts to minimal human intervention."
Should the colon after "trivial" be there?
2.9 General Guidelines for Attribute Usage
1) Since this isn't a comprehensive list of attributes (I don't think), why bother including the "xml:id" and "target" attributes if specific details about how libraries should use them are not actually included in this document? Is the documentation for these attributes considered important to these guidelines, but too extensive to replicate? How does this differ from the specific best practices given for other attributes listed here, like "n" or "rend"?
2) Under "key and ref":
"For example,
<author><persName type="marc100" key="lccn-n78-95332">Shakespeare, William, 1564-1616</persName></author>
gives a project-interal key (lccn-n78-95332) for this name in the Library of Congress Name Authority File. Values of key attributes may be partially explained in a non-machine-readable way through use of a taxonomy element: "
Should "project-interal" be "project-internal"? Or "project-integral"? Or something else?
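For what it's worth, here is a minimal sketch of how a processor would see the key value in the quoted example. It uses only Python's standard library; the function name read_key is my own and hypothetical, and the snippet is parsed without a TEI namespace, exactly as quoted. The value comes back as an opaque string, which fits the "project-internal" reading: nothing in the markup itself says how to interpret it.

```python
import xml.etree.ElementTree as ET

# The example element quoted above, copied as-is (no TEI namespace declared).
SNIPPET = (
    '<author><persName type="marc100" key="lccn-n78-95332">'
    'Shakespeare, William, 1564-1616</persName></author>'
)


def read_key(xml_text):
    """Return (key, type, name text) for the first persName child.

    Hypothetical helper for illustration only; it is not part of any
    TEI tool or of the guidelines under discussion.
    """
    root = ET.fromstring(xml_text)
    pers = root.find("persName")
    # The key is just a string to the parser; its meaning has to be
    # documented somewhere else in the project.
    return pers.get("key"), pers.get("type"), pers.text


key, kind, name = read_key(SNIPPET)
print(key, kind, name)
```

A real TEI file would declare the TEI namespace, so the lookup would need a namespace-qualified element name rather than a bare "persName".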
3) Under "rend and rendition":
"The rend and rendition attributes may be used when it is desirable to record information about how the content object was displayed in the source document. "
Is it meant to read "content object," or just "content," or even just "object?" Having both sounds strange to me, but perhaps it's TEI terminology with which I'm not familiar.
4.2 The TEI Header
1) Currently reads: "The TEI header is a metadata record that describes an electronic text encoded according to the TEI specification."
Since there are multiple levels of encoding (does this translate to multiple "specifications?"), should this read either
a) "...encoded according to a TEI specification" or b) "...encoded according to the TEI specifications"?
4.4 The TEI Header and Other Metadata Schemas
1) Currently reads:
"Unfortunately, there is currently no mechanism for specifying that the content of an element should be drawn from an outside metadata source or that it should supplement the content of the element"
To me, the "it" was confusing/ambiguous--I suggest instead:
"Unfortunately, there is currently no mechanism for specifying that the content of an element should be drawn from an outside metadata source or that outside metadata should supplement the content of the element"
This feels a little more redundant/wordy, perhaps, but it is clearer.