Minutes for June 2, 2009

June 2 TEI SIG Meeting

Syd, Kevin, Glen, Perry W., Rich. Michelle will take notes.

profileDesc/langUsage

 * You might want to say something about colloquialisms, accents, etc.
 * ident attribute is required for language codes
 * Should the element have content? The schema allows an empty element
 * Kevin thinks it's redundant to have a human-readable name
 * In P5, it is meant to record language information not available from the tag itself
 * In P5, it is also meant to precisely define the percentage of each language represented
 * Use language as an empty element with the percentage of usage
 * Talk with the TEI about changing how it's done
 * Glen: the language can be gathered from a MARC record, but the content is a value we could never get without an external look-up (thus support for the empty element)
 * Glen: Useful for level 4 or 5 text?
 * If someone is providing the header by hand, they may want to annotate, so allow for both (empty element and prose description)
 * Kevin: No need to provide content if the ident value is sufficient
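
The two forms discussed above (an empty language element versus one with a prose description) can be read by the same code. The following is a minimal sketch, not a recommended implementation: the sample header fragment, the function name describe_languages, and the LANGUAGE_NAMES table (standing in for the external look-up Glen mentioned) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical header fragment: one empty language element, one with prose.
SAMPLE = """<profileDesc xmlns="http://www.tei-c.org/ns/1.0">
  <langUsage>
    <language ident="en" usage="90"/>
    <language ident="la" usage="10">Latin, in quoted passages only</language>
  </langUsage>
</profileDesc>"""

# Hypothetical stand-in for an external code-to-name look-up.
LANGUAGE_NAMES = {"en": "English", "la": "Latin"}


def describe_languages(xml_text):
    """Return one dict per language element: ident, usage, name, note."""
    root = ET.fromstring(xml_text)
    rows = []
    for lang in root.findall(".//tei:language", TEI_NS):
        ident = lang.get("ident")
        usage = lang.get("usage")
        rows.append({
            "ident": ident,
            # usage is optional, so keep None when it is absent
            "usage": int(usage) if usage is not None else None,
            # the readable name comes from the look-up, not from the element
            "name": LANGUAGE_NAMES.get(ident, ident),
            # prose content, if the encoder chose to annotate
            "note": (lang.text or "").strip(),
        })
    return rows


if __name__ == "__main__":
    for row in describe_languages(SAMPLE):
        print(row)
```

In this sketch the empty element loses nothing that ident and usage do not already carry; the prose content only matters when the encoder adds an annotation by hand.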

idno in sourceDesc

 * typo in idno description
 * Stuart Yeats was wondering about local numbers
 * have all sorts of identifiers here with three frequently used examples
 * is it required?

Define Elements as Optional in Header

 * Header annotations may need to clarify when elements are optional
 * Issue may be forced when creating the ODD
 * Syd will email Kevin when ODD work begins about what is optional or not

Workflow

 * Revised rationale description of level 3 (thought the wording was ambiguous)

Level 5

 * Glen: Does level 5 need a workflow section?
 * Glen removed the rationale sentence in level 5.
 * No need to represent workflow for level 5
 * We should review the level 3 rationale and workflow

Relationship between level 3 and Tite

 * Kevin explained the relationship between the best practices and Tite
 * Perry Trolard explains that, in the revised Tite, a few elements exist in Tite but are missing from level 3
 * These include phrase-level elements (title, foreign), genre-specific elements (sp, speaker), and cols
 * By making them optional we can give them wiggle room; level 3 is less coherent
 * Tite to level 4: Tite uses i for italics, b for bold, and q for quoted material; no corr/sic/choice in level 4
 * cols is a separate category: structural (marks column layout)
 * Dan O'Donnell would like a cleaner mapping
 * Glen: there is an abstract beauty in Tite lining up with the best practices
 * Level 3 is problematic historically
 * Level 3 is structural, minus content analysis (or phrasal-level encoding)
 * Kevin is leaning toward leaving things as they are
 * Leave level 3 as-is
 * TEI Tite lives between level 3 and 4 (vast majority is structural)
 * Kevin added this statement already; may revisit the reason why it's mostly structural

editor v. respStmt

 * based on the TEI-L discussion with the "translator" example
 * can use role attribute on editor to identify illustrators, translators, etc.
 * The editor element has a role attribute, and we give suggested values (not a closed list)
 * The advantage of doing this: the headers are more machine-readable (adds structure to the data)
 * If everything is a respStmt, you have to look for the free text "illustrator"
 * How to do a controlled vocabulary (CV) for the content of resp or respStmt (Syd thinks it's possible)
 * In P5 guidelines, example values are at least established
 * We need to disassociate use of the element name
 * Syd: whatever constraints are to be placed on editor/role can also be done with respStmt
 * respStmt supports transcription on the page
 * Syd and Kevin will continue by email
 * Syd: It doesn't matter which one is used so long as we are consistent; Syd doesn't like the use of editor to represent translator
 * P5 has a loose interpretation of editor
 * Alternatively, tighten up the editor element (make it less open than P5)
 * More specific scope for editor element (?)
 * Require use of attributes that are not required in P5
 * Glen: Narrow editor and be more specific in respStmt for non-editors
 * Kevin will make the change (with examples of what kinds of content go in a respStmt)
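
The machine-readability point above can be illustrated with a short sketch. The sample titleStmt, the personal names, and both function names are hypothetical; the only thing taken from the discussion is the contrast between filtering on the role attribute of editor and searching free text inside resp.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical header fragment using both encodings side by side.
SAMPLE = """<titleStmt xmlns="http://www.tei-c.org/ns/1.0">
  <title>Sample title</title>
  <editor role="translator">A. Translator</editor>
  <respStmt>
    <resp>Illustrated by</resp>
    <name>B. Illustrator</name>
  </respStmt>
</titleStmt>"""


def names_by_editor_role(root, role):
    """Exact match on the role attribute of editor."""
    return [
        (e.text or "").strip()
        for e in root.findall(".//tei:editor", TEI_NS)
        if e.get("role") == role
    ]


def names_by_resp_text(root, keyword):
    """Case-insensitive substring match on the free text of resp."""
    names = []
    for stmt in root.findall(".//tei:respStmt", TEI_NS):
        resp_text = " ".join((r.text or "") for r in stmt.findall("tei:resp", TEI_NS))
        if keyword.lower() in resp_text.lower():
            names.extend((n.text or "").strip() for n in stmt.findall("tei:name", TEI_NS))
    return names


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(names_by_editor_role(root, "translator"))
    print(names_by_resp_text(root, "illustrator"))  # misses "Illustrated by"
    print(names_by_resp_text(root, "illustrat"))
```

The free-text search only succeeds when the keyword happens to match the wording an encoder chose, which is why a controlled vocabulary for resp would be needed to get the same reliability from respStmt.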

Action Item Issues

 * Kevin looked at Tite
 * Perry thought Tite was making vendors do things they would have trouble doing, and raised automation issues
 * How well can OCR software
 * Proposal for levels 1-3: all hyphens are retained as in the print source, using the hyphen character (this cuts down on retrieval if you have a dumb search algorithm)
 * Always use level in title even if it can be inferred (in the Header)
 * Kevin reworded the rhetorical questions (list of bulleted examples instead)
 * ToC examples reworked to use ref instead of ptr
 * ToC and list of illustrations are optional
 * Added a section on key/ref (pending email to Syd to make sure technical underpinnings are clear)
 * local scheme clarification under General Rec is done under ref/key
 * Kevin rewrote discussion of external metadata; done.
 * expanded discussion on rend/rendition; rend for one case and rendition for the other
 * Hyphenation: Kevin emailed Syd and Perry last week and came up with a solution
 * If converting from Tite to level 3, you throw out the Tite-encoded hyphens
 * At level 4, disambiguate hyphens (suggest a method or two for dealing with soft hyphens; or throw away the hyphen)
 * Rich: recommends that hyphens be disambiguated in level 3
 * Syd: argues that the differentiation is structural at level 3
 * Not sure what to do with level 3.
 * Disambiguating hyphens should be optional at level 3
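
The retrieval concern behind the hyphenation discussion can be shown with a small sketch. It is not the solution Kevin, Syd, and Perry arrived at, which these notes do not record; the sample text, the deliberately simple tokenizer, and the join_line_end_hyphens helper are all assumptions for illustration.

```python
import re

# Hypothetical transcription that keeps the end-of-line hyphen from the print source.
RETAINED = "The encod-\ning guidelines cover well-known cases."


def naive_tokens(text):
    """A deliberately simple tokenizer: lowercase, split on anything
    that is not a letter or a hyphen."""
    return [t for t in re.split(r"[^a-z-]+", text.lower()) if t]


def join_line_end_hyphens(text):
    """Assumption: every hyphen directly before a line break is a soft
    hyphen and can be dropped. This is wrong for hard hyphens that
    happen to fall at a line end, which is the disambiguation problem."""
    return re.sub(r"-\n\s*", "", text)


if __name__ == "__main__":
    print("encoding" in naive_tokens(RETAINED))
    print("encoding" in naive_tokens(join_line_end_hyphens(RETAINED)))
    print(join_line_end_hyphens("well-\nknown"))
```

With the hyphen retained, a search for "encoding" finds nothing; with every line-end hyphen dropped, "well-" followed by "known" collapses to "wellknown". Neither blanket rule is safe, which is why disambiguation is worth doing where the level allows it.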