Minutes for June 2, 2009
Attendees: Syd, Kevin, Glen, Perry, Rich, Michelle
Note taker: Michelle
profileDesc/langUsage/language
- ident attribute is required for language codes
- should the element have content; schema allows empty element
- Kevin thinks it's redundant to have human readable language name
- In P5, meant to record language information not available by the tag itself; you might want to say something about colloquialisms, regionalisms, accents, etc.
- In P5, it is meant to precisely define percentage of languages represented within the document
- Use language as empty element with percentage of usage
- Talk with the TEI about changing how it's done
- Glen: gather language from a MARC record; but "content value" we could never get without getting an external look-up (thus support for the empty element)
- Glen: Useful for level 4 or 5 texts?
- If someone is creating the header by hand, they may want to annotate, so allow both (empty element and prose description)
- Kevin: Don't need to bother to provide content if ident value is sufficient
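To illustrate the empty-element option discussed above, here is a minimal Python sketch that reads a langUsage fragment in which each language element is empty and carries only the ident and usage (percentage) attributes. The sample values (en, fr, 90, 10) and the function name read_languages are invented for illustration, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical header fragment: empty language elements, no prose content.
# The TEI namespace is left out to keep the sketch short.
LANG_USAGE = """
<langUsage>
  <language ident="en" usage="90"/>
  <language ident="fr" usage="10"/>
</langUsage>
"""

def read_languages(xml_text):
    """Return (ident, usage or None) pairs; raise if ident is missing."""
    root = ET.fromstring(xml_text)
    result = []
    for lang in root.findall("language"):
        ident = lang.get("ident")
        if not ident:
            raise ValueError("language element is missing the ident attribute")
        usage = lang.get("usage")
        result.append((ident, int(usage) if usage is not None else None))
    return result

languages = read_languages(LANG_USAGE)
print(languages)
```

Because the ident value alone identifies the language, a human-readable name can be looked up externally when needed rather than stored in the element.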
idno in sourceDesc
- typo in idno description
- Stuart Yeats was wondering about local numbers
- have all sorts of identifiers here with three frequently used examples
- is it required?
Define Elements as Optional in Header
- Header annotations may need to clarify when elements are optional
- Issue may be forced when creating the ODD
- Syd will email Kevin when ODD work begins about what is optional and what is not
Workflow
- Revised rationale description of level 3 (thought the wording was ambiguous)
- We should review the level 3 rationale and workflow
Level 5
- Glen: Does level 5 need a workflow section?
- Glen removed the rationale sentence in level 5.
- No need to represent workflow for level 5
Relationship between level 3 and Tite
- Kevin explained the relationship between the best practices and Tite
- Perry Trolard explained that the revised Tite includes a few elements that are missing from level 3
- In Tite, title element, foreign, and genre specific (sp, speaker, stage), cols
- Making these elements optional would give wiggle room for mapping, but it makes level 3 less coherent
- Tite to level 4; Tite uses i for italics and b for bold; q for quoted material, no corr/sic/choice in level 4 so it's better to force Tite with level 3 than 4
- cols is a separate category; structural (mark column layout)
- Dan O'Donnell would like a cleaner mapping between the BP and Tite; suggested the TEI-C would too
- Glen: abstract beauty in Tite lining up with the best practices
- Level 3 is problematic historically
- Level 3 is structural, minus content analysis (or phrasal-level encoding)
- Kevin is leaning toward leaving things as they are
- Group: Leave level 3 as-is
- TEI Tite lives between level 3 and 4 (vast majority is structural)
- Kevin added this statement already; may revisit the reason why it's mostly structural
editor v. respStmt
- based on the TEI-L discussion with the "translator" example
- can use role attribute on editor to identify illustrators, translators, etc.
- editor element has role attribute and give suggested values (not a closed list)
- advantage of doing this: the headers are more machine readable (adds structure to the data)
- how to do a CV (controlled vocabulary) for the content of resp or respStmt (Syd thinks it's possible)
- In P5 guidelines, example values are at least established for editor/role
- We need to disassociate use of the element name, editor
- Syd: whatever constraints are placed on editor/role can also be applied with respStmt
- respStmt supports transcription on the page
- Syd and Kevin will continue by email
- Syd: It doesn't matter which one is used so long as we are consistent; Syd doesn't like the use of editor to represent translator and someone else (Perry?) agreed
- P5 has a loose interpretation of editor
- Alternatively, tighten up the editor element (make it less open than P5)
- Require use of attributes that are not required in P5
- Glen: Narrow editor and be more specific in respStmt for non-editors
- Kevin will make the change (with examples of what kinds of content goes in a respStmt)
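The two encodings compared in this discussion can be sketched as follows. This is only an illustration: the name "Jane Doe", the role value "translator", the resp phrase "translated by", and both helper function names are hypothetical, and the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET

def editor_with_role(name, role):
    """Option 1: an editor element whose role attribute names the responsibility."""
    el = ET.Element("editor", {"role": role})
    el.text = name
    return el

def resp_stmt(name, resp):
    """Option 2: a respStmt pairing a resp phrase with a name."""
    stmt = ET.Element("respStmt")
    ET.SubElement(stmt, "resp").text = resp
    ET.SubElement(stmt, "name").text = name
    return stmt

option_1 = ET.tostring(editor_with_role("Jane Doe", "translator"), encoding="unicode")
option_2 = ET.tostring(resp_stmt("Jane Doe", "translated by"), encoding="unicode")
print(option_1)  # <editor role="translator">Jane Doe</editor>
print(option_2)  # <respStmt><resp>translated by</resp><name>Jane Doe</name></respStmt>
```

Under the direction Glen suggested, option 1 would be reserved for actual editors and option 2 used for translators, illustrators, and other non-editors.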
Action Item Updates
- Always use level in title even if it can be inferred (in the Header)
- Kevin reworded the rhetorical questions (list of bulleted examples instead)
- ToC examples reworked to use ref instead of ptr
- ToC and list of illustrations are now optional
- added section on key/ref (pending email to Syd to make sure technical underpinnings are clear)
- local scheme clarification under General Rec is done under ref/key
- Kevin rewrote discussion of external metadata; done.
- expanded discussion on rend/rendition; rend for one case and rendition for the other
Hyphenation
Preserve hyphenation?
- Kevin emailed Syd and Perry last week and came up with a solution
- Kevin looked at Tite
- Perry thought Tite was making vendors do things with hyphens that they would have trouble doing, and that it raised automation issues
- How well can OCR software interpret hyphens?
- Proposal for levels 1-3: all hyphens are retained as in the print source, using the hyphen character (this cuts down on retrieval only if you have a dumb search algorithm)
- If converting from Tite to level 3, you throw out the Tite-encoded hyphens
- At level 4, disambiguate hyphens (suggest a method or two for dealing with soft hyphens; or throw away the hyphen and collapse the word)
- Rich: recommends that hyphens be disambiguated in level 3
- Syd: argues that the differentiation is structural at level 3
- Not sure what to do with level 3.
- Disambiguating hyphens should be optional at level 3
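The difference between retaining hyphens as printed (proposed for levels 1-3) and disambiguating them (level 4) can be sketched in Python as below. This is one possible method, not one the group settled on: it assumes a word list (the hypothetical known_words parameter) is available to decide whether an end-of-line hyphen is soft, and both function names are invented for illustration.

```python
def retain_hyphens(lines):
    """Levels 1-3 proposal: keep every hyphen and line break exactly as printed."""
    return "\n".join(lines)

def collapse_soft_hyphens(lines, known_words):
    """One possible level 4 method: drop an end-of-line hyphen when the
    rejoined word appears in known_words; otherwise keep the hyphen."""
    text = ""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if not text:
            text = line
        elif text.endswith("-"):
            head = text[:-1].split(" ")[-1]  # fragment before the hyphen
            tail = line.split(" ")[0]        # fragment starting the next line
            candidate = (head + tail).strip(".,;:!?").lower()
            if candidate in known_words:
                text = text[:-1] + line      # soft hyphen: collapse the word
            else:
                text = text + line           # hard hyphen: keep it
        else:
            text = text + " " + line
    return text

sample = ["The encod-", "ing of well-", "known texts"]
print(retain_hyphens(sample))
print(collapse_soft_hyphens(sample, {"encoding"}))
```

In the sample, "encod-" plus "ing" is rejoined because "encoding" is in the word list, while "well-known" keeps its hyphen.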
Recording decision in header
- An issue we didn't have a chance to discuss last week was how to record hyphenation information in the header. editorialDecl may contain one or more p elements or one or more various specialized elements, including hyphenation.
- Syd really doesn't like this content model and is willing to prescribe not following it.
- Kevin said we want to avoid this so that the Best Practices will qualify as a TEI customization.
- We decided to prescribe boilerplate text for a p element explaining the hyphenation decisions made. Kevin will make this and may create boilerplate text for the other things to be documented in editorialDecl. We hope that strict use of text like this will aid in machine processing in the future.
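As a rough sketch of how strictly worded boilerplate could aid machine processing, the Python below matches the text of each p in an editorialDecl against a table of known sentences. The boilerplate wording shown here is purely hypothetical (the actual text had not yet been written), as are the policy keywords and the function name; the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical boilerplate sentences mapped to hypothetical policy keywords.
HYPHENATION_BOILERPLATE = {
    "All hyphens are retained as they appear in the print source.": "retained",
    "End-of-line hyphens have been removed and the affected words rejoined.": "collapsed",
}

def hyphenation_policy(editorial_decl_xml):
    """Return the keyword for the first p matching known boilerplate, else None."""
    root = ET.fromstring(editorial_decl_xml)
    for p in root.findall("p"):
        # Normalize whitespace so line wrapping in the header does not matter.
        text = " ".join("".join(p.itertext()).split())
        if text in HYPHENATION_BOILERPLATE:
            return HYPHENATION_BOILERPLATE[text]
    return None

header = """<editorialDecl>
  <p>All hyphens are retained as they appear
     in the print source.</p>
</editorialDecl>"""
policy = hyphenation_policy(header)
print(policy)
```

Exact matching only works if encoders copy the prescribed text verbatim, which is why the wording would need to be fixed in the Best Practices.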
More Action Items Updates
- Copyeditors and MARC mappers now credited
Timeline
- Kevin said he wants to make the final revisions ASAP so the MARC mapping can begin. He also hopes to have another student do copyediting at the same time, but he expects this won't affect the MARC mapping.
- Kevin: According to our schedule, Syd will do ODDs from July 1 to August 3. Is that okay?
- Syd said he won't be back from vacation till July 5, so we can back things up till then.
- Kevin adjusted the timeline.
More Conference Calls?
- Kevin said he didn't foresee any more calls unless issues arise. Syd suggested scheduling a call in the future just in case, since we can always cancel. After discussion, Kevin said he'd propose something for late June: this should give the catalogers time to discover issues but still leave them time to fix things by July 5.
- Thanks all around to everyone for good work. We'll be in touch as things develop.