|
|
(232 intermediate revisions by 8 users not shown) |
Line 1: |
Line 1: |
− | == Issues Pending ==
| + | The following are things to do to the BP before making an official "release". There is a separate list of [[Future changes to Best Practices for TEI in Libraries]]. |
| | | |
− | ===Throughout the Document=== | + | == Handling of hyphenation == |
| | | |
− | We also need to figure out what to do about the recommendation for lang=. That's a tougher issue, because it's not just a syntax change. Our guidelines for best practice are at odds with the TEI Guidelines and the IETF best current practices. [[User:Syd|Syd]] 2008-10-21T19:09Z.
| + | Syd will post to TEI-L asking how to distinguish "his-tory" and "run-on" when they break across lines. Both would take break="no", but how to allow for searching of "history" and "run-on"? |
| | | |
− | Syd, I think we agree we want to conform with TEI Guidelines and IETF best current practices ... Can you point out or edit one of the offending sections so that we can get back in line (I'm not sure where the problem surfaces or how to fix it). For me this is not a topic of debate ... Is there anyone who objects? [[User:Emcaulay|Emcaulay]] | + | : Syd re-read P5 and emailed Kevin that he thinks we can handle this without a wider call. Kevin agreed with Syd on a solution and made the revisions to main-driver.odd and level4.odd. Asked Syd to check over the changes to [https://github.com/sydb/TEI-in-Libraries/commit/31a8730b9d0af093099a18ab57078d4b8577c23b#BestPractices/main-driver.odd main-driver.odd] and [https://github.com/sydb/TEI-in-Libraries/commit/9e4f98caeda0446ebb3397f619c83b534ccd1fca#BestPractices/level4.odd level4.odd]. Then will update the SIG on our handling of this issue and on the final work on the BP as a whole. ([[User:Kshawkin|Kshawkin]] 11:28, 13 September 2011 (EDT)) |
| | | |
− | ==== Filenaming ([[Minutes_from_January_27%2C_2009|resolved on 2009-01-27]]) ====
| + | :: Syd said everything looked fine but made an additional suggestion, which Kevin [https://github.com/sydb/TEI-in-Libraries/commit/8c60fef9dd3b9641b878f16d0370b05d4b838856 just committed]. |
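To illustrate the question raised at the top of this section (this is only a sketch of the problem, not the solution Syd and Kevin committed): if both words are transcribed with a hyphen followed by <code>&lt;lb break="no"/&gt;</code>, a processor has to choose one joining rule, and neither rule yields both "history" and "run-on". The sample strings and function names below are hypothetical.

```python
import re

# Hypothetical transcriptions: in both, the line break falls right after a
# hyphen and is marked with <lb break="no"/>, as described in this section.
SAMPLES = {
    "soft hyphen": 'his-<lb break="no"/>tory',
    "hard hyphen": 'run-<lb break="no"/>on',
}

def join_keep_hyphen(text):
    """Join across the break and keep the hyphen that precedes it."""
    return re.sub(r'\s*<lb break="no"\s*/>\s*', "", text)

def join_drop_hyphen(text):
    """Join across the break and drop a hyphen that immediately precedes it."""
    return re.sub(r'-\s*<lb break="no"\s*/>\s*', "", text)

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, "|", join_keep_hyphen(text), "|", join_drop_hyphen(text))
```

Keeping the hyphen leaves "his-tory" unsearchable as "history"; dropping it turns "run-on" into "runon". So the encoding itself needs some way to tell the two cases apart.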
| | | |
− | {snippet from BPG text; comments about file naming}
| + | == Bug fixes == |
− | <span style="color: purple">[This recommendation also seems dated (and the standard is targeted for CD-ROM file naming). I think we should
| |
− | recommend a consistency in file naming according to respective digital object storage practices. For example, [http://wiki.dlib.indiana.edu/confluence/x/Sw8 IUDLP has guidelines in place] and perhaps we can mine the more general recommendations from there like only ASCII, no spaces, 3 letter extensions, etc. ([[User:Mdalmau|Mdalmau]])] </span> <span style="color: blue">sounds like a good idea to remove or revise; as is it seems weird. ([[User:emcaulay|emcaulay]])</span>
| |
− | <span style="color: #03C03C">I'll just point out that people still use CD-ROM as an archival storage medium (I'm looking at you, Chris) as well as a file transfer mechanism [pwillettt]</span>
| |
− | {end snippet}
| |
| | | |
− | The question isn’t whether anyone is still using CD-ROM; lots of folks probably are. The question is whether anyone is still using ISO 9660 (as opposed to UDF, ECMA-168, or ISO 13490) CD-ROMs '''without''' using Rock Ridge or Joliet extensions. Does anyone even know how to do that? -- [[User:Syd|Syd]]
| + | * Fix schema bug where list element is not allowed except as a child of p. |
| + | : Syd put model.inter back into model.common at Levels 3 and 4. This fixed the problem. <b>Syd wonders whether this change will have any side-effect; he will investigate further.</b> |
| | | |
− | <span style="color: purple">File naming is still an issue. Perry pointed out that some folks store TEI files on CD-ROM (makes sense). Perhaps it just needs to be teased out into guidelines for those who use CD-ROM for storage and more general file-naming guidelines for server storage/delivery, like:</span>
| + | == Copyediting == |
| | | |
− | Standardized file naming for a particular encoding project is key for reliable online storage and delivery of these files. Consider the following best practices when determining the file name scheme for your project:
| + | Change all instances of "must" to "should" and "required" to "recommended", and all existing "recommended" to "optional". All this is in accordance with RFC 2119 and with the BP's general policy of not requiring anything but simply being "best practices". Clarify that the ODDs require things just to encourage conformance. |
| + | : done ([[User:Kshawkin|Kshawkin]] 20:47, 5 September 2011 (EDT)) |
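One caution about the substitution above, sketched here in Python (the sample sentence and function names are hypothetical, and this is not necessarily how the change was actually made): if the three replacements are applied one after another, every "required" that has just become "recommended" is then turned into "optional" as well. Doing all three in a single pass avoids that.

```python
import re

# Mapping taken from the copyediting note above (lowercase forms only).
DOWNGRADE = {
    "must": "should",
    "required": "recommended",
    "recommended": "optional",
}

# One alternation, so each word in the input is looked at exactly once.
PATTERN = re.compile(r"\b(" + "|".join(DOWNGRADE) + r")\b")

def downgrade(text):
    """Replace all keywords in a single pass over the original text."""
    return PATTERN.sub(lambda m: DOWNGRADE[m.group(1)], text)

def downgrade_naively(text):
    """Sequential replacement: 'required' ends up as 'optional' (the pitfall)."""
    for old, new in DOWNGRADE.items():
        text = text.replace(old, new)
    return text

if __name__ == "__main__":
    sample = "The header is required; a date is recommended; IDs must be unique."
    print(downgrade(sample))
    print(downgrade_naively(sample))
```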
| | | |
− | * Each filename should contain an identifier that uniquely specifies a single digital object within the parent collection (e.g., a parent collection of text, images and other related materials)
| + | Move caveats before Level 1 Example to an appropriate place in main-driver.odd. |
− | * Each filename should be fully specified. It should not just be a sequence number that is dependent on location within a directory structure for context
| + | : done ([[User:Kshawkin|Kshawkin]] 20:47, 5 September 2011 (EDT)) |
− | * Filenames should not include spaces
| |
− | * Filenames should follow a predictable case convention (e.g., all lowercase, camelCase, etc.)
| |
− | * The first character of the filename should be an ASCII letter ('a' through 'z' or 'A' through 'Z') to comply with current restrictions on identifiers by many programming and metadata languages such as METS
| |
− | * The "base" filename may include only ASCII letters ('a' through 'z' and 'A' through 'Z'), ASCII digits ('0' through '9'), hyphens, underscores, and periods. Refrain from using other characters and limit period usage to only once (to separate base name from file extensions).
| |
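The mechanical parts of the list above (no spaces, leading ASCII letter, restricted character set, a single period) could be checked with something like the following Python sketch. The function name, messages, and sample filenames are made up for illustration; uniqueness within the collection and consistent case are project-level decisions that this does not attempt to check.

```python
import re

def check_filename(name):
    """Return a list of problems found in one filename (empty list = OK)."""
    problems = []
    if " " in name:
        problems.append("contains a space")
    if not re.match(r"[A-Za-z]", name):
        problems.append("does not start with an ASCII letter")
    if name.count(".") != 1:
        problems.append("does not contain exactly one period")
    # Spaces are reported separately above, so they are tolerated here.
    if re.search(r"[^A-Za-z0-9_. -]", name):
        problems.append("contains a character other than letters, digits, '-', '_', '.'")
    return problems

if __name__ == "__main__":
    for candidate in ["vac1234_001.xml", "my file.xml", "001.xml", "letter.v2.xml"]:
        print(candidate, check_filename(candidate))
```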
| | | |
− | For those saving files to CD-ROM for storage or file transfer, file naming should follow ISO 9660 conventions: 8-character filenames, 3-character extensions, using A-Z, a-z, 0-9, underscores and hyphens.
| + | Copyedit element recommendations at each level to avoid awkward and misleading syntax. |
| + | : Not finding anything in particular to fix. I think I was just tired when reading these in person with Syd. |
| | | |
− | ([[User:Mdalmau|Mdalmau]]) | + | Remove ODD markup in level specifications that Syd added from P5 so as not to include examples that contradict rest of Guidelines. |
| + | : Commented out gloss, desc, exemplum, remarks, and listRef in level1.odd for Syd to check over before I do in other levels. ([[User:Kshawkin|Kshawkin]] 20:47, 5 September 2011 (EDT)) |
| + | :: Syd said he un-commented out these elements. He'll review the 79 elements in the various ODDs to check for conflict with the BP prose. That way we can include them with our release. ([[User:Kshawkin|Kshawkin]] 10:46, 3 October 2011 (EDT)) |
| | | |
− | While I actually think my recommendation of 2008-12-03T11:57-05 (“I was wrong”) is syntactically slightly superior, it’s time to apply ''Syd’s wheel reinvention prevention convention'' in full force. The conventions MD refers to above (i.e., [http://wiki.dlib.indiana.edu/confluence/x/Sw8 IUDLP has guidelines in place]) are perfectly workable. We should just refer the reader there and be done with it. [[User:Syd|Syd]]
| + | Edit main-driver.odd to review metadata for the BP as a whole (in both the teiHeader and front elements): editors (stated more prominently than in the appendix), copyright, version number, etc. Check with Michelle on whether we should revise acknowledgment of DLF support. |
| | | |
− | I did steal these from the IU DLP guidelines, but I was selective because there's a lot of "Fedora" construct influencing our filenaming conventions. Wouldn't want the users to go there and feel overwhelmed. I selected the more "basic" factors for consideration, but certainly a pointer to DLP documentation or anywhere else could be helpful (as a footnote?). ([[User:Mdalmau|Mdalmau]])
| + | : In teiHeader, added Michelle as <editor> and copyedited name of SIG. Having been in touch with DLF recently, I'm pretty sure our statement of support is sufficient. |
| | | |
− | I hate to throw another comment into the mix on this issue, but I seem to recall reading somewhere that underscores are now discouraged in file naming, because they can be mistaken for blank spaces between words. If this is incorrect, my apologies. [[User:rwisnesk|rwisnesk]]
| + | : As last step before releasing, add <code><editionStmt><p>Version 3.0 (October 2011)</p></editionStmt></code> [[User:Kshawkin|Kshawkin]] 20:27, 29 September 2011 (EDT) |
| + | :: Done |
| | | |
− | : It's true about underscores: http://www.education.umd.edu/ETS/web/webNamingConventionRP.html .
| + | == Rendering better documentation from the ODDs == |
| | | |
− | I still wonder does anyone really need to worry about 8-character long filenames when burning modern CD-ROMs? [[User:Emcaulay|Emcaulay]]
| + | Tables should have borders on cells (or some clear path for indicating in ODD document that you want borders). |
| + | : Syd said this can be overridden by giving values for cssFile and/or cssSecondFile parameters to odd2html.xsl. Not sure how to "send" these through roma2. |
| + | :: Sebastian wrote in an email to use, e.g., <tt>--docflags="cssFile=foo.css"</tt> but warned that this was from memory ([[User:Kshawkin|Kshawkin]] 16:08, 22 August 2011 (EDT)) |
| | | |
− | ===The TEI Header===
| + | Omit exemplars from output — preferably just those from the P5 source, not the customization ODD file. |
| + | : Done. See the [https://github.com/sydb/TEI-in-Libraries/blob/master/BestPractices/odd2odd.xsl.patch stylesheet patch file]. |
| | | |
− | ====Inline comments====
| + | <code><editor></code>s are not showing up in HTML version of ODD: need to figure out how to make this happen. |
| | | |
− | There are a number of inline comments we need to address. Look for [[TEI_in_Libraries:_Guidelines_for_Best_Practices#The_TEI_Header|colored text in this section of the GBP]].
| + | : Done |
| | | |
− | ====Identifiers for outside metadata?====
| + | Check that editionStmt gets rendered in HTML version of ODD. |
| | | |
− | Should we have a place in the header to indicate an identifier for an outside metadata record for the item? Examples:
| + | : Done |
− | * record number for the source document in the local catalog
| |
− | * record number for the source document in WorldCat
| |
− | * record number for this TEI document in the local catalog
| |
− | * record number for this TEI document in WorldCat
| |
− | Having such a link would allow a delivery system to provide an unambiguous link to this full metadata without relying on matching other information in the header like a title, ISBN, or call number.
| |
− | ([[User:Kshawkin|Kshawkin]])
| |
| | | |
− | Yes, I think we should. How about the spot where the TEI Guidelines recommend putting the code for the classification of the text (in some scheme), <tt><classCode></tt> inside <tt><classDecl></tt>, or is that too much of a stretch? (—[[User:Syd|Syd]])
| + | == Documentation of ODD processing == |
| | | |
− | : During the call on 2/10/09, Syd said he no longer thinks use of <tt>classCode</tt> (and a corresponding <tt>classDecl</tt>) is a good idea. Instead, he suggested we create a new element, <tt>otherDesc</tt>, to contain elements from outside the TEI namespace for metadata not covered by the TEI header. The GBP could specify how this element is used. ([[User:Kshawkin|Kshawkin]]) | + | Syd will write down the command-line code needed in order to generate HTML files from the ODDs and add this to https://github.com/sydb/TEI-in-Libraries/blob/master/README . |
| + | : Done |
| | | |
− | NOTE: we talked about this during our conf call on 2/10/09; we decided to have a sub-group conference call on 2/17/09 to talk in more detail about this. [[User:Emcaulay|Emcaulay]]
| + | == Before release == |
| | | |
− | === Use of floatingText ===
| + | Update [http://www.tei-c.org/SIG/Libraries/teiinlibraries/ the official HTML version] to remove "Expected release October 2011." |
| | | |
− | Should we recommend or require use of <tt>floatingText</tt> in Level 3 and above? In Level 4 and above? In Level 5 only?
| + | Update copy of main-driver.html linked from there. |
− | | |
− | == Issues Resolved; Corrections to make ==
| |
− | | |
− | ===Throughout the Document===
| |
− | | |
− | * Link to corresponding sections of the P5 Guidelines whenever possible (for each recommendation); Lisa and Rich need to add these to Level 2 and Lisa and Andrew need to add for Level 1
| |
− | | |
− | * IN EXAMPLES and GUIDELINES, CHANGE the "@id" TO "@xml:id". Lisa and Rich need to make sure this is handled correctly in Level 2. Lisa and Andrew need to make sure this is handled for Level 1.
| |
− | | |
− | * CHANGE TEI.2 as the root element to TEI. Lisa and Rich need to make sure this is handled correctly in Level 2. Lisa and Andrew need to make sure this is handled for Level 1.
| |
− | | |
− | ===Copyediting===
| |
− | | |
− | Copyedit the text so that we use terms like "item", "work", "tag", "element", "project", and "collection" consistently. Also check that we speak with a consistent voice: "the encoder should" vs. "the encoder must" vs. "the encoder will".
| |
− | | |
− | ==Issues Resolved, Changes made==
| |
− | | |
− | *Will also need to ditch the bit on entities.
| |
− | | |
− | | |
− | | |
− | ==== Numbered Divs (resolved on 2009-01-27: see [[Minutes_from_January_27%2C_2009|minutes]]) ====
| |
− | | |
− | <span style="color: green">[This seems worth revisiting. Do we really need such a software-specific recommendation? ([[User:Kshawkin|Kshawkin]])]</span>
| |
− | <span style="color: purple">[I agree. We generally avoid numbered divisions. Recent survey revealed a nearly 50/50 split on the topic, but we shouldn't advocate one or the other. ([[User:Mdalmau|Mdalmau]])] </span> I disagree pretty strongly — many perceive it a shortcoming of the TEI Guidelines that they often offer more than one way to do something when there is not much gain in the difference. For our purposes, I think this is one of those cases, and we should avoid causing the same confusion. We should pick either numbered or unnumbered and stick with it throughout. (And I don't think it matters much which we pick.) — [[User:Syd|Syd]] <span style="color: blue">For a discussion of whether to use numbered or unnumbered divs, see the TEI P5 Guidelines [http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSDIV3 Chapter 4: Default Text Structure]([[User:emcaulay|emcaulay]]) </span>
| |
− | <span style="color: #03C03C">I'm not sure when Syd added his comment--before or after the conference call? We did pick one for the original best practices but there has been significant unhappiness about that decision ever since. As Michelle points out, the community out there is split 50-50. It's not a disagreement that's going away.([[User:Pwillett|Pwillett]] 19:37, 9 February 2009 (EST))</span>
| |
− | <span style="color: #03C03C">I just now see the "resolved" banner above, so ignore my comments just above. ([[User:Pwillett|Pwillett]] 19:38, 9 February 2009 (EST))</span>
| |
− |
| |
− | <span style="color: purple">Chapter 4 of the P5 Guidelines makes allowances for both, with a preference towards unnumbered as it more easily supports arbitrary levels of nesting (as opposed to a fixed number). Unnumbered is also preferred because the levels assigned to parts of a text may change from project to project or even book to book within the same project. Unnumbered divisions can still use @type to designate the level. For those who type more semantically and for those who need numbered divisions for more predictable processing, how about we rewrite the section thusly:</span>
| |
− | | |
− | We recommend the use of unnumbered divisions throughout the electronic text with proper values inserted in the @type attribute. For those of you who require numbered divisions for software processing, populate the @type attribute with a number, 1-7 (?), that corresponds to the appropriate level. For those of you who prefer a semantic label (e.g., chapter, section, etc.), determine a typology beforehand and designate the appropriate level in the @type attribute. Doing both is also possible, if it is important to maintain an explicit connection between the numbered and unnumbered labels, by using @type and @subtype accordingly. However, a combination of numbered (e.g., <div1>) and unnumbered (e.g., <div>) divisions is not supported. For a more detailed discussion about numbered and unnumbered divisions, consult [http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSDIV3 Chapter 4: Default Text Structure] of the TEI P5 Guidelines. ([[User:Mdalmau|Mdalmau]])
| |
− | | |
− | <span style="color: purple">This revision may impact how we display examples throughout the text. We need to keep this in mind if accepted. [Mdalmau].</span>
| |
− | | |
− | <span style="color: purple">The construction of "typologies" is a common activity for many of us when performing document analysis. When we are ready to expand the guidelines, I think including a section on "document analysis" is key. We can then explore issues of typology-building and how to constrain those values in the schema (or even Schematron). But defining the value list is not an easy task, which is one benefit of using numbered divs. ([[User:Mdalmau|Mdalmau]])</span>
| |
− | | |
− | Using type= to mimic numbered <div>s seems too close to tag abuse for comfort. I completely agree that typologies are part of document analysis and enthusiastically support having a section on that. But I don't see how using numbered <div>s has anything to do with it. A project '''should''' develop a useful typology whether they are using numbered or unnumbered. Many projects ''won’t'', anyway.
| |
− | | |
− | <span style="color: blue">Would this wording be one possible compromise: Use unnumbered divisions <div>, unless your text has obvious divisions, such as chapters with no complex subdivisions, in which case begin with <div1></span>([[User:rwisnesk|rwisnesk]])
| |
− | | |
− | ====Page Breaks (resolved on 2009-01-27: see [[Minutes_from_January_27%2C_2009|minutes]]) ====
| |
− | | |
− | <span style="color: green">[Always including page breaks within a div seems quite software-specific. I suggest revisiting. ([[User:Kshawkin|Kshawkin]])]</span>
| |
− | <span style="color: #03C03C">As we've discussed on a conference call, this isn't software specific. There are two points here. The historical point is that we wanted to recommend a practice, as a way of creating consistency and uniformity among encoded documents. There's a choice to be made about where to stick page breaks, so we chose one. But more importantly, it's about any software (e.g., XSLT) that will grab and return an entire DIV. You'll want to include the page break in that chunk of encoded text. In my experience, this generally works, except at the beginning of the volume, which typically would have <TEXT><BODY><PB><HEAD>Book Title</HEAD><DIV><HEAD>Chapter title</HEAD> [pwillett]</span>
| |
− | | |
− | <span style="color: purple">It seems that the page break blurb we have in place is not really an issue that needs to be revisited. I agree with Perry that promoting consistency is helpful (and also aids processing of text in most cases; page breaks, like all milestone tags, are hard to reckon with sometimes). The suggestion seems neutral enough that it can remain as-is. If someone disagrees, please provide the rationale for further review. Thanks! ([[User:Mdalmau|Mdalmau]])</span>
| |
− | | |
− | While I am, of course, willing to be outvoted, this strikes me as less than a Best Practice, for both theoretical and practical reasons.
| |
− | | |
− | Theoretically, I think, as Snoopy said, “honesty is the best policy”. In general in these cases, the page break does '''not''' occur inside a division, it occurs between them. To encode it otherwise seems to me to be asking for trouble.
| |
− | | |
− | Practically, this recommendation favors <div> above all others. For any other element, if you wanted to know “on what page did this occur”, you would ask the question “What is the value of n= of the most recent <pb> on the preceding:: axis?”. This works for <quote>s, <said>s, <lg>s, <head>s, etc. If page breaks are encoded where they lie, it works for <div>, too. If they are moved into the division as per this recommendation, then a different question has to be asked: “What is the value of n= of my first child <pb>?”. Neither question is difficult to ask. But how do you know which to ask? [[User:Syd|—Syd]]
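A small Python sketch of the two questions, under some loud assumptions: the sample markup is invented and has no TEI namespace, and the "preceding:: axis" lookup is emulated by walking the tree in document order (which gives the same answer here because <pb> is always empty). With the page break encoded between divisions, the first question works for the <div>; with it moved inside, the same question returns the previous page for the <div>, although it still works for the <head>.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: page 2 starts with chapter 2 in both.
BETWEEN = """<body>
  <div n="ch1"><pb n="1"/><p>First chapter.</p></div>
  <pb n="2"/>
  <div n="ch2"><head>Second</head><p>Text.</p></div>
</body>"""

INSIDE = """<body>
  <div n="ch1"><pb n="1"/><p>First chapter.</p></div>
  <div n="ch2"><pb n="2"/><head>Second</head><p>Text.</p></div>
</body>"""

def page_by_preceding(root, target):
    """n of the most recent <pb> before target, in document order."""
    page = None
    for el in root.iter():
        if el is target:
            return page
        if el.tag == "pb":
            page = el.get("n")
    return None

def page_by_first_child(target):
    """n of the first <pb> that is a direct child of target, if any."""
    first = target.find("pb")
    return first.get("n") if first is not None else None

if __name__ == "__main__":
    for label, xml in (("between", BETWEEN), ("inside", INSIDE)):
        root = ET.fromstring(xml)
        div = root.find("div[@n='ch2']")
        head = root.find("div[@n='ch2']/head")
        print(label, page_by_preceding(root, div), page_by_preceding(root, head),
              page_by_first_child(div))
```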
| |
− | | |
− | ===Level 1 section===
| |
− | | |
− | ====Paragraphs or Anonymous Block (resolved on 2009-01-27: see [[Minutes_from_January_27%2C_2009|minutes]]) ====
| |
− | <span style="color:purple">Currently, level 1 contains a table with the following information for the <p> tag:</span>
| |
− | | |
− | At least one "container" element per div is required (while <ab> is another option for this case, the Task Force suggests using <p> in order that the document be open to being extended to other encoding levels).<span style="color: #03C03C">I don't remember this discussion. It doesn't seem very difficult, once the decision is made to upgrade, to transform all ab's to p's. Or? [pwillett]</span>
| |
− | | |
− | <span style="color:purple">
| |
− | I agree with Perry, and our goal is to be conformant. So the <p> could simply be changed to <ab>. Do we want to address the <p> legacy any further or maybe as an end note?
| |
− | ([[User:Mdalmau|Mdalmau]])</span>
| |