Difference between revisions of "Talk:Best Practices for TEI in Libraries"

From TEIWiki
(General Recommendations section)
(Rendering better documentation from the ODDs: noted things that aren't issues any more)
 
(276 intermediate revisions by 9 users not shown)
Line 1: Line 1:
Some comments for TEI Header Guidelines
+
The following are things to do to the BP before making an official "release".  There is a separate list of [[Future changes to Best Practices for TEI in Libraries]].
  
* You probably want to change the @id in a few places to @xml:id. And the examples have TEI.2 as the root. [[User:Piotr Banski|Piotr]] 22:48, 19 October 2008 (EDT)
+
== Handling of hyphenation ==
  
Yes, at some point soon we will need to change
+
Syd will post to TEI-L asking how to distinguish "his-tory" and "run-on" when they break across lines. Both would take break="no", but how to allow for searching of "history" and "run-on"?
* id= to xml:id=
 
* <TEI.2> to <TEI xmlns=...>
 
* target="blah" to target="#blah"
 
passim. Will also need to ditch the bit on entities. Maybe I can get to these later today. — done.
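For anyone wanting to try the three syntax changes listed above on a sample file, here is a minimal sketch in Python. It is an illustration, not the procedure actually used on the BP examples: the function name <code>p4_to_p5</code> is made up, and it assumes the input parses as plain XML (a real P4 file with a DTD and entity references would need more handling).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def p4_to_p5(source: str) -> str:
    """Apply the three syntax changes from this thread (hypothetical helper)."""
    root = ET.fromstring(source)
    for el in root.iter():
        # id= becomes xml:id=
        if "id" in el.attrib:
            el.set(XML_ID, el.attrib.pop("id"))
        # target="blah" becomes target="#blah"; values that already look
        # like pointers or URLs are left alone
        target = el.get("target")
        if target:
            tokens = [
                t if t.startswith("#") or ":" in t or "/" in t else "#" + t
                for t in target.split()
            ]
            el.set("target", " ".join(tokens))
        # <TEI.2> becomes <TEI>, and every element moves into the TEI namespace
        local = "TEI" if el.tag == "TEI.2" else el.tag
        el.tag = "{%s}%s" % (TEI_NS, local)
    # Serialize the TEI namespace as the default namespace (xmlns=...)
    ET.register_namespace("", TEI_NS)
    return ET.tostring(root, encoding="unicode")


sample = '<TEI.2><text><p id="a1">See <ref target="a1">here</ref>.</p></text></TEI.2>'
print(p4_to_p5(sample))
```

The entity handling mentioned above is not covered by this sketch.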
 
  
We also need to figure out what to do about the recommendation for lang=. That's a tougher issue, because it's not just a syntax change. Our guidelines for best practice are at odds with the TEI Guidelines and the IETF best current practices. [[User:Syd|Syd]] 2008-10-21T19:09Z.
+
: Syd re-read P5 and emailed Kevin that he thinks we can handle this without a wider call.  Kevin agreed with Syd on a solution and made the revisions to main-driver.odd and level4.odd.  Asked Syd to check over the changes to [https://github.com/sydb/TEI-in-Libraries/commit/31a8730b9d0af093099a18ab57078d4b8577c23b#BestPractices/main-driver.odd main-driver.odd] and [https://github.com/sydb/TEI-in-Libraries/commit/9e4f98caeda0446ebb3397f619c83b534ccd1fca#BestPractices/level4.odd level4.odd]. Then will update the SIG on our handling of this issue and on the final work on the BP as a whole. ([[User:Kshawkin|Kshawkin]] 11:28, 13 September 2011 (EDT))
  
== Referencing the P5 Guidelines ==
+
:: Syd said everything looked fine but made an additional suggestion, which Kevin [https://github.com/sydb/TEI-in-Libraries/commit/8c60fef9dd3b9641b878f16d0370b05d4b838856 just committed].
  
I think whenever possible, we should link to the P5 Guidelines to further illustrate examples or even as additional reference to the prose.  These guidelines mention "text hierarchy", which is akin to the structure of a text, so we should link to Chapter 4, Default Text Structure, for illustrations of the various forms, etc.  Not only do we leverage existing, robust documentation, but we help all TEI users, novice or expert, "penetrate" the monolithic Guidelines when relevant.  [[User:Mdalmau|Mdalmau]]
+
== Bug fixes ==
  
agreed. [[user:emcaulay|emcaulay]]
+
* Fix schema bug where list element is not allowed except as a child of p.
 +
: Syd put model.inter back into model.common at Levels 3 and 4.  This fixed the problem.  <b>Syd wonders whether this change will have any side-effect; he will investigate further.</b>
  
== Issues Pending ==
+
== Copyediting ==
  
===General Recommendations section ===
+
Change all instances of "must" to "should" and "required" to "recommended", and all existing "recommended" to "optional".  All this is in accordance with RFC 2119 and with the BP's general policy of not requiring anything but simply being "best practices".  Clarify that the ODDs require things just to encourage conformance.
 +
: done ([[User:Kshawkin|Kshawkin]] 20:47, 5 September 2011 (EDT))
  
==== Filenaming ====
+
Move caveats before Level 1 Example to an appropriate place in main-driver.odd.
 +
: done ([[User:Kshawkin|Kshawkin]] 20:47, 5 September 2011 (EDT))
  
{snippet from BPG text; comments about file naming}
+
Copyedit element recommendations at each level to avoid awkward and misleading syntax.
<span style="color: purple">[This recommendation also seems dated (and the standard is targeted for CD-ROM file naming). I think we should
+
: Not finding anything in particular to fix. I think I was just tired when reading these in person with Syd.
recommend consistency in file naming according to respective digital object storage practices. For example, [http://wiki.dlib.indiana.edu/confluence/x/Sw8 IUDLP has guidelines in place] and perhaps we can mine the more general recommendations from there, like only ASCII, no spaces, 3-letter extensions, etc. ([[User:Mdalmau|Mdalmau]])] </span> <span style="color: blue">sounds like a good idea to remove or revise; as is it seems weird. ([[User:emcaulay|emcaulay]])</span>
 
<span style="color: #03C03C">I'll just point out that people still use CD-ROM as an archival storage medium (I'm looking at you, Chris) as well as a file transfer mechanism [pwillettt]</span>
 
{end snippet}
 
  
<span style="color: purple">File naming is still an issue. Perry pointed out that some folks store TEI files on CD-ROM (makes sense).  Perhaps it just needs to be teased out: one recommendation for those who use CD-ROM for storage, and more general file naming guidelines for server storage/delivery, like:</span>
+
Remove ODD markup in level specifications that Syd added from P5 so as not to include examples that contradict rest of Guidelines.
 +
: Commented out gloss, desc, exemplum, remarks, and listRef in level1.odd for Syd to check over before I do in other levels. ([[User:Kshawkin|Kshawkin]] 20:47, 5 September 2011 (EDT))
 +
:: Syd said he un-commented out these elements. He'll review the 79 elements in the various ODDs to check for conflict with the BP prose.  That way we can include them with our release. ([[User:Kshawkin|Kshawkin]] 10:46, 3 October 2011 (EDT))
  
Standardized file naming for a particular encoding project is key for reliable online storage and delivery of these files. Consider the following best practices when determining the file name scheme for your project:
+
Edit main-driver.odd to review metadata for the BP as a whole (in both the teiHeader and front elements): editors (stated more prominently than in the appendix), copyright, version number, etc.  Check with Michelle on whether we should revise acknowledgment of DLF support.
  
* Each filename should contain an identifier that uniquely specifies a single digital object within the parent collection (e.g., a parent collection of text, images and other related materials)
+
: In teiHeader, added Michelle as <editor> and copyedited name of SIG. Having been in touch with DLF recently, I'm pretty sure our statement of support is sufficient.
* Each filename should be fully specified. It should not just be a sequence number that is dependent on location within a directory structure for context
 
* Filenames should not include spaces
 
* Filenames should follow a predictable case convention (e.g., all lowercase, camelCase, etc.)
 
* The first character of the filename should be an ASCII letter ('a' through 'z' or 'A' through 'Z') to comply with current restrictions on identifiers by many programming and metadata languages such as METS
 
* The "base" filename may include only ASCII letters ('a' through 'z' and 'A' through 'Z'), ASCII digits ('0' through '9'), hyphens, underscores, and periods. Refrain from using other characters and limit period usage to only once (to separate base name from file extensions).
 
  
For those saving files to CD-ROM for storage or file transfer, file naming should follow ISO 9660 conventions: 8-character filenames, 3-character extensions, using A-Z, a-z, 0-9, underscores and hyphens.
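The per-filename rules in the list above (no spaces, leading ASCII letter, restricted character set, a single period) can be checked mechanically. The following Python sketch is one possible reading of those bullets, not an agreed specification; the function name <code>filename_problems</code> is hypothetical. It does not cover the project-level rules (uniqueness, consistent case) or the ISO 9660 length limits.

```python
import re


def filename_problems(name: str) -> list:
    """Return the proposed naming rules that a filename breaks (hypothetical helper)."""
    problems = []
    if " " in name:
        problems.append("contains a space")
    if not re.match(r"[A-Za-z]", name):
        problems.append("does not start with an ASCII letter")
    # Spaces are reported separately above, so ignore them here
    if re.search(r"[^A-Za-z0-9._-]", name.replace(" ", "")):
        problems.append("uses a character other than ASCII letters, digits, hyphen, underscore, period")
    if name.count(".") > 1:
        problems.append("uses more than one period")
    return problems


for candidate in ["vaa1234.xml", "my file.xml", "0001.tei.xml"]:
    print(candidate, filename_problems(candidate))
```

An empty list means the name passes every check in this sketch.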
+
: As last step before releasing, add <code>&lt;editionStmt>&lt;p>Version 3.0 (October 2011)&lt;/p>&lt;/editionStmt></code> [[User:Kshawkin|Kshawkin]] 20:27, 29 September 2011 (EDT)
 +
:: Done
  
([[User:Mdalmau|Mdalmau]])
+
== Rendering better documentation from the ODDs ==
  
==== Numbered Divs ====
+
Tables should have borders on cells (or some clear path for indicating in ODD document that you want borders).
 +
: Syd said this can be overridden by giving values for cssFile and/or cssSecondFile parameters to odd2html.xsl.  Not sure how to "send" these through roma2.
 +
:: Sebastian wrote in an email to use, e.g., <tt>--docflags="cssFile=foo.css"</tt> but warned that this was from memory ([[User:Kshawkin|Kshawkin]] 16:08, 22 August 2011 (EDT))
  
<span style="color: green">[This seems worth revisiting. Do we really need such a software-specific recommendation? ([[User:Kshawkin|Kshawkin]])]</span>
+
Omit exemplars from output — preferably just those from the P5 source, not the customization ODD file.
<span style="color: purple">[I agree.  We generally avoid numbered divisions. Recent survey revealed a nearly 50/50 split on the topic, but we shouldn't advocate one or the other. ([[User:Mdalmau|Mdalmau]])] </span> <span style="color: blue">For a discussion of whether to use numbered or unnumbered divs, see the TEI P5 Guidelines [http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSDIV3 Chapter 4: Default Text Structure] ([[User:emcaulay|emcaulay]]) </span>
+
: Done. See the [https://github.com/sydb/TEI-in-Libraries/blob/master/BestPractices/odd2odd.xsl.patch stylesheet patch file].
  
<span style="color: purple">Chapter 4 of the P5 Guidelines makes allowances for both, with a preference towards unnumbered as it more easily supports arbitrary levels of nesting (as opposed to a fixed number).  Unnumbered is also preferred because the levels assigned to parts of a text may change from project to project or even book to book within the same project.  With unnumbered divisions, @type can be used to designate the level.  For those who type more semantically and for those who need numbered divisions for more predictable processing, how about we rewrite the section thusly:</span>
+
<code>&lt;editor></code>s are not showing up in HTML version of ODD: need to figure out how to make this happen.
  
We recommend the use of unnumbered divisions throughout the electronic text with proper values inserted in the @type attribute.  For those of you who require numbered divisions for software processing, populate the @type attribute with a number, 1-7 (?), that corresponds to the appropriate level.  For those of you who prefer a semantic label (e.g., chapter, section, etc.), determine a typology beforehand and designate the appropriate level in the @type attribute.  The ability to do both is also possible if it is important to maintain an explicit connection between the numbered and unnumbered labels by using @type and @subtype accordingly.  However, a combination of numbered (e.g., <div1>) and unnumbered (e.g., <div>) divisions is not supported.  For a more detailed discussion about numbered and unnumbered divisions, consult [http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSDIV3 Chapter 4: Default Text Structure] of the TEI P5 Guidelines.
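To make the proposed wording concrete, here is a rough Python sketch of converting numbered divisions to unnumbered ones. The assignment of attributes is an assumption, since the draft text only says to use @type and @subtype "accordingly": this sketch puts the level number in @type when no semantic label exists, and in @subtype when @type is already taken. The function name <code>unnumber_divs</code> is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches div1 ... div7, with or without a namespace in ElementTree's {uri}name form
NUMBERED_DIV = re.compile(r"(\{[^}]*\})?div([1-7])")


def unnumber_divs(root):
    """Rename <divN> to <div>, recording the level N in @type or @subtype."""
    for el in root.iter():
        match = NUMBERED_DIV.fullmatch(el.tag)
        if match is None:
            continue
        namespace, level = match.group(1) or "", match.group(2)
        el.tag = namespace + "div"
        if el.get("type"):
            # Keep the semantic label in @type; assumption: level goes in @subtype
            el.set("subtype", level)
        else:
            el.set("type", level)
    return root


sample = ET.fromstring(
    '<body><div1 type="chapter"><div2><p>text</p></div2></div1></body>'
)
print(ET.tostring(unnumber_divs(sample), encoding="unicode"))
```

Run over a whole file, this also avoids the unsupported mix of numbered and unnumbered divisions, because every numbered division is converted.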
+
: Done
  
<span style="color: purple">The construction of "typologies" is a common activity for many of us when performing document analysis.  When we are ready to expand the guidelines, I think including a section on "document analysis" is key.  We can then explore issues of typology-building and how to constrain those values in the schema (or even Schematron).  But defining the value list is not an easy task, which is one benefit of using numbered divs.</span>
+
Check that editionStmt gets rendered in HTML version of ODD.
  
===
+
: Done
 +
 
 +
== Documentation of ODD processing ==
 +
 
 +
Syd will write down the command-line code needed in order to generate HTML files from the ODDs and add this to https://github.com/sydb/TEI-in-Libraries/blob/master/README .
 +
: Done
 +
 
 +
== Before release ==
 +
 
 +
Update [http://www.tei-c.org/SIG/Libraries/teiinlibraries/ the official HTML version] to remove "Expected release October 2011."
 +
 
 +
Update copy of main-driver.html linked from there.

Latest revision as of 02:31, 24 October 2011

The following are things to do to the BP before making an official "release". There is a separate list of Future changes to Best Practices for TEI in Libraries.

Handling of hyphenation

Syd will post to TEI-L asking how to distinguish "his-tory" and "run-on" when they break across lines. Both would take break="no", but how to allow for searching of "history" and "run-on"?

Syd re-read P5 and emailed Kevin that he thinks we can handle this without a wider call. Kevin agreed with Syd on a solution and made the revisions to main-driver.odd and level4.odd. Asked Syd to check over the changes to main-driver.odd and level4.odd. Then will update the SIG on our handling of this issue and on the final work on the BP as a whole. (Kshawkin 11:28, 13 September 2011 (EDT))
Syd said everything looked fine but made an additional suggestion, which Kevin just committed.
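The question that opens this thread can be illustrated with a short Python sketch. It shows only why the two cases are hard to tell apart for search; it is not the solution that was committed to main-driver.odd and level4.odd. The sample markup and the function name <code>flatten</code> are made up for illustration.

```python
import xml.etree.ElementTree as ET

sample = ET.fromstring(
    '<p>his-<lb break="no"/>tory and run-<lb break="no"/>on</p>'
)


def flatten(p, drop_hyphen):
    """Join the text around <lb break="no"/>, optionally dropping the hyphen."""
    out = p.text or ""
    for child in p:
        is_soft_break = child.tag == "lb" and child.get("break") == "no"
        if is_soft_break and drop_hyphen and out.endswith("-"):
            out = out[:-1]
        out += child.tail or ""
    return out


print(flatten(sample, drop_hyphen=False))  # keeps both hyphens
print(flatten(sample, drop_hyphen=True))   # drops both hyphens
```

Keeping the hyphen yields "his-tory", so a search for "history" misses it; dropping the hyphen yields "runon", so a search for "run-on" misses it. With break="no" alone, a processor cannot tell which treatment each word needs.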

Bug fixes

  • Fix schema bug where list element is not allowed except as a child of p.
Syd put model.inter back into model.common at Levels 3 and 4. This fixed the problem. Syd wonders whether this change will have any side-effect; he will investigate further.

Copyediting

Change all instances of "must" to "should" and "required" to "recommended", and all existing "recommended" to "optional". All this is in accordance with RFC 2119 and with the BP's general policy of not requiring anything but simply being "best practices". Clarify that the ODDs require things just to encourage conformance.

done (Kshawkin 20:47, 5 September 2011 (EDT))
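One pitfall with the rewording above: replacing the terms one after another would turn "required" into "recommended" and then into "optional". A single-pass substitution avoids that. The Python sketch below is only an illustration of the idea and is not how the change was actually made in the ODDs; the names <code>REWORDING</code> and <code>soften</code> are hypothetical.

```python
import re

# Old term -> new term, as listed in the copyediting task
REWORDING = {
    "must": "should",
    "required": "recommended",
    "recommended": "optional",
}
PATTERN = re.compile(r"\b(" + "|".join(REWORDING) + r")\b", re.IGNORECASE)


def soften(text):
    """Swap all three terms in one pass so replacements never chain."""
    def swap(match):
        word = match.group(0)
        new = REWORDING[word.lower()]
        if word.isupper():
            return new.upper()
        if word[0].isupper():
            return new.capitalize()
        return new
    return PATTERN.sub(swap, text)


print(soften("The header is required; a date is recommended. You must validate."))
```

Each match is looked up against the original wording, so text that was "recommended" before the edit becomes "optional" while newly inserted "recommended" is left alone.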

Move caveats before Level 1 Example to an appropriate place in main-driver.odd.

done (Kshawkin 20:47, 5 September 2011 (EDT))

Copyedit element recommendations at each level to avoid awkward and misleading syntax.

Not finding anything in particular to fix. I think I was just tired when reading these in person with Syd.

Remove ODD markup in level specifications that Syd added from P5 so as not to include examples that contradict rest of Guidelines.

Commented out gloss, desc, exemplum, remarks, and listRef in level1.odd for Syd to check over before I do in other levels. (Kshawkin 20:47, 5 September 2011 (EDT))
Syd said he un-commented out these elements. He'll review the 79 elements in the various ODDs to check for conflict with the BP prose. That way we can include them with our release. (Kshawkin 10:46, 3 October 2011 (EDT))

Edit main-driver.odd to review metadata for the BP as a whole (in both the teiHeader and front elements): editors (stated more prominently than in the appendix), copyright, version number, etc. Check with Michelle on whether we should revise acknowledgment of DLF support.

In teiHeader, added Michelle as <editor> and copyedited name of SIG. Having been in touch with DLF recently, I'm pretty sure our statement of support is sufficient.
As last step before releasing, add <editionStmt><p>Version 3.0 (October 2011)</p></editionStmt> Kshawkin 20:27, 29 September 2011 (EDT)
Done

Rendering better documentation from the ODDs

Tables should have borders on cells (or some clear path for indicating in ODD document that you want borders).

Syd said this can be overridden by giving values for cssFile and/or cssSecondFile parameters to odd2html.xsl. Not sure how to "send" these through roma2.
Sebastian wrote in an email to use, e.g., --docflags="cssFile=foo.css" but warned that this was from memory (Kshawkin 16:08, 22 August 2011 (EDT))

Omit exemplars from output — preferably just those from the P5 source, not the customization ODD file.

Done. See the stylesheet patch file.

<editor>s are not showing up in HTML version of ODD: need to figure out how to make this happen.

Done

Check that editionStmt gets rendered in HTML version of ODD.

Done

Documentation of ODD processing

Syd will write down the command-line code needed in order to generate HTML files from the ODDs and add this to https://github.com/sydb/TEI-in-Libraries/blob/master/README .

Done

Before release

Update the official HTML version to remove "Expected release October 2011."

Update copy of main-driver.html linked from there.