LiteFinal
I spent much of today re-reading the last revision (February 2006) of the TEI Lite document (tei_lite.odd) and asking myself the following questions:
- is anything here actually wrong?
- is there anything here which I think is not useful?
- what topics do I think it might be useful to add?
Obviously, the first category must be fixed. The only thing I've so far identified in this category is the assertion that the <teiCorpus> element cannot self-nest (#fail) -- it can now, as a result of some pretty good tidying up back in 2006. The initial chapter on the history of Lite itself also needs bringing up to date, and I have some draft text for that.
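To make the self-nesting point concrete, here is a skeleton of what the schema now permits. It is only a sketch: the headers are left as comments, so it is not a valid document as it stands, and the placeholder text is invented for illustration.

```xml
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><!-- header describing the whole corpus --></teiHeader>
  <!-- a teiCorpus nested inside another teiCorpus -->
  <teiCorpus>
    <teiHeader><!-- header describing the sub-corpus --></teiHeader>
    <TEI>
      <teiHeader><!-- header describing one text --></teiHeader>
      <text>
        <body>
          <p>Placeholder text of one corpus member.</p>
        </body>
      </text>
    </TEI>
  </teiCorpus>
</teiCorpus>
```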
The other two are less clear, if only because "useful" really provokes the question "for whom?" and different answers arrive for the different responses to that question. So I would welcome comments on the tentative proposals I list here.
First off, let's not forget that the intention is not to rewrite this tutorial from scratch. If we were doing that, we probably wouldn't start from here but from TEI By Example, or from the earlier Council initiative spearheaded by Peter Boot, or somewhere else. We'd also probably write it using a less dauntingly formal style today, though no-one seems to have complained about that recently.
The aim is just to continue to meet the original intentions of document TEI U5, which are summarised in the document itself roughly as follows:
- meet 80% of the needs of 80% of the current users of TEI
- provide a good basis for tutorials and generic introductory courses, etc.
- specifically to address two kinds of TEI application:
  - "digital library" style straightforward encoding of early print materials
  - "born digital" authoring of technical documents
With those audiences in mind, what needs adding or removing? Here are my initial thoughts, reading through the document.
1) The discussion of the Jane Eyre example should be improved, for example to comment on the treatment of end-of-line hyphenation (is it honeymoon or honey-moon?). Should the list of cool things following the example not be linked to the section in which said cool thing is discussed?
- "Yes, the discussion could be improved slightly, and yes, the list could point to the appropriate sections of the TEI Guidelines. (However, I would point to the Vault version for which this version of TEI Lite is fixed against.)" [JC]
- I think Lou meant to suggest introducing a cross-reference within the TEI Lite document. Lite never refers to the full Guidelines elsewhere, and I think we should be consistent in that. Agree with both of Lou's suggestions. [KH]
- "maybe. not urgent?" [SR]
- yes, I meant cross refs to rest of Lite, but looking at the list again I decided against it, since it would interrupt the narrative flow too much. I have however mentioned the hyphenation issue. [LB]
2) Should xml:base and xml:space be removed from the schema? Neither is discussed -- if we keep them, they must be.
- "no, it's weird to remove standard XML things" [SR]
- "Remove them." [JC]
- Agreed. [KH]
- They're gone [LB]
3) Should the section on divisions warn against some vulgar errors (mono-div bodies, misuse of @n to contain <head> values, attempts to evade the tessellation requirement)?
- "no. leave sleeping dogs" [SR]
- "If so, only very briefly." [JC]
- I can't find anything in P5 that says not to have a single div in a <body>, so if that's really the case, we have no business putting something like that in Lite. It seems to me that proper use of @n is already documented in Lite. I'm not sure what you have in mind about evading tessellation requirements, so I can't comment on that. [KH]
- Dogs continue to snore ... but I may raise the <body></body> issue as a SF ticket [LB]
4) Should section 4.3 (and the schema) include discussion of <spGrp>?
- "No, that sounds heavy not Lite." [JC]
- "pass" [SR]
- Offhand I agree with James that <spGrp> is too heavy for Lite. What motivates the suggestion to include it? [KH]
- Just a passing whim... [LB]
5) A simpler prose drama example should be added to 4.3 before it launches into the complexities of overlapping hierarchies. It should be contrasted with the Fish example: which last should not use the @who attribute since this is explained later.
- "Yes, simpler prose drama would be good." [JC]
- "ok" [SR]
- I don't understand the syntax of "which last". Can I read this sentence as "the Fish example, which should not use"? In any case, I trust Lou to add a simpler example when he thinks it's appropriate. [KH]
- which yes, you can. And which I have now added a nicer example. [LB, channelling Mrs Gamp] On second thoughts I think the explanation of @who needs to go here -- the later one is inadequate
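For readers following along, a minimal prose speech using @who might look something like the sketch below. This is not the example actually added to Lite; the speaker, the text and the identifier hypo-alice are all invented, and the element carrying xml:id="hypo-alice" is assumed to be defined elsewhere in the same document.

```xml
<!-- Hypothetical fragment: @who points at an element assumed to
     carry xml:id="hypo-alice" somewhere else in the document. -->
<sp who="#hypo-alice">
  <speaker>Alice</speaker>
  <p>A prose speech needs nothing more than one or more paragraphs.</p>
</sp>
```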
6) "the names used for editions referred to by the @ed attribute... [is] documented in the header" Is it? But where?
- pass [SR]
- Using editionStmt mentioned further down? [JC]
- Surely not -- that is for edition of the digital resource not its source [LB]
- In <refsDecl>, cf. 3.10.4 of P5 ... right? [KH]
- Indeed so. Added cross reference. [LB]
7) Should the discussion of @rend (6.1) mention the possibility of using CSS as value?
- No, absolutely not, in my opinion. [JC]
- NO!!!!!!! [SR]
- I'm surprised MH has not expressed a view on this (to my mind) reasonable proposition [LB]
- I think we should reconsider this once we figure a way out of the @rend vs. @style vs. @tei:style conundrum we're in. [KH]
8) In 6.2 we define q and quote but not said. My preference would be to remove quote, but if we keep it we should either remove q or add said as well.
- document <said> [SR]
- I'd probably add said. [JC]
- But then why not remove q? [LB]
- Since <q> vs. <quote> is a subtle distinction that causes endless trouble for experts, I propose removing <quote>. [KH]
- Hoorah, I agree with Kevin and have removed quote. Possibly should remove soCalled and mentioned by the same logic though... I expect someone will complain in any case. [LB]
9) I haven't checked whether the discussion of xml:lang values in the footnote in section 6.3 is still politically correct. Is it?
- The document it points to has been deprecated. Should it point instead to: http://www.loc.gov/standards/iso639-2/php/code_list.php ? [JC]
- pass [SR]
- No. Some 639-2 3-letter codes are used, but basically only where there is no equivalent 2-letter code from 639-1. The authority is the IANA subtag registry, though. See the latest build of the guidelines for pointers to good explanations. [MH]
- Agree with Martin. [KH]
- Happy with Martin's revisions [LB]
10) Section 7 (just before the first example) has another vague reference to the Header, which should be made more precise or removed.
- Agreed, should be made more precise. (Though I'm not entirely sure what it should say.) [JC]
- pass [SR]
- I'm also not sure what to recommend to a user. Lou, did you have something in mind? [KH]
- The only place I can see where you'd be able to specify what sort of <note> you meant is <tagsDecl>, which isn't in Lite. So I removed the comment. [LB]
11) Sections 8 and 9 seem to have stood the test of time quite well, and I see no need to modify them
- In 8.3 @ana "links an element with its interpretation" maybe shouldn't be singular? 'one or more interpretations'. Otherwise, nothing jumped out at me. [JC]
- pass [SR]
- Fine. [KH]
- I claim that interpretation here is being used as a non-count noun... [LB]
12) Should section 10 say something about linked data? At least a reference to the existence of the @ref attribute? (In that context, we should probably also include somewhere a comment that elements with a value for their xml:id attribute -- at the least the TEI element itself -- can ipso facto be exposed as LOD.)
- Maybe two sentences, one on the @ref attribute and one on what a good idea putting @xml:id's on things is, since it simplifies people pointing into their files. Otherwise, I wouldn't get into LOD etc. here. [JC]
- not now. one day [SR]
- I'm not sure that the LOD people would agree that having an identifier on an element in an XML document really makes something LOD. You can point to it, but is there actually a machine-readable description of an entity in the real world at that point? Not so sure. But I do think mentioning @ref is important. [KH]
- Have sneaked in an example linking a name to a wikipedia entry via ref and leaving it at that for now. [LB]
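For the record, the kind of thing meant is roughly as follows. This is an illustrative sketch rather than the example actually added: the choice of name and the target URL are assumptions made here.

```xml
<!-- Hypothetical: @ref associates the name with an external
     description of the person named. -->
<p>The novel was written by
  <name type="person"
        ref="https://en.wikipedia.org/wiki/Charlotte_Bront%C3%AB">Charlotte Brontë</name>.</p>
```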
13) Does section 12 (on bibliographies) need expanding?
- I'd vote no. You'd want to change it too much. [JC]
- no. too many dragons [SR]
- Agree with SR. [KH]
- OK [LB]
14) Should section 16 also contain a reference to egXML, and hence some discussion of name spaces? Should it at least explain what an ODD is?
- I'd drop 16.1 entirely. [SR]
- Unlike SPQR, I'd leave 16.1 and include a link to the USE chapter explaining that more elements are available for storing TEI customisations. [JC]
- I don't want to start linking to P5: people can easily get lost. Agree with SR. [KH]
- Dropping 16.1 entirely would make Lite rather useless for technical documentation, surely? Have added a footnote earlier on to explain what a namespace is while I think about this. [LB]
15) Is the extended example in 16.3 actually useful enough to keep? Wouldn't a brief discussion of what an ODD is and how you might use one to define your own technical manual be much more useful?
- While I'd probably like that, this isn't a manual for me. I'd leave it as is. [JC]
- pass [SR]
- I have problems with all of the examples in 16.3. I don't understand the first one, and the fact that it talks about itself and is an example of documentation of encoding both add to the confusion. Can we come up with something more straightforward? The second example isn't really an example: it's a fragment of what is being described which is built upon in the third example. That's not all that clear until you read further on. I think the third example is fine, and I think ODDs are way too complicated for someone using Lite to learn TEI! At most mention ODD in passing and say that the full TEI lets you do this. [KH]
- I read the consensus as saying leave things alone. [LB]
16) Should the @facs attribute be added to the schema and to the discussion in section 14? I think so: lots of people produce simple Lite-based "digital editions".
- yes, put in @facs. good idea. [SR]
- Yes, add @facs and document very briefly (point to chapter for more information?) [JC]
- Agree with everyone. [KH]
- OK have added simple <pb facs="page1.png"> type example, complete with health warning [LB]
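In context, the usage in question looks roughly like this. Only the facs="page1.png" value comes from the discussion above; the page number and paragraph text are placeholders, and the image file is assumed to sit alongside the XML document.

```xml
<!-- @facs on <pb> points to an image of the page that begins here -->
<pb n="1" facs="page1.png"/>
<p>Placeholder transcription of the first page.</p>
```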
17) Is the treatment of front and back matter too detailed? Which parts would you remove? (Me, I love it all; except maybe the list of suggested values for @type on prefatory matter)
- dunno. I kind of like that stuff (well, up to 19.2 at least). but probably leave alone [SR]
- Leave it. (But yes, potentially nuke suggested values of @type there.) [JC]
- The list of values for @type seems to have been derived from section 4.5 of P5 but diverged at some point. I suggest keeping the list but resynchronizing with P5. Will promote interoperability. [KH]
- OK, have synchronized the two lists. Added the remark that where other kinds of prefatory matter are encountered, the encoder is at liberty to invent other values for the <att>type</att> attribute. [LB]
18) Something went wrong (probably a rogue #uri) at the end of section 18 sv "index"
- yes. That should be fixed. [JC]
- Definitely fix. [KH]
- Formatting artefact. Fixed. [LB]
19) 19.1.4 should probably include an example using <licence>. OTOH, I think elements <notesStmt> and <biblFull> could probably go.
- Yes, include licence, keep notesStmt, ditch biblFull would be my vote. [JC]
- Add license, keep notesStmt (useful as a place to put things when you don't know where else to put them), and definitely remove biblFull. [KH]
- Added <licence>, removed <biblFull>. Also removed <refState> and <cRefPattern> since we don't explain these at all [LB]
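A <licence> example of the sort discussed might be sketched as below. The particular licence, its wording and the @status value are assumptions for illustration, not necessarily what went into Lite; the fragment belongs inside <publicationStmt> in the header.

```xml
<availability status="free">
  <!-- @target points at the full text of the (assumed) licence -->
  <licence target="http://creativecommons.org/licenses/by/3.0/">
    Distributed under a Creative Commons Attribution 3.0 Unported License.
  </licence>
</availability>
```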
20) 19.3 needs an example of using <catRef>
- potentially [JC]
- Sure. [KH]
- Added discussion and example for <catRef> (also for <langUsage>) [LB]
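As a rough sketch of how <catRef> fits together with a taxonomy: the identifiers hypo-genres and hypo-fiction and the category description are invented here, and the two fragments are assumed to live in <encodingDesc> and <profileDesc> respectively.

```xml
<!-- In <encodingDesc>: declare the (hypothetical) categories -->
<classDecl>
  <taxonomy xml:id="hypo-genres">
    <category xml:id="hypo-fiction">
      <catDesc>Prose fiction</catDesc>
    </category>
  </taxonomy>
</classDecl>

<!-- In <profileDesc>: classify this text by pointing at a category -->
<textClass>
  <catRef target="#hypo-fiction"/>
</textClass>
```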
21) 19.4 might profitably say that there are better ways of handling revision history
- It might point out that these change elements can be updated by proper version control systems. (But I'd stay vague and neutral on what those are and how to do so.) [JC]
- Lou, by "better ways" are you thinking of version control systems as well? I agree with James on how to approach this. [KH]
- Yes, exactly. Added a paragraph to that effect. [LB]
22) The appendix "Substantive changes from the P4 version" should go. It's out of date and incomplete.
- Agreed. [JC]
- Fine. [KH]
- It's history... [LB]
23) (a late addition) Would it make sense to include <w> in the section discussing <pc>, <s> and <seg>? The one example we have of linguistic tagging uses <seg type="lex" ana="#NP1">, which looks cumbersome beside <w ana="#NP1">, but lexical tagging/tokenisation is at least as useful and commonplace as segmentation using <s>, which we do explain.
- that's a very puzzling omission. it seems like a no-brainer to include it. [SR]
- Perhaps it was omitted because you can use <seg type="word"> (and use <seg> for all the other bits and pieces in the same way). I see <cl> and <phr>, for instance, are also missing. Perhaps the oddity here is the inclusion of <s>? [MH]
- I understand Martin's objections but tend to lean towards a user-friendly, not-so-theoretically-supercoherent set of devices for Lite. <w> tastes lite. Or, conversely, its lack will litely raise eyebrows. [PB]
- I agree. A lack of surprises is always a good thing; especially in terms of encouraging TEI newbies. [SY]
- OK have added <w> to schema: some discussion still needed [LB]
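Side by side, the two encodings under discussion compare as follows. The word chosen is a placeholder, and the element carrying xml:id="NP1" that @ana points to is assumed to be defined elsewhere in the document.

```xml
<!-- Generic segment, typed as a lexical item -->
<seg type="lex" ana="#NP1">honeymoon</seg>

<!-- The same token using the dedicated word element -->
<w ana="#NP1">honeymoon</w>
```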