Oddbyexample

From TEIWiki
Revision as of 18:21, 29 November 2012 by Mholmes (talk | contribs) (How to download or buy)


Synopsis

This utility attempts to work out the minimal TEI customization needed to validate a collection of files. It is an XSLT (version 2) stylesheet which traverses a nominated directory tree looking for *.xml files which have <TEI> or <teiCorpus> root elements. It analyzes the elements and attributes used in the resulting corpus and compares them with the whole of TEI P5. An ODD file is generated which:

  • loads the required modules
  • deletes any elements which are not used
  • deletes any attributes (including class attributes) which are not used by each element
  • for every attribute which has a TEI "data.enumerated" datatype, constructs a closed <valList> enumerating the values actually used.

From this you can construct a target schema.
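Before running the stylesheet, it can help to get a rough idea of which elements a corpus actually uses. The following is a minimal sketch for that purpose only: it is a text-level approximation written for this page, not the method oddbyexample.xsl uses, and it ignores namespaces, comments and CDATA sections. The function name list_elements is hypothetical.

```shell
#!/bin/sh
# Rough preview only: list the element names that appear in *.xml files
# under a directory, with the number of start tags found for each.
# This is NOT how oddbyexample.xsl analyzes a corpus; it simply greps
# for start tags and knows nothing about namespaces or comments.
list_elements() {
    corpus_dir=$1
    find "$corpus_dir" -type f -name '*.xml' -exec cat {} + |
        grep -o '<[A-Za-z][A-Za-z0-9._-]*' |  # start tags only, not </end> tags
        tr -d '<' |                           # keep just the element name
        LC_ALL=C sort |                       # fixed locale for a stable order
        uniq -c |
        awk '{ print $2 "\t" $1 }'            # name, tab, count
}
```

Calling list_elements /wherever/you/have/yourfiles/ prints one line per element name. Elements that never appear in this list are the ones the generated ODD would be expected to delete.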

Features

User commentary

Please sign all comments.


System requirements

Memory capacity is likely to be an issue for large corpora. The stylesheet will not be able to read a very large corpus unless you have a great deal of memory to assign to Java. In that situation, it is suggested that you construct a smaller corpus of representative sample documents and work with that. After generating a schema, validate your entire corpus; each time you find an invalid document, add it to your smaller corpus and start again.
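The sample-corpus workflow described above can be sketched as a small shell function. This is an illustration under stated assumptions, not part of Oddbyexample: the function name collect_invalid and the VALIDATE variable are hypothetical, and the default validator shown (jing with a schema called my.rng) is only one possible choice.

```shell
#!/bin/sh
# Sketch of the iterative loop: validate every file in the full corpus and
# copy each invalid one into the smaller sample corpus.
# VALIDATE is an assumption: any command which, given a file name as its
# last argument, exits non-zero when that file is invalid.
: "${VALIDATE:=jing my.rng}"

collect_invalid() {
    full_corpus=$1
    sample_corpus=$2      # should live outside the full corpus directory
    mkdir -p "$sample_corpus"
    find "$full_corpus" -type f -name '*.xml' |
    while IFS= read -r f; do
        # $VALIDATE is left unquoted on purpose so "jing my.rng" splits in two
        if ! $VALIDATE "$f" </dev/null >/dev/null 2>&1; then
            cp "$f" "$sample_corpus"/   # same-named files overwrite each other
            printf '%s\n' "$f"          # report what was added
        fi
    done
}
```

If the function prints any file names, regenerate the ODD and the schema from the enlarged sample corpus and run it again; when it prints nothing, the whole corpus validates.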

Source code and licensing

Open source; the stylesheet is distributed with the TEI Stylesheets in the TEI SourceForge repository.

Support for TEI

Oddbyexample is not yet able to:

  • derive simplified content models (beyond what Roma already does)
  • add new elements and derive content models for them
  • deal with non-TEI namespaces
  • generate attribute datatypes with complex regexps not already specified in TEI specifications
  • create new Schematron constraints, etc.

Language(s)

XSLT

Documentation

The script assumes you have the TEI package which has a file called "/usr/share/xml/tei/odd/p5subset.xml". If you don't have that, grab http://www.tei-c.org/release/xml/tei/odd/p5subset.xml, put the file somewhere, and add a "tei" parameter to point at it. (Alternatively, you can check out the TEI source and generate p5subset.xml yourself, by running "make p5subset.xml" in the P5 directory.)

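Here's a sample command to run Oddbyexample. The stylesheet is named twice because this form of the Saxon command line expects a source document before the stylesheet; the files actually analyzed are located through the "corpus" parameter: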

saxon -o my.odd oddbyexample.xsl oddbyexample.xsl corpus=/wherever/you/have/yourfiles/
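For the two variations mentioned on this page (a "tei" parameter pointing at a downloaded p5subset.xml, and more memory for Java), the command lines might look like the sketch below. The functions only print a command so it can be checked before being run. Everything not in the sample command is an assumption: the function names, the jar name saxon9he.jar, the SAXON_JAR variable, and the Saxon 9 option syntax (-o:, -s:, -xsl:), which may differ in your installation.

```shell
#!/bin/sh
# Print (do not run) candidate command lines. All paths are placeholders.
# p5subset.xml can be fetched first with, for example:
#   curl -O http://www.tei-c.org/release/xml/tei/odd/p5subset.xml

# Same form as the sample command, plus the "tei" parameter.
oddbyexample_cmd() {
    corpus=$1
    p5subset=$2
    printf 'saxon -o my.odd oddbyexample.xsl oddbyexample.xsl corpus=%s tei=%s\n' \
        "$corpus" "$p5subset"
}

# Variant that calls Java directly so the maximum heap size (-Xmx) can be set.
# The jar name is an assumption; set SAXON_JAR to wherever your Saxon jar is.
oddbyexample_java_cmd() {
    heap=$1
    corpus=$2
    printf 'java -Xmx%s -cp %s net.sf.saxon.Transform -o:my.odd -s:oddbyexample.xsl -xsl:oddbyexample.xsl corpus=%s\n' \
        "$heap" "${SAXON_JAR:-saxon9he.jar}" "$corpus"
}
```

For example, oddbyexample_cmd /wherever/you/have/yourfiles/ /somewhere/p5subset.xml prints the sample command with the extra parameter appended, and oddbyexample_java_cmd 4g /wherever/you/have/yourfiles/ prints a variant with a 4 GB heap limit.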

Tech support

No formal technical support is provided for Oddbyexample. If you post a question to the TEI-L list, though, other users may respond with help.

User community

Many members of the TEI community use Oddbyexample, but there is no formal community forum or mailing list.

Current version number, date of release and previous versions

Current and previous versions are all available through the TEI SourceForge repository. To see the current versions, go to the SVN directory view at http://tei.svn.sourceforge.net/viewvc/tei/trunk/Stylesheets/tools/.

How to download or buy

Grab oddbyexample.xsl from SourceForge (http://tei.svn.sourceforge.net/viewvc/tei/trunk/Stylesheets/tools/oddbyexample.xsl).

Additional notes