Samples of TEI texts

Please add links below to any TEI sample texts that are freely available for use by developers working on TEI-related software. By listing texts here you grant developers the right to test their software with them, but you are not necessarily licensing any other use of these texts. Developers who wish to make more extensive use of these materials should ask the text owners for permission.

Texts

 * ala2004 (EpiDoc XML files) from the Aphrodisias in Late Antiquity publication. The downloadable .zip archive contains 230 XML files, each containing an ancient Greek inscription, which validate against version 4 of the EpiDoc DTD (a TEI localization); the DTD is also included in the archive. These files are licensed under Creative Commons Attribution, so please feel free to do whatever you like with them! (Format: TEI P4)


 * The Auchinleck Manuscript, made available by the Oxford Text Archive (contact: [mailto:ota-info@rt.oucs.ox.ac.uk ota-info@rt.oucs.ox.ac.uk]). This text originates from the Auchinleck Manuscript Project at the National Library of Scotland; please see their website for more contextual material. (Format: TEI P5)


 * EpiDoc Demo Website, a growing collection of sample EpiDoc XML files, including examples from epigraphic, papyrological, and other ancient projects. The XML can be downloaded from the page of each transformed inscription. (Format: TEI P4)


 * IAph2007 (EpiDoc XML files) from the Inscriptions of Aphrodisias (2007) publication. Approximately 1500 XML files are available, each containing an ancient Greek or Latin inscription, either as a single .zip archive or as individual files that can be downloaded or linked to directly for dynamic processing. All files validate against the EpiDoc DTD (version 5). These files are licensed under Creative Commons Attribution (UK), so please feel free to do exciting things with them. (Format: TEI P4)


 * The Migration Samples page on the main TEI website includes sample texts from (inter alia) the British National Corpus, the Thomas MacGreevy Archive, Early English Books Online, Multext East, Documenting the American South, and the Women Writers Project. These were prepared as part of the TEI P4 Migration Work Group, whose purpose was to demonstrate how to migrate TEI P3 (SGML) to TEI P4 (XML). Most of the material here is therefore of a certain antiquity.


 * Files referenced in Timothy J. Finney, "Manuscript Markup," in The Freer Biblical Manuscripts: Fresh Studies of an American Treasure Trove (ed. Larry W. Hurtado; SBLTCS 6; Atlanta: Society of Biblical Literature, 2006), 263-87. These include a partial transcription of the Freer manuscript of Paul (Gregory-Aland I 016), a transform, a stylesheet and a web page produced from the transcription by the transform.


 * The Samyukta Agama Project at Dharma Drum Buddhist College provides access to its more than 1000 TEI source files. Click on any cluster and find the link to the TEI source at the bottom of each column. The files are in Chinese, Pali and Sanskrit. This is an ongoing project, planned to end in winter 2008. Once the project is concluded, markup documentation, schemas and stylesheets will be made available on the website.


 * The SGML-to-XML Migration Task Force has made some of its case studies and samples available, with restricted permissions. These are mostly SGML P3-encoded files. Descriptions of these samples are also available.


 * The NZETC has a range of New Zealand and Pacific Islands texts. The texts are encoded in TEI P5, and the TEI source can generally be downloaded from each document's table of contents. Features include:
   * Use of  and tags to implement workflow
   * tag used extensively for personal, ship, place, organisation and work names (keyed to external authority at )
   * Use of xml:lang="en" and xml:lang="mi" for texts with English and Maori (plus small amounts of other languages)
   * Page images, facsimile PDFs and typeset PDFs (some texts only, for example this letter)
   * Document-by-document licensing, with some documents under a Creative Commons license (licensing information is not currently stored in the TEI).


 * A subset of Project Gutenberg is available as TEI: go to http://www.gutenberg.org/catalog/world/search and select "TEI Text Encoding Initiative (tei)" as the file type.


 * The BVH project (Virtual Humanistic Libraries) is a virtual library of high-quality digitised documents, offering a selection of Renaissance books located in the libraries of the Région Centre, Paris, Poitiers, Lyons, Troyes, etc. Three sample TEI texts are offered in HTML, PDF and XML/TEI on Epistemon. These files are licensed under Creative Commons Attribution.


 * The Perseus Project makes its TEI P4 XML collections in Greek, Latin, and English available from http://www.perseus.tufts.edu/hopper/opensource under a Creative Commons ShareAlike/Non-Commercial/Attribution license.


 * Work is ongoing on the possibility of using the TEI to edit and archive ISO documents. Relevant tips and documentation are provided under TEI for ISO.
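
The samples listed above mix TEI P4 and TEI P5 encodings, so software under test may first need to tell the two apart. The following is a minimal Python sketch of one way to do that, not a definitive check. It assumes that P5 files have a root element TEI or teiCorpus in the namespace http://www.tei-c.org/ns/1.0 and that P4 XML files have a root element TEI.2 or teiCorpus.2 with no namespace. The function names guess_tei_version and count_languages are hypothetical, chosen for illustration only. The second function tallies xml:lang values, which may be useful with multilingual P5 samples.

```python
import xml.etree.ElementTree as ET

# Assumption: P5 documents use this namespace; P4 documents use none.
TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def guess_tei_version(xml_text):
    """Return 'P5', 'P4' or 'unknown', judging by the root element only."""
    root = ET.fromstring(xml_text)
    if root.tag in ("{%s}TEI" % TEI_NS, "{%s}teiCorpus" % TEI_NS):
        return "P5"
    if root.tag in ("TEI.2", "teiCorpus.2"):
        return "P4"
    return "unknown"


def count_languages(xml_text):
    """Count how many elements carry each xml:lang value."""
    root = ET.fromstring(xml_text)
    counts = {}
    for element in root.iter():
        lang = element.get(XML_LANG)
        if lang is not None:
            counts[lang] = counts.get(lang, 0) + 1
    return counts


if __name__ == "__main__":
    # Tiny hand-written stand-ins, not taken from any of the listed samples.
    p5_sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:lang="en">'
        '<text><body><p xml:lang="mi">Kia ora</p><p>Hello</p></body></text>'
        '</TEI>'
    )
    p4_sample = "<TEI.2><text><body><p>Hello</p></body></text></TEI.2>"
    print(guess_tei_version(p5_sample), count_languages(p5_sample))
    print(guess_tei_version(p4_sample))
```

Files that rely on entities declared in an external DTD may fail to parse with this approach, because ElementTree does not load external DTDs. Such files would need a validating or DTD-aware parser instead.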

Dictionaries

 * FreeDict is a repository of TEI P4-encoded bilingual translation dictionaries under free licenses (http://www.freedict.org/). Some are now in P5, for example http://freedict.svn.sourceforge.net/viewvc/freedict/trunk/swa-eng/swa-eng.tei