Samples of TEI texts

Please add links below to any TEI sample texts that are freely available for use by developers working on TEI-related software. By listing the texts here you are allowing developers to test their software with your texts, but are not necessarily licensing any other use of these texts. Developers who wish to make any more in-depth use of these materials should ask the text owners for permission.

Texts

 * ala2004 (EpiDoc XML) from the Aphrodisias in Late Antiquity publication. The downloadable .zip archive contains 230 XML files, each containing an ancient Greek inscription, which validate against version 4 of the EpiDoc DTD (a TEI localization); the DTD is also included in the archive. These files are licenced under Creative Commons Attribution, so please feel free to do whatever you like with them! (Format: TEI P4)


 * Archimedes Palimpsest, XML files containing the transcriptions of the Archimedes text, released (like all the Palimpsest data and metadata) under Creative Commons Attribution 3.0 Unported. Texts validate to TEI P5. One XML file per folio page (scroll down list of hi-res photographs in each directory). Format: TEI P5


 * The Auchinleck Manuscript, made available by the Oxford Text Archive (contact: ota-info@rt.oucs.ox.ac.uk). This text originates from the Auchinleck Manuscript Project at the National Library of Scotland; please see their website for more contextual material. Format: TEI P5.


 * Duke Databank/Heidelberg (EpiDoc XML): aggregated data from the Duke Databank of Documentary Papyri (DDbDP: transcribed Greek texts) and the Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Ägyptens (HGV: metadata). Approximately 55,000 XML files released under a Creative Commons Attribution license (CC-BY) by the Integrating Digital Papyrology project. Format: TEI P5.


 * EpiDoc Demo Website, a growing collection of sample EpiDoc XML files, including examples from epigraphic, papyrological, and other ancient projects. XML downloadable from each transformed inscription. (Vintage 2007.) (Format: TEI P4)


 * A subset of Project Gutenberg is available as TEI: go to http://www.gutenberg.org/catalog/world/search and select "TEI Text Encoding Initiative (tei)" as the file type.


 * IAph2007 (EpiDoc XML files) from the Inscriptions of Aphrodisias (2007) publication. There are approximately 1500 XML files available (either in a single .zip or as individual files, downloadable or linkable directly for dynamic processing), each containing an ancient Greek or Latin inscription. All files validate against the EpiDoc DTD (version 5). These files are licensed under Creative Commons Attribution (UK), so please feel free to do exciting things with them. (Format: TEI P4)


 * Inscriptions of Roman Tripolitania 2009 (EpiDoc XML), about 1000 Latin and Greek inscriptions available for download under Creative Commons Attribution (CC-BY) licence. Format: TEI P4.


 * Files referenced in Timothy J. Finney, "Manuscript Markup," in The Freer Biblical Manuscripts: Fresh Studies of an American Treasure Trove (ed. Larry W. Hurtado; SBLTCS 6; Atlanta: Society of Biblical Literature, 2006), 263-87. These include a partial transcription of the Freer manuscript of Paul (Gregory-Aland I 016), a transform, a stylesheet and a web page produced from the transcription by the transform. (Format: TEI P5)


 * The NZETC has a range of New Zealand and Pacific-Islands texts. The texts are P5 encoded and the TEI is generally downloadable from the document table of contents. Features include:
   * Use of  and tags to implement workflow
   * tag used extensively for personal, ship, place, organisation and work names (keyed to external authority at )
   * Use of xml:lang="en" and xml:lang="mi" for texts with English and Maori (plus small amounts of other languages)
   * Page images, facsimile PDFs and typeset PDFs (some texts only, for example this letter)
   * Document-by-document licensing, with some documents under a Creative Commons license (licensing info not currently stored in the TEI).
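For developers who want a quick look at how languages are marked in a downloaded file, the following is a minimal sketch in Python that counts explicit xml:lang values. The function name count_languages and the SAMPLE fragment are made up for illustration; the fragment is not taken from an NZETC text and is not a complete, schema-valid TEI document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes xml:lang under the fixed XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def count_languages(xml_text: str) -> Counter:
    """Count how many elements explicitly declare each xml:lang value.

    Only explicit attributes are counted; values inherited from an
    ancestor element are not propagated to descendants.
    """
    root = ET.fromstring(xml_text)
    return Counter(
        el.get(XML_LANG) for el in root.iter() if el.get(XML_LANG) is not None
    )


# Hypothetical fragment for illustration only (no teiHeader, not schema-valid).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text xml:lang="en">
    <body>
      <p>An English paragraph.</p>
      <p xml:lang="mi">He kōrero Māori.</p>
    </body>
  </text>
</TEI>"""

if __name__ == "__main__":
    for lang, count in count_languages(SAMPLE).items():
        print(lang, count)
```

Because xml:lang is inherited in XML, a count of explicit attributes shows where the language changes rather than how much text is in each language.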


 * The Perseus Project makes its TEI P4 XML collections in Greek, Latin, and English available from http://www.perseus.tufts.edu/hopper/opensource under a Creative Commons ShareAlike/Non-Commercial/Attribution license.


 * The Samyukta Agama Project at Dharma Drum Buddhist College provides access to its more than 1000 TEI source files. Click on any cluster and find the link to the TEI source at the bottom of each column. The files are in Chinese, Pali and Sanskrit. This is an ongoing project, planned to end in winter 2008. Once the project is concluded, markup documentation, schemas and stylesheets will be made available at the website.


 * The Migration Samples page on the main TEI website includes sample texts from (inter alia) the British National Corpus, the Thomas McGreevey Archive, Early English Books Online, Multext East, Documenting the American South, and the Women Writers Project which were prepared as part of the TEI P4 Migration Work Group, the purpose of which was to demonstrate how to migrate TEI P3 (SGML) to TEI P4 (XML). Most of the material here is therefore of a certain antiquity.


 * The BVH project (Virtual Humanistic Libraries) is a virtual library of high-quality digitised documents, offering a selection of Renaissance books located in the libraries of the Région Centre, Paris, Poitiers, Lyons, Troyes, etc. Three samples of TEI texts are offered in HTML, PDF and XML/TEI on Epistemon. These files are licenced under Creative Commons Attribution.


 * Work is ongoing on the possibility of using the TEI to edit and archive ISO documents. Relevant tips and documentation are provided under TEI for ISO.


 * An example of TEI in DSpace: http://dspace.nitle.org/handle/10090/11695 (P4?)


 * The SARIT project has recently brought out an electronic TEI-encoded edition of a 2007 print publication. It is a work on Buddhist tantric religion: Christian K. Wedemeyer, ed., Āryadeva's Lamp that Integrates the Practices (Caryāmelāpakapradīpa): The Gradual Path of Vajrayāna Buddhism According to the Esoteric Community Noble Tradition - Part Three: Critically Edited Sanskrit Text of Āryadeva's Caryāmelāpakapradīpa (New York: The American Institute of Buddhist Studies at Columbia University in New York with Columbia University's Center for Buddhist Studies and Tibet House US, 2007). E-details and full text can be seen here. Clicking Downloads on the above screen offers downloadable TEI, PDF and HTML versions of this e-text, and several others. The interesting thing about this e-text from the TEI point of view is the encoding and display of the manuscript variants to the critical edition. It was good of the publishers and editors to give their permission for the e-dissemination of this work just three years after print publication. Best, Dr Dominik Wujastyk.

Dictionaries

 * FreeDict is a repository of various TEI P4-encoded bilingual translation dictionaries under free licenses (http://www.freedict.org/). Some are now in P5, for example http://freedict.svn.sourceforge.net/viewvc/freedict/trunk/swa-eng/swa-eng.tei
 * Du Cange is a dictionary of medieval Latin (mostly written during the 17th and 18th centuries). The printed text is encoded in TEI P5 and is freely available at http://ducange.enc.sorbonne.fr/src/ and the TEI choices are documented (in French).
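
The entries on this page are marked as TEI P4 or TEI P5, and a few are uncertain. As a rough aid for developers sorting downloaded samples, here is a minimal sketch in Python that guesses the version from the root element: P5 documents use a TEI (or teiCorpus) root in the http://www.tei-c.org/ns/1.0 namespace, while P4 XML documents use a TEI.2 (or teiCorpus.2) root with no namespace. The function name guess_tei_version is made up for illustration, and the check is a heuristic, not a substitute for validation against the relevant DTD or schema.

```python
import io
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def guess_tei_version(xml_bytes: bytes) -> str:
    """Return "P5", "P4" or "unknown" based on the root element only."""
    # The first "start" event reported by iterparse is the root element.
    for _event, root in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)):
        tag = root.tag
        if tag in (f"{{{TEI_NS}}}TEI", f"{{{TEI_NS}}}teiCorpus"):
            return "P5"
        if tag in ("TEI.2", "teiCorpus.2"):
            return "P4"
        return "unknown"
    return "unknown"


if __name__ == "__main__":
    print(guess_tei_version(b'<TEI xmlns="http://www.tei-c.org/ns/1.0"/>'))
    print(guess_tei_version(b"<TEI.2><teiHeader/></TEI.2>"))
```

Note that P4 files which rely on entities declared in an external DTD may fail to parse with this parser, since it does not load external DTDs; in that case a DTD-aware parser would be needed.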