TEI Internationalisation

The Text Encoding Initiative Guidelines have been adopted by projects and institutions in many countries and used for encoding texts in dozens of languages. However, the complex encoding of texts at which the TEI excels needs a close understanding of the 522 available elements, and non-English speakers are at a considerable disadvantage. Mistakes in using the TEI may come from misunderstanding the explanations in English. The TEI needs to be made more accessible.

The TEI Council and Board are promoting an effort to produce a limited translation of the TEI.

Some work has already been undertaken on translating element and attribute names; Alejandro Bia and Arno Mittelbach have prepared translation sets for Catalan, Spanish, and German. This work is integrated into the Roma application, allowing users to create tailored schemas in one of the supported languages. However, translating names is not necessarily the most effective way to proceed: many of the element names are abbreviated forms of English (e.g. ) that are not easy to translate sensibly, and the abbreviated names are relatively easy to recognize for people used to reading Latin script. Using  instead of  is not as helpful as translating "supplies a statement of responsibility for someone responsible for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply."

Study of the work described above suggests that translating the reference descriptions first, followed by translation of element names, is likely to be the most effective way to promote the TEI in non-English speaking communities.

Thus instead of , the TEI user might prefer to write , ,  or . The TEI has an established system for recording such translations, and preserving the relationship to the original names so that document instances can be put back into canonical form.

Similarly, instead of ''contains a single TEI-conformant document, comprising a TEI header and a text, either in isolation or as part of a teiCorpus element'', the Spanish-speaking user might find it more helpful to read ''contiene un único documento TEI, compuesto de una cabecera TEI (TEI header) y un cuerpo de texto (text) aislado o como parte de un elemento corpusTei (teiCorpus)''.
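
To illustrate the idea of putting a document written with translated element names back into canonical form, here is a minimal Python sketch. It does not reproduce the TEI's own mechanism for recording translations; it only shows the principle of a reversible name mapping. The table and the helper `canonicalize` are hypothetical: only the pair corpusTei / teiCorpus comes from the text above, and the other translated names are invented for the example. Namespaces are omitted for brevity, so a real TEI document would need extra handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical translation table: localized name -> canonical TEI name.
# Only "corpusTei" is taken from the text; the rest are invented examples.
ES_TO_CANONICAL = {
    "corpusTei": "teiCorpus",
    "cabeceraTei": "teiHeader",  # assumption, for illustration only
    "texto": "text",             # assumption, for illustration only
}


def canonicalize(xml_source: str, table: dict) -> str:
    """Rename every element found in `table` to its canonical name.

    Elements that are not in the table keep their original name.
    """
    root = ET.fromstring(xml_source)
    for element in root.iter():
        element.tag = table.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


# A tiny document instance written with the localized names.
localized = "<corpusTei><cabeceraTei/><texto>Hola</texto></corpusTei>"
canonical = canonicalize(localized, ES_TO_CANONICAL)
print(canonical)
```

Because the table maps each translated name to exactly one canonical name, the same data could be inverted to present a canonical document in the user's language, provided no two translated names collide.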