TEITOK

From TEIWiki
Revision as of 17:14, 17 August 2018 by Maarten Janssen (talk | contribs)


Synopsis

TEITOK, the Tokenized TEI Environment, is a web-based platform for viewing, creating, and editing corpora that combine rich textual mark-up with linguistic annotation, preferably encoded in TEI. It is developed at the Centro de Linguística da Universidade de Lisboa.

Features

  • Build a corpus where each text consists of a TEI/XML file
  • Annotate and edit each text in the corpus
    • Use a variety of scripts to provide automatic annotations
    • Use an easy GUI to edit manually
  • Search the corpus using CQP
    • Search results give XML fragments rather than raw CQP results
    • Statistical data about the results can be rendered as graphs
    • Edit XML documents directly from the CQP results
  • Visualize each TEI/XML file individually
    • Various visualization options depending on the content of the file
  • Plot the XML documents on the world map (OpenStreetMap)
    • Provide search results directly on the map
  • Align the XML transcription with a facsimile image
    • Visualize each manuscript line above its transcription
    • Get facsimile images of words from a CQP search
    • Transcribe directly from the facsimile image to TEI/XML
  • Work with dependency relations in TEI/XML
    • Searchable using a modified version of CQP (TT-CQP)
    • Create word sketches from the corpus
  • Align the XML transcription with an audio file
    • Get audio fragments directly from a CQP search
    • Visualize the audio as a waveform
    • Transcribe directly from the audio file to TEI/XML
  • Use stand-off annotation alongside the TEI/XML files
    • Visualize and edit the stand-off annotations in an interface inspired by Brat
    • Search the stand-off annotations directly in CQP
  • Work with interlinear glossed texts


User commentary

System requirements

Server-based software that runs on most Linux servers.

Source code and licensing

Support for TEI

In principle, TEITOK works with generic XML files, and can hence handle most flavours of TEI/XML. For more advanced features, it assumes the XML to be in TEI/XML P5.
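As a rough illustration of what token-level annotation in TEI/XML P5 can look like, the sketch below reads word tokens and their linguistic attributes from a small sample document. It assumes tokens are encoded as standard TEI `<w>` elements carrying `lemma` and `pos` attributes; the sample text and the helper names `extract_tokens` and `find_by_pos` are made up for this example, and TEITOK's own token markup and tooling may differ.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample document; not taken from any TEITOK corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p><w lemma="the" pos="DET">The</w> <w lemma="cat" pos="NOUN">cats</w> <w lemma="sleep" pos="VERB">sleep</w></p>
    </body>
  </text>
</TEI>"""


def extract_tokens(xml_text):
    """Return one dict per <w> element with its form, lemma, and pos."""
    root = ET.fromstring(xml_text)
    tokens = []
    for w in root.iter(f"{{{TEI_NS}}}w"):
        tokens.append({
            # itertext() also collects text nested in child mark-up.
            "form": "".join(w.itertext()),
            "lemma": w.get("lemma"),
            "pos": w.get("pos"),
        })
    return tokens


def find_by_pos(tokens, pos):
    """Return the forms of all tokens whose pos attribute equals `pos`."""
    return [t["form"] for t in tokens if t["pos"] == pos]


if __name__ == "__main__":
    for tok in extract_tokens(SAMPLE):
        print(tok["form"], tok["lemma"], tok["pos"], sep="\t")
```

Because the tokens stay inline in the XML, the same file carries both the textual mark-up and the linguistic annotation, which is the combination the synopsis describes.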

Language(s)

  • Interface written in PHP and JavaScript
  • Scripts written in Perl and C++
  • Multilingual interface with customizable internationalization
  • Documentation in English

Documentation

[1] [2]

Publications: [3]

Tech support

User community

Sample implementations

Examples of projects using TEITOK can be found on the project website: http://www.teitok.org/

Current version number and date of release

History of versions

TEITOK is updated frequently; the current version is 2.3 (August 2018).

How to download or buy

TEITOK is currently a private project on GitLab. Anyone interested in using TEITOK should create a GitLab account and send the account details to the author, Maarten Janssen, who can then add the account as a user to the project.

Additional notes