Xaira

Synopsis
XAIRA (XML Aware Information Retrieval Architecture) is an open source tool for constructing high-quality linguistically-motivated search interfaces to large collections of XML documents.

Features

 * Works on any collection of XML documents, not only TEI ones (and will add tags if necessary)
 * Indexes words as well as XML structure
 * Generates concordances, word indexes, collocations, summary statistics, text partitions
 * Open modular architecture designed for integration into web services; can also run as a standalone Windows application
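
To make the second and third features concrete, here is a minimal sketch of a keyword-in-context (KWIC) concordance that records both each word and the XML element path it occurs in. It does not use Xaira's code or API: the sample document, the element names, and the functions `index_words` and `concordance` are all hypothetical, and the word list is built in memory with Python's standard library rather than with a real index.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative only.
SAMPLE = "<text><p>The cat sat on the mat.</p><p>A dog saw the cat run.</p></text>"


def index_words(xml_string):
    """Return (word, path) pairs; path names the elements enclosing the word."""
    root = ET.fromstring(xml_string)
    tokens = []

    def walk(elem, path):
        here = path + "/" + elem.tag
        for word in re.findall(r"\w+", elem.text or ""):
            tokens.append((word, here))
        for child in elem:
            walk(child, here)
            # Text after a child's end tag belongs to the parent element.
            for word in re.findall(r"\w+", child.tail or ""):
                tokens.append((word, here))

    walk(root, "")
    return tokens


def concordance(tokens, keyword, width=2):
    """Return KWIC tuples: (left context, hit, right context, element path)."""
    lines = []
    for i, (word, path) in enumerate(tokens):
        if word.lower() == keyword.lower():
            left = " ".join(w for w, _ in tokens[max(0, i - width):i])
            right = " ".join(w for w, _ in tokens[i + 1:i + 1 + width])
            lines.append((left, word, right, path))
    return lines


tokens = index_words(SAMPLE)
for left, hit, right, path in concordance(tokens, "cat"):
    print(f"{left:>12} [{hit}] {right:<12} {path}")
```

Because each word is stored with its element path, a query in this sketch could be restricted to particular parts of the markup, which is the kind of structure-aware search the feature list describes.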

System requirements
The server and indexer are written to run on any platform (Linux, Mac, Windows). The index needs roughly as much disk space as the texts themselves, depending on the amount of tagging and the indexing policy.

Source code and licensing
Open source, released under the GPL.

Support for TEI
Supports TEI out of the box. Also supports any other XML schema. Uses a set of extensions to the TEI Header to document indexing policies.

Language(s)
Xaira is written in C++. The current interface and documentation are in English. The Windows GUI client comes with an XML resource file which can be customized for other languages, and has already been used to produce Hungarian and French versions.

Tech support
There is a developers' list and an online help service.

User community
Xaira and its predecessor SARA have been widely used within the corpus linguistics community for some time, as they are delivered with the British National Corpus.

Sample implementations
http://www.natcorp.ox.ac.uk

There is a textbook, The BNC Handbook (Aston and Burnard, EUP), which describes the use of SARA with the BNC.

Current version number and date of release
The current version is 1.22, released in January 2007.

History of versions
See http://xaira.sf.net for history

How to download or buy
All recent versions have been distributed from SourceForge. Free download from http://xaira.sf.net

Additional notes
For presentations and documentation, see http://www.xaira.org