<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.tei-c.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Matthieu+Decorde</id>
	<title>TEIWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.tei-c.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Matthieu+Decorde"/>
	<link rel="alternate" type="text/html" href="https://wiki.tei-c.org/index.php?title=Special:Contributions/Matthieu_Decorde"/>
	<updated>2026-04-21T17:53:58Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.32.0</generator>
	<entry>
		<id>https://wiki.tei-c.org/index.php?title=TXM&amp;diff=12033</id>
		<title>TXM</title>
		<link rel="alternate" type="text/html" href="https://wiki.tei-c.org/index.php?title=TXM&amp;diff=12033"/>
		<updated>2013-07-03T10:14:38Z</updated>

		<summary type="html">&lt;p&gt;Matthieu Decorde: /* Current version number and date of release */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Tools]]&lt;br /&gt;
&lt;br /&gt;
[[Category:Administrative tools]]&lt;br /&gt;
[[Category:Development tools]]&lt;br /&gt;
[[Category:Conversion and preprocessing tools]]&lt;br /&gt;
[[Category:Publishing and delivery tools]]&lt;br /&gt;
[[Category:Querying tools]]&lt;br /&gt;
[[Category:Analysis tools]]&lt;br /&gt;
[[Category:All-in-one Tools]]&lt;br /&gt;
[[Category:Interfaces]]&lt;br /&gt;
&lt;br /&gt;
[[Category:Discovering]]&lt;br /&gt;
[[Category:Comparing]]&lt;br /&gt;
[[Category:Sampling]]&lt;br /&gt;
[[Category:Illustrating]]&lt;br /&gt;
[[Category:Representing]]&lt;br /&gt;
&lt;br /&gt;
== Synopsis ==&lt;br /&gt;
TXM is a free, open-source, Unicode-, XML- and TEI-compatible text/corpus analysis environment and graphical client based on CQP and R. It is available for Microsoft Windows, Linux, Mac OS X and as a J2EE web portal.&lt;br /&gt;
&lt;br /&gt;
== Features ==&lt;br /&gt;
* Provides qualitative analysis tools:&lt;br /&gt;
** '''concordances''' of lexical patterns based on the efficient [http://cwb.sourceforge.net CQP] full text search engine and its CQL query language&lt;br /&gt;
** CQL pattern '''frequency lists''' for any word property (type, lemma, pos...)&lt;br /&gt;
** CQL pattern '''occurrence graphics'''&lt;br /&gt;
** lexical patterns are expressed in the CQL query language, based on word- and structure-level properties, for example:&lt;br /&gt;
*** &amp;quot;aiming&amp;quot; to simply search for the word 'aiming'&lt;br /&gt;
*** &amp;quot;.*ing&amp;quot; to search for words ending in &amp;quot;ing&amp;quot; (mostly verb forms)&lt;br /&gt;
*** [pos=&amp;quot;VERB&amp;quot; &amp;amp; word=&amp;quot;.*ing&amp;quot;] to search for verb forms ending in &amp;quot;ing&amp;quot; (where part-of-speech annotation is present)&lt;br /&gt;
*** [lemma=&amp;quot;group&amp;quot;] []{0,3} [pos=&amp;quot;VERB&amp;quot; &amp;amp; word=&amp;quot;.*ing&amp;quot;] to search for the collocation &amp;lt;group lemma&amp;gt; followed by a &amp;lt;verb with progressive aspect&amp;gt; with at most 3 words in between&lt;br /&gt;
** rich HTML-based text edition navigation with links from all other tools&lt;br /&gt;
* Provides quantitative analysis tools, based on [http://www.r-project.org R packages]:&lt;br /&gt;
** '''factorial correspondence analysis'''&lt;br /&gt;
** contrastive word '''specificities'''&lt;br /&gt;
** '''hierarchical classification'''&lt;br /&gt;
** '''analysis of cooccurring words''' or lexical patterns&lt;br /&gt;
* May be used with any collection of '''Unicode''' encoded documents in various formats: '''TXT''', '''XML''', '''XML-TEI''' P5 (BFM, BVH, etc. projects customization), XML-'''Transcriber''', XML-'''TMX''' (aligned corpora - alpha), XML-PPS ('''Factiva''' - alpha), etc.&lt;br /&gt;
* Applies various NLP tools to texts on the fly before analysis (e.g. '''TreeTagger''' for lemmatization and POS tagging)&lt;br /&gt;
* Indexes words and their properties as well as hierarchical structure of texts&lt;br /&gt;
* Indexes external or internal metadata of texts or speakers&lt;br /&gt;
* Allows construction of various '''subcorpora''' and '''partitions''' (for contrastive analysis between text structures or groups of words)&lt;br /&gt;
* '''Export'''s any result in CSV, XML or SVG format&lt;br /&gt;
* Scripting possible for automation of repetitive tasks or platform extension (in '''Groovy'''/Java)&lt;br /&gt;
* Includes a '''text editor''' to edit data sources, results and scripts&lt;br /&gt;
* Runs as a standalone '''Windows''', '''Mac OS X''' or '''Linux''' application&lt;br /&gt;
* Also runs as a '''web portal''' to access and analyze corpora online through a web browser (with access control management)&lt;br /&gt;
* '''Open source''': based on the best open source components for text analysis: CQP, R and Java &amp;amp; XSLT libraries&lt;br /&gt;
* Modular architecture (Eclipse RCP OSGi and J2EE conformant): one toolbox connecting all core components is used by all the applications&lt;br /&gt;
* Efficient Eclipse or Netbeans powered development framework&lt;br /&gt;
&lt;br /&gt;
== User comments ==&lt;br /&gt;
'''Please sign all comments.'''&lt;br /&gt;
&lt;br /&gt;
== System requirements ==&lt;br /&gt;
The standalone version runs on:&lt;br /&gt;
* Windows - 32bit or 64bit (tested on XP, Vista and 7)&lt;br /&gt;
* Mac OS X (tested on 10.5, 10.6 and 10.7)&lt;br /&gt;
* Linux - 32bit or 64bit (tested on Ubuntu and Debian)&lt;br /&gt;
&lt;br /&gt;
The portal server runs on any J2EE capable platform (tested in Tomcat and Glassfish).&lt;br /&gt;
&lt;br /&gt;
== Source code and licensing ==&lt;br /&gt;
Open source under the GPL v3 licence.&lt;br /&gt;
&lt;br /&gt;
== Support for TEI ==&lt;br /&gt;
Supports TEI and TEI Lite &amp;quot;out of the box&amp;quot; '''at the XML level''': words will be tokenized inside any #PCDATA and all the XML structure will be imported directly as textual structure.&lt;br /&gt;
&lt;br /&gt;
Supports various flavours of TEI P5 encoding semantics '''at the TEI level''':&lt;br /&gt;
* words and their properties: &amp;lt;nowiki&amp;gt;#PCDATA, &amp;lt;w&amp;gt;, &amp;lt;num&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* editorial markup: &amp;lt;nowiki&amp;gt;&amp;lt;sic&amp;gt;, &amp;lt;corr&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* texts and their properties: &amp;lt;nowiki&amp;gt;&amp;lt;TEI&amp;gt;, &amp;lt;text&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* intermediate text structures and their properties: &amp;lt;nowiki&amp;gt;&amp;lt;div&amp;gt;, &amp;lt;p&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* edition rendering: &amp;lt;nowiki&amp;gt;&amp;lt;pb/&amp;gt;, &amp;lt;head&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* what should not be indexed but considered for edition rendering: &amp;lt;nowiki&amp;gt;&amp;lt;teiHeader&amp;gt;, &amp;lt;note&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* alignment between texts: &amp;lt;nowiki&amp;gt;&amp;lt;teiCorpus&amp;gt;, &amp;lt;linkGrp&amp;gt;, &amp;lt;link&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* word identifier policy: the &amp;lt;nowiki&amp;gt;xml:id&amp;lt;/nowiki&amp;gt; attribute&lt;br /&gt;
* language declaration policy: the &amp;lt;nowiki&amp;gt;xml:lang&amp;lt;/nowiki&amp;gt; attribute&lt;br /&gt;
See the &amp;quot;BFM encoding manual&amp;quot; (in French, http://bfm.ens-lyon.fr/article.php3?id_article=158) for an example of encoding practice interpreted by TXM.&lt;br /&gt;
&lt;br /&gt;
The &amp;quot;TEI P5 BFM&amp;quot; TXM import module consists of Groovy and XSL scripts: they can be adapted directly by the user to any specific TEI encoding usage.&lt;br /&gt;
&lt;br /&gt;
TXM Import Modules also provide various import parameters to tune each import process to specific data sources.&lt;br /&gt;
&lt;br /&gt;
TEI sources from the following projects are currently imported into TXM:&lt;br /&gt;
* Perseus: http://www.perseus.tufts.edu/hopper&lt;br /&gt;
* TextGrid: http://www.textgrid.de/en&lt;br /&gt;
* NLTK - Brown Corpus (TEI XML Version): http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml&lt;br /&gt;
* Frantext (libre): http://www.cnrtl.fr/corpus/frantext&lt;br /&gt;
* Base de Français Médiéval (BFM): http://bfm.ens-lyon.fr&lt;br /&gt;
* BVH Epistemon: http://www.bvh.univ-tours.fr/Epistemon&lt;br /&gt;
* Bouvard&amp;amp;Pécuchet: http://dossiers-flaubert.ish-lyon.cnrs.fr&lt;br /&gt;
* Presses Universitaires de Caen (PUC), MRSH de Caen - Revues.org: http://www.unicaen.fr/recherche/mrsh/document_numerique/outils ([http://discours.revues.org?lang=en DISCOURS journal])&lt;br /&gt;
* TXM: https://sourceforge.net/apps/mediawiki/txm/index.php?title=XML-TXM&lt;br /&gt;
&lt;br /&gt;
TEI sources are preprocessed by several XSL stylesheets, which can be found in the TXM source code.&lt;br /&gt;
Some of those stylesheets are available in the online TXM XSL stylesheets library:&lt;br /&gt;
http://sourceforge.net/projects/txm/files/library/xsl&lt;br /&gt;
&lt;br /&gt;
== Language(s) ==&lt;br /&gt;
&lt;br /&gt;
=== User Interface Language(s) ===&lt;br /&gt;
The user interface is currently available in:&lt;br /&gt;
* standalone version:&lt;br /&gt;
** English (EN)&lt;br /&gt;
** French (FR)&lt;br /&gt;
** Russian (RU)&lt;br /&gt;
* portal version:&lt;br /&gt;
** English (EN)&lt;br /&gt;
** French (FR)&lt;br /&gt;
&lt;br /&gt;
=== Documentation Language(s) ===&lt;br /&gt;
The documentation is currently available in:&lt;br /&gt;
* standalone version:&lt;br /&gt;
** English (EN)&lt;br /&gt;
** French (FR)&lt;br /&gt;
* portal version:&lt;br /&gt;
** French (FR) (tutorial - alpha state)&lt;br /&gt;
&lt;br /&gt;
=== Text/Corpus Language(s) ===&lt;br /&gt;
TXM works natively with any Unicode-conformant corpus.&amp;lt;br/&amp;gt;&lt;br /&gt;
Language support is specific to each NLP tool used (for example, TreeTagger can tag the following languages: BG, DE, EN, ES, ET, FR, FRO, GL, IT, LA, PT, RU, SW, ZH).&lt;br /&gt;
&lt;br /&gt;
=== Programming Language(s) ===&lt;br /&gt;
TXM is written in the following programming languages:&lt;br /&gt;
* C for CQP search engine (independent open source project http://cwb.sourceforge.net)&lt;br /&gt;
* C and R for statistical packages (independent open source project http://www.r-project.org)&lt;br /&gt;
* Java for the Toolbox and the Applications (driven by an independent open consortium http://jcp.org/en/home/index)&lt;br /&gt;
** Eclipse RCP framework used for the standalone version (independent open source project http://wiki.eclipse.org/index.php/Rich_Client_Platform)&lt;br /&gt;
** GWT framework used for the web portal version (independent open source project http://code.google.com/intl/fr/webtoolkit)&lt;br /&gt;
* Groovy for the import modules and command scripts (independent open source project http://groovy.codehaus.org)&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
* Main entry point for documentation on TXM at the Textométrie project web site: http://textometrie.ens-lyon.fr/spip.php?article98&amp;amp;lang=en&lt;br /&gt;
** See for example the TXM manual (in French) at http://txm.svn.sourceforge.net/viewvc/txm/trunk/doc/Manuel%20de%20TXM%200.7%20FR.pdf?revision=2332&lt;br /&gt;
* TXM user community wiki (in French) at https://listes.cru.fr/wiki/txm-users (includes a FAQ)&lt;br /&gt;
* TXM developers wiki (in English) on Sourceforge : http://sourceforge.net/apps/mediawiki/txm&lt;br /&gt;
* All available documentation (for users and for developers) published on Sourceforge: http://sourceforge.net/projects/txm/files/documentation&lt;br /&gt;
&lt;br /&gt;
== Tech support ==&lt;br /&gt;
Tech support is mainly provided through two mailing lists (see below).&lt;br /&gt;
&lt;br /&gt;
Users can also use 3 different trackers:&lt;br /&gt;
* Bug Reports - to describe bugs encountered in the software: https://sourceforge.net/tracker/?group_id=247041&amp;amp;atid=1190738&lt;br /&gt;
* Feature requests - to describe the features, changes in interface or any other improvements required in the software: https://sourceforge.net/tracker/?group_id=247041&amp;amp;atid=1190851&lt;br /&gt;
* Request for help - to describe a very difficult technical problem encountered in using the software: https://sourceforge.net/tracker/?group_id=247041&amp;amp;atid=1190852&lt;br /&gt;
&lt;br /&gt;
== User community ==&lt;br /&gt;
Currently, the TXM user community communicates using two mailing lists and a wiki:&lt;br /&gt;
* International mailing list: txm-open AT lists.sourceforge.net (very low activity for the moment)&lt;br /&gt;
** See archives at http://sourceforge.net/mailarchive/forum.php?forum_name=txm-open&lt;br /&gt;
* The mostly French-speaking mailing list: txm-users AT cru.fr (the most active)&lt;br /&gt;
** See archives at https://listes.cru.fr/sympa/arc/txm-users&lt;br /&gt;
* TXM user community wiki (in French) at https://listes.cru.fr/wiki/txm-users&lt;br /&gt;
&lt;br /&gt;
Training in the use of TXM is available every year at the CNRS summer school « Computing and Statistical Methods in Text Analysis » (MISAT), see http://laseldi.univ-fcomte.fr/ecole.&lt;br /&gt;
&lt;br /&gt;
The JADT conference (http://jadt.org) is the main meeting place for the TXM user community.&lt;br /&gt;
&lt;br /&gt;
== Sample implementations ==&lt;br /&gt;
The standalone version of TXM is delivered with several sample corpora included, which can be directly analyzed from within TXM after installation.&lt;br /&gt;
&lt;br /&gt;
The portal version of TXM has a demo running online at http://portal.textometrie.org/demo/?locale=en (work in progress).&lt;br /&gt;
&lt;br /&gt;
An earlier experiment with a web application based on TXM, applied to a single TEI-encoded text, can be found at http://txm.ish-lyon.cnrs.fr/txm.&lt;br /&gt;
&lt;br /&gt;
== Current version number and date of release ==&lt;br /&gt;
* standalone: Current version is 0.7.2 released on Tuesday 2nd July 2013&lt;br /&gt;
* portal: Current version is 0.4 released November 2011&lt;br /&gt;
&lt;br /&gt;
== History of versions ==&lt;br /&gt;
See the Roadmap section on the developer's wiki at http://sourceforge.net/apps/mediawiki/txm.&lt;br /&gt;
&lt;br /&gt;
== How to download or buy ==&lt;br /&gt;
TXM is free to download and use:&lt;br /&gt;
* standalone (Windows, Mac, Linux):&lt;br /&gt;
** First point your browser to http://sourceforge.net/projects/txm&lt;br /&gt;
** Then click on the green Download button to download the setup for your architecture.&lt;br /&gt;
* portal (J2EE):&lt;br /&gt;
** First choose the archive for your architecture at [https://sourceforge.net/projects/txm/files/software/TXM%20portal https://sourceforge.net/projects/txm/files/software/TXM portal]&lt;br /&gt;
** Then follow installation instructions at https://sourceforge.net/apps/mediawiki/txm/index.php?title=TXM_WEB:_Quick_Install&lt;br /&gt;
** See also the demo portal http://portal.textometrie.org/demo/?locale=en&lt;br /&gt;
&lt;br /&gt;
== Additional notes ==&lt;br /&gt;
For publications related to TXM, please visit the Textométrie project web site at http://textometrie.ens-lyon.fr/spip.php?article82&amp;amp;lang=en:&lt;br /&gt;
* See for example:&amp;lt;br/&amp;gt;Heiden, S. (2010b). The TXM Platform: Building Open-Source Textual Analysis Software Compatible with the TEI Encoding Scheme. In K. I. Ryo Otoguro (Ed.), 24th Pacific Asia Conference on Language, Information and Computation - [http://www.compling.jp/paclic24 PACLIC24] (p. 389-398). Institute for Digital Enhancement of Cognitive Development, Waseda University, Sendai, Japan. [http://halshs.archives-ouvertes.fr/halshs-00549764/en Online].&lt;br /&gt;
&lt;br /&gt;
Sponsors &amp;amp; Contributors:&lt;br /&gt;
* Initial design and development of TXM (jan 2007- dec 2011) supported by French ANR grant #ANR-06-CORP-029&lt;br /&gt;
* Currently the platform continues its development through various contracts:&lt;br /&gt;
** ENS-LYON contract jun-aug 2009 (Rhône-Alpes region Cluster 13 grant): Queste del saint Graal web prototype&lt;br /&gt;
** ENS-LYON contract sept 2009 - jul 2010 (ANR CORPTEF Research Project funding): portal development&lt;br /&gt;
** Lyon 3 University contract jan-mar 2011: XML-Transcriber import, R GUI&lt;br /&gt;
** CNRS contract 2011 (DGLFLF grant): GGHF corpus processing&lt;br /&gt;
** Paris 1 University contract jan 2012 - dec 2014 (Matrice Equipex): TXM development and infrastructure for historians&lt;br /&gt;
* Other independent projects also improve TXM (community of developers):&lt;br /&gt;
** LASLA project 2011: import of ancient Latin and Greek corpora&lt;br /&gt;
** GREYC-PUC project may-jul 2011: PUC corpora import, improvement of portal, test on Glassfish&lt;br /&gt;
** PhD thesis on micro-finance 2011-: Factiva and Calibre import&lt;br /&gt;
** ANR-DFG SRCMF contract jun-jul 2012 : Tiger Search module, import &amp;amp; syntactic concordances&lt;/div&gt;</summary>
		<author><name>Matthieu Decorde</name></author>
		
	</entry>
	<entry>
		<id>https://wiki.tei-c.org/index.php?title=TXM&amp;diff=12032</id>
		<title>TXM</title>
		<link rel="alternate" type="text/html" href="https://wiki.tei-c.org/index.php?title=TXM&amp;diff=12032"/>
		<updated>2013-07-03T09:04:27Z</updated>

		<summary type="html">&lt;p&gt;Matthieu Decorde: /* Current version number and date of release */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[Category:Tools]]&lt;br /&gt;
&lt;br /&gt;
[[Category:Administrative tools]]&lt;br /&gt;
[[Category:Development tools]]&lt;br /&gt;
[[Category:Conversion and preprocessing tools]]&lt;br /&gt;
[[Category:Publishing and delivery tools]]&lt;br /&gt;
[[Category:Querying tools]]&lt;br /&gt;
[[Category:Analysis tools]]&lt;br /&gt;
[[Category:All-in-one Tools]]&lt;br /&gt;
[[Category:Interfaces]]&lt;br /&gt;
&lt;br /&gt;
[[Category:Discovering]]&lt;br /&gt;
[[Category:Comparing]]&lt;br /&gt;
[[Category:Sampling]]&lt;br /&gt;
[[Category:Illustrating]]&lt;br /&gt;
[[Category:Representing]]&lt;br /&gt;
&lt;br /&gt;
== Synopsis ==&lt;br /&gt;
TXM is a free, open-source, Unicode-, XML- and TEI-compatible text/corpus analysis environment and graphical client based on CQP and R. It is available for Microsoft Windows, Linux, Mac OS X and as a J2EE web portal.&lt;br /&gt;
&lt;br /&gt;
== Features ==&lt;br /&gt;
* Provides qualitative analysis tools:&lt;br /&gt;
** '''concordances''' of lexical patterns based on the efficient [http://cwb.sourceforge.net CQP] full text search engine and its CQL query language&lt;br /&gt;
** CQL pattern '''frequency lists''' for any word property (type, lemma, pos...)&lt;br /&gt;
** CQL pattern '''occurrence graphics'''&lt;br /&gt;
** lexical patterns are expressed in the CQL query language, based on word- and structure-level properties, for example:&lt;br /&gt;
*** &amp;quot;aiming&amp;quot; to simply search for the word 'aiming'&lt;br /&gt;
*** &amp;quot;.*ing&amp;quot; to search for words ending in &amp;quot;ing&amp;quot; (mostly verb forms)&lt;br /&gt;
*** [pos=&amp;quot;VERB&amp;quot; &amp;amp; word=&amp;quot;.*ing&amp;quot;] to search for verb forms ending in &amp;quot;ing&amp;quot; (where part-of-speech annotation is present)&lt;br /&gt;
*** [lemma=&amp;quot;group&amp;quot;] []{0,3} [pos=&amp;quot;VERB&amp;quot; &amp;amp; word=&amp;quot;.*ing&amp;quot;] to search for the collocation &amp;lt;group lemma&amp;gt; followed by a &amp;lt;verb with progressive aspect&amp;gt; with at most 3 words in between&lt;br /&gt;
** rich HTML-based text edition navigation with links from all other tools&lt;br /&gt;
* Provides quantitative analysis tools, based on [http://www.r-project.org R packages]:&lt;br /&gt;
** '''factorial correspondence analysis'''&lt;br /&gt;
** contrastive word '''specificities'''&lt;br /&gt;
** '''hierarchical classification'''&lt;br /&gt;
** '''analysis of cooccurring words''' or lexical patterns&lt;br /&gt;
* May be used with any collection of '''Unicode''' encoded documents in various formats: '''TXT''', '''XML''', '''XML-TEI''' P5 (BFM, BVH, etc. projects customization), XML-'''Transcriber''', XML-'''TMX''' (aligned corpora - alpha), XML-PPS ('''Factiva''' - alpha), etc.&lt;br /&gt;
* Applies various NLP tools to texts on the fly before analysis (e.g. '''TreeTagger''' for lemmatization and POS tagging)&lt;br /&gt;
* Indexes words and their properties as well as hierarchical structure of texts&lt;br /&gt;
* Indexes external or internal metadata of texts or speakers&lt;br /&gt;
* Allows construction of various '''subcorpora''' and '''partitions''' (for contrastive analysis between text structures or groups of words)&lt;br /&gt;
* '''Export'''s any result in CSV, XML or SVG format&lt;br /&gt;
* Scripting possible for automation of repetitive tasks or platform extension (in '''Groovy'''/Java)&lt;br /&gt;
* Includes a '''text editor''' to edit data sources, results and scripts&lt;br /&gt;
* Runs as a standalone '''Windows''', '''Mac OS X''' or '''Linux''' application&lt;br /&gt;
* Also runs as a '''web portal''' to access and analyze corpora online through a web browser (with access control management)&lt;br /&gt;
* '''Open source''': based on the best open source components for text analysis: CQP, R and Java &amp;amp; XSLT libraries&lt;br /&gt;
* Modular architecture (Eclipse RCP OSGi and J2EE conformant): one toolbox connecting all core components is used by all the applications&lt;br /&gt;
* Efficient Eclipse or Netbeans powered development framework&lt;br /&gt;
&lt;br /&gt;
== User comments ==&lt;br /&gt;
'''Please sign all comments.'''&lt;br /&gt;
&lt;br /&gt;
== System requirements ==&lt;br /&gt;
The standalone version runs on:&lt;br /&gt;
* Windows - 32bit or 64bit (tested on XP, Vista and 7)&lt;br /&gt;
* Mac OS X (tested on 10.5, 10.6 and 10.7)&lt;br /&gt;
* Linux - 32bit or 64bit (tested on Ubuntu and Debian)&lt;br /&gt;
&lt;br /&gt;
The portal server runs on any J2EE capable platform (tested in Tomcat and Glassfish).&lt;br /&gt;
&lt;br /&gt;
== Source code and licensing ==&lt;br /&gt;
Open source under the GPL v3 licence.&lt;br /&gt;
&lt;br /&gt;
== Support for TEI ==&lt;br /&gt;
Supports TEI and TEI Lite &amp;quot;out of the box&amp;quot; '''at the XML level''': words will be tokenized inside any #PCDATA and all the XML structure will be imported directly as textual structure.&lt;br /&gt;
&lt;br /&gt;
Supports various flavours of TEI P5 encoding semantics '''at the TEI level''':&lt;br /&gt;
* words and their properties: &amp;lt;nowiki&amp;gt;#PCDATA, &amp;lt;w&amp;gt;, &amp;lt;num&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* editorial markup: &amp;lt;nowiki&amp;gt;&amp;lt;sic&amp;gt;, &amp;lt;corr&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* texts and their properties: &amp;lt;nowiki&amp;gt;&amp;lt;TEI&amp;gt;, &amp;lt;text&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* intermediate text structures and their properties: &amp;lt;nowiki&amp;gt;&amp;lt;div&amp;gt;, &amp;lt;p&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* edition rendering: &amp;lt;nowiki&amp;gt;&amp;lt;pb/&amp;gt;, &amp;lt;head&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* what should not be indexed but considered for edition rendering: &amp;lt;nowiki&amp;gt;&amp;lt;teiHeader&amp;gt;, &amp;lt;note&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* alignment between texts: &amp;lt;nowiki&amp;gt;&amp;lt;teiCorpus&amp;gt;, &amp;lt;linkGrp&amp;gt;, &amp;lt;link&amp;gt;...&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
* word identifier policy: the &amp;lt;nowiki&amp;gt;xml:id&amp;lt;/nowiki&amp;gt; attribute&lt;br /&gt;
* language declaration policy: the &amp;lt;nowiki&amp;gt;xml:lang&amp;lt;/nowiki&amp;gt; attribute&lt;br /&gt;
See the &amp;quot;BFM encoding manual&amp;quot; (in French, http://bfm.ens-lyon.fr/article.php3?id_article=158) for an example of encoding practice interpreted by TXM.&lt;br /&gt;
&lt;br /&gt;
The &amp;quot;TEI P5 BFM&amp;quot; TXM import module consists of Groovy and XSL scripts: they can be adapted directly by the user to any specific TEI encoding usage.&lt;br /&gt;
&lt;br /&gt;
TXM Import Modules also provide various import parameters to tune each import process to specific data sources.&lt;br /&gt;
&lt;br /&gt;
TEI sources from the following projects are currently imported into TXM:&lt;br /&gt;
* Perseus: http://www.perseus.tufts.edu/hopper&lt;br /&gt;
* TextGrid: http://www.textgrid.de/en&lt;br /&gt;
* NLTK - Brown Corpus (TEI XML Version): http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml&lt;br /&gt;
* Frantext (libre): http://www.cnrtl.fr/corpus/frantext&lt;br /&gt;
* Base de Français Médiéval (BFM): http://bfm.ens-lyon.fr&lt;br /&gt;
* BVH Epistemon: http://www.bvh.univ-tours.fr/Epistemon&lt;br /&gt;
* Bouvard&amp;amp;Pécuchet: http://dossiers-flaubert.ish-lyon.cnrs.fr&lt;br /&gt;
* Presses Universitaires de Caen (PUC), MRSH de Caen - Revues.org: http://www.unicaen.fr/recherche/mrsh/document_numerique/outils ([http://discours.revues.org?lang=en DISCOURS journal])&lt;br /&gt;
* TXM: https://sourceforge.net/apps/mediawiki/txm/index.php?title=XML-TXM&lt;br /&gt;
&lt;br /&gt;
TEI sources are preprocessed by several XSL stylesheets, which can be found in the TXM source code.&lt;br /&gt;
Some of those stylesheets are available in the online TXM XSL stylesheets library:&lt;br /&gt;
http://sourceforge.net/projects/txm/files/library/xsl&lt;br /&gt;
&lt;br /&gt;
== Language(s) ==&lt;br /&gt;
&lt;br /&gt;
=== User Interface Language(s) ===&lt;br /&gt;
The user interface is currently available in:&lt;br /&gt;
* standalone version:&lt;br /&gt;
** English (EN)&lt;br /&gt;
** French (FR)&lt;br /&gt;
** Russian (RU)&lt;br /&gt;
* portal version:&lt;br /&gt;
** English (EN)&lt;br /&gt;
** French (FR)&lt;br /&gt;
&lt;br /&gt;
=== Documentation Language(s) ===&lt;br /&gt;
The documentation is currently available in:&lt;br /&gt;
* standalone version:&lt;br /&gt;
** English (EN)&lt;br /&gt;
** French (FR)&lt;br /&gt;
* portal version:&lt;br /&gt;
** French (FR) (tutorial - alpha state)&lt;br /&gt;
&lt;br /&gt;
=== Text/Corpus Language(s) ===&lt;br /&gt;
TXM works natively with any Unicode-conformant corpus.&amp;lt;br/&amp;gt;&lt;br /&gt;
Language support is specific to each NLP tool used (for example, TreeTagger can tag the following languages: BG, DE, EN, ES, ET, FR, FRO, GL, IT, LA, PT, RU, SW, ZH).&lt;br /&gt;
&lt;br /&gt;
=== Programming Language(s) ===&lt;br /&gt;
TXM is written in the following programming languages:&lt;br /&gt;
* C for CQP search engine (independent open source project http://cwb.sourceforge.net)&lt;br /&gt;
* C and R for statistical packages (independent open source project http://www.r-project.org)&lt;br /&gt;
* Java for the Toolbox and the Applications (driven by an independent open consortium http://jcp.org/en/home/index)&lt;br /&gt;
** Eclipse RCP framework used for the standalone version (independent open source project http://wiki.eclipse.org/index.php/Rich_Client_Platform)&lt;br /&gt;
** GWT framework used for the web portal version (independent open source project http://code.google.com/intl/fr/webtoolkit)&lt;br /&gt;
* Groovy for the import modules and command scripts (independent open source project http://groovy.codehaus.org)&lt;br /&gt;
&lt;br /&gt;
== Documentation ==&lt;br /&gt;
* Main entry point for documentation on TXM at the Textométrie project web site: http://textometrie.ens-lyon.fr/spip.php?article98&amp;amp;lang=en&lt;br /&gt;
** See for example the TXM manual (in French) at http://txm.svn.sourceforge.net/viewvc/txm/trunk/doc/Manuel%20de%20TXM%200.7%20FR.pdf?revision=2332&lt;br /&gt;
* TXM user community wiki (in French) at https://listes.cru.fr/wiki/txm-users (includes a FAQ)&lt;br /&gt;
* TXM developers wiki (in English) on Sourceforge : http://sourceforge.net/apps/mediawiki/txm&lt;br /&gt;
* All available documentation (for users and for developers) published on Sourceforge: http://sourceforge.net/projects/txm/files/documentation&lt;br /&gt;
&lt;br /&gt;
== Tech support ==&lt;br /&gt;
Tech support is mainly provided through two mailing lists (see below).&lt;br /&gt;
&lt;br /&gt;
Users can also use 3 different trackers:&lt;br /&gt;
* Bug Reports - to describe bugs encountered in the software: https://sourceforge.net/tracker/?group_id=247041&amp;amp;atid=1190738&lt;br /&gt;
* Feature requests - to describe the features, changes in interface or any other improvements required in the software: https://sourceforge.net/tracker/?group_id=247041&amp;amp;atid=1190851&lt;br /&gt;
* Request for help - to describe a very difficult technical problem encountered in using the software: https://sourceforge.net/tracker/?group_id=247041&amp;amp;atid=1190852&lt;br /&gt;
&lt;br /&gt;
== User community ==&lt;br /&gt;
Currently, the TXM user community communicates using two mailing lists and a wiki:&lt;br /&gt;
* International mailing list: txm-open AT lists.sourceforge.net (very low activity for the moment)&lt;br /&gt;
** See archives at http://sourceforge.net/mailarchive/forum.php?forum_name=txm-open&lt;br /&gt;
* The mostly French-speaking mailing list: txm-users AT cru.fr (the most active)&lt;br /&gt;
** See archives at https://listes.cru.fr/sympa/arc/txm-users&lt;br /&gt;
* TXM user community wiki (in French) at https://listes.cru.fr/wiki/txm-users&lt;br /&gt;
&lt;br /&gt;
Training in the use of TXM is available every year at the CNRS summer school « Computing and Statistical Methods in Text Analysis » (MISAT), see http://laseldi.univ-fcomte.fr/ecole.&lt;br /&gt;
&lt;br /&gt;
The JADT conference (http://jadt.org) is the main meeting place for the TXM user community.&lt;br /&gt;
&lt;br /&gt;
== Sample implementations ==&lt;br /&gt;
The standalone version of TXM is delivered with several sample corpora included, which can be directly analyzed from within TXM after installation.&lt;br /&gt;
&lt;br /&gt;
The portal version of TXM has a demo running online at http://portal.textometrie.org/demo/?locale=en (work in progress).&lt;br /&gt;
&lt;br /&gt;
An earlier experiment with a web application based on TXM, applied to a single TEI-encoded text, can be found at http://txm.ish-lyon.cnrs.fr/txm.&lt;br /&gt;
&lt;br /&gt;
== Current version number and date of release ==&lt;br /&gt;
* standalone: Current version is 0.7.2 released on Tuesday 2nd July 2013&lt;br /&gt;
* portal: Current version is 0.4 released November 2011&lt;br /&gt;
&lt;br /&gt;
== History of versions ==&lt;br /&gt;
See the Roadmap section on the developer's wiki at http://sourceforge.net/apps/mediawiki/txm.&lt;br /&gt;
&lt;br /&gt;
== How to download or buy ==&lt;br /&gt;
TXM is free to download and use:&lt;br /&gt;
* standalone (Windows, Mac, Linux):&lt;br /&gt;
** First point your browser to http://sourceforge.net/projects/txm&lt;br /&gt;
** Then click on the green Download button to download the setup for your architecture.&lt;br /&gt;
* portal (J2EE):&lt;br /&gt;
** First choose the archive for your architecture at [https://sourceforge.net/projects/txm/files/software/TXM%20portal https://sourceforge.net/projects/txm/files/software/TXM portal]&lt;br /&gt;
** Then follow installation instructions at https://sourceforge.net/apps/mediawiki/txm/index.php?title=TXM_WEB:_Quick_Install&lt;br /&gt;
** See also the demo portal http://portal.textometrie.org/demo/?locale=en&lt;br /&gt;
&lt;br /&gt;
== Additional notes ==&lt;br /&gt;
For publications related to TXM, please visit the Textométrie project web site at http://textometrie.ens-lyon.fr/spip.php?article82&amp;amp;lang=en:&lt;br /&gt;
* See for example:&amp;lt;br/&amp;gt;Heiden, S. (2010b). The TXM Platform: Building Open-Source Textual Analysis Software Compatible with the TEI Encoding Scheme. In K. I. Ryo Otoguro (Ed.), 24th Pacific Asia Conference on Language, Information and Computation - [http://www.compling.jp/paclic24 PACLIC24] (p. 389-398). Institute for Digital Enhancement of Cognitive Development, Waseda University, Sendai, Japan. [http://halshs.archives-ouvertes.fr/halshs-00549764/en Online].&lt;br /&gt;
&lt;br /&gt;
Sponsors &amp;amp; Contributors:&lt;br /&gt;
* Initial design and development of TXM (jan 2007- dec 2011) supported by French ANR grant #ANR-06-CORP-029&lt;br /&gt;
* Currently the platform continues its development through various contracts:&lt;br /&gt;
** ENS-LYON contract jun-aug 2009 (Rhône-Alpes region Cluster 13 grant): Queste del saint Graal web prototype&lt;br /&gt;
** ENS-LYON contract sept 2009 - jul 2010 (ANR CORPTEF Research Project funding): portal development&lt;br /&gt;
** Lyon 3 University contract jan-mar 2011: XML-Transcriber import, R GUI&lt;br /&gt;
** CNRS contract 2011 (DGLFLF grant): GGHF corpus processing&lt;br /&gt;
** Paris 1 University contract jan 2012 - dec 2014 (Matrice Equipex): TXM development and infrastructure for historians&lt;br /&gt;
* Other independent projects also improve TXM (community of developers):&lt;br /&gt;
** LASLA project 2011: import of ancient Latin and Greek corpora&lt;br /&gt;
** GREYC-PUC project may-jul 2011: PUC corpora import, improvement of portal, test on Glassfish&lt;br /&gt;
** PhD thesis on micro-finance 2011-: Factiva and Calibre import&lt;br /&gt;
** ANR-DFG SRCMF contract jun-jul 2012 : Tiger Search module, import &amp;amp; syntactic concordances&lt;/div&gt;</summary>
		<author><name>Matthieu Decorde</name></author>
		
	</entry>
</feed>