User:Piotr Banski
- Institut für Deutsche Sprache, Mannheim;
- previous affiliation (until 2013): Institute of English Studies, University of Warsaw.
I was involved in the creation of the IPI PAN Corpus of Modern Polish, mostly experimenting with the possibilities that the TEI and the CES offered for its architecture. I was also hired as XML architect for the National Corpus of Polish (NCP); that work resulted in the robust multi-layer stand-off TEI encoding for the NCP.
At IDS Mannheim, I have co-ordinated the KorAP project, and currently, I am involved in the creation of CoMParS, a multilingual parallel collection of sentence fragments (see my slides from the LingSIG meeting in Lyon for more info).
My current TEI-related projects include the open-source FreeDict (TEI-encoded translating dictionary repository) and the dormant OCTC (TEI-encoded multi-instance stand-off multilingual and partly parallel text corpus <note>I could go on with these attributes, you know -- Syd says I'm articulate ;-)</note>).
Some of my experiences with applying the TEI to large-scale language resources are reflected in my Balisage 2010 paper, "Why TEI stand-off annotation doesn't quite work, and why you might want to use it nevertheless".
Together with Andreas Witt, I am co-convener of the TEI Special Interest Group "TEI for Linguists". The SIG started in 2009 and is still going strong. I served on the TEI Technical Council for the 2011-2012 term; part of my involvement focussed on making sure that the needs of the language-resource-oriented side of the TEI community were duly communicated to the Council and recognized in the decision-making process.
I am also an administrator of this wiki and a co-author of many of its articles.