Difference between revisions of "SIG:CMC/Technical Meeting on CMC at DARIAH VCC 2014"
m (→WORKSHOP DESCRIPTION) |
|||
Line 21: | Line 21: | ||
Researchers at a European level are already aware that many of the challenges in building CMC corpora in the humanities are the same for every language; therefore CMC corpus projects for different languages can benefit from sharing knowledge and experience with each other and from facing the challenges as a joint task. Since 2013, a group of corpus projects from France, Germany, Italy and the Netherlands has started to exchange expertise and experience in building CMC corpora (= the network "Building and annotating CMC corpora", https://wiki.itmc.tu-dortmund.de/cmc/) and to jointly work on a proposal for an extension to the TEI standard which is adapted to the particularities of a broad range of CMC genres (= the TEI special interest group on CMC, http://www.tei-c.org/Activities/SIG/). The DARIAH technical meeting will gather a restricted number of researchers, coming from different European countries, involved in projects aiming at building, structuring, annotating and analyzing CMC corpora - including: | Researchers at a European level are already aware that many of the challenges in building CMC corpora in the humanities are the same for every language; therefore CMC corpus projects for different languages can benefit from sharing knowledge and experience with each other and from facing the challenges as a joint task. Since 2013, a group of corpus projects from France, Germany, Italy and the Netherlands has started to exchange expertise and experience in building CMC corpora (= the network "Building and annotating CMC corpora", https://wiki.itmc.tu-dortmund.de/cmc/) and to jointly work on a proposal for an extension to the TEI standard which is adapted to the particularities of a broad range of CMC genres (= the TEI special interest group on CMC, http://www.tei-c.org/Activities/SIG/).
The DARIAH technical meeting will gather a restricted number of researchers, coming from different European countries, involved in projects aiming at building, structuring, annotating and analyzing CMC corpora - including: | ||
− | * | + | *''CoMeRe'': Project for a corpus of French CMC: http://comere.org |
− | * | + | *''Dortmund Chat Corpus'': http://www.chatkorpus.tu-dortmund.de |
− | * | + | *''DeRiK'': project for a reference corpus of German CMC: http://www.tinyurl.com/derik-llc |
− | * | + | *''Web2CorpusIT'': Pilot Corpus of Italian Computer-Mediated communication: http://www.glottoweb.org/web2corpus/ |
− | * | + | *''Wikipedia corpora in DeReKo'' / IDS Mannheim: http://www.ids-mannheim.de/dereko |
− | * | + | *''KobRA'': Corpus-based linguistic analysis with the help of data mining: http://www.kobra.tu-dortmund.de |
The expected outcomes of the meeting are, amongst others: | The expected outcomes of the meeting are, amongst others: | ||
*an advanced proposal for representing CMC genres in TEI (which subsequently shall be presented as a proposal to the TEI community in 2015); | *an advanced proposal for representing CMC genres in TEI (which subsequently shall be presented as a proposal to the TEI community in 2015); | ||
Line 33: | Line 33: | ||
<br/><br/> | <br/><br/> | ||
+ | |||
=TENTATIVE SCHEDULE / PROGRAM= | =TENTATIVE SCHEDULE / PROGRAM= | ||
Revision as of 18:14, 12 July 2014
This page describes a workshop and tentative program for a community session/technical meeting on issues related to the modeling of CMC corpora, organized by members of the CMC-SIG at the 4th DARIAH-EU VCC meeting 2014 in Rome.

Date: Thursday, September 18, 15:00-18:00
Location: Rome, Villa Mirafiori
Main page of the CMC-SIG in this wiki: SIG:Computer-Mediated Communication

WORKSHOP DESCRIPTION

See PDF version on the DARIAH website

Corpora of computer-mediated communication (CMC) are a desideratum for many scholars in the humanities who are interested in doing empirical research on language use and on emerging communicative genres on the Internet and in social media applications. Important steps for building such corpora and for representing them in an interoperable way are:
Researchers at a European level are already aware that many of the challenges in building CMC corpora in the humanities are the same for every language; therefore CMC corpus projects for different languages can benefit from sharing knowledge and experience with each other and from facing the challenges as a joint task. Since 2013, a group of corpus projects from France, Germany, Italy and the Netherlands has started to exchange expertise and experience in building CMC corpora (= the network "Building and annotating CMC corpora", https://wiki.itmc.tu-dortmund.de/cmc/) and to jointly work on a proposal for an extension to the TEI standard which is adapted to the particularities of a broad range of CMC genres (= the TEI special interest group on CMC, http://www.tei-c.org/Activities/SIG/).

The DARIAH technical meeting will gather a restricted number of researchers, coming from different European countries, involved in projects aiming at building, structuring, annotating and analyzing CMC corpora - including:
* CoMeRe: Project for a corpus of French CMC: http://comere.org
* Dortmund Chat Corpus: http://www.chatkorpus.tu-dortmund.de
* DeRiK: project for a reference corpus of German CMC: http://www.tinyurl.com/derik-llc
* Web2CorpusIT: Pilot Corpus of Italian Computer-Mediated communication: http://www.glottoweb.org/web2corpus/
* Wikipedia corpora in DeReKo / IDS Mannheim: http://www.ids-mannheim.de/dereko
* KobRA: Corpus-based linguistic analysis with the help of data mining: http://www.kobra.tu-dortmund.de
The expected outcomes of the meeting are, amongst others:
TENTATIVE SCHEDULE / PROGRAM
Precise times for the meeting and presentations will be fixed after DARIAH has decided on the proposal. All titles given below are working titles. Everybody, please add preliminary titles for your contributions (contributions marked as [confirmed] have been confirmed by the authors).

Technical meeting pt. I: Current state of building & modeling corpora (1.5 hrs)
Technical meeting pt. II: NLP for CMC / social media corpora (1.5 hrs)
Technical meeting pt. III: schedule for further work on standards and joint scientific activities (2 hrs)