Difference between revisions of "SIG:Computer-Mediated Communication"

m (Drafts from the SIG (REQUEST FOR COMMENTS!))
(Drafts from the SIG (REQUEST FOR COMMENTS!))
Line 19:

  == Drafts from the SIG (REQUEST FOR COMMENTS!) ==
- Between their [[SIG:CMC/1st SIG meeting in Rome, 3 October 2013|1st]] and [[SIG:CMC/2nd SIG meeting in Dortmund, 20 February 2014|2nd meeting]], the members of the SIG have been working on drafts for (1) the definition of a basic schema for the representation of CMC genres and (2) the definition of a structure for the description of the communication environment within a metadata schema for CMC.
+ [1] (STARTING POINT) Suggestion for a basic schema for CMC from the DeRiK project:
+ * Beißwenger, Michael; Ermakova, Maria; Geyken, Alexander; Lemnitzer, Lothar; Storrer, Angelika (2012): '''[http://jtei.revues.org/476 A TEI Schema for the Representation of Computer-mediated Communication]'''. In: Journal of the Text Encoding Initiative (jTEI), Issue 3 | November 2012 (DOI: 10.4000/jtei.476).

- An '''outline of the current state of these drafts (as of January 2014)''' is given on the following pages:
+ [2] (CURRENT DRAFT) Modified and extended schema from the CoMeRe project (Thierry Chanier et al.):

- *draft 1: '''[[SIG:CMC/Draft: A basic schema for representing CMC in TEI|A basic schema for representing CMC in TEI]]'''
+ *'''[[SIG:CMC/Draft: A basic schema for representing CMC in TEI|A basic schema for representing CMC in TEI]]'''
- *draft 2: '''[[SIG:CMC/Draft: A metadata schema for CMC|A metadata schema for CMC]]'''
+ *'''[[SIG:CMC/Draft: A metadata schema for CMC|A metadata schema for CMC]]'''

- '''As soon as the documentation on these pages is complete, the SIG will invite other people via the TEI mailing list to review and comment on the suggestions.''' As you will see from the drafts, there are still some open issues on which we welcome any ideas and suggestions!
+ '''Publications:'''
+ * Thierry Chanier, Celine Poudat, Benoit Sagot, Georges Antoniadis, Ciara Wigham, Linda Hriba, Julien Longhi, Djamé Seddah (2014): '''[http://www.jlcl.org/2014_Heft2/1Chanier-et-al.pdf The CoMeRe corpus for French: structuring and annotating heterogeneous CMC genres]'''. In: Beißwenger, Michael; Oostdijk, Nelleke; Storrer, Angelika; van den Heuvel, Henk (Eds., 2014): Building and Annotating Corpora of Computer-Mediated Communication: Issues and Challenges at the Interface of Corpus and Computational Linguistics. Special Issue, Journal of Language Technology and Computational Linguistics (JLCL 2/2014).
+ * Eliza Margaretha, Harald Lüngen (2014): '''[http://www.jlcl.org/2014_Heft2/3MargarethaLuengen.pdf Building Linguistic Corpora from Wikipedia Articles and Discussions text]'''. In: Beißwenger, Michael; Oostdijk, Nelleke; Storrer, Angelika; van den Heuvel, Henk (Eds., 2014): Building and Annotating Corpora of Computer-Mediated Communication: Issues and Challenges at the Interface of Corpus and Computational Linguistics. Special Issue, Journal of Language Technology and Computational Linguistics (JLCL 2/2014).

  == Activities==

Revision as of 17:55, 20 March 2015


Motivation

In the past three decades, computer networks, and especially the internet, have brought forth new and emerging genres of interpersonal communication (computer-mediated communication, henceforth CMC). Although CMC genres and language use on the internet have been studied extensively in linguistics, the social sciences, and natural language processing, there are still no common standards for representing and annotating these new forms of communication and their structural and linguistic peculiarities. Representing CMC data with an encoding framework such as the TEI, which is broadly acknowledged within the digital humanities, will allow research groups to exchange data and to build interoperable CMC corpora for different languages.

Mission statement / scope and tasks

This special interest group is developing suggestions for adapting the TEI Guidelines to the representation of genres of computer-mediated communication (CMC). The group's work focuses on, but is not limited to, tasks such as:

  • modelling user contributions (posts) to written CMC dialogues (which share features both with written discourse and with spoken utterances);
  • modelling CMC document structures ("CMC macrostructures" – e.g., forum threads, wiki talk pages, chat logfiles, Twitter timelines etc.);
  • annotating linguistic features within user posts ("CMC microstructures" – elements such as emoticons, addressing terms, hashtags; quotes from prior posts; etc.);
  • representing linked data and media objects connected with/embedded in CMC discourse;
  • metadata schemata for the description of CMC resources;
  • developing perspectives for the representation of discourse in multimodal CMC environments in which the participants in one interaction space combine a variety of modalities from written, spoken and non-verbal modes.

The customized TEI schema for CMC genres published in the jTEI (issue 3, 2012) and first presented at the TEI members' meeting in Würzburg (2011) will serve as one of the starting points of the group's work.

Drafts from the SIG (REQUEST FOR COMMENTS!)

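[1] (STARTING POINT) Suggestion for a basic schema for CMC from the DeRiK project:

  • Beißwenger, Michael; Ermakova, Maria; Geyken, Alexander; Lemnitzer, Lothar; Storrer, Angelika (2012): A TEI Schema for the Representation of Computer-mediated Communication. In: Journal of the Text Encoding Initiative (jTEI), Issue 3 | November 2012 (DOI: 10.4000/jtei.476).

[2] (CURRENT DRAFT) Modified and extended schema from the CoMeRe project (Thierry Chanier et al.):

  • A basic schema for representing CMC in TEI
  • A metadata schema for CMC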

Publications:

  • Thierry Chanier, Celine Poudat, Benoit Sagot, Georges Antoniadis, Ciara Wigham, Linda Hriba, Julien Longhi, Djamé Seddah (2014): The CoMeRe corpus for French: structuring and annotating heterogeneous CMC genres. In: Beißwenger, Michael; Oostdijk, Nelleke; Storrer, Angelika; van den Heuvel, Henk (Eds., 2014): Building and Annotating Corpora of Computer-Mediated Communication: Issues and Challenges at the Interface of Corpus and Computational Linguistics. Special Issue, Journal of Language Technology and Computational Linguistics (JLCL 2/2014).
  • Eliza Margaretha, Harald Lüngen (2014): Building Linguistic Corpora from Wikipedia Articles and Discussions text. In: Beißwenger, Michael; Oostdijk, Nelleke; Storrer, Angelika; van den Heuvel, Henk (Eds., 2014): Building and Annotating Corpora of Computer-Mediated Communication: Issues and Challenges at the Interface of Corpus and Computational Linguistics. Special Issue, Journal of Language Technology and Computational Linguistics (JLCL 2/2014).

Activities

[2014] 3rd SIG meeting at the 4th DARIAH-EU VCC meeting in Rome (September 17-18, 2014)

A proposal by the SIG for a community meeting on "TEI CMC: Models and tools for structuring & annotating corpora of social media / computer-mediated communication" has been accepted for the DARIAH-EU VCC meeting in Rome. Details on the goals, contents and tasks of the meeting can be found on the page: SIG:CMC/Technical Meeting on CMC at DARIAH VCC 2014.

[2014] 2nd SIG meeting at the 7th workshop of the Empirikom network in Dortmund (February 20, 2014)

The 2nd SIG meeting was held as part of the 7th workshop of the scientific network Empirikom "Social Media Corpora for the eHumanities: Standards, Challenges, and Perspectives" in Dortmund.

A report about the meeting including slides of all presentations can be found on the page: 2nd CMC-SIG meeting in Dortmund, 20 February 2014.

[2013] 1st SIG meeting at TEI-MM in Rome (October 03, 2013)

The 1st SIG meeting was held as part of the TEI Conference and Members Meeting in October 2013 in Rome.

A report about the meeting can be found on the page: 1st CMC-SIG meeting in Rome, 3 October 2013.

[2013] Panel on CMC in TEI at the TEI-MM in Rome (October 04, 2013)

In addition to the 1st SIG meeting, Michael Beißwenger & Lothar Lemnitzer organized a special-topic panel on "Computer-mediated communication in TEI: What lies ahead", held at the TEI-MM in Rome with contributions from several members of the SIG. The three presentations in the panel reported on experiences with modeling CMC in XML and outlined phenomena and issues related to the representation of CMC in TEI from the perspective of corpus projects from France, Germany, Italy and the Netherlands. The overall goal of the panel was to stimulate further discussion within the TEI community about what a standard for the representation of CMC in TEI should look like and what might be a practical and reasonable way to go about creating such a standard.

Documentation of the panel:

Members of the SIG

  • Michael Beißwenger – TU Dortmund University (DE)
  • Thierry Chanier – Université Blaise Pascal, Clermont-Ferrand (FR)
  • Isabella Chiari – Università "La Sapienza", Rome (IT)
  • Maria Ermakova – Berlin-Brandenburg Academy of Sciences and the Humanities (DE)
  • Maarten van Gompel – Radboud University Nijmegen (NL)
  • Iris Hendrickx – Radboud University Nijmegen (NL)
  • Axel Herold – Berlin-Brandenburg Academy of Sciences and the Humanities (DE)
  • Henk van den Heuvel – Radboud University Nijmegen (NL)
  • Kun Jin – Université Blaise Pascal, Clermont-Ferrand (FR)
  • Lothar Lemnitzer – Berlin-Brandenburg Academy of Sciences and the Humanities (DE)
  • Harald Lüngen – Institut für deutsche Sprache, Mannheim (DE)
  • Eliza Margaretha – Institut für deutsche Sprache, Mannheim (DE)
  • Angelika Storrer – University of Mannheim (DE)

Mailing list & further information

For exchange between the TEI-MMs, the SIG will use the talk pages in the TEI wiki and a mailing list.