SIG:CMC/CoMeRe metadata schema draft for CMC (2014)
Revision as of 14:41, 8 April 2014
Status of this draft
This page describes a draft metadata schema for genres of computer-mediated communication (CMC) in TEI. The draft has been created by members of the TEI-SIG "Computer-Mediated Communication".
The SIG encourages everybody to discuss this draft and give feedback and comments using the "discussion" function at the top of this page. These comments and discussions will be carefully taken into consideration in the further development of the schema.
The history of the draft is documented on the main wiki page of the SIG. This page should be read in parallel to SIG:CMC/Draft: A basic schema for representing CMC in TEI.
Authors of this draft: Thierry Chanier, N.N., N.N.
Rationales for Modelling CMC discourse
Note: we use the term CMC (which stands for Computer-Mediated Communication) in a broad sense, to refer to all kinds of network-mediated communication (cf. SMS).
Annotation is basically an interpretation, and TEI markup naturally encompasses hypotheses about what a text is and what it should be. Although the TEI was historically dedicated to the markup of literary texts, various extensions have been developed for the annotation of other genres and discourses, including poetry, dictionaries, language corpora and speech transcriptions. If one still wants to apply the word “text” to a coherent and circumscribed set of CMC interactions, it is not so much in the sense originally developed by the TEI. Rather, it is closer to the meaning adopted by Baldry & Thibault (2006). These authors consider (ibid: 4) “texts to be meaning-making events whose functions are defined in particular social contexts”, following Halliday (1989:10), who declared that “any instance of living language that is playing some part in a context of situation, we shall call a text. It may be either spoken or written, or indeed in any other medium of expression that we like to think of.”
Bearing the above in mind, we found it more relevant to start from a general framework, which we term “Interaction Space”, encompassing from the outset the richest and most complex CMC genres and situations. Therefore, we did not work genre by genre, nor with scales that would, for instance, oppose simple and complex situations (e.g. unimodal versus multimodal environments): our goal is to release guidelines for all CMC documents, not for each CMC genre. This also explains why we did not limit ourselves to written communication. For these reasons, we take multimodality into account, and our approach is akin to the one under discussion in European networks dealing with TEI and oral corpora, which tend to reject the collection and study of oral corpora as self-contained elements and prefer to study oral and multimodal corpora within a common framework.
Interaction Space
Figure 1: Interaction Space
Interaction space: time, location, participants
An Interaction Space (henceforth referred to as IS) is an abstract concept, located in time (with a beginning and ending date with absolute time, hence a time frame) where interactions between a set of participants occur within an online location. The online location is defined by the properties of the set of environments used by the set of participants. Online means that interactions have been transmitted through networks, Internet, Intranet, telephone, etc. The set of participants is composed of individual members or groups. It can be a predefined learner group or a circumscribed interest group. A mandatory property of a group is the listing of its participants.
The range of types of interactions (and their related locations) is wide. At one end of the scale, we find simple types with one environment based on one modality / tool (e.g., one email system, or one text chat system). At the other end, we find complex environments such as LMSs, in which several types of communication modalities are integrated (see the example below with the LMS WebCT, which uses only textual modalities: synchronous (text chat) and asynchronous (email and forum)).
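The mandatory listing of a group's participants, mentioned above, could be sketched in a <particDesc> as follows. This is only an illustration: the group heading, the xml:id values and the role labels are hypothetical and not taken from any CoMeRe corpus, and a nested <listPerson> is just one possible way of modelling a group.

```xml
<particDesc>
  <!-- Hypothetical group: ids, roles and the heading are invented for illustration -->
  <listPerson type="learningGroup" xml:id="group-example">
    <head>Example learning group</head>
    <person xml:id="learner-01" role="learner"/>
    <person xml:id="learner-02" role="learner"/>
    <person xml:id="tutor-01" role="tutor"/>
  </listPerson>
</particDesc>
```

Giving each participant an xml:id lets interactions point back to their author, and grouping the <person> elements under one <listPerson> makes the membership of the group explicit.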
Environment, mode and modality
An environment may be synchronous or asynchronous, monomodal or multimodal. Multimodality refers to environments that offer several interaction tools, integrated within the same interface. Every tool uses one mode of communication (e.g., oral, text, icon, nonverbal) and one modality (e.g., a text chat has a specific textual modality, different from the modality of a collective word processor, although both are based on the same textual mode). Every modality has its own grammar, which constrains interactions. For instance, the icon modality within an audio-graphic environment is composed of a finite set of icons (raise hand, clap hand, is_talking, momentarily absent, etc.). Consequently, an interaction may be multimodal because several modes and/or several modalities are used.
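The distinction between mode and modality could, for instance, be recorded as a two-level taxonomy in the TEI header. The sketch below is an assumption about how this might be encoded, not the structure used in the CoMeRe corpora; all xml:id values are hypothetical.

```xml
<classDecl>
  <!-- Hypothetical taxonomy: first level = modes, second level = modalities -->
  <taxonomy xml:id="modes-and-modalities">
    <category xml:id="mode-text">
      <catDesc>textual mode</catDesc>
      <category xml:id="modality-textchat">
        <catDesc>text chat</catDesc>
      </category>
      <category xml:id="modality-wordprocessor">
        <catDesc>collective word processor</catDesc>
      </category>
    </category>
    <category xml:id="mode-icon">
      <catDesc>icon mode</catDesc>
      <category xml:id="modality-icon-audiographic">
        <catDesc>finite set of icons of an audio-graphic environment
          (raise hand, clap hand, is_talking, momentarily absent)</catDesc>
      </category>
    </category>
  </taxonomy>
</classDecl>
```

With such a taxonomy, two tools sharing the textual mode remain distinguishable by their modality, which is the point made in the paragraph above.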
An environment offers the participants one or more locations / places in which to interact. For example, a conference system may have several rooms where a set of participants may work separately in sub-groups or gather in one place. In a 3D environment such as the synthetic world Second Life, a location may be an island or a plot. A plot may even be divided into small sub-plots where verbal communication (text chat, audio) is impossible from one to another. Hence we say that participants are in the same location / place if they can interact at a given time. Notions of location and interaction are closely related and are defined by the affordances of the environment. Lastly, an IS is an abstract space where interaction occurs. When the same participants interact over several weeks, different interaction sessions will occur.
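As a rough sketch of how several locations of one environment might be declared, the rooms of a conference system could each be given their own <setting>. The room names, identifiers and times below are invented for illustration only.

```xml
<settingDesc>
  <!-- Hypothetical conference system with two rooms; names and times are invented -->
  <setting xml:id="room-plenary">
    <name>plenary room</name>
    <locale>room where all participants gather</locale>
    <time from-iso="2014-01-01T10:00" to-iso="2014-01-01T10:30">first interaction session</time>
  </setting>
  <setting xml:id="room-subgroup-a">
    <name>sub-group room A</name>
    <locale>room where one sub-group works separately</locale>
    <time from-iso="2014-01-01T10:30" to-iso="2014-01-01T11:00">second interaction session</time>
  </setting>
</settingDesc>
```

Participants recorded in the same <setting> during the same time frame are those who can interact with each other, in line with the definition of location given above.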
More information on interactions in SIG:CMC/Draft: A basic schema for representing CMC in TEI
Describing the interaction space of a monomodal environment within the TEI header
In this section we present the way Interaction Space(s) have been described for monomodal environments. The next section will consider an example of a multimodal environment.
Environments and affordances
The first step when describing an environment is to define the general features of the overall environment type to which it belongs (e.g., IRC text chat systems). This description then needs to be refined to bring out the specific features of the system. For example, (1a) describes, in TEI, the general text chat modality, where every connected participant may interact with the other participants inside one public channel. Example (1b), in turn, details the affordances of the specific IRC system used in cmr-getalp_org. This simplified extract displays the three main types of chat actions (message, command, and event) and some of the event subtypes.
(1a)
<textDesc xml:lang="en-GB">
  <channel mode="w" xml:lang="en-GB"><term ref="#texchat-epiknet">text chat</term></channel>
  <constitution>Messages typed by participants inside EpikNet IRC Channels and then collected by Botstats.com</constitution>
  <derivation type="original"/>
  <domain type="public"/>
  <factuality type="fact"/>
  <interaction type="complete" active="plural" passive="many"/>
  <preparedness type="spontaneous"/>
  <purpose degree="high"><note>Informal discussion</note></purpose>
</textDesc>
(1b)
<classDecl>
  <taxonomy>
    <category xml:id="texchat-epiknet">
      <catDesc>Definition of the textchat modality. Types of messages used in cmr-getalp_org. Textchat features are those coming from EpikNet <ref target="http://www.epiknet.org/"/></catDesc>
    </category>
    <category xml:id="chat-message"/>
    <category xml:id="chat-command"/>
    <category xml:id="chat-event">
      <category xml:id="connexion"/>
      <category xml:id="deconnexion"/>
      <category xml:id="changementpseudo"/>
      [...]
Structure of a textchat message, the <post> element
Part of the description of a textchat turn may be applied to any kind of textchat environment. However, the particularities imposed by a specific environment (here, again, IRC EpikNet) have to be detailed in order to guide future research analyses (see the explanations of the attributes @who and @alias, and of the time).
(2)
<editorialDecl>
  <normalization>
    <p>for details about encoding before TEI, see the attached document <idno>cmr-getalp_org-tei-v1-manuel.pdf</idno></p>
    […]
  </normalization>
  <stdVals>
    <p>The time of a post is not known to the second in the textchat logfile. Hence the values of <att>when-iso</att> on the <gi>time</gi> element always end in the format <val>HH:MM</val>; i.e., seconds, fractions thereof, and time zone designators are not present.</p>
  </stdVals>
  <segmentation>
    <p><gi>post</gi> corresponds to a textchat turn</p>
  </segmentation>
</editorialDecl>
<tagsDecl>
  <namespace name="[…]">
    <tagUsage gi="post">one post corresponds to one textchat turn, i.e. one participant's utterance.
      <list>
        <item><att>xml:id</att> ID of the posting.</item>
        <item><att>alias</att> is the participant's alias. It does not identify a participant, since a participant may change her/his alias (cf. <att>type</att> chat-command). Moreover, two participants may use the same alias (we have never checked this).</item>
        <item><att>who</att> is the login ID given by the system to a participant present in the channel at a given moment. In other words, if the participant leaves the channel and then comes back, s/he will receive another system ID. This system ID does not identify a participant across the whole channel; it only identifies a participant during a short period of interaction. Two different participants cannot have the same system ID. Tracking the use of aliases and relating it to system IDs may be one way of approaching this identification. This identification (knowing the exhaustive list of posts sent by the same person) may be a topic of investigation for future analyses.</item>
        <item><att>type</att> type of message, cf. taxonomy.</item>
        <item><att>sub-type</att> subtype of message in the taxonomy</item>
        <item><att>synch</att> absolute time when the IRC channel displayed the post</item>
      </list>
    </tagUsage>
  </namespace>
</tagsDecl>
Interaction spaces within the environment
Location and time frames
Example (3) describes the general location of the server, then a particular channel with its time frame. 80 other channels (in distinct TEI files) have been described in a similar way in cmr-getalp_org_tei_v1.
(3)
<profileDesc>
  <creation>
    <date from="2004-02-03" to="2004-04-09"/>
    <location type="online_environment">
      <placeName>epiknet.org hosted the IRC channels, whereas botstats.com collected the logfiles of the interactions</placeName>
      <geogName>
        <rs type="city">Blanquefort, France</rs>
        <rs type="TGN">7008161</rs>
        <rs type="URL">http://www.botstats.com</rs>
        <rs type="URL">http://www.epiknet.org</rs>
      </geogName>
    </location>
  </creation>
  […]
  <settingDesc>
    <setting>
      <name>rhone-alpes</name>
      <locale>one IRC EpikNet channel</locale>
      <time from-iso="2004-03-09T00:00" to-iso="2004-04-09T12:08">beginning time of the first textchat session and end time of the last textchat session</time>
      <activity>participants type on a keyboard. They can only see the threads of messages of the IRC channel</activity>
    </setting>
  </settingDesc>
In (4), for SMS collected from voluntary participants, we distinguish the dates and location of the company in charge of collecting the data (creation) from the participants' locations and times (settingDesc) (cmr-smslareunion-tei-v1 corpus).
(4)
<profileDesc>
  <creation>
    <date from="2008-04-10" to="2008-06-30"/>
    <location type="telephone network">
      <placeName>A private company collected the messages and sent them to "Laboratoire de recherche sur les espaces Créolophones et Francophones", Université de la Réunion. All participants were located in La Réunion</placeName>
      <geogName>
        <rs type="city">La Réunion, France</rs>
        <rs type="TGN">1000184</rs>
      </geogName>
    </location>
  </creation>
  […]
  <settingDesc>
    <setting>
      <name>La Réunion</name>
      <locale>private phones (or phones given by their company) of the authors of the SMS</locale>
      <time from-iso="2008-04-10T10:57" to-iso="2008-06-30T21:35">time of the first message received by the project server and time of the last message received by the server.</time>
      <activity>participants living in La Réunion freely agreed to send a copy of their SMS to the server of the project. The copy was sent by the authors via a specific process, i.e. a process different from the one used to send the SMS to their correspondent.</activity>
    </setting>
  </settingDesc>
Multimodal environments
Figure 2: complex CMC environment with several modalities
This image shows the description of the modalities embedded in a Learning Management System (LMS) environment: email, textchat, forum (cmr-simuligne-tei-v1 corpus). Example (5) details the way the data have been collected and the nature of the interactions among the participants. Note that the TEI file contains separate interaction spaces, one per group of learners, which all followed the same learning scenario. The IS of every group has a recursive structure in which each learning activity corresponds to an IS.
(5)
<textDesc xml:lang="en-GB">
  <channel mode="w" xml:lang="en-GB"><term ref="#webCT">Learning Management System (LMS), WebCT</term></channel>
  <constitution>This corpus is made of interactions between participants (learners, natives, tutors, researchers)
    during the online language learning Simuligne experiment (2001). All these interactions happened within the
    LMS and are made of textchat turns, emails and forum messages. Participants were organized in groups
    (learning groups): 4 following "scenario 1", a fifth one gathering all participants during the Interculture
    activity ("scenario 2", see <gi>projectDesc</gi>), and a sixth restricted to tutors. All details about groups
    are in <gi>particDesc</gi>. Data have been collected by the <ref target="CR">corpus compiler of the first
    LETEC Simuligne corpus (2009). Since WebCT had no export facilities, data have been extracted from the WebCT
    internal database, then structured and anonymized.</ref>
  </constitution>
  <derivation type="original"/>
  <domain>education</domain>
  <factuality type="fact"/>
  <interaction type="complete" active="many"><note>Interactions happened according to the guidelines
    of the learning activities (see <gi>projectDesc</gi> for access to guidelines)</note></interaction>
  <preparedness type="spontaneous"/>
  <purpose degree="high">learn and practice French, develop intercultural competences</purpose>
</textDesc>