A Multidimensional Study of Synchronous and Supersynchronous Computer-Mediated Communication
Texts used in Biber’s (1988) study of language variation. The corpus totals approximately 960,000 words for 481 texts (Biber 1988: 67, 209–210, 1995: 87).
|Mode||Genre||Texts used||Number of texts||Approx. number of words|
|Speech||Face-to-face conv.||texts 1.1–1.14 and 3.1–3.6 from LLC||44||115,000|
|Speech||Telephone conv.||texts 7.1–7.3, 8.1–8.4 and 9.1–9.3 from LLC||27||32,000|
|Speech||Interviews [124]||texts 5.1–5.3, 5.5–5.7, 6.1, 6.3, 6.4a, 6.5 and 6.6 from LLC||22||48,000|
|Speech||Broadcasts||texts 10.1–10.7 and part of text 10.8 from LLC||18||38,000|
|Speech||Spont. speeches [125]||texts 11.1–11.5 from LLC||16||26,000|
|Speech||Prepared speeches||texts 12.1–12.6 from LLC||14||31,000|
|Writing||Press reportage||all texts in LOB category A||44||88,000|
|Writing||Press editorials||all texts in LOB category B||27||54,000|
|Writing||Press reviews||all texts in LOB category C||17||34,000|
|Writing||Religion||all texts in LOB category D||17||34,000|
|Writing||Hobbies||the first 30,000 words (texts 1–14) LOB category E||14||30,000|
|Writing||Popular lore||the first 30,000 words (texts 1–14) LOB category F||14||30,000|
|Writing||Biographies||the first 30,000 words (texts 1–14) LOB category G||14||30,000|
|Writing||Official documents||texts 1–6, 13–14 and 25–30 from LOB category H||14||28,000|
|Writing||Academic prose||all texts in LOB category J||80||160,000|
|Writing||General fiction||all texts in LOB category K||29||58,000|
|Writing||Mystery fiction||the first 30,000 words (texts 1–14) LOB category L||13||26,000|
|Writing||Science fiction||all texts in LOB category M||6||12,000|
|Writing||Adventure fiction||the first 30,000 words (texts 1–14) LOB category N||13||26,000|
|Writing||Romantic fiction||the first 30,000 words (texts 1–14) LOB category P||13||26,000|
|Writing||Humor||all texts in LOB category R||9||18,000|
|Writing||Personal letters||written to friends/relatives, collected by D. Biber||6||6,000|
|Writing||Professional letters||on administrative matters, collected by W. Grabe||10||10,000|
The frequencies in tables 1–7 are all normalized to text lengths of 1,000 tokens (except for type/token-ratio and word length); see section 3.2.
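The normalization to 1,000 tokens can be illustrated with a minimal Python sketch. The function name `normalize_per_thousand` and the sample figures are hypothetical and serve only as illustration; section 3.2 remains the authoritative description of the procedure.

```python
def normalize_per_thousand(raw_count: int, text_length: int) -> float:
    """Scale a raw feature count to a frequency per 1,000 tokens."""
    if text_length <= 0:
        raise ValueError("text_length must be a positive number of tokens")
    # Multiply first so that round figures stay exact in floating point.
    return raw_count * 1000 / text_length


# Hypothetical example: 36 occurrences of a feature in a 2,400-token text.
print(normalize_per_thousand(36, 2400))  # prints 15.0
```

Normalizing in this way makes counts from texts of different lengths directly comparable, which is why type/token ratio and word length, which are not simple counts, are exempted.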
Tables 1a–3a present the raw frequencies per text of the linguistic features in the corpora investigated (for type/token ratio and word length, see Appendix II). The length of each text is shown in tables 1b–3b.
Certain messages and strings of text were excluded from the conversational writing logs and the SBC subset before the texts were annotated for the features in Biber’s (1988) methodology. Typical excluded instances are exemplified below.
Table 1 lists the features with a standard score above 2.0 or below -2.0 in the genres studied, that is, the most influential (most salient) contributors to the dimension scores of each genre. Section 4.4, and part of 4.2, explore the most salient features of the conversational writing genres (split-window ICQ chat and Internet relay chat) and present their distribution in writing, ACMC and speech. The procedure of standard score calculation is described in section 3.5.
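The selection criterion can be sketched in Python as follows, assuming that the standard score is the usual z-score of a normalized frequency relative to a reference mean and standard deviation (section 3.5 gives the actual procedure). The function names, feature labels and all numbers below are invented for illustration and are not taken from Biber (1988) or from the present study.

```python
def standard_score(frequency: float, reference_mean: float, reference_sd: float) -> float:
    """Return the z-score of a normalized frequency against reference statistics."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (frequency - reference_mean) / reference_sd


def salient_features(frequencies: dict, reference_stats: dict, threshold: float = 2.0) -> dict:
    """Keep only the features whose standard score lies above +threshold or below -threshold."""
    salient = {}
    for feature, frequency in frequencies.items():
        mean, sd = reference_stats[feature]
        z = standard_score(frequency, mean, sd)
        if abs(z) > threshold:
            salient[feature] = z
    return salient


# Hypothetical genre means (per 1,000 tokens) and hypothetical reference (mean, sd) pairs.
genre_frequencies = {"feature_a": 30.0, "feature_b": 12.0, "feature_c": 1.0}
reference_stats = {"feature_a": (10.0, 5.0), "feature_b": (10.0, 4.0), "feature_c": (9.0, 2.0)}

print(salient_features(genre_frequencies, reference_stats))
# prints {'feature_a': 4.0, 'feature_c': -4.0}
```

In this invented example, `feature_b` has a standard score of 0.5 and is therefore not listed as salient.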
Table 1 presents the values for probability (p) from t-tests of the feature distributions in SCMC, SSCMC, writing and speech for the salient features in conversational writing discussed in chapter 4. For some of the features, or combinations of features, p-values are not available (“n.a.”) owing to the unavailability in Biber (1988) of the requisite data for the test. As regards inserts, no annotation of Biber’s (1988) texts of writing or speech was carried out; instead, the p-values for inserts given in table 1 in the comparisons to “speech” reflect for “speech” only the face-to-face conversations from SBC (as noted in section 4.6). With regard to emotives, the tests here reflect that none of the written (LOB) or spoken (LLC or SBC) texts contains emotives.
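The source does not state which variant of the t-test was used. Purely as an illustration of why summary data (mean, standard deviation and number of texts) are the "requisite data" for such a comparison, the following Python sketch computes Welch's unequal-variance t statistic and its degrees of freedom from summary statistics. The choice of Welch's test, the function name and the sample figures are assumptions, not the study's documented method.

```python
import math


def welch_t_from_summary(mean1: float, sd1: float, n1: int,
                         mean2: float, sd2: float, n2: int) -> tuple:
    """Return (t, degrees of freedom) for Welch's t-test from summary statistics."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two texts")
    var_term1 = sd1 ** 2 / n1
    var_term2 = sd2 ** 2 / n2
    standard_error = math.sqrt(var_term1 + var_term2)
    if standard_error == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    t = (mean1 - mean2) / standard_error
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_term1 + var_term2) ** 2 / (
        var_term1 ** 2 / (n1 - 1) + var_term2 ** 2 / (n2 - 1)
    )
    return t, df


# Invented summary statistics for one feature in two groups of texts.
t, df = welch_t_from_summary(10.0, 2.0, 4, 6.0, 2.0, 4)
print(round(t, 3), round(df, 1))  # prints 2.828 6.0
```

Converting t and the degrees of freedom into a p-value requires the cumulative distribution function of Student's t, which the Python standard library does not provide; a statistics package would be needed for that last step.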
|Dimension 1:||Informational versus Involved Production|
|Dimension 2:||Narrative versus Non-Narrative Concerns|
|Dimension 3:||Explicit/Elaborated versus Situation-Dependent Reference|
|Dimension 4:||Overt Expression of Persuasion/Argumentation|
|Dimension 5:||Abstract/Impersonal versus Non-Abstract/Non-Impersonal Information|
|Dimension 6:||On-Line Informational Elaboration|
Table 1 shows the centroid scores of each cluster identified in Biber (1989, 1995) with respect to Biber’s (1988) Dimensions 1 through 5 (C1 means cluster centroid 1, C2 cluster centroid 2, etc.). Tables 2–4 each present the Euclidean distances between the individual texts and the cluster centroids, as well as those between the average dimension scores of the genre and the centroids, with the resulting cluster affiliations indicated in the rightmost column. Table 5 presents the Euclidean distances between the dimension scores of Collot’s (1991) genre of BBS conferencing (i.e. the “ELC other” corpus of ACMC) and the cluster centroids. The polarity of all scores follows that in Biber (1988, 1989), rather than that in Biber (1995). See Appendix X for the dimension scores of the individual texts, and tables 5.1 and 5.5 (in chapter 5) for those of the genres.
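The distance-based cluster affiliation described above can be sketched in Python as a nearest-centroid assignment in five-dimensional space. The centroid values and text scores below are invented placeholders, not the centroid scores of Biber (1989, 1995), and the function name `nearest_cluster` is hypothetical.

```python
import math


def nearest_cluster(scores: tuple, centroids: dict) -> tuple:
    """Return the closest centroid label and the Euclidean distance to every centroid."""
    distances = {label: math.dist(scores, centroid) for label, centroid in centroids.items()}
    closest = min(distances, key=distances.get)
    return closest, distances


# Invented centroids on Dimensions 1 through 5 (placeholders, not Biber's values).
centroids = {
    "C1": (4.0, 6.0, 0.0, 0.0, 0.0),
    "C2": (1.0, 2.0, 0.0, 0.0, 12.0),
}

# Invented dimension scores for one text.
text_scores = (1.0, 2.0, 0.0, 0.0, 0.0)

label, distances = nearest_cluster(text_scores, centroids)
print(label, distances)  # prints C1 {'C1': 5.0, 'C2': 12.0}
```

The same function applies unchanged to a genre's average dimension scores, since these are simply another point in the same five-dimensional space.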
Tables 1–3 present the dimension scores on Biber’s (1988) dimensions for the individual texts annotated in the present study.
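As a reminder of how a dimension score is commonly derived in Biber-style multidimensional analysis, the sketch below sums the standard scores of the features loading positively on a dimension and subtracts those of the features loading negatively. The feature lists are a deliberately incomplete illustration for Dimension 1, the standard scores are invented, and the function name is hypothetical; the full procedure is the one set out in the methodology chapters.

```python
def dimension_score(standard_scores: dict, positive_features: list, negative_features: list) -> float:
    """Sum the standard scores of positive-loading features and subtract the negative-loading ones."""
    positive_sum = sum(standard_scores[name] for name in positive_features)
    negative_sum = sum(standard_scores[name] for name in negative_features)
    return positive_sum - negative_sum


# Illustrative, incomplete feature sets for Dimension 1 (assumption for this sketch).
positive = ["private_verbs", "contractions"]
negative = ["nouns", "prepositions"]

# Invented standard scores for one text.
z_scores = {"private_verbs": 1.5, "contractions": 2.0, "nouns": -1.0, "prepositions": -0.5}

print(dimension_score(z_scores, positive, negative))  # prints 5.0
```

A text that over-uses the positive features and under-uses the negative ones, as in this invented case, therefore receives a high positive score on the dimension.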
124 “Interviews” denotes public conversations, debates and interviews (Biber 1988, 1995: 87).
125 In his description of the sampling procedure, Biber (1988: 210) indicates that spontaneous speeches were divided into 15 texts, which would yield a total of 480 texts. Later accounts, however, maintain that there was a total of 481 texts (e.g. Biber 1995: 87, Conrad & Biber 2001: 111), which explains why the figure from Biber (1988: 67) is retained here.
126 The polarity of all scores follows that in Biber (1988, 1989).