Corpus analysis is driven by a common interest in ‘linguistic evidence’, viewed as a source of insights into language phenomena or of lexical, semantic and contrastive data for subsequent applications. Among the latter, pedagogical settings are highly prominent, as corpora can be used to monitor classroom output, raise learner awareness and inform teaching materials.
The eighteen chapters in this volume focus on contexts where English is employed by specialists in the professions or academia and debate some of the challenges arising from the complex relationship between linguistic theory, data-mining tools and statistical methods.
Methodological Issues

LYNNE FLOWERDEW

Which Unit for Linguistic Analysis of ESP Corpora of Written Text?

1. Introduction

Many different types of ESP corpora abound. Some of these, such as the 2.6 million-word Michigan Corpus of Upper-level Student Papers (2009), contain a variety of different text types, while others are quite specialised, e.g. the 0.5 million-word Guangzhou Petroleum English Corpus (Zhu 1989), one of the first specialised corpora to be compiled (see Flowerdew 2004, 2011, 2012 and Warren 2010b, for more details on ESP corpora). However, a key point to note is that different corpus-based research studies of specialised text have different starting points, relying on different units of analysis, as outlined below.

The vast majority of research commences from a bottom-up perspective in which lexis or some kind of lexico-grammatical unit is taken as the starting point for analysis, moving towards a more top-down discourse-oriented approach, often based on the Swalesian concept of rhetorical move structure analysis. ESP studies in this vein include Upton/Connor (2001), Bhatia et al. (2004) and Carter-Thomas/Chambers (2012). In contrast, in the top-down approach the functional components of a genre are determined first, and then all the texts in a corpus are analysed in terms of these components (see Biber et al. 2007 for more information on top-down and bottom-up approaches). However, in reality, many studies mediate between bottom-up and top-down approaches.

The following bottom-up linguistic units¹ as an entry point to the analysis have been identified in the literature on corpus-based...