
Frequency in the Dictionary

A Corpus-Assisted Contrastive Analysis of English and Italian

by Dominic Stewart (Author)
Monographs 176 Pages
Series: Linguistic Insights, Volume 285

Summary

This book is concerned with frequency in foreign
language learning, and in particular with
contrastive frequency across languages. The
focus is on the learning of English and Italian,
whether it be English speakers learning Italian
or Italian speakers learning English. Despite
the fact that frequency – whether it be of
lemmas or of word forms within specific lexicogrammatical
environments – lies at the
heart of L2 learning, it is not stressed in any
salient or consistent manner in English / Italian
language-learning materials. This work aims
to redress the balance, offering a corpus-assisted
critical analysis of the way frequency
is handled in English and Italian dictionaries,
and bringing out unexpected differences
between the two languages.

Table Of Contents

  • Cover
  • Title
  • Copyright
  • About the author
  • About the book
  • This eBook can be cited
  • Table of Contents
  • Acknowledgements
  • Introduction
  • 1. The pervasiveness of frequency in language analysis and in language learning
  • 1.0 Preliminary remarks
  • 1.1 The frequency of single words and phrases
  • 1.2 The frequency of collocation
  • 1.3 The frequency of grammatical structures
  • 1.4 The frequency of word forms within grammatical categories
  • 1.5 The importance of frequency in corpus linguistics
  • 1.6 The importance of frequency in theories of language
  • 1.7 Summary
  • 2. Comparative frequencies of cognates and conventional equivalents across English and Italian
  • 2.0 Preliminary remarks
  • 2.1 The influence of native parameters in L2 language learning
  • 2.2 Relative frequencies of some English / Italian cognates
  • 2.3 Frequency of word forms within grammatical categories
  • 2.3.1 The frequency of singular and plural noun forms
  • 2.3.1.1 Nouns whose singular form is predominant
  • 2.3.1.2 Nouns whose plural form is predominant
  • 2.3.2 Recap
  • 2.4 Summary
  • 3. The frequency of verb forms within specific grammatical categories as reported in English and Italian dictionaries
  • 3.0 Preliminary remarks
  • 3.1 Grammatical information provided in English and Italian dictionaries
  • 3.2 The frequency of verb forms within specific grammatical categories as flagged in English and Italian dictionaries
  • 3.2.1 Active vs passive forms
  • 3.2.2 Progressive vs non-progressive forms
  • 3.2.3 The simple present
  • 3.2.4 The English simple past and the Italian passato remoto
  • 3.2.5 Perfective forms
  • 3.2.6 The imperative
  • 3.2.7 The infinitive
  • 3.2.8 Grammatical person
  • 3.2.8.1 First person
  • 3.2.8.2 Third person
  • 3.3 Final remarks
  • 3.4 Summary
  • 4. Passive and progressive labels in English monolingual learner’s dictionaries
  • 4.0 Preliminary remarks
  • 4.1 Labels describing verbs in English monolingual learner’s dictionaries
  • 4.2 The labels ‘often passive’ / ‘usually passive’
  • 4.2.1 What does the label ‘passive’ mean?
  • 4.2.2 Weak passive meaning
  • 4.3 The labels ‘often progressive’ / ‘usually progressive’
  • 4.4 A conflict of form and function
  • 4.5 That which is not: The labels ‘no passive’ and ‘no progressive’
  • 4.6 Summary
  • 5. The frequency of negatives in English and Italian
  • 5.0 Preliminary remarks
  • 5.1 Information on negativisation in English monolingual dictionaries
  • 5.2 Information on negativisation in Italian monolingual dictionaries
  • 5.3 Summary
  • 6. Lexical environment across English and Italian
  • 6.0 Preliminary remarks
  • 6.1 A preliminary example of Word Sketch: foresee vs prevedere
  • 6.2 Lexical environment of cognates across English and Italian
  • 6.2.1 amenity vs amenità
  • 6.2.2 vacant vs vacante
  • 6.2.3 lucidity vs lucidità
  • 6.3 Pragmatic flags in dictionaries
  • 6.4 Summary
  • 7. Lexical and grammatical environment, a case study: English territory vs Italian territorio
  • 7.0 Preliminary remarks
  • 7.1 Word Sketches of the lemmas territory and territorio
  • 7.2 Singular and plural forms of the lemmas territory and territorio
  • 7.3 Implications for learners of English and learners of Italian
  • 7.4 Summary
  • 8. Limitations of the research
  • 8.0 Preliminary remarks
  • 8.1 Factors affecting frequency counts in this book
  • 8.2 Technical difficulties
  • 8.2.1 Examples of too much irrelevant data
  • 8.2.2 Examples of too little relevant data
  • 8.2.3 Repercussions for research
  • 8.3 Varying interpretations of grammatical categories
  • 8.4 Summary
  • Conclusion
  • References
  • Thematic index
  • Series index

Acknowledgements

My thanks are due to the University of Trento for providing the funds for this project.

Chapters 4 and 6 contain observations adapted from two articles of mine in the journal Iperstoria (University of Verona), numbers 7 (2016) and 16 (2020).

Introduction

It seems almost redundant to affirm that frequency is crucial in language analysis and language learning. Frequency informs linguistics and language pedagogy to a huge extent, whether the object of study is, for example, single words and phrases, stress patterns, word forms within specific grammatical categories, register, level of (in)formality, sociolinguistic categories, dialectal differences – the list could go on and on. Frequency is also vital in comparisons between different languages and therefore in foreign language studies and in translation, and of course, it is central to corpus linguistics and to theories of language such as lexical priming.

This book is concerned, first and foremost, with frequency in foreign language learning, and in particular with contrastive frequency from one language to another. My focus is on the learning of English and Italian, whether it be English speakers learning Italian or Italian speakers learning English. Notwithstanding the fact that frequency of occurrence is the driving force behind much L2 learning (as well as L1 learning), the impression is that its role remains in the wings, because it is not stressed in any salient or consistent manner in English / Italian language-learning materials. This book aims to redress the balance, offering a critical analysis of the way frequency is handled in English and Italian dictionaries, assisted by the extensive use of corpus data in order to bring out important differences between the two languages. The English corpus adopted is the British Web 2007, also known as ukWaC, a web-derived corpus containing over 1 billion 300 million words from websites within the .uk domain. It is a general-purpose corpus with a broad range of text types. On the Sketch Engine (see below) website the corpus is described as follows:

The ukWaC is a text corpus of British English collected from the .uk domain with using [sic] medium-frequency words from the British National Corpus as seed words. These two facts are fair to argue that it is a corpus of mainly British English although other variants are likely to be included as long as they were found on a .uk domain.

Reference will also be made to the British National Corpus, which contains approximately 100 million words of British English from the late twentieth century. It too is a general-purpose corpus offering a broad range of text types. It contains 90 % written texts and 10 % spoken.

The Italian corpus adopted throughout this book is the Italian Web 2006, otherwise known as itWaC, a web-derived corpus containing nearly 1 billion 600 million words from websites within the .it domain. Again, it is a general-purpose corpus with a broad range of text types. Occasional reference will also be made to the much larger (4.9 billion words) and more recent Italian Web 2016.

The corpora have been consulted using Sketch Engine, a corpus manager and analysis software created by Lexical Computing Ltd in 2003, now with over 500 ready-to-use corpora in more than 90 languages. See https://www.sketchengine.eu (Last visited May 20 2021) and Kilgarriff et al. (2014) for further details.

Chapter 1 stresses the importance of frequency of occurrence in language analysis and language learning, furnishing some preliminary examples of the frequency of single word forms and phrases, of pronunciation, of collocation, of colligation and of grammatical categories. The attention then moves on to the crucial role played by frequency in theories of language, with a particular focus on lexical priming and collostructional analysis.

The first part of Chapter 2 analyses the frequency of (lemma) cognates across English and Italian, reflecting upon the degree to which English / Italian dictionaries help their users to acquire notions of contrastive frequency. In the second part of the chapter, the spotlight moves on to the frequency of word forms within grammatical categories, beginning with the singular and plural of English and Italian nouns.

Chapter 3 focuses on the forms of words in grammatical categories, this time within the more complex sphere of verb categories. In both Chapters 3 and 4, there will be a more robust emphasis on lexicographical entries, in order to establish the degree of usefulness of English and Italian dictionaries as indicators of the extent to which single verbs recur in specific grammatical categories such as passive, simple past and perfective.

Chapter 4 examines the criteria adopted by English monolingual dictionaries when flagging verbs as frequently passive or progressive, and it therefore has a more specific focus by comparison with the briefer observations on these two categories made in Chapter 3. Because Italian dictionaries do not as a rule signpost the passive and progressive, the survey considers English dictionaries alone. The purpose is to gauge the degree of usefulness for learners of English of the labels ‘passive’ and ‘progressive’, by reflecting firstly upon what these labels actually mean, and secondly upon why these particular grammar labels are prioritised at the expense of others.

Chapters 5–7 move beyond single lemmas and word forms and into the terrain of lexical and grammatical environment, again considering how this is catered for in dictionaries. The main concern of Chapter 5 is verb negativisation in English and Italian, the emphasis being on verbs that have unusual ratios of affirmative / negative instances, ratios of which native speakers are apprised through priming, and of which students of L2 should ideally have at least approximate awareness. Frequency of occurrence is again paramount in Chapter 6, which is concerned with lexical environment across English and Italian, mostly of cognates, investigated with the use of Word Sketch, ‘an automatic, corpus-derived summary of a word’s grammatical and collocational behaviour’ (Kilgarriff et al. 2010) that is part of the range of search strategies provided by Sketch Engine. Chapter 7 offers a more detailed analysis, focusing on both lexical and grammatical environment in the form of a case study, taking as an example the collocational and colligational environment of the cognates territory and territorio, again assisted by Word Sketch. As in previous chapters, outcomes retrieved from English and Italian corpora will be measured against information supplied in English and Italian monolingual dictionaries.

The work draws to a close in Chapter 8, with some reflections both upon the limitations of the research project offered in this book and upon the shortcomings of the corpus analysis conducted.


1. The pervasiveness of frequency in language analysis and in language learning

1.0 Preliminary remarks

This chapter firstly provides some introductory examples of the frequency or infrequency of occurrence of single words, phrases, collocations, stress patterns and grammatical structures, as well as of word forms within grammatical categories, and secondly highlights the importance of frequency in corpus linguistics and in theories of language.

1.1 The frequency of single words and phrases

The frequency of occurrence of single words and phrases is fundamental in language analysis and language learning. For example, students of English as L2 may need to know which is (i) the more common way of forming the plural of the noun plateau – is it plateaus or plateaux?, or (ii) the most common way of expressing the century which extends from the beginning of the year 1500 to the end of the year 1599: is it the 1500s, the fifteen hundreds, the XVI century, the 16th century or the sixteenth century? Whereas in the British Web 2007 corpus the outcomes for plateaux and plateaus are fairly even (with 430 occurrences plateaux has a greater raw frequency than plateaus at 333 occurrences, but plateaux is sometimes part of a place-name such as Mille Plateaux or Hauts Plateaux), there is considerable frequency variation in the same corpus concerning the ways of referring to the period 1500–1599:

the 16th century: 3930

the sixteenth century
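
The raw counts quoted above can be put on a comparable footing with a little arithmetic. The following is a minimal sketch, not the author's procedure: it computes each variant's share within a pair and its rate per million words. The corpus size is an assumption, rounded to 1.3 billion words from the "over 1 billion 300 million" stated in the Introduction, and the function names are hypothetical.

```python
from typing import Dict

# Assumption: ukWaC size rounded to 1.3 billion words; the source says "over" this figure.
UKWAC_SIZE = 1_300_000_000


def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw count to occurrences per million words."""
    return count / corpus_size * 1_000_000


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each form's proportion of the combined count of all forms."""
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()}


# Raw frequencies reported in the text for the two plurals of "plateau".
plural_counts = {"plateaux": 430, "plateaus": 333}

for form, share in shares(plural_counts).items():
    rate = per_million(plural_counts[form], UKWAC_SIZE)
    print(f"{form}: {plural_counts[form]} hits, "
          f"{share:.1%} of the pair, {rate:.2f} per million words")
```

Per-million rates matter chiefly when counts from corpora of different sizes are set side by side, as with the English and Italian web corpora used here. Note that the raw count for plateaux includes place-names, as the text points out, so its share is an upper bound.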

Details

Pages
176
ISBN (PDF)
9783034344180
ISBN (ePUB)
9783034344197
ISBN (MOBI)
9783034344203
ISBN (Hardcover)
9783034343688
Language
English
Publication date
2021 (October)
Published
Bern, Berlin, Bruxelles, New York, Oxford, Warszawa, Wien, 2021. 176 pp., 8 fig. b/w, 45 tables.

Biographical notes

Dominic Stewart (Author)

Dominic Stewart teaches linguistics and Italian-English translation at the Department of Humanities, University of Trento, Italy, having previously taught at the Universities of Macerata, Bologna and Verona. He publishes mainly in the areas of corpus linguistics and translation.
