A Web of New Words

A Corpus-Based Study of the Conventionalization Process of English Neologisms

by Daphné Kerremans (Author)
©2015 Thesis 278 Pages
Series: English Corpus Linguistics, Volume 15


This book presents the first large-scale usage-based investigation of the conventionalization process of English neologisms in the online speech community. The study answers the longstanding question of how and why some neologisms become part of the English lexicon and others do not. It strings together findings and assumptions from lexicological, sociolinguistic and cognitive research and supplements the existing theories with novel data-driven insights. For this purpose a webcrawler was developed, which extracted the occurrences of the neologisms under consideration from the Internet in monthly intervals. The book shows that the different courses conventionalization processes may take result from the interplay between speaker-based sociopragmatic accommodation-induced aspects and factors facilitating cognitive processing of novel linguistic material.

Table Of Contents

  • Cover
  • Title
  • Copyright
  • About the author
  • About the book
  • Acknowledgements
  • This eBook can be cited
  • Contents
  • List of figures and tables
  • 1. Introduction
  • 1.1. General introduction
  • 1.2. Research questions
  • 1.3. Outline
  • 2. Neologisms in linguistics
  • 2.1. Lexicographical and lexicological approaches
  • 2.1.1. What are neologisms?
  • 2.1.2. The establishment of neologisms: Lexicalization, institutionalization and hypostatization
  • 2.1.3. Two empirical studies on institutionalization
  • 2.2. Cognitive-linguistic aspects of neologisms
  • 2.2.1. The importance of co(n)text
  • 2.2.2. The importance of exposure
  • 2.2.3. The importance of transparency
  • 2.3. Establishment revisited: Conventionalization
  • 3. Investigating English neologisms on the Internet
  • 3.1. The Web as a corpus? Static and dynamic applications
  • 3.1.1. General problems
  • 3.1.2. Downloadable crawlers
  • 3.1.3. On-demand crawlers
  • 3.2. The NeoCrawler: retrieving and monitoring neologisms online
  • 3.2.1. The Discoverer
  • 3.2.2. The Observer
  • 3.3. Data selection and investigation procedure
  • 3.3.1. Data selection
  • 3.3.2. The NeoCrawler’s socio-pragmatic classification system
  • 3.3.3. Operationalizing nameworthiness
  • 4. The conventionalization process of English neologisms
  • 4.1. The conventionalization continuum
  • 4.1.1. Non-conventionalization
  • 4.1.2. Topicality or transitional conventionalization
  • 4.1.3. Recurrent semi-conventionalization
  • 4.1.4. Advanced conventionalization
  • 4.1.5. Summary
  • 4.2. Conventionalization factors
  • 4.2.1. Coiner status
  • 4.2.2. Type of source
  • 4.2.3. Metalinguistic usage
  • 4.2.4. Nameworthiness
  • 4.2.5. Semantic ambiguity
  • 4.2.6. Syntagmatic lexical networks
  • 5. Incipient lexical networks in the conventionalization process of English neologisms
  • 5.1. Collocations as syntagmatic lexical relations
  • 5.2. The emergence of syntagmatic lexico-semantic networks
  • 5.2.1. Gradual emergence
  • 5.2.2. Instantaneous emergence
  • 5.2.3. Other syntagmatic usage patterns
  • 5.3. Collocations as cotextual anchoring points during conventionalization
  • 6. Summary and conclusion
  • 6.1. Summary of the results
  • 6.2. Towards a sociocognitive model of the conventionalization process of English neologisms
  • Appendices
  • Appendix 1: List of nonce-formations in alphabetical order
  • Appendix 2: List of neologisms in alphabetical order
  • Appendix 3: Questionnaires for the nameworthiness experiment
  • Appendix 4: Frequency distribution of word-forms and morpho-lexical relatives of robosigning
  • Appendix 5: Type of source frequency distribution in objectlinguistic use
  • Bibliography

List of figures and tables

Table 1: The establishment process from three perspectives

Table 2: Examples of results of metalinguistically-driven searches for neologisms

Table 3: Page-tab classification scheme

Table 4: Token-tab classification scheme

Table 5: Linguistic properties of the neologisms in the sample

Table 6: Herring’s multi-faceted classification scheme for CMD

Table 7: Overview of collocation patterns for bloglets

Table 8: Orthography-token frequency distribution for robosigning

Table 9: Coiner status parameter realization

Table 10: Mode of usage percentages

Table 11: Nameworthiness scores and responses for the target neologisms

Table 12: Types of utility responses

Table 13: Overview of lexical units of detweet

Table 14: Overview of collocates for hyperlocal between 2007 and 2010

Table 15: Collocational profile for intexticated

Table 16: Collocational profile for tube-free

Table 17: Lexemes preceding bromosexual in the DISCUSSION FORUM and SOCIAL NETWORKS categories

Table 18: Collocates for morpho-lexical associates of bloglet in the OEC

Table 19: Collocational profile for bro in the OEC

Table 20: Comparison of the collocates for encore career and career

Table 21: Summary of the factor-of-influence analysis

Fig. 1: Integration of Milroy’s and Rogers’ model of diffusion stages into an S-curve

Fig. 2: Database design of the NeoCrawler

Fig. 3: An overview of potential neologisms and their processing options

Fig. 4: The architecture of the NeoCrawler

Fig. 5: Overview of search processes for halfalogue

Fig. 6: User interface display for one search process for halfalogue

Fig. 7: An example of a personal blog

Fig. 8: An example of a professional blog

Fig. 9: An example of a non-topic-specific portal

Fig. 10: An example of a discussion forum page

Fig. 11: An example of a filesharing website

Fig. 12: An example of a private Facebook page and its interaction options

Fig. 13: New and cumulative pages per month of roofvertising

Fig. 14: New and cumulative pages per month in different modes of usage of mesofact

Fig. 15: New and cumulative pages per month of back scooping

Fig. 16: Frequency of back scooping in different types of source in April 2010

Fig. 17: Cumulated pages per month in different modes of usage of kindergarchy

Fig. 18: New pages per month of burquini

Fig. 19: New and cumulative pages per month of cherpumple

Fig. 20: Relative overall frequencies for the observed referents of Boobgate

Fig. 21: New and cumulative pages per month of robosigning

Fig. 22: Type of source distribution from August until October 2010 of robosigning

Fig. 23: New and cumulative pages per month of encore career

Fig. 24: Field of discourse diffusion of encore career

Fig. 25: New and cumulative pages per month of slacktivism

Fig. 26a: Frequency distribution of slacktivism across different types of source in October 2010

Fig. 26b: Frequency distribution of slacktivism across different fields of discourse in October 2010

Fig. 27: Effect plot for coiner status

Fig. 28: Academic vs. non-academic cumulative frequency development of diabesity

Fig. 29: Field of discourse distribution of new pages per year of diabesity

Fig. 30: Academic vs. non-academic cumulative frequency development of globesity

Fig. 31: New pages per year in different types of source of encore career

Fig. 32: New pages per month in the categories NEWS and PORTAL of politerati

Fig. 33: Effect plot for the metalinguistic usage (in percentage) and metalinguistic usage x collocation factor

Fig. 34: Cumulative pages per month of tynonym in meta- and objectlinguistic usage

Fig. 35: Comparison of nameworthiness scores for novel and established lexemes

Fig. 36: Effect plot for coiner status x semantic ambiguity

Fig. 37: Cumulative frequency development per assigned meaning of detweet

Fig. 38: Effect plot for collocations and other syntagmatic lexical patterns

Fig. 39: New and cumulative pages per year of bloglet

Fig. 40: New and cumulative pages per month of hyperlocal

Fig. 41: Type of source distribution of hyperlocal

Fig. 42: Field of discourse distribution of hyperlocal

Fig. 43: New and cumulative pages per month of intexticated

Fig. 44: New and cumulative pages per month of tube-free

Fig. 45: New and cumulative pages per month of halfalogue

Fig. 46: Cartoon illustrating facebook official

Fig. 47: New and cumulative pages per month of facebook official

Fig. 48: New and cumulative pages per month of bromosexual

Fig. 49: New and cumulative pages per month of frogurt

Fig. 50: Boxplot for the variable ‘collocation’

1. Introduction

1.1. General introduction

Neologisms are like casting show winners. A minority of them become established singers, some are one-hit wonders and others almost instantaneously disappear into oblivion. The commercial framework of TV shows, gigs and record deals provides a supportive context for initially and instantaneously acquiring a high degree of popularity. As soon as the winner and runners-up have been crowned, however, attention quickly dwindles and independent careers take off or do not take off. Some winners might still score a hit with their finale song before vanishing from the public stage, others do manage to outlive the initial hype. Occasionally, a contestant does not even have to win to embark on a successful career. There does not seem to exist a recipe for success, nor a foolproof system that predicts who will make it and who will not. After all, many pop- and rockstars do not even need a casting show to become famous. In language, similar processes continuously take place. Of the many new words that enter the English language at a given point in time, some just happen to blend into it almost unnoticedly, whereas others stand out, receive a good deal of attention as ephemeral fashion words only to disappear after the hype has settled down. Although the following study is not intended to develop a linguistically-programmed crystal ball that would prophesy which novel words will become established in the English language, the underlying question is the same as for the casting show winners: which factors determine or influence, and to what degree, whether a novel formation diffuses through the speech community and possibly becomes a permanent addition to the language?

Prior to discussing the scope and aims of the present work in 1.2, it is necessary to briefly sketch the etymological origin and development of the word neologism, because it is partially responsible for the vagueness with which it is applied in linguistics (see 2.1.1) and its study in the realm of lexicography rather than in more theoretically-oriented branches, leaving the central question in the previous paragraph largely unanswered. Thus, simply put, a neologism is a new word, from Greek neo- ‘new’ and logos ‘word’. Following this definition, neologisms as sources of lexical enrichment are intrinsic parts of dynamic language use and development, both from a synchronic and diachronic perspective. Perhaps surprisingly, the word neology and its subsequently emerging lexical family did not appear in English until the Enlightenment in the 18th century as a borrowing from French (cf. Clauzure 2003: 208). In French, the interest in neologisms had been awakened earlier by the Pléiade poets in the 16th century, who claimed that the more words a language has, the more perfect it becomes and therefore introduced many novel formations and borrowings in their works and the language at large (cf. Alaoui 2003: 150-156). Not until the 18th century, however, did the concept gain a strong foothold in the French language and become lexicalized as néologique in 1725 (when the Dictionnaire néologique appeared as the first of many new word collections to reflect technological developments and political ideas), néologisme in 1734 and néologie in 1759 (cf. Alaoui 2003: 163-172). In English, the first use of neological in 1754, borrowed from French, is attributed to Lord Chesterfield, who envisioned a dictionary for the elite, “a genteel neological dictionary, containing those polite […] words and phrases, commonly used by the beau monde” (quoted in Clauzure 2003: 208).
Such dictionaries of hard words had already existed since the 17th century, for instance, Robert Cawdrey’s Table Alphabeticall from 1604, but were concerned with borrowings from Latin, Greek etc., which were of course new in English, rather than novel English coinages per se.

Towards the end of the 18th century, possibly influenced by a similar development in French, neological acquired a second, slightly more negative sense (cf. Clauzure 2003: 209), as illustrated by another quote by Lord Chesterfield (in Clauzure 2003: 208): “the affected, the refined, the neological, or new and fashionable style, are at present too much in vogue in Paris”. Such objections fit the general spirit of 18th century England, in which language purity and reform were high on the agenda (cf. Baugh and Cable 2002: 274-288). Although Clauzure (cf. 2003: 218) claims that the same negative connotation affects neologism, again transferred from French (cf. Sablayrolles 2000: 55), on the basis of the quotations or illustrations in dictionaries, not all of the examples from the Oxford English Dictionary (OED) support this claim. Moreover, neologism seems to have entered English earlier than assumed. Clauzure mentions 1800 or afterwards (cf. 2003: 208, 211), but the OED lists an attestation from 1772.

In view of its origins in a time when the language was seen as in need of protection from, rather than embellishment with, instances of lexical innovation, it is not astonishing that “[t]he history of English lexicography begins with the study of neology”, as remarked by John Algeo, neologist pur sang (1993: 281), and that neology has remained somewhat neglected in theoretical linguistics. He lists an impressive chronology of dictionaries, word lists and popular books devoted to neologisms from Cawdrey’s 1604 publication until the end of the 20th century (cf. 1993: 282-282). In recent years, these print records have been and are being supplemented with online collections such as the periodical Word trends and new words blog from the OED, MacMillan’s Open Dictionary and BuzzWord column, Webster’s New Words and Slang section of their Open Dictionary and various private word-watching websites among which Paul McFedries’ WordSpy and Michael Quinion’s World Wide Words are the most acclaimed.

In spite of a common interest in neologisms, lexicographers, word spies and neolinguists have different goals and work within different theoretical paradigms. Whereas lexicographers continually face the difficult challenge of collecting new words and deciding which ones to include in their dictionaries, neolophilical word-watchers simply amass novel coinages, predominantly conspicuous vogue words that frequently turn out to be one-hit wonders, and provide a rudimentary profile of meaning and use. Neolinguists occupy an intermediate position. In addition to tracking new words, they are profoundly interested in their linguistic behavior and relation to other linguistic phenomena, which necessarily transcends a mere description of form and meaning. However, since their concerns are purely theoretical in nature, they do not need to evaluate the linguistic and extralinguistic durability in order to justify inclusion in dictionaries. Nevertheless, the question of why certain words become established and others become obsolete periodically arises here too. Partly due to the fact that neologisms have seemed to belong in the realm of lexicography and partly due to the lack of adequate empirical tools, this question has largely remained unanswered. Crystal even claims that “there is never any way of telling which neologisms will stay and which will go” (1995: 132). Rather than taking this statement at face value, in the present study, I will provide a tentative answer to the unanswered question by empirically investigating several linguistic and extralinguistic factors that affect the conventionalization process, i.e. the process by means of which neologisms become established in the language and the speech community, to varying degrees.

1.2. Research questions

In view of the importance of language as a communication device in everyday human interaction, the need for new words arises perpetually “as they are required” (Aitchison 1991: 118). Most conspicuously, this need emerges when new objects or concepts are introduced in society or when objects or concepts change and their original names have become inept (cf. Aitchison 1991: 118). The social need is frequently intertwined with a semantic need in the language, i.e. to fill a lexical gap (cf. Bauer 1983: 43; Aitchison 1994: 158; Kjellmer 2000: 221). The coinage of new words is not necessarily motivated by naming requirements in society. Stylistic concerns or the need to be succinct play a role too, particularly in formal and creative writing (cf. Bauer 1983: 43; Aitchison 1994: 158). Thus, one major German bookseller praised their new recommendations as “unputdownable books”. ‘Unputdownable’ can hardly be characterized as a new concept and it is doubtful whether this particular lexical gap needs to be filled. Rather, unputdownable condenses an entire syntactic phrase (‘difficult/hard/impossible to put down’) into one novel coinage, which saves space and catches the attention of the reader. Often, however, speakers create novel words with less conscious effort because they cannot instantaneously recall the established word in a conversation or because the lexicon does not provide an adequate expression (cf. Bauer 1983: 43). These instances of conversational need are individual, small-scale innovations that typically quickly disappear or remain restricted to the vocabulary of the conversation partners.

Despite the ubiquity of new word coinage processes, their products predominantly make transient appearances. Algeo observed that 58% of the new words collected in the Britannica Book of the Year between 1944 and 1976 were not rewarded with a dictionary entry, which represents evidence of their non-establishment or obsolescence (cf. 1993: 281, 283). Whereas he focused on a classificatory description of the extinct words and offered socio-functional explanations for their “desuetude” (Algeo 1993: 281) or death, other authors have concentrated on their birth and suggested several diffusion-inhibiting or -promoting factors. Unfortunately, many of these claims have never been investigated empirically, or have been examined only unsystematically, for instance with regard to a small selection of neologisms and restricted to specific genres (see 2.1 for an overview). The present work presents a large-scale, usage-based approach to these claims and attempts to provide a solid scientific basis for studying the diffusion of English neologisms.

In his entertaining book Predicting New Words, for instance, Allan Metcalf proposes to assess the failure or success rate of neologisms according to the “FUDGE factors” (2002: 152). He says that “the success or failure of new words is not entirely random. Some factors evidently make for success, while others hinder it.” (2002: 149). His FUDGE collection consists of linguistic and extralinguistic components. In his opinion, “unobtrusiveness” (2002: 155) is particularly significant (cf. 2002: 144, 167, 185); new words should “fl[y] under the radar” and “camouflage[…]” themselves, because “[o]ur minds are inclined to reject a conspicuous new word; it has to blend into the familiar landscape (or wordscape) before we can let it in” (2002: 156). In linguistic terms, this means that a novel lexeme should be phonologically, morphologically, semantically and orthographically consistent with established patterns in English (cf. Kjellmer 2000: 206, 208-216). Moreover, it should be formally and functionally-semantically unambiguous, because of the requirements of successful communication. Since new words are by definition not or at best minimally familiar to the reader or hearer, the intentions of the writer or speaker can only be transferred unproblematically if the meaning is straightforward, i.e. not competing with other senses that are equally new. As a corollary of the unobtrusiveness factor, neologisms stand a better chance of becoming conventionalized when they are not the topic of metalinguistic discourse (cf. Metcalf 2002: 185). Metalinguistic discourse involves readers or writers, speakers or hearers commenting on the coinage, existence, emergence or formal shape of a neologism or providing explanations and definitions, e.g. I thought “bromance” was a very clever term, the first time I heard it.
These instances are commentarial-evaluative meta-uses; the word itself becomes the center of attention, but is not used with a new class of referents as in objectlinguistic use in the example Is this gay or bromance? I’m confused:(?, introducing a passage on an instance of a complicated intense male friendship. Although initially, diffusion might be promoted by metalinguistic usage, reflected in an increase in frequency, it is debatable whether such metalinguistic discourse will also propel the active-objectlinguistic use of the neologism forward. The unobtrusiveness criterion yields the two following negative hypotheses to be presently investigated:


ISBN (Hardcover)
Publication date: 2015 (January)
Keywords: Conventionalization, sociocognitive approach, neologisms, diffusion, online speech community
Frankfurt am Main, Berlin, Bern, Bruxelles, New York, Oxford, Wien, 2015. 278 pp., 21 tables, 50 graphs

Biographical notes

Daphné Kerremans (Author)

Daphné Kerremans studied English Linguistics and Phonetics at the University of Regensburg and Ludwig-Maximilians-University Munich (both Germany). She holds a PhD from the Ludwig-Maximilians-University Munich, where she works at the Chair of Modern English Linguistics. Her research interests include usage-based word-formation and lexicology, lexicography, corpus linguistics, cognitive sociolinguistics and semantics.

