Speech production and perception: Learning and memory

Fuchs, Susanne; Cleland, Joanne; Rochet-Capellan, Amélie

by Susanne Fuchs (Volume editor)

Joanne Cleland (Volume editor)

Amélie Rochet-Capellan (Volume editor)

Linguistics

Open Access

Series: Speech Production and Perception, Volume 6

Summary

Learning and memory processes are basic features of human existence. They allow us to (un)consciously adapt to changes in our social and physical environment in a variety of ways and may have been a precursor for survival in human evolution. Through several reviews and original studies, the book focuses on three key topics that have advanced our understanding of the field over the last twenty years: first, the role of real-time auditory feedback in learning; second, the role of motor aspects in learning and memory; and third, representations in memory and the role of sleep in memory consolidation.
The electronic version of this book is freely available, thanks to the support of libraries working with Knowledge Unlatched. KU is a collaborative initiative designed to make high quality books Open Access for the public good. More information about the initiative and links to the Open Access version can be found at www.knowledgeunlatched.org

Excerpt

Cover
Title Page
Copyright Page
About the editors
About the book
Citability of the eBook
Contents
List of Contributors
Preface
Changes in speech production in response to formant perturbations: An overview of two decades of research
Spatial and temporal variability of corrective speech movements as revealed by vowel formants during sensorimotor learning
Aetiology of speech sound errors in autism
Acquisition of new speech motor plans via articulatory visual biofeedback
Do manual gestures help the learning of new words? A review of experimental studies
Interference in memory consolidation of non-native speech sounds
Looking for exemplar effects: testing the comprehension and memory representations of r’duced words in Dutch learners of French

List of Contributors

Louis ten Bosch

Centre for Language Studies, Radboud University, Nijmegen, the Netherlands

Jana Brunner

Institut für Deutsche Sprache und Linguistik, Humboldt-Universität zu Berlin, Germany

Tiphaine Caudrelier

Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France

Joanne Cleland

University of Strathclyde, Glasgow, United Kingdom

Jonathan Delafield-Butt

University of Strathclyde, Glasgow, United Kingdom

Marion Dohen

Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France

Mirjam Ernestus

Centre for Language Studies, Radboud University, Nijmegen, the Netherlands;

Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands

Susanne Fuchs

Leibniz-Zentrum Allgemeine Sprachwissenschaft (ZAS), Berlin, Germany

Pamela Fuhrmeister

Department of Speech, Language, and Hearing Sciences, University of Connecticut, United States

Phil Hoole

Institut für Phonetik und Sprachverarbeitung, Ludwig-Maximilians-Universität München, Germany

Eugen Klein

Institut für Deutsche Sprache und Linguistik, Humboldt-Universität zu Berlin, Germany

Louise McKeever

University of Strathclyde, Glasgow, United Kingdom

Lisa Morano

Centre for Language Studies, Radboud University, Nijmegen, the Netherlands

Amélie Rochet-Capellan

Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France

James M. Scobbie

Queen Margaret University, United Kingdom

Eugen Klein, Jana Brunner, Phil Hoole

Spatial and temporal variability of corrective speech movements as revealed by vowel formants during sensorimotor learning

Abstract: Previous perturbation studies demonstrate that speakers can reorganize their motor strategies to adapt to articulatory or auditory perturbations (Savariaux, Perrier & Orliaguet, 1995; Rochet-Capellan & Ostry, 2011). However, across most studies we observe a varying degree of inter-individual differences in the adaptation outcome. To evaluate the predictions of the hypotheses put forward to explain these differences, we conducted a multidirectional auditory perturbation study investigating F2 perturbation with native Russian speakers. During participants’ production of CV syllables containing the close central unrounded vowel /ɨ/, F2 was perturbed in opposing directions depending on the preceding consonant (/d/ or /g/). The bidirectional shift was intended to encourage participants to produce the vowel /ɨ/ with two different motor strategies and allowed us to investigate intra-individual variation of adaptation patterns as a function of the perturbation direction and the consonantal context. To examine the evolution of the adaptation process, we performed generalized additive mixed modelling (GAMM) on the averaged and individual formant data using the experimental trials as discrete time points. In doing so, we were able to examine sudden changes in participants’ adaptation strategies, which appeared as non-linearities in the F2 curve. Our results suggest that previously formulated hypotheses regarding individual adaptation processes make empirical predictions which are not confirmed by the bidirectional perturbation data. Therefore, we propose a more general hypothesis that successful adaptation depends on speakers’ ability to coordinate the perceived auditory errors with appropriate compensatory movements, which in turn is influenced by the complexity of the adaptation task. We discuss this hypothesis in the context of individual adaptation patterns and show that it can explain not only the inter-individual but also the inter-study variability observed in previous perturbation studies.

Keywords: auditory feedback, real-time perturbations, formants, variability, individual behavior, generalized additive mixed modelling, Russian

1. Introduction

1.1. Perturbation and sensorimotor learning

Picture the situation of taking a photo of beautiful lakeside scenery and accidentally dropping your camera into the water. Despite your misfortune, you are lucky and can spot the camera within what appears to be a reachable distance at the lake bottom. Hastily, you try to retrieve the camera but grab a few times beside it before you can actually take hold of it. Or even worse, you realize that the bottom that appeared reachable lies in fact much deeper below the water surface. In the described example, the coordination between your visual input and your hand movements is disrupted by the visual distortions caused by the refraction of light at the boundary between the water and the air. The fact that you can eventually grab the camera after a few attempts, assuming the lake bottom is indeed reachable by hand, provides evidence for the flexibility of the human sensorimotor system, which is able to adapt to the visual perturbations and to find alternative motor strategies to reach the intended goal.

The same is mostly true for mechanical and auditory perturbations of speech. That is, when you later recount your tale of bad luck to a friend during the conference dinner, and you get upset about the unreasonable repair costs of your camera, you might speak with a mouth full of food. In this case, your articulators’ movements might be impeded by pieces of food, which will force you to find alternative strategies to intelligibly articulate the words you intend to utter. Or, in another scenario, you may have to increase the loudness of your voice to compensate for the loud conversation happening at the table next to yours.

During experiments applying controlled perturbation, speakers have to produce speech under aggravated conditions, e.g., under blockage of their jaw movements or under altered auditory feedback. As in the initial example with the hand-eye coordination, speakers need to coordinate errors transmitted by their sensory input with appropriate corrective articulator movements to be able to retain intelligibility of their speech. In the case of speech, it is particularly intriguing which sensory channels (e.g., somatosensory, proprioceptive, or auditory) are involved in the process of adaptation. The answer to this question may provide a better understanding of the different types of sensory information relevant for speech production and ultimately the goals of articulator movements. Thus, the study of perturbed speech provides an empirical means to study the nature of speech sound representations as well as learning processes that occur in speech production.

1.2. Outcome variability in perturbation studies of speech

Despite the general ability of speakers to reorganize their motor strategies to retain the acoustic make-up of the intended speech sounds under aggravated conditions, the outcome of adaptation processes in speech exhibits high inter-individual and inter-study variability. For instance, Gay, Lindblom and Lubker (1981) examined participants’ productions of vowels when a bite block was inserted between their teeth. The authors found that speakers were able to adapt to these static perturbations with very little or no practice and produce acoustic outputs equivalent to their unperturbed speech. However, in a study by Savariaux, Perrier and Orliaguet (1995), in which speakers’ lips were blocked with a tube during the production of the French [u], only six out of 11 speakers were able to partially compensate for the labial perturbation and only one speaker compensated completely by changing the constriction location from a velo-palatal to a velo-pharyngeal region. The remaining four speakers did not compensate at all. Similar variability of the experimental outcomes is also observed across other articulatory perturbation studies, e.g., by Baum & McFarland (1997), Jones & Munhall (2003), and Brunner, Hoole & Perrier (2011). To explain this variability, Savariaux et al. (1995) suggest that the varying degree of adaptation among participants is due to “speaker-specific internal representation of articulatory-to-acoustic relationships”.

More recently, it has become possible to study speakers’ articulatory-to-acoustic relations by means of real-time perturbation of speakers’ auditory feedback. This methodology allows alteration of such acoustic parameters as fundamental frequency (f0; Jones & Munhall, 2000) and vowel formants (F1 and/or F2; Houde & Jordan, 1998; Purcell & Munhall, 2006; Villacorta, Perkell & Guenther, 2007), and has the advantage that multiple perturbation conditions can be tested within the same study without participants’ awareness of any systematic manipulations. For instance, Rochet-Capellan and Ostry (2011) perturbed the first formant (F1) in the vowel /ɛ/ in opposing directions depending on the experimental stimulus in which it was embedded (head or bed), while in a control stimulus (ted) the F1 remained unchanged throughout the experiment. The authors found that speakers were overall able to adapt to the three distinct F1 levels, which means that during the study participants employed three different motor strategies to produce the vowel /ɛ/. However, as with the articulatory perturbation studies mentioned above, there is a noteworthy proportion of speakers, ranging from 10 to 20 % per study, who fail to adapt to auditory perturbations. Roughly speaking, these speakers exhibit two qualitatively different types of adaptation behaviors: either adjusting their response in the same direction as the applied perturbation, or hardly reacting to it.

One of the more recent hypotheses put forward to explain the outcome variability observed in perturbation studies is the idea by Lametti, Nasir and Ostry (2012) that speakers have individual preferences for articulatory or auditory feedback to control their speech production. To empirically evaluate their claim, Lametti et al. (2012) investigated participants in different experimental conditions where the authors either perturbed participants’ jaw trajectories without altering their speech acoustics, or perturbed their auditory feedback, or applied both types of perturbation simultaneously. The authors found a negative correlation between the amount of articulatory and auditory adaptation, which means that speakers who adapted to articulatory perturbations adapted to auditory alterations to a lesser degree.

However, Lametti et al.’s (2012) hypothesis conflicts with observations previously made by Ghosh et al. (2010) who investigated the relation between somatosensory and auditory acuity, where acuity stands for the degree to which speakers were sensitive to changes in articulatory and auditory feedback signals. Running contrary to the idea that speakers exhibit individual preferences towards auditory or somatosensory feedback, Ghosh et al. (2010) found that both types of acuity positively correlated with each other as well as with the magnitude of produced sibilant contrasts. In the context of vowels, the latter finding was previously made by Perkell et al. (2004). Furthermore, auditory acuity has been shown to have an influence on the adaptation magnitude during auditory perturbation of vowel formants (Villacorta et al., 2007) as well as during articulatory perturbation of sibilants (Brunner, Ghosh, Hoole, Matthies, Tiede & Perkell, 2011). In contrast to Lametti et al.’s (2012) hypothesis which predicts that speakers who fail to adapt to auditory perturbations should virtually ignore them, individual differences in auditory acuity provide a way to explain partial compensations which are frequently observed in auditory perturbation studies.

Another explanation for partial compensations was provided by Katseff, Houde & Johnson (2012) who suggest that these are the result of speakers’ attempts to integrate the altered auditory signal with the normal somatosensory signal that speakers receive during a perturbation experiment. Similar to other authors (e.g., Sato, Schwartz & Perrier, 2014), Katseff et al. (2012) assume that vowel targets are defined as regions in a multidimensional acoustic-somatosensory space. That is, when during auditory perturbation the acoustic parameters of speakers’ speech are diverted from the target, speakers will compensate for the acoustic error. However, their compensation will stop when the discrepancy between the auditory and somatosensory signals becomes too large. Katseff et al. (2012) support their view by the observation that in their study of F1 perturbation the relative compensation magnitude decreased from 100 % for 50 Hz perturbations to 40 % for 250 Hz perturbations. An analogous finding was previously made by MacDonald, Goldberg & Munhall (2010) for F1 and F2 perturbation.
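The relative compensation figures cited above can be converted into absolute values with a few lines of arithmetic. The sketch below is only an illustration: it assumes that relative compensation is defined as the produced formant change divided by the size of the applied perturbation, and the function name is hypothetical rather than taken from any of the cited studies.

```python
def absolute_compensation_hz(perturbation_hz: float, relative_compensation: float) -> float:
    """Absolute formant change (Hz), assuming relative = compensation / perturbation."""
    return perturbation_hz * relative_compensation


# Figures quoted in the text for Katseff et al. (2012):
# (F1 perturbation in Hz, relative compensation as a proportion)
reported = [(50.0, 1.00), (250.0, 0.40)]

for shift_hz, relative in reported:
    compensation_hz = absolute_compensation_hz(shift_hz, relative)
    # Part of the auditory error that is left uncompensated
    residual_hz = shift_hz - compensation_hz
    print(
        f"{shift_hz:.0f} Hz perturbation: "
        f"compensation {compensation_hz:.0f} Hz, residual {residual_hz:.0f} Hz"
    )
```

Under this assumption, the absolute compensation still grows with the perturbation while the uncompensated part of the auditory error grows faster, which is in line with the idea that compensation levels off once the auditory and somatosensory signals diverge too far.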

Details

Pages: 280
Publication Year: 2019
ISBN (Hardcover): 9783631726914
ISBN (PDF): 9783631797860
ISBN (ePUB): 9783631797877
ISBN (MOBI): 9783631797884
DOI: 10.3726/b15982
Open Access: CC-BY
Language: English
Publication date: 2019 (November)
Keywords: Gestures, Sleep, Second-language learning, Pathology, Speech motor control, Phonetics, Feedback
Product Safety: Peter Lang Group AG

Biographical notes

Susanne Fuchs (Volume editor) Joanne Cleland (Volume editor) Amélie Rochet-Capellan (Volume editor)

Susanne Fuchs is a phonetician and speech motor control researcher at the Leibniz-Zentrum Allgemeine Sprachwissenschaft in Berlin. She investigates the biological grounding of spoken language, iconicity and its origin in sensorimotor properties, as well as the effect of motion on cognitive processes. Joanne Cleland is a researcher and Speech and Language Therapist at the University of Strathclyde in Glasgow. She studies clinical phonetics and articulatory analysis of Speech Sound Disorders in children. Amélie Rochet-Capellan is a researcher at the French CNRS. In the framework of embodied cognition, she studies the links between orofacial and limb sensorimotor control and language in typical speakers and speakers with intellectual deficiencies.
