Acoustics of the Vowel


Dieter Maurer

It seems as if the fundamentals of how we produce vowels and how they are acoustically represented have been clarified: we phonate and articulate. Using our vocal cords, we produce a vocal sound or noise which is then shaped into a specific vowel sound by the resonances of the pharyngeal, oral, and nasal cavities, that is, the vocal tract. Accordingly, the acoustic description of vowels relates to vowel-specific patterns of relative energy maxima in the sound spectra, known as patterns of formants.
The intellectual and empirical reasoning presented in this treatise, however, gives rise to scepticism with respect to this understanding of the sound of the vowel. The reflections and materials presented provide reason to argue that, up to now, a comprehensible theory of the acoustics of the voice and of voiced speech sounds is lacking, and consequently, no satisfying understanding of vowels as an achievement and particular formal accomplishment of the voice exists. Thus, the question of the acoustics of the vowel – and with it the question of the acoustics of the voice itself – proves to be an unresolved fundamental problem.
11 Lack of Correlation between Methodological Limitations of Formant Determination and Limitations of Vowel Perception

11.1    Vowel Perception at Fundamental Frequencies > 350 Hz

As discussed in Section 8.2, recognisable vowels can be produced at fundamental frequencies substantially exceeding the critical limit above which formants can no longer be reliably determined for methodological reasons.

Vowel perception is maintained for sounds at fundamental frequencies > 350 Hz. Yet, for these middle and higher fundamental frequency ranges, formant pattern estimation is questionable for methodological reasons. Thus, the methodological limitation of determining formant patterns of vowel sounds at fundamental frequencies > 350 Hz does not coincide with impaired vowel intelligibility.
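
The methodological limitation can be illustrated with a small numerical sketch. It is only an illustration under assumptions that are not taken from the text: the envelope peak at 300 Hz and the 5 kHz analysis ceiling are hypothetical values, and the function names are invented for this example. A harmonic spectrum carries energy only at integer multiples of the fundamental frequency, so the higher the fundamental frequency, the fewer points are available from which a spectral envelope, and hence a formant pattern, could be estimated.

```python
def harmonic_frequencies(f0_hz, ceiling_hz=5000.0):
    """Return the harmonic frequencies k * f0 that lie at or below ceiling_hz."""
    count = int(ceiling_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


def distance_to_nearest_harmonic(peak_hz, f0_hz, ceiling_hz=5000.0):
    """Return the distance in Hz between an envelope peak and the closest harmonic."""
    harmonics = harmonic_frequencies(f0_hz, ceiling_hz)
    return min(abs(h - peak_hz) for h in harmonics)


# Hypothetical envelope peak, chosen for illustration only.
HYPOTHETICAL_PEAK_HZ = 300.0

for f0 in (100.0, 200.0, 350.0, 500.0, 700.0):
    n = len(harmonic_frequencies(f0))
    gap = distance_to_nearest_harmonic(HYPOTHETICAL_PEAK_HZ, f0)
    print(f"f0 = {f0:5.0f} Hz: {n:2d} harmonics up to 5 kHz, "
          f"nearest harmonic {gap:.0f} Hz from the assumed peak")
```

In this sketch, a fundamental frequency of 100 Hz places a harmonic exactly on the assumed peak, whereas at 500 Hz the lowest spectral component already lies above it. The sketch says nothing about perception; it only shows why the estimation becomes questionable.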

Consequently, formulating a general theory of the physical representation of vowels based on formant patterns proves to be problematic due to the related methodological limitations.

11.2    Lack of Correspondence between Methodological Problems of Formant Pattern Estimation at Fundamental Frequencies ≤ 350 Hz and Impaired Vowel Perception

Some vowel sounds produced at fundamental frequencies ≤ 350 Hz pose problems for formant pattern estimation for reasons other than fundamental frequency, for instance because expected relative spectral energy maxima are “missing” or because vowel-related parts of the spectrum are “flat”. Such sounds are not less recognisable than vowel sounds for which formant pattern estimation may be said to be unproblematic.

Methodological problems regarding the determination of formant patterns of vowel sounds at fundamental frequencies ≤ 350 Hz do not necessarily coincide with impaired vowel intelligibility.

11.3    Addition: Lack of Methodological Basis of Determining Formant Patterns for Vowel Mimicry by Birds

Given the prevailing methodological standards, strictly speaking, the imitation of human vowel sounds by birds cannot be studied in terms of formant patterns. As explained in Section 6.3, formant calculation requires parameter settings for the frequency range and the maximum number of filters used in the analysis in relation to a specific vocal-tract size. Birds, however, have no vocal tract comparable to that of humans. Hence, it is impossible to determine how many filters should be used in analysing a vowel-like sound produced by a bird to determine vowel-specific formants.
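
The dependence of these parameter settings on vocal-tract size can be made concrete with a textbook idealisation that is not part of the present argument: a uniform tube closed at one end, whose resonances lie at (2n − 1) · c / (4L). The following sketch assumes a speed of sound of 350 m/s, and the tube lengths and ceiling frequencies are hypothetical example values. It counts how many resonances such a tube has below a given ceiling, which is the kind of expectation that motivates the choice of the number of filters for a human speaker.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value, for illustration only


def tube_resonances(length_m, ceiling_hz):
    """Resonances (2n - 1) * c / (4 * L) of a uniform tube closed at one end,
    listed in ascending order up to ceiling_hz."""
    resonances = []
    n = 1
    while True:
        f = (2 * n - 1) * SPEED_OF_SOUND_M_S / (4.0 * length_m)
        if f > ceiling_hz:
            break
        resonances.append(f)
        n += 1
    return resonances


# Hypothetical tube lengths and analysis ceilings (assumptions of this sketch).
for length_m, ceiling_hz in ((0.175, 5000.0), (0.15, 5500.0)):
    found = tube_resonances(length_m, ceiling_hz)
    rounded = [round(f) for f in found]
    print(f"L = {length_m * 100:.1f} cm, ceiling {ceiling_hz:.0f} Hz: "
          f"{len(found)} resonances at {rounded} Hz")
```

For a bird, no comparably motivated value of the tube length is available, so this kind of estimate cannot be made, and the number of filters remains without a principled basis.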

Thus, in a first step, comparisons between the utterances of humans and birds must be based on a direct comparison of the respective spectra and must relate to the interpretation of observable relative spectral energy maxima. However, in a subsequent step, formant analysis double-checked by resynthesis may be applied, even if methodologically unsubstantiated, in order to foster the discussion.

Again, this methodological limitation of mimicry analysis does not coincide with a fundamental difficulty in identifying the imitated vowel sounds involved.