Acoustics of the Vowel


Dieter Maurer

It seems as if the fundamentals of how we produce vowels and how they are represented acoustically have been clarified: we phonate and articulate. Using our vocal cords, we produce a vocal sound or noise, which is then shaped into a specific vowel sound by the resonances of the pharyngeal, oral, and nasal cavities, that is, of the vocal tract. Accordingly, the acoustic description of vowels relates to vowel-specific patterns of relative energy maxima in the sound spectra, known as formant patterns.
The intellectual and empirical reasoning presented in this treatise, however, gives rise to scepticism with respect to this understanding of the sound of the vowel. The reflections and materials presented provide reason to argue that, up to now, a comprehensible theory of the acoustics of the voice and of voiced speech sounds is lacking, and consequently, no satisfying understanding of vowels as an achievement and particular formal accomplishment of the voice exists. Thus, the question of the acoustics of the vowel – and with it the question of the acoustics of the voice itself – proves to be an unresolved fundamental problem.

2 Prevailing Empirical References

2.1    General References

The first extensive statistical study of the correspondence between vowels and formant patterns across the three speaker groups of children, women, and men was conducted by Peterson and Barney (1952; see Table 1 and Figure 1). Their study focused on American English and later became one of the dominant references in the literature.

Hillenbrand, Getty, Clark, and Wheeler (1995) used new recording and measurement methods (digitisation, LPC analysis) as well as an extended set of 12 vowels to replicate the classic study of Peterson and Barney (see Table 2).

Parallel to Peterson and Barney, Fant (1959) published a statistical study of Swedish vowels. However, Fant’s study was limited to the two speaker groups of men and women (see Table 3).

Presumably, the vowel-specific formant patterns as given by Peterson and Barney (1952) and Hillenbrand et al. (1995) are the most widely cited references in general discussions of the physical characteristics of vowels. The statistics of Fant (1959) also played an important role in the development of the source-filter theory.

2.2    Empirical Reference for Standard German

Pätzold and Simpson (1997) conducted a statistical study of the vowels of Standard German as produced by men and women (see Table 4, limited to monophthongs). These values are given here because, as mentioned in the introduction, most of the author’s experiences and observations to date concern the sounds of the German language, and corresponding references are made in the text from Part II onward.

2.3    Other Statistical References

References to other formant statistics and additional data of interest to the present discussion can be found in the Materials section. Such information includes formant statistics for vowels of different languages, model-like formant patterns, formant statistics for whispered vowels, and indications concerning formant patterns of vowel sounds at different fundamental frequencies.