Ian Howell, M.Mus. Yale University, Vocal Pedagogy Director New England Conservatory
Acoustic recordings from the technological “bleeding edge” of the early twentieth century present opportunities to study celebrated voices from the past, yet they raise challenging performance practice issues when our perceptions of stylistic and technical choices contradict our received aesthetic norms.(1) While tempo, rubato, and phrasing preferences change over time, the conflict between how singers sounded then versus now is perhaps best illuminated when we consider vibrato. Adelina Patti, Dame Nellie Melba, Enrico Caruso, Evan Williams, and many other elite turn-of-the-twentieth-century singers all exhibit rather ‘un-modern’ sounding vibratos on their acoustic recordings: too fast, seemingly muted, or non-existent. This pilot study uses a novel sonic analysis technique to begin to question our ability to draw actionable technical conclusions regarding vibrato from what we hear on historical recordings.
What is vibrato?
We describe vibrato with a few parameters. Fundamentally, it is an oscillation above and below the perceived pitch. How many times per second the pitch oscillates is called the rate, and is described in Hertz (Hz). How far above and below the pitch the vibrato oscillates is called the extent, and is described either in terms of pitch intervals (e.g. one quarter step) or as a percentage of the fundamental frequency (e.g. 3%). If the oscillation causes a change in amplitude (perhaps louder at the top of the cycle), this may be noted as well. Opinions vary, but we tend to think of an average rate near 6Hz (cycles of vibrato per second) and an excursion of about a quarter step (roughly 3%) above and a quarter step below the pitch, for a total extent of a half step, as “pleasing.” These parameters interact with one another in a complex manner, and variations in one may create the perception of changes in the other.
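The two ways of stating extent (a pitch interval or a percentage of the fundamental) can be related with a little arithmetic. The following is a minimal sketch, assuming equal temperament, in which a half step is 100 cents and a quarter step is 50 cents; the function names are illustrative and not drawn from any particular analysis tool.

```python
import math

def cents_to_percent(cents: float) -> float:
    """Percentage change in frequency for a pitch interval given in cents."""
    return (2 ** (cents / 1200) - 1) * 100

def percent_to_cents(percent: float) -> float:
    """Pitch interval in cents for a percentage change in frequency."""
    return 1200 * math.log2(1 + percent / 100)

# Assumption: a quarter step is half of an equal-tempered half step (100 cents).
QUARTER_STEP_CENTS = 50

up = cents_to_percent(QUARTER_STEP_CENTS)     # about +2.93 %
down = cents_to_percent(-QUARTER_STEP_CENTS)  # about -2.85 %

print(f"quarter step up:   {up:+.2f} %")
print(f"quarter step down: {down:+.2f} %")
```

The upward and downward percentages differ slightly because pitch is logarithmic in frequency, which is one reason the "3%" figure is best read as approximate.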
Many kinds of vocal vibrato exist, with varying degrees of susceptibility to active control.(2) Though vibrato may be added or removed in gradations by a trained singer, the type that we find in most classical solo singers is a fairly automated muscular response to a technique developed to produce high amplitude, balanced singing throughout one’s entire range. As this type of vibrato is basically involuntary, unless ideal classical vocal technique has changed in the past 100 years, the vibrato itself ought not have changed significantly.
The Case of Nellie Melba
Dame Nellie Melba is an interesting subject to study as her technical lineage may be traced to the Italian singing master Nicola Porpora (through Mathilde Marchesi, Garcia the younger and elder, and Giovanni Anzoni).(3) Widely celebrated by her contemporary fans and critics, Melba should exemplify the same ideals as modern singers who claim this tradition. However, her 1904 acoustic recording of “Porgi amor” from Mozart’s Le Nozze di Figaro (https://www.youtube.com/watch?v=HMqJ92txdAE) suggests otherwise. One might easily perceive here that she sings much of this aria with either a straight-tone, or with a markedly muted vibrato. However, a spectrogram reveals a regular oscillation of a little more than 6Hz and an average total extent of a little less than a half step at even the straighter-sounding moments. Is this a narrower extent than one might expect from an opera singer in heavy, romantic repertory? Almost certainly yes; however, Melba’s vibrato as seen in the spectrogram is both as pervasive and subtly varied as one would expect from a tasteful modern classical singer, which raises the question of why it does not sound “modern.” Curiously, her vibrato has approximately the same average rate and extent found in many recordings of this repertory by Renée Fleming and Leontyne Price. We examined this particular recording at the NEC Voice and Sound Analysis Laboratory and directly compared it to Fleming and Price’s recordings of the same piece. For contrast we included a sample of Maria Callas, a singer with a very different sounding vibrato.
What is a Spectrogram?
A spectrogram is a visual representation of sound. The voice (like many musical instruments) produces harmonics in multiples of the fundamental frequency (the pitch we hear). While the complex, periodic compression wave that travels through the air is a combination of these harmonics, it can be “split apart” by a process called a fast Fourier transform; each harmonic is then displayed individually. This allows us to peer into the sonic—not just pitch- and time-based—space a recording occupies, filter out specific frequency bands, and appreciate how our experience of sound arises from smaller, discrete units of acoustic energy. Spectrograms read much like printed music; time passes from left to right and pitch is displayed low to high from bottom to top. The software that generated the following images, Overtone Analyzer, inserts a piano keyboard along the left side of the screen to help clarify the data. In the figures that follow, keep in mind that all the squiggly lines belong to the singer. Each harmonic has its own pitch; the relative strength (louder sounds are dark, hot colors like red and orange while quieter sounds are cool colors like blue and black) of the harmonics determines our perception of the timbre and vowel. Also visible in the following images are background noise and the accompanying instruments (clear examples of straight-tones interspersed throughout the vocal harmonics). To learn more about spectrograms, read this primer.
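To make the idea of "splitting apart" a tone concrete, here is a minimal sketch that builds a synthetic tone from three harmonics and recovers them with a plain discrete Fourier transform (a fast Fourier transform computes the same result more efficiently). The sample rate, the 500 Hz fundamental, and the harmonic amplitudes are illustrative assumptions, not measurements from any recording, and this is not a description of how Overtone Analyzer works internally. A spectrogram repeats this kind of analysis over many short, successive slices of time.

```python
import cmath
import math

SAMPLE_RATE = 4000                # samples per second (assumed)
N = 400                           # 0.1 s of sound, so each bin is 10 Hz wide
F0 = 500.0                        # hypothetical fundamental frequency in Hz
HARMONIC_AMPS = [1.0, 0.5, 0.25]  # assumed strengths of harmonics 1, 2, 3

# Sum of sine waves at whole-number multiples of the fundamental.
signal = [
    sum(
        amp * math.sin(2 * math.pi * F0 * (k + 1) * n / SAMPLE_RATE)
        for k, amp in enumerate(HARMONIC_AMPS)
    )
    for n in range(N)
]

def dft_magnitudes(samples):
    """Amplitude of each frequency bin below the Nyquist frequency (plain DFT)."""
    n_samples = len(samples)
    magnitudes = []
    for k in range(n_samples // 2):
        acc = sum(
            samples[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples)
        )
        # Scale so a sine of amplitude A shows up as A (not valid for bin 0).
        magnitudes.append(2 * abs(acc) / n_samples)
    return magnitudes

bin_hz = SAMPLE_RATE / N
peaks = [
    (k * bin_hz, round(m, 2))
    for k, m in enumerate(dft_magnitudes(signal))
    if m > 0.1
]
print(peaks)  # one (frequency, amplitude) pair per harmonic
```

Each recovered peak corresponds to one of the horizontal lines a spectrogram would draw for a steady tone; with vibrato, those lines become the wavy traces discussed in the text.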
This comparative analysis technique explores the idea that the recording technology itself failed to capture sonic information relevant to our perception of Melba’s vibrato. While we may never recover acoustic information lost at the moment of sound capture, by removing the same information from modern recordings and considering the change in our perception of vibrato, we may imagine what Melba may have sounded like live. To provide a direct comparison, we isolated a single pitch (D5 on the “ah” vowel) from each recording. Fleming, Price, and Callas’ samples are all drawn from the pitch marked “A” (see figure 1). Melba’s sample is taken from the pitch marked “B,” as her recording is a half step lower. Each one-second sample is placed side by side in figure 2, which allows for direct comparison of the rate and extent of the vibratos. The similar rate (the number of peaks in the wavy lines per second) and extent (how far up and down they move) of Melba, Price, and Fleming are easily seen in the second harmonic (D6), strong in all four samples. All four samples have a rate of about 6Hz; Callas’ vibrato is differentiated by a visibly wider extent, which gives the impression of a slower rate.
Figure 2 audio:
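Reading rate and extent off a wavy harmonic can also be done numerically once a pitch contour is available. The sketch below is one simple way to do so, not the procedure used for the figures: it generates a synthetic one-second contour (hypothetical values: D5 at 587.33 Hz, 6 Hz rate, a quarter step either side), then estimates the rate from the spacing of upward crossings of the center pitch and the total extent from the highest and lowest points. The frame rate and all function names are assumptions made for illustration.

```python
import math

FRAME_RATE = 100  # pitch estimates per second (assumed)

def synthetic_contour(f0, rate_hz, half_extent_cents, seconds=1.0, phase=0.3):
    """Hypothetical pitch contour: a sinusoidal vibrato around f0, in Hz."""
    n_frames = int(seconds * FRAME_RATE)
    return [
        f0 * 2 ** (half_extent_cents / 1200
                   * math.sin(2 * math.pi * rate_hz * i / FRAME_RATE + phase))
        for i in range(n_frames)
    ]

def vibrato_rate_and_extent(freqs):
    """Return (rate in Hz, total extent in cents) for a pitch contour in Hz."""
    logs = [math.log2(f) for f in freqs]
    center = sum(logs) / len(logs)
    cents = [1200 * (v - center) for v in logs]

    # Times at which the contour crosses the center pitch going upward,
    # refined by linear interpolation between neighboring frames.
    crossings = []
    for i in range(1, len(cents)):
        a, b = cents[i - 1], cents[i]
        if a < 0 <= b:
            crossings.append((i - 1 + (-a) / (b - a)) / FRAME_RATE)
    if len(crossings) < 2:
        raise ValueError("contour too short to estimate a vibrato rate")

    rate = (len(crossings) - 1) / (crossings[-1] - crossings[0])
    extent = max(cents) - min(cents)
    return rate, extent

contour = synthetic_contour(587.33, 6.0, 50.0)
rate, extent = vibrato_rate_and_extent(contour)
print(f"rate = {rate:.2f} Hz, total extent = {extent:.0f} cents")
```

Because the contour is sampled at discrete frames, the measured extent can fall slightly short of the true peak-to-peak value; a real pitch track would also need smoothing before crossings are counted.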
Why then do Melba, Fleming, and Price’s samples sound so different? At least part of the answer has to do with the quality of the recordings. The signal-to-noise ratio of Melba’s recording (at least in this specific transfer found on YouTube) is less favorable than the other three samples. Also, no musical information is present above approximately 3,680Hz (near the pitch B♭7), just hiss. The extra noise, present both above 3,680Hz and throughout the frequency range occupied by Melba’s voice, masks a good deal of her sonic information. As an experiment, we deleted all the sound above 3,680Hz and between the remaining vocal harmonics for all four singers. This audio file reveals much in common between Melba, Price,(4) and Fleming (see figure 3). We then pasted the deleted noise from Melba’s recording into the silent spaces of all four samples. Due to the masking effect of this noise, our perception of the presence of vibrato decreases in a similar manner across the Melba, Price, and Fleming samples (see figure 4), all the more so as the volume of the noise is increased relative to the vocal harmonics (see figure 5). Indeed, these samples suggest that many of the differences in perceived vibrato between Melba, Price, and Fleming relate to the quality of sound capture above 3,680Hz, the masking effect of noise around and between the vocal harmonics, and the timbre of the particular voice,(5) rather than the specific rate or extent of the vibrato. Callas’ wider extent remains conspicuous across all samples.
Figure 3 audio:
Figure 4 audio:
Figure 5 audio:
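It may help to see how little of each voice survives the deletion step described above. The sketch below lists the harmonics of the sampled pitch that lie below the 3,680Hz ceiling, and includes a rough test of whether a given frequency would be kept. It assumes equal temperament with A4 = 440 Hz (so D5 is about 587.33 Hz and the pitch a half step lower about 554.37 Hz); the 100-cent band kept around each harmonic is a hypothetical choice for illustration, not the setting used in the experiment.

```python
import math

CUTOFF_HZ = 3680.0  # ceiling of musical information reported for the Melba transfer

def note_hz(midi_number: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency for a MIDI note number (assumes A4 = 440 Hz)."""
    return a4_hz * 2 ** ((midi_number - 69) / 12)

def harmonics_below(f0: float, cutoff_hz: float = CUTOFF_HZ) -> list:
    """Frequencies of the harmonics of f0 that lie at or below the cutoff."""
    return [k * f0 for k in range(1, int(cutoff_hz // f0) + 1)]

def in_kept_band(freq_hz: float, f0: float,
                 cutoff_hz: float = CUTOFF_HZ,
                 half_width_cents: float = 100.0) -> bool:
    """True if freq_hz is below the cutoff and close to some harmonic of f0.

    half_width_cents is a hypothetical band width, chosen to be wide enough
    to contain a vibrato of about a quarter step either side.
    """
    if freq_hz > cutoff_hz:
        return False
    k = max(1, round(freq_hz / f0))
    deviation = abs(1200 * math.log2(freq_hz / (k * f0)))
    return deviation <= half_width_cents

d5 = note_hz(74)        # pitch sampled from the modern recordings
c_sharp5 = note_hz(73)  # a half step lower, as in the Melba transfer

for name, f0 in (("D5", d5), ("C#5", c_sharp5)):
    kept = harmonics_below(f0)
    print(name, len(kept), "harmonics kept; highest at", round(kept[-1], 1), "Hz")
```

Under these assumptions only the first six harmonics of either pitch fall below the ceiling, so everything from the seventh harmonic upward is absent from the filtered samples.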
I believe this experiment suggests that Nellie Melba’s vibrato (and perhaps that of her contemporaries) may have sounded much more like our modern elite singers’ than we assume. While this limited experiment only compared these four one-second samples, and we cannot reconstruct the part of Melba’s voice lost at the moment of recording, I believe that these results call into question our ability to draw actionable conclusions regarding vocal vibrato from this historical recording. In considering the reliability of measurements taken from transfers of historical recordings, we must acknowledge that the original playback machines allowed the user to control the playback speed, and this would influence the rate (though not the extent) of a singer’s vibrato. However, if what we hear on historical recordings may be markedly, not just subtly, different from what the singer sounded like live (a faster speed would not suddenly replace the lost information), further study is certainly warranted.
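The claim that playback speed changes the rate but not the extent of a vibrato follows from simple scaling: speeding up or slowing down a recording multiplies every frequency by the same factor and compresses or stretches time by that factor, so the ratio between the highest and lowest pitch of each vibrato cycle is untouched. A minimal sketch, using a hypothetical speed factor equal to one half step flat and made-up vibrato values:

```python
import math

def extent_cents(f_low_hz: float, f_high_hz: float) -> float:
    """Total vibrato extent in cents between the lowest and highest pitch."""
    return 1200 * math.log2(f_high_hz / f_low_hz)

def at_playback_speed(rate_hz, f_low_hz, f_high_hz, speed):
    """Scale a vibrato's rate and pitch limits by a playback speed factor."""
    return speed * rate_hz, speed * f_low_hz, speed * f_high_hz

# Hypothetical "live" vibrato: 6 Hz, a quarter step either side of D5.
live_rate = 6.0
live_low = 587.33 * 2 ** (-50 / 1200)
live_high = 587.33 * 2 ** (50 / 1200)

# Hypothetical transfer played one half step flat.
speed = 2 ** (-1 / 12)
heard_rate, heard_low, heard_high = at_playback_speed(
    live_rate, live_low, live_high, speed
)

print(f"rate:   {live_rate:.2f} Hz -> {heard_rate:.2f} Hz")
print(f"extent: {extent_cents(live_low, live_high):.1f} cents -> "
      f"{extent_cents(heard_low, heard_high):.1f} cents")
```

In this illustration the rate drops by roughly 6% while the extent stays at 100 cents, which is why a rate measured from a transfer is only as reliable as the knowledge of the speed at which it was made.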
(1) See Arthur W.J.G. Ord-Hume, et al. “Recorded sound,” Grove Music Online. Oxford Music Online. Oxford University Press. It was not until the early 1920s that engineers were able to use sensitive electrical microphones, rather than acoustic horns, to capture sound.
(2) John Nix, “Shaken, Not Stirred: Practical Ideas for Addressing Vibrato and Nonvibrato Singing in the Studio and Choral Rehearsal,” Journal of Singing 70, 4 (2014): 412.
(3) Berton Coffin, Historical Vocal Pedagogies (Lanham: Scarecrow, 1989), 14.
(4) Note that this moment in Ms. Price’s recording also captures a distinctive orchestral tone.
(5) Each singer sings a slightly different version of “ah,” as evidenced by the slightly inconsistent distribution of harmonic energy from sample to sample.