In music, vibrato consists of cyclic variations in pitch, loudness, or spectral envelope (hereafter, “timbre vibrato”—TV) or combinations of these. Here, stimuli with TV were compared with those having loudness vibrato (LV). In Experiment 1, participants chose from tones with different vibrato depth to match a reference vibrato tone. When matching to tones with the same vibrato type, 70% of the variance was explained by linear matching of depth. Less variance (40%) was explained when matching dissimilar vibrato types. Fluctuations in loudness were perceived as approximately the same depth as fluctuations in spectral envelope (i.e., about 1.3 times deeper than fluctuations in spectral centroid). In Experiment 2, participants matched a reference with test stimuli of varying depths and types. When the depths of the test and reference tones were similar, the same type was usually selected, over the range of vibrato depths. For very disparate depths, matches were made by type only about 50% of the time. The study revealed good, fairly linear sensitivity to vibrato depth regardless of vibrato type, but also some poorly understood findings between physical signal and perception of TV, suggesting that more research is needed in TV perception.
Vibrato is the cyclic variation of acoustical parameters in a pitched sound, with the variation having typically a frequency of several hertz. It is an important decoration in musical expression on most wind and bowed string instruments and the voice (e.g., Geringer, MacLeod, Madsen, & Napoles, 2014; Papich & Rainbow, 1974; Zhang, Bocko, & Beauchamp, 2015). Seashore (1938) wrote, “A good vibrato is a pulsation of pitch, usually accompanied with synchronous pulsations of loudness and timbre, of such extent and rate as to give a pleasing flexibility, tenderness and richness to the tone” (p. 33).
In vibrato, periodic loudness, pitch, and timbre variations often occur together. In musical instruments, for example, nonlinear effects in the bow-string interaction or the reed or lips of a player have the result that changes in the amplitude of a vibration are also expected to produce variations in spectral envelope (e.g., Benade, 1976). Clipping of a sine wave is a simple example (which shares features with a clarinet’s reed when it beats against the mouthpiece, or the effects pedal of an electric guitar).
The co-occurrence of vibrato types creates problems when musicians communicate about vibrato. For example, when one describes vibrato on a violin, it is commonly understood in terms of oscillations in pitch, but it is difficult to have these oscillations without correlated spectral and loudness changes because the spectral response of the instrument is a strong function of fundamental frequency (e.g., Gough, 2005). There has been little research investigating the perceptual relationship between different types of vibrato. Some experiments suggest that pitch fluctuations (hereafter “pitch vibrato”) have the largest influence on the vibrato percept (Horii & Hata, 1988, as cited in Desain, Honing, Aarts, & Timmers, 1999). Others (Hajda, 1999) concluded that removing variations in spectral ratio from a musical tone has bigger consequences than removing either pitch or amplitude variations. As Seashore (1937, Chapter 4) writes, “The vibrato is always heard as of very much smaller extent than it is in the physical tone.” For example, a vibrato with a measured amplitude of one semitone does not seem to the listener to be nearly so wide. Katok (2016, p. 16) stated that the perception of vibrato depends upon recognizing its presence as well as being able to differentiate between the different types of vibrato.
Absent in the literature is the systematic investigation of the nature of different types of vibrato and the extent of their relative perceptual sensitivities. If different types of vibrato could be systematically separated, would they be perceived differently along type and intensity parameters?
The current study compares perceptions of cyclic variations in loudness and spectral amplitude, hereafter called loudness vibrato and timbre vibrato types, respectively. Loudness vibrato is produced here by varying the amplitude of all harmonics in a periodic tone proportionally. Timbre vibrato is produced here by a cyclic change in spectral slope, with the overall amplitudes being then adjusted to keep the loudness constant. Pitch vibrato was not included, because of the need to limit the duration of experimental sessions and because pitch vibrato has been more extensively studied.
We investigate timbre vibrato perception through three questions over three experiments. Two of these experiments are reported here. They investigated the perception of manipulated vibrato test tones in terms of depth of vibrato (sometimes referred to as “extent,” Prame, 1997) and type of vibrato (loudness and timbre)—in comparison to a reference tone—using an experimental paradigm of selecting the best matching, most similar option from a selection of tones to a reference tone. Timbre and related words were never mentioned, and subjects were not advised what “similar” or “match” might mean.
The first experiment manipulated vibrato depth of test tones each with the same vibrato type (e.g., both loudness vibrato and no timbre vibrato). We wanted to see which of the test tones participants would choose as the best match to a reference tone with a different vibrato type (in this example, timbre vibrato), and the level of precision with which the matching could be performed. The second experiment held depth of the test tones constant, and forced participants to choose which vibrato type best matched the reference tone, even if the reference tone had a considerably different vibrato depth from the test tones.
The experiments took about 25 minutes to complete and used a graphical interface on a desktop computer playing synthesized tones through closed headphones. Planning to keep the sessions to this time limit was considered important in maintaining concentration and aiding recruitment. However, it did require limits to the number of iterations in sequential judgments.
Experiment 1. Vibrato Depth Matching
Experiment 1 aimed to quantify sensitivity to the depth of two types of vibrato, loudness and timbre, using an adaptive paradigm in which the stimuli for each iteration were selected on the basis of the participant's previous response.
Tones of constant duration and pitch were synthesized with different depths of loudness vibrato or timbre vibrato (depth being one of two independent variables), each type synthesized so as to minimize vibrato of the other type. Tones were designed as a sum of time-varying sinusoidal components with N = 15 harmonic partials. They were calculated explicitly in python+numpy and generated at a sample rate of 44.1 kHz. Spectral centroid, SC (which can be visualized as the center of mass of the spectrum), is taken as the controlled variable for timbre because of its close association with perceived brightness (Almeida, Schubert, Smith, & Wolfe, 2017).
Examples are shown in Figure 1, and eight of the possible tones used in the experiments are provided as accompanying media. At any time, tones have a linear spectral slope (as measured in decibels/harmonic number). For loudness vibrato, the slope remains constant in time at –0.95 dB per harmonic, with the spectral centroid nearly constant at approximately 1350 Hz. For timbre vibrato, the slope oscillates about this value. The amplitude of each harmonic is multiplied by a function of time f(t) that oscillates around the value 1, with an oscillation amplitude that grows to Vd. The amplitude of this oscillation is zero from 0 to 0.3 seconds, then grows linearly to a maximum at 1.5 seconds, then quickly falls to zero at 1.6 seconds (see Figure 1). The vibrato depth parameter Vd, the second of two independent variables, corresponds to the fractional modulation of loudness or spectral centroid. These definitions are somewhat arbitrary and there is no expectation that the same value of depth is perceptually similar for timbre and loudness vibrato. It is, rather, a way of quantifying these variations so that later we can relate the perceptual depths of the two parameters.
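The modulation function f(t) described above can be sketched as follows. This is a minimal illustration, not the authors' code: the vibrato rate (6 Hz) and the starting phase are assumptions, since neither is stated in this section, and the names `depth_envelope` and `modulation` are hypothetical.

```python
import math

# Envelope breakpoints in seconds, taken from the text.
T_ON, T_PEAK, T_END = 0.3, 1.5, 1.6
# Assumed vibrato rate; the text only says "several hertz".
RATE_HZ = 6.0


def depth_envelope(t, vd):
    """Amplitude of the oscillation at time t (s) for depth parameter vd."""
    if t <= T_ON or t >= T_END:
        return 0.0
    if t <= T_PEAK:
        # Linear growth from 0 at T_ON to vd at T_PEAK.
        return vd * (t - T_ON) / (T_PEAK - T_ON)
    # Quick linear fall from vd at T_PEAK to 0 at T_END.
    return vd * (T_END - t) / (T_END - T_PEAK)


def modulation(t, vd, rate_hz=RATE_HZ):
    """f(t): oscillates around 1 with the amplitude given by depth_envelope."""
    phase = 2.0 * math.pi * rate_hz * (t - T_ON)  # assumed starting phase
    return 1.0 + depth_envelope(t, vd) * math.sin(phase)
```

With this form, f(t) stays within 1 ± Vd at all times and equals 1 exactly before 0.3 s.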
In timbre vibrato, the spectral slope varies with time in such a way that the spectral centroid follows the same function of time f(t). The amplitude of all the harmonics is then multiplied by a single factor such that the loudness of the resulting sound, according to the Moore and Glasberg (MG) model in Psysound 3 (Cabrera, 1999), is constant over time. (Because this model is based on averaging over varying population response, the timbre vibrato tones may have had a loudness vibrato component for some participants.)
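One way to obtain a spectral slope that produces a prescribed spectral centroid is a simple root search, sketched below. The centroid definition used here (amplitude-weighted mean frequency) is an assumption; the source does not give its weighting, so absolute centroid values from this sketch need not agree with those reported. The loudness normalization step is omitted because it depends on the Moore and Glasberg model, which is not reproduced here. All function names are hypothetical.

```python
F0 = 500.0         # fundamental frequency in Hz, from the text
N_HARMONICS = 15   # number of partials, from the text


def harmonic_amplitudes(slope_db):
    """Linear amplitudes for a spectrum changing by slope_db dB per harmonic."""
    return [10.0 ** (slope_db * k / 20.0) for k in range(1, N_HARMONICS + 1)]


def spectral_centroid(slope_db):
    """Amplitude-weighted mean frequency in Hz (assumed definition)."""
    amps = harmonic_amplitudes(slope_db)
    freqs = [F0 * k for k in range(1, N_HARMONICS + 1)]
    return sum(f * a for f, a in zip(freqs, amps)) / sum(amps)


def slope_for_centroid(target_hz, lo=-20.0, hi=20.0, tol=1e-9):
    """Bisection search; the centroid rises monotonically with the slope."""
    if not spectral_centroid(lo) <= target_hz <= spectral_centroid(hi):
        raise ValueError("target centroid outside the searchable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spectral_centroid(mid) < target_hz:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def timbre_vibrato_slope(base_slope_db, f_t):
    """Slope whose centroid equals the base centroid multiplied by f_t."""
    return slope_for_centroid(spectral_centroid(base_slope_db) * f_t)
```

Evaluating `timbre_vibrato_slope(-0.95, f_t)` at each time step, with f_t taken from the modulation function, would give the time-varying slope; the common gain factor that holds loudness constant would then be applied separately.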
The range of Vd was limited to 0 to 1 in both cases. In an informal, preliminary experiment, a small number of participants attested that these ranges had comparable perceptual magnitudes.
All the stimuli had a duration of 1.6 seconds with a fundamental frequency of 500 Hz. None of the vibrato tones included fluctuation of the pitch: the frequency of all harmonics remained stable throughout the tone. They had linear starting and finishing transients with durations of 50 and 20 ms, respectively. The variations in loudness and spectral centroid as functions of vibrato depth are shown in Figure 2.
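The additive synthesis described in this section can be sketched in plain Python as below. It is an illustration under stated assumptions rather than the authors' implementation (which used numpy): the peak normalization and the function names are hypothetical, and applying the modulation directly to the harmonic amplitudes is one reading of the loudness vibrato definition.

```python
import math

SAMPLE_RATE = 44100            # Hz, from the text
F0 = 500.0                     # Hz, from the text
N_HARMONICS = 15               # from the text
SLOPE_DB = -0.95               # dB per harmonic, from the text
DURATION = 1.6                 # seconds, from the text
ATTACK, RELEASE = 0.050, 0.020  # linear transients in seconds, from the text


def transient_gain(t):
    """Linear starting and finishing ramps."""
    if t < ATTACK:
        return t / ATTACK
    if t > DURATION - RELEASE:
        return max(0.0, (DURATION - t) / RELEASE)
    return 1.0


def synthesize(modulation=lambda t: 1.0, sample_rate=SAMPLE_RATE):
    """Sum of 15 harmonics; modulation(t) scales every harmonic equally."""
    amps = [10.0 ** (SLOPE_DB * k / 20.0) for k in range(1, N_HARMONICS + 1)]
    norm = sum(amps)  # assumed normalization: bounds the unmodulated signal by 1
    n_samples = int(round(DURATION * sample_rate))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        partials = sum(a * math.sin(2.0 * math.pi * F0 * k * t)
                       for k, a in enumerate(amps, start=1))
        samples.append(transient_gain(t) * modulation(t) * partials / norm)
    return samples
```

Passing a function that oscillates around 1 as `modulation` gives a loudness vibrato tone in this sketch; the default produces a steady tone with the same transients.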
A selection of tones used in the experiment can be found in the additional media for this article and at http://newt.phys.unsw.edu.au/jw/vib1.html.
The experiment was performed on a computer graphical interface that presented a reference tone followed by three test tones, as illustrated in Figure 3. The participant’s primary task was to select one of the three test tones that best matched (sounds “closest” to) the reference tone. The depth matching experiment was conceived as an iterative, adaptive process (Leek, 2001). Participants were presented with a reference tone (Figure 3), either of timbre or loudness vibrato type, together with three test tones, all with the other vibrato type but each having a different depth. The reference depth parameter was designed to be a value chosen at random between 0 and 1 with uniform probability, although for a few subjects, when comparing vibrato of same type, the distribution was biased towards higher depths. One of the depths of the test tones (dt) is chosen as the closest to the reference from the previous iteration. The other two tones have a depth dt/s and dt*s, where s is a parameter starting at 2, and which was reduced at each iteration by a factor of 1.3, if the participant reported a high confidence (“Certain,” “Fairly confident,” or “Slightly confident”) but not reduced for low (“Not confident” or “Guessing”). This process was iterated up to five times for any given reference tone, then a new iterative process started with a new reference tone and, again, the most dissimilar test tones in terms of depth. Due to an error in the program, the last interval range between test tones could, for a consistently confident participant, become larger in the fifth than the fourth iteration. The data from such diverging tones made little difference to the results, but the participants corresponding to the faulty cases were removed from the experiment. For each iteration, the chosen tone and the chosen depth were recorded along with the degree of confidence.
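The adaptive narrowing of the test depths can be sketched as follows. This is one literal reading of the description (s divided by 1.3 after each confident response), not the program used in the study; `candidate_depths`, `next_spread`, `run_trial`, and the `choose` callback are hypothetical names, and the starting depth of the middle tone is left as a parameter because it is not specified here.

```python
CONFIDENT = {"Certain", "Fairly confident", "Slightly confident"}


def candidate_depths(center, s):
    """Three test depths: dt/s, dt and dt*s."""
    return (center / s, center, center * s)


def next_spread(s, confidence, factor=1.3):
    """Reduce s after a confident response; keep it otherwise."""
    return s / factor if confidence in CONFIDENT else s


def run_trial(start_depth, choose, n_iterations=5, s0=2.0):
    """Iterate the matching task for one reference tone.

    choose(options) stands in for the participant: it returns the index
    (0, 1 or 2) of the selected tone and a confidence label.
    """
    center, s, history = start_depth, s0, []
    for _ in range(n_iterations):
        options = candidate_depths(center, s)
        index, confidence = choose(options)
        center = options[index]  # the chosen tone becomes the next middle tone
        history.append({"s": s, "options": options,
                        "chosen": center, "confidence": confidence})
        s = next_spread(s, confidence)
    return history
```

Under this reading, s falls below 1 after three confident reductions (2/1.3³ ≈ 0.91), at which point the outer tones swap order and then move apart again on the next reduction. That behaviour would be consistent with the program error described above, although the source does not state that this was its cause.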
The tones were presented with a binaural, closed, around-the-ear headset (Sennheiser HD280 pro), at levels of 46–49 dB(A). The sound level inside a headphone volume with no signal from the computer was 39 dB(A), measured with a Brüel & Kjær model 2250-S sound level meter. Tests for harmonic distortion in the computer-headphone reproduction system using a pure sine wave from the computer found that any harmonic components due to distortion were at least 40 dB below the fundamental.
One hundred and seventy-eight students (107 female, 69 male) enrolled in a Music Psychology course participated in the study, approved by the Human Ethics Committee of UNSW Sydney. Forty-nine were music students, 84 others had music experience, and 43 had little music experience. One hundred and seventy-one of the participants were aged between 18 and 23; seven were older than 23. They earned course credit in return for completing the experiments. The topics timbre and brightness had not been discussed in their course at the time of the experiment. The experiment session included the perceptual test on the scaling of brightness (Almeida et al., 2017). All participants completed the session consisting of the sequence of experiments:
Adjust loudness of stationary test tone to match reference tone with different spectrum. (5 trials, used in Almeida et al., 2017)
Vibrato type similarity experiment (select vibrato tone closest to reference: 5 trials) EXP 2
Adjust loudness of stationary test tone to twice the reference tone (5 trials, used in Almeida et al., 2017)
Vibrato depth matching experiment (select 1 of 3 vibrato tones closest to reference: 2 trials, 5 iterations each) EXP 1
Adjust brightness of stationary tone to twice the test tone (Almeida et al., 2017, 5 trials)
Vibrato depth matching experiment (2 trials, 5 iterations each) EXP 1
Adjust brightness of stationary tone to half the test tone (Almeida et al., 2017, 5 trials)
Vibrato depth matching experiment (2 trials, 5 iterations each) EXP 1
Free comments. (not reported here)
In this experiment, each participant was led iteratively to judge the depth of one vibrato type that best matched a given depth of the same or of the other vibrato type reference tone. In Figures 4 and 5, both axes show the vibrato depth parameter, which approximately equals the proportional change in the loudness or, for timbre vibrato, the proportional change in spectral centroid. Figure 4 shows the results when the test and reference vibratos were of the same type. Consider any one of the dots on the graph. Here a participant heard a particular reference tone with a loudness depth vibrato shown on the x-axis for that dot, and through the adaptive procedure, finally selected a tone with a loudness depth value indicated on the y-axis as the best match. Data on the plot of chosen vs. reference depth might therefore be expected to be distributed close to the line y=x. (Note, however, that floor and ceiling effects would tend to reduce the slope of the experimental line: subjects cannot strongly overestimate high values nor strongly underestimate low ones.) The scatter of the data is an indication of the limited precision or consistency of judgments. For loudness vibratos the slope of the line is close to 1, M = 0.88, SD = 0.059, t(161) = -2.04, p = .043, whereas for timbre vibratos it is significantly different from 1, M = 0.82, SD = 0.045, t(128) = -3.90, p < .001.
Figure 5 shows the results when participants matched vibratos of different types. When the timbre vibrato depth parameter is selected to match a given loudness vibrato tone, the regression has a slope close to 1, shown as the solid black line in Figure 5, indicating that it is close to the line y = x. A one-sample t-test against the null hypothesis value of 1 (the slope of y = x) shows no significant difference, M = 1.00, SD = 0.061, t(324) = 0.06, p = .95, and a further t-test demonstrates that the line has an offset significantly different from 0, M = 0.079, SD = 0.017, t(324) = 4.63, p < .001. When the loudness vibrato depth parameter is selected to match a given timbre vibrato tone, the result departs further from y = x: on average, the depth parameter for a loudness vibrato is 0.53 times the depth parameter of the timbre vibrato that is judged equal to it in depth.
Combining the two observations using the geometric mean of the slopes of the two regressions: for a similar proportional change in loudness or spectral centroid, the perceived depth is 1.34 times larger in a loudness vibrato than in a timbre vibrato (average R2 = .44).
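The combination of the two regression slopes can be sketched numerically. The sketch uses only the rounded slopes quoted in the text (1.00 and 0.53) and assumes that each slope is first expressed as a timbre-to-loudness depth ratio before the geometric mean is taken; the variable names are hypothetical.

```python
import math

# Rounded regression slopes quoted in the text.
slope_timbre_given_loudness = 1.00   # timbre depth chosen per unit loudness depth
slope_loudness_given_timbre = 0.53   # loudness depth chosen per unit timbre depth

# Express both as "timbre depth per equivalent loudness depth".
ratio_a = slope_timbre_given_loudness
ratio_b = 1.0 / slope_loudness_given_timbre

# Geometric mean of the two estimates.
equivalent_ratio = math.sqrt(ratio_a * ratio_b)
print(round(equivalent_ratio, 2))
```

With these rounded inputs the result is about 1.37, close to the reported 1.34; the small gap may reflect rounding of the quoted slopes or details of the fit that are not given here.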
The depth matching experiment provides some quantitative information about the perception of the two different types of vibrato. First, it is clear from Figures 4 and 5 that listeners have a notion of vibrato depth: in both loudness and timbre vibrato types, higher depths are consistently matched to higher depths, with 44% of the variance in response explained. When matching vibratos of the same type, on average listeners match the target tone to 85% of the depth of the reference tone.
When matching the depth of vibratos of different type, a similar result applies: listeners are able to associate a depth level of one type of vibrato to that of the other type. The depth parameter of a timbre vibrato (measured as relative change in the spectral centroid) has to be about 1.34 times higher than that of a loudness vibrato (measured as relative change in loudness) for the two to have similar perceptual depths.
In a previous study of steady tones (Almeida et al., 2017), we found that the brightness of a tone is doubled if the spectral centroid is increased by a factor of 1.6 for tones such as those used here. So, the proportional change in spectral centroid underestimates the proportional change in brightness by a factor of about 2.0/1.6 = 1.25. Consequently, extrapolating the salience of brightness from steady tones to vibrato, we could say that, for a judgment of similar extent of vibrato depth, the brightness variation is approximately the same as the loudness variation. Arithmetically, the brightness variation need be 1.34/(2.0/1.6) ∼1.07 times larger than the loudness variation, but the 7% difference is not greater than the measurement scatter.
Regression lines in Figure 5 give an average intercept of .072. This may be attributed to the interaction of floor (and ceiling) effects with statistical variations. Below a certain vibrato threshold value, a sound will be perceived as not fluctuating. Participants would choose any sound below the threshold value with equal probability, so that the average response is expected to be around half the threshold value. The threshold was estimated to be 0.15. This could explain why the slope of these linear fits is unexpectedly less than 1.
The error in matching depth for vibratos of the same type scales roughly with the depth of the reference tone (with the exception of timbre vibratos matched against their own type). This can be seen by plotting the residuals of the regression (Figure 6).
Students participating in this study had a range of musical experience. They ranked themselves into one of five categories: no musical experience (34 participants), music student (32), amateur musician (74), regular musician (17), professional musician (8). In order to test for the effect of music experience, we reduced the music experience categories to two: nonmusicians (no musical experience) and experienced musicians (regular and professional combined). We ran t-tests comparing the dependent variables slope and intercept of the linear fit for each of the fits, with the independent variable music experience; there was no significant difference at the p = .01 level.
Experiment 2. Similarity of Vibrato Types
The aim of Experiment 2 was to compare the perception of test tones of different types, but similar depth. Would the vibrato having the same type as the reference tone be considered more similar, regardless of the difference in depth between the reference versus the two test tones? Or would depth be an important factor?
Participants were the same as in Experiment 1.
Stimuli used were the same as in Experiment 1.
Participants were asked to choose which of two tones having vibrato of different types but similar depth matched a tone whose depth was not necessarily similar. The reference tone (Tone 1) was randomly chosen to be either a loudness or timbre vibrato type with a depth parameter of 0.2, 0.3, 0.4 or 0.5 (also randomly chosen). (The depth values have the same meaning as in the depth matching task in Experiment 1 for both types of vibrato, and the corresponding sounds are the same.) A depth parameter of 0.5 in the loudness vibrato corresponds to a maximum loudness variation of 3.7 sones in a sound with loudness 7.5 sones. In this 500 Hz signal, a timbre vibrato depth parameter of 0.5 corresponds to a 46% change in spectral centroid: a change with amplitude 650 Hz in the average spectral centroid of 1400 Hz (see Figure 2). The test tones (Tones 2 and 3) had a randomly attributed depth parameter, the same in both tones. One was a loudness and the other was a timbre vibrato, in a random order. The participants were asked “Which sample sounds closer to tone 1?” (Figure 7).
In each trial, participants were given two test tones with the same depth parameter and different types, and asked to select which was the better match with a reference tone. The vibrato type and depth of the reference tone were selected at random from 8 tones, each type having depth parameter 0.2, 0.3, 0.4, or 0.5.
Results are shown in Figure 8. When vibrato depths are perceived as comparable, high proportions of participants selected the test stimulus that has the same vibrato type as the reference tone. (As shown in Experiment 1, similar perceived depth means similar depth of variation in brightness and loudness, but not equal depth of variation in spectral centroid and loudness.) Figure 8 plots the fraction of best match selections where test and reference vibrato type are the same (both as a number and a shading); this fraction is plotted as a function of the vibrato depth parameters for vibrato of the reference (x) and test (y) tone. It is worth noting that this is an implicit test, because the task is to select the test tone most similar to the reference, without specifying what “similar” means. It is possible that participants sometimes group similar perceived depth as being similar, and other times similar vibrato type (loudness or timbre) as being similar.
Overall, participants matched the test sound to a vibrato of the same type in 77% of cases. For four out of the 32 cases, participants more often chose the opposite vibrato type as more similar to the reference tone. These four cases were mostly those having the greatest difference in depths between test and reference. (In these four cases, 15 out of the 22 not matching by type were nonmusicians (68%), whereas only 5 (36%) out of 14 matching by type were nonmusicians.) A chi-square test comparing the criterion of best match selections for the different groups showed that the difference was not significant, χ2(1, N = 36) = 2.46, p = .11.
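The chi-square statistic can be checked with a short calculation. The counts used below (15 nonmusicians among 22 participants not matching by type, 5 among 14 matching by type) are inferred from the quoted percentages (68% and 36%) and N = 36; they are an inference, not figures taken from the data. The use of Yates' continuity correction is also an assumption, adopted because it reproduces the reported value. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(0.0, diff - 0.5)  # continuity correction
            stat += diff * diff / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: not matched by type, matched by type.
# Columns: nonmusicians, others.
stat, p = chi_square_2x2(15, 7, 5, 9)
print(round(stat, 2), round(p, 2))
```

With these counts the corrected statistic comes out at about 2.46 with p near .12, in line with the reported result; without the correction the statistic would be noticeably larger.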
The key to understanding these possibly surprising cases seems to be that these four are the only cases out of 32 where a shallow timbre vibrato is available to be paired with a deep loudness vibrato: either when choosing a shallow timbre vibrato to match to a deep loudness vibrato reference (bottom right of Figure 8a) or when choosing a deep loudness vibrato to match to a shallow timbre vibrato reference (top left of Figure 8b). Conversely, however, when choosing a deep timbre vibrato to match to a shallow loudness vibrato reference (top left of Figure 8a) or when choosing a shallow loudness vibrato to match to a deep timbre vibrato reference (bottom right of Figure 8b), participants match tones according to type in a quite high proportion of cases.
As Figure 5 shows, the perceived strengths of a given timbre vibrato depth parameter and of the same loudness vibrato depth parameter are similar, but not identical. For that reason, the regression lines fitted to the data of Figure 5 have been overlaid on Figure 8. The overlays show the average depth of the vibrato type that, as a single test tone, matches a dissimilar reference type. It can be observed that, along these lines, participants matched to the same type of vibrato more often than in many of the other cases (along these lines the shading is darker than in squares further from the lines). One interpretation is thus that when two dissimilar vibrato types are perceived as having similar depth it is easier to distinguish their type, and that in this case the timbre-loudness distinction becomes the more important component of dissimilarity.
Further from the lines of equal perceived depth, the rates of matching according to vibrato type drop, and at the furthest cells that were tested, participants roughly chose equally often the tone that matched one type or the other tone. This can be seen in Figure 9, which plots the distribution of best match selection as a function of the minimum perceived depth difference between test and reference.
Distribution of Best Match Selection as a Function of Perceived Depth Difference
Using the results from Experiment 1, the perceived depth of each of the test and reference tones can be estimated, giving a distance in perceived depth between each of the test tones and the reference tone. Figure 9 groups best match selection in Experiment 2 as a function of the difference between the perceived depth of the reference tone and that, in a given trial, of the test tone that is closer in depth to the reference. Best match selections can be classified in two groups of two cases each:
If the test tone that was closer in perceived depth to the reference tone also had the same type as the reference tone, the participant could have selected either that tone or the one more distant in perceived depth, so that the possible best match selection in this case would be classified as:
1a) Selected on both criteria, because selecting the closest tone means that the subject selected the tone with the closest perceived vibrato but also with the same type of vibrato
1b) Selected on neither criterion, because the user selected the test tone that was further in perceived depth distance and that was also the one having a different type from the reference (in this group there may be tones that are almost similar in depth, so users might be choosing at random)
If the closer tone was a different type of vibrato from the reference (then the other test tone was of the same type as the reference), the best match selection would be classified as:
2a) Selected on vibrato type, if the user selected the tone with the same type, even though its perceived depth was further from the reference
2b) Selected on vibrato depth, if the user selected the tone that was closer in perceived depth to the reference in spite of being of a different type from the reference
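The four cases above can be expressed as a small classification routine. This is a sketch only: the function name and argument layout are hypothetical, perceived depths are assumed to have been estimated beforehand from the Experiment 1 regressions, and ties in perceived depth distance (not discussed in the text) are resolved arbitrarily in favour of the first entry.

```python
def classify_selection(reference_type, reference_depth, test_depths, chosen_type):
    """Return '1a', '1b', '2a' or '2b' for one trial.

    reference_type  : 'loudness' or 'timbre'
    reference_depth : perceived depth of the reference tone
    test_depths     : dict mapping each test tone's type to its perceived depth
    chosen_type     : type of the test tone the participant selected
    """
    distances = {kind: abs(depth - reference_depth)
                 for kind, depth in test_depths.items()}
    closer_type = min(distances, key=distances.get)
    chose_same_type = chosen_type == reference_type

    if closer_type == reference_type:
        # Group 1: the closer tone in depth also has the reference's type.
        return "1a" if chose_same_type else "1b"
    # Group 2: the closer tone in depth has the other type.
    return "2a" if chose_same_type else "2b"
```

For example, with a loudness reference of perceived depth 0.5 and test tones at perceived depths 0.1 (loudness) and 0.45 (timbre), choosing the loudness tone is a type-based selection (2a) and choosing the timbre tone is a depth-based selection (2b).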
Note that cases 1a, 1b, 2a, or 2b cover all possibilities. Also, the largest values of perceived depth differences (of the closer tone in depth) are only achieved for a small number of trials in the lower right corner of Figure 8a (because loudness vibrato of a defined depth parameter has a larger perceived depth than the same depth parameter in timbre vibrato). In this corner, choosing the closer vibrato in depth also means choosing the same type of vibrato, so for large perceived depth differences there are only trials in Group 1. Also, for these trials the test tones have very shallow vibrato whereas the reference has a deep vibrato, so it may be that subjects are often choosing at random between the two test tones.
On the left hand side of Figure 9 (small depth differences), there are large numbers of trials in both groups of cases (1 and 2) (shown in the top panel of Figure 9). Here one sees that, when both selection criteria are equivalent (1a and 1b) about 80% of subjects chose the tone that is closer to and the same type as the reference. This is as naïvely expected. Nevertheless, a substantial minority falls into the 1b case. One possible explanation is that the two test tones are not easily distinguished for at least some subjects. Case 1b is discussed further below.
Cases 2a and 2b (criteria not equivalent) show an interesting result. Here, selecting according to same type (2a) gives a different result compared to selecting according to closest depth. For the group of cases 2, a majority of subjects selected matching types of vibrato, even though that meant selecting the vibrato with a larger depth difference from the reference. Thus, for small differences in the predicted perceived depth (left side of Figure 9), the type of vibrato is a more important criterion for associating two vibrato tones than depth. (Unfortunately, the experiment was not designed to sample this space [depth distance to reference and depth distance between test tones] evenly, so that it is hard to extend this conclusion to larger depth distances. A new experiment could be designed so as to have a larger number of tones with large depth distances falling into cases 2a and 2b.)
In some trials, particularly in the top left of Figure 8b, a majority of participants matched tones that were neither of the same type nor of similar depth (Case 1b). A possible explanation is that weak timbre vibrato seems like generic vibrato, whereas in deep timbre vibrato, the effect of spectral variation is salient. Asked to match weak timbre vibrato to either deep timbre vibrato or deep loudness vibrato, most listeners matched the weak (hypothetically sub-threshold) vibrato to the strong vibrato that does not include a spectral effect, i.e., they chose the loudness vibrato. To the listener, the (weak, non-salient) timbre vibrato and the (strong) loudness vibrato seem to be the same vibrato type: neither reveals the timbre effect. (Readers may refer to the sound tone samples used in the test, available in the supplementary materials, also at http://newt.phys.unsw.edu.au/jw/vib1.html). This is one of the reasons for the vanishing fraction of type-matched responses on the right hand of Figure 9.
It should also be stressed that the overlays in Figure 8 show average behavior: for individuals, the perceptual weightings differ (as the results of Experiment 1 show), and this could explain some of the spread in Figure 8. A surprising result is observed in the eight examples lying exactly on the y=x diagonal. In each of these cases, one of the test tones was identical to the reference. A substantial majority (84%) of participants picked the identical tones as the most similar pair, but not all. This suggests that, for the remaining 16% of participants (of whom 10 out of 14 are nonmusicians), two vibratos of similar depth but different type are chosen as a match.
General Discussion and Conclusions
Participants are reasonably sensitive to vibrato depth and able to match the depths of two tones with the same vibrato type: a linear relation explains 62% (when matching loudness vibrato) or 72% (when matching timbre) of the variance. A linear relation only explains 41% of variance when matching the vibrato depth across types. The linear relation’s positive intercept suggests that participants cannot detect fluctuations of less than about 15% either in loudness or spectral centroid. When matching across vibrato types, the linear relationship suggests that proportional fluctuations in loudness are perceived as being about 1.3 times deeper than proportional fluctuations in spectral centroid and thus (by extrapolating from previous measurements on steady tones), about the same depth as proportional changes in brightness.
Recognition of vibrato similarity is complicated: when participants perceived their depths to be similar, about 80% associate tones having similar vibrato type. However, when two vibrato tones have very different depths, only about 50% of participants associate the tones by type, a sign that they might be answering randomly in these cases. When timbre vibrato depth was sufficiently small, participants associated it more often with a deep loudness vibrato than a deep timbre vibrato, suggesting that, for many people, spectral effects are not perceptually salient when the variation in spectral centroid is about 30% of the mean value (350 Hz, in the current examples) or less.
So, despite fairly linear sensitivity to vibrato depth regardless of vibrato type, some poorly understood relationships between the physical signal and the perception of timbre vibrato remain. More research is needed on timbre vibrato perception, which, in contrast to pitch and loudness vibrato, has been considerably neglected.
We thank the Australian Research Council for supporting this project, and Professor John R. Smith for helpful discussions.