This study reports on an experiment that tested whether drummers systematically manipulated not only onset but also duration and/or intensity of strokes in order to achieve different timing styles. Twenty-two professional drummers performed two patterns (a simple “back-beat” and a complex variation) on a drum kit (hi-hat, snare, kick) in three different timing styles (laid-back, pushed, on-beat), in tandem with two timing references (metronome and instrumental backing track). As expected, onset location corresponded to the instructed timing styles for all instruments. The instrumental reference led to more pronounced timing profiles than the metronome (pushed strokes earlier, laid-back strokes later). Also, overall the metronome reference led to earlier mean onsets than the instrumental reference, possibly related to the “negative mean asynchrony” phenomenon. Regarding sound, results revealed systematic differences across participants in the duration (snare) and intensity (snare and hi-hat) of strokes played using the different timing styles. Pattern also had an impact: drummers generally played the rhythmically more complex pattern 2 louder than the simpler pattern 1 (snare and kick). Overall, our results lend further evidence to the hypothesis that both temporal and sound-related features contribute to the indication of the timing of a rhythmic event in groove-based performance.
In groove-based music, it is assumed that musicians can achieve different timing “feels” or “styles” in performance by subtly altering the temporal location of the onset of events at the “microrhythmic” metrical level (on the order of about 10–40 milliseconds) by either playing slightly early (“pushed”) or late (“laid-back”) in relation to other players’ rhythm, a metronomic beat reference, or simply their own internal pulse (Butterfield, 2006, 2011; Câmara, 2016; Câmara & Danielsen, 2018; Danielsen, 2006, 2010, 2018; Iyer, 2002; Keil 1987, 1995; Kilchenmann & Senn, 2011, 2015; Senn, Bullerjahn, Kilchenmann, & von Georgi, 2017). Recently, the role of other sound parameters, such as loudness, duration, or timbre, and their interaction with timing have been found to be equally fundamental to microrhythmic expressivity in groove-based music (Câmara, Nymoen, Lartillot, & Danielsen, 2020; Danielsen, Waadeland, Sundt, & Witek, 2015). The present study intends to further test the hypothesis that musicians systematically manipulate acoustic sound features other than onset in order to produce different timing feels (laid-back, on-beat, and pushed) in a musical context.
In musicology and ethnomusicology, scholars have typically used the term “groove” to denote either the individual patterns that comprise a given style (Kernfield, 2003), the overall “rhythm matrix” comprised by all the instruments within a performance (“the groove” in a tune) (Iyer, 2002; Monson, 1996), or an aesthetic quality or “feel” stemming from the various rhythmic relations between or within the instruments of an ensemble in performance, either as a result of microtiming expression (Keil, 1987) or the interaction between microtiming and macrostructural features (Butterfield, 2006, 2011; Câmara, 2016; Danielsen, 2006, 2010). Relatedly, the adjective “groove-based” has tended to denote music from genres derived from African American performance traditions that share a range of common rhythmic features (Câmara & Danielsen, 2018; Pressing, 2002). Most recently, however, “groove” has been operationalized by music psychologists as the aspect of any music, regardless of cultural origin, that elicits various experiential phenomena such as the “urge to move” (Madison, 2006) and “pleasure” (Janata, Tomic, & Haberman, 2012) in particular. In this article, we instead subscribe to the traditional musicological meanings of groove as broadly denotative of the rhythmic structural aspects of musics historically designated as “groove-based” and thus make no assumption regarding the degree to which onset asynchronies elicit higher or lower ratings of any operationalized groove feature.1
Regarding timing in performance, researchers generally assume that expert musicians are able to control the onset of events to a precise degree at the microrhythmic level. One commonly investigated instance of microtiming expression that is thought to contribute to the qualitative feel of grooves is “swing,” or the use of asymmetric long-short duration patterns in consecutive pairs of on- and off-beat notes, usually at the eighth-note or sixteenth-note metrical subdivision level. Different degrees of duration ratios in these long-short patterns (i.e., “swing ratios”) have been theorized to convey various qualities of “motional energy,” or “the force of momentum with which some musical events are directed toward others” (Butterfield, 2011, p. 4), which, in combination with manipulations of intensity and articulation can range from “relaxed” and “continuative” to “forward driving” and “choppy” (see also Butterfield, 2006). Investigations of commercial and field recordings have found that rhythm sections and solo instrumentalists apply varying degrees of swing ratios to either eighth notes in jazz (Benadon, 2006; Butterfield, 2011; Friberg & Sundstrom, 2002; Rose, 1989) or sixteenth notes in jazz-funk, funk, hip-hop (Butterfield, 2006; Câmara, 2016; Danielsen, 2006; Frane, 2017), samba (Gerischer, 2006; Haugen & Godøy, 2014), and djembe music (Polak, 2010). Performance experiments have also shown that drummers and percussionists alter the amount of swing depending on genre, tempo, and/or individual player preference (Haas, 2007; Honing & Haas, 2008; Haugen & Danielsen, 2020).
Another form of microtiming expression thought to contribute to overall timing feel in groove-based music involves the rhythmic interaction between and within the various instruments in an ensemble. It is considered a hallmark of technical proficiency in African American–derived groove performance practice to be able to play flexibly around a timing reference (colloquially referred to as the “beat”) in a controlled fashion while maintaining a steady tempo—either behind the beat (“laid-back”), on the beat, or ahead of the beat (“pushed”) (Berliner, 1994; Câmara et al., 2020; Danielsen et al., 2015; Keil 1987, 1995; Kilchenmann & Senn 2011, 2015; Monson, 1996). Such timing strategies may be referred to as groove-timing “feels” or “styles.” In drumming practice, in particular, while a certain timing feel may be more commonly associated with a given rhythmic pattern or genre—such as the presence of a slight delay (laid-back timing) in the snare strokes of “back-beat” patterns (see Frane, 2017; Iyer, 2002)—particular drummers also tend to develop highly individualized strategies (Dahl, 2011; Waadeland, 2006), and any given pattern may be played with different timing feels depending on personal preference or aesthetic contextual considerations (Butterfield, 2006).
Only a handful of studies have directly investigated the relation between purported groove feels in performance and microtiming profiles via controlled laboratory experiments. Kilchenmann and Senn (2011) instructed two drummers to play the same jazz-rock rhythmic pattern with a laid-back, on-beat, and pushed timing feel along with a metronome and found that both drummers displayed distinctive onset microtiming patterns for each feel. Individual characteristics were also found—one drummer anticipated all the timing conditions in relation to the metronome (described in performance parlance as an “ahead” or “pushy” player), while the other drummer played the pushed and on-the-beat feels ahead of but the laid-back feel behind the metronome (conversely, then, a more “laid-back” player). In a performance study with ten drummers of a standard “back-beat” rock pattern at three different tempi (64, 96, and 148 beats per minute), Danielsen and colleagues (2015) also investigated the degree to which drummers systematically manipulated onset microtiming profiles of the snare drum in order to distinguish different timing feels. They further hypothesized that drummers would systematically manipulate additional sound features, such as intensity and timbre (spectral centroid), in order to produce these feels. In terms of onset timing, they found that regardless of whether drummers displayed either “pushy” or “laid-back” tendencies in relation to the metronome, all were able to produce onsets in the laid-back and pushed conditions significantly behind and ahead of their own average on-beat timing. In other words, all were able to clearly distinguish between the different timing feels in terms of microtiming onset profiles.
Regarding the hypothesized relationship between the sound characteristics of the drum strokes and their onset timing, Danielsen and colleagues (2015) found that at the medium tempo (96 bpm), drummers showed a tendency to play their strokes with greater intensity (louder) in the laid-back condition relative to the on-beat condition. At the individual participant level, a majority of the drummers (seven out of ten) was additionally found to play these behind-the-beat strokes with a lower spectral centroid (darker). Regarding the effect of tempo, differences between timing style conditions were greatly diminished at the faster tempo, likely due to motoric limitations. However, the drummers played louder across all timing styles in the fast and medium tempo categories relative to the slow tempo category. In a similar instructed-timing style experiment with electric guitarists and bassists, Câmara and colleagues (2020) found that, in addition to manipulating onset location, the guitarists lengthened the durations of their strokes and used a slightly darker timbre when playing in a laid-back fashion, and the bassists played strokes with greater intensity when playing in a pushed fashion. Overall, findings from the latter two studies show that, in order to produce sound events as microrhythmically early or late, musicians systematically manipulated not only the onset timing of strokes but also other parameters of sound as well.
Findings from psychoacoustic studies with pure tones or clicks have suggested that thresholds for asynchrony detection (whether two sounds are heard as synchronous) can be as low as 2 ms (Hirsh, 1959; Zera & Green, 1993), and that the threshold for temporal order (correctly identifying which tone comes first/second) is generally higher, at around 20 ms (Hirsh, 1959). However, in real music, instrumental sounds tend to be complex and have overlapping spectra and/or unequal duration or loudness, all of which can lead to masking effects of various degrees that likely increase these thresholds. Butterfield (2010) tested the extent to which listeners could correctly identify the temporal order of bass and drum sounds in swing jazz excerpts with asynchrony manipulations of up to 30 ms, and found that most were not able to do so above a chance level of 50 percent. The tasks did not, however, assess whether participants were able to detect simply the presence of asynchrony between instruments. Goebl and Parncutt (2002), on the other hand, investigated the effect of intensity on onset asynchrony detection in piano tone dyads of different pitches (high/low) with equal offset locations and found that, regardless of pitch, when the tones were presented with 27 ms onset asynchrony, if the earlier tones were presented as louder (“early + loud” condition or, in classical piano parlance, “melody lead”), only 20 to 30 percent of participants detected the asynchrony, whereas if the earlier tones were presented as softer (“late + loud”), 90 to 100 percent of participants heard the tones as asynchronous. In other words, at 27 ms asynchrony, it was more difficult to detect the asynchrony when the earlier tone in the dyad was louder. This difficulty was attributed to a forward-masking effect or decreased sensitivity to synchrony due to familiarity with early and loud tone combinations. 
When the magnitude of asynchrony was increased to 54 ms, however, the chances of detecting asynchrony in the early + loud combinations increased to 40 to 50 percent, and in the late + loud combinations it increased to 100 percent, indicating that larger onset asynchronies generally enhanced detectability in both earlier and later directions.
Since the magnitudes of onset asynchronies in performance can be rather subtle, often verging on thresholds of perceptual discriminability, the intended timing feel of a given performance may be further augmented by musicians when they concomitantly manipulate sound features of rhythmic events that have been found to interact with timing at a perceptual level. Studies of a sound’s P-center, or perceptual “moment of occurrence” (Morton, Marcus, & Frankish, 1976) or “perceptual attack time” (Gordon, 1987), for example, have shown that, at least in isochronous contexts, the perceived temporal location of a sound event is a complex matter contingent upon more than just onset timing. Practically all studies on the P-centers of musical sounds have found that the faster a sound’s attack (onset to maximum amplitude peak) and total (onset to offset) duration, the earlier its average P-center relative to its onset, and, conversely, the longer the duration (both slow attack and long total duration), the later the P-center (Bechtold & Senn, 2018; Danielsen et al., 2019; Gordon, 1987; London et al., 2019; Scott, 1998; Seton, 1989; Villing, 2010; Vos, Mates, & Kruysbergen, 1995; Vos & Rasch, 1981; Wright, 2008). A recent study by Danielsen and colleagues (2019) also found that positively correlated combinations of attack and total duration (fast attack/short duration, slow attack/long duration) caused a “redundancy gain” whereby both factors shifted P-centers either earlier or later in time, respectively. Conversely, negatively related combinations (short attack/long duration, long attack/short duration) caused a “redundancy loss” whereby, because one factor tends to shift the P-center earlier while the other one shifts it later, the cumulative shifting effect was either attenuated or canceled out.
Loudness/intensity has also been shown to affect P-center location, though this has been underinvestigated and findings have been less conclusive. While Seton (1989) found no clear effect of intensity on a sawtooth tone, Gordon (1987) found that a saxophone sound played with greater intensity led to an earlier P-center than one played with lesser intensity. Bechtold and Senn (2018) confirmed this result and further found that, for saxophone sounds, both greater articulation (a stronger “tongue attack”) and higher dynamics (a louder sound) led to an earlier P-center on average in comparison to sounds with lesser articulation and lower dynamics. Altogether, findings of the effects of duration and intensity on P-center suggest that timing and sound interact at a perceptual level. Therefore, in the production of rhythms, potential redundancy gains across dimensions of timing and sound may help to further convey an intentional early, late, or on-beat timing feel. That is, the laid-back character of a late stroke may be enhanced if it is also played with a longer duration, just as the pushed-ness of an early stroke would be enhanced if it were played with a shorter duration.
Other perception studies have also revealed interactions among perceived timing, duration, and intensity of events in performance. Various experiments with classical pianists have shown that, when they are instructed to emphasize a melody tone in a polyphonic piano performance, they tend to play it both louder and earlier than the other voices (Goebl, 2001; Palmer, 1996; Repp, 1996). As mentioned, Goebl and Parncutt (2002) found that the relative perceptual salience of two tones in a piano dyad depended on both their relative intensity and the asynchrony between their onset timing locations. Yet other studies have found a systematic relationship between intensity and duration, whereby beats accented with greater intensity also tend to be lengthened in performance (Clarke, 1989; Dahl, 2004; Drake & Palmer, 1993; Gabrielsson, 1999; Waadeland, 2006). Regarding perceptual interactions between timing and intensity, Tekman (2002) found that it was easier to correctly identify a tone as being late in relation to another tone when it was both positioned late in terms of onset timing and played with greater intensity, as opposed to a tone that was simply positioned late but played at an equal intensity to another tone. This result was theorized as underlying a higher-order, semantic-level interaction (as opposed to a lower-order, sensory-level interaction) between perceptual dimensions of timing and intensity.
To sum up, several experiments with both music performance and music/audio perception point to the intimate relationship between the temporal and auditory aspects of microrhythm, as well as the integration of timing and various perceptual dimensions of sound.
The present study intends to further test the hypothesis that musicians systematically manipulate acoustic sound features other than onset in order to produce different timing feels in a musical context. Specifically, we aim to replicate some of the findings of Danielsen and colleagues (2015) regarding stroke onset and intensity production, but extend the scope of our investigation to an entire drum kit (snare drum, kick drum, and hi-hat cymbals), increase the number of participants, measure a new sound feature (duration), and explore the effects of two novel factors that may further influence the production of timing in performance: reference stimuli and rhythmic pattern. The rationale for presenting a timing reference to drummers with different sound stimuli is based on findings from sensorimotor tapping studies that suggest that musicians tend to display less negative mean asynchrony (“NMA”)—that is, synchronize more accurately to a reference—when presented with musical stimuli (see Repp, 2005; Repp & Su, 2013). In addition, while expert drummers are generally accustomed to playing along to metronomes comprised of click-like percussive stimuli, especially in a studio recording session, an instrumental track—a timing reference that more ecologically simulates a real musical production context (a trio ensemble)—will further test the effect of sound stimuli presentation for synchronization in drumming production, or, more specifically, whether the NMA disappears in a more ecological experimental context. We also chose to provide drummers with two different groove-based rhythms of differing rhythmic complexity and event density, in order to investigate contextual effects of stylistic pattern on production of intentional timing feel—that is, to gauge whether onset and sound-feature-manipulation profiles will differ for simpler groove-based patterns with no syncopation/pick-ups in relation to patterns with additional salient off-beat events. 
Finally, we also measure a new stroke feature, that of duration, since perceptual studies have shown that the attack and total temporal extent of a sound influences its P-center.
Overall, we hypothesize that, in the process of achieving distinctive early, late, or on-beat microrhythmic timing profiles in performance, instrumentalists leave sonic “stamps” on the sounds of their own instruments that are systematically related to these timing styles. We explore this possibility by measuring changes in duration and loudness (sound pressure level [SPL]), in addition to onsets, of sound events after instructing instrumentalists to play under different timing constraints, addressing the following question: “To what extent are there systematic differences in the acoustic signal between drum strokes played with different (a) intended timing styles along to (b) two different timing references in (c) two different groove patterns across subjects?” We hypothesize that NMA will be lower when the drummers are playing to the more ecological instrumental reference, and we also expect that the musical context will yield an effect, but we have no specific hypothesis regarding what this might be.
Twenty-two male drummers, 22–64 years of age (M = 36, SD = 11) participated in the study. All were active part-time or full-time musicians recruited from either local universities/conservatories or commercial performance scenes, and they had between 4 and 40 years of professional performance experience (M = 16, SD = 11). All were familiar with either jazz, funk/soul, or rock. All participants were paid an honorarium to take part in the experiment.
The drummers performed two rhythmic patterns at a 96 bpm medium tempo deemed to be comfortable based upon previous experiments (Câmara et al., 2020; Danielsen et al., 2015). Pattern 1 (Figure 1A) was a simplified version of the so-called “back-beat” pattern that is ubiquitous in popular groove-based music. Pattern 2 (Figure 1B) was a slightly more complex variation of the back-beat groove that included additional off-beat events: an extra eighth-note stroke on the “two-and” metrical position for the snare and a syncopated eighth-note on the “three-and,” as well as a pick-up on the “four-and,” for the kick drum.
Participants were presented with two categories of performance conditions:
Timing reference conditions. Play each pattern (1 and 2) along to:
a) a metronome, comprised of woodblock sounds (condition: Metronome)
b) an instrumental backing track, comprised of guitar and bass sounds (condition: Instrumental)
Timing style conditions. In each of the timing reference contexts listed above, play each pattern (1 and 2):
0) in as natural a manner as possible (condition: Natural)
1) in a laid-back manner, or behind the beat relative to the timing reference (condition: Laid-back)
2) in a pushed manner, or ahead of the beat relative to the timing reference (condition: Pushed)
3) in an on-beat manner, or synchronized to the timing reference (condition: On)
Apparatus and Materials
We did the recordings at the Motion Capture Laboratory, Department of Musicology, University of Oslo, Oslo, Norway, in the spring of 2018. For our drum instrumental setup, we used the following equipment: an acoustic metal snare drum, 7 inches deep and 14 inches wide (Gretsch, USA), with an Emperor X drumhead (Remo, USA) and a thin plastic muffle ring; a 21-inch kick drum (Gretsch), with a FA batter drumhead (Remo); and a hi-hat stand (Pearl, USA) with 14-inch cymbals (Yamaha, Japan) (see Figure 2). Pilot tests of the sound recordings revealed that close-microphone techniques with regular microphones led to too much leakage/bleed between hi-hat and snare signals, so these instruments were recorded with C411 contact microphones instead (AKG, Austria). For the kick drum, we used a Beta 52 (Shure, USA) microphone.
The microphone signals were sent into a BabyFace Pro sound card (RME, Germany) and recorded in the audio software Reaper 5.77 (Cockos Inc., NY) at a sampling frequency of 44.1 kHz and 24-bit resolution. Playback of the two timing-reference tracks was routed to a MG10XU analog mixer (Yamaha) and then fed into 250-Ohm-resistance DT 990 Pro headphones (Beyer Dynamic, Germany), which were then given to the participants for monitoring.
For the metronome beat reference track, we used two woodblock sounds (Cubase 8 Halion LE library, Steinberg/Yamaha, Germany), one pitched higher and located on the first quarter-note beat of each measure (spectral centroid [SC] = ca. 2370 Hz) and the other pitched lower and located on the remaining second, third, and fourth beats (SC = ca. 1565 Hz). Both sounds had very short, impulsive attack transients (attack duration from signal onset to max. amplitude peak ≈ 2 ms), with a gradual decay toward the signal offset (total duration ≈ 40–50 ms). The metronome stimuli were aligned to the software grid based on mean P-center results from our previous instructed timing study on guitarists and bassists (see Câmara et al., 2020).
The instrumental backing-track reference was comprised of electric guitar and electric bass sounds originally recorded in our audio analysis study of sound–microtiming interaction in guitar and bass (for details, see Câmara et al., 2020). For pattern 1, this was comprised of alternating quarter-note tones on bass (fundamental frequencies = E and A, total duration ≈ 610 ms) located on beats 1 and 3, and a two-chord guitar back-beat pattern (E and A, total duration ≈ 250 ms) located on beats 2 and 4 (Figure 3A). For pattern 2, an analogously complex and aesthetically appropriate version of the backing track was given, where the guitar pattern was syncopated on the second chord on the “4-and” beat, and the alternating quarter-note bass tones were always preceded by eighth-note pick-ups (total duration ≈ 300 ms) (Figure 3B). We had no P-center data on these guitar and bass sounds. However, since they all had fast attacks (≈ 15–25 ms) and previous experiments have shown that the P-centers of plucked string instruments tend to be close to the stimulus onset (Danielsen et al., 2019), all the instrumental reference stimuli were simply aligned to the software grid based on their calculated onset values (see “Audio Analysis” sub-section below for onset calculation details).
Before the experiment began, we encouraged all drummers to acquaint themselves with the drum-kit setup and warm up by playing freely for at least 10 minutes. Once the experiment began, at the beginning of each pattern and timing reference block, the “natural” timing condition was always given first, as a further warm-up to allow participants to accustom themselves to the pattern and timing reference combination, as well as the tempo. Then, the remaining timing style conditions (laid-back, on, and pushed) were given in randomized order. A rest period was allowed between conditions for as long as participants deemed necessary. Once started, each condition lasted for approximately 67.5 s (27 measures), and participants began to play as soon as they had entrained with the timing reference track. If a participant was dissatisfied with their performance during or after a condition trial, they were invited to repeat it as many times as they wanted. Each condition thus yielded 216 possible hi-hat strokes for both patterns, 54 snare and kick drum strokes for Pattern 1, and 81 for Pattern 2, per participant. After the performances, a short performer interview provided feedback related to the experimental setup and insight into the various performance strategies applied to achieve the different timing condition tasks. In total, the performance and interview session lasted between 45 and 60 min.
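The trial length and the snare/kick stroke counts reported above follow from the tempo and the number of measures. The short sketch below reproduces that arithmetic. The 4/4 meter and the strokes-per-measure values (2 for Pattern 1, 3 for Pattern 2) are assumptions inferred from the pattern descriptions and the reported totals, and all names are illustrative.

```python
TEMPO_BPM = 96          # tempo reported for both patterns
MEASURES = 27           # measures per condition trial
BEATS_PER_MEASURE = 4   # assumption: 4/4 meter

beat_s = 60 / TEMPO_BPM                               # duration of one quarter note (s)
condition_s = MEASURES * BEATS_PER_MEASURE * beat_s   # duration of one condition trial (s)

# Assumption: snare and kick strokes per measure, inferred from the
# pattern descriptions (two back-beat strokes, plus one extra in Pattern 2).
STROKES_PER_MEASURE = {"pattern 1": 2, "pattern 2": 3}

possible_strokes = {
    pattern: per_measure * MEASURES
    for pattern, per_measure in STROKES_PER_MEASURE.items()
}

print(condition_s)        # trial duration in seconds
print(possible_strokes)   # possible snare (or kick) strokes per trial
```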
Determination of drum stroke segments
First, we estimated the location of the onset and offset of each drum stroke directly from the audio waveform using an algorithm developed by Lartillot and colleagues (2020). The onset is the starting point of the attack phase, where the attack phase is a line fitted on the audio waveform through an optimization of both the slope and the maximum amplitude (mirevents with “Attacks” option set to “Waveform” and parameter settings: WaveformThreshold = 3 percent [snare/kick] and 10 percent [hi-hat]). Similarly, the offset is the ending point of the decay phase. Then, we separated each stroke signal into attack and decay segments based on the time point of the maximum global amplitude (“peak”). Thus, for the analysis (described below), each stroke was partitioned into three segments: 1) attack (interval between onset and maximum peak); 2) decay (interval between maximum peak and offset); 3) total (interval between onset and offset).
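The three-way partition can be illustrated with a minimal sketch. It assumes that onset and offset sample indices have already been estimated by some other means (the onset/offset detection algorithm itself is not reimplemented here), and it simply locates the maximum absolute amplitude between them. The function and key names are hypothetical.

```python
def partition_stroke(samples, onset_idx, offset_idx, sample_rate):
    """Split one stroke into attack, decay, and total segments.

    samples     -- sequence of amplitude values for one cropped stroke
    onset_idx   -- index of the estimated onset (assumed given)
    offset_idx  -- index of the estimated offset (assumed given)
    sample_rate -- sampling frequency in Hz
    Returns the peak index and the three durations in milliseconds.
    """
    if not 0 <= onset_idx < offset_idx < len(samples):
        raise ValueError("expected 0 <= onset_idx < offset_idx < len(samples)")

    # Peak = point of maximum absolute amplitude between onset and offset.
    segment = samples[onset_idx:offset_idx + 1]
    peak_rel = max(range(len(segment)), key=lambda i: abs(segment[i]))
    peak_idx = onset_idx + peak_rel

    ms_per_sample = 1000.0 / sample_rate
    return {
        "peak_idx": peak_idx,
        "attack_ms": (peak_idx - onset_idx) * ms_per_sample,   # onset to peak
        "decay_ms": (offset_idx - peak_idx) * ms_per_sample,   # peak to offset
        "total_ms": (offset_idx - onset_idx) * ms_per_sample,  # onset to offset
    }


# Toy stroke sampled at 1 kHz, so one sample corresponds to 1 ms.
toy = [0.0, 0.0, 0.2, 1.0, 0.5, 0.25, 0.1, 0.0, 0.0]
segments = partition_stroke(toy, onset_idx=2, offset_idx=6, sample_rate=1000)
```

By construction, the attack and decay durations always sum to the total duration.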
The dynamic microphone recordings of the kick drum contain singular short and transient peaks that are easily extracted from the waveform itself. The snare and hi-hat contact microphone signals, however, display atypical, irregular transient peak patterns during the attack phase (see Figure 4). Therefore, we decided to extract their peak location from a smoothed signal envelope, calculated as the sum of bin amplitudes across the columns of a spectrogram using the mirevents algorithm (loc. cit) with “Attacks” option set to “Slope” (parameter settings: frame length = 30 ms [snare/kick] and 100 ms [hi-hat]; frame hop factor = 2 percent).2
Selection of sound descriptors
We selected three sound descriptors for the main analyses, defined as follows: 1) onset (the starting point of the stroke attack segment (see above), measured in milliseconds); 2) duration (the elapsed time interval of the stroke segment, measured in milliseconds); 3) sound pressure level (SPL; the unweighted root-mean-square (rms) amplitude of the signal, measured in dB, with a 0 dB reference given as the average rms amplitude of all strokes in all timing conditions).
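As an illustration of the SPL descriptor, the sketch below computes the unweighted rms amplitude of each stroke and expresses it in dB relative to the mean rms amplitude of all strokes passed in. Which strokes are pooled for the 0 dB reference (for example, per participant and instrument) is an assumption of this sketch, as are the function names.

```python
import math


def rms(samples):
    """Unweighted root-mean-square amplitude of one stroke segment."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def relative_spl_db(strokes):
    """Return one dB value per stroke, with 0 dB defined as the mean rms
    amplitude over all strokes in `strokes` (assumed pooling scope)."""
    rms_values = [rms(s) for s in strokes]
    reference = sum(rms_values) / len(rms_values)
    if reference == 0 or any(v == 0 for v in rms_values):
        raise ValueError("silent segments have no finite dB value")
    return [20 * math.log10(v / reference) for v in rms_values]


# Two toy strokes: the second has half the amplitude of the first.
loud = [1.0, -1.0, 1.0, -1.0]
soft = [0.5, -0.5, 0.5, -0.5]
spl = relative_spl_db([loud, soft])
```

Halving the amplitude lowers the level by about 6 dB, so the two toy strokes end up roughly 6 dB apart on either side of the 0 dB reference.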
Aside from the onset descriptor, we also calculated duration and SPL for each stroke segment (attack/decay/total—described above). In all, then, we analyzed seven descriptors for each drum instrument: onset, attack duration, decay duration, total duration, attack SPL, decay SPL, and total SPL.
Following Danielsen and colleagues (2015),3 and Câmara and colleagues (2020), we used sound pressure level (SPL) as the audio descriptor to describe stroke intensity. SPL is highly correlated to perceived loudness and remains a widely used metric in perceptual studies (Rossing, Moore, & Wheeler, 2002). Duration was also chosen as various aforementioned studies report that both the attack and the total elapsed time of a sound in particular have been shown to have a significant effect on its P-center, whereby longer durations tend to shift the experienced P-center later in time.
Data Processing and Statistical Analysis
We excluded the data from two participants from the analysis: audio signals from one participant suffered from distortion due to technical issues during recording, and we deemed another participant unable to complete the instructed tasks based on reports from the post-experiment interview. Because the natural timing condition was used mainly as a practice/adjustment task, we also omitted all data for these series. Subsequently, we gathered data from 5,040 recorded series of drum strokes (drum instruments [3] × timing style [3] × reference [2] × pattern [2] × participant [20] × audio descriptor [7]). We cropped all of the strokes into individual segments via custom scripts in Matlab version R2018a (Mathworks, USA). First, we cropped audio recordings into individual segments according to the grid points corresponding to the patterns, from 400 ms before to 900 ms after each grid point. Then, we silenced the parts of the audio signal that belonged to the previous and following strokes. Segments that contained no signal around the grid point were marked as missing strokes and removed. Typically, participants would wait one or two measures before beginning to play in order to entrain to the metronome, resulting in 2 to 4 missing strokes per recording. Timing mistakes, defined as strokes with onsets of more than a sixteenth note (156 ms) early or late in relation to the metric grid, were also marked and removed automatically. We considered these events not as instances of early/late microtiming relative to the beat but rather as qualitatively different beat-level syncopations. We validated this automated process by plotting the waveform for each segment and inspecting the silenced audio parts and the marked missing strokes. A few missing strokes and timing mistakes were not detected automatically and were removed manually. 
Out of all total possible strokes captured per instrument, the total amount of manual and automatically marked invalid strokes accounted for about 9.4 percent of the snare, 8.7 percent of the kick, and 12.8 percent of the hi-hat data.
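The screening rules described above can be summarized in a small sketch. It is not the Matlab script used in the study; it only restates the crop window and the sixteenth-note criterion, with hypothetical function names and with onsets expressed in milliseconds on the same time axis as the grid.

```python
TEMPO_BPM = 96
QUARTER_MS = 60_000 / TEMPO_BPM   # 625 ms per beat at 96 bpm
SIXTEENTH_MS = QUARTER_MS / 4     # 156.25 ms, reported as 156 ms

WINDOW_BEFORE_MS = 400            # crop window start, relative to the grid point
WINDOW_AFTER_MS = 900             # crop window end, relative to the grid point


def crop_window(grid_ms):
    """Start and end time (ms) of the segment cropped around one grid point."""
    return grid_ms - WINDOW_BEFORE_MS, grid_ms + WINDOW_AFTER_MS


def classify_stroke(onset_ms, grid_ms):
    """Label one stroke as 'missing', 'timing mistake', or 'valid'.

    onset_ms is None when no signal was found around the grid point.
    """
    if onset_ms is None:
        return "missing"
    if abs(onset_ms - grid_ms) > SIXTEENTH_MS:
        return "timing mistake"
    return "valid"


# Toy example: three grid points, one beat apart.
grid = [0.0, 625.0, 1250.0]
onsets = [None, 645.0, 1450.0]
labels = [classify_stroke(o, g) for o, g in zip(onsets, grid)]
```

In this toy example the first stroke is missing, the second is 20 ms late and therefore valid, and the third is 200 ms late and counts as a timing mistake.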
Before our statistical analysis, we defined extreme outliers as values that were more than three times the interquartile range away from the median of each instrument, timing condition, participant, and audio descriptor separately, and excluded them. Out of all valid strokes captured in total (i.e., excluding invalid missing strokes and timing mistakes), extreme outliers accounted for about 1.0 percent of the snare, 2.0 percent of the kick, and 0.8 percent of the hi-hat drum data. To check for normality, we within-subjects standardized the data (subtracted the average of all strokes across timing style conditions for a given participant from each individual timing style condition value for the same participant; see Fischer & Milfont, 2010) and manually screened the residuals via histograms and Q-Q plots for each dependent variable and instrument separately. None showed departure from normality.
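A minimal sketch of these two preprocessing steps is given below, for one data cell (one instrument, timing condition, participant, and descriptor). The quartile method used for the interquartile range is an assumption (Python's default for `statistics.quantiles`), since the text does not specify one, and the function names are illustrative.

```python
import statistics


def exclude_extreme_outliers(values, k=3.0):
    """Drop values lying more than k * IQR away from the median."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # assumed quartile method
    iqr = q3 - q1
    median = statistics.median(values)
    return [v for v in values if abs(v - median) <= k * iqr]


def center_within_participant(values_by_style):
    """Subtract one participant's mean over all strokes (across timing
    styles) from each of that participant's values."""
    pooled = [v for vals in values_by_style.values() for v in vals]
    grand_mean = statistics.fmean(pooled)
    return {
        style: [v - grand_mean for v in vals]
        for style, vals in values_by_style.items()
    }


# Toy onset values (ms) with one extreme value.
cell = [10, 11, 12, 13, 14, 15, 16, 100]
kept = exclude_extreme_outliers(cell)

# Toy participant with two strokes per timing style.
participant = {"laid-back": [30, 34], "on": [0, 4], "pushed": [-20, -12]}
centered = center_within_participant(participant)
```

After centering, a participant's values sum to zero, so differences between timing styles are preserved while overall between-participant offsets are removed.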
Even though all participants reported familiarity with playing along to both a metronome and an instrumental backing track, we needed to confirm whether participants were in fact able to physically place their strokes behind, ahead of, or synchronous with the reference track when instructed to do so, that is, to accomplish the instructed timing style tasks in terms of simple onset timing location for at least one instrument at a time. To do so, we first compared the average (arithmetic mean) profile of onset between the laid-back and pushed series with the on-beat series for each drum and individual. Then, in order to gauge the overall trends of all drummers’ timing-sound manipulation in the production of timing feel, we ran three-way RMANOVAs for each instrument and sound descriptor (onset, duration, and SPL) separately across all participants (N = 20), with timing style, reference, and pattern as the independent variables. Violations of sphericity (Mauchly’s test) were corrected using the Greenhouse-Geisser correction. Post hoc paired-samples t-tests were performed where significant main effects or interactions were found and were Bonferroni corrected for multiple comparisons.
We also ran supplementary paired samples t-tests to investigate potential effects of individual notes between patterns where deemed appropriate (Bonferroni corrected for multiple comparisons). All statistical analyses were performed using SPSS ver. 25 (IBM, Inc., New York).
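As a reminder of what a Bonferroni correction does to a family of post hoc tests, the following Python sketch multiplies each raw p-value by the number of comparisons and caps the result at 1. The function name and the example p-values are hypothetical; the sketch illustrates the general rule and is not a reproduction of the SPSS output.

```python
from typing import List


def bonferroni(p_values: List[float]) -> List[float]:
    """Return Bonferroni-adjusted p-values for one family of comparisons."""
    m = len(p_values)  # number of comparisons in the family
    return [min(1.0, p * m) for p in p_values]


# Three hypothetical pairwise comparisons
# (laid-back vs. on, pushed vs. on, laid-back vs. pushed).
adjusted = bonferroni([0.01, 0.04, 0.5])
```

With three comparisons, a raw p of .04 would no longer fall below .05 after adjustment, which is why some contrasts are reported only as trends.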
Our examination of the 240 onset data series (20 participants × 3 timing styles × 2 references × 2 patterns) for each percussion instrument revealed that, for both timing references and patterns, the mean onset location of strokes corresponded to the given timing-style instructions in at least one out of three instruments (hi-hat, snare, kick). That is to say, when asked to play in a laid-back or pushed manner, the drummers successfully produced, on average, onsets later and earlier in time, respectively, in relation to the corresponding on-beat series in either the snare, kick, or hi-hat. Descriptive statistics of the microtiming onset profiles of all series by all drummers can be found in the Appendix.
Effects and Interactions of Timing Style, Reference, and Pattern
We conducted three-way RMANOVAs for each stroke segment (attack, decay, and total) and each drum instrument separately, with timing Style, Reference, and Pattern as independent variables and onset, duration, SC, and SPL as dependent variables. An overview of these RMANOVA results across all participants (N = 20) for all sound descriptors can be found in Table 1, and descriptive statistics for mean and standard deviations across participants for all timing-style, reference, and pattern conditions can be found in Table 2. In the following section, we elaborate upon significant results only. All reported onset and duration values are rounded to the nearest 1 ms, and SPL values to the nearest 0.01 dB.
Results showed a significant main effect of Style on onset location for hi-hat, snare, and kick drum. Post hoc pairwise comparisons revealed significant differences in mean onset between all three timing-style pairs for all instruments (laid-back vs. on, pushed vs. on, laid-back vs. pushed; see Table 3). These comparisons showed that the mean difference in onset timing between the pushed and on timing-style conditions (deltaPvO) was greater than the mean onset difference between the laid-back strokes and on-beat conditions (deltaLvO). To test whether this difference was significant, follow-up paired-samples t-tests were conducted. They confirmed that deltaPvO was indeed significantly larger than deltaLvO for all instruments: snare, M = +15 ms, SD = 18 ms, p = .001, d = 0.83; kick, M = +19 ms, SD = 17 ms, p < .001, d = 0.95; hi-hat, M = +19 ms, SD = 17 ms, p = .017, d = 0.62.
While we found no significant main effect of Reference on onset location for any of the instruments, we found a significant interaction between Style and Reference for all instruments such that timing style had a greater effect in the instrumental than in the metronome condition. Figure 5 illustrates the mean onset timing across participants in all style, reference, and pattern conditions. Paired-samples t-tests revealed that, for all instruments, mean onset timing was significantly later in the instrumental reference compared to metronome across patterns in the laid-back and on conditions. For the pushed conditions, however, we found the opposite result, where mean onset timing was earlier in the instrumental reference compared to the metronome (see Table 4).
A significant main effect of Pattern was found on onset for only the hi-hat, and post hoc pairwise comparisons revealed an earlier mean onset for Pattern 2 compared to Pattern 1 (M = −5 ms, SD = 7 ms, p = .005, d = 0.71).
A significant interaction on onset was also found between Style × Reference × Pattern for the kick and the hi-hat. Overall, based on examination of the plots, there was a lesser effect of timing style when drummers played the more complex pattern (Pattern 2), and no difference between metronome and instrumental reference track when playing this pattern in a laid-back timing style.
A significant main effect of Style on duration was found for the snare drum in the decay and total duration segments. Post hoc pairwise comparisons for the snare revealed significantly longer mean decay and total duration for laid-back strokes than for on-beat strokes, but no significant differences in any stroke segment between either pushed and on-beat or laid-back and pushed strokes (see Table 5). Figure 6 further illustrates the results for mean duration for all snare-stroke segments in all style, reference, and pattern conditions.
We also found a main effect of Pattern on duration for the snare. Post hoc pairwise comparisons revealed significantly longer durations in Pattern 1 than in Pattern 2 for both decay (M = +9 ms, SD = 9 ms, p < .001, d = 1.0) and total (M = +9 ms, SD = 9 ms, p < .001, d = 0.97) segments. Since the main difference between the patterns was structural, we wanted to further investigate whether this effect was driven by the difference in individual note composition between the two pattern conditions. We therefore ran a supplementary paired-samples t-test between the average duration of the individual notes across participants in each pattern. The test revealed that not only was the decay and total (but not attack) duration of the extra eighth note in the double stroke of Pattern 2 shorter than that of the single stroke of Pattern 1, but also almost all of the strokes in Pattern 2 were significantly shorter than the corresponding strokes of Pattern 1 (see Table 6).
A significant main effect of Style on duration was found for the kick drum. Post hoc pairwise comparisons revealed significantly shorter mean durations in the laid-back than the pushed strokes for the decay segment, but only a trend toward significance for total duration (see Table 7). We found no significant differences between laid-back and on-beat strokes or pushed and on-beat strokes for any stroke segment. Figure 7 illustrates the results for mean duration for all kick drum stroke segments in all style, reference, and pattern conditions.
We found no main effects on duration for the hi-hat, nor any interactions for any stroke segment in any of the percussion instruments.
Sound Pressure Level (SPL)
For the snare drum, a significant main effect of style on SPL appeared in all stroke segments (attack/decay/total). Post hoc pairwise comparisons revealed significantly higher mean attack, decay and total SPL in the laid-back compared to pushed strokes, but no significant differences in any segment between laid-back and on-beat strokes or pushed and on-beat strokes (see Table 8). Figure 8 illustrates the results for mean SPL for all snare segments in all style, reference, and pattern conditions.
We also found a significant main effect of Pattern on SPL for the snare. The post hoc pairwise comparison shows that mean SPL was higher in Pattern 2 than in Pattern 1 in the attack (M = +1.11 dB, SD = 1.48 dB, p = .003, d = 0.75) and total (M = +0.83 dB, SD = 1.72 dB, p = .043, d = 0.48) stroke segments.
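The effect sizes reported alongside these pairwise comparisons appear to be consistent with a paired-samples Cohen's d computed as the mean difference divided by the standard deviation of the differences. That reading is our assumption, since the text does not state the formula. A small Python sketch (the function name `cohens_dz` is hypothetical) reproduces the two snare values above from the rounded M and SD:

```python
def cohens_dz(mean_diff: float, sd_diff: float) -> float:
    """Paired-samples effect size: mean difference / SD of the differences."""
    if sd_diff <= 0:
        raise ValueError("sd_diff must be positive")
    return mean_diff / sd_diff


# Snare SPL, Pattern 2 minus Pattern 1, using the rounded values from the text.
d_attack = round(cohens_dz(1.11, 1.48), 2)  # 0.75
d_total = round(cohens_dz(0.83, 1.72), 2)   # 0.48
```

Because the published M and SD are themselves rounded, recomputed values elsewhere in the section may differ from the reported d in the last digit.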
In order to gauge whether the structural differences in note composition between patterns may have had an effect on snare SPL, we ran a supplementary paired-samples t-test between the average SPL of the notes in each pattern, as we did with duration. The tests revealed that while attack SPL of all notes in Pattern 2 was significantly higher than that of the corresponding notes of Pattern 1, the results of the total SPL comparisons showed no differences between the patterns (see Table 9).
We also found a significant interaction between Style and Pattern for all snare SPL stroke segments, such that there was a stronger effect of Style on SPL in Pattern 2 than in Pattern 1. Paired-samples t-tests revealed that mean SPL (all stroke segments) was significantly higher in Pattern 2 than in Pattern 1 for laid-back strokes and on-beat strokes, but showed no difference for pushed strokes (see Table 10).
For the kick drum, only a main effect of Pattern on SPL was found. Post hoc pairwise comparisons showed that all stroke segments were played with higher SPL in Pattern 2 than in Pattern 1 at p < .001 (see Figure 9): attack, M = +3.27 dB, SD = 1.27 dB, d = 2.58; decay, M = +2.81 dB, SD = 1.92 dB, d = 1.46; total, M = +2.96 dB, SD = 1.34 dB, d = 2.21.
We ran a supplementary paired-samples t-test between the average SPL of the notes in each pattern. The test revealed that, for all stroke segments, not only were the differing kick notes from Pattern 2 (“three-and,” “four-and”) louder than all the notes in Pattern 1 (“one,” “three”), but also all of the notes in Pattern 2 were played significantly louder than all of the notes in Pattern 1 (see Table 11).
For the hi-hat, we found a significant main effect of Style on SPL in all stroke segments. Post hoc pairwise comparisons revealed significantly higher mean SPL in the pushed condition compared to the on-beat condition, as well as significantly higher SPL for laid-back compared to on-beat for the total segment, a trend toward significance for the attack segment, but no significant difference for the decay segment. The difference in SPL between the pushed and laid-back conditions was not significant for any of the segments (see Table 12).
We also found a significant main effect of Reference on SPL for the hi-hat in all stroke segments. Post hoc pairwise comparisons revealed higher mean SPL in the metronome compared to the instrumental condition (attack, M = +0.23 dB, SD = 0.50 dB, p = .050, d = 0.47; decay, M = +0.14 dB, SD = 0.29 dB, p = .043, d = 0.48; total, M = +0.22 dB, SD = 0.28 dB, p = .003, d = 0.78). No interactions were found on SPL for the hi-hat. Figure 10 illustrates mean SPL for all hi-hat segments in all style, reference, and pattern conditions.
Effects of Timing Style on Onset Location
Drummers show high degree of onset timing control
Regarding onset location timing, results show that, on average, all twenty drummers were able to perform the tasks: they played laid-back and pushed strokes slightly later and slightly earlier, respectively, relative to the on-beat timing style condition. These results accord with those of previous instructed timing-style performance studies with drummers by Kilchenmann and Senn (2011) and Danielsen and colleagues (2015), and with guitarists and bassists by Câmara and colleagues (2020), and they provide further evidence of musicians’ high degree of intentional control of onset manipulation in order to produce different timing feels in groove-based performance. Here, the average onset difference of all instruments between on-beat and laid-back strokes across pattern and timing reference was found to range between 21 and 27 ms, and for pushed compared to on-beat strokes, between -37 and -42 ms. In music theory terms, at the experiment tempo (96 bpm), these averages correspond to durational differences between roughly a 128th note (19.5 ms) and a 64th note (39.1 ms) to distinguish the average microtiming onset profile of laid-back and pushed from on-beat, indicating a high degree of control at the microrhythmic level. This is not to say, however, that drummers consciously operate with these minute canonical note categories in mind, only that they are able to produce onsets flexibly around the beat.
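The note-value equivalents quoted above follow directly from the tempo. As a worked check, this short Python sketch (the function name `note_duration_ms` is hypothetical) converts a tempo in quarter-note beats per minute into the duration of any binary note value:

```python
def note_duration_ms(bpm: float, denominator: int) -> float:
    """Duration in ms of a 1/denominator note, with the quarter note as the beat."""
    whole_note_ms = 4 * 60_000 / bpm  # four quarter-note beats
    return whole_note_ms / denominator


sixteenth = note_duration_ms(96, 16)   # 156.25 ms, the timing-mistake threshold
sixty_fourth = note_duration_ms(96, 64)  # 39.0625 ms
one_28th = note_duration_ms(96, 128)   # 19.53125 ms
```

At 96 bpm a quarter note lasts 625 ms, so the laid-back averages of 21 to 27 ms sit just above a 128th note and the pushed averages of 37 to 42 ms sit close to a 64th note.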
As to whether the average listener would be able to distinguish these different feels based on their onset timing profiles alone, most music performance studies, as mentioned previously, have either speculated or found indirect evidence that the threshold for detection of asynchrony between two sound events in a musical context is around 30 milliseconds (Butterfield, 2011; Goebl & Parncutt, 2002). If this is correct, then it is indeed likely, based on our experiment results, that one would be able to detect both laid-back and pushed drum strokes as asynchronous relative to either the timing reference stimuli or the onset timing of the on-beat conditions, with the likelihood of detecting pushed strokes being even greater due to their larger average onset asynchrony magnitudes.
As was the case in a previous study with guitarists and bassists (Câmara et al., 2020), the variability (SD) of the mean onset location for all instruments was numerically higher in the pushed (24 ms) and laid-back (17 ms) styles than it was in the on-beat (12 ms) condition. This may be explained by the fact that the majority of the participants reported having practiced on-beat synchrony with a timing reference more than they had practiced laid-back and pushed styles. It may also be related to the fact that participants described laid-back and pushed as more ambiguous timing style categories that allowed for a greater range of onset locations while still sounding aesthetically “correct.” This suggests that drummers, like guitarists and bassists, also regarded the “beat bin,” or temporal range in which sounds are perceived as synchronous with the beat (Danielsen, 2010; Danielsen et al., 2019) in the pushed and laid-back styles as broader than that of on-beat strokes.
Asymmetry of onset asynchrony magnitude between pushed and laid-back styles
A curious result was that, when they were playing with a pushed feel, the drummers produced greater asynchrony relative to both the timing references and the average on-beat condition onset location than when they were playing with a laid-back feel. On the one hand, the greater values for the pushed strokes may be because participants described the pushed feel as more difficult and unfamiliar, which is demonstrated as well by the greater variability in average pushed stroke onsets. As such, the increased difficulty could have led to the exaggeration of the earliness of pushed strokes over the lateness of laid-back strokes. On the other hand, drummers may consider laid-back timing approaches to be aesthetically amenable to subtler degrees of lateness, whereas they may consider more pronounced earliness to be acceptable in pushed performances. That is, a performance might sound sloppy or loose, rather than simply laid-back, if the strokes are exaggeratedly late, whereas in a pushed timing feel, the threshold for early strokes sounding rushed rather than pushed may be larger, allowing for greater earliness magnitudes. To our knowledge, however, no one has yet specifically investigated whether onset asynchrony thresholds are different in systematically off-beat late vs. early events relative to an external timing reference in a rhythmic context.
Timing reference and onset NMA
As expected, in the on-beat condition the average onset timing of all drummers anticipated the timing reference. We found this anticipatory tendency, or negative mean asynchrony (NMA), in both reference conditions for all instruments. At the same time, however, the results show relatively low overall NMA values in the range of -10 to -20 ms. These results accord with findings from in-phase sensorimotor synchronization tapping studies (see Repp, 2005; Repp & Su, 2013) showing that highly trained musicians, and especially drummers, tend to display lower NMA than nonmusicians, typically in the range of 0 to 20 ms. Moreover, Fujii and colleagues (2011) found that, in a simple on-beat synchronization performance task by drummers to a metronome, NMA was lowest for the hi-hat, followed by the snare and the kick drum in the medium tempo condition (120 bpm). This also resonates with our findings, where the hi-hat displayed the lowest NMA values in the on-beat conditions. The reason drummers tend to produce lower NMA with the hi-hat cymbal may be related to the fact that it is typically considered the main timekeeping element of the drum-kit and is typically played with the strongest/dominant hand (right for right-handed players).
The metronome reference yielded greater NMA than the instrumental reference, with the snare and the hi-hat demonstrating near-negligible anticipation in the instrumental (-4±13 ms and -3±14 ms, respectively). As of yet, there is no consensus as to what leads to NMA in sensorimotor-synchronization tasks. One potential explanation for the instrumental reference leading to lower overall NMA, however, is that the guitar and bass sounds used in the instrumental reference track had longer attack and total durations than the faster and shorter metronome woodblock sounds. Accordingly, as predicted by the findings of research into the perceptual centers of musical sounds (Danielsen et al., 2019; Gordon, 1987; Villing, 2010), the synchronized target location for drummers would be slightly later in the instrumental reference. Regardless, our findings accord with previous studies showing that synchronization to more ecological musical stimuli tends to lead to less NMA than synchronization to metronomic stimuli (Dixon, Goebl, & Cambouropoulos, 2006; Repp, 2008; Wohlschläger & Koch, 2000).
Effects of Timing Style on Sound Shape: Duration and Intensity
Timing style and duration: Longer snare and shorter kick in the laid-back condition
Regarding duration, we found significant main effects of timing style conditions for the snare and kick drums. While drummers showed a tendency to play snare strokes in the laid-back timing style condition slightly longer than they did in the on-beat condition, we found the opposite effect for the kick drum, curiously, whereby laid-back strokes were played slightly shorter than on-beat strokes. In both cases, only the duration of either the decay or total stroke was altered; attack length, as defined in this study, was much less alterable due to the constraints of the instrument type. (It is difficult, if not impossible, to manipulate the attack rise time of an impulsive strike of either a drumstick or foot pedal beater on a drum skin membrane, as it occurs in a very short time interval in which the amplitude rises very quickly.) Decay length, on the other hand, could have been lengthened by either striking the drum membrane with a greater intensity or allowing the membrane to vibrate for a longer time or both. However, Pearson correlation tests revealed only weak, albeit significant (p < .001), correlations between stroke duration and intensity (snare: decay, r = .07, total, r = .04; kick: decay, r = .09). Instead, then, for the snare, longer decay/total snare stroke length may have been achieved by allowing the drum stick to continue to bounce lightly on the skin after the first stroke impulse for a longer time (a so-called “normal” stroke technique, with stick rebound, as opposed to a “controlled” stroke, where the stick is stopped and held firmly above the skin right after impulse; Dahl & Altenmüller, 2008). Similarly, shorter kick strokes may have been achieved by keeping the foot pedal beater pressed firmly against the drum-head after striking (often referred to as “burying the beater”), effectively curtailing the resultant sound slightly, as opposed to allowing it to bounce back freely and allowing the membrane to vibrate longer.
The snare result accords with a previous instructed timing style study that revealed that guitarists also lengthened the duration of their laid-back strokes (Câmara et al., 2020). Though the magnitude of stroke lengthening by the drummers was subtler than that of the guitarists (ca. 7 ms vs. 30 ms significant difference from the on-beat style condition), P-center studies show ample evidence that longer durations lead to later P-centers relative to signal onset. Therefore, as was the case with the guitarists’ tests, a late and long stroke may further augment the “behind-the-beat” character of laid-back strokes, whereas shorter durations encourage the listener to experience on-beat and pushed strokes as either more in sync or earlier relative to timing reference stimuli, thereby enhancing their synchronous or early timing character, respectively. Reported strategies also corroborated these signal analysis results, in that several of the drummers described applying “slower” movements, aiming for “longer tones with more sustain,” and holding the drum stick with a “looser” grip to achieve a laid-back feel, and conversely using “faster/smaller” movements and tones with “less sustain” via a “tighter grip” to achieve on-beat and pushed feels.
As to why drummers would utilize shorter kick strokes, which potentially elicit the experience of an earlier P-center in the laid-back condition, we might look at the onset location differences between kick and snare in the various timing style conditions. The overall tendency for drummers in the laid-back style conditions was to position the kick slightly earlier (+7 ms) than the snare (+17 ms) relative to the timing reference. The ensuing average inter-instrument onset difference between kick and snare (10 ms, SD = 7 ms) in the laid-back condition is significant in itself, t(19) = 6.415, p < .001, and also significantly greater than the equivalent differences in the on-beat (4 ± 3 ms), t(19) = 4.841, p < .001, and pushed (1 ± 8 ms), t(19) = 3.630, p = .002, style conditions. Some participants, in fact, reported that when they were playing in the laid-back style condition, they consciously implemented a strategy of aiming the kick drum closer to the timing reference (more on the beat) while delaying both snare and hi-hat slightly. By utilizing a strategy of relatively earlier and shorter kick + later and longer snare, then, the perception of an even longer interonset interval between kick and snare might be enhanced due to the additional effects of duration on P-center, thereby enhancing the overall laid-back timing feel of the performance as such.
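The test of whether the laid-back kick-snare difference differs from zero can be approximated from the summary statistics alone. The Python sketch below is an illustration under assumptions: the function name is hypothetical, the test is treated as a one-sample (paired) t-test across the 20 participants, and the inputs are the rounded mean and SD given in the text, so the result only approximates the reported statistic.

```python
import math
from typing import Tuple


def paired_t_from_summary(mean_diff: float, sd_diff: float, n: int) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for a mean difference tested against zero."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Kick-snare onset difference in the laid-back condition: M = 10 ms, SD = 7 ms, N = 20.
t_value, df = paired_t_from_summary(10.0, 7.0, 20)  # t is roughly 6.39 with 19 df
```

The small gap between this value and the reported 6.415 is what one would expect if the published mean and SD were rounded to whole milliseconds before printing.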
Timing style and sound pressure level: Late and loud vs. early and loud
A significant main effect of timing style on sound pressure level (SPL) was found for the snare—drummers played laid-back strokes with higher decay/total SPL than they played pushed strokes. This was essentially a replication of the intensity findings of Danielsen and colleagues (2015), where drummers on average played laid-back strokes with the greatest intensity compared to the on-beat condition, and provides further evidence that intensity is a vital feature of timing style production. Interview responses from the present study’s twenty drummers also confirm the previous study’s reported association between laid-back feel and playing “heavier” by “giving more weight” to snare strokes. In addition, the drummers reported positioning themselves for the laid-back condition by leaning more backward or away from the snare and/or lifting the stick higher in preparation for a stroke, both of which may result in greater intensity (more distance allows for a higher striking velocity) as well as later onset (particularly when combined with a “flam” technique during simultaneous snare and hi-hat strokes—if both strokes fall toward the instruments at the same time but the snare begins higher up, it will land after the hi-hat). On the other hand, the drummers associated the pushed condition with “lighter/softer” or “thinner” strokes that were played with the body and hands “positioned closer to the drums,” and if a flam technique was used, the snare stroke was aimed to fall on the drum before the hi-hat (thanks to a lower snare stick height).
As with duration, however, we found no single directionality of the effect of timing style on one sound feature across the different percussion instruments, as the hi-hat indicated the drummers’ tendency to play both pushed and laid-back strokes with higher total SPL compared to on-beat strokes. In groove-based music performance, as mentioned, the hi-hat cymbal is widely considered to be the main “timekeeper” of the drum kit (and the entire ensemble in live performance contexts), because it clearly and consistently externalizes the density referent of the groove pattern—that is, its smallest practical metrical subdivision level (Nketia, 1974)—which, in our experiment, was manifested in both patterns as a stream of eighth notes. To clearly convey the idea that this timekeeper is meant to be heard as pushing against the timing reference or ensemble, then, drummers would want to accent it with greater intensity in order to increase its perceptual salience. While Goebl and Parncutt (2002) suggest that it is more difficult to detect the presence of an asynchrony in early and loud combinations due to a potential forward masking effect, the greater the onset magnitude between the two sounds, the higher the chance of detecting an asynchrony between them. We may recall that the onset asynchrony between pushed strokes and timing reference was found to be greater than between laid-back strokes and timing reference for all of the instruments. For the hi-hat, the average onset asynchrony of pushed strokes relative to timing reference was -46 (± 21) ms. This is substantially above the -27 ms asynchrony condition in Goebl and Parncutt’s (2002) experiment, where detectability of asynchrony in early and loud tone combinations was 20–30%, and closer to the -54 ms asynchrony condition, where it was higher at 40–50%. 
It therefore follows that increasing the magnitude of onset asynchrony of hi-hat strokes in the pushed condition would counteract the risk of reduced detectability of asynchrony in early and loud combinations due to forward masking. That is to say, when hi-hat strokes are played louder, the earlier they are played, the higher the chance they may be heard as asynchronously pushing against the timing reference layer, rather than masking and potentially supplanting it.
On the other hand, if an early and loud tone combination is harder to detect, then, as Danielsen and colleagues (2015) suggest, a late and loud stroke would appear to facilitate the perception of asynchrony between tones. In fact, Tekman (2002) found that it was easier to correctly classify a tone as being late in relation to another tone when it was both positioned late and coupled with a greater intensity of delivery, compared to a tone that was simply positioned late at the same intensity as another tone. In the laid-back condition, then, amplifying the hi-hat or the snare with a slightly greater intensity would help them stand out more clearly as late strokes in relation to the timing references. In addition, since the risk of forward masking may not be present in late and loud combinations, it may be that lesser degrees of asynchrony are needed to convey an intentionally late stroke against a timing reference when they are also played with greater intensity, which may be related to the overall lower average onset differences produced between laid-back and timing reference for the hi-hat and snare at least.
Effects of musical context: Reference and Pattern
Instrumental reference produces more extreme early and late onset timing
Timing reference stimuli had an amplifying effect on the magnitude of asynchrony—laid-back strokes were played even later (and, conversely, pushed strokes were played even earlier) in the instrumental reference compared to the metronome. This effect was most salient in Pattern 1 (see Figure 5) and may be related to the spectral features and/or duration of the timing reference stimuli, since the instrumental guitar and bass sounds were both longer than those of the woodblock metronome stimuli and spectrally more distinct from the drum kit’s sounds as well. When producing an asynchrony between two short, percussive, and highly impulsive sounds such as the woodblock and the snare/kick/hi-hat, the threshold for onset asynchrony may be reduced, because two click-like sounds in close proximity may stand out more to the listener/performer. On the other hand, when producing an asynchrony between an impulsive drum sound with faster attack and shorter duration and a guitar or bass sound with a relatively slower attack and longer duration, a greater magnitude may be required to prevent potential spectral, durational, or dynamic masking effects.
Effects of reference on sound pressure level of time-keeper (hi-hat)
Timing reference stimuli for the drummers also had an effect on the dynamics of their strokes: overall, they played the hi-hat more loudly to the metronome than they did to the instrumental stimuli. This may be related to differences in their interpretations of the tasks. Typically, drummers will play alone to a metronome either as a way to practice (to hone timing skills) or when they are recording a backing track in a studio session for a song, which is later overdubbed by other instruments. When they play solo with a metronome, then, they want most of all to ensure that they are synchronized with it, and by playing the hi-hat (the timekeeping element that coincides with all of the metronome’s sounds) louder, they may be trying to enhance their ability to do so. Unsurprisingly, participants reported that they tended to play more “mechanically” to the metronome. When they played to the instrumental reference, on the other hand, which evokes a live or studio trio rhythm section, they entered a more “dialogical mode” (Chernoff, 1979) of performance, whereby they played together with the other instruments rather than in strict synchronization to a timekeeper. In this context, it is possible that the drummers did not need to emphasize the timekeeper hi-hat so consistently and could focus instead on balancing their drum kit sounds with those of the recorded ensemble in a fashion they deemed appropriate to the musical context.
Effects of pattern: Musical contextual/aesthetic considerations
Pattern was found to have several effects on the onset, duration, and SPL of the different percussion instruments. Regarding onset timing, a main effect of pattern was found for the hi-hat, where the more complex rhythms of Pattern 2 led to slightly earlier strokes than did Pattern 1. As there were no differences between the hi-hat patterns themselves, this effect suggests that the greater density of notes stemming from the extra kick and snare off-beat strokes may have led drummers to play Pattern 2 in a “pushier” fashion overall. Just as drummers can be either “pushier” players or display a tendency to anticipate the timing reference even when playing laid back (Kilchenmann & Senn, 2011), certain patterns can elicit “pushier” performances by drummers in general, particularly when they are combined with greater pattern density and complexity.
As to duration, the drummers also played the snare drum with shorter durations (decay/total) in the more complex and busier Pattern 2 than in the simpler and sparser Pattern 1. The main structural difference between the snare patterns was that, in Pattern 1, only a single eighth note stroke appeared on metrical position “two”; in Pattern 2, on the other hand, a double eighth note stroke spanned the positions “two” and “two-and.” This double stroke may have compressed the time between the two strokes, leading to shorter average note durations. A supplementary paired-samples t-test revealed that not only was the duration of the double strokes in Pattern 2 shorter than that of the equivalent single stroke in Pattern 1 but also almost all of the notes in Pattern 2 were significantly shorter than those in Pattern 1. This suggests that the reason why drummers played shorter strokes in Pattern 2 may be the overall greater density and proximity of the notes of all the instruments in that pattern, rather than simply the extra snare note alone.
As to effects of pattern on SPL, both snare (attack) and kick drum (all segments) were played more loudly in Pattern 2. For the snare, we found a further interaction between timing style and pattern as well: the drummers played the snare strokes more loudly in Pattern 2 in the laid-back and on-beat conditions but not the pushed condition, which recalls the snare result of the main effect of timing style on SPL (the laid-back condition was louder than the pushed). A follow-up note-by-note analysis of stroke SPL also revealed that, for both snare and kick drums, almost all of the notes of Pattern 2 were louder than all of the notes in Pattern 1. If greater effort leads to greater intensity, the relatively greater difficulty reported by participants when playing laid-back and pushed feels in Pattern 2 as opposed to Pattern 1 may have contributed here. The reason may also be aesthetic in nature: perhaps the drummers considered the extra intensity to project a more “driving” or “energetic” rhythm in relation to the simpler Pattern 1. Overall, then, it would appear that musical context influences microtiming profiles; that is, the nature of the pattern and timing reference in a given musical context may further affect how timing feels are expressed in drumming performances.
Systematicity and intentionality
While drummers were found to systematically manipulate intensity or duration of strokes to express different timing styles, this does not necessarily imply intent on the drummers’ part. That is, it is fully possible that changes in these sound features were simply byproducts of onset timing manipulation due to motor limitation effects outside of their explicit control. At the same time, the possibility of intent should not be precluded, especially since drummers at times described applying strategies that both implicitly and explicitly mentioned a focus on intensity or duration. Also, although drummers are likely not consciously aware of the perceptual effects that duration and intensity have on either the P-center of sounds or on the detectability of onset asynchronies, this does not mean they cannot hear or feel the effects that longer/shorter or louder/softer sounds might have when further applied to late/early strokes. As such, it may be that they are able to intuitively utilize certain combinations of onset and intensity/duration in order to better achieve a given timing style.
Summary and Conclusions
Our findings show that drummers systematically manipulated not only the onset location but also the intensity and/or duration of the various drum instruments when instructed to perform groove-based patterns in a laid-back, on-beat, or pushed fashion. Drummers produced distinctive average stroke onset profiles for each timing style, with the pushed condition showing a tendency to be slightly more anticipated than the laid-back condition was delayed, potentially as a result of the increased difficulty of the pushed condition reported by drummers, or a decreased sensitivity to early as opposed to late onset asynchrony. Systematic differences in the shape of the acoustic signal for strokes played with different timing styles were also found in at least one measured sound descriptor (duration and SPL) for all the instruments in the drum kit. Drummers tended to play snare strokes in the laid-back condition louder and longer on average, a timing/sound strategy that might enhance the perceived lateness of strokes due to the increased detectability of late and loud asynchronies and the P-center delaying effects of longer durations. Kick drum strokes, on the other hand, were played shorter on average in the laid-back condition, which, when viewed in light of the longer concomitant snare strokes, would amplify the perceived time interval between the drum strokes themselves, rather than simply enhance the delayed or anticipated character of single strokes in relation to a timing reference. Lastly, the hi-hat showed a tendency to be played louder in both asynchronous conditions that was potentially related to its role as a timekeeper—that is, greater intensity may increase its perceptual salience and thus help to highlight other intentionally produced asynchronies in relation to an external timing reference.
Musical context also influenced timing style as effects of both timing reference (metronome vs. instrumental backing track) and pattern (Pattern 1 [“simple”] vs. Pattern 2 [“complex”]) were found on either onset location, duration, or intensity across the different percussion instruments. Timing reference primarily impacted the average onset profiles of the instructed timing styles with instrumental sounds leading to more pronounced early and late timing, perhaps to ensure that asynchronies were not spectrally or dynamically masked by the wider instrumental bass and guitar sounds, with their relatively slower attacks and longer durations. The metronome led to greater NMA in the on-beat style condition for all instruments, possibly prompted by the earlier P-centers of the metronome stimuli, with their shorter durations and faster attacks. Pattern was also found to affect performance—for example, snare and kick strokes were played louder, and hi-hat strokes earlier in Pattern 2 than in Pattern 1. Since the main difference between the two instructed patterns was structural, Pattern 2’s effects may be attributed to the greater note density and off-beat character of the snare and kick.
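For readers less familiar with the measure, mean asynchrony is the average of the signed differences between stroke onsets and the corresponding reference onsets, so that a negative value (NMA) indicates playing ahead of the reference on average. The short Python sketch below shows this calculation under stated assumptions: the function names, the 625 ms beat spacing, and the onset values are hypothetical and are not taken from the experiment.

```python
from statistics import mean


def signed_asynchronies(onsets_ms, reference_ms):
    """Return onset minus reference for each stroke (negative = early)."""
    if len(onsets_ms) != len(reference_ms):
        raise ValueError("each onset needs exactly one reference time")
    return [onset - ref for onset, ref in zip(onsets_ms, reference_ms)]


def mean_asynchrony(onsets_ms, reference_ms):
    """Return the mean signed asynchrony in milliseconds."""
    return mean(signed_asynchronies(onsets_ms, reference_ms))


# Hypothetical reference beat times and drum stroke onsets (ms).
reference = [0.0, 625.0, 1250.0, 1875.0]
strokes = [-12.0, 610.0, 1241.0, 1860.0]

result = mean_asynchrony(strokes, reference)
label = "early (NMA)" if result < 0 else "late or exactly on the reference"
print(f"mean asynchrony: {result:.2f} ms, {label}")
```

Keeping the sign of each difference, rather than taking absolute values, is what allows early and late tendencies to be distinguished in the average.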
Overall, this study’s findings show further evidence that the production of “timing” in groove-based music involves more than simple onset asynchrony relations between instruments and/or timing references. While, in terms of effect sizes, onset manipulation may still seem to be the salient vehicle for drummers when expressing different timing feels in groove-based music, we should not ignore the durational and dynamic nuances of strokes played with onset asynchronies. We might thus consider that more specific terms such as “timing-sound styles” or “microrhythmic feels” may better convey the potentially wide range of acoustic features involved in the production of performed rhythmic events, especially since “timing” is so strongly associated with “onset” that the two are virtually synonymous. However, “timing” still holds strong currency within music performance communities and scholarly institutions, and therefore it may also be possible to simply expand the concept of “timing” to encompass the concomitant manipulation of temporal and sound features.
Regardless of what we call this phenomenon, future empirical studies concerned with measuring and interpreting the role of timing expression in music would do well to consider also the sonic shape of rhythmic events, not to mention the aesthetic and stylistic contexts in which they are produced. Music psychological studies that seek to measure “groove” ratings (operationalized as the aspect of the music that elicits the “urge to move” and/or pleasure) and that have run into conflicting results regarding the role of microtiming, for example, might consider manipulating not only onset but also intensity and duration of instrumental stimuli, preferably with baseline conditions that resemble timing/sound profiles from actual musical performance examples as closely as possible. Music producers concerned with “humanizing” computer-programmed grooves with artificial onset asynchrony manipulations that may otherwise sound mechanical or deadpan may also benefit from scholarly studies that further categorize or model intensity and durational profiles obtained from real performances. As Albhy Galuten, producer of the Bee Gees’ seminal disco-funk track “Stayin’ Alive” from 1977, put it, “Everyone knows that it’s more about feel than accuracy in drum tracks.” Though he used a drum loop in the hit’s construction, it was extracted from a real performance, so, he insisted, “it felt really great, very insistent but not machinelike…[i]t had a human feel” (quoted in Grogan, 2018, p. 128). In the opinion of the present authors, whatever creates the particular aesthetic appeal that actual human performance elicits in listeners (that ephemeral rhythmic “feel” of groove-based music) must surely involve both the onset microtiming profiles of instruments and the manner in which they are “texturally” shaped.
In future research, we plan to conduct perception experiments to determine whether listeners are able to identify drum strokes as laid-back, on-beat, or pushed on the basis of their sound alone. In addition, we recorded motion-capture data during this experiment that we will analyze to determine whether drummers also systematically utilize different movement trajectories to produce the differences in duration and intensity we observed in this study.
The authors wish to thank all the drummers for their participation; Georgios Sioros, Victor Gonzáles Sánchez, and Martin Torvik Langerød (University of Oslo, Norway) for their assistance with the experimental setup; Carl Haakon Waadeland (Norwegian University of Science and Technology, Norway) for his help with the experimental design and recruitment of participants; Dag-Erik Eilertsen (University of Oslo, Norway) for his assistance with SPSS scripts for the statistical analysis; Justin London (Carleton College, USA) and Olivier Senn (Hochschule Luzern, Switzerland) for their comments on earlier versions of this manuscript. We are also grateful to the anonymous reviewers for their interesting and valuable comments and suggestions. This work was partially supported by the Research Council of Norway through its Centers of Excellence scheme, project number 262762, and the TIME (Timing and Sound in Musical Microrhythm) project, grant number 249817.
While early studies found either negative or negligible effects of onset asynchronies on subjective ratings of various assumed “groove” features (Davies, Madison, Silva, & Gouyon, 2013; Frühauf, Kopiez, & Platz, 2013; Madison, Gouyon, Ullén, & Hörnström, 2011; Madison & Sioros, 2014), recent studies have found that stimuli with microtiming profiles derived directly from, or resembling those of, original performed music do not necessarily obtain lower ratings than stimuli with artificially reduced onset asynchronies (Kilchenmann & Senn, 2015; Senn, Kilchenmann, von Georgi, & Bullerjahn, 2016; Senn, 2017; Skaansar, Laeng, & Danielsen, 2019).
The mirevents algorithm, with both the “Waveform” and “Slope” options, is available in the MIR Toolbox audio analysis package [ver. 1.8] (Lartillot, Toiviainen, & Eerola, 2008).
In contrast to Danielsen and colleagues (2015), we chose not to investigate the brightness of drum strokes via spectral centroid (SC) since we opted to use contact microphones to capture the snare and hi-hat signals. While contact microphones provide recordings from which superior stroke onset and SPL information can be extracted due to better signal source isolation, they do not reproduce the timbre of drum sounds as faithfully as dynamic microphones do. As such, any comparisons of their perceptual brightness via SC would not be entirely ecological without further investigation.