Two experiments investigated perceptual and emotional consequences of note articulation in music by examining the degree to which participants perceived notes to be separated from each other in a musical phrase. Seven-note piano melodies were synthesized with staccato notes (short decay) or legato notes (gradual/sustained decay). Experiment 1 (n = 64) addressed the impact of articulation on perceived melodic cohesion and perceived emotion expressed through melodies. Participants rated melodic cohesion and perceived emotions conveyed by 32 legato and 32 staccato melodies. Legato melodies were rated more cohesive than staccato melodies and perceived as emotionally calmer and sadder than staccato melodies. Staccato melodies were perceived as having greater tension and energy. Experiment 2 (n = 60) addressed whether articulation is associated with humor and fear in music, and whether the impact of articulation depends on major vs. minor mode. For both modes, legato melodies were scarier than staccato melodies, whereas staccato melodies were more amusing and surprising. The effect of articulation on perceived happiness and sadness was dependent on mode: staccato enhanced perceived happiness for minor melodies; legato enhanced perceived sadness for minor melodies. Findings are discussed in relation to theories of music processing, with implications for music composition, performance, and pedagogy.
The capacity of music to convey a range of different emotions has been the subject of extensive psychological investigation (Juslin & Sloboda, 2011). Research has identified a range of musical attributes, such as tempo, dissonance, consonance, timbre, pitch, contour, tonality, and musical complexity, that collectively convey emotional meaning to listeners. However, in the field of music psychology, there is relatively little empirical research on articulation, an important musical device used by performers for expressive purposes.
Articulation is a fundamental component of musical expression and dictates how performers play the beginning and end of individual notes, and how much silence is heard between successive notes (Handel, 1989; Juslin & Laukka, 2003; Lawson & Stowell, 1999; Snyder, 2000). The two most common categories of articulation are staccato and legato. With legato articulation, the sound of each note finishes near or after the onset of the next note, such that the notes are connected and overlapping with each other (Repp, 1999). As a result of the overlap between successive notes, the onset characteristics of legato notes are less apparent, and tend to sound gradual and smooth even for instruments that have fixed onset characteristics, such as piano. With staccato articulation, notes are separated from each other by clearly perceived gaps, which are achieved by shortening the duration of each note (Mathiesen et al., 2020). Hence, staccato notes are perceived as having sharp, abrupt onset characteristics and also sound relatively short—often shorter than their notated duration (Repp, 1999). Articulation is among the most basic expressive devices employed by musicians during music performance, yet its perceptual and emotional effects are largely unknown. It is this gap in the music perception literature that the present study aimed to address.
Perceptual Consequences of Articulation
Articulation can significantly change the way a melodic sequence is perceived and experienced by listeners. If a melody is played with staccato articulation, it sounds like a series of distinct notes, yet if the melody is played with legato articulation, it sounds like the notes belong to one unified phrase (Chiat & Ying, 2013; Collins & Schmuckler, 1997). The present study posits that articulation influences the ease with which musical phrases are perceived as a coherent stream of notes (Huron, 2001). When acoustic input is not readily analyzed by the auditory system, listening becomes more effortful (Peelle, 2018). The cognitive demands required to process staccato articulation compared to legato articulation are described by auditory scene analysis: how the auditory system organizes acoustic input arising from multiple sources into a set of distinct auditory streams (Bregman, 1990).
In a musical context, acoustic input from multiple sources converges at the basilar membrane but is then segregated by the auditory system into distinct sources based on cues such as pitch, loudness, timbre, spatial proximity, and temporal proximity, as well as their immediate context (Bregman, 1990; Snyder & Alain, 2007). Of special importance to articulation is temporal proximity: notes that are proximal in time are more likely to be grouped together as part of the same auditory stream than notes that are separated from each other by a period of silence (Wright & Bregman, 1987). When the notes of a melody are contiguous or overlapping with one another, they are readily grouped together and the melody is perceived as a coherent phrase (Van Noorden, 1975). When gaps are inserted between notes, however, they tend to be perceived as a sequence of isolated events, rather than one coherent stream (Bregman & Dannenbring, 1973). Thus, a phrase of legato notes, which are temporally contiguous with one another, should be judged as more coherent compared to a phrase with staccato notes, which are separated from one another by brief periods of silence. This hypothesis was evaluated in Experiment 1.
Legato melodies should also be less effortful to perceive than staccato melodies and hence preferred over them. Reber et al. (2004) proposed that perceptual fluency, defined as the ease with which stimuli are processed in the brain, is correlated with preference (Reber et al., 1998). If legato melodies are processed with higher perceptual fluency, then listeners should provide higher enjoyment ratings for legato than for staccato melodies. Supporting this hypothesis, Mathiesen et al. (2020) manipulated legato and staccato articulation in melodic arpeggios using a digital piano software instrument. Participants rated their liking for each arpeggio and assigned higher liking ratings to legato arpeggios than to staccato arpeggios. This finding suggests that legato articulation is preferred over staccato in basic arpeggios. However, these findings are specific to melodic arpeggios. Most melodies, in contrast, move in stepwise motion, with notes relatively close to one another in pitch. The question therefore remains as to whether the effects of articulation observed for arpeggios will also be observed for more conventional melodic materials that include a variety of interval sizes. Experiment 1 was accordingly designed to assess the hypothesis that participants will provide higher liking ratings for legato than for staccato melodies.
Emotional Consequences of Articulation
Music is a powerful means of emotional communication (Juslin & Laukka, 2003) and performers commonly use articulation to convey an emotional impression (Juslin & Sloboda, 2011). In one study, Gabrielsson and Juslin (1996) asked professional instrumentalists to perform melodies to convey specific emotions. They found that “happy,” “fearful,” and “angry” emotions were typically performed with staccato articulation, whereas “sad” and “tender” emotions were performed mostly with legato articulation. Juslin and Laukka (2003) conducted a meta-analysis of emotion-related music performance research and found that among the few studies that had investigated articulation, staccato was typically associated with happiness, anger, and fear, whereas legato was more associated with sadness and tenderness. Similarly, Juslin and Lindström (2010) found that sadness and tenderness were associated with legato articulation, whereas fear was associated with staccato articulation. The main limitation of these studies is that articulation was not systematically manipulated; rather, it was categorized and analyzed post hoc. The present study builds on this work by using one set of melodies, presenting each melody with either legato or staccato articulation, and measuring the effect of articulation on listeners’ perception of emotion conveyed through the music.
Perceived emotions conveyed through music are often measured using dimensional or discrete categorical models (Eerola & Vuoskoski, 2011), and Experiment 1 considered both models to assess perceived emotion in legato and staccato melodies. The circumplex model of affect is a dimensional model that includes two dimensions of emotion space: valence and arousal (Russell, 1980). The two axes can be used to create four quadrants in which specific emotions are categorized. The dimension of valence ranges from positive to negative affect (Juslin & Sakka, 2019), whereas the dimension of arousal ranges from high to low energy. Discrete models, in contrast, identify emotions categorically, such as anger, fear, joy, sadness, and tenderness (Juslin & Laukka, 2003). It was predicted in Experiment 1 that melodies with legato articulation should be perceived as expressing low energy, as reflected by high ratings of sadness, calmness, and fear, whereas melodies with staccato articulation should be perceived as expressing high energy, as reflected in high ratings of joy, anger, tension, and energy.
Emotional Consequences of Articulation and Mode
In the Western tonal tradition, music is typically composed in either a major or minor key, or mode, and there is an abundance of evidence showing that mode is a powerful emotional cue in music. For example, the major mode is associated with happiness and minor mode with sadness (e.g., Husain et al., 2002; Juslin & Lindström, 2010). Furthermore, musical cues such as tempo and pitch contour also play a role, with fast tempos communicating high arousal emotions such as perceived happiness, and pitch contour often communicating perceived emotional valence more generally (Schubert, 2001). When compared to other musical attributes such as tempo, musical mode was ranked as the most important musical cue for listeners’ perception of happiness and sadness conveyed by music (Eerola et al., 2013).
Experiment 2 in the present study was designed to investigate whether the hypothesized effects of articulation on perceived emotion vary as a function of mode. For example, staccato is likely to add tension to a melody. Tension in the context of music and emotion is often considered in light of the “tension-arousal” dimension in models of affect (e.g., Schimmack & Grob, 2000) and may arise from a lack of coherence in staccato melodies, where the disconnected quality of notes requires greater cognitive resources to build a complete representation of the melodic sequence (cf. discussion of “surface tension” in Lerdahl & Krumhansl, 2007). This increase in cognitive effort may increase tension relative to legato melodies. Gaps between notes in staccato melodies may also increase expectancy for subsequent notes, and increased expectancy often leads to an increase in tension (Huron, 2006).
The addition of tension in staccato melodies may elicit different emotional effects depending on mode. Adding tension to a melody composed in the major mode (i.e., a “happy” melody) may lighten the mood of the music and give rise to emotions such as amusement or humor. Conversely, adding tension to a melody composed in the minor mode (i.e., a “sad” melody) may darken the mood and give rise to emotions such as fear and agitation.
Why should staccato articulation increase tension and promote emotions such as humor or fear? There are three possibilities. First, by introducing gaps between notes in a melody, the processes of auditory streaming may be burdened and perceptual fluency reduced. As a result, listeners may experience feelings of uncertainty and tension. Second, the occurrence of short, sharp notes separated by silence in staccato melodies may be attention-grabbing, in the same way the word “boo!” might capture attention if vocalized within a period of silence. Third, the highly articulated nature of notes in staccato melodies may simulate the punctuated vocalized exhalations that are typical of human laughter (“ha-ha-ha”), acting as an iconic semiotic sign of humor (Trevor & Huron, 2019). If composers use staccato articulation to simulate human laughter, it is plausible that staccato melodies are sometimes perceived by listeners as conveying more amusement than legato melodies.
Within the field of music psychology, there is minimal research on musical humor, even though music often evokes amusement and laughter (Huron, 2004). Amir (2005) found that incongruity, exaggeration, and the occurrence of unexpected events were associated with musical humor. Huron (2004) also noted that the most common strategy music satirist Peter Schickele used to evoke humor was to violate listener expectations, which he achieved by misquoting well-known music, as well as playing unusual instruments like kazoos within an orchestra. Huron (2004) concluded that an unpredictable event in the music can elicit surprise and subsequent laughter. The non-overlapping notes in staccato articulation may also be less predictable than legato notes, and this may play a role in subverting expectations and the subsequent experience of surprise.
The Present Study
Two experiments were conducted to investigate the effect of articulation on perceived melodic coherence, liking, and emotion conveyed through music. In Experiment 1, 32 short monophonic synthetic piano melodies with either legato or staccato articulation were presented to listeners, who rated melodies on perceived coherence and a range of emotional connotations. Experiment 2 extended this design by presenting melodies that varied in both articulation and mode. Experiment 2 examined whether the impact of articulation differs for melodies composed in a major or minor key and considered emotional effects of articulation not tested in Experiment 1.
Experiment 1: Articulation, Cohesion, and Emotional Meaning
The aim of Experiment 1 was to investigate the perceptual and emotional impact of articulation using 32 seven-note melodies. Two key questions were addressed: First, does articulation influence the perceived cohesion of a melody—the degree to which the auditory system can group acoustic information into one auditory stream? Second, can articulation be used to convey emotional meaning in music? Experiment 1 was a within-subjects design in which 32 melodies were presented to participants under two conditions of articulation: a staccato (high articulation) version and a legato (low articulation) version. The 64 resultant melodies were presented in a random order to participants. The dependent variables were: A) ratings of cohesion (“the extent to which the tones of a melody sound as though they create an organised whole”; Russo et al., 2015, pp. 100–101); B) ratings of seven perceived emotions (sadness, calmness, fear, joy, anger, tension, and energy); and C) a rating of liking.
Given existing theory and research, it was hypothesized that, compared to staccato articulation, melodies played with legato articulation should be perceived as more cohesive (H1), but lower in emotional energy. Specifically, low emotional energy in response to legato melodies should be reflected in high ratings of sadness, calmness and fear, but low ratings of energy, tension, joy, and anger (H2). It was also hypothesized that melodies with legato articulation should receive higher ratings of liking than melodies with staccato articulation (H3).
Method
Participants
Sixty-four participants at Macquarie University, Australia, were recruited from the first-year Psychology recruitment portal (49 females, 14 males, one non-binary; mean age = 23.4 years, SD = 10.3, range: 17–62 years). Participants completed the task for course credit and all but one reported normal hearing (one participant reported a minor impairment but was not removed). The mean duration of formal instrumental music training was 3.53 years (SD = 4.22), with 26 participants indicating no formal music training. The minimum sample size required for adequate power was calculated using G*Power (Version 3.1.9.6). For a small-to-moderate effect size of .25, a minimum sample size of N = 54 was recommended when power was set at .95 and alpha set to .05. The effect size was set at this more conservative value given the lack of research on articulation, and the possibility that such a manipulation may result in somewhat small differences in listeners’ perceptions.
Stimuli
A set of 32 melodies, each consisting of seven notes, was used with either legato or staccato articulation, yielding 64 melodic stimuli. The 32 melodies were adapted from Cuddy et al. (1981), who generated the set from a prototypical melody that consisted of notes from the diatonic C major scale. This melody implied an ascending C major chord and a descending dominant 7th chord before returning to the tonic (C-E-G-F-D-B-C), which is overall a tonic-dominant-tonic (I-V-I) harmonic progression. To create 31 variations, Cuddy et al. (1981) progressively replaced the notes of the prototypical melody in a way that altered structural elements such as contour, implied harmony, and diatonicism, yielding 32 melodies that ranged widely in perceived tonal structure (where high ratings of tonal structure represented melodies with musical “keyness” and “completeness,” and low ratings represented melodies with unexpected or jarring notes). The 31 variations of the original melody were constructed by manipulating conventional rules of Western tonal music, such that the final set ranged from the simple and prototypical melody to melodies that deviated extensively from Western tonal structure (see Figure 2 in Cuddy et al., 1981, for melody notations).
The present study transcribed the 32 melodies into the Sibelius 7 notation program, which synthesized each melody using the software’s default piano sound to create audio files. The tempo of all melodies was fixed at 175 beats per minute. The internote onset interval was 350 ms for both staccato and legato melodies, and the duration of each melody totalled 2.45 s. Each of the 32 melodies was created with either high articulation (staccato) on each note or low articulation (legato) on each note. For the staccato manipulation, the duration of each note within the 350 ms internote onset interval was limited to 70 ms. This was 20 percent of the prescribed length of the legato notes, which played for the entire 350 ms. Figure 1 provides a visual representation of the waveform and intensity profile of Melody #1. This visualization shows how the relatively fast decay of notes in staccato melodies (left panels) gives rise to fluctuations of intensity across a range of approximately 30 dB, whereas the relatively smooth transitions of legato notes (right panel) “blend” into each other and result in melodies that fluctuate over a relatively small range of intensity.
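The timing figures reported above can be checked with a short calculation. The following is a minimal sketch in Python, not the procedure used to build the stimuli in Sibelius: the function name `note_schedule` and the constants are illustrative, and the "gap" is the nominal silence implied by the notated durations, ignoring any decay or release tail of the synthesized piano sound.

```python
# Timing values are taken from the Stimuli section; all names are illustrative.
IOI_MS = 350        # internote onset interval for both articulations
N_NOTES = 7         # notes per melody
STACCATO_MS = 70    # nominal sounded duration of each staccato note
LEGATO_MS = 350     # legato notes fill the whole onset interval


def note_schedule(sounded_ms, ioi_ms=IOI_MS, n_notes=N_NOTES):
    """Return (onset, offset, nominal_gap) in ms for each note of a melody."""
    schedule = []
    for i in range(n_notes):
        onset = i * ioi_ms
        offset = onset + sounded_ms
        nominal_gap = ioi_ms - sounded_ms  # silence before the next onset
        schedule.append((onset, offset, nominal_gap))
    return schedule


staccato = note_schedule(STACCATO_MS)
legato = note_schedule(LEGATO_MS)

total_ms = N_NOTES * IOI_MS              # melody duration in ms
staccato_ratio = STACCATO_MS / LEGATO_MS  # staccato length relative to legato
staccato_gap_ms = staccato[0][2]          # nominal silence between staccato notes
legato_gap_ms = legato[0][2]              # legato notes are contiguous

print(total_ms / 1000, staccato_ratio, staccato_gap_ms, legato_gap_ms)
```

Under these assumptions, seven onsets at 350 ms give the 2.45 s melody duration, each staccato note occupies 20 percent of its onset interval, and the remaining 280 ms of each interval is nominally silent.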
Listener Response Measures
Cohesion Scale
Participants were instructed as follows: “Please rate your perception of the overall cohesion of the musical melody. A cohesive melody is a melody with tone sequences that are perceived to hang together as a unified whole, rather than a series of individual tones.” Experiment 1 used the same 7-point scale as Russo et al. (2015), where 1 = not cohesive, 4 = moderately cohesive, and 7 = very cohesive.
Perception of Musical Emotion Scale
Five emotion scales from the Geneva Emotional Music Scale (GEMS, Zentner et al., 2008) were selected. The GEMS items selected were calmness, sadness, joy, tension, and energy, as well as additional items of fear and anger (cf. Thompson et al., 2019). In a two-dimensional space comprising valence (positive, negative) and energy arousal (low, high), sadness, tension, and fear are often considered as negative affect/low energy, calmness as positive affect/low energy, and joy as positive affect/high energy (see Eerola & Vuoskoski, 2011). Anger can occur with different levels of arousal but often instantiates negative affect and high energy (Juslin & Laukka, 2003; Russell & Mehrabian, 1974; Zajenkowski, 2017). Participants were asked to rate these seven emotion scales on 7-point Likert scales measuring the emotions they perceived rather than the emotions they experienced. The instructions read: “Music often conveys different feelings and emotions. Please rate how well the musical melody conveys each of the following emotions” (1 = does not describe the emotion conveyed at all, and 7 = describes the emotion conveyed very well).
Preference Scale
Each participant was asked to rate how much they liked each melody using the same scale as reported in Mathiesen et al. (2020), who found that legato arpeggios were liked more than staccato arpeggios. The question read: “How much did you like the melody?” (1 = not at all, 4 = like moderately, and 7 = like extremely).
Procedure
The study was administered online using the Qualtrics online survey platform. Participants used their own electronic devices and headphones or speakers from an external location outside of the university laboratory. At the beginning of the survey, participants provided informed consent and answered demographic questions about age, sex, years of formal instrumental music training, if they experienced hearing impairment, and whether they described themselves as a musician.
All participants completed four unrandomized practice trials before starting the experimental trials. Legato melodies were presented in the first and third practice trials, and staccato melodies were presented in the second and fourth practice trials. The four melodies used in the practice trials were modified versions (one or two pitch changes) of four of the original 32 melodies and were not included in the main experiment. On each trial, participants heard a melody and provided nine responses (1 x cohesion, 7 x emotion, 1 x liking) before moving to the next trial. The nine scales were presented in the following order: cohesion, tension, energy, joy, sadness, fear, anger, calmness, and liking. Participants were permitted to replay the melodies during each trial as many times as they needed to; however, they could not progress until they had completed all nine scales. The 64 trials (32 staccato and 32 legato) were presented in a randomized order, without replacement. The experiment took approximately 30 minutes to complete.
Statistical Approach
In the primary analysis, ratings for each scale were collapsed across the set of 32 melodies. That is, for each participant, mean ratings for each scale were calculated separately for the 32 legato melodies and the 32 staccato melodies. This resulted in 18 mean ratings per participant (nine scales x two articulation conditions). The 1152 mean ratings (18 mean ratings x 64 participants) were then analyzed in Stata (Version 17) using nine one-way repeated-measures analyses of variance (ANOVAs), one for each scale. Prior to these analyses, z-scores were calculated to evaluate whether there were statistical outliers within each of the nine dependent variables. Using the criterion set out in Tabachnick and Fidell (2019), two cells in Experiment 1 had z-scores either greater than 3.29 or less than -3.29 and were adjusted to be 0.1 scale units greater than or less than the next most extreme score. Partial eta squared values were used as estimates of effect size, where .01 is considered a small effect, .09 a medium effect, and .25 a large effect (Field, 2018).
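The steps in this analysis pipeline can be illustrated with a small sketch. The study ran its analyses in Stata, so the Python below is only an illustration under stated assumptions: the data layout, the function names (`participant_means`, `adjust_outliers`, `rm_anova_two_levels`), and the use of the sample standard deviation for z-scores are all hypothetical choices, not details given in the text. It relies on the fact that a one-way repeated-measures ANOVA with only two levels yields an F(1, n - 1) equal to the squared paired-samples t statistic.

```python
from statistics import mean, stdev


def participant_means(trials):
    """Collapse one participant's trials into a mean per (scale, articulation).

    `trials` is a list of (scale, articulation, rating) tuples (assumed layout).
    """
    grouped = {}
    for scale, articulation, rating in trials:
        grouped.setdefault((scale, articulation), []).append(rating)
    return {key: mean(ratings) for key, ratings in grouped.items()}


def adjust_outliers(values, z_crit=3.29, step=0.1):
    """Move scores with |z| > z_crit to `step` units beyond the next most extreme score."""
    m, sd = mean(values), stdev(values)  # sample SD is an assumption
    flagged = [abs((v - m) / sd) > z_crit for v in values]
    kept = [v for v, out in zip(values, flagged) if not out]
    upper, lower = max(kept) + step, min(kept) - step
    return [(upper if v > m else lower) if out else v
            for v, out in zip(values, flagged)]


def rm_anova_two_levels(cond_a, cond_b):
    """Two-level one-way repeated-measures ANOVA via the paired t statistic.

    Returns (F, df_effect, df_error); inputs are per-participant means.
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t * t, 1, n - 1
```

With 64 participants, `rm_anova_two_levels` returns error degrees of freedom of 63, matching the F(1, 63) form of the tests reported in the Results.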
Results
Perceived Cohesion
As shown in Figure 2, there was a significant difference in perceived cohesion between melodies presented with legato or staccato articulation, F(1, 63) = 57.81, p < .001. Participants perceived the legato melodies (M = 4.54, SD = 1.00) to be significantly more cohesive than the staccato melodies (M = 3.00, SD = 1.18). This result supports H1.
Perceived Emotion
In relation to H2, mean ratings of calmness, sadness, and fear were numerically higher for legato melodies than staccato melodies. These differences were statistically significant for calmness, F(1, 63) = 95.39, p < .001, and sadness, F(1, 63) = 46.10, p < .001, but not for fear, F(1, 63) = 0.61, p > .05. Conversely, ratings were significantly higher for staccato than legato melodies for energy, F(1, 63) = 3.94, p = .05, and tension, F(1, 63) = 11.84, p = .001. However, there were no significant differences in ratings of staccato and legato melodies for anger, F(1, 63) = 1.42, p > .05, or joy, F(1, 63) = 1.71, p > .05. Thus, H2 was partially supported. Notably, ratings of anger (M = 1.89, SD = 0.92) were numerically the lowest means of all dependent variables, suggesting that anger was not strongly conveyed by the melodies.
Ratings of Liking
As shown in Figure 2, participants indicated a significantly greater liking for legato articulation (M = 3.86, SD = 0.98) than staccato articulation (M = 3.00, SD = 1.03), F(1, 63) = 52.72, p < .001, thus supporting H3.
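Because partial eta squared was the effect size measure named in the Statistical Approach, it may help to show how it relates to a reported F statistic. For a single effect, partial eta squared equals SS_effect / (SS_effect + SS_error), which can be rewritten as (F x df_effect) / (F x df_effect + df_error). The sketch below applies that identity to the F values reported in this section; the function name is illustrative, and the resulting numbers are recomputed from rounded F values rather than taken from the study's own output, so they should be read as approximations.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F(1, 63) values as reported in the Results of Experiment 1.
reported_f = {
    "cohesion": 57.81,
    "calmness": 95.39,
    "sadness": 46.10,
    "fear": 0.61,
    "energy": 3.94,
    "tension": 11.84,
    "anger": 1.42,
    "joy": 1.71,
    "liking": 52.72,
}

recomputed = {scale: partial_eta_squared(f, 1, 63) for scale, f in reported_f.items()}

for scale, value in recomputed.items():
    print(f"{scale:9s} F(1, 63) = {reported_f[scale]:6.2f}  partial eta squared ~ {value:.2f}")
```

Read against the benchmarks cited from Field (2018), values recomputed this way for cohesion, calmness, sadness, and liking would fall above the .25 threshold for a large effect.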
Secondary Analyses
Examining the Combined Effects of Tonal Structure and Articulation
Given that the tonal structure of the 32 melodies used in Experiment 1 ranged from simple to complex (Cuddy et al., 1981), an additional analysis examined whether the effects of articulation reported above depend on the tonal structure of the melodies. Cuddy et al. asked a group of highly trained professional musicians to rate the perceived tonal structure of all 32 melodies on a 6-point rating scale. In the present study, we conducted a series of regression and correlation analyses, where mean tonal structure ratings reported by Cuddy et al. were entered as predictors of mean ratings of legato and staccato melodies separately. The analysis was conducted to assess the proportion of variance explained by tonal structure for each of the nine dependent variables. As shown in Table 1, there were strong significant positive correlations between tonal structure ratings and mean cohesion ratings for both legato articulation (r = .88) and staccato articulation (r = .86). Indeed, 76% of the variance for cohesion ratings was explained by ratings of tonal structure. As shown in Figure 3, the melodies with high ratings of tonal structure were also rated high in cohesion.
Measure | Legato r | Legato R2 | Legato t | Legato Coef. | Staccato r | Staccato R2 | Staccato t | Staccato Coef. |
---|---|---|---|---|---|---|---|---|
Cohesion | .88 | .78 | 10.40 | .44 | .86 | .74 | 9.14 | .33 |
Anger | −.64 | .42 | −4.61 | −.17 | −.69 | .47 | −5.19 | −.14 |
Calmness | .89 | .79 | 7.12 | .17 | .79 | .63 | −2.19 | −.09 |
Energy | .64 | .40 | 4.51 | .10 | .60 | .36 | 4.07 | .10 |
Fear | −.84 | .70 | −8.35 | −.46 | −.77 | .59 | −6.62 | −.29 |
Joy | .85 | .72 | 8.87 | .47 | .79 | .63 | 7.08 | .31 |
Liking | .86 | .74 | 9.15 | .28 | .79 | .62 | 6.95 | .20 |
Sadness | −.37 | .14 | −2.19 | −.09 | −.42 | .18 | −2.53 | −.06 |
Tension | −.80 | .64 | −7.32 | −.44 | −.76 | .58 | −6.37 | −.26 |
Note. All correlations and regression coefficients except for sadness were significant at p < .001.
The results in Table 1 also indicate that, except for sadness ratings, tonal structure was strongly correlated with mean ratings for the eight remaining dependent variables, with absolute Pearson’s correlation values ranging from .60 to .89. All regression analyses, excluding those for sadness, yielded statistically significant results when tonal structure was entered as the predictor of each of the eight remaining dependent variables (p values < .001).
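The quantities in Table 1 (r, R2, t, and the regression coefficient) all follow from a simple linear regression with one predictor. The sketch below shows one way to compute them across melodies, assuming `tonal` holds the mean tonal structure rating of each melody and `rating` holds the mean listener rating of the same melodies in one articulation condition. The function and variable names are illustrative and this is not the Stata routine used in the study.

```python
from math import sqrt


def simple_regression(x, y):
    """Regress y on x; return r, R squared, slope, and the t statistic (df = n - 2).

    Assumes the fit is not perfect (|r| < 1), otherwise t is undefined.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx  # unstandardized regression coefficient
    t = r * sqrt((n - 2) / (1 - r * r))
    return {"r": r, "r2": r * r, "slope": slope, "t": t, "df": n - 2}


# Toy data for illustration only; these are not values from the study.
tonal = [1, 2, 3, 4, 5]
rating = [2, 4, 5, 4, 5]
print(simple_regression(tonal, rating))
```

With 32 melodies the t statistic has 30 degrees of freedom, and with a single predictor R2 is simply the square of r.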
Finally, to assess whether the association between ratings of tonal structure reported in Cuddy et al. (1981) and each of the nine dependent variables in Experiment 1 differed between articulation conditions, we calculated correlations between ratings of tonal structure and each participant’s ratings of the 32 melodies, separately for the legato and staccato conditions. This yielded one Pearson’s correlation coefficient per participant for each condition, and these coefficients were averaged across participants. We then tested for differences in correlation coefficients between legato and staccato melodies using a series of paired-sample t-tests, one for each of the nine dependent variables. Table 2 shows that ratings of tonal structure presented in Cuddy et al. (1981) were positively correlated with perceived cohesion, calmness, energy, joy, and liking, and negatively correlated with anger, fear, sadness, and tension. Paired-sample t-tests revealed that the positive correlations between ratings of tonal structure and perceived cohesion, calmness, and joy were significantly stronger for legato melodies than for staccato melodies. Moreover, the negative correlations between ratings of tonal structure and perceived fear and tension were significantly stronger for legato melodies than for staccato melodies.
Measure | Legato mean r (SD) | Staccato mean r (SD) | t | 95% CIs | p |
---|---|---|---|---|---|
Cohesion | .36 (.31) | .29 (.26) | 2.48 | .01, .13 | .016 |
Anger | −.18 (.22) | −.15 (.22) | −.97 | −.09, .03 | .336 |
Calmness | .36 (.25) | .21 (.23) | 4.93 | .09, .21 | < .001 |
Energy | .10 (.26) | .11 (.22) | −.35 | −.08, .05 | .728 |
Fear | −.39 (.23) | −.27 (.22) | −5.52 | −.16, −.08 | < .001 |
Joy | .37 (.26) | .27 (.26) | 3.82 | .05, .16 | < .001 |
Liking | .25 (.29) | .20 (.23) | 1.37 | −.02, .11 | .177 |
Sadness | −.08 (.22) | −.07 (.20) | −.41 | −.07, .05 | .680 |
Tension | −.34 (.29) | −.21 (.27) | −4.07 | −.18, −.06 | < .001 |
Note. df = 63 for each paired-sample t-test; p values below .05 indicate significant differences in mean correlations between legato and staccato conditions; CIs = confidence intervals.
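The participant-level analysis summarized in Table 2 can be sketched as follows. This is a minimal illustration under assumptions: each participant is represented by a list of 32 ratings per condition, ordered to match the list of tonal structure ratings, and every name below (`pearson_r`, `paired_t`, `compare_conditions`) is hypothetical. Confidence intervals and p values are omitted because they require a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_t(a, b):
    """Paired-sample t statistic and its degrees of freedom."""
    diffs = [u - v for u, v in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def compare_conditions(tonal, legato_by_participant, staccato_by_participant):
    """Correlate tonal structure with each participant's ratings, then compare conditions."""
    r_legato = [pearson_r(tonal, ratings) for ratings in legato_by_participant]
    r_staccato = [pearson_r(tonal, ratings) for ratings in staccato_by_participant]
    t, df = paired_t(r_legato, r_staccato)
    return {
        "mean_r_legato": mean(r_legato),
        "mean_r_staccato": mean(r_staccato),
        "t": t,
        "df": df,
    }
```

Running `compare_conditions` once per dependent variable would give nine comparisons, each with df equal to the number of participants minus one.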
Discussion
Experiment 1 revealed that melodies with legato articulation were perceived as significantly more cohesive than melodies with staccato articulation, thus supporting H1. Such an outcome was predicted by models of auditory streaming, whereby notes that are temporally proximal with one another should be more easily grouped into the same auditory stream than notes that are separated from each other by a period of silence (Bregman, 1990). Conversely, successive notes in staccato melodies are more likely to excite different channels in the peripheral auditory system, reducing the perceived cohesion of these melodies relative to legato melodies. Follow-up analyses also showed that as the melodies increased in perceived tonal structure, ratings of cohesion also increased. This finding illustrates that auditory grouping is influenced not only by temporal proximity (bottom-up principles) but also by the extent to which notes conform to the regularities in Western tonal music (Cuddy et al., 1981).
Experiment 1 revealed that legato melodies were associated with calmness and sadness, whereas staccato melodies were associated with tension and energy. There were no significant differences between staccato and legato melodies for ratings of anger, fear and joy. As hypothesized, these results suggest that legato melodies are perceived as cohesive and associated with low-energy emotions. Additional follow-up analyses showed that tonal structure in the melodies was strongly correlated with mean ratings of six of the seven emotion measures, illustrating the impact of Western tonal structure on emotional response. Taken together, the results indicate that emotional responses are affected by both general processes of auditory streaming as well as processes that depend on learning and enculturation.
Finally, legato melodies were liked to a significantly greater extent than staccato melodies. This result replicates the findings of Mathiesen et al. (2020) with more complex melodic stimuli than the arpeggios used by Mathiesen et al. The finding also suggests that listeners prefer melodies that are easier for the auditory system to process (high perceptual fluency) (Forster et al., 2013).
Experiment 2 was conducted to examine whether the impact of articulation on perceived emotion depends on musical mode. As major and minor modes represent common and contrasting tonal structures in Western music, Experiment 2 evaluated the effects of articulation for these two modes. Most of the melodies in Experiment 1 were in the major mode, so it was unclear whether the effects of articulation observed would also be observed for melodies in the minor mode. For example, minor melodies are generally perceived as sadder than major melodies (Thompson & Olsen, 2021); however, introducing staccato to minor melodies may transform the communication of sadness by infusing melodies with tension and energy, resulting in a new emotional connotation, such as fear or anger. Conversely, when tension and energy from staccato are added to a melody in the major mode, an otherwise simply happy-sounding melody might be transformed into a melody that conveys amusement or surprise. Experiment 2 assessed these predictions.
Experiment 2: Articulation, Emotion, and Musical Mode
The aim of Experiment 2 was to investigate whether the emotional effects of articulation vary between melodies composed in major and minor modes. Experiment 2 featured a 2 × 2 within-subjects design with articulation (legato, staccato) and mode (major, minor) as independent variables. It was hypothesized that higher levels of articulation (staccato) should infuse the music with increased tension and energy, as reported in Experiment 1. However, this infusion of tension and energy was predicted to elicit different emotional effects depending on whether the melodic structure has a positive valence connotation (major mode) or a negative valence connotation (minor mode). In particular, an increase in perceived energy from staccato articulation should enhance ratings of happiness, amusement, and surprise for melodies in the major mode (H1), but should increase ratings of tension and fear for melodies in the minor mode (H2).
Method
Participants
Participants were recruited from the first-year Psychology recruitment portal at Macquarie University, Sydney. They comprised 60 participants (52 females, seven males, one preferred not to say; mean age = 24.5 years, SD = 10.2, age range: 18–58 years). All participants reported normal hearing and completed the task for course credit. The mean duration of formal instrumental music training was 2.92 years (SD = 4.13), with 29 participants indicating they had no formal music training.
Stimuli
First, 24 of the 32 melodies used in Experiment 1 were selected for the major mode condition in Experiment 2. The other eight melodies from Experiment 1 were excluded from Experiment 2 because the tonal structure was ambiguous and clearly not of the major mode. For the minor mode condition, pitches of selected notes in the 24 major mode melodies, which were composed in the key of C major, were shifted down by one semitone. To convert the melodies to C minor, all B notes were changed to B flat, E notes to E flat, and A notes to A flat. Next, 12 pairs of major and minor melodies were selected for inclusion in the experiment on the basis that they represented the clearest contrast in mode. This assessment was made through a pilot study in which ten participants who did not complete the main experiment rated happiness (positive valence) and sadness (negative valence) in response to the major and minor versions of each melody on 7-point Likert scales. The results of this pilot study guided the decision to select the major and minor pairs derived from Cuddy et al. (1981) melodies #2-9 inclusive, #13, #16, #19 and #21. For each selected melody, mean ratings of happiness were higher for the major mode version than the minor mode version, and mean ratings of sadness were higher for the minor mode version than the major mode version. Figure 4 illustrates the notation for each major and minor version of the final 12 melodies.
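The major-to-minor conversion described above can be expressed compactly. The following is a minimal sketch, assuming melodies are stored as lists of MIDI note numbers (an assumption; the text does not say how the stimuli were encoded). The function name `to_c_minor` and the example melody are hypothetical, and the melody is not one of the Cuddy et al. (1981) stimuli.

```python
# Pitch classes (semitones above C) that are lowered by one semitone
# when converting C major to C minor: E -> E flat, A -> A flat, B -> B flat.
LOWERED_PITCH_CLASSES = {4, 9, 11}

def to_c_minor(midi_notes):
    """Return a copy of a C major melody with every E, A, and B lowered by a semitone."""
    return [n - 1 if n % 12 in LOWERED_PITCH_CLASSES else n for n in midi_notes]

# Hypothetical seven-note melody: C4 E4 G4 C5 B4 A4 G4 (MIDI 60 = C4).
major_melody = [60, 64, 67, 72, 71, 69, 67]
minor_melody = to_c_minor(major_melody)
print(minor_melody)  # [60, 63, 67, 72, 70, 68, 67]
```

All other pitches (C, D, F, G) are left unchanged, so the contour and rhythm of each melody are preserved across modes.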
Measures
Six emotion rating scales were used: happiness, sadness, amusement, surprise, tension and scariness. In a two-dimensional space comprising valence (positive, negative) and energy arousal (low, high), sadness, scariness and tension are often associated with negative affect/low energy arousal, whereas happiness, amusement and surprise are often characterized by positive affect/high energy arousal (Eerola & Vuoskoski, 2011). Each emotion was explicitly named in the scale anchors. For example, the anchors for happiness ratings were: 1 = not at all happy, 4 = moderately happy, and 7 = extremely happy. Tension and scariness scales were included to measure low energy arousal coupled with negatively valenced emotions in the minor mode condition.
Procedure
Experiment 2 used the same consent form, demographic questions, and practice trials as Experiment 1. Experiment 2 featured six rating scales per trial, presented in the following order: happy, sad, amusing, surprising, tense, and scary. Every participant listened to all four versions (major legato, major staccato, minor legato, and minor staccato) of the 12 melodies. The 48 melodies were presented in random order. The experiment took approximately 20 minutes to complete.
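The fully crossed, randomized trial list described above can be sketched as follows. This is an illustration only: the function name `build_trial_list`, the dictionary keys, and the integer melody identifiers are hypothetical, and the software actually used to present the stimuli is not specified here.

```python
import itertools
import random

def build_trial_list(n_melodies=12, seed=None):
    """Cross every melody with mode and articulation, then shuffle the trials."""
    modes = ("major", "minor")
    articulations = ("legato", "staccato")
    trials = [
        {"melody": melody, "mode": mode, "articulation": articulation}
        for melody, mode, articulation in itertools.product(
            range(1, n_melodies + 1), modes, articulations
        )
    ]
    rng = random.Random(seed)  # seeded generator so an order can be reproduced
    rng.shuffle(trials)
    return trials

trials = build_trial_list(seed=1)
print(len(trials))  # 48 trials: 12 melodies x 2 modes x 2 articulations
```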
Statistical approach
A two-way repeated-measures ANOVA was performed in Stata. Six mean emotion ratings for each condition per participant were recorded, totalling 1,440 cell means (60 participants × 4 conditions × 6 ratings). The same outlier criteria and data transformation that were used in Experiment 1 were applied to five outlying values in Experiment 2. Partial eta squared values were used as estimates of effect sizes, where .01 is considered a small effect size, .09 a medium effect size, and .25 a large effect size (Field, 2018).
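For orientation, partial eta squared can be recovered from a reported F value and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity and the benchmarks quoted above. It is not the authors' Stata analysis: the function names are hypothetical, and values recomputed from rounded F statistics may differ from the tabled values by about .01.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def effect_size_label(eta_p2):
    """Label an effect using the benchmarks quoted in the text (.01, .09, .25)."""
    if eta_p2 >= 0.25:
        return "large"
    if eta_p2 >= 0.09:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "below small"

# F(1, 59) values for the main effect of articulation, as reported in Table 3.
for emotion, f_value in [("Happiness", 5.07), ("Amusing", 48.21), ("Scariness", 9.51)]:
    eta_p2 = partial_eta_squared(f_value, df_effect=1, df_error=59)
    print(f"{emotion}: {eta_p2:.2f} ({effect_size_label(eta_p2)})")
```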
Results
Main Effects of Articulation and Mode
We first examined the effect of articulation and mode separately. Figure 5 displays mean ratings of the six emotions for melodies with legato and staccato articulation (top panel) and melodies in the major and minor mode (bottom panel). As shown in Table 3, ratings of sadness and scariness were significantly higher in response to legato melodies than staccato melodies. On the other hand, ratings of amusement, happiness, and surprise were significantly higher in response to staccato than legato melodies. There were no significant differences between legato and staccato melodies for ratings of tension.
Emotion | Articulation: F(1, 59) | Articulation: p | Articulation: ηp² | Mode: F(1, 59) | Mode: p | Mode: ηp² |
---|---|---|---|---|---|---|
Happiness | 5.07 | .028 | 0.08 | 142.42 | < .001 | 0.71 |
Sadness | 86.52 | < .001 | 0.60 | 51.84 | < .001 | 0.47 |
Tension | 0.31 | > .05 | 0.01 | 70.22 | < .001 | 0.54 |
Amusing | 48.21 | < .001 | 0.45 | 43.65 | < .001 | 0.43 |
Surprising | 42.62 | < .001 | 0.42 | 2.17 | > .05 | 0.04 |
Scariness | 9.51 | .003 | 0.14 | 43.56 | < .001 | 0.43 |
Table 3 also shows statistically significant main effects of mode for all emotions except surprise. Specifically, melodies in the major mode were rated significantly higher in happiness and amusement, whereas melodies in the minor mode were rated significantly higher in tension, scariness, and sadness.
Interaction Effects of Articulation and Mode
Significant interactions between articulation and mode were observed for ratings of happiness, F(1, 59) = 36.22, p < .001, and sadness, F(1, 59) = 16.19, p < .001. To explore these interactions, four post hoc pairwise comparisons were conducted for each dependent variable using a Bonferroni-adjusted alpha of .0125 (.05 / 4). That is, the difference in ratings between legato and staccato articulation was assessed for major and minor melodies separately, and the difference in ratings between major and minor mode melodies was assessed for legato and staccato melodies separately.
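The Bonferroni adjustment used for these comparisons amounts to dividing the family-wise alpha by the number of comparisons. A minimal sketch follows; the function names, comparison labels, and p values are hypothetical placeholders, not results from the study.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_comparisons

def flag_significant(p_values, family_alpha=0.05):
    """Map each comparison label to True if its p value is below the adjusted alpha."""
    alpha = bonferroni_alpha(family_alpha, len(p_values))
    return {label: p < alpha for label, p in p_values.items()}

# Four comparisons for one dependent variable, with made-up p values.
example_p_values = {
    "legato vs. staccato (major)": 0.764,
    "legato vs. staccato (minor)": 0.0002,
    "major vs. minor (legato)": 0.0001,
    "major vs. minor (staccato)": 0.02,
}
print(bonferroni_alpha(0.05, 4))  # 0.0125
print(flag_significant(example_p_values))
```

Note that the last placeholder value (.02) would pass an unadjusted .05 threshold but not the adjusted threshold of .0125, which is the practical consequence of the correction.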
As shown in Figure 6, ratings of happiness were significantly higher for staccato than legato melodies in the minor mode, t(59) = -4.06, p < .001, 95% CI [-.64, -.22], but not in the major mode, t(59) = -.302, p > .0125, 95% CI [-.25, .19]. Ratings of sadness were significantly greater for legato than staccato melodies in both the major mode, t(59) = 7.23, p < .001, 95% CI [.47, .83], and the minor mode, t(59) = 9.41, p < .001, 95% CI [.78, 1.20].
Next, ratings of happiness were significantly greater for major melodies relative to minor melodies when presented with legato articulation, t(59) = 12.11, p < .001, 95% CI [1.07, 1.49], and when presented with staccato articulation, t(59) = 10.23, p < .001, 95% CI [.71, 1.05]. On the other hand, ratings of sadness were significantly greater for minor melodies relative to major melodies when presented with legato articulation, t(59) = -7.03, p < .001, 95% CI [-.92, -.51], and when presented with staccato articulation, t(59) = -5.49, p < .001, 95% CI [-.51, -.24].
Discussion
The results of Experiment 2 indicate that both articulation and mode impact the perception of emotion conveyed through music, and they also have interactive effects. Ratings of amusement, happiness, and surprise were significantly higher for staccato than legato melodies, whereas ratings of sadness and scariness were significantly higher for legato than staccato melodies. Ratings of amusement and happiness were also affected by mode, with higher ratings for major than minor melodies. Conversely, ratings of tension, scariness, and sadness were higher for minor than major melodies.
Interactive effects were also observed. For melodies in the major mode, ratings of happiness were similar for staccato and legato melodies. However, for melodies in the minor mode, ratings of happiness were significantly greater for staccato relative to legato melodies. This pattern of results suggests that staccato articulation tends to enhance connotations of happiness only for melodies composed in the minor mode, not for melodies composed in the major mode. Presumably, staccato articulation has other musical functions for music composed in a major mode.
Whilst the interactive effects of articulation and mode deviated from the hypotheses, they illustrate the potential for articulation to convey emotional meaning, and in some cases differently depending on mode. Along with other expressive devices known to influence emotional responses to music such as intensity, timing, and tempo, performers can use articulation to communicate specific emotional intentions.
General Discussion
Experiment 1 examined the impact of articulation on listeners’ perception of melodic cohesiveness, their perception of emotions expressed through this music, and their liking for such melodies. Experiment 2 built upon the design of Experiment 1 and examined whether the effects of articulation on perceived emotions vary when melodies are played in either a major (happy) or minor (sad) mode. Each key finding will now be discussed in turn.
Perceived Cohesion and Liking for Staccato and Legato
As predicted, ratings of cohesion significantly differed as a function of articulation. Legato melodies were rated significantly more cohesive than staccato melodies. That is, staccato melodies did not seem to “hang together” to form a complete phrase as well as legato melodies. The finding is consistent with predictions based on processes of auditory streaming—the set of processes by which the auditory system analyses incoming sounds into streams of acoustic information (Bregman, 1990). Bregman distinguished between bottom-up and top-down processes of auditory scene analysis. Bottom-up processes are driven by basic attributes of sound in the acoustic environment (stimulus-driven), whereas top-down processes are guided by an individual’s previous experience, knowledge, and memory.
Based on the findings in Experiment 1, articulation likely taps into bottom-up processes whereby the temporal relationship between notes influences the extent to which a melody is perceived as a single auditory stream (legato) or a sequence of loosely connected auditory events (staccato). The relatively high cohesion ratings in response to legato melodies may indicate greater perceptual fluency than staccato melodies, which were perceived as less cohesive and potentially less fluent and more effortful for the auditory system to process. The impact of articulation on processing fluency may also help to explain why legato melodies were liked significantly more than staccato melodies. According to fluency theory, listeners prefer perceptually fluent over disfluent stimuli (Reber et al., 1998).
Tonal structure also impacted the perceived cohesion of melodies. Experiment 1 revealed a strong correlation between ratings of tonal structure in Cuddy et al. (1981) for the original set of melodies, and the cohesion ratings of both legato and staccato versions of those melodies. Irrespective of articulation, melodies that were assigned high ratings of tonal structure received higher cohesion ratings when compared to melodies assigned low ratings of tonal structure. One explanation is that the greater the tonal structure, the easier listeners could extract and encode the harmonic progression implied by the melody. The prototypical melody used by Cuddy et al. (1981) featured a basic I-V-I progression that was easy to perceive (arpeggiated chords), whereas the 31 variations involved systematic changes that made it increasingly difficult to infer harmonic structure. The expert analysts’ assessment of tonal structure was, by definition, influenced by their experience and knowledge of rules underlying Western tonal music, and ratings of tonal structure similarly reflect top-down mechanisms. Taken together, the findings illustrate that perceived cohesion in response to brief melodic phrases is influenced by both the bottom-up mechanism of articulation and the top-down mechanism of tonal structure.
Effects of Articulation on Perceived Emotions
It was predicted that low energy emotions would be conveyed more strongly by legato than staccato melodies. Thus, we predicted that calmness, sadness, and fear would be assigned higher ratings in legato than staccato melodies (note that in three dimensional models of emotion, fear is characterized as low energy, but high tension). This hypothesis was partially supported. Legato melodies conveyed significantly more calmness and sadness than the staccato melodies. This finding concurs with previous research that identified legato articulation to be associated with sadness and tenderness (Juslin & Laukka, 2003). However, there was no significant difference between legato and staccato for ratings of fear. Although we conceptualized fear as a low energy emotion (e.g., Ilie & Thompson, 2006), performers can communicate fear either by playing quietly to convey a fearful “freezing” response, or forcefully to induce a fearful fight-or-flight response (Vieillard et al., 2008). This ambiguity suggests that fear can be conveyed using either staccato or legato articulation, depending on other attributes of the music.
It was also predicted that higher energy emotions would be conveyed more strongly by staccato than legato melodies. Thus, we predicted that joy, anger, tension, and energy would be assigned higher ratings in staccato than legato melodies. This hypothesis was again partially supported. Ratings of energy and tension were significantly higher in response to staccato than legato melodies. However, there were no significant differences between staccato and legato melodies observed for ratings of joy or anger. One explanation for this outcome is that articulation only affects these emotional interpretations by interacting with other structural features of melodies. For example, when major and minor modes were systematically manipulated in melodies (Experiment 2), articulation affected ratings of happiness (i.e., joy) for melodies in the minor mode. However, articulation had no impact on ratings of anger for melodies in either the major or minor mode. In general, ratings of anger were concentrated at the lower end of the scale (“does not convey anger at all”) and may have resulted in a floor effect, whereby the emotion of anger could not be perceived in such brief melodic segments. Supporting this idea, research has shown that happiness and sadness are relatively easy to convey through simple monophonic music, whereas anger and fear are more readily conveyed through complex polyphonic music (Hailstone et al., 2009).
Interactions Between Articulation and Mode
Experiment 2 manipulated articulation and mode in a subset of the melodies used in Experiment 1. Twelve original melodies that were clearly in C major were altered by lowering certain pitches by one semitone to create a C minor version of each melody (see Figure 4 for music notation). Happiness and sadness scales were included to verify that melodies in the major mode represented positively valenced emotions, and melodies in the minor mode represented negatively valenced emotions. As predicted, major melodies were assigned higher ratings of happiness than minor melodies, and minor melodies were assigned higher ratings of sadness than major melodies. These results concur with previous studies of major-happy and minor-sad associations (Juslin & Lindström, 2010) and validated our manipulations of mode.
Significant interactions between articulation and mode were observed for ratings of sadness and happiness. Pairwise comparisons of legato and staccato within each mode revealed that ratings of sadness were significantly greater for legato than staccato melodies in both the major and minor modes. In other words, whether a melody is composed in a positively valenced major mode or a negatively valenced minor mode, legato articulation will enhance the communication of sadness relative to the same melody performed with staccato.
Furthermore, ratings of amusement were significantly greater in response to staccato melodies relative to legato melodies. Major melodies were also rated as more amusing than minor melodies. However, there was no significant interaction between articulation and mode. Trevor and Huron (2019) noted the similarity between the punctuated sound of human laughter and staccato articulation in music, although they were unable to conclusively show that staccato articulation was specifically used by composers to simulate laughter and evoke humorous responses. The present findings provide evidence that articulation can be used to manipulate the communication of amusement and humor in music, and that this impact does not depend on mode. Further research is needed to investigate whether amusement is dependent on the interaction between articulation and other musical features, such as timbre, dynamics, tempo, textural density, and expectancy violations (Amir, 2005; Huron, 2004).
Ratings of surprise were also significantly higher for staccato melodies than legato melodies, regardless of mode. Thus, the use of staccato can successfully influence the communication of amusement and surprise to a greater extent than legato. Conversely, perceived tension was greater for minor melodies relative to major melodies. A minor mode conveys greater tension, but this effect does not vary as a function of articulation.
Implications and Future Directions
Performers have the power to represent a composer’s work in various ways; different musicians can be given the same score yet execute markedly different interpretations of that music during their performance (Gabrielsson & Juslin, 1996). Articulation is one of the most fundamental expressive devices available for performers, and the findings presented here show that articulation can significantly influence certain emotional interpretations conveyed through a short melodic sequence (e.g., calmness and energy), but not all emotional interpretations (e.g., fear and anger). The interaction between articulation and mode shows that the emotional impact of articulation is sometimes dependent on whether the music is composed in a major or minor mode.
The findings presented here will need to be replicated and extended to musical contexts of greater complexity and duration, and investigated in different cultural contexts. Longer durations of music will also permit a more thorough investigation of the impact of articulation on felt emotion rather than perceived emotion, given the emotions we hear and perceive in music do not always align with the emotions we feel and experience (Gabrielsson, 2001). Future research could also consider a wider range of articulation because in real music performances, articulation often varies in a more continuous manner between the two fixed levels of articulation presented in the present study and sometimes beyond those two levels. For example, some compositions suggest that the performer use articulation that is halfway between legato and staccato, known as portato or mezzo staccato. The musician will connect the notes as if playing legato but add gentle articulations to each note to help accentuate the pulse (Hofmann & Goebl, 2014). Listeners tend to engage with contrasts and variety in expressive music performances, so performers often consider the effect that contrasting levels of articulation can have on a listener within the same segment of music. Thus, research should examine how listeners react perceptually and emotionally when articulation is systematically varied within musical phrases. Finally, there are likely to be cultural influences on the emotional impact of articulation. The current investigation sheds light on how articulation can impact Western listeners, but the extent to which our findings can be generalized to other cultural settings is currently unknown.
The present findings also have pedagogical implications. Western tonal music training has prioritized the teaching of basic concepts such as pitch and duration to ensure musicians can read and play the written work of composers (Palmer, 1997). However, articulation is more pertinent to the question of how pitches and durations can be expressed with differing levels of emotional intensity and how a variety of timbral qualities are produced when performed on different instruments. The present findings reinforce the value of teaching musicians how important the intentional use of articulation can be for the successful execution of sophisticated expressivity during music performance. This knowledge will allow performers the freedom to interpret music in a way that will help them convey their intended emotional and expressive outcomes. The significant effects of legato articulation on perceived sadness and calmness can be applied by a performer playing reflective or relaxing background music. In this context, they may choose to use legato articulation more often than staccato articulation, given that legato articulation permits ease of grouping so is less cognitively demanding for the auditory system to process. Conversely, given the significant effects of staccato articulation on perceived energy, tension, amusement, happiness, and surprise, utilizing staccato articulation in more dynamic and engaging performance contexts may be more appropriate, especially in contexts where the goal is to capture and retain listeners’ attention.
Finally, three caveats are important to note. First, differences in perceived attack and decay between staccato and legato are inherent in the definition of articulation, but also inherent in the definition of timbre. The current data only permit the conclusion that legato and staccato articulation have very distinctive emotional and perceptual effects on melody perception. The perceived separation of notes in staccato articulation is likely responsible for the reduced melodic coherence and differing emotional connotations, relative to legato articulation. However, the notes themselves may have been heard as having slightly different timbral qualities (staccato notes may have sounded more percussive) and future research could attempt to disentangle the influence of timbre versus note separation on these findings. Second, the participant samples in both experiments were characterized by an uneven gender balance of predominantly female participants, coupled with a relatively large age range. Third, the order of emotion rating scales in each experiment was not randomized, so the potential for order effects, though unlikely to account for our results, cannot be ruled out.
Conclusion
The present study revealed that articulation in music has both perceptual and emotional effects, with some effects dependent on tonal structure. Legato melodies were liked more than staccato melodies and were perceived as more cohesive, calming, and sad. Conversely, staccato melodies were perceived as conveying greater tension and energy in Experiment 1 and greater amusement, happiness, and surprise in Experiment 2. Such findings can inform music pedagogy and provide the foundation for future research on the perceptual and emotional consequences of articulation—a fundamental expressive device used by performers.
Author Note
We thank members of the Macquarie University Music, Sound, and Performance Research Group for helpful comments on an earlier draft.
Note
According to three dimensional models of affect that include tension arousal and energetic arousal (e.g., Schimmack & Grob, 2000), fear is often characterized by low energy but high tension, whereas anger is characterized by both high energy and high tension (Ilie & Thompson, 2006). In the context of music perception, perceived energy is closely related to high intensities and fast tempi (Gabrielsson & Lindström, 2001).