Traditional neurobiological theories of musical emotions explain well why extreme music such as punk, hardcore, or metal (whose vocal and instrumental characteristics share much similarity with acoustic threat signals) should evoke unpleasant feelings for a large proportion of listeners. Why it doesn't for metal music fans, however, is controversial: metal fans may differ from non-fans in how they process threat signals at the sub-cortical level, showing deactivated responses that differ from controls. Alternatively, appreciation for metal may depend on the inhibition by cortical circuits of a normal low-order response to auditory threat. In a series of three experiments, we show here that, at a sensory level, metal fans actually react equally negatively, equally fast, and even more accurately to cues of auditory threat in vocal and instrumental contexts than non-fans; conversely, we tested the hypothesis that cognitive load reduced fans' appreciation of metal to the level experienced by non-fans, but found only limited support that it was the case. Nevertheless, taken together, these results are not compatible with the idea that listeners love extreme music because of a different sensory response to threat, and highlight a potential contribution of controlled cognitive processes in their aesthetic experience.

Our capacity to perceive emotions in music has been the subject of impassioned psychology and neuroscience research in the past two decades (Blood & Zatorre, 2001; Juslin & Västfjäll, 2008). While music was once believed to have a “language of emotions” of its own, separate from our species' other expressive capacities (McAlpin, 1925), today's dominant view of musical expression construes it as in many ways continuous with natural languages (Patel, 2010). Musical emotions are studied as communicative signals that are encoded in sound by a performer, then decoded by the listening audience (Juslin & Laukka, 2003), for whom hearing music as expressive involves registering its resemblance with the bodily or vocal expressions of mental states (Juslin & Västfjäll, 2008). For instance, joyful music is often associated with fast pace and animated pitch contours (as is happy speech), melancholic music with slower and flatter melodic lines and dark timbres (as is sad speech), and exciting music with high intensity and high levels of distortion and roughness (as may be an angry shout) (Blumstein, Bryant, & Kaye, 2012; Escoffier, Zhong, Schirmer, & Qui, 2013; Ilie & Thompson, 2006; Juslin & Laukka, 2003).

Seeing musical expression as a culturally evolved phenomenon based on a biologically evolved signaling system (Bryant, 2013) explains much of people's typical affective responses to music. Just like vocalizations, music that signals happiness or affiliation may be appraised positively or lead to positive contagion (Miu & Baltes, 2012); sad music may elicit empathy, and make people sad or moved (Vuoskoski & Eerola, 2017). Similarly, humans (and many non-human animals) produce harsh, rough, and nonlinear sounds when alarmed (Anikin, Båth, & Persson, 2018). In ecological situations, such sounds trigger stereotypical fear and avoidance behaviors (e.g., in conditioning paradigms; Den, Graham, Newall, & Richardson, 2015), are strongly prioritized in sensory processing (Asutay & Västfjäll, 2017), and evoke activity in areas linked to the brain's threat response system (Arnal, Flinker, Kleinschmidt, Giraud, & Poeppel, 2015). It is therefore no surprise that “extreme” music such as punk, hardcore, or some metal (Abbey & Helb, 2014; Weinstein, 2000), whose vocal and instrumental characteristics share much acoustic similarity with threat signals, should evoke feelings of anger, tension, and fear for non-fans of this music (Blumstein et al., 2012; Rea, MacDonald, & Carnes, 2010; Thompson, Geeves, & Olsen, 2018), impair their capacity to cope with simultaneous external stress (Labbé et al., 2007), and trigger reactions of avoidance and a desire to stop listening (Bryson, 1996; Thompson et al., 2018). Decoding extreme music as an auditory signal of danger or threat, these non-fan listeners (as one respondent quoted in Thompson et al., 2018) literally “…cannot understand how anyone finds this music pleasant to listen to.”

Some listeners obviously do, though. Extreme music, and most notably metal music, is a thriving global market and subculture, with strongly engaged communities of fans (Brown, Spracklen, Kahn-Harris, & Scott, 2016). Despite long-lived stereotypes that listeners who engage with metal music do so because of a psycho-socially dysfunctional attitude to violence and aggression (Bodner & Bensimon, 2015; Stack, Gundlach, & Reeves, 1994; Sun, Zhang, Duan, Du, & Calhoun, 2017), it is now well-established that listeners with high preference for metal music do not revel in the strongly negative feelings this music usually induces in non-metal fans. Rather, metal music fans report that the music leads them to experience a wide range of positive emotions including joy, power and peace (Thompson et al., 2018) and no increase of subjective anger (Gowensmith & Bloom, 1997). In fact, following an anger-induction paradigm, Sharman and Dingle (2015) report that listening to 10 minutes of violent metal music relaxed metal music fans just as effectively as sitting in silence. It therefore appears that metal music fans do not process the threat-signaling features of violent music to the same outcome as non-metal fans. It is not that they enjoy the threat; rather, they do not experience threat at all.

The interaction between first-order and higher-order processing may provide some insight on why this may be the case. While traditional, neurobiological views of emotions link the emergence of emotional feelings—such as that of experiencing fear—to the operation of innately programmed, primarily subcortical brain systems, such as those centered on the amygdala (Panksepp, 2004), more recent cognitive frameworks tend to separate the activation of such circuits from that of higher-order cortical networks that use inputs from subcortical circuits to assemble the emotional experience (LeDoux & Brown, 2017; LeDoux & Pine, 2016). In short, while first-order threat responses may contribute to higher-order feeling of fear, they do not unequivocally constitute it: on the one hand, defensive survival circuits may be activated by subliminally presented threatening visual stimuli and generate behavioral or autonomic threat response patterns even in the absence of subjective fear (Diano, Celeghin, Bagnis, & Tamietto, 2017; Vuilleumier, Armony, Driver, & Dolan, 2001; Whalen et al., 2004); on the other hand, bilateral damage to the amygdala may interfere with bodily responses to threats, while preserving the conscious experience of fear (Feinstein et al., 2013; for a discussion, see Fanselow & Pennington, 2018). In sum, autonomic, behavioral, and primitive responses to threat stimuli appear to be neither necessary nor sufficient for the conscious experience of fear to emerge.

The existence of two populations—metal fans and non-fans—that respond to identical cues of auditory threat with radically different emotional experience (pleasure/approach, or fear/avoidance) provides a compelling ecological situation in which to study how first-order and high-order processes interact to create emotional states of consciousness. On the one hand, it is possible that metal fans differ from non-fans in how they process threat signals at the first-order/subcortical level. Just like clinical populations with specific phobias or social anxiety show increased amygdala reactivity to their trigger stimuli (e.g., pictures of spiders or fearful faces) even when presented outside of conscious awareness (McCrory et al., 2013; Siegel et al., 2017), metal fans may show deactivated responses to the cues of auditory threat constitutive of that musical genre, possibly as the result of positive conditioning (see e.g., Blair & Shimp, 1992). If present, such first-order, bottom-up differences between fans and non-fans would not only predict a different late-stage read-out of the activity of the threat circuit (i.e., experiencing fear or not), but also different autonomic and behavioral responses to auditory roughness even beyond the realm of music (e.g., fans not reacting to angry voices as fast/as negatively as non-fans). On the other hand, it is also possible that the fans' appreciation for metal music reflects a higher-order inhibition by cortical circuits of an otherwise normal, low-order response to auditory threat. In support of such a dissociation, Gowensmith and Bloom (1997) found that while metal fans listening to metal music reported feeling less angry than non-fans, both fans and non-fans reported similar levels of physiological arousal in response to metal music, suggesting that lower-order circuits reacted similarly in both groups. 
Similarly, in the visual modality, Sun, Lu, Williams, and Thompson (2019) recently reported that extreme music fans exhibited no more processing bias than non-fans for violent imagery in a binocular rivalry paradigm. Conversely, a number of studies have shown that loading executive functions with visual attention (Pessoa, McKenna, Gutierrez, & Ungerleider, 2002), working memory (Van Dillen, Heslenfeld, & Koole, 2009), or demanding arithmetic tasks (Erk, Kleczar, & Walter, 2007) can lessen both the subjective evaluation and amygdala response to negative stimuli. If such executive functions are involved in musical aesthetic experiences, we should predict that these higher-order, top-down processes would be more engaged for metal fans than non-fans during the emotional experience of metal music, and that loading them with a dual-task paradigm would lead to a failed inhibition of avoidance-related processes arising from the threat circuit, thereby lessening metal fans' appreciation to the level experienced by non-fans.

In this article, we report on three experiments that aim to separate these two alternatives and to clarify the contribution of low- and higher-order processes in the emotional experience of metal music by fans and non-fans. We screened a total of 332 participants to constitute an experimental group of metal music fans that ranked low on appreciation for a control music genre (pop music) and a control group that ranked high on pop music but low on metal. To test the possibility of different low-order behavioral responses to threat cues, both groups rated the valence of vocal and musical stimuli presented with and without acoustic roughness, one prominent cue to vocal arousal/threat (Experiment 1). They were also subjected to a speeded spatial localization task with the same stimuli presented at different dichotic interaural time differences (ITDs) (Experiment 2). To test the contribution of higher-order inhibition to fans' appreciation, we subjected both groups to a dual-task paradigm in which participants listened to and rated their preference for both metal and pop music extracts while engaging in a demanding visual search task (Experiment 3). Our hypotheses, which we preregistered along with a basic data analysis strategy (see Supplementary Materials accompanying this article online at mp.ucpress.edu), were that groups would differ in Experiments 1 and 2 if metal appreciation is the result of different low-level processes, and would differ in Experiment 3 if it is the result of higher-order cognitive control over low-level processes.

Experiment 1: Valence Rating Task

A wealth of behavioral data suggests that cues of auditory threats, such as distortion, roughness, and other non-linearities, are generally evaluated with negative valence. For instance, Arnal et al. (2015) found that vocal, instrumental, and alarm sounds resynthesized to include temporal modulations in the 30–150 Hz range elicited more negative ratings from human listeners, as well as faster response times, than similar unmodulated sounds; Blumstein et al. (2012) found that musical soundtracks manipulated to include distortion were judged more negative and more arousing than control soundtracks. In the animal kingdom, marmots spend less time foraging after hearing alarm calls manipulated to include white noise than after normal or control calls (Blumstein & Recapet, 2009). Here, we therefore take participants' explicit ratings of the valence of short vocal and instrumental sounds (manipulated to induce roughness or not) as an index of affective responses to auditory threat in a generic, non-musical context, and test the hypothesis that such responses may be deactivated in metal fans.

METHOD

Participants

A total of 332 participants with normal self-reported vision and hearing were screened via an online questionnaire for their orientation toward a variety of musical genres, including metal, as well as a number of demographic variables. Participants were all French-speaking young adults, enrolled in Sorbonne Université, Paris, and were recruited through the experimental platform of the Sorbonne-INSEAD Center for Multidisciplinary Science. For each genre, participants had to indicate how much they enjoyed listening to such music, using a 7-point Likert scale. In addition, for genres rated above 5, they had to cite three of their favorite tunes for that genre. Genres listed in the survey were inspired by typical taxonomies of internet music services like Spotify (Pachet & Cazaly, 2000), and included blues, contemporary music, classical, French variety, electro, folk, jazz, metal, pop, rap/hip-hop, religious music, rock, soul/funk, and world music. Pop music was selected as a control genre for being not typically associated with strong cues of auditory threat, and for having high negative correlation with preference for metal music across the group (Pearson's r = −.12, n = 332; Figure 1).

FIGURE 1.
Relations between liking for musical genres in the N = 332 participants screened for the study. Left: Correlation matrix between genres, labeled with Pearson's r coefficients. Right: two-dimensional multidimensional scaling solution for the same correlations, each genre labeled with Pearson's r correlation to metal. Participants who liked, or disliked, metal music tended to have similar attitudes to rock (r = .48), and opposite attitudes to pop (r = −.12) and rap (r = −.12).

We then selected 40 participants from the original pool, based on their orientation towards metal and pop music. Twenty participants (male = 12; M = 21.3 years old, SD = 2.7 years) who gave ratings ≥ 6 for metal music and ≤ 4 for pop music were selected for the metal group, and 20 participants (male = 10; M = 22.3 years old, SD = 3.2 years) who gave ratings < 2 for metal and > 6 for pop music were selected for the control group. Metal fans did not statistically differ from controls in terms of age (mean difference: M = −1.0 years, 95% CI [−2.96, 0.86], t(38) = −1.11, p = .27), musical expertise (mean practice difference: M = 4.9 years, 95% CI [−1.7, 11.5], t(11) = 1.63, p = .13), and musical engagement (mean listening difference: M = −3.35 hours/week, 95% CI [−12.1, 5.4], t(38) = −0.77, p = .44). Six participants were eventually not able to participate in the study after they were included, leaving 17 participants in each group for the final sample (N = 34).

Stimuli

Stimuli for the experiment consisted of short, one-second recordings: 24 human vocalizations (12 original, 12 rough) and 24 musical instrument sounds (12 original, 12 rough). Original vocalizations were recorded by one female and two male actors instructed to shout/sing phonemes [a] and [i] at three different pitches (in the range 450–480, 570–600, and 520–570 Hz for females; 200–215, 250–270, and 315–340 Hz for males), with a clear, loud voice (see audio samples in Supplementary Materials accompanying this article online at mp.ucpress.edu). Original musical instrument samples were extracted from the McGill University Master Samples sound library (MUMS; Opolko & Wapnick, 1989), and included single note recordings of three wind (bugle, clarinet, trombone) and one string (violin) instrument, each performed at three different pitches. Both types of sounds were then manipulated with a digital audio transformation aimed to simulate acoustic roughness, one prominent cue to vocal arousal/threat (ANGUS; Gentilucci, Ardaillon, & Liuni, 2018; freely available at forumnet.ircam.fr/product/angus/). ANGUS transforms sound recordings by adding subharmonics to the original signal using a combination of f0-driven amplitude modulations and time-domain filtering, an approach known to confer a growl-like, aggressive quality to any vocal or harmonic sound (Tsai et al., 2010). Here, we used ANGUS to add three amplitude modulators at f0/2, f0/3, and f0/4 submultiples of the original sounds' fundamental frequency (f0), and thus generated transformed “rough” versions of each of the 12 original vocal and 12 original instrument sounds, resulting in 24 vocal stimuli (see Supplementary Materials accompanying this article online at mp.ucpress.edu) and 24 musical stimuli.
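As an illustration of the kind of manipulation described above, the following minimal sketch applies three sinusoidal amplitude modulators at f0/2, f0/3, and f0/4 to a pure tone. It is not the ANGUS implementation: the function name, the modulation depth, the cosine modulator shape, and the 220 Hz test tone are assumptions made for illustration, and the time-domain filtering stage mentioned in the text is not reproduced.

```python
import math

def add_subharmonic_am(signal, sr, f0, divisors=(2, 3, 4), depth=0.5):
    """Amplitude-modulate `signal` at the submultiples f0/d of its fundamental.

    Hypothetical helper: `depth` and the cosine modulators are illustrative
    choices, not parameters documented for ANGUS.
    """
    out = []
    for n, x in enumerate(signal):
        t = n / sr
        # Average of the modulators, bounded in [-1, 1]
        mod = sum(math.cos(2 * math.pi * (f0 / d) * t) for d in divisors)
        mod /= len(divisors)
        # Gain oscillates between (1 - depth) and (1 + depth)
        out.append(x * (1.0 + depth * mod))
    return out

sr = 44100            # sampling rate in Hz
f0 = 220.0            # assumed fundamental of the test tone
tone = [math.sin(2 * math.pi * f0 * n / sr) for n in range(sr)]  # 1 s tone
rough = add_subharmonic_am(tone, sr, f0)
```

Because the modulators sit at integer submultiples of f0, the modulated signal repeats every 12/f0 seconds, which is what introduces energy below the original fundamental.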

Procedure

Participants were presented with one block of 24 vocal and one block of 24 musical stimuli (counterbalanced), played through Beyerdynamics DT770 headphones. At each trial, participants were instructed to rate the perceived valence/approachability of the stimulus, using a 7-point Likert scale ranging from 1 (very negative) to 7 (very positive). Stimuli were presented in random order within each block, with an interstimulus interval randomized between 0.8–1.2 s.

Preregistered analysis strategy

Participant ratings were analyzed with a 2 x 2 mixed ANOVA, with participant group (metal/not) as a between-participant factor and stimulus roughness (original/rough) as a within-participant factor.
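For readers less familiar with this design, the group × roughness interaction in a 2 x 2 mixed ANOVA is equivalent to a pooled-variance independent-samples t-test on each participant's rough-minus-original difference score, with F(1, N − 2) = t². The sketch below illustrates that equivalence only; the function name and the toy difference scores are hypothetical and are not the study's data or analysis code.

```python
from statistics import mean, variance

def interaction_F(diff_a, diff_b):
    """Group x within-factor interaction of a 2 x 2 mixed design.

    diff_a, diff_b: per-participant difference scores (rough - original)
    in each group. Returns (F, (df1, df2)).
    """
    na, nb = len(diff_a), len(diff_b)
    # Pooled variance of the difference scores across the two groups
    pooled = ((na - 1) * variance(diff_a) + (nb - 1) * variance(diff_b)) / (na + nb - 2)
    t = (mean(diff_a) - mean(diff_b)) / (pooled * (1 / na + 1 / nb)) ** 0.5
    return t ** 2, (1, na + nb - 2)

# Toy difference scores, for illustration only
F, df = interaction_F([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

With 17 participants per group, the same computation yields the (1, 32) degrees of freedom reported in the Results.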

RESULTS

There was a main effect of stimulus roughness on perceived approachability, with ANGUS-manipulated sounds judged more negative than original sounds (Figure 2; mean valence difference M = −0.41, 95% CI [−0.51, −0.33], F(1, 32) = 43.74, p < .00001, ges = 0.11). However, this effect of roughness did not interact with participant group: both metal fans and non-fans judged rough sounds less approachable than original sounds (mean valence difference M = −0.08, 95% CI [−0.26, 0.10], F(1, 32) = 0.39, p = .53, ges = 0.001).

FIGURE 2.
Effect of stimulus roughness on valence ratings (Experiment 1), left: human vocalizations, right: musical instruments. Rough sounds were judged more negatively than original sounds, and metal fans did not report less negativity than non-metal fans for either rough vocal or musical sounds. Error bars, 95% CI on the mean.

As an additional non-registered analysis, we also examined the effect of sound category (vocalization or instrument) on valence ratings using a 2 x 2 x 2 mixed ANOVA: there was a main effect of category on valence ratings, with vocalizations judged more positive than musical instruments across conditions (Figure 2; mean difference M = 0.59, 95% CI [0.42, 0.76], F(1, 32) = 26.67, p < .00001, ges = 0.21). However, this effect did not interact with either stimulus roughness, F(1, 32) = 0.02, p = .88, or participant group, F(1, 32) = 1.43, p = .24.

DISCUSSION

Our data replicate the finding that acoustic roughness, as simulated here by amplitude modulations and the ANGUS software tool, is appraised as low on approachability/valence (Arnal et al., 2015; Blumstein et al., 2012). Interestingly, despite being grounded in biological signaling and the physiology of the vocal apparatus (Fitch, Neubauer, & Herzel, 2002), roughness elicited similar emotional evaluation regardless of whether it was applied to vocal or musical sounds, confirming that biological signaling indeed underlies part of the emotional reactions to musical sounds (Blumstein et al., 2012).

Critically for our hypothesis, however, metal fans reported similar levels of valence as non-metal fans for both rough vocal and rough musical sounds. Outside of an extreme musical context, metal fans therefore do not find rough sounds particularly pleasing and approachable, even with isolated instrument sounds. This does not support the idea that listeners love metal because of altered or reconditioned affective responses to auditory threat, but rather suggests that, outside of the culturally circumscribed musical context of metal music, such responses lead to the same behavioral outcome as in non-fans. Yet, because our rating task specifically targeted explicit affective judgments, it remains a possibility that low-level perceptual responses still differ in metal fans, but that these participants somehow compensate at the explicit level by relying on declarative knowledge, e.g., an awareness of the fact that rough sounds generally convey negative attitudes (e.g., shouts are often used in situations where people are angry). Thus, we ran a second experiment to examine a purely perceptual process, sound localization, that is affected by threat cues but does not necessarily involve an affective evaluation of the stimuli, and that operates on very short time scales that allegedly tap into more implicit mechanisms.

Experiment 2: Spatial Localization Task

Beyond the explicit negative appraisal of the stimuli, the rapid and accurate localization of danger is one of the main behavioral outcomes of the threat response system (Panksepp, 2004). In previous work, Asutay and Västfjäll (2017) submitted participants to a visual search task and found that search times for low-salient targets decreased when these were preceded with task-irrelevant arousing sounds (dog growls and fire alarm). Evidence for the importance of rapid and accurate localization of impending danger is also found in the auditory looming literature, where sound sources that imply approaching auditory motion are localized faster and more accurately than receding sound sources (McCarthy & Olsen, 2017). Similarly, Arnal et al. (2015) measured the speed and accuracy to detect whether normal vocalizations and screams were presented on participants' left or right sides using interaural time difference (ITD) cues, and found participants were both more accurate and faster at localizing screams. Here, we implement a spatial localization task similar to that of Arnal et al. (2015) and use localization speed and accuracy as an implicit index of threat responses in metal and non-metal fans, testing whether such behavioral outcomes are hypoactivated in metal fans.

METHOD

Participants

Experiment 2 included the same 34 participants (metal = 17, non = 17) as in Experiment 1.

Stimuli

Experiment 2 used the same 48 stimuli (24 voice, 24 instrument samples) as Experiment 1, with the same acoustic manipulation of roughness (ANGUS; Gentilucci et al., 2018) for half of the stimuli.

Procedure

We used a procedure similar to that of Arnal et al. (2015). Participants were presented with 15 repetitions of each stimulus (a total of 15 x 48 = 720 trials), played dichotically through Beyerdynamics DT770 headphones with an interaural time difference (ITD) indicative of either a left-field or right-field presentation. Prior to testing, stimulus ITD was individually calibrated for each participant using a two-up, one-down staircase procedure, with a dichotically presented 300 ms pure tone at fundamental frequency 700 Hz. The initial ITD was 25 samples (567.5 μs at SR = 44,100 Hz), and the initial step size was 2 samples (45.4 μs). This step size was halved (1 sample, 22.7 μs) after the first inversion. Throughout the adaptive procedure, ITD values were constrained to a minimum of 22.7 μs and a maximum of 567.5 μs, and stimulus onset asynchrony (SOA) was randomized between 0.8–1.2 s. The procedure stopped after 12 inversions, and the final ITD was computed as the average ITD over the last 5 steps.
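The calibration procedure can be sketched as follows. One sample at 44,100 Hz lasts about 22.7 μs, so 25 samples correspond to roughly 567 μs (the 567.5 figure in the text matches 25 × 22.7). Several details below are assumptions rather than the authors' implementation: that two consecutive correct responses shrink the ITD while a single error enlarges it, that "the last 5 steps" means the last five trials, the safety cap on trial count, and the deterministic simulated listener used to exercise the code.

```python
def samples_to_us(n_samples, sr=44100):
    """Convert a delay in samples to microseconds."""
    return n_samples / sr * 1e6

def run_staircase(respond, start=25, step=2, lo=1, hi=25,
                  max_reversals=12, n_avg=5, max_trials=1000):
    """Adaptive ITD staircase, in samples. `respond(itd)` returns True if correct."""
    itd, streak, last_dir, reversals = start, 0, 0, 0
    history = []
    for _ in range(max_trials):          # safety cap (assumption)
        history.append(itd)
        if respond(itd):
            streak += 1
            if streak < 2:
                continue                 # wait for a second correct response
            streak, direction = 0, -1    # two correct: make the task harder
        else:
            streak, direction = 0, +1    # one error: make the task easier
        if last_dir and direction != last_dir:
            reversals += 1
            step = 1                     # step halved after the first inversion
            if reversals >= max_reversals:
                break
        last_dir = direction
        itd = min(hi, max(lo, itd + direction * step))
    tail = history[-n_avg:]
    return sum(tail) / len(tail)

# Hypothetical listener who is correct whenever the ITD is at least 10 samples
final_itd = run_staircase(lambda itd: itd >= 10)
final_itd_us = samples_to_us(final_itd)
```

With this idealized listener the track descends from 25 samples and then oscillates between 9 and 10 samples around the simulated threshold.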

Testing then consisted of two blocks of 360 vocal and musical trials (counterbalanced, randomized within each block), dichotically presented at each participant's fixed ITD, with a balanced, pseudo-random sequence of 360 left- and 360 right-field presentations. SOA was randomized between 1.4–1.9 s. At each trial, participants were instructed to report their perceived field of presentation (left/right) as quickly as possible.

Preregistered analysis strategy

Similar to Arnal et al. (2015), we measured individual localization performance (d′), reaction times (RTs), and calculated a composite measure of efficiency, corresponding to the additive effect of individual z-score-normalized performance and reaction speed. Efficiency was computed for each participant and sound category, and statistical significance was assessed with a rmANOVA using participant group as a between-subject factor and stimulus roughness as a within-subject factor.
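A minimal sketch of these measures is given below. The text specifies only that efficiency adds z-scored performance and z-scored reaction speed; the remaining choices are assumptions: treating right-field trials as the "signal" class, the +0.5 correction that keeps hit and false-alarm rates away from 0 and 1, taking speed as the negative of RT, and z-scoring across whatever set of cells is passed in.

```python
from statistics import NormalDist, mean, stdev

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(H) - z(FA), with a log-linear correction."""
    h = (hits + 0.5) / (hits + misses + 1)
    fa = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(h) - z(fa)

def zscores(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def efficiency(dprimes, rts):
    """Composite score: z(d') plus z(speed), where speed is taken as -RT."""
    return [zd - zr for zd, zr in zip(zscores(dprimes), zscores(rts))]

# Toy values for three hypothetical cells (not the study's data)
eff = efficiency([1.0, 2.0, 3.0], [0.9, 0.8, 0.7])
```

Higher efficiency therefore reflects more accurate and/or faster localization relative to the other cells entering the normalization.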

RESULTS

Average hit rate across participants and condition was H = .78 (SD = .18) and response time was RT = 1.04 s (SD = 0.94). There was a main effect of stimulus roughness on efficiency, where the spatial location of rough-manipulated sounds was detected more efficiently than that of original sounds (Figure 3; mean efficiency difference M = 0.31, 95% CI [0.14, 0.49], F(1, 32) = 6.65, p = .014, ges = 0.036). This difference was actually driven by accuracy: rough sounds were detected more accurately than original sounds (d': F(1, 32) = 6.15, p = .02), with no reduction of reaction time (z-score RTs: F(1, 32) = 1.10, p = .30). Importantly, the facilitating effect of roughness did not interact statistically with participant group, F(1, 32) = 0.52, p = .54, although paired t-tests only showed an effect of stimulus roughness in metal fans (mean efficiency difference M = 0.40, 95% CI [0.05, 0.76], t(16) = 2.46, p = .025), but not in non-fans (M = 0.22, 95% CI [−0.13, 0.57], t(16) = 1.24, p = .23).

FIGURE 3.
Effect of stimulus roughness on spatial localization (Experiment 2), left: human vocalizations, right: musical instruments. Rough sounds were localized with more efficiency (more accurately at similar reaction times), and metal fans were no less sensitive to the facilitating effect of roughness than non-metal fans. Error bars, 95% CI on the mean.

As an additional non-registered analysis, we also examined the effect of sound category (vocalization or instrument) on the efficiency of spatial localization: regardless of roughness, musical instruments were detected more accurately (mean difference of z-score d': M = 0.74, 95% CI [0.48, 1.01], F(1, 32) = 17.01, p = .0002, ges = 0.19), but also more slowly as compared to vocalizations (mean difference of z-score RTs: M = 0.50, 95% CI [0.41, 0.60], F(1, 32) = 59.95, p < .00001, ges = 0.63), with the result of no effect on combined efficiency (Figure 3; F(1, 32) = 0.38, p = .54). None of these effects interacted with roughness, nor with participant group.

DISCUSSION

Our data replicate the previous finding that roughness, a prominent cue of vocal arousal, facilitates the spatial localization of both vocal and musical sounds. Arnal et al. (2015) found that rough sounds were detected with both better accuracy and faster response time; on a similar task, our participants here gave more accurate responses to rough sounds than to control sounds, with similar response times. It is possible that the latency effect additionally found by Arnal et al. (2015) is due to their making the baseline task more difficult by embedding target sounds in white noise at 5 dB SNR, and adding a sinusoidal ramp of amplitude in the initial 100 ms of the sounds. It is therefore significant that, even in ecological listening conditions, acoustic roughness improved the accuracy of spatial localization.

Critically for our hypothesis, however, metal fans did not behave with less efficiency than non-metal fans when localizing rough sounds; if anything, they were even more accurate than non-fans. Taken together, results from Experiments 1 and 2 do not support the idea that listeners love extreme music because they do not respond as intensely to auditory threat: explicitly, they rate roughness (one prominent acoustic cue to threat) as similarly negative and, implicitly, they react to it as fast and as accurately as non-fans. These results are consistent with recent findings by Sun et al. (2019), in which both metal fans and non-fans were presented aversive and neutral pictures in a binocular rivalry paradigm designed to measure implicit bias towards negative stimuli. Under these conditions, and similarly to what we find here in the auditory domain, metal fans were found no less sensitive to violent imagery than non-fans, suggesting that preference for metal is not the result of desensitized responses to threat.

Experiment 3: Loaded Preference Task

Results from Experiments 1 and 2 do not give empirical support for a differential functioning of low-level threat response circuits in metal fans, who react as negatively (Experiment 1) and as fast and accurately (Experiment 2) as non-fans to acoustic roughness (one prominent cue to auditory threat) in vocal and instrumental contexts. Whether autonomic/behavioral threat responses and subjective fear are the result of two entirely orthogonal systems (LeDoux & Pine, 2016) or the result of a unique fear generator with distinct effectors that can be independently modulated (Fanselow & Pennington, 2018), it therefore appears that, while they differ on their subjective experience of the music, metal fans do not respond to auditory threat differently than non-fans. As proposed above, an alternative hypothesis is that higher-order, top-down modulation by prefrontal cortical systems plays an important role in the aesthetic musical experience (Belin & Zatorre, 2015).

A wealth of behavioral and neural data documents top-down contributions of executive functions and prefrontal systems to the prepotent processing of affective stimuli (Abitbol et al., 2015; Greene, Morelli, Lowenberg, Nystrom, & Cohen, 2008; Van Dillen et al., 2009), and shows that these functions can be experimentally manipulated with dual-task paradigms. For instance, Gilbert, Tafarodi, and Malone (1993) used a visual digit-search task in which participants were instructed to press a response key each time the digit 5 appeared in a stream of rapidly scrolling digits, while they concurrently read crime reports that contained both true and false statements; participants under such cognitive load were more likely to misremember false statements as true. Similarly, Greene et al. (2008) found that performing a concurrent digit-search task selectively interfered with utilitarian moral judgment (approving of harmful actions that maximize good consequences) but preserved non-utilitarian judgements based on emotional reactions (disapproving of harmful actions, regardless of outcome). Here, we use a dual-task paradigm in which participants listen to and rate their preference for both metal and pop music extracts while engaging concurrently in a demanding digit-search task. With this paradigm, we test whether metal fans' positive orientation towards violent music is the result of cognitive control over inputs from more automatic first-order circuits that, as seen in Experiments 1 and 2, would otherwise predict the same negative reactions as in non-metal fans.

METHOD

Participants

Experiment 3 included the same 34 participants (metal = 17, non = 17) as in Experiments 1 and 2.

Stimuli

Stimuli consisted of 80 short (7–9 s) extracts from commercial songs of the metal (40) and pop music (40) genres. Songs in both genres were selected on the basis of participant responses to the screening questionnaire (see Experiment 1), using the following procedure: each participant of the metal (respectively, pop) group listed 3 favorite titles of that genre; a list of 20 titles was selected from all of the participants' responses with the criterion of including music that had (respectively, did not have) clear cues of auditory threat (growl-like vocals, distorted guitars, noise and non-linearities); each title was then substituted by another similar, but lesser-known song by a different artist using the “song radio” tool of the commercial music service Spotify.com (data accessed March 2018). The popularity of a given title or artist was estimated using Spotify's “play count” for that title or that artist (for a similar methodology, see e.g., Bellogin, de Vries, & He, 2013). Substitute titles were selected if their play count was less than 10% of that of the most popular title of the most popular artist of the genre, and if their artist's play count was less than 10% of that of the most popular artist of the genre. The rationale for the procedure was to select songs that were maximally similar to the group's self-reported favorite items, but unlikely to be known/recognized by the participants. Finally, two 7–9 s extracts from each of the 20 songs were selected, to be presented in each of the two experimental blocks (load/no-load), so that stimuli were matched in terms of musical content but not exactly repeated. The procedure resulted in 80 extracts (2 extracts × 20 songs × 2 genres), the same for all participants. The song list is available in Appendix A.
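The popularity filter described above can be illustrated with a short predicate. This is a minimal sketch and not the script used for the study: the function name, the argument names and the example play counts are hypothetical, and only the two 10% thresholds are taken from the text.

```python
def is_eligible_substitute(title_plays, artist_plays,
                           top_title_plays, top_artist_plays,
                           ratio=0.10):
    """Return True if a candidate song is obscure enough to be used.

    top_title_plays:  play count of the most popular title of the most
                      popular artist of the genre (reference value).
    top_artist_plays: play count of the most popular artist of the genre.
    ratio:            10% threshold stated in the text.
    All argument names are hypothetical; they are not Spotify API fields.
    """
    title_ok = title_plays < ratio * top_title_plays
    artist_ok = artist_plays < ratio * top_artist_plays
    # Both conditions must hold for the substitute to be selected.
    return title_ok and artist_ok


# Made-up play counts, for illustration only.
print(is_eligible_substitute(50_000, 2_000_000, 1_000_000, 50_000_000))
print(is_eligible_substitute(150_000, 2_000_000, 1_000_000, 50_000_000))
```

Requiring both conditions keeps a little-played song by a very well-known artist out of the stimulus set, which matches the stated aim of avoiding recognition.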

Procedure

The experimental procedure consisted of two blocks of 20 trials, with and without cognitive load (counterbalanced across participants). In each block, trials consisted of pairs of musical stimuli (one of each genre), presented in a random order with a 1.5 s interstimulus interval. Participants listened to the stimuli over headphones (Beyerdynamic DT770). Upon hearing the second stimulus of each pair, participants were instructed to report their preference for one or the other extract (two-alternative forced choice), as well as a measure of their confidence in that preference (from 1 = not at all confident to 4 = very confident).

In the load condition, streams of colored (red, green, blue, yellow) digits scrolled on the screen during each trial. The stream started 3 s before the beginning of the first musical excerpt, and continued until participants were prompted for a confidence rating. This ensured that both listening and music preference were done under the concurrent task, while confidence judgments were provided without cognitive load. Participants were instructed to press a key when digit 5 was presented on the screen in either red, green or yellow, but to inhibit their response if it was presented in blue. Digit probability was set at 0.3 for digit 5, and 0.1 for digits 1–4, 6, and 8; color probability was 0.4 for blue, and 0.2 for red, green and yellow. Digits were displayed at a fixed period in the range 200–300 ms, calibrated for each participant using an adaptive procedure (see below). To increase task demands, a warning message was displayed at each detection error (miss or false alarm).
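The structure of the concurrent task can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the experiment code: all names are hypothetical, digits and colors are assumed to be drawn independently, and the digit probabilities are used as relative weights because the values listed in the text sum to 0.9 rather than 1.

```python
import random

# Probabilities as listed in the text; treated as relative weights
# (assumption), since the digit values sum to 0.9.
DIGIT_WEIGHTS = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.3, 6: 0.1, 8: 0.1}
COLOR_WEIGHTS = {"blue": 0.4, "red": 0.2, "green": 0.2, "yellow": 0.2}


def make_stream(n_items, rng):
    """Draw a stream of (digit, color) pairs, sampled independently."""
    digits = rng.choices(list(DIGIT_WEIGHTS),
                         weights=list(DIGIT_WEIGHTS.values()), k=n_items)
    colors = rng.choices(list(COLOR_WEIGHTS),
                         weights=list(COLOR_WEIGHTS.values()), k=n_items)
    return list(zip(digits, colors))


def is_go_target(digit, color):
    """Respond to a 5 in red, green or yellow; withhold for a blue 5."""
    return digit == 5 and color != "blue"


def score_response(digit, color, key_pressed):
    """Classify one item; misses and false alarms trigger the warning."""
    if is_go_target(digit, color):
        return "hit" if key_pressed else "miss"
    return "false_alarm" if key_pressed else "correct_rejection"


stream = make_stream(10, random.Random(0))
print(stream)
```

Separating the target rule from the scoring makes the inhibition component explicit: a blue 5 is the only item for which the digit matches but a key press counts as a false alarm.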

In the no-load block, the same string of digits was presented on the screen, but participants were instructed to simply ignore them and focus on the main task. The order of the blocks was counterbalanced across participants, and the stimuli were pseudorandomly assigned to one block or the other so that excerpts of the same songs appeared in different blocks.

The calibration procedure for digit search frequency was a two-up, one-down staircase, aiming for a 70% detection rate. The initial period was set at 500 ms, and the step size at 50 ms. Throughout the procedure, period values were constrained to a minimum of 200 ms and a maximum of 300 ms. The procedure stopped after 12 inversions, and the final period was computed as the average period over the last five steps.
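A staircase of this kind can be sketched as follows. The sketch rests on several assumptions that the text does not settle: two consecutive detections shorten the period (harder) and one miss lengthens it (easier); the 500 ms starting value is clamped to the 200–300 ms range; and "the last five steps" is read as the last five presented periods. The `detects` callable and the `max_trials` guard are hypothetical additions.

```python
from statistics import mean


def calibrate_period(detects, start_ms=500, step_ms=50,
                     lo_ms=200, hi_ms=300,
                     max_reversals=12, n_average=5, max_trials=1000):
    """Adaptive staircase on the digit display period (in ms).

    detects: callable taking a period and returning True on a detection.
    """
    def clamp(p):
        return max(lo_ms, min(hi_ms, p))

    period = clamp(start_ms)
    history = []            # period used on each trial
    hits_in_a_row = 0
    last_direction = 0      # -1 = got faster, +1 = got slower, 0 = none yet
    reversals = 0

    for _ in range(max_trials):   # guard against a never-reversing track
        if reversals >= max_reversals:
            break
        history.append(period)
        if detects(period):
            hits_in_a_row += 1
            if hits_in_a_row < 2:
                continue          # wait for a second consecutive hit
            hits_in_a_row = 0
            direction = -1        # two hits: shorten the period
        else:
            hits_in_a_row = 0
            direction = +1        # one miss: lengthen the period
        if last_direction and direction != last_direction:
            reversals += 1
        last_direction = direction
        period = clamp(period + direction * step_ms)

    return mean(history[-n_average:])


# Idealized observer who detects targets only at periods of 250 ms or more.
print(calibrate_period(lambda period_ms: period_ms >= 250))
```

With a real participant, `detects` would wrap a block of trials at the given period; the deterministic observer above is only there to make the sketch runnable.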

Preregistered analysis strategy

Participants' preferences over the 20 trials of each block were aggregated into a score of preference for metal, by dividing the number of metal songs preferred over their alternative pop songs by the total number of trials (20). We then tested the effect of participant group (between-participant, 2 levels: metal/control) and condition (within-participant, 2 levels: load/control) on preference for metal and confidence scores using a repeated-measures ANOVA (rmANOVA).
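As a worked example of the aggregation step, the preference score for one block is the proportion of trials on which the metal extract was chosen. The snippet below is a minimal sketch; the function name and the data layout (one "metal"/"pop" label per trial) are assumptions, not the authors' analysis code.

```python
def metal_preference_score(choices):
    """Proportion of trials in a block on which metal was preferred.

    choices: list with one entry per trial, each "metal" or "pop".
    """
    if not choices:
        raise ValueError("a block must contain at least one trial")
    return sum(choice == "metal" for choice in choices) / len(choices)


# Hypothetical block of 20 trials: metal preferred on 15 of them.
block = ["metal"] * 15 + ["pop"] * 5
print(metal_preference_score(block))  # 0.75
```

A score of 0.5 corresponds to no preference between genres, which is the chance level referred to in the discussion of the speeded responses.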

RESULTS

Predictably, there was a large main effect of participant group on preference, with metal fans expressing stronger preference for metal over pop music alternatives independently of cognitive load (Figure 4, top; mean difference of preference M = 0.50, 95% CI [0.42, 0.57], F(1, 32) = 84.19, p < .00001, ges = 0.67). There was a main effect of cognitive load on participants' response times and confidence, with slower (mean increase of RT M = 480 ms, 95% CI [280, 670], F(1, 32) = 11.46, p = .0019, ges = 0.09) and less confident (mean loss of confidence M = −0.18 pt on a 1–4 scale, 95% CI [−0.26, −0.09], F(1, 32) = 9.32, p = .004, ges = 0.03) responses made under load, suggesting that our experimental manipulation indeed loaded cognitive functions. However, there was no main effect of the cognitive load manipulation on metal preference (Figure 4, top; mean loss of preference M = −0.01, 95% CI [−0.05, 0.03], F(1, 32) = 0.12, p = .72) and, critically for our hypothesis, no significant interaction between group and cognitive load, F(1, 32) = 2.92, p = .09. Our preregistered analysis strategy therefore failed to reveal any effect of cognitive load on participant preference.

FIGURE 4.
Effect of cognitive load on preference for metal music, in both metal fans and non-fans (Experiment 3), top: all trials, bottom left: trials with slow responses, bottom right: trials with fast responses. While cognitive load had no effect on slow responses, the manipulation had an effect on preference responses when they were reported before the end of the second song (“fast responses”), with metal-fans reporting 36% less preference for metal over pop music while under load, while pop-fans did not show such a change in their musical preferences across the two conditions. Error bars, 95% CI on the mean.

In an additional exploratory analysis, we studied participant preference response times and found they were in fact bimodally distributed, with 25.8% of “fast” responses made while listening to the second song in a trial (before it was completely heard, or shortly thereafter, i.e., < 300 ms post-song), and 74.2% of “slow” responses made after both songs were completely heard (i.e., > 300 ms post-song). We then grouped preference scores in fast/slow response types, and found that, while no effect of cognitive load was observed in slow responses, the effect that we predicted initially was present in fast responses (Figure 4, bottom). For these trials, cognitive load reduced preference for metal in metal fans by 36% (mean loss of preference M = −0.36, 95% CI [−0.61, −0.11], t(12) = −3.18, p = .008), while it did not affect preference for pop music in the control group (mean change of preference M = 0.08, 95% CI [−0.18, 0.35], t(22) = 0.66, p = .51; see Note 2).
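The fast/slow split can be illustrated with a short sketch. It assumes that response times are expressed relative to the end of the second song (negative values meaning a response given before the song ended) and that a response at exactly 300 ms counts as slow; both conventions, like the function names and the example data, are hypothetical.

```python
from collections import defaultdict

FAST_CUTOFF_MS = 300  # cutoff after the end of the second song (from the text)


def response_type(rt_after_song_end_ms):
    """Label a response as 'fast' or 'slow' relative to the cutoff."""
    return "fast" if rt_after_song_end_ms < FAST_CUTOFF_MS else "slow"


def preference_by_response_type(trials):
    """Metal preference score computed separately for fast and slow trials.

    trials: iterable of (rt_after_song_end_ms, chose_metal) pairs.
    """
    counts = defaultdict(lambda: [0, 0])   # type -> [metal choices, trials]
    for rt, chose_metal in trials:
        bucket = counts[response_type(rt)]
        bucket[0] += int(chose_metal)
        bucket[1] += 1
    return {kind: metal / total for kind, (metal, total) in counts.items()}


# Made-up trials: two fast responses and two slow ones.
example = [(-1200, True), (150, False), (900, True), (2500, True)]
print(preference_by_response_type(example))
```

Because not every participant produces both response types in both load conditions, the per-type scores are unbalanced across participants, which is the situation addressed in Note 2.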

DISCUSSION

Our dual-task paradigm with a taxing visual digit-search task was successful in creating cognitive load, as evidenced by 480 ms slower and less confident reports of musical preference in the concurrent music listening task. This pattern of results is weaker than, but consistent with, previous paradigms of the same kind: with a slightly faster rate of digit display (140 ms) but a simpler task (without inhibiting targets of certain colors) and a different domain of evaluation (moral choices), Greene et al. (2008) report a 750 ms increase of response time; in Lee, Lee, and Ng Boyle (2007), a concurrent auditory task created a loss of confidence in visual judgments, with an effect size (d = 0.5) also greater than what we find here.

However, our data provided only limited evidence for the role of cognitive load in evaluating preference for metal music. We found no effect of cognitive load on participants' preference judgments for extracts of the metal or pop music genre in our preregistered analysis strategy. Our hypothesized effect of load was only found when we restricted the analysis to those trials in which participants answered rapidly (before the two extracts of a pair had been played in full).

While this concerns only 25% of the data, the fact that cognitive load impacted only fast responses is not incompatible with the literature. In Van Dillen and van Steenbergen (2018), participants were time-limited and pressed to respond quickly to loaded trials (pictures of edible vs. non-edible food) to prevent them from engaging in avoidant gaze strategies that could reduce interference with the digit-span task; in Van der Wal and Van Dillen (2013), they were instructed to drink liquid samples all at once before evaluating them. That cognitive load did not interfere with slower, self-paced responses may indicate that our visual cognitive-load task only had a relatively moderate impact on executive functions, and that slow trials correspond to those in which the cognitive load was only partial and did not prevent our participants from engaging higher-order cognition during their judgment (Lavie, 2010). It is also possible that load interfered as expected with sensory processing during listening, but that additional time taken after the direct experience of the stimuli allowed participants to engage in additional cognitive processes, such as semantic or autobiographic memory (e.g., “this is metal, and I like metal”), that may not have been impacted by our cognitive-load task.

Importantly though, an alternative explanation for the fact that cognitive load reduced preference for metal in metal fans' fast responses is that load simply made participants unable to do the task: while speeded preference for metal music in metal fans was degraded under load to 0.42 (i.e., on average they preferred pop to metal), this proportion did not significantly differ from the 0.5 chance level. However, this alternative interpretation is not really compatible with the fact that load did not degrade preference for pop music in the control group. Another possibility is that it was speeded judgments, rather than load, which “regressed” preferences toward the mean, but this interpretation is also made unlikely by the fact that, even in these responses, metal fans had a marked preference for metal in the no-load condition.

Further work should attempt to replicate this pattern of data with a paradigm involving higher cognitive load, and/or speeded responses of music preferences.

General Discussion: Towards a Higher-order Theory of the Emotional Experience of Music

While it is generally admitted that the cognition of musical signals is continuous with that of generic auditory signals (Schlenker, 2017) and that, in particular, the emotional appraisal of music largely builds on innately programmed, primary subcortical brain systems evolved to respond to animal signaling (Blumstein et al., 2012), human prosody (Juslin & Laukka, 2003) and environmental cues (Ma & Thompson, 2015), the case of appreciation for extreme metal music seems a theoretical conundrum (Thompson et al., 2018). It could be that metal fans differ from non-fans in how they process threat signals at the subcortical level, showing deactivated or reconditioned responses that differ from controls, a view that has led some to call appreciation for violent music a psycho-social dysfunction (Bodner & Bensimon, 2015; Stack et al., 1994; Sun et al., 2017). However, from a more recent higher-order perspective of emotional experience (LeDoux & Brown, 2017), it is also possible that fans' appreciation for metal reflects the modulation/inhibition by the cortical circuits of higher-order cognition of an otherwise normal low-level response to auditory threat. In the first two experiments, we have shown that, at the perceptual and affective levels, metal fans in fact react equally negatively (Experiment 1), equally fast, and perhaps even more accurately (Experiment 2) compared with non-fans to acoustic roughness (one prominent cue to auditory threat) in vocal and instrumental contexts. In Experiment 3, we tested the converse hypothesis that cognitive load reduces fans' appreciation of metal to the level experienced by non-fans. Primary evidence did not allow us to conclude that this was the case, except perhaps in one exploratory subset of the data (fast responses). Nevertheless, taken together, these results provide no support for the idea that lovers of extreme music enjoy it because of a different low-level response to threat, and highlight the potential for a contribution of higher-order, controlled cognitive processes in their aesthetic experience.

While these results have implications for a growing corpus of psychological studies of metal music (Bodner & Bensimon, 2015; Gowensmith & Bloom, 1997; Olsen, Thompson, & Giblin, 2018; Sun et al., 2017; Thompson et al., 2018), notably confirming that viewing metal as dysfunctional “problem music” is empirically untenable, implications for the general theory of musical emotions are, in our view, even greater. They shape a model of musical emotions that significantly extends the traditional view, and in which the cortical and subcortical signals sent by affective and sensory systems (auditory thalami, auditory cortices) do not simply feed forward relatively unaltered to associative cortices (following, e.g., the right temporal-frontal pathway of emotional prosody processing; Schirmer & Kotz, 2006), but can also be thoroughly modified/inhibited by the circuits of higher-order cognition, to the point of creating emotional experiences (e.g., here, liking the music; in Thompson et al., 2018, the experience of peace or joy) that appear to contradict the low-level cues that serve as input to these evaluations (e.g., here, acoustic roughness). What is significant in the present pattern of results is that behavioral signatures of both types of responses co-exist in the system: metal fans exhibit both “typical” low-level processes that appraise rough sounds as negative and worthy of immediate attention (Experiments 1 and 2) and higher-order systems able to assert cognitive control over these responses and produce positive emotional experiences (Thompson et al., 2018, and, tentatively here, Experiment 3).

This model suggests that there is, in fact, a hierarchy of emotional experiences to music. Some, like that of rejecting metal music as threatening and violent, are strongly conditioned by low-level systems and flow relatively unaltered into conscious awareness. Others, like appreciating metal, are significantly reshaped by cognitive control and culturally situated learning. It is perhaps ironic that positive responses to metal, once dismissed as dysfunctional or unsophisticated, may be among the most cognitively refined in this spectrum of experiences. Other reactions of the same nature may include positive reactions to sad music (Vuoskoski, Thompson, McIlwain, & Eerola, 2012), or negative emotions to entraining, happy music (e.g., “Even if some culturally-determined part of your mind is saying ‘I hate this song,’ your body will ecstatically sing along with Debby Boone in ‘You light up my life’”; Oswald, 2000).

This idea that low-level responses shaped by evolution, and higher-order responses shaped by the social environment, coexist and interact provides a unified framework to think about the interactions between biological and cultural evolution in the human musical experience (Bryant, 2013): sound patterns such as the distorted guitar sounds and harsh vocals of metal music exploit evolved perceptual response biases manifested in first-order systems, but then take on distinct/controlled emotional values through cultural evolutionary processes, reflected in higher-order responses. This model also brings musical emotions in line with modern constructionist views of emotions (Barrett, 2017; Cespedes-Guevara & Eerola, 2018), for which the emotional experience is a psychological event constructed from more basic “core affect” and higher-level conceptual knowledge. For fans of violent or sad music, the psychological construction of a positive experience from negatively valenced sensory cues may be similar to that of constructing “invigorating fear” from a roller-coaster ride or “peaceful sadness” from enjoying a moment of solitude after a busy day (Wilson-Mendenhall, Barrett, & Barsalou, 2013).

More importantly, several predictions can be made from this model. First, because they implicate additional cognitive resources and less direct sensory evidence, one might expect that higher-order musical experiences such as preference for metal or sad music should be both slower and less confident than lower-order musical experiences, e.g., dislike for metal or preference for pop music. In our data (Experiment 3), although this may reflect population rather than meaningful differences, judgments of preference for metal in metal fans were nonsignificantly slower (M = −239 ms, 95% CI [−618 ms, +139 ms], t(32) = 1.28, p = .20) but significantly less confident (M = −0.37, 95% CI [−0.69, −0.05], t(32) = −2.37, p = .02) than judgments of preference for pop in non-fans. Further work should examine these differences in a within-subject, one-interval task more appropriate to measuring reaction times. Second, because first- and higher-order responses are assumed to coexist during the emotional experience, one would expect to measure physiological reactions (e.g., pupil dilation; Oliva & Anikin, 2018) or neural activity (in, for example, the amygdala; Arnal et al., 2015) indexing normal response to threat in both fans and non-fans (i.e., relatively independently of the listener's positive or negative emotional evaluation of metal music). Third, because executive functions involved in cognitive control are implemented in frontal lobe regions (Duncan & Owen, 2000), one would expect that positive higher-order emotional reactions to, for example, violent or sad music should be degraded to more direct aversive responses with experimental manipulations such as transcranial magnetic stimulation to the dorsolateral prefrontal cortex (Tassy et al., 2011), or during sleep. Finally, at the population level, appreciation for metal, because it implicates controlled cognitive processes and executive functions, may be correlated with greater capacity for emotional regulation, just like appreciation for sad music may be correlated with greater trait empathy (Vuoskoski et al., 2012). In Thompson et al. (2018, p. 10), four of the seven mood regulation strategies measured from the Brief Music in Mood Regulation Scale (B-MMR) were higher for fans of death metal relative to non-fans.

Finally, while some aspects of data from Experiment 3 provided tentative evidence of controlled processes in the appreciation of metal, and less so in pop music, our results leave open many possibilities concerning the nature or timing of these processes. First, they leave the notion of “cognitive load” relatively under-specified. Our task, a speeded digit search, loads both executive functions involved in updating (attention to novel digits) and inhibiting (inhibiting responses to targets of one specific color), but not, for example, in task switching (Miyake et al., 2000), and it is unclear which of these processes specifically contributes to the construction of the emotional experience. Second, the present results do not address the appraisal mechanisms that govern the emotional responses that, according to our theory, support liking metal. Processes inhibited in Experiment 3 could involve, for example, focusing one's attention on features of the music other than threatening cues (e.g., treating growling vocals as a non-emotionally-significant singing style, and focusing instead on words or melody; Olsen et al., 2018), engaging in psychological distancing (e.g., evaluating metal sounds as a virtual threat that presents no actual danger to personal safety; Menninghaus et al., 2017), establishing an aesthetic judgmental attitude (Brattico & Vuust, 2017), or recontextualizing cues of violence as not directed toward the self, but from the self toward a hypothetical other (for a discussion of how these different levels may overlap, see also Thompson & Olsen, 2018). Finally, here we took music preference as a proxy for emotional experience, but preference is mediated by many variables other than a positive affective response, including imaginal and analytical responses (Lacher & Mizerski, 1994), which all could have been affected by our load manipulation. Further work should therefore attempt to replicate the effect of cognitive load on more direct and varied measures of emotional experience.

Notes

1. It should be noted that the parent genre of metal music includes a great variety of subgenres, which may differ in their use of the type of cues studied in this work (e.g., a lot of vocal roughness in the death or black metal subgenres, and none in prog metal). While we refer here to metal and extreme music interchangeably, future work on the question should probably personalize sub-metal preferences per participant.
2. A statistical note: analyses in the slow and fast response subgroups were done with independent rather than paired t-tests between the load and no-load condition (despite some amount of shared variance within some of the participants), because not all participants had slow and fast responses in both load conditions. If restricting the analysis to those participants who had fast responses in both load conditions, a repeated-measures ANOVA showed a similar group × load interaction, F(1, 11) = 7.01, p = .022, and a similar reduction of preference for metal in the metal group (mean loss of preference M = −0.40, 95% CI [−0.70, −0.09]), but that group included only four metal fans.

References

Abbey, E. J., & Helb, C. (Eds.). Hardcore, punk, and other junk: Aggressive sounds in contemporary music. Lanham, MD: Lexington Books.
Abitbol, R., Lebreton, M., Hollard, G., Richmond, B. J., Bouret, S., & Pessiglione, M. (2015). Neural mechanisms underlying contextual dependency of subjective values: Converging evidence from monkeys and humans. Journal of Neuroscience, 35, 2308–2320.
Anikin, A., Båth, R., & Persson, T. (2018). Human non-linguistic vocal repertoire: Call types and their meaning. Journal of Nonverbal Behavior, 42(1), 53–80.
Arnal, L. H., Flinker, A., Kleinschmidt, A., Giraud, A. L., & Poeppel, D. (2015). Human screams occupy a privileged niche in the communication soundscape. Current Biology, 25(15), 2051–2056.
Asutay, E., & Västfjäll, D. (2017). Exposure to arousal-inducing sounds facilitates visual search. Scientific Reports, 7(1). DOI:
Barrett, L. F. (2017). The theory of constructed emotion: An active inference account of interoception and categorization. Social Cognitive and Affective Neuroscience, 12(1), 1–23.
Belin, P., & Zatorre, R. J. (2015). Neurobiology: Sounding the alarm. Current Biology, 25(18), R805–R806.
Bellogin, A., de Vries, A. P., & He, J. (2013). Artist popularity: Do web and social music services agree? Proceedings of the Seventh International AAAI Conference on Weblogs and Social Media. Cambridge, MA: AAAI.
Blair, M. E., & Shimp, T. A. (1992). Consequences of an unpleasant experience with music: A second-order negative conditioning perspective. Journal of Advertising, 21(1), 35–43.
Blood, A. J., & Zatorre, R. J. (2001). Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proceedings of the National Academy of Sciences, 98(20), 11818–11823.
Blumstein, D. T., Bryant, G. A., & Kaye, P. (2012). The sound of arousal in music is context-dependent. Biology Letters, 8(5), 744–747.
Blumstein, D. T., & Recapet, C. (2009). The sound of arousal: The addition of novel non-linearities increases responsiveness in marmot alarm calls. Ethology, 115(11), 1074–1081.
Bodner, E., & Bensimon, M. (2015). Problem music and its different shades over its fans. Psychology of Music, 43(5), 641–660.
Brattico, E., & Vuust, P. (2017). The urge to judge: Why the judgmental attitude has anything to do with the aesthetic enjoyment of negative emotions. Behavioral and Brain Sciences, 40, e353.
Brown, A. R., Spracklen, K., Kahn-Harris, K., & Scott, N. (Eds.). (2016). Global metal music and culture: Current directions in metal studies. Abingdon-on-Thames, United Kingdom: Routledge.
Bryant, G. A. (2013). Animal signals and emotion in music: Coordinating affect across groups. Frontiers in Psychology, 4. https://doi.org/10.3389/fpsyg.2013.00990
Bryson, B. (1996). “Anything but heavy metal”: Symbolic exclusion and musical dislikes. American Sociological Review, 61(5), 884–899.
Cespedes-Guevara, J., & Eerola, T. (2018). Music communicates affects, not basic emotions - A constructionist account of attribution of emotional meanings to music. Frontiers in Psychology, 9. DOI:
Den, M. L., Graham, B. M., Newall, C., & Richardson, R. (2015). Teens that fear screams: A comparison of fear conditioning, extinction, and reinstatement in adolescents and adults. Developmental Psychobiology, 57(7), 818–832.
Diano, M., Celeghin, A., Bagnis, A., & Tamietto, M. (2017). Amygdala response to emotional stimuli without awareness: Facts and interpretations. Frontiers in Psychology, 7. DOI:
Duncan, J., & Owen, A. M. (2000). Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends in Neurosciences, 23(10), 475–483.
Erk, S., Kleczar, A., & Walter, H. (2007). Valence-specific regulation effects in a working memory task with emotional context. Neuroimage, 37(2), 623–632.
Escoffier, N., Zhong, J., Schirmer, A., & Qiu, A. (2013). Emotional expressions in voice and music: Same code, same effect? Human Brain Mapping, 34(8), 1796–1810.
Fanselow, M. S., & Pennington, Z. T. (2018). A return to the psychiatric dark ages with a two-system framework for fear. Behaviour Research and Therapy, 100, 24–29.
Feinstein, J. S., Buzza, C., Hurlemann, R., Follmer, R. L., Dahdaleh, N. S., Coryell, W. H., et al. (2013). Fear and panic in humans with bilateral amygdala damage. Nature Neuroscience, 16(3), 270–272.
Fitch, W. T., Neubauer, J., & Herzel, H. (2002). Calls out of chaos: The adaptive significance of nonlinear phenomena in mammalian vocal production. Animal Behaviour, 63(3), 407–418.
Gentilucci, M., Ardaillon, L., & Liuni, M. (2018). Vocal distortion and real-time processing of roughness. Proceedings of the International Computer Music Conference (ICMC). Daegu, Korea: International Computer Music Association.
Gilbert, D. T., Tafarodi, R. W., & Malone, P. S. (1993). You can't not believe everything you read. Journal of Personality and Social Psychology, 65(2), 221–233.
Gowensmith, W. N., & Bloom, L. J. (1997). The effects of heavy metal music on arousal and anger. Journal of Music Therapy, 34(1), 33–45.
Greene, J. D., Morelli, S. A., Lowenberg, K., Nystrom, L. E., & Cohen, J. D. (2008). Cognitive load selectively interferes with utilitarian moral judgment. Cognition, 107(3), 1144–1154.
Ilie, G., & Thompson, W. F. (2006). A comparison of acoustic cues in music and speech for three dimensions of affect. Music Perception, 23, 319–330.
Juslin, P. N., & Laukka, P. (2003). Communication of emotions in vocal expression and music performance: Different channels, same code? Psychological Bulletin, 129(5), 770–814.
Juslin, P. N., & Västfjäll, D. (2008). Emotional responses to music: The need to consider underlying mechanisms. Behavioral and Brain Sciences, 31(5), 559–575.
Labbé, E., Schmidt, N., Babin, J., & Pharr, M. (2007). Coping with stress: The effectiveness of different types of music. Applied Psychophysiology and Biofeedback, 32(3–4), 163–168.
Lacher, K. T., & Mizerski, R. (1994). An exploratory study of the responses and relationships involved in the evaluation of, and in the intention to purchase new rock music. Journal of Consumer Research, 21(2), 366–380.
Lavie, N. (2010). Attention, distraction, and cognitive control under load. Current Directions in Psychological Science, 19(3), 143–148.
LeDoux, J. E., & Brown, R. (2017). A higher-order theory of emotional consciousness. Proceedings of the National Academy of Sciences, 114(10), E2016–E2025.
LeDoux, J. E., & Pine, D. S. (2016). Using neuroscience to help understand fear and anxiety: A two-system framework. American Journal of Psychiatry, 173(11), 1083–1093.
Lee, Y. C., Lee, J. D., & Ng Boyle, L. (2007). Visual attention in driving: The effects of cognitive load and visual disruption. Human Factors, 49(4), 721–733.
Ma, W., & Thompson, W. F. (2015). Human emotions track changes in the acoustic environment. Proceedings of the National Academy of Sciences, 112(47), 14563–14568.
McAlpin, C. (1925). Is music the language of the emotions? The Musical Quarterly, 11(3), 427–443.
McCarthy, L., & Olsen, K. N. (2017). A ‘looming bias’ in spatial hearing? Effects of acoustic intensity and spectrum on categorical sound source localization. Attention, Perception, and Psychophysics, 79, 352–362.
McCrory, E. J., De Brito, S. A., Kelly, P. A., Bird, G., Sebastian, C. L., Mechelli, A., et al. (2013). Amygdala activation in maltreated children during pre-attentive emotional processing. The British Journal of Psychiatry, 202(4), 269–276.
Menninghaus, W., Wagner, V., Hanich, J., Wassiliwizky, E., Jacobsen, T., & Koelsch, S. (2017). The distancing-embracing model of the enjoyment of negative emotions in art reception. Behavioral and Brain Sciences, 40. DOI:
Miu, A. C., & Baltes, F. R. (2012). Empathy manipulation impacts music-induced emotions: A psychophysiological study on opera. PloS One, 7(1), e30618.
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41(1), 49–100.
Oliva, M., & Anikin, A. (2018). Pupil dilation reflects the time course of emotion recognition in human vocalizations. Scientific Reports, 8(1). DOI:
Olsen, K. N., Thompson, W. F., & Giblin, I. (2018). Listener expertise enhances intelligibility of vocalizations in death metal music. Music Perception, 35, 527–539.
Opolko, F. J., & Wapnick, J. (1989). McGill University Master Samples: MUMS. Montreal, Canada: McGill University.
Oswald, J. (2000). Plunderstanding ecophonomics: Strategies for the transformation of existing music - An interview by Norm Igma with John Oswald. In J. Zorn (Ed.), Arcana: Musicians on music (pp. 9–17). New York: Granary Books.
Pachet, F., & Cazaly, D. (2000). A taxonomy of musical genres. Proceedings of Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications, RIAO). Paris, France: Centre des Hautes Etudes Internationales d'Informatique Documentaire.
Panksepp, J. (2004). Affective neuroscience: The foundations of human and animal emotions. Oxford, United Kingdom: Oxford University Press.
Patel, A. D. (2010). Music, language, and the brain. Oxford, United Kingdom: Oxford University Press.
Pessoa, L., McKenna, M., Gutierrez, E., & Ungerleider, L. G. (2002). Neural processing of emotional faces requires attention. Proceedings of the National Academy of Sciences, 99(17), 11458–11463.
Rea, C., MacDonald, P., & Carnes, G. (2010). Listening to classical, pop, and metal music: An investigation of mood. Emporia State Research Studies, 46, 1–3.
Schirmer, A., & Kotz, S. A. (2006). Beyond the right hemisphere: Brain mechanisms mediating vocal emotional processing. Trends in Cognitive Sciences, 10(1), 24–30.
Schlenker, P. (2017). Outline of music semantics. Music Perception, 35, 3–37.
Sharman, L., & Dingle, G. A. (2015). Extreme metal music and anger processing. Frontiers in Human Neuroscience, 9. https://doi.org/10.3389/fnhum.2015.00272
Siegel, P., Warren, R., Wang, Z., Yang, J., Cohen, D., Anderson, J. F., et al. (2017). Less is more: Neural activity during very brief and clearly visible exposure to phobic stimuli. Human Brain Mapping, 38(5), 2466–2481.
Stack, S., Gundlach, J., & Reeves, J. L. (1994). The heavy metal subculture and suicide. Suicide and Life-Threatening Behavior, 24(1), 15–23.
Sun, Y., Lu, X., Williams, M., & Thompson, W. F. (2019). Implicit violent imagery processing among fans and non-fans of music with violent themes. Royal Society Open Science, 6. https://doi.org/10.1098/rsos.181580
Sun, Y., Zhang, C., Duan, S., Du, X., & Calhoun, V. D. (2017). Altered resting-state functional connectivity of default-mode network and sensorimotor network in heavy metal music lovers. Neuroreport, 30(5), 317–322.
Tassy, S., Oullier, O., Duclos, Y., Coulon, O., Mancini, J., Deruelle, C., et al. (2011). Disrupting the right prefrontal cortex alters moral judgement. Social Cognitive and Affective Neuroscience, 7(3), 282–288.
Thompson, W. F., Geeves, A. M., & Olsen, K. N. (2018). Who enjoys listening to violent music and why? Psychology of Popular Media Culture, 8(3), 218–232.
Thompson, W. F., & Olsen, K. N. (2018). On the enjoyment of violence and aggression in music. Comment on “An integrative review of the enjoyment of sadness associated with music” by Tuomas Eerola et al. Physics of Life Reviews, 25, 128–130.
Tsai, C. G., Wang, L. C., Wang, S. F., Shau, Y. W., Hsiao, T. Y., & Auhagen, W. (2010). Aggressiveness of the growl-like timbre: Acoustic characteristics, musical implications, and biomechanical mechanisms. Music Perception, 27, 209–222.
Van der Wal, R. C., & van Dillen, L. F. (2013). Leaving a flat taste in your mouth: Task load reduces taste perception. Psychological Science, 24(7), 1277–1284.
Van Dillen, L. F., Heslenfeld, D. J., & Koole, S. L. (2009). Tuning down the emotional brain: An fMRI study of the effects of cognitive load on the processing of affective images. Neuroimage, 45(4), 1212–1219.
Van Dillen, L. F., & van Steenbergen, H. (2018). Tuning down the hedonic brain: Cognitive load reduces neural responses to high-calorie food pictures in the nucleus accumbens. Cognitive, Affective, and Behavioral Neuroscience, 18(3), 447–459.
Vuilleumier, P., Armony, J. L., Driver, J., & Dolan, R. J. (2001). Effects of attention and emotion on face processing in the human brain: An event-related fMRI study. Neuron, 30(3), 829–841.
Vuoskoski, J. K., & Eerola, T. (2017). The pleasure evoked by sad music is mediated by feelings of being moved. Frontiers in Psychology, 8. https://doi.org/10.3389/fpsyg.2017.00439
Vuoskoski, J. K., Thompson, W. F., McIlwain, D., & Eerola, T. (2012). Who enjoys listening to sad music and why? Music Perception, 29, 311–317.
Weinstein, D. (2000). Heavy metal: The music and its culture. Boston, MA: Da Capo Press.
Whalen, P. J., Kagan, J., Cook, R. G., Davis, F. C., Kim, H., Polis, S., et al. (2004). Human amygdala responsivity to masked fearful eye whites. Science, 306(5704), 2061.
Wilson-Mendenhall, C. D., Barrett, L. F., & Barsalou, L. W. (2013). Neural evidence that human emotions share core affective properties. Psychological Science, 24(6), 947–956.

Appendix A

Song Extracts Used as Stimuli in Experiment 3

METAL GROUP

Deez Nuts: Purgatory ©2017 Century Media Records

Enterprise Earth: Shroud of Flesh ©2017 Stay Sick Recordings

Veil of Maya: Fracture ©2017 Sumerian Records

Lordi: How to Slice a Whore ©2014 AFM Records

Testament: The Pale King ©2016 Nuclear Blast

The Color Morale: When One was Desolate ©2009 Rise Records

Heaven & Hell: I - Live ©2007 Rhino Entertainment

Erra: Skyline ©2016 Sumerian Records

Carcass: Edge of Darkness ©1996 Earache Records

Soil: Way Gone ©2017 Pavement Entertainment

Coal Chamber: Entwined ©1999 Woah Dad!

Between The Buried and Me: The Coma Machine ©2015 Metal Blade Records

Testament: Trails of Tears ©1994 Atlantic

Coal Chamber: Beckoned ©2002 Woah Dad!

Deathstars: Death Is Wasted On the Dead ©2014 Deathstars

Death: Spirit Crusher ©2011 Relapse Records

Miss May I: Never Let Me Stay ©2017 Sharptone

Nonpoint: Be Enough ©2016 Spinefarm Records

Allegaeon: From Nothing ©2016 Metal Blade Records

Miss May I: Crawl ©2017 Sharptone

CONTROL GROUP

Zaho: Te amo ©2013 Parlophone Records

Lea Michele: Empty Handed ©2013 Columbia Records

Loreen: Statements ©2017 Warner Music

Benjamin Ingrosso: Dance You Off ©2017 Record Company

S Club 7: I'll Keep Waiting ©2000 Polydor Ltd

Hilary Duff: Rebel Hearts ©2005 Hollywood Records

Amerie: Hatin' On You ©2002 Sony Music Entertainment

Vérité: Solutions ©2017 Vérité

Elder Island: Key One ©2016 Elder Island

Superbus: On the River ©2012 Polydor

S Club 7: Dance Dance Dance ©2001 Polydor Ltd

Tété: Chanteur Sous Vide ©2016 Fffartworks

Tony! Toni! Toné!: My Ex-Girlfriend ©1993 PolyGram

Athlete: Airport Disco ©2016 Chrysalis Records

Thirteen Senses: Thru The Glass ©2004 Mercury Records

Hollysiz: Rather Than Talking ©2017 Hamburger Records

Vérité: Somewhere in Between ©2017 Vérité

RuPaul: Kitty Girl ©2017 RuCo

RuPaul & Ellis Miah: Just a Lil In & Out ©2017 RuCo

Supplementary Materials

The Supplementary Materials accompanying this article online at mp.ucpress.edu include:

  1. Pre-registration document (in French) submitted to the École Normale Supérieure (ENS) Cogmaster office; decision dated February 1, 2018.

  2. Experimental stimuli for Experiments 1 and 2: 12 vocal recordings, each in two variants (natural/rough). Vocal samples were recorded by the authors. File names are coded as speaker_phoneme_pitch_condition.wav.