We studied memory for harmony using a melody-and-accompaniment texture and 10 commercially successful songs of Western popular music. The harmony was presented as a timbrally matching block-chord accompaniment to digitally isolated vocals. We used three test-chord variants: the target was harmonically identical to the original chord, the lure was schematically plausible but different from the original, and the clash conflicted with both the tonal center and the local pitches of the melody. We used two conditions: in the one-chord condition we presented only the test chord, while in the all-chords condition the test chord was presented together with all the chords of the original excerpt. One hundred and twenty participants with varying levels of music training rated on a seven-point scale whether the test chord was the original. We analyzed the results along two dimensions of memory: veridical–schematic and specialized–general. The target chords were rated higher on average than the lures and considerably higher than the clash chords. Schematic memory (knowledge of Western tonal harmony) appeared to be important for rating the test chords in the all-chords condition, while veridical memory (familiarity with the songs) was especially important for rating the lure chords in the one-chord condition.
The melody-and-accompaniment texture is one of the most common textures in Western tonal music (Arthur, 2017; Bharucha, 1984; Huron, 2016; Tagg, 2000). Although this texture is sometimes used in music cognition experiments, only a handful of studies have used it to study memory for harmony (Creel, 2011; Cullimore, 1999; Povel & Van Egmond, 1993). Avoiding melody-and-accompaniment textures reduces the ecological validity of studies on harmony. Further, since melody tends to be more perceptually salient than harmony (Williams, 2005), it is important to study how harmony can be perceived and remembered when chords serve as background to a clearly defined melody.
Earlier studies have shown that the accompaniment affects the listening experience even if it is perceptually subordinated to the melody. Accompanied melodies can be better enjoyed (Galizio & Hendrick, 1972) and can lead to greater emotional arousal (Schotanus, 2020) and a more intense perception of sadness (Friedman, 2019) than their unaccompanied versions. However, the perceptual subordination often means that some of the most perceptible effects of the accompaniments are experienced as changes in the way the melody is heard (Bernstein, 1976, p. 63; Tagg, 2000). Considering the salience of the main melody and the tendency for the accompaniment to be experienced not in itself but via its influence on the melody, it is reasonable to suspect that the accompaniment may be remembered less clearly than the melody. On the other hand, the melody itself could become a cue to remembering the accompaniment (Creel, 2011). This said, we think that various melody-and-accompaniment textures can be particularly helpful for testing listeners’ ability to remember the chords of the pieces of music they know well (hereafter veridical memory or veridical knowledge). This is because the melody can both activate the memories of the song and unambiguously point listeners’ attention to specific moments in time within a song. Further, melody can also imply harmony. To imagine harmony implied by the melody, the listeners do not need to be familiar with the specific piece of music, since it can suffice if they are familiar with the style or genre (hereafter schematic memory or schematic knowledge): in other words, they can use their schematic knowledge of harmony for imagining the possible harmonic accompaniment of a melody.
Veridical and Schematic Memory for Harmony
Most studies related to memory for harmony have focused on the syntactic aspects of harmony (for a review, see Pearce & Rohrmeier, 2018). These studies have found some evidence that Western-enculturated listeners, even those without music training, possess schematic knowledge about harmony, such as tonal hierarchies (Krumhansl, 1990) and chord transitions (Lhost & Ashley, 2006; Vuvan & Hughes, 2019), and that their perception of harmony is affected by the relative frequency of occurrence of certain chord types (Jimenez, Kuusi, Czedik-Eysenberg, & Reuter, 2021; Jimenez, Kuusi, & Ojala, 2022). There is also some evidence that this type of knowledge can be acquired implicitly (via mere exposure to music), stored in long-term auditory memory, and activated automatically when listening to music, informing harmonic expectations (Vuvan & Hughes, 2019). Less research has been done on listeners’ ability to anticipate harmonic events based on veridical memories of harmony. A few studies have investigated whether veridical memories can override the effect of schematic knowledge on harmonic perception. In these studies, veridical memories have been created during the experiment by familiarizing the participants with a song-like musical passage created for the experiment (Creel, 2011), and especially by using block-chord progressions (Tillmann & Bigand, 2010; Pagès-Portabella, Bertolo, & Toro, 2021), one of the simplest and most common ways to instantiate harmony in empirical research (Pearce & Rohrmeier, 2018). In other studies, veridical knowledge has been created by informing the participants about the forthcoming chord (Guo & Koelsch, 2016; Justus & Bharucha, 2001). These studies have shown that veridical memories can decrease but not completely override the effect of schematic knowledge on harmonic perception. However, the generalizability of the results is limited by the brevity of the familiarization with the stimuli and, with the exception of Creel (2011), by the use of block chords as opposed to melody-and-accompaniment textures. Schubert and Pearce (2015, p. 367) point out that song-specific mental representations of harmonic patterns are more firmly established in long-term memory after weeks or years of repeated exposure than after relatively short periods of exposure within experimental sessions. This view is consistent with empirical findings regarding the effect of multiple exposures (Szpunar et al., 2004) and consolidation periods (Marshall & Born, 2007; Miles et al., 2016; Morgan-Short et al., 2012) on memory tasks.
More recent studies that have focused on veridical memory for harmony have taken advantage of participants’ already-existing extensive familiarity with commercially successful songs, improving the ecological validity in terms of musical texture and familiarization. These studies have shown that the listeners are able to identify songs from their chord progressions even when melodic, rhythmic, timbral, and textural cues are missing, a type of retrieval task that relies almost exclusively on veridical memory for harmony (Jimenez & Kuusi, 2018, 2020; Kuusi et al., 2021).
Although these studies show that simple block-chord textures can activate memories of songs that use melody-and-accompaniment textures, very few studies on memory for harmony have used melody-and-accompaniment textures as stimuli, and none of them have studied both veridical and schematic memory using extensive real-life familiarity with the tested music. Povel and Van Egmond (1993) noticed that participants rated a melody as similar to itself even though the harmony of the different accompaniments was not the same, suggesting that participants can at least partially ignore the accompanying harmony in short-term memory tasks. Cullimore (1999) used excerpts from a piano piece by W. A. Mozart either as original, with stylistically acceptable harmonic modifications in the accompaniment, or with modifications in the main melody that did not modify the harmony. Participants rated the excerpts as being more different from each other when the modifications occurred in the main melody than in the accompaniment, even when the participants had played the piece in the past. This suggests that the auditory memory for the harmonic elements of the accompaniment might lack details and vividness. Creel (2011) showed that participants can store auditory long-term memory related to the general harmonic characteristics of the accompaniment and that these memories could be activated by listening to the melody without the accompaniment even when the pitches of the melody did not fully determine the key. Yet, it seems that information concerning specific chords in the accompaniment is more difficult to store than general harmonic information (Cullimore, 1999).
In order to investigate listeners’ ability to remember the specific chords of an accompaniment (veridical memory) and their ability to assess chords when they have not heard the songs before (schematic memory), our current study combined elements from Creel (2011) and an experiment on chord substitution by Lhost and Ashley (2006), the only attempt so far to test the perception of chord substitution in the context of a specific musical style. We took Lhost and Ashley’s study as a point of departure by using stylistically acceptable and unacceptable chords together with the original chords. However, while Lhost and Ashley’s study investigated musicians’ ability to assess block chords in the context of the 12-bar blues progression, a style-specific harmonic schema, we set out to investigate participants’ ability to assess the chords of specific songs. This type of task more directly involves veridical memory and does not necessarily require music training.
Additionally, we adopted from Creel (2011) the use of melody-and-accompaniment textures instead of block-chord progressions. However, we used digitally isolated vocals from pre-existing commercially successful songs instead of song-like stimuli created for the experiment. By these choices we were able to have naturalistic stimuli that provided participants with a rich harmonic palette and multiple cues for retrieval (e.g., lyrics and nuance of vocal timbre and interpretation). Hence, we were able to use ecologically valid stimuli to deepen our understanding of how chord substitutions are perceived and to gain knowledge about how well the harmony of the accompaniment is remembered.
Aim
Our study investigated participants’ veridical and schematic memory for harmony using melody-and-accompaniment textures. More specifically, we examined the participants’ ability to determine whether the chords accompanying isolated vocals of commercially successful songs were the original ones (targets) or had been substituted either by schematically plausible lures or by non-matching clashes. We expected that veridical knowledge (familiarity with the song) would be important for distinguishing the original chord from the lure, while schematic knowledge (familiarity with the musical style and with Western tonal music in general) was expected to be important for distinguishing the clash. Further, we examined how participants who did not know the songs (and therefore could not use veridical knowledge) used schematic knowledge when comparing the lure and the target. For further information on the chords, see “Stimuli.”
We also studied the role of participant background variables in their responses, since we expected that both general music training and conceptual knowledge of harmony would facilitate the task. Earlier studies have shown that general music training relates to greater attention (Williams, 2005) and sensitivity to harmony (Farbood, 2012; Kopiez & Platz, 2009) in experimental tasks that involve implicit or explicit schematic knowledge of harmony. Further, tasks that rely heavily on veridical memory for harmony—such as the identification of songs from chord progressions—seem to be facilitated by music training (Jimenez & Kuusi, 2018), by having played the songs, and by being able to write their chord labels from long-term memory (indicating conceptual knowledge, hereafter “specialized harmonic familiarity”; Jimenez & Kuusi, 2020; Kuusi et al., 2021).
In addition to the three types of test chords (target, lure, and clash, as explained above), we used two harmonic conditions. In the one-chord condition, only the chord to be rated was presented with the digitally isolated vocals from the best-known recording of the song. In the all-chords condition, all chords that accompany the vocals were presented. In both conditions, we used chords formed of simultaneously played pitches (instead of the original texture). We expected that (a) the task would be easier in the all-chords condition than in the one-chord condition. Further, we anticipated that (b) participants would be most confident about the targets being the original chord, followed by the lures and the clash chords, that (c) participants’ veridical harmonic knowledge would influence the lure ratings, and that (d) participants’ schematic harmonic knowledge would influence the clash ratings.
Method
Participants
The online experiment was visited 1,476 times between September 22 and October 21, 2021. Since we knew that the number of non-serious visitors and survey bots can be large on crowdsourcing platforms (Ahler et al., 2019; Dennis et al., 2020), we used a pre-test to screen participants. In the pre-test, visitors were asked to choose the loudest tone of a series of five piano tones. The difficulty of the loudness pre-test was set relatively high to minimize the influence of the quality of participants’ headphones, environmental noise, and participants’ hearing deficiencies such as hearing loss. The loudness pre-test included three separate trials, and participants were allowed to listen to the series of five tones in each trial as many times as they wanted before moving on to the next trial. Altogether 212 visitors abandoned the survey before taking the pre-test, one visitor abandoned the survey while taking it, and 963 visitors were not allowed to take the survey because they failed to answer the pre-test correctly. Further, we used three criteria1 for recognizing and rejecting 37 visitors who completed some parts of the experiment without actually listening to the items or with the help of autofillers or bots. Of the 263 who remained, 153 (58.2%) completed the experiment. This completion rate is approximately the same as in online experiments using participants with high internal motivation (Bosnjak & Tuten, 2003; Tuten et al., 2004) but clearly higher than in some other studies (O’Neil & Penrod, 2001; O’Neil et al., 2003). Further, it should be noted that the completion rate does not include those who completed the experiment but were rejected because of our inclusion criteria.
Since 33 of the participants had taken the experiment twice and we only accepted their first response, the total number of participants whose responses were included in our main analysis was 120 (60 male, 59 female, 1 other; age M = 42.44, SD = 11.29). We collected background information about the participants through a questionnaire to which they responded at different points during the experiment. Approximately one third (34.2%) of the participants had never played an instrument and could be labeled as listeners, while the others had played an instrument for less than five years (25.8%) or for more than five years (40.0%). Additional participant information is reported in Results and shown in Appendix A.
Stimuli
The songs were selected based on various online pilots. Amazon Mechanical Turk workers were initially surveyed about the number of times they had heard and played 150 songs that (1) had more than 300,000 listeners in Last.fm and (2) had verses with harmony not limited to root position versions of the most common diatonic triads used in Western popular music (i.e., I, IV, V, vi, and ii with no added tones; de Clercq & Temperley, 2011; Miles et al., 2017; Nadar et al., 2019). Of these, 40 songs were pre-selected based on (1) the results of the survey, (2) whether the best-known recording of the song contains clearly audible vertical instantiations of the chords in the accompaniment of the verse (e.g., block chords not covered by loud percussion), and (3) our success at using DeMIX Pro version 2.0.2 to digitally isolate the vocals of the verse of the song without producing major audio artifacts.
From the beginning of the first verse of each of these 40 songs, we selected an excerpt consisting of 2 to 4 melodic phrases, containing 4 to 12 chords in the accompaniment, and lasting between 7 and 18 s. We used excerpts from the verses because they often feature only one vocal layer (Stephenson, 2002) and because they tend to have less timbral variety (van Balen et al., 2013) and a less dense texture (Everett, 2009) than choruses. All these characteristics facilitated the digital isolation of the vocals. Yet another reason for using verses was that chord progressions in verses tend to be less predictable than those in choruses, and the harmonic unpredictability of the verse tends to peak near the middle of each verse (Miles et al., 2017). Within each excerpt we chose the target chord using the following criteria: (1) it was not a root-position version of the most common diatonic triads and (2) it occurred near the middle of the excerpt, preceded by 2 to 5 chords and followed by 1 to 5 chords. Using target chords with moderate levels of harmonic unpredictability prevented participants from guessing the original chord based purely on schematic harmonic knowledge.
For each excerpt of isolated vocals we created an accompaniment in Logic Pro X instantiating each chord as a single block chord that was as similar as possible to the original accompaniment in terms of pitches, metrical placement, and timbre. We created two additional versions of each excerpt by substituting the target chord with a lure or clash chord. The lure substitute was a commonly used chord type (de Clercq & Temperley, 2011; Miles et al., 2017; Nadar et al., 2019) whose pitches matched several of the pitches of the accompanied vocals. The lure chords were also similar to the targets in terms of chord type (e.g., both target and lure were major seventh chords) or in the exact pitches of all the notes of the chord except for its bass. In the latter case the bass tone always changed the chord type (e.g., from Fadd9 to Am7). The clash substitute was always a root-position major chord (without added tones) whose pitches clashed with both the tonal center and the local pitches of the melody (isolated vocals) and whose root was often a semitone apart from the original.
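To make the substitution criteria above concrete, the following minimal Python sketch illustrates the kind of pitch-class comparison they imply: a lure candidate shares many pitch classes with the concurrent melody, whereas a clash candidate forms semitone conflicts with it. This is only an illustration with hypothetical example pitches, not the procedure actually used to build the stimuli.

```python
# Illustrative only: pitch-class checks of the kind implied by the lure/clash
# criteria. The melody and chords below are hypothetical examples, not stimuli
# from the experiment.
PITCH_CLASSES = {"C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4,
                 "F": 5, "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9,
                 "A#": 10, "Bb": 10, "B": 11}

def pc_set(note_names):
    """Convert note names to a set of pitch classes (0-11)."""
    return {PITCH_CLASSES[n] for n in note_names}

def shared_pcs(chord, melody):
    """Pitch classes that the chord shares with the concurrent melody."""
    return pc_set(chord) & pc_set(melody)

def semitone_clashes(chord, melody):
    """Chord/melody pitch-class pairs a semitone apart (vertical dissonance)."""
    return {(c, m) for c in pc_set(chord) for m in pc_set(melody)
            if min((c - m) % 12, (m - c) % 12) == 1}

melody = ["E", "G", "A"]                 # hypothetical melodic fragment
lure_candidate = ["A", "C", "E", "G"]    # common chord type, large pitch overlap
clash_candidate = ["F#", "A#", "C#"]     # root-position major chord a semitone off

print(shared_pcs(lure_candidate, melody))         # large overlap -> plausible lure
print(semitone_clashes(clash_candidate, melody))  # semitone conflicts -> clash
```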
Due to the multiplicity of musical variables that may affect how participants perceive a chord accompanying a melody (e.g., harmonic context, metrical position, style, lyrics, timbre) we decided not to make any further assumptions about the musical validity of the chord substitutions based on theoretical grounds. Instead, we tested each of the 40 songs in pilots to identify the chord substitutions that participants were most likely to mistakenly assume to be the chord used in the best-known recording of the song. Although participants in the pilots were not allowed to later participate in the main experiment, they were recruited via the same crowdsourcing platform we used in the main experiment (see Procedure), and we therefore considered their responses sufficiently generalizable to our main experiment.
Finally, we chose 10 songs representing various decades between 1965 and 2008 that, according to the pilots, were well known, showed adequate timbral similarity between the test chords and the original accompaniment (indicated by consistently high ratings for targets), allowed the original test chords to be substituted with musically convincing lure chords (indicated by similar ratings for targets and lures), and that, taken together, formed a set of stimuli that varied in terms of chord type, scale degree, and inversion. We then created two versions of each stimulus. Each version included the same vocals accompanied either by only the test chord (one-chord condition) or by the test chord plus other block chords instantiating the accompaniment in the original song (all-chords condition). For details, see Appendices B and C.
Procedure
The project was approved by the Research Ethics Committee of the University of the Arts Helsinki. Participants were recruited online by “word of mouth” and using Amazon Mechanical Turk (MTurk), a crowdsourcing platform that provides access to more than a hundred thousand potential participants (Difallah et al., 2018). Armitage and Eerola (2020) have shown that the results of music cognition experiments on chord perception carried out in standard laboratory settings are comparable to those from online experiments that recruit participants using services like MTurk.
We used PsyToolkit software (Stoet, 2010, 2017) for data collection. In the main experiment, the participants were first provided with the title of the song, the name of the artist or band who recorded the best-known version of the song, the date of release of that version, and playback controls to hear the excerpt (isolated vocals only). At this point, the participants were asked to give a general estimate of how many times they had heard and sung the song. Participants who had experience in playing and practicing musical instruments were also asked to estimate how many times they had played the song and what percentage of those performances were read from music notation, and to write down the labels of the chords from the excerpt based on long-term memory. All participants who had heard the song at least once in their lifetime were also asked to rate the vividness of their memory for the missing accompaniment of the isolated vocals.
After responding to the preliminary questions, the participants were taken to a page that included playback controls and questions about the three different harmonizations of the isolated vocals from the song. Each participant heard both the one-chord and all-chords condition for each song. To minimize the order effect of the conditions, at least eight songs were tested between the two conditions of a song. The time between the two conditions was further increased by the questions about the songs described above.
The following instructions were always presented at the top of the page:
For each of the audio clips in this page, please choose the option that best describes whether the accompanying test chord is the chord from the actual song.
- The underlined blue bold text, in the lyrics right below the playback controls, shows you the exact moment when you will hear the test chord.
- If you have heard the song before, please rely only on your current memory of the song, please do not look up the song on the internet or your private collection to refresh your memory of the song. Also, try to ignore the fact that the instruments used to play the chords in this page are not the exact same instruments as those used in the actual song.
- If you have never heard this song before in your life (other than the excerpt of isolated vocals we previously played for you), please respond according to your feeling about what the original chord is likely to be (e.g., respond "definitely yes" if you strongly feel that the test chord sounds like it should be the chord used in the original song).
- Regardless of your degree of familiarity with the song, try to avoid giving the same response to all the three audio clips in this page.
Figure 1 is an example of how each excerpt was presented on the screen to the participants. The participants were given a 7-point scale to rate each of the three excerpts. Stimuli and rating scales for all three test chords of the song were presented on the same page. The participants were free to listen to the stimuli as many times as they wanted. To verify that the participants were attentively listening to the excerpts, they were randomly presented with audio clips that had isolated vocals without any accompaniment.
After being tested on all the 10 songs in both conditions, the participants were asked some additional questions about their experience with music including the self-reported portion of the Gold-MSI (Müllensiefen et al., 2014). Most participants completed the entire session in less than 40 minutes.
Results
We started our analyses by calculating descriptive statistics for all test variables (target, lure, and clash) in the one-chord condition and the all-chords condition. As stated, the order of the one-chord condition and the all-chords condition was pseudo-randomized. Half of the songs for any given participant were presented in the one-chord condition as the first instance, with the all-chords condition as the second instance following at least eight songs later. For the other half of the songs, the order of the conditions was reversed. The confidence ratings for targets, lures, and clashes were averaged separately for the two conditions and two instances (see Figure 2). Generally, the ratings of the target were the highest (range from 4.68 to 5.25) and those of the clash the lowest (range from 1.73 to 2.11), with the lure in between (range from 3.95 to 4.39). The differences between the three test chords varied between 0.57 points (one-chord target versus one-chord lure) and 3.52 points (all-chords target versus all-chords clash), the average difference being 2.09 points. In contrast, the differences between the first and second instance were generally very small: between 0.06 points (one-chord clash) and 0.51 points (one-chord target), the average difference being 0.17 points. The difference between the first and second instance was statistically significant only for the one-chord target, t(119) = 5.417, p < .001. Consequently, in the rest of the analyses we did not distinguish whether the confidence ratings were given in the first or the second instance. The statistics for the target, lure, and clash chords in the one-chord and all-chords conditions are given in Table 1.
Table 1. Descriptive Statistics

| | N | Min | Max | Mean | SD |
| --- | --- | --- | --- | --- | --- |
| One-chord Target | 120 | 3.30 | 6.50 | 4.946 | 0.690 |
| One-chord Lure | 120 | 2.30 | 5.80 | 4.253 | 0.797 |
| One-chord Clash | 120 | 1.00 | 5.44 | 2.078 | 0.942 |
| All-chords Target | 120 | 3.20 | 7.00 | 5.205 | 0.789 |
| All-chords Lure | 120 | 1.50 | 6.20 | 4.041 | 0.886 |
| All-chords Clash | 120 | 1.00 | 5.70 | 1.797 | 0.967 |
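The descriptive statistics in Table 1 and the first- versus second-instance comparison could be computed along the following lines. This is a sketch, not the original analysis script: it assumes a wide data frame with one row per participant and hypothetical column names.

```python
# Sketch only: assumes a wide CSV with one row per participant and hypothetical
# column names such as "one_chord_target" (mean rating across songs) and
# "one_chord_target_instance1"/"..._instance2" (means by presentation instance).
import pandas as pd
from scipy import stats

ratings = pd.read_csv("ratings_wide.csv")

# Descriptive statistics per condition x chord type (cf. Table 1)
cols = ["one_chord_target", "one_chord_lure", "one_chord_clash",
        "all_chords_target", "all_chords_lure", "all_chords_clash"]
print(ratings[cols].agg(["count", "min", "max", "mean", "std"]).T)

# Paired t-test comparing first and second instance for one condition/chord type
t, p = stats.ttest_rel(ratings["one_chord_target_instance1"],
                       ratings["one_chord_target_instance2"])
print(f"t({len(ratings) - 1}) = {t:.3f}, p = {p:.4f}")
```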
To analyze the participant variables, we ran a principal components analysis (PCA) with varimax rotation. The participant variables are listed in Table 3, and a more thorough explanation can be found in Appendix A. For the participant variables, the Kaiser-Meyer-Olkin measure was .834, indicating that the data were adequate for factor analysis, and Bartlett’s test of sphericity, χ2(136) = 1244.552, p < .001, indicated that the data matrix was not an identity matrix with uncorrelated variables and was hence suitable for factor analysis. The PCA revealed a four-component solution explaining approximately 71.1% of the variance (Table 2).
Table 2. Eigenvalues and Variance Explained by the Extracted Components

| Component | Total | % of Variance | Cumulative % |
| --- | --- | --- | --- |
| 1 | 6.678 | 39.280 | 39.280 |
| 2 | 2.399 | 14.112 | 53.391 |
| 3 | 1.924 | 11.318 | 64.709 |
| 4 | 1.083 | 6.369 | 71.078 |

Note. The extraction sums of squared loadings are identical to the initial eigenvalues for the four retained components.
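The adequacy checks and the rotated four-component solution reported above could be reproduced roughly as follows with the factor_analyzer package; this is one possible open-source toolchain and not necessarily the software used for the published analysis, and the file and column layout are hypothetical.

```python
# Sketch only: KMO, Bartlett's test, and PCA with varimax rotation using the
# factor_analyzer package. File name and variable layout are hypothetical.
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import calculate_kmo, calculate_bartlett_sphericity

background = pd.read_csv("participant_variables.csv")  # the 17 background variables

# Sampling adequacy and sphericity checks reported in the text
chi_square, p_value = calculate_bartlett_sphericity(background)
_, kmo_overall = calculate_kmo(background)
print(f"KMO = {kmo_overall:.3f}, Bartlett chi2 = {chi_square:.1f}, p = {p_value:.3g}")

# Principal components with varimax rotation, retaining four components
pca = FactorAnalyzer(n_factors=4, method="principal", rotation="varimax")
pca.fit(background)
loadings = pd.DataFrame(pca.loadings_, index=background.columns,
                        columns=["C1", "C2", "C3", "C4"])
print(loadings.round(3))            # rotated component matrix (cf. Table 3)
print(pca.get_factor_variance())    # variance explained per component (cf. Table 2)
```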
The structure was understandable and easy to interpret. The initial interpretation followed the standard view that factor analysis can help to uncover latent variables to which all the observable variables within each component are related (Bollen, 2002). Based on the variables, the interpretation is as follows (see the bold-print numbers in the varimax-rotated matrix in Table 3): Component 1 consisted of variables related to practical work with chords (e.g., composing, arranging, playing chords or songs by ear) and was initially labeled Practical Harmonic Knowledge. Component 2 consisted of self-reported singing and listening abilities and interest in music and was initially named accordingly. Component 3 (initially labeled Familiarity with the Test Songs) was related to the participants’ general familiarity with the test songs (age being correlated with the number of times participants had heard the test songs). Finally, Component 4 was related to theory-driven and notation-driven work with chords and was initially labeled Conceptual Knowledge of Chords. Our decision to interpret C1 as Practical Harmonic Knowledge reflects the fact that all playing of, or practicing with, chords (with the exception of variable V7) loaded on C1, even though variables V13 and V14 were specific to the test songs (and hence could have been part of C3). It should be noted that playing chords (one of the characteristics of C1) was not particularly common among our participants (as shown in Appendix A, more than 60% of all participants had never played chords, and more than 70% had not played the test songs). What was very common among the participants was hearing the test songs (V11) and singing them (V12; as shown in Appendix A, all had heard the songs and 96% had sung them), and these variables related to general familiarity with the test songs (the main characteristic of C3).
Table 3. Rotated Component Matrix

| Variable | C1 Practical Harmonic Knowledge | C2 Singing and Listening Abilities | C3 Familiarity with the Test Songs | C4 Conceptual Knowledge of Chords |
| --- | --- | --- | --- | --- |
| Age | −.151 | −.323 | **.565** | .041 |
| V1 Gold-MSI Factor 1: Active engagement | .200 | **.825** | .074 | .154 |
| V2 Gold-MSI Factor 2: Perceptual abilities | .124 | **.825** | .055 | .252 |
| V3 Gold-MSI Factor 3: Musical training | .305 | .473 | −.095 | **.595** |
| V4 Gold-MSI Factor 4: Singing abilities | .236 | **.829** | −.044 | .262 |
| V5 Gold-MSI Factor 5: Emotions | .137 | **.814** | .172 | .014 |
| V6 Playing chords by ear, total hours | **.643** | .238 | .037 | .446 |
| V7 Playing chords from music notation, total hours | .135 | .139 | .007 | **.837** |
| V8 Years of ear training on chords and progressions | .444 | .148 | .060 | **.690** |
| V9 Number of pieces composed | **.862** | .141 | −.047 | .043 |
| V10 Number of pieces arranged | **.885** | .070 | .072 | .022 |
| V11 Average times heard, all 10 songs | .068 | .016 | **.907** | .022 |
| V12 Average times sung, all 10 songs | .048 | .244 | **.804** | −.012 |
| V13 Average times played, all 10 songs | **.765** | .169 | .174 | .375 |
| V14 Percentage of times played by ear, all 10 songs | **.680** | .254 | −.048 | .186 |
| V15 Average score for chord labels for entire excerpt, all 10 songs | **.706** | .176 | .014 | .414 |
| V16 Average self-reported vividness of memory for accompaniment, all 10 songs (1 for unknown) | .169 | .468 | **.645** | −.034 |
Note. The highest loading of each variable is shown in bold.
In addition to this standard approach with initial labels, we also provide another interpretation of the four components. In this second interpretation, we reconsidered the components using a two-dimensional framework: Dimension 1 stands for veridical versus schematic memory for harmony, and Dimension 2 stands for general versus specialized knowledge of harmony. The veridical and schematic types of memory have already been shown to affect the perception of chord progressions (see Introduction), and earlier studies have also shown the importance of specialized harmonic familiarity (see Aim). Figure 3 shows how each component is interpreted in the two-dimensional matrix. Veridical memory for harmony is understood as familiarity with the test songs either at a general level (by having heard and sung the songs; C3) or at a specialized level (by having played the songs and being able to write their chord labels; C1). Further, the number of arranged pieces could refer to specialized familiarity if the arranged pieces are the songs used in the experiment (we did not ask the participants whether they had arranged the test songs). However, the variable “composed pieces” does not fit this interpretation and is therefore shown in grey. As for schematic memory, it is understood as familiarity with tonal harmony either at a general level, through the amount of exposure and attention to music during everyday listening (C2), or at a specialized level, through training on analyzing chords and chord progressions, playing them, and identifying them by ear (C4). The variable V5 (emotional responses) does not unambiguously fit this interpretation, since it could be important for specialized knowledge as well. Despite these few shortcomings, this interpretation of the components allows us to describe the potential relationships between the chord ratings and veridical and schematic harmonic knowledge more easily.
To gain a thorough view of how confidently the participants differentiated the target from the lure and clash chords, we analyzed the responses using receiver operating characteristic (ROC) analysis and the area under the curve (AUC). This analysis is commonly used in musical memory studies, and in our case it showed how well the participants were able to differentiate the targets from the lures and clashes (for AUCs, see, e.g., Müllensiefen & Halpern, 2014; Schellenberg et al., 2019). Since we expected that hearing all the chords would help the participants in their task, we analyzed the AUCs separately for the one-chord condition and the all-chords condition. All AUCs were above chance level (.500), and they showed that it was easiest for the participants to differentiate between targets and clash chords in the all-chords condition (AUC = .922, SD = .038) and almost as easy in the one-chord condition (AUC = .885, SD = .056). Differentiating between the target and lure was not as easy, since the AUC was .660 (SD = .155) in the all-chords condition and .602 (SD = .176) in the one-chord condition (see Figure 4). It should be noted, however, that we accepted participants regardless of their familiarity with all test songs. If a song was unfamiliar, we encouraged the participant to rely on their feeling about what the original chord was likely to be, that is, to use schematic memory. From that perspective, targets and lures were equally correct. Further, we ran a two-factor ANOVA to see how much the condition (one-chord or all-chords) and the chord type used in the comparison with the target (lure or clash) affected the AUCs. The analysis confirmed that the type of test chord had a statistically significant effect on the AUCs, F(1,116) = 49.534, p < .001, indicating that the participants were better able to distinguish the target chords from the clash chords than from the lure chords. The condition, however, had no effect, F(1,116) = 1.508, p = .227, indicating that the single test chord could be distinguished as easily as the test chord surrounded by the other chords of the harmony. There was no interaction between chord type and condition, F(1,116) = 0.77, p = .782.
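The per-participant AUCs and the two-factor ANOVA described above could be computed roughly as follows. The sketch assumes a hypothetical long-format trial table (one row per participant, song, condition, and chord type), treats targets as positives and the lure or clash as negatives with the 7-point ratings as scores, and uses a plain two-way ANOVA rather than any specific repeated-measures formulation; it is not the original analysis script.

```python
# Sketch only: per-participant AUCs from the 7-point ratings and a two-way
# ANOVA (condition x foil type) on those AUCs. Column names are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score
import statsmodels.api as sm
from statsmodels.formula.api import ols

trials = pd.read_csv("trial_level_ratings.csv")  # participant, song, condition, chord_type, rating

def participant_auc(df, foil):
    """AUC for target vs. a given foil ('lure' or 'clash') within one participant/condition."""
    subset = df[df["chord_type"].isin(["target", foil])]
    labels = (subset["chord_type"] == "target").astype(int)
    return roc_auc_score(labels, subset["rating"])

aucs = (trials.groupby(["participant", "condition"])
              .apply(lambda d: pd.Series({"lure": participant_auc(d, "lure"),
                                          "clash": participant_auc(d, "clash")}))
              .reset_index()
              .melt(id_vars=["participant", "condition"],
                    var_name="foil", value_name="auc"))

# Two-factor ANOVA: condition (one-chord vs. all-chords) x foil type (lure vs. clash)
model = ols("auc ~ C(condition) * C(foil)", data=aucs).fit()
print(sm.stats.anova_lm(model, typ=2))
```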
Since the lures were schematically plausible—even though they were not veridically correct—and since we had asked our participants to evaluate the chords also in songs they were not familiar with, the AUCs did not reveal the whole picture of schematic and veridical memory for harmony. Therefore, we continued our analyses by conducting regression analyses with the four components of the PCA as predictors of the ratings, separately for targets, lures, and clashes. We analyzed both the one-chord condition and the all-chords condition, since we wanted to see whether the condition had a role in any of these. The results of the regression analyses are collected in Table 4 (model 1 for target, model 2 for lure, and model 3 for clash; “a” denoting the one-chord condition and “b” the all-chords condition).
Table 4. Model Summary

| Model | R | R² | Adjusted R² | Std. Error of the Estimate | R² Change | F Change | df1 | df2 | Sig. F Change | Durbin-Watson |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1a | .575 | .330 | .307 | .574253 | .330 | 14.173 | 4 | 115 | .000 | 1.725 |
| 2a | .373 | .139 | .109 | .752579 | .139 | 4.647 | 4 | 115 | .002 | 1.858 |
| 3a | .539 | .291 | .266 | .806639 | .291 | 11.801 | 4 | 115 | .000 | 2.209 |
| 1b | .549 | .301 | .277 | .671027 | .301 | 12.395 | 4 | 115 | .000 | 1.940 |
| 2b | .518 | .269 | .243 | .770992 | .269 | 10.564 | 4 | 115 | .000 | 1.871 |
| 3b | .424 | .180 | .151 | .891261 | .180 | 6.306 | 4 | 115 | .000 | 2.187 |
Note. Dependent variables: 1a = one-chord target; 2a = one-chord lure; 3a = one-chord clash; 1b = all-chords target; 2b = all-chords lure; 3b = all-chords clash. Predictors in all models: (constant), C1, C2, C3, C4.
In the regressions, the Durbin-Watson statistics ranged from 1.725 to 2.209, i.e., all were near 2, which is optimal. Further, the residuals showed that the data were suitable for linear regression. For the one-chord condition, the model explained 30.7% of the variance of the target ratings; model 1a: adjusted R² = .307, F(4, 115) = 14.173, p < .001, with all four PCA components contributing to the model (see Table 5). Further, it explained 10.9% of the variance of the lure ratings; model 2a: adjusted R² = .109, F(4, 115) = 4.647, p = .002, with components 1 and 3 contributing to the model, and 26.6% of the variance of the clash ratings; model 3a: adjusted R² = .266, F(4, 115) = 11.801, p < .001, with all components contributing to the model. For the all-chords condition, the model explained 27.7% of the variance of the target ratings; model 1b: adjusted R² = .277, F(4, 115) = 12.395, p < .001, with all components contributing to the model. Further, it explained 24.3% of the variance of the lure ratings; model 2b: adjusted R² = .243, F(4, 115) = 10.564, p < .001, with components 1, 3, and 4 contributing to the model, while for the clash ratings the model explained 15.1% of the variance through components 2, 3, and 4; model 3b: adjusted R² = .151, F(4, 115) = 6.306, p < .001.
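One of the regression models could be reproduced along the following lines, using the participant-level component scores from the PCA as predictors of the mean ratings; this is a sketch with hypothetical variable names, including the Durbin-Watson check mentioned above, not the original analysis script.

```python
# Sketch only: OLS regression of one rating variable on the four PCA component
# scores, plus the Durbin-Watson statistic. Column names are hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

data = pd.read_csv("participant_scores.csv")  # PCA component scores + mean ratings

X = sm.add_constant(data[["C1", "C2", "C3", "C4"]])
y = data["one_chord_lure"]                    # e.g., model 2a

fit = sm.OLS(y, X).fit()
print(fit.summary())                          # R^2, F, coefficients, t, p
print("Durbin-Watson:", durbin_watson(fit.resid))
```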
Table 5. Regression Coefficients

| Rating | Component | Standard Interpretation | Veridical–Schematic Interpretation | Unstandardized Coefficient B | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| Model 1a: one-chord target | 1 | Practical harmonic knowledge | Veridical specialized | 0.185 | 3.523 | 0.001 |
| | 2 | Singing and listening abilities | Schematic general | 0.244 | 4.631 | 0.000 |
| | 3 | Familiarity with the test songs | Veridical general | 0.158 | 2.992 | 0.003 |
| | 4 | Conceptual knowledge of chords | Schematic specialized | 0.196 | 3.726 | 0.000 |
| Model 2a: one-chord lure | 1 | Practical harmonic knowledge | Veridical specialized | −0.185 | −2.688 | 0.008 |
| | 2* | Singing and listening abilities | Schematic general | 0.072 | 1.041 | 0.300 |
| | 3 | Familiarity with the test songs | Veridical general | −0.205 | −2.965 | 0.004 |
| | 4 | Conceptual knowledge of chords | Schematic specialized | −0.084 | −1.221 | 0.224 |
| Model 3a: one-chord clash | 1 | Practical harmonic knowledge | Veridical specialized | −0.155 | −2.099 | 0.038 |
| | 2 | Singing and listening abilities | Schematic general | −0.311 | −4.200 | 0.000 |
| | 3 | Familiarity with the test songs | Veridical general | −0.223 | −3.011 | 0.003 |
| | 4 | Conceptual knowledge of chords | Schematic specialized | −0.297 | −4.011 | 0.000 |
| Model 1b: all-chords target | 1 | Practical harmonic knowledge | Veridical specialized | 0.153 | 2.488 | 0.014 |
| | 2 | Singing and listening abilities | Schematic general | 0.281 | 4.576 | 0.000 |
| | 3 | Familiarity with the test songs | Veridical general | 0.210 | 3.419 | 0.001 |
| | 4 | Conceptual knowledge of chords | Schematic specialized | 0.202 | 3.280 | 0.001 |
| Model 2b: all-chords lure | 1 | Practical harmonic knowledge | Veridical specialized | −0.272 | −3.850 | 0.000 |
| | 2 | Singing and listening abilities | Schematic general | −0.132 | −1.869 | 0.064 |
| | 3 | Familiarity with the test songs | Veridical general | −0.292 | −4.128 | 0.000 |
| | 4 | Conceptual knowledge of chords | Schematic specialized | −0.186 | −2.627 | 0.010 |
| Model 3b: all-chords clash | 1 | Practical harmonic knowledge | Veridical specialized | −0.094 | −1.150 | 0.253 |
| | 2 | Singing and listening abilities | Schematic general | −0.187 | −2.291 | 0.024 |
| | 3 | Familiarity with the test songs | Veridical general | −0.282 | −3.452 | 0.001 |
| | 4 | Conceptual knowledge of chords | Schematic specialized | −0.212 | −2.595 | 0.011 |
* Component with no statistically significant contribution to the model.
As the average ratings showed, and as can already be seen from the bars in Figure 2 and the AUCs in Figure 4, determining that the lure was not the original chord was a difficult task. It was much more difficult than differentiating the clash chord from the original. In the one-chord condition, a very vivid veridical memory of the chords was needed, a type of knowledge that cannot be derived from the harmonic implications of the melody. Therefore, it makes sense that only two components, veridical general knowledge (B = −0.205, p = .004) and veridical specialized knowledge (B = −0.185, p = .008), were statistically significant predictors in the model for the one-chord lure, and that the participants’ schematic knowledge did not help them in their evaluations (see Table 5). Further, the negative signs of the B values indicate that the more veridical knowledge the participants had about the song, the lower the lure ratings were, meaning that by using veridical knowledge about the song the participants could differentiate the lure from the original. In the all-chords condition, on the other hand, the other chords allowed the participants to use not only their veridical knowledge (C1: B = −0.272, p < .001; C3: B = −0.292, p < .001) but also their schematic (and conceptual) knowledge (C4: B = −0.186, p = .010) about chords to determine that the lures were not the original chords.
Conversely, determining that the clash chord was not the original chord was a relatively easy task that was facilitated by both schematic and veridical memory. As Table 5 shows, all B and t values are negative, indicating that in both the one-chord and all-chords conditions, veridical and schematic knowledge had a negative influence on the ratings of the clash chords. In other words, the higher the veridical and schematic knowledge of the participants, the lower the rating of the clash chord. It should be remembered that the ratings of the clash chords in both conditions were generally low, the averages on the scale from 1 to 7 being 2.078 (one-chord condition) and 1.797 (all-chords condition), indicating high confidence that the chord was not the original. It is possible that veridical specialized familiarity was no longer needed for the clash-chord ratings in the all-chords condition because the participants were able to make the decision without this knowledge.
After conducting these main analyses, we also examined the responses of the participants who knew the songs beforehand and of those who did not, in order to focus on the effect of veridical and schematic knowledge of harmony. Since the responses were based on a varying number of songs for each participant and the number of familiar songs was not controlled for, we only show the bar charts (Figure 5). The figure reveals the expected pattern: those who knew the songs and thus had at least some veridical knowledge of them rated the targets higher than the lures (M(one-chord target) = 5.011; M(one-chord lure) = 4.140; M(all-chords target) = 5.267; M(all-chords lure) = 3.882), while for those using schematic knowledge the target and lure ratings were practically the same (M(one-chord target) = 4.353; M(one-chord lure) = 4.502; M(all-chords target) = 4.758; M(all-chords lure) = 4.495). The figure also suggests an effect of both schematic and veridical knowledge on the ratings of the clashes.
Discussion
This study investigated participants’ veridical and schematic memory for one test chord in an experiment with melody and block-chord accompaniment. Each stimulus consisted of the digitally isolated vocals of an excerpt from the first verse of a song, accompanied by the test chord with or without the other chords that accompany the vocals in the original song. The test chords were harmonically identical to the original (target), schematically plausible but harmonically different from the original (lure), or harmonically clashing (clash). The main finding was expected: differentiating between the targets and lures was more difficult than determining that the clash chord was not used in the original harmony. Further, the results showed that providing the participants with all the chords (harmonic context) increased their opportunity to use schematic memory to assess the schematically plausible lure chords. Through our analyses we found that the participant variables could be reasonably grouped using a two-dimensional framework with schematic–veridical and general–specialized knowledge of harmony as the dimensions, and that the grouped variables could be used to predict the participants’ confidence in determining that the target chords were the same as the original chords. Further, we found that veridical knowledge of the songs was needed to correctly rate that the lure was not the original, while for the clash chords schematic knowledge of harmony together with general veridical knowledge of the style was enough. In what follows, we discuss the interpretations of these results.
The Effect of Context Chords
When the isolated vocals were accompanied by only the test chord (one-chord condition), the harmony was implied by the melody. In this case the participants could use their schematic knowledge if they were not familiar with the song. When the isolated vocals were accompanied by all the chords, the presence of the chords could affect chord evaluations in two different ways. First, the context chords made the stimuli rather similar to the original song, which was likely to lead to greater activation of veridical long-term memory traces for the song, thus facilitating the evaluation of the test chord. Second, the context chords provided harmonic information that could further clarify the tonal center of the stimulus, the specific harmonic style of the song, and the voice leading from the previous chord to the target and on to the following chord in the progression. All this information provided additional opportunities for schematic harmonic knowledge to be activated and to influence the ratings of the test chord. This increase in information engaged both veridical and schematic harmonic knowledge, and it explains the different reliance on the two types of knowledge in the all-chords and one-chord conditions of our experiment.
The Effect of Substitution Type
As stated in Stimuli, the lure chords were similar to the targets either in terms of chord type (e.g., both target and lure were major seventh chords) or in the exact pitches of all the notes of the chord except for its bass. The target and lure chords were also similar in terms of their relationship to the accompanied melody, in that they shared one or more pitch classes with the concurrent melody. According to corpus analyses, the roots of the lures were in most cases more common than the roots of the targets. Further, the lures were selected for the experiment based on pilot responses showing that they were often mistaken for the original chords. Taken together, the lures were schematically highly acceptable and, as such, difficult to differentiate from the targets but easy to differentiate from the tonally conflicting clashes. This explains our result that the target chords were rated higher than the lures by those participants who knew the songs (and used veridical memory), while those participants who did not know the songs and used schematic memory rated the targets and lures as equally plausible. The results relate not only to the frequency of occurrence of chord roots and chord types but also to the different degrees of importance of chords in the tonal hierarchy (e.g., Krumhansl, 1990). Future studies could use models that quantify the hierarchical importance of chords (e.g., Lerdahl, 2001) to systematically study the effect of tonal hierarchies on participants’ ratings.
As stated, the clash chords were rated considerably lower than the target and lure chords. The clash chords were always major chords that not only had tones outside the main scale used in the entire vocal excerpt but also created vertical dissonances (e.g., minor second intervals) with one or more of the most salient pitches of the concurrent melody. Although the considerably low ratings for the clash chords in our experiment are not surprising considering the degree of tonal conflict between the vocals and the clash chords, these results are not trivial when compared to previous studies on the perception of tonal clashes between melody and accompaniment.
Inspired by Wolpert (2000), Kopiez and Platz (2009) investigated music students’ ability to notice the tonal clash between melody and accompaniment in songs of different styles of music. In their study, the song accompaniments were played in a key a major second higher or lower than the key of the melody. They found that even when instructed to pay attention to the fit between the melody and the accompaniment, 22% of the advanced students and 53% of the less advanced students failed to notice the tonal clash. This suggests that this type of clash is not as perceptually obvious as one might expect given how extremely rare polytonality is in most styles of tonal music.
The differences in the results can be explained by the differences in paradigms. While Kopiez and Platz (2009) modified the intervallic relationship between the melody and accompaniment for each entire passage, we only transposed the test chord. This meant that, unlike in Kopiez and Platz’s study, our clash stimuli included a horizontal tonal clash between the clash chord and the context chords in the all-chords condition and between the clash chord and the implied or remembered context chords in the one-chord condition. It is also possible that the presence of the original extra-harmonic features of the accompaniment (e.g., rhythm) decreased the attentional resources devoted to harmonic perception in Kopiez and Platz’s study. Finally, the key of the clashing accompaniment in Kopiez and Platz was always a major second apart from the key of the melody, and such an intervallic relationship created less dissonant vertical clashes between the accompaniment and the melody than the type of clash chords used in our study.
Participant Variables
We grouped the participant variables via principal components analysis and used the components in regression analyses predicting the participants’ confidence about the target, lure, and clash chords. This procedure allowed us to concentrate on veridical and schematic knowledge of harmony on the one hand and general and specialized knowledge of harmony on the other. Previous studies on the identification of songs from chord progressions had made a distinction between general and specialized harmonic familiarity (“specialized harmonic familiarity”; Jimenez & Kuusi, 2020; Kuusi et al., 2021). To our knowledge, our study is the first to make a distinction between general and specialized aspects of both veridical and schematic memory, a distinction that was largely suggested by how the participant variables grouped in the component analysis. The interpretation of the confidence ratings led to the conclusion that veridical and schematic harmonic knowledge, both general and specialized, had a role in the assessment of the test chords, and that this knowledge was crucial for determining that the target was the original and that the lure was not.
As we expected, veridical harmonic knowledge was more important than schematic harmonic knowledge for determining that the lure chord was not the chord used in the original song. This was particularly clear in the one-chord condition, in which the harmony implied by the melody (schematic knowledge) did not help in differentiating the target from the lure. In fact, only the two components related to veridical harmonic knowledge had a statistically significant effect on the “lure” responses in the one-chord condition. In contrast, the assessment of the all-chords lures was facilitated not only by the veridical components but also by the component related to specialized schematic harmonic knowledge (also labeled “conceptual knowledge of chords”). It is possible that the participants used specialized schematic harmonic knowledge to assess how well the lure chord fit the style and the inner logic of the chord progression; this knowledge could also be used by participants who had only a weak memory of the accompaniment.
For determining that the clash chord was not the chord used in the original song, the participants seemed to use both schematic and veridical harmonic knowledge. Interestingly, our finding that familiarity with the songs facilitated the assessment of the clash chords contradicts Kopiez and Platz (2009), who found that tonal clashes were less noticeable when the participants were familiar with the test song. It is possible that familiarity with the songs increased the tendency for the participants’ attention to gravitate towards the original extra-harmonic features of the accompaniment (rhythm and texture) and away from the tonal conflict. The role of the extra-harmonic features and their interaction with familiarity calls for further research. Our experimental paradigm can easily be adapted to investigate more nuanced categories of both harmonic and extra-harmonic features.
In addition to veridical and schematic memory for harmony, sensitivity to sensory dissonance could also have played a role in the assessment of the clash chords in our experiment. A recent study asked participants with varying levels of music training to rate harmonic surprise in block-chord instantiations of chord progressions from commercially successful songs (Cheung et al., 2020). The study showed that harmonic surprise could be predicted by a combination of cognitive factors (long-term and short-term statistical learning) and sensory factors (dissonance accumulated in echoic memory). The authors also found that the contribution of sensory dissonance to the surprise ratings was larger for participants with less music training. Thus, it is possible not only that sensory dissonance played a role in our experiment but also that its contribution was modulated by music training. The characteristics of our experiment, however, do not allow us to disentangle the effects of sensory and cognitive factors on participants’ ratings.
Conclusion
The present study provided some evidence that both veridical and schematic harmonic knowledge can facilitate determining whether a test chord is the same as the original chord used in the best-known recording of a commercially successful song. The results suggest that the contribution of veridical and schematic harmonic knowledge to the chord-assessment task is at least partially determined by how schematically appropriate the test chord is, with veridical harmonic knowledge being more crucial when the test chord is schematically plausible. This study, however, did not test how different types of stylistically plausible chord substitutions are perceived, a topic that can be investigated in future research.
To our knowledge, this study is the first to isolate vocals from commercially successful songs in order to study memory for the harmony of the accompaniment. The ecological validity of the experimental task and stimuli, combined with the relatively clear results obtained regarding the effects of veridical and schematic harmonic knowledge, suggests that this type of experimental paradigm could be of great value for increasing our understanding of harmonic perception.
Note
Participants were rejected if they (a) responded before listening to the whole stimulus; (b) did not identify the control stimuli that had no accompanying chords; or (c) provided likely automatic responses to the open-ended questions (e.g., nonsensical or extremely repetitive responses).
References
Appendix A
Information about the Participant Background Variables
Variable name | Explanation of the variable | M | SD | Min | Max | % “never,” “NA,” or 0**
---|---|---|---|---|---|---
age | age | 42.44 | 11.29 | 20 | 70 | 
V1_GoldMSI_Factor1_Active_engagement | Questions from the Gold-MSI related to active engagement with music* | 3.82 | 1.24 | 1.11 | 6.78 | 
V2_GoldMSI_Factor2_Perceptual_abilities | Questions from the Gold-MSI related to perceptual abilities [1 to 7] | 5.29 | 1.03 | 1.89 | 7.00 | 
V3_GoldMSI_Factor3_Musical_training | Questions from the Gold-MSI related to musical training [1 to 7] | 3.09 | 1.79 | 1.00 | 7.00 | 
V4_GoldMSI_Factor4_Singing_abilities | Questions from the Gold-MSI related to singing abilities [1 to 7] | 4.11 | 1.37 | 1.00 | 7.00 | 
V5_GoldMSI_Factor5_Factor_Emotions | Questions from the Gold-MSI related to emotions [1 to 7] | 5.30 | 1.16 | 1.67 | 7.00 | 
V6_playing_chords_by_ear_total_hours | Total hours of having played chords by ear*** | 509.81 | 1304.30 | 0.00 | 6517.86 | 68%
V7_playing_chords_from_music_notation_total_hours | Total hours of having played chords from notation*** | 1217.97 | 4126.86 | 0.00 | 27375.02 | 61%
V8_years_ear_training_chords_and_progressions | Total years participants reported having studied or practiced the identification of chords and chord progressions by ear | 1.83 | 5.36 | 0.00 | 30.00 | 76%
V9_number_of_pieces_composed | Number of pieces participants had composed in their lives | 4.06 | 14.89 | 0.00 | 100.00 | 76%
V10_number_of_pieces_arranged | Number of pieces participants had arranged in their lives | 8.93 | 48.10 | 0.00 | 500.00 | 80%
V11_average_times_heard_for_all_10_songs | Average times participants had heard all the 10 songs from the experiment | 26.91 | 17.43 | 0.10 | 60.00 | 0%
V12_average_times_sung_for_all_10_songs | Average times participants had sung all the 10 songs from the experiment | 11.82 | 13.18 | 0.00 | 60.00 | 4%
V13_average_times_played_for_all_10_songs | Average times participants had played all the 10 songs from the experiment on a harmonic instrument† | 1.38 | 4.35 | 0.00 | 29.35 | 73%
V14_percentage_times_played_by_ear_for_all_10_songs | Percentage of times participants had played all the 10 songs from the experiment by ear on a harmonic instrument†† | 5% | 13% | 0% | 68% | 80%
V15_average_score_for_chord_labels_for_entire_excerpt_for_all_10_songs | Average score we gave to the chord labels participants provided (no chord labels = 0) | 0.07 | 0.21 | 0.00 | 0.95 | 89%
V16_average_self_reported_vividness_of_memory_for_accompaniment_for_all_10_songs | Participants’ self-report about the vividness of their memory for the accompaniment of the tested excerpts [0 to 1] | 0.51 | 0.18 | 0.20 | 0.92 | 
Note:
* The Gold-MSI uses 1-to-7 Likert scales and multiple-choice questions with seven options that were later tabulated as equivalent to the 1-to-7 Likert scale. In the table, the scores for the different factors of the Gold-MSI preserve the original 1-to-7 range.
** Percentage of participants responding “never,” “NA,” or 0, or who were not asked the question because they reported never having sung regularly or played an instrument.
*** To obtain a more accurate estimate of total hours, we asked participants to estimate the approximate number of years and average hours per week.
† V13 averages include all 10 songs even if participants had never played or heard the song.
†† V14 summarizes participants’ responses when asked to choose one of the following statements for each of the songs they had played: (1) I have only played this song by reading it from musical notation (e.g., chord symbols, tablature, staff notation). (2) I have played this song from memory but only after reading it from musical notation (e.g., chord symbols, tablature, staff notation). (3) I have mostly played this song by ear, but I have seen its music notation at least once. (4) I have only played this song fully by ear (I have never seen its music notation and I figured out its chords completely by ear). The responses for each song were tabulated as follows: 1 = 0%, 2 = 50%, 3 = 75%, 4 = 100%. V14 is the average of these percentages over the songs played; participants who had not played any of the songs received 0% for this variable.
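For clarity, the tabulation rule in footnote †† can be written out as a short sketch (Python; illustrative only, not the scoring script used in the study).

```python
# Sketch only: maps each statement choice (1-4) to a "played by ear" percentage
# and averages over the songs a participant has played, following footnote ††.

BY_EAR_PERCENT = {1: 0.0, 2: 0.50, 3: 0.75, 4: 1.00}

def v14_percentage(statement_choices):
    """statement_choices: one choice (1-4) per song the participant has played.
    Returns the average 'played by ear' percentage; 0.0 if no songs were played."""
    if not statement_choices:
        return 0.0
    return sum(BY_EAR_PERCENT[c] for c in statement_choices) / len(statement_choices)

print(v14_percentage([1, 3, 4]))  # (0 + 0.75 + 1.0) / 3 ≈ 0.58
print(v14_percentage([]))         # 0.0 (participant played none of the songs)
```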
Appendix B
Information about Songs and Test Chords for the Main Experiment
Song | Artist or Band | Year of Release | Type of Test Chord | Test Chord (letter chord names) | Test Chord (Roman numeral)
---|---|---|---|---|---
Yesterday | The Beatles | 1965 | target | G | V/V or II
 | | | lure | Bb(add6) | IV(add6)
 | | | clash | B | #IV
(Sittin’ On) The Dock of the Bay | Otis Redding | 1968 | target | A | V/V or II
 | | | lure | D | V
 | | | clash | Ab | bII in major mode (clashing with melody)
How Deep Is Your Love | Bee Gees | 1977 | target | G7 | V7/vi
 | | | lure | C7 | V7/ii
 | | | clash | D | (#)VII in major mode
Just the Way You Are | Billy Joel | 1977 | target | GM7 | IVM7
 | | | lure | BbM7 | bVIM7
 | | | clash | Ab | #IV
Dust in the Wind | Kansas | 1977 | target | G | bVII in minor mode (V in relative major)
 | | | lure | G/B | bVII6 in minor mode (V6 in relative major)
 | | | clash | B | major II in minor mode (clashing with melody)
True Colors | Cyndi Lauper | 1986 | target | Fadd9(6&M7) | IVadd9(6&M7)
 | | | lure | Am7(11) | vi7(11)
 | | | clash | B | (#)VII in major mode
Tears in Heaven | Eric Clapton | 1992 | target | D/F# | IV6
 | | | lure | D | IV
 | | | clash | Ab | (#)VII in major mode
Wonderwall | Oasis | 1995 | target | B7sus4 | IV7sus4 in minor mode
 | | | lure | F#m7 | i7
 | | | clash | Bb | #III in minor mode
Umbrella | Rihanna feat. Jay-Z | 2007 | target | Fm7 | iii7
 | | | lure | Ab5&6 | V5&6 (3rd in melody)
 | | | clash | D | bII in major mode (clashing with melody)
Viva la Vida | Coldplay | 2008 | target | Eb7sus4 | V7sus4
 | | | lure | Absus4(add9) | Isus4(add9)
 | | | clash | D | #IV (clashing with melody)