We studied memory for harmony using a melody-and-accompaniment texture and 10 commercially successful songs of Western popular music. The harmony was presented as a timbrally matching block-chord accompaniment to digitally isolated vocals. We used three test chord variants: the target was harmonically identical to the original chord, the lure was schematically plausible but different from the original, and the clash conflicted with both the tonal center and the local pitches of the melody. We used two conditions: in the one-chord condition we presented only the test chord, while in the all-chords condition the test chord was presented with all the chords of the original excerpt. One hundred and twenty participants with varying levels of music training rated on a seven-point scale whether the test chord was the original. We analyzed the results along two dimensions of memory: veridical–schematic and specialized–general. The target chords were rated higher on average than the lures and considerably higher than the clash chords. Schematic memory (knowledge of Western tonal harmony) seemed to be important for rating the test chords in the all-chords condition, while veridical memory (familiarity with the songs) was especially important for rating the lure chords in the one-chord condition.

The melody-and-accompaniment texture is one of the most common textures in Western tonal music (Arthur, 2017; Bharucha, 1984; Huron, 2016; Tagg, 2000). Although this texture is sometimes used in music cognition experiments, only a handful of studies have used it to study memory for harmony (Creel, 2011; Cullimore, 1999; Povel & Van Egmond, 1993). Avoiding melody-and-accompaniment textures reduces the ecological validity of studies on harmony. Further, since melody tends to be more perceptually salient than harmony (Williams, 2005), it is important to study how harmony can be perceived and remembered when chords serve as background to a clearly defined melody.

Earlier studies have shown that the accompaniment affects the listening experience even if it is perceptually subordinated to the melody. Accompanied melodies can be enjoyed more (Galizio & Hendrick, 1972) and can lead to greater emotional arousal (Schotanus, 2020) and a more intense perception of sadness (Friedman, 2019) than their unaccompanied versions. However, the perceptual subordination often means that some of the most perceptible effects of the accompaniment are experienced as changes in the way the melody is heard (Bernstein, 1976, p. 63; Tagg, 2000). Considering the salience of the main melody and the tendency for the accompaniment to be experienced not in itself but via its influence on the melody, it is reasonable to suspect that the accompaniment may be remembered less clearly than the melody. On the other hand, the melody itself could become a cue to remembering the accompaniment (Creel, 2011). This said, we think that various melody-and-accompaniment textures can be particularly helpful for testing listeners’ ability to remember the chords of the pieces of music they know well (hereafter veridical memory or veridical knowledge). This is because the melody can both activate memories of the song and unambiguously point listeners’ attention to specific moments in time within the song. Further, melody can also imply harmony. To imagine the harmony implied by a melody, listeners do not need to be familiar with the specific piece of music; familiarity with the style or genre can suffice (hereafter schematic memory or schematic knowledge). In other words, they can use their schematic knowledge of harmony to imagine the possible harmonic accompaniment of a melody.

Most studies related to memory for harmony have focused on the syntactic aspects of harmony (for a review, see Pearce & Rohrmeier, 2018). These studies have found some evidence that Western-enculturated listeners, even those without music training, possess schematic knowledge about harmony, such as tonal hierarchies (Krumhansl, 1990) and chord transitions (Lhost & Ashley, 2006; Vuvan & Hughes, 2019), and that their perception of harmony is affected by the relative frequency of occurrence of certain chord types (Jimenez, Kuusi, Czedik-Eysenberg, & Reuter, 2021; Jimenez, Kuusi, & Ojala, 2022). There is also some evidence that this type of knowledge can be acquired implicitly (via mere exposure to music), stored in long-term auditory memory, and activated automatically when listening to music, informing harmonic expectations (Vuvan & Hughes, 2019). Less research has been done on listeners’ ability to anticipate harmonic events based on veridical memories of harmony. A few studies have investigated whether veridical memories can override the effect of schematic knowledge on harmonic perception. In these studies, veridical memories have been created during the experiment by familiarizing the participants with a song-like musical passage created for the experiment (Creel, 2011), and especially by using block-chord progressions (Tillmann & Bigand, 2010; Pagès-Portabella, Bertolo, & Toro, 2021), one of the simplest and most common ways to instantiate harmony in empirical research (Pearce & Rohrmeier, 2018). In other studies, veridical knowledge has been created by informing the participants about the forthcoming chord (Guo & Koelsch, 2016; Justus & Bharucha, 2001). These studies have shown that veridical memories can decrease but not completely override the effect of schematic knowledge on harmonic perception. However, the generalizability of the results is limited by the brevity of the familiarization with the stimuli and, with the exception of Creel (2011), by the use of block chords as opposed to melody-and-accompaniment textures. Schubert and Pearce (2015, p. 367) point out that song-specific mental representations of harmonic patterns are more firmly established in long-term memory after weeks or years of repeated exposure than after relatively short periods of exposure within experimental sessions. This view is consistent with empirical findings regarding the effect of multiple exposures (Szpunar et al., 2004) and consolidation periods (Marshall & Born, 2007; Miles et al., 2016; Morgan-Short et al., 2012) on memory tasks.

More recent studies that have focused on veridical memory for harmony have taken advantage of participants’ already-existing extensive familiarity with commercially successful songs, improving the ecological validity in terms of musical texture and familiarization. These studies have shown that the listeners are able to identify songs from their chord progressions even when melodic, rhythmic, timbral, and textural cues are missing, a type of retrieval task that relies almost exclusively on veridical memory for harmony (Jimenez & Kuusi, 2018, 2020; Kuusi et al., 2021).

Although these studies show that simple block-chord textures can activate memories of songs that use melody-and-accompaniment textures, very few studies on memory for harmony have used melody-and-accompaniment textures as stimuli, and none of them have studied both veridical and schematic memory using extensive real-life familiarity with the tested music. Povel and Van Egmond (1993) noticed that participants rated a melody as similar to itself even when its accompanying harmony differed between presentations, suggesting that participants can at least partially ignore the accompanying harmony in short-term memory tasks. Cullimore (1999) used excerpts from a piano piece by W. A. Mozart either in their original form, with stylistically acceptable harmonic modifications in the accompaniment, or with modifications in the main melody that did not modify the harmony. Participants rated the excerpts as being more different from each other when the modifications occurred in the main melody than in the accompaniment, even when the participants had played the piece in the past. This suggests that the auditory memory for the harmonic elements of the accompaniment might lack detail and vividness. Creel (2011) showed that participants can store auditory long-term memories related to the general harmonic characteristics of the accompaniment and that these memories could be activated by listening to the melody without the accompaniment, even when the pitches of the melody did not fully determine the key. Yet, it seems that information concerning specific chords in the accompaniment is more difficult to store than general harmonic information (Cullimore, 1999).

In order to investigate listeners’ ability to remember the specific chords of an accompaniment (veridical memory) and their ability to assess chords when they have not heard the songs before (schematic memory), our current study combined elements from Creel (2011) and an experiment on chord substitution by Lhost and Ashley (2006), the only attempt so far to test the perception of chord substitution in the context of a specific musical style. We took Lhost and Ashley’s study as a point of departure by using stylistically acceptable and unacceptable chords together with the original chords. However, while Lhost and Ashley’s study investigated musicians’ ability to assess block chords in the context of the 12-bar blues progression, a style-specific harmonic schema, we set out to investigate participants’ ability to assess the chords of specific songs. This type of task more directly involves veridical memory and does not necessarily require music training.

Additionally, we adopted from Creel (2011) the use of melody-and-accompaniment textures instead of block-chord progressions. However, we used digitally isolated vocals from pre-existing commercially successful songs instead of song-like stimuli created for the experiment. These choices gave us naturalistic stimuli that provided participants with a rich harmonic palette and multiple cues for retrieval (e.g., lyrics and nuances of vocal timbre and interpretation). Hence, we were able to use ecologically valid stimuli to deepen our understanding of how chord substitutions are perceived and to gain knowledge about how well the harmony of the accompaniment is remembered.

Our study investigated participants’ veridical and schematic memory of harmony using melody-and-accompaniment textures. More specifically, we examined the participants’ ability to determine whether the chords accompanying isolated vocals of commercially successful songs were the original ones (targets) or had been substituted either by schematically plausible lures or by non-matching clashes. We expected that veridical knowledge (familiarity with the song) would be important for distinguishing the original chord from the lure, while schematic knowledge (familiarity with the musical style and Western tonal music in general) was expected to be important for distinguishing the clash. Further, we examined the role of schematic knowledge for those participants who did not know the songs (and therefore could not use veridical knowledge) when comparing the lure and the target. For further information about the chords, see “Stimuli.”

We also studied the role of participant background variables in their responses, since we expected that both general music training and conceptual knowledge of harmony would facilitate the task. Earlier studies have shown that general music training relates to greater attention (Williams, 2005) and sensitivity to harmony (Farbood, 2012; Kopiez & Platz, 2009) in experimental tasks that involve implicit or explicit schematic knowledge of harmony. Further, tasks that heavily rely on veridical memory for harmony—such as the identification of songs from chord progressions—seem to be facilitated by music training (Jimenez & Kuusi, 2018), by having played the songs, and by being able to write their chord labels from long-term memory (indicating conceptual knowledge, hereafter “specialized harmonic familiarity”; Jimenez & Kuusi, 2020; Kuusi et al., 2021).

In addition to the three different types of test chords (target, lure, and clash, as explained above), we used two types of harmonic conditions. In the one-chord condition, only the chord to be rated was presented with the digitally isolated vocals from the best-known recording of the song. In the all-chords condition, all chords that accompany the vocals were presented. In both conditions, we used chords formed of simultaneously played pitches (instead of the original texture). We expected that (a) the task would be easier in the all-chords condition than in the one-chord condition. Further, we anticipated that (b) the participants would be most confident about the targets being the original chord, followed by the lures and the clash chords, that (c) participants’ veridical harmonic knowledge would influence the lure ratings, and that (d) participants’ schematic harmonic knowledge would influence the clash ratings.

Participants

The online experiment was visited 1,476 times between September 22 and October 21, 2021. Since we knew that the number of non-serious visitors and survey bots is large on crowdsourcing platforms (Ahler et al., 2019; Dennis et al., 2020), we used a pre-test to screen participants. In the pre-test, visitors had to choose the loudest tone in a series of five piano tones. The difficulty of the loudness pre-test was set relatively high to minimize the influence of the quality of participants’ headphones, environmental noise, and participants’ hearing deficiencies such as hearing loss. The loudness pre-test included three separate trials, and the participants were allowed to listen to the series of five tones in each trial as many times as they wanted before moving on to the next trial. Altogether 212 visitors abandoned the survey before taking the pre-test, one visitor abandoned the survey while taking it, and 963 visitors were not allowed to take the survey because they failed to answer the pre-test correctly. Further, we used three criteria1 for recognizing and rejecting 37 visitors who completed some parts of the experiment without actually listening to the items or with the help of autofillers or bots. Of the 263 who remained, 153 (58.2%) completed the experiment. This completion rate is approximately the same as in online experiments using participants with high internal motivation (Bosnjak & Tuten, 2003; Tuten et al., 2004) but clearly higher than in some other studies (O’Neil & Penrod, 2001; O’Neil et al., 2003). Further, it should be noted that the completion rate does not include those who completed the experiment but were rejected because of our inclusion criteria.

Since 33 of the participants had taken the experiment twice and we only accepted the first response from them, the total number of participants whose responses were included in our main analysis was 120 (60 male, 59 female, 1 other; age M = 42.44, SD = 11.29). We collected background information about the participants through a questionnaire to which they responded at different points during the experiment. Approximately one third (34.2%) of the participants had never played an instrument and could be labeled as listeners, while the others had played an instrument for less than five years (25.8%) or for more than five years (40.0%). Additional participant information will be explained in Results and shown in Appendix A.

Stimuli

The songs were selected based on various online pilots. Amazon Mechanical Turk workers were initially surveyed about the number of times they had heard and played 150 songs that (1) had more than 300,000 listeners on Last.fm and (2) had verses with harmony not limited to root position versions of the most common diatonic triads used in Western popular music (i.e., I, IV, V, vi, and ii with no added tones; de Clercq & Temperley, 2011; Miles et al., 2017; Nadar et al., 2019). Of these, 40 songs were pre-selected based on (1) the results of the survey, (2) whether the best-known recording of the song contains clearly audible vertical instantiations of the chords in the accompaniment of the verse (e.g., block chords not covered by loud percussion), and (3) our success at using DeMIX Pro version 2.0.2 to digitally isolate the vocals of the verse of the song without producing major audio artifacts.

From the beginning of the first verse of each of these 40 songs, we selected an excerpt consisting of 2 to 4 melodic phrases, containing 4 to 12 chords in the accompaniment, and lasting between 7 and 18 s. We used excerpts from the verses because they often feature only one vocal layer (Stephenson, 2002) and because they tend to have less timbral variety (van Balen et al., 2013) and a less dense texture (Everett, 2009) than choruses. All these characteristics facilitated the digital isolation of the vocals. Yet another reason for using verses was that chord progressions in verses tend to be less predictable than those in choruses, and the harmonic unpredictability of the verse tends to peak near the middle of each verse (Miles et al., 2017). Within each excerpt we chose the target chord using the following criteria: (1) it was not a root position version of the most common diatonic triads and (2) it occurred near the middle of the excerpt, preceded by 2 to 5 chords and followed by 1 to 5 chords. Using target chords with moderate levels of harmonic unpredictability in our experiment prevented participants from guessing the original chord based purely on schematic harmonic knowledge.

For each excerpt of isolated vocals we created an accompaniment in Logic Pro X instantiating each chord as a single block chord that was as similar as possible to the original accompaniment in terms of pitches, metrical placement, and timbre. We created two additional versions of each excerpt by substituting the target chord with a lure or clash chord. The lure substitute was a commonly used chord type (de Clercq & Temperley, 2011; Miles et al., 2017; Nadar et al., 2019) whose pitches matched several of the pitches of the accompanied vocals. The lure chords were also similar to the targets in terms of chord type (e.g., both target and lure were major seventh chords) or in the exact pitches of all the notes of the chord except for its bass. In the latter case the bass tone always changed the chord type (e.g., from Fadd9 to Am7). The clash substitute was always a root-position major chord (without added tones) whose pitches clashed with both the tonal center and the local pitches of the melody (isolated vocals) and whose root was often a semitone apart from the original.

Due to the multiplicity of musical variables that may affect how participants perceive a chord accompanying a melody (e.g., harmonic context, metrical position, style, lyrics, timbre) we decided not to make any further assumptions about the musical validity of the chord substitutions based on theoretical grounds. Instead, we tested each of the 40 songs in pilots to identify the chord substitutions that participants were most likely to mistakenly assume to be the chord used in the best-known recording of the song. Although participants in the pilots were not allowed to later participate in the main experiment, they were recruited via the same crowdsourcing platform we used in the main experiment (see Procedure), and we therefore considered their responses sufficiently generalizable to our main experiment.

Finally, we chose 10 songs that represented various decades between 1965 and 2008 and that, according to the pilots, were well known, had adequate timbral similarity between the test chords and the original accompaniment (indicated by consistently high ratings for the target), and could have their original test chords substituted with musically convincing lure chords (indicated by the similarity between the ratings of targets and lures). Viewed as a whole, the set of 10 songs also provided stimuli that varied in terms of chord type, scale degree, and inversion. We then created two versions of each stimulus. Each of these two versions included the same vocals accompanied either by only the test chord (one-chord condition) or by the test chord plus other block chords instantiating the accompaniment of the original song (all-chords condition). For details see Appendices B and C.

Procedure

The project was approved by the Research Ethics Committee of the University of the Arts Helsinki. Participants were recruited online by “word of mouth” and using Amazon Mechanical Turk (MTurk), a crowdsourcing platform that provides access to more than a hundred thousand potential participants (Difallah et al., 2018). Armitage and Eerola (2020) have shown that the results of music cognition experiments on chord perception carried out in standard laboratory settings are comparable to those from online experiments that recruit participants using services like MTurk.

We used PsyToolkit software (Stoet, 2010, 2017) for data collection. In the main experiment, the participants were first provided with the title of the song, the name of the artist or band who recorded the best-known version of the song, the date of release of that version, and playback controls to hear the excerpt (isolated vocals only). At this point, the participants were asked to give a general estimate of how many times they had heard and sung the song. Participants who had experience in playing and practicing musical instruments were also asked to estimate how many times they had played the song and what percentage of those performances were read from music notation, and to write down the labels of the chords from the excerpt based on long-term memory. All participants who had heard the song at least once in their lifetime were also asked to rate the vividness of their memory for the missing accompaniment of the isolated vocals.

After responding to the preliminary questions, the participants were taken to a page that included playback controls and questions about the three different harmonizations of the isolated vocals from the song. Each participant heard both the one-chord and all-chords condition for each song. To minimize the order effect of the conditions, at least eight songs were tested between the two conditions of a song. The time between the two conditions was further increased by the questions about the songs described above.
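
One simple scheme that satisfies these constraints is to shuffle the songs, split them into two halves, and present the halves in alternating condition blocks. The sketch below is only an illustration of such a scheme (the function and variable names are ours, and the actual randomization used in the experiment may have been implemented differently):

```python
import random

def make_trial_order(songs, seed=None):
    """Illustrative pseudo-randomization: each song occurs once per condition,
    half of the songs get the one-chord condition first, and the two instances
    of a song are always separated by nine other songs (i.e., at least eight)."""
    rng = random.Random(seed)
    order = songs[:]
    rng.shuffle(order)
    half_a, half_b = order[:5], order[5:]               # 10 songs -> two halves
    return ([(s, "one-chord") for s in half_a] +        # trials 1-5
            [(s, "all-chords") for s in half_b] +       # trials 6-10
            [(s, "all-chords") for s in half_a] +       # trials 11-15
            [(s, "one-chord") for s in half_b])         # trials 16-20

trials = make_trial_order([f"song{i}" for i in range(1, 11)], seed=1)
```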

The following instructions were always presented at the top of the page:

For each of the audio clips in this page, please choose the option that best describes whether the accompanying test chord is the chord from the actual song.

- The underlined blue bold text, in the lyrics right below the playback controls, shows you the exact moment when you will hear the test chord.

- If you have heard the song before, please rely only on your current memory of the song; please do not look up the song on the internet or in your private collection to refresh your memory of the song. Also, try to ignore the fact that the instruments used to play the chords in this page are not the exact same instruments as those used in the actual song.

- If you have never heard this song before in your life (other than the excerpt of isolated vocals we previously played for you), please respond according to your feeling about what the original chord is likely to be (e.g., respond "definitely yes" if you strongly feel that the test chord sounds like it should be the chord used in the original song).

- Regardless of your degree of familiarity with the song, try to avoid giving the same response to all the three audio clips in this page.

Figure 1 is an example of how each excerpt was presented on the screen to the participants. The participants were given a 7-point scale to rate each of the three excerpts. Stimuli and rating scales for all three test chords of the song were presented on the same page. The participants were free to listen to the stimuli as many times as they wanted. To verify that the participants were attentively listening to the excerpts, they were randomly presented with audio clips that had isolated vocals without any accompaniment.

Figure 1.

Screenshot of the main experimental task as presented to participants. Each page contained three tasks corresponding to the three different test chords.

After being tested on all the 10 songs in both conditions, the participants were asked some additional questions about their experience with music, including the self-report portion of the Gold-MSI (Müllensiefen et al., 2014). Most participants completed the entire session in less than 40 minutes.

We started our analyses by calculating descriptive statistics for all test variables (target, lure, and clash) in the one-chord condition and the all-chords condition. As stated, the order of the one-chord condition and the all-chords condition was pseudo-randomized. Half of the songs for any given participant were presented in the one-chord condition as the first instance, with the all-chords condition as the second instance following at least eight songs later. For the other half of the songs, the order of the conditions was reversed. The confidence ratings for targets, lures, and clashes were averaged for the two conditions and two instances separately (see Figure 2). Generally, the ratings of the targets were the highest (range from 4.68 to 5.25), those of the clashes the lowest (range from 1.73 to 2.11), and those of the lures in between (range from 3.95 to 4.39). The differences between the three test chords varied between 0.57 points (one-chord target versus one-chord lure) and 3.52 points (all-chords target versus all-chords clash), the average difference being 2.09 points. On the other hand, the differences between the first and second instance were generally very small: between 0.06 points (one-chord clash) and 0.51 points (one-chord target), the average difference being 0.17 points. There was a statistically significant difference between the first and second instance only for the one-chord target, t(119) = 5.417, p < .001. This being the case, in the rest of the analyses we did not distinguish between confidence ratings given in the first and the second instance. The statistics for the target, lure, and clash chords in the one-chord and all-chords conditions are given in Table 1.

Figure 2.

Mean ratings grouped by the type of chord, condition, and by whether the songs were presented first in the one-chord condition or the all-chords condition. The error bars indicate halved standard deviations. The asterisk marks the statistically significant difference between the 1st and 2nd instance.

Table 1.

Statistics of Responses for the One-chord and All-chords Target, Lure, and Clash

Descriptive Statistics

                      N     Min    Max    Mean    Std. Deviation
One-chord Target     120    3.30   6.50   4.946   0.690
One-chord Lure       120    2.30   5.80   4.253   0.797
One-chord Clash      120    1.00   5.44   2.078   0.942
All-chords Target    120    3.20   7.00   5.205   0.789
All-chords Lure      120    1.50   6.20   4.041   0.886
All-chords Clash     120    1.00   5.70   1.797   0.967
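
The per-participant means in Table 1 and the paired t-test between the first and second instance can be reproduced along the following lines. This is a minimal sketch assuming a long-format data file with hypothetical column names (participant, condition, chord_type, instance, rating); the file and the names are ours, not from the published materials:

```python
import pandas as pd
from scipy import stats

# Hypothetical long-format file: one row per rating, with columns
# participant, condition ("one-chord"/"all-chords"),
# chord_type ("target"/"lure"/"clash"), instance (1 or 2), and rating (1-7).
df = pd.read_csv("ratings.csv")

# Mean rating per participant, condition, and chord type, summarized as in Table 1.
per_participant = (df.groupby(["participant", "condition", "chord_type"])["rating"]
                     .mean()
                     .unstack(["condition", "chord_type"]))
print(per_participant.agg(["count", "min", "max", "mean", "std"]).T)

# Paired t-test between the first and second instance for the one-chord targets.
oc_target = df[(df.condition == "one-chord") & (df.chord_type == "target")]
by_instance = (oc_target.groupby(["participant", "instance"])["rating"]
                        .mean()
                        .unstack("instance"))
t, p = stats.ttest_rel(by_instance[1], by_instance[2])
print(f"t({len(by_instance) - 1}) = {t:.3f}, p = {p:.3f}")
```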

To analyze the participant variables, we ran a principal components analysis (PCA) with varimax rotation (Table 2). The participant variables are listed in Table 3, and a more thorough explanation can be found in Appendix A. For the participant variables, the Kaiser-Meyer-Olkin measure was .834, indicating that the data were adequate for factor analysis, and Bartlett’s test of sphericity, χ2(136) = 1244.552, p < .001, indicated that the data matrix was not an identity matrix with uncorrelated variables and was, hence, suitable for factor analysis. The PCA revealed a four-component solution explaining approximately 71.1% of the variance (Table 2).

Table 2.

Principal Component Analysis of Participant Variables

              Initial Eigenvalues                       Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %     Total   % of Variance   Cumulative %
1           6.678   39.280          39.280           6.678   39.280          39.280
2           2.399   14.112          53.391           2.399   14.112          53.391
3           1.924   11.318          64.709           1.924   11.318          64.709
4           1.083    6.369          71.078           1.083    6.369          71.078
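
As a rough illustration of this step, the KMO measure, Bartlett's test, and a four-component principal-component extraction with varimax rotation can be computed with the factor_analyzer package. The data frame and file name below are hypothetical; the columns are assumed to be the 17 participant variables listed in Table 3:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (calculate_bartlett_sphericity,
                                             calculate_kmo)

# Hypothetical wide-format file: one row per participant, one column per
# participant variable (age, V1 ... V16, as listed in Table 3).
vars_df = pd.read_csv("participant_variables.csv")

chi2, p = calculate_bartlett_sphericity(vars_df)   # Bartlett's test of sphericity
_, kmo_total = calculate_kmo(vars_df)              # overall Kaiser-Meyer-Olkin measure
print(f"KMO = {kmo_total:.3f}, Bartlett chi2 = {chi2:.3f}, p = {p:.3g}")

# Principal-component extraction of four components with varimax rotation.
fa = FactorAnalyzer(n_factors=4, rotation="varimax", method="principal")
fa.fit(vars_df)
loadings = pd.DataFrame(fa.loadings_, index=vars_df.columns,
                        columns=["C1", "C2", "C3", "C4"])
print(loadings.round(3))                           # compare with Table 3
print("Cumulative variance:", fa.get_factor_variance()[2].round(3))
```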

The structure was understandable and easy to interpret. The initial interpretation followed the standard view that factor analysis can help to uncover latent variables to which all the observable variables within each component are related (Bollen, 2002). Based on the variables, the interpretation is as follows (see the highest loadings, marked with an asterisk, in the varimax-rotated matrix in Table 3): Component 1 consisted of variables related to practical work with chords (e.g., composing, arranging, playing chords or songs by ear) and was initially labeled as Practical Harmonic Knowledge. Component 2 consisted of self-reported singing and listening abilities and interest in music and was initially named accordingly. Component 3 (initially labeled Familiarity with the Test Songs) was related to the participants’ general familiarity with the test songs (age being correlated with the number of times participants had heard the test songs). Finally, Component 4 was related to theory-driven and notation-driven work with chords and was initially labeled as Conceptual Knowledge of Chords. Our decision to interpret C1 as Practical Harmonic Knowledge reflects the fact that all playing of—or practicing with—chords (with the exception of variable V7) loaded on C1, even though variables V13 and V14 were specific to the test songs (and hence could have been part of C3). It should be noted that playing chords (one of the characteristics of C1) was not particularly common among our participants (as shown in Appendix A, more than 60% of all participants had never played chords, and more than 70% had not played the test songs). What was very common among the participants was hearing the test songs (V11) and singing them (V12; as shown in Appendix A, all had heard the songs and 96% had sung them), and these variables related to general familiarity with the test songs (the main characteristic of C3).

Table 3.

The Loadings of Each Participant Variable on the Four Components

Rotated Component Matrix

Components: C1 = Practical Harmonic Knowledge; C2 = Singing and Listening Abilities; C3 = Familiarity with the Test Songs; C4 = Conceptual Knowledge of Chords

Variable                                                                  C1       C2       C3       C4
age                                                                     −.151    −.323     .565*    .041
V1_GoldMSI_Factor1_Active_engagement                                     .200     .825*    .074     .154
V2_GoldMSI_Factor2_Perceptual_abilities                                  .124     .825*    .055     .252
V3_GoldMSI_Factor3_Musical_training                                      .305     .473    −.095     .595*
V4_GoldMSI_Factor4_Singing_abilities                                     .236     .829*   −.044     .262
V5_GoldMSI_Factor5_Factor_emotions                                       .137     .814*    .172     .014
V6_playing_chords_by_ear_total_hours                                     .643*    .238     .037     .446
V7_playing_chords_from_music_notation_total_hours                        .135     .139     .007     .837*
V8_years_ear_training_chords_and_progressions                            .444     .148     .060     .690*
V9_number_of_pieces_composed                                             .862*    .141    −.047     .043
V10_number_of_pieces_arranged                                            .885*    .070     .072     .022
V11_average_times_heard_for_all_10_songs                                 .068     .016     .907*    .022
V12_average_times_sang_for_all_10_songs                                  .048     .244     .804*   −.012
V13_average_times_played_for_all_10_songs                                .765*    .169     .174     .375
V14_percentage_times_played_BY_EAR_for_all_10_songs                      .680*    .254    −.048     .186
V15_average_score_for_chord_labels_for_entire_excerpt_for_all_10_songs   .706*    .176     .014     .414
V16_average_self_reported_vividness_of_memory_for_accompaniment_for_all_10_songs_1_for_unknown   .169   .468   .645*   −.034

* The highest loading of each variable is marked with an asterisk.

In addition to this standard approach with initial labels, we also provide another interpretation of the four components. In this second interpretation, we reconsidered the components using a two-dimensional framework: Dimension 1 stands for veridical versus schematic memory for harmony, and Dimension 2 stands for general versus specialized knowledge of harmony. The veridical and schematic types of memory have already been shown to affect the perception of chord progressions (see Introduction), and earlier studies have also shown the importance of specialized harmonic familiarity (see Aim). Figure 3 shows how each component is interpreted in the two-dimensional matrix. Veridical memory for harmony is understood as familiarity with the test songs either at a general level (by having heard and sung the songs; C3) or at a specialized level (by having played the songs and being able to write their chord labels; C1). Further, the number of arranged pieces could refer to specialized familiarity if the arranged pieces were the songs used in the experiment (we did not ask the participants whether they had arranged the test songs). However, the variable “composed pieces” does not fit this interpretation and is written in grey in the figure for this reason. As for schematic memory, it is understood as familiarity with tonal harmony either at a general level, through the amount of exposure and attention to music during everyday listening (C2), or at a specialized level, through training in analyzing chords and chord progressions, playing them, and identifying them by ear (C4). The variable V5 (emotional responses) does not unambiguously fit this interpretation, since it could be important for specialized knowledge as well. Even with these few shortcomings, this interpretation of the components allows us to describe the potential relationships between the chord ratings and veridical and schematic harmonic knowledge more easily.

Figure 3.

Interpretation of the four components in terms of veridical, schematic, general and specialized knowledge of harmony. The black text indicates variables with the strongest relations to the dimensions.

To have a thorough view of how confidently the participants differentiated the target from the lure and clash chords, we analyzed the responses using receiver operating characteristic (ROC) analysis and the area under the curve (AUC). This analysis is commonly used in musical memory studies, and in our case it showed how well the participants were able to differentiate the targets from the lures and clashes (for AUCs, see, e.g., Müllensiefen & Halpern, 2014; Schellenberg et al., 2019). Since we expected that hearing all the chords would help the participants in their task, we analyzed the AUCs separately for the one-chord condition and the all-chords condition. All AUCs were above the chance level (.500), and they showed that it was easiest for the participants to differentiate between targets and clash chords in the all-chords condition (AUC = .922, SD = .038) and almost as easy in the one-chord condition (AUC = .885, SD = .056). Differentiating between the target and lure was not as easy, since the AUC was .660 (SD = .155) in the all-chords condition and .602 (SD = .176) in the one-chord condition (see Figure 4). It should be noted, however, that we accepted participants regardless of their familiarity with all test songs. In case the song was unfamiliar, we encouraged the participant to use their feeling about what the original chord was likely to be, that is, to use schematic memory. From that perspective, targets and lures were equally correct. Further, we ran a two-factor ANOVA to see how much the condition (one-chord or all-chords) and the chord type used in the comparison with the target (lure or clash) affected the AUCs. The analysis confirmed that the type of test chord had a statistically significant effect, F(1,116) = 49.534, p < .001, on the AUCs, indicating that the participants were better able to distinguish the target chords from the clash chords than from the lure chords. The condition, however, had no effect, F(1,116) = 1.508, p = .227, indicating that the single test chord could be distinguished as easily as the test chord surrounded by the other chords of the harmony. There was no interaction between the chord type and the condition, F(1,116) = 0.77, p = .782.
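
A per-participant AUC of this kind treats each target rating as a signal trial and each lure (or clash) rating as a noise trial, with the 7-point confidence rating as the decision variable. The sketch below uses the same hypothetical ratings file as above; the two-factor ANOVA is written here as a simple between-cells model and thus ignores the repeated-measures structure of the actual analysis:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

df = pd.read_csv("ratings.csv")  # hypothetical long-format ratings file

def participant_auc(sub, foil):
    """AUC for discriminating targets from one foil type ('lure' or 'clash')."""
    sub = sub[sub.chord_type.isin(["target", foil])]
    return roc_auc_score((sub.chord_type == "target").astype(int), sub.rating)

rows = []
for (participant, condition), sub in df.groupby(["participant", "condition"]):
    for foil in ["lure", "clash"]:
        rows.append({"participant": participant, "condition": condition,
                     "foil": foil, "auc": participant_auc(sub, foil)})
aucs = pd.DataFrame(rows)
print(aucs.groupby(["condition", "foil"])["auc"].agg(["mean", "std"]))

# Two-factor ANOVA (condition x foil type) on the AUCs.
model = smf.ols("auc ~ C(condition) * C(foil)", data=aucs).fit()
print(sm.stats.anova_lm(model, typ=2))
```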

Figure 4.

AUCs for one-chord and all-chords condition. Error bars indicate halved standard deviations.

Since the lures were schematically plausible—even though they were not veridically correct—and since we had asked our participants to evaluate the chords also in the songs they were not familiar with, the AUCs did not reveal the whole picture of schematic and veridical memory for harmony. Therefore, we continued our analyses by conducting regression analyses with the four components of the PCA as predictors of the ratings, separately for targets, lures, and clashes. We used both the one-chord condition and the all-chords condition, since we wanted to see if the condition had a role in any of these. The results of the regression analyses are collected in Table 4 (model 1 for the target, model 2 for the lure, and model 3 for the clash; (a) standing for the one-chord condition and (b) for the all-chords condition).

Table 4.

Model Summary and Coefficients for Regression Analyses

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change   Durbin-Watson
1a      .575c   .330       .307                .574253                      .330              14.173     4     115   .000            1.725
2a      .373c   .139       .109                .752579                      .139              4.647      4     115   .002            1.858
3a      .539c   .291       .266                .806639                      .291              11.801     4     115   .000            2.209
1b      .549c   .301       .277                .671027                      .301              12.395     4     115   .000            1.940
2b      .518c   .269       .243                .770992                      .269              10.564     4     115   .000            1.871
3b      .424c   .180       .151                .891261                      .180              6.306      4     115   .000            2.187

1a. Dependent Variable: one-chord condition, target

2a. Dependent Variable: one-chord condition, lure

3a. Dependent Variable: one-chord condition, clash

1b. Dependent Variable: all-chords condition, target

2b. Dependent Variable: all-chords condition, lure

3b. Dependent Variable: all-chords condition, clash

c. Predictors: (Constant), C1, C2, C3, C4

In the regressions, the Durbin-Watson statistics were between 1.725 and 2.187, that is, all were near 2, which is optimal. Further, the residuals showed that the data were suitable for linear regression. For the one-chord condition, the model explained 30.7% of the variance of the target ratings; model 1a: R2 = .307, F(4, 115) = 14.173, p < .001, and all four PCA components contributed to the model (see Table 5). Further, it explained 10.9% of the variance of the lure ratings; model 2a: R2 = .109, F(4, 115) = 4.647, p = .002, with components 1 and 3 contributing to the model, and 26.6% of the variance of the clash ratings; model 3a: R2 = .266, F(4, 115) = 11.801, p < .001, with all the components contributing to the model. For the all-chords condition, the model explained 27.7% of the variance of the target ratings; model 1b: R2 = .277, F(4, 115) = 12.395, p < .001, with all components contributing to the model. Further, it explained 24.3% of the variance of the lure ratings; model 2b: R2 = .243, F(4, 115) = 10.564, p < .001, with components 1, 3, and 4 contributing to the model, while for the clash ratings the model explained 15.1% of the variance through components 2, 3, and 4; model 3b: R2 = .151, F(4, 115) = 6.306, p < .001.
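
Each of these models is an ordinary least-squares regression of a mean rating on the four component scores. A sketch for one of them (model 2a, the one-chord lure ratings) might look as follows; the file and column names (C1 to C4 component scores and one_chord_lure) are hypothetical:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical per-participant file: PCA component scores C1-C4 and the six
# mean ratings (e.g., one_chord_lure) as separate columns.
data = pd.read_csv("participant_scores.csv")

X = sm.add_constant(data[["C1", "C2", "C3", "C4"]])   # predictors: component scores
y = data["one_chord_lure"]                            # dependent variable: mean lure rating

model = sm.OLS(y, X).fit()
print(model.summary())                                # R2, F, B coefficients, t, p
print("Durbin-Watson:", durbin_watson(model.resid))   # values near 2 indicate uncorrelated residuals
```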

Table 5.

Interpretation of Components and Coefficients from Regressions

Rating                        Component   Standard Interpretation            Veridical–Schematic Interpretation   Unstandardized Coefficient B       t       Sig.
Model 1a: one-chord target        1       Practical harmonic knowledge       Veridical specialized                 0.185                           3.523    0.001
                                  2       Singing and listening abilities    Schematic general                     0.244                           4.631    0.000
                                  3       Familiarity with the test songs    Veridical general                     0.158                           2.992    0.003
                                  4       Conceptual knowledge of chords     Schematic specialized                 0.196                           3.726    0.000
Model 2a: one-chord lure          1       Practical harmonic knowledge       Veridical specialized                −0.185                          −2.688    0.008
                                  2*      Singing and listening abilities    Schematic general                     0.072                           1.041    0.300
                                  3       Familiarity with the test songs    Veridical general                    −0.205                          −2.965    0.004
                                  4*      Conceptual knowledge of chords     Schematic specialized                −0.084                          −1.221    0.224
Model 3a: one-chord clash         1       Practical harmonic knowledge       Veridical specialized                −0.155                          −2.099    0.038
                                  2       Singing and listening abilities    Schematic general                    −0.311                          −4.200    0.000
                                  3       Familiarity with the test songs    Veridical general                    −0.223                          −3.011    0.003
                                  4       Conceptual knowledge of chords     Schematic specialized                −0.297                          −4.011    0.000
Model 1b: all-chords target       1       Practical harmonic knowledge       Veridical specialized                 0.153                           2.488    0.014
                                  2       Singing and listening abilities    Schematic general                     0.281                           4.576    0.000
                                  3       Familiarity with the test songs    Veridical general                     0.210                           3.419    0.001
                                  4       Conceptual knowledge of chords     Schematic specialized                 0.202                           3.280    0.001
Model 2b: all-chords lure         1       Practical harmonic knowledge       Veridical specialized                −0.272                          −3.850    0.000
                                  2*      Singing and listening abilities    Schematic general                    −0.132                          −1.869    0.064
                                  3       Familiarity with the test songs    Veridical general                    −0.292                          −4.128    0.000
                                  4       Conceptual knowledge of chords     Schematic specialized                −0.186                          −2.627    0.010
Model 3b: all-chords clash        1*      Practical harmonic knowledge       Veridical specialized                −0.094                          −1.150    0.253
                                  2       Singing and listening abilities    Schematic general                    −0.187                          −2.291    0.024
                                  3       Familiarity with the test songs    Veridical general                    −0.282                          −3.452    0.001
                                  4       Conceptual knowledge of chords     Schematic specialized                −0.212                          −2.595    0.011

* Components marked with an asterisk had no statistically significant contribution to the model.

As the average ratings showed, and as could already be seen from the bars in Figure 2 and the AUCs in Figure 4, determining that the lure was not the original chord was a difficult task. It was much more difficult than differentiating the clash chord from the original. In the one-chord condition, a very vivid veridical memory of the chords was needed, a type of knowledge that cannot be created from the harmonic implications of the melody. Therefore, it makes sense that only two components, veridical general knowledge (B = −0.205, p = .004) and veridical specialized knowledge (B = −0.185, p = .008), were statistically significant predictors in the model for the one-chord lure, and that the participants’ schematic knowledge did not help them in their evaluations (see Table 5). Further, the negative signs of the B values indicate that the more veridical knowledge the participants had about the song, the lower the lure ratings were, showing that by using veridical knowledge about the song the participants could differentiate the lure from the original. On the other hand, in the all-chords condition the other chords allowed the participants to use not only their veridical knowledge (C1: B = −0.272, p < .001; C3: B = −0.292, p < .001) but also their schematic (and conceptual) knowledge (C4: B = −0.186, p = .010) about chords to determine that the lures were not the original chords.

Conversely, determining that the clash chord was not the original chord was a relatively easy task that was facilitated by both schematic and veridical memory. As Table 5 shows, all B and t values are negative, indicating that in both the one-chord and all-chords conditions, veridical and schematic knowledge had a negative influence on the ratings of the clash chords. In other words, the higher the veridical and schematic knowledge of the participants, the lower the rating of the clash chord. It should be remembered that the ratings of the clash chords in both conditions were generally low, the averages on the scale from 1 to 7 being 2.078 (one-chord condition) and 1.797 (all-chords condition), indicating high confidence in the chord not being the original. It is possible that veridical specialized familiarity was no longer needed for the clash-chord ratings in the all-chords condition because the participants were able to make the decision without this knowledge.

After conducting these main analyses, we also wanted to look at the responses from the participants who knew the songs beforehand and those who did not. We did this to focus on the effect of veridical and schematic knowledge of harmony. Since the responses were based on a varying number of songs for each participant and the number of familiar songs was not controlled for, we only show the bar charts (Figure 5). The figure reveals an expected pattern: those who knew the songs and thus had at least some veridical knowledge of the songs rated the targets higher than the lures (M one-chord target = 5.011; M one-chord lure = 4.140; M all-chords target = 5.267; M all-chords lure = 3.882), while for those using schematic knowledge the target and lure ratings were practically the same (M one-chord target = 4.353; M one-chord lure = 4.502; M all-chords target = 4.758; M all-chords lure = 4.495). The figure also suggests an effect of both schematic and veridical knowledge on the ratings of clashes.

Figure 5.

Mean ratings grouped by type of chord, condition, and whether participants had heard the song before the experiment. Error bars indicate halved standard deviations.

This study investigated participants’ veridical and schematic memory for one test chord in an experiment with melody and block-chord accompaniment. Each stimulus consisted of the digitally isolated vocals of an excerpt from the first verse of a song accompanied by the test chord, with or without the other chords that accompany the vocals in the original song. The test chords were harmonically identical to the original (target), schematically plausible but harmonically different from the original (lure), or harmonically clashing (clash). The main finding was expected: differentiating between the targets and lures was a more difficult task than determining that the clash chord was not used in the original harmony. Further, the results showed that providing the participants with all the chords (harmonic context) increased the participants' opportunity to use schematic memory to assess the schematically plausible lure chords. Through our analyses we found that the participant variables could be reasonably grouped using a two-dimensional framework with schematic–veridical and general–specialized knowledge of harmony as the dimensions, and that the grouped variables could be used to predict the participants’ confidence in determining that the target chords were the same as the original chords. Further, we found that veridical knowledge of the songs was needed to correctly determine that the lure was not the original, while for the clash chords schematic knowledge of harmony together with general veridical knowledge of the style was enough. In what follows, we discuss the interpretations of the results.

The Effect of Context Chords

When the isolated vocals were accompanied by only the test chord (one-chord condition), the harmony was implied by the melody. In this case the participants could use their schematic knowledge if they were not familiar with the song. When the isolated vocals were accompanied by all the chords, the presence of the chords could affect chord evaluations in two different ways. First, the context chords made the stimuli rather similar to the original song, which was likely to lead to greater activation of the veridical long-term memory traces for the song, thus facilitating the evaluation of the test chord. Second, the context chords provided harmonic information that could further clarify the tonal center of the stimulus, the specific harmonic style of the song, and the voice leading from the previous chord to the target and on to the following chord in the progression. All this information provided additional opportunities for schematic harmonic knowledge to be activated and to influence the ratings of the test chord. This increase in information engaged both veridical and schematic harmonic knowledge, and it explains the different reliance on veridical and schematic harmonic knowledge in the all-chords and one-chord conditions of our experiment.

The Effect of Substitution Type

As stated in Stimuli, the lure chords were similar to the targets either in terms of the chord type (e.g., both target and lure were major seventh chords) or in the exact pitches of all the notes of the chord except for its bass. The target and lure chords were also similar in terms of their relationship to the accompanied melody, in that they shared one or more pitch classes with the concurrent melody. According to corpus analysis, the roots of the lures were in most cases more common than the roots of the targets. Further, the lures were selected for the experiment based on pilot responses showing that the lures were often mistaken for the original chords. Taken together, the lures were schematically very acceptable and, as such, difficult to differentiate from the targets but easy to differentiate from the tonally conflicting clashes. This explains our results that the target chords were rated higher than the lures by those participants who knew the songs (and used veridical memory), while those participants who did not know the songs and used schematic memory rated the lures and targets roughly equally. The results relate not only to the frequency of occurrence of chord roots and chord types but also to the different degrees of importance of chords in the tonal hierarchy (e.g., Krumhansl, 1990). Future studies could use models that quantify the hierarchical importance of chords (e.g., Lerdahl, 2001) to systematically study the effect of tonal hierarchies on participants’ ratings.

As stated, the clash chords were rated considerably lower than the target and lure chords. The clash chords were always major chords that not only had tones outside the main scale used in the entire vocal excerpt but also created vertical dissonances (e.g., minor second intervals) with one or more of the most salient pitches of the concurrent melody. Although the considerably low ratings for the clash chords in our experiment are not surprising considering the degree of tonal conflict between the vocals and the clash chords, these results are not trivial when compared to previous studies on the perception of tonal clashes between melody and accompaniment.

Inspired by Wolpert (2000), Kopiez and Platz (2009) investigated music students’ ability to notice the tonal clash between melody and accompaniment in songs of different styles of music. In their study, the song accompaniments were played in a key a major second higher or lower than the key of the melody. They found that even when instructed to pay attention to the fit between the melody and the accompaniment, 22% of the advanced students and 53% of the less advanced students failed to notice the tonal clash. This suggests that this type of clash is perceptually not as obvious as one could expect based on how extremely rare polytonality is in most styles of tonal music.

The differences in the results can be explained by the differences in paradigms. While Kopiez and Platz (2009) modified the intervallic relationship between the melody and accompaniment for each entire passage, we only transposed the test chord. This meant that, unlike in Kopiez and Platz’s study, our clash stimuli included a horizontal tonal clash between the clash chord and the context chords in our all-chords condition and between the clash chord and the implied or remembered context chords in our one-chord condition. It is also possible that the presence of the original extra-harmonic features in the accompaniment (e.g., rhythm) decreased the attentional resources devoted to harmonic perception in Kopiez and Platz’s study. Finally, the key of the clashing accompaniment in Kopiez and Platz was always a major second apart from the key of the melody, and such an intervallic relationship created less dissonant vertical clashes between the accompaniment and the melody than the type of clash chords used in our study.

Participant Variables

We used a procedure of grouping the participant variables via principal component analysis and using the components in regression analyses for predicting the participants’ confidence about the target chords, lures, and clash chords. By this procedure we could concentrate on veridical and schematic knowledge of harmony on the one hand and general and specialized knowledge of harmony on the other. Previous studies on the identification of songs from chord progressions had made a distinction between general and specialized harmonic familiarity (“specialized harmonic familiarity”; Jimenez & Kuusi, 2020; Kuusi et al., 2021). To our knowledge, our study is the first to make a distinction between general and specialized aspects of both veridical and schematic memory, a distinction that was largely suggested by how the participant variables grouped via component analysis. The interpretation of the confidence ratings led to the conclusion that veridical and schematic harmonic knowledge, both general and specialized, had a role in the assessment of the test chords, and that this knowledge was crucial for determining that the target was the original and that the lure was not.

As we expected, veridical harmonic knowledge was more important than schematic harmonic knowledge for determining that the lure chord was not the chord used in the original song. This was particularly clear in the one-chord condition, in which the harmony implied by the melody (schematic knowledge) did not help in differentiating the target from the lure. In fact, only the two components related to veridical harmonic knowledge had a statistically significant effect on the "lure" responses in the one-chord condition. In contrast, the assessment of all-chords lures was facilitated not only by the veridical components but also by the component related to specialized schematic harmonic knowledge (also labeled "conceptual knowledge of chords"). It is possible that the participants used specialized schematic harmonic knowledge to assess how well the lure chord fit the style and the inner logic of the chord progression. This knowledge could also be used by participants who had only a weak memory of the accompaniment.

To determine that the clash chord was not the chord used in the original song, the participants seemed to use both schematic and veridical harmonic knowledge. Interestingly, our finding that familiarity with the songs facilitated the assessment of the clash chords contradicts Kopiez and Platz (2009), who found that tonal clashes were less noticeable when the participants were familiar with the test song. It is possible that familiarity with the songs increased the tendency for the participants' attention to gravitate towards the original extra-harmonic features of the accompaniment (rhythm and texture) and away from the tonal conflict. The role of the extra-harmonic features and their interaction with familiarity calls for further research. Our experimental paradigm can easily be adapted to investigate more nuanced categories of both harmonic and extra-harmonic features.

In addition to veridical and schematic memory for harmony, sensitivity to sensory dissonance could also have played a role in the assessment of the clash chords in our experiment. A recent study asked participants with varying levels of music training to rate harmonic surprise in block-chord instantiations of chord progressions from commercially successful songs (Cheung et al., 2020). The study showed that harmonic surprise could be predicted by a combination of cognitive factors (long-term and short-term statistical learning) and sensory factors (dissonance accumulated in echoic memory). The authors also found that the contribution of sensory dissonance to the surprise ratings was larger for participants with less music training. Thus, it is possible not only that sensory dissonance played a role in our experiment but also that its contribution was modulated by music training. The characteristics of our experiment, however, do not allow us to disentangle the effects of sensory and cognitive factors on participants' ratings.
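
A hedged sketch of this kind of analysis is given below: surprise-like ratings are regressed on a cognitive predictor, a sensory predictor, and the interaction of the sensory predictor with music training, so that the sensory contribution can shrink as training increases. All variables and coefficients are simulated for illustration; this is our own simplified stand-in, not Cheung et al.'s (2020) actual model.

```python
# Illustrative regression combining a cognitive predictor (e.g., a
# statistical-learning-based surprisal) and a sensory predictor (e.g.,
# accumulated dissonance), with the sensory term interacting with training.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 500
surprisal = rng.normal(size=n)          # cognitive factor (simulated)
dissonance = rng.normal(size=n)         # sensory factor (simulated)
training = rng.uniform(0, 10, size=n)   # years of music training (simulated)

# Simulated ratings: the sensory contribution shrinks as training increases.
ratings = (0.6 * surprisal
           + (0.8 - 0.05 * training) * dissonance
           + rng.normal(scale=0.3, size=n))

X = np.column_stack([surprisal, dissonance, training, dissonance * training])
model = LinearRegression().fit(X, ratings)
print("coefficients (surprisal, dissonance, training, dissonance x training):",
      model.coef_.round(2))
```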

The present study provided some evidence that both veridical and schematic harmonic knowledge can facilitate determining whether a test chord is the same as the original chord used in the best-known recording of a commercially successful song. The results suggest that the contribution of veridical and schematic harmonic knowledge to the chord-assessment task is at least partially determined by how schematically appropriate the test chord is, with veridical harmonic knowledge being more crucial when the test chord is schematically plausible. This study, however, did not test how different types of stylistically plausible chord substitutions are perceived, a topic that can be investigated in future research.

To our knowledge, this study is the first to isolate vocals from commercially successful songs in order to study memory for the harmony of the accompaniment. The ecological validity of the experimental task and stimuli, combined with the relatively clear results regarding the effects of veridical and schematic harmonic knowledge, suggests that this type of experimental paradigm could be of great value for increasing our understanding of harmonic perception.

1. The participants were rejected if they (a) responded before listening to the whole stimulus; (b) did not identify the control stimuli that had no accompanying chords; or (c) provided likely automatic responses to the open-ended questions (e.g., nonsensical or extremely repetitive responses).

References

Ahler, D. J., Roush, C. E., & Sood, G. (2019, April 4–7). The micro-task market for lemons: Data quality on Amazon's Mechanical Turk [Paper presentation]. 77th Annual Conference of the Midwest Political Science Association, Chicago, IL, United States.
Armitage, J., & Eerola, T. (2020). Reaction time data in music cognition: Comparison of pilot data from lab, crowdsourced, and convenience Web samples. Frontiers in Psychology, 10, 2883. https://doi.org/10.3389/fpsyg.2019.02883
Arthur, C. (2017). Taking harmony into account: The effect of harmony on melodic probability. Music Perception, 34(4), 405–423. https://doi.org/10.1525/mp.2017.34.4.405
Bernstein, L. (1976). The unanswered question: Six talks at Harvard. Harvard University Press.
Bharucha, J. J. (1984). Anchoring effects in music: The resolution of dissonance. Cognitive Psychology, 16(4), 485–518.
Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53(1), 605–634. https://doi.org/10.1146/annurev.psych.53.100901.135239
Bosnjak, M., & Tuten, T. L. (2003). Prepaid and promised incentives in web surveys: An experiment. Social Science Computer Review, 21(2), 208–217. https://doi.org/10.1177/0894439303021002006
Cheung, V. K., Harrison, P. M. C., Koelsch, S., Pearce, M. T., Friederici, A. D., & Meyer, L. (2020). Distinct roles of cognitive and sensory information in musical expectancy. PsyArXiv Preprints. https://doi.org/10.31234/osf.io/z76hg
Creel, S. C. (2011). Specific previous experience affects perception of harmony and meter. Journal of Experimental Psychology: Human Perception and Performance, 37(5), 1512–1526. https://psycnet.apa.org/record/2011-09209-001
Cullimore, J. R. (1999). Harmonic hierarchies as distinctive abstractions that listeners may derive from musical surface structure (Unpublished master's thesis). Queen's University, Kingston, Canada.
De Clercq, T., & Temperley, D. (2011). A corpus analysis of rock harmony. Popular Music, 30(1), 47–70. https://www.cambridge.org/core/journals/popular-music/article/abs/corpus-analysis-of-rock-harmony/C5210A8EC985DDF170B53124F4464DA4
Dennis, S. A., Goodson, B. M., & Pearson, C. A. (2020). Online worker fraud and evolving threats to the integrity of MTurk data: A discussion of virtual private servers and the limitations of IP-based screening procedures. Behavioral Research in Accounting, 32(1), 119–134. https://doi.org/10.2308/bria-18-044
Difallah, D., Filatova, E., & Ipeirotis, P. (2018). Demographics and dynamics of Mechanical Turk workers. In Q. Yu & J. Chen (Eds.), Proceedings of WSDM 2018: The eleventh ACM international conference on web search and data mining (pp. 135–143). Association for Computing Machinery. https://doi.org/10.1145/3159652.3159661
Everett, W. (2009). The foundations of rock: From "Blue Suede Shoes" to "Suite: Judy Blue Eyes." Oxford University Press.
Farbood, M. M. (2012). A parametric, temporal model of musical tension. Music Perception, 29(4), 387–428. https://doi.org/10.1525/mp.2012.29.4.387
Friedman, R. S. (2019). Exploring the impact of continual drones on perceived musical emotion. Psychomusicology: Music, Mind, and Brain, 29(4), 171–179. https://psycnet.apa.org/record/2019-30392-001
Galizio, M., & Hendrick, C. (1972). Effect of musical accompaniment on attitude: The guitar as a prop for persuasion. Journal of Applied Social Psychology, 2(4), 350–359.
Guo, S., & Koelsch, S. (2016). Effects of veridical expectations on syntax processing in music: Event-related potential evidence. Scientific Reports, 6(1), 1–11. https://www.nature.com/articles/srep19064
Huron, D. (2016). Voice leading: The science behind a musical art. MIT Press.
Jimenez, I., & Kuusi, T. (2018). Connecting chord progressions with specific pieces of music. Psychology of Music, 46(5), 716–733. https://doi.org/10.1177/0305735617721638
Jimenez, I., & Kuusi, T. (2020). What helps jazz musicians name tunes from harmony? The relationship between work with harmony and the ability to identify well-known jazz standards from chord progressions. Psychology of Music, 48(2), 215–231. https://doi.org/10.1177/0305735618793005
Jimenez, I., Kuusi, T., Czedik-Eysenberg, I., & Reuter, C. (2021). Identifying songs from their piano-driven opening chords. Musicae Scientiae. https://doi.org/10.1177/10298649211003631
Jimenez, I., Kuusi, T., & Ojala, J. (2022). Relative salience of chord-type and chord-voicing changes: A two-oddball paradigm. Psychology of Music, 50(5), 1566–1595. https://doi.org/10.1177/03057356211055214
Justus, T. C., & Bharucha, J. J. (2001). Modularity in musical processing: The automaticity of harmonic priming. Journal of Experimental Psychology: Human Perception and Performance, 27(4), 1000–1011.
Kopiez, R., & Platz, F. (2009). The role of listening expertise, attention, and musical style in the perception of clash of keys. Music Perception, 26(4), 321–334. https://doi.org/10.1525/mp.2009.26.4.321
Krumhansl, C. L. (1990). Cognitive foundations of musical pitch. Oxford University Press.
Kuusi, T., Jimenez, I., & Schulkind, M. (2021). Revisiting the effect of listener and musical factors on the identification of music from chord progressions. In J. Ojala & L. Suurpää (Eds.), Musical performance in context: A festschrift in celebration of doctoral education at the Sibelius Academy. DocMus Research Publications 17. Sibelius Academy.
Lerdahl, F. (2001). Tonal pitch space. Oxford University Press.
Lhost, E., & Ashley, R. (2006). Jazz, blues and the language of harmony: Flexibility in online harmonic processing. In M. Baroni (Ed.), Proceedings of the 9th International Conference for Music Perception and Cognition (pp. 1282–1288). Bologna, Italy.
Marshall, L., & Born, J. (2007). The contribution of sleep to hippocampus-dependent memory consolidation. Trends in Cognitive Sciences, 11(10), 442–450. https://doi.org/10.1016/j.tics.2007.09.001
Miles, S. A., Miranda, R. A., & Ullman, M. T. (2016). Sex differences in music: A female advantage at recognizing familiar melodies. Frontiers in Psychology, 7, 278. https://doi.org/10.3389/fpsyg.2016.00278
Miles, S. A., Rosen, D. S., & Grzywacz, N. M. (2017). A statistical analysis of the relationship between harmonic surprise and preference in popular music. Frontiers in Human Neuroscience, 11, 263. https://doi.org/10.3389/fnhum.2017.00263
Morgan-Short, K., Finger, I., Grey, S., & Ullman, M. T. (2012). Second language processing shows increased native-like neural responses after months of no exposure. PLoS ONE, 7(3), e32974. https://doi.org/10.1371/journal.pone.0032974
Müllensiefen, D., Gingras, B., Musil, J., & Stewart, L. (2014). Measuring the facets of musicality: The Goldsmiths Musical Sophistication Index (Gold-MSI). Personality and Individual Differences, 60, S35.
Müllensiefen, D., & Halpern, A. R. (2014). The role of features and context in recognition of novel melodies. Music Perception, 31(5), 418–435. https://doi.org/10.1525/mp.2014.31.5.418
Nadar, C. R., Abeßer, J., & Grollmisch, S. (2019). Towards CNN-based acoustic modeling of seventh chords for automatic chord recognition. In International Conference on Sound and Music Computing. Málaga, Spain.
O'Neil, K. M., & Penrod, S. D. (2001). Methodological variables in web-based research that may affect results: Sample type, monetary incentives, and personal information. Behavior Research Methods, Instruments, and Computers, 33, 226–233. https://doi.org/10.3758/BF03195369
O'Neil, K. M., Penrod, S. D., & Bornstein, B. H. (2003). Web-based research: Methodological variables' effects on dropout and sample characteristics. Behavior Research Methods, Instruments, and Computers, 35, 217–236. https://doi.org/10.3758/BF03202544
Pagès-Portabella, C., Bertolo, M., & Toro, J. M. (2021). Neural correlates of acoustic dissonance in music: The role of musicianship, schematic and veridical expectations. PLoS ONE, 16(12), e0260728. https://doi.org/10.1371/journal.pone.0260728
Pearce, M., & Rohrmeier, M. (2018). Musical syntax II: Empirical perspectives. In R. Bader (Ed.), Springer handbook of systematic musicology (pp. 487–505). Springer.
Povel, D. J., & Van Egmond, R. (1993). The function of accompanying chords in the recognition of melodic fragments. Music Perception, 11, 101–115.
Schellenberg, E. G., Weiss, M. W., Peng, C., & Alam, S. (2019). Fine-grained implicit memory for key and tempo. Music and Science, 2, 2059204319857198. https://doi.org/10.1177/2059204319857198
Schotanus, Y. (2020). Singing and accompaniment support the processing of song lyrics and change the lyrics' meaning. Empirical Musicology Review, 15(1–2), 18–55. https://emusicology.org/article/view/6863/5746
Schubert, E., & Pearce, M. (2015). A new look at musical expectancy: The veridical versus the general in the mental organization of music. In M. Aramaki, R. Kronland-Martinet, & S. Ystad (Eds.), 11th International Symposium on Computer Music Multidisciplinary Research (CMMR) (pp. 358–370). Springer.
Stephenson, K. (2002). What to listen for in rock: A stylistic analysis. Yale University Press.
Stoet, G. (2010). PsyToolkit: A software package for programming psychological experiments using Linux. Behavior Research Methods, 42(4), 1096–1104. http://doi.org/10.3758/BRM.42.4.1096
Stoet, G. (2017). PsyToolkit: A novel web-based method for running online questionnaires and reaction-time experiments. Teaching of Psychology, 44(1), 24–31. https://doi.org/10.1177/0098628316677643
Szpunar, K. K., Schellenberg, E. G., & Pliner, P. (2004). Liking and memory for musical stimuli as a function of exposure. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(2), 370–381. https://doi.org/10.1037/0278-7393.30.2.370
Tagg, P. (2000). Melody and accompaniment. Articles for Encyclopedia of Popular Music of the World (EPMOW). https://tagg.org/articles/xpdfs/melodaccUS.pdf
Tillmann, B., & Bigand, E. (2010). Musical structure processing after repeated listening: Schematic expectations resist veridical expectations. Musicae Scientiae, 14(2_suppl), 33–47. https://doi.org/10.1177/10298649100140S204
Tuten, T. L., Galesic, M., & Bosnjak, M. (2004). Effects of immediate versus delayed notification of prize draw results on response behavior in web surveys—An experiment. Social Science Computer Review, 22(3), 377–384. https://doi.org/10.1177/0894439304265640
Van Balen, J. M. H., Burgoyne, J. A., Wiering, F., & Veltkamp, R. C. (2013). An analysis of chorus features in popular song. In A. S. Britto Jr., F. Gouyon, & S. Dixon (Eds.), Proceedings of the 14th Society of Music Information Retrieval Conference (ISMIR) (pp. 107–112). Curitiba, Brazil.
Vuvan, D. T., & Hughes, B. (2019). Musical style affects the strength of harmonic expectancy. Music and Science, 2. https://doi.org/10.1177/2059204318816066
Williams, L. R. (2005). Effect of music training and musical complexity on focus of attention to melody or harmony. Journal of Research in Music Education, 53(3), 210–221. https://doi.org/10.1177/002242940505300303
Wolpert, R. S. (2000). Attention to key in a nondirected music listening task: Musicians vs. nonmusicians. Music Perception, 18(2), 225–230. https://doi.org/10.2307/40285910

Appendix A

Information about the Participant Background Variables

Variable name | Explanation of the variable | M | SD | Min | Max | % "never," "NA," or 0**
age | age | 42.44 | 11.29 | 20 | 70 |
V1_GoldMSI_Factor1_Active_engagement | Questions from the Gold-MSI related to active engagement with music* | 3.82 | 1.24 | 1.11 | 6.78 |
V2_GoldMSI_Factor2_Perceptual_abilities | Questions from the Gold-MSI related to perceptual abilities [1 to 7] | 5.29 | 1.03 | 1.89 | 7.00 |
V3_GoldMSI_Factor3_Musical_training | Questions from the Gold-MSI related to musical training [1 to 7] | 3.09 | 1.79 | 1.00 | 7.00 |
V4_GoldMSI_Factor4_Singing_abilities | Questions from the Gold-MSI related to singing abilities [1 to 7] | 4.11 | 1.37 | 1.00 | 7.00 |
V5_GoldMSI_Factor5_Factor_Emotions | Questions from the Gold-MSI related to emotions [1 to 7] | 5.30 | 1.16 | 1.67 | 7.00 |
V6_playing_chords_by_ear_total_hours | Total hours of having played chords by ear*** | 509.81 | 1304.30 | 0.00 | 6517.86 | 68%
V7_playing_chords_from_music_notation_total_hours | Total hours of having played chords from notation*** | 1217.97 | 4126.86 | 0.00 | 27375.02 | 61%
V8_years_ear_training_chords_and_progressions | Total years participants reported having studied or practiced the identification of chords and chord progressions by ear | 1.83 | 5.36 | 0.00 | 30.00 | 76%
V9_number_of_pieces_composed | Number of pieces participants had composed in their lives | 4.06 | 14.89 | 0.00 | 100.00 | 76%
V10_number_of_pieces_arranged | Number of pieces participants had arranged in their lives | 8.93 | 48.10 | 0.00 | 500.00 | 80%
V11_average_times_heard_for_all_10_songs | Average times participants had heard all the 10 songs from the experiment | 26.91 | 17.43 | 0.10 | 60.00 | 0%
V12_average_times_sung_for_all_10_songs | Average times participants had sung all the 10 songs from the experiment | 11.82 | 13.18 | 0.00 | 60.00 | 4%
V13_average_times_played_for_all_10_songs | Average times participants had played all the 10 songs from the experiment on a harmonic instrument† | 1.38 | 4.35 | 0.00 | 29.35 | 73%
V14_percentage_times_played_by_ear_for_all_10_songs | Percentage of times participants had played all the 10 songs from the experiment by ear on a harmonic instrument†† | 5% | 13% | 0% | 68% | 80%
V15_average_score_for_chord_labels_for_entire_excerpt_for_all_10_songs | Average score we gave to the chord labels participants provided (no chord labels = 0) | 0.07 | 0.21 | 0.00 | 0.95 | 89%
V16_average_self_reported_vividness_of_memory_for_accompaniment_for_all_10_songs | Participants' self-report about the vividness of their memory for the accompaniment of the tested excerpts [0 to 1] | 0.51 | 0.18 | 0.20 | 0.92 |

Note:

* The Gold-MSI uses 1-to-7 Likert scales and multiple-choice questions with 7 options that are later tabulated as equivalent to the 1-to-7 Likert scale. In the table, the scores for the different Gold-MSI factors preserve the original 1-to-7 range.

** Percentage of participants responding “never,” “NA,” 0, or who were not asked the question because they reported never having sung regularly nor having ever played an instrument.

*** To obtain a more accurate estimate of total hours, we asked participants to estimate the approximate number of years and average hours per week.

† V13 averages include all 10 songs even if participants had never played or heard the song.

†† V14 summarizes participants' responses when asked to choose one of the following statements for each of the songs they had played: (1) I have only played this song by reading it from musical notation (e.g., chord symbols, tablature, staff notation). (2) I have played this song from memory, but only after reading it from musical notation (e.g., chord symbols, tablature, staff notation). (3) I have mostly played this song by ear, but I have seen its music notation at least once. (4) I have only played this song fully by ear (I have never seen its music notation and I figured out its chords completely by ear). The responses for each song were tabulated as follows: 1 = 0%, 2 = 50%, 3 = 75%, 4 = 100%. V14 is the average of these percentages across the songs played; participants who had not played any of the songs received 0% for this variable.
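
For clarity, the tabulation rule for V14 just described can be expressed as the following short Python sketch (the function name and example responses are ours):

```python
# V14 tabulation rule: per-song responses are mapped 1=0%, 2=50%, 3=75%,
# 4=100%; V14 averages over the songs the participant has played, and is
# 0% for participants who played none of the songs.
NOTATION_TO_PERCENT = {1: 0.0, 2: 0.5, 3: 0.75, 4: 1.0}

def v14_score(responses_for_played_songs: list[int]) -> float:
    if not responses_for_played_songs:
        return 0.0
    shares = [NOTATION_TO_PERCENT[r] for r in responses_for_played_songs]
    return sum(shares) / len(shares)

# Example: a participant who played three of the songs, with responses 2, 3, 4.
print(f"{v14_score([2, 3, 4]):.0%}")  # 75%
```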

Appendix B

Information about Songs and Test Chords for the Main Experiment

Song | Artist or Band | Year of Release | Type of Test Chord | Test Chord (letter chord name) | Test Chord (Roman numeral)
Yesterday | The Beatles | 1965 | target | | V/V or II
 | | | lure | Bb(add6) | IV (add6)
 | | | clash | | #IV
(Sittin' On) The Dock of the Bay | Otis Redding | 1968 | target | | V/V or II
 | | | lure | |
 | | | clash | Ab | bII in major mode (clashing with melody)
How Deep Is Your Love | Bee Gees | 1977 | target | G7 | V7/vi
 | | | lure | C7 | V7/ii
 | | | clash | | (#)VII in major mode
Just the Way You Are | Billy Joel | 1977 | target | GM7 | IVM7
 | | | lure | BbM7 | bVIM7
 | | | clash | Ab | #IV
Dust in the Wind | Kansas | 1977 | target | | bVII in minor mode (V in relative major)
 | | | lure | G/B | bVII6 in minor mode (V6 in relative major)
 | | | clash | | Major II in minor mode (clashing with melody)
True Colors | Cyndi Lauper | 1986 | target | F add9(6&M7) | IVadd9(6&M7)
 | | | lure | Am7(11) | vi7(11)
 | | | clash | | (#)VII in major mode
Tears in Heaven | Eric Clapton | 1992 | target | D/F# | IV6
 | | | lure | | IV
 | | | clash | Ab | (#)VII in major mode
Wonderwall | Oasis | 1995 | target | B7sus4 | IV7sus4 in minor mode
 | | | lure | F#m7 | i7
 | | | clash | Bb | #III in minor mode
Umbrella | Rihanna feat. Jay-Z | 2007 | target | Fm7 | iii7
 | | | lure | Ab5&6 | V5&6 (3rd in melody)
 | | | clash | | bII in major mode (clashing with melody)
Viva la Vida | Coldplay | 2008 | target | Eb7sus4 | V7sus4
 | | | lure | Absus4(add9) | Isus4(add9)
 | | | clash | | #IV (clashing with melody)

Appendix C

Transcription of Stimuli Using Target Chord and All-chords Condition