We investigated perception of virtual pitches at missing fundamentals (MFs) in musical chords of three chromas (simultaneous trichords). Tone profiles for major, minor, diminished, augmented, suspended, and four other trichords of octave-complex tones were determined. In Experiment 1, 40 musicians rated how well a tone went with a preceding chord; in Experiment 2, whether the tone was in the chord. Mean ratings for nine non-chord tones were compared with predictions of four models: MFs, diatonicity, 5th-interval relations, and tones that complete familiar tetrachords (e.g., 7th chords). Profiles were accounted for by all four models in Experiment 1, and by two (MFs, 5th relations) in Experiment 2. Overall, effect size was largest for MFs. In Experiment 3, listeners heard a chord and chose a matching tone from 12 possibilities. Profile peaks were predicted by pitch models (usually, the lower tone of a perfect 5th). Participants who were more likely to attend to MFs in isolated harmonic complex tones (fundamental listeners) were not more sensitive to MFs in chords, suggesting their responses instead depended on statistical properties of familiar music. We propose a speculative, psychohistoric explanation: MFs influenced the historical development of musical structure, which in turn influenced the perception of enculturated modern listeners.

Major-minor tonality is the most familiar system of structuring pitch in Western tonal music. Pitches and pitch patterns are perceived in relation to a major or minor scale, a major or minor triad (the tonic chord), or a tone (the tonic) (Cuddy & Badertscher, 1987; Krumhansl, 1990). The musical surface can often be reduced to, or is perceived as, familiar harmonic progressions (Holleran, Jones, & Butler, 1995). How we perceive musical structures depends on our auditory experience, which in turn depends on how musical structures developed in the past (Lynch, Eilers, Oller, & Urbano, 1990). The familiar pitch-time patterns of major-minor tonality emerged during a long period of musical development (Dahlhaus, 1968/1990). The structure of early Western polyphony was governed by a combination of explicit compositional rules and implicit perceptual principles (e.g., Eberlein, 1994; Huron, 2001).

To understand how major-minor tonality works, we need to understand both the psychological and the historical (here: psychohistoric) origins of Western tonal pitch structures. Overarching questions include: Why are some pitch combinations more consonant or common (prevalent) than others? How did the perception of consonance and dissonance develop historically? Why have major and minor triads played such a central role in Western polyphony since the 14th century?1 Why did major-minor tonality emerge in the 16th-17th centuries and why did it come to dominate most (Western) music?

Questions of this kind involve both humanities and sciences. From a scientific perspective, major-minor tonality emerged historically under the constant influence of universal principles of perception and cognition (Eberlein, 1994; Parncutt, 2011a). Scientists in disciplines such as acoustics, psychology, biology, mathematics, and computing attempt to reduce major-minor tonality to underlying psychophysical and neurocognitive principles (Bharucha, 1984, 1987; Deutsch & Feroe, 1981; Krumhansl, 1990; Tillmann, Bharucha, & Bigand, 2000; Trainor & Trehub, 1994). Humanities scholars in disciplines such as music theory, music analysis, music history, and music sociology tend instead to regard major-minor tonality as a complex, partly arbitrary outcome of historical, social, cultural, and political interactions (Lowinsky, 1954; Norton, 1984). Humanities scholars and scientists ask similar questions about musical structure, but adopt different approaches. Psychologists may ask about perception and cognition today but ignore the past, despite evidence that perception generally depends on culture and history (Nisbett, Peng, Choi, & Norenzayan, 2001). Historians and music theorists may study the history of music and music theory but ignore empirical psychology.

Humanities scholars and scientists (in both “psychoacoustic” and “cognitive” approaches) agree that pitches in tonal musical contexts vary in importance, but use different words to describe such variations (e.g., stability, salience, hierarchy). Krumhansl (1990) presented diverse experimental data that provided:

a quantitative measure of the hierarchical ordering imposed on the individual tones in tonal contexts. In music-theoretical terms, the rating might be identified with the relative stability or structural significance of tones as they function within tonal contexts. It will be argued that this hierarchy is, in some sense, basic to the structuring of music itself and also to the psychological response to music.

This identification of a music-theoretical construct and a pattern of psychological data, then, represents a point of contact between the structure contained within the music and described by music theory, and the listener's response to that structure. (Krumhansl, 1990, p. 16)

What is the origin of these differences between more and less stable tones? From a psychoacoustic viewpoint, tones with the same amplitude and waveform, when presented simultaneously, can differ in perceptual salience for two reasons. First, masking among nearby partials tends to make inner voices in a musical texture less salient than outer voices. Second, complex tone perception tends to increase the salience of tones that correspond to periodicities or fundamentals (Moore, 2003; Terhardt, Stoll, & Seewann, 1982).

In addition, missing fundamentals (MFs) may be perceived at pitches not corresponding to chord tones (that is, at non-chord pitches corresponding to non-chord tones).2 For example, Parncutt (1988) predicted that an A-minor triad (ACE) evokes weak pitches at D and F, and Parncutt (1989) incorporated the predicted salience of MFs at non-chord pitches in a model of pitch commonality, intended to account for harmonic relationships perceived between successive chords such as CEG (with MFs at D, F, and A) and DFA (with MFs at G and B♭). A theory of harmony based on the perception of harmonic patterns among the partials of complex sounds is promising, considering the biological importance of voiced speech sounds (Bowling & Purves, 2015; Bowling, Purves, & Gill, 2017). It is also possible to predict interesting structural aspects of tonality using a pitch-commonality model that considers only spectral pitch and ignores harmonic pitch patterns and virtual pitch (Milne, Laney, & Sharp, 2015).

Parncutt (1989) adapted the pitch model of Terhardt et al. (1982) for music-theoretical purposes, assigning all input frequencies and output pitches to 12 equally spaced categories per octave across the range of hearing (120 categories altogether). For a given input sound, represented as a sum of pure tones with different frequencies and amplitudes, the model estimated the perceptual salience of each audible partial, looked for harmonic patterns among these spectral pitches (i.e., among the perceived pitches of individual audible partials)3 and on that basis estimated the perceptual salience of virtual pitches.

Relevant predictions are shown in Table 1. The predictions were made using the model of Parncutt (1989), based on simple assumptions about mutual masking and harmonic pattern recognition among simultaneous partials. The free parameter settings in the model were kM = 18, kT = 3, and kS = 0.5. Parameter kM is the gradient of the masking pattern of a pure tone in dB per critical band; if it is high, there is less masking and the partials are more clearly audible. Parameter kT is a measure of how analytically tones are perceived; if it is high, one is more likely to experience spectral than virtual pitches. Parameter kS measures the tendency to hear multiple tones; if it is high, more tones are perceived simultaneously.

TABLE 1.

Predicted Chroma-salience Profiles of Four Common Chords Constructed From Two Different Kinds of Tone According to Parncutt (1989) 

| Chord | Tone type | C | C♯/D♭ | D | D♯/E♭ | E | F | F♯/G♭ | G | G♯/A♭ | A | A♯/B♭ | B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Major (e.g., CEG) | HCT | 0.87 | 0.01 | 0.19 | 0.03 | 0.66 | 0.09 | 0.01 | 0.64 | 0.05 | 0.15 | 0.03 | 0.12 |
| | OCT | 1.62 | 0.00 | 0.05 | 0.05 | 0.55 | 0.17 | 0.01 | 0.58 | 0.06 | 0.16 | 0.01 | 0.00 |
| Minor (e.g., CE♭G) | HCT | 0.81 | 0.01 | 0.09 | 0.65 | 0.03 | 0.19 | 0.02 | 0.74 | 0.17 | 0.03 | 0.21 | 0.05 |
| | OCT | 1.34 | 0.01 | 0.02 | 1.09 | 0.00 | 0.32 | 0.00 | 0.79 | 0.32 | 0.02 | 0.01 | 0.06 |
| Suspended (e.g., CFG) | HCT | 0.88 | 0.03 | 0.17 | 0.07 | 0.03 | 0.73 | 0.00 | 0.66 | 0.04 | 0.07 | 0.14 | 0.02 |
| | OCT | 1.34 | 0.05 | 0.02 | 0.11 | 0.00 | 1.35 | 0.00 | 0.72 | 0.07 | 0.01 | 0.16 | 0.00 |
| Diminished (e.g., CE♭G♭) | HCT | 0.56 | 0.11 | 0.08 | 0.67 | 0.04 | 0.13 | 0.66 | 0.09 | 0.28 | 0.04 | 0.26 | 0.18 |
| | OCT | 1.08 | 0.02 | 0.23 | 0.93 | 0.02 | 0.31 | 1.11 | 0.00 | 0.71 | 0.00 | 0.02 | 0.49 |

Note: The OCTs have flat spectra; the amplitude envelopes of the HCTs are musically typical as specified in Parncutt (1989). The chords from HCTs are in close root position in middle register and the lowest tone is always C4 (e.g., C4E4G4). The numbers in the table are predicted chroma salience, calculated by summing virtual pitch salience across 10 octave registers, for both OCT- and HCT-chords. For example, the value at C is the sum of calculated virtual pitch salience at C0, C1, C2, … and C9. Virtual pitch salience is normalized such that the sum of all calculated values across all 120 pitches (12 per octave × 10 octaves) equals the chord's calculated multiplicity (the predicted number of simultaneously perceived pitches), which is, for example, 2.86 for the major triad of HCTs and 3.27 for the major triad of OCTs. Numbers in bold indicate chord tones; italics indicate the main MFs.

The model predicts features of tone profiles of isolated chords constructed from either octave-complex tones (OCTs; cf. empirical data of Parncutt, 1993) or harmonic-complex tones (HCTs; cf. Reichweger, 2010; Thompson & Parncutt, 1997).4 Such profiles may depend on experience of statistical distributions in music (Pearce & Wiggins, 2012)—which chromas5 typically precede or follow given chords in musical scores or performances. In the following, we will refer to musical learning processes of this kind as “nurture” by comparison to more universal or innate aspects of pitch perception, which we will call “nature.” The distinction is problematic: virtual pitch perception according to Terhardt et al. (1982) is universally learned from voiced sounds in speech and hence a form of “nurture,” but the relationship between spectral and virtual pitches is often treated as “nature” because of its quasi-universality. Note also that the dichotomy between “psychoacoustic” (or “sensory”) and “cognitive” approaches to musical pitch structures is not the same as the nature-nurture distinction; both approaches involve both psychophysics (relationships between physical and experiential parameters) and cognition (information processing) (Parncutt, 1989).

The current study aimed to evaluate the relative importance of nature and nurture in the perception of nonchord tones by comparing predictions of simple models with empirical data. We measured the chroma salience profiles of diverse musical chords in common use, comparing the results of different empirical methods and the predictions of different predictive models. A chroma salience profile is a vector of 12 numbers, each representing the perceptual salience of a chroma. A chroma salience profile of a chord is a representation of the pitches perceived in the chord and their relative saliences.
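The relation between pitch saliences and a 12-element chroma salience profile can be illustrated with a short Python sketch. It assumes that saliences are stored one per pitch category, 12 categories per octave, with index 0 at a C; the function name `fold_to_chroma` and the toy input values are hypothetical and are not data from the study.

```python
def fold_to_chroma(pitch_salience):
    """Sum salience over octave registers to get a 12-element chroma profile.

    pitch_salience: non-negative numbers, one per pitch category, with
    12 categories per octave and index 0 assumed to be a C (an assumption).
    """
    profile = [0.0] * 12
    for index, salience in enumerate(pitch_salience):
        profile[index % 12] += salience
    return profile


# Toy input (hypothetical values): 10 octaves x 12 categories = 120 pitches.
toy = [0.0] * 120
toy[48] = 0.5   # a C
toy[60] = 0.25  # the C one octave higher
toy[55] = 0.25  # a G
toy_profile = fold_to_chroma(toy)
```

Folding preserves the total salience, so a profile built this way sums to the same value as the underlying pitch saliences.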

In Experiment 1, listeners heard a chord followed by a tone and rated how well the tone went with the chord—similar to experiments reported in Krumhansl (1990), but without a preceding tonal context. From a musical perspective, this method is suitable for testing theories of chord-scale compatibility or mappings (Biles, 2003). In Experiment 2, listeners were asked whether the tone was in the chord. In Experiment 3, they could hear the chord and any of 12 chromatic tones by clicking on an interface, and chose the chord's clearest or main pitch. In Experiments 1 and 2, our analysis focused on non-chord tones (i.e., the nine chromas that did not correspond to chord tones), whereas in Experiment 3, we focused on the three chord tones. By comparing results using different methods, we aimed to shed light on the psychological nature and musical function of chords and hence on the major-minor tonal system.

Like that of Krumhansl (1990), our study was confined to chords constructed from OCTs. This strategy allowed us to eliminate confounds of pitch register and chord voicing and made it possible to demonstrate or falsify the psychological reality of weaker pitches6 (less salient chromas) by statistical comparison of ratings across a finite number of pitches. For example, the strategy enabled us to ask whether chromas F and A (predicted MFs) are evoked by the chord CEG, by comparing ratings for those two tones with seven other non-chord tones (C♯/D♭, D, etc.).

Analogous experiments using chords of HCTs (Reichweger, 2010) suffered from a confound: the greater the overall perceived pitch distance between test sound and probe tone, the lower the goodness-of-fit rating. In practice, this confound can be reduced, but not eliminated, by careful choice of test sounds and probe tones. In general, when a listener is asked to compare a test sound with a probe tone (e.g., for goodness of fit), results depend on both pitch commonality and pitch distance (Parncutt, 1989). The distinction is assumed to depend on categorical pitch perception (Burns & Ward, 1978): pitch commonality is higher when a greater number of (more salient) successive pitches are perceived to lie in the same pitch categories, whereas pitch distance is higher when the distance between (more salient) successive pitches in different categories is higher.

Parncutt (1989) demonstrated the psychological reality of selected non-chord pitches in chords of HCTs, but those pitches were octave-equivalent to chord tones: for example, the chord C4E4G4 was shown to evoke pitches at C3, G3, C5, and G5. To demonstrate the psychological reality of non-chord chromas, one would have to average over all inversions of each chord as well as musically typical spectral envelopes, spacings, and registral distributions, greatly increasing the number of experimental trials. These methodological difficulties evaporate when both test sound and probe tone are constructed from OCTs, such that the overall perceived pitch distance between test and probe is held approximately constant.

PREDICTORS AND MODELS

In the present study, we measured chroma-salience profiles for several musically typical chords and compared them with four predictors: one “nature” predictor (MFs) and three “nurture” predictors (diatonicity, 5th relations, and completion tones). The models are introduced here and later operationalized in Table 5 and accompanying text.

Missing fundamentals.

In the spectrum of a typical musical chord, patterns of spectral pitches (audible partials) may correspond to incomplete harmonic series. In general, pitch may be perceived at the fundamentals of such patterns (Ritsma, 1967). Empirical research on categorical perception of partial frequencies within complex tones (Moore, Peters, & Glasberg, 1985) suggests that these harmonic patterns need not be exact and can be mistuned by a few tens of cents (Terhardt et al., 1982). The inherently approximate nature of pitch intervals in memory, both in this case and in music, undermines the Pythagorean concept of musical intervals as frequency ratios (Parncutt & Hair, 2018). Predictions about pitch perception in musical chords should therefore be almost independent of variations among theoretical tuning systems such as Pythagorean and Pure/Just (Barbour, 1951).

Diatonic and chromatic representations of the harmonic series are provided for reference in Table 2. In a first approximation, consider only harmonics that are octave-equivalent to the fundamental (harmonic numbers 1, 2, 4, 8, etc.). In this case, and considering only fundamental frequencies of chord tones, the major triad CEG may evoke MFs at A, F and D. An MF is predicted at A, for example, because E corresponds to the 3rd harmonic and G to the 7th harmonic of A. In other words, the MFs are subharmonics of chord tones. Similarly, the minor triad ACE evokes MFs at D and F. In general, the larger the number of coinciding subharmonics at a given pitch, and the lower the harmonic numbers involved, the more salient is the predicted pitch (Terhardt et al., 1982).
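The idea of MFs as coinciding subharmonics of chord tones can be sketched as a simple count. The Python below is only an illustration, not the weighted salience model of Parncutt (1989): it assumes that a chord tone supports a candidate fundamental if it lies at one of the octave-generalized intervals of harmonics 1 to 10 listed in Table 2, and the threshold of two supporting chord tones is a hypothetical choice made for this example.

```python
# Octave-generalized intervals (semitones) of harmonics 1-10 above the
# fundamental, as in Table 2: P1, P5, M3, m7, M2.
HARMONIC_INTERVALS = {0, 7, 4, 10, 2}
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def harmonic_matches(chord, candidate):
    """Count chord tones that fit a harmonic of the candidate fundamental."""
    return sum(1 for tone in chord
               if (tone - candidate) % 12 in HARMONIC_INTERVALS)


def mf_candidates(chord, min_matches=2):
    """Non-chord chromas supported by at least min_matches chord tones.

    min_matches=2 is an illustrative threshold, not a value from the paper.
    """
    return [c for c in range(12)
            if c not in chord and harmonic_matches(chord, c) >= min_matches]


c_major = {0, 4, 7}   # C, E, G
a_minor = {9, 0, 4}   # A, C, E
print([NOTE_NAMES[c] for c in mf_candidates(c_major)])  # ['D', 'F', 'A']
print([NOTE_NAMES[c] for c in mf_candidates(a_minor)])  # ['D', 'F']
```

Under these assumptions, the count reproduces the MFs named in the text for CEG and ACE, but it ignores harmonic number and masking, both of which affect predicted salience in the full model.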

TABLE 2.

The Harmonic Series in Diatonic and Chromatic Representations Relative to an Arbitrary Fundamental Frequency of 110 Hz (the Musical Note A2)

| Harmonic no. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequencies (relative to 110 Hz) | 110 | 220 | 330 | 440 | 550 | 660 | 770 | 880 | 990 | 1100 |
| Note names (relative to A2) | A2 | A3 | E4 | A4 | C♯5 | E5 | G5 | A5 | B5 | C♯6 |
| Simple diatonic intervals | P1 | P1 | P5 | P1 | M3 | P5 | m7 | P1 | M2 | M3 |
| Chromatic intervals in semitones | 0 | 0 | 7 | 0 | 4 | 7 | 10 | 0 | 2 | 4 |

Note: The numbers following the note names in row 3 are octave registers (middle C is the lowest tone of register 4). Abbreviations for diatonic intervals in row 4 are P = perfect, M = major, m = minor, 1 = unison, 2 = second, 3 = third, 5 = fifth, 7 = seventh. The intervals in the last two lines are relative to the lowest tone or fundamental. They are octave generalized; that is, expressed as simple rather than compound intervals (using modulo 12 arithmetic for chromatic intervals). Note that the tuning of the harmonic series differs from 12-tone equal temperament and piano keyboards; the largest deviation occurs at the 7th harmonic, where a ratio of 4:7 is 31 cents smaller than an equally tempered m7 interval (10 semitones).
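The chromatic row of Table 2 and the 31-cent deviation of the 7th harmonic follow from a few lines of arithmetic. The Python sketch below rounds each harmonic to the nearest equally tempered semitone; the function names are chosen for this illustration only.

```python
import math

FUNDAMENTAL_HZ = 110.0  # A2, the reference used in Table 2


def chromatic_interval(harmonic_number):
    """Nearest equally tempered interval above the fundamental, modulo 12."""
    return round(12 * math.log2(harmonic_number)) % 12


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


frequencies = [n * FUNDAMENTAL_HZ for n in range(1, 11)]
intervals = [chromatic_interval(n) for n in range(1, 11)]
print(intervals)  # [0, 0, 7, 0, 4, 7, 10, 0, 2, 4]

# Ratio 4:7 compared with an equally tempered minor 7th (1000 cents).
deviation_7th = cents(7 / 4) - 1000
print(round(deviation_7th, 1))  # -31.2
```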

Note that the predictions of spectral and temporal models of pitch perception are similar to each other due to the mathematical equivalence of time- and frequency-domain representations; our aim here is not to compare pitch models or approaches with each other, but to test predictions that are common to different models. Note also that the tendency to perceive pitch at the MF in isolated tones is subject to large individual differences (Schneider et al., 2005; Seither-Preisler, Johnson, Seither, & Lütkenhöner, 2008): fundamental listeners are more likely to hear pitches at MFs, whereas spectral listeners are less likely to do so.

Diatonicity.

The standard diatonic scale or set corresponds to the white keys on the modern piano and comprises seven chromas (C, D, E, F, G, A, and B), in any transposition and any musically typical or acceptable tuning. It represents a closed segment of the circle of 5ths, all tones being related to at least one other tone by a P4 or P5 interval. Most types of chords in the major-minor system can be played (in transposition) in this scale. Sometimes the same type of chord can be played at more than one position within the scale. For instance, in the key of C major, there are major triads on C, F, and G, and minor triads on D, E and A. Diatonic scales are highly familiar to Western listeners (Deutsch & Feroe, 1981), having dominated Western music since ancient times (Gauldin, 1983). If a chord is a subset of the standard diatonic scale, listeners may expect to hear other tones in the same scale in the music that follows.
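One way to make the diatonicity idea concrete is to list the transpositions of the diatonic set that contain a given chord, and then count how many of those scales contain each non-chord tone. The Python sketch below does this; it is an illustration under that assumption and is not necessarily the operationalization used in the analysis, and all function names are hypothetical.

```python
DIATONIC = frozenset({0, 2, 4, 5, 7, 9, 11})  # white keys, C = 0


def transposed_scale(shift):
    """Diatonic set transposed upward by shift semitones."""
    return frozenset((p + shift) % 12 for p in DIATONIC)


def scales_containing(chord):
    """Transpositions (0 = C major set) whose scale contains every chord tone."""
    return [t for t in range(12) if set(chord) <= transposed_scale(t)]


def diatonic_counts(chord):
    """For each non-chord chroma, the number of such scales that include it."""
    scales = [transposed_scale(t) for t in scales_containing(chord)]
    return {tone: sum(tone in scale for scale in scales)
            for tone in range(12) if tone not in chord}


c_major = {0, 4, 7}
print(scales_containing(c_major))  # [0, 5, 7] -> C, F, and G major sets
counts = diatonic_counts(c_major)
print(counts[5], counts[6])        # F: 2, F sharp: 1
```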

5th relations.

On the circle of 5ths, chromatic pitches are assigned to a circular representation similar to a clock face: C at 12 o'clock, G at 1 o'clock, D at 2 o'clock, and so on. There is convergent evidence in the literature for the psychological reality of this construct (Bharucha, 1987; Krumhansl, 1990, 1991; Lerdahl, 1988; Shepard, 1982; Thompson & Cuddy, 1989). In a tonal musical context, when any tone is heard, other tones at P4 or P5 intervals above or below it may be expected (cognitively activated or facilitated). This idea is plausible given the central role of diatonic scales based on P8 and P5 intervals in Western tonal music. But this observation also means our diatonicity and 5th-relatedness predictors overlap. They also overlap with the MF predictor, because many MFs lie at P5 intervals below chord tones.
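The clock-face mapping and the set of 5th-related tones can be written down directly. In the Python sketch below, a chroma's position on the circle of 5ths is obtained by multiplying by 7 modulo 12, and a non-chord tone counts as 5th-related if it lies a P4 or P5 (5 or 7 semitones) from any chord tone. The function names are hypothetical, and this simple rule is an assumption made for illustration.

```python
def clock_position(chroma):
    """Position on the circle of 5ths (C = 0, i.e., 12 o'clock; G = 1; D = 2)."""
    return (7 * chroma) % 12


def fifth_related(chord):
    """Non-chord chromas a P4 or P5 above or below at least one chord tone."""
    related = set()
    for tone in chord:
        related.add((tone + 7) % 12)  # P5 above (or P4 below)
        related.add((tone + 5) % 12)  # P4 above (or P5 below)
    return sorted(related - set(chord))


print(clock_position(7), clock_position(2))  # 1 2
print(fifth_related({0, 4, 7}))              # [2, 5, 9, 11] -> D, F, A, B
```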

Completion tones.

Any chord of 3 chromas (trichord) can be transformed into a chord of 4 chromas (tetrachord) by adding one of the 9 other chromas. For example, a diminished triad can be turned into a dominant, diminished, or half-diminished 7th chord, or six other less familiar tone combinations, by adding a fourth tone. The relative prevalence (frequency of occurrence) of those 9 tetrachords in real music depends mainly on their consonance (Parncutt & Hair, 2011). If a tetrachord is prevalent and hence familiar, each of its four triadic subsets may be perceived as the tetrachord with a missing element—just as a familiar pattern can be recognized when elements are missing (Gestalt principle of closure). A listener may therefore expect to hear the missing tone. Our completion tone predictor again partially overlaps with the other predictors, because completion tones often lie at P4/P5 intervals from chord tones, create a harmonic pattern corresponding to part of the harmonic series, or create part of a diatonic scale.
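The enumeration of the nine candidate tetrachords can be sketched as follows. The Python below adds each non-chord chroma to a trichord and checks whether the result is a transposition of one of the three 7th chords named above; the small dictionary of familiar tetrachords is a hypothetical stand-in for a prevalence ranking, and the helper `tn_type` uses a simple canonical form chosen for this example rather than the Forte or Rahn labels.

```python
def tn_type(pcs):
    """Transposition-invariant label: smallest zero-based rotation of the set."""
    pcs = sorted({p % 12 for p in pcs})
    rotations = [tuple(sorted((p - root) % 12 for p in pcs)) for root in pcs]
    return min(rotations)


# Only the three tetrachords named in the text; an illustrative subset.
FAMILIAR = {
    tn_type({0, 4, 7, 10}): "dominant 7th",
    tn_type({0, 3, 6, 9}): "diminished 7th",
    tn_type({0, 3, 6, 10}): "half-diminished 7th",
}


def completions(trichord):
    """Map each added chroma to the familiar tetrachord it completes, if any."""
    result = {}
    for tone in range(12):
        if tone in trichord:
            continue
        name = FAMILIAR.get(tn_type(set(trichord) | {tone}))
        if name is not None:
            result[tone] = name
    return result


diminished = {0, 3, 6}  # C, E flat, G flat
print(completions(diminished))
# {8: 'dominant 7th', 9: 'diminished 7th', 10: 'half-diminished 7th'}
```

For the diminished triad on C, the added tones A♭, A, and B♭ yield the three familiar 7th chords; the remaining six added tones produce the less familiar combinations.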

To understand better how these four predictors overlap, consider a simple example. Parncutt (1993, Experiment 2) presented chords of octave-complex tones (OCTs) followed by single octave-complex probe tones. When the chord was a C-major triad (CEG), participants rated the probe tone F as sounding or fitting significantly better than F♯. Of the four proposed models, three might explain this result. In the MF model, F is an MF in a C-major triad but F♯ is not. Regarding diatonicity, F is diatonic in two scales to which the triad belongs (C major and F major), whereas F♯ is diatonic in only one (G major). Regarding 5th relations, F lies a 5th away from a chord tone (C) but F♯ does not. Only the fourth model (completion tones) cannot explain this particular result: neither CEFG nor CEF♯G is a particularly familiar tetrachord, although both do occur occasionally in tonal music. Thus, we are unable to separate the theories on the basis of this example.

Three of the models (diatonicity, 5th relations, and completion tones) presumably involve musical learning (“nurture”); that is, experience of pitch-time patterns in music and their statistical regularities (Parncutt, Reisinger, Fuchs, & Kaiser, 2018). If the tone F often follows a C-major chord (or more generally, if a major triad is often followed by a tone a P4 above the root7) in the music to which we have been exposed, we are likely to indicate in an experiment that F is a good continuation to C major, regardless of whether F is also an MF. Here and elsewhere, the prevalence of musical patterns (such as any major triad followed by a tone a P4 above the root) can be estimated by statistical analysis of representative databases of musical scores; such analysis can then predict note-by-note expectancies (cf. Pearce & Wiggins, 2012).

With the contrasting—but also overlapping—predictions of these four models in mind, the experiments were designed to answer questions such as: Do fundamental listeners rate MFs in musical chords higher than spectral listeners? Are the tone profiles more peaked, and the ratings higher on average, for more familiar chords? Does the highest peak correspond to the root? Are there peaks among the nine non-chord tones? If so, which model or models predict them?

Method

OVERVIEW

Each participant took part in one preliminary experiment and three main experiments in a single session in a quiet room at the University of Graz. Each experiment comprised a series of trials whose order was random and different for each participant. Each began with a few practice trials. The order of the three main experiments was random for each participant. The preliminary experiment was the Auditory Ambiguity Test (AAT) of Seither-Preisler et al. (2007), which aimed to separate participants into fundamental and spectral listeners. After completing the experiments, participants commented briefly on their experience of participating, their strategies, and the chords that they recognized. Statistical tests were performed in SPSS 22.

PARTICIPANTS

Forty people (26 females) participated in all experiments: 16 musicology students (University of Graz), 8 students of a musical instrument, composition, or conducting (University of Music and Performing Arts Graz), 6 electrical and audio engineering students (Graz University of Technology), 7 students in other disciplines, and 3 non-students. The age range was 18 to 57 years (M = 25, SD = 7).

All participants were musicians with over 6 years of experience of regular instrumental playing and/or singing (mean number of years of playing = 14, SD = 5; mean singing = 7, SD = 7). Most (78%) indicated that they were playing an instrument regularly; of the singers, most (63%) indicated they were singing regularly. The main instrument of 7 participants was the piano, of 7 the guitar, of 5 the flute, and of 3 each the trumpet and the violin. As a second instrument, the piano was mentioned most often (13 times). Most participants (26) named classical music as their main performance genre; further genres (in descending order) were traditional music, jazz, pop, and rock. Two participants gave singing as their main instrument and 31 reported they were singing in a choir.

Nonmusicians were not included because of the difficulty of the experimental task. A previous, comparable study (Reichweger, 2010) had found that nonmusicians were often unable to differentiate between chord tones and non-chord tones in familiar musical triads.

STIMULI

All sounds were electronically synthesized. They were either pure tones, HCTs with MFs, OCTs, or chords of OCTs.

In each trial of the preliminary experiment (AAT), participants heard two successive HCTs with MFs, in which MFs and spectral components moved in opposite directions: if the spectral components of the second sound were higher, the MF was lower, and vice versa. The task was to indicate whether the second tone was higher or lower than the first. Each tone comprised successive harmonics spanning one octave. In some tone pairs, the tone with the higher F0 had harmonic numbers 2-4 and the tone with the lower F0 had harmonics 5-10. In others, the higher tone had harmonics 3-6 and the lower had 7-14; in still others, 4-8 and 9-18. The gradient of the spectral envelope was always -6 dB per octave, comparable with a bowed string (e.g., violin). Each HCT had the same SPL before amplification and realization; overall loudness was adjusted to be comfortable. The frequency of the missing F0 was between 100 and 400 Hz. The interval between the MFs was familiar from Western music (major second or M2, major third or M3, perfect 5th or P5, or major 6th or M6). Both tone durations and the silent gaps between them were 500 ms. Test presentation and response analysis were performed by an automated Visual Basic script.

In the three main experiments, in each trial participants heard a chord of OCTs followed by a single OCT or pure tone. The OCTs corresponded to the 12 steps of the equally tempered chromatic scale with A4 = 440 Hz and unstretched octaves (2:1). Each OCT comprised 10 partials of equal amplitude (before amplification); the frequency range was from C1 (32.7 Hz) to B10 (15800 Hz). The relative phase of the partials was randomized: each was phase-shifted by a random angle between −180° and +180° before superposition, to eliminate the possibility that phase relationships might affect the relative salience of evoked pitches. The amplitude of each chord was adjusted to -0.5 dB relative to full scale. To avoid clicks, amplitude-linear ramps were applied to the start (10 ms) and end (30 ms) of each sound.
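The construction of a chord of OCTs can be sketched in plain Python. This is a minimal illustration, not the Octave code used in the study. Several details are assumptions: the sample rate of 44.1 kHz, the choice of lowest register (set here so that the 10th partial of B lies near 15.8 kHz), and the order of peak normalization and ramping. The function names are hypothetical.

```python
import math
import random

SAMPLE_RATE = 44100  # Hz; assumed, not stated in the text


def oct_partials(chroma, n_partials=10, a4_hz=440.0):
    """Octave-spaced partial frequencies for a chroma (0 = C, 9 = A).

    The lowest register is an assumption: it is chosen so that the
    highest partial of B is about 15.8 kHz.
    """
    lowest = a4_hz * 2.0 ** ((chroma - 9) / 12.0 - 4)
    return [lowest * 2 ** k for k in range(n_partials)]


def synthesize(chromas, duration_s, rng, onset_s=0.010, offset_s=0.030,
               peak_dbfs=-0.5):
    """Sum equal-amplitude partials with random phases, normalize, and ramp."""
    n = int(round(duration_s * SAMPLE_RATE))
    samples = [0.0] * n
    for chroma in chromas:
        for freq in oct_partials(chroma):
            phase = rng.uniform(-math.pi, math.pi)  # -180 to +180 degrees
            step = 2 * math.pi * freq / SAMPLE_RATE
            for i in range(n):
                samples[i] += math.sin(step * i + phase)
    # Scale the peak to peak_dbfs relative to full scale (1.0).
    gain = 10 ** (peak_dbfs / 20) / max(abs(s) for s in samples)
    n_on = int(onset_s * SAMPLE_RATE)
    n_off = int(offset_s * SAMPLE_RATE)
    # Linear amplitude ramps at the start and end to avoid clicks.
    return [s * gain * min(1.0, i / n_on, (n - 1 - i) / n_off)
            for i, s in enumerate(samples)]


# A C-major chord of OCTs, shortened to 100 ms to keep the example fast.
chord = synthesize([0, 4, 7], 0.1, random.Random(0))
```

A seeded `random.Random` instance is passed in so that the random phases are reproducible; in an experiment, a fresh draw per stimulus would be closer to the described procedure.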

For the first 20 participants in the three main experiments, the chord of OCTs had a duration of 300 ms, the silence between the chord and the probe tone was 300 ms, and the probe tone was an OCT whose amplitude was the same as that of the chord and whose duration was 300 ms. For the second group of 20 participants (numbered 21 to 40), the chord of OCTs had a duration of 100 ms, the silence was 300 ms as before, and the probe tone was a pure tone whose pitch was randomly distributed over a two-octave range from F4 (349 Hz) to E6 (1320 Hz). The amplitude of the pure tone was the same as that of the corresponding partial within the chord, and its duration was 200 ms. These changes were intended to be exploratory; we did not independently manipulate duration and probe tone type (OCT versus pure).

EQUIPMENT

Sounds were presented using a DELL Optiplex 960 computer with Intel® Core™ 2 Duo CPU E8400 @3 GHz, Windows 7 64-Bit operating system, 8GB RAM, and ADI 1984A High Definition Audio Onboard Graphic Card. The sound signal was amplified using a Samson C-control mixer-amplifier and sent to Beyerdynamic DT-100 closed headphones. “Octave” open-source software was used both to synthesize the sounds and to run the experiment.

Auditory Ambiguity Test (AAT)

PROCEDURE

This preliminary test comprised 110 trials, in a different random order for each participant. In each trial, two successive HCTs with MFs were presented. In 100 ambiguous trials the spectral envelope and the MF always moved in opposite directions. In 10 unambiguous control trials, both the MF and the spectral envelope moved in the same direction, to test the participants’ reliability. Subjects with two or more incorrectly classified control trials were discarded. Participants were not informed about the structure of the sounds; they simply indicated whether the pitch rose or fell. Participants who commented that they were sometimes unsure of the direction of motion were asked to give spontaneous “gut reactions.” Each participant was given a score between 0 and 100, representing the number of trials in which her or his response corresponded to the movement of the MF. Consistently spectral listeners had scores near 0, while consistently fundamental listeners had scores near 100.
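The scoring rule can be summarized in a short Python sketch. The trial representation (tuples of trial kind, MF direction, and response) and the function name `aat_score` are hypothetical conveniences for this illustration, not the format of the original Visual Basic script.

```python
def aat_score(trials):
    """Return (score, excluded) for one participant.

    trials: list of (kind, mf_direction, response) tuples, where kind is
    "ambiguous" or "control" and directions are "up" or "down".
    In control trials the MF and the spectrum move together, so the MF
    direction is also the correct answer.
    """
    score = sum(1 for kind, mf_dir, resp in trials
                if kind == "ambiguous" and resp == mf_dir)
    control_errors = sum(1 for kind, mf_dir, resp in trials
                         if kind == "control" and resp != mf_dir)
    return score, control_errors >= 2


# Hypothetical participant: follows the MF in 80 of 100 ambiguous trials
# and makes one error in the 10 control trials.
example = ([("ambiguous", "up", "up")] * 80
           + [("ambiguous", "up", "down")] * 20
           + [("control", "down", "down")] * 9
           + [("control", "down", "up")])
print(aat_score(example))  # (80, False)
```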

RESULTS

The mean score was 81.8 (SD = 20), comparable with the score of 81.6 for professional musicians reported by Seither-Preisler et al. (2008), who also found mean scores of 45.9 for nonmusicians and 61.6 for amateur musicians. According to the original criteria of the AAT (score of 0-50, spectral listener; 50-100, fundamental listener; Seither-Preisler et al., 2007), most of our participants were fundamental listeners. We divided our 40 participants into two equal groups relative to the median value of 90.5 (the distribution was not bimodal). The cutoff value was arbitrary; in general, it depends on the acoustic parameters of the stimuli such that HCTs with higher harmonic numbers and fewer successive harmonics increase the likelihood of spectral responses (Preisler, 1993). The 20 participants with lower AAT scores were labeled relatively spectral listeners (mean score: 66.5); the 20 with higher scores were relatively fundamental listeners (mean score: 97.1). The difference between the two means was significant (p < .001; we used the Mann-Whitney U-test, since the distributions were skewed). We assume that our relatively fundamental listeners consistently heard the pitch of an isolated HCT at the MF, whereas our relatively spectral listeners sometimes responded to spectral and sometimes to virtual pitch.
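A median split of this kind can be sketched as follows. The Python below sorts participants by score and divides them into two equal halves; the participant labels and scores are invented for illustration, and the handling of tied scores at the median is not specified in the text, so sorting order decides it here.

```python
import statistics


def median_split(scores):
    """Split participants into lower and upper halves by score.

    scores: dict mapping participant label to AAT score.
    Returns (lower_half, upper_half, median). Ties at the median are
    resolved by sort order, which is an assumption of this sketch.
    """
    ranked = sorted(scores, key=scores.get)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:], statistics.median(scores.values())


# Hypothetical scores for four participants.
toy_scores = {"p1": 40, "p2": 95, "p3": 70, "p4": 100}
lower, upper, med = median_split(toy_scores)
print(lower, upper, med)  # ['p1', 'p3'] ['p2', 'p4'] 82.5
```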

Experiment 1

PROCEDURE

In each trial, a chord of OCTs and a probe tone were heard in succession. Participants were asked to rate how well the tone went with the chord on a 7-point scale from very badly to very well (Wie gut passt der Ton zum Akkord? 1 = sehr schlecht, 7 = sehr gut). The experimenter explained that the task was comparable with the question “How well do two colors go with each other?” and compared that question with how well the colors went with each other in a picture hanging on the wall. The experimenter encouraged participants to respond to each trial quickly and spontaneously, and clarified that we valued their opinion and there were no clearly defined right or wrong answers. Some participants asked whether they should judge whether the tone would go together with the chord if it sounded simultaneously (even though it was presented successively); the experimenter avoided giving a clear answer to this question and instead asked participants to respond spontaneously on the basis of the sound, without thinking about music theory. If a participant interested in the aesthetics of new or atonal music claimed that every tone might go equally well with every chord, the experimenter repeated the instruction to respond to the sound itself as experienced.

In a series of 108 trials, 9 chords and 12 chromatic tones were presented in all combinations.8 Relative to an arbitrary reference chroma (0), the chords were 015 (e.g., CD♭F), 025 (e.g., CDF), 027 (sus = suspended 4th chord), 035 (e.g., CE♭F), 036 (dim = diminished triad), 037 (min = minor), 045 (e.g., CEF), 047 (maj = major), and 048 (aug = augmented). The arbitrary reference was randomly transposed around the chroma circle in each trial. The chords had been selected from all 19 possible Tn-types of cardinality 3 (Forte, 1973; Rahn, 1980), as illustrated in Figure 1. The first eight selected chords were the most common trichords in Renaissance polyphony according to a database analysis (Parncutt et al., 2018; see Figure 2).9 The selection reflected our goal to understand the historical emergence of major-minor tonality: we wanted to focus on pitch perception in chords that played an important role in that historical development. The last chord was a special case from which unusual results might be expected, and it increased the diversity and dissonance of the sounds to which participants were exposed. The augmented triad 048 is symmetrical and hence tonally ambiguous: it divides the octave, and hence the chroma circle, into three equal intervals of four semitones.
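The trial structure (9 chords × 12 probe chromas, each trial randomly transposed) can be sketched as follows. The dictionary keys and the function name `make_trials` are assumptions made for illustration; the sketch says nothing about how the sounds themselves were synthesized.

```python
import random

# The nine trichords as chromas relative to the arbitrary reference 0.
CHORDS = {
    "015": (0, 1, 5), "025": (0, 2, 5), "027": (0, 2, 7),
    "035": (0, 3, 5), "036": (0, 3, 6), "037": (0, 3, 7),
    "045": (0, 4, 5), "047": (0, 4, 7), "048": (0, 4, 8),
}

def make_trials(rng):
    """Build all 108 chord/probe combinations in random order."""
    trials = []
    for name, chord in CHORDS.items():
        for probe in range(12):
            # Random transposition of the reference chroma for this trial.
            shift = rng.randrange(12)
            trials.append({
                "chord": name,
                "probe": probe,  # probe chroma relative to the reference
                "chord_chromas": tuple((c + shift) % 12 for c in chord),
                "probe_chroma": (probe + shift) % 12,
            })
    rng.shuffle(trials)
    return trials
```

Calling `make_trials(random.Random())` once per participant gives each participant a different random order, while the interval between probe and chord is preserved under transposition.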

FIGURE 1.

Nineteen Tn-types of cardinality 3 according to Rahn (1980), in close position with C4 in the lowest voice. Trichords with familiar music-theoretical names have been labeled: sus = suspended triad (here, a G chord with suspended 4th, also called Gsus4), dim = diminished, min = minor, maj = major, aug = augmented.

FIGURE 2.

Distribution of chords corresponding to 19 Tn types of cardinality 3 (trichords) in a database of musical scores of unaccompanied choral polyphony from the 13th, 14th, 15th, and 16th centuries (Parncutt et al., 2018). A new “chord” was identified at every onset in any voice. White bars: Prepared chords, in which one or more notes are held from the previous chord. Black bars: Unprepared chords, in which all onsets are simultaneous.

HYPOTHESES

On the basis of previous research, we made the following predictions.

  • H1. Relatively fundamental listeners rate MFs in musical chords higher than relatively spectral listeners.

  • H2. The main peaks in the profile for each chord correspond to the 3 chord tones. Thus, empirical profiles correlate with stimulus profiles, with 1 for each chord tone and 0 for the other 9 chromas.

  • H3. The difference between chord tones and nonchord tones is larger for more familiar chords. If so, a possible explanation is that Western listeners have more practice making such distinctions in such chords.

  • H4. The mean rating over all 12 probe tones is higher for more familiar or consonant chords. Goodness of fit can be regarded as a horizontal (successive) consonance judgment; listeners often cannot separate different kinds of consonance (Parncutt & Hair, 2011).

  • H5. Among the 3 chord tones, the highest peak corresponds to the conventional root. For most chords, that was the upper tone of a P4 interval.

  • H6. Ratings vary among the 9 chromas not corresponding to chord tones, enabling a comparison of the four listed models (MFs, diatonicity, 5th relations, completion tones).
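The stimulus profile in H2 and its correlation with an empirical profile can be sketched as below. The helper names are hypothetical, and the ratings in the usage note are invented for illustration, not data from the experiment.

```python
from math import sqrt

def stimulus_profile(chord):
    """1 for each chord tone, 0 for each of the other nine chromas."""
    return [1 if chroma in chord else 0 for chroma in range(12)]

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

For instance, `pearson_r(stimulus_profile((0, 4, 7)), ratings)` with an invented profile such as `[6, 3, 4, 3, 5, 5, 3, 6, 3, 5, 3, 3]` is clearly positive, which is the pattern H2 predicts.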

RESULTS

We made a preliminary, exploratory test of whether the data might depend on chord duration or tone type by comparing the results of the first group of 20 participants (who heard longer chords and OCT probes) with those of the second group of 20 (shorter chords and pure probes). Although our experiment resembles a two-way repeated measures design with independent variables Chord (9 levels) and Chroma (12 levels), it is not; each chord is statistically independent of the other chords, as each chord has a different profile and is randomly transposed relative to the others. Therefore, repeated-measures analyses of variance (ANOVAs) were calculated for each chord separately with factors Group (2) and Chroma (12); results are shown in Table 3. Given our stance on the statistical independence of chords from one another we did not consider corrections for multiple comparisons necessary. There was no main effect of Group and no interaction between Chroma and Group for any of the chords, which justified averaging the results of the two groups in later analyses. We did not systematically investigate the effect of chord duration or tone type.

TABLE 3.

Repeated-measures Analyses of Variance of Results of Experiment 1 for Each Chord Separately with Factors Group (2) and Chroma (12)

Chord   Main effect of Group                    Interaction between Group and Chroma
015     F(1, 38) = 0.13, p = .91, ƞ2 = .000     F(11, 418) = 0.98, p = .47, ƞ2 = .025
045     F(1, 38) = 0.01, p = .93, ƞ2 = .000     F(11, 418) = 1.10, p = .36, ƞ2 = .028
025     F(1, 38) = 0.02, p = .89, ƞ2 = .001     F(11, 418) = 1.26, p = .24, ƞ2 = .032
035     F(1, 38) = 0.05, p = .83, ƞ2 = .001     F(11, 418) = 0.68, p = .76, ƞ2 = .017
027     F(1, 38) = 1.77, p = .19, ƞ2 = .045     F(11, 418) = 0.54, p = .87, ƞ2 = .014
036     F(1, 38) = 0.75, p = .39, ƞ2 = .019     F(11, 418) = 0.67, p = .77, ƞ2 = .017
037     F(1, 38) = 0.49, p = .49, ƞ2 = .013     F(11, 418) = 1.07, p = .39, ƞ2 = .027
047     F(1, 38) = 0.75, p = .39, ƞ2 = .019     F(11, 418) = 0.93, p = .51, ƞ2 = .024
048     F(1, 38) = 0.55, p = .46, ƞ2 = .014     F(11, 418) = 1.13, p = .34, ƞ2 = .029

Note: Group 1 was the first group of 20 participants (with longer stimulus durations) and Group 2 was the second group (with shorter durations).

Results are presented in Figure 3. The bottom right panel of Figure 3 is a special case. The augmented triad 048 (e.g., CEG♯) is symmetrical: the intervals between adjacent tones are all 4 semitones. For this reason, and because the chord was constructed from OCTs, the stimuli presented to the participants for probe pitches 0 to 3 were physically identical (because transposed around the chroma circle) to the stimuli presented for probe pitches 4 to 7, and for probe pitches 8 to 11. Results were therefore averaged across these three groups of trials. The error bars are smaller than for the other chords because they represent the means of 120 rather than 40 data per point.
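The averaging for the augmented triad amounts to folding the 12-point profile onto 4 points. A minimal sketch, with a hypothetical function name:

```python
def fold_augmented_profile(ratings):
    """Average ratings for probes k, k + 4, and k + 8 (k = 0..3), which are
    physically identical stimuli for the symmetrical chord 048."""
    if len(ratings) != 12:
        raise ValueError("expected one rating per chroma (12 values)")
    return [sum(ratings[k + 4 * j] for j in range(3)) / 3 for k in range(4)]
```

Each folded point pools three times as many observations as a point in the other panels, which is why its confidence interval is narrower.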

FIGURE 3.

Results of Experiment 1. Points are mean listener ratings over 40 participants. Error bars are 95% confidence intervals. Open circles are chord tones; filled circles are non-chord tones. Tones predicted to have higher salience are marked with letters: M: missing fundamental, D: diatonic tone, C: completion tone. The headings “3-4A” and so on are labels for Tn-types according to Rahn (1980); “015” means 0, 1, and 5 semitones relative to an arbitrary reference pitch.

H1 was not confirmed: relatively fundamental listeners did not generally rate MFs higher than relatively spectral listeners. For each chord, results were subjected to a repeated-measures ANOVA with factors Listener Type (2 levels: relatively fundamental, relatively spectral) and Chroma (9 levels; the three chord tones were omitted). This was possible given that the fundamental and spectral listener groups did not differ in variance according to the Levene test. Results are presented in Table 4. The effect of Listener Type was significant for only one chord: 037 (minor). The interaction between Listener Type and Chroma was significant for only two chords: 047 (major) and 048 (augmented). Given the lack of any consistent, significant main effect or interaction involving Listener Type, we averaged over all listeners in Figure 3.

TABLE 4.

Repeated-measures Analyses of Variance of Results of Experiment 1 for Each Chord Separately with Factors Group (2) and Chroma (9)

Chord   Main effect of Group                    Interaction between Group and Chroma
015     F(1, 38) = 0.35, p = .56, ƞ2 = .009     F(8, 304) = 1.53, p = .15, ƞ2 = .039
045     F(1, 38) = 0.39, p = .53, ƞ2 = .010     F(8, 304) = 0.39, p = .93, ƞ2 = .010
025     F(1, 38) = 2.41, p = .13, ƞ2 = .060     F(8, 304) = 1.13, p = .34, ƞ2 = .029
035     F(1, 38) = 0.34, p = .57, ƞ2 = .009     F(8, 304) = 0.86, p = .55, ƞ2 = .022
027     F(1, 38) = 0.87, p = .36, ƞ2 = .022     F(8, 304) = 1.11, p = .35, ƞ2 = .028
036     F(1, 38) = 0.09, p = .77, ƞ2 = .002     F(8, 304) = 0.76, p = .64, ƞ2 = .020
037     F(1, 38) = 5.20, p < .05, ƞ2 = .120     F(8, 304) = 0.89, p = .53, ƞ2 = .023
047     F(1, 38) = 0.41, p = .52, ƞ2 = .011     F(8, 304) = 2.14, p < .05, ƞ2 = .053
048     F(1, 38) = 0.08, p = .78, ƞ2 = .002     F(8, 304) = 2.51, p < .05, ƞ2 = .062

Note: Group 1 was relatively fundamental listeners and Group 2 was relatively spectral listeners.

H2 was confirmed: the main peaks in each chord profile corresponded to chord tones. We performed an ANOVA with factors Chord (9) and Tone (2). Tone was set to 1 for the three chord tones and 0 for the nine non-chord tones in each chord. The difference in mean rating between chord and non-chord tones was compared across 9 chords. There was a main effect of Tone: responses for chord tones were higher than for non-chord tones (H2), F(1, 39) = 85.72, p < .001, ƞ2 = .69.

H3 was also confirmed: participants could more easily distinguish chord tones from non-chord tones in more familiar (consonant) chords. In the same ANOVA with factors Chord and Tone, the effect of Tone was greater for chord 047 (maj) than for all other chords except 027 (sus) and 037 (min), F(8, 312) = 5.85, p < .001, ƞ2 = .13.

H4 was not confirmed: ratings were not generally higher for more familiar or consonant chords. An ANOVA with two factors, Chord (9 levels) and Chroma (12), revealed a main effect of Chord, F(8, 312) = 5.37, p < .001, ƞ2 = .12, but the mean rating for a chord did not depend in a clear way on its familiarity/consonance. In order of mean response (from highest to lowest), the chords were 036, 025, 037, 048, 015, 035, 047, 027, 045.

H5 was partially confirmed: the profile peak sometimes corresponded to the music-theoretic root. For each chord separately, a one-way repeated-measures ANOVA was applied to the ratings for the three chord-tones, ignoring ratings at non-chord pitches. There was a significant main effect of Chroma for four of the nine chords: 015, 027, 037, and 047.

  • For 015, the ANOVA yielded F(2, 78) = 3.16, p < .05, ƞ2 = .08, but Bonferroni-adjusted post hoc analysis produced no significant differences.

  • For 027, an ANOVA with Greenhouse-Geisser correction yielded F(1.58, 61.68) = 4.77, p < .05, ƞ2 = .11, and a Bonferroni-adjusted post hoc analysis revealed a significant difference (p < .01) between tone 0 and tone 2, with tone 0 receiving higher ratings (mean difference 1.05, 95% CI [0.22, 1.88]).

  • For 037, the ANOVA yielded F(2, 78) = 6.08, p < .01, ƞ2 = .14, and a Bonferroni-adjusted post hoc analysis produced a significant difference (p < .01) between tone 0 and tone 3, with tone 0 receiving higher ratings (mean difference 1.10, 95% CI [0.43, 1.77]).

  • For 047, an ANOVA with Greenhouse-Geisser correction yielded F(1.62, 63.31) = 10.97, p < .001, ƞ2 = .22, and a Bonferroni-adjusted post hoc analysis revealed a significant difference (p < .01) between tone 0 and tone 4, with tone 0 receiving higher ratings (mean difference 1.25, 95% CI [0.40, 2.10]); tone 7 was also rated higher than tone 4 (mean difference 1.13, 95% CI [0.35, 1.90]).

While all observed differences were broadly consistent with both typical music-theoretic positions and the predictions of Parncutt (1988) and Parncutt (1993), many predicted differences did not reach significance. Perhaps there was a ceiling effect, such that all three tones were heard to go (very) well with the preceding chord.

H6 was partially confirmed. A one-way ANOVA with 9 levels yielded a main effect of Chroma among non-chord tones for chords 015, 036, 037, 047. For 035 we observed a trend.

  • For 015, F(8, 312) = 2.25, p < .05, ƞ2 = .05

  • For 036, F(8, 312) = 2.65, p < .01, ƞ2 = .06

  • For 037, F(8, 312) = 2.27, p < .05, ƞ2 = .06

  • For 047, F(8, 312) = 2.59, p < .05, ƞ2 = .06

  • For 035, F(8, 312) = 1.79, p < .08, ƞ2 = .04

There was no significant effect of Chroma among non-chord tones for chords 025, 027, 045, or 048.

  • For 025, F(8, 312) = 1.09, p = .37, ƞ2 = .03

  • For 027, F(8, 312) = 1.24, p = .28, ƞ2 = .03

  • For 045, F(8, 312) = 1.32, p = .23, ƞ2 = .03

  • For 048, F(8, 312) = 1.39, p = .20, ƞ2 = .03

Although only four of the nine chords produced a significant effect of Chroma for non-chord tones, we proceeded to investigate predicted differences among non-chord tones in all nine chords, looking for higher mean goodness-of-fit ratings at predictions of four theories: MFs, diatonic tones, 5th-related tones, and completion tones.

MODELS

Predictions of the four models for non-chord tones are summarized in Table 5. For each chord and each of the four theories, the first chroma in the list is the one most strongly predicted by the theory. For example, the strongest non-chord tone in chord 015 according to the diatonic predictor is 3 (i.e., 3 semitones or a m3 above the 0 in 015). The other pitches are listed in descending order of predicted strength. If two or more pitches are predicted to have equal strength, they are listed in rising numerical order. This criterion was applied in the same way to all four theoretical predictions. Exact procedures for the four predictive models were as follows.

TABLE 5.

Predicted Profile Peaks for Nine Non-chord Tones (in Semitones Above 0 in the Chord Label) for Each Chord in the Experiments According to Four Different Theories, Starting with the Highest Predicted Peak in Each Case

Chord   MFs                             Diatonic tones                  5th-related tones               Completion tones
015     10, 6, 3, 8, 9, 2, 4, 7, 11     3, 8, 10, 6, 7, 2, 4, 9, 11     6, 7, 8, 10, 2, 3, 4, 9, 11     8, 10, 9, 3, 7, 6, 2, 4, 11
025     10, 7, 1, 8, 4, 3, 6, 9, 11     7, 9, 10, 3, 4, 8, 11, 1, 6     7, 9, 10, 1, 3, 4, 6, 8, 11     9, 8, 10, 7, 3, 4, 1, 6, 11
027     5, 10, 3, 8, 4, 9, 1, 6, 11     5, 9, 4, 10, 3, 11, 6, 8, 1     5, 9, 1, 3, 4, 6, 8, 10, 11     5, 4, 10, 9, 3, 6, 1, 8, 11
035     8, 10, 1, 11, 2, 7, 4, 6, 9     10, 7, 8, 1, 2, 6, 9, 4, 11     10, 7, 8, 1, 2, 4, 6, 9, 11     9, 8, 7, 10, 2, 1, 4, 6, 11
036     8, 11, 5, 2, 1, 4, 7, 9, 10     1, 5, 8, 10, 2, 4, 7, 9, 11     1, 5, 7, 8, 10, 11, 2, 4, 9     8, 10, 9, 11, 1, 7, 2, 4, 5
037     5, 8, 11, 2, 9, 1, 4, 6, 10     5, 10, 2, 8, 1, 9, 4, 6, 11     2, 5, 8, 10, 1, 4, 6, 9, 11     10, 9, 8, 2, 5, 11, 1, 4, 6
045     10, 9, 1, 2, 8, 6, 3, 7, 11     2, 7, 9, 10, 11, 1, 3, 6, 8     7, 9, 10, 11, 1, 2, 3, 6, 8     9, 8, 7, 10, 2, 11, 1, 3, 6
047     9, 5, 2, 3, 8, 6, 1, 10, 11     2, 9, 5, 11, 6, 10, 1, 3, 8     2, 5, 9, 11, 1, 3, 6, 8, 10     10, 9, 11, 2, 5, 1, 3, 6, 8
048     1, 5, 9, 2, 6, 10, 3, 7, 11     1, 2, 3, 5, 6, 7, 9, 10, 11     1, 3, 5, 7, 9, 11, 2, 6, 10     2, 6, 10, 3, 7, 11, 1, 5, 9

MFs.

Predictions for MFs were chroma-salience profiles according to Parncutt (1988) with the following root-support weights: 10 for the P1/P8 interval, 5 for P5, 3 for M3, 2 for m7, and 1 for M2/M9 (see Parncutt, 2009, Appendix). The results of these calculations are presented in Table 6a, in which the two main MFs for each chord are also marked. Table 6b presents predictions of a similar algorithm that additionally accounts for masking among partials (Parncutt, 1993). Nearby partials mask each other, and the smaller the interval, the higher the degree of masking. In the chord 015, for example, tones 0 and 1 mask each other, which reduces their salience relative to tone 5. Comparing parts a and b of Table 6, we see that masking has little effect on the rank order of predictions.

A)

Chord in semitones  Chord name    0   1   2   3   4   5   6   7   8   9  10  11
015                 –            10  13   2   3   0  15   5   2   3   3   6   1
045                 –            13   3   3   1  10  15   2   2   3   5   6   0
025                 –            11   3  12   1   2  15   0   7   3   0   9   0
035                 –            10   4   2  11   0  17   0   2   8   0   6   3
027                 sus          16   0  12   3   2   6   0  15   3   2   4   0
036                 dim          10   1   5  10   1   7  10   0  10   0   1   8
037                 min          15   1   2  13   0   8   0  10   8   2   1   3
047                 maj          18   0   3   3  10   6   2  10   3   7   1   0
048                 aug          13   5   3   0  13   5   3   0  13   5   3   0

Note: Top row: interval in semitones relative to reference pitch 0. Body of table: Weight W according to Parncutt (1988, Equation 1, p. 77) with root-support weights 10 (for the P1/P8 interval), 5 (for P5), 3 (for M3), 2 (for m7), and 1 (for M2/M9). Chord tones are underlined, predicted roots are bold, and predicted MFs are italic.

B)

Chord in semitones  Chord name  0 1 2 3 4 5 6 7 8 9 10 11
015 – 22 35 4 9 0 55 11 9 7 24
045 – 52 7 11 2 22 46 4 5 14 11 16
025 – 33 11 34 4 6 51 0 21 9 29
035 – 35 12 7 30 0 52 0 6 24 18
027 sus 49 0 33 11 6 18 0 51 8 11
036 dim 40 4 20 37 4 28 40 0 39 31
037 min 49 3 6 41 0 26 0 34 25
047 maj 54 0 9 9 28 18 6 29 9 20
048 aug 45 17 10 0 45 17 10 0 45 17 10

Note: Top row: interval in semitones relative to reference pitch 0. Body of table: Audibility A according to Parncutt (1993, Equation 5, p. 45) with the same root-support weights as for part A.
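The weights in part A can be approximated with a short calculation: for each candidate root chroma, sum the root-support weights of the intervals that the chord tones form above it. The sketch below is our reading of the note to part A, not a full implementation of Parncutt (1988); the names `ROOT_SUPPORT` and `root_support_profile` are hypothetical, and the masking step of part B is not modeled.

```python
# Root-support weights keyed by interval in semitones above the candidate root:
# P1/P8 = 10, P5 = 5, M3 = 3, m7 = 2, M2/M9 = 1; all other intervals count 0.
ROOT_SUPPORT = {0: 10, 7: 5, 4: 3, 10: 2, 2: 1}

def root_support_profile(chord):
    """Weight of each of the 12 chromas as a candidate root of the chord."""
    return [
        sum(ROOT_SUPPORT.get((tone - root) % 12, 0) for tone in chord)
        for root in range(12)
    ]
```

For the major triad, `root_support_profile((0, 4, 7))` gives 18 at chroma 0 and 10 at chromas 4 and 7, in agreement with the corresponding entries in part A. Among the non-chord tones, the largest values fall at chromas 9 and 5, the two main MFs listed for 047 in Table 5.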

Diatonic tones.

The predictions of this model were calculated in two steps. First, we listed all major scales to which each chord belonged. Second, we counted how many of these scales each chroma belonged to, and weighted it with this number. Chromas that belong to the same diatonic scale(s) as a chord are marked in Figure 3 as “diatonic tones.” For example, the chord 015 (e.g., CD♭F) is part of two major scales: 0, 1, 3, 5, 6, 8, 10 (D♭ major) and 0, 1, 3, 5, 7, 8, 10 (A♭ major). We therefore expect chromas 3, 6, 7, 8, and 10 to receive higher mean ratings following 015 than other non-chord tones. Of these, 3, 8, and 10 belong to both scales, so ratings for these tones are predicted to lie between those for chord tones (0, 1, 5) and other diatonic tones (6, 7).
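The two steps can be sketched as follows, together with the ordering rule used in Table 5 (descending strength, ties in rising numerical order). The function names are hypothetical.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)

def diatonic_counts(chord):
    """For each non-chord chroma, count the major scales that contain both
    that chroma and all tones of the chord."""
    scales = [{(tonic + step) % 12 for step in MAJOR_SCALE} for tonic in range(12)]
    containing = [scale for scale in scales if set(chord) <= scale]
    return {
        chroma: sum(chroma in scale for scale in containing)
        for chroma in range(12)
        if chroma not in chord
    }

def rank_by_strength(strength):
    """Chromas in descending strength; ties in rising numerical order."""
    return sorted(strength, key=lambda chroma: (-strength[chroma], chroma))
```

For chord 015, chromas 3, 8, and 10 each belong to two scales and chromas 6 and 7 to one, so `rank_by_strength(diatonic_counts((0, 1, 5)))` begins 3, 8, 10, 6, 7, as in the diatonic column of Table 5.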

5th-related tones.

The first predictions listed in the fourth column of Table 5 are tones that are 5th-related to at least one chord tone. For each chord, between 2 and 6 pitches were predicted by this method. If a tone was 5th-related to two chord tones, it was placed first in the list. For example, in chord 025, tone 7 is 5th-related to both tone 0 and tone 2. Finally, the remaining tones were inserted in numerical order.
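A minimal sketch of this ranking, assuming that "5th-related" means a perfect 5th above or below a chord tone (7 semitones in either direction around the chroma circle); the function name is hypothetical.

```python
def fifth_related_ranking(chord):
    """Rank non-chord chromas by the number of chord tones to which they are
    5th-related; remaining chromas follow in rising numerical order."""
    counts = {chroma: 0 for chroma in range(12) if chroma not in chord}
    for tone in chord:
        for neighbor in ((tone + 7) % 12, (tone - 7) % 12):
            if neighbor in counts:
                counts[neighbor] += 1
    return sorted(counts, key=lambda chroma: (-counts[chroma], chroma))
```

For chord 025, chroma 7 is related to both 0 and 2 and is therefore ranked first, followed by 9 and 10 and then the unrelated chromas.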

Completion tones.

For each trichord in Experiment 1, we listed the main tetrachords of which it could be part. Consulting data on the prevalence of tetrachords in our database (cf. Parncutt et al., 2018), we identified the non-chord chromas that would create a familiar tetrachord if added to the trichord. For example, if chromatic tone 10 is added to chord 015, it becomes (0, 1, 5, 10), which is a minor triad on 10 (10, 1, 5) with an added major 9th (0). Put another way, chord 015 has a missing root at 10. If chromatic tone 8 is added to 015, the result is a major 7th chord relative to root 1. To create the lists in Table 5, we subjectively estimated the consonance or prevalence in mainstream tonal music of the harmonic (simultaneous) tetrachord created by adding each possible completion tone to each trichord played on the piano. In the process we referred to our tetrachord prevalence data, but did not use it systematically, because our participants were most familiar with pop/jazz styles of the 20th century, for which we have no comparable data. Instead, we drew subjectively on our experience as musicians and theorists. We assume that other music theorists will generate similar data and that any individual differences will not bias our final conclusions. Tetrachords that include two or more semitone intervals (e.g., 015 plus 2, 4, or 11) were always placed last in the list, in rising numerical order, as before.

TESTING MODEL PREDICTIONS

We tested the predictions for MFs in Table 5 by running an ANOVA with two repeated-measures factors: Chord (9 levels) and MF Strength (2), the latter being 1 for the first two MFs and 0 for other pitches. There were main effects of Chord, F(8, 312) = 6.04, p < .001, ƞ2 = .13, and MF Strength, F(1, 39) = 19.37, p < .001, ƞ2 = .33, confirming that the two predicted MFs (mean rating 4.17) were rated higher than the other seven non-chord tones (mean 3.80). The interaction was not significant.

To test the predictions for diatonic tones in the table, the factors were Chord and Diatonicity (non-chord diatonic chromas versus other non-chord tones). The comparison was limited to eight chords (the non-diatonic augmented triad was omitted). Both main effects were significant: Chord, F(7, 273) = 5.78, p < .001, ƞ2 = .13, and Diatonicity, F(1, 39) = 4.79, p < .05, ƞ2 = .11, with means of 3.95 for non-chord diatonic chromas and 3.76 for other non-chord tones. The interaction was not significant.

For 5th-related tones, the two repeated-measures factors were Chord and 5th Relation, the latter set to 1 for 5th-related tones and 0 for other tones. There were significant main effects of Chord, F(8, 312) = 5.66, p < .001, ƞ2 = .13, and 5th Relation, F(1, 39) = 12.76, p < .01, ƞ2 = .25; 5th-related tones (mean = 4.07) were rated higher than other non-chord tones (mean = 3.75). However, there was no interaction between Chord and 5th Relation.

For completion tones, the factors were Chord and Completion (completion tones versus other non-chord tones). There were main effects of Chord, F(8, 312) = 6.45, p < .001, ƞ2 = .14, and Completion, F(1, 39) = 8.16, p < .01, ƞ2 = .17; completion tones (mean = 4.03) were rated higher than the other non-chord tones (mean = 3.85). However, there was no interaction between Chord and Completion.

In summary, Experiment 1 provided tentative evidence in favor of all four listed theories, but the effect size was higher for MFs (ƞ2 = .33) than for the other models (.11, .25, and .17 for diatonic, 5th-related, and completion tones respectively), suggesting that MFs accounted for more of the variance among non-chord tones than any other model.

Experiment 2

Experiment 2 was a repeat of Experiment 1 with just one change in the empirical method. The question that participants answered in each trial was: “Is the tone in the chord?” (Ist der Ton im Akkord?); and the rating scale was labeled 1 = definitely not and 7 = definitely. The new question focused the attention of participants on the chords themselves, rather than on the contexts in which the chords occur in music. The question also corresponded more directly to the idea of MFs, which, if they exist psychologically, should be perceived as physically real tones. We therefore expected a higher rate of “errors” (participants mistakenly indicating that a tone is in a chord) when a chord is followed by an MF than when it is followed by another non-chord tone. All other aspects of Experiment 2 were identical to Experiment 1, including initial hypotheses. Because Experiments 1 and 2 are similar, we use a stricter significance threshold (p < .025) in the results section of Experiment 2 than we would if the two experiments were considered independent (p < .05). Detailed results are shown in Figure 4.

FIGURE 4.

Results of Experiment 2. The symbols and letters have the same meaning as in Figure 3.

ADDITIONAL HYPOTHESES

When comparing the results of Experiments 1 and 2, the following additional hypotheses were tested.

H7: Mean ratings for Experiments 1 and 2 over 108 trials correlate with each other because the task is so similar. Confirmed: r = .79, p < .001. There was no interaction between Trial Number (a combination of Chord and Chroma) and Experiment, nor was there an interaction between Experiment and Chord or Chroma.

H8: The overall mean result for Experiment 2 is lower than for Experiment 1 because listeners are more likely to think a chromatic tone goes with a 3-tone chord than is part of it. Confirmed: a 3-way ANOVA with factors Chroma (12 levels), Chord (9), and Experiment (2) revealed a main effect of Experiment, F(1, 39) = 10.88, p < .01, ƞ2 = .22. The overall mean rating for Experiment 1 was 4.18; for Experiment 2, 3.98.

H9: The MF predictor is more successful than the other predictors in Experiment 2, because the question posed to participants in that experiment focused their attention on the chords themselves rather than the contexts in which they appeared (either for all participants or only for fundamental listeners). To test H9, we first considered MFs, running an ANOVA with two factors, Chord (9 levels) and MF-Strength (2). There were two main effects: Chord, F(8, 312) = 3.27, p < .01, ƞ2 = .08, in which some chords attracted higher mean ratings than others, and MF-Strength, F(1, 39) = 13.26, p < .01, ƞ2 = .25, in which MFs (mean = 3.89) were rated higher than other non-chord tones (mean = 3.56).10 The interaction was not significant. We then ran an ANOVA with factors Chord (9) and Diatonicity (2 levels: diatonic chromas versus other non-chord tones) and Greenhouse-Geisser correction. Only the main effect of Chord was significant, F(5.3, 259.7) = 3.54, p < .01, ƞ2 = .08. An ANOVA with Chord (9) and Completion (2 levels: completion tones versus other non-chord tones) with Greenhouse-Geisser correction revealed that only the main effect of Chord was significant, F(1, 39) = 19.37, p < .001, ƞ2 = .33. Finally, an ANOVA with Chord (9) and 5th Relation (2 levels: 5th-related chromas versus other non-chord tones) and Greenhouse-Geisser correction produced main effects of Chord, F(6.1, 238.9) = 3.95, p < .01, ƞ2 = .09, and 5th Relation, F(1, 39) = 7.70, p < .01, ƞ2 = .17, and a significant interaction, F(6.0, 235.3) = 2.89, p < .05, ƞ2 = .07. Both significance level and effect size were higher for MFs, F(1, 39) = 13.26, p < .01, ƞ2 = .25, than for 5th-related tones, F(1, 39) = 7.70, p < .05, ƞ2 = .17, consistent with H9.

Summarizing the comparison of Experiments 1 and 2: In Experiment 1, participants were asked to rate how well the probe tone went with the preceding chord. That instruction logically included all four predictors: MFs, diatonic tones, 5th-related tones, and completion tones, as reflected in the results. In Experiment 2, participants were asked if the probe tone was in the chord, which logically included only MFs, perceived as if they are physically present. In both experiments, MFs and 5th-related tones were rated higher than other non-chord tones, but diatonic and completion tones were only rated higher in Experiment 1, consistent with the different instruction.

A Comparison of Four Theories

Figure 5 presents an alternative comparison of the predictions of four theories for Experiments 1 and 2. For each chord, we considered the two most likely MFs, diatonic tones, 5th-related tones, and completion tones according to the models, among the nine non-chord tones in each case, and considered the mean ratings given to those two non-chord tones by all participants. For each theory, we predicted that mean ratings for those two chromas would exceed mean ratings for all nine non-chord tones. If the mean ratings for chromas predicted by a given theory A were higher than those predicted by theory B, we would then have more confidence in theory A. The comparison is problematic, because the predictors overlap: one and the same tone could be predicted by more than one theory.
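For a single chord, the comparison behind Figure 5 reduces to two means: the mean rating of the first two chromas in a model's ranking, and the baseline over all nine non-chord tones. The sketch below uses hypothetical function names, and the ratings in the usage note are invented for illustration only.

```python
def top_two_mean(ratings, ranking):
    """Mean rating of the two chromas ranked highest by a model."""
    first, second = ranking[0], ranking[1]
    return (ratings[first] + ratings[second]) / 2

def baseline_mean(ratings, chord):
    """Mean rating over the nine non-chord chromas."""
    non_chord = [ratings[chroma] for chroma in range(12) if chroma not in chord]
    return sum(non_chord) / len(non_chord)
```

With invented mean ratings `[6, 3, 4, 3, 5, 5, 3, 6, 3, 5, 3, 3]` for chord 047 and the MF ranking from Table 5 (9, 5, ...), the top-two mean is 5.0 against a baseline of about 3.56. In the figure, such values are additionally averaged over chords and participants.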

FIGURE 5.

Comparison of four theories. Mean listener ratings for the first two pitches predicted by each model from among the 9 non-chord tones, over all chords. The higher the mean rating, the more accurate the prediction of the corresponding model. Baseline is the mean of all 9 non-chord tones for all chords. Error bars are 95% confidence intervals.

The results as shown in the figure suggest that in Experiment 1 all models except diatonicity performed better than baseline (see Footnote 11). In Experiment 2, the MF theory appeared to account for the data best, followed by completion tones. This is evidence that MFs in musical chords have psychological reality, at least when those chords are constructed from OCTs. The importance of completion tones in the data reflects the importance of chord extensions in music theory: the idea that 7th chords are constructed by adding 7ths to triads, and consequently that Western tonal music is based on triads (Childs, 1998).

The surprisingly poor performance of the diatonicity model in Experiment 2 can be explained as follows. Trials in which a chord was followed by a diatonic tone were more familiar because such combinations happen more often in music. That made it easier for musically trained participants to recognize that the tone was not part of the chord, increasing the number of negative responses. If, for example, a listener hears the chord 015 followed by the diatonic tone 3 (a normal and familiar succession in tonal music), the familiar diatonic relationship helps her or him realize that the tone is not part of the chord.

Results for Fundamental Versus Spectral Listeners

The results presented above are averaged over relatively spectral and relatively fundamental listeners as determined by AAT. We averaged over the two groups because previous analyses had revealed no main effect of listener type; however, group differences were found in several other analyses.

We predicted a larger difference between ratings of different listener types in Experiment 2, because the question posed in that experiment (“Is the tone in the chord?”) focused the listener's attention on tones in the chord itself, whereas the question asked in Experiment 1 (“Does the tone go with the chord?”) referred to musical context. Results contradicted this prediction. For Experiment 1, a 3-way ANOVA with factors Chord (9 levels), Chroma (12), and Listener Type (2) with Greenhouse-Geisser correction showed significant main effects of Chord, F(8, 304) = 5.29, p < .001, ƞ2 = .12, and Chroma, F(7.5, 283.8) = 19.41, p < .001, ƞ2 = .34, but not Listener Type. However, there were significant interactions between Chroma and Listener Type, F(11) = 2.8, p < .01, ƞ2 = .07, and between Chord and Chroma, F(88, 3344) = 4.22, p < .001, ƞ2 = .10. When the same ANOVA was performed for Experiment 2, there were significant main effects of Chord, F(5.4, 205.9) = 2.4, p < .05, ƞ2 = .06, and Chroma, F(6.3, 239.6) = 16.02, p < .001, ƞ2 = .30, but not of Listener Type; there was also a significant interaction between Chord and Chroma, F(88, 3344) = 5.14, p < .001, ƞ2 = .12, but this time no interaction with Listener Type.

When results for individual chords were analyzed separately, there was sometimes an interaction between Chroma (12 levels) and Listener Type (2). In Experiment 1, we found this interaction for four of nine chords: 035, 037, 047, and 048. For Experiment 2, we found this interaction for three of nine chords: 025, 047, and 048. However, we could not attach a particular meaning to the chords for which this difference was found and those for which it was not found.

We also conducted an ANOVA in which independent variables were tone type (with two levels: chord tone versus non-chord tone) and listener type. The interaction for Experiment 1 was significant, F(1) = 5.43, p < .05, ƞ2 = .13; relatively fundamental listeners rated chord tones higher than relatively spectral listeners, by comparison to non-chord tones. This contradicted our hypothesis, according to which relatively fundamental listeners would more likely hear certain non-chord tones (MFs), thereby reducing the difference between chord tones and non-chord tones.

Relatively fundamental listeners were predicted to hear MFs more clearly or more often. To test this idea, we performed an ANOVA that was restricted to non-chord tones (9 per chord × 9 chords), for each experiment separately. Independent variables were chord and tone type (repeated measures) and listener type (between). Here, tone type had two levels: MFs and other tones. We expected an interaction between listener type and tone type, but found one neither for Experiment 1 nor for Experiment 2. We also conducted similar analyses in which the two levels of tone type were defined differently: completion tones versus other non-chord tones, diatonic tones versus other non-chord tones, and 5th-related tones versus other non-chord tones. Again, no two-way interactions between listener type and tone type were found.

In sum, we found no clear, consistent, or theoretically explicable differences between the results of relatively spectral and relatively fundamental listeners. A possible explanation is that fundamental listeners were perceiving musical chords primarily on the basis of musical experience, rather than hearing MFs directly as we had hypothesized. This hypothesis is consistent with the strong dependency of spectral versus fundamental listening on stimulus exposure reported by Seither-Preisler et al. (2008). Evidently neither fundamental nor spectral listeners are capable of focusing attention on MFs in musical chords.

The finding that MFs accounted for non-chord tone profiles in both experiments, but especially in Experiment 2, combined with the observed lack of any consistent significant difference between the results of spectral and fundamental listeners, can now be explained differently. A psychohistoric explanation involves two stages. In the first, MFs influenced how often corresponding tones appeared immediately before and after given chords in music from previous centuries. In the second stage, those statistical regularities influenced the perception of all listeners—both fundamental and spectral. The first stage involved intuitive (subconscious) perception, in connection with composition and improvisation. These processes are always to some extent creative and experimental (even if contemporary theorists did not use terminology of that kind), otherwise musical styles would not have changed historically. The second stage involved codified compositional conventions that were presented repeatedly to listeners, causing them to be enculturated by these statistical regularities. A single-stage process is also possible, in which some modern listeners perceive MFs directly.

Experiment 3

The main aim of Experiment 3 was to test the psychological reality of chord roots by testing whether listeners spontaneously perceived the roots of diverse musical chords. A secondary aim was to test the models from the previous experiments on contrasting data, and to explore the effect of task on empirically determined chroma-salience profiles.

Music theorists first started to conceptualize chord roots and chordal invertibility (inversions relative to roots) in the early 17th century (Parncutt, 2011a; Rivera, 1984). This new development in the history of ideas was a response to two centuries of compositional practice in which most three- and four-voice chords had corresponded to what were later called major and minor triads in root position (Parncutt et al., 2018). To understand this historic process, we must consider both culture-specific ideas and perceptual universals (Eberlein, 1994).

The method for this experiment was inspired by Terhardt's (1972) standard procedure for determining the pitch of any short sound (described in Terhardt, 1998, pp. 312–313; see also Terhardt & Grubert, 1987). In that procedure, a test sound and a pure reference tone are heard in alternation. A listener adjusts the frequency of the pure tone until the two sounds have the same pitch. The SPL of the pure tone is held constant (e.g., 40 or 60 dB SPL). The procedure produces valid, reliable, quantitative pitch estimates. For Experiment 3, we reduced the number of response possibilities by defining 12 response categories in advance. Rather than allowing participants to continuously adjust the frequency of the reference tone, we asked them to choose a pitch from a set of possibilities.

METHOD

Participants saw an interface with 12 buttons in a circular arrangement (the chroma circle, illustrated in Figure 6). At the start of each trial, they clicked on a central button and heard a chord. They then focused their attention on the first pitch that they heard in the chord and found it on a circular display of 12 tones. The chord could be heard one or two times in each trial (mean: 1.8), and the tones could be heard as often as needed. In each trial, the chord was randomly transposed around the chroma circle, but the frequencies of the tones in the circular interface remained constant: C was always at the 12 o'clock position, E♭ at 3 o'clock, F♯ at 6 o'clock. Nine chords were presented six times each, making 54 trials in all. The order of trials was random and different for each participant. The chords were physically identical to those used in Experiments 1 and 2.

FIGURE 6.

Screenshot for Experiment 3. Participants clicked on the 12 unlabeled chroma buttons to hear the individual tones. Akkord = chord, Weiter = next trial, Beenden = stop.

RESULTS

The results of Experiment 3 for all 40 participants are presented in Figure 7. From visual inspection, the chord profiles are much more peaked than for Experiments 1 and 2. In most cases, participants clearly differentiated between chord tones and non-chord tones. Unlike in Experiments 1 and 2, they also differentiated among chord tones. As before, the differentiation was more difficult for chords such as 036 and 048, presumably because they were more dissonant or less familiar.

FIGURE 7.

Results of Experiment 3, in which listeners actively selected the best-matching tone from 12 possibilities.

Among chord tones, the most commonly matched chroma corresponded to the music-theoretical root in all cases where it could clearly be defined as the higher tone of a P4 interval. This definition yielded clear predictions for 7 of the 9 chords, the exceptions being 036 and 048. Considering each chord in Figure 7 in turn, the Wilcoxon test yielded the following results. The reported p values are unadjusted; instead, the significance threshold was lowered to .05/3 = .017 (a Bonferroni correction), because in each chord we made 3 comparisons (one for each pair among the 3 chord tones).

  • For chord 015, tone 5 was rated higher than tone 0 (p = .003)

  • For 025, 5 > 0 (p = .004), 5 > 2 (p < .001)

  • For 027 (suspended 4th chord), 0 > 7 (p < .001), 0 > 2 (p < .001), and 7 > 2 (p < .001)

  • For 035, 5 > 0 (p < .001) and 5 > 3 (p < .001)

  • For 036, 0 > 3 (p = .005)

  • For 037 (minor), 0 > 7 (p < .001), 0 > 3 (p < .001), and 7 > 3 (p = .008)

  • For 045, 5 > 0 (p < .001), and 4 > 0 (p = .002)

  • For 047, 0 > 7 (p < .001) and 0 > 4 (p < .001)
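The threshold arithmetic behind the list above can be sketched as follows. Only comparisons with an exactly reported p value are included; those reported as p < .001 fall below the threshold by definition. The variable names are illustrative.

```python
ALPHA = 0.05        # nominal significance level
N_COMPARISONS = 3   # pairwise comparisons among the 3 chord tones of each chord

# Corrected threshold for each single comparison: .05 / 3
threshold = ALPHA / N_COMPARISONS

# Exact p values as reported in the list above: (chord, comparison) -> p
reported = {
    ("015", "5 > 0"): 0.003,
    ("025", "5 > 0"): 0.004,
    ("036", "0 > 3"): 0.005,
    ("037", "7 > 3"): 0.008,
    ("045", "4 > 0"): 0.002,
}

# A comparison counts as significant only if p is below the corrected threshold
significant = {key: p < threshold for key, p in reported.items()}

print(round(threshold, 3))
for (chord, comparison), passed in significant.items():
    print(chord, comparison, passed)
```

All five comparisons remain significant at the corrected threshold.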

The only chord for which findings contradicted predictions was the diminished triad (036). The root of the diminished triad CE♭G♭ is often considered to be A♭; the triad may function as an incomplete dominant 7th on A♭ in the key of D♭. Parncutt (1988) similarly predicted that 036 has a strong MF at 8. The results of Experiments 1 and 2 were consistent with this prediction, but Experiment 3 contradicted it, perhaps because participants could play the target chord twice, focusing their attention on physically present tones (analytic listening). The result corresponded instead to the music-theoretic principle of stacked 3rds (tertian harmony; Rameau, 1721), which dominates theoretical treatises on both classical and jazz harmony (e.g., Rawlins & Bahha, 2005). A possible psychoacoustic explanation: the diminished triad may be perceived as a mistuned major or minor triad. That is feasible given that the partials of a HCT can be mistuned relative to a harmonic series by as much as a semitone and still be perceived as part of the pattern (Moore et al., 1985). If tone 0 in the diminished triad 036 is lowered by a semitone, the chord becomes 047, the major triad; and if tone 6 is raised by a semitone, the chord becomes 037, the minor triad. The relatively low mean rating for the chord's 3rd (the tone 3 in 036) can be explained by masking.

Qualitative Data

After each experiment, participants were asked to comment briefly on any aspect of their experience, including how they felt about the task (Wie ist es dir bei dieser Aufgabe gegangen?) and what strategies they used. We conducted 120 short interviews (3 experiments × 40 participants). Their comments suggested that most participants did not recognize the chord that they heard in each trial. Those who did, did not identify the interval between a reference chroma in the chord (such as the root) and the probe tone.

Of the 40 participants, 32 were asked which chords they recognized (the first 8 were not asked this question). Most replied that they heard major, minor, diminished, and augmented chords (which were indeed 4 of the 9 chords), and nine replied that they heard diminished and augmented more often than major and minor (in fact, each of these chords was presented equally often). Ten participants reported hearing 7th chords, although there were none in the experiment; one participant reported hearing only 7th chords throughout the entire experiment. These responses are consistent with the psychological reality of MFs at non-notated chromas.

Those 32 participants were also asked to list the kinds of chord that they had heard and estimate the percentage of each chord in the experiments. Of these, most immediately objected they could only guess the answer, but only one refused to try. Major triads were mentioned by 27 participants, minor triads by 25, diminished triads by 17, augmented triads by 17, dominant 7th chords (which never occurred) by 14, suspended 4th triads by 2, and half-diminished 7th chords (which never occurred) by 2. Percentage estimates were generally inaccurate. One participant reported hearing 5% major, 5% minor, 45% diminished, and 45% augmented, while another reported 60% major and 40% minor chords. These findings are consistent with our assumption that most or all participants were unable to recognize chord-tone relationships and were therefore unable to respond on the basis of music-theoretic knowledge.

Although we did not ask how difficult the experiments were, out of all 40 participants, Experiment 1 was spontaneously described as difficult by 11, Experiment 2 (“Is the tone in the chord?”) by 26, and Experiment 3 by 13. Five found it difficult to concentrate, and another five found the experiments tiring or exhausting. Of the 20 participants who heard the shorter 100-ms test sounds, 12 complained they were too short. Regarding timbre, 10 participants said that the sounds reminded them of the organ, 6 of the piano, and 5 participants complained the timbre was unpleasant.

We also conducted short interviews following AAT. Six participants reported hearing tones go up and down simultaneously and chose the movement that sounded more important.

Correlation Analyses

To further test the psychological reality of MFs at non-chord tones, we correlated various results and predictors with each other. We first created a matrix of 8 vectors of 108 values each (12 values for each chord times 9 chords). Note that all 12 chromas are included in this analysis—both the three chord tones and the nine non-chord tones. Each vector represents either experimental data or theoretical predictions according to different models.

The first three vectors were ratings from Experiments 1, 2, and 3, averaged over all 40 participants. The next five vectors were predictions of five models. The first model was a simple stimulus model corresponding to music notation, in which the presence of a tone is indicated by 1 and the absence of a tone by 0; chord 015 was represented by the vector 110001000000. The second model was the octave-generalized model of harmonic pitch-pattern recognition by Parncutt (1988), as shown in Table 6a. The third model was similar to the second but also considered masking between nearby partials (Parncutt, 1993), see Table 6b. The last two predictors are explained below.
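The stimulus model described above can be sketched in a few lines. The function name is hypothetical; the encoding follows the example given for chord 015.

```python
def stimulus_vector(chord):
    """Return the 12-element stimulus profile of a chord as a string:
    '1' where a chroma is notated in the chord, '0' elsewhere.

    chord: iterable of chromas (integers 0-11), e.g. (0, 1, 5) for chord 015
    """
    chord = set(chord)
    return "".join("1" if chroma in chord else "0" for chroma in range(12))


print(stimulus_vector((0, 1, 5)))  # chord 015
print(stimulus_vector((0, 4, 7)))  # chord 047, the major triad
```

Concatenating the profiles of all nine chords in this way yields a vector of 108 values.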

Table 7 presents correlation coefficients, calculated by comparing two vectors of 108 numbers (9 chords × 12 chromas; two-tailed significance tests). Within these vectors, subvectors for each chord (groups of 12 values) had been converted to z-scores (with mean = 0 and standard deviation = 1) before correlation. We first considered Pearson's (linear) correlations (Table 7a), since we were primarily concerned with how well each chroma is implied by or goes with the chord (conceived of as a real number) rather than the rank order of the chromas. We were also concerned to optimize the models by improving the Pearson correlations. Spearman (rank) correlations are also shown, because the data are not normally distributed. Each correlation coefficient has specific advantages and disadvantages (Hauke & Kossowski, 2011).
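The standardization step can be illustrated with a minimal sketch: each group of 12 values is converted to z-scores before the Pearson correlation is computed over the whole vector. The two profiles below are hypothetical and cover only two chords, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, pstdev


def zscore(values):
    """Convert values to z-scores (population SD; an assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def standardize_by_chord(vector, chunk=12):
    """Z-score each chord's subvector of 12 chroma values separately."""
    out = []
    for start in range(0, len(vector), chunk):
        out.extend(zscore(vector[start:start + chunk]))
    return out


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


# Hypothetical profiles for two chords (2 x 12 values)
profile_a = [5, 4, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1,
             6, 1, 1, 1, 3, 1, 1, 5, 1, 1, 1, 1]
# Same shape within each chord, but on a different scale for each chord
profile_b = [2 * v + 3 for v in profile_a[:12]] + [5 * v - 1 for v in profile_a[12:]]

raw_r = pearson(profile_a, profile_b)
standardized_r = pearson(standardize_by_chord(profile_a),
                         standardize_by_chord(profile_b))
print(round(raw_r, 2), round(standardized_r, 2))
```

In this toy case the two profiles have identical shapes within each chord, so the correlation reaches 1 only after per-chord standardization; differences of scale between chords no longer lower the coefficient.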

A) Pearson

Group   Variable    Expt 1   Expt 2   Expt 3   Stimulus   Pa (88)   Pa (93)   Pa (88)’   Pa (93)’
Data    Expt 1      -        .85      .77      .81        .83       .84       .85        .86
Data    Expt 2      .85      -        .79      .85        .82       .81       .87        .86
Data    Expt 3      .77      .79      -        .82        .79       .79       .84        .83
Model   Stimulus    .81      .85      .82      -          .86       .84       .96        .93
Model   Pa (88)     .83      .82      .79      .86        -         .99       .97        .98
Model   Pa (93)     .84      .81      .79      .84        .99       -         .95        .98
Model   Pa (88)’    .85      .87      .84      .96        .97       .95       -          .99
Model   Pa (93)’    .86      .86      .83      .93        .98       .98       .99        -

Note: Pa (88)’ is a linear combination of Pa (88) and the stimulus model; Pa (93)’ similarly. All correlations are p < .01 (two-tailed comparisons).

B) Spearman

Group   Variable    Expt 1   Expt 2   Expt 3   Stimulus   Pa (88)   Pa (93)   Pa (88)’   Pa (93)’
Data    Expt 1      -        .73      .62      .72        .69       .69       .69        .69
Data    Expt 2      .73      -        .57      .74        .65       .65       .65        .66
Data    Expt 3      .62      .57      -        .74        .54       .53       .55        .54
Model   Stimulus    .72      .74      .74      -          .74       .73       .75        .75
Model   Pa (88)     .69      .65      .54      .74        -         .98       1.00       .99
Model   Pa (93)     .69      .65      .53      .73        .98       -         .98        1.00
Model   Pa (88)’    .69      .65      .55      .75        1.00      .98       -          .99
Model   Pa (93)’    .69      .66      .54      .75        .99       1.00      .99        -

Note: Pa (88)’ is a linear combination of Pa (88) and the stimulus model; Pa (93)’ similarly. All correlations are p < .01 (two-tailed comparisons).

All correlations in Table 7 would have been highly significant (p < .01) if considered alone. However, our primary interest was to compare coefficients with each other and draw general, tentative conclusions from those comparisons.

Consider first the Pearson correlations. Results of Experiment 1 correlated well with results of Experiment 2, but less well with Experiment 3, in which participants actively chose the best-fitting tone and the profile peaks corresponded well to conventional chord roots. The experimental results also correlated well with the first three models (stimulus model, Pa (88), Pa (93)). For Experiment 1, the pitch models correlated better than the stimulus model, as expected, but this was not the case for Experiments 2 and 3, suggesting that the pitch models could be improved by combining them with the stimulus model.

We therefore created a linear combination of each pitch model and the stimulus model (Pa (88)’, Pa (93)’). Relative to Table 6, a constant value of 20 was added to the predicted weight of the three chromas corresponding to chord tones. In chord 047, for example, the models output profiles of 12 values; 20 was added to the predicted salience of chromas 0, 4, and 7. By trial and error, we found that the value 20 roughly maximized the Pearson correlations between model predictions and empirical data for the three experiments. This result can be explained if some participants intuitively recognized some of the chords, deducing which pitches were chord tones based on musical experience or music theory. The success of this combined model suggests that the salience of chord tones was underestimated in the original models, relative to other chromas.
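The adjustment can be sketched as adding a constant to the predicted weights of the chord tones, which amounts to adding 20 times the stimulus profile to the pitch-model profile. The profile values below are hypothetical placeholders and are not the predictions in Table 6; the function name is likewise an assumption.

```python
def combine_with_stimulus(profile, chord, boost=20):
    """Add a constant to the predicted weight of each chord-tone chroma.

    profile: 12 predicted weights, one per chroma (0-11)
    chord:   chromas notated in the chord, e.g. {0, 4, 7} for chord 047
    boost:   constant added to chord tones (20 in the text)
    """
    return [w + boost if chroma in chord else w
            for chroma, w in enumerate(profile)]


# Hypothetical pitch-model output for chord 047 (placeholder values)
profile_047 = [30, 2, 8, 1, 18, 12, 0, 16, 3, 10, 4, 1]

combined_047 = combine_with_stimulus(profile_047, {0, 4, 7})
print(combined_047)
```

Only chromas 0, 4, and 7 change; the weights of the nine non-chord tones are left as predicted by the pitch model.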

The Spearman correlations in Table 7b confirm that all correlations are significant at the p < .01 level, but they do not reflect the superior performance of the adjusted models Pa (88)’ and Pa (93)’. The better performance of the stimulus model by comparison to other models confirms that participants were generally able to distinguish chord tones from non-chord tones.

General Discussion

Experiment 1 shed light on the perceptual and cognitive foundations of chord-scale compatibility in music theory. A comparison of results with predictions of different models suggests that the scales with which a chord is compatible depend on both “nature” and “nurture” (as previously defined). The tones of compatible scales can be MFs, diatonic tones, 5th-related tones, or completion tones, or a combination of these.

Experiment 2 demonstrated that although MFs are not consciously perceived at non-chord tones in musical chords, listeners’ perceptions are influenced by them. Historically, they could be an important factor influencing variations in salience of non-chord tones. If so, a systematic consideration of such MFs belongs to the foundations of Western music theory.

Results of Experiment 3 were consistent with predicted chord roots according to virtual pitch theory. Whereas some participants may have responded on the basis of music-theoretic knowledge, our qualitative data suggest that individual chords were seldom correctly recognized, reducing the chance that results were artifacts of music-theoretic knowledge. Results were consistent with both Terhardt's (1974, 1982) claim that chord roots are virtual pitches and Thomson's (1993) contrasting claim that they are cultural phenomena.

Taken together, the results of Experiments 1, 2, and 3 suggest that the models of Parncutt (1988, 1993) correctly identify the main MFs in musical chords, but overestimate their perceptual salience. One possible explanation is that the MFs of chords in music result from incomplete, approximate harmonic series of slightly asynchronous partials; the models ignore the mistuning and asynchrony. A second explanation is that musicians learn to ignore MFs during musical training and practice, including ear-training courses.

For methodological reasons, the chords in our experiments were built from OCTs and not from HCTs. These chords, like those presented by Krumhansl (1990), sounded similar to chords played on a church organ, suggesting that our participants perceived them as if they comprised HCTs. If so, we can imagine a two-stage cognitive process: first, chord recognition based on similarity, and second, access to perceptual (nonlinguistic) “knowledge” about the chord as it occurs in music, such as profiles of prevalence of preceding and following tones.

Comparing the results of the pretest (AAT) and the main experiments, the data suggest that our participants sometimes directly perceived tone sensations at MFs and at other times responded according to statistical distributions in music to which they had been exposed. Results of Experiment 2, in which participants were asked if the probe tone was “in the chord,” were better accounted for by a theory of MFs than results of Experiment 1, in which participants were asked if the tone “went with the chord.” Results of Experiment 2 also suggest that variations in the salience of non-chord tones were better accounted for by a psychoacoustic theory of MF perception than by three competing theories based on experience of tonal music: diatonicity, 5th relations, and completion tones (tones that complete a familiar, more complex chord). The lack of a consistent significant difference between relatively fundamental and relatively spectral listeners in Experiment 2 suggests in addition that participants were not directly perceiving tones at non-chord-chromas; instead, they may have been imagining pitches that often occur before and after those chords in music—because in the past MFs were sometimes perceived in those chords at those pitches.

On this basis, we propose a speculative psychohistoric explanation for the observed variations in perceptual salience of non-chord tones. A psychohistoric account considers both the acoustics of musical sounds and historic changes in their perception. By contrast, psychoacoustic theories of pitch perception focus on physical properties of the real-time stimulus such as periodicity or harmonicity. A psychoacoustic approach usually does not consider the situation in which a sound is perceived or the (musical) experience of the listener.

A psychohistoric approach would ideally acknowledge the role and relevance of the history of musical structure, the history of music perception, the history of music theoretic ideas, and the historical, social, and ideological context dependency of music perception (Cazden, 1945). The points in this list are causally interconnected (confounded) and hence resistant to empirical scientific investigation. A psychohistoric approach also acknowledges and addresses the “two cultures” problem of Snow (1959) by introducing humanities issues into scientific discourse and vice versa. It has the potential to reconcile persistent contradictions between scientific approaches such as Krumhansl (1990) and Terhardt (1974), while at the same time acknowledging the positive contribution of both, and on that basis provide a new foundation for a comprehensive psychologically founded theory of major-minor tonality.

If we tried to understand historic music perception from a purely psychoacoustic viewpoint, we might predict that, since the advent of counterpoint, European listeners have been influenced by weak MFs at non-chord tones, that is, pitches that are not octave-equivalent to chord tones (e.g., the tone A in the chord CEG). In temporal theories of pitch perception, these tone sensations or tonal implications correspond to approximately periodic patterns in the waveform after auditory filtering. In spectral approaches, they correspond to fundamental frequencies of incomplete, approximately harmonic patterns of audible partials.

In a psychohistoric paradigm, this subconscious, historically undocumented aspect of musical pitch perception influenced the statistical probability of certain tones preceding and following certain chords. For example, the probability that the tone A would precede or follow the chord CEG, regardless of context, was boosted because A was weakly implied as an MF. Statistical regularities of that kind were then internalized by Western listeners (cf. Tillmann et al., 2000).

Our findings are limited to relationships between chromas and do not consider octave register. Future work may return to this issue, following Parncutt (1989) and Terhardt et al. (1982). Consider for example the A-minor triad ACE, in close position with A in the bass (e.g., A3C4E4). The tone A usually has audible harmonics at chromas A, E, C♯, G, and B; C, at C, G, E, B♭, and D; and E, at E, B, G♯, D, and F♯. The triad's spectrum has (approximate) fundamental frequencies at 6 or more chromas: physically present at A, C, and E, and missing (MFs) at D, F, and B. Ignoring voicing and register, the MF at D is associated with partials at D, A, F♯, C, and E; the MF at F, with C, A, and G; and the MF at B, with B, F♯, A, and C♯. If register is taken into account, specific pitches are predicted to be more salient than others, depending on the chord's voicing. The theory could be tested by manipulating the amplitude of selected harmonics of selected MFs, testing whether the salience of those MFs changed according to predictions.
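The A-minor example can be worked through with a short sketch. It assumes that the audible harmonics of each tone are approximated by harmonics 1, 3, 5, 7, and 9, which lie 0, 7, 4, 10, and 2 semitones above the fundamental when octave register is ignored. The names and the enharmonic spellings are chosen for illustration, and the sketch is not an implementation of the Parncutt (1988, 1993) models.

```python
NAMES = ["C", "C♯", "D", "E♭", "E", "F", "F♯", "G", "G♯", "A", "B♭", "B"]

# Chroma intervals of harmonics 1, 3, 5, 7, 9 above the fundamental (assumption)
HARMONIC_INTERVALS = (0, 7, 4, 10, 2)


def harmonic_chromas(root):
    """Chromas of the assumed audible harmonics of a tone with chroma `root`."""
    return [(root + interval) % 12 for interval in HARMONIC_INTERVALS]


def supported_partials(candidate, chord):
    """Names of the harmonics of a candidate fundamental that coincide
    with partials physically present in the chord."""
    present = {p for tone in chord for p in harmonic_chromas(tone)}
    return [NAMES[c] for c in harmonic_chromas(candidate) if c in present]


a_minor = (9, 0, 4)  # chromas of A, C, E

for tone in a_minor:
    print(NAMES[tone], [NAMES[c] for c in harmonic_chromas(tone)])

for candidate in (2, 5, 11):  # candidate MFs at D, F, and B
    print(NAMES[candidate], supported_partials(candidate, a_minor))
```

Under these assumptions the candidate at D is supported by five partials, the candidate at B by four, and the candidate at F by three.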

We have not considered non-human pitch perception or neural substrates of pitch perception. That nonhuman animals perceive MFs (e.g., Heffner & Whitfield, 1976) is unsurprising given the ecological and social significance of fundamental frequency in conspecific vocalizations (e.g., Biben, Symmes, & Bernhards, 1989) and the susceptibility of the fundamental to masking in noisy environments (Sinnott, Stebbins, & Moody, 1975). Prior to the present study, we know of no evidence for the perception of MFs at non-chord tones within musical chords in either human or nonhuman subjects. It is difficult enough to demonstrate MF perception within musical chords with musically trained listeners; nonmusicians were excluded from our experiments because the task was too difficult. Nor are there published empirical studies on neural mechanisms underlying individual pitches perceived within musical chords or implied by musical chords. Studies such as Maess, Koelsch, Gunter, and Friederici (2001) and Patel, Gibson, Ratner, Besson, and Holcomb (1998) considered music-syntactic relationships and incongruities, but not individual pitches. Although each chord in our experiments was presented in isolation, post-experiment interviews suggested that chords were perceived as musical entities, implying that their perception was affected by musical experience—an aspect that nonhumans are unlikely to be sensitive to and mechanistic temporal models of pitch perception are unlikely to account for.

Our findings may inspire new approaches to analysis and composition. 20th-century music theorists repeatedly addressed issues of pitch salience: Schoenberg and followers such as Webern or Boulez tried to abandon syntactic relations and hierarchical distinctions between musical tones, making them compositionally less important. Discussion about the artistic virtues and perceptibility of such procedures is ongoing; no matter how hard a composer tries to avoid hierarchical cognitive structures, the listener will still construct them in an attempt to make sense of the music (Dibben, 1994, 1999; Imberty, 1993). From a psychological viewpoint, it is practically impossible to achieve atonality, since in passages regarded as “atonal” some tones or chromas generally sound more important than others. Even if we tried to equalize tone saliences for the average listener, applying algorithmic models to real-time measurement and adjustment, there would still be individual differences due to Adorno's listener typologies and the increasing diversity of modern musical styles and musical audiences (Lilienfeld, 1987). Issues of this kind can be clarified by combining psychological and music-theoretical approaches.

Our findings have additional applications in music analysis and composition. Pitch salience could be notated in musical scores as notehead size (Parncutt, 2011b); non-notated pitches might be gray instead of black. In algorithmic composition, Ferguson and Parncutt (2004) applied the pitch algorithm of Parncutt (1989) to composition in a relatively complex and dissonant style; future work may generate more consonant, accessible music, and revisit the question of “new tonalities.” In computer-based expressive performance, musical expression (including timing and dynamics) depends on harmonic accent, which in turn involves both vertical dissonance and horizontal harmonic relationships (Bisesi & Parncutt, 2011); a better understanding of MFs in chords could improve algorithms to predict harmonic accent, leading to more convincing artificial performances.

Music-psychological studies of pitch perception and cognition tacitly assume a one-to-one correspondence between notated and perceived pitches. Our findings undermine this assumption. Aspects of Krumhansl's (1990) cognitive structures may be explicable by variations in pitch salience and MFs. These include the tone profiles of musical keys (tonal hierarchies; Parncutt, 1989, 2011a) and tone profiles of chord progressions (Huron & Parncutt, 1993; Parncutt & Bregman, 2000). Melodies in major and minor keys may be perceived as prolongations of tonic triads (Parncutt, 2014; cf. Forte & Gilbert, 1982; Schenker, 1906/1954). In a psychohistoric approach, ratings of chords (harmonic functions) relative to tonal contexts depend on the prevalence of similar chord progressions in music, which in turn depends on pitch commonality and preferences for root progressions such as falling 5ths (Parncutt, 1989, 2005).

Conclusion

We measured tone profiles for a relatively large number of musically representative, isolated musical chords, using contrasting empirical methods and a large number of participants. Our results may represent the best existing body of data for the testing of explanatory psychoacoustic, cognitive, and music-theoretic models. We then demonstrated that models of MFs, diatonicity, 5th relations, and completion tones can account in part for tone profiles of typical musical chords (Experiment 1), but that MFs and 5th relations dominate when the listener's attention is more focused on tones in the chord itself (Experiment 2). We also presented what is presumably the most conclusive evidence to date for the psychological reality of chord roots (Experiment 3). In three different approaches to analysis of data from Experiments 1 and 2 (ANOVA, comparison of mean ratings at predicted chromas, and correlation analysis), we presented convergent evidence that peaks in the tone profiles of musical chords are significantly influenced by MFs.

An analysis of individual differences (fundamental versus spectral listeners) suggested, however, that individual listeners do not perceive these MFs directly. A possible explanation involves the well-documented sensitivity of listeners to statistical distributions in the music to which they are exposed. We speculate that MFs may have been perceived by past listeners, which influenced statistical pitch distributions of past music, which in turn influenced the music to which our participants were exposed, and hence their real-time music perception. This “psychohistoric” approach addresses a longstanding issue about the status of virtual pitches in tonal music and suggests that the contrasting approaches of Krumhansl and Terhardt may be complementary rather than contradictory.

Notes

1. In musical set theory, a set of three different chromas (octave-generalized pitches) is a trichord (Rahn, 1980). The term triad tends to be reserved for major, minor, diminished, and augmented triads; that is, for musically familiar or basic trichords. If some tones in a trichord are doubled (played in more than one octave register), the chord comprises three chromas but more than three tones. A tetrachord is a set of four chromas. By polyphony we mean music comprising several partly independent voices (rather than voices that move in parallel).
2. A pitch at an MF is always a virtual pitch, but not all virtual pitches are at MFs. The pitch at or near the fundamental of an HCT, in which the fundamental is present and audible, is usually virtual. But if the spectral pitch of the lowest partial is more salient than the coinciding virtual pitch, as in some high-pitched musical sounds, the main pitch is spectral.
3. A spectral pitch is the pitch of a pure tone, whether heard in isolation or as a partial within a complex sound. Like any other pitch, spectral pitch is fundamentally subjective and experiential in nature, because empirical pitch judgments are always mediated by the listener's consciousness (Terhardt, 1998). In a common procedure for pitch judgment, a listener hears a complex sound and a pure tone in alternation and adjusts the frequency of the pure tone until the two sounds have the same pitch. The physiological correlates of spectral pitch in the peripheral auditory system are complex; both spectral and virtual pitch depend in general on a mixture of temporal and spectral information and processes (Moore, 2003). If we ignore physiology and consider only the relationship between spectral pitches and partial frequencies, the relationship is still complex: spectral pitches and corresponding spectral frequencies differ from each other depending on the sound levels of the partials and the degree to which they mask each other (pitch shifts). If a partial is completely masked, its spectral pitch ceases to exist.
4. In music theory, a “chord” is often a familiar triad or seventh chord, or a sonority constructed according to the principle of stacked thirds. But the word “chord” may also refer to any simultaneity of any tones from the chromatic scale, which is how we use the word in this paper. Our definition is consistent with polyphonic musical practice since the Middle Ages, in which almost all possible pitch-class sets were used (Parncutt et al., 2018). It is also consistent with the idea of additive harmony in early modernism (Blättler, 2017) and the jazz-theory concept of bitonal chords that combine lower and upper structures (Pease & Pullig, 2001). Alternative terms for “chord” in this sense include “simultaneity” or “sonority.” An OCT is a complex tone whose partials are spaced at octave intervals across the audible spectrum. Shepard tones are OCTs whose amplitude envelope is bell-shaped. In this study, the amplitude envelope of OCTs was flat before amplification; low and high frequencies were instead attenuated by a mixture of acoustical phenomena (frequency response of sound card and headphones) and psychoacoustical phenomena (auditory threshold, curves of equal loudness, and masking).
5. We distinguish between pitch and chroma. Pitch is the perceived height of a tone on a one-dimensional scale from low to high. Chroma is octave-generalized, musically categorized pitch. There are 12 chromas: C, C♯/D♭, D, etc. Each chroma can be realized in different octave registers. A chroma is also a psychological category: in a musical context based on the chromatic scale, pitches lying within roughly a quartertone of a chroma's centre pitch are perceived as belonging to that chroma (cf. Burns & Ward, 1978). A “chord chroma” is a chroma corresponding to one of the chord's notes; other chromas are “non-chord tones.”
6. A “pitch” is a subjective experience. This definition applies equally to spectral and virtual pitch. An experimental psychoacoustic paradigm explores quantitative relationships between experiential parameters such as pitch, timbre, and loudness on the one hand, and physical parameters such as the frequencies, amplitudes, and relative phases of partials within complex tones on the other.
7. In a psychological approach, the root is a reference chroma, relative to which other chord chromas are often or usually perceived. Major and minor triads appear most often in root position (i.e., with the root in the bass). Chord roots are often ambiguous, but the assumed root usually corresponds to the lower tone of a P5 interval between chord chromas (or the upper tone of a P4).
8. At first sight, this seems like a two-way repeated measures design with independent variables Chord (9 levels) and Chroma (12 levels). It is not, because a main effect of the circular (modulo 12) variable Chroma would be meaningless: each chord has a different profile and is randomly transposed relative to the others. In this simple, non-standard design, each chord is statistically independent of the other chords, so regardless of the number of tested chords there is no need to correct for multiple comparisons.
9. Chord 015 (e.g., E-F-A) combines a consonant P4/P5 interval (5 semitones) with a dissonant m2 (1 semitone) and an intermediate M3 (4 semitones). In modern (tonal) terminology, it can be a minor triad (on with suspended or passing M9 (on E). In early music, it can be a M3 interval (FA) with a passing M7 (E), or a P5 interval (AE) with a passing m6 (F). In Figure 2, this dissonant tone combination is almost always prepared (i.e., the tones do not begin simultaneously).
10. Because Experiments 1 and 2 are similar, the p value required for significance is lower in the results section of Experiment 2. It lies between .05 (if the two experiments are regarded as independent) and .025 (if they are regarded as identical).
11. The models overlap, so the data do not satisfy the independence condition for ANOVA. To avoid overstating the finding, we apply the approximate rule that two means are different if confidence intervals do not overlap, or overlap only slightly (Goldstein & Healy, 1995).
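
The chroma categories of Note 5 and the root heuristic of Note 7 can be illustrated with a minimal Python sketch. It is not the model tested in this paper: the function names `chroma_of` and `root_candidates` are hypothetical, and the mapping from frequency to chroma assumes 12-tone equal temperament with A4 = 440 Hz, neither of which is specified in the notes. Chromas are numbered 0 (C) to 11 (B).

```python
import math


def chroma_of(freq_hz, ref_a4=440.0):
    """Return the nearest chroma (0 = C ... 11 = B) of a frequency.

    Assumption: 12-tone equal temperament with A4 = ref_a4 Hz.
    Rounding to the nearest semitone puts every pitch within about a
    quartertone of a chroma's centre pitch into that chroma (Note 5).
    """
    semitones_from_a4 = 12.0 * math.log2(freq_hz / ref_a4)
    midi_pitch = 69.0 + semitones_from_a4  # A4 is MIDI note 69
    return round(midi_pitch) % 12


def root_candidates(chord_chromas):
    """Return chord chromas that are the lower tone of a P5 in the chord.

    In chroma terms, the lower tone of a P5 (7 semitones) is the same
    as the upper tone of a P4 (5 semitones), as stated in Note 7.
    An empty list means the chord contains no P5/P4 interval.
    """
    chromas = {c % 12 for c in chord_chromas}
    return sorted(c for c in chromas if (c + 7) % 12 in chromas)


if __name__ == "__main__":
    print(chroma_of(440.0))            # chroma of A4
    print(root_candidates({0, 4, 7}))  # major triad C-E-G
    print(root_candidates({4, 5, 9}))  # chord 015, E-F-A
    print(root_candidates({0, 5, 7}))  # suspended chord C-F-G
```

Under these assumptions, the major triad {0, 4, 7} yields the single candidate C, and chord 015 (E-F-A) yields A. The suspended chord {0, 5, 7} yields two candidates (C and F), and the augmented triad {0, 4, 8} yields none. Both cases are consistent with the remark in Note 7 that chord roots are often ambiguous.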

References

Barbour, J. M. (1951). Tuning and temperament: A historical survey. East Lansing, MI: Michigan State College Press.
Bharucha, J. J. (1984). Anchoring effects in music: The resolution of dissonance. Cognitive Psychology, 16(4), 485–518.
Bharucha, J. J. (1987). Music cognition and perceptual facilitation: A connectionist framework. Music Perception, 5, 1–30.
Biben, M., Symmes, D., & Bernhards, D. (1989). Contour variables in vocal communication between squirrel monkey mothers and infants. Developmental Psychobiology, 22(6), 617–631.
Biles, J. A. (2003). GenJam in perspective: A tentative taxonomy for GA music and art systems. Leonardo, 36(1), 43–45.
Bisesi, E., & Parncutt, R. (2011). An accent-based approach to automatic rendering of piano performance: Preliminary auditory evaluation. Archives of Acoustics, 36(2), 1–14.
Blättler, D. J. (2017). A voicing-based model for additive harmony. Music Theory Online, 23(3).
Bowling, D. L., & Purves, D. (2015). A biological rationale for musical consonance. Proceedings of the National Academy of Sciences, 112(36), 11155–11160.
Bowling, D. L., Purves, D., & Gill, K. Z. (2017). Vocal similarity predicts the relative attraction of musical chords. Proceedings of the National Academy of Sciences. Retrieved from www.pnas.org/cgi/doi/10.1073/pnas.1713206115
Burns, E. M., & Ward, W. D. (1978). Categorical perception – phenomenon or epiphenomenon: Evidence from experiments in the perception of melodic musical intervals. Journal of the Acoustical Society of America, 63, 456–468.
Cazden, N. (1945). Musical consonance/dissonance: A cultural criterion. Journal of Aesthetics and Art Criticism, 4, 3–11.
Childs, A. P. (1998). Moving beyond neo-Riemannian triads: Exploring a transformational model for seventh chords. Journal of Music Theory, 42, 181–193.
Cuddy, L. L., & Badertscher, B. (1987). Recovery of the tonal hierarchy: Some comparisons across age and levels of musical experience. Perception and Psychophysics, 41, 609–620.
Dahlhaus, C. (1990). Studies on the origin of harmonic tonality (R. Gjerdingen, Trans.). Princeton, NJ: Princeton University Press. (Original work published 1968)
Deutsch, D., & Feroe, J. (1981). The internal representation of pitch sequences in tonal music. Psychological Review, 88(6), 503–522.
Dibben, N. (1994). The cognitive reality of hierarchic structure in tonal and atonal music. Music Perception, 12, 1–25.
Dibben, N. (1999). The perception of structural stability in atonal music: The influence of salience, stability, horizontal motion, pitch commonality, and dissonance. Music Perception, 16, 265–294.
Eberlein, R. (1994). Die Entstehung der tonalen Klangsyntax [The origin of tonal-harmonic syntax]. Frankfurt/Main: Peter Lang.
Ferguson, S., & Parncutt, R. (2004). Composing ‘In the flesh’: Perceptually-informed harmonic syntax. Proceedings of Sound and Music Computing Conference. Paris, France: SMCC.
Forte, A. (1973). The structure of atonal music. New Haven, CT: Yale University Press.
Forte, A., & Gilbert, S. E. (1982). An introduction to Schenkerian analysis. New York: Norton.
Gauldin, R. (1983). The cycle-7 complex: Relations of diatonic set theory to the evolution of ancient tonal systems. Music Theory Spectrum, 5, 39–55.
Goldstein, H., & Healy, M. J. (1995). The graphical presentation of a collection of means. Journal of the Royal Statistical Society Series A, 158, 175–177.
Hauke, J., & Kossowski, T. (2011). Comparison of values of Pearson's and Spearman's correlation coefficients on the same sets of data. Quaestiones Geographicae, 30(2), 87–93.
Heffner, H., & Whitfield, I. C. (1976). Perception of the missing fundamental by cats. Journal of the Acoustical Society of America, 59, 915–919.
Holleran, S., Jones, M. R., & Butler, D. (1995). Perceiving implied harmony: The influence of melodic and harmonic context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21(3), 737–753.
Huron, D. (2001). Tone and voice: A derivation of the rules of voice-leading from perceptual principles. Music Perception, 19, 1–64.
Huron, D., & Parncutt, R. (1993). An improved model of tonality perception incorporating pitch salience and echoic memory. Psychomusicology, 12, 152–169.
Imberty, M. (1993). How do we perceive atonal music? Suggestions for a theoretical approach. Contemporary Music Review, 9(1–2), 325–337.
Krumhansl, C. L. (1990). Cognitive foundations of musical pitch. Oxford, UK: Oxford University Press.
Krumhansl, C. L. (1991). Music psychology: Tonal structures in perception and memory. Annual Review of Psychology, 42(1), 277–303.
Lerdahl, F. (1988). Tonal pitch space. Music Perception, 5, 315–349.
Lilienfeld, R. (1987). Music and society in the 20th Century: Georg Lukacs, Ernst Bloch, and Theodor Adorno. International Journal of Politics, Culture, and Society, 1(2), 120–146.
Lowinsky, E. E. (1954). Music in the culture of the Renaissance. Journal of the History of Ideas, 15(4), 509–553.
Lynch, M. P., Eilers, R. E., Oller, D. K., & Urbano, R. C. (1990). Innateness, experience, and music perception. Psychological Science, 1(4), 272–276.
Maess, B., Koelsch, S., Gunter, T. C., & Friederici, A. D. (2001). Musical syntax is processed in Broca's area: An MEG study. Nature Neuroscience, 4(5), 540–545.
Milne, A. J., Laney, R., & Sharp, D. B. (2015). A spectral pitch class model of the probe tone data and scalic tonality. Music Perception, 32(4), 364–393.
Moore, B. C. J. (2003). Introduction to psychology of hearing (5th ed.). Amsterdam, Netherlands: Academic.
Moore, B. C. J., Peters, R. W., & Glasberg, B. C. (1985). Thresholds for the detection of inharmonicity in complex tones. Journal of the Acoustical Society of America, 77, 1861–1867.
Nisbett, R. E., Peng, K., Choi, I., & Norenzayan, A. (2001). Culture and systems of thought: Holistic versus analytic cognition. Psychological Review, 108(2), 291–310.
Norton, R. (1984). Tonality in Western culture: A critical and historical perspective. University Park, PA: Penn State University Press.
Parncutt, R. (1988). Revision of Terhardt's psychoacoustic model of chord root(s). Music Perception, 6, 65–94.
Parncutt, R. (1989). Harmony: A psychoacoustical approach. Berlin: Springer-Verlag.
Parncutt, R. (1993). Pitch properties of chords of octave-spaced tones. Contemporary Music Review, 9, 35–50.
Parncutt, R. (2005). Perception of musical patterns: Ambiguity, emotion, culture. In W. Auhagen, W. Ruf, U. Smilansky, & H. Weidenmüller (Eds.), Music and science – The impact of music (Nova Acta Leopoldina, Bd. 92, Nr. 341, pp. 33–47). Halle, Germany: Deutsche Akademie der Naturforscher Leopoldina.
Parncutt, R. (2009). Tonal implications of harmonic and melodic Tn-types. In T. Klouche & T. Noll (Eds.), Mathematics and computing in music (pp. 124–139). Berlin: Springer-Verlag.
Parncutt, R. (2011a). The tonic as triad: Key profiles as pitch salience profiles of tonic triads. Music Perception, 28, 333–365.
Parncutt, R. (2011b). The transdisciplinary foundation of (European) music theory [Keynote address]. Rome, Italy: EuroMAC.
Parncutt, R. (2014). The emotional connotations of major versus minor tonality: One or more origins? Musicae Scientiae, 18, 324–353.
Parncutt, R., & Bregman, A. S. (2000). Tone profiles following short chord progressions: Top-down or bottom-up? Music Perception, 18, 25–57.
Parncutt, R., & Hair, G. (2011). Consonance and dissonance in theory and psychology: Disentangling dissonant dichotomies. Journal of Interdisciplinary Music Studies, 5(2), 119–166.
Parncutt, R., & Hair, G. (2018). A psychocultural theory of musical interval: Bye bye Pythagoras. Music Perception, 35, 475–501.
Parncutt, R., Reisinger, D., Fuchs, A., & Kaiser, F. (2018). Consonance and prevalence of sonorities in western polyphony: Roughness, harmonicity, familiarity, evenness, diatonicity. Journal of New Music Research, 48(1), 1–20.
Patel, A. D., Gibson, E., Ratner, J., Besson, M., & Holcomb, P. J. (1998). Processing syntactic relations in language and music: An event-related potential study. Journal of Cognitive Neuroscience, 10(6), 717–733.
Pearce, M. T., & Wiggins, G. A. (2012). Auditory expectation: The information dynamics of music perception and cognition. Topics in Cognitive Science, 4(4), 625–652.
Pease, T., & Pullig, K. (2001). Modern jazz voicings: Arranging for small and medium ensembles. Milwaukee, WI: Hal Leonard Corporation.
Preisler, A. (1993). The influence of spectral composition of complex tones and of musical experience on the perceptibility of virtual pitch. Perception and Psychophysics, 54, 589–603.
Rahn, J. (1980). Basic atonal theory. New York: Longman.
Rameau, J.-P. (1721). Traité de l'harmonie réduite à des principes naturels. Paris, France: Ballard. www.chmtl.indiana.edu
Rawlins, R., & Bahha, N. E. (2005). Jazzology: The encyclopedia of jazz theory for all musicians. Milwaukee, WI: Hal Leonard Corporation.
Reichweger, G. (2010). The perception of pitch salience in musical chords of different tone types (Master's thesis, Diplomarbeit). University of Graz, Graz, Austria.
Ritsma, R. J. (1967). Frequencies dominant in the perception of the pitch of complex sounds. Journal of the Acoustical Society of America, 42, 191–198.
Rivera, B. V. (1984). The seventeenth-century theory of triadic generation and invertibility and its application in contemporaneous rules of composition. Music Theory Spectrum, 6, 63–78.
Schenker, H. (1954). Harmony (E. M. Borgese, Trans.). Cambridge, MA: MIT Press. (Original work published 1906)
Schneider, P., Sluming, V., Roberts, N., Scherg, M., Goebel, R., Specht, H. J., et al. (2005). Structural and functional asymmetry of lateral Heschl's gyrus reflects pitch perception preference. Nature Neuroscience, 8(9), 1241–1247.
Seither-Preisler, A., Johnson, L., Seither, S., & Lütkenhöner, B. (2008). The perception of dual aspect tone sequences changes with stimulus exposure. Brain Research Journal, 2(3), 125–148.
Seither-Preisler, A., Krumbholz, K., Patterson, R., Johnson, L., Nobbe, A., Seither, S., & Lütkenhöner, B. (2007). Tone sequences with conflicting fundamental pitch and timbre changes are heard differently by musicians and nonmusicians. Journal of Experimental Psychology: Human Perception and Performance, 33, 743–751.
Shepard, R. N. (1982). Geometrical approximations to the structure of musical pitch. Psychological Review, 89, 305–33.
Sinnott, J. M., Stebbins, W. C., & Moody, D. B. (1975). Regulation of voice amplitude by the monkey. Journal of the Acoustical Society of America, 58, 412–414.
Snow, C. P. (1959). Two cultures. Science, 130(3373), 419.
Terhardt, E. (1972). Zur Tonhöhenwahrnehmung von Klängen. Acustica, 26, 173–199.
Terhardt, E. (1974). Pitch, consonance, and harmony. Journal of the Acoustical Society of America, 55, 1061–1069.
Terhardt, E. (1982). Die psychoakustischen Grundlagen der musikalischen Akkordgrundtöne und deren algorithmische Bestimmung. In Dahlhaus, C. (Ed.), Tiefenstruktur der Musik (pp. 23–50). Berlin: TU Berlin.
Terhardt, E. (1998). Akustische Kommunikation [Acoustic communication]. Berlin: Springer.
Terhardt, E., & Grubert, A. (1987). Factors affecting pitch judgments as a function of spectral composition. Perception and Psychophysics, 42, 511–514.
Terhardt, E., Stoll, G., & Seewann, M. (1982). Pitch of complex signals according to virtual-pitch theory: Tests, examples, and predictions. Journal of the Acoustical Society of America, 71, 671–678.
Thompson, W. F., & Cuddy, L. L. (1989). Sensitivity to key change in choral sequences: A comparison of single voices and four-voice harmony. Music Perception, 7, 151–68.
Thompson, W. F., & Parncutt, R. (1997). Perceptual judgments of triads and dyads: Assessment of a psychoacoustic model. Music Perception, 14, 263–280.
Thomson, W. (1993). The harmonic root: A fragile marriage of concept and percept. Music Perception, 10, 385–415.
Tillmann, B., Bharucha, J. J., & Bigand, E. (2000). Implicit learning of tonality: A self-organizing approach. Psychological Review, 107, 885–913.
Trainor, L. J., & Trehub, S. E. (1994). Key membership and implied harmony in Western tonal music: Developmental perspectives. Perception and Psychophysics, 56, 125–132.