Tonal schemata are shaped by culture-specific music exposure. The acquisition process of tonal schemata has been delineated in Western mono-musical children, but cross-cultural variations have not been explored. We examined how Japanese children acquire tonal schemata in a bi-musical culture characterized by the simultaneous, and unbalanced, appearances of Western (dominant) music along with traditional Japanese (non-dominant) music. Progress of this acquisition was indexed by gauging children’s sensitivities to musical scale membership (differentiating scale-tones from non-scale-tones) and differences in tonal stability among scale tones (differentiating the tonic from another scale tone). Children (7-, 9-, 11-, 13-, and 14-year-olds) and adults judged how well two types of target tones (scale tone vs. non-scale tone; tonic vs. non-tonic) fit a preceding Western or traditional Japanese tonal context. Results showed that even 7-year-olds showed sensitivity to Western scale membership while sensitivity to Japanese scale membership did not appear until age nine. Also, sensitivity to the tonic emerged at age 13 for both types of melodies. These results suggest that even though they are exposed to both types of music simultaneously from birth, Japanese children begin by acquiring the tonal schema of the dominant Western music and this age of acquisition is not delayed relative to Western mono-musical peers.
How do listeners acquire musical schemata (i.e., implicit knowledge about musical structures)? It has been widely acknowledged that just as children implicitly acquire language schemata through exposure to spoken language, children also implicitly acquire musical schemata through exposure to music in their culture (see Hannon & Trainor, 2007; Trainor & Hannon, 2013; Trehub & Hannon, 2006, for reviews). Such an experience-based account also holds for schemata about tonal structure, one of the fundamental components of music. Tonal schemata develop with age (Corrigall & Trainor, 2014; Koelsch, Fritz, Schulze, Alsop, & Schlaug, 2005; Krumhansl & Keil, 1982; Trainor & Trehub, 1992, 1994) and they differ from culture to culture (Castellano, Bharucha, & Krumhansl, 1984; Kessler, Hansen, & Shepard, 1984; Matsunaga et al., 2018). Age and culture thus both have a profound influence on the acquisition of tonal schemata. However, the issue of how age and culture interact with one another has remained largely unexplored. More specifically, little is known about the extent of cross-cultural variation in the tonal schemata acquisition process that has been delineated in children in Western countries. In this paper, we report on the acquisition trajectory of tonal schemata in Japanese bi-musical children.
Tonal schemata typically reflect an abstract hierarchical structure of pitches based on their psychological prominence for listeners (e.g., Krumhansl, 1990). In this hierarchical structure, scale tones (i.e., members of a given scale; e.g., C, D, E, F, G, A, and B in C major of the Western diatonic scale) are psychologically more stable than non-scale tones (e.g., C♯, D♯, F♯, G♯, A♯ in C major). Moreover, there are differences in tonal stability among scale tones, and the tonic (e.g., C in C major; note that tonic is a Western music term whereas tonal center is a more general term) is the most stable among all scale tones. Such tonal hierarchies are found not only in listeners exposed to Western music but also in listeners who are exposed to non-Western music; for example, traditional Japanese music (Koizumi, 1958), traditional Korean music (Lantz, Kim, & Cuddy, 2014), Balinese music (Kessler et al., 1984), and traditional Indian music (Castellano et al., 1984).
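To make the notion of scale membership concrete, the following minimal sketch classifies the twelve pitch classes against the C major example given above. The pitch-class numbering (C = 0) and the helper name `is_scale_tone` are illustrative assumptions and are not part of the study's method.

```python
# Pitch classes numbered 0-11 with C = 0 (an assumed, conventional encoding).
NOTE_NAMES = ["C", "C♯", "D", "D♯", "E", "F", "F♯", "G", "G♯", "A", "A♯", "B"]

# Scale tones of C major as listed in the text: C, D, E, F, G, A, B.
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}
TONIC = 0  # C, described in the text as the most stable scale tone in C major


def is_scale_tone(pitch_class, scale=C_MAJOR):
    """Return True if the pitch class (any octave) belongs to the scale."""
    return pitch_class % 12 in scale


scale_tones = [NOTE_NAMES[pc] for pc in range(12) if is_scale_tone(pc)]
non_scale_tones = [NOTE_NAMES[pc] for pc in range(12) if not is_scale_tone(pc)]

print("scale tones:", scale_tones)
print("non-scale tones:", non_scale_tones)
```

The same membership test would apply to any other scale by swapping in a different set of pitch classes; the sets for the traditional Japanese scales are not specified in this section and are therefore not shown.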
A large body of research has examined the developmental trajectory of the tonal hierarchy. In Western countries, average six-month-old infants (with no special music training) do not show sensitivity to Western tonal structure (Gerry, Unrau, & Trainor, 2012). At around the age of one year, Western infants begin to show a perceptual advantage for prototypical Western melodies over non-prototypical melodies (Cohen, Thorpe, & Trehub, 1987; Lynch & Eilers, 1992; Trehub, Thorpe, & Trainor, 1990). On the other hand, one-year-old infants are not yet sensitive to differences between Western diatonic scale tones and non-scale tones (Trainor & Trehub, 1992); moreover, the influence of Western diatonic structure on infants’ melody perception is not as strong as that on adults’ perception (Schellenberg & Trehub, 1999; Trehub, Cohen, Thorpe, & Morrongiello, 1986). At the age of four or five years, Western children begin to show behavioral differentiation between diatonic scale tones and non-scale tones (Corrigall & Trainor, 2010; Trainor & Trehub, 1994), although a few electrophysiological studies report that neurophysiological differentiation begins before four years of age (Corrigall & Trainor, 2014, 2019; Jentschke, Friederici, & Koelsch, 2014). However, the sensitivity of these children to diatonic scale tones is not acute and seems to be limited to specific Western melody types (for example, a melody implying a cadence, a highly familiar melody, or a melody with harmony accompaniment) (Corrigall & Trainor, 2010; Morrongiello & Roes, 1990; cf. Hargreaves & Lamont, 2017). When children reach six to eight years of age, they can differentiate diatonic scale tones from non-scale tones in a wider variety of Western melodies (Krumhansl & Keil, 1982; Lamont & Cross, 1994; Morrongiello & Roes, 1990). Additionally, children as young as age 10 become sensitive to differences in tonal stability among diatonic scale tones (Krumhansl & Keil, 1982; Lamont & Cross, 1994). Specifically, children of this age distinguish triadic tones (e.g., tonic) from non-triadic scale tones. Sensitivity to the dominant of the tonic appears later in development (for example, ages of 10-11 in Lamont & Cross, 1994), but this sensitivity does not reach the level found in adults until several years later (Krumhansl & Keil, 1982). To sum up, Western children begin to show robust sensitivity to Western diatonic scale membership around the age of seven and then go on to show sensitivity to differences in tonal stability between certain scale tones (e.g., the tonic) and other scale tones by around the age of 10.
While Western children’s tonal schemata acquisition has been well studied, there is very little research on the tonal schemata acquisition of Asian, Arabian, and African children, including those who grow up in present-day Japanese society. Present-day Japan is known to have a bi-musical tonal environment comprising Western music and traditional Japanese music. Strictly speaking, Japan’s contemporary music culture is overwhelmingly Western (or of Western idiom) while many types of traditional Japanese music continue to thrive (e.g., Chiba, 2007; Izumi, 1995; Koizumi, 1958, 1984; Kojima, 1997; Shibata, 1978; Tokita, 2014). Here we define the bi-musicality of Japan’s music environment, borrowing concepts from the literature on bilingualism (De Groot, 2011; Werker & Byers-Heinlein, 2008). First, the environment has a simultaneous bi-musical dimension in the sense that Japanese listeners have been exposed to both types of music from birth. Second, Japan’s environment also has an unbalanced bi-musical dimension in the sense that Japanese listeners have much more exposure to Western music than to traditional Japanese music. In this paper we define Western music as the dominant music for Japanese listeners in terms of the opportunities for, and amount of, exposure to music that the average Japanese listener has. Taken together, present-day Japanese society provides listeners with a simultaneous, yet unbalanced, bi-musical environment.
A number of behavioral and neuroimaging studies have confirmed that Japanese adults, reflecting their bi-musical tonal environment, acquire both the Western-style tonal schema and the traditional-Japanese-style tonal schema (e.g., Hoshino, 1989; Matsunaga et al., 2018; Matsunaga, Yokosawa, & Abe, 2012, 2014; Ogawa, Kimura, & Mito, 1995/1996). This means that Japanese adults typically organize a given tone sequence into either the Western-style tonal system or the traditional-Japanese-style tonal system. In other words, Japanese adults have a sensibility for both Western-style tonality and traditional-Japanese-style tonality. Moreover, consistent with this unbalanced bi-musical environment, Japanese adults appear to hold the Western tonal schema as the dominant one. Hoshino and Abe (1984) report that Japanese adults tend to rely upon a Western-style tonal interpretation more often than a Japanese-style tonal interpretation for an ambiguous short melody, i.e., a melody that affords two different (Western-style or traditional-Japanese-style) tonal interpretations. How and when (i.e., at what age) do Japanese children acquire this adult-like sense of Western-style tonality and of traditional-Japanese-style tonality? The goal of the present study was to examine how Japanese children acquire the adult-like bi-musical tonal schemata.
Japanese children of preschool age have yet to acquire either the Western or the Japanese tonal schemata. Fujita (1975) compared 5- to 6-year-olds (preschool-age children) and 9- to 10-year-olds (school-age children) on their recognition of Western melodies, traditional Japanese melodies, and atonal melodies. While 5- to 6-year-olds did not differ in their recognition of Western, Japanese, and atonal melodies, 9- to 10-year-olds performed better on Western and Japanese melodies than atonal melodies. Similar results are found in an experiment by Sawanobori (1980). Also, Fukui and Matsukubo (1992) reported that although children under six years of age did not associate ‘a sense of traditional-Japanese-style tonality’ with traditional Japanese scales, 9-year-old children did demonstrate this association.
The present investigation sought to examine how, and at which ages, Japanese children develop the Western-style tonal schema and the traditional-Japanese-style tonal schema. In this study, the progress of tonal schemata acquisition was indexed by two types of sensitivity to the tonal structure of music. One was sensitivity to scale membership; the other was sensitivity to differences in tonal stability among scale tones, in particular the dominance of the tonal center (i.e., tonic in Western music) over other scale tones. In Experiment 1, we evaluated listeners’ ability to differentiate between scale tones and non-scale tones. In Experiment 2, we assessed listeners’ ability to differentiate between the tonal center and another scale tone. In these two experiments, our participants were listeners who were born and grew up in present-day Japanese society and had no music training. Specifically, the Japanese participants fell into six age groups: a 6- to 8-year-old group (the first and second grades of elementary school), an 8- to 10-year-old group (the third and fourth grades), a 10- to 12-year-old group (the fifth and sixth grades), a 13- to 14-year-old group (the second grade of junior high school), a 14- to 15-year-old group (the third grade), and an adult group. Ideally, preschool children would also have been included among our participants; in practice, however, this was difficult. Nevertheless, this participant selection did not compromise the purpose of the study, considering that previous studies already confirm that preschool children have yet to exhibit signs of either the Western-style or the Japanese-style tonal schemata (Fujita, 1975; Fukui & Matsukubo, 1992; Sawanobori, 1980).
The present study prepared one group of melodic materials for which Japanese adults report feeling Western-style tonality and another group for which Japanese adults feel traditional-Japanese-style tonality. Specifically, we prepared tone sequences that Japanese adults organize into the Western tonal system (i.e., that the adults assimilate into the Western tonal schema) as Western melodic materials; likewise, we prepared tone sequences that Japanese adults organize into the traditional Japanese tonal system (i.e., that they assimilate into the traditional Japanese tonal schema) as traditional Japanese melodic materials. The materials used in our previous study (Matsunaga et al., 2018) were verified as being satisfactory in terms of specific points related to Western and Japanese melodies (see Method for details); thus, the present study used exactly the same materials as the previous study did.
There were two compelling reasons to prepare such melodic materials. First, our goal was to examine how and when (i.e., at what age) Japanese children acquire an adult-like sensibility for Western-style tonality and for traditional-Japanese-style tonality. For this purpose, it is more straightforward to prepare melodic materials on the basis of Japanese adults’ tonality sense. Second, although it is well known that tonality perception depends on some sequential characteristics (e.g., Butler & Brown, 1994; Matsunaga & Abe, 2005), there is an ongoing debate about which sequential characteristics of a given tone sequence determine the tonality that is perceived and how this perception arises (see Schmuckler, 2016, for a review). This means that we cannot depend solely on the set of scale tones that make up a given tone sequence to identify which culture-specific tonality (Western-style tonality vs. traditional-Japanese-style tonality) Japanese listeners are supposed to perceive for this tone sequence. In other words, Western tonal materials or traditional Japanese tonal materials cannot be prepared simply on the basis of a set of scale tones. Thus, in this study we prepared the materials on the basis of Japanese adults’ tonality sense.
We made two predictions for Experiments 1 and 2. The first prediction concerned the developmental stages of tonal schemata. Previous studies have shown that Western children exhibit sensitivity to scale membership at a younger age than they show sensitivity to differences in tonal stability among scale tones (e.g., Krumhansl & Keil, 1982; Lamont & Cross, 1994). Listeners cannot become sensitive to differences among scale tones without knowing which tones belong to a particular scale. This dependency led us to predict that, in Japanese children as in Western children, sensitivity to scale membership should emerge before sensitivity to differences in tonal stability among scale tones. The second prediction concerned a difference in the ages at which children acquire the Western tonal schema versus the Japanese tonal schema. Present-day Japanese listeners grow up in an unbalanced bi-musical environment where Western music is dominant over traditional Japanese music (e.g., Koizumi, 1984). In general, the speed at which skills are acquired is largely influenced by the amount of experience. If tonal schemata acquisition is also influenced by the amount of music listening exposure, then the acquisition speed might differ between the Western tonal schema and the Japanese tonal schema.
The goal of Experiment 1 was to determine the age at which Japanese children began to differentiate between scale tones and non-scale tones of the Western diatonic scale and traditional Japanese scales, respectively. The task of child and adult participants was to provide goodness-of-fit ratings for two different target tones occurring at the final position of a melody (Figure 1): (a) a target of a scale tone that was congruent to tonal conventions in the sense that it remained within the key of the melody (scale-tone condition), and (b) a target of a non-scale tone that violated the rules of tonal structure by occurring outside the key of the melody (non-scale-tone condition).
Listeners sensitive to scale membership were expected to hear a non-scale tone as a sour or wrong tone because it violates the tonal schema associated with the preceding tonal context (e.g., Matsunaga et al., 2012; Trainor & Trehub, 1992, 1994). In this experiment, such listeners’ goodness-of-fit ratings should be significantly lower in conditions with non-scale tone targets than in conditions with scale tone targets. As noted, this study prepared one group of melodic materials that Japanese adults assimilate into the Western tonal schema and another group of melodic materials that they assimilate into the traditional Japanese tonal schema. Thus, we predicted that, due to their sensitivity to Western scale membership, Japanese adult participants would give lower goodness-of-fit ratings to non-scale target tones than to scale target tones for the Western melodic materials; likewise, our prediction for the Japanese melodic materials was parallel to that for the Western melodies. Moreover, if Japanese child participants performed similarly to the adult participants, this would suggest that the child participants relied on tonal rules similar to those used by the adult participants. In this case, it is likely that, as with Japanese adults, the children relied on implicit knowledge of scale membership for the corresponding melodic materials.
Ratings for Western melodies and those of traditional Japanese melodies were designed to be analyzed separately because equivalence between the two melody types was not ensured. For example, we were not certain whether the two types were comparable in terms of ease of differentiation between scale tones and non-scale tones. Therefore, to err on the side of caution, we did not directly compare ratings of the Western melody type with those of the Japanese melody type.
A total of 122 children and teenagers participated in the experiment. They ranged in age from 6 to 15 years and were grouped in two- or three-year age bands according to grade in Japan’s education system. Thus, age 6-8 years (Grades 1 and 2 in elementary school) constituted one group, age 8-10 years (Grades 3 and 4) another, and so on. There were 24 6- to 8-year-olds (Mage = 6.9 years, SD = 0.69 years; boys = 10, girls = 14), 24 8- to 10-year-olds (Mage = 8.7 years, SD = 0.46 years; boys = 15, girls = 9), 26 10- to 12-year-olds (Mage = 10.5 years, SD = 0.51 years; boys = 23, girls = 3), 28 13- to 14-year-olds (Mage = 13.5 years, SD = 0.51 years; boys = 14, girls = 14), and 20 14- to 15-year-olds (Mage = 14.4 years, SD = 0.50 years; boys = 18, girls = 2). In addition to the children and teenagers, 28 adults (Mage = 20.4 years, SD = 1.0 years; male = 23, female = 5) participated in the experiment. An additional five children were tested but excluded from the analyses for the following reasons: they had lived in foreign countries (n = 2) or failed to complete all testing (n = 3). Most of the participants reported having no formal music training (6- to 8-year-olds: 2 had 1 year of piano lessons). None had spent any significant length of time in foreign countries. All participants were healthy with no history of hearing impairment. All participants signed an informed consent form approved by the research ethics committee of Shizuoka Institute of Science and Technology.
The sample size used in this experiment was based on an a priori power analysis conducted with the ‘pwr’ package (Champely, 2018) in R 3.5.1. Assuming an effect size of Cohen’s d = 0.69 (reported in one condition of our own previous study; Matsunaga et al., 2018) and a significance level of α = .05, sample sizes of 19 and 25 participants per group would provide 80% and 90% power, respectively, to detect effects. To achieve the desired 80-90% power, we therefore aimed for more than 19 and close to 25 participants per group.
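The arithmetic behind such a power analysis can be sketched as follows. This is only an approximation under stated assumptions: the text does not say which test type was entered into the power analysis, so a two-sided one-sample t-test is assumed here, and the function `approx_n_one_sample` is a hypothetical helper that uses the normal approximation with a commonly used small-sample correction rather than the exact noncentral-t computation performed by dedicated power software. Its results may therefore differ by about one participant from the values of 19 and 25 given above.

```python
from math import ceil
from statistics import NormalDist


def approx_n_one_sample(d, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided one-sample t-test.

    Normal approximation plus the correction term z_alpha**2 / 2.
    This is NOT the exact noncentral-t calculation.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for two-sided alpha
    z_power = z.inv_cdf(power)            # quantile for the desired power
    n = ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


# Effect size and alpha taken from the text; test type is an assumption.
n_80 = approx_n_one_sample(0.69, alpha=0.05, power=0.80)
n_90 = approx_n_one_sample(0.69, alpha=0.05, power=0.90)
print("approx. n for 80% power:", n_80)
print("approx. n for 90% power:", n_90)
```

The sketch shows why the required sample size grows as the target power increases: only the power quantile changes between the two calls, while the effect size and significance level stay fixed.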
Materials and apparatus
The melodic materials were identical to those of our previous study (Matsunaga et al., 2018). The materials consisted of 12 melodies for which Japanese adults feel Western-style tonality and another 12 for which Japanese adults feel traditional-Japanese-style tonality. In other words, we prepared 12 melodies that Japanese adults should assimilate into the Western tonal schema and another set of 12 melodies that they should assimilate into the traditional Japanese tonal schema. Four points from Matsunaga et al. (2018) confirmed that the prepared melodic materials fulfilled these requirements: (a) the Western melodic materials were created by a musician proficient in Western music; the traditional Japanese melodic materials were excerpted from existing traditional Japanese music, “Nihon Komoriuta Senkyoku (A Collection of Traditional Japanese Children’s Songs and Lullabies)”; (b) tonal interpretations for the Western melodies were very similar across Japanese adults and North American adults, whereas those for the Japanese melodies were qualitatively different between these two participant groups; (c) Japanese adults fully felt Western-style melody-ness and traditional-Japanese-style melody-ness for the Western melodies and the Japanese melodies, respectively; this evidence, taken together with evidence showing a high correlation between senses of melody-ness and tonality (Hoshino & Abe, 1981), suggests that Japanese adults can feel the respective culture-specific tonalities for the two types of melodies; (d) responses of Japanese adults clearly showed a “Western melodies–happy” association and a “traditional Japanese melodies–sad” association (parenthetically, signs of this culture-specific music-emotion association have already emerged in Japanese children as young as 6 to 8 years; Matsunaga, Hartono, Yokosawa, & Abe, 2019).
The tonal center and the scale for each melodic material were validated by experts in Western music and traditional Japanese music, respectively. For the Western materials, the Western expert identified tonal centers and scales. On the other hand, the Japanese expert on traditional Japanese music had already analyzed and identified tonal centers and scales for the original existing music (i.e., Nihon Komoriuta Senkyoku). Therefore, we carefully excerpted brief passages from the original so as not to violate the identified tonal centers and scales. Finally, half of the 12 Western melodies were in the diatonic major scale and the other half in the minor scale; half of the 12 Japanese melodies were in the Minyo scale and the other half in the Miyakobushi scale.
We manipulated the pitch of a target tone at the final position of each melody in order to create two conditions: scale-tone and non-scale-tone conditions. In the scale-tone condition, the center tone (i.e., tonic) of a key in a given tonal context occurred at the final position. In the contrasting non-scale-tone condition, a target tone outside the implied key occurred at the final position. The target tones of the scale-tone and non-scale-tone conditions were one or two semitones apart. We tested whether melodies ending in the scale-tone condition and those ending in the non-scale-tone condition were matched in terms of the pitch distance between the penultimate tone and the final tone, and confirmed no significant difference between the two conditions, t(23) = .318, p = .75, d = .06; the mean pitch distance was 3.3 and 3.2 semitones for the scale-tone and non-scale-tone conditions, respectively. This result indicates that, for listeners insensitive to scale membership, a target tone in the non-scale-tone condition was not more structurally salient in terms of melodic leap than that in the scale-tone condition. In sum, the present study used 48 stimulus tone sequences, i.e., 24 melodies (12 Western and 12 Japanese melodies) × 2 conditions (scale-tone and non-scale-tone conditions).
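The stimulus manipulation can be illustrated with a minimal sketch. The melody below is hypothetical (it is not one of the study's stimuli), tones are encoded as MIDI note numbers as an assumed convenience, and `final_leap` is an illustrative helper for the penultimate-to-final pitch distance discussed above.

```python
# Hypothetical tonal context in C major, as MIDI note numbers (60 = middle C).
# These tones are illustrative only and are NOT taken from the study's stimuli.
context = [60, 64, 67, 65, 62]        # C, E, G, F, D

scale_ending = context + [60]         # final tone C: the tonic (scale-tone condition)
non_scale_ending = context + [61]     # final tone C♯: outside C major (non-scale-tone condition)


def final_leap(melody):
    """Absolute pitch distance in semitones from the penultimate to the final tone."""
    return abs(melody[-1] - melody[-2])


# Distance between the two target tones themselves (1 or 2 semitones in the study).
target_distance = abs(scale_ending[-1] - non_scale_ending[-1])

# Stimulus count as described in the text: (12 Western + 12 Japanese) x 2 conditions.
n_stimuli = (12 + 12) * 2

print("leap, scale-tone ending:", final_leap(scale_ending))
print("leap, non-scale-tone ending:", final_leap(non_scale_ending))
print("target distance:", target_distance, "| stimuli:", n_stimuli)
```

Computing `final_leap` for both versions of every melody yields the paired values whose means the matching test above compares.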
Melodies were presented in several different keys, but the same key was used within each pair of stimulus tone sequences. The total number of tones per individual sequence ranged from 12 to 28, and all tone sequences had the same tempo (120 bpm), metrical structure (4/4 meter), and timbre (acoustic piano). Equal temperament was used in all stimulus tone sequences. The stimuli were created using GarageBand (Apple Inc.) and presented through iTunes (Apple Inc.). A MacBook Air notebook computer running iTunes and Logicool stereo speakers (Z130) were used to present the stimulus tone sequences. The sound volume was set to a comfortable listening level.
The experimental procedure for elementary school children (i.e., between 6 and 12 years of age) was basically the same as that for teenagers (i.e., between 13 and 15 years of age) and adults. However, there were minor differences explained below.
The experiment for elementary school children began with interviews, in which the children verbally reported their age, gender, educational and family background, and history of music training. Then, the experiment proceeded with an explanation that the experimenter needed help with melodic endings. The child participants were given a booklet of answer sheets that included seven schematic face drawings ranging from bad (frowning) through neutral (straight-mouthed) to good (smiling). They also received the following instruction: “If you think that a given melody has ended well and correctly, please mark the smiling face; if you think that a given melody has ended badly and incorrectly, please mark the frowning face.” That is, the children marked the face that best expressed the degree of goodness-of-fit of the final tone with the melodic context. This procedure has been shown to be effective in collecting tonal stability responses of school-age children (Cuddy & Badertscher, 1987; Lamont & Cross, 1994). The children were tested in small groups. Before the main experiment, two practice trials were given to the participants. These practice trials were sufficient for the children to feel comfortable with the task. The main experiment consisted of 48 trials, in which Western and Japanese melody types were mixed together. The order of the 48 trials was fully randomized and was counterbalanced across small groups. Rest breaks were given to the children after every 16 trials, because they were assumed to have difficulty maintaining concentration throughout the entire experiment. The main experiment lasted approximately 70 min.
The experimental sessions for teenagers and adults began with a questionnaire assessing their age, gender, educational and family background, and history of music training. The teenagers and adults were asked to rate how well a final tone fit with the whole melody on a 7-point scale (1 = bad and incorrect, 7 = good and correct). The teenagers and adults were tested in small groups. Unlike the children, the teenagers and adults were not given drawings of faces or rest breaks. After two practice trials, the 48 trials of the main experiment were conducted without breaks (about 30 min). The order of the 48 trials was fully randomized and was counterbalanced across small groups. The participants were not informed that they were going to hear Western music and traditional Japanese music.
Results and Discussion
Before performing any statistical analysis, the elementary-school age children’s judgements given in the marked faces were converted into numbers on a 7-point scale: a smiling face (good) meant a rating of 7 and a frowning face (bad) meant a rating of 1. Subsequently, each participant’s ratings for each melody were analyzed as follows: difference scores were calculated by subtracting ratings of non-scale-tone condition from those of the contrasting scale-tone condition. If participants were sensitive to scale membership, they would give higher ratings to scale-tone condition than non-scale-tone condition. In this case, difference scores could show positive values that significantly exceeded the chance level of 0 (i.e., an equal rating of the non-scale-tone and scale-tone condition). In addition, difference scores were useful because there were large individual differences in the use of the response (rating) scale and the individual differences related to the age of the participant, in that younger children tended to give higher ratings overall. Similar individual differences were reported in a study of Krumhansl and Keil (1982) which tested Western children. All of the analyses to be described were based on difference scores.
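The scoring step described above can be sketched as follows. The ratings are hypothetical (not data from the study), and averaging the per-melody difference scores within a participant is an assumption made for illustration; the text does not spell out the aggregation step.

```python
from statistics import mean

# Hypothetical 7-point ratings for one participant (NOT data from the study).
# Face drawings are coded from 1 (frowning, bad) to 7 (smiling, good).
ratings = {
    "melody_01": {"scale": 6, "non_scale": 3},
    "melody_02": {"scale": 7, "non_scale": 5},
    "melody_03": {"scale": 5, "non_scale": 5},
}


def difference_scores(ratings_by_melody):
    """Scale-tone rating minus non-scale-tone rating, for each melody."""
    return {
        melody: r["scale"] - r["non_scale"]
        for melody, r in ratings_by_melody.items()
    }


scores = difference_scores(ratings)

# Assumed aggregation: one mean difference score per participant.
participant_mean = mean(list(scores.values()))

print(scores)
print("mean difference score:", round(participant_mean, 2))
```

A positive mean indicates that the scale-tone endings were rated higher than the non-scale-tone endings, while a value near 0 indicates no differentiation. Because each score is a within-participant difference, a general tendency to use the high end of the rating scale cancels out.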
Figure 2 shows average difference scores and their 95% confidence intervals. One-sample t-tests were conducted separately for each age group. Results for the Western melodies revealed that difference scores of all age groups significantly exceeded chance level (i.e., difference score = 0), 6- to 8-year-olds: t(23) = 2.60, p = .02, Cohen’s d = 0.53; 8- to 10-year-olds: t(23) = 7.28, p < .001, d = 1.49; 10- to 12-year-olds: t(25) = 10.15, p < .001, d = 1.99; 13- to 14-year-olds: t(27) = 8.39, p < .001, d = 1.59; 14- to 15-year-olds: t(19) = 8.68, p < .001, d = 1.94; adults: t(27) = 13.68, p < .001, d = 2.59. Unlike the results for the Western melodies, results for the traditional Japanese melodies revealed that difference scores of 6- to 8-year-olds did not exceed chance level, t(23) = 1.62, p = .12, d = 0.33. In addition, this effect size was substantially smaller than that of the same age group for the Western melodies. For all other age groups, difference scores significantly exceeded chance level, 8- to 10-year-olds: t(23) = 4.61, p < .001, d = 0.94; 10- to 12-year-olds: t(25) = 8.02, p < .001, d = 1.57; 13- to 14-year-olds: t(27) = 7.69, p < .001, d = 1.45; 14- to 15-year-olds: t(19) = 7.36, p < .001, d = 1.65; adults: t(27) = 14.77, p < .001, d = 2.79. Incidentally, results qualitatively similar to those of the one-sample t-tests were obtained with a two-way mixed-design ANOVA on the raw rating data, with target tone (within subjects) and age group (between subjects) as factors.
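As a consistency check on the statistics above, note that for a one-sample t-test the t statistic and Cohen's d are related by d = t / √n, where n is the group size. The sketch below recomputes d from the reported t values and the group sizes given earlier. It assumes that the reported d is the standard one-sample effect size (mean divided by standard deviation), which the text does not state explicitly; `cohens_d_from_t` is an illustrative helper.

```python
from math import sqrt

# (age group, n, reported t, reported d), copied from the text.
western = [
    ("6-8", 24, 2.60, 0.53),
    ("8-10", 24, 7.28, 1.49),
    ("10-12", 26, 10.15, 1.99),
    ("13-14", 28, 8.39, 1.59),
    ("14-15", 20, 8.68, 1.94),
    ("adults", 28, 13.68, 2.59),
]
japanese = [
    ("6-8", 24, 1.62, 0.33),
    ("8-10", 24, 4.61, 0.94),
    ("10-12", 26, 8.02, 1.57),
    ("13-14", 28, 7.69, 1.45),
    ("14-15", 20, 7.36, 1.65),
    ("adults", 28, 14.77, 2.79),
]


def cohens_d_from_t(t, n):
    """One-sample Cohen's d implied by a t statistic based on n observations."""
    return t / sqrt(n)


for label, rows in (("Western", western), ("Japanese", japanese)):
    for group, n, t, d_reported in rows:
        d = cohens_d_from_t(t, n)
        print(f"{label} {group}: recomputed d = {d:.2f}, reported d = {d_reported}")
```

Under this assumption, each recomputed value agrees with the reported d to within rounding, and the degrees of freedom of each test equal the corresponding group size minus one.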
These results showed that, like Japanese adults, even our youngest 6- to 8-year-old children differentiated Western scale tones from Western non-scale tones. In contrast, the 6- to 8-year-old children did not differentiate Japanese scale tones from Japanese non-scale tones. This was different from the adult participants’ performances; indeed, it suggests that 6- to 8-year-old children did not use the same tonal rules for the Japanese melodies as the adults did. Instead, the beginning of differentiations between Japanese scale tones and non-scale tones was observed in responses of 8- to 10-year-old children. Moreover, an indirect comparison of the results of Western melodies with Japanese melodies indicates that sensitivity to scale membership emerged earlier in the Western diatonic scale than the traditional Japanese scales. In sum, the present results suggest that Japanese children begin to rely upon implicit knowledge of Western scale membership at ages between 6 and 8 years (Mage = 6.9 years) and then proceed to rely upon implicit knowledge of traditional-Japanese scale membership at age of 8 to 10 years (Mage = 8.7 years).
As shown in Figure 2, older participants produced larger difference scores than younger participants. This outcome is consistent with our prediction that younger children should find the task more challenging than older children. This trend was confirmed by a one-way ANOVA with age group (between subjects) as a factor. The results involving Western melodies revealed a significant main effect of age group, F(5, 144) = 12.71, p < .001, η² = .31. Post hoc multiple comparisons (Shaffer’s modified sequentially rejective Bonferroni procedure, here and throughout) showed that adult difference scores were significantly higher than those of 6- to 8-year-olds, 8- to 10-year-olds, 10- to 12-year-olds, and 13- to 14-year-olds (adjusted p < .01); difference scores of 10- to 12-year-olds, 13- to 14-year-olds, and 14- to 15-year-olds were also significantly higher than those of 6- to 8-year-olds (adjusted p < .01). Results for Japanese melodies revealed a main effect of age group, F(5, 144) = 14.16, p < .001, η² = .33. Post hoc multiple comparisons showed that adult difference scores were significantly higher than those of all other age groups (adjusted p < .01); difference scores of 10- to 12-year-olds, 13- to 14-year-olds, and 14- to 15-year-olds were also significantly higher than those of 6- to 8-year-olds (adjusted p < .01). In short, parallel results for the Western and Japanese melodies confirmed that it takes several years for children’s sensitivity to scale membership to reach the level seen in adults.
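The effect-size values reported with the two ANOVAs above (.31 and .33) are consistent with eta squared, which for a one-way design can be recovered from the F statistic and its degrees of freedom as η² = (F × df_between) / (F × df_between + df_within). The sketch below performs this check; `eta_squared_from_f` is an illustrative helper, and reading the reported values as eta squared is an inference from the arithmetic rather than something the text states.

```python
def eta_squared_from_f(f, df_between, df_within):
    """Eta squared implied by a one-way ANOVA F statistic."""
    return (f * df_between) / (f * df_between + df_within)


# Group sizes from the participants description: five child/teen groups plus adults.
group_sizes = [24, 24, 26, 28, 20, 28]
df_between = len(group_sizes) - 1               # number of groups minus one
df_within = sum(group_sizes) - len(group_sizes)  # total N minus number of groups

eta_western = eta_squared_from_f(12.71, df_between, df_within)
eta_japanese = eta_squared_from_f(14.16, df_between, df_within)

print("df:", df_between, df_within)
print("Western melodies: eta squared =", round(eta_western, 2))
print("Japanese melodies: eta squared =", round(eta_japanese, 2))
```

The degrees of freedom derived from the group sizes match the reported F(5, 144), and the recomputed values round to the reported .31 and .33.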
The goal of Experiment 2 was to ascertain the age at which Japanese children begin to show sensitivity to differences in tonal stability within the set of Western scale tones and within the set of Japanese scale tones, respectively. Specifically, this experiment focused on listeners’ differentiation between the tonal center and another scale tone. Except for the kinds of target tones occurring at the end of the melodies, the methodology of Experiment 2 was identical to that of Experiment 1. We prepared two different target tones (Figure 1): (a) a center tone, which was more congruent with tonal conventions in the sense that the tonal center is the most psychologically stable scale tone of the implied key and usually occurs at the final position of existing melodies (center condition); (b) a non-center tone, which was less congruent with tonal conventions in the sense that, although it belongs to the key of the melody, it is a psychologically less stable scale tone of the implied key and does not occur at the final position of most existing melodies (non-center condition).
Listeners with a differentiated tonal hierarchy were expected to find a center target a better fit than a non-center target (e.g., Krumhansl, 1990; Matsunaga et al., 2018). In the present experiment, such listeners should give higher ratings to the center condition than to the non-center condition. We predicted that, because they have a sophisticated Western tonal hierarchy, Japanese adult participants would give higher ratings to the center condition than to the non-center condition for the Western melodies. We made a parallel prediction for the Japanese melodies. Moreover, if Japanese child participants performed similarly to the adult participants, this would suggest that these children relied on tonal rules similar to those used by the adults. In this case, it is likely that, as with Japanese adults, the children have acquired considerable sensitivity to the strength of the tonal center relative to a non-center scale tone in the corresponding melody type.
Participants in Experiment 2 were drawn from age groups comparable to those of Experiment 1. A total of 145 children, teenagers, and adults participated. There were 26 6- to 8-year-olds (Mage = 7.0 years, SD = 0.57 years; boys = 13, girls = 13), 25 8- to 10-year-olds (Mage = 9.1 years, SD = 0.64 years; boys = 15, girls = 10), 19 10- to 12-year-olds (Mage = 10.9 years, SD = 0.66 years; boys = 14, girls = 5), 23 13- to 14-year-olds (Mage = 13.0 years, SD = 0 years; boys = 23, girls = 0), 26 14- to 15-year-olds (Mage = 14.7 years, SD = 0.47 years; boys = 24, girls = 2), and 26 adults (Mage = 20.1 years, SD = 1.37 years; boys = 9, girls = 17). An additional five children were tested but excluded from the analyses because they had lived in foreign countries (n = 2) or failed to complete all testing (n = 3). No participant reported having formal music training. None of the participants had taken part in Experiment 1. As noted in Experiment 1, we set the sample size at a minimum of 19 and approximately 25 participants per group. This size was judged to provide 80-90% power based on our previous study (Matsunaga et al., 2018).
Materials, apparatus, and procedure
The materials, apparatus, and procedure were essentially identical to those of Experiment 1. Unlike Experiment 1, this experiment used a center condition and a non-center condition. In the center condition, the tonal center of the key of a given melody occurred at the final position. This was the same as the scale-tone condition of Experiment 1. In the non-center condition, the second scale tone of this key (e.g., D in C major) occurred at the final position of the given melody. The pitch distance between the target tones of the center and the non-center conditions was three semitones or less. We also checked whether the center and the non-center conditions were matched in terms of the pitch distance between the penultimate tone and the final tone. We found no significant difference between the two conditions, t(23) = 0.19, p = .85, d = 0.04; the mean pitch distance was 3.4 semitones in both the center and the non-center conditions. This result indicates that, for listeners insensitive to the center, a target tone in the non-center condition was not more structurally salient in terms of melodic leap than that in the center condition.
Results and Discussion
As in the data pre-processing of Experiment 1, the elementary-school-age children’s judgements, given by marking faces, were converted into numbers on a 7-point scale. Difference scores were then calculated by subtracting ratings of the non-center condition from those of the center condition. Accurate performance in the goodness-of-fit judgement task entailed giving higher ratings to the center condition than to the non-center condition. All analyses described below were based on difference scores. As in Experiment 1, there were large age-related individual differences in the use of the response rating scale.
Figure 3 shows average difference scores and their 95% confidence intervals. One-sample t-tests were conducted separately for each age group. Results for the Western melodies revealed that the difference scores of 6- to 8-year-olds, 8- to 10-year-olds, and 10- to 12-year-olds did not exceed chance level: 6- to 8-year-olds, t(25) = 1.30, p = .21, d = 0.26; 8- to 10-year-olds, t(24) = 0.76, p = .46, d = 0.15; 10- to 12-year-olds, t(18) = 0.17, p = .87, d = 0.04. In contrast, the difference scores of 13- to 14-year-olds, 14- to 15-year-olds, and adults significantly exceeded chance level: 13- to 14-year-olds, t(22) = 3.10, p = .005, d = 0.65; 14- to 15-year-olds, t(25) = 2.64, p = .01, d = 0.52; adults, t(25) = 6.38, p < .001, d = 1.25. The same pattern was found for the traditional Japanese melodies. The difference scores of 6- to 8-year-olds, 8- to 10-year-olds, and 10- to 12-year-olds did not exceed chance level: 6- to 8-year-olds, t(25) = 1.42, p = .17, d = 0.28; 8- to 10-year-olds, t(24) = 0.57, p = .58, d = 0.11; 10- to 12-year-olds, t(18) = 0.03, p = .98, d = 0.01. In contrast, the difference scores of 13- to 14-year-olds, 14- to 15-year-olds, and adults significantly exceeded chance level: 13- to 14-year-olds, t(22) = 3.67, p = .001, d = 0.77; 14- to 15-year-olds, t(25) = 4.23, p < .001, d = 0.83; adults, t(25) = 5.23, p < .001, d = 1.03. Incidentally, a two-way mixed ANOVA on the raw rating data, with target tone (within subjects) and age group (between subjects) as factors, yielded results qualitatively similar to those of the one-sample t-tests.
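As an illustration of the analysis described above, the following minimal sketch computes a one-sample t statistic and Cohen's d for a set of difference scores, testing against a chance level of zero. The scores are hypothetical and are not data from this study. The sketch also shows that, for a one-sample test, d can be recovered as t divided by the square root of the sample size; the reported effect sizes are consistent with this relation, although the exact computation used in the study is an assumption here.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, mu=0.0):
    """Return (t, df, d) for a one-sample t-test of `scores` against `mu`."""
    n = len(scores)
    m = mean(scores)
    sd = stdev(scores)  # sample standard deviation (n - 1 in the denominator)
    t = (m - mu) / (sd / math.sqrt(n))
    d = (m - mu) / sd  # Cohen's d for a one-sample design
    return t, n - 1, d


def d_from_t(t, df):
    """Recover Cohen's d from a one-sample t statistic: d = t / sqrt(n)."""
    return t / math.sqrt(df + 1)


# Hypothetical difference scores (center minus non-center), illustration only
hypothetical_scores = [1.0, 0.5, -0.5, 2.0, 1.5, 0.0, 1.0, 0.5]
t_value, df, d_value = one_sample_t(hypothetical_scores)

# Consistency check against a reported value: adults, Western melodies,
# t(25) = 6.38, for which d = 1.25 is reported
reported_adult_d = round(d_from_t(6.38, 25), 2)
print(t_value, df, d_value, reported_adult_d)
```

Because d depends only on the mean and the standard deviation, it does not grow with sample size in the way t does, which is why the two are reported side by side.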
These results indicate that, irrespective of music type, children under the age of 13 did not differentiate between the tonal center and the second scale tone. Sensitivity to differences in tonal stability between the tonal center and the second scale tone was first seen in the 13- to 14-year-old children (Mage = 13.0 years). Thus, the present results suggest that, for both types of music, Japanese children at the age of 13 on average have sophisticated tonal schemata in the sense that the tonal center is at a superordinate level in the tonal hierarchy.
We examined the influence of age on difference scores by conducting a one-way ANOVA with age group as a factor. The results for the Western melodies revealed a main effect of age group, F(5, 139) = 8.93, p < .001, η² = .24. Adult difference scores were significantly higher than those of 6- to 8-year-olds, 8- to 10-year-olds, 10- to 12-year-olds, and 14- to 15-year-olds (adjusted p < .05). In addition, the difference scores of 13- to 14-year-olds were significantly higher than those of 8- to 10-year-olds (adjusted p < .05). The results for the Japanese melodies also revealed a main effect of age group, F(5, 139) = 7.37, p < .001, η² = .21. The difference scores of adults and 14- to 15-year-olds were significantly higher than those of 6- to 8-year-olds, 8- to 10-year-olds, and 10- to 12-year-olds (adjusted p < .05). In addition, the difference scores of 13- to 14-year-olds were significantly higher than those of 6- to 8-year-olds and 8- to 10-year-olds (adjusted p < .05). The main effects of age group make clear that sensitivity to the tonal center, regardless of music type, improved with age.
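The effect sizes reported alongside each F ratio (.24 and .21 here) are consistent with eta-squared, which for a one-way design can be recovered from the F statistic and its degrees of freedom. The following minimal sketch shows that conversion; the function name is hypothetical, and reading the reported values as eta-squared is an inference from the numbers rather than something stated in the text.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta-squared for a one-way ANOVA from F and its degrees of freedom.

    Uses eta^2 = (F * df_effect) / (F * df_effect + df_error), which equals
    SS_effect / SS_total when there is a single between-subjects factor.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Experiment 2: F(5, 139) = 8.93 (Western) and F(5, 139) = 7.37 (Japanese)
western_exp2 = round(eta_squared_from_f(8.93, 5, 139), 2)
japanese_exp2 = round(eta_squared_from_f(7.37, 5, 139), 2)

# Experiment 1: F(5, 144) = 12.71 (Western) and F(5, 144) = 14.16 (Japanese)
western_exp1 = round(eta_squared_from_f(12.71, 5, 144), 2)
japanese_exp1 = round(eta_squared_from_f(14.16, 5, 144), 2)

print(western_exp2, japanese_exp2, western_exp1, japanese_exp1)
```

Under this reading, age group accounts for roughly a fifth to a third of the variance in difference scores across the two experiments.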
The goal of this study was to clarify how and when Japanese children acquire the sensitivity to Western-style tonality and to traditional-Japanese-style tonality shown by Japanese adults. The present findings reveal three major age milestones in the acquisition trajectory. First, sensitivity to Western scale membership appears even in 6- to 8-year-olds (Mage = 6.9 years), an age younger than that at which sensitivity to Japanese scale membership first emerged. Previous studies have shown that Japanese preschoolers do not yet exhibit any signs of tonal schemata for either Western or Japanese music (Fujita, 1975; Fukui & Matsukubo, 1992; Sawanobori, 1980). In conjunction with this prior evidence, our results provide further evidence that implicit knowledge of Western scale membership (namely, a fundamental building block of the tonal schema of the first music) is acquired at around the age of seven in Japanese children. Moreover, we confirmed the precedence of the Western tonal schema in age of acquisition. This is in accordance with the dominance of Western music in contemporary Japanese music culture (e.g., Koizumi, 1984). This consistency indicates that although Japanese children are simultaneously exposed to two different types of music from birth, they develop the tonal schema of the dominant music (i.e., Western music for Japanese children) earlier than that of the non-dominant music (traditional Japanese music) in their bi-musical environment.
Second, the present results revealed that the Japanese child participants begin their engagement with music by superficially behaving as Western mono-musical listeners. These children then appear to become bi-musical listeners around the ages of 8 to 10 (Mage = 8.7 years), when they exhibit sensitivity to Japanese scale membership as well as Western scale membership. The results are consistent with previous findings indicating that 9-year-old children can differentiate between the Western diatonic scale and the traditional Japanese scales (Fukui & Matsukubo, 1992), and that children at this age show better recognition of Western tonal music and Japanese tonal music than of atonal music (Fujita, 1975). In sum, it takes about a decade for Japanese children to become basically enculturated into their native bi-musical environment.
Third, our results showed that sensitivity to differences in tonal stability between the tonal center and another scale tone (i.e., the second scale tone) began, on average, around the age of 13 to 14 (Mage = 13.0 years). Furthermore, this trend was the same for the Western melodies and the traditional Japanese melodies. If these results are generalized, it would seem that Japanese children under age 13 are not able to fully understand the importance of the tonal center relative to other scale tones in either the Western melodies or the traditional Japanese melodies. However, such a straightforward generalization should be made cautiously because other studies have provided inconsistent evidence. For example, analyses of spontaneous songs sung by 10-year-old Japanese children showed that some of the children pay particular attention to the role of the tonal center at the ending of a song (Umemoto, 1996). Also, 9-year-old children show better recognition memory for Western tonal music and Japanese tonal music than for atonal music (Fujita, 1975). This evidence suggests that children at age nine can assimilate Western-style and Japanese-style tone sequences into the Western and the Japanese tonal schemata, respectively. In other words, these children can perceive tonal centers in both Western tonal music and Japanese tonal music. The gap between previous findings and our results may imply that estimates of children’s sensitivity to the tonal center vary with task demands and strategy. Alternatively, it may imply that the sophistication of tonal schemata depends on the quantity and the quality of music exposure, so that there are individual differences in how quickly these schemata become sophisticated.
Taken together, the above three findings imply that Japanese children begin to operate with the Western tonal schema around the age of seven and then go on to become bi-musical listeners around the age of nine, when they acquire the Japanese tonal schema. Later, around the age of 13 years on average, they possess sophisticated tonal schemata in which the tonal center is at a superordinate level in the tonal hierarchy. What kinds of learning processes underlie the developmental trajectory shown by Japanese children? Japanese preschoolers do not systematically distinguish between the Western diatonic scale and the traditional Japanese scales (Fukui & Matsukubo, 1992). In addition, Japanese preschoolers’ spontaneous songs appear to be based on a mixture of several rules, including Western tonal rules (Ogawa, 1998). Based on this evidence, it seems likely that young Japanese children start by implicitly gaining “fused and undifferentiated” knowledge about tonal structures from all the music they are exposed to, without distinguishing between Western music and traditional Japanese music. Later, as music exposure increases, this fused implicit knowledge may be gradually differentiated into the Western-style tonal schema and then further into the traditional-Japanese-style tonal schema. Finally, the two tonal schemata appear to be represented distinctly in Japanese adults’ minds/brains, as suggested by neuroscience evidence showing that the neural sources of Western tonal processing and of Japanese tonal processing are spatially separated (Matsunaga et al., 2012, 2014).
Importantly, by qualitatively comparing Japanese children with Western children, we found two cultural commonalities in the developmental timetable of tonal schemata. The first cultural commonality is that Japanese children acquire sensitivity to scale membership of their first (Western) music at the same age as Western children acquire sensitivity to scale membership of Western music (Krumhansl & Keil, 1982; Lamont & Cross, 1994). What does this commonality imply? One possibility is that the commonality is guided by listening experiences of Western music. That is, listening experiences of Western music may have some properties (e.g., amount of music exposure, structural features) that lead children to acquire the Western tonal schema in a period of about seven years. Another possibility is that Western-style music education in Japanese kindergartens and primary schools may enhance children’s acquisition of the Western tonal schema in an environment where their music exposure time is divided between Western music and traditional Japanese music. A third possible interpretation holds that the commonality reflects some innate constraint. That is, in any music culture, children may be biologically programmed to acquire their first tonal schema at age seven. It is also possible that some of these interpretations are intertwined. In any case, the currently available data do not allow us to come to any firm conclusions on this point. To advance this discussion, future studies should directly compare the tonal sensitivity of Japanese children and Western children by using the same materials and the same procedure. It would also be valuable to obtain developmental data from children who have grown up in music cultures that are less westernized than today’s Japanese culture (e.g., Indonesia and Vietnam according to Matsunaga et al., 2018).
The second cultural commonality is that Japanese children pass through the same developmental stages of tonal schemata, irrespective of music type, as Western children do. As expected, the Japanese children in the present study acquired sensitivity to scale membership and then went on to acquire sensitivity to differences in tonal stability among scale tones. This developmental sequence is identical to that shown by Western children (Krumhansl & Keil, 1982; Lamont & Cross, 1994). The second cultural commonality is not surprising, considering that listeners cannot become sensitive to differences among scale tones without understanding which tones belong to the particular scale. Incidentally, the two cultural commonalities can be interpreted as commonalities between bi-musical children and mono-musical children, given the view that Western children are mono-musical listeners of Western music (e.g., Demorest, Morrison, Beken, & Jungbluth, 2008; Wong, Roy, & Margulis, 2009).
In bilingual studies, simultaneous and unbalanced bilingual children show a delay in non-dominant language (L2) acquisition, such as grammar acquisition at ages 2 to 4 (Meisel, 2007) and vocabulary acquisition at ages 4 to 5 (MacLeod, Fabiano-Smith, Boegner-Pagé, & Fontolliet, 2013). We examined simultaneous and unbalanced bi-musical children, and found that they acquired the tonal schema of the non-dominant music later than that of the dominant music. These parallel findings from bilingual studies and our bi-musical study indicate that even though children are exposed to two musics or two languages simultaneously from birth, the speed of acquisition may differ between the two.
In conclusion, the present study provides an initial glimpse into fundamental questions surrounding how and when (i.e., at what age) children acquire two culturally different tonal schemata in a simultaneous, yet unbalanced, bi-musical environment. Our findings reveal that Japanese children begin to internalize the Western tonal schema at age seven and then become bi-musical by acquiring the Japanese tonal schema at age nine. Subsequently, they continue to develop sophistication in the two tonal schemata, becoming sensitive to differences in tonal stability between the tonal center and another scale tone by age 13. Moreover, by comparing our evidence for Japanese bi-musical children with previous evidence for Western mono-musical children, we found culture-universal characteristics in the acquisition trajectory of tonal schemata.
We gratefully acknowledge support from Shizuoka Institute of Science and Technology and thank the anonymous reviewers for their helpful comments on the manuscript. Part of the data appearing in this paper was presented at the autumn meeting of the Japanese Society for Music Perception and Cognition, 2018. This work was supported by JSPS KAKENHI Grant Number 16K21452. The authors declare no competing financial interest.
According to this perspective, a balanced bi-musical environment is a situation in which equal exposure to both musics is provided, and a mono-musical environment is a situation in which exposure to the two musics is in the ratio of 100 to 0.
The Japanese education system includes six years of elementary school and three years of junior high school. Children enter elementary school at six years of age. In this study, we recruited participants by school grade.