Japanese polite language (teineigo) varies with the speaker-addressee relationship as well as social norms. Descriptive studies have found that young Japanese children use polite speech early in development. This claim was experimentally tested in 3- to 6-year-old Japanese children, and correct use of polite verb forms was found even in the youngest children. The early acquisition of these verb forms is surprising, because there is a Japanese social norm that parental speech to children is mostly not polite, so it is not clear how children acquire the knowledge of how to use polite forms. To examine this, a large-scale corpus analysis of polite language was performed using a probabilistic measure of the intended addressee. We confirmed that parental speech is mostly not polite, but parents also produced a substantial amount of polite language that varied appropriately with addressees, and this can help to explain the early use of polite speech in Japanese children under experimental conditions.
Polite language is language that is used to avoid negative impacts on the self-image of the speaker or addressee (Brown & Levinson, 1987). For example, commanding someone to Pass the salt creates pressure on the addressee to perform the action. By making it more polite by changing it into a question Can you please pass the salt?, one reduces the negative consequences of noncompliant behavior. In languages like Japanese, the use of polite language is governed by various social norms that complicate the acquisition of these forms. The present study examines this issue by experimentally verifying whether young Japanese children can use polite language and then examines the parental input for learning these forms in a corpus analysis.
Japanese polite language is more complicated than English polite language in multiple ways. For example, the Japanese version of the command Pass the salt would be shio o totte. This can be made more polite by adding either the word kudasai or choodai to the end. Both of these forms translate to please in English, but kudasai is more polite and used with strangers, while choodai is used with family members or children. The kudasai form is an example of teineigo ‘polite language’ (Cook, 2011; Ide & Yoshida, 2017), which is a normal level of polite language used for interactions with strangers. The choodai form is a plain form used with family members or those that one is close to. While polite language in English tends to be used with requests, the Japanese language requires politeness to be considered for every verb. For example, the declarative English sentence ‘He takes the salt’ could be translated either as shio o torimasu with a polite teineigo verb or as shio o toru with a plain verb form toru. The most commonly used polite form is the copular verb desu ‘to be’ (plain form da), and since polite verbs often end in masu, polite language is often collectively labeled as desu/masu forms. These forms can be contrasted with higher-level polite language, such as kenjoogo forms, which are humble forms for talking about the speaker’s own behavior with an addressee of higher status (e.g., otorisuru ‘take’ might be used by a secretary talking to a famous visiting professor). There are other polite sonkeigo forms for talking about the behavior of the higher status addressee (e.g., otorininaru ‘take’). This means that most Japanese sentences can be produced in at least four ways and there is no simple English translation for these different forms. In addition to verbs, Japanese speakers encode politeness distinctions on nouns and pronouns. In English, this is typically done on adult names by adding an honorific title (e.g., Mr. Smith).
In Japanese, there are both honorific titles (e.g., san) and informal titles (e.g., chan and kun), and these can be applied to a wider range of referents than in English. Japanese also makes politeness distinctions on closed-class elements like pronouns (Ide, 1982; Miyazaki, 2004; Mogi, 2002). The English second-person pronoun you can be expressed in many ways (e.g., anata, kimi, omae, anta, otaku), while the first-person pronoun has several forms as well (watakushi, watashi, atashi, uchi, boku, or ore) and these differ in politeness. These differences between polite language in English and Japanese help to explain why adult English learners of Japanese often struggle with Japanese politeness distinctions (Cook, 2001; Iwasaki, 2010).
Although teineigo is more polite than plain speech, it is not always used in the same way as polite language in English. In English, polite speech is often used when one cares about the addressee’s self-image. For example, it is not necessary to use polite language when talking to a computer (‘Siri, would it be possible to please tell me the time’), because the computer does not have a self-image that could be threatened by this request. But it is not strange to use a teineigo request like nanji desu ka ‘what time is it’ with a computer (Yanai, 2014), because teineigo is appropriate when the speaker and addressee are not close. Theorists like Ide (2017) have argued that Japanese teineigo distinctions do not reflect the same notion of politeness as in English, but instead reflect the discernment of social conventions (wakimae). Evidence for this discernment account comes from a study by Hill et al. (1986), who asked American and Japanese adults to rate various linguistic requests that varied in politeness (e.g., gimme, would you mind if I borrow) with respect to various addressees that differed in status (e.g., professor, younger sister). They found stronger correlations in Japanese speakers between language forms and speakers. These metalinguistic judgments provide evidence that Japanese speakers have stronger social norms about how politeness terms should be used than English speakers (Ide & Yoshida, 2017). Since teineigo forms are often required by social norms, rather than politeness, we will use the Japanese term teineigo in this work to refer to these forms, because it emphasizes their linguistic type and does not specify their function.
To better understand the development of polite language in Japanese, we will first review the acquisition of polite language in languages like English. There is evidence to suggest that the development of politeness in English and other European languages is a slow gradual process (Andersen, 1984, 2013; Andersen et al., 1999; Axia & Baroni, 1985; Bates & Silvern, 1977; Garton & Pratt, 1990; Nippold et al., 1982, 1982). For example, Bates & Silvern (1977) found that the production of polite requests was significantly correlated with chronological age between 2 and 9 years of age. Andersen (1984) used a role-playing task and found that 4-year-olds used non-polite imperative forms to make requests from a student to their teacher, but 6-year-olds were more likely to use polite requests. Axia & Baroni (1985) did an Italian study where children made a request of an addressee, but the addressee resisted the requests. In this situation, they found that only 9-year-old children increased their use of polite terms. These developmental changes are not just due to production difficulties, because similar changes have been found in judgment tasks. Wagner et al. (2010) found an increase between 3 and 5 years of age in the ability to judge whether polite language should be used with adults or children. Although there is evidence for an early sensitivity to politeness distinctions (Sachs & Devin, 1976; Shatz & Gelman, 1973; Weeks, 1971), the majority of requests/commands in young children do not have politeness marking. For example, Ervin-Tripp et al. (1990) found that children from 3 to 11 years of age only produced less than 23% polite expressions towards the experimenter, even though this addressee was an adult stranger. This literature suggests that children have some difficulty using polite language in an adult-like manner and their use of polite language develops gradually.
In contrast with this gradual development of politeness in European languages, work in Japanese finds early correct use of polite Japanese forms from around the age of 2, especially formula-like greetings which are associated with specific contexts (Burdelski, 2010; Clancy, 1986, 1987; Cook, 1997; Miyoshi, 2000, 2004; Nakamura, 2001a, 2001b). Clancy (1986) and Cook (1997) report that children can use teineigo language appropriately based on an analysis of naturalistic speech data. Fukuda (2005) studied a 3-year-old boy playing with a friend or a family member and found that he adjusted the use of teineigo language according to social roles in the given scenes (e.g., he used appropriate polite forms when he spoke to a shop clerk). In a review of the literature, Nakamura (2001b) argues that by age 3, Japanese children use polite teineigo forms in a contextually appropriate manner.
One explanation for these cross-linguistic developmental differences is the social knowledge that is needed in each language. In English, polite forms are often used to reduce the negative impact of a request on the addressee’s self image. Hence, the use of this language requires the ability to understand that others have minds with their own private emotions/desires and it is known that young children do not have a robust understanding of some of these mental states until around 4 years of age (Theory of Mind, Leslie et al., 2004; Wellman et al., 2001). Support for the link between theory of mind and politeness comes from the fact that the story in the Strange Stories Task (Happé, 1994) that leads to the most errors in autistic children is the one involving politeness. This can help to explain why children learning English take a long time to learn the way that polite language can influence the private emotions in the addressee’s mind. Japanese polite language is different, because it can appear in any sentence, even when that sentence does not request anything of the addressee (Ide & Yoshida, 2017). It is mainly triggered by the addressee’s relationship to the speaker, where teineigo is used for strangers and plain speech is used for children and those that the speaker is close to. The closeness of the speaker and addressee is public information that is available to the speaker, so it does not require special inferences about other minds. It is known that infants are sensitive to the presence of strangers (Ainsworth & Bell, 1970; Takahashi, 1986) and hence this social knowledge can influence their language learning long before they fully understand other minds. In summary, polite forms in English are designed to influence the addressee’s private mental state, but Japanese polite language is mainly conditioned by relatively static features of the relationship between the speaker and addressee.
In this work, we are interested in understanding the early use of polite language in Japanese children in contrast with the slow development in other languages. But since the literature on the development of polite speech in Japanese mostly comes from corpus studies, it is important to first confirm the descriptive findings in an experimental study. Therefore, the first section of this work will be an experimental test to see whether Japanese children actually can use polite language appropriately early in development.
An Experimental Test of Polite Speech in 3- to 6-year-old Japanese Children
In contrast with the studies of naturalistic speech, experimental studies of Japanese children have found that young children do not fully understand/use polite language in adult-like ways. Ikeda et al. (2019) examined how 5- and 7-year-old Japanese children understood the match between teineigo and plain speech and adult/infant addressees. For several measures, they found that adults and 7-year-olds showed an understanding of the match, but 5-year-olds were at chance. Their study was based on a previous study by Wagner et al. (2010), who found that English-speaking 5-year-olds could identify register differences at above chance levels. Another study that used judgments was by Tsuji & Doherty (2014), who looked at polite language in 3- and 4-year-old children. Children heard teineigo and plain Japanese utterances and they had to judge whether the speaker was the polite Anpanman or the rude Baikinman (these characters are well known to Japanese children). After 4 years of age, children were almost always correct, but 3-year-olds were close to chance.
There are several possible reasons why children did not show an understanding of teineigo in these studies. One reason is that they did not provide enough social context for children to understand the appropriate language to use. Both of these studies presented static images to represent different addressees. To address this issue, our study showed videos from a Japanese TV show called hajimete no otsukai ‘First Time Errand’, where young children were sent on an errand to visit a shop and request some product from a clerk. When the child in the video was about to make a request, we stopped the video and asked the participant child to say what the child in the video would have said. These videos provide more information about the context (e.g. store, products) that would help the children to understand the social relationships in the situation. Our study used production as the measure of their knowledge of politeness distinctions, while the previous studies used comprehension measures within judgments or looking paradigms. It is possible that children found it more difficult to use their knowledge of politeness within these novel comprehension tasks. On the other hand, children regularly use politeness in production and previous experimental work has found that young Japanese children are sensitive to the social context when using production measures (Chang et al., 2009).
Another reason for the mismatch between naturalistic speech and experiments may be due to the use of within-participant designs in the experiments. In naturalistic speech analyses, polite and plain language examples come from different dialogues or children, so there may be various factors that influence their ability to use these forms. Experimental designs allow us to test both polite and plain language abilities within the same children. In our study, we did this by creating two sets of videos. Half of these videos depicted a formal context, where a child requested something from an adult stranger in a store context and the correct utterance for the situations would be a polite teineigo expression (e.g., beekon kudasai ‘bacon please’). The other half depicted an informal context, where a child/mother was talking to another child or a family member, and a plain expression would be typical (e.g., ringo choodai ‘apple please’). Both sets of videos were shown to 3- to 6-year-old children. If young children use teineigo for the formal videos and plain forms for the informal videos in this task, then that would show that they do distinguish between these forms and previous null results with young children reflected limitations of the previous tasks used.
Participants were Japanese children recruited from schools in Kyoto, Japan: 30 children from Seiko Kindergarten, 22 children from Soai Kindergarten, and 5 children from Takanogawa Nursery. There were 15 three-year-olds (mean age = 43.8 months), 19 four-year-olds (mean age = 53.5 months), 13 five-year-olds (mean age = 68.5 months), and 10 six-year-olds (mean age = 76.9 months). The sample contained 24 females and 33 males. The study was approved by IRB and written parental consent was obtained for the children tested.
There were six videos depicting request events: three formal and three informal (Table 1, Chang, 2019). The formal videos involved the child talking to a non-familiar adult within a shop setting. The informal videos involved the child talking to a family member or another child.
|Video||Situation||Description|
|Bacon||Formal||a child asking the clerk for a package of bacon|
|Beef||Formal||a child going to a butcher and requesting some beef|
|Bag||Formal||a child asking a clerk for a bag that was left in the store|
|Apple||Informal||a child asking their older sister for an apple|
|Book||Informal||a child asking another child to stop playing with a book|
|Egg||Informal||a mother requesting an egg from a child|
The videos were designed to make sure that the child understood the request event. The sound in each video was replaced with an audio voice-over description in Japanese. For example, for the bacon video, they heard ‘Yookun is going to buy some bacon’ while viewing the store from the outside (Figure 1, panel 1). Next they saw the child Yookun in front of the counter and heard ‘What would he say?’ (Figure 1, panel 2). Then the clerk was shown weighing the bacon and they heard ‘There is the bacon. The clerk is going to bring it’. When Yookun received the bacon, the commentator said ‘He is happy’ (Figure 1, panel 3). Then the video started again and they heard ‘Yookun is going to buy some bacon’ while viewing the store from the outside. As they saw Yookun approach the counter, they heard ‘What would he say to the clerk? What would he say?’ and the video paused on a frame with the child looking at the clerk (Figure 1, panel 2). Japanese speakers often omit arguments instead of using pronouns, so these questions are ambiguous. In addition to being questions about what the child in the video would have said, they can also be understood as asking the participant child what they would have said in the same situation, and this makes the task easier for the child. All of the videos had similar descriptions that explained the goal of each request and the target request was never produced in the voice-over, so the child had to decide what request to make and also determine how polite it should be. The videos were presented in blocks of formal and informal videos (counterbalancing order across children) with one filler video in between the blocks.
The child participants sat with the experimenter in front of a laptop. In the warmup, the participants were asked to describe the actions of several toys. First, they were asked to name a stuffed animal (e.g., teddy bear) and then the animal was shown doing several intransitive actions and they were asked to describe them (e.g., the bear is jumping). Then they were introduced to the video task. They saw the videos which depicted the situation twice and ended at the point when the request utterance should be produced. They were prompted to respond as the child in the video with the phrase nan te iu? ‘what would he say?’. When the child did not produce a request response, the experimenter first asked teninsan ni nan te iu? ‘what would he say to the clerk?’ and finally Yookun wa teninsan ni nan te iu? ‘what would Yookun say to the clerk?’. If a request response was not provided, then they continued to the next trial. At the end of each trial, the experimenter praised the child.
The child utterances were transcribed by native Japanese speakers (data and scripts, Chang, 2019). We first coded whether the utterances were requests that matched the scene, excluding repetitions of the experimenter’s speech (utterances that mismatch the scene are labeled as Other in Table 2). Valid utterances were coded as polite or plain (Table 2). Most of the plain requests were in the form of choodai ‘please give me’ or in the form of verb imperatives (e.g., ringo totte ‘Pick the apple’, kaban kaese ‘Return the bag’). Plain responses also included desire expressions (e.g., tamago hoshii ‘I want eggs’). Polite teineigo request responses were kudasai ‘please give me’ or a verbal compound form such as totte kudasai ‘Please pick it’. A typical non-request response would be the production of the noun alone without any verb. When children provided multiple responses, only the last one was coded. Twenty percent of the utterances were recoded by a second coder and reliability was 93%.
In each of the videos, the target response was a request expression. We first examined the production of these request expressions regardless of whether they were polite or not. The production of a request response (1 = request, 0 = non-request) was predicted using a logistic mixed effect model with centered age in months and situation (effect coded) crossed. There were random effects for participants and videos, and maximal models justified by the design yield no random slopes (Barr et al., 2013). Figure 2 shows the plot of the proportion of requests with age-in-months regression lines for formal and informal situations (the average results for individual children are shown by the dots with jitter added for visibility). Children provided requests in more than half the trials across the age range, intercept = 1.4, z = 2.8, p = 0.006. The ability to produce requests increased with age, β = 0.1, χ²(1) = 12.89, p < 0.001. There was no main effect of situation (p = 0.482) and no interaction of situation and age (p = 0.051). Thus, the children were generally able to understand the situations in the videos and produce a request response rather than an irrelevant response. But children became better at doing this as they got older and this ability did not vary for formal or informal videos.
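The predictor coding described above can be illustrated with a minimal sketch. It is not the authors' analysis script: the function names, the ±0.5 effect-coding values, and the toy ages are assumptions made for illustration, and the mixed model itself is not fitted here. The sketch only shows how age is centered, how situation is effect coded and crossed with age, and how a log-odds intercept maps onto a proportion.

```python
import math

def code_predictors(ages_months, situations):
    """Return (centered age, effect-coded situation, interaction) per trial."""
    mean_age = sum(ages_months) / len(ages_months)
    # Assumed effect-coding values; the paper does not report which were used.
    effect_code = {"formal": 0.5, "informal": -0.5}
    rows = []
    for age, situation in zip(ages_months, situations):
        centered = age - mean_age          # centered age has a mean of 0
        code = effect_code[situation]
        rows.append((centered, code, centered * code))  # crossed term
    return rows

def inv_logit(log_odds):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Toy data (hypothetical ages, not from the study)
rows = code_predictors([40, 50, 60], ["formal", "informal", "formal"])
request_rate_at_mean_age = inv_logit(1.4)
```

With the reported intercept of 1.4, `inv_logit(1.4)` is roughly 0.80, which is in line with the statement that requests were produced on more than half of the trials. In a mixed model this conversion applies to a typical participant and video, so it is only an approximation of the raw proportion.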
The next analysis looked at whether children used polite teineigo expressions for the formal context. In this analysis, the proportion of polite responses out of requests was calculated. Maximal models justified by the design yield no random slopes for participant. Figure 3 shows the children’s rate of polite responses with age-in-months regression lines for each situation. Children produced mostly polite responses in formal situations and mostly plain responses in the informal situations, β = 15.3, χ²(1) = 177.56, p < 0.001. There was no change with age (p = 0.130) nor an interaction of age and situation (p = 0.492). There was a 72% difference in the use of the polite form for formal and informal conditions and most of the children were categorical in matching the form to the situation (the average for each child is shown by jittered dots in Figure 3).
This experimental study provides evidence that even 3-year-old children have a strong distinction between formal and informal registers and this distinction did not change over development. These results differ from the 3-year-olds in Tsuji & Doherty (2014), who were unable to distinguish polite and impolite speech in a judgment task. In the Ikeda et al. (2019) study, 5-year-old children were at chance in a knowledge check of their ability to match polite speech with an adult addressee. There are probably several reasons why children were able to exhibit their knowledge of polite language in our task. We used real-world videos, which provided a rich set of visual cues about the social context, and we used a production measure, which is the normal way that children use politeness distinctions in everyday life. These results support previous naturalistic work showing that polite language appears early in Japanese children (e.g., Fukuda, 2005). We have demonstrated that 3-year-old Japanese children have the ability to use polite language in relatively novel situations (e.g., most 3-year-olds do not go shopping by themselves), but one question is how they acquire this knowledge so early in development. To address this issue, we will look at the input for learning polite language in the next section.
The Input and Use of Polite Speech in Japanese Development
One important difference between English and Japanese polite speech is the input frequency of polite forms. While English polite language occurs mainly on requests, Japanese teineigo can appear in almost any sentence, and therefore, children have potentially a large amount of input for learning politeness distinctions. Hence, the early use of this language in Japanese children may be due to the greater input frequency relative to English. But a problem for a frequency account is that there is a Japanese social norm for parents and children to use non-polite plain speech with each other. If parents and children only use plain speech, then children must learn teineigo from overheard speech. Since children do not spend most of their time around strangers who are talking to other strangers, it is not clear if they have enough overheard input for learning all of the distinct teineigo forms for verbs. It is also likely that children will be exposed to these forms from television, but it is not certain that children can identify the social hierarchy that determines the proper level of speech (a few years difference in age can change the language used). Even when the social levels of the speaker and listener are known, there is a lot of variation in the forms that are considered polite in different situations (Tanaka & Yamashita, 2009). Thus, it is not clear how much input is available for learning about teineigo. Previous studies have examined small data sets of speech using qualitative approaches (e.g., Cook, 1997). The present study will address this issue by performing a quantitative analysis of a large corpus to determine how much polite speech is present in the parental input.
To acquire the appropriate rules for using polite language, children must identify an objective feature of the social context that can be used to condition their own word choices. As mentioned above, teineigo is mainly used with addressees that are strangers, and children can recognize people who are strangers to themselves. But if children hear an utterance produced by their parents, they must determine the intended addressee of this utterance in order to learn which words are teineigo and which are plain forms. The problem is that the intended addressee of an utterance is private information and when there are multiple participants in a conversation, it is not always clear who is the intended addressee. To learn about this distinction, one approach is to use the person who responds to an utterance as the intended addressee. Although this is not a perfect cue, it might be a useful probabilistic cue for learning to distinguish teineigo/plain forms. Thus, our corpus study examined whether, using the next speaker as a probabilistic cue to the intended addressee, children receive enough addressee-specific input from their parents to learn how to use polite language in Japanese.
In the present work, we performed a quantitative analysis using a probabilistic measure of the intended addressee (data and scripts, Chang, 2019). This analysis used the text from the recorded Japanese corpora in CHILDES where both children and other speakers were transcribed (Hamasaki, 2002; Ishii, 1999; Miyata, 2000, 2012; Miyata & Nisisawa, 2009, 2010; Nisisawa & Miyata, 2009, 2010; Okayama, 1973). This large combined corpus used seven individual children as well as the multiple children in the Okayama corpus. Table 3 summarizes the total number of utterances for all speakers in each corpus and the age range of the target child for each corpus.
In this work, we used the subsequent speaker in the dialogue as a proxy for the intended addressee of the previous utterance. The LUCID Language Researcher’s Toolkit version of the CHILDES corpora was used, so each utterance was paired with the age of the child and participant information such as a code for each speaker (Chang, 2017). When the speaker code changed, the next speaker code was used as the addressee for all of the previous utterances by the same previous speaker (the last speaker in a file has no addressee code). Since polite speech should depend on the age of the next speaker and their familiarity, we created four role/addressee categories: Target Child, Other Children, Parents, and Other Adults. The Target Child category had the utterances of the target child speaker and the Parents category contained the mother and/or father of that child. Other roles that were children were classified as Other Children (e.g., sister, cousin) and others were classified as Other Adults (e.g., grandparents, investigator). This is not ideal as it collapses familiar and less familiar speakers, but since the Target Child and Parents made up the bulk of the utterances, it was necessary to collapse all other speakers into these two categories to ensure that there were a sufficient number of utterances in each combination of addressee and speaker role. Non-human roles or roles where the age was unknown were excluded (e.g., toys, unidentified). The number of utterances for each speaker and addressee role is summarized in Table 4. Since the majority of utterances are produced by the Target Child and the Parents, our analyses will focus on these speakers.
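The next-speaker rule described above can be sketched as follows. This is an illustrative reconstruction rather than the authors' script: the function name `assign_addressees` is hypothetical, and the example uses CHILDES-style speaker codes (CHI, MOT, FAT) as toy input. Each run of consecutive utterances by one speaker receives the code of the speaker who talks next, and the final run in a file receives no addressee.

```python
from typing import List, Optional, Tuple

def assign_addressees(speakers: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each utterance's speaker code with the next different speaker code."""
    addressees: List[Optional[str]] = [None] * len(speakers)
    next_speaker: Optional[str] = None
    # Walk backwards so every utterance in a same-speaker run shares
    # the addressee found at the end of that run.
    for i in range(len(speakers) - 1, -1, -1):
        if i + 1 < len(speakers) and speakers[i + 1] != speakers[i]:
            next_speaker = speakers[i + 1]
        addressees[i] = next_speaker
    return list(zip(speakers, addressees))

# Toy sequence of speaker codes from one file (hypothetical data)
pairs = assign_addressees(["MOT", "MOT", "CHI", "MOT", "FAT", "FAT"])
```

In this toy sequence, both initial mother utterances are treated as addressed to the child, and the father's final utterances have no addressee. Mapping speaker codes onto the four role categories would be a separate lookup step that depends on the participant information in each corpus.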
|Corpus||Child||#Utterances||Starting Age||Ending Age|
|Target Child||Parents||Other Children||Other Adults|
To understand why 3-year-old children used polite language appropriately in the experiment, the use of plain and teineigo levels for verbs was examined. This included present tense Japanese verb forms (e.g., the verb ‘eat’ can be expressed as teineigo tabemasu or plain form taberu). Japanese marks negation on verbs, so the Japanese plain equivalent of ‘do not eat’ is tabenai and the teineigo version is tabemasen. To place these in the past, we have plain forms tabeta ‘have eaten’ and tabenakatta ‘have not eaten’ as well as respective teineigo forms like tabemashita and tabemasendeshita. The copula ‘to be’ can also occur in a teineigo version desu as well as in the past tense deshita (the plain forms are da and datta). Finally, a request can be made using the teineigo word kudasai. All utterances that were not labeled as having teineigo verb forms were treated as plain.
To find the teineigo forms in the corpora, we applied regular expressions to search through the 807,272 utterances for the strings masu, masen, mashita, desu, deshita, or kudasai followed by a space (all utterances had an extra space at the end). The list of matched forms was checked by a native Japanese speaker and words that were not teineigo verb forms were excluded. It is typical for only predicates to carry polite endings, so if an utterance had one of these teineigo markers, the whole utterance was considered to be a teineigo utterance. All other utterances were plain utterances. There are many frequent short utterances that act as responses or backchannels (aizuchi). To focus our results on utterances which could be modified for politeness, we listed the 150 most frequent utterances in the corpus and removed any short utterances which were unlikely to be modified with politeness markers (e.g., hai ‘yes’ is unlikely to be modified with the polite form desu). This led to the removal of 71 utterance types (e.g., un, ne, hora, jaa, baibai, yoisho) for a total of 241,440 utterance tokens (Tables 3 and 4 report the numbers after these forms have been removed).
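The marker search described above can be sketched with a single regular expression. This is a minimal illustration, assuming romanized utterances; the names `TEINEIGO_PATTERN` and `is_teineigo` are hypothetical, and the sketch omits the native-speaker check and the removal of frequent backchannel utterances.

```python
import re

# The six markers named in the text, each followed by a space.
TEINEIGO_PATTERN = re.compile(r"(?:masu|masen|mashita|desu|deshita|kudasai) ")

def is_teineigo(utterance: str) -> bool:
    """Return True if the utterance contains a teineigo marker before a space."""
    # The corpus utterances carried a trailing space; add one if it is absent
    # so that utterance-final markers are also found.
    if not utterance.endswith(" "):
        utterance += " "
    return TEINEIGO_PATTERN.search(utterance) is not None

# Example forms taken from the text above
labels = {u: is_teineigo(u) for u in ["tabemasu", "taberu", "tabemasendeshita", "nanji desu ka"]}
```

Because the pattern matches substrings, a non-verb word that happens to end in one of these strings would also be flagged, which is presumably why a manual check of the matched forms was needed.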
Since the social norm is for parents to use plain speech, the first issue that we examined is how much of the parental input is polite speech. The second issue is whether the polite speech varies in terms of the addressee, since this type of variation would be a useful cue for children to learn that word choice is conditioned by non-linguistic social factors. To examine these issues, we examined whether the utterances produced by the parents (mother and father) varied with different addressees (Target Child, Other Children, Other Adults, and each other – Parents). The third issue was whether the children used teineigo appropriately in these corpora, as has been found in previous studies. To examine this in our corpus, we looked at the addressee-specific use of polite language produced by the Target Child. Finally, since we are interested in development, we plot the parents’ and children’s data against the age of the target child. This allows us to see whether the parental input changes depending on the age of the child (e.g., babytalk should become less common as children get older) and also whether the child’s abilities increase with development. Figure 4 shows the proportion of teineigo utterances for Parent and Target Child speakers (left and right panels, respectively). Since polite language could vary with the age of the child, we will include age in our analyses and our figures match this analysis by including age-dependent regression lines that vary with each addressee and speaker.
Since the addressees differed for each speaker role, we used separate logistic regressions to predict the production of the teineigo utterances (teineigo = 1, plain = 0) in the Parents and Target Child (left panel in Figure 4). For the Parents speaker role model, we included centered age in months (mean age was subtracted from age to make the mean age equal to 0) crossed with a Helmert-coded addressee variable. The Helmert-coded variable created three contrasts, which compared Other Adults against Parents, adults (Parents and Other Adults) against Other Children, and the Target Child against all other addressees. Overall, there was a gradual reduction in the use of teineigo as the target child got older, β=-0.029, z=-3.7, p<0.001. The utterances to Other Adults addressees had significantly more teineigo forms than those between the parents, β=-0.43, z=-3.1, p=0.002, and this difference significantly increased with development, β=-0.038, z=-2.4, p=0.015 (the Other Adult and Parent lines diverge with age in the left panel of Figure 4). Teineigo use with adult addressees was more common than to Other Children, β=-0.41, z=-7.8, p<0.001, and this did not change with the age of the child (p=0.601). Finally, the teineigo form utterances were less common with the Target Child compared to the average of the other three addressee types, β=-0.13, z=-5.2, p<0.001, but they increased towards the Target Child as they got older, β=0.0096, z=3.7, p<0.001. Overall, only 4% of the parents’ utterances were in teineigo form and this is consistent with the social norm that parents and children talk to each other with plain speech. The parents used teineigo more with other adults compared to each other, and this is consistent with the idea that teineigo should be used towards adult strangers. The parents also used teineigo more with adults than children, which is consistent with the norm that plain speech is used with children, regardless of their closeness to the speaker.
The parents did not use teineigo with the target child initially compared to the other addressees, but they increased the use of teineigo with this child over development. The reason for this is not clear, but it may be due to a desire to expose the target child to these forms. Overall, the next speaker was a useful measure for identifying the addressee-specific nature of teineigo forms in parental speech.
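The predictor coding used in the Parents model can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis script: the level order, the sign convention, and the unscaled integer weights are all assumed, and the ages are hypothetical. Only the three comparisons themselves and the mean-centering of age come from the text.

```python
# Addressee levels in an assumed order: each later level is compared
# with the mean of all levels before it (Helmert-style contrasts).
LEVELS = ["Parents", "Other Adults", "Other Children", "Target Child"]

def helmert_contrasts(levels):
    """Map each level to its k-1 contrast weights (unscaled; signs are an assumption)."""
    k = len(levels)
    table = {}
    for i, level in enumerate(levels):
        row = []
        for j in range(1, k):      # contrast j: level j versus levels 0..j-1
            if i < j:
                row.append(-1)     # one of the earlier levels being averaged
            elif i == j:
                row.append(j)      # the level being compared
            else:
                row.append(0)      # later levels are not part of this contrast
        table[level] = row
    return table

def center(values):
    """Subtract the mean so that the mean of the result is 0."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

contrasts = helmert_contrasts(LEVELS)
centered_age = center([24, 30, 36])   # hypothetical ages in months

for level in LEVELS:
    print(level, contrasts[level])
```

With this ordering, the first column compares Other Adults with Parents, the second compares Other Children with the two adult groups, and the third compares the Target Child with all other addressees. The direction of each reported coefficient depends on the sign convention actually used, which this sketch does not attempt to recover.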
To examine how the target child responded to this parental input, we fit a separate Target Child speaker role model with centered age in months crossed with a Helmert-coded addressee variable (right panel of Figure 4). The Helmert coding created one contrast for Other Adults versus Parents, and another contrast for adults versus Other Children. Of the Target Child’s utterances, teineigo forms were produced more as the child grew older, β=0.019, z=3.1, p=0.0021 (age correlation=0.19). Teineigo forms were used more with the Parents than with Other Adults, β=0.4, z=4.9, p<0.001, and this did not change with age (p=0.920). There was no difference in the use of teineigo between adults and Other Children (p=0.179) and this did not change with age (p=0.238). This analysis demonstrated that the target children did not produce teineigo utterances in an adult-like manner. There is a social norm that plain forms are used within the family and the corpus results match this norm (2% contained teineigo forms). However, when the target children used teineigo forms, they tended to use them with their parents rather than with other adults. Although this addressee-specific pattern is not adult-like, the overall use of teineigo forms by children is related to their parents’ use of these forms (a medium correlation of 0.55 between the parents and the target child proportion of teineigo utterances in each recording session/file), which suggests that children are mimicking the polite forms that they hear their parents use.
One finding was that there was substantial variability in the use of polite forms. For example, one mother in our corpus said to her child sonna boo de shicha ikemasen ‘don’t play with that stick’ using a teineigo verb ikemasen (Okayama, 31 months). Since the surrounding expressions are in plain form (e.g., the following utterance by the mother is the plain utterance omimi itaiitai nattara oishasan e ikannan ne de ‘you’ll have to go to see a doctor if you get your ears hurt’), it is not clear from the context why she used teineigo for only one utterance. The teineigo form shicha ikemasen ‘don’t do it’ occurred only 3 times overall in the corpora, while the plain form shicha dame occurred 74 times. The children also produced polite forms, as in a case where the child said ninjin dekimashita ‘I made a carrot’ (MiiPro, 40 months). This expression was surrounded by plain expressions by the mother and the child (e.g., ninjin da yo ninjin ‘here is a carrot, a carrot’), so it is not obvious why the child used a teineigo verb form with their mother. Thus, compared to the small samples that were studied previously (e.g., Cook, 1997), a large collection of parent-child speech provides more cases that appear to violate descriptive generalizations and this motivates a probabilistic approach (see Tatsumi et al., 2020 for similar variability in the acquisition of Japanese morphology).
In summary, the present analysis examined the input that might support the robust distinction between teineigo and plain verb forms in young Japanese children. It was determined that parents provide evidence for learning the distinction in these forms, if children are sensitive to the next speaker as a probabilistic cue to the intended addressee. This evidence is only useful if there are a substantial number of polite forms produced in the presence of children. Previous studies examined only a small number of utterances and did not characterize how frequently these occurred in child-directed speech (e.g., Cook (1997) analyzed only 196 polite tokens). In contrast, our analysis examined 311,933 parental utterances and found 11,936 polite tokens. These utterances come from 477 recording sessions which were less than 2 hours on average. If we assume that children hear more than 4 hours of speech a day, then this database is only a subset of the utterances that a child might hear in one year. When we counted the number of masu, masen, and mashita forms, there were 825 unique verb forms. Thus, even though teineigo speech is only a small part of the input, the fact that parental input is very large means that Japanese children hear a substantial number of verbs in teineigo form before 3 years of age, and this can help to explain why they appear to acquire these forms fairly early in development.
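The arithmetic behind this summary can be checked with a short sketch. The counts are the ones reported above. Treating every session as a full 2 hours gives an upper bound on recorded time, and 4 hours a day is the lower bound assumed in the text. The variable names are illustrative only.

```python
parental_utterances = 311_933
polite_tokens = 11_936
sessions = 477
max_hours_per_session = 2   # sessions were under 2 hours on average
hours_per_day = 4           # assumed lower bound on daily speech exposure

# Share of parental utterances that were polite (about 4%).
polite_share = polite_tokens / parental_utterances

# Upper bound on recorded hours versus a lower bound on one year of exposure.
recorded_hours_upper = sessions * max_hours_per_session
yearly_hours_lower = hours_per_day * 365

print(f"{polite_share:.1%}")                      # 3.8%
print(recorded_hours_upper, yearly_hours_lower)   # 954 1460
```

Even under these generous assumptions, the whole database covers fewer hours than a single child would hear in one year, so the 11,936 polite tokens are likely to underestimate a child's exposure before age 3.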
The Role of Honorific and Diminutive Titles in the Development of Politeness Distinctions
The previous verb analysis demonstrated that Japanese children overhear sufficient teineigo forms in their input to support their use of polite language in production. But polite language is not restricted to verbs and it is likely that their knowledge of politeness distinctions is shaped by all of the forms in their input. One salient set of politeness distinctions is the titles that are applied to nouns. In many languages, polite honorific titles tend to be used with polite verbs (e.g., English Mr. Smith, may I take your coat). Japanese has the honorific title san, which is placed after a noun in a way that is similar to the titles ‘Mr’ or ‘Ms’ in English (e.g., Smithsan ‘Ms. Smith’). In contrast with the polite titles, many languages have informal diminutive titles that are used to show intimacy. Japanese has the diminutive titles chan and kun for young, familiar, or cute referents. Kun tends to be for males, but chan can be used for both genders (young girl Shizuka could be called shizukachan and a young boy Nobita could be called nobitakun or nobitachan). Adult family members other than parents can also be referred to with chan (obaachan ‘grandma’). Cute referents might also get the chan title like hamuchan ‘little hamster’. Jurafsky (1996) proposed a universal radial category of diminutives based on 60 languages, where diminutives are associated with children, smallness, or femaleness. While Japanese diminutives are generally consistent with this theory, there are also many exceptions. For example, Arnold Schwarzenegger is a famous adult male actor and would normally be called with the honorific title san, but instead he is given a diminutive title Shuwachan. There is a social norm that male and female politicians are given the diminutive title kun within the Japanese legislature, even though honorific san is used when they are outside of that context.
Also, Japanese is unusual in that inanimate objects are sometimes given honorific titles in child-directed speech (e.g., ninjinsan ‘Mr. Carrot’). Thus, although there are various exceptions, there is a basic distinction in titles between polite san and informal diminutive chan/kun.
Here a similar analysis was performed on noun titles using the same probabilistic measure of the intended addressee. To examine the use of these terms, we collected all words which ended with san, chan, or kun in the corpus. Since the meaning associated with these terms cannot be explained in English with a single word, we will use “Mr” as the gloss for honorifics and “little” as the gloss for diminutives, but the reader should remember that these English glosses do not cover the full meaning of these terms in Japanese (in particular, san and chan can be used for either gender and do not specify marriage status). Furthermore, some words with these endings have been grammaticalized as fixed forms. For example, honyasan appears to be a title like ‘Mr. Bookstore’, but actually it is a lexical item that is used generically to refer to the job of a bookseller or the bookstore itself. To avoid including these lexically fixed forms, we extracted the stems that occurred at least once with polite san and at least once with plain chan/kun. These are the stems where there is a choice between these two forms in our corpus and hence our analysis only examined words that were not lexically assigned to a particular title in our corpora. We then searched for these alternating forms and counted the number of times that they occurred in each utterance. We then summed the number of these forms by age of the child, addressee, and speaker role for each recording session/file and computed the proportion of san forms out of san, chan, and kun forms. This creates a proportion of the forms which are san forms for particular speakers/addressees at different points in development.
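The selection of alternating stems and the san proportion can be sketched as follows. This is a simplified illustration, not the authors' script: the function names and the toy word list are hypothetical, the words are assumed to be romanized and already tokenized, and the grouping by session, age, addressee, and speaker role is omitted.

```python
import re
from collections import defaultdict

# A word is split into a stem and a final title (san, chan, or kun).
TITLE = re.compile(r"^(.+?)(san|chan|kun)$")

def split_title(word):
    """Return (stem, title) for a titled word, or None if no title is found."""
    m = TITLE.match(word)
    return (m.group(1), m.group(2)) if m else None

def alternating_stems(words):
    """Stems seen at least once with san and at least once with chan or kun."""
    titles_by_stem = defaultdict(set)
    for w in words:
        parts = split_title(w)
        if parts:
            titles_by_stem[parts[0]].add(parts[1])
    return {
        stem
        for stem, titles in titles_by_stem.items()
        if "san" in titles and bool(titles & {"chan", "kun"})
    }

def san_proportion(words, stems):
    """Proportion of san forms among san/chan/kun forms of the given stems."""
    san = other = 0
    for w in words:
        parts = split_title(w)
        if parts and parts[0] in stems:
            if parts[1] == "san":
                san += 1
            else:
                other += 1
    total = san + other
    return san / total if total else None

# Toy data: honyasan and umasan never alternate here, so they are excluded.
words = ["nekosan", "nekochan", "nekochan", "honyasan", "umasan", "happa"]
stems = alternating_stems(words)
proportion = san_proportion(words, stems)
print(stems, proportion)
```

In the full analysis the same proportion would be computed separately for each combination of recording session, speaker role, and addressee before being entered into the regressions.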
The children in the corpora were still acquiring the language and sometimes produced overgeneralizations. For example, although Winnie-the-Pooh is called Puusan by Japanese adults, three different children in the corpus referred to him as Puuchan. This required that they separate Puu from san in their input and then replace san with chan. Another example of these overgeneralizations from our corpora is yuubinyachan ‘little postman’, which for adults can only be yuubinyasan and refers generically to the job of a postman. These overgeneralizations suggest that children are not just memorizing the forms in their input, but also generating novel forms from a rule. But the learning of this rule is made more complicated by the inconsistent input from their parents.
The use of honorifics/diminutives does not match the use of teineigo/plain verb forms. For example, children normally use plain verb forms with family members and diminutives like chan for people and animals which are close to the child (e.g., ojiichan ‘grandpa’, nekochan ‘little cat’). But parental names are an exception in that san forms tend to be dominant, even when parents are female and close to the child (e.g., okaasan ‘mother’ is more common than okaachan ‘mommy’). These preferences can vary by family, so some families will prefer the chan form (e.g., otoochan ‘daddy’ to otoosan ‘father’). Therefore, our first analysis will be on the names for the mother (okaasan, kaasan, mamasan, okaachan, kaachan, mamachan) and the father (otoosan, toosan, papasan, otoochan, toochan, papachan). In our corpora, most of the recording sessions had more parent names with san over chan, but there were some parents or sessions with more chan than san. Although there were these biases for different families, there were 118 files where parents used both san and chan forms for parent names (using both okaachan and okaasan). As there is variation in the use of these parent names, our analysis examines whether the parental utterances distinguish the usage of san and chan with respect to the addressee and whether the target child acquires a similar pattern over development. There were only a few cases where titles were used between parents in these corpora, and this made it difficult to fit the parent contrasts and hence the speech between parents was excluded from the analysis. Figure 5 shows the proportion of san parent names for Parent and Target Child speakers (left and right panels, respectively). Age-dependent linear regression lines are shown for each combination of addressee and speaker.
Linear regressions were used to fit the proportion of parent-produced san parent name forms (out of all san, chan, and kun forms) with centered age and Helmert-coded addressee (left panel of Figure 5). Overall, the parental use of san forms did not increase with age (p=0.405). Utterances to Other Adults had more parental san forms than those to Other Children, β=-0.12, t=-2.5, p=0.011, but this difference reduced with the age of the child, β=0.01, t=2.7, p=0.0072. Utterances to the Target Child were not different from the other addressees (p=0.945), but san forms became fewer as the Target Child got older, β=-0.0046, t=-3.3, p<0.001. Thus the pattern of polite term use with nouns differs from the pattern with verbs. Parents preferred to use polite san forms rather than chan for parent names (83% were san parental names). Although there was a slight reduction of san forms for the target child as they got older, the overall pattern converged to mainly using san for parent names for most addressees.
For the Target Child utterances (right panel of Figure 5), there was a lot of variability, and the use of san parent name forms did not significantly increase as the child got older (p=0.102). There was no difference in the use of these san forms for Parents and Other Adults (p=0.123), and this did not change over development (p=0.680). The use of these san forms was higher for Other Children compared to adults, β=0.11, t=2, p=0.044, but this did not change over development (p=0.734). The Target Child preferred to use san forms for parent names in general (60% were san forms) and used san parent names more with other children. There was a large correlation of 0.61 between the parent and target child’s proportion of san terms for parent names in each recording session.
While English also makes distinctions in politeness with parent names (e.g., daddy, father), Japanese allows these titles to be applied to all sorts of words (e.g., animals, vegetables), especially in child-directed speech. One example from the corpus is the sentence umasan wa ne ninjinsan toka happa toka ga suki na n da tte ‘I heard that Mr. Horse likes Mr. Carrot and leaves’, where the mother refers to a horse as umasan ‘Mr. Horse’ and carrots as ninjinsan ‘Mr. Carrot’ (even though carrots are being discussed here as food for the horse), but she does not use a title for happa ‘leaf’ (MiiPro, 35 months). Other animals like cats are often referred to with chan, but even then there is quite a bit of variability. One mother refers to a cat with a san form (honto ni nekosan mitai da ‘really like Mr. Cat’) and then uses a chan form three sentences later (nekochan ita ‘there was the little cat’, MiiPro, 36 months). A large number of non-parent-name san, chan, and kun titles are applied to the names of children/adults. Since the use of titles for these non-parent-name words is complex, we performed an analysis on the parental utterances and the target child’s output. Figure 6 shows the proportion of san forms for non-parent-name words for Parent and Target Child speakers (left and right panels, respectively). Since these forms are a form of babytalk, we included the age of the target child in the analysis (corresponding age-dependent linear regression lines are shown for each addressee within each speaker in Figure 6).
Linear regressions were used to fit the proportion of parent-produced non-parental san forms with centered age and Helmert-coded addressee (left panel of Figure 6). The proportion of san forms decreased over age, β=-0.0032, t=-2.4, p=0.015. There was no main effect of the difference between Other Children and Other Adults (p=0.326) and no interaction with development (p=0.060). Utterances to the Target Child contained more san forms when compared against the other addressees, β=0.03, t=3.3, p<0.001, but this reduced over development, β=-0.0021, t=-2.9, p=0.0044 (the regression line for the Target Child in Figure 6 has a negative slope). While parent names are predominantly used with san, non-parent-name words are biased for chan (66% were chan forms). Although we only examined forms that appeared with both san and chan in our corpora, the majority of these non-parental terms were children’s names that were mainly labeled with chan. The majority of the san terms in this analysis were animal, vegetable, and inanimate referents. The fact that the parents use san forms mainly towards their own child and other children, rather than adults, may reflect the fact that these san forms were a type of babytalk and that would explain why san forms became less common as the children got older.
For the Target Child utterances (right panel of Figure 6), there was an increase in the use of san with non-parent-name words, β=0.0057, t=2.5, p=0.015, over development. More san forms were used with Parents than with Other Adults, β=0.099, t=4.1, p<0.001, and this did not change over development (p=0.368). There was no difference between forms used with Other Children and adult addressees (p=0.316), and this also did not change with the age of the child (p=0.184). There was a medium correlation of 0.39 between the parent and target child’s proportion of san terms for non-parent-name words used in each recording session. The target child preferred to use chan for non-parent-name words (73% were chan forms). Since most of the referents here were not adult strangers, the san titles were for animals, vegetables, and inanimate objects and it makes sense that children would use these titles only when talking to their parents, rather than other adults.
This analysis suggests two quite different patterns in the use of honorific and diminutive titles. Honorific san is the dominant term used with parent names, while diminutive chan is the dominant term used for other terms. This is due to the fact that many of these non-parent-names are children’s names, names for family members (e.g., oniichan ‘older brother’, obaachan ‘grandma’), and pets (e.g., nekochan ‘kitty’), where the referent is close to the speaker. The developmental changes were influenced by the use of san terms for referents that were not humans or pets. For example, honorific san titles are given to animals, foods, machines, and colors (e.g., zoosan ‘Mr. Elephant’, kanisan ‘Mr. Crab’, arisan ‘Mr. Ant’, tamagosan ‘Mr. Egg’, ringosan ‘Mr. Apple’, hikookisan ‘Mr. Airplane’, pinkusan ‘Mr. Pink’). While western parents are also willing to anthropomorphize imaginary inanimate objects (e.g., Thomas the Tank Engine), Japanese adults also anthropomorphize common everyday objects. Burdelski (2013) has documented how Japanese adults sometimes apologize to real-world inanimate objects like flowers as a way to teach children about affective interactions with others. Hayashi et al. (2009) report a case where a teacher encourages a child to eat his carrots by talking about the carrot’s feelings (i.e., Poor Mr. Carrot. Since Mr. Hamburger, Mr. Rice, and Mr. Orange have been eaten, he is all alone). Hatano et al. (1993) found that Japanese children between kindergarten and fourth grade were more likely than US and Israeli children to ascribe sensory abilities (e.g., can feel pain, can feel cold) to inanimate objects like rocks. Using polite language for inanimate objects may help to teach Japanese children that polite language is not based on avoidance of negative consequences of impolite speech, but rather its use reflects social norms in Japanese society.
While the verb corpus analysis did not provide strong evidence that the child understood the social nature of polite verb forms, the noun analysis provided evidence that the children varied in their use of polite forms. For parent words, the Target Child preferred to use honorific san forms (60%), but for other referents, they preferred to use diminutive chan/kun forms (73%). Also, parents and children tended to use san with wild animals, vegetables, and inanimate objects, but not pets. This pattern may be influenced by the uchi/soto distinction, where polite language is used for out-group members, but plain language is used to refer to in-group members (uchi “inside”, soto “outside”, Maynard, 1997). What is useful about this distinction is that it is not just about the relationship between the speaker and the listener, but also how both of these people are related to the topic of discussion. For example, when the child is talking to the parent about wild animals and inanimate objects, then these referents are treated as outside the family group and are therefore labeled with san (e.g., hatosan “Mr. Pigeon”, Burdelski, 2017). When talking about pets or other family members, then the child and parent can treat them as inside of the family group, and therefore chan/kun can be used. But when the child refers to the parent directly, then the parent becomes an out-group member with respect to the in-group composed of the child themselves and the child uses san to show respect. It is not clear why parents create this complex pattern of titles for a wide range of objects, but one possibility is that it exposes children to the idea of a social hierarchy and this helps later in the learning of verb politeness distinctions. 
Support for this idea comes from evidence that diminutive titles appear to ease the learning of other aspects of grammar in languages like Russian, Serbian, and Lithuanian (Kempe et al., 2003; Savickienė et al., 2009; Savickienė & Dressler, 2007; Ševa et al., 2007).
In many languages, the ability to use polite forms grows slowly over development (e.g., Bates & Silvern, 1977). In contrast, descriptive studies of Japanese parent-child interactions have suggested that Japanese children acquire politeness distinctions early in development (e.g., Clancy, 1986). However, experimental work in comprehension has found that Japanese children below 4 years of age do not appear to have an understanding of polite language (Ikeda et al., 2019; Tsuji & Doherty, 2014). Here, we address this mismatch between the experimental and descriptive studies by using a production study with rich video contexts. We found that 3- to 6-year-old Japanese children used mostly teineigo forms for the formal videos and plain speech for the informal videos. This suggests children from around 3 years of age understand how teineigo should be used and the lack of an effect in previous studies was due to the methodology used.
One reason that teineigo can be acquired by children younger than four is that it does not depend on mind-reading abilities (Leslie et al., 2004) and instead is signaled by public features of the context, such as the familiarity of the addressee (Ide & Yoshida, 2017). But to learn how particular forms are used, one must experience these forms in the input. There is, however, a social norm that parents should use plain speech with each other, which suggests that there may not be enough polite input to acquire its usage pattern. Nevertheless, by examining a large corpus of 565,832 utterances, we discovered that, while Japanese parents only produced a small proportion of polite utterances around children, this input was substantial and covered a range of forms that could help to explain the early acquisition of this ability.
In addition to exposing children to these teineigo forms, another question is whether the parental input helps children to learn the addressee-specific nature of these forms. Again the parental input is not an obvious place for children to learn the correct mapping of these forms, because any teineigo utterances that are directed at children are teaching them the wrong mapping (plain forms should be used towards family members and children). In addition, it is not always clear who is the intended addressee of any particular utterance. In our analysis, we used the next speaker as the intended addressee and found that this was a probabilistic cue that could be used to learn the appropriate mapping of addressees to verb forms. If children are sensitive to this cue, then statistical learning mechanisms that have been found to be used by infants and children (Gómez & Gerken, 2000; Saffran et al., 1996; Twomey et al., 2014, 2016) would be sufficient to acquire the addressee-specific nature of polite speech.
Overall in the corpus analysis, there were no strong effects of the intended addressee in the Target Child utterances. This is probably due to the limited nature of the contexts that were available in the corpora. Descriptive studies that have found evidence for polite language have found it in role-play situations (e.g., Fukuda, 2005). Since we coded the intended addressee based on the actual speaker, rather than the imagined character, our analysis was unable to capture these situations. Our experimental study showed that children used polite speech appropriately to clerks in stores, but these kinds of real-world contexts were also not present in corpora based on parent-child recordings in a home setting. Although the contexts in these corpora may have limited the ability to see politeness distinctions in the child speech, we did find that the Target Child varied the use of honorific and diminutive titles for different referents. This distinction may help the child to learn that there are different social categories that need to be considered in selecting word forms and this may be one of the functions of using these titles in child-directed speech.
In Japanese, there is an expression kuuki o yomu ‘read the air’ which refers to the ability to read a situation and use appropriate language for the context. There is a sense in which this metaphor is very appropriate, because we have found the rules that govern polite language use in Japanese are, like air, difficult to see or read. We found that parents sometimes randomly produced teineigo towards their children in the middle of conversations that used plain forms. Adults and children produced forms that are unacceptable for some native speakers (e.g. yuubinyachan ‘little postman’) and this shows that Japanese speakers are using rules that create overgeneralizations. Parents and children both marked parental name words with san, but this sometimes varied within the same conversation. In addition, there is a range of anthropomorphization of inanimate objects in child-directed speech. While this variation may expose children to polite forms, it may also complicate the learning of the correct adult usage of these forms. In this work, we have identified aspects of the parental input that can help to explain the early use of polite speech in children, but more research is needed to fully understand this complex system.
Contributed to conception and design: FC, TT, HH, NO
Contributed to acquisition of data: FC, TT, HH, MY, NO
Contributed to analysis and interpretation of data: FC, TT
Drafted and/or revised the article: FC, TT
Approved the submitted version for publication: FC, TT
We would like to thank the KYOTO Design Lab at the Kyoto Institute of Technology who provided the main support for this study. Franklin Chang and Tomoko Tatsumi are members of the Economic and Social Research Council (ESRC) International Centre for Language and Communicative Development (LuCiD), and the support of the ESRC [ES/L008955/1] is gratefully acknowledged. Franklin Chang is also supported by KAKEN Grant 19K12733 from the Japanese Society for the Promotion of Science.
Data accessibility statement
All the stimuli, presentation materials, participant data, and analysis scripts can be found on this paper’s project page at http://osf.io/jgmxz
There are no competing interests in this work.
We would like to thank Nobuko Anan, Yusuke Suetsugu, Eri Takashima, Mariko Maeda, Atsushi Hirota, Chie Fukada, Kristine Onishi, Alia Martin, Amelie Bernard, and Andrew Jessop for help and discussions about the study. We would also like to thank the staff and children in the three kindergartens: Seiko Kindergarten, Soai Kindergarten, and Takanogawa Nursery.