Linguistic convergence – the alignment of your linguistic choices with those of your conversation partner – is a well-known phenomenon. An unresolved question about linguistic convergence concerns whether there are individual differences in convergence. Do some people converge with their partner more reliably than others, and do some people elicit convergence more reliably than others? We explored whether some speakers elicit a type of linguistic convergence known as structural priming more reliably than others. Our experimenters called businesses and asked questions such as, (At) what time do you close? We demonstrated a structural priming effect (e.g., participants were more likely to use a preposition in their response when the question contained a preposition). We also found that some speakers elicited stronger priming effects than others, and that this tendency was reliable. Our findings suggest that there are reliable individual differences in the extent to which speakers elicit structural matching from their partners.
When people interact, their linguistic behavior tends to converge (e.g., Giles et al., 1991; Pardo, 2018). Convergence has been documented across levels of linguistic analysis: speech features (e.g., Cohen Priva & Sanker, 2020; Ostrand & Chodroff, 2021), lexical choices (e.g., Brennan & Clark, 1996; Tobar-Henríquez et al., 2020), and syntactic choices (e.g., Branigan et al., 2000; Weiner & Labov, 1983). Pickering and Garrod (2004) suggest that the cognitive processes involved in language processing drive linguistic convergence. For example, if your conversation partner uses a passive construction (The cat was chased by the dog), processing the construction leads to activation of the passive structure in your language processing system. The activation makes it more likely that you will subsequently use a passive construction yourself (e.g., Pickering & Branigan, 1998). There is also evidence that social factors affect the degree to which speakers converge. As one example, Giles et al. (1991) suggest that speakers converge with their partners as a way of signaling affiliation and diverge from their partners as a way of signaling disaffiliation.
Cohen Priva and Sanker (2020) argue that our understanding of the processes involved in linguistic convergence can be advanced by examining individual differences in convergence. They suggest that individual differences in the tendency of speakers to converge with their partners point to the role of cognitive and/or perceptual processes in linguistic convergence. For example, people with a stronger tendency to use the same cues during speech processing and speech production may show a different tendency to converge than those who do not weigh the same cues equally across processing and production (see Schertz & Clare, 2020, for a discussion of cue weighting in speech processing and production). Cohen Priva and Sanker (2020) further suggest that individual differences in the tendency for speakers to elicit convergence from their partners might point to the role of social processes in linguistic convergence. For example, prior work has shown that speakers tend to converge more to partners who are perceived as higher status (e.g., Gregory & Webster, 1996) or whose voices are perceived as more attractive (e.g., Babel et al., 2014).
A key step toward using individual differences as a vehicle for studying linguistic convergence is establishing the presence of such differences. The overall picture that emerges from the literature points to the existence of individual differences in linguistic convergence. Sanker (2015) reports individual differences in the tendency for speakers to converge with the speech features of their partners, though Cohen Priva and Sanker (2020) find no evidence for such effects. Tobar-Henríquez et al. (2020) report that there are reliable individual differences in the tendency for people to converge with their partner in the choice of lexical items. Several published reports document individual differences in speakers’ tendency to elicit convergence on speech features from their partners (e.g., Babel et al., 2014; Cohen Priva & Sanker, 2020; Pardo et al., 2017). The evidence suggests the existence of individual differences in linguistic convergence, but the extant literature suffers from important shortcomings. As discussed by Cohen Priva and Sanker (2020), many of the studies examine convergence by (or convergence to) a small number of speakers, focus on a single interaction (making it difficult to tell the difference between individual variation in convergence and the local effects of the interaction), and fail to test speakers over time to assess the reliability of their behavior (but see Tobar-Henríquez et al., 2020, for an exception). Limiting the number of speakers, models, and trials in the experiments can have a detrimental effect on our ability to detect true individual differences in these studies (e.g., Rouder et al., 2019; Rouder & Haaf, 2019). Furthermore, most studies of individual differences in linguistic convergence focus on the convergence of speech features, leaving other levels of linguistic analysis relatively unexplored. Thus, there is more work to be done to establish the reliability of individual differences in convergence.
The work reported here was designed to assess the presence of individual differences in speakers’ tendency to elicit linguistic convergence in syntactic choices. Such convergence is typically called structural priming or syntactic priming, and it refers to the fact that speakers tend to re-use recently encountered syntactic forms (e.g., Bock, 1986; Branigan et al., 2000). For example, a speaker who has recently encountered a double object dative construction (The cat gave the dog a bone) is more likely to use another double object construction to describe a transfer event (The girl gave the mouse some cheese) on a subsequent utterance than to use a prepositional object dative to describe the same situation (The girl gave some cheese to the mouse). Structural priming is a robust phenomenon (see Mahowald et al., 2016, for a meta-analysis), having been reliably observed both in the lab (e.g., Bock, 1986; Branigan et al., 2000) and in naturally occurring speech (e.g., Gries, 2005; Weiner & Labov, 1983).
We chose to focus on individual differences in the elicitation of structural priming for several reasons. First, whereas there have been studies of individual differences in linguistic convergence on the level of speech features (e.g., Cohen Priva & Sanker, 2020) and lexical choices (e.g., Tobar-Henríquez et al., 2020), to our knowledge, there have been no such studies of variation in the elicitation of structural priming. Second, Cohen Priva and Sanker (2020) argue that individual differences in the elicitation of linguistic convergence may be more robust than individual differences in the tendency to converge with one’s partner. Individual differences in the elicitation of structural priming might therefore be a better target for investigation than individual differences in the tendency to be primed. Third, extant theories of structural priming (e.g., Chang et al., 2006; Pickering & Branigan, 1998; Reitter et al., 2011) do not currently consider the possibility that speakers might elicit different levels of structural priming. Individual differences in the elicitation of structural priming could provide an important new testing ground for these theories.
The final reason we chose to look for individual differences in the elicitation of structural priming is that Chia et al. (2019) report preliminary evidence for the existence of these differences. Chia et al. (2019) replicated Levelt and Kelter’s (1982) phone call paradigm for eliciting structural priming. Experimenters called businesses and asked (At) what time do you close? Participants were more likely to answer the question using a preposition (At 9 o’clock vs. 9 o’clock) when the initial question used a preposition than when it did not. Using the phone call paradigm and a set of related question-answer scenarios, Chia et al. (2019) demonstrated a reliable structural priming effect and further found that certain experimenters elicited more robust priming than others (see their Figure 5). Although promising, there are two shortcomings to Chia et al.’s (2019) demonstration of variation in the elicitation of priming. First, the number of observations collected varied across experimenters and was sometimes relatively small. Therefore, it is not clear how much of the variability in the elicited priming effects was due to noise in the estimate of the priming effect elicited by each experimenter. Second, Chia et al. (2019) did not have the opportunity to assess the priming effect elicited by experimenters across time. Therefore, they were unable to examine the long-term stability of the effects elicited by the experimenters.
The current project rectified the shortcomings in Chia et al.’s (2019) observations to establish the presence of individual differences in the elicitation of structural priming. Following Chia et al. (2019, 2020), our experimenters called businesses to ask about their closing times. Each experimenter collected data from several hundred participants in a first round of data collection and then collected a second batch of data a couple of weeks later. The large volume of data collected by each experimenter is a key feature of our study. The large sample size made it possible to generate a stable estimate of the priming effect that experimenters elicited from their partners, which is a necessary condition for estimating the reliability of the effect (e.g., Rouder et al., 2019). In addition, the delay in data collection between the two batches of data allowed us to test for the reliability of the elicited priming effect across time. Finding significant reliability in the observed priming effects would establish the presence of individual differences in the elicitation of structural priming.
We collected data from 39,204 participants. We excluded 574 participants due to invalid responses (see below), leaving a final sample of 38,630 participants.
The Institutional Review Board of Florida State University approved this research. Data collection was completed over 10 months. During this time, we trained 43 experimenters to make phone calls to businesses to ask about their closing times. The data from three experimenters were excluded from the final data set for failure to follow instructions, leaving a total of 40 experimenters. (The participants from the excluded experimenters are not included in the numbers reported in the participants section above.) In addition, time constraints prevented two experimenters from completing the full set of phone calls (though both completed > 780 out of a possible 990 calls). Following the design of Chia et al. (2022), experimenters asked one of three questions: (At) what time do you close? (direct question), Can you tell me (at) what time you close? (conventional indirect question), or May I ask you (at) what time you close? (non-conventional indirect question). We varied the specific question that was asked for two reasons. First, Chia et al. (2022) found no differences in the priming effect observed for each question type. Although incidental to the main purpose of this study, we decided to use the large sample we collected for this study to generate a high-powered replication of Chia et al.’s (2022) null result. Second, using different questions within the sample allowed us to assess the reliability of the priming effect with questions asked in a range of contexts with slightly different pragmatics (e.g., the difference in politeness between asking a direct and an indirect question).
In the first round of data collection, we asked experimenters to call 594 businesses (198 for each of the 3 types of questions; within each type, 99 of these questions included a preposition, and 99 of them did not). After the first round of data collection, experimenters waited a week before commencing the second round of data collection. Here, experimenters called 396 businesses (198 calls each for the What time…? and Can you tell me…? questions, with 99 being prepositional and 99 being non-prepositional). We did not collect data on the May I ask…? questions in the second round of data collection because the timing of data collection in the Spring and Summer 2021 semesters did not allow our experimenters to complete the second round of data collection for all three question types. One batch of experimenters (n = 21) collected data between February and May 2021. A second group of experimenters (n = 10) collected data between May and August 2021. The final group of experimenters (n = 12) collected data between August and November 2021.
Experimenters were instructed to call businesses anywhere in the United States. The only constraint on their selection of businesses was that within each of our 6 conditions (3 question types x Prepositional vs. Non-prepositional questions), experimenters were asked to call an equal number of establishments with a $, $$, or $$$ rating from Google. Chia et al. (2022) found that the dollar sign rating of the businesses was related to the odds of the participant producing a full sentence answer to our questions (more dollar signs = more full sentences), and the production of full sentences generally leads to weaker priming effects (see Chia & Kaschak, 2021, for a discussion). Balancing the number of dollar signs across conditions and across experimenters thus ensured that our estimates of the priming effect for each experimenter would not be confounded with the types of businesses that were called.
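Concretely, this balancing scheme implies 33 calls per design cell in the first round (3 question types x 2 preposition conditions x 3 dollar-sign ratings = 18 cells; 594 / 18 = 33). As an illustrative sketch (the condition labels are ours, and this is not the materials-generation code actually used), a balanced first-round call list could be laid out as follows:

```python
from itertools import product

# Design cells for the first round of data collection (labels are hypothetical).
QUESTION_TYPES = ["direct", "conventional-indirect", "non-conventional-indirect"]
PREPOSITIONAL = [True, False]
RATINGS = ["$", "$$", "$$$"]

# 3 question types x 2 preposition conditions x 3 ratings x 33 calls = 594 calls.
call_list = [
    {"question": q, "prepositional": p, "rating": r}
    for q, p, r in product(QUESTION_TYPES, PREPOSITIONAL, RATINGS)
    for _ in range(33)
]
```

Laid out this way, each question type receives 198 calls, each question type by preposition cell receives 99, and each dollar-sign rating appears equally often within every cell.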
Experimenters called each business, waited to be greeted, then asked the critical question. The experimenter immediately transcribed the participant’s response, including false starts and other disfluencies. The experimenter thanked the participant and ended the phone call upon receiving the response.
We coded the participants’ responses as follows. Responses were coded as prepositional if they included a prepositional phrase containing the temporal information that answered our question (e.g., We close at 10, At 9 o’clock). Responses that conveyed the temporal information without a preposition (e.g., 10 is our closing time, 9 o’clock) were coded as non-prepositional. Answers that did not address the question (e.g., We never close, I don’t know) were coded as “other” and discarded from the analyses.
We also coded whether the participants’ response was a full sentence or a sentence fragment. Full sentences were responses that contained a verb phrase that included the closing time (e.g., We close at 9). Sentence fragments were responses that did not incorporate the temporal information into a verb phrase (e.g., Around 9, Hmm…maybe 9).
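To make the prepositional/non-prepositional/other coding scheme concrete, the rules can be sketched as a toy rule-based coder (our own illustration, not the procedure the human coders followed; the time pattern and the restriction to the preposition at are simplifying assumptions):

```python
import re

# Hypothetical pattern for a closing-time expression: "10", "9:30", "9 o'clock".
TIME = re.compile(r"\b\d{1,2}(:\d{2})?(\s*o'clock)?", re.IGNORECASE)

def code_response(response: str) -> str:
    """Code a response as 'prepositional', 'non-prepositional', or 'other'."""
    m = TIME.search(response)
    if m is None:
        # No closing time given (e.g., "We never close") -> discarded as "other".
        return "other"
    # Prepositional if "at" is the word immediately preceding the time expression.
    preceding = response[:m.start()].rstrip().lower()
    if re.search(r"\bat$", preceding):
        return "prepositional"
    return "non-prepositional"
```

On the examples from the coding scheme, this sketch codes We close at 10 and At 9 o’clock as prepositional, 10 is our closing time and 9 o’clock as non-prepositional, and We never close as other.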
All analyses were conducted in R (R Core Team, 2021).
The first step in our analyses was to examine the overall priming effect in our dataset. The dependent measure in this analysis was the binary coding of the participant responses as prepositional (coded as 1) or non-prepositional (coded as 0). We used the glmertree package (Fokkema et al., 2015) in R to analyze the responses with a mixed-effects logistic regression. Our predictors were Question Type (prepositional = 1, non-prepositional = -1), Sentence (full sentence = 1, fragment = -1), and the interaction of these variables. The model also included Experimenter as a random factor and the random slope of Question Type across Experimenters. There were two questions of interest in this analysis. First, is there a significant priming effect (as demonstrated by an effect of Question Type)? Second, does the random slope of Question Type across Experimenters add significantly to the model fit? The latter question was addressed by using the anova() function in R to compare the model described above to an identical model that omitted the random slope.
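The anova() comparison of nested models is a likelihood-ratio test: twice the difference in the models’ log-likelihoods is referred to a chi-square distribution with degrees of freedom equal to the number of extra parameters in the larger model. A minimal sketch of the arithmetic for the one-extra-parameter case (a random-slope comparison can add more than one parameter, in which case a full chi-square tail function such as scipy.stats.chi2.sf would be needed; the log-likelihood values below are made up for illustration):

```python
from math import erfc, sqrt

def lrt_1df(ll_reduced: float, ll_full: float):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns (chi-square statistic, p-value). For 1 degree of freedom the
    chi-square upper tail has the closed form P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, erfc(sqrt(stat / 2.0))
```

For example, log-likelihoods of -100 (reduced model) and -95 (full model) yield a statistic of 10.0 and p ≈ .0016, so the extra parameter would be judged to improve the fit.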
The next step in our analysis was to assess the reliability of the priming effect observed by individual experimenters. We first evaluated the split-half reliability of the priming effects in the entire dataset. Each experimenter’s trials were split in half, where each half had approximately the same number of trials for each of the 3 questions that were asked and approximately the same number of $, $$, or $$$ businesses. We calculated each experimenter’s priming effect for both halves of the data (priming effect calculated as the difference between the proportion of trials with a prepositional response following a prepositional question and the proportion of trials with a prepositional response following a non-prepositional question) and ran a correlation analysis on the effect across the overall sample. We next assessed the test-retest reliability of the priming effects. For each experimenter, we calculated the priming effect for both the first and second run of data collection and ran a correlation analysis on the effects. Finally, for both the split-half reliability and the test-retest reliability we calculated the Intraclass Correlation Coefficient (ICC; see Koo & Li, 2016; McGraw & Wong, 1996) using the icc() function from the irr package in R (Gamer et al., 2019). We used a two-way model with a single rating type and an absolute agreement definition of the ICC.
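The reliability analyses were run in R (the irr package’s icc() function), but the underlying quantities are straightforward. A Python sketch of the per-experimenter priming effect, the Pearson correlation across experimenters, and ICC(2,1) under the absolute-agreement definition (McGraw & Wong, 1996), assuming responses coded 0/1 and a small scores matrix of experimenters by halves or runs:

```python
from statistics import mean

def priming_effect(prep_q, nonprep_q):
    """Difference in the proportion of prepositional responses (coded 0/1)
    following prepositional vs. non-prepositional questions."""
    return mean(prep_q) - mean(nonprep_q)

def pearson_r(x, y):
    """Pearson correlation between two lists of per-experimenter effects."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def icc_2_1(scores):
    """ICC(2,1): two-way random-effects model, single rating, absolute
    agreement. `scores` is n rows (experimenters) x k columns (halves/runs)."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)   # rows
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)   # columns
    mse = sum((scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k)) / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Unlike the Pearson correlation, the absolute-agreement ICC penalizes systematic differences between the two halves (or runs), which is why the two statistics can diverge even on the same effects.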
The results of the mixed-effects regression predicting the log odds of participants generating a prepositional response based on Question Type, Sentence, and the interaction of these predictors are presented in Table 1. Replicating previous studies using this paradigm (e.g., Chia et al., 2019, 2020; Levelt & Kelter, 1982), there was a significant effect of Question Type (p < .001). Participants were more likely to generate a prepositional response when answering a prepositional question (M = 0.71; SD = 0.45) than when answering a non-prepositional question (M = 0.68; SD = 0.47). There was also an effect of Sentence (p < .001), with participants being more likely to generate a prepositional response when they generated a full sentence than when they generated a fragmentary response (see Chia et al., 2020, for a discussion).
|N = 38,630|
Note: Significant effects are marked in boldface.
The model presented in Table 1 contained a random slope of Question Type across Experimenters. To assess whether this slope contributed significantly to model fit, we used the anova() function in R to compare this model to a parallel model that did not include the random slope. The results indicate that the random slope contributes significantly to model fit (χ² = 101.57, p < .001). This finding suggests significant variability in the priming effect elicited by our experimenters (i.e., that there are individual differences in the priming effects elicited by our experimenters).
Having shown both that there is a robust structural priming effect in our data and that there is significant variability in the priming effects elicited by individual experimenters, we next examined the reliability of these differences. We began by examining the split-half reliability of the priming effects. Each experimenter’s data were split in half (see earlier discussion), and a priming effect was generated for each half of the data. The overall split-half reliability was high (r = .847, p < .001), suggesting that the experimenter-level priming effects are reliable. The ICC for the overall split-half reliability was .848 (p < .001; CI: 0.731 < ICC < 0.917). A scatterplot of the split-half priming effects is shown in Figure 1.
We next examined the test-retest reliability of the priming effects by comparing the priming effects from the experimenters’ first run of data collection to the effects from their second run of data collection. We observed significant test-retest reliability (r = .484, p = .002), although the reliability was somewhat lower than the split-half reliability. The ICC for the test-retest reliability was 0.486 (p = .002; CI: 0.206 < ICC < 0.692).
Our work demonstrates for the first time that there are reliable individual differences in the tendency for speakers to elicit structural priming from their interaction partners. We believe that this finding is of interest for several reasons.
Establishing the existence of individual differences in the elicitation of linguistic convergence is a crucial step in understanding the mechanisms that drive convergence between conversation partners. Reliable individual differences suggest that at least some language users exhibit behavior that affects convergence across different interactions with different partners. This finding puts constraints on any theoretical accounts developed to understand how and why the behavior of conversation partners converges. Cohen Priva and Sanker (2020) suggest that reliable variation in the elicitation of convergence points to the role of social factors in shaping the extent to which speakers converge. Our study was not designed to examine the person-level factors that might lead to the elicitation of higher or lower levels of convergence. Nevertheless, the nature of our experimenters’ interactions suggests that any candidate factors (e.g., voice attractiveness; Babel et al., 2014) must be detectable from a relatively short sample of speech.
Establishing the existence of individual differences in the elicitation of syntactic convergence is important for developing theories of structural priming. Extant theories of structural priming (e.g., Chang et al., 2006; Reitter et al., 2011) do not currently consider the possibility that different speakers might elicit different levels of priming from their partners. Nevertheless, we can speculate on how major approaches to structural priming can be developed to accommodate individual differences such as those reported here. Activation-based theories of structural priming (e.g., Jacobs et al., 2019; Pickering & Branigan, 1998; Reitter et al., 2011) suggest that processing a sentence with a given structure (e.g., the double object structure: Jane gave Sally a book) activates the representation of that structure in the language system. The activation remains elevated for a short period of time. During this time, individuals will be faster to process sentences with the same form and more likely to generate sentences with the same form. Activation-based theories could explain individual differences in the elicitation of structural priming by developing an account of how speaker characteristics affect activation within the language system. As one example, voice features such as attractiveness may lead listeners to pay more attention to the form of the speech itself. This might result in higher levels of activation in the language system and the elicitation of more substantial priming effects.
An alternative account of structural priming (e.g., Chang et al., 2006) explains priming via error-based learning. When processing an utterance, comprehenders predict what is coming next. If they guess incorrectly (e.g., they expected a double object construction [Jane gave Sally a book], but got a prepositional object [Jane gave a book to Sally]), an error signal is generated. The language processing system updates based on the error. The processing system becomes biased (slightly) toward the unexpected structure, leading to structural priming. Explaining individual differences in the elicitation of structural priming via Chang et al.’s (2006) theory will require an account of how the characteristics of the speaker affect the expectations (and prediction error) of the listener. For example, listeners may use available characteristics of the speaker to generate an expected model of the speaker’s behavior (e.g., Kleinschmidt & Jaeger, 2015). If the speaker characteristics lead listeners to generate an inaccurate model, it would lead to more prediction error and the elicitation of stronger priming effects.
The current data do not allow us to go beyond speculation about how theories of structural priming might approach individual differences in the elicitation of priming effects. It is worth noting that the activation-based and error-based theories would need to rely on very different mechanisms to explain the individual differences we have seen here. As we learn more about the source of these differences, it may provide both a useful means of assessing which mechanisms provide the best account of priming and a useful means of assessing which theories can be extended to account for a broader range of linguistic phenomena (such as the ways that social and discourse factors may affect language production choices).
One strength of the method used in this study is that it allowed us to observe how nearly 1,000 participants responded to each of our experimenters. When studying individual differences in cognition, researchers tend to focus more on the overall sample size (see Prasad & Linzen, 2021, for a discussion of the sample sizes needed to observe syntactic adaptation effects) and less on the number of observations per participant. Rouder, Kumar, and Haaf (2019) suggest that this is a problematic strategy – if a choice needs to be made, researchers interested in the correlation between tasks or the reliability of a single task would be better served collecting more data points from fewer individual participants than vice versa. Our method is straightforward to use and allows the collection of many observations per experimenter within a relatively short period. As such, this method may be useful to researchers who wish to explore individual differences in eliciting linguistic alignment. With some modification, the phone call task could likely be used to study priming for a range of different syntactic choices.
Demonstrating reliable individual differences in the elicitation of syntactic priming is a crucial first step in studying the nature of these differences. Although the current set of observations does not allow us to make much progress in answering the question of why these differences arise, we hope that this demonstration will spur further investigation into the topic and further development of our theories of structural priming (and language production more generally).
The work reported here was submitted by Katherine Chia in partial fulfillment of the requirements for the doctoral degree at Florida State University. We thank the committee members (Walter Boot, Sara Hart, Colleen Kelley, and Kaitlin Lansford) for their comments and feedback on this project. We would also like to thank the many experimenters who participated in the collection of these data.
The authors do not have any conflicts to declare.
Conception and Design: KC and MK
Acquisition of Data: KC
Analysis and Interpretation of Data: KC and MK
Writing: KC and MK
Approval of manuscript for publication: KC and MK