Although studies on belief perseverance suggest that people resist evidence opposing their beliefs, recent research found that people were receptive to clear, belief-disconfirming evidence. However, this research measured belief change immediately after presenting the evidence, and belief change varied considerably across participants. In three preregistered experiments, we replicated and extended prior work, testing whether belief change in response to empirical evidence on polarized topics persists one day later and identifying variables associated with belief change, including the (in)consistency of evidence with prior views, evidence strength, and individual differences in beliefs, affect, thinking and reasoning strategies, and perceptions of the evidence and science. Overall, participants shifted their beliefs in response to evidence on capital punishment (Study 1), gun control (Study 2), and video games and aggression (Study 3) and maintained this change the next day. Belief change occurred primarily among those presented with belief-inconsistent evidence. Participants shifted their beliefs more in response to stronger vs. weaker evidence but were more sensitive to evidence strength initially than the next day. Perceived evidence quality and scientific certainty were consistently associated with belief change, whereas belief commitment, actively open-minded thinking, social desirability, and positive and negative affect were not. People may be receptive to belief-inconsistent evidence, especially if they view the evidence as strong and science as certain, irrespective of general individual differences in receptivity. Further research is needed on the persistence and predictors of belief change in response to evidence over longer time frames and across topics, contexts, and samples.
Changing public beliefs in response to scientific evidence is important for the welfare of society. For example, aligning public beliefs with converging evidence (e.g., on climate change) and emerging evidence (e.g., in response to COVID-19) is critical for promoting public and environmental health. Although classic studies on belief perseverance suggest that people resist evidence opposing their beliefs, maintaining or even strengthening their beliefs in response to evidence challenging their views (e.g., Cohen et al., 2000; Lord et al., 1979; Taber & Lodge, 2006), these claims often come from studies presenting participants with conflicting evidence (Kuhn & Lao, 1996; Lord et al., 1979; McHoskey, 1995; A. G. Miller et al., 1993; Taber & Lodge, 2006; see also, Corner et al., 2012). Recent research found that when presented with a congruent pattern of belief-disconfirming scientific findings, participants shifted their beliefs in response to the evidence (Anglin, 2019). However, belief change was assessed immediately after presenting the evidence, and there was considerable variation in belief change across participants (Anglin, 2019). The present research sought to replicate and extend this prior work by testing whether belief change persists the next day and by identifying predictors of belief change (vs. resistance) in response to scientific evidence on polarized topics.
Belief Change in Response to Evidence
The literature on belief updating extends across disciplines, encompassing a wide range of topics and methodologies, including belief change in response to social, environmental, and political arguments and information (e.g., Corner et al., 2012; A. G. Miller et al., 1993; Taber & Lodge, 2006; Tappin et al., 2020), factual or statistical evidence (e.g., Vlasceanu & Coman, 2022), court or legal evidence (e.g., Hudachek & Quigley-McBride, 2022; McKenzie et al., 2002), misinformation and conspiracies (e.g., McHoskey, 1995; O’Brien et al., 2021; Orticio et al., 2022), misinformation corrections (e.g., Carey et al., 2022; Walter & Murphy, 2018), self-relevant evidence (e.g., Drobner & Goerg, 2024; Eil & Rao, 2011; Marks & Baines, 2017; Sharot et al., 2011), the perceived normative prevalence of beliefs (Orticio et al., 2022; Vlasceanu & Coman, 2022), and evidence bearing on beliefs instilled or claims made at the beginning of the study, including false claims (e.g., Anderson, 1983; Anderson et al., 1980; Ross et al., 1973).
Studies have also investigated belief change in response to scientific evidence, examining non-scientists’ belief updating in response to scientific (vs. anecdotal) statements (Vlasceanu & Coman, 2022), false, pseudoscientific claims with scientific references (O’Brien et al., 2021), scientific consensus communications (e.g., Lewandowsky et al., 2013; van Stekelenburg et al., 2022), and scientists’ updating in response to large-scale replication projects (McDiarmid et al., 2021). In addition, researchers have examined how people update their beliefs after reading summaries of scientific studies on polarized topics, rather than simply reading statements purported to be supported by scientific evidence. Until recently, however, many studies presenting participants with research summaries tested whether people maintain or adopt more polarized beliefs in response to mixed evidence—i.e., two studies with opposing findings (e.g., Kuhn & Lao, 1996; Lord et al., 1979; see also, Fischer et al., 2022). Fewer studies have tested how people update their beliefs in response to summaries of research on polarized topics that contain a clear, consistent pattern of findings.1
Noting this gap in the literature, Anglin (2019) tested whether participants updated their beliefs after reading research summaries supporting one side of a polarized topic. When presented with a clear, congruent pattern of scientific findings (on religion, politics, gun control, and capital punishment), participants consistently shifted their beliefs in response to evidence; in fact, participants showed the most belief change when the evidence opposed (vs. supported) their prior beliefs. Likewise, Rosman and Grösser (2024) found that, after reading about studies that consistently supported or opposed acupuncture’s effectiveness, participants updated their beliefs in the direction of the evidence they read.
These studies suggest that participants are receptive to belief-disconfirming scientific evidence when the results show a clear, consistent pattern, in line with research demonstrating the effectiveness of scientific consensus communications in leading people to update their beliefs on contested science topics to better align with the evidence (i.e., on climate change, GMOs, and vaccines; van Stekelenburg et al., 2022). However, research suggests that there is considerable variation in belief updating across participants (Anglin, 2019; Kuhn & Lao, 1996; Sommer et al., 2024; van Stekelenburg et al., 2022), raising the question of what variables predict belief change in response to scientific evidence. In addition, it is unclear whether belief change in response to reading about scientific research persists over time.
Thus, despite the abundance of research on belief updating in response to evidence, less attention has been given to how people update in response to summaries of research on polarized topics. The present study sought to replicate and extend Anglin (2019) to examine whether belief change in response to scientific summaries persists over time, emerges across different topics, and differs based on features of the evidence and audience.2 Understanding the public’s interpretation of and receptivity to scientific research on polarized topics is critical to effectively communicating evidence-based solutions to social problems (e.g., those related to the environment, public health, education, and intergroup relations).
Although belief updating in response to scientific evidence may operate similarly to other forms of belief updating, research has shown that people sometimes respond differently to scientific and non-scientific information (Corner & Hahn, 2009; Walter & Murphy, 2018) and when the topics are polarized vs. not (e.g., Kahan et al., 2017; Vedejová & Čavojová, 2022). Beliefs may be based on many different forms of evidence besides scientific studies (e.g., group agreement, expert opinion, trusted figures or authorities, anecdotes, perception, experience, testimony, logic and reasoning, etc.; Metz et al., 2018; Sommer et al., 2024). Moreover, scientific evidence varies in quality and over time, and people vary in their understanding of its variability (Broomell & Kane, 2017; Rabinovich & Morton, 2012; van der Bles et al., 2019), in their ability to evaluate and draw valid conclusions from evidence (Drummond & Fischhoff, 2017; Su, 2022), and in their trust vs. skepticism toward science and particular scientific findings (Pew Research Center, 2015; Rosman & Grösser, 2024; Rutjens et al., 2018). Many studies have shown that people more critically evaluate research when the results oppose vs. support their beliefs (Drummond & Fischhoff, 2019; Klaczynski & Gordon, 1996; Koehler, 1993; Lord et al., 1979; MacCoun & Paletz, 2009; Munro, 2010; Munro & Ditto, 1997; Vedejová & Čavojová, 2022), and researchers theorize that these evaluation processes are key to understanding variability in when belief updating occurs (Sommer et al., 2024). Sommer et al. (2024) propose that individual differences in how people evaluate evidence act as operators that affect receptivity to evidence.
Therefore, it is important to examine belief updating in response to scientific evidence—particularly summaries of scientific studies on polarized topics that people may respond differently to based on features of the evidence (e.g., its strength) and features of the perceiver (e.g., individual differences in perceptions and understanding of the evidence, goals, emotions, and thinking and reasoning strategies). Whether people possess the skills to evaluate the quality of scientific evidence, and how they exercise those skills when the evidence opposes their beliefs, expectations, or desires, is important to understanding how people respond to, and update their beliefs in light of, evidence on polarized topics.
Prior research points to a number of variables associated with people’s receptiveness (vs. resistance) to evidence, including the (in)consistency of the evidence with prior beliefs, consistency of evidence across studies, perceived quality of evidence, perceived bias of the source, and (un)expectedness of the findings (Anglin, 2019; Anglin et al., 2023; van Stekelenburg et al., 2022; Wallace et al., 2020); personal and social interests, identity needs, fears, and worldviews (Hornsey & Fielding, 2017); affect, personality, and cognitive style (Bowes et al., 2022; Drummond & Fischhoff, 2020; Porter & Schumann, 2018; Teeny et al., 2021); cognitive and reasoning skills (Bago et al., 2023; Pennycook et al., 2022); psychological distance to science (Većkalov et al., 2022); and science literacy and support for/trust in science (Rosman & Grösser, 2024; though acceptance vs. rejection of scientific findings often varies by topic; Pew Research Center, 2015; Rutjens et al., 2018). In this research, we measured and manipulated several of these potential variables associated with belief change vs. resistance to evidence on polarized topics.
Present Research
The present research replicated and extended Anglin (2019) to further examine belief updating in response to scientific evidence on polarized topics, investigating whether belief change persists one day later and variables associated with belief change, including features of the evidence and the individuals evaluating it. In Study 1, participants read two studies on the deterrent efficacy of the death penalty (following Anglin, 2019 and Lord et al., 1979), with congruent or opposing findings. Study 1 primarily served as a direct replication of Anglin (2019) but also extended it with the addition of individual difference measures. Studies 2 and 3 served as conceptual replications and extensions, examining different topics (gun control and video games and aggression) and additional predictors of belief change, including evidence strength.
Predictors of Belief Change. In examining variables associated with belief change, we tested the replicability of the moderating effect of belief (in)consistency from Anglin (2019), in which participants shifted their beliefs more in response to evidence inconsistent vs. consistent with their prior views,3 a pattern also supported by stronger belief updating in response to scientific consensus information among skeptics than among those with scientifically supported beliefs (van Stekelenburg et al., 2022). Though these findings contest earlier theories and claims of belief perseverance, which suggested that people resist evidence opposing their beliefs, they are consistent with the observation that individuals whose beliefs deviate more from the evidence presented have more room to shift them (Anglin, 2019; van Stekelenburg et al., 2022). In addition, we tested the replicability of the relationship with perceived evidence quality, in which participants shifted their beliefs more when they rated the quality of the evidence more favorably (Anglin, 2019). We also tested the effect of the consistency of evidence across studies, in which participants changed their beliefs more in response to a congruent vs. conflicting pattern of findings (Anglin, 2019; Rosman & Grösser, 2024). In Study 3, we extended prior research by manipulating the strength of the evidence to test whether participants updated their beliefs more in response to strong vs. weak evidence. Shifting beliefs more in response to belief-inconsistent (vs. consistent) evidence, a congruent (vs. incongruent) pattern of findings across studies, and strong (vs. weak) evidence would suggest that participants were receptive to scientific evidence challenging their beliefs and were sensitive to the strength of the evidence in updating their beliefs.
In this research, we also measured several individual difference variables that may be associated with belief change, including belief commitment and certainty, social desirability, actively open-minded thinking, scientific reasoning ability, perceptions of scientific uncertainty, and support for science, along with other variables added in each study. Although we largely examined these variables in an exploratory manner in Study 1, we selected them based on prior literature (discussed in further detail in the Supplementary Materials).
Persistence of Belief Change. Another goal of this research was to test the persistence of belief change in response to scientific evidence over time. Studies examining belief change in response to scientific evidence typically measure beliefs before and immediately after the presentation of evidence, in a single study session (e.g., Anglin, 2019; Rosman & Grösser, 2024). Participants report shifting their beliefs in response to the evidence, but these findings may reflect demand effects. Moreover, even if these effects reflect true belief change in response to the evidence, it is unclear whether belief change persists or whether participants would revert to their earlier beliefs over time. Following up with participants after the initial session would reduce social desirability pressures and help determine whether the observed effects reflect true belief change. In the present research, we tested whether belief change persists 24 hours after the presentation of evidence. We selected this time frame to minimize dropouts and ensure the evidence would still be fresh enough to have an impact while allowing enough time to pass for social desirability concerns to be reduced.
Study 1
Study 1 had three primary aims. First, Study 1 tested the replicability of the belief change effects observed in response to congruent (vs. conflicting) evidence from Anglin (2019), using the same stimulus studies on the deterrent efficacy of the death penalty. With one exception (i.e., the number of belief measurements),4 the experimental stimuli and procedure were kept identical to Anglin (2019) to ensure the integrity of the direct replication. When presented with two studies with congruent findings, participants shifted their beliefs in the direction of the evidence presented, and when presented with two studies with conflicting results sequentially, participants shifted their beliefs in the direction of the first study and then shifted back in response to the second (Anglin, 2019). Therefore, we expected the same pattern of results, such that, overall, participants would shift their beliefs in response to congruent but not conflicting evidence. Second, Study 1 extended prior research by testing whether belief change in response to congruent findings persists one day later; persistence would suggest that observed belief change effects do not simply reflect social desirability and may endure over time.
Third, Study 1 examined several individual difference variables that may be associated with belief change in response to empirical evidence, including the (in)consistency of the evidence with prior views, perceptions of the evidence quality, belief commitment and certainty, social desirability, actively open-minded thinking, scientific reasoning ability, perceptions of scientific uncertainty, and support for science. In Anglin (2019), participants shifted their beliefs more when they perceived the evidence to be higher in quality and when presented with belief-inconsistent vs. consistent evidence.5 The other variables measured as possible correlates of belief change were unique additions to the present study and thus examined in an exploratory manner.
Method
Disclosure Statement
This paper reports all measures, manipulations, and exclusions. Power analyses, reported within each study below, were performed to determine the sample size prior to data collection. No additional data were collected for any of the studies once data analysis began. All data, code, and materials are available on OSF at https://osf.io/gfp48/, along with the preregistered protocols and data analysis plans. Data were analyzed using SPSS 29.0. Raw data will be retained for confirmation purposes for a minimum of 5 years after publication.
Study Design
Study 1 followed a 4 (research findings: both effective, both ineffective, mixed-effective first, mixed-ineffective first) x 4 (time of assessment: T1 [before studies], T2 [after first study], T3 [after second study], T4 [next day]) x 2 (stimulus study order: longitudinal first vs. cross-sectional study first) mixed-model design, with research findings and stimulus study order as between-subjects factors and time of assessment as a within-subjects factor. Because stimulus study order was not expected to influence the results, if no order effects emerged, we planned to present the analyses collapsed across stimulus study order.
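For concreteness, the between-subjects cells of this design can be enumerated as follows (an illustrative Python sketch, not the actual assignment procedure used in the Qualtrics survey; condition labels are ours):

```python
import itertools
import random

# Between-subjects factors from the 4 x 4 x 2 mixed design;
# time of assessment (T1-T4) is measured within subjects, not assigned.
findings = ["both effective", "both ineffective",
            "mixed-effective first", "mixed-ineffective first"]
order = ["longitudinal first", "cross-sectional first"]

# The 4 x 2 = 8 between-subjects cells
cells = list(itertools.product(findings, order))
assert len(cells) == 8

# Random assignment of one participant to a cell
assigned = random.choice(cells)
```

Because stimulus study order was expected to be inert, the eight cells reduce to the four research-findings conditions when collapsed across order.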
Hypotheses. Based on previous research (Anglin, 2019), we predicted that research findings would interact with time of assessment, with participants shifting their beliefs in response to the evidence immediately after its presentation, regardless of whether it supported or challenged their views. Specifically, participants presented with two stimulus studies with congruent findings were expected to shift their beliefs in the direction of the evidence presented (from T1 to T3) and those presented with conflicting findings were expected to shift in the direction of the first study (from T1 to T2) and shift back in response to the second (from T2 to T3). Whether participants would maintain a shift in their belief (from T1) at T4, 24 hours after the presentation of evidence, was examined in an exploratory manner.
Participants
Participants residing in the United States were recruited from Amazon’s Mechanical Turk (MTurk) to participate in a two-part study administered on consecutive days. A total of 343 participants (192 men, 147 women, 2 transgender, 2 other; Mage = 36.04, SD = 10.58) completed the first study session. An additional 57 began the study but dropped out before completing Part 1. The sample included 31 Hispanic or Latino, 8 American Indian or Alaska Native, 25 Asian, 49 Black or African American, 3 Native Hawaiian or Other Pacific Islander, and 272 White participants.6 Of the 343 participants with full data from Part 1, 315 returned to complete Part 2. Participants were included in all analyses where they had data, unless otherwise specified.
Power Analysis. Power analyses were conducted using G*Power 3.1. An a priori power analysis indicated that, for a 4x4 mixed-model ANOVA, N=136 is necessary to obtain small-medium effects (f = 0.15) at 95% power and α = .05. Because dropouts were anticipated between Sessions 1 and 2, an initial sample of N=350 was targeted for Session 1. A sensitivity power analysis with these parameters and the sample size for participants who completed both sessions (N=315) indicated that the study was powered to detect small effects (f = 0.10) for the omnibus ANOVA. However, we overlooked the fact that our hypotheses were contingent on the follow-up planned contrasts. Therefore, we performed an additional sensitivity power analysis for a paired samples t-test at 95% power and α = .01 (the alpha level set for these contrasts), which indicated that the study was powered to detect medium effects (.45 ≤ ds ≤ .53, based on the n for each analysis).
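The sensitivity logic for the planned contrasts can be approximated without G*Power using the standard normal quantiles (a sketch only; G*Power's exact noncentral-t calculation yields slightly different values, and the per-contrast n of 79 below is an illustrative figure, roughly 315 divided across four conditions):

```python
from math import sqrt
from statistics import NormalDist

def sensitivity_d(n, alpha=0.01, power=0.95):
    """Smallest detectable paired-samples effect size d for a given n,
    via the normal approximation d = (z_{1-alpha/2} + z_{power}) / sqrt(n)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / sqrt(n)

# Hypothetical per-contrast sample size of 79 participants
d = sensitivity_d(79)  # roughly in the reported .45-.53 range
```

Larger n, higher alpha, or lower power each shrink the smallest detectable effect, which is why the omnibus test (using the full N at alpha = .05) was powered for smaller effects than the alpha-corrected contrasts.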
We did not conduct a priori power analyses for the analyses examining the associations between the individual difference variables and belief change. Sensitivity power analyses for linear multiple regressions at 80% power, α = .05, and 2-6 predictors (the number of predictors in the analyses we performed) indicated that the study was powered to detect fs = .18-.21. Because these analyses were largely exploratory, we did not adjust alpha. As such, the study may only have sufficient power to detect larger effect sizes (e.g., setting the alpha to .01 would increase the detectable effect size to fs = .21-.25). Thus, overall, Study 1 was likely powered to detect medium rather than small size effects.
Data Quality
Although MTurk showed evidence of high data quality for several years (e.g., Behrend et al., 2011; Buhrmester et al., 2011), recent research suggests that data quality has diminished amidst growing concerns of bots, calling for researchers to include validity checks and discuss efforts to maximize quality in their reports (Chmielewski & Kucker, 2020). To maximize data quality in the present research, we set the approval rate required for participation to 99% (higher than the standard 90 or 95% rates; Chmielewski & Kucker, 2020). Participants were compensated well above the federal minimum wage in each study, as participants provide higher quality data with higher pay (Litman et al., 2015). We conducted each study through CloudResearch using their data quality screeners, shown to yield higher quality data than using MTurk directly (Douglas et al., 2023). This enabled us to make Part 2 available only to Part 1 participants and send participants the Part 2 study link, and to exclude workers from the previous study (or studies) in Studies 2 and 3. To further reduce non-naivety concerns, participants were required to have completed no more than 10,000 tasks (i.e., HITs) on MTurk. Each study included multiple-choice attention-check questions and open-ended questions, which we used to evaluate the attentiveness of the sample (described in further detail below). We also included a captcha at the beginning of the Qualtrics survey to screen out bots.
Materials and Procedure
The preregistration for Study 1 is available at https://osf.io/2w5uy.
Belief and Position (T1-T4). At the beginning of Part 1, participants reported their initial (T1) belief and position on the death penalty via the following questions: What is your position on the death penalty? (1=strongly oppose, 8=strongly support; single-item position measure); I believe the death penalty: (1=strongly fails to deter murder, 8=strongly deters murder; single-item belief measure); How certain are you that the death penalty deters or fails to deter murder in the way you specified? (1=very uncertain, 8=very certain; single-item belief certainty measure). Participants also completed these questions after reading and evaluating each stimulus study described below (i.e., at T2 and T3) and in the follow-up survey sent 24 hours later (T4), which participants were given 24 hours to complete.
Stimulus Studies. Two filler questions followed the T1 belief and position questions. Participants were then presented with two studies on the deterrent efficacy of the death penalty (a longitudinal study and a cross-sectional study). The stimuli were identical to those used in Study 4 of Anglin (2019), modeled on those originally developed by Lord et al. (1979). The order of presenting each study and the direction of each study’s findings (effective vs. ineffective) were randomized, such that participants received either two effective studies, two ineffective studies, or one of each. For each stimulus study, participants first read a brief overview of the study, followed by a detailed description of the methods and results (described in words and depicted in table and graph form), critiques of the study, and the authors’ responses to the critiques (see the Supplementary Materials to view the stimuli and a summary of how they were created).
Study Evaluations. After reading each study’s detailed description, critiques, and rebuttals, participants evaluated the quality of the study via closed- and open-ended questions. First, they rated how well the study was conducted (1=very poorly done, 8=very well done) and how convincing the study was (1=very unconvincing, 8=very convincing), which were combined into an overall measure of perceived evidence quality (longitudinal study: α = .86; cross-sectional study: α = .92). Next, participants reported in an open-ended manner why they thought the study did or did not support the argument that the death penalty deters murder. We analyze the open-ended evaluations and the relationship between initial beliefs and closed-ended evaluations in the Supplementary Materials, as these analyses were not central to the main goals of the research.
Attention Checks. Participants completed a series of attention checks while reading the stimulus studies. Following each study overview, they answered a multiple-choice question asking what the study found (87-88% answered correctly); following the study evaluation questions, they were asked to recall one of the criticisms of the study and the authors’ response to the critique (91.7-94.4% provided sensible responses to each question); and following both studies, they answered a multiple-choice question asking what both studies found (82% recalled correctly). Because the overall pattern of results was not affected by whether participants who failed the checks were excluded or not, all participants were retained in the analyses as specified in the preregistration.
Individual Difference Measures. After the presentation of both stimulus studies, participants completed several individual difference measures, described below. These measures were presented in randomized order, with the exception of the scientific reasoning scale, which was presented last due to its different formatting.
Belief Commitment. A 4-item measure of belief commitment was developed, drawing from items used in previous research (Abelson, 1988; Pomerantz et al., 1995). Participants rated their agreement with each statement on a 7-point scale (1=strongly disagree, 7=strongly agree; e.g., “I can’t imagine ever changing my belief”; α = .85).
Social Desirability Scale (SDS; Crowne & Marlowe, 1960). The SDS measures impression management, or the extent to which an individual seeks to portray themselves in an overly favorable light. Participants answered a series of True/False questions containing a socially desirable response that is unlikely to occur (e.g., “I’m always willing to admit it when I make a mistake”). Eleven items were used from the original 33-item scale, which has been shown to be superior in detecting faking good responses than newer scales (Lambert et al., 2016) and psychometrically sound in short forms (Reynolds, 1982). Responses were coded so that higher scores indicate greater social desirability (α = .75).
Scientific (Un)certainty. Eight items were developed to assess perceptions of scientific (un)certainty. The items measured different aspects of scientific certainty, including reliability and trustworthiness (e.g., “Scientific studies tend to produce similar results when researchers repeat them”; “If two studies show the same results, the findings can be trusted”) and confirmation and absolute truth (“Results of scientific studies should be treated as facts”; “We can never know for certain whether a scientific theory is true”). Participants rated each item on a 7-point scale (1=strongly disagree, 7=strongly agree).
A principal components analysis with a Varimax rotation was performed to determine whether the items loaded on a single factor or on separate factors. This analysis produced two factors with eigenvalues >1. The six positively phrased items loaded on the first factor, which was labeled scientific certainty (eigenvalue = 2.77, accounting for 34.56% of variance). The two negatively phrased items loaded on the second factor, which was labeled scientific uncertainty (eigenvalue = 1.31, accounting for 16.40% of variance).7 The items on each factor were averaged to create two subscales (certainty: α = .76; uncertainty: α = .46). Because the alpha for the uncertainty items was low, we also created a composite measure for all items (with the uncertainty items reverse-coded; α = .61).
Actively Open-minded Thinking (AOT; Haran et al., 2013). The AOT scale contains 7 items measuring attitudes toward flexible or open-minded thinking (e.g., “People should revise their beliefs in response to new information or evidence.”). Participants rated their agreement or disagreement with each statement on a 7-point scale (1=Completely disagree, 7=Completely agree; α = .77); higher scores indicate more favorable attitudes toward actively open-minded thinking.
Scientific Reasoning Scale (SRS; Drummond & Fischhoff, 2017). The SRS is an 11-item measure of scientific reasoning ability, assessing the capacity to evaluate scientific evidence in terms of the factors that determine its quality. Each question presents a short scientific scenario followed by a true/false statement with a correct answer (e.g., “A researcher finds that American states with larger parks have fewer endangered species. True or False? These data show that increasing the size of American state parks will reduce the number of endangered species.”). Responses were coded so that higher scores indicated greater scientific reasoning ability (α = .74).
Demographics and Support for Science. At the end of Part 1 of the study, participants answered demographic questions, including single-item religiosity and political orientation questions rated on 9-point scales. Participants also rated the extent to which they support and trust scientific research and support and trust social scientific research, on 7-point scales (1=not at all, 7=completely). These four items were averaged to create an overall measure of support for science (α = .86).
Debriefing. After all participants completed both parts of the study, they were sent a debriefing email, describing the purpose of the study, informing them that the studies were fictitious and explaining the rationale for the deception, and thanking them for their participation. A few participants wrote back to say they found the study interesting and important and were glad to have participated; none expressed distress or concern about the study in their response.
Results
Randomization Checks
A one-way ANOVA was conducted to test for differences in participants’ death penalty belief at T1 as a function of their randomly assigned condition. No differences emerged among conditions for T1 belief, F(3, 366) = 0.87, p = .46, η2 = .01.8
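The randomization check's F statistic is simply the ratio of between-condition to within-condition mean squares; a pure-Python sketch with toy group data (not the study's data, which were analyzed in SPSS):

```python
from statistics import mean

def one_way_f(groups):
    """One-way ANOVA F = MS_between / MS_within for a list of score lists."""
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Two toy conditions with similar means yield a small F
f_stat = one_way_f([[1, 2, 3], [2, 3, 4]])
```

An F near 1 (with a nonsignificant p), as reported above, indicates that T1 beliefs did not reliably differ across the randomly assigned conditions.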
Belief Change
Preregistered Confirmatory Analyses. A 4x4 mixed-model ANOVA, with research findings as a between-subjects factor (both effective, both ineffective, mixed/effective first, mixed/ineffective first) and time of assessment as a within-subjects factor (T1, T2, T3, T4) was performed to examine whether participants shifted their belief in response to each study and maintained a shift from T1 at T4, 24 hours later. The primary test of our hypothesis was the analysis for beliefs, as the evidence was directly related to participants’ beliefs about the effectiveness of the death penalty rather than their support for it. We also predicted that participants would change their position in response to the evidence, as has been found in previous studies, though to a lesser degree (Anglin, 2019). Because the results for position showed a similar pattern as for beliefs, for ease of presentation, we present the analyses for position in the Supplementary Materials.
We first performed the analysis with order included as a factor. There was no main effect for order, F(1, 304) = 0.09, p = .76, ηp2 < .001, and order did not significantly interact with time of assessment, F(3, 912) = 0.85, p = .47, ηp2 = .003, research findings, F(3, 304) = 2.11, p = .10, ηp2 = .02, or both, F(9, 912) = 0.90, p = .53, ηp2 = .01. Because no order effects emerged, we present the analyses collapsed across stimulus study order, as preregistered.
We expected research findings to interact with time of assessment, such that those presented with congruent evidence would shift their belief in the direction of the evidence from both studies (from T1 to T3), and those presented with mixed evidence would shift in the direction of the evidence of the first study (from T1 to T2) and shift back in response to the second (from T2 to T3).9 We also tested whether participants’ belief at T4 would significantly differ from their T1 belief, 24 hours after the presentation of evidence, as an extension of Anglin (2019). Following Anglin (2019), five planned contrasts were performed, comparing T1 to T2, T3, and T4, and T2 to T3 and T3 to T4. Only participants who reported beliefs at each time point were included in these analyses so that the T1 comparison would be the same for T2, T3, and T4. The significance level for all planned contrasts was set to p < .01 to control for multiple comparisons.
A-posteriori Exploratory Analyses. Non-significant contrasts were followed up with two one-sided tests (TOST; Lakens, 2017) to test for equivalence (i.e., to statistically reject the presence of an effect as large as the smallest effect size of interest). These analyses were suggested by the reviewers and were not preregistered. For the equivalence tests, we used d = .30 as the smallest effect size of interest, as we specified small-medium effects as the effect size in the power analysis and d = .30 was the smallest belief change effect observed in the contrasts performed in Anglin (2019).10 To perform the TOST analyses, we (1) converted d = .30 into raw-score equivalence bounds by multiplying d by the pooled standard deviation to obtain Δ, setting ΔL = -Δ and ΔU = Δ, (2) calculated the observed 90% confidence interval for the contrast, and (3) checked whether the observed 90% confidence interval fell within the bounds for d = .30 (i.e., [ΔL, ΔU]). The results of this test support the null hypothesis (i.e., suggest equivalence) if the 90% confidence interval falls within the lower and upper bounds, and are indeterminate, lacking power to support either the null or the alternative hypothesis (i.e., equivalence cannot be determined), if the 90% confidence interval falls outside of the bounds (Lakens, 2017).
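The three-step procedure above can be sketched in a few lines. The following is an illustrative, stdlib-only implementation, not the authors' analysis code; the normal approximation to the t critical value and the example data are our assumptions.

```python
from statistics import NormalDist, mean, stdev

def tost_equivalence(diffs, d_min=0.30, pooled_sd=None, alpha=0.05):
    """Sketch of the TOST steps described above (Lakens, 2017).

    diffs: per-participant difference scores between two time points.
    d_min: smallest effect size of interest (d = .30 in this paper).
    Returns (ci, bounds, equivalent): the observed 90% CI, the raw-score
    equivalence bounds [ΔL, ΔU], and whether the CI falls inside them.
    Uses a normal approximation to the t critical value (an assumption;
    reasonable at the cell sizes here, n ≈ 70-90 per condition).
    """
    n = len(diffs)
    m, s = mean(diffs), stdev(diffs)
    sd = s if pooled_sd is None else pooled_sd
    delta = d_min * sd                       # Step 1: Δ = d × pooled SD
    z = NormalDist().inv_cdf(1 - alpha)      # two one-sided 5% tests → 90% CI
    half = z * s / n ** 0.5
    ci = (m - half, m + half)                # Step 2: observed 90% CI
    equivalent = -delta < ci[0] and ci[1] < delta   # Step 3: containment
    return ci, (-delta, delta), equivalent

# Example: near-zero change with belief SDs around 2 (as in Table 1)
diffs = [0.2, -0.2, 0.1, -0.1, 0.0] * 20
ci, bounds, eq = tost_equivalence(diffs, pooled_sd=2.0)
print(eq)  # True: the 90% CI lies inside [-0.6, 0.6]
```

A contrast whose 90% CI strays outside [ΔL, ΔU] would instead return `equivalent = False`, corresponding to the "cannot determine equivalence" cells in Table 2.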
Belief Change. The main effects for research findings, F(3, 308) = 3.61, p = .01, ηp2 = .03, and time of assessment, F(3, 924) = 8.07, p < .001, ηp2 = .03, were qualified by the predicted research findings x time of assessment interaction, F(9, 924) = 11.25, p < .001, ηp2 = .10. The planned contrasts indicated that those presented with congruent effective evidence showed an increase in their belief in the death penalty’s effectiveness from T1 to T2, t(92) = -5.74, p < .001, d = -.60 [99% CI = -.88, -.30], and beliefs remained significantly different from T1 beliefs at T3, t(92) = -5.35, p < .001, d = -.55 [99% CI = -.84, -.27], and T4, t(92) = -6.27, p < .001, d = -.65 [99% CI = -.94, -.35] (see Table 1 and Figure 1). There was no significant change in beliefs from T2 to T3, t(92) = -0.36, p = .72, d = -.04 [99% CI = -.30, .23], or from T3 to T4, t(92) = -1.20, p = .24, d = -.12 [99% CI = -.39, .15]. TOSTs indicated equivalence between T2 and T3 and between T3 and T4 (see Table 2).
Table 1. Beliefs by condition and time of assessment. Column groups, left to right: Both Ineffective (n = 68), Both Effective (n = 93), Mixed; Ineffective First (n = 72), Mixed; Effective First (n = 79).

| Belief (Single-Item) | M (SD) | †d | *d | M (SD) | †d | *d | M (SD) | †d | *d | M (SD) | †d | *d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T1 | 4.10 (2.16) | -- | -- | 3.84 (2.20) | -- | -- | 4.03 (2.08) | -- | -- | 4.42 (2.01) | -- | -- |
| T2 | 4.12 (2.03) | -.01 | -- | 4.85ab (2.06) | -.60 | -- | 3.74 (1.91) | .21 | -- | 5.11ab (1.85) | -.55 | -- |
| T3 | 3.82 (2.03) | .16 | .25 | 4.88a (2.12) | -.55 | -.04 | 4.26b (1.93) | -.16 | -.44 | 4.65b (1.96) | -.19 | .42 |
| T4 | 3.65 (2.05) | .31 | .14 | 5.05a (2.05) | -.65 | -.12 | 4.10 (1.89) | -.06 | .15 | 4.52 (2.01) | -.08 | .15 |
Note. a Denotes a significant (p < .01) difference from the mean at T1. b Denotes a significant difference from the mean at the preceding time of assessment. †d = strength of difference from belief at T1. *d = strength of difference from belief at the previous time point. Higher scores represent a stronger belief in the death penalty’s effectiveness, rated on an 8-point scale.
Table 2. TOST equivalence analyses for the non-significant contrasts: observed 90% CI and equivalence bounds [ΔL, ΔU]. Column groups, left to right: Both Ineffective, Both Effective, Mixed; Ineffective First, Mixed; Effective First.

| | 90% CI | ΔL, ΔU | 90% CI | ΔL, ΔU | 90% CI | ΔL, ΔU | 90% CI | ΔL, ΔU |
|---|---|---|---|---|---|---|---|---|
| Beliefa | | | | | | | | |
| T1-T2 | -.21, .19 | -.50, .50 | -- | -- | .02, .40 | -.41, .41 | -- | -- |
| T1-T3 | -.04, .36 | -.53, .53 | -- | -- | -.33, .04 | -.44, .44 | -.38, -.01 | -.35, .35 |
| T1-T4 | -.11, .51 | -.44, .44 | -- | -- | -.25, .14 | -.37, .37 | -.26, .11 | -.39, .39 |
| T2-T3 | -.06, .57 | -.43, .43 | -.21, .13 | -.26, .26 | -.56, -.20 | -.35, .35 | -- | -- |
| T3-T4 | -.06, .34 | -.40, .40 | -.30, .05 | -.42, .42 | -.04, .35 | -.32, .32 | -.04, .33 | -.26, .26 |
Note. a Single-item measure.
Those presented with congruent ineffective evidence did not show a significant change in their belief at the .01 alpha level from T1 to T2, t(67) = -0.07, p = .94, d = -.01 [99% CI = -.32, .30], T1 to T3, t(67) = 1.32, p = .19, d = .16 [99% CI = -.16, .47], or T1 to T4, t(67) = 2.56, p = .013, d = .31 [99% CI = -.01, .63]. TOST analyses indicated equivalence between T1 and T2 and between T1 and T3. However, equivalence could not be determined between T1 and T4, suggesting that this non-significant finding could be due to a lack of power if the smallest effect size of interest was d = .30 (see Table 2). There was no significant change in beliefs from T2 to T3 at the .01 alpha level, t(68) = 2.10, p = .04, d = .25 [99% CI = -.06, .57], or from T3 to T4, t(68) = 1.19, p = .24, d = .14 [99% CI = -.17, .45]. TOSTs indicated that equivalence could not be determined between T2 and T3 but did suggest equivalence between T3 and T4 (see Table 2).
Participants presented with effective evidence followed by ineffective evidence showed an increase in their belief in the death penalty’s effectiveness from T1 to T2, t(78) = -4.89, p < .001, d = -.55 [99% CI = -.86, -.24], which decreased following the ineffective evidence from T2 to T3, t(78) = 3.76, p < .001, d = .42 [99% CI = .12, .72]; beliefs were not significantly different from T1 at the .01 alpha level following the ineffective evidence at T3, t(78) = -1.72, p = .09, d = -.19 [99% CI = -.49, .10] or at T4, t(78) = -0.69, p = .49, d = -.08 [99% CI = -.37, .21]. There was no significant change from T3 to T4, t(78) = 1.30, p = .20, d = .15 [99% CI = -.15, .44]. TOST analyses indicated that equivalence could not be determined between T1 and T3, and between T3 and T4, possibly due to a lack of power, but did suggest equivalence at T1 and T4 (see Table 2).
Those presented with ineffective evidence followed by effective evidence did not show a significant change in response to the ineffective evidence from T1 to T2, t(71) = 1.81, p = .08, d = .21 [99% CI = -.10, .52], though they did report stronger beliefs in the death penalty’s effectiveness from T2 to T3, t(71) = -3.74, p < .001, d = -.44 [99% CI = -.76, -.12]. There was no change in beliefs from T3 to T4, t(71) = 1.32, p = .19, d = .16 [99% CI = -.15, .46], and beliefs did not significantly differ from T1 at T3, t(71) = -1.37, p = .18, d = -.16 [99% CI = -.47, .15], or T4, t(71) = -0.48, p = .63, d = -.06 [99% CI = -.36, .25] (see Table 1 and Figure 1). TOST analyses indicated equivalence between T1 and T2, T1 and T3, and T1 and T4; however, equivalence could not be established between T2 and T3 or between T3 and T4 (see Table 2).
Belief Change Based on the (In)consistency of the Evidence with Prior Views. The preregistration specified regression analyses using belief change difference scores as an outcome to test whether participants showed more belief change when the evidence opposed vs. supported their initial views. Because there are problems with using difference scores as an outcome in longitudinal research (Castro-Schilo & Grimm, 2018), we instead conducted a series of mixed-model ANOVAs with time of assessment as a within-subjects factor and (in)consistency of the evidence with beliefs at T1 as a between-subjects factor. This specific analytic approach was not preregistered. The measures of belief at each time of assessment used in the analyses were calculated such that higher scores indicated a stronger belief in the direction of the evidence for those who received congruent evidence at all time points and a stronger belief in the direction of the most recently presented study for those who received mixed evidence. This was done by subtracting belief at each time point from 9 for those who most recently received evidence suggesting the death penalty is ineffective, and applying no transformation for those who most recently received evidence suggesting the death penalty is effective. We separated participants into evidence (in)consistency groups based on whether their responses were in the direction supporting or opposing the findings of the first study presented at T1; for T1-T2, there were two groups (first study consistent vs. inconsistent with beliefs at T1), and for T1-T3 and T1-T4, there were four groups (both studies consistent, both studies inconsistent, mixed/first study consistent, mixed/first study inconsistent).
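Concretely, the recoding described above amounts to reverse-scoring on the 8-point scale. A minimal sketch (the function and labels are ours, not from the study materials):

```python
def belief_in_evidence_direction(belief, last_evidence):
    """Score a 1-8 belief rating in the direction of the evidence.

    belief: rating of the death penalty's effectiveness (8-point scale).
    last_evidence: 'effective' or 'ineffective', the direction of the most
    recently presented study (labels are illustrative).
    Higher returned scores indicate a stronger belief in the direction of
    the evidence most recently presented.
    """
    # Reverse-code (9 - x on an 8-point scale) when the latest study
    # suggested the death penalty is ineffective; otherwise leave as is.
    return belief if last_evidence == "effective" else 9 - belief

print(belief_in_evidence_direction(2, "ineffective"))  # 7
print(belief_in_evidence_direction(6, "effective"))    # 6
```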
We predicted that participants presented with two belief-inconsistent studies would change their beliefs more in response to the evidence at each time point than would those who received two belief-consistent studies. We also predicted that participants first presented with a belief-inconsistent study would show more belief change from T1 to T2 than those first presented with a belief-consistent study. We made no prediction for those presented with mixed evidence at T3 and T4. We performed three analyses, testing for differences in belief change between those who received belief-consistent vs. belief-inconsistent evidence at each time point (T1 vs. T2, T1 vs. T3, and T1 vs. T4). Significant interactions were followed up with simple effects analyses testing for belief change in each (in)consistency-of-the-evidence group. We report the effect size for these differences (d) in Table 3.
Table 3. Beliefs in the direction of the evidence before and after its presentation.

| | Before Presentation: M | SD | After Presentation: M | SD | d |
|---|---|---|---|---|---|
| Study 1 | | | | | |
| Belief Change T1-T2 | | | | | |
| Belief-inconsistent | 2.62 | 0.98 | 3.73 | 1.69 | -.68 |
| Belief-consistent | 6.38 | 0.97 | 6.24 | 1.23 | .13 |
| Belief Change T1-T3 | | | | | |
| Belief-inconsistent | 2.52 | 1.02 | 4.05 | 1.94 | -.87 |
| Belief-consistent | 6.46 | 0.99 | 6.25 | 1.42 | .15 |
| Mixed; belief-inconsistent first | 6.23 | 0.93 | 5.81 | 1.25 | .34 |
| Mixed; belief-consistent first | 2.66 | 0.96 | 2.96 | 1.34 | -.22 |
| Belief Change T1-T4 | | | | | |
| Belief-inconsistent | 2.49 | 1.00 | 4.07 | 1.88 | -.84 |
| Belief-consistent | 6.50 | 1.02 | 6.56 | 1.25 | -.05 |
| Mixed; belief-inconsistent first | 6.27 | 0.90 | 5.81 | 1.31 | .38 |
| Mixed; belief-consistent first | 2.63 | 0.99 | 2.99 | 1.39 | -.30 |
| Study 2 | | | | | |
| Belief Change T1-T2 | | | | | |
| Belief-inconsistent | 2.67 | 0.79 | 3.11 | 1.07 | -.48 |
| Belief-consistent | 5.31 | 0.92 | 5.33 | 1.07 | -.04 |
| Belief Change T1-T3 | | | | | |
| Belief-inconsistent | 2.69 | 0.80 | 3.04 | 1.09 | -.36 |
| Belief-consistent | 5.32 | 0.93 | 5.37 | 1.09 | -.07 |
| Study 3 | | | | | |
| Belief Change T1-T2 | | | | | |
| Belief-inconsistent | 2.67 | 0.61 | 3.25 | 0.86 | -.73 |
| Belief-consistent | 5.39 | 0.71 | 5.58 | 0.82 | -.34 |
| Neutral | 4.00 | 0.00 | 4.10 | 0.41 | -.25 |
| Belief Change T1-T3 | | | | | |
| Belief-inconsistent | 2.68 | 0.59 | 3.15 | 0.83 | -.69 |
| Belief-consistent | 5.41 | 0.73 | 5.55 | 0.83 | -.22 |
| Neutral | 4.00 | 0.00 | 4.08 | 0.39 | -.21 |
Note. Higher values indicate a stronger belief in the direction of the evidence. For those who received mixed evidence in Study 1, higher values indicate a stronger belief in the direction of the most recently presented study. Study 1 contained a single-item belief measure, whereas Studies 2 and 3 used a composite 2-item (causal and correlational) belief measure. Belief items were rated on 8-point scales in Study 1 and 7-point scales in Studies 2 and 3. *p < .05.
Supporting our predictions, significant interactions between time of assessment and (in)consistency of the evidence group emerged for T1 to T2, F(1, 365) = 76.06, p < .001, ηp2 = .17, T1 to T3, F(3, 341) = 32.17, p < .001, ηp2 = .22, and T1 to T4, F(3, 308) = 30.41, p < .001, ηp2 = .23. Simple effects analyses revealed that, for T1-T2, those presented with belief-inconsistent evidence showed moderately strong belief change effects in the direction of the evidence presented, t(180) = -9.08, p < .001, d = -.68 [95% CI = -.84, -.51], whereas those presented with belief-consistent evidence did not show significant change in response to the evidence, t(185) = 1.83, p = .07, d = .13 [95% CI = -.01, .28]. Participants who received two belief-inconsistent studies showed large belief change effects in the direction of the evidence at T3, t(92) = -8.39, p < .001, d = -.87 [95% CI = -1.11, -.63], and T4, t(88) = -7.88, p < .001, d = -.84 [95% CI = -1.08, -.59], whereas those who received two belief-consistent studies did not significantly change their beliefs in response to the evidence at T3, t(80) = 1.36, p = .18, d = .15 [95% CI = -.07, .37], or T4, t(71) = -0.43, p = .67, d = -.05 [95% CI = -.28, .18]. Participants presented with a belief-inconsistent study followed by a belief-consistent one showed small belief change effects opposing the belief-consistent study at T3, t(78) = 3.00, p = .004, d = .34 [95% CI = .11, .56], and T4, t(69) = 3.19, p = .002, d = .38 [95% CI = .14, .62], whereas those presented with a belief-consistent study followed by a belief-inconsistent one showed small belief change effects in response to the belief-inconsistent study at T3, t(91) = -2.14, p = .04, d = -.22 [95% CI = -.43, -.02], and T4, t(80) = -2.67, p = .009, d = -.30 [95% CI = -.52, -.07].
Individual Difference Variables Associated with Belief Change
Descriptive statistics for the individual difference variables are available in the Supplementary Materials. We conducted regression analyses to examine the associations between each individual difference variable and belief change in the direction of the evidence presented. Our preregistration indicated that we would correlate each individual difference measure with belief change difference scores to examine these associations. However, as noted by the reviewers, given the problems with using difference scores in longitudinal research, we used regression analyses following Castro-Schilo and Grimm (2018).
Sommer et al. (2024) provide a useful framework for organizing individual differences that may influence how people process evidence, including representation/framing, goals/emotions, thinking dispositions, and knowledge and strategies. The individual differences examined in this research can be divided into these categories. In Study 1, the goal-related variables we measured were belief commitment, belief certainty, social desirability, support for science, and political orientation. For thinking dispositions, we measured actively open-minded thinking; for knowledge and strategies, we measured scientific reasoning and perceptions of scientific (un)certainty. Although we planned to examine each individual difference variable in a separate analysis, given the number of analyses this would entail, the exploratory nature of many of the variables we investigated, and the publication of Sommer et al.’s (2024) model after this research was conducted, we reduced the number of analyses by performing a regression for each category of variables (goals, thinking disposition, knowledge/strategies). Because Sommer et al. (2024) argue that these characteristics play a role in belief updating by influencing how people evaluate and process evidence, we analyzed evaluations of the evidence quality in a separate regression analysis. We opted against including the separate categories of variables in the same analysis given the lack of clear theoretical guidance regarding which variables operate independently, which interact, and which predominate or exert their influence through other variables.
Thus, we performed a series of a-posteriori exploratory regression analyses examining whether each set of variables predicted belief change at each time point (T2, T3, and T4). Following Castro-Schilo and Grimm (2018), we entered T1 belief (in the direction of the evidence) in Step 1 and the individual difference measures in Step 2 as predictors of belief (in the direction of the evidence) at the subsequent time point. Our predictions focused on those presented with a single pattern of evidence (all participants at T2 and those who received congruent evidence at T3 and T4), as it is unclear whether and how people change their beliefs in response to mixed evidence (e.g., Anglin, 2019; Jern et al., 2014). Therefore, as preregistered (and to be consistent with Studies 2 and 3, which only presented participants with a congruent pattern of findings), the analyses for T1-T2 included all participants because participants only received one direction of evidence at that point, and the analyses for T1-T3 and T1-T4 were performed on those who received two studies with congruent findings. For these analyses, the belief variables were coded in the direction of the evidence, such that higher scores indicated a stronger belief in the direction of evidence presented at T2 for all participants and at T3 and T4 for those who received congruent evidence.11
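The two-step residualized-change setup described above (T1 belief in Step 1, an individual-difference predictor in Step 2) can be illustrated with a minimal sketch. This is simulated data and a pure-stdlib least-squares solver, not the authors' analysis code; the "perceived evidence quality" predictor here is only an example.

```python
def ols(y, X):
    """Least squares via the normal equations: solve (X'X)b = X'y.

    y: outcome values; X: list of predictor rows (first entry 1.0 for the
    intercept). Gaussian elimination with partial pivoting.
    """
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for col in range(k):  # forward elimination
        piv = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, k):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, k):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    b = [0.0] * k
    for r in range(k - 1, -1, -1):  # back substitution
        b[r] = (xty[r] - sum(xtx[r][c] * b[c] for c in range(r + 1, k))) / xtx[r][r]
    return b

# Simulated participants: T2 belief generated from T1 belief plus a
# hypothetical "perceived evidence quality" score (no noise, for clarity).
t1      = [2.0, 3.0, 4.0, 5.0, 6.0, 3.5, 4.5, 2.5]
quality = [5.0, 4.0, 6.0, 3.0, 5.5, 6.5, 2.0, 4.5]
t2 = [0.5 + 0.7 * b + 0.3 * q for b, q in zip(t1, quality)]

# Step 1: autoregressive model (T1 belief only).
step1 = ols(t2, [[1.0, b] for b in t1])
# Step 2: add the individual-difference predictor.
step2 = ols(t2, [[1.0, b, q] for b, q in zip(t1, quality)])
print(step2)  # recovers [0.5, 0.7, 0.3] up to rounding
```

Because T1 belief is held in the model, the Step 2 coefficient reflects the predictor's association with change in belief from T1 to T2 rather than with the T2 level alone, which is the point of the Castro-Schilo and Grimm (2018) recommendation.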
Evaluations. We predicted that participants would shift their beliefs more in response to the evidence when they evaluated it more favorably. For these analyses, we used participants’ evaluations of the first study presented to predict belief change at T2, and the mean evaluations of both studies to predict belief change at T3 and T4 (for those who received congruent evidence). As predicted, perceived evidence quality predicted belief change in the direction of the evidence at T2, B = .26, SE = .04, β = .22, t = 6.44, p < .001 [95% CI = .18, .34], T3, B = .36, SE = .07, β = .29, t = 5.16, p < .001 [95% CI = .22, .50], and T4, B = .36, SE = .07, β = .29, t = 5.09, p < .001 [95% CI = .22, .50], controlling for belief in the direction of the evidence at T1. These findings suggest that perceived evidence quality was a predictor of belief change in response to the evidence at each time point.
Goals. We examined the relationships between belief commitment, belief certainty at T1, social desirability, support for science, and political orientation and belief change in an exploratory manner. There was no significant relationship between belief commitment, B = .01 [95% CI = -.11, .13], SE = .06, β = .01, t = 0.12, p = .90, belief certainty at T1, B = .02 [95% CI = -.07, .12], SE = .05, β = .02, t = 0.51, p = .61, support for science, B = -.01 [95% CI = -.16, .14], SE = .08, β = -.01, t = -0.18, p = .86, or political orientation, B = -.03 [95% CI = -.10, .03], SE = .03, β = -.04, t = -0.93, p = .36, and belief in the direction of the evidence at T2, controlling for belief in the direction of the evidence at T1. However, there was a significant relationship between social desirability and T2 belief, B = .06 [95% CI = .01, .11], SE = .03, β = .09, t = 2.39, p = .02, controlling for belief at T1, such that those who scored higher in social desirability reported greater belief change in response to the evidence at T2.
There was no significant relationship between belief commitment, B = -.02 [95% CI = -.22, .17], SE = .10, β = -.01, t = -0.18, p = .86, belief certainty at T1, B = -.13 [95% CI = -.28, .02], SE = .08, β = -.11, t = -1.72, p = .09, or support for science, B = .05 [95% CI = -.19, .28], SE = .12, β = .02, t = 0.38, p = .70, and belief in the direction of the evidence at T3, controlling for belief in the direction of the evidence at T1. The relationships between social desirability, B = .09 [95% CI = -.001, .17], SE = .04, β = .11, t = 1.96, p = .052, and political orientation, B = -.11 [95% CI = -.21, .001], SE = .05, β = -.12, t = -1.96, p = .052, with T3 belief also did not reach statistical significance, controlling for belief at T1.
There was no significant relationship between belief commitment, B = .03 [95% CI = -.17, .22], SE = .10, β = .02, t = 0.26, p = .79, belief certainty at T1, B = -.14 [95% CI = -.29, .01], SE = .08, β = -.12, t = -1.79, p = .08, support for science, B = .18 [95% CI = -.06, .42], SE = .12, β = .09, t = 1.50, p = .14, social desirability, B = .04 [95% CI = -.04, .13], SE = .04, β = .06, t = 0.98, p = .33, or political orientation, B = -.08 [95% CI = -.19, .03], SE = .06, β = -.09, t = -1.41, p = .16, and belief in the direction of the evidence at T4, controlling for belief in the direction of the evidence at T1.
Thinking Disposition. We tested the relationship between the thinking disposition AOT and belief change in an exploratory manner. There was no significant relationship between AOT and belief in the direction of the evidence at T2, B = .04 [95% CI = -.11, .19], SE = .08, β = .02, t = 0.49, p = .62, T3, B = .14 [95% CI = -.10, .39], SE = .13, β = .07, t = 1.15, p = .25, or T4, B = .11 [95% CI = -.14, .36], SE = .13, β = .05, t = 0.88, p = .38, controlling for belief in the direction of the evidence at T1.
Knowledge and Strategies. We also conducted exploratory analyses testing the relationships of SRS and scientific (un)certainty with belief change. For simplicity, we used the composite scientific (un)certainty measure in these analyses but present the analyses conducted with the scientific certainty and uncertainty subscales as separate predictors in the Supplementary Materials.
There was no significant relationship between SRS and belief in the direction of the evidence at T2, B = .01 [95% CI = -.06, .07], SE = .03, β = .01, t = 0.20, p = .84, controlling for belief in the direction of the evidence at T1. However, perceived scientific certainty was positively associated with belief in the direction of the evidence at T2, B = .45 [95% CI = .26, .65], SE = .10, β = .17, t = 4.56, p < .001, controlling for belief in the direction of the evidence at T1; participants who perceived science to be more certain reported greater belief change in response to the evidence at T2.
There was no significant relationship between SRS and belief in the direction of the evidence at T3, B = .07 [95% CI = -.04, .18], SE = .05, β = .08, t = 1.32, p = .19, controlling for belief in the direction of the evidence at T1. However, perceived scientific certainty was positively associated with belief in the direction of the evidence at T3, B = .63 [95% CI = .34, .92], SE = .15, β = .25, t = 4.23, p < .001, controlling for belief in the direction of the evidence at T1, such that participants who perceived science to be more certain showed greater belief change in response to the evidence at T3.
There was no significant relationship between SRS, B = .05 [95% CI = -.06, .16], SE = .06, β = .05, t = 0.91, p = .37, and belief in the direction of the evidence at T4, controlling for belief in the direction of the evidence at T1. Perceived scientific certainty was positively associated with belief in the direction of the evidence at T4, B = .53 [95% CI = .24, .82], SE = .15, β = .21, t = 3.60, p < .001, controlling for belief in the direction of the evidence at T1. Again, participants who perceived science to be more certain showed more belief change in response to the evidence at T4.
Discussion
Overall, participants shifted their beliefs in response to a clear pattern of belief-disconfirming findings, replicating Anglin (2019). Participants who read two studies with congruent results tended to shift their beliefs in response to the evidence, and those presented with two studies with conflicting findings shifted their beliefs in response to the first and shifted back in response to the second. However, this pattern primarily emerged among those presented with evidence supporting rather than opposing the death penalty’s effectiveness. As a whole, the sample initially leaned against believing in the death penalty’s effectiveness, and thus participants might have been more swayed by evidence that countered their initial beliefs. Indeed, as in Anglin (2019), belief change primarily occurred among those who received belief-inconsistent, rather than belief-consistent, evidence. Notably, equivalence could not be determined for some of the planned contrasts, suggesting that the study lacked power to detect differences of d=.30 in some cases, and was likely underpowered to detect smaller differences. Even so, these findings are consistent with recent studies showing belief updating in response to science consensus communications on polarized topics, particularly among those who were initially more skeptical of the scientific consensus (van Stekelenburg et al., 2022). People may be receptive to evidence challenging their views, particularly when the evidence is consistent and clear, perhaps because such evidence prompts greater reconsideration of their beliefs.
Study 1 also found that perceptions of the evidence quality and scientific certainty were associated with belief change: participants shifted their beliefs more in response to the evidence if they evaluated it more favorably and perceived science to be more certain. The other individual difference variables were not significantly associated with belief change, though the study was powered to detect only medium effects.
Thus, Study 1 largely replicated Anglin (2019), suggested that participants may maintain belief change in response to congruent evidence 24 hours after its presentation, and showed preliminary findings regarding the individual difference variables associated with belief change. Nonetheless, further evidence was needed to corroborate these findings given the exploratory manner in which the individual difference variables were examined and the fact that belief change was generally not observed among those presented with evidence challenging the death penalty’s effectiveness. We sought to further investigate these findings in Studies 2 and 3 using different stimulus study topics.
Studies 2 and 3
Studies 2 and 3 conceptually replicated Study 1 using different stimulus study topics (gun control and murder rates, and video games and aggression) and evidence from real studies to avoid deception and increase the external validity of the findings. To simplify the design and analyses, participants read evidence from a single study, and beliefs were thus measured at three time points rather than four. Based on the results of Study 1, we predicted that participants would shift their beliefs in response to the evidence and maintain a shift in the direction of the evidence 24 hours later.
Study 3 also extended the prior studies by testing whether people are sensitive to the evidence strength in updating their beliefs. Participants were either presented with evidence from a recent meta-analysis containing over 100 experimental and correlational studies or a single, more dated correlational study on video games and aggression, with positive or null results. We predicted that participants would change their beliefs more in response to stronger vs. weaker evidence for two reasons. First, as discussed below, Study 2 participants showed modest belief change effects, likely because they were presented with weak evidence (as noted in many participants’ open-ended responses). Second, studies from the attitude change literature have found that people show greater and more enduring attitude change in response to strong vs. weak arguments, at least when they have the motivation and ability to carefully evaluate the arguments (Petty & Cacioppo, 1986), and effortful processing may be encouraged by participating in a paid study (Paolacci & Chandler, 2014).
In Study 1, participants changed their beliefs more in response to congruent vs. conflicting evidence. Another feature of evidence quality is its relevance to the question under investigation, including how well the study methods test the research question. A correlational design would be adequate to test the relationship between variables, but an experimental design would provide stronger evidence for testing causality. Indeed, an important limitation to the stimulus studies from Study 1, identified by a subset of participants in their open-ended evaluations, was that the correlational nature of the research limited the ability to draw causal conclusions. In Studies 2 and 3, we explored whether participants are sensitive to the relevance of the evidence to the research question by measuring their correlational and causal beliefs about the question. Research has shown that the public varies in their understanding of science (Scheufele, 2013), and overall science literacy rates are low (J. D. Miller, 2004; National Science Board, 2018). Therefore, we were unsure whether participants would be sensitive to the relevance of the study design to the question at hand. In Study 2, all participants received correlational evidence, and in Study 3, participants were randomly assigned to receive either correlational evidence only or both correlational and causal evidence.
Studies 2 and 3 also tested the replicability of the associations between the individual difference variables and belief change from Study 1. Based on the findings from Study 1, we predicted that participants would show more belief change if the evidence was belief-inconsistent (vs. consistent), they evaluated it more favorably, and they perceived science to be more certain. The other individual difference variables were examined in an exploratory manner. In Studies 2 and 3, we added a measure of positive and negative affect before and after the stimulus study to examine whether individuals’ current affect and affective response to the evidence correlated with belief change.12 In Study 3, we also added measures of intellectual humility, objectivism, education, and the personal relevance of the research as additional possible correlates of belief change.
Method
Study Design
Study 2 followed a 2 (research findings: gun control ineffective vs. effective) x 3 (time of assessment: T1 [before evidence], T2 [after evidence], T3 [next day]) mixed-model design, with research findings as a between-subjects factor and time of assessment as a within-subjects factor. Study 3 followed a 2 (research findings: positive vs. null) x 2 (evidence strength: stronger vs. weaker) x 3 (time of assessment: T1 [before evidence], T2 [after evidence], T3 [next day]) mixed-model design, with research findings and evidence strength as between-subjects factors and time of assessment as a within-subjects factor.
Hypotheses. Based on previous research (Anglin, 2019) and Study 1, we predicted that research findings would interact with time of assessment, such that participants would shift their beliefs in response to the evidence at T2 and T3. In Study 3, we also expected a three-way research findings x evidence strength x time of assessment interaction, such that participants would shift their beliefs more in response to stronger vs. weaker evidence.
Participants
Participants residing in the United States were again recruited from Amazon’s Mechanical Turk to participate in a two-part study administered on consecutive days. In Study 2, a total of 301 participants (148 men, 151 women, 1 transgender, 1 other; Mage = 38.61, SD = 12.17) completed the first study session. An additional 27 began the study but dropped out before completing Part 1. The sample included 21 Hispanic or Latino, 4 American Indian or Alaska Native, 26 Asian, 28 Black or African American, 247 White, and 2 Other identifying participants.1 Of the 301 participants with full data from Part 1, 283 returned to complete Part 2.
In Study 3, a total of 400 participants (186 men, 210 women, 3 transgender, 1 other; Mage = 40.48, SD = 13.50) completed the first study session. An additional 25 began the study but dropped out before completing Part 1. The sample included 22 Hispanic or Latino, 10 American Indian or Alaska Native, 29 Asian, 28 Black or African American, 2 Native Hawaiian or Other Pacific Islander, 336 White, and 6 Other identifying participants.1 Of the 400 participants with full data from Part 1, 323 returned to complete Part 2. Participants were included in all analyses where they had data, unless otherwise specified.
Power Analysis. A power analysis conducted using G*Power 3.1 for Study 2 indicated that, for a 2x3 mixed-model ANOVA, N=116 is necessary to obtain small-medium effects (f = 0.15) at 95% power and α = .05. Because dropouts were anticipated between Parts 1 and 2, an initial sample of N=300 was targeted for Study 2. A sensitivity power analysis with these parameters and the sample size for participants who completed both sessions (N=283) indicated that the study was powered to detect small effects (f = 0.10) for the omnibus ANOVA. However, we overlooked the fact that our hypotheses were contingent on the follow-up planned contrasts. Therefore, we performed an additional sensitivity power analysis for a paired samples t-test at 95% power and α = .01 (the alpha level set for these contrasts), which indicated that the study was powered to detect ds = .35-.37 (based on the n for each analysis).
A power analysis conducted for a 2x2x3 mixed-model ANOVA for Study 3 indicated that N=160 is necessary to obtain small-medium effects (f = 0.15) at 95% power and α = .05. Because a factor was added to Study 3, a higher initial sample of N=400 was targeted. A sensitivity power analysis with these parameters and the sample size for participants who completed both sessions (N=323) indicated that the study was powered to detect small effects (f = 0.10) for the omnibus ANOVA. Again, however, we overlooked the fact that our hypotheses were contingent on the follow-up planned contrasts. We performed an additional sensitivity power analysis for a paired samples t-test at 95% power and α = .01 (the alpha level set for these contrasts), which indicated that the study was powered to detect ds = .47-.50 (based on the n for each analysis).
We did not conduct a priori power analyses for the analyses examining the associations between the individual difference variables and belief change. Sensitivity power analyses for linear multiple regressions at 80% power, α = .05, and 2-7 predictors (the number of predictors in the analyses we performed) indicated that Study 2 was powered to detect fs = .19-.23, and Study 3 was powered to detect fs = .17-.21. Because these analyses were largely exploratory, we did not adjust alpha. As such, the effect sizes the studies were powered to detect may be larger (e.g., setting the alpha to .01 would increase the effect size to fs of .22-.27 for Study 2 and .21-.25 for Study 3). Thus, overall, these studies were likely powered to detect medium rather than small size effects.
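The sensitivity analyses above can be sketched in code. The following is a minimal illustration assuming the statsmodels library rather than the G*Power software the analyses were run in, using the Study 2 planned-contrast group sizes as an example:

```python
# Sensitivity power analysis for a paired samples t-test: solve for the
# smallest detectable effect size d given n, alpha, and power, mirroring
# the reported settings (95% power, alpha = .01, two-sided).
from statsmodels.stats.power import TTestPower

analysis = TTestPower()  # one-sample/paired t-test power
for n in (148, 131):  # Study 2 planned-contrast group sizes
    d = analysis.solve_power(effect_size=None, nobs=n, alpha=0.01, power=0.95)
    print(f"n = {n}: smallest detectable d = {d:.2f}")
```

With these parameters, the solver returns values consistent with the ds of .35-.37 reported above; note that the smallest detectable effect grows as n shrinks.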
Materials and Procedure
The preregistration for Study 2 is available at https://osf.io/xencw13 and for Study 3 at https://osf.io/sgehf.
Beliefs and Position (T1-T3). At the beginning of Part 1, participants reported their initial (T1) beliefs and position via the following questions: What is your position on [gun control (Study 2); violent video games (Study 3)]? (1=strongly oppose, 7=strongly support; single-item position); I believe [background checks on gun sales are associated with ____ gun-related murder rates (Study 2); playing violent video games is related ____ to levels of aggression (Study 3)] (1=much lower, 7=much higher; correlational belief); I believe [background checks on gun sales ____ gun-related murder rates (Study 2); playing violent video games ____ levels of aggression (Study 3)] (1=strongly decrease(s), 7=strongly increase(s); causal belief); How certain are you that [background checks on gun sales affect gun-related murder rates (Study 2); playing violent video games affects aggression (Study 3)] in the way you specified? (1=very uncertain, 7=very certain; single-item belief certainty). Participants also completed these questions after reading and evaluating the stimulus study described below (i.e., at T2) and in the follow-up survey sent 24 hours later (T3), which participants had 24 hours to complete. The primary belief question was modified from Study 1 to differentiate between correlational and causal beliefs about the relationship between variables and to include a midpoint in the response scale (reducing the scale points from 8 to 7). However, the correlational and causal belief questions were strongly correlated at each time point in Study 2 (.63 ≤ r’s ≤ .88) and Study 3 (.88 ≤ r’s ≤ .94), suggesting that participants did not readily discriminate between the two.
Because the preregistration specified that the two would be combined if they were strongly correlated at r ≥ .70, and the correlations exceeded this benchmark at all but one time point in Study 2, we combined the two items into a composite belief measure at each time point (Study 2: .77 ≤ α’s ≤ .94; Study 3: .93 ≤ α’s ≤ .97).
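The combination rule described above can be illustrated with a short sketch. The data and variable names here are simulated placeholders, not the authors' materials:

```python
# Sketch of the preregistered composite rule: combine the correlational
# and causal items only when they correlate at r >= .70, and report
# Cronbach's alpha for the resulting 2-item composite.
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a participants x items matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

rng = np.random.default_rng(0)
correlational = rng.normal(4.0, 1.0, 300)            # simulated 7-point item
causal = correlational + rng.normal(0.0, 0.5, 300)   # strongly related item

r = np.corrcoef(correlational, causal)[0, 1]
items = np.column_stack([correlational, causal])
if r >= 0.70:  # the preregistered benchmark for combining the items
    composite = items.mean(axis=1)
    alpha = cronbach_alpha(items)
```

For a 2-item scale, alpha reduces to the Spearman-Brown formula 2r/(1 + r), so a strong inter-item correlation implies a high alpha.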
Affect (T1 and T2). Two filler questions followed the T1 belief and position questions. Next, participants completed a short positive and negative affect scale, rating the extent to which they felt each emotion at the present moment (T1; 1= not at all, 5= extremely; based on Drummond & Fischhoff, 2020; Ekman, 1992; Watson et al., 1988). This scale included 8 items in Study 2 (positive: happy, enthusiastic; negative: angry, fearful, disgusted, sad, annoyed, confused) and 10 items in Study 3 (uninterested and intrigued were added). A principal components analysis with Varimax rotation was performed on the items at T1 and T2, which indicated two factors, one for the positive affect items (α = .78, both studies) and one for the negative (α = .89, Study 2; α = .80, Study 3). One item, surprised, cross-loaded on both factors and was analyzed separately. Participants also completed the positive (α = .80, Study 2; α = .75, Study 3) and negative (α = .89, Study 2; α = .81, Study 3) affect measure after evaluating the study (at T2).14
Stimulus Studies. Participants then read the stimulus study summary. In Study 2, this summary described research examining the relationship between state gun control laws and gun-related murder rates (adapted from Tappin et al., 2021,15 describing Gius, 2015). The direction of the results was randomized across participants, such that the study either suggested that gun control is associated with lower or higher murder rates. In both conditions, the evidence was summarized from the same study (Gius, 2015), which found conflicting evidence regarding whether state gun control laws are associated with lower or higher murder rates. In Gius (2015), states with background checks on private gun sales had higher gun-related murder rates, whereas states with background checks on gun sales conducted through licensed dealers had lower gun-related murder rates. Those in the gun control ineffective condition read about the comparison and findings regarding checks on private gun sales, and those in the gun control effective condition read about the comparison and findings for checks on licensed dealers. To keep the study descriptions similar across conditions, participants read shorter summaries of the studies than in Study 1, with no critiques or rebuttals.
In Study 3, participants were randomly assigned to read one of four studies on video games and aggression, varying in evidence strength (stronger vs. weaker) and direction of the findings (positive vs. null results). Those who received stronger evidence read a recent meta-analysis containing correlational and experimental evidence from over 100 studies (positive results: Anderson et al., 2010; null results: Ferguson, 2015); those who received weaker evidence read a single, more dated correlational study examining the relationship between playing video games (in general, not just violent ones) and aggression (positive results: Fling et al., 1992; null results: Van Schie & Wiegman, 1997). See the Supplementary Materials to view the stimuli and a summary of how they were created.
Study Evaluations. After reading the study summary, participants evaluated the quality of the study via the two closed questions from Study 1 (α = .88, Study 2; α = .91, Study 3) and described in an open-ended manner why they thought the study did or did not support the argument that [gun control [decreases/increases] gun-related murder rates (Study 2); playing violent video games [is/is not] a risk factor for aggression (Study 3)].
Attention Check. Participants completed a multiple-choice attention check in which they were asked to recall the study findings (89% recalled correctly in Study 2, and 99% in Study 3). Because the overall pattern of results was not affected by whether participants who failed the checks were excluded or not, all participants were retained in the analyses, as specified in the preregistration.
Individual Difference Measures. At the end of Part 1, participants completed the individual difference measures from Study 1: belief commitment (α = .78, Study 2; α = .84, Study 3), social desirability (α = .78, Study 2; α = .73, Study 3), actively open-minded thinking (α = .79, Study 2 only), scientific reasoning ability (α = .71, Study 2; α = .73, Study 3), and perceptions of scientific (un)certainty.
Because two factors emerged from the scientific (un)certainty scale developed in Study 1, but only 2 items loaded on the second factor (which had low reliability), an additional 4 items were added to further assess the uncertainty dimension of the scale (e.g., “Scientific studies rarely provide complete answers to questions”, “The results of scientific studies are always tentative”). A principal components analysis with Varimax rotation performed on the 12 scientific (un)certainty items again yielded 2 dominant factors: scientific certainty (α = .75, Study 2; α = .80, Study 3) and scientific uncertainty (α = .78, Study 2; α = .77, Study 3). We also computed an overall composite measure as in Study 1, with higher scores indicating greater scientific certainty (α = .81, Study 2; α = .82, Study 3).
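The factoring step above can be sketched as follows. This is a hand-rolled illustration on simulated 12-item data (not the authors' analysis code): two principal components are extracted from the item correlation matrix and then rotated with Kaiser's varimax criterion.

```python
# PCA with varimax rotation on a simulated two-factor item set.
import numpy as np

def pca_loadings(X, n_components=2):
    """Unrotated PCA loadings from the item correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    vals, vecs = np.linalg.eigh(corr)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(vals[order])

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Kaiser's varimax rotation of a loading matrix."""
    p, k = L.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        LR = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (LR ** 3 - (gamma / p) * LR @ np.diag((LR ** 2).sum(axis=0))))
        R = u @ vt
        if s.sum() < var * (1 + tol):
            break
        var = s.sum()
    return L @ R

# Simulated responses to 12 items driven by two latent factors
# (e.g., a "certainty" block and an "uncertainty" block)
rng = np.random.default_rng(1)
f1, f2 = rng.normal(size=400), rng.normal(size=400)
X = np.empty((400, 12))
X[:, :6] = f1[:, None] + rng.normal(0, 0.6, (400, 6))
X[:, 6:] = f2[:, None] + rng.normal(0, 0.6, (400, 6))

L = pca_loadings(X, n_components=2)
L_rot = varimax(L)
# Rotation is orthogonal, so each item's communality (row sum of
# squared loadings) is unchanged; only simple structure improves.
```

After rotation, each item should load dominantly on one of the two components, mirroring the certainty/uncertainty split described above.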
In addition, in Study 3, participants completed intellectual humility (α = .84; Leary et al., 2017) and objectivism (α = .79; Leary et al., 1986) scales as two additional variables that may be linked to belief change. Intellectual humility involves recognizing the limits of one’s knowledge and that one’s beliefs may be wrong (Davis et al., 2016), and objectivism is a style of decision making in which individuals base their judgments on empirical rather than non-empirical information (Leary et al., 1986). Actively open-minded thinking was removed in Study 3 because its items more directly assess attitudes toward open-minded thinking than open-minded thinking itself, and because it was not related to belief change in the previous studies. The individual difference measures were presented in randomized order, except the scientific reasoning scale, which was presented last (due to the difference in its formatting).
Demographics and Support for Science. At the end of Part 1, participants answered demographic questions, including single-item questions assessing their religiosity and political orientation, along with the four support for science items from Study 1 (α’s = .87 for Studies 2 and 3), all rated on 7-point scales.
In Study 3, participants also answered questions about their level of education (1=no high school, 7=graduate or professional degree), whether they are a parent/guardian (yes: n = 194; no: n = 206), whether they plan to become a parent in the future (yes: n = 54; no: n = 94; unsure: n = 58), whether they allow or would allow their children to play video games in general (yes: n = 211; no: n = 22; unsure: n = 15) and violent video games in particular (yes: n = 99; no: n = 90; unsure: n = 59), how frequently they play video games (1=never, 5=everyday), and the types of video games they play (violent, somewhat violent, nonviolent, don’t play video games; coded as 1=play violent or somewhat violent games: n = 181, 2=don’t play violent video games: n = 222).
Results
Randomization Checks
Study 2. An independent samples t-test was conducted to test for differences in participants’ beliefs at T1 as a function of their randomly assigned condition (gun control effective vs. ineffective findings). No significant differences in T1 beliefs emerged between conditions, t(325) = 1.10, p = .27, d = .12.16
Study 3. For Study 3, a two-way research findings (video games increase vs. have no effect on aggression) x evidence strength (stronger vs. weaker) ANOVA was performed as a randomization check to test for differences in T1 beliefs. The main effects for research findings, F(1, 411) = 2.92, p = .09, ηp2 = .01, and evidence strength, F(1, 411) = 0.51, p = .48, ηp2 = .001, were non-significant. There was an interaction between research findings and evidence strength, F(1, 411) = 3.88, p = .05, ηp2 = .01. The difference in T1 beliefs between those who received strong and weak positive results, t(206) = 0.90, p = .37, d = .13 [95% CI = -.15, .40], and strong and weak null results, t(205) = -1.87, p = .06, d = -.26 [95% CI = -.53, .01], did not reach significance. However, those who received strong positive results (M = 4.80, SD = 1.03) reported a stronger baseline belief in a link between video games and aggression than those who received strong null results (M = 4.43, SD = 1.09), t(206) = 2.50, p = .01, d = .35 [95% CI = .07, .62]; those who received weak positive results (M = 4.67, SD = 0.97) did not differ in T1 beliefs from those who received weak null results (M = 4.70, SD = 0.96), t(205) = -0.19, p = .85, d = -.03 [95% CI = -.30, .25].
Study 3 Manipulation Check
A 2 (evidence strength: stronger vs. weaker) x 2 (research findings: positive vs. null) ANOVA was performed on perceived evidence quality to test whether participants rated the evidence from the meta-analysis to be stronger than the evidence from the single study. As predicted, there was a main effect of evidence strength, F(1, 407) = 78.67, p < .001, ηp2 = .16, such that participants rated the evidence from the meta-analysis (M = 5.41, SD = 1.19) as higher in quality than that from the single study (M = 4.25, SD = 1.45). The main effect for the direction of the findings, F(1, 407) = 3.76, p = .053, ηp2 = .01, and two-way interaction, F(1, 407) = 0.23, p = .64, ηp2 = .001, were nonsignificant.
Belief Change
Preregistered Confirmatory Analyses. For Study 2, a 2x3 mixed-model ANOVA, with research findings as a between-subjects factor (gun control effective vs. ineffective) and time of assessment as a within-subjects factor (T1, T2, T3), was performed to examine whether participants shifted their beliefs in response to the evidence. We predicted that research findings would interact with time of assessment, such that participants would shift their beliefs from T1 to T2 and T3 in the direction of the evidence presented.
In Study 3, a 2x2x3 mixed-model ANOVA, with direction of research findings (positive vs. null) and evidence strength (stronger vs. weaker) as between-subjects factors and time of assessment as a within-subjects factor (T1, T2, T3) was performed to examine whether participants shifted their beliefs in response to the evidence at T2 and T3, and changed their beliefs more in response to stronger vs. weaker evidence. We predicted that research findings would interact with time of assessment, such that participants would shift their beliefs in response to the evidence at T2 and T3. We also expected a three-way research findings x evidence strength x time of assessment interaction, such that participants would shift their beliefs more in response to stronger vs. weaker evidence.17 Three-way interactions were followed up with simple effects analyses performed on evidence strength to test whether the research findings x time of assessment interactions varied by evidence strength.
For both studies, significant research findings x time of assessment interactions were followed up with three planned contrasts testing for belief change from T1 to T2, T1 to T3, and T2 to T3. Only participants who reported beliefs at each time point were included in these analyses so that the T1 comparison would be the same at T2 and T3. The significance level was set to p < .01 to control for multiple comparisons.
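The planned-contrast logic can be sketched with simulated data. This is a minimal illustration assuming scipy; the belief values and variable names are illustrative, not the study data:

```python
# Paired t-tests on beliefs at T1 vs. T2, T1 vs. T3, and T2 vs. T3,
# evaluated against the alpha = .01 threshold set for the contrasts.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 148  # only participants with beliefs at all three time points
t1 = rng.normal(2.9, 1.0, n)
t2 = t1 + rng.normal(0.3, 0.8, n)   # simulated shift toward the evidence
t3 = t2 + rng.normal(0.0, 0.6, n)   # simulated persistence of the shift

results = {}
for label, a, b in [("T1-T2", t1, t2), ("T1-T3", t1, t3), ("T2-T3", t2, t3)]:
    res = stats.ttest_rel(a, b)
    diff = a - b
    d = diff.mean() / diff.std(ddof=1)   # Cohen's d for paired samples
    results[label] = (res.statistic, res.pvalue, d, res.pvalue < 0.01)
```

Here the sign convention matches the tables above: a negative d for T1-T2 indicates beliefs moved upward on the scale after the evidence.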
A-posteriori Exploratory Analyses. As in Study 1, non-significant contrasts were followed up with two one-sided tests (TOST; Lakens, 2017) to test for equivalence (i.e., statistically reject the presence of an effect based on the smallest effect size of interest), using d=.30 as the smallest effect size of interest.
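A hand-rolled sketch of the paired-samples TOST procedure follows, with the smallest effect size of interest set to d = .30 as above. This is illustrative code, not the TOSTER implementation from Lakens (2017), and the data are simulated:

```python
# Two one-sided tests (TOST) for equivalence on paired data: reject the
# presence of effects at least as large as the smallest effect of interest.
import numpy as np
from scipy import stats

def tost_paired(a, b, d_bound=0.30, alpha=0.05):
    """Return (p, equivalent): equivalence is claimed if p < alpha."""
    diff = np.asarray(a) - np.asarray(b)
    n = len(diff)
    sd = diff.std(ddof=1)
    se = sd / np.sqrt(n)
    delta = d_bound * sd                    # equivalence bound in raw units
    t_lower = (diff.mean() + delta) / se    # H0: true effect <= -delta
    t_upper = (diff.mean() - delta) / se    # H0: true effect >= +delta
    p = max(1 - stats.t.cdf(t_lower, n - 1), stats.t.cdf(t_upper, n - 1))
    return p, p < alpha

rng = np.random.default_rng(3)
t2 = rng.normal(3.2, 1.1, 148)
t3 = t2 + rng.normal(0.0, 0.1, 148)  # essentially no T2-to-T3 change
p, equivalent = tost_paired(t2, t3)
```

When the larger of the two one-sided p-values exceeds alpha, equivalence "could not be determined," as reported for several contrasts below.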
Study 2. The main effect for time of assessment was nonsignificant, F(2, 554) = 0.62, p = .54, ηp2 = .002. There was a significant main effect for research findings, F(1, 277) = 15.80, p < .001, ηp2 = .05, qualified by the predicted interaction between research findings and time of assessment, F(2, 554) = 15.45, p < .001, ηp2 = .05. The planned contrasts indicated that participants presented with evidence suggesting that gun control is ineffective and actually increases murder rates changed their beliefs in the direction of the evidence from T1 to T2, t(147) = -4.14, p < .001, d = -.34 [99% CI = -.56, -.12], but the difference from T1 to T3 was not below the .01 alpha level, t(150) = -2.57, p = .011, d = -.21 [99% CI = -.42, .003] (see Table 4 and Figure 2). Participants who received ineffective evidence showed no significant change in beliefs from T2 to T3, t(147) = 1.59, p = .11, d = .13 [99% CI = -.08, .34]. TOSTs indicated that equivalence could not be determined at T1 and T3 or T2 and T3, suggesting that the non-significant findings could be due to a lack of power (see Table 5). Participants presented with evidence suggesting that gun control is effective in decreasing murder rates changed their beliefs in the direction of the evidence from T1 to T2, t(130) = 3.09, p = .002, d = .27 [99% CI = .04, .50], and T1 to T3, t(131) = 2.87, p = .005, d = .25 [99% CI = .02, .48], but showed no change from T2 to T3, t(130) = 0.22, p = .83, d = .02 [99% CI = -.21, .24]. TOSTs suggested equivalence at T2 and T3.
Table 4.

| Beliefs (Composite) | Gun Control Ineffective Evidence (n = 148): M (SD) | †d | *d | Gun Control Effective Evidence (n = 131): M (SD) | †d | *d |
|---|---|---|---|---|---|---|
| T1 | 2.89 (1.03) | -- | -- | 2.74 (1.05) | -- | -- |
| T2 | 3.17ab (1.11) | -.34 | -- | 2.53 (0.92) | .27 | -- |
| T3 | 3.07 (1.13) | -.21 | .13 | 2.52a (0.96) | .25 | .02 |

Note. a Denotes a significant (p < .01) difference from the mean at T1. b Denotes a significant difference from the mean at the preceding time of assessment. †d = strength of difference from beliefs at T1. *d = strength of difference from beliefs at the previous time point. Higher scores indicate a stronger belief in gun control’s ineffectiveness; items were rated on 7-point scales.
Table 5.

| Variable | Gun Control Ineffective Evidence: 90% CI | ΔL, ΔU | Gun Control Effective Evidence: 90% CI | ΔL, ΔU |
|---|---|---|---|---|
| Beliefs (Composite) | | | | |
| T1-T2 | -- | -- | -- | -- |
| T1-T3 | -.34, -.07 | -.27, .27 | -- | -- |
| T2-T3 | -.01, .27 | -.21, .21 | -.13, .16 | -.18, .18 |
Summary. Although Study 2 found belief change in response to the evidence, the effects were relatively modest, possibly because participants were presented with only a brief overview of a single study. In Study 2, 26.3% of participants stated in their open-ended evaluations that the summary of the research did not provide enough detail to know what the researchers did and whether to trust the findings (see Supplementary Materials). Certainly, it is not always appropriate for people to change their beliefs in response to new evidence. Research is not always reliable (e.g., Open Science Collaboration, 2015), especially new findings (e.g., Pashler & de Ruiter, 2017; Pundi et al., 2020), and research conclusions are not always justified based on the study methods (e.g., Jussim et al., 2016). An important question is whether people consider the strength of scientific evidence in updating their beliefs in response to it, as weighing evidence strength is crucial to evaluating scientific conclusions and forming evidence-based opinions. People vary in their understanding of science and scientific uncertainty (Broomell & Kane, 2017; Rabinovich & Morton, 2012), and their ability to evaluate science (Drummond & Fischhoff, 2017) and draw valid conclusions from research studies. Thus, in addition to the direction of the findings, we manipulated the strength of the evidence in Study 3 to test whether people are sensitive to the evidence strength when updating beliefs in response to new evidence.
Study 3. There was a main effect for research findings, F(1, 313) = 33.38, p < .001, ηp2 = .10, and interaction between evidence strength and research findings, F(1, 313) = 5.58, p = .02, ηp2 = .02. Moreover, the predicted two-way interaction between time of assessment and research findings, F(2, 626) = 45.90, p < .001, ηp2 = .13, and three-way interaction among evidence strength, research findings, and time of assessment, F(2, 626) = 5.20, p = .006, ηp2 = .02, were significant. All other main effects and interactions were nonsignificant. The simple effects analyses revealed significant two-way time of assessment x research findings interactions among those who received stronger, F(2, 320) = 32.04, p < .001, ηp2 = .17, and weaker, F(2, 306) = 15.54, p < .001, ηp2 = .09, evidence. The planned contrasts indicated that those who received strong positive results reported a stronger belief that violent video games are linked to aggression at T2, t(82) = -6.55, p < .001, d = -.72 [99% CI = -1.04, -.40], and T3, t(82) = -4.12, p < .001, d = -.45 [99% CI = -.75, -.15], though the strength of this change weakened from T2 to T3, t(82) = 2.99, p = .004, d = .33 [99% CI = .04, .62]. A similar though weaker pattern emerged among those who received weak positive results: participants reported a stronger belief that violent video games are linked to aggression at T2, t(73) = -3.37, p = .001, d = -.39 [99% CI = -.70, -.08], and T3, t(73) = -3.12, p = .003, d = -.36 [99% CI = -.67, -.05]. Participants who received weak positive results showed no significant change from T2 to T3, t(73) = -0.33, p = .74, d = -.04 [99% CI = -.34, .26], though TOSTs indicated that equivalence could not be determined at T2 and T3. Participants who received strong null results reported a weaker belief that violent video games are linked to aggression at T2, t(78) = 3.86, p < .001, d = .44 [99% CI = .13, .74].
Belief change effects did not meet the .01 significance level from T1 to T3, t(78) = 2.60, p = .011, d = .29 [99% CI = -.01, .59], or T2 to T3, t(78) = -2.36, p = .02, d = -.27 [99% CI = -.56, .03]. TOSTs indicated that equivalence could not be determined at T1 and T3 or T2 and T3. A similar but slightly weaker pattern emerged among those who received weak null results: participants reported a weaker belief that violent video games are linked to aggression at T2, t(80) = 3.44, p < .001, d = .38 [99% CI = .09, .68], and T3, t(80) = 2.94, p = .004, d = .33 [99% CI = .03, .62] (see Table 6 and Figure 3). Participants who received weak null results showed no significant change from T2 to T3, t(80) = -0.88, p = .38, d = -.10 [99% CI = -.38, .19], though TOSTs indicated that equivalence could not be established at T2 and T3 (see Table 7).
Table 6.

| Beliefs (Composite) | Strong Evidence; Positive Results (n = 83): M (SD) | †d | *d | Weak Evidence; Positive Results (n = 74): M (SD) | †d | *d | Strong Evidence; Null Results (n = 79): M (SD) | †d | *d | Weak Evidence; Null Results (n = 81): M (SD) | †d | *d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T1 | 4.83 (1.02) | -- | -- | 4.76 (0.90) | -- | -- | 4.42 (1.12) | -- | -- | 4.65 (0.91) | -- | -- |
| T2 | 5.27ab (1.01) | -.72 | -- | 4.93ab (0.93) | -.39 | -- | 4.09ab (0.92) | .44 | -- | 4.45ab (0.81) | .38 | -- |
| T3 | 5.10a (1.11) | -.45 | .33 | 4.95 (0.94) | -.36 | -.04 | 4.20a (0.85) | .29 | -.27 | 4.49a (0.94) | .33 | -.10 |

Note. a Denotes a significant (p < .01) difference from the mean at T1. b Denotes a significant difference from the mean at the preceding time of assessment. †d = strength of difference from beliefs at T1. *d = strength of difference from beliefs at the previous time point. Higher scores indicate a stronger belief in the link between violent video games and aggression; items were rated on 7-point scales.
Table 7.

| Beliefsa | Strong Evidence; Positive Results: 90% CI | ΔL, ΔU | Weak Evidence; Positive Results: 90% CI | ΔL, ΔU | Strong Evidence; Null Results: 90% CI | ΔL, ΔU | Weak Evidence; Null Results: 90% CI | ΔL, ΔU |
|---|---|---|---|---|---|---|---|---|
| T1-T2 | -- | -- | -- | -- | -- | -- | -- | -- |
| T1-T3 | -- | -- | -- | -- | -.10, .48 | -.22, .22 | -- | -- |
| T2-T3 | -- | -- | -.23, .15 | -.11, .11 | -.45, -.08 | -.12, .12 | -.23, .09 | -.13, .13 |

Note. a Composite measure of causal and correlational belief items.
Belief Change Based on the (In)consistency of the Evidence with Initial Views. As in Study 1, we conducted a series of mixed-model ANOVAs with time of assessment as a within-subjects factor and (in)consistency of the evidence with beliefs at T1 (consistent vs. inconsistent) as a between-subjects factor to test whether participants changed their beliefs more when the evidence opposed vs. supported their initial views.18 The measures of belief at each time of assessment used in the analyses were calculated such that higher scores indicated a stronger belief in the direction of the evidence (i.e., belief at each time point was subtracted from 8 for those who received evidence suggesting gun control is effective/video games are linked to aggression, and no transformation was applied for those who received evidence suggesting that gun control is ineffective/video games are not linked to aggression). As in Study 1, we separated participants into belief-consistent and belief-inconsistent groups based on whether their initial responses fell above or below the midpoint of the scale in the direction supporting or opposing the research findings. We performed two ANOVAs testing for differences in belief change between those who received consistent vs. inconsistent evidence from T1-T2 and T1-T3. Significant interactions were followed up with simple effects analyses testing for belief change in the direction of the evidence for those who received consistent and inconsistent evidence. We report the effect size for these differences (d) in Table 3.
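The recoding described above can be sketched in a few lines. The values here are hypothetical, not the study data:

```python
# Reverse-score beliefs on the 1-7 scale (subtract from 8) for participants
# whose evidence pointed toward the low end of the scale, so that higher
# scores always mean a stronger belief in the direction of the evidence.
import numpy as np

beliefs_t1 = np.array([2.0, 6.5, 4.0, 3.0])   # four hypothetical participants
reverse_key = np.array([True, False, True, False])  # True = reverse-score

aligned = np.where(reverse_key, 8 - beliefs_t1, beliefs_t1)

# (In)consistency groups relative to the scale midpoint of 4
consistent = aligned > 4      # initial belief already matched the evidence
inconsistent = aligned < 4    # evidence opposed the initial belief
neutral = aligned == 4        # initial belief at the midpoint
```

On a 1-7 scale, subtracting from 8 maps 1 to 7 and 7 to 1, preserving the scale range while flipping its direction.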
Study 2. Supporting our predictions, significant interactions between time of assessment and (in)consistency of the evidence emerged for T1 to T2, F(1, 301) = 18.20, p < .001, ηp2 = .06, and T1 to T3, F(1, 280) = 8.19, p = .005, ηp2 = .03. Simple effects analyses revealed that those presented with belief-inconsistent evidence showed moderately strong belief change effects in the direction of the evidence presented at T2, t(152) = -5.87, p < .001, d = -.48 [95% CI = -.64, -.31], and T3, t(142) = -4.31, p < .001, d = -.36 [95% CI = -.53, -.19], whereas those presented with belief-consistent evidence did not show significant change in response to the evidence at T2, t(149) = -0.43, p = .67, d = -.04 [95% CI = -.20, .13], or T3, t(138) = -0.78, p = .44, d = -.07 [95% CI = -.23, .10] (see Table 3).
Study 3. Supporting our predictions, significant interactions between time of assessment and (in)consistency of the evidence emerged for T1 to T2, F(1, 253) = 20.79, p < .001, ηp2 = .08, and T1 to T3, F(1, 194) = 12.33, p < .001, ηp2 = .06. Simple effects analyses indicated that those presented with belief-inconsistent evidence showed large belief change effects in the direction of the evidence presented at T2, t(117) = -7.97, p < .001, d = -.73 [95% CI = -.94, -.53], and T3, t(86) = -6.42, p < .001, d = -.69 [95% CI = -.92, -.45], whereas those presented with belief-consistent evidence showed smaller belief change effects in response to the evidence at T2, t(136) = -4.02, p < .001, d = -.34 [95% CI = -.52, -.17], and T3, t(108) = -2.29, p = .02, d = -.22 [95% CI = -.41, -.03]. Participants who initially held a neutral belief (i.e., their beliefs fell at the midpoint) showed small belief change effects in response to the evidence at T2, t(147) = -3.02, p = .003, d = -.25 [95% CI = -.41, -.08], and T3, t(120) = -2.31, p = .02, d = -.21 [-.39, -.03] (see Table 3).
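The d values reported in these simple-effects analyses are paired-samples effect sizes. One common convention (mean of the T1-T2 differences divided by the standard deviation of the differences) can be sketched below; the paper does not state which d variant it used, so this is illustrative only.

```python
from statistics import mean, stdev

def paired_cohens_d(t1, t2):
    """Paired-samples Cohen's d: mean difference / SD of the differences.
    With beliefs coded in the direction of the evidence, a negative d here
    corresponds to belief change toward the evidence (T2 > T1)."""
    diffs = [a - b for a, b in zip(t1, t2)]
    return mean(diffs) / stdev(diffs)
```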
Individual Difference Variables Associated with Belief Change^19
Descriptive statistics for the individual difference variables are available in the Supplementary Materials. Regression analyses were conducted to examine the associations between each individual difference variable and belief change in the direction of the evidence presented,^20 following the same a posteriori exploratory analytic approach as in Study 1. We performed a series of regression analyses examining whether each set of variables predicted belief change at T2 and T3. We entered T1 beliefs (in the direction of the evidence) in Step 1 and the individual difference measures in Step 2 as predictors of beliefs (in the direction of the evidence) at the subsequent time point.
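This Step 1/Step 2 approach is a residualized-change regression: later beliefs are regressed on T1 beliefs plus the individual difference measure, so the predictor's coefficient indexes its association with belief change over and above initial beliefs. A minimal sketch with synthetic data (not the authors' code, which was presumably run in standard statistical software):

```python
import numpy as np

def step2_coefficient(t1_beliefs, predictor, later_beliefs):
    """Regress later beliefs on an intercept, T1 beliefs, and one
    individual-difference predictor; return the predictor's unstandardized
    coefficient (B), which indexes its association with belief change."""
    X = np.column_stack([np.ones_like(t1_beliefs), t1_beliefs, predictor])
    coefs, *_ = np.linalg.lstsq(X, later_beliefs, rcond=None)
    return coefs[2]
```

For instance, if T2 beliefs were generated as 0.5 × T1 beliefs + 0.2 × predictor, the function recovers B ≈ .20 for the predictor.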
Evaluations. We predicted that participants would shift their beliefs more in response to the evidence when they evaluated it more favorably.
Study 2. As predicted, in Study 2, perceived evidence quality predicted belief change in the direction of the evidence at T2, B = .20 [95% CI = .13, .27], SE = .04, β = .19, t = 5.54, p < .001, and T3, B = .20 [95% CI = .12, .27], SE = .04, β = .18, t = 4.94, p < .001, controlling for belief in the direction of the evidence at T1.
Study 3. Also supporting our predictions, in Study 3, perceived evidence quality predicted belief change in the direction of the evidence at T2, B = .15 [95% CI = .11, .19], SE = .02, β = .18, t = 7.70, p < .001, and T3, B = .12 [95% CI = .08, .17], SE = .02, β = .15, t = 5.70, p < .001, controlling for belief in the direction of the evidence at T1.
Goals. In Study 2, the goals we measured were belief commitment, belief certainty, social desirability, support for science, and political orientation. We measured these same goals in Study 3, along with personal relevance of the evidence, measured as frequency of video game play. We examined the associations between these variables and belief change in an exploratory manner.
Study 2. In Study 2, there was no significant relationship between belief commitment, B = -.04 [95% CI = -.13, .05], SE = .04, β = -.03, t = -0.92, p = .36, belief certainty at T1, B = -.04 [95% CI = -.11, .03], SE = .04, β = -.04, t = -1.11, p = .27, social desirability, B = .01 [95% CI = -.02, .04], SE = .02, β = .02, t = 0.69, p = .49, support for science, B = -.02 [95% CI = -.12, .08], SE = .05, β = -.01, t = -0.36, p = .72, and political orientation, B = .05 [95% CI = -.004, .11], SE = .03, β = .06, t = 1.83, p = .07, and belief in the direction of the evidence at T2, controlling for belief in the direction of the evidence at T1. There was also no significant relationship between belief commitment, B = -.07 [95% CI = -.16, .03], SE = .05, β = -.05, t = -1.39, p = .17, belief certainty at T1, B = -.05 [95% CI = -.12, .03], SE = .04, β = -.04, t = -1.17, p = .24, social desirability, B = .01 [95% CI = -.03, .04], SE = .02, β = .01, t = 0.39, p = .70, support for science, B = .002 [95% CI = -.11, .11], SE = .06, β = .001, t = 0.04, p = .97, and political orientation, B = .02 [95% CI = -.05, .08], SE = .03, β = .02, t = 0.49, p = .62, and belief in the direction of the evidence at T3, controlling for belief in the direction of the evidence at T1.
Study 3. In Study 3, there was no significant relationship between belief commitment, B = .02 [95% CI = -.03, .07], SE = .03, β = .03, t = 0.90, p = .37, belief certainty at T1, B = .004 [95% CI = -.04, .05], SE = .02, β = .01, t = 0.18, p = .85, social desirability, B = .01 [95% CI = -.01, .04], SE = .01, β = .03, t = 1.27, p = .21, support for science, B = .03 [95% CI = -.04, .09], SE = .03, β = .03, t = 0.87, p = .39, and political orientation, B = .004 [95% CI = -.03, .04], SE = .02, β = .01, t = 0.20, p = .84, and belief in the direction of the evidence at T2, controlling for belief in the direction of the evidence at T1. Personal relevance of the evidence was negatively associated with beliefs in the direction of the evidence at T2, B = -.06 [95% CI = -.10, -.02], SE = .02, β = -.07, t = -2.86, p = .004, controlling for T1 beliefs; participants who reported playing video games more frequently showed less belief change in response to the evidence.
There was no significant relationship between belief commitment, B = -.01 [95% CI = -.06, .05], SE = .03, β = -.01, t = -0.17, p = .87, belief certainty at T1, B = -.01 [95% CI = -.05, .04], SE = .02, β = -.01, t = -0.29, p = .77, social desirability, B = .004 [95% CI = -.02, .03], SE = .01, β = .01, t = 0.33, p = .74, support for science, B = .05 [95% CI = -.02, .12], SE = .04, β = .04, t = 1.46, p = .15, and political orientation, B = -.04 [95% CI = -.08, .003], SE = .02, β = -.06, t = -1.85, p = .07, and belief in the direction of the evidence at T3, controlling for belief in the direction of the evidence at T1. However, personal relevance of the evidence was significantly negatively correlated with belief in the direction of the evidence at T3, B = -.07 [95% CI = -.11, -.02], SE = .02, β = -.08, t = -2.96, p = .003, controlling for belief in the direction of the evidence at T1, such that those who played video games more frequently changed their beliefs less in response to the evidence.
Emotions. In both studies, we also included measures of positive affect, negative affect, and surprise to conduct exploratory analyses examining their relationship with belief change. Although Sommer et al. (2024) combined goals and emotions, we conducted separate analyses for goals and emotions given that they may influence one another in the processing of evidence (i.e., goals may affect the emotions individuals experience in response to evidence, and emotions may induce certain goals in processing evidence).
Study 2. Negative affect at T1 was positively associated with belief in the direction of the evidence at T2, B = .41 [95% CI = .11, .72], SE = .15, β = .15, t = 2.71, p = .007, and T3, B = .37 [95% CI = .03, .70], SE = .17, β = .12, t = 2.16, p = .03, and negative affect at T2 was negatively associated with belief change in the direction of the evidence at T2, B = -.42 [95% CI = -.70, -.15], SE = .14, β = -.18, t = -3.04, p = .003, and T3, B = -.40 [95% CI = -.69, -.10], SE = .15, β = -.16, t = -2.62, p = .01, controlling for belief in the direction of the evidence at T1. These findings suggest that greater negative affect before the presentation of evidence was associated with more belief change, but greater negative affect after the presentation of evidence was associated with less belief change at T2. There was a negative relationship between positive affect at T1 and belief in the direction of the evidence at T2, B = -.21 [95% CI = -.41, -.01], SE = .10, β = -.13, t = -2.07, p = .04; however, positive affect at T2 was unrelated to beliefs in the direction of the evidence at T2, B = .15 [95% CI = -.04, .34], SE = .10, β = .10, t = 1.51, p = .13, and positive affect at T1, B = -.03 [95% CI = -.25, .18], SE = .11, β = -.02, t = -0.28, p = .78, and T2, B = -.07 [95% CI = -.28, .14], SE = .11, β = -.04, t = -0.63, p = .53, were unrelated to beliefs in the direction of the evidence at T3, controlling for beliefs in the direction of the evidence at T1.
Surprise at T2 was related to belief in the direction of the evidence at T2, B = .23 [95% CI = .02, .23], SE = .05, β = .09, t = 2.35, p = .02, controlling for belief in the direction of the evidence at T1, suggesting that greater surprise after the presentation of evidence was associated with greater belief change in response to it. This relationship did not reach statistical significance at T3, B = .12 [95% CI = < -.001, .23], SE = .06, β = .08, t = 1.96, p = .051. Surprise at T1 was not significantly related to belief in the direction of the evidence at T2, B = -.02 [95% CI = -.16, .12], SE = .07, β = -.01, t = -0.32, p = .75, or T3, B = .08 [95% CI = -.07, .23], SE = .08, β = .04, t = 1.02, p = .31, controlling for beliefs in the direction of the evidence at T1.
Study 3. Negative affect at T1, B = -.003 [95% CI = -.26, .26], SE = .13, β = -.001, t = -0.02, p = .98, negative affect at T2, B = -.09 [95% CI = -.33, .16], SE = .12, β = -.03, t = -0.72, p = .48, positive affect at T1, B = -.01 [95% CI = -.15, .13], SE = .07, β = -.01, t = -0.14, p = .89, positive affect at T2, B = .004 [95% CI = -.13, .14], SE = .07, β = .003, t = 0.06, p = .96, and surprise at T1, B = .08 [95% CI = -.02, .18], SE = .05, β = .05, t = 1.63, p = .11, were not significantly related to belief in the direction of the evidence at T2, controlling for belief in the direction of the evidence at T1. However, surprise at T2 was positively associated with belief in the direction of the evidence at T2, B = .14 [95% CI = .07, .21], SE = .04, β = .11, t = 3.85, p < .001, controlling for T1 beliefs, suggesting that greater surprise after the presentation of evidence was associated with greater belief change in response to it.
Negative affect at T1, B = .14 [95% CI = -.16, .43], SE = .15, β = .05, t = 0.91, p = .36, negative affect at T2, B = -.13 [95% CI = -.41, .16], SE = .15, β = -.04, t = -0.87, p = .38, positive affect at T1, B = .02 [95% CI = -.14, .18], SE = .08, β = .01, t = 0.21, p = .84, positive affect at T2, B = .03 [95% CI = -.13, .19], SE = .08, β = .03, t = 0.40, p = .69, and surprise at T1, B = -.06 [95% CI = -.17, .05], SE = .06, β = -.03, t = -1.04, p = .30, were not significantly related to belief in the direction of the evidence at T3, controlling for belief in the direction of the evidence at T1. However, surprise at T2 was positively associated with belief in the direction of the evidence at T3, B = .10 [95% CI = .02, .18], SE = .04, β = .08, t = 2.55, p = .01, controlling for T1 beliefs, again suggesting that greater surprise after the presentation of evidence was associated with greater belief change in response to it.
Notably, in both studies, participants reported relatively neutral levels of positive affect and low levels of negative affect at T1 and T2 (see Table S1, Supplementary Materials).
Thinking Disposition. We also conducted exploratory analyses testing the relationship between the thinking dispositions AOT (Study 2), intellectual humility (Study 3), and objectivism (Study 3) and belief change.
Study 2. In Study 2, AOT was unrelated to belief in the direction of the evidence at T2, B = -.01 [95% CI = -.12, .11], SE = .06, β = -.003, t = -0.09, p = .93, and T3, B = -.01 [95% CI = -.13, .12], SE = .06, β = -.004, t = -0.14, p = .89, controlling for belief in the direction of the evidence at T1.
Study 3. In Study 3, intellectual humility was unrelated to belief in the direction of the evidence at T2, B = .03 [95% CI = -.05, .11], SE = .04, β = .02, t = 0.66, p = .51, and T3, B = .06 [95% CI = -.03, .15], SE = .05, β = .04, t = 1.29, p = .20, and objectivism was unrelated to belief in the direction of the evidence at T2, B = .01 [95% CI = -.08, .10], SE = .05, β = .01, t = 0.23, p = .82, and T3, B = .02 [95% CI = -.07, .12], SE = .05, β = .01, t = 0.48, p = .63, controlling for belief in the direction of the evidence at T1.
Knowledge and Strategies. In both studies, we assessed two knowledge/strategies variables, scientific reasoning and perceptions of scientific (un)certainty, along with education in Study 3. As in Study 1, we used the composite scientific (un)certainty measure in the analyses for simplicity, but present the analyses conducted with the scientific certainty and uncertainty subscales as separate predictors in the Supplementary Materials. We predicted that scientific certainty would be positively associated with belief change. Scientific reasoning and education (Study 3) were examined in an exploratory manner.
Study 2. As predicted, scientific certainty was positively associated with belief in the direction of the evidence at T2, B = .22 [95% CI = .09, .34], SE = .06, β = .10, t = 3.45, p < .001, and T3, B = .24 [95% CI = .11, .38], SE = .07, β = .11, t = 3.48, p < .001, controlling for belief in the direction of the evidence at T1, indicating that participants who perceived science to be more certain changed their beliefs more in response to the evidence.
SRS was negatively associated with beliefs in the direction of the evidence at T2, B = -.05 [95% CI = -.09, -.02], SE = .02, β = -.09, t = -3.00, p = .003, controlling for belief in the direction of the evidence at T1, suggesting that participants who scored higher in scientific reasoning changed their beliefs less in response to the evidence. However, SRS was not significantly related to belief in the direction of the evidence at T3, B = -.03 [95% CI = -.07, .01], SE = .02, β = -.05, t = -1.56, p = .12, controlling for T1 beliefs.
Study 3. As predicted, scientific certainty was positively associated with belief in the direction of the evidence at T2, B = .14 [95% CI = .06, .22], SE = .04, β = .09, t = 3.38, p < .001, and T3, B = .12 [95% CI = .04, .21], SE = .04, β = .08, t = 2.85, p = .005, controlling for belief in the direction of the evidence at T1; participants who perceived science to be more certain changed their beliefs more in response to the evidence. SRS was not significantly related to belief in the direction of the evidence at T2, B = -.02 [95% CI = -.04, .01], SE = .01, β = -.04, t = -1.39, p = .17, or T3, B = .003 [95% CI = -.02, .03], SE = .01, β = .01, t = 0.24, p = .81, controlling for beliefs at T1. Education was also unrelated to beliefs in the direction of the evidence at T2, B = .02 [95% CI = -.02, .07], SE = .02, β = .02, t = 0.95, p = .34, and T3, B = -.01 [95% CI = -.06, .05], SE = .03, β = -.01, t = -0.29, p = .77, controlling for T1 beliefs.
Discussion
Overall, Studies 2 and 3 replicated Study 1, showing that participants shifted their beliefs in response to the evidence, maintained this change 24 hours later, and that belief change primarily occurred among those who received belief-inconsistent, rather than belief-consistent, evidence. In a few cases, the planned contrasts directionally supported our hypotheses but were non-significant, and equivalence could not be determined for belief persistence, likely due to insufficient power. Study 3 extended Studies 1 and 2 by examining whether participants were sensitive to the evidence strength in updating their beliefs. Participants were indeed sensitive to the evidence strength immediately after its presentation, but their initial sensitivity appeared to wear off by the next day. The randomization checks suggested that Study 3 participants who received strong positive results reported a stronger initial belief in the link between video games and aggression; however, overall, participants still shifted their beliefs more when presented with evidence opposing (vs. supporting) their initial beliefs. Across both studies, participants’ causal and correlational beliefs about the relationship between variables in the stimulus studies were strongly correlated, indicating that participants did not readily discriminate between correlational and causal evidence.
As in Study 1, participants who perceived the evidence to be higher in quality and science to be more certain showed more belief change in response to the evidence. Also consistent with the results from Study 1, belief commitment and certainty, support for science, political orientation, social desirability, and actively open-minded thinking were not significantly correlated with belief change. Scientific reasoning was negatively associated with belief change after the presentation of evidence in Study 2 but not Study 3, possibly because participants were presented with less evidence in Study 2 than Study 3, making those high in scientific reasoning less likely to change their beliefs in response to evidence of questionable quality. Studies 2 and 3 built on Study 1 to examine the relationship between affect and belief change, including one’s affective state before and after the presentation of evidence. Positive affect before and after the presentation of evidence was not consistently linked with belief change. Negative affect showed some associations with belief change in Study 2 but not Study 3. Surprise after the presentation of evidence tended to correlate with belief change in response to the evidence in both studies, supporting recent research in which surprise was found to predict greater receptiveness to evidence (Drummond & Fischhoff, 2020). Study 3 examined additional possible individual difference variables associated with belief change: intellectual humility, objectivism, education, and personal relevance of the evidence. Only personal relevance of the evidence was associated with belief change, such that participants shifted their beliefs less when the research question was more personally relevant to them.
General Discussion
The present research replicated and extended recent research on belief change in response to empirical evidence on polarized topics (Anglin, 2019), investigating whether people maintain belief change one day later, whether they change their beliefs more in response to stronger evidence, and which individual differences are associated with belief change. Across three studies and topics, participants shifted their beliefs in response to the evidence, tended to maintain this change 24 hours later, and exhibited the most belief change when the evidence opposed their initial views. These findings replicate Anglin (2019), showing belief change in response to belief-disconfirming empirical evidence, and extend this work by demonstrating that belief change can persist 24 hours later. These results suggest that the belief change effects do not simply reflect social desirability and that empirical evidence can impact people’s beliefs on polarized topics, even—or perhaps especially—when it conflicts with their prior views.
Studies 2 and 3 used evidence from real scientific studies, increasing the external validity of the findings. However, participants may be more motivated to closely read and process evidence presented to them in a paid study than they would in their daily lives (Paolacci & Chandler, 2014). Anglin (2019) observed belief change in response to evidence among both paid and unpaid samples; nonetheless, future research is needed to examine whether these processes also occur in naturalistic settings.
Study 1 also replicated previous research in observing greater belief change in response to converging vs. conflicting evidence (Anglin, 2019), demonstrating that people take into account the consistency of evidence across studies, at least in the short-term. In Study 3, participants updated their beliefs more in response to stronger vs. weaker evidence but were more sensitive to the evidence strength initially than the next day. This finding may reflect a sleeper effect, in which people forget aspects of a persuasive argument over time (Kumkale & Albarracín, 2004), in that participants might have remembered the conclusions from the study but forgotten aspects of the evidence quality (i.e., the study methods and results) the next day. Although participants were sensitive to the consistency of evidence, they did not readily discriminate between correlational vs. causal findings, supporting research on gaps in the public’s understanding of science (J. D. Miller, 2004; National Science Board, 2018; Scheufele, 2013), such as the tendency to conflate correlation with causation (Bleske-Rechek et al., 2015; Norris et al., 2003).
Thus, overall, the present findings support the results of other recent studies showing the success of scientific consensus messages in aligning people’s beliefs with the evidence on polarized topics, particularly among skeptics (van Stekelenburg et al., 2022). These findings may provide support for the deficit model of science communication (Sturgis & Allum, 2004), which suggests that people hold scientifically unsupported beliefs because they lack scientific knowledge and literacy but would form more scientifically supported beliefs with proper knowledge and education (see also, Pennycook et al., 2022).
Participants in these and other studies (e.g., Anglin, 2019; Carey et al., 2022; Rosman & Grösser, 2024; van Stekelenburg et al., 2022) might have shown belief change in response to belief-inconsistent evidence because the evidence induced negative arousal due to its dissonance with their expectations and worldview, and participants shifted their beliefs to accommodate the evidence and reduce the aversive state experienced from the dissonance (Proulx et al., 2012; Sleegers et al., 2019). Belief change effects persisted the next day, when the dissonance would likely have diminished, though it is possible that the act of reporting beliefs before reading belief-threatening evidence exacerbated the dissonance participants experienced and was responsible for producing lasting belief change. Because people typically do not assert their stance on an issue before reading evidence on a topic in real-world situations, future research should seek to measure initial beliefs at a time point before participants take the study to test whether the belief change effects observed in this research are contingent on reporting beliefs prior to the manipulation.
Importantly, although the overall pattern of results across the three studies supports these conclusions, some of the planned contrasts testing for belief change in response to particular stimulus research findings (e.g., death penalty ineffective evidence in Study 1) or at particular time points were non-significant, and equivalence could not be determined for several predicted null effects. These indeterminate results may be attributed to powering the studies for the overall ANOVAs but not fully for the follow-up planned contrasts. Future research should seek to replicate the present findings in larger samples and to examine whether belief change in response to scientific evidence on polarized topics persists beyond 24 hours. Indeed, additional studies are needed to better understand whether belief change persists or is short-lived, both in the absence of repetition and even with it.
This research further built on prior work by examining individual difference variables associated with belief change in response to evidence on polarized topics. Replicating previous research (Anglin, 2019), participants showed more belief change if they rated the evidence as higher in quality. These findings support models of belief updating suggesting that belief change occurs as a function of evidence evaluation processes (Sommer et al., 2024). In Study 3, participants presented with stronger evidence rated it as higher in quality than those presented with weaker evidence, and participants tended to rate the evidence as stronger in Study 1 when presented with two studies and extensive detail about the studies than in Study 2 when presented with only a brief description of a single study. Thus, participants were influenced by both objective differences in the evidence quality and their subjective perceptions of it, and subjective perceptions were related to objective differences in the actual evidence quality. These findings support recent research indicating that detailed science communications are more effective than short ones (Chan & Albarracín, 2023) and provide further evidence that people can form scientifically supported beliefs when provided with the scientific knowledge to do so (Sturgis & Allum, 2004).
Participants also shifted their beliefs more in response to empirical evidence when they perceived science to be more certain. This finding suggests that people who view science as more conclusive are more convinced by new evidence whereas those who view science as tentative are more hesitant to update their beliefs. It is unclear whether individuals who perceive science to be more uncertain value holding empirically supported views but require more evidence to update their beliefs, given their understanding of the scientific process and the tentative nature of science, or whether they showed less belief change because they distrust science. Research suggests that directly communicating the uncertainty of scientific results, at least numerically, does not reduce trust in the findings (van der Bles et al., 2020). However, broad verbal statements of uncertainty may reduce trust somewhat (van der Bles et al., 2020), and indirect uncertainty involves questioning the quality or credibility of the evidence, which may be associated with lower trust (van der Bles et al., 2019). Because the questions measuring perceptions of scientific uncertainty in this research assessed verbal and indirect uncertainty, greater perceptions of scientific uncertainty might reflect participants’ distrust in science rather than an understanding of the inherent uncertainty of science. The literature suggests that the effectiveness of scientific uncertainty communications varies substantially based on individual differences and across contexts and topics (van der Bles et al., 2019), but these effects are not currently well-understood. Therefore, further research is needed on the role of individual differences and contextual factors in how people view scientific uncertainty and use it to make judgments and decisions.
Future research may also benefit from examining the relationship between epistemological beliefs (individuals’ general beliefs about the objectivity vs. subjectivity of knowledge; e.g., Kuhn et al., 2000) and belief updating in response to new evidence, as epistemological beliefs are conceptually related to scientific (un)certainty and may shape how people respond to new evidence and knowledge across a variety of contexts. However, research suggests that epistemic beliefs are difficult to access through introspection (Hofer & Pintrich, 1997), raising the question of whether participants also struggled to answer the scientific (un)certainty questions in this research. Perceptions of scientific certainty were consistently related to belief change across studies, suggesting that our measure was capturing meaningful beliefs participants held; however, future research is needed to further validate the scientific (un)certainty scale.
Individual differences in goals related to participants’ beliefs (belief commitment, belief certainty, support for science, political ideology, and social desirability), thinking dispositions (actively open-minded thinking, intellectual humility, and objectivism), and scientific reasoning and education did not reliably correlate with belief change. Previous research has found that individuals who score higher in intellectual humility tend to be more open to new evidence and opposing viewpoints (Bowes et al., 2022; McDiarmid et al., 2021; Porter & Schumann, 2017). In addition, Rosman and Grösser (2024) found that trust in science predicted belief updating in response to evidence on a controversial topic. Support for science and education may not have correlated with belief change in response to the evidence in the present research because individuals are not uncritically accepting of scientific evidence and take into account a variety of factors—including the evidence strength (Rosman & Grösser, 2024), their perceptions of its quality, their perceptions of the particular research topic based on its ideological underpinnings (Pew Research Center, 2015; Rutjens et al., 2018), and personal and social interests (Hornsey & Fielding, 2017)—when updating their beliefs. Indeed, belief updating and evidence evaluation processes may differ when people evaluate summaries of scientific studies than when they are presented with statements proposed to be supported by science and scientific consensus on a topic; belief updating in response to scientific statements and consensus information may be more closely linked to whether individuals trust science (e.g., Rosman & Grösser, 2024), whereas belief updating in response to research summaries may be more closely tied to evaluations of the research.
Scientific reasoning ability was negatively correlated with belief change in Study 2, perhaps because the evidence presented was weaker in quality than the evidence presented in the other studies. Study 3 measured the personal relevance of the evidence to participants and found a negative correlation between frequency of playing video games and belief change in response to the evidence. Studies on additional topics are needed to determine whether people are generally more resistant to changing their beliefs on topics that bear on their lifestyle and decisions.
In Studies 2 and 3, positive and negative affect did not consistently correlate with belief change in response to the evidence. The only exception was surprise: greater surprise in response to the evidence was generally associated with belief change. This finding is consistent with research suggesting that unexpectedness or surprise may predict greater receptiveness to evidence (Drummond & Fischhoff, 2020), as it induces uncertainty (Lerner & Keltner, 2000) and increases the perceived credibility of the findings (Wallace et al., 2020). The lack of overall relationships between positive and negative affect and belief change may be attributed to the fact that the evidence did not elicit strong emotions. Though the emotions reported were similar in strength to those in other studies that examined emotional responses to polarized evidence (e.g., Drummond & Fischhoff, 2020) or specifically manipulated emotions (Weeks, 2015), it is possible that the research topics and evidence used as stimuli in this research were not perceived to be as strongly polarizing as other debated and controversial science topics. Furthermore, we did not specifically recruit participants with polarizing views. People with extreme beliefs on the topics might have been more resistant to belief-inconsistent evidence than participants were in the present research.
Our results suggest that features of the evidence, individuals’ perceptions of it, and scientific certainty may be more strongly related to belief change in response to scientific evidence than individual differences in beliefs, emotions, and thinking dispositions. Nonetheless, there was variability in some of the relationships between the individual difference variables and belief change across studies and time points, these analyses were largely exploratory, and sensitivity power analyses suggested that the studies lacked power to detect smaller-than-medium-sized correlations. In addition, some of the significant relationships may be false positives, given the number of exploratory analyses performed. As such, it is unclear whether some of the individual difference variables play a smaller role in facilitating or impeding belief change and whether the findings would replicate in a larger sample. It is also unclear how these variables might interact, including with other unmeasured variables, to predict belief change.
Moreover, an important limitation to this research is that the individual difference variables were measured after the presentation of evidence. We made this decision to maintain the integrity of the replication. Studies suggest that completing individual difference measures, such as scientific reasoning, before reading summaries of scientific studies may alter how people respond to the evidence (e.g., Drummond & Fischhoff, 2019). We were concerned that several of the other variables (e.g., belief commitment, political orientation, support for science, scientific uncertainty, AOT, intellectual humility, objectivism) could have a similar effect, making participants more receptive or resistant to the evidence than they would naturally be. Indeed, research has shown that affirming beliefs related to a subsequently presented argument makes individuals more resistant or receptive to the evidence, depending on whether the affirmation is compatible or incompatible with the persuasive message (Jacks & O’Brien, 2004). Rating one’s belief commitment, support for science, actively open-minded thinking, etc. before reading the evidence might function as an affirmation and alter how participants respond to the evidence. In addition, some of the variables needed to be measured after the presentation of evidence (e.g., perceptions of the evidence quality and affect after the evidence). Even so, it is possible that reading about the research altered how participants responded to the individual difference measures and influenced the relationships observed with belief change. Future studies would benefit from administering the individual difference measures in an initial, separate session to test the replicability of the present findings.
In sum, this research adds to our understanding of how people respond to scientific evidence on polarized topics, finding that participants shifted their beliefs in response to the evidence and maintained this change 24 hours later, particularly when the evidence opposed their beliefs, they perceived the evidence to be higher in quality, and they rated science as more certain. Although this research suggests that people can be receptive to scientific evidence opposing their beliefs on polarized topics, future research is needed to examine the persistence and predictors of belief change in response to evidence over a longer time frame, across topics, and in response to evidence varying in strength. If people only slightly shift their beliefs in response to strong, converging scientific evidence for some topics, shift back over time, even with reinforcement (Carey et al., 2022), and are only attuned to the strength of evidence initially, significant barriers to effectively changing beliefs would exist, despite the belief change effects observed in this research. Because weighing the strength of scientific evidence to support claims is crucial to making informed, evidence-based decisions, future research should seek to better understand the factors people rely on to evaluate scientific evidence and how people apply evidence from research studies to make decisions.
Open Practices
Open Data and Materials: The study materials, data files, and analysis scripts for all studies are available on this paper’s Open Science Framework project page at https://osf.io/gfp48/.
Preregistration: The preregistrations for each study are available at: https://osf.io/2w5uy (Study 1), https://osf.io/xencw (Study 2), https://osf.io/sgehf (Study 3).
Contributions
Contributed to conception and design: SMA, ER, JY
Contributed to acquisition of data: SMA
Contributed to analysis and interpretation of data in main text: SMA
Contributed to analysis of open-ended responses in supplementary materials: SMA, JY, NAM
Drafted the article: SMA
Revised the article: SMA, NAM
Approved the submitted version for publication: SMA, ER, JY, NAM
Acknowledgements
We thank John Daley, Alexandra Deku, and Marlendy Elysee for their help coding qualitative responses.
Author Note
This research was conducted while the authors were affiliated with Hobart and William Smith Colleges.
Funding Information
There was no external funding source for this research. The project was supported by internal funding to the first author. The funding source had no involvement in any stage of this research.
Competing Interests
The authors declare that they have no known competing interests or personal relationships that could have appeared to influence the work reported in this paper.
Data Accessibility Statement
The study materials, data files, and analysis scripts for all studies can be found on this paper’s Open Science Framework project page at https://osf.io/gfp48/. The preregistrations are available at: https://osf.io/2w5uy (Study 1), https://osf.io/xencw (Study 2), https://osf.io/sgehf (Study 3).
Footnotes
Some prior studies presented participants with a single pattern of belief-disconfirming findings and found that participants shifted their beliefs in response to the evidence, but the results were interpreted as demonstrating resistance to persuasion (e.g., Cohen et al., 2000; Zuwerink & Devine, 1996).
Although research has found various strategies to be effective in promoting belief change in response to evidence on controversial or contested topics (e.g., Bago et al., 2020; Lord et al., 1984), here we focus on belief change in response to evidence without such additional strategies.
Although this hypothesis was not specified in the preregistration for Study 1, it was supported by previous research using the same methodology (Study 4 of Anglin, 2019) and was explicitly specified in the preregistration for Studies 2 and 3.
Because Anglin (2019) found no differences in beliefs between each summary and full description, participants reported their belief only after each study’s full description rather than after each summary and full description. This change was implemented to reduce demand effects.
We refrained from specifying predictions about whether belief consistency would moderate belief change effects in the preregistration for Study 1 because Anglin (2019) found conflicting results regarding this effect across four studies. However, the methods of the studies in Anglin (2019) varied considerably, particularly with respect to the stimulus studies presented. Because the present study nearly directly replicates Study 4 from Anglin (2019), the results of that study likely serve as the best guide for the expected pattern of results in the present study.
Participants could select multiple race or ethnicity categories.
Although the difference in positive vs. negative phrasing of the items likely contributed to the discrepancy, people may also vary in their perceptions of whether uncertainty is present when initially conducting studies on a topic vs. whether science can ever resolve uncertainty (Rabinovich & Morton, 2012). Indeed, the certainty items focus on (un)certainty in the results of research studies, whereas the uncertainty items focus on (un)certainty inherent to science.
The N for this analysis is higher than the total number of participants with full data as some participants dropped out before completing the study.
Because we did not make predictions about changes in belief certainty, we present these analyses in the Supplementary Materials.
That is, after the presentation of the first stimulus study (for all participants) and after the presentation of both studies (for those who received congruent evidence).
This was done by subtracting beliefs from 9 for those who received evidence suggesting the death penalty is ineffective and applying no transformation for those who received evidence suggesting the death penalty is effective.
A discussion of the literature on affect and the processing of belief-relevant information is available in the Supplementary Materials.
Due to an error, the preregistration text appears in the study description section to the right of the summary box for Studies 2 and 3.
Surprise was again analyzed separately, as a factor analysis revealed it cross-loaded on both the positive and negative affect factors, as it did for affect at T1.
The stimuli were slightly revised from those used most recently in the research reported here: https://osf.io/zcqmk. See the Supplementary Materials for further information about how the stimuli were created.
For the randomization and manipulation checks, Ns are higher than the total number of participants with full data as some participants dropped out before completing the study.
In Study 3, we also predicted that those with a stronger understanding of science (measured via scientific reasoning ability, perceptions of scientific uncertainty, and education) would be more sensitive to the evidence strength in updating their beliefs. Due to the general lack of findings and readability concerns, we present these results in the Supplementary Materials.
This specific analytic approach deviated from our preregistration (see Study 1 for details).
Due to the large number of analyses required to conduct the additional exploratory mediation analyses described in the preregistration, and limited relevance to the research aims, these analyses were not performed.
For these analyses, the belief variables were coded in the direction of the evidence: for those who received evidence suggesting gun control is ineffective, beliefs were subtracted from 8, and for those who received evidence suggesting gun control is effective, no transformation was applied, such that higher scores indicated a stronger belief in the direction of the evidence.