What Are the Necessary Conditions for Wisdom? Examining Intelligence, Creativity, Meaning-Making, and the Big-Five Traits
Collabra: Psychology

We investigated whether intelligence, creativity, meaning-making, and the Big-Five traits are necessary conditions for wisdom. We used Amazon’s TurkPrime to recruit 298 participants who ranged from 20 to 73 years of age. Participants completed measures of intelligence, creativity, meaning-making, and the Big-Five traits, along with a battery of self-report and performance wisdom measures. We used principal component analyses to reduce the wisdom battery into self-report and performance wisdom components, followed by necessary condition analysis and segmented regressions to examine whether the cognitive and personality variables under consideration here were necessary conditions for each wisdom component. We found that intelligence was necessary for the performance wisdom component whereas the Big-Five traits were necessary for the self-report wisdom component. This study is the first to demonstrate that high levels of wisdom are unlikely without some level of intelligence and adaptive personality traits.


Wisdom: Definition and Measurement
Despite increasing interest in studying wisdom scientifically over the past few decades, most of this research has been conducted without a common definition of wisdom. Instead, there are almost as many definitions of wisdom as there are wisdom researchers (Glück, 2018), with different definitions focusing on different aspects of the construct. However, there is also substantial agreement (e.g., Glück, 2018; Grossmann et al., 2020), which has allowed wisdom researchers to agree on a common model of wisdom (Grossmann et al., 2020). The common model defines wisdom as meta-cognitive processes and strategies grounded in moral aspirations. Specifically, wise meta-cognitive processes and strategies entail considering multiple perspectives, balancing different viewpoints, integrating opposing views, reflecting, and adapting problem solutions to their specific contexts. Moral aspirations relevant to wisdom include the willingness to balance one's interests with those of others, the pursuit of truth, and an orientation towards shared humanity without defaulting to in-group favoritism and outgroup negativity (Grossmann et al., 2020).
Operationalizations of wisdom, however, show more divergence. Measures of wisdom only moderately correlate with one another, suggesting the presence of strong methodological effects and differences in the constructs being measured (Glück, 2018).
Operationalizations of wisdom fall into two main camps: self-report and performance measures. The preference for different measures has partly emerged from different conceptualizations of wisdom. That is, although there is substantial agreement regarding the qualities and characteristics that comprise wisdom (e.g., Glück, 2018; Grossmann et al., 2020), researchers conceptualize wisdom at different levels of analysis. For instance, some researchers think of wisdom as a configuration of characteristics of a person (e.g., Ardelt, 2003; Ardelt et al., 2019; Levenson et al., 2005; Webster, 2007) and therefore tend to develop and employ self-report wisdom measures. Other researchers think of wisdom as the process and/or products of cognition (e.g., Baltes & Staudinger, 2000; Brienza et al., 2018; Grossmann et al., 2013; Oakes et al., 2019). Consequently, these researchers tend to develop and employ performance measures of wisdom, which invite participants to record their thought processes as they work through difficult social or societal scenarios. These open-ended responses are then scored by raters on the levels of wisdom that they manifest. A meta-analysis on wisdom correlates found pronounced differences between self-report and performance measures (Dong et al., 2022). Given the substantial overlap between measurement methodology and theoretical approach to defining wisdom, it is difficult to determine whether the differences between self-report and performance wisdom measures are due to differences in measurement methodology or differences in conceptualizations of wisdom. As no single approach has been shown to comprehensively capture the construct of wisdom, we opted to assess wisdom in the current study with all commonly used wisdom measures.

Wisdom and Intelligence
Most wisdom researchers agree that intelligence should correlate with wisdom (e.g., Grossmann et al., 2020), as intelligence involves the abilities to solve problems and to learn from experiences (Gottfredson, 1997; Nisbett et al., 2012), both of which are essential for wisdom. Indeed, operationalizations of wisdom tend to include dimensions that assess the abilities to learn from experiences (Ardelt, 2003; Baltes & Staudinger, 2000; Webster, 2003, 2007) and to problem-solve (Brienza et al., 2018; Grossmann et al., 2013; Mickler & Staudinger, 2008). At the same time, wisdom is distinguished from intelligence. Problems of intelligence are amoral, close-ended, and have unambiguous solutions, whereas problems of wisdom are open-ended, without definite solutions, and cannot be solved without moral considerations (e.g., Sternberg, 1998). Intelligence and wisdom are therefore expected to correlate, but not so strongly as to suggest that they are the same construct. This expectation is supported by some studies, which reported small correlations between the two constructs (e.g., Mickler & Staudinger, 2008), but not by others, which found no significant correlations (e.g., Grossmann et al., 2012). Meta-analytic findings suggest that while wisdom performance weakly correlates with intelligence (r = .15), self-reported wisdom does not (Dong et al., 2022).
However, correlations (which describe the linear relationships between variables) might not fully capture the relationship between intelligence and wisdom. Specifically, intelligence is believed to be a necessary-but-not-sufficient condition for wisdom, such that a threshold level of intelligence is required for wisdom, beyond which intelligence ceases to matter (e.g., Ardelt, 2000; Baltes et al., 1995; Glück, 2018, 2020; Grossmann et al., 2020; Staudinger & Pasupathi, 2003). One study found a negative quadratic relationship between intelligence and wisdom, suggesting a nonlinear relationship between the two constructs (Mickler & Staudinger, 2008). A measure of fluid intelligence has been shown to form a "triangular" relationship with the Berlin wisdom paradigm, where most of the datapoints in the scatterplot occurred in the lower-right region and very few occurred in the upper-left region (Glück, 2020). In other words, participants with low levels of fluid intelligence were less likely to score very highly on wisdom, as measured by the Berlin wisdom paradigm, suggesting necessity, but participants with high levels of fluid intelligence could have any level of wisdom, suggesting insufficiency. While conceptually compelling, the triangularity of the relationship could not be tested for statistical significance. The hypothesis that intelligence is a necessary-but-not-sufficient condition for wisdom thus remains to be statistically tested (Glück, 2020). In the current study, therefore, we tested whether intelligence is necessary for wisdom and whether a threshold exists beyond which intelligence no longer predicts wisdom.

Wisdom and Creativity
Wise solutions to problems are inherently novel, not easily anticipated, and unconventional, implying that creativity is relevant for wisdom (Kunzmann & Baltes, 2003). Wisdom and creativity are thought to be located at the interface of intelligence and personality, signifying the integration of cognitive and personality functioning that first emerges in adolescence (Staudinger & Pasupathi, 2003; Sternberg, 2001). Accordingly, creativity and wisdom should be robustly correlated from adolescence on. Flexible thinking, a characteristic of creativity, is also included as a dimension of wisdom in some conceptualizations and operationalizations (Brienza et al., 2018; Grossmann et al., 2013). While creativity was found to be correlated (rs ranged from .28 to .40) with wisdom across a variety of wisdom and creativity measures (Helson & Srivastava, 2002; Staudinger et al., 1997), it is possible that the relation between creativity and wisdom is not linear. Specifically, creativity might be necessary for wisdom, as some aspects of creativity, including flexible thinking and unconventionality, are considered integral to wisdom. In the current study, we tested whether some level of creativity is necessary for wisdom.

Wisdom and Meaning-Making
The tendency to engage in meaning-making may facilitate wisdom development by making individuals reflect on their life experiences more deeply and frequently, thereby extracting new lessons and insights from these experiences. Meaning-making is a form of exploratory autobiographical reasoning whereby individuals reflect on their experiences to gain new lessons and insights about themselves and the world. Exploratory autobiographical reasoning like meaning-making has long been hypothesized to be a chief source of wisdom (Ardelt, 2005; Baltes & Staudinger, 2000; Glück & Bluck, 2013; Weststrate & Glück, 2017). According to the MORE Life Experience Model, the motivation to reflect on life's challenges and to develop a deeper understanding in the process sets the stage for wisdom development (Glück et al., 2019; Glück & Bluck, 2013). Although the hypothesis of meaning-making as a resource for wisdom has been corroborated by positive correlations (rs ranging from .10 to .44) between the two constructs (Webster et al., 2018; Weststrate et al., 2018; Weststrate & Glück, 2017), it needs to be more fully tested. One way in which meaning-making might be a resource for wisdom is by being its necessary condition, a hypothesis that we tested in the current study. We also examined whether a threshold existed beyond which increases in meaning-making would cease to predict increases in wisdom.

Wisdom and the Big-Five Traits
Of all the Big-Five traits, openness has the strongest theoretical and empirical association with wisdom. Most conceptualizations and operationalizations of wisdom include the consideration of multiple perspectives as an important aspect (Ardelt, 2003; Baltes & Staudinger, 2000; Brienza et al., 2018; Grossmann et al., 2013; Mickler & Staudinger, 2008), which is related to openness. Openness is also theorized to lead to more encounters with wisdom-fostering experiences (Glück et al., 2019; Glück & Bluck, 2013; Webster et al., 2018), enable one to deal with such experiences in wisdom-fostering ways, and prompt reflections on as well as integration of these experiences into one's life story in ways that allow for continued learning and growth (Glück et al., 2019; Glück & Bluck, 2013). Openness is thus thought to be a resource for wisdom by some researchers (Glück et al., 2019; Glück & Bluck, 2013) and a necessary-but-not-sufficient feature of wisdom by others (Webster et al., 2018). Congruent with these theoretical perspectives, openness significantly correlated with most measures of wisdom (e.g., Brienza et al., 2018; Le, 2005; Levenson et al., 2005; Mickler & Staudinger, 2008; Staudinger et al., 1997; Webster, 2014; Zacher et al., 2015). Indeed, meta-analytic evidence suggests a moderate-to-large correlation between openness and wisdom (r = .29; Dong et al., 2022). However, extant empirical evidence is insufficient to support the claim that openness is a resource for wisdom. Therefore, we investigated one way in which openness might be a resource for wisdom by examining whether it was a necessary condition for wisdom.
The relationships between wisdom and the other Big-Five traits are less clear both theoretically and empirically. When conceptualized as an adaptive configuration of personality characteristics, wisdom is expected to positively correlate with extraversion, agreeableness, conscientiousness, and emotional stability (reversed neuroticism; Ardelt et al., 2019). Indeed, self-report wisdom measures showed moderate-to-strong meta-analytic correlations with these traits (extraversion: r = .26; agreeableness: r = .28; conscientiousness: r = .20; emotional stability: r = .30; Dong et al., 2022). Wisdom has also been thought to be characterized by balanced, rather than extreme, levels of personality traits (Glück, 2018). For emotional stability, a threshold level is thought to be necessary for wisdom (Staudinger et al., 2005). However, empirical support for these nonlinear relationships is limited, with only one study reporting a negative quadratic relationship between wisdom and extraversion (Staudinger et al., 1998, as cited in Glück, 2018). In the current study, therefore, we tested the hypothesis that certain levels of conscientiousness, agreeableness, extraversion, and emotional stability constitute necessary conditions for wisdom.

Testing Necessary-but-not-Sufficient Relations
Traditional approaches such as fitting quadratic relationships, while occasionally adopted by researchers for this purpose, cannot properly test for necessity relations or the presence of thresholds. Instead, we used necessary condition analysis (NCA; Dul, 2016; Dul et al., 2020) to statistically test for necessity and segmented regressions to test for thresholds.

Testing for Necessity: Necessary Condition Analysis (NCA)
If continuous variable X is a necessary condition for continuous variable Y, then it follows that it is unlikely, if not impossible, to have certain levels of Y without having certain levels of X. This kind of relationship can be statistically tested using NCA. NCA works by drawing a ceiling line through the upper-left observations in a scatterplot of Y on X. If values of X constrain values of Y (i.e., if X is a necessary condition for Y), then there should be an empty upper-left corner in the scatterplot. The ceiling line of the NCA defines this empty upper-left corner, and the area of this empty space reflects the extent to which X constrains Y. The more X constrains Y, the larger the empty space should be. The necessity effect size (d) is the area of the empty space divided by the area of the space in which observations could occur given the range of X and Y. The effect size is thus affected by the mathematical technique through which the ceiling line is drawn. Two techniques are recommended: Ceiling Envelopment-Free Disposal Hull (CE-FDH) and Ceiling Regression-Free Disposal Hull (CR-FDH; Dul, 2016). The CE-FDH technique draws a non-decreasing step function through the upper-left observations of the XY scatterplot. The CR-FDH technique, in contrast, fits an ordinary least squares trend line through the upper-left observations, with some of these observations falling above the ceiling line and others below. The two techniques usually yield similar estimates. As neither technique is clearly superior to the other, we followed the recommended practice of NCA and used both to calculate the necessity effect size in the current study.
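The effect size described above can be illustrated with a minimal Python sketch of a CE-FDH-style calculation. This is not the analysis code used in the study; the function name `ce_fdh_effect_size`, the toy data, and the choice to define the scope by the observed minimum and maximum of X and Y are assumptions made for illustration only.

```python
import numpy as np


def ce_fdh_effect_size(x, y):
    """Necessity effect size d from a CE-FDH-style step ceiling (illustrative)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ys = x[order], y[order]

    # Ceiling at each x: the highest Y observed at or to the left of that x.
    ceiling = np.maximum.accumulate(ys)

    # Empty zone: area between the step ceiling and the top of the scope.
    # Tied x values have zero width, so they contribute nothing.
    step_widths = np.diff(xs)
    empty_area = np.sum((ys.max() - ceiling[:-1]) * step_widths)

    # Scope: rectangle spanned by the observed ranges of X and Y (assumption).
    scope_area = (xs[-1] - xs[0]) * (ys.max() - ys.min())
    return float(empty_area / scope_area) if scope_area > 0 else 0.0


# Toy data: low X never co-occurs with high Y, so the upper-left corner is empty.
x_toy = [0, 1, 2, 3, 4]
y_toy = [0, 1, 1, 3, 4]
d = ce_fdh_effect_size(x_toy, y_toy)
```

In this toy example the empty upper-left zone covers 11 of the 16 square units of the scope, so d = 11/16. A dataset in which the highest Y value occurs at the lowest X value would give d = 0, because no upper-left corner is left empty.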
Necessity effect sizes can be evaluated for their magnitude and statistical significance. Dul (2016) has suggested that effect sizes of 0 < d < .10 are small, .10 ≤ d < .30 are medium, .30 ≤ d < .50 are large, and d ≥ .50 are very large. Statistical significance of necessity effects can be evaluated through approximate permutation tests. A permutation test first generates a distribution of necessity effect sizes under the null hypothesis. By comparing the observed necessity effect size to the distribution under the null hypothesis, it is possible to calculate a p-value for the observed effect. Ideally, the distribution under the null hypothesis should be generated by reshuffling the observed values of X and Y such that all possible combinations (permutations) of X and Y values are obtained. However, in practice, this approach is too resource-intensive, as the number of permutations increases rapidly with the number of observations. Instead, the approximate permutation test is more commonly used, where the distribution under the null hypothesis is generated by randomly selecting a large number of permutations from all possible permutations. An observed necessity effect is considered statistically significant if it is larger than 95% of the effect sizes that form the distribution under the null hypothesis.
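The logic of the approximate permutation test can be sketched in a few lines of Python. The sketch below is illustrative rather than a reproduction of the NCA software: the helper `step_ceiling_d` is a compact step-ceiling effect size written for this example, and the function names, the number of permutations (`n_perm=2000`), and the random seed are all assumptions.

```python
import numpy as np


def step_ceiling_d(x, y):
    """Compact CE-FDH-style effect size: empty upper-left area / scope area."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ys = x[order], y[order]
    ceiling = np.maximum.accumulate(ys)
    scope = (xs[-1] - xs[0]) * (ys.max() - ys.min())
    if scope == 0:
        return 0.0
    return float(np.sum((ys.max() - ceiling[:-1]) * np.diff(xs)) / scope)


def approx_permutation_p(x, y, n_perm=2000, seed=0):
    """Share of random X-Y re-pairings whose d is at least the observed d."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = step_ceiling_d(x, y)
    # Shuffling Y breaks any X-Y link, which yields the null distribution of d.
    null_d = np.array([step_ceiling_d(x, rng.permutation(y)) for _ in range(n_perm)])
    p_value = float(np.mean(null_d >= observed))
    return observed, p_value


# Toy data in which Y rises in lockstep with X (a strong ceiling pattern).
x_demo = np.arange(30)
y_demo = np.arange(30)
d_obs, p = approx_permutation_p(x_demo, y_demo)
```

Sampling a fixed number of random re-pairings keeps the computation tractable, at the cost of a p-value whose resolution is limited to 1/`n_perm`.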
Beyond examining each independent variable as a necessary condition for a dependent variable, it is also possible to examine the joint necessity effect of multiple independent variables. This can be achieved using the bottleneck technique of NCA, which identifies the level of each independent variable necessary for each level of the dependent variable (Dul, 2016). To have a certain level of the dependent variable, the necessary level of each independent variable must be met. If one independent variable is lower than its necessary level, the other independent variables cannot compensate for it, and so the desired level of the dependent variable is unlikely or impossible (Dul, 2016). In the current study, we examined the necessity effect of each variable on wisdom as well as their joint effect.
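The idea behind a bottleneck table can be sketched as follows, assuming a step (CE-FDH-style) ceiling for each predictor. The function names, predictor labels, and scores are hypothetical, and the sketch reports required levels in raw units, whereas NCA software may express them differently (e.g., as percentages of the observed range).

```python
import numpy as np


def required_level(x, y, y_target):
    """Smallest observed X at which the step ceiling reaches y_target."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    xs = x[order]
    ceiling = np.maximum.accumulate(y[order])  # non-decreasing in X
    idx = int(np.searchsorted(ceiling, y_target, side="left"))
    # NaN signals that the target lies above every observed Y value.
    return float(xs[idx]) if idx < len(xs) else float("nan")


def bottleneck_table(predictors, y, y_levels):
    """For each Y level, the X level that each predictor must reach."""
    return {
        level: {name: required_level(values, y, level) for name, values in predictors.items()}
        for level in y_levels
    }


# Hypothetical scores for two predictors and one outcome.
predictors = {"x1": [1, 2, 3, 4], "x2": [4, 1, 2, 3]}
outcome = [1, 1, 3, 2]
table = bottleneck_table(predictors, outcome, y_levels=[1, 2, 3])
```

Each row of the resulting table is read conjunctively: an outcome level of 2 in this toy example requires `x1` of at least 3 and `x2` of at least 2, and a shortfall on either predictor cannot be offset by the other.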
It is important to note that NCA does not automatically allow causal inferences, especially when using cross-sectional data. Necessity relationships tested by NCA are logical: one continuous variable being necessary for another continuous variable means that the value of the former constrains the value of the latter. Moreover, in logic, necessity does not imply temporal precedence or causality. In the current study, therefore, our goal was to examine whether the level of wisdom was constrained by the levels of the cognitive and personality variables of interest; we did not aim to investigate the causal directions of such relationships. When we use terms like "necessity" or "necessary conditions" in this paper, we mean only logical necessity without any causal implications.

Testing for Threshold: Segmented Regression
Some articulations of the necessary condition hypothesis state that the independent variable predicts the dependent variable only up to a certain value of the independent variable (i.e., a threshold), beyond which the independent variable is no longer associated with the dependent variable (i.e., insufficiency). Testing the threshold hypothesis, therefore, involves identifying a threshold and statistically testing its significance. In addition, the independent variable should positively predict the dependent variable before the threshold but should not predict it after the threshold. In this study, we used segmented regression analysis for such testing. Segmented regressions assess whether the relationship between the independent variable and the dependent variable is piecewise linear. Specifically, the procedure takes a guessed threshold value of X as the starting point and uses a bottom-up, iterative process to estimate the threshold value that best fits the data (Muggeo, 2008). The statistical significance of the threshold is then evaluated using the Davies test (Davies, 1987). In the current study, segmented regression analyses allowed us to evaluate whether a statistically significant breakpoint (ψ) existed such that the relationship (i.e., slope) between wisdom and an independent variable of interest is positive and significant before it and non-significant after it.
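The piecewise-linear model underlying this analysis can be illustrated with a simplified Python sketch. Note that it swaps in a plain grid search over candidate breakpoints in place of the iterative estimator of Muggeo (2008), and it does not implement the Davies test or any significance test for the slopes; the function name and the noiseless toy data are assumptions.

```python
import numpy as np


def fit_one_breakpoint(x, y):
    """Grid-search a single breakpoint for a continuous two-segment line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    candidates = np.unique(x)[1:-1]  # interior values only
    best = None
    for psi in candidates:
        # Columns: intercept, slope before psi, change in slope after psi.
        design = np.column_stack([np.ones_like(x), x, np.clip(x - psi, 0.0, None)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(psi), coef)
    _, psi_hat, (b0, b1, b2) = best
    return {"psi": psi_hat, "slope_before": float(b1), "slope_after": float(b1 + b2)}


# Noiseless toy data: Y rises with X up to X = 5 and is flat afterwards.
x_demo = np.linspace(0.0, 10.0, 101)
y_demo = np.minimum(x_demo, 5.0)
fit = fit_one_breakpoint(x_demo, y_demo)
```

A threshold pattern of the kind hypothesized here would show up as a positive `slope_before` and a `slope_after` near zero. With real, noisy data, both the breakpoint and the slopes would need the inferential tests described above.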

The Current Study
We examined whether intelligence, creativity, meaning-making, and the Big-Five traits were necessary-but-not-sufficient conditions for wisdom. Specifically, we examined whether the necessity effect was significant for each of the independent variables of interest, such that high levels of wisdom were impossible without a certain level of that variable. Additionally, we examined whether a statistically significant threshold existed such that an independent variable of interest no longer predicted wisdom beyond it, suggesting insufficiency.

Method
This study was approved by the Social Sciences, Humanities and Education Research Ethics Board of the University of Toronto. Study materials (with the exception of the copyrighted WPT-Q), data and codebook, and analysis code are available at https://osf.io/du3r5/. The study was not preregistered.

Participants
As effect size estimates directly relevant to our research questions were not available at the time of the study design, we calculated the sample size using the lower bound of the middle third of correlation coefficients in psychology, r = .18 (Hemphill, 2003). A power analysis, conducted using G*Power (Faul et al., 2009), suggested that 271 participants were needed to achieve 85% power to detect such an effect with a two-tailed α of .05. We thus set out to recruit 300 participants through Amazon's TurkPrime. A total of 309 participants completed the study (which took 90 minutes on average), and each was compensated with $7.50 USD. We removed 11 participants' responses due to noncompliance with study instructions and low response quality. Specifically, we used open-ended questions to gauge compliance and response quality. We removed responses if they included copied-and-pasted text (identified by searching for the text on Google) or unintelligible text for the open-ended questions. The final sample included 298 participants, ranging from 20 to 73 years in age (M = 37.82, SD = 10.83). Demographics of the final sample are presented in Table 1.
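As a rough cross-check of a sample size calculation of this kind, the required N for detecting a correlation can be approximated with Fisher's z-transformation. This is a textbook approximation and not the procedure G*Power uses, so its result may differ from the reported 271 by a few participants; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, power=0.85, alpha=0.05):
    """Approximate N to detect a correlation r (two-tailed), via Fisher's z."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    effect = math.atanh(r)                         # Fisher z-transform of r
    return math.ceil(((z_alpha + z_power) / effect) ** 2 + 3)


n_needed = n_for_correlation(0.18)
```

The approximation makes the trade-off explicit: halving the expected correlation roughly quadruples the required sample size.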

Procedure and Measures
After consenting, participants completed (a) measures of intelligence and creativity, (b) a narrative task in which they described a wisdom-fostering experience, (c) a measure of the Big-Five traits, and (d) a battery of wisdom measures. Participants were debriefed immediately upon study completion and compensated within a few days after their responses were verified for compliance and quality. After all data were collected, independent groups of research assistants rated the open-ended responses for the constructs they manifested. Descriptive statistics and reliabilities of all measures are presented in Table 2. As some assumptions of Cronbach's alpha, namely adherence to tau equivalence (i.e., each item on a scale contributing equally to the total scale score) and continuous (rather than discrete) item scaling, were violated, we used omega total instead to assess internal consistency, following the recommended practice (e.g., McNeish, 2018). As omega total can be interpreted like Cronbach's alpha (e.g., McNeish, 2018), we considered omega totals above .70 as indicating adequate internal consistency. We assessed inter-rater reliability using intraclass correlation coefficients, ICC(2, k), for open-ended questions that required raters. We considered ICC(2, k) above .70 as indicative of adequate inter-rater reliability.
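For readers unfamiliar with ICC(2, k), the following Python sketch computes it from a responses-by-raters matrix using the standard two-way random-effects, absolute-agreement, average-measures formula. It is an illustration with hypothetical ratings and a hypothetical function name, not the code used to produce the values in Table 2.

```python
import numpy as np


def icc_2k(ratings):
    """ICC(2, k): two-way random effects, absolute agreement, mean of k raters."""
    r = np.asarray(ratings, dtype=float)  # rows = responses, columns = raters
    n, k = r.shape
    grand = r.mean()
    ss_rows = k * np.sum((r.mean(axis=1) - grand) ** 2)  # between responses
    ss_cols = n * np.sum((r.mean(axis=0) - grand) ** 2)  # between raters
    ss_error = np.sum((r - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return float((ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n))


# Hypothetical ratings: the second rater is always one point more lenient.
demo = [[1, 2], [2, 3], [3, 4]]
icc = icc_2k(demo)
```

Because ICC(2, k) indexes absolute agreement, the constant leniency of the second rater in this toy example lowers the coefficient to .80 even though the two raters rank the responses identically.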

Cognitive Ability Measures
Intelligence. Intelligence was assessed using the Wonderlic Personnel Test - Quicktest (WPT-Q; Wonderlic, Inc., 2000), an abbreviated, 30-question version of the Wonderlic Personnel Test (WPT; Wonderlic, Inc., 2000). The WPT and WPT-Q are widely accepted as indicators of general intelligence, or g. The WPT-Q was chosen for its demonstrated validity and its proctor-free online administration format. Participants were given 8 minutes to answer as many questions as possible on the WPT-Q.
Creativity. Creativity was measured using the Alternative Uses Task (AUT; Guilford, 1967), a measure of divergent thinking. Participants were asked to list as many creative and non-mundane uses for a brick as possible within 3 minutes. Two trained undergraduate research assistants coded each participant's response on four dimensions. For fluency, participants' scores were derived from the total number of uses they listed, excluding implausible and nonsensical uses, inter-rater r = .98, t(296) = 85.24, p < .001. For elaboration, the research assistants rated the level of detail at which each use was described on a scale from 1 (not elaborate) to 3 (very elaborate). A participant's elaboration score was derived from the average elaboration score across all the uses they listed, inter-rater r = .91, t(296) = 38.78, p < .001. For flexibility, the research assistants first agreed on a list of categories of uses (e.g., construction) after reading through all uses generated by all participants. Each participant's flexibility score was the number of categories covered by the uses they listed, inter-rater r = .83, t(292) = 25.77, p < .001. For originality, the research assistants first identified 5% of all uses listed by all participants, of which there were 2193, as original. Original uses were defined as uses that were either unique or listed by very few others in the current sample. Each participant's originality score was the number of uses they listed that were judged to be original. As originality had a low incidence rate by definition, we used the percentage of perfect agreement, instead of inter-rater correlation, to evaluate inter-rater agreement. At the level of participants, the two raters agreed perfectly in 69.46% of the cases. As inter-rater agreement was adequate for the four dimensions of creativity, we standardized and averaged the dimension scores to obtain one creativity score for each participant (omega total = .89).
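The final compositing step (standardizing each dimension and averaging) can be sketched as below. The raw scores are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not specify which standardization was applied.

```python
import numpy as np


def composite_score(dimensions):
    """Z-score each column, then average across columns for each participant."""
    d = np.asarray(dimensions, dtype=float)  # rows = participants, columns = dimensions
    z = (d - d.mean(axis=0)) / d.std(axis=0, ddof=1)
    return z.mean(axis=1)


# Hypothetical raw scores: fluency, elaboration, flexibility, originality.
raw = [[4, 1.0, 2, 0], [8, 2.0, 4, 1], [12, 3.0, 6, 2]]
creativity = composite_score(raw)
```

Standardizing first puts the four dimensions on a common scale, so that a count-based dimension such as fluency does not dominate the composite simply because its raw values are larger.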

Meaning-Making Measure
Meaning-making was assessed through participants' written accounts of an event that they found wisdom-fostering. Participants were asked to describe the event, its context, their thoughts and feelings during it, and its significance to their identity and life story. Each response was rated by three trained undergraduate research assistants for the level of meaning-making it manifested on a scale from 0 (no meaning) to 4 (highly developed meaning that has significant depth). Scores were determined based on the extent to which the meaning (i.e., lesson or insight) gained from the experience was reflective, elaborative, impactful, and integrated with the specific aspects of the event (Weststrate, 2014). Responses were coded in three waves of 100 responses and the order of the responses was randomized within each wave. Inter-rater agreement, ICC(2, 3), was adequate for each wave (.81, .84, and .80, respectively) as well as for the overall sample (.82). We thus averaged the scores across coders to obtain one meaning-making score for each participant.
Note (Table 2). Reliability was calculated using omega total in all cases except those indicated with the † superscript, which were calculated using ICC(2, k). Polychoric correlation matrices were used to calculate the omega totals for all variables except creativity and the whole scale of 3DWS (where the three subscales served as items), for which Pearson correlation matrices were used (McNeish, 2018).
*As only participants' total test scores, not scores on each item, were shared with us due to the copyrighted nature of the Wonderlic Personnel Test - Quicktest (WPT-Q), we were unable to compute the reliability of the test for our sample. The reliability of the WPT-Q in general is available in Wonderlic, Inc. (2004).

Personality Measure
Big-Five Inventory-2 (BFI-2). The 60-item BFI-2 (Soto & John, 2017) was used to assess extraversion (e.g., "I am someone who is outgoing, sociable"), agreeableness (e.g., "I am someone who is compassionate, has a soft heart"), conscientiousness (e.g., "I am someone who is dependable, steady"), neuroticism (e.g., "I am someone who worries a lot"), and openness (e.g., "I am someone who values art and beauty"). The items for each trait were rated on a Likert scale ranging from 1 (disagree strongly) to 5 (agree strongly) and were averaged to obtain one score for each trait for each participant (see Table 2 for internal consistency). Reverse-scored neuroticism (i.e., emotional stability) was used in all analyses.

Performance Wisdom Measures
Berlin wisdom paradigm. We assessed participants' general wisdom performance using the Berlin wisdom paradigm (Baltes & Smith, 1990; Baltes & Staudinger, 2000), in which participants typed out their thoughts as they worked through a hypothetical dilemma (i.e., "A 15-year-old girl wants to get married right away. What should one take into consideration and do in such a situation?"). Ten untrained (i.e., naïve) undergraduate research assistants rated participants' responses on a scale from 1 (very unwise) to 6 (very wise) by relying on their implicit theories of wisdom. The convergent and construct validity of this naïve rating approach has been demonstrated by previous research (Staudinger et al., 1992, 1998; Weststrate et al., 2018). As inter-rater reliability was high (Table 2), we averaged the ratings across research assistants to obtain one Berlin wisdom paradigm score for each participant.
Bremen Wisdom Paradigm. We assessed participants' wisdom performance in their personal conflicts using a variation of the Bremen wisdom paradigm (Mickler & Staudinger, 2008), in which participants were asked to recall a conflict with a close friend, describe how they dealt with the conflict then, and what they would do differently now. Participants' written responses were rated on a scale from 1 (very unwise) to 6 (very wise) by a different group of 10 untrained (i.e., naïve) undergraduate research assistants. As the inter-rater reliability was high (Table 2), we averaged the ratings across research assistants to obtain one Bremen wisdom paradigm score for each participant.
Grossmann's Wise Reasoning Task. We assessed wise reasoning using Grossmann's wise reasoning task, in which participants read about an intergroup conflict (a newspaper article about the conflict between Tajiks and Kyrgyz immigrants in Tajikistan) and an interpersonal conflict (a letter to a help column in which an individual described being caught between an arguing couple as a mutual friend) and were asked how the conflicts would unfold and why. Previous familiarity with the intergroup conflict was unlikely to have strongly affected the results, as most participants either did not know about the intergroup conflict previously (87.25%) or knew only a little (8.39%). One group of three undergraduate research assistants rated the intergroup conflict responses and another non-overlapping group rated the interpersonal conflict responses. Each response was rated on six dimensions of wise reasoning on a scale from 1 to 3 (scale anchors were different for each dimension; see Grossmann, 2012). The resolution dimension assessed the extent to which participants valued the importance of conflict resolution. The limits of knowledge dimension assessed the level of uncertainty expressed in participants' predictions. The compromise dimension assessed the extent to which participants valued compromise. The flexibility dimension assessed the ability to see that the conflict could unfold in many ways. The perspective dimension assessed the ability to look at the conflict from different perspectives. The change dimension assessed the extent to which participants predicted changes from the status quo. The ratings of some dimensions (i.e., resolution, compromise, flexibility, and change) were reverse-scored such that higher values represented wiser reasoning. ICC(2, 3) was adequate for most dimensions across the two conflicts (Table 2). The low ICC for the flexibility dimension of the intergroup conflict was likely due to range restriction, as the raters agreed perfectly in 80.69% of the cases. Ratings provided by one rater were removed for the change dimension of the intergroup conflict because they showed low levels of agreement with the ratings provided by the other two raters. As inter-rater agreement was adequate, we averaged the ratings across coders to obtain one score per dimension for each conflict for each participant. The internal consistency among the six dimensions was adequate for both the interpersonal and intergroup conflicts (Table 2).

Self-Report Wisdom Measures
Three-Dimensional Wisdom Scale (3DWS). The 39-item 3DWS (Ardelt, 2003) is based on the conceptualization of wisdom as an integration of cognitive, reflective, and affective personality characteristics and has a cognitive subscale (e.g., "ignorance is bliss"), a reflective subscale (e.g., "things often go wrong for me by no fault of my own"), and an affective subscale (e.g., "I am annoyed by unhappy people who just feel sorry for themselves"). Items were rated on a Likert scale from 1 (strongly agree/definitely true of myself) to 5 (strongly disagree/not true of myself), where disagreeing with the items leads to higher scores, which indicate greater wisdom. Following the recommended practice (Ardelt, 2003), we computed a total 3DWS score for each participant by averaging the three dimension scores (omega total = .86).
Self-Assessed Wisdom Scale (SAWS). The 40-item SAWS (Webster, 2003, 2007) assesses wisdom with five subscales: critical life experience (e.g., "I have overcome many painful events in my life"), reminiscence/reflectiveness (e.g., "I often think about connections between my past and present"), emotion regulation (e.g., "it is easy for me to adjust my emotions to the situation at hand"), humor (e.g., "I can chuckle at personal embarrassments"), and openness (e.g., "I like to read books which challenge me to think differently about issues"). Items were rated on a Likert scale from 1 (strongly disagree) to 6 (strongly agree), with higher scores indicating greater wisdom. Items were averaged into one SAWS score for each participant (omega total = .95).
Adult Self-Transcendence Inventory (ASTI). The 10-item self-transcendence subscale of the ASTI was administered to assess the transcendental aspect of wisdom (Levenson et al., 2005). Items on the ASTI (e.g., "my peace of mind is not so easily upset as it used to be") were rated on a Likert scale from 1 (disagree strongly) to 4 (agree strongly), where higher scores indicated greater wisdom. Items were averaged into one ASTI score for each participant (omega total = .84).
Situated Wise Reasoning Scale (SWIS). We administered the SWIS (Brienza et al., 2018) to assess the extent to which participants engaged in wise reasoning in their personal lives. Participants were first asked to recall a recent difficult situation with a close friend. To ensure the vividness of the memory, participants were asked to describe the context of the conflict as well as their thoughts and emotions during it. Participants then responded to the self-report wise reasoning questionnaire, the items of which all started with "while this situation was unfolding, I …" and fell into five dimensions: consideration of others' perspectives (e.g., "… put myself in the other person's shoes"), consideration of change (e.g., "… looked for different solutions as the situation evolved"), intellectual humility (e.g., "… double-checked whether my opinion on the situation might be incorrect"), search for compromise/resolution (e.g., "… tried my best to find a way to accommodate both of us"), and taking an outsider's perspective (e.g., "… tried to see the conflict from the point of view of an uninvolved person"). The items were rated on a scale from 1 (not at all) to 5 (very much), with higher scores indicating more wise reasoning. The items were averaged to obtain one wise reasoning score for each participant (omega total = .95).

Results
All analyses and graphs were done in R 3.6.1 (R Core Team, 2019) with the psych (Revelle, 2019), tidyverse (Wickham et al., 2019), NCA (Dul, 2020), segmented (Muggeo, 2008), and ggplot2 (Wickham, 2016) packages. The descriptive statistics are presented in Table 2 and zero-order correlations are presented in Tables 3, 4, and 5. We performed the analyses in four stages. First, we conducted a principal component analysis (PCA) on the wisdom measures and computed the componential scores. Second, we examined how the wisdom components correlated with intelligence, creativity, meaning-making, and the Big-Five traits. Third, we tested whether intelligence, creativity, meaning-making, and the Big-Five traits were necessary for the wisdom components using the NCA. Finally, we examined whether breakpoints exist in the relations between these variables and the wisdom components using segmented regressions.

Dimension Reduction of Wisdom Variables
To limit the number of statistical tests and to make our conclusions more applicable to the construct of wisdom rather than to specific wisdom measures, we conducted a PCA with a promax rotation on the eight wisdom measures and used the componential scores for subsequent analyses. A comparison between the eigenvalues from the PCA (2.16, 1.34, 1.01, 0.95, 0.82, 0.75, 0.64, and 0.34) and those from a parallel analysis with 100 iterations (1.25, 1.16, 1.09, 1.02, 0.97, 0.91, 0.84, and 0.77) suggested that the first two components, together explaining 52.90% of the total variance, should be retained. Componential loadings after the promax rotation (Table 4) suggested that the 3DWS and the Bremen wisdom paradigm should be excluded from the calculation of componential scores, as neither had loadings above .40. These measures were thus also excluded from subsequent analyses. The first component comprised the SAWS, the ASTI, and the SWIS, whereas the second component comprised the Berlin wisdom paradigm and the two conflicts from Grossmann's wise reasoning task. We interpreted the two components as representing self-report and performance wisdom, respectively. We computed the componential scores by unit-weight averaging the standardized wisdom measure scores. As necessity effects for individual wisdom scales, including the 3DWS and the Bremen wisdom paradigm, and some of their subscales may be of interest to some researchers, we have reported these effects in Supplemental Materials (Tables S2 and S3) while noting that these analyses deviated from the original analytic plan of the study and were exploratory.
Table 5 presents the correlations among age, intelligence, creativity, meaning-making, the Big-Five traits, individual wisdom measures, and the wisdom components.
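The two computational steps in the dimension-reduction paragraph above (the parallel-analysis retention rule and the unit-weighted componential scores) can be illustrated with a short sketch. It is written in Python with NumPy purely for illustration; the reported analyses were run in R, and the function names and the small score matrix below are hypothetical. Only the two eigenvalue vectors are taken from the text.

```python
import numpy as np

# Eigenvalues as reported in the text: observed PCA vs. parallel analysis.
observed = np.array([2.16, 1.34, 1.01, 0.95, 0.82, 0.75, 0.64, 0.34])
parallel = np.array([1.25, 1.16, 1.09, 1.02, 0.97, 0.91, 0.84, 0.77])


def n_components_to_retain(observed, parallel):
    """Count the leading components whose eigenvalue exceeds the
    corresponding parallel-analysis eigenvalue (hypothetical helper)."""
    exceeds = np.asarray(observed) > np.asarray(parallel)
    if exceeds.all():
        return len(exceeds)
    # Index of the first component that fails the comparison.
    return int(np.argmin(exceeds))


def unit_weight_composite(scores):
    """Standardize each column (one column per measure), then average the
    z-scores across columns with equal weights (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    return z.mean(axis=1)


n_retained = n_components_to_retain(observed, parallel)

# Made-up scores for three participants on two measures, for illustration only.
toy_scores = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
toy_composite = unit_weight_composite(toy_scores)
```

With the reported eigenvalues, the first two observed values exceed their parallel-analysis counterparts and the third does not, which matches the two-component solution described in the text. The sketch assumes complete data; how missing scores were handled is not specified here.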
Consistent with the low correlations among wisdom measures found by previous research (Glück, 2018), the two wisdom components did not significantly correlate with each other, r = .11, 95% CI = [-.01, .22], t(287) = 1.83, p = .07. The performance wisdom component was uniquely correlated with intelligence, creativity, and meaning-making, whereas the self-report wisdom component was uniquely correlated with emotional stability and extraversion. Williams's tests showed that in comparison to the self-report wisdom component, the performance wisdom component was significantly more strongly correlated with intelligence, t(252) = -2.01, p < .05, Cohen's q = -0.17, but was significantly more weakly correlated with openness, t(285) = 5.18, p < .01, Cohen's q = 0.41, conscientiousness, t(285) = 4.23, p < .01, Cohen's q = 0.34, extraversion, t(285) = 7.55, p < .01, Cohen's q = 0.57, agreeableness, t(285) = 5.35, p < .01, Cohen's q = 0.43, and emotional stability, t(285) = 6.59, p < .01, Cohen's q = 0.51. The wisdom components did not significantly differ in their correlations with creativity, t(283) = -1.21, p = .23, Cohen's q = -0.19, or meaning-making, t(282) = -1.66, p = .10, Cohen's q = -0.13. These findings were partly consistent with the meta-analytic evidence that self-report measures of wisdom correlate with all of the Big-Five traits, whereas performance measures of wisdom only correlate with openness (Dong et al., 2022).
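For readers who wish to check the statistics in the preceding paragraph, the sketch below shows the standard t statistic for a single correlation and Cohen's q (the difference between two Fisher z-transformed correlations). It is an illustration in Python, not the analysis script; the function names are hypothetical, and the two correlations passed to the q helper are made-up values. The sketch does not implement the test for dependent correlations itself, and the sign convention used for q in the text is assumed here to be first correlation minus second.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero,
    with df = n - 2 (hypothetical helper)."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)


def cohens_q(r1, r2):
    """Cohen's q: difference between Fisher z-transformed correlations.
    The order of subtraction is an assumption of this sketch."""
    return math.atanh(r1) - math.atanh(r2)


# Correlation between the two wisdom components, as rounded in the text.
t_components = t_from_r(0.11, 287)

# Made-up correlations, for illustration only.
q_example = cohens_q(0.30, 0.10)
```

Using the rounded value r = .11 with 287 degrees of freedom gives t of about 1.87, slightly above the reported 1.83; a gap of that size is what rounding r to two decimals would produce.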

Necessary Condition Analyses (NCA)
The necessity effects of intelligence, creativity, meaning-making, and the Big-Five traits on each of the two wisdom components were calculated using both ceiling line techniques and were tested for statistical significance using approximate permutation tests with 10,000 repetitions (Table 6). The necessity effect of intelligence on the performance wisdom component was significant and moderate in size regardless of the ceiling line technique employed. Because the presence of outliers could affect the areas of the empty and occupied spaces, which may in turn affect the results of the NCA, we conducted the NCA with and without outliers. We defined outliers as observations more than 1.5 interquartile ranges below the 25% quantile or more than 1.5 interquartile ranges above the 75% quantile. Removing the outliers did not affect the significance of the necessity effects, with only one exception (openness and the performance wisdom component; see below). This is in line with the claim that NCA is not sensitive to outliers below the ceiling line (Necessary Condition Analysis, 2022). We thus reported the NCA results based on all data in the main text; the NCA results with the outliers removed can be found in Supplemental Materials (Table S1).
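To make the quantities in the paragraph above concrete, the following sketch computes a necessity effect size with a step-function (CE-FDH) ceiling, an approximate permutation p-value, and the 1.5 interquartile range outlier rule. It is a simplified Python illustration under stated assumptions, not the NCA R package used for the reported results: the effect size is taken as the empty area above the ceiling divided by the scope (the observed range of X times the observed range of Y), the p-value is taken as the share of shuffled samples whose effect is at least as large as the observed one, and all function names and toy data are hypothetical. The CR-FDH technique is not covered.

```python
import numpy as np


def ce_fdh_effect(x, y):
    """Necessity effect size d = ceiling zone / scope, using a
    non-decreasing step-function ceiling (CE-FDH). Sketch only."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    scope = (x.max() - x.min()) * (y.max() - y.min())
    if scope == 0:
        return 0.0
    order = np.argsort(x, kind="stable")
    xs, ys = x[order], y[order]
    # Ceiling at each x: highest y observed at or below that x.
    ceiling = np.maximum.accumulate(ys)
    # Empty area between the top of the scope and the step ceiling.
    zone = np.sum(np.diff(xs) * (y.max() - ceiling[:-1]))
    return float(zone / scope)


def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Share of shuffled-y samples with an effect at least as large as the
    observed effect (assumed definition of the approximate p-value)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = ce_fdh_effect(x, y)
    hits = sum(ce_fdh_effect(x, rng.permutation(y)) >= observed
               for _ in range(n_perm))
    return hits / n_perm


def iqr_outlier_mask(values, k=1.5):
    """True for values more than k IQRs below Q1 or above Q3."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Made-up data, for illustration only.
x_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_toy = np.array([1.0, 1.0, 3.0, 2.0, 5.0])
d_toy = ce_fdh_effect(x_toy, y_toy)
p_toy = permutation_p_value(x_toy, y_toy, n_perm=200)
outliers_toy = iqr_outlier_mask([1, 2, 3, 4, 100])
```

Because the ceiling is built only from the highest observations at each level of X, points far below it leave the effect size unchanged, which is the sense in which the analysis is insensitive to outliers below the ceiling line.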
The scatterplot (Figure 1a) suggested that intelligence was only necessary for the performance wisdom component up to a point, an observation we formally tested with segmented regression. For the self-report wisdom component, however, intelligence was not necessary. Creativity and meaning-making were not necessary for either wisdom component.
Each of the Big-Five traits was necessary for the self-report wisdom component. The necessity effect was small for extraversion and moderate for openness, emotional stability, conscientiousness, and agreeableness (Table 6). Scatterplots (Figure 2d-g) suggested that these findings were largely a reflection of the moderate-to-strong positive linear relations between these traits and the self-report wisdom component. None of the Big-Five traits was necessary for the performance wisdom component. Specifically, agreeableness, emotional stability, extraversion, and conscientiousness were not necessary for the performance wisdom component according to either ceiling line technique, whereas openness was necessary only according to the CR-FDH technique, p = .02, 95% CI = [.01, .02]. The difference between the techniques could be due to the CR-FDH technique's allowance of observations beyond the ceiling line, resulting in a larger effect size in comparison to the CE-FDH technique (Figure 1c). This necessity effect decreased in size and was no longer significant after the outliers were removed (Table S1). We concluded that openness was not necessary for wisdom performance.
The joint necessity effects of intelligence, creativity, meaning-making, and the Big-Five traits on the wisdom components are shown in Tables 7 and 8. As can be seen, few necessary conditions existed for low-to-medium scores on either wisdom component; however, more variables became necessary for high scores (80.0% and above) on the wisdom components. High scores on the performance wisdom component required relatively higher levels of intelligence and creativity, whereas high scores on the self-report wisdom component required relatively higher levels on the Big Five. A non-trivial level of meaning-making only became necessary for maximal scores on the wisdom components, scores that very few participants reached in our sample. It should be noted that estimates from the bottleneck analyses might not be reliable as only a small number of participants scored above 80.0% on the wisdom components (28 for the performance wisdom component and 33 for the self-report wisdom component), necessitating replication studies with larger sample sizes.
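The logic of the bottleneck analysis described above can be sketched as follows: for a desired level of the outcome, expressed as a percentage of its observed range, find the lowest level of a condition at which the ceiling reaches that level. The Python sketch below assumes a step-function (CE-FDH) ceiling and uses a hypothetical function name and made-up data; it is not the routine that produced Tables 7 and 8.

```python
import numpy as np


def bottleneck_level(x, y, target_pct):
    """Smallest observed x at which the CE-FDH ceiling reaches the target
    outcome level, where target_pct is a percentage (0-100) of the observed
    range of y. Returns the minimum of x when the target imposes no
    constraint. Sketch only."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y.min() + (target_pct / 100.0) * (y.max() - y.min())
    # Guard against floating-point overshoot at 100%.
    target = min(target, y.max())
    order = np.argsort(x, kind="stable")
    xs = x[order]
    ceiling = np.maximum.accumulate(y[order])
    # First position where the ceiling is at or above the target.
    idx = int(np.argmax(ceiling >= target))
    return float(xs[idx])


# Made-up data, for illustration only.
x_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_toy = np.array([1.0, 1.0, 3.0, 2.0, 5.0])
needed_for_50 = bottleneck_level(x_toy, y_toy, 50)
needed_for_80 = bottleneck_level(x_toy, y_toy, 80)
```

In the toy data, reaching 50% of the outcome range requires the condition to be at least 3, and reaching 80% requires the maximum value of 5, mirroring the pattern in which more conditions become binding only at high outcome levels.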
Note. Statistical significance of necessity effects was calculated by comparing the observed necessity effect to a distribution of necessity effects generated from permutations of shuffled, uncorrelated X and Y values, a process known as approximate permutation tests (Dul et al., 2020), resulting in approximate p-values associated with 95% confidence intervals (presented in square brackets). Bold face indicates p-value estimates smaller than .050.

Segmented Regressions
Segmented regressions were conducted with the sample average of each independent variable as the guessed breakpoint (ψ_guessed). The empirically determined breakpoints (ψ_estimated) were tested for statistical significance using a two-tailed Davies test (Davies, 1987) with α = .05 and K = 10, following recommendations (Muggeo, 2008). Results are presented in Table 9.
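The model behind a segmented regression with one breakpoint can be sketched as y = b0 + b1·x + b2·max(x − ψ, 0), where b1 is the slope before the breakpoint ψ and b1 + b2 is the slope after it. The Python sketch below estimates ψ by a plain grid search over candidate breakpoints, which is a deliberately simpler technique than the iterative estimation in the segmented R package used for the reported results, and it does not implement the Davies test. The function name, grid settings, and demonstration data are hypothetical.

```python
import numpy as np


def fit_segmented(x, y, n_grid=201):
    """Fit y = b0 + b1*x + b2*max(x - psi, 0) by least squares for each
    candidate breakpoint psi and keep the one with the smallest residual
    sum of squares. Grid search sketch, not the segmented package."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Restrict candidates to the central 80% of x to avoid edge breakpoints.
    lo, hi = np.quantile(x, [0.1, 0.9])
    best = None
    for psi in np.linspace(lo, hi, n_grid):
        design = np.column_stack([np.ones_like(x), x, np.maximum(x - psi, 0.0)])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ beta) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(psi), beta)
    sse, psi, beta = best
    return {
        "psi": psi,
        "slope_before": float(beta[1]),
        "slope_after": float(beta[1] + beta[2]),
        "sse": sse,
    }


# Noise-free demonstration data: slope 1 up to x = 5, flat afterwards.
x_demo = np.linspace(0.0, 10.0, 101)
y_demo = np.minimum(x_demo, 5.0)
fit_demo = fit_segmented(x_demo, y_demo)
```

A threshold pattern of the kind examined here corresponds to a positive slope before the breakpoint and a slope near zero after it, as in the demonstration data. With noisy data the grid-search estimate carries no standard error, so a formal test of the change in slope is still needed.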

Performance Wisdom Component
No significant breakpoint was found in the relation between intelligence and the performance wisdom component. Although the slope between intelligence and the performance wisdom component was positive before the estimated breakpoint and non-significant after it, this change was not statistically significant (Table 9), possibly due to inadequate sample size. Creativity, meaning-making, and the Big-Five traits also did not show significant breakpoints in their relations with the performance wisdom component. Scatterplots (Figures 1b-h) showed that these variables were either unrelated to the performance wisdom component or had linear relationships with it, suggesting that segmented regressions were unsuitable for modeling these relations. Indeed, some of these models did not converge and had breakpoint estimates that failed to stabilize despite a large number of iterations (1000).

Self-Report Wisdom Component
A significant breakpoint was found in the relation between intelligence and the self-report wisdom component, ψ_estimated = 19.00, SE = 2.08, p = .041 (Table 9). The breakpoint was slightly lower than the estimated population average of WPT-Q performance (Wonderlic, Inc., 2004). The slope was positive before the breakpoint and negative after it, though neither slope significantly differed from zero. For the relations between the self-report wisdom component on the one hand and creativity, meaning-making, and the Big-Five traits on the other, both the analyses and the scatterplots (Figures 2b-h) suggested that segmented regressions were unsuitable models. Again, the breakpoint estimates failed to stabilize after 1000 iterations for many of these models, with some failing to converge.

Discussion
Characteristics such as intelligence, creativity, meaning-making, and the Big-Five traits have been conceptualized as conditions or resources for wisdom. The current study is the first to formally test one way in which this postulation might be true by examining whether these characteristics were necessary for wisdom. We found that the necessary conditions for wisdom largely depended on how wisdom was operationalized: intelligence was necessary for wisdom performance whereas the Big-Five personality traits were necessary for self-reported wisdom. In addition to hypotheses of necessity, we also examined whether threshold levels of cognitive and personality characteristics existed such that the characteristics positively predicted wisdom before the threshold but ceased to predict wisdom after the threshold. Using segmented regression, we did not detect any thresholds that fit this definition.

Necessary Conditions for Wisdom
We found that the necessary conditions for wisdom largely depended on the form of wisdom in question. Intelligence was the only necessary condition for wisdom performance. Specifically, a score above 20 on the WPT-Q, which was close to the population average on the test (Wonderlic, Inc., 2004), was necessary for scoring above average on the performance wisdom component (i.e., a component score above 1.0). However, although the association between wisdom and intelligence was positive before the estimated breakpoint at 21 and negative after it, this breakpoint was not statistically significant, possibly due to inadequate sample size. The threshold hypothesis was thus not supported. We conclude that while intelligence is a necessary condition for the kind of wisdom captured by the performance wisdom component, more empirical evidence is needed before any conclusions can be drawn about the threshold hypothesis. For self-reported wisdom, however, intelligence was not necessary. Although a significant breakpoint existed in the relation between intelligence and the self-report wisdom component, the slopes before and after the breakpoint were not significantly different from zero. It is possible that the slopes would be statistically significant with larger sample sizes; alternatively, the statistical significance of the breakpoint could indicate a Type I error. Future research with larger sample sizes should thus be conducted to cross-validate the results. We concluded that while intelligence might be required for wisdom performance, it was not necessary for self-reported wisdom.
In contrast to intelligence, the Big-Five personality traits were necessary for self-reported wisdom, but not for wisdom performance. The threshold hypothesis was not supported for any of the traits, suggesting that the relationships between these traits and the self-report wisdom component were linear. There are at least two ways to interpret the finding that the Big-Five personality traits were necessary for high scores on the self-report wisdom component. First, the findings might corroborate the proposition that wisdom is an adaptive configuration of personality characteristics (e.g., Ardelt et al., 2019). This proposition has mainly been espoused by researchers who have developed and routinely used self-report measures of wisdom. If this proposition is true, it would explain our findings.
Alternatively, the strong correlations between the Big-Five traits and the self-report wisdom component could be due to common method variance, which would suggest that necessity effects pertained to the self-report method (i.e., it was necessary to score high on one self-report measure in order to score high on another) rather than to the constructs (i.e., it was necessary to be high on a trait in order to be high on wisdom). Although similarity in measurement method does not automatically lead to inflated correlations (e.g., Spector, 2006), measures sharing similar methods can be prone to similar systematic biases, which in turn can inflate the correlation between them. For instance, meta-analytic studies have demonstrated that social desirability is a systematic response bias that is correlated with emotional stability, extraversion, conscientiousness (Ones et al., 1996) and self-report wisdom measures (Dong et al., 2022), suggesting that it could have contributed to the differences between the wisdom components in the current study. However, as social desirability was not measured, we could not confirm whether it had indeed led to inflated correlations. Future studies should thus re-examine whether the Big-Five personality traits constitute necessary conditions for self-reported wisdom while ruling out the effect of common method variance. This can be achieved in at least two ways. First, common method variance can be statistically controlled. One way to achieve this is by measuring systematic response biases (e.g., social desirability) that affect both self-report wisdom measures and measures of the Big-Five personality traits. Systematic response biases (e.g., halo) can also be modelled and controlled for using statistical techniques such as structural equation modeling. Alternatively, methods other than self-report, such as informant reports, can be used to assess the Big-Five personality traits.

Non-Necessary Predictors of Wisdom
Our findings further suggest that while some characteristics, such as creativity and meaning-making, are correlated with wisdom, they are not necessary conditions for it. Of these constructs, meaning-making has been theorized as a resource for wisdom (e.g., Glück et al., 2019). It is important to note that the findings of the current study do not rule out this possibility, as not all resources are necessary conditions. For instance, it is possible that the absence of meaning-making can be compensated for by the presence of another resource, or that rather than being a necessary condition for wisdom, meaning-making may be a sufficient condition (i.e., it is impossible to be unwise if one has a strong tendency to make meaning). Future studies should therefore explore the ways in which meaning-making serves as a resource for wisdom.

Interpreting the Wisdom Components
It is important to note that although we interpreted the two wisdom components as representing performance and self-report wisdom, there are alternative interpretations. One such interpretation is to consider the components as representing general wisdom and personal wisdom. General wisdom refers to insights into life in general; it is the kind of wisdom that manifests when advising others. Personal wisdom refers to insights into one's own life. The measures constituting the self-report wisdom component are all personal wisdom measures, whereas the measures constituting the performance wisdom component are all general wisdom measures. This perfect overlap makes it difficult to evaluate the appropriateness of either interpretation. In favor of the personal vs. general wisdom interpretation are the componential loadings of the Bremen wisdom paradigm and the 3DWS, the two measures that did not meet the .40 cutoff to be included in either component. Specifically, both measures assess personal wisdom and loaded more strongly on the self-report wisdom component (.39 and .38, respectively) than on the performance wisdom component (.16 and .09, respectively). However, as these loadings were low, we concluded that the evidence for the two components representing general and personal wisdom was not strong. Furthermore, if the self-report wisdom component actually represented personal wisdom, then it should have been more strongly correlated with meaning-making, as the lessons and insights learnt through one's experiences should lead to more personal wisdom by transforming how one interacts with the world. However, meaning-making was instead more strongly correlated with the performance wisdom component and had no significant correlation with the self-report wisdom component, a pattern of results that is more in line with the self-report vs. performance interpretation of the components than with the personal vs. general wisdom interpretation.

Limitations and Future Directions
The current study has several limitations, all of which can inform directions for future investigations. First, the current study only offers preliminary insights that should be replicated. Specifically, the current study's frequentist approach to statistical inferences necessitates replications to ensure that the Type I error rate is on par with the alpha level (e.g., Mayo, 2018). Furthermore, the current study might be underpowered to detect the necessity effects and changes in slope, as the sample size was planned based on the magnitude of small-to-medium effect sizes commonly found in personality and social psychology, rather than on the magnitudes of necessity effects and changes in slopes, as we had no way to reasonably estimate the latter beforehand. Future replications of the current study could use simulations to determine the sample size needed to detect the effect sizes found in the current study.
Second, the results of the current study might be dependent on the principal components extracted, suggesting that replication studies will have different results if different wisdom components are extracted. Of concern is the fact that two commonly used measures of wisdom, the 3DWS and the Bremen wisdom paradigm, were excluded from the analyses that informed the key conclusions due to low componential loadings. As the 3DWS and the Bremen wisdom paradigm are prominent wisdom measures that meaningfully contribute to the discourse on the definition and operationalization of wisdom, not including these measures may limit the generalizability of the current findings to the construct of wisdom. Findings of the current study should thus be corroborated by other datasets before more definite conclusions can be drawn regarding the necessary conditions for wisdom.
Third, the current study measured intelligence using the WPT-Q, which could not distinguish between crystallized and fluid intelligence. The WPT-Q was chosen as it was the only reliable, valid, and cost-effective instrument suitable for online, unsupervised administration. However, as crystallized intelligence, or the knowledge of the world and learnt operations, has been shown to be more strongly associated with wisdom than fluid intelligence, or the general ability to solve novel problems that is independent of learning (e.g., Dong et al., 2022;Grossmann et al., 2012;Mickler & Staudinger, 2008;Staudinger et al., 1997), the inability to distinguish between the two aspects of intelligence limits the scope of the current study. Future studies should further explore the necessity of intelligence for wisdom by examining fluid and crystallized intelligence separately. Fourth, in order to limit the length of the study protocol and avoid participant fatigue, meaning-making was only measured for one specific situation. It is possible that this one state measure of meaning-making might not accurately reflect participants' general tendencies to make meaning out of life experiences or represent individual differences in the construct. This may then affect our ability to detect significant necessity effects of meaning-making on wisdom. Future studies should thus re-examine the necessity of meaning-making for wisdom using measures that can better reflect individuals' general tendencies to make meaning and individual differences in the construct.
Fifth, findings of the current study should be interpreted as probabilistic and not categorical. Given that the current study examined a sample drawn from the population, not the population itself, significant necessity effects indicated that high levels of wisdom were relatively unlikely, but not impossible, with low levels of certain cognitive and personality characteristics. It is thus incorrect to conclude based on the present findings that low levels of these characteristics categorically preclude one from being wise. Finally, the cross-sectional nature of the data and the statistical analyses employed dictate that the current study is unable to offer any insights into the causal relationships between the cognitive and personality variables on the one hand and wisdom on the other hand. Specifically, neither the NCA nor the segmented regression analysis makes any causal assumptions and their results cannot be used to draw causal conclusions. Furthermore, in logic, the statement that one variable is a necessary condition for another variable is not a statement of causal relations. Given the nature of its data and analytical techniques, therefore, the results of the current study should not be interpreted as indicating that the possession of certain cognitive and personality characteristics causes, or even temporally precedes, wisdom attainment. Instead, results of the present study simply suggest that low levels of certain cognitive and personality characteristics are associated with a low (but not zero) probability of having high levels of wisdom. We acknowledge, however, that when researchers discuss intelligence and certain personality traits as necessary conditions for wisdom, the implication is often that these conditions are necessary because they are resources that can facilitate wisdom development and manifestation. While findings of the current study are consistent with this view, they cannot speak to the causal implications of it.

Conclusion
The current study offers initial insights into the conditions that might be necessary for wisdom attainment. We found that the necessary conditions for wisdom varied depending on how wisdom was measured. Specifically, intelligence was necessary for wisdom performance, whereas the Big-Five traits were necessary for self-reported wisdom. We note that these insights are preliminary and need to be replicated by future studies. Future studies should also explore other ways, beyond necessity, through which wisdom correlates might be associated with wisdom, such as sufficiency and causality.

Author Note
Some findings of this article were presented at the 2021 APS Virtual Conference (Dong & Fournier, 2021).
[...] righted WPT-Q), participant data (anonymized), codebook, and analysis scripts can be found on this paper's project page on the Open Science Framework: https://osf.io/du3r5/.
Submitted: November 07, 2021 PDT, Accepted: February 28, [...] PDT
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CCBY-4.0). View this license's legal deed at http://creativecommons.org/licenses/by/4.0 and legal code at http://creativecommons.org/licenses/by/4.0/legalcode for more information.
What Are the Necessary Conditions for Wisdom? Examining Intelligence, Creativity, Meaning-Making, and the Big-Five Traits Collabra: Psychology