Is healthy neuroticism associated with longevity? A coordinated integrative data analysis

Individual differences in the Big Five personality traits have emerged as predictors of health and longevity. Although there are robust protective effects for higher levels of conscientiousness, results are mixed for other personality traits. In particular, higher levels of neuroticism have significantly predicted an increased risk of mortality, no risk at all, and even a reduced risk of dying. The current study hypothesizes that one potential reason for the discrepancy in these findings for neuroticism is that interactions among neuroticism and other key personality traits have largely been ignored. Thus, in the current study we focus on testing whether the personality traits neuroticism and conscientiousness interact to predict mortality. Specifically, we borrow from recent evidence of “healthy neuroticism” to explore whether higher levels of neuroticism are only a risk factor for increased mortality risk when conscientiousness levels are low. We conducted a pre-registered integrative data analysis using 12 different cohort studies (total N = 44,702). Although a consistent pattern emerged of higher levels of conscientiousness predicting a reduced hazard of dying, neuroticism did not show a consistent pattern of prediction. Moreover, no study provided statistical evidence of a neuroticism by conscientiousness interaction. The current findings do not support the idea that the combination of high conscientiousness and high neuroticism can be protective for longevity. Future work is needed to explore different protective factors that may buffer the negative effects of higher levels of neuroticism on health, as well as other behaviors and outcomes that may support the construct of healthy neuroticism.

Neuroticism might not always increase mortality risk. Some studies provide strong evidence (e.g., large statistically significant hazard ratios) that higher neuroticism is associated with a reduced risk of dying (Korten et al., 1999; Weiss & Costa Jr., 2005). These samples differed from most others in that participants were older adults (65+); however, findings remained statistically significant after adjusting for age. Korten et al. (1999) suggested that those with higher levels of neuroticism might be more likely to seek medical help for problems, whereas individuals scoring low in neuroticism might not. Ploubidis and Grundy (2009) found neuroticism was a risk factor for increased mortality in men, but protective among women. Others have documented that when high levels of neuroticism were analyzed in conjunction with either higher socioeconomic status (Hagger-Johnson et al., 2012) or higher cognitive ability (Weiss et al., 2009), neuroticism was no longer a risk factor for increased mortality.
Meta-analytic findings suggest that higher neuroticism is either associated with increased mortality risk or there is no overall effect. Roberts and colleagues (2007) averaged the relative risk ratios of neuroticism from twelve studies and found that higher levels of neuroticism were indeed a risk factor for mortality. A more recent analysis from seven international samples (n = 76,000+) found higher neuroticism was associated with an increased risk of death in 3 studies (Health and Retirement Study, Midlife in the United States Study, and German Socio-Economic Panel Study) but null effects for the other four . The overall meta-analytic effect was not significant for neuroticism across studies, but there was some evidence that very high levels of neuroticism were a risk factor for mortality. Graham and colleagues (2017b) explored neuroticism-mortality associations in 15 longitudinal datasets and identified five studies in which higher neuroticism was associated with increased mortality risk (Health and Retirement Study, Religious Orders Study, Seattle Longitudinal Study, MRC National Survey of Health & Development, and Longitudinal Aging Study Amsterdam). Within these meta-analyses, no single study nor the aggregate meta-analytic effect found higher neuroticism as protective. However, higher neuroticism was not a universal risk factor for mortality as many individual studies found null effects. These two recent meta-analyses (Graham et al., 2017b;) include many of the same studies used in the current analysis. However, the current analysis utilizes updated data that includes many more deaths. Thus, it is key to determine whether findings remain consistent with this updated mortality information and whether the neuroticism by conscientiousness interaction emerges as a significant predictor of mortality risk across multiple studies.

Healthy Neuroticism
Although neuroticism is associated with a host of unhealthy behavioral choices and suboptimal health outcomes (Lahey, 2009), evidence suggests that higher neuroticism could be health-beneficial because anxiety drives individuals to be more vigilant of somatic symptoms (Costa Jr. & McCrae, 1987) as well as utilize health-care services (Have, Oldehinkel, Vollebergh, & Ormel, 2005). Can neuroticism be protective in some circumstances? Friedman (2000) introduced the idea of "healthy neuroticism" as a way to explain why high neuroticism might actually have some health benefits. He suggested that the typical life path of someone scoring high in neuroticism would be characterized by pessimism, anger and anxiety that ultimately leads to substance abuse, lack of preventative health care, chronic negative affect, and suboptimal physiological reaction patterns. Alternatively, a quite different path would be anxiety that drives more proactive health behaviors due to the concern of current and future health problems. Thus, a "healthy neurotic" is more likely to engage in exercise, healthy eating, and preventative health care to alleviate any anxiety about poor health or early death.
There is now empirical evidence that high levels of conscientiousness may buffer the negative behavioral effects of high neuroticism. Vollrath and Torgersen (2002), for example, examined personality typologies based on levels of neuroticism and conscientiousness (among other trait combinations) in relation to substance abuse behaviors. They found that individuals scoring lower in conscientiousness or higher in neuroticism were more likely to smoke, drink, or use drugs. However, when both conscientiousness and neuroticism were high, there was a decrease in substance use. Likewise, Terracciano and Costa (2004) explored neuroticism by conscientiousness interactions and found that adults scoring high in both conscientiousness and neuroticism were three times less likely to smoke cigarettes than those low in conscientiousness but high in neuroticism. Thus, higher neuroticism was particularly associated with increased smoking when conscientiousness was low. Further, Weston & Jackson (2015) found that after the onset of a major chronic disease, higher levels of neuroticism predicted less smoking when conscientiousness levels were also high. Lastly, higher levels of conscientiousness buffered the negative effects of high neuroticism in terms of reduced alcohol consumption and related problems (Turiano, Whiteman, Hampson, Roberts, & Mroczek, 2012). The aforementioned studies suggest that when conscientiousness levels are high, the predictive effects of high neuroticism for substance abuse behavior are attenuated.
Two studies found evidence for healthy neuroticism with regard to mortality risk. Friedman and colleagues (2010a) used archival data on over 1,000 participants from the Terman study, finding that higher neuroticism was protective for men but a risk factor for women. More importantly, they found a significant neuroticism by conscientiousness interaction for women. The plotted survival curves showed that women high in conscientiousness and low in neuroticism had the lowest mortality risk, but based on visual inspection we suggest that there was also a survival advantage when neuroticism was high in combination with high conscientiousness. As the authors suggested, higher neuroticism may become beneficial at certain life stages when challenges arise and appropriate actions need to be taken. Lastly, in a large meta-analysis of personality-mortality associations, trait interactions were explored, but no consistent pattern emerged across the studies and thus data were not presented in the article. A second study examined over 300,000 adults from the UK Biobank study (Gale et al., 2017) and found that higher levels of neuroticism predicted a reduced mortality risk among those rating their health as fair or poor. Although participants did not engage in healthier behaviors, they may have been more motivated to seek out health-care services for ailments. However, no such data were available to test this hypothesis.

The Current Study
The current study is the third in a series of three studies submitted to this issue. These studies were the result of a single, coordinated project involving multiple co-principal investigators and spanning research labs around the world. The goal of this project was to rigorously analyze evidence for "healthy neuroticism," narrowly defined as the significant moderation of the neuroticism-health relationship by conscientiousness. This analysis considered multiple components of health, specifically behaviors, development of chronic conditions, and mortality, following a lifespan trajectory. Despite the coordination and similarity of analyses among these three components of health, there were notable differences between the studies, including the datasets which could be used in the analysis, the quantitative models applied, and the theoretical rationale linking personality to health. Moreover, synthesizing this project in a single manuscript would have required over-simplification of critical details and decision points along the way. As a result, the three components were divided into three separate but linked manuscripts, allowing for the requisite detail to be included in each. Information on each study in this series can be found here.
This specific paper addressed two major gaps in the literature. First, we assessed whether higher levels of neuroticism were associated with an increased or decreased mortality risk in a diverse set of 12 longitudinal samples. Second, we tested the trait interaction between neuroticism and conscientiousness as a predictor of mortality. This examination allowed us to directly assess healthy neuroticism, that is, whether high neuroticism was only associated with increased mortality risk when conscientiousness levels were low. In other words, do higher levels of conscientiousness lead to reduced mortality risk among highly neurotic individuals? The use of a dozen high-quality longitudinal datasets addressed the uncertainty inherent in any single study investigating the neuroticism-mortality association. Our pre-registered hypothesis was that the relationship between neuroticism and mortality risk would vary as a function of conscientiousness level. Moreover, we updated our pre-registration just prior to running our analyses to explore whether a three-way interaction, represented by gender by neuroticism by conscientiousness, predicted mortality risk because prior work found that higher neuroticism could be protective in women (Ploubidis & Grundy, 2009), especially when conscientiousness levels are high (Friedman, Kern, & Reynolds, 2010b). Specifically, we hypothesized that higher levels of conscientiousness would more strongly buffer the negative effects of neuroticism on increased mortality risk.

Coordinated Analysis
Coordinated analysis is a form of integrative data analysis that compares optimally similar models across multiple studies (Graham et al., 2017b; Hofer & Piccinin, 2009). Since the methods used in reports from individual longitudinal studies are rarely identical, direct comparisons of published results are generally not possible. For example, the questions and response formats used to assess certain constructs often vary across studies. As such, coordinated analyses are conceptual replications: instead of harmonizing across measures, coordinated analysis uses harmonized models, thereby allowing the exact measurement of a given construct to vary. This approach lends itself to the assessment of generalizability and can add to the credibility of results. Variation among study characteristics has long been considered a constraint on the generality of findings in longitudinal research. In coordinated analysis, these differences can be a strength: instead of regarding these study-level characteristics (e.g., cohort, measurement scale, country) as sources of error, researchers may have the option to test these differences systematically as sources of heterogeneity. By harmonizing the models themselves, investigators are able to rule out model variations as a source of heterogeneity, and focus on study-level characteristics instead. Furthermore, if there is consistency in the direction and strength of effect in heterogeneous samples, then this is evidence for generalizability of the results.

Sample Information
We included studies affiliated with the Integrative Analysis of Longitudinal Studies of Aging and Dementia (IALSA) network that had the requisite data to test our hypotheses. Within the network, IALSA has established data sharing agreements with affiliates that have non-public data. For this project, studies with non-public data provided an analyst who analyzed the data from their specific study for the purpose of evaluating replicability of results. In total, 12 longitudinal studies (total N = 44,702; deceased N = 8,788) had appropriate data to address the questions in the current project. All models used listwise deletion.
Sample sizes for each model in each dataset are presented in Table 1. Ethical approval was obtained at each of the primary research sites for each study.
The Einstein Aging Study (EAS) is an observational longitudinal cohort study to examine cognitive aging and dementia, which began in 1993. Older adults who were at least 70 years of age, non-institutionalized, and native English speakers were systematically recruited from an urban, multi-ethnic, community-dwelling population in Bronx County, New York, USA. Participants receive comprehensive annual medical and neuropsychological evaluations (Katz et al., 2012). For the current project, baseline was defined as the first time at which participants had completed the personality scales, which ranged from 2005 to 2016. Mortality information was obtained via follow-up phone calls and searches within the Social Security Death Index. A total of 574 (M age = 78.98, SD age = 5.21, 59% female) participants completed the baseline personality measure and had mortality information available. A total of 128 individuals died over the 11.08 year follow-up (M survival = 51.15 months, SD survival = 29.39 months, range = 2-122 months).
The Health and Retirement Study (HRS) is a longitudinal panel study that tracks retirement-age adults in the United States. Data collection began in 1992, with new cohorts added throughout the 1990s and 2000s (Sonnega et al., 2014). Baseline was defined as the first time participants completed the personality scales (ranging from 2006 to 2014). Mortality information through 2014 was assessed through two methods. First, surviving panel members and family members were sought to obtain interviews. Second, a search was made through the National Death Index to confirm deaths. A total of 19,211 (M age = 66.26, SD age = 11.16, 59% female) participants completed personality measures and had mortality information available. Of these, 3,066 individuals died over the 9.13 year follow-up (M survival = 45.11 months, SD survival = 25.90 months, range = 1-105 months).
The Lothian Birth Cohort 1936 (LBC1936) consists of surviving individuals who completed the Scottish Mental Survey in 1947 (Deary, Gow, Pattie, & Starr, 2012; Taylor, Pattie, & Deary, 2018). The LBC1936 (birth) cohort was recruited between 2004 and 2007 by identifying individuals from the original 1947 (test) cohort who were residing in Edinburgh and the surrounding area. In total, 1,091 participants entered the study. Baseline was defined as 2006, the first wave of personality assessment for all participants. Mortality information was obtained by flagging at the National Records of Scotland, which provided data on the date and cause of each death in the cohort. A total of 962 (M age = 69.50, SD age = 0.84, 51% female) participants completed personality measures and had mortality information available. Of these, 218 individuals died over the 14.23 year follow-up (M survival = 90.59 months, SD survival = 40.03 months, range = 2-161 months).
The Long Beach Longitudinal Study (LBLS) started in 1978 and was composed of 589 adults aged 28-84 years old in southern California who participated in a study comparing alternate forms of the Schaie-Thurstone Adult Mental Ability tests as well as other psychometric measures (Zelinski & Kennison, 2001). A new cohort of participants was added and personality was assessed using the Revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992) in 1994/1995 and 2000/2001. Of these, 898 (M age = 64.81, SD age = 13.19, 56% female) participants completed personality assessments and had mortality information available. Mortality information through 2013 was obtained through the National Death Index. A total of 131 individuals died over the 19.00 year follow-up (M survival = 113.40 months, SD survival = 41.15 months, range = 12-204 months).
The Rush Memory and Aging Project (MAP) is a longitudinal, epidemiologic clinical-pathologic cohort study of common chronic conditions of aging with emphasis on decline in cognitive and motor function and risk of dementia (Bennett et al., 2018; Bennett, Schneider, Arvanitakis, & Wilson, 2012). Participants are older adults recruited from retirement communities and subsidized senior housing facilities throughout the Chicago metropolitan area and northern Illinois. Enrollment began in 1997 and clinical evaluations and cognitive assessments occur annually. Neuroticism was first collected in 1997 while conscientiousness and extraversion were first collected in 2004. Openness and agreeableness were not collected. Because MAP is an autopsy study, the exact date of death is known for most participants as it is the day an autopsy was performed. In addition to annual evaluations, participants are also contacted quarterly to determine vital status and changes in health, and death is occasionally learned of and documented during quarterly contacts. Mortality information in the current study included deaths up to 2017. A total of 653 (M age = 79.69, SD age = 7.13, 75% female) participants completed personality measures and had mortality information available. Of these, 268 individuals died over the 13.89 year follow-up (M survival = 95.99 months, SD survival = 31.43 months, range = 15-161 months).
The Sydney Memory and Ageing Study (MAS) is an ongoing longitudinal cohort study of brain aging and dementia in older individuals, who undertake medical, neuropsychological and psychosocial assessments biennially. Individuals without dementia aged 70-90 and living in the Australian community at baseline (between 2005 and 2007) were randomly recruited through the electoral roll (Sachdev et al., 2010). Mortality information through 2014 was obtained through the Australian Death Index. A total of 879 (M age = 78.71, SD age = 4.78, 46% female) participants completed personality assessments and had mortality information available. Of these, 180 individuals died over the 8.42 year follow-up (M survival = 52.84 months, SD survival = 24.91 months, range = 4-96 months).
The Midlife in the United States (MIDUS) study is an ongoing national study of 7,108 participants in the United States who were recruited in 1994/1995 and completed follow-up assessments in 2004/2005 and 2013/2014 (Brim, Ryff, & Kessler, 2004). Mortality information through October 2015 was obtained by three methods. First, National Death Index updates were conducted in 2006 and 2009. Second, deaths were recorded during the tracing/closeout phases after fielding the MIDUS 2 (2005-06) and MIDUS 3 (2013-15) questionnaires. Lastly, deaths were recorded during primary data collection. A total of 6,245 (M age = 46.84, SD age = 12.91, 53% female) participants completed personality measures at baseline (1995-96) and had mortality information available. Of these, 1,069 individuals died over the 21.12 year follow-up (M survival = 135.32 months, SD survival = 64.20 months, range = 1-245 months).
The Veteran Affairs Normative Aging Study (NAS) is a study of medical and psychosocial aging among men in the United States, and is funded by the United States Department of Veterans Affairs. The sample is based in the Greater Boston, Massachusetts metropolitan area and consists of 2,280 men who were enrolled between 1961 and 1970. The participants were on average 42 years old at enrollment (Bossé, Ekerdt, & Silbert, 1984). Personality was assessed beginning in 1990. A total of 992 (M age = 64.57, SD age = 7.46) men completed personality measures and had mortality information available. Mortality information is monitored by periodic mailings to participants and notification from next-of-kin or postal authorities. Records from the Department of Veterans Affairs and the Social Security Administration (Death Master File) are routinely reviewed for possible unreported deaths. When deaths were reported, a death certificate was obtained and coded for the cause of death.
The Older Australian Twins Study (OATS) is a multisite, longitudinal study of monozygotic and dizygotic twins at least 65 years of age, including 623 participants who completed baseline assessments (Sachdev et al., 2009). Personality was measured at study onset (2007) using the Revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992). Mortality information through 2014 was obtained from a number of sources (e.g., online indexes of deaths appearing in Australian newspapers, reports from family members), which were confirmed using the National Death Index. A total of 534 (M age = 71.40, SD age = 5.55, 65% female) participants completed personality assessments and had mortality information available. Of these, 56 individuals died over the 9.57 year follow-up (M survival = 63.76 months, SD survival = 28.53 months, range = 4-115 months).
The Religious Orders Study (ROS) is a longitudinal, epidemiologic clinical-pathologic cohort study of aging and Alzheimer's Disease that enrolled older Catholic nuns, priests, and brothers from more than 40 groups across the United States in 1994 (Bennett et al., 2018; Bennett et al., 2012). Participants do not have known dementia at baseline and agree to annual clinical evaluations, cognitive testing, and brain and other tissue donation after death. The NEO personality assessment was collected in 1994. A total of 1,394 (M age = 75.95, SD age = 7.47, 71% female) participants completed personality assessments and had mortality information available through 2017. Since ROS is an autopsy study, the exact date of death is known for most participants as it is the day that an autopsy was performed. In addition to contact for annual evaluations, participants are contacted quarterly to determine vital status and changes in health, and death is occasionally discovered during quarterly contacts and documented. Of the total 1,394 participants with personality data, 792 individuals died over the 24.00 year follow-up (M survival = 120.75 months, SD survival = 70.73 months, range = 1-288 months).
The Seattle Longitudinal Study (SLS) started in 1956 and has since collected data on close to 6,000 participants in a cohort-sequential design (Schaie, Willis, & Caskie, 2004). Participants were sampled randomly from members of a large health maintenance organization in the Seattle, Washington area. A total of 1,656 participants aged 26 to 101 completed at least one personality assessment between 2001 and 2012. Personality was assessed with the revised NEO Personality Inventory (Costa & McCrae, 1992). A total of 1,649 (M age = 64.75, SD age = 15.77, 55% female) participants completed personality assessments and had mortality information available. Mortality information through May 2017 was obtained through obituaries, family notifications, the Social Security Death Index (only current up to 2014), and websites such as Ancestry.com. Of the 1,649 individuals with personality data, 444 individuals died over the 16.19 year follow-up (M survival = 64.47 months, SD survival = 46.14 months, range = 2-190 months).
The Wisconsin Longitudinal Study (WLS) follows a cohort of men and women who graduated from Wisconsin high schools in 1957. Data from 10,317 participants span almost 60 years from the baseline assessment in 1957, with follow-up assessments collected in 1967, 1975, 1993, 2004, and most recently in 2011. In addition to the original cohort, subsequent assessments included randomly selected siblings and spouses of graduate participants (Herd, Carr, & Roan, 2014). Personality measures were first introduced in the WLS in 1993, which served as the baseline assessment for the present analyses. A total of 10,681 (M age = 53.76, SD age = 4.52, 54% female) participants completed personality assessments and had mortality information available. Mortality information through November 2014 was linked with records from the US National Death Index. Of these, 1,777 individuals died over the 22.33 year follow-up (M survival = 160.42 months, SD survival = 65.05 months, range = 3-264 months).

Turiano et al: Healthy Neuroticism and Mortality
Art. 33, page 7 of 16

Materials
The details of each measure, including item text and response choices, are described in Supplementary File 1, and descriptive statistics for each variable used are available here.

Vital Data
Information on how death was recorded within each study is described above. Those confirmed as dead by the end of the follow-up period (e.g., censor date) were coded as 1 and those still alive were coded as 0. We used two different time metrics to quantify survival time to ensure the robustness of effects. For the primary analysis, we created a continuous survival time variable based on the interval (in months) from the date when personality was first assessed to the date of the participant's death. Participants who were still alive (censored observations) had survival times that equaled the length of the maximum follow-up for that specific study (e.g., the last date a mortality update was provided). As a secondary analysis we used the participants' ages when they first completed the personality measures as the starting point and age at death (or censor date if they were still alive at the end of the follow-up) as the ending point. This analysis would test whether effects differed depending on the time metric utilized.
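The coding rules above can be illustrated with a minimal Python sketch. It is not the authors' script (their analyses were run in R); the function names, the average-month-length conversion, and the example dates are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 30.4375  # assumed average month length (365.25 / 12)


def months_between(start: date, end: date) -> float:
    """Approximate interval in months between two dates."""
    return (end - start).days / DAYS_PER_MONTH


def code_survival(baseline: date, death: Optional[date],
                  censor: date) -> Tuple[int, float]:
    """Return (event, months) for the primary time metric.

    event is 1 if death was confirmed by the censor date, else 0.
    Censored participants receive the full baseline-to-censor interval.
    """
    if death is not None and death <= censor:
        return 1, months_between(baseline, death)
    return 0, months_between(baseline, censor)


def age_interval(age_at_baseline: float, months: float) -> Tuple[float, float]:
    """Secondary time metric: (start age, end age) in years."""
    return age_at_baseline, age_at_baseline + months / 12.0


# Hypothetical participant: assessed in 2006, died in 2008, censor date end of 2014.
event, months = code_survival(date(2006, 1, 1), date(2008, 1, 1), date(2014, 12, 31))
start_age, end_age = age_interval(70.0, months)
```

Under the age metric, the same event indicator is kept and only the time axis changes, which is why the two analyses can be compared directly.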

Personality
Each of the 12 samples had a baseline assessment of the Big Five personality dimensions. All but two studies assessed all of the Big Five traits. Notably, the MAP did not assess agreeableness or openness, and the MAS did not assess agreeableness or extraversion. The most commonly used measures were from the NEO family of instruments (Costa Jr. & McCrae, 2008). The NEO-PI-R was used in the MAS, OATS, and LBLS, while the NEO-FFI was used in the ROS and MAP. Two studies used the MIDI (Lachman & Weaver, 1997; HRS, MIDUS), one used the IPIP-50 (LBC), three used the IPIP adjectives (Goldberg, 1992; Saucier, 1994; EAS, LBC, NAS), and one used the BFI (John & Srivastava, 1999; WLS). Cronbach's alpha coefficients were calculated from the samples of participants who were eligible for analysis. Internal reliability estimates for neuroticism ranged from 0.71 (HRS) to 0.93 (SLS). Internal reliability estimates for conscientiousness ranged from 0.56 (MIDUS) to 0.90 (SLS). Since the Big Five traits are broad constructs, there is variation across different measures in terms of the underlying facets/items used in a specific scale (John, Naumann, & Soto, 2008). Thus, the different measures used in the current analysis are beneficial because they capture slightly different ranges of meaning for both neuroticism and conscientiousness. For example, the IPIP-50 measure of conscientiousness included items assessing responsibility, practicality, and thriftiness, but not self-discipline or efficiency; in contrast, the BFI measures competency and achievement but not goal-striving. To some extent, this difference in trait coverage across scales is an asset to these analyses because a lack of significant differences in effects would suggest that estimated relationships are robust to choice of scale, while significant differences may point to more specific aspects of broad domains that are more strongly associated with outcomes (Hampson, 2012; Paunonen & Ashton, 2001).
We provide a table of content measurement by scale online to highlight the difference in these measures.

Covariates
All analyses included age, education, and sex because of their known associations with personality traits and mortality risk. All Big Five traits were included in the models when available. Age for all studies was the participant's age at the baseline assessment of personality and was standardized within each study. Gender was coded as 1 (female) or 0 (male). Note that NAS was a male-only sample and as such did not adjust for gender. Education in EAS, HRS, ROS, OATS, LBLS, MAS, and MAP was measured as the total number of years of education. In MIDUS, NAS, and WLS, education was measured using an ordinal scale that referred to the highest degree earned. We standardized education within each study.
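Within-study standardization can be sketched as follows. This is an illustrative Python example with made-up ages, not the project's R code; the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score a covariate within one study (sample SD, an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical baseline ages from a single study.
ages = [60, 70, 80]
z_ages = standardize(ages)  # centered at 0, in within-study SD units
```

Because each study is standardized separately, a coefficient describes a one-SD difference relative to that study's own distribution. This matters when comparing samples with very different age ranges, such as LBC1936 (SD age = 0.84) and SLS (SD age = 15.77).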

Analytical Plan
Analytic decisions for this project were made prior to the analysis of any data by the first (NT), second (EG), and third (SW) authors of this manuscript. Development of the R scripts was also completed prior to analyses by the third author, using randomly seeded subsets of the HRS data. The HRS data were used as a template to develop code so the first three authors could evaluate whether the R syntax ran successfully and provided the output required to evaluate the study hypotheses. This R script created an output object containing meta-data, descriptive statistics, statistics from the inferential models, and values predicted from the model. The "meta-analysis" scripts were written to extract relevant information from each of the output objects and evaluate the overall effect from the individual analyses. Analyses, including scripts, were pre-registered at this point. After pre-registration, the template scripts for the individual study analysis were sent to each study's data analyst, who then completed the analysis. All scripts and data objects (not the data themselves, in order to comply with data sharing agreements) are available on OSF. We used R (Version 3.6.2; R Core Team, 2018) and the survival package. Functions in the metafor package were used to estimate the overall effects and heterogeneity between studies, as well as to create the forest plots. The sjPlot package was used to calculate predicted values for each study and the ggplot2 package was used to visualize effects.

Individual Study Analysis
We estimated a series of Cox proportional hazards models (Cox, 1972) to examine whether neuroticism and conscientiousness interacted to significantly predict the hazard of dying in each study. Cox models take into account the occurrence of a discrete outcome event (e.g., whether the person died or not over the specified follow-up) as well as the time interval of survival/death (e.g., how long did someone survive). For each predictor, a hazard ratio is estimated that indicates whether each standard deviation increase in that trait is associated with a reduced or increased hazard of death for the specified follow-up period. Based on our pre-registered study aims, we estimated three Cox models. First, we included baseline age, gender, education, and each of the Big Five personality traits as predictors of mortality. Second, we added the neuroticism by conscientiousness interaction to the model. Third, we included a three-way interaction (neuroticism by conscientiousness by gender), as well as the underlying two-way interactions to the model (neuroticism by gender; conscientiousness by gender).
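For readers who prefer notation, the three nested models can be written as below. The symbols are ours, introduced only for illustration: $h_0(t)$ is the unspecified baseline hazard, $\mathbf{x}_i$ collects age, gender, education, and the remaining Big Five traits, $N_i$ and $C_i$ are standardized neuroticism and conscientiousness, and $G_i$ is the gender indicator.

```latex
\begin{align}
h_i(t) &= h_0(t)\,\exp\!\big(\eta_i^{(k)}\big), \qquad k = 1, 2, 3 \\
\eta_i^{(1)} &= \mathbf{x}_i^{\top}\boldsymbol{\gamma} + \beta_N N_i + \beta_C C_i \\
\eta_i^{(2)} &= \eta_i^{(1)} + \beta_{NC}\, N_i C_i \\
\eta_i^{(3)} &= \eta_i^{(2)} + \beta_{NG}\, N_i G_i + \beta_{CG}\, C_i G_i + \beta_{NCG}\, N_i C_i G_i \\
\mathrm{HR}_N(c) &= \exp\!\big(\beta_N + \beta_{NC}\, c\big) \quad \text{(Model 2, at } C_i = c\text{)}
\end{align}
```

In Model 1, $\exp(\beta_N)$ is the hazard ratio per standard deviation of neuroticism. In Model 2, the hazard ratio for neuroticism depends on the conscientiousness level $c$, so a healthy neuroticism pattern would correspond to $\beta_{NC} < 0$: the hazard associated with neuroticism shrinks as conscientiousness increases.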
We also fitted additional models to evaluate the robustness of effects. First, we tested an alternative time metric by using participants' ages (i.e., age at the baseline personality assessment and age at death, or age at censoring for those who survived) and ran the same three models as described above. Second, we tested the proportionality assumption of our Cox models by interacting survival time with the neuroticism by conscientiousness interaction. A statistically significant interaction would suggest that the interaction effect of neuroticism and conscientiousness on mortality risk differed at certain times within each study. Results from these additional models can be found in the supplemental material. All analyses for the current project were preregistered and were conducted using R software.
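The proportionality check described above can be expressed as a time-varying coefficient on the interaction term. This is a sketch in illustrative notation; the function g(t) of follow-up time (for example, g(t) = t) is an assumption, since the exact functional form is not stated here.

```latex
% Illustrative notation (assumed): \eta_i^{(2)} is the linear predictor of the
% interaction model (covariates, the five traits, and the N-by-C product);
% g(t) is an unspecified function of follow-up time.
\begin{align*}
h_i(t) &= h_0(t)\exp\big(\eta_i^{(2)} + \gamma\,N_i C_i\,g(t)\big) \\
H_0 &: \gamma = 0 \quad \text{(the interaction effect is constant over follow-up)}
\end{align*}
```

Rejecting the null hypothesis would indicate that the hazard ratio for the interaction changes over the follow-up period, which is what a violation of proportional hazards means for that term.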

Meta-Analysis
The analytic tools of meta-analysis were used to estimate the average weighted (by sample size) effect size for conscientiousness and neuroticism main effects, as well as for the interaction of neuroticism and conscientiousness predicting mortality risk. Using random effects meta-analysis, this calculation included an estimation of heterogeneity in the effects between studies. The forest plots show the individual study effects, as well as the meta-analytic summaries.
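To make the pooling step concrete, the following minimal Python sketch combines study-level log hazard ratios under a random-effects model. It is not the metafor script used for the reported results: it substitutes the closed-form DerSimonian-Laird estimator of between-study variance and inverse-variance weights, and the function name and all input values are hypothetical.

```python
import math

def random_effects_meta(log_hrs, ses, z_crit=1.96):
    """Pool log hazard ratios with the DerSimonian-Laird random-effects estimator.

    log_hrs: per-study log hazard ratios; ses: their standard errors.
    Both inputs are hypothetical here, not values from the studies analyzed.
    """
    k = len(log_hrs)
    w = [1.0 / se ** 2 for se in ses]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    # Q-based I^2 (percent of variability attributable to heterogeneity)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z_crit * se_pooled),
        "ci_high": math.exp(pooled + z_crit * se_pooled),
        "q": q, "df": df, "tau2": tau2, "i2": i2,
    }

# Hypothetical example: four studies with made-up estimates
result = random_effects_meta([-0.30, -0.05, -0.20, 0.02], [0.05, 0.04, 0.06, 0.05])
print({name: round(value, 3) for name, value in result.items()})
```

Because the estimator of between-study variance and the I² formula differ from those in other software defaults, values from this sketch need not match the heterogeneity statistics reported in the Results.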

Statistical Power
Given the anticipated sample size, we did not conduct a statistical power analysis with regards to our interaction coefficient estimate, or the coefficient estimates of neuroticism and conscientiousness at the time of preregistration. Moreover, our power to detect significant heterogeneity in effect sizes between studies is unclear. During the revision process of this manuscript we conducted a post-hoc power analysis using methods described by Hedges and Pigott (2004). Based on the within-study variability and the number of studies, we estimate that we are sufficiently able (power > .90) to detect heterogeneity of at least a standard deviation (in odds ratios) of .05 (corresponding τ of .08).

Results
Descriptive data for each study can be found here. Table 2 contains the main effects for trait neuroticism and conscientiousness (Model 1), as well as the neuroticism by conscientiousness interaction effects (Model 2). These models adjusted for age, gender, education, and the other Big Five personality traits. In Model 1, only 3 studies (HRS, MIDUS, and ROS) showed a significant association between higher neuroticism and greater hazard of dying over the corresponding study follow-up period. None of the other studies showed significant associations for neuroticism (although the hazard ratios were in the same direction). The averaged association (weighted by N) was statistically significant (HR = 1.05; 95% CI = 1.02-1.07). There was no significant heterogeneity between the studies (Q(11) = 6.93, p = .805; I² = 9.14). All but four studies (LBLS, NAS, OATS, and SLS) showed a significant protective effect of higher conscientiousness predicting a reduced hazard of dying, with a significant overall meta-analytic effect for conscientiousness (HR = 0.88; 95% CI = 0.84-0.93; Q(11) = 31.52, p = .001; I² = 63.55). None of the neuroticism by conscientiousness interactions were statistically significant (Model 2). Figure 1 contains the forest plot summarizing the individual study estimates for the neuroticism by conscientiousness interaction as well as the averaged effect (weighted by N) and meta-analytic statistics (I²) at the bottom of the plot. Lastly, none of the three-way interactions (neuroticism by conscientiousness by gender) were statistically significant (see link).
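As a reading aid, a reported hazard ratio and its 95% confidence interval can be converted back to an approximate standard error and z statistic on the log scale. The Python sketch below does this for the two pooled main effects reported above; the function name is hypothetical, the normal approximation is an assumption, and because the published values are rounded the results are only approximate.

```python
import math

def z_from_hr_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Approximate SE, z, and two-sided p from a hazard ratio and its 95% CI.

    Assumes the interval is symmetric on the log scale (normal approximation).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a normal reference
    return se, z, p

# Pooled main effects as reported in the text (rounded values)
for label, hr, lo, hi in [("neuroticism", 1.05, 1.02, 1.07),
                          ("conscientiousness", 0.88, 0.84, 0.93)]:
    se, z, p = z_from_hr_ci(hr, lo, hi)
    print(f"{label}: SE(log HR) ~ {se:.4f}, z ~ {z:.2f}, p ~ {p:.2g}")
```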

Supplementary Analyses
To further probe the data, we conducted additional analyses that were not part of the initial preregistration but were added to address comments during the editorial process. First, we estimated additional models to determine if the main effects of conscientiousness or neuroticism changed when we did not adjust the models for the effects of the other personality traits (models were adjusted for age, gender, education, and either conscientiousness or neuroticism). The main effects for conscientiousness were very similar, but differences did emerge for neuroticism (see link).
The main effects for neuroticism in HRS and MIDUS were still significant, but they were now protective. The averaged effect was also significant and protective. However, no study showed a significant conscientiousness by neuroticism interaction effect, nor was there an averaged interaction effect, when not controlling for the other Big Five traits (see link).
We also conducted additional analyses by using current age at baseline and attained age at death/censor date to ensure that our choice of time metric did not change the pattern of findings. Findings were not appreciably different if we used time or age as our time metric. We also tested the proportionality assumption by interacting time/age with the neuroticism by conscientiousness interaction. Two studies showed a slight violation of the proportionality assumption. In the MAS study, the neuroticism by conscientiousness interaction was significant within approximately the first 16 months of the study follow-up (among the youngest age groups in the study). Within WLS, the neuroticism by conscientiousness interaction was significant within approximately the first 46 months of the study follow-up. These deviations were very minor and no consistent pattern emerged. The effect of the interaction was a risk-factor in the MAS whereas the interaction was protective within the WLS. Due to these minimal deviations and inconsistency in direction of effects, we remain hesitant to interpret these effects.

Discussion
The current project analyzed 12 different longitudinal studies to identify whether neuroticism and conscientiousness predicted mortality risk. The results are comparable to prior work suggesting that higher levels of conscientiousness are associated with decreased mortality risk. Significant associations between higher neuroticism and increased mortality risk were found in only three studies, while null effects were found in the others, which is similar to an earlier investigation of many of the same studies (Graham et al., 2017). However, the main goal of this project was to test whether neuroticism and conscientiousness interact in their association with mortality. We hypothesized that higher levels of neuroticism would be associated with an increased mortality risk but only when levels of conscientiousness were low. In other words, higher levels of conscientiousness would buffer the harmful effects of high neuroticism on mortality risk. Within each study, as well as in the pooled results, we found no statistical or descriptive evidence that such a trait interaction was associated with mortality risk. This lack of findings across the full set of studies is convincing since we had adequate power to find such associations if they existed.
The foundation of the current study was built on emerging evidence that the interaction of neuroticism by conscientiousness is associated with a host of behaviors and outcomes (Roberts, Smith, Jackson, & Edmonds, 2009; Turiano, Mroczek, Moynihan, & Chapman, 2013; Turiano et al., 2012; Vollrath & Torgersen, 2002). Since several investigations found that higher levels of neuroticism were associated with a reduced risk of dying (Korten et al., 1999; Weiss & Costa Jr., 2005), we borrowed from the healthy neuroticism concept to explain these somewhat counterintuitive findings. The large and diverse samples utilized, different personality assessments, long and short follow-up durations, and different age cohorts bolster our study design. Across the 12 studies, there was no clear evidence that higher levels of neuroticism had any protective effects. Only in our supplemental analyses requested by reviewers did we find that higher levels of neuroticism were associated with a reduced risk of death (MIDUS and HRS). However, this only occurred when we did not adjust for other personality traits (extraversion, agreeableness, and openness). We are hesitant to put much weight on these two individual findings because theoretical and conceptual frameworks suggest the Big Five are oblique factors and thus all five traits should be included in the same model. Regardless, we suggest several future directions to explore whether alternative operationalizations of healthy neuroticism may identify when neuroticism may have health-beneficial effects.
A strength of the current study is the large bandwidth of trait neuroticism and conscientiousness we covered with the different personality measures included in the analysis. The various measures adequately captured different aspects of neuroticism such as anxiousness/nervousness, depression, and moodiness, and the responsible, self-disciplined aspects of conscientiousness. The IPIP-50, IPIP, and Goldberg measures were very similar in adjective lists, while the BFI, BFI-S, and NEO measures had substantial overlap. Even with the diversity of personality measures, we still found a consistent pattern of null findings. Thus, we believe that the lack of neuroticism by conscientiousness interactions would generalize to studies using similar measures of the Big Five. However, we limited our analysis in the current paper to just the broad domains of neuroticism and conscientiousness. It is possible that only certain facets of neuroticism or conscientiousness drive healthy neuroticism. For example, high levels of underlying facets such as depressed affect or emotional reactivity are likely not conducive to health under any circumstances. However, higher levels on facets such as anxiety and self-consciousness could have protective effects on health if driven by proactive thinking (in this case, higher conscientiousness). Similarly, certain components of conscientiousness, such as being systematic and self-disciplined, may be more health-beneficial than others. Previous investigations have demonstrated that compared to analyses at the broad trait level, facet-level analysis (Hampson, 2012; Paunonen & Ashton, 2001) or even item-level analysis (Mottus, Bates, Condon, Mroczek, & Revelle, 2017) can provide more leverage in the prediction of important life outcomes. For example, Gale and colleagues (2017) found that facets of worry and vulnerability were driving the effects of neuroticism in predicting a reduced risk of mortality. However, in Gale et al.'s study, these facets were derived from a bifactor model of neuroticism items and were uncorrelated with overall neuroticism, which was predictive of increased mortality hazards. Alternatively, others have found that greater impulsiveness was the facet driving the protective effect of high neuroticism being associated with increased longevity (Weiss & Costa Jr., 2005). The authors suggested that this puzzling finding may have something to do with biological resilience. We were unable to systematically analyze more specific facets and items in the current study due to the complexity of the different studies involved and the different measures of personality. Each scale had a different underlying facet structure, or no set facet structure at all due to the brevity of some of the measures. Thus, future research exploring facet-level interactions between neuroticism and conscientiousness should be one of the next steps to provide better insight into when higher levels of neuroticism may not be detrimental for health.
We also explored how findings may have differed by gender, but we did not find evidence of a significant three-way interaction involving gender, neuroticism, and conscientiousness. Since females typically score higher in neuroticism compared to males (Weisberg, DeYoung, & Hirsh, 2011), and prior work shows that neuroticism may not predict mortality risk similarly in females (e.g., Friedman et al., 2010b; Ploubidis & Grundy, 2009), we hypothesized that higher levels of conscientiousness would be more consequential for the higher levels of neuroticism in females. Thus, a stronger buffering effect could be evident and help to explain inconsistencies in prior work finding gender discrepancies in neuroticism-mortality associations. Future work would benefit from exploring gender differences in personality-mortality associations, and gender differences in the personality-health behavior associations. Relatedly, it could also be fruitful to examine non-linear effects of neuroticism. One prior study investigated non-linear effects of neuroticism and found a classic U-shaped association between anxiety and mortality risk (Mykletun et al., 2009). Mortality risk was increased when anxiety levels were within the lowest and highest quartiles but not in the two more average levels of anxiety. This finding parallels the Yerkes-Dodson model of arousal (Yerkes & Dodson, 1908), supporting the notion that some anxiety may be health protective. Future work should attempt to determine the critical threshold level for neuroticism (or more specifically anxiety) to be "healthy" as well as whether that protective threshold depends on certain individual characteristics (e.g., gender, socioeconomic status) or differs across stages of the life span (e.g., emerging adulthood versus old age).
Revisiting Friedman's (2000) initial idea of healthy neuroticism, we need to be clear that testing healthy neuroticism via a neuroticism by conscientiousness interaction was our group's conceptualization of how to empirically identify healthy neuroticism. Since several studies have shown a buffering role of conscientiousness when neuroticism was high (e.g., Turiano et al., 2012, 2013), we thought a formal test of this interaction was the best approach to potentially explain discrepant findings of the neuroticism-mortality association. Future tests of healthy neuroticism may address alternative operationalizations such as neuroticism interactions with other positive individual difference constructs like perceived control beliefs (Infurna, Gerstorf, Ram, Schupp, & Wagner, 2011; Lachman & Firth, 2004), social support (Uchino, 2006), and purpose in life (McKnight & Kashdan, 2009). Sociodemographic characteristics have also emerged as possible constructs to explore. Poorer self-rated physical health (Gale et al., 2017), higher socioeconomic status (Hagger-Johnson et al., 2012), and better cognitive functioning (Weiss et al., 2009) interact with neuroticism such that the risk of higher neuroticism for death was attenuated. Although we did not test these alternative models in the current study, we plan to do so in future research and hope others with suitable data will also pursue such investigations. These psychosocial constructs are just a few examples of variables that could interact with neuroticism in a way that buffers the damaging effects of high neuroticism levels. We also must be clear that using the word "healthy" to describe neuroticism would only be accurate if higher levels of neuroticism had a positive impact on health, not just that the negative effects of higher neuroticism were absent when conscientiousness levels were high. Thus, caution should be used when describing the absence of a neuroticism effect versus a protective effect on health.
The current study focused solely on examining direct effects between personality and mortality, but future work would benefit from exploring indirect effects. Prior work shows that the neuroticism by conscientiousness interaction predicts health behaviors such as alcohol and tobacco use (Turiano et al., 2012; Graham et al., 2020), such that highly neurotic individuals who are also conscientious do not engage in increased alcohol or tobacco use. Moreover, in this current issue we examined healthy neuroticism and health behaviors and found support across studies that higher levels of conscientiousness did in fact buffer the association of higher neuroticism with smoking and physical inactivity. Since alcohol and tobacco use and physical inactivity are among the leading behavioral contributors to earlier mortality (Mokdad, Marks, Stroup, & Gerberding, 2004), it would be logical to expect a person scoring higher in neuroticism who does not engage in such behaviors to have a health and/or longevity advantage. However, although so-called healthy neurotics engaged in slightly better health behaviors, this effect was likely not substantial enough to impact overall health in terms of the chronic stress or mortality outcomes we examined in the current study. It would be important for future work to empirically test indirect effects that would allow estimation of whether health behaviors (e.g., alcohol or tobacco use, physical inactivity) explain or mediate the association between the neuroticism by conscientiousness interaction and multiple health outcomes.
We also need to ensure that findings from the current study are weighed in careful consideration of study limitations. First, although we utilized a variety of longitudinal data sets from different countries, the sample is limited to those listed in the IALSA framework and includes mostly studies of middle-aged to older adults. However, this is also a strength of the current study because mortality rates are higher in later adulthood, thus giving us more power to detect associations with personality. We also did not explore cause-specific mortality in this analysis, which could possibly shed light on potential deaths that may be more closely linked to healthy neuroticism (e.g., cardiovascular death). Lastly, it should be noted that since we used listwise deletion for missing data, the results are biased toward those who had complete data. Moreover, since research has shown that personality is associated with drop-out, future studies should systematically test for selective attrition.
In closing, the current study, part of this three-paper coordinated analysis, investigated one operationalization of healthy neuroticism, so conclusions about whether healthy neuroticism exists for mortality risk are limited to this single operationalization. We are confident in the null findings, due to the adequate power we had to detect associations if they were there, as well as the remarkably homogeneous estimates from each study. Since the included studies are largely WEIRD samples (i.e., Western, Educated, Industrialized, Rich, and Democratic; Henrich, Heine, & Norenzayan, 2010), our results may not generalize to Eastern cultures or less-industrialized countries, or to disadvantaged populations within the cultures studied here. Results are also limited to the specific mortality outcome we studied. We believe these results would generalize to measures of personality that were not used in the present study. There was little evidence that the present scales yielded different conclusions from each other, and we have no reason to believe these scales vary in a systematic way from other scales. We also believe that these results are likely to generalize over time, with the caveat being that improvements in society, specifically health care, may change the average life span and the incidence of disease and disability.

Conclusions
We sought to explain discrepancies regarding how neuroticism was associated with mortality risk. Specifically, we attempted to find evidence of healthy neuroticism via a neuroticism by conscientiousness interaction. Although we did not find support for this operationalization of healthy neuroticism in predicting mortality risk, the current study was rigorous in terms of the methodological approach, the number and quality of the longitudinal data sets incorporated into our coordinated analysis, and the overall utility of predicting the ultimate objective health outcome: mortality. While this neuroticism by conscientiousness trait interaction does not appear to predict mortality risk, the idea of healthy neuroticism is appealing because it takes a well-established body of literature showing higher levels of neuroticism are associated with almost universally detrimental behaviors and outcomes (Lahey, 2009), and suggests this literature is incomplete because findings may not be as universal as once thought. We suggested several future directions to explore healthy neuroticism that can be used to identify under what conditions, for whom, and when in the lifespan higher levels of neuroticism may not be detrimental for health and may in fact be health-protective.