The Convergence of Self and Informant Reports in a Large Online Sample

Collabra: Psychology

Self-reports are the most widely used method in personality psychology, largely due to their relative ease and cost-efficiency. Informant-reports, by contrast, are underutilized in personality research despite their numerous benefits. Many researchers falsely assume informant-reports are substantially more difficult to collect than self-reports, even in the context of increasingly easy data collection online. Here, we demonstrate the feasibility of collecting informant-reports from a large, global sample using an entirely opt-in procedure and establish the convergent validity of these informant-reports with self-report data at three levels of analysis.
Multiple models of personality emphasize an outside observer's ability to provide non-redundant, relevant information about a target's traits (McAbee & Connelly, 2016; Vazire, 2010). Most build on the idea that the individual's internal perspective (self-report) provides insight on one's identity, whereas an outside perspective (informant-report) offers a look at one's reputation (Hogan, 2007). For some purposes, this outside perspective may be an even better assessment of an individual's traits. In one meta-analytic review (Connelly & Ones, 2010), informant-reports surpassed self-reports as predictors of academic achievement and work performance.
Informant-reports have a long-standing reputation as being difficult to collect, though it is unclear whether this reputation is deserved. Vazire (2006) debunked multiple preconceptions about the time, effort, and cost required to collect informant-reports, and several prior studies demonstrate the feasibility of collecting moderate to large samples of informant-reports (Clifton et al., 2005; K. Lee & Ashton, 2017; Vazire, 2006). Yet, the impact on subsequent personality research has been negligible. In Vazire's (2006) analysis of the studies published in the Journal of Research in Personality (JRP) in 2003, only 24% of studies included informant-reports. The situation has not improved over the last 15 years. Among the 2017 JRP studies that collected data, only 11 (16%) used informant-reports; 65 (98%) used self-reports and, among these, 55 (83%) collected only self-reports. Studies published in the Personality Processes and Individual Differences section of the Journal of Personality and Social Psychology show a similar trend. Among the studies that collected data in 2018, only 4 (13%) used informant-reports; 30 (94%) used self-reports and 22 of these relied solely on self-reports. One possible explanation is that the aforementioned examples of prior work using informant-reports do not generalize well; perhaps the historical reliance on undergraduate samples, for example, differs from online data collection practices in ways that are relevant for informant-reports. Or, perhaps the evidence that informant-reports are under-utilized in personality research was not sufficiently widely recognized to overcome the perception that they are difficult to implement.
How can the situation be improved? One solution is to establish informant-report procedures that eliminate the need for increased costs or time. Reducing added burdens to near zero would make large-scale informant-report collection more feasible. Moreover, it would help to show that large informant-report samples need not be restricted to in-person samples (typically involving undergraduate students). With over 5 billion internet users worldwide (Internet World Stats, 2021), over half the world's population is available to researchers via the web, for little to no cost. Online, researchers can offer participants the chance to "opt in" to an informant-report system and send rating requests to friends, family, colleagues, and others. As part of the Synthetic Aperture Personality Assessment (SAPA) Project (https://sapa-project.org), we collected 1,554 informant-reports for 921 unique targets using this completely opt-in, web-based procedure. Our aim is two-fold: to demonstrate the feasibility of collecting large samples of informant-reports from an online, global population using a convenience sampling technique and to show the congruence of self- and informant-reports at three levels of analysis: at the Big Five, among the 27 domains of a lower-order assessment model, and at the item level. In addition, we seek to describe group-level differences between the participants who interact with the opt-in informant-report system and the general sample.

Method

Procedure
The self-report and informant-report data described here were collected between May 20, 2013 and February 6, 2017. Participants were motivated to visit the site in order to learn more about their personality, and were provided with customized feedback upon completion of the self-report survey. All participation was free and anonymous. After completing the survey but before receiving feedback about their personality based on their responses, participants were asked to consider sending an email to acquaintances, friends and/or family members asking them to complete a brief questionnaire about the participant's personality. After informed consent, participants who completed a self-report or informant-report were assigned a random identification number (RID).
Unlike many other online surveys, the SAPA Project is designed with the intention that no one participant will complete all available items (Revelle et al., 2016). Instead, SAPA participants receive only a random subset of the total items available. During this period of data collection, participants were given a random sample of 696 public domain items, mainly from the International Personality Item Pool (IPIP; Goldberg, 1999), and 60 cognitive ability items from the International Cognitive Ability Resource (ICAR; Condon & Revelle, 2014). Participants were required to respond to at least 22 items in order to receive feedback, but were allowed to complete as many as 300 personality items in total, plus as many as 26 optional demographic items. The resulting data are characterized as massively missing, as no one participant completed the entire set of available items. However, as each participant was presented with a random subset of the possible items, the resulting missingness is distributed completely at random. Because the level of missingness is greater than 80-90%, we refer to our data as Massively Missing Completely at Random (MMCAR). Unbiased correlation/covariance matrices at the item level were constructed from these MMCAR data, which can be scored in order to find the correlation of the constructs.
The opt-in nature of the informant-report system created various subsets of participants (Figure 1): those who opened the request page ("Openers"), those who sent rating requests ("Senders"), those who were rated by others ("Targets"), and those who chose to rate one or more targets ("Informants"). After completing an informant-report, a number of informants also completed the SAPA self-report survey. Participants were organized in nested subsets, wherein Senders are members of the group Openers, and Targets are members of the group Senders. This research complied with current APA standards of ethical treatment and was approved (determined exempt) by Northwestern University's Institutional Review Board. See supplementary materials to access the R script, which includes code for downloading the data directly into the R environment through an API.
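To illustrate why planned random missingness still supports item-level correlations, the following is a minimal sketch in Python (standard library only), not the SAPA scoring code. It simulates respondents who each answer a random 3 of 20 items (so about 85% of each item's responses are missing) and then correlates two items using only the respondents who happened to receive both. The function names, sample size, and the one-factor data-generating model are all assumptions made for illustration.

```python
import math
import random


def pairwise_correlation(x, y):
    """Pearson r using only cases where both items were administered (not None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None, n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None, n
    return sxy / math.sqrt(sxx * syy), n


def simulate_mmcar(n_people=40000, n_items=20, items_per_person=3, seed=1):
    """Hypothetical data: every item = shared trait + unit-variance noise,
    so the population correlation between any two items is 0.5."""
    rng = random.Random(seed)
    data = [[None] * n_people for _ in range(n_items)]
    for person in range(n_people):
        trait = rng.gauss(0, 1)
        # Each person sees a random subset of items (missing completely at random).
        for item in rng.sample(range(n_items), items_per_person):
            data[item][person] = trait + rng.gauss(0, 1)
    return data


if __name__ == "__main__":
    data = simulate_mmcar()
    r, n_pairs = pairwise_correlation(data[0], data[1])
    print(f"pairwise n = {n_pairs}, r = {r:.2f}")
```

Because item assignment is random, the respondents who received any given pair of items are a random subsample of everyone, which is why the pairwise estimate is unbiased even though no respondent has complete data. The price is precision: each correlation rests on far fewer cases than the total sample.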

Measures
Informants were asked to rate targets on 24 items modeled after the 12 aspects of feedback provided on the SAPA-Project during the data collection period. This feedback was based on the 10 Big Five Aspect Scales (DeYoung et al., 2007) plus Honesty and Humility. Response choices for these informant-report items ranged from 1 = "Very inaccurate" to 6 = "Very accurate". Additionally, Informants were asked to judge the target's physical attractiveness (Attractiveness), as well as math skills, verbal skills, and overall intelligence (these were indexed as "Rated IQ"). See Table 1 for full item content. Unlike the self-report survey in the SAPA project, Informants were presented with all available informant-report items. After completing ratings of the target, informants were invited to complete the SAPA-Project survey for themselves, and if they chose to do so, their RID was used to identify their SAPA self-report. We identified 270 unique Informants who completed the SAPA survey in addition to providing ratings of the target.
During the time of this data collection, the self-report items administered to participants were drawn at random from a diverse pool of public-domain personality inventories; see Condon (2014, 2018) for more information. The personality feedback provided to participants during this period was based on the hypothesis that the evidence supporting use of the Big Five Aspect Scales (BFAS; DeYoung et al., 2007) could be extended to a six-factor model (Ashton et al., 2007; Thalmayer et al., 2011). The hypothesized model, with 12 aspects and 6 factors, was also used as a framework for informant-report data collection. This theory-based structure is visible in the item content shown in Table 1. Subsequent to the data collection period, empirical evaluations of the structure of the self-report assessment model did not support the hypothesized framework. In fact, it was on the basis of these analyses that an alternative hierarchical model was first developed, the SAPA Personality Inventory (SPI; Condon, 2018).
The SPI is an empirically-derived, self-report personality assessment model with a hierarchical structure including both Big Five measures and a larger number (27) of more narrow, lower-order traits. As suggested above, the derivation of the SPI was based on the administration of nearly 700 personality items from dozens of (overlapping) public-domain personality inventories to three large samples (combined N > 125,000). All of these samples are included in the data used in the current study to report on self-informant correlations. The longest version of the SPI self-report measure, the one used in the current work, contains 135 items: each of the 27 lower-order traits is assessed with a 5-item scale, and each of the Big Five scales uses 14 items (only 70 of the 135 items are used for the Big Five scales). Unlike hierarchical personality models with fully "nested" traits at different levels (i.e., facets or aspects are nested under a specific Big Five domain), the 27 empirically-derived factors of the SPI are independent of the Big Five, meaning that several load highly on more than one higher-order factor. For example, in the SPI model, the lower-level factor "Adaptability" is positively correlated with Extraversion (.38) and Openness (.39), and negatively correlated with Neuroticism (-.39). Full documentation of the SPI is given in Condon (2018), including a transparent walkthrough of the code, sensitivity analyses, norms, and IRT parameters.
Here, we report mainly on self-informant associations of the SPI self-report model using the higher-level SPI-5 scales (Agreeableness, Conscientiousness, Neuroticism, Extraversion, and Openness) along with the 27 lower-level scales. Comparisons with several other self-report frameworks are possible using the publicly available data at https://dataverse.harvard.edu/dataverse/SAPA-Project. To contextualize the validity of the SPI-5 relative to other frameworks, we report the self-informant associations for three of the more widely-used self-report frameworks: the BFAS (DeYoung et al., 2007), the IPIP correlates of the Big Five Factor Markers (IPIPB5; Goldberg et al., 2006), and the IPIP correlates of the NEO-PI-R (IPIPNEO; Goldberg et al., 2006). The number of pairwise administrations between all of the self-report items in these frameworks (minimum = 1,988; Mdn = 2,770; M = 3,128.1) is considerably greater than the recommended number for stable correlational analyses (typically, 250 to 500; Kretzschmar & Gignac, 2019; Schönbrodt & Perugini, 2013).

Participants
All measures were administered in English. SAPA participants in the full self-report sample (N = 158,496) represent 213 different countries, with 68% (93,271) of the sample residing in the United States. Over half the sample was female (61%). A total of 1,554 informant-reports were received for 921 unique targets, of which 786 were matched to demographic information provided by participants who completed the self-report surveys. Targets resided in 64 countries, with roughly 67% (508) of Targets living in the United States. Sixty-five percent of Targets were female.
Participants who completed an informant-report were assigned a 9-digit RID that did not persist beyond their web browser session; this was done to prevent the use of tracking software and potentially identifying log-ins. If an informant closed and reopened their web browser, they were assigned a new RID. Therefore, we can only estimate how many unique Informants completed the informant-reports. Out of the 1,554 reports received, only 30 Informants indicated they had previously completed one of our informant-reports and thus, we estimate the total number of unique Informants is approximately 1,500. The majority of Informants reported being friends of the target (55%), followed by romantic partners (12%), spouses (7%), siblings (7%), and parents (6%). Eighty-eight percent of Informants reported knowing the target "fairly well" or "extremely well." Of the total number of Informants, 270 also completed the SAPA self-report. Additional demographic information is provided in Table 2.

Analysis
If a target was rated by the same informant more than once, the most complete report was taken. If reports were similarly complete, the first report was taken. Of the 1,568 informant-reports, 14 reports were excluded for duplicated informant/target pairs, leaving 1,554 informant-reports for further analysis. Informant-reports were matched to self-reports with the last six digits (Original Identification Number, OID) of the target's RID. Twelve ratings were manually matched by time-stamp, as a small number of OIDs corresponded to more than one RID. Of the 921 unique targets, 786 were successfully matched to a RID in SAPA. All analyses with both self-reports and informant-reports included these 786 unique participants. Analyses of informant-reports alone maintained the sample of 921 targets. Item Response Theory (IRT) two-parameter logistic scoring in the psych package (Revelle, 2018) in R (R Core Team, 2018) was used for both self-reports and informant-reports. These scores were used to compare the mean SPI-5 profiles of Openers, Senders, Targets, and Informants to the general respondent sample, and to identify the associations between the informant-report scales and the self-report scales, including the SPI measures as well as the BFAS, the IPIPB5, and the IPIPNEO. Targets received ratings from between 1 and 21 informants; the modal number of reports received was 1. For targets with two or more reports, the mean value was taken for each item. Next, we constructed a correlation matrix of self-report and informant-report items. In order to mitigate the influence of replicated error, this matrix was scored adjusting for overlapping keys and replacing the overlapping covariances with the corresponding best estimate of the item's "true variance", the average correlation for that item, following methods proposed by Bashaw & Anderson (1967) and Cureton (1966). If more than one informant rated a target, we used intraclass correlations to assess inter-informant agreement between the first two informant-reports for a given target.
The best items in SAPA for each informant-report scale were derived empirically using the bestScales function in the psych (Revelle, 2018) package in R (R Core Team, 2018). These procedures are described more fully by Elleman et al. (2020). Similar to the methods used for criterion-keyed scale construction, this method generates a unit-weighted model (for each criterion) based on the magnitudes of the zero-order correlations between the predictors (the items) and the informant-report scores following ten iterations of k-fold cross validation. The pool of self-report items used as predictors included all those in the SPI scales (135) and the remainder of the 696 items used during the empirical derivation of the SPI (Condon, 2018). A full listing of the items and all corresponding data are available in the public domain (Condon et al., 2017; Condon & Revelle, 2015). All of these analyses were exploratory (i.e., not preregistered).
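The duplicate-handling and averaging rules described in this paragraph can be sketched as follows. This is an illustrative Python reconstruction, not the authors' R script: the record layout (`informant`, `target`, `timestamp`, `items`) is hypothetical, and "similarly complete" is interpreted here as "the same number of answered items", which is an assumption.

```python
from collections import defaultdict


def deduplicate_reports(reports):
    """Keep one report per (informant, target) pair: the most complete one,
    falling back to the earliest when completeness is tied."""
    best = {}
    # Visiting reports in time order means a tie never displaces an earlier report.
    for report in sorted(reports, key=lambda r: r["timestamp"]):
        key = (report["informant"], report["target"])
        n_answered = sum(value is not None for value in report["items"].values())
        if key not in best or n_answered > best[key][0]:
            best[key] = (n_answered, report)
    return [report for _, report in best.values()]


def mean_item_ratings(reports):
    """For each target, average every item across the informants who answered it."""
    collected = defaultdict(lambda: defaultdict(list))
    for report in reports:
        for item, value in report["items"].items():
            if value is not None:
                collected[report["target"]][item].append(value)
    return {
        target: {item: sum(values) / len(values) for item, values in items.items()}
        for target, items in collected.items()
    }
```

Applying `deduplicate_reports` before `mean_item_ratings` keeps a repeated informant from being counted twice in a target's item means.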
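The general idea behind this kind of empirical item selection can be sketched in a few lines of Python. The sketch below is not the bestScales implementation in psych; it is a simplified stand-in that assumes complete data, runs a single pass of k-fold cross-validation rather than ten iterations, and uses hypothetical function names. Within each training fold it ranks items by the absolute zero-order correlation with the criterion, keeps the top few with unit weights of +1 or -1, and checks the resulting score against the criterion in the held-out fold.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def select_items(items, criterion, rows, n_keep):
    """Rank items by |r| with the criterion; return {item_index: +1 or -1}."""
    crit = [criterion[r] for r in rows]
    correlations = [
        (pearson([column[r] for r in rows], crit), j) for j, column in enumerate(items)
    ]
    top = sorted(correlations, key=lambda pair: abs(pair[0]), reverse=True)[:n_keep]
    return {j: (1 if r >= 0 else -1) for r, j in top}


def cross_validated_scale(items, criterion, n_keep=3, k=5, seed=0):
    """items[j][row] holds item j for one person; criterion[row] is the
    informant-report score. Returns (mean holdout validity, selection counts)."""
    rows = list(range(len(criterion)))
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    validities, counts = [], {}
    for holdout in folds:
        held = set(holdout)
        train = [r for r in rows if r not in held]
        keys = select_items(items, criterion, train, n_keep)
        for j in keys:
            counts[j] = counts.get(j, 0) + 1
        # Unit-weighted score: signed sum of the selected items.
        scores = [sum(sign * items[j][r] for j, sign in keys.items()) for r in holdout]
        validities.append(pearson(scores, [criterion[r] for r in holdout]))
    return sum(validities) / k, counts
```

Items that are selected in every fold are the ones a reader would treat as "consistently" good predictors; the holdout validity guards against capitalizing on chance in the selection step.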

Group differences
Due to the large sample sizes collected, we discuss results in terms of effect size rather than statistical significance. By convention, Cohen's d values of 0.2, 0.5, and 0.8 are considered small, medium, and large effect sizes, respectively (Cohen, 1988). One meta-analytic review of individual differences research suggests small, medium, and large effects in empirical work are equivalent to Cohen's ds of 0.20, 0.41, and 0.63 (Gignac & Szodorai, 2016). In line with both convention and empirical evidence, we consider Cohen's d of 0.20 to be a small effect and Cohen's d of less than 0.20 to be a very small effect. Here we report only group differences of d = 0.10 or greater (a very small effect), but advise readers to interpret results with attention to the size of the effect and 95% confidence intervals.
Analyses of the sub-group scores on the SPI-5 domains indicated that Openers (n = 16,787) scored higher on openness than the typical SAPA participant, d = 0.12, 95% CI Differences in Big Five traits between Senders and Targets (senders who received an informant-report) did not meet our cut-off of Cohen's d = 0.10 or greater (effect sizes ranged from 0.01 to 0.07). The absence of an effect in this case is noteworthy as it suggests that self-report personality ratings of the target do not affect whether informants agree to provide a rating. Taken together, the participants who opted in to the informant-report system, either by sending a request or completing an informant-report, differed from typical participants in that they were more Open and Agreeable than the average SAPA respondent, although these were small effects. Respondents who received more than one report were even more Open and Agreeable compared to those with only one report, though the incremental effect was also small. See Table 3 for descriptive statistics of the SPI-5 profiles for the general sample (SAPA), Openers, Senders, Targets and Informants.
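For readers who want to reproduce this style of reporting, the following minimal Python sketch computes Cohen's d with a pooled standard deviation and an approximate 95% confidence interval, then applies the labels defined in this paragraph. The interval uses the common large-sample standard error approximation; the source does not state which interval method was used, so that choice and the function names are assumptions.

```python
import math


def cohens_d(group_a, group_b):
    """Standardized mean difference with an approximate 95% confidence interval."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (n_b - 1)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    d = (mean_a - mean_b) / pooled_sd
    # Large-sample approximation to the standard error of d.
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d, (d - 1.96 * se, d + 1.96 * se)


def label_effect(d):
    """Labels follow the thresholds stated in the text (0.10 cut-off, 0.20 = small)."""
    size = abs(d)
    if size < 0.10:
        return "below reporting cut-off"
    if size < 0.20:
        return "very small"
    return "small or larger"
```

With samples as large as those here, the interval around d is narrow, which is why the text emphasizes the magnitude of d rather than its statistical significance.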

Self-informant agreement
Correlations between the informant- and self-report scales are shown in Tables 4 and 5. In Table 4, the correlations are corrected for reliability using the standardized alpha for each scale (Bashaw & Anderson, 1967; Cureton, 1966). Some informant scales showed low scale reliability, which can likely be attributed to their relatively short length (four items or fewer) and the intentional inclusion of content from two aspects for each domain (e.g., compassion and politeness in Agreeableness; DeYoung et al., 2007). However, the standardized alphas for three of the informant scales (indicated in Table 1) were particularly poor: Agreeableness (4 items, α = .47), Intellect/Openness (4 items, α = .53), and Rated IQ (3 items, α = .50). As low reliabilities can lead to large adjustments of the correlations when corrected for attenuation (Schmitt, 1996), we present the correlations without correcting for scale reliability in Table 5.
In this sample, correlation values of .15 are significant at the .01 level using the Bonferroni correction for multiple comparisons. When corrected for scale reliability, the SPI-5 domains produced convergent validities between self- and informant-reports ranging from .42 to .72, with a median of .67. Among the 5 domains, discriminant validities ranged from .01 to .37 with a median of .135; five values were above .20 and four of these were with Extraversion. Indeed, higher informant-ratings of Extraversion were positively associated with all Big Five self-ratings in the direction of social desirability. Raw convergent validities ranged from .28 to .55 (Mdn = .48) and raw discriminant validities ranged from .01 to .27 (Mdn = .11).
The structure of the self-informant agreement demonstrates the overlapping structure of the SPI measure (Table 4), where informant-ratings correlated with multiple lower-level SPI domains. For example, informant-reports of Agreeableness correspond with self-reported Compassion (.74), Trust (.51), Honesty (.36), and Irritability (-.39). Informants' judgments of Conscientiousness correspond with self-reported Order (.51), Industry (.65), Impulsivity (-.31), and Self-Control (.47). Likewise, the 27 SPI domains load on more than one informant-rating scale. For example, self-reported Well Being converges with the informant's rating of Conscientiousness (.28), Emotional Stability (.46), Extraversion (.56), and Intellect/Openness (.24). The raw correlations exhibited similar structural patterns (Table 5). Targets and informants showed moderate agreement, especially on the Big Five constructs; of these five, informants and targets agreed most strongly on the target's level of Agreeableness (.72) and least strongly on the target's Intellect and Openness (.42). Moreover, the structure of these agreements exhibits the same "heterarchical" (Milyavskaya et al., 2013) pattern proposed by the SPI model, where lower-level domains load on more than one higher-level domain in an overlapping sequence.
Table 6 shows self-informant agreement based on Big Five scores using the BFAS, IPIPB5, and IPIPNEO, after correcting for scale reliability. With the exception of Openness, results are similar across the various operationalizations of the Big Five traits. Openness/Intellect associations were smaller than those for the other traits, and more disparate across models; the self-informant correlations were similar in magnitude for the BFAS and IPIPB5 (.62 and .61, respectively), as were the lower correlations for the SPI and NEO Openness (.42 and .49, respectively).
Two hundred and eighty-two targets received two or more informant-ratings and the agreement among raters is shown in Table 7. Informants agreed to a similar extent for all of the Big Five traits (rs of .29 to .35) and Attractiveness ratings (.37), slightly less for Honesty/Humility (.26), and not at all for judgments of intelligence (ns). Median inter-informant agreement was .34.
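The sensitivity of corrected correlations to low reliability can be seen with the classical correction for attenuation, sketched below in Python. This covers only the textbook formula and the definition of standardized alpha; it does not reproduce the overlap adjustments the authors applied, and the function names are assumptions.

```python
import math


def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized alpha from scale length and the average inter-item correlation."""
    return n_items * mean_inter_item_r / (1 + (n_items - 1) * mean_inter_item_r)


def implied_mean_r(n_items, alpha):
    """Invert standardized alpha to get the average inter-item correlation it implies."""
    return alpha / (n_items - alpha * (n_items - 1))


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation: r / sqrt(rel_x * rel_y)."""
    return r_observed / math.sqrt(reliability_x * reliability_y)
```

As a purely hypothetical illustration, a raw correlation of .30 between an informant scale with alpha = .47 and a self-report scale with alpha = .80 would be corrected to roughly .49, a gain of more than 60% that comes entirely from the reliability adjustment. The same inversion shows that a four-item scale with alpha = .47 implies an average inter-item correlation of only about .18.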
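For reference, two-way random-effects intraclass correlations of the kind used for inter-informant agreement can be computed from ANOVA mean squares. The Python sketch below follows the standard Shrout and Fleiss formulas for ICC(2,1) and ICC(2,k); it assumes a complete targets-by-raters table (here, the first two informants per target) and is not the psych package code.

```python
def icc2(ratings):
    """ratings: one row per target, each row holding k ratings (no missing values).
    Returns (ICC(2,1), ICC(2,k)) for a two-way random-effects design."""
    n = len(ratings)       # number of targets
    k = len(ratings[0])    # number of raters per target
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Mean squares for targets (rows), raters (columns), and residual error.
    bms = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    jms = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ems = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    ) / ((n - 1) * (k - 1))
    single = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    average = (bms - ems) / (bms + (jms - ems) / n)
    return single, average
```

The single-rater value describes how well one informant's rating generalizes to another's, whereas the average-rater value describes the reliability of the mean of the k ratings; the second is always at least as large as the first whenever the first is positive.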

Item-scale analysis
In order to understand what informants were responding to in their judgments of the target, we examined the self-report items that were the best predictors of informant-rating scores. Table 8 displays the self-report items that were consistently the best predictors of the informant-reports. The correlations of these "best items" with informant-report scales ranged from .23 to .56, with a median of .44. For each scale, the number of items ranged from 1 (Rated IQ) to 16 (Extraversion). These items provide insight into which self-reported aspects are being noticed and judged by the informant; for example, informant ratings of Agreeableness were best predicted by the targets' endorsement of items like "Cheer people up" and "Am sensitive to the needs of others." Likewise, targets' endorsement of items like "Am a person whose moods go up and down easily" and "Do things I later regret" were inversely related to the informant saying the target was emotionally stable. These results also corroborate a non-orthogonal structure for self-informant agreement; for example, "Find life difficult" is a strong predictor for informant-reported Extraversion (inversely related, -.43) and Intellect/Openness (inversely related, -.28), suggesting this item taps into aspects of an informant's judgment of at least two domains. Notably, because these items are empirically-determined rather than theoretically-driven, they indicate which self-report items predict informant-reports irrespective of rationale; for example, "Am sensitive to the needs of others" is the self-report item that best predicts an informant's judgment of physical attractiveness even though these concepts are not clearly linked. In general, these results exhibit the potential of item-scale analysis to bridge the gap between targets' phenomenological self-perception and informants' perceptions/judgments of manifest personality.

Discussion
Self-reports of anonymous web users showed strong correspondence with the ratings of anonymous friends, family, colleagues, and other informants. This correspondence was seen at three levels of analysis: the Big Five domains, the 27 lower-level SPI factors, and at the item-scale level. We demonstrate the efficacy, as well as feasibility, of collecting large samples of informant-reports for free online. The more detailed structure of the self-reports provided by the 27 sub-scales of the SPI provides validity for both the broader and narrower domains. In addition, by using SAPA procedures to collect data from 696 items, we were also able to examine the pattern of individual item correlations with the informant ratings, which provides more detailed information about the aspects of the individual's identity being rated by informants.
Self-informant agreement on the SPI-5 domains ranged from .42 (Intellect/Openness) to .72 (Agreeableness), with a median of .67, when adjusted for scale reliability and item overlap. Our results are similar to those found for the levels of self-peer consensus on both 60- and 30-item measures of the Big Five (Konstabel et al., 2017). Contrary to prior work (Norman, 1969; Vazire, 2010), we found no evidence for substantially lower self-informant agreement on Emotional Stability/Neuroticism (.67) than in other domains. In this study, informants were selected by the target, and consequently, the majority of informants, regardless of the reported relationship with the target, reported knowing the target fairly or extremely well. This high level of acquaintanceship could be a possible explanation for the higher than expected self-informant agreement on Emotional Stability. Consistent with this explanation, Vazire reports higher levels of agreement between self and friends on Neuroticism traits (Self-esteem and Anxiety) compared to the agreement between the self and a stranger (Vazire, 2010). However, there is mixed evidence that self-other agreement across various traits, including Emotional Stability (Neuroticism), increases with greater acquaintanceship (Biesanz et al., 2007; Norman, 1969).
Note (Table 7). For targets rated more than once, the first two informant ratings were compared using intraclass correlations using the ICC2k function from the psych (Revelle, 2018) package in R (R Core Team, 2018).
More generally, our results are consistent with predictions suggested by Vazire's Self-Other Knowledge Asymmetry model (Vazire, 2010) whereby the self and other are more strongly aligned on highly visible traits (e.g., Agreeableness, r = .72 in our sample) than less visible traits (e.g., Intellect and Openness, r = .42 in our sample). Self-informant agreement on Extraversion (r = .58) was somewhat lower than anticipated relative to the other domains, as Extraversion has shown the highest self-other agreement among the Big Five in prior work (Konstabel et al., 2017; Norman, 1969; Vazire, 2010). Given that reliability of the four-item informant-report Extraversion scale is adequate (i.e., not problematic), we speculate that the lower self-informant agreement may be attributed to a mismatch in item content between Extraversion self-report and informant-report scales. Specifically, informants rated targets on Extraversion with respect to the target's assertiveness and social enthusiasm, completing items such as "Prefers to let others lead" (assertiveness, negative) and "Enjoys being with people" (enthusiasm, positive). In contrast, the self-report items in SPI Extraversion included attention-seeking items such as "Like to attract attention" (attention-seeking, positive), and did not include assertiveness items. This explanation is supported further by the best items analyses as the empirically-derived scale of self-report items is more highly associated with informant ratings for Extraversion (.52) than any other Big Five domain (.35 to .46). It is also the case that informant-ratings of Extraversion were significantly associated with self-report ratings for all of the Big Five. This was unique to Extraversion: none of the other informant-report Big Five ratings had significant associations with self-reports of other traits except Intellect/Openness (only with Emotional Stability). These associations were also substantial and uniformly in the socially desirable direction. In other words, higher informant ratings of Extraversion were correlated with higher self-ratings of Agreeableness, Conscientiousness, Emotional Stability, and Intellect/Openness (and Extraversion). We interpret this to suggest that informants perceive, and perhaps conflate, socially desirable behavioral tendencies as being more extraverted.

Table 8. Self-report items that were the best predictors of informant-report scales.
Am sensitive to the needs of others.
Can't be bothered with others' needs.
Sympathize with others' feelings.
Am concerned about others.
Tend to dislike soft-hearted people.
Am not interested in other people's problems.
Like to do things for others.
Am inclined to forgive others.
Don't understand people who get emotional.
Will do anything for others.
Inquire about others' well-being.
Am indifferent to feelings of others.
Complete tasks successfully.
Work hard.
Find it difficult to get down to work.
Get upset by unpleasant thoughts that come into my mind.
Find life difficult.
Am not easily annoyed.
Am relaxed most of the time.
Get caught up in my problems.
Get upset easily.
Tend to keep in the background on social occasions.
Avoid company.
Cheer people up.
Feel at ease with people.
Other people think of me as being very lively.
Prefer to be alone.
Am skilled in handling social situations.
Have many friends.
Keep in the background.
Usually enjoy being with people.
Keep others at a distance.
Talk to a lot of different people at parties.
Don't mind being the center of attention.
Feel comfortable around people.
Only feel comfortable with friends.
Find life difficult.
Don't understand things.
Am quick to understand things.
Can't make up my mind.
Worry too long after an embarrassing experience.
Recover quickly from stress and illness.
Find life difficult.
Make careless mistakes.
It is better to follow society's rules than to go my own way.
Make myself the center of attention.
Enjoy practical jokes that can sometimes really hurt people.
Would never make a high-risk investment.
Do crazy things.
Like to attract attention.
Turn plans into actions.
Am sensitive to the needs of others.
There are several people who keep trying to avoid me.
Suffer from others' sorrows.

There are at least two important aspects of the findings with respect to the consistency of ratings across informants. The first relates to the magnitude of the rates of agreement. Agreeableness, Extraversion, Emotional Stability, and Conscientiousness were nearly identical (.34-.35), with Intellect/Openness and Honesty/Humility slightly lower (.29 and .26, respectively). These magnitudes can be contextualized relative to the agreement for physical attractiveness, which is at the high end of this same range (.37). This contextualization is meaningful because physical attractiveness is often noted to be highly consensual (Ibáñez-Berganza et al., 2019). Though studies focused more directly on evaluations of attractiveness tend to report higher effect sizes (i.e., meta-analyses show mean inter-rater reliabilities of .21-.71; Langlois et al., 2000), these typically account for multiple moderators including age and cultural factors. The evidence for nearly identical consistency in Big Five informant-ratings suggests roughly equivalent consensus for these constructs. The second interesting finding from these analyses was the absence of statistically significant consistency in informant-ratings of intelligence (.06, ns). Inter-informant ratings between perceived intelligence and intellect/openness were higher in magnitude (.12), though also not significant. It is unlikely that these findings were caused by low internal consistency among the perceived intelligence items as informant scales with similar alphas (Agreeableness, Intellect/Openness) were not similarly affected.
Collabra: Psychology
In contrast to other recent work on perceptions of intelligence (A. J. Lee et al., 2017; Stellar & Willer, 2018), we did not find consensus among informants in this sample with respect to targets' math skills, verbal skills, and overall intelligence, and this was unique among the traits assessed.
When compared to the general online respondent sample, participants who interacted with the opt-in system were generally more agreeable and open. Participants who opened the request page and those who were the target of informant-ratings scored higher on openness than the general sample. Additionally, Informants (who later completed a self-report measure) were more agreeable than the average respondent. It is unclear, however, whether this generalizes to the entire group of Informants (those with and without self-reports). The informant-report system itself has elements of both self- and other-selection, where an individual must both opt in (self-selection) and receive a rating (other-selection) to become a target. Due to the self-selected and other-selected nature of this informant-rating system, it is likely that the traits of both the prospective target and the prospective informant(s) played some role in the exchange of requests and completed informant-ratings, even though there were no substantial differences in Big Five traits between targets who were rated by informants and those who only solicited ratings (but were not rated). In other words, the fact that our available sample of self-informant reports is not representative of the larger SAPA sample is noteworthy and may reflect how individual traits interact with the collection of self and informant reports where the system is both optional and uncompensated.
When comparing self-informant agreement of the SPI to the other operationalizations of the Big Five (BFAS, IPIPB5, and IPIPNEO), the findings are more similar than different. The most prominent differences are for Openness/Intellect, where the SPI and IPIPNEO self-reports are both less highly associated with the informant reports than the BFAS and IPIPB5. These pairings are unsurprising given the derivation of the BFAS from the IPIP Big Five (DeYoung et al., 2007), but the lower magnitudes, especially for SPI and NEO Openness, reflect the reality that longer self-report measures of this trait in particular include more diverse content than is covered by the 4-item informant-report. The SPI Extraversion self-informant correlation is also somewhat lower than those of the other models (.63 vs .70-.76), but this can once again be explained by the lack of empirical evidence supporting the inclusion of Assertiveness in the derivation of the SPI (Condon, 2018). In fact, this is an important theme of the findings reported here, as there was considerable content included in the pools of items used to derive the SPI, including all of the content from the BFAS, IPIPB5, and the IPIPNEO (Condon, 2018).
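The self-informant correlations compared above are reported in the tables as adjusted for reliability. As a minimal sketch, assuming the adjustment is the standard correction for attenuation (the text does not spell out the formula), the calculation could look like the following. The function name `disattenuate` and all numeric inputs are hypothetical, chosen for illustration only; they are not values taken from the tables.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures.

    Standard correction for attenuation: r_xy / sqrt(rel_x * rel_y).
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical inputs for illustration only (not from the paper's tables).
raw_r = 0.40            # observed self-informant correlation
self_alpha = 0.80       # reliability of the self-report scale
informant_alpha = 0.70  # reliability of the informant-report scale

corrected = disattenuate(raw_r, self_alpha, informant_alpha)
print(round(corrected, 2))  # 0.53
```

With these illustrative inputs the corrected value is about .53, which shows why reliability-adjusted coefficients exceed the raw coefficients whenever either scale has a reliability below 1.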

Limitations and future directions
In this study, we discuss the convergence of self and informant report items, but it should be reiterated that the self- and informant-report assessment models were dissimilar by design. As discussed above, the informant survey was based on a theory-driven "aspects" model of the Big Five/Big Six domains whereas the self-report instrument was derived empirically based on the structure underlying a large pool of personality items. This is a limitation because it led to an imperfect evaluation of self- and informant-ratings based on common structure. This means that the results are confounded beyond the simple self/other distinction because we are also mapping across different frameworks. Future research should consider a more direct evaluation of the relations between self- and informant-ratings using an informant-report model that matches the self-report SPI model more directly. Note that using 2-item observer report scales for 27 factors would more than double the burden on observers relative to the 24 items administered here.
Despite the benefits of an online, opt-in informant-report system, the rate of data collection is slow compared to other collection methods. We collected 1,554 informant-reports over 1,335 days, suggesting an average return rate of approximately one informant-report per day. Compare this to the average collection rate of 117 SAPA self-reports per day over the same time period. Although this study demonstrates the viability of a large, global sample of informant-reports, the length of time needed to obtain this large sample may be considered untenable for some researchers. Future research could explore ways to improve the rate of informant-report completion. For instance, the SAPA self-report survey is founded on the premise that participants are motivated to complete the assessment to "know thyself," and participants are provided feedback on their personality rather than compensated outright. In line with this premise, perhaps informants could be similarly motivated to know more about their relationship to the target. On the basis of the results reported here, we think the pursuit of initiatives like these is well-justified, and it seems especially promising to consider the use of various "gamification" strategies (Keusch & Zhang, 2017).
For example, researchers could increase informants' motivation to complete the report by supplying the informant and target feedback on how closely their responses are aligned. In conjunction with participant feedback, the rate of data collection could be improved by increasing the overall proportion of participants who request informant-reports and expanding the pool of potential informants for one target. In the current study, only 3% of the sample chose to send an informant-report request. Requests were sent via email to specified recipients (i.e., the sender sends the link directly to a potential informant). This method requires the participant to know the potential informant well enough to have their email address, which may be why the vast majority of Informants (88%) reported knowing the target fairly or extremely well. This method also requires the participant to invite each potential informant individually, meaning participant burden increases linearly with the number of potential informants. Alternatively, the informant-report system could be linked to popular social media sites, whereby a participant could broadcast a request to a large number of unspecified recipients. This approach would lessen the burden of sending multiple requests and could expand the pool of potential informants for a single target. This one-to-many approach could also improve the viability of examining convergence by how closely the informant knows the target, an avenue that goes unexplored in the current study due to low variability in target-informant intimacy.
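The rates and item counts cited in this section can be checked with simple arithmetic. The sketch below uses only figures stated above (1,554 informant-reports, 1,335 days, 117 self-reports per day, 24 administered informant items, and 2 items for each of 27 factors); the variable names are illustrative assumptions.

```python
# Figures as reported in the text above.
informant_reports = 1554
collection_days = 1335
self_reports_per_day = 117

# Average informant-report return rate (reports per day).
informant_rate = informant_reports / collection_days

# How many self-reports arrived for each informant-report.
rate_ratio = self_reports_per_day / informant_rate

# Observer burden: 24 items administered vs. 2 items for each of 27 factors.
current_items = 24
proposed_items = 2 * 27
burden_ratio = proposed_items / current_items

print(f"informant-reports per day: {informant_rate:.2f}")
print(f"self-reports per informant-report: {rate_ratio:.1f}")
print(f"proposed items: {proposed_items} ({burden_ratio:.2f}x current)")
```

On these figures, informant-reports arrived at roughly 1.16 per day, about one hundredth of the self-report rate, and a 2-item-per-factor informant survey would require 54 items, or 2.25 times the 24 administered here.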
The convergence of informant-reports and self-reports demonstrates the efficacy of collecting informant data broadly and online. We demonstrate that this agreement can be assessed at various levels of analysis, from large domains to the item level. Additionally, sufficiently powered item-scale analyses are particularly informative for identifying which aspects of the target's self-reported identity are being rated by informants. Although we recognize the design and scope of the SAPA Project is unique, the informant-report methodology used here is more traditional in design (in that informant reports do not use planned missingness) and more accessible for other researchers (because the targets are responsible for soliciting informants). Thus, we hope that this aspect of the SAPA Project can provide insight into the power of collecting informant-reports and the feasibility of doing so at scale via the web. As assessment moves increasingly online, researchers have the opportunity to overcome a reliance on self-reports alone and utilize the internet to implement large, multi-method designs.

Figure 1. Nested subgroups with sample size. Arrows indicate selection (e.g., senders select total informants, total informants select which senders become targets).
work to get by. Am a person whose moods go up and down easily. Do things I later regret.

Table 1. Informant-report items and reliabilities
Note. Item-scale correlations are corrected for item overlap and scale reliability. Negative correlations indicate reverse-coded items. The alpha (α), average r (r̄), and median r are provided for each scale.

Table 2. Demographics

Table 3. Descriptive statistics of the SPI-5 for all subsets
Note. Sample sizes vary within a given group for individual scales due to the "missingness" of the MMCAR data.

Table 4. Agreement between self and informant reports for the SPI-5 and 27 lower-level SPI domains
Note. Correlations are adjusted for reliability. See Table 5 for uncorrected correlations.

Table 5. Raw correlations between self-reports and informant-reports
Note. Correlations are not adjusted for reliability. See Table 4 for corrected correlations.

Table 6. Self-informant agreement with the Big Five Aspect Scales (BFAS), the IPIP correlates of the Big Five Factor Markers (IPIPB5), and the NEO-PI-R (IPIPNEO)
Note. Correlations are adjusted for reliability.

Table 7. Inter-informant agreement

Table 8. Self-report items that best predict informant-reports