In recent years, there has been a growing concern regarding the replicability of findings in psychology, including a mounting number of prominent findings that have failed to replicate via high-powered independent replication attempts. In the face of this replicability “crisis of confidence”, several initiatives have been implemented to increase the reliability of empirical findings. In the current article, I propose a new replication norm that aims to further boost the dependability of findings in psychology. Paralleling the extant social norm that researchers should peer review about three times as many articles as they themselves publish per year, the new replication norm states that researchers should aim to independently replicate important findings in their own research areas in proportion to the number of original studies they themselves publish per year (e.g., a 4:1 original-to-replication studies ratio). I argue this simple approach could significantly advance our science by increasing the reliability and cumulative nature of our empirical knowledge base, accelerating our theoretical understanding of psychological phenomena, instilling a focus on quality rather than quantity, and by facilitating our transformation toward a research culture where executing and reporting independent direct replications is viewed as an ordinary part of the research process. To help promote the new norm, I delineate (1) how each of the major constituencies of the research process (i.e., funders, journals, professional societies, departments, and individual researchers) can incentivize replications and promote the new norm and (2) any obstacles each constituency faces in supporting the new norm.
There is growing consensus that psychology has a replicability “crisis of confidence” [70,69,89,66,67], stemming from the fact that a growing number of findings cannot be replicated via high-powered independent replication attempts that duplicate the original methodology as closely as possible. Across all areas of psychology, there is a growing list of (prominent) findings that have not held up to independent replication attempts, including findings from cognitive psychology (retrieval-induced forgetting ; eye movements on recall ; temporal judgments ; protection effect ; mental simulation [108]; Mozart effect [95,96]), developmental psychology (synesthetic cross-modality correspondence ), neurophysiology (vestibular stimulation ), industrial/organizational psychology (utility biasing effect on selection procedures ), positive psychology (weather effects on life satisfaction ), political psychology (self-prophecy effect on voting ; status-legitimacy hypothesis ), moral psychology (“Macbeth effect” ), educational psychology (stereotype threat on female math performance [28,37]; color influence on exam performance ), evolutionary psychology (fertility on face preferences ; ovulation on men’s testosterone ; sex differences in infidelity distress ), judgment & decision making (unconscious thought advantage ; choice-overload ), and social cognition (e.g., “social priming”/embodiment findings [38,17,51,18,31,57,72,71,94,44,46,52,34]).
More generalizable evidence supporting a general replicability problem comes from an ambitious and unprecedented large-scale crowdsourced project, the Reproducibility Project [66,67]. In this project, researchers were unable to replicate about 60% (out of 100) of findings from the 2008 issues of Psychological Science, Journal of Personality and Social Psychology, and Journal of Experimental Psychology: Learning, Memory, and Cognition . In another large-scale meta-scientific investigation, about 70% (16 out of 23) of important findings from cognitive and social psychology could also not be replicated . Though there are different ways to interpret successful versus unsuccessful replication results , taken together, these observations strongly suggest psychology currently has a general replicability problem (as do several other areas of science including cancer cell biology and cardiovascular health literatures [8,75]).
New Initiatives and Reforms
Several new initiatives have been launched to improve research practices in order to increase the reliability of findings in psychology. For instance, higher reporting standards have recently been instituted at several prominent psychology journals [21,87,88,47]. At such journals (e.g., Psychological Science, Memory & Cognition, Attention, Perception, & Psychophysics, Psychonomic Bulletin & Review, Personality and Social Psychology Bulletin, Social Psychological & Personality Science), authors submitting a manuscript must now acknowledge that they have disclosed basic methodological details critical for the accurate evaluation and interpretation of reported findings such as fully disclosing all excluded observations, all tested experimental conditions, all assessed outcome measures, and their data collection termination rule.
There is also a significant push to incentivize “open data”, the public posting of the raw data underlying studies reported in a published article [65,88]. For instance, at Psychological Science, authors who make their data publicly available now earn an open data badge that is prominently displayed alongside their published article. Encouragingly, there is preliminary evidence that such open practices badges are having a significant positive impact (e.g., a rapidly growing number of authors of Psychological Science articles have earned open data and materials badges for publicly posting their materials; see ). Furthermore, the new Journal of Open Psychology Data now publishes data papers that feature publicly posted data sets . Such open data practices not only facilitate independent verification of analyses and results so crucial to identifying errors and other inaccuracies, but substantially facilitate the execution of meta-analyses and re-analyses from different theoretical perspectives, which can accelerate knowledge development.
In addition, several journals (e.g., Cortex, Perspectives on Psychological Science, Attention, Perception, & Psychophysics, Comprehensive Results in Social Psychology) now offer pre-registered publication options whereby authors submit a study proposal that pre-specifies the methodology and analytical approaches to be used to test a specific hypothesis [14,106]. Proposals are evaluated on the soundness of the methodology and theoretical importance of the research question. Once accepted, the proposed study is executed and the article is published regardless of the results, eliminating questionable research practices and researcher bias which can grossly mischaracterize the evidence .
A final development is the growing practice of prominent journals to publish independent direct replication results, including replication results inconsistent with those originally published by the journal (e.g., Psychological Science, Psychonomic Bulletin & Review, Journal of Research in Personality, Journal of Experimental Social Psychology, Social Psychological & Personality Science). Though calls for the publication of replication results have been made for decades (e.g., [62,77]), the actual practice of prominent journals systematically publishing replication results is unprecedented and has immense potential to increase the reliability of findings in psychology. Such practice directly incentivizes researchers to execute independent replications so crucial to corroborating past findings and hence accelerating theoretical progress [22,74]. The new development of journals publishing replications may also reduce the tendency for researchers to report unexpected, exploratory, and/or tenuous results as confirmatory or conclusive findings .
Though this final development is particularly exciting, many researchers are currently afraid or unsure about possible social and career-related risks involved in executing and publishing independent replication results given several recent high-profile cases where the publication of replication results led to nasty threats, retaliation, and personal attacks of incompetence by original authors (e.g., [38,82,29,5,6,107]). This situation represents a serious hurdle that substantially interferes with the development of a research culture where the execution and publication of independent direct replications is seen as a routine part of the research process rather than something done mostly by selfless “open science” psychologists. To overcome this important hurdle, I propose a new replication norm that has the potential to substantially increase the execution and publication of independent direct replications so important to ensuring a self-correcting cumulative knowledge base. As Cohen propounded, “…we must finally rely, as have the older sciences, on replication” (; p. 1002). Similarly, as Sir Ronald Fisher stated: “A scientific fact should be regarded as experimentally established only if a properly designed (independent) experiment rarely fails to give this level of significance [referring to p < .05]” (; p. 504).
Extant Peer Review Norm
The new replication norm is inspired directly by the extant peer review social norm that currently exists in psychology and other areas of science. This informal (and implicitly adopted) social norm states that psychologists should aim to review other peers’ papers at a rate approximately three times the number of first-author papers they themselves publish per year.1 The threefold rate is based on the logic that most papers are typically reviewed by three other peers. For example, if a researcher publishes 4 first-author publications in a year, they should aim to review (at least) approximately 12 papers submitted by other researchers. Given that each accepted paper collectively costs the field a certain amount of work, the social norm aims to guide researchers to contribute to the system as much work reviewing papers as they themselves cost the system. Of course, because such a norm is informal – and in no way enforceable – inevitably some individuals may end up “free riding”, that is, intentionally or unintentionally drawing more from the collective resource than they themselves contribute to it (; e.g., by publishing 5 papers per year, but only reviewing 3 papers per year, a net deficit of 12 “units” of work). Notwithstanding such suboptimality, the informal social norm ends up benefiting everyone in terms of clearer and strengthened manuscripts that make more important theoretical contributions to the field.
New Replication Norm
Following directly from the extant peer review social norm, I propose a new replication social norm whereby researchers strive to execute and publish independent replications of other findings in their research area in proportion – in some ratio – to the number of (first-author) studies they themselves publish per year. For example, the norm could be that researchers strive to execute and publish 1 independent direct replication (of another researcher’s finding) for every 4 (first-author) original studies they publish per year. This would mean that if a researcher publishes 3 articles reporting a total of 8 studies in a year, they would be expected to execute and publish independent replications of (at least) 2 important findings in their own area of research in that same year (in other words, 20% of one’s published studies per year should involve independent direct or systematic replications). Paralleling the peer review norm, the logic is that each original finding published by a researcher costs the collective field a certain amount of work to independently corroborate, hence the replication norm aims to guide researchers to contribute to the system a roughly commensurate amount of replication work (of other researchers’ findings) as they themselves cost the system (McCullough & Kelly, 2014). The more findings you publish, the more independent replications you need to execute and publish for everyone to benefit from a self-correcting cumulative knowledge base.2
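The arithmetic behind the two norms can be made concrete with a small sketch. The function names, the rounding-up rule, and the default values (three reviewers per paper, a 4:1 ratio) are illustrative assumptions taken from the examples in the text, not a prescribed procedure.

```python
import math


def reviews_owed(first_author_papers: int, reviewers_per_paper: int = 3) -> int:
    """Peer review norm: review about as many papers as one's own papers cost the field."""
    return first_author_papers * reviewers_per_paper


def replications_owed(original_studies: int, ratio: int = 4) -> int:
    """Replication norm: one independent replication per `ratio` original studies.

    Rounding up is an assumption of this sketch; the text only says "(at least)".
    """
    return math.ceil(original_studies / ratio)


def replication_share(original_studies: int, replications: int) -> float:
    """Fraction of all published studies (original plus replication) that are replications."""
    return replications / (original_studies + replications)


print(reviews_owed(4))          # 4 first-author papers at 3 reviewers each
print(replications_owed(8))     # 8 original studies at a 4:1 ratio
print(replication_share(8, 2))  # 2 replications alongside 8 original studies
```

With the numbers used in the text, 4 first-author papers correspond to 12 reviews, and 8 original studies correspond to 2 replications, which is 20% of the 10 studies published that year.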
Another parallel between the peer review and replication norm worth mentioning is that in each case, researchers’ primary motivation to embrace the norm is that they intrinsically care about the theoretical progress of their own research area. Hence, even though engaging in such activities takes time away from doing their own research, it is nonetheless in their best interest to peer-review and replicate other researchers’ findings. In addition, in each case, researchers also get credit for engaging in such activities. In the case of peer-reviewing, reviewers are typically acknowledged and thanked by journals at the end of the year and researchers will also list on their CV the journals (1) they have reviewed for and/or (2) they are on the editorial board for. For replications, researchers get credit by having their replication results published by the original journal (Pottery Barn Rule, see ), published at another journal (e.g., Psychonomic Bulletin & Review ), or at the very least get credit by posting their results to online databases that track replications (e.g., PsychFileDrawer.org; CurateScience.org).
An interesting question that follows from examining the parallels between the extant peer review norm and the proposed replication norm is whether executing replications should be viewed as a service to the field — as peer reviewing is typically viewed? I contend that contrary to the peer review norm, executing and publishing independent replications should be seen as part and parcel of the research process rather than as a service to the field . That being said, construing replications as an essential part of the research process does not mean psychology cannot benefit from promoting such a norm. Furthermore, peer reviewing could (and perhaps should) also more accurately be viewed as part and parcel of the research process because science arguably involves continuous peer review given that any finding at any future point in time can be questioned and/or refuted by new evidence [41,98].
Original-to-replication-studies ratio to strive for? Though it will be difficult in practice to identify and defend one particular optimal original-to-replication-study ratio, the act of proposing a reasonable ratio is more important than the actual ratio researchers end up adopting and achieving.3 Nonetheless, I will now present logical and empirical considerations that support the idea that a 4:1 original-to-replication-studies ratio may be a reasonable ratio the modal researcher should strive for.
On logical grounds, it is straightforward that small original-to-replication-studies ratios (e.g., 1:1 or 2:1) are suboptimal given that (1) the primary goal of science is to adduce new facts rather than verify old findings and (2) many findings are never cited nor important, hence valuable resources should not be spent attempting to replicate all findings. Consequently, such small ratios would likely be seen as an unwise use of resources and hence be very unlikely to be adopted by researchers. It would seem, then, that a more optimal ratio would involve a much larger proportion of original compared to replication studies. But how much larger? A large-scale survey asking about psychologists’ attitudes toward newly proposed research and editorial practices provides empirical evidence as guidance in answering this question . In that survey, psychologists indicated – on average – that 23% of journal space should be dedicated to direct replications. If psychologists want about 23% of journal space dedicated to direct replications, then one can make the case that a good starting point for the new replication norm original-to-replication-studies ratio should be about 4:1. Assuming that journals actually abide by the democratic voice of the community to dedicate about 20% of their pages to direct replications, then it would make sense for researchers to strive for a 4:1 ratio given that it ensures their replication work will be rewarded. As it currently stands, most replication results are relegated to lower status journals, though some improvements have recently occurred on this front (as mentioned above; see also ). Of course, in practice, not all researchers will be able to accomplish a 4:1 ratio, and a minority of researchers may even disagree on principle about the value of independent direct replications . Nonetheless, the 4:1 ratio can act as a target to strive for, and the field can benefit immensely even if researchers’ actual modal ratio is much higher (i.e., proportionally fewer replications). For instance, even if only half of researchers aim for an 8:1 ratio, this would dramatically increase the number of independent replications in the published literature relative to the current state of affairs whereby independent direct replications represent less than 0.2% of the published literature .
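The step from a share of journal space to an original-to-replication ratio, and the effect of partial uptake of the norm, can be checked numerically. This is a rough sketch: the function names are hypothetical, and the uptake calculation assumes that all researchers publish the same number of original studies and that non-adopters publish no replications, neither of which is stated in the text.

```python
def ratio_from_share(replication_share: float) -> float:
    """Original studies per replication implied by a replication share of all studies."""
    return (1.0 - replication_share) / replication_share


def share_from_ratio(ratio: float) -> float:
    """Replication share of all studies implied by a ratio:1 original-to-replication ratio."""
    return 1.0 / (ratio + 1.0)


def field_share(adopting_fraction: float, ratio: float) -> float:
    """Field-wide replication share when only some researchers adopt a ratio:1 norm.

    Assumes equal original output per researcher and zero replications from non-adopters.
    """
    replications = adopting_fraction / ratio  # replications per unit of original output
    return replications / (1.0 + replications)


print(round(ratio_from_share(0.23), 2))  # ratio implied by the survey figure of 23%
print(round(ratio_from_share(0.20), 2))  # ratio implied by a 20% share
print(round(field_share(0.5, 8), 3))     # half of researchers aiming for 8:1
```

Under these assumptions, a 23% share implies a ratio of roughly 3.3:1 and a 20% share implies exactly 4:1, so 4:1 is a slightly conservative round figure. Half of the field working at 8:1 would give a replication share of roughly 6%, far above the 0.2% figure cited in the text.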
What studies should be replicated? In general, researchers should seek to replicate studies that have had substantial theoretical impact within one’s own research area and/or studies that have important applied implications for society and/or public policy. High-theoretical impact studies include classic or seminal studies that have spurred on voluminous amounts of follow-up research (e.g.,  automaticity of social behavior studies) or studies that have been highly-cited relative to citation metrics for a particular field (note that the simple fact that a study is published in a “prominent” journal does not necessarily indicate a finding is high-impact). Furthermore, all else being equal, studies where much uncertainty remains regarding the magnitude of an effect (e.g., because of the use of a small sample size) should be considered good targets of replication. Other studies that are good candidates for replication include findings that potentially have important applied implications (e.g., imagined-contact effects for treating prejudice, which subsequent replication efforts were unable to corroborate: [9,58]).
What kind of replications count? It is important to emphasize that the independent replications executed as part of the new norm need to be direct or close replications (same methodology) rather than conceptual replications (different methodology). A direct replication aims to duplicate as closely as possible the conditions and procedures that existing theory and evidence anticipate as necessary for obtaining the effect [64,66]. A conceptual replication, on the other hand, intentionally alters some (or several) aspects of the methodology to test whether a finding generalizes to different experimental manipulations, measures, or contexts [3,80]. Though in practice the difference between direct and conceptual replications lies on a continuum, it is crucial that independent replications duplicate the methodology of the original study as closely as possible because if different methodology is used and discrepant results emerge, then it is ambiguous whether the discrepant results are due to the different methodology or because the original finding was false [45,69].4 Of course, for some psychological phenomena (e.g., complex culturally-bound or historically-sensitive social psychological phenomena), it may be difficult in practice to know if all essential conditions were duplicated (e.g., identifying a counter-attitudinal essay topic in a different culture). These challenges speak even more loudly to the crucial importance of direct replications given the potentially large number of confounded variables that may inadvertently arise across original and replication studies (e.g., different language, culture, historical time period, attention span due to technological advancements, etc.).
In many instances, however, it will be more efficient (if possible) to incorporate direct replications as part of systematic replications whereby a direct replication of an original result is tested within particular cells of one’s design (the “anchor cell” ) and a conceptual replication tested in separate independent cells (e.g., using a different measure or experimental condition; see also , who used the term constructive replications). The direct replication anchor cell ensures that one can observe the original result in a new independent sample whereas the conceptual replication cells test whether the result generalizes to other measures, manipulations, conditions, or contexts. Table 1 demonstrates a hypothetical example of a systematic replication design of Schnall, Benton, & Harvey’s cleanliness priming on moral judgments finding .
| | | Original moral vignettes | New & improved vignettes |
|---|---|---|---|
| Priming condition | Cleanliness priming | Direct replication | Conceptual replication |
In this case, participants in the anchor cell would judge the original series of moral actions after having been primed with either cleanliness-related or neutral words, to ensure the original result can be replicated. Completely independent participants would be randomly assigned to either the cleanliness-related or the neutral priming condition and then judge a series of new moral actions (e.g., more ecologically-valid moral actions), to see if the original finding generalizes to these arguably more valid stimuli (in which case a mixed-effects approach advanced by Judd, Westfall, & Kenny could be used whereby stimuli are considered as random factors .)5 Table 2 shows another systematic replication example where a new independent variable is tested.
| | | Low cognitive load (control) | High cognitive load |
|---|---|---|---|
| Priming condition | Elderly priming | Direct replication | Conceptual replication |
In this case, data from such a systematic replication could be analyzed via a mini meta-analysis, whereby the conceptual replication (e.g., the elderly vs. control contrast under the new high cognitive load situation) is treated as a new meta-analytic data point to be considered in conjunction with the direct replication result and original result. Indeed, there are some user-friendly R packages that can be used to execute such mini meta-analyses (e.g., metafor package ; meta package ).
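A mini meta-analysis of this kind can be sketched in a few lines. The example below uses plain Python rather than the R packages named above, pools three standardized mean differences with fixed-effect inverse-variance weights, and relies on made-up effect sizes and sample sizes purely for illustration. The variance formula is the usual large-sample approximation for Cohen's d. A random-effects model may be more appropriate when the conceptual replication is expected to differ systematically from the other two results.

```python
import math


def d_variance(d: float, n1: int, n2: int) -> float:
    """Large-sample approximation to the sampling variance of Cohen's d."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def fixed_effect_pool(effects, variances):
    """Inverse-variance weighted mean effect and its 95% confidence interval."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)


# Hypothetical inputs: (label, Cohen's d, n in group 1, n in group 2).
studies = [
    ("original result", 0.60, 20, 20),
    ("direct replication", 0.10, 100, 100),
    ("conceptual replication", 0.05, 100, 100),
]

effects = [d for _, d, _, _ in studies]
variances = [d_variance(d, n1, n2) for _, d, n1, n2 in studies]
pooled, (lo, hi) = fixed_effect_pool(effects, variances)
print(f"pooled d = {pooled:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Because the weights are inverse variances, the two larger replication samples dominate the pooled estimate, which is the sense in which the replication results refine the estimate from a small original study.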
This systematic use of direct and conceptual replications is crucial for cumulative knowledge development because as scientists we need to make sure we can replicate past results in our own labs to make sure our instruments, measures, and participants/rats are behaving appropriately. This is arguably all the more important in psychology whereby most psychological phenomena are multiply-determined by large sets of variables that may exert different influences in different contexts for different individuals. Indeed, Feynman was highly skeptical of psychological studies specifically because he repeatedly observed psychologists not executing such systematic replications .6
Finally, other strategies exist to incorporate replications into one’s research at relatively low cost. For instance, Perspectives on Psychological Science (PPS) now offers a new article type called Registered Replication Reports that involve multi-lab pre-registered replication attempts of important findings in psychology . The process involves one lab submitting a proposal to replicate a finding that is deemed important (either theoretically influential or having important societal implications) which has yet to be independently replicated and/or still has substantial uncertainty about the size of the effect (e.g., multisite replication of verbal overshadowing effect ). Once a replication proposal is accepted and all procedures/materials are finalized, PPS makes a public announcement, at which time other labs (typically about 15) can join in to contribute a sample to the replication effort and earn co-authorship on the final article. Hence, this represents a low cost alternative to executing and disseminating replication results within one’s own field. Relatedly, an additional way to incorporate replications into one’s own research is to participate in collaborative replication efforts coordinated by the Center for Open Science (COS). COS organized the aforementioned Reproducibility Project that attempted to replicate 100 studies from three prominent journals and has subsequently organized several “Many Labs” replication efforts whereby several labs seek to replicate a set of important findings theorized to vary in replicability due to cross-cultural or contextual factors (e.g., Many Labs 1: ; Many Labs 3: ).
Direct and Indirect Benefits of New Replication Norm
The new replication norm will have several direct and indirect benefits. First, the norm will significantly increase the overall number of independent direct replications in the literature (currently < 0.2% ). This will be true whether the replication results are formally published in a prominent journal, published in newer open-access journals, or simply posted to PsychFileDrawer.org. Ultimately, independent replications need to occur more often and be disseminated as widely as possible so that the broader community of scientists can calibrate their confidence in empirical findings accordingly. It is optimal for the journal that originally published a finding to publish (unsuccessful) independent direct replication results – a standard known as the “Pottery Barn rule”  – because this most effectively alerts readers of that journal that a finding may not be as robust as initially thought (e.g.,  independent replications of Vess, 2012; both original and replication work published at Psychological Science). This is to be contrasted with other situations where prominent journals that published an original finding were unwilling to publish sound high-powered unsuccessful independent replications (e.g., [16,52]; see also ).
Following directly from the first benefit, a second direct benefit of the new replication norm is that it will facilitate cumulative knowledge development and hence accelerate theory development, that is, accelerate the rate at which we can deepen our theoretical understanding of psychological phenomena. It is straightforward that in the case of successive (published) successful independent corroborations of earlier findings, we amass compelling evidence that our literature does in fact cumulatively build upon itself over time. This is to be contrasted with the present reality where the results of successful and unsuccessful independent replications are simply publicly unknown or are privately known in an unsystematic manner (e.g., water cooler discussions at conferences). In the case of unsuccessful independent direct replications, publicly knowing about a much larger number of such “failed” replication studies will significantly help protect against false, flawed, (or fraudulent) findings going unquestioned for an extended period of time . This not only prevents other researchers from wasting precious resources and time following up on blind alleys but, more importantly, forces the field to seek alternative theoretical models that have a higher probability of reflecting actual psychological realities (i.e., publishing failed independent direct replications reduces theoretical false negatives). In the words of Nobel-laureate physicist Richard Feynman, “we are trying to prove ourselves wrong as quickly as possible, because only in that way can we find (theoretical) progress.” In other words, being proven wrong via independent direct replications is a good thing; it is not knowing one is wrong that is catastrophic for theory development.
The new replication norm will also have indirect benefits, meaning that it will benefit the field in ways not directly tied to increasing the execution and publication of independent direct replications. First, it will indirectly incentivize researchers to execute fewer studies. This is the case because in a world where independent replications are expected to be routinely executed and published, researchers would be much more motivated to increase the quality of their studies (e.g., by using larger samples, reporting their research more transparently, pre-registering their hypotheses ). Executing and publishing fewer studies per year would then reduce the overall peer-review workload because there will be fewer and shorter papers to independently evaluate. This again means more time to design sounder, costlier, and/or more advanced experimental designs (e.g., highly-repeated within-person designs, experience sampling designs, eye-tracking studies). This should again drive up the quality of the research and hence increase the informational value and potential impact of published findings.
A final indirect benefit of the new replication norm is that it will help facilitate our transformation toward a new research culture where executing and reporting independent direct replications is seen as a completely routine and mundane part of the research process. This is to be contrasted with the current atmosphere where executing and publishing replications can be perilous, given that it is often seen as confrontational or antagonistic (e.g., stepping on others’ toes ), disrespectful (a senior collaborator used the expression “you’re throwing them under the bus”), or highly risky in terms of one’s career advancement (e.g., replicators may be perceived as having a “difficult personality” that may negatively affect a department’s atmosphere). This is illustrated by several recent high-profile cases where the publication of replication results led to nasty personal threats and attacks of incompetence by original authors (; e.g., [38,82,29,5,6,107]). The new replication norm will help eliminate misconceptions by the minority of researchers (who unfortunately may influence views of the majority) who see independent replications as a sign of mistrust aimed to discredit the reputations of other researchers. Eliminating such misconceptions is crucial to establishing a new research culture where independent replications are seen as a normal part of the research process.
Promoting the Norm
I contend the new replication norm can be promulgated via different informal and formal channels. To organize this discussion, I will delineate (1) how each of the major constituencies of the research process (i.e., funders, journals, professional societies, departments, and individual researchers) can incentivize replications and promote the new norm and (2) any obstacles each constituency faces in supporting the new norm.
Funding agencies. Funding agencies are in the best position to incentivize replications and the new replication norm, given that it is in their best interest to ensure that the research they are funding is in fact producing cumulative knowledge that is advancing theory and/or real-world applications. Indeed, there would appear to be (at least) two simple strategies funding agencies could implement in this regard. First, given the highly competitive nature of grants, a new criterion for being competitive to receive a grant could involve providing evidence that one has replicated key findings in one’s field that are generally being relied upon. Second, funders could impose a new condition for receiving a grant whereby a certain percentage (e.g., 5%) of a grant must be spent on replicating key findings in one’s field. Indeed, these two strategies were discussed at a recent roundtable discussion at APS with NSF and NIH representatives, and will be considered moving forward (H. Pashler, personal communication, Aug 21, 2015).
In terms of obstacles, the primary challenge for funders to contribute to the new replication norm in this way would appear to be government/administrative bureaucracy and inefficiencies that render it difficult and extremely slow to (1) make changes to longstanding practices and (2) implement new practices.
Journals. As already mentioned, several prominent journals have recently updated editorial policies and now publish high-quality replications of findings originally published by them (i.e., Pottery Barn rule in effect; see Srivastava, 2012) or originally published by other journals (e.g., Psychonomic Bulletin & Review). This is indeed an exciting development given that it should motivate researchers to execute independent replications.
That being said, one obstacle involves the fact that it is currently unclear how much value psychologists place on publications of such replications for hiring and promotion decisions, given that different researchers within the community hold different opinions on the value of replications. An obstacle faced by journals that do not yet publish replications could be the perceived negative influence publishing replication papers may have on a journal’s impact factor. On first thought, one may think replication papers would decrease a journal’s impact factor if such papers are rarely cited. However, the opposite may also be possible given that replication papers could become highly cited if they disconfirm a theoretically important long-standing finding (e.g., one such replication paper has been cited more than 200 times in 3 years [according to Google Scholar]).7
Professional societies. Professional societies could incentivize replications by officially endorsing the norm and giving out awards to Master’s and PhD-level students for best replication paper or best poster. They could also ensure that the journals they operate publish replications. In addition, societies could organize workshops at conferences or summer institutes on how to execute high-quality replications (e.g., following the “replication recipe”).
For professional societies, the only real obstacle to promoting the new replication norm is convincing researchers who (1) hold different opinions on the value of replications (e.g., [60,105]) or (2) believe promoting replication work may damage the credibility and/or reputation of our field.
Psychology departments. At first glance, one may think that it would not be in the best interest of departments to incentivize replications given the current academic incentive structure that favors publishing novel findings above all else. Given the positive developments regarding journals publishing replications and funders discussing changes in their grant process with respect to replications, however, one can make the case that departments will eventually have to change. Indeed, there is anecdotal evidence that some departments have started to change in terms of an increased focus on quality rather than quantity of published findings. For example, the psychology department at Ludwig-Maximilians-Universität München just recently announced that they have established an Open Science Committee (OSC) that aims to teach open science/transparency skills to faculty and students and develop concrete suggestions regarding hiring and tenure criteria with the ultimate goal of increasing the quality of published findings. The department of psychology at Cambridge University has also just created an Open Science committee with similar goals (R. Kievit, personal communication, August 13, 2015). Though only anecdotal, the existence of these forward-thinking departments suggests that departments could eventually be in a position to incentivize replications and promote the new norm.
Independent of open science committees, the replication norm could be promoted by departments via informal and formal teaching. Informally, the norm could be discussed during professional development sessions often given to graduate students at the beginning of their graduate career. This seems reasonable given that this is the context where the peer review norm is typically introduced, along with peer reviewing strategies more generally, professional networking at conferences, managing one’s website and online presence, etc. More formally, the norm can be promoted as part of undergraduate or graduate-level methods courses. Indeed, Frank and Saxe have argued that having students execute independent direct replications as part of their undergraduate and graduate experimental methods training offers several pedagogical benefits. Frank and Saxe have found such an approach to be highly effective in getting students more engaged in learning the intricacies and importance of independent replications, given that such classroom projects have the potential to make a real scientific contribution if done well.
One obstacle for departments to incentivize replications involves bureaucratic inefficiencies involved in changing departmental processes (similar to funding agencies). A more important obstacle for departments, however, involves developing new metrics for assessing the quality of individual scientists that go beyond the current metrics, which are typically the number of publications and citation counts for those publications. One model proposed to overcome this obstacle involves tracking replication studies so that replicability scores can be calculated to estimate the replicability of individual researchers’ findings.
Individual researchers. Finally, researchers themselves represent an important constituency that has a lot of potential in promoting the new replication norm. Individual researchers can simply execute and publish replication studies and attempt to inspire others to follow suit (e.g., trailblazers include Hal Pashler, Brent Donnellan, Rolf Zwaan, Katie Corker, Richard Lucas). However, in delineating how individual researchers can help promote the new replication norm, it is important to distinguish between early versus established researchers. Indeed, a compelling case can be made that established researchers are in the best position to endorse and promote the new replication norm because established researchers (1) have more resources, (2) have job security and hence are less vulnerable to reputational damages that sometimes accompany failing to replicate others’ work, (3) have status and visibility and hence have more influence in inspiring others to execute replications, and (4) are largely responsible for the unreliable nature of the “legacy literature”.
In terms of obstacles, early researchers lack resources, are vulnerable to reputational damages from failing to replicate others’ work, and will incur opportunity costs whereby executing replications means fewer resources left over for publishing novel findings, the latter two of which can substantially decrease one’s chance of getting an academic job. For established researchers, obstacles include different opinions on the value of replications and/or personal career-related interests tied to the current incentive structure.
To help overcome obstacles in getting individual researchers to adopt the new replication norm, we can turn to the social psychology literature to identify factors that may help change individual researcher behavior. First, social psychology theorists [24,42] have argued that clearly specifying collective goals can promote pro-social behaviors and minimize free riding (a.k.a. social loafing) in relation to collectively beneficial social norms, which in part is what I have attempted to achieve in this paper (i.e., strive for a 4:1 original-to-replication studies ratio). Furthermore, social norms theory (Perkins & Berkowitz, 1986) posits that behavior can be influenced by incorrect perceptions regarding how other group members think or act. As applied to alcohol use, for example, undergraduates tend to overestimate the frequency and quantity of alcohol use by their peers, and this can increase problem drinking among the misperceiving majority (pluralistic ignorance: incorrect perception that majority belief is different from their own belief due to memorable exemplars) and can reinforce problem drinking among the minority of individuals actually exhibiting problematic drinking (false consensus: problem drinkers misperceive their behavior as normative). Social norms theory hence implies that correcting misperceptions about the target belief is crucial to maximize norm adoption. In our case, what is the target belief? Though the replication norm situation is arguably more complicated than the alcohol example, a relevant target belief in our context is that a majority of researchers may incorrectly believe (or overestimate the extent to which) independent direct replications will be negatively received and/or create antipathy due to memorable recent exemplars (e.g., the “Repligate” scandal [38,82,29]).
In this way, social norms theory would predict that it is crucial to raise awareness regarding what psychologists actually believe with respect to direct replications (i.e., that approximately 22% of journal space should be dedicated to direct replications) so as to correct misperceptions regarding the extent to which publishing replications will be a negative experience.8 This in turn should increase the number of researchers who feel comfortable executing and publishing independent replications. Indeed, one could argue that the publication of this new replication norm and/or official endorsement by professional societies (e.g., Association for Psychological Science) may further reduce misperceptions regarding negative experiences surrounding replications.
Concerns and Challenges
At first glance, it may seem impossible or unfeasible for researchers to adopt the new replication norm for expensive and/or time-consuming studies (e.g., fMRI or longitudinal studies). Upon closer scrutiny, however, this may simply not be the case. For such studies, researchers can simply include independent replications as systematic replications where both a direct and conceptual replication are built into the study design. Though achieving this may increase a study’s sample size or number of observations (in the case of between- and within-subjects designs, respectively), it should nonetheless be possible in most situations, especially if such considerations are taken into account ahead of time when planning research. For example, when writing grant proposals, a researcher could decide ahead of time to include a systematic replication for some of the proposed studies. This will ensure that the funds requested are sufficient to cover the larger sample sizes required to have high power to (1) replicate the original finding and (2) potentially discover new boundary conditions or underlying mechanisms.
Another concern is that the new replication norm may be difficult or impossible to enforce. Though this is strictly true, I contend that proposing and promoting the idea of a new replication norm has a lot of value even if such a norm is not enforceable. Indeed, the extant peer reviewer norm is also not strictly enforceable, but we nonetheless all collectively benefit from the existence of such an unenforceable informal norm. In a similar sense, there is immense value in promoting a new research culture whereby p-hacking is no longer condoned even though such normative behavior also cannot be enforced. As previously mentioned, the new replication norm idea could benefit our field in several respects even if only a minority of researchers adopts it using a higher original-to-replication studies ratio (e.g., 8:1 or even 10:1).
Conclusion
In the current article, I propose the idea of a new replication norm whereby researchers should aim to independently replicate important findings in their own research areas in proportion to the number of original studies they themselves publish per year. As a starting point, I have proposed that researchers should strive for a 4:1 original-to-replication studies ratio based on logical and empirical considerations. However, such a ratio can be calibrated depending on a researcher’s faculty position and resources. Though the norm may not be enforceable (just like the peer reviewer norm), I contend our field could benefit in several respects even if only a minority of researchers adopts it using an original-to-replication studies ratio higher than the 4:1 ratio suggested. I argue this simple new norm could significantly advance our field by increasing the reliability and cumulativeness of our empirical knowledge base, accelerating our theoretical understanding of psychological phenomena, increasing the quality (rather than quantity) of studies, and helping to facilitate our transformation toward a research culture where publishing independent direct replications is seen as a completely ordinary part of the research process.
The author declares that they have no competing interests.
I would like to thank Daniel Lakens, Hal Pashler, Brent Roberts, Rogier Kievit, and Brian Earp for valuable feedback on an earlier version of this manuscript.
Strictly speaking, the social norm should be three times the number of first-author submissions rather than accepted publications because many papers are rejected at several journals before finding a home. However, given that revised versions of rejected manuscripts submitted to a different journal are often reviewed by some of the same reviewers (and involve less work to review subsequently), for the sake of simplicity, it would appear researchers calibrate their peer review behavior in reference to number of published rather than submitted articles.
It is important to mention that the difference between original and replication research should not be exaggerated given that in actuality a continuum exists between original and replication research.
Also, like the peer review norm, researchers may naturally calibrate their required contribution to the system depending on their faculty position and available resources (e.g., research vs. teaching position, access to large vs. small subject pools, etc.).
That being said, such unsuccessful “conceptual replications” are nonetheless informative because they constrain the generalizability of a finding, which can have important theoretical or applied implications.
An alternative strategy would be to present the new-and-improved moral actions after the original dependent variables (DVs), in which case a MANOVA approach could be used to analyze the resulting data. Note, however, that in cases where individual DV items are time-consuming, this could cause fatigue effects which may interfere with finding an effect supporting the generalizability of the original finding.
A more radical strategy could involve always executing systematic replications for studies testing novel hypotheses that extend a published effect (i.e., always replicate back k-1 studies where k = current study). Though this approach in theory would dramatically increase the reliability of findings, it would arguably be an inefficient use of resources in terms of maximizing scientific discovery given that researchers have finite resources.
If every journal required a direct replication as Step 1 in any manuscript reporting a series of follow-up studies that build upon an original finding, we would not need the new replication norm. However, this is extremely unlikely to ever occur because such a strategy is (1) simply not feasible except in particular areas of experimental psychology and (2) arguably an unwise use of resources given that not all findings need to be replicated.
There is some evidence that under some conditions norms can also motivate socially undesirable behaviors among already compliant individuals. This should be unlikely for the new replication norm, however, given that researchers are intrinsically motivated to execute and publish replications because they care about the theoretical progress of their own research area.
Peer review comments
The author(s) of this paper chose the Open Review option, and the peer review comments are available at: http://dx.doi.org/10.1525/collabra.23.opr