Study preregistrations and theory development have been proposed as independent strategies for improving the quality, transparency, robustness, and reproducibility of research. Yet, a discussion of how theory and preregistration could interact and complement each other towards improved science is currently lacking. Here, we argue that hypothesis preregistration could stimulate theory development by serving various roles, depending on the availability or quality of theories and hypotheses in a field at any given time. We suggest that, in appropriate conditions, preregistration can increase the quality of hypotheses before they are tested, with indirect beneficial effects on theory development. In fields where theories are less developed or agreed upon, or are lacking altogether, hypothesis preregistration can nudge researchers to improve and share their hypotheses, engaging their community and facilitating accumulation and development of hypotheses. As a field’s theories and hypotheses become more advanced and better understood, hypothesis preregistration can become less important: the theory itself can function as a public repository of hypotheses and can constrain methodological aspects of research. We explore possible relations and synergies between hypothesis preregistration and theory development in a range of scenarios. We conclude with a discussion of implications and recommendations for researchers and meta-scientists.
1. Introduction
The aim of science is to build theories of the world with particular virtues, such as predictive and explanatory adequacy, coherence, practical benefit and, ultimately, truth. Some contributors to the science reform debates have suggested that not all the measures proposed recently to improve the quality, transparency, or reproducibility of research play a clear, direct role in advancing that aim. One of the most widely discussed measures in this context is preregistration of research studies. Preregistration involves declaring a time-stamped research plan and preserving it, typically in an independent online repository, before data collection or data analysis is underway (Nosek et al., 2018). Proponents of preregistration have identified three main aims of declaring and preserving research plans (Nosek et al., 2018). The first is to distinguish between tests and analyses decided before vs after data collection and processing. This is intended to help avoid confusing prediction and postdiction, and to avoid undermining the assumptions of statistical hypothesis tests through undisclosed or unplanned analyses. The second aim of preregistration is to help prevent, or at least detect, questionable research practices, e.g., selective outcome reporting, HARKing (Hypothesizing After the Results are Known; Kerr, 1998), and CARKing (Criticizing After the Results are Known; Hobson, 2019), and to help counteract bias by restricting researchers’ degrees of freedom (Nosek et al., 2019; Simmons et al., 2011). The third aim is to reduce publication bias: the tendency to publish more hypothesis-supporting results than other results (Fanelli, 2010; Ioannidis, 2005).
1.1. Existing Arguments For and Against Hypothesis Preregistration
Here, we assume that, to the extent that preregistration can realistically achieve these aims, it may indeed improve the reliability and replicability of studies. Current studies on the effectiveness of preregistration in limiting questionable research practices show that, for example in psychology, researchers still selectively report hypotheses in approximately half of all published, preregistered studies (van den Akker et al., 2023), and that current preregistration practices do not show robust effects on preventing selective reporting and HARKing (van den Akker, 2024; van den Akker et al., 2024). However, the effectiveness of preregistration has been shown to depend on a study’s producibility (the extent to which it can be conducted strictly based on the registered plan) and consistency (the agreement between the preregistration and publication; van den Akker, 2024). Thus, preregistration can be effective in restricting researchers’ degrees of freedom and limiting questionable research practices only when it is detailed, specific, and consistently followed, and when deviations are reported transparently.
Consequently, researchers are encouraged to register research plans that are as detailed and complete as possible, in particular their hypotheses and research designs, data acquisition methods, and data analysis protocols. However, such ‘maximal prespecifications’ are not always viable (Hardwicke & Wagenmakers, 2023), and partial specifications are less likely to attain preregistration’s stated aims, even assuming that researchers will follow their preregistered plans exactly. For example, registering only one’s hypotheses could open up more exploratory degrees of freedom in data analyses, which may increase the risk of introducing researcher bias (Breznau et al., 2022; Simmons et al., 2011; Waldron & Allen, 2022). It could also encourage researchers to preregister ‘obvious’ hypotheses (Pham & Oh, 2021) or merely promote or ‘demonstrate’ their hypotheses (Waldron & Allen, 2022), a practice that would fall between proper theory development and empirical testing. By contrast, some argue that researchers may be encouraged to preregister risky hypotheses because increasing a theory’s confidence and credibility through preregistration could compensate for small or negative results (Scheel et al., 2021).
Here, we argue that even preregistering only one’s research hypotheses is likely to have indirect beneficial effects on the research process. Our argument builds on the assumption that researchers will conduct hypothesis preregistrations with high producibility, such that registered hypotheses will be detailed and specific, and high consistency, such that researchers will report findings for a registered hypothesis—exclusively, or in addition to any other exploratory findings from a study. In these conditions, hypothesis preregistration could help prevent HARKing and other situations in which researchers construct, select, or modify their hypotheses after having seen the data, and present new hypotheses as if those had been formulated prior to data collection or analysis, and without fully disclosing the processes that led to the new hypotheses. Note that by HARKing we only mean cases of intentional and undeclared change of hypotheses after the results are known, not cases of abduction or other forms of hypothesis generation, where new theoretical constructs are invoked to explain observations or phenomena, while declaring the original hypotheses. As an intentional and undeclared change in the hypotheses, HARKing has been seen as conducive to relatively low replication rates, for example because it can promote statistical abuses (Kerr, 1998; for discussion, see Rubin, 2022). Hypothesis preregistration could then (indirectly) contribute to increasing replication rates, if other conditions on data and analyses are satisfied. It has also been suggested that hypothesis preregistration could improve the transparency of credit attribution and facilitate the development of hypotheses (Moeller et al., 2022) through their public identification and (re)formulation, and via participatory development of theoretical frameworks. 
An important assumption here, however, is that hypothesis preregistrations are made public upon submission (or, for preregistrations that include study design and analysis details, that at least the parts containing the hypotheses are made public upon registration), and that the embargo functionalities currently offered by several platforms (e.g., hiding time-stamped preregistration documents from public access for a specified amount of time while the study is ongoing) are not used.
Most current views on the benefits of preregistration emphasize its potential to curb type I errors (i.e., reduce the rate of false positives) by controlling researchers’ degrees of freedom, HARKing, p-hacking etc., in confirmatory research. Preregistration has also been argued to serve the purpose of reducing uncertainty about theoretical risk (Peikert et al., 2023). Risky predictions, if they turn out correct, can increase the persuasiveness of the evidence for or against a hypothesis. However, just increasing theoretical risk has well-known costs and trade-offs (e.g., inflated type I error rates and lower effect detectability, to be counteracted by greater statistical power), and is intrinsically uncertain. Importantly, risky tests can only produce persuasive evidence if the nature, source, and extent of theoretical risk are known (i.e., if uncertainty on theoretical risk is minimized), all other conditions being met. Peikert et al. (2023) argued convincingly that preregistration can contribute to reducing uncertainty about theoretical risk: “Communicating clearly how authors (…) collected their data and consequently analyzed it to arrive at the evidence they present is crucial for judging the theoretical risk they took” (p. 22), which is needed to increase the persuasiveness of evidence. This analysis applies to confirmatory and exploratory studies alike, but its primary focus remains, at least in the authors’ own framing, the preregistration of research methods.
1.2. Towards a New Argument for Hypothesis Preregistration
In this paper, we will develop a new argument for preregistration that is intended to integrate and complement the arguments recapitulated here. We assume that each of those arguments applies to particular situations (e.g., confirmatory research where uncertainty about theoretical risk depends largely on the methods or procedures used), even though their validity may still need to be proven empirically, in such situations or more generally; of course, the same holds for our own argument. In brief, we will suggest that, in appropriate situations, preregistration can increase the quality of hypotheses before they are tested, with indirect beneficial effects on theory development at large. Before we present our case, we will lay out two premises: one on the idea of ‘hypothesis quality’ and its relevance to the replication crisis and science reform, and one concerning the link between hypothesis quality and theory development, given the aims of empirical science.
1.2.1. First Premise: The Idea of Hypothesis Quality
Hypotheses are central at least in confirmatory empirical science, and play a role in other types of research, too. In textbook accounts of the scientific process, hypotheses are proposed, tested, and revised in the ‘empirical cycle’. In reality, hypotheses are the hinge between the ‘empirical cycle’ and the ‘theoretical cycle’ (van Rooij & Baggio, 2020), where theories and models are developed and used to derive explanatory hypotheses, which are fed into the empirical cycle for testing, and back into the theoretical cycle for revision in the context of a theory and its models. Regardless of the details of one’s views of the scientific process, two questions arise: what is a good hypothesis, and how can the quality of hypotheses be improved? Hypotheses typically share the virtues of the theories to which they belong: predictive and explanatory power, internal and external coherence, non-ad-hocness etc. (Borsboom et al., 2021; Schindler, 2018). But hypotheses can also inherit the ‘vices’ of theories, like lack of explicitness and formalization, a weak internal deductive structure etc., and even the absence of theories, or the identification of single, stand-alone hypotheses with theories. In this type of scenario, with weak or no theories, our question of how preregistration can help improve the quality of hypotheses seems particularly salient. Importantly, the replication crisis has made these questions, and others about hypothesis quality, not just salient, but also urgent.
Some authors have argued that one major neglected factor for low replication rates in some fields of research might be a high base rate of false hypotheses under test (Bird, 2021; Ioannidis, 2005). If a large fraction of the hypotheses put to the test are false, and researchers do not know it, low replication rates should be expected, even in a field that has all the characteristics envisaged by science reform proponents: researchers preregister their studies, follow their plans closely, share materials, code, and data, do not HARK or p-hack, etc. This is also compatible with the ideal of testing risky hypotheses in a methodologically advanced science. However, high-quality methods do not always or necessarily suffice for high-quality science. The probability that a positive report is false (Wacholder et al., 2004) cannot be adequately reduced by measures such as increasing power, or lowering the alpha in null-hypothesis statistical tests, if the prior probability of the hypotheses being tested is low.1 Low prior probability acts as a ‘trap’ that tends to neutralize the benefits of other science reform measures. Bird (2021) then recommends improving the quality of the hypotheses under test, or “trying to generate hypotheses with a greater probability of truth” (p. 985) to control the false positive report probability. This is not to say that one should subjectively assign higher prior probability to one’s hypotheses under test, or only test hypotheses that are known to be probable: hypotheses with high prior probability have limited capacity to confirm theories, as follows from Bayes’ theorem (see Vickers, 2022 for discussion and examples from evolutionary biology).
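As a rough numerical illustration of the low prior probability ‘trap’, the false positive report probability can be sketched as a function of the significance level, the statistical power, and the prior probability that the tested hypothesis is true. The expression used below is the standard form associated with Wacholder et al. (2004); the function name, the parameter names, and the example values are illustrative assumptions of ours and are not taken from any of the cited studies.

```python
def false_positive_report_probability(alpha: float, power: float, prior: float) -> float:
    """Probability that a hypothesis is false given a statistically significant result.

    alpha: significance level of the test (type I error rate)
    power: probability of a significant result when the hypothesis is true
    prior: assumed prior probability that the tested hypothesis is true
    """
    if not (0 < alpha < 1 and 0 < power <= 1 and 0 < prior < 1):
        raise ValueError("alpha and prior must lie in (0, 1); power must lie in (0, 1]")
    false_positives = alpha * (1 - prior)  # significant results from false hypotheses
    true_positives = power * prior         # significant results from true hypotheses
    return false_positives / (false_positives + true_positives)


# Hypothetical settings: a baseline test, a higher-powered test, a stricter alpha.
settings = ((0.05, 0.80), (0.05, 0.99), (0.005, 0.80))

for prior in (0.5, 0.1, 0.01):
    for alpha, power in settings:
        fprp = false_positive_report_probability(alpha, power, prior)
        print(f"prior={prior:<5} alpha={alpha:<6} power={power:<5} FPRP={fprp:.3f}")
```

Under these assumed values, raising power changes the probability only slightly when the prior is 0.01, and a tenfold stricter alpha still leaves it far above what the baseline settings yield at a prior of 0.5.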
Our recommendation is rather this: (1) deploy theoretical tools to identify hypotheses with low prior probability, so that either they are not subjected to costly tests that can yield false positives (Baggio et al., 2024; van Rooij & Baggio, 2020, 2021) or are flagged as ‘high risk’, and then measures are taken to minimize uncertainty about theoretical risk (Peikert et al., 2023); (2) adopt practices that could (directly or indirectly) increase the quality of hypotheses before they are tested: this is where preregistration may become pertinent. It is the latter claim that we address in this paper.
1.2.2. Second Premise: The Relationship Between Hypothesis Quality and Theory Development
The second premise of our argument is that, in general, there is a relationship between the quality of hypotheses and the stage of development of the theory to which they belong. Highly developed theories tend to yield higher quality hypotheses, such as hypotheses with higher prior probability: e.g., the detection of gravitational waves and the experimental discovery of the Higgs boson were strongly predicted by physical theories; because of their high prior probability, they had a low net confirmatory effect on the theory, yet they were highly acclaimed results. A similar point applies to the discovery of Neptune, famously touted by Popper as proof that science proceeds by testing ‘exceedingly improbable’ predictions (Popper, 2013). In fact, Adams and Leverrier’s 1846 prediction of the new planet’s orbit had high prior probability given Newton’s theory: it was only improbable relative to an atheoretical logical space in which all orbits and celestial positions have the same ‘probability’. Harsanyi (1960) showed that this idea of probability, borrowed by Popper from Wittgenstein to justify the claim that science progresses by testing ‘bold’ and ‘improbable’ conjectures, is highly non-standard and even violates the axioms of probability. Instead, Popper’s lesson concerns the value of specificity: good scientific theories make predictions that hold true in small regions of logical space, as in the discovery of Neptune. That does not entail that they have low prior probability, either within or across theories. Ideally, theories must provide the means to finely control prior probabilities (Salmon, 1965), so that: (1) a broad spectrum of predictions may be tested, from risky to more conservative; (2) statistical parameters can be adjusted accordingly; and (3) measures can be taken to control and communicate uncertainty about theoretical risk.
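The remark that a prediction with high prior probability has a low net confirmatory effect can be made explicit with a short derivation. As a simplifying assumption of ours, suppose that a theory T entails a prediction E, so that P(E | T) = 1; the symbols T and E are introduced here only for illustration.

```latex
\begin{align}
P(T \mid E) &= \frac{P(E \mid T)\,P(T)}{P(E)} = \frac{P(T)}{P(E)}, \\
P(T \mid E) - P(T) &= P(T)\,\frac{1 - P(E)}{P(E)}.
\end{align}
```

Under this assumption, the increase in the probability of T shrinks toward zero as P(E) approaches 1: observing a prediction that was already highly probable adds little support to the theory, however well the theory anticipated it.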
Apart from these important issues in the philosophy of science, which will necessarily inform our argument, the question for us is whether and how preregistration and other practices can stimulate the development of higher quality hypotheses, and the improvement of existing ones, in situations where good theories are lacking or are in early stages of development, or there is no agreed upon definition of ‘good theory’,2 but there may be working accounts of the virtues of new hypotheses, e.g., their specificity and their ability to avoid low prior probability traps through high coherence, non-ad-hocness, and other theoretical qualities (Schindler, 2018). Here, we explore the possibility that, indeed, preregistration can lead to better quality hypotheses, which in turn may contribute to theory development more broadly. We will assume that theories and hypotheses can be improved along several dimensions: for the reasons given above, and others that will follow, we take prior probability and specificity to be two important variables.
2. Preregistration and Theory Development, in Theory
Since preregistration was proposed as one of the tools to combat the replication crisis (Nosek et al., 2018), some authors have expressed doubts about its justification and effectiveness. Focusing on hypothesis preregistration, for example, it has been argued that the decision to declare a hypothesis cannot alter its prior probability or any other factors that can determine its quality, which have to be assessed and developed independently. For individual studies or hypotheses, preregistration is never diagnostic of good science (Szollosi et al., 2020). Even when hypotheses are preregistered, a practice of selective hypothesis reporting or HARKing might still persist, if registrations are not conducted in a producible and consistent way (van den Akker, 2024; van den Akker et al., 2024). Furthermore, many (preregistered) hypotheses may not even be derived from theories (McPhetres et al., 2021), but are generated ad hoc, for example by extrapolating from previous results. Strong theories and their hypotheses are needed before, and for, preregistration (van Rooij, 2019; van Rooij & Baggio, 2020, 2021): only in the context of high-quality theories could preregistration have the effects expected by its advocates, such as contributing to the identification of true hypotheses.
A different perspective is that the problem is not that we do not know what the ‘true hypotheses’ are, but rather that theories in some fields are too flexible, in that they can be readjusted to fit any pattern in data (Muthukrishna & Henrich, 2019). Preregistration may expose this flexibility when there are deviations from the declared plans. Still, one could preregister bad hypotheses (Szollosi et al., 2020) that do not follow from any well-developed theory. While preregistration can provide an initial nudge to improve a theory, it is no guide as to what changes the theory needs and how to implement them (Szollosi & Donkin, 2021). Skeptics would argue that not only is preregistration not helpful in improving reliability or replicability: it may also hinder science’s advancement, if it distracts scholars from other, more urgent reforms, such as improving theories and the quality of hypotheses under test (van Rooij, 2019).
Yet another view is that preregistration is redundant and its aims may often be achieved by other means. For example, Rubin (2020) proposed distinguishing historical transparency, which could be improved by registration of methods, analyses, hypotheses etc., prior to data collection, from contemporary transparency, which is about justifying current methods, analyses, hypotheses etc., regardless of what else has been considered by the researchers. On this account, preregistration’s historical transparency has no effect on credibility beyond the benefits provided by contemporary transparency, when that is available. None of the benefits of preregistration is achievable only by preregistration and by no other means. With contemporary transparency, preregistration is not necessary for credibility, and could therefore be considered redundant.
Building on Rubin (2020), Rubin & Donkin (2022) argue that preregistration is unnecessary for discriminating between a priori vs post-hoc hypotheses or explanations, which has been claimed to be preregistration’s main function (Nosek et al., 2018). Preregistrations that are time-stamped in a repository provide verifiable distinctions between hypotheses and plans formulated before vs after the data are analyzed. This is taken to be a desideratum for good science, on the assumption that prediction is superior to accommodation: data are more persuasive, as evidence for or against a hypothesis, when they are predicted by the theory than when they are accommodated within it.3 By checking the content of hypotheses, a priori vs post-hoc explanations can be identified without knowing the time of their construction: one can check if a result is used or not in the rationale for a hypothesis (Rubin, 2022), and decide whether it was ‘predicted’ (or derived) vs accommodated. This assumes that hypotheses are embedded in a theoretical framework. However, in psychology and other fields, hypotheses are often built on intuitions or previous findings, or simply constitute the ‘theory’. It is therefore unclear to what extent proposals emphasizing the formal justification of hypotheses within theoretical frameworks are applicable to criticize the use and effectiveness of preregistration in practice, especially of hypothesis preregistration.
3. Preregistration and Theory Development, in Practice
The upshot of the arguments just summarized is that preregistration is not necessary to generate and select high-quality hypotheses, to increase the transparency of tests and the reproducibility of results, or to build plausible or true theories. These goals may also be achieved by means other than preregistration. After all, modern science has for centuries progressed toward better methods and better theories without preregistration and the procedural apparatus of Open Science. This conclusion, although factually correct, overlooks the fact that preregistration, like other practices in science, is not part of the formal apparatus of theory building, development, testing, and revision, but can still have indirect, external effects on those processes, depending on how a particular field of science functions at a specific time or in a specific context.
In support of that, there is some evidence that preregistration already helps achieve some of the objectives of science reform. Recent studies suggest that peer-reviewed registrations (Registered Reports; Chambers, 2013), as well as non-peer-reviewed ones (e.g., a time-stamped plan uploaded to a public repository), can benefit the credibility of research. In medicine, where preregistration has been practiced at least since the 1990s (Dickersin & Rennie, 2012), registered clinical trials have been associated with lower risk of bias than non-registered trials (Lindsley et al., 2022). In economics, p-hacking is less frequent in randomized controlled trials when full analysis plans are registered (Brodeur et al., 2022). Studies of published results have shown that preregistered plans facilitate research evaluations even when they are not followed entirely (Simonsohn & Simmons, 2022). Researchers judge preregistered studies to be of higher quality than non-preregistered ones (Sarafoglou et al., 2022), in particular in the Registered Reports format (where preregistration is peer-reviewed) as compared to standard articles (Soderberg et al., 2021). Registered Reports also reduce the rates of published positive findings, compared to null results, impacting publication bias (Scheel et al., 2021) and CARKing (Hobson, 2019).
Although these results point mainly to the potential improvement of methodological aspects of research, such as by reducing biases and questionable research practices, specific benefits have also been shown for hypothesis preregistration, e.g., for the visibility of research ideas, for the collective selection, accumulation, or development of hypotheses (Moeller et al., 2022), and for improving the quality of research questions (Soderberg et al., 2021). At the same time, research on the effectiveness of hypothesis registration in psychology has found selective hypothesis reporting even in registered studies (van den Akker et al., 2023) and no robust effects of current preregistration practices on preventing selective reporting or HARKing, underlining the need for preregistrations with greater producibility and consistency (van den Akker, 2024; van den Akker et al., 2024). The benefits of preregistration therefore depend on how preregistration practice is implemented and followed. Crucially, despite the reviewed evidence for preregistration’s contributions to achieving some of the targets of science reform, it remains unclear whether and in what conditions hypothesis preregistration could nudge researchers to generate higher-quality hypotheses, in particular hypotheses that escape low prior probability traps and that are sufficiently specific to produce informative results (section 1).
3.1. How Preregistration and Theory Development Interact Across Different Contexts
Given the lack of empirical studies on this specific question, we adopt a different approach here: we present and discuss a number of hypothetical scenarios, designed to reveal different ways in which preregistration and theory development could interact. We consider this form of argument appropriate, primarily because at present there is no empirical research that directly addresses the questions we raise. Hypothetical scenarios are widely used, under the label of the ‘method of cases’, in philosophy, which we regard as contiguous with the meta-scientific exercise we engage in here. In contrast with traditional uses of the method of cases, however, our goal is not to elicit judgments about one specific scenario, but to map out a minimal space of related scenarios, invite readers to consider them vis-à-vis particular real-world situations, and motivate the claim that, in those situations, hypothesis preregistration could stimulate theory development. Importantly, each scenario covers only a limited set of circumstances, and not the general case. General conclusions can only be drawn by pooling observations across scenarios. Finally, the scenarios that we examine are not entirely arbitrary, and partly correspond to the ‘research types’ discussed in the philosophy of probability and statistics (e.g., by Trafimow, 2003, see below).
For some scenarios, we will make an assumption concerning the relationship between the theory or hypothesis that a researcher currently believes, or accepts or commits to, and the outcomes of future observations. Such assumptions need to be made explicit because, in general, the attitudes of researchers have very significant effects on science in practice. Specifically, a researcher may believe in the future success of the theory, as an extension or an implication of their current belief that the theory is ‘true’ (or corroborated, supported etc.). Accompanying this belief there may be a preference for the future success of the relevant theory: everything else being equal, it would be realistic to expect that a researcher prefers the current theory to be confirmed or corroborated by future experiments, for example because possible worlds in which the currently believed theory is confirmed or corroborated are more intelligible, given the researcher’s current epistemic position, than possible worlds where the same theory is disconfirmed and thus a different theory is needed. It is easy to view the future success preference (FSP) negatively. After all, a researcher should always prefer to find out the truth, without interests or partiality towards any one theory. And yet, the FSP is consistent with a rational current belief in the theory, is not reducible to any of the ‘epistemic vices’ currently studied in virtue epistemology (Cassam, 2019), and is not the same as confirmation bias, where one selects, from a wider information pool, only that which confirms the theory. The FSP holds even when such information is not available: it implies a preference for the theory to survive future tests, not necessarily a tendency to sample favorable information to that end.
The FSP could cause problematic behaviors that are not accounted for by confirmation bias, such as HARKing and ‘hypohacking’ (formulating or testing hypotheses in ad hoc ways so that seemingly confirmed hypotheses can emerge; Guest & Martin, 2021). In spite of the risks that the FSP may harbor when it is pursued against evidence and rigor, we suspect that there may be ways of exploiting it constructively through preregistration.
3.1.1. Scenario A: Hypothesis Preregistration Is Rare, Serves the FSP, and Cannot Improve Theory
Consider the following scenario. (A) — Hypothesis preregistration is neither widely adopted nor recommended, there are no community-wide or institutional norms or incentives to adopt it (e.g., from funding bodies), and there are few controversies around preregistration. In this situation, the default may be preregistration of methods, for example in attempted replication studies, while the quality of one’s hypotheses may be a key factor in deciding whether to preregister them or not. If one believes one has a good-enough hypothesis (one that is specific, avoids low prior probability traps etc.), such that the outcomes of the study can confirm or corroborate one’s theory (as per the FSP), then one can only benefit from preregistering one’s study. Preregistration will attest that the hypothesis was formulated before seeing the data, and that no significant changes were made to the theory, hypothesis, methods, or analyses, in order to save the hypothesis from contact with the data. If practiced transparently and honestly, preregistration can in such cases reduce uncertainty about theoretical risk and increase the persuasiveness of the results obtained (Peikert et al., 2023), thus serving the FSP. If one does not believe one has good-enough hypotheses, one could either test them without preregistration or formulate higher-quality hypotheses that can be preregistered. Scenario (A) neither requires nor leads to higher-quality hypotheses or theories: one could always choose not to preregister any hypotheses, instead of making the effort of devising better ones that can be preregistered, since by assumption there are no incentives towards preregistration practice.
3.1.2. Scenario B: Hypothesis Preregistration Is Encouraged but Does Not Lead to Better Theories
Consider now a different scenario. (B) — Hypothesis preregistration is sufficiently widespread or recommended by/to researchers, or there are community-wide or institutional norms or incentives to adopt it, that one will want to do it, more often than not. Given the push to preregister and the FSP, one could either preregister and test one’s higher-quality hypotheses or try to develop some. The impulse to theory development may indeed come ‘from’ preregistration more or less directly, although the practice of preregistration would have little to contribute to theory building as such (Szollosi et al., 2020). This is a possible scenario in which an independent incentive to preregister could effectively prompt researchers to improve the quality of the hypotheses they wish to test. It is not implausible to argue that some areas of psychology, social science, biomedical science, and possibly other research fields, are currently moving towards a (B)-type scenario.
However, one problem is that, in (B), there is an incentive to preregister but still no incentive to build or develop better theories. This scenario captures well the predicament of much confirmatory research in those fields where hypotheses are not derived from theories, or where strong theories that entail new hypotheses are not available. In this situation, one clear and sustainable path to reproducible and replicable science is to maximize the posterior probability of one’s hypothesis given the results, which can be achieved by choosing null hypotheses with low prior probability and alternative hypotheses with high prior probability (for an argument based on probability theory, see Trafimow, 2003, where this scenario is referred to as research type X, i.e., non-exploratory and non-theoretical research). In (B), in the absence of theories or incentives to develop them, researchers can try to maximize the probability that their hypotheses come out true after testing, for example by testing intuitively plausible ideas or hypotheses derived analogically from existing findings. It is only when these initial options are exhausted that other practices, possibly including preregistration, can provide incentives to obtain better hypotheses and to engage in theory development.
Another problem is that the community’s push to preregister, together with individual preferences (FSP), can lead researchers in (B)-type (or Trafimow’s X-type) scenarios to focus excessively on hypotheses with larger prior probabilities, and to seldom test riskier, potentially more informative hypotheses. In scenario (B), without further measures, research may become too conservative.
3.1.3. Scenario C: Hypothesis Preregistration Is Encouraged and Can Lead to Theory Development
The puzzle is how to reconcile incentives for preregistration, by assumption in (B), with the FSP and with the necessity of testing hypotheses along a continuum of probability and risk. This could happen in a new scenario, where researchers want not necessarily to ‘be right’, or to have had the ‘true’ hypotheses all along (FSP), but to obtain consequential results for their field: to run studies that, however they turn out, have high ‘information gain’ (Ebersole et al., 2016). Information gain is defined as the change in the probability of a hypothesis (which can be the null) from before the results were obtained to after they were obtained, i.e., the difference between the prior probability P(H) and the posterior probability P(H|R) (Trafimow, 2003). The larger this difference, the larger the information gain of a study. Let us call this scenario (C). It corresponds to Trafimow’s (2003) type Y, which is characterised by research that may be exploratory, and need not be confirmatory, but is still not theoretical. Trafimow shows that, in such circumstances, a rational strategy is to try to maximize information gain and test hypotheses with low prior probability, relative to scenario (B)/X, but not too low, as low prior probability traps should still be avoided (see above). In this scenario, communicating about theoretical risks is crucial to lend credibility to the claim that the results provide high information gains, if they turn out to support the hypothesis. Preregistration here has the role that Peikert et al. (2023) assign it, namely to reduce uncertainty about theoretical risk. Even more important than communicating the theoretical risk is estimating it correctly and accurately: although preregistration can contribute to the former as far as methods are concerned, by showing that the hypotheses are immune from HARKing, hypohacking etc., it cannot address the latter, for which theory appears to be necessary. 
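The definition of information gain given above can be made concrete with a small numerical sketch. It assumes that the posterior P(H|R) is obtained from the prior via Bayes' theorem for a binary hypothesis; the function names and the likelihood values P(R|H) = 0.8 and P(R|¬H) = 0.2 are hypothetical choices made only for illustration, and are not taken from Trafimow (2003).

```python
def posterior(prior: float, p_r_given_h: float, p_r_given_not_h: float) -> float:
    """P(H|R) via Bayes' theorem for a binary hypothesis H and result R."""
    evidence = p_r_given_h * prior + p_r_given_not_h * (1.0 - prior)
    return p_r_given_h * prior / evidence


def information_gain(prior: float, p_r_given_h: float, p_r_given_not_h: float) -> float:
    """Change in the probability of H after observing R: P(H|R) - P(H)."""
    return posterior(prior, p_r_given_h, p_r_given_not_h) - prior


# Hypothetical likelihoods (assumption): R is four times as likely under H as under not-H.
P_R_GIVEN_H, P_R_GIVEN_NOT_H = 0.8, 0.2

# Compare a 'safe' hypothesis, a moderately risky one, and a very improbable one.
gains = {
    prior: information_gain(prior, P_R_GIVEN_H, P_R_GIVEN_NOT_H)
    for prior in (0.9, 0.3, 0.01)
}
for prior, gain in gains.items():
    print(f"P(H) = {prior:.2f}  information gain = {gain:.3f}")
```

Under these assumed values, the gain is about 0.07 for P(H) = 0.9, about 0.33 for P(H) = 0.3, and about 0.03 for P(H) = 0.01. This is consistent with the strategy described for scenario (C): hypotheses with lower prior probability than in (B) are more informative when supported, but a prior that is too low yields little gain.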
In our scenario (C) and in Trafimow’s Y, the assumption is that research can be exploratory, but not theoretical: the hypotheses are not derived from the theory, if any exists. Here, theory development begins where preregistration ends, when controlling uncertainty on methodological aspects of research is not sufficient and theoretical risk must be assessed. It is more likely that researchers will have to engage in theory development in (C), as other approaches (e.g., analogical reasoning from published results) may not work as well for the aims of stating hypotheses for which theoretical risks and information gains are known.
4. Preregistration Along the Theory Availability Spectrum
In the previous section, we have argued that moderate, indirect incentives for theory development may arise in scenarios where studies are empirical and researchers pursue a strategy of maximizing information gains by testing risky hypotheses, other conditions being met. In this type of situation, preregistration of methods can reduce uncertainty about theoretical risks associated with procedural aspects of research. A preexisting push to preregister one’s hypotheses may encourage researchers to develop theories as formal means for providing more complete and accurate estimates of theoretical risk, as is required by a rational demand to use preregistration to communicate epistemic risk precisely in order to reduce uncertainty about it (Peikert et al., 2023). The interplay of (1) exploratory attitudes and goals in research, (2) the adoption of preregistration in a community, and (3) the notion that the objective of preregistration is to increase the expected persuasiveness of exploratory studies by reducing uncertainty about theoretical risks may prompt researchers to explain why their hypotheses carry the risk they do, in terms of prior probability or other relevant measures: for that, some form of theory development appears to be necessary (see footnote 4). So far, the assumption has been that studies can be confirmatory or exploratory, and that theories are either not available or cannot justify the hypotheses that are presented for testing.
4.1. The Role of Preregistration as a Function of Theory Availability and Development
4.1.1. Scenario D: Possible Roles of Hypothesis Preregistration in the Absence of Theories
In this section, we explore in more detail the possibility that the role of hypothesis preregistration may differ depending on the availability or quality of theories in a given field at a particular time. Consider a new scenario, (D), for any discipline in which there are few strong or formal theories, where theories are sparsely developed, or where hypotheses tend to be derived analogically from previous research. Preexisting incentives for (hypothesis) preregistration, as in (B)-(C), could still counteract HARKing and put pressure on researchers to think more deeply or carefully about the hypotheses that will end up in the relevant repositories, for example by applying informal checks on their specificity and coherence, to the extent that this can be done in the absence of explicit theory. Additionally, hypothesis preregistration can encourage researchers to learn about other registered hypotheses before formulating and registering theirs, and so begin to engage a scientific community in discussions about the quality of novel hypotheses and about connections between existing proposals. Here, we assume that hypothesis preregistrations are made public right away, and not ‘hidden’ under an embargo. For preregistrations that include study or analysis details that should not be made public immediately, for whatever reasons, we suggest that it should be possible to make the hypothesis preregistration public separately from the rest of the document. Even in this scenario, hypothesis preregistration may indirectly but effectively give impulse to theory development, in fields where better hypotheses are hard to come by in the absence of theory: preregistration could indeed expose the need for theory by increasing the demand for high-quality hypotheses.
As argued by van’t Veer & Giner-Sorolla (2016), preregistration may not directly provide input to theory development, but it may help researchers “put emphasis on developing sound theory and methods—the very elements specified in the pre-registration—rather than on results” (p. 3; see also Sarafoglou et al., 2022), or “put greater emphasis on defining their research questions” (Soderberg et al., 2021). One risk is that marginal improvements of hypotheses through preregistration are viewed as a success and as an ‘acceptable surrogate’ for theory development, in fields where theory building is significantly harder to initiate or sustain, or in which the longer-term benefits of theory are unclear. We leave it to researchers to assess how difficult it is to generate hypotheses in the absence of theory in their field, and what the impact of preregistration on theory development could be in each case.
4.1.2. Scenario E: The Role of Hypothesis Preregistration in Developing Theories
Consider now a different scenario, (E): a field of research that largely works with comparatively underdeveloped theories, but is further along the spectrum of theoretical development than the fields in (D). This scenario corresponds to research type Z in Trafimow’s (2003) classification: this is the same as Y except that, given the availability of theories, the purpose is not to maximize the information to be gained from theory-free tests of hypotheses, but to maximize the change in confidence in the theory, that is, the difference between the probability of the theory given the result that the hypothesis is true, P(T|H), and the prior probability of the theory, P(T). Trafimow demonstrates that, in order to pursue that goal, the rational strategy would be to test hypotheses that have high prior probability if the theory is true, i.e., P(H|T) is close (but not too close) to 1, and low prior probability if the theory is false, i.e., P(H|¬T) is close (but not too close) to 0. This means that one should test hypotheses that are both plausible in the context of the relevant theory and specific to that theory. An example of a field currently in this stage may be linguistics: many of its theories are largely formalized, their empirical consequences can be derived with significant intersubjective agreement, mathematical meta-theories are available and used to compare theories (e.g., in equivalence proofs), and several of its formal constructs can be implemented as computational models, leading to qualitative and quantitative predictions (Baggio, 2020; Nefdt, 2023). There is a division of labor here between theory and preregistration. Theory supports the generation of hypotheses and the development of the apparatus needed for their assessment, including estimates of prior probabilities, theoretical risk, and projected information gains and changes in confidence.
Hypothesis preregistration may help communities identify, among all the hypotheses generated by current theories, those that are worth pursuing empirically, that require testing for their development, or that are controversial or represent stumbling blocks in a field’s path forward. Again, hypothesis preregistrations would be ideally made public right away upon registration to serve this purpose. Preregistration of methods would still maintain the role it has in situations where research is not theoretical (see above).
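The change in confidence described for scenario (E) can be written out explicitly. The following is a sketch that assumes P(T|H) is computed with Bayes' theorem; the symbol Δ and all numerical values are illustrative assumptions and are not taken from Trafimow (2003).

```latex
\[
P(T \mid H) = \frac{P(H \mid T)\,P(T)}{P(H \mid T)\,P(T) + P(H \mid \neg T)\,\bigl(1 - P(T)\bigr)},
\qquad
\Delta = P(T \mid H) - P(T).
\]

% Hypothesis that is plausible under T and specific to T (assumed values):
\[
P(T) = 0.5,\; P(H \mid T) = 0.9,\; P(H \mid \neg T) = 0.1:
\quad
P(T \mid H) = \frac{0.45}{0.45 + 0.05} = 0.90,
\quad
\Delta = 0.40.
\]

% Hypothesis that is plausible under T but not specific to T (assumed values):
\[
P(T) = 0.5,\; P(H \mid T) = 0.9,\; P(H \mid \neg T) = 0.8:
\quad
P(T \mid H) = \frac{0.45}{0.45 + 0.40} \approx 0.53,
\quad
\Delta \approx 0.03.
\]
```

With these assumed numbers, a hypothesis that would be almost as likely if the theory were false barely changes confidence in the theory, whereas one that is specific to the theory changes it substantially.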
In general, preregistration could facilitate the accumulation and community-driven development of pursuitworthy hypotheses. As in (D), researchers could build and develop their hypotheses, not only based on previous results, but also on other registered hypotheses. But in contrast to (D), in this scenario the hypothesis space would progressively acquire structure, via the combined effects of theory and preregistration: theory generates and internally validates hypotheses, preregistration helps identify pursuitworthy hypotheses that require the most rigorous empirical assessment, and that therefore must be protected from such practices as HARKing, hypohacking etc.
However, even (E) is not entirely unproblematic. In a community where preregistration is highly prized and rewarded, theory development may risk being viewed as a preliminary step to devising testable hypotheses that can be preregistered, and may be pursued primarily to that end. Notably, the best theories in science give rise to diverse constructs, not only to testable hypotheses that may be worth preregistering. Moreover, different constructs require specific development and evaluation practices (van Rooij & Baggio, 2020, 2021), and preregistration will not be applicable to all of them. This type of instrumentalist bias vis-à-vis theory could be offset if the push to preregister a field’s most promising hypotheses fed back into theory development as a request for theorists to identify, from within the theory, those hypotheses that are worthy of being pursued empirically, that should be shielded from bias and problematic behavior (e.g., HARKing), and that can lead to new results with greater information gains and changes in confidence levels for theories.
4.1.3. Scenario F: The Role of Hypothesis Preregistration in Advanced Theories
The latter point raises a question that we already touched upon in section 2: there are multiple paths to achieving certain goals in any field of research, and nothing that can be achieved through preregistration can be achieved only through it. Nevertheless, in practice, the situations in which preregistration is entirely redundant are few and far between. Consider a new scenario, (F), where a research field enjoys advanced theories that are not only mathematized, inferentially controlled, explanatory, and predictive, but also objects of significant agreement in the relevant community: all researchers know what the ‘standard theories’ are, their theoretical and empirical implications, what knowledge gaps remain, what crucial experiments may look like, and the possible outcomes of such experiments. Advanced theories, in addition to other virtues, have a public dimension that allows the community to rally around the core issues and challenges that ought to be addressed in order for the field to make progress. Some areas of physics (e.g., particle physics and cosmology) provide very good examples of advanced theories in this sense, but so would areas of science that build on knowledge of complex molecules (e.g., genetics, molecular biology, biochemistry), areas of mathematics, computer science etc. The question is whether hypothesis preregistration could have a role in advancing science in this type of scenario or whether it would be wholly redundant. Existing shared knowledge of pursuitworthy hypotheses could make HARKing and hypohacking pointless, as few would accept hypotheses other than those that follow from the ‘standard theory’. The theory itself, as long as it is widely accepted and understood, performs the function of a public repository of hypotheses (Muthukrishna & Henrich, 2019). Thus, there would be little or no need for hypothesis preregistration as such.
Methods preregistration could still play a role in fields with advanced theories and experimental research programs, even where theories constrain research methodologically, e.g., on what measurements should be performed or how the outcomes should be analysed. The goal of methods preregistration is the reduction of uncertainty about epistemic risks associated with methodological aspects of research (Peikert et al., 2023), whereas the theory would be in most cases sufficient to fully assess, justify, and communicate the nature, source, and extent of the epistemic risks, expected information gains, and changes in confidence in the theory. Advanced theories allow researchers to understand and control the epistemic risks they may take, and to test a wide array of hypotheses, from low to high prior probability, given the theory.
State of theory | Hypothesis preregistration (HP) is neither widely used nor recommended | Hypothesis preregistration (HP) is sufficiently widespread |
Absent theory | Functions: HP prevents questionable research practices such as HARKing, if conducted with high producibility and consistency. Opportunities: HP could make some researchers think more deeply about hypotheses. Risks: HP serves researchers’ individual preferences (FSP: Future Success Preference); only hypotheses believed to confirm or corroborate investigated questions might be preregistered. (Example scenario: A) | Functions: HP prevents questionable research practices such as HARKing, if conducted with high producibility and consistency. HP encourages researchers to learn about other registered hypotheses before formulating and registering theirs, if other hypothesis registrations are made public. HP motivates development of higher-quality hypotheses. Opportunities: HP prompts researchers to improve the quality of their hypotheses and might indirectly give impulse to early theory development. Risks: Researchers focus excessively on hypotheses with larger prior probabilities instead of more informative hypotheses. Improvements to hypotheses could be regarded as a surrogate for theory development. (Example scenarios: B, C, D) |
Developing theory | Functions: HP prevents questionable research practices such as HARKing, if conducted with high producibility and consistency. HP can nudge researchers to think more carefully about pursuitworthy hypotheses from the space of possible hypotheses stemming from given theories. Opportunities: HP can facilitate accumulation of hypotheses and start engaging the scientific community in hypothesis development. Risks: Unknown (Scenario E) | Functions: HP prevents questionable research practices such as HARKing, if conducted with high producibility and consistency. HP nudges researchers to choose pursuitworthy hypotheses. HP assists the accumulation and community-driven development of pursuitworthy hypotheses. Opportunities: HP facilitates further theory development. Risks: Theories might be constructed only for the sake of testable hypotheses that can be preregistered. (Scenario E) |
Advanced theory | Functions: HP is redundant but might be practiced together with other pre-specifications (e.g., preregistration of methods and analyses) that would restrict researchers’ degrees of freedom and ease access to a priori decisions. (Scenario F) |
5. Conclusions and Recommendations
In this article, we have tried to map out and describe a range of situations in which hypothesis preregistration could interact with theory development. Our proposed typology includes scenarios in which synergies could be envisaged between theory and preregistration, as well as scenarios in which the success of one could compress the space that the other could occupy. A generalization emerging from our discussion is that, in certain cases, hypothesis preregistration could have indirect beneficial effects on theory development by helping increase the quality of hypotheses. However, as theories become more ‘mature’ (theoretically more advanced, more widely accepted, and better understood), hypothesis preregistration may lose its impact on theory advancement. Yet, this is likely to hold for a limited number of fields of science. In many research areas, theories are less developed and agreed upon, and those are the cases where preregistration can have the most benefit and deliver on its promise to improve empirical science. Crucially, there is no guarantee that a field of science will be able to develop advanced or good theories. That makes preregistration a potentially permanent practice in some fields, to support transparency, reliability, and community-driven development of pursuitworthy hypotheses.
Thus, researchers who choose to adopt preregistration in their own research could: (1) benefit from pre-specification of hypotheses, without necessarily preregistering other aspects of their work; and (2) check the existing space of publicly available preregistrations in the field when choosing pursuitworthy hypotheses, to support community-driven hypothesis development. Those who conduct meta-scientific studies could: (1) critically examine arguments for or against preregistration that fail to take into account a field’s theoretical advancement and how that may impact the need for and utility of hypothesis preregistration; (2) assess how hypotheses are generated, identified, and selected in a given field, and consider whether and how theory and preregistration do or could positively affect that process; (3) promote discussions on different ways of assessing theoretical advancement in a given research field, in addition to discussions on the role of hypothesis preregistration in the same field; and (4) promote ways of incorporating statements on the (lack of) theoretical motivation for hypotheses, e.g., in Registered Reports, such that it is clear whether the hypotheses under test follow from a theory, and how, and what the functions of preregistration are in relation to such theoretical motivation.
Audience | Hypothesis preregistration is neither widely used nor recommended | Hypothesis preregistration is sufficiently widespread |
Researchers | If you choose to adopt preregistration in your own research, you could benefit from pre-specification of hypotheses even if you do not preregister other aspects of the work. | Check the existing space of publicly available preregistrations in the field when choosing pursuitworthy hypotheses to support community-driven hypothesis development. |
Meta-scientists | Critically examine arguments for or against preregistration that do not take into account a field’s theoretical advancement and how that may impact the need and utility of hypothesis preregistration. Assess how hypotheses are generated, identified and selected in a particular field, and consider whether and how theory and preregistration do or could positively affect that process. | Promote discussions on ways of assessing theoretical advancement in a given research field, in addition to discussions on the role of hypothesis preregistration in the same field. Promote ways of incorporating statements on the (lack of) theoretical motivation for hypotheses, e.g. in Registered Reports, such that it is clear whether or not the hypotheses under test follow from a theory, and how, and what the functions of preregistration are also in relation to such theoretical motivation. |
Competing Interests
The authors declare that there were no conflicts of interest with respect to the authorship or the publication of this article.
Author Contributions
Both authors have approved the manuscript, contributed to this work in a meaningful way, and agree with its publication in Collabra: Psychology.
Footnotes
1. For a full exposition of the argument with examples, see Baggio, G., Why theory matters: New arguments for an old conclusion, ERIM Research Transparency Campaign, Erasmus University Rotterdam: youtu.be/IXlTBphsqUc.
2. About this question, see the articles collected in the special issue of Computational Brain & Behavior: ‘What makes a good theory? Interdisciplinary perspectives’; van Rooij et al. (2024).
3. For a discussion of preregistration in the context of the accommodation vs prediction distinction, see Choi (2024).
4. There is a possible exception in correlational ‘big data’ research, such as genome-wide association studies (GWAS), where theory may not be necessary to justify the claim that the hypothesis under test has low prior probability: if the search space is large (e.g., is defined by all genetic variants, genomic risk loci etc.), the very act of selecting one or a few candidates from it, also based on previous research, will yield hypotheses with low prior probability. Eventually, however, once the search space has been significantly narrowed down, theoretical considerations may come into play to justify claims to low prior probability, high risk, high projected information gains etc. Another possible and related exception is confirmatory research on empirical hypotheses in the absence of (strong) theoretical frameworks. In that case too, theory eventually becomes necessary to justify claims on prior probabilities, risks, and information gains.