Momentary fluctuations in children’s cognitive performance offer a rich source of signal. To understand this variation, we turn to the field of ADHD research. Children with ADHD exhibit high variability in reaction times. This phenomenon is accompanied by multiple theory-driven predictions about its candidate causes. However, suboptimal methods and datasets have obstructed empirical insights. To address this, we identify and resolve three sources of heterogeneity. We isolate a theoretically motivated estimate of reaction-time variability using dynamic structural equation modeling and examine its association with developmental problems. Our findings reveal a specific association between symptom severity in the inattention domain and reaction-time variability in a population-based cohort of 1032 children aged 5.5 to 13.5 years in the Netherlands. Moreover, by combining a unique task design with latent difference-score models, we provide support for the mechanistic hypothesis that attentiveness drives reaction-time variability. We conclude with three hypotheses for researchers interested in examining children’s development through its momentary dynamics.

Hundreds of empirical replications have shown that a substantial proportion of children diagnosed with Attention-Deficit Hyperactivity Disorder (ADHD) show elevated variability in reaction-times during cognitive testing (Castellanos & Tannock, 2002; Karalunas et al., 2014; Kofler et al., 2013; Leth-Steensen et al., 2000). This robust association has filled the highly desired role of a cognitive-correlate of a mental disorder, making the ongoing search for the mechanisms driving reaction-time variability a research priority. However, reliance on suboptimal methods and datasets has slowed down progress, leading to substantial heterogeneity in the processes that reflect reaction-time variability within and across studies. This complicates efforts to map reaction-time variability to candidate mechanisms, urging us to address the methodological morass head on. In the present paper, we apply innovations from relevant scientific fields to resolve three methodological problems and offer new insights into standing substantive disputes in the field. Our approach can be conceived of as a three-stage explanatory funnel where we move from high-level descriptive questions toward mechanisms while addressing methodological problems to lay a solid foundation for each subsequent stage.

Our ability to understand reaction-time variability is contingent on our ability to isolate the construct using suitable measures. Thus, the operationalization of reaction-time variability sits at the top of our funnel and affects all subsequent inferences. Most substantive interpretations of reaction-time variability within the ADHD literature converge on the conclusion that variability is composed of randomly occurring fluctuations (Geurts et al., 2008, p. 1; Karalunas et al., 2013, 2014; Kofler et al., 2013; van Belle et al., 2015). However, classical approaches such as the intraindividual standard deviation (i.e. the root-mean-square deviation of the reaction time at each trial from a person’s mean reaction time, $iSD_i = \sqrt{\sum_{t=1}^{T_i} (y_{i,t} - \bar{y}_i)^2 / T_i}$, where $y_{i,t}$ is the reaction time of person $i$ on trial $t$, $\bar{y}_i$ is that person’s mean, and $T_i$ is their number of trials) and the intraindividual coefficient of variation (the iSD divided by a person’s observed mean score, $ICV_i = iSD_i / \bar{y}_i$) do not isolate variability in the way this definition demands. Instead, they conflate the effects of multiple processes within a cognitive time series, leading to biased estimates and suboptimal inferences. For instance, it has been shown that systematic changes in time-series data, in the form of trends, greatly inflate variability estimates (Wang et al., 2012). This means that a child who steadily improves, or worsens, without short-term fluctuations around that trend may be mistaken for a child who demonstrates high variability, clouding our understanding of the construct of interest.
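The inflation of the iSD by a trend can be illustrated with a small simulation. The sketch below is purely illustrative and is not part of the analyses reported here: the two simulated children, the noise level, the slope, and the helper names (`isd`, `icv`, `detrended_sd`) are all assumptions chosen for the example. Both children share exactly the same trial-to-trial fluctuations; one of them additionally speeds up steadily across trials.

```python
import math
import random
import statistics


def isd(series):
    """Intraindividual SD: root-mean-square deviation from the person's own mean."""
    mean = statistics.fmean(series)
    return math.sqrt(sum((y - mean) ** 2 for y in series) / len(series))


def icv(series):
    """Intraindividual coefficient of variation: iSD divided by the person's mean."""
    return isd(series) / statistics.fmean(series)


def detrended_sd(series):
    """Root-mean-square residual around a least-squares linear trend over trials."""
    n = len(series)
    trials = list(range(n))
    t_mean = statistics.fmean(trials)
    y_mean = statistics.fmean(series)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(trials, series)) / sum(
        (t - t_mean) ** 2 for t in trials
    )
    residuals = [y - (y_mean + slope * (t - t_mean)) for t, y in zip(trials, series)]
    return math.sqrt(sum(r ** 2 for r in residuals) / n)


# Hypothetical data: identical fluctuations (SD of about 20 ms) for both children.
rng = random.Random(1)
noise = [rng.gauss(0, 20) for _ in range(40)]
steady_child = [500 + e for e in noise]                          # no trend
improving_child = [600 - 5 * t + e for t, e in enumerate(noise)]  # 5 ms faster per trial

print("iSD steady:", round(isd(steady_child), 1))
print("iSD improving:", round(isd(improving_child), 1))
print("detrended SD steady:", round(detrended_sd(steady_child), 1))
print("detrended SD improving:", round(detrended_sd(improving_child), 1))
```

Under these assumptions the iSD of the improving child is several times larger than that of the steady child, even though their short-term fluctuations are identical, whereas the detrended estimates coincide. Simple linear detrending is used here only to make the point; it is not the modeling approach adopted in this paper.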

At the second stage, we delve into our first research aim: to test the phenotypic (a)specificity of reaction-time variability. We want to know whether reaction-time variability is specifically related to greater ADHD-symptom severity, or whether it is associated with symptoms across multiple domains of developmental problems, namely emotionality, conduct problems, prosociality, and peer relations. We deem this question pertinent because it directly ties into the broader clinical landscape where dispute among competing theories largely centers around questions of splitting versus lumping (e.g. symptoms versus disorders versus spectra; Borsboom et al., 2019; Kotov et al., 2017; DSM-5; American Psychiatric Association, 2013). Moreover, despite more than 300 studies on the association between reaction-time variability and ADHD, whether reaction-time variability is a specific correlate of ADHD is still unclear (Karalunas et al., 2014; Kofler et al., 2013; Salum et al., 2019). A likely culprit for this uncertainty is the heterogeneity within reaction-time measures that may complicate mapping differences in cognitive fluctuations to differences in specific behavioral profiles. This heterogeneity is further exacerbated by ignoring the possibility that the domains of inattention and hyperactivity/impulsivity are qualitatively distinct (e.g. Willcutt et al., 2012; subdomains were lumped into a unitary ADHD construct in 278 of 319 studies according to Kofler et al., 2013) and naturally dimensional; forcing them into a single dichotomous construct leads to information loss and construct instability (Kotov et al., 2017; Lahey et al., 2022).

At the third stage, we wish to test mechanistic hypotheses about the causes of reaction-time variability and their association to individual differences in developmental problems. Here we face another source of heterogeneity which stems from the multifaceted nature of cognitive tasks and their differential interaction with diverse individuals. The multifactorial nature of cognitive tasks creates an interpretative problem when attempting to probe the active mechanisms that drive characteristics of cognitive performance (Schweizer, 2007), such as variability. When cognitive tasks are viewed as a person-task interaction, it becomes clear that diversity can also arise from subjects (e.g. Anderson, 1992; Rommelse et al., 2015). For instance, Biesmans et al. (2019) showed that amongst lower intellectual ability groups, tasks measuring executive functions may measure basic cognitive processes, such as processing speed, rather than distinct executive functions. This means that tasks may measure different constructs depending on the ability of the population under study, as well as depending on the degree and manner in which processing speed is taken into account (Rommelse et al., 2015; Santegoeds et al., 2022). Therefore, a crucial step towards discovering the mechanisms underlying reaction-time variability resides in the design and use of tasks that modulate key cognitive components, thus facilitating insight. These tasks should ideally allow us to separate the influence of candidate mechanisms from other primary and higher-order cognitive processes and measure the same processes across the ability spectrum.

We aim to remedy these challenges by combining a unique study design with psychometric innovations. First and foremost, we introduce an innovative statistical approach that allows us to isolate a distinct component of reaction-time variability from other aspects of cognitive performance (Figure 1). This approach is dynamic structural equation modeling (DSEM), which unifies innovations in time-series modeling, structural equation modeling, and multilevel modeling to open new possibilities for inquiry into individual differences in within-person variability (Asparouhov et al., 2018). Second, we will apply DSEM to a large cohort of 1032 children aged 5.5 to 13.5 years old who completed a unique battery of cognitive tasks, the COTAPP (Rommelse et al., 2018). The COTAPP is a self-paced cognitive task battery in which the required cognitive demands (e.g., interference control, sustained performance over time) vary across blocks. This design allows us to isolate distinct task demands that are hypothesized to affect (differences in) reaction-time variability, while also taking differences in processing speed into account (Santegoeds et al., 2022). Third, psychopathology assessments will be taken from two continuous teacher-report questionnaires: a measure of multiple domains of developmental problems (namely emotionality, conduct problems, prosociality, ADHD, and peer relation problems) and a measure that separates ADHD symptoms into two continuous subdomains, inattention and hyperactivity/impulsivity. Thus, we tackle the issue of forcing distinct symptom dimensions into catch-all diagnostic dichotomies.

We use these tools to test four non-preregistered mechanistic hypotheses explaining the link between ADHD and reaction-time variability, which we derived directly from the literature and summarize in Table 1. Moreover, we use our task design to address a practical supplementary aim: deciding which task produces an estimate of reaction-time variability that is most predictive of psychopathology severity. Lastly, our research questions regarding the specificity of the association between ADHD and reaction-time variability are as follows (we did not hold any a priori expectations about the results of these questions):

  1. Do children with greater reaction-time variability specifically show greater ADHD-symptom severity, or is this pattern present across multiple developmental problem domains?

  2. Do children with greater reaction-time variability specifically show greater inattention- or hyperactivity-symptom severity, or is the association present across both domains?

Table 1.
Four non-preregistered hypotheses tested through the subtractive method
Hypothesis 1 Children with a higher severity of inattention and/or hyperactivity symptoms will show a greater decrease in reaction-time variability after incentivizing speed and accuracy in performance. 
Theory Adaptive gain theory of locus coeruleus function (Aston-Jones & Cohen, 2005)  
Background This hypothesis is a behavioral prediction from the adaptive gain theory, whereby reaction-time variability should decrease under task conditions with high perceived task-utility (Aston-Jones & Cohen, 2005). The adaptive gain theory of locus-coeruleus (LC) function, proposes that LC function can transition between two states: a phasic state and a tonic state (Aston-Jones & Cohen, 2005). Transitions between these two states are modulated by perceived task-utility which is monitored by the orbitofrontal cortex (OFC) and the anterior cingulate cortex (ACC), both of which provide direct inputs to the LC. The phasic state is triggered when perceived task-utility is high. Here phasic norepinephrine (NE) activity to task-stimuli is high and tonic NE activity is low. This leads to exploitative behavior that optimizes responding to task-relevant stimuli and diminishes the response to task-irrelevant stimuli, which manifests as low reaction-time variability. The tonic phase is triggered under low perceived task-utility. The tonic phase corresponds to low phasic responsivity to task-related stimuli and high tonic activation. This leads to exploratory behavior which serves the search for new rewarding stimuli and manifests as high reaction-time variability. 
Test We will test our first hypothesis by comparing children's reaction-time variability on a 2-choice reaction time task with a rewarded choice reaction-time task. Our reasoning is that rewarding speed while maintaining accuracy will increase the perceived utility of task-related stimuli, leading to a phasic state and a decrease in reaction-time variability. The degree of difference should be greater in children with a higher severity of inattention and/or hyperactivity symptoms, given that attention disorders have been proposed to be characterized by predominantly tonic responding (Aston-Jones et al., 1999; Hauser et al., 2016). Conversely, children with lower inattention and/or hyperactivity severity are likely to show a smaller increase in exploitative behavior due to ceiling effects (e.g. Liddle et al., 2011).
Hypothesis 2 Children with a higher severity of hyperactivity symptoms, but not necessarily inattention symptoms, will show a greater increase in reaction-time variability following a spatial interference manipulation. 
Theory Behavioral inhibition deficit model (Barkley, 1997)
Background This hypothesis stems from the behavioral inhibition deficit (BID) model, which predicts that children will behave more variably in tasks where they need to inhibit a prepotent or ongoing response to perform successfully (Barkley, 1997). The BID model posits that inhibition deficits are a core aspect of ADHD. This behavioral inhibition deficit is hypothesized to manifest as behavioral variability in reaction-time tasks, either directly or through its indirect effects on executive functioning (Barkley, 1997). It is worth noting that Barkley’s BID model, while influential in the field, is no longer widely accepted. However, the role of behavioral inhibition as a candidate driver of reaction-time variability has been adopted in more modern theories, either as a possible mediating factor between working memory deficits and reaction-time variability (Rapport et al., 2008) or as a direct consequence of behavioral inhibition deficits (Sonuga-Barke et al., 2010). We focus on the BID model due to its primacy in putting forth behavioral inhibition as a driver of reaction-time variability.
Test To test this hypothesis, we will compare the 2-choice reaction time task with the interference choice reaction-time task. Further, within Barkley's framework behavioral inhibition deficits are primarily linked to hyperactivity, while inattention can be seen as a consequence of inhibition difficulties (Barkley, 1997). Barkley makes a qualitative distinction between the type of inattention that arises as a byproduct of inhibition difficulties and the inattention that resides in the inattentive type of ADHD. Therefore, the degree of difference in reaction-time variability between the two tasks should be greater in children with greater hyperactivity severity, but not necessarily inattention severity. 
Hypothesis 3 Children with higher inattention and/or hyperactivity severity will show a greater difference in their reaction-time variability following approximately 30 minutes of cognitive testing designed to tax their sustained attention. 
Theory Adaptive gain theory (Aston-Jones & Cohen, 2005)  
Background Attentional lapses are posited as a cause of reaction-time variability across multiple theoretical frameworks (see Kofler et al., 2013 for an overview). Through the lens of the adaptive gain theory, attentional lapses are a byproduct of the highly distractible state that accompanies the extreme bounds of LC activity (Aston-Jones & Cohen, 2005; Unsworth & Robison, 2018). Prior empirical studies have shown that as time-on-task increases people tend to shift to the lower bound of LC activity, i.e. arousal and alertness decrease, leading to an increase in attentional lapses and reaction-time variability (Aston-Jones et al., 1994, 2007; Unsworth & Robison, 2018). Children with ADHD have well-documented impairments in sustained attention (see Rommelse et al., 2011 for a review). In fact, studies have shown that their sustained attention declines faster than that of controls (Bubnik et al., 2015; Huang-Pollock et al., 2006, 2012). Thus, we expect the effect of our manipulation to be more pronounced for children with greater inattention and/or hyperactivity severity.
Test To test this hypothesis, we will compare the 2-choice reaction-time task with a vigilance choice-reaction-time task that is identical to the 2-choice reaction-time task but is presented after approximately 30 minutes of cognitive testing.
Hypothesis 4 Children with higher inattention and/or hyperactivity symptoms will show a greater increase in reaction-time variability following a longer intertrial interval. 
Theory Cognitive energetic model (Sergeant, 2004)  
Background This hypothesis stems from the cognitive energetic model (CEM) and provides a different explanation for attentional lapses within ADHD, by proposing that state regulation deficits lead to attentional lapses and subsequently behavioral variability (Kofler et al., 2013; Sergeant, 2004). Akin to the adaptive gain theory, the CEM proposes that deficient arousal regulation underlies variability. However, arousal in the CEM is linked to stimulus encoding and phasic NE release, drawing on earlier work by Pribram & McGuinness (1975). In contrast, Aston-Jones & Cohen's notion of phasic NE release is empirically linked to central decision processing (Karalunas et al., 2014). The CEM posits that children with ADHD have deficient arousal regulation and that this is exacerbated in tasks with slower event rates (i.e. longer time between trials; Sergeant, 2000).
Test To test this hypothesis, we will compare the vigilance reaction-time task (to control for differences due to sustained attention) with the event-rate reaction-time task. According to the CEM, we expect to see a difference in reaction-time variability following the event-rate manipulation, and this difference should be greater in children with greater inattention and/or hyperactivity severity.

Transparency and openness

We include open-access data and report all data preprocessing steps, including exclusions and transformations. We report our sample characteristics and all aspects of the cognitive testing battery. Data and code for all our analyses, preprocessing steps, and data visualization are available at https://osf.io/pwfyk/?view_only=f657f076b90a46eeb7aa74c1f58fc6d7. Data were analyzed using R version 4.1.3 (R Core Team, 2022) and Mplus v8.5 (Muthén & Muthén, 2017). We include an extensive supplement which thoroughly documents our results. Our analyses and hypotheses were not preregistered. The collection of the analyzed sample was approved by the Central Research Committee on Research involving Human Subjects Arnhem-Nijmegen in March 2015 (number NL49249.091.14).

1.1. Participants

We analyzed a sample of 1032 children aged 5.5 to 13.5 years who completed the Cognitive Task Application (COTAPP), a computerized block-wise cognitive testing battery (Rommelse et al., 2018). This is a population-derived sample of children in the Netherlands. To achieve country-wide coverage, children were selected from twenty-two elementary schools based on postal codes. The ethnicity distribution, based on migration status, closely resembles the distribution in the Dutch population (i.e. positive migration status in-sample 26.1 percent versus 25 percent in the population). The distribution of IQ-scores also closely reflects the IQ-scores in the population (mean = 100.5, SD = 16.2; as measured by the vocabulary and block pattern subsets of the Wechsler Intelligence scale, 3rd edition or the Wechsler preschool and primary scale, 3rd edition). The exception is the 12- to 13-year-old children within our sample, who scored lower than their norm group. There is a weak negative correlation between age-adjusted IQ-scores and age (r = -0.11, p = 0.01). Ethnicity, age, and gender are not associated in our sample. Although COTAPP has made considerable efforts towards inclusivity and representativeness compared to other studies, we cannot claim perfect representativeness of the Dutch population in that age range as a whole.

1.2. Procedure and materials

1.2.1. Neuropsychological assessment

We analyzed six blocks of cognitive tasks included in the COTAPP (Rommelse et al., 2018), as illustrated in Figure 1. Each task is a variation of a choice reaction time task, with manipulations designed to elicit more specific information depending on the manipulation. Tasks are completed by children in the following sequence, with every task starting after a short series of practice trials. First, the simple reaction time task assesses basic information processing speed across 20 trials (block 1). The participant is instructed to respond to a target stimulus that appears in the center of the screen, as soon as it appears, using a single response key. Second, the 2-choice reaction time task assesses higher-order information processing speed across 30 trials (block 2). The participant needs to press the correct response-button from two options which are tied to a different target stimulus (“yes-button” for X-stimulus and “no-button” for Y-stimulus). The target stimulus appears at the center of the screen. The simple reaction time task and the 2-choice reaction time task differ in terms of cognitive load. Third, the rewarded choice-reaction-time task assesses the effect of algorithmically administered rewards on higher-order processing speed across 40 trials (block 3). The task is identical to the 2-choice reaction-time task with the addition of a dynamic tracking algorithm which administers a reward each time a participant exceeds their individual response speed while remaining accurate. The initial individual response speed for each participant is based on their performance in the 2-choice reaction time task. After each rewarded response, the response speed that needs to be exceeded increases, while after each incorrect and/or slow response (in relation to current speed limit) the response speed decreases. 
By comparing the 2-choice reaction time task and the rewarded choice reaction-time task we can assess the effect of external incentives on a participant’s higher-order response speed. Fourth, the interference choice reaction-time task assesses interference control across 40 trials (block 4). The task is identical to the 2-choice reaction time task, except that the target stimulus does not appear in the center of the screen. Instead, the target stimulus appears on either the right or the left side of the screen. If the target stimulus appears on the same side as the response key that it is tied to, the trial is congruent. Otherwise, the trial is incongruent. Incongruent trials evoke two contrasting responses: one from the stimulus-key association and one from the spatial location (e.g. stimulus X tied to the button on the right side appears on the left side of the screen). Stimulus congruency is evenly split across trials and randomly ordered. The order is constant across participants. Comparing the interference choice reaction-time task with the 2-choice reaction-time task allows us to assess interference control. Fifth, the vigilance choice-reaction-time task assesses sustained attention by using the same paradigm as the 2-choice reaction-time task over 30 trials (block 7A). At this point, participants have completed an entire test battery, and the discrepancy between the vigilance choice-reaction-time task and the 2-choice reaction-time task can be used to assess the effect of taxing children’s sustained attention. Sixth, the long inter-trial-interval (ITI) choice-reaction-time task, also referred to as the event-rate task, assesses arousal regulation capacity across 20 trials (block 7B). The inter-trial interval is increased from 450-750 ms to 3000-6000 ms. The extended inter-trial interval makes it more difficult to maintain arousal. By comparing the vigilance choice-reaction-time task with the event-rate choice-reaction-time task, we can gauge participants’ arousal regulation capacity.
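The dynamic tracking rule used in the rewarded task (block 3) can be sketched in a few lines. This is a minimal illustration of the logic as described above, not the COTAPP implementation: the class name `RewardTracker`, the fixed 10 ms step, and the starting limit are hypothetical, and the actual algorithm may adjust the limit differently.

```python
from dataclasses import dataclass


@dataclass
class RewardTracker:
    """Tracks the response-time limit a child must beat to earn a reward."""

    limit_ms: float        # current limit; assumed to be seeded from 2-choice performance
    step_ms: float = 10.0  # hypothetical adjustment size per trial

    def update(self, rt_ms: float, correct: bool) -> bool:
        """Process one trial and return True if the trial is rewarded."""
        rewarded = correct and rt_ms < self.limit_ms
        if rewarded:
            # Fast and accurate: the next trial demands a faster response.
            self.limit_ms -= self.step_ms
        else:
            # Incorrect and/or too slow: the required speed is relaxed.
            self.limit_ms += self.step_ms
        return rewarded


# Hypothetical sequence of trials: (reaction time in ms, response correct?)
tracker = RewardTracker(limit_ms=500.0)
outcomes = [tracker.update(rt, ok) for rt, ok in [(450, True), (495, True), (400, False)]]
print(outcomes, tracker.limit_ms)
```

A rule of this kind keeps the reward criterion close to each child's own current speed, which is what allows the incentive to remain attainable across the ability spectrum.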

Figure 1.
Schematic representation of cognitive tasks used in the COTAPP assessment

1.3. Baseline teacher-report assessment

1.3.1. Behavioral measures

Strengths and Difficulties questionnaire (SDQ). The SDQ is a brief measure of psychopathology and prosocial behavior in four- to 16-year-old youths (Goodman, 1997). In our study, we analyzed the teacher report version of the SDQ. The 25 items of the SDQ are measured on a 3-point Likert scale (Not True, Somewhat True, Certainly True) and divided into five factors: emotional symptoms, conduct problems, ADHD problems, peer problems, and prosocial behavior. The grouping of items is based on factor analyses and current nosology (Goodman, 2001). A recent study on the dimensionality, age- and gender-invariance of the SDQ concluded that for children aged five to 14, a five-factor structure fits well and the factors are gender and longitudinally invariant (Murray et al., 2021). For a systematic review of studies assessing the psychometric properties of the SDQ see Kersten et al., 2016. For the item content of the SDQ see Appendix A in Goodman, 1997. In the present study, we used total scores for each of the five domains (i.e., aggregates of item scores) to represent children’s severity of developmental problems within the last six months or the current school year.

Figure 2.
Histograms for teacher-report ratings on SDQ developmental problem domains

Strengths and Weaknesses of ADHD-Symptoms and Normal-Behavior (SWAN) Scale. The SWAN scale (Swanson et al., 2001) is based on the 18 ADHD items included in the DSM-IV (American Psychiatric Association, 1994). The SWAN is designed to assess both adaptive functioning and maladaptive functioning on ADHD-related behaviors using a 7-point scale. Higher positive scores (i.e. between +1 and +3) represent maladaptive functioning, while higher negative scores (i.e. between -1 and -3) represent adaptive functioning (0 = average). For example, scoring a +3 on the item “Sustaining attention” indicates the child has pronounced problems with sustaining attention. The SWAN scale measures items within both the domains of hyperactivity/impulsivity and inattention. For an overview of studies assessing the psychometric properties of the SWAN scale see Brites et al., 2015. For the item content of the SWAN scale see Table 2 in (Swanson et al., 2001). We analyzed the teacher-report version of the SWAN scale in our study.

Figure 3.
Histograms for teacher-report ratings on SWAN inattention and hyperactivity scale

1.4. Pre-processing

Handling outliers and missing data. Outliers were identified and replaced with missing values using a two-step process that was identical for all tasks and participants. First, reaction times per task were log-transformed to reduce skewness (Supplementary Material Figures S41-S49). Second, to remove implausible values all reaction times below 100ms were regarded as anticipations and replaced with missing values (Luce, 1986).
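As an illustration, the two pre-processing steps can be sketched in Python as follows. This is a minimal sketch rather than the pipeline used in the study: the function name, the use of the natural logarithm, and the example values are assumptions, and the 100 ms threshold is applied on the raw millisecond scale before the transformation, which is equivalent to thresholding at log(100) afterwards.

```python
import math

ANTICIPATION_MS = 100.0  # threshold stated in the text


def preprocess_rts(rts_ms):
    """Return log reaction times; anticipations and missing trials become None.

    Assumption: natural log. The text does not state the base of the logarithm.
    """
    cleaned = []
    for rt in rts_ms:
        if rt is None or rt < ANTICIPATION_MS:
            cleaned.append(None)          # treated as a missing value
        else:
            cleaned.append(math.log(rt))  # reduces right skew
    return cleaned


# Hypothetical trial sequence for one child, in milliseconds.
example = [412.0, 87.0, 530.0, None, 100.0, 1250.0]
log_rts = preprocess_rts(example)
```

Keeping the missing trials in place, rather than dropping them, preserves the trial order that the time-series model relies on.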

1.5. Statistical analyses

Dynamic structural equation modeling. To capture the high-resolution temporal dynamics of response times across trials at the within-person level, we used a quantitative framework called Dynamic Structural Equation Modeling (DSEM; Asparouhov et al., 2018). DSEM is a broad modeling framework that allows us to reliably estimate static and dynamic characteristics of participants’ cognitive performance, interindividual differences in these characteristics, and how they relate to between- and within-person covariates. Four within-person parameters reflecting different facets of children’s cognitive performance, along with individual differences in the magnitude of these characteristics, can be estimated using DSEM: (a) response speed, which refers to a person’s mean reaction time; (b) reaction-time variability, which refers to the average amplitude of fluctuations around a person’s mean; (c) inertia, which is the extent to which the reaction time on the current trial can be predicted from the reaction time on the previous trial, and which captures the temporal order of fluctuations around a person’s mean; and (d) trend, which refers to systematic linear change in a person’s mean over time (Figure 4). Specifically, we use residual DSEM (RDSEM) to achieve stationarity while simultaneously modeling the trend and inertia parameters (McNeish & Hamaker, 2020). RDSEM models inertia on the within-person residuals to avoid violations of stationarity (Asparouhov et al., 2018). For simplicity we call our model DSEM, referring to the broader framework that subsumes RDSEM, but interested readers should keep in mind that the inertia parameter is modeled on the within-person residuals. Given valid concerns about the normality assumption imposed by DSEM, we tested the fit of our model to the data using graphical posterior predictive checks. The accompanying Stan code to specify DSEM in R and perform the graphical posterior predictive checks is provided at https://osf.io/pwfyk/?view_only=f657f076b90a46eeb7aa74c1f58fc6d7. These checks suggest that the DSEMs generated data that replicated the observed data structure well. Moreover, we refer the reader to a paper discussing the suitability of normal models to detect the presence and direction of effects in reaction-time data (Schramm & Rouder, 2019).
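To make the four within-person parameters concrete, the following Python sketch simulates one child's series as a mean plus a linear trend plus a first-order autoregressive residual, and then recovers the four quantities with simple moment estimators. It is a single-person, non-Bayesian simplification and not the multilevel model fitted in Stan: the function names, the parameter values, and the choice to summarize reaction-time variability as the innovation variance of the residual process are all assumptions made for illustration.

```python
import random


def simulate_child(n_trials, mean, trend, inertia, innovation_sd, seed=0):
    """Simulate log reaction times: mean + linear trend + AR(1) residual."""
    rng = random.Random(seed)
    centre = (n_trials - 1) / 2  # centring time makes `mean` the average level
    residual, series = 0.0, []
    for t in range(n_trials):
        residual = inertia * residual + rng.gauss(0.0, innovation_sd)
        series.append(mean + trend * (t - centre) + residual)
    return series


def describe_child(series):
    """Estimate speed, trend, inertia, and variability for one series."""
    n = len(series)
    centre = (n - 1) / 2
    times = [t - centre for t in range(n)]
    speed = sum(series) / n  # (a) mean reaction time
    # (d) least-squares slope on centred time
    trend = sum(t * (y - speed) for t, y in zip(times, series)) / sum(t * t for t in times)
    resid = [y - speed - trend * t for t, y in zip(times, series)]
    # (c) inertia: lag-1 regression of the detrended residuals on themselves
    inertia = sum(a * b for a, b in zip(resid[1:], resid[:-1])) / sum(r * r for r in resid[:-1])
    # (b) variability: variance of what remains after trend and inertia
    innovations = [a - inertia * b for a, b in zip(resid[1:], resid[:-1])]
    variability = sum(e * e for e in innovations) / len(innovations)
    return {"speed": speed, "variability": variability, "inertia": inertia, "trend": trend}


# Hypothetical child with 40 trials, as in one of the longer blocks.
child = simulate_child(40, mean=6.2, trend=-0.002, inertia=0.3, innovation_sd=0.25, seed=1)
estimates = describe_child(child)
```

With only 20 to 40 trials, per-person estimates of inertia and trend obtained this way are noisy, which is one motivation for estimating them jointly across children in a multilevel model.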

First, we wanted to know which of these four parameters are necessary to describe children’s cognitive performance and whether children show substantial interindividual variation in these parameters. We started by estimating the maximal DSEM, including all four fixed effects (a-d), each with a random effect capturing individual differences in that parameter. We then specified and compared a series of increasingly simple models that reflect non-preregistered hypotheses about the necessity of dynamic parameters, and of interindividual differences therein, to describe the time series of cognitive performance in our sample. We used the deviance information criterion (DIC; Spiegelhalter et al., 2002, 2014) to select among this series of nested models. The DIC supported the most complex DSEM, with 8 parameters, across all blocks except blocks 1 and 7B, where including the inertia parameter did not improve model fit. This may be because blocks 1 and 7B have the smallest number of trials and autoregression is the least reliably estimated parameter, as shown in prior simulation studies (Du & Wang, 2018). We decided to retain the inertia parameter across all tasks, since a sizeable literature shows that temporal dependency is a hallmark of human cognitive performance (Solfo & van Leeuwen, 2023; Wagenmakers et al., 2004). Thus, we would expect the maximal model to generalize better to future instances of the tasks in blocks 1 and 7B. Moreover, due to concerns about the reliability of the DIC (e.g. Asparouhov et al., 2018), we additionally show, using leave-one-out cross-validation, that allowing individual differences in reaction-time variability considerably improves model fit (see Supplemental Materials R). Interested researchers can replicate comparisons for all models using the provided Stan code. For detailed results see Supplementary materials F. For a more detailed version of the DSEM model specification and comparison section see Supplementary materials M.
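For readers unfamiliar with the DIC, the standard definition (Spiegelhalter et al., 2002) is the posterior mean deviance plus an effective number of parameters, where the latter is the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters. The Python sketch below computes it from posterior deviance draws. The function name and all numbers are hypothetical, and how the DIC was obtained from the fitted models in this study is not shown here.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, effective number of parameters pD).

    DIC = mean deviance + pD, with pD = mean deviance - deviance at posterior mean.
    """
    mean_deviance = sum(deviance_draws) / len(deviance_draws)
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d


# Hypothetical deviance draws for two competing models of the same data.
dic_full, pd_full = dic([2010.0, 2014.0, 2012.0], 2004.0)
dic_reduced, pd_reduced = dic([2030.0, 2034.0, 2032.0], 2028.0)

# The model with the lower DIC is preferred.
preferred = "full" if dic_full < dic_reduced else "reduced"
```

Because pD penalizes complexity, a more complex model is preferred only when its gain in fit outweighs its additional effective parameters.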

Second, we wanted to know whether interindividual differences in the selected model-parameters were related to interindividual differences in age and severity of developmental problems. This included the specification of three separate models, per cognitive task.

Age as covariate of DSEM parameters. After choosing the appropriate DSEM, we added age as a time-invariant covariate at the between-person level, since reaction-time measures are age-related. We regressed all the latent variables at the between-person level on age to assess whether individual differences in, e.g., reaction-time variability were significantly predicted by age. Age, like all subsequently included time-invariant covariates, was grand-mean centered. We grand-mean centered covariates because we wanted to interpret the intercept of each latent variable as its mean: when predictors have a mean of zero due to grand-mean centering, the intercept of the outcome can be interpreted as the mean (McNeish & Hamaker, 2020).
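The effect of grand-mean centering on the intercept can be checked with a small ordinary least-squares example in Python. This is a sketch with hypothetical ages and outcome values, using a single observed outcome in place of a latent variable; the helper name `fit_line` is an assumption.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum((a - x_bar) ** 2 for a in x)
    return y_bar - slope * x_bar, slope


# Hypothetical ages (years) and a hypothetical between-person outcome.
ages = [5.5, 7.0, 8.5, 10.0, 11.5, 13.5]
outcome = [0.90, 0.80, 0.65, 0.60, 0.50, 0.45]

grand_mean_age = sum(ages) / len(ages)
centered_ages = [a - grand_mean_age for a in ages]

intercept_raw, slope_raw = fit_line(ages, outcome)
intercept_centered, slope_centered = fit_line(centered_ages, outcome)
outcome_mean = sum(outcome) / len(outcome)
```

Centering leaves the slope unchanged and moves the intercept from the predicted outcome at age zero, which lies outside the sampled age range, to the sample mean of the outcome.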

Figure 4.
Schematic of full dynamic structural equation model, with all four parameters and random effects.

On the left-hand side we can see the within-person part of the model, where the four parameters are specified for each person. Each colored dot corresponds to a parameter; together, the parameters form the function that explains a child’s reaction times. One dot corresponds to mean response speed (a; denoted α in the “model specification” section). The pink dot corresponds to the reaction-time variability parameter (b). The green dot corresponds to the inertia parameter (c). The blue dot reflects the trend parameter (d). On the right-hand side we can see the between-person part of the model, where we specify the sample average for each of the within-person parameters and the extent to which individuals deviate from the average of each parameter, using random effects. Again, the colors denote the same parameters as in the within-person part of the model. The within-person model and between-person model are both estimated in one step.

SDQ dimensions and age as covariates of DSEM parameters. We added the grand-mean centered total scores of the five SDQ dimensions (emotionality, prosociality, conduct problems, ADHD, and peer relations) as covariates at the between-person level. We regressed all the latent variables at the between-subject level on the SDQ dimensions. Age was also maintained as a between-person, grand-mean centered covariate. All between-person covariates were allowed to covary freely.

SWAN dimensions and age as covariates of DSEM parameters. We specified and estimated a model in which the grand-mean centered total scores of the two SWAN dimensions (inattention and hyperactivity) and grand-mean centered age were between-person covariates. We regressed all the latent variables at the between-person level on the SWAN dimensions and age. The two SWAN dimensions and age were allowed to covary freely.

Third, we wanted to know which task manipulation produced the reaction-time variability estimate most sensitive to individual differences in the severity of inattention and hyperactivity symptoms. We used the associations between reaction-time variability from the 2-choice reaction time task and ADHD subdomain scores (i.e. SWAN total scores for inattention and hyperactivity) as the benchmark effect sizes. We tested whether different task manipulations led to an incremental benefit in sensitivity. To formally test the hypothesis that the dependent correlations between reaction-time variability and inattention and hyperactivity severity are equal across tasks, we used a series of Steiger’s z-tests, which compare the effect sizes of interest while taking into account the dependency between reaction-time variability estimates from different tasks (Steiger, 1980). We used the r.test function from the psych package in R to compute Steiger’s z-tests (Revelle, 2022).
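The logic of comparing two dependent correlations that share one variable can be sketched in Python with Williams's t, a test that Steiger (1980) discusses for this situation. Whether this matches the exact computation performed by r.test should be verified against the psych documentation; it is treated as an assumption here. In the sketch, variable 1 is inattention severity, variable 2 is reaction-time variability from the benchmark task, and variable 3 is reaction-time variability from a comparison task. The value of r23 is hypothetical.

```python
import math


def williams_t(r12, r13, r23, n):
    """Williams's t for H0: rho12 == rho13, with variable 1 shared.

    Returns (t, degrees of freedom), where df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r12 ** 2 - r13 ** 2 - r23 ** 2 + 2 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar ** 2 * (1 - r23) ** 3
    t = (r12 - r13) * math.sqrt((n - 1) * (1 + r23) / denom)
    return t, n - 3


# r12 = 0.18 and a difference of 0.10 echo the values reported in the text;
# r23 = 0.60 is a hypothetical correlation between the two variability estimates.
t_stat, df = williams_t(r12=0.18, r13=0.28, r23=0.60, n=1032)
```

A negative t indicates that the comparison task correlates more strongly with inattention than the benchmark task does. The stronger the correlation between the two variability estimates (r23), the more precisely their difference in sensitivity is estimated.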

Latent difference-score models for block comparison. Fourth, we wanted to test four non-preregistered hypotheses about the mechanisms driving reaction-time variability and their relation to individual differences in the severity of the ADHD domains of inattention and hyperactivity. We used a subtractive method, combining the unique task structure of COTAPP with latent difference-score models (Kievit et al., 2018; McArdle, 2009). Five cognitive tasks within the COTAPP use a 2-choice reaction time task at their core and then add various manipulations. This allows us to compare children’s reaction-time variability within the 2-choice reaction time task with tasks that differ in the implementation of a specific manipulation. Each task-specific manipulation perturbs a specific cognitive process, which enables us to test the effect that changing this process has on reaction-time variability. Using latent difference-score models, we can assess whether task-specific manipulations lead to within-person differences in reaction-time variability and whether individual differences in the degree of difference are associated with inattention and/or hyperactivity severity. We compared reaction-time variability from block 2 with blocks 3, 4, and 7A separately, and reaction-time variability in block 7A with block 7B. For detailed information about the specification of latent difference-score models see the online Supplementary materials N.
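The three quantities of interest in such a comparison (the mean change, the variance of the change, and the association of the change with a covariate) can be illustrated in Python with observed difference scores. This is a deliberately simplified analogue: the models in the study define the difference as a latent variable, whereas the sketch below subtracts observed values, and all names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def ols_slope(x, y):
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum((a - x_bar) ** 2 for a in x)


def difference_score_summary(rtv_base, rtv_manipulated, covariate):
    """Summarize the change in variability between two blocks for each child."""
    delta = [m - b for b, m in zip(rtv_base, rtv_manipulated)]
    return {
        "mean_change": mean(delta),              # average effect of the manipulation
        "variance_change": sample_variance(delta),  # individual differences in the effect
        "slope_on_covariate": ols_slope(covariate, delta),  # e.g. inattention severity
    }


# Hypothetical per-child variability estimates in two blocks, plus inattention scores.
summary = difference_score_summary(
    rtv_base=[0.40, 0.55, 0.35, 0.60, 0.50],
    rtv_manipulated=[0.45, 0.75, 0.40, 0.95, 0.60],
    covariate=[-1.0, 0.5, -1.5, 2.0, 0.0],
)
```

A positive mean change with a positive slope would correspond to the pattern in which the manipulation raises variability on average, and more so for children with higher scores on the covariate.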

Figure 5.
Schematic of latent difference score model used across block comparisons.

The parameters in purple denote the parameters of interest. (1) The mean and variance of the latent difference score, represented by a purple circle with “ΔRTV” written inside. These capture the average difference in reaction-time variability that is due to the task-specific manipulation, and individual differences in the degree of difference. (2) The association between differences in inattention severity and differences in the degree of difference in reaction-time variability, captured by the arrow going from the “inattention” box to the purple circle. “BLOCK 2” and “BLOCK 7A” are placeholder names representing the task from where estimates on reaction-time are taken for each task comparison. In this figure the model comparing the 2-choice reaction-time task (block 2) with the vigilance choice-reaction-time task (block 7A) is depicted.

Younger children show higher reaction-time variability

Older children consistently responded faster (βrange¹ = [-0.479, -0.598], FDRprange² < 0.001) and less variably (βrange = [-0.155, -0.398], FDRp < 0.001) across all tasks. Older children decreased their reaction times faster across trials (βrange = [-0.150, -0.295], FDRprange = [0.002, 0.009]) in most cognitive tasks, likely reflecting greater learning efficiency. Systematic change in reaction times was not significantly associated with age in the simple reaction-time task (β = -0.104, FDRp = 0.09) or the 2-choice reaction-time task (β = 0.021, FDRp = 0.374). This may be a byproduct of the simplicity of these tasks, which could have masked age-related differences in learning speed, or may simply reflect the unreliability of the trend parameter across the span of 20-30 trials. Older age was, on average, related to lower inertia only in the simple reaction-time task (β = -0.157, FDRp = 0.012). In the other cognitive tasks, there was no significant association between inertia and age (βrange = [-0.121, 0.037], FDRprange = [0.063, 0.290]). All associations between age and DSEM parameters remained after we controlled for accuracy scores. For a detailed summary of our results see Supplement I.
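The FDR-corrected p-values reported here and below adjust for the number of tests conducted. The specific procedure is not described in this section; assuming the widely used Benjamini-Hochberg method, the adjustment can be sketched in Python as follows. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four tests.
fdr_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each adjusted value can be compared directly with the chosen false discovery rate, for example 0.05.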

Figure 6.
Relationship between individual differences in age and individual differences in DSEM parameters.

Each row corresponds to a different task. Tasks are ordered in the temporal order they occurred, with the simple reaction-time task on the top row and the long ITI choice-reaction-time task on the bottom row. Each column represents the association between a different DSEM parameter and age.

Lower accuracy is associated with higher reaction-time variability

Accuracy was measured in the same way in four cognitive tasks (the 2-choice reaction-time task, the reward choice-reaction-time task, the vigilance choice-reaction-time task, and the event-rate choice-reaction-time task): pressing the key not assigned to the stimulus was coded as an error. Results for this subset of tasks are as follows. Children who performed more variably also made more errors, on average, across all cognitive tasks (βrange = [0.179, 0.415], prange < 0.001). Children who made more errors also performed more slowly on average, across all tasks (βrange = [-0.174, -0.437], prange < 0.001), together suggesting that a simple speed-accuracy trade-off does not adequately explain individual differences in responses. Children who made more errors also had higher inertia in three of four cognitive tasks (βrange = [-0.113, 0.185], prange = [0.009, 0.018]). In the reward choice-reaction-time task (β = 0.029, p = 0.286) there was no significant association between the number of errors and individual differences in inertia. Differences in the rate of systematic change in reaction times were not associated with the number of errors in three of four cognitive tasks (βrange = [-0.033, 0.209], prange = [0.051, 0.418]). In the reward choice-reaction-time task, a greater linear decrease in reaction times was associated with more errors (β = -0.150, p = 0.047).

The interference choice-reaction-time task allowed for two types of errors: compatible errors and incompatible errors. Compatible errors occurred when a child pressed the wrong button in a trial where there was no incongruence between the stimulus location and the response direction. Incompatible errors occurred when there was a mismatch between spatial location and response direction. A greater frequency of errors in incompatible trials was more strongly associated with higher reaction-time variability, faster response speed, and a greater increase in response speed over time, relative to errors made in compatible trials. Errors in incompatible trials were not significantly associated with individual differences in inertia. A greater number of errors in compatible trials was associated only with a faster response speed. For detailed results see Supplement I.

Higher reaction-time variability is specifically associated with greater ADHD severity

We assessed how individual differences in five domains of behavioral functioning, as measured by the Strengths and Difficulties Questionnaire (SDQ), relate to individual differences in response speed, reaction-time variability, inertia, and systematic change in reaction times. The associations between each SDQ dimension and the DSEM parameters reflect the unique, age- and accuracy-residualized relations between children’s psychopathology and their cognitive characteristics.

ADHD was the only SDQ dimension that was consistently associated with individual differences in reaction-time variability. Children with greater ADHD symptom-severity had, on average, higher reaction-time variability (βrange = [0.128, 0.229], FDRp = 0.005). Prosociality and conduct problems were associated with reaction-time variability in only one task, the vigilance choice-reaction-time task; these associations were positive but weak (βprosociality = 0.086, FDRp = 0.034; βconduct = 0.105, FDRp = 0.017). Slower response speed was weakly associated with higher ADHD in two tasks, the interference choice-reaction-time task (β = 0.124, FDRp = 0.01) and the vigilance choice-reaction-time task (β = 0.151, FDRp = 0.01). In all other tasks response speed was not significantly associated with ADHD after correction for multiple comparisons (βrange = [0.050, 0.103], FDRprange = [0.064, 0.243]). Slower response speed was also weakly associated with greater emotional problems in three cognitive tasks (βrange = [0.066, 0.081], FDRprange = [0.03, 0.05]) and with prosociality in the simple reaction-time task (β = -0.169, FDRp = 0.01). Individual differences in the inertia parameter were not significantly associated with differences in any SDQ domain after FDR correction. In the simple reaction-time task, children with a greater rate of systematic increase in reaction times also scored higher on prosociality (β = 0.290, FDRp = 0.03). For detailed results see Supplement G.

Figure 7.
Relationship between individual differences in SDQ developmental problem domains and individual differences in reaction-time variability.

Each row corresponds to a different task. Tasks are ordered in the temporal order they occurred, with the simple reaction-time task on the top row and the long ITI choice-reaction-time task on the bottom row. Each column corresponds to a different SDQ domain of developmental problems.

Higher reaction-time variability is specifically associated with greater inattention severity

We regressed the DSEM parameters on between-person differences in the ADHD subdimensions of inattention and hyperactivity/impulsivity, measured with the SWAN subscales, controlling for age and accuracy. Higher inattention severity was specifically associated with greater reaction-time variability across all tasks (βrange = [0.144, 0.275], FDRprange = [0.002, 0.004]). In contrast, hyperactivity/impulsivity severity was not significantly associated with reaction-time variability in any cognitive task. Slower response speed was significantly associated with higher inattention severity across all tasks, and with lower hyperactivity severity in 3 tasks (βrange = [-0.193, -0.086], FDRprange = [0.014, 0.033]). For detailed results see Supplement H.

Figure 8.
Relationship between individual differences in SWAN subscales of inattention and hyperactivity/impulsivity and individual differences in reaction-time variability.

Each row corresponds to a different task. Tasks are ordered in the temporal order they occurred, with the simple reaction-time task on the top row and the long ITI choice-reaction-time task on the bottom row. Each column corresponds to a different SWAN subscale, inattention on the left and hyperactivity on the right.

A 2-choice reaction time task after a sustained attention manipulation provides the greatest sensitivity for inattention severity

We formally compared how sensitive the reaction-time variability estimates that emerge from different task manipulations are to inattention symptoms. We used the base 2-choice reaction time task from block 2 as a benchmark, as it was the simplest and shortest choice-reaction-time task in the COTAPP testing battery (r = 0.18). We found that individual differences in reaction-time variability from the sustained attention choice-reaction-time task were significantly more sensitive to individual differences in inattention symptom-severity, relative to the base 2-choice reaction time task. The increment in effect size was marginal (t = -3.06, rdifference = 0.10, p = 0.002). The long ITI choice-reaction-time task was the only other task to have superior sensitivity in comparison to the base 2-choice reaction-time task (t = -2.76, rdifference = 0.09, p = 0.002). However, the difference in sensitivity between the long ITI choice-reaction-time task and the sustained attention choice-reaction-time task was not statistically significant, indicating that the increment in sensitivity is likely due to the sustained attention manipulation rather than the long ITI (t = 0.47, rdifference = -0.01, p = 0.640). For readers interested in more information regarding the strength of association between inattention and reaction-time variability across blocks, we produced figures of the posterior distributions that can be found in Supplement U.

Taxing vigilance led to a significant difference in reaction-time variability in children with higher inattention severity

We used latent difference score models and a unique task design to isolate the effect of specific cognitive processes on reaction-time variability. We compared a series of modified 2-choice reaction-time tasks with a simple 2-choice reaction-time task (see Figure 5 for a schematic of the design). Each modified 2-choice reaction-time task differed from the standard 2-choice reaction-time task with regard to one behavioral manipulation. For instance, the reward choice-reaction-time task was identical to the 2-choice reaction time task but added rewards. We reasoned that any differences in reaction-time variability between tasks should be predominantly caused by the added behavioral manipulation (e.g. rewards). This allowed us to probe the effect of theoretical mechanisms, tied to task-specific manipulations, on reaction-time variability. For example, an individual might show moderate variability in a simple task but substantially more variability in a more complex task, and a larger increase than others show, suggesting that the variability in their performance is affected by complexity. We also tested whether individual differences in the degree of difference in reaction-time variability were associated with individual differences in the SWAN dimensions of inattention and/or hyperactivity. Three of four behavioral manipulations led to a significant increase in reaction-time variability, on average. The long ITI choice-reaction-time task did not lead to a significant difference in reaction-time variability. There was pronounced individual variation in the degree of difference across all task comparisons (see Supplement K for table). The SWAN dimensions of inattention and hyperactivity were significantly correlated (r = 0.741).

First, the adaptive gain theory predicts that reaction-time variability should decrease under task conditions with high perceived task-utility (Aston-Jones & Cohen, 2005). We therefore reasoned that reaction-time variability should be lower in the reward choice-reaction-time task than in the 2-choice reaction-time task. Contrary to expectations, we found a substantial average increase in reaction-time variability (β = 0.301, p < 0.001). Moreover, we expected that children with higher inattention or hyperactivity severity would show a greater degree of difference in their reaction-time variability. Contrary to this prediction, children with greater severity in inattention or hyperactivity/impulsivity did not show a greater degree of difference in reaction-time variability (βinattention = 0.042, p = 0.367; βhyperactivity = -0.047, p = 0.342).

Second, the behavioral inhibition model predicts that reaction-time variability will increase in tasks where children have to inhibit a prepotent/ongoing response (Barkley, 1997). In line with this prediction, there was a significant difference in reaction-time variability, on average, following a spatial interference manipulation (β = 0.379, p < 0.001). The behavioral inhibition model also predicts that children with higher hyperactivity severity should show a greater degree of difference in their reaction-time variability. In contrast to this prediction, children with greater hyperactivity severity did not show a greater degree of difference in their reaction-time variability (β = -0.024, p = 0.625). Inattention severity was also not associated with the degree of difference in reaction-time variability (β = 0.067, p = 0.167).

Third, according to the adaptive gain theory reaction-time variability should be greater after a prolonged period of testing (here defined as ~30 minutes). In line with this prediction, on average, children showed a significant increase in reaction-time variability on the same task, following 30 minutes of testing (β = 0.578, p < 0.001). Individual differences in the degree of difference were significantly associated with higher inattention severity (β = 0.141, p = 0.002), but not hyperactivity severity (β = 0.002, p = 0.972).

Lastly, the cognitive energetic model (CEM) posits that arousal regulation deficits in children with ADHD underlie variability and that this can be exacerbated by increasing the time between trials (Sergeant, 2000). Increasing the intertrial interval from 450-750ms to 3000-6000ms led to a significant decrease in children’s reaction-time variability, on average (β = -0.041, p = -0.041). Moreover, contrary to predictions of the CEM, the degree of difference was not significantly associated with inattention severity (β = 0.024, p = 0.621) or hyperactivity severity (β = 0.038, p = 0.459).

We found that reaction-time variability was specifically associated with symptom-severity in the inattention domain, in a population-based sample of children aged 5.5 to 13.5 years. Children with higher severity of inattention symptoms performed more variably across all cognitive tasks. By contrast, reaction-time variability was not consistently associated with any other domain of behavioral functioning, including hyperactivity/impulsivity symptoms, highlighting the specificity of this association. Moreover, the tendency of children with more severe inattention symptoms to perform more variably held independently of the significantly higher variability shown by younger children. Our results are, at first sight, at odds with a meta-analysis of 71 studies which concluded that reaction-time variability is a general marker of clinical functioning, and not specific to ADHD (Kofler et al., 2013). It is noteworthy, however, that this meta-analysis is largely composed of studies (63 out of 71) relying on the intraindividual standard deviation (iSD) as a proxy of reaction-time variability, which can conflate multiple sources of variance, such as systematic changes in reaction time (Nesselroade & Salthouse, 2004; Prathiba et al., 1998; Wang et al., 2012). The muddled nature of this statistical proxy may reduce the specificity of its associations. That is, the association between the iSD and mental disorders may be driven by processes that are distinct from the theoretical construct of reaction-time variability but are nonetheless subsumed by the statistical tool used to quantify it. This possibility stands despite the high correlation between the iSD and our DSEM estimate in this particular sample. The iSD may conflate cognitive processes in different tasks and task designs (e.g. with longer trial counts or adaptive difficulty), explaining the more diffuse pattern of associations between reaction-time variability and psychopathology across studies. Thus, in the face of empirical uncertainty, reliance on models that decompose sources that we know a priori are conflated seems the most defensible manner to obtain reliable inferences.
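The concern that the iSD absorbs systematic changes in reaction time can be illustrated with a small simulation. The sketch below is not the DSEM used in this paper; it swaps in a much simpler stand-in (removing a least-squares linear trend over trials) purely to show the direction of the problem. All names, the 500 ms baseline, the 50 ms noise level, and the 1 ms-per-trial slowing are hypothetical values chosen for illustration.

```python
import random
import statistics


def intraindividual_sd(rts):
    """Raw intraindividual standard deviation (iSD) of one reaction-time series."""
    return statistics.stdev(rts)


def detrended_sd(rts):
    """SD of the residuals after removing a least-squares linear trend over trials.

    This is a simple illustrative stand-in, not the DSEM estimate.
    """
    n = len(rts)
    mean_trial = (n - 1) / 2
    mean_rt = statistics.fmean(rts)
    sxx = sum((t - mean_trial) ** 2 for t in range(n))
    sxy = sum((t - mean_trial) * (rt - mean_rt) for t, rt in enumerate(rts))
    slope = sxy / sxx
    residuals = [rt - (mean_rt + slope * (t - mean_trial)) for t, rt in enumerate(rts)]
    return statistics.stdev(residuals)


def simulate_child(n_trials, noise_sd, slowing_per_trial, seed):
    """Hypothetical child: 500 ms baseline, Gaussian trial noise, optional linear slowing."""
    rng = random.Random(seed)
    return [500.0 + slowing_per_trial * t + rng.gauss(0.0, noise_sd) for t in range(n_trials)]


# Same seed, so both children share identical trial-to-trial noise;
# the second child additionally slows down by 1 ms per trial.
stable = simulate_child(200, noise_sd=50.0, slowing_per_trial=0.0, seed=1)
slowing = simulate_child(200, noise_sd=50.0, slowing_per_trial=1.0, seed=1)

print("iSD, stable child:         ", round(intraindividual_sd(stable), 1))
print("iSD, slowing child:        ", round(intraindividual_sd(slowing), 1))
print("detrended SD, stable child:", round(detrended_sd(stable), 1))
print("detrended SD, slowing child:", round(detrended_sd(slowing), 1))
```

Because the two simulated children differ only in the systematic slowing, any gap in their iSD reflects the trend rather than trial-to-trial variability, whereas the detrended estimate is the same for both.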

A distinct methodological possibility for the divergence in our findings is that existing meta-analyses aggregate results over multiple underpowered studies, which could lead to incorrect inferences. Alternatively, a symptom-level explanation may be able to reconcile these distinct findings. It is plausible that clinical controls (i.e. people with diagnoses other than ADHD) within the reviewed studies also suffered from inattention symptoms to a similar extent as people in the ADHD samples. This is congruent with our finding that reaction-time variability is specific to the inattentive symptoms within the disorder. This possibility also aligns with the finding that ASD patients without comorbid ADHD symptoms did not show elevated reaction-time variability (Karalunas et al., 2014). Moreover, a previous study supporting the specificity of reaction-time variability as a marker for ADHD split their sample into non-overlapping diagnostic groups (Salum et al., 2019), which are uncommon in clinical populations (Caspi & Moffitt, 2018). Hence, to understand the mechanisms that drive specific aspects of phenotypes associated with neurodevelopmental disorders, it would be beneficial to pay attention to the symptom-level specificity which these mechanisms may display, to avoid decreasing both the sensitivity and specificity of our understanding. Contrary to the possibility that reaction-time variability may be tied to inattention symptoms, a comparison between hyperactive/impulsive and inattentive subtypes of ADHD across 41 studies showed neither subdomain of ADHD was preferentially associated with reaction-time variability (Kofler et al., 2013). However, these findings also suffer from heavy reliance on the intraindividual standard deviation, in 34 out of 41 studies. We interpret the aggregate of past and present results as compelling evidence against the hypothesis that reaction-time variability is a general marker of psychopathology. A weaker version of this hypothesis, where reaction-time variability is a marker for clinical functioning in a distinct subset of symptoms that we did not measure, remains unaffected by our results. Future work should seek to test how robust the association between reaction-time variability and inattention is to the addition of psychopathology symptoms not assessed in our study, using a substantively motivated estimate of reaction-time variability.

To further our knowledge of reaction-time variability, we used the unique design of the COTAPP assessment battery to test four non-preregistered hypotheses about the mechanisms of reaction-time variability. Two of these hypotheses were derived from the adaptive gain theory (Aston-Jones & Cohen, 2005), and the other two from the behavioral inhibition deficit model (Barkley, 1997) and the cognitive energetic model (Sergeant, 2000), respectively. Our findings support changes in sustained attention as a causal mechanism driving reaction-time variability. This was one of the outcomes predicted by the adaptive gain theory, which posits that reaction-time variability is a byproduct of locus-coeruleus norepinephrine (LC-NE) activity (Aston-Jones & Cohen, 2005). Specifically, the adaptive gain theory predicts that fatigue should push the activity of LC-NE towards the extremities of the arousal curve (inverted U), where children experience an increase in attentional lapses due to mind-wandering, mind-blanking, or increased sensitivity to external distractions (Unsworth & Robison, 2016, 2018). Our results support the behavioral prediction of the theory but offer no direct evidence for the implication of LC-NE activity. As a complement to our results, empirical investigations of LC-NE activity as a function of time-on-task show the same behavioral pattern and link this to the diminution of (both tonic and phasic) LC-NE activity using pupillometry (Unsworth & Robison, 2016, 2018). Together, these results support LC-NE driven attentional lapses as a cause for reaction-time variability.

Our results further show that the effect of fatigue on sustained attention is greater in children with greater severity of symptoms in the inattention domain, but not the hyperactivity domain. The temporal instability of reaction-time variability in children with higher inattention symptoms is worth highlighting since it pushes against a static conceptualization of ADHD symptoms that are either absent or present within a child. Children’s attentiveness waned over the course of 30 minutes in response to environmental demands which opens the possibility that other behaviors, which may traditionally be considered fixed, also fluctuate over time (e.g. in the span of minutes, hours, or days). Information about the stability of symptoms, and individual differences therein, may point to differences in mechanisms driving ADHD and their contextual sensitivity. This information can be used to fortify the theoretical rationale for interventions, increasing the probability of success and the efficiency of resource allocation.

Contrary to a prediction of the adaptive gain theory, adding extrinsic rewards to the choice reaction-time task did not lower reaction-time variability. In fact, on average, reaction-time variability increased after the manipulation. It is noteworthy, however, that this task was designed to make children respond fast and accurately using incentives. Pushing the speed of children while increasing their motivation using rewards could have led most children to respond more variably than usual because they were operating near their limit, even if their attention was focused on the task. This is congruent with findings showing increased reaction-time variability as a function of cognitive load (e.g. Galeano-Weber et al., 2018) and urges future research to retest this hypothesis using incentives that do not alter task demands. Alternatively, the observed pattern could be attributed to ceiling effects. Most children could have already been sufficiently motivated to perform the task, with their LC-NE activity already in a phasic state (i.e. exploitative mode), resulting in already low reaction-time variability. Contrary to this explanation, we did not find an association between severity in inattention or hyperactivity/impulsivity symptoms and reward-driven change in reaction-time variability. Adaptive gain theory hypothesizes that children with attention deficit disorders are characterized by overly persistent tonic activation (Aston-Jones et al., 1999; Aston-Jones & Cohen, 2005). Thus, we would expect that rewards would shift these children to a phasic state and lower their reaction-time variability. It may be that children with inattention symptoms, but not necessarily hyperactivity symptoms, reside predominantly in the lower bound of the LC-NE curve where tonic and phasic activation are both relatively low (i.e. a drowsy, inattentive state). This is also congruent with evidence from pupillometry studies showing that low tonic pretrial arousal predicted higher reaction-time variability; the link with inattention severity was, however, not examined (Grandchamp et al., 2014; Mittner et al., 2014). Aston-Jones and Cohen (2005) are explicit in their description of how ongoing evaluations of stimulus significance affect transitions between the middle portion (phasic state) and the right bound (tonic state) of the LC-NE curve, but do not explicitly convey how the lower bound is affected by this evaluation process. Future studies using a proxy of LC-NE activity and symptom-level measures of ADHD could modulate reward-value to observe the effect of changes in stimulus significance on arousal and reaction-time variability, and how this is moderated by symptom severity.

We tested two more candidate mechanisms. First, the behavioral inhibition deficit (BID) model is a prominent theory of ADHD that claims behavioral inhibition deficits cause reaction-time variability either directly or indirectly, by taxing executive functions (Barkley, 1997). The BID model’s predictions were partly supported by our results. We found evidence for the implication of behavioral inhibition as a process driving reaction-time variability, since adding a spatial interference manipulation led to a substantial increase in reaction-time variability. However, we did not find evidence for a link between behavioral inhibition and severity in hyperactivity symptoms. Barkley (1997) proposed that only children with hyperactivity symptoms possess behavioral inhibition deficits and that these deficits in turn cause their inattention symptoms. He further posits that these secondary symptoms of inattention are qualitatively distinct from the inattention found in the pure-inattentive type of ADHD. Thus, the BID model predicts that hyperactivity must be associated with reaction-time variability, but an association with inattention is optional. Our findings do not lend support to this prediction.

Second, we tested a prediction made by the cognitive energetic model (CEM): Arousal regulation deficits should lead to higher reaction-time variability via an increase in attentional lapses (Sergeant, 2004). Note that arousal within the CEM is conceptually distinct from norepinephrine functioning in the adaptive gain theory, which is more closely aligned with the concept of effort within the CEM (Karalunas et al., 2014). The observed effect was opposite to the predicted direction and the magnitude of the difference was negligible (i.e. reaction-time variability decreased on average and the effect was not significant). Moreover, inattention and hyperactivity severity were not associated with the degree of difference in reaction-time variability. Thus, we found no evidence that arousal regulation as conceptualized by the CEM underlies reaction-time variability in children with ADHD symptoms. Our findings are consistent with most studies in the literature, which find that the construct of effort is more consistently related to reaction-time variability than is arousal (Karalunas et al., 2014).

After we used the distinct block-design of COTAPP to tease apart candidate mechanisms, we decided to examine the translational question from a simpler perspective: We asked which task most strongly relates to inattention severity. The choice reaction-time task with a sustained attention manipulation had superior sensitivity according to our formal comparison. Therefore, retesting children after prolonged cognitive effort is an effective tool to increase the sensitivity of reaction-time variability to differences in inattention severity. From a practical standpoint, clinicians looking to add a measure of reaction-time variability to their clinical toolset may want to consider whether the increase in variance explained is worth the additional financial and time burden associated with a 30-minute testing battery as opposed to a 5-minute test. Future studies designed to compare the cost-benefit ratio of tasks would go a long way in ensuring the efficient dissemination of reaction-time variability as a tool in the clinic.

The limitations of our study qualify our conclusions and concurrently point to directions for future research. First, our sample was derived from the population of children in the Netherlands aged 5.5 to 13.5 years. We encourage readers to generalize our inferences thoughtfully, and only to populations that they judge to share pertinent characteristics with ours. It is noteworthy that the COTAPP is not language or culture sensitive, which should aid generalizations.

Second, dynamic structural equation models assume a Gaussian distribution. We log-transformed reaction-time data to approximate a Gaussian distribution, but further research is needed to test the consequences of violating this assumption.
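What the log transform can and cannot achieve is easy to check by simulation. The sketch below assumes, purely for illustration, that raw reaction times follow an ex-Gaussian distribution (a Gaussian plus an exponential component); the parameter values and function names are hypothetical and are not taken from our data or our analysis code.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: mean cubed standardized deviation (population SD)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def simulate_ex_gaussian_rts(n_trials, mu, sigma, tau, seed):
    """Hypothetical reaction times in ms: Gaussian(mu, sigma) plus an exponential tail with mean tau."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n_trials)]


raw_rts = simulate_ex_gaussian_rts(5000, mu=400.0, sigma=40.0, tau=150.0, seed=7)
log_rts = [math.log(rt) for rt in raw_rts]

raw_skew = skewness(raw_rts)
log_skew = skewness(log_rts)
print(f"skewness of raw RTs: {raw_skew:.2f}")
print(f"skewness of log RTs: {log_skew:.2f}")
```

A log transform pulls in the right tail and therefore lowers skewness, but it need not remove it, which is why the remaining departure from normality deserves dedicated study.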

Third, we wanted to examine the specificity of reaction-time variability as a marker of ADHD and used the SDQ which covers some of the most important developmental problem domains. However, we did not assess an exhaustive list of psychopathology symptoms. Future studies including a comprehensive assessment battery of psychopathology would supplement our findings.

Fourth, the causal inferences made from our latent difference-score models may be limited by the fixed order of our tasks. This criticism is not pertinent to the causal inference made with regard to sustained attention deficits, where fatigue was the intended manipulation. Moreover, we believe that it is unlikely that counterbalancing the order of tasks would alter our results for the following reasons: (1) The reward task is the only task that was predicted to decrease reaction-time variability (RTV). It could be argued that fatigue led to the contrary increase of RTV. We find this unlikely because the reward task was completed directly after the choice reaction-time task that was used as a reference. (2) The difference in RTV was not associated with inattention in any other task, making it difficult to explain why a fatigue-related increase in RTV would only be associated with inattention in the final testing block. (3) The tasks used are simple to complete, and children complete a small number of practice trials prior to each block under the experimenter’s supervision to determine whether they adequately understand the task (see Supplementary materials A). This makes practice effects an unlikely influence on performance.
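For readers less familiar with latent difference-score models, their generic form (in the spirit of McArdle, 2009, and Kievit et al., 2018) can be sketched as follows. The symbols are illustrative assumptions rather than our exact specification: the subscripts "base" and "manip" stand for the reference block and the manipulated block, and any further covariates are omitted.

```latex
\begin{align}
\mathrm{RTV}_{i,\mathrm{manip}} &= \mathrm{RTV}_{i,\mathrm{base}} + \Delta\mathrm{RTV}_{i} \\
\Delta\mathrm{RTV}_{i} &= \beta_{0} + \beta_{1}\,\mathrm{Inattention}_{i} + \beta_{2}\,\mathrm{Hyperactivity}_{i} + \zeta_{i}
\end{align}
```

Fixing the coefficient of the baseline term to 1 is what defines the latent difference for child i. With a fixed task order, that difference also absorbs anything else that changes between the two blocks, which is the source of the limitation discussed above.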

Fifth, we modeled symptom clusters as total scores, which do not reflect the association of symptoms with their underlying causal latent dimensions, if these exist. The bias in the estimate of our ADHD variable would be proportional to the difference between the homogeneous weights assigned by total scores and the relative weights assigned by a more appropriate factor model (Bollen & Bauldry, 2011). Under a different causal theory of mental disorders (e.g. network theory; Borsboom et al., 2019), where symptoms are not underpinned by a unitary cause, we would ideally have multi-item data for separate symptoms to deal with measurement error without assuming a latent causal factor.

Sixth, some of our model comparisons relied on the DIC, which has known problems with its instability (Asparouhov et al., 2018). We assessed the stability of the DIC, which was sufficient for all model comparisons. Moreover, we implemented DSEM in Stan and performed the most substantively meaningful model comparison using leave-one-out cross validation.

Lastly, our hypotheses were derived from extant theories and fit to a large sample but not preregistered. Thus, a pre-registered replication of our results is needed to bolster the conclusiveness of our claims.

The field of ADHD research exemplifies how intraindividual variability has come a long way from the heretical proposal of a few theorists to become an invaluable piece of the developmental puzzle. Today we have the tools to push forth our understanding of within-person variability in cognitive performance and embed this source of individual differences in longitudinal contexts to understand developmental change at an unprecedented temporal granularity. We believe that a close dialogue between neuroscience, clinical psychology, and psychometrics can ensure that our empirical tests are aligned with our theoretical understanding of neurodevelopmental disorders and their cognitive mechanics. In the present paper we exemplified how such an approach can produce novel insight about existing substantive questions.

To conclude we derive three testable hypotheses from the discussion of our data: First, reaction-time variability is specifically associated with inattention symptom-severity, and not hyperactivity/impulsivity severity, in choice reaction-time tasks. Based on our discussion we further predict that this pattern will hold irrespective of diagnostic status. Which specific symptoms within the inattention domain best predict the magnitude of reaction-time variability remains an open question for future research. Second, symptoms of ADHD will show meaningful fluctuations across time and people will differ in the extent of these fluctuations. This hypothesis is appreciably broad, but current evidence does not allow us to make more granular predictions. The availability of ecological momentary assessment (EMA) makes it possible to test the age-old assumption that diagnoses are static. Even if symptoms ultimately revert to their mean after a short period of time, the informativeness of differences in the patterns of temporary changes is worth investigating. Examining how differences in symptom fluctuations relate to socially relevant and widely accessible outcomes, such as academic performance, would offer a pertinent test of this hypothesis. Third, children with higher inattention severity will show lower pretrial LC-NE activity which will mediate the association between symptom severity and reaction-time variability. We hope that researchers will be inclined to follow one of our proposed trajectories or create their own path to enter the exciting world of dynamics that unfolds at each step of a child’s development.

Author Contributions

  • Substantial contributions to conception and design: MEA, NR, RK

  • Acquisition of data: NR

  • Analysis and interpretation of data: MEA

  • Drafting the article or revising it critically for important intellectual content: MEA, NR, RK

  • Final approval of the version to be published: MEA, NR, RK

Competing Interests

The authors have no competing interests to declare.

Data Accessibility Statement

Data and code for all our analyses, preprocessing steps, and data visualization are available at https://osf.io/pwfyk/?view_only=f657f076b90a46eeb7aa74c1f58fc6d7.

1. βrange and prange indicate the minimum and maximum standardized effect size and p-value across all tasks, respectively.

2. All reported FDRp-values have been adjusted for multiple comparisons using the False Discovery Rate (FDR) method.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). American Psychiatric Association.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). https://doi.org/10.1176/appi.books.9780890425596
Anderson, M. (1992). Intelligence and development: A cognitive theory. Blackwell Publishing.
Asparouhov, T., Hamaker, E. L., & Muthén, B. (2018). Dynamic Structural Equation Models. Structural Equation Modeling: A Multidisciplinary Journal, 25(3), 359–388. https://doi.org/10.1080/10705511.2017.1406803
Aston-Jones, G., & Cohen, J. D. (2005). An integrative theory of locus coeruleus-norepinephrine function: Adaptive gain and optimal performance. Annual Review of Neuroscience, 28(1), 403–450. https://doi.org/10.1146/annurev.neuro.28.061604.135709
Aston-Jones, G., Iba, M., Clayton, E., Rajkowski, J., & Cohen, J. (2007). The locus coeruleus and regulation of behavioral flexibility and attention: Clinical implications. In Brain norepinephrine: Neurobiology and therapeutics (pp. 196–235). Cambridge University Press. https://doi.org/10.1017/CBO9780511544156.008
Aston-Jones, G., Rajkowski, J., & Cohen, J. (1999). Role of locus coeruleus in attention and behavioral flexibility. Biological Psychiatry, 46(9), 1309–1320. https://doi.org/10.1016/S0006-3223(99)00140-7
Aston-Jones, G., Rajkowski, J., Kubiak, P., & Alexinsky, T. (1994). Locus coeruleus neurons in monkey are selectively activated by attended cues in a vigilance task. Journal of Neuroscience, 14(7), 4467–4480. https://doi.org/10.1523/JNEUROSCI.14-07-04467.1994
Barkley, R. A. (1997). Behavioral inhibition, sustained attention, and executive functions: Constructing a unifying theory of ADHD. Psychological Bulletin, 121(1), 65–94. https://doi.org/10.1037/0033-2909.121.1.65
Biesmans, K. E., Van Aken, L., Frunt, E. M. J., Wingbermühle, P. A. M., & Egger, J. I. M. (2019). Inhibition, shifting and updating in relation to psychometric intelligence across ability groups in the psychiatric population. Journal of Intellectual Disability Research, 63(2), 149–160. https://doi.org/10.1111/jir.12559
Bollen, K. A., & Bauldry, S. (2011). Three Cs in measurement models: Causal indicators, composite indicators, and covariates. Psychological Methods, 16(3), 265–284. https://doi.org/10.1037/a0024448
Borsboom, D., Cramer, A. O. J., & Kalis, A. (2019). Brain disorders? Not really: Why network structures block reductionism in psychopathology research. Behavioral and Brain Sciences, 42, e2. https://doi.org/10.1017/S0140525X17002266
Brites, C., Salgado-Azoni, C. A., Ferreira, T. L., Lima, R. F., & Ciasca, S. M. (2015). Development and applications of the SWAN rating scale for assessment of attention deficit hyperactivity disorder: A literature review. Brazilian Journal of Medical and Biological Research, 48, 965–972. https://doi.org/10.1590/1414-431x20154528
Bubnik, M. G., Hawk, L. W., Pelham, W. E., Waxmonsky, J. G., & Rosch, K. S. (2015). Reinforcement Enhances Vigilance Among Children With ADHD: Comparisons to Typically Developing Children and to the Effects of Methylphenidate. Journal of Abnormal Child Psychology, 43(1), 149–161. https://doi.org/10.1007/s10802-014-9891-8
Caspi, A., & Moffitt, T. E. (2018). All for one and one for all: Mental disorders in one dimension. American Journal of Psychiatry, 175(9), 831–844. https://doi.org/10.1176/appi.ajp.2018.17121383
Castellanos, F. X., & Tannock, R. (2002). Neuroscience of attention-deficit/hyperactivity disorder: The search for endophenotypes. Nature Reviews Neuroscience, 3(8), 617–628. https://doi.org/10.1038/nrn896
Du, H., & Wang, L. (2018). Reliabilities of intraindividual variability indicators with autocorrelated longitudinal data: Implications for longitudinal study designs. Multivariate Behavioral Research, 53(4), 502–520. https://doi.org/10.1080/00273171.2018.1457939
Geurts, H. M., Grasman, R. P. P. P., Verté, S., Oosterlaan, J., Roeyers, H., van Kammen, S. M., & Sergeant, J. A. (2008). Intra-individual variability in ADHD, autism spectrum disorders and Tourette’s syndrome. Neuropsychologia, 46(13), 3030–3041. https://doi.org/10.1016/j.neuropsychologia.2008.06.013
Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A research note. Journal of Child Psychology and Psychiatry, 38(5), 581–586. https://doi.org/10.1111/j.1469-7610.1997.tb01545.x
Goodman, R. (2001). Psychometric Properties of the Strengths and Difficulties Questionnaire. Journal of the American Academy of Child & Adolescent Psychiatry, 40(11), 1337–1345. https://doi.org/10.1097/00004583-200111000-00015
Grandchamp, R., Braboszcz, C., & Delorme, A. (2014). Oculometric variations during mind wandering. Frontiers in Psychology, 5, 31. https://doi.org/10.3389/fpsyg.2014.00031
Hauser, T. U., Fiore, V. G., Moutoussis, M., & Dolan, R. J. (2016). Computational Psychiatry of ADHD: Neural Gain Impairments across Marrian Levels of Analysis. Trends in Neurosciences, 39(2), 63–73. https://doi.org/10.1016/j.tins.2015.12.009
Huang-Pollock, C. L., Karalunas, S. L., Tam, H., & Moore, A. N. (2012). Evaluating vigilance deficits in ADHD: A meta-analysis of CPT performance. Journal of Abnormal Psychology, 121(2), 360–371. https://doi.org/10.1037/a0027205
Huang-Pollock, C. L., Nigg, J. T., & Halperin, J. M. (2006). Single dissociation findings of ADHD deficits in vigilance but not anterior or posterior attention systems. Neuropsychology, 20(4), 420–429. https://doi.org/10.1037/0894-4105.20.4.420
Karalunas, S. L., Geurts, H. M., Konrad, K., Bender, S., & Nigg, J. T. (2014). Annual Research Review: Reaction time variability in ADHD and autism spectrum disorders: measurement and mechanisms of a proposed trans-diagnostic phenotype. Journal of Child Psychology and Psychiatry, 55(6), 685–710. https://doi.org/10.1111/jcpp.12217
Karalunas, S. L., Huang-Pollock, C. L., & Nigg, J. T. (2013). Is reaction time variability in ADHD mainly at low frequencies? Journal of Child Psychology and Psychiatry, 54(5), 536–544. https://doi.org/10.1111/jcpp.12028
Kersten, P., Czuba, K., McPherson, K., Dudley, M., Elder, H., Tauroa, R., & Vandal, A. (2016). A systematic review of evidence for the psychometric properties of the Strengths and Difficulties Questionnaire. International Journal of Behavioral Development, 40(1), 64–75. https://doi.org/10.1177/0165025415570647
Kievit, R. A., Brandmaier, A. M., Ziegler, G., van Harmelen, A.-L., de Mooij, S. M. M., Moutoussis, M., Goodyer, I. M., Bullmore, E., Jones, P. B., Fonagy, P., Lindenberger, U., & Dolan, R. J. (2018). Developmental cognitive neuroscience using latent change score models: A tutorial and applications. Developmental Cognitive Neuroscience, 33, 99–117. https://doi.org/10.1016/j.dcn.2017.11.007
Kofler, M. J., Rapport, M. D., Sarver, D. E., Raiker, J. S., Orban, S. A., Friedman, L. M., & Kolomeyer, E. G. (2013). Reaction time variability in ADHD: A meta-analytic review of 319 studies. Clinical Psychology Review, 33(6), 795–811. https://doi.org/10.1016/j.cpr.2013.06.001
Kotov, R., Krueger, R. F., Watson, D., Achenbach, T. M., Althoff, R. R., Bagby, R. M., Brown, T. A., Carpenter, W. T., Caspi, A., Clark, L. A., Eaton, N. R., Forbes, M. K., Forbush, K. T., Goldberg, D., Hasin, D., Hyman, S. E., Ivanova, M. Y., Lynam, D. R., Markon, K., & Zimmerman, M. (2017). The Hierarchical Taxonomy of Psychopathology (HiTOP): A dimensional alternative to traditional nosologies. Journal of Abnormal Psychology, 126(4), 454–477. https://doi.org/10.1037/abn0000258
Lahey, B. B., Tiemeier, H., & Krueger, R. F. (2022). Seven reasons why binary diagnostic categories should be replaced with empirically sounder and less stigmatizing dimensions. JCPP Advances, 2(4), e12108. https://doi.org/10.1002/jcv2.12108
Leth-Steensen, C., King Elbaz, Z., & Douglas, V. I. (2000). Mean response times, variability, and skew in the responding of ADHD children: A response time distributional approach. Acta Psychologica, 104(2), 167–190. https://doi.org/10.1016/S0001-6918(00)00019-6
Liddle, E. B., Hollis, C., Batty, M. J., Groom, M. J., Totman, J. J., Liotti, M., Scerif, G., & Liddle, P. F. (2011). Task-related default mode network modulation and inhibitory control in ADHD: Effects of motivation and methylphenidate: Default mode network modulation in ADHD. Journal of Child Psychology and Psychiatry, 52(7), 761–771. https://doi.org/10.1111/j.1469-7610.2010.02333.x
Luce, R. D. (1986). Response times: Their role in inferring elementary mental organization. Oxford University Press.
McArdle, J. J. (2009). Latent Variable Modeling of Differences and Changes with Longitudinal Data. Annual Review of Psychology, 60(1), 577–605. https://doi.org/10.1146/annurev.psych.60.110707.163612
McNeish, D., & Hamaker, E. L. (2020). A primer on two-level dynamic structural equation models for intensive longitudinal data in Mplus. Psychological Methods, 25(5), 610–635. https://doi.org/10.1037/met0000250
Mittner, M., Boekel, W., Tucker, A. M., Turner, B. M., Heathcote, A., & Forstmann, B. U. (2014). When the Brain Takes a Break: A Model-Based Analysis of Mind Wandering. Journal of Neuroscience, 34(49), 16286–16295. https://doi.org/10.1523/JNEUROSCI.2062-14.2014
Murray, A. L., Speyer, L. G., Hall, H. A., Valdebenito, S., & Hughes, C. (2021). A Longitudinal and Gender Invariance Analysis of the Strengths and Difficulties Questionnaire Across Ages 3, 5, 7, 11, 14, and 17 in a Large U.K.-Representative Sample. Assessment, 10731911211009312. https://doi.org/10.1177/10731911211009312
Muthén, B., & Muthén, L. (2017). Mplus. In Handbook of item response theory (pp. 507–518). Chapman and Hall/CRC.
Nesselroade, J. R., & Salthouse, T. A. (2004). Methodological and theoretical implications of intraindividual variability in perceptual-motor performance. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 59(2), P49–P55.
Prathiba, S., Elizabeth, B., & Donald T., S. (1998). Aging and variability in performance. Aging, Neuropsychology, and Cognition, 5(1), 1–13. https://doi.org/10.1076/anec.5.1.1.23
Pribram, K. H., & McGuinness, D. (1975). Arousal, activation, and effort in the control of attention. Psychological Review, 82(2), 116–149. https://doi.org/10.1037/h0076780
Revelle, W. (2022). psych: Procedures for Psychological, Psychometric, and Personality Research (R package version 2.4.6). Northwestern University. https://CRAN.R-project.org/package=psych
Rommelse, N., Geurts, H. M., Franke, B., Buitelaar, J. K., & Hartman, C. A. (2011). A review on cognitive and brain endophenotypes that may be common in autism spectrum disorder and attention-deficit/hyperactivity disorder and facilitate the search for pleiotropic genes. Neuroscience & Biobehavioral Reviews, 35(6), 1363–1396. https://doi.org/10.1016/j.neubiorev.2011.02.015
Rommelse, N., Hartman, C., Brinkman, A., Slaats-Willemse, D., De Zeeuw, P., & Luman, M. (2018). COTAPP: Cognitieve taak applicatie handleiding [COTAPP: Cognitive test application manual]. Boom.
Rommelse, N., Langerak, I., Meer, J. van der, Bruijn, Y. de, Staal, W., Oerlemans, A., & Buitelaar, J. (2015). Intelligence May Moderate the Cognitive Profile of Patients with ASD. PLOS ONE, 10(10), e0138698. https://doi.org/10.1371/journal.pone.0138698
Salum, G. A., Sato, J. R., Manfro, A. G., Pan, P. M., Gadelha, A., do Rosário, M. C., Polanczyk, G. V., Castellanos, F. X., Sonuga-Barke, E., & Rohde, L. A. (2019). Reaction time variability and attention-deficit/hyperactivity disorder: Is increased reaction time variability specific to attention-deficit/hyperactivity disorder? Testing predictions from the default-mode interference hypothesis. ADHD Attention Deficit and Hyperactivity Disorders, 11(1), 47–58. https://doi.org/10.1007/s12402-018-0257-x
Santegoeds, E., Schoot, E., Roording-Ragetlie, S., Klip, H., & Rommelse, N. (2022). Neurocognitive functioning of children with mild to borderline intellectual disabilities and psychiatric disorders: Profile characteristics and predictors of behavioural problems. Journal of Intellectual Disability Research, 66(1–2), 162–177. https://doi.org/10.1111/jir.12874
Schramm, P., & Rouder, J. N. (2019). Are Reaction Time Transformations Really Beneficial? https://doi.org/10.31234/osf.io/9ksa6
Schweizer, K. (2007). Investigating the relationship of working memory tasks and fluid intelligence tests by means of the fixed-links model in considering the impurity problem. Intelligence, 35(6), 591–604. https://doi.org/10.1016/j.intell.2006.11.004
Sergeant, J. (2000). The cognitive-energetic model: An empirical approach to Attention-Deficit Hyperactivity Disorder. Neuroscience & Biobehavioral Reviews, 24(1), 7–12. https://doi.org/10.1016/S0149-7634(99)00060-3
Sergeant, J. (2004). Modeling attention-deficit/hyperactivity disorder: A critical appraisal of the cognitive-energetic model. Biological Psychiatry. https://doi.org/10.1016/j.bps.2004.09.010
Solfo, A., & van Leeuwen, C. (2023). A Bayesian classifier for fractal characterization of short behavioral series. Psychological Methods. Advance online publication. https://doi.org/10.1037/met0000562
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series b (Statistical Methodology), 64(4), 583–639. https://doi.org/10.1111/1467-9868.00353
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & van der Linde, A. (2014). The deviance information criterion: 12 years on. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(3), 485–493. https://doi.org/10.1111/rssb.12062
Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87(2), 245. https://doi.org/10.1037/0033-2909.87.2.245
Swanson, J., Deutsch, C., Cantwell, D., Posner, M., Kennedy, J. L., Barr, C. L., Moyzis, R., Schuck, S., Flodman, P., Spence, M. A., & Wasdell, M. (2001). Genes and attention-deficit hyperactivity disorder. Clinical Neuroscience Research, 1(3), 207–216. https://doi.org/10.1016/S1566-2772(01)00007-X
Unsworth, N., & Robison, M. K. (2018). Tracking arousal state and mind wandering with pupillometry. Cognitive, Affective, & Behavioral Neuroscience, 18(4), 638–664. https://doi.org/10.3758/s13415-018-0594-4
van Belle, J., van Raalten, T., Bos, D. J., Zandbelt, B. B., Oranje, B., & Durston, S. (2015). Capturing the dynamics of response variability in the brain in ADHD. NeuroImage: Clinical, 7, 132–141. https://doi.org/10.1016/j.nicl.2014.11.014
Wagenmakers, E. J., Farrell, S., & Ratcliff, R. (2004). Estimation and interpretation of 1/fα noise in human cognition. Psychonomic Bulletin & Review, 11(4), 579–615. https://doi.org/10.3758/BF03196615
Wang, L. P., Hamaker, E., & Bergeman, C. S. (2012). Investigating inter-individual differences in short-term intra-individual variability. Psychological Methods, 17(4), 567. https://doi.org/10.1037/a0029317
Willcutt, E. G., Nigg, J. T., Pennington, B. F., Solanto, M. V., Rohde, L. A., Tannock, R., Loo, S. K., Carlson, C. L., McBurnett, K., & Lahey, B. B. (2012). Validity of DSM-IV attention deficit/hyperactivity disorder symptom dimensions and subtypes. Journal of Abnormal Psychology, 121(4), 991–1010. https://doi.org/10.1037/a0027347
This is an open access article distributed under the terms of the Creative Commons Attribution License (4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Supplementary Material