The dynamic and context-appropriate shift between maintaining and updating goal-relevant information in working memory (WM) is believed to be governed by a fronto-striatal gating system, which regulates the entry of information into WM by determining which information is allowed in and which is kept out. Prior research suggests that acute stress may affect WM functioning and the associated cortical and subcortical regions; however, it remains unclear which specific component processes of WM are affected. Here we used the reference-back paradigm to disentangle potential effects of acute stress on WM maintenance, updating, gate opening, and gate closing processes. Ninety-six participants completed the task twice: once at baseline and again while undergoing either a stress or a control manipulation, in a between-subjects design. While no stress effects emerged for reaction times, acute stress detrimentally impacted accuracy on trials requiring WM maintenance. Drift diffusion modeling revealed that acute stress reduced overall drift rates (information processing speed), with larger group differences on maintenance trials. No effects emerged for response caution. To contextualize our findings, we conducted a meta-analysis (42 studies, 2008 participants, 210 effects). WM tasks were categorized by key demands: maintenance only, maintenance + overcoming competition, and maintenance + updating. The meta-analysis revealed impaired accuracy in tasks requiring both maintenance and updating, like the reference-back paradigm used in our own study, but not in tasks without updating demands. Collectively, our findings suggest that acute stress impairs response accuracy but not speed, and only in tasks with both maintenance and updating demands.

In the pursuit of our goals, stable focus is essential, as internal and external distractions have the potential to interfere with our progress. However, life is dynamic and demands the ability to shift between objectives in order to adapt to ever-changing circumstances. Working memory (WM) is thought to play a pivotal role in this adaptive process, through the maintenance and updating of information that subserves our aims (Cowan, 2017; Oberauer, 2009). The dynamic balance between maintenance and updating has been proposed to be regulated by a fronto-striatal gating mechanism, acting via multiple parallel loops (PBWM: Prefrontal cortex Basal ganglia Working memory Model; Frank et al., 2001; Hazy et al., 2006, 2007; O’Reilly & Frank, 2006). In this system, dopamine (DA) plays a pivotal role, with its activity in the striatum and in the prefrontal cortex (PFC) serving to facilitate WM updating and WM maintenance, respectively (Broadway et al., 2018; Cools et al., 2004; Cools & D’Esposito, 2011; Cools & Robbins, 2004; D’Esposito & Postle, 2015; Durstewitz & Seamans, 2008; Furman et al., 2021). Specifically, it has been proposed that salient stimuli can trigger phasic dopaminergic bursts in the striatum, leading to thalamic disinhibition, which subsequently excites the PFC (i.e., opens the gate to WM) and enables WM updating (Cools & D’Esposito, 2011; Ott & Nieder, 2019; Rac-Lubashevsky et al., 2017). Conversely, tonic inhibition of the thalamus implements a gate-closed state, wherein WM content is protected against interference within or via the PFC (Chatham et al., 2014; D’Esposito & Postle, 2015; Postle, 2006; Verschooren et al., 2021).

Constrained by its limited capacity, WM must regulate the trade-off between maintenance and updating of its active representations to ensure optimal functioning (Oberauer, 2018). Indeed, if the balance tilts toward maintenance or toward updating, maladaptive behaviors can arise, namely perseveration or heightened distractibility, respectively (Olesen et al., 2006; Stedron et al., 2005). Among the factors that may affect this trade-off, acute stress is a strong candidate, as it is known to influence the fronto-striatal network (Arnsten, 2009, 2015). Acute stress emerges when perceived task demands exceed the available resources (Lazarus & Folkman, 1984), and causes the release of glucocorticoids and catecholamines (such as DA and noradrenaline) within the PFC and connected subcortical regions (Bahari et al., 2018; Mitchell et al., 2008; Pruessner et al., 2004; Qin et al., 2009, 2012; Schwarz & Luo, 2015; van Ast et al., 2016; Wanke & Schwabe, 2020). In many cases, acute stress has been shown to diminish cognitive flexibility (Goldfarb et al., 2017; Plessow et al., 2011, 2012; for a comprehensive review see Shields et al., 2016), a phenomenon closely associated with an increased reliance on striatal regions during cognitive task execution, which promotes a more habitual and less adaptive response pattern (Vaessen et al., 2015; Wirz et al., 2018). However, the effect of stress on cognitive flexibility appears to be more complex and nuanced: Tona and colleagues (2020) reported no acute-stress effects on cognitive flexibility, while other studies showed positive effects on tasks requiring flexible adjustment of behavior through trial-and-error learning (Dayan & Yu, 2006; Gabrys et al., 2019).

However, mixed findings have emerged from studies that employed paradigms demanding both WM updating and maintenance (e.g., the n-back task), in which beneficial (Lin et al., 2020), adverse (Bogdanov & Schwabe, 2016; Schoofs et al., 2008), or even no acute stress-related effects (Bakvis et al., 2010; Stone et al., 2021) were found. While it is challenging to disentangle acute stress effects on WM updating from those on WM maintenance in these kinds of tasks (Oberauer, 2009), a similarly complex pattern of results was observed in studies restricted to the influence of acute stress on WM maintenance (as assessed via, e.g., the forward digit span). That is, while some studies have shown impairments (Oei et al., 2006), others have shown improvements following acute stress induction (Stauble et al., 2013). The heterogeneous outcomes observed in previous research may be attributed to several factors, including the diverse array of WM paradigms (Oberauer, 2009), variations in the time interval between the application of the stressor and the WM task (Geißler et al., 2022; Hermans et al., 2014), and the type of stressor employed (Shields et al., 2016).

Within this mosaic of findings, focusing specifically on the possible effects of acute stress on the gating mechanism that regulates the maintenance versus updating of WM representations may provide useful insights into the effects of acute stress on WM.

As such, our study endeavors to enhance the existing body of literature by investigating the influence of acute stress on WM gating. To achieve this, we employed the reference-back paradigm, a well-established task designed to decompose WM into its component processes: maintenance, updating, gate opening, and gate closing (Rac-Lubashevsky & Kessler, 2016a, 2016b). To the best of our knowledge, whether and to what extent acute stress can specifically affect WM gating operations remains uncharted, and examining this question may offer novel insights into the field.

The reference-back paradigm shares similarities with the classical n-back task, in which participants assess whether the presented stimulus matches the one shown N trials earlier, requiring WM content to be updated on each trial. However, unlike in the standard n-back task, not all trials within the reference-back paradigm necessitate WM updating. In this task, two types of trials are intermixed, namely “comparison” and “reference” trials, with only the latter requiring WM updating. Both types of trials require participants to indicate whether the currently presented stimulus matches or mismatches the one presented on the most recent reference trial (i.e., the reference stimulus). This means participants must continuously maintain in WM the stimulus shown on the most recent reference trial, but, crucially, on reference trials they must also update their WM content, as the currently presented stimulus now serves as the reference stimulus for subsequent trials. In contrast, on comparison trials the reference stimulus remains unchanged and must therefore be protected from updating (through robust maintenance).

The transition from a comparison to a reference trial requires opening the gate to WM (gate opening), while the opposite transition requires closing it (gate closing). In contrast, repetitions of either reference or comparison trials presumably leave the state of the gate unchanged. The behavioral cost of opening the gate to WM is calculated as the difference between reference-switch and reference-repetition trials. The same logic applies to the computation of the gate closing cost, namely the difference between comparison-switch and comparison-repetition trials. Finally, the updating cost is calculated as the difference between reference-repetition and comparison-repetition trials, to isolate the cost of updating WM content from the cost of switching the state of the gate to WM. It should be noted that WM maintenance cannot be isolated with a specific contrast, but comparison-repetition trials can serve as an index. Although comparison-repetition trials require both WM maintenance and a matching decision, the latter is required on all trials, and therefore an effect exclusively on comparison-repetition trials is likely to reflect a change in WM maintenance. A possible sequence of trials is depicted in Figure 1 (panel C), while a summary of the cognitive costs is given in Table 1 of the Data Analysis section.
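To make these contrasts concrete, the three costs can be computed from per-condition mean reaction times as simple differences. The following Python sketch is purely illustrative (the values are hypothetical, and our actual analyses were run in R):

```python
# Illustrative computation of the reference-back cost contrasts from
# per-condition mean reaction times in seconds (values are hypothetical).
mean_rt = {
    ("reference", "repetition"): 0.620,
    ("reference", "switch"): 0.680,
    ("comparison", "repetition"): 0.560,
    ("comparison", "switch"): 0.610,
}

# Gate opening cost: reference-switch minus reference-repetition trials.
gate_opening = mean_rt[("reference", "switch")] - mean_rt[("reference", "repetition")]

# Gate closing cost: comparison-switch minus comparison-repetition trials.
gate_closing = mean_rt[("comparison", "switch")] - mean_rt[("comparison", "repetition")]

# Updating cost: reference-repetition minus comparison-repetition trials,
# isolating updating demands from gate-switching demands.
updating = mean_rt[("reference", "repetition")] - mean_rt[("comparison", "repetition")]
```

The same differences can be computed on accuracy, in which case a cost manifests as lower accuracy in the first condition of each contrast.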

Several recent studies that used this paradigm to investigate the neural correlates of WM gating processes (Nir-Cohen et al., 2020; Rac-Lubashevsky & Kessler, 2018, 2019; Rempel et al., 2021; Yu et al., 2022) provided evidence that is broadly consistent with the PBWM model (Frank et al., 2001). For instance, gate opening, but not gate closing, has been found to involve activation of the basal ganglia, the thalamus, and the frontoparietal network (FPN), while updating has been found to rely on FPN activity only (Nir-Cohen et al., 2020, but see Nir-Cohen et al., 2023). Moreover, gate closing seems to be associated with higher frontal-theta activity (Rac-Lubashevsky & Kessler, 2018), a typical signature of goal shielding (Cavanagh & Frank, 2014; M. X. Cohen & Donner, 2013), which fits with the observation of a more cautious response pattern (i.e., slower but more accurate responses) when transitioning from a reference to a comparison trial (Rac-Lubashevsky & Frank, 2021).

Interestingly, other studies have provided indirect evidence supporting the role of dopaminergic activity in modulating WM gating mechanisms. They have shown a DA-striatal-related modulation (measured through eye blink rates; Jongkees & Colzato, 2016) of WM updating and gate switching (Rac-Lubashevsky et al., 2017), and a baseline-dependent modulation (via administration of L-Tyrosine, the precursor of DA) of gate opening (Jongkees, 2020). These two findings corroborate the core assumption of the PBWM model, which attributes a crucial role to the basal ganglia in regulating WM gating processes (Frank et al., 2001; Hazy et al., 2006).

Building on these findings, two recent studies employed the Drift Diffusion Model (DDM) to analyze reference-back task data, providing insights into the decision-making mechanisms underlying WM processes (Boag et al., 2021; Rac-Lubashevsky & Frank, 2021). The DDM assumes that decisions are made by accumulating noisy evidence until a decision threshold favoring one of two options is reached, thereby triggering response execution (Fudenberg et al., 2020; Ratcliff et al., 2016). Rac-Lubashevsky and Frank (2021) observed elevated decision thresholds during gate closing and behavioral response switches. The threshold increase when closing the gate to WM has been suggested to reflect greater response caution due to the conflict experienced when suppressing the automatic tendency to update WM content, a tendency that builds up in tasks requiring frequent updating. They also observed reduced drift rates (i.e., slower information processing) under conditions requiring greater cognitive control, such as gate opening, gate closing, and updating, which they interpreted as reflecting increased task difficulty rather than a change in response strategy (i.e., threshold adjustments; Rac-Lubashevsky & Frank, 2021). Extending this work, Boag et al. (2021) modeled how all reference-back costs affected drift rates and non-decision time (ndt; time spent outside the decision stage), while the threshold was modulated exclusively by the updating cost. Consistent with Rac-Lubashevsky and Frank (2021), they reported higher thresholds for reference versus comparison repetition trials, indicating an updating cost. Furthermore, they observed reductions in drift rate for gate opening, gate closing, and updating. All costs, except for gate opening, were also associated with increases in ndt.
Collectively, these findings highlight the diverse cognitive processes underlying WM functioning, particularly in reference-back tasks, where costs seem to arise from control mechanisms (reduction of information processing speed and changes in response strategy), and additional processes outside the decision stage.
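The accumulation process assumed by the DDM can be sketched as a simple random walk. The following Python simulation is purely illustrative (the parameter values are hypothetical and not fitted to our data): it shows why, at a fixed threshold, a lower drift rate produces both slower and less accurate responses.

```python
import random

def simulate_ddm(drift, threshold, ndt, dt=0.001, noise=1.0):
    """Simulate one drift-diffusion trial: noisy evidence accumulates from
    zero until it crosses +threshold (correct) or -threshold (error)."""
    evidence, t = 0.0, 0.0
    while abs(evidence) < threshold:
        evidence += drift * dt + noise * random.gauss(0.0, dt ** 0.5)
        t += dt
    return t + ndt, evidence > 0  # (reaction time in s, correct?)

random.seed(1)
# Hypothetical parameters: a lower drift rate (as we observed under stress)
# yields slower and less accurate responses at the same decision threshold.
high_drift = [simulate_ddm(drift=2.5, threshold=1.0, ndt=0.3) for _ in range(500)]
low_drift = [simulate_ddm(drift=1.0, threshold=1.0, ndt=0.3) for _ in range(500)]

mean_rt = lambda trials: sum(rt for rt, _ in trials) / len(trials)
accuracy = lambda trials: sum(ok for _, ok in trials) / len(trials)
```

This separation of drift rate (processing efficiency), threshold (response caution), and ndt (encoding and motor time) is what allows the DDM analyses above to distinguish difficulty effects from strategic adjustments.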

The Present Study

To assess whether and to what extent acute stress can differentially affect WM gating processes, we required participants to perform the reference-back task twice (Figure 1, panel A): first on its own (i.e., as a baseline assessment) and subsequently while interleaving it block-wise with either a stress-inducing task (i.e., the Paced Serial Addition Test, PSAT; Gronwall, 1977) or a control task (i.e., the Paced Serial Reading Task, PSRT; Tanosoto et al., 2015), in a between-subjects design.

Figure 1.

Panel A: Schematic representation of the study procedure, which consisted of two phases (Phase 1, i.e., baseline assessment phase, and Phase 2, i.e., test phase), separated by a break of 2-3 minutes. RefBack, Reference Back paradigm; PSAT, Paced Serial Addition Test; PSRT, Paced Serial Reading Task. Panel B: Graphical representation of the PSAT (left) and PSRT (right). In each trial, a digit is presented. Depending on the group assignment, participants are required to add or repeat the presented digits. Panel C: Graphical representation of the reference-back paradigm. In each trial, a stimulus is presented inside a blue or red frame. The frame’s color at trial n determines whether the letter is a reference or comparison trial (red vs. blue). Furthermore, if the color of the frame is equal to that of trial n-1, trial n is defined as a repetition trial; otherwise, it is a switch trial. WM updating is required on all reference trials, whereas comparison trials require maintenance.


The choice of this specific stressor was motivated by two main considerations. First, we needed to induce acute stress within an online experimental context, as the study was carried out when some restrictions related to the COVID-19 pandemic were still in place in Italy. Second, we wanted to ensure that the effects of the acute stress induction would persist for the entire duration of the task. Catecholamine-related acute stress effects typically become evident within the first few minutes after exposure and tend to fade as time goes on (Elzinga & Roelofs, 2005; Geißler et al., 2022). As the reference-back task takes a relatively long time to complete, we deemed it necessary to repeat the stress exposure throughout the task. To enhance the effectiveness of our stress induction, we introduced a socio-evaluative component, wherein participants were informed that their performance on the stressor task (PSAT) would be monitored and evaluated by the experimenter (for further details, see the Method section). Acute-stress research has indeed shown that stressors involving social-evaluative components are more effective in eliciting a stress response (Shields et al., 2016).

Considering that acute stress promotes more striatal-driven performance (Otto et al., 2013; Vogel & Schwabe, 2016; Wirz et al., 2018), we expected that, compared to the control group, the stress group would show facilitated gate opening when performing the reference-back task under stress relative to baseline. Such an effect could also result in facilitated updating when performing the task under stress. If the gate opening cost, but not the updating cost, is modulated by stress, this would suggest that prior observations of WM updating improvements under stress might actually reflect an improved ability to open the gate to WM rather than improved updating per se (Goldfarb et al., 2017). Finally, the promotion of striatal-driven performance by acute stress could also result in maintenance impairments (Elzinga & Roelofs, 2005; Oei et al., 2006; Otto et al., 2013; Schoofs et al., 2008), which could manifest as a less efficient ability to close the gate to WM when performing the task under stress.

To anticipate our results, this study revealed nuanced effects of stress on WM processes. Our initial analysis of reaction times showed no substantial between-group differences related to our stress manipulation. However, the accuracy analysis uncovered a more complex picture. The Stress group showed a smaller updating cost (i.e., a smaller performance difference between comparison and reference trials) than the Control group. However, this was not due to better performance on reference trials; instead, we found worse performance on comparison trials. This suggests that stress primarily challenges the maintenance of information rather than facilitating the updating process. To further investigate these findings, we employed a DDM to decompose participants’ behavior into latent variables reflecting the fidelity of information processing (drift rates) versus the choice of response strategy (decision threshold). This analysis corroborated and extended our initial results, revealing that stressed participants had lower drift rates than controls, particularly during maintenance. These outcomes reinforce the notion that stress had a more pronounced impact on WM maintenance than on gating and/or updating processes.

To contextualize our findings in the existing literature, we conducted a meta-analysis, presented as Study 2. This comprehensive analysis examined a wide range of stressors and WM tasks, categorizing them based on specific WM demands (maintenance tasks such as the digit span forward; maintenance + overcoming competition tasks like the operation span; and maintenance + updating tasks such as the n-back; Oberauer, 2009). Importantly, this categorization of tasks does not allow us to dissociate effects of stress on WM maintenance vs updating per se. However, it allows us to examine whether stress impacts WM performance differentially based on whether there are concurrent or alternating task demands for resisting distraction (the maintenance + overcoming competition category) or updating (the maintenance + updating category). Synthesizing data from 42 studies and 210 effects using a multilevel multivariate model, this meta-analysis highlighted a negative impact of stress on maintenance + updating paradigms, particularly for accuracy measures. These findings complement and extend the results of Study 1, suggesting that acute stress may impact WM performance when tasks demand a dynamic balance between maintenance and updating of WM representations, but not when tasks solely require WM maintenance.
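As a schematic illustration of how individual effects are combined in a meta-analysis, the following Python sketch performs simple inverse-variance pooling of standardized effect sizes. This is a deliberately simplified stand-in for the multilevel multivariate model actually used in Study 2 (which additionally handles dependent effects nested within studies), and the effect sizes and variances below are hypothetical.

```python
# Minimal inverse-variance pooling of standardized effect sizes.
# Hypothetical (effect, sampling variance) pairs, one per study;
# negative values denote stress-related impairment.
effects = [(-0.40, 0.04), (-0.25, 0.09), (-0.10, 0.06)]

# Each effect is weighted by the inverse of its sampling variance,
# so more precise estimates contribute more to the pooled effect.
weights = [1.0 / v for _, v in effects]
pooled = sum(w * g for (g, _), w in zip(effects, weights)) / sum(weights)

# Standard error and 95% confidence interval of the pooled effect.
se = (1.0 / sum(weights)) ** 0.5
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
```

A multilevel model generalizes this weighting scheme by adding random effects for studies and for multiple effects within a study, which is why it was preferred for the 210 dependent effects synthesized in Study 2.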

Methods

Participants

Participants were recruited through personal contacts, using opportunistic and snowball sampling, and via social media (e.g., Facebook, Instagram). Participants were considered eligible if they were between 18 and 40 years old, were Italian native speakers, had normal or corrected-to-normal eyesight and no color blindness, reported no drug use, and had no diagnosed psychiatric disorders and/or intellectual disability. After recruitment, participants were randomly assigned to either the Stress or the Control group. Participants were not compensated for their participation, and the experiment lasted about one hour. Overall, we were able to recruit 106 participants.

From this sample, 10 participants were excluded from the analyses due to unstable internet connection (N=2), drowsiness (N=1), or failure to reach the 80% accuracy cut-off required to complete the practice phase of the main experiment before proceeding to the experimental blocks (N=5; see “Procedure”). Therefore, data from a sample of 96 young adults were analyzed (Control group: 39 females, 9 males, mean age = 24.8, SD = 2.6; Stress group: 39 females, 9 males, mean age = 23.7, SD = 2.6). Our sample size determination was guided by previous studies on stress and WM and by practical resource constraints (Lakens, 2022). To acknowledge this limitation, we adopted a conservative analytical approach, preferring caution over potentially reporting chance effects (see Data Analysis section).

The study conformed to the ethics standards of the 1975 Declaration of Helsinki (as revised in 1983) and was approved by the Ethics Committee of the School of Psychology of the University of Padua (protocol n° 4529). All participants provided online informed consent prior to participation.

Procedure

The study consisted of two separate sessions. During the first session, with the aim of characterizing the sample on variables that could potentially interact with stress-induced effects and of ruling out possible between-group differences, we administered a battery of questionnaires. These encompassed assessments of various factors, including self-reported interoceptive abilities, levels of anxiety, depression, and stress, the presence of repetitive negative thinking, intolerance of uncertainty, and cognitive control abilities specifically within the context of stress. During this session, which lasted 15-20 minutes, participants were also asked to indicate their age, gender, birthplace, education level, and handedness. In the second session, which lasted about an hour, participants carried out the reference-back paradigm and underwent the experimental manipulation. Both sessions were conducted online while ensuring the confidentiality of all collected data. Participants were asked to read and sign two informed consent forms, one for each session. The first was delivered via Google Forms at the beginning of the first session; the form for the second session was sent to participants via email, with the request to read, sign, and return it to the experimenter before undergoing the second session. We instructed participants to complete the two sessions within a maximum of 6 days of each other, to ensure the reliability of the collected questionnaire data. Unfortunately, in both groups, one-third of participants did not comply with this instruction (Stress: 17/48, maximum number of days between sessions = 25; Control: 15/48, maximum number of days between sessions = 31). For this reason, we did not further consider the questionnaire data, which can nonetheless be found online (https://osf.io/d43vs/).

The reference-back task was created in PsychoPy3 (Peirce et al., 2019) and administered via its online platform Pavlovia. Participants were tested by integrating Pavlovia and Zoom platforms. This procedure was implemented to allow the experimenter to monitor participants’ task performance (see “Materials”) as well as to add a socio-evaluative component to the cognitive stress manipulation. Participants were asked to perform the task in a quiet and distraction-free environment, ensuring the absence of any disruptive sources (e.g., turning off telephone and computer notifications). They were requested to use either a laptop or a desktop PC and to approach the study with the same seriousness as they would in a standard university laboratory.

The second session comprised two phases: the baseline assessment (Phase 1) and the test phase (Phase 2; see Figure 1). Phase 1 began with an assessment of participants’ valence and arousal using an emoji-labeled affect grid (Toet et al., 2018; see “Materials”). We operationalized subjective stress through these indirect measures to track stress responses without potentially biasing participants by questioning them directly about their stress levels. This approach, using an affect grid for stress assessment, is widely adopted in stress research (e.g., Luettgau et al., 2018; Meier et al., 2022; Radenbach et al., 2015; Steinrücke et al., 2019), and allows emotional states to be captured while minimizing social desirability bias. Additionally, given the multifaceted nature of our stressor (see Materials), which encompassed cognitive load, negative affect, and social-evaluative threat, recording both valence and arousal dimensions provided a more comprehensive assessment of participants’ subjective experiences. Participants were also asked to indicate whether their state was accompanied by a body sensation. If so, they were instructed to click on the corresponding area of a body silhouette image, which included the following labeled options: head, shoulders, arms, hands, chest, belly, legs, and feet. Thereafter, participants were provided with the instructions for the reference-back task and practiced it (Rac-Lubashevsky & Kessler, 2016a, 2016b). Only those who achieved an accuracy of 80% or higher during the practice phase were allowed to proceed with the session and complete six experimental blocks of the reference-back task, which served as a baseline assessment of participants’ WM performance.
Following this assessment, participants were presented with the emoji grid and body silhouette for the second time and were asked to rate their performance (i.e., “How well do you think you performed on the memory task?”) using a visual analogue scale ranging from 0 to 100. At the end of Phase 1, participants were asked to take a brief break of 2-3 minutes.

Phase 2 started with participants undergoing a block of either the cognitive stress manipulation (Paced Serial Addition Task, PSAT) or the control manipulation (Paced Serial Reading Task, PSRT; see “Materials” for more details). Immediately afterwards, the emoji grid and body silhouette were presented for the third time. Participants then alternated between the PSAT or PSRT (depending on the assigned condition) and the reference-back task. In detail, each PSAT/PSRT block was followed by two blocks of the reference-back task, resulting in a total of 6 PSAT/PSRT and 12 reference-back blocks. Thereafter, mood and body sensations were measured for the fourth time, followed by a second subjective assessment of task performance. At the end of Phase 2, participants were debriefed and informed by a researcher about the aim of the study and the expected results.

The measures of interest were the reference-back performance contrasts (i.e., the gate opening, gate closing, and updating costs) in Phase 1 and Phase 2, as a function of whether participants underwent the cognitive stress induction or the control manipulation. These contrasts were examined for accuracy and reaction times. Additionally, follow-up analyses were conducted on drift rates and decision thresholds (DDM). Data regarding the self-assessment of performance and the presence of body sensations were collected for exploratory purposes that extend beyond the scope of this manuscript. Consequently, these data are not analyzed or discussed further here, but can be found at https://osf.io/d43vs/.

Materials

Mood Assessment. To assess participants’ mood, we used the EmojiGrid (Toet et al., 2018), a Cartesian grid similar to the widely used Affect Grid (Russell et al., 1989). The EmojiGrid features 5 emoji faces on each side of the grid. The x-axis represents valence, ranging from disliking (e.g., sad emoji) to liking (e.g., smiling emoji), while the y-axis represents arousal, ranging from low (e.g., calm emoji) to high arousal (e.g., excited emoji). A neutral emoji is located at the intersection of the two axes. The grid measured 550 × 550 pixels and was presented centrally; participants were instructed to indicate their current mood by clicking on the location within the grid that best represented their emotional state. Pixel coordinates were translated into values ranging from 1 (low arousal, low valence) to 9 (high arousal, high valence), with 5 representing a neutral mood.
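A plausible mapping from click coordinates to the 1-9 scale could look like the following Python sketch. The origin convention (bottom-left of the grid) and the linear scaling are illustrative assumptions, as the exact transformation used in the experiment is not specified here.

```python
def pixel_to_scale(x_px, y_px, grid_size=550, lo=1, hi=9):
    """Map a click on the 550x550 EmojiGrid to (valence, arousal) on the
    1-9 scale. Assumes (0, 0) is the bottom-left corner of the grid; the
    actual coordinate conventions in the experiment may differ."""
    # Clamp to the grid, then scale each axis linearly onto [lo, hi].
    frac_x = min(max(x_px / grid_size, 0.0), 1.0)
    frac_y = min(max(y_px / grid_size, 0.0), 1.0)
    valence = lo + frac_x * (hi - lo)
    arousal = lo + frac_y * (hi - lo)
    return round(valence, 2), round(arousal, 2)
```

With this convention, a click at the grid center (275, 275) maps to the neutral point (5, 5) on both dimensions.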

Reference-Back Paradigm. The reference-back task involves two types of trials: comparison and reference trials (i.e., Trial Type, Figure 1, panel C). The key distinction between them is that the former only requires a matching decision, while the latter involves both a matching decision and an updating operation. Trials preceded by a trial of the same type are referred to as repetition trials, while trials in which the trial type switches are referred to as switch trials (i.e., Switching, Figure 1, panel C). In each trial, participants are presented with a target stimulus, which can be either the capital letter “X” or “O”. The target stimulus appears within a colored frame (either red or blue), whose color serves as a cue indicating a reference or a comparison trial (counterbalanced across participants). Participants are asked to indicate whether the letter presented on the screen is the same as (i.e., match judgment) or different from (i.e., mismatch judgment) the most recent one shown within the reference frame.

Phase 1 and Phase 2 comprised 6 and 12 experimental blocks, respectively. Each reference-back block included 49 trials. The first trial of each block lasted 1.8s and served to set the first reference stimulus; no response was required for this trial. For the remaining trials within each block, the sequence was as follows: a blank screen was presented for 0.5s, followed by a central fixation for a jittered duration ranging between 0.9s and 1.2s. Afterwards, the colored frame and the target stimulus were presented until a response was made, for a maximum duration of 1.8s. Match and mismatch responses were given by pressing the “L” and “S” buttons, respectively, on a QWERTY keyboard. Participants were instructed to respond as quickly and accurately as possible. The assignment of frame colors and response buttons was counterbalanced across participants. During the practice block, performance feedback (i.e., “Risposta corretta!” [“Correct answer!”] for correct responses, “Errore!” [“Error!”] for incorrect responses, and “Troppo lento!” [“Too slow!”] for slow responses) was provided for 0.5s. Within each block, trial type (reference vs. comparison), match type (match vs. mismatch), and switching type (switch vs. repetition) occurred with equal probability. Given that Phase 1 comprised 6 blocks, it included a total of 288 trials (excluding the first trial of each block, which did not require a response), with 72 trials for each combination of the conditions of interest (e.g., reference-repetition, reference-switch, etc.), while Phase 2 comprised 12 blocks and thus 576 trials.
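The trial counts above follow directly from the design and can be sanity-checked, for example in Python:

```python
# Sanity check of the trial counts reported above.
trials_per_block = 49
responsive_per_block = trials_per_block - 1  # the first trial only sets the reference

phase1_blocks, phase2_blocks = 6, 12
phase1_trials = phase1_blocks * responsive_per_block  # responsive trials in Phase 1
phase2_trials = phase2_blocks * responsive_per_block  # responsive trials in Phase 2

# Trial type (reference/comparison) crossed with switching (switch/repetition).
conditions = 2 * 2
phase1_per_condition = phase1_trials // conditions
```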

Experimental Manipulation. Cognitive stress was induced by means of a modified version of the Paced Auditory Serial Addition Task (PASAT; Gronwall, 1977). The PASAT requires participants to add pairs of digits. In our study, digits were visually presented at the center of the screen, once every 1.55s. Except for the first digit, which did not require a response, participants were instructed to add the current digit with the one immediately preceding it (i.e., first digit + second digit, second digit + third digit… ) and to verbally report the computed sum. The task implemented in the control condition was adapted from the Paced Auditory Numeral Reading Task (Tanosoto et al., 2015). It followed the same trial sequence as that of the cognitive stress task, but here participants were asked to read aloud the presented digits without performing any computation. Seven lists of 40 pseudo-randomly generated digits were created and presented to the participants in a fixed order. All participants were presented with the same sequence of digit lists. Participants in the stress group were informed that their performance would be monitored and evaluated by the experimenter via Zoom; thus, they were instructed to perform to the best of their abilities. In the control group, participants were informed that the experimenter would monitor the execution of the task, but without evaluating their performance. The rapid presentation rate of digits (i.e., 1.55s) and the Zoom set-up were implemented to induce stress in terms of cognitive load, negative affect, and social-evaluative threat (Ehrhardt et al., 2022; Lejuez et al., 2003; Poppelaars et al., 2019). In contrast, our control condition aimed to evoke a similar physiological activation due to the fast respiratory rate caused by reading the digits aloud (Tanosoto et al., 2015), without provoking stress.
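The PASAT response rule described above can be sketched as follows; this is a minimal illustration with a hypothetical digit list (the actual lists are available on the OSF repository).

```python
def pasat_expected_sums(digits):
    """Expected verbal responses: each digit is added to the one immediately
    preceding it. No response is required for the first digit."""
    return [prev + cur for prev, cur in zip(digits, digits[1:])]

# Hypothetical example: participants would report 8, then 7, then 10.
print(pasat_expected_sums([3, 5, 2, 8]))  # [8, 7, 10]
```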

Data Analysis

Data pre-processing, analysis, and visualization were conducted in R (R Core Team, 2021; version 4.1.2; https://www.R-project.org/) using the following packages: ggplot2 (Wickham, 2016), sjPlot (Lüdecke, 2021), emmeans (Lenth, 2022), performance (Lüdecke et al., 2021), readxl (Wickham & Bryan, 2019), readr (Wickham et al., 2022), reshape (Wickham, 2022), dplyr (Wickham et al., 2023), see (Lüdecke et al., 2021), brms (Bürkner, 2017), bayestestR (Makowski, Ben-Shachar, & Lüdecke, 2019), tidybayes (Kay, 2023), and patchwork (Pedersen, 2022). The data, analysis scripts, and materials for this study are available at https://osf.io/d43vs/.

All the analyses were performed using a Bayesian approach, which allows quantifying evidence both against and in support of the null hypothesis (Kruschke, 2013; Kruschke & Liddell, 2018). Markov chain Monte Carlo sampling (Gamerman & Lopes, 2006) was used to approximate the posterior distribution of the parameters, and the Maximum Probability of Effect (MPE) and the Region of Practical Equivalence (ROPE; Kruschke & Liddell, 2018) approach served for hypothesis testing. The MPE, ranging from 0.50 to 1, represents the probability that a parameter is strictly positive or negative. It is calculated as the proportion of posterior samples falling on the same side (positive or negative) as the mean posterior estimate. A higher MPE indicates a higher probability of the parameter being different from zero. Additionally, the MPE is closely related to the frequentist p-value, the two being linked by the formula p-value = 2 * (1 - MPE). Therefore, two-sided p-values of 0.1, 0.05, 0.01, and 0.001 correspond to MPEs of 0.95, 0.975, 0.995, and 0.9995, respectively (Makowski, Ben-Shachar, Chen, et al., 2019).
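As a minimal sketch (not the authors' code), the MPE described above can be computed from posterior samples as follows:

```python
def mpe(samples):
    """Proportion of posterior samples on the same side of zero as the posterior mean."""
    mean = sum(samples) / len(samples)
    positive_side = mean > 0
    same_side = sum(1 for s in samples if (s > 0) == positive_side)
    return same_side / len(samples)

def two_sided_p_from_mpe(m):
    """Frequentist correspondence noted in the text: p = 2 * (1 - MPE)."""
    return 2 * (1 - m)

print(mpe([1.0, 2.0, -1.0, 3.0]))   # 0.75
print(two_sided_p_from_mpe(0.975))  # ~0.05
```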

The ROPE is defined as a region of values practically equivalent to the null value (i.e., corresponding to the null hypothesis), against which the bulk of the posterior distribution is compared. This yields the probability that the estimated parameter is practically equivalent to zero (i.e., lies within the ROPE). The higher the percentage of the posterior distribution falling within the ROPE, the stronger the evidence in favor of the null hypothesis (and, therefore, against the alternative hypothesis, from now on referred to as the target hypothesis); vice versa, the lower the overlap with the ROPE, the stronger the evidence in favor of the target hypothesis (and, therefore, against the null hypothesis). ROPE limits were set based on the measurement scale of the respective dependent variable and therefore varied across analyses (for details, see the respective data analysis sections). In all the analyses, as suggested by McElreath (2020), the 89% Highest Posterior Density Interval (89% HPDI) of a given parameter was compared with the ROPE. If the HPDI is completely outside the ROPE, the target hypothesis is supported; conversely, if the HPDI overlaps with the ROPE, the null hypothesis is supported (Kruschke & Liddell, 2018). For descriptive purposes, we considered the following percentages of overlap with the ROPE as support for the target hypothesis: less than 5% for strong evidence, 5% to 10% for moderate evidence, and 10% to 15% for weak evidence.
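The HPDI + ROPE comparison can be sketched as a simple classifier; this is an illustration only, and it additionally labels partial overlap as inconclusive, matching how partially overlapping intervals are described later in the Results (the graded 5/10/15% criteria would further use the exact percentage of the HPDI inside the ROPE).

```python
def rope_decision(hpdi_low, hpdi_high, rope_low, rope_high):
    """Classify an 89% HPDI against a ROPE."""
    if hpdi_high < rope_low or hpdi_low > rope_high:
        return "target hypothesis supported"      # HPDI completely outside the ROPE
    if rope_low <= hpdi_low and hpdi_high <= rope_high:
        return "null hypothesis supported"        # HPDI completely inside the ROPE
    return "inconclusive"                         # partial overlap

print(rope_decision(0.19, 0.22, -0.024, 0.024))   # target hypothesis supported
print(rope_decision(-0.02, 0.01, -0.024, 0.024))  # null hypothesis supported
```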

The use of both MPE and ROPE provides a more comprehensive assessment of our hypotheses. While the MPE offers a straightforward test against zero, the ROPE allows for a more nuanced evaluation of practical “significance” (Kruschke & Liddell, 2018). The ROPE is more restrictive and influenced by the choice of its range values, potentially leading to more conservative conclusions. In contrast, the MPE provides a continuous measure of effect probability, offering additional granularity in our interpretations (Makowski, Ben-Shachar, Chen, et al., 2019).

Model diagnostics were performed using the R-hat statistic (Gelman, 1996), with a maximum accepted value of 1.05, as suggested by Vehtari et al. (2021). Additionally, visual inspection of trace plots and posterior predictive checks were performed.

Mood Assessment

Data of interest consisted of the Cartesian coordinates (in pixels) of mouse clicks, where the x-coordinate represented valence and the y-coordinate represented arousal. However, due to a technical error, the data were mistakenly recorded in height units instead of pixels, which was the intended measurement scale for the grid. To address this issue, we retrieved the screen size (in pixels) of each participant’s monitor and converted the height coordinates into pixel coordinates. Unfortunately, data from 12 participants (7 in the Stress group and 5 in the Control group) could not be included, as we were unable to recover their monitor information. Consequently, the analyses of mood data included only 84 participants. Prior to the analysis, pixel coordinates were translated into values ranging from 1 to 9. Mood ratings were collected at four time points (T1, T2, T3, T4). Separate linear models were conducted for valence and arousal ratings with Group, Time, and their interaction as predictors. Additionally, an intercept was estimated for each participant. As the Time predictor consisted of 4 levels, we set its contrast matrix as follows (k-1 possible contrasts): T1 vs. T2 (Time 2-1), T2 vs. T3 (Time 3-2), and T3 vs. T4 (Time 4-3). The contrast of interest is the one involving T2 and T3, as it reflects the effect of the manipulation on mood ratings. Interactions suggesting a difference between groups were further examined by means of post-hoc between-group comparisons at each time point. The ROPE lower and upper limits were set at -0.1 and 0.1 times the standard deviation of the dependent variable (i.e., a negligible effect; J. Cohen, 1988), respectively. We used weakly informative priors centered on 0 (Normal(0, 2)) for the effects. For both models, we ran 4 chains of 4000 samples each, discarding the first 2000 as burn-in.
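The pixel-to-scale conversion described above might look like the following sketch; the linear mapping and a grid spanning the full axis length are illustrative assumptions, not details from the study.

```python
def pixels_to_rating(coord_px, axis_len_px):
    """Linearly map a pixel coordinate on the affect grid onto the 1-9 rating scale.
    Assumes the grid spans the full axis length (an illustrative assumption)."""
    return 1 + 8 * (coord_px / axis_len_px)

print(pixels_to_rating(0, 1080))     # 1.0 (one edge of the grid)
print(pixels_to_rating(540, 1080))   # 5.0 (grid center)
print(pixels_to_rating(1080, 1080))  # 9.0 (opposite edge)
```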

Reference-Back

As a predetermined accuracy-based removal procedure, we decided to exclude participants who showed less than 80% accuracy in their responses (calculated separately for Phase 1 and Phase 2). However, none of the participants fell below this threshold (mean (min-max); Phase 1: Stress = 0.94 (0.85-0.99), Control = 0.94 (0.82-1); Phase 2: Stress = 0.94 (0.82-0.99), Control = 0.96 (0.86-0.99)). We excluded trials faster than 200ms (i.e., anticipations) and the first two trials of each block: trial 1, which did not require any response, and trial 2, which was the first trial of each block requiring a response. Additionally, for reaction time (RT) data we excluded incorrect responses. Accuracy (ACC) and RT data were analyzed using generalized multilevel linear models. These models simultaneously estimate both group-level and individual-level parameters, with the latter constrained by the group-level distribution. Such models are advantageous when dealing with non-normally distributed data, as they allow for the implementation of the most appropriate distribution. Based on the previous literature (Dixon, 2008; Lo & Andrews, 2015; Verschooren et al., 2021) and visual inspection, we selected an inverse Gaussian distribution (with a log link function) to model RT data, while a binomial distribution (with a logit link function) was used for ACC data. Each model included Group (Stress vs. Control) as a between-subjects predictor and the following within-subjects predictors: Trial Type (reference vs. comparison), Switching (repetition vs. switch), Phase (Phase 1 vs. Phase 2), and all interactions. An intercept and individual parameters for the effects of Phase, Trial Type, Switching, and their interactions were estimated for each participant. We used weakly informative priors for both models and ran 4 chains with 8000 samples each, discarding the initial 4000 samples as burn-in (for the complete model description see https://osf.io/d43vs/).
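The trial-exclusion rules above can be sketched as follows; the data layout (one dict per trial with its within-block position, RT in ms, and correctness) is a hypothetical illustration, not the study's actual data format.

```python
def filter_trials(trials, for_rt=False):
    """Apply the exclusion criteria described in the text."""
    kept = [t for t in trials
            if t["trial_in_block"] > 2     # drop the first two trials of each block
            and t["rt"] >= 200]            # drop anticipations (RT < 200 ms)
    if for_rt:                             # RT analyses additionally drop errors
        kept = [t for t in kept if t["correct"]]
    return kept

demo = [{"trial_in_block": 1, "rt": 500, "correct": True},
        {"trial_in_block": 3, "rt": 150, "correct": True},
        {"trial_in_block": 4, "rt": 520, "correct": False},
        {"trial_in_block": 5, "rt": 610, "correct": True}]
print(len(filter_trials(demo)), len(filter_trials(demo, for_rt=True)))  # 2 1
```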
For the RT analyses, the ROPE lower and upper limits were set at -15ms and 15ms, with the corresponding range on the log scale set at -0.024 and 0.024. As for accuracy, we assumed a variation of 0.01 (-0.01, +0.01), corresponding to 0.1 on the log-odds scale (-0.1, +0.1); that is, a relevant predictor should shift the log-odds of a correct response by at least 0.1 (equivalently, multiply the odds by about 1.11). Our ROPE definition for the RT analysis was based on the typical 30-150ms cost range observed in the reference-back paradigm. We reasoned that between-group differences should fall below these costs, while effects smaller than 15ms would have negligible practical significance. Regarding accuracy, the reference-back paradigm does not consistently produce costs. Therefore, we hypothesized that if between-group differences in accuracy were present, they could be small, but should exceed a minimum threshold of a 1% change in accuracy to be considered practically significant. Given our experimental design and sample size, we reasoned that these values were quite plausible and that stress effects smaller than them would not be of interest.
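The scale correspondences underlying these ROPE limits can be checked numerically. The ~620 ms baseline RT and the 0.9 baseline accuracy below are illustrative assumptions, not values from the study.

```python
import math

# A 15 ms difference around an assumed ~620 ms baseline, expressed on the log scale:
baseline_rt = 620.0
log_diff = math.log(baseline_rt + 15) - math.log(baseline_rt)
print(round(log_diff, 3))        # 0.024, matching the log-scale ROPE limit

# A 0.1 shift in log-odds around an assumed 0.9 baseline accuracy:
p = 0.9
logit = math.log(p / (1 - p))
p_shifted = 1 / (1 + math.exp(-(logit + 0.1)))
print(round(p_shifted - p, 3))   # 0.009, close to the 1% accuracy ROPE
```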

Regardless of the outcome of the full models, additional planned comparisons were performed to evaluate the presence of reference-back costs in terms of gate opening, gate closing, and updating for both ACC and RTs (see Table 1), and to assess possible between-group differences across Phase 1 and Phase 2. Models’ parameter estimates are reported in log and logit scales. For the sake of interpretation, however, planned contrasts were performed after back-transforming the models’ estimates from the log and logit scales to the response scales (i.e., milliseconds and proportions).

Table 1.
Performance costs obtained from the reference-back task.

                 comparison                reference
                 repetition    switch      repetition    switch
Gate Opening                               −             +
Gate Closing     −             +
Updating         −                         +

Notes. Gate Opening is calculated as the difference between reference switch and reference repetition trials; Gate Closing as comparison switch minus comparison repetition trials; and Updating as reference repetition minus comparison repetition trials. In each row, “+” marks the condition entering the cost as minuend and “−” as subtrahend.
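The cost definitions in Table 1 can be sketched as follows, using hypothetical condition means (e.g., mean RTs in ms per Trial Type x Switching cell; the numbers below are for illustration only).

```python
def reference_back_costs(means):
    """means: dict keyed by (trial_type, switching) -> condition mean."""
    return {
        "gate_opening": means[("reference", "switch")] - means[("reference", "repetition")],
        "gate_closing": means[("comparison", "switch")] - means[("comparison", "repetition")],
        "updating": means[("reference", "repetition")] - means[("comparison", "repetition")],
    }

# Hypothetical cell means, for illustration only:
example = {("reference", "switch"): 700, ("reference", "repetition"): 660,
           ("comparison", "switch"): 640, ("comparison", "repetition"): 600}
print(reference_back_costs(example))  # {'gate_opening': 40, 'gate_closing': 40, 'updating': 60}
```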

Drift Diffusion Modeling. As a follow-up analysis, we examined two key parameters that capture the decision process: the threshold (a; see Figure 2) and the drift rate (v; see Figure 2). The threshold parameter (a) reflects the amount of evidence that is needed for a decision to be made. This parameter reflects the cognitive control processes involved in regulating the speed-accuracy trade-off: a higher threshold leads to slower but more accurate responses, while a lower threshold results in faster but less accurate responses (Voss et al., 2004). The drift rate (v) reflects the rate at which evidence accumulates when approaching the response boundaries, and it is assumed to reflect the quality of the decision-driving signal, with a high drift rate leading to faster and more accurate responses. The drift rate can be influenced by various factors including stimulus characteristics, individual processing efficiency, WM capacity, and task difficulty (Muhle-Karbe et al., 2021; Schmiedek et al., 2007). Although not directly relevant to our study, two additional parameters of the DDM are worth mentioning: the non-decision time (ndt) and the starting point (z). The non-decision time encompasses processes that occur outside the decision stage, such as stimulus encoding and motor execution. The starting point represents the location between the upper and the lower boundaries (z in Figure 2) where the diffusion process starts and can reflect a response bias towards one of the two boundaries (Myers et al., 2022). In our analysis, the starting point was fixed halfway between the lower and the upper bound, as our data were accuracy coded (i.e., with the upper bound representing a correct decision and the lower an incorrect one).
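As a toy illustration (not the fitted hierarchical model), the diffusion process described above can be simulated with a simple Euler scheme; the parameter values below are arbitrary.

```python
import random

def simulate_ddm_trial(v, a, ndt, dt=0.001, sigma=1.0, rng=random):
    """Simulate one trial: evidence starts at z = a/2 (accuracy coding) and drifts
    with rate v plus Gaussian noise until it hits 0 (error) or a (correct)."""
    x = a / 2.0
    t = 0.0
    while 0.0 < x < a:
        x += v * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return ndt + t, x >= a   # (reaction time in seconds, correct?)

random.seed(1)
rt, correct = simulate_ddm_trial(v=2.0, a=1.5, ndt=0.3)
```

A higher v makes the upper (correct) boundary more likely and decision times shorter; a higher a slows responses but makes them more accurate, mirroring the speed-accuracy trade-off described above.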

Figure 2.
Illustration of the drift diffusion model.

RT is the result of both non-decision and decision times. Decision time depends on the drift rate (v) and the decision threshold (a), while non-decision time includes the encoding process and the response output process. The evidence accumulation process starts at z, which is located halfway between the two boundaries. Green (correct) and red (errors) traces represent the random accumulation of evidence until a decision is made (i.e., until either the upper or the lower boundary is reached) and a response is triggered.


While ndt has been implicated in both motor execution and task-switching operations (Schmitz & Voss, 2014; Voss et al., 2013), altering the state of the gate is not simply conceptualized as task-switching (i.e., switching between updating and maintenance, or vice versa), as the two gating mechanisms are supported by different neural substrates (Kessler, 2017; Nir-Cohen et al., 2023). For this reason, we followed Rac-Lubashevsky and Frank (2021) in allowing only the threshold and drift rate parameters to vary across conditions (Trial Type, Switching, and Phase); because we expected practice effects, the non-decision time component was allowed to vary across Phases only (Dutilh et al., 2009, 2011; Reinhartz et al., 2023).

RT and ACC data were jointly fitted using the R package brms (Bürkner, 2017), which allows for a hierarchical Bayesian estimation of the DDM parameters. To estimate model parameters, we assigned prior probability distributions to each parameter based on previous literature (Boag et al., 2021; Matzke & Wagenmakers, 2009; Rac-Lubashevsky & Frank, 2021; for the complete model description see https://osf.io/d43vs/).

Following Rac-Lubashevsky and Frank (2021), the behavioral data were fitted after removing the first two trials of each block, omissions, and trials faster than 200ms. We ran 4 chains of 10000 posterior samples each, discarding the first 5000 as burn-in.

For the dependent variables a and v, the model included Group (Stress vs. Control) as between-subject predictor and the following within-subjects predictors: Trial Type (reference vs. comparison), Switching (repetition vs. switch), Phase (Phase 1 vs. Phase 2), and all interactions. An intercept and individual parameters for the effect of Phase, Trial Type, Switching, and their interactions were estimated for each participant.

For the dependent variable ndt, the model included Group (Stress vs. Control) as a between-subjects predictor, Phase (Phase 1 vs. Phase 2) as a within-subjects predictor, and their interaction. An intercept and individual parameters for the effect of Phase were estimated for each participant.

Parameters were estimated using the following formula:

v ~ 1 + Phase * Group * Switching * TrialType +
      (1 + Phase * Switching * TrialType || gr(id, by = Group)),

a ~ 1 + Phase * Group * Switching * TrialType +
      (1 + Phase * Switching * TrialType || gr(id, by = Group)),

ndt ~ 1 + Phase * Group + (1 + Phase * Group || gr(id, by = Group))

The model diagnostic included: examination of R-hat values, visual inspection of traces, and posterior predictive check (see Supplementary Material: Table S5.3, Figure S5, Figure S6).

Contrasts were set to evaluate the presence of a between-group difference across the two experimental phases in terms of reference-back costs. The MPE and the 89% HPDI + ROPE approach served for hypothesis testing. ROPE ranges were set to -0.1 to 0.1 for v, -0.07 to 0.07 for a, and -0.015s to 0.015s for ndt. These values represent around half the range of previously reported reference-back costs observed in DDM applications (Boag et al., 2021).

Results

Mood

Valence. Model parameter estimates are reported in Table S1. None of the parameters’ 89% HPDIs fell completely inside the ROPE, but three fell completely outside it. First, we observed lower mood ratings at T2 as compared to T1 (Time 2-1: -1.23, 89% HPDI = [-1.60, -0.83], MPE = 1, 0% in ROPE). Second, between-group differences were observed when comparing T2 and T3 (Group x Time 3-2: -2.65, 89% HPDI = [-3.39, -1.91], MPE = 1, 0% in ROPE), and T3 and T4 (Group x Time 4-3: 1.30, 89% HPDI = [0.54, 2.02], MPE = 1, 0% in ROPE). Specifically, post-hoc comparisons showed inconclusive evidence when comparing the groups’ valence at T2 (0.53, 89% HPDI = [-0.11, 1.26], MPE = 0.89, 21% in ROPE), but a reliable between-group difference at T3 (-2.12, 89% HPDI = [-2.79, -1.14], MPE = 1, 0% in ROPE), with participants in the Stress group showing lower valence ratings than participants in the Control group. This between-group difference persisted at T4 (-0.82, 89% HPDI = [-1.51, -0.12], MPE = 0.97, 3% in ROPE).

In summary, our stress manipulation successfully induced a negative valenced state (see Figure 3, panel A).

Arousal. Model parameter estimates are reported in Table S2. None of the parameters’ 89% HPDIs fell completely inside the ROPE, but two parameters warranted further investigation as their 89% HPDI proportion inside the ROPE was relatively low: namely, the parameters estimating the interaction between Group and the changes across T2 and T3 (0.87, 89% HPDI = [0.07, 1.72], MPE = 0.95, 5% in ROPE), and Group and the changes across T3 and T4 (-0.97, 89% HPDI = [-1.79, -0.13], MPE = 0.97, 2% in ROPE). Post-hoc comparisons revealed that the evidence for either the absence or the presence of a between-group difference at T2 was insufficient to draw any conclusion (0.15, 89% HPDI = [-0.59, 0.84], MPE = 0.63, 38% in ROPE). However, a reliable between-group difference was evident at T3 (1.03, 89% HPDI = [0.28, 1.72], MPE = 0.99, 0% in ROPE), with participants in the Stress group showing higher ratings (i.e., higher arousal) than participants in the Control group. Lastly, the between-group comparison of mood ratings at T4 was inconclusive (0.05, 89% HPDI = [-0.68, 0.81], MPE = 0.55, 40% in ROPE).

To sum up, while arousal levels were relatively low in both groups at each time point, our manipulation produced a between-group difference at T3, with stressed participants reporting higher arousal levels than controls (see Figure 3, panel B).

Figure 3.
Estimated valence (panel A) and arousal (panel B) ratings across time points as a function of Group (Control: light blue, Stress: orange).

Group Posterior mean is represented by the black colored dot (filled with the same color as the respective group). T1 corresponds to the first mood assessment, collected at the beginning of the session. Ratings at T2 reflect participants’ mood after the completion of the first 6 blocks of the reference back task (Phase 1). T3 is the time point after the experimental manipulation and T4 represents the participant’s mood at the end of the session.


Reference-Back

A summary of model estimates is available in Table S3.1 (for RT) and Table S4.1 (for ACC) of the supplementary material.

RT. First, the 89% HPDI for four of the parameters fell completely inside the ROPE, thereby strongly favoring the null hypothesis. These included the interactions between Phase and Switching (-0.01, 89% HPDI = [-0.02, -0.01], MPE = 1, 100% in ROPE), Group and Switching (-0.00, 89% HPDI = [-0.01, 0.01], MPE = 0.51, 100% in ROPE), Group and Trial Type (0.00, 89%HPDI = [-0.01, 0.01], MPE = 0.55, 100% in ROPE) and Phase, Group and Trial Type (0.00, 89% HPDI = [-0.02, 0.02], MPE = 0.55, 100% in ROPE).

Conversely, the 89% HPDI for three of the parameters fell completely outside the ROPE. These included the effect of Phase (0.20, 89% HPDI = [0.19, 0.22], MPE = 1, 0% in ROPE), Switching (-0.06, 89% HPDI = [-0.06, -0.05], MPE = 1, 0% in ROPE), and Trial Type (-0.08, 89% HPDI = [-0.09, -0.07], MPE = 1, 0% in ROPE). That is, participants were faster during Phase 2 than 1, on repetition than on switch trials, and on comparison than on reference trials. Additionally, moderate evidence suggested the presence of a 4-way interaction (0.05, 89% HPDI = [0.01, 0.08], MPE = 0.99, 10% in ROPE), which we further explored through the planned contrast on the reference-back costs.

Reference-Back Costs. Planned contrasts performed to replicate the well-established reference-back performance patterns in RT revealed reliable gate closing (40ms, 89% HPDI = [34, 46], MPE = 1, 0% in ROPE), gate opening (33ms, 89% HPDI = [27, 39], MPE = 1, 0% in ROPE), and updating costs (57ms, 89% HPDI = [50, 64], MPE = 1, 0% in ROPE).

Contrasts reflecting possible between-group differences at Phase 1 mostly favored the absence of a between-group difference in gate opening costs (0ms, 89% HPDI = [-18, 18], MPE = 0.50, 92% in ROPE), while inconclusive evidence was observed for both gate closing and updating costs (gate closing: -10ms, 89% HPDI = [-28, 6], MPE = 0.83, 69% in ROPE; updating: 6ms, 89% HPDI = [-24, 16], MPE = 0.69, 81% in ROPE).

As for between-group differences at Phase 2, moderate evidence favored the absence of a between-group difference in gate opening and updating costs (gate opening: -6ms, 89% HPDI = [-18, 5], MPE = 0.82, 93% in ROPE; updating: 6ms, 89% HPDI = [-6, 20], MPE = 0.79, 91% in ROPE), while inconclusive evidence was found when comparing the groups’ gate closing costs (12ms, 89% HPDI = [0, 23], MPE = 0.96, 70% in ROPE).

Based on the 89% HPDI + ROPE decision rule, we replicated the well-established RT performance patterns of the reference-back paradigm. However, we did not find compelling evidence for a modulation of these RT patterns by acute stress (see Figure 4).

Figure 4.
The posterior distribution of the estimated contrast quantifies the between-group difference in reference-back costs (reaction time data) in Phase 1 (light grey) and 2 (dark grey), the black bars represent the 89% HPDI.

Values below zero suggest a higher cost in the Stress group compared to the Control group. The purple area is the ROPE (-15ms, 15ms). The more the posterior distribution falls away from zero the higher the maximum probability of effect (MPE). The more the posterior distribution falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.


Estimated RT as a function of Phase, Group, Trial Type, and Switching are illustrated in Figure 5.

Figure 5.
Estimated RT, grouped by Groups (Control: light blue, Stress: orange), type of trial, and experimental phases.

Group mean is represented by the black colored dot (filled with the same color as the respective group), and bars represent the 89% HPDI.


Accuracy. First, the 89% HPDI for two of the parameters fell completely outside the ROPE, namely those reflecting the interactions between Switching and Trial Type (-0.42, 89% HPDI = [-0.58, -0.26], MPE = 1, 0% in ROPE), and between Phase, Group, Switching, and Trial Type (-0.73, 89% HPDI = [-1.25, -0.21], MPE = 0.99, 0% in ROPE). Furthermore, we considered two parameters to warrant investigation as their proportion inside the ROPE was relatively low: namely, the effect of Switching (0.14, 89% HPDI = [0.08, 0.21], MPE = 1, 10% in ROPE) and the interaction between Phase and Group (-0.27, 89% HPDI = [-0.47, -0.06], MPE = 0.98, 5% in ROPE). To summarize, participants were more accurate on repetition trials compared to switch trials, with strong evidence for this effect on reference trials (0.36, 89% HPDI = [0.25, 0.45], MPE = 1, 0% in ROPE). For comparison trials, the evidence for a switching effect was inconclusive (-0.07, 89% HPDI = [-0.17, 0.04], MPE = 0.85, 70% in ROPE). The Control group exhibited a practice effect, with accuracy improving from Phase 1 to Phase 2 (0.27, 89% HPDI = [0.12, 0.42], MPE = 0.99, 0% in ROPE). For the Stress group, the evidence regarding Phase 1 to Phase 2 accuracy changes was inconclusive (0.00, 89% HPDI = [-0.15, 0.15], MPE = 0.50, 82% in ROPE), though 82% of the HPDI fell within the ROPE, suggesting the possibility of no improvement.

Reference-Back Costs. We investigated the 4-way interaction through planned contrasts on the reference-back costs. The contrasts performed to replicate the overall presence of typical reference-back performance patterns in accuracy provided strong evidence favoring the existence of gate opening costs (-0.01, 89% HPDI = [-0.02, -0.01], MPE = 1, 0% in ROPE), and strong evidence for the absence of gate closing (0.00, 89% HPDI = [0.00, 0.01], MPE = 0.89, 100% in ROPE) and updating costs (0.00, 89% HPDI = [0.00, 0.01], MPE = 0.93, 100% in ROPE).

Contrasts reflecting the between-group difference at Phase 1 were inconclusive for gate opening (-0.01, 89% HPDI = [-0.03, 0.00], MPE = 0.94, 37% in ROPE) and updating costs (0.00, 89% HPDI = [-0.01, 0.02], MPE = 0.70, 84% in ROPE), but mostly supported the absence of a between-group difference in gate closing costs (0.00, 89% HPDI = [-0.01, 0.01], MPE = 0.52, 95% in ROPE).

As for the between-group comparisons at Phase 2, evidence was inconclusive for gate closing (-0.01, 89% HPDI = [-0.02, 0.00], MPE = 0.98, 43% in ROPE) and gate opening costs (0.01, 89% HPDI = [-0.00, 0.02], MPE = 0.90, 71% in ROPE). In contrast, the Stress group had smaller updating costs as compared to controls (-0.02, 89% HPDI = [-0.03, -0.01], MPE = 1, 0% in ROPE; see Figure 6).

Figure 6.
The full posterior distribution of the estimated contrast quantifies the between-group difference in reference-back costs (accuracy) in Phase 1 (light grey) and 2 (dark grey), the black bars represent the 89% HPDI.

Values below zero suggest a higher cost in the Control group compared to the Stress group. The purple area is the ROPE (-0.01, 0.01). The more the posterior distribution falls away from zero the higher the maximum probability of effect (MPE). The more the posterior distribution falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.


This resulted from the fact that the Stress group exhibited an inverted updating cost (0.02, 89% HPDI = [0.01, 0.02], MPE = 1, 2% in ROPE) with lower accuracy in comparison repetition as compared to reference repetition trials, while Controls did not exhibit any cost (-0.002, 89% HPDI = [-0.01, 0.003], MPE = 0.78, 100% in ROPE).

Indeed, Figure 7 clearly illustrates that the between-group difference in updating costs at Phase 2 was primarily attributable to lower accuracy on comparison repetition trials in the Stress group relative to the Control group (-0.02, 89% HPDI = [-0.03, -0.01], MPE = 0.99, 0% in ROPE), rather than to differences on reference repetition trials (0.00, 89% HPDI = [-0.01, 0.01], MPE = 0.59, 99% in ROPE).

Figure 7.
Estimated ACC, grouped by experimental phases, Groups (Control: light blue, Stress: orange) and type of trial.

Group Posterior mean is represented by the black colored dot (filled with the same color as the respective group), and bars represent the 89% HPDI.


Drift Diffusion Model. Model estimates are summarized in Table S5.1 of the Supplementary Material. For brevity, only the results of planned contrasts investigating the reference-back costs are reported here. Posterior distributions of drift rates and threshold as a function of Phase, Group, Switching, and TrialType are illustrated in Figure 9 and Figure 11.

Reference-Back Costs on Drift Rate. Planned contrasts performed to replicate the reference-back performance costs in terms of drift rate revealed strong evidence for the presence of gate opening (-0.215, 89% HPDI = [-0.257, -0.172], MPE = 1, 0% in ROPE) and updating costs (-0.207, 89% HPDI = [-0.259, -0.156], MPE = 1, 0% in ROPE), but no reliable gate closing cost (-0.049, 89% HPDI = [-0.094, -0.002], MPE = 0.96, 100% in ROPE).

In Phase 1, between-group comparison of costs (see Figure 8) revealed inconclusive evidence for differences in gate closing (0.114, 89% HPDI = [-0.01, 0.241], MPE = 0.93, 43% in ROPE), gate opening (-0.084, 89% HPDI = [-0.197, 0.036], MPE = 0.87, 60% in ROPE), and updating (0.042, 89% HPDI = [-0.092, 0.178], MPE = 0.70, 79% in ROPE) costs.

As for Phase 2, between-group comparison of costs (see Figure 8) revealed that the Control group had greater gate closing (-0.216, 89% HPDI = [-0.327, -0.111], MPE = 0.99, 0% in ROPE) and updating (-0.271, 89% HPDI = [-0.394, -0.155], MPE = 1, 0% in ROPE) costs than the Stress group, while comparable between-group costs were observed for gate opening (0.004, 89% HPDI = [-0.098, 0.107], MPE = 0.52, 99% in ROPE).

Figure 8.
The full posterior distribution of the estimated contrast quantifies the between-group difference in reference-back costs (drift rate) in Phase 1 (light grey) and Phase 2 (dark grey); the black bars represent the 89% HPDI.

Values below zero suggest a higher cost in the Control group compared to the Stress group. The purple area is the ROPE (-0.1, 0.1). The more the posterior distribution falls away from zero the higher the maximum probability of effect (MPE). The more the posterior distribution falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.


Crucially, the increased costs observed in the Control group resulted from an overall greater increase in drift rates across all trial types relative to the Stress group (Phase x Group: -0.24, 89% HPDI = [-0.36, -0.12], MPE = 1, 0% in ROPE), particularly for comparison repetition trials (see Figure 9), which increased the drift rate difference between comparison repetition and reference repetition trials in the Control group.

Figure 9.
Posterior distributions of drift rates grouped by experimental Phases, Groups (Control: light blue, Stress: orange), and type of trial.

The posterior mean is represented by the black colored dot (filled with the same color as the respective group), and bars represent the 89% HPDI.


Reference-Back Costs on Threshold. Planned contrasts performed to replicate the reference-back performance costs in terms of threshold revealed strong evidence for the presence of gate closing (0.137, 89% HPDI = [0.114, 0.161], MPE = 1, 0% in ROPE) and possibly updating costs (0.085, 89% HPDI = [0.062, 0.111], MPE = 1, 12% in ROPE), but no reliable gate opening cost (0.031, 89% HPDI = [0.008, 0.054], MPE = 0.98, 100% in ROPE).

In Phase 1, between-group comparison of costs revealed inconclusive evidence for a difference in the cost of gate closing (0.068, 89% HPDI = [0.001, 0.136], MPE = 0.95, 52% in ROPE), while evidence favored the absence of a between-group difference in gate opening (-0.013, 89% HPDI = [-0.079, 0.053], MPE = 0.62, 97% in ROPE) and updating costs (0.016, 89% HPDI = [-0.052, 0.085], MPE = 0.64, 95% in ROPE).

As for Phase 2, between-group comparison revealed comparable costs between groups (gate closing = -0.005, 89% HPDI = [-0.056, 0.044], MPE = 0.57, 100% in ROPE; gate opening = -0.020, 89% HPDI = [-0.070, 0.028], MPE = 0.75, 100% in ROPE; updating = -0.021, 89% HPDI = [-0.074, 0.032], MPE = 0.74, 98% in ROPE).

Our findings confirmed the presence of gate-closing and updating costs on the decision threshold, but acute stress did not modulate these costs (Figure 10 and Figure 11).

Figure 10.
The full posterior distribution of the estimated contrast quantifies the between-group difference in reference-back costs (threshold) in Phase 1 (light grey) and Phase 2 (dark grey); the black bars represent the 89% HPDI.

Values below zero suggest a higher cost in the Control group compared to the Stress group. The purple area is the ROPE (-0.07, 0.07). The more the posterior distribution falls away from zero the higher the maximum probability of effect (MPE). The more the posterior distribution falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.

Figure 11.
Posterior distributions of thresholds grouped by experimental Phases, Groups (Control: light blue, Stress: orange), and Trial Type.

The posterior mean is represented by the black colored dot (filled with the same color as the respective group), and bars represent the 89% HPDI.


Non-Decision Time. As expected, both groups showed a decrease in non-decision time across the two experimental phases (Phase: 0.031s, 89% HPDI = [0.025, 0.038], MPE = 1, 0% in ROPE). No reliable between-group differences were observed (Group: 0.014s, 89% HPDI = [0.003, 0.025], MPE = 0.98, 55% in ROPE; Phase x Group: -0.003s, 89% HPDI = [-0.016, 0.010], MPE = 0.66, 98% in ROPE).

These results confirm the anticipated practice effect, with no evidence of between-groups differences.

Discussion Study 1

In brief, we successfully replicated the well-established reference-back performance patterns in RTs, namely the performance costs associated with updating WM content as well as with opening and closing the gate to WM. However, we did not identify any effect of our stress induction on these RT performance costs. We had anticipated that acute stress would promote behavior driven by the striatum (Otto et al., 2013; Vogel & Schwabe, 2016; Wirz et al., 2018), thereby facilitating gate opening and possibly updating processes, while impairing gate closing and maintenance. Although we did observe numerical RT differences attributable to the acute-stress manipulation, these were not substantial enough to be considered meaningful. Nevertheless, when assessing the effects of acute stress on response accuracy, our results partially confirmed our hypothesis. Specifically, during the second phase of the experimental session (when the reference-back task was performed under a stress or control manipulation), participants in the Stress group exhibited lower accuracy than the Control group on trials requiring maintenance. Besides the lower accuracy on maintenance trials, we also observed a smaller updating cost in the Stress group than in the Control group. Notably, the smaller updating cost shown by stressed participants resulted from numerically higher accuracy on trials requiring updating as opposed to maintenance, in line with our expectations. Since updating is facilitated by the gate opening process, which is driven by striatal activity, one might also have expected an improvement in gate opening, but this was not the case. However, it is worth noting that overall accuracy was high in both groups, making it challenging to detect meaningful improvements in accuracy.

For a more comprehensive understanding of the potential effects of acute stress, we conducted a follow-up analysis using the Drift-Diffusion Model (DDM; Ratcliff & McKoon, 2008). Simultaneously modeling the drift rate and decision threshold allowed us to tease apart the effects of acute stress on two latent decision-making processes: information processing and the adopted response strategy. Modeling results showed that both groups exhibited an overall increase in drift rates across the two experimental phases, likely reflecting a practice effect, but this increase was larger in the Control group than in the Stress group. Notably, across the two experimental phases, the Control group showed a more pronounced increase in the costs related to the gate closing and updating processes compared to the Stress group. However, these increased costs in the Control group, computed as performance difference scores, were driven by a larger improvement on consecutive maintenance trials relative to the improvement on other trials. Thus, rather than indicating worse gate closing and updating performance in the Control group, this pattern reflects a particular improvement in WM maintenance across the experimental phases. The finding that the Stress group did not exhibit such an improvement on WM maintenance trials suggests a more pronounced impact of stress on this component of WM.
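The intuition behind interpreting drift rate and threshold can be made concrete with a minimal simulation of the diffusion process. The sketch below is purely illustrative (all parameter values are made up, and it is not the hierarchical Bayesian implementation used in our analyses): a higher drift rate yields faster and more accurate responses, whereas a higher threshold yields slower but more accurate ones.

```python
import random

def simulate_ddm(drift, threshold, ndt=0.3, dt=0.002, noise=1.0, rng=None):
    """Simulate one trial of a simple two-boundary diffusion process.

    Evidence starts midway between the boundaries (no bias) and accumulates
    with rate `drift` plus Gaussian noise until it hits 0 or `threshold`.
    Returns (response time, hit_upper), where hitting the upper boundary
    counts as a correct response.
    """
    rng = rng or random.Random()
    x = threshold / 2.0          # unbiased starting point
    t = 0.0
    while 0.0 < x < threshold:
        x += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
    return ndt + t, x >= threshold

def summarize(drift, threshold, n=1000, seed=1):
    """Mean RT and accuracy over n simulated trials."""
    rng = random.Random(seed)
    rts, n_correct = [], 0
    for _ in range(n):
        rt, correct = simulate_ddm(drift, threshold, rng=rng)
        rts.append(rt)
        n_correct += correct
    return sum(rts) / n, n_correct / n

# Higher drift -> faster, more accurate; higher threshold -> slower but
# more accurate (the speed-accuracy trade-off).
fast = summarize(drift=2.0, threshold=1.0)
slow = summarize(drift=0.5, threshold=1.0)
cautious = summarize(drift=2.0, threshold=2.0)
```

In this toy setup, a stress-related reduction in drift rate would surface as both slower and less accurate responding, while an unchanged threshold means response caution is unaffected, mirroring the pattern of effects reported above.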

To contextualize our findings within the broader literature, we conducted a meta-analysis2. This analysis aimed to further investigate the effect of acute stress on WM by examining a wide range of stressors and WM tasks categorized according to their specific demands on WM. While most tasks are not designed to decompose WM into its component processes like the reference-back paradigm, they can still provide valuable insights due to their varying demands. For instance, some tasks only tax WM maintenance (e.g., digit forward), while others require overcoming competition from distracting stimuli (e.g., reading span) or updating of representations (e.g., n-back; see Oberauer, 2009). It should be noted that this categorization of tasks based on their demands does not allow us to isolate the effect of stress on a specific WM component, as not all tasks include performance contrasts like the reference-back task in Study 1. However, this categorization approach nevertheless allows us to examine whether the effects of stress differ depending on whether a task context only requires WM maintenance (e.g., a forward digit span task) or additionally requires WM updating (e.g., the n-back or reference-back task). Moreover, it allows us to assess whether stress effects are observed in reaction times, accuracy, or both. As such, the observation that acute stress impairs response accuracy but not speed in tasks requiring WM maintenance and updating would be consistent with our findings in Study 1. We focused exclusively on declarative WM tasks (as opposed to procedural tasks) to ensure a more targeted examination of acute stress effects on specific WM processes and to remain consistent with our focus on declarative WM in Study 1.

Methods

Study Selection and Inclusion Criteria

The following databases were used to search for peer-reviewed articles: Web of Science, PsycINFO, PubMed, and Scopus. Search terms for the working memory construct were: “working memory” OR “substitution” OR “updating” OR “directed forgetting” OR “short term memory” OR “short term retention” OR “short term maintenance” OR “working memory capacity” OR “working memory maintenance” OR “working memory gating” OR “working memory prioritization” OR “distractor filtering” OR “working memory representation”. Search terms for stress induction were: “cold-pressor” OR “Trier Social Stress Test” OR “mood induction” OR “acute stress” OR “stress induction” OR “stress manipulation” OR “stress was induced” OR “state anxiety” OR “emotional distress”. The first search was carried out between 13 and 21 October 2022, and an additional search was conducted on 3 November 2023. No restrictions on publication period were applied. References from retrieved articles and existing reviews were inspected for additional sources. Unpublished work was searched for in the ProQuest database. Figure 12 depicts the PRISMA flow diagram of the literature selection process (adapted from Page et al., 2021).

Studies meeting the following criteria were considered eligible for inclusion:

  • Mixed, within- or between-subject studies that manipulated stress and assessed its effects on declarative WM. In the case of a within-subject design, the control and the stress conditions were administered on different days;

  • A control condition that did not evoke stress was included;

  • The WM task included only neutral (i.e., non-emotional) stimuli;

  • Papers written in English;

  • Participants were healthy adults.

Figure 12.
Prisma Flow Diagram for the literature selection process, adapted from Page et al., 2021.

Coding of Variables

We categorized the WM tasks based on the framework outlined by Oberauer (2009), which distinguishes three primary demands on WM (see Table 2). The first, Maintenance, is assessed by tasks that require the short-term recall or recognition of information without any need to shield that information from distraction or to update or otherwise manipulate it, such as Digit Span Forward and Sternberg item recognition. The second category, Maintenance + Overcoming Competition, encompasses tasks that require both information storage and the ability to overcome interference from competing stimuli or distracting tasks, such as the Operation Span and Reading Span. Finally, the third category, Maintenance + Updating, includes tasks that additionally demand the updating of stored information, such as the N-Back and the PASAT. The reference-back paradigm we used in Study 1 falls into the Maintenance + Updating category in this classification, as it requires participants to maintain information during comparison trials and update it during reference trials.

Table 2.
Description and categorization of WM tasks included in the meta-analysis.
Working Memory Task | Description | Demand
Immediate Serial Recall Forward (Corsi Block-Tapping Forward; Digit Span Forward; Walking Corsi Test Forward; Visual Memory Span Forward) | Reproduce a sequentially presented list of items in the exact order of presentation. | Maintenance
Item Recognition (Sternberg) | Decide if a probe item was in a previously shown set. | Maintenance
Partial Report Task | Report a specific item from a briefly displayed array. | Maintenance
Change Detection Task (without distractors) | Indicate any changes between initial presentation and test phase in a specific item or entire array. | Maintenance
Change Detection Task (with distractors) | Indicate changes between presentations in the presence of distracting stimuli. | Maintenance + Overcoming Competition
Working Memory Task Landscapes and Faces | Remember specific stimuli (faces or scenes) while ignoring others, then identify if a probe matches a remembered item. | Maintenance + Overcoming Competition
Complex Span Tasks (Operation Span Task; Symmetry Span Task; Reading Span) | Reproduce a list of items presented between brief distractor tasks, usually in serial order. | Maintenance + Overcoming Competition
N-Back | Decide if each stimulus in a sequence matches the one presented n steps back. | Maintenance + Updating
Immediate Serial Recall Backward or Other (Corsi Block-Tapping Backward; Digit Span Backward; Walking Corsi Test Backward; Letter Number Sequencing) | Reproduce a sequentially presented list of items in reversed or altered order. | Maintenance + Updating
Memory Updating (Modified Delayed Match-to-Sample; Paced-Serial-Addition Task; Arithmetic Tasks) | Update initial items through transformation or replacement based on subsequent instructions. | Maintenance + Updating

Notes. Working memory demands are derived from Oberauer (2009), while task descriptions are adapted from Oberauer et al. (2018). Maintenance tasks involve only maintaining information, without any distraction. Maintenance + Overcoming Competition tasks require maintaining information while overcoming competing stimuli or performing distracting tasks. Finally, Maintenance + Updating tasks involve both maintaining and manipulating information.

Analytical Strategy

The measures of interest were standardized mean differences between the stress and control conditions. To ensure meaningful comparisons across studies with different designs (within-subjects, between-subjects, and mixed), we employed a consistent approach to effect size calculation (Morris & DeShon, 2002). For all designs, we used a standard deviation that was not based on difference scores, thus providing a comparable scale of measurement. Specifically, we calculated Hedges’ g (Hedges, 1981) using the metafor package (Viechtbauer, 2010). For between-subjects designs, we used the pooled standard deviation; for mixed designs (comparing pre-manipulation to post-manipulation scores), we used the pooled standard deviation of the pre-manipulation scores (Morris, 2008). For within-subjects designs, we used the control-condition standard deviations. Effect sizes were expected to be negative, indicating a detrimental effect of stress; outcomes were therefore coded differently for reaction time and accuracy (or similar, e.g., d′) data: for reaction time we calculated Control - Stress, while for accuracy measures we used Stress - Control. This approach allows a more direct and meaningful comparison of effect sizes across different study designs and outcome measures, enhancing the validity of our meta-analytic findings.
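The effect size computation above can be sketched as follows. The original analyses used the metafor package in R; this Python sketch (with made-up numbers) only illustrates the logic of Hedges' small-sample correction and the design-specific choice of standardizer.

```python
import math

def hedges_g(m_stress, m_control, sd, n1, n2=None):
    """Standardized mean difference with Hedges' small-sample correction.

    `sd` is a raw-score standardizer (pooled SD for between-subjects
    designs, control-condition or pre-manipulation SD otherwise), so that
    all designs are expressed on a comparable scale (Morris & DeShon, 2002).
    Pass only n1 for a within-subjects design (df = n1 - 1).
    """
    d = (m_stress - m_control) / sd
    df = (n1 + n2 - 2) if n2 is not None else (n1 - 1)
    j = 1 - 3 / (4 * df - 1)       # Hedges (1981) correction factor
    return j * d

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation for a between-subjects comparison."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

# Accuracy outcome, coded Stress - Control so negative = stress impairment:
g_acc = hedges_g(m_stress=0.85, m_control=0.90,
                 sd=pooled_sd(0.08, 30, 0.10, 30), n1=30, n2=30)
```

For a reaction-time outcome the sign coding would flip (Control - Stress), so that a detrimental stress effect is again negative.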

To correctly calculate standardized mean differences for within-subjects and mixed designs, the dependency between measures obtained from the same participants must be taken into account. Because this information was generally unavailable, we conducted a sensitivity analysis, calculating effect sizes under three different correlation values (0.3, 0.5, and 0.7). Whenever possible, effects were calculated using the means, standard deviations, and sample sizes reported in each article. When these statistics were not directly provided, we contacted the corresponding authors to request the data. If the authors did not respond but data visualizations were available, we extracted the necessary information using the WebPlotDigitizer software (Rohatgi, 2017). Studies for which the required statistics could not be obtained were excluded from the analysis.
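The role of the assumed correlation can be illustrated with the approximate large-sample sampling variance of a within-subjects effect size in the raw-score metric (a sketch following the logic of Morris & DeShon, 2002; the values and the simplified formula are illustrative): the higher the assumed correlation between the repeated measures, the more precise the effect estimate.

```python
def var_g_within(g, n, r):
    """Approximate sampling variance of a within-subjects standardized
    mean difference in the raw-score metric (large-sample approximation).

    `r` is the assumed correlation between the two repeated measures;
    because it is usually unreported, a sensitivity analysis varies it.
    """
    return 2 * (1 - r) / n + g**2 / (2 * n)

# Same effect size under the three assumed correlations of the
# sensitivity analysis: precision increases with r.
variances = {r: var_g_within(g=-0.3, n=40, r=r) for r in (0.3, 0.5, 0.7)}
```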

The dependency between measures should also be taken into account when studies report multiple outcomes, as these are measured on the same sample. In our case, some studies reported the effect of stress on reaction times and accuracy, on different WM components, or for different types of stressors. To account for this dependency and for between-study variance (τ²), we used a Bayesian multivariate multilevel model (Borenstein et al., 2009; Williams et al., 2018). As for the effect size calculation, we conducted a sensitivity analysis during the variance-covariance matrix calculation to address the uncertainty in the correlation between outcomes. Specifically, we considered three scenarios: a uniform correlation of 0.5 between all outcomes; a correlation of 0.7 between outcomes of the same type (e.g., accuracy and d′) and 0.5 between outcomes of different types (e.g., accuracy vs. reaction time); and a correlation of 0.5 between same-type outcomes and 0.3 between different-type outcomes. This approach gave us a total of 9 possible datasets, to which we applied two models. The first model examined the mean effect for each WM demand. The second model expanded upon this by incorporating information about the type of outcome under investigation (e.g., accuracy and reaction time). Both models were defined with cell-means parametrization (i.e., omitting the intercept), allowing us to obtain a posterior distribution for each combination of the categorical variables.
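One block of the resulting block-diagonal variance-covariance matrix (the block for a single study) can be sketched as follows. This is illustrative Python rather than the R code used in the actual analyses, and the variances shown are made up; the type-dependent correlations mirror the 0.7-vs-0.5 scenario described above.

```python
import math

def study_vcov(variances, outcome_types, rho_same=0.7, rho_diff=0.5):
    """Variance-covariance block for the effects reported by one study.

    Effects of the same outcome type (e.g., two accuracy measures) get
    `rho_same`; effects of different types (e.g., accuracy vs. RT) get
    `rho_diff`. Diagonal entries are the sampling variances.
    """
    sds = [math.sqrt(v) for v in variances]
    k = len(sds)
    V = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                V[i][j] = variances[i]
            else:
                rho = rho_same if outcome_types[i] == outcome_types[j] else rho_diff
                V[i][j] = rho * sds[i] * sds[j]
    return V

# A hypothetical study reporting two accuracy effects and one RT effect:
V = study_vcov([0.04, 0.05, 0.06], ["acc", "acc", "rt"])
```

Stacking one such block per study along the diagonal yields the full matrix passed to the multivariate model; effects from different studies are assumed independent (zero off-block covariances).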

For both models, we used weakly informative priors:

μ ~ Normal(0, 1)

τ ~ Half-Normal(0, 1)

and we ran 4 chains of 8000 iterations each, with the first 4000 discarded as warm-up.

We first checked model convergence by inspecting trace plots and the R-hat statistic. We assessed the presence of an acute-stress effect using the Maximum Probability of Effect (MPE) and the 89% Highest Posterior Density Interval + Region of Practical Equivalence (89% HPDI + ROPE) decision rule. The ROPE was set from -0.1 to 0.1. The mean estimate, 89% HPDI, MPE, and the percentage of the 89% HPDI inside the ROPE are reported. See the Data Analysis section of Study 1 for definitions of the MPE and the 89% HPDI + ROPE rule.
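These decision metrics can all be computed directly from posterior samples. A minimal sketch follows (the simulated posterior is illustrative; here the %ROPE is computed as the proportion of the HPDI's width that overlaps the ROPE, one common convention):

```python
import random

def hpdi(samples, prob=0.89):
    """Narrowest interval containing `prob` of the posterior samples."""
    s = sorted(samples)
    n = len(s)
    w = max(1, int(round(prob * n)))
    lo = min(range(n - w + 1), key=lambda i: s[i + w - 1] - s[i])
    return s[lo], s[lo + w - 1]

def mpe(samples):
    """Maximum probability of effect: posterior mass on the dominant side of zero."""
    p_pos = sum(x > 0 for x in samples) / len(samples)
    return max(p_pos, 1 - p_pos)

def pct_hpdi_in_rope(samples, rope=(-0.1, 0.1), prob=0.89):
    """Percentage of the HPDI's width that falls inside the ROPE."""
    lo, hi = hpdi(samples, prob)
    overlap = max(0.0, min(hi, rope[1]) - max(lo, rope[0]))
    return 100.0 * overlap / (hi - lo)

# Illustrative posterior: a clear effect well outside the ROPE.
rng = random.Random(0)
posterior = [rng.gauss(0.3, 0.1) for _ in range(2000)]
lo, hi = hpdi(posterior)
```

Under this rule, an effect is supported when the MPE is high and the HPDI falls outside the ROPE, while a high %ROPE supports the null; intermediate cases (as for several contrasts above) are treated as inconclusive.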

Results

Study Characteristics

The final sample included 42 studies, for a total of 2008 participants and 210 effects; specifically, we collected 44, 19, and 147 effects for the Maintenance, Maintenance + Overcoming Competition, and Maintenance + Updating demands, respectively. For descriptive purposes, Figure 13 and Figure 14 present aggregated effects at the study level grouped by outcome type (reaction time vs. accuracy) and WM demand. These figures are based on the dataset in which a correlation of 0.5 was used to compute effect sizes from within- and mixed-design studies.

Figure 13.
Reaction time based outcome: effect of stress across WM demands.

Effect sizes are aggregated at the study level, with error bars indicating 95% confidence intervals. These values were computed from the dataset in which a correlation of 0.5 was applied to calculate effect sizes for within- and mixed-design studies. Effect sizes were expected to be negative, indicating the detrimental effect of stress.

Figure 14.
Accuracy based outcome: effect of stress across WM demands.

Effect sizes are aggregated at the study level, with error bars indicating 95% confidence intervals. These values were computed from the dataset in which a correlation of 0.5 was applied to calculate effect sizes for within- and mixed-design studies. Effect sizes were expected to be negative, indicating the detrimental effect of stress.


Effects on Tasks with Varying Working Memory Demands

While R-hat statistics were below 1.1, suggesting good convergence, divergent transitions were observed (min = 0, max = 99). As explained in the Stan Reference Manual, “A divergence arises when the simulated Hamiltonian trajectory departs from the true trajectory as measured by departure of the Hamiltonian value from its initial value. When this divergence is too high, the simulation has gone off the rails and cannot be trusted”. Among the nine datasets generated from our sensitivity analysis, the models (n = 3) assuming a uniform 0.5 correlation between outcomes in the variance-covariance matrix calculation exhibited the most stable behavior. These models showed fewer divergent transitions when varying the correlations used for effect size calculations (1 divergent transition for 0.3 correlation, 1 for 0.5, and 1 for 0.7). To provide a comprehensive view and account for the uncertainty in the effect size calculations, we present results from all three models (Figure 15, central columns).

Figure 15.
Full posterior distribution of the estimated stress effect on various WM components.

The color scheme represents different correlation values used for effect size calculation in within-subject or mixed designs: green (with03), blue (with05), and red (with07). Labels 0.5_0.5, 0.5_0.3, and 0.7_0.5 denote the correlations used in the variance-covariance matrix computation. The purple area is the ROPE (-0.1, 0.1). The more the posterior distribution falls away from zero the higher the maximum probability of effect (MPE). The more the posterior distribution falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.


Regardless of the chosen correlation values and the demand on WM, evidence for an effect of acute stress on WM remained inconclusive based on the 89% HPDI + ROPE rule (see Table 3). However, the MPE suggested a plausible negative effect for Maintenance + Updating tasks (see Table 3). While the probability of this effect’s existence was 99%, with an estimated magnitude between -0.12 and -0.10 (at or beyond the -0.1 ROPE boundary), the HPDI lacked sufficient precision to fall entirely outside the ROPE.

Table 3.
Parameter estimates for the three models.
Parameter | Estimate | Lower | Upper | MPE | %ROPE
Maint_0.3 | -0.06 | -0.18 | 0.07 | 0.79 | 0.72
Maint_0.5 | -0.06 | -0.18 | 0.07 | 0.79 | 0.71
Maint_0.7 | -0.06 | -0.19 | 0.07 | 0.80 | 0.70
MaintComp_0.3 | -0.18 | -0.43 | 0.06 | 0.89 | 0.26
MaintComp_0.5 | -0.17 | -0.41 | 0.08 | 0.88 | 0.28
MaintComp_0.7 | -0.16 | -0.40 | 0.08 | 0.87 | 0.31
MaintUp_0.3 | -0.12 | -0.20 | -0.04 | 0.99 | 0.33
MaintUp_0.5 | -0.11 | -0.19 | -0.03 | 0.99 | 0.43
MaintUp_0.7 | -0.10 | -0.19 | -0.02 | 0.98 | 0.50

Notes. Estimated stress effect across the different WM demands. Estimates’ names refer to the WM demand under investigation: Maint = Maintenance, MaintComp = Maintenance + Overcoming Competition, MaintUp = Maintenance + Updating, and the correlations implemented when computing effect sizes. “Lower” and “Upper” refer to the limits of 89% Highest Posterior Density Interval, %ROPE refers to the % in ROPE.

Moderate heterogeneity was observed for Maintenance + Overcoming Competition (τ = 0.42, 0.41, 0.41; 89% HPDI = [0.20, 0.65], [0.18, 0.63], [0.19, 0.64]), indicating substantial variability in the estimated effect of stress across studies using such paradigms. Lower heterogeneity was found for Maintenance (τ = 0.18, 0.2, 0.22; 89% HPDI = [0.00, 0.3], [0.02, 0.33], [0.08, 0.37]) and Maintenance + Updating (τ = 0.12, 0.14, 0.18; 89% HPDI = [0.00, 0.22], [0.00, 0.24], [0.08, 0.29]), suggesting less variability in the stress effect for these demands.

In sum, the acute-stress manipulation had a small detrimental effect on performance in paradigms taxing Maintenance + Updating.

Effect of Outcome Type

As with the previous model, R-hat statistics were below 1.1, suggesting good convergence, but divergent transitions were observed (min = 0, max = 45). Among the nine datasets generated from our sensitivity analysis, the models (n = 3) assuming a 0.7 vs. 0.5 correlation between outcomes in the variance-covariance matrix calculation exhibited the most stable behavior. These models showed fewer divergent transitions when varying the correlations used for effect size calculations (1 divergent transition for the 0.3 correlation, 5 for 0.5, and 7 for 0.7). To provide a comprehensive view and account for the uncertainty in the effect size calculations, we present results from all three models (Figure 16 and Figure 17, last columns).

Figure 16.
Full posterior distribution of the estimated stress effect on RT grouped by WM demand.

The color scheme represents different correlation values used for effect size calculation in within-subject or mixed designs: red (with03), green (with05), and blue (with07). Labels 0.5_0.5, 0.5_0.3, and 0.7_0.5 denote the correlations used in the variance-covariance matrix computation. The purple area is the ROPE (-0.1, 0.1). The more the posterior distribution falls away from zero the higher the maximum probability of effect (MPE). The more the posterior distribution falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.

Figure 17.
Full posterior distribution of the estimated stress effect on ACC grouped by WM demand.

The color scheme represents different correlation values used for effect size calculation in within-subject or mixed designs: red (with03), green (with05), and blue (with07). Labels 0.5_0.5, 0.5_0.3, and 0.7_0.5 denote the correlations used in the variance-covariance matrix computation. The purple area is the ROPE (-0.1, 0.1). The further the posterior distribution falls from zero, the higher the maximum probability of effect (MPE); the more it falls inside the purple rectangle (ROPE), the stronger the evidence in favor of the null hypothesis.


Outcome RT. Regardless of the chosen correlation values, evidence for an effect of acute stress on WM remained inconclusive based on the 89% HPDI + ROPE rule (see Table 4). Additionally, the MPE did not suggest a plausible negative effect (max value = 0.73, see Table 4).
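The decision indices used throughout (89% HPDI, %ROPE with bounds ±0.1, and MPE) can all be computed directly from posterior draws. Below is a minimal Python sketch; the simulated posterior and function names are illustrative only (our analyses used brms and bayestestR):

```python
import numpy as np

def hpdi(samples, prob=0.89):
    """Narrowest interval containing `prob` of the posterior draws."""
    x = np.sort(samples)
    n = len(x)
    k = int(np.ceil(prob * n))
    widths = x[k - 1:] - x[: n - k + 1]   # width of every candidate interval
    i = int(np.argmin(widths))
    return float(x[i]), float(x[i + k - 1])

def mpe(samples):
    """Maximum probability of effect: posterior mass on the dominant sign."""
    p_neg = float(np.mean(samples < 0))
    return max(p_neg, 1.0 - p_neg)

def pct_in_rope(samples, low=-0.1, high=0.1):
    """Share of the posterior falling inside the ROPE."""
    return float(np.mean((samples >= low) & (samples <= high)))

rng = np.random.default_rng(1)
draws = rng.normal(-0.05, 0.06, 20_000)   # hypothetical posterior for one effect
lo, hi = hpdi(draws)
```

An effect is deemed inconclusive under the HPDI + ROPE rule whenever the interval (lo, hi) overlaps the ROPE only partially, as is the case for this hypothetical posterior.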

Table 4.
Estimated effects of acute stress on reaction time for three models.
Parameter Estimate Lower Upper MPE %ROPE 
RT_Maint_0.3 0.05 -0.43 0.57 0.57 0.32 
RT_Maint_0.5 0.06 -0.42 0.56 0.59 0.33 
RT_Maint_0.7 0.05 -0.40 0.53 0.60 0.33 
RT_MaintComp_0.3 -0.12 -0.59 0.32 0.69 0.29 
RT_MaintComp_0.5 -0.11 -0.57 0.32 0.69 0.30 
RT_MaintComp_0.7 -0.10 -0.53 0.33 0.68 0.33 
RT_MaintUp_0.3 -0.05 -0.17 0.07 0.73 0.79 
RT_MaintUp_0.5 -0.04 -0.15 0.07 0.69 0.84 
RT_MaintUp_0.7 -0.03 -0.13 0.08 0.67 0.90 

Notes. Estimated stress effect across the different WM demands (reaction times data). Estimates’ names refer to the WM demand under investigation (Maint = Maintenance, MaintComp = Maintenance + Overcoming Competition, MaintUp = Maintenance + Updating) and to the correlation implemented when computing effect sizes. “Lower” and “Upper” refer to the limits of the 89% Highest Posterior Density Interval; %ROPE refers to the percentage of the posterior inside the ROPE.
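For within-subject or mixed designs, the assumed pre-post correlation enters the effect-size computation through the standard deviation of the difference scores (in the spirit of Morris & DeShon, 2002). A minimal sketch with made-up summary statistics illustrates why the assumed correlation matters:

```python
import math

def smd_within(m_pre, m_post, sd_pre, sd_post, r):
    """Standardized mean change for a within-subject contrast.
    The assumed pre-post correlation r enters via the SD of the
    difference scores: sd_diff = sqrt(sd1^2 + sd2^2 - 2*r*sd1*sd2)."""
    sd_diff = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)
    return (m_post - m_pre) / sd_diff

# Same (made-up) summary statistics under the three assumed correlations
# used in the sensitivity analysis:
effects = {r: smd_within(0.80, 0.74, 0.12, 0.13, r) for r in (0.3, 0.5, 0.7)}
```

Because a higher assumed r shrinks sd_diff, the identical mean change yields a larger standardized effect, which is why results are reported across r = 0.3, 0.5, and 0.7.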

The effect of stress on RT varied across studies: the highest values were observed for Maintenance (τ = 0.56, 0.55, 0.53; 89% HPDI = [0.00, 0.99], [0.00, 1.00], [0.00, 0.97]) and Maintenance + Competition (τ = 0.39, 0.39, 0.38; 89% HPDI = [0.00, 0.82], [0.00, 0.81], [0.00, 0.81]). Lower values were observed for Maintenance + Updating (τ = 0.10, 0.10, 0.09; 89% HPDI = [0.00, 0.21], [0.00, 0.20], [0.00, 0.19]).

Outcome ACC. Despite the varying correlation values, evidence for an acute stress effect on WM remained inconclusive (89% HPDI + ROPE test; see Table 5). However, the MPE suggested a plausible negative effect on Maintenance + Updating tasks (see Table 5). While the probability of this effect was approximately 99%, with estimated magnitudes between -0.15 and -0.13 (exceeding the -0.1 threshold), the HPDI lacked sufficient precision to fall entirely outside the ROPE.

Table 5.
Estimated effects of acute stress on accuracy data (and related measures) for three models.
Parameter Estimate Lower Upper MPE %ROPE 
ACC_Maint_0.3 -0.06 -0.20 0.08 0.76 0.69 
ACC_Maint_0.5 -0.06 -0.21 0.07 0.76 0.68 
ACC_Maint_0.7 -0.06 -0.19 0.08 0.75 0.71 
ACC_MaintComp_0.3 -0.14 -0.43 0.14 0.80 0.34 
ACC_MaintComp_0.5 -0.14 -0.42 0.13 0.80 0.35 
ACC_MaintComp_0.7 -0.13 -0.40 0.14 0.79 0.36 
ACC_MaintUp_0.3 -0.15 -0.24 -0.06 1.00 0.18 
ACC_MaintUp_0.5 -0.14 -0.23 -0.05 0.99 0.25 
ACC_MaintUp_0.7 -0.13 -0.22 -0.03 0.99 0.32 

Notes. Estimated stress effect across the different WM demands (accuracy and related measures). Estimates’ names refer to the WM demand under investigation (Maint = Maintenance, MaintComp = Maintenance + Overcoming Competition, MaintUp = Maintenance + Updating) and to the correlation implemented when computing effect sizes. “Lower” and “Upper” refer to the limits of the 89% Highest Posterior Density Interval; %ROPE refers to the percentage of the posterior inside the ROPE.

The effect of stress on ACC varied substantially across studies taxing Maintenance + Competition (τ = 0.49, 0.49, 0.48; 89% HPDI = [0.24, 0.74], [0.23, 0.73], [0.23, 0.73]). In contrast, lower heterogeneity was estimated for Maintenance (τ = 0.22, 0.23, 0.25; 89% HPDI = [0.06, 0.38], [0.08, 0.38], [0.11, 0.38]) and Maintenance + Updating (τ = 0.13, 0.16, 0.19; 89% HPDI = [0.00, 0.23], [0.02, 0.26], [0.09, 0.30]).
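For intuition on what τ captures, the sketch below computes a between-study SD with the frequentist DerSimonian-Laird estimator on made-up effect sizes. This is an illustrative analogue only, not the Bayesian multilevel estimation (via brms) actually used here:

```python
import numpy as np

def tau_dl(effects, variances):
    """DerSimonian-Laird estimate of the between-study SD (tau):
    tau^2 = max(0, (Q - df) / C), with Q Cochran's weighted
    heterogeneity statistic and C a scaling constant of the weights."""
    y = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    mu = np.sum(w * y) / np.sum(w)        # fixed-effect pooled mean
    q = np.sum(w * (y - mu) ** 2)         # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    return np.sqrt(tau2)

# Homogeneous studies -> tau = 0; widely spread effects -> large tau
tau_homog = tau_dl([0.10, 0.10, 0.10], [0.02, 0.02, 0.02])
tau_heterog = tau_dl([-0.5, 0.0, 0.5, 1.0], [0.01, 0.01, 0.01, 0.01])
```

A τ near zero, as estimated for Maintenance + Updating, means the study-level effects scatter little beyond their sampling error.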

In summary, acute stress may have a small negative impact specifically on accuracy-related measures in tasks involving Maintenance + Updating demands. These findings extend the results from our Study 1, where we observed that acute stress impacted performance accuracy during maintenance trials in a reference-back task, which also incorporates updating demands. In the general discussion, we further discuss these outcomes, alongside their limitations.

We aimed to elucidate the impact of acute stress on distinct component processes of working memory (WM) (Frank et al., 2001; Hazy et al., 2007; O’Reilly & Frank, 2006), namely maintenance, updating, gate opening, and gate closing. To achieve this, in our first, experimental study we employed a mixed-design approach, with participants randomly assigned to either a stress-induction or a control condition. We leveraged the reference-back paradigm to isolate component processes of WM (Rac-Lubashevsky & Kessler, 2016a, 2016b), assessing performance changes from before to during the stress manipulation. Our manipulation was effective in reducing self-reported mood in terms of valence, but no substantial impact of acute stress on WM gating was observed. Nevertheless, our investigation yielded preliminary evidence of a negative effect on WM maintenance, at least when the task demands alternating between maintaining and updating information.

Specifically, the stress group exhibited lower response accuracy than the control group on trials requiring maintenance. Drift diffusion modeling attributed this reduced accuracy to impaired information processing rather than a change in response strategy (i.e., a speed-accuracy trade-off). Notably, the reduced maintenance performance manifested as a lack of improvement in the stress group from the baseline to the stress phase, whereas the control group showed the more typical between-phase improvement that likely reflects a practice effect. As such, our findings indicate that the stress manipulation specifically interfered with test-retest improvement of performance, especially in WM maintenance.

We acknowledge that the observed effects could be related not only to the effect of acute stress per se but also to the specific stressor implemented in our study. The PASAT, which requires participants to continuously add pairs of digits and update them (Lejuez et al., 2003), might have biased WM to favor updating over maintenance processes. As a result, participants could have been primed to erroneously favor this updating strategy, at the cost of maintenance, even when engaged in the reference-back task (Bhandari & Badre, 2018). However, if this were the case, we would have expected to observe an increase in the drift rate (i.e., higher drift rate in the Stress group compared to the Control group) during updating trials, as well as between-group differences in decision thresholds. Contrary to this expectation, the drift rate was consistently lower for the stress group, especially in maintenance trials, and the decision thresholds were comparable across groups. This suggests that rather than the PASAT promoting updating over maintenance, it more generally strained WM performance.
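The logic of this interpretation — that a lower drift rate reduces accuracy even when boundary separation (response caution) is unchanged — can be illustrated with a toy simulation. This is a minimal sketch with hypothetical drift and boundary values, not the drift diffusion model fitted in our analyses:

```python
import numpy as np

def ddm_accuracy(v, a, n=10_000, dt=0.001, sigma=1.0, max_steps=5_000, seed=0):
    """Euler simulation of an unbiased drift-diffusion process between
    boundaries 0 and a, starting at a/2. Returns the proportion of
    trials absorbed at the upper (correct) boundary."""
    rng = np.random.default_rng(seed)
    x = np.full(n, a / 2.0)
    done = np.zeros(n, dtype=bool)
    upper = np.zeros(n, dtype=bool)
    for _ in range(max_steps):
        active = ~done
        if not active.any():
            break
        # diffusion step only for trials still between the boundaries
        x[active] += v * dt + sigma * np.sqrt(dt) * rng.standard_normal(active.sum())
        upper |= active & (x >= a)
        done |= active & ((x >= a) | (x <= 0.0))
    return upper.mean()

# Equal boundary separation (caution), lower drift in the "stress" condition
acc_control = ddm_accuracy(v=1.2, a=1.5)
acc_stress = ddm_accuracy(v=0.8, a=1.5)
```

With identical boundary separation, the lower-drift condition reaches the correct boundary less often, mirroring the observed pattern of reduced accuracy without any change in response caution.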

To gain broader insights into the relationship between acute stress and WM, we conducted a meta-analysis examining a range of WM tasks categorized by their specific demands (i.e., maintenance, maintenance + competition, maintenance + updating). This meta-analytic approach allowed us to overcome the constraints of our single-study design and provide a more comprehensive view of acute stress effects on WM, examining the effects of stress as a function of outcome type (response time or accuracy), and task demands.

Our meta-analysis, which included both published and unpublished literature and examined a broad spectrum of stressors (from movie-induced to psychosocial), complemented the findings of Study 1. It revealed a negative impact of stress on WM performance, particularly when tasks involved both maintenance and updating demands, and especially for accuracy as the outcome measure. This aligns with Study 1, where we observed impaired accuracy in the reference-back paradigm, which involves both maintenance and updating demands. Arguably, the paradigm also involves overcoming competition, especially on comparison trials where participants must ensure that the presented stimulus is responded to but not encoded as the new reference. A subtle distinction between this kind of competition demand and the other tasks classified as Maintenance + Overcoming Competition is that in the latter type of tasks the interference typically arises from a task-unrelated source (e.g., distractor stimuli that must never be responded to). The fact that our meta-analysis did not reveal impaired performance on such tasks might indicate that stress does not simply render WM maintenance more vulnerable to external distraction.

Our meta-analysis aligns with and extends previous meta-analytic work of Shields and colleagues (2016), who reported a numerically similar effect size. Additionally, by investigating the impact of acute stress on distinct outcome types, our results also echoed those of Shields et al. (2016) in showing that stress predominantly impairs accuracy-related measures, with no reliable effects on reaction times.

In our study, we focused exclusively on declarative WM tasks to target acute stress effects on specific processes. Building on this, a future line of research could investigate whether acute stress has a more pronounced impact on gating in procedural WM compared to declarative WM. Interestingly, a version of the reference-back paradigm has been developed to investigate WM gating in procedural working memory (Kessler, 2017). While declarative and procedural WM gating operations share some overlapping neural activity patterns (Nir-Cohen et al., 2023), they also exhibit notable differences. Given the mixed findings on the effects of acute stress on task-switching and cognitive inhibition, it is difficult to predict with certainty whether procedural vs. declarative WM gating would be more affected. To elaborate, task-switching involves rapidly shifting between different cognitive tasks or rules. This process engages both procedural WM (how to perform tasks) and declarative WM (what information to use). Inhibition plays a crucial role in both procedural WM (inhibiting inappropriate action plans) and declarative WM (suppressing irrelevant information; Rac-Lubashevsky & Kessler, 2018). Given these complex interactions, we might expect that stress, in the procedural version of the reference-back paradigm, has a greater impact on gate-closing trials involving conflict, where additional effort is required to ignore conflicting task cues (Nir-Cohen et al., 2023).

In conclusion, while our study did not reveal the anticipated acute stress effects on WM gating processes, this line of research remains valuable and warrants further investigation. To establish more definitive conclusions, future research should consider: employing alternative stress induction methods (e.g., Trier Social Stress Test, Socially Evaluated Cold Pressor Test) to explore potential stressor-specific effects; utilizing a broader range of WM tasks to assess the generalizability of our findings across different cognitive demands; incorporating objective measures of stress reactivity (e.g., cortisol levels, heart rate variability) to better quantify physiological stress responses; and implementing more direct measures of subjective stress experiences to capture participants’ psychological responses more accurately. These methodological refinements could provide deeper insights into the complex relationship between acute stress and WM processes, potentially uncovering nuanced effects that our current approach may have missed.

To our knowledge, our study stands out as the first to analyze acute stress effects on WM component processes by integrating both behavioral and model-based perspectives. While no substantial impact of acute stress on WM gating was observed, our investigation yielded preliminary evidence suggesting a negative effect on WM maintenance when tasks require a dynamic balance between maintenance and updating of WM representations. Continued exploration in this field holds promise for enhancing our understanding of how acute stress influences specific WM components.

We are grateful to Sofia De Faveri for helping with data collection. A special thanks to Lisa Toffoli for her insightful comments.

Contributed to conception and design: MC, NC, RS

Contributed to acquisition of data: MC

Contributed to analysis and interpretation of data: MC, BJJ, FG, MP, NC, RS

Drafted and/or revised the article: MC, BJJ, FG, MP, NC, RS

Approved the submitted version for publication: MC, BJJ, FG, MP, NC, RS

All authors declare that no competing interests exist.

1.

The reference-back paradigm also allows for the calculation of substitution cost, which specifically measures updating when the new reference letter differs from the previous one. In contrast, updating cost accounts for costs resulting from updating both different and same reference letters. Our study focused on stress effects on working memory gating processes, without specific hypotheses regarding the effects on updating same vs. different reference letters. Therefore, we did not distinguish between them, omitting the matching factor needed for substitution cost calculation. However, at the reviewers’ request, we conducted additional analyses including this factor to assess stress effects on substitution cost (in terms of reaction times, accuracy and drift-diffusion modeling). These analyses found no evidence of stress effects on substitution cost and are available in the supplementary material.

2.

This meta-analysis was pre-registered on PROSPERO (CRD42022366946, date of first registration: 24 October 2022). Here, we focused on our primary research question concerning working memory components and, therefore, we did not conduct all pre-registered analyses.

Arnsten, A. F. T. (2009). Stress signalling pathways that impair prefrontal cortex structure and function. Nature Reviews Neuroscience, 10(6), 410–422. https:/​/​doi.org/​10.1038/​nrn2648
Arnsten, A. F. T. (2015). Stress weakens prefrontal networks: Molecular insults to higher cognition. Nature Neuroscience, 18(10), 1376–1385. https:/​/​doi.org/​10.1038/​nn.4087
Bahari, Z., Meftahi, G. H., & Meftahi, M. A. (2018). Dopamine effects on stress-induced working memory deficits. Behavioural Pharmacology, 29(7), 584–591. https:/​/​doi.org/​10.1097/​FBP.0000000000000429
Bakvis, P., Spinhoven, P., Putman, P., Zitman, F. G., & Roelofs, K. (2010). The effect of stress induction on working memory in patients with psychogenic nonepileptic seizures. Epilepsy & Behavior, 19(3), 448–454. https:/​/​doi.org/​10.1016/​j.yebeh.2010.08.026
Bhandari, A., & Badre, D. (2018). Learning and transfer of working memory gating policies. Cognition, 172, 89–100. https:/​/​doi.org/​10.1016/​j.cognition.2017.12.001
Boag, R. J., Stevenson, N., van Dooren, R., Trutti, A. C., Sjoerds, Z., & Forstmann, B. U. (2021). Cognitive Control of Working Memory: A Model-Based Approach. Brain Sciences, 11(6), 721. https:/​/​doi.org/​10.3390/​brainsci11060721
Bogdanov, M., & Schwabe, L. (2016). Transcranial Stimulation of the Dorsolateral Prefrontal Cortex Prevents Stress-Induced Working Memory Deficits. Journal of Neuroscience, 36(4), 1429–1437. https:/​/​doi.org/​10.1523/​JNEUROSCI.3687-15.2016
Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2009). Introduction to Meta-Analysis. John Wiley & Sons. https:/​/​doi.org/​10.1002/​9780470743386
Broadway, J. M., Frank, M. J., & Cavanagh, J. F. (2018). Dopamine D2 agonist affects visuospatial working memory distractor interference depending on individual differences in baseline working memory span. Cognitive, Affective, & Behavioral Neuroscience, 18(3), 509–520. https:/​/​doi.org/​10.3758/​s13415-018-0584-6
Bürkner, P. (2017). brms: An R Package for Bayesian Multilevel Models Using Stan. Journal of Statistical Software, 80(1), 1–28. https:/​/​doi.org/​10.18637/​jss.v080.i01
Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https:/​/​doi.org/​10.1016/​j.tics.2014.04.012
Chatham, C. H., Frank, M. J., & Badre, D. (2014). Corticostriatal Output Gating during Selection from Working Memory. Neuron, 81(4), 930–942. https:/​/​doi.org/​10.1016/​j.neuron.2014.01.002
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Psychology Press.
Cohen, M. X., & Donner, T. H. (2013). Midfrontal conflict-related theta-band power reflects neural oscillations that predict behavior. Journal of Neurophysiology, 110(12), 2752–2763. https:/​/​doi.org/​10.1152/​jn.00479.2013
Cools, R., Clark, L., & Robbins, T. W. (2004). Differential Responses in Human Striatum and Prefrontal Cortex to Changes in Object and Rule Relevance. The Journal of Neuroscience, 24(5), 1129–1135. https:/​/​doi.org/​10.1523/​JNEUROSCI.4312-03.2004
Cools, R., & D’Esposito, M. (2011). Inverted-U, Shaped Dopamine Actions on Human Working Memory and Cognitive Control. Biological Psychiatry, 69(12), e113–e125. https:/​/​doi.org/​10.1016/​j.biopsych.2011.03.028
Cools, R., & Robbins, T. W. (2004). Chemistry of the adaptive mind. Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 362(1825), 2871–2888. https:/​/​doi.org/​10.1098/​rsta.2004.1468
Cowan, N. (2017). The many faces of working memory and short-term storage. Psychonomic Bulletin & Review, 24(4), 1158–1170. https:/​/​doi.org/​10.3758/​s13423-016-1191-6
Dayan, P., & Yu, A. J. (2006). Phasic norepinephrine: A neural interrupt signal for unexpected events. Network: Computation in Neural Systems, 17(4), 335–350.
D’Esposito, M., & Postle, B. R. (2015). The Cognitive Neuroscience of Working Memory. Annual Review of Psychology, 66(1), 115–142. https:/​/​doi.org/​10.1146/​annurev-psych-010814-015031
Dixon, P. (2008). Models of accuracy in repeated-measures designs. Journal of Memory and Language, 59(4), 447–456. https:/​/​doi.org/​10.1016/​j.jml.2007.11.004
Durstewitz, D., & Seamans, J. K. (2008). The Dual-State Theory of Prefrontal Cortex Dopamine Function with Relevance to Catechol-O-Methyltransferase Genotypes and Schizophrenia. Biological Psychiatry, 64(9), 739–749. https:/​/​doi.org/​10.1016/​j.biopsych.2008.05.015
Dutilh, G., Krypotos, A.-M., & Wagenmakers, E.-J. (2011). Task-related versus stimulus-specific practice. Experimental Psychology. https:/​/​doi.org/​10.1027/​1618-3169/​a000111
Dutilh, G., Vandekerckhove, J., Tuerlinckx, F., & Wagenmakers, E.-J. (2009). A diffusion model decomposition of the practice effect. Psychonomic Bulletin & Review, 16, 1026–1036.
Ehrhardt, N. M., Fietz, J., Kopf-Beck, J., Kappelmann, N., & Brem, A. (2022). Separating EEG correlates of stress: Cognitive effort, time pressure, and social-evaluative threat. European Journal of Neuroscience, 55(9,10), 2464–2473. https:/​/​doi.org/​10.1111/​ejn.15211
Elzinga, B. M., & Roelofs, K. (2005). Cortisol-Induced Impairments of Working Memory Require Acute Sympathetic Activation. Behavioral Neuroscience, 119(1), 98–103. https:/​/​doi.org/​10.1037/​0735-7044.119.1.98
Frank, M. J., Loughry, B., & O’Reilly, R. C. (2001). Interactions between frontal cortex and basal ganglia in working memory: A computational model. Cognitive, Affective, & Behavioral Neuroscience, 1(2), 137–160. https:/​/​doi.org/​10.3758/​CABN.1.2.137
Fudenberg, D., Newey, W., Strack, P., & Strzalecki, T. (2020). Testing the drift-diffusion model. Proceedings of the National Academy of Sciences, 117(52), 33141–33148. https:/​/​doi.org/​10.1073/​pnas.2011446117
Furman, D. J., Zhang, Z., Chatham, C. H., Good, M., Badre, D., Hsu, M., & Kayser, A. S. (2021). Augmenting Frontal Dopamine Tone Enhances Maintenance over Gating Processes in Working Memory. Journal of Cognitive Neuroscience, 33(9), 1753–1765. https:/​/​doi.org/​10.1162/​jocn_a_01641
Gabrys, R. L., Howell, J. W., Cebulski, S. F., Anisman, H., & Matheson, K. (2019). Acute stressor effects on cognitive flexibility: Mediating role of stressor appraisals and cortisol. Stress, 22(2), 182–189. https:/​/​doi.org/​10.1080/​10253890.2018.1494152
Gamerman, D., & Lopes, H. F. (2006). Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference (2nd ed.). Chapman and Hall/CRC. https:/​/​doi.org/​10.1201/​9781482296426
Geißler, C., Friehs, M. A., Frings, C., & Domes, G. (2022). Time-dependent effects of acute stress on working memory performance: A systematic review and hypothesis. Psychoneuroendocrinology, 105998.
Gelman, A. (1996). Inference and monitoring convergence. In R. Gilks, S. Richardson, & D. J. Spiegelhalter (Eds.), Markov Chain Monte Carlo in Practice.
Goldfarb, E. V., Froböse, M. I., Cools, R., & Phelps, E. A. (2017). Stress and Cognitive Flexibility: Cortisol Increases Are Associated with Enhanced Updating but Impaired Switching. Journal of Cognitive Neuroscience, 29(1), 14–24. https:/​/​doi.org/​10.1162/​jocn_a_01029
Gronwall, D. M. A. (1977). Paced Auditory Serial-Addition Task: A Measure of Recovery from Concussion. Perceptual and Motor Skills, 44(2), 367–373. https:/​/​doi.org/​10.2466/​pms.1977.44.2.367
Hazy, T. E., Frank, M. J., & O’Reilly, R. C. (2006). Banishing the homunculus: Making working memory work. Neuroscience, 139(1), 105–118. https:/​/​doi.org/​10.1016/​j.neuroscience.2005.04.067
Hazy, T. E., Frank, M. J., & O’Reilly, R. C. (2007). Towards an executive without a homunculus: Computational models of the prefrontal cortex/basal ganglia system. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1485), 1601–1613. https:/​/​doi.org/​10.1098/​rstb.2007.2055
Hedges, L. V. (1981). Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational Statistics, 6, 107–128. https:/​/​doi.org/​10.3102/​10769986006002107
Hermans, E. J., Henckens, M. J., Joëls, M., & Fernández, G. (2014). Dynamic adaptation of large-scale brain networks in response to acute stressors. Trends in Neurosciences, 37(6), 304–314. https:/​/​doi.org/​10.1016/​j.tins.2014.03.006
Jongkees, B. J. (2020). Baseline-dependent effect of dopamine’s precursor L-tyrosine on working memory gating but not updating. Cognitive, Affective, & Behavioral Neuroscience, 20(3), 521–535. https:/​/​doi.org/​10.3758/​s13415-020-00783-8
Jongkees, B. J., & Colzato, L. S. (2016). Spontaneous eye blink rate as predictor of dopamine-related cognitive function—A review. Neuroscience & Biobehavioral Reviews, 71, 58–82. https:/​/​doi.org/​10.1016/​j.neubiorev.2016.08.020
Kay, M. (2023). tidybayes: Tidy Data and Geoms for Bayesian Models (3.0.7). R package version 3.0.7. https:/​/​doi.org/​10.5281/​zenodo.1308151
Kessler, Y. (2017). The Role of Working Memory Gating in Task Switching: A Procedural Version of the Reference-Back Paradigm. Frontiers in Psychology, 8, 2260. https:/​/​doi.org/​10.3389/​fpsyg.2017.02260
Kruschke, J. K. (2013). Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General, 142(2), 573–603. https:/​/​doi.org/​10.1037/​a0029146
Kruschke, J. K., & Liddell, T. M. (2018). Bayesian data analysis for newcomers. Psychonomic Bulletin & Review, 25(1), 155–177. https:/​/​doi.org/​10.3758/​s13423-017-1272-1
Lakens, D. (2022). Sample size justification. Collabra: Psychology, 8(1), Article 33267. https:/​/​doi.org/​10.1525/​collabra.33267
Lazarus, R. S., & Folkman, S. (1984). Stress, appraisal, and coping. Springer publishing company.
Lejuez, C. W., Kahler, C. W., & Brown, R. A. (2003). A modified computer version of the Paced Auditory Serial Addition Task (PASAT) as a laboratory-based stressor. The Behavior Therapist.
Lenth, R. (2022). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.11.0-005. https:/​/​rvlenth.github.io/​emmeans/​
Lin, L., Leung, A. W. S., Wu, J., & Zhang, L. (2020). Individual differences under acute stress: Higher cortisol responders performs better on N-back task in young men. International Journal of Psychophysiology, 150, 20–28. https:/​/​doi.org/​10.1016/​j.ijpsycho.2020.01.006
Lo, S., & Andrews, S. (2015). To transform or not to transform: Using generalized linear mixed models to analyse reaction time data. Frontiers in Psychology, 6. https:/​/​doi.org/​10.3389/​fpsyg.2015.01171
Lüdecke, D. (2021). sjPlot: Data visualization for statistics in social science. R Package Version, 2(7), 1–106.
Lüdecke, D., Ben-Shachar, M., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R Package for Assessment, Comparison and Testing of Statistical Models. Journal of Open Source Software, 6(60), 3139. https:/​/​doi.org/​10.21105/​joss.03139
Luettgau, L., Schlagenhauf, F., & Sjoerds, Z. (2018). Acute and past subjective stress influence working memory and related neural substrates. Psychoneuroendocrinology, 96, 25–34. https:/​/​doi.org/​10.1016/​j.psyneuen.2018.05.036
Makowski, D., Ben-Shachar, M., & Lüdecke, D. (2019). bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework. Journal of Open Source Software, 4(40), 1541. https:/​/​doi.org/​10.21105/​joss.01541
Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., & Lüdecke, D. (2019). Indices of Effect Existence and Significance in the Bayesian Framework. Frontiers in Psychology, 10, 2767. https:/​/​doi.org/​10.3389/​fpsyg.2019.02767
Matzke, D., & Wagenmakers, E.-J. (2009). Psychological interpretation of the ex-Gaussian and shifted Wald parameters: A diffusion model analysis. Psychonomic Bulletin & Review, 16(5), 798–817. https:/​/​doi.org/​10.3758/​PBR.16.5.798
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. CRC press. https:/​/​doi.org/​10.1201/​9780429029608
Meier, M., Haub, K., Schramm, M. L., Hamma, M., Bentele, U. U., Dimitroff, S. J., Gärtner, R., Denk, B. F., Benz, A. B. E., Unternaehrer, E., & Pruessner, J. C. (2022). Validation of an online version of the trier social stress test in adult men and women. Psychoneuroendocrinology, 142, 105818. https:/​/​doi.org/​10.1016/​j.psyneuen.2022.105818
Mitchell, D. J., McNaughton, N., Flanagan, D., & Kirk, I. J. (2008). Frontal-midline theta from the perspective of hippocampal “theta.” Progress in Neurobiology, 86(3), 156–185. https:/​/​doi.org/​10.1016/​j.pneurobio.2008.09.005
Morris, S. B. (2008). Estimating effect sizes from pretest-posttest-control group designs. Organizational Research Methods, 11(2), 364–386. https:/​/​doi.org/​10.1177/​1094428106291059
Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychological Methods, 7(1), 105–125. https:/​/​doi.org/​10.1037/​1082-989x.7.1.105
Muhle-Karbe, P. S., Myers, N. E., & Stokes, M. G. (2021). A Hierarchy of Functional States in Working Memory. The Journal of Neuroscience, 41(20), 4461–4475. https:/​/​doi.org/​10.1523/​JNEUROSCI.3104-20.2021
Myers, C. E., Interian, A., & Moustafa, A. A. (2022). A practical introduction to using the drift diffusion model of decision-making in cognitive psychology, neuroscience, and health sciences. Frontiers in Psychology, 13, 1039172. https:/​/​doi.org/​10.3389/​fpsyg.2022.1039172
Nir-Cohen, G., Egner, T., & Kessler, Y. (2023). The Neural Correlates of Updating and Gating in Procedural Working Memory. Journal of Cognitive Neuroscience, 35(6), 919–940. https:/​/​doi.org/​10.1162/​jocn_a_01988
Nir-Cohen, G., Kessler, Y., & Egner, T. (2020). Neural Substrates of Working Memory Updating. Journal of Cognitive Neuroscience, 32(12), 2285–2302. https:/​/​doi.org/​10.1162/​jocn_a_01625
Oberauer, K. (2009). Chapter 2 Design for a Working Memory. In Psychology of Learning and Motivation (Vol. 51, pp. 45–100). Elsevier. https:/​/​doi.org/​10.1016/​S0079-7421(09)51002-X
Oberauer, K. (2018). Removal of irrelevant information from working memory: Sometimes fast, sometimes slow, and sometimes not at all: Removal of irrelevant information from working memory. Annals of the New York Academy of Sciences, 1424(1), 239–255. https:/​/​doi.org/​10.1111/​nyas.13603
Oberauer, K., Lewandowsky, S., Awh, E., Brown, G. D. A., Conway, A., Cowan, N., Donkin, C., Farrell, S., Hitch, G. J., Hurlstone, M. J., Ma, W. J., Morey, C. C., Nee, D. E., Schweppe, J., Vergauwe, E., & Ward, G. (2018). Benchmarks for models of short-term and working memory. Psychological Bulletin, 144(9), 885–958. https://doi.org/10.1037/bul0000153
Oei, N. Y. L., Everaerd, W. T. A. M., Elzinga, B. M., van Well, S., & Bermond, B. (2006). Psychosocial stress impairs working memory at high loads: An association with cortisol levels and memory retrieval. Stress, 9(3), 133–141. https://doi.org/10.1080/10253890600965773
Olesen, P. J., Macoveanu, J., Tegner, J., & Klingberg, T. (2006). Brain Activity Related to Working Memory and Distraction in Children and Adults. Cerebral Cortex, 17(5), 1047–1054. https://doi.org/10.1093/cercor/bhl014
O’Reilly, R. C., & Frank, M. J. (2006). Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia. Neural Computation, 18(2), 283–328. https://doi.org/10.1162/089976606775093909
Ott, T., & Nieder, A. (2019). Dopamine and Cognitive Control in Prefrontal Cortex. Trends in Cognitive Sciences, 23(3), 213–234. https://doi.org/10.1016/j.tics.2018.12.006
Otto, A. R., Raio, C. M., Chiang, A., Phelps, E. A., & Daw, N. D. (2013). Working-memory capacity protects model-based learning from stress. Proceedings of the National Academy of Sciences of the United States of America, 110(52), 20941–20946. https://doi.org/10.1073/pnas.1312011110
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., … Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. British Medical Journal (Clinical Research Ed.), 372, n71. https://doi.org/10.1136/bmj.n71
Pedersen, T. (2022). patchwork: The Composer of Plots. R package. https://patchwork.data-imaginist.com; https://github.com/thomasp85/patchwork
Peirce, J., Gray, J. R., Simpson, S., MacAskill, M., Höchenberger, R., Sogo, H., Kastman, E., & Lindeløv, J. K. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51(1), 195–203. https://doi.org/10.3758/s13428-018-01193-y
Plessow, F., Fischer, R., Kirschbaum, C., & Goschke, T. (2011). Inflexibly Focused under Stress: Acute Psychosocial Stress Increases Shielding of Action Goals at the Expense of Reduced Cognitive Flexibility with Increasing Time Lag to the Stressor. Journal of Cognitive Neuroscience, 23(11), 3218–3227. https://doi.org/10.1162/jocn_a_00024
Plessow, F., Kiesel, A., & Kirschbaum, C. (2012). The stressed prefrontal cortex and goal-directed behaviour: Acute psychosocial stress impairs the flexible implementation of task goals. Experimental Brain Research, 216(3), 397–408. https://doi.org/10.1007/s00221-011-2943-1
Poppelaars, E. S., Klackl, J., Pletzer, B., Wilhelm, F. H., & Jonas, E. (2019). Social-evaluative threat: Stress response stages and influences of biological sex and neuroticism. Psychoneuroendocrinology, 109, 104378. https://doi.org/10.1016/j.psyneuen.2019.104378
Postle, B. R. (2006). Working memory as an emergent property of the mind and brain. Neuroscience, 139(1), 23–38. https://doi.org/10.1016/j.neuroscience.2005.06.005
Pruessner, J. C., Champagne, F., Meaney, M. J., & Dagher, A. (2004). Dopamine Release in Response to a Psychological Stress in Humans and Its Relationship to Early Life Maternal Care: A Positron Emission Tomography Study Using [11C]Raclopride. The Journal of Neuroscience, 24(11), 2825–2831. https://doi.org/10.1523/JNEUROSCI.3422-03.2004
Qin, S., Cousijn, H., Rijpkema, M., Luo, J., Franke, B., Hermans, E. J., & Fernández, G. (2012). The effect of moderate acute psychological stress on working memory-related neural activity is modulated by a genetic variation in catecholaminergic function in humans. Frontiers in Integrative Neuroscience, 6, 16. https://doi.org/10.3389/fnint.2012.00016
Qin, S., Hermans, E. J., van Marle, H. J. F., Luo, J., & Fernández, G. (2009). Acute Psychological Stress Reduces Working Memory-Related Activity in the Dorsolateral Prefrontal Cortex. Biological Psychiatry, 66(1), 25–32. https://doi.org/10.1016/j.biopsych.2009.03.006
R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org
Rac-Lubashevsky, R., & Frank, M. J. (2021). Analogous computations in working memory input, output and motor gating: Electrophysiological and computational modeling evidence. PLOS Computational Biology, 17(6), e1008971. https://doi.org/10.1371/journal.pcbi.1008971
Rac-Lubashevsky, R., & Kessler, Y. (2016a). Decomposing the n-back task: An individual differences study using the reference-back paradigm. Neuropsychologia, 90, 190–199. https://doi.org/10.1016/j.neuropsychologia.2016.07.013
Rac-Lubashevsky, R., & Kessler, Y. (2016b). Dissociating working memory updating and automatic updating: The reference-back paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(6), 951–969. https://doi.org/10.1037/xlm0000219
Rac-Lubashevsky, R., & Kessler, Y. (2018). Oscillatory Correlates of Control over Working Memory Gating and Updating: An EEG Study Using the Reference-back Paradigm. Journal of Cognitive Neuroscience, 30(12), 1870–1882. https://doi.org/10.1162/jocn_a_01326
Rac-Lubashevsky, R., & Kessler, Y. (2019). Revisiting the relationship between the P3b and working memory updating. Biological Psychology, 148, 107769. https://doi.org/10.1016/j.biopsycho.2019.107769
Rac-Lubashevsky, R., Slagter, H. A., & Kessler, Y. (2017). Tracking Real-Time Changes in Working Memory Updating and Gating with the Event-Based Eye-Blink Rate. Scientific Reports, 7(1), 2547. https://doi.org/10.1038/s41598-017-02942-3
Radenbach, C., Reiter, A. M., Engert, V., Sjoerds, Z., Villringer, A., Heinze, H. J., Deserno, L., & Schlagenhauf, F. (2015). The interaction of acute and chronic stress impairs model-based behavioral control. Psychoneuroendocrinology, 53, 268–280. https://doi.org/10.1016/j.psyneuen.2014.12.017
Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20(4), 873–922. https://doi.org/10.1162/neco.2008.12-06-420
Ratcliff, R., Smith, P. L., Brown, S. D., & McKoon, G. (2016). Diffusion decision model: Current issues and history. Trends in Cognitive Sciences, 20(4), 260–281. https://doi.org/10.1016/j.tics.2016.01.007
Reinhartz, A., Strobach, T., Jacobsen, T., & von Bastian, C. C. (2023). Mechanisms of Training-Related Change in Processing Speed: A Drift-Diffusion Model Approach. Journal of Cognition, 6(1). https://doi.org/10.5334/joc.310
Rempel, S., Colzato, L., Zhang, W., Wolff, N., Mückschel, M., & Beste, C. (2021). Distinguishing Multiple Coding Levels in Theta Band Activity During Working Memory Gating Processes. Neuroscience, 478, 11–23. https://doi.org/10.1016/j.neuroscience.2021.09.025
Rohatgi, A. (2017, June). WebPlotDigitizer. https://automeris.io/WebPlotDigitizer
Russell, J. A., Weiss, A., & Mendelsohn, G. A. (1989). Affect grid: A single-item scale of pleasure and arousal. Journal of Personality and Social Psychology, 57(3), 493. https://doi.org/10.1037/0022-3514.57.3.493
Schmiedek, F., Oberauer, K., Wilhelm, O., Süß, H.-M., & Wittmann, W. W. (2007). Individual differences in components of reaction time distributions and their relations to working memory and intelligence. Journal of Experimental Psychology: General, 136(3), 414. https://doi.org/10.1037/0096-3445.136.3.414
Schmitz, F., & Voss, A. (2014). Components of task switching: A closer look at task switching and cue switching. Acta Psychologica, 151, 184–196. https://doi.org/10.1016/j.actpsy.2014.06.009
Schoofs, D., Preuß, D., & Wolf, O. T. (2008). Psychosocial stress induces working memory impairments in an n-back paradigm. Psychoneuroendocrinology, 33(5), 643–653. https://doi.org/10.1016/j.psyneuen.2008.02.004
Schwarz, L. A., & Luo, L. (2015). Organization of the locus coeruleus-norepinephrine system. Current Biology, 25(21), R1051–R1056. https://doi.org/10.1016/j.cub.2015.09.039
Shields, G. S., Sazma, M. A., & Yonelinas, A. P. (2016). The effects of acute stress on core executive functions: A meta-analysis and comparison with cortisol. Neuroscience & Biobehavioral Reviews, 68, 651–668. https://doi.org/10.1016/j.neubiorev.2016.06.038
Stauble, M. R., Thompson, L. A., & Morgan, G. (2013). Increases in cortisol are positively associated with gains in encoding and maintenance working memory performance in young men. Stress, 16(4), 402–410. https://doi.org/10.3109/10253890.2013.780236
Stedron, J. M., Sahni, S. D., & Munakata, Y. (2005). Common Mechanisms for Working Memory and Attention: The Case of Perseveration with Visible Solutions. Journal of Cognitive Neuroscience, 17(4), 623–631. https://doi.org/10.1162/0898929053467622
Steinrücke, J., Veldkamp, B. P., & De Jong, T. (2019). Determining the effect of stress on analytical skills performance in digital decision games towards an unobtrusive measure of experienced stress in gameplay scenarios. Computers in Human Behavior, 99, 144–155. https://doi.org/10.1016/j.chb.2019.05.014
Stone, C., Ney, L., Felmingham, K., Nichols, D., & Matthews, A. (2021). The effects of acute stress on attentional networks and working memory in females. Physiology & Behavior, 242, 113602. https://doi.org/10.1016/j.physbeh.2021.113602
Tanosoto, T., Bendixen, K. H., Arima, T., Hansen, J., Terkelsen, A. J., & Svensson, P. (2015). Effects of the Paced Auditory Serial Addition Task (PASAT) with different rates on autonomic nervous system responses and self-reported levels of stress. Journal of Oral Rehabilitation, 42(5), 378–385. https://doi.org/10.1111/joor.12257
Toet, A., Kaneko, D., Ushiama, S., Hoving, S., de Kruijf, I., Brouwer, A.-M., Kallen, V., & van Erp, J. B. F. (2018). EmojiGrid: A 2D Pictorial Scale for the Assessment of Food Elicited Emotions. Frontiers in Psychology, 9, 2396. https://doi.org/10.3389/fpsyg.2018.02396
Tona, K.-D., Revers, H., Verkuil, B., & Nieuwenhuis, S. (2020). Noradrenergic regulation of cognitive flexibility: No effects of stress, transcutaneous vagus nerve stimulation, and atomoxetine on task-switching in humans. Journal of Cognitive Neuroscience, 32(10), 1881–1895. https://doi.org/10.1162/jocn_a_01603
Vaessen, T., Hernaus, D., Myin-Germeys, I., & van Amelsvoort, T. (2015). The dopaminergic response to acute stress in health and psychopathology: A systematic review. Neuroscience & Biobehavioral Reviews, 56, 241–251. https://doi.org/10.1016/j.neubiorev.2015.07.008
van Ast, V. A., Spicer, J., Smith, E. E., Schmer-Galunder, S., Liberzon, I., Abelson, J. L., & Wager, T. D. (2016). Brain mechanisms of social threat effects on working memory. Cerebral Cortex, 26(2), 544–556.
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P.-C. (2021). Rank-normalization, folding, and localization: An improved R̂ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2), 667–718. https://doi.org/10.1214/20-BA1221
Verschooren, S., Kessler, Y., & Egner, T. (2021). Evidence for a single mechanism gating perceptual and long-term memory information into working memory. Cognition, 212, 104668. https://doi.org/10.1016/j.cognition.2021.104668
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
Vogel, S., & Schwabe, L. (2016). Learning and memory under stress: Implications for the classroom. Npj Science of Learning, 1(1), 16011. https://doi.org/10.1038/npjscilearn.2016.11
Voss, A., Nagler, M., & Lerche, V. (2013). Diffusion Models in Experimental Psychology: A Practical Introduction. Experimental Psychology, 60(6), 385–402. https://doi.org/10.1027/1618-3169/a000218
Voss, A., Rothermund, K., & Voss, J. (2004). Interpreting the parameters of the diffusion model: An empirical validation. Memory & Cognition, 32(7), 1206–1220. https://doi.org/10.3758/BF03196893
Wanke, N., & Schwabe, L. (2020). Subjective Uncontrollability over Aversive Events Reduces Working Memory Performance and Related Large-Scale Network Interactions. Cerebral Cortex (New York, N.Y.: 1991), 30(5), 3116–3129. https://doi.org/10.1093/cercor/bhz298
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://doi.org/10.1007/978-3-319-24277-4
Wickham, H. (2022). Reshaping data with the reshape package. Journal of Statistical Software, 21, 1–20.
Wickham, H., & Bryan, J. (2019). readxl: Read Excel Files. R package version 1.3.1. https://CRAN.R-project.org/package=readxl
Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4. https://dplyr.tidyverse.org
Wickham, H., Hester, J., & Bryan, J. (2022). readr: Read Rectangular Text Data. R package. https://github.com/tidyverse/readr
Williams, D. R., Rast, P., & Bürkner, P. C. (2018). Bayesian meta-analysis with weakly informative prior distributions. PsyArXiv. https://doi.org/10.31234/osf.io/7tbrm
Wirz, L., Bogdanov, M., & Schwabe, L. (2018). Habits under stress: Mechanistic insights across different types of learning. Current Opinion in Behavioral Sciences, 20, 9–16. https://doi.org/10.1016/j.cobeha.2017.08.009
Yu, S., Mückschel, M., Rempel, S., Ziemssen, T., & Beste, C. (2022). Time-On-Task Effects on Working Memory Gating Processes—A Role of Theta Synchronization and the Norepinephrine System. Cerebral Cortex Communications, 3(1), tgac001. https://doi.org/10.1093/texcom/tgac001
This is an open access article distributed under the terms of the Creative Commons Attribution License (4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Supplementary Material