Using physiologically validated questionnaires in which the peak of circadian arousal is determined through morningness-eveningness preferences, individuals can be categorized into morning or evening chronotypes. Such individuals are typically assumed to show better cognitive performance at their subjective peak of circadian arousal than at their off-peak time. Although this so-called synchrony effect is accepted as common knowledge, the empirical evidence is rather mixed. This may be explained by two methodological challenges. First, most studies are underpowered. Second, each study typically includes only one task, and tasks differ across studies. Here, we tested the synchrony effect by focusing on two cognitive constructs that are assumed to underlie a wide variety of behaviors: short-term maintenance and attentional control. Short-term maintenance refers to our ability to maintain information temporarily. Attentional control refers to our ability to avoid being distracted by irrelevant information. We addressed the methodological challenges by asking 446 young adults to perform eight tasks at on- and off-peak times. Four tasks assessed temporary maintenance of information (i.e., short-term memory). Four tasks assessed temporary maintenance and manipulation of information (i.e., working memory). Using structural equation modeling, we modeled attentional control as the goal-directed nature of the working-memory tasks without their maintenance aspects. At the individual-task level, there was some evidence for a synchrony effect. However, this evidence was weak and limited to two tasks. Moreover, at the latent-variable level, the results showed no evidence for a robust and general synchrony effect. These results were observed both for the full sample (N = 446) and for the subsample of participants with moderate to definite morning or evening chronotypes (N = 191).
We conclude that the synchrony effect is most likely a methodological artefact and discuss the implications of our research for psychological science and scientific research more broadly.
Environmental conditions during day and night are so different that it is hard to imagine that they do not influence human behaviour and experience. Indeed, previous research has shown that morningness-eveningness preferences, which are established by questionnaires, can be used to classify individuals into different chronotypes (Griefahn et al., 2001; Horne & Östberg, 1976). Moreover, these chronotypes have been related to physiological measures of circadian arousal (e.g., Horne & Östberg, 1977). Thus, most individuals who report a morningness preference, and are therefore classified as morning types, have their peak of circadian arousal in the morning. By contrast, most individuals who report an eveningness preference, and are therefore classified as evening types, have their peak of circadian arousal in the evening. Common knowledge as well as a large body of literature in psychology and the neurosciences suggests an interplay between chronotype and time of day (e.g., May et al., 1993; May, 1999; May et al., 2005; May & Hasher, 1998a). That is, morning and evening types are assumed to exhibit better cognitive performance at their peak than at their off-peak time. However, due to methodological challenges, the evidence for this so-called synchrony effect is not as clear as it seems. Here, we addressed these methodological challenges to clarify whether a general and robust synchrony effect can be observed.
Although the synchrony effect is accepted as true and rarely questioned in daily life (see, e.g., Brain Function of Night Owls and Larks Differ, Study Suggests, 2019; Ceurstemont, 2020; Cohen, 2014; Pink, 2018; Savage, 2020), the empirical evidence is more equivocal. This is particularly the case for cognitive constructs for which a substantial synchrony effect was put forward in early studies (see, e.g., Intons-Peterson et al., 1998; May, 1999; May & Hasher, 1998b; West et al., 2002). For instance, evidence is mixed for tasks measuring attentional control or executive functions (i.e., the ability to maintain goal-relevant information when facing distraction; Draheim et al., 2022; von Bastian et al., 2020). While some studies present evidence for a synchrony effect on attentional control (e.g., Bennett et al., 2008; Hahn et al., 2012; Hasher et al., 2002; Intons-Peterson et al., 1998; Lara et al., 2014; May, 1999; May & Hasher, 1998b), others fail to detect such an effect (e.g., Bennett et al., 2008; Heimola et al., 2021; Knight & Mather, 2013; Matchock & Mordkoff, 2008; May & Hasher, 1998b; Schmidt et al., 2012). The same holds for working memory (i.e., the ability to manipulate and maintain information for a short duration; e.g., Baddeley, 2012): some studies find evidence for a synchrony effect on working memory (e.g., Rowe et al., 2009; Schmidt et al., 2015; West et al., 2002), whereas others suggest no such effect (e.g., Ceglarek et al., 2021; Heimola et al., 2021; Lewandowska et al., 2017).
Table 1

| Study | Sample size | Number of task(s) | Testing time |
|---|---|---|---|
| Bennett et al. (2008) | 77 | > 1 | between-subjects |
| Bodenhausen (1990, Exp. 1) | 55 | 1 | between-subjects |
| Bodenhausen (1990, Exp. 2) | 189 | 1 | between-subjects |
| Ceglarek et al. (2021) | 66 | 1 | within-subject |
| Fabbri et al. (2013, Exp. 1) | 170 | 1 | between-subjects |
| Fabbri et al. (2013, Exp. 2) | 234 | 1 | between-subjects |
| Goldstein et al. (2007) | 80 | > 1 | between-subjects |
| Hahn et al. (2012) | 80 | > 1 | between-subjects |
| Hasher et al. (2002) | 96 | 1 | between-subjects |
| Intons-Peterson et al. (1998) | 64 / 40 | 1 | between-subjects |
| Intons-Peterson et al. (1999, Exp. 1) | 77 / 42 | > 1 | between-subjects |
| Intons-Peterson et al. (1999, Exp. 3) | 90 / 67 | > 1 | between-subjects |
| Lara et al. (2014) | 27 | > 1 | within-subject |
| Lehmann et al. (2013) | 42 / 42 | 1 | between-subjects |
| Lewandowska et al. (2018) | 52 | > 1 | within-subject |
| Li et al. (1998, Exp. 1) | 32 / 32 | 1 | between-subjects |
| Li et al. (1998, Exp. 2) | 32 / 31 | 1 | between-subjects |
| Matchock and Mordkoff (2009) | 80 | 1 | within-subject |
| May (1999) | 40 / 44 | 1 | between-subjects |
| May and Hasher (1998, Exp. 1) | 48 / 48 | 1 | between-subjects |
| May and Hasher (1998, Exp. 2) | 36 / 36 | 1 | between-subjects |
| May et al. (1993) | 20 / 18 | 1 | between-subjects |
| May et al. (2005, Exp. 1) | 36 / 48 | 1 | between-subjects |
| May et al. (2005, Exp. 2) | 54 / 36 | 1 | between-subjects |
| Petros et al. (1990) | 79 | 1 | between-subjects |
| Rothen and Meier (2016) | 160 | 1 | within-subject |
| Rothen and Meier (2017) | 115 / 113 | 1 | within-subject |
| Rowe et al. (2009) | 56 / 55 | 1 | between-subjects |
| Schmidt et al. (2012) | 31 | 1 | within-subject |
| Schmidt et al. (2015) | 28 | 1 | within-subject |
| Van Opstaal (2021) | 130 | 1 | within-subject |
| West et al. (2002) | 20 / 20 | 1 | within-subject |
| Yang et al. (2007, Exp. 1) | 0 / 52 | 1 | between-subjects |
| Yang et al. (2007, Exp. 2) | 0 / 46 | 1 | between-subjects |
| Yaremenko et al. (2021) | 91 | 1 | between-subjects |
| Yoon (1997) | 80 / 85 | 1 | between-subjects |
Note. For studies including young and/or older adults, both sample sizes are given separately: the first value refers to the sample size for young adults; the second refers to the sample size for older adults. In all studies, the samples included participants who were categorized as morning and evening chronotypes. Only in Van Opstaal et al. (2022) did the sample additionally include participants with neutral chronotypes. In the column "Number of task(s)", a study is considered to include more than one task if the tasks are assumed to measure the same construct. In the column "Testing time", a between-subjects manipulation of the testing time refers to a design in which one group of participants was tested at their subjective peak of circadian arousal and another group was tested at their subjective off-peak time. A within-subject manipulation refers to a design in which all participants were tested at both their subjective on- and off-peak times. For the sake of clarity, studies considered underpowered irrespective of whether the power cut-off was set to .80 or .90 are presented in bold and italic; studies considered underpowered only when the cut-off was set to .90 are presented in italic.
Based on this mixed evidence, it is tempting to conclude that the synchrony effect is not general and robust. However, the current evidence may be distorted by at least two methodological challenges. First, most studies investigating the synchrony effect on human cognition were statistically underpowered. Table 1 presents an overview of previous research with respect to sample size and the type of testing-time manipulation (i.e., whether on- and off-peak times were manipulated between-subjects or within-subject). We estimated the adequacy of the sample sizes by comparing them to the sample sizes recommended by Brysbaert (2019). These recommended sample sizes were determined using a generic effect size of Cohen's d = .40. According to Brysbaert (2019), effect sizes from previous studies should not be used to determine target sample sizes because published studies are frequently underpowered and the impact of publication bias is unknown. He suggests using a Cohen's d of .40 because this represents "a good first estimate of the smallest effect size of interest in psychological research" (p. 1). Thus, with such a Cohen's d, an alpha level of .05, and a power of .80, Brysbaert (2019) showed that a t-test comparing on- and off-peak times requires 200 participants in a between-subjects design and 52 participants in a within-subject design. If a stricter criterion of .90 is applied for the power, the sample-size requirements increase to 264 and 70 participants, respectively. According to these recommendations, 81% and 89%, respectively, of the studies listed in Table 1 had sample sizes too small to provide enough statistical power to detect a true effect.
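As a rough check on these requirements, the needed sample sizes can be approximated from standard normal quantiles. The sketch below is ours, not Brysbaert's script; it uses the normal approximation, which lands slightly below the exact noncentral-t values (200/52 at power .80, 264/70 at power .90), but it illustrates how the requirements scale with design, effect size, and power.

```python
from math import ceil
from statistics import NormalDist  # Python >= 3.8 standard library


def n_required(d, alpha=0.05, power=0.80, design="within"):
    """Normal-approximation sample size for a two-sided t-test.

    Returns the number of participants (within-subject design) or the
    number of participants per group (between-subjects design) needed
    to detect a standardized effect of size d.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n = (z / d) ** 2      # paired / within-subject case
    if design == "between":
        n *= 2            # two independent groups: n per group doubles
    return ceil(n)


# d = .40 and alpha = .05, as in Brysbaert (2019)
print(n_required(0.4))                      # within-subject, power .80 -> 50
print(n_required(0.4, design="between"))    # per group, power .80 -> 99
print(n_required(0.4, power=0.90))          # within-subject, power .90 -> 66
```

Note that a between-subjects design needs roughly four times as many participants in total as a within-subject design for the same effect size, which is why the testing-time manipulation in Table 1 matters for power.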
The second methodological challenge is that most studies included only one task, and the tasks differed across studies. This challenge concerns about 81% of the studies listed in Table 1. Including only one task per study is questionable because of the task-impurity problem: each cognitive task measures not only the construct of interest but also other constructs and random noise (e.g., Miyake & Friedman, 2012). Thus, the mixed evidence regarding the synchrony effect may result from different studies controlling more or less well for this task-impurity problem. Moreover, using different tasks across studies is problematic because these tasks may not assess the same construct. For example, the term working memory is often used synonymously with short-term memory (i.e., the ability only to maintain information for a short duration; see Ceglarek et al., 2021, and Lewandowska et al., 2017, for examples of such a mismatch). Thus, the mixed evidence regarding the synchrony effect in working memory may result from different constructs being measured. For attentional control, this issue is even greater. Recent research has emphasized that the different tasks used to assess attentional control do not measure the same construct but rather task-specific processes (e.g., Karr et al., 2018; Rey-Mermet et al., 2018, 2019, 2020, 2021). Accordingly, the mixed evidence regarding the synchrony effect in attentional control may result from some task-specific processes being affected by the synchrony effect while others are not. This makes the presence or absence of the synchrony effect difficult to predict at a theoretical level, thus challenging the generality and robustness of this effect.
The goal of the present study was to determine the scope and robustness of the synchrony effect by focusing on two cognitive constructs that are assumed to underlie a wide variety of behaviors and experiences, and by addressing the methodological challenges outlined above. Specifically, we investigated the synchrony effect on working memory and attentional control using two sets of four tasks. In one set, the tasks required temporary maintenance of information (i.e., short-term memory). In the other set, the tasks required temporary maintenance and manipulation of information (i.e., working memory). Using structural equation modeling, we modeled attentional control as the goal-directed nature of the working-memory tasks without their maintenance aspects. This modeling approach also enabled us to address the task-impurity problem by capturing the constructs of interest (i.e., attentional control and short-term maintenance) as the shared variance across the measures. Furthermore, we used a within-subject design in which a large sample of participants was tested at both peak and off-peak times.
We hypothesized that if the impact of circadian arousal can be assessed as a synchrony effect on working memory and attentional control (e.g., May & Hasher, 1998b; Rowe et al., 2009; West et al., 2002), we should be able to find better performance at peak than at off-peak times for the short-term memory and working-memory tasks as well as for the latent constructs of short-term maintenance and attentional control. In contrast, if the impact of circadian arousal cannot be measured as a synchrony effect on working memory and attentional control (e.g., Matchock & Mordkoff, 2008; May & Hasher, 1998b), performance should not differ between peak and off-peak times at the individual-task level and the latent-variable level.
Methods
Transparency and openness
We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study (Simmons et al., 2012). This study’s design and its analysis were pre-registered (see https://osf.io/tywu7). All deviations from the preregistration are presented in Appendix A. For all analyses, we used R (Version 4.3.1; R Core Team, 2021) and the R-packages afex (Version 1.3.0; Singmann et al., 2021), BayesFactor (Version 0.9.12.4.4; Morey, 2008), ggplot2 (Version 3.4.2; Wickham, 2016), lavaan (Version 0.6.16; Rosseel, 2012), papaja (Version 0.1.1; Aust & Barth, 2020), psych (Version 2.3.6; Revelle, 2021), semTools (Version 0.5.6; Jorgensen et al., 2021), and splithalf (Version 0.8.2; Parsons, 2020).
Participants
All participants were recruited and tested by students from our university who took part in the course M08 in the fall term 2020 and the course M1 in the spring term 2021. Participants were thus acquaintances of the students. Before recruitment, the students were informed about the most influential studies investigating the synchrony effect in attentional control and working memory (e.g., May & Hasher, 1998b; Rowe et al., 2009; West et al., 2002). To complete the course, the students were asked to test participants following best research practices, for example, by instructing the participants carefully and by logging any issues that occurred. Therefore, for a student, successfully recruiting and testing participants was not tied to whether their participants showed a synchrony effect. Furthermore, it was tied neither to their grades nor to the exclusion criteria used in the present study. This was implemented to ensure that the students introduced no bias into recruitment and testing.
On the participants' side, we avoided any selection bias by testing all participants who were willing to participate. Due to this data-collection procedure, the analyses were planned to be performed on all chronotypes (morning, evening, and neutral). Therefore, in the preregistration, we determined the target sample size for a sample including all chronotypes. Following Brysbaert's (2019) recommendations, we opted for a generic effect size of .196 (corresponding to a Cohen's d of .40), a statistical power of .90, and a probability level of .05. Furthermore, we hypothesized a measurement model including four latent variables and 16 manifest variables. Using the a-priori Sample Size Calculator for Structural Equation Models (retrieved from http://www.danielsoper.com/statcalc; Soper, 2018), we determined a target sample size of 453.
We classified all participants according to their chronotype. To this end, we followed the guidelines put forward by Chelminski et al. (2000) and Griefahn et al. (2001). Accordingly, participants with scores on the Morningness-Eveningness Questionnaire (D-MEQ, Griefahn et al., 2001) between 16 and 30 were categorized as definite evening types, participants with scores between 31 and 41 as moderate evening types, participants with scores between 42 and 58 as neutral types, participants with scores between 59 and 69 as moderate morning types, and participants with scores between 70 and 86 as definite morning types. This categorization is the same as in the seminal research published by May and colleagues (see, e.g., Intons-Peterson et al., 1998; May, 1999; May et al., 1993).
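For illustration, these cut-offs can be expressed as a small classification function. This is a sketch of the scoring rule described above, not code from the study itself:

```python
def classify_chronotype(meq_score: int) -> str:
    """Map a D-MEQ score (16-86) onto the five chronotype categories
    following Chelminski et al. (2000) and Griefahn et al. (2001)."""
    if not 16 <= meq_score <= 86:
        raise ValueError("D-MEQ scores range from 16 to 86")
    if meq_score <= 30:
        return "definite evening"
    if meq_score <= 41:
        return "moderate evening"
    if meq_score <= 58:
        return "neutral"
    if meq_score <= 69:
        return "moderate morning"
    return "definite morning"
```

For example, `classify_chronotype(45)` yields `"neutral"`, while a score of 70 or above yields `"definite morning"`.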
In total, 689 young participants were tested. Participants were not paid. A complete description of the exclusion criteria used in the present study is presented in Table 2. Applying these criteria resulted in a sample consisting of 446 participants, which is close to our target sample size. Because previous research repeatedly reported a synchrony effect for participants categorized as moderate and definite chronotypes (see, e.g., Intons-Peterson et al., 1998; May, 1999; May et al., 1993), we first report the analyses on the subsample including participants with moderate and definite chronotypes. The analyses on the full sample (N = 446) are presented as part of the multiverse-analysis approach.
Table 2

| Reasons | Number of exclusions |
|---|---|
| Participants were not aged between 18 and 28. | 9 |
| Participants did not report Swiss German or German as native language. | 5 |
| Participants reported colorblindness or no normal vision. | 0 |
| Participants reported neurological or psychiatric disorders. | 5 |
| Participants did not complete the whole experiment. | 46 |
| The session order or the task order across the sessions was incorrect. | 6 |
| Participants took a rest longer than 30 minutes. | 8 |
| The data timestamp was not in chronological order. | 2 |
| Participants did not perform the morning session between 07:30 and 10:00 and the evening session between 16:30 and 19:00. | 65 |
| The morning or evening session lasted more than two hours. | 2 |
| A task was missing.a | 69 |
| Participants were multivariate outliers.b | 26 |
a A task was missing or discarded if the computer malfunctioned or if the participant did not respond within three minutes while performing the task. This was implemented to ensure a laboratory-like setting. b We checked for multivariate normality across all measures using Mardia's (1970) kurtosis index. Participants were considered multivariate outliers when their Mahalanobis d² values were significant.
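The multivariate-outlier screening in note b can be illustrated as follows. This pure-Python sketch (the study's own analyses used R) computes squared Mahalanobis distances for a toy two-variable data set and flags values beyond the χ² critical value for df = 2; the p = .001 alpha level is a common convention for this check and is our assumption, not a detail reported above.

```python
from statistics import mean

# tabled chi-square critical value, df = 2, p = .001
CHI2_CRIT_DF2_P001 = 13.816


def mahalanobis_d2(data):
    """Squared Mahalanobis distance of each 2-D point from the sample
    mean, using the sample covariance matrix."""
    xs, ys = zip(*data)
    mx, my = mean(xs), mean(ys)
    n = len(data)
    # sample covariance matrix entries
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2                         # 2x2 determinant
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det   # 2x2 inverse
    return [ixx * (x - mx) ** 2
            + 2 * ixy * (x - mx) * (y - my)
            + iyy * (y - my) ** 2
            for x, y in zip(xs, ys)]


def flag_outliers(data):
    """True for points whose d2 exceeds the chi-square cut-off."""
    return [d2 > CHI2_CRIT_DF2_P001 for d2 in mahalanobis_d2(data)]
```

With 16 measures, as in the study's measurement model, the same logic applies with df = 16 and a correspondingly larger cut-off.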
The subsample consisted of 191 young adults. One hundred and thirty-one participants were morning types, and 60 were evening types. Demographic characteristics for this subsample are summarized in Table 3. The study was approved by the ethics committee of the university (approval number: 2020-06-00001), and all participants gave written informed consent.
Table 3

| Measure | Sample |
|---|---|
| Sample size | 191 |
| Age, years | 23.8 (3.0) |
| Age range | 18-28 |
| Gender (female / male / other) | 135 / 54 / 2 |
| Education, years | 13.5 (3.8) |
| Education level a | 4.9 (1.5) |
| BDI-II score b | 6.9 (6.3) |
| PSQ score c | 30.0 (15.9) |
| D-MEQ chronoscore d (morning / evening) | 64.5 (4.0) / 36.5 (4.1) |
| D-MEQ chronoscore range (morning / evening) | 59-78 / 25-41 |
Note. Standard deviations are given in parentheses. BDI-II = Beck Depression Inventory II (Hautzinger et al., 2006); PSQ = Perceived Stress Questionnaire (Fliege et al., 2005); D-MEQ = German version of the Morningness-Eveningness-Questionnaire (Griefahn et al., 2001). a Education level ranged from 1 (no or less than 9 school years) to 8 (Ph.D.). b Depression score ranged from 0 (minimal depression) to 63 (severe depression). c PSQ score ranged from 0 to 100. d Chronoscore ranged from 16 (definite evening type) to 86 (definite morning type). Neutral types are indicated by a chronoscore ranging from 42 to 58.
Material
All tasks and questionnaires were programmed using lab.js (Henninger et al., 2019) on a computer with a 31 x 17.4 cm screen and a 1920 x 1080 px screen resolution. A set of four tasks was used to measure short-term memory. These tasks were simple-span tasks with digits, letters, matrices, or arrows as materials. They were programmed following Kane et al. (2004). A second set of four tasks was used to measure working memory. These tasks were complex-span tasks and updating tasks with numerical or spatial materials. They were programmed following Rey-Mermet et al. (2019). Each task had two versions (i.e., Version 1 and Version 2), which differed only in the presentation order of the stimulus exemplars. In each version, the same pseudorandom presentation order was administered to all participants. For all tasks, each stimulus exemplar was presented approximately equally often. Unless specified otherwise, each event (e.g., stimulus or prompt) was presented centrally, in black, and in 36-point sans-serif font. In all tasks, feedback consisted of a smiling face after a correct response and a frowning face after an error. Both feedback stimuli subtended 2.29° of visual angle in width and height at a viewing distance of 60 cm. In the next paragraphs, we present the short-term memory and working-memory tasks as well as the questionnaires separately.
Short-term memory and working-memory tasks
In all short-term memory tasks, a trial consisted of an encoding phase followed by a recall phase. In the working-memory tasks, there was in addition either a distractor task (for the complex-span tasks) or updating steps (for the updating tasks). In all span tasks, set size refers to the number of memoranda to be remembered on each trial. Although the pool of memoranda was limited in most tasks (e.g., the digits one through nine for the digit simple span), memoranda did not repeat within a trial. Next, we describe each task in detail.
In the digit simple span, the encoding phase consisted of memorizing sequences of digits. The digits were presented in set sizes ranging from two to nine digits. In the recall phase, digits had to be recalled in correct serial order. Thus, a text field was presented in the center of the screen. In addition, a counter (e.g., “digit 1” for the first digit to be recalled) was presented on the upper part of the screen (2.86° visual angle) to keep track of the serial order.
The letter simple span was similar to the digit simple span, except for the following modifications. First, the memoranda were the uppercase letters B, F, H, J, L, M, Q, R, and X. Second, set sizes ranged from two to eight letters. Third, at recall, letters had to be recalled in correct serial order by clicking sequentially on the corresponding letters in the matrix. To this end, all nine letters were presented in a 3 x 3 matrix (with a length of 4.68° visual angle). A counter (e.g., “letter 1” for the first letter to be recalled) was also presented in 32-point font on the upper part of the screen (3.82° visual angle).
The matrix simple span was similar to the letter simple span, except for the following modifications. First, the memoranda were the positions of squares in a 4 x 4 matrix. The matrix had a length of 4.96° visual angle and each square of the matrix had a length of 1.24° visual angle. Second, set sizes ranged from two to seven squares. Third, at recall, an empty 4 x 4 matrix was displayed. The counter included the word “position” (e.g., “position 1” for the first position to be recalled).
The arrow simple span was similar to the letter and matrix simple spans, except for the following modifications. First, the memoranda were short and long arrows radiating out from the center of the screen. Short arrows had a length of 0.86° visual angle, whereas long arrows had a length of 1.72° visual angle. Each arrow pointed at either 0°, 45°, 90°, 135°, 180°, 225°, 270°, or 315°. Second, set sizes ranged from two to six arrows. Third, at recall, all short arrows were presented in a 3 x 3 matrix on the left side of the screen, and all long arrows were presented in another 3 x 3 matrix on the right side of the screen. Both matrices had a length of 6.30° visual angle, and they were separated by 2.10° visual angle. The counter included the word “arrow” (e.g., “arrow 1” for the first arrow to be recalled).
In the numerical complex span, the encoding phase consisted of memorizing sequences of three to five two-digit numbers. Between the presentation of these memoranda, the distractor task was presented. Thus, one equation that was either valid (e.g., “6-6=0”) or invalid (e.g., “5+11=18”) was displayed. The validity of the equation was judged by pressing the left- and right-pointing arrow key, respectively. This response mapping was presented in 24-point monospace font with a width of 12.35° visual angle on the lower part of the screen (4.30° visual angle). At recall, the numbers had to be recalled in correct serial order. Thus, a text field was presented in the center of the screen. In addition, a counter (e.g., “number 1” for the first number to be recalled) was presented on the upper part of the screen (2.86° visual angle) to keep track of the serial order.
The spatial complex span task was similar to the numerical complex span, except for the following modifications. First, four to six red squares were presented sequentially in a 5 x 5 matrix. The matrix had a length of 6.2° visual angle and each square of the matrix had a length of 1.24° visual angle. Second, the distractor task consisted of judging whether the pattern emerging from four squares presented concurrently and arranged in an L-shape was vertical or horizontal. Third, at recall, positions had to be recalled in correct serial order by clicking sequentially on the corresponding positions in the matrix. Thus, a 5 x 5 matrix was presented. In addition, a counter (e.g., “position 1” for the first position to be recalled) was displayed above the matrix (i.e., 4.30° visual angle).
In the numerical updating task, the encoding phase consisted of memorizing four digits (ranging from one to nine) presented in four different colors (i.e., red, blue, green, and orange). The digits were displayed centrally, each separated by a width of 1.43° visual angle. In the following updating steps, the digit to be updated was presented centrally in one of the four colors. In the recall phase, the most recent digit of each color had to be recalled. Thus, the word “digit” was displayed in the color corresponding to the digit to be recalled.
In the spatial updating task, the encoding phase consisted of memorizing the positions of three to five dots presented in a 4 x 4 matrix. The matrix had a length of 6.2° visual angle, and each dot had a length of 1.24° visual angle. In each updating step, the new position of the to-be-updated dot was indicated by an arrow pointing in the direction of the required mental shift of that dot. That is, one of the colored dots was presented centrally below a black arrow pointing either left, right, up, or down (with a width of 1.24° visual angle and a height of 0.48° visual angle). After each updating step, the most recent position of the dot to be updated had to be recalled. To this end, an empty 4 x 4 matrix was presented.
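The bookkeeping demanded by the updating tasks, where only the most recent memorandum per color counts at recall, can be sketched as follows. The function name and data layout are illustrative and not taken from the experiment code:

```python
def final_memoranda(initial, updates):
    """Return the most recent digit per color after all updating steps,
    i.e., what must be reported at recall in the numerical updating task.

    `initial` maps each of the four colors to its starting digit;
    `updates` is the ordered sequence of (color, digit) updating steps.
    """
    state = dict(initial)
    for color, digit in updates:
        state[color] = digit   # later steps overwrite earlier ones
    return state
```

The same overwrite logic applies to the spatial updating task, with matrix positions instead of digits and arrow-indicated shifts producing each new value.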
Questionnaires
In the present study, five questionnaires were used. The first was the D-MEQ (Griefahn et al., 2001), which assessed the chronotype of each participant. The second questionnaire was implemented to collect descriptive information about our sample. It assessed socio-demographic variables, such as age, gender, handedness, color blindness, nationality, native language, foreign language(s), number of education years, socio-economic status, synesthetic experience, and leisure activities (e.g., music, sport, video games), as well as information about general health status and preferences for dorsal/ventral processing types. The third questionnaire was used to obtain information about the current health status of our sample. The questions concerned medication use and sleep in the last 24 hours as well as nicotine, alcohol, and drug consumption in the last 2 hours. Because of the known effects of depression and stress on cognition (e.g., McDermott & Ebmeier, 2009; Rock et al., 2013; Saenger et al., 2014; Starcke et al., 2016), the last two questionnaires were the German version of the Beck Depression Inventory II (BDI-II, Hautzinger et al., 2006) and the German version of the Perceived Stress Questionnaire (PSQ, Fliege et al., 2005).
Procedure
Participants were tested remotely by means of a browser-based online experiment during three sessions. Participants were alone for all three sessions, but they could phone the student who recruited them in case of problems or questions. At the beginning of each session, participants were required to confirm that they performed the experiment in a laboratory-like setting (e.g., they were alone in a quiet room, without distraction, and had closed all computer programs except the browser window with the tasks). At the beginning of the second and third sessions, participants performed a scaling task. This task was implemented to ensure that all stimuli were presented at the same size, because the tasks were run on participants’ computers with different screen sizes and resolutions. In this scaling task, participants adapted the size of a rectangle to the size of a credit card. Based on this result, the size of the stimuli was computed so that it was identical across the different screen sizes and resolutions. In the middle of each of the three sessions, participants could take a break of about 10 minutes. At the end of each session, they were asked about their current health status.
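The calibration logic can be sketched as follows, assuming the standard ID-1 credit-card width of 85.60 mm; the function names and the exact computation are illustrative, not taken from the experiment code.

```python
# Illustrative sketch of the calibration logic (function names are
# assumptions): the participant's matched rectangle width, together with
# the ID-1 credit-card width of 85.60 mm, yields a pixels-per-millimetre
# factor used to draw stimuli at the same physical size on any display.
CREDIT_CARD_WIDTH_MM = 85.60

def pixels_per_mm(matched_width_px):
    """Calibration factor from the participant's adjusted rectangle."""
    return matched_width_px / CREDIT_CARD_WIDTH_MM

def stimulus_size_px(target_mm, matched_width_px):
    """Pixel size needed to render a stimulus at a fixed physical size."""
    return round(target_mm * pixels_per_mm(matched_width_px))
```

For example, a participant whose matched rectangle is twice the card's width is assumed to have a display at 2 pixels per millimetre, so all stimuli are drawn at twice their nominal millimetre size in pixels.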
The first session was a screening session lasting approximately 30 minutes. At the beginning of this session, after being informed about the study, participants confirmed their consent to participate. Then, they completed the D-MEQ (Griefahn et al., 2001) and the general socio-demographic questionnaire. The following two sessions lasted approximately 1.5 hours each. These two sessions were separated by at least one night and at most one week. The one-week interval was extended if a participant could not perform the session as planned (e.g., due to illness). The two sessions were performed either at 08:00 or at 17:00. We selected these testing times based on previous research (see Table 4). Moreover, because these two sessions lasted about 1.5 hours each, we considered testing times ranging from 07:30 to 10:00 and from 16:30 to 19:00, respectively, as still acceptable. These ranges are in line with the testing times reported in Table 4. For each chronotype (i.e., definite morning, moderate morning, neutral, moderate evening, and definite evening), half of the participants were assigned to be tested on peak in the second session and off peak in the third session, whereas the other half were tested in the reversed order (i.e., off peak in the second session and on peak in the third session). During the on- and off-peak sessions, participants performed all short-term memory and working-memory tasks. At the end of the third session, participants completed the BDI-II (Hautzinger et al., 2006) and PSQ (Fliege et al., 2005).
Study | Morning testing time | Evening testing time |
Bennett et al. (2008) | 08:00-10:00 | 15:00-17:00 |
Bodenhausen (1990, Exp. 1) | 09:00 | 20:00 |
Bodenhausen (1990, Exp. 2) | 09:00 | either 15:00 or 20:00 |
Ceglarek et al. (2021) | 09:25-09:55; 11:00-11:30 | 18:30-19:00; 20:40-21:10 |
Fabbri et al. (2013, Exp. 1) | 09:00-10:00 | 17:00-18:00 |
Fabbri et al. (2013, Exp. 2) | 09:00-10:00 | 18:00-19:00 |
Goldstein et al. (2007) | 8:00-10:00 | 13:00-15:00 |
Hahn et al. (2012) | 8:00-10:00 | 13:00-15:00 |
Hasher et al. (2002) | 8:00-9:15 | 16:30-17:15 |
Intons-Peterson et al. (1998) | 8:00-10:30 | 15:30-18:00 |
Intons-Peterson et al. (1999, Exp. 1) | before 10:30 | 15:00 or later |
Intons-Peterson et al. (1999, Exp. 3) | before 10:30 | 15:00 or later |
Lara et al. (2014) | 08:00 | 20:30 |
Lehmann et al. (2013) | 9:00-11:00 | 15:00-17:00 |
Lewandowska et al. (2018) | 8:00; 09:00 | 17:00; 18:00 |
Li et al. (1998, Exp. 1) | 08:00 | 17:00 |
Li et al. (1998, Exp. 2) | 08:00 | 17:00 |
Matchock and Mordkoff (2009) | 08:00 | 16:00 and 20:00 |
May (1999) | 08:00 | 17:00 |
May and Hasher (1998, Exp. 1) | 08:00 | 16:00 or 17:00 |
May and Hasher (1998, Exp. 2) | 08:00 | 17:00 |
May et al. (1993) | 8:00 or 9:00 | 16:00 or 17:00 |
May et al. (2005, Exp. 1) | 8:00-9:00 | 17:00-18:00 |
May et al. (2005, Exp. 2) | 8:00-9:00 | 17:00-18:00 |
Petros et al. (1990) | 09:00 | 20:00 |
Rothen and Meier (2016) | 6:00-10:00 | 17:00-21:00 |
Rothen and Meier (2017) | 8:00-12:00 | 16:00-20:00 |
Rowe et al. (2009) | 8:00 or 9:00 | 16:00 or 17:00 |
Schmidt et al. (2012) | after one hour | after 10.5 hours |
Schmidt et al. (2015) | after one hour | after 10.5 hours |
Van Opstaal (2021) | 08:00 | 20:30 |
West et al. (2002) | 09:00 | 17:00 |
Yang et al. (2007, Exp. 1) | 9:00-10:00 | 16:00-17:00 |
Yang et al. (2007, Exp. 2) | 9:00-10:00 | 16:00-17:00 |
Yaremenko et al. (2021) | 07:40-09:00 | 20:30-21:30 |
Yoon (1997) | 8:00 or 9:00 | 16:00 or 17:00 |
Note. For Ceglarek et al. (2021) and Lewandowska et al. (2018), the testing times depended on whether participants were categorized as morning or evening types: in each cell, the first testing time refers to the morning chronotype and the second to the evening chronotype. For Schmidt et al. (2012, 2015), the testing times were individually selected so that testing occurred either after 1 hour or after 10.5 hours of wakefulness.
Across the on- and off-peak sessions, the same order of short-term memory and working-memory tasks was used. For half of the participants, the task order was: verbal simple span, spatial complex span, numerical simple span, spatial updating, numerical complex span, arrow simple span, numerical updating, and spatial simple span. This order was reversed for the other half of the participants to control for practice effects. Moreover, in each session, one version of the tasks – that is, either Version 1 or Version 2 – was used. These versions and their order were counterbalanced across the two sessions so that half of the participants started with Version 1 and the other half with Version 2. Both counterbalancing conditions – that is, the counterbalancing of task order and of version order – were performed within each chronotype (i.e., within definite morning, moderate morning, neutral, moderate evening, and definite evening types).
The task structure was similar across the different short-term memory and working-memory tasks. That is, each task started with the presentation of instructions explaining how participants had to carry out the task. These instructions were followed by a practice block, which could be repeated in case the participants required it. Following the practice block, participants performed one experimental block for the short-term memory tasks and two experimental blocks for the working-memory tasks. For the short-term memory tasks, the practice block included three trials with a set size of two, and the experimental block included three trials of each set size. In this experimental block, the set size ranged from three to nine for the digit simple span, three to eight for the letter simple span, two to seven for the matrix simple span, and two to six for the arrow simple span. For both numerical and spatial complex spans as well as for the spatial updating task, the practice block included two trials, and the two experimental blocks included a total of 12 trials. For the numerical updating task, the practice block included three trials (with four, six, and seven updating steps, respectively). The two experimental blocks included a total of 25 trials (each trial including seven updating steps). In addition, in this task, recall was probed in five out of the 25 trials immediately after the initial encoding. This was implemented to ensure that the initial set of memoranda was encoded. In all tasks, participants could take brief rests after each block.
The trial sequence was similar across all short-term memory tasks. That is, each trial started with the prompt “Ready?” until the participant pressed the space key. Then, the memorandum was presented for 1000 ms in the digit, letter, and arrow simple spans, and for 650 ms in the matrix simple span. The presentation of each memorandum was followed by a blank screen for 500 ms. This sequence was repeated, depending on the set size. After all memoranda were presented, participants were required to recall the sequence of memoranda in correct serial order. In the digit simple span, participants were asked to enter their response with the keyboard and then to press “Next” on the screen using the mouse. In the letter, matrix, and arrow simple spans, they were asked to select their responses by clicking on the screen. During the practice block only, feedback about the accuracy of the given response was presented for 500 ms. At the end of each trial, a blank screen was displayed for 500 ms.
Apart from the prompt “Ready?” and the feedback during the practice block, which were identical, the trial sequence was more varied across the working-memory tasks. In the numerical and spatial complex spans, each memorandum was presented for 1000 ms, and each distractor task was presented for 3000 ms maximally. During the distractor task, the stimulus-response mapping was additionally presented, but in the practice block only. This encoding sequence was repeated, depending on the set size. After all memoranda were presented, participants were asked to recall the memoranda in correct serial order. In the numerical complex span, participants were asked to enter their response with the keyboard and then to press “Next” on the screen using the mouse. In the spatial complex span, they were asked to select their responses by clicking on the screen. In both complex spans, at the end of the trial, a blank screen was displayed for 500 ms. In the numerical updating task, the encoding phase lasted for 5000 ms, followed by a blank screen for 250 ms. Each updating step was then displayed for 1250 ms, followed by a blank screen for 250 ms. The recall lasted until the participant responded by entering the digit with the keyboard. In the spatial updating task, memoranda were simultaneously presented for 500 ms per colored dot (e.g., four memoranda were presented for 2000 ms). Each updating step lasted for 500 ms. After each updating step, participants were asked to recall the most recent position of the dot to be updated. To this end, the matrix was displayed again until the participant clicked on it to give a response.
Data preparation
For all short-term memory and working-memory tasks, the dependent measure was the accuracy rate, computed as the proportion of memoranda recalled at the correct position (partial-credit load score, see Conway et al., 2005). Mean accuracy rates were then computed for each participant, each task, and each session (on peak vs. off peak). For the numerical updating task, performance on the immediate probes was not included in the computation of the dependent measure. Standardized questionnaires were analyzed following their manuals.
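As a concrete illustration, the partial-credit load score (Conway et al., 2005) can be computed as follows; variable names and data layout are illustrative, not taken from the analysis scripts.

```python
# Minimal sketch of the partial-credit load score: the proportion of
# memoranda recalled at their correct serial position, averaged over
# trials. Variable names are illustrative.
def partial_credit_score(presented, recalled):
    """Proportion of items recalled at the correct position in one trial."""
    hits = sum(p == r for p, r in zip(presented, recalled))
    return hits / len(presented)

def mean_accuracy(trials):
    """Mean partial-credit score over (presented, recalled) trial pairs."""
    scores = [partial_credit_score(p, r) for p, r in trials]
    return sum(scores) / len(scores)
```

For instance, recalling two of four digits at their correct positions yields a score of .50 for that trial, regardless of whether the other two responses were omissions or misplacements.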
Data analysis
We used an alpha level of .05 for all tests from the null hypothesis significance testing (NHST) framework. Effect sizes were calculated as Cohen’s d. For the Bayesian hypothesis testing, default prior scales were used. Moreover, the Bayes factors (BFs) were interpreted using Raftery’s (1995) classification scheme. According to this classification, a BF between 1 and 3 is considered weak evidence, a BF between 3 and 20 positive evidence, a BF between 20 and 150 strong evidence, and a BF larger than 150 very strong evidence.
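The classification scheme above can be expressed as a small lookup; the treatment of exact boundary values (e.g., BF = 3) is an assumption, as Raftery's bins are stated as ranges.

```python
# Raftery's (1995) evidence categories for Bayes factors, as described in
# the text. Boundary handling (e.g., whether BF = 3 is weak or positive)
# is an assumption.
def classify_bf(bf):
    """Map a Bayes factor (> 1) onto Raftery's evidence categories."""
    if bf <= 1:
        raise ValueError("classification applies to BFs greater than 1")
    if bf < 3:
        return "weak"
    if bf < 20:
        return "positive"
    if bf <= 150:
        return "strong"
    return "very strong"
```

Evidence for the null hypothesis can be classified the same way by passing the inverse, BF01 = 1/BF10.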
Model estimation
Model fit was evaluated via multiple fit indices (Hu & Bentler, 1998, 1999): the χ2 goodness-of-fit statistic, Bentler’s comparative fit index (CFI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). For the χ2 statistic, a small, non-significant value indicates good fit. For the CFI, values larger than .95 indicate good fit, and values between .90 and .95 indicate acceptable fit. RMSEA values smaller than .06 and SRMR values smaller than .08 indicate good fit. Note that RMSEA values are less dependable when the sample includes fewer than 250 participants (Hu & Bentler, 1998). In such cases, these values are provided for the sake of completeness, but they are not taken into account when evaluating model fit.
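The cutoffs above can be summarized in a small helper; the rule for ignoring RMSEA with fewer than 250 participants follows the text, while the output format and the handling of exact cutoff values are assumptions.

```python
# Sketch of the fit-evaluation rules stated above (Hu & Bentler, 1998,
# 1999). The dictionary output and the handling of exact cutoff values
# are assumptions.
def evaluate_fit(cfi, rmsea, srmr, n):
    verdicts = {
        "CFI": "good" if cfi > .95 else "acceptable" if cfi >= .90 else "poor",
        "SRMR": "good" if srmr < .08 else "poor",
    }
    if n >= 250:  # RMSEA is only considered for larger samples (see text)
        verdicts["RMSEA"] = "good" if rmsea < .06 else "poor"
    return verdicts
```
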
In addition, the following criteria had to be met for a model to be considered a “good” fitting model: (1) the Kaiser-Meyer-Olkin (KMO) index – a measure of whether the correlation matrix is factorable – had to be larger than .60 (Tabachnick & Fidell, 2019); (2) most of the error variances had to be lower than .90; (3) most of the factor loadings had to be significant and larger than .30; (4) no factor should be dominated by a large loading from a single task; and (5) the quality with which the factor was represented by its set of measures – sometimes called construct reliability or replicability – had to be good, as assessed by the index H, which had to meet the standard criterion of .70 (Rodriguez et al., 2016).
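Criterion (5) can be illustrated with the standard formula for the index H (see Rodriguez et al., 2016), H = S/(1 + S) with S = Σ λi²/(1 − λi²) over the standardized factor loadings λi; the implementation below is a sketch under that formula.

```python
# Sketch of the construct-replicability index H computed from
# standardized factor loadings: H = S / (1 + S), where S is the sum of
# loading^2 / (1 - loading^2). A factor meets the criterion if H >= .70.
def index_h(loadings):
    s = sum(l ** 2 / (1 - l ** 2) for l in loadings)
    return s / (1 + s)
```

With a single indicator, H reduces to the squared loading; with several strong indicators, H exceeds any individual squared loading, reflecting the gain from measuring the factor with multiple tasks.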
Results
Results are reported in three steps. First, we investigated the reliability estimates and the correlational pattern for all measures. Second, we replicated previous research by examining the synchrony effect at the individual-task level (see, e.g., Ceglarek et al., 2021; Lewandowska et al., 2017; Rowe et al., 2009; Schmidt et al., 2015; West et al., 2002). Third, we used structural equation modeling (SEM) to investigate the synchrony effect at the latent-variable level for the constructs of short-term maintenance and attentional control. In this step, we modeled these constructs for on and off peaks separately and estimated a latent-change model between both peaks. This assessment is equivalent to a paired t-test applied to latent constructs (see Kievit et al., 2018).
Reliability and correlations
As shown in Table 5, all measures had acceptable skew and kurtosis (i.e., between -1.01 and 0.50). The reliability estimates for all measures in both sessions were good, ranging from .77 to .95.
Session | Task | Mean | SD | Min. | Max. | Skew | Kurtosis | Reliability |
Off peak | Digit simple span | .77 | .11 | .44 | 1 | -0.36 | -0.23 | .89 [.87, .91] |
Letter simple span | .75 | .12 | .37 | 1 | -0.46 | 0.37 | .87 [.84, .89] | |
Matrix simple span | .79 | .12 | .43 | .99 | -0.49 | -0.32 | .87 [.84, .89] | |
Arrow simple span | .64 | .12 | .28 | .98 | -0.13 | 0.19 | .77 [.72, .81] | |
Numerical complex span | .44 | .19 | .02 | .98 | 0.28 | -0.01 | .88 [.85, .90] | |
Spatial complex span | .36 | .20 | .02 | .83 | 0.48 | -0.70 | .92 [.91, .94] | |
Numerical updating | .58 | .23 | .11 | 1 | -0.05 | -1.01 | .95 [.94, .96] | |
Spatial updating | .67 | .16 | .11 | .98 | -0.64 | 0.30 | .93 [.91, .94] | |
On peak | Digit simple span | .77 | .11 | .46 | .98 | -0.32 | -0.32 | .89 [.87, .91] |
Letter simple span | .76 | .12 | .33 | 1 | -0.40 | -0.03 | .88 [.86, .91] | |
Matrix simple span | .80 | .11 | .46 | 1 | -0.75 | 0.33 | .84 [.81, .88] | |
Arrow simple span | .66 | .13 | .32 | 1 | -0.07 | -0.20 | .80 [.76, .84] | |
Numerical complex span | .45 | .18 | .04 | 1 | 0.21 | -0.20 | .86 [.84, .89] | |
Spatial complex span | .37 | .20 | .02 | 1 | 0.50 | -0.39 | .92 [.91, .94] | |
Numerical updating | .61 | .23 | .10 | 1 | -0.24 | -0.89 | .95 [.94, .96] | |
Spatial updating | .68 | .15 | .22 | 1 | -0.65 | 0.30 | .92 [.90, .93] |
Note. Short-term memory and working memory were measured using accuracy rates. Permutation-based split-half reliability estimates were computed (see Parsons et al., 2019). The split-half correlations were adjusted with the Spearman–Brown prophecy formula, and the results of 5000 random splits were averaged. The 95% confidence intervals are presented in brackets. SD = Standard Deviation; Min. = minimum; Max. = maximum.
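The permutation-based split-half procedure described in the note can be sketched as follows (Parsons et al., 2019). This is a minimal reconstruction using far fewer than the 5000 splits used here, and the data layout (one trial-score list per participant) is an assumption.

```python
# Minimal reconstruction of permutation-based split-half reliability
# (Parsons et al., 2019): randomly split trials in half, correlate
# per-participant half means, apply the Spearman-Brown correction, and
# average over splits. The data layout is an assumption; the study used
# 5000 splits.
import random
from statistics import mean

def _pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(data, n_splits=100, seed=0):
    """Average Spearman-Brown-corrected split-half correlation."""
    rng = random.Random(seed)
    n_trials = len(data[0])
    estimates = []
    for _ in range(n_splits):
        idx = list(range(n_trials))
        rng.shuffle(idx)
        half1, half2 = idx[: n_trials // 2], idx[n_trials // 2 :]
        m1 = [mean(scores[i] for i in half1) for scores in data]
        m2 = [mean(scores[i] for i in half2) for scores in data]
        r = _pearson(m1, m2)
        estimates.append(2 * r / (1 + r))  # Spearman-Brown prophecy formula
    return mean(estimates)
```

Averaging over many random splits removes the arbitrariness of any single split (e.g., odd vs. even trials), which is the motivation for the permutation-based approach.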
Pearson correlation coefficients, along with their 95% confidence intervals, are shown in Table 6. Bayes factors (BFs) for the correlations are presented in Table 7. These assessed the weight of evidence in favor of the alternative hypothesis (BF10, i.e., in favor of a correlation) and in favor of the null hypothesis (BF01, i.e., in favor of the absence of a correlation). The correlations were moderate to strong, ranging from .24 to .73. All correlations were significant (ps < .001), and all BFs suggested strong to very strong evidence for the correlations (all BF10 ≥ 38.92).
Off peak | On peak | |||||||||||||||
Session | Task | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
Off peak | 1. Digit s.s. | - | ||||||||||||||
2. Letter s.s. | .68* | - | ||||||||||||||
[.60, .75] | ||||||||||||||||
3. Matrix s.s. | .26* | .28* | - | |||||||||||||
[.12, .39] | [.14, .40] | |||||||||||||||
4. Arrow s.s. | .37* | .33* | .59* | - | ||||||||||||
[.24, .48] | [.19, .45] | [.49, .68] | ||||||||||||||
5. Numerical c.s. | .54* | .54* | .34* | .40* | - | |||||||||||
[.43, .64] | [.43, .63] | [.20, .46] | [.27, .51] | |||||||||||||
6. Spatial c.s. | .32* | .35* | .55* | .40* | .45* | - | ||||||||||
[.18, .44] | [.22, .47] | [.44, .64] | [.27, .51] | [.33, .56] | ||||||||||||
7. Numerical upd. | .34* | .36* | .39* | .47* | .40* | .43* | - | |||||||||
[.21, .46] | [.23, .48] | [.27, .51] | [.35, .57] | [.28, .52] | [.31, .54] | |||||||||||
8. Spatial upd. | .30* | .29* | .56* | .53* | .34* | .50* | .50* | - | ||||||||
[.17, .43] | [.15, .41] | [.45, .65] | [.41, .62] | [.21, .46] | [.38, .60] | [.39, .60] | ||||||||||
On peak | 9. Digit s.s. | .67* | .63* | .32* | .38* | .44* | .36* | .28* | .25* | - | ||||||
[.59, .75] | [.53, .71] | [.19, .44] | [.25, .49] | [.32, .55] | [.23, .48] | [.15, .41] | [.12, .38] | |||||||||
10. Letter s.s. | .60* | .68* | .27* | .32* | .49* | .29* | .25* | .27* | .66* | - | ||||||
[.50, .68] | [.60, .75] | [.13, .39] | [.18, .44] | [.37, .59] | [.15, .41] | [.11, .38] | [.13, .40] | [.57, .73] | ||||||||
11. Matrix s.s. | .28* | .34* | .70* | .52* | .25* | .42* | .25* | .45* | .38* | .31* | - | |||||
[.14, .41] | [.21, .46] | [.62, .77] | [.41, .62] | [.11, .38] | [.30, .53] | [.12, .38] | [.33, .56] | [.26, .50] | [.18, .44] | |||||||
12. Arrow s.s. | .29* | .26* | .58* | .70* | .31* | .38* | .39* | .49* | .41* | .32* | .55* | - | ||||
[.15, .41] | [.12, .38] | [.47, .66] | [.61, .76] | [.18, .43] | [.25, .50] | [.26, .50] | [.37, .59] | [.28, .52] | [.18, .44] | [.44, .64] | ||||||
13. Numerical c.s. | .44* | .53* | .31* | .34* | .67* | .35* | .37* | .24* | .52* | .52* | .34* | .42* | - | |||
[.32, .55] | [.42, .63] | [.17, .43] | [.21, .46] | [.58, .74] | [.22, .47] | [.24, .49] | [.10, .37] | [.41, .62] | [.41, .62] | [.21, .46] | [.29, .53] | |||||
14. Spatial c.s. | .26* | .31* | .51* | .32* | .32* | .72* | .29* | .45* | .42* | .39* | .46* | .39* | .43* | - | ||
[.13, .39] | [.17, .43] | [.40, .61] | [.19, .44] | [.18, .44] | [.64, .78] | [.16, .42] | [.33, .56] | [.30, .53] | [.26, .50] | [.34, .56] | [.26, .50] | [.31, .54] | ||||
15. Numerical upd. | .28* | .35* | .33* | .34* | .31* | .40* | .64* | .41* | .37* | .33* | .36* | .42* | .50* | .44* | - | |
[.14, .40] | [.22, .47] | [.20, .45] | [.21, .46] | [.18, .44] | [.27, .51] | [.54, .71] | [.29, .53] | [.24, .48] | [.20, .45] | [.23, .48] | [.29, .53] | [.39, .60] | [.32, .55] | |||
16. Spatial upd. | .27* | .28* | .48* | .44* | .28* | .41* | .33* | .73* | .30* | .35* | .51* | .46* | .29* | .46* | .42* | |
[.14, .40] | [.15, .41] | [.36, .58] | [.32, .55] | [.14, .40] | [.28, .52] | [.20, .46] | [.65, .79] | [.17, .42] | [.22, .47] | [.39, .61] | [.34, .56] | [.15, .41] | [.34, .56] | [.30, .53] |
Note. Ninety-five percent confidence intervals are presented in brackets. Correlations for which the Bayes factor suggested positive to strong evidence for the alternative hypothesis (BF10) are presented in bold; correlations for which the Bayes factor suggested positive to strong evidence for the null hypothesis (BF01) are presented in italics. S.s. = simple span; c.s. = complex span; upd. = updating. * p < .05.
Off peak | On peak | ||||||||||||||||
Session | Task | BF | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
Off peak | 1. Digit s.s. | BF10 | - | ||||||||||||||
BF01 | |||||||||||||||||
2. Letter s.s. | BF10 | 1.60 × 10^24 | - |
BF01 | 6.24 × 10^-25 |
3. Matrix s.s. | BF10 | 98.48 | 278.69 | - |
BF01 | 0.01 | 3.59 × 10^-3 |
4. Arrow s.s. | BF10 | 8.46 × 10^4 | 5.32 × 10^3 | 2.27 × 10^16 | - |
BF01 | 1.18 × 10^-5 | 1.88 × 10^-4 | 4.40 × 10^-17 |
5. Numerical c.s. | BF10 | 1.34 × 10^13 | 6.32 × 10^12 | 9.83 × 10^3 | 1.19 × 10^6 | - |
BF01 | 7.48 × 10^-14 | 1.58 × 10^-13 | 1.02 × 10^-4 | 8.37 × 10^-7 |
6. Spatial c.s. | BF10 | 2.74 × 10^3 | 2.88 × 10^4 | 2.79 × 10^13 | 1.40 × 10^6 | 2.10 × 10^8 | - |
BF01 | 3.64 × 10^-4 | 3.47 × 10^-5 | 3.58 × 10^-14 | 7.16 × 10^-7 | 4.76 × 10^-9 |
7. Numerical upd. | BF10 | 1.65 × 10^4 | 5.62 × 10^4 | 8.10 × 10^5 | 1.21 × 10^9 | 2.06 × 10^6 | 2.60 × 10^7 | - |
BF01 | 6.05 × 10^-5 | 1.78 × 10^-5 | 1.23 × 10^-6 | 8.29 × 10^-10 | 4.85 × 10^-7 | 3.84 × 10^-8 |
8. Spatial upd. | BF10 | 1.08 × 10^3 | 523.30 | 7.90 × 10^13 | 1.28 × 10^12 | 1.24 × 10^4 | 3.99 × 10^10 | 4.95 × 10^10 | - |
BF01 | 9.25 × 10^-4 | 1.91 × 10^-3 | 1.27 × 10^-14 | 7.83 × 10^-13 | 8.07 × 10^-5 | 2.51 × 10^-11 | 2.02 × 10^-11 |
On peak | 9. Digit s.s. | BF10 | 2.44 × 10^23 | 1.45 × 10^19 | 3.27 × 10^3 | 1.89 × 10^5 | 1.01 × 10^8 | 5.62 × 10^4 | 360.76 | 76.89 | - |
BF01 | 4.11 × 10^-24 | 6.91 × 10^-20 | 3.06 × 10^-4 | 5.28 × 10^-6 | 9.92 × 10^-9 | 1.78 × 10^-5 | 2.77 × 10^-3 | 0.01 |
10. Letter s.s. | BF10 | 8.73 × 10^16 | 1.87 × 10^24 | 135.18 | 2.56 × 10^3 | 1.09 × 10^10 | 510.81 | 68.38 | 188.91 | 6.20 × 10^21 | - |
BF01 | 1.15 × 10^-17 | 5.35 × 10^-25 | 7.40 × 10^-3 | 3.90 × 10^-4 | 9.15 × 10^-11 | 1.96 × 10^-3 | 0.01 | 5.29 × 10^-3 | 1.61 × 10^-22 |
11. Matrix s.s. | BF10 | 300.22 | 1.31 × 10^4 | 2.42 × 10^26 | 1.02 × 10^12 | 62.66 | 1.09 × 10^7 | 78.89 | 1.45 × 10^8 | 3.68 × 10^5 | 2.28 × 10^3 | - |
BF01 | 3.33 × 10^-3 | 7.61 × 10^-5 | 4.14 × 10^-27 | 9.81 × 10^-13 | 0.02 | 9.21 × 10^-8 | 0.01 | 6.88 × 10^-9 | 2.72 × 10^-6 | 4.38 × 10^-4 |
12. Arrow s.s. | BF10 | 456.42 | 81.46 | 2.21 × 10^15 | 3.02 × 10^25 | 1.92 × 10^3 | 3.11 × 10^5 | 4.15 × 10^5 | 1.08 × 10^10 | 3.05 × 10^6 | 2.45 × 10^3 | 4.09 × 10^13 | - |
BF01 | 2.19 × 10^-3 | 0.01 | 4.53 × 10^-16 | 3.31 × 10^-26 | 5.22 × 10^-4 | 3.22 × 10^-6 | 2.41 × 10^-6 | 9.29 × 10^-11 | 3.28 × 10^-7 | 4.08 × 10^-4 | 2.44 × 10^-14 |
13. Numerical c.s. | BF10 | 5.19 × 10^7 | 3.20 × 10^12 | 1.66 × 10^3 | 1.41 × 10^4 | 3.47 × 10^22 | 2.19 × 10^4 | 1.10 × 10^5 | 38.92 | 9.85 × 10^11 | 7.80 × 10^11 | 1.67 × 10^4 | 5.92 × 10^6 | - |
BF01 | 1.93 × 10^-8 | 3.13 × 10^-13 | 6.04 × 10^-4 | 7.08 × 10^-5 | 2.88 × 10^-23 | 4.58 × 10^-5 | 9.08 × 10^-6 | 0.03 | 1.02 × 10^-12 | 1.28 × 10^-12 | 5.98 × 10^-5 | 1.69 × 10^-7 |
14. Spatial c.s. | BF10 | 128.81 | 1.43 × 10^3 | 2.25 × 10^11 | 3.54 × 10^3 | 2.83 × 10^3 | 1.10 × 10^28 | 586.53 | 1.84 × 10^8 | 1.26 × 10^7 | 5.34 × 10^5 | 4.78 × 10^8 | 4.13 × 10^5 | 2.97 × 10^7 | - |
BF01 | 7.76 × 10^-3 | 6.99 × 10^-4 | 4.45 × 10^-12 | 2.83 × 10^-4 | 3.53 × 10^-4 | 9.13 × 10^-29 | 1.70 × 10^-3 | 5.43 × 10^-9 | 7.93 × 10^-8 | 1.87 × 10^-6 | 2.09 × 10^-9 | 2.42 × 10^-6 | 3.36 × 10^-8 |
15. Numerical upd. | BF10 | 274.86 | 2.55 × 10^4 | 6.34 × 10^3 | 1.20 × 10^4 | 2.25 × 10^3 | 1.30 × 10^6 | 9.09 × 10^19 | 5.23 × 10^6 | 9.16 × 10^4 | 5.93 × 10^3 | 5.68 × 10^4 | 6.39 × 10^6 | 6.40 × 10^10 | 6.35 × 10^7 | - |
BF01 | 3.64 × 10^-3 | 3.92 × 10^-5 | 1.58 × 10^-4 | 8.36 × 10^-5 | 4.45 × 10^-4 | 7.71 × 10^-7 | 1.10 × 10^-20 | 1.91 × 10^-7 | 1.09 × 10^-5 | 1.69 × 10^-4 | 1.76 × 10^-5 | 1.56 × 10^-7 | 1.56 × 10^-11 | 1.57 × 10^-8 |
16. Spatial upd. | BF10 | 225.33 | 335.53 | 2.66 × 10^9 | 5.67 × 10^7 | 267.41 | 3.02 × 10^6 | 9.14 × 10^3 | 1.85 × 10^29 | 959.14 | 3.78 × 10^4 | 1.24 × 10^11 | 3.55 × 10^8 | 467.77 | 4.35 × 10^8 | 1.17 × 10^7 |
BF01 | 4.44 × 10^-3 | 2.98 × 10^-3 | 3.76 × 10^-10 | 1.76 × 10^-8 | 3.74 × 10^-3 | 3.31 × 10^-7 | 1.09 × 10^-4 | 5.41 × 10^-30 | 1.04 × 10^-3 | 2.65 × 10^-5 | 8.04 × 10^-12 | 2.81 × 10^-9 | 2.14 × 10^-3 | 2.30 × 10^-9 | 8.57 × 10^-8 |
Note. S.s. = simple span; c.s. = complex span; upd. = updating.
Results at the individual-task level: Synchrony effect in short-term memory and working-memory tasks
Consistent with previous research (e.g., Ceglarek et al., 2021; Lewandowska et al., 2017; Rowe et al., 2009; Schmidt et al., 2015; West et al., 2002), we then investigated the synchrony effect at the individual-task level. This is displayed in Figure 1. Statistically, we compared short-term memory and working-memory performance between on- and off-peak sessions by computing a paired two-tailed t-test separately for each task. In addition, we computed Bayesian t-tests, which allowed us to quantify not only the strength of evidence for the alternative hypothesis (i.e., the presence of a synchrony effect) but also the strength of evidence for the null hypothesis (i.e., the absence of a synchrony effect). These results are presented in Table 8. As shown in this table, no synchrony effect was observed for most tasks. A significant synchrony effect was found only in the arrow simple span and the numerical updating task. However, the effect sizes for these tasks were small, and the Bayesian analyses suggested only weak to positive evidence in favor of the effect (see Table 8).
Task | t(190) | p | Cohen’s d | BF10 | BF01 |
Digit simple span | 0.01 | .988 | 0.00 | 0.08 | 12.37 |
Letter simple span | 0.96 | .338 | 0.07 | 0.13 | 7.86 |
Matrix simple span | 1.53 | .128 | 0.11 | 0.25 | 3.93 |
Arrow simple span | 2.75 | .007* | 0.20 | 3.13 | 0.32 |
Numerical complex span | 0.66 | .509 | 0.05 | 0.10 | 9.98 |
Spatial complex span | 0.65 | .514 | 0.05 | 0.10 | 10.03 |
Numerical updating | 2.38 | .018* | 0.17 | 1.27 | 0.79 |
Spatial updating | 1.15 | .252 | 0.08 | 0.15 | 6.47 |
Note. For the sake of clarity, results with a BF10 larger than 3 are presented in bold, whereas results with a BF01 larger than 3 are presented in italics. * p < .05.
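To illustrate the statistics reported in Table 8, the paired t statistic and Cohen's d (here dz, the within-subject variant) can be computed directly from the per-participant on-peak minus off-peak differences. The sketch below uses only the Python standard library; the Bayes factor shown is the rough BIC/unit-information approximation (cf. Wagenmakers, 2007), not the default-prior Bayesian t-test used in the analyses, so the values differ somewhat from those in the table.

```python
import math
from statistics import mean, stdev

def paired_t_summary(on_peak, off_peak):
    """Paired two-tailed t statistic and Cohen's dz for a synchrony
    contrast, plus a rough BIC-based Bayes factor in favor of the
    null (an approximation, not the Bayesian t-test in the paper)."""
    diffs = [a - b for a, b in zip(on_peak, off_peak)]
    n = len(diffs)
    m, s = mean(diffs), stdev(diffs)
    t = m / (s / math.sqrt(n))   # paired t with df = n - 1
    dz = m / s                   # within-subject effect size
    bf01 = math.sqrt(n) * (1 + t**2 / (n - 1)) ** (-n / 2)
    return t, dz, bf01
```

For instance, plugging the arrow simple span values (t = 2.75, n = 191) into the approximation yields a BF01 of about one third, close to the .32 reported in Table 8, which suggests the approximation tracks the reported Bayes factors reasonably well.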
Results at the latent-variable level: Synchrony effect in short-term maintenance and attentional control
The synchrony effect in short-term maintenance and attentional control was assessed by modeling a latent change between the on- and off-peak measurements of these constructs. A bifactor model was used to model short-term maintenance and attentional control. Overall, four latent-change models were thus estimated (see Figures 2 and 3). In each model, we took measure-specific variance across both sessions into account by allowing the error variances of the measures from the same task to correlate. Factor loadings were also constrained to be positive. Moreover, measurement invariance across time was modeled by applying equality constraints over off and on peaks for the factor loadings, the error variances, and the intercepts. Furthermore, we assessed the synchrony effect for each latent construct by modeling a latent change between the on-peak factor and the off-peak factor of the construct. That is, the on-peak factor was regressed on the off-peak factor with the unstandardized regression weight fixed to 1. The latent-change factor was then measured by fixing its unstandardized loading on the on-peak factor to 1. A covariance was also added between the off-peak factor and the latent-change factor to capture whether the latent change was dependent on, or proportional to, the scores at off peak.
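In equation form, this parameterization defines the latent change as the difference between the two peak-specific factors (our notation: η denotes a latent factor, Δ the latent-change factor):

```latex
% Fixing the regression weight and the change-factor loading to 1 gives
\eta_{\mathrm{on}} \;=\; 1\cdot\eta_{\mathrm{off}} \;+\; 1\cdot\Delta
\quad\Longleftrightarrow\quad
\Delta \;=\; \eta_{\mathrm{on}} - \eta_{\mathrm{off}},
\qquad
\mathbb{E}(\Delta) = \alpha_{\Delta},
\qquad
\operatorname{Cov}(\eta_{\mathrm{off}},\,\Delta)\ \text{freely estimated.}
```

A nonzero intercept α_Δ would indicate a synchrony effect at the latent level; the free covariance allows the amount of change to depend on the off-peak level.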
Following this strategy, we estimated the first model (Model 1) by modeling short-term maintenance as the common variance between short-term memory and working-memory measures, and attentional control as the working-memory variance remaining after controlling for short-term memory (see Engle et al., 1999). More precisely, all short-term memory and working-memory measures at off and on peaks were forced to load on an off-peak maintenance factor and an on-peak maintenance factor, respectively. In addition, all working-memory tasks at off and on peaks were forced to load on an off-peak attentional-control factor and an on-peak attentional-control factor, respectively. We assessed the synchrony effect for maintenance by modeling a latent change between the on-peak maintenance factor and the off-peak maintenance factor. Similarly, the synchrony effect for attentional control was assessed by modeling a latent change between the on-peak attentional-control factor and the off-peak attentional-control factor. This model is depicted in Figure 2a. However, it provided a poor fit to the data, KMO = .86, χ²(97, N = 191) = 266.44, p < .001, CFI = .91, RMSEA [90% CI] = .10 [.08, .11], SRMR = .09.
Short-term memory and working-memory tasks have been reported to involve different maintenance processes for verbal-numerical and spatial materials but common attentional-control processes (see, e.g., Kane et al., 2004). Accordingly, we fitted a second model (Model 2), which was similar to Model 1, except that separate off- and on-peak maintenance factors were modeled for each material type. This model is depicted in Figure 2b. It provided a good fit to the data, KMO = .86, χ²(87, N = 191) = 133.68, p = .001, CFI = .97, RMSEA [90% CI] = .05 [.03, .07], SRMR = .07. However, the model parameters were not fully identified because the covariance matrix was not positive definite.
Applying a model-reduction strategy, we fitted a third model (Model 3) in which we assumed no synchrony effect for short-term maintenance; the synchrony effect was modeled only for attentional control. Thus, this model was similar to Model 2, except that there was one maintenance factor for verbal-numerical material and one maintenance factor for spatial material across both on- and off-peak sessions. This model is depicted in Figure 3a. It provided an acceptable fit to the data, KMO = .86, χ²(101, N = 191) = 210.16, p < .001, CFI = .94, RMSEA [90% CI] = .08 [.06, .09], SRMR = .08. Inspection of the key parameter for the latent change showed no difference between off and on peaks, thus indicating no synchrony effect for attentional control (unstandardized intercept of the latent-change factor = .01, SE = .01, p = .493, 95% CI = [-.01, .03]). In this model, however, the construct reliability of the attentional-control factors was low (H = .42 and H = .39 for off and on peak, respectively). This indicates that the attentional-control factors did not represent much common variance across the measures. Therefore, this model has low explanatory power.
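The construct reliability index used here is coefficient H (maximal reliability; Hancock & Mueller, 2001), which can be computed from the standardized factor loadings. A minimal sketch (the helper name is ours, not from the paper):

```python
def coefficient_h(loadings):
    """Coefficient H from standardized factor loadings: the squared
    correlation between the latent factor and the optimal linear
    composite of its indicators. Approaches 1 as loadings approach 1."""
    s = sum(l * l / (1.0 - l * l) for l in loadings)
    return 1.0 / (1.0 + 1.0 / s)
```

With uniformly weak loadings (e.g., four indicators loading around .40), H stays below .50, which mirrors the low values (.42 and .39) observed for the attentional-control factors in Model 3.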
In the last model (Model 4), we followed previous work in which attentional control has been modeled as a common factor across all short-term memory and working-memory measures (see Kane et al., 2004). Thus, short-term memory and working-memory tasks at off and on peaks were forced to load on an off-peak general factor and an on-peak general factor, respectively. In addition, short-term memory and working-memory tasks with verbal-numerical material were forced to load on a verbal-numerical maintenance factor, whereas those with spatial material were forced to load on a spatial maintenance factor. Thus, similar to Model 3, the synchrony effect was modeled only for attentional control. This model is depicted in Figure 3b. It provided a good fit to the data, KMO = .86, χ²(94, N = 191) = 110.49, p = .118, CFI = .99, RMSEA [90% CI] = .03 [.00, .05], SRMR = .05. In this case, the construct reliability of all factors was good (all Hs ≥ .73). Inspection of the key parameter for the latent change showed no difference between off and on peaks, thus indicating no synchrony effect for attentional control (unstandardized intercept of the latent-change factor = .001, SE = .01, p = .937, 95% CI = [-.01, .01]).
Multiverse-analysis approach
In a final step, we tested the robustness of the results using a multiverse-analysis approach. To this end, we re-ran the analyses by applying different data transformations, participant-selection criteria, trimming procedures, and SEM approaches. The different procedures we used are described in Table 9. Please note that, as preregistered, we modeled the data using a second-order latent growth curve modeling approach. However, these model assessments never resulted in acceptable or good fit statistics. Therefore, we opted for the latent-change models, which are an extension of latent growth curve models focusing more directly on the difference between the two measurement time points (Ghisletta & McArdle, 2012).
Procedure | Description |
Data transformation | 1. raw accuracy rates |
2. arcsine-transformed accuracy rates | 
Participants’ selection | 1. all chronotypes a |
2. moderate and definite morning and evening types | |
3. definite morning and evening types | |
Trimming | 1. with missing data (but no more than 2 short-term or working-memory tasks were missing in each session) |
2. after removing missing data | |
3. after removing outliers in time-out responses (i.e., number of time-out responses smaller or larger than 3 SDs) in the processing parts of the complex spans | |
4. after removing multivariate outliers in all span tasks | |
5. after removing multivariate outliers in all span tasks and the processing parts of the complex spans | |
6. after removing outliers in depression (i.e., BDI-II score larger than 20) and stress (i.e., PSQ score larger or smaller than 3 SDs) | |
7. after removing outliers in depression (i.e., BDI-II score larger than 20) and stress (i.e., PSQ score larger or smaller than 3 SDs) as well as those participants reporting medication intake in the last 24 hours or alcohol, drug, caffeine, or nicotine intake in the last 2 hours before a session | 
SEM approach | 1. latent-change model without constraints |
2. latent-change model with the constraints of positive factor loadings | |
3. latent-change model with the constraints of positive error variances | |
4. latent-change model with the constraints of positive factor loadings and error variances | |
5. latent-change model without the covariance between the latent-change factors in case of bivariate latent-change model | |
6. latent-change model without the measures which were used to assess attentional control and which showed a significant synchrony effect at the individual-task level | |
7. second-order latent growth model |
Note. Structural equation modeling was performed only if the sample size was larger than 80 participants. The procedure presented in the main text is displayed in bold. SD = standard deviation; BDI-II = Beck Depression Inventory II (Hautzinger et al., 2006); PSQ = Perceived Stress Questionnaire (Fliege et al., 2005); SEM = structural equation modeling. a Participants with a score smaller than 50 were classified as evening types, whereas participants with a score larger than 50 were classified as morning types. Participants with a score of exactly 50 were classified as morning or evening types according to the question: “One hears about ‘morning types’ and ‘evening types’. Which one of these types do you consider yourself to be?” The possible responses were: “Definitely a morning type”, “Rather more a morning type than an evening type”, “Rather more an evening type than a morning type”, and “Definitely an evening type” (see Griefahn et al., 2001).
An overview of the results from the multiverse-analysis approach is presented in Figure 4. As shown in the upper part of the figure, no robust synchrony effect was observed at the individual-task level. At the latent-variable level, the results highlighted the difficulty of measuring short-term maintenance and attentional control as well-defined factors. When these constructs were modeled satisfactorily (i.e., with acceptable to good fit statistics and good construct reliability H), the synchrony effect was in most cases not significant. In the two cases showing a significant effect, the latent change was small (maximum value of the unstandardized intercept of the latent-change factor = 0.02).
Finally, for the sake of completeness and following previous work (e.g., Allen et al., 2008; Bonnefond et al., 2003; Fabbri et al., 2008; Lewandowska et al., 2017; Matchock & Mordkoff, 2008), we also investigated the time-of-day effect, that is, the difference in performance between the morning and evening sessions (independently of the subjective peak of circadian arousal). The results are summarized in Appendix B. Consistent with the results for the synchrony effect, they showed no robust time-of-day effect at either the individual-task level or the latent-variable level.
Discussion
Most individuals who are classified as morning types according to their morningness-eveningness preferences have their peak of circadian arousal in the morning and are assumed to show better cognitive performance in the morning than in the evening. Conversely, most individuals who are classified as evening types have their peak of circadian arousal in the evening and are assumed to show better cognitive performance in the evening than in the morning. This synchrony effect – that is, the observation of better cognitive performance at the peak of circadian arousal than at off peak – is well established as common knowledge. However, the empirical evidence is more equivocal. In the present study, we aimed to clarify this effect empirically. Specifically, we determined the scope and robustness of the synchrony effect by addressing the methodical challenges typically observed in previous research. Thus, we investigated the synchrony effect on short-term memory, working memory, and attentional control in a large sample of participants, who were tested at their on-peak and off-peak times. Following seminal research (see, e.g., Intons-Peterson et al., 1998; May, 1999; May et al., 1993), on- and off-peak times were determined using a questionnaire (Griefahn et al., 2001; Horne & Östberg, 1976). All participants performed four short-term memory tasks and four working-memory tasks. Attentional control was assessed at the latent-variable level as the goal-directed nature of the working-memory tasks without their maintenance aspects. Quite surprisingly, the results showed no evidence for a general and robust synchrony effect for any of the constructs we measured (i.e., short-term memory, working memory, and attentional control). Moreover, this pattern of results was confirmed when we applied different data transformations, participant-selection criteria, and trimming procedures.
At the individual-task level, the results showed no synchrony effect for most tasks. A synchrony effect was detected in only one of four short-term memory tasks (i.e., the arrow simple span) and one of four working-memory tasks (i.e., the numerical updating task), and even there the effect sizes were at best small. From a positive perspective, our results could be interpreted as indicating that the synchrony effect does exist and emerges in about 25% of cognitive tasks (i.e., 2 of our 8 tasks). From this perspective, and based on the assumption that a large portion of null findings are not published (e.g., Kühberger et al., 2014), our findings would be broadly consistent with previous research showing mixed evidence with regard to the synchrony effect on short-term memory and working memory (e.g., Rowe et al., 2009; Schmidt et al., 2015; West et al., 2002; but see Ceglarek et al., 2021; Heimola et al., 2021; Lewandowska et al., 2017). This interpretation would be valid if the circumstances under which a synchrony effect does and does not occur were clear. For example, the circumstances would have been clear if the synchrony effect had been observed in one task type (e.g., working-memory tasks) or – although theoretically less plausible – only in tasks with the same stimulus materials (e.g., the digit simple span in short-term memory and the numerical complex span in working memory). However, we observed a synchrony effect in two tasks (arrow simple span and numerical updating) of different types (short-term memory and working memory, respectively) and with different materials (spatial and numerical, respectively). These results therefore indicate no systematic pattern. Given that, our positive findings with regard to a synchrony effect on cognition are most likely coincidental. Therefore, the synchrony effect at the individual-task level is the exception rather than the norm.
At the latent-variable level, we also observed no evidence for a synchrony effect on cognition. Specifically, we used structural equation modeling to assess short-term maintenance and attentional control separately for on- and off-peak times, and we estimated the synchrony effect as a latent change between both peaks. Across the different analyses we ran, we found no systematic synchrony effect for short-term maintenance or attentional control. The reasons were multiple: (1) the models did not provide good fit statistics or were not fully identified; (2) the factors did not represent shared variance across the measures; (3) the latent synchrony effect was not significant. There were only two cases, each with a specific combination of data transformation, participant selection, and trimming procedure, in which the model provided good fit statistics, substantial common variance, and a significant – but small – latent synchrony effect. However, consistent with the results at the individual-task level, these two cases are the exception and most likely represent coincidental findings. Therefore, the systematic picture of the SEM approach unequivocally reveals no evidence for a general and robust synchrony effect on attentional control and short-term maintenance. Overall, the present results are mostly in line with the findings challenging the synchrony effect on cognition (e.g., Ceglarek et al., 2021; Heimola et al., 2021; Knight & Mather, 2013; Lewandowska et al., 2017; Li et al., 1998; Matchock & Mordkoff, 2008).
The results show no evidence for a general and robust synchrony effect: Is this conclusion warranted?
One may argue that our conclusion is not warranted because (a) we did not have sufficient power in the present study, (b) there was an unbalanced number of morning and evening types, and (c) there were differences in the recruitment and the testing of the participants. We next discuss each of these concerns in detail.
Sufficient power?
Because we argued that the small sample sizes typically used in previous research are an issue, a first concern may be whether we ourselves had sufficient power to detect a true synchrony effect. This is particularly important because we focused on the subsample of 191 participants with moderate to definite chronotypes; the analyses on the complete sample of 446 participants were reported in the multiverse-analysis approach.
For the analyses at the individual-task level, we can evaluate the adequacy of the sample size by using the recommendations put forward by Brysbaert (2019). According to these recommendations, a sample of 70 participants should be sufficient to identify an effect in a within-subject design using a t-test with an effect size of .40, a power of .90, and an alpha level of .05. If a Bayesian approach is used, a sample of 131 participants should be sufficient to detect a synchrony effect with a Bayes factor larger than 10 (see Table 9 on p. 27 of Brysbaert, 2019). Together, this indicates that with our sample of 191 participants, we had enough power to detect a synchrony effect in the present study.
The question is now whether we had enough power to obtain strong evidence against the synchrony effect. According to Brysbaert (2019), a sample of 1800 participants with moderate to definite chronotypes would have been necessary to find a Bayes factor larger than 10, thus indicating strong evidence against the synchrony effect. This is interesting for two reasons. First, it may explain why most Bayes factors in favor of the absence of the synchrony effect were larger than 3 but still smaller than 10 in the present study. Second, the requirement of 1800 participants with moderate to definite chronotypes highlights the difficulty of obtaining strong evidence against the synchrony effect. If we applied the same proportion of moderate to definite chronotypes as observed in the present study (i.e., 43%), the goal of testing 1800 participants with moderate to definite chronotypes would require the recruitment and testing of more than 4200 participants. In the current research field, even if an online study is used, this seems difficult to implement. Therefore, we must acknowledge that our sample size was not large enough to provide strong evidence against a synchrony effect. Nevertheless, the conditions needed to provide such evidence seem currently unrealistic. Moreover, the present study includes one of the largest sample sizes used so far (cf. Table 1), and the results at the individual-task level showed positive evidence against a synchrony effect, with a Bayes factor larger than 3 for most tasks. Together, this warrants our conclusion that the synchrony effect is not as robust and general as previously thought.
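The recruitment requirement is simple proportion arithmetic, sketched below (the function name is ours). Note that with the rounded proportion of 43% the result lands slightly below 4200; the figure of "more than 4200" presumably reflects the unrounded proportion.

```python
import math

def required_recruitment(n_target_group, proportion_in_group):
    """Total number of participants to recruit so that the expected
    number falling into the target group (here, moderate to definite
    chronotypes) reaches n_target_group."""
    return math.ceil(n_target_group / proportion_in_group)
```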
For the analyses at the latent-variable level, we applied the recent approach put forward by Bader et al. (2022) to determine the required sample size for bifactor models. We opted for this approach because it takes into account the complexity of the bifactor models. The details of the approach are presented in Appendix C. The results show that a sample size of 191 participants is sufficient for estimating all our models with a high rate of proper convergence. Moreover, accurate parameter estimations are expected for most models (in particular for Model 3 and Model 4 in which we observed good fit statistics). Together, this suggests that the power in the present study was sufficient in all analyses to warrant our conclusion.
Imbalance between morning and evening types: Is this an issue in the present study?
A second concern may be the imbalance between morning and evening types in the present study. Our results showed that 131 young adults were categorized as morning types, whereas 60 young adults were categorized as evening types. One may wonder whether this unbalanced number of morning and evening types is an issue for the statistical analyses. This is not the case, because we used a within-subject design. That is, because all participants were tested in the morning and in the evening, all participants were tested at both on-peak and off-peak times. Such a design allowed us to collapse the within-subject variables Session (morning vs. evening) and Chronotype (morning type vs. evening type) into a single within-subject variable Testing time (on peak vs. off peak). Thus, for each task, we were able to compute a paired t-test with the within-subject variable Testing time with the two levels on peak and off peak. The same applied to the latent-change model, which can be conceptualized as a paired t-test applied to latent constructs. Therefore, the imbalance between morning and evening types does not affect the conclusions of any of our analyses.
We nevertheless agree that it may seem unexpected to observe more young adults with a morning chronotype than with an evening chronotype. This contrasts with the typical finding reported in previous research, according to which young adults almost exclusively have an evening chronotype (see, e.g., May et al., 1993; May & Hasher, 1998b). However, a close inspection of the normative data reported by Griefahn et al. (2001, see their first figure) shows a rather balanced number of morning and evening types among young adults. Moreover, we found a balanced number of morning and evening types in young adults in two datasets (see Rothen & Meier, 2016; Rothen, 2023). In an unpublished dataset (Rothen, 2015), we even observed more morning types than evening types in young adults, which is in line with the present results. For the sake of transparency, Appendix D presents figures displaying the number of morning, neutral, and evening types as a function of age for each of these datasets. Critically, this discrepancy in the findings can be explained. In contrast to previous research, in which American college students were tested, all our datasets – including the present one – comprise young European adults from the general population. Thus, the previous finding of more evening types in young adults might have been biased by testing a very specific population with very specific habits. This explanation is so far purely speculative and calls for further research to test it directly and thoroughly.
Different recruitment and testing procedures?
A third concern may be whether we used different recruitment and testing procedures than previous research. For example, a quick look at Table 2 might suggest that the recruitment and testing did not work well because of errors made by the students who recruited and tested the participants. According to this table, 65 participants did not complete the experiment at the correct time, six participants did the tasks in the wrong order, and 69 participants did not perform all tasks. However, these issues cannot be attributed solely to students’ errors. For example, because the study was performed online and each participant started the experiment on their own, it is possible that some participants did not recognize the importance of being tested at the specified time, although this was carefully explained by the students. Similarly, most participants who did the tasks in the wrong order had entered a wrong participant number in one of the sessions. Because the order of the tasks was determined by the participant number, this typo resulted in a different task order across the sessions. Thus, this issue was rather the result of a participant’s mistake or misunderstanding. Finally, participants were excluded if at least one task was missing. There might be several reasons for a missing task. For example, a crash could have occurred, leading to the exclusion of the task. We also excluded a task if a break longer than 3 minutes occurred during its execution. We are not aware of any online studies controlling for such fine-grained experimental settings. Together, all these exclusion criteria make our online study more similar to laboratory settings. Moreover, in all tasks, we observed performance similar to previous research (e.g., in the mean, reliability, and correlation estimates; cf. Kane et al., 2004; Rey-Mermet et al., 2019).
This suggests that the quality of our data is similar to the quality observed in previous studies using in-lab testing. Therefore, the testing procedure cannot account for the discrepancy between our results and previous research.
Open questions and next steps
The present results showed no evidence for a general and robust synchrony effect in young adults, neither for short-term memory and working-memory tasks nor for the constructs of short-term maintenance and attentional control. However, some questions remain. For example, previous research emphasizing the synchrony effect has frequently compared performance in young and older adults (e.g., Intons-Peterson et al., 1998, 1999; Knight & Mather, 2013; Li et al., 1998; May, 1999; May et al., 2005; May & Hasher, 1998b; Rowe et al., 2009). Therefore, the synchrony effect might be more general and robust in older adults. The present results, however, call for caution when designing such studies: a design should be chosen in which older adults are tested at both on- and off-peak times (see Rothen & Meier, 2016, for an example), so that the critical comparison testing the synchrony effect is made within the same group of older adults. In addition to increasing the statistical power, this has the advantage that the synchrony effect would not be affected by performance differences between young and older adults caused, for example, by the general slowing typically observed in older adults or by different speed-accuracy trade-offs (see, e.g., Salthouse, 1979, 1996).
Another open question concerns the impact of moderators on the synchrony effect, such as sleep-wake history (see, e.g., Dijk & von Schantz, 2005). In the present study, we followed previous research and did not explicitly control for prior sleep-wake history. The reason is that such control might have introduced a bias in the recruitment and testing of the participants, so that the sample would no longer have been recruited and tested as described in the seminal studies. Therefore, whereas the present results indicate no robust and general synchrony effect for a population with no specific sleep-wake history, they do not preclude a synchrony effect in populations with a disturbed sleep-wake history, such as shift workers with off-peak working schedules. Future studies should thus be designed to determine to what extent prior sleep-wake history, in particular a disturbed one, affects the synchrony effect.
The present results show no evidence for a robust and general synchrony effect when the chronotype was assessed using the Morningness-Eveningness Questionnaire (i.e., the D-MEQ; Griefahn et al., 2001). We opted for this method because we followed previous seminal studies (see, e.g., Intons-Peterson et al., 1998; May, 1999; May et al., 1993). However, these results do not preclude an impact of circadian arousal on cognition when other methods are used. For example, such an impact has been reported when participants were tested using a constant routine protocol or a forced desynchronization protocol (see Schmidt et al., 2007; Valdez, 2018, for reviews). In the constant routine protocol, core body temperature, melatonin, cortisol, and cognitive performance are measured at regular intervals for a minimum of 24 hours, and participants are asked to stay awake with reduced motor activity. In a forced desynchronization protocol, participants are asked to adjust their sleep-wake cycle to, for example, a 28-h period, and cognitive performance is measured at different times of this period. When these methods are used, the results indicate an impact of circadian arousal on cognitive constructs, such as attention and working memory (see, e.g., Ramírez et al., 2006; Valdez et al., 2005). Therefore, these methods seem more appropriate to detect a true impact of circadian arousal on cognition.
Conclusion
More generally, the results of the present study convey an important message that applies to psychological science and scientific research more widely: reliable and robust insights into cognitive processes result from well-powered experimental designs in which several tasks measure the same underlying constructs. Had we conducted our study with only one of the two tasks that revealed positive findings, we would have come to the wrong conclusion that the synchrony effect exists. Moreover, given our large sample, the within-subject design, and the reliability of our measures, we would have interpreted the findings as very robust and representative. As our results show no evidence for a general and robust synchrony effect across the different tasks and the different levels of analysis (individual-task vs. latent-variable level), such an interpretation is not warranted. Given the overall picture of our results, we must conclude that the synchrony effect is not as robust and general as previously thought.
Contributions
Contributed to conception and design: ARM, NR
Contributed to acquisition of data: ARM
Contributed to analysis of data: ARM
Contributed to interpretation of data: ARM, NR
Drafted and/or revised the article: ARM, NR
Approved the submitted version for publication: ARM, NR
Acknowledgements
We thank Elisabeth Schoch, Stefanie Tangeten, and Daniel Fitze as well as the students who took part in the course M08 in fall term 2020 and in the course M1 in spring term 2021 for their help in data collection. We also thank Niels Kempkens for testing the reproducibility of the analyses.
Funding information
ARM is currently supported by a grant from the Swiss National Science Foundation (Grant 100014_207865). NR is currently supported by a grant from the Swiss National Science Foundation (Grant 10001CM_204314).
Competing interests
We have no known conflicts of interest to disclose.
Data accessibility statement
All deidentified data, experiment codes, research materials, analysis codes, and results are publicly accessible on the Open Science Framework (OSF) at https://osf.io/ngfxv. This study’s design and its analysis were pre-registered on OSF. The preregistration can be accessed at https://osf.io/tywu7.
Appendices
Appendix A. Deviations from the preregistration
In the present study, there are a few deviations from the preregistration. These are listed below.
Changes in the terminology
In the preregistration, we used the term “interference control”. However, to be more in line with recent research (von Bastian et al., 2020), we opted for the term “attentional control” in the present study.
Furthermore, in the preregistration, we used the term “extreme groups” regarding the chronotypes. This terminology may refer to the moderate to definite chronotypes or only the definite chronotypes. To avoid any confusion, we avoided “extreme groups” in the present study by referring either to moderate to definite chronotypes or to definite chronotypes.
Number of hypotheses for the construct “maintenance”
In the preregistration, we formulated only one hypothesis for the construct “maintenance” due to the state of research at that time. Because more recent research has shown that a synchrony effect is difficult to find for short-term memory and working-memory tasks (see, e.g., Ceglarek et al., 2021; Heimola et al., 2021), we formulated two contradictory hypotheses for this construct in the present study.
Recruitment in courses M08 and M1
In the preregistration, participants were planned to be recruited by the students from UniDistance Suisse who took part in the course M08. Because the target sample size was not reached after the course M08, participants were also recruited during the course M1.
Duration between sessions 2 and 3
In the preregistration, Sessions 2 and 3 were planned to be separated by maximally one week. During testing, this duration was extended if the participant could not perform the session as planned (e.g., because of illness).
Sample size
In the preregistration, the target sample size was 453 participants. After applying the inclusion criteria described in Table 2, we came close to this target, with a final sample of 446 participants. Please note that in the present study, we mainly report the analyses on the subsample including the participants with moderate to definite morning and evening types (N = 191). The reason is that we aim to be consistent with the seminal research published by May and colleagues (see, e.g., Intons-Peterson et al., 1998; May, 1999; May et al., 1993). The analyses on the full sample including all chronotypes (N = 446) are presented as part of the multiverse-analysis approach.
Statistical power
In the preregistration, the statistical power was described as being set to .95. This was a typo: a power of .90 was used to determine the sample size.
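To illustrate why this distinction matters, the following sketch shows how the required sample size for a simple paired t-test shifts between a power of .90 and .95. This is illustrative only: the study's actual sample size was derived from SEM simulations (Appendix C), and the assumed effect size of d = 0.2 is our own choice.

```python
# Illustrative only: required N for a two-sided paired t-test at
# power .90 vs. .95, using the noncentral t distribution.
# The effect size d = 0.2 is an assumed small within-subject effect.
import numpy as np
from scipy import stats

def required_n(d, power, alpha=0.05):
    """Smallest n reaching at least `power` for a two-sided paired t-test."""
    n = 2
    while True:
        df = n - 1
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        nc = d * np.sqrt(n)  # noncentrality parameter
        achieved = (1 - stats.nct.cdf(t_crit, df, nc)
                    + stats.nct.cdf(-t_crit, df, nc))
        if achieved >= power:
            return n
        n += 1

n_90 = required_n(0.2, 0.90)
n_95 = required_n(0.2, 0.95)
print(n_90, n_95)  # power .90 requires fewer participants than .95
```

The gap between the two targets shows that the typo, had it reflected the actual computation, would have implied a noticeably larger sample.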
Dependent measures
In the preregistration, the dependent measures were error rates. For the sake of simplicity, we used accuracy rates (= 1 - error rates) in the present study.
Furthermore, in the preregistration, we used the term “partial-credit unit scoring method” to refer to the computation of the dependent measures as the proportion of memoranda recalled at the correct position. This was an error because our computation of the dependent measures corresponds to the partial-credit load scoring method. This was corrected in the present study.
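The difference between the two scoring methods can be made concrete with a small sketch. The function names and example trials below are illustrative; the paper uses the load scoring variant, in which longer lists contribute more to the final score.

```python
# Hedged sketch of partial-credit unit vs. load scoring for span tasks.
# Each trial is (items recalled in the correct position, list length).
def partial_credit_unit(trials):
    """Average, over trials, of the proportion correct per trial
    (each trial weighted equally)."""
    props = [correct / total for correct, total in trials]
    return sum(props) / len(trials)

def partial_credit_load(trials):
    """Total items recalled in the correct position divided by the
    total number of items, so longer lists weigh more heavily."""
    total_correct = sum(correct for correct, _ in trials)
    total_items = sum(total for _, total in trials)
    return total_correct / total_items

# Two trials: 2/2 on a short list, 2/4 on a long list.
trials = [(2, 2), (2, 4)]
print(partial_credit_unit(trials))  # 0.75
print(partial_credit_load(trials))  # 0.666...
```

With identical raw responses, the two methods diverge whenever performance varies with list length, which is why the terminological correction matters.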
Reliability of the measures
Reliability was planned to be calculated by adjusting split-half correlations with the Spearman–Brown prophecy formula. To obtain more stable estimates, we instead computed permutation-based split-half reliability estimates, again adjusted with the Spearman–Brown prophecy formula (Parsons et al., 2019).
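The logic of this procedure can be sketched as follows. The original analyses were run in R; this Python version with simulated data and an arbitrary number of permutations is an illustration, not the paper's code.

```python
# Minimal sketch of permutation-based split-half reliability with the
# Spearman-Brown correction (cf. Parsons et al., 2019). The simulated
# data and the number of permutations are assumptions for illustration.
import numpy as np

def spearman_brown(r):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r / (1 + r)

def permutation_split_half(scores, n_perm=5000, seed=0):
    """scores: participants x trials matrix of trial-level scores."""
    rng = np.random.default_rng(seed)
    n_trials = scores.shape[1]
    estimates = []
    for _ in range(n_perm):
        idx = rng.permutation(n_trials)  # random split into two halves
        half1 = scores[:, idx[:n_trials // 2]].mean(axis=1)
        half2 = scores[:, idx[n_trials // 2:]].mean(axis=1)
        r = np.corrcoef(half1, half2)[0, 1]
        estimates.append(spearman_brown(r))
    return float(np.mean(estimates))

# Simulated data: a stable person-level ability plus trial noise.
rng = np.random.default_rng(1)
ability = rng.normal(0, 1, size=(200, 1))
scores = ability + rng.normal(0, 1, size=(200, 40))
print(round(permutation_split_half(scores, n_perm=200), 2))
```

Averaging over many random splits removes the dependence of a single split-half estimate on one arbitrary partition of the trials, which is what makes the permutation-based estimate more stable.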
Zero-order correlations
The upper and lower confidence intervals of the correlations were planned to be computed from a bootstrapping procedure with 10,000 random samples. We simplified the analyses by using the R-package psych (Version 2.2.9; Revelle, 2021) and by reporting the upper and lower confidence intervals computed with this package.
Furthermore, in the preregistration, the Bayes factors for the correlations were planned to be estimated using the BayesMed package (Nuijten et al., 2014) with default prior scales. Because this package could no longer be used, the Bayes factors were estimated using the R-package BayesFactor (Version 0.9.12.4.4; Morey & Rouder, 2021) with default prior scales.
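The originally preregistered bootstrap procedure for the correlation confidence intervals can be sketched as below. The simulated variables are assumptions for illustration; the actual analyses used the psych package in R.

```python
# Sketch of a percentile-bootstrap confidence interval for a Pearson
# correlation (the preregistration specified 10,000 resamples).
import numpy as np

def bootstrap_corr_ci(x, y, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a Pearson correlation."""
    rng = np.random.default_rng(seed)
    n = len(x)
    rs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample pairs with replacement
        rs[i] = np.corrcoef(x[idx], y[idx])[0, 1]
    lo, hi = np.quantile(rs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Simulated example: two moderately correlated variables, N = 191.
rng = np.random.default_rng(2)
x = rng.normal(size=191)
y = 0.5 * x + rng.normal(size=191)
lo, hi = bootstrap_corr_ci(x, y, n_boot=2000)
print(round(lo, 2), round(hi, 2))
```

Resampling participant pairs (rather than variables independently) preserves the dependence structure, which is the key design choice in bootstrapping correlations.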
Fit measures
In the preregistration, we forgot to mention that RMSEA values are less preferable when the sample size includes less than 250 participants (Hu & Bentler, 1998). The information was added in the present study.
Furthermore, in the preregistration, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were introduced as inference criteria. The goal of these fit measures is to compare two models. Because our results did not allow us to perform such model comparisons, these fit measures were not introduced in the present study. For the same reason, we did not perform χ²-difference (Δχ²) tests on nested models or the Bayesian hypothesis test with the BIC approximation (Wagenmakers, 2007).
Factor reliability
In the preregistration, the factor reliability was planned to be assessed using coefficient ω. Because this coefficient cannot be computed for all models we estimated, we computed the index H instead, which could be computed for all models.
Measurement models
In the preregistration, three measurement models – that is, Model 1, Model 2, and Model 3 – were introduced. These models were estimated. The results are available on OSF.
Second-order latent growth curve modeling approach
In the preregistration, analyses using second-order latent growth curve models were introduced as main analyses. Because these model assessments never resulted in acceptable or good fit statistics, we opted for another approach with the estimation of latent-change models.
Appendix B. Time-of-day effects
We investigated the time-of-day effect, that is, the difference in performance between the morning and evening sessions. Thus, irrespective of the chronotype, performance in the morning was compared to performance in the evening. Consistent with the analyses on the synchrony effect, the analyses were first performed at the individual-task level and then at the latent-variable level.
At the individual-task level, the descriptive results are displayed in Figure B1, and the results from the null-hypothesis significance testing (NHST) and Bayesian approaches are described in Table B1. For most tasks, no time-of-day effect was observed. Only for the digit simple span and the numerical updating task was the Bayesian evidence inconclusive regarding the presence or absence of the effect. However, the effect sizes for these tasks were small.
Task | t(190) | p | Cohen’s d | BF10 | BF01 |
Digit simple span | 1.77 | .079 | 0.13 | 0.37 | 2.70 |
Letter simple span | 0.60 | .546 | 0.04 | 0.10 | 10.34 |
Matrix simple span | 0.90 | .368 | 0.07 | 0.12 | 8.29 |
Arrow simple span | 1.01 | .313 | 0.07 | 0.13 | 7.49 |
Numerical complex span | 0.27 | .789 | 0.02 | 0.08 | 11.94 |
Spatial complex span | -0.48 | .632 | 0.03 | 0.09 | 11.05 |
Numerical updating | 2.00 | .047* | 0.14 | 0.57 | 1.76 |
Spatial updating | 1.22 | .226 | 0.09 | 0.17 | 5.99 |
Note. For the sake of clarity, results with a BF10 larger than 3 are presented in bold, whereas results with a BF01 larger than 3 are presented in italics. * p < .05.
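The paired comparison behind each row of Table B1 can be sketched as follows. The original analyses were run in R; this Python version with simulated data (no true time-of-day effect) is an illustration only.

```python
# Sketch of one row of Table B1: a paired t-test plus a within-subject
# Cohen's d (mean difference divided by the SD of the differences).
import numpy as np
from scipy import stats

def paired_test(morning, evening):
    t, p = stats.ttest_rel(morning, evening)
    diff = morning - evening
    d = diff.mean() / diff.std(ddof=1)  # Cohen's d for paired data
    return t, p, d

# Simulated accuracy rates for N = 191 with no true time-of-day effect.
rng = np.random.default_rng(3)
evening = rng.normal(0.8, 0.1, size=191)
morning = evening + rng.normal(0, 0.05, size=191)
t, p, d = paired_test(morning, evening)
print(round(t, 2), round(p, 3), round(d, 2))
```

For paired data, t and d are linked by t = d·√n, so the small effect sizes in Table B1 follow directly from the small t values at n = 191.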
At the latent-variable level, the results are presented in Figure B2. Consistent with the results on the synchrony effect, maintenance and attentional control were difficult to measure at the latent-variable level. When they were modeled satisfactorily (i.e., with acceptable to good fit statistics and good indices H), the time-of-day effect was in most cases not significant. When it was significant, the latent change was small (maximum value for the unstandardized intercept of the latent-change factor = 0.02).
Appendix C. Sample size requirements
We estimated the adequacy of our sample sizes by applying the approach from Bader et al. (2022) using the R package “simsem” (Pornprasertmanit et al., 2021). For the sake of completeness, we applied this approach to the subsample of 191 participants with moderate to definite chronotypes as well as to the full sample of 446 participants with all chronotypes. We computed all four models (i.e., Model 1, Model 2, Model 3, and Model 4). The factor loadings used to determine the target sample sizes were estimated from the loadings reported by Kane et al. (2004, see Figure 6 on p. 206). We used these factor loadings because the design used by Kane et al. (2004) was similar to the present study: in both studies, a sample of young adults was asked to perform several short-term memory and working-memory tasks, and a partial-credit scoring procedure was used to compute the dependent variable of each task. According to Kane et al. (2004), one critical feature supporting the view that attentional control was extracted from the working-memory tasks in their model was that the magnitude of the factor loadings differed depending on the type of task (short-term memory vs. working memory) and the type of factor (attentional-control factor vs. maintenance factors). That is, the working-memory measures had higher loadings on the attentional-control factor than the short-term memory measures did, whereas the short-term memory measures had higher loadings on the maintenance factors than the working-memory measures did. To reflect this feature in the selection of the factor loadings for our sample-size computations, we computed the median factor loadings separately for each task type (working memory vs. short-term memory) and each factor type (attentional control vs. maintenance).
Accordingly, we selected loadings of .5 and .8 for the short-term memory and working-memory measures on the attentional-control factor, respectively, and loadings of .7 and .3 for the short-term memory and working-memory measures on the maintenance factors, respectively.
Model | Sample size | Convergence rate | Coverage range | Relative bias range | Relative SE bias range | Average ECV bias | Relative ECV bias |
Model 1 | 191 | 100.00 | 0.93 - 0.96 | 0 - 0.01 | 0 - 0.06 | 0.001 | 0.006 |
Model 2 | 191 | 96.35 | 0.91 - 0.96 | 0 - 0.02 | 0.01 - 0.24 | -0.003 | -0.013 |
Model 3 | 191 | 100.00 | 0.93 - 0.96 | 0 - 0.02 | 0 - 0.07 | 0.001 | 0.003 |
Model 4 | 191 | 99.80 | 0.93 - 0.96 | 0 - 0.02 | 0 - 0.06 | 0 | 0.001 |
Model 1 | 446 | 100.00 | 0.93 - 0.97 | 0 - 0.02 | 0 - 0.06 | 0 | 0.001 |
Model 2 | 446 | 100.00 | 0.93 - 0.96 | 0 - 0.01 | 0 - 0.07 | -0.002 | -0.006 |
Model 3 | 446 | 100.00 | 0.93 - 0.96 | 0 - 0.01 | 0 - 0.08 | 0 | 0.001 |
Model 4 | 446 | 100.00 | 0.93 - 0.96 | 0 - 0.01 | 0 - 0.1 | 0 | 0.001 |
Note. The ranges for the relative bias in the factor loadings and their standard errors are given in absolute values. SE = standard error; ECV = Explained common variance.
Bader et al. (2022) put forward various criteria to evaluate whether a sample size is acceptable. First, the rate of proper convergence should be larger than 90%. Second, the coverage – that is, the proportion of solutions in which the 95% confidence interval covered the true population value – should be close to .95. Third, the relative bias in the factor loadings and their estimated standard errors should be taken into account. The absolute values of these parameters are considered negligible if they are smaller than .05, moderate if they range between .05 and .10, and strongly biased – and thus unacceptable – if they are larger than .10. Finally, the explained common variance (ECV) is computed to assess the proportion of common variance in the dependent variables accounted for by the general factor compared with the specific factors. Here, the goal is to obtain accurate ECV estimates, that is, average and relative ECV biases as small as possible (e.g., absolute values smaller than .03).
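These criteria can be collected into a small checker, sketched below. The function name is our own, the coverage window is an assumed operationalization of "close to .95", and the example values are taken from Table C1 (Model 2, N = 191).

```python
# Hedged sketch of the Bader et al. (2022) adequacy criteria described
# above. The coverage window (.91-.98) is an assumed operationalization
# of "close to .95"; the other thresholds follow the text.
def check_adequacy(convergence, coverage_min, coverage_max,
                   max_loading_bias, max_se_bias, avg_ecv_bias):
    issues = []
    if convergence <= 90:
        issues.append("convergence rate below 90%")
    if not (0.91 <= coverage_min and coverage_max <= 0.98):
        issues.append("coverage far from .95")
    if max_loading_bias > 0.10:
        issues.append("factor-loading bias unacceptable (> .10)")
    if max_se_bias > 0.10:
        issues.append("SE bias unacceptable (> .10)")
    if abs(avg_ecv_bias) >= 0.03:
        issues.append("ECV bias too large")
    return issues

# Model 2 with N = 191: converges, but the SE bias of .24 exceeds .10.
print(check_adequacy(96.35, 0.91, 0.96, 0.02, 0.24, -0.003))
```

Applied to the values in Table C1, only Model 2 with the subsample of 191 participants trips a criterion, which matches the exception discussed below.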
The results are summarized in Table C1. As presented in this table, the different criteria were fulfilled for nearly all models with both sample sizes, thus indicating acceptable sample sizes. There was only one exception: for Model 2 with the subsample of 191 participants, the rate of proper convergence was larger than 90%, indicating an acceptable convergence rate, but the minimum coverage deviated slightly from the recommended value of .95 in comparison to the other computations. Moreover, the estimated standard errors of the factor loadings were strongly biased. In particular, they were underestimated (-0.24), suggesting that significant effects may be overestimated (Muthén & Muthén, 2002). However, the estimation of this model with our data revealed that the covariance matrix was not positive definite, thus indicating no proper convergence. Thus, the issue with this model lies not in the accurate estimation of the parameters and their standard errors but in the proper convergence of the model. Together, this suggests that our sample sizes were sufficient to estimate all models with a high rate of convergence and to obtain accurate parameter estimates in Models 1, 3, and 4 (with Model 2 as an exception).
Appendix D. Chronoscore as a function of age
In our lab, we have collected two datasets in which we observed a balanced number of morning and evening types in young adults (see Rothen, 2023; Rothen & Meier, 2016). We have also collected two other datasets in which we observed more morning types than evening types in young adults (i.e., Rothen, 2015, and the dataset reported in the present study). The first three datasets were collected in the same way: each dataset was collected in the context of a research methods class at the University of Bern, in which second-year psychology students were asked to recruit and test 16 participants each. All participants completed an online version of the Morningness-Eveningness Questionnaire (D-MEQ; Griefahn et al., 2001). Then, they were assigned to the experimental condition either in the morning (between 6:00 and 10:00) or in the evening (between 17:00 and 21:00). For the last dataset (i.e., the dataset from the present study), there were only a few modifications. First, the complete experiment was performed online. Second, students from UniDistance Suisse were asked to recruit eight participants each. Third, the testing times ranged from 07:30 to 10:00 for the morning session and from 16:30 to 19:00 for the evening session.
Scores from the D-MEQ are presented as a function of age in Figures D1 to D4 for the four datasets, respectively. As shown in these figures, the correlation between D-MEQ scores and age was negligible for each dataset. Moreover, Bayes factors (BFs) suggest positive evidence against the correlation in each case. Together, these results challenge the view that young adults almost exclusively have an evening chronotype (see, e.g., May et al., 1993; May & Hasher, 1998b).