Ono and Kitazawa (2010) found that the time interval immediately before a fast auditory flutter was perceived to be shorter than the time interval just before a slow auditory flutter, terming it the subsequent flutter effect. In contrast, conceptual replication studies suggested that this phenomenon is unlikely to replicate. A direct replication of the experiment of Ono and Kitazawa (2010) was performed along with three additional experiments to determine why the subsequent flutter effect was not replicated by the previous conceptual replications. The results indicate that the presence or absence of a control condition in which a flutter is not presented within the same block, as well as the time range within which participants should direct attention, is important for the reproducibility of the subsequent flutter effect.
It is known that our perception of time can be longer or shorter than its actual length, depending on various factors, even though the physical length of time remains the same. For example, the presentation of repetitive tone stimuli or visual flickers has been shown to lengthen the perception of the following time interval (Droit-Volet & Wearden, 2002; Penton-Voak et al., 1996; Treisman et al., 1990; Wearden, 1999). To explain these changes in time perception, researchers have generally hypothesized that repetitive stimulation accelerates some kind of internal pacemaker (Creelman, 1962; Treisman, 1963). The acceleration of this pacemaker causes an increase in pulse counts over a given period, thereby resulting in a lengthening of the perceived time. The scalar expectancy theory (SET; Gibbon et al., 1984), a widely used time perception model, employs the pacemaker framework. According to SET, temporal processing consists of three major components: a clock process comprising a pacemaker and accumulator, a memory process comprising short-term and reference-memory stores, and a comparator process that makes decisions.
Several studies have investigated the effect of repetitive stimuli presented just “before” a time interval, providing evidence for a pacemaker-accumulator clock of the type proposed by SET (e.g., Droit-Volet & Wearden, 2002; Penton-Voak et al., 1996). Ono and Kitazawa (2010) investigated whether repetitive tone stimuli presented “after” a time interval altered the perception of the preceding time interval. Because the repetitive tones are presented immediately after the interval to be evaluated, they may affect the memory and judgment processes of the SET model. Each trial in their experiment consisted of a reference interval and a test interval. Each interval was defined by a first tone, followed by a silent period and a second tone. Following the test interval, a slow or fast series of tones (auditory flutter) was presented. Participants were asked to judge whether the test interval was longer or shorter than the reference interval. The results showed that the test interval just before a fast flutter was perceived as shorter than the test interval immediately before a slow flutter (hereinafter, the subsequent flutter effect). This is an important discovery because it indicates the possibility of retrospective processing in time perception.
However, in a conceptual replication of the study by Ono and Kitazawa (2010), Repp et al. (2013) and Masuda and Shirai (2021) failed to replicate the subsequent flutter effect, indicating that the conditions for replicating the effect may be limited. The experiment of Repp et al. (2013) differed from that of Ono and Kitazawa (2010) in many respects; however, the experiment of Masuda and Shirai (2021) was considerably similar to that of Ono and Kitazawa (2010). The one aspect in common between the studies by Repp et al. (2013) and Masuda and Shirai (2021), and different from Ono and Kitazawa (2010), was the presence or absence of a control condition. Specifically, Ono and Kitazawa (2010) compared subsequent flutters that were fast (25 Hz) and slow (5 Hz), whereas Repp et al. (2013) and Masuda and Shirai (2021) used a control condition with no subsequent flutter. Therefore, in the present study, the author directly replicated the work of Ono and Kitazawa (2010) by conducting verification experiments focusing on the presence or absence of a control condition to determine why Repp et al. (2013) and Masuda and Shirai (2021) were unable to replicate the subsequent flutter effect.
Experiment 1
Experiment 1 was a direct replication of Ono and Kitazawa (2010). A fast (25 Hz) or slow flutter (5 Hz) was presented immediately after a time interval delimited by short tones. If the findings of Ono and Kitazawa (2010) are replicated, the time interval immediately before a fast flutter should be perceived as shorter than that immediately before a slow flutter.
Method
Participants
The sample size was determined based on a power analysis. Because the effect size was large in Experiment 1 of Ono and Kitazawa (2010) (Cohen’s f = 1.333), we adopted a large effect size (Cohen’s f = 0.4) for the power analysis in this study. The author calculated the sample size needed to detect a significant main effect in a two-way analysis of variance (ANOVA) using G*power (Cohen’s f = 0.4, α = .05, 1-β = .95; Faul et al., 2007, 2009). According to the required sample size, 10 participants aged 20–23 years with normal hearing participated. The sample size and analyses of the study were registered on OSF (https://osf.io/bv9ds/). The study was approved by the Yamaguchi University Ethics Committee, and written informed consent was obtained from all participants.
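The effect sizes in this report appear both as Cohen's f and as partial eta squared, and the two are related by the standard conversion f = sqrt(η² / (1 − η²)). The following minimal Python sketch applies that conversion to the values quoted above. The function names are hypothetical, and the sketch does not reproduce the G*Power sample size calculation itself, whose repeated-measures settings are not stated in the text.

```python
import math


def f_from_eta_squared(eta_sq: float) -> float:
    """Cohen's f from (partial) eta squared: f = sqrt(eta^2 / (1 - eta^2))."""
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta squared must lie in [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


def eta_squared_from_f(f: float) -> float:
    """(Partial) eta squared from Cohen's f: eta^2 = f^2 / (1 + f^2)."""
    if f < 0.0:
        raise ValueError("Cohen's f must be non-negative")
    return f * f / (1.0 + f * f)


if __name__ == "__main__":
    # f = 1.333 (Ono & Kitazawa, 2010, Experiment 1) as eta squared
    print(round(eta_squared_from_f(1.333), 3))  # 0.64
    # f = 0.4 (value adopted for the power analysis) as eta squared
    print(round(eta_squared_from_f(0.4), 3))    # 0.138
```

On this scale, the adopted f = 0.4 corresponds to a considerably smaller proportion of explained variance than the f = 1.333 of the original study, which makes the power analysis conservative.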
Apparatus and Stimuli
Participants wearing headphones responded using a keyboard in a quiet, dimly lit room. Psychophysics Toolbox extensions were used to program the experimental stimuli in this study (Brainard, 1997; Pelli, 1997). The author used high-tone (1000 Hz) and low-tone (500 Hz) bursts that lasted 10 ms and included a 1-ms rise and fall time. A pair of high-tone bursts (1000 Hz) were used as markers to indicate the beginning and end of each interval. To present repetitive stimuli at 5 or 25 Hz, low-tone bursts (500 Hz) were used. The same apparatus and stimuli were used in Experiments 2, 3, and 4.
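As an illustration of the stimuli described above, the following Python sketch generates a 10-ms tone burst with 1-ms rise and fall times and lists burst onsets for a 5- or 25-Hz flutter. It is not the Psychophysics Toolbox code used in the study. The sample rate, the linear ramp shape, the function names, and the assumption that the number of bursts equals rate × duration are all choices made here for illustration and are not given in the text.

```python
import math

SAMPLE_RATE = 44_100  # Hz; assumed, not stated in the text


def tone_burst(freq_hz, dur_ms=10.0, ramp_ms=1.0, rate=SAMPLE_RATE):
    """Return samples of a sine burst with linear onset and offset ramps."""
    n = round(rate * dur_ms / 1000)
    n_ramp = round(rate * ramp_ms / 1000)
    samples = []
    for i in range(n):
        gain = 1.0
        if n_ramp > 0:
            # Linear rise from the first sample and fall toward the last one
            # (the ramp shape is an assumption).
            gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / rate))
    return samples


def flutter_onsets_ms(rate_hz, total_ms=1000.0):
    """Onset times (ms) of bursts repeating at rate_hz within total_ms.

    Assumes the first burst starts at flutter onset.
    """
    period = 1000.0 / rate_hz
    count = int(total_ms / period)
    return [k * period for k in range(count)]


if __name__ == "__main__":
    marker = tone_burst(1000)      # high-tone interval marker
    flutter_tone = tone_burst(500)  # low-tone flutter element
    print(len(marker), len(flutter_tone))
    print(flutter_onsets_ms(5))
    print(len(flutter_onsets_ms(25)))
```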
Procedure
Each trial had reference and test intervals (Figure 1). The participant started each trial by pressing the space bar. Following a 500-ms silence, a reference interval was presented by delivering the first high-tone burst, followed by a 400-ms silent period and the second high-tone burst. A test interval was presented after a 1,500-ms silence. The silent period of each test interval was chosen at random from a set of eight intervals (100, 280, 340, 380, 420, 460, 520, or 700 ms). After the test interval and a further 100 ms of silence, a sequence of low tones repeating at 5 or 25 Hz was presented for 1,000 ms. The participants were required to judge whether the test interval was longer or shorter than the reference interval and responded by pressing the key corresponding to their judgment. Each participant completed 160 trials (2 repetition rates × 8 test intervals × 10 repetitions). The trial order was randomized across participants. Participants were free to take short breaks whenever they wanted. The participants completed 10 practice trials before the experiment.
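The design described above can be sketched in Python as follows: a fully crossed, shuffled trial list and the sequence of segments within one trial. This is an illustrative reconstruction, not the experiment code. The function names and dictionary keys are hypothetical, and the sketch assumes that each silent period runs from the offset of one burst to the onset of the next.

```python
import itertools
import random

TEST_INTERVALS_MS = (100, 280, 340, 380, 420, 460, 520, 700)
FLUTTER_RATES_HZ = (5, 25)
REPETITIONS = 10


def build_trials(rates=FLUTTER_RATES_HZ, intervals=TEST_INTERVALS_MS,
                 reps=REPETITIONS, seed=None):
    """Return a shuffled list of trials, one dict per trial."""
    trials = [
        {"rate_hz": rate, "test_ms": test}
        for rate, test in itertools.product(rates, intervals)
        for _ in range(reps)
    ]
    random.Random(seed).shuffle(trials)
    return trials


def trial_segments(test_ms, reference_ms=400, burst_ms=10):
    """List the (label, duration_ms) segments of one trial, in order.

    Assumes silent periods are measured from burst offset to burst onset.
    """
    return [
        ("initial silence", 500),
        ("reference marker 1", burst_ms),
        ("reference silence", reference_ms),
        ("reference marker 2", burst_ms),
        ("inter-interval silence", 1500),
        ("test marker 1", burst_ms),
        ("test silence", test_ms),
        ("test marker 2", burst_ms),
        ("pre-flutter silence", 100),
        ("flutter", 1000),
    ]


if __name__ == "__main__":
    trials = build_trials(seed=1)
    print(len(trials))  # 160
    print(sum(duration for _, duration in trial_segments(400)))  # 3940
```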
Results and Discussion
Figure 2 shows the proportion of trials in which the test interval was judged to be longer than the reference interval. According to a two-factor within-participant ANOVA, the main effects of the repetition rate (5 or 25 Hz) and test interval (100, 280, 340, 380, 420, 460, 520, or 700 ms) were significant, but the interaction was not [F(1, 9) = 18.144, p = .002, ηp2 = .668; F(7, 63) = 131.295, p < .001, ηp2 = .936; F(7, 63) = 2.136, p = .052, ηp2 = .192]. A cumulative normal distribution function was fitted to the data (probit analysis). The points of subjective equality, defined as the intersection of the fitted sigmoid curve with the P = 0.5 line, were 372.2 ms in the 5-Hz condition and 421.9 ms in the 25-Hz condition. According to the t-test results, the point of subjective equality of the 5-Hz condition was smaller than that of the 25-Hz condition [t(9) = 3.774, p = .004, d = 1.358].
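To make the notion of the point of subjective equality concrete, the following Python sketch estimates it from proportions of "longer" judgments by fitting a straight line to probit-transformed proportions. This is a simplification chosen for illustration: the text states only that a cumulative normal function was fitted, and a maximum likelihood fit may well have been used instead. The function name, the clipping parameter `eps`, and the synthetic data are assumptions and are not the data of this study.

```python
from statistics import NormalDist


def fit_pse(intervals_ms, prop_longer, eps=0.05):
    """Estimate (PSE, sigma) of a cumulative normal psychometric function.

    Proportions are converted to z-scores and regressed on the test
    interval; the PSE is where the fitted line crosses z = 0 (P = 0.5).
    Proportions are clipped to [eps, 1 - eps] so that values of 0 or 1
    yield finite z-scores (eps = 0.05 assumes 10 trials per point).
    """
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(min(max(p, eps), 1.0 - eps)) for p in prop_longer]
    n = len(z)
    mean_x = sum(intervals_ms) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in intervals_ms)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(intervals_ms, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return -intercept / slope, 1.0 / slope


if __name__ == "__main__":
    intervals = [100, 280, 340, 380, 420, 460, 520, 700]
    # Synthetic observer with a true PSE of 400 ms (not data from this study).
    observer = NormalDist(mu=400, sigma=150)
    proportions = [observer.cdf(x) for x in intervals]
    pse, sigma = fit_pse(intervals, proportions, eps=0.01)
    print(round(pse, 1), round(sigma, 1))
```

A smaller point of subjective equality means that a shorter physical interval is sufficient to be judged equal to the reference, that is, the interval is perceived as longer.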
The analyses showed that the time interval preceding the fast flutter was perceived as shorter than that preceding the slow flutter, and that the subsequent flutter effect of Ono and Kitazawa (2010) could be replicated. In the next experiment, the author assumed that the reason why the subsequent flutter effect could not be replicated by Repp et al. (2013) and Masuda and Shirai (2021) was the presence or absence of a control condition, and conducted an experiment with an additional control condition.
Experiment 2
Experiment 2 was a conceptual replication of Ono and Kitazawa’s (2010) study, adding a No-flutter condition in which nothing was presented after the test interval. If the results of Ono and Kitazawa (2010) are replicated, the time interval just before the fast flutter (25-Hz condition), as in Experiment 1, should be perceived as shorter than the time interval just before the slow flutter (5-Hz condition). However, based on the results of Repp et al. (2013) and Masuda and Shirai (2021), there may be no difference between the 25-Hz and 5-Hz conditions.
Method
Participants
The author calculated the sample size needed for detecting a significant main effect through a two-way ANOVA using G*power (Cohen’s f = 0.4, α = .05, 1-β = .95; Faul et al., 2007, 2009). Based on the required sample size, 12 participants aged 20–24 years with normal hearing participated.
Procedure
All aspects of this experiment were the same as those in Experiment 1, except for the addition of the No-flutter condition. In the No-flutter condition, nothing was presented after the test interval. Each participant completed 240 trials (3 repetition-rate conditions × 8 test intervals × 10 repetitions).
Results and Discussion
Figure 3 shows the proportion of trials in which the test interval was judged to be longer than the reference interval. According to a two-factor within-participant ANOVA, the main effects of the repetition rate (5-Hz, 25-Hz, or No-flutter), test interval (100, 280, 340, 380, 420, 460, 520, or 700 ms), and the interaction were significant [F(2, 22) = 8.619, p = .002, ηp2 = .439; F(7, 77) = 100.423, p < .001, ηp2 = .901; F(14, 154) = 2.601, p = .002, ηp2 = .191]. Multiple comparisons using the Holm–Bonferroni method (Holm, 1979) demonstrated that the proportion of longer judgments was significantly higher in the 5-Hz condition than in the No-flutter condition (p < .001); however, there was no significant difference between the 5-Hz and 25-Hz conditions (p = .240). The points of subjective equality were 386.3 ms, 400.3 ms, and 424.1 ms in the 5-Hz, 25-Hz, and No-flutter conditions, respectively. According to the t-test results, the point of subjective equality of the 5-Hz condition was not significantly different from that of the 25-Hz condition [t(11) = 1.206, p = .253, d = 0.253].
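For readers unfamiliar with the Holm–Bonferroni procedure (Holm, 1979) used for the multiple comparisons above, the following Python sketch computes Holm-adjusted p values. The function name is hypothetical and the input p values in the example are made up for illustration; they are not the unadjusted p values of this study, which are not reported in the text.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p values in the original order.

    The smallest p value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Hypothetical unadjusted p values for three pairwise comparisons.
    raw = [0.01, 0.04, 0.03]
    print([round(p, 3) for p in holm_adjust(raw)])  # [0.03, 0.06, 0.06]
```

A comparison is declared significant when its adjusted p value is below the chosen α level.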
The analysis did not show that the time interval immediately before the fast flutter was perceived as shorter than the time interval immediately before the slow flutter; thus, the subsequent flutter effect of Ono and Kitazawa (2010) was not replicated. Why was the subsequent flutter effect not replicated simply because the No-flutter condition was included? One possibility is that the effect does occur under these conditions, but that the difference was not statistically significant because of sampling variability, noisy data, or a small effect. Inferring that the null hypothesis is true from a failure to reject it is a fallacy (Cohen, 1994; Goodman, 2008). In the next experiment, the author explored the possibility that the subsequent flutter effect was not statistically significant because the No-flutter condition was included in the same block; hence, all conditions were tested in separate blocks.
Experiment 3
In Experiment 3, the 5-Hz, 25-Hz, and No-flutter conditions were tested in separate blocks. If the subsequent flutter effect was not replicated in Experiment 2 because the No-flutter condition was included in the same block, then the subsequent flutter effect should be replicated in Experiment 3.
Method
Participants
The author calculated the sample size necessary for detecting a significant main effect in a two-way ANOVA using G*power (Cohen’s f = 0.4, α = .05, 1-β = .95; Faul et al., 2007, 2009). The required sample size was set at 12, and 12 participants aged 20–24 years with normal hearing participated.
Procedure
All aspects of this experiment were the same as those in Experiment 2, except that the 5-Hz, 25-Hz, and No-flutter conditions were tested in separate blocks. The order of the three blocks was counterbalanced among the participants. Each participant completed 240 trials (3 repetition-rate conditions × 8 test intervals × 10 repetitions).
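One way to counterbalance three blocks across 12 participants is to cycle through all six possible block orders so that each order is used twice. The Python sketch below illustrates this scheme. The text states only that the block order was counterbalanced, so the full-permutation scheme, the assignment by participant index, and the function name are assumptions.

```python
import itertools

CONDITIONS = ("5-Hz", "25-Hz", "No-flutter")


def block_orders(n_participants, conditions=CONDITIONS):
    """Assign each participant one block order, cycling through all orders."""
    orders = list(itertools.permutations(conditions))
    return [orders[i % len(orders)] for i in range(n_participants)]


if __name__ == "__main__":
    for participant, order in enumerate(block_orders(12), start=1):
        print(participant, order)
```

With 240 trials split evenly over three blocks, each block would contain 80 trials (8 test intervals × 10 repetitions).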
Results and Discussion
Figure 4 shows the proportion of trials in which the test interval was judged to be longer than the reference interval. According to a two-factor within-participant ANOVA, the main effects of the repetition rate (5-Hz, 25-Hz, or No-flutter), test interval (100, 280, 340, 380, 420, 460, 520, or 700 ms), and the interaction were significant [F(2, 22) = 13.118, p < .001, ηp2 = .544; F(7, 77) = 161.421, p < .001, ηp2 = .936; F(14, 154) = 3.407, p < .001, ηp2 = .236]. Multiple comparisons using the Holm–Bonferroni method (Holm, 1979) showed that the proportion of longer judgments was significantly higher in the 5-Hz condition than in the 25-Hz and No-flutter conditions (p = .002; p < .001). The points of subjective equality were 339.5 ms, 390.7 ms, and 398.6 ms in the 5-Hz, 25-Hz, and No-flutter conditions, respectively. According to the t-test results, the point of subjective equality of the 5-Hz condition was smaller than that of the 25-Hz condition [t(11) = 4.478, p = .001, d = 1.495].
The analyses showed that the time interval preceding the fast flutter was perceived as shorter than the time interval preceding the slow flutter and that the subsequent flutter effect of Ono and Kitazawa (2010) was replicated. Taken together, the results of Experiments 2 and 3 indicate that the presence or absence of the subsequent flutter effect depends on whether the No-flutter condition is included in the same block.
Why does including the No-flutter condition in the same block eliminate the subsequent flutter effect? One possibility is that it widens the time range that requires attention. For example, in Experiment 1, without the No-flutter condition, participants needed to direct their attention to a time range of 1200–1800 ms, that is, the target time interval of 100–700 ms plus 100 ms for the blank and 1000 ms for the flutter. Although attending to the target time interval alone was sufficient for performing the task, the participants responded after the flutter presentation ended; therefore, the author assumed that they also paid attention during the flutter presentation. However, when the No-flutter condition was included in the same block, the attentional load may have been too large for the participants because they needed to direct their attention over a wide time range of 100–1800 ms. A prior study demonstrated that a lack of attentional resources affects time perception (Polti et al., 2018). Therefore, in the next experiment, the set of target time intervals was narrowed to reduce the time range within which participants needed to direct attention.
Experiment 4
In Experiment 4, the 5-Hz, 25-Hz, and No-flutter conditions were tested in the same block, but with target time intervals of 340, 380, 420, and 460 ms. Consequently, the time period to which the participants’ attention should be directed would range from 340 ms to 1560 ms (the longest target time interval of 460 ms plus 100 ms for the blank and 1000 ms for the flutter). This narrower range was expected to allow participants to devote more attention to the task. If the subsequent flutter effect was not replicated in Experiment 2 due to a lack of attentional resources, then the subsequent flutter effect should be replicated in Experiment 4.
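The time ranges quoted for Experiments 1, 2, and 4 follow from simple arithmetic on the target interval, the 100-ms blank, and the 1000-ms flutter. The Python sketch below reproduces them under the reading that a flutter trial requires attention until the flutter ends, whereas a No-flutter trial requires attention only for the target interval. The function name and parameter names are hypothetical.

```python
def attention_range_ms(test_intervals_ms, gap_ms=100, flutter_ms=1000,
                       has_no_flutter=False):
    """Return the (shortest, longest) attended span across trial types."""
    # Flutter trials: target interval + blank + flutter.
    spans = [t + gap_ms + flutter_ms for t in test_intervals_ms]
    if has_no_flutter:
        # No-flutter trials: target interval only.
        spans.extend(test_intervals_ms)
    return min(spans), max(spans)


if __name__ == "__main__":
    wide = (100, 280, 340, 380, 420, 460, 520, 700)
    narrow = (340, 380, 420, 460)
    print(attention_range_ms(wide))                         # (1200, 1800)
    print(attention_range_ms(wide, has_no_flutter=True))    # (100, 1800)
    print(attention_range_ms(narrow, has_no_flutter=True))  # (340, 1560)
```

The width of the range is 600 ms in Experiment 1, 1700 ms in Experiment 2, and 1220 ms in Experiment 4.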
Method
Participants
The author calculated the sample size needed to detect a significant main effect in a two-way ANOVA using G*power (Cohen’s f = 0.4, α = .05, 1-β = .95; Faul et al., 2007, 2009). The required sample size was set at 15, and 15 participants aged 20–26 years with normal hearing participated.
Procedure
All aspects of this experiment were the same as those in Experiment 2, except that the target time intervals were 340, 380, 420, and 460 ms. Each participant completed 120 trials (3 repetition-rate conditions × 4 test intervals × 10 repetitions).
Results and Discussion
Figure 5 shows the proportion of trials in which the test interval was judged to be longer than the reference interval. According to a two-factor within-participant ANOVA, the main effects of the repetition rate (5-Hz, 25-Hz, or No-flutter), test interval (340, 380, 420, or 460 ms), and the interaction were significant [F(2, 28) = 8.953, p = .001, ηp2 = .390; F(3, 42) = 108.369, p < .001, ηp2 = .886; F(6, 84) = 3.185, p = .007, ηp2 = .185]. Multiple comparisons using the Holm–Bonferroni method (Holm, 1979) showed that the proportion of longer judgments was significantly higher in the 5-Hz condition than in the 25-Hz and No-flutter conditions (p = .006; p = .003).
The analyses showed that the time interval preceding the fast flutter was perceived as shorter than that immediately preceding the slow flutter and that the subsequent flutter effect of Ono and Kitazawa (2010) was replicated. The results suggest that the presence or absence of the subsequent flutter effect depends on whether sufficient attentional resources are directed to the temporal task.
General Discussion
The purpose of this study was to determine why Repp et al. (2013) and Masuda and Shirai (2021) were unable to replicate the subsequent flutter effect of Ono and Kitazawa (2010). In Experiment 1, a direct replication of Ono and Kitazawa (2010), the time interval just before a fast flutter (25 Hz) was judged to be shorter than the time interval immediately before a slow flutter (5 Hz), thus replicating the subsequent flutter effect. Experiment 2, a conceptual replication of Ono and Kitazawa (2010) with an additional condition in which no subsequent flutter was presented, did not replicate the subsequent flutter effect. The reason for this is highly uncertain; it could be due to sampling variability. Experiment 3 replicated the subsequent flutter effect by testing each condition in a separate block. Experiment 4 replicated the subsequent flutter effect by testing all conditions in the same block with a narrower range of test intervals.
Based on these results, let us consider why Repp et al. (2013) and Masuda and Shirai (2021) were unable to reproduce the subsequent flutter effect. The present author attributes this to a lack of attentional resources. Attention can be likened to a limited resource, and this concept has been developed through various experiments (Lavie, 1995; Norman & Bobrow, 1975). Furthermore, it has been demonstrated that attention has a significant effect on time perception (e.g., Thomas & Weaver, 1975; Zakay & Block, 1996). The author suspects that Repp et al. (2013) and Masuda and Shirai (2021) were unable to replicate the effect because the control condition in which no subsequent flutter was presented was included in the same block, making the time span to which attention had to be directed too long. As a result, attentional resources may have been insufficient for participants to direct attention to the flutter, so that the subsequent flutter effect did not emerge.
The attentional account may also explain why the preceding flutter effect, like the subsequent flutter effect, was not replicated in those studies. Previous studies have shown that the time interval just after a fast flutter is perceived as longer than the time interval immediately after a slow flutter (Penton-Voak et al., 1996; Treisman et al., 1990). This preceding flutter effect, however, was not replicated in the conceptual replications of Repp et al. (2013) and Masuda and Shirai (2021). The author examined only the subsequent flutter effect in this study; however, the failure to replicate the preceding flutter effect could also be explained by a lack of attentional resources, a possibility that will be examined in a future study.
The internal pacemaker account (Creelman, 1962; Treisman, 1963) predicts that the faster the flutter repetition rate, the longer the subjective time interval, as noted in the introduction. The current study, however, found that the faster the flutter repetition rate, the shorter the preceding time interval was perceived to be. This result cannot be explained by the internal pacemaker account alone, but it may be explained by flutter modulation of memory and judgment. This is consistent with theories proposing that working memory is the foundation of temporal perception (Lewis & Miall, 2006).
In this study, the author investigated the conditions under which the subsequent flutter effect of Ono and Kitazawa (2010) can be replicated. The presence or absence of a control condition in which a flutter was not presented within the same block, and the time range within which participants should direct attention, were found to be important conditions. These findings may have far-reaching implications not only for the subsequent flutter effect but also for reproducibility in time perception studies involving attention.
Funding
This study was supported by JSPS KAKENHI (17K04432 and 21K03132).
Conflicts of Interest
The author declares no potential conflicts of interest with respect to the research, authorship, or publication of this article.
Ethics Approval
This study was approved by the Ethics Committee of Yamaguchi University. The experiments were conducted in accordance with the Declaration of Helsinki guidelines. Written informed consent was obtained from all participants before the experiment.
Author Contributions
F. Ono is the sole author and was responsible for conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing of the original draft, and review and editing of the manuscript.
Data Accessibility Statement
The design and analysis plans were registered, and all data are publicly available on OSF at https://osf.io/bv9ds/. The author originally stated in the pre-registration that Ryan’s method would be used for post-hoc ANOVA comparisons; however, because Ryan’s method is not commonly used, the Holm–Bonferroni method was used instead.