The Spatial-Numerical Association of Response Codes (SNARC) effect – i.e., faster responses to small numbers with the left compared to the right side and to large numbers with the right compared to the left side – suggests that numbers are associated with space. However, it remains unclear whether the SNARC effect evolves from a number’s magnitude or the ordinal position of a number in working memory. One problem is that, in different paradigms, the task demands influence the role of ordinality and magnitude. While single-task setups in which participants judge the parity of a displayed number indicate the importance of magnitude for the SNARC effect, evidence for ordinal influences usually comes from experiments where ordinal sequences have to be memorized or setups in which participants possess pre-existing knowledge of the ordinality of stimuli. Therefore, in this preregistered study, we employed a SNARC task without secondary ordinal sequence memorization. We dissociate ordinal and magnitude accounts by carefully manipulating experimental stimulus sets. The results indicate that even though the magnitude model better accounts for the observed data, the ordinal position seems to matter as well. Hence, numbers are associated with space in both a magnitude- and an order-respective manner, yielding a mixture of both compatibility effects. Moreover, a multiple coding framework may most accurately explain the roots of the SNARC effect.
Magnitude information seems to be mentally associated with space, as suggested by the Spatial-Numerical Association of Response Codes (SNARC) effect: responses to relatively small numbers are faster with the left compared to the right side, and responses to relatively large numbers are faster with the right compared to the left side (Dehaene et al., 1993). The mental number line (Dehaene et al., 1993; Restle, 1970) and the working memory account (van Dijck et al., 2011; van Dijck & Fias, 2011) suggest different mental representations that induce this effect.
1.1. The Mental Number Line Account and its Challenges
Dehaene et al. (1993) proposed a visuospatial mental number line to account for the SNARC effect. This line is assumed to represent number magnitude horizontally, either from left to right or from right to left, depending on the writing direction (Dehaene et al., 1993; Restle, 1970). In strong opposition to the mental number line account, Casasanto and Pitt (2019) have argued that it may not be the numerical magnitude but the ordinal position of numbers that is spatially mapped. In most experiments, the two alternative explanations cannot be tested against each other because the ordinality typically coincides with magnitude. Prpic et al. (2021) argued against the ordinal account by referring to studies that show spatial-numerical associations in setups where no immediate stimulus order can be established. Meanwhile, studies showing a spatial mapping of non-numerical ordinal sequences, such as letters of the alphabet or months (Gevers & Lammertyn, 2005), support the ordinal account. This opened the discussion on whether the numerical magnitude or ordinality is spatially mapped (Gevers & Lammertyn, 2005).
1.2. The Working Memory Account
Further researchers described the importance of working memory for temporarily and selectively mapping number and ordinality representations onto space. These observations laid the grounds for the working memory account (van Dijck et al., 2011; van Dijck & Fias, 2011), which suggests that a spatial coding of numbers is formed during task execution by storing task-relevant numbers in working memory. Thereby the ordinal position of a number in working memory is associated with space: the beginning of a sequence is associated with the left and the end with the right (van Dijck & Fias, 2011). This association is created by binding number items to active spatial templates, which are created from experience. Newer versions of the working memory account further assume that the most common numbers and their typical relations are saved in long-term memory (Abrahamse et al., 2016). One such long-term memory representation is the canonical number set representing the numbers one to nine according to their magnitude, as this is the order in which those numbers typically occur (Abrahamse et al., 2016). A number item can prompt this representation. Abrahamse et al. (2016) explain the range dependence of the SNARC effect (i.e., an early observation that the SNARC effect seems to adapt to the range of numbers being used in a stimulus set, and the same number can be characterized by a left/right side advantage depending on other numbers present in the stimulus set; Dehaene et al., 1993) by assuming that when only a subset of the canonical number set is perceived, this representation “is ‘pruned’ to match the actually used items in the experiment” (Abrahamse et al., 2016, p. 6). The working memory account does not further specify which numbers are loaded into working memory while conducting the task. Thus, there are two possible interpretations: first, only the perceived numbers are activated in working memory in order of their magnitude. 
In this case, numbers within the range of the presented numbers that are not presented would not be activated in working memory (see the right column of Figure 1 for examples of the numbers activated in working memory). However, this wording could also be interpreted such that the mental representation includes all numbers within a particular range (see Figure 1, left column)¹. We focus on the first interpretation because the predictions of the mental number line account and the working memory account can only be differentiated when assuming that only the numbers perceived during the task are activated. We are aware that some extensions to the original working memory account allow for a much wider range of predictions, which are harder to falsify. While we do not focus on these extensions in detail here in the introduction, we will examine in the discussion section under which conditions they might also account for our data.
In sum, the mental number line and the working memory account differ concerning their assumption about the mental representation of numbers. The mental number line account claims that numbers are represented according to their magnitude. In contrast, in our interpretation, the working memory account assumes that only the presented numbers are represented in working memory in order of their magnitude.
1.3. More Than One Explanation for the SNARC
Several authors (e.g., Huber et al., 2016) showed that both magnitude (as postulated by the mental number line) and ordinal position (in line with the working memory account) are important for the SNARC effect. A model by Prpic et al. (2016), which is motivated by spatial compatibility effects when processing musical note lengths, suggests that depending on the nature of the task, either magnitude or ordinality determines the SNARC effect: direct tasks (tasks requiring the processing of the magnitude/ordinal position of a number, i.e., magnitude classification) evoke an ordinality-dependent SNARC effect and indirect ones (tasks that do not require processing the magnitude/the ordinal position of the number, i.e., parity judgment) a magnitude-dependent SNARC effect. Similarly, Schroeder et al. (2017) proposed a multiple coding framework, which suggests that different mechanisms provoke spatial associations simultaneously (see also Abrahamse et al., 2016; Huber et al., 2016). More precisely, the framework assumes that simultaneous activation of the number representations on a mental number line and in working memory is possible. Although this model requires further validation (see also Cipora et al., 2020) and asks for the formalization of a computational model, the basic idea that multiple coding mechanisms may be responsible for the SNARC effect is supported by the data of neglect patients (van Dijck & Doricchi, 2019). As Toomarian and Hubbard (2018) argued, the spatial coding of magnitude may be related to inborn spatial biases, while the mapping of the ordinality could be related to cultural factors. To sum up, different lines of argument support the mental number line account and the working memory account, and both magnitude and ordinality seem to be at play. What remains unclear is their relative role in the most basic parity judgment task typically used to measure the SNARC effect.
1.4. The Current Study
Even though several studies compared the mental number line and working memory accounts (Ginsburg et al., 2014; Huber et al., 2016; Lindemann et al., 2008; van Dijck & Fias, 2011), all of the cited studies used an ordinal secondary task (i.e., learning a sequence) or were based on some pre-existing knowledge structures. The pre-existing knowledge structures used as an anchor for the ordinality representation were musical notation (Prpic et al., 2016) or phone dial (Mingolo et al., 2021). These studies only observed the ordinality SNARC in tasks in which the ordinal position of the stimulus was task-relevant (i.e., direct tasks), that is, when a sequence of items had to be temporarily memorized. One could argue that the nature of the ordinal secondary task or the memory recall triggered the effect of the ordinal position of a number/object in a sequence found in such experiments. Further, studies using pre-existing knowledge structures (e.g., Mingolo et al., 2021; Prpic et al., 2016) deviated from the typical SNARC setup. For those studies, it remains unclear whether ordinality or magnitude plays a decisive role in the spatial mapping of numbers in the most basic SNARC setup with numbers and a typical numerical task.
Therefore, the current study aimed to probe the mental number line and the ordinal working memory accounts using a parity judgment (i.e., indirect) task without a secondary task (i.e., without learning a sequence explicitly and without referring to an existing ordinality-related knowledge structure). Further, we did not present the number set used in the parity judgment task to the participants in advance, avoiding the priming of a particular number representation. We used carefully selected stimulus sets to dissociate predictions from ordinal and magnitude number representations. Each set included two even and two odd numbers, consisting of three consecutive numbers and a fourth one maximally distant from them within the single-digit range (see Figure 1 for a more detailed description of stimuli that enable this dissociation).
An advantage of the ordinality model would thus lend strong support to the working memory account. In contrast, an advantage of the magnitude model would favor the mental number line account. As mentioned above, there are extensions of the original working memory model that might also incorporate result patterns predicted by the magnitude model, making it almost empirically indistinguishable from the mental number line account. We will address those in the discussion.
The ethics committee for psychological research at the University of Tuebingen (Nürk_2020_0623_192) has approved the study’s protocol. The study was preregistered (https://osf.io/h7vpm).
An a priori power analysis² revealed that at least 265 data sets must be collected to detect differences in SNARC slopes of Cohen’s d = 0.2 (power = .9, in paired two-sided t-tests), which we considered the smallest effect size of interest. Due to the online setting, we expected a dropout rate of 20% (incomplete or unusable data sets). Accordingly, we aimed for at least 320 participants. In total, 465 participants took part in the online study. Participants signed informed consent and could participate in a lottery of eleven Amazon vouchers or receive course credit for their participation. After exclusions (see below), a sample size of N = 423 remained (47 to 61 participants per condition; for the exact distribution of participants, see supplementary material S1³). The participants reported an average age of 25.90 years (SD = 9.60 years, range = 18 - 83 years); 290 participants reported being female, 130 reported being male, and three defined themselves as diverse. In total, 375 participants indicated being right-handed, 39 left-handed, and nine ambidextrous. The sample comprised 366 German, 11 English, and 48 other native speakers (language reported less than ten times). All except three participants⁴ (two Arabic, one Pashto) reported languages that are read from left to right, and two participants did not state an identifiable mother tongue.
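The reported minimum sample size can be approximated by hand. The following is a minimal sketch, assuming a normal approximation with the small-sample correction z²/2; the function name `paired_t_sample_size` is hypothetical, and the source does not state which tool or formula the authors used.

```python
from math import ceil
from statistics import NormalDist


def paired_t_sample_size(d, alpha=0.05, power=0.90):
    """Approximate n for a two-sided paired (one-sample) t-test.

    Assumption of this sketch: normal approximation plus the
    correction term z_alpha^2 / 2. Not the authors' procedure.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    n = ((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


# Smallest effect size of interest from the text: d = 0.2, power = .9
print(paired_t_sample_size(0.2))
```

Under these assumptions the approximation agrees with the 265 data sets stated above; an exact computation based on the noncentral t distribution may differ by a participant or two for other inputs.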
2.2. Material and Apparatus
The main experiment was conducted via Pavlovia (2020) using jsPsych (de Leeuw, 2015). We additionally used SoSci Survey (Leiner, 2019) to assign a participant number and to save e-mail addresses for the lottery.
Texts (in the experimental trials) were presented in white in font size 22 px, whereas numbers, the fixation cross, and feedback (for the practice trials) were presented in font size 72 px, all on a black background. We used four number sets ([1, 2, 3, *8*], [2, 3, 4, *9*], [*1*, 6, 7, 8], [*2*, 7, 8, 9]), each including one critical number (marked by asterisks). For the critical number, the distance to the previous (or following) number in the set depends on whether the magnitude of the number or its ordinal position in the sequence is considered (see Figure 1).
Each participant was assigned to one of the four number sets and one of two parity-to-key mapping orders (either first “d” for odd numbers and “k” for even numbers and then “k” for odd numbers and “d” for even numbers, or vice versa) by order of participation. For each participant, each combination of a number from the number set and a parity-to-key mapping was repeated 30 times, resulting in 240 trials. This between-subject design prevented transfer effects between number sets.
After a welcome page on SoSci Survey, the participant was forwarded to Pavlovia, where the main experiment was executed in full screen (the experimental code can be found in the supplementary material S2). The participants were asked to choose the language of the experiment (German or English), whether they wanted to look at the experiment or participate, and which device they used (PC, laptop, netbook, smartphone, e-reader, or other). Beneath these questions, a red note stated that participation in the experiment was only possible using a PC, laptop, or netbook. If none of these three devices was chosen or the device rendered like a smartphone, the experiment ended with a corresponding note.
Next, the participant was instructed to classify the parity of a number by pressing either the “d” or the “k” key, depending on the parity-to-key mapping. During the practice trials (see Figure 2), each number of the number set was presented twice in random order. The participants received feedback for late (‘too late’), correct (‘correct’), and wrong (‘wrong’) responses. Participants who got fewer than 75% of the practice trials correct had to repeat the trials. We chose 75% because 50% correct responses can be reached by chance alone. Hence, 75% accuracy corresponds to a participant who knows the correct response on 50% of the trials and guesses on the other 50% (of which half, i.e., 25% of all trials, are correct by chance and 25% are incorrect).
After the training, the instruction was repeated, and the experimental trials followed. The experimental trials were analogous to the practice trials. However, feedback was only given for no or late responses, stating that none of the response keys was pressed, and the parity-to-key mapping was presented again. Each number of a number set was presented 30 times in pseudorandomized order. Thereby no more than three numbers of the same parity were shown in a row, and a number was never repeated in consecutive trials. The same practice and experimental procedure were then repeated with the reversed parity-to-key mapping.
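The two ordering constraints described above can be illustrated with a short sketch. It is written in Python for illustration only (the actual experiment ran in jsPsych, see supplementary material S2), and the helper names `allowed` and `make_sequence` as well as the restart strategy are assumptions, not the authors' implementation.

```python
import random
from collections import Counter


def allowed(seq, number, max_parity_run=3):
    """Check both ordering constraints for appending `number` to `seq`."""
    if seq and seq[-1] == number:
        return False  # a number is never repeated in consecutive trials
    tail = seq[-max_parity_run:]
    if len(tail) == max_parity_run and all(n % 2 == number % 2 for n in tail):
        return False  # would create a run of four same-parity numbers
    return True


def make_sequence(number_set, reps=30, rng=None, max_restarts=10_000):
    """Build one block in which each number appears `reps` times."""
    rng = rng or random.Random()
    total = reps * len(number_set)
    for _ in range(max_restarts):
        remaining = Counter({n: reps for n in number_set})
        seq = []
        while len(seq) < total:
            candidates = [n for n in number_set
                          if remaining[n] > 0 and allowed(seq, n)]
            if not candidates:
                break  # dead end: discard this attempt and start over
            # Weight by remaining count so the numbers stay balanced.
            weights = [remaining[n] for n in candidates]
            pick = rng.choices(candidates, weights=weights, k=1)[0]
            seq.append(pick)
            remaining[pick] -= 1
        else:
            return seq  # loop finished without hitting a dead end
    raise RuntimeError("no valid sequence found")


block = make_sequence([1, 2, 3, 8], rng=random.Random(1))
print(len(block), Counter(block))
```

Two such blocks, one per parity-to-key mapping, give the 240 experimental trials per participant.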
The study continued by collecting demographic data (cf. supplementary materials S3) and running an arithmetic task consisting of 40 equations with basic arithmetic operations and a time limit of two minutes (cf. supplementary materials S4, the analysis of this task goes beyond the scope of the current research question). Then, two final questions: “How noisy was your environment?” (“silent” to “extremely noisy”) and “If you were the experimenter, would you use the data?” (“Yes.”, “Not all of them.”, “No.”) were presented. The participant was then forwarded to SoSci Survey for the lottery.
2.5. Data Analyses
The data exclusion procedure was derived from the one used by Cipora et al. (2019). Our analysis script can be found in the supplementary materials S5 and the data in S6. First, data sets from participants who reported an age below 18 (0.43%; this exclusion criterion was not preregistered) and from participants who indicated that they performed the task in a very noisy or extremely noisy environment (1.94%) were excluded. Afterward, incorrectly answered trials (4.74%; prepared data set) were discarded, and trials with reaction times below 250 ms (0.59%) were discarded as anticipations (as in Cipora et al., 2019; Ginsburg et al., 2014; van Dijck & Fias, 2011; this excludes accidental reactions and ensures the comparability of the results). Participants with no remaining correct response in one of the conditions were excluded (0.22%). Then, sequentially, all reaction times more than three SD below or above the individual mean reaction time were removed (3.27%). To ensure an accurate estimate of the mean reaction time in each experimental cell, data sets with fewer than 70% valid trials left (as in Cipora et al., 2019) for any number (of a number set) in each parity-to-key mapping were excluded (6.18%).
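The trial-level steps can be sketched as follows. This is a hedged illustration, not the analysis script from supplementary material S5: the function names are hypothetical, and "sequentially" is interpreted here as repeating the three-SD criterion with a recomputed mean and SD until no further trial is removed.

```python
from statistics import mean, stdev


def drop_invalid(trials, min_rt=250):
    """Keep RTs (ms) of correct trials that are not anticipations."""
    return [rt for rt, correct in trials if correct and rt >= min_rt]


def trim_sequential(rts, k=3.0):
    """Remove RTs beyond k SD of the mean, recomputing until stable.

    Assumption: the iterative recomputation is this sketch's reading
    of the sequential trimming described in the text.
    """
    kept = list(rts)
    while len(kept) > 2:
        m, s = mean(kept), stdev(kept)
        inside = [rt for rt in kept if abs(rt - m) <= k * s]
        if len(inside) == len(kept):
            break  # nothing removed in this pass
        kept = inside
    return kept


# Made-up trials of one participant: (reaction time in ms, response correct?)
trials = [(450 + 10 * i, True) for i in range(20)]
trials += [(3000, True), (200, True), (500, False)]
print(trim_sequential(drop_invalid(trials)))
```

In this toy example the anticipation (200 ms), the incorrect trial, and the 3000 ms outlier are removed, while the 20 regular trials are retained.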
The dRTs (mean reaction time of the right minus mean reaction time of the left hand) for each participant and each number were calculated according to Fias et al. (1996). Two linear regressions were fitted for each participant predicting dRTs either by the ordinal positions of a number (ordinality model) or by the magnitude of a number (magnitude model). We tested both slopes against zero. For the linear regressions comparison (model comparison), logit transformations were applied to the R²-values of the linear regressions from each participant to approximate a normal distribution. A paired-samples t-test was used to compare the transformed R²-values of the models.
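A per-participant version of this model comparison might look like the sketch below. The dRT values are made up for illustration, all function names are hypothetical, and the code only assumes ordinary least squares and the standard logit, log(p / (1 - p)); it is not the authors' script.

```python
from math import log, sqrt
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares for one predictor: slope, intercept, R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)  # must be > 0 for R^2 to exist
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)


def logit(p):
    """Logit transform; undefined for p = 0 or p = 1."""
    return log(p / (1 - p))


def paired_t(a, b):
    """t statistic of a paired-samples t-test (df = n - 1)."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


number_set = [1, 2, 3, 8]
magnitude = number_set                                    # predictor: 1, 2, 3, 8
ordinal = [sorted(number_set).index(n) + 1 for n in number_set]  # 1, 2, 3, 4
drts = [15.0, 10.0, 5.0, -20.0]  # made-up dRTs (right minus left hand, ms)

slope_mag, _, r2_mag = fit_line(magnitude, drts)
slope_ord, _, r2_ord = fit_line(ordinal, drts)
print(slope_mag, r2_mag)
print(slope_ord, r2_ord, logit(r2_ord))
```

Across participants, the two lists of logit-transformed R² values would then be passed to `paired_t`. Note that the same dRT pattern yields a steeper slope under the ordinality coding simply because the predictor range is compressed from 1 to 8 down to 1 to 4.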
Additionally, we conducted an alternative analysis. We fitted a linear regression to the dRTs of the three consecutive numbers in a number set (the critical number was excluded) for each participant. We then calculated the absolute difference between the measured dRT and the predicted value from each model for the critical number for each participant (deviance) and compared them. For instance, for the number set 1, 2, 3, 8, we fitted a regression line to the dRTs of the numbers 1, 2, and 3. Subsequently, we compared the dRT value for the number 8 with the dRT of this number being predicted by either the magnitude model (i.e., 8) or by the ordinality model (i.e., 4). The dependent measure was the deviance between the actual dRT and predicted values for both magnitude and ordinality models. This analysis was added as we wanted to look at how the regression models behave when being estimated solely based on consecutive numbers and not being affected by the critical number (which constitutes 25% of the data) and which of these models is better in predicting the dRT for the critical number.
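The extrapolation logic of this alternative analysis can be sketched for a single participant as follows. The dRT values are invented, and the function `critical_deviance` is a hypothetical helper rather than part of the authors' analysis script.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor: slope and intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def critical_deviance(number_set, critical, drts):
    """Absolute prediction error for the critical number under each coding.

    `drts` maps each number to its dRT. The line is fitted to the three
    consecutive numbers only and then extrapolated to the critical number,
    placed either at its magnitude or at its ordinal position in the set.
    """
    ranked = sorted(number_set)
    codings = {
        "magnitude": {n: n for n in number_set},
        "ordinality": {n: ranked.index(n) + 1 for n in number_set},
    }
    others = [n for n in number_set if n != critical]
    result = {}
    for name, code in codings.items():
        slope, intercept = fit_line([code[n] for n in others],
                                    [drts[n] for n in others])
        predicted = intercept + slope * code[critical]
        result[name] = abs(drts[critical] - predicted)
    return result


# Made-up dRTs (ms) for the set 1, 2, 3, 8 with critical number 8
print(critical_deviance([1, 2, 3, 8], 8, {1: 15.0, 2: 10.0, 3: 5.0, 8: -5.0}))
```

In this toy case the fitted line is 20 - 5x, so the magnitude model predicts -20 at position 8 and the ordinality model predicts 0 at position 4, giving deviances of 15 and 5 for the invented observation of -5.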
We used a significance level of .05, two-sided t-tests, and corrected the four reported p-values with the Bonferroni-Holm method. At first, the results from the model comparison and afterward, the comparison of the deviance from regressions are reported.
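For readers unfamiliar with the Bonferroni-Holm procedure, a minimal sketch of the step-down adjustment is given below. The p-values are placeholders rather than the study's values, and `holm_adjust` is a hypothetical helper name.

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvals[i])
        # Enforce monotonicity: an adjusted p never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Four placeholder p-values (the study corrected four reported p-values)
print(holm_adjust([0.01, 0.04, 0.03, 0.005]))
```

Each adjusted value is then compared against the significance level of .05.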
3.1. Model Comparison⁵
The slopes of the magnitude model (Mslope = -5.55, t(422) = -11.56, p < .001, 95%-CIslope = [-6.49, -4.61]), as well as the slopes of the ordinality model (Mslope = -13.33, t(422) = -12.41, p < .001, 95%-CIslope = [-15.45, -11.22]), differed significantly from zero, showing influences of magnitude and ordinality, respectively.
Compared to the ordinality model, the magnitude model showed significantly greater R²-values (Mmag = -0.67/.42 [mean of the transformed/untransformed R²-values], Mord = -0.96/.40 [mean of the transformed/untransformed R²-values], Mdif = 0.29/.02 [difference of the transformed/untransformed R²-values], t(422) = 3.33, p < .001, 95%-CIdif = [0.12, 0.46], Cohen’s d = 0.16) and hence explained significantly more variance in the data. Figure 3 shows a visualization of the comparisons in the top panel.
3.2. Comparison of the Deviance
The predicted dRTs for the ordinal position (ordinality model) and the magnitude (magnitude model) of the critical number were calculated for each participant. The absolute distance between the prediction and the measured dRT for the critical number was significantly smaller for the ordinality model compared to the magnitude model (Mmag = 106.84, Mord = 51.43, Mdif = 55.41, t(422) = 15.57, p < .001, 95%-CIdif = [48.41, 62.40], Cohen’s d = 0.76). Hence, the deviance between the prediction for the critical number and the measured dRT in the magnitude model was about twice as large as the corresponding deviance of the ordinality model. A visualization of the comparisons is shown in Figure 3, bottom panel.
The exact mapping of numbers to space has been assumed to depend on their magnitude (mental number line account) or their ordinal position (as implied by the working memory account and other theoretical accounts that assume a primary or sole role of ordinality). The current study compared the two assumptions while avoiding secondary tasks and memory recall of overlearned ordinal sequences. The results show that the magnitude model explained significantly more variance than the ordinality model. However, when calculating the deviance between the measured dRT for the critical fourth number and the predicted dRT via regression of the three consecutive numbers, the ordinality model yielded significantly smaller deviance. These results appear contradictory.
4.1. A Possible Explanation for This Contradiction
One possible source of the apparent inconsistency is the deviance comparison. Because the regression line parameters are estimated with a limited amount of data, the intercept and slope estimates are inevitably error-prone. While the error in the intercept affects all predicted positions equally – and thus both deviances – errors in the slope lead to errors that increase linearly with the distance of the critical fourth number from the other three. This distance is one for the ordinality model but five for the magnitude model – thus, the error in the slope is multiplied by five in the prediction of the critical number using the magnitude model. Since the error is not known, the results demand further studies and should be interpreted with caution at this point.
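The argument can be made explicit in a short derivation. It is a sketch under the assumption that the true dRT is linear in the predictor; the symbols are introduced here for illustration and do not appear in the original analysis. Let x₀ denote the consecutive number closest to the critical number, Δ the distance of the critical number from x₀ in the respective coding, and ε₀ and ε_b the estimation errors of the fitted value at x₀ and of the slope.

```latex
\begin{align}
  y(x) &= a + b\,x, \qquad \hat{y}(x) = \hat{a} + \hat{b}\,x,
          \qquad \hat{b} = b + \varepsilon_b \\
  \hat{y}(x_0) - y(x_0) &= \varepsilon_0 \\
  \hat{y}(x_0 \pm \Delta) - y(x_0 \pm \Delta) &= \varepsilon_0 \pm \varepsilon_b\,\Delta \\
  \Delta_{\text{ord}} = 1 &\;\Rightarrow\; \text{slope contribution } |\varepsilon_b| \\
  \Delta_{\text{mag}} = 5 &\;\Rightarrow\; \text{slope contribution } 5\,|\varepsilon_b|
\end{align}
```

The same ε₀ enters both predictions, whereas the slope error enters the magnitude prediction with five times the weight, independent of which model actually generated the data.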
Meanwhile, comparing the explained variance supports the mapping of numbers according to their magnitude and hence the mental number line account. However, it has also been proposed that not only the perceived numbers themselves but also the whole included range of numbers as overlearned and stored in working memory (i.e., 1, 2, 3, 4, 5, 6,…) may partially constitute the working memory content (Abrahamse et al., 2016). This poses a fundamental problem for distinguishing this extended working memory account from the magnitude account because the ordinal metric of the overlearned absolute sequence in long-term memory and the linear magnitude metric of the mental number line account are identical. Seeing that our deviance comparisons of the critical fourth number favor the ordinality model without an intermediate, never shown number, the results indicate that only the perceived numbers appear to be ordered in working memory. These seemingly contradicting results thus all hint towards a multiple coding framework-oriented explanation.
4.2. Multiple-Coding Frameworks of the SNARC
Even though the magnitude model describes the current data more accurately, the advantage is associated with a very small effect size (d = 0.16), which is, in fact, smaller than the smallest effect of interest we defined in our a priori power analysis. At the same time, the results also imply that both the mental number line and the working memory account play important roles (e.g., the deviance comparison yielded Cohen’s d = 0.76, and the measured dRT lies between the predictions of both models). We propose to further consider an integrative model, which may have the potential to expand on previous multiple-coding-framework-related explanations (e.g., Abrahamse et al., 2016; Huber et al., 2016; Prpic et al., 2016; Schroeder et al., 2017). In particular, the referenced literature seems to imply that working memory may be loaded with different task-relevant item-to-space associations.
In our case, it seems that a mental number line activation induced a linear magnitude-to-space association of the relevant number range.
Meanwhile, an ordinal encoding – whereby the order is, in our case, also influenced by a canonical magnitude – induced a linear order-to-space association. Such a simultaneous activation could account for the deviances between the model’s predictions and the measured data, as well as for the fact that both models fit the data well. Indeed, such a model could furthermore account for the discussed differences in magnitude and ordinal influences in other experiments as well – the stronger the focus on a particular magnitude or sequential ordering, the stronger may be its influence on the selection of (in-)compatible response codes. Hence, as Casasanto and Pitt (2019) stated, the culturally influenced mapping of the ordinality plays an important role in spatial mappings in such an integrative model. Those ordinal mappings can explain the association between numerically unrelated concepts such as musical pitches (Rusconi et al., 2006) or emotional valence (de la Vega et al., 2012) and space.
Additionally, as described by Prpic et al. (2021), the magnitude also seems to have an influence. The interaction of those mappings can explain the results of studies investigating the SNARC effect in right-to-left reading participants, where the SNARC effect is not consistently found (Fischer et al., 2009; Shaki et al., 2009; Shaki & Gevers, 2011; Zebian, 2005; Zohar-Shai et al., 2017). Here the mapping of the potentially inborn magnitude seems to be in contrast with the culturally learned mapping of ordinality.
The presented statistical model comparison indicates that the mental number line account yields more accurate regressions. Meanwhile, despite the described issues and the fact that a linearity assumption may be a slight oversimplification, a linear extrapolation to the critical number favors the (pure) working memory account. The results are in line with a multiple-coding framework account, which assumes that multiple task-relevant entity encodings play crucial roles dependent on the task setup and the directness with which the setup triggers particular item-space associations. However, the data are also in line with an extended working memory account, where not only ordinal sequences are actively kept in active working memory but also the ordinality of natural overlearned numbers in long-term memory can influence the resulting pattern (cf. also Abrahamse et al., 2016; Huber et al., 2016; Prpic et al., 2016; Schroeder et al., 2017). Unfortunately, these two models are empirically indistinguishable when we are satisfied with a verbal phrasing of an in-principle influence of both codes/both ordinalities. Therefore, a computational framework is needed specifying how any entity-space encoding in working memory may influence behavioral decision-making for any given model assumption, be it ordinality or magnitude.
In particular, model implementations of the extended working memory account and the multiple coding frameworks would be necessary. An extended working memory model implementation would need to specify how long-term ordinal sequences in long-term memory and short-term ordinal sequences currently active in working memory may influence stimulus processing and response selection. On the other hand, a multiple coding framework implementation would need to define how magnitude and ordinality both influence stimulus processing and response selection. These model implementations should be able to generate exact predictions about how strong the influence of each component will be under which experimental setup. If such influences are properly implemented, then the two models could be quantitatively tested against each other, potentially enabling the exclusion of one of the proposed explanations.
The presented experimental data provide boundary conditions for the future when such models are developed and quantitatively tested against empirical data. We hope this work will facilitate the development and refining of the working memory account so that it is more specific on which numbers are activated in the typical SNARC setup.
Contributed to conception and design: NNK, JFH, JL, KC, MVB, H-CN
Contributed to acquisition of data: NNK, JFH, JL
Contributed to analysis and interpretation of data: NNK, JFH, JH, KC, MVB, H-CN
Drafted and/or revised the article: NNK, JFH, JH, KC, MVB, H-CN
Approved the submitted version for publication: NNK, JFH, JH, KC, MVB, H-CN
This work was funded by the Deutsche Forschungsgemeinschaft (DFG – German Research Foundation) within the Research Unit FOR2718: Modal and Amodal Cognition [grant number FOR 2718; project numbers BU 1335/12-1 and NU 265/5-1]. Martin Butz is a member of the Machine Learning Cluster of Excellence, EXC number 2064/1 – Project number 390727645. Additionally, Hans-Christoph Nuerk studies spatial-numerical associations and the SNARC effect in the DFG project NU 265/8-1. We acknowledge support by the Open Access Publishing Fund of the University of Tuebingen.
The authors declare that no competing interests exist.
Data Accessibility Statement
The experimental code, stimuli, participant data, data analysis scripts, additional analyses, and more supplementary material can be accessed on OSF (https://osf.io/u9wer/).
The ethical agreement was given by the Commission for Ethics in Psychological Research of the University of Tuebingen (Nürk_2020_0623_192).
We thank J.-P. van Dijck for clarifying this.
All effect sizes were calculated using the jmv R-package.
The supplementary materials can be accessed on OSF (https://osf.io/u9wer/).
As preregistered, we did not exclude the data of right-to-left readers. However, excluding those participants did not substantially change the results: the magnitude model described significantly more variance and the deviance between prediction and actual data is significantly smaller for the ordinality model.
Further model comparisons are described in supplementary materials S7 and S8. S7 reports results from the analyses of different number sets and number ranges separately. S8 describes results from a comparison of two linear mixed models considering ordinality and magnitude. The Akaike Information Criterion (AICc) was smaller for the magnitude model. In accordance with the reported model comparison, this indicates that the magnitude model described the data more accurately.