Proximity and Expectancy Modulate Response Vigor After Reward Omission

Both humans and some non-human animals tend to respond more vigorously after failing to obtain rewards. Such response invigoration becomes more pronounced when individuals have increased expectations of obtaining rewards during reward pursuit (expectancy), and when they perceive the eventual loss to be proximal to reward receipt (proximity). However, it was unclear whether proximity and expectancy have distinct influences on response vigor. To investigate this question, we developed a computerized 'scratch card' task, in which participants turned three cards one by one and won points when all three cards matched (AAA). After each game, they pressed keys to confirm the outcome and start a new game. We included three types of losses: AAB, where participants had increased expectancy of winning as the game evolved, and the final outcome was proximal to winning; ABB and ABA, with reduced expectancy, but high proximity to winning; and ABC, with reduced expectancy and low proximity to winning. In three online studies, we consistently observed that participants confirmed losses more quickly than wins. Importantly, detailed analyses of the different types of losses revealed that proximity reduced vigor, whereas expectancy increased it. Together, these findings are in line with general appraisal theories: the adjustments of response vigor may be triggered by the appraised discrepancy between the current state and a reference state (e.g., attaining one's goal), and serve to close the gap and facilitate goal pursuit. These findings may also have implications for the effect of 'near misses' on gambling persistence. Further exploring how reward omission impacts response vigor may help us better understand the goal pursuit process, and how it becomes maladaptive under certain circumstances.


Introduction
Action is motivated, as both humans and non-human animals generally seek to maximize reward (while also minimizing punishment). However, they do not always succeed in obtaining rewards. For humans, such failures may range from mundane situations, such as running for and missing a train, to more important ones, such as passing every selection round yet not getting the job. When a desired reward is not obtained, how do individuals respond? In the current research, we focus on this important albeit often overlooked aspect of reward-seeking behavior.
Previous studies suggest that after failing to obtain a reward, humans and (at least some) non-human animals often increase the vigor of ensuing responses (response speed, force, or frequency). In a seminal study by Amsel & Roussel (1952), hungry rats were trained to run through two runways to obtain food pellets at the end of each runway. When the food pellets at the end of the first runway were removed, rats ran through the second runway more quickly, suggesting that reward omission (at least initially) invigorated responses. Response invigoration after reward omission has also been observed in humans. For instance, when human participants were blocked from obtaining a monetary reward, they pressed a button with more force to confirm the outcome (Yu et al., 2014). Similarly, in a gambling context, participants initiated the next trial more quickly after a loss (i.e., reward omission) than after a win or a non-gamble baseline (Eben et al., 2020; Verbruggen et al., 2017; see also Corr & Thompson, 2014; M. J. Dixon et al., 2013; Stange et al., 2016). In the present study, we examined whether the nature of the reward omission or loss matters for response invigoration.
It has been argued that response invigoration after reward omission is associated with certain negative emotions, such as disappointment, frustration, or regret (e.g., Verbruggen et al., 2017;Yu et al., 2014). Appraisal theories of emotion provide a useful framework for understanding both response invigoration and the associated negative emotions after reward omission. According to appraisal theories of emotion, emotions are multicomponential processes that involve changes in various components, such as appraisal of an event, motivational action tendencies, physiological reactions, expressive behaviors (facial, vocal and gestural) and subjective feelings (Moors et al., 2013). Appraisal refers to the process of assessing and evaluating aspects of an event that are of significance to one's well-being (Moors et al., 2013). The most central appraisal is whether an event is promoting or obstructing one's goal(s). Failing to obtain rewards can thus be appraised as a goal-incongruent event, which can lead to certain action tendencies, such as re-engaging and attempting to obtain rewards again (other action tendencies are also possible, such as discarding current reward pursuit, but here we will focus on re-engaging after reward omission). According to Frijda et al. (2014), these action tendencies, or states of action readiness, form the key to what we call "emotions". Action tendencies can manifest in actions with a certain strength and urgency (i.e., response vigor). Actions are considered "impulsive" when they are triggered by appraisals and not preceded by (much) deliberation. The strength or urgency of an (impulsive) action depends on the appraised importance of a goal and the discrepancy between the current state and the desired state (Frijda, 2010). An action becomes more vigorous when a goal is perceived to be more important, and/or when the appraised discrepancy between the current and the desired state becomes larger. 
Since failing to obtain rewards often entails a larger discrepancy than obtaining rewards, responses become more vigorous after reward omission. In addition to changes in response vigor, changes in physiological responses (e.g., an increase in corticosteroid level, and a potential increase in skin conductance response, Otis & Ley, 1993; but see Lole et al., 2012) and expressive behaviors (e.g., frowning after losing in gambling; Wu et al., 2015) can also occur after reward omission. The integrated and synchronized changes in all components may lead to specific subjective feelings that can be categorized and verbalized, giving rise to discrete emotions such as 'frustration', 'disappointment', or 'regret' as their end product (Scherer & Moors, 2019).
Appraisal theories assume that the appraisal of an event, rather than the event itself, determines the nature of the emotion or action triggered by an event (Frijda et al., 2014; Scherer & Moors, 2019; Siemer et al., 2007). Objectively the same event (e.g., failing to obtain a reward) may thus trigger different (intensities of) behavioral responses (and hence, emotions), depending on one's appraisals. For instance, in the famous missed flight example by Kahneman & Tversky (1982), a traveler who missed his flight by 5 minutes is expected to feel worse than another traveler who missed the same flight by 30 minutes. Although the actual outcomes are the same, the first traveler's appraisal of being proximal to the desired state (i.e., catching the flight) may trigger upward counterfactual thinking and intensify his negative emotions. Another notable example of objectively the same outcome triggering different behavioral responses is 'near misses' in some real-life gambling scenarios, such as electronic gambling machines and instant scratch cards. 'Near misses' are losses that come very close to wins (Reid, 1986). For instance, on a three-reel slot machine, the jackpot symbol falling on the pay line on the first two reels and just missing the pay line on the third one is a classic 'near miss'. Although objectively the same as other types of losses in terms of actual outcome, a classic 'near miss' seems especially potent in encouraging persistent gambling (Clark et al., 2009; Cote et al., 2003; Kassinove & Schare, 2001; MacLin et al., 2007; Reid, 1986; Strickland & Grote, 1967). Furthermore, players with gambling disorder show amplified responses in the striatum (a core region of the brain reward circuitry) to 'near misses' compared with players without gambling disorder (Chase & Clark, 2010; Habib & Dixon, 2010; Sescousse et al., 2016). 
The latter finding suggests 'near misses' might contribute to gambling disorder, although the causal relationship between gambling disorder and reactivity to 'near misses' remains unclear.
Two unique cognitive processes may be triggered by the unfolding events in a 'near miss', which may distinguish it from other types of losses. Take again a three-reel slot machine as an example. Players may start a game with a certain expectation of winning. This initial expectation may be influenced by the outcomes of previous games, their overall optimism, experience with gambling, etc. After starting a game, their expectation of winning would presumably increase after seeing the jackpot symbols on the first two reels. When the third reel eventually stops with no jackpot symbol (creating a 'near miss'), the loss may be more unexpected compared to a 'full miss' (where the first or second reel already shows no jackpot symbol). In addition to an increased expectation of winning as the game evolves, a 'near miss' is also subjectively closer to a win compared to a 'full miss' (although objectively a 'near miss' is as proximal to a win as any other types of losses, M. R. Dixon & Schreiber, 2004), as it contains more jackpot symbols than a 'full miss'. Bossuyt et al. (2014) distinguished between these two processes, and referred to them as the appraisals of 'expectancy' and 'proximity', respectively. We will adopt this terminology in the present research for the sake of consistency. Expectancy and proximity thus describe how the players appraise the unfolding events at different stages as a game evolves. Expectancy refers to whether the players' expectations of winning increase from the beginning of a game to the moment before the outcome is revealed, whereas proximity refers to whether the players subjectively perceive a loss to be proximal to a win, after the eventual outcome is revealed. Different outcomes may also influence the initial expectation of winning for the next game, but this is not what we (and Bossuyt et al.) studied here. 
A classic 'near miss' thus triggers an increased expectation of winning as the game evolves, and the final outcome is also subjectively proximal to a win. As noted by Bossuyt et al. (2014), the appraisals of expectancy and proximity are thus confounded in the case of a classic 'near miss'. It is therefore unclear whether the effect of a 'near miss' on gambling persistence is due to one of the two processes, or both. Accordingly, gambling researchers have proposed different theoretical accounts for 'near miss' effects, emphasizing either the proximity of a 'near miss' to a win (see Clark, 2010; Clark et al., 2009; Reid, 1986 and Belisle & Peters et al., 2010), or the increased expectation of winning (hence, the unexpected nature of the eventual loss) in a 'near miss' (M. J. Dixon et al., 2011, 2013; Sharman & Clark, 2016; Stange et al., 2016, 2017). Previous work examining how quickly gamblers initiate a new gamble after a 'near miss' (an indicator of response vigor) has also yielded inconsistent results (Belisle & Dixon, 2016; Daly et al., 2014; M. J. Dixon et al., 2013; Stange et al., 2016, 2017; Worhunsky et al., 2014). This inconsistency may partly be caused by the confounding of proximity and expectancy in the operationalization of a 'near miss'.
To disentangle the influence of proximity and expectancy on response vigor, we used a design similar to the one used by Bossuyt et al. (2014). These authors used a simulated three-reel slot machine to independently manipulate proximity and expectancy. Participants won money when all three reels had the same symbol (e.g., three lemons, three prunes or three melons; AAA), and lost (i.e., no reward) when any two symbols differed. Crucially, by changing the configuration of symbols on loss trials, proximity and expectancy were independently manipulated. The outcome AAB (i.e., a classic 'near miss'), where the last symbol differed from the first two, triggered appraisals of both increased expectancy and high proximity. The outcomes ABB and ABA, while being equally proximal to a win as AAB (i.e., all three outcomes have one symbol that is different from the other two), presumably reduced the expectation of winning as the game evolved, as the mismatch between the first and second reel already indicated that a win was no longer possible. The outcome ABC, where all three symbols differed, led to reduced expectancy of winning during the game, and the final outcome was also distal to a win compared with other outcomes (i.e., reduced expectancy, low proximity). By comparing the different loss outcomes, Bossuyt et al. (2014) showed that proximity and expectancy had separate influences on negative emotions. Participants reported more disappointment, frustration and regret when they lost after having increased compared to reduced expectancy of winning (while controlling for proximity; AAB vs. ABA and ABB). In contrast, proximity did not affect these negative emotions (while controlling for expectancy; ABC vs. ABA and ABB). Instead, participants even rated the situation to be more positive after a proximal loss (ABA and ABB) compared with a distal loss (ABC). In addition to measuring the 'subjective feeling' component, Bossuyt et al. 
(2014) also measured the motivational component, by measuring participants' tendency to repair after a loss. That is, participants could choose to bet in a new gamble to recoup their loss. Both high proximity and increased expectancy led to a higher tendency to repair.
Proximity and expectancy thus have distinct influences on self-reported negative emotions triggered by goal-incongruent events. In the context of reward omission (which we consider to be a goal-incongruent event), this might imply that they also modulate response vigor. Corroborating this idea, Amsel & Ward (1954) noted that rats only showed response invigoration by non-reward after some minimal number of rewards, suggesting that an 'expectancy' construct may need to be developed before reward omission becomes 'frustrative' (see also Penney, 1960). In another study, Haner & Brown (1955) blocked children's progress in a 'marble' game at different stages, and found that children exerted more force when the blocking was introduced near rather than far away from the completion of the game. Similarly, adult participants pressed a button with more force after being blocked from a monetary reward, especially when the blocking occurred close to attaining the reward (Yu et al., 2014). The latter two findings could be interpreted in terms of 'proximity' (i.e., more invigoration when the non-obtained reward was close). However, in these studies, proximity and expectancy were again 'confounded' as participants were blocked at a moment where they were getting close to a reward (i.e., high proximity) and presumably also had increased expectation of acquiring the reward (i.e., increased expectancy). Therefore, it remains unclear how these two appraisals may modulate response vigor after reward omission.

The Current Research
In the current research, we adopted the manipulation from Bossuyt et al. (2014) and explored the influences of proximity and expectancy on response vigor. Although response vigor may be broadly included in the motivational component of emotion (which Bossuyt et al. assessed via the tendency to repair measurement), it likely captures a different aspect of behavior than the tendency to repair. When selecting an action, individuals need to choose both which discrete action to execute (e.g., whether to repair or not), and the vigor with which to execute the selected action. Thus, actions have a 'directional' and an 'activational' component (Braver et al., 2014;Niv et al., 2006), which may have distinct computational and neural underpinnings (Niv et al., 2007). By focusing on response vigor (the activational component), our work thus complements the previous work (Bossuyt et al., 2014) and contributes to an understanding of how distinct appraisals may influence motivational action tendencies after reward omission.
We developed an online 'scratch card' game, and recruited participants from Prolific.co, an online crowd-working platform (Palan & Schitter, 2018). Participants turned three cards out of an array of eight cards one by one, by pressing three keys consecutively. We opted to use a 'scratch card' task rather than a simulated slot machine for a few reasons. First, pragmatically, a 'scratch card' task was easier to program than the spinning reels in a simulated slot machine. Second, by using a 'scratch card' task, we were able to explore whether the response time of turning each card in a game was influenced by the cards turned thus far. Although this is also possible with slot machines (e.g., pressing a button to start the spin of each reel one by one), it would be less ecologically valid. Furthermore, instant scratch cards share many structural characteristics with slot machines (including 'near misses'), which led some researchers to propose that "scratch cards are essentially slot machines on paper" (Ariyabuddhiphongs, 2011). While the structural characteristics of slot machines and their potential influences on gambling have been extensively studied, our work adds to some recent work that extends this line of investigation to other forms of gambling, such as scratch card games (Stange et al., 2016, 2017). In our 'scratch card' game, participants won points when all three cards matched, and lost their wagers when any two (or three) cards differed. They then needed to press a key to confirm the outcome (Confirm RT), and press a key again to start the next trial (Start RT). We registered the response time of each response as our dependent measures. To manipulate proximity and expectancy, following Bossuyt et al. (2014), we included different types of losses, namely AAB (increased expectancy, high proximity), ABB and ABA (reduced expectancy, high proximity), and ABC (reduced expectancy, low proximity). 
Furthermore, we varied the number of points at stake (in Experiments 1 and 2) or the overall probability of winning (in Experiment 3) across trials, to explore their potential influence. Experimental code and materials, raw data files and data analysis scripts can be found at osf.io/vuy95/. For Experiments 1 and 2, we did not seek formal ethics approval, as these experiments were conducted according to the ethical rules presented in the General Ethical Protocol of the Faculty of Psychology and Educational Sciences of Ghent University. Experiment 3 was formally approved by the Ethical Committee (No. 2019/86).

Experiment 1
In Experiment 1, we used an online scratch card task that contained four different outcomes (AAA, AAB, ABB, or ABC) and two amounts of points participants could potentially win (12 or 60 points). AAA was a win; AAB was a loss, but with increased expectancy and high proximity; ABB was a loss, with reduced expectancy but high proximity; and ABC was a loss with reduced expectancy and low proximity. Note that we used three cards with different fruits (oranges, strawberries, and grapes), and each card was randomly assigned the role of A, B and C on each trial. AAA, AAB, ABB and ABC thus denoted the overall configurations rather than specific cards (i.e., AAA could be 3 oranges, 3 strawberries, or 3 grapes).
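The mapping from a sequence of three turned cards to the appraisal labels described above can be sketched as follows. This is only an illustrative sketch: the function names and the returned dictionary are assumptions introduced here, not part of the experimental code, and the labels simply restate the definitions in the text (expectancy depends on whether the first two cards match; proximity depends on whether exactly two of the three cards match).

```python
def configuration(cards):
    """Relabel three cards as A/B/C in order of first appearance (hypothetical helper)."""
    labels = {}
    for card in cards:
        if card not in labels:
            labels[card] = "ABC"[len(labels)]
    return "".join(labels[card] for card in cards)


def classify_outcome(cards):
    """Return the configuration and appraisal labels for three turned cards."""
    config = configuration(cards)
    if config == "AAA":
        # A win: the loss-related appraisal labels do not apply.
        return {"config": config, "win": True, "expectancy": None, "proximity": None}
    # Expectancy increases only if the first two cards match (a win is still possible).
    expectancy = "increased" if cards[0] == cards[1] else "reduced"
    # A loss is proximal to a win if two of the three cards match.
    proximity = "high" if len(set(cards)) == 2 else "low"
    return {"config": config, "win": False, "expectancy": expectancy, "proximity": proximity}


print(classify_outcome(("orange", "orange", "grapes")))
print(classify_outcome(("orange", "grapes", "strawberry")))
```

Note that the sketch also handles ABA, which is described in the general design but is not one of the four outcomes used in Experiment 1.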

Method
We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study (Simmons et al., 2012).

Participants
Participants who met the following criteria on Prolific.co were eligible for participation: (1) between 18 and 50 years old; (2) fluent in English; (3) having at least 70% of previously completed studies approved by researchers on Prolific.co (a low approval rate may indicate low-effort responses); (4) not having participated in any previous studies hosted by the first author. According to Brysbaert (2019), typical effect sizes in psychological research are between Cohen's d = .4 and .3. Using d = .3 as the expected effect size, 90 participants are needed to achieve 80% power with a two-sided paired-samples t test and an alpha level of .05 (G*Power 3.1; Faul et al., 2007). We therefore decided to recruit 100 participants to leave some room for potential exclusions. In total, 104 participants¹ from Prolific.co took part in the experiment (63 males, 40 females, 1 did not report gender). All participants received 3 pounds as compensation, plus another 3 pounds as an extra bonus from the scratch card task (see below).
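As a rough check on the sample size reported above, the required number of participants can be approximated without specialized software. The sketch below is not the G*Power computation (which uses the noncentral t distribution); it uses a normal approximation with a small-sample correction term, and the function name is an assumption introduced here.

```python
from math import ceil
from statistics import NormalDist


def approx_n_paired_t(d, alpha=0.05, power=0.80):
    """Approximate N for a two-sided paired-samples t test.

    Normal approximation plus a correction term (z_alpha^2 / 2);
    an approximation only, not the exact noncentral-t solution.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z(power)          # quantile corresponding to the desired power
    return ceil(((z_alpha + z_power) / d) ** 2 + z_alpha ** 2 / 2)


print(approx_n_paired_t(0.3))  # 90
```

With an effect size of .3, an alpha level of .05 and 80% power, the approximation gives 90 participants, in line with the number stated in the text.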

Apparatus and materials
The experiment was programmed in jsPsych (version 6.0.5), an open-source JavaScript library for creating online behavioral experiments (de Leeuw, 2015). To participate in the experiment, participants needed to have a computer with a keyboard. The experiment only ran in Chrome and Firefox, as browsers other than these two may have compatibility issues.
Although the response times registered by jsPsych in Chrome and Firefox have a lag between 23 and 54 milliseconds, the variability across trials is relatively small (the inter-trial standard deviation of response times caused by browser/operating system configurations varies between 3.23 and 8.37 milliseconds, Bridges et al., 2020), and is comparable to other software widely used to register response times in the laboratory, such as the Psychophysics toolbox (de Leeuw & Motz, 2016) and E-Prime (Hilbig, 2016). Since we manipulated the factors of interest within participants, the potential lags in response times (introduced by different browsers, operating systems, devices and programs running in the background etc.) should be relatively constant across different conditions for each participant. The within-subjects comparisons would therefore not be substantially influenced by the lags in response times.
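The reasoning that a constant lag does not affect within-subject comparisons can be illustrated with a small sketch. All numbers below (the response times and the 40-millisecond lag) are made up for illustration and are not data from the experiment.

```python
# Hypothetical "true" RTs (ms) of one participant in two conditions.
true_rt = {"win": [520, 480, 505], "loss": [470, 450, 490]}
LAG_MS = 40  # assumed constant lag from this participant's browser/OS setup


def mean(values):
    return sum(values) / len(values)


def condition_difference(rts):
    """Mean win RT minus mean loss RT for one participant."""
    return mean(rts["win"]) - mean(rts["loss"])


# Every measured RT is shifted by the same lag, regardless of condition.
measured_rt = {cond: [rt + LAG_MS for rt in rts] for cond, rts in true_rt.items()}

diff_true = condition_difference(true_rt)
diff_measured = condition_difference(measured_rt)
print(round(diff_true, 3), round(diff_measured, 3))
```

The two differences are identical (up to floating-point rounding), because the lag is added to both conditions and cancels in the subtraction. Trial-to-trial variability in the lag does not cancel in this way; it adds noise to the measurements rather than bias.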
On each trial, eight cards were presented (blue or red; 180 × 250 pixels) in a 3 × 3 grid, with no card presented in the center (see Figure 1 for the layout). Cards on the same row were separated by 40 pixels from each other, and rows were separated by 15 pixels vertically. When turned, each card contained a drawing of either a strawberry, some grapes or an orange (selected from Duñabeitia et al., 2018; picture numbers 381, 620, and 695, respectively). The outcome of a trial was predetermined (see below), so we used these three types of cards to create four outcomes with equal probabilities (25% for each outcome): AAA (all three cards contained the same fruit), AAB (the last card differed from the first two), ABB (the first card differed from the last two) and ABC (all three cards differed). Throughout the experiment, the number of points in the balance was constantly updated and presented under the card array (font size: 25 pixels).

Procedure
Participants were recruited via Prolific.co. They first received an informed consent form, to which they needed to agree in order to start the experiment. They were informed that they were going to play a series of 'scratch card' games, in which they would turn 3 cards one by one, and win points if all three cards matched. They were further informed that accumulated points would be converted into real money at the end of the experiment, with 100 points worth 50 British pence and a maximum of 3 pounds as extra bonus.
Participants then started with the scratch card task (see Figure 1 for the trial procedure). Each trial started with a start screen, with the message 'Press any number key to purchase the game (X points) and play to win Y points' printed in the middle of the screen (font size: 30 pixels). Participants could press any of the 10 number keys (from 0 till 9) to start a game. The points required to purchase the game were immediately subtracted from their balance when a key was pressed. We included two types of games with different payoffs (the four outcomes were equally probable within these two types of games). Half of the games cost 2 (X) points per game, and participants could win 12 (Y) points (i.e., the low-amount condition). The other half of the games cost 10 (X) points per game, and participants could win 60 (Y) points (i.e., the high-amount condition). Cards with different colors (either red or blue) were used to indicate the amount of points participants could potentially win, and the assignment of color to each amount condition was randomized across participants.
After a blank screen of 500 milliseconds, eight cards were presented on screen. A question mark was displayed in the center, to prompt participants to choose a card. Participants had to press a number key to select the corresponding card. The card was turned immediately after a response had been registered.
After the first card was turned, the question mark in the center disappeared for 500 milliseconds, during which time participants could not turn the second card yet. When the question mark reappeared, participants could turn one of the remaining cards by pressing the corresponding number key again. This process repeated until participants had turned three cards. The 500-millisecond wait period was included after each response to prevent participants from pressing three keys simultaneously. These wait periods thus ensured that participants would have to turn three cards sequentially, which was crucial for manipulating expectancy.

Figure 1: The trial procedure in the scratch card task

¹ Prolific.co sets a maximum duration for each experiment. Participants who do not finish within the maximum duration are 'timed out' and their slots are re-opened for new participants. Four participants took longer than the maximum duration (but nevertheless finished the experiment), thus resulting in 104 participants in total. In addition, three participants started the experiment but did not finish, and were 'timed out' (not included in the analysis); 13 participants initially reserved a slot but then withdrew; 1 participant was rejected as they could not provide a completion code, and no data were recorded for them.
After all three cards had been turned and the outcome thus revealed (and an extra 500-millisecond wait period had passed), the number of points participants had won on the current trial was displayed in the center, along with the three turned cards. For the low-amount condition, participants received 12 points if all three cards matched; for the high-amount condition, they received 60 points if all three cards matched. For the remaining outcomes, participants received 0 points. But as games had to be 'purchased' at the beginning of each trial (i.e., 2 points in the low-amount condition and 10 points in the high-amount condition), they actually lost points when the cards did not match. Note that the purchase requirement also implied that the net outcome on win trials was 10 or 50 points. The number of points (0, 12, or 60) was displayed in the center of the display, and participants needed to press any of the 10 number keys to confirm the outcome and end the current trial. The start screen of the next trial was then presented after a 500-millisecond blank screen.
Unbeknownst to the participants, the outcome of each game was predetermined (although the specific cards used on each trial were randomly generated). For instance, if a particular trial was assigned to the AAB condition, participants would get this outcome regardless of which three cards they chose to turn. The specific cards that were actually turned were randomly generated (e.g., for an AAB trial, it could be strawberry-strawberry-orange, or orange-orange-grapes etc.). We used a 2 (amount: high vs. low) by 4 (outcome: AAA, AAB, ABB vs. ABC) within-subject design, with each cell containing 30 trials. The whole task consisted of 240 trials.
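How a predetermined configuration can be turned into concrete cards on a given trial is sketched below. The actual experiment was programmed in jsPsych; this Python version is only an illustration of the logic described above, and the function name is an assumption introduced here.

```python
import random

FRUITS = ["orange", "strawberry", "grapes"]


def realize_outcome(config, rng):
    """Randomly assign the three fruits to the roles A, B and C,
    then spell out the predetermined configuration (e.g., 'AAB')."""
    roles = dict(zip("ABC", rng.sample(FRUITS, 3)))
    return [roles[role] for role in config]


rng = random.Random(7)  # seeded only to make the illustration reproducible
for config in ["AAA", "AAB", "ABB", "ABC"]:
    print(config, realize_outcome(config, rng))
```

Because the roles are reassigned on every call, an AAB trial may show strawberry-strawberry-orange on one occasion and orange-orange-grapes on another, while the configuration itself never depends on which three positions the participant chooses to turn.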
Throughout the experiment, the number of points participants had in their balance was constantly displayed at the bottom of the screen (not shown in Figure 1) and updated whenever they purchased a new game (i.e., at the start of a game) or won points (i.e., at the end of a game). Participants had 50 points at the start of the experiment. To ensure that the balance was never below 0, the order of trials was pseudo-randomized. First, in the first 8 games, participants experienced each condition once (i.e., 2 amount levels × 4 outcomes). Thus, by the end of the first 8 games, all participants had 74 points in their balance. Second, the remaining trials were randomized in such a way that the balance never fell below 0. By taking these two measures, we ensured that no participant would experience bankruptcy during play. Since the outcomes of all trials were predetermined, all participants ended up with 770 points at the end of the experiment (equivalent to 3.85 pounds according to the conversion rate). The overall expected value of a game was thus positive in Experiment 1, in that all participants won 720 points in total. This differs from most real-life gambling scenarios, in which the expected value of gambling is negative so that players lose money in the long run. All participants received the maximum extra bonus of 3 pounds, in addition to the 3 pounds for compensation. After the scratch card task, participants filled out the short version of the UPPS-P impulsive behavior scale (Cyders et al., 2014), as part of a larger (different) study that aims to explore individual differences in impulsivity. Data from the UPPS-P scale will not be discussed here (as we considered the sample size too low to study individual differences; see also Brysbaert, 2019). Participants were then debriefed, thanked, and compensated via Prolific.co.
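The balance arithmetic and the pseudo-randomization constraint described above can be sketched as follows. The costs, prizes, starting balance and cell sizes come from the text; the use of rejection sampling (reshuffling until the constraint holds) is an assumption made for this sketch, as the text does not specify how the valid orders were generated, and all function names are hypothetical.

```python
import random

START_BALANCE = 50
# (cost to purchase a game, points paid out on a win) per amount condition
AMOUNTS = {"low": (2, 12), "high": (10, 60)}
OUTCOMES = ["AAA", "AAB", "ABB", "ABC"]  # only AAA pays out
TRIALS_PER_CELL = 30


def balance_trace(trials, start=START_BALANCE):
    """Return (lowest balance reached, final balance) for an ordered trial list."""
    balance = lowest = start
    for amount, outcome in trials:
        cost, prize = AMOUNTS[amount]
        balance -= cost  # the purchase is deducted at the start of a game
        lowest = min(lowest, balance)
        if outcome == "AAA":
            balance += prize
    return lowest, balance


def make_trial_order(rng, max_attempts=1000):
    """First 8 games: each cell once. Remaining games: reshuffle until the
    balance never drops below 0 (rejection sampling, assumed for this sketch)."""
    cells = [(amount, outcome) for amount in AMOUNTS for outcome in OUTCOMES]
    first_block = cells[:]
    rng.shuffle(first_block)
    rest = cells * (TRIALS_PER_CELL - 1)
    for _ in range(max_attempts):
        rng.shuffle(rest)
        order = first_block + rest
        if balance_trace(order)[0] >= 0:
            return order
    raise RuntimeError("no valid trial order found")


order = make_trial_order(random.Random(2024))
print(balance_trace(order[:8])[1])  # 74
print(balance_trace(order)[1])      # 770
```

The two printed balances do not depend on the particular order: the first 8 games cost 48 points and pay out 72 (50 - 48 + 72 = 74), and the full set of 240 games costs 1440 points and pays out 2160, a net gain of 720 points on top of the initial 50.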

Data Analysis
Data analyses and reporting were conducted in R (3.6.2; R Core Team, 2019), with the R packages afex (

Data Cleaning
Each trial contained five responses: the response to start the current trial, the responses to turn the first, second and third card, and the response to confirm the outcome. We measured the latencies of all responses (from the end of the preceding wait period until the moment a response was made). However, because we were interested in how vigor is modulated by unfolding events, a single 'episode' consisted of eight responses: the start response of trial n, the responses required to turn the three cards on trial n, the confirmation response of trial n, the start response of trial n+1 (as in Verbruggen et al., 2017), and the responses required to turn the first and second card on trial n+1. Note that after the second card has been turned, the outcome of a game is partially revealed. We thus did not include the RT of turning the third card on trial n+1, as this RT may be influenced by both the outcome of trial n and the partial outcome of trial n+1, and we did not have enough trials to examine this interaction effect. Trials that were not immediately followed by another trial (i.e., the last trial, and a few other trials due to missing data) were deleted first. For the remaining episodes, we only included those in which all eight RTs were below 5 seconds. This exclusion criterion was predetermined and consistent with previous studies (Eben et al., 2020; Verbruggen et al., 2017). It removed episodes in which participants might have taken a short break (resulting in RTs of 5 seconds or longer). After applying these exclusion criteria, we counted the number of episodes left for each amount and outcome combination for each participant. Each cell originally contained 30 trials. To obtain a reliable estimate of RT, participants needed to have at least 15 episodes left in each cell to be included in the analyses.
Based on this criterion, data from 2 participants were excluded, leaving 102 participants in the final analysis. For the remaining participants, on average 6.65% of episodes were excluded (SD = 7.08%).
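The exclusion steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' R pipeline: the record layout (a dict with `participant`, `amount`, `outcome` and a list of eight `rts` in milliseconds) and the function names are hypothetical.

```python
from collections import defaultdict

RT_LIMIT_MS = 5000           # all eight RTs must be below 5 s
MIN_EPISODES_PER_CELL = 15   # Experiment 1 threshold (cells start with 30 trials)


def clean_episodes(episodes):
    """Keep complete episodes whose eight RTs are all below the limit."""
    return [ep for ep in episodes
            if len(ep["rts"]) == 8 and all(rt < RT_LIMIT_MS for rt in ep["rts"])]


def included_participants(episodes, cells):
    """Return participants with enough clean episodes in every design cell."""
    counts = defaultdict(lambda: defaultdict(int))
    for ep in clean_episodes(episodes):
        counts[ep["participant"]][(ep["amount"], ep["outcome"])] += 1
    participants = {ep["participant"] for ep in episodes}
    return {p for p in participants
            if all(counts[p][cell] >= MIN_EPISODES_PER_CELL for cell in cells)}
```

Here `cells` would be the list of amount and outcome combinations, for example `[("high", "AAA"), ("low", "AAA"), ...]`. Changing `MIN_EPISODES_PER_CELL` gives the variants used in the later experiments.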

Main Data Analysis
For the main analysis reported in the manuscript, we focused on the RTs of turning the third card (card3 RT), confirming the outcome of the current trial (confirm RT) and starting the next trial (start RT). Confirm RT and start RT allowed us to explore the effects of proximity and expectancy on response vigor. We included card3 RT to check whether participants were paying attention to each turned card during the game (by determining if there was a difference between AA* and AB* sequences). In the Appendix, we provide an overview of how RT varied across all eight stages as a function of the amount (high vs. low) and outcome (AAA, AAB, ABB vs. ABC) of the current game.
Card3 RT, confirm RT and start RT were submitted to repeated-measures ANOVAs, with outcome (AA* vs. AB* for card3 RT, as the third card had not been turned yet; and AAA, AAB, ABB vs. ABC for confirm and start RT) and amount (high vs. low) as within-subject factors. Results of these repeated-measures ANOVAs are shown in Table 1. We then conducted a series of paired-samples t tests (and their Bayesian equivalent) to further explore the effects of outcome and amount. To explore the effect of outcome, we first combined high- and low-amount trials, and then compared each outcome against all other outcomes (see Table 2). To explore the effect of amount, we compared high-amount trials against low-amount trials within each outcome (see Table 3). For each comparison, we conducted a paired-samples t test, a Wilcoxon signed-rank test (to check whether the result of the paired-samples t test is robust against non-normal distributions), and a Bayesian paired-samples t test (to quantify the relative strength of evidence for two competing hypotheses; Dienes, 2014). A Bayes factor (BF) of B indicates that the data are B times more likely under the alternative hypothesis (using the default prior, a Cauchy distribution with width 0.707) than under the null hypothesis. We corrected p values for multiple comparisons using the Holm-Bonferroni method, for paired-samples t tests and Wilcoxon signed-rank tests separately, and for the effects of outcome and amount at each stage separately. Hedges's g_av was reported as the effect size (Lakens, 2013). Due to the large number of comparisons, we based our statistical inference on both the p values (after correcting for multiple comparisons) and the BF. Both corrected p values had to be smaller than .05 and the BF larger than 3 to show substantial relative support for the alternative hypothesis; a BF smaller than 1/3 was taken to show substantial relative support for the null hypothesis.
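The correction and decision rule just described can be illustrated with a minimal Python sketch. The Holm-Bonferroni step-down procedure is standard; the function names and the label "inconclusive" are hypothetical, and the lower cut-off of 1/3 is assumed here as the conventional counterpart of 3 (Dienes, 2014). The analyses themselves were run in R and JASP, not with this code.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def classify_evidence(p_t, p_wilcoxon, bf, alpha=0.05):
    """Combine both corrected p values with the Bayes factor for one comparison."""
    if p_t < alpha and p_wilcoxon < alpha and bf > 3:
        return "alternative"
    if bf < 1 / 3:
        return "null"
    return "inconclusive"
```

For example, `holm_adjust([0.01, 0.04, 0.03])` yields adjusted values of approximately 0.03, 0.06 and 0.06, so only the first comparison would remain below .05.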
While the BF is a continuous measure of the relative likelihood of the data under two hypotheses, we adopted the conventional cut-off values of 3 and 1/3 as substantial evidence for the alternative and the null, respectively, to facilitate statistical inference (Dienes, 2014). Furthermore, we conducted robustness checks of the BF in JASP (0.11.1, JASP Team, 2019), using different prior widths. The qualitative conclusions based on the cut-off values of 3 and 1/3 remained the same with different priors. The results of these robustness checks can be found at https://osf.io/erhft/, https://osf.io/a6kep/, and https://osf.io/fxjhn/, for the three experiments respectively. For completeness, in the tables we report the results of all comparisons. In the Results section below, we will focus on comparisons that are theoretically informative.

Note. diff = difference in response time (RT) between the high-amount and low-amount condition, in milliseconds (a positive value indicates longer RT in the high-amount condition); lowerCI = lower limit of the 95% confidence interval; upperCI = upper limit of the 95% confidence interval. The t value and the first p value are from the paired-samples t test; the second p value is from the Wilcoxon signed-rank test. Both p values are corrected for multiple comparisons within each stage using the Holm-Bonferroni method. BF = Bayes factor, the likelihood of obtaining the current data under the alternative hypothesis, divided by the likelihood of obtaining the current data under the null hypothesis (the default prior, a Cauchy distribution with width 0.707, was used); g_av = Hedges's average g.

Card3 RT
For card3 RT, only the main effect of outcome was statistically significant (Table 1). Participants turned the third card more quickly when the first two cards differed (i.e., AB*) than when the first two cards matched (i.e., AA*). This effect cannot be easily explained by perceptual processes (e.g., participants identifying AB* more quickly than AA*), as previous work on simple perceptual-matching tasks indicates that people tend to match identical pairs of stimuli faster than different pairs (Goulet & Cousineau, 2020; Nickerson, 1967), which is opposite to what we observed here. Instead, the faster responses after AB* relative to AA* may be explained by (potential) wins and losses. Response vigor might increase after AB* (as AB* indicated a certain loss), consistent with the 'increased vigor after a loss' hypothesis. Alternatively, participants might strategically slow down after AA* to think about which card to turn next. These processes are not necessarily mutually exclusive, and may simultaneously contribute to the difference in card3 RT after AA* versus AB*. Even though our data do not allow us to disentangle these processes, this difference in card3 RT after AA* versus AB* shows that participants were paying attention to each turned card as they went through a game. This is an important prerequisite for our manipulation of expectancy, in which we assumed that participants paid attention to each turned card and adjusted their expectation of winning accordingly as a game evolved.

Confirm RT
For confirm RT, both the main effect of outcome and the interaction effect with amount were statistically significant. Concerning the main effect of outcome, participants overall confirmed losses more quickly than wins (Table 2). More importantly, reward proximity decreased response vigor, such that participants confirmed ABB (high proximity, reduced expectancy) more slowly than ABC (low proximity, reduced expectancy), while reward expectancy seemed to increase response vigor, such that participants confirmed AAB (increased expectancy, high proximity) more quickly than ABB (reduced expectancy, high proximity); confirm RT: AAA > ABB > AAB ≈ ABC. The confirm RTs for AAB and ABC did not differ significantly from each other, and the Bayes factor provided substantial evidence for the null hypothesis (BF < 1/3; Dienes, 2014).
The interaction effect between outcome and amount was driven by the effect of amount within the win condition only (i.e., AAA). Participants confirmed a win of 60 points more slowly than a win of 12 points (in line with the post-reinforcement pause effect, Felton & Lyon, 1966), whereas amount did not modulate confirm RT for losses. This lack of modulation by amount on loss trials (especially on the AAB trials) might be surprising, as one might argue that expecting to win 60 points and then having that hope dashed would induce more 'frustration', and hence more vigorous responding, than expecting to win 12 points. In line with this speculation, Verbruggen et al. (2017) found that participants initiated the next trial more quickly after failing to obtain a high amount of points compared to a low amount of points. Furthermore, Yu et al. (2014) found that participants pressed buttons with more force after being blocked from obtaining 2 pounds than 20 pence. In contrast to these previous findings, we did not find such a modulation of response vigor by amount after reward omission.

Start RT
For start RT, again both the main effect of outcome and the interaction effect were statistically significant. Concerning the comparisons between different outcomes, only the differences between wins and losses were statistically significant (start RT: AAA > ABB ≈ AAB ≈ ABC). Different loss outcomes did not differ significantly from each other (see Table 2). The interaction effect was again driven by the effect of amount within AAA only (Table 3).
Wins versus losses seem to have a relatively persistent influence on response vigor. Our exploratory analyses showed that participants turned the first and second card of the next game more slowly after a win than after a loss (see Figure A1 in the Appendix). In contrast, the effects of expectancy and proximity seem relatively short-lived and are restricted to the confirm RT in the current experiment.

Discussion
High proximity reduced response vigor, such that participants confirmed ABB more slowly than ABC. This effect suggested that participants still paid attention to the last card, even after they already knew after turning the first two mismatching cards that they had lost (i.e., AB*). By contrast, increased expectancy seemed to increase response vigor, as indicated by the difference between AAB and ABB. No difference between AAB and ABC in confirm RT was observed, which might suggest that the opposing effects of expectancy and proximity canceled each other out.
In Experiment 1, we used ABB as the loss condition with reduced expectancy and high proximity. However, this design choice may have introduced a confound. That is, for ABB, participants turned two matching cards in succession (i.e., the third card matches the second card), while for ABC, participants never turned two matching cards consecutively. The difference in confirm RT between ABB and ABC might therefore be caused by this confound, rather than by the appraisal of proximity per se. To address this question, in Experiment 2 we added ABA as an outcome. ABA is a loss with reduced expectancy and high proximity (like ABB), but without having two matching cards in succession. Thus, in both ABA and ABC, each turned card differs from the preceding card. The comparison between ABA and ABC would therefore allow us to address this potential confound in Experiment 2.

Experiment 2
Experiment 2 was a replication and extension of Experiment 1. We used the same materials, procedure and data analysis methods as in Experiment 1. The major difference was that in Experiment 2, we included an extra loss outcome, namely ABA (i.e., the second card differed from the first and the third card), to further explore whether the difference in confirm RT between ABB and ABC depended on turning two matching cards consecutively.
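The way card sequences map onto the design factors can be made explicit with a small Python sketch. The function name and the returned keys are hypothetical; the coding itself follows the definitions in the text (expectancy depends on whether the first two cards match, proximity on whether a loss contains two matching cards), and the extra flag marks the confound that ABA was added to rule out.

```python
def classify_outcome(cards):
    """Label a three-card sequence such as "ABA" on the design factors."""
    first, second, third = cards
    is_win = first == second == third
    return {
        "outcome": "win" if is_win else "loss",
        # Expectancy of winning rises after AA* and drops after AB*.
        "expectancy": "increased" if first == second else "reduced",
        # A loss is proximal to a win when two of the three cards match.
        "proximity": None if is_win else ("high" if len(set(cards)) == 2 else "low"),
        # True when the third card matches the second (ABB, but not ABA or ABC).
        "repeats_previous_card": third == second,
    }
```

Under this coding, ABB and ABA share the same expectancy and proximity labels and differ only in `repeats_previous_card`, which is why the ABA versus ABC comparison isolates proximity from the consecutive-match confound.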

Participants
The same eligibility criteria as in Experiment 1 were used to recruit participants on Prolific.co. In total, 149 participants from Prolific.co took part in the experiment (90 males, 58 females, 1 did not report their gender). For this experiment, we initially planned to recruit 100 participants. However, after collecting data from 100 participants, the Bayesian comparisons of confirm RT between ABA and AAB, and between ABA and ABC, only showed 'anecdotal' evidence for the effects. We therefore decided to add another 50 participants. Data from 1 participant were not recorded, leaving 49 extra participants in total (see Footnote 2). To correct for sequential testing, in Experiment 2 we adopted an alpha level of .025 (.05/2 = .025) for statistical significance. Furthermore, we used JASP (0.11.1; JASP Team, 2019) to conduct Bayesian sequential tests on confirm RT between different loss outcomes, to show how the BF changed after adding 49 more participants. JASP uses the BayesFactor package (Morey & Rouder, 2018) for calculating BFs, but also provides the option to easily conduct and visualize sequential tests. The results of the sequential analyses can be found in Figures A4, A5, and A6 in the Appendix.

Apparatus, materials and procedure
The same apparatus, materials and procedure as in Experiment 1 were used. The only difference was that instead of four possible outcomes, the current experiment included six types of outcomes, namely AAA (win), AAB, AAC, ABA, ABB, and ABC. Note that the AAB and AAC conditions are equivalent. Both were included to achieve a more balanced design: after both AA* and AB*, there was an equal chance of the third card being A, B or C. Since we increased the overall number of outcomes from four to six, we reduced the number of trials per cell from 30 to 20, to keep the total number of trials the same (240 trials). Because of these changes, the overall probability of winning decreased from 1/4 to 1/6. To make sure participants would win the same amount of points (i.e., 770 points including the initial 50 points), we reduced the wager such that one low-amount game cost 1 point and one high-amount game cost 5 points (as opposed to 2 and 10 points in Experiment 1). The amount of points won for each AAA outcome remained the same (12 and 60 points). The overall expected value of a game in Experiment 2 was thus again positive (i.e., participants won points in the long run). The rest of the procedure remained the same.
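The point totals stated above can be checked with a short calculation. The sketch below assumes that the wager is paid on every game (wins included) and that each outcome cell contains the same number of trials per amount level; the function and parameter names are hypothetical.

```python
def final_balance(n_outcomes, trials_per_cell, cost_low, cost_high,
                  win_low=12, win_high=60, start=50):
    """Final point balance for a design with one AAA cell per amount level."""
    games_per_amount = n_outcomes * trials_per_cell   # games at each amount level
    spent = games_per_amount * (cost_low + cost_high)  # wager paid on every game
    won = trials_per_cell * (win_low + win_high)       # AAA trials pay out
    return start + won - spent


experiment_1 = final_balance(n_outcomes=4, trials_per_cell=30, cost_low=2, cost_high=10)
experiment_2 = final_balance(n_outcomes=6, trials_per_cell=20, cost_low=1, cost_high=5)
print(experiment_1, experiment_2)  # 770 770
```

Under these assumptions both designs end at 770 points, and the win probability equals one over the number of outcomes (1/4 with four outcomes, 1/6 with six), which is why halving the wagers compensates for the lower win rate.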

Data Analysis
The same procedure was used for preparing RT data. One difference was that participants now needed to have at least 10 episodes left per cell in order to be included in the analysis. Data from 7 participants were excluded, leaving 142 participants in the analysis. For the included participants, on average 6.54% of episodes were excluded (SD = 6.13%). The same data analyses as in Experiment 1 were used. For all comparisons, the AAB and AAC conditions were combined into AAB, as the two were effectively the same.

Results & Discussion
Card3 RT
The main effects of outcome and amount were both statistically significant. Replicating the result of Experiment 1, participants turned the third card more quickly when the first two cards did not match (Table 5). Overall, participants turned the third card more slowly when there was a high amount at stake, and this effect of amount seemed to be more pronounced when the first two cards matched (Table 6; the interaction effect between outcome and amount was not statistically significant though).

Confirm RT
For confirm RT, the main effects of outcome, amount, and the interaction effect were all statistically significant. Replicating the findings in Experiment 1, participants confirmed losses more quickly than wins (confirm RT: AAA > ABB ≈ ABA > AAB ≈ ABC). They again confirmed AAB more quickly than ABB (i.e., increased expectancy seemed to increase vigor), and ABB more slowly than ABC (i.e., high proximity seemed to decrease vigor), while no difference between AAB and ABC was observed. The crucial tests for Experiment 2 were the comparisons between ABA and the other loss outcomes. Overall, confirm RT for ABA did not differ from ABB. Participants confirmed AAB more quickly than ABA (i.e., effect of expectancy), and ABA more slowly than ABC (i.e., effect of proximity) (Table 5). Using both ABB and ABA as the high-proximity, reduced-expectancy condition, we therefore replicated the (opposing) effects of expectancy and proximity on confirm RT. The interaction effect between outcome and amount again seemed to be driven by slower confirmation of large wins than small wins. For the losses, amount again did not modulate confirm RT (Table 6).

Footnote 2: In total, 15 participants reserved a slot and then withdrew; 2 participants were 'timed out'. We do not have data from these participants.

Note. Degrees of freedom given as decimal numbers are Greenhouse-Geisser corrected; the table also reports mean-squared errors and generalized eta-squared.

Figure 3: Card3, Confirm and Start RT in Experiment 2. Error bars stand for 95% within-subject confidence intervals. For card3 RT, the third card is unresolved: AAA, AAB, AAC = AA*; ABA, ABB, ABC = AB*.
Overall, this pattern of results showed that the appraisal of high proximity reduces response vigor, even when participants do not turn two matching cards in succession (as in the case of ABA). These results thus ruled out an alternative explanation, and strengthened the conclusion that high proximity reduces response vigor after reward omission. One important caveat for both the effects of proximity and expectancy is that we only tested the effect of proximity within losses that had reduced expectancy of winning, and the effect of expectancy within losses that were proximal to a win. Our design thus lacked a low-proximity, increased-expectancy outcome (which does not seem feasible with our current task), which makes the interpretation of the results on confirm RT difficult. For instance, we observed no difference in confirm RT between a high-proximity, increased-expectancy outcome (AAB) and a low-proximity, reduced-expectancy outcome (ABC). This could indicate that the opposing effects of expectancy (increased vigor) and proximity (decreased vigor) cancel each other out on AAB trials. Alternatively, the effects of proximity and expectancy may be conditional on each other, and may jointly reduce response vigor only for high-proximity, reduced-expectancy outcomes (i.e., ABB). In other words, the lack of difference between AAB and ABC may suggest there are no separate effects of proximity and expectancy (only a joint effect). The start RT findings reported in the following section provide some support for the first account, which assumes opposing effects of proximity and expectancy on AAB trials.

Start RT
For start RT, the two main effects and the interaction effect were statistically significant. Participants started the next trial more quickly after a loss than after a win (Table 5), and they 'paused' longer after a big win than after a small win (Table 6). Amount did not influence start RT within losses. The comparisons among different losses, though, showed an interesting change. Participants started a new game more quickly after AAB than after both ABB and ABC, whereas the latter two did not differ (start RT: AAB < ABC ≈ ABB ≈ ABA; note that the start RT was shorter for AAB than for ABA, but the difference did not reach statistical significance). Remember that for confirm RT, we observed a slightly different pattern (AAB ≈ ABC < ABB ≈ ABA). This quicker initiation of a new game after AAB than after ABC (and ABB) could indicate that the slowing effect of proximity dissipated more quickly than the invigoration effect of expectancy. This in turn suggests that both the effect of proximity and the effect of expectancy may indeed be present in the first place in confirm RT for AAB, but may have canceled each other out.

Experiment 3
In Experiments 1 and 2, we consistently found that high proximity of a loss to a win reduced response vigor, while increased expectancy of reward as games unfolded seemed to increase it. Expectancy varies as a game evolves, and this variation in expectancy may also be influenced by the overall probability of winning. In Experiment 3, we thus manipulated the winning probability across games, to explore its potential influence on response vigor.

Participants
In total, 103 participants participated in Experiment 3. Two participants finished more than half of the trials before restarting the experiment and were therefore excluded (see Footnote 3). Another 2 participants had fewer than 100 trials recorded and were also excluded. The final sample consisted of 99 participants (54 males, 45 females).

Apparatus, materials and procedure
The same apparatus, materials and general procedure as in the previous two experiments were used. The main difference was that instead of manipulating the amount of potential wins (which did not seem to influence the effects of proximity and expectancy much after a loss), we manipulated the probability of winning across games. Participants were informed that they would play the scratch card game in three different casinos, where the chance of winning would be high, medium or low, respectively. Cards of different colors (blue, red or green) were used in the three casinos to indicate the probability of winning. The assignment of each card color to each casino was randomized across participants. Participants played 72 trials in each casino. For the trial distribution in each condition, see Table 7. The overall probability of winning differed across the three casinos, and so did the conditional probability of winning after getting AA*. Thus, if participants successfully learned the different winning probabilities of the three casinos, their expectancy of winning might show the largest increase after getting AA* in the high win probability condition (where the chance of getting another A for the last card was highest), followed by the medium win probability condition, and then the low win probability condition.
The order of visiting the three casinos was randomly generated at the beginning of the experiment and then fixed for each participant. Participants always played all games in a specific casino before switching to another one. The order of games was randomized within each casino, with the constraint that the balance never fell below 0. Note that since in Experiment 2 we observed no difference between ABB and ABA, in Experiment 3 we again only included ABB as the high-proximity, reduced-expectancy condition (as in Experiment 1), to reduce the number of outcomes we needed to include. The rest of the procedure remained the same as in Experiment 1. Each trial started with the message "Press any number key to purchase the game (5 points). Chance of winning 30 points: XYZ" (XYZ could be low, medium or high) printed on screen. Note that the amount of points participants could win on a single trial was adjusted to 30, so that they would end up with a comparable amount of points (710) as in the previous experiments. The expected value of a game in both the high and medium win probability 'casinos' was positive (participants won 600 and 180 points, respectively), while the expected value of a game in the low win probability 'casino' was negative (participants lost 120 points in total). In addition to showing the balance, the win probability level of the current casino was also displayed at the bottom of the screen (win odds: low, medium or high). The rest of the trial procedure remained the same.

Footnote 3: Another participant finished 3 trials before restarting the experiment. Data from this participant were included. In addition, 11 participants initially reserved a slot and then withdrew; 3 participants were 'timed out'. We do not have data from these participants.
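The net point totals reported for the three casinos constrain how many winning trials each casino must have contained. The Python sketch below works this out, assuming that 5 points are paid on each of the 72 games and that participants started with 50 points as in the earlier experiments; the function and variable names are hypothetical, and Table 7 remains the authoritative source for the trial distribution.

```python
from fractions import Fraction

TRIALS_PER_CASINO = 72
COST_PER_GAME = 5   # points paid to purchase each game
WIN_POINTS = 30     # points received for an AAA outcome


def implied_wins(net_points):
    """Number of AAA trials implied by a casino's net point total."""
    gross = net_points + TRIALS_PER_CASINO * COST_PER_GAME
    wins, remainder = divmod(gross, WIN_POINTS)
    if remainder:
        raise ValueError("net total is not consistent with a whole number of wins")
    return wins


net_by_casino = {"high": 600, "medium": 180, "low": -120}
wins = {name: implied_wins(net) for name, net in net_by_casino.items()}
win_prob = {name: Fraction(n, TRIALS_PER_CASINO) for name, n in wins.items()}
final = 50 + sum(net_by_casino.values())  # assumed starting balance of 50 points

print(wins)   # {'high': 32, 'medium': 18, 'low': 8}
print(final)  # 710
```

Under these assumptions the implied overall win probabilities are 32/72, 18/72 and 8/72 for the high, medium and low casinos, and the final balance matches the 710 points mentioned in the text.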
As a manipulation check, after the scratch card task, participants were presented with the three cards used in the three casinos again, one by one. For each card, they were asked to estimate how often they thought they had won in the casino that used that card, on a slider from 0% to 100%. They then filled out the UPPS-P impulsive behavior scale (Cyders et al., 2014), and were debriefed, thanked and compensated.

Data Cleaning
Since the low win probability condition contained only 8 AAA trials, and the high win probability condition contained only 8 ABC trials, we used a less strict rule for data exclusion. Participants needed to have at least 5 episodes in each cell in order to be included in the analysis (see also Verbruggen et al., 2017). Data from 5 participants were excluded, leaving 94 participants in the following analysis. For the remaining participants, on average 4.89% of episodes were excluded (SD = 6.06%).

Main Data Analysis
Card3, confirm and start RT were submitted to repeated-measures ANOVAs, with win probability and outcome as within-subject factors. To explore the simple main effects of outcome, the three win probability conditions were combined, and each outcome was contrasted with all other outcomes (Table 10). To explore the simple main effects of win probability, we contrasted high vs. medium, high vs. low and medium vs. low win probability within each outcome (Table 11). We used the same analyses as in previous experiments for each comparison, and reported all of them for completeness.

Estimation of Win Probability
As a manipulation check, after the scratch card task, participants estimated how often they had won in each of the casinos. We compared participants' estimates of win probability across conditions, and also compared the estimated win probability against the true win probability for each condition (see Table 8). Overall, participants tended to over-estimate the probability of winning, and the over-estimation was more pronounced when the real win probability was relatively low (see the test Est vs. Real in Table 8). However, they did estimate the win probability to be the highest for the high win probability condition, followed by the medium win probability condition, and to be the lowest for the low win probability condition. These results thus suggest that our manipulation was successful and participants perceived differences in win probabilities as expected. However, the perceived differences in win probabilities seemed to be smaller than the actual differences between conditions.

Card3 RT
For card3 RT, the main effect of outcome, the main effect of win probability, and the interaction effect were all statistically significant. Again, participants responded more slowly after AA* than after AB* (Table 10), supporting our assumption that they paid attention to each turned card as they progressed in a game. Win probability modulated response vigor, such that less frequent events (e.g., AA* in the low win probability condition and AB* in the high win probability condition) led to longer card3 RT (Table 11; see also Figure 4, left panel). However, the modulation effect by win probability seemed larger for potential wins (i.e., AA*) than for losses (i.e., AB*). Part of the slowing down after infrequent events may be explained by surprise (e.g., attentional reorienting; Notebaert et al., 2009). This modulation of card3 RT by win probability shows that participants processed the win probability information during the game.

Table 9: Repeated-Measures ANOVAs in Experiment 3

Note. The three winning conditions are combined. diff = difference in response time (RT), in milliseconds; lowerCI = lower limit of the 95% confidence interval; upperCI = upper limit of the 95% confidence interval. The t value and the first p value are from the paired-samples t test; the second p value is from the Wilcoxon signed-rank test. Both p values are corrected for multiple comparisons within each stage using the Holm-Bonferroni method. BF = Bayes factor, the likelihood of obtaining the current data under the alternative hypothesis, divided by the likelihood of obtaining the current data under the null hypothesis (the default prior, a Cauchy distribution with width 0.707, was used); g_av = Hedges's average g.

Confirm RT
For confirm RT, the main effect of outcome and the interaction effect were statistically significant. For the main effect of outcome, previous results were largely replicated (Table 10). Participants overall confirmed losses more quickly than wins, and confirmed ABB more slowly than ABC (i.e., the effect of proximity), while no difference between AAB and ABC was found (potential cancellation of expectancy and proximity effects). One deviation from the previous findings, though, was that the comparison between AAB and ABB using a paired-samples t test was not statistically significant (after correcting for multiple comparisons). The Bayes factor was also uninformative. However, the non-parametric Wilcoxon signed-rank test did show a statistically significant difference, with shorter confirm RT for AAB than for ABB. Although the effect was in the same direction as those observed in Experiments 1 and 2, it should be interpreted with caution, as the global manipulation of expectations of winning (via different overall win probabilities) in the current experiment might interfere with the local manipulation of expectancy (via card sequences).

Figure 4: Card3, Confirm and Start RT in Experiment 3. Error bars stand for 95% within-subject confidence intervals. For card3 RT, the third card is unresolved: AAA, AAB = AA*; ABB, ABC = AB*.
The interaction effect between outcome and win probability was driven by the modulation of confirm RT by win probability for wins (AAA), but not for losses (Table 11). Participants 'paused' longer after a win when the overall win probability was low. Surprisingly, although card3 RT was modulated by win probability after AA* (suggesting potential difference in participants' expectation of winning), this modulation effect was not observed for confirm RT of AAB, when participants eventually lost. Our global manipulation of expectations of winning thus failed to influence response vigor after AAB.

Start RT
For start RT, only the main effect of outcome was statistically significant. Interestingly, start RT was again faster after AAB than after ABB and ABC (Table 10; AAB < ABB and AAB < ABC), consistent with what was observed in Experiment 2. The potential response invigoration effect of increased expectancy may therefore be more persistent than the slowing effect of high proximity.

General Discussion
The present study explored the effect of proximity and expectancy on response vigor after reward omission. Participants played a series of online 'scratch card' games, in which they turned three cards one by one, and won points when all three cards matched (and lost their wager when there was a mismatch). By varying the sequence of cards on the loss trials, we independently manipulated the appraisals of proximity and expectancy. Across three experiments, we consistently found that (1) participants confirmed the outcome and started a new game more quickly after a loss than after a win; (2) when confirming a loss, high proximity reduced response vigor (i.e., for confirm RT: high-proximity, reduced-expectancy [ABB and ABA] > low-proximity, reduced-expectancy [ABC]), while increased expectancy seemed to increase response vigor (confirm RT: high-proximity, increased-expectancy [AAB] < high-proximity, reduced-expectancy [ABB and ABA]) and might cancel out the opposing effect of proximity (confirm RT: high-proximity, increased-expectancy [AAB] ≈ low-proximity, reduced-expectancy [ABC]); (3) the amount of reward at stake and the overall probability of winning influenced confirm RT and start RT after wins, but not after losses. Below we discuss the implications of these findings in detail.

Persistent Influence of Wins and Losses on Response Vigor
After turning the last card in a game, participants needed to confirm the outcome and start a new game. For both responses, we observed shorter RTs after a loss than after a win. This difference probably reflects a modulation of response vigor by the presence and absence of reward. Due to the lack of a non-gamble baseline, we could not distinguish between decreased response vigor triggered by reward (i.e., post-reinforcement pause) and increased response vigor triggered by non-reward (i.e., post-loss speeding). However, post-reinforcement pause likely contributed to the difference between wins and losses. After all, participants responded more slowly after getting a large reward than a small reward, which is in line with previous findings on post-reinforcement pause (M. J. Dixon et al., 2013). As wins of high and low amounts occurred with equal probability, it seems unlikely that this difference was caused by response slowing after unexpected events.
As shown in the Appendix, the effects of wins vs. losses persisted further into the next game, such that participants turned the first and the second card of the next game more quickly after losses compared to wins (see Figures A1, A2, and A3 in the Appendix). To the extent that participants (erroneously) believed that there were hidden rules in the game and that they could influence the outcome by discovering and exploiting these rules, increasing the speed of turning cards after losses seems counter-intuitive. Indeed, recent work employing a rock-paper-scissors game has shown that participants initiated a new game more quickly after a loss in an uncontrollable game, but initiated a new game more slowly after a loss when there was a certain pattern in their opponents' responses that they could successfully exploit (Dyson et al., 2018). This prolonged initiation time after a loss against exploitable opponents might partly result from more strategic thinking about one's next move. The 'scratch card' task used here does not contain such exploitable patterns, so it is possible that participants eventually realized that they had no control over the outcome, and thus forwent any strategic thinking about which card to turn next. Furthermore, since participants were accumulating points throughout the experiment, they might also have gradually become less sensitive to wins and losses. In line with this speculation, in all three experiments participants overall responded more quickly in the second half of the experiment than in the first half. Moreover, the effects of wins and losses on confirm and start RTs appeared to be smaller in the second half (although even in the second half, participants still responded more quickly after a loss than after a win; see the exploratory analyses on osf.io/vuy95/). Overall, these results showed that wins vs. losses had a persistent influence on response vigor, from confirming the outcome of the current game through turning the second card of the next game, spanning 4 consecutive responses in total.

Distinct Influences of Expectancy and Proximity on Response Vigor
Even though the outcome was objectively the same for AAB, ABB, ABA, and ABC trials, we observed differences between these trial types. In all three experiments, we consistently observed that high proximity reduced response vigor after reward omission. Increased expectancy seemed to have an opposing effect, as participants confirmed a (proximal) loss more quickly when they had increased expectancy of winning as the game evolved. Although the effect of expectancy on confirm RT was not statistically significant in Experiment 3, it was in the same direction as in Experiments 1 and 2. Furthermore, the meta-analyzed effect combining results from all three experiments showed that overall participants did confirm AAB more quickly than ABB (meta-analyzed result: diff = -54 ms, 95% CI = [-70, -37]; see Table A1 in the Appendix for results of meta-analyses for all comparisons). Response invigoration after increased expectancy also fits the findings by Bossuyt et al. (2014), who showed that increased expectancy increased feelings of frustration, anger and regret. We therefore conclude that increased expectancy of winning (at least as triggered by the card sequence of AAB) increased response vigor in the current research.
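As a rough reading aid for the meta-analyzed difference reported above, the interval can be converted into an approximate standard error and z statistic. The sketch below assumes a symmetric, normal-approximation 95% interval, which the text does not state; it is not a reproduction of the authors' meta-analysis, the resulting p value is uncorrected, and the function name `z_from_ci` is hypothetical.

```python
import math


def z_from_ci(diff, lower, upper, critical_z=1.96):
    """Approximate SE, z, and two-sided p from a point estimate and its CI.

    Assumes the interval is diff +/- critical_z * SE (normal approximation).
    """
    se = (upper - lower) / (2 * critical_z)
    z = diff / se
    # Two-sided p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported in the text for confirm RT, AAB vs. ABB (in ms).
se, z, p = z_from_ci(diff=-54, lower=-70, upper=-37)
print(f"SE ~ {se:.1f} ms, z ~ {z:.2f}, p < .001: {p < 0.001}")
```

Because the reported bounds are rounded to whole milliseconds, the recovered standard error is itself only approximate.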
In all three experiments, participants confirmed high-proximity, increased-expectancy outcomes (AAB) and low-proximity, reduced-expectancy outcomes (ABC) equally quickly (see also M. J. Dixon et al., 2013). As discussed above, one interpretation of this finding is that the effect of proximity might have canceled out the opposing effect of expectancy, resulting in no difference between AAB and ABC on confirm RT. Alternatively, the effects of proximity and expectancy may depend on each other, and may jointly influence response vigor only after high-proximity, reduced-expectancy outcomes (i.e., ABB). According to this second account, no effects of proximity and expectancy were present in the AAB trials, rather than there being two opposing effects that canceled each other out. While this second account may explain the pattern observed for confirm RT, it has difficulty explaining the observation that participants started a new game more quickly after AAB than after ABC (meta-analyzed result: diff = -29 ms, 95% CI = [-42, -16], p < .001), and also more quickly after AAB than after ABB (meta-analyzed result: diff = -21 ms, 95% CI = [-39, -3], p = .043; p values corrected for multiple comparisons). These differences in start RT after AAB on the one hand, and ABC and ABB on the other hand, suggest that the invigoration effect of increased expectancy may be more persistent than the slowing effect of high proximity (thus resulting in overall shorter start RT after AAB than ABC and ABB), which in turn implies that both effects are indeed present when participants need to confirm AAB outcomes. Since AAB triggered increased expectancy of winning after the first two cards, the eventual loss may be more unexpected. This may initially orient attention towards the source of expectancy violation, thereby slowing down the response of confirming AAB (Notebaert et al., 2009; Wessel, 2018).
No such orienting process would occur for the response of starting the next game, hence the start RT for AAB is lower than for ABC and ABB. Assuming effects of both proximity and expectancy thus offers a potential explanation for the overall pattern of results. In the following, we focus on the theoretical implications of the effects of proximity and expectancy on response vigor.
According to Frijda and colleagues (Frijda, 2010; Frijda et al., 2014), changes in states of action readiness are triggered by how events are appraised by individuals. Such states of action readiness can manifest in overt actions, with a certain strength and urgency (i.e., response vigor). In Frijda et al.'s framework, responses become more vigorous when the perceived discrepancy between the current state and a reference state becomes larger (Frijda, 2010). The effects of proximity and expectancy on response vigor may therefore be understood in terms of the discrepancy between the current state and a reference state. A proximal loss may indicate a more desirable current state than a distal loss, as players may mistake a game of chance for a game of skill, and take a proximal loss as an indication that they are getting better at the game (and that a reward is coming soon; Clark, 2010). Perceiving a proximal loss to be more desirable than a distal loss may thus lead to a smaller discrepancy and lower response vigor (while controlling for expectancy). Similarly, when participants still expect to obtain rewards before turning the last card (as in AAB), the reference state is more desirable compared to when they do not expect to obtain rewards anymore (the utility of an option is determined by both the value of an outcome and the expectancy of obtaining the outcome; Moors et al., 2017). After turning the last card and losing, the discrepancy between the current and the reference state is therefore larger for a loss after increased expectancy of winning than a loss after reduced expectancy of winning (while controlling for proximity), leading to increased response vigor in the former case.
The magnitude of the discrepancy between the current state and a reference state may also explain the overall difference in response vigor after wins versus losses, since wins essentially entail no discrepancy, which is always smaller than the discrepancies encountered in losses.
Based on a similar feedback-based control process view, Carver & Scheier (1990) also proposed that organisms act to reduce the discrepancy between the current state and a reference state (they also proposed an avoidance system in which organisms act to enlarge the discrepancy between their current state and anti-goals, but that seems less relevant here). Importantly, a second loop monitors the rate at which the discrepancy is being reduced by the first action loop, and compares the rate of discrepancy reduction to a reference value. This second so-called meta-monitoring loop is responsible for the generation of affect: when the sensed rate of discrepancy reduction is higher than the reference value, organisms experience positive affect and reduce the effort of ongoing responses (coasting; Carver, 2003); when the sensed rate of discrepancy reduction is lower than the reference value, organisms experience negative affect and increase the effort of ensuing responses (Carver, 2004). Thus, Carver & Scheier (1990) attributed positive and negative affect, as well as the associated changes in response vigor, to the rate of discrepancy reduction rather than the magnitude of discrepancy (cf. Frijda, 2010). A proximal loss in the 'scratch card' task may correspond to a smaller difference between the sensed rate of discrepancy reduction and a reference value than a distal loss (e.g., the sensed rate of discrepancy reduction may be higher for a proximal loss, as people may erroneously believe they are getting better at the game), whereas a loss after increased expectancy may correspond to a larger difference than a loss after reduced expectancy (e.g., the reference value may be higher for a loss after increased winning expectancy). 
Despite the difference in the proposed origin of affect, both frameworks converge on the idea that positive affect is associated with things going well (either due to a small discrepancy or a high rate of discrepancy reduction) and response vigor can be reduced, while negative affect is associated with things not going so well (either due to a large discrepancy or a low rate of discrepancy reduction) and response vigor should be increased. Since increased expectancy before losing increased negative affect and high proximity increased positive affect (Bossuyt et al., 2014), the corresponding increase and decrease of response vigor by expectancy and proximity, respectively, are also in line with this broad idea.
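The contrast between the two frameworks can be made concrete with a toy sketch. It assumes, purely for illustration, that vigor scales linearly with the size of the discrepancy in the magnitude-based reading, and that the change in effort scales linearly with the shortfall in the rate of discrepancy reduction in the rate-based reading. Neither framework specifies these functional forms; the function names and all numeric values are hypothetical and carry no empirical meaning.

```python
def vigor_from_magnitude(current_state, reference_state, gain=1.0):
    """Magnitude-based reading: a larger gap to the reference state means more vigor.

    The linear form and the gain parameter are illustrative assumptions.
    """
    discrepancy = max(reference_state - current_state, 0.0)
    return gain * discrepancy


def effort_change_from_rate(sensed_rate, reference_rate, gain=1.0):
    """Rate-based reading: effort rises when progress is slower than the reference.

    A positive value stands for increased effort (negative affect);
    a negative value stands for coasting (positive affect).
    """
    return gain * (reference_rate - sensed_rate)


# Arbitrary values: a win leaves no gap; a loss after increased expectancy
# is compared against a more desirable reference state.
print(vigor_from_magnitude(current_state=1.0, reference_state=1.0))  # win
print(vigor_from_magnitude(current_state=0.0, reference_state=0.8))  # loss, increased expectancy
print(vigor_from_magnitude(current_state=0.0, reference_state=0.3))  # loss, reduced expectancy
print(effort_change_from_rate(sensed_rate=0.2, reference_rate=0.5))  # slower than reference
print(effort_change_from_rate(sensed_rate=0.7, reference_rate=0.5))  # faster than reference
```

Both functions produce the same ordering described in the text: more vigor when things are going less well, and less vigor when they are going well.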
Surprisingly, the amount of reward at stake and the overall probability of winning did not modulate the effect of proximity and expectancy (manipulated via card sequences) on loss trials. In an exploratory analysis combining data from 137 participants, Verbruggen et al. (2017) found that participants started a new trial more quickly after failing to obtain a large reward (with a low win probability) compared to a small reward (with a relatively high win probability). Similarly, Yu et al. (2014) observed that participants reported higher levels of frustration and applied more force in pressing buttons when failing to get 2 pounds vs. 20 pence. Here, we independently manipulated reward magnitude and the overall winning probability, but failed to observe a modulation effect by either factor. This lack of modulation cannot be explained by participants not processing the magnitude or probability information during the game, as the RT of turning the last card after getting AA* is clearly influenced by reward magnitude and probability (although not statistically reliable in Experiment 1). Some procedural differences may explain this inconsistency in results. In the experiments by Verbruggen et al. (2017), participants could choose to gamble or not, which might amplify the effect of reward magnitude when participants voluntarily chose to gamble and then lost. Yu et al. (2014) did not use a gambling task, but used two reward amounts (2 pounds vs. 20 pence) that differed much more from each other than the ones used here (30 pence vs. 6 pence). Either of these differences may explain why we did not observe any effect of reward magnitude and win probability on response vigor after a loss here. Another possibility is that the appraisal of expectancy and proximity (as manipulated via card sequences here) is a very fast process (Frijda et al., 2014), and influences response vigor without taking into account more fine-grained information, such as reward magnitude and probability.
These possibilities can be further explored in future work.
In the current research we investigated the effects of proximity and expectancy on response vigor within the context of gambling, as the structural features of certain gambling scenarios (e.g., the sequential presentation of symbols on scratch cards and slot machines) make the manipulation of proximity and expectancy relatively straightforward. Future work may further examine these effects in other contexts in which people pursue rewards (and either succeed or fail in obtaining rewards), to see if the current findings can be generalized to other reward-seeking behaviors. While the independent manipulation of proximity and expectancy may be difficult in other more naturalistic settings, soliciting people's subjective appraisals of proximity and expectancy (which we did not include here) may help address this problem. Furthermore, the potential identification of low-proximity, increased-expectancy events (an outcome that we could not include in the current design) may also lead to a better understanding of how the goal pursuit process is adjusted based on the appraisals of proximity and expectancy.

Implications for Near Miss in Gambling
'Near misses' in gambling contribute to gambling persistence, and have received substantial attention from gambling researchers. Different theoretical accounts have been proposed, emphasizing either the proximity of a 'near miss' to a win or the increased expectancy of winning (and hence, the unexpectedness of loss) in a 'near miss'. According to the cognitive distortion account, gamblers may mistake games of chance (i.e., gambling) for games of skill, and take a 'near miss' as an indication of skill acquisition and a signal of imminent rewards (Clark, 2010; Clark et al., 2009; Reid, 1986). 'Near misses' may also acquire reinforcing properties via Pavlovian generalization, and reinforce gambling just as actual wins do, due to their perceptual similarity to wins (Belisle & Dixon, 2016; Peters et al., 2010). These two accounts thus emphasize the proximity of a 'near miss' to a win. The frustration account, on the other hand, argues that a 'near miss' may elicit more frustration than a regular loss, and motivate gamblers to quickly remove themselves from the aversive state by continuing gambling (M. J. Dixon et al., 2011, 2013; Sharman & Clark, 2016; Stange et al., 2016, 2017). Since the appraisal of expectancy rather than proximity contributes to the subjective feeling of frustration (Bossuyt et al., 2014), the frustration account seems to emphasize the increased expectancy of winning in a 'near miss'. Our results suggest both factors may play a role here. The most widely used type of 'near miss', namely AAB, triggers an appraisal of both increased expectancy and high proximity. Compared to a full loss (i.e., a loss with reduced expectancy and low proximity), this combination of increased expectancy and high proximity may simultaneously frustrate gamblers so that they are motivated to avoid the source of discomfort, while at the same time providing an incentive to continue, as they may perceive themselves to be getting closer to a win. Together, these two factors may lead to the quick initiation of a new gamble after AAB (i.e., for start RT, AAB < ABC) and the perpetuation of gambling in the long run.
More broadly, our results suggest that it may be fruitful to explicitly distinguish the dimensions of proximity and expectancy in future research on 'near misses'. Some researchers have acknowledged that 'near misses' may not be a unitary phenomenon (Clark et al., 2009, 2013), and some have distinguished between 'classic' (such as AAB) and 'non-classic' near misses (such as ABB and ABA) in their research (M. J. Dixon et al., 2013; Worhunsky et al., 2014). However, such a dichotomization (classic vs. non-classic) is not based on any purported difference in the underlying mechanisms, and is therefore not theoretically informative. We propose using the separate dimensions of proximity and expectancy to define and operationalize different types of 'near misses'. This may help reconcile inconsistencies in previous findings due to their different operationalizations, and help us better understand how 'near misses' contribute to gambling persistence.

Conclusion
Endeavors to obtain rewards are not always successful. In the present research, we explored how reward omission would influence response vigor in a gambling context, as a function of the changes in winning expectancy as a game evolves, and the proximity of the eventual loss to a win. We observed that overall, people showed greater response vigor after a loss than after a win. Moreover, a proximal loss decreased the vigor (i.e., speed) of ensuing responses compared to a distal loss, while a loss after increased winning expectancy may increase response vigor compared to a loss after reduced winning expectancy. These adjustments may be triggered by the appraised discrepancy between the current state and a reference state, and serve to close the perceived gap and facilitate goal pursuit. In gambling contexts where the continued pursuit of rewards after omission may be harmful, these processes may contribute to maladaptive goal pursuit in the form of excessive gambling. Further exploring the adjustment of response vigor after reward omission may help us better understand the adaptive goal pursuit process, and also how it becomes maladaptive in certain situations.
Note. diff = difference in response time (RT) between the high-amount and low-amount conditions, in milliseconds (a positive value indicates longer RT in the high-amount condition); lowerCI = lower limit of 95% confidence interval; upperCI = upper limit of 95% confidence interval. The t and p values are from paired-sample t tests, and a further p value is from a Wilcoxon signed-rank test; p values are corrected for multiple comparisons within each stage using the Holm-Bonferroni method. The Bayes factor is the likelihood of obtaining the current data under the alternative hypothesis, divided by the likelihood of obtaining the current data under the null hypothesis (the default Cauchy prior width of 0.707 was used). The effect size is Hedges's average g.
Note. diff = difference in response time (RT), in milliseconds; lowerCI = lower limit of 95% confidence interval; upperCI = upper limit of 95% confidence interval. p values are corrected for multiple comparisons for each stage using the Holm-Bonferroni method.
Figure A2: RT across all stages in Experiment 2. Error bars stand for 95% within-subject confidence intervals.
these results might show that the opposing effects of expectancy and proximity canceled each other out on the confirm RT for AAB trials, but the invigoration effect of increased expectancy is more durable than the slowing effect of high proximity, resulting in faster start RT after AAB compared to ABB and ABC.
Figure A3: RT across all stages in Experiment 3. Error bars stand for 95% within-subject confidence intervals.
Figure A4: Sequential analysis of the comparison between ABB and ABA in Experiment 2. ABB is used as the high-proximity, low-expectancy condition.
Figure A5: Sequential analysis of the effects of expectancy and proximity in Experiment 2. ABA is used as the high-proximity, low-expectancy condition.
Figure A6: Sequential analysis of the comparison between ABB and ABA in Experiment 2.
Proximity and Expectancy Modulate Response Vigor After Reward Omission Collabra: Psychology