In this paper, we draw connections between reward processing and cognition by behaviourally testing the implications of neurobiological theories of reward processing for memory. Single-cell neurophysiology in non-human primates and imaging work in humans suggest that the dopaminergic reward system responds to different components of reward: expected value; outcome or prediction error; and uncertainty of reward (Schultz et al., 2008). The literature on both incidental and motivated learning has focused on understanding how expected value and outcome, both linked to increased activity in the reward system, lead to consolidation-related memory enhancements. In the current study, we additionally investigate the impact of reward uncertainty on human memory. The contribution of reward uncertainty (the spread of the reward probability distribution irrespective of the magnitude) has not been previously examined. To examine the effects of uncertainty on memory, we combined a word-learning task with a surprise delayed recognition memory test. Using Bayesian model selection, we found evidence only for expected value as a predictor of memory performance. Our findings suggest that reward uncertainty does not enhance memory for individual items. This supports emerging evidence that an effect of uncertainty on memory is only observed in high compared to low risk environments.

## Introduction

We are constantly encoding information; however, relatively little of that information is eventually consolidated into memory. In order to be adaptive, memory must be selective and prioritise information that is likely to be relevant to future decisions. There are many ways in which reward can affect the consolidation of newly learned material. For example, students studying for exams will (ideally) be actively focusing their attention and resources to promote memory for information that is likely to be tested on an exam (motivated learning). In other situations, value may be more incidental to information that might be later remembered; for example, a child might enjoy interacting with a new object and therefore be more likely to remember its name (incidental learning).

The effects of reward on learning have been studied in the context of both motivated and incidental learning. Neurobiological mechanisms have been proposed to account for both types of learning (Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Shohamy & Adcock, 2010; Wittmann, Dolan, & Düzel, 2011; Wittmann et al., 2005). Research has focused on the role of the neurotransmitter dopamine in increased hippocampal consolidation of reward-based memories (Lisman & Grace, 2005; Shohamy & Adcock, 2010). Additionally, salient and emotional episodic memory enhancements have been linked to increased activity in the locus-coeruleus-norepinephrine (LC-NE) system (Clewett & Mather, 2014; Clewett, Sakaki, Nielsen, Petzinger, & Mather, 2017; Preuschoff, ‘t Hart, & Einhauser, 2011). Given the difficulty of individually isolating motivational factors such as reward, emotion and arousal, it is likely that multiple neurobiological systems support increased hippocampal encoding (Madan, 2017; Shaikh & Coulthard, 2013; Shohamy & Adcock, 2010; Takeuchi et al., 2016). Furthermore, reward-based learning may be supported either by synaptic (Lisman & Grace, 2005) or systems level consolidation processes (Braun, Wimmer, & Shohamy, 2018; Murty, DuBrow, & Davachi, 2018; Studte, Bridger, & Mecklinger, 2016).

Reward-based learning is often considered within the context of reinforcement learning models (Diederen et al., 2017; Sutton & Barto, 1998). In these models, the reward signal can be described in terms of the expected value, the actual reward outcome or the prediction error. Prediction errors are used to update the current belief about the value of different actions in order to maximise future rewards. It has been suggested that neurons in the dopaminergic system encode the prediction error term of these models (Schultz, 1998). Such models account for learning and decision-making, but the precise relationship between the reward signal and individual episodic memories is less clear (Bornstein & Norman, 2017; Diederen et al., 2017; Lengyel & Dayan, 2007). The effective reward value signal might be expected value, reward outcome or prediction error. Earlier studies did not clearly distinguish between anticipated rewards and actual outcomes (Wittmann et al., 2005); more recently, the focus has shifted to the relationship between reward cue and reward outcome (Bialleck et al., 2011; Bunzeck, Dayan, Dolan, & Duzel, 2010; Mason et al., 2017b; Mather & Schoeke, 2011). There is evidence to suggest that memory enhancements could be attributed to either reward anticipation or a post-encoding enhancement of items after reward delivery (Gruber, Ritchey, Wang, & Doss, 2016; Murayama & Kitagami, 2014; Patil, Murty, Dunsmoor, Phelps, & Davachi, 2017).

Reward uncertainty is another important, but often ignored, signal that refers to the predictability of the outcome of an event. It captures the spread of the reward probability distribution irrespective of the magnitude (Tobler, O’Doherty, Dolan, & Schultz, 2007). In the case where there are two possible outcomes (e.g. reward vs. no reward), expected value increases linearly with the probability of receiving a reward, whereas uncertainty follows an inverted U-shaped function of probability of reward, and is maximal at p = 0.5. A common measure of uncertainty is entropy, calculated as the negative weighted sum of the logarithm of the probabilities of each possible outcome, $-\sum_{O} P_O \log_2 P_O$, where $P_O$ is the probability of outcome $O$ (reward or no reward). Reward uncertainty is likely to be signalled by multiple systems. It has been associated with changes in activity in both the dopaminergic reward system and the LC-NE system, which also signals arousal and surprise (Clewett & Mather, 2014; Clewett et al., 2017; Kempadoo, Mosharov, Choi, Sulzer, & Kandel, 2016; Preuschoff et al., 2011). fMRI studies in humans have demonstrated distinct coding of reward expected value and uncertainty (D’Ardenne, Mcclure, Nystrom, & Cohen, 2008; Glimcher, 2011; Hsu, Krajbich, Zhao, & Camerer, 2009; Liu, Hairston, Schrier, & Fan, 2011; Ludvig, Sutton, & Kehoe, 2008; Preuschoff, Bossaerts, & Quartz, 2006; Preuschoff, Quartz, & Bossaerts, 2008; Schultz et al., 2008; Tobler, Fiorillo, & Schultz, 2005; Tobler et al., 2007). Tobler et al. (2007) found that stimuli associated with increases in expected value elicited monotonically increasing activation in the striatum, whereas stimuli associated with higher variance led to increased activation in the orbitofrontal cortex. Other studies have indicated that the reward signal is comprised of temporally distinct linear and quadratic responses to expected value and uncertainty within dopaminergic brain regions such as the striatum (Cooper & Knutson, 2008; Dreher, Kohn, & Berman, 2006; Rolls, McCabe, & Redoute, 2008).
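
The contrast between the linear expected value and the inverted U-shaped entropy can be illustrated with a short numerical sketch. The function names and the unit reward magnitude are illustrative assumptions and are not taken from the study's materials.

```python
import math


def expected_value(p_reward: float, magnitude: float = 1.0) -> float:
    """Expected value of a two-outcome gamble (reward vs. no reward)."""
    return p_reward * magnitude


def entropy_bits(p_reward: float) -> float:
    """Entropy in bits, -sum_O P_O * log2(P_O), over the two outcomes."""
    total = 0.0
    for p in (p_reward, 1.0 - p_reward):
        if p > 0.0:  # treat 0 * log2(0) as 0
            total -= p * math.log2(p)
    return total


if __name__ == "__main__":
    # Illustrative reward probabilities; magnitude is an assumed unit reward.
    for p in (0.125, 0.5, 0.875):
        print(f"p = {p:.3f}  EV = {expected_value(p):.3f}  "
              f"entropy = {entropy_bits(p):.3f} bits")
```

Under these assumptions, expected value rises with the reward probability, whereas entropy is symmetric around p = 0.5, where it reaches its maximum of 1 bit.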

The link between dopaminergic activity and uncertainty on the one hand, and dopaminergic activity and memory enhancement on the other, suggests that we should expect to see a behavioural relationship between reward uncertainty and memory performance. This possibility has only recently received attention in the literature. Rouhani, Norman, and Niv (2018) examined the effects of reward uncertainty on recognition memory and found that participants remembered items that occurred within a high-risk context (large variance in reward distribution) better than items in a low-risk context. They also found that, across risk contexts, surprise (unsigned prediction error) was the best predictor of memory for individual items (see also De Loof et al., 2018). The authors suggested that uncertainty experienced in high-risk reward environments may improve memory in these contexts (Duncan, Sadanand, & Davachi, 2012; Mather, Clewett, Sakaki, & Harley, 2015).

What is not clear is whether the relationship between reward uncertainty and memory holds at finer time scales within the experimental context. Here, we ask whether variations in reward uncertainty during the experiment are linked to variations in recognition memory accuracy. In a previous study of motivated learning (Mason et al., 2017a), we tested the effects of reward components on episodic memory encoding. On each trial, participants were presented with a reward probability followed by the to-be-remembered item. They were then presented with the reward outcome, but earning this was contingent upon correctly recognising the item at a delayed memory test. For each item presented, we were able to test the influence of reward expected value, prediction error, outcome and uncertainty. Across four behavioural studies we found consistent evidence against an effect of reward uncertainty on memory, and only found evidence favouring an effect of reward outcome on memory, with higher reward outcomes leading to better memory than lower outcomes or an absence of a reward.

In principle, it is possible that rewards act differently on memory when items are studied under incidental or motivated learning conditions (Cohen, Rissman, Hovhannisyan, Castel, & Knowlton, 2017; Spaniol, Schain, & Bowen, 2013). During motivated learning participants engage in different strategies to enhance encoding: these include selective attention and differential resource allocation (Ariel & Castel, 2014; Castel, 2008; Castel, Benjamin, Craik, & Watkins, 2002; Eysenck & Eysenck, 1982; Loftus & Wickens, 1970; Stefanidi, Ellis, & Brewer, 2018), and directed forgetting (Fawcett & Taylor, 2008; Friedman & Castel, 2011; Hayes, Kelly, & Smith, 2013; Lehman & Malmberg, 2009; Wylie, Foxe, & Taylor, 2008). The learner also has the expectation at encoding that reward outcomes depend upon successful memory performance at test (Adcock et al., 2006). In contrast, in incidental learning paradigms the rewards are delivered at the time of learning and are found to increase recognition and recall of items associated with the rewards (Mather & Schoeke, 2011; Wittmann et al., 2011). Given the difference in reward delivery, it is conceivable that incidental learning relies to a greater extent on neurobiological mechanisms such as dopaminergic consolidation, and we may see a stronger coupling between the reward signals identified in the neurobiological literature and memory performance. Given the potential for involvement of different behavioural and neurobiological contributions under incidental versus motivated learning, it is possible that an uncertainty–reward relationship might exist for individual items under incidental conditions.

### Current Experiment

Accordingly, we conducted a behavioural experiment to assess the contribution of reward factors during incidental episodic memory encoding. The purpose of this paper is to test whether reward uncertainty influences memory on a trial-by-trial basis under incidental learning conditions. In addition, we will examine the influence of other reward predictors in order to identify the reward signals that drive memory performance at the behavioural level.

The reward task used in this experiment was developed by Preuschoff and colleagues and has been used both to examine dopaminergic reward signalling of uncertainty and to dissociate uncertainty and surprise (Preuschoff et al., 2006, 2011). In addition, the manipulation used in this experiment has been shown to induce a clear neural signature of reward uncertainty in the striatum (Preuschoff et al., 2006, 2008). To examine the effects of uncertainty on memory, a delayed recognition memory test was used to probe memory for words that were originally paired with rewards. The neuroimaging results from the Preuschoff et al. (2006) experiment indicated time-dependent encoding of value and risk in the ventral striatum. Risk-related activity followed an inverted U-shaped function of probability, whereas the relationship between value and probability was linear. These findings are supported by evidence from several other studies exploring the neural correlates of risk (Cooper & Knutson, 2008; Dreher et al., 2006; Rolls et al., 2008; Tobler et al., 2007). The question is to what degree, if at all, reward uncertainty enhances memory. Furthermore, the aim of this experiment was to provide a comparison of the different components of reward (expected value, outcome, prediction error, uncertainty of reward, and surprisal), as motivated by extensive research in single-cell neurophysiology in non-human primates and imaging work in humans (Cromwell & Schultz, 2003; Fiorillo, Tobler, & Schultz, 2003; Hollerman & Schultz, 1998; Schultz, 1998, 2002; Schultz et al., 2008; Tobler et al., 2005).

## Methods

We pre-registered the experiment at https://aspredicted.org/rn2hy.pdf. The data is available on Open Science Framework https://osf.io/xkpfz/. All participants provided informed consent and the study was approved by the UWA Human Research Ethics Office.

### Participants

Fifty students were recruited from the University of Western Australia undergraduate participant pool and were reimbursed with course credits. Sample size was based on anticipated effects from our previous studies examining reward-related learning. We used Bayesian statistics as our inferential framework, which allowed us to competitively test models and explicitly calculate a strength of evidence for these models. Participants had the chance to earn a maximum of $5.00 (all dollar values are in AUD), with an average of $2.77 (SD = 0.53). One participant was excluded from the analysis as their data did not save due to a network issue. This left a sample size of 49 participants (female = 32, M age = 21.14, SD = 5.63).

### Stimuli

The stimuli for the recognition task were English words. A total of 216 words were used, taken from a pool of 400 words used in Mason et al. (2017a; obtained in turn from Oberauer, Lewandowsky, Farrell, Jarrold, and Greaves, 2012). All words were concrete nouns, and were chosen to refer to common objects that are larger or smaller than a soccer ball, with the pool consisting of 108 objects rated as larger and 108 rated as smaller. The words had an average length of 5.77 letters (SD = 1.84). The experiment was programmed and presented using the Psychophysics Toolbox for MATLAB version 2.54 (Brainard, 1997) on a standard desktop computer.

For each participant the experiment was conducted in two sessions occurring on different days. In the first session participants were exposed to a series of words, each word associated with a reward value with varying degrees of probability. There were three levels of probability (0.125, 0.5, 0.875) and two levels of uncertainty (low, high, and low respectively). We decided to test the condition where reward uncertainty was greatest (0.5) and two comparison points that had the same uncertainty but different expected values. Studies in the reward-memory literature do not often detect fine-grained effects (Bunzeck et al., 2010; Mason et al., 2017b; Wittmann et al., 2011), and we wanted to maximise the chances of detecting the effect.

On each trial participants placed a bet, following which they could either win or lose $0.15. The “betting” task was a simple task with simulated playing cards. Two cards were drawn without replacement by the computer from a simulated set of playing cards (ace to 9, where ace was low). The first card was drawn at random from a subset of cards (2, 5 or 8). The second was then drawn from the remaining 8 cards. Participants bet on whether the second card drawn would be higher or lower than the first card. When the bet was placed participants had not seen either card, so they always had a 50% chance of winning (vs. losing) the bet. Once the first card was drawn, the probability of winning was known to the participants. The outline of a trial is shown in Figure 1.

Figure 1

Participants are told that a card will be drawn at random from the playing cards (2, 5 and 8). They then place a bet as to whether the second card will be higher or lower than the first. The first card appears on screen. It is followed by a word. The second card is then presented on screen. Finally, the participant must indicate whether or not they have won the bet. In order to control the number of trials per condition, the probability of winning and the outcome are pre-determined on each trial, but the cards that are shown are adjusted depending on the participant’s bet. For example, imagine that the reward probability is 0.125, the outcome is “Win” and the participant bets “Higher”. The first card shown will need to be 8 and the second card will be 9. For this same condition, if the participant had bet “Lower” the first card would be 2 and the second card would be 1.

If, for example, the participant bet on the second card being higher, then the probability of winning was equal to the total number of cards in the deck (9) minus the number displayed on the first card drawn ($C_1$), divided by the number of remaining cards in the deck (8): $P_{win} = (9 - C_1)/8$. The first card was always a 2, 5 or 8, which meant the probability of winning was 0.125, 0.5 or 0.875. The reward value was kept constant on each trial, and so the expected reward and risk varied directly as a function of the probability of winning.

On each trial, a word was shown after card one and before card two. In the task used by Preuschoff et al. (2006), card one was displayed for 1.5 seconds, followed by an anticipatory period of 5.5 seconds before card two was presented. In the current experiment, card one was displayed for 1.5 seconds, followed by a fixation cross for 500 ms. The target word was then displayed for 4 s. To ensure that the words presented were attended to, participants were required to indicate whether the object was smaller or larger than a soccer ball. Participants used the left and right arrow keys (with the index and middle fingers of their dominant hand) to input their response. The target word remained on screen after this response. At the end of the 4 s period, a fixation cross was displayed for 500 ms, before card two was displayed for 1500 ms. Participants then had 2000 ms to select one of two boxes indicating whether or not they had won the bet, to make sure the trial events were attended to and understood. There was an inter-trial interval of 500 ms.

If no bet was placed the bet was lost, and if participants failed to correctly report the outcome of the bet they had a penalty of $0.05 deducted. Before beginning the experiment, the process was explained to participants and worked examples were given for each of the possible bets and card outcomes. Participants completed 10 practice trials during the first session. The experiment was run as a series of three blocks, with 36 trials in each block. At the end of the three blocks the participants randomly selected which block’s earnings they would keep. The lowest overall bonus payment a participant could earn was $0.
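
The relationship between the first card, the bet and the probability of winning can be sketched as follows. The "higher" case follows the formula given in the text; the "lower" case is an assumed mirror image, (C1 − 1)/8, which is consistent with the Figure 1 example but is not stated explicitly in the text. All names are hypothetical and this is not the experiment's actual MATLAB code.

```python
DECK_SIZE = 9            # simulated cards, ace (1) to 9
FIRST_CARDS = (2, 5, 8)  # possible values of the first card


def p_win(first_card: int, bet: str) -> float:
    """Probability of winning once the first card is known."""
    remaining = DECK_SIZE - 1  # cards left after the first draw
    if bet == "higher":
        # Formula from the text: (9 - C1) / 8
        return (DECK_SIZE - first_card) / remaining
    if bet == "lower":
        # Assumed mirror image of the formula above: (C1 - 1) / 8
        return (first_card - 1) / remaining
    raise ValueError(f"unknown bet: {bet!r}")


def first_card_for(p_target: float, bet: str) -> int:
    """First card that yields a pre-determined win probability for a given bet."""
    for card in FIRST_CARDS:
        if p_win(card, bet) == p_target:  # exact: all values are multiples of 1/8
            return card
    raise ValueError(f"no first card gives p = {p_target} for bet {bet!r}")


if __name__ == "__main__":
    for bet in ("higher", "lower"):
        for card in FIRST_CARDS:
            print(f"bet {bet:>6}, first card {card}: P(win) = {p_win(card, bet)}")
```

For a pre-determined win probability of 0.125, this sketch selects a first card of 8 after a "higher" bet and 2 after a "lower" bet, matching the example in the Figure 1 caption.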

### Recognition memory test

The second session always occurred the day after the first session. This was usually exactly 24 hours later and always a minimum of 12 hours. In the second session, participants completed a recognition test on the words shown in the first session. Each of the 108 old words was shown, randomly intermixed with 108 new words. Participants were required to make an old/new judgement using the left and right arrow keys.

## Data Analysis

### Data Exclusion

During the first session, participants were asked to report whether or not they had won the bet on each trial. This was included in the experimental design to assess whether participants were maintaining attention during the task. It was assumed that participants who reported their bet outcome correctly at least 80% of the time performed well in this task, and were likely paying attention. Eight participants were excluded for not meeting this reporting requirement, leaving a final sample of 41.
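
As a minimal sketch of this exclusion rule, assuming each participant's outcome reports are stored as a list of booleans (a hypothetical data layout, not the format of the shared dataset):

```python
def passes_attention_check(reports_correct: list, threshold: float = 0.8) -> bool:
    """True if the proportion of correctly reported bet outcomes meets the threshold."""
    if not reports_correct:
        raise ValueError("no trials recorded")
    return sum(reports_correct) / len(reports_correct) >= threshold


if __name__ == "__main__":
    # Hypothetical participant with 108 trials, 90 of them reported correctly.
    example = [True] * 90 + [False] * 18
    print(passes_attention_check(example))
```

With 108 trials, the 80% criterion corresponds to at least 87 correctly reported outcomes.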

## Results

### Model Comparisons

The dependent variable was each participant’s mean hit rate in each of the 6 conditions: reward probability (0.125, 0.5, 0.875) crossed with reward outcome (0 or 0.15) (see Figure 2). The false alarm rate was 0.23 (SE = 0.02), which is comparable to previous studies and indicates that participants were performing above chance (Mason et al., 2017a). We then conducted a mixed-effects regression. This allowed us to accommodate individual differences, at least in overall performance levels (by way of a random subject factor). A Bayesian model comparison approach was used to assess the unique contribution of different predictors. For each of the 6 experimental conditions we were able to test the following theoretically relevant predictors: expected value, prediction error, reward outcome, reward uncertainty and surprisal. Definitions of these predictors are listed in Table 1. We tested only the individual predictors, i.e. we did not include interaction terms. However, the interaction of interest between reward probability and reward outcome is effectively captured by the predictor surprisal.

Figure 2

Recognition memory performance as a function of the expected value of the cue and the reward outcome. Error bars show within-subject SEM, calculated using the method of Morey (2008). The symbols illustrate the predicted values for the best fitting model (Expected Value).

Table 1

Reward-related predictors of memory performance.

| Predictor | Description |
| --- | --- |
| Expected Value (EV) | Probability of obtaining a reward multiplied by the reward magnitude |
| Reward Outcome (O) | Magnitude of the reward obtained |
| Prediction Error (PE) | Expected value of the reward minus the reward outcome |
| Reward Uncertainty (U) | The entropy, $-\sum_{O} P_O \log_2 P_O$ |
| Surprisal (S) | Information gained from observing an outcome $O$, $-\log_2(P_O)$, where $O$ is the outcome (reward or no reward) and $P_O$ is the probability of that outcome |
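
To make the definitions in Table 1 concrete, the following sketch computes the five predictors for each of the six conditions. The function and variable names are illustrative assumptions; the prediction error follows the sign convention given in Table 1 (expected value minus outcome).

```python
import math

REWARD = 0.15                        # reward magnitude for a win
PROBABILITIES = (0.125, 0.5, 0.875)  # reward probabilities used in the task


def predictors(p_win: float, won: bool) -> dict:
    """Reward-related predictors for one condition, following Table 1."""
    outcome = REWARD if won else 0.0
    ev = p_win * REWARD
    p_outcome = p_win if won else 1.0 - p_win  # probability of what was observed
    entropy = -sum(p * math.log2(p) for p in (p_win, 1.0 - p_win) if p > 0.0)
    return {
        "EV": ev,
        "O": outcome,
        "PE": ev - outcome,           # sign convention as stated in Table 1
        "U": entropy,
        "S": -math.log2(p_outcome),
    }


if __name__ == "__main__":
    for p in PROBABILITIES:
        for won in (False, True):
            values = predictors(p, won)
            row = "  ".join(f"{k} = {v:.3f}" for k, v in values.items())
            print(f"p = {p:.3f}  won = {won!s:<5}  {row}")
```

Under this convention PE is an exact linear combination of EV and O, so any two of these three predictors span the same space; this may be why some two-predictor models in the model comparison share identical BIC values.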

Models were fit using the “lmer” function in the lme4 package (Bates, Mächler, Bolker, & Walker, 2015). The Bayesian Information Criterion (BIC) values provided can be converted to an approximation of a Bayes factor (assuming the unit information prior) according to the following rule (Raftery, 1995): $\mathrm{BF}_{M_1, M_2} = \exp\left(-0.5\,(\mathrm{BIC}_{M_1} - \mathrm{BIC}_{M_2})\right)$. The prior assumed by the BIC is relatively uninformed, and tends to be conservative (i.e., it can favour the null hypothesis more than under an informed prior; Weakliem, 1999).
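
The BIC-to-Bayes-factor conversion can be checked with a few lines of code. This is a minimal sketch of the approximation only, with a hypothetical function name; the BIC values are those reported for the Expected Value and base models in Table 2.

```python
import math


def bayes_factor(bic_m1: float, bic_m2: float) -> float:
    """Approximate Bayes factor for model 1 over model 2 from their BICs."""
    return math.exp(-0.5 * (bic_m1 - bic_m2))


if __name__ == "__main__":
    bic_ev, bic_base = -75.65, -71.96  # values reported in Table 2
    print(f"BF (EV over Base) = {bayes_factor(bic_ev, bic_base):.2f}")
```

Applied to the rounded BICs, the conversion gives a value close to the 6.32 reported in Table 2; any small difference presumably reflects rounding of the reported BIC values.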

For our model comparisons we first selected the model with the lowest BIC value and then compared each of the other models to this model (see Table 2). For each comparison, the Bayes factor provides relative evidence for each of the models conditional on the data. It informs us how much our prior beliefs should shift in response to the data obtained. Although there are no strict cut-offs, according to Jeffreys (1961) we can interpret odds greater than 3 as some evidence, odds greater than 10 as strong evidence, and odds greater than 30 as very strong evidence for a particular hypothesis compared to an alternative (see also Wagenmakers, 2007). In addition, to illustrate the goodness of fit we plot the predictions of the best model (the model with the lowest BIC) alongside the data.
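
The interpretive thresholds described above can be written as a simple lookup. The function name and the "inconclusive" label for odds of 3 or less are assumptions added for illustration; the three thresholds are those given in the text.

```python
def evidence_label(odds: float) -> str:
    """Verbal label for Bayes factor odds in favour of the preferred model."""
    if odds > 30:
        return "very strong"
    if odds > 10:
        return "strong"
    if odds > 3:
        return "some"
    return "inconclusive"  # assumed label; the text gives no category here


if __name__ == "__main__":
    # Bayes factors reported in Table 2 for the base model and the EV & U model.
    for name, odds in (("Base", 6.32), ("EV & U", 70.94)):
        print(f"EV vs. {name}: BF = {odds} -> {evidence_label(odds)} evidence")
```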

Table 2

Linear mixed effects model comparison. Models are listed in order of increasing BIC, with the best model (lowest BIC) first. The predictors were: expected value (EV), uncertainty (U), reward outcome (O), prediction error (PE) and surprisal (S). Each of the models was compared to the best model, and the final column shows the resulting Bayes factor (BF) in favour of the best model.

| Model | Predictors | BIC | Bayes Factor |
| --- | --- | --- | --- |
| LMEV | EV | -75.65 | 1.00 |
| LMBase | Base | -71.96 | 6.32 |
| LMEVUn | EV & U | -67.12 | 70.94 |
| LMPEOut | PE & O | -64.91 | 214.10 |
| LMEVOut | EV & O | -64.91 | 214.10 |
| LMPEEV | PE & EV | -64.91 | 214.10 |
| LMUn | U | -63.43 | 448.92 |
| LMPE | PE | -61.73 | 1,052.40 |
| LMOut | O | -61.25 | 1,339.36 |
| LMSup | S | -59.63 | 2,999.18 |
| LMPEUnOut | PE & U & O | -56.39 | 15,189.49 |
| LMOutUnEV | O & U & EV | -56.39 | 15,189.49 |
| LMPEUnEV | PE & U & EV | -56.39 | 15,189.49 |
| LMUnSupEV | U & S & EV | -54.52 | 38,591.64 |
| LMPEU | PE & U | -53.20 | 74,723.47 |
| LMOutUn | O & U | -52.72 | 95,100.54 |
| LMPESupOut | PE & S & O | -52.57 | 102,356.60 |
| LMPESupEV | PE & S & EV | -52.57 | 102,356.60 |
| LMEVSupO | EV & S & O | -52.57 | 102,356.60 |
| LMUnSup | U & S | -50.86 | 240,525.53 |
| LMPES | PE & S | -49.40 | 499,221.12 |
| LMOutSup | O & S | -48.92 | 635,056.22 |
| LMPEUnSupEV | PE & U & S & EV | -43.79 | 8,256,740.34 |
| LMEVUnSupOut | EV & U & S & O | -43.79 | 8,256,740.34 |
| LMPESupUnEV | PE & S & U & EV | -43.79 | 8,256,740.34 |
| LMPESUn | PE & S & U | -40.64 | 40,036,172.54 |

The results indicated that the best model was the Expected Value only model. The Bayes factor model comparisons indicate some evidence that this model better accounts for the data than the base model containing no effects of probability (LMBase). Critically, the Expected Value model was strongly favoured over all other models, including all models incorporating an effect of reward uncertainty.

## Discussion

In this experiment we compared how a range of reward-related predictors influence incidental memory performance. Using a behavioural task developed to elicit reward uncertainty during encoding, we found that the expected value of a reward was the best predictor of memory for the words temporally linked to rewards. In our task participants were presented with a word between reward cue (which predicted the reward outcome with greater or lesser certainty) and reward outcome. We used mixed-effects modelling to compare how different reward factors predicted recognition memory performance in a delayed surprise memory test. Our study is the first to directly compare different reward-related predictors (expected value, reward outcome, prediction error, uncertainty and surprisal) in their effect on incidental memory.

The results from our experiment, showing a specific effect of expected value, contribute to the growing body of evidence that signals related to reward prediction error, reward outcomes (Mason et al., 2017b) and expected value (Jang, Nassar, Dillon, & Frank, 2018) affect reward-based memory consolidation. There has been extensive research on both the role of prediction errors in learning and decision-making (Diederen et al., 2017; Rouhani et al., 2018) and the potential relationship between prediction errors and episodic memory formation on a trial-by-trial basis (Bunzeck et al., 2010; Ergo, De Loof, Janssens, & Verguts, 2019; Jang et al., 2018; Mason et al., 2017b; Rouhani et al., 2018; Wimmer, Braun, Daw, & Shohamy, 2014). A few studies have found evidence in favour of this relationship; however, there appears to be more consistent evidence that reward outcomes are a strong predictor of memory in incidental learning (Bunzeck et al., 2010; Mason et al., 2017b; Mather & Schoeke, 2011; Murayama & Kitagami, 2014). Although evidence generally emerges for these signals as predictors, not all studies have provided consistent evidence for effects of all of them on memory. While this may be partly due to sampling variability, it may also be that different experimental procedures lead to one of these signals becoming more salient and influencing memory to a greater degree than the others. For example, in the current study participants were explicitly told the expected value of each reward cue, and the reward outcome was revealed later in the trial. In other studies, the cue and outcome appear closer in time, which may serve to emphasise their relationship (Bunzeck et al., 2010; Mason et al., 2017b; Mather & Schoeke, 2011). Another potential objection is that the majority of studies, including our own, provide participants with small financial incentives on each trial. In the current study, it does appear that people were responsive to the incentives, as we observed an effect of expected value on memory. However, we know that people are motivated by factors other than money (Deci, Koestner, & Ryan, 1999) and that rewards of different magnitudes affect risk-seeking behaviour and potentially memory (Konstantinidis, Taylor, & Newell, 2017; Ludvig, Madan, Mcmillan, & Spetch, 2018). Therefore, future studies in this area may benefit from using a points-based incentivisation scheme.

An additional issue worth considering is the relationship between the reward and the memory stimulus. Murayama and Kitagami (2014) found that rewards promoted memory for items presented after an unrelated reward task. In our experiment, the to-be-remembered item was not directly linked to earning a reward, but instead was presented for encoding between the reward cue and outcome, and so was still embedded within the reward task (Mather & Schoeke, 2011). Arguably, these designs mean that even under incidental learning the rewards are motivationally linked to the memory stimuli, suggesting that we need to be aware of motivational influences more broadly (Madan, 2017).

There has been broad interest in the functional link between the mesolimbic system and episodic memory formation. Activation of the mesolimbic reward system during encoding has been consistently shown to increase hippocampal consolidation. Early studies focused on reward-related activation of the mesolimbic system. A variety of factors related to motivation have been associated with this functional link, including value, reward anticipation (Adcock et al., 2006), active decision-making (Murty et al., 2018), and curiosity (Gruber, Gelman, & Ranganath, 2014; Marvin & Shohamy, 2016). In many situations and experimental designs several of these factors are likely to interact to influence memory encoding, which may contribute to discrepancies in findings within the literature.

We found evidence against an effect of reward uncertainty on memory for individual items. This supports findings from our recent study examining reward uncertainty in motivated learning (Mason et al., 2017a). We predicted that if reward uncertainty does influence episodic memory encoding, the effects would be larger during incidental learning, when the conditions of learning do not promote strategic learning. The evidence from that study and the current one supports the overall conclusion that reward uncertainty related to individual items does not enhance episodic memory performance. This finding is of interest in itself, but also in the context of a growing interest in the potential contribution of environmental risk to learning and memory (Diederen et al., 2017; Rouhani et al., 2018). Rouhani et al. (2018) present the first study to directly compare memory encoding under high- and low-risk reward environments and demonstrate a benefit of high-risk contexts on learning. These findings may explain why previous studies looking at uncertainty and learning in classrooms have supported the notion that uncertainty improves learning (Howard-Jones, Jay, Mason, & Jones, 2016; Ozcelik, Cagiltay, & Ozcelik, 2013). For example, Howard-Jones et al. (2016) demonstrated that learning through a quiz-based game in which rewards were delivered probabilistically led to better memory performance in a subsequent test than completing multiple-choice questions in return for a fixed number of points. Overall, there appears to be growing support for the idea that environmental reward uncertainty promotes learning, which could be linked to increased arousal (Miendlarzewska, Bavelier, & Schwartz, 2016; Rouhani et al., 2018).

Similarly, there is evidence from the decision-making literature that memory may underpin risk-seeking behaviours (Madan, Ludvig, & Spetch, 2014). In these studies, participants show better memory for extreme outcomes associated with a risky option, and presumably it is the expected value of the extreme outcome that drives both the better memory and the risk-seeking behaviour. However, it would be interesting to see whether our findings change as a function of making risky choices. The current design asked participants to place bets on each trial, where they could either win a small amount or win nothing. Future studies could examine memory when participants are required to place bets intermittently for both gains and losses.

Our findings suggest that there is no necessary link between uncertainty and memory encoding. One explanation for the absence of an effect could be that our manipulation did not induce a sufficient state of uncertainty, and so did not produce the assumed dopaminergic signal changes (we do not have a physiological measure of uncertainty). However, we adapted the behavioural task used by Preuschoff and colleagues (Preuschoff et al., 2006, 2011), who found clear evidence of a direct relationship between reward uncertainty and dopaminergic activity. Given that our task was very similar to that of Preuschoff and colleagues, there is little reason to think that we did not induce a state of uncertainty at encoding. It should also be recognised that, despite our null finding, there are several potential mechanisms by which activity related to reward uncertainty could nonetheless promote memory encoding and consolidation. Shohamy and Adcock (2010) suggested that tonic dopamine associated with reward uncertainty may increase the number of disinhibited neurons, thereby increasing the likelihood that dopamine neurons would burst in response to individual events when there is high environmental uncertainty. It is plausible, and consistent with our results, that such a mechanism was at play during the experiment. However, our finding that the expected value of reward influences memory is most consistent with phasic activity of dopamine neurons enhancing hippocampal activity. We did not find evidence that prediction errors or surprise, which are usually associated with activity in the LC-NE system, enhanced memory performance (Clewett et al., 2017); however, it is likely that there are additional neurobiological mechanisms at play when learning occurs in complex reward-based environments.
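The distinction between the reward signals discussed here can be written out for a simple two-outcome gamble. This is an illustrative sketch only: it assumes variance as the measure of uncertainty, which is one common formalisation and not necessarily the exact parameterisation used in the task, and the symbols $p$ (win probability), $m$ (win magnitude), $r$ (obtained reward), and $\delta$ (prediction error) are introduced here for illustration.

```latex
% Two-outcome gamble: win magnitude m with probability p, otherwise win nothing.
% The symbols p, m, r and \delta are illustrative assumptions, not the task's notation.
\begin{align}
  \mathrm{EV}  &= p\,m                                        && \text{(expected value)} \\
  \mathrm{Var} &= p\,(1-p)\,m^{2}                             && \text{(uncertainty, as variance)} \\
  \delta       &= r - \mathrm{EV}, \qquad r \in \{0,\, m\}    && \text{(prediction error at outcome)}
\end{align}
```

Under this formalisation, expected value increases linearly with $p$, whereas variance is zero at $p = 0$ and $p = 1$ and peaks at $p = 0.5$, which is why the two quantities can in principle be separated as predictors of memory.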

The current study adds weight to several previous studies indicating that the relationship between reward and individual items in episodic memory is modulated by reward value (Mason et al., 2017a; Murayama & Kitagami, 2014; Wittmann et al., 2011). Our findings, in combination with previous studies, highlight that the precise relationship is sensitive to the reward cues and outcomes used in the experimental task. Nonetheless, there is clear evidence that reward uncertainty on individual trials does not improve memory and learning.

## Data Accessibility Statement

The data are available on the Open Science Framework at https://osf.io/xkpfz/.

## Funding Information

This research was supported by an Australian Research Council Discovery Project (DP160101752).

## Competing Interests

The authors have no competing interests to declare.

## Author Contributions

• Contributed to conception and design: AM, AL, SF

• Contributed to acquisition of data: AL

• Contributed to analysis and interpretation of data: AM, AL, SF

• Drafted and/or revised the article: AM, AL, SF

• Approved the submitted version for publication: AM, AL, SF

## References

1. Adcock, A., Thangavel, A., Whitfield-Gabrieli, S., Knutson, B., & Gabrieli, J. D. E. (2006). Reward-Motivated Learning: Mesolimbic Activation Precedes Memory Formation. Neuron, 50, 507–517.
2. Ariel, R., & Castel, A. D. (2014). Eyes wide open: enhanced pupil dilation when selectively studying important information. Experimental Brain Research, 232, 337–344.
3. Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
4. Bialleck, K. A., Schaal, H.-P., Kranz, T. A., Fell, J., Elger, C. E., & Axmacher, N. (2011). Ventromedial prefrontal cortex activation is associated with memory formation for predictable rewards. PloS One, 6, e16695.
5. Bornstein, A. M., & Norman, K. A. (2017). Reinstated episodic context guides sampling-based decisions for reward, 20(7).
6. Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433–436.
7. Braun, E. K., Wimmer, G. E., & Shohamy, D. (2018). Retroactive and graded prioritization of memory by reward. Nature Communications, 9(1), 4886.
8. Bunzeck, N., Dayan, P., Dolan, R. J., & Duzel, E. (2010). A common mechanism for adaptive scaling of reward and novelty. Human Brain Mapping, 31, 1380–94.
9. Castel, A. D. (2008). Metacognition and learning about primacy and recency effects in free recall: The utilization of intrinsic and extrinsic cues when making judgments of learning. Memory and Cognition, 36(2), 429–437.
10. Castel, A. D., Benjamin, A. S., Craik, F. I. M., & Watkins, M. J. (2002). The effects of aging on selectivity and control in short-term recall. Memory & Cognition, 30, 1078–1085.
11. Clewett, D., & Mather, M. (2014). Not all that glittered is gold: neural mechanisms that determine when reward will enhance or impair memory. Frontiers in Neuroscience, 8, 1–3.
12. Clewett, D., Sakaki, M., Nielsen, S., Petzinger, G., & Mather, M. (2017). Noradrenergic mechanisms of arousal’s bidirectional effects on episodic memory. Neurobiology of Learning and Memory, 137, 1–14.
13. Cohen, M., Rissman, J., Hovhannisyan, M., Castel, A. D., & Knowlton, B. J. (2017). Free recall test experience potentiates strategy-driven effects of value on memory. Journal of Experimental Psychology: Learning, Memory, & Cognition.
14. Cooper, J. C., & Knutson, B. (2008). Valence and salience contribute to nucleus accumbens activation. NeuroImage, 39, 538–547.
15. Cromwell, H. C., & Schultz, W. (2003). Effects of expectations for different reward magnitudes on neuronal activity in primate striatum. Journal of Neurophysiology, 89, 2823–2838.
16. D’Ardenne, K. D., Mcclure, S. M., Nystrom, L. E., & Cohen, J. D. (2008). BOLD Responses Reflecting Dopaminergic Signals in the Human Ventral Tegmental Area. Science, 319, 1264–1267.
17. Deci, E. L., Koestner, R., & Ryan, R. M. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125, 627–700.
18. De Loof, E., Ergo, K., Naert, L., Janssens, C., Talsma, D., Van Opstal, F., & Verguts, T. (2018). Signed reward prediction errors drive declarative learning. PloS One, 13(1), 1–15.
19. Diederen, K. M. J., Ziauddeen, H., Vestergaard, M. D., Spencer, T., Schultz, W., & Fletcher, P. C. (2017). Dopamine Modulates Adaptive Prediction Error Coding in the Human Midbrain and Striatum. The Journal of Neuroscience, 37(7), 1708–1720.
20. Dreher, J. C., Kohn, P., & Berman, K. F. (2006). Neural coding of distinct statistical properties of reward information in humans. Cerebral Cortex, 16, 561–573.
21. Duncan, K., […], A., & Davachi, L. (2012). Memory’s Penumbra: Episodic memory decisions induce lingering mnemonic biases. Science, 337(6093), 485–487.
22. Ergo, K., De Loof, E., Janssens, C., & Verguts, T. (2019). Oscillatory signatures of reward prediction errors in declarative learning. NeuroImage, 186(June 2018), 137–145.
23. Eysenck, M. W., & Eysenck, M. C. (1982). Effects of incentive on cued recall. The Quarterly Journal of Experimental Psychology Section A, 34, 489–498.
24. Fawcett, J. M., & Taylor, T. L. (2008). Forgetting is effortful: Evidence from reaction time probes in an item-method directed forgetting task. Memory & Cognition, 36, 1168–1181.
25. Fiorillo, C. D., Tobler, P. N., & Schultz, W. (2003). Discrete Coding of Reward Dopamine Neurons. Science, 299, 1898–1902.
26. Friedman, M. C., & Castel, A. D. (2011). Are we aware of our ability to forget? Metacognitive predictions of directed forgetting. Memory & Cognition, 39, 1448–1456.
27. Glimcher, P. W. (2011). Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proceedings of the National Academy of Sciences of the United States of America, 108(Suppl), 15647–15654.
28. Gruber, M. J., Gelman, B. D., & Ranganath, C. (2014). States of Curiosity Modulate Hippocampus-Dependent Learning via the Dopaminergic Circuit. Neuron, 84(2), 486–496.
29. Gruber, M. J., Ritchey, M., Wang, S.-F., & Doss, M. K. (2016). Post-learning Hippocampal Dynamics Promote Preferential Retention of Rewarding Events. Neuron, 1–11.
30. Hayes, M. G., Kelly, A. J., & Smith, A. D. (2013). Working Memory and the Strategic Control of Attention in Older and Younger Adults. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 68, 176–183.
31. Hollerman, J. R., & Schultz, W. (1998). Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neuroscience, 1, 304–9.
32. Howard-Jones, P., Jay, T., Mason, A., & Jones, H. (2016). Gamification of learning deactivates the default mode network. Frontiers in Psychology, 6, 1–16.
33. Hsu, M., Krajbich, I., Zhao, C., & Camerer, C. F. (2009). Neural response to reward anticipation under risk is nonlinear in probabilities. The Journal of Neuroscience, 29, 2231–2237.
34. Jang, A. I., Nassar, M. R., Dillon, D. G., & Frank, M. J. (2018). Positive reward prediction errors strengthen incidental memory encoding.
35. Jeffreys, H. (1961). Theory of probability. Oxford: OUP.
36. […], K. A., Mosharov, E. V., Choi, S. J., Sulzer, D., & Kandel, E. R. (2016). Dopamine release from the locus coeruleus to the dorsal hippocampus promotes spatial learning and memory. Proceedings of the National Academy of Sciences, 113(51), 14835–14840.
37. Konstantinidis, E., Taylor, R. T., & Newell, B. R. (2017). Magnitude and incentives: revisiting the overweighting of extreme events in risky decisions from experience.
38. Lehman, M., & Malmberg, K. J. (2009). A global theory of remembering and forgetting from multiple lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 970–988.
39. Lengyel, M., & Dayan, P. (2007). Hippocampal Contributions to Control: The Third Way. NIPS, 1–8.
40. Lisman, J., & Grace, A. A. (2005). The hippocampal-VTA loop: controlling the entry of information into long-term memory. Neuron, 46, 703–13.
41. Liu, X., Hairston, J., Schrier, M., & Fan, J. (2011). Common and distinct networks underlying reward valence and processing stages: A meta-analysis of functional neuroimaging studies. Neuroscience and Biobehavioral Reviews, 35, 1219–1236.
42. Loftus, G. R., & Wickens, T. D. (1970). Effect of incentive on storage and retrieval processes. Journal of Experimental Psychology: General, 85, 141–147.
43. Ludvig, E. A., Madan, C. R., Mcmillan, N., & Spetch, M. L. (2018). Living Near the Edge: How Extreme Outcomes and Their Neighbors Drive Risky Choice, 147(12), 1905–1918.
44. Ludvig, E. A., Sutton, R. S., & Kehoe, E. J. (2008). Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation, 20, 3034–3054.
45. Madan, C. R. (2017). Motivated Cognition: Effects of Reward, Emotion, and Other Motivational Factors Across a Variety of Cognitive Domains. Collabra: Psychology, 3(1), 24.
46. Madan, C. R., Ludvig, E. A., & Spetch, M. L. (2014). Remembering the best and worst of times: Memories for extreme outcomes bias risky decisions. Psychonomic Bulletin & Review, 21, 629–636.
47. Marvin, C. B., & Shohamy, D. (2016). Curiosity and reward: Valence predicts choice and information prediction errors enhance learning. Journal of Experimental Psychology: General, 145(3), 266–272.
48. Mason, A., Farrell, S., Howard-Jones, P., & Ludwig, C. J. (2017a). The role of reward and reward uncertainty in episodic memory. Journal of Memory and Language, 96, 62–77.
49. Mason, A., Ludwig, C., & Farrell, S. (2017b). Adaptive scaling of reward in episodic memory: A replication study. Quarterly Journal of Experimental Psychology, 70(11), 2306–2318.
50. Mather, M., Clewett, D., Sakaki, M., & Harley, C. W. (2015). Norepinephrine ignites local hot spots of neuronal excitation: How arousal amplifies selectivity in perception and memory. Behavioral and Brain Sciences, 1–100.
51. Mather, M., & Schoeke, A. (2011). Positive outcomes enhance incidental learning for both younger and older adults. Frontiers in Neuroscience, 5, 129.
52. Miendlarzewska, E. A., Bavelier, D., & Schwartz, S. (2016). Influence of reward motivation on human declarative memory. Neuroscience and Biobehavioral Reviews, 61, 156–176.
53. Morey, R. D. (2008). Confidence Intervals from Normalized Data: A correction to Cousineau (2005). Reason, 4, 61–64.
54. Murayama, K., & Kitagami, S. (2014). Consolidation power of extrinsic rewards: reward cues enhance long-term memory for irrelevant past events. Journal of Experimental Psychology: General, 143, 15–20.
55. Murty, V. P., DuBrow, S., & Davachi, L. (2018). Decision-making Increases Episodic Memory via Postencoding Consolidation. Journal of Cognitive Neuroscience, 26(3), 1–10.
56. Oberauer, K., Lewandowsky, S., Farrell, S., Jarrold, C., & Greaves, M. (2012). Modeling working memory: an interference model of complex span. Psychonomic Bulletin & Review, 19, 779–819.
57. Ozcelik, E., Cagiltay, N. E., & Ozcelik, N. S. (2013). The effect of uncertainty on learning in game-like environments. Computers and Education, 67, 12–20.
58. Patil, A., Murty, V. P., Dunsmoor, J. E., Phelps, E. A., & Davachi, L. (2017). Reward retroactively enhances memory consolidation for related items. Learning and Memory, 24(1), 65–69.
59. Preuschoff, K., Bossaerts, P., & Quartz, S. R. (2006). Neural differentiation of expected reward and risk in human subcortical structures. Neuron, 51, 381–90.
60. Preuschoff, K., Quartz, S. R., & Bossaerts, P. (2008). Human insula activation reflects risk prediction errors as well as risk. The Journal of Neuroscience, 28, 2745–2752.
61. Preuschoff, K., ’t Hart, B. M., & Einhauser, W. (2011). Pupil dilation signals surprise: Evidence for noradrenaline’s role in decision making. Frontiers in Neuroscience, 5, 1–12.
62. Raftery, A. E. (1995). Bayesian Model Selection in Social Research. Sociological Methodology, 25, 111–163.
63. Rolls, E. T., McCabe, C., & Redoute, J. (2008). Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cerebral Cortex, 18, 652–663.
64. Rouhani, N., Norman, K. A., & Niv, Y. (2018). Dissociable Effects of Surprising Rewards on Learning and Memory. Journal of Experimental Psychology: Learning, Memory, and Cognition.
65. Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1–27.
66. Schultz, W. (2002). Getting formal with dopamine and reward. Neuron, 36, 241–63.
67. Schultz, W., Preuschoff, K., Camerer, C., Hsu, M., Fiorillo, C. D., Tobler, P. N., & Bossaerts, P. (2008). Explicit neural signals reflecting reward uncertainty. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 363, 3801–3811.
68. Shaikh, N., & Coulthard, E. (2013). Memory consolidation. Mechanisms and opportunities for enhancement. Translational Neuroscience, 4(4), 448–457.
69. Shohamy, D., & Adcock, A. (2010). […]. Trends in Cognitive Sciences, 14, 464–72.
70. Spaniol, J., Schain, C., & Bowen, H. J. (2013). Reward-Enhanced Memory in Younger and Older Adults. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 1–11.
71. Stefanidi, A., Ellis, D. M., & Brewer, G. A. (2018). Free recall dynamics in value-directed remembering. Journal of Memory and Language, 100, 18–31.
72. Studte, S., Bridger, E., & Mecklinger, A. (2016). Sleep spindles during a nap correlate with post sleep memory performance for highly rewarded word-pairs. Brain and Language.
73. Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
74. Takeuchi, T., Duszkiewicz, A. J., Sonneborn, A., Spooner, P. A., Yamasaki, M., Watanabe, M., Morris, R. G. M., et al. (2016). Locus coeruleus and dopaminergic consolidation of everyday memory. Nature, 537, 357–362.
75. Tobler, P. N., Fiorillo, C. D., & Schultz, W. (2005). Adaptive coding of reward value by dopamine neurons. Science, 307, 1642–1645.
76. Tobler, P. N., O’Doherty, J. P., Dolan, R. J., & Schultz, W. (2007). Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. Journal of Neurophysiology, 97, 1621–1632.
77. Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14, 779–804.
78. Weakliem, D. L. (1999). A critique of the Bayesian Information Criterion for Model Selection. Sociological Methods & Research, 359–397.
79. Wimmer, G. E., Braun, E. K., Daw, N. D., & Shohamy, D. (2014). Episodic Memory Encoding Interferes with Reward Learning and Decreases Striatal Prediction Errors, 34, 14901–14912.
80. Wittmann, B. C., Dolan, R. J., & Düzel, E. (2011). Behavioral specifications of reward-associated long-term memory enhancement in humans. Learning & Memory (Cold Spring Harbor, N.Y.), 18(5), 296–300.
81. Wittmann, B. C., Schott, B. H., Guderian, S., Frey, J. U., Heinze, H.-J., & Düzel, E. (2005). Reward-related FMRI activation of dopaminergic midbrain is associated with enhanced hippocampus-dependent long-term memory formation. Neuron, 45(3), 459–67.
82. Wylie, G. R., Foxe, J. J., & Taylor, T. L. (2008). Forgetting as an active process: An fMRI investigation of item-method-directed forgetting. Cerebral Cortex, 18, 670–682.