Mapping anhedonia onto reinforcement learning: a behavioural meta-analysis
© Huys et al.; licensee BioMed Central Ltd. 2013
Received: 17 April 2012
Accepted: 9 May 2013
Published: 19 June 2013
Depression is characterised partly by blunted reactions to reward. However, tasks probing this deficiency have not distinguished insensitivity to reward from insensitivity to the prediction errors for reward that determine learning and are putatively reported by the phasic activity of dopamine neurons. We attempted to disentangle these factors with respect to anhedonia in the context of stress, Major Depressive Disorder (MDD), Bipolar Disorder (BPD) and a dopaminergic challenge.
Six behavioural datasets involving 392 experimental sessions were subjected to a model-based, Bayesian meta-analysis. Participants across all six studies performed a probabilistic reward task that used an asymmetric reinforcement schedule to assess reward learning. Healthy controls were tested under baseline conditions, stress or after receiving the dopamine D2 agonist pramipexole. In addition, participants with current or past MDD or BPD were evaluated. Reinforcement learning models isolated the contributions of variation in reward sensitivity and learning rate.
MDD and anhedonia reduced reward sensitivity more than they affected the learning rate, while a low dose of the dopamine D2 agonist pramipexole showed the opposite pattern. Stress led to a pattern consistent with a mixed effect on reward sensitivity and learning rate.
Reward-related learning reflected at least two partially separable contributions. The first related to phasic prediction error signalling, and was preferentially modulated by a low dose of the dopamine agonist pramipexole. The second related directly to reward sensitivity, and was preferentially reduced in MDD and anhedonia. Stress altered both components. Collectively, these findings highlight the contribution of model-based reinforcement learning meta-analysis for dissecting anhedonic behavior.
KeywordsAnhedonia Major depressive disorder Depression Reinforcement learning Reward learning Prediction error Computational Meta-analysis Reward sensitivity Learning rate
Anhedonia is one of the cardinal symptoms for a clinical diagnosis of major depressive disorder (MDD; [1–3]) and refers to an inability to experience pleasure or a diminished reactivity to pleasurable stimuli. It is typically measured by verbal reports. In people subjectively reporting anhedonia, reward feedback objectively has less impact in a variety of behavioural tasks [4–15]. However, modern accounts of decision-making distinguish structurally different ways in which this reduction might be realized, and precisely which of these is associated with anhedonia remains unclear.
Here, we attempt to distinguish two critical factors. The first factor is a reduction in the primary sensitivity to rewards. This is possibly the closest behavioural equivalent to the notion of a reduction in consummatory pleasure. Instruments measuring anhedonia, such as the relevant subscores of the Beck Depression Inventory  or the Mood and Anxiety Symptom Questionnaire (MASQ)  typically focus on this factor . The second factor is an alteration in participants’ ability to learn from reward feedback. This is emphasized by preclinical animal models of depression [19–22] and, because of the close association between dopamine (DA) signalling and reward learning [23–27], by neurobiological accounts linking anhedonia to DA [11, 14, 28–35]. It is most important to separate these factors, since they are likely to be associated with radically different ætiologies and therapeutic routes.
The distinction between these two factors is sharp in the mathematical formulation of reward learning based on prediction errors that underpins the account of DA activity [23–27], and is itself based on principles articulated in psychological  and engineering [37, 38] theories of learning. Consider an experiment in which a reward of a given magnitude is given stochastically on some trials: where r t =1 if the participant did receive the reward on trial t, and r t =0 if it did not. We write ρ for the value the participant assigns to this reward. The participants are assumed to build and maintain an expectation () of the average reward it might gain on this trial, by means of a prediction error on trial t which is the difference between the obtained ρ r t and expected reward. This prediction error can be used to improve the expectations adaptively  by adding the error to the previous expectation , where 0≤ϵ≤1 is a learning rate. This reduces the prediction error over time, at least on average.
The critical factors that might be associated with anhedonia are the two parameters ρ and ϵ in the above expressions. First, ρ is a measure of primary reward sensitivity – the larger ρ, the more sensitive the participant is to a reward of a given magnitude, or the greater the internal worth of an external reward. We comment on the difference between liking and wanting rewards later . Second, the term ϵ governs how reward prediction errors influence learning. Evidence suggests that the phasic activity of DA cells is roughly proportional to δ t ; thus, alterations in the amount of DA released per spike, or in the sensitivity of postsynaptic receptors should behave like a change in the learning rate ϵ and affect the speed at which reward affects behaviour .
The fact that varying either parameter can lead to similar qualitative patterns shows that the two parameters play partially replaceable roles and may not be fully separable . Any separation is likely to require substantial amounts of data. We therefore here fitted trial-by-trial individual reinforcement learning (RL) models in a meta-analytic manner to a series of six experiments involving 392 experimental sessions across 313 different participants. Model fitting allows for comprehensive tests of the ability of each hypothesis to account for the entire dataset. Bayesian model comparison ensures that the conclusions are based on parsimonious accounts of the data, avoiding overfitting and overly complex explanations [42–46].
High vs low BDI scores
Healthy volunteers. Payment: US$5 and course credit.
MDD vs controls
23 participants during an episode of MDD and 25 controls matched for age, sex, education, ethnicity and marital status. MDD participants met DSM-IV criteria for MDD, had Hamilton Rating Scale for depression scores ≥ 17, and no other axis I comorbidities except for anxiety. Inclusion required a minimal drug-free period of 2 weeks. Payment: US$10/hour plus 5US$ average task earnings.
History of MDD vs no history
Currently healthy participants with and without a history of major depressive disorder (MDD). Participants received US$15/hr in compensation for their time, as well as their task “earnings” (on average, US$5).
BPD vs controls
Euthymic outpatients (matched to the same 25 controls as in dataset ‘MDD’). The outpatients had a long-standing diagnosis of Bipolar Disorder, currently satisfying criteria on the Affective Disorder Evaluation, which contains modified SCID mood and psychosis modules. Patients were classified as euthymic if they currently Young Mania Rating Scale  score <12, and if their Hamilton Rating Scale for Depression  score was below 8. Exclusion criteria included other axis I disorders, a past history of MDD, substance abuse and ECT in the past 6 months. Participants were paid US$25 for participation and task earnings.
Pramipexole vs placebo
Healthy volunteers received either placebo or a single dose of the D2/D3 agonist pramipexole hydrochloride (PPX) 0.5 mg 2 hours prior to the task. At this low dose, PPX is thought to reduce phasic DA release through autoreceptor stimulation. Payment: US$ 40 for the pharmacological session and US$24.60 for the task session.
Threat-of-shock acute vs no stressor
Healthy volunteers took part in the task twice (one missing session), once in a no-stress condition and once in a stress condition. Participants were told that poor performance on the task might lead to a shock being delivered through electrodes attached to the back of their neck. In the stress condition, they were told that this was quite likely, whereas they were told that no shock would be delived in the no-stress condition. No shocks were actually delivered. Notably, the version of the task used in this study was more difficult, with the difference in size between long and short mouth being smaller. This resulted in fewer correct discriminations (see Figure 1F). Payment: either course credit or US$10/hour as well as money “won” during the task (US$10.60 on average).
Our main result is that measures of anhedonia and depression preferentially affected the reward sensitivity ρ, while a dopamine manipulation by pramipexole mainly affected the learning rate ϵ. Stress, however, affected both reward sensitivity and the learning rate. There was no difference in these two parameters in euthymic bipolar individuals, or those with a past history of depression.
Task and data
Sample characteristics: psychometric measures
Generalized distress depression (GDD)
Generalized distress anxiety (GDA)
Anxious anxiety (AA)
Anhedonic Depression (AD)
Full (BDI + BDA + MASQ subscores)
Reinforcement learning models
The first of these terms, , depends on instructions: if a t is the instructed choice for stimulus s t (for instance pressing ‘z’ for the long mouth) and is zero otherwise. The parameter γ thus determines the participants’ ability to follow the instructions. The bigger γ, the larger the contribution from , and hence the instructed response contributes more to choice. Importantly, this instructed choice is symmetric between rich and lean stimulus; thus this term leaves the asymmetry to the other terms.
The second and the third term depend on the expected reward . This captures the effect of the experienced rewards on previous trials, just as described in the introduction (except allowing different predictions for the different actions and stimuli). depends on four factors: the binary sequence r t up to that point in time, which indicates whether a reward was delivered or not, an initial value, the learning rate ϵ and the subjective (i.e. internal to the participant as opposed to the external magnitude in a fixed number of cents) effect size of a reward ρ, which we identify with reward sensitivity.
That is, after every trial, the expected reward for choice a for stimulus s is increased towards the subjective reward size ρ if a reward is received (r t =1) but the expectation was lower than ρ, and it is decreased towards zero if no reward was received (r t =0). The larger ρ, the larger the effect of rewards on choice propensities. As the learning rate ϵ approaches 1, learning is so fast that the values are simply the last experienced outcome for each choice-action pair. For 0<ϵ<1, expectations represent exponentially weighted averages over the recent outcome history. A multiplicative change to δ is equivalent to a change in ϵ.
In the task, the mouth is only shown for a very short period of time. Thus participants cannot be sure which stimulus was actually presented, and, as experimenters, we cannot know what the participants perceived. This uncertainty has two consequences. First, it implies that the factor γ which governs the effect of the instructions, should be less than ∞. Second, the participants will not be sure which value or and instruction weight to employ in their choice, or which value to update using Equation 4. We capture this effect by assuming that they know which stimulus-choice pair to update in terms of learning (Equation 4), but that when choosing, they use a form of Bayesian decision theory  to combine estimates based on both possibilities. That is, we use a parameter 0≤ζ≤1 to represent participants’ average uncertainty about which stimulus was actually presented. Assume participants expected.75 unit reward for pressing button ‘z’ given the long mouth (), and 0 given the short stimulus (). If they now believed with a probability ζ that they had seen stimulus s and with a probability 1−ζ that they had seen stimulus , then their expectation for pressing button ‘z’ would be . This is the contribution of in Equation 3. We write ζ as applying to the term only. We could also apply it to the instruction term , but this would be equivalent to rescaling γ (see Additional file 1).
Equations 3 and 4 comprise the full model ‘Belief’. We also considered two simpler variants, both of which had one fewer free parameter, and one more complicated variant, with an extra parameter. First, at ζ=1, participants are certain, and use the correct stimulus-action value to guide their choice. The model in which this value is forced is called ‘Stimulus-action’. At ζ=0.5, they are indiscriminate between the two stimuli. This is model ‘Action’ because they learn only about the values of choices, independently of stimuli. Finally, the more complex model ‘Punishment’ is based on the possibility that participants might treat a non-reward as a punishment, so making , where ρ>0. We mention other possible models in the discussion.
Model fitting & comparison
Bayesian model comparison at the group level and model fitting procedures are described in detail in . The key equations are provided in the Additional file 1. Briefly, model fitting was performed by Expectation-Maximisation to find group priors and individual (Laplace) approximate posterior distributions for the estimates for each parameter for each participant. For models ‘Action’ and ‘Stimulus-action’, this comprised parameters ; for model ‘Belief’ it additionally included parameter ζ, and for model ‘Punishment’ also ρ−. All parameters were represented as non-linearly transformed variables with support on the real line and normally distributed group priors.
More complex models will often fit the data better because they have more freedom. However, model complexity is better assessed by methods other than counting parameters [43, 59]. Bayesian model comparison is a principled way of assessing model parsimony by computing the posterior probability of each model given the entire dataset for all participants. Because exact computation of these quantities is intractable, individual parameters were integrated out by sampling from the fitted priors, and a standard Bayesian information criterion (BIC)-like approximation was employed at the group level . This procedure results in a measure we term iBIC (for ‘integrated BIC’) which captures how well each model explains the data given how complex it is . The smaller this number, the greater the model parsimony. The difference between two such values, ΔiBIC, is an approximation of the models’ relative log Bayes factor.
The same principles of model comparison also apply to the categorical question whether two groups differ in terms of their parameters. That is, when asking whether group A and B differ in terms of their reward sensitivity ρ or in terms of their learning rate ϵ, the correct approach is to compute the posterior likelihood of models incorporating these hypotheses about group differences. That is, we computed pairs of models for each dataset: in the first model of each pair , which allows group differences in ρ, participants share a common prior for all parameters except for ρ, for which the two groups have separate priors. In the second model , participants similarly share a prior for all parameters except for ϵ. Computing the Bayes factors (ΔiBIC) values for these two models relative to each other indicates whether group differences in one or the other parameter provide a more parsimonious account of the entire dataset while taking into account the relative flexibility each parameter accords the model and interactions between parameters. The Bayes factors of these models relative to the original model with no group separation, indicate whether the groups significantly differ in either characteristic.
After model validation, we first assessed inter-correlations between specific questionnaire measures (AD, BDA, GDD, BDI\A, AA, GDA) and reward sensitivity or learning rate in the entire sample using one multiple linear regression analysis for ρ and one for ϵ (regstats.m in Matlab V7.14). However, this analysis neglected two aspects of the data: first that the questionnaire data are likely correlated; and secondly that parameters for different participants are estimated with different degrees of confidence. We therefore ran a weighted hierarchical multivariate regression. This is equivalent to a standard hierarchical multivariate regression, except that parameters were weighted by the precisions with which they were estimated (see  for details). Note that parameters are represented in the transformed space throughout to avoid issues with non-Gaussianity.
We built a set of models that embody key hypotheses about the course of learning in the different groups and fitted them to the data. The models parameterize the type of learning performed by participants, allowing us to assess whether they attach rewards to stimulus-action pairs; or just to actions, or to a mixture of the two. We also test the status of ‘no reward’: do participants treat this outcome as a punishment, reducing the probability of the associated action, or do they treat it as a non-informative null outcome, as intended?
Since we are interested in understanding the characteristics of groups of individuals, we need to ascertain at the group, rather than at the individual level, which model does best [43, 45, 58]. We indeed found that, taking suitable account of complexity, a single model did capture all groups satisfactorily, allowing for a common and interpretable semantics for the parameters. These so-called random effects models capture inter-subject variation in a way that allows for differences in the extent to which individual participants’ data constrain the parameters. When we perform correlation analyses (below), we weigh parameters according to the precision with which they were inferred.
Both the standard Rescorla-Wagner model ‘Stimulus-action’ with separate stimulus-choice values , and model ‘Punishment’, which treats trials on which no reward was given as punishing, performed poorly with much worse iBIC values. This suggests that participants were not able to treat the two stimuli as entirely separate; and that they did not treat the absence of rewards as punishments with aversive properties. Model ‘Action’, which assumes that participants learn only action values, and thus do not separate between stimuli at all in the asymmetric component of choice (ζ=0.5) was an improvement over the basic model ‘Stimulus-action’ (ζ=1) but came a distant second to the model ‘Belief’ (ζ inferred for each session). Indeed, the belief parameter ζ was broadly distributed around 0.5 (mean 0.53, standard deviation 0.13). That is, on average, participants behave as model ‘Action’, neglecting the stimuli when learning and forming expectations. Individually, however, there was variability in how well they were able to discriminate the stimuli.
Figure 1F shows the probability of performing a correct choice for all participants in all datasets, with those not predicted better than chance by model ‘Belief’ circled in blue. This model correctly predicted significantly more choices than chance (binomial test, p<.05) for most experimental sessions (329 out of 392; black dots). Due to a smaller difference between long and short mouth, and hence a more difficult perceptual discrimination problem (Table 1), participants in the stress dataset– whether in the stress or the no stress condition–showed on average a lower probability of correct choice. They were consequently less well predicted by all models, but model ‘Belief’ still gave the best account of these data too (see Additional file 1: Section S2.1). Figure 2B shows that, as intended, the overall probability correct was captured by the instruction sensitivity parameter γ (Pearson correlation 0.93, p<10−20), which in turn frees the other parameters to capture trial-to-trial variation in the behaviour contingent on the reward outcomes.
Finally, Additional file 1: Section S2.2 also provides the results of a resampling analysis. This shows that if the model is run on the task, it spontaneously generates data that looks similar to the experimental data.
Given that model ‘Belief’ captured the data satisfactorily, we proceeded to analyse the relationship between model parameters and self-report questionnaire measures. Our aims were primarily to identify correlations between measures of anhedonia and learning rates or reward sensitivities. A standard, unweighted, multiple linear regression analysis revealed a significant negative correlation between ρ and anhedonic depression (AD), the most specific measure of anhedonic symptoms available to us (p=0.004; uncorrected for multiple comparisons). There was no correlation between ρ and other questionnaire measures, and no correlation between the questionairre measures and the learning rate ϵ (all p>0.1). Participants in the ‘Stress’ dataset were tested twice. Repeating this analysis using only the session without stress yielded similar results (p=0.007 for the correlation ρ vs AD, all other p>0.09).
Next, there was a negative linear correlation between ρ and ϵ of -0.41 (p<0.0001; Figure 3D). To further question the selectivity of the correlation between AD and ρ, we orthogonalized ρ with respect to ϵ in addition to orthogonalising as above. This again yielded an (unweighted) significant correlation of AD with ρ (p=0.004, multiple weighted linear regression, uncorrected for multiple comparisons), with no other correlations significant. The reverse orthogonalization did not yield any significant correlations with ϵ.
At least part of the correlation between ρ and ϵ arises because the the two parameters can explain similar features of the data, i.e. alterations in one parameter can be compensated for by alterations in the other parameter (see Figure 1). To establish whether the association between AD and the reward sensitivity parameter was due to real features in the data, rather than due to inference issues, we asked whether the correlations with questionnaire measures remained stable and identifiable in the surrogate data. For each of the 70 surrogate datasets of 392 experimental sessions, we repeated the standard, unweighted multiple linear regression. Figure 3E shows the distribution of p values for the correlation of AD with ρ and ϵ. While the median p value for the correlation between ρ and AD was 0.04, that for ϵ and AD was 0.71.
Finally, all correlation analyses using the reward sensitivity and learning parameters inferred from the second-best model ‘Action’ yielded the same results, showing that the results are not dependent on a particular model formulation.
We next examined how learning rate and reward sensitivity were affected by the factors explored in each of the individual datasets. For each dataset, we compared two models: one which assumes that the two experimental groups differed in terms of ρ, the other in terms of ϵ.
However, the Bayes factors comparing models and to the basic ‘Belief’ model without split parameters were all ≤3. This means that there was no strong evidence that either of these parameters categorically separates any two groups. Stress appeared most likely to have a genuinely shared effect on both parameters. Participants with a history of depression, or currently euthymic bipolar participants showed weak evidence of reductions in reward sensitivity. Note that all these ratios are by necessity numerically more modest than those in Additional file 1: Figure S1 because they are inferred from far less data.
Our results suggest that anhedonia (as measured by AD) and MDD affect appetitive learning more by reducing the primary sensitivity to rewards ρ than by affecting the learning rates. Non-significant trends for such an effect were observed for participants with a past history of depression, and amongst euthymic bipolar disorder patients. By contrast, a dopaminergic manipulation was found to preferentially affect the speed of learning, ϵ. Acute stress had no preferential effect on reward sensitivity or learning rate. These two parameters appear to be state, rather than trait measures: Neither a past history of depression, nor one of BPD, had a significant effect on ρ or ϵ.
Two self-report measures of anhedonia were used in this paper: the anhedonic depression subscore of the MASQ questionnaire, and the anhedonic subscore of the BDI. The former was clearly related to the reward sensitivity ρ. This subscore quantifies participants’ verbally expressed inability to experience pleasure, and so might be expected to capture something akin to consummatory ‘liking’ . Note, however, that various studies (e.g. [18, 61, 62]) have failed to find a direct correlate between anhedonia measures such as those employed here and participants’ ratings of how pleasant a sucrose solution is—a widely-used animal model of anhedonia .
By contrast, in our paradigm, ρ falls squarely within the behavioural definition of ‘wanting’ whereby past experiences are collated into values that influence choice. One possibility is thus that our results reflect the way that ‘liking’ is coupled into ‘wanting’, with the reward having less impact when it occurred (ρ was lower), rather than the same amount of pleasure not being translated efficiently into anticipation (which would be the consequence of lowered ϵ). Such a reduction in reward sensitivity parallels reductions in emotional reactions seen in MDD in response to appetitive, but also aversive, stimuli .
At a neurobiological level, ‘liking’, has been linked to μ-opioid signalling [40, 65–67] in NAcc shell. Recent PET studies have reported prefrontal changes in μ-opioid receptors in MDD , and μ-opioid receptors have been found to affect the response to antidepressant medications . We also note that substance abuse disorders, including of opiates, are prominently co-morbid with depressive disorders [70, 71], although it is not clear whether abuse causes depression or vice versa. It would be interesting to examine the effect of opiates on tasks like those examined here. If the primary reward sensitivity relates to “liking” and consummatory pleasure, then one would expect a preferential impact on ρ, unlike dopamine.
Dopamine in depression
The temporal difference learning model of phasic dopamine signalling posits that DA reports the prediction error δ. As mentioned in the introduction, a multiplicative change in this signal would be equivalent to a change in the learning rate ϵ. Although an alteration in the reward sensitivity ρ necessarily also leads to an alteration in the prediction error δ, the consequences of an abnormality that arises upstream of the prediction error are subtly different from an abnormality affecting the already computed prediction error signal. It is the latter that is most closely aligned with a primary alteration in dopaminergic function.
In the current study, we report tentative findings suggesting that pramipexole reduced the learning rate rather than affecting the reward sensitivity, which might correspond to a direct reduction in the signal reported by phasic DA. Although pramipexole is a non-ergot D 2/D3agonist (and is clinically used as such in Parkinson’s disease, restless leg syndrome, and occasionally in treatment-resistant MDD), it has previously been found to have behavioural effects of a DA antagonist at the low doses (0.5 mg) used in our dataset [72–74]. Similarly, low doses of the D 2 agonist cabergoline have been found to specifically reduce reward go learning . With both drugs, it has been postulated that this may be due to activation of inhibitory  presynaptic D 2 receptors, which have a higher affinity for DA  and reduce phasic DA release ([78–82]; see also ) and DA cell firing . Thus, our findings echo those of a recent study showing that dopamine agonists can reinstate prediction errors by restoring the component of the prediction error related to expected value, rather than that of reward . However, we emphasize that the conclusions pertaining to dopamine rest on the contribution of only one study and that there are important technical caveats to be borne in mind (see below).
As indicated in the introduction, DA has multiple and profound involvements with depression. These range from the fact that DA manipulations affect mood  and that many antidepressants have pro-DA effects (including buproprion, sertraline, nomifensine, tranylcypromine amongst others; ), to a role in resilience [87, 88], and suicidality in MDD [89, 90]. Critically, depression is common in other disorders that also involve DA dysfunction, such as Parkinson’s disease and schizophrenia [91, 92]. However, the findings reported here speak to the specific aspects of phasic DA in learning, rather than to all the manifold ways in which DA may be involved in anhedonia and depression. Tonic DA levels, which are partially independent of phasic bursts of activity [93, 94] have a profound impact on energy levels and vigour [95, 96] making this feature of DA a likely candidate for the severe psychomotor retardation seen in melancholic depression , and to the distinction between pleasure and motivational aspects of anhedonia [18, 97]. In the context of our task, this may relate to reaction times (albeit noting that these can also be affected by phasic DA signals; ), which were not analysed. Rather, here, we focused on the role of DA in instrumental learning of action values. Note, though, that the current design cannot completely disentangle instrumental effects from those due to Pavlovian approach [58, 99]. Such Pavlovian influences involve dopamine and the nucleus accumbens (NAcc; [100–105]), and may indeed constitute one of the ways in which DA supports antidepressant function [106, 107].
Overall, our findings are consistent with the fact that DA by itself is not a major target for psychopharmacological treatment of anhedonia , and, modulo the issues mentioned above, that DA appears to mediate ‘wanting’, more than ‘liking’ [40, 108]. There has been a number of recent functional imaging investigations into reward-related decision making in MDD (e.g. [11, 109–114]). Our findings have key implications for the interpretations of these studies, as it is important to separate contributions to the correlates of prediction errors that arise from changes in reward sensitivity from changes in learning rate. Because a) ‘wanting’ (reinforcement) and ‘liking’ (hedonic) aspects of tasks will often overlap; and b) both changes in ρ and ϵ would affect correlations with a regressor based on prediction errors δ, our findings do not invalidate these imaging results, but call for studies and analyses that further separate them.
Our conclusions were derived from a detailed, model-based meta-analysis combining behavioural data from six datasets in 392 sessions. Several features and limitations of this analysis deserve comment.
First, the interpretability of the parameters is maximised by using rigorous model comparisons. The Bayesian approach we used prevents overfitting by integrating out individual subject parameters [43, 45, 46, 58]. Rather than just comparing how careful parameter tuning can allow a model to fit the data extremely well, we extensively sampled the model’s entire parameter space, and ask how well, on average, the class of model, independent of its particular parameters, can fit the given data (for an insightful explanation, see chapter 28 of ). Next, our model comparison is done at the group level, rather than at the individual level, again through sampling and Bayesian model comparison. This ensures that conclusions about the parameters do apply to the group.
Second, our approach takes individual model fits into account. The regression analyses are true random effects analyses, weighting parameters by how strongly they are constrained by each participant’s own choice data, and by how well the model fits that particular participant. This, in combination with the explicit modelling of stimulus uncertainty (beliefs) and instruction weights, ensures that any non-specific performance variability does not unduly affect our parameters of interest. Furthermore, the weighted regression ensures that each participant influences the conclusions proportionally to how well they are fit by the model.
Third, it is standard practice to constrain the parameters when fitting models, for instance to avoid extreme outlier inference. We use two types of constraints. The parameter transformations generate hard constraints that force parameters to remain inside feasible regions. The empirical Bayesian inference of the group priors additionally yields the most appropriate soft constraints [59, 115].
Fourth, learning rate and reward sensitivity were correlated in all models tested. To alleviate this, we enforced independence at the group level. One standard approach would have been to compare models in which one dimension is constrained by forcing the parameters for all participants along that dimension to be equal. However, this a) is an unrealistic constraint; b) fails to address the fact that parameters may not give the model equal flexibility; and c) renders parameters hard to interpret as variability from one is squeezed into all the other parameters (which further aggravates point b). To circumvent these issues, we contrasted the parsimony of models that explicitly allowed participants to fall into distinct groups. Doing this at the group level addressed the questions at the group level, which is where we sought to draw conclusions. We also performed the regression analysis in a number of ways to assess the selectivity of the relationship with ρ and its stability in the inference procedure.
Fifth, it is important to note that our failure to discover correlations between AD and learning rate, or between the effects of pramipexol and reward sensitivity might be due to limits of power. This is particularly true for the categorical comparisons on which arguments about the differential effect of dopamine and anhedonia or depression rest mainly. It will be critical to replicate these findings in larger samples, potentially requiring paradigms explicitly adapted to separating the two factors.
Sixth, we find no evidence that either learning rate of reward sensitivity clearly separates any of the groups when comparing the basic model to models or that allow for the two groups to have different means in one of the other parameter. However, our central motivation was whether anhedonia associated more with unusual ρ or unusual ϵ, because of the different psychiatric and psychological interpretations of these possibilities. The most direct test of this is to compare to rather than comparing each to a mid-point in the shape of the basic model. Furthermore, there are a number of caveats to this finding, which form the basis for reporting the more lenient comparison between models and . First, the model comparison technique may be too conservative for this particular analysis. For instance, at the top level we very stringently punish according to the total number of observations. Although we have found this to overestimate the rate at which the group-level variance should drop with observations, in our hands this penalty has proven to yield the correct responses most reliably when tested on a variety of surrogate datasets (unpublished data in preparation). This is likely to be particularly important given that the power in the analyses in Figure 4 is much lower than that in the preceding analyses using the entire dataset. These considerations do not bear on comparisons between models and . Furthermore, the panels showing the parameters in Figure 4 appear to show robust group differences. However, a separation in parameter space does not guarantee that the increase in fit outweighs the increase in model complexity.
Of course, even though our best fitting model did an excellent job predicting the data of a plurality of participants, there could be a model that we did not try that would do even better. This is particularly true of the participants in the Stress dataset, who were fit the worst. There is a conventional ANOVA-like procedure in these circumstances that involves assessing the extent to which responses are potentially predictable [44, 116] from running the same task multiple times. Unfortunately, it is not clear how to execute this in cases involving learning which is contingent on the participants’ behaviour.
One advantage of the simplicity of the task is that it is likely insensitive to several aspects of reinforcement learning that are under current investigation, such as goal-directed versus habitual decision-making [117, 118]. However, one interesting direction would be to consider a more Bayesian treatment of the learning process, according to which the effective learning rate ϵ should be a function of the amount of experience that the participant has had, modulated by the participant’s belief that the contingencies are constant [119–121].
This elaborate account raises an important question about the factor ϵ in the models that we did build. Here, we considered the effect on learning of manipulating the magnitude of δ t ; but noted that this is exactly confounded in the magnitude of ϵ. Neurobiologically, though, there could be an alternative realization of ϵ, notably via cholinergic influences [122–124]. It would be interesting to examine the effect of cholinergic manipulations on the basic task.
Equally, in conventional reinforcement learning models, it is common to employ a variant of Equation 2 in which the terms are multiplied by another arbitrary constant which is usually written β and called an inverse temperature. The larger β, the more deterministic the choice between a t and , all else being equal. Thus β is often used as a surrogate for controlling exploration, as more stochastic choices (lower β) are more exploratory. However, in our model, β can substitute exactly for ρ (in a very similar way that ϵ can substitute for the magnitude of δ). Thus reward insensitivity could masquerade as over-exploration or (particularly in depression) as under-exploitation. These are not differentiable in this task. Nevertheless, the neurobiological realization of the exploration constant β has been suggested as being rather different – notably involving noradrenergic neuromodulation  – opening up another line of experimental investigation.
Finally, two relevant findings may be important for future model development. First, positive affective responses to positive events in daily life have recently been found to be stronger in depressed than in non-depressed individuals . Second, the impact of negative events has been found to be determined not by the strength of the immediate emotional reaction, but by the attributions made about it . Thus, it may be that the effects seen here can be modified by higher-order processes in ways that are critical for the development of psychopathology, and it may be useful to extend the present methods to such interactions .
This paper presented a model-based meta-analysis of behavioural data spanning several related manipulations and adds to a growing literature of behavioural correlates of depression [64, 128]. We concluded that anhedonia in depressive states was mediated by a change in reward sensitivity, which has different behavioural consequences from either stress or DA manipulations. Our analysis allowed us to draw these conclusions while taking into account as much as possible the variability between the datasets and participants; and the variability in other, unrelated aspects of task performance. We believe that similar analyses could be readily applied to other datasets. We hope that our findings will encourage the re-analysis of the dissociation between reward sensitivity and dopaminergic processes in depressive states.
The authors would like to thank the member of the Affective Neuroscience Laboratory for their assistance with collection of the data analysed in this manuscript.
Funding from the Gatsby Charitable Foundation (QH, PD) and Deutsche Forschungsgemeinschaft DFG GZ RA/1047/2-1 (QH). Collection of datasets presented in the current manuscript was supported by grants from the National Institute of Mental Health (R01 MH068376, R21 MH078979, R01 MH095809) and National Center for Complementary & Alternative Medicine (R21AT002974) awarded to DAP. DAP has also received consulting fees from ANT North America Inc. (Advanced Neuro Technology), AstraZeneca, Ono Pharma USA, Shire and Servier, as well as honoraria from AstraZeneca for projects unrelated to this study.
- American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders. 1994, American Psychiatric Association Press
- World Health Organization: International Classification of Diseases. 1990, World Health Organization Press
- Myin-Germeys I, Peeters F, Havermans R, Nicolson NA, DeVries MW, Delespaul P, Os JV: Emotional reactivity to daily life stress in psychosis and affective disorder: an experience sampling study. Acta Psychiatr Scand. 2003, 107 (2): 124-131.PubMed
- Costello CG: Depression: Loss of reinforcers or loss of reinforcer effectiveness?. Behav Ther. 1972, 3: 240-247.
- Akiskal HS, McKinney WT: Depressive disorders: toward a unified hypothesis. Science. 1973, 182 (107): 20-29.PubMed
- Blaney PH: Contemporary theories of depression: critique and comparison. J Abnorm Psychol. 1977, 86 (3): 203-223.PubMed
- Nelson RE, Craighead WE: Selective recall of positive and negative feedback, self-control behaviors, and depression. J Abnorm Psychol. 1977, 86 (4): 379-388.PubMed
- Henriques JB, Glowacki JM, Davidson RJ: Reward fails to alter response bias in depression. J Abnorm Psychol. 1994, 103 (3): 460-466.PubMed
- Henriques JB, Davidson RJ: Decreased Responsiveness to reward in depression. Cogn Emotion. 2000, 14 (5): 711-724.
- Pizzagalli DA, Jahn AL, O’Shea JP: Toward an objective characterization of an anhedonic phenotype: a signal-detection approach. Biol Psychiatry. 2005, 57 (4): 319-327. [http://dx.doi.org/10.1016/j.biopsych.2004.11.026]PubMed CentralPubMed
- Steele JD, Kumar P, Ebmeier KP: Blunted response to feedback information in depressive illness. Brain. 2007, 130 (Pt 9): 2367-2374. [http://dx.doi.org/10.1093/brain/awm150]PubMed
- Huys QJM: Reinforcers and control. Towards a computational ætiology of depression. PhD thesis. Gatsby Computational Neuroscience Unit, UCL, University of London 2007, [http://www.gatsby.ucl.ac.uk/qhuys/pub.html]
- Huys QJM, Vogelstein J, Dayan P, Bottou L: Psychiatry: Insights into depression through normative decision-making models. Advances in Neural Information Processing Systems 21. Edited by: Schuurmans D, Koller D, Bengio Y. 2009, MIT Press, 729-736.
- Chase HW, Frank MJ, Michael A, Bullmore ET, Sahakian BJ, Robbins TW: Approach and avoidance learning in patients with major depression and healthy controls: relation to anhedonia. Psychol Med. 2010, 40 (3): 433-440. [http://dx.doi.org/10.1017/S0033291709990468]PubMed
- Chase HW, Michael A, Bullmore ET, Sahakian BJ, Robbins TW: Paradoxical enhancement of choice reaction time performance in patients with major depression. J Psychopharmacol. 2010, 24 (4): 471-479. [http://dx.doi.org/10.1177/0269881109104883]PubMed
- Beck A, Steer R, Brown G: Manual for the Beck Depression Inventory-II. 1996, San Antonio: Psychological Corporation
- Watson D, Weber K, Assenheimer JS, Clark LA, Strauss ME, McCormick RA: Testing a tripartite model: I. Evaluating the convergent and discriminant validity of anxiety and depression symptom scales. J Abnorm Psychol. 1995, 104: 3-144.PubMed
- Treadway MT, Zald DH: Reconsidering anhedonia in depression: lessons from translational neuroscience. Neurosci Biobehav Rev. 2011, 35 (3): 537-555. [http://dx.doi.org/10.1016/j.neubiorev.2010.06.006]PubMed CentralPubMed
- Papp M, Klimek V, Willner P: Parallel changes in dopamine D2 receptor binding in limbic forebrain associated with chronic mild stress-induced anhedonia and its reversal by imipramine. Psychopharmacology. 1994, 115 (4): 441-446.PubMed
- Ichikawa J, Meltzer HY: Effect of antidepressants on striatal and accumbens extracellular dopamine levels. Eur J Pharmacol. 1995, 281 (3): 255-261.PubMed
- D’Aquila PS, Collu M, Gessa GL, Serra G: The role of dopamine in the mechanism of action of antidepressant drugs. Eur J Pharmacol. 2000, 405 (1-3): 365-373.PubMed
- Barr AM, Markou A, Phillips AG: A ‘crash’ course on psychostimulant withdrawal as a model of depression. Trends Pharmacol Sci. 2002, 23 (10): 475-482.PubMed
- Montague PR, Dayan P, Sejnowski TJ: A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci. 1996, 16 (5): 1936-1947.PubMed
- O’Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ: Temporal difference models and reward-related learning in the human brain. Neuron. 2003, 38 (2): 329-337.PubMed
- Bayer HM, Glimcher PW: Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 2005, 47: 129-141. [http://dx.doi.org/10.1016/j.neuron.2005.05.020]PubMed CentralPubMed
- Waelti P, Dickinson A, Schultz W: Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001, 412 (6842): 43-48. [http://dx.doi.org/10.1038/35083500]PubMed
- D’Ardenne K, McClure SM, Nystrom LE, Cohen JD: BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science. 2008, 319 (5867): 1264-1267. [http://dx.doi.org/10.1126/science.1150605]PubMed
- Kapur S, Mann JJ: Role of the dopaminergic system in depression. Biol Psychiatry. 1992, 32: 1-17. [http://dx.doi.org/10.1016/0006-3223(92)90137-O]PubMed
- Nutt DJ: The role of dopamine and norepinephrine in depression and antidepressant treatment. J Clin Psychiatry. 2006, 67 (Suppl 6): 3-8.PubMed
- Nestler EJ, Carlezon WA: The mesolimbic dopamine reward circuit in depression. Biol Psychiatry. 2006, 59 (12): 1151-1159. [http://dx.doi.org/10.1016/j.biopsych.2005.09.018]PubMed
- Dunlop BW, Nemeroff CB: The role of dopamine in the pathophysiology of depression. Arch Gen Psychiatry. 2007, 64 (3): 327-337. [http://dx.doi.org/10.1001/archpsyc.64.3.327]PubMed
- Parker G: Defining melancholia: the primacy of psychomotor disturbance. Acta Psychiatr Scand. 2007, 115 (Suppl 433) (2): 21-30.
- Mathew SJ, Manji HK, Charney DS: Novel drugs and therapeutic targets for severe mood disorders. Neuropsychopharmacology. 2008, 33 (9): 2080-2092. [http://dx.doi.org/10.1038/sj.npp.1301652]PubMed
- Schlaepfer TE, Cohen MX, Frick C, Kosel M, Brodesser D, Axmacher N, Joe AY, Kreft M, Lenartz D, Sturm V: Deep brain stimulation to reward circuitry alleviates anhedonia in refractory major depression. Neuropsychopharmacology. 2008, 33 (2): 368-377. [http://dx.doi.org/10.1038/sj.npp.1301408]PubMed
- Bewernick BH, Hurlemann R, Matusch A, Kayser S, Grubert C, Hadrysiewicz B, Axmacher N, Lemke M, Cooper-Mahkorn D, Cohen MX, Brockmann H, Lenartz D, Sturm V, Schlaepfer TE: Nucleus accumbens deep brain stimulation decreases ratings of depression and anxiety in treatment-resistant depression. Biol Psychiatry. 2010, 67 (2): 110-116. [http://dx.doi.org/10.1016/j.biopsych.2009.09.013]PubMed
- Rescorla R, Wagner A: A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. Classical Conditioning II: Current Research and Theory. Edited by: Black AH, Prokasy WF. 1972, Appleton-Century-Crofts, 64-99.
- Widrow B, Hoff M: Adaptive switching circuits. WESCON Convention Report, Volume 4. 1960, Institute of, Radio Engineers, 96-104.
- Sutton R, Barto A, et al: Toward a modern theory of adaptive networks Expectation and prediction. Psychol Rev. 1981, 88 (2): 135-170.PubMed
- Sutton RS, Barto AG: Reinforcement Learning: An Introduction. 1998, Cambridge: MIT Press, [http://www.cs.ualberta.ca/~sutton/book/the-book.html]
- Berridge KC, Robinson TE: What is the role of dopamine in reward hedonic impact, reward learning, or incentive salience?. Brain Res Rev. 1998, 28 (3): 209-269.
- Smith A, Li M, Becker S, Kapur S: A model of antipsychotic action in conditioned avoidance: a computational approach. Neuropsychopharm. 2004, 29 (6): 1040-1049.
- Daw N: Decision Making, Affect, and Learning: Attention and Performance XXIII. Edited by: Delgado MR, Phelps EA, Robbins TW. 2009, OUP
- Kass R, Raftery A: Bayes factors. J Am Stat Assoc. 1995, 90 (430): 773-795.
- Gelman A, Carlin J, Stern H, Rubin D: Bayesian Data Analysis. 2004, Chapman and Hall/CRC Press
- Stephan KE, Penny WD, Daunizeau J, Moran RJ, Friston KJ: Bayesian model selection for group studies. Neuroimage. 2009, 46 (4): 1004-1017. [http://dx.doi.org/10.1016/j.neuroimage.2009.03.025]PubMed CentralPubMed
- Huys QJM, Moutoussis M, Williams J: Are computational models of any use to psychiatry?. Neural Netw. 2011, 24 (6): 544-551. [http://dx.doi.org/10.1016/j.neunet.2011.03.001]PubMed
- Bogdan R, Santesso DL, Fagerness J, Perlis RH, Pizzagalli DA: Corticotropin-releasing hormone receptor type 1 (CRHR1) genetic variation and stress interact to influence reward learning. J Neurosci. 1324, 31 (37): 6-13254. [http://dx.doi.org/10.1523/JNEUROSCI.2661-11.2011]
- Pizzagalli DA, Iosifescu D, Hallett LA, Ratner KG, Fava M: Reduced hedonic capacity in major depressive disorder: evidence from a probabilistic reward task. J Psychiatr Res. 2008, 43: 76-87. [http://dx.doi.org/10.1016/j.jpsychires.2008.03.001]PubMed CentralPubMed
- Dutra S, Brooks N, Lempert K, Guardado A, Goetz E, Pizzagalli D: Reward responsiveness in a remitted depressed sample: Effects of gender and trait negative affect. 23rd Annual Meeting of the Society for Research in Psychopathology. 2009, Minneapolis:
- Pizzagalli DA, Goetz E, Ostacher M, Iosifescu DV, Perlis RH: Euthymic patients with bipolar disorder show decreased reward learning in a probabilistic reward task. Biol Psychiatry. 2008, 64 (2): 162-168. [http://dx.doi.org/10.1016/j.biopsych.2007.12.001]PubMed CentralPubMed
- Young RC, Biggs JT, Ziegler VE, Meyer DA: A rating scale for mania reliability, validity and sensitivity. Br J Psychiatry. 1978, 133: 429-435.PubMed
- Hamilton M: A rating scale for depression. J Neurol Neurosurg Psychiatry. 1960, 23: 56-62.PubMed CentralPubMed
- Pizzagalli DA, Evins AE, Schetter EC, Frank MJ, Pajtas PE, Santesso DL, Culhane M: Single dose of a dopamine agonist impairs reinforcement learning in humans: behavioral evidence from a laboratory-based measure of reward responsiveness. Psychopharmacol (Berl). 2008, 196 (2): 221-232. [http://dx.doi.org/10.1007/s00213-007-0957-y]
- Bogdan R, Pizzagalli DA: Acute stress reduces reward responsiveness: implications for depression. Biol Psychiatry. 2006, 60 (10): 1147-1154. [http://dx.doi.org/10.1016/j.biopsych.2006.03.037]PubMed CentralPubMed
- Tripp G, Alsop B: Sensitivity to reward frequency in boys with attention deficit hyperactivity disorder. J Clin Child Psychol. 1999, 28 (3): 366-375. [http://dx.doi.org/10.1207/S15374424jccp280309]PubMed
- Santesso DL, Evins AE, Frank MJ, Schetter EC, Bogdan R, Pizzagalli DA: Single dose of a dopamine agonist impairs reinforcement learning in humans: evidence from event-related potentials and computational modeling of striatal-cortical function. Hum Brain Mapp. 2009, 30 (7): 1963-1976. [http://dx.doi.org/10.1002/hbm.20642]PubMed CentralPubMed
- Green DM, Swets JA, et al: Signal Detection Theory and Psychophysics, Volume 1974. 1966, New York: Wiley
- Huys QJM, Cools R, Gölzer M, Friedel E, Heinz A, Dolan RJ, Dayan P: Disentangling the roles of approach, activation and valence in instrumental and pavlovian responding. PLoS Comput Biol. 2011, 7 (4): e1002028-[http://dx.doi.org/10.1371/journal.pcbi.1002028]PubMed CentralPubMed
- MacKay DJ: Information Theory, Inference and Learning Algorithms. 2003, Cambridge: CUP
- Huys QJM, Eshel N, O’Nions E, Sheridan L, Dayan P, Roiser JP: Bonsai trees in your head: how the Pavlovian system sculpts goal-directed choices by pruning decision trees. PLoS Comput Biol. 2012, 8 (3): e1002410-[http://dx.doi.org/10.1371/journal.pcbi.1002410]PubMed CentralPubMed
- Berlin I, Givry-Steiner L, Lecrubier Y, Puech AJ: Measures of anhedonia and hedonic responses to sucrose in depressive and schizophrenic patients in comparison with healthy subjects. Eur Psychiatry. 1998, 13 (6): 303-309. [http://dx.doi.org/10.1016/S0924-9338(98)80048-5]PubMed
- Dichter GS, Smoski MJ, Kampov-Polevoy AB, Gallop R, Garbutt JC: Unipolar depression does not moderate responses to the Sweet Taste Test. Depress Anxiety. 2010, 27 (9): 859-863. [http://dx.doi.org/10.1002/da.20690]PubMed CentralPubMed
- Willner P, Towell A, Sampson D, Sophokleous S, Muscat R: Reduction of sucrose preference by chronic unpredictable mild stress, and its restoration by a tricyclic antidepressant. Psychopharmacology. 1987, 93 (3): 358-364. [http://dx.doi.org/10.1007/BF00187257]PubMed
- Bylsma LM, Morris BH, Rottenberg J: A meta-analysis of emotional reactivity in major depressive disorder. Clin Psychol Rev. 2008, 28 (4): 676-691. [http://dx.doi.org/10.1016/j.cpr.2007.10.001]PubMed
- Smith KS, Berridge KC: The ventral pallidum and hedonic reward neurochemical maps of sucrose “liking” and food intake. J Neurosci. 2005, 25 (38): 8637-8649.PubMed
- Pecina S, Berridge KC: Hedonic hot spot in nucleus accumbens shell where do mu-opioids cause increased hedonic impact of sweetness?. J Neurosci. 1177, 25 (50): 7-11786.
- Berridge KC: ‘Liking’ and ‘wanting’ food rewards: brain substrates and roles in eating disorders. Physiol Behav. 2009, 97 (5): 537-550. [http://dx.doi.org/10.1016/j.physbeh.2009.02.044]PubMed CentralPubMed
- Kennedy SE, Koeppe RA, Young EA, Zubieta JK: Dysregulation of endogenous opioid emotion regulation circuitry in major depression in women. Arch Gen Psychiatry. 2006, 63 (11): 1199-1208. [http://dx.doi.org/10.1001/archpsyc.63.11.1199]PubMed
- Garriock HA, Tanowitz M, Kraft JB, Dang VC, Peters EJ, Jenkins GD, Reinalda MS, McGrath PJ, von Zastrow M, Slager SL, Hamilton SP: Association of mu-opioid receptor variants and response to citalopram treatment in major depressive disorder. Am J Psychiatry. 2010, 167 (5): 565-573. [http://dx.doi.org/10.1176/appi.ajp.2009.08081167]PubMed CentralPubMed
- Grant BF: Comorbidity between DSM-IV drug use disorders and major depression: results of a national survey of adults. J Subst Abuse. 1995, 7 (4): 481-497.PubMed
- Swendsen J, Conway KP, Degenhardt L, Glantz M, Jin R, Merikangas KR, Sampson N, Kessler RC: Mental disorders as risk factors for substance use, abuse and dependence: results from the 10-year follow-up of the National Comorbidity Survey. Addiction. 2010, 105 (6): 1117-1128. [http://dx.doi.org/10.1111/j.1360-0443.2010.02902.x]PubMed CentralPubMed
- Samuels ER, Hou RH, Langley RW, Szabadi E, Bradshaw CM: Comparison of pramipexole and amisulpride on alertness, autonomic and endocrine functions in healthy volunteers. Psychopharmacology (Berl). 2006, 187 (4): 498-510. [http://dx.doi.org/10.1007/s00213-006-0443-y]
- Samuels ER, Hou RH, Langley RW, Szabadi E, Bradshaw CM: Comparison of pramipexole and modafinil on arousal, autonomic, and endocrine functions in healthy volunteers. J Psychopharmacol. 2006, 20 (6): 756-770. [http://dx.doi.org/10.1177/0269881106060770]PubMed
- Samuels ER, Hou RH, Langley RW, Szabadi E, Bradshaw CM: Comparison of pramipexole with and without domperidone co-administration on alertness, autonomic, and endocrine functions in healthy volunteers. Br J Clin Pharmacol. 2007, 64 (5): 591-602. [http://dx.doi.org/10.1111/j.1365-2125.2007.02938.x]PubMed CentralPubMed
- Frank MJ, O’Reilly RC: A mechanistic account of striatal dopamine function in human cognition: psychopharmacological studies with cabergoline and haloperidol. Behav Neurosci. 2006, 120 (3): 497-517. [http://dx.doi.org/10.1037/0735-7044.120.3.497]PubMed
- Grace AA: Phasic versus tonic dopamine release and the modulation of dopamine system responsivity: a hypothesis for the etiology of schizophrenia. Neuroscience. 1991, 41: 1-24.PubMed
- Cooper J, Bloom F, Roth R: The Biochemical Basis of Neuropharmacology. 2003, USA: Oxford University Press
- Baudry M, Martres MP, Schwartz JC: In vivo binding of 3H-pimozide in mouse striatum: effects of dopamine agonists and antagonists. Life Sci. 1977, 21 (8): 1163-1170.PubMed
- Fuller RW, Clemens JA, Hynes MD: Degree of selectivity of pergolide as an agonist at presynaptic versus postsynaptic dopamine receptors implications for prevention or treatment of tardive dyskinesia. J Clin Psychopharmacol. 1982, 2 (6): 371-375.PubMed
- Tissari AH, Rossetti ZL, Meloni M, Frau MI, Gessa GL: Autoreceptors mediate the inhibition of dopamine synthesis by bromocriptine and lisuride in rats. Eur J Pharmacol. 1983, 91 (4): 463-468.PubMed
- Sumners C, de Vries JB, Horn AS: Behavioural and neurochemical studies on apomorphine-induced hypomotility in mice. Neuropharmacology. 1981, 20 (12A): 1203-1208.PubMed
- Schmitz Y, Benoit-Marand M, Gonon F, Sulzer D: Presynaptic regulation of dopaminergic neurotransmission. J Neurochem. 2003, 87 (2): 273-289.PubMed
- Piercey MF, Hoffmann WE, Smith MW, Hyslop DK: Inhibition of dopamine neuron firing by pramipexole, a dopamine D3 receptor-preferring agonist comparison to other dopamine receptor agonists. Eur J Pharmacol. 1996, 312: 35-44.PubMed
- Chowdhury R, Guitart-Masip M, Dayan P, Huys QJM, Düzel E, Dolan RJ, Lambert: Dopamine restores reward prediction errors in older age. Nat Neurosci. 2013, in press
- Tremblay LK, Naranjo CA, Cardenas L, Herrmann N, Busto UE: Probing brain reward system function in major depressive disorder: altered response to dextroamphetamine. Arch Gen Psych. 2002, 59 (5): 209-215.
- Berton O, Nestler EJ: New approaches to antidepressant drug discovery: beyond monoamines. Nat Rev Neurosci. 2006, 7 (2): 137-151. [http://dx.doi.org/10.1038/nrn1846]PubMed
- Krishnan V, Han MH, Graham DL, Berton O, Renthal W, Russo SJ, Laplant Q, Graham A, Lutter M, Lagace DC, Ghose S, Reister R, Tannous P, Green TA, Neve RL, Chakravarty S, Kumar A, Eisch AJ, Self DW, Lee FS, Tamminga CA, Cooper DC, Gershenfeld HK, Nestler EJ: Molecular adaptations underlying susceptibility and resistance to social defeat in brain reward regions. Cell. 2007, 131 (2): 391-404. [http://dx.doi.org/10.1016/j.cell.2007.09.018]PubMed
- Vialou V, Robison AJ, Laplant QC, Covington HE, Dietz DM, Ohnishi YN, Mouzon E, Rush AJ, Watts EL, Wallace DL, Iñiguez SD, Ohnishi YH, Steiner MA, Warren BL, Krishnan V, BolaÃśos CA, Neve RL, Ghose S, Berton O, Tamminga CA, Nestler EJ: DeltaFosB in brain reward circuits mediates resilience to stress and antidepressant responses. Nat Neurosci. 2010, 13 (6): 745-752. [http://dx.doi.org/10.1038/nn.2551]PubMed CentralPubMed
- Pitchot W, Hansenne M, Ansseau M: Role of dopamine in non-depressed patients with a history of suicide attempts. Eur Psychiatry. 2001, 16 (7): 424-427.PubMed
- Pitchot W, Reggers J, Pinto E, Hansenne M, Fuchs S, Pirard S, Ansseau M: Reduced dopaminergic activity in depressed suicides. Psychoneuroendocrinology. 2001, 26 (3): 331-335.PubMed
- Tandberg E, Larsen JP, Aarsland D, Cummings JL: The occurrence of depression in Parkinson’s disease. A community-based study. Arch Neurol. 1996, 53 (2): 175-179.PubMed
- Gelder M, Harrison P, Cowen P: Shorter Oxford Textbook of Psychiatry. 2006, Oxford: Oxford University Press
- Floresco SB, West AR, Ash B, Moore H, Grace AA: Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission. Nat Neurosci. 2003, 6 (9): 968-973. [http://dx.doi.org/10.1038/nn1103]PubMed
- Goto Y, Grace AA: Dopaminergic modulation of limbic and cortical drive of nucleus accumbens in goal-directed behavior. Nat Neurosci. 2005, 8 (6): 805-812. [http://dx.doi.org/10.1038/nn1471]PubMed
- Salamone JD, Correa M, Farrar AM, Nunes EJ, Pardo M: Dopamine behavioral economics, and effort. Front Behav Neurosci. 2009, 3 (13): [http://dx.doi.org/10.3389/neuro.08.013.2009]
- Niv Y, Daw ND, Joel D, Dayan P: Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacol(Berl). 2007, 191 (3): 507-520. [http://dx.doi.org/10.1007/s00213-006-0502-4]
- Gard D, Gard M, Kring A, John O: Anticipatory and consummatory components of the experience of pleasure: a scale development study. J Res Pers. 2006, 40 (6): 1086-1102.
- Satoh T, Nakai S, Sato T, Kimura M: Correlated coding of motivation and outcome of decision by dopamine neurons. J Neurosci. 2003, 23 (30): 9913-9923.PubMed
- Guitart-Masip M, Fuentemilla L, Bach DR, Huys QJM, Dayan P, Dolan RJ, Duzel E: Action dominates valence in anticipatory representations in the human striatum and dopaminergic midbrain. J Neurosci. 2011, 31 (21): 7867-7875. [http://dx.doi.org/10.1523/JNEUROSCI.6376-10.2011]PubMed CentralPubMed
- Parkinson JA, Olmstead MC, Burns LH, Robbins TW, Everitt BJ: Dissociation in effects of lesions of the nucleus accumbens core and shell on appetitive pavlovian approach behavior and the potentiation of conditioned reinforcement and locomotor activity by D-amphetamine. J Neurosci. 1999, 19 (6): 2401-2411.PubMed
- Dickinson A, Smith J, Mirenowicz J: Dissociation of Pavlovian and instrumental incentive learning under dopamine antagonists. Behav Neurosci. 2000, 114 (3): 468-483.PubMed
- Parkinson JA, Dalley JW, Cardinal RN, Bamford A, Fehnert B, Lachenal G, Rudarakanchana N, Halkerston KM, Robbins TW, Everitt BJ: Nucleus accumbens dopamine depletion impairs both acquisition and performance of appetitive Pavlovian approach behaviour: implications for mesoaccumbens dopamine function. Behav Brain Res. 2002, 137 (1-2): 149-163.PubMed
- Di Ciano P, Cardinal RN, Cowell RA, Little SJ, Everitt BJ: Differential involvement of NMDA, AMPA/kainate, and dopamine receptors in the nucleus accumbens core in the acquisition and performance of pavlovian approach behavior. J Neurosci. 2001, 21 (23): 9471-9477.PubMed
- Murschall A, Hauber W: Inactivation of the ventral tegmental area abolished the general excitatory influence of Pavlovian cues on instrumental performance. Learn Mem. 2006, 13 (2): 123-126. [http://dx.doi.org/10.1101/lm.127106]PubMed
- Lex B, Hauber W: The role of nucleus accumbens dopamine in outcome encoding in instrumental and Pavlovian conditioning. Neurobiol Learn Mem. 2010, 93 (2): 283-290. [http://dx.doi.org/10.1016/j.nlm.2009.11.002]PubMed
- Sasaki-Adams DM, Kelley AE: Serotonin-Dopamine interactions in the control of conditioned reinforcement and motor behaviour. Neuropsychopharm. 2001, 25 (3): 440-452.
- Iiguez SD, Warren BL, Bolaos-Guzmn CA: Short- and long-term functional consequences of fluoxetine exposure during adolescence in male rats. Biol Psychiatry. 2010, 67 (11): 1057-1066. [http://dx.doi.org/10.1016/j.biopsych.2009.12.033]
- Flagel SB, Clark JJ, Robinson TE, Mayo L, Czuj A, Willuhn I, Akers CA, Clinton SM, Phillips PEM, Akil H: A selective role for dopamine in stimulus-reward learning. Nature. 2011, 469 (7328): 53-57. [http://dx.doi.org/10.1038/nature09588]PubMed CentralPubMed
- Elliott R, Sahakian BJ, Michael A, Paykel ES, Dolan RJ: Abnormal neural response to feedback on planning and guessing tasks in patients with unipolar depression. Psychol Med. 1998, 28 (3): 559-71.PubMed
- Steele JD, Meyer M, Ebmeier KP: Neural predictive error signal correlates with depressive illness severity in a game paradigm. Neuroimage. 2004, 23: 269-2680.PubMed
- Knutson B, Bhanji JP, Cooney RE, Atlas LY, Gotlib IH: Neural responses to monetary incentives in major depression. Biol Psychiatry. 2008, 63 (7): 686-692. [http://dx.doi.org/10.1016/j.biopsych.2007.07.023]PubMed CentralPubMed
- Kumar P, Waiter G, Ahearn T, Milders M, Reid I, Steele JD: Abnormal temporal difference reward-learning signals in major depression. Brain. 2008, 131 (Pt 8): 2084-2093. [http://dx.doi.org/10.1093/brain/awn136]PubMed
- Pizzagalli DA, Holmes AJ, Dillon DG, Goetz EL, Birk JL, Bogdan R, Dougherty DD, Iosifescu DV, Rauch SL, Fava M: Reduced caudate and nucleus accumbens response to rewards in unmedicated individuals with major depressive disorder. Am J Psychiatry. 2009, 166 (6): 702-710. [http://dx.doi.org/10.1176/appi.ajp.2008.08081201]PubMed CentralPubMed
- Smoski MJ, Felder J, Bizzell J, Green SR, Ernst M, Lynch TR, Dichter GS: fMRI of alterations in reward selection, anticipation, and feedback in major depressive disorder. J Affect Disord. 2009, 118 (1-3): 69-78. [http://dx.doi.org/10.1016/j.jad.2009.01.034]PubMed CentralPubMed
- Dempster A, Laird N, Rubin D: Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc Series B (Methodological). 1977, 39: 1-38.
- Sahani M, Linden JF: How linear are auditory cortical responses?. Advances in Neural Information Processing Systems, Volume 15. Edited by: Becker S, Thrun S, Obermayer K. 2003, Cambridge: MIT Press, 109-116.
- Dickinson A, Balleine B: The role of learning in the operation of motivational systems. Stevens’ Handbook of Experimental Psychology, Volume 3. Edited by: Gallistel R. 2002, New York: Wiley, 497-534.
- Daw ND, Niv Y, Dayan P: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat Neurosci. 2005, 8 (12): 1704-1711. [http://dx.doi.org/10.1038/nn1560]PubMed
- Sutton R: Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the Seventh International Conference on Machine Learning, Volume 216. 1990, , 224-224.
- Dayan P, Kakade S, Montague PR: Learning and selective attention. Nat Neurosci. 2000, 3 Suppl: 1218-1223. [http://dx.doi.org/10.1038/81504]PubMed
- Behrens TEJ, Woolrich MW, Walton ME, Rushworth MFS: Learning the value of information in an uncertain world. Nat Neurosci. 2007, 10 (9): 1214-1221. [http://dx.doi.org/10.1038/nn1954]PubMed
- Baxter MG, Chiba AA: Cognitive functions of the basal forebrain. Curr Opin Neurobiol. 1999, 9 (2): 178-183.PubMed
- Yu AJ, Dayan P: Uncertainty, neuromodulation, and attention. Neuron. 2005, 46 (4): 681-692. [http://dx.doi.org/10.1016/j.neuron.2005.04.026]PubMed
- Hasselmo: Neuromodulation: acetylcholine and memory consolidation. Trends Cogn Sci. 1999, 3 (9): 351-359.PubMed
- Aston-Jones G, Cohen JD: Adaptive gain and the role of the locus coeruleus-norepinephrine system in optimal performance. J Comp Neurol. 2005, 493: 99-110. [http://dx.doi.org/10.1002/cne.20723]PubMed
- Bylsma LM, Taylor-Clift A, Rottenberg J: Emotional reactivity to daily events in major and minor depression. J Abnorm Psychol. 2011, 120: 155-167. [http://dx.doi.org/10.1037/a0021662]PubMed
- Haeffel GJ, Abramson LY, Brazy PC, Shah JY, Teachman BA, Nosek BA: Explicit and implicit cognition: a preliminary test of a dual-process theory of cognitive vulnerability to depression. Behav Res Ther. 2007, 45 (6): 1155-1167. [http://dx.doi.org/10.1016/j.brat.2006.09.003]PubMed
- Eshel N, Roiser JP: Reward and punishment processing in depression. Biol Psychiatry. 2010, 68 (2): 118-124. [http://dx.doi.org/10.1016/j.biopsych.2010.01.027]PubMed
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.