Article · Publisher preview available

Value dynamics affect choice preparation during decision-making

Authors: Zuzanna Z. Balewski, Thomas W. Elston, Eric B. Knudsen & Joni D. Wallis

Abstract and Figures

During decision-making, neurons in the orbitofrontal cortex (OFC) sequentially represent the value of each option in turn, but it is unclear how these dynamics are translated into a choice response. One brain region that may be implicated in this process is the anterior cingulate cortex (ACC), which strongly connects with OFC and contains many neurons that encode the choice response. We investigated how OFC value signals interacted with ACC neurons encoding the choice response by performing simultaneous high-channel count recordings from the two areas in nonhuman primates. ACC neurons encoding the choice response steadily increased their firing rate throughout the decision-making process, peaking shortly before the time of the choice response. Furthermore, the value dynamics in OFC affected ACC ramping—when OFC represented the more valuable option, ACC ramping accelerated. Because OFC tended to represent the more valuable option more frequently and for a longer duration, this interaction could explain how ACC selects the more valuable response.
Value decoding in OFC and ACC.
a, We trained an LDA decoder to predict the value (1–4) on forced trials from neuronal firing rates. We applied the decoder weights to compute the posterior probability for each value in individual free trials in overlapping windows of 20 ms stepped by 5 ms. States are sustained periods (≥35 ms) of confident decoding of a specific value. In both OFC and ACC, we observed multiple flips between chosen and unchosen value states on each trial.
b, We observed more chosen (red) than unchosen (blue) than unavailable (gray) value states in both OFC and ACC, as determined from a one-way analysis of variance with post hoc t-tests, *P < 0.01 and **P < 0.0001. Error bars denote the s.e. of the mean number of states observed in a single trial (n = 3,865 trials for both ACC and OFC in subject C; n = 3,247 trials for both ACC and OFC in subject G).
c, We observed longer chosen than unchosen than unavailable value states. Conventions are the same as in b. Error bars denote the s.e. of the mean duration of each observed state. In subject C’s OFC, we observed 6,107 chosen states, 3,045 unchosen states and 2,259 unavailable states. In subject G’s OFC, we observed 6,420 chosen states, 3,877 unchosen states and 3,251 unavailable states. In subject C’s ACC, we observed 6,265 chosen states, 3,037 unchosen states and 2,450 unavailable states. In subject G’s ACC, we observed 5,762 chosen states, 3,651 unchosen states and 3,228 unavailable states.
d, Choice response times were faster when the chosen value (red) was decoded more strongly and slower when the unchosen value (blue) was decoded more strongly. We built a linear regression for response time with chosen and unchosen decoding strength as predictors. Bold lines indicate significant value decoding (P < 0.01).
e, The posterior probabilities associated with value states were significantly positively correlated between OFC and ACC. We calculated the Pearson correlation coefficient for each session separately, and we then plotted the mean of these correlations. We assessed significance at each time point by performing a one-sample t-test against zero on the Fisher-transformed correlation coefficients. The t-tests were two-sided. To correct for multiple comparisons, we adjusted our significance level so that correlations had to be significantly different from zero (assessed at P < 0.01 and indicated by the thick horizontal lines) for more than 20 ms. This criterion ensured that there were no significant correlations in the 500 ms before picture onset. Inter-regional value correlations were significant and positive in both subjects shortly after the choice options became available. Error bars denote the s.e. of the mean of the correlation coefficients.
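For readers who want to follow the decoding analysis described in the legend, the minimal sketch below illustrates the sliding-window LDA approach and the state-detection rule (20 ms windows stepped by 5 ms; states as ≥35 ms runs of confident decoding of one value). It is not the authors' code: the simulated data, the 0.5 confidence threshold and all variable names are illustrative assumptions.

```python
# Minimal sketch of the sliding-window LDA value decoding and state detection
# described in the figure legend. Simulated data, the confidence threshold
# (0.5) and variable names are illustrative assumptions, not the authors' code.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

WIN_MS, STEP_MS, MIN_STATE_MS = 20, 5, 35

def window_rates(spikes_ms, trial_len_ms):
    """Bin one trial's spike times (one array per neuron) into overlapping
    20 ms windows stepped by 5 ms, returning rates in spikes/s."""
    starts = np.arange(0, trial_len_ms - WIN_MS + 1, STEP_MS)
    rates = np.zeros((len(starts), len(spikes_ms)))
    for i, t0 in enumerate(starts):
        for j, sp in enumerate(spikes_ms):
            rates[i, j] = np.sum((sp >= t0) & (sp < t0 + WIN_MS))
    return rates / (WIN_MS / 1000.0)

def decode_states(posterior, classes, conf=0.5):
    """Return (value, first_window, last_window) for sustained periods
    (>= 35 ms, i.e. >= 7 consecutive 5 ms steps) in which a single value
    is decoded with posterior probability above `conf`."""
    min_win = int(np.ceil(MIN_STATE_MS / STEP_MS))
    best = np.argmax(posterior, axis=1)
    confident = posterior[np.arange(len(best)), best] > conf
    states, i = [], 0
    while i < len(best):
        if confident[i]:
            j = i
            while j < len(best) and confident[j] and best[j] == best[i]:
                j += 1
            if j - i >= min_win:
                states.append((int(classes[best[i]]), i, j - 1))
            i = j
        else:
            i += 1
    return states

# --- simulated training data: forced-trial window rates with value labels 1-4
rng = np.random.default_rng(0)
n_neurons = 40
forced_value = rng.integers(1, 5, size=400)
forced_X = rng.poisson(0.1 + 0.2 * forced_value[:, None],
                       size=(400, n_neurons)) * (1000.0 / WIN_MS)   # spikes/s
lda = LinearDiscriminantAnalysis().fit(forced_X, forced_value)

# --- apply to one simulated free trial -------------------------------------
free_spikes_ms = [np.sort(rng.uniform(0, 1500, rng.poisson(20)))
                  for _ in range(n_neurons)]
posterior = lda.predict_proba(window_rates(free_spikes_ms, trial_len_ms=1500))
print(decode_states(posterior, lda.classes_))
```

The state labels obtained this way could then be counted and timed per trial (panels b and c) or compared between simultaneously recorded OFC and ACC populations (panel e).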
… 
Nature Neuroscience | Voume 26 | September 2023 | 1575–1583 1575
nature neuroscience
https://doi.org/10.1038/s41593-023-01407-3
Article
Value dynamics affect choice preparation during decision-making
Zuzanna Z. Balewski1,3, Thomas W. Elston1,3, Eric B. Knudsen1 & Joni D. Wallis1,2
A wealth of evidence demonstrates the necessity of the orbitofrontal cortex (OFC) for value-based decision-making. Patients with OFC damage show specific deficits in value-based decision-making1, while electrical microstimulation of OFC in humans2 and monkeys3,4 selectively impairs value-based decision-making. Despite this, there is less evidence that OFC is involved in selecting the correct response to realize the decision. OFC only weakly connects with motor areas5 and its neurons only weakly encode the choice response6–8. At the population level, although OFC alternately represents the value of each available option9,10, it does not appear to represent a specific option at the time that the choice response occurs.
One area that could have an important role in translating value-based decisions into actions is the anterior cingulate cortex (ACC). Like OFC, ACC neurons strongly encode the value of anticipated outcomes, but unlike OFC, they also often encode the choice response7,11–15. In addition, ACC strongly connects with both OFC5,16 and motor areas in the medial frontal cortex, such as the cingulate motor area17,18. Stimulation of ACC in humans evokes movements19 and an urgency to act20. ACC seems to be particularly important when the cost of action must be factored into the decision. Lesions of ACC impair effort-based decisions21,22 and neuronal tuning in ACC reflects the value of anticipated outcomes, discounted by the effort necessary to obtain them23.
We, therefore, aimed to determine whether the dynamics of OFC value signals influenced neurons in ACC that encoded the choice response. One clue as to how this might occur is that more valuable options tend to be represented more frequently and for a longer duration9,10. Consequently, a downstream area that integrated the OFC value dynamics would be able to select the more valuable option. ACC neurons that encoded the choice response tended to increase their firing rate throughout the decision, peaking shortly before the choice response. To determine whether this ramping was affected by the value dynamics in OFC, we carried out high-channel count recordings simultaneously from OFC and ACC while monkeys performed a value-based decision-making task.
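To make the logic of this prediction concrete, the toy simulation below (an illustrative assumption, not a model from the paper) shows that a downstream integrator receiving an OFC-like sequence of value states, in which the better option is represented more often and for longer, ramps up faster for, and therefore selects, the more valuable response.

```python
# Toy illustration (not the authors' model): a downstream unit that integrates
# an OFC-like sequence of alternating value states selects the better option
# when that option is represented more often and for longer. All parameters
# are arbitrary assumptions chosen only to make the point.
import numpy as np

rng = np.random.default_rng(1)
n_steps = 600                      # e.g. 3 s of 5 ms steps
p_better, mean_len = 0.65, 40      # better option represented more often...
len_scale = (1.3, 0.8)             # ...and for longer (assumptions)

# generate the OFC-like state sequence (0 = better option, 1 = worse option)
seq = []
while len(seq) < n_steps:
    opt = 0 if rng.random() < p_better else 1
    dur = max(1, int(rng.geometric(1.0 / mean_len) * len_scale[opt]))
    seq.extend([opt] * dur)
seq = np.array(seq[:n_steps])

# two leaky integrators ("ramps"), one per response; each accumulates while
# its option is the one currently represented
gain, leak = 0.05, 0.01
ramps = np.zeros((n_steps, 2))
for t in range(1, n_steps):
    drive = np.array([seq[t] == 0, seq[t] == 1], dtype=float)
    ramps[t] = (1 - leak) * ramps[t - 1] + gain * drive

print("selected response:", int(np.argmax(ramps[-1])))   # almost always 0
```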
Results
We taught two monkeys (subjects C and G) to use a bidirectional lever to select either one (forced choice trials) or between two (free choice trials) available pictures for the corresponding juice outcome
Received: 20 June 2022
Accepted: 17 July 2023
Published online: 10 August 2023
1Helen Wills Neuroscience Institute, University of California at Berkeley, Berkeley, CA, USA. 2Department of Psychology, University of California at Berkeley,
Berkeley, CA, USA. 3These authors contributed equally: Zuzanna Z. Balewski, Thomas W. Elston. e-mail: wallis@berkeley.edu
... Recent studies suggest that the orbitofrontal cortex (OFC) also encodes a cognitive map (6–10). However, single neuron recordings in monkeys show that OFC typically encodes little information about sensorimotor contingencies (11–15) and it may be that OFC is primarily involved in utilizing rather than constructing cognitive maps. This contrasts with HPC where neurons encode sensorimotor contingencies, in addition to spatial and temporal contexts, precisely the kind of information that is essential for building an internal model of the task at hand (16–18). ...
... Phase-modulation of HPC and OFC single neurons was assessed by examining the distribution of a given neuron's spiking with respect to the phase angle of the simultaneously recorded LFP. Instantaneous LFP phase angles were determined by bandpass filtering the LFPs to a frequency band of interest (theta (4–8 Hz), alpha (9–12 Hz), beta (13–30 Hz) and gamma (30–60 Hz)) and then extracting phase information via the Hilbert transform. We used Rayleigh's test of circular uniformity to assess the extent to which spikes clustered at a specific phase angle as compared to the same number of spikes being uniformly distributed across all phase angles. ...
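As an aid to readers, here is a minimal, self-contained sketch of the spike-LFP analysis described in this excerpt (bandpass filter, Hilbert phase, Rayleigh test). The filter order, sampling rate and toy data are assumptions, and the Rayleigh p-value uses the standard large-sample approximation rather than any particular toolbox.

```python
# Sketch of spike-LFP phase modulation: bandpass the LFP, take instantaneous
# phase via the Hilbert transform, then apply Rayleigh's test for
# non-uniformity of spike phases. Filter order and sampling rate are
# assumptions; the Rayleigh p-value uses a standard approximation.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_phase(lfp, fs, lo, hi, order=3):
    b, a = butter(order, [lo, hi], btype="bandpass", fs=fs)
    return np.angle(hilbert(filtfilt(b, a, lfp)))

def rayleigh_test(phases):
    """Rayleigh test of circular uniformity (Zar-style approximation)."""
    n = len(phases)
    R = np.abs(np.mean(np.exp(1j * phases)))     # mean resultant length
    z = n * R**2
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n**2 - (n * R)**2)) - (1 + 2 * n))
    return z, p

# --- toy example: spikes weakly locked to theta (4-8 Hz) phase -------------
fs = 1000.0
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)
lfp = np.sin(2 * np.pi * 6 * t) + 0.5 * rng.standard_normal(t.size)
theta_phase = band_phase(lfp, fs, 4, 8)

spike_idx = np.flatnonzero(rng.random(t.size) < 0.01 * (1 + np.cos(theta_phase)))
z, p = rayleigh_test(theta_phase[spike_idx])
print(f"Rayleigh z = {z:.2f}, p = {p:.3g}")
```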
... B) Cross-trial phase alignment at theta (4–8 Hz), alpha (9–12 Hz), beta (13–30 Hz) and gamma (30–60 Hz). ...
Preprint
Full-text available
What is good in one scenario might be bad in another. Despite the ubiquity of such contextual reasoning in everyday choice, how the brain flexibly utilizes different valuation schemes across contexts remains unknown. We addressed this question by monitoring neural activity from the hippocampus (HPC) and orbitofrontal cortex (OFC) of two monkeys performing a state-dependent choice task. We found that HPC neurons encoded state information as it became available and then, at the time of choice, relayed this information to OFC via theta synchronization. During choice, OFC represented value in a state-dependent manner: many OFC neurons uniquely coded for value in only one state but not the other. This suggests a functional dissociation whereby HPC encodes contextual information that is broadcast to OFC via theta synchronization to select a state-appropriate value subcircuit, thus allowing for contextual reasoning in value-based choice.
... Prior literature in nonhuman primates suggests that OFC neural firing represents the value of choices during decision making and may provide this value information to ACC to guide decision-making and selection of response.75 Interestingly, we also identified an ACC gamma power increase predictive of a subsequent decision to take a risk that occurred immediately after the OFC risk-related signal. Though we did not investigate associations between these two signals, this similarity in timing and frequency may be of interest for future investigation. ...
Preprint
Risk taking behavior is a symptom of multiple neuropsychiatric disorders and often lacks effective treatments. Reward circuitry regions including the amygdala, orbitofrontal cortex, insula, and anterior cingulate have been implicated in risk-taking by neuroimaging studies. Electrophysiological activity associated with risk taking in these regions is not well understood in humans. Further characterizing the neural signalling that underlies risk-taking may provide therapeutic insight into disorders associated with risk-taking. Eleven patients with pharmacoresistant epilepsy who underwent stereotactic electroencephalography with electrodes in the amygdala, orbitofrontal cortex, insula, and/or anterior cingulate participated. Patients participated in a gambling task where they wagered on a visible playing card being higher than a hidden card, betting $5 or $20 on this outcome, while local field potentials were recorded from implanted electrodes. We used cluster-based permutation testing to identify reward prediction error signals, defined as signals associated with unexpected reward, by comparing oscillatory power following unexpected and expected rewards. We also used cluster-based permutation testing to compare power preceding high and low bets in high-risk (<50% chance of winning) trials and two-way ANOVA with bet and risk level to identify signals associated with risky, risk averse, and optimized decisions. We used linear mixed effects models to evaluate the relationship between reward prediction error and risky decision signals across trials, and a linear regression model for associations between risky decision signal power and Barratt Impulsiveness Scale scores for each patient. Reward prediction error signals were identified in the amygdala (p=0.0066), anterior cingulate (p=0.0092), and orbitofrontal cortex (p=6.0E-4, p=4.0E-4). Risky decisions were predicted by increased oscillatory power in high-gamma frequency range during card presentation in the orbitofrontal cortex (p=0.0022), and by increased power following bet cue presentation across the theta-to-beta range in the orbitofrontal cortex (p=0.0022), high-gamma in the anterior cingulate (p=0.0004), and high-gamma in the insula (p=0.0014). Risk averse decisions were predicted by decreased orbitofrontal cortex gamma power (p=2.0E-4). Optimized decisions that maximized earnings were preceded by decreases across the theta-to-beta range in orbitofrontal cortex (p=2.0E-4), all frequencies in amygdala (p=2.0E-4), and theta-to-beta range, and low-gamma in insula (p=4.0E-4). Insula risky decision power was associated with orbitofrontal cortex high-gamma reward prediction error signal (p=0.0048) and with patient impulsivity (p=0.00478). Our findings identify and help characterize reward circuitry activity predictive of risk-taking in humans. These findings may serve as potential biomarkers to inform the development of novel treatment strategies such as closed loop neuromodulation for disorders of risk taking.
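Since cluster-based permutation testing is central to the analyses summarized above, a simplified one-dimensional version is sketched below. It compares power time courses between two trial types by summing supra-threshold t-values into clusters and building a null distribution of maximum cluster mass from label permutations; it is a generic illustration, not the authors' exact pipeline.

```python
# Simplified 1D cluster-based permutation test, comparing power time courses
# between two trial types (e.g. high vs. low bets). Generic illustration only.
import numpy as np
from scipy import stats

def clusters_above(tvals, thresh):
    """Return (cluster mass, slice) for runs of |t| above threshold."""
    out, start = [], None
    for i, above in enumerate(np.abs(tvals) > thresh):
        if above and start is None:
            start = i
        elif not above and start is not None:
            out.append((np.abs(tvals[start:i]).sum(), slice(start, i)))
            start = None
    if start is not None:
        out.append((np.abs(tvals[start:]).sum(), slice(start, len(tvals))))
    return out

def cluster_permutation_test(a, b, n_perm=1000, alpha=0.05, seed=0):
    """a, b: (n_trials, n_times) power matrices for the two conditions."""
    rng = np.random.default_rng(seed)
    thresh = stats.t.ppf(1 - alpha / 2, df=len(a) + len(b) - 2)
    obs_t = stats.ttest_ind(a, b, axis=0).statistic
    obs_clusters = clusters_above(obs_t, thresh)

    data = np.vstack([a, b])
    null_max = np.zeros(n_perm)
    for k in range(n_perm):
        idx = rng.permutation(len(data))
        perm_t = stats.ttest_ind(data[idx[:len(a)]], data[idx[len(a):]],
                                 axis=0).statistic
        masses = [m for m, _ in clusters_above(perm_t, thresh)]
        null_max[k] = max(masses) if masses else 0.0
    return [(sl, (null_max >= m).mean()) for m, sl in obs_clusters]

# toy usage: condition a has extra power in samples 40-60
rng = np.random.default_rng(4)
a = rng.standard_normal((30, 100)); a[:, 40:60] += 1.0
b = rng.standard_normal((30, 100))
for sl, p in cluster_permutation_test(a, b):
    print(f"cluster {sl.start}-{sl.stop}: p = {p:.3f}")
```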
... In line with the richness of these computations, a wide range of areas across the frontal cortical-basal ganglia circuit have been implicated in shaping such cost-benefit decisions [1–9]. Single units across this network code decision-relevant variables such as reward probability and effort cost [10–20]. In primates, individual neurons in frontal cortices can integrate reward- and effort-related information [21]. ...
Preprint
Full-text available
Adaptive value-guided decision-making requires weighing up the costs and benefits of pursuing an available opportunity. Though neurons across frontal cortical-basal ganglia circuits have been repeatedly shown to represent decision-related parameters, it is unclear whether and how this information is co-ordinated. To address this question, we performed large-scale single unit recordings across medial/orbital frontal cortex and basal ganglia as rats decided whether varying reward payoffs outweighed the associated physical effort costs. We found that single neurons across the circuit could encode different combinations of the canonical decision variables reward, effort and choice. Co-active cell assemblies - ensembles of neurons that repeatedly co-activated within short time windows (<25 ms) within and across structures - were able to provide representations of the same decision variables through the synchronisation of individual neurons with different coding properties. Together, these findings demonstrate a hierarchical encoding structure for cost-benefit computations, where individual neurons with diverse encoding properties are coordinated into larger, lower-dimensional spaces within and across brain regions that can encode decision parameters on the millisecond timescale.
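One common way to detect co-active assemblies of the kind described above is to bin and z-score spike counts (here at the 25 ms scale mentioned in the abstract) and keep principal components whose eigenvalues exceed the Marchenko-Pastur bound for a random matrix. The sketch below uses that generic approach; the authors' exact assembly-detection method may differ.

```python
# Generic PCA/Marchenko-Pastur sketch for detecting co-active cell assemblies
# in binned, z-scored spike counts. Bin width matches the <25 ms co-activation
# window mentioned above; everything else is an illustrative assumption.
import numpy as np

def detect_assemblies(spike_times, t_max, bin_ms=25):
    """spike_times: list of arrays of spike times (s), one per neuron.
    Returns the number of significant assemblies and their PCA patterns."""
    edges = np.arange(0, t_max + bin_ms / 1000, bin_ms / 1000)
    counts = np.array([np.histogram(st, edges)[0] for st in spike_times], float)
    z = (counts - counts.mean(1, keepdims=True)) / (counts.std(1, keepdims=True) + 1e-12)

    n_neurons, n_bins = z.shape
    corr = np.cov(z)                              # ~correlation of z-scored counts
    eigvals, eigvecs = np.linalg.eigh(corr)

    # Marchenko-Pastur upper bound for eigenvalues of a random correlation matrix
    q = n_bins / n_neurons
    lambda_max = (1 + np.sqrt(1 / q)) ** 2
    sig = eigvals > lambda_max
    return int(sig.sum()), eigvecs[:, sig]        # assembly patterns as columns

# toy usage: 20 neurons, 5 of which share coincident bursts
rng = np.random.default_rng(5)
t_max = 300.0
spikes = [np.sort(rng.uniform(0, t_max, 600)) for _ in range(20)]
bursts = np.sort(rng.uniform(0, t_max, 150))
for i in range(5):                                # neurons 0-4 join the bursts
    jitter = rng.normal(0, 0.005, bursts.size)
    spikes[i] = np.sort(np.concatenate([spikes[i], bursts + jitter]))
n_assemblies, patterns = detect_assemblies(spikes, t_max)
print("significant assemblies:", n_assemblies)    # expect >= 1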
Preprint
Full-text available
Reinforcement learning (RL) engages a network of areas, including the orbitofrontal cortex (OFC), ventral striatum (VS), amygdala (AMY), and mediodorsal thalamus (MDt). This study examined RL mediated by gains and losses of symbolic reinforcers across this network. Monkeys learned to select options that led to gaining tokens and avoid options that led to losing tokens. Tokens were cashed out for juice rewards periodically. We found that task-relevant information was distributed across the network. However, examination of the way in which information was encoded differed, with VS showing increased responses to appetitive outcomes, OFC differentiating primary and symbolic reinforcers, and AMY responding to the salience of outcomes. In addition, analysis of network activity showed that symbolic reinforcement was calculated by temporal differentiation of accumulated tokens. This process was mediated by dynamics within the OFC-MDt-VS circuit. Thus, we provide a neurocomputational account of learning from symbolic gains and losses.
Preprint
The computation and comparison of subjective values underlying economic choices rely on the orbitofrontal cortex (OFC). In this area, distinct groups of neurons encode the value of individual options, the binary choice outcome, and the chosen value. These variables capture both the input and the output of the choice process, suggesting that the cell groups found in OFC constitute the building blocks of a decision circuit. Here we show that this neural circuit is longitudinally stable. Using two-photon calcium imaging, we recorded from mice choosing between different juice flavors. Recordings of individual cells continued for up to 20 weeks. For each cell and each pair of sessions, we compared the activity profiles using cosine similarity, and we assessed whether the cell encoded the same variable in both sessions. These analyses revealed a high degree of stability and a modest representational drift. A quantitative estimate indicated this drift would not randomize the circuit within the animal’s lifetime.
Article
Full-text available
The role of the dorsal anterior cingulate cortex (ACCd) in decision making has often been discussed but remains somewhat unclear. On the one hand, numerous studies implicated this area in decisions driven by effort or action cost. On the other hand, work on economic choices between goods (under fixed action costs) found that neurons in ACCd encoded only post-decision variables. To advance our understanding of the role played by this area in decision making, we trained monkeys to choose between different goods (juice types) offered in variable amounts and with different action costs. Importantly, the task design dissociated computation of the action cost from planning of any particular action. Neurons in ACCd encoded the chosen value and the binary choice outcome in several reference frames (chosen juice, chosen cost, chosen action). Thus, this area provided a rich representation of post-decision variables. In contrast to the OFC, neurons in ACCd did not represent pre-decision variables such as individual offer values in any reference frame. Hence, ongoing decisions are unlikely guided by ACCd. Conversely, neuronal activity in this area might inform subsequent actions.
Article
Full-text available
Significance Decision making involves the evaluation of the options, and the current popular theory puts the orbitofrontal cortex (OFC) in a central role in this process. However, by using a sophisticated task design in which the decision making requires both evidence accumulation over time and a stimulus-to-action transformation, we found that the OFC neuronal activity reflected neither. Both processes were instead represented in the dorsolateral prefrontal cortex (DLPFC), and the stimulus-to-action transformation occurred before the evidence accumulation. Our study argued for a more limited role of the OFC in value-based decision making than previously proposed and indicated how the OFC and the DLPFC may work together to compute value during decision making.
Article
Full-text available
In the eighteenth century, Daniel Bernoulli, Adam Smith and Jeremy Bentham proposed that economic choices rely on the computation and comparison of subjective values¹. This hypothesis continues to inform modern economic theory² and research in behavioural economics³, but behavioural measures are ultimately not sufficient to verify the proposal⁴. Consistent with the hypothesis, when agents make choices, neurons in the orbitofrontal cortex (OFC) encode the subjective value of offered and chosen goods⁵. Value-encoding cells integrate multiple dimensions6–9, variability in the activity of each cell group correlates with variability in choices10,11 and the population dynamics suggests the formation of a decision¹². However, it is unclear whether these neural processes are causally related to choices. More generally, the evidence linking economic choices to value signals in the brain13–15 remains correlational¹⁶. Here we show that neuronal activity in the OFC is causal to economic choices. We conducted two experiments using electrical stimulation in rhesus monkeys (Macaca mulatta). Low-current stimulation increased the subjective value of individual offers and thus predictably biased choices. Conversely, high-current stimulation disrupted both the computation and the comparison of subjective values, and thus increased choice variability. These results demonstrate a causal chain linking subjective values encoded in OFC to valuation and choice.
Article
Full-text available
Outcome-guided behavior requires knowledge about the current value of expected outcomes. Such behavior can be isolated in the reinforcer devaluation task, which assesses the ability to infer the current value of specific rewards after devaluation. Animal lesion studies demonstrate that orbitofrontal cortex (OFC) is necessary for normal behavior in this task, but a causal role for human OFC in outcome-guided behavior has not been established. Here, we used sham-controlled, non-invasive, continuous theta-burst stimulation (cTBS) to temporarily disrupt human OFC network activity by stimulating a site in the lateral prefrontal cortex that is strongly connected to OFC prior to devaluation of food odor rewards. Subjects in the sham group appropriately avoided Pavlovian cues associated with devalued food odors. However, subjects in the stimulation group persistently chose those cues, even though devaluation of food odors themselves was unaffected by cTBS. This behavioral impairment was mirrored in changes in resting-state functional magnetic resonance imaging (rs-fMRI) activity such that subjects in the stimulation group exhibited reduced OFC network connectivity after cTBS, and the magnitude of this reduction was correlated with choices after devaluation. These findings demonstrate the feasibility of indirectly targeting the human OFC with non-invasive cTBS and indicate that OFC is specifically required for inferring the value of expected outcomes.
Article
Full-text available
Humans and other animals often show a strong desire to know the uncertain rewards their future has in store, even when they cannot use this information to influence the outcome. However, it is unknown how the brain predicts opportunities to gain information and motivates this information-seeking behavior. Here we show that neurons in a network of interconnected subregions of primate anterior cingulate cortex and basal ganglia predict the moment of gaining information about uncertain rewards. Spontaneous increases in their information prediction signals are followed by gaze shifts toward objects associated with resolving uncertainty, and pharmacologically disrupting this network reduces the motivation to seek information. These findings demonstrate a cortico-basal ganglia mechanism responsible for motivating actions to resolve uncertainty by seeking knowledge about the future. Animals resolve uncertainty by seeking knowledge about the future. How the brain controls this is unclear. The authors show that a network including primate anterior cingulate cortex and basal ganglia encodes opportunities to gain information about uncertain rewards and mediates information seeking.
Article
Full-text available
Significance Where we direct our gaze can have a big impact on what we choose. However, where we choose to gaze during the decision process is not well-characterized, despite the important role it plays. In our study, monkeys performed a simple decision-making experiment where they were free to look around a computer screen showing choice options. They then indicated their economic choice with a joystick movement. When choice options appeared, monkeys rapidly gazed toward more valuable and novel stimuli—suggesting there is a system that orients gaze toward important information. However, despite the gaze preference for novel stimuli, subjects did not prefer to choose them. This suggests the mechanisms governing value-guided attentional capture and value-guided choice are dissociable.
Article
We make complex decisions using both fast judgments and slower, more deliberative reasoning. For example, during value-based decision-making, animals make rapid value-guided orienting eye movements after stimulus presentation that bias the upcoming decision. The neural mechanisms underlying these processes remain unclear. To address this, we recorded from the caudate nucleus and orbitofrontal cortex while animals made value-guided decisions. Using population-level decoding, we found a rapid, phasic signal in caudate that predicted the choice response and closely aligned with animals’ initial orienting eye movements. In contrast, the dynamics in orbitofrontal cortex were more consistent with a deliberative system serially representing the value of each available option. The phasic caudate value signal and the deliberative orbitofrontal value signal were largely independent from each other, consistent with value-guided orienting and value-guided decision-making being independent processes.
Article
Neurons in primate lateral prefrontal cortex (LPFC) play a critical role in working memory (WM) and cognitive strategies. Consistent with adaptive coding models, responses of these neurons are not fixed but flexibly adjust on the basis of cognitive demands. However, little is known about how these adjustments affect population codes. Here, we investigated ensemble coding in LPFC while monkeys implemented different strategies in a WM task. Although single neurons were less tuned when monkeys used more stereotyped strategies, task information could still be accurately decoded from neural populations. This was due to changes in population codes that distributed information among a greater number of neurons, each contributing less to the overall population. Moreover, this shift occurred for task-relevant, but not irrelevant, information. These results demonstrate that cognitive strategies that impose structure on information held in mind rearrange population codes in LPFC, such that information becomes more distributed among neurons in an ensemble.
Article
Neuronal oscillations in the frontal cortex have been hypothesized to play a role in the organization of high-level cognition. Within the orbitofrontal cortex (OFC), there is a prominent oscillation in the theta frequency (4–8 Hz) during reward-guided behavior, but it is unclear whether this oscillation has causal significance. One methodological challenge is that it is difficult to manipulate theta without affecting other neural signals, such as single-neuron firing rates. A potential solution is to use closed-loop control to record theta in real time and use this signal to control the application of electrical microstimulation to the OFC. Using this method, we show that theta oscillations in the OFC are critically important for reward-guided learning and that they are driven by theta oscillations in the hippocampus (HPC). The ability to disrupt OFC computations via spatially localized and temporally precise stimulation could lead to novel treatment strategies for neuropsychiatric disorders involving OFC dysfunction.
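To illustrate the closed-loop idea described here in its simplest form, the sketch below causally filters a streaming LFP in the theta band, tracks smoothed band power and flags stimulation when power crosses a threshold, subject to a refractory period. The sampling rate, threshold and refractory time are arbitrary assumptions, and the example only simulates the detection side, not stimulation delivery or the authors' specific controller.

```python
# Generic sketch of a closed-loop trigger: causally filter streaming LFP in
# the theta band (4-8 Hz), track band power, and flag stimulation whenever
# power crosses a threshold (with a refractory period). All parameters are
# illustrative assumptions.
import numpy as np
from scipy.signal import butter, lfilter, lfilter_zi

fs = 1000.0
b, a = butter(3, [4, 8], btype="bandpass", fs=fs)

def closed_loop_triggers(lfp, threshold=2.0, refractory_s=0.5, smooth_s=0.1):
    """Return sample indices at which stimulation would be triggered."""
    zi = lfilter_zi(b, a) * lfp[0]
    alpha = 1.0 / (smooth_s * fs)                 # exponential power smoother
    power, triggers, last = 0.0, [], -np.inf
    for i, x in enumerate(lfp):
        y, zi = lfilter(b, a, [x], zi=zi)         # causal, sample-by-sample
        power = (1 - alpha) * power + alpha * y[0] ** 2
        if power > threshold and (i - last) / fs > refractory_s:
            triggers.append(i)
            last = i
    return triggers

# toy usage: a theta burst embedded in noise should produce triggers
rng = np.random.default_rng(6)
t = np.arange(0, 5, 1 / fs)
lfp = 0.5 * rng.standard_normal(t.size)
lfp[2000:3000] += 3 * np.sin(2 * np.pi * 6 * t[2000:3000])
print("trigger samples:", closed_loop_triggers(lfp))
```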