Practice effects and test-retest reliability of the prospect theory model derived from the gambling task. Boxplots show point estimates of the prospect theory model parameters in sessions 1 and 2, fit under separate priors (a). Scatter plots of the prospect theory model parameters across sessions 1 and 2 (b). SEM: standard error of the mean. * p < 0.05.

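For concreteness, a prospect theory model of the kind named in the caption can be sketched as follows. This is a generic illustration, not the paper's exact specification: the parameter names (alpha for risk attitude, lam for loss aversion, beta for inverse temperature) and the 50/50 gamble structure are assumptions.

```python
import numpy as np

def pt_value(x, alpha, lam):
    """Prospect-theoretic subjective value: power utility for gains,
    loss-averse power utility for losses."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.where(x >= 0, 1.0, lam) * np.abs(x) ** alpha

def p_gamble(gain, loss, sure, alpha, lam, beta):
    """Softmax probability of choosing a 50/50 gain/loss gamble over a
    sure amount."""
    ev_gamble = 0.5 * pt_value(gain, alpha, lam) + 0.5 * pt_value(loss, alpha, lam)
    ev_sure = pt_value(sure, alpha, lam)
    return 1.0 / (1.0 + np.exp(-beta * (ev_gamble - ev_sure)))

# A loss-averse agent (lam > 1) tends to reject a symmetric 50/50 gamble
p = p_gamble(gain=10.0, loss=-10.0, sure=0.0, alpha=0.9, lam=2.0, beta=1.0)
```

Fitting such a model per participant and per session yields the parameter point estimates whose boxplots and scatter plots the figure summarises.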
Source publication
Article
Full-text available
Computational models can offer mechanistic insight into cognition and therefore have the potential to transform our understanding of psychiatric disorders and their treatment. For translational efforts to be successful, it is imperative that computational measures capture individual characteristics reliably. To date, this issue has received little...

Context in source publication

Context 1
... estimated parameters demonstrated good-to-excellent reliability (Figure 6b; Table 2), and showed excellent reliability when estimating a correlation matrix within a joint model (Table 2). ...

Citations

... Importantly, computational modelling allows the estimation of specific parameters that distil these processes and govern the behaviour of models, thereby allowing the precise measurement of cognitive processes for each individual. These parameter estimates complement, but also go beyond, traditional (model-agnostic) measures of performance, such as the proportion of offers accepted, as they provide an explanation of how such measures arise, and also often have superior psychometric properties (41,42). ...
Preprint
Full-text available
Background Motivational dysfunction is a core feature of depression, and can have debilitating effects on everyday function. However, it is unclear which disrupted cognitive processes underlie impaired motivation, and whether impairments persist following remission. Decision-making concerning exerting effort to collect rewards offers a promising framework for understanding motivation, especially when examined with computational tools which can offer precise quantification of latent processes. Methods Effort-based decision-making was assessed using the Apple Gathering Task, in which participants decide whether to exert effort via a grip-force device to obtain varying levels of reward; effort levels were individually calibrated and varied parametrically. We present a comprehensive computational analysis of decision-making, initially validating our model in healthy volunteers (N=67), before applying it in a case-control study including current (N=41) and remitted (N=46) unmedicated depressed individuals, and healthy volunteers with (N=36) and without (N=57) a family history of depression. Results Four fundamental computational mechanisms that drive patterns of effort-based decisions, which replicated across samples, were identified: an overall bias to accept effort challenges; reward sensitivity; and linear and quadratic effort sensitivity. Traditional model-agnostic analyses showed that both depressed groups showed lower willingness to exert effort. In contrast with previous findings, computational analysis revealed that this difference was driven by lower effort acceptance bias, but not altered effort or reward sensitivity. Conclusions This work provides insight into the computational mechanisms underlying motivational dysfunction in depression. Lower willingness to exert effort could represent a trait-like factor contributing to symptoms, and might represent a fruitful target for treatment and prevention.
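The four mechanisms identified in this abstract map naturally onto a logistic choice rule. The sketch below is an illustration under assumed parameter names (bias, beta_r, beta_e1, beta_e2), not the authors' implementation:

```python
import numpy as np

def p_accept(reward, effort, bias, beta_r, beta_e1, beta_e2):
    """Probability of accepting an effortful offer under a logistic rule:
    an acceptance bias, plus reward value, minus linear and quadratic
    effort costs (parameter names are illustrative)."""
    utility = bias + beta_r * reward - beta_e1 * effort - beta_e2 * effort ** 2
    return 1.0 / (1.0 + np.exp(-utility))

# Acceptance should rise with reward and fall with effort
lo = p_accept(reward=1.0, effort=0.8, bias=0.0, beta_r=2.0, beta_e1=1.0, beta_e2=2.0)
hi = p_accept(reward=3.0, effort=0.2, bias=0.0, beta_r=2.0, beta_e1=1.0, beta_e2=2.0)
```

In a model of this form, a group difference confined to the bias term shifts overall willingness to accept without changing how steeply acceptance varies with reward or effort, which is the pattern the abstract reports.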
... Reliability indicates the consistency between estimates obtained when the same measurements (data acquisition and parameter estimation procedures) are repeated for the same individual (a human participant or animal subject; Browning et al., 2020; Karvelis, Paulus, & Diaconescu, 2023; Zorowitz & Niv, 2023). The reliability of parameters in computational models has received particular attention in recent years, especially in computational psychiatry (Brown, Chen, Gillan, & Price, 2020; Haines, Sullivan-Toole, & Olino, 2023; Mkrtchian, Valton, & Roiser, 2023). Reliability is directly related to the strength of the correlation between the parameter estimates and other external variables (e.g., self-reported symptom scores or neural activity) and thus to the probability of detecting significant correlations (Haines et al., 2023; Zorowitz & Niv, 2023; see Appendix A for mathematical details). ...
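The relationship between reliability and detectable correlations alluded to here ("see Appendix A for mathematical details") is classically expressed by Spearman's attenuation formula; assuming that is the relationship intended:

```latex
% Spearman's attenuation formula: the expected observed correlation
% between two measures is the true correlation scaled by the square
% roots of their reliabilities.
r_{\mathrm{observed}} = r_{\mathrm{true}} \sqrt{\mathrm{rel}_X \,\mathrm{rel}_Y}
```

For example, if a model parameter and a symptom score each have reliability 0.5, a true correlation of 0.6 is expected to appear as only 0.3 in the data.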
... (D) Scatterplot of point estimates obtained by empirical Bayes (EB). MAP, maximum a posteriori; ML, maximum likelihood; r, Pearson's correlation coefficient; ICC(A,1), agreement intraclass correlation; ICC(C,1), consistency intraclass correlation. Recently, a method has been proposed to jointly model the population data of two sessions and to derive the parameter reliability from the covariance matrix of the group-level distribution (Brown et al., 2020; Waltmann et al., 2022; Sullivan-Toole, Haines, Dale, & Olino, 2022; Mkrtchian et al., 2023). The reliability obtained in this way tends to be greater than that assessed by classical methods, suggesting that this joint modeling approach enhances reliability. ...
... Then, hierarchical modeling is applied to the estimates to obtain the intersession correlation in a separate step. In previous studies on parameter reliability, a unified model integrating these processes has been employed (Brown et al., 2020; Waltmann et al., 2022; Mkrtchian et al., 2023; Sullivan-Toole et al., 2022; Yamamori, Robinson, & Roiser, 2023). Specifically, this approach involves joint estimation of the two sessions using a hierarchical model, which incorporates a computational model at the individual level. ...
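The ICC variants named in these excerpts can be computed directly. The following is a minimal, non-hierarchical numpy sketch: two simulated sessions of per-subject point estimates with a small practice effect, and ICC(C,1) and ICC(A,1) computed from the standard two-way ANOVA decomposition (the simulation parameters are arbitrary):

```python
import numpy as np

def icc_two_sessions(s1, s2):
    """Consistency ICC(C,1) and agreement ICC(A,1) for n subjects
    measured in k=2 sessions, via two-way ANOVA mean squares."""
    x = np.column_stack([s1, s2]).astype(float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between sessions
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    ms_r = ss_rows / (n - 1)
    ms_c = ss_cols / (k - 1)
    ms_e = ss_err / ((n - 1) * (k - 1))
    icc_c = (ms_r - ms_e) / (ms_r + (k - 1) * ms_e)
    icc_a = (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + (k / n) * (ms_c - ms_e))
    return icc_c, icc_a

rng = np.random.default_rng(0)
true_param = rng.normal(0.0, 1.0, size=200)
sess1 = true_param + rng.normal(0.0, 0.5, size=200)
sess2 = true_param + 0.3 + rng.normal(0.0, 0.5, size=200)  # +0.3 practice effect
icc_c, icc_a = icc_two_sessions(sess1, sess2)
```

With a session-wide mean shift, ICC(A,1) falls below ICC(C,1) even though subjects keep their rank order — which is why the two metrics are reported separately.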
Article
Full-text available
Computational modeling of behavior is increasingly being adopted as a standard methodology in psychology, cognitive neuroscience, and computational psychiatry. This approach involves estimating parameters in a computational (or cognitive) model that represents the computational processes of the underlying behavior. In this approach, the reliability of the parameter estimates is an important issue. The use of hierarchical (Bayesian) approaches, which place a prior on each model parameter of the individual participants, is thought to improve the reliability of the parameters. However, the characteristics of reliability in parameter estimates, especially when individual-level priors are assumed, as in hierarchical models, have not yet been fully discussed. Furthermore, the suitability of different reliability measures for assessing parameter reliability is not thoroughly understood. In this study, we conduct a systematic examination of these issues through theoretical analysis and numerical simulations, focusing specifically on reinforcement learning models. We note that the heterogeneity in the estimation precision of individual parameters, particularly with priors, can skew reliability measures toward individuals with higher precision. We further note that there are two factors that reduce reliability, namely estimation error and intersession variation in the true parameters, and we discuss how to evaluate these factors separately. Based on the considerations of this study, we present several recommendations and cautions for assessing the reliability of the model parameters.
... For all models, we estimated free parameters by likelihood maximisation and Laplace approximation of model evidence, to calculate the integrated Bayesian Information Criterion (BIC) and the exceedance probability, respectively (this can now be found in Supplementary Fig. 1A–1D); all further analyses were conducted using the outcome-related signals estimated with this model (Fig. 1d, e, h, l). Parameter recovery (36,37) was also performed on the best model and is presented in Supplementary Fig. 1E and Supplementary Table 1. ...
Article
Full-text available
Natural fluctuations in cardiac activity modulate brain activity associated with sensory stimuli, as well as perceptual decisions about low magnitude, near-threshold stimuli. However, little is known about the relationship between fluctuations in heart activity and other internal representations. Here we investigate whether the cardiac cycle relates to learning-related internal representations – absolute and signed prediction errors. We combined machine learning techniques and electroencephalography with both simple, direct indices of task performance and computational model-derived indices of learning. Our results demonstrate that just as people are more sensitive to low magnitude, near-threshold sensory stimuli in certain cardiac phases, so are they more sensitive to low magnitude absolute prediction errors in the same phases. However, this occurs even when the low magnitude prediction errors are associated with clearly suprathreshold sensory events. In addition, participants exhibiting stronger differences in their prediction error representations between cardiac phases exhibited higher learning rates and greater task accuracy.
... maximize (Keller & Weibler, 2014; Parker et al., 2007), as well as to clinical syndromes such as ADHD and gambling disorder (Addicott et al., 2021; Mkrtchian et al., 2023; Wiehler et al., 2021). However, most research focuses on individual learning strategies, neglecting that decision-makers rarely face these problems entirely on their own. ...
Preprint
Full-text available
Situations requiring a balance between gathering new information or exploiting known options (i.e., involving an exploration-exploitation trade-off) are pervasive. While navigating this trade-off, individuals frequently have the chance to observe and learn from others engaged in the same task. However, so far it is unclear when and from whom people will copy in exploration-exploitation tasks, and whether they rely on imitation of the observed agent’s choices or use the knowledge gained by observation to emulate the other players’ strategy. In two experiments, participants performed several nine-armed bandit tasks, either on their own or while seeing the choices of a fictitious agent using either an explorative or an equally successful exploitative strategy. Subject-level parameters for copying and exploration were extracted from the data using a customized model-based reinforcement learning model. We find evidence that the inclination of people to copy depends on the certainty derived from their individually acquired knowledge. In addition, cognitive modeling provided support that people rely on both types of observational learning: imitation of the observed agents’ choices, and adjusting their own exploration strategy towards the observed players' inclination to explore without necessarily making the same choices. Finally, participants copy explorative rather than exploitative agents. Contrary to our expectations, neither similarity nor dissimilarity of the observers’ and the observed agents’ exploration tendency is predictive of the inclination to copy. These results not only shed light on the impact of observational learning on exploration strategies but also on humans’ processing of social and non-social information in exploration scenarios.
... While this list is not exhaustive, it supports the view that the nature of the computational phenotype is dynamic and that some of its variability tracks meaningful changes in participant-related factors, rather than simply reflecting unreliability (cf. 24,53). This view emphasizes the structured temporal variation in the computational phenotype and suggests that it should be measured to provide insight into inter- and intra-individual cognitive variation. ...
Article
Full-text available
Computational phenotyping has emerged as a powerful tool for characterizing individual variability across a variety of cognitive domains. An individual’s computational phenotype is defined as a set of mechanistically interpretable parameters obtained from fitting computational models to behavioural data. However, the interpretation of these parameters hinges critically on their psychometric properties, which are rarely studied. To identify the sources governing the temporal variability of the computational phenotype, we carried out a 12-week longitudinal study using a battery of seven tasks that measure aspects of human learning, memory, perception and decision making. To examine the influence of state effects, each week, participants provided reports tracking their mood, habits and daily activities. We developed a dynamic computational phenotyping framework, which allowed us to tease apart the time-varying effects of practice and internal states such as affective valence and arousal. Our results show that many phenotype dimensions covary with practice and affective factors, indicating that what appears to be unreliability may reflect previously unmeasured structure. These results support a fundamentally dynamic understanding of cognitive variability within an individual.
... Although there is growing interest in the parameter reliability of computational models (Ballard & McClure, 2019; Brown et al., 2020; Browning et al., 2020; Scheibehenne & Pachur, 2015; Waltmann et al., 2022), information on the reliability of RL model parameters is still very limited. Previously reported results on reliability have varied from poor (Moutoussis et al., 2018; Pike et al., 2022; Schaaf et al., 2023) to good or excellent (Brown et al., 2020; Mkrtchian et al., 2023; Waltmann et al., 2022). Further reports are needed to obtain the whole picture of the reliability of RL model parameters. ...
... A study using a task to examine affective bias (the go/no-go task) also reported low test-retest reliability for a learning rate, although it is not a main parameter of this task (Pike et al., 2022). On the other hand, Mkrtchian et al. (2023) reported generally good reliability, including for learning rates, using a four-armed bandit task with win and loss outcomes in each trial, as was used in the current task. The difference between the results of those studies and ours may be due to the different test-retest intervals (2 weeks or 1.5 months) or trial lengths (200 or 100), as well as to task, modelling, or parameter estimation methods. ...
Article
Full-text available
Reinforcement learning models have the potential to clarify meaningful individual differences in the decision-making process. This study focused on two aspects regarding the nature of a reinforcement learning model and its parameters: the problems of model misspecification and reliability. Online participants, N = 453, completed self-report measures and a probabilistic learning task twice 1.5 months apart, and data from the task were fitted using several reinforcement learning models. To address the problem of model misspecification, we compared the models with and without the influence of choice history, or perseveration. Results showed that the lack of a perseveration term in the model led to a decrease in learning rates for win and loss outcomes, with slightly different influences depending on outcome volatility, and increases in inverse temperature. We also conducted simulations to examine the mechanism of the observed biases and revealed that failure to incorporate perseveration directly affected the estimation bias in the learning rate and indirectly affected that in inverse temperature. Furthermore, in both model fittings and model simulations, the lack of perseveration caused win-stay probability underestimation and loss-shift probability overestimation. We also assessed the parameter reliability. Test–retest reliabilities were poor (learning rates) to moderate (inverse temperature and perseveration magnitude). A learning effect was noted in the inverse temperature and perseveration magnitude parameters, showing an increment of the estimates in the second session. We discuss possible misinterpretations of results and limitations considering the estimation biases and parameter reliability.
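The perseveration mechanism examined in this study can be sketched minimally, assuming a common parameterisation in which a stickiness weight phi is added to the logit of the previously chosen action (an illustration, not the study's exact model):

```python
import numpy as np

def choice_probs(q, prev_choice, beta, phi):
    """Softmax over action values, with a perseveration bonus phi added
    to the logit of the previously chosen action."""
    logits = beta * np.asarray(q, dtype=float)  # new array; q is untouched
    if prev_choice is not None:
        logits[prev_choice] += phi
    logits -= logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

q = np.array([0.5, 0.5])
p_flat = choice_probs(q, prev_choice=None, beta=3.0, phi=1.0)
p_sticky = choice_probs(q, prev_choice=0, beta=3.0, phi=1.0)
```

With equal action values, a positive phi tilts choice toward repetition; if a fitted model omits this term, repeated choices must instead be explained through the learning rates and inverse temperature, producing the estimation biases the abstract describes.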
... Loosen et al., 2022; Waltmann et al., 2022; Hitchcock et al., 2022b). In other cases it offers only a modest improvement over summary statistics (Price et al., 2019; Mkrtchian et al., 2023; Moutoussis et al., 2018a; Chung et al., 2017), and rarely a substantial improvement (Sullivan-Toole et al., 2022; Xu and Stocco, 2021; Smith et al., 2022). Still, some studies achieved better reliability than others and it is important to consider the factors underlying that. ...
... While these results highlight the benefits of using EB for parameter estimation, the resulting reliabilities are still rather poor. Moreover, many of the studies reporting poor reliabilities are already using EB methods (Moutoussis et al., 2018a; Shahar et al., 2019; Brown et al., 2020; Mkrtchian et al., 2023). ...
... Just like for the behavioral measures, the hierarchical approach can be further extended to incorporate both sessions by assuming the parameter estimates to be drawn from a multivariate distribution (Brown et al., 2020; Sullivan-Toole et al., 2022; Waltmann et al., 2022; Mkrtchian et al., 2023). Using this method, Brown et al. (2020) and Waltmann et al. (2022) were able to further improve the reliability of parameter estimates, reaching r = 0.72–0.89 and r = 0.74–0.86, respectively. ...
Article
Full-text available
Bringing precision to the understanding and treatment of mental disorders requires instruments for studying clinically relevant individual differences. One promising approach is the development of computational assays: integrating computational models with cognitive tasks to infer latent patient-specific disease processes in brain computations. While recent years have seen many methodological advancements in computational modelling and many cross-sectional patient studies, much less attention has been paid to basic psychometric properties (reliability and construct validity) of the computational measures provided by the assays. In this review, we assess the extent of this issue by examining emerging empirical evidence. We find that many computational measures suffer from poor psychometric properties, which poses a risk of invalidating previous findings and undermining ongoing research efforts using computational assays to study individual (and even group) differences. We provide recommendations for how to address these problems and, crucially, embed them within a broader perspective on key developments that are needed for translating computational assays to clinical practice.
... Recently, a method has been proposed to jointly model the population data of two sessions and to derive the parameter reliability from the covariance matrix of the group-level distribution (Brown et al., 2020; Waltmann et al., 2022; Sullivan-Toole, Haines, Dale, & Olino, 2022; Mkrtchian et al., 2023). The reliability obtained in this way tends to be greater than that assessed by classical methods, suggesting that this joint modeling approach enhances reliability. ...
... Reliability indicates the consistency between estimates obtained when the same measurements (data acquisition and parameter estimation procedures) are repeated for the same individual (a human participant or animal subject; Browning et al., 2020; Karvelis, Paulus, & Diaconescu, 2023; Zorowitz & Niv, 2023). The reliability of parameters in computational models has received particular attention in recent years, especially in computational psychiatry (Brown, Chen, Gillan, & Price, 2020; Haines, Sullivan-Toole, & Olino, 2023; Mkrtchian, Valton, & Roiser, 2023). Reliability is directly related to the strength of the correlation between the parameter estimates and other external variables (e.g., self-reported symptom scores or neural activity) and thus to the probability of detecting significant correlations (Haines et al., 2023; Zorowitz & Niv, 2023; see Appendix A for mathematical details). ...
... In addition, because the direction of bias cannot be determined from ICC(A,1) alone (i.e., whether the parameter has become larger or smaller), it is desirable to report the mean of the estimates for each session, the coefficient of variation (e.g., Scheibehenne & Pachur, 2015), and test statistics (e.g., t-value, p-value, and confidence interval) for group differences (e.g., Mkrtchian et al., 2023; Toyama, Katahira, & Kunisato, 2023) so that the direction of bias can be assessed. ...
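The reporting recommended here (per-session means plus a test statistic giving the direction of bias) can be sketched with a paired t-test on simulated point estimates; the data and the size of the practice effect below are arbitrary:

```python
import numpy as np

def paired_t(s1, s2):
    """Paired t statistic for the session-2 minus session-1 difference,
    plus the per-session means needed to read off the direction of bias."""
    d = np.asarray(s2, dtype=float) - np.asarray(s1, dtype=float)
    t = d.mean() / (d.std(ddof=1) / np.sqrt(d.size))
    return float(t), float(np.mean(s1)), float(np.mean(s2))

rng = np.random.default_rng(1)
base = rng.normal(1.0, 0.4, size=80)                      # stable true parameter
s1 = base + rng.normal(0.0, 0.2, size=80)
s2 = base + 0.15 + rng.normal(0.0, 0.2, size=80)          # parameter drifts upward
t_stat, m1, m2 = paired_t(s1, s2)
```

Here the session means (m2 > m1) and the positive t statistic together indicate that the parameter increased at retest, information that ICC(A,1) alone cannot convey.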
Preprint
Full-text available
Computational modeling of behavior is becoming a standard methodology in psychology, cognitive neuroscience, and computational psychiatry. In this approach, the reliability of the parameter estimates is an important issue. The most commonly used metric to assess it is test-retest reliability, which quantifies the consistency of the parameter estimates obtained with multiple measurements. Reliability can be increased by improving the parameter estimation methods as well as the design of behavioral tasks. Studies using hierarchical (Bayesian) models have reported significant improvements in test-retest reliability. Hierarchical models assume prior distributions for individual parameter estimates. In this report, we point out that the test-retest reliability of parameter estimates based on prior distributions is strongly affected by heterogeneity in the sample population, possibly leading to overestimation of the reliability. When test-retest reliability is increased by using prior distributions, this may indicate the presence of heterogeneity in the population and the need for subgrouping.
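The pooling effect described in this abstract can be illustrated even without priors: when a population contains well-separated subgroups, between-subject spread dominates estimation noise, and the test-retest correlation rises although per-individual precision is unchanged. A sketch with arbitrary simulation parameters:

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_sessions(true_vals, noise_sd, rng):
    """Two sessions of point estimates around stable true parameters."""
    n = true_vals.size
    return (true_vals + rng.normal(0.0, noise_sd, n),
            true_vals + rng.normal(0.0, noise_sd, n))

# Homogeneous population: modest spread relative to estimation noise
homog = rng.normal(0.0, 0.5, 300)
h1, h2 = noisy_sessions(homog, 1.0, rng)

# Heterogeneous population: two subgroups far apart, identical noise
hetero = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(2.0, 0.5, 150)])
x1, x2 = noisy_sessions(hetero, 1.0, rng)

r_homog = np.corrcoef(h1, h2)[0, 1]
r_hetero = np.corrcoef(x1, x2)[0, 1]
```

The heterogeneous sample yields a much higher test-retest correlation despite identical measurement noise, which is the overestimation risk the report highlights for prior-based estimates in mixed populations.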
... It can help explain how changes to a low-level component of the system (e.g., the concentration of a specific neurotransmitter) produce higher-level changes that translate into an effect on behaviour (Maia and Frank, 2011), e.g., choosing between ice creams. Additionally, formalising higher cognitive processes as models allows one to describe unobservable features (such as experience and motivation) in terms of latent variables (Forstmann and Wagenmakers, 2015; Huys et al., 2016; Lewandowsky and Farrell, 2011; Mkrtchian et al., 2021; O'Reilly et al., 2012; Palmeri et al., 2017; Palminteri et al., 2017; Zhang et al., 2020). ...
... Computational modeling, and in particular reinforcement learning, provides a framework to understand how different cognitive processes -such as experience, motivation and learning -feed into decision making (Mkrtchian et al., 2021;Niv, 2009;Sutton and Barto, 1998;Wu et al., 2020). ...
Thesis
Full-text available
Should I leave or stay in academia? Many decisions we make require arbitrating between novelty and the benefits of familiar options. This is called the exploration-exploitation trade-off. Solving this trade-off is not trivial, but approximations (called ‘exploration strategies’) exist. Humans are known to rely on different exploration strategies, varying in performance and computational requirements. More complex strategies perform well, but are computationally expensive (e.g., they require computing expected values). Cheaper strategies, i.e., heuristics, require fewer cognitive resources but can lead to sub-optimal performance. The simplest heuristic strategy is to ignore prior knowledge, such as expected values, and to choose entirely randomly. In effect, this is like rolling a dice to choose between different options. Such a ‘value-free random’ exploration strategy may not always lead to optimal performance but spares cognitive resources. In this thesis, I investigate the mechanisms of exploration heuristics in human decision making. I developed a cognitive task that dissociates between different strategies for exploration. In my first study, I demonstrate that humans supplement complex strategies with exploration heuristics and, using a pharmacological manipulation, that value-free random exploration is specifically modulated by the neurotransmitter noradrenaline. Exploration heuristics are of particular interest when access to cognitive resources is limited and prior knowledge is uncertain, such as in development and in mental health disorders. In a cross-sectional developmental study, I demonstrate that value-free random exploration is used more at a younger age. Additionally, in a large-sample online study, I show that it is specifically associated with impulsivity. Together, this indicates that value-free random exploration is useful in certain contexts (e.g., childhood) but that high levels of it can be detrimental.
Overall, this thesis attempts to better understand the process of exploration in humans, and opens the way for understanding the mechanisms of arbitration between complex and simple strategies for decision making.
Article
Full-text available
Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants—even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. 
In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.