Figure - available from: Cancer Informatics
Expected classification error rates for varying sample size and fixed number of selected proteins d = 8.

Source publication
Article
Proteomics promises to revolutionize cancer treatment and prevention by facilitating the discovery of molecular biomarkers. Progress has been impeded, however, by the small-sample, high-dimensional nature of proteomic data. We propose the application of a Bayesian approach to address this issue in classification of proteomic profiles generated by l...

Citations

... Markov-chain-Monte-Carlo (MCMC) OBC methods were introduced in [17,18] for RNA-Seq application, and are usually used in real-world settings where Gaussian models are not appropriate. Other applications include liquid chromatography-mass spectrometry data [19], selected reaction monitoring data [20], and classification based on dynamical measurements of single-gene expression [21]. The OBC has been adapted to settings in which there are missing values [22]. ...
Article
Introduction: The most basic aspect of modern engineering is the design of operators to act on physical systems in an optimal manner relative to a desired objective – for instance, designing a control policy to autonomously direct a system or designing a classifier to make decisions regarding the system. These kinds of problems appear in biomedical science, where physical models are created with the intention of using them to design tools for diagnosis, prognosis, and therapy. Methods: In the classical paradigm, our knowledge regarding the model is certain; however, in practice, especially with complex systems, our knowledge is uncertain and operators must be designed while taking this uncertainty into account. The related concepts of intrinsically Bayesian robust operators and optimal Bayesian operators treat operator design under uncertainty. An objective-based experimental design procedure is naturally related to operator design: We would like to perform an experiment that maximally reduces our uncertainty as it pertains to our objective. Results & Discussion: This paper provides a nonmathematical review of optimal Bayesian operators directed at biomedical scientists. It considers two applications important to genomics, structural intervention in gene regulatory networks and classification. Conclusion: The salient point regarding intrinsically Bayesian operators is that uncertainty is quantified relative to the scientific model, and the prior distribution is on the parameters of this model. Optimization has direct physical (biological) meaning. This is opposed to the common method of placing prior distributions on the parameters of the operator, in which case there is a scientific gap between operator design and the phenomena.
... We perform Bayesian inference of the parameters of the SRM model proposed in the work by Atashpaz-Gargari et al. [5] and build a kernel classifier, similar to the classifier for liquid chromatography-mass spectrometry (LC-MS) data proposed in the work by Banerjee and Braga-Neto [6]. As in the latter reference, our method uses a likelihood-free approach, called approximate Bayesian computation (ABC) [7][8][9], which is necessary because the SRM model of Atashpaz-Gargari et al. [5] is complex and does not have an analytical formulation of the likelihood. After calibration of the parameters, the ABC method is implemented via a Markov chain Monte Carlo (MCMC) procedure [10,11] to obtain a sample from the posterior distribution of the protein concentrations. ...
... The coefficient of variation φ has the initial value displayed in Table 1, which is the same as the one used in the work by Banerjee and Braga-Neto [6]. This value is modified based on the data, as part of the prior calibration process described in Algorithm 1. ...
... We employ the kernel-based scheme proposed in the work by Banerjee and Braga-Neto [6], which is itself based on the OBC in Dalton and Dougherty [12]. One of the issues with kernel-based classification is choosing the right value of the kernel bandwidth parameter. ...
Article
Selected reaction monitoring (SRM) has become one of the main methods for low-mass-range–targeted proteomics by mass spectrometry (MS). However, in most SRM-MS biomarker validation studies, the sample size is very small, and in particular smaller than the number of proteins measured in the experiment. Moreover, the data can be noisy due to a low number of ions detected per peptide by the instrument. In this article, those issues are addressed by a model-based Bayesian method for classification of SRM-MS data. The methodology is likelihood-free, using approximate Bayesian computation implemented via a Markov chain Monte Carlo procedure and a kernel-based Optimal Bayesian Classifier. Extensive experimental results demonstrate that the proposed method outperforms classical methods such as linear discriminant analysis and 3NN, when sample size is small, dimensionality is large, the data are noisy, or a combination of these.
Chapter
Intelligent Decision Support Systems (IDSSs) represent an interdisciplinary research domain bringing together Artificial Intelligence/Machine Learning (AI/ML), Decision Science (DS), and Information Systems (IS). IDSS refers to the use of AI/ML techniques in decision support systems. In this context, the special role of statistical learning (SL) in the process of training algorithms from data should be emphasized. The purpose of this chapter is to provide a short review of some of the state-of-the-art AI/ML algorithms, seen as intelligent tools used in medical decision-making, along with some important applications in the automated medical diagnosis of some major chronic diseases (MCDs). In addition, we aim to present an interesting approach to developing novel IDSSs inspired by the evolutionary paradigm.