Article

Statistical Spectroscopic Tools for Biomarker Discovery and Systems Medicine

Abstract

Metabolic profiling based on comparative, statistical analysis of NMR spectroscopic and mass spectrometric data from complex biological samples has contributed to increased understanding of the role of small molecules in affecting and indicating biological processes. To enable this research, the development of statistical spectroscopy has been marked by early beginnings in applying pattern recognition to Nuclear Magnetic Resonance data and the introduction of Statistical Total Correlation Spectroscopy (STOCSY) as a tool for biomarker identification in the past decade. Extensions of statistical spectroscopy now compose a family of related tools used for compound identification, data preprocessing, and metabolic pathway analysis. In this Perspective, we review the theory and current state of research in statistical spectroscopy and discuss the growing applications of these tools to medicine and systems biology. We also provide perspectives on how recent institutional initiatives are providing new platforms for the development and application of statistical spectroscopy tools and driving the development of integrated 'systems medicine' approaches in which clinical decision making is supported by statistical and computational analysis of metabolic, phenotypic, and physiological data.

... Statistical Total Correlation Spectroscopy (STOCSY) is performed by calculating the covariance and the correlation between the peaks (data points) across different samples of a dataset to highlight the highly correlated peaks with high positive covariance assuming they are from the same molecular structure. [13][14][15] Covariance stands for the combined variability between different variables and correlation stands for the linearity of that combined variability of those different variables; essentially the parameters covariation and correlation try to describe how different variables (in this case, features) behave in analogous manners. Thus, STOCSY is initiated by the selection of a peak (a driver peak) to yield covariance and correlation values between this driver peak and all other variables across the different samples. ...
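The driver-peak calculation described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the implementation used by any of the cited tools: the function name `stocsy`, the list-of-lists input layout (rows are samples, columns are spectral data points), and the toy data are all assumptions made for the example.

```python
import math

def stocsy(spectra, driver):
    """Covariance and Pearson correlation of every variable against a driver peak.

    spectra: list of samples, each a list of intensities of equal length
             (hypothetical layout: rows = samples, columns = data points).
    driver:  column index of the chosen driver peak.
    Returns (cov, corr), two lists with one value per column.
    """
    n = len(spectra)
    if n < 2:
        raise ValueError("need at least two samples")
    n_vars = len(spectra[0])
    # Mean-centre each column across samples.
    means = [sum(row[j] for row in spectra) / n for j in range(n_vars)]
    centred = [[row[j] - means[j] for j in range(n_vars)] for row in spectra]
    d = [row[driver] for row in centred]
    d_ss = sum(v * v for v in d)
    cov, corr = [], []
    for j in range(n_vars):
        col = [row[j] for row in centred]
        sxy = sum(a * b for a, b in zip(d, col))
        ss = sum(v * v for v in col)
        cov.append(sxy / (n - 1))          # sample covariance with the driver
        denom = math.sqrt(d_ss * ss)
        corr.append(sxy / denom if denom > 0 else 0.0)  # 0.0 for flat columns
    return cov, corr

# Toy data (invented): column 1 scales with the driver (column 0),
# column 2 is flat, column 3 moves in the opposite direction.
toy = [[1, 2, 5, 4],
       [2, 4, 5, 3],
       [3, 6, 5, 2],
       [4, 8, 5, 1]]
cov, corr = stocsy(toy, driver=0)
```

In practice the covariance trace is usually plotted against chemical shift and coloured by the correlation (or its square), so that peaks from the same molecule stand out with both a large covariance and a correlation close to 1.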
... STOCSY calculations have been applied to several cases in metabolomics studies to facilitate the biomarker identification stage. [13][14][15] The application of STOCSY-like calculations in different datasets, such as NMR and MS, has gained more attention over the last 5 years. Statistical Hetero-Spectroscopy (SHY) was first proposed by Crockford et al. 16 for the combination of NMR and MS data to deliver more informative results. ...
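The idea behind SHY, correlating variables from two platforms measured on the same samples, can be illustrated with a small cross-correlation matrix. This is only a sketch under stated assumptions: `pearson` and `shy_matrix` are hypothetical helper names, both tables are assumed to list the same samples in the same row order, and the published method involves preprocessing that is not shown here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def shy_matrix(nmr, ms):
    """Cross-correlation between NMR variables and MS features.

    nmr, ms: lists of samples (rows) in the same sample order.
    Returns a matrix with one row per NMR variable and one column per MS feature.
    """
    if len(nmr) != len(ms):
        raise ValueError("both tables must describe the same samples")
    nmr_cols = list(zip(*nmr))   # transpose: one tuple per NMR variable
    ms_cols = list(zip(*ms))     # transpose: one tuple per MS feature
    return [[pearson(a, b) for b in ms_cols] for a in nmr_cols]

# Toy data (invented): the first NMR variable rises with the MS feature,
# the second falls.
nmr = [[1, 4], [2, 3], [3, 2], [4, 1]]
ms = [[10], [20], [30], [40]]
r = shy_matrix(nmr, ms)
```

High absolute values in the resulting matrix point to NMR signals and MS features that vary together across samples, which is the kind of link the excerpt describes as making the combined analysis more informative.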
Article
Introduction: Data Fusion-based Discovery (DAFdiscovery) is a pipeline designed to help users combine mass spectrometry (MS), nuclear magnetic resonance (NMR), and bioactivity data in a notebook-based application to accelerate annotation and discovery of bioactive compounds. It applies Statistical Total Correlation Spectroscopy (STOCSY) and Statistical HeteroSpectroscopy (SHY) calculations to the user's data in an easy-to-follow Jupyter Notebook. Method: Different case studies are presented for benchmarking, and the resultant outputs are shown to aid natural products identification and discovery. The goal is to encourage users to acquire MS and NMR data from their samples (in replicated samples and fractions when available) and to explore their variance to highlight MS features, NMR peaks, and bioactivity that might be correlated, in order to accelerate bioactive compound discovery or to support annotation-identification studies. Results: Different applications were demonstrated using data from different research groups, and it was shown that DAFdiscovery reproduced their findings using a more straightforward method. Conclusion: DAFdiscovery has proven to be a simple-to-use method for different situations where data from different sources need to be analyzed together.
... In addition to considering which samples to investigate and which analytical technologies to use (please see supplemental information online) to solve the biological or clinical question at hand, it is equally important to consider the sampling methodology. Although there are well-documented procedures for sample preparation for both NMR- [29,30] and mass spectrometry (MS)-based metabolic phenotyping approaches [31,32], it is now well known that diet, drug-taking, and diurnal variation [33], as well as factors such as gender and age, all affect the metabolite profiles obtained, and several statistical methods have been developed to correct for such systematic confounding variation, so as to focus on the key clinical and biological questions [34]. ...
... In these circumstances there will be both dependencies and correlations between the levels, trajectories, and entropies of one metabolite relative to others. The ability to determine these dependencies and correlations represents the fourth fashion in which metabolic data can be analyzed, and has been facilitated by recent developments in statistical spectroscopy methods ( Figure 3D provides an example relating to the interdependency of metabolite levels) [34]. A good example is the close correlation between urinary levels of 4-cresylsulphate and phenylacetylglutamine owing to commonality in the microbial transformations of tyrosine and phenylalanine, respectively, in the early parts of their biosynthetic pathways [35]. ...
Article
Understanding metabotype (multicomponent metabolic characteristics) variation can help to generate new diagnostic and prognostic biomarkers, as well as models, with potential to impact on patient management. We present a suite of conceptual approaches for the generation, analysis, and understanding of metabotypes from body fluids and tissues. We describe and exemplify four fundamental approaches to the generation and utilization of metabotype data via multiparametric measurement of (i) metabolite levels, (ii) metabolic trajectories, (iii) metabolic entropies, and (iv) metabolic networks and correlations in space and time. This conceptual framework can underpin metabotyping in the scenario of personalized medicine, with the aim of improving clinical outcomes for patients, but the framework will have value and utility in areas of metabolic profiling well beyond this exemplar.
... Out of the manifold of variations to combine raw and Fourier transformed data, a variety of covariance-transformed spectral representations have been introduced and their applications have been demonstrated. Among those used in NMR spectroscopy, the most often used or described were direct covariance, indirect covariance, doubly indirect covariance, unsymmetrical indirect covariance, generalized indirect covariance (which replaced the previous one), and multidimensional covariance in the form of Triple-Rank Covariance and 4D Covariance [2,26-29]. Furthermore, the family of Statistical Total Correlation Spectroscopy (STOCSY) has been introduced, and its usefulness is demonstrated in many applications [22,30-32]. ...
... While the concept of homo- and hetero-covariance spectroscopy was developed nearly three decades ago, there are relatively few reports on the use of synchronous and asynchronous spectra involving NMR spectroscopy [3,14,55,58,59]. In contrast, an abundant number of investigations have applied so-called statistical total correlation spectroscopy (STOCSY), which has delivered important contributions to the field of metabolomics and whose variants have recently been depicted like a phylogenetic tree [22,32,60]. In the current report, however, the focus is on examples from chemical processes rather than metabolomics. ...
Article
Full-text available
Covariance processing of data and spectra has established itself among the computer-based NMR spectroscopy methodologies to increase sensitivity and resolution and to facilitate spectral analysis. While homo-correlations yield two-dimensional (2D) diagonally symmetric or antisymmetric spectra, hetero-covariance transformations allow NMR chemical shift information to be transferred to other spectroscopic techniques, such as near infra-red or Raman. This is visualized as a 2D correlation map, provided there is a common indirect or perturbation domain, such as time, concentration change, or pressure. Covariance spectra can be generated as synchronous or asynchronous maps. The synchronous map relates the signals of species, e.g., educts and products. The asynchronous spectrum allows one to derive the sequential order in which such species occur relative to each other. After a theoretical introduction into covariance NMR, its application in process analytical technology is discussed for wine fermentation, a radical polymerization reaction, a continuous process ethanol production using immobilized yeast, and a Knoevenagel condensation in a microreaction system. The covariance approach is extended toward two perturbation variables and quantitative relationships through PARAFAC kernel analysis and is illustrated for the preparation of polylactic acid nanocomposites. The advantages and added values of using synchronous and asynchronous spectra to gain process knowledge and control are demonstrated.
... However, agricultural/fishery research in industrialized countries today emphasizes food processing and value-added products rather than the alleviation of food shortages in developing countries [4,5]. Considering the remarkable diversity in food culture across the world, an understanding of natural ecosystems in a variety of environments is important [6]. Based on reports of the direct and indirect effects of agriculture, forestry, and fisheries on natural ecosystems that produce biomass resources, we proposed the application of a recent analytical paradigm to ecosystem research [7][8][9][10][11][12][13][14][15][16]. ...
... Magnetic resonance imaging (MRI) can be used to investigate these metabolites nondestructively, without extraction and/or cell lysis [90,91] (Fig. 3 ... Statistical analysis and data mining are used to extract important information from big data sets generated by NMR [92]. ... (STOCSY) uses correlation analysis to identify metabolites [93]. Recently, ML approaches such as the self-organizing map (SOM) and support vector machine (SVM) methods have drawn attention in several fields. ...
Article
A natural ecosystem can be viewed as the interconnections between complex metabolic reactions and environments. Humans, a part of these ecosystems, and their activities strongly affect the environments. To account for human effects within ecosystems, understanding what benefits humans receive by facilitating the maintenance of environmental homeostasis is important. This review describes recent applications of several NMR approaches to the evaluation of environmental homeostasis by metabolic profiling and data science. The basic NMR strategy used to evaluate homeostasis using big data collection is similar to that used in human health studies. Sophisticated metabolomic approaches (metabolic profiling) are widely reported in the literature. Further challenges include the analysis of complex macromolecular structures, and of the compositions and interactions of plant biomass, soil humic substances, and aqueous particulate organic matter. To support the study of these topics, we also discuss sample preparation techniques and solid-state NMR approaches. Because NMR approaches can produce a number of data with high reproducibility and inter-institution compatibility, further analysis of such data using machine learning approaches is often worthwhile. We also describe methods for data pretreatment in solid-state NMR and for environmental feature extraction from heterogeneously-measured spectroscopic data by machine learning approaches.
... 21−23 However, chemical characterization of molecular species associated with an outcome still is a limiting factor for exploratory metabolic profiling. NMR spectroscopy provides an atom-centered spectroscopic tool for structure elucidation that can be enhanced by statistical spectroscopic methods 24,25 or by physical hyphenation with chromatographic methods such as solid-phase extraction (SPE), 26 liquid chromatography (LC) 21 or LC-NMR-MS 21,27 to achieve a better chemical characterization of endogenous and exogenous metabolites. ...
... 23 Since the STOCSY method was published, many derivations have aimed to improve specific properties such as differentiation between structural and pathway correlations by clustering, subset selection, or stoichiometric relationships. 24 Statistical correlation can be undermined by overlapped signals unrelated to the metabolite of interest in a 1D-NMR spectrum, and 2D-NMR experiments are still required for unambiguous structure elucidation. 29 In addition, the structural information obtained using statistical algorithms is dependent on criteria such as correlation thresholds 28,30 or correlation-distance cut-offs. ...
Article
A major purpose of exploratory metabolic profiling is for the identification of molecular species that are statistically associated with specific biological or medical outcomes; unfortunately the structure elucidation process of unknowns is often a major bottleneck in this process. We present here new holistic strategies that combine different statistical spectroscopic and analytical techniques to improve and simplify the process of metabolite identification. We exemplify these strategies using study data collected as part of a dietary intervention to improve health and which elicits a relatively subtle suite of changes from complex molecular profiles. We identify three new dietary biomarkers related to the consumption of peas (N-methyl nicotinic acid), apples (rhamnitol) and onions (N-acetyl-S-(1Z)-propenyl-cysteine-sulfoxide) that can be used to enhance dietary assessment and assess adherence to diet. As part of the strategy, we introduce a new probabilistic statistical spectroscopy tool, RED-STORM (Resolution EnhanceD SubseT Optimization by Reference Matching), that uses 2D J-resolved ¹H-NMR spectra for enhanced information recovery using the Bayesian paradigm to extract a subset of spectra with similar spectral signatures to a reference. RED-STORM provided new information for subsequent experiments (e.g. 2D-NMR spectroscopy, Solid-Phase Extraction, Liquid Chromatography prefaced Mass Spectrometry) used to ultimately identify an unknown compound. In summary, we illustrate the benefit of acquiring J-resolved experiments alongside conventional 1D ¹H-NMR as part of routine metabolic profiling in large datasets and show that application of complementary statistical and analytical techniques for the identification of unknown metabolites can be used to save valuable time and resource.
... Notable new concepts for 2D NMR correlation spectroscopy studies, such as heteronuclear correlation NMR spectra [43,225], a statistical analysis of the linear trajectories of NMR chemical shifts [50], and multidimensional NMR spectroscopy [228], were introduced during this review period. Some hetero-spectral correlations with other probes, such as IR [21,145,146], mass spectroscopy [21,224], and X-ray diffraction (XRD) [223], were also reported. ...
... Applications of 2D correlation mass spectroscopy were reported [21,223,224,234–237]. Hetero-spectral correlations between mass and NMR were also reported [21,224]. ...
Conference Paper
A comprehensive survey review of new and noteworthy developments of 2D correlation spectroscopy (2DCOS) and its applications for the last two years is compiled. This review covers not only journal articles and book chapters but also books, proceedings, and review articles published on 2DCOS, numerous significant new concepts of 2DCOS, patents and publication trends. Noteworthy experimental practices in the field of 2DCOS, including types of analytical probes employed, various perturbation methods used in experiments, and pertinent examples of fundamental and practical applications, are also reviewed.
... Proper peak alignment is also a requisite for the implementation of statistical total correlation spectroscopy (STOCSY) 22 and STOCSY derivations or variations, 23,24 as these are statistical analysis tools as well. As already mentioned, p-JRES spectra can be used as an alternative to increase peak dispersion, which is critical to reducing peak overlap. ...
... Existing computational and statistical tools for metabolomic profiling can then be applied to the processed spectra to extract the biologically useful information (Hollywood et al. 2006, Lewis et al. 2009, Xia et al. 2009, Allen and Maletić-Savatić 2011, Zheng et al. 2011, Hao et al. 2012, Allen et al. 2013, Zhang et al. 2015). However, these methods, led by Statistical Total Correlation Spectroscopy (STOCSY) (Cloarec et al. 2005) and its variations (Robinette et al. 2013), have yet to accurately identify metabolites from the NMR spectra because they do not successfully solve the high dimensionality of the covariance matrix estimation and the overlapping signals. In a typical NMR spectrum, the number of chemical shift intervals (ppm values) can range from 3000 to 8000 and the number of samples can range from several to hundreds. ...
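A back-of-the-envelope calculation shows why the covariance estimate is hard at these sizes. The values p = 8000 variables and n = 100 samples are illustrative picks from within the quoted ranges, not figures from the cited paper; X_c denotes the mean-centred n × p data matrix.

```latex
% Illustrative values (assumed): p = 8000 chemical shift variables, n = 100 samples
\[
\hat{\Sigma} = \frac{1}{n-1}\, X_c^{\top} X_c \in \mathbb{R}^{p \times p},
\qquad
\frac{p(p+1)}{2} = \frac{8000 \cdot 8001}{2} = 32\,004\,000 \ \text{distinct entries},
\qquad
\operatorname{rank}(\hat{\Sigma}) \le n - 1 = 99 .
\]
```

With far more variables than samples, the sample covariance matrix is rank-deficient, so tens of millions of entries are being estimated from at most 99 independent directions of variation.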
Article
Full-text available
Motivation: Nuclear magnetic resonance spectroscopy (NMR) is widely used to analyze metabolites in biological samples, but the analysis requires specific expertise, is time-consuming, and can be inaccurate. Here, we present a powerful automated tool, SPatial clustering Algorithm-Statistical TOtal Correlation SpectroscopY (SPA-STOCSY), which overcomes challenges faced when analyzing NMR data and identifies metabolites in a sample with high accuracy. Results: As a data-driven method, SPA-STOCSY estimates all parameters from the input dataset. It first investigates the covariance pattern among datapoints and then calculates the optimal threshold with which to cluster datapoints belonging to the same structural unit, i.e. the metabolite. Generated clusters are then automatically linked to a metabolite library to identify candidates. To assess SPA-STOCSY’s efficiency and accuracy, we applied it to synthesized spectra and spectra acquired on Drosophila melanogaster tissue and human embryonic stem cells. In the synthesized spectra, SPA outperformed Statistical Recoupling of Variables (SRV), an existing method for clustering spectral peaks, by capturing a higher percentage of the signal regions and the close-to-zero noise regions. In the biological data, SPA-STOCSY performed comparably to the operator-based Chenomx analysis while avoiding operator bias, and it required <7 min of total computation time. Overall, SPA-STOCSY is a fast, accurate, and unbiased tool for untargeted analysis of metabolites in the NMR spectra. It may thus accelerate the use of NMR for scientific discoveries, medical diagnostics, and patient-specific decision making. Availability and implementation: The codes of SPA-STOCSY are available at https://github.com/LiuzLab/SPA-STOCSY.
... Current efforts solving the automatic explorations of NMR spectra include correlation-based, Bayesian-based, and linear regression-based methods. Correlation-based methods are led by Statistical Total Correlation Spectroscopy (STOCSY) and its variations [10][11][12][13][14] . Due to the high-dimensional problem in the covariance matrix estimation and overlapping signals, these methods have yet to identify metabolites within NMR spectra accurately. ...
Preprint
Full-text available
Nuclear Magnetic Resonance is a powerful platform that reveals the metabolomic profiles within biofluids or tissues and contributes to personalized treatments in medical practice. However, data volume and complexity hinder the exploration of NMR spectra. In addition, the lack of fast and accurate computational tools that can handle the automatic identification and quantification of essential metabolites from NMR spectra slows the wide application of these techniques in clinical settings. We present NMRQNet, a deep-learning-based pipeline for automatic identification and quantification of dominant metabolite candidates within human plasma samples. The estimated relative concentrations could be further applied in statistical analysis to extract potential biomarkers. We evaluate our method on multiple plasma samples, including species from mice to humans, curated using three anticoagulants, covering healthy and patient conditions in neurological disorders, greatly expanding the metabolomics analytical space in plasma. NMRQNet accurately reconstructed the original spectra and obtained significantly better quantification results than the earlier computational methods. NMRQNet also proposed relevant metabolite biomarkers that could potentially explain the risk factors associated with the condition. NMRQNet, with improved prediction performance, highlights the limitations in the existing approaches and has shown strong application potential for future metabolomics disease studies using plasma samples.
... To help surmount these challenges, several computational tools have been developed to perform standard data preprocessing and to obtain phased, baseline-corrected, chemical shift- ... Correlation Spectroscopy (STOCSY) (Cloarec et al., 2005) and its variations (Robinette et al., 2013), have yet to accurately identify metabolites from the NMR spectra because of the high-dimensional problem in the covariance matrix estimation and the overlapping signals. In a typical NMR spectrum, the number of chemical shift intervals (ppm values) can range from 3000 to 8000 and the number of samples can range from several to hundreds. ...
Preprint
Full-text available
Nuclear Magnetic Resonance (NMR) spectroscopy is widely used to analyze metabolites in biological samples, but the analysis can be cumbersome and inaccurate. Here, we present a powerful automated tool, SPA-STOCSY (Spatial Clustering Algorithm - Statistical Total Correlation Spectroscopy), which overcomes the challenges by identifying metabolites in each sample with high accuracy. As a data-driven method, SPA-STOCSY estimates all parameters from the input dataset, first investigating the covariance pattern and then calculating the optimal threshold with which to cluster data points belonging to the same structural unit, i.e. metabolite. The generated clusters are then automatically linked to a compound library to identify candidates. To assess SPA-STOCSY efficiency and accuracy, we applied it to synthesized and real NMR data obtained from Drosophila melanogaster brains and human embryonic stem cells. In the synthesized spectra, SPA outperforms Statistical Recoupling of Variables, an existing method for clustering spectral peaks, by capturing a higher percentage of the signal regions and the close-to-zero noise regions. In the real spectra, SPA-STOCSY performs comparably to operator-based Chenomx analysis but avoids operator bias and performs the analyses in less than seven minutes of total computation time. Overall, SPA-STOCSY is a fast, accurate, and unbiased tool for untargeted analysis of metabolites in the NMR spectra. As such, it might accelerate the utilization of NMR for scientific discoveries, medical diagnostics, and patient-specific decision-making.
... The robust and quantitative nature of the data makes the application of multivariate data analysis on the basis of correlation analysis particularly useful. Indeed, most statistical evaluations have focused on correlation analysis when interrogating NMR data [1]. ...
Article
Full-text available
Metabolite identification in non-targeted NMR-based metabolomics remains a challenge. While many peaks of frequently occurring metabolites are assigned, there is a high number of unknowns in high-resolution NMR spectra, hampering biological conclusions for biomarker analysis. Here, we use a cluster analysis approach to guide peak assignment via statistical correlations, which gives important information on possible structural and/or biological correlations from the NMR spectrum. Unknown peaks that cluster in close proximity to known peaks form hypotheses for their metabolite identities, thus, facilitating metabolite annotation. Subsequently, metabolite identification based on a database search, 2D NMR analysis and standard spiking is performed, whereas without a hypothesis, a full structural elucidation approach would be required. The approach allows a higher identification yield in NMR spectra, especially once pathway-related subclusters are identified.
... SRV analysis was performed on the data set according to the R packages recently published [14]. SRV is an automatic variable size bucketing method to achieve dimensionality reduction of the data set, and it acquires a set of local continuous variables with similar change patterns using a local spectral dependency measure based on a parameter of covariance/correlation (L) between consecutive variables. ...
Article
A nuclear magnetic resonance (NMR)-based metabolomics study usually involves spectral preprocessing, identification of biomarkers, and interpretation of biological processes and pathogenesis; however, the traditional procedure has inherent limitations. In this study, a new analytical framework was proposed to assist spectral alignment and dimensionality reduction, screen the differential metabolites, and obtain a biological explanation of the metabolic network by combining weighted gene co-expression network analysis (WGCNA) and recoupled statistical total correlation spectroscopy (RSTOCSY). The performance of the RSTOCSY-based WGCNA method was evaluated on an NMR dataset of serum from patients with coronary heart disease with diabetes mellitus (CHDDM). The statistical recoupling of variables (SRV) was successfully used to categorize the whole dataset into a number of superclusters of signals and to support spectral alignment, and its effectiveness was confirmed on a wine dataset with a larger spectral drift. Three phenotype-driven metabolite modules related to CHDDM were identified from the dataset by WGCNA, 22 metabolites were further identified from the three modules according to the metabolic correlations within or between modules, and 40 significant intra- and inter-metabolite correlations were observed in the 2D pseudospectrum. These modules involve amino acid metabolism, microbial metabolism, and glucose metabolism, and analysis of metabolite network diffusion revealed that the ferroptosis pathway is related to CHDDM. This RSTOCSY-based WGCNA approach provides an effective analysis workflow for information recovery and structure identification of metabolites and for improving the interpretability and understanding of disease pathogenesis.
... In general, components with similar self-diffusion coefficients or concentration could not be resolved by DOSY or CORDY, respectively. ... (4,7,12,14), acesulfame-K (6,18), arginine (9,15,19,20), ethanol (10,21), citrate (16,17), propylene glycol (8,11,13,22), respectively. Table S1. ...
... A low correlation indicates that the ratios of the peaks are different across different spectra and are likely to belong to different molecules 36 . This type of analysis, which is commonly referred to as statistical spectroscopy, allows recovery of structural and pathway information from analysis of sequential or parallel spectroscopic measurements on multiple samples 37 . The concept that correlative structure in spectra could be exploited to extract chemical information was first applied to Raman and infrared spectra 36 . ...
Article
Metabolic profiling of biological samples provides important insights into multiple physiological and pathological processes but is hindered by a lack of automated annotation and standardized methods for structure elucidation of candidate disease biomarkers. Here we describe a system for identifying molecular species derived from nuclear magnetic resonance (NMR) spectroscopy-based metabolic phenotyping studies, with detailed information on sample preparation, data acquisition and data modeling. We provide eight different modular workflows to be followed in a recommended sequential order according to their level of difficulty. This multi-platform system involves the use of statistical spectroscopic tools such as Statistical Total Correlation Spectroscopy (STOCSY), Subset Optimization by Reference Matching (STORM) and Resolution-Enhanced (RED)-STORM to identify other signals in the NMR spectra relating to the same molecule. It also uses two-dimensional NMR spectroscopic analysis, separation and pre-concentration techniques, multiple hyphenated analytical platforms and data extraction from existing databases. The complete system, using all eight workflows, would take up to a month, as it includes multi-dimensional NMR experiments that require prolonged experiment times. However, easier identification cases using fewer steps would take 2 or 3 days. This approach to biomarker discovery is efficient and cost-effective and offers increased chemical space coverage of the metabolome, resulting in faster and more accurate assignment of NMR-generated biomarkers arising from metabolic phenotyping studies. It requires a basic understanding of MATLAB to use the statistical spectroscopic tools and analytical skills to perform solid phase extraction (SPE), liquid chromatography (LC) fraction collection, LC-NMR-mass spectroscopy and one-dimensional and two-dimensional NMR experiments.
... Moreover, Cloarec et al. showed that PLS coefficients combined with 1D STOCSY enhance biomarker identification 26,34 . Using this approach, the plot of α-glucose Pure Shift STOCSY on the first PLS orthogonal projection (Fig. 6A, bottom) allowed us to clearly define the strong correlation among α-glucose (driver peak), β-glucose, α-fructose, and β-fructose, and their anti-correlation with sucrose. ...
Article
Full-text available
Even though Pure Shift NMR methods have conveniently been used in the assessment of crowded spectra, they are not commonly applied to the analysis of metabolomics data. This paper exploits the recently published SAPPHIRE-PSYCHE methodology in the context of the plant metabolome. We compare single pulse, PSYCHE, and SAPPHIRE-PSYCHE spectra obtained from aqueous extracts of Physalis peruviana fruits. STOCSY analysis with simplified SAPPHIRE-PSYCHE spectra of six types of Cape gooseberry was carried out and the results were compared with classical STOCSY data. PLS coefficients analysis combined with 1D-STOCSY was performed in an effort to simplify biomarker identification. Several of the most compromised proton NMR signals associated with critical constituents of the plant mixture, such as amino acids, organic acids, and sugars, were more cleanly depicted and their inter- and intra-correlations better revealed by the Pure Shift methods. The simplified data allowed the identification of glutamic acid, a metabolite not observed in previous studies of Cape gooseberry due to heavy overlap of its NMR signals. Overall, the results indicated that Ultra-Clean Pure Shift spectra improve the performance of metabolomics data analyses such as STOCSY and multivariate coefficients analysis, and therefore represent a feasible and convenient additional tool available to metabolomics.
... The list of peaks showing a high correlation with the driver peak, including the driver peak itself, can be used for a query at any of the several databases available, like the Human Metabolome Database (HMDB), the Biological Magnetic Resonance Bank (BMRB), and so on. 4,5 Several adaptations were introduced to STOCSY to improve its performance, to combine with other nuclei or analytical platforms, and to extract biological information from perturbed metabolic pathways 6 as well as alternatives (like ratio analysis NMR spectroscopy 7 ). However, STOCSY presents limitations where its performance is diminished, mainly in cases of peak misalignment, weak peaks, and especially with peak overlap. ...
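The step of turning a STOCSY trace into a peak list for a database query can be sketched as a simple threshold filter. The function name `correlated_peak_list`, the 0.8 cutoff, and the toy chemical shifts are illustrative assumptions; no particular threshold is prescribed by the excerpt, and the query itself (HMDB, BMRB) is not shown.

```python
def correlated_peak_list(ppm, corr, threshold=0.8):
    """Chemical shifts whose correlation with the driver peak meets the threshold.

    ppm:       chemical shift of each picked peak.
    corr:      correlation of each peak with the driver peak (same order as ppm).
    threshold: hypothetical cutoff; the driver itself has r = 1 and is always kept.
    """
    if len(ppm) != len(corr):
        raise ValueError("ppm and corr must have the same length")
    return [p for p, r in zip(ppm, corr) if r >= threshold]

# Toy values (invented): the driver at 5.23 ppm and one strongly correlated peak
# pass the cutoff; a weakly correlated and an anti-correlated peak do not.
peaks = correlated_peak_list([5.23, 4.64, 3.05, 1.33], [1.0, 0.92, 0.10, -0.85])
```

Because the result depends directly on the chosen cutoff, it is worth repeating the query at a few thresholds before trusting a candidate assignment, especially where peak overlap or misalignment is suspected.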
Article
The identification of metabolites in complex biological matrices is a challenging task in 1D ¹H NMR based metabolomic studies. Statistical TOtal Correlation Spectroscopy (STOCSY) has emerged for aiding the structural elucidation by revealing the peaks that present high correlation to a driver peak of interest (which would likely belong to the same molecule). However, in these studies the signals from metabolites are normally present as a mixture of overlapping resonances, limiting the performance of STOCSY. 2D ¹H homonuclear J-resolved spectra (JRES), in their usual tilted and symmetrized processed form, were projected and STOCSY was applied on these 1D projections (p-JRES-STOCSY) as an alternative to avoid the overlap issue, but this approach suffers in cases where the signals are very close. In addition, STOCSY was applied to JRES spectra (also tilted) to identify correlated multiplets, although the overlap issue in itself was not addressed directly and the subsequent search in databases is complicated in cases of higher order coupling. With these limitations in mind, in the present work we propose a new methodology based on the application of STOCSY on a set of nontilted JRES spectra, detecting peaks that would overlap in 1D spectra of the same sample set. COrrelation COmparison Analysis for Peak Overlap Detection (COCOA-POD) is able to reconstruct projected 1D STOCSY traces that result in more suitable database queries, as all peaks are summed at their f2 resonances instead of the resonance corresponding to the multiplet center in the tilted JRES (the peak dispersion and resolution enhancement gained are not sacrificed by the projection). Besides improving database queries with better peak lists obtained from the projections of the 2D STOCSY analysis, the overlap region is examined and the multiplet itself is analyzed from the correlation trace at 45° to obtain a cleaner multiplet profile, free from contributions from uncorrelated neighboring peaks.
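The driver-peak idea behind a 1D STOCSY trace can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the implementation used in any of the works listed here: the function names `stocsy_trace` and `correlated_points`, the 0.9 threshold, and the toy data are all hypothetical, and the spectra are assumed to be already aligned and stored as a samples-by-variables matrix.

```python
import numpy as np


def stocsy_trace(spectra, driver_index):
    """Correlation and covariance of every spectral variable with a driver variable.

    spectra: 2D array, rows = samples, columns = spectral data points
             (assumed aligned; this layout is an assumption of the sketch).
    """
    X = np.asarray(spectra, dtype=float)
    centered = X - X.mean(axis=0)          # mean-center each variable
    driver = centered[:, driver_index]
    n_samples = X.shape[0]

    # Sample covariance between the driver and each variable.
    covariance = centered.T @ driver / (n_samples - 1)

    # Pearson correlation; variables with zero variance get correlation 0.
    std = centered.std(axis=0, ddof=1)
    denom = std * std[driver_index]
    correlation = np.divide(
        covariance, denom, out=np.zeros_like(covariance), where=denom > 0
    )
    return correlation, covariance


def correlated_points(correlation, threshold=0.9):
    """Indices of variables whose correlation with the driver meets the threshold."""
    return np.flatnonzero(correlation >= threshold)


if __name__ == "__main__":
    # Toy data: columns 0, 2, 3 scale with compound A; column 1 follows compound B.
    a = np.array([1, 2, 3, 4, 5, 6], dtype=float)
    b = np.array([3, 1, 4, 1, 5, 2], dtype=float)
    X = np.column_stack([2 * a, b, 0.5 * a, a])
    corr, cov = stocsy_trace(X, driver_index=0)
    print(correlated_points(corr))
```

In this toy case the points selected for the driver at column 0 are the three columns that scale with the same concentration vector. When a variable carries signal from two compounds, its correlation with each driver drops, which is the overlap limitation the abstract above describes.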
... Hence, structure elucidation for metabolic profiling has been demonstrated to indirectly deepen knowledge of metabolic pathways, which may aid in development of future diagnostic and therapeutic techniques. Within metabolic profiling, ¹H Nuclear Magnetic Resonance (NMR) spectroscopy has become an extremely valuable tool for the characterisation of complex mixtures 2; statistical spectroscopy tools have also been applied to extract information from complex spectral sets 3. Posma et al. 4 ...
Article
Metabolite identification and annotation procedures are necessary for the discovery of biomarkers indicative of phenotypes or disease states, but these processes can be bottlenecked by the sheer complexity of biofluids containing thousands of different compounds. Here we describe low-cost novel SPE-NMR protocols utilising different cartridges and conditions, on both natural and artificial urine mixtures, which produce unique retention profiles useful to metabolic profiling. We find that different SPE methods applied to biofluids such as urine can be used to selectively retain metabolites based on compound taxonomy or other key functional groups, reducing peak overlap through concentration and fractionation of unknowns and hence promising greater control over the metabolite annotation/identification process.
... 1,2 Clinicians have already used biological measurements, such as blood type 1,2 and blood pressure, 1 in clinical decision making. However, big data techniques have enabled the precise measurement of biological features at an unprecedented scale, [3][4][5][6] and researchers have used numerous techniques for biomarker discovery: metabolomics, 7 proteomics, 8 genomics, 9,10 epigenetics, 11 and lipidomics. 12 Coupled with the acknowledgement that the same disease can vary greatly across patients 13,14 and the ability to measure features at the patient level, 5 the definition of suitable biomarkers has evolved; the Food and Drug Administration (FDA) and National Institutes of Health (NIH) recently convened to ...
Article
Biomarkers are the pillars of precision medicine and are delivering on expectations of molecular, quantitative health. These features have made clinical decisions more precise and personalized, but require a high bar for validation. Biomarkers have improved health outcomes in a few areas such as cancer, pharmacogenetics, and safety. Burgeoning big data research infrastructure, the internet of things, and increased patient participation will accelerate discovery in the many areas that have not yet realized the full potential of biomarkers for precision health. Here we review themes of biomarker discovery, current implementations of biomarkers for precision health, and future opportunities and challenges for biomarker discovery. Impact statement: Precision medicine evolved because of the understanding that human disease is molecularly driven and is highly variable across patients. This understanding has made biomarkers, a diverse class of biological measurements, more relevant for disease diagnosis, monitoring, and selection of treatment strategy. Biomarkers’ impact on precision medicine can be seen in cancer, pharmacogenomics, and safety. The successes in these cases suggest many more applications for biomarkers and a greater impact for precision medicine across the spectrum of human disease. The authors assess the status of biomarker-guided medical practice by analyzing themes for biomarker discovery, reviewing the impact of these markers in the clinic, and highlighting future and ongoing challenges for biomarker discovery. This work is timely and relevant, as the molecular, quantitative approach of precision medicine is spreading to many disease indications.
... Given the discovery of unknown metabolite 4 as a biomarker that could at least partially predict low S/G excreting volunteers from high S/G excreting volunteers, it became important to identify this metabolite. Metabolite 4 was characterised by a singlet methyl resonance at 2.348 ppm that was linked by statistical correlation spectroscopy (STOCSY) [49] to a pair of second order, aromatic pseudo-doublets at ca 7.210 and 7.285 ppm. It thus appeared that metabolite 4 contained a 4-substituted benzene ring with a methyl group at position 1. ...
Article
Metabolic profiling by NMR spectroscopy or hyphenated mass spectrometry, known as metabonomics or metabolomics, is an important tool for systems-based approaches in biology and medicine. The experiments are typically done in a diagnostic fashion where changes in metabolite profiles are interpreted as a consequence of an intervention or event; be that a change in diet, the administration of a drug, physical exertion or the onset of a disease. By contrast, pharmacometabonomics takes a prognostic approach to metabolic profiling, in order to predict the effects of drug dosing before it occurs. Differences in pre-dose metabolite profiles between groups of subjects are used to predict post-dose differences in response to drug administration. Thus the paradigm is inverted and pharmacometabonomics is the metabolic equivalent of pharmacogenomics. Although the field is still in its infancy, it is expected that pharmacometabonomics, alongside pharmacogenomics, will assist with the delivery of personalised or precision medicine to patients, which is a critical goal of 21st century healthcare.
... Once a systematic effect of a diet or food has been identified, the signals that define the response can be related to a series of chemicals that are identified either from databases relating spectral information to chemical structure or by performing specific analytical experiments on selected or pooled samples to recover further chemical information, for example, application of NMR pulse programs that allow derivation of information relating to neighboring proton or carbon atoms (Nicholson & Wilson, 1989), isolation of chemical components using solid-, liquid-, or gas-phase chromatography followed by measurement using NMR and MS technologies (Wilson & Nicholson, 1987), direct hyphenation of LC-NMR-MS (Wilson & Nicholson, 1987), or use of statistical spectroscopic correlation methods (see Robinette, Lindon, & Nicholson, 2013 for a summary of available methods) based on identifying covariance of signals across an NMR or combined NMR and MS dataset. ...
Chapter
Nutrition provides the building blocks for growth, repair, and maintenance of the body and is key to maintaining health. Exposure to fast foods, mass production of dietary components, and wider importation of goods have challenged the balance between diet and health in recent decades, and both scientists and clinicians struggle to characterize the relationship between this changing dietary landscape and human metabolism with its consequent impact on health. Metabolic phenotyping of foods, using high-density data-generating technologies to profile the biochemical composition of foods, meals, and human samples (pre- and postfood intake), can be used to map the complex interaction between the diet and human metabolism and also to assess food quality and safety. Here, we outline some of the techniques currently used for metabolic phenotyping and describe key applications in the food sciences, ending with a broad outlook at some of the newer technologies in the field with a view to exploring their potential to address some of the critical challenges in nutritional science.
... Using this approach, two or more molecules involved in the same pathway can also display high intermolecular correlations, or can even be anti-correlated, because of biological covariance. A range of other statistical spectroscopy techniques have also emerged with varying degrees of usage in metabolomics [28]. ...
Chapter
This chapter introduces readers to metabolomics as a new tool in pharmacognosy research. The escalating cost of medicine and health care services warrants the development of new ways and approaches in remedying the problem. Application of the metabolomics approach in various aspects of pharmacognosy research may assist in reducing the cost of drug discovery and development. Metabolomics is a holistic approach to understanding biological processes at a system level. It incorporates an extensive use of instrumentation (especially spectroscopy) and statistical methods. The tool has been successfully tested in solving numerous problems from diverse fields, and shows good promise in terms of its benefits and potential use. This chapter discusses the basic understanding and procedures in metabolomics, including sample selection, collection, data acquisition, and data analysis. Topics relevant to pharmacognosy are also discussed to expose readers to some examples of investigations involving metabolomics.
... In addition to being predictive, biomarkers are preferably easy and cheap to score (Aronson 2005). This is probably why the use of molecular and biochemical markers, which proved to be excellent predictors and are relatively easy to measure in high-throughput conditions, became widespread in medicine (Menard et al. 2013; Robinette et al. 2013). ...
Article
Background: In the last decade, metabolomics has emerged as a powerful diagnostic and predictive tool in many branches of science. Researchers in microbial, animal, food, medical and plant science have generated a large number of targeted or non-targeted metabolic profiles by using a vast array of analytical methods (GC–MS, LC–MS, ¹H-NMR, etc.). Comprehensive analysis of such profiles using adapted statistical methods and modeling has opened up the possibility of using single or combinations of metabolites as markers. Metabolic markers have been proposed as proxies, diagnostics or predictors of key traits in a range of model species, and accurate predictions of disease outbreak frequency, developmental stages, food sensory evaluation and crop yield have been obtained. Aim of review: (i) To provide a definition of plant performance and metabolic markers, (ii) to highlight recent key applications involving metabolic markers as tools for monitoring or predicting plant performance, and (iii) to propose a workable and cost-efficient pipeline to generate and use metabolic markers with a special focus on plant breeding. Key message: Using examples in other models and domains, the review proposes that metabolic markers are tending to complement and possibly replace traditional molecular markers in plant science as efficient estimators of performance.
... See Figure 1 below for an example of the use of PCA. The interested reader can find further information in the recent literature (Lindon and Nicholson, 2008;Robinette et al., 2013). ...
Article
Variable patient responses to drugs are a key issue for medicine and for drug discovery and development. Personalized medicine, that is the selection of medicines for subgroups of patients so as to maximize drug efficacy and minimize toxicity, is a key goal of twenty-first century healthcare. Currently, most personalized medicine paradigms rely on clinical judgment based on the patient's history, and on the analysis of the patients' genome to predict drug effects i.e., pharmacogenomics. However, variability in patient responses to drugs is dependent upon many environmental factors to which human genomics is essentially blind. A new paradigm for predicting drug responses based on individual pre-dose metabolite profiles has emerged in the past decade: pharmacometabonomics, which is defined as “the prediction of the outcome (for example, efficacy or toxicity) of a drug or xenobiotic intervention in an individual based on a mathematical model of pre-intervention metabolite signatures.” The new pharmacometabonomics paradigm is complementary to pharmacogenomics but has the advantage of being sensitive to environmental as well as genomic factors. This review will chart the discovery and development of pharmacometabonomics, and provide examples of its current utility and possible future developments.
... Biospecimen banks may contain multiple biospecimens from the same individual allowing longitudinal metabolomics studies and/or comparison of metabolite profiles between biofluids (35). Figures 1 and 2 present spectra of biofluids collected simultaneously, including arterial blood plasma, jugular venous blood plasma, CSF, and urine ( Figure 1). ...
Article
Metabolomics is an important member of the omics community in that it defines which small molecules may be responsible for disease states. This article reviews the essential principles of metabolomics from specimen preparation, chemical analysis, to advanced statistical methods. Metabolomics in traumatic brain injury has so far been underutilized. Future metabolomics-based studies focused on the diagnoses, prognoses, and treatment effects need to be conducted across all types of traumatic brain injury.
... These metabonomics 2, 3 or metabolomics 4 studies are typically executed with NMR spectroscopy- or mass spectrometry-based technologies for metabolite identification in biofluids, cell extracts or tissue samples. There are many steps in a metabonomics experiment and most of these steps have well described protocols for NMR- [5][6][7][8][9] or MS-based [10][11][12][13][14] approaches and well-accepted statistical procedures [15][16][17][18][19][20] for their analysis, with the significant exception of known metabolite identification. This remains a problematic step for both NMR spectroscopy- [21][22][23][24][25] and mass spectrometry (MS)-based 10,26,27 metabonomics. ...
Article
A new, simple-to-implement and quantitative approach to assessing the confidence in NMR-based identification of known metabolites is introduced. The approach is based on a topological analysis of metabolite identification information available from NMR spectroscopy studies and is a development of the metabolite identification carbon efficiency (MICE) method. New topological metabolite identification indices are introduced, analysed and proposed for general use, including topological metabolite identification carbon efficiency (tMICE). Since known metabolite identification is one of the key bottlenecks in either NMR spectroscopy- or mass spectrometry-based metabonomics/metabolomics studies, and given the fact that there is no current consensus on how to assess metabolite identification confidence, it is hoped that these new approaches and the topological indices will find utility.
... Metabolites were assigned by querying public metabolome databases such as the Human Metabolome Database, 24 the Madison-Qingdao Metabolomics Consortium Database 25 and the commercially available software Chenomx NMR suite v.8.1 (Chenomx Inc., Edmonton, Canada), aided by statistical total correlation spectroscopy techniques. 26 The metabolites were finally confirmed by two-dimensional NMR techniques, e.g., total correlation spectroscopy (TOCSY) and heteronuclear single quantum correlation (HSQC). ...
Article
Most genetically modified crops are engineered for herbicide tolerance; among them, glyphosate-tolerant crops have the greatest share. Glyphosate is one of the most extensively used herbicides worldwide. The popularity of glyphosate stems from its low cost, low environmental impact, and effectiveness while being safe for animals. The toxicity of glyphosate to untargeted organisms was studied using goldfish (Carassius auratus) after exposure to different concentrations of glyphosate isopropylamine salt, a glyphosate-based herbicide, for 96 hours. Tissues of brain, kidney and liver were collected and subjected to NMR-based metabolomics analysis and histopathological inspection. Plasma was collected and the hematological parameters of glutamic-oxaloacetic transaminase (GOT), glutamate-pyruvate transaminase (GPT), lactate dehydrogenase (LDH), blood urea nitrogen (BUN) and creatinine (CRE) were quantified. Glyphosate produced an increase in the hematological parameters of BUN and CRE and dose-dependent injuries. Metabolomics analysis revealed significant perturbations in neurotransmitter equilibrium, energy metabolism and amino acid metabolism in glyphosate-dosed fish, which are associated with the toxicity of glyphosate. The results highlight the vulnerability of glutaminergic neurons to glyphosate and point to the potential of glutamine as an early marker of glyphosate-induced neurotoxicity.
Chapter
Chronic kidney disease (CKD) is becoming a leading cause of morbidity and mortality worldwide. Most often CKD is diagnosed during advanced stages of kidney disease. While genes that elevate CKD risk have been identified, there are environmental factors associated with CKD. In combination with sophisticated data analytics, multi-omics methods are needed to elucidate both genetic and environmental drivers of CKD and identify novel biomarkers to diagnose and stage disease. These approaches will reveal the underlying etiology and mechanisms of CKD, and enable the development of improved intervention strategies, specifically nutritional intervention. This chapter describes advanced technologies including metabolomics, exposome, and systems biology approaches that can be used to inform improved treatment for CKD and summarizes relevant literature to determine how metabolic individuality (or heterogeneity) can inform personalized treatment or interventional options.
Article
The development of new "omics" platforms is having a significant impact on the landscape of natural products discovery. However, despite the advantages that such platforms bring to the field, there remains no straightforward method for characterizing the chemical landscape of natural products libraries using two-dimensional nuclear magnetic resonance (2D-NMR) experiments. NMR analysis provides a powerful complement to mass spectrometric approaches, given the universal coverage of NMR experiments. However, the high degree of signal overlap, particularly in one-dimensional NMR spectra, has limited applications of this approach. To address this issue, we have developed a new data analysis platform for complex mixture analysis, termed MADByTE (Metabolomics and Dereplication by Two-Dimensional Experiments). This platform employs a combination of TOCSY and HSQC spectra to identify spin system features within complex mixtures and then matches spin system features between samples to create a chemical similarity network for a given sample set. In this report we describe the design and construction of the MADByTE platform and demonstrate the application of chemical similarity networks for both the dereplication of known compound scaffolds and the prioritization of bioactive metabolites from a bacterial prefractionated extract library.
Article
We have applied nuclear magnetic resonance spectroscopy based plasma phenotyping to reveal diagnostic molecular signatures of SARS-CoV-2 infection via combined diffusional and relaxation editing (DIRE). We compared plasma from healthy age-matched controls (n = 26) with SARS-CoV-2 negative non-hospitalized respiratory patients and hospitalized respiratory patients (n = 23 and 11 respectively) with SARS-CoV-2 rRT-PCR positive respiratory patients (n = 17, with longitudinal sampling time-points). DIRE data were modelled using principal component analysis and orthogonal projections to latent structures discriminant analysis (O-PLS-DA), with statistical cross-validation indices indicating excellent model generalization for the classification of SARS-CoV-2 positivity for all comparator groups (area under the receiver operator characteristic curve = 1). DIRE spectra show biomarker signal combinations conferred by differential concentrations of metabolites with selected molecular mobility properties. These comprise the following: (a) composite N-acetyl signals from α-1-acid glycoprotein and other glycoproteins (designated GlycA and GlycB) that were elevated in SARS-CoV-2 positive patients [p = 2.52 × 10⁻¹⁰ (GlycA) and 1.25 × 10⁻⁹ (GlycB) vs controls], (b) two diagnostic supramolecular phospholipid composite signals that were identified (SPC-A and SPC-B) from the –⁺N–(CH3)3 choline headgroups of lysophosphatidylcholines carried on plasma glycoproteins and from phospholipids in high-density lipoprotein subfractions (SPC-A) together with a phospholipid component of low-density lipoprotein (SPC-B). The integrals of the summed SPC signals (SPCtotal) were reduced in SARS-CoV-2 positive patients relative to both controls (p = 1.40 × 10⁻⁷) and SARS-CoV-2 negative patients (p = 4.52 × 10⁻⁸) but were not significantly different between controls and SARS-CoV-2 negative patients. The identity of the SPC signal components was determined using one and two dimensional diffusional, relaxation, and statistical spectroscopic experiments. The SPCtotal/GlycA ratios were also significantly different for control versus SARS-CoV-2 positive patients (p = 1.23 × 10⁻¹⁰) and for SARS-CoV-2 negatives versus positives (p = 1.60 × 10⁻⁹). Thus, plasma SPCtotal and SPCtotal/GlycA are proposed as sensitive molecular markers for SARS-CoV-2 positivity that could effectively augment current COVID-19 diagnostics and may have value in functional assessment of the disease recovery process in patients with long-term symptoms.
Chapter
In this chapter, we summarize data preprocessing and data analysis strategies used for analysis of NMR data for metabolomics studies. Metabolomics consists of the analysis of the low molecular weight compounds in cells, tissues, or biological fluids, and has been used to reveal biomarkers for early disease detection and diagnosis, to monitor interventions, and to provide information on pathway perturbations to inform mechanisms and identifying targets. Metabolic profiling (also termed metabotyping) involves the analysis of hundreds to thousands of molecules using mainly state-of-the-art mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy technologies. While NMR is less sensitive than mass spectrometry, NMR does provide a wealth of complex and information rich metabolite data. NMR data together with the use of conventional statistics, modeling methods, and bioinformatics tools reveals biomarker and mechanistic information. A typical NMR spectrum, with up to 64k data points, of a complex biological fluid or an extract of cells and tissues consists of thousands of sharp signals that are mainly derived from small molecules. In addition, a number of advanced NMR spectroscopic methods are available for extracting information on high molecular weight compounds such as lipids or lipoproteins. There are numerous data preprocessing, data reduction, and analysis methods developed and evolving in the field of NMR metabolomics. Our goal is to provide an extensive summary of NMR data preprocessing and analysis strategies by providing examples and open source and commercially available analysis software and bioinformatics tools.
Chapter
NMR data from large studies combining multiple cohorts is becoming common in large-scale metabolomics. The data size and combination of cohorts with diverse properties leads to special problems for data processing and analysis. These include alignment, normalization, detection and removal of outliers, presence of strong correlations, and the identification of unknowns. Nonetheless, these challenges can be addressed with suitable algorithms and techniques, leading to enhanced data sets ripe for further data mining.
Article
${}^{1}$H High-Resolution Magic Angle Spinning (HRMAS) Nuclear Magnetic Resonance (NMR) is a reliable technology used for detecting metabolites in solid tissues. Its fast response time enables guiding surgeons in real time, for detecting tumor cells that are left over in the excision cavity. However, severe overlap of spectral resonances in the 1D signal often renders distinguishing metabolites impossible. In that case, Heteronuclear Single Quantum Coherence Spectroscopy (HSQC) NMR is applied, which can distinguish metabolites by generating 2D spectra (${}^{1}$H-${}^{13}$C). Unfortunately, this analysis requires a much longer time and prohibits real time analysis. Thus, obtaining a 2D spectrum quickly has major implications in medicine. In this study, we show that using multiple multivariate regression and statistical total correlation spectroscopy, we can learn the relation between the ${}^{1}$H and ${}^{13}$C dimensions. Learning is possible with small sample sizes, and without the need to perform the HSQC analysis we can predict the ${}^{13}$C dimension by just performing the ${}^{1}$H HRMAS NMR experiment. We show on a rat model of central nervous system tissues (80 samples, 5 tissues) that our methods achieve 0.971 and 0.957 mean $R^2$ values, respectively. Our tests on 15 human brain tumor samples show that we can predict 104 groups of 39 metabolites with 97 percent accuracy. Finally, we show that we can predict the presence of a drug resistant tumor biomarker (creatine) despite the obstructed signal in the ${}^{1}$H dimension. In practice, this information can provide valuable feedback to the surgeon to further resect the cavity to avoid potential recurrence.
Chapter
Metabolites are small molecules derived from biochemical processes in metabolism, and their profiling enables the analysis of physiological functions. Metabolic profiling through cross-sectional studies has moved forward to longitudinal cohort studies and metabolome-wide association studies (MWAS) which have helped unveil numerous discoveries in amino acid, fatty acid and energy metabolism pathways and their link in inflammatory bowel disease (IBD). This chapter will introduce metabolic profiling approaches and discuss the role that the metabolites play in the link between the gut microbiome and the host with regard to IBD. We will discuss the various biomarkers, which have been uncovered by metabonomics currently through separation of IBD phenotypes and the future for this area in relation to biomarkers for pathogenesis of IBD and personalizing medical therapy.
Article
The complexity of biological mixtures continues to challenge efforts aimed at unknown metabolite identification in the metabolomics field. To address this challenge, we provide a new method to identify related peaks from individual metabolites in complex NMR spectra. Extractive ratio analysis NMR spectroscopy (E-RANSY) builds on our previously described ratio analysis method [Anal. Chem. 2011, 83, 7616-7623] and exploits the simplified NMR spectra provided by the extraction of metabolites under varied pH conditions. Under such conditions, metabolites from the same biological specimen are extracted differentially and the resulting NMR spectra exhibit characteristics favorable for unraveling unknown metabolite peaks using ratio analysis. We demonstrate the utility of the E-RANSY method by extracting carboxylic acid containing metabolites from human urine, one of the highly complex biological mixtures encountered in the metabolomics field. E-RANSY performs better than the original RANSY method and offers new avenues to identify unknown metabolites in complex biological mixtures.
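The general idea of ratio analysis can be sketched as follows: peaks that belong to the same molecule keep a constant intensity ratio to a driver peak across samples, so the spread of that ratio is small. The sketch below scores each variable by the mean of its ratio to the driver divided by the standard deviation of that ratio. This scoring rule, the function name `ratio_scores`, and the `eps` guard are assumptions made for illustration; the published RANSY and E-RANSY procedures may differ in their details.

```python
import numpy as np


def ratio_scores(spectra, driver_index, eps=1e-12):
    """Score each variable by how constant its ratio to the driver is across samples.

    spectra: 2D array, rows = samples, columns = spectral variables.
    The driver variable is assumed to be nonzero in every sample.
    eps is a hypothetical guard against division by a zero standard deviation.
    """
    X = np.asarray(spectra, dtype=float)
    ratios = X / X[:, [driver_index]]      # per-sample ratio to the driver
    mean = ratios.mean(axis=0)
    std = ratios.std(axis=0, ddof=1)
    return mean / (std + eps)              # large score = nearly constant ratio


if __name__ == "__main__":
    a = np.array([1, 2, 3, 4, 5, 6], dtype=float)   # compound A concentrations
    b = np.array([3, 1, 4, 1, 5, 2], dtype=float)   # compound B concentrations
    X = np.column_stack([2 * a, b, 0.5 * a, a])
    print(ratio_scores(X, driver_index=0))
```

In the toy data, the columns that scale with the same compound as the driver receive very large scores, while the column from the unrelated compound scores close to 1.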
Chapter
The exposome concept places substantial weight on the internal chemical milieu of individuals, as this is the primary integrator of the human genome and the wider external environment. Small molecule metabolites of both endogenous and exogenous origin are involved in a plethora of cellular and systemic functions, and collectively contribute to the mechanistic linkage of exposures, responses, and associated adverse outcomes. Temporal and spatial responses of metabolic phenotypes to various environmental stimuli provide a direct report on multiple interacting and conditional processes that are modulated by numerous factors including diet, lifestyle, pharmaceutical use, microbial activity, age, sex, and many others. Measuring and integrating information about the human metabolome represents a critical part of the path toward understanding the environmental determinants of chronic disease.
Chapter
NMR spectroscopy of urine is a fertile bioanalytical approach for a wide range of studies in areas such as toxicity, drug development, molecular epidemiology, disease diagnosis, and nutrition. In this chapter, technical concerns critical to the design and execution of urinary NMR experiments are explored. Beginning with the chemical characteristics of urinary NMR spectra, we discuss the history of urinary NMR metabolomics through studies of toxicity and its suitability as a platform for large-scale studies due to high reproducibility and robustness. With respect to experimental design, a detailed discussion of validated urine collection procedures for both human and other animal model experimental systems is provided along with procedures for the use of preservatives and storage. We explore specific issues in the acquisition of urinary NMR experiments, such as the choice of pulse program and solvent suppression. Data pre-processing techniques, such as spectral binning, quantitative peak-fitting, and full-spectrum approaches, as input to subsequent chemometric evaluation of NMR spectra are detailed. Moving towards applications, we review illustrative biological examples of NMR spectroscopy of urine to studies of normal variation and non-healthy phenotypes. Finally, we discuss emerging challenges in biomarker discovery as well as the emerging field of pharmacometabonomics.
Chapter
This chapter provides a brief background of two-dimensional correlation spectroscopy (2DCOS), its new and noteworthy developments and applications in vibrational and optical spectroscopy. 2DCOS is now accepted as a powerful and versatile technique for the in-depth analysis of various spectral data obtained under external perturbations in various spectroscopic experiments. This chapter highlights new types of 2DCOS, such as chemometrics-combined 2DCOS, moving-window 2D analysis, orthogonal sample design and related techniques, projection 2D analysis, and 2D codistribution spectroscopy, as well as various experimental practices in vibrational and optical spectroscopy.
Article
A lot of time is spent by researchers in the identification of metabolites in NMR-based metabolomic studies. The usual metabolite identification starts employing public or commercial databases to match chemical shifts thought to belong to a given compound. Statistical total correlation spectroscopy (STOCSY), in use for more than a decade, speeds the process by finding statistical correlations among peaks, being able to create a better peak list as input for the database query. However, the (normally not automated) analysis becomes challenging due to the intrinsic issue of peak overlap, where correlations of more than one compound appear in the STOCSY trace. Here we present a fully automated methodology that analyzes all STOCSY traces at once (every peak is chosen as driver peak) and overcomes the peak overlap obstacle. Peak overlap detection by clustering analysis and sorting of traces (POD-CAST) first creates an overlap matrix from the STOCSY traces, then clusters the overlap traces based on their similarity and finally calculates a cumulative overlap index (COI) to account for both strong and intermediate correlations. This information is gathered in one plot to help the user identify the groups of peaks that would belong to a single molecule and perform a more reliable database query. The simultaneous examination of all traces reduces the time of analysis, compared to viewing STOCSY traces by pairs or small groups, and condenses the redundant information in the 2D STOCSY matrix into bands containing similar traces. The COI helps in the detection of overlapping peaks, which can be added to the peak list from another cross-correlated band. 
POD-CAST overcomes the generally overlooked and underestimated presence of overlapping peaks and it detects them to include them in the search of all compounds contributing to the peak overlap, enabling the user to accelerate the metabolite identification process with more successful database queries and searching all tentative compounds in the sample set.
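The STOCSY trace that POD-CAST builds on can be illustrated with a minimal sketch. It assumes spectra are already aligned on a shared chemical-shift grid and stored as one list of intensities per sample; the function name `stocsy_trace` and the toy data are hypothetical, and this is not the POD-CAST implementation (the overlap matrix, clustering, and COI steps are not shown).

```python
from math import sqrt


def stocsy_trace(spectra, driver):
    """Covariance and Pearson correlation between a driver variable and all variables.

    spectra: list of samples, each a list of intensities on a shared grid.
    driver:  column index of the chosen driver peak.
    """
    n = len(spectra)
    if n < 2:
        raise ValueError("need at least two samples")
    n_vars = len(spectra[0])

    # Mean-centre every variable (column) across samples.
    means = [sum(row[j] for row in spectra) / n for j in range(n_vars)]
    centered = [[row[j] - means[j] for j in range(n_vars)] for row in spectra]

    def cov(a, b):
        # Sample covariance between columns a and b.
        return sum(r[a] * r[b] for r in centered) / (n - 1)

    variances = [cov(j, j) for j in range(n_vars)]
    covs = [cov(driver, j) for j in range(n_vars)]
    cors = []
    for j in range(n_vars):
        denom = sqrt(variances[driver] * variances[j])
        # A constant variable has zero variance; report 0 rather than divide by zero.
        cors.append(covs[j] / denom if denom > 0 else 0.0)
    return covs, cors


# Toy data: 5 samples x 4 variables. Column 2 is exactly half of column 0
# (as two peaks of one molecule would be); column 3 is a flat baseline.
spectra = [
    [1.0, 5.0, 0.5, 2.0],
    [2.0, 3.0, 1.0, 2.0],
    [3.0, 4.0, 1.5, 2.0],
    [4.0, 1.0, 2.0, 2.0],
    [5.0, 2.0, 2.5, 2.0],
]
covs, cors = stocsy_trace(spectra, driver=0)
print([round(r, 3) for r in cors])
```

Computing this trace once per variable, with every peak taken in turn as the driver, gives the full 2D STOCSY matrix whose traces the abstract describes comparing.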
Article
Crack-cocaine abuse and dependence is a severe public health problem. Crack-cocaine smokers sometimes show behavioral changes and symptoms very similar to those observed in severe mental disorders such as schizophrenia. Although crack-cocaine use can easily be detected in smokers by urine or blood tests, cocaine biomarkers become untraceable after some time. In addition, schizophrenia diagnosis is limited to clinical interviews, while precise clinical tests for this mental disorder remain unknown. Employing metabolomics based on NMR spectroscopy, we show herein that blood serum metabolites might be used to discriminate between groups of crack-cocaine users and schizophrenia patients. These two groups showed the greatest differences in around eleven key metabolites. Moreover, seven possible peripheral metabolites might be enough to differentiate crack-cocaine users from healthy controls. These results may contribute to a better understanding of the biochemical effects of crack-cocaine and enable more precise diagnosis when the crack-cocaine biomarkers methylecgonidine or ecgonidine are absent from the urine or blood of crack-cocaine users.
Article
In this work, multivariate statistical techniques are employed to determine patterns and conversion curves from time-resolved X-ray powder diffraction data. For these purposes, time-window statistical total correlation spectroscopy is introduced for the pattern matching of the crystalline phase and is shown to be effective even in the case of overlapping peaks. When combined with evolving factor analysis and multivariate curve resolution–alternating least squares, this technique allows a definite estimation of patterns and conversion curves. The procedure is applied to in situ synchrotron powder diffraction patterns to monitor the setting reaction of magnesium potassium phosphate ceramic (MKP) from magnesia (MgO) and potassium dihydrogen phosphate. It is shown that the phases involved in the reaction are clearly distinguished and their evolution is correctly described. The conversion curves estimated with the proposed procedure are compared with the ones determined with the peak integration method, leading to an excellent agreement (Pearson's correlation coefficient equal to 0.9995 and 0.9998 for MgO and MKP, respectively). The approach also allows for the detection and description of the evolution of amorphous phases that cannot be described through conventional analysis of powder diffraction data.
Article
The human microbiome is a new frontier in biology and one that is helping to define what it is to be human. Recently, we have begun to understand that the “communication” between the host and its microbiome is via a metabolic superhighway. By interrogating and understanding the molecules involved we may start to know who the main players are, and how we can modulate them and the mechanisms of health and disease.
Chapter
This chapter describes the concept of the patient journey, that is, the various interactions between a patient and a medical team as the patient first encounters the system, is diagnosed, treated, and followed up after whatever course of action was deemed appropriate. The various bottlenecks in the process are explained. As a new paradigm, the role of metabolic phenotyping (metabotyping) in monitoring the patient journey is discussed and examples are provided. The potential of such metabolic phenotyping in the clinic has implications in terms of stratified or personalized medicine, including adding information to aid diagnosis or to allow better prognosis, and these implications are listed. Finally, one example of the process, a dedicated phenome center, is illustrated.
Chapter
Metabolic phenotyping of urine, blood plasma/serum, and many other biological fluids provides important prognostic and diagnostic information and permits monitoring of disease progression in an objective manner. Much effort has been made in recent years to develop analytical instrumentation and technology, which has allowed for the acquisition of data in an effective, accurate, reproducible, and high-throughput manner. The two main goals are to enable the study of general population samples from epidemiologic collections for biomarkers of disease risk and to provide enhanced information to aid clinical decision making for improved diagnosis, prognosis, and patient stratification. However, developing highly reproducible methods and standardized protocols that minimize technical or experimental bias and allow realistic interlaboratory comparisons of subtle biomarker information remains a challenge. Increasing demand to metabolically phenotype large cohorts of samples means experiments must be performed in a tightly regulated and uniform way to maintain quality control in a high-throughput setting. Here, we present an introduction to experimental and technical factors, including sample preparation, throughput, reproducibility, cost-effectiveness, resolution, and experimental coverage, which require careful consideration before considering high-throughput screening of biofluids for metabolic analysis.
Article
Numerous molecular screening strategies have recently been developed to measure the chemical diversity of a population's biofluids with the ultimate aim to provide clinicians, medical scientists and epidemiologists with a clearer picture of the presence and severity of cardiovascular disease; prognosis; and response to treatment. Current cardiology practice integrates clinical history and examination with state-of-the-art imaging, invasive measures, and electrical interrogation. Biomarkers in common clinical use are relatively limited to troponin and brain natriuretic peptide, dependent on damage to heart muscle, or myocyte 'stretch' respectively. Although they have been recently applied to risk stratification in asymptomatic individuals at higher risk, the development of markers capable of detecting earlier phases of disease development would facilitate targeted strategies to prevent pathological complications in the general community. Metabolomics is the systematic study of small molecules in biological fluids. Profiling strategies aim to comprehensively measure and quantify such biomarkers in a fast, cost-effective and clinically informative manner. Techniques tend to be applied in an unbiased fashion, with advanced statistical methods allowing for identification of signature profiles in particular cohorts. In this manner, metabolomics has the potential to identify new pathophysiological pathways, and thus therapeutic targets, as well as assist in improved risk-stratification and personalized cardiovascular medicine. The latter has great potential in the primary and secondary cardiovascular disease prevention settings, integrating known and as yet unidentified host and environmental factors. The current review discusses applications of metabolomic techniques relevant to both the research and the clinical cardiologist.
Article
Metabonomics/Metabolomics is an important science for the understanding of biological systems and the prediction of their behaviour, through the profiling of metabolites. Two technologies are routinely used in order to analyse metabolite profiles in biological fluids: nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS), the latter typically with hyphenation to a chromatography system such as liquid chromatography (LC), in a configuration known as LC–MS. With both NMR and MS-based detection technologies, the identification of the metabolites in the biological sample remains a significant obstacle and bottleneck. This article provides guidance on methods for metabolite identification in biological fluids using NMR spectroscopy, and is illustrated with examples from recent studies on mice.
Article
Metabolic phenotyping involves the comprehensive analysis of biological fluids or tissue samples. This analysis allows biochemical classification of a person's physiological or pathological states that relate to disease diagnosis or prognosis at the individual level and to disease risk factors at the population level. These approaches are currently being implemented in hospital environments and in regional phenotyping centres worldwide. The ultimate aim of such work is to generate information on patient biology using techniques such as patient stratification to better inform clinicians on factors that will enhance diagnosis or the choice of therapy. There have been many reports of direct applications of metabolic phenotyping in a clinical setting.
Article
Comprehensive characterization of human tissues promises novel insights into the biological architecture of human diseases and traits. We assessed metabonomic, transcriptomic, and genomic variation for a large population-based cohort from the capital region of Finland. Network analyses identified a set of highly correlated genes, the lipid–leukocyte (LL) module, as having a prominent role in over 80 serum metabolites (of 134 measures quantified), including lipoprotein subclasses, lipids, and amino acids. Concurrent association with immune response markers suggested the LL module as a possible link between inflammation, metabolism, and adiposity. Further, genomic variation was used to generate a directed network and infer LL module's largely reactive nature to metabolites. Finally, gene co-expression in circulating leukocytes was shown to be dependent on serum metabolite concentrations, providing evidence for the hypothesis that the coherence of molecular networks themselves is conditional on environmental factors. These findings show the importance and opportunity of systematic molecular investigation of human population samples. To facilitate and encourage this investigation, the metabonomic, transcriptomic, and genomic data used in this study have been made available as a resource for the research community.
Article
Metabolism has an essential role in biological systems. Identification and quantitation of the compounds in the metabolome is defined as metabolic profiling, and it is applied to define metabolic changes related to genetic differences, environmental influences and disease or drug perturbations. Chromatography-mass spectrometry (MS) platforms are frequently used to provide the sensitive and reproducible detection of hundreds to thousands of metabolites in a single biofluid or tissue sample. Here we describe the experimental workflow for long-term and large-scale metabolomic studies involving thousands of human samples with data acquired for multiple analytical batches over many months and years. Protocols for serum- and plasma-based metabolic profiling applying gas chromatography-MS (GC-MS) and ultraperformance liquid chromatography-MS (UPLC-MS) are described. These include biofluid collection, sample preparation, data acquisition, data pre-processing and quality assurance. Methods for quality control-based robust LOESS signal correction to provide signal correction and integration of data from multiple analytical batches are also described.
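The idea behind quality control-based signal correction can be sketched as follows. This is a simplified illustration, not the protocol's procedure: it fits a plain tricube-weighted local linear regression (a LOESS-style smoother without the robustness iterations) to the pooled QC intensities of one feature as a function of injection order, then divides every sample by the fitted drift and rescales to the QC median. The function names, the default span, and the choice of the QC median as the target level are all assumptions made for this example.

```python
import math
from statistics import median


def _tricube(u):
    """Tricube kernel: weight falls from 1 at u = 0 to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_at(x0, xs, ys, span=0.75):
    """Tricube-weighted local linear fit of ys on xs, evaluated at x0."""
    k = min(len(xs), max(3, math.ceil(span * len(xs))))
    # Bandwidth: distance from x0 to its k-th nearest neighbour.
    h = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
    w = [_tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    if sxx == 0:
        # Degenerate neighbourhood: fall back to the weighted mean.
        return my
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def qc_drift_correct(orders, intensities, is_qc, span=0.75):
    """Divide each intensity by the QC-fitted drift and rescale to the QC median."""
    qc_x = [o for o, q in zip(orders, is_qc) if q]
    qc_y = [y for y, q in zip(intensities, is_qc) if q]
    target = median(qc_y)
    return [y / loess_at(o, qc_x, qc_y, span) * target
            for o, y in zip(orders, intensities)]


# Toy feature: QC intensity drifts linearly (100 - order); one study sample
# at injection order 12 has half the QC level at that time point.
orders = [0, 5, 10, 12, 15, 20, 25]
is_qc = [True, True, True, False, True, True, True]
raw = [100.0, 95.0, 90.0, 44.0, 85.0, 80.0, 75.0]
corrected = qc_drift_correct(orders, raw, is_qc)
print([round(v, 2) for v in corrected])
```

In a multi-batch study the same correction would be applied per feature and per batch, so that QC samples from different batches end up at a common level before the data are combined.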
Article
The characteristics of the OPLS method have been investigated for the purpose of discriminant analysis (OPLS-DA). We demonstrate how class-orthogonal variation can be exploited to augment classification performance in cases where the individual classes exhibit divergence in within-class variation, in analogy with soft independent modelling of class analogy (SIMCA) classification. The prediction results will be largely equivalent to traditional supervised classification using PLS-DA if no such variation is present in the classes. A discriminatory strategy is thus outlined, combining the strengths of PLS-DA and SIMCA classification within the framework of the OPLS-DA method. Furthermore, resampling methods have been employed to generate distributions of predicted classification results and subsequently assess classification belief. This enables utilisation of the class-orthogonal variation in a proper statistical context. The proposed decision rule is compared to common decision rules and is shown to produce comparable or less class-biased classification results. Copyright © 2007 John Wiley & Sons, Ltd.
Article
Recently, interest in untargeted metabolomics has become prevalent in the general scientific community among an increasing number of investigators. The majority of these investigators, however, do not have the bioinformatic expertise that has been required to process metabolomic data by using command-line driven software programs. Here we introduce a novel platform to process untargeted metabolomic data that uses an intuitive graphical interface and does not require installation or technical expertise. This platform, called XCMS Online, is a web-based version of the widely used XCMS software that allows users to easily upload and process liquid chromatography/mass spectrometry data with only a few mouse clicks. XCMS Online provides a solution for the complete untargeted metabolomic workflow including feature detection, retention time correction, alignment, annotation, statistical analysis, and data visualization. Results can be browsed online in an interactive, customizable table showing statistics, chromatograms, and putative METLIN identities for each metabolite. Additionally, all results and images can be downloaded as zip files for offline analysis and publication. XCMS Online is available at https://xcmsonline.scripps.edu.
Article
Brain tissue biopsies are required to histologically diagnose brain tumors, but current approaches are limited by tissue characterization at the time of surgery. Emerging technologies such as mass spectrometry imaging can enable a rapid direct analysis of cancerous tissue based on molecular composition. Here, we illustrate how gliomas can be rapidly classified by desorption electrospray ionization-mass spectrometry (DESI-MS) imaging, multivariate statistical analysis, and machine learning. DESI-MS imaging was carried out on 36 human glioma samples, including oligodendroglioma, astrocytoma, and oligoastrocytoma, all of different histologic grades and varied tumor cell concentration. Gray and white matter from glial tumors were readily discriminated and detailed diagnostic information could be provided. Classifiers for subtype, grade, and concentration features generated with lipidomic data showed high recognition capability with more than 97% cross-validation. Specimen classification in an independent validation set agreed with expert histopathology diagnosis for 79% of tested features. Together, our findings offer proof of concept that intraoperative examination and classification of brain tissue by mass spectrometry can provide surgeons, pathologists, and oncologists with critical and previously unavailable information to rapidly guide surgical resections that can improve management of patients with malignant brain tumors.
Article
With successes of genome-wide association studies, molecular phenotyping systems are developed to identify genetically determined disease-associated biomarkers. Genetic studies of the human metabolome are emerging but exclusively apply targeted approaches, which restricts the analysis to a limited number of well-known metabolites. We have developed novel technical and statistical methods for systematic and automated quantification of untargeted NMR spectral data designed to perform robust and accurate quantitative trait locus (QTL) mapping of known and previously unreported molecular compounds of the metabolome. For each spectral peak, six summary statistics were calculated and independently tested for evidence of genetic linkage in a cohort of F2 (129S6xBALB/c) mice. The most significant evidence of linkages were obtained with NMR signals characterizing the glycerate (LOD10-42) at the mutant glycerate kinase locus, which demonstrate the power of metabolomics in quantitative genetics to identify the biological function of genetic variants. These results provide new insights into the resolution of the complex nature of metabolic regulations and novel analytical techniques that maximize the full utilization of metabolomic spectra in human genetics to discover mappable disease-associated biomarkers.
Article
The role of urinary metabolic profiling in systems biology research is expanding. This is because of the use of this technology for clinical diagnostic and mechanistic studies and for the development of new personalized health care and molecular epidemiology (population) studies. The methodologies commonly used for metabolic profiling are NMR spectroscopy, liquid chromatography mass spectrometry (LC/MS) and gas chromatography-mass spectrometry (GC/MS). In this protocol, we describe urine collection and storage, GC/MS and data preprocessing methods, chemometric data analysis and urinary marker metabolite identification. Results obtained using GC/MS are complementary to NMR and LC/MS. Sample preparation for GC/MS analysis involves the depletion of urea via treatment with urease, protein precipitation with methanol, and trimethylsilyl derivatization. The protocol described here facilitates the metabolic profiling of ∼400-600 metabolites in 120 urine samples per week.
Article
The feasibility of electrospray (ES) ionization of aerosols generated by electrosurgical disintegration methods was investigated. Although electrosurgery itself was demonstrated to produce gaseous ions, post-ionization methods were implemented to enhance the ion yield, especially in those cases when the ion current produced by the applied electrosurgical method is not sufficient for MS analysis. Post-ionization was implemented by mounting an ES emitter onto a Venturi pump, which is used for ion transfer. The effect of various parameters including geometry, high voltage setting, flow parameters, and solvent composition was investigated in detail. Experimental setups were optimized accordingly. ES post-ionization was found to yield spectra similar to those obtained by the REIMS technique, featuring predominantly lipid-type species. Signal enhancement was 20- to 50-fold compared with electrosurgical disintegration in positive mode, while no improvement was observed in negative mode. ES post-ionization was also demonstrated to allow the detection of non-lipid type species in the electrosurgical aerosol, including drug molecules. Since the tissue specificity of the MS data was preserved in the ES post-ionization setup, feasibility of tissue identification was demonstrated using different electrosurgical methods.
Article
Bariatric surgery, also known as metabolic surgery, is an effective treatment for morbid obesity, which also offers pronounced metabolic effects including the resolution of type 2 diabetes and a decrease in cardiovascular disease and long-term cancer risk. However, the mechanisms of surgical weight loss and the long-term consequences of bariatric surgery remain unclear. Bariatric surgery has been demonstrated to alter the composition of both the microbiome and the metabolic phenotype. We observed a marked shift toward Gammaproteobacteria, particularly Enterobacter hormaechei, following Roux-en-Y gastric bypass (RYGB) surgery in a rat model compared with sham-operated controls. Fecal water from RYGB surgery rats was highly cytotoxic to rodent cells (mouse lymphoma cell line). In contrast, fecal water from sham-operated animals showed no/very low cytotoxicity. This shift in the gross structure of the microbiome correlated with greatly increased cytotoxicity. Urinary phenylacetylglycine and indoxyl sulfate and fecal gamma-aminobutyric acid, putrescine, tyramine, and uracil were found to be inversely correlated with cell survival rate. This profound co-dependent response of mammalian and microbial metabolism to RYGB surgery and the impact on the cytotoxicity of the gut luminal environment suggests that RYGB exerts local and global metabolic effects which may have an influence on long-term cancer risk and cytotoxic load.
Article
We have performed a metabolite quantitative trait locus (mQTL) study of the ¹H nuclear magnetic resonance spectroscopy (¹H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprised of female twins donating samples longitudinally. Sample metabolite concentrations were quantified by ¹H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10⁻¹¹
Article
Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 show effect sizes that are unusually high for GWAS and account for 10-60% differences in metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including those for cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism and Crohn's disease. The study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.
Article
MetaboAnalyst is an integrated web-based platform for comprehensive analysis of quantitative metabolomic data. It is designed to be used by biologists (with little or no background in statistics) to perform a variety of complex metabolomic data analysis tasks. These include data processing, data normalization, statistical analysis and high-level functional interpretation. This protocol provides a step-wise description on how to format and upload data to MetaboAnalyst, how to process and normalize data, how to identify significant features and patterns through univariate and multivariate statistical methods and, finally, how to use metabolite set enrichment analysis and metabolic pathway analysis to help elucidate possible biological mechanisms. The complete protocol can be executed in approximately 45 min.
Article
We present a genome-wide association study of metabolic traits in human urine, designed to investigate the detoxification capacity of the human body. Using NMR spectroscopy, we tested for associations between 59 metabolites in urine from 862 male participants in the population-based SHIP study. We replicated the results using 1,039 additional samples of the same study, including a 5-year follow-up, and 992 samples from the independent KORA study. We report five loci with joint P values of association from 3.2 × 10⁻¹⁹ to 2.1 × 10⁻¹⁸². Variants at three of these loci have previously been linked with important clinical outcomes: SLC7A9 is a risk locus for chronic kidney disease, NAT2 for coronary artery disease and genotype-dependent response to drug toxicity, and SLC6A20 for iminoglycinuria. Moreover, we identify rs37369 in AGXT2 as the genetic basis of hyper-β-aminoisobutyric aciduria.
Article
Statistical total correlation spectroscopy (STOCSY) is a well-established and valuable method in the elucidation of both inter- and intrametabolite correlations in NMR metabonomic data sets. Here, the STOCSY approach is extended in a novel Iterative-STOCSY (I-STOCSY) tool in which correlations are calculated initially from a driver peak of interest and subsequently for all peaks identified as correlating with a correlation coefficient greater than a set threshold. Consequently, in a single automated run, the majority of information contained in multiple STOCSY calculations from all peaks recursively correlated to the original user defined driver peak of interest are recovered. In addition, highly correlating peaks are clustered into putative structurally related sets, and the results are presented in a fully interactive plot where each set is represented by a node; node-to-node connections are plotted alongside corresponding spectral data colored by the strength of connection, thus allowing the intuitive exploration of both inter- and intrametabolite connections. The I-STOCSY approach has been here applied to a ¹H NMR data set of 24 h postdose aqueous liver extracts from rats treated with the model hepatotoxin galactosamine and has been shown both to recover the previously deduced major metabolic effects of treatment and to generate new hypotheses even on this well-studied model system. I-STOCSY, thus, represents a significant advance in correlation based analysis and visualization, providing insight into inter- and intrametabolite relationships following metabolic perturbations.
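The recursive expansion described in this abstract can be sketched as a breadth-first search over peaks. The example below is only an illustration under stated assumptions: each peak is represented by a vector of intensities across samples, the names `pearson` and `iterative_stocsy` and the 0.8 threshold are hypothetical, and the clustering into structurally related sets and the interactive plot of the published tool are not reproduced.

```python
from collections import deque
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def iterative_stocsy(peaks, driver, threshold=0.8):
    """Expand from a driver peak to every peak reachable through links with r > threshold.

    peaks: list of intensity vectors, one per peak, measured across the same samples.
    Returns (visited peak indices in discovery order, {(i, j): r} for each link, i < j).
    """
    visited = [driver]
    seen = {driver}
    edges = {}
    queue = deque([driver])
    while queue:
        current = queue.popleft()
        # One STOCSY-style pass: correlate the current driver with every other peak.
        for j in range(len(peaks)):
            if j == current:
                continue
            r = pearson(peaks[current], peaks[j])
            if r > threshold:
                edges[(min(current, j), max(current, j))] = r
                if j not in seen:
                    seen.add(j)
                    visited.append(j)
                    queue.append(j)  # j becomes a driver in a later round
    return visited, edges


# Toy data: peak 0 correlates with peak 1 (r ~ 0.89), peak 1 with peak 2 (r ~ 0.87),
# but peak 0 with peak 2 only weakly (r ~ 0.55); peak 3 is unrelated.
peaks = [
    [11.0, 11.0, 9.0, 9.0],
    [11.5, 10.5, 9.5, 8.5],
    [12.5, 9.5, 10.5, 7.5],
    [11.0, 9.0, 9.0, 11.0],
]
visited, edges = iterative_stocsy(peaks, driver=0, threshold=0.8)
print(visited, sorted(edges))
```

In this toy case a single STOCSY pass from peak 0 would recover only peak 1 at the chosen threshold, whereas the iterative expansion also reaches peak 2 through peak 1, which is the kind of indirect connection the abstract describes recovering in one automated run.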
Article
Background: Arterial calcifications are associated with increased cardiovascular risk, but the genetic basis of this association is unclear. Methods: We performed clinical, radiographic, and genetic studies in three families with symptomatic arterial calcifications. Single-nucleotide-polymorphism analysis, targeted gene sequencing, quantitative polymerase-chain-reaction assays, Western blotting, enzyme measurements, transduction rescue experiments, and in vitro calcification assays were performed. Results: We identified nine persons with calcifications of the lower-extremity arteries and hand and foot joint capsules: all five siblings in one family, three siblings in another, and one patient in a third family. Serum calcium, phosphate, and vitamin D levels were normal. Affected members of Family 1 shared a single 22.4-Mb region of homozygosity on chromosome 6 and had a homozygous nonsense mutation (c.662C→A, p.S221X) in NT5E, encoding CD73, which converts AMP to adenosine. Affected members of Family 2 had a homozygous missense mutation (c.1073G→A, p.C358Y) in NT5E. The proband of Family 3 was a compound heterozygote for c.662C→A and c.1609dupA (p.V537fsX7). All mutations found in the three families result in nonfunctional CD73. Cultured fibroblasts from affected members of Family 1 showed markedly reduced expression of NT5E messenger RNA, CD73 protein, and enzyme activity, as well as increased alkaline phosphatase levels and accumulated calcium phosphate crystals. Genetic rescue experiments normalized the CD73 and alkaline phosphatase activity in patients' cells, and adenosine treatment reduced the levels of alkaline phosphatase and calcification. Conclusions: We identified mutations in NT5E in members of three families with symptomatic arterial and joint calcifications. This gene encodes CD73, which converts AMP to adenosine, supporting a role for this metabolic pathway in inhibiting ectopic tissue calcification. 
(Funded by the National Human Genome Research Institute and the National Heart, Lung, and Blood Institute of the National Institutes of Health.)
Article
The production of 'global' metabolite profiles involves measuring low molecular-weight metabolites (<1 kDa) in complex biofluids/tissues to study perturbations in response to physiological challenges, toxic insults or disease processes. Information-rich analytical platforms, such as mass spectrometry (MS), are needed. Here we describe the application of ultra-performance liquid chromatography-MS (UPLC-MS) to urinary metabolite profiling, including sample preparation, stability/storage and the selection of chromatographic conditions that balance metabolome coverage, chromatographic resolution and throughput. We discuss quality control and metabolite identification, as well as provide details of multivariate data analysis approaches for analyzing such MS data. Using this protocol, the analysis of a sample set in 96-well plate format, would take ca. 30 h, including 1 h for system setup, 1-2 h for sample preparation, 24 h for UPLC-MS analysis and 1-2 h for initial data processing. The use of UPLC-MS for metabolic profiling in this way is not faster than the conventional HPLC-based methods but, because of improved chromatographic performance, provides superior metabolome coverage.
Article
Serum metabolite concentrations provide a direct readout of biological processes in the human body, and they are associated with disorders such as cardiovascular and metabolic diseases. We present a genome-wide association study (GWAS) of 163 metabolic traits measured in human blood from 1,809 participants from the KORA population, with replication in 422 participants of the TwinsUK cohort. For eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH and SLC16A9), the genetic variant is located in or near genes encoding enzymes or solute carriers whose functions match the associating metabolic traits. In our study, the use of metabolite concentration ratios as proxies for enzymatic reaction rates reduced the variance and yielded robust statistical associations with P values ranging from 3 × 10⁻²⁴ to 6.5 × 10⁻¹⁷⁹. These loci explained 5.6%-36.3% of the observed variance in metabolite concentrations. For several loci, associations with clinically relevant parameters have been reported previously.
Article
Here we present a novel method for enhanced NMR spectral information recovery, utilizing a statistical total correlation spectroscopy editing (STOCSY-E) procedure for the identification of drug metabolite peaks in biofluids and for deconvolution of drug and endogenous metabolite signals. Structurally correlated peaks from drug metabolites and those from closely related drug metabolite pathways are first identified using STOCSY. Subsequently, this correlation information is utilized to scale the biofluid ¹H NMR spectra across these identified regions, producing a modified set of spectra in which drug metabolite contributions are reduced and, thus, facilitating analysis by pattern recognition methods without drug metabolite interferences. The application of STOCSY-E is illustrated with two exemplar ¹H NMR spectroscopic data sets, posing various drug metabolic, toxicological, and analytical challenges viz. 800 MHz ¹H spectra of human urine (n = 21) collected over 10 h following dosing with the antibiotic flucloxacillin and 600 MHz ¹H NMR spectra of rat urine (n = 27) collected over 48 h following exposure to the renal papillary toxin 2-bromoethanamine (BEA). STOCSY-E efficiently identified and removed the major xenobiotic metabolite peaks in both data sets, providing enhanced visualization of endogenous changes via orthogonal to projection filtered partial least-squares discriminant analysis (OPLS-DA). OPLS-DA of the STOCSY-E spectral data from the BEA-treated rats revealed the gut bacterial-mammalian co-metabolite phenylacetylglycine as a previously unidentified surrogate biomarker of toxicity. STOCSY-E has a wide range of potential applications in clinical, epidemiology, toxicology, and nutritional studies where multiple xenobiotic metabolic interferences may confound biological interpretation.
Additionally, this tool could prove useful for applications outside of metabolic analysis, for example, in process chemistry for following chemical reactions and equilibria and detecting impurities.
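The two steps described in this abstract, finding variables correlated with a driver peak and then scaling them down, can be sketched as follows. This is a minimal illustration and not the published STOCSY-E procedure: the hard threshold of 0.9, the choice to zero the flagged variables, the toy intensities, and the function names `pearson`, `stocsy` and `edit_spectra` are all assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def stocsy(spectra, driver):
    """Correlate every spectral variable with the driver variable across samples.

    spectra: list of samples, each a list of intensities on a common axis.
    """
    columns = list(zip(*spectra))
    return [pearson(columns[driver], column) for column in columns]


def edit_spectra(spectra, driver, threshold=0.9):
    """Zero every variable whose correlation with the driver reaches the threshold
    (an assumed, deliberately crude scaling rule)."""
    weights = [0.0 if r >= threshold else 1.0 for r in stocsy(spectra, driver)]
    return [[value * w for value, w in zip(row, weights)] for row in spectra]


# Toy data: variables 0 and 2 come from one hypothetical drug metabolite,
# variables 1 and 3 are unrelated endogenous signals.
drug = [1.0, 2.0, 3.0, 4.0, 5.0]
endo_a = [2.0, 1.0, 2.5, 1.5, 2.0]
endo_b = [1.0, 3.0, 1.0, 3.0, 2.0]
spectra = [[d, a, 0.5 * d, b] for d, a, b in zip(drug, endo_a, endo_b)]

print([round(r, 2) for r in stocsy(spectra, driver=0)])
print(edit_spectra(spectra, driver=0)[0])
```

A smoother weighting, for example one that decreases gradually with the correlation coefficient, would be an equally plausible choice; the hard cut is used here only to keep the example short.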
Article
Metabolomics is a newly emerging field of 'omics' research that is concerned with characterizing large numbers of metabolites using NMR, chromatography and mass spectrometry. It is frequently used in biomarker identification and the metabolic profiling of cells, tissues or organisms. The data processing challenges in metabolomics are quite unique and often require specialized (or expensive) data analysis software and a detailed knowledge of cheminformatics, bioinformatics and statistics. In an effort to simplify metabolomic data analysis while at the same time improving user accessibility, we have developed a freely accessible, easy-to-use web server for metabolomic data analysis called MetaboAnalyst. Fundamentally, MetaboAnalyst is a web-based metabolomic data processing tool not unlike many of today's web-based microarray analysis packages. It accepts a variety of input data (NMR peak lists, binned spectra, MS peak lists, compound/concentration data) in a wide variety of formats. It also offers a number of options for metabolomic data processing, data normalization, multivariate statistical analysis, graphing, metabolite identification and pathway mapping. In particular, MetaboAnalyst supports such techniques as: fold change analysis, t-tests, PCA, PLS-DA, hierarchical clustering and a number of more sophisticated statistical or machine learning methods. It also employs a large library of reference spectra to facilitate compound identification from most kinds of input spectra. MetaboAnalyst guides users through a step-by-step analysis pipeline using a variety of menus, information hyperlinks and check boxes. Upon completion, the server generates a detailed report describing each method used, embedded with graphical and tabular outputs. MetaboAnalyst is capable of handling most kinds of metabolomic data and was designed to perform most of the common kinds of metabolomic data analyses. MetaboAnalyst is accessible at http://www.metaboanalyst.ca.
Article
The rapidly evolving field of metabolomics aims at a comprehensive measurement of ideally all endogenous metabolites in a cell or body fluid. It thereby provides a functional readout of the physiological state of the human body. Genetic variants that associate with changes in the homeostasis of key lipids, carbohydrates, or amino acids are not only expected to display much larger effect sizes due to their direct involvement in metabolite conversion modification, but should also provide access to the biochemical context of such variations, in particular when enzyme coding genes are concerned. To test this hypothesis, we conducted what is, to the best of our knowledge, the first GWA study with metabolomics based on the quantitative measurement of 363 metabolites in serum of 284 male participants of the KORA study. We found associations of frequent single nucleotide polymorphisms (SNPs) with considerable differences in the metabolic homeostasis of the human body, explaining up to 12% of the observed variance. Using ratios of certain metabolite concentrations as a proxy for enzymatic activity, up to 28% of the variance can be explained (p-values 10⁻¹⁶ to 10⁻²¹). We identified four genetic variants in genes coding for enzymes (FADS1, LIPC, SCAD, MCAD) where the corresponding metabolic phenotype (metabotype) clearly matches the biochemical pathways in which these enzymes are active. Our results suggest that common genetic polymorphisms induce major differentiations in the metabolic make-up of the human population. This may lead to a novel approach to personalized health care based on a combination of genotyping and metabolic characterization. These genetically determined metabotypes may subscribe the risk for a certain medical phenotype, the response to a given drug treatment, or the reaction to a nutritional intervention or environmental challenge.
Article
A two-dimensional (2D) correlation method generally applicable to various types of spectroscopy, including IR and Raman spectroscopy, is introduced. In the proposed 2D correlation scheme, an external perturbation is applied to a system while being monitored by an electromagnetic probe. With the application of a correlation analysis to spectral intensity fluctuations induced by the perturbation, new types of spectra defined by two independent spectral variable axes are obtained. Such two-dimensional correlation spectra emphasize spectral features not readily observable in conventional one-dimensional spectra. While a similar 2D correlation formalism has already been developed in the past for analysis of simple sinusoidally varying IR signals, the newly proposed formalism is designed to handle signals fluctuating as an arbitrary function of time, or any other physical variable. This development makes the 2D correlation approach a universal spectroscopic tool, generally applicable to a very wide range of applications. The basic property of 2D correlation spectra obtained by the new method is described first, and several spectral data sets are analyzed by the proposed scheme to demonstrate the utility of generalized 2D correlation spectra. Potential applications of this 2D correlation approach are then explored.
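A compact way to compute such correlation spectra from a set of spectra recorded along a perturbation is sketched below. The abstract does not spell out a computational recipe; this sketch assumes the discrete Hilbert-Noda matrix formulation that is commonly used for generalized 2D correlation, takes the mean spectrum as the reference, and assumes evenly spaced perturbation steps. The function names are illustrative.

```python
import math


def dynamic_spectra(spectra):
    """Subtract the mean spectrum (the assumed reference) from every spectrum."""
    m = len(spectra)
    mean = [sum(column) / m for column in zip(*spectra)]
    return [[value - mu for value, mu in zip(row, mean)] for row in spectra]


def hilbert_noda(m):
    """Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere."""
    return [[0.0 if j == k else 1.0 / (math.pi * (k - j)) for k in range(m)]
            for j in range(m)]


def correlation_2d(spectra):
    """Return (synchronous, asynchronous) correlation matrices.

    spectra: m spectra (rows) sampled along the perturbation, n variables each.
    """
    y = dynamic_spectra(spectra)
    m, n = len(y), len(y[0])
    noda = hilbert_noda(m)
    # z = N y: each variable's dynamic signal transformed along the perturbation axis
    z = [[sum(noda[j][k] * y[k][i] for k in range(m)) for i in range(n)]
         for j in range(m)]
    sync = [[sum(y[j][a] * y[j][b] for j in range(m)) / (m - 1) for b in range(n)]
            for a in range(n)]
    asyn = [[sum(y[j][a] * z[j][b] for j in range(m)) / (m - 1) for b in range(n)]
            for a in range(n)]
    return sync, asyn


# Toy example: bands 0 and 1 vary in phase, band 2 varies out of phase with them.
m = 16
spectra = [[math.sin(2 * math.pi * t / m),
            2.0 * math.sin(2 * math.pi * t / m),
            math.cos(2 * math.pi * t / m)] for t in range(m)]
sync, asyn = correlation_2d(spectra)
print(round(sync[0][1], 3), round(asyn[0][1], 3), round(asyn[0][2], 3))
```

In this toy case the in-phase bands give a positive synchronous cross peak and essentially no asynchronous one, whereas the out-of-phase pair gives a clear asynchronous cross peak. The asynchronous matrix is antisymmetric because the Hilbert-Noda matrix is.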
Article
We describe a new multivariate statistical approach to recover metabolite structure information from multiple ¹H NMR spectra in population sample sets. SubseT Optimization by Reference Matching (STORM) was developed to select subsets of ¹H NMR spectra that contain specific spectroscopic signatures of biomarkers differentiating between different human populations. STORM aims to improve the visualization of structural correlations in spectroscopic data using these reduced spectral subsets containing smaller numbers of samples than the number of variables (n ≪ p). We have used 'statistical shrinkage' to limit the number of false positive associations and to simplify the overall interpretation of the auto-correlation matrix. The STORM approach has been applied to findings from an on-going human Metabolome-Wide Association study on Body Mass Index to identify a biomarker metabolite present in a subset of the population. Moreover, we have shown how STORM improves the visualization of more abundant NMR peaks compared to a previously published method (STOCSY). STORM is a useful new tool for biomarker discovery in the 'omic' sciences that has a widespread applicability. It can be applied to any type of data, provided that there is interpretable correlation among variables, and can also be applied to data with more than 1 dimension (e.g. 2D-NMR spectra).
Article
Two-dimensional infrared (2D IR) spectroscopy, a novel technique based on time-resolved IR spectroscopy, is introduced. In 2D IR, a system is excited by an external perturbation, which induces a dynamic fluctuation of the IR spectrum. A correlation analysis is applied to the time-dependent IR signals to yield a spectrum defined by two independent wavenumbers. By spreading IR peaks over the second dimension, a complex spectrum consisting of overlapped peaks can be substantially simplified, and spectral resolution is enhanced. Peaks located on a 2D spectral plane provide information on connectivity and interactions among functional groups associated with the IR bands. 2D IR spectra are presented for a system consisting of a mixture of atactic polystyrene (PS) and low-density polyethylene (PE) to illustrate these features. The spectroscopic evidence clearly shows PS and PE in a blend are segregated at the molecular level, allowing the components to respond to an applied external perturbation independently of each other. A substantial difference is observed in the local mobility of the backbone and side-group functionalities of PS. On the basis of this observation, it is possible to assign the 1459 cm⁻¹ component of the broad IR band centered around 1454 cm⁻¹ to the backbone CH₂ deformation in PS.
Article
Metabolomics is a growing area in the field of systems biology. Metabolomics already has a long history, and its connection with chemometrics also goes back some time. This review discusses the symbiosis of metabolomics and chemometrics with emphasis on the medical domain, puts the combination of the two in historical perspective and tries to give ideas for future research. Copyright © 2006 John Wiley & Sons, Ltd.
Article
During the previous decade, a new array of analytical methodologies and technologies were introduced related to the analysis of microbial, plant and animal metabolomes (complete collections of all low molecular weight compounds in a cell). The scientific field of metabolomics was born. In this review, we discuss advances in methodologies and technologies, and outline applications.
Article
Although bone fracture has become a serious global health issue, current clinical assessments of fracture risk based on bone mineral density are unable to accurately predict whether an individual is likely to suffer a fracture. There is increasing recognition that the chemical structure and composition, or microstructure, of mineralized tissues has an important role to play in determining the fracture resistance of bone. The objective of this preliminary study was to evaluate the use of specular reflectance Fourier transform infrared (SR FT-IR) microspectroscopy in conjunction with discriminant analysis as an innovative technique for providing future insights into the origins of orthopedic abnormalities. The impetus for this approach was that SR FT-IR microspectroscopy would offer several advantages over conventional transmission methods. Bone samples were obtained from young racehorses at known fracture predilection sites and spectra were successfully obtained from calcified cartilage and subchondral bone for the first time. By applying discriminant analysis to the spectral data set in biologically relevant regions, microstructural differences between groups of individuals were found to be related to features associated with both the mineral and organic components of the bone. The preliminary findings also suggest that differences in bone microstructure may exist between healthy individuals of the same age, raising important questions around the normal limits of individual variation and whether individuals may be predisposed to later fracture as a result of detrimental microstructural changes during early growth and development.
Article
A novel concept in vibrational spectroscopy called two-dimensional infrared (2D IR) spectroscopy is described. In 2D IR, a spectrum defined by two independent wavenumbers is generated by a cross-correlation analysis of dynamic fluctuations of IR signals induced by an external perturbation. 2D IR spectra are especially suited for elucidating various chemical interactions among functional groups. Notable features of the 2D IR approach are: simplification of complex spectra consisting of many overlapped peaks; enhancement of spectral resolution by spreading peaks over the second dimension; and establishment of unambiguous assignments through correlation analysis of bands selectively coupled by various interaction mechanisms. The procedure for generating 2D IR correlation spectra and the properties of the 2D spectra are discussed in detail. Examples of 2D IR spectra are presented for atactic polystyrene and the proteinaceous component of human stratum corneum to demonstrate the utility of this technique.
Article
The present review describes commonly employed metabolic profiling platforms and discusses the current and likely future application of these technologies in surgery. The metabolic adaptations that occur in response to surgical illness and trauma are incompletely understood. Evaluating these will be critical to the development of personalized surgical health solutions. Metabonomics is an advancing field in systems biology, which provides a means of interrogating these metabolic shifts. Recent literature regarding metabolic profiling technologies and their applications in surgical practice is discussed. Future strategies are proposed for the incorporation of these and next-generation technologies in the evaluation of all steps in the patient surgical pathway. Metabolite-based profiling has provided valuable insights into the metabolic irregularities that occur in cancer development and progression across a variety of cancer subclasses including colorectal, breast, prostate, and lung cancers. In addition, metabolic modeling has shown considerable promise in other surgical conditions including trauma and sepsis and in the assessment of pharmacotherapeutic efficacy. Metabonomics offers a posttranscriptional view of system activity providing functional information downstream of the genome and proteome. Information at this level will provide the surgeon with a novel means of evaluating major socioeconomic problems such as cancer and sepsis. In addition, the rapid nature of emerging next generation profiling platforms provides a viable means of "real-time" perioperative metabolic assessment and optimization.
Article
The high level of complexity in nuclear magnetic resonance (NMR) metabolic spectroscopic data sets has fueled the development of experimental and mathematical techniques that enhance latent biomarker recovery and improve model interpretability. We previously showed that statistical total correlation spectroscopy (STOCSY) can be used to edit NMR spectra to remove drug metabolite signatures that obscure metabolic variation of diagnostic interest. Here, we extend this "STOCSY editing" concept to a generalized scaling procedure for NMR data that enhances recovery of latent biochemical information and improves biological classification and interpretation. We call this new procedure STOCSY-scaling (STOCSY(S)). STOCSY(S) exploits the fixed proportionality in a set of NMR spectra between resonances from the same molecule to suppress or enhance features correlated with a resonance of interest. We demonstrate this new approach using two exemplar data sets: (a) a streptozotocin rat model (n = 30) of type 1 diabetes and (b) a human epidemiological study utilizing plasma NMR spectra of patients with metabolic syndrome (n = 67). In both cases significant biomarker discovery improvement was observed by using STOCSY(S): the approach successfully suppressed interfering NMR signals from glucose and lactate that otherwise dominate the variation in the streptozotocin study, which then allowed recovery of biomarkers such as glycine, which were otherwise obscured. In the metabolic syndrome study, we used STOCSY(S) to enhance variation from the high-density lipoprotein cholesterol peak, improving the prediction of individuals with metabolic syndrome from controls in orthogonal projections to latent structures discriminant analysis models and facilitating the biological interpretation of the results. 
Thus, STOCSY(S) is a versatile technique that is applicable in any situation in which variation, either biological or otherwise, dominates a data set at the expense of more interesting or important features. This approach is generally appropriate for many types of NMR-based complex mixture analyses and hence for wider applications in bioanalytical science.
Article
A new method is presented using an optical particle counter and the compact mobile laser mass spectrometer LAMPAS 3 for in situ analysis of single particles generated by electrosurgical dissection of biological tissues. The instrumental performance is demonstrated for analysing aerosol particles formed during rapid thermal evaporation of porcine liver and porcine kidney tissues. Particle number concentrations of up to 5,000 particles per cubic centimetre were detected during surgical dissection. Chemical analysis of tissue particles was performed by bipolar time-of-flight mass spectrometry. The application of an online mass spectrometric particle analysis for surgical aerosols is reported here for the first time.
Article
Direct combination of cavitron ultrasonic surgical aspirator (CUSA) and sonic spray ionization mass spectrometry is presented. A commercially available ultrasonic surgical device was coupled to a Venturi easy ambient sonic-spray ionization (V-EASI) source by directly introducing liquefied tissue debris into the Venturi air jet pump. The Venturi air jet pump was found to efficiently nebulize the suspended tissue material for gas phase ion production. The ionization mechanism involving solely pneumatic spraying was associated with that of sonic spray ionization. Positive and negative ionization spectra were obtained from brain and liver samples reflecting the primary application areas of the surgical device. Mass spectra were found to feature predominantly complex lipid-type constituents of tissues in both ion polarity modes. Multiply charged peptide anions were also detected. The influence of instrumental settings was characterized in detail. Venturi pump geometry and flow parameters were found to be critically important in ionization efficiency. Standard solutions of phospholipids and peptides were analyzed in order to test the dynamic range, sensitivity, and suppression effects. The spectra of the intact tissue specimens were found to be highly specific to the histological tissue type. A data analysis method based on principal component analysis (PCA) and linear discriminant analysis (LDA) was developed for real-time tissue identification in a surgical environment. The method has been successfully tested on post-mortem and ex vivo human samples including astrocytomas, meningiomas, metastatic brain tumors, and healthy brain tissue.
Article
Small molecules are central to biology, mediating critical phenomena such as metabolism, signal transduction, mating attraction, and chemical defense. The traditional categories that define small molecules, such as metabolite, secondary metabolite, pheromone, hormone, and so forth, often overlap, and a single compound can appear under more than one functional heading. Therefore, we favor a unifying term, biogenic small molecules (BSMs), to describe any small molecule from a biological source. In a similar vein, two major fields of chemical research, natural products chemistry and metabolomics, have as their goal the identification of BSMs, either as a purified active compound (natural products chemistry) or as a biomarker of a particular biological state (metabolomics). Natural products chemistry has a long tradition of sophisticated techniques that allow identification of complex BSMs, but it often fails when dealing with complex mixtures. Metabolomics thrives with mixtures and uses the power of statistical analysis to isolate the proverbial “needle from a haystack”, but it is often limited in the identification of active BSMs. We argue that the two fields of natural products chemistry and metabolomics have largely overlapping objectives: the identification of structures and functions of BSMs, which in nature almost inevitably occur as complex mixtures. Nuclear magnetic resonance (NMR) spectroscopy is a central analytical technique common to most areas of BSM research. In this Account, we highlight several different NMR approaches to mixture analysis that illustrate the commonalities between traditional natural products chemistry and metabolomics. The primary focus here is two-dimensional (2D) NMR; because of space limitations, we do not discuss several other important techniques, including hyphenated methods that combine NMR with mass spectrometry and chromatography.
We first describe the simplest approach of analyzing 2D NMR spectra of unfractionated mixtures to identify BSMs that are unstable to chemical isolation. We then show how the statistical method of covariance can be used to enhance the resolution of 2D NMR spectra and facilitate the semi-automated identification of individual components in a complex mixture. Comparative studies can be used with two or more samples, such as active vs inactive, diseased vs healthy, treated vs untreated, wild type vs mutant, and so on. We present two overall approaches to comparative studies: a simple but powerful method for comparing two 2D NMR spectra and a full statistical approach using multiple samples. The major bottleneck in all of these techniques is the rapid and reliable identification of unknown BSMs; the solution will require all the traditional approaches of both natural products chemistry and metabolomics as well as improved analytical methods, databases, and statistical tools.
Article
Supervised multivariate statistical analyses of NMR spectroscopic data sets are often required to identify metabolic differences between sample classes, and the use of orthogonal filters has proven to be highly efficient even when dealing with weak perturbations. In this note, we associate orthogonal filters to the recently reported recoupled-statistical total correlation spectroscopy (RSTOCSY). An initial supervised deflation of the spectral matrix is applied to remove all information orthogonal to the effect of interest and is followed by an RSTOCSY analysis to extract a list of pairs of metabolites that experience correlated perturbations. This list can then be used to find possibilities for the perturbed metabolic network. This supervised RSTOCSY approach, dubbed OR-STOCSY, yields metabolites related to perturbations of biological interest, even if they make a minor contribution to the global variance of a complex data set compared to other (possibly confounding) effects under study. The method is demonstrated with the application to genetic phenotypes in Caenorhabditis elegans.
Article
Laser desorption ionization-mass spectrometric (LDI-MS) analysis of vital biological tissues and native, ex vivo tissue specimens is described. It was found that LDI-MS analysis yields tissue specific data using lasers both in the ultraviolet and far-infrared wavelength regimes, while visible and near IR lasers did not produce informative MS data. LDI mass spectra feature predominantly phospholipid-type molecular ions both in positive and negative ion modes, similar to other desorption ionization methods. Spectra were practically identical to rapid evaporative ionization MS (REIMS) spectra of corresponding tissues, indicating a similar ion formation mechanism. LDI-MS analysis of intact tissues was characterized in detail. The effect of laser fluence on the spectral characteristics (intensity and pattern) was investigated in the case of both continuous wave and pulsed lasers at various wavelengths. Since lasers are utilized in various fields of surgery, a surgical laser system was combined with a mass spectrometer in order to develop an intraoperative tissue identification device. A surgical CO₂ laser was found to yield sufficiently high ion current during normal use. A principal component analysis-based data analysis method was developed for the quasi real-time identification of mass spectra. Performance of the system was demonstrated in the case of various malignant tumors of the gastrointestinal tract.
Article
Surgery remains the first and most important treatment modality for the majority of solid tumors. Across a range of brain tumor types and grades, postoperative residual tumor has a great impact on prognosis. The principal challenge and objective of neurosurgical intervention is therefore to maximize tumor resection while minimizing the potential for neurological deficit by preserving critical tissue. Our objective was to introduce the integration of desorption electrospray ionization mass spectrometry into surgery for in vivo molecular tissue characterization and intraoperative definition of tumor boundaries without systemic injection of contrast agents. Using a frameless stereotactic sampling approach and by integrating a 3-dimensional navigation system with an ultrasonic surgical probe, we obtained image-registered surgical specimens. The samples were analyzed with ambient desorption/ionization mass spectrometry and validated against standard histopathology. This new approach will enable neurosurgeons to detect tumor infiltration of the normal brain intraoperatively with mass spectrometry and to obtain spatially resolved molecular tissue characterization without any exogenous agent and with high sensitivity and specificity. Proof of concept is presented in using mass spectrometry intraoperatively for real-time measurement of molecular structure and using that tissue characterization method to detect tumor boundaries. Multiple sampling sites within the tumor mass were defined for a patient with a recurrent left frontal oligodendroglioma, World Health Organization grade II with chromosome 1p/19q codeletion, and mass spectrometry data indicated a correlation between lipid constitution and tumor cell prevalence.
The mass spectrometry measurements reflect a complex molecular structure and are integrated with frameless stereotaxy and imaging, providing 3-dimensional molecular imaging without systemic injection of any agents, which can be implemented for surgical margins delineation of any organ and with a rapidity that allows real-time analysis.
Article
Surgical trauma initiates a complex series of metabolic host responses designed to maintain homeostasis and ensure survival. ¹H NMR spectroscopy was applied to intraoperative urine and plasma samples as part of a strategy to analyze the metabolic response of Wistar rats to a laparotomy model. Spectral data were analyzed by multivariate statistical analysis. Principal component analysis (PCA) confirmed that surgical injury is responsible for the majority of the metabolic variability demonstrated between animals (R² urine = 81.2%, R² plasma = 80%). Further statistical analysis by orthogonal projection to latent structure discriminant analysis (OPLS-DA) allowed the identification of novel urinary metabolic markers of surgical trauma. Urinary levels of taurine, glucose, urea, creatine, allantoin, and trimethylamine-N-oxide (TMAO) were significantly increased after surgery whereas citrate and 2-oxoglutarate (2-OG) negatively correlated with the intraoperative state as did plasma levels of betaine and tyrosine. Plasma levels of lipoproteins such as VLDL and LDL also rose with the duration of surgery. Moreover, the microbial cometabolites 3-hydroxyphenylpropionate, phenylacetylglycine, and hippurate correlated with the surgical insult, indicating that the gut microbiota are highly sensitive to the global homeostatic state of the host. Metabonomic profiling provides a global overview of surgical trauma that has the potential to provide novel biomarkers for personalized surgical optimization and outcome prediction.
Article
The newly developed rapid evaporative ionization mass spectrometry (REIMS) provides the possibility of in vivo, in situ mass spectrometric tissue analysis. The experimental setup for REIMS is characterized in detail for the first time, and equipment capable of in vivo analysis is described and tested. The spectra obtained with various types of standard surgical equipment were compared and found to be highly specific to the histological type of the tissues. The tissue analysis is based on their different phospholipid distribution; the identification algorithm uses a combination of principal component analysis (PCA) and linear discriminant analysis (LDA). The characterized method was shown to be sensitive to perturbations such as age or diet in rats, but it was still perfectly suitable for tissue identification. Tissue identification accuracy higher than 97% was achieved with the PCA/LDA algorithm using a spectral database collected from various tissue species. In vivo, ex vivo, and post mortem REIMS studies were performed, and the method was found to be applicable for histological tissue analysis during surgical interventions, endoscopy, or after surgery in pathology.
Article
The development of Statistical Total Correlation Spectroscopy (STOCSY), a representation of the autocorrelation matrix of a spectral data set as a 2D pseudospectrum, has allowed more reliable assignment of one- and two-dimensional NMR spectra acquired from the complex mixtures that are usually used in metabolomics/metabonomics studies, thus, improving precise identification of candidate biomarkers contained in metabolic signatures computed by multivariate statistical analysis. However, the correlations obtained cannot always be interpreted in terms of connectivities between metabolites. In this study, we combine statistical recoupling of variables (SRV) and STOCSY to identify perturbed metabolite systems. The resulting Recoupled-STOCSY (R-STOCSY) method provides a 2D correlation landscape based on the SRV clusters representing physical, chemical, and biological entities. This enables the identification of correlations between distant clusters and extends the recoupling scheme of SRV, which was previously limited to the association of neighboring clusters. This allows the recovery of only meaningful correlations between metabolic signals and significantly enhances the interpretation of STOCSY. The method is validated through the measurement of the distances between the metabolites involved in these correlations, within the whole metabolic network, which shows that the average shortest path length is significantly shorter for the correlations detected in this new way compared to metabolite couples randomly selected from within the entire KEGG metabolic network. This enables the identification without any a priori knowledge of the perturbed metabolic network. The R-STOCSY completes the recoupling procedure between distant clusters, further reducing the high dimensionality of metabolomics/metabonomics data set and finally allows the identification of composite biomarkers, highlighting disruption of particular metabolic pathways within a global metabolic network. 
This allows the perturbed metabolic network to be extracted through NMR based metabolomics/metabonomics in an automated, and statistical manner.
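The validation idea described above, comparing network distances between correlated metabolite pairs with distances between other pairs, can be sketched with a breadth-first search. The network below is a hypothetical toy chain and not KEGG data; the node names, the pair lists, and the function names are assumptions for illustration, and no significance test is included.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Number of edges on the shortest path, or None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def mean_path_length(graph, pairs):
    """Average shortest path length over the pairs that are connected."""
    lengths = [shortest_path_length(graph, a, b) for a, b in pairs]
    reachable = [d for d in lengths if d is not None]
    return sum(reachable) / len(reachable) if reachable else None


# Hypothetical undirected toy network: a chain A - B - C - D - E - F.
toy_network = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}
correlated_pairs = [("A", "B"), ("C", "D")]    # stand-ins for detected correlations
comparison_pairs = [("A", "F"), ("B", "E")]    # stand-ins for randomly drawn pairs

print(mean_path_length(toy_network, correlated_pairs))
print(mean_path_length(toy_network, comparison_pairs))
```

In a real analysis the comparison set would be drawn at random many times so that the difference in mean path length could be tested statistically.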
Article
Intestinal ischemia/reperfusion (I/R) injury initiates a systemic inflammatory response syndrome with a high associated mortality rate. Early diagnosis is essential for reducing surgical mortality, yet current clinical biomarkers are insufficient. Metabonomics is a novel strategy for studying intestinal I/R, which may be used as part of a systems approach for quantitatively analyzing the intestinal microbiome during gut injury. By deconvolving the mammalian-microbial symbiotic relationship, systems biology thus has the potential for personalized risk stratification in patients exposed to intestinal I/R. This review describes the mechanism of intestinal I/R and explores the essential role of the intestinal microbiota in the initiation of systemic inflammatory response syndrome. Furthermore, it analyzes current and future approaches for elucidating the mechanism of this condition.
Article
Spectroscopic profiling of biological samples is an integral part of metabolically driven top-down systems biology and can be used for identifying biomarkers of toxicity and disease. However, optimal biomarker information recovery and resonance assignment still pose significant challenges in NMR-based complex mixture analysis. The reduced signal overlap as achieved when projecting two-dimensional (2D) J-resolved (JRES) NMR spectra can be exploited to mitigate this problem and, here, full-resolution ¹H JRES projections have been evaluated as a tool for metabolic screening and biomarker identification. We show that the recoverable information content in JRES projections is intrinsically different from that in the conventional one-dimensional (1D) and Carr-Purcell-Meiboom-Gill (CPMG) spectra, because of the combined result of reduction of the over-representation of highly split multiplet peaks and relaxation editing. Principal component and correlation analyses of full-resolution JRES spectral data demonstrated that peak alignment is necessary. The application of statistical total correlation spectroscopy (STOCSY) to JRES projections improved the identification of previously overlapped small molecule resonances in JRES ¹H NMR spectra, compared to conventional 1D and CPMG spectra. These approaches are demonstrated using a galactosamine-induced hepatotoxicity study in rats and show that JRES projections have a useful and complementary role to standard one-dimensional experiments in complex mixture analysis for improved biomarker identification.
Article
Metabolomics is a post-genomic research field concerned with developing methods for analysis of low molecular weight compounds in biological systems, such as cells, organs or organisms. Analyzing metabolic differences between unperturbed and perturbed systems, such as healthy volunteers and patients with a disease, can lead to insights into the underlying pathology. In metabolomics analysis, large amounts of data are routinely produced in order to characterize samples. The use of multivariate data analysis techniques and chemometrics is a common strategy for obtaining reliable results. Metabolomics has been applied in different fields such as disease diagnosis, toxicology, plant science and pharmaceutical and environmental research. In this review we take a closer look at the chemometric methods used and the available results within the field of disease diagnosis. We will first present some current strategies for performing metabolomics studies, especially regarding disease diagnosis. The main focus will be on data analysis strategies and validation of multivariate models, since there are many pitfalls in this regard. Further, we highlight the most interesting metabolomics publications and discuss these in detail; additional studies are mentioned as a reference for the interested reader. A general trend is an increased focus on biological interpretation rather than merely the ability to classify samples. In the conclusions, the general trends and some recommendations for improving metabolomics data analysis are provided.
Article
A novel mass spectrometric ionization technique based on rapid evaporation of biological tissues can be used to analyze vital tissues during surgical intervention as well as processed tissue specimens. A tissue identification system based on principal-component analysis was developed. The method differentiates malignant tumor cells from the surrounding healthy tissue.
Article
We present a new approach for analysis, information recovery, and display of biological ¹H nuclear magnetic resonance (NMR) spectral data, cluster analysis statistical spectroscopy (CLASSY), which profiles qualitative and quantitative changes in biofluid metabolic composition by utilizing a novel local-global correlation clustering scheme to identify structurally related spectral peaks and arrange metabolites by similarity of temporal dynamic variation. Underlying spectral data sets are presented in a novel graphical format to represent high-dimensionality biochemical information conveying both statistical metabolite relationships and their responses to experimental perturbation simultaneously in a high-throughput and intuitive manner. The method is exemplified using multiple 600 MHz ¹H NMR spectra of rat (n = 40) urine samples collected over 160 h following the development of experimental pancreatitis induced by L-arginine (ARG) and a wider range of model toxins including acetaminophen, galactosamine, and 2-bromoethanamine. The CLASSY approach deconvolutes complex biofluid mixture spectra into quantitative fold-change metabolic trajectories and clusters metabolites by commonalities of coexpression patterns. We demonstrate that the developing pathological processes cause coordinated changes in the levels of many compounds which share similar pathway connectivities. Variability in individual responses to toxin exposure is also readily detected and visualized allowing the assessment of interanimal variability. As an untargeted, unsupervised approach, CLASSY provides significant advantages in biological information recovery in terms of increased throughput, interpretability, and robustness and has wide potential metabonomic/metabolomic applications in clinical, toxicological, and nutritional studies of biofluids as well as in studies of cellular biochemistry, microbial fermentation monitoring, and functional genomics.
Article
Significance testing is a crucial step in metabolic biomarker recovery from the metabolome-wide latent variables computed by multivariate statistical analysis. In this study we propose an algorithm based on the landscape of the covariance/correlation ratio of consecutive variables along the chemical shift axis to restore, prior to significance testing, the spectral dependency and recouple variables in clusters which correspond to physical, chemical, and biological entities: statistical recoupling of variables (SRV). Variables are associated into a series of clusters, which are then considered as individual objects for the control of the false discovery rate. Compared to classical procedures, it is found that SRV allows efficient recovery of statistically significant metabolic variables. The proposed SRV method when associated with the Benjamini-Yekutieli correction retains a low level of significant variables in the noise areas of the nuclear magnetic resonance (NMR) spectrum, close to that observed using the conservative Bonferroni correction (false positive rate), while also allowing successful identification of statistically significant metabolic NMR signals in cases where the classical procedures of Benjamini-Yekutieli and Benjamini-Hochberg (false discovery rate) fail. This procedure improves the interpretability of latent variables for metabolic biomarker recovery.
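To make the comparison between the Bonferroni, Benjamini-Hochberg, and Benjamini-Yekutieli corrections concrete, here is a minimal Python sketch of the three standard procedures applied to a plain list of p-values. It does not implement the SRV clustering step itself; in SRV the tested objects would be the recoupled clusters rather than individual variables. The function name `reject_hypotheses`, its parameters, and the toy p-values are assumptions made for illustration.

```python
def reject_hypotheses(p_values, alpha=0.05, method="bh"):
    """Return one boolean per p-value: True where the null hypothesis is rejected.

    Hypothetical helper for illustration only.
    method: "bonferroni" (family-wise error control),
            "bh" (Benjamini-Hochberg step-up FDR control), or
            "by" (Benjamini-Yekutieli step-up, valid under arbitrary dependence).
    """
    m = len(p_values)
    if m == 0:
        return []
    if method == "bonferroni":
        # Each test is compared against alpha / m.
        return [p <= alpha / m for p in p_values]
    if method == "bh":
        c = 1.0
    elif method == "by":
        # Harmonic-sum penalty c(m) = 1 + 1/2 + ... + 1/m.
        c = sum(1.0 / i for i in range(1, m + 1))
    else:
        raise ValueError("method must be 'bonferroni', 'bh' or 'by'")

    # Step-up rule: find the largest rank k with p_(k) <= k * alpha / (m * c),
    # then reject the k smallest p-values.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c):
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Toy p-values (assumed, not from the study) showing how the corrections differ.
toy_p = [0.01, 0.02, 0.03, 0.04]
for name in ("bonferroni", "by", "bh"):
    print(name, sum(reject_hypotheses(toy_p, method=name)))
```

On these toy values the Bonferroni rule rejects one hypothesis, Benjamini-Yekutieli none, and Benjamini-Hochberg all four, which illustrates why the choice of correction matters for how many variables survive significance testing.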
Article
Covariation in the structural composition of the gut microbiome and the spectroscopically derived metabolic phenotype (metabotype) of a rodent model for obesity were investigated using a range of multivariate statistical tools. Urine and plasma samples from three strains of 10-week-old male Zucker rats (obese (fa/fa, n=8), lean (fa/-, n=8) and lean (-/-, n=8)) were characterized via high-resolution ¹H NMR spectroscopy, and in parallel, the fecal microbial composition was investigated using fluorescence in situ hybridization (FISH) and denaturing gradient gel electrophoresis (DGGE) methods. All three Zucker strains had different relative abundances of the dominant members of their intestinal microbiota (FISH), with the novel observation of a Halomonas and a Sphingomonas species being present in the (fa/fa) obese strain on the basis of DGGE data. The two functionally and phenotypically normal Zucker strains (fa/- and -/-) were readily distinguished from the (fa/fa) obese rats on the basis of their metabotypes with relatively lower urinary hippurate and creatinine, relatively higher levels of urinary isoleucine, leucine and acetate and higher plasma LDL and VLDL levels typifying the (fa/fa) obese strain. Collectively, these data suggest a conditional host genetic involvement in selection of the microbial species in each host strain, and that both lean and obese animals could have specific metabolic phenotypes that are linked to their individual microbiomes.
Article
Structural assignment of resonances is an important problem in NMR spectroscopy, and statistical total correlation spectroscopy (STOCSY) is a useful tool aiding this process for small molecules in complex mixture analysis and metabolic profiling studies. STOCSY delivers intramolecular information (delineating structural connectivity) and in metabolism studies can generate information on pathway-related correlations. To understand further the behavior of STOCSY for structural assignment, we analyze the statistical distribution of structural and nonstructural correlations from 1050 ¹H NMR spectra of normal rat urine samples. We find that the distributions of structural/nonstructural correlations are significantly different (p < 10⁻¹¹²). From the area under the curve of the receiver operating characteristic (ROC AUC) we show that structural correlations exceed nonstructural correlations with probability AUC = 0.98. Through a bootstrap resampling approach, we demonstrate that sample size has a surprisingly small effect (e.g., AUC = 0.97 for a sample size of 50). We identify specific signatures in the correlation maps resulting from small matrix-derived variations in peak positions but find that their effect on discrimination of structural and nonstructural correlations is negligible for most metabolites. A correlation threshold of r > 0.89 is required to assign two peaks to the same metabolite with high probability (positive predictive value, PPV = 0.9), whereas sensitivity and specificity are equal at 93% for r = 0.22. To assess the wider applicability of our results, we analyze ¹H NMR spectra of urine from rats treated with 115 model toxins or physiological stressors. Across the data sets, we find that the thresholds required to obtain PPV = 0.9 are not significantly different and the degree of overlap between the structural and nonstructural distributions is always small (median AUC = 0.97).
The STOCSY method is effective for structural characterization under diverse biological conditions and sample sizes provided the degree of correlation resulting from nonstructural associations (e.g., from nonstationary processes) is small. This study validates the use of the STOCSY approach in the routine assignment of signals in NMR metabolic profiling studies and provides practical benchmarks against which researchers can interpret the results of a STOCSY analysis.
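As a rough illustration of the driver-peak calculation that underlies STOCSY, and of how the r > 0.89 benchmark reported above might be applied, the following Python sketch computes the covariance and Pearson correlation between one chosen spectral variable and every other variable across samples. The function names `stocsy` and `same_metabolite_candidates`, the list-of-rows data layout, and the toy intensities are assumptions made for illustration and are not taken from the study. The threshold was derived for rat urine data sets and may not transfer to other sample types.

```python
import math


def stocsy(spectra, driver):
    """Covariance and Pearson correlation of every variable with a driver variable.

    Hypothetical helper for illustration only.
    spectra: list of rows, one per sample; each row holds the intensity of
             every spectral variable (data point) for that sample.
    driver:  column index of the driver peak.
    Returns (covariances, correlations), each with one entry per variable.
    """
    n = len(spectra)
    if n < 2:
        raise ValueError("need at least two spectra")
    n_vars = len(spectra[0])

    # Mean-centre each variable (column) across samples.
    means = [sum(row[j] for row in spectra) / n for j in range(n_vars)]
    centred = [[row[j] - means[j] for row in spectra] for j in range(n_vars)]

    d = centred[driver]
    d_ss = sum(v * v for v in d)

    covariances, correlations = [], []
    for col in centred:
        sxy = sum(a * b for a, b in zip(d, col))
        ss = sum(v * v for v in col)
        covariances.append(sxy / (n - 1))  # sample covariance
        denom = math.sqrt(d_ss * ss)
        # A constant variable has no defined correlation; report 0.0 here.
        correlations.append(sxy / denom if denom > 0 else 0.0)
    return covariances, correlations


def same_metabolite_candidates(correlations, driver, threshold=0.89):
    """Indices of variables whose correlation with the driver exceeds the threshold."""
    return [j for j, r in enumerate(correlations) if j != driver and r > threshold]


# Toy data (assumed): 5 samples x 4 variables. Variable 1 is an exact multiple
# of variable 0, variable 2 is weakly anticorrelated, variable 3 is constant.
toy = [
    [1.0, 2.0, 5.0, 1.0],
    [2.0, 4.0, 1.0, 1.0],
    [3.0, 6.0, 4.0, 1.0],
    [4.0, 8.0, 2.0, 1.0],
    [5.0, 10.0, 3.0, 1.0],
]
covs, cors = stocsy(toy, driver=0)
print(same_metabolite_candidates(cors, driver=0))  # [1]
```

In practice the correlation is typically used to decide which peaks belong together, while the covariance preserves peak shape for display; the sketch returns both so either can be inspected.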
Article
Organisms often respond in complex and unpredictable ways to stimuli that cause disease or injury. By measuring and mathematically modelling changes in the levels of products of metabolism found in biological fluids and tissues, metabonomics offers fresh insight into the effects of diet, drugs and disease.