Effect of adjusting the temperature parameter in the contrastive learning loss on the distribution of molecules in the latent space, as visualized via the t-SNE algorithm. All drugs, fluorophores, and Recon2 metabolites are plotted, along with a random subset of 2000 natural products (as in [113]), shown for clarity. (A) Learning based purely on the cross-entropy objective function. (B-E) The temperature scalar (as in [112]) was varied between 0.02 and 0.5 as indicated; reducing t below this range led to numerical instabilities.
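The temperature referred to in this caption enters the contrastive loss as a divisor of the pairwise similarities before a softmax. Below is a minimal PyTorch sketch of such a temperature-scaled (NT-Xent-style) loss, assuming a batch arranged so that rows i and i+N are two views of the same molecule; it illustrates the mechanism only and is not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """z: (2N, d) latent vectors; rows i and i+N are positive pairs."""
    z = F.normalize(z, dim=1)                   # work with cosine similarities
    n = z.shape[0] // 2
    sim = z @ z.T / temperature                 # (2N, 2N) scaled similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))  # exclude self-similarity
    # each row's "correct class" is the index of its positive partner
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

Smaller temperatures sharpen the softmax and pull positive pairs into tighter clusters; very small values can overflow the scaled exponentials, consistent with the numerical instabilities noted above.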


Source publication
Article
Full-text available
The question of molecular similarity is central to cheminformatics and is usually assessed via a pairwise comparison based on vectors of properties or molecular fingerprints. We recently exploited variational autoencoders to embed 6M molecules in a chemical space, such that their (Euclidean) distance within the latent space so formed could be assessed...
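A minimal sketch of the similarity query this abstract describes, assuming a trained encoder has already produced the latent embeddings: similarity search then reduces to nearest neighbours under Euclidean distance.

```python
import numpy as np

def nearest_neighbours(query_z: np.ndarray, library_z: np.ndarray, k: int = 5):
    """query_z: (d,) latent vector of the query molecule; library_z: (n, d)."""
    dists = np.linalg.norm(library_z - query_z, axis=1)  # Euclidean distances
    order = np.argsort(dists)[:k]                        # indices of k closest
    return order, dists[order]
```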

Contexts in source publication

Context 1
... clock time for training an epoch on a single NVIDIA V100 GPU system was ca. 30 s and 23 min for the two datasets illustrated. Figure 3 gives an overall picture, using t-SNE [127,128], of the dataset used. Figure 3A recapitulates that published previously, using standard VAE-type ELBO/K-L divergence learning alone, while panels Figure 3B-E show the considerable effect of varying the temperature scalar (as in [112]) between 0.02 and 0.5 as indicated. ...
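The visualization step mentioned in this context can be sketched as follows; the perplexity value and colouring scheme are illustrative assumptions, not the paper's settings.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_latent_tsne(latent, labels, perplexity: float = 30.0):
    """latent: (n, d) array of embeddings; labels: length-n class names."""
    xy = TSNE(n_components=2, perplexity=perplexity, init="pca").fit_transform(latent)
    for cls in sorted(set(labels)):            # one colour per compound class
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        plt.scatter(xy[idx, 0], xy[idx, 1], s=4, label=cls)
    plt.legend()
    plt.show()
```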
Context 3
... can clearly be seen from Figure 3B-E that as the temperature was increased in the series 0.02, 0.05, 0.1, and 0.5, the tightness and therefore the separability of the clusters progressively decreased. For instance, looking mainly at the fluorophores (red colors) in the plotted latent space for each of the four temperatures, the separability and tightness of the clusters were best at the 0.02 and 0.05 temperatures. ...
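The visual judgement of tightness and separability described here could also be quantified, for instance with a silhouette score computed at each temperature. This is an illustrative add-on, not the paper's protocol; `embed_with_temperature` and `labels` are hypothetical stand-ins for a model retrained at each temperature and the compound-class labels.

```python
from sklearn.metrics import silhouette_score

# Hypothetical: embed_with_temperature(t) returns (n, d) latent vectors from a
# model trained with temperature t; labels holds the compound classes.
for t in (0.02, 0.05, 0.1, 0.5):
    latent = embed_with_temperature(t)
    score = silhouette_score(latent, labels)  # higher = tighter, better separated
    print(f"temperature={t}: silhouette={score:.3f}")
```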

Citations

... However, the vanilla Transformer has no explicit latent space, such as those found in RNN autoencoders, as used in [8]. There exist works in the literature that construct a Transformer-based autoencoder, such as FragNet (a contrastive learning-based Transformer model) [12] and ReLSO (Regularized Latent Space Optimization) [13]. However, these architectures employ differing approaches: contrastive learning in FragNet, and property prediction along with three regularization penalty terms in ReLSO. ...
... We require a latent space, along with a decoder, to construct a decision space for optimization, along with the ability to generate a molecule from the vectorized latent representation. Some models that fit these criteria include SMILES Transformer [22], FragNet [12], and MolMIM [15]. For our experimentation, we employ the FragNet architecture over SMILES Transformer, as it uses learnable compression methods and contrastive learning for latent space regularization. ...
... Using contrastive learning, we apply two latent Transformers for molecular generation, FragNet [12] and ReLSO [13], and determine the best model for latent molecular representation. Figures 2 and 3 illustrate the architectures of FragNet and ReLSO. ...
Article
Full-text available
Background: Drug design is a challenging and important task that requires the generation of novel and effective molecules that can bind to specific protein targets. Artificial intelligence algorithms have recently shown promising potential to expedite the drug design process. However, existing methods adopt multi-objective approaches, which limits the number of objectives. Results: In this paper, we expand this thread of research from the many-objective perspective, by proposing a novel framework that integrates a latent Transformer-based model for molecular generation with a drug design system that incorporates absorption, distribution, metabolism, excretion, and toxicity prediction, molecular docking, and many-objective metaheuristics. We compared the performance of two latent Transformer models (ReLSO and FragNet) on a molecular generation task and show that ReLSO outperforms FragNet in terms of reconstruction and latent space organization. We then explored six different many-objective metaheuristics based on evolutionary algorithms and particle swarm optimization on a drug design task involving potential drug candidates to the human lysophosphatidic acid receptor 1, a cancer-related protein target. Conclusion: We show that the multi-objective evolutionary algorithm based on dominance and decomposition performs best in terms of finding molecules that satisfy many objectives, such as high binding affinity, low toxicity, and high drug-likeness. Our framework demonstrates the potential of combining Transformers and many-objective computational intelligence for drug design.
... Transformers models [47][48][49] have shown to be greatly beneficial for molecular generative tasks as well. Yang et al. [48] proposed a Transformer-encoder-based generative model with transfer learning and reinforcement learning for designing new drugs with desirable activity against the protein target BRAF. ...
... Yang et al. [48] proposed a Transformer-encoder-based generative model with transfer learning and reinforcement learning for designing new drugs with desirable activity against the protein target BRAF. Shrivastava and Kell [47] established a novel and disentangled latent space by coupling Transformers with contrastive learning, which allows similar molecules to cluster together in an effective and interpretable way. ReLSO [49] combines the encoding ability of the Transformer with an autoencoder-type bottleneck that produces information-rich, interpretable, and low-dimensional latent representations, jointly generating protein sequences and predicting fitness from the latent representations. ...
... (1) Our experiments were conducted using deep generative models on the DEL framework. Even though our objective is to explore the potential of modern AI models for multi-target drug design, rather than compete with all the state-of-the-art AI models, it would be beneficial to expand our work by integrating the most recent models (such as equivariant neural networks [9,[31][32][33], diffusion models [42][43][44][45][46], and Transformers [47][48][49]) for multi-objective and multi-target drug design. With regard to the latent representation space, it is acknowledged that since the vanilla Transformer does not have an obvious latent space at the bottleneck, such as that produced by a VAE [47], it cannot be directly integrated in the DEL framework. ...
Preprint
Full-text available
Background: Drug discovery is a time-consuming and expensive process. Artificial intelligence (AI) methodologies have been adopted to cut costs and speed up the drug development process, serving as promising in silico approaches to efficiently design novel drug candidates targeting various health conditions. Most existing AI-driven drug discovery studies follow a single-target approach, which focuses on identifying compounds that bind a single target (i.e., a one-drug-one-target approach). Polypharmacology is a relatively new concept that takes a systematic approach to search for a compound (or a combination of compounds) that can bind two or more carefully selected protein biomarkers simultaneously to synergistically treat the disease. Recent studies have demonstrated that multi-target drugs offer superior therapeutic potential compared to single-target drugs. However, it is intuitively thought that searching for multi-target drugs is more challenging than finding single-target drugs. At present, it is unclear how AI approaches perform in designing multi-target drugs. Results: In this paper, we comprehensively investigated the performance of multi-objective AI approaches for multi-target drug design. Conclusion: Our findings are quite counterintuitive, demonstrating that AI approaches for multi-target drug design are in fact able to efficiently generate more high-quality novel compounds than single-target approaches while satisfying a number of constraints.
... Examples of autoencoder-based methods include ChemVAE (Gómez-Bombarelli et al. 2018) and AllSMILES VAE (Alperstein, Cherkasov, and Rolfe 2019). Transformer-based models include ChemBERTa (Chithrananda, Grand, and Ramsundar 2020), SMILES Transformer (Honda, Shi, and Ueda 2019), and FragNet (Shrivastava and Kell 2021). Less common, however, is the composed architecture of a transformer autoencoder. ...
... For molecular data, both SMILES and graph representations have been explored in the context of contrastive learning. The FragNet model proposed by Shrivastava and Kell (2021) utilized the normalized temperature-scaled cross-entropy (NT-Xent) loss (Sohn 2016) to map enumerated SMILES of identical molecules nearby in the latent space. As for graphs, Wang et al. (2021) similarly used the NT-Xent loss to maximize the agreement between pairs of augmented graphs ("views") describing the same molecule; here, each view (i.e. ...
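Positive pairs of the kind described here, enumerated SMILES of the same molecule, can be generated with RDKit's randomized SMILES writer. This sketch shows one plausible way to produce such pairs, not necessarily the exact enumeration used by FragNet.

```python
from rdkit import Chem

def enumerate_smiles(smiles: str, n: int = 4) -> list:
    """Return n randomized (non-canonical) SMILES spellings of one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return [Chem.MolToSmiles(mol, canonical=False, doRandom=True) for _ in range(n)]

# e.g. enumerate_smiles("c1ccccc1O") may yield ["Oc1ccccc1", "c1ccc(O)cc1", ...]
```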
Article
In deep learning for drug discovery, molecular representations are often based on sequences, known as SMILES, which allow for straightforward implementation of natural language processing methodologies, one being the sequence-to-sequence autoencoder. However, we observe that training an autoencoder solely on SMILES is insufficient to learn molecular representations that are semantically meaningful, where semantics are specified by the structural (graph-to-graph) similarities between molecules. We demonstrate by example that SMILES-based autoencoders may map structurally similar molecules to distant codes, resulting in an incoherent latent space that does not necessarily respect the semantic similarities between molecules. To address this shortcoming, we propose the Semantically-Aware Latent Space Autoencoder (SALSA) for molecular representations: a SMILES-based transformer autoencoder modified with a contrastive task aimed at learning graph-to-graph similarities between molecules. To accomplish this, we develop a novel dataset comprising sets of structurally similar molecules and opt for a supervised contrastive loss that is able to incorporate full sets of positive samples. We evaluate the semantic awareness of SALSA representations by comparing them to ablated counterparts, and show empirically that SALSA learns representations that maintain 1) structural awareness, 2) physicochemical awareness, 3) biological awareness, and 4) semantic continuity.
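A hedged sketch of a supervised contrastive loss in the spirit SALSA describes, where every pair of samples sharing a set label counts as a positive, so whole sets of structurally similar molecules attract one another. It follows the general SupCon formulation rather than SALSA's exact loss.

```python
import torch
import torch.nn.functional as F

def sup_con_loss(z: torch.Tensor, set_ids: torch.Tensor, temperature: float = 0.1):
    """z: (n, d) embeddings; set_ids: (n,) integer label per similarity set."""
    z = F.normalize(z, dim=1)
    sim = z @ z.T / temperature
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos_mask = (set_ids[:, None] == set_ids[None, :]) & ~self_mask
    # row-wise log-softmax with self-similarity excluded from the denominator
    log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float("-inf")),
                                     dim=1, keepdim=True)
    # average the log-probability over every positive of each anchor
    per_anchor = (log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -per_anchor.mean()
```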
... Attention mechanisms [47] were initially introduced for neural machine translation to allow models to focus on the most important input data. In deep clustering, they have been used to enhance the embedded representation in speech separation [48], but also combined with autoencoders for handwriting recognition [49] and molecular similarity [50]. The requirement of a two-step learning process in deep clustering algorithms derives from the different nature of the network and clustering losses, which hinders their integration. ...
Article
Full-text available
Deep learning has recently been used to extract the relevant features for representing input data, also in the unsupervised setting. However, state-of-the-art techniques focus mostly on algorithmic efficiency and accuracy rather than on mimicking the input manifold. On the contrary, competitive learning is a powerful tool for replicating the input distribution topology. It is cognitively/biologically inspired, as it is founded on Hebbian learning, a neuropsychological theory claiming that neurons can increase their specialization by competing for the right to respond to/represent a subset of the input data. This paper introduces a novel perspective by combining these two techniques: unsupervised gradient-based and competitive learning. The theory is based on the intuition that neural networks can learn topological structures by working directly on the transpose of the input matrix. For this purpose, the vanilla competitive layer and its dual are presented. The former is representative of a standard competitive layer for deep clustering, while the latter is trained on the transposed matrix. The equivalence of the layers is extensively proven both theoretically and experimentally. The dual competitive layer has better properties. Unlike the vanilla layer, it directly outputs the prototypes of the data inputs, while still allowing learning by backpropagation. More importantly, this paper proves theoretically that the dual layer is better suited for handling high-dimensional data (e.g., for biological applications), because the estimation of the weights is driven by a constraining subspace which does not depend on the input dimensionality, but only on the dataset cardinality. This paper has thus introduced a novel approach for unsupervised gradient-based competitive learning. This approach is very promising, both in the case of small datasets of high-dimensional data and for better exploiting the advantages of a deep architecture: the dual layer integrates perfectly with the deep layers. A theoretical justification is also given by using the analysis of the gradient flow for both the vanilla and dual layers.
... This still allows considerable discrimination; however, even a normalized vector of just 20 elements, in each of which an individual may be in the upper or lower half, admits 2^20 (approximately 1 million) possibilities. 109,117 In the case of microclots, we consider that (distributions in) the many thousands of individual metabolites [118][119][120] and proteins 121 in serum or plasma can potentially each affect the size and shape of the microclots. This "harvesting" of all the molecules to which the fibrinogen is exposed, and which then determines how it polymerizes, effectively concentrates the vast numbers of metabolites, proteins, and even transcripts into a smaller number of dimensions; the microclots essentially act as a surrogate for the metabolome, proteome, and transcriptome present in the plasma at the time of clotting. ...
Article
Full-text available
Microscopy imaging has enabled us to establish the presence of fibrin(ogen) amyloid (fibrinaloid) microclots in a range of chronic, inflammatory diseases. Microclots may also be induced by a variety of purified substances, often at very low concentrations. These molecules include bacterial inflammagens, serum amyloid A, and the S1 spike protein of severe acute respiratory syndrome coronavirus 2. Here, we explore which of the properties of these microclots might be used to contribute to differential clinical diagnoses and prognoses of the various diseases with which they may be associated. Such properties include the distributions of their size and number before and after the addition of exogenous thrombin, their spectral properties, the diameter of the fibers of which they are made, their resistance to proteolysis by various proteases, their cross-seeding ability, and the concentration dependence of their ability to bind small molecules, including fluorogenic amyloid stains. Measuring these microclot parameters, together with microscopy imaging itself, along with methodologies like proteomics and imaging flow cytometry, as well as more conventional assays such as those for cytokines, might open up the possibility of a much finer use of these microclot properties in generative methods for a future where personalized medicine will be standard procedure in the diagnosis of all clotting pathologies.
... ChemMaps [NMF17] uses PCA for visualizing correlations between compound datasets. FragNet [SK21] computes molecular similarity across huge databases and visualizes the distribution of molecules by applying t-SNE. Naveja et al. [NMF19] introduced constellation plots, identifying groups of compounds using t-SNE to interpret structure-activity relationships in chemical space. ...
Article
Full-text available
Exploratory analysis of the chemical space is an important task in the field of cheminformatics. For example, in drug discovery research, chemists investigate sets of thousands of chemical compounds in order to identify novel yet structurally similar synthetic compounds to replace natural products. Manually exploring the chemical space inhabited by all possible molecules and chemical compounds is impractical, and therefore presents a challenge. To fill this gap, we present ChemoGraph, a novel visual analytics technique for interactively exploring related chemicals. In ChemoGraph, we formalize a chemical space as a hypergraph and apply novel machine learning models to compute related chemical compounds. It uses a database to find related compounds from a known space and a machine learning model to generate new ones, which helps enlarge the known space. Moreover, ChemoGraph highlights interactive features that support users in viewing, comparing, and organizing computationally identified related chemicals. With a drug discovery usage scenario and initial expert feedback from a case study, we demonstrate the usefulness of ChemoGraph.
... The MegaMolBART model is not invariant to randomized SMILES inputs, even though SMILES randomization was used as data augmentation during training [52]. Contrastive training approaches have been proposed [58] to enforce SMILES invariance in the model, but this is outside the scope of this study due to the computational effort required to pretrain the model from scratch. ...
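The invariance property at issue here can be probed directly: embed several randomized spellings of one molecule and measure how far the embeddings spread. `encode` below is a stand-in for whatever model maps a SMILES string to a vector; the check itself is an illustrative assumption, not a procedure from the cited study.

```python
import numpy as np

def invariance_spread(encode, smiles_variants) -> float:
    """Mean distance of each variant's embedding from their centroid (0 = invariant)."""
    zs = np.stack([encode(s) for s in smiles_variants])  # (n, d) embeddings
    centroid = zs.mean(axis=0)
    return float(np.linalg.norm(zs - centroid, axis=1).mean())
```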
Preprint
Full-text available
Explainability techniques are crucial in gaining insights into the reasons behind the predictions of deep learning models, which have not yet been applied to chemical language models. We propose an explainable AI technique that attributes the importance of individual atoms towards the predictions made by these models. Our method backpropagates the relevance information towards the chemical input string and visualizes the importance of individual atoms. We focus on self-attention Transformers operating on molecular string representations and leverage a pretrained encoder for finetuning. We showcase the method by predicting and visualizing solubility in water and organic solvents. We achieve competitive model performance while obtaining interpretable predictions, which we use to inspect the pretrained model.
... [5] A latent space representation of organic molecules was constructed with a transformer model and a contrastive learning approach [22]. A transformer model, pre-trained on ChEMBL to generate valid SMILES, was trained on molecules active against a given protein target by transfer learning and reinforcement learning, and used to generate novel molecules [8]. Note that in all these papers the generated molecules were validated only on the basis of their easily computable properties, such as logP and QED (typically, by comparing the distributions of these properties with those of the input or reference sets); the most advanced comparison, as far as we know, used docking scores and similarity to known bioactive molecules. ...
Preprint
Full-text available
Meaningful exploration of the chemical space of druglike molecules in drug design is a highly challenging task due to a combinatorial explosion of possible modifications of molecules. In this work, we address this problem with transformer models, a type of machine learning (ML) model with recently demonstrated success in applications to machine translation and other tasks. By training transformer models on pairs of similar bioactive molecules from the public ChEMBL dataset, we enable them to learn medicinal-chemistry-meaningful, context-dependent transformations of molecules, including those absent from the training set. Most generated molecules are highly plausible and follow similar distributions of simple properties (molecular weight, polarity, hydrogen bond donor and acceptor numbers) to the training dataset. By retrospective analysis of the performance of transformer models on ChEMBL subsets of ligands binding to the COX2, DRD2, or HERG protein targets, we demonstrate that the models can generate structures identical or highly similar to highly active ligands, despite the models not having seen any ligands active against the corresponding protein target during training. Thus, our work demonstrates that transformer models, originally developed to translate texts from one natural language to another, can be easily and quickly extended to "translations" from known molecules active against a given protein target to novel molecules active against the same target, and thereby contribute to hit expansion in drug design.
... Compared with traditional representation methods, automatic molecular representation learning models perform better on most drug discovery tasks [23][24][25]. With the rise of unsupervised learning in natural language processing 26,27, recent approaches that incorporate unsupervised learning with one-dimensional sequential strings, such as the simplified molecular-input line-entry system (SMILES) [28][29][30][31] and the International Chemical Identifier (InChI) [32][33][34], or two-dimensional (2D) graphs [35][36][37][38][39], have also been developed for various computational drug discovery tasks. Yet their accuracy in extracting informative vectors that describe the molecular identities and biological characteristics of molecules is limited. ...
Article
Full-text available
The clinical efficacy and safety of a drug is determined by its molecular properties and targets in humans. However, proteome-wide evaluation of all compounds in humans, or even animal models, is challenging. In this study, we present an unsupervised pretraining deep learning framework, named ImageMol, pretrained on 10 million unlabelled drug-like, bioactive molecules, to predict molecular targets of candidate compounds. The ImageMol framework is designed to pretrain chemical representations from unlabelled molecular images on the basis of local and global structural characteristics of molecules from pixels. We demonstrate high performance of ImageMol in evaluation of molecular properties (that is, the drug’s metabolism, brain penetration and toxicity) and molecular target profiles (that is, beta-secretase enzyme and kinases) across 51 benchmark datasets. ImageMol shows high accuracy in identifying anti-SARS-CoV-2 molecules across 13 high-throughput experimental datasets from the National Center for Advancing Translational Sciences. Via ImageMol, we identified candidate clinical 3C-like protease inhibitors for potential treatment of COVID-19.