Article

GRid-INdependent Descriptors (GRIND): A Novel Class of Alignment-Independent Three-Dimensional Molecular Descriptors


Abstract

Traditional methods for performing 3D-QSAR rely upon an alignment step that is often time-consuming and can introduce user bias, the resultant model being dependent upon and sensitive to the alignment used. There are several methods which overcome this problem, but in general the necessary transformations prevent a simple interpretation of the resultant models in the original descriptor space (i.e. 3D molecular coordinates). Here we present a novel class of molecular descriptors which we have termed GRid-INdependent Descriptors (GRIND). They are derived in such a way as to be highly relevant for describing biological properties of compounds while being alignment-independent, chemically interpretable, and easy to compute. GRIND are obtained starting from a set of molecular interaction fields, computed by the program GRID or by other programs. The procedure for computing the descriptors involves a first step, in which the fields are simplified, and a second step, in which the results are encoded into alignment-independent variables using a particular type of autocorrelation transform. The molecular descriptors so obtained can be used to obtain graphical diagrams called "correlograms" and can be used in different chemometric analyses, such as principal component analysis or partial least-squares. An important feature of GRIND is that, with the use of appropriate software, the original descriptors (molecular interaction fields) can be regenerated from the autocorrelation transform and, thus, the results of the analysis represented graphically, together with the original molecular structures, in 3D plots. In this respect, the article introduces the program ALMOND, a software package developed in our group for the computation, analysis, and interpretation of GRIND. The use of the methodology is illustrated using some examples from the field of 3D-QSAR. 
Highly predictive and interpretable models are obtained showing the promising potential of the novel descriptors in drug design.
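The two-step procedure described in the abstract (simplify the fields, then encode them with an autocorrelation-type transform) can be illustrated with a small sketch. The abstract does not spell out the transform, so the code below assumes a maximum-product variant: node pairs are binned by distance and each bin keeps the largest product of the two node energies. The function name `correlogram`, the bin settings, and the node coordinates and energies are all hypothetical and are not taken from ALMOND.

```python
import math
from itertools import combinations


def correlogram(nodes, bin_width=1.0, n_bins=10):
    """Encode one simplified interaction field as a distance profile.

    nodes: list of ((x, y, z), energy) tuples, i.e. the few field points
    kept after the simplification step (hypothetical input format).
    Entry k of the result is the largest energy product found among node
    pairs whose distance lies in [k * bin_width, (k + 1) * bin_width).
    """
    values = [0.0] * n_bins
    for (pos_a, e_a), (pos_b, e_b) in combinations(nodes, 2):
        k = int(math.dist(pos_a, pos_b) // bin_width)
        if k < n_bins:
            values[k] = max(values[k], e_a * e_b)
    return values


# Hypothetical nodes: coordinates in angstroms, energies in kcal/mol.
nodes = [
    ((0.0, 0.0, 0.0), -2.0),
    ((3.0, 0.0, 0.0), -1.5),
    ((0.0, 4.0, 0.0), -3.0),
]
profile = correlogram(nodes, bin_width=1.0, n_bins=6)

# Rotating the molecule by 90 degrees about z changes every coordinate
# but no inter-node distance, so the encoded profile is unchanged.
rotated = [((-y, x, z), e) for (x, y, z), e in nodes]
print(profile)
print(correlogram(rotated, bin_width=1.0, n_bins=6))
```

Because only inter-node distances enter the encoding, the profile does not depend on how the molecule is positioned or oriented, which is the sense in which such descriptors are alignment-independent. They still depend on the conformation, since a different conformer has different inter-node distances.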


... 3D-QSAR is a powerful approach to quantitatively and qualitatively extract common 3D structural elements in a group of small molecules that mediate binding to a specific domain(s) in a protein. The process involves the calculation of 3D variables, correlation of variables with the activities to build a 3D-QSAR model, and finally, filtration of variables to obtain the most relevant variables correlated with the activities in the 3D-QSAR model [111]. The built 3D-QSAR model can be used for the identification of structural elements required for effective receptor binding and high throughput screening to identify new drug candidates [112][113][114][115][116][117]. ...
... For this, the 3D structures of compounds were retrieved from the PubChem chemical database and then docked into the S1PR1 binding site using the GOLD program [118] to obtain active conformers. The active conformers were introduced to an alignment-independent 3D-QSAR program, namely Pentacle [111] where the 3D features for each compound against a virtual receptor (N1: hydrogen bond acceptor (HBA), O: hydrogen bond donor (HBD), TIP: molecular shape and DRY: hydrophobic interaction) were extracted for each compound using ALMOND algorithm. These features were encoded as the 3D variables according to the consistently large auto and cross-correlation (CLACC) algorithm and correlated with the corresponding observed S1PR1 modulatory activity to build a 3D-QSAR model using the partial least square (PLS) algorithm. ...
Article
Multiple sclerosis (MS) is a neurological disease that leads to severe physical and cognitive disabilities. Drugs used in the treatment of MS vary from small synthetic molecules to large macromolecules such as antibodies. Sphingosine 1-phosphate receptor modulators are frequently used for the treatment of MS. These medicines prevent the egress of lymphocytes from secondary lymphoid organs, leading to immune system suppression. Currently, four S1PR modulators are on the market and several potential drug candidates are in clinical trials for the treatment of MS. These compounds differ in chemical structure, adverse effects, and efficacy. The current article reviews the latest studies on S1PR1 modulators and compares them with other MS drugs in terms of efficacy, tolerability, and safety. A special focus was dedicated to discussing the structure-activity relationships of these compounds and performing a three-dimensional quantitative structure-activity relationship (3D-QSAR) analysis to gain better insight into the ligand-receptor interaction mode.
... Selected 3D molecular conformations of ligands obtained from clusters containing the maximum number of docked ligands, along with their inhibitory potencies (pIC50), were imported into Pentacle software version 1.06 to construct the GRIND model (Pastor et al., 2000). Calculation of molecular interaction fields (MIFs) was done by use of different probes, namely, N1, O, DRY, and TIP, where N1 (amide N) represents a hydrogen bond donor, O (sp2 carbonyl O) denotes a hydrogen bond acceptor, DRY indicates a hydrophobic region, and TIP stands for steric hotspots within the virtual receptor site. ...
... To select the most probable binding poses, common scaffold clustering was performed because, with GRID-independent molecular descriptors, the analysis of 3D structural features is highly dependent on the 3D conformations of the molecules (Pastor et al., 2000). Multiple clusters at 3.5 Å RMSD were generated, and binding poses from the clusters with the maximum number of docked ligands were further used to build the GRIND model. ...
Article
Full-text available
Leukotrienes (LTs) are pro-inflammatory lipid mediators derived from arachidonic acid (AA), and their high production has been reported in multiple allergic, autoimmune, and cardiovascular disorders. The biological synthesis of leukotrienes is instigated by transfer of AA to 5-lipoxygenase (5-LO) via the 5-lipoxygenase-activating protein (FLAP). Suppression of FLAP can inhibit LT production at the earliest level, providing relief to patients requiring anti-leukotriene therapy. Over the last three decades, several FLAP modulators have been synthesized and pharmacologically tested, but none of them has reached the market. Therefore, it is highly desirable to unveil the structural requirements of FLAP modulators. In this study, supervised machine learning techniques and molecular modeling strategies are used to predict the important 2D and 3D anti-inflammatory properties of structurally diverse FLAP inhibitors, respectively. For this purpose, multiple machine learning classification models have been developed to reveal the most relevant 2D features. Furthermore, to probe the 3D molecular basis of interaction of diverse anti-inflammatory compounds with FLAP, molecular docking studies were executed. By using the most probable binding poses from docking studies, the GRIND model was developed, which indicated the positive contribution of four hydrophobic, two hydrogen bond acceptor, and two shape-based features at certain distances from each other towards the inhibitory potency of FLAP modulators. Collectively, this study sheds light on important two-dimensional and three-dimensional structural requirements of FLAP modulators that can potentially guide the development of more potent chemotypes for the treatment of inflammatory disorders.
... accessed: 21st November 2016) and converted to 3D using Corina version 3.494 (Sadowski et al., 1994). These were then used to generate GRIND2 descriptors (Pastor et al., 2000;Duran et al., 2009) making use of Pentacle software version 1.0.6 (www.moldiscovery.com/software/pentacle), with default settings. The resulting molecular descriptors were then projected into the principal component analysis (PCA) scores obtained for a collection of 8298 ToxCast and Tox21 compounds (USEPA, 2016) characterized using a similar procedure (see Supporting Information Excel File). ...
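Projecting new compounds into PCA scores obtained for a reference collection amounts to centering each new descriptor vector with the reference means and taking dot products with the reference loadings. The following is a minimal sketch of that arithmetic only. The function name `project`, the three-variable descriptor vector, the means, and the loadings are hypothetical. It also assumes the descriptors need no scaling beyond mean-centering, which the excerpt does not state.

```python
def project(descriptor, mean, loadings):
    """Return the scores of one descriptor vector in an existing PCA space.

    mean: per-variable means of the reference collection.
    loadings: one list of per-variable weights for each principal component.
    """
    centered = [x - m for x, m in zip(descriptor, mean)]
    return [sum(c * w for c, w in zip(centered, axis)) for axis in loadings]


# Hypothetical reference model with three variables and two components.
mean = [1.0, 2.0, 3.0]
loadings = [
    [1.0, 0.0, 0.0],
    [0.0, 0.6, 0.8],
]

# Hypothetical new compound, described with the same three variables.
scores = project([2.0, 7.0, 3.0], mean, loadings)
print(scores)
```

The reference model is not refitted, so the new compounds are placed on the axes defined by the reference collection and do not influence those axes.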
... illustrated that both properties were weakly correlated (r = 0.51), while MW and logP were largely independent (r = 0.2) for the screen hits. To obtain a less biased overview of compound properties, grid-independent descriptors (Duran et al., 2009;Pastor et al., 2000) (GRIND2) were computed as described in the 'Methods' section. In order to anchor the display of the NTP80-list according to these descriptors, the same parameters were also obtained for a set of ToxCast and Tox21 compounds (n=8298). ...
... Both learning set compounds and candidate molecules were imported into Pentacle software (version 1.06 for Linux) [47] in a 3D sdf format. There they were oriented towards principal moments of inertia and protonated there at a physiological pH. ...
Article
Full-text available
The influenza A virus nonstructural protein 1 (NS1), which is crucial for viral replication and immune evasion, has been identified as a significant drug target with substantial potential to contribute to the fight against influenza. The emergence of drug-resistant influenza A virus strains highlights the urgent need for novel therapeutics. This study proposes a combined theoretical criterion for the virtual screening of molecular libraries to identify candidate NS1 inhibitors. By applying the criterion to the ZINC Natural Product database, followed by ligand-based virtual screening and molecular docking, we proposed the most promising candidate as a potential NS1 inhibitor. Subsequently, the selected natural compound was experimentally evaluated, revealing measurable virus replication inhibition activity in cell culture. This approach offers a promising avenue for developing novel anti-influenza agents targeting the NS1 protein.
... properties (AMSP), 32 comparative moment analysis (CoMMA), 33 weighted holistic invariant molecular descriptors (WHIM) 34 and grid independent descriptors (GRIND) methods. 35 Although such methods are not affected by molecular alignment, the 3D descriptors they rely on are still sensitive to the three-dimensional conformation of molecules. Therefore, the conformations of molecules used for the development of QSAR models using 3D descriptors are crucial for obtaining meaningful predictive models. ...
Article
QSAR models are widely and successfully used in many research areas. The success of such models highly depends on molecular descriptors typically classified as 1D, 2D, 3D, or 4D. While 3D information is likely important, e.g., for modeling ligand‐protein binding, previous comparisons between the performances of 2D and 3D descriptors were inconclusive. Yet in such comparisons the modeled ligands were not necessarily represented by their bioactive conformations. With this in mind, we mined the PDB for sets of protein‐ligand complexes sharing the same protein for which uniform activity data were reported. The results, totaling 461 structures spread across six series were compiled into a carefully curated, first of its kind dataset in which each ligand is represented by its bioactive conformation. Next, each set was characterized by 2D, 3D and 2D+3D descriptors and modeled using three machine learning algorithms, namely, k ‐Nearest Neighbors, Random Forest and Lasso Regression. Models’ performances were evaluated on external test sets derived from the parent datasets either randomly or in a rational manner. We found that many more significant models were obtained when combining 2D and 3D descriptors. We attribute these improvements to the ability of 2D and 3D descriptors to code for different, yet complementary molecular properties.
... Unlike other 3D-QSAR methodologies such as CoMFA and CoMSIA, which require prior alignment of molecules to calculate descriptors, Pentacle uses grid-independent descriptors (GRIND). 24 For calculation of these descriptors molecule was placed in 3D-grid and subjected to interactions with different probes that represent most common type of ligand-receptors interactions. These included DRY probe that represent hydrophobic interactions, N1 probe that represents hydrogen bond acceptor (HBA) groups, O probe representing hydrogen bond donors (HBD) and TIP probe describing steric interactions. ...
Article
Full-text available
Background/Aim: Therapy of diabetes mellitus type 2 includes drugs that act as inhibitors of the dipeptidyl peptidase 4 (DPP-4) enzyme. Several DPP-4 inhibitors are marketed today and, although they have a favourable safety profile and tolerability, they show moderate activity in controlling glycaemia. The 3D quantitative structure-activity relationship (3D-QSAR) methodology was employed in order to find the pharmacophore responsible for good DPP-4 inhibitory activity and to design new compounds with enhanced activity. Methods: For 3D-QSAR model development, 48 compounds structurally related to sitagliptin were collected from the ChEMBL database. Structures of all compounds were optimised in order to find the best 3D conformations prior to QSAR modelling. To establish the correlation between structure and biological activity, the Partial Least Squares (PLS) regression method integrated in Pentacle software was used. Results: Parameters of internal and external validation (R2 = 0.80, Q2 = 0.64 and R2 pred = 0.610) confirmed the reliability of the developed QSAR model. Analysis of the obtained structural descriptors enabled identification of key structural characteristics that influenced DPP-4 inhibitory activity. Based on that information, new compounds were designed, of which 35 compounds had a better predicted activity compared to sitagliptin. Conclusion: This QSAR model can be used for DPP-4 inhibitory activity prediction of structurally related compounds, and the resulting pharmacophore contains information useful for optimisation and design of new DPP-4 inhibitors. Finally, the authors propose the designed compounds for further synthesis and in vitro and in vivo testing as new potential DPP-4 inhibitors.
... Each characterizes the effect of an atom on the overall shape of a molecule. For example, farther atoms have larger ii than atoms closer to the center of the atom [25,26]. These descriptors are important because they are sensitive to the three-dimensional structure of the molecule. ...
Article
Indole and its derivatives are common heterocyclic compounds in nature that have a wide range of medicinal activities, such as antifungal, anti-inflammatory, and anti-seizure. Virtually all indole derivatives showed outstanding antifungal activity against Candida albicans. The aim of this study was QSAR modeling of indole derivatives and the design of new drugs that have antifungal activity. In this study, 52 compounds were selected. All optimized compounds and quantum descriptors were obtained using Gaussian software and the DFT/B3LYP computational method with the 6-31G(d) basis set; other descriptors were determined using Dragon software. To examine the relationship between these descriptors and the activity of these compounds, the MLR linear correlation method was used, and the QSAR equation with R² = 0.7884 and R = 0.8879 was obtained. Likewise, MSE = 0.1897, RMSE = 0.2848, and Q² = 0.68663 support the acceptability of the obtained model. The obtained equation reveals that the activity of these compounds is related to the negative coefficients of GATS8p, R7e+, and G2e, which means that as the values of these descriptors increase, the activity declines. On the other hand, the activity depends on the positive coefficients of HATS3p, MATS5e, and RDF045, i.e. as these values increase, the activity also increases, and a good correlation was obtained between the experimental and predicted activity values.
... Chemical structures were drawn in SYBYL7.3 (Version, S., 6.9, Tripos Associates, St. Louis, Mo, 2001) and the optimisation was performed applying Tripos force field with a distance-dependent dielectric and the Powell conjugate gradient algorithm with convergence criterion of 0.001 kcal/mol Å. Partial atomic charges and Grid Independent Descriptors (GRINDs) were calculated using Gasteiger-Huckel method and Pentacle 1.05 software, respectively (Pastor et al., 2000). Regarding the nature of the Grid Independent Descriptors, the final information provided by the GRINDs approach is directly correlated to the structures of inhibitors. ...
Article
Full-text available
Due to the fact that different isoforms of carbonic anhydrase play distinct physiological roles, their diseases/disorders involvement are different as well. Involvement in major disorders such as glaucoma, epilepsy, Alzheimer’s disease, obesity and cancers, have turned carbonic anhydrase into a popular case study in the field of rational drug design. Since carbonic anhydrases are highly similar with regard to their structures, selective inhibition of different isoforms has been a significant challenge. By applying a proteochemometrics approach, herein the chemical interaction space governed by acyl selenoureido benzensulfonamides and human carbonic anhydrases is explored. To assess the validity, robustness and predictivity power of the proteochemometrics model, a diverse set of validation methods was used. The final model is shown to provide valuable structural information that can be considered for new selective inhibitors design. Using the supplied information and to show the applicability of the constructed model, new compounds were designed. Monitoring of selectivity ratios of new designs shows very promising results with regard to their selectivity for a specific isoform of carbonic anhydrase.
... The virtual screening of compound candidates was carried by the use of the shortest centroid distance. The calculation was carried in Pentacle software version 1.06 for Linux [77][78][79]. From each of the three candidate groups, ten of the most similar molecules were selected and further submitted to molecular docking. ...
Article
Full-text available
Alzheimer’s disease (AD), a devastating neurodegenerative disease, is the focus of pharmacological research. One of the targets that attract the most attention for the potential therapy of AD is the serotonin 5-HT6 receptor, which is situated exclusively in the CNS on glutamatergic and GABAergic neurons. The neurochemical impact of this receptor supports the hypothesis about its role in cognitive, learning, and memory systems, which are of critical importance for AD. Natural products are a promising source of novel bioactive compounds with therapeutic potential as 5-HT6 receptor antagonists in the treatment of AD dementia. The ZINC Natural Product database was screened in silico in order to find candidate antagonists of the 5-HT6 receptor against AD. A virtual screening protocol that includes both short- and long-range interactions between interacting molecules was employed. First, the EIIP/AQVN filter was applied for in silico screening of the ZINC database, followed by 3D QSAR and molecular docking. The ten best candidate compounds were selected from the ZINC Natural Product database as potential 5-HT6 receptor antagonists and were proposed for further evaluation. The best candidate was evaluated by molecular dynamics simulations and free energy calculations.
... Finally, the maximum auto [...] 21 interaction distance 22 was used for encoding the MIFs. The remnant and encoded MIFs were considered as the GRID-independent molecular descriptors (GRIND) and correlated with the experimentally determined S1P1-agonistic activities to generate a 3D-QSAR model. Partial least square (PLS) algorithm was used for building the 3D-QSAR model. ...
Article
Full-text available
Purpose: Drug repurposing is an approach successfully used for the discovery of new therapeutic applications for existing drugs. The current study aimed to use a combination of in silico methods to identify FDA-approved drugs with possible S1P1 agonistic activity useful in multiple sclerosis (MS). Methods: For this, a 3D-QSAR model for 21 known S1P1 agonists was generated and used to predict the possible S1P1 agonistic activity of FDA-approved drugs. Then, the selected compounds were screened by docking into the S1P1 and S1P3 receptors to select the potent and S1P1-selective compounds. Further evaluation was carried out by molecular dynamics (MD) simulation studies, where the S1P1 binding energies of the selected compounds were calculated. Results: The analyses resulted in the identification of cobicistat, benzonatate and brigatinib as selective and potent S1P1 agonists with binding energies of -85.93, -69.77 and -67.44 kcal mol⁻¹, calculated using the MM-GBSA algorithm based on 50 ns MD simulation trajectories. These values are better than that of siponimod (-59.35 kcal mol⁻¹), an FDA-approved S1P1 agonist indicated for MS treatment. Furthermore, similarity network analysis revealed that cobicistat and brigatinib are the most structurally favorable compounds to interact with S1P1. Conclusion: The findings in this study revealed that cobicistat and brigatinib can be evaluated in experimental studies as potential S1P1 agonist candidates useful in the treatment of MS.
... To elucidate the relationship between the structure of compounds 1-12 and their antibacterial activity toward A. baumannii, we developed an alignment-independent 3D QSAR model in Pentacle 1.06 software. 23,24 The MIC50 values (Table 1) were converted into -logMIC50 and used as a dependent variable. The initial geometries of compounds were generated using MMFF94 force field 25 and further refined using the PM7 semi-empirical quantum chemical method, 26 implemented in MOPAC2016. ...
Article
Antimicrobial-resistance (AMR) has become the greatest concern and highly challenging issue when treating nosocomial infections. The exigency to develop new potent compounds continues to increase worldwide, whereby derivatives of natural products are becoming more attractive. In the present paper, the microbiological assessment of a series of 12 cinnamide hydrazides, four of them completely novel, against clinically relevant pathogens has discovered several derivatives with promising in vitro activities against Acinetobacter baumannii, one of the most dreaded opportunistic pathogens in hospitals. The compounds were synthesized by combining one of three different natural acids (cinnamic, 4-chloro or 4-methoxy) with four monothiocarbohydrazones (MTCHs) - an important class of synthetic organic molecules. Their structure was confirmed by elemental microanalysis, as well as ATR-FTIR, ¹H and ¹³C NMR spectra, with the addition of 2D NMR spectra for novel compounds. The hybrids of cinnamic acids and pyridine derivatives are particularly active compounds with the lowest MIC50 value of 10.4 for p-chloro cinnamic acid and acetyl pyridine derivatives. An alignment-independent 3D QSAR model identified pharmacophoric hotspots and suggested several structural modifications that might improve the potency of this class of compounds against A. baumannii. The compounds are strong iron-chelating agents forming complexes with a stability constant between 10⁷ and 10⁹. The synthesized derivatives represent a promising class of antibacterial compounds with activities comparable to the commonly used antibiotics.
... The training series can be entered as a single SDFile [13] annotated with a biological property for model building. Most of the computational work was carried out by external programs (e.g., Moka [14,15] for pKa ionization, CORINA [16,17] for 3D structure generation, Pentacle [18,19], and PaDel [20] for molecular descriptor calculations), which were called by eTOXlab in sequence. For prediction, we provided a simple application programming interface (API) with a single endpoint. ...
Chapter
The pharmaceutical industry would benefit from the collaboration with academic groups in the development of predictive safety models using the newest computational technologies. However, this collaboration is sometimes hampered by the handling of confidential proprietary information and different working practices in both environments. In this manuscript, we propose a strategy for facilitating this collaboration, based on the use of modeling frameworks developed for facilitating the use of sensitive data, as well as the development, interchange, hosting, and use of predictive models in production. The strategy is illustrated with a real example in which we used Flame, an open-source modeling framework developed in our group, for the development of an in silico eye irritation model. The model was based on bibliographic data, refined during the company-academic group collaboration, and enriched with the incorporation of confidential data, yielding a useful model that was validated experimentally.
... The 3D-QSAR model was created by using Pentacle software 1.07 [32]. This software was used to calculate alignment-independent three-dimensional molecular descriptors (GRIND) [33]. The main aim of developing the 3D-QSAR model in this study is to define the crucial structural elements of the compounds that have the most important influence on anti-HeLa activity. ...
Article
Selected tetrahydropyrimidines (THPMs) were investigated by means of cytotoxic activities on selected cancer cell lines (HeLa, A549, and LS174) and on a non-cancerous cell line (MRC-5). Among the evaluated compounds, two of them (B7 and B8) showed good cytotoxic activity on the tested cell lines and were selected for further evaluation that included mechanism of action, DNA and BSA interactions and a molecular docking study. Calculated parameters from fluorescence quenching studies indicated that B7 and B8 bind on the minor groove of DNA and have great ability to bind on carrier protein. A three-dimensional quantitative structure anti-HeLa activity study was performed with a data set of twenty-one compounds. Molecular Interaction Fields were used to derive Grid independent descriptors (GRIND) as independent variables in Pentacle software. The quality and predictive capacity of the model were proved by internal statistical parameters, R² = 0.992 and Q² = 0.51, as well as external parameters such as R²pred = 0.804 and r²m, r′²m and mean r²m, which were higher than 0.5. The structural determinants significant for anti-HeLa activity of the compounds were identified by using the developed 3D-QSAR model. Interpretation of the GRIND variables with the greatest impact on the anti-HeLa activity generated several hypotheses for the design of novel and more potent anti-HeLa tetrahydropyrimidines. Additional molecular targets for the most active synthesized derivatives (B7 and B8) were predicted by use of the online web-based tool SwissTargetPrediction.
... All calculations were carried in Pentacle software version 1.06 for Linux. [54] ...
Preprint
The need for an effective drug against COVID-19 is, almost 18 months after the outbreak of the global pandemic, still very high. A very quick and safe approach to counteract COVID-19 is in silico drug repurposing. The SARS-CoV-2 PLpro promotes viral replication and modulates the host immune system, resulting in inhibition of the host antiviral innate immune response, and therefore is an attractive drug target. In this study, we used a combined in silico virtual screening for candidates for SARS-CoV-2 PLpro protease inhibitors. We used the Informational spectrum method applied for Small Molecules for searching the Drugbank database, followed by molecular docking. After in silico screening of drug space, we identified 44 drugs as potential SARS-CoV-2 PLpro inhibitors that we propose for further experimental testing.
... All calculations were carried in Pentacle software version 1.06 for Linux. [54] ...
Article
Full-text available
In the current pandemic, finding an effective drug to prevent or treat the infection is the highest priority. A rapid and safe approach to counteract COVID‐19 is in silico drug repurposing. The SARS‐CoV‐2 PLpro promotes viral replication and modulates the host immune system, resulting in inhibition of the host antiviral innate immune response, and therefore is an attractive drug target. In this study, we used a combined in silico virtual screening for candidates for SARS‐CoV‐2 PLpro protease inhibitors. We used the Informational spectrum method applied for Small Molecules for searching the Drugbank database followed by molecular docking. After in silico screening of drug space, we identified 44 drugs as potential SARS‐CoV‐2 PLpro inhibitors that we propose for further experimental testing. Combined in silico machinery, consisting of long‐ and short‐range molecular interaction methods is capable of filtering large compound databases, yielding a good candidate for a receptor. The advantage of this approach is a fast assessment of a candidate, starting from basic information on both protein and small organic molecule – FASTA sequence and SMILES notation.
... Stereochemical information of the molecules was kept fixed to generate low energy conformations via neutralization of formal charges, orientation of the 3D structures with reference to their moment of inertia and removal of counterions in salts. In the following step, each independent set of molecular conformations was imported into Pentacle v 1.07 (Pastor et al., 2000) along with their biological activity values (pIC50). Molecular Interaction Fields (MIFs) were computed using probes, i.e., hydrophobic interactions (DRY), hydrogen bond acceptor (O), hydrogen bond donor (N1) and topology representing the molecular boundaries (TIP) with reference to the receptor. ...
Article
Full-text available
Insulin-like growth factor receptor (IGF-1R) and insulin receptor (IR) are widely accepted to play a prominent role in cancer drug discovery due to their well-established involvement in various stages of tumorigenesis. Previously, neutralization of IGF-1R via monoclonal antibodies was in focus, which failed because of compensatory activation of IR-A upon inhibition of IGF-1R. Recent studies have demonstrated high homology between IGF-1R and IR, particularly in the tyrosine kinase domain, and targeting both receptors has produced efficient therapeutic approaches such as inhibition of cancer cell cycle proliferation. Herein, we have made an attempt to analyze a unique data set from different chemical classes, containing potent ATP competitors against the tyrosine kinase domain. We performed 2D and 3D quantitative structure-activity relationship (QSAR) studies on inhibitors of these receptors to predict useful pharmacophoric features. We have optimized virtual screening of a structurally diverse data set of dual inhibitors of IGF-1R and IR. Based on the QSAR studies, we predict potential novel clinical candidates with a demonstrated absorption, distribution, metabolism, elimination, and toxicology (ADMETox) track. We also performed a comprehensive analysis of co-crystal complexes along with their inhibitors and built a 3D GRid INdependent Descriptors (GRIND) model to obtain insightful features such as H-bond donors and acceptors, overall topology and van der Waals volume (vdw_vol), which are found to be responsible for dual inhibition of the receptors. These findings suggest that Tirofiban, Practolol, Edoxaban and Novobiocin have the potential to perform dual inhibition of both targets.
... MD simulation was performed with the NAMD software, while docking studies were performed with both AutoDock Vina and GOLD programs. Afterward, the predicted bioactive conformers were utilized to develop 3D-QSAR models by using the Pentacle program and therefore to gain further insights into the structural requirements that affect their antagonistic activity on 5-HT2A receptor (99). Overall, the results obtained through performed LBDD and SBDD studies as well as ADMET profiling may provide information that can be used for further optimization of compounds as well as for rational drug design. ...
Article
Drug discovery and development is a very challenging, expensive and time-consuming process. Impressive technological advances in computer sciences and molecular biology have made it possible to use computer-aided drug design (CADD) methods in various stages of the drug discovery and development pipeline. Nowadays, CADD presents an efficacious and indispensable tool, widely used in medicinal chemistry, to lead rational drug design and synthesis of novel compounds. In this article, an overview of commonly used CADD approaches from hit identification to lead optimization was presented. Moreover, different aspects of design of multitarget ligands for neuropsychiatric and anti-inflammatory diseases were summarized. Apparently, designing multi-target directed ligands for treatment of various complex diseases may offer better efficacy, and fewer side effects. Antipsychotics that act through aminergic G protein-coupled receptors (GPCRs), especially Dopamine D2 and serotonin 5-HT2A receptors, are the best option for treatment of various symptoms associated with neuropsychiatric disorders. Furthermore, multi-target directed cyclooxygenase-2 (COX-2) and 5-lipoxygenase (5-LOX) inhibitors are also a successful approach to aid the discovery of new anti-inflammatory drugs with fewer side effects. Overall, employing CADD approaches in the process of rational drug design provides a great opportunity for future development, allowing rapid identification of compounds with the optimal polypharmacological profile.
... Furthermore, to investigate whether specific structural features should differentiate the highly myelotoxic from less toxic molecules and, thus, should be taken into consideration in the design of new candidate drugs, the dataset was analyzed using GRIND toxicophore-based descriptors [28]. The paper reported some structural features characterizing the toxicophores of myelotoxic compounds. ...
Article
Introduction: Safety and tolerability is a critical area where improvements are needed to decrease the attrition rates during development of new drug candidates. Modeling approaches, when smartly implemented, can contribute to this aim. Areas covered: The focus of this review was on modeling approaches applied to four kinds of drug-induced toxicities: hematological, immunological, cardiovascular (CV) and liver toxicity. Papers, mainly published in the last 10 years, reporting models in three main methodological categories – computational models (e.g., quantitative structure–property relationships, machine learning approaches, neural networks, etc.), pharmacokinetic-pharmacodynamic (PK-PD) models, and quantitative system pharmacology (QSP) models – have been considered. Expert opinion: The picture observed in the four examined toxicity areas appears heterogeneous. Computational models are typically used in all areas as screening tools in the early stages of development for hematological, cardiovascular and liver toxicity, with accuracies in the range of 70–90%. A limited number of computational models, based on the analysis of drug protein sequence, was instead proposed for immunotoxicity. In the later stages of development, toxicities are quantitatively predicted with reasonably good accuracy using either semi-mechanistic PK-PD models (hematological and cardiovascular toxicity), or fully exploited QSP models (immuno-toxicity and liver toxicity).
... To circumvent this necessity, grid independent descriptors (GRIND) were developed which use internal distances instead of requiring alignment. 48 Over several years, we have implemented more variations than can be enumerated here; however, every method used to represent a whole molecule failed most likely because only a single, ground-state conformer was used. Obviously, this approach is chemically flawed but lacking the ability to identify and calculate reactive conformers for each library member using quantum chemical methods, no way forward was possible with the current workflow. ...
Article
Conspectus: Catalyst design in enantioselective catalysis has historically been driven by empiricism. In this endeavor, experimentalists attempt to qualitatively identify trends in structure that lead to a desired catalyst function. In this body of work, we lay the groundwork for an improved, alternative workflow that uses quantitative methods to inform decision making at every step of the process. At the outset, we define a library of synthetically accessible permutations of a catalyst scaffold with the philosophy that the library contains every potential catalyst we are willing to make. To represent these chiral molecules, we have developed general 3D representations, which can be calculated for tens of thousands of structures. This defines the total chemical space of a given catalyst scaffold; it is constructed on the basis of catalyst structure only without regard to a specific reaction or mechanism. As such, any algorithmic subset selection method, which is unsupervised (i.e., only considers catalyst structure), should provide an ideal initial screening set for any new reaction that can be catalyzed by that scaffold. Notably, because of this design strategy, the same set of catalysts can be used for any reaction that can be catalyzed with that parent catalyst scaffold. These are tested experimentally, and statistical learning tools can be used to create a model relating catalyst structure to catalyst function. Further, this model can be used to predict the performance of each catalyst candidate in the greater database of virtual catalyst candidates. In this way, it is possible to estimate the performance of tens of thousands of catalysts by experimentally testing a smaller subset. Using error assessment metrics, it is possible to understand the confidence in new predictions. An experimentalist using this tool can balance the predicted results (reward) with the prediction confidence (risk) when deciding which catalysts to synthesize next in an optimization campaign.
These catalysts are synthesized and tested experimentally. At this stage, either the optimization is a success or the predicted values were incorrect and further optimization is required. In the case of the latter, the information can be fed back into the statistical learning model to refine the model, and this iterative process can be used to determine the optimal catalyst. In this body of work, we not only establish this workflow but quantitatively establish how best to execute each step. Herein, we evaluate several 3D molecular representations to determine how best to represent molecules. Several selection protocols are examined to best decide which set of molecules can be used to represent the library of interest. In addition, the number of reactions needed to make accurate, statistical learning models is evaluated. Taken together these components establish a tool ready to progress from the development stage to the utility stage. As such, current research endeavors focus on applying these tools to optimize new reactions.
... Molecular descriptors, which represent a molecule in a mathematical form, were first developed in the chemistry field to derive quantitative structure−activity relationships (QSAR) or quantitative structure−property relationships (QSPR), that is, to predict the properties of chemical compounds based on their molecular structure. 951,952 Several software packages such as the Molecular Operating Environment (MOE), 953 the CODESSA package, 954 PaDEL, 955 GRIND, 956 and DRAGON 957 can calculate the molecular descriptors based on theoretical considerations. Lewis et al. used 1D, 2D and 3D molecular descriptors to describe a new library of amphiphilic macromolecules and construct QSAR models for their antiatherogenic activity. ...
Article
The complex interaction of cells with biomaterials (i.e., materiobiology) plays an increasingly pivotal role in the development of novel implants, biomedical devices, and tissue engineering scaffolds to treat diseases, aid in the restoration of bodily functions, construct healthy tissues, or regenerate diseased ones. However, the conventional approaches are incapable of screening the huge amount of potential material parameter combinations to identify the optimal cell responses and involve a combination of serendipity and many series of trial-and-error experiments. For advanced tissue engineering and regenerative medicine, highly efficient and complex bioanalysis platforms are expected to explore the complex interaction of cells with biomaterials using combinatorial approaches that offer desired complex microenvironments during healing, development, and homeostasis. In this review, we first introduce materiobiology and its high-throughput screening (HTS). Then we present an in-depth overview of the recent progress of 2D/3D HTS platforms (i.e., gradient and microarray) in the principle, preparation, screening for materiobiology, and combination with other advanced technologies. The Compendium for Biomaterial Transcriptomics and high content imaging, computational simulations, and their translation toward commercial and clinical uses are highlighted. In the final section, current challenges and future perspectives are discussed. High-throughput experimentation within the field of materiobiology enables the elucidation of the relationships between biomaterial properties and biological behavior and thereby serves as a potential tool for accelerating the development of high-performance biomaterials.
... The conformer ensemble for each molecule was generated by Omegae 9,10 , and conformers most similar to the conformation of natural APS ligand were selected by vROCS 11,12 . These conformers were used for generation of 3D QSAR models, computing GRIND descriptors from the encoded molecular interaction fields (MIF) in Pentacle program 13 . MIFs characterize the interaction features of a molecule with the environment which could be mapped using chemical probes. ...
Article
Metabolism of sulfur (sulfur assimilation pathway, SAP) is one of the key pathways for the pathogenesis and survival of persistent bacteria, such as Mycobacterium tuberculosis (Mtb), in the latent period. Adenosine 5′-phosphosulfate reductase (APSR) is an important enzyme involved in the SAP and is absent from the human body, so it might represent a valid target for the development of new antituberculosis drugs. This work aimed to develop a 3D QSAR model based on the crystal structure of APSR from Pseudomonas aeruginosa, which shows a high degree of homology with APSR from Mtb, in complex with its substrate, adenosine 5′-phosphosulfate (APS). The 3D QSAR model was built from a set of 16 nucleotide analogues of APS using alignment-independent descriptors derived from molecular interaction fields (MIF). The model improves the understanding of the key characteristics of molecules necessary for interaction with the target, and enables the rational design of novel small-molecule inhibitors of Mtb APSR.
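Alignment-independent descriptors of this kind rest on distances between field nodes inside each molecule rather than on absolute coordinates. The sketch below is a simplified illustration of that idea, not the Pentacle or ALMOND implementation: for each distance bin it keeps the largest product of field energies over all node pairs falling in that bin, in the spirit of the autocorrelation transform described for GRIND. The function names, the bin width, and the node coordinates and energies are all assumptions made up for the example.

```python
import math
from itertools import combinations


def autocorrelation_descriptor(nodes, bin_width=1.0, n_bins=10):
    """Encode field nodes as one value per distance bin.

    nodes: list of ((x, y, z), energy) pairs, e.g. the few most relevant
    points kept after simplifying a molecular interaction field.
    For every pair of nodes the product of their energies is computed, and
    each distance bin stores the largest product seen at that distance.
    """
    descriptor = [0.0] * n_bins
    for (p, e_p), (q, e_q) in combinations(nodes, 2):
        distance = math.dist(p, q)
        bin_index = int(distance // bin_width)
        if bin_index < n_bins:
            descriptor[bin_index] = max(descriptor[bin_index], e_p * e_q)
    return descriptor


def rotate_z(point, angle):
    """Rotate a 3D point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


# Hypothetical nodes: coordinates in angstroms, energies in kcal/mol.
nodes = [
    ((0.0, 0.0, 0.0), -2.0),
    ((3.2, 0.0, 0.0), -1.5),
    ((0.0, 4.6, 0.0), -3.0),
    ((1.0, 1.0, 2.5), -0.5),
]

# The same nodes after an arbitrary rotation and translation.
moved = [(tuple(c + 7.0 for c in rotate_z(p, 0.8)), e) for p, e in nodes]

original = autocorrelation_descriptor(nodes)
transformed = autocorrelation_descriptor(moved)
print(original)
print(original == transformed)
```

Because only pairwise distances enter the calculation, rotating or translating the molecule leaves the descriptor unchanged, which is why no superposition step is needed before building a model. Keeping the maximum product per bin, rather than a sum, also means each value can be traced back to one specific pair of nodes.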
... accessed: 21st November 2016) and converted to 3D using Corina version 3.494 (Sadowski et al. 1994). These were then used to generate GRIND2 descriptors (Duran et al. 2009;Pastor et al. 2000) making use of Pentacle software version 1.0.6 (www.moldiscovery.com/software/pentacle), with default settings. The resulting molecular descriptors were then projected into the principal component analysis (PCA) scores obtained for a collection of ca. ...
... Minimized structures of compounds were transferred to Pentacle 1.05 software (Molecular Discovery Ltd., Oxford, UK) to generate the GRid-INdependent Descriptors (GRINDs). GRIND descriptors specifically describe the pharmacodynamic properties, including receptor-ligand interactions [47][48][49][50]. Amanda algorithm was applied to extract the alignment-independent descriptors (GRINDs) [48]. ...
Article
Dihydrofolate reductase (DHFR) is an essential enzyme that participates in folate metabolism and purine and thymidylate synthesis in cell proliferation. It converts dihydrofolate (DHF) to tetrahydrofolate (THF) in the presence of nicotinamide adenine dinucleotide phosphate (NADPH). This enzyme is found within all organisms adjusting the cellular level of THF. Negligible DHFR activity leads to a deficiency of THF and cell death. This trait is used to inhibit the cancer cells. DHFR has been extensively studied as the therapeutic target for cancer treatment. Accordingly, there is an urgent need for the identification and development of novel effective inhibitors with higher selectivity, lower toxicity, and better potency than currently available drugs. Hence, we aimed to identify new compounds by utilizing the alignment-independent three-dimensional quantitative structure-activity relationship (3D-QSAR) and structure-based pharmacophore modeling. Using the results obtained from these approaches, several antagonists have been retrieved from the virtual screening by applying some filters such as Lipinski’s rule of five. Selected compounds were then docked into the binding site of the receptor for identification of ligand-receptor interactions, binding affinity prediction, and refinement based on GoldScore fitness values. Eventually, pharmacokinetic/drug-likeness features and toxicity profiles of novel compounds were predicted and evaluated. Four hits with PubChem CIDs of 136138676, 94182663, 60219817, and 133300845 were finally proposed as new candidates with potential inhibitory activity against human DHFR.
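A filter based on Lipinski's rule of five, as mentioned in the abstract above, can be sketched as follows. The thresholds are the standard ones (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); the abstract does not say how the filter was applied, so the tolerance of one violation, the dictionary keys, and the two example compounds are assumptions for illustration only.

```python
def lipinski_violations(mol):
    """Count how many of the four rule-of-five criteria a compound breaks.

    mol is a dict with hypothetical keys: mol_weight (Da), logp,
    h_donors and h_acceptors (counts).
    """
    checks = [
        mol["mol_weight"] > 500,
        mol["logp"] > 5,
        mol["h_donors"] > 5,
        mol["h_acceptors"] > 10,
    ]
    return sum(checks)


def passes_rule_of_five(mol, max_violations=1):
    """Keep a compound if it breaks no more than max_violations criteria."""
    return lipinski_violations(mol) <= max_violations


# Made-up property values, not taken from the cited study.
candidates = [
    {"name": "hit_A", "mol_weight": 342.4, "logp": 2.1, "h_donors": 2, "h_acceptors": 6},
    {"name": "hit_B", "mol_weight": 712.9, "logp": 6.3, "h_donors": 7, "h_acceptors": 12},
]

kept = [m["name"] for m in candidates if passes_rule_of_five(m)]
print(kept)
```

In a real screening workflow the property values would come from a cheminformatics toolkit rather than being typed in by hand.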
... Finally, we estimated the target-ligand affinity by using the FLAP (fingerprints for ligands and proteins) procedure [46] that provides a common reference framework for comparing molecules, using GRID molecular interaction fields (MIFs). The GRID MIFs (i.e., GRID molecular interaction fields) [47], originally developed for structure-based drug design [48], have been applied to a variety of drug discovery areas over the years, such as pKa [49] and tautomers modeling [50], scaffold-hopping [51], 3D-QSAR [52], and metabolism prediction [53]. Using the GRID MIFs one can easily obtain information related to non-covalent bonding between the selected probe and the target. ...
Article
The production of seeds without sex is considered the holy grail of plant biology. The transfer of apomixis to various crop species has the potential to transform plant breeding, since it will allow new varieties to retain valuable traits through asexual reproduction. Therefore, a greater molecular understanding of apomixis is fundamental. In a previous work we identified a gene, namely APOSTART, that seemed to be involved in this asexual mode of reproduction, which is very common in Poa pratensis L., and here we present a detailed work aimed at clarifying its role in apomixis. In situ hybridization showed that PpAPOSTART is expressed in reproductive tissues from pre-meiosis to embryo development. Interestingly, it is expressed early in a few nucellar cells of apomictic individuals, possibly switching from a somatic to a reproductive cell as in aposporic apomixis. Moreover, out of 13 APOSTART members, we identified one, APOSTART_6, as specifically expressed in flower tissue. APOSTART_6 also exhibited delayed expression in apomictic genotypes when compared with sexual types. Most importantly, the SCAR (Sequence Characterized Amplified Region) derived from the APOSTART_6 sequence completely co-segregated with apomixis.
Article
We performed molecular field analysis using computed data of half-titanocene-catalyzed olefin polymerization. The activation energies of ethylene insertion, propylene insertion, and the energy differences between ethylene insertion and β-hydrogen transfer...
Article
Structural information can help engineer enzymes. Usually, specific amino acids in particular regions are targeted for functional reconstruction to enhance the catalytic performance, including activity, stereoselectivity, and thermostability. Appropriate selection of target sites is the key to structure‐based design, which requires elucidation of the structure–function relationships. Here, we summarize the mutations of residues in different specific regions, including active center, access tunnels, and flexible loops, on fine‐tuning the catalytic performance of enzymes, and discuss the effects of altering the local structural environment on the functions. In addition, we keep up with the recent progress of structure‐based approaches for enzyme engineering, aiming to provide some guidance on how to take advantage of the structural information.
Article
The application of QSAR analysis dates back half a century and it continues to be employed in rational drug design. Multi-dimensional QSAR modeling can be a promising tool for researchers to develop reliable predictive QSAR models for designing novel compounds. In the present work, we studied inhibitors of human aldose reductase (AR) to generate multi-dimensional QSAR models using 3D- and 6D-QSAR methods. For this purpose, the Pentacle and Quasar programs were used to produce the QSAR models using corresponding dissociation constant (Kd) values. By inspecting the performance metrics of the generated models, we achieved similar results with comparable internal validation statistics. However, considering the externally validated values, 6D-QSAR models provide significantly better prediction of endpoint values. The obtained results suggest that the higher the dimension of the QSAR model, the higher the performance of the generated model. However, more studies are required to verify these outcomes. Supplementary Information: The online version contains supplementary material available at 10.1186/s13065-023-00970-x.
Thesis
Towards reducing the timeframe and the high attrition rate in small-molecule drug discovery, there is growing interest in integrating experimental data and computational methods to decipher the molecular mechanisms through which bioactive compounds interact with their target proteins. The goal of this dissertation is to develop and apply several of these data-intensive integrative approaches. In the first study, an update was made for StreptomeDB, a chemogenomics database describing the physicochemical and biological properties of metabolites originating from bacteria of the genus Streptomyces. Substantial improvements were made over its forerunners, especially in terms of data content (~2500 new metabolites added) and interoperability (hyperlinks to several spectral, (bio)chemical and chemical vendor databases, and to a genome-based metabolite prediction server). Next, a novel pharmacophore-based target prediction tool was developed, named ePharmaLib. It was retrospectively validated using StreptomeDB metabolites. As proof-of-concept, ePharmaLib predictions were complemented with bioassay experiments to identify the human purine nucleoside phosphorylase as a hitherto unknown protein target of the metabolite called neopterin. In another study, an in-depth structural and statistical analysis was carried out using the solved 3D structures of aromatic-cage-containing proteins complexed with their cationic ligands. As a follow-up, the scope of the aforementioned study was expanded to include ligands forming π-π or hydrophobic contacts with aromatic cages. Ultimately, the collected data set was integrated into a web database named AroCageDB. In the fifth study, the solved 3D structures of covalent protein–ligand complexes were manually annotated by experts from the Protein Data Bank and assimilated into a dedicated web database named CovPDB.
Lastly, in the sixth study, an integrative drug repurposing approach based on computational modeling and in vitro enzymatic assays was carried out to repurpose CovPDB serine-targeted covalent inhibitors. This led to the identification of the phenylboronic acid BC-11 as a nanomolar covalent inhibitor of the human transmembrane protease serine 2, while it exhibited a unique selectivity profile for serine proteases ascribable to its boronic acid warhead. Moreover, BC-11 showed significant inhibition of SARS-CoV-2 (Omicron variant) spike pseudotyped particles in a cell-based entry assay, thus serving as a good starting point for further structural optimization to develop novel COVID-19 antivirals.
Article
Recepteur d’Origine Nantais, known as RON, is a member of the receptor tyrosine kinase (RTK) superfamily which has recently gained increasing attention as a cancer target for therapeutic intervention. The aim of this work was to perform an alignment-independent three-dimensional quantitative structure–activity relationship (3D QSAR) study for a series of RON inhibitors. A 3D QSAR model based on the GRid-INdependent Descriptors (GRIND) methodology was generated using a set of 19 compounds with RON inhibitory activities. The generated 3D QSAR model revealed the main structural features important for the potency of RON inhibitors. The results obtained from the presented study can be used in lead optimization projects for designing novel compounds where inhibition of RON is needed.
Article
This review highlights the recent advances (2019-present) in the use of MFA (molecular field analysis) for data-driven catalyst design, which enables improved selectivities and reaction outcomes in asymmetric catalysis. Successful examples of MFA-based molecular design and how to design molecules by MFA are described, including how to generate and evaluate MFA-based regression models, and future challenges in MFA-based molecular design in molecular catalysis.
Article
Aim: Through the application of structure- and ligand-based methods, the authors aimed to create an integrative approach to developing a computational protocol for the rational drug design of potent dual 5-HT2A/D2 receptor antagonists without off-target activities on H1 receptors. Materials & methods: Molecular dynamics and virtual docking methods were used to identify key interactions of the structurally diverse antagonists in the binding sites of the studied targets, and to generate their bioactive conformations for further 3D-quantitative structure-activity relationship modeling. Results & conclusion: Toward the goal of finding multi-potent drugs with a more effective and safer profile, the obtained results led to the design of a new set of dual antagonists and opened a new perspective on the therapy for complex brain diseases.
Article
The synthesis of the desired chemical compound is the main task of synthetic organic chemistry. The prediction of reaction conditions and of important quantitative characteristics of chemical reactions, such as yield and reaction rate, can substantially help in the development of optimal synthetic routes and the assessment of synthesis cost. Theoretical assessment of these parameters can be performed with the help of modern machine-learning approaches, which use available experimental data to develop predictive models, an approach called quantitative or qualitative structure–reactivity relationship (QSRR) modelling. In the article, we review the state-of-the-art in the QSRR area and give our opinion on emerging trends in this field.
Chapter
Ultrahigh-throughput virtual screening (uHTVS) is an emerging field linking together classical docking techniques with high-throughput AI methods. We outline mechanistic docking models’ goals and successes. We present different AI accelerated workflows for uHTVS, mainly through surrogate docking models. We showcase a novel feature representation technique, molecular depictions (images), as a surrogate model for docking. Along with a discussion on analyzing screens using regression enrichment surfaces at the tens of billion scale, we outline a future for uHTVS screening pipelines with deep learning.
Article
Magnetite (MG) modified cellulose membrane (Cell-MG), obtained by reaction of 3-aminosilane and subsequently with diethylenetriaminepentaacetic acid dianhydride functionalized waste Cell fibers (Cell-NH2 and Cell-DTPA, respectively), and amino-modified diatomite was used for Azoxystrobin and Iprodione removal from water. The Cell-MG membrane was structurally and morphologically characterized using FT-IR and FE-SEM techniques. The influences of operational parameters, i.e. pH, contact time, temperature, and the mass of adsorbent, on adsorption and kinetics were studied in a batch system. The calculated capacities of 35.32 and 30.16 mg g-1 for Azoxystrobin and Iprodione, respectively, were obtained from non-linear Langmuir model fitting. Weber-Morris model fitting indicates the main contribution of intra-particle diffusion to overall mass transport resistance. Thermodynamic data indicate spontaneous and endothermic adsorption. The reusability of the adsorbent and results from wastewater purification showed that Cell-MG could be used as a general-purpose adsorbent. The adsorbent/adsorbate surface interaction was considered from the results obtained using density functional theory (DFT) and calculation of molecular electrostatic potential (MEP). Thus, a better understanding of the relation between the adsorption performances and the contribution of non-specific and specific interactions to adsorption performances and design of novel adsorbent with improved properties was deduced.
Article
Through the representation of small molecule structures as numerical descriptors and the exploitation of the similarity principle, chemoinformatics has made paramount contributions to drug discovery, from unveiling mechanisms of action and repurposing approved drugs to de novo crafting of molecules with desired properties and tailored targets. Yet, the inherent complexity of biological systems has fostered the implementation of large-scale experimental screenings seeking a deeper understanding of the targeted proteins, the disrupted biological processes and the systemic responses of cells to chemical perturbations. After this wealth of data, a new generation of data-driven descriptors has arisen providing a rich portrait of small molecule characteristics that goes beyond chemical properties. Here, we give an overview of biologically relevant descriptors, covering chemical compounds, proteins and other biological entities, such as diseases and cell lines, while aligning them to the major contributions in the field from disciplines, such as natural language processing or computer vision. We now envision a new scenario for chemical and biological entities where they both are translated into a common numerical format. In this computational framework, complex connections between entities can be unveiled by means of simple arithmetic operations, such as distance measures, additions, and subtractions.
Chapter
Recent years have seen a constant increase in the performance of computer-based tools and mathematical algorithms for solving chemistry-related problems. Screening of potent lead molecules using computational approaches has been gaining more attention as an alternative to high-throughput screening. Several cheminformatics tools are used in research, and integrating them with statistical methods is said to reflect the development of new algorithms and applications. These molecular modeling or cheminformatics methods strongly depend on quantitative structure–activity relationship (QSAR) analysis. The QSAR technique is extensively applied to predict pharmacokinetic properties from reference biological activity and is a sound technique in medicinal chemistry. In this chapter, the basic principles of computational methods that rely on QSAR models, their descriptors, and the statistical treatment of molecular structures are discussed. We also highlight the important components of QSAR models and their types used to describe the molecular structure of lead molecules, and discuss limitations and perspectives to guide future research in the field of QSAR.
Article
Background: Inflammation is a common pathogenic process in the progression of many diseases, such as malignancy, cardiovascular and rheumatic diseases. The inhibition of the synthesis of inflammatory mediators by modulation of cyclooxygenase (COX) and lipoxygenase (LOX) pathways provides a challenging strategy for the development of more effective drugs. Objective: The aim of this study was to design dual COX-2 and 5-LOX inhibitors with iron-chelating properties using a combination of ligand-based (three-dimensional quantitative structure-activity relationship (3D-QSAR)) and structure-based (molecular docking) methods. Methods: The 3D-QSAR analysis was applied on a literature dataset consisting of 28 dual COX-2 and 5-LOX inhibitors in Pentacle software. The quality of the developed COX-2 and 5-LOX 3D-QSAR models was evaluated by internal and external validation methods. The molecular docking analysis was performed in GOLD software, while selected ADMET properties were predicted in ADMET predictor software. Results: According to the molecular docking studies, the class of sulfohydroxamic acid analogues, previously designed by 3D-QSAR, was clustered as potential dual COX-2 and 5-LOX inhibitors with iron-chelating properties. Based on the 3D-QSAR and molecular docking, 1j, 1g, and 1l were selected as the most promising dual COX-2 and 5-LOX inhibitors. According to the in silico ADMET predictions, all compounds had an ADMET_Risk score less than 7 and a CYP_Risk score lower than 2.5. Designed compounds were not estimated as hERG inhibitors, and 1j had improved intrinsic solubility (8.704) in comparison to the dataset compounds (0.411-7.946). Conclusion: By combining 3D-QSAR and molecular docking, three compounds (1j, 1g, and 1l) are selected as the most promising designed dual COX-2 and 5-LOX inhibitors, for which good activity, as well as favourable ADMET properties and toxicity, are expected.
Article
The review aims to present a classification and applicability analysis of methods for preliminary molecular modelling for targeted organic, catalytic and biocatalytic synthesis. The following three main approaches are considered as a primary classification of the methods: modelling of the target – ligand coordination without structural information on both the target and the resulting complex; calculations based on experimentally obtained structural information about the target; and dynamic simulation of the target – ligand complex and the reaction mechanism with calculation of the free energy of the reaction. The review is meant for synthetic chemists to be used as a guide for building an algorithm for preliminary modelling and synthesis of structures with specified properties. The bibliography includes 353 references.
Article
Description of molecular stereostructure is critical for the machine learning prediction of asymmetric catalysis. Herein we report a spherical projection descriptor of molecular stereostructure (SPMS), which allows precise representation of the molecular van der Waals (vdW) surface. The key features of SPMS descriptor are presented using the examples of chiral phosphoric acid, and the machine learning application is demonstrated in Denmark’s dataset of asymmetric thiol addition to N-acylimines. In addition, SPMS descriptor also offers a color-coded diagram that provides straightforward chemical interpretation of the steric environment.
Chapter
Molecular modeling and simulation play a central role in academic and industrial research focused on physico-chemical properties and processes. The efforts carried out in this field have crystallized in a variety of models, simulation methods, and computational techniques that are examining the relationship between the structure, dynamics and functional role of biomolecules and their interactions. In particular, there has been a huge advance in the understanding of the molecular determinants that mediate the interaction between small compounds acting as ligands and their macromolecular targets. This book provides an updated description of the advances experienced in recent years in the field of molecular modeling and simulation of biomolecular recognition, with particular emphasis towards the development of efficient strategies in structure-based drug design.
Article
Sustained research on natural products has generated, over time, a large number of compounds with potential for evaluation in biological tests; these have subsequently been cataloged in databases that allow other researchers to perform virtual screenings for activity in various biological systems. This considerably reduces the time needed to develop new drugs. This review describes the main natural product databases available for searching for bioactive compounds. It also describes the main features of virtual screening strategies for identifying molecules with potential to be used as new drugs. In addition, a search was made in the Web of Science database using the search term "Virtual screening of natural products databases" for the period 2003 to 2018. The search returned 230 articles, whose abstracts were evaluated against the criteria required for this work: a) the work is a research article; b) it performs a virtual screening of databases of natural products or databases containing natural products; c) it identifies drug candidate molecules. Based on these criteria, bibliographic reviews on the topic were excluded. After this analysis, 104 works were selected for this review. The selected papers describe potential drug candidates distributed across 15 classes, of which anticancer, antibacterial and anti-inflammatory hits were the most abundant. Works describing efforts to find new molecules against various other diseases in distinct biological systems are also covered. In this way, this work gives an overview of several methodologies, which we hope can help and inspire new research to improve people's quality of life.
Article
Introduction: Despite the availability of FDA approved inhibitors of HIV protease, numerous efforts are still ongoing to achieve ‘near-perfect’ drugs devoid of characteristic adverse side effects, toxicities, and mutational resistance. While experimental methods have been plagued with huge consumption of time and resources, there has been an incessant shift towards the use of computational simulations in HIV protease inhibitor drug discovery. Areas covered: Herein, the authors review the numerous applications of 3D-QSAR modeling methods over recent years relative to the design of new HIV protease inhibitors from a series of experimentally derived compounds. Also, the augmentative contributions of molecular docking are discussed. Expert opinion: Efforts to optimize 3D-QSAR and molecular docking for HIV-1 drug discovery are ongoing, which could further incorporate inhibitor motions at the active site using molecular dynamics parameters. Also, highly predictive machine learning algorithms such as random forest, K-means, decision trees, linear regression, hierarchical clustering, and Bayesian classifiers could be employed.
Article
It is shown how a self-organizing neural network such as the one introduced by Kohonen can be used to analyze features of molecular surfaces, such as shape and the molecular electrostatic potential. On the one hand, two-dimensional maps of molecular surface properties can be generated and used for the comparison of a set of molecules. On the other hand, the surface geometry of one molecule can be stored in a network and this network can be used as a template for the analysis of the shape of various other molecules. The application of these techniques to a series of steroids exhibiting a range of binding activities to the corticosteroid-binding globulin receptor allows one to pinpoint the essential features necessary for biological activity.
Article
This article details the principle of the molecular descriptor that we use in structure-activity studies. This descriptor, derived from an autocorrelation function, applies equally to the topological or three-dimensional structures of molecules and allows any two molecules to be compared. Molecules are thus represented as vectors that are easy to manipulate by computer. These vectors express how a property is distributed over a molecular structure.
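As an illustration of how an autocorrelation function can turn a molecular structure into a fixed-length vector, the following is a minimal sketch of a topological autocorrelation, assuming the common form in which the entry at lag d sums the products of an atomic property over all atom pairs separated by d bonds. The function names, the three-atom example graph, and the choice of Pauling electronegativity as the property are illustrative assumptions, not taken from the article.

```python
from collections import deque


def topological_distances(adjacency, start):
    """Breadth-first search: bond-count distance from `start` to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def autocorrelation_vector(adjacency, prop, max_lag):
    """Entry d is the sum of prop[i] * prop[j] over atom pairs d bonds apart.

    Lag 0 is the sum of squared property values; each unordered pair
    is counted once.
    """
    vector = [0.0] * (max_lag + 1)
    atoms = sorted(adjacency)
    for i in atoms:
        dist = topological_distances(adjacency, i)
        for j in atoms:
            if j < i:
                continue  # skip pairs already counted
            d = dist.get(j)
            if d is not None and d <= max_lag:
                vector[d] += prop[i] * prop[j]
    return vector


# Hypothetical example: heavy-atom skeleton C0-C1-O2 (ethanol-like),
# with Pauling electronegativities as the atomic property.
skeleton = {0: [1], 1: [0, 2], 2: [1]}
electronegativity = {0: 2.55, 1: 2.55, 2: 3.44}
ac = autocorrelation_vector(skeleton, electronegativity, max_lag=2)
```

The vector length depends only on `max_lag`, not on the number of atoms or their numbering, which is what makes two arbitrary molecules directly comparable. Replacing bond counts with binned interatomic distances would give a three-dimensional variant of the same idea.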
Article
We used autocorrelation descriptors of the topological structure to correlate structure and activity in a population of 190 Glafenines and 106 Isoindomethacins. Principal Component Analysis of Glafenines described by their autocorrelation vectors shows a good partition between active and inactive molecules in the first factorial plane. The atomic property which makes the most significant contribution to this partition is electronegativity. These results are confirmed by Linear Discriminant Analysis which makes it possible to predict the activity class of randomly selected molecules with an 80% probability of success. The cloud of points representative of Indomethacin structures falls in the same region of space as the Glafenine population; moreover, the introduction of Indomethacins does not significantly modify the discrimination rate. Therefore, it can be postulated that the structural parameters which influence the biological activity are identical in both series.
Article
Molecular surface properties such as the electrostatic or the hydrophobicity potential were condensed into an autocorrelation descriptor. A vector of these autocorrelation descriptors based on the molecular electrostatic potential was successfully applied to modeling the affinities of a set of 31 steroid molecules binding to the corticosteroid binding globulin (CBG) receptor by using a combination of a Kohonen and a feedforward neural network. Similarly, an autocorrelation vector derived from the hydrophobicity potential was used to model the binding constant of a set of 78 polyhalogenated aromatic compounds to the cytosolic Ah receptor. The models found have a high predictive ability as established by cross-validation.
Article
Our discussion has centered around the kind of symbols of organic chemistry which the medicinal and biochemist might use in the discussion of structure-activity studies. The structural formulas of classical organic chemistry are very inadequate for such discussions. For example, computer programs are now being used to store and print out on demand all of the formulas of a particular type of drug. The chemist is then handed an enormous list of, say, all compounds active against malaria. The effect of looking at such a list of several thousand compounds is completely bewildering because our present symbolism in organic chemistry is not well suited for discussing the dynamics of organic reactions in nonhomogeneous systems (or even homogeneous ones). The English school of chemists started in the 1930's to formulate symbols such as ±I, ±D, etc., to describe the dynamic effects of substituents on organic reactions. A huge advance was made by Hammett, who showed how the qualitative symbols should be formulated in numerical terms. The extension of his breakthrough by Taft, Brown, Charton, and others has provided those who work with organic reactions with powerful statistical tools with which to evaluate organic reaction mechanisms. Our efforts have been directed toward finding the missing link which will allow us to bring to bear on biochemical reactions (often occurring in media of blood and guts) some of the powerful tools of physical organic chemistry. The evidence in hand is that log P or π can enable us to employ computers in a numerical analysis of biochemical structure-activity problems.
Article
Comparative molecular field analysis (CoMFA) is a promising new approach to structure/activity correlation. Its characteristic features are (1) representation of ligand molecules by their steric and electrostatic fields, sampled at the intersections of a three-dimensional lattice, (2) a new "field fit" technique, allowing optimal mutual alignment within a series, by minimizing the RMS field differences between molecules, (3) data analysis by partial least squares (PLS), using cross-validation to maximize the likelihood that the results have predictive validity, and (4) graphic representation of results, as contoured three-dimensional coefficient plots. CoMFA is exemplified by analyses of the affinities of 21 varied steroids to corticosteroid- and testosterone-binding globulins. Also described are the sensitivities of results to the nature of the field and the definition of the lattice and, for comparison, analyses of the same data using various combinations of other parameters. From these results, a set of ten steroid-binding affinity values unknown to us during the CoMFA analysis were well predicted.
Article
The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values. The probes include water, the methyl group, amine nitrogen, carboxy oxygen, and hydroxyl. Contour surfaces at appropriate energy levels are calculated for each probe and displayed by computer graphics together with the protein structure. Contours at negative energy levels delineate regions of attraction between probe and protein and are found at known ligand binding clefts in particular. The contours also enable other regions of attraction to be identified and facilitate the interpretation of protein-ligand energetics. They may, therefore, be of value for drug design.
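The idea of sampling a probe-target interaction energy on a lattice can be sketched as follows. This is a deliberately simplified illustration, not the energy function of the program described above: it assumes a 12-6 Lennard-Jones term plus a Coulomb term with a constant dielectric, and every parameter value (`epsilon`, `r_min`, `dielectric`, `r_floor`, the charges, the grid extent) is a hypothetical choice made for the example.

```python
import math

COULOMB_KCAL = 332.06  # converts e^2 / Angstrom to kcal/mol


def probe_energy(point, atoms, probe_charge,
                 epsilon=0.1, r_min=3.5, dielectric=4.0, r_floor=0.5):
    """Simplified probe-target energy (kcal/mol) at one grid point.

    Sums a 12-6 Lennard-Jones term and a Coulomb term over all atoms.
    All parameter values are illustrative assumptions.
    """
    energy = 0.0
    for x, y, z, charge in atoms:
        # Clamp the distance so a grid point on top of an atom stays finite.
        r = max(math.dist(point, (x, y, z)), r_floor)
        ratio6 = (r_min / r) ** 6
        energy += epsilon * (ratio6 * ratio6 - 2.0 * ratio6)
        energy += COULOMB_KCAL * probe_charge * charge / (dielectric * r)
    return energy


def interaction_field(atoms, probe_charge, lower, upper, spacing):
    """Return {(x, y, z): energy} for a regular lattice spanning lower..upper."""
    def axis(lo, hi):
        steps = int(round((hi - lo) / spacing))
        return [lo + k * spacing for k in range(steps + 1)]

    field = {}
    for x in axis(lower[0], upper[0]):
        for y in axis(lower[1], upper[1]):
            for z in axis(lower[2], upper[2]):
                field[(x, y, z)] = probe_energy((x, y, z), atoms, probe_charge)
    return field


# Hypothetical target: one atom with partial charge -0.5 at the origin,
# probed by a +1 charged probe on a 2 Angstrom lattice.
target = [(0.0, 0.0, 0.0, -0.5)]
field = interaction_field(target, probe_charge=1.0,
                          lower=(-4.0, -4.0, -4.0), upper=(4.0, 4.0, 4.0),
                          spacing=2.0)

# Grid points below a negative energy threshold mark regions of attraction.
attractive = [p for p, e in field.items() if e < -5.0]
```

Contouring the `field` values at a chosen negative level would outline the favourable regions; here the list `attractive` plays that role in the simplest possible way.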
Article
A primary goal in any drug design strategy is to predict the activity of new compounds. Comparative molecular field analysis (CoMFA) has been used in drug design and three-dimensional quantitative structure/activity relationship (3D-QSAR) methods. The CoMFA approach permits analysis of a large number of quantitative descriptors and uses chemometric methods such as partial least squares (PLS) to correlate changes in biological activity with changes in chemical structure. One of the characteristics of the 3D-QSAR method is the large number of variables which are generated in order to describe the nonbonded interaction energies between one or more probes and each drug molecule. Since it is difficult to know a priori which variables affect the biological activity of the compounds, much effort has been devoted to developing methods that optimize the selection of only those variables of importance. This work focuses on some of the aspects involved in the selection of such variables, applied to a series of glucose analogue inhibitors of glycogen phosphorylase b, using the program GRID to describe the molecular structures and using a method of generating optimal partial least squares estimations (program GOLPE) as the chemometric tool. This data set, consisting of over 30 compounds in which the three-dimensional ligand-enzyme bound structures are known, is well suited to study the effect of different data pretreatment procedures on the final model used for the prediction of new drug molecules. By relying on our knowledge of the real physical problem (i.e., using the combined crystallographic and kinetic results), it has been shown that suitable data pretreatment and variable selection procedures can be found that do not result in a significant loss of relevant information. Moreover, by using an appropriate scaling procedure, GOLPE variable selection minimizes the risk of overfitting and overpredicting.
Article
3D-QSAR procedures utilize descriptors that characterize molecular shape and charge distributions responsible for the steric and electrostatic nonbonding interactions intimately involved in ligand-receptor binding. Comparative molecular moment analysis (CoMMA) utilizes moments of the molecular mass and charge distributions up to and including second order in the development of molecular similarity descriptors. As a consequence, two Cartesian reference frames are then defined with respect to each molecular structure. One frame is the principal inertial axes calculated with respect to the center-of-mass. For neutrally charged molecular species, the other reference frame is the principal quadrupolar axes calculated with respect to the molecular "center-of-dipole." QSAR descriptors include quantities that characterize shape and charge independently as well as quantities that characterize their relationship. 3D-QSAR partial least squares (PLS) cross-validation procedures are utilized to predict the activity of several training sets of molecules previously investigated. This is the first time that molecular electrostatic quadrupolar moments have been utilized in a 3D-QSAR analysis, and it is shown that descriptors involving the quadrupolar moments and related quantities are required for the significant cross-validated predictive r² values obtained. CoMMA requires no superposition step, i.e., no step requiring a comparison between two molecules at any stage of the 3D-QSAR calculation.
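To make the notion of low-order moments concrete, the sketch below computes the total mass, the center of mass, the principal moments of inertia, and the dipole vector for a set of point atoms. It covers only a subset of the quantities named in the abstract (no quadrupolar terms and no center-of-dipole frame), and the data layout, function names, and the two-atom example are assumptions made for illustration. The eigenvalues are obtained with the standard closed-form trigonometric method for symmetric 3x3 matrices.

```python
import math


def center_of_mass(atoms):
    """atoms: list of (mass, charge, (x, y, z)). Returns (total mass, center)."""
    total = sum(m for m, _, _ in atoms)
    com = tuple(sum(m * pos[k] for m, _, pos in atoms) / total for k in range(3))
    return total, com


def inertia_tensor(atoms, origin):
    """Second-order mass moment: I_ij = sum m * (r^2 * delta_ij - r_i * r_j)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for mass, _, pos in atoms:
        r = [pos[k] - origin[k] for k in range(3)]
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                tensor[i][j] += mass * ((r2 if i == j else 0.0) - r[i] * r[j])
    return tensor


def dipole(atoms, origin):
    """First-order charge moment about `origin`."""
    return tuple(sum(q * (pos[k] - origin[k]) for _, q, pos in atoms)
                 for k in range(3))


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, ascending (closed-form method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return sorted([smallest, 3.0 * q - largest - smallest, largest])


# Hypothetical two-atom example with unit masses and opposite partial charges.
molecule = [(1.0, 0.2, (1.0, 1.0, 0.0)), (1.0, -0.2, (-1.0, -1.0, 0.0))]
total_mass, com = center_of_mass(molecule)
principal_moments = symmetric_eigenvalues(inertia_tensor(molecule, com))
mu = dipole(molecule, com)
```

Because the principal moments and the dipole magnitude do not change when the molecule is rotated or translated, descriptors of this kind need no superposition of one molecule onto another.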
Article
This report describes a new methodology aimed at grouping 3D-QSAR interaction energy descriptors into regions of neighbor variables bearing the same chemical and statistical information. These regions represent the structural variability of the series better than individual descriptor variables and can advantageously replace them in the chemometric analysis. The algorithm used to generate such regions is described, together with their application for improving the quality of GOLPE variable selection. The method is illustrated on a series of 47 glucose analogues, inhibitors of glycogen phosphorylase b, and is shown to improve both the predictive ability and the interpretability of the 3D-QSAR models obtained, comparing favorably with other methods previously described.
Article
Water present in a ligand binding site of a protein has been recognized to play a major role in ligand-protein interactions. To date, rational drug design techniques do not usually incorporate the effect of these water molecules into the design strategy. This work represents a new strategy for including water molecules into a three-dimensional quantitative structure-activity relationship analysis using a set of glucose analogue inhibitors of glycogen phosphorylase (GP). In this series, the structures of the ligand-enzyme complexes have been solved by X-ray crystallography, and the positions of the ligands and the water molecules at the ligand binding site are known. For the structure-activity analysis, some water molecules adjacent to the ligands were included into an assembly which encompasses both the inhibitor and the water involved in the ligand-enzyme interaction. The mobility of some water molecules at the ligand binding site of GP gives rise to differences in the ligand-water assembly which have been accounted for using a simulation study involving force-field energy calculations. The assembly of ligand plus water was used in a GRID/GOLPE analysis, and the models obtained compare favorably with equivalent models when water was excluded. Both models were analyzed in detail and compared with the crystallographic structures of the ligand-enzyme complexes in order to evaluate their ability to reproduce the experimental observations. The results demonstrate that incorporation of water molecules into the analysis improves the predictive ability of the models and makes them easier to interpret. The information obtained from interpretation of the models is in good agreement with the conclusions derived from the structural analysis of the complexes and offers valuable insights into new characteristics of the ligands which may be exploited for the design of more potent inhibitors.
Article
A series of novel conformationally restricted butyrophenones (2-(aminoethyl)- and 3-(aminomethyl)thieno- or benzocycloalkanones bearing (6-fluorobenzisoxazolyl)piperidine, (p-fluorobenzoyl)piperidine, (o-methoxyphenyl)piperazine, or linear butyrophenone fragments) were prepared and evaluated as atypical antipsychotic agents by in vitro assays of affinity for dopamine receptors (D1, D2) and serotonin receptors (5-HT2A, 5-HT2C) and by in vivo assays of antipsychotic potential and the risk of inducing extrapyramidal side effects. Potency and selectivity depended mainly on the amine fragment connected to the cycloalkanone structure. As a group, compounds with a benzisoxazolyl fragment had the highest 5-HT2A activities, followed by the benzoylpiperidine derivatives; in general, alpha-substituted cycloalkanone derivatives were more active than the corresponding beta-substituted congeners. CoMFA (comparative molecular field analysis) and docking studies showed electrostatic, steric, and lipophilic determinants of 5-HT2A and D2 affinities and 5-HT2A/D2 selectivity. The in vitro and in vivo pharmacological profiles of N-[(4-oxo-4H-5,6-dihydrocyclopenta[b]thiophene-5-yl)ethyl]-4-(6-fluorobenzisoxazol-3-yl)piperidine (23b, QF 0510B), N-[(4-oxo-4,5,6,7-tetrahydrobenzo[b]thiophene-5-yl)ethyl]-4-(6-fluorobenzisoxazol-3-yl)piperidine (24b, QF 0610B), and N-[(7-oxo-4,5,6,7-tetrahydrobenzo[b]thiophene-6-yl)ethyl]-4-(6-fluorobenzisoxazol-3-yl)piperidine (29b, QF 0902B) suggest that they may be effective antipsychotic drugs with low propensity to induce extrapyramidal side effects.
Clementi, M.; Clementi, S.; Clementi, S.; Cruciani, G.; Pastor, M. Chemometric Detection of Binding Sites of 7TM Receptors. In Molecular Modelling and Prediction of Bioreactivity; Gundertofte, K., Jorgensen, F. S., Eds.; Kluwer Academic/Plenum Publishers: New York, 2000; pp 207-212.
Kubinyi, H., Folkers, G., Martin, Y. C., Eds.; KLUWER/ESCOM: Dordrecht, 1998; Vol. 3, pp 199-213.
(22) Dataset of 31 steroids binding to the corticosteroid-binding globulin (CBG) receptor. http://www2.ccc.uni-erlangen.de/services/steroids/index.html.
(23) CORINA; Molecular Networks GmbH Computerchemie: Langemarckplatz 1, Erlangen, Germany, 1997.
JM000941M. GRIND. Journal of Medicinal Chemistry, 2000, Vol. 43, No. 17, 3243.
Cruciani, G.; Clementi, S.; Baroni, M. Variables Selection in PLS Analysis. In 3D QSAR in Drug Design, Theory, Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, 1993; pp 551-564.
Wold, S. Autocorrelation as a tool for a congruent description of molecules in 3D-QSAR studies. Pharm. Pharmacol. Lett. 1993, 3, 5-8.