Figure 2 - uploaded by Arnout Voet
Overview of the bit-mask construction in EleKit. (1) A near-or-inside mask of R_P is created, (2) a near-but-not-inside mask of L_P is created, (3) a near-but-not-inside mask of L_SM is created, and (4) the logical conjunction of the three masks is used to select points to correlate from the electrostatic potentials of L_P and L_SM.


Source publication
Article
Full-text available
One of the underlying principles in drug discovery is that a biologically active compound is complementary in shape and molecular recognition features to its receptor. This principle implies that molecules binding to the same receptor may share some common features. Here, we have investigated whether the electrostatic similarity can be used for the...

Context in source publication

Context 1
... advent of the ‘omics’ era, it has become clear that most proteins do not act in solitude but depend on Protein-Protein Interactions (PPIs) to exert their biological function. It has been estimated that the number of PPIs in humans ranges from ~130,000 [1] to ~650,000 [2], and these PPIs are crucial for the regulation of many biological processes. PPIs are often involved in processes associated with diseases; therefore, targeting PPIs with small molecule PPI inhibitors (SMPPIIs) opens a pipeline for the development of novel drug classes against a variety of diseases. While many small molecule drugs targeting enzymes, nuclear receptors, ion channels and G-protein coupled receptors have been developed, the number of reported successes in the discovery of SMPPIIs remains fairly low. As a matter of fact, PPIs were once thought to be high-hanging fruit for drug discovery [3]. PPIs were even considered to be undruggable, mostly because of their relatively flat but extensive interfaces [4]. Despite this initial view, an increasing number of SMPPIIs have been reported in recent years [5]. However, the number of deposited 3D SMPPII receptor complex structures remains far more limited than the number of reported successful cases. This hinders the understanding of their mechanism of action and chemical space properties [6]. Commonly used methods for screening are computational docking [7] and pharmacophore-based screening [8]. It was observed that the crucial interactions between a protein ligand and its protein receptor are often similar to those between the SMPPII and the protein receptor [9,10]. Thus, the PPI interface can be used to create a pharmacophore query to screen for small molecule ligands [11,12]. Another approach is to exploit the principle of electrostatic complementarity in molecular recognition. Next to steric complementarity, electrostatics are one of the main driving forces involved in molecular recognition [13].
Despite the complex biophysical nature of the electrostatic potential, calculations for macromolecular systems are nowadays tractable [14,15]. Electrostatics are known to play a key role in protein-DNA [16], protein-protein [17] and protein-substrate [13] recognitions. Given the importance of electrostatics for the molecular recognition event, electrostatics have been used to study protein similarity [18–20] and the nature of protein-protein interactions [17,21–24]. More specifically, the electrostatic complementarity between protein-protein interfaces has long been a subject of investigation [22,23]. Using the correlation of electrostatic potentials as a quantitative measure, the electrostatic complementarity between PPI interfaces has been demonstrated [17,24]. Other studies focused on the conservation of the electrostatic potentials through evolution [25] and its role in molecular association kinetics [26]. It is generally accepted that there is a high degree of complementarity in shape and electrostatics between a ligand and its receptor. This implies that molecules with similar shape and electrostatic properties may bind to the same receptor. This principle has been used to identify small molecule inhibitors similar to natural substrates or known inhibitors by screening for compounds with similar shape, volume and electrostatics [27–30]. An SMPPII cannot occupy the same shape and volume as its much bigger protein-ligand counterpart. However, it can still be assumed that there is some local electrostatic potential similarity between an SMPPII and a ligand protein, since they recognize the same binding site on the receptor. A recent example of the usefulness of taking electrostatic potential similarity into account while designing an SMPPII can be found in the work of Cavalluzo et al. [31], where an SMPPII was designed de novo by including electrostatic similarity. 
This success has motivated our effort to systematically investigate the complementarity in electrostatic potential between small molecules and protein ligands binding to the same protein receptor, and its potential use to assist in the rational design of SMPPIIs. For this purpose, a tool named EleKit was developed. To compute the partial charges and electrostatic potentials, EleKit builds upon PDB2PQR [32] and APBS [15]. EleKit requires two sets of complex structures in order to calculate the electrostatic similarity between a protein ligand and a small molecule ligand: (i) the PPI complex of the protein ligand (L_P) with the protein receptor (R_P) and (ii) a small molecule ligand (L_SM) in its predicted or experimentally determined conformation on the protein receptor (R_P). The EleKit method is shown schematically in figure 1. First, the electrostatic potentials around L_P and L_SM are computed using APBS (parameters listed in table 1) and stored in 3D grids. Since the area where L_P and L_SM intersect is the most likely to be relevant for molecular recognition, a bit mask is created on the electrostatic potential grids (figure 2). The goal of this mask is to take into account only those points in space that are not only in the solvent region around L_P and L_SM but also near the interface atoms of R_P. To create this mask, a distance cutoff is needed. This distance is used when dilating the molecular surface (dilation is an operation from mathematical morphology). Based on the hydrogen bond length (~2.5 Å) and the facts that enough points are needed for correlation and that the local similarity is our focus, a cutoff value ranging from 1.4 Å to 3.5 Å seems reasonable. All experiments reported in this study were performed with an intermediate cutoff value of 2.0 Å. Using 3.0 Å or 4.0 Å would have very little impact on the results (data not shown).
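The mask construction described above can be illustrated with a minimal Python sketch. It is not EleKit's implementation (EleKit is written in OCaml and dilates the molecular surface). As a simplifying assumption, each molecule is modelled here as a union of atom spheres, and "near" means within the atom radius plus the cutoff. The function names, the `(center, radius)` atom representation and the toy coordinates are all hypothetical.

```python
import math

# Assumption: a molecule is a list of (center, radius) pairs, with center an
# (x, y, z) tuple in angstroms. This sphere model only approximates the
# molecular-surface dilation mentioned in the text.

def inside(point, atoms):
    """True if the point lies within the sphere of at least one atom."""
    return any(math.dist(point, center) <= radius for center, radius in atoms)

def near_or_inside(point, atoms, cutoff):
    """True if the point lies within (radius + cutoff) of at least one atom."""
    return any(math.dist(point, center) <= radius + cutoff
               for center, radius in atoms)

def near_not_inside(point, atoms, cutoff):
    """True if the point is close to the molecule but outside all its atoms."""
    return near_or_inside(point, atoms, cutoff) and not inside(point, atoms)

def interface_mask(grid_points, receptor, protein_ligand, small_molecule,
                   cutoff=2.0):
    """Logical conjunction of the three masks, one boolean per grid point."""
    return [
        near_or_inside(p, receptor, cutoff)
        and near_not_inside(p, protein_ligand, cutoff)
        and near_not_inside(p, small_molecule, cutoff)
        for p in grid_points
    ]

# Toy example with one atom per molecule (coordinates are made up).
receptor = [((0.0, 0.0, 0.0), 1.5)]
protein_ligand = [((4.0, 0.0, 0.0), 1.5)]
small_molecule = [((4.0, 1.0, 0.0), 1.5)]
points = [(2.0, 0.0, 0.0), (4.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
mask = interface_mask(points, receptor, protein_ligand, small_molecule)
```

In this toy setup only the first point is selected: it is near the receptor atom and in the solvent region near both ligands, while the other points are either inside a ligand atom or too far from one of the three molecules.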
Finally, the similarity between electrostatic potentials of L_P and L_SM is assessed by correlating values at the grid points within the mask using the Spearman rank-order correlation coefficient (r). Additional similarity scores (Carbo index [33], Hodgkin index [34], Pearson’s r and a Tanimoto score) are also calculated. EleKit is written in OCaml [35] and computations are parallelized by the Parmap library [36]. Experiments were run on Linux computing nodes with 2.4 GHz Intel Xeon processors. EleKit takes between ten seconds and two minutes per ligand molecule and can parallelize the computation of several ligands when run on a multi-core machine. EleKit is released as open source and available from the authors’ website.
The EleKit method was applied to analyze previously reported cases of SMPPIIs for which accurate structures of the PPI as well as the SMPPII receptor complex are available in the PDB (table 2). Additionally, the SMPPIIs are required to bind in the PPI interface, allowing for a substantial overlap between the protein ligand and the SMPPII and thus excluding allosteric inhibition mechanisms.
The approach used in EleKit to compare electrostatic potentials resembles what has been done previously on proteins [18–21]. Analysis of Electrostatic Similarities of Proteins (AESOP) [20], the method of Dlugosz et al. [19] and Protein Interaction Property Similarity Analysis (PIPSA) [18,37] also use APBS as their electrostatic computation engine. PIPSA can also use University of Houston Brownian Dynamics [38] (UHBD). While EleKit relies on the Spearman rank-order correlation coefficient (as do McCoy et al. [17]), PIPSA uses the Hodgkin index [34] to numerically assess the similarity of electrostatic potentials. AESOP uses the Average Normalized Difference [39]. The method of Dlugosz et al. [19] approximates the electrostatic potential with spherical harmonics and uses a similarity index specifically designed to compare the obtained rotation-invariant descriptors. EleKit, similarly to several other methods [18,19,37], uses boolean masks to select a region over which electrostatic potentials are compared. All methods vary in the way masks are constructed.
Electrostatic similarity analysis for these different SMPPII-related structures indicates that several exhibit correlation. In general, correlation between electrostatic potentials of SMPPIIs and electrostatic potentials of the respective ligand proteins is observed (table 2). This is especially true for the SMPPIIs targeting the HDM2:p53, HIV-1 Integrase:LEDGF/p75, Integrin:Fibrinogen, IL2:IL2R and XIAP:smac interactions. The highest similarity between a protein ligand and a small molecule ligand can be observed in the HIV-1 Integrase:LEDGF/p75 and the Integrin:Fibrinogen interactions and their respective inhibitors. In these cases, r is on average ~0.52 and ~0.73 respectively (table 2). The origin of these classes of SMPPIIs can be traced back to pharmacophore-based discovery of lead compounds designed to mimic the interactions observed at the PPI interface [40,41]. For the inhibitors of the HDM2:p53 interaction, the majority of the inhibitors exhibit electrostatic potential similarity. However, a few show low correlations (r < 0.2) and in one case even some anti-correlation (r ≈ −0.15). Interestingly, the Tanimoto score shows similarity in all HDM2:p53 cases. The electrostatic potentials between inhibitors and protein ligands in ZipA:FtsZ and VHL:HIF1 still correlate, although less strongly than in other cases. These inhibitors are observed to be less active when tested. For inhibitors targeting the XIAP:smac interaction, which originated from peptidomimetic design, some compounds exhibit lower similarity than expected.
This can be explained by the divergence of conformations of the receptor protein, since the XIAP:smac complex was solved by NMR while the structures of XIAP bound to inhibitors were solved by X-ray crystallography. PPI complexes solved by NMR spectroscopy are more difficult to superpose onto the crystal structure conformation obtained for the SMPPII complex. The inhibitors of the IL2:IL2R interaction are well known for binding to the IL2R interface by causing a rotameric change of a ...
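The similarity scores named in the context above (Spearman rank-order correlation, Pearson's r, Carbo index and Hodgkin index) can be sketched in plain Python as follows. This is an illustrative sketch, not EleKit's code: the function names and the list-based representation of the masked grid values are assumptions, and the Tanimoto score is left out because its exact definition is not given in this excerpt. The Carbo and Hodgkin indices use their standard textbook forms.

```python
import math

def masked(values, mask):
    """Keep only the potential values at grid points selected by the mask."""
    return [v for v, keep in zip(values, mask) if keep]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

def carbo(xs, ys):
    """Carbo index: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    dot = sum(x * y for x, y in zip(xs, ys))
    return dot / math.sqrt(sum(x * x for x in xs) * sum(y * y for y in ys))

def hodgkin(xs, ys):
    """Hodgkin index: 2 * sum(x*y) / (sum(x^2) + sum(y^2))."""
    dot = sum(x * y for x, y in zip(xs, ys))
    return 2 * dot / (sum(x * x for x in xs) + sum(y * y for y in ys))

# Toy potentials on four masked grid points (values are made up).
phi_protein = [1.0, 2.0, 3.0, 4.0]
phi_small = [1.0, 4.0, 9.0, 16.0]
rho = spearman(phi_protein, phi_small)
```

Because Spearman's coefficient works on ranks, it rewards any monotonic relationship between the two potentials, whereas the Hodgkin index also penalizes differences in magnitude. The Carbo index is insensitive to a uniform scaling of either potential.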

Similar publications

Article
Full-text available
Bis-polyaza pyridinophane scorpiands bind nucleotides in aqueous medium with 10-100 micromolar affinity, predominantly by electrostatic interactions between nucleotide phosphates and protonated aliphatic amines and assisted by aromatic stacking interactions. The pyridine-scorpiand receptor showed rare selectivity toward CMP with respect to other nu...

Citations

... [35] Furthermore, electrostatic complementarity at the protein-ligand interface is recognized as an important factor for predicting and optimizing ligand affinity and selectivity. [36,37] We reasoned that the ionic PROXYL probes could also be useful for the detection of protein-ligand interactions, in particular for ligands bearing charged chemical groups, since those will significantly perturb the protein electrostatic potential in the binding site. Here, we extend the PROXYL method to the investigation of protein-ligand complexes and apply it to two protein-ligand systems, the protein interleukin-8 (IL-8), interacting with glycosaminoglycans (GAGs), and the src-homology 2 (SH2) domain of the growth factor receptor-bound protein 2 (Grb2), interacting with phosphotyrosine peptides. ...
Article
Full-text available
NMR spectroscopy techniques can provide important information about protein‐ligand interactions. Here we tested an NMR approach which relies on the measurement of paramagnetic relaxation enhancements (PREs) arising from analogous cationic, anionic or neutral soluble nitroxide molecules, which distribute around the protein‐ligand complex depending on near‐surface electrostatic potentials. We applied this approach to two protein‐ligand systems, interleukin‐8 interacting with highly charged glycosaminoglycans and the SH2 domain of Grb2 interacting with less charged phospho‐tyrosine tripeptides. The electrostatic potential around interleukin‐8 and its changes upon binding of glycosaminoglycans could be derived from the PRE data and confirmed by theoretical predictions from Poisson‐Boltzmann calculations. The ligand influence on the PREs and NMR‐derived electrostatic potentials of Grb2 SH2 was localized to a narrow protein region which allowed the localization of the peptide binding pocket. Our analysis suggests that experiments with nitroxide cosolutes can be useful for investigating protein‐ligand electrostatic interactions and mapping ligand binding sites.
... 60 SMPPII are found to copy the natural interaction not only in terms of shape and chemistry, but even at the electrostatic potential level. 61 This mimicry suggests that the pharmacophore queries created from PPI complex structures can be used to identify SMPPII via virtual screening [64][65] . ...
Article
Full-text available
The pharmacophore concept was first put forward as a useful picture of drug interactions almost a century ago, and with the rise in computational power over the last few decades, has become a well-established CADD method with numerous different applications in drug discovery. Depending on the prior knowledge of the system, pharmacophores can be used to identify derivatives of compounds, change the scaffold to new compounds with a similar target, virtual screen for novel inhibitors, profile compounds for ADME-tox, investigate possible off-targets, or just complement other molecular methods “chemical groups” or functions in a molecule were responsible for a biological effect, and molecules with similar effect had similar functions in common. The word pharmacophore was coined much later, by Schueler in his 1960 book Chemobiodynamics and Drug Design, and was defined as “a molecular framework that carries (phoros) the essential features responsible for a drug’s (Pharmacon) biological activity.
... The recognition and binding of a protein with other biomolecules (ligand, receptor, or antibody) depends on different chemical and physical factors, mainly electrostatic energy, whose contribution is the result of Coulomb interactions between the molecules (Voet et al., 2013). The electric field in the active site of a protein regulates its catalytic activity and determines the relative binding orientations; in addition, the surface of the proteins and the interface generated after the interaction have many polar and charged residues (Sheinerman et al., 2000). ...
... [29] This mimicry suggests that the pharmacophore searches created from PPI complex structures can be used to identify SMPPII via virtual screening [30]. Different methods can be employed to design the pharmacophore features onto the amino acids present at the PPI interface. ...
Article
Drug discovery and design is a very challenging, expensive and time-consuming process. In silico approaches involving computational tools and methodologies have become part of the drug design and discovery process, from drug target search and selection to lead optimization. Over the last few years, quantitative structure-activity relationship (QSAR) modelling has become an essential tool for new lead identification, design and optimization, and for discovering reliable predictive models. This review article focuses on a summarised overview of ligand-based drug design approaches using advanced computational techniques such as pharmacophore modelling and modern QSAR techniques, along with recent developments in this field and their application in new drug discovery for therapeutic purposes. The review concludes with an outlook on the scope and challenges of rational drug design using QSAR studies.
... Chemical compounds often complement their protein target in shape and electrostatics 55 . This implies that chemical compounds with similar shape and electrostatic properties may bind to the same receptor 55 . ...
... Chemical compounds often complement their protein target in shape and electrostatics 55 . This implies that chemical compounds with similar shape and electrostatic properties may bind to the same receptor 55 . This principle has been used to identify small molecule inhibitors similar to natural substrates or known inhibitors by screening for compounds with similar shape, volume and electrostatics 55,56,57 . ...
... This implies that chemical compounds with similar shape and electrostatic properties may bind to the same receptor 55 . This principle has been used to identify small molecule inhibitors similar to natural substrates or known inhibitors by screening for compounds with similar shape, volume and electrostatics 55,56,57 . ...
Article
Full-text available
Clathrin-mediated endocytosis (CME) is a normal biological process where cellular contents are transported into the cells. However, this process is often hijacked by different viruses to enter host cells and cause infections. Recently, two proteins that regulate CME – AAK1 and GAK – have been proposed as potential therapeutic targets for designing broad-spectrum antiviral drugs. In this work, we curated two compound datasets containing 83 AAK1 inhibitors and 196 GAK inhibitors, respectively. Subsequently, machine learning methods, namely Random Forest, Elastic Net and Sequential Minimal Optimization, were used to construct Quantitative Structure Activity Relationship (QSAR) models to predict small molecule inhibitors of AAK1 and GAK. To ensure predictivity, these models were evaluated by using Leave-One-Out (LOO) cross validation and with an external test set. In all cases, our QSAR models achieved a q2LOO in range of 0.64 to 0.84 (Root Mean Squared Error; RMSE = 0.41 to 0.52) and a q2ext in range of 0.57 to 0.92 (RMSE = 0.36 to 0.61). Besides, our QSAR models were evaluated by using additional QSAR performance metrics and y-randomization test. Finally, by using a consensus scoring approach, nine chemical compounds from the Drugbank compound library were predicted as AAK1/GAK dual-target inhibitors. The electrostatic potential maps for the nine compounds were generated and compared against two known dual-target inhibitors, sunitinib and baricitinib. Our work provides the rationale to validate these nine compounds experimentally against the protein targets AAK1 and GAK.
... These ligands can be macromolecules like other proteins, DNA, RNA and/or small endogenous molecules like neurotransmitters and ions or other organic molecules like carbohydrates. The feasibility and strength of such molecular recognitions mediated by binding of the partner molecules (i.e., protein and ligand) are regulated by complementarity in shape and electrostatic features of the interacting regions on the surface of each molecule (Sowdhamini et al. 1995;Voet et al. 2013). These features, i.e., shape and electrostatic fingerprints, are presented by the arrangement of the amino acid residues in the 3-D structure of proteins. ...
Chapter
Full-text available
Most of the therapeutic drugs available in the market today, are targeted against proteins. Drug molecules are designed to complement shape, size and electrostatic fingerprints of the functional site of a target protein so that they can bind to the protein and impede its molecular function. Details of functional site are derived from 3-D structure of the protein obtained either through experimental techniques or computational protein modeling and form the basis for structure-based drug design. Knowledge derived from homologous proteins facilitates this process by providing an understanding on common and unique features of the intended target with respect to its close and distant relatives. This helps to design a drug with high selectivity and affinity. Often inherent dynamic nature of proteins facilitates inter-protein interactions and aid them to perform major cellular activities as an assembled complex. With improved apprehension of structural biology, consideration of multi-protein machineries and their associated conformational dynamics is increasingly gaining importance in drug design and discovery. Susceptibility of protein-protein interactions in disease conditions is progressively being realized and this has attracted protein-protein interfaces as potential drug targets for therapeutic intervention in the last few decades. In this chapter, we have discussed the properties of protein structure, evolution, dynamics and protein complexes along with explanations on how each factor contributes to the design of an effective drug molecule that is safe and efficacious.
... Electrostatic interactions between protein residues and the solvent are one of the key determinants of protein folding and stability (Strickler et al., 2006;Zhou & Pang, 2018). The electrostatics also regulates protein interactions with other proteins as well as with molecules other than proteins, such as small-molecule drugs (Voet et al., 2013). The electrostatic features of the protein are determined by the distribution of whole and partial charges across the 3D protein structure (Vascon et al., 2020). ...
Article
Full-text available
Here, we report on a computational comparison of the receptor-binding domains (RBDs) on the spike proteins of severe respiratory syndrome coronavirus‐2 (SARS-CoV-2) and SARS-CoV in free forms and as complexes with angiotensin-converting enzyme 2 (ACE2) as their receptor in humans. The impact of 42 mutations discovered so far on the structure and thermodynamics of SARS-CoV-2 RBD was also assessed. The binding affinity of SARS-CoV-2 RBD for ACE2 is higher than that of SARS-CoV RBD. The binding of COVA2-04 antibody to SARS-CoV-2 RBD is more energetically favorable than the binding of COVA2-39, but also less favorable than the formation of SARS-CoV-2 RBD-ACE2 complex. The net charge, the dipole moment and hydrophilicity of SARS-CoV-2 RBD are higher than those of SARS-CoV RBD, producing lower solvation and surface free energies and thus lower stability. The structure of SARS-CoV-2 RBD is also more flexible and more open, with a larger solvent-accessible surface area than that of SARS-CoV RBD. Single-point mutations have a dramatic effect on distribution of charges, most prominently at the site of substitution and its immediate vicinity. These charge alterations alter the free energy landscape, while X→F mutations exhibit a stabilizing effect on the RBD structure through π stacking. F456 and W436 emerge as two key residues governing the stability and affinity of the spike protein for its ACE2 receptor. These analyses of the structural differences and the impact of mutations on different viral strains and members of the coronavirus genera are an essential aid in the development of effective therapeutic strategies. Communicated by Ramaswamy H. Sarma
... However, the electrostatic distribution of the nonbonded regions is still smaller than the bonded regions. Then, we think that electrostatic complementation helps the binding of proteins to small molecules [34,35]. Although previous studies have found that hydrogen bond plays an important role in ligand and protein binding, we have no statistically significant difference in the distribution of hydrogen bonds between binding and unbinding regions [36,37]. ...
Article
Full-text available
The analysis and prediction of small molecule binding sites is very important for drug discovery and drug design. The traditional experimental methods for detecting small molecule binding sites are usually expensive and time consuming, and the tools for single species small molecule research are equally inefficient. In recent years, some algorithms for predicting binding sites of protein-small molecules have been developed based on the geometric and sequence characteristics of proteins. In this paper, we have proposed SmoPSI, a classification model based on the XGBoost algorithm for predicting the binding sites of small molecules, using protein sequence information. The model achieved better results with an AUC of 0.918 and an ACC of 0.913. The experimental results demonstrate that our method achieves high performances and outperforms many existing predictors. In addition, we also analyzed the binding residues and nonbinding residues and finally found the PSSM; hydrophilicity, hydrophobicity, charge, and hydrogen bonding have obviously different effects on the binding-site predictions.
... Figure 9. Correlation of EC scores with bioactivities for the Mcl-1 chlorine scan subset. EC maps show that 6-chloro substitution (29) shows the most favorable electrostatics due to formation of a halogen bond with the Ala227 backbone carbonyl group. In both EC and EC R plots, compounds with 6-Cl substitution (data points shown as black ■) are enriched at high EC scores. Figure 11. ...
... Correlation of EC scores with bioactivities for the Mcl-1 chlorine scan subset. EC maps show that 6-chloro substitution(29) shows the most favorable electrostatics due to formation of a halogen bond with the Ala227 backbone carbonyl group. In both EC and EC R plots, compounds with 6-Cl substitution (data points shown as black ■) are enriched at high EC scores. ...
Article
Electrostatic interactions between small molecules and their respective receptors are essential for molecular recognition and are also key contributors to the binding free energy. Assessing the electrostatic match of protein-ligand complexes therefore provides important insights into why ligands bind and what can be changed to improve binding. Ideally, ligand and protein electrostatic potentials at the protein-ligand interaction interface should maximize their complementarity while minimizing desolvation penalties. In this work, we present a fast and efficient tool to calculate and visualize the electrostatic complementarity (EC) of protein-ligand complexes. Using several benchmark sets compiled from mainly electrostatically driven SAR, including data of the PPI target XIAP and the GPCR mGLU5, we demonstrate that the EC method can visualize, rationalize, and predict electrostatically driven ligand affinity changes and help to predict compound selectivity. The methodology presented here for analysis of electrostatic complementarity is a powerful and versatile tool for drug design.
... Most of its academic users work in computer science, on compilers and formal methods. But, OCaml is also used in bioinformatics [21][22][23], structural bioinformatics [24][25][26][27], chemoinformatics [14,28], systems biology [29][30][31][32] and ecotoxicology [33]. ...
... EleKit [26,27] was the first structural bioinformatics software able to measure the similarity of a ligand's electrostatic field with that of a protein binding at a protein-protein interface (Fig. 5). Ligands showing a high similarity in this setting are potential drugs breaking protein-protein interactions. ...
... Electrostatic potential fields are calculated and stored in distinct grids ( 2 A and 2 B ). A boolean mask in 3D is created to select the solvent region nearby the interface ( 3 A and 3 B ). Finally, the similarity between electrostatic potentials in the masked region ( 4 A and 4 B ) is calculated using the Spearman rank correlation coefficient (figure adapted from Voet [26]) ...
Article
Full-text available
Background: OCaml is a functional programming language with strong static types, Hindley-Milner type inference and garbage collection. In this article, we share our experience in prototyping chemoinformatics and structural bioinformatics software in OCaml. Results: First, we introduce the language, list entry points for chemoinformaticians who would be interested in OCaml and give code examples. Then, we list some scientific open source software written in OCaml. We also present recent open source libraries useful in chemoinformatics. The parallelization of OCaml programs and their performance is also shown. Finally, tools and methods useful when prototyping scientific software in OCaml are given. Conclusions: In our experience, OCaml is a programming language of choice for method development in chemoinformatics and structural bioinformatics.