Article

Soft protein-protein docking in internal coordinates


Abstract

The association of two biological macromolecules is a fundamental biological phenomenon and an unsolved theoretical problem. Docking methods for ab initio prediction of association of two independently determined protein structures usually fail when they are applied to a large set of complexes, mostly because of inaccuracies in the scoring function and/or difficulties in simulating the rearrangement of the interface residues on binding. In this work we present an efficient pseudo-Brownian rigid-body docking procedure followed by Biased Probability Monte Carlo Minimization of the ligand interacting side-chains. The use of a soft interaction energy function precalculated on a grid, instead of the explicit energy, drastically increased the speed of the procedure. The method was tested on a benchmark of 24 protein-protein complexes in which the three-dimensional structures of their subunits (bound and free) were available. The rank of the near-native conformation in a list of candidate docking solutions was <20 in 85% of complexes with no major backbone motion on binding. Among them, as many as 7 out of 11 (64%) protease-inhibitor complexes can be successfully predicted as the highest rank conformations. The presented method can be further refined to include the binding site predictions and applied to the structures generated by the structural proteomics projects. All scripts are available on the Web.
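The idea of a soft interaction energy precalculated on a grid can be illustrated with a minimal sketch. It is not the authors' implementation: the 12-6 Lennard-Jones form, the parameter values (`epsilon`, `sigma`), the truncation level `e_max`, the trilinear interpolation, and all function names are assumptions chosen for illustration. The receptor field is tabulated once, so each ligand atom costs a single interpolated lookup instead of a sum over all receptor atoms.

```python
import math

def soft_lj(r, epsilon=0.2, sigma=3.5, e_max=1.0):
    """12-6 Lennard-Jones pair energy, truncated at e_max (illustrative values)."""
    r = max(r, 1e-3)  # avoid division by zero at an atom centre
    sr6 = (sigma / r) ** 6
    return min(4.0 * epsilon * (sr6 * sr6 - sr6), e_max)

def build_grid(receptor_atoms, origin, spacing, shape):
    """Precompute the soft receptor field at every node of a regular grid."""
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                node = (origin[0] + i * spacing,
                        origin[1] + j * spacing,
                        origin[2] + k * spacing)
                grid[i][j][k] = sum(soft_lj(math.dist(node, a))
                                    for a in receptor_atoms)
    return grid

def grid_energy(grid, origin, spacing, point):
    """Trilinear interpolation of the precomputed field at one ligand atom."""
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    base, frac = [], []
    for d in range(3):
        f = (point[d] - origin[d]) / spacing
        i = min(max(int(math.floor(f)), 0), dims[d] - 2)  # clamp to a valid cell
        base.append(i)
        frac.append(min(max(f - i, 0.0), 1.0))
    energy = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                w = ((frac[0] if di else 1.0 - frac[0]) *
                     (frac[1] if dj else 1.0 - frac[1]) *
                     (frac[2] if dk else 1.0 - frac[2]))
                energy += w * grid[base[0] + di][base[1] + dj][base[2] + dk]
    return energy

def ligand_energy(grid, origin, spacing, ligand_atoms):
    """Total ligand energy as a sum of per-atom grid lookups."""
    return sum(grid_energy(grid, origin, spacing, p) for p in ligand_atoms)

# Toy usage: a one-atom "receptor" and a 9 x 9 x 9 grid with 1 Å spacing.
receptor = [(0.0, 0.0, 0.0)]
origin, spacing = (-4.0, -4.0, -4.0), 1.0
grid = build_grid(receptor, origin, spacing, (9, 9, 9))
print(ligand_energy(grid, origin, spacing, [(4.0, 0.0, 0.0), (3.5, 0.0, 0.0)]))
```

Because each pair term is truncated, a ligand atom that overlaps a receptor atom incurs a bounded penalty rather than a near-infinite one, which is what makes such a potential tolerant of small clashes.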


... RosettaDock (Gray et al., 2003) is an efficient protein-protein docking software that is based on the MC method. Some other software based on randomized search are HADDOCK (Dominguez et al., 2003), ICM-DISCO (Fernandez-Recio, Totrov, & Abagyan, 2002) and ATTRACT (Zacharias, 2003). ...
... PyDock (Cheng, Blundell, & Fernandez-Recio, 2007) uses an energy function based on ICM potentials, composed of van der Waals, Coulombic electrostatics and Atomic Solvation Parameter (ASP)-based desolvation energy. The scoring function in ICM-DISCO (Fernandez-Recio et al., 2002) takes into consideration the intermolecular grid-based energy terms and the Accessible Surface Area (ASA)-based desolvation energy. The RosettaDock program also employs a similar scoring scheme. ...
... (Ravikant & Elber, 2010) servers were ranked next. In the human predictor category, HADDOCK (Dominguez et al., 2003) was ranked first, followed by SwarmDock (Venkatraman & Ritchie, 2012), the Vakser group, the Vajda/Kozakov group (Camacho & Vajda, 2002; Chuang et al., 2008) and ICM (Fernandez-Recio et al., 2002) in the 2nd to 5th positions (Kozakov et al., 2013) (Lensink & Wodak, 2013). ...
... In the template-based approaches as applied to dimers [83,86,89,90,139-141], the quaternary model is constructed by matching a pair of monomer target sequences to a library of related template protein complexes whose structures have been solved experimentally. In the template-free approaches [123,125,135,142-146], also known as protein-protein docking, the target protein complex structure is predicted by scoring a large set of protein-protein orientations which are generated by assembling known monomer structure models. ...
... To compare SPRING with the rigid-body docking methods [123,125,135,142-146], we ran SPRING-H on the dimer complexes of the protein docking benchmark set [70] 3.0, which have both complex and unbound monomer structures solved in the PDB. Since SPRING often aligns only part of the structure, we implemented another version, SPRING-UB, which superimposes the unbound monomer structures onto the SPRING-H models after threading. ...
Thesis
Determining protein structures from sequence is a fundamental problem in molecular biology, as protein structure is essential to understanding protein function. In this study, I developed one of the first fully automated pipelines for template based quaternary structure prediction starting from sequence. Two critical steps for template based modeling are identifying the correct homologous structures by threading which generates sequence to structure alignments and refining the initial threading template coordinates closer to the native conformation. I developed SPRING (single-chain-based prediction of interactions and geometries), a monomer threading to dimer template mapping program, which was compared to the dimer co-threading program, COTH, using 1838 non homologous target complex structures. SPRING’s similarity score outperformed COTH in the first place ranking of templates, correctly identifying 798 and 527 interfaces respectively. More importantly the results were found to be complementary and the programs could be combined in a consensus based threading program showing a 5.1% improvement compared to SPRING. Template based modeling requires a structural analog being present in the PDB. A full search of the PDB, using threading and structural alignment, revealed that only 48.7% of the PDB has a suitable template whereas only 39.4% of the PDB has templates that can be identified by threading. In order to circumvent this, I included intramolecular domain-domain interfaces into the PDB library to boost template recognition of protein dimers; the merging of the two classes of interfaces improved recognition of heterodimers by 40% using benchmark settings. Next the template based assembly of protein complexes pipeline, TACOS, was created. The pipeline combines threading templates and domain knowledge from the PDB into a knowledge based energy score. 
The energy score is integrated into a Monte Carlo sampling simulation that drives the initial template closer to the native topology. The full pipeline was benchmarked using 350 non homologous structures and compared to two state of the art programs for dimeric structure prediction: ZDOCK and MODELLER. On average, TACOS models global and interface structure have a better quality than the models generated by MODELLER and ZDOCK.
... The ICM-DISCO program, developed by Ruben Abagyan's group, also consists of two steps (Fernández-Recio et al., 2002). The first step is a rigid-body docking that samples the different positions of the ligand around the receptor. ...
... Combining these three contributions (shape complementarity, electrostatic potential and desolvation) makes it possible to approximate free-energy functions (Fernández-Recio et al., 2002; Camacho et al., 2006). ...
Thesis
Although protein-protein docking is becoming an essential tool for addressing current biological questions, two difficulties remain inherent to current methods: 1) most of these methods do not consider the possible internal deformations of the proteins during their association; 2) it is not always simple to translate information from the literature or from experiments into constraints that can be integrated into docking programs. We therefore attempted to develop an approach for improving existing docking programs, drawing on the methodologies set up for the concrete cases treated during this thesis. First, through the construction of the ERBIN PDZ/Smad3 MH2 complex, we tested the usefulness of Molecular Dynamics in Explicit Solvent (Dynamique Moléculaire en Solvant Explicite, DMSE) for highlighting residues that are important for the interaction. We then extended this work by using several docking servers followed by DMSE to target a consensus result. Finally, we tried DMSE refinement on a target from the CAPRI challenge and compared the results with short Monte Carlo simulations. The last part of this thesis dealt with the development of a new tool for visualising the molecular surface. This program, named MetaMol, displays a new type of molecular surface: the Molecular Skin Surface. Distributing the calculations across both the computer's processor (CPU) and those of the graphics card (GPU) reduces computation time, allowing deformations of the molecular surface to be visualised in real time.
Article
Protein-protein docking has become an extremely important challenge in biology; however, two inherent difficulties remain: 1) most docking methods do not consider possible internal deformations of the proteins during their association; 2) it is not always easy to translate information from the literature or from experiments into constraints suitable for use in protein docking algorithms. Following these conclusions, we have developed an approach to improve existing docking programs. Firstly, through modelling the ERBIN PDZ / Smad3 MH2 complex, we have tested the utility of Molecular Dynamics with Explicit Solvent (MDSE) for elucidating the key residues in an interaction. We then extended this research by using several docking servers and the MDSE simulations to obtain a consensus result. Finally, we have explored the use of MDSE refinement on one of the targets from the CAPRI experiment and we have compared those results with those from short Monte-Carlo simulations. Another aspect of this thesis concerns the development of a novel molecular surface visualisation tool. This program, named MetaMol, allows the visualisation of a new type of molecular surface: the Molecular Skin Surface. Distributing the surface calculation between a computer's central processing unit (CPU) and its graphics card (GPU) allows deformations of the molecular surface to be calculated and visualised in real time.
... The basic architecture of the system sampled with the global stochastic optimizer is shown in Figure 1A. System parts that are less flexible and/or more certain are represented as grid potential maps ( Figure 1B) as described in [33,34]. When calculating the grids, uncertain and unimportant atoms (e.g. ...
... Representing more certain/less flexible parts of the system with grid maps [33,34] has the advantage of greatly speeding up the calculations for the remaining (explicitly represented) parts of the system. Also, by being more permissive to temporary steric clashes emerging in the course of simulation, grid map representation "smoothens" the energy landscape and improves the efficiency of its sampling. ...
Article
Experimental structure determination for G protein-coupled receptors (GPCRs) and especially their complexes with protein and peptide ligands is at its infancy. In the absence of complex structures, molecular modeling and docking play a large role not only by providing a proper 3D context for interpretation of biochemical and biophysical data, but also by prospectively guiding experiments. Experimentally confirmed restraints may help improve the accuracy and information content of the computational models. Here we present a hybrid molecular modeling protocol that integrates heterogeneous experimental data with force field-based calculations in the stochastic global optimization of the conformations and relative orientations of binding partners. Some experimental data, such as pharmacophore-like chemical fields or disulfide-trapping restraints, can be seamlessly incorporated in the protocol, while other types of data are more useful at the stage of solution filtering. The protocol was successfully applied to modeling and design of a stable construct that resulted in crystallization of the first complex between a chemokine and its receptor. Examples from this work are used to illustrate the steps of the protocol. The utility of different types of experimental data for modeling and docking is discussed and caveats associated with data misinterpretation are highlighted.
... One of the problems with docking programs lies in how the protein surface is represented. Indeed, few docking methods use an explicit representation of side chains during the first search stages (Fernandez-Recio et al., 2002), because of the high computational cost this entails. Generally, the choice of side-chain representation is also tied to the conformational search algorithm, as is the case for grid-based search programs (Katchalski-Katzir et al., 1992), which represent protein structures by projecting them onto a three-dimensional grid. ...
... This technique appears to be effective when the conformational changes caused by the interaction between the partners are small. In addition, some procedures aim to optimize (Fernandez-Recio et al., 2002) or minimize the side chains (Jackson et al., 1998), or to introduce solvation effects explicitly. These refinement and optimization techniques are very important, particularly when the scoring functions used rely on energetic criteria, which are highly sensitive to model quality. ...
Article
The high-throughput characterization of protein-protein interaction networks laid the foundations for the first interaction maps in all model organisms, including human. In contrast, the structures of the protein assemblies are still restricted to a very limited set of interactions. In this work, a specific evolutionary pressure exerted at protein interfaces has been revealed. To our knowledge, no such effect had been previously described. Based on this finding, a novel bioinformatic approach, called SCOTCH (Surface COmplementarity Trace in Complex History), has been developed to predict the structures of protein assemblies. Coupled to a docking program, such as SCOTCHer, also developed in this work, this approach was shown to predict efficiently the structures of many complexes. This work also focused on the inhibition of protein interactions by synthetic peptides, rationally designed on the basis of the complex structure. The results obtained for two examples, the Asf1 – Histone H3/H4 and the gp120 – CD4 complexes, emphasize the high interest of rational design of complex interfaces for the development of novel therapeutic strategies.
... The concept has its roots in Hill's theory 27 of cooperativity in biochemistry where the ligand anchors itself on the surface of the protein and exhibits fluctuations in the three Euler angles formed between the ligand and the protein. While the Hill model was for rigid bodies, Gray et al 26 added side-chain flexibility to small translations and rotations of the ligand as was also done by Fernandez-Recio et al. 28 , Parma et al. 29 . Flexibility was also introduced by relaxing interaction potentials by Chen and Weng 30 . ...
Preprint
This paper aims to understand the binding strategies of a nanobody-protein pair by studying known complexes. Rigid body protein-ligand docking programs produce several complexes, called decoys, which are good candidates with high scores of shape complementarity, electrostatic interactions, desolvation, buried surface area, and Lennard-Jones potentials. It is not known which decoy represents the true structure. We studied thirty-seven nanobody-protein complexes from the Single Domain Antibody Database, sd-Ab DB, http://www.sdab-db.ca/. For each structure, a large number of decoys are generated using the Fast Fourier Transform algorithm of the software ZDOCK. The decoys were ranked according to their target protein-nanobody interaction energies, calculated by using the Dreiding Force Field, with rank 1 having the lowest interaction energy. Out of thirty-six PDB structures, twenty-five true structures were predicted as rank 1. Eleven of the remaining structures required Ångstrom size rigid body translations of the nanobody relative to the protein to match the given PDB structure. After the translation the Dreiding interaction (DI) energies of all complexes decreased and became rank 1. In one case, rigid body rotations as well as translations of the nanobody were required for matching the crystal structure. We used a Monte Carlo algorithm that randomly translates and rotates the nanobody of a decoy and calculates the DI energy. Results show that rigid body translations and the DI energy are sufficient for determining the correct binding location and pose of ZDOCK created decoys. A survey of the sd-Ab DB showed that each nanobody makes at least one salt bridge with its partner protein, indicating that salt bridge formation is an essential strategy in nanobody-protein recognition. Based on the analysis of the thirty-six crystal structures and evidence from existing literature, we propose a set of principles that could be used in the design of nanobodies.
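The random rigid-body translations and rotations described in this abstract can be sketched as a short Monte Carlo loop. This is an illustration under stated assumptions, not the preprint's code: the Metropolis acceptance rule, the step sizes, `kT`, and the placeholder `toy_energy` (a clamped 12-6 Lennard-Jones sum standing in for the Dreiding interaction energy) are all hypothetical choices.

```python
import math
import random

def centroid(coords):
    n = len(coords)
    return tuple(sum(p[d] for p in coords) / n for d in range(3))

def rotate(coords, axis, angle, center):
    """Rotate points about a unit axis through `center` (Rodrigues formula)."""
    c, s = math.cos(angle), math.sin(angle)
    kx, ky, kz = axis
    out = []
    for p in coords:
        vx, vy, vz = p[0] - center[0], p[1] - center[1], p[2] - center[2]
        dot = kx * vx + ky * vy + kz * vz
        cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        out.append((center[0] + vx * c + cx * s + kx * dot * (1 - c),
                    center[1] + vy * c + cy * s + ky * dot * (1 - c),
                    center[2] + vz * c + cz * s + kz * dot * (1 - c)))
    return out

def perturb(coords, rng, max_shift=1.0, max_angle=math.radians(5.0)):
    """One random rigid-body move: small rotation about the centroid, then a shift."""
    while True:
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(a * a for a in axis))
        if norm > 1e-9:
            break
    axis = [a / norm for a in axis]
    moved = rotate(coords, axis, rng.uniform(-max_angle, max_angle), centroid(coords))
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [tuple(x + s for x, s in zip(p, shift)) for p in moved]

def toy_energy(ligand, receptor, epsilon=0.2, sigma=3.5):
    """Hypothetical placeholder energy: pairwise 12-6 Lennard-Jones."""
    total = 0.0
    for a in ligand:
        for b in receptor:
            r = max(math.dist(a, b), 0.5)  # clamp to keep the energy finite
            sr6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total

def metropolis(ligand, receptor, energy, steps=2000, kT=0.6, seed=0):
    """Metropolis sampling over rigid-body poses; returns the best pose seen."""
    rng = random.Random(seed)
    current, e_cur = list(ligand), energy(ligand, receptor)
    best, e_best = current, e_cur
    for _ in range(steps):
        trial = perturb(current, rng)
        e_trial = energy(trial, receptor)
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / kT):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best

# Toy usage: a two-atom "ligand" starting 6 Å above a two-atom "receptor".
receptor = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
ligand = [(0.0, 6.0, 0.0), (3.8, 6.0, 0.0)]
pose, e = metropolis(ligand, receptor, toy_energy, steps=500, seed=1)
print(round(e, 3))
```

Every move is a pure rotation plus translation, so the internal geometry of the moving partner is preserved; any energy function with the same `(ligand, receptor)` signature can be swapped in for `toy_energy`.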
... Flexibility is a difficulty for protein-protein and protein-small molecule docking [122]. Modeling protein flexibility involves using soft potentials [123,124], exploring rotameric states [125], using various protein receptor structures [126,127] or refining with MD. ...
Article
Background: Protein-protein interactions (PPIs) are appealing targets for designing novel small-molecule inhibitors. The role of PPIs in various infectious and neurodegenerative disorders makes them potential targets with a broad therapeutic spectrum, though they were portrayed as un-druggable targets due to their flat surfaces, disordered conformations, and absence of grooves. However, recent progresses in computational biology have led researchers to reconsider PPIs as an important area in drug discovery. Areas covered: In this review, we introduce in-silico methods used to identify PPI interfaces and present an in-depth overview of various computational methodologies that are successfully applied to annotate the PPI interfaces. We also discuss several successful case studies that use computational tools to understand PPIs modulation and their key roles in various physiological processes. Expert opinion: Computational methods face challenges due to the inherent flexibility of proteins, which makes them expensive, and result in the use of rigid models. This problem becomes more significant in PPIs due to their flexible and flat interfaces. Computational methods like molecular dynamics (MD) simulation and machine learning can integrate the chemical structure data into biochemical and can be used for target identification and modulation. These computational methodologies have been crucial in understanding the structure of PPIs, designing PPI modulators, discovering new drug targets, and predicting treatment outcomes.
... Protein receptor flexibility is a problem common to protein-protein and protein-small molecule docking (Andrusier et al., 2008; B-Rao et al., 2009). Approaches to model protein flexibility include using soft potentials (Fernández et al., 2002; Ferrari et al., 2004), exploring rotameric states (Leach, 1994), using different protein receptor structures (Amaro et al., 2018; Falcon et al., 2019) or refining with molecular dynamics (Alonso et al., 2006). The challenges in modeling peptides arise from the following: 1) peptides can have different conformations in their free/bound states; 2) the same peptide sequence might bind different proteins in different conformations (Huart and Hupp, 2013), and 3) different peptide sequences can bind the same receptor in different conformations (Aiyer et al., 2021). ...
Article
Protein-protein interactions (PPIs) mediate a large number of important regulatory pathways. Their modulation represents an important strategy for discovering novel therapeutic agents. However, the features of PPI binding surfaces make the use of structure-based drug discovery methods very challenging. Among the diverse approaches used in the literature to tackle the problem, linear peptides have demonstrated to be a suitable methodology to discover PPI disruptors. Unfortunately, the poor pharmacokinetic properties of linear peptides prevent their direct use as drugs. However, they can be used as models to design enzyme resistant analogs including, cyclic peptides, peptide surrogates or peptidomimetics. Small molecules have a narrower set of targets they can bind to, but the screening technology based on virtual docking is robust and well tested, adding to the computational tools used to disrupt PPI. We review computational approaches used to understand and modulate PPI and highlight applications in a few case studies involved in physiological processes such as cell growth, apoptosis and intercellular communication.
... Existing works tackle each of these stages differently. Monte-Carlo simulation [5], FFT [6], spherical harmonics [7], energy-based techniques [8], and a host of other techniques are applied in the docking domain to get improved results. One can choose among force-field based, knowledge-based, empirical, machine learning-based or a combination of these scoring functions to identify the near-native structures. ...
Preprint
ResDock is a new method to improve the performance of protein-protein complex structure prediction. It utilizes shape complementarity of the protein surfaces to generate the conformation space. The use of an appropriate scoring function helps to select the feasible structures. An interplay between the pose generation phase and the scoring phase enhances the performance of the proposed ab initio technique.
... 2020_1 version) [40], a small fraction of the estimated total number of PPIs in human, ranging between 130,000 [41] and 650,000 [42] interactions. In this scenario, protein-protein complexes with no available structure can be computationally modelled by a variety of template-based modeling [43,44] and in silico docking approaches [45-52]. Among the docking methods using energy-based scoring, pyDock [53] has shown excellent predictive results in the most recent CASP-CAPRI and CAPRI assessment experiments [54,55], and can also be used for the identification of interface residues and hot-spots [56,57]. ...
Article
Protein-protein interactions play an essential role in many biological processes, and their perturbation is a major cause of disease. The use of small molecules to modulate them is attracting increased attention, but protein interfaces generally do not have clear cavities for binding small compounds. A proposed strategy is to target interface hot-spot residues, but their identification through computational approaches usually requires the complex structure, which is not often available. In this context, pyDock energy-based docking and scoring can predict hot-spots on the unbound proteins, thus not requiring the complex structure. Here, we have devised a new strategy to detect protein–protein inhibitor binding sites, based on the integration of molecular dynamics for the generation of transient cavities, and docking-based interface hot-spot prediction for the selection of the suitable cavities. This integrative approach has been validated on a test set formed by protein–protein complexes with known inhibitors for which complete structural data of unbound molecules and complexes are available. The results show that local conformational sampling with short molecular dynamics can generate transient cavities similar to the known inhibitor binding sites, and that docking simulations can identify the best cavities with similar predictive accuracy as when knowing the real interface. In a few cases, these predicted pockets are shown to be suitable for protein–ligand docking. The proposed strategy will be useful for many protein–protein complexes for which there is no available structure, as long as the unbound proteins do not deviate dramatically from the bound conformations.
... The soft docking method, consisting of two steps, geometric docking and interaction scoring, is able to accommodate conformational changes associated with the molecular complexing process, and to identify the correct complex model or a small number of complex models, which can be further screened visually or based on other experimental results to recognize the correct complex model (Jiang & Kim, 1991). The soft docking concept has evolved primarily toward use in protein-protein docking (Fernández-Recio, Totrov, & Abagyan, 2002; C. H. Li, Ma, Zu Chen, & Wang, 2003; Palma, Krippahl, Wampler, & Moura, 2000; Schneidman-Duhovny et al., 2003) and protein-receptor modeling combined with experimental NMR data (X. ...
Chapter
Today, the development of new drugs is a challenging scientific task. Researchers have already applied molecular docking in drug design to simulate ligand-receptor interactions. Docking is a term used for computational schemes that attempt to find the “best” matching between two molecules in a complex formed from constituent molecules. It has a wide range of uses and applications in drug discovery. However, some defects still exist: the accuracy and speed of docking calculations remain a challenge, and these methods can still be enhanced. The molecular docking problem can be defined as follows: given the atomic coordinates of two molecules, predict their “correct” bound association. The chapter discusses common challenges and critical aspects of docking methods, such as ligand and receptor conformation, flexibility and cavity detection. It emphasizes the challenges and inadequacies of the underlying theories, with examples.
... The ICM scoring function is weighted according to the following parameters: (1) internal force-field energy of the ligand, (2) entropy loss of the ligand between bound and unbound states, (3) ligand-receptor hydrogen bond interactions, (4) polar and non-polar solvation energy differences between bound and unbound states, (5) electrostatic energy, (6) hydrophobic energy, and (7) hydrogen bond donor or acceptor desolvation. The lower the ICM score, the higher the chance the ligand is a binder (Fernández-Recio et al. 2002;MolSoft 2000). ...
Article
The most common pharmacologic approaches to inhibiting EGFR have been to develop small-molecule inhibitors which exert their effects at the intracellular portion of the receptor to prevent tyrosine kinase phosphorylation and subsequent activation of signal transduction pathways. A non-covalent molecular docking study was carried out between 119 NCI anticancer compounds and the receptor tyrosine kinase domain of the epidermal growth factor receptor (PDB ID: 1M14), out of which 11 compounds had a binding energy, calculated with the Monte Carlo algorithm in ICM-pro Molsoft, < − 25.25 kcal/mol. was found to have highest binding affinity with a reported value of − 32.832 kcal/mol, while Asaley had the overall least value (− 1.977 kcal/mol). The binding energy of all the compounds was found to be greatly influenced by the number and type of hydrogen bond interactions present between the ligands and the receptor, except in a few instances involving 5-fluorouracil and mitomycin. A detailed description of the interactions of the EGFR inhibitors developed can assist in the design of more potent, more specific cytostatic drugs that can arrest tumor growth and cause tumor regression.
... To model the Gai3:GIV interaction, protein-protein docking of the GIV-GBA peptide (residues 1,678-1,688) and Gai3 models was performed with ICM using a rigid-body two-stage fast Fourier transform method followed by flexible refinement of ligand/receptor interface residues 61 . To extend the model for inclusion of GIV residues 1,678-1,696, eight additional amino acids were added to the C terminus of the top-scoring GIV-GBA conformation from the docking solutions above. ...
Article
Heterotrimeric G proteins are quintessential signalling switches activated by nucleotide exchange on Gα. Although activation is predominantly carried out by G-protein-coupled receptors (GPCRs), non-receptor guanine-nucleotide exchange factors (GEFs) have emerged as critical signalling molecules and therapeutic targets. Here we characterize the molecular mechanism of G-protein activation by a family of non-receptor GEFs containing a Gα-binding and -activating (GBA) motif. We combine NMR spectroscopy, computational modelling and biochemistry to map changes in Gα caused by binding of GBA proteins with residue-level resolution. We find that the GBA motif binds to the SwitchII/α3 cleft of Gα and induces changes in the G-1/P-loop and G-2 boxes (involved in phosphate binding), but not in the G-4/G-5 boxes (guanine binding). Our findings reveal that G-protein-binding and activation mechanisms are fundamentally different between GBA proteins and GPCRs, and that GEF-mediated perturbation of nucleotide phosphate binding is sufficient for Gα activation.
... Already in the seminal work on BH of Wales and Doye, the potential usefulness of such coordinates has been noted, 35 and later shown to be beneficial for structures connected by double-ended pathways. 55 One of the first applications of the idea of using ICs for global geometry optimization was reported in the context of protein-ligand docking, which developed into the so-called Internal Coordinate Mechanics (ICM) model, 56 further extending the above mentioned concept of meaningful building blocks. 53 Another example of global optimization in ICs was the attempt of introducing dihedral angles into the framework of the so-called deterministic global optimization. ...
Article
Efficient structure search is a major challenge in computational materials science. We present a modification of the basin hopping global geometry optimization approach that uses a curvilinear coordinate system to describe global trial moves. This approach has recently been shown to be efficient in structure determination of clusters [Nano Letters 15, 8044-8048 (2015)] and is here extended for its application to covalent, complex molecules and large adsorbates on surfaces. The employed, automatically constructed delocalized internal coordinates are similar to molecular vibrations, which enhances the generation of chemically meaningful trial structures. By introducing flexible constraints and local translation and rotation of independent geometrical subunits we enable the use of this method for molecules adsorbed on surfaces and interfaces. For two test systems, trans-$\beta$-ionylideneacetic acid adsorbed on a Au(111) surface and methane adsorbed on a Ag(111) surface, we obtain superior performance of the method compared to standard optimization moves based on Cartesian coordinates.
... A variety of approaches has been used to try to deal with this flexibility. In so-called "soft" docking the van der Waals term is modified to permit some local plasticity (Fernandez-Recio et al. (2002); Palma et al. (2000)). In flexible docking, bond angles, bond lengths and torsion angles of the components are modified during complex generation. ...
Thesis
The sequencing of the human genome provides the parts list for understanding cellular processes. However, as 70% of eukaryotic genes work through multi-protein systems, it is only through detailed study of the interactions of these components, that a more complete, systems-level understanding can be gained. This thesis is centred on the establishment of PICCOLO - a comprehensive database of structurally characterized protein interactions. In generating the resource, issues of interface definition, quaternary structure, data redundancy, structural environment and interaction type are addressed. The resource enables a variety of analyses to be performed concerning interface properties including residue propensity, hydropathy, polarity, interface size, sequence entropy and residue contact preference. PICCOLO has been applied to probing the patterns of substitutions that are accepted in protein interfaces across evolution, and whether these patterns are distinguishable from those seen in other structural environments. The derivation of a high-quality set of multiple structural alignments in the form of the database TOCCATA, a prerequisite for such analysis, is described, as well as procedures to derive environment-specific substitution tables. The Blundell group has contributed a series of methods to predict the likely effect of non-synonymous Single Nucleotide Polymorphisms (nsSNPs) on protein stability, function and interactions in order to triage the large volumes of data created from high-throughput genetic screening studies, enabling prioritization of those nsSNPs most likely to be phenotypically detrimental. PICCOLO's contribution to these predictions is described. Historically there has been little focus on protein-protein interactions as drug targets for small-molecule therapeutics. However, alanine-scanning mutagenesis studies have revealed that only a subset of residues contribute the greater part of free energy to binding - so-called "hot-spots". 
Molecular characterization of hot-spots performed using PICCOLO probes the molecular basis underlying this important phenomenon, leading to the possibility of predictive methods to identify hot-spots 'in silico'.
... At the same time, electrostatic interactions between charged residues on the microtubule and the kinesin motor domain are important to maintaining the directional bias of kinesins 45 . However, the role of electrostatics in the molecular mechanism of kinesin's motion is not fully understood despite multiple experimental [41][42][43][44] and computational 42,45 studies because the size of the microtubule-kinesin system makes computational modeling difficult 46,47 . Using our computational focusing method, we overcame these modeling difficulties. ...
Article
Full-text available
Many biological phenomena involve the binding of proteins to a large object. Because the electrostatic forces that guide binding act over large distances, truncating the size of the system to facilitate computational modeling frequently yields inaccurate results. Our multiscale approach implements a computational focusing method that permits computation of large systems without truncating the electrostatic potential and achieves the high resolution required for modeling macromolecular interactions, all while keeping the computational time reasonable. We tested our approach on the motility of various kinesin motor domains. We found that electrostatics help guide kinesins as they walk: N-kinesins towards the plus-end, and C-kinesins towards the minus-end of microtubules. Our methodology enables computation in similar, large systems including protein binding to DNA, viruses, and membranes.
... Finally, protein-protein docking based on two comparative models is a significant challenge, because of the approximate nature of the structures and possible conformational changes upon binding [67][68][69][70]. In most cases, one needs to know the approximate site of docking before an accurate model can be made of the complex. ...
... method (ICM, Molsoft LLC.) and compared with solutions from the ClusPro 2.0 server (Fernandez-Recio et al. 2002; Comeau et al. 2004; Kozakov et al. 2006). Simulations were carried out at 300 K in continuous dielectric solvent (no explicit waters). ...
Article
Full-text available
Trimeric G protein signaling is a fundamental mechanism of cellular communication in eukaryotes. The core of this mechanism consists of activation of G proteins by the Guanine-nucleotide Exchange Factor (GEF) activity of G protein coupled-receptors (GPCRs). However, the duration and amplitude of G protein-mediated signaling is controlled by a complex network of accessory proteins that appeared and diversified during evolution. Among them, non-receptor proteins with GEF activity are the least characterized. We recently found that proteins of the ccdc88 family possess a Gα-Binding and Activating (GBA) motif that confers GEF activity and regulates mammalian cell behavior. A sequence similarity-based search revealed that ccdc88 genes are highly conserved across metazoa but the GBA motif is absent in most invertebrates. This prompted us to investigate if the GBA motif is present in other non-receptor proteins in invertebrates. An unbiased bioinformatics search in C. elegans identified GBAS-1 (GBA and SPK domain containing-1) as a GBA motif-containing protein with homologues only in closely related worm species. We demonstrate that GBAS-1 has GEF activity for the nematode G protein GOA-1 and that the two proteins are co-expressed in many cells of living worms. Furthermore, we show that GBAS-1 can activate mammalian Gα-subunits and provide structural insights into the evolutionarily conserved determinants of the GBA-G protein interface. These results demonstrate that the GBA motif is a functional GEF module conserved among highly divergent proteins across evolution, indicating that the GBA-Gα binding mode is strongly constrained under selective pressure to mediate receptor-independent G protein activation in metazoans.
... Similar to studies using SRM on the nuclear pore complex 81 , SRM is being employed to obtain additional information on the structure of the mucin-secreting porosome complex. Finally, computational approaches are being employed, such as coarse-grain molecular docking studies [82][83][84][85][86][87][88][89][90][91][92][93][94][95][96][97] , homology modeled interactions [98][99][100] , and fitting of known atomic structures of protein-protein interactions and complexes [101][102][103][104][105][106][107][108] . It is becoming increasingly clear that the ultrastructural and mass spectrometry methods show promise in providing complementary information and the high degree of cross-validation required to build an accurate structural model of the mucin-secreting porosome complex. ...
Article
Full-text available
Macromolecular structures embedded in the cell plasma membrane, called ‘porosomes’, are involved in the regulated fractional release of intravesicular contents from cells during secretion. Porosomes range in size from 15 nm in neurons and astrocytes to 100–180 nm in the exocrine pancreas and neuroendocrine cells. Porosomes have been isolated from a number of cells, and their morphology, composition, and functional reconstitution are well documented. The 3D contour map of the assembly of proteins within the porosome complex and its native X-ray solution structure at sub-nm resolution have also advanced. This understanding now provides a platform to address diseases that may result from secretory defects. Water and ion binding to mucin impart hydration, critical for regulating viscosity of the mucus in the airways epithelia. Appropriate viscosity is required for the movement of mucus by the underlying cilia. Hence secretion of more viscous mucus prevents its proper transport, resulting in chronic and fatal airways disease such as cystic fibrosis (CF). CF is caused by the malfunction of CF transmembrane conductance regulator (CFTR), a chloride channel transporter, resulting in viscous mucus in the airways. Studies show that mice lacking functional CFTR secrete highly viscous mucus that adheres to the epithelium. Since CFTR is known to interact with the t-SNARE protein syntaxin-1A, and with the chloride channel CLC-3, which are also components of the porosome complex, the interaction between CFTR and the porosome complex in the mucin-secreting human airway epithelial cell line Calu-3 was hypothesized and tested. Results from the study demonstrate the presence of a porosome complex approximately 100 nm in size, composed of 34 proteins, at the cell plasma membrane in Calu-3 cells, and the association of CFTR with the complex. In comparison, the nuclear pore complex measures 120 nm and is comprised of over 500 protein molecules. 
The involvement of CFTR in porosome-mediated mucin secretion is hypothesized, and is currently being tested.
... Single particle cryo-EM tomography, 122 and SAXS studies combined with neutron scattering on isolated neuronal porosomes should serve to reveal the position of various proteins within the structure. Finally, computational approaches employing coarse-grain molecular docking studies, [123][124][125][126][127][128][129][130][131][132][133][134][135][136][137][138] homology-modeled interactions, [139][140][141] and fitting of known atomic structures of proteins and their interactions and the resultant complexes formed 142-148 will be employed to further determine the molecular structure of the neuronal porosome. A molecular understanding of the neuronal porosome structure will help in revealing the crosstalk between proteins within the complex and in the regulation of neurotransmitter release, and hence the potential for the design and development of drugs to target neurological disorders and diseases. ...
Article
Full-text available
Cup-shaped secretory portals at the cell plasma membrane called porosomes mediate the precision release of intravesicular material from cells. Membrane-bound secretory vesicles transiently dock and fuse at the base of porosomes facing the cytosol to expel pressurized intravesicular contents from the cell during secretion. The structure, isolation, composition, and functional reconstitution of the neuronal porosome complex have greatly progressed, providing a molecular understanding of its function in health and disease. Neuronal porosomes are 15 nm cup-shaped lipoprotein structures composed of nearly 40 proteins, compared to the 120 nm nuclear pore complex composed of >500 protein molecules. Membrane proteins compose the porosome complex, making it practically impossible to solve its atomic structure. However, atomic force microscopy and small-angle X-ray solution scattering studies have provided three-dimensional structural details of the native neuronal porosome at sub-nanometer resolution, providing insights into the molecular mechanism of its function. The participation of several porosome proteins previously implicated in neurotransmission and neurological disorders further attests to the crosstalk between porosome proteins and their coordinated involvement in release of neurotransmitter at the synapse. © 2015 by the Society for Experimental Biology and Medicine.
... This cube representation implies conformational changes by way of size/shape complementarity, close packing and, most importantly, liberal steric overlap. In recent years the soft docking concept has evolved primarily toward use in protein-protein docking [52][53][54][55][56] and protein-receptor modeling combined with experimental NMR data [57][58][59] . ...
Article
Full-text available
Molecular Docking is the computational modeling of the structure of complexes formed by two or more interacting molecules. The goal of molecular docking is the prediction of the three dimensional structures of interest. Docking itself only produces plausible candidate structures. These candidates are ranked using methods such as scoring functions to identify structures that are most likely to occur in nature. The state of the art of various computational aspects of molecular docking based virtual screening of database of small molecules is presented. This review encompasses molecular docking approaches, different search algorithms and the scoring functions used in docking methods and their applications to protein and nucleic acid drug targets. Limitations of current technologies as well as future prospects are also presented.
... Internal coordinates can be used in combination with different methodologies such as normal modes (Duong and Zakrzewska, 1997), molecular dynamics (Schwieters and Clore, 2001; Mazur, 2002), energy minimization (Li and Scheraga, 1987; Navizet et al., 2004a), or Monte Carlo simulations (Fernández-Recio et al., 2002; Bastard et al., 2003). ...
Article
Full-text available
Due to their importance for function, the mechanical properties of proteins are the subject of great attention. We have used molecular modeling techniques to gain a better understanding of these properties. We have notably used molecular dynamics simulations to study the dynamics of E-cadherin molecules which are involved in cellular adhesion. The influence of the presence of calcium ions has been monitored in the context of the change in flexibility and dimerisation. We have also examined three dimeric conformations observed experimentally and discussed their potential involvement in adhesion. We have also developed various methodological tools for the theoretical study of proteins. The first is a new index to measure protein flexibility at the single amino acid level, via the use of restrained energy minimisations. This method also allows us to determine dynamical domains within protein structures by analyzing the deformations caused by the restraints. We have also developed a new multi-scale representation of proteins, containing both coarse-grained and all-atom residues. This representation should allow us to study large systems while keeping atomic precision within the most important parts of the protein.
Chapter
The field of structural biology is rapidly advancing thanks to significant improvements in X-ray crystallography, nuclear magnetic resonance (NMR), cryo-electron microscopy, and bioinformatics. The identification of structural descriptors allows for the correlation of functional properties with characteristics such as accessible molecular surfaces, volumes, and binding sites. Atom depth has been recognized as an additional structural feature that links protein structures to their folding and functional properties. In the case of proteins, the atom depth is typically defined as the distance between the atom and the nearest surface point or nearby water molecule. In this paper, we propose a discrete geometry method to calculate the depth index with an alternative approach that takes into account the local molecular shape of the protein. To compute atom depth indices, we measure the volume of the intersection between the molecule and a sphere with an appropriate reference radius, centered on the atom for which we want to quantify the depth. We validate our method on proteins of diverse shapes and sizes and compare it with metrics based on the distance to the nearest water molecule from bulk solvent to demonstrate its effectiveness.
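The sphere-intersection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the molecule is a union of equal-radius atomic spheres and estimates the intersection volume by counting points on a cubic grid. The function name `depth_index` and the parameters `probe_radius`, `atom_radius` and `step` are hypothetical choices made for illustration.

```python
import math


def depth_index(atoms, center_idx, probe_radius=6.0, atom_radius=1.8, step=0.5):
    """Estimate the fraction of a probe sphere, centred on one atom,
    that is occupied by the molecule.

    atoms        -- list of (x, y, z) coordinates in angstroms
    center_idx   -- index of the atom whose depth is wanted
    probe_radius -- reference radius of the probe sphere (assumed value)
    atom_radius  -- single radius used for every atom (simplifying assumption)
    step         -- grid spacing used for the volume estimate
    """
    cx, cy, cz = atoms[center_idx]
    n = int(math.ceil(probe_radius / step))
    probe_r2 = probe_radius * probe_radius
    atom_r2 = atom_radius * atom_radius

    in_sphere = 0    # grid points inside the probe sphere
    in_molecule = 0  # of those, points also inside some atomic sphere
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            for k in range(-n, n + 1):
                dx, dy, dz = i * step, j * step, k * step
                if dx * dx + dy * dy + dz * dz > probe_r2:
                    continue
                in_sphere += 1
                px, py, pz = cx + dx, cy + dy, cz + dz
                if any(
                    (px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2 <= atom_r2
                    for ax, ay, az in atoms
                ):
                    in_molecule += 1
    return in_molecule / in_sphere
```

Under these assumptions a buried atom gives a value close to 1, because the probe sphere is almost entirely filled by the molecule, while an exposed atom gives a smaller value. The brute-force loop over all atoms is kept for clarity; a spatial grid or neighbour list would be needed for proteins of realistic size.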
Article
This paper aims to understand the binding strategies of a nanobody-protein pair by studying known complexes. Rigid body protein-ligand docking programs produce several complexes, called decoys, which are good candidates with high scores of shape complementarity, electrostatic interactions, desolvation, buried surface area, and Lennard-Jones potentials. However, the decoy that corresponds to the native structure is not known. We studied 36 nanobody-protein complexes from the single domain antibody database, sd-Ab DB, http://www.sdab-db.ca/. For each structure, a large number of decoys are generated using the Fast Fourier Transform algorithm of the software ZDOCK. The decoys were ranked according to their target protein-nanobody interaction energies, calculated by using the Dreiding Force Field, with rank 1 having the lowest interaction energy. Out of 36 protein data bank (PDB) structures, 25 true structures were predicted as rank 1. Eleven of the remaining structures required Ångstrom size rigid body translations of the nanobody relative to the protein to match the given PDB structure. After the translation, the Dreiding interaction (DI) energies of all complexes decreased and became rank 1. In one case, rigid body rotations as well as translations of the nanobody were required for matching the crystal structure. We used a Monte Carlo algorithm that randomly translates and rotates the nanobody of a decoy and calculates the DI energy. Results show that rigid body translations and the DI energy are sufficient for determining the correct binding location and pose of ZDOCK created decoys. A survey of the sd-Ab DB showed that each nanobody makes at least one salt bridge with its partner protein, indicating that salt bridge formation is an essential strategy in nanobody-protein recognition. Based on the analysis of the 36 crystal structures and evidence from existing literature, we propose a set of principles that could be used in the design of nanobodies.
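The Monte Carlo step described in the abstract above (randomly translate and rotate the nanobody, then recompute the interaction energy) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a toy 12-6 Lennard-Jones sum stands in for the Dreiding force field, the Metropolis acceptance rule is an assumption (the abstract does not state the acceptance criterion), and all function names and parameters (`pair_energy`, `mc_refine`, `max_shift`, `max_angle`, `kT`) are hypothetical.

```python
import math
import random


def pair_energy(receptor, ligand, r_min=4.0, well=1.0):
    """Toy 12-6 Lennard-Jones sum over all receptor-ligand point pairs.
    Placeholder for a real force field such as Dreiding."""
    energy = 0.0
    for rx, ry, rz in receptor:
        for lx, ly, lz in ligand:
            d2 = max((rx - lx) ** 2 + (ry - ly) ** 2 + (rz - lz) ** 2, 1e-6)
            s6 = (r_min * r_min / d2) ** 3
            energy += well * (s6 * s6 - 2.0 * s6)
    return energy


def random_unit_vector(rng):
    """Uniformly distributed direction from normalised Gaussian samples."""
    while True:
        v = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:
            return tuple(c / norm for c in v)


def rotate_about_centroid(points, axis, angle):
    """Rigid rotation about the centroid using Rodrigues' formula."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    kx, ky, kz = axis
    c, s = math.cos(angle), math.sin(angle)
    rotated = []
    for x, y, z in points:
        vx, vy, vz = x - cx, y - cy, z - cz
        dot = kx * vx + ky * vy + kz * vz
        crx, cry, crz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        rotated.append((
            cx + vx * c + crx * s + kx * dot * (1 - c),
            cy + vy * c + cry * s + ky * dot * (1 - c),
            cz + vz * c + crz * s + kz * dot * (1 - c),
        ))
    return rotated


def mc_refine(receptor, ligand, steps=2000, max_shift=0.5, max_angle=0.1,
              kT=0.6, seed=0):
    """Rigid-body Monte Carlo: the receptor stays fixed, the ligand moves."""
    rng = random.Random(seed)
    current = list(ligand)
    e_cur = pair_energy(receptor, current)
    best, e_best = current, e_cur
    for _ in range(steps):
        shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
        trial = rotate_about_centroid(
            current, random_unit_vector(rng), rng.uniform(-max_angle, max_angle))
        trial = [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in trial]
        e_trial = pair_energy(receptor, trial)
        # Metropolis rule (assumed): always accept downhill moves,
        # accept uphill moves with Boltzmann probability.
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / kT):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best
```

Calling `mc_refine(receptor, ligand)` with two lists of `(x, y, z)` tuples returns the lowest-energy ligand pose visited and its energy. Because every move is a rotation plus a translation, the internal geometry of the ligand is preserved, which is what makes this a rigid-body search.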
Preprint
Full-text available
Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
Preprint
Full-text available
T cells play a vital role in adaptive immune responses to infections, inflammation and cancer and are dysregulated in autoimmunity. Antigen recognition by T cells – a key step in adaptive immune responses – is performed by the T cell receptor (TCR)-CD3 complex. The extracellular molecular organization of the individual CD3 subunits (CD3δϵ and CD3γϵ) around the αβTCR is critical for T cell signaling. Here, we incorporated unnatural amino acid (UAA) photo-crosslinkers at specific mouse TCRα, TCRβ, CD3δ and CD3γ sites, based on previous mutagenesis, NMR spectroscopy and cryo-EM evidence, and crosslinking allowing us to identify nearby interacting CD3 or TCR subunits on the mammalian cell surface. Using this approach, we show that CD3γ and CD3ϵ, belonging to CD3γϵ heterodimer crosslinks to Cβ FG loop and Cβ G strand, respectively and CD3δ crosslinks to Cβ CC’ loop and Cα DE loop. Together with computational docking, we identify that in in situ cell-surface conformation, the CD3 subunits exists in CD3ϵ’-CD3γ-CD3ϵ-CD3δ arrangement around the αβTCR. This unconventional technique, which uses the native mammalian cell surface microenvironment, includes the plasma membrane and excludes random, artificial crosslinks, captures a dynamic, biologically relevant, cell-surface conformation of the TCR-CD3 complex, which is compatible with the reported static cryo-EM structure’s overall CD3 subunits arrangement, but with key differences at the TCR-CD3 interface, which may be critical for experiments in T cell model systems.

Article
Structure-based virtual screening is a key, routine computational method in computer-aided drug design. Such screening can be used to identify potentially highly active compounds, to speed up the progress of novel drug design. Molecular docking-based virtual screening can help find active compounds from large ligand databases by identifying the binding affinities between receptors and ligands. In this study, we analyzed the challenges of virtual screening, with the aim of identifying highly active compounds faster and more easily than is generally possible. We discuss the accuracy and speed of molecular docking software and the strategy of high-throughput molecular docking calculation, and we focus on current challenges and our solutions to these challenges of ultra-large-scale virtual screening. The development of Web services helps lower the barrier to drug virtual screening. We introduced some related web sites for docking and virtual screening, focusing on the development of pre- and post-processing interactive visualization and large-scale computing.
Article
The computationally hard protein–protein complex structure prediction problem is continuously fascinating to the scientific community due to its biological impact. The field has witnessed the application of geometric algorithms, randomized algorithms, and evolutionary algorithms to name a few. These techniques improve either the searching or scoring phase. An effective searching strategy does not generate a large conformation space that perhaps demands computational power. Another determining factor is the parameter chosen for score calculation. The proposed method is an attempt to curtail the conformations by limiting the search procedure to probable regions. In this method, partial derivatives are calculated on the coarse-grained representation of the surface residues to identify the optimal points on the protein surface. Contrary to the existing geometric-based algorithms that align the convex and concave regions of both proteins, this method aligns the concave regions of the receptor with convex regions of the ligand only and thus reduces the size of conformation space. The method’s performance is evaluated using the 55 newly added targets in Protein–Protein Docking Benchmark v 5 and is found to be successful for around 47% of the targets.
Chapter
Molecular docking has become an important component of the drug discovery process. Since first being developed in the 1980s, advancements in the power of computer hardware and the increasing number of and ease of access to small molecule and protein structures have contributed to the development of improved methods, making docking more popular in both industrial and academic settings. Over the years, the modalities by which docking is used to assist the different tasks of drug discovery have changed. Although initially developed and used as a standalone method, docking is now mostly employed in combination with other computational approaches within integrated workflows. Despite its invaluable contribution to the drug discovery process, molecular docking is still far from perfect. In this chapter we will provide an introduction to molecular docking and to the different docking procedures with a focus on several considerations and protocols, including protonation states, active site waters and consensus, that can greatly improve the docking results.
Chapter
Protein-protein docking algorithms are powerful computational tools, capable of analyzing the protein-protein interactions at the atomic-level. In this chapter, we will review the theoretical concepts behind different protein-protein docking algorithms, highlighting their strengths as well as their limitations and pointing to important case studies for each method. The methods we intend to cover in this chapter include various search strategies and scoring techniques. This includes exhaustive global search, fast Fourier transform search, spherical Fourier transform-based search, direct search in Cartesian space, local shape feature matching, geometric hashing, genetic algorithm, randomized search, and Monte Carlo search. We will also discuss the different ways that have been used to incorporate protein flexibility within the docking procedure and some other future directions in this field, suggesting possible ways to improve the different methods.
Chapter
Today, the development of new drugs is a challenging task of science. Researchers have already applied molecular docking in the drug design field to simulate ligand-receptor interactions. Docking is a term used for computational schemes that attempt to find the “best” matching between two molecules in a complex formed from constituent molecules. It has a wide range of uses and applications in drug discovery. However, some defects still exist; the accuracy and speed of docking calculations remain a challenge, and these methods can be enhanced to address the docking problem. The molecular docking problem can be defined as follows: Given the atomic coordinates of two molecules, predict their “correct” bound association. The chapter discusses common challenges and critical aspects of docking methods, such as ligand and receptor conformation, flexibility and cavity detection. It emphasizes the challenges and inadequacies, along with the underlying theories and examples.
Book
A macromolecule, known as a giant molecule or a polymer, is a chemical species, composed of a long chain with a regularly repeating unit, a high molecular weight and a high molecular size. The unit for molecular weight is usually the Dalton (Da); one Dalton is equal to one atomic mass unit. Symbols and parameters appearing in this chapter . Macromolecules are divided into natural and man-made polymers. The latter are known as synthetic polymers.
Article
Computational docking approaches aim to overcome the limited availability of experimental structural data on protein–protein interactions, which are key in biology. The field is rapidly moving from the traditional docking methodologies for modeling of binary complexes to more integrative approaches using template-based, data-driven modeling of multi-molecular assemblies. We will review here the predictive capabilities of current docking methods in blind conditions, based on the results from the most recent community-wide blind experiments. Integration of template-based and ab initio docking approaches is emerging as the optimal strategy for modeling protein complexes and multimolecular assemblies. We will also review the new methodological advances on ab initio docking and integrative modeling.
Chapter
Mutation of a single amino acid in a protein often has consequences on the interaction with other proteins, which may affect other interaction networks and pathways and ultimately lead to pathological phenotypes. A detailed structural analysis of these altered protein–protein complexes is essential to interpret the impact of a given mutation at the molecular level, which may facilitate intervention with therapeutic purposes. Given current limitations in the structural coverage of the human interactome, computational docking is emerging as a complementary source of information. Structural analysis can help to locate a given mutation at a protein–protein interface, but further characterisation of its impact on binding affinity is needed for a full interpretation. The integration of computational docking methods and energy‐based descriptors is facilitating the characterisation of an increasing number of disease‐related mutations, thus improving our understanding of the consequences of such mutations at the phenotypic level. Key Concepts • Protein–protein interactions are key to understand disease at the molecular level. • Disease‐related mutations can have significant structural and energetic impact on protein–protein interactions. • The 3D structure of a complex is essential to interpret the functional impact of a mutation. • Computational docking can provide structural models for protein–protein interactions with no available structure. • In addition to structural data, further energetic description is needed to fully interpret the impact of a mutation. • Hot‐spot interface residues are causing the greater impact on a protein–protein interaction when mutated. • Altered protein–protein interactions are potentially suitable drug targets for therapeutic purposes.
Article
Molecular Docking is used to position the computer-generated 3D structure of small ligands into a receptor structure in a variety of orientations, conformations and positions. This method has been found to be useful in drug discovery and medicinal chemistry, providing insights into molecular recognition. Docking has become an integral part of computer aided drug design and discovery (CADDD). Traditional docking methods suffer from limitations of semi-flexible or static treatment of targets and ligand. Over the last decade, advances in the fields of computation, proteomics and genomics have also led to the development of different docking methods which incorporate protein-ligand flexibility and their different binding conformations. Receptor flexibility accounts for more accurate binding pose predictions and a more rational depiction of protein binding interactions with the ligand. Protein flexibility has been included by generating protein ensembles or by dynamic docking methods. Dynamic docking considers solvation and entropic effects, and also fully explores drug-receptor binding and recognition from both energetic and mechanistic points of view. Although dynamic docking is computationally expensive for fast-paced drug discovery programs, it is being progressively used for screening of large compound libraries to identify potential drugs. In this review, we present a quick introduction to available docking methods and their application and limitations in drug discovery.
Article
Stereo- and regioisomers of a series of N,N-bis(alkanol)amine aryl ester derivatives have been prepared and studied as multidrug resistance (MDR) modulators. The new compounds contain a 2-(methyl)propyl chain combined with a 3-, 5- or 7-methylenes long chain and carry different aromatic ester portions. Thus, these compounds have a methyl group on the 3-methylenes chain and represent branched homologues of previously studied derivatives. The introduction of the methyl group gives origin to a stereogenic center and consequently to (R) and (S) enantiomers. In the pirarubicin uptake assay on K562/DOX cell line these compounds showed good activity and efficacy and in many cases enantioselectivity was observed. Docking studies confirmed the influence of the stereocenter on the interaction in the P-gp pocket. The P-gp interaction mechanism and selectivity towards MRP1 and BCRP were also evaluated on MDCK transfected cells overexpressing the three transporters. Almost all these compounds inhibited both P-gp and BCRP, but only derivatives with specific structural characteristics showed MRP1 activity. Moreover, two compounds, (S)-3 and (R)-7, showed the ability to induce collateral sensitivity (CS) against MDR cells. Therefore, these two CS-promoting agents could be considered interesting leads for the development of selective cytotoxic agents for drug-resistant cells.
Article
Molecular docking methodology explores the behavior of small molecules in the binding site of a target protein. As more protein structures are determined experimentally using X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy, molecular docking is increasingly used as a tool in drug discovery. Docking against homology-modeled targets also becomes possible for proteins whose structures are not known. With the docking strategies, the druggability of the compounds and their specificity against a particular target can be calculated for further lead optimization processes. Molecular docking programs perform a search algorithm in which the conformation of the ligand is evaluated recursively until the convergence to the minimum energy is reached. Finally, an affinity scoring function, ΔG [U total in kcal/mol], is employed to rank the candidate poses as the sum of the electrostatic and van der Waals energies. The driving forces for these specific interactions in biological systems aim toward complementarities between the shape and electrostatics of the binding site surfaces and the ligand or substrate.
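The scoring step described above, a sum of electrostatic and van der Waals terms, can be illustrated with a minimal sketch. This is not the scoring function of any particular docking program: the 12-6 Lennard-Jones form, the uniform `epsilon` and `sigma` values, the constant `dielectric`, and the function names are all assumptions chosen for illustration. The factor 332.06 is the conventional conversion that gives Coulomb energies in kcal/mol for charges in electron units and distances in Å.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2), conventional conversion factor


def pair_energy(r, q1, q2, epsilon, sigma, dielectric=1.0):
    """Coulomb plus 12-6 Lennard-Jones energy (kcal/mol) of one atom pair."""
    elec = COULOMB_K * q1 * q2 / (dielectric * r)
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return elec + vdw


def interaction_score(ligand, receptor, epsilon=0.1, sigma=3.5, dielectric=4.0):
    """Sum pair energies over all ligand-receptor atom pairs.

    Each atom is a (coordinates, partial_charge) tuple. The single epsilon and
    sigma for every pair are hypothetical; real force fields use per-type values.
    """
    total = 0.0
    for xyz_l, q_l in ligand:
        for xyz_r, q_r in receptor:
            r = math.dist(xyz_l, xyz_r)
            total += pair_energy(r, q_l, q_r, epsilon, sigma, dielectric)
    return total


# Toy pose: one ligand atom near two receptor atoms.
ligand_atoms = [((0.0, 0.0, 0.0), 0.4)]
receptor_atoms = [((4.0, 0.0, 0.0), -0.4), ((0.0, 4.5, 0.0), 0.1)]
score = interaction_score(ligand_atoms, receptor_atoms)
```

A lower (more negative) score indicates a more favorable pose under these assumptions; candidate poses would be ranked by this value.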
Article
To understand cellular processes at the molecular level we need to improve our knowledge of protein-protein interactions, from a structural, mechanistic and energetic point of view. Current theoretical studies and computational docking simulations show that protein dynamics plays a key role in protein association, and support the need for including protein flexibility in modeling protein interactions. Assuming the conformational selection binding mechanism, in which the unbound state can sample bound conformers, one possible strategy to include flexibility in docking predictions would be the use of conformational ensembles originated from unbound protein structures. Here we present an exhaustive computational study about the use of precomputed unbound ensembles in the context of protein docking, performed on a set of 124 cases of the Protein-Protein Docking Benchmark 3.0. Conformational ensembles were generated by conformational optimization and refinement with MODELLER and by short molecular dynamics trajectories with AMBER. We identified those conformers providing optimal binding and investigated the role of protein conformational heterogeneity in protein-protein recognition. Our results show that a restricted conformational refinement can generate conformers with better binding properties and improve docking encounters in medium-flexible cases. For more flexible cases, a more extended conformational sampling based on Normal Mode Analysis was proven helpful. We found that successful conformers provide better energetic complementarity to the docking partners, which is compatible with recent views of binding association. In addition to the mechanistic considerations, these findings could be exploited for practical docking predictions of improved efficiency.
Chapter
Proteins are an important class of biological macromolecules present in all biological organisms. Proteins consist of a sequence of 20 different amino acids, also referred to as residues. To be able to perform their biological function, proteins often fold into one, or more, specific spatial conformations, driven by a number of non-covalent interactions, such as hydrogen bonding, ionic interactions, Van der Waals’ forces, and hydrophobic packing. In order to understand the functions of proteins at a molecular level, it is often necessary to determine their 3D structure. This is the topic of the scientific field of structural biology that employs techniques, such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy, to determine the structure of proteins.
Article
Molecular dynamics simulations of proteins are usually performed on a single molecule, and coarse-grained protein models are calibrated using single-molecule simulations, therefore ignoring intermolecular interactions. We present here a new coarse-grained force field for the study of many protein systems. The force field, which is implemented in the context of the discrete molecular dynamics algorithm, is able to reproduce the properties of folded and unfolded proteins, whether isolated, complexed in well-defined quaternary structures, or aggregated, thanks to its proper evaluation of protein-protein interactions. The accuracy and computational efficiency of the method make it a universal tool for the study of the structure, dynamics, and association/dissociation of proteins.
Article
Identification of relevant reaction pathways in ever more complex composite materials and nanostructures poses a central challenge to computational materials discovery. Efficient global structure search, tailored to identify chemically-relevant intermediates, could provide the necessary first-principles atomistic insight to enable a rational process design. In this work we modify a common feature of global geometry optimization schemes by employing automatically-generated collective curvilinear coordinates. The similarity of these coordinates to molecular vibrations enhances the generation of chemically meaningful trial structures for covalently bound systems. In the application to hydrogenated Si clusters we concomitantly observe a significantly increased efficiency in identifying low-energy structures and exploit it for an extensive sampling of potential products of silicon-cluster soft landing on Si(001) surfaces.
Article
A computational protein-protein docking method that predicts atomic details of protein-protein interactions from protein monomer structures is an invaluable tool for understanding the molecular mechanisms of protein interactions and for designing molecules that control such interactions. Compared to low-resolution docking, high-resolution docking explores the conformational space in atomic resolution to provide predictions with atomic details. This allows for applications to more challenging docking problems that involve conformational changes induced by binding. Recently, high-resolution methods have become more promising as additional information such as global shapes or residue contacts are now available from experiments or sequence/structure data. In this review article, we highlight developments in high-resolution docking made during the last decade, specifically regarding global optimization methods employed by the docking methods. We also discuss two major challenges in high-resolution docking: prediction of backbone flexibility and water-mediated interactions. Copyright © 2015 Elsevier Ltd. All rights reserved.
Chapter
Protein-protein docking was born in the 1970s as a tool to analyze macromolecular recognition. It developed afterwards into a method of prediction of the mode of association between proteins of known structure. Since 2001, the performance of docking procedures has been assessed in blind predictions by the CAPRI (Critical Assessment of PRedicted Interactions) experiment. The results show that docking routinely yields good models of the protein-protein complexes that undergo only minor changes in conformation and associate as rigid bodies. In contrast, flexible recognition accompanying large conformation changes in the components remains difficult to simulate, and structural predictions generally yield lower quality models. In recent years, a new challenge has been to predict affinity and to estimate the stability of the complex along with its structure. Over the years, CAPRI has proved to be a strong incentive to develop new flexible docking procedures and more discriminative scoring functions, and it has provided a common ground for discussing methods and questions related to protein-protein recognition.
Article
Protein structure prediction and protein docking prediction are two related problems in molecular biology. We suggest the use of multiple docking in the process of protein structure prediction. Once reliable structural models are predicted to disjoint fragments of the protein target sequence, a combinatorial assembly may be used to predict their native arrangement. Here, we present CombDock, a combinatorial docking algorithm for the structural units assembly problem. We have tested the algorithm on various examples using both domains and domain substructures as input. Inaccurate models of the structural units were also used, to test the robustness of the algorithm. The algorithm was able to predict a near-native arrangement of the input structural units in almost all of the cases, showing that the combinatorial approach succeeds in overcoming the inexact shape complementarity caused by the inaccuracy of the models.
Article
The structure of TEM-1 β-lactamase complexed with the inhibitor BLIP has been determined at 1.7 Å resolution. The two tandemly repeated domains of BLIP form a polar, concave surface that docks onto a predominantly polar, convex protrusion on the enzyme. The ability of BLIP to adapt to a variety of class A β-lactamases is most likely due to an observed flexibility between the two domains of the inhibitor and to an extensive layer of water molecules entrapped between the enzyme and inhibitor. A β-hairpin loop from domain 1 of BLIP is inserted into the active site of the β-lactamase. The carboxylate of Asp 49 forms hydrogen bonds to four conserved, catalytic residues in the β-lactamase, thereby mimicking the position of the penicillin G carboxylate observed in the acyl-enzyme complex of TEM-1 with substrate. This β-hairpin may serve as a template with which to create a new family of peptide-analogue β-lactamase inhibitors.
Article
The X-ray crystal structure of the molecular complex of penicillin G with a deacylation-defective mutant of the RTEM-1 beta-lactamase from Escherichia coli shows how these antibiotics are recognized and destroyed. Penicillin G is covalently bound to Ser 70 Oγ as an acyl-enzyme intermediate. The deduced catalytic mechanism uses Ser 70 Oγ as the attacking nucleophile during acylation. Lys 73 Nζ acts as a general base in abstracting a proton from Ser 70 and transferring it to the thiazolidine ring nitrogen atom via Ser 130 Oγ. Deacylation is accomplished by nucleophilic attack on the penicilloyl carbonyl carbon by a water molecule assisted by the general base, Glu 166.
Chapter
We present our database-screening tool SLIDE, which is capable of screening large data sets of organic compounds for potential ligands to a given binding site of a target protein. Its main feature is the modeling of induced complementarity by making adjustments in the protein side chains and ligand upon binding. Mean-field theory is used to balance the conformational changes in both molecules in order to generate a shape-complementary interface. Solvation is considered by prediction of water molecules likely to be conserved from the crystal structure of the ligand-free protein, and allowing them to mediate ligand interactions, if possible, or including a desolvation penalty when they are displaced by ligand atoms that do not replace the lost hydrogen bonds. A data set of over 175 000 organic molecules was screened for potential ligands to the progesterone receptor, dihydrofolate reductase, and a DNA-repair enzyme. In all cases the screening time was less than a day on a Pentium II processor, and known ligands as well as highly complementary new potential ligands were found.
Article
A geometric recognition algorithm was developed to identify molecular surface complementarity. It is based on a purely geometric approach and takes advantage of techniques applied in the field of pattern recognition. The algorithm involves an automated procedure including (i) a digital representation of the molecules (derived from atomic coordinates) by three-dimensional discrete functions that distinguishes between the surface and the interior; (ii) the calculation, using Fourier transformation, of a correlation function that assesses the degree of molecular surface overlap and penetration upon relative shifts of the molecules in three dimensions; and (iii) a scan of the relative orientations of the molecules in three dimensions. The algorithm provides a list of correlation values indicating the extent of geometric match between the surfaces of the molecules; each of these values is associated with six numbers describing the relative position (translation and rotation) of the molecules. The procedure is thus equivalent to a six-dimensional search but much faster by design, and the computation time is only moderately dependent on molecular size. The procedure was tested and validated by using five known complexes for which the correct relative position of the molecules in the respective adducts was successfully predicted. The molecular pairs were deoxyhemoglobin and methemoglobin, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. A more realistic test was performed with the last two pairs by using the structures of uncomplexed aspartic proteinase and trypsin inhibitor, respectively. The results are indicative of the extent of conformational changes in the molecules tolerated by the algorithm.
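The correlation step (ii) above can be sketched on a toy one-dimensional periodic grid. The sketch below is only an illustration of the idea, not the published implementation: the grid size, the interior penalty of -15, the ligand weights, and all function names are assumptions. It scores every translation at once through the correlation theorem and checks the result against a direct sum; a real docking code would use a 3D grid and an optimized FFT library, and would repeat the scan for each ligand orientation.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of 2."""
    n = len(values)
    if n == 1:
        return list(values)
    sign = 1 if inverse else -1
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlate_fft(receptor, ligand):
    """score[s] = sum_i receptor[i] * ligand[(i - s) mod n], via Fourier space."""
    n = len(receptor)
    spec_r = fft(receptor)
    spec_l = fft(ligand)
    product = [a * b.conjugate() for a, b in zip(spec_r, spec_l)]
    return [v.real / n for v in fft(product, inverse=True)]


def correlate_direct(receptor, ligand):
    """Same scores by explicit summation, used here only as a cross-check."""
    n = len(receptor)
    return [sum(receptor[i] * ligand[(i - s) % n] for i in range(n))
            for s in range(n)]


# Receptor: surface cells weigh +1, interior cells carry a large penalty.
receptor = [0.0] * 16
receptor[4] = receptor[8] = 1.0
receptor[5] = receptor[6] = receptor[7] = -15.0
# Ligand: three occupied cells, all with weight +1 in this toy encoding.
ligand = [0.0] * 16
ligand[0] = ligand[1] = ligand[2] = 1.0

scores = correlate_fft(receptor, ligand)
best_shifts = [s for s, v in enumerate(scores) if round(v) == 1]
```

Shifts that bring the ligand into contact with a receptor surface cell score positively, while shifts that push it into the interior are strongly penalized, which is the behavior the surface/interior encoding is meant to produce.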
Article
The past decade has seen an alarming worldwide increase in resistance to beta-lactam antibiotics among many pathogenic bacteria, which is due mainly to plasmid- or chromosomally encoded beta-lactamases that specifically cleave penicillin and cephalosporins, rendering them inactive. There is therefore a need to develop new strategies in the design of effective inhibitors of beta-lactamase. All the small-molecule inhibitors in clinical use are not very effective and are rapidly degraded. Furthermore, newly characterized mutants of the plasmid-mediated beta-lactamase TEM-1 are highly resistant to these small-molecule inhibitors, including clavulanic acid and tazobactam. It has been shown that Streptomyces clavuligerus produces an exocellular beta-lactamase inhibitory protein (BLIP; M(r) 17.5 K). Here we present data defining BLIP as the most effective known inhibitor of a variety of beta-lactamases, with Ki values in the subnanomolar to picomolar range. To identify those features in BLIP that make it such a potent inhibitor, we have determined its molecular structure at 2.1 Å resolution. BLIP is a relatively flat molecule with a unique fold, comprising a tandem repeat of a 76-amino-acid domain. Each domain consists of a helix-loop-helix motif that packs against a four-stranded antiparallel beta-sheet (Fig. 1a). To our knowledge, BLIP is the first example of a protein inhibitor having two similarly folded domains that interact with and inhibit a single target enzyme.
Article
Crystallization of the 1:1 molecular complex between the beta-lactamase TEM-1 and the beta-lactamase inhibitory protein BLIP has provided an opportunity to put a stringent test on current protein-docking algorithms. Prior to the successful determination of the structure of the complex, nine laboratory groups were given the refined atomic coordinates of each of the native molecules. Other than the fact that BLIP is an effective inhibitor of a number of beta-lactamase enzymes (KI for TEM-1 approximately 100 pM) no other biochemical or structural data were available to assist the practitioners in their molecular docking. In addition, it was not known whether the molecules underwent conformational changes upon association or whether the inhibition was competitive or non-competitive. All six of the groups that accepted the challenge correctly predicted the general mode of association of BLIP and TEM-1.
Article
A new computationally efficient and automated “soft docking” algorithm is described to assist the prediction of the mode of binding between two proteins, using the three-dimensional structures of the unbound molecules. The method is implemented in a software package called BiGGER (Bimolecular Complex Generation with Global Evaluation and Ranking) and works in two sequential steps: first, the complete 6-dimensional binding spaces of both molecules is systematically searched. A population of candidate protein-protein docked geometries is thus generated and selected on the basis of the geometric complementarity and amino acid pairwise affinities between the two molecular surfaces. Most of the conformational changes observed during protein association are treated in an implicit way and test results are equally satisfactory, regardless of starting from the bound or the unbound forms of known structures of the interacting proteins. In contrast to other methods, the entire molecular surfaces are searched during the simulation, using absolutely no additional information regarding the binding sites. In a second step, an interaction scoring function is used to rank the putative docked structures. The function incorporates interaction terms that are thought to be relevant to the stabilization of protein complexes. These include: geometric complementarity of the surfaces, explicit electrostatic interactions, desolvation energy, and pairwise propensities of the amino acid side chains to contact across the molecular interface. The relative functional contribution of each of these interaction terms to the global scoring function has been empirically adjusted through a neural network optimizer using a learning set of 25 protein-protein complexes of known crystallographic structures. 
In 22 out of 25 protein-protein complexes tested, near-native docked geometries were found with Cα RMS deviations ≤ 4.0 Å from the experimental structures, of which 14 were found within the 20 top ranking solutions. The program works on widely available personal computers and takes 2 to 8 hours of CPU time to run any of the docking tests herein presented. Finally, the value and limitations of the method for the study of macromolecular interactions, not yet revealed by experimental techniques, are discussed. Proteins 2000;39:372–384. © 2000 Wiley-Liss, Inc.
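The success criterion quoted above (a near-native geometry within a Cα RMS deviation cutoff, found among the top-ranked solutions) can be expressed as a short sketch. It assumes that poses and the experimental structure are already in a common reference frame and list the same Cα atoms in the same order; the function names and the toy coordinates are hypothetical and are not taken from BiGGER.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def first_near_native_rank(ranked_poses, native, cutoff=4.0):
    """Return the 1-based rank of the first pose within `cutoff` of the native.

    `ranked_poses` is assumed to be sorted best score first. Returns None when
    no pose satisfies the cutoff.
    """
    for rank, pose in enumerate(ranked_poses, start=1):
        if rmsd(pose, native) <= cutoff:
            return rank
    return None


# Toy data: two C-alpha atoms; the top-ranked pose is displaced by 5 Angstroms.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
displaced = [(3.0, 4.0, 0.0), (6.8, 4.0, 0.0)]
rank = first_near_native_rank([displaced, native], native)
```

A complex would count as a top-20 success under this criterion when the returned rank is not None and is at most 20.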
Article
Rigid-body methods, particularly Fourier correlation techniques, are very efficient for docking bound (co-crystallized) protein conformations using measures of surface complementarity as the target function. However, when docking unbound (separately crystallized) conformations, the method generally yields hundreds of false positive structures with good scores but high root mean square deviations (RMSDs). This paper describes a two-step scoring algorithm that can discriminate near-native conformations (with less than 5 Å RMSD) from other structures. The first step includes two rigid-body filters that use the desolvation free energy and the electrostatic energy to select a manageable number of conformations for further processing, but are unable to eliminate all false positives. Complete discrimination is achieved in the second step that minimizes the molecular mechanics energy of the retained structures, and re-ranks them with a combined free-energy function which includes electrostatic, solvation, and van der Waals energy terms. After minimization, the improved fit in near-native complex conformations provides the free-energy gap required for discrimination. The algorithm has been developed and tested using docking decoys, i.e., docked conformations generated by Fourier correlation techniques. The decoy sets are available on the web for testing other discrimination procedures. Proteins 2000;40:525–537. © 2000 Wiley-Liss, Inc.
Article
Many proteins have evolved to form specific molecular complexes and the specificity of this interaction is essential for their function. The network of the necessary inter-residue contacts must consequently constrain the protein sequences to some extent. In other words, the sequence of an interacting protein must reflect the consequence of this process of adaptation. It is reasonable to assume that the sequence changes accumulated during the evolution of one of the interacting proteins must be compensated by changes in the other. Here we apply a method for detecting correlated changes in multiple sequence alignments to a set of interacting protein domains and show that positions where changes occur in a correlated fashion in the two interacting molecules tend to be close to the protein-protein interfaces. This leads to the possibility of developing a method for predicting contacting pairs of residues from the sequence alone. Such a method would not need the knowledge of the structure of the interacting proteins, and hence would be both radically different and more widely applicable than traditional docking methods. We indeed demonstrate here that the information about correlated sequence changes is sufficient to single out the right inter-domain docking solution amongst many wrong alternatives of two-domain proteins. The same approach is also used here in one case (haemoglobin) where we attempt to predict the interface of two different proteins rather than two protein domains. Finally, we report here a prediction about the inter-domain contact regions of the heat-shock protein Hsc70 based only on sequence information.
Article
Some of the parameters that are used in the computer program ECEPP (Empirical Conformational Energy Program for Peptides) to describe the geometry of amino acid residues and the potential energy of interactions have been updated. The changes are based on recently available experimental information. The most significant changes improve the geometry and the interactions of prolyl and hydroxyprolyl residues, on the basis of crystallographic structural data. The structure of the pyrrolidine ring has been revised to correspond to the experimentally determined extent of out-of-plane puckering of the five-membered ring. The geometry of the peptide group preceding a Pro residue has also been altered. The parameters for nonbonded interactions involving the C(delta) and H(delta) atoms of Pro and Hyp have been modified. Use of the revised parameters provides improvements in the computed minimum-energy conformations of peptides containing the Pro-Pro and Ala-Pro sequences. In particular, it is demonstrated that an alpha-helix-like conformation of a residue preceding Pro is now only of moderately high energy, and thus it is an accessible state. This result corroborates the observed occurrence of Pro residues in kinked alpha-helices in globular proteins. The structure of the poly(Gly-Pro-Pro) triple helix, a computational model for collagen structure, has been recomputed. The validity of previous computations for this model structure has been confirmed. The refinement of the computed interactions has provided a new general model structure to be used for future computations on collagen-like polypeptides.
Article
A computational method for attempting to predict protein complexes from the coordinates of the individual proteins has been developed. It is based on matching complementary patterns of knobs and holes. The computer algorithm correctly and uniquely predicts the association of the alpha and beta subunits to form the αβ dimer corresponding to the α1β1 interface in the hemoglobin tetramer. It fails to correctly dock trypsin inhibitor onto trypsin. Nevertheless, this lone success is still a significant advance over previous protein-docking algorithms. The method is also important because it introduces several ways to measure the shape of protein surface regions.
Article
Here we carry out an examination of shape complementarity as a criterion in protein-protein docking and binding. Specifically, we examine the quality of shape complementarity as a critical determinant not only in the docking of 26 protein-protein "bound", complexed cases, but in particular, of 19 "unbound" protein-protein cases, where the structures have been determined separately. In all cases, entire molecular surfaces are utilized in the docking, with no consideration of the location of the active site, or of particular residues/atoms in either the receptor or the ligand which participate in the binding. To evaluate the goodness of the strictly geometry-based shape complementarity in the docking process as compared to the main favorable and unfavorable energy components, we study systematically a potential correlation between each of these components and the RMSD of the "unbound" protein-protein cases. Specifically, we examine the non-polar buried surface area, polar b...
Article
The Protein Data Bank is a computer-based archival file for macromolecular structures. The Bank stores in a uniform format atomic co-ordinates and partial bond connectivities, as derived from crystallographic studies. Text included in each data entry gives pertinent information for the structure at hand (e.g. species from which the molecule has been obtained, resolution of diffraction data, literature citations and specifications of secondary structure). In addition to atomic co-ordinates and connectivities, the Protein Data Bank stores structure factors and phases, although these latter data are not placed in any uniform format. Input of data to the Bank and general maintenance functions are carried out at Brookhaven National Laboratory. All data stored in the Bank are available on magnetic tape for public distribution, from Brookhaven (to laboratories in the Americas), Tokyo (Japan), and Cambridge (Europe and worldwide). A master file is maintained at Brookhaven and duplicate copies are stored in Cambridge and Tokyo. In the future, it is hoped to expand the scope of the Protein Data Bank to make available co-ordinates for standard structural types (e.g. α-helix, RNA double-stranded helix) and representative computer programs of utility in the study and interpretation of macromolecular structures.
Article
An efficient methodology, further referred to as ICM, for versatile modeling operations and global energy optimization on arbitrarily fixed multimolecular systems is described. It is aimed at protein structure prediction, homology modeling, molecular docking, nuclear magnetic resonance (NMR) structure determination, and protein design. The method uses and further develops a previously introduced approach to model biomolecular structures in which bond lengths, bond angles, and torsion angles are considered as independent variables, any subset of them being fixed. Here we simplify and generalize the basic description of the system, introduce the variable dihedral phase angle, and allow arbitrary connections of the molecules and conventional definition of the torsion angles. Algorithms for calculation of energy derivatives with respect to internal variables in the topological tree of the system and for rapid evaluation of accessible surface are presented. Multidimensional variable restraints are proposed to represent the statistical information about the torsion angle distributions in proteins. To incorporate complex energy terms as solvation energy and electrostatics into a structure prediction procedure, a “double-energy” Monte Carlo minimization procedure in which these terms are omitted during the minimization stage of the random step and included for the comparison with the previous conformation in a Markov chain is proposed and justified. The ICM method is applied successfully to a molecular docking problem. The procedure finds the correct parallel arrangement of two rigid helixes from a leucine zipper domain as the lowest-energy conformation (0.5 Å root mean square, rms, deviation from the native structure) starting from completely random configuration. Structures with antiparallel helixes or helixes staggered by one helix turn had energies higher by about 7 or 9 kcal/mol, respectively. Soft docking was also attempted. 
A docking procedure allowing side-chain flexibility also converged to the parallel configuration starting from the helixes optimized individually. To justify an internal coordinate approach to the structure prediction as opposed to a Cartesian one, energy hypersurfaces around the native structure of the squash seeds trypsin inhibitor were studied. Torsion angle minimization from the optimal conformation randomly distorted up to the rms deviation of 2.2 Å or angular rms deviation of 10° restored the native conformation in most cases. In contrast, Cartesian coordinate minimization did not reach the minimum from deviations as small as 0.3 Å or 2°. We conclude that the most promising detailed approach to the protein-folding problem would consist of some coarse global sampling strategy combined with the local energy minimization in the torsion coordinate space. © 1994 by John Wiley & Sons, Inc.
Article
A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two-dimensional rigid-sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four-term virial coefficient expansion.
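The sampling scheme summarized in this abstract can be illustrated with a minimal sketch for a two-dimensional rigid-disk system. It is not the original MANIAC program: the periodic box, the disk diameter, the maximum step and the function names (`hard_disk_mc`, `overlaps`, `min_image`) are all assumptions chosen for illustration. For rigid disks the Metropolis rule reduces to accepting a trial move whenever it creates no overlap.

```python
import random

def min_image(d, box):
    """Wrap a coordinate difference into the nearest periodic image."""
    return d - box * round(d / box)

def overlaps(pos, i, trial, box, diameter):
    """True if disk i, placed at `trial`, would overlap any other disk."""
    for j, (x, y) in enumerate(pos):
        if j == i:
            continue
        dx = min_image(trial[0] - x, box)
        dy = min_image(trial[1] - y, box)
        if dx * dx + dy * dy < diameter * diameter:
            return True
    return False

def hard_disk_mc(pos, box, diameter, max_step, n_moves, seed=0):
    """Metropolis sampling for rigid disks: a move is accepted iff no overlap."""
    rng = random.Random(seed)
    pos = list(pos)
    accepted = 0
    for _ in range(n_moves):
        i = rng.randrange(len(pos))
        x, y = pos[i]
        # Random displacement of one disk, wrapped back into the periodic box.
        trial = ((x + rng.uniform(-max_step, max_step)) % box,
                 (y + rng.uniform(-max_step, max_step)) % box)
        if not overlaps(pos, i, trial, box, diameter):
            pos[i] = trial
            accepted += 1
    return pos, accepted

# Hypothetical setup: 16 disks started on a square lattice in a 4 x 4 box.
start = [(ix + 0.5, iy + 0.5) for ix in range(4) for iy in range(4)]
final, n_acc = hard_disk_mc(start, box=4.0, diameter=0.8,
                            max_step=0.2, n_moves=2000)
```

Averages over the accepted configurations (for example the pair distribution at contact) would then feed an equation-of-state estimate; that step is omitted here.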
Article
We used isothermal titration calorimetry to study the equilibrium thermodynamics for formation of the physiologically-relevant redox protein complex between yeast ferricytochrome c and yeast ferricytochrome c peroxidase. A 1:1 binding stoichiometry was observed, and the binding free energies agree with results from other techniques. The binding is either enthalpy- or entropy-driven depending on the conditions, and the heat capacity change upon binding is negative. Increasing the ionic strength destabilizes the complex, and both the binding enthalpy and entropy increase. Increasing the temperature stabilizes the complex, indicating a positive van't Hoff binding enthalpy, yet the calorimetric binding enthalpy is negative (-1.4 to -6.2 kcal mol⁻¹). We suggest that this discrepancy is caused by solvent reorganization in an intermediate state. The measured enthalpy and heat capacity changes are in reasonable agreement with the values estimated from the surface area change upon complex formation. These results are compared to those for formation of the horse ferricytochrome c/yeast ferricytochrome c peroxidase complex. The results suggest that the crystal and solution structures for the yeast complex are the same, while the crystal and solution structures for horse cytochrome c/yeast cytochrome c peroxidase are different.
Article
A computationally tractable strategy has been developed to refine protein-protein interfaces that models the effects of side-chain conformational change, solvation and limited rigid-body movement of the subunits. The proteins are described at the atomic level by a multiple copy representation of side-chains modelled according to a rotamer library on a fixed peptide backbone. The surrounding solvent environment is described by "soft" sphere Langevin dipoles for water that interact with the protein via electrostatic, van der Waals and field-dependent hydrophobic terms. Energy refinement is based on a two-step process in which (1) a probability-based conformational matrix of the protein side-chains is refined iteratively by a mean field method. A side-chain interacts with the protein backbone and the probability-weighted average of the surrounding protein side-chains and solvent molecules. The resultant protein conformations then undergo (2) rigid-body energy minimization to relax the protein interface. Steps (1) and (2) are repeated until convergence of the interaction energy. The influence of refinement on side-chain conformation starting from unbound conformations found improvement in the RMSD of side-chains in the interface of protease-inhibitor complexes, and shows that the method leads to an improvement in interface geometry. In terms of discriminating between docked structures, the refinement was applied to two classes of protein-protein complex: five protease-protein inhibitor and four antibody-antigen complexes. A large number of putative docked complexes have already been generated for the test systems using our rigid-body docking program, FTDOCK. They include geometries that closely resemble the crystal complex, and therefore act as a test for the refinement procedure. 
In the protease-inhibitors, geometries that resemble the crystal complex are ranked in the top four solutions for four out of five systems when solvation is included in the energy function, against a background of between 26 and 364 complexes in the data set. The results for the antibody-antigen complexes are not as encouraging, with only two of the four systems showing discrimination. It would appear that these results reflect the somewhat different binding mechanism dominant in the two types of protein-protein complex. Binding in the protease-inhibitors appears to be "lock and key" in nature. The fixed backbone and mobile side-chain representation provide a good model for binding. Movements in the backbone geometry of antigens on binding represent an "induced-fit" and provide more of a challenge for the model. Given the limitations of the conformational sampling, the ability of the energy function to discriminate between native and non-native states is encouraging. Development of the approach to include greater conformational sampling could lead to a more general solution to the protein docking problem.
Article
A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point. The simplex adapts itself to the local landscape, and contracts on to the final minimum. The method is shown to be effective and computationally compact. A procedure is given for the estimation of the Hessian matrix in the neighbourhood of the minimum, needed in statistical estimation problems.
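The simplex idea in this abstract can be sketched in a few lines. The version below is a simplified variant written for illustration, not the published procedure: it uses reflection, expansion, a single inside contraction and a shrink step, and the coefficients, the stopping rule and the name `nelder_mead` are assumptions.

```python
def nelder_mead(f, x0, step=0.5, max_iter=500, tol=1e-8):
    """Minimize f over n variables with a simplex of n + 1 vertices."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex offset along each coordinate axis.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Stop once the simplex has contracted onto a point.
        size = max(abs(p[k] - best[k]) for p in simplex for k in range(n))
        if size < tol:
            break
        # Centroid of every vertex except the worst one.
        c = [sum(p[k] for p in simplex[:-1]) / n for k in range(n)]

        def along(t):
            """Point c + t * (worst - c) on the line through c and worst."""
            return [c[k] + t * (worst[k] - c[k]) for k in range(n)]

        xr = along(-1.0)                      # reflection of the worst vertex
        if f(xr) < f(best):
            xe = along(-2.0)                  # expansion further along that line
            simplex[-1] = xe if f(xe) < f(xr) else xr
        elif f(xr) < f(simplex[-2]):
            simplex[-1] = xr
        else:
            xc = along(0.5)                   # inside contraction
            if f(xc) < f(worst):
                simplex[-1] = xc
            else:                             # shrink all vertices toward best
                simplex = [best] + [
                    [best[k] + 0.5 * (p[k] - best[k]) for k in range(n)]
                    for p in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0]

# Hypothetical test function with its minimum at (1, -2).
def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

x_min = nelder_mead(bowl, [0.0, 0.0])
```

No derivatives are needed, which is why the method suits objective functions that are only available as computed values.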
Article
The Protein Data Bank is a computer-based archival file for macromolecular structures. The Bank stores in a uniform format atomic co-ordinates and partial bond connectivities, as derived from crystallographic studies. Text included in each data entry gives pertinent information for the structure at hand (e.g. species from which the molecule has been obtained, resolution of diffraction data, literature citations and specifications of secondary structure). In addition to atomic co-ordinates and connectivities, the Protein Data Bank stores structure factors and phases, although these latter data are not placed in any uniform format. Input of data to the Bank and general maintenance functions are carried out at Brookhaven National Laboratory. All data stored in the Bank are available on magnetic tape for public distribution, from Brookhaven (to laboratories in the Americas), Tokyo (Japan), and Cambridge (Europe and worldwide). A master file is maintained at Brookhaven and duplicate copies are stored in Cambridge and Tokyo. In the future, it is hoped to expand the scope of the Protein Data Bank to make available co-ordinates for standard structural types (e.g. alpha-helix, RNA double-stranded helix) and representative computer programs of utility in the study and interpretation of macromolecular structures.
Article
A novel algorithm is presented which models protein-protein interactions using surface complementarity. The method is applied to antibody-antigen docking. A steric scoring scheme, based upon a soft potential, is used to assess complementarity, and a simple electrostatic model is then used to remove infeasible interactions. The soft potential allows for structural changes that occur during docking. Biochemical knowledge is necessary to reduce the number of docking orientations produced by the method to a manageable size. The information used includes the known epitope residues and a single loose distance constraint. The method is applied to all three crystallographically determined antibody-lysozyme complexes, HyHEL-10, D1.3 and HyHEL-5. For the first time, a predicted antibody structure (that of D1.3) is used as a docking target. In the four systems modelled, the method identifies between 15 and 40 possible docking orientations. The root-mean-square (r.m.s.) deviation between these orientations and the relevant crystallographic complex is measured in the interface region. For all four complexes an orientation is found with r.m.s. deviation in the range 1.9 Å to 4.8 Å. The algorithm is implemented on a single instruction/multiple datastream (SI/MD) architecture computer. The use of a parallel architecture computer ensures detailed coverage of the search space, whilst still maintaining a search time of two days.
Article
A solvation energy function for use in the molecular simulation of proteins is proposed. It is based on the accessible surface areas of atoms in the protein and on atomic solvation parameters derived from empirical vapor-to-water free energies of transfer of amino acid side-chain analogs. The energy function and its derivatives were added to the CHARMM molecular simulation program (Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S., & Karplus, M., 1983, J. Comput. Chem. 4(2), 187-217). The effect of the added energy term was evaluated by 110 ps of molecular dynamics on the 26-residue protein melittin. The melittin monomer and tetramer were studied both with and without the added term. With the added energy term the monomer partially unfolded, while the secondary structure of the tetramer was preserved, in agreement with reported experiments (Brown, L.R., Lauterwein, J., & Wuethrich, K., 1980, Biochim. Biophys. Acta 622(2), 231-244; Lauterwein, J., Brown, L.R., & Wuethrich, K., 1980, Biochim. Biophys. Acta 622(2), 219-230).
Article
The crystal structure of a 1:1 complex between yeast cytochrome c peroxidase and yeast iso-1-cytochrome c was determined at 2.3 Å resolution. This structure reveals a possible electron transfer pathway unlike any previously proposed for this extensively studied redox pair. The shortest straight line between the two hemes closely follows the peroxidase backbone chain of residues Ala194, Ala193, Gly192, and finally Trp191, the indole ring of which is perpendicular to, and in van der Waals contact with, the peroxidase heme. The crystal structure at 2.8 Å of a complex between yeast cytochrome c peroxidase and horse heart cytochrome c was also determined. Although crystals of the two complexes (one with cytochrome c from yeast and the other with cytochrome c from horse) grew under very different conditions and belong to different space groups, the two complex structures are closely similar, suggesting that cytochrome c interacts with its redox partners in a highly specific manner.
Article
Conformational searches by molecular dynamics and different types of Monte Carlo or build-up methods usually aim to find the lowest-energy conformation. However, this is often misleading, as the energy functions used in conformational calculations are imprecise. For instance, though positions of local minima defined by the repulsive part of the Lennard-Jones potential are usually altered only slightly by functional modification, the relative depths of the minima could change significantly. Thus, the purpose of conformational searches and, correspondingly, performance criteria should be reformulated and appropriate methods found to extract different local minima from the search trajectory and allow visualization in the search space. Attempts at convergence to the lowest-energy structure should be replaced with efforts to visit a maximum number of different local energy minima with energies within a certain range. We use this quantitative criterion consistently to evaluate performances of different search procedures. To utilize information generated in the course of simulation, a "stack" of low energy conformations is created and stored. It keeps track of variables and visit numbers for the best representatives of different conformational families. To visualize the search, projection of multidimensional walks onto a principal plane defined by a set of reference structures is used. With Met-enkephalin as a structural example and a Monte Carlo procedure combined with energy minimization (MCM) as a basic search method, we analyzed the influence on search efficiency of different characteristics as temperature schedules, the step size for variable modification, constrained random step and response mechanisms to search difficulties. Simulated annealing MCM had comparable efficiency with MCM at constant and elevated temperature (about 600 K). 
Constraining the randomized choice of side-chain chi angles to optimal values (rotamers) on every MCM step did not improve, but rather worsened, the search efficiency. Two low-energy Met-enkephalin conformations with parallel Tyr1 and Phe4 rings, a gamma-turn around the Gly2 residue, and Phe4 and Met5 side-chains forming together a compact hydrophobic cluster were found and are suggested as possible structural candidates for interaction with a receptor or a membrane.
Article
Molecular surfaces are fitted to each other by a new solution to the problem of docking a ligand into the active site of a protein molecule. The procedure constructs patterns of points on the surfaces and superimposes them upon each other using a least-squares best-fit algorithm. This brings the surfaces into contact and provides a direct measure of their local complementarity. The search over the ligand surface produces a large number of dockings, of which a small fraction having the best complementarity and the least steric hindrance are evaluated for electrostatic interaction energy. When applied to molecules taken from crystallographically observed complexes, this procedure consistently assigns the lowest electrostatic energies to correct dockings. On independently determined structures, the ability of the method to discern correct dockings depends on how much conformational difference there is between the free and complexed forms of the molecules. The procedure is found to be fast enough on contemporary workstation computers to permit many conformations to be considered, and tolerant enough to make rather coarse bond dihedral sampling a practicable way to overcome the problem of structural flexibility.
Article
We present a method to search for possible binding modes of molecular fragments at a specific site of a potential drug target of known structure. Our method is based on a Monte Carlo (MC) algorithm applied to the translational and rotational degrees of freedom of the probe fragment. Starting from a randomly generated initial configuration, favorable binding modes are generated using a two-step process. An MC run is first performed in which the energy in the Metropolis algorithm is substituted by a score function that measures the average distance of the probe to the target surface. This has the effect of making buried probes move toward the target surface and also allows enhanced sampling of deep pockets. In a second MC run, a pairwise atom potential function is used, and the temperature parameter is slowly lowered during the run (Simulated Annealing). We repeat this procedure starting from a large number of different randomly generated initial configurations in order to find all energetically favorable docking modes in a specified region around the target. We test this method using two inhibitor-receptor systems: Streptomyces griseus proteinase B in complex with the third domain of the ovomucoid inhibitor from turkey, and dihydrofolate reductase from E. coli in complex with methotrexate. The method could consistently reproduce the complex found in the crystal structure searching from random initial positions in cubes ranging from 25 Å to 50 Å about the binding site. In the case of SGPB, we were also successful in docking to the native structure. In addition, we were successful in docking small probes in a search that included the entire protein surface.
Article
Antibody-lysozyme and protease-inhibitor complexes are reconstituted by docking lysozyme as a rigid body onto the combining site of the antibodies and the inhibitors onto the active site of the proteases. Simplified protein models with one sphere per residue are subjected to simulated annealing using a crude energy function where the attractive component is proportional to the interface area. The procedure finds clusters of orientations in which a steric fit between the two protein components is achieved over a large contact surface. With five out of six complexes, the native structure of the complexes determined by X-ray crystallography is among those retained. Docked complexes are then subjected to conformational energy refinement with full atomic detail. With Fab HyHEL 5 and lysozyme, a native-like complex has the lowest refined energy. It can also be retrieved when starting with the X-ray structure of free lysozyme. However, some non-native complexes cannot be rejected: they form large interfaces, have a large number of H-bonds, and few unpaired polar groups. While these are necessary features of protein-protein recognition, they are not sufficient in determining specificity.
Article
Predicting the structures of protein-protein complexes is a difficult problem owing to the topographical and thermodynamic complexity of these structures. Past efforts in this area have focussed on fitting the interacting proteins together using rigid body searches, usually with the conformations of the proteins as they occur in crystal structure complexes. Here we present work which uses a rigid body docking method to generate the structures of three known protein complexes, using both the bound and unbound conformations of the interacting molecules. In all cases we can regenerate the geometry of the crystal complexes to high accuracy. We also are able to find geometries that do not resemble the crystal structure but nevertheless are surprisingly reasonable both mechanistically and by some simple physical criteria. In contrast to previous work in this area, we find that simple methods for evaluating the complementarity at the protein-protein interface cannot distinguish between the configurations that resemble the crystal structure complex and those that do not. Methods that could not distinguish between such similar and dissimilar configurations include surface area burial, solvation free energy, packing and mechanism-based filtering. Evaluations of the total interaction energy and the electrostatic interaction energy of the complexes were somewhat better. Of the techniques that we tried, energy minimization distinguished most clearly between the "true" and "false" positives, though even here the energy differences were surprisingly small. We found the lowest total interaction energy from amongst all of the putative complexes generated by docking was always within 5 Å root-mean-square of the crystallographic structure. There were, however, several putative complexes that were very dissimilar to the crystallographic structure but had energies that were close to that of the low energy structure. 
The magnitude of the error in energy calculations has not been established in macromolecular systems, and thus the reliability of the small differences in energy remains to be determined. The ability of this docking method to regenerate the crystallographic configurations of the interacting proteins using their unbound conformations suggests that it will be a useful tool in predicting the structures of unsolved complexes.
Article
Molecular recognition is achieved through the complementarity of molecular surface structures and energetics with, most commonly, associated minor conformational changes. This complementarity can take many forms: charge-charge interaction, hydrogen bonding, van der Waals' interaction, and the size and shape of surfaces. We describe a method that exploits these features to predict the sites of interactions between two cognate molecules given their three-dimensional structures. We have developed a “cube representation” of molecular surface and volume which enables us not only to design a simple algorithm for a six-dimensional search but also to allow implicitly the effects of the conformational changes caused by complex formation. The present molecular docking procedure may be divided into two stages. The first is the selection of a population of complexes by geometric “soft docking”, in which surface structures of two interacting molecules are matched with each other, allowing minor conformational changes implicitly, on the basis of complementarity in size and shape, close packing, and the absence of steric hindrance. The second is a screening process to identify a subpopulation with many favorable energetic interactions between the buried surface areas. Once the size of the subpopulation is small, one may further screen to find the correct complex based on other criteria or constraints obtained from biochemical, genetic, and theoretical studies, including visual inspection.
Article
An optimized method based on the principle of simulated annealing is presented for determining the relative position and orientation of interacting molecules. The spatial relationships of these molecules are described by intermolecular distance constraints between specific pairs of atoms, such as found in hydrogen bonds or from experimentally determined data. The method makes use of a random walk through six rotational and translational degrees of freedom where the constituent molecules are treated as rigid bodies. Van der Waals repulsions are used only to define a lower bound on distances between constrained atom pairs within the docking procedure. A cost function comprised of purely geometric constraints is optimized via simulated annealing, in order to search for the best orientation and position of the two molecules. Our docking procedure is applied to eight serine proteinase complexes from the Brookhaven Protein Data Bank. For each simulation 100 computations were performed. A typical docking computation requires only a few seconds of CPU time on a VAXserver 3500. The influence of the number of constraints on the final docked positions was studied. The sensitivity of the docking procedure to a ligand structure which is not well defined is also addressed. Possible applications of this method include using approximate distances incorporating complete energy functions.
Article
A new methodology for the conformational modelling of biomolecular systems (1) is extended to local deformations of chain molecules and to flexible molecular rings. It is shown that these two cases may be reduced to considering an equivalent molecular model with a regular tree-like topology. A simple procedure is developed to analyze any flexible rings (the five- and six-membered sugar rings of carbohydrates and nucleic acids, in particular) and local deformation regions by energy minimization. Dynamic equations are also derived for such molecular systems. As a result, a unified approach is proposed for the efficient energy minimization and simulation of dynamic behavior of multimolecular systems having any set of variable internal coordinates, local deformation regions and cycles. Advantages and domains of applicability of the approach are discussed.
Article
A general methodology is proposed for the conformational modelling of biomolecular systems. The approach allows one: (i) to describe the system under investigation by an arbitrary set of internal variables, i.e., torsion angles, bond angles, and bond lengths; it offers a possibility to pass from the free structure to a completely fixed one with the number of variables from 3N to zero, respectively, where N is the number of atoms; (ii) to consider both, a single molecule and a complex of many molecules, (e.g., proteins, water, ligands, etc.) in terms of one universal model; (iii) to study the dynamics of the system using explicit analytical Lagrangian equations of motion, thus opening up possibilities for investigations of slow concerted motions such as domain oscillations in proteins etc.; (iv) to calculate the partial derivatives of various functions of conformation, e.g., the conformational energy or external constraints imposed, using a standard efficient procedure regardless of the variables and the structure of the system. The approach is meant to be used in various investigations concerning the conformations and dynamics of biomacromolecules.
Article
A Monte Carlo-minimization method has been developed to overcome the multiple-minima problem. The Metropolis Monte Carlo sampling, assisted by energy minimization, surmounts intervening barriers in moving through successive discrete local minima in the multidimensional energy surface. The method has located the lowest-energy minimum thus far reported for the brain pentapeptide [Met5]enkephalin in the absence of water. Presumably it is the global minimum-energy structure. This supports the concept that protein folding may be a Markov process. In the presence of water, the molecules appear to exist as an ensemble of different conformations.
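The core loop of Monte Carlo-minimization (random step, local minimization, Metropolis test on the minimized energies) can be sketched on a toy one-dimensional surface. Everything concrete below is an assumption made for illustration: the double-well `energy` function stands in for a real force field, plain gradient descent stands in for the local minimizer, and the jump size and `kT` are arbitrary.

```python
import math
import random

def energy(x):
    """Toy double-well surface (hypothetical stand-in for a force field)."""
    return x ** 4 - 4.0 * x ** 2 + x

def gradient(x):
    return 4.0 * x ** 3 - 8.0 * x + 1.0

def minimize(x, lr=0.01, n_steps=2000):
    """Local minimization by fixed-step gradient descent."""
    for _ in range(n_steps):
        x -= lr * gradient(x)
    return x

def monte_carlo_minimization(x0, n_steps=50, max_jump=2.5, kT=1.0, seed=1):
    rng = random.Random(seed)
    x = minimize(x0)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_steps):
        # Random perturbation followed by local minimization.
        trial = minimize(x + rng.uniform(-max_jump, max_jump))
        e_trial = energy(trial)
        # Metropolis criterion applied to the minimized energies.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
        # Keep the lowest minimum visited, whether or not it was accepted.
        if e_trial < best_e:
            best_x, best_e = trial, e_trial
    return best_x, best_e

# Start in the shallower right-hand well; the deeper well lies near x = -1.47.
best_x, best_e = monte_carlo_minimization(1.0)
```

Because every state in the Markov chain is a local minimum, the walk hops between minima rather than spending steps climbing barriers.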
Article
The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values. The probes include water, the methyl group, amine nitrogen, carboxy oxygen, and hydroxyl. Contour surfaces at appropriate energy levels are calculated for each probe and displayed by computer graphics together with the protein structure. Contours at negative energy levels delineate regions of attraction between probe and protein and are found at known ligand binding clefts in particular. The contours also enable other regions of attraction to be identified and facilitate the interpretation of protein-ligand energetics. They may, therefore, be of value for drug design.
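Computing probe energies on a lattice, as described above, can be sketched as follows. This is only an illustration under stated assumptions: a single 6-12 Lennard-Jones term replaces the full probe energy function, the parameters `eps` and `r_min`, the distance floor `r_floor` and the names `lj_probe_energy` and `energy_grid` are hypothetical, and the "protein" is one atom at the origin.

```python
def lj_probe_energy(r, eps=0.2, r_min=3.5, r_floor=0.1):
    """6-12 probe-atom energy with its minimum of -eps at r = r_min.

    r_floor is an assumed guard against division by zero at an atom centre.
    """
    r = max(r, r_floor)
    s6 = (r_min / r) ** 6
    return eps * (s6 * s6 - 2.0 * s6)

def energy_grid(atoms, lo, hi, spacing):
    """Tabulate the probe energy at every node of a cubic lattice."""
    n = int(round((hi - lo) / spacing)) + 1
    axis = [lo + i * spacing for i in range(n)]
    grid = {}
    for x in axis:
        for y in axis:
            for z in axis:
                e = 0.0
                for ax, ay, az in atoms:
                    r = ((x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2) ** 0.5
                    e += lj_probe_energy(r)
                grid[(x, y, z)] = e
    return grid

# Hypothetical one-atom "protein" at the origin, 0.5 Å lattice spacing.
grid = energy_grid([(0.0, 0.0, 0.0)], lo=-4.0, hi=4.0, spacing=0.5)
best_point = min(grid, key=grid.get)   # one of the most attractive nodes
```

Nodes with negative values mark attractive regions; once tabulated, the array can be contoured or interpolated instead of re-evaluating pairwise terms.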
Article
We describe here a novel procedure for automated protein docking, based only on geometric criteria. In our algorithm we project protein surfaces into bi-dimensional matrices; the search for complementary regions is performed by detecting matching sub-matrices. An exhaustive sampling of the rotation space is made in order to analyse all the possible relative orientations of the two proteins, but nevertheless this procedure requires a relatively short processing time (3 h to 24 h cpu time on a SG4D320, depending on the complexity of the input information). When tested with co-crystallized, free components and models of components of known protein-protein complexes, the method gave very satisfactory results. The procedure selects no more than four relative orientations of the molecular components, but the correct orientation is always present among them, ranking either first or second. In more than half the cases the "wrong" solutions nevertheless correctly identify most of the residues involved in the interaction. This is remarkable also in view of the fact that the chosen test complexes (trypsin-trypsin inhibitor and antibody-lysozyme) have a very different geometry of surface complementarity: trypsin inhibitor inserts a long side-chain into the deep specificity pocket of the protease, while the interface between antibody and lysozyme is rather flat and contains buried water molecules (not included in the calculation). In order to simulate a more realistic protein docking problem, we also used a trypsin inhibitor and an anti-lysozyme antibody model in our simulations, again with satisfying results.
Article
A typical problem for a docking procedure is how to match two molecules with known 3-D structure so as to predict the configuration of their complex. A very serious obstacle to docking is an inherent inaccuracy in the 3-D structures of the molecules. In general, existing molecular recognition techniques are not designed for cases where (i) conformational changes upon macromolecular complex formation are substantial or (ii) the X-ray data on one or both (macro) molecules are not available, and the structures, based on alternative sources (NMR, modeling), are not well defined. We designed a direct computer experiment using molecules totally deprived of any structural features smaller than 7 Å. This was performed on the basis of a previously developed docking algorithm. The modified procedure was applied to a number of known protein complexes taken from the Brookhaven Protein Data Bank. In most cases, a pronounced trend towards the correct structure of the molecular complex was clearly indicated and the real binding sites were predicted. The distinction between the prediction of the antigen-antibody complex and other molecular pairs may reflect important differences in the principles of complex formation. The results strongly suggest the use of our recognition procedure for docking studies where the detailed structures of the molecules are lacking.
Article
Copper-substituted cytochrome c (CuCc) has been used as a structurally faithful, redox-inert inhibitor to probe the mechanism of electron transfer (ET) between Cc molecules and cytochrome c peroxidase (CcP). This inhibitor enhances photoinduced ET quenching of the triplet excited state of a zinc-substituted protein (ZnCcP or ZnCc) by its iron(III) partner (Fe3+Cc or Fe3+CcP). These results show that CcP and Cc form a ternary complex in which one Cc molecule binds tightly at a surface domain of CcP having low ET reactivity, whereas the second Cc molecule binds weakly to the 1:1 complex at a second domain with markedly greater (approximately 10³) reactivity. These results also rule out the possibility that Cc bound at the second domain cooperatively enhances ET to Cc at the first domain. The multiphasic kinetics observed for the photoproduced ET intermediate do not reflect electron self-exchange between two Cc molecules within the ternary complex.
Article
The fundamental event in biological assembly is association of two biological macromolecules. Here we present a successful, accurate ab initio prediction of the binding of uncomplexed lysozyme to the HyHel5 antibody. The prediction combines pseudo Brownian Monte Carlo minimization with a biased-probability global side-chain placement procedure. It was effected in an all-atom representation, with ECEPP/2 potentials complemented with the surface energy, side-chain entropy and electrostatic polarization free energy. The near-native solution found was surprisingly close to the crystallographic structure (root-mean-square deviation of 1.57 Å for all backbone atoms of lysozyme) and had a considerably lower energy (by 20 kcal mol⁻¹) than any other solution.
Article
In the classical procedures for predicting the structure of protein complexes two molecules are brought in contact at multiple relative positions, the extent of complementarity (geometric and/or energy) at the surface of contact is assessed at each position, and the best fits are retrieved. In view of the higher occurrence of hydrophobic groups at contact sites, their contribution results in more intermolecular atom-atom contacts per unit area for correct matches than for false positive fits. The hydrophobic groups are also potentially less flexible at the surface. Thus, from a practical point of view, a partial representation of the molecules based on hydrophobic groups should improve the quality of the results in finding molecular recognition sites, as compared to full representation. We tested this proposal by applying the idea to an existing geometric fit procedure and compared the results obtained with full vs. hydrophobic representations of molecules in known molecular complexes. The hydrophobic docking yielded distinctly higher signal-to-noise ratio so that the correct match is discriminated better from false positive fits. It appears that nonhydrophobic groups contribute more to false matches. The results are discussed in terms of their relevance to molecular recognition techniques as compared to energy calculations.
Article
We have developed a geometry-based suite of processes for molecular docking. The suite consists of a molecular surface representation, a docking algorithm, and a surface inter-penetration and contact filter. The surface representation is composed of a sparse set of critical points (with their associated normals) positioned at the face centers of the molecular surface, providing a concise yet representative set. The docking algorithm is based on the Geometric Hashing technique, which indexes the critical points with their normals in a transformation invariant fashion preserving the multi-element geometric constraints. The inter-penetration and surface contact filter features a three-layer scoring system, through which docked models with high contact area and low clashes are funneled. This suite of processes enables a pipelined operation of molecular docking with high efficacy. Accurate and fast docking has been achieved with a rich collection of complexes and unbound molecules, including protein-protein and protein-small molecule associations. An energy evaluation routine assesses the intermolecular interactions of the funneled models obtained from the docking of the bound molecules by pairwise van der Waals and Coulombic potentials. Applications of this routine demonstrate the goodness of the high scoring, geometrically docked conformations of the bound crystal complexes.
Article
Two major components are required for a successful prediction of the three-dimensional structure of peptides and proteins: an efficient global optimization procedure which is capable of finding an appropriate local minimum for the strongly anisotropic function of hundreds of variables, and a set of free energy components for a protein molecule in solution which are computationally inexpensive enough to be used in the search procedure, yet sufficiently accurate to ensure the uniqueness of the native conformation. We here found an efficient way to make a random step in a Monte Carlo procedure given knowledge of the energy or statistical properties of conformational subspaces (e.g. phi-psi zones or side-chain torsion angles). This biased probability Monte Carlo (BPMC) procedure randomly selects the subspace first, then makes a step to a new random position independent of the previous position, but according to the predefined continuous probability distribution. The random step is followed by a local minimization in torsion angle space. The positions, sizes and preferences for high-probability zones on phi-psi maps and chi-angle maps were calculated for different residue types from the representative set of 191 and 161 protein 3D-structures, respectively. A fast and precise method to evaluate the electrostatic energy of a protein in solution is developed and combined with the BPMC procedure. The method is based on the modified spherical image charge approximation, efficiently projected onto a molecule of arbitrary shape. Comparison with the finite-difference solutions of the Poisson-Boltzmann equation shows high accuracy for our approach. The BPMC procedure is applied successfully to the structure prediction of 12- and 16-residue synthetic peptides and the determination of protein structure from NMR data, with the immunoglobulin binding domain of streptococcal protein G as an example. The BPMC runs display much better convergence properties than the non-biased simulations. The advantage of a true global optimization procedure for NMR structure determination is its ability to cope with local minima originating from data errors and ambiguities in NMR data.
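The BPMC step described in the abstract above (pick a subspace, draw a new position from a predefined distribution independently of the old one, then minimize locally) can be sketched on a toy problem. Everything below is an assumption made for illustration: the three-well torsion energy, the `ZONES` table of Gaussian zones, the plain gradient-descent minimizer, and the Metropolis acceptance test are stand-ins, not the authors' energy function or implementation.

```python
import math
import random

# Hypothetical high-probability zones for one torsion angle:
# (center in degrees, standard deviation in degrees, probability).
ZONES = [(-60.0, 15.0, 0.5), (60.0, 15.0, 0.2), (180.0, 15.0, 0.3)]


def energy(angles):
    """Toy torsion energy: wells near -60, 60 and 180 degrees, lowest at -60."""
    total = 0.0
    for a in angles:
        r = math.radians(a)
        total += 1.0 + math.cos(3.0 * r)                         # three-fold term
        total += 0.3 * (1.0 - math.cos(r + math.radians(60.0)))  # favors -60
    return total


def local_minimize(angles, rate=200.0, iterations=100, h=1e-3):
    """Plain gradient descent using central-difference gradients (degrees)."""
    x = list(angles)
    for _ in range(iterations):
        grad = []
        for i in range(len(x)):
            up = x[:i] + [x[i] + h] + x[i + 1:]
            down = x[:i] + [x[i] - h] + x[i + 1:]
            grad.append((energy(up) - energy(down)) / (2.0 * h))
        x = [xi - rate * gi for xi, gi in zip(x, grad)]
    return x


def sample_from_zones(rng):
    """Draw an angle from the zone mixture, ignoring the current value."""
    center, sd, _ = rng.choices(ZONES, weights=[z[2] for z in ZONES])[0]
    value = rng.gauss(center, sd)
    return (value + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)


def bpmc(n_angles, n_steps, temperature, rng):
    """Biased random step + local minimization + Metropolis acceptance."""
    current = local_minimize([rng.uniform(-180.0, 180.0) for _ in range(n_angles)])
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    for _ in range(n_steps):
        trial = list(current)
        i = rng.randrange(n_angles)         # select the subspace first
        trial[i] = sample_from_zones(rng)   # step independent of old position
        trial = local_minimize(trial)       # relax in torsion space
        e_trial = energy(trial)
        accept = e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / temperature)
        if accept:
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best


best, e_best = bpmc(n_angles=4, n_steps=200, temperature=0.2, rng=random.Random(0))
print([round(a, 1) for a in best], round(e_best, 6))
```

The design point the sketch tries to show is that the proposal distribution carries prior knowledge (here, the zone probabilities), so moves land near plausible minima far more often than uniform random steps would.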
Article
Several sets of amino acid surface areas and transfer free energies were used to derive a total of nine sets of atomic solvation parameters (ASPs). We tested the accuracy of each of these sets of parameters in predicting the experimentally determined transfer free energies of the amino acid derivatives from which the parameters were derived. In all cases, the calculated and experimental values correlated well. We then chose three parameter sets and examined the effect of adding an energetic correction for desolvation based on these three parameter sets to the simple potential function used in our multiple start Monte Carlo docking method. A variety of protein-protein interactions and docking results were examined. In the docking simulations studied, the desolvation correction was only applied during the final energy calculation of each simulation. For most of the docking results we analyzed, the use of an octanol-water-based ASP set marginally improved the energetic ranking of the low-energy dockings, whereas the other ASP sets we tested disturbed the ranking of the low-energy dockings in many of the same systems. We also examined the correlation between the experimental free energies of association and our calculated interaction energies for a series of proteinase-inhibitor complexes. Again, the octanol-water-based ASP set was compatible with our standard potential function, whereas ASP sets derived from other solvent systems were not.
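A desolvation correction of the kind described in the abstract above is commonly written as a sum over atom types of an atomic solvation parameter times the change in solvent-accessible area on binding. The sketch below assumes that form and that sign convention (a positive parameter means exposure to water is unfavorable, so burial lowers the energy). The function name, the atom typing and the numerical values are illustrative placeholders, not any of the nine parameter sets derived in the paper.

```python
def desolvation_energy(area_free, area_complex, asp):
    """Sum of sigma_i * (A_i_complex - A_i_free) over atom types.

    area_free, area_complex: solvent-accessible area per atom type (A^2),
    summed over both partners before and after association.
    asp: atomic solvation parameter per atom type (cal/mol/A^2).
    """
    total = 0.0
    for atom_type in set(area_free) | set(area_complex):
        delta_area = area_complex.get(atom_type, 0.0) - area_free.get(atom_type, 0.0)
        total += asp[atom_type] * delta_area  # KeyError if a type has no parameter
    return total


# Placeholder parameters and areas, chosen only to make the arithmetic visible.
asp = {"C": 16.0, "N": -6.0, "O": -6.0}
area_free = {"C": 500.0, "N": 100.0, "O": 150.0}
area_complex = {"C": 400.0, "N": 80.0, "O": 120.0}

# 16*(-100) + (-6)*(-20) + (-6)*(-30) = -1300 cal/mol
print(desolvation_energy(area_free, area_complex, asp))
```

In a docking run that applies the correction only at the final scoring stage, as the abstract describes, a term like this would be added to the interaction energy of each candidate complex before ranking.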
Article
The structure of the TEM-1 beta-lactamase in complex with the inhibitor BLIP has been determined at 1.7 Å resolution. The two tandemly repeated domains of BLIP form a polar, concave surface that docks onto a predominantly polar, convex protrusion on the enzyme. The ability of BLIP to adapt to a variety of class A beta-lactamases is most likely due to an observed flexibility between the two domains of the inhibitor and to an extensive layer of water molecules entrapped between the enzyme and inhibitor. A beta-hairpin loop from domain 1 of BLIP is inserted into the active site of the beta-lactamase. The carboxylate of Asp 49 forms hydrogen bonds to four conserved, catalytic residues in the beta-lactamase, thereby mimicking the position of the penicillin G carboxylate observed in the acyl-enzyme complex of TEM-1 with substrate. This beta-hairpin may serve as a template with which to create a new family of peptide-analogue beta-lactamase inhibitors.
Article
A long sought goal in the physical chemistry of macromolecular structure, and one directly relevant to understanding the molecular basis of biological recognition, is predicting the geometry of bimolecular complexes from the geometries of their free monomers. Even when the monomers remain relatively unchanged by complex formation, prediction has been difficult because the free energies of alternative conformations of the complex have been difficult to evaluate quickly and accurately. This has forced the use of incomplete target functions, which typically do no better than to provide tens of possible complexes with no way of choosing between them. Here we present a general framework for empirical free energy evaluation and report calculations, based on a relatively complete and easily executable free energy function, that indicate that the structures of complexes can be predicted accurately from the structures of monomers, including close sequence homologues. The calculations also suggest that the binding free energies themselves may be predicted with reasonable accuracy. The method is compared to an alternative formulation that has also been applied recently to the same data set. Both approaches promise to open new opportunities in macromolecular design and specificity modification.