Article

Evaluation of the 3D-Dock protein docking suite in rounds 1 and 2 of the CAPRI blind trial


Abstract

The 3D-Dock suite of programs has been used to make predictions for the seven targets in rounds 1 and 2 of the CAPRI method evaluation exercise. Some correct contacts were obtained in at least one prediction for four of seven targets. Target 06 was predicted very well, with an RMSD of the ligand after superimposition of the receptor of only 0.77 Å. We investigate the performance of the various stages of the method, with the aim of finding where improvements need to be made, and in particular whether the manual interventions that were made were essential, and whether results of the level of accuracy obtained for target 06 may be expected with confidence.
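The ligand RMSD quoted in the abstract can be illustrated with a minimal Python sketch. It assumes that the predicted and reference receptors have already been superimposed, so that both ligand coordinate sets are in the same frame, and that atoms are listed in the same order. The function name `ligand_rmsd` and the toy coordinates are hypothetical and do not come from the paper.

```python
import math

def ligand_rmsd(predicted, reference):
    """RMSD between matched ligand atoms, in the units of the input (e.g. Å).

    Assumes both coordinate lists are already in the frame obtained by
    superimposing the predicted receptor onto the reference receptor.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(predicted, reference):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(predicted))

# Hypothetical toy coordinates, not taken from any CAPRI target.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.5, 0.0)]
print(round(ligand_rmsd(predicted, reference), 2))
```

A real evaluation would first compute the receptor superposition (for example with a least-squares fit) and apply it to the predicted ligand before calling such a function.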


... d F r (%) is the F-score of each method for the receptor protein on the data set. e The values for 3D-Dock are taken from the literature [36,37]; blank entries mean that 3D-Dock produced no result for these targets. ... is the experimental structure. ...
... It can produce structures with the smallest iRMSD values in top 1000 predictions with minimum energy. 3D-Dock[36,37] uses an initial grid-based shape complementarity search to produce lots of potential interacting conformations. They can be ranked by using interface residue propensities and interaction energies. ...
Article
The ability to predict protein-protein binding sites has a wide range of applications, including signal transduction studies, de novo drug design, structure identification and comparison of functional sites. The interface in a complex involves two structurally matched protein subunits, and the binding sites can be predicted by identifying structural matches at protein surfaces. We propose a method which enumerates "all" the configurations (or poses) between two proteins (3D coordinates of the two subunits in a complex) and evaluates each configuration by the interaction between its components using the Atomic Contact Energy function. The enumeration is achieved efficiently by exploring a set of rigid transformations. Our approach incorporates a surface identification technique and a method for avoiding clashes of two subunits when computing rigid transformations. When the optimal transformations according to the Atomic Contact Energy function are identified, the corresponding binding sites are given as predictions. Our results show that this approach consistently performs better than other methods in binding site identification. Our method achieved a success rate higher than other methods, with the prediction quality improved in terms of both accuracy and coverage. Moreover, our method is able to predict the configurations of two binding proteins, whereas most other methods predict only the binding sites. The software package is available at http://sites.google.com/site/guofeics/dobi for non-commercial use.
... Some do this only in the first step, when many docking solutions are generated and unlikely ones eliminated. But this is then followed by sidechain modeling [24][25][26] and sometimes also limited backbone adjustments [27,28]. Such refinement steps seem to pay off in general, but their capacity to explore larger conformational changes is limited. ...
... To discriminate between native and non-native associations, docking methods use a range of different scoring schemes. Some are based on purely geometric measures [32,33], and others contain combinations of empirical energy [27,28] and/or database-derived [24] terms. Some CAPRI predictors, but certainly not all, manually reranked highly scoring solutions, hoping to improve the odds that the correct answer would be among the ten models that they were allowed to submit. ...
Article
Given the increasing interest in protein-protein interactions, the prediction of these interactions from sequence and structural information has become a booming activity. CAPRI, the community-wide experiment for assessing blind predictions of protein-protein interactions, is playing an important role in fostering progress in docking procedures. At the same time, novel methods are being derived for predicting regions of a protein that are likely to interact and for characterizing putative intermolecular contacts from sequence and structural data. Together with docking procedures, these methods provide an integrated computational approach that should be a valuable complement to genome-scale experimental studies of protein-protein interactions.
... This set of programs has been developed since 1995, and has been previously reported in the literature. [1][2][3][4][5][6] This software is available freely to academics (http://www.sbg.bio.ic.ac.uk/docking). Here we describe our results when applying this method to blind predictions. ...
... Previously, in rounds 1 and 2 of CAPRI, 3D-Dock was used to predict seven target systems. 6 Of note were three predicted structures for these systems. In round 1, an accurate prediction was made for target 2, 7 the recognition of the bovine rotavirus VP6 by a FAB fragment. ...
Article
In rounds 3-5 of CAPRI, the community-wide experiment on the comparative evaluation of protein-protein docking for structure prediction, we applied the 3D-Dock software package to predict the atomic structures of nine biophysical interactions. This approach starts with an initial grid-based shape complementarity search. The product of this is a large number of potential interacting conformations that are subsequently ranked by interface residue propensities and interaction energies. Refinement through detailed energetics and optimization of side-chain positions using a rotamer library is also performed. For rounds 3, 4, and 5 of the CAPRI evaluation, where possible, we clustered functional residues on the surfaces of the monomers as an indication of binding sites, using sequence based evolutionary conservations. In certain targets this provided a very useful tool for identifying the areas of interaction. During round 5, we also applied the techniques of side-chain trimming and geometrical clustering described in the literature. Of the nine target complexes in rounds 3-5, we predicted conformations that contained at least some correct contact residues for seven of these systems. For two of the targets, we submitted predictions that were considered as medium-quality. These were a nidogen-laminin complex for target 8 (T08) and a serine-threonine phosphatase bound to a targeting subunit (T14). For a further three target systems, we produced models that were rated as acceptable predictions.
... Just as in sampling techniques, scoring techniques have evolved by combining various features and properties. Methods such as GRAMM-X [53], ATTRACT [73,74], 3D-DOCK [50,75], LZERD [76], etc. use different combinations of solvation energy, dielectric constants, electrostatics, van der Waals interaction, hydrogen bonds, clashes, clustering using pairwise RMSD etc. ...
Article
Computational methods to predict the 3D structures of protein interactions fall into 3 categories—template based modeling, protein–protein docking and hybrid/integrative modeling. The two most important considerations for modeling methods are sampling and scoring conformations. Sampling has benefitted from techniques such as fast Fourier transforms (FFT), spherical harmonics and higher order manifolds. Scoring complexes to determine binding free energy is still a challenging problem. Rapid advances have been made in hybrid modeling where experimental data are amalgamated with computations. These methods have received a boost from the popularity of experimental methods such as electron microscopy (EM). While increasingly larger and complicated complexes are now getting elucidated by integrative methods, modeling conformational flexibility remains a challenge. Ongoing improvements to these techniques portend a future where organelles or even cells could be accurately modeled at a molecular level.
... 3D-Dock standalone suite consists of three servers that are FTDock (Fourier Transform Dock), RPScore (Residue level Pair potential Score) and MultiDock (Multiple copy sidechain refinement Dock). Input: Protein coordinate files in PDB format [73]. Output: FTDock outputs multiple predictions that can be screened using biochemical information. ...
Article
Drug discovery and development is an intense, lengthy and interdisciplinary process. Traditionally, drug discovery was done by synthesizing compounds in time-consuming multi-step processes, and the compounds were then investigated further to identify promising candidates. Nowadays in silico methods for drug design have come into play, which help in the identification of drug targets using various bioinformatics drug design tools. They can also be used to analyze the target structures for probable binding sites, generate potential molecules, examine their drug likeness, dock particular molecules with the target, rank them according to their binding affinities, further modify the molecules to improve their binding characteristics, and finally obtain potential candidates for drug discovery. As the structural information of many protein targets becomes available through X-ray crystallography, NMR and bioinformatics approaches, there is an increasing demand for computer-based tools which can recognize and inspect the active sites and suggest potential druggable units which could specifically bind to these active sites. The major advantages of these bioinformatics drug design tools are that they are available everywhere on the internet and that they offer decreased support costs, decreased license costs, software integration, easy monitoring and grid calculations. For the above-mentioned reasons, we compile in this review 49 online tools which could be beneficial to biotechnologists for in silico drug design.
... Among all surface complementarity methods, the one using the fast Fourier transform (FFT), which appeared as early as 1991 [126], is one of the simplest and most widely used [10,35,40,41,132,143,180,235,257]. A cubic grid is drawn and, at each point, a weight is assigned that is large and negative if the point lies inside protein A, zero if it lies outside, and 1 if it is close to the surface; the same is done for protein B. ...
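The grid weighting described in this snippet can be sketched on a toy 2D grid in Python. The sketch evaluates the complementarity score of each translation by direct summation; FFT-based programs obtain the same correlation for all translations at once, which is not shown here. The penalty value `INTERIOR_A = -15`, the use of weight 1 for every occupied cell of protein B, and all function names are assumptions made for illustration, since the snippet only says the interior weight is "large and negative".

```python
# Toy 2D version of grid-based shape complementarity scoring.
INTERIOR_A = -15  # assumed penalty; the text only says "large and negative"
SURFACE_A = 1     # weight for cells of A close to its surface

def weight_grid_a(cells_a):
    """Weight each occupied cell of A: core cells get INTERIOR_A, rim cells 1."""
    grid = {}
    for (x, y) in cells_a:
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        is_core = all(n in cells_a for n in neighbours)
        grid[(x, y)] = INTERIOR_A if is_core else SURFACE_A
    return grid

def score(grid_a, cells_b, shift):
    """Sum A's weights over the cells covered by B translated by `shift`.

    Cells outside A contribute 0; B is assumed to weigh 1 on every cell.
    """
    dx, dy = shift
    return sum(grid_a.get((x + dx, y + dy), 0) for (x, y) in cells_b)

# Hypothetical shapes: A is a 3x3 block, B is a two-cell bar.
cells_a = {(x, y) for x in range(3) for y in range(3)}
cells_b = {(0, 0), (0, 1)}
grid_a = weight_grid_a(cells_a)

# Exhaustive scan over a small window of translations.
shifts = [(dx, dy) for dx in range(-2, 4) for dy in range(-2, 4)]
best_score = max(score(grid_a, cells_b, s) for s in shifts)
print(best_score)
```

Translations that push B into the core of A are penalised heavily, while those that overlap only the surface layer score best, which is the behaviour the weighting scheme is designed to produce.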
Article
My thesis shows results for the prediction of protein-RNA interactions with machine learning. An international community named CAPRI (Critical Assessment of PRedicted Interactions) regularly assesses in silico methods for the prediction of the interactions between macromolecules. Using blind predictions within time constraints, protein-protein interactions and, more recently, protein-RNA interaction prediction techniques are assessed. In a first stage, we worked on curated protein-RNA benchmarks, including 120 3D structures extracted from the non-redundant PRIDB (Protein-RNA Interface DataBase). We also tested the protein-RNA prediction method we designed using 40 protein-RNA complexes that were extracted from state-of-the-art benchmarks and are independent of the non-redundant PRIDB complexes. Generating candidates identical to the in vivo solution with only a few 3D structures is an issue we tackled by modelling a candidate generation strategy using RNA structure perturbation in the protein-RNA complex. Such candidates are either near-native candidates, if they are close enough to the solution, or decoys, if they are too far away. We want to discriminate the near-native candidates from the decoys. For the evaluation, we performed an original cross-validation process we called leave-"one-pdb"-out, where there is one fold per protein-RNA complex and each fold contains the candidates generated using one complex. One of the gold standard approaches participating in the CAPRI experiment to date is RosettaDock. RosettaDock was originally optimized for protein-protein complexes. For the learning step of our scoring function, we adapted and used an evolutionary algorithm called ROGER (ROC-based Genetic LearnER) to learn a logistic function. The results show that our scoring function performs much better than the original RosettaDock scoring function. Thus, we extend RosettaDock to the prediction of protein-RNA interactions.
We also evaluated classifier-based and metaclassifier-based approaches, which can lead to new improvements with further investigation. In a second stage, we introduced a new way to evaluate candidates using a multi-scale protocol. A candidate is geometrically represented on an atomic level, the most detailed scale, as well as on a coarse-grained level. The coarse-grained level is based on the construction of a Voronoi diagram over the coarse-grained atoms of the 3D structure. Voronoi diagrams have already successfully modelled coarse-grained interactions for protein-protein complexes in the past. The idea behind the multi-scale protocol is to first find the interaction patch (epitope) between the protein and the RNA before using the time-consuming yet more precise atomic level. We modelled new scoring terms, as well as new scoring functions to evaluate generated candidates. Results are promising. Reducing the number of parameters involved and optimizing the explicit solvent model may improve the coarse-grained level predictions.
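The leave-"one-pdb"-out evaluation described in this abstract, with one fold per complex holding all candidates generated from that complex, can be sketched in Python as follows. This is a minimal illustration under assumed data structures: the tuple layout `(complex_id, score, label)`, the function name, and the identifiers `cplxA`, `cplxB` and `cplxC` are hypothetical and not taken from the thesis.

```python
from collections import defaultdict

def leave_one_complex_out(candidates):
    """Yield (held_out_id, train, test) with exactly one fold per complex.

    `candidates` is a list of (complex_id, score, label) tuples, where label
    is 1 for a near-native candidate and 0 for a decoy (assumed layout).
    """
    by_complex = defaultdict(list)
    for cand in candidates:
        by_complex[cand[0]].append(cand)
    for held_out in sorted(by_complex):
        test = by_complex[held_out]
        # Training data: every candidate generated from any other complex.
        train = [c for cid, group in by_complex.items()
                 if cid != held_out for c in group]
        yield held_out, train, test

# Hypothetical candidates generated from three complexes.
candidates = [
    ("cplxA", 0.9, 1), ("cplxA", 0.2, 0),
    ("cplxB", 0.8, 1),
    ("cplxC", 0.1, 0), ("cplxC", 0.7, 1),
]
folds = list(leave_one_complex_out(candidates))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

Splitting by complex rather than by candidate keeps near-duplicate candidates of the same complex from appearing on both sides of a fold.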
... Among all surface complementarity methods, the one using the fast Fourier transform (FFT), which appeared as early as 1991 [110], is one of the simplest and most widely used [11,33,39,40,117,124,153,206,229]. A cubic grid is drawn and, at each point, a weight is assigned that is large and negative if the point lies inside protein A, zero if it lies outside, and 1 if it is close to the surface; the same is done for protein B. ...
Article
The function of a protein often depends on its interaction with one or many partners. Yet studying the three-dimensional structure of these complexes, which cannot be done experimentally, would allow many cellular processes to be understood. This work contains two parts. The first part concerns the setting up of a scoring function for protein-protein docking and the second part concerns the crystallographic structure study of a tetrameric protein: the Paramecium Bursaria Chlorella Virus thymidylate synthase X, a potential antibacterial target. Docking of protein-protein complexes consists of two successive steps: first a large number of putative conformations are generated, then a scoring function is applied to rank them. This scoring function has to take into account both the geometric complementarity of the two molecules and the physico-chemical properties of the interacting surfaces. We addressed the second step of this problem through the development of a quick and reliable scoring function. This was done using Voronoi tessellation of the three-dimensional structure of the proteins. Voronoi or Laguerre tessellations have been shown to be good mathematical models of protein structure. In particular, this formalization leads to a good description of the structural properties of the residues. This modeling illustrates the packing of the residues at the interface between two proteins. Thus, it is possible to measure a set of parameters on protein-protein complexes whose structure is known, and on decoys. These parameters are frequencies of residues and pair frequencies of the residues at the interface, volumes of Voronoi cells, distances between residues at the interface, interface area and number of residues at the interface. They were used as input in statistical machine learning procedures (logistic learning, support vector machines (SVM) and genetic algorithms). These led to efficient scoring functions, able to separate native structures from decoys.
In the second part, I describe the experimental determination of the three-dimensional structure of thymidylate synthase X, an interesting antibacterial target. Thymidylate synthase X is a recently discovered flavoprotein. It plays a key role in the synthesis of dTMP in most prokaryotic organisms, but does not exist in higher eukaryotic organisms. This protein catalyses the methyl transfer from tetrahydrofolate to dUMP using FAD as a cofactor and NADPH as substrate. The three-dimensional structure of the ThyX homotetramer with its cofactor, FAD, was solved at 2.4 Å by molecular replacement. As shown in the Thermotoga maritima and Mycobacterium tuberculosis ThyX structures, the monomer contains a core of β sheets and two α helices at its extremity. The active site is at the interface between three monomers, the isoalloxazine part of FAD being accessible to the solvent and close to a long flexible loop. FAD binding in this structure differs slightly from that already observed, especially for its adenine part. This structure, in association with directed mutagenesis experiments carried out by our collaborators, revealed residues playing a key role during catalysis.
... Besides, the positioning of the MBR peptide in the appropriate orientation and direction at the site of the interaction requires some clue about the residues from both partners, which are likely to mediate the interaction. The docked complex model of the MBR peptide with the full-length binding partner can be obtained using software such as 3D-Garden, 38 3D-Dock Suite, 39 and Cluspro 2.0. 40 Previously, studies have used the program Auto Dock 41 to identify the binding sites of MBR peptides on a protein, with or without prior knowledge of their site of interaction. ...
Article
Several key biological events adopt a “hit-and-run” strategy in their transient interactions between binding partners. In some instances, the disordered nature of one of the binding partners severely hampers the success of co-crystallization, often leading to the crystallization of just one of the partners. Here, we discuss a method to trap weak and transient protein interactions for crystallization. This approach requires the structural details of at least one of the interacting partners and binding knowledge to dock the known minimum binding region (peptide) of the protein onto the other using an optimal-sized linker. Prior to crystallization, the purified linked construct should be verified for its intact folding and stability. Following structure determination, structure-guided functional studies are performed with independent, full-length unlinked proteins to validate the findings of the linked complex. We designed this approach and then validated its efficacy using a 24 amino acid minimum binding region of the intrinsically disordered, neuron-specific substrates, Neurogranin and Neuromodulin, joined via a Gly-linker to their interacting partner, Calmodulin. Moreover, the reported functional studies with independent full-length proteins confirmed the findings of the linked peptide complexes. Based on our studies, and in combination with the supporting literature, we suggest that optimized linkers can provide an environment to mimic the natural interactions between binding partners, and offer a useful strategy for structural studies to trap weak and transient interactions involved in several biological processes.
... The next stage is the analysis of the energetics of potential interactions and the mutual dynamic fitting of the contact surfaces. Among the programs designed for macromolecular docking, 3D-Dock (FTDock) [112], HEX [113], BIGGER [114], GRAMM [115], ZDOCK [116] are widely used ...
Article
Structural bioinformatics is a novel branch of biology which uses the computational methods of analysis with the aim of modeling the 3D structures of proteins and macromolecular complexes. The goal of structural bioinformatics research is also the creation of novel modulators of functional activities of proteins, such as novel drugs (drug design). Progress in structural bioinformatics is determined by the rapid growth in the deciphering of several hundreds of genomes of both prokaryotic and eukaryotic organisms and transition to post-genomic era. In this review some aspects of structural bioinformatics such as 3D structure modeling, proteomics and interactomics, bioinformatics in transcriptome analysis, molecular dynamics of proteins, protein-ligand interactions modeling and computer aided drug design are discussed. The development and applications of these methods for mammalian tyrosyl-tRNA synthetase and viral HIV-protease are discussed. The first Ukrainian web-portal BioUA devoted to the genomics and structural bioinformatics investigations is described in this review.
... Structural studies on protein-protein interaction are the next challenge in structural/functional proteomics. Computer simulation is a powerful tool for predicting and analyzing protein–protein interactions [5,6]. Generally, the first stage of the prediction is to search the docking interface between proteins, where shapes of molecules are transformed to voxel grids, and virtual complexes are generated in silico. ...
Article
Direct docking of nanomolecules, such as proteins, is responsible for biological signal transduction in cells. This physiological interaction is mimicked by various biosensors in nanotechnology. In many cases phosphorylation of a protein is involved in protein–protein interaction, and understanding phosphorylation-dependent interaction is necessary to design novel biosensors. Here, we developed and tested a specific method for studying the interaction of phospho-proteins in silico. The algorithm, named phospho-pivot modeling, consists of two parts: the first generates a library of virtual complexes by pivoting the phospho-ligand at the docking site on the receptor, and the second grades them according to the probability of atomic proximity between the two molecules. After a 90-min computation on a personal computer, the phospho-pivot modeling yielded an in silico model for the complex of Ser/Thr phosphatase-1 (PP1) and calyculin A, an inhibitory compound of PP1, which was superimposed on the crystal structure in the database with an r.m.s.d. of 0.23 Å. The phospho-pivot modeling was applied to the prediction of the complex of PP1 and phospho-CPI-17, an inhibitory protein, whose complex structure is unknown. A 1285-min computation selected one converged structure of the PP1·CPI-17 complex out of 186,624 models. The computation time was reduced to 400 min by adding a prescreening process, where virtual complexes with conflicts between main chains were dismissed from the grading process. Thus, the phospho-pivot modeling algorithm is sufficient to predict the complex structure of proteins whose monomeric structures have been solved.
... created with a local random translation (~8 Å) and rotation (~8°) and a spin around the axis of centers of the two proteins (0–360°). Local docking is useful when epitope information is known, as is common in many antibody applications [30,31,32]. Figure 2 shows plots that summarize local docking runs for antibody 11k2 which binds human monocyte chemoattractant protein (MCP)-1 [Protein Data Bank (PDB) ID code 2BDN [33]]. Due to difficulties in accurately capturing the energetic differences from small backbone changes in flexible backbone docking, the interface score (intermolecular energy) provides the best discrimination of structures [24,25]. ...
Article
High resolution structures of antibody-antigen complexes are useful for analyzing the binding interface and to make rational choices for antibody engineering. When a crystallographic structure of a complex is unavailable, the structure must be predicted using computational tools. In this work, we illustrate a novel approach, named SnugDock, to predict high-resolution antibody-antigen complex structures by simultaneously structurally optimizing the antibody-antigen rigid-body positions, the relative orientation of the antibody light and heavy chains, and the conformations of the six complementarity determining region loops. This approach is especially useful when the crystal structure of the antibody is not available, requiring allowances for inaccuracies in an antibody homology model which would otherwise frustrate rigid-backbone docking predictions. Local docking using SnugDock with the lowest-energy RosettaAntibody homology model produced more accurate predictions than standard rigid-body docking. SnugDock can be combined with ensemble docking to mimic conformer selection and induced fit resulting in increased sampling of diverse antibody conformations. The combined algorithm produced four medium (Critical Assessment of PRediction of Interactions-CAPRI rating) and seven acceptable lowest-interface-energy predictions in a test set of fifteen complexes. Structural analysis shows that diverse paratope conformations are sampled, but docked paratope backbones are not necessarily closer to the crystal structure conformations than the starting homology models. The accuracy of SnugDock predictions suggests a new genre of general docking algorithms with flexible binding interfaces targeted towards making homology models useful for further high-resolution predictions.
... Best results were obtained by the program 3D-DOCK (Smith and Sternberg, 2003) (http://www.bmm.icnet.uk/). This program is composed of four routines: The first one is ftdock: Here a global scan of the translational and rotational space of possible positions of the two molecules, limited by surface complementarity (SCscore) and electrostatic filtering is done. ...
Article
In this century new experimental and computational techniques are adding an enormous amount of information, revealing many biological mysteries. The complexities of biological systems still raise new questions. Until now the main approach to understanding a system has been to divide it into components that can be studied. The upcoming new paradigm is to combine the pieces of information in order to understand the system at a global level. In the present thesis we have tried to study infectious diseases with such a global 'Systems Biology' approach. In the first part the apoptosis pathway is analyzed. Apoptosis (programmed cell death) is used as a countermeasure in different infections, for example viral infections. The interactions between death-domain-containing proteins are studied to address the following questions: i) how specificity is maintained, showing that it is achieved through adaptors; ii) how proliferation/survival signals are induced during activation of apoptosis, suggesting a pivotal role for RIP (receptor-interacting serine/threonine-protein kinase 1).
The model also allowed us to detect new possible interacting surfaces. The pathway is then studied at a global level in a time-step simulation to understand the evolution of the topology of activators and inhibitors of the pathway. Signal processing is further modeled in detail for the apoptosis pathway in M. musculus to predict the concentration time course of effector caspases. Further, experimental measurements of caspase-3 and viability of cells validate the model. The second part focuses on the phagosome, an organelle which plays an essential role in the removal of pathogens, as exemplified by M. tuberculosis. Again the problem is addressed in two main sections: i) To understand the processes that are inhibited by M. tuberculosis, we focused on the phospholipid network, which plays an important role in inhibition or activation of actin polymerization on the phagosome membrane, applying a time-step simulation as in part one. ii) Furthermore, actin polymers are suggested to play a role in the fusion of the phagosome with the lysosome. To check this hypothesis an in silico model was developed; we find that the search time is reduced fivefold in the presence of actin polymers. Further, the effects of the length of actin polymers, the dimensions of the lysosome and phagosome, and other model parameters are analyzed. After studying a pathway and then an organelle, the next step was to move to the system. This was exemplified by the host-pathogen interactions of Bordetella pertussis and Bordetella bronchiseptica. The limited availability of quantitative information was the crucial factor behind the choice of the model type. A Boolean model was developed and used for a dynamic simulation. The results predict important factors playing a role in Bordetella pathology, especially the importance of Th1-related responses, and not Th2-related responses, in the clearance of the pathogen.
Some of the quantitative predictions have been cross-checked against experimental results such as the time course of infection in different mutants and wild-type mice. All these computational models have been developed in the presence of limited kinetic data. The success of these models has been validated by comparison with experimental observations. The comparative models studied in chapters 6 and 9 can be used to explore new host-pathogen interactions. For example, in chapter 6 the analysis of inhibitors and inhibitory paths in three organisms leads to the identification of regulatory hotspots in complex organisms, and in chapter 9 the identification of three phases in B. bronchiseptica and of the inhibition of IFN-γ by TTSS leads us to explore similar phases and inhibition of IFN-γ in B. pertussis. A further important use of these models is to identify new components playing an essential role in host-pathogen interactions. In silico deletions can point out such components, which can be further analyzed by experimental mutations.
... The first mining technique was to rescore each complex using the software Rpdock – a member of the 3D-Dock suite [33] and reject those below a threshold. Rpdock uses evidence gathered empirically to quantify the probability of a complex's existence and returns a score (RPScore) based on the results. ...
Article
The increasing number of protein sequences and 3D structures obtained from genomic initiatives is leading many of us to focus on proteomics, and to dedicate our experimental and computational efforts to the creation and analysis of information derived from 3D structure. In particular, the high-throughput generation of protein-protein interaction data from a few organisms makes such an approach very important for understanding the molecular recognition that makes up the entire protein-protein interaction network. Since the number of sequences and experimentally determined protein-protein interactions increases faster than the 3D structure determination of protein complexes, there is tremendous interest in developing in silico methods that generate such structures for prediction and classification purposes. In this study we focused on classifying protein family members based on their protein-protein interaction distinctiveness. Structure-based classification of protein-protein interfaces was first described by Ponstingl et al. 1 and more recently by Valdar et al. 2 and Mintseris et al. 3, from complex structures that have been solved experimentally. However, little has been done on protein classification based on the prediction of protein-protein complexes obtained from homology modeling and docking simulation. We have developed an in silico classification system entitled HODOCO (Homology modeling, Docking and Classification Oracle), in which protein Residue Potential Interaction Profiles (RPIPS) are used to summarize protein-protein interaction characteristics. This system, applied to a dataset of 64 proteins of the death domain superfamily, was used to classify each member into its proper subfamily. Two classification methods were attempted: a heuristic and support vector machine learning. Both methods were tested with a 5-fold cross-validation. The heuristic approach yielded a 61% average accuracy, while the machine learning approach yielded an 89% average accuracy.
We have confirmed the reliability and potential value of classifying proteins via their predicted interactions. Our results are in the same range of accuracy as other studies that classify protein-protein interactions from experimentally obtained 3D complex structures. While our classification scheme does not directly take sequence information into account, our results are in agreement with functional and sequence-based classification of death domain family members.
Chapter
The binding site of a protein governs its function by allowing binding of small molecules and macromolecules such as nucleic acids, proteins, and other molecules. These binding molecules, also known as ligands, generally form non-covalent bonds, interact transiently, and dissociate after performing a function. Binding sites are unique and are shape-complementary to their ligands, which maintains specificity and affinity. For example, molecules such as hormones, activators, inhibitors, neurotransmitters, and toxins have specificity in their binding sites. A ligand-binding site carries extensive information about its biological function, such as the geometry, physicochemical properties, and electrostatic charge, which in turn allows binding of the highly specific ligand. Various experimental methods such as X-ray crystallography, mass spectrometry, nuclear magnetic resonance, and isothermal titration calorimetry are used to determine the binding sites of proteins. For drug discovery, high-throughput screening of protein binding sites is unavoidable, and computational methods offer an efficient and cost-effective way of doing this. Several algorithms, tools, and software packages are available to detect protein cavities computationally. The study of binding sites is relevant to various fields of research, including computer-aided drug design, agrochemical design, cancer mechanisms, drug formulation, and physiological regulation.
Article
Based on the structure of the HIV integrase core domain, dipeptide derivatives, as a type of HIV integrase inhibitor, were synthesized, and their fragmentation pathways were investigated by electrospray ionization mass spectrometry (ESI-MSn) in conjunction with tandem mass spectrometry (MS/MS). In order to better understand the fragmentation pathways, the MS2 and MS3 spectra of the title compound were obtained. The main fragmentation pathways occur by the cleavage of the C–CO bonds between N-(benzothiazol-2-yl)aminocarbonyl and methylene, NH–CO bonds between the NH groups and carbonyl groups. Electrospray ionization was proven to be a good method for the structural characterization and identification of this kind of compound.
Chapter
In recent years, biological research is increasingly being based on large-scale experimentation that collects data on an organismic scale. These data are voluminous and often very noisy. Not just their interpretation but also the configuration of the involved experiments necessitates complex computer analysis. The respective computer methods are themselves an object of intensive research in a scientific discipline called computational biology or bioinformatics. Computational biology has a wide variety of facets ranging from experiment configuration and low-level data analysis to hypothesis generation by computer.
Chapter
Protein modeling, which consists of a broad range of computational techniques to understand the properties of proteins, has become an integral part of structural biology and drug design. Modeling can be used to predict the secondary structure or folding of a protein based on its sequence alone, to predict the three-dimensional (3-D) structure of a protein based on knowledge of the structure of a related protein, to design new proteins, and also to predict properties that depend on the experimentally determined 3-D structure of a protein. Examples of such properties include drug binding, protein–protein interactions, and interactions with elements in a protein's environment, including ions, lipids, carbohydrates, and nucleic acids. Conformational changes in proteins can be investigated by using molecular dynamics simulations to provide detailed insights into the dynamics of proteins, a crucial aspect of protein function. In recent developments, quantum mechanical calculations have been used much more often to study reactions in proteins. With the ever-rising power of computers, increasingly detailed aspects of protein function can now be investigated by using modeling methods, at a scale and level of detail that is often very difficult or impossible to achieve by an experimental approach. In this chapter, the main principles and techniques involved in protein modeling are introduced. Some reported examples will also be provided to highlight how protein modeling can be used in complementary fashion with other methods.
Article
Death domain (DD)-containing proteins are involved in both apoptosis and survival/proliferation signaling induced by activated death receptors. Here, a phylogenetic and structural analysis was performed to highlight differences in DD domains and their key regulatory interaction sites. The phylogenetic analysis shows that receptor DDs are more conserved than DDs in adaptors. Adaptor DDs can be subdivided into those that activate or inhibit apoptosis. Modeling of six homotypic DD interactions involved in the TNF signaling pathway implies that the DD of RIP (Receptor interacting protein kinase 1) is capable of interacting with the DD of TRADD (TNFR1-associated death domain protein) in two different, exclusive ways: one that subsequently recruits CRADD (apoptosis/inflammation) and another that recruits NF-κB (survival/proliferation).
Article
Protein-protein binding is one of the critical events in biology, and knowledge of the three-dimensional structures of protein complexes is of fundamental importance for the biochemical study of pharmacologic compounds. The past two decades have seen the emergence of a large variety of algorithms designed to predict the structures of protein-protein complexes, a procedure named docking. Computational methods, if accurate and reliable, could play an important role, both to infer functional properties and to guide new experiments. Despite the outstanding progress of the methodologies developed in this area, a few problems still prevent protein-protein docking from becoming a widespread practice in the structural study of proteins. In this review we focus on the principles that govern docking, namely the algorithms used for searching and scoring, which are usually referred to as the docking problem. We also focus on the use of a flexible description of the proteins under study and the use of biological information such as the localization of hot spots, the residues important for protein-protein binding. The most common docking software packages are also described.
Article
An important goal after structural genomics is to build up the structures of higher-order protein-protein complexes from structures of the individual subunits. Often structures of higher order complexes are difficult to obtain by crystallography. We have used an alternative approach in which the structures of the individual catalytic (C) subunit and RIalpha regulatory (R) subunit of PKA were first subjected to computational docking, and the top 100,000 solutions were subsequently filtered based on amide hydrogen/deuterium (H/2H) exchange interface protection data. The resulting set of filtered solutions forms an ensemble of structures in which, besides the inhibitor peptide binding site, a flat interface between the C-terminal lobe of the C-subunit and the A- and B-helices of RIalpha is uniquely identified. This holoenzyme structure satisfies all previous experimental data on the complex and allows prediction of new contacts between the two subunits.
Article
Acetyl-CoA carboxylase (ACC) and propionyl-CoA carboxylase (PCC) catalyze the carboxylation of acetyl- and propionyl-CoA to generate malonyl- and methylmalonyl-CoA, respectively. Understanding the substrate specificity of ACC and PCC will (1) help in the development of novel structure-based inhibitors that are potential therapeutics against obesity, cancer, and infectious disease and (2) facilitate bioengineering to provide novel extender units for polyketide biosynthesis. ACC and PCC in Streptomyces coelicolor are multisubunit complexes. The core catalytic β-subunits, PccB and AccB, are 360 kDa homohexamers, catalyzing the transcarboxylation between biotin and acyl-CoAs. Apo and substrate-bound crystal structures of PccB hexamers were determined to 2.0-2.8 Å. The hexamer assembly forms a ring-shaped complex. The hydrophobic, highly conserved biotin-binding pocket was identified for the first time. Biotin and propionyl-CoA bind perpendicular to each other in the active site, where two oxyanion holes were identified. N1 of biotin is proposed to be the active site base. Structure-based mutagenesis at a single residue of PccB and AccB allowed interconversion of the substrate specificity of ACC and PCC. The di-domain, dimeric interaction is crucial for enzyme catalysis, stability, and substrate specificity; these features are also highly conserved among biotin-dependent carboxyltransferases. Our findings enable bioengineering of the acyl-CoA carboxylase (ACCase) substrate specificity to provide novel extender units for the combinatorial biosynthesis of polyketides.
Article
Aromatic polyketides are a class of natural products that include many pharmaceutically important aromatic compounds. Understanding the structure and function of PKS will provide clues to the molecular basis of polyketide biosynthesis specificity. Polyketide chain reduction by ketoreductase (KR) provides regio- and stereochemical diversity. Two cocrystal structures of actinorhodin polyketide ketoreductase (act KR) were solved to 2.3 Å with either the cofactor NADP+ or NADPH bound. The monomer fold is a highly conserved Rossmann fold. Subtle differences between structures of act KR and fatty acid KRs fine-tune the tetramer interface and substrate binding pocket. Comparisons of the NADP+- and NADPH-bound structures indicate that the α6-α7 loop region is highly flexible. The intricate proton-relay network in the active site leads to the proposed catalytic mechanism involving four waters, NADPH, and the active site tetrad Asn114-Ser144-Tyr157-Lys161. Acyl carrier protein and substrate docking models shed light on the molecular basis of KR regio- and stereoselectivity, as well as the differences between aromatic polyketide and fatty acid biosyntheses. Sequence comparison indicates that the above features are highly conserved among aromatic polyketide KRs. The structures of act KR provide an important step toward understanding aromatic PKS and will enhance our ability to design novel aromatic polyketide natural products with different reduction patterns.
Article
In this work, intermolecular distance was integrated into the docking of protein–protein complexes. To develop an efficient docking procedure, 22 enzyme–inhibitor targets and 15 antibody–antigen targets were taken from a benchmark set. A three-step approach was adopted, which included global sampling by FTDOCK, filtering by intermolecular distance and ranking by a composite scoring function. For the enzyme–inhibitor targets, the composite scoring function consists of geometry and energy terms. In the set composed of the ∼100 highest ranked candidates for each target, correct complexes were identified for all of the 22 enzyme–inhibitor targets. This docking strategy also succeeded on the four test targets, of which three are CAPRI targets with the same receptor but different binding modes. Interestingly, all three binding modes were correctly predicted. For the antibody–antigen targets, CDR and physical energy were also used in the filtering process and informatics terms were added to the scoring function. The composite score had successful prediction for 13 of the 15 antibody–antigen targets.
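The filter-then-rank part of the three-step approach can be sketched as follows. This is only an illustration: the field names, the distance cutoff, the weights and the sign convention of the composite score are hypothetical and are not taken from the paper, and the global sampling step (FTDOCK) is represented by a hand-written list of candidate poses.

```python
def filter_and_rank(candidates, max_distance, w_geometry=1.0, w_energy=1.0, top_n=100):
    """Keep poses that satisfy a distance restraint, then rank them.

    Each candidate is a dict with hypothetical keys:
      "distance": intermolecular distance for the pose (Angstrom)
      "geometry": shape-complementarity score (higher is better)
      "energy":   interaction energy (lower is better)
    """
    # Step 2 of the approach: filtering by intermolecular distance.
    kept = [c for c in candidates if c["distance"] <= max_distance]

    # Step 3: composite score from a geometry term and an energy term.
    # The linear combination used here is an assumption for illustration.
    def composite(c):
        return w_geometry * c["geometry"] - w_energy * c["energy"]

    return sorted(kept, key=composite, reverse=True)[:top_n]


# Stand-in for the output of the global sampling step.
poses = [
    {"id": "p1", "distance": 6.0, "geometry": 80.0, "energy": -10.0},
    {"id": "p2", "distance": 25.0, "geometry": 120.0, "energy": -30.0},
    {"id": "p3", "distance": 8.0, "geometry": 95.0, "energy": -2.0},
]
ranked = filter_and_rank(poses, max_distance=10.0)
ranked_ids = [c["id"] for c in ranked]
```

Here pose p2 has the best raw scores but is discarded by the distance filter before ranking, which is the point of filtering ahead of scoring.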
Article
Interaction free energies are crucial for analyzing binding propensities in proteins. Although the problem of computing binding free energies remains open, approximate estimates have become very useful for filtering potential binding complexes. We report on the implementation of a fast computational estimate of the binding free energy based on a statistically determined desolvation contact potential and Coulomb electrostatics with a distance-dependent dielectric constant, and validated in the Critical Assessment of PRotein Interactions experiment. The application also reports residue contact free energies that rapidly highlight the hotspots of the interaction. Availability: The program was written in Fortran. The executable and full documentation are freely available at http://structure.pitt.edu/software/FastContact
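The electrostatic term described here can be illustrated with a short sketch. The abstract does not give the functional form of the dielectric, so the linear form ε(r) = 4r used below is an assumption, as are the function names and the minimum-distance clamp. The desolvation contact potential is omitted because its parameters are not given. The constant 332.0637 converts e²/Å to kcal/mol.

```python
import math

COULOMB_K = 332.0637  # kcal*Angstrom/(mol*e^2), standard conversion constant


def dielectric(r, slope=4.0):
    """Distance-dependent dielectric; the linear form slope*r is an assumption."""
    return slope * r


def electrostatic_energy(receptor, ligand, slope=4.0, min_dist=1.0):
    """Sum pairwise Coulomb terms between two sets of point charges.

    receptor, ligand: lists of ((x, y, z), charge) tuples, coordinates in Angstrom.
    min_dist clamps very short distances to avoid a singularity (illustrative).
    """
    total = 0.0
    for pos_i, q_i in receptor:
        for pos_j, q_j in ligand:
            r = max(math.dist(pos_i, pos_j), min_dist)
            total += COULOMB_K * q_i * q_j / (dielectric(r, slope) * r)
    return total


# One positive and one negative unit charge, 4 Angstrom apart.
receptor = [((0.0, 0.0, 0.0), +1.0)]
ligand = [((4.0, 0.0, 0.0), -1.0)]
energy = electrostatic_energy(receptor, ligand)
```

With a dielectric proportional to r, each pair contributes an energy that falls off as 1/r², so distant charges are damped more strongly than with a constant dielectric.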
Article
Electron transfer reactions are crucial for respiration and denitrification. In this article, we analyze the interaction of nitrous oxide reductase with its electron donors cytochrome c550 and pseudoazurin. Our docking protocol comprises generation of candidate complexes followed by a selection step based on the distance of the donor and acceptor groups in each partner protein. Finally, the structures of the candidate complexes were optimized using a force field calculation, together with a second distance filtering step. The prediction power of this protocol was studied using the crystal structure of the cytochrome c2/photosynthetic reaction center of Rhodobacter sphaeroides as a reference. The results suggest that both cytochrome c550 and pseudoazurin bind at the same hydrophobic surface patch residing near the CuA center of nitrous oxide reductase. The central, well-conserved interaction surface of the donors is hydrophobic, but it is surrounded by numerous lysine side-chains, which interact electrostatically with analogously positioned side-chain carboxylates of the acceptor. The prediction output is an ensemble of energetically similar structures that are rotationally related to each other. While such an ensemble may reflect incomplete prediction power of the docking protocol, it may also manifest a biological situation where there are multiple ways of forming a productive electron transfer complex. Analyses of the predicted structures and the conservation pattern of the amino acid residues suggest the existence of specific electron transfer pathways to and from the CuA center of nitrous oxide reductase.
Article
Phospho-amino acids in proteins are directly associated with phospho-receptor proteins, including protein phosphatases. Here we produced and tested a scheme for docking together interacting phospho-proteins whose monomeric 3D structures were known. The phosphate of calyculin A, an inhibitor of protein phosphatase-1 and 2A (PP1 and PP2A), or phospho-CPI-17, a PP1-specific inhibitor protein, was docked at the active site of PP1. First, a library of 186,624 virtual complexes was generated in silico by pivoting the phospho-ligand about the phosphorus atom in 5-degree steps on three rotational axes. These models were then graded for probability according to atomic proximity between the two molecules. The predicted structure of the PP1·calyculin A complex fitted the crystal structure with an r.m.s.d. of 0.23 Å, providing a validation test of the modeling method. Modeling of the PP1·phospho-CPI-17 complex yielded one converged structure. The segment of CPI-17 around phospho-Thr38 is predicted to fit in the active site of PP1. Positive charges at Arg33/36 of CPI-17 are in close proximity to Glu274 of PP1, where the sequence is unique among Ser/Thr phosphatases. Single mutations of these residues in PP1 reduced the affinity for phospho-CPI-17. Thus, the interface of the PP1·CPI-17 complex predicted by the phospho-pivot modeling accounts for the specificity of CPI-17 for PP1.
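The stated library size is consistent with a 5-degree scan in which two of the rotation angles cover a full turn and the third covers a half turn (72 × 36 × 72 = 186,624). The abstract gives only the step size and the total, so the angle ranges below are an assumption, and the function name is hypothetical.

```python
from itertools import product

STEP_DEG = 5  # step size stated in the abstract


def pivot_orientations(step=STEP_DEG):
    """Yield (phi, theta, psi) angle triples in degrees.

    Assumption: phi and psi span [0, 360) and theta spans [0, 180);
    the abstract does not state the ranges used.
    """
    full_turn = range(0, 360, step)   # 72 values at a 5-degree step
    half_turn = range(0, 180, step)   # 36 values at a 5-degree step
    return product(full_turn, half_turn, full_turn)


n_orientations = sum(1 for _ in pivot_orientations())
print(n_orientations)  # 186624
```

Each triple would define one orientation of the ligand about the fixed phosphorus atom; building and scoring the corresponding complexes is outside the scope of this sketch.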
Article
An efficient biologically enhanced sampling geometric docking method is presented based on the FTDock algorithm to predict the protein-protein binding modes. The active site data from different sources, such as biochemical and biophysical experiments or theoretical analyses of sequence data, can be incorporated in the rotation-translation scan. When discretizing a protein onto a 3-dimensional (3D) grid, a zero value is given to grid points outside a sphere centered on the geometric center of specified residues. In this way, docking solutions are biased toward modes where the interface region is inside the sphere. We also adopt a multiconformational superposition scheme to represent backbone flexibility in the proteins. When these procedures were applied to the targets of CAPRI, a larger number of hits and smaller ligand root-mean-square deviations (RMSDs) were obtained at the conformational search stage in all cases, and especially Target 19. With Target 18, only 1 near-native structure was retained by the biologically enhanced sampling geometric docking method, but this number increased to 53 and the least ligand RMSD decreased from 8.1 Å to 2.9 Å after performing multiconformational superposition. These results were obtained after the CAPRI prediction deadlines.
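The masking step can be sketched in a few lines. This is a minimal illustration, assuming a regular cubic grid stored as nested lists; the function names, grid layout, radius and coordinates are hypothetical and do not reproduce the FTDock data structures.

```python
import math


def geometric_center(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def mask_outside_sphere(grid, origin, spacing, center, radius):
    """Return a copy of grid with nodes outside the sphere set to zero.

    grid[i][j][k] is the value at position origin + spacing * (i, j, k).
    """
    masked = []
    for i, plane in enumerate(grid):
        new_plane = []
        for j, row in enumerate(plane):
            new_row = []
            for k, value in enumerate(row):
                pos = (origin[0] + i * spacing,
                       origin[1] + j * spacing,
                       origin[2] + k * spacing)
                inside = math.dist(pos, center) <= radius
                new_row.append(value if inside else 0)
            new_plane.append(new_row)
        masked.append(new_plane)
    return masked


# Toy 5x5x5 grid filled with ones, 1 Angstrom spacing.
grid = [[[1 for _ in range(5)] for _ in range(5)] for _ in range(5)]
# Coordinates standing in for the specified interface residues.
residues = [(1.0, 2.0, 2.0), (3.0, 2.0, 2.0)]
center = geometric_center(residues)
masked = mask_outside_sphere(grid, (0.0, 0.0, 0.0), 1.0, center, radius=1.0)
remaining = sum(v for plane in masked for row in plane for v in row)
```

Only the centre node and its six nearest neighbours survive in this toy case, so any correlation score computed from the masked grid can only come from the region around the specified residues.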
Article
In this work we present two methods for the reranking of protein-protein docking studies. One scoring method searches the InterDom database for domains that are available in the proteins to be docked and evaluates the interaction of these domains in other complexes of known structure. The second one analyzes the interface of each proposed conformation with regard to the conservation of Phe, Met, and Trp and their polar neighbor residues. The special relevance of these residues is based on a publication by Ma et al. (Proc Natl Acad Sci USA 2003;100:5772-5777), who compared the conservation of all residues in the interface region to the conservation on the rest of the protein's surface. The scoring functions were tested on 30 unbound docking test cases. The evaluation of the methods is based on the ability to rerank the output of a Fast Fourier Transformation (FFT) docking. Both were able to improve the ranking of the docking output. The best improvement was achieved for enzyme-inhibitor examples. Especially the domain-based scoring function was successful and able to place a near-native solution on one of the first six ranks for 13 of 17 (76%) enzyme-inhibitor complexes [in 53% (nine complexes) even on the first rank]. The method evaluating residue conservation allowed us to increase the number of good solutions within the first 100 ranks out of approximately 9000 in 82% of the 17 enzyme-inhibitor test cases, and for seven (41%) out of 17 enzyme-inhibitor complexes, a near native solution was placed within the first seven ranks.
Article
Methanol dehydrogenase (Hd-MDH) and its physiological electron acceptor, cytochrome c(L) (Hd-Cyt c(L)), isolated from a methylotrophic denitrifying bacterium, Hyphomicrobium denitrificans A3151, have been kinetically and structurally characterized; the X-ray structures of Hd-MDH and Hd-Cyt c(L) have been determined using molecular replacement at 2.5 and 2.0 Å resolution, respectively. To explain the mechanism for electron transfer between these proteins, the dependence of MDH activity on the concentration of Hd-Cyt c(L) has been investigated at pH 4.5-7.0. The Michaelis constant for Hd-Cyt c(L) shows the smallest value (approximately 0.3 µM) at pH 5.5. The pseudo-first-order rate constant (k(obs)) of the reduction of Hd-Cyt c(L) exhibits a hyperbolic concentration dependence of Hd-MDH at pH 5.5, although k(obs) linearly increases at pH 6.5. These findings indicate formation of a transient complex between these proteins during an electron transfer event. Hd-MDH (148 kDa) is a large tetrameric protein with an α2β2 subunit composition, showing a high degree of structural similarity with other MDHs. Hd-Cyt c(L) (19 kDa) exhibiting the α-band at 550.7 nm has a unique C-terminal region involving a disulfide bond between Cys47 and Cys165. Moreover, there is a pair of Hd-Cyt c(L) monomers related with a pseudo-2-fold axis of symmetry in the asymmetric unit, and the two monomers tightly interact with each other through three hydrogen bonds. This configuration is the first example in the studies of cytochrome c as the physiological electron acceptor for MDH. The docking simulation between the coupled Hd-Cyt c(L) molecules and the heterotetrameric Hd-MDH molecule has been carried out.
Article
The type-1 insulin-like growth factor receptor (IGF-1R) is the cognate tyrosine kinase receptor for the insulin-like growth factor IGF-I and is expressed widely in many foetal and postnatal tissue cells. IGF-1R is overexpressed in a number of human tumour types and is a valid target for anti-cancer therapeutic efforts. Designing antagonists for IGF-1R would be greatly facilitated by the availability of structural information on the complex between IGF-I and IGF-1R. In the present work we model the three-dimensional structure of the complex between IGF-I and the first three domains of IGF-1R using a macromolecular docking method guided by selected experimental data. Interface metrics indicative of the binding affinity and reliability of the model are computed and compared with other biomolecular complexes. This model is consistent with experimental chimerical and mutagenesis data, provides a structural basis for understanding the primary interaction of IGF-I with its receptor and facilitates design of antagonist ligands.
Article
The extracellular module of SPARC/osteonectin binds to vascular endothelial growth factor (VEGF) and inhibits VEGF-stimulated proliferation of endothelial cells. In an attempt to identify the binding site for SPARC on VEGF, we hypothesized that this binding site could overlap at least partially the binding site of VEGF receptor 1 (VEGFR-1), as SPARC acts by preventing VEGF-induced phosphorylation of VEGFR-1. To this end, a docking simulation was carried out using a predictive docking tool to obtain modeled structures of the VEGF-SPARC complex. The predicted structure of VEGF-SPARC complex indicates that the extracellular domain of SPARC interacts with the VEGFR-1 binding site of VEGF, and is consistent with known biochemical data. Following molecular dynamics refinement, side-chain interactions at the protein interface were identified that were predicted to contribute substantially to the free energy of binding. These provide a detailed prediction of key amino acid side-chain interactions at the protein-protein interface. To validate the model further, the identified interactions will be used for designing mutagenesis studies to investigate their effect on binding activity. This model of the VEGF-SPARC complex should provide a basis for future studies aimed at identifying inhibitors of VEGF-induced angiogenesis.
Article
The structural protein VP6 of rotavirus, an important pathogen responsible for severe gastroenteritis in children, forms the middle layer in the triple-layered viral capsid. Here we present the crystal structure of VP6 determined to 2 Å resolution and describe its interactions with other capsid proteins by fitting the atomic model into electron cryomicroscopic reconstructions of viral particles. VP6, which forms a tight trimer, has two distinct domains: a distal β-barrel domain and a proximal α-helical domain, which interact with the outer and inner layer of the virion, respectively. The overall fold is similar to that of protein VP7 from bluetongue virus, with the subunits wrapping about a central 3-fold axis. A distinguishing feature of the VP6 trimer is a central Zn2+ ion located on the 3-fold molecular axis. The crude atomic model of the middle layer derived from the fit shows that quasi-equivalence is only partially obeyed by VP6 in the T = 13 middle layer and suggests a model for the assembly of the 260 VP6 trimers onto the T = 1 viral inner layer.
Article
A geometric recognition algorithm was developed to identify molecular surface complementarity. It is based on a purely geometric approach and takes advantage of techniques applied in the field of pattern recognition. The algorithm involves an automated procedure including (i) a digital representation of the molecules (derived from atomic coordinates) by three-dimensional discrete functions that distinguishes between the surface and the interior; (ii) the calculation, using Fourier transformation, of a correlation function that assesses the degree of molecular surface overlap and penetration upon relative shifts of the molecules in three dimensions; and (iii) a scan of the relative orientations of the molecules in three dimensions. The algorithm provides a list of correlation values indicating the extent of geometric match between the surfaces of the molecules; each of these values is associated with six numbers describing the relative position (translation and rotation) of the molecules. The procedure is thus equivalent to a six-dimensional search but much faster by design, and the computation time is only moderately dependent on molecular size. The procedure was tested and validated by using five known complexes for which the correct relative position of the molecules in the respective adducts was successfully predicted. The molecular pairs were deoxyhemoglobin and methemoglobin, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. A more realistic test was performed with the last two pairs by using the structures of uncomplexed aspartic proteinase and trypsin inhibitor, respectively. The results are indicative of the extent of conformational changes in the molecules tolerated by the algorithm.
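The correlation score at the heart of this algorithm can be illustrated on a two-dimensional toy grid. The sketch below evaluates the correlation by direct summation for each shift, whereas the paper obtains the same quantity for all shifts at once with Fourier transforms and works in three dimensions over many orientations. The weights RHO and DELTA, the 4-neighbour surface definition and the shapes are illustrative assumptions.

```python
RHO = -15   # interior weight for molecule a; negative so penetration is penalised (assumed value)
DELTA = 1   # interior weight for molecule b (assumed value)


def digitize(cells, interior_value):
    """Map occupied cells to 1 on the surface and interior_value inside.

    A cell counts as surface if any of its four neighbours is unoccupied.
    """
    grid = {}
    for (x, y) in cells:
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        on_surface = any(n not in cells for n in neighbours)
        grid[(x, y)] = 1 if on_surface else interior_value
    return grid


def correlation(a, b, shift):
    """Score for molecule b translated by shift relative to molecule a."""
    dx, dy = shift
    return sum(a.get((x + dx, y + dy), 0) * v for (x, y), v in b.items())


def scan(a, b, max_shift):
    """Evaluate the correlation for every integer shift in a square window."""
    window = range(-max_shift, max_shift + 1)
    return {(dx, dy): correlation(a, b, (dx, dy)) for dx in window for dy in window}


# Molecule a: 3x3 block (one interior cell); molecule b: 2x2 block (all surface).
mol_a = {(x, y) for x in range(2, 5) for y in range(2, 5)}
mol_b = {(x, y) for x in range(0, 2) for y in range(0, 2)}
a = digitize(mol_a, RHO)
b = digitize(mol_b, DELTA)

scores = scan(a, b, max_shift=6)
best_score = max(scores.values())
```

In this toy case a shift of (1, 2) lets two surface cells of b overlap two surface cells of a and scores 2, while a shift of (2, 2) pushes b onto the interior cell of a and scores -12, which mirrors how the correlation rewards surface overlap and penalises penetration.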
Article
A monoclonal antibody raised against X-31 influenza virus reacted with the majority of natural H3N2 viruses isolated between 1968 and 1982. A number of variants of X-31 and of a receptor-binding mutant of X-31 were selected by the antibody during virus replication in eggs and MDCK cells. Antibody-binding assays indicated that the viruses selected were not antigenic variants and analyses using derivatized erythrocytes showed that their receptor-binding properties differed from those of the parent viruses. The amino acid substitutions in the variants were all located in the vicinity of the receptor-binding site and the structural consequences are discussed in relation to the three-dimensional structure of X-31 HA. In addition all of the variants fused membranes at higher pH than wild-type virus indicating that structural modifications in the distal globular region of HA influence the low pH-induced conformational change required for membrane fusion.
Article
Evidence is provided that dromedary heavy-chain antibodies, in vivo-matured in the absence of light chains, are a unique source of inhibitory antibodies. After immunization of a dromedary with bovine erythrocyte carbonic anhydrase and porcine pancreatic alpha-amylase, it was demonstrated that a considerable amount of heavy-chain antibodies, acting as true competitive inhibitors, circulate in the bloodstream. In contrast, the conventional antibodies apparently do not interact with the enzyme's active site. Next we illustrated that peripheral blood lymphocytes are suitable for one-step cloning of the variable domain fragments in a phage-display vector. By bio-panning, several antigen-specific single-domain fragments are readily isolated for both enzymes. In addition we show that among those isolated fragments active site binders are well represented. When produced as recombinant protein in Escherichia coli, these active site binders appear to be potent enzyme inhibitors when tested in chromogenic assays. The low complexity of the antigen-binding site of these single-domain antibodies composed of only three loops could be valuable for designing smaller synthetic inhibitors.
Article
The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of all known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and far evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database, so far. The database can be used as a source of data to calibrate sequence search algorithms and for the generation of population statistics on protein structures. The database and its associated files are freely accessible from a number of WWW sites mirrored from URL http://scop.mrc-lmb.cam.ac.uk/scop/
Article
Combining structural data from cryo-electron microscopy (cryo-EM) and X-ray crystallography to give pseudo-atomic models of large molecular complexes has proved particularly suitable for studying viruses and viral complexes. Several groups are developing programs to fit X-ray data to EM data. These programs are in general tailored to particular problems with regard to size, symmetry, number of rigid bodies, resolution etc. Here, two approaches are described to fitting X-ray data to EM data in the presence of steric interference and their relative merits and limitations are indicated. These fitting techniques are applied to the case of the rotavirus double-layered particle (DLP) in complex with antibodies which inhibit the transcription of mRNA by the DLP. This is a particularly good test case, as the cryo-electron microscopy map of the DLP-Fab complex, the X-ray structure of the viral protein (VP6) and also that of the VP6-Fab complex are available. The estimation of partial occupancy is also considered.
Article
HPr kinase/phosphatase (HprK/P) is a key regulatory enzyme controlling carbon metabolism in Gram-positive bacteria. It catalyses the ATP-dependent phosphorylation of Ser46 in HPr, a protein of the phosphotransferase system, and also its dephosphorylation. HprK/P is unrelated to eukaryotic protein kinases, but contains the Walker motif A characteristic of nucleotide-binding proteins. We report here the X-ray structure of an active fragment of Lactobacillus casei HprK/P at 2.8 Å resolution, solved by the multiwavelength anomalous dispersion method on a seleniated protein (PDB code 1jb1). The protein is a hexamer, with each subunit containing an ATP-binding domain similar to nucleoside/nucleotide kinases, and a putative HPr-binding domain unrelated to the substrate-binding domains of other kinases. The Walker motif A forms a typical P-loop which binds inorganic phosphate in the crystal. We modelled ATP binding by comparison with adenylate kinase, and designed a tentative model of the complex with HPr based on a docking simulation. The results confirm that HprK/P represents a new family of protein kinases, first identified in bacteria, but which may also have members in eukaryotes.
Article
HPr kinase/phosphorylase (HprK/P) controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by Gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of Ser-46 of HPr and its dephosphorylation by phosphorolysis. The latter reaction uses inorganic phosphate as substrate and produces pyrophosphate. We present here two crystal structures of a complex of the catalytic domain of Lactobacillus casei HprK/P with Bacillus subtilis HPr, both at 2.8-Å resolution. One of the structures was obtained in the presence of excess pyrophosphate, reversing the phosphorolysis reaction and contains serine-phosphorylated HPr. The complex has six HPr molecules bound to the hexameric kinase. Two adjacent enzyme subunits are in contact with each HPr molecule, one through its active site and the other through its C-terminal helix. In the complex with serine-phosphorylated HPr, a phosphate ion is in a position to perform a nucleophilic attack on the phosphoserine. Although the mechanism of the phosphorylation reaction resembles that of eukaryotic protein kinases, the dephosphorylation by inorganic phosphate is unique to the HprK/P family of kinases. This study provides the structure of a protein kinase in complex with its protein substrate, giving insights into the chemistry of the phospho-transfer reactions in both directions.
Article
The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and distant evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database so far. The sequences of proteins in SCOP provide the basis of the ASTRAL sequence libraries that can be used as a source of data to calibrate sequence search algorithms and for the generation of statistics on, or selections of, protein structures. Links can be made from SCOP to PDB-ISL: a library containing sequences homologous to proteins of known structure. Sequences of proteins of unknown structure can be matched to distantly related proteins of known structure by using pairwise sequence comparison methods to find homologues in PDB-ISL. The database and its associated files are freely accessible from a number of WWW sites mirrored from URL http://scop.mrc-lmb.cam.ac.uk/scop/.
Chapter
Introduction: The need for protein–protein and protein–DNA docking; Overview of the computational approach; Scope of this chapter
Structural studies of protein complexes
Methodology of a protein–protein docking strategy: Rigid body docking by Fourier correlation theory; Use of residue pair potentials to re-rank docked complexes; Use of distance constraints; Refinement and additional screening of complexes; Implementation of the docking suite
Results from the protein–protein docking strategy
Modelling protein–DNA complexes: Method; Results
Strategies for protein–protein docking: Evaluation of the results of docking simulations; Fourier correlation methods; Other rigid-body docking approaches; Flexible protein–protein docking; Rigid-body treatment to re-rank putative docked complexes; Introduction of flexibility to re-rank putative docked complexes
Blind trials of protein–protein docking
Energy landscape for protein docking
Conclusions
Article
Empirical residue–residue pair potentials are used to screen possible complexes for protein–protein dockings. A correct docking is defined as a complex with not more than 2.5 Å root-mean-square distance from the known experimental structure. The complexes were generated by “ftdock” (Gabb et al. J Mol Biol 1997;272:106–120) that ranks using shape complementarity. The complexes studied were 5 enzyme-inhibitors and 2 antibody-antigens, starting from the unbound crystallographic coordinates, with a further 2 antibody-antigens where the antibody was from the bound crystallographic complex. The pair potential functions tested were derived both from observed intramolecular pairings in a database of nonhomologous protein domains, and from observed intermolecular pairings across the interfaces in sets of nonhomologous heterodimers and homodimers. Out of various alternate strategies, we found the optimal method used a mole-fraction calculated random model from the intramolecular pairings. For all the systems, a correct docking was placed within the top 12% of the pair potential score ranked complexes. A combined strategy was developed that incorporated “multidock,” a side-chain refinement algorithm (Jackson et al. J Mol Biol 1998;276:265–285). This placed a correct docking within the top 5 complexes for enzyme-inhibitor systems, and within the top 40 complexes for antibody–antigen systems. Proteins 1999;35:364–373. © 1999 Wiley-Liss, Inc.
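As a rough illustration of how a residue pair potential with a mole-fraction random model might be built and used to score a docked interface, the sketch below takes a log-odds form: the observed count of each residue-type pairing is compared with the count expected if residues paired at random in proportion to their mole fractions. The log-odds form, the sign convention (higher is more favourable), the function names and the toy counts are all assumptions for illustration; the abstract does not give the exact functional form.

```python
import math


def pair_potential(observed_pairs, residue_counts):
    """Return a log-odds score for each unordered residue-type pair.

    observed_pairs maps (type_a, type_b) to the number of observed pairings.
    residue_counts maps residue type to its count, used for mole fractions.
    Both inputs are hypothetical toy data structures.
    """
    total_pairs = sum(observed_pairs.values())
    total_residues = sum(residue_counts.values())
    scores = {}
    for pair, n_observed in observed_pairs.items():
        a, b = sorted(pair)
        x_a = residue_counts[a] / total_residues
        x_b = residue_counts[b] / total_residues
        # An unordered pair of different types can arise in two orders.
        multiplicity = 1 if a == b else 2
        expected = total_pairs * multiplicity * x_a * x_b
        scores[(a, b)] = math.log(n_observed / expected)
    return scores


def score_complex(interface_pairs, scores):
    """Sum the pair scores over the residue pairs found across an interface."""
    return sum(scores.get(tuple(sorted(pair)), 0.0) for pair in interface_pairs)


# Toy example: two residue types with equal mole fractions.
toy_scores = pair_potential(
    {("K", "K"): 10, ("K", "L"): 40, ("L", "L"): 50},
    {"K": 50, "L": 50},
)
toy_total = score_complex([("L", "L"), ("L", "K")], toy_scores)
```

Candidate complexes would then be re-ranked by their summed score; pairs absent from the table contribute zero in this sketch.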
Article
A computationally tractable strategy has been developed to refine protein-protein interfaces that models the effects of side-chain conformational change, solvation and limited rigid-body movement of the subunits. The proteins are described at the atomic level by a multiple copy representation of side-chains modelled according to a rotamer library on a fixed peptide backbone. The surrounding solvent environment is described by "soft" sphere Langevin dipoles for water that interact with the protein via electrostatic, van der Waals and field-dependent hydrophobic terms. Energy refinement is based on a two-step process in which (1) a probability-based conformational matrix of the protein side-chains is refined iteratively by a mean field method. A side-chain interacts with the protein backbone and the probability-weighted average of the surrounding protein side-chains and solvent molecules. The resultant protein conformations then undergo (2) rigid-body energy minimization to relax the protein interface. Steps (1) and (2) are repeated until convergence of the interaction energy. The influence of refinement on side-chain conformation starting from unbound conformations found improvement in the RMSD of side-chains in the interface of protease-inhibitor complexes, and shows that the method leads to an improvement in interface geometry. In terms of discriminating between docked structures, the refinement was applied to two classes of protein-protein complex: five protease-protein inhibitor and four antibody-antigen complexes. A large number of putative docked complexes have already been generated for the test systems using our rigid-body docking program, FTDOCK. They include geometries that closely resemble the crystal complex, and therefore act as a test for the refinement procedure. 
In the protease-inhibitors, geometries that resemble the crystal complex are ranked in the top four solutions for four out of five systems when solvation is included in the energy function, against a background of between 26 and 364 complexes in the data set. The results for the antibody-antigen complexes are not as encouraging, with only two of the four systems showing discrimination. It would appear that these results reflect the somewhat different binding mechanism dominant in the two types of protein-protein complex. Binding in the protease-inhibitors appears to be "lock and key" in nature. The fixed backbone and mobile side-chain representation provide a good model for binding. Movements in the backbone geometry of antigens on binding represent an "induced fit" and provide more of a challenge for the model. Given the limitations of the conformational sampling, the ability of the energy function to discriminate between native and non-native states is encouraging. Development of the approach to include greater conformational sampling could lead to a more general solution to the protein docking problem.
Article
Two tasks must be accomplished when calculating the binding modalities and binding energies of two molecules in solution: the calculation of the interaction energy and the calculation of the effects of solvation. It is the competition between the energy of binding and the energy of remaining solvation which determines the binding properties. It is necessary to calculate (or at least approximate in some manner) the partition function in order to make a theoretical estimate of these effects. An efficient algorithm for performing the energy evaluations necessary for this calculation is presented in this paper. The fast Fourier transform (FFT) is used in combination with a polar factorization of the potentials to calculate the interaction energy at all relative translations between two molecules of fixed orientation. Thermodynamic quantities, including the partition function, internal and free energies can then be estimated from a set of these calculations covering the orientation space.
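The core idea of evaluating a score at every relative translation with one Fourier-space product can be sketched on a one-dimensional periodic grid. The sketch below is a simplification under stated assumptions: real docking grids are three-dimensional, the polar factorization of the potentials mentioned in the abstract is not reproduced, and the function names and toy grids are hypothetical. It uses the correlation theorem, c = IFFT(FFT(receptor) · conj(FFT(ligand))), and checks the result against a direct sum.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); length must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlate_fft(receptor, ligand):
    """Score c[t] = sum_x receptor[x] * ligand[(x - t) % n] for every shift t."""
    n = len(receptor)
    if n == 0 or n != len(ligand) or n & (n - 1):
        raise ValueError("grids must have equal, non-zero power-of-two length")
    spectrum_r = fft(receptor)
    spectrum_l = fft(ligand)
    product = [r * l.conjugate() for r, l in zip(spectrum_r, spectrum_l)]
    # The inverse transform above is unnormalised, so divide by n here.
    return [value.real / n for value in fft(product, inverse=True)]


def correlate_direct(receptor, ligand):
    """Same scores by explicit summation, O(n^2); used only as a check."""
    n = len(receptor)
    return [
        sum(receptor[x] * ligand[(x - t) % n] for x in range(n))
        for t in range(n)
    ]


# Toy grids: the ligand overlaps the receptor fully when shifted by 2 cells.
receptor_grid = [0, 0, 1, 1, 0, 0, 0, 0]
ligand_grid = [1, 1, 0, 0, 0, 0, 0, 0]
fast_scores = correlate_fft(receptor_grid, ligand_grid)
slow_scores = correlate_direct(receptor_grid, ligand_grid)
best_shift = max(range(len(fast_scores)), key=lambda t: fast_scores[t])
```

For a grid of N points the FFT route costs O(N log N) per orientation instead of O(N²), which is what makes scanning all translations for many orientations tractable.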
Article
Streptococcus pyogenes that produces the bacterial superantigen streptococcal pyrogenic exotoxin A (SpeA) is associated with outbreaks of streptococcal toxic shock syndrome (STSS) in the United States and Europe. SpeA stimulates Vβ2.1, 12.2, 14.1, and 15.1-positive T cells, and the lymphokine production from the activated T cells is believed to result in the symptoms associated with STSS. The T-cell receptor (TCR)–SpeA interaction is crucial for superantigenic activity, and studies were undertaken to determine regions of both SpeA and the TCR involved in the formation of MHC/SpeA/TCR complexes. Previously, recombinant toxins encoded by speA alleles 1, 2, and 3 as well as toxins resulting from 19 distinct point mutations in speA1 were generated. Here, these 22 toxin forms were incubated with human peripheral blood mononuclear cells (PBMCs), and the percentages of T-cell blasts bearing Vβ chains 2.1, 12.2, and 14.1 were quantified by flow cytometry. The analysis indicates that the residues of SpeA needed for a productive TCR interaction differ for each Vβ chain examined. An amino acid substitution at only one site significantly affected the toxin’s ability to stimulate Vβ2.1-expressing T cells, three individual amino acid substitutions resulted in significant loss of ability to stimulate Vβ12.2-expressing T cells, and substitution at 13 individual sites significantly affected the ability to stimulate Vβ14.1-expressing T cells. To elucidate the regions of the Vβ chains that interacted with SpeA, synthetic peptides representative of the human Vβ12.2 complementarity-determining regions (CDRs) 1, 2, and 4 were used to block the SpeA-mediated proliferation of human PBMCs. The CDR1, CDR2 and CDR4 peptides were each able to block proliferation, with the activity of CDR1 > CDR2 > CDR4. Combinations of CDR1 peptide with CDR2 or CDR4 peptides allosterically enhanced the ability of each to block proliferation, suggesting SpeA has distinct binding sites for the CDR loops.
Article
A protein docking study was performed for two classes of biomolecular complexes: six enzyme/inhibitor and four antibody/antigen. Biomolecular complexes for which crystal structures of both the complexed and uncomplexed proteins are available were used for eight of the ten test systems. Our docking experiments consist of a global search of translational and rotational space followed by refinement of the best predictions. Potential complexes are scored on the basis of shape complementarity and favourable electrostatic interactions using Fourier correlation theory. Since proteins undergo conformational changes upon binding, the scoring function must be sufficiently soft to dock unbound structures successfully. Some degree of surface overlap is tolerated to account for side-chain flexibility. Similarly for electrostatics, the interaction of the dispersed point charges of one protein with the Coulombic field of the other is measured rather than precise atomic interactions. We tested our docking protocol using the native rather than the complexed forms of the proteins to address the more scientifically interesting problem of predictive docking. In all but one of our test cases, correctly docked geometries (interface Cα RMS deviation ≤2 Å from the experimental structure) are found during a global search of translational and rotational space in a list that was always less than 250 complexes and often less than 30. Varying degrees of biochemical information are still necessary to remove most of the incorrectly docked complexes.
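The correctness criterion quoted above (an RMS deviation of at most 2 Å from the experimental structure) can be illustrated with a plain coordinate RMSD. This is a minimal sketch that assumes the predicted and experimental coordinates are already in a common frame and listed in the same atom order, so no superposition step is performed; the function names and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def is_correct_docking(predicted, experimental, threshold=2.0):
    """True when the RMSD (in angstroms) does not exceed the threshold."""
    return rmsd(predicted, experimental) <= threshold


# Toy interface atoms: one prediction displaced by 1 A, another by 3 A.
experimental_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
close_prediction = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
far_prediction = [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0)]
```

Different papers in this list use different thresholds (2 Å here, 2.5 Å elsewhere), so the threshold is left as a parameter.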
Article
A computational system is described that predicts the structure of protein/protein and protein/DNA complexes starting from unbound coordinate sets. The approach is (i) a global search with rigid-body docking for complexes with shape complementarity and favourable electrostatics; (ii) use of distance constraints from experimental (or predicted) knowledge of critical residues; (iii) use of pair potential to screen docked complexes and (iv) refinement and further screening by protein-side chain optimisation and interfacial energy minimisation. The system has been applied to model ten protein/protein and eight protein-repressor/DNA (steps i to iii only) complexes. In general a few complexes, one of which is close to the true structure, can be generated.
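Step (ii) of the approach above, filtering by distance constraints, could be sketched as follows. The idea shown is to keep only those docked complexes in which a residue known (or predicted) to be critical on one protein lies within some cutoff of the other protein. The dictionary layout, the function names and the 4.5 Å cutoff are assumptions made for illustration and are not taken from the described system.

```python
import math


def satisfies_constraint(site_atoms, partner_atoms, cutoff=4.5):
    """True if any atom of the critical residue is within cutoff of the partner."""
    return any(
        math.dist(a, b) <= cutoff for a in site_atoms for b in partner_atoms
    )


def filter_by_constraint(complexes, cutoff=4.5):
    """Keep complexes whose critical residue contacts the partner protein.

    Each complex is a hypothetical dict with 'site_atoms' (coordinates of the
    critical residue) and 'ligand_atoms' (coordinates of the docked partner).
    """
    return [
        c for c in complexes
        if satisfies_constraint(c["site_atoms"], c["ligand_atoms"], cutoff)
    ]


# Toy complexes: only the first places the partner near the critical residue.
near_complex = {"site_atoms": [(0.0, 0.0, 0.0)],
                "ligand_atoms": [(3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]}
far_complex = {"site_atoms": [(0.0, 0.0, 0.0)],
               "ligand_atoms": [(8.0, 0.0, 0.0)]}
kept = filter_by_constraint([near_complex, far_complex])
```

A filter of this kind preserves the original ranking order, so it can sit between the global search and the pair-potential screening without changing either.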
Article
We have determined the structure of a complex of influenza hemagglutinin (HA) with an antibody that binds simultaneously to the membrane-distal domains of two HA monomers, effectively cross-linking them. The antibody prevents the low pH structural transition of HA that is required for its membrane fusion activity, providing evidence that a rearrangement of HA membrane-distal domains is an essential component of the transition.
Article
Camelids produce functional antibodies devoid of light chains and CH1 domains. The antigen-binding fragment of such heavy chain antibodies is therefore comprised in one single domain, the camelid heavy chain antibody VH (VHH). Here we report on the structures of three dromedary VHH domains in complex with porcine pancreatic α-amylase. Two VHHs bound outside the catalytic site and did not inhibit or inhibited only partially the amylase activity. The third one, AMD9, interacted with the active site crevice and was a strong amylase inhibitor (Ki = 10 nM). In contrast with complexes of other proteinaceous amylase inhibitors, amylase kept its native structure. The water-accessible surface areas of VHHs covered by amylase ranged between 850 and 1150 Å², values similar to or even larger than those observed in the complexes between proteins and classical antibodies. These values could certainly be reached because a surprisingly high extent of framework residues are involved in the interactions of VHHs with amylase. The framework residues that participate in the antigen recognition represented 25–40% of the buried surface. The inhibitory interaction of AMD9 involved mainly its complementarity-determining region (CDR) 2 loop, whereas the CDR3 loop was small and certainly did not protrude as it does in cAb-Lys3, a VHH-inhibiting lysozyme. AMD9 inhibited amylase, although it was outside the direct reach of the catalytic residues; therefore it is to be expected that inhibiting VHHs might also be elicited against proteases. These results illustrate the versatility and efficiency of VHH domains as protein binders and enzyme inhibitors and are arguments in favor of their use as drugs against diabetes.
Article
Superantigens (SAGs) crosslink MHC class II and TCR molecules, resulting in an overstimulation of T cells associated with human disease. SAGs interact with several different surfaces on MHC molecules, necessitating the formation of multiple distinct MHC-SAG-TCR ternary signaling complexes. Variability in SAG-TCR binding modes could also contribute to the structural heterogeneity of SAG-dependent signaling complexes. We report crystal structures of the streptococcal SAGs SpeA and SpeC in complex with their corresponding TCR beta chain ligands that reveal distinct TCR binding modes. The SpeC-TCR beta chain complex structure, coupled with the recently determined SpeC-HLA-DR2a complex structure, provides a model for a novel T cell signaling complex that precludes direct TCR-MHC interactions. Thus, highly efficient T cell activation may be achieved through structurally diverse strategies of TCR ligation.
Modelling protein docking using shape complementarity, electrostatics and biochemical information
  • Gabb HA
  • Jackson RM
  • Sternberg MJE
Gabb HA, Jackson RM, Sternberg MJE. Modelling protein docking using shape complementarity, electrostatics and biochemical information. J Mol Biol 1997;272:106–120.
Rapid refinement of protein interfaces incorporating solvation: application to the docking problem
  • Jackson RM
  • Gabb HA
  • Sternberg MJE
Jackson RM, Gabb HA, Sternberg MJE. Rapid refinement of protein interfaces incorporating solvation: application to the docking problem. J Mol Biol 1998;276:265–285.
Three camelid VHH domains in complex with porcine pancreatic alpha-amylase. Inhibition and versatility of binding topology
  • Desmyter A
  • Spinelli S
  • Payan F
  • Lauwereys M
  • Wyns L
  • Muyldermans S
  • Cambillau C
Desmyter A, Spinelli S, Payan F, Lauwereys M, Wyns L, Muyldermans S, Cambillau C. Three camelid VHH domains in complex with porcine pancreatic alpha-amylase. Inhibition and versatility of binding topology. J Biol Chem 2002;277:23645–23650.
Proceedings Sixth International Conference on Intelligent Systems for Molecular Biology
  • Sternberg MJE
  • Aloy P
  • Gabb HA
  • Jackson RM
  • Moont G
  • Querol E
  • Aviles FX
Bioinformatics—from genomes to drugs
  • Sternberg MJE
  • Moont G