Article

Docking Unbound Proteins Using Shape Complementarity, Desolvation, and Electrostatics

Authors:
  • State Key Laboratory for Agrobiotechnology, College of Biological Sciences

Abstract

A comprehensive docking study was performed on 27 distinct protein-protein complexes. For 13 test systems, docking was performed with the unbound X-ray structures of both the receptor and the ligand. For the remaining systems, the unbound X-ray structure of only one molecule was available; therefore, the bound structure of the other molecule was used. Our method optimizes desolvation, shape complementarity, and electrostatics using a Fast Fourier Transform algorithm. A global search in the rotational and translational space without any knowledge of the binding sites was performed for all proteins except nine antibodies recognizing antigens. For these antibodies, we docked their well-characterized binding site (the complementarity-determining region, defined without information about the antigen) to the entire surface of the antigen. For 24 systems, we were able to find near-native ligand orientations (interface Cα root mean square deviation less than 2.5 Å from the crystal complex) among the top 2,000 choices. For three systems, our algorithm could identify the correct complex structure unambiguously. For 13 other complexes, we either ranked a near-native structure in the top 20 or obtained 20 or more near-native structures in the top 2,000 or both. The key feature of our algorithm is the use of target functions that are highly tolerant to conformational changes upon binding. If combined with a post-processing method, our algorithm may provide a general solution to the unbound docking problem. Our program, called ZDOCK, is freely available to academic users (http://zlab.bu.edu/~rong/dock/).
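The translational part of such a grid search can be illustrated with a small sketch. The code below is not ZDOCK's implementation: it scores every shift of a toy ligand against a toy receptor grid by direct summation, which is the correlation that an FFT evaluates for all translations at once. The grid size, the `+1` shell weight, the `-9` core penalty, and the two-cell ligand are all assumptions chosen for illustration; a global search would repeat this scan for each sampled ligand rotation.

```python
from itertools import product

N = 6  # toy cubic grid size (assumption)

# Hypothetical receptor grid: cells next to the core are favorable (+1),
# while the core cell carries a large penalty so overlap is discouraged.
CORE = (2, 2, 2)
receptor = {CORE: -9.0}
for axis in range(3):
    for step in (-1, 1):
        cell = list(CORE)
        cell[axis] += step
        receptor[tuple(cell)] = 1.0

# Hypothetical ligand: a two-cell rod along x, anchored at the origin.
ligand = [(0, 0, 0), (1, 0, 0)]


def score(t):
    """Correlation score for shifting the ligand by t on a periodic grid."""
    total = 0.0
    for cell in ligand:
        shifted = tuple((c + d) % N for c, d in zip(cell, t))
        total += receptor.get(shifted, 0.0)
    return total


def scan_translations():
    """Score every translation; an FFT would yield the same values faster."""
    return {t: score(t) for t in product(range(N), repeat=3)}


scores = scan_translations()
best = max(scores.values())
```

Here `scores` maps each of the 216 translations to its score. Shifts that place the rod on a shell cell without touching the core score highest, and shifts that overlap the core are penalized.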

... After that, we employed the computational protein-protein docking algorithm (ZDOCK) 25 to predict the bound conformation of the PLK1 PBD domain (PDB ID code 1UMW) 26 and SHCBP1 355-562 aa domain, whose structure was predicted by homology modeling, using an I-TASSER server 27 (Supplementary Fig. 7b). Following ZDOCK analysis, the top-ranked bound conformation for the SHCBP1-PLK1 complex was selected, and 12 core amino acids on the binding surface of the mode were predicted (Fig. 6c, d and Supplementary Fig. 7c, d). ...
... Tumor proliferation was assessed by Ki-67 immunohistochemical staining. Protein-protein docking. The binding surface of the PLK1 PBD domain (367-603 aa) and SHCBP1 355-562 aa domain was predicted by the Dock Proteins (ZDOCK) protocol 25. Briefly, we employed the I-TASSER server 27 to predict the 3D structure of SHCBP1 355-562 aa domain, and then the SHCBP1 355-562 aa domain and PLK1 PBD domain (PDB ID code 1UMW) were respectively defined as receptor and ligand proteins, and no amino acid was predefined as interface residue or paired interacting residue. ...
Article
Full-text available
Trastuzumab is the backbone of HER2-directed gastric cancer therapy, but poor patient response due to insufficient cell sensitivity and drug resistance remains a clinical challenge. Here, we report that HER2 is involved in cell mitotic promotion for tumorigenesis by hyperactivating a crucial HER2-SHCBP1-PLK1 axis that drives trastuzumab sensitivity and is targeted therapeutically. SHCBP1 is an Shc1-binding protein but is detached from scaffold protein Shc1 following HER2 activation. Released SHCBP1 responds to HER2 cascade by translocating into the nucleus following Ser273 phosphorylation, and then contributing to cell mitosis regulation through binding with PLK1 to promote the phosphorylation of the mitotic interactor MISP. Meanwhile, Shc1 is recruited to HER2 for MAPK or PI3K pathways activation. Also, clinical evidence shows that increased SHCBP1 prognosticates a poor response of patients to trastuzumab therapy. Theaflavine-3, 3’-digallate (TFBG) is identified as an inhibitor of the SHCBP1-PLK1 interaction, which is a potential trastuzumab sensitizing agent and, in combination with trastuzumab, is highly efficacious in suppressing HER2-positive gastric cancer growth. These findings suggest an aberrant mitotic HER2-SHCBP1-PLK1 axis underlies trastuzumab sensitivity and offer a new strategy to combat gastric cancer. Resistance to Trastuzumab in HER2 gastric cancer patients remains a clinical challenge. In this study, the authors demonstrate that HER2 promotes tumorigenesis in gastric cancer by regulating mitotic progression through a Shc1-SHCBP1-PLK1-MISP axis and they identify a compound, TFBG, able to disrupt SHCBP1/PLK1 interaction and to synergize with trastuzumab.
... First, we employed proximity ligation assay (PLA; in situ biochemical assay) to confirm survivin interaction with caspase-3, caspase-7, caspase-9, and XIAP. Subsequently, we carried out protein-protein docking of survivin with caspase-3, caspase-7, and caspase-9 in the absence/presence of XIAP BIR domains using ZDOCK modules [26,27], followed by refinement using RDOCK modules [28] to select a total of 18 near-native protein-protein interaction (PPI) complex structures. Further, a 50 ns molecular dynamics (MD) simulation was carried out to evaluate the structural stability of the selected 18 PPI complex structures. ...
... The prepared survivin was independently docked to caspase-3, caspase-7, and caspase-9, in the absence of XIAP BIR domains using the ZDOCK module [26,27]. ZDOCK is a rigid body protein-protein docking module that predicts the probable conformations (about 2000) of the protein-protein system. ...
Article
Survivin is an Inhibitor of Apoptosis (IAP) family protein that is involved in various protein-protein interactions (PPIs) and thereby regulates cell division, apoptosis, and autophagy. Besides, survivin is overexpressed in most human solid tumors, but not in differentiated normal cells. Hence, identification of survivin PPI hotspots could pave the way for effective structure-based drug design. For this, we used both in vitro (proximity ligation assay) and in silico (protein-protein docking and molecular dynamics simulation) methods to understand survivin PPIs with caspase-3, caspase-7, and caspase-9 in the absence/presence of XIAP BIR domains. Computational results reveal that survivin interacts with the catalytic site and/or at the dimerization site of caspase-3, caspase-7, and caspase-9. This PPI could inhibit the catalytic activity of caspases and hinder apoptosis in cancer. Moreover, MM-PBSA binding energy calculation disclosed that survivin strongly interacts with all three caspases in the presence of XIAP BIR domains. Additionally, per-residue energy decomposition results revealed that survivin BIR domain residues Asp53, Glu65, Gly66, Glu68, Asp70, Asp71, and Glu76, and C-α helix residues Glu125, Lys129, Arg132, and Arg133 significantly contributed to binding with the caspases. Targeting these hotspot residues with small molecules could result in the disruption of survivin-caspase PPI, leading to the induction of apoptosis in cancer.
... The receptor in the docking study was the crystal structure of apo CBM6E-GH128/CBM6. The ZDOCK algorithm [40] was employed for rigid body docking, utilizing shape complementarity, electrostatics, and desolvation terms to generate ZDOCK scores. These scores were used to rank the protein poses obtained from the docking simulations. ...
Article
Full-text available
Background: Degradation via enzymatic processes for the production of valuable β-1,3-glucooligosaccharides (GOS) from curdlan has attracted considerable interest. CBM6E functions as a curdlan-specific β-1,3-endoglucanase, composed of a glycoside hydrolase family 128 (GH128) module and a carbohydrate-binding module (CBM) derived from family CBM6. Results: Crystallographic analyses were conducted to comprehend the substrate specificity mechanism of CBM6E. This unveiled structures of both apo CBM6E and its GOS-complexed form. The GH128 and CBM6 modules constitute a cohesive unit, binding nine glucoside moieties within the catalytic groove in a singular helical conformation. By extending the substrate-binding groove, we engineered CBM6E variants with heightened hydrolytic activities, generating diverse GOS profiles from curdlan. Molecular docking, followed by mutation validation, unveiled the cooperative recognition of triple-helical β-1,3-glucan by the GH128 and CBM6 modules, along with the identification of a novel sugar-binding residue situated within the CBM6 module. Interestingly, supplementing the CBM6 module into curdlan gel disrupted the gel’s network structure, enhancing the hydrolysis of curdlan by specific β-1,3-glucanases. Conclusions: This study offers new insights into the recognition mechanism of glycoside hydrolases toward triple-helical β-1,3-glucans, presenting an effective method to enhance endoglucanase activity and manipulate its product profile. Furthermore, it discovered a CBM module capable of disrupting the quaternary structures of curdlan, thereby boosting the hydrolytic activity of curdlan gel when co-incubated with β-1,3-glucanases. These findings hold relevance for developing future enzyme and CBM cocktails useful in GOS production from curdlan degradation.
... The sampling strategies can be classified into exhaustive global search, local shape feature matching, or randomized search (Huang, 2014). The exhaustive global search methods (Chen et al., 2003; Chen & Weng, 2002; Comeau et al., 2004; Gabb et al., 1997; Heifetz et al., 2002; Kozakov et al., 2006; Mandell et al., 2001; Vakser, 1997) mostly use fast Fourier transforms (FFTs; Katchalski-Katzir et al., 1992) to cover the complete 6D (3D translational plus 3D rotational) space, assuming no conformational changes of the docking partners. The local shape-matching methods (Duhovny et al., 2002; Esquivel-Rodríguez et al., 2012; Gardiner et al., 2001; Kuntz et al., 1982; Schneidman-Duhovny et al., 2005; Venkatraman et al., 2009) typically represent a protein by the shape of its molecular surface and find matches of high shape complementarity between two proteins. ...
Article
Full-text available
Conventional protein–protein docking algorithms usually rely on heavy candidate sampling and reranking, but these steps are time‐consuming and hinder applications that require high‐throughput complex structure prediction, for example, structure‐based virtual screening. Existing deep learning methods for protein–protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding‐induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multitrack iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments, GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. On the Database of Interacting Protein Structures (DIPS) test set, GeoDock achieves a 43% top‐1 success rate, outperforming all other tested methods. However, in the standard DIPS train/test splits, we discovered contamination of close homologs in the training set. After decontaminating the training set, the success rate is 31%. On the DB5.5 test set and a benchmark dataset of antibody–antigen complexes, GeoDock outperforms the deep learning models trained using the same dataset but falls behind most of the conventional methods and AlphaFold‐Multimer. GeoDock attains an average inference speed of under 1 s on a single GPU, enabling its application in large‐scale structure screening. 
Although binding‐induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.
... All models were in the closed state. The gephyrin-bound models 4α1:1β-GephE and 4α2:1β-GephE were generated using the GlyR models and the C-terminal GephE domain bound to a GlyR β subunit peptide (PDB ID: 4PD1) using MODELLER and Z-DOCK, a Fast Fourier Transform algorithm that uses electrostatics, desolvation, and shape complementarity to dock rigid bodies [18]. The successfully docked models were selected after a qualitative assessment of their structural alignment with 4PD1. ...
Article
Full-text available
Glycine receptors (GlyRs) are glycine-gated inhibitory pentameric ligand-gated ion channels composed of α or α + β subunits. A number of structures of these proteins have been reported, but to date, these have only revealed details of the extracellular and transmembrane domains, with the intracellular domain (ICD) remaining uncharacterised due to its high flexibility. The ICD is a region that can modulate function in addition to being critical for receptor localisation and clustering via proteins such as gephyrin. Here, we use modelling and molecular dynamics (MD) to reveal details of the ICDs of both homomeric and heteromeric GlyR. At their N and C ends, both the α and β subunit ICDs have short helices, which are major sites of stabilising interactions; there is a large flexible loop between them capable of forming transient secondary structures. The α subunit can affect the β subunit ICD structure, which is more flexible in a 4α2:1β than in a 4α1:1β GlyR. We also explore the effects of gephyrin binding by creating GlyR models bound to the gephyrin E domain; MD simulations suggest these are more stable than the unbound forms, and again there are α subunit-dependent differences, despite the fact that gephyrin binds to the β subunit. The bound models also suggest that gephyrin causes compaction of the ICD. Overall, the data expand our knowledge of this important receptor protein and in particular clarify features of the underexplored ICD.
... The I-RMSD of a prediction is the RMSD of the Cα atoms of its interface residues calculated after the superposition of the Cα atoms in the prediction onto the native structure [61]. As in many algorithms, a hit is defined as a prediction whose I-RMSD from the native structure is less than 2.5 Å [59,62,63]. In addition, we adopt 1 Å, 2 Å, and 4 Å as hit thresholds from different criteria to verify the robustness of the docking methods. ...
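The hit criterion described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the cited authors' code: it assumes the predicted interface Cα coordinates have already been superposed onto the native structure (the superposition step itself is omitted), and the function names `rmsd` and `hits_by_threshold` are hypothetical.

```python
import math


def rmsd(pred, native):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the prediction is already superposed onto the native structure.
    """
    if len(pred) != len(native) or not pred:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(pred, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(pred))


def hits_by_threshold(i_rmsd, thresholds=(1.0, 2.0, 2.5, 4.0)):
    """Report whether a prediction is a hit at each I-RMSD threshold (in Å)."""
    return {t: i_rmsd < t for t in thresholds}


# Toy interface of two C-alpha atoms, displaced by 2 Å along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
value = rmsd(pred, native)
flags = hits_by_threshold(value)
```

With the toy coordinates the I-RMSD is 2.0 Å, so the prediction is a hit at the 2.5 Å and 4 Å thresholds but not at 1 Å or 2 Å, because the comparison is strict.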
Article
Protein-protein interaction plays an important role in studying the mechanism of protein functions from the structural perspective. Molecular docking is a powerful approach to detect protein-protein complexes using computational tools, because traditional experimental methods are costly and time-consuming. Among existing technologies, the template-based method utilizes the structural information of known homologous 3D complexes as available and reliable templates to achieve high accuracy and low computational complexity. However, the performance of the template-based method depends on the quality and quantity of templates. When templates are insufficient or absent, the ab initio docking method is necessary and largely enriches the docking conformations. Therefore, it is a feasible strategy to fuse the effectiveness of the template-based model and the universality of the ab initio model to improve the docking performance. In this study, we construct a new, diverse, comprehensive template library derived from PDB, containing 77,685 complexes. We propose a template-based method (named TemDock), which retrieves the evolutionary relationship between the target sequence and samples in the template library and transfers similar structural information. Then, the target structure is built by superposing on the homologous template complex with TM-align. Moreover, we develop a consensus-based method (named ComDock) to integrate our TemDock and an existing ab initio method (ZDOCK). On 105 targets with templates from Benchmark 5.0, TemDock and ComDock achieve success rates of 68.57% and 71.43% in the top 10 conformations, respectively. Compared with HDOCK, ComDock obtains better I-RMSD of hit configurations on 9 targets and more hit models in the top 100 conformations. 
As an efficient method for protein-protein docking, ComDock is expected to aid the study of protein-protein recognition and reveal the various biological pathways that are critical for drug discovery. The final results are stored at https://github.com/guofei-tju/mqz_ComDock_docking.
... Each one of these poses was assigned a ZDOCK score, composed of pairwise shape complementarity (PSC), desolvation, and electrostatics energy terms. PSC computes the total number of atom pairs between the two proteins being docked within a distance cut-off, and a penalty term is assigned to grid point overlap of core-core, surface-core and surface-surface terms to prevent steric clashes [40,56]. Desolvation is based on the atomic contact energy (ACE), which is the free energy change of breaking two protein atom-water contacts and replacing them with a protein atom-protein atom contact and a water-water contact. ...
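The favorable part of PSC as described here, counting atom pairs within a distance cut-off, can be sketched as follows. This is an illustration only: it works on explicit atom coordinates rather than on ZDOCK's grids, it omits the clash penalty, and the function name `count_contacts` and the cut-off values used in the example are assumptions rather than values from the cited work.

```python
import math


def count_contacts(receptor_atoms, ligand_atoms, cutoff):
    """Count receptor-ligand atom pairs whose distance is within the cut-off (Å)."""
    count = 0
    for r in receptor_atoms:
        for l in ligand_atoms:
            if math.dist(r, l) <= cutoff:
                count += 1
    return count


# Toy coordinates in Å (hypothetical, for illustration only).
receptor_atoms = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand_atoms = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (20.0, 0.0, 0.0)]

contacts_6 = count_contacts(receptor_atoms, ligand_atoms, cutoff=6.0)
contacts_7 = count_contacts(receptor_atoms, ligand_atoms, cutoff=7.0)
```

Raising the assumed cut-off from 6 Å to 7 Å admits one more pair in this toy pose, which shows how sensitive the count is to that parameter.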
Article
Full-text available
Fungal effector proteins are important in mediating disease infections in agriculturally important crops. These secreted small proteins are known to interact with their respective host receptor binding partners in the host, either inside the cells or in the apoplastic space, depending on the localisation of the effector proteins. Consequently, it is important to understand the interactions between fungal effector proteins and their target host receptor binding partners, particularly since this can be used for the selection of potential plant resistance or susceptibility-related proteins that can be applied to the breeding of new cultivars with disease resistance. In this study, molecular docking simulations were used to characterise protein-protein interactions between effector and plant receptors. Benchmarking was undertaken using available experimental structures of effector-host receptor complexes to optimise simulation parameters, which were then used to predict the structures and mediating interactions of effector proteins with host receptor binding partners that have not yet been characterised experimentally. Rigid docking was applied for both the so-called bound and unbound docking of MAX effectors with plant HMA domain protein partners. All bound complexes used for benchmarking were correctly predicted, with 84% being ranked as the top docking pose using the ZDOCK scoring function. In the case of unbound complexes, a minimum of 95% of known residues were predicted to be part of the interacting interface on the host receptor binding partner, and at least 87% of known residues were predicted to be part of the interacting interface on the effector protein. Hydrophobic interactions were found to dominate the formation of effector-plant protein complexes. 
An optimised set of docking parameters based on the use of ZDOCK and ZRANK scoring functions were established to enable the prediction of near-native docking poses involving different binding interfaces on plant HMA domain proteins. Whilst this study was limited by the availability of the experimentally determined complexed structures of effectors and host receptor binding partners, we demonstrated the potential of molecular docking simulations to predict the likely interactions between effectors and their respective host receptor binding partners. This computational approach may accelerate the process of the discovery of putative interacting plant partners of effector proteins and contribute to effector-assisted marker discovery, thereby supporting the breeding of disease-resistant crops.
... The starting point for the generation of our 'core' set was the protein docking Benchmark 5 (BM5) (Vreven et al., 2015) consisting of 230 protein-protein complexes (targets) complete with their experimental structure. For each of the 230 protein-protein complexes (targets) in BM5 (Vreven et al., 2015), we generated a total of 30 000 DMs with FTDock (Gabb et al., 1997), ZDock (Chen and Weng, 2002) and HADDOCK (de Vries et al., 2007; Dominguez et al., 2003) as detailed previously (Barradas-Bautista et al., 2022). The quality of the generated DMs was assessed following the CAPRI (Critical Assessment of PRedicted Interactions) protocol (Méndez et al., 2003), and consequently DMs were classified, in order of increasing quality, as Incorrect, Acceptable, Medium- and High-quality, as reported in Supplementary Table S1. ...
Article
Full-text available
Motivation: Protein–protein interactions drive many relevant biological events, such as infection, replication and recognition. To control or engineer such events, we need to access the molecular details of the interaction provided by experimental 3D structures. However, such experiments take time and are expensive; moreover, the current technology cannot keep up with the high discovery rate of new interactions. Computational modeling, like protein–protein docking, can help to fill this gap by generating docking poses. Protein–protein docking generally consists of two parts, sampling and scoring. The sampling is an exhaustive search of the tridimensional space. The caveat of the sampling is that it generates a large number of incorrect poses, producing a highly unbalanced dataset. This limits the utility of the data to train machine learning classifiers. Results: Using weak supervision, we developed a data augmentation method that we named hAIkal. Using hAIkal, we increased the labeled training data to train several algorithms. We trained and obtained different classifiers; the best classifier has 81% accuracy and 0.51 Matthews’ correlation coefficient on the test set, surpassing the state-of-the-art scoring functions. Availability and implementation: Docking models from Benchmark 5 are available at https://doi.org/10.5281/zenodo.4012018. Processed tabular data are available at https://repository.kaust.edu.sa/handle/10754/666961. Google colab is available at https://colab.research.google.com/drive/1vbVrJcQSf6_C3jOAmZzgQbTpuJ5zC1RP?usp=sharing. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
... The ligand (distal helix) was then docked via a fast Fourier transform (FFT) algorithm on the 3D grids. The scoring function consists of interface atomic contact energies (IFACE) [48], shape complementarity and electrostatics with charges adopted from the CHARMM19 force field [49]. The initially generated 2 × 10³ poses were subjected to a culling process to eliminate those having no contacts with residues we specified in Table S1. ...
Preprint
Full-text available
Calcineurin (CaN) is a calcium-dependent phosphatase involved in numerous signaling pathways. Its activation by Ca2+ is in part driven by binding of calmodulin (CaM) to a CaM-recognition motif within the phosphatase's regulatory domain (RD); however, secondary interactions between CaM and the CaN regulatory domain may be necessary to fully activate CaN (Biochemistry 52.(2013), 8643-8651). Specifically, it has been shown that the CaN regulatory domain folds upon CaM binding and that there is a region C-terminal to the canonical CaM-binding region, the 'distal helix', that assumes an alpha helix fold and contributes to activation (Biochemistry 52.(2013), 8643-8651). We hypothesized in Dunlap et al (Biochemistry 52.(2013), 8643-8651) that this putative alpha helical distal helix is capable of binding CaM in a region distinct from the canonical CaM binding region (CaMBR) site, whereby CaN is activated. To test this hypothesis, we utilized molecular simulations including replica-exchange molecular dynamics, protein-protein docking and computational mutagenesis to model distal helix conformations. From these simulations we have isolated a potential binding site on CaM (site D) that facilitates moderate affinity inter-protein interactions that may attenuate CaN auto-inhibition. Further, molecular simulations of the distal helix A454E mutation demonstrated weakened distal helix/CaM interactions that were previously shown to impair CaN activity. K30E and G40D mutations of CaM at site D presented similar decreases in binding affinity predicted by simulations. The prediction was correlated with a phosphatase assay in which these two mutants show reduced CaN activity. This study therefore provides a potential structural basis for the role of secondary CaM/CaN interactions in mediating CaN activation.
... Molecular docking of Bid with the cytoplasmic domain of IRE1 was performed using ZDOCK software [16]. The docking poses were scanned using 12-degree rotations against protein 4U6R, and the top 100 hits were saved. ...
Article
IRE1 is a transmembrane signaling protein that activates the unfolded protein response under endoplasmic reticulum stress. IRE1 is endowed with kinase and endoribonuclease activities. The ribonuclease activity of IRE1 can switch substrate specificities to carry out atypical splicing of Xbp1 mRNA or trigger degradation of specific mRNAs. The mechanisms regulating the distinct ribonuclease activities of IRE1 have yet to be fully understood. Here, we report the Bcl-2 family protein Bid as a novel recruit of the IRE1 complex, which directly interacts with the cytoplasmic domain of IRE1. Bid binding to IRE1 leads to a decrease in IRE1 phosphorylation in a way that it can only perform Xbp1 splicing while mRNA degradation activity is repressed. The RNase outputs of IRE1 have been found to regulate the homeostatic-apoptotic switch. This study thus provides insight into IRE1-mediated cell survival.
... A total of 54,000 relative orientations that are evenly distributed in the rotational space are sampled in a dense docking mode. Detailed discussions of FFT-based molecular docking algorithms are available in References [15][16][17]. MDockPP can predict protein-RNA complex structures, protein-protein dimeric structures, as well as cyclic and dihedral symmetric oligomeric complex structures (see Note 5.1). ...
Chapter
Full-text available
HIV-1 integrase (IN) is a key enzyme that is essential for mediating the insertion of retroviral DNA into the host chromosome. IN also exhibits additional functions which are not fully elucidated, including its ability to bind to viral genomic RNA. Lack of binding of IN to RNA within the virions has been shown to be associated with production of morphologically defective virus particles. However, the exact structure of HIV-1 IN bound to RNA is not known. Based on studies showing that the C-terminal domain (CTD) of IN binds to the TAR RNA region, and on the observation that TAR and the host factor INI1 binding to IN-CTD are identical, we computationally modelled the IN-CTD/TAR complex structure. Computational modeling of nucleic acid binding to proteins is a valuable method to understand the macromolecular interaction when experimental methods of solving the complex structures are not feasible. The current model of the IN-CTD/TAR complex may facilitate further understanding of this interaction and may lead to therapeutic targeting of IN-CTD/RNA interactions to inhibit HIV-1 replication. Key words: RNA-protein interactions; Computational modeling; Protein-RNA docking; HIV-1; TAR RNA; INI1/SMARCB1
... Finally, clustering based on Root Mean Square Deviation (RMSD) was applied to each solution to discard redundant solutions. ZDOCK utilizes a grid-based representation and a 3D Fast Fourier Transform (FFT) to efficiently explore the rigid-body search space of docking positions [30]. ZDOCK results were further refined and rescored with FiberDock. ...
Article
Full-text available
The current study focuses on molecular cloning, expression and structural characterization of growth hormone-receptor (GHR) and its extracellular domain as growth hormone binding protein (GHBP) from the liver of Nili-Ravi buffalo (Bubalus bubalis; Bb). RNA was isolated, genes were amplified by reverse transcriptase-polymerase chain reaction and the sequence was characterized. The BbGHR sequence showed three amino acid variations in the extracellular domain when compared with Indian BbGHR. For the production of full length BbGHR and BbGHBP in Escherichia coli (E. coli) BL21 (RIPL) Codon Plus, expression plasmids were constructed under the control of the T7lac promoter and isopropyl β-D thiogalactopyranoside was used as an inducer. BbGHR and BbGHBP were expressed as inclusion bodies at ~ 40% and > 30% of the total E. coli proteins, respectively. The BbGHBP was solubilized and refolded by the dilution method using cysteine-cystine redox potential. The recombinant BbGHBP was purified and its biological activity was checked on HeLa cell lines, which showed increased cell proliferation in the presence of ovine GH (oGH), hence justifying the increase in the half-life of GH in the presence of BbGHBP. To study the molecular interactions of oGH-BbGHBP, multiple docking programs were employed, which showed high binding affinity and the presence of a large number of hydrogen bonds. Molecular dynamics studies performed to examine the stability of the proteins exhibited stable structures along with favorable molecular interactions. This study has described the sequence characterization of BbGHR in Nili-Ravi buffaloes and hence provided the basis for the assessment of GH-GHR binding in other Bovidae species.
... (targets) in BM5 (Vreven et al., 2015), we generated a total of 30,000 DMs with FTDock (Gabb et al., 1997), ZDock (Chen and Weng, 2002) and HADDOCK (de Vries et al., 2007; Dominguez et al., 2003) as detailed previously (Barradas-Bautista et al., 2022). The quality of the generated DMs was assessed following the CAPRI (Critical Assessment of PRedicted Interactions) protocol (Méndez et al., 2003), and consequently DMs were classified, in order of increasing quality, as Incorrect, Acceptable, Medium- and High-quality, as reported in Table S1. ...
Preprint
Full-text available
Protein-protein interactions drive many important biological events, such as infection, replication, and recognition. We need to access the molecular details of the interaction provided by experimental 3D structures to control or engineer such events. However, such experiments take time and are expensive; moreover, the current technology cannot keep up with the high discovery rate of new interactions. Computational modeling like protein-protein docking can help to fill this gap by generating docking poses. Protein-protein docking generally consists of two parts, sampling and scoring. The sampling is an exhaustive search of the tridimensional space. The caveat of the sampling is that it produces a large number of incorrect poses, yielding a highly unbalanced dataset. This limits the utility of the data to train machine learning classifiers. Using weak supervision, we developed a data augmentation method named hAIkal. Using hAIkal, we increased the labeled training data to train several algorithms. We trained and obtained different classifiers; the best classifier has 81% accuracy and 0.51 MCC on the test set, surpassing the state-of-the-art scoring functions.
... Because stochastic techniques rely on a specific element. In rigid protein docking, there are only six degrees of freedom, and systematic search techniques are frequently applied in programs such as DOT, GRAMM [86], and ZDOCK [87]. Higher-dimensional problems, such as flexible ligand-protein docking, are better suited to stochastic search approaches. ...
Chapter
Diabetes is one of the most common chronic and progressive diseases around the world and the number of people with diabetes is projected to rise in the next few years. Although many synthetic drugs are effective in controlling diabetes, their long-term usage has undesirable side effects. Hence, there is a growing scientific interest to replace synthetic drugs with natural drugs from medicinal plants. However, understanding the role of molecules and their interaction are necessary for the effective anti-diabetic drug discovery. Over the years, computational biology methods, such as molecular docking, simulations of biomolecules, computer-aided drug design (CADD), and multi-scale biological modeling have been developed in the drug discovery process. Therefore, the present chapter aimed to understand the characteristics of diabetes, the role of phytochemicals from natural sources in controlling diabetes, and the anti-diabetic mechanism of phytochemicals. The chapter also covers the computational screening techniques using phytochemicals in anti-diabetic drug discovery. Many phytochemicals from natural sources showed potential anti-diabetic properties. The use of CADD will speed up the drug discovery process, mitigate risk, and help to identify well-established targets with more innovative drugs in the future.
... The crystal structure of martentoxin was found in the Protein Data Bank (PDB ID: 1M2S) (32) and was used in protein-protein docking. A Fast Fourier Transform Correlation technique based protein-protein docking algorithm named ZDOCK was used to establish the BK channel-martentoxin complex (33). In order to get a finer conformational sampling and reach a more accurate prediction, a proper angular step size was set for the sampling of ligand orientations (6° in this study). ...
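The FFT correlation idea mentioned in this passage can be shown on a toy problem. The sketch below is a one-dimensional illustration only: ZDOCK itself works on 3D grids and repeats the translational scan for every sampled ligand rotation (6° steps in the passage above), and its scoring terms are not reproduced here. The function names and the tiny occupancy arrays are assumptions made for illustration. The point is that one forward and one inverse transform give the overlap score for every translation at once.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft_correlation(receptor, ligand):
    """score[s] = sum_i receptor[i] * ligand[(i - s) % n], for all shifts s."""
    n = len(receptor)
    fr = fft([complex(v) for v in receptor])
    fl = fft([complex(v) for v in ligand])
    prod = [a * b.conjugate() for a, b in zip(fr, fl)]
    return [v.real / n for v in fft(prod, inverse=True)]


def direct_correlation(receptor, ligand):
    """Same scores by brute force, used only to check the FFT route."""
    n = len(receptor)
    return [sum(receptor[i] * ligand[(i - s) % n] for i in range(n))
            for s in range(n)]


receptor = [0, 0, 1, 1, 0, 0, 0, 0]  # toy 1D "binding pocket"
ligand = [1, 1, 0, 0, 0, 0, 0, 0]    # toy 1D "ligand" at the origin
scores = fft_correlation(receptor, ligand)
best_shift = max(range(len(scores)), key=lambda s: scores[s])
print(best_shift)
```

The brute-force version costs on the order of n² operations per rotation, while the FFT route costs on the order of n log n, which is what makes an exhaustive translational scan affordable.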
Article
Background: Large conductance calcium-activated potassium channel (BK channel) is gated by both voltage and calcium ions and is widely distributed in excitable and nonexcitable cells. BK channel plays an important role in epilepsy and other diseases, but BK channel subtype-specific drugs are still extremely rare. Martentoxin was previously isolated from the venom of members of Scorpionidae and shown to be composed of 37 amino acids. Research has shown that the pharmacological selectivity of martentoxin to the BK channel is higher than that to other potassium channels. Therefore, it is of great significance to study the mechanism of interaction between martentoxin and BK channels. Methods: The three-dimensional structure of BK channel pore region was constructed by homologous modeling method, and the key amino acid sites of BK channel interaction with martentoxin were analyzed by protein-protein docking, molecular dynamic simulation and virtual alanine mutation. Results: Based on homologous modeling of BK channel pore structure and protein-protein docking analysis, Phe1, Lys28 and Arg35 of martentoxin were found to be key amino acids in toxin BK channel interaction. Conclusions: This study reveals the structural basis of martentoxin interaction with BK channel. These results will contribute to the design of BK channel specific blockers based on the structure of martentoxin.
... Nonetheless, at least one work has reported two L-pentapeptides as potential 3CL pro inhibitors by screening a 70,000-peptide library (Porto, 2021), using AutoDock Vina for the docking simulations (Trott and Olson, 2010). Remarkably, AutoDock Vina outperformed other freely-available docking algorithms, such as AutoDock and ZDOCK (Chen and Weng, 2002;Morris et al., 2009), in a benchmark study that presented a pipeline for peptide SBVS (Ansar and Vetrivel, 2019). ...
Article
The SARS-CoV-2 main protease, also known as 3-chymotrypsin-like protease (3CLpro), is a cysteine protease responsible for the cleavage of viral polyproteins pp1a and pp1ab, at least, at eleven conserved sites, which leads to the formation of mature nonstructural proteins essential for the replication of the virus. Due to its essential role, numerous studies have been conducted so far, which have confirmed 3CLpro as an attractive drug target to combat Covid-19 and have reported a vast number of inhibitors and their co-crystal structures with this protease. Despite all the ongoing efforts, D-peptides, which possess key advantages over L-peptides as therapeutic agents, have not been explored as potential drug candidates against 3CLpro. The current work fills this gap by reporting an in silico approach for the discovery of D-peptides capable of inhibiting 3CLpro that involves virtual screening of an in-house library of D-tripeptides and D-tetrapeptides into the protease active site and subsequent rescoring steps, including Molecular Mechanics Generalized-Born Surface Area (MM-GBSA) free energy calculations and molecular dynamics (MD) simulations. In vitro enzymatic assays conducted for the four top-scoring D-tetrapeptides at 20 μM showed that all of them caused a 55 to 85% inhibition of 3CLpro activity, thus highlighting the suitability of the devised approach. Overall, our results present a promising computational strategy to identify D-peptides capable of inhibiting 3CLpro, with broader application in problems involving protein inhibition.
... An ensemble of 3D structures was generated for every sequence to account for their flexibility. Initially, 1000s of tertiary structure models for the target sequence were created using the AbinitioRelax protocol from the Rosetta package 37. The models were clustered and structures from five clusters with the lowest energy were selected as input structures for the replica exchange Molecular Dynamics (REMD) simulations 29. ...
Article
Rapid design, screening, and characterization of biorecognition elements (BREs) is essential for the development of diagnostic tests and antiviral therapeutics needed to combat the spread of viruses such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To address this need, we developed a high-throughput pipeline combining in silico design of a peptide library specific for SARS-CoV-2 spike (S) protein and microarray screening to identify binding sequences. Our optimized microarray platform allowed the simultaneous screening of ~ 2.5 k peptides and rapid identification of binding sequences resulting in selection of four peptides with nanomolar affinity to the SARS-CoV-2 S protein. Finally, we demonstrated the successful integration of one of the top peptides into an electrochemical sensor with a clinically relevant limit of detection for S protein in spiked saliva. Our results demonstrate the utility of this novel pipeline for the selection of peptide BREs in response to the SARS-CoV-2 pandemic, and the broader application of such a platform in response to future viral threats.
... The crystal structures for ET B from the Protein Data Bank (PDB IDs: 5GLI and 5GLH for ligand-free hET B and ET-1-bound ET B , respectively) were used for modeling and docking analysis. The potential binding site in AG8 was limited to the extracellular region of the ET B structure, and the most stable binding site was determined using the "ZDOCK" function in Discovery Studio 2019 30 . ...
Article
Endothelin receptor A (ET A ), a class A G-protein-coupled receptor (GPCR), is involved in the progression and metastasis of colorectal, breast, lung, ovarian, and prostate cancer. We overexpressed and purified human endothelin receptor type A in Escherichia coli and reconstituted it with lipid and membrane scaffold proteins to prepare an ET A nanodisc as a functional antigen with a structure similar to that of native GPCR. By screening a human naive immune single-chain variable fragment phage library constructed in-house, we successfully isolated a human anti-ET A antibody (AG8) exhibiting high specificity for ET A in the β-arrestin Tango assay and effective inhibitory activity against the ET-1-induced signaling cascade via ET A using either a CHO-K1 cell line stably expressing human ET A or HT-29 colorectal cancer cells, in which AG8 exhibited IC 50 values of 56 and 51 nM, respectively. In addition, AG8 treatment repressed the transcription of inhibin βA and reduced the ET A -induced phosphorylation of protein kinase B and extracellular regulated kinase. Furthermore, tumor growth was effectively inhibited by AG8 in a colorectal cancer mouse xenograft model. The human anti-ET A antibody isolated in this study could be used as a potential therapeutic for cancers, including colorectal cancer.
... ZDOCK uses the fast Fourier transform algorithm for an efficient global docking on the 3D grid. ZDOCK also uses the combination of shape complementarity, electrostatic, and statistical potential for scoring the docked complex [55]. ...
Article
Peroxisome proliferator-activated receptor-gamma coactivator 1-alpha (PPARGC1A) regulates the expression of energy metabolism’s genes and mitochondrial biogenesis. The essential roles of PPARGC1A encouraged the researchers to assess the relation between metabolism-related diseases and its variants. To study Gly482Ser (+1564G/A) single-nucleotide polymorphism (SNP) after PPARGC1A modeling, we substitute Gly482 for Ser482. Stability prediction tools showed that this substitution decreases the stability of PPARGC1A or has a destabilizing effect on this protein. We then utilized molecular dynamics simulation of both the Gly482Ser variant and wild type of the PPARGC1A protein to analyze the structural changes and to reveal the conformational flexibility of the PPARGC1A protein. We observed loss flexibility in the RMSD plot of the Gly482Ser variant, which was further supported by a decrease in the SASA value in the Gly482Ser variant structure of PPARGC1A and an increase of H-bond with the increase of β-sheet and coil and decrease of turn in the DSSP plot of the Gly482Ser variant. Such alterations may significantly impact the structural conformation of the PPARGC1A protein, and it might also affect its function. It showed that the Gly482Ser variant affects the PPARGC1A structure and makes the backbone less flexible to move. In general, molecular dynamics simulation (MDS) showed more flexibility in the native PPARGC1A structure. Essential dynamics (ED) also revealed that the range of eigenvectors in the conformational space has lower extension of motion in the Gly482Ser variant compared with WT. The Gly482Ser variant also disrupts PPARGC1A interaction. Due to this single-nucleotide polymorphism in PPARGC1A, it became more rigid and might disarray the structural conformation and catalytic function of the protein and might also induce type 2 diabetes mellitus (T2DM), coronary artery disease (CAD), and nonalcoholic fatty liver disease (NAFLD). 
The results obtained from this study will assist wet lab research in expanding potent treatment on T2DM. 1. Introduction Peroxisome proliferator-activated receptor-G coactivator 1-alpha (PPARGC1A, PGC-1α, or PGC-1) is a transcriptional coactivator of peroxisome proliferator-activated receptor gamma (PPAR-γ), which regulates the energy metabolism’s genes and mitochondrial biogenesis [1, 2]. The nuclear receptor PPAR-γ enables PPARGC1A to interact with various transcription factors. PGC-1α also regulates the cAMP (cyclic adenosine monophosphate) response element-binding protein (CREB) and nuclear respiratory factors (NRFs). The PGC-1α protein is also associated with controlling blood pressure, cellular cholesterol homeostasis, and obesity [3, 4]. Thus, the PGC-1α encoding gene plays an essential role in cardiovascular and metabolic diseases. It also regulates the pathophysiological processes contributing to coronary artery disease (CAD) [5–7]. PGC-1α regulates the gene expression of mitochondrial fatty acid oxidation enzymes through interaction with peroxisome proliferator-activated receptor- (PPAR-) alpha in the heart, brown adipose tissue, and liver [8]. PGC-1α also increases glucose uptake in the muscles by regulating glucose transporter 4 [9]. In addition, it increases the gene expression of phosphoenolpyruvate carboxykinase and glucose-6-phosphatase, which is vital for hepatic gluconeogenesis [10]. These critical functions of PGC-1α in the regulation of adaptive cellular energy metabolism, vascular stasis, oxidative stress, and adipogenesis led to conducting a study on the relationship between PPARGC1A variation and a range of metabolism-related diseases [5]. Single-nucleotide polymorphisms (SNPs) are widely divided into two distinct clusters, synonymous (csSNPs) and nonsynonymous SNPs (nsSNPs) [11]. The nonsynonymous SNPs are further divided into missense mutations and nonsense mutations. 
The coding synonymous SNPs have a low effect over protein structure, while the nonsynonymous SNPs have a great impact on the protein structure and higher risk of diseases [12, 13]. Thus, they have particular importance for additional experimental assessment. In silico studies supply an efficient platform for analysis and evaluation of genetic mutations for their pathological consequence, and defining their underlying molecular mechanism [14–18]. The G to A substitution in exon 8 of the PGC-1a gene leads to the substitution of glycine with serine in codon 482 that reduces PGC-1a expression and PGC-1a protein activity [19]. In the present study, we surveyed the literature on the deleterious effect of SNP G>A Gly482Ser in the PPARGC1A protein coding region. Despite the controversial results of studies, many reports related the PPARGC1A gene’s polymorphisms to type 2 diabetes mellitus (T2DM), obesity, and hypertension [20]. Gly482Ser (+1564G/A) polymorphism is one of the most widely studied. Gly482Ser is the most critical and common PPARGC1A gene SNPs, corresponding to a missense variant in the coding sequence [6, 21]. Frequency of this variant in the gnomAD database is 0.3 with 12728 homozygous and 77425 heterozygous. From the beginning, many of the studies have reported associations of Gly482Ser (+1564G/A) variation with diabetic complications [22–25]. We also investigated the effect of Gly482Ser polymorphism on nonalcoholic fatty liver disease (NAFLD) and the risk of coronary artery disease (CAD) among patients with T2DM [26–28]. We then used computational studied to further investigate this polymorphism. After we modeled the structure of PPARGC1A protein, we substituted Gly482 with 482Ser and predicted the effect of Gly482Ser variant on the stability of PPARGC1A using prediction SNP tools. 
The molecular dynamics simulation (MDS) is a promising approach to examine the conformational changes in the Gly482Ser variant structure with respect to the native conformation [29–36]. Researches indicate that MDS can detect the changes in protein phenotype that significantly contribute towards confirming the damaging consequences of computationally predicted disease-associated mutations [16]. We focused on investigating that the changes in the dynamic behavior of PPARGC1A was induced by the pathogenic G>A Gly482Ser variant. Experimental studies have indicated that G>A Gly482Ser variant causes disease. We conducted MDS to reveal the conformational changes occurring in the Gly482Ser variant structure which may account for the observed molecular changes and the related pathological outcomes. The simulation also reveals the conformational flexibility of the Gly482Ser PPARGC1A variant to show how this variant affects the protein and pathogenesis of the related diseases. We also performed essential dynamics (ED) and molecular docking for the survey of this variant. In general, our results provide strong evidence of main conformational drift occurring in the Gly482Ser variant as compared to the native. 2. Materials and Methods PPARGC1A sequence data was collected from the national center for biological information (NCBI) protein sequence database. rs8192678 (+1564G>A Gly482Ser) SNP information for our computational analysis was retrieved from the dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/, access date: February 30, 2019) [37]. The studies were performed in the RCSB PDB (https://www.rcsb.org/) [38] and UniProt (https://www.uniprot.org/uniprot/O75369) [39] databases to find the suitable crystallographic structure of the PGC-1α protein with ID Q9UBK2. 2.1. Modeling of Protein, Modeling Evaluation, and SNP Creation After investigating RCSB PDB, the appropriate structure that includes the polymorphism site was not found so the structure of PGC-1α was modeled. 
Communitywide blind CASP experiments have indicated that the I-TASSER server can now create structural models with accuracy similar to the best human expert-guided modeling [40], and, compared with other online structure prediction tools, I-TASSER is notable for the reliability and accuracy of its full-length structure predictions for protein targets of varying difficulty and for its wide range of structure-based function predictions [41]. We therefore selected the I-TASSER (https://zhanglab.ccmb.med.umich.edu/I-TASSER/) [42] server for modeling the human PPARGC1A protein structure (798 aa). The quality of the modeled PPARGC1A protein structure was evaluated independently with VADAR version 1.8 (http://vadar.wishartlab.com/). Since our study concerned the Gly482Ser polymorphism, we replaced the glycine residue of the wild-type (WT) protein with a serine residue in the variant using the SPDB viewer [43]. The structures were minimized with the YASARA program [44] and used in the study. 2.2. Stability Analysis Using SNP Tools Since a missense polymorphism alters protein structure and function, we predicted protein stability. A number of recent studies have verified that implementing multiple bioinformatics tools and algorithms increases the accuracy of the results [45]. To evaluate the effect of the amino acid substitution at position 482 on the stability of wild-type PPARGC1A, we used the following stability predictor tools. MUpro is a set of machine learning programs that predicts protein stability changes from sequence data, which is especially useful when the tertiary structure is not available. This approach overcomes significant restrictions of previous approaches based on the tertiary structure [46]. The CUPSAT tool evaluates and predicts protein stability based on mutations [47]. DynaMut can perform rapid analysis of the protein stability and dynamics coming from alterations in vibrational entropy [48]. 
DUET also predicts the effect of point mutations on the protein stability through an embedded computational approach [49]. The mCSM calculates the consequences of missense polymorphisms on the stability of protein, protein-protein binding, and protein-DNA interaction [50]. SDM considers the amino acid substitutions of different structural conditions tolerated in the families of homologous proteins of specified 3D structures and converts them into possibility tables for amino acid substitution [51]. I-Mutant2.0 calculations are based on the protein structure or the protein sequence or are based on prediction of protein stability of missense variants [52]. PANTHER also predicts evolutionary evaluation of the coding SNPs [53]. To evaluate deleterious effect of the Gly482Ser variant on the interaction of the PPARGC1A protein, then to investigate the effect of the Gly482Ser variant on the PPARGC1A function and interaction, we performed molecular docking. 2.3. Protein-Protein Molecular Docking Protein-protein interactions have a significant role in different cellular processes and are also involved in various diseases. They are also a highly significant target for therapeutic interventions [54]. PPARGC1A is a transcriptional coactivator of peroxisome proliferator-activated receptor gamma (PPAR-γ), which regulates the energy metabolism’s genes and the mitochondrial biogenesis. The nuclear receptor PPAR-γ enables PPARGC1A to interact with various transcription factors [1, 2]. We employed ZDOCK (http://zdock.umassmed.edu/) to evaluate deleterious effect of the Gly482Ser variant on the interaction of the PPARGC1A protein with PPAR-γ. 292-403 amino acids from PPARGC1A were selected as PPAR-γ binding domain and 317, 351, 477, and 501 amino acids as interaction site of PPAR-γ. ZDOCK uses the fast Fourier transform algorithm for an efficient global docking on the 3D grid. 
ZDOCK also scores the docked complex with a combination of shape complementarity, electrostatics, and a statistical potential [55]. 2.4. Molecular Dynamics Simulation 2.4.1. MD Simulation This study was performed using the basic tools of GROMACS [56]. MDS was carried out with the parallel version of PME in the GROMACS program. Each structure was immersed in a dodecahedron-shaped box (, , and ) with a volume of 238.58 nm³. SPC/E water molecules were used to solvate the system. The nonbonded cutoff was set at 10 Å, and the nonbonded pair list was updated every 5 steps. The LINCS algorithm was applied to constrain all bonds involving hydrogen atoms during integration of the equations of motion [57]. MDS of PPARGC1A began with 1000 steps of energy minimization of the solvated system in a dodecahedron-shaped water box, with a distance of 1 Å between the protein periphery and the box edges. The system was neutralized by adding 15 Na+ ions. Molecular dynamics simulation was performed at 300 K (physiological temperature), using GROMACS 4.6.5 (http://www.gromacs.org/) and the GROMOS53a6 force field. Before the MDS run, the structures were heated to a temperature of 300 K and equilibrated for 100 ps under constant volume and temperature (NVT). Next, the system was switched to constant pressure and temperature (NPT) and equilibrated for 100 ps. Periodic boundary conditions were applied, the equations of motion were integrated using the leap-frog algorithm with a 2 fs time step, and structural snapshots were written every 500 steps [56]. Separate 50 ns MD simulations were run for the Gly482Ser variant and the native PPARGC1A. The cutoff radius of protein-solvent intramolecular hydrogen bonds was 0.3 nm. 2.4.2. 
Analysis of Molecular Dynamics Trajectories Structural deviation measures for the Gly482Ser variant and wild-type protein, namely root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), solvent accessible surface area, radius of gyration, hydrogen bonds, and protein secondary structure (DSSP), were computed using the g_rmsd, g_rmsf, g_sasa, g_gyrate, g_hbond, and do_dssp built-in functions of the GROMACS package. GRACE software was used to plot graphs (http://plasma-gate.weizmann.ac.il/Grace/) [58]. 2.5. Essential Dynamics Essential dynamics, also known as principal component analysis (PCA), can show the collective atomic motion of the wild-type and Gly482Ser variant proteins using GROMACS tools [59]. Principal component analysis was computed using the g_covar and g_anaeig built-in functions of the GROMACS package. PCA is a standard protocol for characterizing eigenvectors and projecting the trajectory onto the first two principal components, PC1 and PC2 [60]. 3. Results 3.1. Protein Modeling, Modeling Evaluation, and Replacement of Gly with Ser at Position 482 in PPARGC1A Modeling with I-TASSER gave five models. Model 3, which had the highest C-score, was selected for further studies. Evaluation of Model 3 by the VADAR server showed 94% of the amino acids of the modeled structure in the allowed area (Figure 1), meaning that this model is suitable for further study. Amino acid replacement was also done using the SPDB viewer. In the next step, the effect of the Gly482Ser polymorphism on the structure and function of PPARGC1A was assessed with SNP tools.
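Two of the trajectory measures named in this section, RMSD and RMSF, reduce to short formulas. The following is a minimal sketch, assuming every frame has already been superposed on the reference structure (the GROMACS tools perform that fit themselves). The function names and the two-frame toy trajectory are illustrative assumptions and are not the GROMACS implementation.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    n_atoms = len(coords)
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords, reference))
    return math.sqrt(total / n_atoms)


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for atom in range(n_atoms):
        mean = [sum(frame[atom][d] for frame in trajectory) / n_frames
                for d in range(3)]
        msf = sum(math.dist(frame[atom], mean) ** 2
                  for frame in trajectory) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy trajectory: two frames, two atoms; only the second atom moves.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame1 = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(frame1, frame0))
print(rmsf([frame0, frame1]))
```

RMSD summarizes how far a whole frame has drifted from the reference, while RMSF reports per-atom mobility over time, which is why the two are plotted separately.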
... The structure modeling of six proteins (Cas1, Cas2-3, Csy1, Csy2, Csy3, and Csy4) from Z. mobilis Type I-F CRISPR/Cas complex were performed by using the I-TASSER on-line server 42 respectively. The docking analyses of type I-F CRISPR/Cas complex were performed by using the ZDOCK program in BIOVIA Discovery Studio client 2020 43 . The circular visualization map of protein-protein interactions among six proteins (Cas1, Cas2-3, Csy1, Csy2, Csy3, and Csy4) from Z. mobilis Type I-F CRISPR/Cas complex was generated by using circlize package in R 44 . ...
Article
Characterizing protein–protein interactions (PPIs) is an effective method to help explore protein function. Here, through integrating a newly identified split human Rhinovirus 3 C (HRV 3 C) protease, super-folder GFP (sfGFP), and ClpXP-SsrA protein degradation machinery, we developed a fluorescence-assisted single-cell methodology (split protease-E. coli ClpXP (SPEC)) to explore protein–protein interactions for both eukaryotic and prokaryotic species in E. coli cells. We firstly identified a highly efficient split HRV 3 C protease with high re-assembly ability and then incorporated it into the SPEC method. The SPEC method could convert the cellular protein-protein interaction to quantitative fluorescence signals through a split HRV 3 C protease-mediated proteolytic reaction with high efficiency and broad temperature adaptability. Using SPEC method, we explored the interactions among effectors of representative type I-E and I-F CRISPR/Cas complexes, which combining with subsequent studies of Cas3 mutations conferred further understanding of the functions and structures of CRISPR/Cas complexes.
... Protein−Protein Docking and Interface RMSD Calculations. We used ZDOCK 34,35 to perform rigid docking with exhaustive and dense sampling, resulting in 54 000 docked poses for each complex. We then identified interface residues as those with a heavy atom within 5 Å of a heavy atom of any residue in the binding partner. ...
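The interface definition quoted here (residues with a heavy atom within 5 Å of a heavy atom of any residue in the binding partner) translates directly into code. This is a minimal sketch, assuming each chain is given as a dictionary from a residue label to a list of heavy-atom coordinates in Å; the data layout, the function name and the toy coordinates are assumptions for illustration.

```python
import math


def interface_residues(chain_a, chain_b, cutoff=5.0):
    """Return (interface residues of A, interface residues of B).

    chain_a, chain_b : dict mapping residue label -> list of (x, y, z)
                       heavy-atom coordinates in Angstrom.
    A residue is at the interface if any of its heavy atoms lies within
    `cutoff` of any heavy atom of any residue in the other chain.
    """
    def in_contact(atoms_1, atoms_2):
        return any(math.dist(p, q) <= cutoff for p in atoms_1 for q in atoms_2)

    iface_a, iface_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if in_contact(atoms_a, atoms_b):
                iface_a.add(res_a)
                iface_b.add(res_b)
    return iface_a, iface_b


# Toy example: only A1 and B1 are within 5 Angstrom of each other.
chain_a = {"A1": [(0.0, 0.0, 0.0)], "A2": [(20.0, 0.0, 0.0)]}
chain_b = {"B1": [(3.0, 0.0, 0.0)], "B2": [(0.0, 30.0, 0.0)]}
print(interface_residues(chain_a, chain_b))
```

The all-pairs loop is quadratic in the number of atoms. For the 54 000 poses per complex mentioned in the passage, a spatial grid or k-d tree would be the usual optimization.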
Article
Protein-protein interactions play a key role in mediating numerous biological functions, with more than half the proteins in living organisms existing as either homo- or hetero-oligomeric assemblies. Protein subunits that form oligomers minimize the free energy of the complex, but exhaustive computational search-based docking methods have not comprehensively addressed the challenge of distinguishing a natively bound complex from non-native forms. Current protein docking approaches address this problem by sampling multiple binding modes in proteins and scoring each mode, with the lowest-energy (or highest scoring) binding mode being regarded as a near-native complex. However, high-scoring modes often match poorly with the true bound form, suggesting a need for improvement of the scoring function. In this study, we propose a scoring function, KFC-E, that accounts for both conservation and coevolution of putative binding hotspot residues at protein-protein interfaces. We tested KFC-E on four benchmark sets of unbound examples and two benchmark sets of bound examples, with the results demonstrating a clear improvement over scores that examine conservation and coevolution across the entire interface.
... ZDOCK (https://zdock.umassmed.edu/) facilitates a global docking search on a 3D grid using the FFT algorithm via its user-friendly web interface, combining shape complementarity, electrostatics and statistical potential terms to score the complex structures (Chen and Weng, 2002). ZDOCK version 3.0.2 ...
Article
Protein-protein interactions are indispensable physiological processes regulating several biological functions. Despite the availability of structural information on protein-protein complexes, deciphering their complex topology remains an outstanding challenge. Raf kinase inhibitory protein (RKIP) has gained substantial attention as a favorable molecular target for numerous pathologies including cancer and Alzheimer’s disease. RKIP interferes with the RAF/MEK/ERK signaling cascade by endogenously binding with C-Raf (Raf-1 kinase) and preventing its activation. In the current investigation, the binding of RKIP with C-Raf was explored by knowledge-based protein-protein docking web-servers including HADDOCK and ZDOCK and a consensus binding mode of C-Raf/RKIP structural complex was obtained. Molecular dynamics (MD) simulations were further performed in an explicit solvent to sample the conformations for when RKIP binds to C-Raf. Some of the conserved interface residues were mutated to alanine, phenylalanine and leucine and the impact of mutations was estimated by additional MD simulations and MM/PBSA analysis for the wild-type (WT) and constructed mutant complexes. Substantial decrease in binding free energy was observed for the mutant complexes as compared to the binding free energy of WT C-Raf/RKIP structural complex. Furthermore, a considerable increase in average backbone root mean square deviation and fluctuation was perceived for the mutant complexes. Moreover, per-residue energy contribution analysis of the equilibrated simulation trajectory by HawkDock and ANCHOR web-servers was conducted to characterize the key residues for the complex formation. One residue each from C-Raf (Arg398) and RKIP (Lys80) were identified as the druggable “hot spots” constituting the core of the binding interface and corroborated by additional long-time scale (300 ns) MD simulation of Arg398Ala mutant complex. 
A notable conformational change in Arg398Ala mutant occurred near the mutation site as compared to the equilibrated C-Raf/RKIP native state conformation and an essential hydrogen bonding interaction was lost. The thirteen binding sites assimilated from the overall analysis were mapped onto the complex as surface and divided into active and allosteric binding sites, depending on their location at the interface. The acquired information on the predicted 3D structural complex and the detected sites aid as promising targets in designing novel inhibitors to block the C-Raf/RKIP interaction.
... In this docking study, we used the benchmark set from Chen et al. [60]. The set comprises 12 test systems with available unbound X-ray structures of both the protein receptor and the protein ligand. ...
Preprint
Structure prediction of protein-protein complexes is one of the most critical challenges in computational structural biology. It is often difficult to predict the complex structure, even for relatively rigid proteins. Modeling significant structural flexibility in protein docking remains an unsolved problem. This work demonstrates a protein-protein docking protocol with enhanced sampling that accounts for large-scale backbone flexibility. The docking protocol starts from unbound x-ray structures and is not using any binding site information. In docking, one protein partner undergoes multiple fold rearrangements, rotations, and translations during docking simulations, while the other protein exhibits small backbone fluctuations. Including significant backbone flexibility during the search for the binding site has been made possible using the CABS coarse-grained protein model and Replica Exchange Monte Carlo dynamics. In our simulations, we obtained acceptable quality models for the set of 12 protein-protein complexes, while for selected cases, models were close to high accuracy.
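Replica Exchange Monte Carlo, named in this abstract, relies on periodically attempting to swap configurations between replicas run at different temperatures. The sketch below shows only the standard Metropolis swap criterion. It assumes reduced units with a Boltzmann constant of 1, and the function names are hypothetical; it is not the CABS-based protocol of the cited work.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j, k_b=1.0):
    """Metropolis probability of exchanging the configurations of two replicas.

    P = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_b * T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted for this random draw."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


# The cold replica (T=1) holds the higher energy: the swap is always accepted.
print(swap_probability(10.0, 1.0, 5.0, 2.0))
# The cold replica already holds the lower energy: the swap is rarely accepted.
print(swap_probability(5.0, 1.0, 10.0, 2.0))
```

Swaps let low-energy conformations found at high temperature migrate to the low-temperature replicas, which is how the method escapes local minima during the docking search.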
... The protein-protein docking algorithm was applied for the construction of initial structure of the HECT-C2 complex. The HECT-peptide inhibitor complexes were also generated by the flexible docking method 30,31 . By applying the AMBER16 software package 35 , the initial structures of free HECT and its complexes with selected inhibitors were then corrected by means of sequential steps procedure, starting from a geometry optimisation followed by MD simulations. ...
Article
The C2-WW-HECT-domain E3 ubiquitin ligase SMURF2 emerges as an important regulator of diverse cellular processes. To date, SMURF2-specific modulators were not developed. Here, we generated and investigated a set of SMURF2-targeting synthetic peptides and peptidomimetics designed to stimulate SMURF2’s autoubiquitination and turnover via a disruption of the inhibitory intramolecular interaction between its C2 and HECT domains. The results revealed the effects of these molecules both in vitro and in cellulo at the nanomolar concentration range. Moreover, the data showed that targeting of SMURF2 with either these modifiers or SMURF2-specific shRNAs could accelerate cell growth in a cell-context-dependent manner. Intriguingly, a concomitant cell treatment with a selected SMURF2-targeting compound and the DNA-damaging drug etoposide markedly increased the cytotoxicity produced by this drug in growing cells. Altogether, these findings demonstrate that SMURF2 can be druggable through its self-destructive autoubiquitination, and inactivation of SMURF2 might be used to affect cell sensitivity to certain anticancer drugs.
... classical computers [7] without model simplification or cutting the number of residues involved in modeling based on experimental results. While determining the most optimal residues for binding models is a straightforward process if experimental data exists for binding kinetics of that protein, it is difficult to complete intuitively for proteins of which there is little known experimental binding behavior, as chemical formation kinetics is completed in non-deterministic, polynomial (NP) time on a classical device [7]. To solve this problem, a multitude of bioinformaticians have turned to tools that can complete molecular dynamics simulations of the proteins undergoing thermal relaxation in order to more effectively reveal their best binding sites. ...
Preprint
Determining an optimal protein configuration for the employment of protein binding analysis as completed by Temperature based Replica Exchange Molecular Dynamics (T-REMD) is an important process in the accurate depiction of a protein's behavior in different solvent environments, especially when determining a protein's top binding sites for use in protein-ligand and protein-protein docking studies. However, the completion of this analysis, which pushes out top binding sites through configurational changes, is a polynomial-state computational problem that can take multiple hours to compute, even on the fastest supercomputers. In this study, we aim to determine whether graph cutting, which provides approximate solutions to the MaxCut problem, can be used as a method to provide similar results to T-REMD in the determination of top binding sites of Surfactant Protein A (SP-A) for binding analysis. Additionally, we utilize a quantum-hybrid algorithm within Iff Technologies' Polar+ package using an actual quantum processor unit (QPU), an implementation of Polar+ using an emulated QPU, or Quantum Abstract Machine (QAM), on a large scale classical computing device, and an implementation of a classical MaxCut algorithm on a supercomputer in order to determine the types of advantages that can be gained through using a quantum computing device for this problem, or even using quantum algorithms on a classical device. This study found that Polar+ provides a dramatic speedup over a classical implementation of a MaxCut approximation algorithm or the use of GROMACS T-REMD, and produces viable results, in both its QPU and QAM implementations. However, the lack of direct configurational changes carried out onto the structure of SP-A after the use of graph cutting methods produces different final binding results than those produced by GROMACS T-REMD. 
Thus, further work needs to be completed into translating quantum-based probabilities into configurational changes based on a variety of noise conditions to better determine the accuracy advantage that quantum algorithms and quantum devices can provide in the near future.
... The ZDOCK (version 2.1) (http://zdock.umassmed.edu/) protein-protein rigid body docking program based on the fast Fourier transform (FFT) correlation techniques [34,35,36] was used in this study to search globally for all possible binding configurations between the ligands (RBD of CoV-2 and CoV-1) and the receptors (CD markers). During protein-protein docking, the default parameters of ZDOCK were applied. ...
Article
The coronavirus disease 2019 (COVID-19) is a highly contagious and rapidly spreading infection caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In some cases, the disease can be fatal, and it has resulted in more than one million deaths worldwide according to the WHO. Currently, there is no effective vaccine or treatment for COVID-19; however, many small-molecule inhibitors have shown potent antiviral activity against SARS-CoV-2 and some of them are now under clinical trials. Despite their promising activities, the development of these small molecules for clinical use can be limited by many factors, such as off-target effects, poor stability, and low bioavailability. The clusters of differentiation CD147, CD209, CD299 have been identified as essential entry co-receptors for SARS-CoV-2 species specificity to humans, although the underlying mechanisms are yet to be fully elucidated. In this paper, protein-protein docking was utilized for identifying the critical epitopes in CD147, CD209 and CD299 which are involved in the binding with SARS-CoV-2 Spike receptor binding domain (RBD). The results of binding free energies showed a high affinity of SARS-CoV-2 RBD to CD299 receptor which was used as a reference to derive hypothetical peptide sequences with specific binding activities to SARS-CoV-2 RBD. Molecular docking and molecular dynamics simulations of the newly designed peptides showed favorable binding features and stability with SARS-CoV-2 RBD and therefore can be further considered as potential candidates in future anti-SARS CoV-2 drug discovery studies.
... Findings of ZDOCK Server highlighted that S1-CTD exhibited strong interaction with IgG2a LA5 as depicted by its Z-dock score (2268) followed by IgA (1525), and ACE2 (1457) [37,51]. Further, docking stability of these proteins was checked in terms of ∆G (kcal mol⁻¹) and Kd (M) [41,42] that followed the order: IgG2a LA5 (∆G: -21.1, Kd: 1.0 × 10⁻¹⁶) > IgA (-20.2, 1.5 × 10⁻¹⁵) > ACE2 (-18.7, ...
Article
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a ruthless killer of the human population and highly transmissible, has become a big threat to public health by spreading one of the most infectious coronavirus diseases (COVID-19). Vaccine production is of paramount importance at present, although it is a gradual and time-consuming process. Since the situation demands immediate prevention, we hypothesized the utility of IgG2a LA5 and IgA antibodies developed inside the body after vaccination to assess their protective effects as non-specific immunity against SARS-CoV-2. Identifying the vaccine for repurposing, we considered the C-terminal domain of spike protein (S1-CTD) and envelope (E) protein for molecular interactions with the aforesaid antibodies using computational and bioinformatics tools in order to elucidate its practicality and applicability. Our in silico findings exhibited the involvement of S1-CTD and E-protein hotspot residues as key players in molecular interaction with IgG2a LA5 and IgA and exhibited better binding efficiency (more negative ∆G and lower Kd values) in comparison to their cognate host receptors (ACE2 and MPP5). Detailed hotspot residue analysis of S1-CTD and E-protein with IgG2a LA5 and IgA indicates that the existing vaccine could be used as a preventive measure against SARS-CoV-2.
... Chen and Weng developed ZDOCK as a user-friendly docking server based on the rigid-body docking algorithm and as a simple interface to predict structures of protein complexes in the year 2002 [93]. Since its development, the ZDOCK server has undergone major transformations to improve its docking efficiency, which includes the upgrades of the docking algorithm, docking accuracy, running time, user interface and integrated viewing of the structures and descriptions during the analysis of results [80]. ...
Article
Aptamers are single-stranded DNA or RNA oligonucleotides generated by SELEX that exhibit binding affinity and specificity against a wide variety of target molecules. Compared to RNA aptamers, DNA aptamers are much more stable and therefore are widely adopted in a number of applications, especially in diagnostics. The tediousness and rigor associated with certain steps of SELEX have intensified efforts to adopt in silico molecular docking approaches together with in vitro SELEX procedures in developing DNA aptamers. Inspired by these endeavors, we present an overview of the in silico molecular docking approaches in DNA aptamer generation, detailing the stepwise procedures as well as shedding some light on the various software tools used. The in silico maturation strategy and the limitations of the in silico approaches are also underscored.
Article
Soluble methane monooxygenase (sMMO) oxidizes a wide range of carbon feedstocks (C1 to C8) directly using intracellular NADH and is a useful means in developing green routes for industrial manufacturing of chemicals. However, the high-throughput biosynthesis of active recombinant sMMO and the ensuing catalytic oxidation have so far been unsuccessful due to the structural and functional complexity of sMMO, comprised of three functionally complementary components, which remains a major challenge for its industrial applications. Here we develop a catalytically active miniature of sMMO (mini-sMMO), with a turnover frequency of 0.32 s⁻¹, through an optimal reassembly of minimal and modified components of sMMO on catalytically inert and stable apoferritin scaffold. We characterise the molecular characteristics in detail through in silico and experimental analyses and verifications. Notably, in-situ methanol production in a high-cell-density culture of mini-sMMO-expressing recombinant Escherichia coli resulted in higher yield and productivity (~ 3.0 g/L and 0.11 g/L/h, respectively) compared to traditional methanotrophic production.
Chapter
In biology, it remains challenging to predict interactions between proteins and DNA or RNA. When it comes to nucleic acids, existing methods of binding site identification or interaction prediction are inefficient, especially in minor cases, such as aptamer binding. In order to predict NA-protein interactions, we use a deep-learning framework called dMaSIF. To this end, we modified the atom encoding module to reflect atom positions and relationships more precisely and used parallel calculation to optimize the training process. The framework showed effectiveness on two tasks: identifying NA binding sites and predicting NA-protein interactions. This approach can thereby be used to find potential NA binding sites, to perform NA-protein docking and virtual screening, etc.
Preprint
Soluble methane monooxygenase (sMMO) oxidizes a wide range of carbon feedstocks (C1 to C8) directly using intracellular NADH and is a useful means in developing green routes for industrial manufacturing of chemicals; however, the high-throughput biosynthesis of active recombinant sMMO and the ensuing catalytic oxidation have so far been unsuccessful due to the structural and functional complexity of sMMO, comprised of three functionally complementary enzyme components, which remains a major challenge for its industrial applications. Here we developed a catalytically active miniature of sMMO (mini-sMMO) through an optimal reassembly of minimal and modified components of sMMO on a catalytically inert and stable apoferritin scaffold, and demonstrated its molecular characteristics in detail through in silico and experimental analyses and verifications. Notably, the in-situ methanol production in the high-cell-density culture of mini-sMMO-expressing recombinant Escherichia coli resulted in a remarkably higher productivity compared to the traditional methanotrophic production.
Article
Rhomboid proteases hydrolyze substrate helices within the lipid bilayer to release soluble domains from the membrane. Here, we investigate the mechanism of activity regulation for this unique but wide-spread protein family. In the model rhomboid GlpG, a lateral gate formed by transmembrane helices TM2 and TM5 was previously proposed to allow access of the hydrophobic substrate to the shielded hydrophilic active site. In our study, we modified the gate region and either immobilized the gate by introducing a maleimide-maleimide (M2M) crosslink or weakened the TM2/TM5 interaction network through mutations. We used solid-state nuclear magnetic resonance (NMR), molecular dynamics (MD) simulations, and molecular docking to investigate the resulting effects on structure and dynamics on the atomic level. We find that variants with increased dynamics at TM5 also exhibit enhanced activity, whereas introduction of a crosslink close to the active site strongly reduces activity. Our study therefore establishes a strong link between the opening dynamics of the lateral gate in rhomboid proteases and their enzymatic activity.
Article
Rhynchophorus ferrugineus is a quarantine pest that mainly damages plants in tropical regions, which are essential economic resources. Cry3Aa has been used to control coleopteran pests and is known to be toxic to R. ferrugineus. The binding of the Cry toxin to specific receptors on the target insect plays a crucial role in the toxicological mechanism of Cry toxins. However, in the case of R. ferrugineus, the nature and identity of the receptor proteins involved remain unknown. In the present study, pull-down assays and mass spectrometry were used to identify two aminopeptidase N proteins (RfAPN2a and RfAPN2b) in the larval midguts of R. ferrugineus. Cry3Aa was able to bind to RfAPN2a (Kd = 108.5 nM) and RfAPN2b (Kd = 68.2 nM), as well as midgut brush border membrane vesicles (Kd = 482.5 nM). In silico analysis of both RfAPN proteins included the signal peptide and anchored sites for glycosyl phosphatidyl inositol. In addition, RfAPN2a and RfAPN2b were expressed in the human embryonic kidney 293T cell line, and cytotoxicity assays showed that the transgenic cells were not susceptible to activated Cry3Aa. Our results show that RfAPN2a and RfAPN2b are Cry3Aa-binding proteins involved in the Cry3Aa toxicity of R. ferrugineus. This study deepens our understanding of the action mechanism of Cry3Aa in R. ferrugineus larvae.
Article
This paper aims to understand the binding strategies of a nanobody-protein pair by studying known complexes. Rigid body protein-ligand docking programs produce several complexes, called decoys, which are good candidates with high scores of shape complementarity, electrostatic interactions, desolvation, buried surface area, and Lennard-Jones potentials. However, the decoy that corresponds to the native structure is not known. We studied 36 nanobody-protein complexes from the single domain antibody database, sd-Ab DB, http://www.sdab-db.ca/. For each structure, a large number of decoys are generated using the Fast Fourier Transform algorithm of the software ZDOCK. The decoys were ranked according to their target protein-nanobody interaction energies, calculated by using the Dreiding Force Field, with rank 1 having the lowest interaction energy. Out of 36 protein data bank (PDB) structures, 25 true structures were predicted as rank 1. Eleven of the remaining structures required Ångstrom size rigid body translations of the nanobody relative to the protein to match the given PDB structure. After the translation, the Dreiding interaction (DI) energies of all complexes decreased and became rank 1. In one case, rigid body rotations as well as translations of the nanobody were required for matching the crystal structure. We used a Monte Carlo algorithm that randomly translates and rotates the nanobody of a decoy and calculates the DI energy. Results show that rigid body translations and the DI energy are sufficient for determining the correct binding location and pose of ZDOCK created decoys. A survey of the sd-Ab DB showed that each nanobody makes at least one salt bridge with its partner protein, indicating that salt bridge formation is an essential strategy in nanobody-protein recognition. Based on the analysis of the 36 crystal structures and evidence from existing literature, we propose a set of principles that could be used in the design of nanobodies.
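The Monte Carlo refinement described in this abstract, which randomly translates and rotates the nanobody of a decoy and re-evaluates the interaction energy, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `toy_energy` is a placeholder 12-6 pair potential with unit parameters standing in for the Dreiding force field, and the function names, step sizes, and temperature factor are hypothetical.

```python
import math
import random


def toy_energy(receptor, ligand):
    """Placeholder pair energy (1/r^12 - 2/r^6, minimum -1 at r = 1).

    Stand-in for the Dreiding interaction energy; not the real force field.
    """
    total = 0.0
    for a in receptor:
        for b in ligand:
            r2 = sum((x - y) ** 2 for x, y in zip(a, b))
            inv6 = 1.0 / max(r2, 1e-6) ** 3
            total += inv6 * inv6 - 2.0 * inv6
    return total


def rotate_about_centroid(coords, axis, angle):
    """Rigidly rotate points about their centroid (Rodrigues' formula)."""
    n = len(coords)
    c = [sum(p[i] for p in coords) / n for i in range(3)]
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for p in coords:
        v = [p[i] - c[i] for i in range(3)]
        kxv = [k[1] * v[2] - k[2] * v[1],
               k[2] * v[0] - k[0] * v[2],
               k[0] * v[1] - k[1] * v[0]]
        kdv = sum(k[i] * v[i] for i in range(3))
        out.append(tuple(c[i] + v[i] * cos_a + kxv[i] * sin_a
                         + k[i] * kdv * (1.0 - cos_a) for i in range(3)))
    return out


def refine(receptor, ligand, steps=2000, max_shift=0.2, max_angle=0.1,
           kT=0.1, seed=0):
    """Metropolis search over rigid-body moves; returns the lowest-energy pose."""
    rng = random.Random(seed)
    current, e_cur = list(ligand), toy_energy(receptor, ligand)
    best, e_best = current, e_cur
    for _ in range(steps):
        shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        trial = rotate_about_centroid(current, axis,
                                      rng.uniform(-max_angle, max_angle))
        trial = [tuple(p[i] + shift[i] for i in range(3)) for p in trial]
        e_try = toy_energy(receptor, trial)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_try <= e_cur or rng.random() < math.exp(-(e_try - e_cur) / kT):
            current, e_cur = trial, e_try
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best
```

Calling `refine(receptor_atoms, nanobody_atoms)` on lists of `(x, y, z)` tuples returns the best pose found and its energy; with a real force field, only `toy_energy` would need to be replaced.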
Preprint
This paper aims to understand the binding strategies of a nanobody-protein pair by studying known complexes. Rigid body protein-ligand docking programs produce several complexes, called decoys, which are good candidates with high scores of shape complementarity, electrostatic interactions, desolvation, buried surface area, and Lennard-Jones potentials. It is not known which decoy represents the true structure. We studied thirty-seven nanobody-protein complexes from the Single Domain Antibody Database, sd-Ab DB, http://www.sdab-db.ca/. For each structure, a large number of decoys are generated using the Fast Fourier Transform algorithm of the software ZDOCK. The decoys were ranked according to their target protein-nanobody interaction energies, calculated by using the Dreiding Force Field, with rank 1 having the lowest interaction energy. Out of thirty-six PDB structures, twenty-five true structures were predicted as rank 1. Eleven of the remaining structures required Ångstrom size rigid body translations of the nanobody relative to the protein to match the given PDB structure. After the translation the Dreiding interaction (DI) energies of all complexes decreased and became rank 1. In one case, rigid body rotations as well as translations of the nanobody were required for matching the crystal structure. We used a Monte Carlo algorithm that randomly translates and rotates the nanobody of a decoy and calculates the DI energy. Results show that rigid body translations and the DI energy are sufficient for determining the correct binding location and pose of ZDOCK created decoys. A survey of the sd-Ab DB showed that each nanobody makes at least one salt bridge with its partner protein, indicating that salt bridge formation is an essential strategy in nanobody-protein recognition. Based on the analysis of the thirty-six crystal structures and evidence from existing literature, we propose a set of principles that could be used in the design of nanobodies.
Article
O-N-Acetylglucosamine transferase (OGT) can catalyze the O-GlcNAc modification of thousands of proteins. The formation of a holoenzyme between OGT and an adaptor protein is the precondition for further recognition and glycosylation of the target protein, while the corresponding mechanism remains unclear. Here, static and dynamic schemes based on statistics can successfully screen the feasible identifying, approaching, and binding mechanism of OGT and its typical adaptor protein p38α. The most favorable interface, energy contribution of hotspots, and conformational changes of fragments were discovered. The hydrogen bond interactions were verified as the main driving force for the whole process. The distinct characteristic of active and inactive p38α is explored and demonstrates that the phosphorylated tyrosine and threonine will form strong ion-pair interactions with Lys714, playing a key role in the dynamic identification stage. Combining multiple methods from different points of view may be helpful for exploring other systems of protein-protein interactions.
Chapter
In recent years, therapeutic use of antibodies has seen a huge growth, due to their inherent properties and technological advances in the methods used to study and characterize them. Effective design and engineering of antibodies for therapeutic purposes are heavily dependent on knowledge of the structural principles that regulate antibody–antigen interactions. Several experimental techniques such as X-ray crystallography, cryo-electron microscopy, NMR, or mutagenesis analysis can be applied, but these are usually expensive and time-consuming. Therefore computational approaches like molecular docking may offer a valuable alternative for the characterization of antibody–antigen complexes. Here we describe a protocol for the prediction of the 3D structure of antibody–antigen complexes using the integrative modelling platform HADDOCK. The protocol consists of (1) the identification of the antibody residues belonging to the hypervariable loops which are known to be crucial for the binding and can be used to guide the docking and (2) the detailed steps to perform docking with the HADDOCK 2.4 webserver following different strategies depending on the availability of information about epitope residues. Key words: Antibody, HADDOCK, Information-driven docking
Chapter
Antibody and TCR modeling are becoming important as more and more sequence data becomes available to the public. One of the pressing questions now is how to use such data to understand adaptive immune responses to disease. Infectious disease is of particular interest because the antigens driving such responses are often known to some extent. Here, we describe tips for gathering data and cleaning it for use in downstream analysis. We present a method for high-throughput structural modeling of antibodies or TCRs using Repertoire Builder and its extensions. AbAdapt is an extension of Repertoire Builder for antibody–antigen docking from antibody and antigen sequences. ImmuneScape is a corresponding extension for TCR–pMHC 3D modeling. Together, these pipelines can help researchers to understand immune responses to infection from a structural point of view. Key words: Antibody, Sequence, Structure modeling, T-cell receptor
Article
The pulmonary surfactant protein A (SP-A) is a constitutively expressed immune-protective collagenous lectin (collectin) in the lung. It binds to the cell membrane of immune cells and opsonizes infectious agents such as bacteria, fungi, and viruses through glycoprotein binding. SARS-CoV-2 enters airway epithelial cells by ligating the Angiotensin Converting Enzyme 2 (ACE2) receptor on the cell surface using its Spike glycoprotein (S protein). We hypothesized that SP-A binds to the SARS-CoV-2 S protein and this binding interferes with ACE2 ligation. To study this hypothesis, we used a hybrid quantum and classical in silico modeling technique that utilized protein graph pruning. This graph pruning technique determines the best binding sites between amino acid chains by utilizing the Quantum Approximate Optimization Algorithm (QAOA)-based MaxCut (QAOA-MaxCut) program on a Noisy Intermediate-Scale Quantum (NISQ) device. In this, the angles between every neighboring three atoms were Fourier-transformed into microwave frequencies and sent to a quantum chip that identified the chemically irrelevant atoms to eliminate based on their chemical topology. Using the Universal Protein Resource (UniProt) database, we confirmed that the remaining residues contained all the potential binding sites in the molecules. QAOA-MaxCut was compared with GROMACS with T-REMD using AMBER, OPLS, and CHARMM force fields to determine the differences in preparing a protein structure for docking, as well as with Goemans-Williamson, the best classical algorithm for MaxCut. The relative binding affinity of potential interactions between the pruned protein chain residues of SP-A and SARS-CoV-2 S proteins was assessed by the ZDOCK program. Our data indicate that SP-A could ligate the S protein with a similar affinity to the ACE2-Spike binding. 
Interestingly, however, the results suggest that the most tightly bound SP-A binding site is localized to the S2 chain, in the fusion region of the SARS-CoV-2 S protein that is responsible for cell entry. Based on these findings, we speculate that SP-A may not directly compete with ACE2 for the binding site on the S protein, but interferes with viral entry to the cell by hindering necessary conformational changes or the fusion process.
Article
The study of the interaction of plant lectins with the target intestinal receptors of the weaning piglets by the in silico computer modelling method showed that the most probable targets for the lectin binding are the divalent metal transporter 1 (DMT1) and neutral amino acid transporter. Barley lectins have the greatest potential for interaction. Legume-type lectins and the OS9-like protein are the main types of reactive lectins. The addition of specific carbohydrates is the optimal method for neutralization of lectins.
Article
Small ubiquitin-related modifier (SUMO) proteins are efficiently used to target the soluble expression of various difficult-to-express proteins in E. coli. However, their utilization in large-scale protein production is restricted by the higher cost of Ulp, which is required to cleave the SUMO fusion tag from the protein of interest to generate an authentic N-terminus. This study identified and characterized two novel SUMO proteases, Ulp1 and Ulp2, from Schizosaccharomyces pombe. Codon-optimized gene sequences were cloned and expressed in E. coli. The sequence and structure of SpUlp1 and SpUlp2 catalytic domains were deduced using bioinformatics tools. Protein-protein interaction studies predicted the higher affinity of SpUlp1 towards SUMO compared to its counterpart from Saccharomyces cerevisiae (ScUlp1). The catalytic domain of SpUlp1 was purified using Ni-NTA chromatography with 83.33% recovery yield. Moreover, in vitro activity data further confirmed the fast-acting nature of the SpUlp1 catalytic domain, where a 90% cleavage of fusion proteins was obtained within 1 h of incubation, indicating novelty and commercial relevance of S. pombe Ulp1. Biophysical characterization showed 8.8% α-helices, 36.7% β-sheets in SpUlp1SD. From thermal CD and fluorescence data, SpUlp1SD Tm was found to be 45 °C. Further, bioprocess optimization using fed-batch cultivation resulted in 3.5 g/L of SpUlp1SD production with YP/X of 77.26 mg/g DCW and volumetric productivity of 205.88 mg/L/h.
Article
The protective innate immune response of β-amyloid peptide (Aβ) has been indicated as a risk factor for Alzheimer's disease (AD) due to the rapid amyloidosis. In order to obtain molecular-level insights into the protective and pathogenic roles of Aβ, the binding modes between Aβ1-42 and either the envelope glycoprotein D (gD) of Herpes simplex virus-1 (HSV-1) or Aβ1-42 were theoretically investigated by using molecular docking, molecular dynamics (MD) simulations and binding free energy decomposition methods in the present study. The Aβ1-42 stably binds to the envelope gD via intermolecular hydrogen bonds and van der Waals (vdW) interactions. The Aβ1-42 acquires its equilibrium with higher fluctuation amplitude and a better structured C-terminal in the HSV-1 gD–Aβ1-42 complex compared to that in the Aβ1-42–Aβ1-42 complex. The amino acid residues of Aβ1-42 involved in the formation of the Aβ1-42 dimer are fully free and accessible in the HSV-1 gD–Aβ1-42 complex. It is favorable for the Aβ1-42 monomer to interact with the HSV-1 gD–Aβ1-42 complex. This may be responsible for the rapid amyloidosis, which entraps the herpesvirus as well as causing AD.
Article
Visualization of the interfacial electrostatic complementarity (VIINEC) is a recently developed method for analyzing protein–protein interactions using electrostatic potential (ESP) calculated via the ab initio fragment molecular orbital method. In this Letter, the molecular interactions of the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein with human angiotensin-converting enzyme 2 (ACE2) and B38 neutralizing antibody were examined as an illustrative application of VIINEC. The results of VIINEC revealed that the E484 of RBD has a role in making a local electrostatic complementarity with ACE2 at the protein–protein interface, while it causes a considerable repulsive electrostatic interaction. Furthermore, the calculated ESP map at the interface of the RBD/B38 complex was significantly different from that of the RBD/ACE2 complex, which is discussed herein in association with the mechanism of the specificity of the antibody binding to the target protein.
Article
The complement system is designed to recognise and eliminate invading pathogens via activation of classical, alternative and lectin pathways. Human properdin stabilises the alternative pathway C3 convertase, resulting in an amplification loop that leads to the formation of C5 convertase, thereby acting as a positive regulator of the alternative pathway. It has been noted that human properdin on its own can operate as a pattern recognition receptor and exert immune functions outside its involvement in complement activation. Properdin can bind directly to microbial targets via DNA, sulfatides and glycosaminoglycans, apoptotic cells, nanoparticles, and well-known viral virulence factors. This study was aimed at investigating the complement-independent role of properdin against Influenza A virus (IAV) infection. We show here that IAV-challenged neutrophils, which are among the first immune cells to arrive at the site of IAV infection, released properdin in a time-dependent manner. Properdin was found to directly interact with the haemagglutinin (HA), neuraminidase (NA) and matrix 1 (M1) proteins of IAV in ELISA and western blot. Furthermore, modelling studies revealed that properdin could bind HA and NA of the H1N1 subtype with higher affinity compared to that of H3N2 due to the presence of an HA cleavage site in H1N1. In an infection assay using A549 cells, properdin suppressed viral replication in pH1N1 subtype while promoting replication of H3N2 subtype, as revealed by qPCR analysis of M1 transcripts. Properdin treatment triggered an anti-inflammatory response in H1N1-challenged A549 cells and a pro-inflammatory response in H3N2-infected cells, as evident from differential mRNA expression of TNF-α, NF-κB, IFN-α, IFN-β, IL-6, IL-12 and RANTES. Properdin treatment also reduced luciferase reporter activity in MDCK cells transduced with H1N1 pseudotyped lentiviral particles; however, it was increased in the case of pseudotyped H3N2 particles. 
Collectively, we conclude that infiltrating neutrophils at the site of IAV infection can release properdin, which then acts as an entry inhibitor for pandemic H1N1 subtype while suppressing viral replication and inducing an anti-inflammatory response. H3N2 subtype can escape this immune restriction due to altered haemagglutinin and neuraminidase, leading to enhanced viral entry, replication and pro-inflammatory response. Thus, depending on the subtype, properdin can either limit or aggravate IAV infection in the host.
Article
The efficacies of three short synthetic antifungal peptides were tested for their inhibitory action on the pathogenic fungus Aspergillus flavus. The sequences of the short synthetic peptides are PPD1 (FRLHF), 66-10 (FRLKFH), and 77–3 (FRLKFHF). These test peptides inhibited fungal growth and showed a membranolytic activity. The fungal biomass and ergosterol levels were significantly lower in peptide-treated samples. Further, the fungal cell wall component chitin was also found to be lower in peptide-treated samples. Scanning electron microscopic images also showed highly wrinkled fungal mycelia. Significant membrane permeabilisation as well as potassium ion leakage was also observed in fungal samples treated with peptides. To assess the membrane damage, the uptake of Sytox green dye was employed. At the tested concentration, peptides induced fungal membrane damage as evidenced by the green fluorescence. Further, these peptides induced oxidative stress in A. flavus as evidenced by an increase in ROS production and malondialdehyde levels and an increase in the antioxidant enzymes superoxide dismutase and catalase, with a concomitant decrease in the reduced glutathione content. Additionally, a growth-dependent reduction in aflatoxin levels was also observed in peptide-treated samples. Docking studies on the interaction of the peptides with a trans-membrane protein calcium ATPase of A. flavus showed that all the peptides were able to bind to the protein with high z rank score. The activity of the calcium ATPase was significantly decreased in peptide-treated fungal samples, thereby validating the docking results. Among all the tested peptides, the 77–3 peptide exhibited the maximal membrane damage property.
Article
We present a rapidly executable minimal binding energy model for molecular docking and use it to explore the energy landscape in the vicinity of the binding sites of four different enzyme inhibitor complexes. The structures of the complexes are calculated starting with the crystal structures of the free monomers, using DOCK 4.0 to generate a large number of potential configurations, and screening with the binding energy target function. In order to investigate possible correlations between energy and variation from the native structure, we introduce a new measure of similarity, which removes many of the difficulties associated with root mean square deviation. The analysis uncovers energy gradients, or funnels, near the binding site, with decreasing energy as the degree of similarity between the native and docked structures increases. Such energy funnels can increase the number of random collisions that may evolve into productive stable complex, and indicate that short-range interactions in the precomplexes can contribute to the association rate. The finding could provide an explanation for the relatively rapid association rates that are observed even in the absence of long-range electrostatic steering. Proteins 1999; 34:255–267. © 1999 Wiley-Liss, Inc.
Article
Software systems that automatically predict whether and how two proteins may interact are highly desirable, both for understanding biological processes and for the rational design of new proteins. As part of a future complete solution to this problem, a bundle of programs is presented, designed (i) to estimate initial docking positions for a given pair of docking candidates, (ii) to adjust them, and (iii) to filter them, thus preparing for more detailed computations of free energies. The system is evaluated on a test set of 51 co-crystallized complexes, with the aim of redocking the subunits. It works completely automatically, and the evaluation is performed using a single set of parameters for all complexes in the test set. The number of solutions is fixed at 50 positions, with a median CPU time of 26 min. For 30 complexes, these contain a near-correct solution with root mean square deviation (RMSD) ≤ 5.0 Å, which is ranked first in five cases. For all complexes, the best solution is ranked 16th in the worst case and has a median RMSD of 4.3 Å. As an alternative to this initial estimation of docking positions, a global sampling of rotations was tested. Whereas this yields top-ranked solutions with RMSD ≤ 3.0 Å for all 51 complexes, the median CPU time increases to 11 h. This shows that such blind sampling is not feasible for most applications. The system and its components are available on request from the authors. Contact: friedric@techfak.uni-bielefeld or posch@techfak.uni-bielefeld.de
Article
A geometric recognition algorithm was developed to identify molecular surface complementarity. It is based on a purely geometric approach and takes advantage of techniques applied in the field of pattern recognition. The algorithm involves an automated procedure including (i) a digital representation of the molecules (derived from atomic coordinates) by three-dimensional discrete functions that distinguish between the surface and the interior; (ii) the calculation, using Fourier transformation, of a correlation function that assesses the degree of molecular surface overlap and penetration upon relative shifts of the molecules in three dimensions; and (iii) a scan of the relative orientations of the molecules in three dimensions. The algorithm provides a list of correlation values indicating the extent of geometric match between the surfaces of the molecules; each of these values is associated with six numbers describing the relative position (translation and rotation) of the molecules. The procedure is thus equivalent to a six-dimensional search but much faster by design, and the computation time is only moderately dependent on molecular size. The procedure was tested and validated by using five known complexes for which the correct relative position of the molecules in the respective adducts was successfully predicted. The molecular pairs were deoxyhemoglobin and methemoglobin, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. A more realistic test was performed with the last two pairs by using the structures of uncomplexed aspartic proteinase and trypsin inhibitor, respectively. The results are indicative of the extent of conformational changes in the molecules tolerated by the algorithm.
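The correlation step (ii) can be illustrated with a minimal sketch, reduced to one dimension for brevity. The grid values (1 for surface cells, -15 for receptor interior cells), the 16-cell cyclic grid, and all function names are illustrative assumptions rather than values from the paper, and the small recursive FFT stands in for whatever transform routine a real implementation would use.

```python
import cmath

def fft(values, inverse=False):
    """Unnormalized recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    sign = 1 if inverse else -1
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def correlation_scores(ligand, receptor):
    """score[s] = sum_x ligand[x] * receptor[(x + s) % n], via the correlation theorem."""
    n = len(ligand)
    f_lig, f_rec = fft(ligand), fft(receptor)
    # For real-valued grids, the transform of the correlation is conj(F_lig) * F_rec.
    product = [f_lig[k].conjugate() * f_rec[k] for k in range(n)]
    return [v.real / n for v in fft(product, inverse=True)]

# Toy 1-D "molecules" on a 16-cell cyclic grid (all values are assumptions).
receptor = [0.0] * 16
receptor[4] = 1.0                      # surface cell, left side
receptor[5:8] = [-15.0, -15.0, -15.0]  # interior cells: penetration is penalized
receptor[8:10] = [1.0, 1.0]            # surface cells, right side
ligand = [1.0, 1.0, 1.0] + [0.0] * 13  # three occupied cells

scores = correlation_scores(ligand, receptor)
best_shift = max(range(len(scores)), key=lambda s: scores[s])
print(best_shift, round(scores[best_shift], 6))
```

A single transform pair scores every translation at once, which is why only the rotational scan in (iii) needs an explicit loop. In three dimensions the same identity holds with 3-D transforms in place of the 1-D ones used here.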
Article
Crystallization of the 1:1 molecular complex between the beta-lactamase TEM-1 and the beta-lactamase inhibitory protein BLIP has provided an opportunity to put a stringent test on current protein-docking algorithms. Prior to the successful determination of the structure of the complex, nine laboratory groups were given the refined atomic coordinates of each of the native molecules. Other than the fact that BLIP is an effective inhibitor of a number of beta-lactamase enzymes (KI for TEM-1 approximately 100 pM) no other biochemical or structural data were available to assist the practitioners in their molecular docking. In addition, it was not known whether the molecules underwent conformational changes upon association or whether the inhibition was competitive or non-competitive. All six of the groups that accepted the challenge correctly predicted the general mode of association of BLIP and TEM-1.
Article
A comprehensive nonredundant database of 475 cocrystallized protein-protein complexes was used to study low-resolution recognition, which was reported in earlier docking experiments with a small number of proteins. The docking program GRAMM was used to delete the atom-size structural details and systematically dock the resulting molecular images. The results reveal the existence of low-resolution recognition in 52% of all complexes in the database and in 76% of the 113 complexes with an interface area > 4,000 Å². Limitations of the docking and analysis tools used in this study suggest that the actual number of complexes with low-resolution recognition is higher. However, the results already prove the existence of low-resolution recognition on a broad scale.
Article
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
Article
Two large-scale yeast two-hybrid screens were undertaken to identify protein-protein interactions between full-length open reading frames predicted from the Saccharomyces cerevisiae genome sequence. In one approach, we constructed a protein array of about 6,000 yeast transformants, with each transformant expressing one of the open reading frames as a fusion to an activation domain. This array was screened by a simple and automated procedure for 192 yeast proteins, with positive responses identified by their positions in the array. In a second approach, we pooled cells expressing one of about 6,000 activation domain fusions to generate a library. We used a high-throughput screening procedure to screen nearly all of the 6,000 predicted yeast proteins, expressed as Gal4 DNA-binding domain fusion proteins, against the library, and characterized positives by sequence analysis. These approaches resulted in the detection of 957 putative interactions involving 1,004 S. cerevisiae proteins. These data reveal interactions that place functionally unclassified proteins in a biological context, interactions between proteins involved in the same biological function, and interactions that link biological functions together into larger cellular processes. The results of these screens are shown here.
Article
A new computationally efficient and automated “soft docking” algorithm is described to assist the prediction of the mode of binding between two proteins, using the three-dimensional structures of the unbound molecules. The method is implemented in a software package called BiGGER (Bimolecular Complex Generation with Global Evaluation and Ranking) and works in two sequential steps. First, the complete six-dimensional binding space of both molecules is systematically searched. A population of candidate protein-protein docked geometries is thus generated and selected on the basis of the geometric complementarity and amino acid pairwise affinities between the two molecular surfaces. Most of the conformational changes observed during protein association are treated in an implicit way, and test results are equally satisfactory regardless of whether one starts from the bound or the unbound forms of known structures of the interacting proteins. In contrast to other methods, the entire molecular surfaces are searched during the simulation, using absolutely no additional information regarding the binding sites. In a second step, an interaction scoring function is used to rank the putative docked structures. The function incorporates interaction terms that are thought to be relevant to the stabilization of protein complexes. These include geometric complementarity of the surfaces, explicit electrostatic interactions, desolvation energy, and pairwise propensities of the amino acid side chains to contact across the molecular interface. The relative functional contribution of each of these interaction terms to the global scoring function has been empirically adjusted through a neural network optimizer using a learning set of 25 protein-protein complexes of known crystallographic structures. 
In 22 out of 25 protein-protein complexes tested, near-native docked geometries were found with Cα RMS deviations ≤ 4.0 Å from the experimental structures, of which 14 were found within the 20 top ranking solutions. The program works on widely available personal computers and takes 2 to 8 hours of CPU time to run any of the docking tests herein presented. Finally, the value and limitations of the method for the study of macromolecular interactions, not yet revealed by experimental techniques, are discussed. Proteins 2000;39:372–384. © 2000 Wiley-Liss, Inc.
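The near-native criterion quoted above rests on a root mean square deviation between matched atoms. A minimal sketch follows, assuming the two coordinate lists are already paired atom by atom and expressed in a common frame (no superposition is performed here); the function names are hypothetical, and the default threshold simply mirrors the 4.0 Å figure in this abstract.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def is_near_native(predicted, reference, threshold=4.0):
    """True if the predicted placement lies within `threshold` Å RMSD of the reference."""
    return rmsd(predicted, reference) <= threshold

# Toy example: every atom displaced by the vector (3, 4, 0), so each moves 5 Å.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
predicted = [(x + 3.0, y + 4.0, z) for x, y, z in reference]
print(rmsd(predicted, reference), is_near_native(predicted, reference))
```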
Article
Rigid-body methods, particularly Fourier correlation techniques, are very efficient for docking bound (co-crystallized) protein conformations using measures of surface complementarity as the target function. However, when docking unbound (separately crystallized) conformations, the method generally yields hundreds of false positive structures with good scores but high root mean square deviations (RMSDs). This paper describes a two-step scoring algorithm that can discriminate near-native conformations (with less than 5 Å RMSD) from other structures. The first step includes two rigid-body filters that use the desolvation free energy and the electrostatic energy to select a manageable number of conformations for further processing, but are unable to eliminate all false positives. Complete discrimination is achieved in the second step that minimizes the molecular mechanics energy of the retained structures, and re-ranks them with a combined free-energy function which includes electrostatic, solvation, and van der Waals energy terms. After minimization, the improved fit in near-native complex conformations provides the free-energy gap required for discrimination. The algorithm has been developed and tested using docking decoys, i.e., docked conformations generated by Fourier correlation techniques. The decoy sets are available on the web for testing other discrimination procedures. Proteins 2000;40:525–537. © 2000 Wiley-Liss, Inc.
Article
CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a highly flexible computer program which uses empirical energy functions to model macromolecular systems. The program can read or model build structures, energy minimize them by first- or second-derivative techniques, perform a normal mode or molecular dynamics simulation, and analyze the structural, equilibrium, and dynamic properties determined in these calculations. The operations that CHARMM can perform are described, and some implementation details are given. A set of parameters for the empirical energy function and a sample run are included.
Article
Docking algorithms simulate protein-protein association in molecular assemblies such as protease-inhibitor or antigen-antibody complexes by reconstituting the complexes from their component molecules. They not only efficiently retrieve native structures but also select a number of non-native structures with structural and physicochemical features that were assumed to be unique to the native complexes. Some of these ‘false positives’ may deserve further examination in experimental studies of protein-protein recognition.
Article
Here we examine shape complementarity as a criterion in protein-protein docking and binding. Specifically, we examine the quality of shape complementarity as a critical determinant not only in the docking of 26 "bound" (complexed) protein-protein cases but, in particular, of 19 "unbound" protein-protein cases, where the structures have been determined separately. In all cases, entire molecular surfaces are utilized in the docking, with no consideration of the location of the active site, or of particular residues/atoms in either the receptor or the ligand which participate in the binding. To evaluate the goodness of the strictly geometry-based shape complementarity in the docking process as compared to the main favorable and unfavorable energy components, we study systematically a potential correlation between each of these components and the RMSD of the "unbound" protein-protein cases. Specifically, we examine the non-polar buried surface area, polar b...
Chapter
In 1998, members of the Research Collaboratory for Structural Bioinformatics became the managers of the Protein Data Bank archive. This chapter details the systems used for the deposition, annotation and distribution of the data in the archive. This chapter is also available as HTML from the International Tables Online site hosted by the IUCr.
Article
A computationally tractable strategy has been developed to refine protein-protein interfaces that models the effects of side-chain conformational change, solvation and limited rigid-body movement of the subunits. The proteins are described at the atomic level by a multiple copy representation of side-chains modelled according to a rotamer library on a fixed peptide backbone. The surrounding solvent environment is described by "soft" sphere Langevin dipoles for water that interact with the protein via electrostatic, van der Waals and field-dependent hydrophobic terms. Energy refinement is based on a two-step process in which (1) a probability-based conformational matrix of the protein side-chains is refined iteratively by a mean field method. A side-chain interacts with the protein backbone and the probability-weighted average of the surrounding protein side-chains and solvent molecules. The resultant protein conformations then undergo (2) rigid-body energy minimization to relax the protein interface. Steps (1) and (2) are repeated until convergence of the interaction energy. The influence of refinement on side-chain conformation starting from unbound conformations found improvement in the RMSD of side-chains in the interface of protease-inhibitor complexes, and shows that the method leads to an improvement in interface geometry. In terms of discriminating between docked structures, the refinement was applied to two classes of protein-protein complex: five protease-protein inhibitor and four antibody-antigen complexes. A large number of putative docked complexes have already been generated for the test systems using our rigid-body docking program, FTDOCK. They include geometries that closely resemble the crystal complex, and therefore act as a test for the refinement procedure. 
In the protease-inhibitors, geometries that resemble the crystal complex are ranked in the top four solutions for four out of five systems when solvation is included in the energy function, against a background of between 26 and 364 complexes in the data set. The results for the antibody-antigen complexes are not as encouraging, with only two of the four systems showing discrimination. It would appear that these results reflect the somewhat different binding mechanism dominant in the two types of protein-protein complex. Binding in the protease-inhibitors appears to be "lock and key" in nature. The fixed backbone and mobile side-chain representation provide a good model for binding. Movements in the backbone geometry of antigens on binding represent an "induced-fit" and provides more of a challenge for the model. Given the limitations of the conformational sampling, the ability of the energy function to discriminate between native and non-native states is encouraging. Development of the approach to include greater conformational sampling could lead to a more general solution to the protein docking problem.
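Step (1) of the refinement, the iterative update of a probability-based conformational matrix, can be sketched as a generic mean-field iteration. This is a toy under stated assumptions, not the authors' implementation: the energies, the value of kT, the synchronous update, and the helper names `mean_field` and `toy_pair_energy` are all hypothetical, and solvent terms and the rigid-body step (2) are left out.

```python
import math

def mean_field(self_energy, pair_energy, kT=0.6, n_iter=50):
    """Iterate rotamer probabilities p[i][r] for each residue i and rotamer r.

    self_energy[i][r]: energy of rotamer r of residue i with the fixed backbone.
    pair_energy(i, r, j, s): energy between rotamer r of i and rotamer s of j.
    """
    n = len(self_energy)
    probs = [[1.0 / len(e)] * len(e) for e in self_energy]  # start uniform
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            effective = []
            for r in range(len(self_energy[i])):
                e = self_energy[i][r]
                # Probability-weighted average over the other residues' rotamers.
                for j in range(n):
                    if j != i:
                        for s, p in enumerate(probs[j]):
                            e += p * pair_energy(i, r, j, s)
                effective.append(e)
            e_min = min(effective)  # shift for numerical stability
            weights = [math.exp(-(e - e_min) / kT) for e in effective]
            total = sum(weights)
            updated.append([w / total for w in weights])
        probs = updated
    return probs

def toy_pair_energy(i, r, j, s):
    """Hypothetical energies: rotamers 0/0 clash, rotamers 1/1 pack well."""
    if r == 0 and s == 0:
        return 5.0
    if r == 1 and s == 1:
        return -3.0
    return 0.0

probs = mean_field([[0.0, 0.0], [0.0, 0.0]], toy_pair_energy)
print([[round(p, 3) for p in row] for row in probs])
```

In this toy both residues settle on rotamer 1, because the averaged field from the partner penalizes the clashing pair and rewards the well-packed one.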
Article
Molecular recognition is achieved through the complementarity of molecular surface structures and energetics with, most commonly, associated minor conformational changes. This complementarity can take many forms: charge-charge interaction, hydrogen bonding, van der Waals' interaction, and the size and shape of surfaces. We describe a method that exploits these features to predict the sites of interactions between two cognate molecules given their three-dimensional structures. We have developed a “cube representation” of molecular surface and volume which enables us not only to design a simple algorithm for a six-dimensional search but also to allow implicitly the effects of the conformational changes caused by complex formation. The present molecular docking procedure may be divided into two stages. The first is the selection of a population of complexes by geometric “soft docking”, in which surface structures of two interacting molecules are matched with each other, allowing minor conformational changes implicitly, on the basis of complementarity in size and shape, close packing, and the absence of steric hindrance. The second is a screening process to identify a subpopulation with many favorable energetic interactions between the buried surface areas. Once the size of the subpopulation is small, one may further screen to find the correct complex based on other criteria or constraints obtained from biochemical, genetic, and theoretical studies, including visual inspection.
Article
A program is described for drawing the van der Waals surface of a protein molecule. An extension of the program permits the accessibility of atoms, or groups of atoms, to solvent or solute molecules of specified size to be quantitatively assessed. As defined in this study, the accessibility is proportional to surface area. The accessibilities of all atoms in the twenty common amino acids in model tripeptides of the type Ala-X-Ala are given for a defined conformation. The accessibilities are also given for all atoms in ribonuclease-S, lysozyme and myoglobin. Internal cavities are defined and discussed. Various summaries of these data are provided. Forty to fifty per cent of the surface area of each protein is occupied by non-polar atoms. The actual numerical results are sensitive to the values chosen for the van der Waals radii of the various groups. Since there is uncertainty over the correct values for these radii, the derived numbers should only be used as a qualitative guide at this stage. The average change in accessibility for the atoms of all three proteins in going from a hypothetical extended chain to the folded conformation of the native protein is about a factor of 3. This number applies to both polar (nitrogen and oxygen) and non-polar (carbon and sulfur) atoms considered separately. The larger non-polar amino acids tend to be more “buried” in the native form of all three proteins. However, for all classes and for residues within a given class the accessibility changes on folding tend to be highly variable.
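The idea of accessibility to a probe of specified size can be sketched numerically. The program described in this abstract is not reproduced here; as a stand-in, the sketch below uses point sampling on probe-expanded spheres (the Shrake-Rupley style of approach), which is a different technique for the same quantity. The 1.4 Å probe radius, the 1.8 Å atom radius, the number of sample points, and the function names are assumptions.

```python
import math

def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral construction)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        ring = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((ring * math.cos(phi), ring * math.sin(phi), z))
    return points

def accessible_areas(atoms, probe=1.4, n_points=200):
    """atoms: list of ((x, y, z), radius). Returns the accessible area per atom in Å^2."""
    unit = sphere_points(n_points)
    areas = []
    for i, (center_i, radius_i) in enumerate(atoms):
        expanded = radius_i + probe  # locus of the probe centre around atom i
        exposed = 0
        for ux, uy, uz in unit:
            point = (center_i[0] + expanded * ux,
                     center_i[1] + expanded * uy,
                     center_i[2] + expanded * uz)
            # A sample point is buried if it falls inside any other expanded sphere.
            buried = any(
                math.dist(point, center_j) < radius_j + probe
                for j, (center_j, radius_j) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * expanded ** 2 * exposed / n_points)
    return areas

isolated = accessible_areas([((0.0, 0.0, 0.0), 1.8)])[0]
pair = accessible_areas([((0.0, 0.0, 0.0), 1.8), ((2.0, 0.0, 0.0), 1.8)])
print(round(isolated, 2), [round(a, 2) for a in pair])
```

As the abstract cautions, the numbers depend directly on the radii chosen, so the output of a sketch like this is a qualitative guide only.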
Article
The prediction of protein-protein interactions in solution is a major goal of theoretical structural biology. Here, we implement a continuum description of the thermodynamic processes involved. The model differs considerably from previous models in its use of "molecular surface" area to describe the hydrophobic component to the free energy of conformational change in solution. We have applied this model to a data set of alternative docked conformations of protein-protein complexes which were generated independently of this work. It was found previously that commonly used energy evaluation techniques fail to distinguish between near-native and certain non-native complexes in this data set. Here, we found that an energy function that takes into account (1) total electrostatic free energy, (2) hydrophobic free energy and (3) loss in side-chain conformational energy was able to reliably discriminate between near-native and non-native configurations but only when molecular surface is used as a descriptor of the hydrophobic effect. It is shown that the molecular surface and the more conventional surface descriptor "solvent accessible surface" give very different quantitative measures of hydrophobicity. In terms of the contribution of different energy components to the free energy of complex formation it was found that loss in side-chain conformational entropy is a second order effect. Electrostatic interaction energy (which is commonly used to score docked conformations) was a poor indicator of complementarity when starting from unbound conformations. It was found that electrostatic desolvation energy and the hydrophobic contribution (based on a molecular surface area descriptor) are much less sensitive to local fluctuations in atomic structure than point-to-point interaction energies and thus may be more suited for use as a scoring function when docking unbound conformations, where atomic complementarity is much less apparent. 
Whilst a combined energy function was able to distinguish near-native from non-native conformations in the six systems studied here, it remains to be determined to what extent more sizeable conformational changes would influence the results.
Article
The fundamental event in biological assembly is association of two biological macromolecules. Here we present a successful, accurate ab initio prediction of the binding of uncomplexed lysozyme to the HyHel5 antibody. The prediction combines pseudo Brownian Monte Carlo minimization with a biased-probability global side-chain placement procedure. It was effected in an all-atom representation, with ECEPP/2 potentials complemented with the surface energy, side-chain entropy and electrostatic polarization free energy. The near-native solution found was surprisingly close to the crystallographic structure (root-mean-square deviation of 1.57 Å for all backbone atoms of lysozyme) and had a considerably lower energy (by 20 kcal mol⁻¹) than any other solution.
Article
Rigid-body docking of two molecules involves matching of their surfaces. A successful docking methodology considers two key issues: molecular surface representation and matching. While approaches to the problem differ, they all employ certain surface geometric features. Although surface normals are routinely created with molecular surfaces, their use has, surprisingly, been almost completely overlooked. Here we show how the normals to the surface, at specific, well-placed points, can play a critical role in molecular docking. If the points for which the normals are calculated represent the molecular surfaces faithfully and accurately, the normals can substantially improve the efficiency of the docking in a number of ways. The normals can drastically reduce the combinatorial complexity of receptor-ligand docking. Furthermore, they can serve as a powerful filter in screening for quality docked conformations. Below we show how, by deploying such a straightforward device, which is easy to calculate, large protein molecules are docked in unparalleled short times and with a manageable number of potential solutions. Such results are highly desirable considering that (1) we dock two large protein molecules, including several large immunoglobulin-lysozyme complexes; (2) we use the entire molecular surfaces, without predefining the active sites or epitopes of either the ligand or the receptor; (3) the docking is completely automated, without any labelling or pre-specification of the input structural database; and (4) a single set of parameters is used, without any further tuning whatsoever. This approach is specifically geared towards matching the surfaces of large protein molecules and is not applicable to small-molecule drugs.
Article
We have developed a geometry-based suite of processes for molecular docking. The suite consists of a molecular surface representation, a docking algorithm, and a surface inter-penetration and contact filter. The surface representation is composed of a sparse set of critical points (with their associated normals) positioned at the face centers of the molecular surface, providing a concise yet representative set. The docking algorithm is based on the Geometric Hashing technique, which indexes the critical points with their normals in a transformation invariant fashion preserving the multi-element geometric constraints. The inter-penetration and surface contact filter features a three-layer scoring system, through which docked models with high contact area and low clashes are funneled. This suite of processes enables a pipelined operation of molecular docking with high efficacy. Accurate and fast docking has been achieved with a rich collection of complexes and unbound molecules, including protein-protein and protein-small molecule associations. An energy evaluation routine assesses the intermolecular interactions of the funneled models obtained from the docking of the bound molecules by pairwise van der Waals and Coulombic potentials. Applications of this routine demonstrate the goodness of the high scoring, geometrically docked conformations of the bound crystal complexes.
Article
A new statistic Sc, which has a number of advantages over other measures of packing, is used to examine the shape complementarity of protein/protein interfaces selected from the Brookhaven Protein Data Bank. It is shown using Sc that antibody/antigen interfaces as a whole exhibit poorer shape complementarity than is observed in other systems involving protein/protein interactions. This result can be understood in terms of the fundamentally different evolutionary history of particular antibody/antigen associations compared to other systems considered, and in terms of the differing chemical natures of the interfaces.
Article
In a blind test of protein-docking algorithms, six groups used different methods to predict the structure of a protein complex. All six predicted structures were close enough to the experimental complex to be useful; nevertheless, several important details of the experimental complex were missed or only partially predicted.
Article
A long sought goal in the physical chemistry of macromolecular structure, and one directly relevant to understanding the molecular basis of biological recognition, is predicting the geometry of bimolecular complexes from the geometries of their free monomers. Even when the monomers remain relatively unchanged by complex formation, prediction has been difficult because the free energies of alternative conformations of the complex have been difficult to evaluate quickly and accurately. This has forced the use of incomplete target functions, which typically do no better than to provide tens of possible complexes with no way of choosing between them. Here we present a general framework for empirical free energy evaluation and report calculations, based on a relatively complete and easily executable free energy function, that indicate that the structures of complexes can be predicted accurately from the structures of monomers, including close sequence homologues. The calculations also suggest that the binding free energies themselves may be predicted with reasonable accuracy. The method is compared to an alternative formulation that has also been applied recently to the same data set. Both approaches promise to open new opportunities in macromolecular design and specificity modification.
Article
A geometric docking algorithm based upon correlation analysis for quantification of geometric complementarity between protein molecular surfaces in close interfacial contact has been developed by a detailed optimization of the conformational search of the algorithm. In order to reduce the entire conformation space search required by the method a physico-chemical pre-filter of conformation space has been developed based upon the a priori assumption that two or more intermolecular hydrogen bonds are intrinsic to the mechanism of binding within protein complexes. Donor sites are defined spatially and directionally by the positions of explicitly calculated donor hydrogen atoms, and the vector space within a defined range about the donor atom-hydrogen atom bond vector. Acceptor sites are represented spatially and directionally by the van der Waals molecular surface points having normal vectors within a predefined range of vector space about the acceptor atom covalent bond vector(s). Geometric conditions necessary for the simultaneous hydrogen bonding interaction between both sites of functionally congruent hydrogen bonding site pairs, located on the individual proteins, are then tested on the basis of a transformation invariant parameterization of the site pair spatial and directional properties. Sterically acceptable conformations defined by interaction of functionally, spatially, and directionally compatible site pairs are then refined to a maximum contact of complementary contact surfaces using the simplex method for the angular search and correlation techniques for the translational search. The utility of the spatial and directional properties of hydrogen bonding donor and acceptor sites for the identification of candidate docking conformations is demonstrated by the reliable preliminary reduction of conformation space, the improved geometric ranking of the minimum RMS conformations of some complexes and the overall reduction of CPU time obtained.
Article
We estimated effective atomic contact energies (ACE), the desolvation free energies required to transfer atoms from water to a protein's interior, using an adaptation of a method introduced by S. Miyazawa and R. L. Jernigan. The energies were obtained for 18 different atom types, which were resolved on the basis of the way their properties cluster in the 20 common amino acids. In addition to providing information on atoms at the highest resolution compatible with the amount and quality of data currently available, the method itself has several new features, including its reference state, the random crystal structure, which removes compositional bias, and a scaling factor that makes contact energies quantitatively comparable with experimentally measured energies. The high level of resolution, the explicit accounting of the local properties of protein interiors during determination of the energies, and the very high computational efficiency with which they can be assigned during any computation, should make the results presented here widely applicable. First we used ACE to calculate the free energies of transferring side-chains from protein interior into water. A comparison of the results thus obtained with the measured free energies of transferring side-chains from n-octanol to water, indicates that the magnitude of protein to water transfer free energies for hydrophobic side-chains is larger than that of n-octanol to water transfer free energies. The difference is consistent with observations made by D. Shortle and co-workers, who measured differential free energies of protein unfolding for site-specific mutants in which Ala or Gly was substituted for various hydrophobic side-chains. A direct comparison (calculated versus observed free energy differences) with those experiments finds slopes of 1.15 and 1.13 for Gly and Ala substitutions, respectively. Finally we compared calculated and observed binding free energies of nine protease-inhibitor complexes. 
This requires a full free energy function, which is created by adding direct electrostatic interactions and an appropriate entropic component to the solvation free energy term. The calculated free energies are typically within 10% of the observed values. Taken collectively, these results suggest that ACE should provide a reasonably accurate and rapidly evaluatable solvation component of free energy, and should thus make accessible a range of docking, design and protein folding calculations that would otherwise be difficult to perform.
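As a rough illustration of why contact energies of this kind are cheap to evaluate, the sketch below sums a pairwise energy over all atom pairs across an interface that lie within a distance cutoff. The function name `contact_energy`, the toy two-type energy table, and the 6 Å cutoff are assumptions made for this example; they are not the 18-type parameters described in the abstract.

```python
import numpy as np

def contact_energy(coords_a, types_a, coords_b, types_b, table, cutoff=6.0):
    """Sum table[type_i, type_j] over atom pairs (i in A, j in B) closer than cutoff.

    coords_*: (n, 3) arrays of coordinates in Angstrom.
    types_*:  integer atom-type indices into `table`.
    table:    symmetric (n_types, n_types) array of contact energies (hypothetical values).
    cutoff:   contact distance in Angstrom (an assumed value for this sketch).
    """
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    types_a = np.asarray(types_a)
    types_b = np.asarray(types_b)
    table = np.asarray(table, dtype=float)

    # All pairwise distances between atoms of A and atoms of B.
    dist = np.linalg.norm(coords_a[:, None, :] - coords_b[None, :, :], axis=-1)
    in_contact = dist < cutoff

    # Energy for every (i, j) pair, looked up by atom type.
    pair_energy = table[types_a[:, None], types_b[None, :]]
    return float(pair_energy[in_contact].sum())

# Toy example: one atom of type 0 near one of two type-1 atoms.
toy_table = [[-1.0, 0.5], [0.5, 2.0]]
energy = contact_energy([[0, 0, 0]], [0], [[3, 0, 0], [10, 0, 0]], [1, 1], toy_table)
```

Because the score is a plain sum of table lookups, it can be evaluated for very large numbers of candidate complexes at little cost.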
Article
We show that a rapidly executable computational procedure provides the basis for a predictive understanding of antigenic peptide side chain specificity, for binding to class I major histocompatibility complex (MHC) molecules. The procedure consists of a combined search to identify the joint conformations of peptide side chains and side chains comprising the MHC pocket, followed by conformational selection, using a target function, based on solvation energies and modified electrostatic energies. The method was applied to the B pocket region of five MHC molecules, which were chosen to encompass the full range of specificities displayed by anchors at peptide position 2. These were a medium hydrophobic residue (Leu or Met) for HLA-A*0201, a basic residue (Arg or Lys) for HLA-B*2705; a small hydrophobic residue (Val) for HLA-A*6801, an acidic residue (Glu) for HLA-B*4001 and a bulky residue (Tyr) for H-2K(d). The observed anchors are correctly predicted in each case. The agreement for HLA-B40 and H-2K(d) is especially promising, since their structures have not yet been determined experimentally. Because the experimental determination of motifs by elution is difficult and these calculations take only hours on a high speed workstation, the results open the possibility of routine determination of motifs computationally.
Article
We report a new free energy decomposition that includes structure-derived atomic contact energies for the desolvation component, and show that it applies equally well to the analysis of single-domain protein folding and to the binding of flexible peptides to proteins. Specifically, we selected the 17 single-domain proteins for which the three-dimensional structures and thermodynamic unfolding free energies are available. By calculating all terms except the backbone conformational entropy change and comparing the result to the experimentally measured free energy, we estimated that the mean entropy gain by the backbone chain upon unfolding (ΔSbb) is 5.3 cal/K per mole of residue, and that the average backbone entropy for glycine is 6.7 cal/K. Both numbers are in close agreement with recent estimates made by entirely different methods, suggesting a promising degree of consistency between data obtained from disparate sources. In addition, a quantitative analysis of the folding free energy indicates that the unfavorable backbone entropy for each of the proteins is balanced predominantly by favorable backbone interactions. Finally, because the binding of flexible peptides to receptors is physically similar to folding, the free energy function should, in principle, be equally applicable to flexible docking. By combining atomic contact energies, electrostatics, and sequence-dependent backbone entropy, we calculated a priori the free energy changes associated with the binding of four different peptides to the HLA-A2.1 MHC molecule and found agreement with experiment to within 10% without parameter adjustment.
Article
Protein-protein interaction sites in complexes of known structure are characterised using a series of parameters to evaluate what differentiates them from other sites on the protein surface. Surface patches are defined in protomers from a data set of 28 homo-dimers, 20 different hetero-complexes (segregated into large and small protomers), and antigens from six antibody-antigen complexes. Six parameters (solvation potential, residue interface propensity, hydrophobicity, planarity, protrusion and accessible surface area) are calculated for the observed interface patch and all other surface patches defined on each protein. A ranking of the observed interface, relative to all other possible patches, is calculated. With this approach it becomes possible to analyse the distribution of the rankings of all the observed patches, relative to all other surface patches, for each data set. For each type of complex, none of the parameters were definitive, but the majority showed trends for the observed interface to be distinguished from other surface patches.
Article
A protein docking study was performed for two classes of biomolecular complexes: six enzyme/inhibitor and four antibody/antigen. Biomolecular complexes for which crystal structures of both the complexed and uncomplexed proteins are available were used for eight of the ten test systems. Our docking experiments consist of a global search of translational and rotational space followed by refinement of the best predictions. Potential complexes are scored on the basis of shape complementarity and favourable electrostatic interactions using Fourier correlation theory. Since proteins undergo conformational changes upon binding, the scoring function must be sufficiently soft to dock unbound structures successfully. Some degree of surface overlap is tolerated to account for side-chain flexibility. Similarly for electrostatics, the interaction of the dispersed point charges of one protein with the Coulombic field of the other is measured rather than precise atomic interactions. We tested our docking protocol using the native rather than the complexed forms of the proteins to address the more scientifically interesting problem of predictive docking. In all but one of our test cases, correctly docked geometries (interface Cα RMS deviation ≤2 Å from the experimental structure) are found during a global search of translational and rotational space in a list that was always less than 250 complexes and often less than 30. Varying degrees of biochemical information are still necessary to remove most of the incorrectly docked complexes.
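The Fourier correlation idea mentioned above can be sketched in a few lines: if both molecules are discretized onto grids, the score for every translation of the ligand grid is obtained at once from a product of Fourier transforms. The function names and the toy grid values below are assumptions for illustration only; a real protocol would repeat this for many ligand rotations and use carefully chosen surface and core weights.

```python
import numpy as np

def translation_scores(receptor, ligand):
    """Correlation score for every cyclic translation t of the ligand grid.

    scores[t] = sum over x of receptor[x] * ligand[x - t], with indices wrapping
    around the grid. The sum is evaluated for all t at once via FFTs.
    """
    R = np.fft.fftn(receptor)
    L = np.fft.fftn(ligand)
    return np.real(np.fft.ifftn(R * np.conj(L)))

def best_translation(receptor, ligand):
    """Return the translation (as a tuple of grid steps) with the highest score."""
    scores = translation_scores(receptor, ligand)
    idx = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(i) for i in idx)

# Toy grids: a single favourable receptor cell and a single ligand cell.
receptor_grid = np.zeros((8, 8, 8))
receptor_grid[5, 2, 3] = 1.0
ligand_grid = np.zeros((8, 8, 8))
ligand_grid[1, 1, 1] = 1.0

shift = best_translation(receptor_grid, ligand_grid)  # moves the ligand cell onto the receptor cell
```

For an N × N × N grid this costs on the order of N³ log N operations per rotation, instead of the N⁶ operations needed to score each translation by direct summation.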
Article
A single protein-protein pair, the complex of the influenza virus hemagglutinin with an antibody (Fab BH151), was suggested for prediction at the second experiment on the Critical Assessment of Techniques for Protein Structure Prediction. To predict the structure of the complex, we applied our docking program GRAMM at a decreased resolution (to accommodate the conformational inaccuracies). The lowest-energy match showed a remarkable "low-resolution" surface complementarity between the molecular structures. After receiving the experimental structure of the complex we had a chance to verify our assumptions and results. The analysis of the hemagglutinin-antibody interface revealed several significant conformational changes in the side chains, which resulted in deep interpenetrations of the hemagglutinin and the antibody structures. This confirmed our initial assumption that the structural changes will be beyond the tolerance of high-resolution rigid-body docking. The comparison of the predicted low-resolution match, submitted as the solution, and the experimentally determined complex showed significant structural discrepancies in the orientation of the antibody, due to the low-resolution character of the docking. Because of the severe structural errors, no residue-residue contacts were predicted correctly. However, a significant part of the antigenic site was determined. This illustrates the practical value of the present methodology for the initial prediction of the binding site, as well as points out the problem of transition from the low-resolution predictions of protein-protein complexes to the accurate structure.
Article
The docking section of CASP2 is reviewed. Seven small molecule ligand-protein targets and one protein-protein target were available for predictions. Many of the small molecule ligand complexes involved serine proteases. Overall results for the small molecule targets were good, with at least one prediction for each target being within 3 Å root-mean-square deviation (RMSD) for nearly all targets and within 2 Å RMSD for over half the targets. However, no single docking method seemed to consistently perform best. In addition, the predictions closest to the experimental results were not always those ranked the highest, pointing out that the evaluation (scoring) of potential solutions is still an area that needs improvement. The protein-protein target proved more difficult. None of the predictions did well in reproducing the geometry of the complex, although in many cases the interacting surfaces of the two proteins were predicted with reasonable accuracy. This target consisted of two large proteins and was therefore a demanding target for docking methods.
Article
Recent developments in algorithms to predict the docking of two proteins have considered both the initial rigid-body global search and subsequent screening and refinement. The results of two blind trials of protein docking are encouraging: for complexes that are not too large and do not undergo sizeable conformational change upon association, the algorithms are now able to suggest reasonably accurate models.
Article
The non-covalent assembly of proteins that fold separately is central to many biological processes, and differs from the permanent macromolecular assembly of protein subunits in oligomeric proteins. We performed an analysis of the atomic structure of the recognition sites seen in 75 protein-protein complexes of known three-dimensional structure: 24 protease-inhibitor, 19 antibody-antigen and 32 other complexes, including nine enzyme-inhibitor and 11 that are involved in signal transduction. The size of the recognition site is related to the conformational changes that occur upon association. Of the 75 complexes, 52 have "standard-size" interfaces in which the total area buried by the components in the recognition site is 1600 (±400) Å². In these complexes, association involves only small changes of conformation. Twenty complexes have "large" interfaces burying 2000 to 4660 Å², and large conformational changes are seen to occur in those cases where we can compare the structure of complexed and free components. The average interface has approximately the same non-polar character as the protein surface as a whole, and carries somewhat fewer charged groups. However, some interfaces are significantly more polar and others more non-polar than the average. Of the atoms that lose accessibility upon association, half make contacts across the interface and one-third become fully inaccessible to the solvent. In the latter case, the Voronoi volume was calculated and compared with that of atoms buried inside proteins. The ratio of the two volumes was 1.01 (±0.03) in all but 11 complexes, which shows that atoms buried at protein-protein interfaces are close-packed like the protein interior. This conclusion could be extended to the majority of interface atoms by including solvent positions determined in high-resolution X-ray structures in the calculation of Voronoi volumes.
Thus, water molecules contribute to the close-packing of atoms that ensure complementarity between the two protein surfaces, as well as providing polar interactions between the two proteins.
Article
We present a rapidly executable minimal binding energy model for molecular docking and use it to explore the energy landscape in the vicinity of the binding sites of four different enzyme inhibitor complexes. The structures of the complexes are calculated starting with the crystal structures of the free monomers, using DOCK 4.0 to generate a large number of potential configurations, and screening with the binding energy target function. In order to investigate possible correlations between energy and variation from the native structure, we introduce a new measure of similarity, which removes many of the difficulties associated with root mean square deviation. The analysis uncovers energy gradients, or funnels, near the binding site, with decreasing energy as the degree of similarity between the native and docked structures increases. Such energy funnels can increase the number of random collisions that may evolve into productive stable complex, and indicate that short-range interactions in the precomplexes can contribute to the association rate. The finding could provide an explanation for the relatively rapid association rates that are observed even in the absence of long-range electrostatic steering.
Article
The protein‐protein interaction energy of 12 nonhomologous serine protease‐inhibitor and 15 antibody‐antigen complexes is calculated using a molecular mechanics formalism and dissected in terms of the main‐chain vs. side‐chain contribution, nonrotameric side‐chain contributions, and amino acid residue type involvement in the interface interaction. There are major differences in the interactions of the two types of protein‐protein complex. Protease‐inhibitor complexes interact predominantly through a main‐chain‐main‐chain mechanism while antibody‐antigen complexes interact predominantly through a side‐chain‐side‐chain or a side‐chain‐main‐chain mechanism. However, there is no simple correlation between the main‐chain‐main‐chain interaction energy and the percentage of main‐chain surface area buried on binding. The interaction energy is equally affected by the presence of nonrotameric side‐chain conformations, which constitute ∼20% of the interaction energy. The ability to reproduce the interface interaction energy of the crystal structure if original side‐chain conformations are removed from the calculation is much greater in the protease‐inhibitor complexes than the antibody‐antigen complexes. The success of a rotameric model for protein‐protein docking appears dependent on the extent of the main‐chain‐main‐chain contribution to binding. Analysis of (1) residue type and (2) residue pair interactions at the interface shows that antibody‐antigen interactions are very restricted with over 70% of the antibody energy attributable to just six residue types (Tyr > Asp > Asn > Ser > Glu > Trp) in agreement with previous studies on residue propensity. However, it is found here that 50% of the antigen energy is attributable to just four residue types (Arg = Lys > Asn > Asp).
On average just 12 residue pair interactions (6%) contribute over 40% of the favorable interaction energy in the antibody‐antigen complexes, with charge‐charge and charge/polar‐tyrosine interactions being prominent. In contrast protease inhibitors use a diverse set of residue types and residue pair interactions.
Article
Conformational changes on complex formation have been measured for 39 pairs of structures of complexed proteins and unbound equivalents, averaged over interface and non-interface regions and for individual residues. We evaluate their significance by comparison with the differences seen in 12 pairs of independently solved structures of identical proteins, and find that just over half have some substantial overall movement. Movements involve main chains as well as side chains, and large changes in the interface are closely involved with complex formation, while those of exposed non-interface residues are caused by flexibility and disorder. Interface movements in enzymes are similar in extent to those of inhibitors. All eight of the complexes (six enzyme–inhibitor and two antibody–antigen) that have structures of both components in an unbound form available show some significant interface movement. However, predictive docking is successful even when some of the largest changes occur. We note however that the situation may be different in systems other than the enzyme–inhibitors which dominate this study. Thus the general model is induced fit but, because there is only limited conformational change in many systems, recognition can be treated as lock and key to a first approximation.
Article
Here we carry out an examination of shape complementarity as a criterion in protein-protein docking and binding. Specifically, we examine the quality of shape complementarity as a critical determinant not only in the docking of 26 protein-protein "bound" complexed cases, but in particular, of 19 "unbound" protein-protein cases, where the structures have been determined separately. In all cases, entire molecular surfaces are utilized in the docking, with no consideration of the location of the active site, or of particular residues/atoms in either the receptor or the ligand that participate in the binding. To evaluate the goodness of the strictly geometry-based shape complementarity in the docking process as compared to the main favorable and unfavorable energy components, we study systematically a potential correlation between each of these components and the root mean square deviation (RMSD) of the "unbound" protein-protein cases. Specifically, we examine the non-polar buried surface area, polar buried surface area, buried surface area relating to groups bearing unsatisfied buried charges, and the number of hydrogen bonds in all docked protein-protein interfaces. For these cases, where the two proteins have been crystallized separately, and where entire molecular surfaces are considered without a predefinition of the binding site, no correlation is observed. None of these parameters appears to consistently improve on shape complementarity in the docking of unbound molecules. These findings argue that simplicity in the docking process, utilizing geometrical shape criteria may capture many of the essential features in protein-protein docking. In particular, they further reinforce the long held notion of the importance of molecular surface shape complementarity in the binding, and hence in docking. 
This is particularly interesting in light of the fact that the structures of the docked pairs have been determined separately, allowing side chains on the surface of the proteins to move relatively freely. This study has been enabled by our efficient, computer vision-based docking algorithms. The fast CPU matching times, on the order of minutes on a PC, allow such large-scale docking experiments of large molecules, which may not be feasible by other techniques. Proteins 1999;36:307-317.
Article
With access to whole genome sequences for various organisms and imminent completion of the Human Genome Project, the entire process of discovery in molecular and cellular biology is poised to change. Massively parallel measurement strategies promise to revolutionize how we study and ultimately understand the complex biochemical circuitry responsible for controlling normal development, physiologic homeostasis and disease processes. This information explosion is also providing the foundation for an important new initiative in structural biology. We are about to embark on a program of high-throughput X-ray crystallography aimed at developing a comprehensive mechanistic understanding of normal and abnormal human and microbial physiology at the molecular level. We present the rationale for creation of a structural genomics initiative, recount the efforts of ongoing structural genomics pilot studies, and detail the lofty goals, technical challenges and pitfalls facing structural biologists.
Article
Protein-protein interactions play pivotal roles in various aspects of the structural and functional organization of the cell, and their complete description is indispensable to thorough understanding of the cell. As an approach toward this goal, here we report a comprehensive system to examine two-hybrid interactions in all of the possible combinations between proteins of Saccharomyces cerevisiae. We cloned all of the yeast ORFs individually as a DNA-binding domain fusion ("bait") in a MATa strain and as an activation domain fusion ("prey") in a MATalpha strain, and subsequently divided them into pools, each containing 96 clones. These bait and prey clone pools were systematically mated with each other, and the transformants were subjected to strict selection for the activation of three reporter genes followed by sequence tagging. Our initial examination of approximately 4 × 10^6 different combinations, constituting approximately 10% of the total to be tested, has revealed 183 independent two-hybrid interactions, more than half of which are entirely novel. Notably, the obtained binary data allow us to extract more complex interaction networks, including the one that may explain a currently unsolved mechanism for the connection between distinct steps of vesicular transport. The approach described here thus will provide many leads for integration of various cellular functions and serve as a major driving force in the completion of the protein-protein interaction map.
Article
A new computationally efficient and automated "soft docking" algorithm is described to assist the prediction of the mode of binding between two proteins, using the three-dimensional structures of the unbound molecules. The method is implemented in a software package called BiGGER (Bimolecular Complex Generation with Global Evaluation and Ranking) and works in two sequential steps: first, the complete 6-dimensional binding spaces of both molecules is systematically searched. A population of candidate protein-protein docked geometries is thus generated and selected on the basis of the geometric complementarity and amino acid pairwise affinities between the two molecular surfaces. Most of the conformational changes observed during protein association are treated in an implicit way and test results are equally satisfactory, regardless of starting from the bound or the unbound forms of known structures of the interacting proteins. In contrast to other methods, the entire molecular surfaces are searched during the simulation, using absolutely no additional information regarding the binding sites. In a second step, an interaction scoring function is used to rank the putative docked structures. The function incorporates interaction terms that are thought to be relevant to the stabilization of protein complexes. These include: geometric complementarity of the surfaces, explicit electrostatic interactions, desolvation energy, and pairwise propensities of the amino acid side chains to contact across the molecular interface. The relative functional contribution of each of these interaction terms to the global scoring function has been empirically adjusted through a neural network optimizer using a learning set of 25 protein-protein complexes of known crystallographic structures. 
In 22 out of 25 protein-protein complexes tested, near-native docked geometries were found with Cα RMS deviations ≤4.0 Å from the experimental structures, of which 14 were found within the 20 top ranking solutions. The program works on widely available personal computers and takes 2 to 8 hours of CPU time to run any of the docking tests herein presented. Finally, the value and limitations of the method for the study of macromolecular interactions, not yet revealed by experimental techniques, are discussed.
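Success criteria of this kind (an RMS deviation threshold combined with a rank cutoff) can be expressed compactly. The sketch below is a minimal illustration, assuming the predicted and experimental Cα coordinates are already in the same reference frame and listed in the same atom order; the function names `rmsd` and `count_hits` and the example numbers are hypothetical.

```python
import numpy as np

def rmsd(pred, ref):
    """Root mean square deviation between two (n, 3) coordinate sets.

    No superposition is performed here; both sets are assumed to be in the
    same frame with matching atom order.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(((pred - ref) ** 2).sum(axis=1).mean()))

def count_hits(rmsds_by_rank, cutoff=4.0, top_n=20):
    """Count predictions within `cutoff` Angstrom among the `top_n` best-ranked ones.

    rmsds_by_rank: RMSD values ordered from best-scored to worst-scored prediction.
    """
    return sum(1 for r in rmsds_by_rank[:top_n] if r <= cutoff)

# Toy example: two atoms, one displaced by 5 Angstrom.
example_rmsd = rmsd([[0, 0, 0], [0, 0, 0]], [[3, 4, 0], [0, 0, 0]])
example_hits = count_hits([5.0, 3.9, 4.0, 10.0], cutoff=4.0, top_n=3)
```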
Article
Rigid-body methods, particularly Fourier correlation techniques, are very efficient for docking bound (co-crystallized) protein conformations using measures of surface complementarity as the target function. However, when docking unbound (separately crystallized) conformations, the method generally yields hundreds of false positive structures with good scores but high root mean square deviations (RMSDs). This paper describes a two-step scoring algorithm that can discriminate near-native conformations (with less than 5 Å RMSD) from other structures. The first step includes two rigid-body filters that use the desolvation free energy and the electrostatic energy to select a manageable number of conformations for further processing, but are unable to eliminate all false positives. Complete discrimination is achieved in the second step that minimizes the molecular mechanics energy of the retained structures, and re-ranks them with a combined free-energy function which includes electrostatic, solvation, and van der Waals energy terms. After minimization, the improved fit in near-native complex conformations provides the free-energy gap required for discrimination. The algorithm has been developed and tested using docking decoys, i.e., docked conformations generated by Fourier correlation techniques. The decoy sets are available on the web for testing other discrimination procedures. Proteins 2000;40:525-537.
Article
Proteomics, the large-scale analysis of proteins, will contribute greatly to our understanding of gene function in the post-genomic era. Proteomics can be divided into three main areas: (1) protein micro-characterization for large-scale identification of proteins and their post-translational modifications; (2) 'differential display' proteomics for comparison of protein levels with potential application in a wide range of diseases; and (3) studies of protein-protein interactions using techniques such as mass spectrometry or the yeast two-hybrid system. Because it is often difficult to predict the function of a protein based on homology to other proteins or even their three-dimensional structure, determination of components of a protein complex or of a cellular structure is central in functional analysis. This aspect of proteomic studies is perhaps the area of greatest promise. After the revolution in molecular biology exemplified by the ease of cloning by DNA methods, proteomics will add to our understanding of the biochemistry of proteins, processes and pathways for years to come.
Article
Knowledge of the three-dimensional (3D) structure of a protein-protein complex provides insights into the function of the system that can guide, for example, the systematic design of novel regulators of activity. However, at the end of 1997, there were more than 5000 protein structures in the Brookhaven databank (PDB) but fewer than 200 sets of coordinates for protein-protein complexes. This disparity is reminiscent of the protein-sequence/protein-structure gap and similarly motivates the development of computational methods for structure prediction. This chapter describes the strategy to start with the coordinates of the two molecules in their unbound states and then computationally model the structure of the bound complex including the conformational changes on association. For reviews of the field of protein docking see refs. 1–3.
Article
The computer program DOT quickly finds low-energy docked structures for two proteins by performing a systematic search over six degrees of freedom. A novel feature of DOT is its energy function, which is the sum of both a Poisson-Boltzmann electrostatic energy and a van der Waals energy, each represented as a grid-based correlation function. DOT evaluates the energy of interaction for many orientations of the moving molecule and maintains separate lists scored by either the electrostatic energy, the van der Waals energy or the composite sum of both. The free energy is obtained by summing the Boltzmann factor over all rotations at each grid point. Three important findings are presented. First, for a wide variety of protein-protein interactions, the composite-energy function is shown to produce larger clusters of correct answers than found by scoring with either van der Waals energy (geometric fit) or electrostatic energy alone. Second, free-energy clusters are demonstrated to be indicators of binding sites. Third, the contributions of electrostatic and attractive van der Waals energies to the total energy term appropriately reflect the nature of the various types of protein-protein interactions studied.
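The statement that a free energy is obtained by summing the Boltzmann factor over all rotations at each grid point can be illustrated as follows. This is a minimal sketch, not DOT's implementation: the function name, the array layout (rotations along the first axis), the value of kT (about 0.593 kcal/mol near room temperature), and the absence of any normalization by the number of rotations are all assumptions.

```python
import numpy as np

def free_energy_per_grid_point(energies, kT=0.593):
    """F(x) = -kT * ln( sum over rotations r of exp(-E(r, x) / kT) ).

    energies: array of shape (n_rotations, ...grid shape...), in the same
              units as kT (kcal/mol assumed here).
    Returns an array with the grid shape.
    """
    x = -np.asarray(energies, dtype=float) / kT
    # Subtract the per-grid-point maximum before exponentiating to avoid overflow.
    m = x.max(axis=0)
    return -kT * (m + np.log(np.exp(x - m).sum(axis=0)))

# Toy example: two rotations with the same energy at a single grid point.
toy = free_energy_per_grid_point(np.array([[1.0], [1.0]]), kT=0.6)
```

With this definition, grid points where many rotations have low energy receive a lower free energy than points with a single favourable rotation, which is one way to read the abstract's remark that free-energy clusters indicate binding sites.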
Article
A genetic algorithm (GA) for protein-protein docking is described, in which the proteins are represented by dot surfaces calculated using the Connolly program. The GA is used to move the surface of one protein relative to the other to locate the area of greatest surface complementarity between the two. Surface dots are deemed complementary if their normals are opposed, their Connolly shape type is complementary, and their hydrogen bonding or hydrophobic potential is fulfilled. Overlap of the protein interiors is penalized. The GA is tested on 34 large protein-protein complexes where one or both proteins has been crystallized separately. Parameters are established for which 30 of the complexes have at least one near-native solution ranked in the top 100. We have also successfully reassembled a 1,400-residue heptamer based on the top-ranking GA solution obtained when docking two bound subunits.