Article

A Protein-Protein Docking Benchmark

Authors:
  • State Key Laboratory for Agrobiotechnology, College of Biological Sciences

Abstract

We have developed a nonredundant benchmark for testing protein-protein docking algorithms. Currently it contains 59 test cases: 22 enzyme-inhibitor complexes, 19 antibody-antigen complexes, 11 other complexes, and 7 difficult test cases. Thirty-one of the test cases, for which the unbound structures of both the receptor and ligand are available, are classified as follows: 16 enzyme-inhibitor, 5 antibody-antigen, 5 others, and 5 difficult. Such a centralized resource should benefit the docking community not only as a large curated test set but also as a common ground for comparing different algorithms. The benchmark is available at (http://zlab.bu.edu/~rong/dock/benchmark.shtml).
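The category counts in the abstract can be checked with a short tally. The sketch below is only an illustration: the dictionary and function names are hypothetical, and the "remaining" column is simply the difference between the two sets of counts (the abstract does not say how those cases are classified).

```python
# Category counts copied from the abstract above.
ALL_CASES = {
    "enzyme-inhibitor": 22,
    "antibody-antigen": 19,
    "other": 11,
    "difficult": 7,
}
# Cases for which unbound structures of BOTH receptor and ligand are available.
BOTH_UNBOUND = {
    "enzyme-inhibitor": 16,
    "antibody-antigen": 5,
    "other": 5,
    "difficult": 5,
}


def summarize(all_cases, both_unbound):
    """Return (category, total, both_unbound, remaining) rows."""
    rows = []
    for category, total in all_cases.items():
        n_both = both_unbound[category]
        if n_both > total:
            raise ValueError(f"inconsistent counts for {category}")
        rows.append((category, total, n_both, total - n_both))
    return rows


if __name__ == "__main__":
    for category, total, n_both, remaining in summarize(ALL_CASES, BOTH_UNBOUND):
        print(f"{category:18s} total={total:2d} both-unbound={n_both:2d} remaining={remaining:2d}")
    print("all cases:", sum(ALL_CASES.values()))
    print("both unbound:", sum(BOTH_UNBOUND.values()))
```

The two sums reproduce the totals stated in the abstract (59 test cases, 31 with both unbound structures).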


... Two protein-protein docking datasets, Docking Benchmark 5.5 (DB5.5) [41-45] and Database of Interacting Protein Structures (DIPS) [46], are used here to represent interfaces that form without the help of molecular glues. DB5.5 [41-45] is a manually curated dataset containing 253 protein-protein complexes spanning eight categories based on protein functions. ...
... The PPIs in the other three datasets (Group1, DB5.5, and DIPS) are largely biased towards complexes formed between folded protein partners. Intrinsically disordered proteins are not included when building the DB5.5 dataset [41-45]. The DIPS dataset contains complexes with large PPIs such as the membrane protein complex phosphatidylcholine flippase Dnf2-Lem3 (PDB ID: 7KY8) with a PPI BSA of 4,995 Å² and the Myosin II complete coiled-coil domain (PDB ID: 7KOG) with an extensive PPI BSA of 25,746 Å². ...
Article
Full-text available
Molecular glues are a class of small molecules that stabilize the interactions between proteins. Naturally occurring molecular glues are present in many areas of biology where they serve as central regulators of signaling pathways. Importantly, several clinical compounds act as molecular glue degraders that stabilize interactions between E3 ubiquitin ligases and target proteins, leading to their degradation. Molecular glues hold promise as a new generation of therapeutic agents, including those molecular glue degraders that can redirect the protein degradation machinery in a precise way. However, rational discovery of molecular glues is difficult in part due to the lack of understanding of the protein-protein interactions they stabilize. In this review, we summarize the structures of known molecular glue-induced ternary complexes and the interface properties. Detailed analysis shows different mechanisms of ternary structure formation. Additionally, we also review computational approaches for predicting protein-protein interfaces and highlight the promises and challenges. This information will ultimately help inform future approaches for rational molecular glue discovery.
... These can evaluate the accuracy of the predictions by assessing the similarity to the native P:P pose, which is not known a priori [24]. The Critical Assessment of PRedicted Interactions (CAPRI) challenge, developed in 2001 [25], is a valuable platform that promotes the evaluation and the progression of docking algorithms in realistic conditions. Several rounds of blind predictions have resulted in the development of a wide variety of docking algorithms, some of them with considerable success. ...
... [32] Despite all the developments in docking algorithms, present methods still struggle to identify near-native poses [24,33,34]. In particular, the success rate for finding an acceptable or better pose in the top ten is a maximum of 58%, or 27% at the top position (considering 115 scoring functions) [33]. Others have obtained a prediction accuracy of only 38% considering the top-ten poses of 55 cases. ...
Article
The development of docking algorithms to predict near-native structures of protein:protein complexes from the structures of the isolated monomers is of paramount importance for molecular biology and drug discovery. In this study, we assessed the capacity of the interfacial area of protein:protein complexes and of Molecular Mechanics-Poisson Boltzmann Surface Area (MM-PBSA)-derived properties to rank docking poses. We used a set of 48 protein:protein complexes, and a total of 67 docking experiments distributed among bound:bound, bound:unbound, and unbound:unbound test cases. The MM-PBSA binding free energy of protein monomers has been shown to be very convenient for predicting high-quality structures with a high success rate. In fact, considering solely the top-ranked pose of more than 200 docking solutions of each of 39 protein:protein complexes, the success rate was 77% in the prediction of high-quality poses, or 90% if considering high- or medium-quality poses. If considering high- or medium-quality poses in the top-one prediction, a success rate of 87% was obtained for a scoring scheme based on computational alanine scanning mutagenesis data. Such ranking accuracy highlights the ability of these properties to predict near-native poses in protein:protein docking.
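The "top-one" and "top-ten" success rates quoted in the excerpts and abstract above are fractions of test cases with at least one sufficiently good pose among the N best-ranked poses. A minimal sketch of that bookkeeping follows; the function name, the toy complexes, and the quality labels assigned to them are hypothetical and are not data from the study.

```python
def top_n_success_rate(ranked_quality, n, accepted=frozenset({"high", "medium"})):
    """Fraction of cases with at least one accepted pose among the top n.

    ranked_quality maps a case name to a list of quality labels, ordered
    from the best-ranked pose to the worst-ranked pose.
    """
    if not ranked_quality:
        raise ValueError("no test cases given")
    hits = sum(
        1
        for labels in ranked_quality.values()
        if any(label in accepted for label in labels[:n])
    )
    return hits / len(ranked_quality)


# Toy ranking for four hypothetical complexes (labels are made up).
toy = {
    "complex_A": ["high", "incorrect"],
    "complex_B": ["incorrect", "medium"],
    "complex_C": ["incorrect", "incorrect"],
    "complex_D": ["acceptable", "high"],
}

if __name__ == "__main__":
    print("top-1:", top_n_success_rate(toy, 1))
    print("top-2:", top_n_success_rate(toy, 2))
```

Changing `accepted` (for example, to include "acceptable") changes which poses count as a success, which is one reason success rates reported by different studies are not directly comparable.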
... [10-16] Docking methodologies are evaluated by the community-wide blind assessment CAPRI [17] and by benchmarking on pre-compiled protein-protein sets. Such benchmark sets are based on bound and unbound X-ray protein structures [16,18-20] and protein models [21-23]. Docking has to distinguish correct matches from false-positives. ...
... Thus, an accurate prediction of protein complexes from the unbound components poses a big problem for the docking algorithms. Because of that, the datasets of unbound structures corresponding to the co-crystallized complexes [16,18-20] are essential for the development and validation of docking approaches. The DOCKGROUND unbound set distinguishes itself by being an integral part of a large resource and the basis for other datasets (Fig. 1), for example, allowing a straightforward comparison of docking performance for unbound and model structures. ...
Article
Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website.
... The Docking Benchmark set (BM set) is a successful protein-protein scoring benchmark dataset series, with the first DBM set published in 2003 [57] being named DBM1.0. Each later BM set version was built on top of the previous version by adding new targets. ...
Preprint
Full-text available
The quality prediction of quaternary structure models of a protein complex, in the absence of its true structure, is known as the Estimation of Model Accuracy (EMA). EMA is useful for ranking predicted protein complex structures and using them appropriately in biomedical research, such as protein-protein interaction studies, protein design, and drug discovery. With the advent of more accurate protein complex (multimer) prediction tools, such as AlphaFold2-Multimer and ESMFold, the estimation of the accuracy of protein complex structures has attracted increasing attention. Many deep learning methods have been developed to tackle this problem; however, there is a noticeable absence of a comprehensive overview of these methods to facilitate future development. Addressing this gap, we present a review of deep learning EMA methods for protein complex structures developed in the past several years, analyzing their methodologies and impacts. We also provide a prospective summary of some potential new developments for further improving the accuracy of the EMA methods.
... The Docking Benchmark set (BM set) is a successful protein-protein scoring benchmark dataset series, with the first DBM set published in 2003 [58] being named DBM1.0. Each later BM set version was built on top of the previous version by adding new targets. ...
Article
Full-text available
The quality prediction of quaternary structure models of a protein complex, in the absence of its true structure, is known as the Estimation of Model Accuracy (EMA). EMA is useful for ranking predicted protein complex structures and using them appropriately in biomedical research, such as protein–protein interaction studies, protein design, and drug discovery. With the advent of more accurate protein complex (multimer) prediction tools, such as AlphaFold2-Multimer and ESMFold, the estimation of the accuracy of protein complex structures has attracted increasing attention. Many deep learning methods have been developed to tackle this problem; however, there is a noticeable absence of a comprehensive overview of these methods to facilitate future development. Addressing this gap, we present a review of deep learning EMA methods for protein complex structures developed in the past several years, analyzing their methodologies, data and feature construction. We also provide a prospective summary of some potential new developments for further improving the accuracy of the EMA methods.
... [25] An essential ingredient of the scoring functions for protein-protein docking is the shape complementarity. [26] It has been possible to dock proteins that remain rigid after complex formation and the upsurge in the elucidation of the protein structure has boosted studies on protein-protein docking. ...
Article
Candida glabrata infections, being resistant to many azole antifungal agents, are difficult to treat. Various parts of Adenanthera pavonina have been used in traditional medicine. In the present study, an attempt was made to screen the bioefficacy of the identified phytoconstituents of A. pavonina on an in silico platform and identify some potential drug-like molecules that can impede important drug targets of C. glabrata using the molecular docking method. In a previous study related to the current research, the phytochemical profiling of the methanolic stem extract of A. pavonina was carried out using GC-MS to identify the phytoconstituents. The three-dimensional structures of the fungal receptors were derived by homology modeling using Modeller9v7, and the ligands for which structures were not available were drawn with ACD/ChemSketch. The docking of ligands and receptors was performed using PatchDock software. Drug-likeness and pharmacodynamic properties were evaluated using SWISS-ADME. GC-MS analysis of the A. pavonina extract revealed the presence of 17 phytocompounds, of which 2-heptyl-1,3-dioxolane and methyl 4-O-methyl-D-arabinopyranoside docked best with the epithelial adhesion protein 6 receptor and the cell wall transcription factor ACE2. Methyl 4-O-methyl-D-arabinopyranoside also docked best with the integral cell wall protein receptor. Although other compounds showed good docking scores, 2-heptyl-1,3-dioxolane had a better binding affinity than the other ligands, signifying its potent antifungal activity. 2-Heptyl-1,3-dioxolane was found to be BBB positive and 4-O-methyl-D-arabinopyranoside BBB negative. As the findings indicate, the two phytocompounds present in the methanolic stem extract of A. pavonina demonstrated good docking scores when docked with specific fungal cell wall receptors and thus may prove to be appropriate lead molecules.
... The ZDOCK/RDOCK method has been validated by multiple independent protein-protein docking studies and has been found to be a reliable method for the prediction of protein-protein interactions. The program ZDOCK [8,9] initially identifies the docking positions within the receptor for the ligand (ANG2) based on shape, steric, and electrostatic complementarities. Once docking positions are identified, they are ranked and refined by the second program, RDOCK [10,11], which is based on the CHARMM force field, calculates the energetics between the protein and docked peptides, and ranks the docked poses. ...
Article
Full-text available
BACKGROUND & OBJECTIVE: Angiopoietins are protein growth factors which play a key role in angiogenesis. Angiogenesis is the process of forming blood vessels from pre-existing ones. Angiopoietin-1 (Ang-1) and Angiopoietin-2 (Ang-2) have been identified as ligands of the endothelial receptor tyrosine kinase Tie-2. ANG-2 is a key regulator of angiogenesis that exerts context-dependent effects on endothelial cells (ECs). ANG-2 binds the endothelial-specific receptor TIE2 and acts as a negative regulator of ANG-1/TIE2 signaling during angiogenesis, thereby controlling the responsiveness of ECs to exogenous cytokines. The transmembrane tyrosine kinase TIE-2 and the receptor for angiopoietins have been shown to be involved in angiogenic processes. They are also known to play a role in tumor angiogenesis. However, the mode of interaction between ANG-2 and the TIE2 receptor is not known because of the absence of a high-resolution co-crystal structure. Therefore, in this study attempts were made to investigate the mode and mechanism of molecular interactions between Tie2 and Ang2 using molecular modeling and molecular dynamics studies. METHODOLOGY: In the present study, both Tie2 (PDB Id: 2GY5) and Angiopoietins (PDB Id: 2GY7) were first prepared using the protein preparation wizard (Schrodinger package). Protein-protein interaction between both the proteins was studied using ZDock followed by refinement using Rdock. The best docked pose was then subjected to molecular dynamics (MD) simulations to study the precise interaction between TIE2 (receptor) and Angiopoietin-2 (ligand) over a specific time span using AMBER 11.0. The obtained MD trajectories were further used to estimate the binding free energy of the complex using the molecular mechanics/Poisson Boltzmann surface area (MM-PBSA) method. RESULTS: The binding energy (∆Gbinding) between both the proteins, Tie2 and Ang2, was predicted to be -28.77 kcal/mol using Rdock. The other energy parameters between Tie2 and APC interactions such as electrostatic (Eelec), van der Waals (Evdw) and desolvation (Esol) energy are -44.68 kcal/mol, -99.83 kcal/mol and 6.10 kcal/mol respectively, demonstrating modest interactions between them. The interacting surface area between Tie2 and Ang2 is 842:858 Å². CONCLUSION: Results obtained from this study revealed that both Ang2 and Tie2 bind with high affinity with modest interacting surface area. Further, the results guided us in designing specific experiments for biological evaluations.
... The drawback of this approach is that it uses (a) only the bound structures, and (b) only one bound structure out of a pool of bound structures, based on subjective criteria. Developments in the protein-protein docking benchmark over the last 20 years (Chen et al., 2003; Mintseris et al., 2005; Vreven et al., 2015) have incorporated BSA calculations considering the unbound states, even for specific complexes (e.g., antibody-antigen interactions; Guest et al., 2021). However, even though there exist multiple structural data of the same protein complex derived by modern structural biology methods (Guest et al., 2021; Richardson et al., 2021), still, a single best complex is considered for subsequent analysis. ...
Article
Full-text available
Understanding protein–protein interactions (PPIs) is fundamental to infer how different molecular systems work. A major component to model molecular recognition is the buried surface area (BSA), that is, the area that becomes inaccessible to solvent upon complex formation. To date, many attempts tried to connect BSA to molecular recognition principles, and in particular, to the underlying binding affinity. However, the most popular approach to calculate BSA is to use a single (or in some cases few) bound structures, consequently neglecting a wealth of structural information of the interacting proteins derived from ensembles corresponding to their unbound and bound states. Moreover, the most popular method inherently assumes the component proteins to bind as rigid entities. To address the above shortcomings, we developed a Monte Carlo method‐based Interface Residue Assessment Algorithm (IRAA), to calculate a combined distribution of BSA for a given complex. Further, we apply our algorithm to human ACE2 and SARS‐CoV‐2 Spike protein complex, a system of prime importance. Results show a much broader distribution of BSA compared to that obtained from only the bound structure or structures and extended residue members of the interface with implications to the underlying biomolecular recognition. We derive that specific interface residues of ACE2 and of S‐protein are consistently highly flexible, whereas other residues systematically show minor conformational variations. In effect, IRAA facilitates the use of all available structural data for any biomolecular complex of interest, extracting quantitative parameters with statistical significance, thereby providing a deeper biophysical understanding of the molecular system under investigation.
... In the current investigation, it was discovered that DCIP could dock and bind with targeted proteins and enzymes. The objective of ligand-protein interaction studies is to anticipate a ligand's preferred binding mode(s) with a protein having a known 3D structure [76]. The structures of the target proteins were downloaded in PDB format from the RCSB PDB [77]. ...
Article
2,6-Dichloroindophenol sodium salt (DCIP), an aromatic compound, was evaluated experimentally and computationally by DFT, with various solvents used in the solvation analysis. FT-IR studies are used to identify the various functional groups, which are then compared with simulated spectra. The estimated vibrational wavenumbers were scaled using a suitable scaling factor after the optimized geometrical parameters were determined. The plotted FT-IR, FT-Raman, and UV-Vis spectra are correlated with the experimental results. Measured and computed spectra are found to be quite similar. NBO (Natural Bond Orbital) analysis explains how charge transfer occurs in a molecule. NBO analysis indicates that the greatest second-order perturbation energy, E(2) = 25.77 kcal/mol, is associated with electron delocalization from the donor π (C13-C15) to the acceptor π* (O4-C17). Local reactivity descriptors identify the molecule's reactive areas. The stability, hardness, and softness are studied through the Highest Occupied Molecular Orbital (HOMO) and Lowest Unoccupied Molecular Orbital (LUMO). FMO (Frontier Molecular Orbital) analysis reveals the kinetic stabilization and reactivity of DCIP. Reactive descriptors and molecular reactivity of DCIP were significantly changed by the solvents. The material under analysis has outstanding NLO (Non-linear Optical) properties. ELF (Electron Localization Function), LOL (Localised Orbital Locator) and RDG (Reduced Density Gradient) analyses were performed and reported. Molecular docking is used to investigate specific biological data of DCIP. Two bacterial targets interact with a protein ligand. Drug-likeness and ADMET analyses were utilised to ascertain the biological properties. The outcome of the ADMET experiments showed that the structure under investigation possesses antibacterial properties. To test the substance's effectiveness against various bacterial strains, antibacterial tests were carried out.
... Finally, root mean square deviation clustering is applied to the candidate solutions to help identify redundant solutions to be discarded. The main reason behind PatchDock's high efficiency is its fast transformational search, driven by local feature matching, rather than a so-called "brute force" search of the six-dimensional transformation space (Chen et al., 2003; Connolly, 1983; de Lima et al., 2016; Shaker et al., 2018; Zhang et al., 1997). ...
Article
Full-text available
This combined Al12E12 (E = N, P) surface adsorption and docking study describes the new possibility of prospective potential probing (photophysical/optical) and therapy (medicinal/biochemical) with these adsorbent conjugates. DFT investigations were undertaken herein to help generate geometrical models and better understand the possible favorable adsorption energetics. We attempt to explain their adsorption behaviors and docking involving SARS-CoV-2 viruses (PDB) to assess their possible pharmaceutical potential against the pandemic virus (COVID-19). The adsorption behavior of 8-hydroxy-2-methylquinoline (MQ) and its halogenated derivatives, 5,7-diiodo-8-hydroxy-2-methylquinoline (MQI), 5,7-dichloro-8-hydroxy-2-methylquinoline (MQCl), and 5,7-dibromo-8-hydroxy-2-methylquinoline (MQBr), with aluminum-nitrogen (AlN) and aluminum-phosphorus (AlP) fullerene-like nanocages is reported. A decrease in the hardness of the nanoclusters when adsorbed with drug molecules resulted in an incrementally improved chemical softness (see e.g., Hard-Soft Acid Base theory), indicating that reactivity of the drug molecule in the resulting complex increases upon cluster chemical adsorption. The energy gap is found to be maximized for AlN-MQ and minimized for AlP-MQI; the reduced density gradient (RDG) iso-surfaces and AIM studies also corroborated this. Therefore, these two were found, respectively, to be the least and most electrically conductive of the species under study. We selected a simple medicinal building block (chelator) in addition to selecting the cluster based on previous literature reports. Important parameters such as gap energies and global indices were determined. We assessed NLO properties. The SARS-CoV-2 PDB structures 6VW1, 6VYO, 6WKQ, 7AD1, 7AOL, and 7B3C were enlisted as ligand targets for docking studies (PatchDock Server) using the requisite PDB geometries (For the structure of 6VW1, kindly see reference, 2020; For the structure of 6VYO kindly see reference, 2020; For the structure of 6WKQ kindly see reference, 2020; For the structure of 7AD1 kindly see reference, 2021; For the structure of 7AOL kindly see reference, 2021; For the structure of 7B3C kindly see reference, 2021). Such findings indicate that the AlN-drug conjugates have an inhibitory effect against these selected receptors.
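The excerpt preceding this abstract mentions root mean square deviation (RMSD) clustering of candidate docking solutions to discard redundant ones. The sketch below shows one common greedy form of that idea. It is not PatchDock's implementation: the function names, the cutoff value, and the toy coordinates are assumptions, and poses are assumed to be already expressed in a common reference frame so that no superposition is needed.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def cluster_by_rmsd(poses, cutoff):
    """Greedy clustering: keep a pose only if it is farther than `cutoff`
    from every better-scoring pose that was already kept.

    poses is a list of (score, coords) pairs; a higher score is assumed
    to be better.
    """
    representatives = []
    for score, coords in sorted(poses, key=lambda p: p[0], reverse=True):
        if all(rmsd(coords, kept) > cutoff for _, kept in representatives):
            representatives.append((score, coords))
    return representatives


# Toy ligand poses with two atoms each (coordinates are made up).
toy_poses = [
    (9.0, [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]),   # near-duplicate of the best pose
    (10.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),  # best-scoring pose
    (8.0, [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),   # distinct pose
]

if __name__ == "__main__":
    for score, _ in cluster_by_rmsd(toy_poses, cutoff=1.0):
        print("kept pose with score", score)
```

The greedy order matters: because poses are visited from best to worst score, each cluster is represented by its best-scoring member.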
... The benchmark consists of 13 easy, 22 intermediate, and 12 difficult cases for docking [224]. Unlike the protein-RNA [69-72] or protein-protein docking [225-229] benchmarks, the protein-DNA benchmark is not updated regularly. ...
Chapter
Full-text available
Protein–nucleic acids interactions, which involve the binding of protein with RNA or/and DNA molecules, play crucial roles in many biochemical pathways. Experimental studies of these biomolecular interactions are important to understand the mechanistic details of their functions. Different in vitro and in vivo assays, high-throughput methods, and high-resolution structure determination techniques enrich our knowledge of molecular biology, cell development, and mechanisms of various diseases. However, experimental determination of protein–nucleic acid interactions is often difficult, time-consuming, and sometimes impossible. Various structural information, including those from direct experiments as well as their interpretation, are available for protein–nucleic acid complexes in the form of databases. Different groups have used the experimental data to train and validate different prediction models. Here we present a comprehensive overview of different tools and databases encompassing different realms of protein–nucleic acids interactions.
... Protein-protein docking benchmark, DOCKGROUND, and PPI4Dock are docking datasets that provide the opportunity for the same. The protein-protein docking benchmark, proposed by Chen et al. [162] in 2003, was the first of its kind. They took special care to include targets of all difficulty levels (rigid-body, medium, and difficult) while avoiding redundant structures. ...
Article
Full-text available
The biological significance of proteins has attracted the scientific community to explore their characteristics. The studies shed light on the interaction patterns and functions of proteins in a living body. The practical difficulties of reliable experimental techniques paved the way for introducing computational methods in interaction prediction. Automated methods reduced the difficulties but could not yet replace experimental studies as the field is still evolving. Interaction prediction is a critical problem that needs highly accurate results, but none of the existing methods yet offers performance on a par with experimental results. This article aims to assess the existing computational docking algorithms, their challenges, and future scope. Blind docking techniques are quite helpful when no information other than the individual structures is available. As more and more complex structures are being added to different databases, information-driven approaches can be a good alternative. Artificial intelligence, ruling over the major fields, is expected to take over this domain very shortly.
... Aggregated free energy, also known as interface energy, is the sum of several energetic factors. In this analysis the benchmark value for the FireDock energy score was set as 1.0 (Chen et al. [24]), and ranking of FireDock energy scores demonstrated that among the potential candidate proteins analysed, only CDK5, with a FireDock score of 1.81, exceeded the benchmark and was the unambiguous choice as intermediate protein. (Other disqualified proteins and the calculation of FireDock energy scores are given in the Supplementary Data Section 4 for further information.) ...
... Betts and Sternberg studied the conformational changes underlying PPIs for 39 complexes (Betts and Sternberg 1999). They concluded that although several PPIs follow the mechanism of induced fit, there are still many other protein complexes that have large accompanying structural changes and thus cannot be explained by the induced fit mechanism alone (Chen et al. 2003; Goh et al. 2004). Bosshard has stated that only in the case of a very strong match between the interacting sites does the initial complex have enough longevity and strength for induced fit to take place within a reasonable time frame (Bosshard 2001). ...
Chapter
Many biological functions in the cell are mediated when a protein interacts with one or more proteins in a process known as protein–protein interaction (PPI). These PPIs can be classified based on six distinct criteria such as contact criteria, interaction lifetime, affinity, nature of interacting partners, identity of formed complex, and presence/absence of artifacts. This chapter delineates how PPIs are functionally significant for the sustenance of life and discusses all the parameters that are utilized to analyze a particular PPI, particularly the interface between the two interacting proteins. Further, it discusses the models that have been proposed to describe PPIs and also traces the pathways down which PPIs have evolved. The chapter concludes with interactions that are mediated either by particular subunits, domains and/or motifs.
... PatchDock is a very efficient algorithm for protein-small ligand and protein-protein docking [8]. The algorithm was verified on enzyme-inhibitor and antibody-antigen complexes from benchmark 0.0 [9], where it successfully found near-native solutions for most of the cases. The algorithm was also successfully tested in the last three rounds [10-12] of the Critical Assessment of PRediction of Interactions (CAPRI) [13]. ...
... Group 4 has a single member, the complex of human profilin-1 with β-actin (4), and is the only case for which all bound and unbound structures were available with high sequence identity (over 94%). This case is also an entry of the protein-protein docking benchmark that we maintain [6,42-45]. ...
Preprint
Ab initio protein-protein docking algorithms often rely on experimental data to identify the most likely complex structure. We integrated protein-protein docking with the experimental data of chemical cross-linking followed by mass spectrometry. We tested our approach using 12 cases that resulted from an exhaustive search of the Protein Data Bank for protein complexes with cross-links identified in our experiments. We implemented cross-links as constraints based on Euclidean distance or void-volume distance. For most test cases the rank of the top-scoring near-native prediction was improved by at least twofold compared with docking without the cross-link information, and the success rates for the top 5 and top 10 predictions doubled. Our results demonstrate the delicate balance between retaining correct predictions and eliminating false positives. Several test cases had multiple components with distinct interfaces, and we present an approach for assigning cross-links to the interfaces. Employing the symmetry information for these cases further improved the performance of complex structure prediction. Highlights: Incorporating low-resolution cross-linking experimental data in protein-protein docking algorithms improves performance more than twofold. Integration of protein-protein docking with chemical cross-linking reveals information on the configuration of higher order complexes. Symmetry analysis of protein-protein docking results improves the predictions of multimeric complex structures.
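The abstract above describes cross-links implemented as constraints based on Euclidean distance. A minimal sketch of how such a constraint could filter docking predictions is given below. Everything specific in it is an assumption for illustration: the function names, the 30 Å threshold, the rule requiring a minimum number of satisfied cross-links, and the toy coordinates are not taken from the preprint.

```python
import math


def count_satisfied(residue_coords, crosslinks, max_distance):
    """Count cross-links whose two residues lie within max_distance.

    residue_coords maps a residue identifier to an (x, y, z) position in
    the docked complex; crosslinks is a list of (residue_a, residue_b).
    """
    return sum(
        1
        for res_a, res_b in crosslinks
        if math.dist(residue_coords[res_a], residue_coords[res_b]) <= max_distance
    )


def filter_predictions(predictions, crosslinks, max_distance, min_satisfied):
    """Keep the names of predictions that satisfy enough cross-links.

    predictions is a list of (name, residue_coords) pairs, kept in their
    original (score-ranked) order.
    """
    return [
        name
        for name, coords in predictions
        if count_satisfied(coords, crosslinks, max_distance) >= min_satisfied
    ]


# Hypothetical cross-links between receptor (R) and ligand (L) residues.
toy_crosslinks = [("R:K10", "L:K55"), ("R:K42", "L:K7")]
toy_predictions = [
    ("pose_1", {"R:K10": (0, 0, 0), "L:K55": (50, 0, 0), "R:K42": (0, 10, 0), "L:K7": (0, 70, 0)}),
    ("pose_2", {"R:K10": (0, 0, 0), "L:K55": (20, 0, 0), "R:K42": (0, 10, 0), "L:K7": (0, 25, 0)}),
    ("pose_3", {"R:K10": (0, 0, 0), "L:K55": (12, 0, 0), "R:K42": (0, 10, 0), "L:K7": (0, 90, 0)}),
]

if __name__ == "__main__":
    kept = filter_predictions(toy_predictions, toy_crosslinks, max_distance=30.0, min_satisfied=1)
    print("predictions consistent with at least one cross-link:", kept)
```

Raising `min_satisfied` removes more false positives but also risks discarding correct predictions, which mirrors the balance the abstract describes.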
... The dataset was the total of 59 protein heterodimeric complexes in the ZLAB protein-protein docking benchmark (version 1.0) [33]. The 59 heterodimers were divided, and all-to-all (cross) docking calculations were performed on the 59 receptor proteins and 59 ligand proteins. ...
Preprint
Public cloud computing environments, such as Amazon AWS, Microsoft Azure, and the Google Cloud Platform, have achieved remarkable improvements in computational performance in recent years, and are also expected to be able to perform massively parallel computing. As the cloud enables users to use thousands of CPU cores and GPU accelerators casually, and various software types can be used very easily by cloud images, the cloud is beginning to be used in the field of bioinformatics. In this study, we ported the original protein-protein interaction prediction (protein-protein docking) software, MEGADOCK, into Microsoft Azure as an example of an HPC cloud environment. A cloud parallel computing environment with up to 1,600 CPU cores and 960 GPUs was constructed using four CPU instance types and two GPU instance types, and the parallel computing performance was evaluated. Our MEGADOCK on Azure system showed a strong scaling value of 0.93 for the CPU instance when H16 instance with 100 instances were used compared to 50, and a strong scaling value of 0.89 for the GPU instance when NC24 instance with 20 were used compared to 5. Moreover, the results of the usage fee and total computation time supported that using a GPU instance reduced the computation time of MEGADOCK and the cloud usage fee required for the computation. The developed environment deployed on the cloud is highly portable, making it suitable for applications in which an on-demand and large-scale HPC environment is desirable.
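The "strong scaling value" reported above compares run times for a fixed workload as the number of instances grows. One common definition is the relative parallel efficiency shown in this sketch; whether the preprint uses exactly this formula is an assumption, and the timings in the example are made up.

```python
def strong_scaling_efficiency(time_ref, n_ref, time_new, n_new):
    """Relative strong-scaling efficiency for a fixed total workload.

    A value of 1.0 means that the run time dropped in exact proportion
    to the increase in resources (n_ref -> n_new).
    """
    if min(time_ref, time_new) <= 0 or min(n_ref, n_new) <= 0:
        raise ValueError("times and instance counts must be positive")
    return (time_ref * n_ref) / (time_new * n_new)


if __name__ == "__main__":
    # Hypothetical timings: 100 s on 50 instances, 62.5 s on 100 instances.
    print(strong_scaling_efficiency(100.0, 50, 62.5, 100))
```

Under this definition, a value such as 0.93 for 100 versus 50 instances would mean the larger run took slightly more than half the time of the smaller one.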
... We downloaded the protein responsible for this action from the Protein Data Bank (PDB ID: 3hkx) [46]. The ligand, our pyrrole derivative, was docked with the above macromolecule using the PatchDock docking server [47,48,49,50]. The molecular docking score between the pyrrole-derived compound and the protein (PDB ID: 3hkx) was 2660, and the full fitness energy was -1263.77 ...
Article
Pyrroles are an exciting class of organic compounds with immense medicinal activities. This manuscript presents the structural and quantum mechanical studies of 1-(2-aminophenyl) pyrrole using X-ray diffraction and various spectroscopic methods such as infrared, Raman, ultraviolet and fluorescence spectroscopy, and compares them with theoretical simulations. The single-crystal X-ray diffraction values and optimized geometry parameters were also in reasonable agreement. A fully relaxed potential energy scan revealed the stability of the possible conformers of this molecule. We present the density functional theory results and the assignment of the vibrational modes in the infrared spectrum. The experimental and scaled simulated vibrations matched when density functional theory simulations (B3LYP functional with the 6-311++G** basis set) were used. The electronic spectrum was simulated using time-dependent density functional theory with the CAM-B3LYP functional in dimethylsulphoxide solvent. The fluorescence spectrum of the compound was studied at different excitation wavelengths in dimethylsulphoxide. The stability of the molecule arising from intramolecular electron transfer by hyperconjugation was studied with natural bond orbital analysis. Frontier molecular orbitals and molecular electrostatic potentials of the compound gave an idea of its reactive behaviour. Prediction of activity spectra followed by docking analysis indicated that the molecule is active as an arylacetonitrilase inhibitor.
... PASS (Prediction of Activity Spectra) [32] gives the activities feruloyl esterase inhibitor, bisphosphoglycerate phosphatase inhibitor and prolylaminopeptidase inhibitor (activity values 0.934, 0.931 and 0.930), and the corresponding receptors, 3WMT, 2H4Z and 2EEP, are used for docking. The PatchDock server is used for docking [33,34,35,36], and the PatchDock algorithm has three major steps: molecular shape representation, surface patch matching, and filtering and scoring [37,38,39]. Feruloyl esterases (FAEs) (3WMT) are carboxyl esterases which enhance the hydrolysis of ester bonds between ferulic acid and polysaccharides present in the plant cell wall [40]. ...
Article
The structural, spectroscopic, physico-chemical and biological characteristics of the organic molecule benzil (BZL) and its derivatives, 1,2-bis(4-methylphenyl)-1,2-ethanedione (DMB), 4,4'-difluorobenzil (DFB), 4,4'-dichlorobenzil (DCB) and 4,4'-dibromobenzil (DBB), have been studied by various computational methods. The experimental and scaled simulated Raman and IR spectra were compared and found to be in close agreement. Assignments of important peaks are also presented. Detailed information pertaining to the local and global reactivity and other properties, such as electrophilic and nucleophilic characteristics, was analysed. The hyperactive pressure was measured in terms of polarizability, and the corresponding biological properties were validated to identify reactive sites. Prediction of Activity Spectra for Substances (PASS) predicts the biological activity of the compounds, and it is found that the candidate molecules can be used as feruloyl esterase, bisphosphoglycerate phosphatase and prolylaminopeptidase inhibitors. The crystal structures of those receptors were taken from the Protein Data Bank, and docking studies indicate that the candidate molecules form stable complexes with the receptors. Light harvesting efficiency, followed by photovoltaic modelling, shows that DMB is the best compound for use in a DSSC.
... Receptors, 3DY9, 4NZ2 and 2AYR were obtained from the protein data bank website. PatchDock Server is used for docking purpose [23,24,25,26]. ...
Article
The organic molecule tenoxicam and the similar derivatives piroxicam and isoxicam have been studied by quantum chemical theory (DFT), FT-Raman and FT-IR. The charge transfer inside the molecules is obtained from the FMO energies. The UV-Vis spectra of the compounds are simulated to study the electronic transitions in the target molecules. Natural bond orbital (NBO) analysis is used to obtain the charge delocalization arising from hyperconjugative interactions and the stability of the molecules. The first order hyperpolarizability of piroxicam is higher than that of isoxicam and tenoxicam. The reactive areas are thoroughly studied by MEP. Prediction of Activity Spectra gives the activities anti-inflammatory, CYP2C9 substrate and gout treatment. The docked ligands form stable complexes with the receptors. Keywords: Organic chemistry, Theoretical chemistry, Pharmaceutical chemistry, DFT, MEP, FT-IR, FT-Raman, Molecular docking
... The protein-protein benchmark set, collected by the Weng lab [8][9][10][11], has become well established for testing docking methods. The benchmark consists of non-redundant, high-quality structures of protein-protein complexes along with the unbound structures of their components. ...
Article
A number of well-established servers perform 'free' docking of proteins of known structures. In contrast, template-based docking can start from sequences if structures are available for complexes that are homologous to the target. On the basis of the results of the CAPRI-CASP structure prediction experiments, template-based methods yield more accurate predictions if good templates can be found, but generally fail without such templates. However, free global docking, or focused docking around even poor quality template-based models, can still generate acceptable docked structures in these cases. In accordance with the analysis of a benchmark set, free docking of heterodimers yields acceptable or better predictions in the top 10 models for around 40% of structures. However, it is likely that a combination of template-based and free docking methods can perform better for targets that have template structures available. Another way of improving the reliability of predictions is adding experimental information as restraints, an option built into several docking servers.
... Therefore, collecting a protein-protein complex database is a challenging task, because comprehensive consideration of the protein family, the type of protein-protein interaction and the characteristics of the interface is needed. However, some databases do select protein-protein complexes for theoretical research, such as the protein-protein complexes in PDBbind [34], 2P2I-DB [35,36] and the ZDOCK benchmark [37]. The protein-protein complexes in ZDOCK benchmark 4.0 [38] were chosen to develop the HawkRank scoring function in our study. ...
Article
Deciphering the structural determinants of protein–protein interactions (PPIs) is essential to gain a deep understanding of many important biological functions in the living cells. Computational approaches for the structural modeling of PPIs, such as protein–protein docking, are quite needed to complement existing experimental techniques. The reliability of a protein–protein docking method is dependent on the ability of the scoring function to accurately distinguish the near-native binding structures from a huge number of decoys. In this study, we developed HawkRank, a novel scoring function designed for the sampling stage of protein–protein docking by summing the contributions from several energy terms, including van der Waals potentials, electrostatic potentials and desolvation potentials. First, based on the solvation free energies predicted by the Generalized Born model for ~ 800 proteins, a SASA (solvent accessible surface area)-based solvation model was developed, which can give the aqueous solvation free energies for proteins by summing the contributions of 21 atom types. Then, the van der Waals potentials and electrostatic potentials based on the Amber ff14SB force field were computed. Finally, the HawkRank scoring function was derived by determining the most optimal weights for five energy terms based on the training set. Here, MSR (modified success rate), a novel protein–protein scoring quality index, was used to assess the performance of HawkRank and three other popular protein–protein scoring functions, including ZRANK, FireDock and dDFIRE. The results show that HawkRank outperformed the other three scoring functions according to the total number of hits and MSR. HawkRank is available at http://cadd.zju.edu.cn/programs/hawkrank.
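The idea of a scoring function built as a weighted sum of energy terms, then used to rank decoys, can be illustrated with the generic sketch below. It is not HawkRank itself: the term ordering, the weights, the lower-is-better convention, and all function names are hypothetical choices for the example.

```python
def weighted_score(energy_terms, weights):
    """Linear combination of energy terms (lower is taken to be better here)."""
    if len(energy_terms) != len(weights):
        raise ValueError("energy_terms and weights must have the same length")
    return sum(w * e for w, e in zip(weights, energy_terms))


def rank_decoys(decoys, weights):
    """Return decoy names sorted from best (lowest score) to worst.

    decoys: dict mapping a decoy name to its list of energy terms.
    """
    return sorted(decoys, key=lambda name: weighted_score(decoys[name], weights))


def has_hit_in_top_n(ranked_names, near_native, n):
    """True if any of the first n ranked decoys is a known near-native structure."""
    return any(name in near_native for name in ranked_names[:n])


# Hypothetical five-term energies and weights, for illustration only.
example_weights = [1.0, 0.5, 0.5, 0.2, 0.2]
example_decoys = {
    "decoy_1": [-10.0, -2.0, 1.0, 0.5, 0.0],
    "decoy_2": [-4.0, -1.0, 3.0, 0.0, 1.0],
}
ranking = rank_decoys(example_decoys, example_weights)
```

A success rate over a benchmark would then be the fraction of complexes for which `has_hit_in_top_n` returns true at a chosen `n`.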
... To corroborate the performances we obtained on DF and CF, we compare the above results with the predictive power of atomic potentials. We select two tools widely used in the scientific community for this purpose: PISA (Krissinel and Henrick, 2007), combining energetic and entropic terms weighted by the number of contacts; and ZRANK (Pierce and Weng, 2007), a sum of energy terms, where weights are learnt from decoy sets built on 15 complexes of PPDB v1.0 (Chen et al., 2003). ...
Article
Motivation: Large-scale computational docking will be increasingly used in future years to discriminate protein-protein interactions at the residue resolution. Complete cross-docking experiments make in silico reconstruction of protein-protein interaction networks a feasible goal. They call for efficient and accurate screening of the millions of structural conformations produced by the calculations. Results: We propose CIPS (Combined Interface Propensity for decoy Scoring), a new pair potential combining interface composition with residue-residue contact preference. CIPS outperforms several other methods on screening docking solutions obtained either with all-atom or with coarse-grain rigid docking. Further testing on 28 CAPRI targets corroborates CIPS predictive power over existing methods. By combining CIPS with atomic potentials, discrimination of correct conformations in all-atom structures reaches optimal accuracy. The drastic reduction of candidate solutions produced by thousands of proteins docked against each other makes large-scale docking accessible to analysis. Availability: CIPS source code is freely available at http://www.lcqb.upmc.fr/CIPS. Contact: alessandra.carbone@lip6.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
... If there is data about the complex in a protein-protein docking benchmark (e.g. [4]), user can compare her docking with the benchmark and check how successful her docking is. ...
Article
Proteins are large molecules that are vital for all living organisms and they are essential components of many industrial products. The process of binding a protein to another is called protein-protein docking. Many automated algorithms have been proposed to find docking configurations that might yield promising protein-protein complexes. However, these automated methods are likely to come up with false positives and have high computational costs. Consequently, Virtual Reality has been used to take advantage of user's experience on the problem; and proposed applications can be further improved. Haptic devices have been used for molecular docking problems; but they are inappropriate for protein-protein docking due to their workspace limitations. Instead of haptic rendering of forces, we provide a novel visual feedback for simulating physicochemical forces of proteins. We propose an interactive 3D application, DockPro, which enables domain experts to come up with dockings of protein-protein couples by using magnetic trackers and gloves in front of a large display.
Thesis
The human adaptive immune system has evolved to provide a sophisticated response to a vast body of pathogenic microbes and toxic substances. The primary mediators of this response are T and B lymphocytes. Antigenic peptides presented at the surface of infected cells by major histocompatibility complex (MHC) molecules are recognised by T cell receptors (TCRs) with exceptional specificity. This specificity arises from the enormous diversity in TCR sequence and structure generated through an imprecise process of somatic gene recombination that takes place during T cell development. Quantification of the TCR repertoire through the analysis of data produced by high-throughput RNA sequencing allows for a characterisation of the immune response to disease over time and between patients, and the development of methods for diagnosis and therapeutic design. The latest version of the software package Decombinator extracts and quantifies the TCR repertoire with improved accuracy and compatibility with complementary experimental protocols and external computational tools. The software has been extended for analysis of fragmented short-read data from single cells, comparing favourably with two alternative tools. The development of cell-based therapeutics and vaccines is incomplete without an understanding of molecular level interactions. The breadth of TCR diversity and cross-reactivity presents a barrier for comprehensive structural resolution of the repertoire by traditional means. Computational modelling of TCR structures and TCR-pMHC complexes provides an efficient alternative. Four general-purpose protein-protein docking platforms were compared in their ability to accurately model TCR-pMHC complexes. Each platform was evaluated against an expanded benchmark of docking test cases and in the context of varying additional information about the binding interface. Continual innovation in structural modelling techniques sets the stage for novel automated tools for TCR design. A prototype platform has been developed, integrating structural modelling and an optimisation routine, to engineer desirable features into TCR and TCR-pMHC complex models.
Chapter
Public cloud computing environments, such as Amazon Web Services, Microsoft Azure, and the Google Cloud Platform, have achieved remarkable improvements in computational performance in recent years and are also expected to be able to perform massively parallel computing. As the cloud enables users to use thousands of CPU cores and GPU accelerators casually, and various software types can be used very easily by cloud images, the cloud is beginning to be used in the field of bioinformatics. In this study, we ported the original protein–protein interaction prediction (protein–protein docking) software, MEGADOCK, into Microsoft Azure as an example of an HPC cloud environment. A cloud parallel computing environment with up to 1600 CPU cores and 960 GPUs was constructed using four CPU instance types and two GPU instance types, and the parallel computing performance was evaluated. Our MEGADOCK on Azure system showed a strong scaling value of 0.93 for the CPU instance when H16 instance with 100 instances was used compared to 50 and a strong scaling value of 0.89 for the GPU instance when NC24 instance with 20 was used compared to 5. Moreover, the results of the usage fee and total computation time supported that using a GPU instance reduced the computation time of MEGADOCK and the cloud usage fee required for the computation. The developed environment deployed on the cloud is highly portable, making it suitable for applications in which an on-demand and large-scale HPC environment is desirable.
Chapter
This study offers a step-by-step practical procedure, from the analysis of the current status of the spare parts inventory system to advanced service level analysis by means of a simulation-optimization technique, for a real-world case study associated with a seaport. The remarkable variety and immense diversity, on one hand, and the extreme complexities not only in consumption patterns but also in the supply of spare parts in an international port with technically advanced port operator machinery, on the other hand, have convinced the managers to deal with this issue in a structural framework. The huge volume of available data requires cleaning and classification before it can be processed to derive reorder point (ROP) estimation, reorder quantity (ROQ) estimation, and the associated service level analysis. Finally, from 247,000 items used over 9 years, 1416 inventory items are selected by an ABC analysis integrated with the analytic hierarchy process (AHP), yielding the main items that need to be kept under strict inventory control. The ROPs and the pertinent quantities are simulated with Arena software for all the main items, each of which took approximately 30 minutes of run time on a personal computer to determine near-optimal estimations.
Article
Two very important compounds, diethylstilbestrol (DESB) and diethylstilbestrol dimethyl ether (DESME), were studied in the present work. DFT with the B3LYP functional was used to study the vibrational spectra in detail, along with other quantum mechanical studies. The compounds were found to interact with a graphene monolayer, and the results show an enhancement in various physicochemical descriptors and in the Raman modes. The enhancement of the polarizability values of the molecule-graphene complex systems is responsible for the enhancement in Raman intensity of different vibrational modes. The global reactivity descriptors were determined. The compounds are efficient non-linear active materials. DESB and DESME are docked with the androgen receptor for prostate cancer, and the amino acid interactions are reported. UV spectra were simulated using TD-DFT and are used to predict the light-harvesting efficiency of the compounds and their ability to act as photosensitizers in solar cells.
Article
Protein docking protocols typically involve global docking scan, followed by re‐ranking of the scan predictions by more accurate scoring functions that are either computationally too expensive or algorithmically impossible to include in the global scan. Development and validation of scoring methodologies are often performed on scoring benchmark sets (docking decoys) which offer concise and nonredundant representation of the global docking scan output for a large and diverse set of protein‐protein complexes. Two such protein‐protein scoring benchmarks were built for the Dockground resource, which contains various datasets for the development and testing of protein docking methodologies. One set was generated based on the Dockground unbound docking benchmark 4, and the other based on protein models from the Dockground model‐model benchmark 2. The docking decoys were designed to reflect the reality of the real‐case docking applications (e.g., correct docking predictions defined as near‐native rather than native structures), and to minimize applicability of approaches not directly related to the development of scoring functions (reducing clustering of predictions in the binding funnel and disparity in structural quality of the near‐native and non‐native matches). The sets were further characterized by the source organism and the function of the protein‐protein complexes. The sets, freely available to the research community on the Dockground webpage, present a unique, user‐friendly resource for the developing and testing of protein‐protein scoring approaches.
Article
(E)-4-((4-bromobenzylidene) amino)-N-(pyrimidin-2-yl) benzenesulfonamide (4BRDA) was synthesized by condensation of 4-bromobenzaldehyde and sulfadiazine. The compound was characterized experimentally by FTIR and electronic spectra. The calculations used DFT with the B3LYP functional and the 6-311++G(d,p) basis set. The IR spectrum was computed at the B3LYP/6-311++G(d,p) level, and the UV–Vis spectrum was computed with TD-DFT (time-dependent density functional theory) at the same basis set level, using the IEFPCM solvation model with dimethyl sulfoxide as the solvent. ADME properties were assessed with the SwissADME online tool. Molecular docking was carried out with the PatchDock online server. The Multiwfn software was used to compute the ELF, RDG and LOL.
Article
Spectroscopic investigations of 1-phenyl −2,3-dimethyl-5-oxo-1,2-dihydro-1H-pyrazol-4-ammonium 2[(2-carboxyphenyl) disulfanyl]benzoate (PACB) reported experimentally and theoretically. NH-O interaction is observed and there is a very large downshift for NH-O stretching frequency. Reactive sites are identified from the chemical and electronic properties. For PACB the maximum repulsion was around H33, H55 and H57 atom. LOL shows red regions between C-C and blue around C atoms are surrounded by a delocalized electron cloud. The red ring is a hallmark of electron density depletion from the NCI plot due to electrostatic repulsion and its existences suggests that coordination sphere for PACB is minimally strained around the central ion. Atomic contact energy values and high score of the docking results obtained propose that, PACB may have inhibitory properties and have a significant function in pharmacological chemistry. Molecular dynamics simulation was performed to validate the stability of the title compound with the Bovine thrombin-activatable fibrinolysis inhibitor protein. Communicated by Ramaswamy H. Sarma
Article
This study reports the experimental and computational investigation on the binding of a common anticancer drug, gemcitabine, with the model plasma protein, bovine serum albumin (BSA). Several experimental and computational methods, such as intrinsic and synchronous fluorescence, UV-visible, and circular dichroism spectroscopies, consensus molecular docking and molecular dynamics simulation have been employed to elucidate the binding mechanism. Gemcitabine altered the UV-visible spectrum of BSA, which is a clear indication of the complex formation between them. The visual inspection of observed fluorescence quenching results at λex = 280 nm and 295 nm has shown the substantial involvement of tyrosine residue, even larger than tryptophan. However, after the correction of inner filter effect of the observed data, it became clear that tyrosine has a negligible role in quenching. A 20-fold decrease in quenching constant was found in the corrected data, as compared to the observed data at λex = 280 nm. There was a 1:1 weak binding between BSA and gemcitabine accompanied by dynamic quenching. The secondary structure of BSA remained almost intact in the presence of gemcitabine. The primary binding site of gemcitabine inside BSA was the drug binding site 2 or DS II, which is located in the subdomain 3 A. MD Simulation results suggested that gemcitabine doesn’t affect or deviate the structure of BSA upon interaction throughout 100 ns time period. The dominating intermolecular forces were hydrophobic forces and hydrogen bonding. A small change in the frontier molecular orbitals of gemcitabine was also observed after its binding with BSA. Communicated by Ramaswamy H. Sarma
Article
In this study, a solvent-assisted co-grinding method is used to form the cocrystal of 8-hydroxy quinoline-5-sulfonic acid (HQS) and 5-chloro-8-hydroxyquinoline (CHQ). A theoretical characterization has been carried out to determine its spectroscopic and electronic properties. Charge delocalization patterns and second-order perturbation energies of the most interacting orbitals have also been computed. Geometrical parameters are in agreement with reported values. The molecular electrostatic potential plot predicts the reactive sites: the electropositive potential region is around the hydrogen atoms bonded to the nitrogen atoms, and negative potentials lie on the oxygen atoms and phenyl rings. Molecular docking of the HQS-CHQ molecule has been performed with the receptors 3C52, 4IIT, 3QYD and 4FGY.
Chapter
Natural products (NPs) are one of the key resources for ancient as well as modern remedies. Today, many scientists aim to discover the mechanisms of action behind the observed effects of plant-based medicines. Modern drug discovery till today profits from the advantageous properties of nature-derived compounds. However, NP-based activities are often challenging to define due to unspecific and multi-target effects. Molecular docking is often used as a computational and easily accessible method to propose a binding mode of NPs on a protein target. By revealing the interaction points between ligand and target, it enables one to determine the structural elements, responsible for a specific activity. Binding site and pharmacophore similarities are used to explain the multi-target effects of NPs. This knowledge allows us to simulate the effects of possible synthetic structural modifications to optimize desirable activities and to exclude undesired targets. Furthermore, docking can also be used to explore different targets to determine the most likely one for a ligand to bind, in virtual target fishing setups. In addition, this technique is often combined with other molecular modeling approaches, e.g., pharmacophore modeling or molecular dynamics simulations to predict activity and to explore binding mechanisms. Robust molecular docking studies, however, require a thorough analysis of the available data on the target and known binding modes. As the theoretical approach aims to calculate the direct interaction between a small molecule ligand and a protein target, it is vital that the experimental setup corresponds to this premise and does not measure a more general activity. In the current chapter, we want to present the methods and requirements for successful NP docking and to highlight state-of-the-art docking studies performed on NPs, which resulted in significant, and ideally experimentally validated, insights into their mechanism of action.
Article
Cocrystals have immense applications in crystal engineering and pharmaceutical chemistry. Sulfathiazole is found to form cocrystals with theophylline (S1) and sulfanilamide (S2). The experimental and computed values were assigned by potential energy distribution. Further, natural bond orbital analysis, nonlinear optical properties, frontier molecular orbitals and molecular geometries were also calculated. Frontier orbital energies are used to predict the energy properties and model the possible charge transfer between the cocrystal constituents. The molecular electrostatic potential (MEP) surface reveals the various reactive surfaces in the cocrystal system, which is very important in deciding various biological activities. The ultraviolet-visible (UV-Vis) spectra show the possible electronic transitions of the molecules. Electronic spectra simulated using the time dependent density functional theory (TDDFT) method with the Coulomb-attenuating method, Becke 3-parameter, Lee-Yang-Parr (CAM-B3LYP) functional were used to investigate the suitability of the cocrystals for use in dye-sensitized solar cells (DSSC). Moreover, docking indicates that the S1 and S2 cocrystals act as potential inhibitors, which paves the way for developing effective drugs.
Article
From phenotype to structure: Much insight has come from structures of macromolecular complexes determined by methods such as crystallography or cryo–electron microscopy. However, looking at transient complexes remains challenging, as does determining structures in the context of the cellular environment. Braberg et al. used an integrative approach in which they mapped the phenotypic profiles of a comprehensive set of mutants in a protein complex in the context of gene deletions or environmental perturbations (see the Perspective by Wang). By associating the similarity between phenotypic profiles with the distance between residues, they determined structures for the yeast histone H3-H4 complex, subunits Rpb1-Rpb2 of yeast RNA polymerase II, and subunits RpoB-RpoC of bacterial RNA polymerase. Comparison with known structures shows that the accuracy is comparable to structures determined based on chemical cross-links. Science, this issue p. eaaz4910; see also p. 1269
Article
The benzimidazole derivative flubendazole (FD1) and the similar derivatives mebendazole (MD2) and ricobendazole (RD3) have been studied using various computational tools to analyze their geometry and spectral characteristics. The reactive descriptors obtained from the FMO analysis predict the reactive nature of the compounds. The various lone pair/sigma to pi conjugations were analyzed using the NBO formalism, which provides valuable information about intra-molecular electron transfer, vital in predicting the inherent stability of the molecule. Nucleophilic and electrophilic regions of the molecules are identified using MESP, which adds to the reactivity information. The compounds were found to interact with a graphene monolayer, and the results show an enhancement in various physicochemical descriptors and in the surface-enhanced Raman spectra (SERS). Prediction of Activity Spectra provides the activities glutathione peroxidase inhibitor, general pump inhibitor and membrane permeability inhibitor for FD1; glutathione peroxidase inhibitor, antihelmintic and antiparasitic for MD2; and antihelmintic, antiparasitic and catalase inhibitor for RD3. The compounds form stable complexes on docking with the receptors. Results also indicate that the ligands adsorbed over graphene form stable complexes with the receptors, as indicated by the high binding affinity energy values.
Article
Cocrystals have immense applications in crystal engineering and pharmaceutical chemistry. Hydrochlorothiazide is found to form cocrystals with picolinamide (H1), tetramethylpyrazine (H2) and piperazine (H3). The cocrystals were characterized using IR spectra and quantum mechanical calculations of geometry and other properties. Frontier orbital energies are used to predict the energy properties and model the possible charge transfer between the constituents of the cocrystal. The frontier molecular orbital analysis indicates the chemical reactivity and bioactivity of the cocrystals. The MEP surface reveals the various reactive surfaces in the cocrystal system, which is very important in deciding various biological activities. The UV-Vis spectra show the possible electronic transitions of the molecules. Electronic spectra simulated using the TDDFT method with the CAM-B3LYP functional were used to investigate the suitability of the cocrystals for use in DSSC. Moreover, the molecular docking analysis indicates that the cocrystals can act as potential inhibitors, which paves the way for developing effective drugs.
Chapter
Databases of protein–protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model–model complexes, and docking decoys. The datasets are available to the user community through a Web interface.
Article
Protein‐protein interactions (PPIs) are ubiquitous and functionally of great importance in biological systems. Hence, the accurate prediction of PPIs by protein‐protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into the sampling of docking poses to produce at least one near‐native structure, then to evaluate the vast candidate structures by scoring. Concurrent development in both sampling and scoring is crucial for the deployment of protein‐protein docking software. In the present work, we apply a machine learning model on pairwise potentials to refine the task of protein quaternary structure native structure detection among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the optimal performance for native detection among decoys compared to conventional scoring functions, while exhibiting lesser performance for the detection of low root mean square deviation (RMSD) decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. Scripts used are available at: https://github.com/TanemuraKiyoto/PPI‐native‐detection‐via‐LR .
Article
Proteins in their native states can be represented as ensembles of conformers in dynamical equilibrium. Thermal fluctuations are responsible for transitions between these conformers. Normal mode analysis (NMA) using elastic network models (ENM) provides an efficient procedure to explore the global dynamics of proteins commonly associated with conformational transitions. In the present work, we present an iterative approach to explore protein conformational spaces by introducing structural distortions according to their equilibrium dynamics at room temperature. The approach can be used either to perform unbiased explorations of conformational space or to explore guided pathways connecting two different conformations, e.g., apo and holo forms. To assess its performance, four proteins with different magnitudes of structural distortion upon ligand binding were tested. In all cases, the conformational selection model was confirmed and the conformational space between apo and holo forms was covered. Different strategies were tested that affect either the efficiency of achieving a desired conformational change or the balance of the exploration of the protein's conformational multiplicity.
Article
Theoretical calculations were carried out using density functional theory to determine the vibrational frequencies, infrared and Raman intensities, and MEP, NLO and NBO properties of the hydrochlorothiazide-isoniazid (HCTA-IN) and hydrochlorothiazide-malonamide (HCTA-MA) cocrystals. Electron donor-acceptor interactions arising from the charge-transfer mechanism were examined by NBO analysis. The first-order hyperpolarizabilities of HCTA-IN and HCTA-MA are 19.86 and 7.80 times that of urea, respectively. The downshifts of the NH2, NH and CO modes are due to strong hyperconjugative interactions, as indicated by the NBO analysis. TD-DFT analysis was used to generate the theoretical electronic spectra, which show a charge-transfer process between the thiazide and isoniazid in the HCTA-IN complex. Light-harvesting efficiency studies reveal efficient photosensitization potential, and photochemical modeling supports the suitability of the cocrystal for use in dye-sensitized solar cells. The title compounds are docked with the glucocorticoid receptor (1NHZ) and manganese-dependent peroxidase I (1YYD).
Article
Multi-protein machines are responsible for most cellular tasks, and many efforts have been invested in the systematic identification and characterization of thousands of these macromolecular assemblies. However, unfortunately, the (quasi) atomic details necessary to understand their function are available only for a tiny fraction of the known complexes. The computational biology community is developing strategies to integrate structural data of different nature, from electron microscopy to X-ray crystallography, to model large molecular machines, as it has been done for individual proteins and interactions with remarkable success. However, unlike for binary interactions, there is no reliable gold-standard set of three-dimensional (3D) complexes to benchmark the performance of these methodologies and detect their limitations. Here, we present a strategy to dynamically generate non-redundant sets of 3D heteromeric complexes with three or more components. By changing the values of sequence identity and component overlap between assemblies required to define complex redundancy, we can create sets of representative complexes with known 3D structure (i.e., target complexes). Using an identity threshold of 20% and imposing a fraction of component overlap of <0.5, we identify 495 unique target complexes, which represent a real non-redundant set of heteromeric assemblies with known 3D structure. Moreover, for each target complex, we also identify a set of assemblies, of varying degrees of identity and component overlap, that can be readily used as input in a complex modeling exercise (i.e., template subcomplexes). We hope that resources like this will significantly help the development and progress assessment of novel methodologies, as docking benchmarks and blind prediction contests did. The interactive resource is accessible at https://DynBench3D.irbbarcelona.org.
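The redundancy filter described above can be illustrated with a small sketch. It assumes that each complex has already been reduced to a set of component cluster labels (components grouped at the chosen sequence-identity threshold), that overlap is measured relative to the smaller complex, and that representatives are picked greedily, largest first. All three choices, along with the function names and the toy data, are assumptions for illustration; the abstract does not specify how the resource computes them.

```python
def overlap_fraction(a, b):
    """Fraction of the smaller complex's components that the two complexes share."""
    return len(a & b) / min(len(a), len(b))


def select_representatives(complexes, max_overlap=0.5):
    """Greedily keep complexes whose overlap with every kept complex is below max_overlap."""
    # Visit larger assemblies first so that they become the representatives.
    ordered = sorted(complexes, key=lambda name: -len(complexes[name]))
    kept = []
    for name in ordered:
        if all(overlap_fraction(complexes[name], complexes[k]) < max_overlap for k in kept):
            kept.append(name)
    return kept


# Hypothetical complexes, each a set of component cluster labels.
complexes = {
    "c1": {"fam1", "fam2", "fam3"},
    "c2": {"fam1", "fam2", "fam4"},  # shares 2 of 3 components with c1
    "c3": {"fam7", "fam8", "fam9"},  # shares nothing with c1
}
print(select_representatives(complexes))
```

Here c2 is treated as redundant with c1 because two thirds of its components overlap, while c3 is retained as a second target complex.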
Article
Bioinformatic tools are widely used to manage the enormous volume of genomic and proteomic data, in tasks including DNA/protein sequence management, drug design, homology modelling, motif/domain prediction, docking, annotation and dynamic simulation. Bioinformatics offers a wide range of applications in numerous disciplines such as genomics, proteomics, comparative genomics, nutrigenomics, microbial genome studies, biodefense and forensics. It thus offers a promising route to accelerating scientific research in biotechnology.
Article
Aim: Scoring functions are an important component of protein-protein docking methods. They need to be evaluated on high-quality benchmarks to reveal their strengths and weaknesses. Evaluation results obtained on such benchmarks can provide valuable guidance for developing more advanced scoring functions. Methodology & results: In our comparative assessment of scoring functions for protein-protein interactions (CASF-PPI) benchmark, the performance of a scoring function was characterized by 'docking power' and 'scoring power'. A high-quality dataset of 273 protein-protein complexes was compiled and employed in both tests. Four scoring functions, FASTCONTACT, ZRANK, dDFIRE and ATTRACT, were tested as a demonstration. ZRANK and ATTRACT exhibited encouraging performance in the docking power test. However, all four scoring functions failed badly in the scoring power test. Conclusion: The CASF-PPI benchmark was created specifically for assessing scoring functions applicable to protein-protein interactions. It differs from other benchmarks, which assess protein-protein docking methods. Our benchmark is available to the public at www.pdbbind-cn.org/download/CASF-PPI/ .
Article
Ab initio protein-protein docking algorithms often rely on experimental data to identify the most likely complex structure. We integrated protein-protein docking with experimental data from chemical cross-linking followed by mass spectrometry. We tested our approach using 19 cases that resulted from an exhaustive search of the Protein Data Bank for protein complexes with cross-links identified in our experiments. We implemented cross-links as constraints based on Euclidean distance or void-volume distance. For most test cases, the rank of the top-scoring near-native prediction improved at least twofold compared with docking without the cross-link information, and the success rate for the top 5 predictions nearly tripled. Our results demonstrate the delicate balance between retaining correct predictions and eliminating false positives. Several test cases had multiple components with distinct interfaces, and we present an approach for assigning cross-links to the interfaces. Employing the symmetry information for these cases further improved the performance of complex structure prediction.
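As a rough illustration of the Euclidean-distance variant of the cross-link constraint, the sketch below keeps only the docking poses in which every cross-linked residue pair lies within a cutoff. The pose representation (a dict from residue label to Cα coordinates), the function names, and the 30 Å cutoff are assumptions chosen for illustration and are not taken from the paper; the void-volume distance is not covered.

```python
import math


def satisfies_crosslinks(pose, crosslinks, max_dist=30.0):
    """True if every cross-linked residue pair in the pose is within max_dist (in Å)."""
    return all(math.dist(pose[r1], pose[r2]) <= max_dist for r1, r2 in crosslinks)


def filter_poses(poses, crosslinks, max_dist=30.0):
    """Keep the poses, in their original rank order, that satisfy all cross-links."""
    return [pose for pose in poses if satisfies_crosslinks(pose, crosslinks, max_dist)]


# Hypothetical residue labels and coordinates for two docking poses.
pose_near = {"A:K10": (0.0, 0.0, 0.0), "B:K55": (10.0, 0.0, 0.0)}
pose_far = {"A:K10": (0.0, 0.0, 0.0), "B:K55": (40.0, 0.0, 0.0)}
crosslinks = [("A:K10", "B:K55")]

print(len(filter_poses([pose_near, pose_far], crosslinks)))
```

Because the filter preserves rank order, the rank of the best near-native pose can only improve or stay the same, provided that pose itself satisfies the constraints.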
Thesis
Although protein-protein docking is becoming an essential tool for addressing current biological questions, two difficulties remain inherent to present methods: (1) most of these methods do not consider the possible internal deformations of proteins during their association; (2) it is not always straightforward to translate information from the literature or from experiments into constraints that can be integrated into docking programs. We therefore attempted to develop an approach for improving existing docking programs, drawing on the methodologies set up for the concrete cases treated during this thesis. First, through the construction of the ERBIN PDZ/Smad3 MH2 complex, we tested the usefulness of explicit-solvent molecular dynamics for highlighting residues important for the interaction. We then extended this investigation by using various docking servers followed by explicit-solvent molecular dynamics to target a consensus result. Finally, we tried refinement by explicit-solvent molecular dynamics on a target from the CAPRI challenge and compared the results with short Monte Carlo simulations. The last part of this thesis concerned the development of a new tool for visualizing the molecular surface. This program, named MetaMol, displays a new type of molecular surface: the molecular skin surface. Distributing the computations across both the computer's processor (CPU) and those of the graphics card (GPU) reduces computation times, allowing real-time visualization of molecular surface deformations.
Article
Protein-protein interactions are essential for biological function, but structures of protein-protein complexes are difficult to obtain experimentally. To derive the protein complex of the DNA-repair enzyme human uracil-DNA-glycosylase (hUNG) with its protein inhibitor (UGI), we combined rigid-body computational docking with hydrogen/deuterium exchange mass spectrometry (DXMS). Computational docking of the unbound protein structures provides a list of possible three-dimensional models of the complex; DXMS identifies solvent-protected protein residues. DXMS showed that unbound hUNG is compactly folded, but unbound UGI is loosely packed. Increased solvent protection of hUNG in the complex was localized to four regions on the same face. The decrease in incorporated deuterons was quantitatively interpreted as the minimum number of main-chain hUNG amides buried in the protein-protein interface. Deuteration of complexed UGI decreased throughout the protein chain, indicating both tighter packing and direct solvent protection by hUNG. Three UGI regions showing the greatest decrease were best interpreted leniently, requiring just one main-chain amide from each in the interface. Applying the DXMS constraints as filters to a list of docked complexes gave the correct complex as the largest favorable-energy cluster. Thus, identification of approximate protein interfaces was sufficient to distinguish the protein complex. Surprisingly, incorporating the DXMS data as added favorable potentials in the docking calculation was less effective at finding the correct complex. The filtering method has greater flexibility, with the capability to test each constraint and enforce simultaneous contact by multiple regions, but with the caveat that the list from the unbiased docking must include correct complexes.
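The filtering step described above can be sketched as follows. Each docked model is represented by the set of residue numbers found in its interface, and each solvent-protected region by a set of residue numbers; a model passes only if every region contributes at least a minimum number of residues, which mirrors the lenient "one amide per region" reading and the requirement of simultaneous contact by multiple regions. The data layout, function names, and residue numbers are hypothetical and serve only to illustrate the idea.

```python
def passes_filters(interface_residues, protected_regions, min_per_region=1):
    """True if every protected region has at least min_per_region residues in the interface."""
    return all(
        len(interface_residues & region) >= min_per_region
        for region in protected_regions
    )


def filter_models(models, protected_regions, min_per_region=1):
    """Return the names of docked models whose interface touches every protected region."""
    return [
        name
        for name, interface in models.items()
        if passes_filters(interface, protected_regions, min_per_region)
    ]


# Hypothetical protected regions and docked-model interfaces (residue numbers).
regions = [{20, 21, 22}, {58, 59, 60}]
models = {
    "model_1": {21, 59, 101},  # contacts both regions
    "model_2": {21, 22, 140},  # misses the second region
}
print(filter_models(models, regions))
```

As the abstract notes, a filter of this kind can only succeed if the unbiased docking list already contains correct complexes.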
Article
Motivation: With the discovery of more and more noncoding RNAs and their versatile functions, RNA-RNA interactions have received increased attention. Therefore, determination of their complex structures is valuable to understand the molecular mechanism of the interactions. Given the high cost of experimental methods, computational approaches like molecular docking have played an important role in the determination of complex structures, in which a benchmark is critical for the development of docking algorithms. Results: Meeting the need, we have developed the first comprehensive and nonredundant RNA-RNA docking benchmark (RRDB). The diverse dataset of 123 targets consists of 78 unbound-unbound and 45 bound-unbound (or unbound-bound) test cases. The dataset was classified into three groups according to the interface conformational changes between bound and unbound structures: 47 'easy', 38 'medium', and 38 'difficult' targets. A docking test with the benchmark using ZDOCK 2.1 demonstrated the challenging nature of the RNA-RNA docking problem and the important value of the present benchmark. The bound and unbound cases of the benchmark will be beneficial for the development and optimization of docking and scoring algorithms for RNA-RNA interactions. Availability: The benchmark is available at http://huanglab.phys.hust.edu.cn/RRDbenchmark/. Contact: huangsy@hust.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Article
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic, and statistical refinements permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is described for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities.
Article
Computerized molecular model building has been used to deduce the arrangement of sickle cell hemoglobin molecules (Hb-S) in the tubular fibers which form within sickling cells and in concentrated cell-free solutions of deoxygenated Hb-S. A "best" solution has been found which satisfies all of the reported properties of these fibers. In the proposed arrangement the contact between adjacent Hb-S molecules in the direction parallel to the fiber axis is primarily hydrophobic and in addition contains two salt bridges between the molecules. This contact would be disrupted with the Glu of Hb-A at the beta6 position instead of the Val of Hb-S, and it would not make a long fiber with oxygenated Hb-S. Residues in the A helix and the GH corner of the beta2 chain of one molecule are in contact with residues of the A, B, and E helices and the GH corner of the alpha1 chain of its neighbor. The intermolecular contact in the direction perpendicular to the fiber axis is mainly between the end of the E helix and the EF corner of the beta1 chain on the first molecule and the F helix and FG corner of the alpha2 chain of its neighbor. Some of the implications of these contacts are reported here, and others will be presented in subsequent papers.
Article
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
Article
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
Article
We have developed a novel, fully automatic method for aligning the three-dimensional structures of two proteins. The basic approach is to first align the proteins' secondary structure elements and then extend the alignment to include any equivalent residues found in loops or turns. The initial secondary structure element alignment is determined by a genetic algorithm. After refinement of the secondary structure element alignment, the protein backbones are superposed and a search is performed to identify any additional equivalent residues in a convergent process. Alignments are evaluated using intramolecular distance matrices. Alignments can be performed with or without sequential connectivity constraints. We have applied the method to proteins from several well-studied families: globins, immunoglobulins, serine proteases, dihydrofolate reductases, and DNA methyltransferases. Agreement with manually curated alignments is excellent. A web-based server and additional supporting information are available at http://engpub1.bu.edu/~josephs. Proteins 2000;38:428–440. © 2000 Wiley-Liss, Inc.
Chapter
In 1998, members of the Research Collaboratory for Structural Bioinformatics became the managers of the Protein Data Bank archive. This chapter details the systems used for the deposition, annotation and distribution of the data in the archive. This chapter is also available as HTML from the International Tables Online site hosted by the IUCr.
Article
An automatic procedure which generates possible modes of protein-protein association is developed and applied to the bovine pancreatic trypsin inhibitor-trypsin complex as a test case. Using a simplified model in which each residue is replaced by one interaction center, all possible modes of interaction between the inhibitor and the active center of the enzyme are generated systematically. The non-bonded interactions between the molecules and the protein surface area buried in the generated interfaces are evaluated and used as criteria for selecting stable complexes. We show that satisfactory estimates of accessible and buried surface areas can be made using the simplified model. The procedure leads to about nine structures having non-bonded interactions and buried surface areas similar to those of the native complex. This suggests that the major contributions to the free energy of dissociation are taken into account by our selection procedure, though complementarity and specificity are not properly represented in the simplified model. However, it makes it possible to scan a much larger number of configurations than would otherwise be feasible, chiefly through elimination of side-chain detail.
Article
The non-covalent assembly of proteins that fold separately is central to many biological processes, and differs from the permanent macromolecular assembly of protein subunits in oligomeric proteins. We performed an analysis of the atomic structure of the recognition sites seen in 75 protein-protein complexes of known three-dimensional structure: 24 protease-inhibitor, 19 antibody-antigen and 32 other complexes, including nine enzyme-inhibitor and 11 that are involved in signal transduction. The size of the recognition site is related to the conformational changes that occur upon association. Of the 75 complexes, 52 have "standard-size" interfaces in which the total area buried by the components in the recognition site is 1600 (±400) Å². In these complexes, association involves only small changes of conformation. Twenty complexes have "large" interfaces burying 2000 to 4660 Å², and large conformational changes are seen to occur in those cases where we can compare the structure of complexed and free components. The average interface has approximately the same non-polar character as the protein surface as a whole, and carries somewhat fewer charged groups. However, some interfaces are significantly more polar and others more non-polar than the average. Of the atoms that lose accessibility upon association, half make contacts across the interface and one-third become fully inaccessible to the solvent. In the latter case, the Voronoi volume was calculated and compared with that of atoms buried inside proteins. The ratio of the two volumes was 1.01 (±0.03) in all but 11 complexes, which shows that atoms buried at protein-protein interfaces are close-packed like the protein interior. This conclusion could be extended to the majority of interface atoms by including solvent positions determined in high-resolution X-ray structures in the calculation of Voronoi volumes. Thus, water molecules contribute to the close-packing of atoms that ensures complementarity between the two protein surfaces, as well as providing polar interactions between the two proteins.
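The interface-size classes quoted above can be made concrete with a short sketch. It assumes the usual definition of total buried area as the sum of the accessible surface areas of the free components minus that of the complex, and it reads "1600 (±400) Å²" as the range 1200 to 2000 Å². The function names, the "small" label for anything below that range, the handling of the 2000 Å² boundary, and the example areas are assumptions made for illustration.

```python
def buried_surface_area(asa_receptor, asa_ligand, asa_complex):
    """Total area (Å²) buried on both components when the complex forms."""
    return asa_receptor + asa_ligand - asa_complex


def classify_interface(buried_area):
    """Assign a size class using the ranges quoted in the abstract."""
    if buried_area > 2000.0:
        return "large"
    if buried_area >= 1200.0:
        return "standard-size"
    return "small"  # label assumed here; the abstract does not name this class


# Hypothetical accessible surface areas in Å².
area = buried_surface_area(asa_receptor=6000.0, asa_ligand=5000.0, asa_complex=9400.0)
print(area, classify_interface(area))
```

In practice the accessible surface areas would come from a solvent-accessibility program run separately on the complex and on each isolated component.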
Article
A comprehensive docking study was performed on 27 distinct protein-protein complexes. For 13 test systems, docking was performed with the unbound X-ray structures of both the receptor and the ligand. For the remaining systems, the unbound X-ray structure of only one molecule was available; therefore the bound structure for the other molecule was used. Our method optimizes desolvation, shape complementarity, and electrostatics using a Fast Fourier Transform algorithm. A global search in the rotational and translational space without any knowledge of the binding sites was performed for all proteins except nine antibodies recognizing antigens. For these antibodies, we docked their well-characterized binding site (the complementarity-determining region, defined without information about the antigen) to the entire surface of the antigen. For 24 systems, we were able to find near-native ligand orientations (interface Cα root mean square deviation less than 2.5 Å from the crystal complex) among the top 2,000 choices. For three systems, our algorithm could identify the correct complex structure unambiguously. For 13 other complexes, we either ranked a near-native structure in the top 20 or obtained 20 or more near-native structures in the top 2,000 or both. The key feature of our algorithm is the use of target functions that are highly tolerant to conformational changes upon binding. If combined with a post-processing method, our algorithm may provide a general solution to the unbound docking problem. Our program, called ZDOCK, is freely available to academic users (http://zlab.bu.edu/~rong/dock/).
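The success criteria quoted above (a near-native structure has an interface Cα RMSD below 2.5 Å; a case counts if a near-native structure ranks in the top 20 or if 20 or more appear in the top 2,000) can be expressed as a small sketch. It assumes the predictions are supplied as a list of RMSD values already sorted by docking score, best first; the function names and the toy list are hypothetical.

```python
def summarize_hits(ranked_rmsds, rmsd_cutoff=2.5, top_n=2000):
    """Return the rank of the first near-native prediction and the hit count in the top_n."""
    hit_ranks = [
        rank
        for rank, rmsd in enumerate(ranked_rmsds[:top_n], start=1)
        if rmsd < rmsd_cutoff
    ]
    first_rank = hit_ranks[0] if hit_ranks else None
    return first_rank, len(hit_ranks)


def meets_criterion(ranked_rmsds, rank_cutoff=20, min_hits=20):
    """True if a hit ranks within rank_cutoff or at least min_hits hits are found."""
    first_rank, n_hits = summarize_hits(ranked_rmsds)
    return (first_rank is not None and first_rank <= rank_cutoff) or n_hits >= min_hits


# Hypothetical interface RMSDs (Å) for predictions ranked 1 to 5.
ranked = [10.0, 8.2, 2.1, 15.0, 1.9]
print(summarize_hits(ranked), meets_criterion(ranked))
```

Reporting both the first hit rank and the hit count keeps the two halves of the criterion visible, since a case can satisfy either one independently.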
Hubbard SJ, Thornton JM. NACCESS. University College London: Department of Biochemistry and Molecular Biology; 1993.
Szustakowski JD, Weng Z. K2: protein structure comparisons and their statistical significance. In: Fogel G, Corne D, editors. Evolutionary computation in bioinformatics. San Francisco, CA: Morgan Kaufmann; 2002.