Article

Integrating Cross-Linking Experiments with Ab Initio Protein-Protein Docking

Abstract

Ab initio protein-protein docking algorithms often rely on experimental data to identify the most likely complex structure. We integrated protein-protein docking with experimental data from chemical cross-linking followed by mass spectrometry. We tested our approach using 19 cases that resulted from an exhaustive search of the Protein Data Bank for protein complexes with cross-links identified in our experiments. We implemented cross-links as constraints based on Euclidean distance or void-volume distance. For most test cases the rank of the top-scoring near-native prediction improved by at least twofold compared with docking without the cross-link information, and the success rate for the top 5 predictions nearly tripled. Our results demonstrate the delicate balance between retaining correct predictions and eliminating false positives. Several test cases had multiple components with distinct interfaces, and we present an approach for assigning cross-links to the interfaces. Employing the symmetry information for these cases further improved the performance of complex structure prediction.
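The idea of using cross-links as Euclidean distance constraints on docking predictions can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data layout (`Pose`, `CrossLink`), the function names, the 30 Å cutoff, and the `allowed_violations` parameter are all assumptions made for the example, and the void-volume distance variant is not covered.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]
Residue = Tuple[str, int]            # (chain ID, residue number), hypothetical layout
Pose = Dict[Residue, Coord]          # C-alpha coordinates of one docked model
CrossLink = Tuple[Residue, Residue]  # a pair of cross-linked residues


def violations(pose: Pose, links: List[CrossLink], max_dist: float) -> int:
    """Count cross-links whose residues are farther apart than max_dist (in angstroms)."""
    return sum(1 for r1, r2 in links if math.dist(pose[r1], pose[r2]) > max_dist)


def filter_poses(poses: List[Pose], links: List[CrossLink],
                 max_dist: float, allowed_violations: int = 0) -> List[int]:
    """Return the indices of poses, in their original rank order, that satisfy the cross-links."""
    return [i for i, pose in enumerate(poses)
            if violations(pose, links, max_dist) <= allowed_violations]


# Two toy poses: the top-ranked one violates the cross-link, the second satisfies it.
links = [(("A", 10), ("B", 20))]
poses = [
    {("A", 10): (0.0, 0.0, 0.0), ("B", 20): (50.0, 0.0, 0.0)},
    {("A", 10): (0.0, 0.0, 0.0), ("B", 20): (12.0, 0.0, 0.0)},
]
print(filter_poses(poses, links, max_dist=30.0))  # [1]
```

Raising `allowed_violations` keeps more correct predictions at the cost of passing more false positives, which is one simple way to picture the balance the abstract describes.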

... Among the most frequently used spatial restraints are those acquired from chemical cross-linking coupled with mass spectrometry (XL-MS) [6][7][8]. A cross-linker, a bi-reactive chemical component, connects specific amino acid residues located at an appropriate distance in the 3D structure of the protein complex. The length of the cross-linker defines the maximal distance between the reactive ends of the cross-linked residues. ...
... Imposing symmetry improves modeling performance. It has already been reported that cross-linking data can be used to predict whether a protein complex is symmetric, based on the identified interaction interfaces [7]. We investigated the effect on modeling performance of imposing symmetry on spatial restraints in the scoring function. ...
... Similarly to the findings of others [7,28], we showed that symmetry could be exploited to improve modeling results (Fig. 5). Both of our symmetry-imposing scoring functions, Symmetry-matched and Symmetry-difference, score the inter-subunit alternative only when both inter-subunit alternatives (A-B and B-A) are sufficiently similar. ...
Article
Background: The function of oligomeric proteins is inherently linked to their quaternary structure. In the absence of high-resolution data, low-resolution information in the form of spatial restraints can significantly contribute to the precision and accuracy of structural models obtained using computational approaches. To obtain such restraints, chemical cross-linking coupled with mass spectrometry (XL-MS) is commonly used. However, the use of XL-MS in the modeling of protein complexes composed of identical subunits (homo-oligomers) is often hindered by the inherent ambiguity of intra- and inter-subunit connection assignment. Results: We present a comprehensive evaluation of (1) different methods for inter-residue distance calculations, and (2) different approaches for the scoring of spatial restraints. Our results show that using Solvent Accessible Surface distances (SASDs) instead of Euclidean distances (EUCs) greatly reduces the assignment ambiguity and delivers better modeling precision. Furthermore, ambiguous connections should be considered as inter-subunit only when the intra-subunit alternative exceeds the distance threshold. Modeling performance can also be improved if symmetry, characteristic for most homo-oligomers, is explicitly defined in the scoring function. Conclusions: Our findings provide guidelines for proper evaluation of chemical cross-linking-based spatial restraints in modeling homo-oligomeric protein complexes, which could facilitate structural characterization of this important group of proteins.
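The assignment rule stated in this abstract (treat an ambiguous connection as inter-subunit only when the intra-subunit alternative exceeds the distance threshold) can be sketched as a small decision function. The sketch below is illustrative only: the function name, the returned labels, and the symmetry tolerance `sym_tol` are assumptions, and it does not reproduce the paper's actual Symmetry-matched or Symmetry-difference scoring functions, whose details are not given here.

```python
def classify_link(intra: float, inter_ab: float, inter_ba: float,
                  threshold: float, sym_tol: float) -> str:
    """Classify one ambiguous cross-link in a homo-dimer model.

    intra     distance (angstroms) if both residues sit in the same subunit
    inter_ab  distance for the A-B inter-subunit alternative
    inter_ba  distance for the B-A inter-subunit alternative
    threshold maximal distance the cross-linker can span
    sym_tol   hypothetical tolerance for calling A-B and B-A "similar"
    """
    # Prefer the intra-subunit explanation whenever it is geometrically possible.
    if intra <= threshold:
        return "intra"
    # Only consider inter-subunit alternatives that agree with each other (symmetry).
    if abs(inter_ab - inter_ba) > sym_tol:
        return "asymmetric"
    if max(inter_ab, inter_ba) <= threshold:
        return "inter"
    return "violated"


print(classify_link(intra=35.0, inter_ab=20.0, inter_ba=21.0,
                    threshold=30.0, sym_tol=2.0))  # inter
```

The same function works with either Euclidean or solvent accessible surface distances as input; only the way the three distances are computed changes.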
... Specialized tools have been developed to model antibody structures, owing to the unique structural properties of antibody complementarity-determining regions (CDRs) [9,10]. We hypothesize that prediction-based approaches, when combined with low-resolution structural information such as that obtained by XLMS [11], can provide useful structural models to inform on the molecular and structural basis of antibody-HLA interactions. ...
... [18] XLMS data have also been used to guide molecular docking for modeling of protein-protein interactions [7,11]. More recently, approaches combining XLMS and predictive structural modeling using AlphaFold [8] have been applied to SARS-CoV-2 [19] and in proteome-wide studies [20]. The hybrid structural modeling approach reported here uses a similar principle to incorporate XLMS data with predicted structures, with additional inputs of predicted interacting surfaces for molecular docking. ...
Article
Alloantibody recognition of donor human leukocyte antigen (HLA) is associated with poor clinical transplantation outcomes. However, the molecular and structural basis for the alloantibody-HLA interaction is not well understood. Here, we used a hybrid structural modeling approach on a previously studied alloantibody-HLA interacting pair with inputs from ab initio, in silico, and in vitro data. Highly reproducible cross-linking mass spectrometry data were obtained with both discovery and targeted mass spectrometry-based approaches. The cross-link information was then used together with predicted antibody Fv structure, predicted antibody paratope, and in silico-predicted interacting surface to model the antibody-HLA interaction. This hybrid structural modeling approach closely recapitulates the key interacting residues from a previously solved crystal structure of an alloantibody-HLA-A*11:01 pair. These results suggest that a prediction-based hybrid structural modeling approach supplemented with cross-linking mass spectrometry data can provide functionally relevant structural models to understand the structural basis of antibody-HLA mismatch in transplantation.
... This is because the underlying distributions of residue depths and Cα-Cα distances that were used to compute monolinking and crosslinking probabilities (Fig. 2) came from empirical XL-MS data from both protein monomers and complexes. Crosslinks have been widely used to screen large sets of protein-protein docking models (45,50,51), but in the case of C3, AF-MM correctly predicted the binding interface for all models, only differing in the conformation of the flexible chain. As of this writing, the problem of protein complex prediction is not as well solved as the structure prediction of single protein chains, but we might extrapolate that the former problem will soon be solved with additional algorithmic refinements and/or protein complex structures in the PDB. ...
Article
We propose a pipeline that combines AlphaFold2 (AF2) and crosslinking mass spectrometry (XL-MS) to model the structure of proteins with multiple conformations. The pipeline consists of two main steps: ensemble generation using AF2 and conformer selection using XL-MS data. For conformer selection, we developed two scores—the monolink probability score (MP) and the crosslink probability score (XLP)—both of which are based on residue depth from the protein surface. We benchmarked MP and XLP on a large dataset of decoy protein structures and showed that our scores outperform previously developed scores. We then tested our methodology on three proteins having an open and closed conformation in the Protein Data Bank: Complement component 3 (C3), luciferase, and glutamine-binding periplasmic protein, first generating ensembles using AF2, which were then screened for the open and closed conformations using experimental XL-MS data. In five out of six cases, the most accurate model within the AF2 ensembles—or a conformation within 1 Å of this model—was identified using crosslinks, as assessed through the XLP score. In the remaining case, only the monolinks (assessed through the MP score) successfully identified the open conformation of glutamine-binding periplasmic protein, and these results were further improved by including the “occupancy” of the monolinks. This serves as a compelling proof-of-concept for the effectiveness of monolinks. In contrast, the AF2 assessment score was only able to identify the most accurate conformation in two out of six cases. Our results highlight the complementarity of AF2 with experimental methods like XL-MS, with the MP and XLP scores providing reliable metrics to assess the quality of the predicted models. The MP and XLP scoring functions mentioned above are available at https://gitlab.com/topf-lab/xlms-tools.
... As shown in Figs. 3c and 4, six residues within amino acids 109−138 of ME53 can be seen docking with seven residues in GP64 [34]. ...
Article
me53, a highly conserved immediate early gene in all Lepidoptera baculoviruses, has been of great interest in recent years. Autographa californica multiple nucleopolyhedrovirus (AcMNPV) is in the family Baculoviridae, genus Alphabaculovirus. The me53 gene of AcMNPV has been sequenced, and it was transcribed late after infection. The structure of the ME53 protein and its roles in the infection of host cells were summarized and discussed, including (1) the production of budded virus (BV); (2) nucleocapsid formation in the host nuclei; (3) the formation by ME53 of a lesion on the cell membrane of AcMNPV-infected cells, where it co-locates with GP64 and the primary capsid protein VP39; and (4) the requirement of the nuclear translocation signal sequence of ME53 for optimal baculovirus production. In this review, we focus on the emerging roles of ME53 by discussing novel mechanisms mediated by or interacting with ME53, which provides an important reference for the effective transformation, utilization and improvement of the anti-insect activity of AcMNPV.
... The XLMS methodology can derive residue-level resolution on where the crosslinks occur, what proteins are crosslinked, and, with the spacer arm of the crosslinker, the distances of those crosslinks. The data captured from these analyses can help build interactome maps of protein networks, delineate binding sites (Vreven et al., 2018), and help with model building (Brodie et al., 2017). With enough crosslinks, XLMS data can even report on conformation dynamics (Mintseris and Gygi, 2020). ...
Thesis
Bacterial microcompartments (BMCs) are polyhedral, protein-based organelles present in a wide range of bacteria. This mode of compartmentalization is highly modular and can accommodate a wide range of chemistries within them, including carbon fixation. These aspects make them a promising target to serve as bioplatforms for commodity chemical synthesis and enhanced carbon fixation. However, it is challenging to investigate the structure and function of BMCs using classical methods. As such, the native structure of BMCs remains largely enigmatic, hampering their synthetic adaptation. This dissertation addresses these concerns by describing the assembly state of the model 1,2-propanediol (Pdu) BMC using a variety of approaches. Chemical probing reveals the Pdu BMC is surprisingly permeable to and permissive of derivatization. This insight enabled application of crosslinking mass spectrometry to describe its protein interactome. The interactome map reveals that small domains called encapsulation peptides dominate interior interactions while reporting on the organization of the outer protein shell. Laser scanning confocal approaches were developed to study the solution behavior of BMCs. These experiments strongly suggest that the Pdu BMC is a dynamic entity that exchanges protein elements, a result we primarily attribute to the protein shell. These confocal microscopy approaches were further used to study the superstructures formed by individual shell proteins and to describe their interactions with one another. Together, the results from this project give important insight into the assembly state of the model Pdu BMC, including its biogenesis, organization, and behavior. These data answer some of the open questions concerning the assembly of BMC structures, which will help innovate the next generation of BMC-based biotechnology tools.
... Imposing symmetry at the initial sampling instead of filtering the results at the end also leads to improvements in both the accuracy and computational time [86]. Although M-ZDOCK uses the ZDOCK scoring function [121], which does not provide the user with the ability to include experimentally determined restraints, integration of cross-linking mass spectrometry (XL-MS) data with ZDOCK was recently reported to improve docking results and even provide insight into the symmetry of the analyzed protein complex [122]. ...
Article
Protein homo-oligomerization is a very common phenomenon, and approximately half of proteins form homo-oligomeric assemblies composed of identical subunits. The vast majority of such assemblies possess internal symmetry, which can either be exploited to help or pose challenges during structure determination. Moreover, aspects of symmetry are critical in the modeling of protein homo-oligomers either by docking or by homology-based approaches. Here, we first provide a brief overview of the nature of protein homo-oligomerization. Next, we describe how the symmetry of homo-oligomers is addressed by crystallographic and non-crystallographic symmetry operations, and how biologically relevant intermolecular interactions can be deciphered from the ordered array of molecules within protein crystals. Additionally, we describe the most important aspects of protein homo-oligomerization in structure determination by NMR. Finally, we give an overview of approaches aimed at modeling homo-oligomers using computational methods that specifically address their internal symmetry and allow the incorporation of other experimental data as spatial restraints to achieve higher model reliability.
... Local 3D Zernike descriptor-based docking (LZerD), one of the top methods in CAPRI, projects 3D surfaces onto spheres to efficiently capture complementarity of protein surfaces [16]. Some rigid-body approaches exploit data from chemical cross-linking experiments [17] or small-angle X-ray scattering (SAXS) [18] to further improve discrimination of generated structures. These approaches provide fast, global exploration of the energy landscape, and in recent CAPRI rounds [3,2], many predictors incorporated these approaches as the first step to identify putative binding patches, and they supplement with other refinement tools to capture backbone flexibility. ...
Article
Computational docking methods can provide structural models of protein–protein complexes, but protein backbone flexibility upon association often thwarts accurate predictions. In recent blind challenges, medium or high accuracy models were submitted in less than 20% of the ‘difficult’ targets (with significant backbone change or uncertainty). Here, we describe recent developments in protein–protein docking and highlight advances that tackle backbone flexibility. In molecular dynamics and Monte Carlo approaches, enhanced sampling techniques have reduced time-scale limitations. Internal coordinate formulations can now capture realistic motions of monomers and complexes using harmonic dynamics. And machine learning approaches adaptively guide docking trajectories or generate novel binding site predictions from deep neural networks trained on protein interfaces. These tools poise the field to break through the longstanding challenge of correctly predicting complex structures with significant conformational change.
... Local 3D Zernike descriptor-based docking (LZerD), one of the top methods in CAPRI, projects 3D surfaces onto spheres to efficiently capture complementarity of protein surfaces [23]. Some rigid-body approaches exploit data from chemical cross-linking experiments [24] or small-angle X-ray scattering (SAXS) [25] to further improve discrimination of generated structures. These approaches provide fast, global exploration of the energy landscape, and in recent CAPRI rounds [4,5], many predictors incorporated these approaches as the first step to identify putative binding patches, and they supplement with other refinement tools to capture backbone flexibility. ...
Preprint
Computational docking methods can provide structural models of protein-protein complexes, but protein backbone flexibility upon association often thwarts accurate predictions. In recent blind challenges, medium or high accuracy models were submitted in less than 20% of the "difficult" targets (with significant backbone change or uncertainty). Here, we describe recent developments in protein-protein docking and highlight advances that tackle backbone flexibility. In molecular dynamics and Monte Carlo approaches, enhanced sampling techniques have reduced time-scale limitations. Internal coordinate formulations can now capture realistic motions of monomers and complexes using harmonic dynamics. And machine learning approaches adaptively guide docking trajectories or generate novel binding site predictions from deep neural networks trained on protein interfaces. These tools poise the field to break through the longstanding challenge of correctly predicting complex structures with significant conformational change.
... One example is the integration of Small-Angle X-ray Scattering (SAXS) experimental data in ab initio docking methods such as pyDock [46][47][48], HADDOCK [49], PatchDock [50,51], ATTRACT [52] or ClusPro [53]. And chemical cross-linking data has also been integrated in protein docking methods such as ZDOCK [54]. In the 7th CAPRI experiment, the use of integrative modeling approaches was blindly evaluated. ...
Article
Computational docking approaches aim to overcome the limited availability of experimental structural data on protein–protein interactions, which are key in biology. The field is rapidly moving from the traditional docking methodologies for modeling of binary complexes to more integrative approaches using template-based, data-driven modeling of multi-molecular assemblies. We will review here the predictive capabilities of current docking methods in blind conditions, based on the results from the most recent community-wide blind experiments. Integration of template-based and ab initio docking approaches is emerging as the optimal strategy for modeling protein complexes and multimolecular assemblies. We will also review the new methodological advances on ab initio docking and integrative modeling.
... The HADDOCK program and server explicitly employs such information based on biochemical/biophysical interaction data to drive the docking process [16]. Other docking servers, including ClusPro [36], ZDOCK [37], and pyDock [38] were more recently enhanced to take advantage of such restraints. Another source of information, increasingly used in docking, is small angle X-ray scattering (SAXS). ...
Article
A number of well-established servers perform 'free' docking of proteins of known structures. In contrast, template-based docking can start from sequences if structures are available for complexes that are homologous to the target. On the basis of the results of the CAPRI-CASP structure prediction experiments, template-based methods yield more accurate predictions if good templates can be found, but generally fail without such templates. However, free global docking, or focused docking around even poor quality template-based models, can still generate acceptable docked structures in these cases. In accordance with the analysis of a benchmark set, free docking of heterodimers yields acceptable or better predictions in the top 10 models for around 40% of structures. However, it is likely that a combination of template-based and free docking methods can perform better for targets that have template structures available. Another way of improving the reliability of predictions is adding experimental information as restraints, an option built into several docking servers.
... for the explicit treatment of XL-MS-derived distance restraints in docking simulations and other hybrid methods of structural biology [10,11,35,39,46,47]. They allow a transition from an indiscriminate single distance restraint [28] to a set of protein/domain-specific ones, and support the notion that the use of CA-CA restraints reduces the uncertainty arising from molecular thermal motion, as compared to the more intuitive NZ-NZ restraints. ...
Preprint
Covalent cross-link mapping by mass spectrometry (XL-MS) is rapidly becoming the most widely used method of hybrid structural biology. We investigated the impact that incremental variations of cross-linker length have on the depth of XL-MS interrogation of protein-protein complexes, and assessed the role that molecular motions in solution play in the generation of cross-link-derived distance restraints. Supplementation of a popular NHS-ester cross-linker, DSS, with 2 reagents shorter or longer by CH2-CH2 increased the number of non-redundant cross-links by ~50%. Molecular dynamics simulations of these cross-linkers revealed 3 individual, partially overlapping ranges of motion, consistent with partially overlapping sets of cross-links formed by each reagent. Similar simulations elucidated protein fold-specific ranges of motion for the reactive and backbone atoms from rigid and flexible target domains. Together these findings create a quantitative framework for generation of cross-linker- and protein fold-specific distance restraints for XL-MS-guided protein-protein docking.
Article
The influence of distance restraints from chemical cross-link mass spectrometry (XL-MS) on the quality of protein structures modeled with the coarse-grained UNRES force field was assessed by using a protocol based on multiplexed replica exchange molecular dynamics, in which both simulated and experimental cross-link restraints were employed, for 23 small proteins. Six cross-linkers with upper distance boundaries from 4 Å to 12 Å (azido benzoic acid succinimide (ABAS), triazidotriazine (TATA), succinimidyldiazirine (SDA), disuccinimidyl adipate (DSA), disuccinimidyl glutarate (DSG), and disuccinimidyl suberate (BS³)) and two types of restraining potentials ((i) simple flat-bottom Lorentz-like potentials dependent on side chain distance (all cross-linkers) and (ii) distance- and orientation-dependent potentials determined based on molecular dynamics simulations of model systems (DSA, DSG, BS³, and SDA)) were considered. Relative to unrestrained simulations, the Lorentz-like potentials with properly set parameters were found to produce a greater number of higher-quality models than the MD-based potentials, because the latter can force overly long distances between side chains. Therefore, the flat-bottom Lorentz-like potentials are recommended to represent cross-link restraints. It was also found that significant improvement of model quality upon the introduction of cross-link restraints is obtained when the sum of differences of indices of cross-linked residues exceeds 150.
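To make the notion of a flat-bottom, Lorentz-like restraint concrete, the sketch below implements one generic bounded form: zero penalty inside the allowed distance window and a penalty that saturates at a finite height outside it. The exact functional form and parameters used in UNRES are not given in this abstract, so the quartic expression, the names `sigma` and `height`, and the helper `index_separation_sum` are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def flat_bottom_penalty(d: float, d_low: float, d_up: float,
                        sigma: float = 1.0, height: float = 1.0) -> float:
    """Bounded flat-bottom penalty for a side-chain distance d (angstroms).

    Returns 0 inside [d_low, d_up]; outside, the penalty rises smoothly and
    saturates at `height`, so a wrong restraint cannot dominate the energy.
    The quartic Lorentz-like form is an assumed, generic choice.
    """
    if d_low <= d <= d_up:
        return 0.0
    excess = d_low - d if d < d_low else d - d_up
    e4 = excess ** 4
    return height * e4 / (sigma ** 4 + e4)


def index_separation_sum(links: Iterable[Tuple[int, int]]) -> int:
    """Sum of |i - j| over cross-linked residue index pairs (i, j)."""
    return sum(abs(i - j) for i, j in links)


print(flat_bottom_penalty(8.0, 0.0, 12.0))   # 0.0 (inside the window)
print(flat_bottom_penalty(13.0, 0.0, 12.0))  # 0.5 (one sigma beyond the upper bound)
print(index_separation_sum([(10, 60), (5, 120)]) > 150)  # True
```

Because the penalty is bounded, a single incorrect or over-long restraint contributes at most `height`, which is one plausible reading of why such potentials tolerate imperfect cross-link data.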
Chapter
Protein–protein complexes are involved in most cellular processes and form building blocks for larger biological assemblies. Understanding their function requires knowledge of the structure of stable but also transient protein–protein complexes. Experimental structure determination of protein–protein complexes remains a challenging task, especially in the case of transient complexes. Computational protein–protein docking methods can complement experimental structure determination by providing structural models for protein–protein interactions. Often protein–protein association is strongly coupled to conformational changes in the binding partners. These can range from local side‐chain changes to more global domain motions associated with the complex formation process. Preferably, docking approaches should account for conformational flexibility during docking searches. Several stand‐alone as well as web‐server applications of a variety of docking methodologies are available. The chapter will give an overview of available approaches and the underlying algorithms for predicting protein–protein complex structures. It includes efficient methods for rigid partner docking but also methods that include possible conformational changes. In addition, methods for refinement of docked complexes and realistic scoring will also be reported and discussed.
Article
Cross-linking and mass spectrometry (XL-MS) workflows represent an increasingly popular technique for low-resolution structural studies of macromolecular complexes. Cross-linking reactions take place in the solution state, capturing contact sites between components of a complex that represent the native, functionally relevant structure. Protein-protein XL-MS protocols are widely adopted, providing precise localization of cross-linking sites to single amino acid positions within a pair of cross-linked peptides. In contrast, protein-RNA XL-MS workflows are evolving rapidly and differ in their ability to localize interaction regions within the RNA sequence. Here, we review protein-protein and protein-RNA XL-MS workflows, and discuss their applications in studies of protein-RNA complexes. The examples highlight the complementary value of XL-MS in structural studies of protein-RNA complexes, where more established high-resolution techniques might be unable to produce conclusive data.
Article
Pseudopotentials for the chemical cross-links comprising the glutamic- and aspartic-acid side chains bridged with adipic- (ADH) or pimelic-acid hydrazide (PDH), and the lysine side chains bridged with glutaric (BS²G) or suberic acid (BS³) for coarse-grained cross-link-assisted simulations were determined by canonical molecular dynamics with the Amber14sb force field. The potentials depend on the distance between side-chain ends and on side-chain orientation, which prevents cross-link contacts from being made across the globule in simulations. The potentials were implemented in the UNRES coarse-grained force field and their effect on the quality of models was assessed with 11 monomeric proteins and 1 dimeric protein, using synthetic or experimental cross-link data. Simulations with the new potentials improved the generated models relative to unrestrained simulations in more instances than simulations with the statistical potentials.
Article
Human Leukocyte Antigen (HLA) complexes are critical cell-surface protein assemblies that facilitate T-cell surveillance of almost all cell types in the body. While T-cell receptor binding to HLA class I and class II complexes is well-described with detailed structural information, the nature of cis HLA interactions within the plasma membrane of the surveyed cells remains to be better characterized, as protein-protein interactions in the membrane environment are technically challenging to profile. Here we performed extracellular chemical crosslinking on intact antigen presenting cells to specifically elucidate protein-protein interactions present in the external plasma membrane. We found that the crosslink dataset was dominated by inter- and intra-protein crosslinks involving HLA molecules, which not only enabled the construction of an HLA-centric plasma membrane protein interaction map, but also revealed multiple modes of HLA class I – HLA class II interactions with further structural modeling based on crosslinker distance restraints. Collectively, our data demonstrate that HLA molecules colocalize and can be densely packed on the plasma membrane.
Article
Detailed mechanistic understanding of protein complex function is greatly enhanced by insights from its 3-dimensional structure. Traditional methods of protein structure elucidation remain expensive and labor-intensive and require highly purified starting material. Chemical cross-linking coupled with mass spectrometry offers an alternative that has seen increased use, especially in combination with other experimental approaches like cryo-electron microscopy. Here we report advances in method development, combining several orthogonal cross-linking chemistries as well as improvements in search algorithms, statistical analysis, and computational cost to achieve coverage of 1 unique cross-linked position pair for every 7 amino acids at a 1% false discovery rate. This is accomplished without any peptide-level fractionation or enrichment. We apply our methods to model the complex between a carbonic anhydrase (CA) and its protein inhibitor, showing that the cross-links are self-consistent and define the interaction interface at high resolution. The resulting model suggests a scaffold for development of a class of protein-based inhibitors of the CA family of enzymes. We next cross-link the yeast proteasome, identifying 3,893 unique cross-linked peptides in 3 mass spectrometry runs. The dataset includes 1,704 unique cross-linked position pairs for the proteasome subunits, more than half of them intersubunit. Using multiple recently solved cryo-EM structures, we show that observed cross-links reflect the conformational dynamics and disorder of some proteasome subunits. We further demonstrate that this level of cross-linking density is sufficient to model the architecture of the 19-subunit regulatory particle de novo.
Article
We report the performance of the protein docking prediction pipeline of our group and the results for CAPRI rounds 38‐46. The pipeline integrates programs developed in our group as well as other existing scoring functions. The core of the pipeline is the LZerD protein‐protein docking algorithm. If templates of the target complex are not found in PDB, the first step of our docking prediction pipeline is to run LZerD for a query protein pair. Meanwhile, in the case of human group prediction, we survey the literature to find information that can guide the modeling, such as protein‐protein interface information. In addition to any literature information and binding residue prediction, generated docking decoys were selected by a rank aggregation of statistical scoring functions. The top ten decoys were relaxed by a short molecular dynamics simulation before submission to remove atom clashes and improve side‐chain conformations. In these CAPRI rounds, our group, particularly the LZerD server, showed robust performance. On the other hand, there are failed cases where some other groups were successful. To understand weaknesses of our pipeline, we analyzed sources of errors for failed targets. Since we noted that structure refinement is a step that needs improvement, we newly performed a comparative study of several refinement approaches. Finally, we show several examples that illustrate successful and unsuccessful cases by our group.
Article
We present a cross-linking/mass spectrometry (XLMS) workflow for performing proteome-wide cross-linking analyses within one week. The workflow is based on the commercially available MS-cleavable cross-linker disuccinimidyl dibutyric urea (DSBU) and can be employed by every lab having access to a mass spectrometer with tandem MS capabilities. We provide an updated version 2.0 of the freeware software tool MeroX, available at www.StavroX.com, that allows conducting fully automated and reliable studies delivering insights into protein-protein interaction networks and protein conformations at the proteome level. We exemplify our optimized workflow for mapping protein-protein interaction networks in Drosophila melanogaster embryos on a system-wide level. From cross-linked Drosophila embryo extracts, we detected 29,931 cross-link spectrum matches corresponding to 7,436 unique cross-linked residues in biological triplicate experiments at 1% FDR. Among these, 1,611 inter-protein cross-linking sites were identified that yield valuable information on protein-protein interactions. The remaining 5,825 intra-protein cross-links yield information on the conformational landscape of proteins in their cellular environment.
Article
IgA nephropathy (IgAN) is the most prevalent cause of primary glomerular disease worldwide, and the cytokine A PRoliferation‐Inducing Ligand (APRIL) is emerging as a key player in IgAN pathogenesis and disease progression. For a panel of anti‐human APRIL antibodies with known antagonistic activity, we sought to define their structural mode of engagement to understand molecular mechanisms of action and aid rational antibody engineering. Reliable computational prediction of antibody‐antigen complexes remains challenging, and experimental methods such as X‐ray co‐crystallography and cryoEM have considerable technical, resource, and throughput barriers. To overcome these limitations, we implemented an integrated and accessible experimental‐computational workflow to more accurately predict structures of antibody‐APRIL complexes. Specifically, a yeast surface display library encoding site‐saturation mutagenized surface positions of APRIL was screened against a panel of anti‐APRIL antibodies to rapidly obtain a comprehensive biochemical profile of mutational impact on binding for each antibody. The experimentally derived mutational profile data were used as quantitative constraints in a computational docking workflow optimized for antibodies, resulting in robust structural models of antibody‐antigen complexes. The model results were confirmed by solving the cocrystal structure of one antibody‐APRIL complex, which revealed strong agreement with our model. The models were used to rationally select and engineer one antibody for cross‐species APRIL binding, which aids further testing in relevant animal models. Collectively, we demonstrate a rapid, integrated computational‐experimental approach to robustly predict antibody‐antigen structures, which can aid rational antibody engineering and provide insights into molecular mechanisms of action.
Article
Biological processes supporting life are orchestrated by a highly dynamic array of protein structures and interactions comprising the interactome. Defining the interactome, visualizing how structures and interactions change and function to support life is essential to improved understanding of fundamental molecular processes, but represents a challenge unmet by any single analytical technique. Chemical cross-linking with mass spectrometry provides identification of proximal amino acid residues within proteins and protein complexes, yielding low resolution structural information. This approach has predominantly been employed to provide structural insight on isolated protein complexes, and has been particularly useful for molecules that are recalcitrant to conventional structural biology studies. Here we discuss recent developments in cross-linking and mass spectrometry technologies that are providing large-scale or systems-level interactome data with successful applications to isolated organelles, cell lysates, virus particles, intact bacterial and mammalian cultured cells and tissue samples.
Article
Glyoxalase II (GlxII) is an antioxidant glutathione-dependent enzyme, which catalyzes the hydrolysis of S-D-lactoylglutathione to form D-lactic acid and glutathione (GSH). The last product is the most important thiol reducing agent present in all eukaryotic cells that have mitochondria and chloroplasts. It is generally known that GSH plays a crucial role in the cellular redox state and also in various cellular processes. One of them is protein S-glutathionylation, a process that can occur through an oxidation reaction of protein thiol groups by GSH. Changes in protein S-glutathionylation have been associated with a range of human diseases such as diabetes, cardiovascular and pulmonary diseases, neurodegenerative diseases and cancer. Within a major project aimed at elucidating the role of GlxII in the mechanism of S-glutathionylation, a reliable computational protocol consisting of a protein-protein docking approach followed by atomistic Molecular Dynamics (MD) simulations was set up and applied to the prediction of molecular associations between human GlxII (in the presence and in the absence of GSH) and some proteins that are known to be S-glutathionylated in vitro, such as actin, malate dehydrogenase (MDH) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH). The computational results show a high propensity of GlxII to interact with actin and MDH through its active site and a high stability of the GlxII-protein systems when GSH is present. Moreover, close proximities of GSH with actin and MDH cysteine residues have been found, suggesting that GlxII could be able to perform protein S-glutathionylation by using the GSH molecule present in its catalytic site.
Article
The x-ray crystal structure of succinyl-CoA synthetase (SCS) from Escherichia coli has been determined by the method of multiple isomorphous replacement to a resolution of 2.5 Å. Crystals of SCS are tetragonal with a space group of P4(3)22 and unit cell dimensions of a = b = 98.47 Å and c = 400.6 Å. One molecule of SCS (142 kDa) is contained in the asymmetric unit. The current model has been refined to a conventional R factor of 21.6% with root mean square deviations from ideal stereochemistry of 0.022 Å for bond lengths and 3.25 degrees for bond angles. The quaternary organization of the E. coli enzyme is an alpha 2 beta 2 heterotetramer. In this tetramer, the alpha-subunits interact only with the beta-subunits, whereas the beta-subunits interact to form the dimer of alpha beta-dimers. The two active site pockets are located at regions of contact between alpha- and beta-subunits. One molecule of coenzyme A is bound to each alpha-subunit at a typical nucleotide-binding motif, and His-246 of each alpha-subunit is phosphorylated. This phosphohistidine, a catalytic intermediate, is stabilized by two helix dipoles (the “power” helices), one from each of the two subunit types. A short segment of the beta-subunit from one alpha beta-dimer is in close proximity to the CoA-binding site of the other alpha beta-dimer, providing a possible rationale for the overall tetrameric structure.
Article
Significance: Mitochondria meet the majority of living cells’ demand for ATP and, as important regulators of redox homeostasis, metabolite levels, and calcium buffering, are a critical link between cell energetics and signaling. Disruption of these processes can induce adaptive or pathological signaling responses to stress and under severe stress promote cell death. Mitochondria have a complex proteome with conformations and interactions that are not well understood. Mitochondrial dysfunction is a direct cause of rare inherited diseases and is implicated in common metabolic diseases and age-related pathology. This study illuminates protein interactions and conformational features of nearly one-third of the mitochondrial proteome. Network information on this scale will enable groundbreaking insights into mitochondrial function, dysfunction, and potential therapeutic targets for mitochondrial-based pathology.
Article
ClusPro is a heavily used protein-protein docking server based on the fast Fourier transform (FFT) correlation approach. While FFT enables global docking, accounting for pairwise distance restraints using penalty terms in the scoring function is computationally expensive. We use a different approach and directly select low energy solutions that also satisfy the given restraints. As expected, accounting for restraints generally improves the rank of near native predictions, while retaining or even improving the numerical efficiency of FFT based docking. Availability: The software is freely available as part of the ClusPro web-based server at http://cluspro.org/nousername.php Supplementary Information: Supplementary data are available at Bioinformatics online.
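The idea of selecting low-energy solutions that also satisfy pairwise distance restraints, rather than adding penalty terms to the score, can be sketched as a simple post-filter. This is an illustration under stated assumptions, not ClusPro's implementation: the pose dictionary layout, residue labels, energies, and the 30 Å cutoff are all hypothetical.

```python
import math


def satisfies(pose, restraints):
    """True if every restrained residue pair lies within its maximum distance."""
    return all(
        math.dist(pose["receptor"][res_r], pose["ligand"][res_l]) <= max_dist
        for res_r, res_l, max_dist in restraints
    )


def select_poses(poses, restraints, top_n=10):
    """Keep poses that satisfy all restraints, then return the lowest-energy ones."""
    kept = [pose for pose in poses if satisfies(pose, restraints)]
    return sorted(kept, key=lambda pose: pose["energy"])[:top_n]


# One hypothetical cross-link between receptor K10 and ligand K55,
# with an assumed maximum Euclidean distance of 30 Å.
restraints = [("K10", "K55", 30.0)]

# Hypothetical docked poses: energy plus coordinates of the restrained residues.
poses = [
    {"name": "pose1", "energy": -50.0,
     "receptor": {"K10": (0.0, 0.0, 0.0)}, "ligand": {"K55": (40.0, 0.0, 0.0)}},
    {"name": "pose2", "energy": -42.0,
     "receptor": {"K10": (0.0, 0.0, 0.0)}, "ligand": {"K55": (10.0, 10.0, 10.0)}},
    {"name": "pose3", "energy": -45.0,
     "receptor": {"K10": (0.0, 0.0, 0.0)}, "ligand": {"K55": (0.0, 25.0, 0.0)}},
]

selected = [pose["name"] for pose in select_poses(poses, restraints)]
print(selected)
```

Here the lowest-energy pose is discarded because it violates the restraint, which is the trade-off between retaining correct predictions and eliminating false positives that restraint-based filtering has to manage.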
Article
Hsp90 belongs to a family of some of the most highly expressed heat shock proteins that function as molecular chaperones to protect the proteome not only from the heat shock but also from other misfolding events. As many client proteins of Hsp90 are involved in oncogenesis, this chaperone has been the focus of intense research efforts. Yet, we lack structural information for how Hsp90 interacts with co-chaperones and client proteins. Here, we developed a mass-spectrometry-based approach that allowed quantitative measurements of in vitro and in vivo effects of small-molecule inhibitors on Hsp90 conformation, and interaction with co-chaperones and client proteins. From this analysis, we were able to derive structural models for how Hsp90 engages its interaction partners in vivo, and how different drugs affect these structures. In addition, the methodology described here offers a new approach to probe the effects of virtually any inhibitor treatment on the proteome level.
Article
Chemoresistance is a common mode of therapy failure for many cancers. Tumours develop resistance to chemotherapeutics through a variety of mechanisms, with proteins serving pivotal roles. Changes in protein conformations and interactions affect the cellular response to environmental conditions contributing to the development of new phenotypes. The ability to understand how protein interaction networks adapt to yield new function or alter phenotype is limited by the inability to determine structural and protein interaction changes on a proteomic scale. Here, chemical crosslinking and mass spectrometry were employed to quantify changes in protein structures and interactions in multidrug-resistant human carcinoma cells. Quantitative analysis of the largest crosslinking-derived, protein interaction network comprising 1,391 crosslinked peptides allows for 'edgotype' analysis in a cell model of chemoresistance. We detect consistent changes to protein interactions and structures, including those involving cytokeratins, topoisomerase-2-alpha, and post-translationally modified histones, which correlate with a chemoresistant phenotype.
Article
Recent studies on the respiratory chain of Ascaris suum showed that the mitochondrial NADH-fumarate reductase system composed of complex I, rhodoquinone and complex II plays an important role in the anaerobic energy metabolism of adult A. suum. The system is the major pathway of energy metabolism for adaptation to a hypoxic environment not only in parasitic organisms, but also in some types of human cancer cells. Thus, enzymes of the pathway are potential targets for chemotherapy. We found that flutolanil is an excellent inhibitor for A. suum complex II (IC50 = 0.058 μM) but less effectively inhibits homologous porcine complex II (IC50 = 45.9 μM). In order to account for the specificity of flutolanil to A. suum complex II from the standpoint of structural biology, we determined the crystal structures of A. suum and porcine complex IIs binding flutolanil and its derivative compounds. The structures clearly demonstrated key interactions responsible for its high specificity to A. suum complex II and enabled us to find analogue compounds, which surpass flutolanil in both potency and specificity to A. suum complex II. Structures of complex IIs binding these compounds will be helpful to accelerate structure-based drug design targeted for complex IIs.
Article
We present DisVis, a Python package and command line tool to calculate the reduced accessible interaction space of distance-restrained binary protein complexes, allowing for direct visualization and quantification of the information content of the distance restraints. The approach is general and can also be used as a knowledge-based distance energy term in FFT-based docking directly during the sampling stage. The source code with documentation is freely available from https://github.com/haddocking/disvis. Contact: a.m.j.j.bonvin@uu.nl. Supplementary information: available at Bioinformatics online. © The Author(s) 2015. Published by Oxford University Press.
Article
F1-ATPase (F1) is the catalytic sector in FoF1-ATP synthase that is responsible for ATP production in living cells. In catalysis, its three catalytic β-subunits undergo nucleotide-occupancy-dependent, concerted open-close conformational changes that are accompanied by rotation of the γ-subunit. Bacterial and chloroplast F1 are inhibited by their own ε-subunit. In the ε-inhibited E. coli F1 structure, the ε-subunit stabilizes an overall conformation (half closed, closed, open) of the β-subunits by inserting its C-terminal helix into the α3β3 cavity. The ε-inhibited thermophilic F1 structure is similar to the E. coli structure, showing a similar conformation of the ε-subunit, but the thermophilic ε-subunit stabilizes a different overall conformation (open, closed, open) of the β-subunits. The ε C-terminal helix 2 and hook are conserved between the two structures in their interactions with target residues and in their positions. The rest of the ε C-terminal domains are in quite different conformations and positions, and have different modes of interaction with targets. This region is thought to serve the ε-inhibition differently. For inhibition, the ε-subunit contacts the second catches of some of the β- and α-subunits, the N- and C-terminal helices and some of the Rossmann fold segments. Those contacts, as a whole, lead to a) positioning of those β- and α- second catches in the ε-inhibition-specific positions and b) preventing rotation of the γ-subunit. Some of these structural features are observed even in IF1 inhibition of mitochondrial F1. This article is protected by copyright. All rights reserved.
Article
In pathogenic Gram-negative bacteria, interactions among membrane proteins are key mediators of host cell attachment, invasion, pathogenesis, and antibiotic resistance. Membrane protein interactions are highly dependent upon local properties and environment, warranting direct measurements on native protein complex structures as they exist in cells. Here we apply in vivo chemical cross-linking mass spectrometry, to reveal the first large-scale protein interaction network in Pseudomonas aeruginosa, an opportunistic human pathogen, by covalently linking interacting protein partners, thereby fixing protein complexes in vivo. A total of 626 cross-linked peptide pairs, including previously unknown interactions of many membrane proteins, are reported. These pairs not only define the existence of these interactions in cells but also provide linkage constraints for complex structure predictions. Structures of three membrane proteins, namely, SecD-SecF, OprF, and OprI are predicted using in vivo cross-linked sites. These findings improve understanding of membrane protein interactions and structures in cells. Copyright © 2015 Elsevier Ltd. All rights reserved.
Article
Availability of high-resolution atomic structures is one of the prerequisites for a mechanistic understanding of biomolecular function. This atomic information can, however, be difficult to acquire for interesting systems such as high molecular weight and multi-subunit complexes. For these, low-resolution and/or sparse data from a variety of sources including NMR are often available to define the interaction between the subunits. To make best use of all the available information and shed light on these challenging systems, integrative computational tools are required that can judiciously combine and accurately translate the sparse experimental data into structural information. In this Perspective we discuss NMR techniques and data sources available for the modeling of large and multi-subunit complexes. Recent developments are illustrated by particularly challenging application examples taken from the literature. Within this context, we also position our data-driven docking approach, HADDOCK, which can integrate a variety of information sources to drive the modeling of biomolecular complexes. It is the synergy between experimentation and computational modeling that will provide us with detailed views on the machinery of life and lead to a mechanistic understanding of biomolecular function.
Article
In many protein-protein docking algorithms, binding site information is used to help predict protein complex structures. Using correct and accurate binding site information can increase the protein-protein docking success rate significantly. On the other hand, using wrong binding site information should lead to a failed prediction or, at least, decrease the success rate. Recently, various successful theoretical methods have been proposed to predict the binding sites of proteins. However, predicted binding site information is not always reliable, and wrong binding site information is sometimes given. Hence there is a high risk in using predicted binding site information in current docking algorithms. In this paper, a softly restricting method (SRM) is developed to solve this problem. By utilizing predicted binding site information in a proper way, the SRM algorithm is sensitive to correct binding site information but insensitive to wrong information, which decreases the risk of using predicted binding site information. The SRM is tested on benchmark 3.0 using purely predicted binding site information. The results show that when the predicted information is correct, SRM increases the success rate significantly; however, even if the predicted information is completely wrong, SRM only decreases the success rate slightly, which indicates that the SRM is suitable for utilizing predicted binding site information.
Article
Chemical cross-links identified by mass spectrometry generate distance restraints that reveal low-resolution structural information on proteins and protein complexes. The technology to reliably generate such data has become mature and robust enough to shift the focus to the question of how these distance restraints can be best integrated into molecular modeling calculations. Here, we introduce three workflows for incorporating distance restraints generated by chemical cross-linking and mass spectrometry into ROSETTA protocols for comparative and de novo modeling and protein-protein docking. We demonstrate that the cross-link validation and visualization software Xwalk facilitates successful cross-link data integration. Besides the protocols we introduce XLdb, a database of chemical cross-links from 14 different publications with 506 intra-protein and 62 inter-protein cross-links, where each cross-link can be mapped on an experimental structure from the Protein Data Bank. Finally, we demonstrate on a protein-protein docking reference data set the impact of virtual cross-links on protein docking calculations and show that an inter-protein cross-link can reduce on average the RMSD of a docking prediction by 5.0 Å. The methods and results presented here provide guidelines for the effective integration of chemical cross-link data in molecular modeling calculations and should advance the structural analysis of particularly large and transient protein complexes via hybrid structural biology methods.
Article
Scoring, the process of selecting the biologically relevant solution from a pool of generated conformations, is one of the major challenges in the field of biomolecular docking. A prominent way to cope with this challenge is to incorporate information-based terms into the scoring function. Within this context, low-resolution shape data obtained from either ion-mobility mass spectrometry (IM-MS) or SAXS experiments have been integrated into the conventional scoring function of the information-driven docking program HADDOCK. Here, the strengths and weaknesses of IM-MS-based and SAXS-based scoring, either in isolation or in combination with the HADDOCK score, are systematically assessed. The results of an analysis of a large docking decoy set composed of dimers generated by running HADDOCK in ab initio mode reveal that the content of the IM-MS data is of too low resolution for selecting correct models, while scoring with SAXS data leads to a significant improvement in performance. However, the effectiveness of SAXS scoring depends on the shape and the arrangement of the complex, with prolate and oblate systems showing the best performance. It is observed that the highest accuracy is achieved when SAXS scoring is combined with the energy-based HADDOCK score.
Article
We present a two-stage hybrid-resolution approach for rigid-body protein-protein docking. The first stage is carried out at low-resolution (15°) angular sampling. In the second stage, we sample promising regions from the first stage at a higher resolution of 6°. The hybrid-resolution approach produces the same results as a 6° uniform sampling docking run, but uses only 17% of the computational time. We also show that the angular distance can be used successfully in clustering and pruning algorithms, as well as the characterization of energy funnels. Traditionally the root-mean-square-distance is used in these algorithms, but the evaluation is computationally expensive as it depends on both the rotational and translational parameters of the docking solutions. In contrast, the angular distances only depend on the rotational parameters, which are generally fixed for all docking runs. Hence the angular distances can be pre-computed, and do not add computational time to the post-processing of rigid-body docking results.
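The reason angular distances can be pre-computed is that they depend only on the two rotations, not on any translation. A minimal sketch follows, assuming the usual definition of the angle between two rotations, θ = arccos((tr(R1ᵀR2) − 1) / 2); the abstract does not spell out its formula, and the helper names are hypothetical.

```python
import math


def rotation_z(angle_deg):
    """3x3 rotation matrix for a rotation about the z axis."""
    a = math.radians(angle_deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


def angular_distance_deg(r1, r2):
    """Angle in degrees of the relative rotation R1^T R2.

    tr(R1^T R2) equals the element-wise product sum of the two matrices.
    The cosine is clamped to [-1, 1] to guard against rounding error.
    """
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


# Two orientations from a hypothetical rotational sampling set.
coarse = rotation_z(15.0)
fine = rotation_z(21.0)
print(round(angular_distance_deg(coarse, fine), 6))
```

Because the rotational sampling set is generally fixed across docking runs, a table of these pairwise angles can be built once and reused for clustering and pruning.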
Article
Protein interaction topologies are critical determinants of biological function. Large-scale or proteome-wide measurements of protein interaction topologies in cells currently pose an unmet challenge that could dramatically improve understanding of complex biological systems. A primary impediment includes direct protein topology and interaction measurements from living systems since interactions that lack biological significance may be introduced during cell lysis. Furthermore, many biologically relevant protein interactions will likely not survive the lysis/sample preparation and may only be measured with in vivo methods. As a step toward meeting this challenge, a new mass spectrometry method called Real-time Analysis for Cross-linked peptide Technology (ReACT) has been developed that enables assignment of cross-linked peptides "on-the-fly". Using ReACT, 708 unique cross-linked (<5% FDR) peptide pairs were identified from cross-linked E. coli cells. These data allow assembly of the first protein interaction network that also contains topological features of every interaction, as it existed in cells during cross-linker application. Of the identified interprotein cross-linked peptide pairs, 40% are derived from known interactions and provide new topological data that can help visualize how these interactions exist in cells. Other identified cross-linked peptide pairs are from proteins known to be involved within the same complex, but yield newly discovered direct physical interactors. ReACT enables the first view of these interactions inside cells, and the results acquired with this method suggest cross-linking can play a major role in future efforts to map the interactome in cells.
Article
The identification of proximate amino acids by chemical cross-linking and mass spectrometry (XL-MS) facilitates the structural analysis of homogeneous protein complexes. We gained distance restraints on a modular interaction network of protein complexes affinity-purified from human cells by applying an adapted XL-MS protocol. Systematic analysis of human protein phosphatase 2A (PP2A) complexes identified 176 interprotein and 570 intraprotein cross-links that link specific trimeric PP2A complexes to a multitude of adaptor proteins that control their cellular functions. Spatial restraints guided molecular modeling of the binding interface between immunoglobulin binding protein 1 (IGBP1) and PP2A and revealed the topology of TCP1 ring complex (TRiC) chaperonin interacting with the PP2A regulatory subunit 2ABG. This study establishes XL-MS as an integral part of hybrid structural biology approaches for the analysis of endogenous protein complexes.
Article
Computational prediction of the 3D structures of molecular interactions is a challenging area, often requiring significant computational resources to produce structural predictions with atomic-level accuracy. This can be particularly burdensome when modeling large sets of interactions, macromolecular assemblies, or interactions between flexible proteins. We previously developed a protein docking program, ZDOCK, which uses a fast Fourier transform to perform a 3D search of the spatial degrees of freedom between two molecules. By utilizing a pairwise statistical potential in the ZDOCK scoring function, there were notable gains in docking accuracy over previous versions, but this improvement in accuracy came at a substantial computational cost. In this study, we incorporated a recently developed 3D convolution library into ZDOCK, and additionally modified ZDOCK to dynamically orient the input proteins for more efficient convolution. These modifications resulted in an average of over 8.5-fold improvement in running time when tested on 176 cases in a newly released protein docking benchmark, as well as substantially less memory usage, with no loss in docking accuracy. We also applied these improvements to a previous version of ZDOCK that uses a simpler non-pairwise atomic potential, yielding an average speed improvement of over 5-fold on the docking benchmark, while maintaining predictive success. This permits the utilization of ZDOCK for more intensive tasks such as docking flexible molecules and modeling of interactomes, and can be run more readily by those with limited computational resources.
Article
The β-barrel assembly machinery (BAM) complex of Escherichia coli is a multiprotein machine that catalyzes the essential process of assembling outer membrane proteins. The BAM complex consists of five proteins: one membrane protein, BamA, and four lipoproteins, BamB, BamC, BamD, and BamE. Here, we report the first crystal structure of a Bam lipoprotein complex: the essential lipoprotein BamD in complex with the N-terminal half of BamC (BamC(UN) (Asp(28)-Ala(217)), a 73-residue-long unstructured region followed by the N-terminal domain). The BamCD complex is stabilized predominantly by various hydrogen bonds and salt bridges formed between BamD and the N-terminal unstructured region of BamC. Sequence and molecular surface analyses revealed that many of the conserved residues in both proteins are found at the BamC-BamD interface. A series of truncation mutagenesis and analytical gel filtration chromatography experiments confirmed that the unstructured region of BamC is essential for stabilizing the BamCD complex structure. The unstructured N terminus of BamC interacts with the proposed substrate-binding pocket of BamD, suggesting that this region of BamC may play a regulatory role in outer membrane protein biogenesis.
Article
Chemical cross-linking of proteins or protein complexes and the mass spectrometry-based localization of the cross-linked amino acids in peptide sequences is a powerful method for generating distance restraints on the substrate's topology. Here, we introduce the algorithm Xwalk for predicting and validating these cross-links on existing protein structures. Xwalk calculates and displays non-linear distances between chemically cross-linked amino acids on protein surfaces, while mimicking the flexibility and non-linearity of cross-linker molecules. It returns a 'solvent accessible surface distance', which corresponds to the length of the shortest path between two amino acids, where the path leads through solvent occupied space without penetrating the protein surface. Xwalk is freely available as a web server or stand-alone JAVA application at http://www.xwalk.org.
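The notion of a shortest path that runs through solvent without penetrating the protein can be illustrated with a breadth-first search on a grid. This is a toy sketch, not Xwalk's algorithm as implemented: the grid layout, the 6-connected neighbourhood, the unit spacing, and the function name are all assumptions made for illustration.

```python
from collections import deque


def shortest_solvent_path(blocked, start, goal, shape, spacing=1.0):
    """Length of the shortest grid path from start to goal that avoids blocked cells.

    blocked: set of (x, y, z) cells occupied by protein (hypothetical input).
    Returns the path length in the units of `spacing`, or None if unreachable.
    """
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        cell, dist = queue.popleft()
        if cell == goal:
            return dist * spacing
        for dx, dy, dz in steps:
            nxt = (cell[0] + dx, cell[1] + dy, cell[2] + dz)
            inside = all(0 <= nxt[k] < shape[k] for k in range(3))
            if inside and nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# A 5 x 3 x 1 grid with a two-cell "protein" wall between the two end points.
shape = (5, 3, 1)
wall = {(2, 0, 0), (2, 1, 0)}
start, goal = (0, 0, 0), (4, 0, 0)

direct = shortest_solvent_path(set(), start, goal, shape)
around = shortest_solvent_path(wall, start, goal, shape)
print(direct, around)
```

The detour around the wall is twice the unobstructed distance in this toy case, which shows why a surface-path distance can reject cross-links that a straight-line Euclidean distance would accept.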
Article
In order to enhance the structure determination process of macromolecular assemblies by NMR, we have implemented long-range pseudocontact shift (PCS) restraints into the data-driven protein docking package HADDOCK. We demonstrate the efficiency of the method on a synthetic, yet realistic case based on the lanthanide-labeled N-terminal ε domain of the E. coli DNA polymerase III (ε186) in complex with the HOT domain. Docking from the bound form of the two partners is swiftly executed (interface RMSDs < 1 Å) even with addition of a very large amount of noise, while the conformational changes of the free form still present some challenges (interface RMSDs in a 3.1–3.9 Å range for the ten lowest energy complexes). Finally, using exclusively PCS as experimental information, we determine the structure of ε186 in complex with the HOT-homologue θ subunit of the E. coli DNA polymerase III. Electronic supplementary material: The online version of this article (doi:10.1007/s10858-011-9514-4) contains supplementary material, which is available to authorized users.
Article
The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing of online server systems for the state-of-the-art protein structure and function predictions. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Article
Mutation of the VHL tumor suppressor is associated with the inherited von Hippel–Lindau (VHL) cancer syndrome and the majority of kidney cancers. VHL binds the ElonginC-ElonginB complex and regulates levels of hypoxia-inducible proteins. The structure of the ternary complex at 2.7 angstrom resolution shows two interfaces, one between VHL and ElonginC and another between ElonginC and ElonginB. Tumorigenic mutations frequently occur in a 35-residue domain of VHL responsible for ElonginC binding. A mutational patch on a separate domain of VHL indicates a second macromolecular binding site. The structure extends the similarities to the SCF (Skp1-Cul1–F-box protein) complex that targets proteins for degradation, supporting the hypothesis that VHL may function in an analogous pathway.
Article
While modern structural biology technologies have greatly expanded the size and type of protein complexes that can now be studied, the ability to derive large-scale structural information on proteins and complexes as they exist within tissues is practically nonexistent. Here, we demonstrate the application of crosslinking mass spectrometry to identify protein structural features and interactions in tissue samples, providing systems structural biology insight into protein complexes as they exist in the mouse heart. This includes insights into multiple conformational states of sarcomere proteins, as well as interactions among OXPHOS complexes indicative of supercomplex assembly. The extension of crosslinking mass spectrometry analysis into the realm of tissues opens the door to increasing our understanding of protein structures and interactions within the context of the greater biological system.
Article
Interspecies protein-protein interactions are essential mediators of infection. While bacterial proteins required for host cell invasion and infection can be identified through bacterial mutant library screens, information about host target proteins and interspecies complex structures has been more difficult to acquire. Using an unbiased chemical crosslinking/mass spectrometry approach, we identified interspecies protein-protein interactions in human lung epithelial cells infected with Acinetobacter baumannii. These efforts resulted in identification of 3,076 crosslinked peptide pairs and 46 interspecies protein-protein interactions. Most notably, the key A. baumannii virulence factor, OmpA, was identified as crosslinked to host proteins involved in desmosomes, specialized structures that mediate host cell-to-cell adhesion. Co-immunoprecipitation and transposon mutant experiments were used to verify these interactions and demonstrate relevance for host cell invasion and acute murine lung infection. These results shed new light on A. baumannii-host protein interactions and their structural features, and the presented approach is generally applicable to other systems.
Article
The HOP2/MND1 heterodimer is essential for meiotic homologous recombination in plants and other eukaryotes, and promotes the repair of DNA double strand breaks. We investigated the conformational flexibility of HOP2/MND1, important for understanding mechanistic details of the heterodimer, by chemical crosslinking in combination with mass spectrometry (XL-MS). The final XL-MS workflow encompassed the use of complementary crosslinkers, quenching, digestion, size exclusion enrichment and HCD based LC-MS/MS detection prior to data evaluation. We applied two different homobifunctional amine-reactive crosslinkers (DSS, BS(2)G) and one zero-length heterobifunctional crosslinker (EDC). Crosslinked peptides of four biological replicates were analyzed prior to 3D structure prediction by protein threading and protein-protein docking for crosslink guided molecular modeling. Miniaturization of the size exclusion enrichment step reduced the required starting material, led to a high amount of crosslinked peptides and allowed the analysis of replicates. The major interaction site of HOP2/MND1 was identified in the central coiled coil domains and an open co-linear parallel arrangement of HOP2 and MND1 within the complex was predicted. Moreover, flexibility of the C-terminal capping helices of both complex partners was observed suggesting the coexistence of a closed complex conformation in solution.
Article
We present an updated and integrated version of our widely used protein-protein docking and binding affinity benchmarks. The benchmarks consist of non-redundant, high quality structures of protein-protein complexes along with the unbound structures of their components. Fifty-five new complexes were added to the docking benchmark, 35 of which have experimentally-measured binding affinities. These updated docking and affinity benchmarks now contain 230 and 179 entries, respectively. In particular, the number of antibody-antigen complexes has increased significantly, by 67% and 74% in the docking and affinity benchmarks, respectively. We tested previously developed docking and affinity prediction algorithms on the new cases. Considering only the top ten docking predictions per benchmark case, a prediction accuracy of 38% is achieved on all 55 cases, and up to 50% for the 32 rigid-body cases only. Predicted affinity scores are found to correlate with experimental binding energies up to r=0.52 overall, and r=0.72 for the rigid complexes. Copyright © 2015. Published by Elsevier Ltd.
Article
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
Article
Auristatins, synthetic analogs of the antineoplastic natural product Dolastatin 10, are ultra-potent cytotoxic microtubule inhibitors that are clinically used as payloads in antibody-drug conjugates (ADCs). The design and synthesis of several new auristatin analogs with N-terminal modifications that include amino acids with α,α-disubstituted carbon atoms are described, including the discovery of our lead auristatin, PF-06380101. This modification of the peptide structure is unprecedented and led to analogs with excellent potencies in tumor cell proliferation assays and differential ADME properties when compared to other synthetic auristatin analogs that are used in the preparation of ADCs. In addition, auristatin co-crystal structures with tubulin are being presented that allow for the detailed examination of their binding modes. A surprising finding is that all analyzed analogs have a cis-configuration at the Val-Dil amide bond in their functionally relevant tubulin bound state, whereas in solution this bond is exclusively in the trans-configuration. This remarkable observation shines light onto the preferred binding mode of auristatins and serves as a valuable tool for structure based drug design.
Article
The formate-nitrite transporters (FNT) form a superfamily of pentameric membrane channels that translocate monovalent anions across biological membranes. FocA translocates formate bidirectionally but the mechanism underlying how translocation of formate is controlled and what governs substrate specificity remain unclear. Here we demonstrate that the normally soluble dimeric enzyme pyruvate formate-lyase (PflB), which is responsible for intracellular formate generation in enterobacteria and other microbes, interacts specifically with FocA. Association of PflB with the cytoplasmic membrane was shown to be FocA-dependent, and purified, Strep-tagged FocA specifically retrieved PflB from Escherichia coli crude extracts. Using a bacterial two-hybrid system it could be shown that the N-terminus of FocA and the central domain of PflB were involved in the interaction. This finding was confirmed by chemical cross-linking experiments. Using constraints imposed by the amino acid residues identified in the cross-linking study we provide for the first time a model for the FocA-PflB complex. The model suggests that the N-terminus of FocA is important for interaction with PflB. An in vivo assay developed to monitor changes in formate levels in the cytoplasm revealed the importance of the interaction with PflB for optimal translocation of formate by FocA. This system represents a paradigm for the control of activity of FNT channel proteins.
Article
Protein-protein interactions are essential to cellular and immune function, and in many cases, due to absence of an experimentally determined structure of the complex, these interactions must be modeled to obtain an understanding of their molecular basis. We present a user-friendly protein docking server, based on the rigid-body docking programs ZDOCK and M-ZDOCK, to predict structures of protein-protein complexes and symmetric multimers. With a goal of providing an accessible and intuitive interface, we provide options for users to guide the scoring and selection of output models, in addition to dynamic visualization of input structures and output docking models. This server enables the research community to easily and quickly produce structural models of protein-protein complexes and symmetric multimers for their own analysis. The ZDOCK server is freely available to all academic and non-profit users at: http://zdock.umassmed.edu. No registration is required. Contact: zhiping.weng@umassmed.edu or brian.pierce@umassmed.edu.
Article
We report the performance of our approaches for protein-protein docking and interface analysis in CAPRI rounds 20-26. At the core of our pipeline was the ZDOCK program for rigid-body protein-protein docking. We then reranked the ZDOCK predictions using the ZRANK or IRAD scoring functions, pruned and analyzed energy landscapes using clustering, and analyzed the docking results using our interface prediction approach RCF. When possible, we used biological information from the literature to apply constraints to the search space during or after the ZDOCK runs. For approximately half of the standard docking challenges we made at least one prediction that was acceptable or better. For the scoring challenges we made acceptable or better predictions for all but one target. This indicates that our scoring functions are generally able to select the correct binding mode. Proteins 2013. © 2013 Wiley Periodicals, Inc.
Article
We present the 5th evaluation of docking and related scoring methods used in the community-wide experiment on the Critical Assessment of Predicted Interactions (CAPRI). The evaluation examined predictions submitted for a total of 15 targets in eight CAPRI rounds held during the years 2010-2012. The targets represented one of the most diverse sets tackled by the CAPRI community so far. They included only 10 'classical' docking and scoring problems. In one of the classical targets the new challenge was to predict the position of water molecules in the protein-protein interface. The remaining 5 targets represented other new challenges that involved estimating the relative binding affinity and the effect of point mutations on the stability of designed and natural protein-protein complexes. Although the 10 'classical' CAPRI targets included two difficult multi-component systems, and a protein-oligosaccharide complex with which CAPRI participants had little experience, this evaluation indicates that the performance of docking and scoring methods has remained quite robust. More remarkably, we find that automatic docking servers exhibit a significantly improved performance, with some servers now performing on par with predictions done by humans. The performance of CAPRI participants in the new challenges, briefly reviewed here, was mediocre overall, but some groups did relatively well and their approaches suggested ways of improving methods for designing binders and for estimating the free energies of protein assemblies, which should impact the field of protein modeling and design as a whole. Proteins 2013. © 2013 Wiley Periodicals, Inc.
Article
While major progress has been achieved in the experimental techniques used for the detection of protein interactions and in the processing and analysis of the vast amount of data that they generate, we still do not understand why the set of identified interactions remains so highly dependent on the particular detection method. Here we present an overview of the major high-throughput experimental methods used to detect interactions and the datasets produced using these methods over the last 10 years. We discuss the challenges of assessing the quality of these datasets, and examine key factors that likely underlie the persistent poor overlap between the interactions detected by different methods. Lastly, we present a brief overview of the literature-curated protein interaction data stored in public databases, which are often relied upon for independent validation of newly derived interaction networks.
Article
Information-driven docking is currently one of the most successful approaches to obtain structural models of protein interactions as demonstrated in the latest round of CAPRI. While various experimental and computational techniques can be used to retrieve information about the binding mode, the availability of three-dimensional structures of the interacting partners remains a limiting factor. Fortunately, the wealth of structural information gathered by large-scale initiatives allows for homology-based modelling of a significant fraction of the protein universe. Defining the limits of information-driven docking based on such homology models is therefore highly relevant. Here we show using previous CAPRI targets, that out of a variety of measures, the global sequence identity between template and target is a simple but reliable predictor of the achievable quality of the docking models. This indicates that a well-defined overall fold is critical for the interaction. Furthermore, the quality of the data at our disposal to characterize the interaction plays a determinant role in the success of the docking. Given reliable interface information we can obtain acceptable predictions even at low global sequence identity. These results, which define the boundaries between trustworthy and unreliable predictions, should guide both experts and non-experts in defining the limits of what is achievable by docking. This is highly relevant considering that the fraction of the interactome amenable for docking is only bound to grow as the number of experimentally solved structures increases. Proteins 2013. © 2013 Wiley Periodicals, Inc.
Article
As large-scale cross-linking data becomes available, new software tools for data processing and visualization are required to replace manual data analysis. XLink-DB serves as a data storage site and visualization tool for cross-linking results. XLink-DB accepts data generated with any cross-linker and stores them in a relational database. Cross-linked sites are automatically mapped onto PDB structures if available, and results are compared to existing protein interaction databases. A protein interaction network is also automatically generated for the entire data set. The XLink-DB server, including examples, and a help page are available for noncommercial use at http://brucelab.gs.washington.edu/crosslinkdbv1/ . The source code can be viewed and downloaded at https://sourceforge.net/projects/crosslinkdb/?source=directory .
Article
Motivation: Structural characterization of protein interactions is necessary for understanding and modulating biological processes. On one hand, X-ray crystallography or NMR spectroscopy provide atomic resolution structures but the data collection process is typically long and the success rate is low. On the other hand, computational methods for modeling assembly structures from individual components frequently suffer from high false-positive rate, rarely resulting in a unique solution. Results: Here, we present a combined approach that computationally integrates data from a variety of fast and accessible experimental techniques for rapid and accurate structure determination of protein-protein complexes. The integrative method uses atomistic models of two interacting proteins and one or more datasets from five accessible experimental techniques: a small-angle X-ray scattering (SAXS) profile, 2D class average images from negative-stain electron microscopy micrographs (EM), a 3D density map from single-particle negative-stain EM, residue type content of the protein-protein interface from NMR spectroscopy and chemical cross-linking detected by mass spectrometry. The method is tested on a docking benchmark consisting of 176 known complex structures and simulated experimental data. The near-native model is the top scoring one for up to 61% of benchmark cases depending on the included experimental datasets; in comparison to 10% for standard computational docking. We also collected SAXS, 2D class average images and 3D density map from negative-stain EM to model the PCSK9 antigen-J16 Fab antibody complex, followed by validation of the model by a subsequently available X-ray crystallographic structure.
Article
A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three-dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root-mean-square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases.
Article
Most scoring functions for protein-protein docking algorithms are either atom-based or residue-based, with the former being able to produce higher quality structures and latter more tolerant to conformational changes upon binding. Earlier, we developed the ZRANK algorithm for reranking docking predictions, with a scoring function that contained only atom-based terms. Here we combine ZRANK's atom-based potentials with five residue-based potentials published by other labs, as well as an atom-based potential IFACE that we published after ZRANK. We simultaneously optimized the weights for selected combinations of terms in the scoring function, using decoys generated with the protein-protein docking algorithm ZDOCK. We performed rigorous cross validation of the combinations using 96 test cases from a docking benchmark. Judged by the integrative success rate of making 1000 predictions per complex, addition of IFACE and the best residue-based pair potential reduced the number of cases without a correct prediction by 38 and 27% relative to ZDOCK and ZRANK, respectively. Thus combination of residue-based and atom-based potentials into a scoring function can improve performance for protein-protein docking. The resulting scoring function is called IRAD (integration of residue- and atom-based potentials for docking) and is available at http://zlab.umassmed.edu.
Article
We report the performance of the ZDOCK and ZRANK algorithms in CAPRI rounds 13-19 and introduce a novel measure atom contact frequency (ACF). To compute ACF, we identify the residues that most often make contact with the binding partner in the complete set of ZDOCK predictions for each target. We used ACF to predict the interface of the proteins, which, in combination with the biological data available in the literature, is a valuable addition to our docking pipeline. Furthermore, we incorporated a straightforward and efficient clustering algorithm with two purposes: (1) to determine clusters of similar docking poses (corresponding to energy funnels) and (2) to remove redundancies from the final set of predictions. With these new developments, we achieved at least one acceptable prediction for targets 29 and 36, at least one medium-quality prediction for targets 41 and 42, and at least one high-quality prediction for targets 37 and 40; thus, we succeeded for six out of a total of 12 targets.
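The contact-frequency idea in the abstract above can be illustrated with a small sketch. It counts, for each receptor residue, the fraction of docking poses in which any atom of that residue lies within a cutoff of any ligand atom. The 6 Å cutoff, the input layout, and the name `contact_frequency` are illustrative assumptions; the abstract does not give these details, and this is not the published ACF implementation.

```python
import math
from collections import Counter

def contact_frequency(receptor_atoms, ligand_poses, cutoff=6.0):
    """Fraction of poses in which each receptor residue contacts the ligand.

    receptor_atoms: list of (residue_id, (x, y, z)) tuples for the receptor.
    ligand_poses:   list of poses; each pose is a list of (x, y, z) ligand atoms.
    cutoff:         contact distance in angstroms (hypothetical value).
    """
    counts = Counter()
    for pose in ligand_poses:
        # Residues with at least one atom within the cutoff of any ligand atom.
        contacted = {
            res_id
            for res_id, xyz in receptor_atoms
            if any(math.dist(xyz, lig) <= cutoff for lig in pose)
        }
        counts.update(contacted)
    n_poses = len(ligand_poses)
    if n_poses == 0:
        return {}
    return {res_id: counts[res_id] / n_poses for res_id, _ in receptor_atoms}

# Toy example: two single-atom residues and three single-atom ligand poses.
receptor = [(1, (0.0, 0.0, 0.0)), (2, (20.0, 0.0, 0.0))]
poses = [[(1.0, 0.0, 0.0)], [(21.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
freq = contact_frequency(receptor, poses)
```

Residues with the highest frequencies would be the candidates for a predicted interface under this simplified scheme.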
Article
Structural models of macromolecular assemblies are instrumental for gaining a mechanistic understanding of cellular processes. Determining these structures is a major challenge for experimental techniques, such as X-ray crystallography, NMR spectroscopy and electron microscopy (EM). Thus, computational modeling techniques, including molecular docking, are required. The development of most molecular docking methods has so far been focused on modeling of binary complexes. We have recently introduced the MultiFit method for modeling the structure of a multisubunit complex by simultaneously optimizing the fit of the model into an EM density map of the entire complex and the shape complementarity between interacting subunits. Here, we report algorithmic advances of the MultiFit method that result in an efficient and accurate assembly of the input subunits into their density map. The successful predictions and the increasing number of complexes being characterized by EM suggests that the CAPRI challenge could be extended to include docking-based modeling of macromolecular assemblies guided by EM.
Article
We updated our protein-protein docking benchmark to include complexes that became available since our previous release. As before, we only considered high-resolution complex structures that are nonredundant at the family-family pair level, for which the X-ray or NMR unbound structures of the constituent proteins are also available. Benchmark 4.0 adds 52 new complexes to the 124 cases of Benchmark 3.0, representing an increase of 42%. Thus, benchmark 4.0 provides 176 unbound-unbound cases that can be used for protein-protein docking method development and assessment. Seventeen of the newly added cases are enzyme-inhibitor complexes, and we found no new antigen-antibody complexes. Classifying the new cases according to expected difficulty for protein-protein docking algorithms gives 33 rigid body cases, 11 cases of medium difficulty, and 8 cases that are difficult. Benchmark 4.0 listings and processed structure files are publicly accessible at http://zlab.umassmed.edu/benchmark/.
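The counts quoted in the abstract above are internally consistent, as a short arithmetic check shows. The variable names are illustrative; the numbers are taken directly from the abstract.

```python
previous_cases = 124  # Benchmark 3.0
new_cases = 52        # added in Benchmark 4.0

total_cases = previous_cases + new_cases
percent_increase = 100 * new_cases / previous_cases

# Difficulty classes reported for the newly added cases.
new_by_difficulty = {"rigid body": 33, "medium": 11, "difficult": 8}

print(total_cases)                      # 176
print(round(percent_increase))          # 42
print(sum(new_by_difficulty.values()))  # 52
```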
Article
X-ray crystallography and NMR can provide detailed structural information of protein-protein complexes, but technical problems make their application challenging in the high-throughput regime. Other methods such as small-angle X-ray scattering (SAXS) are more promising for large-scale application, but at the cost of lower resolution, which is a problem that can be solved by complementing SAXS data with theoretical simulations. Here, we propose a novel strategy that combines SAXS data and accurate protein-protein docking simulations. The approach has been benchmarked on a large pool of known structures with synthetic SAXS data, and on three experimental examples. The combined approach (pyDockSAXS) provided a significantly better success rate (43% for the top 10 predictions) than either of the two methods alone. Further analysis of the influence of different docking parameters made it possible to increase the success rates for specific cases, and to define guidelines for improving the data-driven protein-protein docking protocols.
Article
The structure determination of symmetric dimers by NMR is impeded by the ambiguity of inter- and intramonomer NOE crosspeaks. In this paper, a calculation strategy is presented that allows the calculation of dimer structures without resolving the ambiguity by additional experiments (like asymmetric labeling). The strategy employs a molecular dynamics-based simulated annealing approach to minimize a target function. The experimental part of the target function contains distance restraints that correctly describe the ambiguity of the NOE peaks, and a novel term that restrains the symmetry of the dimer without requiring the knowledge of the symmetry axis. The use of the method is illustrated by three examples, using experimentally obtained data and model data derived from a known structure. For the purpose of testing the method, it is assumed that every NOE crosspeak is ambiguous in all three cases. It is shown that the method is useful both in situations where the structure of a homologous protein is known and in ab initio structure determination. The method can be extended to higher order symmetric multimers.
Article
F1-ATPase, an oligomeric assembly with subunit stoichiometry alpha 3 beta 3 gamma delta epsilon, is the catalytic component of the ATP synthase complex, which plays a central role in energy transduction in bacteria, chloroplasts and mitochondria. The crystal structure of bovine mitochondrial F1-ATPase displays a marked asymmetry in the conformation and nucleotide content of the catalytic beta subunits. The alpha 3 beta 3 subcomplex of F1-ATPase has been assembled from subunits of the moderately thermophilic Bacillus PS3 made in Escherichia coli, and the subcomplex is active but does not show the catalytic cooperativity of intact F1-ATPase. The structure of this subcomplex should provide new information on the conformational variability of F1-ATPase and may provide insights into the unusual catalytic mechanism employed by this enzyme. The crystal structure of the nucleotide-free bacterial alpha 3 beta 3 subcomplex of F1-ATPase, determined at 3.2 A resolution, shows that the oligomer has exact threefold symmetry. The bacterial beta subunits adopt a conformation essentially identical to that of the nucleotide-free beta subunit in mitochondrial F1-ATPase; the alpha subunits have similar conformations in both structures. The structures of the bacterial F1-ATPase alpha and beta subunits are very similar to their counterparts in the mitochondrial enzyme, suggesting a common catalytic mechanism. The study presented here allows an analysis of the different conformations adopted by the alpha and beta subunits and may ultimately further our understanding of this mechanism.
Article
Succinyl-CoA synthetase (SCS) catalyzes the reversible phosphorylation/dephosphorylation reaction $$\text{succinyl-CoA} + \text{NDP} + \text{P}_i \leftrightarrow \text{succinate} + \text{CoA} + \text{NTP}$$ where N denotes adenosine or guanosine. In the course of the reaction, an essential histidine residue is transiently phosphorylated. We have crystallized and solved the structure of the GTP-specific isoform of SCS from pig heart (EC 6.2.1.4) in both the dephosphorylated and phosphorylated forms. The structures were refined to 2.1 Å resolution. In the dephosphorylated structure, the enzyme is stabilized via coordination of a phosphate ion by the active-site histidine residue and the two "power" helices, one contributed by each subunit of the alpha beta-dimer. Small changes in the conformations of residues at the amino terminus of the power helix contributed by the alpha-subunit allow the enzyme to accommodate either the covalently bound phosphoryl group or the free phosphate ion. Structural comparisons are made between the active sites in these two forms of the enzyme, both of which can occur along the catalytic path. Comparisons are also made with the structure of Escherichia coli SCS. The domain that has been shown to bind ADP in E. coli SCS is more open in the pig heart, GTP-specific SCS structure.
Article
The structure determination of protein-protein complexes is a rather tedious and lengthy process, by both NMR and X-ray crystallography. Several methods based on docking to study protein complexes have also been well developed over the past few years. Most of these approaches are not driven by experimental data but are based on a combination of energetics and shape complementarity. Here, we present an approach called HADDOCK (High Ambiguity Driven protein-protein Docking) that makes use of biochemical and/or biophysical interaction data such as chemical shift perturbation data resulting from NMR titration experiments or mutagenesis data. This information is introduced as Ambiguous Interaction Restraints (AIRs) to drive the docking process. An AIR is defined as an ambiguous distance between all residues shown to be involved in the interaction. The accuracy of our approach is demonstrated with three molecular complexes. For two of these complexes, for which both the complex and the free protein structures have been solved, NMR titration data were available. Mutagenesis data were used in the last example. In all cases, the best structures generated by HADDOCK, that is, the structures with the lowest intermolecular energies, were the closest to the published structure of the respective complexes (within 2.0 A backbone RMSD).
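The abstract above does not give the functional form of an ambiguous restraint. One common way to express an ambiguous distance is an r^-6 summed effective distance over all candidate pairs, which is never larger than the shortest individual distance, so a single close pair is enough to satisfy the restraint. The sketch below assumes that form; the function names and the `upper_bound` parameter are hypothetical and do not reproduce the HADDOCK code.

```python
def effective_distance(pair_distances):
    """r^-6 summed effective distance over candidate pairs (assumed form).

    pair_distances: iterable of positive distances in angstroms between every
    candidate pair covered by one ambiguous restraint.
    """
    inverse_sixth_sum = sum(d ** -6 for d in pair_distances)
    return inverse_sixth_sum ** (-1.0 / 6.0)

def restraint_satisfied(pair_distances, upper_bound):
    """True if the effective distance is within the chosen upper bound."""
    return effective_distance(pair_distances) <= upper_bound

# One close pair dominates: the effective distance falls just below the
# shortest individual distance, however far apart the other pairs are.
d_eff = effective_distance([3.0, 15.0, 22.0])
```

Because the sum is dominated by the shortest distances, residues listed in the restraint that are not actually at the interface contribute almost nothing, which is what makes the restraint tolerant of ambiguous input.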
Article
A simple and reliable method for docking protein-protein complexes using (1)H(N)/(15)N chemical shift mapping and backbone (15)N-(1)H residual dipolar couplings is presented and illustrated with three complexes (EIN-HPr, IIA(Glc)-HPr, and IIA(Mtl)-HPr) of known structure. The (1)H(N)/(15)N chemical shift mapping data are transformed into a set of highly ambiguous, intermolecular distance restraints (comprising between 400 and 3000 individual distances) with translational and some degree of orientational information content, while the dipolar couplings provide information on relative protein-protein orientation. The optimization protocol employs conjoined rigid body/torsion angle dynamics in simulated annealing calculations. The target function also comprises three nonbonded interactions terms: a van der Waals repulsion term to prevent atomic overlap, a radius of gyration term (E(rgyr)) to avoid expansion at the protein-protein interface, and a torsion angle database potential of mean force to bias interfacial side chain conformations toward physically allowed rotamers. For the EIN-HPr and IIA(Glc)-HPr complexes, all structures satisfying the experimental restraints (i.e., both the ambiguous intermolecular distance restraints and the dipolar couplings) converge to a single cluster with mean backbone coordinate accuracies of 0.7-1.5 A. For the IIA(Mtl)-HPr complex, twofold degeneracy remains, and the structures cluster into two distinct solutions differing by a 180 degrees rotation about the z axis of the alignment tensor. 
The correct and incorrect solutions which have mean backbone coordinate accuracies of approximately 0.5 and approximately 10.5 A, respectively, can readily be distinguished using a variety of criteria: (a) examination of the overall (1)H(N)/(15)N chemical shift perturbation map (because the incorrect cluster predicts the presence of residues at the interface that experience only minimal chemical shift perturbations; this information is readily incorporated into the calculations in the form of ambiguous intermolecular repulsion restraints); (b) back-calculation of dipolar couplings on the basis of molecular shape; or (c) the E(rgyr) distribution which, because of its global nature, directly reflects the interfacial packing quality. This methodology should be particularly useful for high throughput, NMR-based, structural proteomics.
Article
We have developed a nonredundant benchmark for testing protein-protein docking algorithms. Currently it contains 59 test cases: 22 enzyme-inhibitor complexes, 19 antibody-antigen complexes, 11 other complexes, and 7 difficult test cases. Thirty-one of the test cases, for which the unbound structures of both the receptor and ligand are available, are classified as follows: 16 enzyme-inhibitor, 5 antibody-antigen, 5 others, and 5 difficult. Such a centralized resource should benefit the docking community not only as a large curated test set but also as a common ground for comparing different algorithms. The benchmark is available at (http://zlab.bu.edu/~rong/dock/benchmark.shtml).