Article

Rapid Refinement of Protein Interfaces Incorporating Solvation: Application to the Docking Problem


Abstract

otease-protein inhibitor and four antibody-antigen complexes. A large number of putative docked complexes have already been generated for the test systems using our rigid-body docking program, FTDOCK. They include geometries that closely resemble the crystal complex, and therefore act as a test for the refinement procedure. In the protease-inhibitors, geometries that resemble the crystal complex are ranked in the top four solutions for four out of five systems when solvation is included in the energy function, against a background of between 26 and 364 complexes in the data set. The results for the antibody-antigen complexes are not as encouraging, with only two of the four systems showing discrimination. It would appear that these results reflect the somewhat different binding mechanism dominant in the two types of protein-protein complex. Binding in the protease-inhibitors appears to be "lock and key" in nature. The fixed backbone and mobile side-chain representation provide a good model for


... where FFT(X) represents applying a forward discrete Fourier transform to X, FFT⁻¹(X) represents an inverse Fourier transform for X, and X̄ is the conjugate of the complex number X.
Local shape feature matching: distance geometry algorithm, DOCK [52]; geometric hashing, PatchDock, SymmDock, LZerD [53][54][55][56]; genetic algorithm, GAPDOCK [57].
Randomized search: Monte Carlo search, RosettaDock, ICM-DISCO, ATTRACT, HADDOCK [61][62][63][64][65][66][67][68][69][70][71]; particle swarm optimization, SwarmDock [72]; genetic algorithm, AutoDock [73].
Post-docking approach: using advanced scoring functions, RPScore, ZRANK, PyDock, EMPIRE, DARS, DECK, SIPPER, PIE, MDockPP, etc. [81][82][83][84][85][86][87][88][89][90][91][92][93][94]; considering protein flexibility, MultiDock, SmoothDock, RDOCK, FireDock, FiberDock, EigenHex, etc. [95][96][97][98][99][100][101][102][103][104]; other ranking protocols, SDU, CyClus, CONSRANK, etc. [105][106][107][108][109][110][111].
The above FFT-based search process in 3D translational space is repeated for each of the rotations for the ligand protein until the complete 3D rotational space is sampled. It should be noted that, here, a simple shape complementarity scoring method is only used for illustration purposes. ...
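The snippet above is truncated before its formula, so the following is only a minimal sketch of the general idea of scoring all translations at once with Fourier transforms. It assumes the standard correlation identity, score = FFT⁻¹(conj(FFT(receptor)) · FFT(ligand)), and uses a one-dimensional toy grid instead of a 3D one. The grid values and the function names (`fft`, `ifft`, `correlate_fft`, `correlate_direct`) are illustrative assumptions, not taken from any cited program.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey transform; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def ifft(x):
    """Inverse transform: unnormalised inverse pass, then divide by n once."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]

def correlate_fft(receptor, ligand):
    """score[s] = sum_i receptor[i] * ligand[(i + s) % n], via Fourier transforms."""
    fr = fft([complex(v) for v in receptor])
    fl = fft([complex(v) for v in ligand])
    prod = [a.conjugate() * b for a, b in zip(fr, fl)]
    return [v.real for v in ifft(prod)]

def correlate_direct(receptor, ligand):
    """Same cyclic correlation computed by brute force, for comparison."""
    n = len(receptor)
    return [sum(receptor[i] * ligand[(i + s) % n] for i in range(n))
            for s in range(n)]

# Toy 1D grids (assumed values): negative cells penalise overlap with the
# receptor core, the single positive cell rewards surface contact.
receptor = [0, 0, -5, -5, 1, 0, 0, 0]
ligand = [1, 1, 0, 0, 0, 0, 0, 0]

scores = correlate_fft(receptor, ligand)
best_shift = max(range(len(scores)), key=lambda s: scores[s])
```

In a docking setting this correlation would be evaluated on 3D grids and repeated for every sampled ligand rotation, as the snippet describes; the 1D case only shows that the transform-based result matches the direct sum.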
... Given the reasonable success of current protein–protein docking programs in generating hits in a certain number of top orientations and/or conformations, post-docking algorithms have made significant progress and achieved promising success in the community-wide Critical Assessment of PRedicted Interaction (CAPRI) experiments [76][77][78][79][80]. Examples of post-docking algorithms include using more sophisticated scoring functions, such as RPScore [81], ZRANK [82], PyDock [83], EMPIRE [84], DARS [85], DECK [86], SIPPER [87], PIE [88], MDockPP [89,90] and so on [91][92][93][94]; explicitly considering protein flexibility, such as MultiDock [95], SmoothDock [96], RDOCK [97], FireDock [98], FiberDock [99], EigenHex [100] and so on [101][102][103][104]; and other refinement and/or ranking protocols, such as SDU [105], CyClus [106], and so on [107][108][109][110][111]. ...
... Similar to FFT-based algorithms, direct search and local shape feature-matching algorithms also suffer from large conformational changes in proteins because of their rigid-body docking nature and, therefore, have similar challenges and/or limitations. Possible solutions include ensemble docking, where several different conformations of proteins are used for multiple docking calculations [63,124,125], or post-docking approaches, where protein flexibility is explicitly considered during the optimization and/or refinement stage [95][96][97][98][99][100][101][102][103][104]. Theoretically, randomized search algorithms can handle any degree of protein flexibility from small side-chain fluctuations to large domain movements. ...
... Docking is framed as a rigid alignment problem of two rigid objects with complementary shapes. Flexible docking algorithms solve the general protein docking problem termed unbound or predictive docking by prediction of binding of two proteins in their free or unbound states [7, 9, 12, 16, 18, 20, 22, 26]. This problem regards one or both proteins as flexible objects to account for significant conformational shape changes which occur during protein interactions. ...
... Flexible docking algorithms can be classified into three categories. Rigid docking with refinement methods perform rigid docking of the proteins followed by refinement of their side chains [7, 9, 12, 16, 26]. By applying side chain refinement, side chain flexibility can be accounted for to improve docking results. ...
... The methods of [7, 26] apply biased probability Monte Carlo minimization of the ligand-interacting side chains while [16] uses energy minimization. The algorithm in [12] uses side chain rotamers and rigid body minimization to relax the interfaces of docking results. These methods handle side chain flexibility but not backbone conformational changes. ...
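To make the idea of Monte Carlo side-chain refinement concrete, here is a minimal sketch of a plain Metropolis search over discrete rotamer states. It is not the biased probability Monte Carlo procedure of the cited methods: moves are proposed uniformly, there is no local minimization after each move, and the energy table `SELF`, the clash penalty, and all function names are hypothetical toy values chosen for illustration.

```python
import math
import random

def metropolis_rotamer_search(n_residues, n_rotamers, energy,
                              steps=2000, kT=0.6, seed=0):
    """Plain Metropolis sampling over rotamer indices; returns the best state seen."""
    rng = random.Random(seed)
    state = [0] * n_residues
    e = energy(state)
    best_state, best_e = list(state), e
    for _ in range(steps):
        i = rng.randrange(n_residues)          # pick one interface residue
        old = state[i]
        state[i] = rng.randrange(n_rotamers)   # propose a new rotamer for it
        e_new = energy(state)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            e = e_new
            if e < best_e:
                best_state, best_e = list(state), e
        else:
            state[i] = old                     # reject: restore previous rotamer
    return best_state, best_e

# Hypothetical per-residue rotamer energies (arbitrary units).
SELF = [[1.0, 0.2, 0.8],
        [0.5, 0.9, 0.1],
        [0.3, 0.7, 0.6],
        [0.9, 0.4, 0.2]]

def toy_energy(state):
    """Self energies plus an assumed penalty when neighbours share a rotamer index."""
    e = sum(SELF[i][r] for i, r in enumerate(state))
    e += 1.5 * sum(1 for a, b in zip(state, state[1:]) if a == b)
    return e

best_state, best_e = metropolis_rotamer_search(4, 3, toy_energy)
```

As the snippet notes, a search of this kind changes only side-chain states; the backbone is held fixed throughout.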
Conference Paper
Studies of interactions between protein domains and ligands are important in many aspects such as cellular signaling. We present a knowledge-guided approach for docking protein domains and flexible ligands. The approach is applied to the WW domain, a small protein module mediating signaling complexes which have been implicated in diseases such as muscular dystrophy and Liddle's syndrome. The first stage of the approach employs a substring search for two binding grooves of WW domains and possible binding motifs of peptide ligands based on known features. The second stage aligns the ligand's peptide backbone to the two binding grooves using a quasi-Newton constrained optimization algorithm. The backbone-aligned ligands produced serve as good starting points to the third stage which uses any flexible docking algorithm to perform the docking. The experimental results demonstrate that the backbone alignment method in the second stage performs better than conventional rigid superposition given two binding constraints. It is also shown that using the backbone-aligned ligands as initial configurations improves the flexible docking in the third stage. The presented approach can also be applied to other protein domains that involve binding of flexible ligand to two or more binding sites.
... In this study, the focus is on the first of these challenges, for which a number of approaches have been used. The simplest method of docking two structures is to treat them as rigid bodies, usually using the Fast Fourier Transform (FFT) technique [1-8], which may include a refinement stage [9], although other rigid-body methods have been used [10,11]. For many proteins, however, there are flexible deformations upon binding which can alter their geometric and electrostatic properties. ...
... For many proteins, however, there are flexible deformations upon binding which can alter their geometric and electrostatic properties. Attempts to cater for flexibility include rigid-body cross-docking of an ensemble of structures generated with molecular dynamics (MD) [12-14] and flexible refinement of rigidly docked poses [6,15-22]. Soft potentials have also been used to allow minor clashes [10,17,22], whilst others use a mean-field technique to select a conformation from a multiple copy representation [6,23] or assemble independently docked subunits connected by hinge regions [24]. A common flexible docking approach consists of an energy minimisation protocol (such as Monte Carlo, simulated annealing, simplex, MD, adopted basis Newton-Raphson or steepest descent) and a set of parameters to minimise (such as multiple copy weightings, atomic Cartesian coordinates, internal coordinates or rotamers). ...
Article
Here is presented an investigation of the use of normal modes in protein-protein docking, both in theory and in practice. Upper limits of the ability of normal modes to capture the unbound to bound conformational change are calculated on a large test set, with particular focus on the binding interface, the subset of residues from which the binding energy is calculated. Further, the SwarmDock algorithm is presented, to demonstrate that the modelling of conformational change as a linear combination of normal modes is an effective method of modelling flexibility in protein-protein docking.
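Modelling a conformational change as a linear combination of normal modes amounts to adding a weighted sum of per-atom displacement vectors to the starting coordinates. The sketch below illustrates only that arithmetic; the function name `displace_along_modes`, the two-atom system, and the mode vectors are made-up assumptions, and nothing here reflects how SwarmDock computes or optimises its modes.

```python
def displace_along_modes(coords, modes, amplitudes):
    """Return coords + sum_k amplitudes[k] * modes[k], atom by atom.

    coords:     list of (x, y, z) tuples, one per atom
    modes:      list of modes; each mode is a list of (dx, dy, dz) per atom
    amplitudes: one coefficient per mode
    """
    displaced = []
    for a, (x, y, z) in enumerate(coords):
        dx = sum(c * m[a][0] for c, m in zip(amplitudes, modes))
        dy = sum(c * m[a][1] for c, m in zip(amplitudes, modes))
        dz = sum(c * m[a][2] for c, m in zip(amplitudes, modes))
        displaced.append((x + dx, y + dy, z + dz))
    return displaced

# Hypothetical two-atom system with two modes: a stretch along x and a
# rigid shift along y (values chosen only for illustration).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
modes = [
    [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)],
]
new_coords = displace_along_modes(coords, modes, [0.1, 0.2])
```

In a flexible docking search the amplitudes would be treated as additional variables alongside the rigid-body translation and rotation.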
... Regarding the realistic problem of docking the free molecules to predict a complex structure, we performed a two-step procedure (rigid-body docking and refinement of ligand side-chains of resulting conformations) that found a near-native solution as the lowest energy conformation in seven out of 24 complexes. The results are compared with other published docking methods in Table 2. The FTDOCK rigid-body docking program followed by a refinement step (Jackson et al. 1998) found only one complex (out of five) in which the near-native solution ranked first. This program uses distance restraints to filter solutions, and presents much worse results when no experimental information is included in the procedure (in the best case, the near-native solution is ranked the 87th). ...
... If more ligand atoms out of the interface are included in the calculation of the RMSD, as in Norel et al. (1999) and BiGGER (Palma et al. 2000), the possibility exists that the overall ligand position can be close to the real complex structure, whereas the interacting residues have incorrect contacts. More inaccuracy (in evaluating the good contacts) can be introduced when both receptor and ligand atoms are included in the calculation of RMSD and both molecules are used in the superimposition onto the complex structure, as in FTDOCK reported values (Gabb et al. 1997; Jackson et al. 1998). The performance of our method, in terms of computational time, is comparable to the other published docking procedures. ...
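The point about which atoms enter the RMSD can be shown with a small worked example. The sketch below assumes the predicted and reference ligand coordinates are already in a common frame (for instance after superimposing the receptors) and computes the RMSD either over all ligand atoms or over a chosen subset. The coordinates, the choice of "interface" atoms, and the function name `rmsd` are hypothetical.

```python
import math

def rmsd(a, b, indices=None):
    """Root mean square deviation between matched coordinates, no refitting.

    If indices is given, only those atoms contribute to the average.
    """
    idx = range(len(a)) if indices is None else indices
    sq = [sum((p - q) ** 2 for p, q in zip(a[i], b[i])) for i in idx]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical ligand: atoms 0 and 1 sit at the interface, atoms 2 and 3 do not.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
predicted = [(0.0, 3.0, 0.0), (1.5, 3.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
interface_atoms = [0, 1]

rmsd_all = rmsd(predicted, reference)
rmsd_interface = rmsd(predicted, reference, interface_atoms)
```

Here the interface atoms are each displaced by 3 Å while the rest are perfectly placed, so averaging over all atoms gives a lower value than the interface-only measure, which is the effect the snippet warns about.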
... For most of the complexes, the ICM rigid-body docking step took from 2 to 7 h, and the final side-chain refinement ∼7 to 20 min per structure on a 667-MHz Alpha processor (4 to 10 h for the rigid-body docking step and 10 to 30 min per structure for the refinement on a 700-MHz Pentium III workstation running Linux). The FTDOCK rigid-body step typically took ∼6 h using eight SGI R10000 processors simultaneously (48 h/CPU; Gabb et al. 1997) and the refinement step took 10 to 40 min per structure on an SGI R10000 (Jackson et al. 1998). Norel et al. (1999) reported CPU times of 2 to 6 h on a 133-MHz personal computer. ...
Article
The association of two biological macromolecules is a fundamental biological phenomenon and an unsolved theoretical problem. Docking methods for ab initio prediction of association of two independently determined protein structures usually fail when they are applied to a large set of complexes, mostly because of inaccuracies in the scoring function and/or difficulties on simulating the rearrangement of the interface residues on binding. In this work we present an efficient pseudo-Brownian rigid-body docking procedure followed by Biased Probability Monte Carlo Minimization of the ligand interacting side-chains. The use of a soft interaction energy function precalculated on a grid, instead of the explicit energy, drastically increased the speed of the procedure. The method was tested on a benchmark of 24 protein-protein complexes in which the three-dimensional structures of their subunits (bound and free) were available. The rank of the near-native conformation in a list of candidate docking solutions was <20 in 85% of complexes with no major backbone motion on binding. Among them, as many as 7 out of 11 (64%) protease-inhibitor complexes can be successfully predicted as the highest rank conformations. The presented method can be further refined to include the binding site predictions and applied to the structures generated by the structural proteomics projects. All scripts are available on the Web.
... In these cases, the use of several side-chain copies present during docking and selection of the best-fitting copy during EM significantly enhances the performance of the approach also for docking of unbound structures. Discrete side-chain copies at atomic resolution have been used in order to predict side-chain placements in protein homology modeling (see Tuffery et al. 1991) and also in protein docking simulations for preselected docked complexes (Jackson et al. 1998; Lorber and Shoichet 1998; Lorber et al. 2002). Most related to the present approach is the method by Jackson et al. (1998), which introduces a mean field calculated from possible rotameric states of sidechains located at the protein–protein interface. ...
... Discrete side-chain copies at atomic resolution have been used in order to predict side-chain placements in protein homology modeling (see Tuffery et al. 1991) and also in protein docking simulations for preselected docked complexes (Jackson et al. 1998; Lorber and Shoichet 1998; Lorber et al. 2002). Most related to the present approach is the method by Jackson et al. (1998), which introduces a mean field calculated from possible rotameric states of sidechains located at the protein–protein interface. This mean field is used to further refine the complex geometry in translational and rotational coordinates. ...
... This mean field is used to further refine the complex geometry in translational and rotational coordinates. However, the method is limited to a small number of putative complex starting geometries because each refinement at atomic resolution requires several minutes of workstation time (Jackson et al. 1998). As a consequence, a small number of putative complexes reasonably close to the experimental geometry must either be known prior to application of the search for optimal side-chain rotamers or the initial docking result (with incorrect side-chain geometry) must be close to the correct (experimental) geometry. ...
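A mean field over rotameric states can be illustrated with a generic self-consistent scheme: each residue's rotamer probabilities follow a Boltzmann distribution over an effective energy that averages pair interactions over the current probabilities of the other residues. The sketch below is that textbook scheme under assumed toy energies, not a reproduction of the Jackson et al. (1998) implementation; `mean_field`, `self_e`, `pair_e`, and the value of `kT` are illustrative names and numbers.

```python
import math

def mean_field(self_e, pair_e, kT=0.6, iters=50):
    """Self-consistent mean-field rotamer probabilities.

    self_e[i][r]       : energy of rotamer r of residue i with the fixed environment
    pair_e(i, r, j, s) : interaction energy of rotamer r of i with rotamer s of j
    """
    n = len(self_e)
    # Start from uniform probabilities over each residue's rotamers.
    probs = [[1.0 / len(e)] * len(e) for e in self_e]
    for _ in range(iters):
        for i in range(n):
            eff = []
            for r in range(len(self_e[i])):
                e = self_e[i][r]
                for j in range(n):
                    if j != i:
                        # Average the pair term over residue j's current probabilities.
                        e += sum(probs[j][s] * pair_e(i, r, j, s)
                                 for s in range(len(self_e[j])))
                eff.append(e)
            lowest = min(eff)
            weights = [math.exp(-(e - lowest) / kT) for e in eff]
            z = sum(weights)
            probs[i] = [w / z for w in weights]
    return probs

# Hypothetical two-residue interface: both residues prefer rotamer 0 on their
# own, but the two rotamer-0 states clash with each other.
self_e = [[0.0, 0.5], [0.0, 1.0]]

def pair_e(i, r, j, s):
    return 5.0 if r == 0 and s == 0 else 0.0

probs = mean_field(self_e, pair_e)
```

With these toy numbers the residue that pays the smaller penalty for switching moves to its second rotamer, and the other residue keeps its preferred one, so the clash is resolved without enumerating every combination.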
Article
A protein-protein docking approach has been developed based on a reduced protein representation with up to three pseudo atoms per amino acid residue. Docking is performed by energy minimization in rotational and translational degrees of freedom. The reduced protein representation allows an efficient search for docking minima on the protein surfaces within. During docking, an effective energy function between pseudo atoms has been used based on amino acid size and physico-chemical character. Energy minimization of protein test complexes in the reduced representation results in geometries close to experiment with backbone root mean square deviations (RMSDs) of approximately 1 to 3 Å for the mobile protein partner from the experimental geometry. For most test cases, the energy-minimized experimental structure scores among the top five energy minima in systematic docking studies when using both partners in their bound conformations. To account for side-chain conformational changes in case of using unbound protein conformations, a multicopy approach has been used to select the most favorable side-chain conformation during the docking process. The multicopy approach significantly improves the docking performance, using unbound (apo) binding partners without a significant increase in computer time. For most docking test systems using unbound partners, and without accounting for any information about the known binding geometry, a solution within approximately 2 to 3.5 Å RMSD of the full mobile partner from the experimental geometry was found among the 40 top-scoring complexes. The approach could be extended to include protein loop flexibility, and might also be useful for docking of modeled protein structures.
... This energy gap, and therefore the discrimination of real from false positives, was improved by the more detailed refinement, in parallel with an improvement of the lysozyme backbone RMSD to 1.6 Å. Jackson et al., 1998, developed another method for filtering putative complex structures, in which rigid-body movements were refined along with side-chain torsion angles. The method used a microscopic treatment of thermodynamics, rather than the continuum description developed previously (Jackson and Sternberg, 1995). ...
... It would be useful to know if side-chain movements are more substantial than those of main-chains, as this would provide additional justification for the approach of docking procedures that simulate flexibility only in the side-chains of interface residues (for example Weng et al., 1996, and Jackson et al., 1998). Figure 5-5 shows a comparison of the side-chain RMSDs against the Cα RMSDs of the exposed regions of the control systems (plotted as crosses). ...
Thesis
The aims of the work presented in this thesis were two-fold. Firstly, an existing protein-protein docking algorithm was re-implemented on a type of computer more available than that used originally, and its behaviour was analysed in detail. This analysis led to changes in the scoring function, a treatment of electrostatic complementarity, and side-chain truncation. The algorithm had problems with its representation of surface, but more generally it pointed to difficulties in dealing with conformational change on association. Thus such changes were the second problem studied. They were measured in thirty-nine pairs of structures of complexed and unbound proteins, averaged over interface and non-interface regions and for individual residues. The significance of the changes was evaluated by comparison with the differences seen in twelve pairs of independently solved structures of identical proteins. Just over half had some substantial overall movement. Movements involved main-chains as well as side-chains, and large changes in the interface were closely involved with complex formation, while those of exposed non-interface residues were caused by flexibility and disorder. Interface movements in enzymes were similar in extent to those of inhibitors. All eight of the complexes that had structures of both components in an unbound form available showed some significant interface movement. An algorithm that was tested on five of these complexes was seen to be successful even when some of the largest changes occurred. The situation may be different in systems other than the enzyme-inhibitors which dominate this study. Thus the general model of protein-protein recognition was found to be induced fit. However, because there is only limited conformational change in many systems, recognition can be treated as lock and key to a first approximation.
... There are many different methods that have been used in the post-docking refinement stage. This includes a biased probability side-chain optimization approach as implemented in the ICM program (Abagyan, Totrov, & Kuznetsov, 1994) or side-chain minimization as employed by Multidock (Multiple copy side-chain refinement Dock) (Jackson, Gabb, & Sternberg, 1998) algorithm. The RosettaDock program (Gray et al., 2003) uses another effective method, which involves the correction of main-chain displacements. ...
... In addition, some post-docking algorithms are specialized in refining the results to a higher extent. These include ZRANK (Pierce & Weng, 2007), EMPIRE (Liang, Liu, Zhang, & Zhou, 2007), DARS (Chuang, Kozakov, Brenke, Comeau, & Vajda, 2008), DECK (Liu & Vakser, 2011), RDOCK, pyDock (Cheng et al., 2007), Eigen-Hex (Venkatraman & Ritchie, 2012), RPScore (Moont, Gabb, & Sternberg, 1999), and Multidock (Jackson et al., 1998), among others. DOCKSCORE (Huang et al., 2015) and FiltRest3D (Gajda, Tuszynska, Kaczor, Bakulina, & Bujnicki, 2010) are two of the online webservers which allow ranking of poses from protein-protein docking. ...
... Since the refinement step is consequently only applied to a very limited number of docking solutions, more time consuming computational methods can be used. Among these methods are short position restrained molecular dynamics simulations (Gillilan and Lilien, 2004; Grünberg et al., 2004; Smith et al., 2005a,b), energy minimisation procedures (Jackson et al., 1998; Li et al., 2003b; Carter et al., 2005) and the use of rotamer libraries (Jackson et al., 1998; Koch et al., 2002; Althaus et al., 2002; Carter et al., 2005) that explicitly account for possible conformational changes of side chains. Afterward, highly specific scoring functions are applied to evaluate the interaction energy. ...
... The tightly packed interface and its thermodynamic and kinetic consequence are all lost when a complex is constructed from the separately determined structures of two proteins. Indeed, more often than not one finds some interfacial structural motifs in "wrong" positions (Koshland, 1958, 1963, 1994; Jorgensen, 1991), resulting in steric clashes and unfavorable electrostatic interactions, even for high resolution x-ray structures and for proteins whose backbone remains practically invariant in the process of binding (Jackson et al., 1998; Vakser et al., 1999; Camacho et al., 2000a). These findings emphasize that upon binding, the protein interactions should lead to some degree of induced fit (Koshland, 1994; Jorgensen, 1991), resulting in the tightly packed interface. ...
... These are the barnase-barstar complex (PDB code 1brs) from the RNase-inhibitor family, the trypsin-BPTI (2ptc) and the trypsin-kallikrein (2kai) complexes from the protease-inhibitor family, and hen egg-white lysozyme bound to two different antibodies, Fab D44.1 (IgG1, ) (1mlc) and HyHEL5 (2hfl) from the antigen-antibody family (Fig. 1). The structures of these receptors and ligands have also been solved independently and, hence, have been analyzed in numerous docking studies (Jackson et al., 1998; Camacho et al., 2000a; Vakser and Aflalo, 1994; Vakser et al., 1999). MD simulations were carried out on the unbound protein ligands alone. ...
Article
When a complex is constructed from the separately determined rigid structures of a receptor and its ligand, some key side chains are usually in wrong positions. These distortions of the interface yield an apparent loss in affinity and would unfavorably affect the kinetics of association. It is generally assumed that the interacting proteins should drive the appropriate conformational changes, leading to their complementarity, but this hypothesis does not explain their fast association rates. However, nanosecond explicit solvent molecular dynamics simulations of misfolded surface side chains from the independently solved structures of barstar, bovine pancreatic trypsin inhibitor, and lysozyme show that even before any receptor-ligand interaction, key side chains frequently visit the rotamer conformations seen in the complex. We show that these simple structural motifs can reconcile most of the binding affinity required for a rapid and highly specific association process. Side chains amenable to induced fit are also identified. These results corroborate that solvent-side chain interactions play a critical role in the recognition process. Our findings are also supported by crystallographic data.
... It was found that adding the electrostatic component improved the ability of the program to find solutions similar to the known structure. This group extended their method to include protein side-chain flexibility and the effect of solvation (Jackson et al., 1998) and they have also reported tests of a pair potential function in screening docked complexes (Moont et al., 1999). The docking method of Vakser (1995, 1996) does not utilise a model with full atomic detail; instead a low resolution approach is used with simplified atom-atom and residue-residue potential terms. ...
... Flexible docking procedures are designed to accommodate such rearrangements by allowing for protein side-chain and/or backbone movement during the computation. Development of such methods has been reported by a number of groups (Jackson et al., 1998; Sandak et al., 1998); in some cases these methods have identified protein-antibody complexes with a backbone RMSD of only 1.6 Å from the crystal structure (Totrov and Abagyan, 1994). Several research groups have made use of the available crystal structure data to examine the nature of the residues and interactions involved at protein-protein interfaces. ...
Article
Molecular modelling is a powerful methodology for analysing the three dimensional structure of biological macromolecules. There are many ways in which molecular modelling methods have been used to address problems in structural biology. It is not widely appreciated that modelling methods are often an integral component of structure determination by NMR spectroscopy and X-ray crystallography. In this review we consider some of the numerous ways in which modelling can be used to interpret and rationalise experimental data and in constructing hypotheses that can be tested by experiment. Genome sequencing projects are producing a vast wealth of data describing the protein coding regions of the genome under study. However, only a minority of the protein sequences thus identified will have a clear sequence homology to a known protein. In such cases valuable three-dimensional models of the protein coding sequence can be constructed by homology modelling methods. Threading methods, which used specialised schemes to relate protein sequences to a library of known structures, have been shown to be able to identify the likely protein fold even in cases where there is no clear sequence homology. The number of protein sequences that cannot be assigned to a structural class by homology or threading methods, simply because they belong to a previously unidentified protein folding class, will decrease in the future as collaborative efforts in systematic structure determination begin to develop. For this reason, modelling methods are likely to become increasingly useful in the near future. The role of the blind prediction contests, such as the Critical Assessment of techniques for protein Structure Prediction (CASP), will be briefly discussed. Methods for modelling protein-ligand and protein-protein complexes are also described and examples of their applications given.
... They also found that intersection of solutions from geometric, geometric–electrostatic, and geometric–hydrophobic docking searches considerably improved the ranking of the nearly correct solutions. Desolvation energy in a post-scan filter was considered by Jackson et al. [29] and by Palma et al. [24]. Neither group provided information about the effect of desolvation alone. ...
... Most of the groups that participated in the first docking challenge used binding-site information as a post-scan filter that eliminated false-positive solutions [20,30]. Several groups [13,22,29,31,32] made such a filter an integral part of their prediction procedure. In contrast, Ben-Zeev and Eisenstein [33] formulated an algorithm to incorporate external information from biological, biochemical, and bioinformatics studies in the scan, generating a different set of solutions, which was biased toward solutions in which several specified residues participated in binding (or did not participate, if this was the preferred option). ...
Article
The activity of a living cell can be portrayed as a network of interactions involving proteins and nucleic acids that transfer biological information. Intervention in cellular processes requires thorough understanding of the interactions between the molecules, which can be provided by docking techniques. Docking methods attempt to predict the structures of complexes given the structures of the component molecules. We focus hereby on protein-protein docking procedures that employ grid representations of the molecules, and use correlation for searching the solution space and evaluating putative complexes. Geometric surface complementarity is the dominant descriptor in docking. Inclusion of electrostatics often improves the results of geometric docking for soluble proteins, whereas hydrophobic complementarity is more important in construction of oligomers. Using binding-site information in the scan or as a filter helps to identify and up-rank nearly correct solutions.
... The three-dimensional X-ray crystal structure of the Von Hippel-Lindau protein (VBP1, PDB: 4AJY) was downloaded from the Protein Data Bank [21]. The protein was taken as receptor and ligand protein, and the most suitable site was predicted using the Q-SiteFinder [22] ligand binding site prediction tool (http://www.modelling.leeds.ac.uk). The crystal structures of the nuclear factor-kappa B ligands NFkB1 [23], NFkB2 [24], NFkB3 [25], and NFkB4 [16] were downloaded from the Protein Data Bank. ...
Article
Molecular docking is an efficient way to study protein-protein and protein-ligand interactions in virtual mode; it provides structural annotations of molecular interactions, which are required in the drug discovery process. The Cartesian FFT approach in ‘Hex’ spherical polar Fourier (SPF) uses rotational correlations; this method is used here to study protein-protein interactions. Hepatitis B virus (HBV) X protein (HBx) is essential for virus infection and has been used in the development of therapeutics for liver cancer. It can interact with many cellular proteins. It interferes with cell viability and stimulates HBV replication. The von Hippel-Lindau binding protein 1 (VBP1) has an important role in HBx-mediated nuclear factor kappa B (NFκB) stimulation. VBP1 and HBx function as coactivators in the activation of NFκB binding. Docking results revealed that HBx and NFκB bind with VBP1 at a common site at amino acid positions Arg 161, Glu 92, and Arg 82, which may have a role in HBx-mediated NFκB activation. The lowest energy complex, VBP1-NFkB1, was obtained at -883.70 kcal/mol. The amino acids involved in the interaction among the HBx, VBP1, and NFκB proteins may be involved in transcriptional regulation and have significance in normal and abnormal regulation. These amino acid interactions may be associated with the manifestation of liver cancer.
... Many solutions generated from a pair of static molecular structures with scoring function comprise the specific position of each atom, giving rise to the simulation of modeling that is seriously sensitive to the specific packing of atoms at the interface [31][32][33]. For modeling the protein, dynamics and correct protein arrangement are required, considering scoring functions related to the feature of docking poses using techniques such as molecular dynamics (MD) [34,35]. ...
Chapter
Full-text available
The SARS-CoV-2 pandemic, which threatens world health and the economy, has become a major problem with a destructive impact. Researchers have found that conventional medical and immunological methods, based on knowledge gained from previously studied viruses, do not resolve this disease. Advances in computational biology, comprising bioinformatics, simulation, and the resulting databases, have accelerated and strengthened our ability to make predictions about biological complexes, aided by artificial intelligence. Novel drugs developed using in silico resources and in vivo imaging techniques associated with high-resolution technologies can support the reliable development of methods for the detection of antiviral drugs and the production of diagnostic kits. In the future, we will start seeing the positive effects of these novel techniques and their cost- and time-saving advantages. This chapter highlights these approaches and addresses updated knowledge currently used for research and development.
... In order to cater to protein dynamics, alterations to how models are built or how the scoring function assesses a protein arrangement are required. Possible strategies include rigid-body docking of ensembles of structures, 5−7 additional refinement stages that take place after rigid-body docking, 8,9 scoring functions that feature soft potentials to allow minor molecular overlaps, 10,11 pseudo-coarse-grained protein representations, 12 docking subunits connected by potentials, 13 matching protein surfaces represented as a collection of patches, 14 using normal modes to account for flexible conformational switches, 15 and relaxing the interface of docking poses using techniques such as molecular dynamics (MD), Monte Carlo (MC), or simulated annealing. 10,12,16 Some methods, such as HADDOCK 17 and IMP, 18 feature scoring functions that utilize a combination of terms that describe physical interactions and penalize models that do not recapitulate the available experimental data. ...
Article
Full-text available
Predicting the assembly of multiple proteins into specific complexes is critical to the understanding of their biological function in an organism, and thus the design of drugs to address their malfunction. Proteins are flexible molecules, and this inherently poses a problem to any protein docking computational method, where even a simple rearrangement of the side chain and backbone atoms at the interface of binding partners complicates the successful determination of the correct docked pose. Herein, we present a means of representing protein surface, electrostatics and local dynamics within a single volumetric descriptor. We show that our representations can be physically related to the surface accessible solvent area and mass of the protein. We then demonstrate that the application of this representation into a protein-protein docking scenario bypasses the need to compensate for, and predict, specific side chain packing at the interface of binding partners. This representation is leveraged in our de novo protein docking software, JabberDock, which we show can accurately and robustly predict difficult target complexes with an average success rate of >54%, which is comparable to or greater than currently available methods.
... First, the sequence and numbering of the PDB structures were extracted and aligned with the corresponding canonical sequence fetched from UniProt database, to ensure a correct residue numbering. Then, docking simulations were run with the Fast Fourier Transform (FFT)-based program FTDock 2.0 [51], and the resulting 10,000 rigid-body orientations were rescored by pyDock scoring function, which includes electrostatics, desolvation energy, and a limited van der Waals contribution [52]. ...
Article
Full-text available
One of the known potential effects of disease-causing amino acid substitutions in proteins is to modulate protein-protein interactions (PPIs). To interpret such variants at the molecular level and to obtain useful information for prediction purposes, it is important to determine whether they are located at protein-protein interfaces, which are composed of two main regions, core and rim, with different evolutionary conservation and physicochemical properties. Here we have performed a structural, energetics and computational analysis of interactions between proteins hosting mutations related to diseases detected in newborn screening. Interface residues were classified as core or rim, showing that the core residues contribute the most to the binding free energy of the PPI. Disease-causing variants are more likely to occur at the interface core region rather than at the interface rim (p < 0.0001). In contrast, neutral variants are more often found at the interface rim or at the non-interacting surface rather than at the interface core region. We also found that arginine, tryptophan, and tyrosine are over-represented among mutated residues leading to disease. These results can enhance our understanding of disease at molecular level and thus contribute towards personalized medicine by helping clinicians to provide adequate diagnosis and treatments.
... There are many different methods that have been used in the post-docking refinement stage. These include a biased probability side-chain optimization approach as implemented in the ICM program (Abagyan, Totrov, & Kuznetsov, 1994) or side-chain minimization as employed by the Multidock (Multiple copy side-chain refinement Dock) (Jackson, Gabb, & Sternberg, 1998) algorithm. The RosettaDock program (Gray et al., 2003) uses another effective method, which involves the correction of main-chain displacements. ...
Article
Full-text available
Resolving the three-dimensional structure of a protein is a critical step in modern drug discovery. Homology modeling is a powerful tool that can efficiently predict protein structures from their amino acid sequence. Although it might sound simple enough, homology modeling in fact has to pass through several sophisticated steps before it can predict an accurate structure of a protein. These steps include template identification, alignment with the template, model construction and many post-modeling processes. Here, we describe these steps in detail, discuss the strengths and limitations of the methods and list a number of successful homology modeling applications in the literature. The objective of this review is to shed light on this extremely useful tool and highlight many case studies in this area of active research.
... Q-SiteFinder uses several separate procedures to perform ligand binding site prediction, described in detail previously (Laurie and Jackson, 2005). Briefly, these processes include the addition of hydrogen atoms to the protein as described by Jackson et al. (1998). It then calculates the non-bonded interaction energy of a methyl probe (-CH3) with the protein at each position on a 3D grid of resolution 0.9 Å, using the GRID force field parameters as previously described (Jackson, 2002). ...
... reach the global minima provided they are able to converge [34] [35] [54] [56] [83] [88] [89] [102] [106] [122] [153], while some algorithms based on other techniques such as Monte Carlo simulation [70], cyclical search [42] [160], spatial restraint satisfaction [132], and approximation algorithms [29] [85] [162] run reasonably fast without any guarantee of global optima. Among other methods A* search [93], simulated annealing [94], mean-field optimization [73] [90], maximum edge-weighted clique [9], and integer linear programming [5] [47] are also used. All-atom representations are used in [1] [26] [27] [37] [48] [49] [100] [140] [150], all of which allow at least side chain flexibility. ...
Article
Full-text available
Protein interactions, key to many biological processes, involve induced fit between flexible proteins which typically undergo conformational changes. Modeling this flexible protein-protein docking is an important step in drug discovery, structure determination and understanding structure-function relationships. In this paper, we present F3Dock, a Fast, Flexible and Fourier-based docking algorithm which utilizes adaptive sampling of orientation and conformational spaces, and a hierarchical molecular flexibility and structure representation. Different conformations are adaptively sampled and docked using a Non-equispaced Fast Fourier based algorithm.
... Searching the side chain conformations that optimize the interface is computationally expensive for protein-protein complexes, where the interfaces of both partners need to be simultaneously optimized in combination with small amplitude rigid body adjustments. Side chain rotamer libraries, gathering the conformations most frequently found in known protein structures, can be used with benefit in the process [29]. While the conformation of particular side chains in the bound form does not necessarily coincide with a rotamer, the use of discrete side chain representations can be justified in preliminary search phases. ...
Article
Full-text available
Rapid progress of theoretical methods and computer calculation resources has turned in silico methods into a conceivable tool to predict the 3D structure of macromolecular assemblages, starting from the structure of their separate elements. Still, some classes of complexes represent a real challenge for macromolecular docking methods. In these complexes, protein parts like loops or domains undergo large amplitude deformations upon association, thus remodeling the surface accessible to the partner protein or DNA. We discuss the problems linked with managing such rearrangements in docking methods and we review strategies that are presently being explored, as well as their limitations and success.
... The obtained result of the protein immobilization on the inorganic solid surface points to the importance of electrostatic and steric complementarity. This effect is well-known in protein-protein interactions in the formation of known protein complexes 44,[70][71][72][73] . This complementarity corresponds to electrostatic and steric fit between interacting proteins. ...
Article
Full-text available
Motivated by the experimentally observed biocompatibility enhancement of nanoengineered cubic zirconia (ZrO2) coatings to mesenchymal stromal cells, we have carried out computational analysis of the initial immobilization of one known structural fragment of the adhesive protein (fibronectin) on the corresponding surface. We constructed an atomistic model of the ZrO2 nano-hillock of 3-fold symmetry based on Atomic Force Microscopy and Transmission Electron Microscopy images. First-principles quantum mechanical calculations show a substantial variation of electrostatic potential at the hillock due to the presence of surface features such as edges and vertices. Using an implemented Monte Carlo simulated annealing method, we found the orientation of the immobilized protein on the ZrO2 surface and the contribution of the amino acid residues from the protein sequence to the adsorption energy. Accounting for the variation of the dielectric permittivity at the protein-implant interface, we used a model distance-dependent dielectric function to describe the inter-atom electrostatic interactions in the adsorption potential. We found that the initial immobilization of the rigid protein fragment on the nanostructured pyramidal ZrO2 surface is achieved with a magnitude of adsorption energy larger than that of the protein on the smooth (atomically flat) surface. The strong attractive electrostatic interactions are a major contributing factor in the enhanced adsorption at the nanostructured surface. In the case of adsorption on the flat, uncharged surface this factor is negligible. We show that the best electrostatic and steric fit of the protein to the inorganic surface corresponds to a minimum of the adsorption energy determined by the non-covalent interactions.
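To make the idea of a distance-dependent dielectric concrete, here is a minimal Python sketch of a pairwise Coulomb term. The abstract above does not give the functional form of its dielectric model, so the linear choice ε(r) = r used as the default here is only an illustrative assumption, and the function name `coulomb_energy` is hypothetical. The constant 332.0637 converts e²/Å to kcal/mol.

```python
# Coulomb conversion factor: kcal·Å/(mol·e²), for charges in units of e and distances in Å.
COULOMB_K = 332.0637


def coulomb_energy(q1, q2, r, dielectric=lambda r: r):
    """Pairwise electrostatic energy in kcal/mol.

    `dielectric` maps a distance r (Å) to a relative permittivity.
    The default eps(r) = r is an assumed illustrative form, not the
    model used in the abstract above.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    return COULOMB_K * q1 * q2 / (dielectric(r) * r)


# Opposite unit charges 2 Å apart, screened by eps(r) = r.
e_screened = coulomb_energy(1.0, -1.0, 2.0)

# Same pair with a constant, water-like dielectric of 80 for comparison.
e_constant = coulomb_energy(1.0, -1.0, 2.0, dielectric=lambda r: 80.0)
```

With ε(r) = r the interaction falls off as 1/r² rather than 1/r, which is a cheap way to mimic stronger solvent screening at longer range without an explicit solvent model.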
... Protein interactions or protein-protein docking involves induced complementary fit between flexible protein interfaces, and additionally the interface conformational changes are often critical during the lock and key matching [43]. The flexible docking solution space, consisting of all relative positions, orientations and conformations of the proteins, is searched, and the putative dockings are evaluated using combinations of interface complementarity scoring and atomic pair-wise charged Coulombic interactions [27]. Since proteins function in their predominantly watery (solvent) environment, the computation of protein solvation energy (also known as protein-solvent interaction energy) also plays an important role in determining inter-molecular binding affinities "in vivo" for drug screening, as well as in molecular dynamics simulations [52], and in the study of hydrophobicity and protein folding. ...
Article
Full-text available
We present the 'Dynamic Packing Grid' (DPG), a neighborhood data structure for maintaining and manipulating flexible molecules and assemblies, for efficient computation of binding affinities in drug design or in molecular dynamics calculations. DPG can efficiently maintain the molecular surface using only linear space and supports quasi-constant time insertion, deletion and movement (i.e. updates) of atoms or groups of atoms. DPG also supports constant time neighborhood queries from arbitrary points. Our results for maintenance of molecular surface and polarization energy computations using DPG exhibit marked improvement in time and space requirements. http://www.cs.utexas.edu/~bajaj/cvc/software/DPG.shtml.
... Constraints were applied in the FTDock program to filter out all results in which Gln17 (in E5) was more than 5 Å from Thr513 (in PDGFR) or in which Asp33 (in E5) was more than 5 Å from Lys499 (in PDGFβR), since these interactions have been shown to stabilize the interaction in vivo [1]. Side-chain refinement of the remaining models was carried out using Multidock [41] with a dielectric constant of 2.0, and the results were analyzed using PyMOL version 0.99. ...
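The distance-constraint filtering described in this excerpt can be sketched as follows. This is a hypothetical Python illustration, not FTDock's own filter interface: the model layout (a dict of one representative coordinate per residue), the residue keys, and the function names are all assumptions, and the excerpt does not say which atoms the 5 Å distances were measured between.

```python
import math

MAX_DIST = 5.0  # Å, the cut-off quoted in the excerpt

# Residue pairs that must be close (E5 residue, receptor residue).
CONSTRAINTS = [("Gln17", "Thr513"), ("Asp33", "Lys499")]


def satisfies(coords, constraints=CONSTRAINTS, max_dist=MAX_DIST):
    """True if every constrained residue pair lies within max_dist."""
    return all(math.dist(coords[a], coords[b]) <= max_dist
               for a, b in constraints)


def filter_models(models):
    """Keep only docked models that satisfy all distance constraints."""
    return [m for m in models if satisfies(m["coords"])]


# Toy poses with one representative point per residue (made-up numbers).
models = [
    {"name": "pose_1", "coords": {"Gln17": (0, 0, 0), "Thr513": (3, 0, 0),
                                  "Asp33": (0, 0, 10), "Lys499": (0, 4, 10)}},
    {"name": "pose_2", "coords": {"Gln17": (0, 0, 0), "Thr513": (6, 0, 0),
                                  "Asp33": (0, 0, 10), "Lys499": (0, 4, 10)}},
    {"name": "pose_3", "coords": {"Gln17": (0, 0, 0), "Thr513": (3, 0, 0),
                                  "Asp33": (0, 0, 10), "Lys499": (0, 5.5, 10)}},
]
kept = [m["name"] for m in filter_models(models)]
```

A model is discarded as soon as either pair exceeds the cut-off, which matches the "or" wording of the excerpt.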
Article
The homodimeric E5 protein from bovine papillomavirus activates the platelet-derived growth factor β receptor through transmembrane (TM) helix-helix interactions leading to uncontrolled cell growth. Detailed structural information for the E5 dimer is essential if we are to uncover its unique mechanism of action. In vivo mutagenesis has been used to identify residues in the TM domain critical for dimerization, and we previously reported that a truncated synthetic E5 peptide forms dimers via TM domain interactions. Here we extend this work with the first application of high-resolution solution-state NMR to the study of the E5 TM domain in SDS micelles. Using selectively 15N-labelled peptides, we first probe sample homogeneity, revealing two predominant species, which we interpret to be monomer and dimer. The equilibrium between the two states is shown to be dependent on detergent concentration, revealed by intensity shifts between two sets of peaks in 15N-1H HSQC experiments, highlighting the importance of sample preparation when working with these types of proteins. This information is used to estimate a free energy of association (ΔGx° = -3.05 kcal mol⁻¹) for the dimerization of E5 in SDS micelles. In addition, chemical shift changes have been observed that indicate a more pronounced change in chemical environment for those residues expected to be at the dimer interface in vivo versus those that are not. Thus we are able to demonstrate our in vitro dimer is comparable to that defined in vivo, validating the biological significance of our synthetic peptide and providing a solid foundation upon which to base further structural studies. Using detergent concentration to modulate oligomeric state and map interfacial residues by NMR could prove useful in the study of other homo-oligomeric transmembrane proteins.
... Protein–protein interfaces were calculated from protein binary complexes by considering the region of the solvent accessible surface of one partner that lies within a given Euclidean distance cut-off (i.e. is buried) of the other binding partner. A distance cut-off of 4.5 Å was used as the default as this value is commonly used as the default in such calculations (Jackson et al., 1998; Lu et al., 2003; Ofran and Rost, 2003; Pulim et al., 2008; Vakser, 2005, 2006). We also considered interface residue delimitations at greater distance cut-offs such as 6, 6.5 and 7 Å. ...
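A distance cut-off definition of an interface can be illustrated with a short Python sketch. It is a simplification of what the excerpt describes: it flags residues of one partner that have any atom within the cut-off of any atom of the other partner, and it does not restrict the search to the solvent accessible surface. The atom representation (residue id plus x, y, z in Å) and the function name `interface_residues` are assumptions made for the example.

```python
import math


def interface_residues(partner_a, partner_b, cutoff=4.5):
    """Residue ids of partner_a with any atom within `cutoff` Å of partner_b.

    Each atom is a tuple (residue_id, x, y, z). Brute-force O(N*M) search,
    which is adequate for a sketch but slow for large complexes.
    """
    hits = set()
    for res_id, xa, ya, za in partner_a:
        if res_id in hits:
            continue  # residue already flagged by another of its atoms
        for _, xb, yb, zb in partner_b:
            if math.dist((xa, ya, za), (xb, yb, zb)) <= cutoff:
                hits.add(res_id)
                break
    return sorted(hits)


# Toy coordinates along the x axis (made-up numbers).
chain_a = [(10, 0.0, 0.0, 0.0), (11, 3.0, 0.0, 0.0),
           (12, 10.0, 0.0, 0.0), (13, 20.0, 0.0, 0.0)]
chain_b = [(50, 4.0, 0.0, 0.0)]

default_set = interface_residues(chain_a, chain_b)            # 4.5 Å default
wider_set = interface_residues(chain_a, chain_b, cutoff=7.0)  # looser cut-off
```

Raising the cut-off from 4.5 Å to 7 Å adds residue 12, which sits 6 Å from the partner, showing how the choice of cut-off changes the interface delimitation.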
Article
In this work, a model for the interaction between CYP2B4 and the FMN domain of rat P450-oxidoreductase is built using as template the structure of a bacterial redox complex. Amino acid residues identified in the literature as cytochrome P450 (CYP)-redox partner interfacial residues map to the interface in our model. Our model supports the view that the bacterial template represents a specific electron transfer complex and moreover provides a structural framework for explaining previous experimental data. We have used our model in an exhaustive search for complementary pairs of mammalian CYP and P450-oxidoreductase (POR) charge clusters. We quantitatively show that among the previously defined basic clusters, the 433K-434R cluster is the most dominant (32.3% of interactions) and among the acidic clusters, the 207D-208D-209D cluster is the most dominant (29%). Our analysis also reveals the previously not described basic cluster 343R-345K (16.1% of interactions) and 373K (3.2%) and the acidic clusters 113D-115E-116E (25.8%), 92E-93E (12.9%), 101D (3.2%) and 179E (3.2%). Cluster pairings among the previously defined charge clusters include the pairing of cluster 421K-422R to cluster 207D-208D-209D. Moreover, 433K-434R and 207D-208D-209D, respectively the dominant positively and negatively charged clusters, are uncorrelated. Instead our analysis suggests that the newly identified cluster 113D-115E-116E is the main partner of the 433K-434R cluster while the newly described cluster 343R-345K is correlated to the cluster 207D-208D-209D.
... Using the program FTDOCK (Jackson et al., 1998) and the biochemical data of Dumin and co-workers, a ternary complex between proMMP-1, collagen and integrin could be modelled (Dumin et al., 2001). Binding of MMPs to the cell surface has previously been reported for MMP-9 via CD44 (Yu & Stamenkovic, 1999) and the α2(IV) chain of collagen (Olson et al., 1998), and for MMP-2 via αVβ3 integrin (Brooks et al., 1996). ...
Article
At the moment, three human collagenases are well documented, named collagenase-1, 2 and 3, which are collectively able to cleave fibrillar collagen (collagen types I-III) albeit with different efficiency. All collagenases cleave type I collagen at a specific Gly-Leu/Ile bond to generate the characteristic ¾ - ¼ degradation products, which can be further degraded by gelatinases and serine proteinases. We have crystallized recombinant human proMMP-1 and have determined its structure to 2.2 Å resolution. The catalytic and hemopexin domains show the classical MMP-fold but also new features in surface loops. The prodomain is formed by a three-helix bundle, as observed for other MMPs, but interacts with the hemopexin domain. This interaction is supposed to stabilize the domain arrangement. Furthermore, a hemopexin domain dependent dimer was observed which is expected to contribute to the biological functions and catalytic efficiency of collagenase-1 in the extracellular matrix. L-aspartyl and L-asparaginyl residues in proteins spontaneously undergo intra-residue rearrangements forming isoaspartyl/beta-aspartyl residues linked through their side-chain beta-carboxyl group with the following amino acid. In order to avoid accumulation of isoaspartyl dipeptides left over from protein degradation, some bacteria have developed specialized isoaspartyl/beta-aspartyl zinc dipeptidases sequentially unrelated to other peptidases, which also poorly degrade alpha-aspartyl dipeptides. We have expressed and crystallized the 390 amino acid residue isoaspartyl dipeptidase (IadA) from E.coli, and have determined its crystal structure in the absence and presence of the phosphinic inhibitor Asp-Psi[PO(2)CH(2)]-LeuOH. This structure reveals an octameric particle of 422 symmetry, with each polypeptide chain organized in a (alphabeta)(8) TIM-like barrel catalytic domain attached to a U-shaped beta-sandwich domain.
At the C termini of the beta-strands of the beta-barrel, the two catalytic zinc ions are surrounded by four His, a bridging carbamylated Lys and an Asp residue, which seems to act as a proton shuttle. A large beta-hairpin loop protruding from the (alphabeta)(8) barrel is disordered in the free peptidase, but forms a flap that stoppers the barrel entrance to the active center upon binding of the dipeptide mimic. This isoaspartyl dipeptidase shows strong topological homology with the alpha-subunit of the binickel-containing ureases, the dinuclear zinc dihydroorotases, hydantoinases and phosphotriesterases. PepV from Lactobacillus delbrueckii, a dinuclear zinc peptidase, has been characterized as an unspecific amino dipeptidase. The crystal structure of PepV in complex with the phosphinic inhibitor AspPsi[PO(2)CH(2)]AlaOH, a dipeptide substrate mimetic, reveals a "catalytic domain" and a "lid domain," which together form an internal active site cavity that traps the inhibitor. The catalytic domain is topologically similar to catalytic domains from amino- and carboxypeptidases. However, the lid domain is unique among the related enzymes. In contrast to the other related exopeptidases, PepV recognizes and fixes the dipeptide backbone, while the side chains are not specifically probed and can vary, rendering it a nonspecific dipeptidase. The cocrystallized inhibitor illustrates the two roles of the two catalytic zinc ions, namely stabilization of the tetrahedral intermediate and activation of the catalytic water molecule.
... Weng et al. [128] optimized the side chains in the binding site of three protease-inhibitor complexes using exhaustive search. Jackson et al. [53] employed a self-consistent mean field approach and a Langevin dipole model for the side chain placement in several protein complexes. However, no algorithm has yet been presented that employs an optimal side chain placement for the interface refinement in protein-protein docking except for the exhaustive search of Weng et al. [128], which is severely limited in the number of side chains and thus not tractable for larger protein interfaces. ...
Article
Full-text available
In the first part of this work, we propose new methods for protein docking. First, we present two approaches to protein docking with flexible side chains. The first approach is a fast greedy heuristic, while the second is a branch-and-cut algorithm that yields optimal solutions. For a test set of protease-inhibitor complexes, both approaches correctly predict the true complex structure. Another problem in protein docking is the prediction of the binding free energy, which is the final step of many protein docking algorithms. Therefore, we propose a new approach that avoids the expensive and difficult calculation of the binding free energy and, instead, employs a scoring function that is based on the similarity of the proton nuclear magnetic resonance spectra of the tentative complexes with the experimental spectrum. Using this method, we could even predict the structure of a very difficult protein-peptide complex that could not be solved using any energy-based scoring functions. The second part of this work presents BALL (Biochemical ALgorithms Library), a framework for Rapid Application Development in the field of Molecular Modeling. BALL provides an extensive set of data structures as well as classes for Molecular Mechanics, advanced solvation methods, comparison and analysis of protein structures, file import/export, NMR shift prediction, and visualization. BALL has been carefully designed to be robust, easy to use, and open to extensions. Especially its extensibility, which results from an object-oriented and generic programming approach, distinguishes it from other software packages.
... Many methods have been developed to address this problem, and, although substantial progress has been made, the problem remains essentially unsolved (Sternberg et al., 1998). There are several reasons why a general solution to this problem has not yet been found, but the dominant factor is the extreme computational difficulty associated with including the necessary conformational flexibility in the protein partners (Betts and Sternberg, 1999; Jackson et al., 1998). For this reason it might be thought that computational methods would be of little use in treating weak interactions, but this may not be the case. ...
Article
Interactions between proteins are often sufficiently weak that their study through the use of conventional structural techniques becomes problematic. Of the few techniques capable of providing experimental measures of weak protein-protein interactions, perhaps the most useful is the second virial coefficient, B(22), which quantifies a protein solution's deviations from ideal behavior. It has long been known that B(22) can in principle be computed, but only very recently has it been demonstrated that such calculations can be performed using protein models of true atomic detail (Biophys. J. 1998, 75:2469-2477). The work reported here extends these previous efforts in an attempt to develop a transferable energetic model capable of reproducing the experimental trends obtained for two different proteins over a range of pH and ionic strengths. We describe protein-protein interaction energies by a combination of three separate terms: (i) an electrostatic interaction term based on the use of effective charges, (ii) a term describing the electrostatic desolvation that occurs when charged groups are buried by an approaching protein partner, and (iii) a solvent-accessible surface area term that is used to describe contributions from van der Waals and hydrophobic interactions. The magnitude of the third term is governed by an adjustable, empirical parameter, gamma, that is altered to optimize agreement between calculated and experimental values of B(22). The model is applied separately to the proteins lysozyme and chymotrypsinogen, yielding optimal values of gamma that are almost identical. There are, however, clear difficulties in reproducing B(22) values at the extremes of pH. Explicit calculation of the protonation states of ionizable amino acids in the 200 most energetically favorable protein-protein structures suggest that these difficulties are due to a neglect of the protonation state changes that can accompany complexation. 
Proper reproduction of the pH dependence of B(22) will, therefore, almost certainly require that account be taken of these protonation state changes. Despite this problem, the fact that almost identical gamma values are obtained from two different proteins suggests that the basic energetic formulation used here, which can be evaluated very rapidly, might find use in dynamical simulations of weak protein-protein interactions at intermediate pH values.
... The main reason for this is the intrinsic uncertainty of the protein structures to be docked, e.g. the positions of solvent-exposed side chains (Kimura et al., 2001). During the last couple of years, substantial progress has been made in developing methods that re-rank the docked conformations and attempt to select the ones close to the native, usually using a potential that accounts for the chemical affinity between the molecules, and possibly refining the interacting surfaces (Weng et al., 1996; Gabb et al., 1997; Jackson et al., 1998; Moont et al., 1999; Camacho et al., 2000a; Norel et al., 2001). These procedures improve the discrimination, and 'hits', [i.e. ...
... A large number of docking programs have been developed over the past several years to study protein complexes. HADDOCK, FTDOCK and BIGGER are some of the programs currently in use (Jackson et al., 1998; Palma et al., 2000; Dominguez et al., 2003). One of the advantages of HADDOCK (High Ambiguity Driven Protein–Protein Docking) over other programs is the ability to include biochemical data, such as titration shifts, NOEs and mutagenesis results, along with energetics and shape complementarity to drive the docking process right at the outset. ...
Article
The transcriptional activator, MarA, interacts with RNA polymerase (RNAP) to activate promoters of the mar regulon. Here, we identify the interacting surfaces of MarA and of the carboxy-terminal domain of the alpha subunit of RNAP (alpha-CTD) by NMR-based chemical shift mapping. Spectral changes were monitored for a MarA-DNA complex upon titration with alpha-CTD, and for alpha-CTD upon titration with MarA-DNA. The mapping results were confirmed by mutational studies and retention chromatography. A model of the ternary complex shows that alpha-CTD uses a '265-like determinant' to contact MarA at a surface distant from the DNA. This is unlike the interaction of alpha-CTD with the CRP or Fis activators where the '265 determinant' contacts DNA while another surface of the same alpha-CTD molecule contacts the activator. These results reveal a new versatility for alpha-CTD in transcriptional activation.
... Individual mutations were created by using MutantDock, an online computational mutagenesis server (S. J. Campbell & R. M. Jackson, unpublished data), which uses SCWRL to predict the conformation of mutant-residue side-chain positions (Bower et al., 1997) before refining the protein–peptide interaction with MultiDock (Jackson et al., 1998). The mutations required to convert the PP2.2 peptide were done sequentially in all possible orders; this did not, however, significantly change either the final predicted conformation or interaction energies. ...
Article
Full-text available
The NS5A protein of hepatitis C virus has been shown to interact with a subset of Src homology 3 (SH3) domain-containing proteins. The molecular mechanisms underlying these observations have not been fully characterized, therefore a previous analysis of NS5A-SH3 domain interactions was extended. By using a semi-quantitative ELISA assay, a hierarchy of binding between various SH3 domains for NS5A was demonstrated. Molecular modelling of a polyproline motif within NS5A (termed PP2.2) bound to the FynSH3 domain predicted that the specificity-determining RT-loop region within the SH3 domain did not interact directly with the PP2.2 motif. However, it was demonstrated that the RT loop did contribute to the specificity of binding, implicating the involvement of other intermolecular contacts between NS5A and SH3 domains. The modelling analysis also predicted a critical role for a conserved arginine located at the C terminus of the PP2.2 motif; this was confirmed experimentally. Finally, it was demonstrated that, in comparison with wild-type replicon cells, inhibition of the transcription factor AP-1, a function previously assigned to NS5A, was not observed in cells harbouring a subgenomic replicon containing a mutation within the PP2.2 motif. However, the ability of the mutated replicon to establish itself within Huh-7 cells was unaffected. The highly conserved nature of the PP2.2 motif within NS5A suggests that functions involving this motif are of importance, but are unlikely to play a role in replication of the viral RNA genome. It is more likely that they play a role in altering the cellular environment to favour viral persistence.
... Eleven leptin/CRH2 complex models were built using one dimer of the G-CSF/G-CSF receptor complex (1cd9) as template: the model for mouse leptin was superposed on G-CSF as described previously (Peelman et al., 2004b), and the 11 LR CRH2 models were superposed on the G-CSF receptor CRH. The complex interface was energy minimized using MULTIDOCK (Jackson et al., 1998). The interface properties of our models were analysed using the Protein-Protein interaction server (V1.5) (http://www.biochem.ucl.ac.uk/bsm/ ...
Article
Despite the impact of the leptin system on body weight and other physiologic processes, little is known about the binding of leptin to its receptor. The extracellular domain of the leptin receptor consists of two cytokine receptor homology (CRH) domains separated by an immunoglobulin-like domain, and followed by two juxtamembrane fibronectin type III modules. The CRH2 domain functions as a high-affinity binding site for leptin, and we previously demonstrated interaction with helices A and C of leptin. In this work, we constructed a homology model for the leptin/CRH2 complex and performed a detailed mutation analysis of the CRH2/leptin interface. Using both cell-based and in vitro binding assays using the isolated CRH2 domain, we show the critical role of hydrophobic interactions between Leu 13 and Leu 86 of leptin and Leu 504 in CRH2 in leptin binding and signalling. This binding pattern closely resembles the interaction of other four-helix bundle long chain cytokines with the CRH domain of their cognate receptors.
Thesis
The sequence of the high affinity integral membrane receptor for immunoglobulin E, designated as FcεRI, led to the proposal that it was an αβγ2 with seven transmembrane helices. The latter were connected by three loop peptides. FcεRI had five cytoplasmic and one extracellular domain peptides. Except for the seven transmembrane helices most of these peptides were synthesized and their solution structure determined, whole or in part, by a combination of circular dichroism, Fourier transform infrared and multidimensional nuclear magnetic resonance spectroscopy. This experimentally-derived structural database then served as a basis for calculating the structures of subunits, especially the β-subunit of FcεRI. To improve these determinations of the 3D structure, the arrangement of the transmembrane helices of the FcεRI was studied using molecular modelling. Specifically, protein docking methods were used to study the interaction between all transmembrane helices of the FcεRI and to establish the favourable helix surfaces for helix-helix contacts. This objective procedure led to the proposal that the transmembrane domain of the β-subunit consisted of a four helix bundle. Further information was needed to determine the 3D structure of the receptor and its subunits, specifically which helix surfaces favoured hydrophobic interactions with membrane lipids. Molecular mechanics was used to predict the relative lipid-transmembrane helix interacting surfaces. The interaction of dodecane and palmitic fatty acid with transmembrane helices led to mapping of the relative hydrophobic surfaces on these helices. This type of experiment successfully predicted the lipid-helix interaction surfaces which were in agreement with those found in the crystal structure of bacteriorhodopsin. It fully supported mapping of the hydrophobic surfaces and fatty acid interaction sites of all six different transmembrane helices of the IgE receptor.
By experimentally elucidating the conformational components of domain peptides within the FcεRI, it has therefore been possible to have an improved understanding of the structural interactions between receptor peptides, and between receptor peptides and lipids, and to model the conformation of the receptor subunits. The final 3D structures of the β-subunit were calculated by molecular dynamics using: a) NMR-based loop peptide structure; b) calculated helix-helix interactions; and c) mapping of lipid-helix interactions. The proposed structure of the β-subunit had the repeated conformational motif (transmembrane helix - turn - loop helix - bend - transmembrane helix).
Thesis
Whilst water molecules are incorporated in many biomolecular interfaces, the role they play is poorly understood, with little attention paid to their contribution in dictating the specificity of an interaction. To investigate this, the Src Homology 2 (SH2) domain of the viral Src protein kinase and its interactions with various phosphotyrosyl peptides and peptidomimetic ligands were studied. X-ray analysis shows that a feature of SH2 domains is the involvement of water molecules in the peptide binding-site. SH2 domains play a fundamental role in signal transduction and are therapeutic drug targets. Changes in water molecule content (incorporation, removal) and/or effects on binding affinity of the biomolecular complexes are examined using a combination of the thermodynamic isothermal titration calorimetry (ITC) and nanoflow electrospray ionisation mass spectrometry (ESI-MS) techniques and correlated to known structural information. The results from this study are used to predict the nature of possible water-mediated binding in the SH2 domain of Fyn using similar ligands. The role of water in binding interactions is further investigated by applying an empirical relationship based on the correlation of solvent-accessible surface area burial and changes in heat capacity (ΔCp). The experimental ΔCp is determined using ITC for the SH2/ligand interactions. The effects of proton linkage on binding are considered and five different surface area-based models are tested (relating to the treatment of conformational flexibility in the peptide ligand and the inclusion of proximal ordered solvent molecules in the surface area calculations). This allows the calculation of a range of thermodynamic state functions (ΔCp, ΔS, ΔH and ΔG) directly from structure. Comparison with the experimentally derived data shows little agreement for the observed trends in the interactions of selected phosphotyrosyl peptides and SrcSH2.
Furthermore, the different models have a dramatic effect on the calculated thermodynamic functions, thus binding energies predicted from these types of correlations are highly model dependent.
Article
We dissect the protein-protein interfaces into water preservation (WP), water hydration (WH) and water dehydration (WD) sites by comparing the water-mediated hydrogen bonds (H-bonds) in the bound and unbound states of the interacting subunits. Upon subunit complexation, if an H-bond between an interface water and a protein polar group is retained, we assign it as a WP site; if it is lost, we assign it as a WD site; and if a new H-bond is created, we assign it as a WH site. We find that the density of WD sites is highest, followed by WH and WP sites, except in antigen and (or) antibody complexes, where the density of WH sites is highest, followed by WD and WP sites. Furthermore, we find that WP sites are the most conserved, followed by WD and WH sites, in all classes of complexes except antigen and (or) antibody complexes, where WD sites are the most conserved, followed by WH and WP sites. A significant number of WP and WH sites are involved in water bridges that stabilize the subunit interactions. At WH sites, the residues involved in water bridges are significantly better conserved than the other residues. However, no such difference is observed at WP sites. Interestingly, WD sites are generally replaced with direct H-bonds upon subunit complexation. Significantly, we observe that many water-mediated H-bonds remain preserved in spite of large conformational changes upon subunit complexation. These findings have implications for predicting and engineering water binding sites at protein-protein interfaces.
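The assignment rule described in this abstract can be sketched in a few lines of Python. This is only an illustration of the stated rule: the function name `classify_site` and the boolean inputs are hypothetical, and the abstract does not say how H-bonds are detected, so that step is assumed to have been done already.

```python
from typing import Optional


def classify_site(hbond_unbound: bool, hbond_bound: bool) -> Optional[str]:
    """Label a water/polar-group site from its H-bond state before and after binding.

    hbond_unbound: water-mediated H-bond present in the unbound subunit (assumed input)
    hbond_bound:   water-mediated H-bond present in the complex (assumed input)
    """
    if hbond_unbound and hbond_bound:
        return "WP"  # retained on complexation: water preservation
    if hbond_unbound and not hbond_bound:
        return "WD"  # lost on complexation: water dehydration
    if hbond_bound:
        return "WH"  # newly created on complexation: water hydration
    return None      # no water-mediated H-bond in either state


if __name__ == "__main__":
    for pair in [(True, True), (True, False), (False, True), (False, False)]:
        print(pair, classify_site(*pair))
```

A site with no water-mediated H-bond in either state falls outside the three classes, which is why the sketch returns `None` for it.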
Book
"Molecular Materials with Specific Interactions: Modeling and Design" has a very interdisciplinary character and is intended to provide basic information as well as the details of theory and examples of its application to experimentalists and theoreticians interested in modeling molecular properties and putting into practice rational design of new materials. One of the first requirements to initiate the molecular modeling of molecular materials is an accurate and realistic description of the electronic structure, intermolecular interactions and chemical reactions at microscopic and macroscopic scale. Therefore the first four chapters contain an extensive introduction into the latest theories of intermolecular interactions, functional density techniques, microscopic and mezoscopic modeling techniques as well as first-principle molecular dynamics. In the following chapters, techniques bridging microscopic and mezoscopic modeling scales are presented. The authors then illustrate various successful applications of molecular design of new materials, drugs, biocatalysts, etc. before presenting challenging topics in molecular materials design. This book is an excellent source of information for professionals involved in research in computational chemistry and physics, material science, nanotechnology, rational drug design and molecular biology. It will benefit graduates, as well as undergraduate students exposed to the above research areas.
Article
A high titer of antibody to HBsAg (Hepatitis B virus surface antigen) (anti-HBs) is a requisite for the prevention of HB (Hepatitis B), and adjuvants generally play a great role in eliciting specific anti-HBs to HB vaccine. However, adjuvants still need to be improved because of shortcomings such as unremarkable efficacy, undesirable side effects or poor safety. In this study, we used HBsAg separated from HB patient sera to screen a human liver cDNA expression library, and found a novel HBsAg-binding protein (SBP), which is located at human chromosome 14q32.33 and is similar to human IgG heavy chain in structure. Western blot demonstrated that SBP existed in both healthy human sera and HB patient sera. Furthermore, SBP could bind to HBsAg by its N-terminal domain. Notably, we confirmed that SBP could promote dendritic cells (DC) to phagocytize HBsAg more effectively and enhance the immunogenicity of HB vaccine, when SBP was mixed proportionally with HBsAg and the resulting mixture was infused into mice. These results suggest that SBP could be developed into a safe and promising adjuvant of HB vaccine.
Article
The methods of continuum electrostatics are used to calculate the binding free energies of a set of protein–protein complexes including experimentally determined structures as well as other orientations generated by a fast docking algorithm. In the native structures, charged groups that are deeply buried were often found to favor complex formation (relative to isosteric nonpolar groups), whereas in nonnative complexes generated by a geometric docking algorithm, they were equally likely to be stabilizing as destabilizing. These observations were used to design a new filter for screening docked conformations that was applied, in conjunction with a number of geometric filters that assess shape complementarity, to 15 antibody–antigen complexes and 14 enzyme-inhibitor complexes. For the bound docking problem, which is the major focus of this paper, native and near-native solutions were ranked first or second in all but two enzyme-inhibitor complexes. Less success was encountered for antibody–antigen complexes, but in all cases studied, the more complete free energy evaluation was able to identify native and near-native structures. A filter based on the enrichment of tyrosines and tryptophans in antibody binding sites was applied to the antibody–antigen complexes and resulted in a native and near-native solution being ranked first and second in all cases. A clear improvement over previously reported results was obtained for the unbound antibody–antigen examples as well. The algorithm and various filters used in this work are quite efficient and are able to reduce the number of plausible docking orientations to a size small enough so that a final more complete free energy evaluation on the reduced set becomes computationally feasible.
Article
A series of novel bis-indole compounds, 1,omega-bis(((3-acetamino-5-methoxy-2-methylindole)-2-methylene)phenoxy)alkane, have been designed and synthesized on the basis of the enzyme structure of human nonpancreatic secretory phospholipase A2 (hnps PLA2). Their inhibition activities against hnps PLA2 were improved compared to that of the monofunctional protocompound. These bivalent ligands not only inhibited hnps PLA2 but also drove the dimerization of hnps PLA2. Their dimerization ability correlated with the linker length and position. Further study on the potent compound 5 (1,5-bis(((3-acetamino-5-methoxy-2-methylindole)-2-methylene)phenoxy)pentane, IC50 = 24 nM) revealed that cooperative binding interactions between the two enzyme molecules also contributed to the stability of the ternary complex. The combination of bivalent ligands and hnps PLA2 can be used as a novel chemically induced dimerization (CID) system for designing regulatory inhibitors.
Article
The Atomic Solvation Parameters (ASP) model has proven to be a very successful method of calculating the binding free energy of protein complexes. This suggests that incorporating it into docking algorithms should improve the accuracy of prediction. In this paper we propose an FFT-based algorithm to calculate ASP scores of protein complexes and develop an ASP-based protein-protein docking method (ASPDock). ASPDock is first tested on 21 complexes whose binding free energies have been determined experimentally. The results show that the calculated ASP scores have a stronger correlation (r ≈ 0.69) with the binding free energies than the pure shape complementarity scores (r ≈ 0.48). ASPDock is further tested on a large dataset, benchmark 3.0, which contains 124 complexes, and also shows better performance than the pure shape complementarity method in docking prediction. Comparisons with other state-of-the-art docking algorithms showed that the ASP score indeed gives a higher success rate than the pure shape complementarity score of FTDock but a lower success rate than Zdock3.0. We also developed a softly restricting method to add the information of predicted binding sites into our docking algorithm. The ASP-based docking method performed well in CAPRI rounds 18 and 19. ASP may be more accurate and physical than pure shape complementarity in describing the features of protein docking.
Article
The problem of determining the physical conformation of a protein dimer, given the structures of the two interacting proteins in their unbound state, is a difficult one. The location of the docking interface is determined largely by geometric complementarity, but finding complementary geometry is complicated by the flexibility of the backbone and side-chains of both proteins. We seek to generate candidates for docking that approximate the bound state well, even in cases where there is backbone and/or side-chain difference from unbound to bound states. We divide the surfaces of each protein into local patches and describe the effect of side-chain flexibility on each patch by sampling the space of conformations of its side-chains. Likely positions of individual side-chains are given by a rotamer library; this library is used to derive a sample of possible mutual conformations within the patch. We enforce broad coverage of torsion space. We control the size of the sample by using energy criteria to eliminate unlikely configurations, and by clustering similar configurations, resulting in 50 candidates for a patch, a manageable number for docking. Using a database of protein dimers for which the bound and unbound structures of the monomers are known, we show that from the unbound patch we are able to generate candidates for docking that approximate the bound structure. In patches where backbone change is small (within 1 Å RMSD of bound), we are able to account for flexibility and generate candidates that are good approximations of the bound state (82% are within 1 Å and 98% are within 1.4 Å RMSD of the bound conformation). We also find that even in cases of moderate backbone flexibility our candidates are able to capture some of the overall shape change. Overall, in 650 of 700 test patches we produce a candidate that is either within 1 Å RMSD of the bound conformation or is closer to the bound state than the unbound is.
Article
X-linked dyskeratosis congenita (DC) is a rare bone marrow failure syndrome caused by mostly missense mutations in the pseudouridine synthase NAP57 (dyskerin/Cbf5). As part of H/ACA ribonucleoproteins (RNPs), NAP57 is important for the biogenesis of ribosomes, spliceosomal small nuclear RNPs, microRNAs and the telomerase RNP. DC mutations concentrate in the N- and C-termini of NAP57 but not in its central catalytic domain raising questions as to their impact. We demonstrate that the N- and C-termini together form the binding surface for the H/ACA RNP assembly factor SHQ1 and that DC mutations modulate the interaction between the two proteins. Pinpointing impaired interaction between NAP57 and SHQ1 as a potential molecular basis for X-linked DC has implications for therapeutic approaches, e.g. by targeting the NAP57–SHQ1 interface with small molecules.
Article
Antizyme (Az) is a highly conserved key regulatory protein bearing a major role in regulating polyamine levels in the cell. It has the ability to bind and inhibit ornithine decarboxylase (ODC), targeting it for degradation. Az inhibitor (AzI) impairs the activity of Az. In this study, we mapped the binding sites of ODC and AzI on Az using Ala scan mutagenesis and generated models of the two complexes by constrained computational docking. In order to scan a large number of mutants in a short time, we developed a workflow combining high-throughput mutagenesis, small-scale parallel partial purification of His-tagged proteins and their immobilization on a tris-nitrilotriacetic-acid-coated surface plasmon resonance chip. This combination of techniques resulted in a significant reduction in time for production and measurement of large numbers of mutant proteins. The data-driven docking results suggest that both proteins occupy the same binding site on Az, with Az binding within a large groove in AzI and ODC. However, single-mutant data provide information concerning the location of the binding sites only, not on their relative orientations. Therefore, we generated a large number of double-mutant cycles between residues on Az and ODC and used the resulting interaction energies to restrict docking. The model of the complex is well defined and accounts for the mutant data generated here, and previously determined biochemical data for this system. Insights on the structure and function of the complexes, as well as general aspects of the method, are discussed.
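The double-mutant cycle mentioned in this abstract can be illustrated with a short calculation. The sketch below uses the common textbook definition of the coupling (interaction) energy between two residues; the function name, the sign convention and the example values are assumptions for illustration and are not taken from the paper.

```python
def coupling_energy(dg_wt: float, dg_mut_a: float, dg_mut_b: float, dg_double: float) -> float:
    """Coupling energy from a double-mutant cycle (assumed convention, e.g. kcal/mol).

    Each argument is a binding free energy: wild type, single mutant on
    protein A, single mutant on protein B, and the double mutant.
    A result near zero means the two mutations act additively, which is
    usually read as the two residues not interacting.
    """
    ddg_a = dg_mut_a - dg_wt    # effect of mutating residue A alone
    ddg_b = dg_mut_b - dg_wt    # effect of mutating residue B alone
    ddg_ab = dg_double - dg_wt  # effect of mutating both residues
    return ddg_ab - ddg_a - ddg_b


if __name__ == "__main__":
    # Hypothetical values, chosen only to show additive vs. coupled behaviour.
    print(coupling_energy(-10.0, -8.0, -9.0, -7.0))  # additive pair
    print(coupling_energy(-10.0, -8.0, -9.0, -8.0))  # coupled pair
```

Residue pairs with a clearly non-zero coupling energy are the ones that could plausibly be used as pairwise restraints in docking, as the abstract describes.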
Article
How to refine a near-native structure to make it closer to its native conformation is an unsolved problem in protein-structure and protein-protein complex-structure prediction. In this article, we first test several scoring functions for selecting locally resampled near-native protein-protein docking conformations and then propose a computationally efficient protocol for structure refinement via local resampling and energy minimization. The proposed method employs a statistical energy function based on a Distance-scaled Ideal-gas REference state (DFIRE) as an initial filter and an empirical energy function EMPIRE (EMpirical Protein-InteRaction Energy) for optimization and re-ranking. Significant improvement of final top-1 ranked structures over initial near-native structures is observed in the ZDOCK 2.3 decoy set for Benchmark 1.0 (74% whose global RMSD reduced by 0.5 Å or more and only 7% increased by 0.5 Å or more). Less significant improvement is observed for Benchmark 2.0 (38% versus 33%). Possible reasons are discussed.
Article
Bacteriophage T4 deoxycytidylate hydroxymethylase (EC 2.1.2.8), a homodimer of 246-residue subunits, catalyzes hydroxymethylation of the cytosine base in deoxycytidylate (dCMP) to produce 5-hydroxymethyl-dCMP. It forms part of a phage DNA protection system and appears to function in vivo as a component of a multienzyme complex called deoxyribonucleoside triphosphate (dNTP) synthetase. We have determined its crystal structure in the presence of the substrate dCMP at 1.6 Å resolution. The structure reveals a subunit fold and a dimerization pattern in common with thymidylate synthases, despite low (approximately 20%) sequence identity. Among the residues that form the dCMP binding site, those interacting with the sugar and phosphate are arranged in a configuration similar to the deoxyuridylate binding site of thymidylate synthases. However, the residues interacting directly or indirectly with the cytosine base show a more divergent structure and the presumed folate cofactor binding site is more open. Our structure reveals a water molecule properly positioned near C-6 of cytosine to add to the C-7 methylene intermediate during the last step of hydroxymethylation. On the basis of sequence comparison and crystal packing analysis, a hypothetical model for the interaction between T4 deoxycytidylate hydroxymethylase and T4 thymidylate synthase in the dNTP-synthesizing complex has been built.
Article
A major problem in genome annotation is whether it is valid to transfer the function from a characterised protein to a homologue of unknown activity. Here, we show that one can employ a strategy that uses a structure-based prediction of protein functional sites to assess the reliability of functional inheritance. We have automated and benchmarked a method based on the evolutionary trace approach. Using a multiple sequence alignment, we identified invariant polar residues, which were then mapped onto the protein structure. Spatial clusters of these invariant residues formed the predicted functional site. For 68 of 86 proteins examined, the method yielded information about the observed functional site. This algorithm for functional site prediction was then used to assess the validity of transferring the function between homologues. This procedure was tested on 18 pairs of homologous proteins with unrelated function and 70 pairs of proteins with related function, and was shown to be 94 % accurate. This automated method could be linked to schemes for genome annotation. Finally, we examined the use of functional site prediction in protein-protein and protein-DNA docking. The use of predicted functional sites was shown to filter putative docked complexes with a discrimination similar to that obtained by manually including biological information about active sites or DNA-binding residues.
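The first step described in this abstract, finding invariant polar residues in a multiple sequence alignment, might look roughly like the sketch below. The set of residues treated as polar and the function name are assumptions made for illustration; the abstract does not define them, and the later mapping onto the structure and spatial clustering are not shown.

```python
from typing import List

# Assumption: one-letter codes treated as polar here. The paper's own
# definition of "polar" is not given in the abstract.
POLAR = set("DEHKNQRSTY")


def invariant_polar_columns(alignment: List[str]) -> List[int]:
    """Return 0-based alignment columns where every sequence has the same polar residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        residues = {seq[i] for seq in alignment}
        # Exactly one residue type in the column, and it is polar.
        # Gap characters ('-') are not in POLAR, so gapped columns are skipped.
        if len(residues) == 1 and next(iter(residues)) in POLAR:
            columns.append(i)
    return columns


if __name__ == "__main__":
    toy_alignment = ["MKDSA", "LKDTA", "MKDSG"]  # hypothetical sequences
    print(invariant_polar_columns(toy_alignment))
```

In the toy alignment only the K and D columns are both invariant and polar; the S/T column is polar but not invariant, so it is excluded.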
Article
The protein docking problem has two major aspects: sampling conformations and orientations, and scoring them for fit. To investigate the extent to which the protein docking problem may be attributed to the sampling of ligand side-chain conformations, multiple conformations of multiple residues were calculated for the uncomplexed (unbound) structures of protein ligands. These ligand conformations were docked into both the complexed (bound) and unbound conformations of the cognate receptors, and their energies were evaluated using an atomistic potential function. The following questions were considered: (1) does the ensemble of precalculated ligand conformations contain a structure similar to the bound form of the ligand? (2) Can the large number of conformations that are calculated be efficiently docked into the receptors? (3) Can near-native complexes be distinguished from non-native complexes? Results from seven test systems suggest that the precalculated ensembles do include side-chain conformations similar to those adopted in the experimental complexes. By assuming additivity among the side chains, the ensemble can be docked in less than 12 h on a desktop computer. These multiconformer dockings produce near-native complexes and also non-native complexes. When docked against the bound conformations of the receptors, the near-native complexes of the unbound ligand were always distinguishable from the non-native complexes. When docked against the unbound conformations of the receptors, the near-native dockings could usually, but not always, be distinguished from the non-native complexes. In every case, docking the unbound ligands with flexible side chains led to better energies and a better distinction between near-native and non-native fits. An extension of this algorithm allowed for docking multiple residue substitutions (mutants) in addition to multiple conformations. The rankings of the docked mutant proteins correlated with experimental binding affinities. 
These results suggest that sampling multiple residue conformations and residue substitutions of the unbound ligand contributes to, but does not fully provide, a solution to the protein docking problem. Conformational sampling allows a classical atomistic scoring function to be used; such a function may contribute to better selectivity between near-native and non-native complexes. Allowing for receptor flexibility may further extend these results.
Article
Rigid-body docking approaches are not sufficient to predict the structure of a protein complex from the unbound (native) structures of the two proteins. Accounting for side chain flexibility is an important step towards fully flexible protein docking. This work describes an approach that allows conformational flexibility for the side chains while keeping the protein backbone rigid. Starting from candidates created by a rigid-docking algorithm, we demangle the side chains of the docking site, thus creating reasonable approximations of the true complex structure. These structures are ranked with respect to the binding free energy. We present two new techniques for side chain demangling. Both approaches are based on a discrete representation of the side chain conformational space by the use of a rotamer library. This leads to a combinatorial optimization problem. For the solution of this problem, we propose a fast heuristic approach and an exact, albeit slower, method that uses branch-and-cut techniques. As a test set, we use the unbound structures of three proteases and the corresponding protein inhibitors. For each of the examples, the highest-ranking conformation produced was a good approximation of the true complex structure.
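The combinatorial problem described here, choosing one rotamer per side chain so that the total energy is minimal, can be made concrete with a small exhaustive search. This is a brute-force stand-in written for illustration, not the fast heuristic or the branch-and-cut method of the paper; the energy tables and names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def best_rotamers(
    self_energy: List[List[float]],
    pair_energy: Dict[Tuple[int, int], List[List[float]]],
) -> Tuple[Tuple[int, ...], float]:
    """Exhaustively pick one rotamer per residue, minimising self + pairwise energy.

    self_energy[i][r]       : energy of rotamer r at residue i
    pair_energy[(i, j)][r][s]: interaction of rotamer r at residue i with rotamer s at residue j
    """
    best_combo: Tuple[int, ...] = ()
    best_value = float("inf")
    # Enumerate every assignment of one rotamer index per residue.
    for combo in product(*(range(len(e)) for e in self_energy)):
        total = sum(self_energy[i][r] for i, r in enumerate(combo))
        total += sum(table[combo[i]][combo[j]] for (i, j), table in pair_energy.items())
        if total < best_value:
            best_combo, best_value = combo, total
    return best_combo, best_value


if __name__ == "__main__":
    # Two residues with two rotamers each; all numbers are made up.
    self_e = [[0.0, 1.0], [0.5, 0.0]]
    pair_e = {(0, 1): [[2.0, 0.0], [0.0, 3.0]]}
    print(best_rotamers(self_e, pair_e))
```

The number of combinations grows exponentially with the number of flexible side chains, which is why exhaustive search is only viable for toy cases and why heuristic or exact optimization methods are needed in practice.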
Article
ClusPro (http://nrc.bu.edu/cluster) represents the first fully automated, web-based program for the computational docking of protein structures. Users may upload the coordinate files of two protein structures through ClusPro's web interface, or enter the PDB codes of the respective structures, which ClusPro will then download from the PDB server (http://www.rcsb.org/pdb/). The docking algorithms evaluate billions of putative complexes, retaining a preset number with favorable surface complementarities. A filtering method is then applied to this set of structures, selecting those with good electrostatic and desolvation free energies for further clustering. The program output is a short list of putative complexes ranked according to their clustering properties, which is automatically sent back to the user via email.
Article
We investigate the extent to which the conformational fluctuations of proteins in solution reflect the conformational changes that they undergo when they form binary protein-protein complexes. To do this, we study a set of 41 proteins that form such complexes and whose three-dimensional structures are known, both bound in the complex and unbound. We carry out molecular dynamics simulations of each protein, starting from the unbound structure, and analyze the resulting conformational fluctuations in trajectories of 5 ns in length, comparing with the structure in the complex. It is found that fluctuations take some parts of the molecules into regions of conformational space close to the bound state (or give information about it), but at no point in the simulation does each protein as a whole sample the complete bound state. Subsequent use of conformations from a clustered MD ensemble in rigid-body docking is nevertheless partially successful when compared to docking the unbound conformations, as long as the unbound conformations are themselves included with the MD conformations and the whole globally rescored. For one key example where sub-domain motion is present, a ribonuclease inhibitor, principal components analysis of the MD was applied and was also able to produce conformations for docking that gave enhanced results compared to the unbound. The most significant finding is that core interface residues show a tendency to be less mobile (by size of fluctuation or entropy) than the rest of the surface even when the other binding partner is absent, and conversely the peripheral interface residues are more mobile. This surprising result, consistent across up to 40 of the 41 proteins, suggests different roles for these regions in protein recognition and binding, and suggests ways that docking algorithms could be improved by treating these regions differently in the docking process.
Article
Success in high-resolution protein-protein docking requires accurate modeling of side-chain conformations at the interface. Most current methods either leave side chains fixed in the conformations observed in the unbound protein structures or allow the side chains to sample a set of discrete rotamer conformations. Here we describe a rapid and efficient method for sampling off-rotamer side-chain conformations by torsion space minimization during protein-protein docking starting from discrete rotamer libraries supplemented with side-chain conformations taken from the unbound structures, and show that the new method improves side-chain modeling and increases the energetic discrimination between good and bad models. Analysis of the distribution of side-chain interaction energies within and between the two protein partners shows that the new method leads to more native-like distributions of interaction energies and that the neglect of side-chain entropy produces a small but measurable increase in the number of residues whose interaction energy cannot compensate for the entropic cost of side-chain freezing at the interface. The power of the method is highlighted by a number of predictions of unprecedented accuracy in the recent CAPRI (Critical Assessment of PRedicted Interactions) blind test of protein-protein docking methods.
Article
The three-dimensional crystal structure of the complex between the Fab from the monoclonal anti-lysozyme antibody D1.3 and the antigen, hen egg white lysozyme, has been refined by crystallographic techniques using x-ray intensity data to 2.5-Å resolution. The antibody contacts the antigen with residues from all its complementarity determining regions. Antigen residues 18-27 and 117-125 form a discontinuous antigenic determinant making hydrogen bonds and van der Waals interactions with the antibody. Water molecules at or near the antigen-antibody interface mediate some contacts between antigen and antibody. The fine specificity of antibody D1.3, which does not bind (Ka < 10⁵ M⁻¹) avian lysozymes where Gln121 in the amino acid sequence is occupied by His, can be explained on the basis of the refined model.
Article
The crystal structure of the complex of the anti-lysozyme HyHEL-10 Fab and hen egg white lysozyme has been determined to a nominal resolution of 3.0 Å. The antigenic determinant (epitope) on the lysozyme is discontinuous, consisting of residues from four different regions of the linear sequence. It consists of the exposed residues of an alpha-helix together with surrounding amino acids. The epitope crosses the active-site cleft and includes a tryptophan located within this cleft. The combining site of the antibody is mostly flat with a protuberance made up of two tyrosines that penetrate the cleft. All six complementarity-determining regions of the Fab contribute at least one residue to the binding; one residue from the framework is also in contact with the lysozyme. The contacting residues on the antibody contain a disproportionate number of aromatic side chains. The antibody-antigen contact mainly involves hydrogen bonds and van der Waals interactions; there is one ion-pair interaction but it is weak.
Article
Docking algorithms play an important role in the process of rational drug design and in understanding the mechanism of molecular recognition. An important determinant for successful docking is the extent to which the configurational space (including conformational changes) of the ligand/receptor system is searched. Here we describe a new, combinatorial method for flexible docking of peptides to proteins that allows full rotation around all single bonds of the peptide ligand and around those of a large set of receptor side chains. We have simulated the binding of several viral peptides to murine major histocompatibility complex class I H-2Kb. In addition, we have explored the limits of our method by simulating a complex between calmodulin and an 18-residue long helical peptide from calmodulin-dependent protein kinase IIalpha. The calculated peptide conformations generally matched well with the X-ray structures. Essential information about local flexibility and about residues that are responsible for strong binding was obtained. We have frequently observed considerable side-chain flexibility during the simulations, showing the need for a flexible treatment of the receptor. Our method may also be useful whenever the receptor side-chain conformation is not available or uncertain, as illustrated by the docking of an H-2Kb binding nonapeptide to the receptor structure taken from an octapeptide/H-2Kb complex.
Article
The structure of bovine pancreatic trypsin inhibitor has been refined to a resolution of 1.1 Å against data collected at 125 K. The space group of the form II crystal is $P2_12_12_1$ with a = 75.39(3), b = 22.581(7), c = 28.606(9) Å (cf. a = 74.1, b = 23.4, c = 28.9 Å at room temperature). The structure was refined by restrained least-squares minimization of $\sum w(F_o^2 - F_c^2)^2$ with the SHELXL93 program. As the model improved, water molecules were included and exceptionally clear electron density was found for two residues, Gly57 and Ala58, that had been largely obscured at room temperature. The side chains of residues Glu7 and Arg53 were modelled over two positions with refined occupancy factors. The final model contains 145.6 water molecules distributed over 167 sites, and a single phosphate group disordered over two sites. The root-mean-square discrepancy between Cα atoms in residues Arg1-Gly56 at room and low temperatures is 0.4 Å. A comparison of models refined with anisotropic and isotropic thermal parameters revealed that there were no significant differences in atomic positions. The final weighted R-factor on $F^2$ ($wR_2$) for data in the range 10-1.1 Å was 35.9% for the anisotropic model and 40.9% for the isotropic model. Conventional R-factors based on F for F > 4σ(F) were 12.2 and 14.6%, respectively, corresponding to 16.1 and 18.7% on all data. These large R-factor differences were not reflected in values of $R_\mathrm{free}$, which were not significantly different at 21.5(5) and 21.8(4)%, respectively. These results, along with the relatively straightforward nature of the refinement, clearly highlight the benefits of low-temperature data collection.
Thesis
An analysis of disulphide bridges in protein structure and its application to the prediction of structural features are presented in this thesis. In Chapter 1, amongst other aspects of protein structure and folding, the use of disulphide bridges in the investigation of protein folding and in experimental work that aims to increase the stability of a protein is reviewed. In Chapter 2, the theory of cross-linkage of proteins by disulphides is presented in detail. Chapter 3 describes a procedure to make non-redundant data sets of protein chains that is designed to give as large a data set as possible. Disulphide-bridge connectivity patterns have been analysed and classified (Chapter 4). A classification that takes account of loop lengths, based on the theory of cross-linkage, has been derived. This suggests that the hypothesis that the non-local entropic effect of disulphides is important for small, disulphide-rich proteins cannot be rejected. In Chapter 5, structural principles for disulphide-bridged ß-sheet and for clustered disulphides are described. A motif, comprising two clustered disulphides packed against a ß-hairpin, is characterised and is termed the disulphide cross. It is abundant in small, disulphide-rich proteins and subsumes an array of partial similarities between such protein folds previously identified by several other workers. Aspects of the analysis in Chapter 5 have been applied to the comparative modelling of a growth factor sequentially similar to the epidermal growth factor (EGF), which is a small disulphide-rich protein (Chapter 6). The implications of the modelled structure are discussed in relation to the structures of other EGF-like growth factors and mutagenesis data thereon. Chapter 7 provides general conclusions on the work and suggestions for further developments.
Article
A method and parametrization scheme which allow fast and accurate calculations of hydration free energies are described. The solute is treated as a polarizable cavity of a shape defined by the molecular surface, containing point charges at the location of atomic nuclei. Electrostatic contributions to solvation are derived from finite difference solutions of the Poisson equation (FDPB method). Nonpolar (cavity/van der Waals) energies are added as a surface-area-dependent term, with a single surface tension coefficient (gamma) derived from hydrocarbon solubility in water. Atomic charges and radii are obtained by modifying existing force-field or quantum-mechanically-derived values, by fitting to experimental solvation energies of small organic molecules. A new, simple parameter set (parameters for solvation energy, PARSE) is developed specifically for the FDPB/gamma method, by choosing atomic charges and radii which reproduce the estimated contributions to solvation of simple functional groups. The PARSE parameters reproduce hydration free energies for a test set of 67 molecules with an average error of 0.4 kcal/mol. For amino acid side chain and peptide backbone analogs the average error is only 0.1 kcal/mol.
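The two-term decomposition described in this abstract can be written compactly. This is only a schematic form: the symbol $A$ for the solute surface area and the label on the electrostatic term are notation assumed here, no numerical value of $\gamma$ is implied, and any constant offset used in the published fit is omitted.

```latex
\Delta G_{\mathrm{hyd}} \;\approx\; \Delta G_{\mathrm{elec}}^{\mathrm{FDPB}} \;+\; \gamma\, A
```

Here $\Delta G_{\mathrm{elec}}^{\mathrm{FDPB}}$ stands for the electrostatic contribution obtained from the finite difference solution of the Poisson equation, and $\gamma A$ for the nonpolar (cavity/van der Waals) contribution.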
Article
Empirical interatomic potentials are developed for calculating the energetically most favored conformations of polypeptides and proteins. Geometric parameters, partial atomic charges, nonbonded interaction energies, hydrogen bond energies, and intrinsic torsional potentials are determined for each of the naturally occurring amino acids. The geometric parameters were obtained from a survey of the recent structural literature; the bond lengths and bond angles of a given type of residue appear to be similar from structure to structure. The partial atomic charges (overlap normalized) for every atom of each amino acid were obtained by the CNDO/2 method. Parameters of the nonbonded and hydrogen-bonding potentials were taken from previous calculations on crystals of small molecules; however, the coefficient of the nonbonded repulsive term for two atoms separated by three bonds, the central one being the bond about which rotation can take place (1-4 interactions), was reduced by a factor of 2 to make the repulsive force constants compatible with those computed from Hartree-Fock and Thomas-Fermi-Dirac repulsive potentials. Intrinsic torsional potentials were introduced, where necessary, to reproduce experimental internal rotation barriers. Interaction arrays for interatomic pairwise interactions were defined in order to distinguish hydrogen bonding, and 1-4, from 1-5, 1-6, etc., interactions. Procedures are described for carrying out conformational energy calculations on polypeptides and proteins, using these empirical potentials. Computer programs for performing these calculations are available.
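The special treatment of 1-4 pairs mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 12-6 functional form for the nonbonded term, which the abstract does not state, and the function name and coefficient values are hypothetical, chosen only for illustration.

```python
def nonbonded_energy(r, a_rep, c_att, is_one_four=False):
    """Pair energy a_rep / r**12 - c_att / r**6 (assumed 12-6 form).

    a_rep and c_att are hypothetical coefficients, not published parameters.
    For 1-4 pairs the repulsive coefficient is reduced by a factor of 2.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    if is_one_four:
        a_rep = a_rep / 2.0  # halve only the repulsive term
    return a_rep / r**12 - c_att / r**6


# Illustrative numbers only.
e_general = nonbonded_energy(3.0, a_rep=1.0e6, c_att=1.0e3)
e_one_four = nonbonded_energy(3.0, a_rep=1.0e6, c_att=1.0e3, is_one_four=True)
```

At the same separation the 1-4 pair is less repulsive than a general pair, while the attractive term is unchanged.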
Article
Different microscopic and semimicroscopic approaches for calculations of electrostatic energies in macromolecules are examined. This includes the Protein Dipoles Langevin Dipoles (PDLD) method, the semimicroscopic PDLD (PDLD/S) method, and a free energy perturbation (FEP) method. The incorporation of these approaches in the POLARIS and ENZYMIX modules of the MOLARIS package is described in detail. The PDLD electrostatic calculations are augmented by estimates of the relevant hydrophobic and steric contributions, as well as the effects of the ionic strength and external pH. Determination of the hydrophobic energy involves an approach that considers the modification of the effective surface area of the solute by local field effects. The steric contributions are analyzed in terms of the corresponding reorganization energies. Ionic strength effects are studied by modeling the ionic environment around the given system using a grid of residual charges and evaluating the relevant interaction using Coulomb's law with the dielectric constant of water. The performance of the FEP calculations is significantly enhanced by using special boundary conditions and evaluating the long-range electrostatic contributions using the Local Reaction Field (LRF) model. A diverse set of electrostatic effects are examined, including the solvation energies of charges in proteins and solutions, energetics of ion pairs in proteins and solutions, interaction between surface charges in proteins, and effect of ionic strength on such interactions, as well as electrostatic contributions to binding and catalysis in solvated proteins. Encouraging results are obtained by the microscopic and semimicroscopic approaches and the problems associated with some macroscopic models are illustrated. The PDLD and PDLD/S methods appear to be much faster than the FEP approach and still give reasonable results. 
In particular, the speed and simplicity of the PDLD/S method make it an effective strategy for calculations of electrostatic free energies in interactive docking studies. Nevertheless, comparing the results of the three approaches can provide a useful estimate of the accuracy of the calculated energies. © 1993 John Wiley & Sons, Inc.
Article
Our previously developed approaches for integrating quantum mechanical molecular orbital methods with microscopic solvent models are refined and examined. These approaches consider the nonlinear solute–solvent coupling in a self-consistent way by incorporating the potential from the solvent dipoles in the solute Hamiltonian, while considering the polarization of the solvent by the potential from the solute charges. The solvent models used include the simplified Langevin Dipoles (LD) model and the much more expensive surface constrained All Atom Solvent (SCAAS) model, which is combined with a free energy perturbation (FEP) approach. Both methods are effectively integrated with the quantum mechanical AMPAC package and can be easily combined with other quantum mechanical programs. The advantages of the present approaches and their earlier versions over macroscopic reaction field models and supermolecular approaches are considered. A LD/MNDO study of solvated organic ions demonstrates that this model can yield reliable solvation energies, provided the quantum mechanical charges are scaled to have similar magnitudes to those obtained by high level ab initio methods. The incorporation of a field-dependent hydrophobic term in the LD free energy makes the present approach capable of evaluating the free energy of transfer of polar molecules from nonpolar solvents to aqueous solutions. The reliability of the LD approach is examined not only by evaluating a rather standard set of solvation energies of organic ions and polar molecules, but also by considering the stringent test case of sterically hindered hydrophobic ions. In this case, we compare the LD/MNDO solvation energies to the more rigorous FEP/SCAAS/MNDO solvation energies. Both methods are found to give similar results even in this challenging test case. The FEP/SCAAS/AMPAC method is incorporated into the current version of the program ENZYMIX.
This option allows one to study chemical reactions in enzymes and in solutions using the MNDO and AM1 approximations. A special procedure that uses the EVB method as a reference potential for SCF MO calculations should help in improving the reliability of such studies.
Article
The importance of including different energy contributions in calculations of electrostatic energies in proteins is examined by calculating the intrinsic pKa values of the acidic groups of bovine pancreatic trypsin inhibitor. It appears that such calculations provide a powerful and revealing test; the relevant solvation energies of the ionized acids are of the order of -70 kcal/mol (1 cal = 4.184 J), and microscopic calculations that do not attempt to simulate the complete protein dielectric effect (including the surrounding solvent) can underestimate the solvation energy by as much as 50 kcal/mol. Reproducing correctly, by the same set of parameters, the solvation energies of ionized acids in different sites of a protein cannot be accomplished by including only part of the key energy contributions. The problems associated with macroscopic calculations are also considered and illustrated by the specific case of bovine pancreatic trypsin inhibitor. A promising approach is shown to be provided by a refinement of the previously developed Protein Dipoles Langevin Dipoles model. This model seems to represent consistently the microscopic dielectric of the protein and the surrounding water molecules. The model overcomes the problems associated with the macroscopic models (by treating explicitly the solvent molecules) and avoids the convergence problems associated with all-atom solvent models (by treating the average solvent polarization rather than averaging the actual polarization energy). This paper describes in detail the actual implementation of the model and examines its performance in evaluating intrinsic pKa values. Preliminary microscopic considerations of charge-charge interactions are presented.
Article
We present the development of a force field for simulation of nucleic acids and proteins. Our approach began by obtaining equilibrium bond lengths and angles from microwave, neutron diffraction, and prior molecular mechanical calculations, torsional constants from microwave, NMR, and molecular mechanical studies, nonbonded parameters from crystal packing calculations, and atomic charges from the fit of a partial charge model to electrostatic potentials calculated by ab initio quantum mechanical theory. The parameters were then refined with molecular mechanical studies on the structures and energies of model compounds. For nucleic acids, we focused on methyl ethyl ether, tetrahydrofuran, deoxyadenosine, dimethyl phosphate, 9-methylguanine-1-methylcytosine hydrogen-bonded complex, 9-methyladenine-1-methylthymine hydrogen-bonded complex, and 1,3-dimethyluracil base-stacked dimer. Bond, angle, torsional, nonbonded, and hydrogen-bond parameters were varied to optimize the agreement between calculated and experimental values for sugar pucker energies and structures, vibrational frequencies of dimethyl phosphate and tetrahydrofuran, and energies for base pairing and base stacking. For proteins, we focused on Φ,Ψ maps of glycyl and alanyl dipeptides, hydrogen-bonding interactions involving the various protein polar groups, and energy refinement calculations on insulin. Unlike the models for hydrogen bonding involving nitrogen and oxygen electron donors, an adequate description of sulfur hydrogen bonding required explicit inclusion of lone pairs.
Article
The complex formed by porcine pancreatic kallikrein A with the bovine pancreatic trypsin inhibitor (PTI) has been crystallized at pH 4 in tetragonal crystals of space group $P4_12_12$ with one molecule per asymmetric unit. Its crystal structure has been solved applying Patterson search methods and using a model derived from the bovine trypsin-PTI complex (Huber et al., 1974) and the structure of porcine pancreatic kallikrein A (Bode et al., 1983). The kallikrein-PTI model has been crystallographically refined to an R-value of 0.23 including X-ray data to 2.5 Å.
Article
Molecular surfaces are fitted to each other by a new solution to the problem of docking a ligand into the active site of a protein molecule. The procedure constructs patterns of points on the surfaces and superimposes them upon each other using a least-squares best-fit algorithm. This brings the surfaces into contact and provides a direct measure of their local complementarity. The search over the ligand surface produces a large number of dockings, of which a small fraction having the best complementarity and the least steric hindrance are evaluated for electrostatic interaction energy. When applied to molecules taken from crystallographically observed complexes, this procedure consistently assigns the lowest electrostatic energies to correct dockings. On independently determined structures, the ability of the method to discern correct dockings depends on how much conformational difference there is between the free and complexed forms of the molecules. The procedure is found to be fast enough on contemporary workstation computers to permit many conformations to be considered, and tolerant enough to make rather coarse bond dihedral sampling a practicable way to overcome the problem of structural flexibility.
Article
A modified version of the human pancreatic secretory trypsin inhibitor (PSTI), generated in a protein-design project, has been crystallized in space group $P4_3$ with lattice constants a = 40.15 Å, c = 33.91 Å. The structure has been solved by molecular replacement. Refinement of the structure by simulated annealing and conventional restrained least-squares yielded for 8.0 to 2.3 Å data a final R-value of 19.1%. Differences from the known structures of porcine PSTI complexed with trypsinogen and modified human PSTI complexed with chymotrypsinogen occur at the flexible N-terminal part of the molecule. These differences are influenced by crystal packing, as are low temperature factors for the binding loop. The geometry of the binding loop is similar to the complexed structures.
Article
A combination of enzyme kinetics and X-ray crystallographic analysis of site-specific mutants has been used to probe the determinants of substrate specificity for the enzyme alpha-lytic protease. We now present a generalized model for understanding the effects of mutagenesis on enzyme substrate specificity. This algorithm uses a library of side-chain rotamers to sample conformation space within the binding site for the enzyme-substrate complex. The free energy of each conformation is evaluated with a standard molecular mechanics force field, modified to include a solvation energy term. This rapid energy calculation based on coarse conformation sampling quite accurately predicts the relative catalytic efficiency of over 40 different alpha-lytic protease-substrate combinations. Unlike other computational approaches, with this method it is feasible to evaluate all possible mutations within the binding site. Using this algorithm, we have successfully designed a protease that is both highly active and selective for a non-natural substrate. These encouraging results indicate that it is possible to design altered enzymes solely on the basis of empirical energy calculations.
Article
Variants of the human pancreatic secretory trypsin inhibitor (PSTI) have been created during a protein design project to generate a high-affinity inhibitor with respect to some serine proteases other than trypsin. Two modified versions of human PSTI with high affinity for chymotrypsin were crystallized as a complex with chymotrypsinogen. Both crystallize isomorphously in space group $P4_12_12$ with lattice constants a = 84.4 Å, c = 86.7 Å and diffract to 2.3 Å resolution. The structure was solved by molecular replacement. The final R-value after refinement with 8.0 to 2.3 Å resolution data was 19.5% for both complexes after inclusion of about 50 bound water molecules. The overall three-dimensional structure of PSTI is similar to the structure of porcine PSTI in the trypsinogen complex (1TGS). Small differences in the relative orientation of the binding loop and the core of the inhibitors indicate flexible adaptation to the proteases. The chymotrypsinogen part of the complex is similar to chymotrypsin. After refolding induced by binding of the inhibitor the root-mean-square difference of the active site residues A186 to A195 and A217 to A222 compared to chymotrypsin was 0.26 Å.
Article
Two efficient algorithms have been developed which allow amino acid side chain conformations to be optimized rapidly for a given peptide backbone conformation. Both these approaches are based on the assumption that each side chain can be represented by a small number of rotameric states. These states have been obtained by a dynamic cluster analysis of a large data base of known crystallographic structures. Successful applications of these algorithms to the prediction of known protein conformations are presented.
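The abstract does not say which two algorithms were developed, so the following is only a generic sketch of how a rotamer representation turns side-chain placement into a discrete search: a greedy, self-consistent sweep that re-picks the best rotamer at each residue given the current choices elsewhere. The function names, the energy tables, and the toy numbers are all hypothetical.

```python
def optimize_rotamers(self_energy, pair_energy, max_sweeps=50):
    """Greedy self-consistent rotamer selection (illustrative, not the paper's method).

    self_energy[i][r]: energy of rotamer r at residue i against the fixed backbone.
    pair_energy(i, r, j, s): interaction energy of rotamer r at i with rotamer s at j.
    Returns one chosen rotamer index per residue.
    """
    n = len(self_energy)
    # Start from the rotamer with the best backbone-only energy at each residue.
    choice = [min(range(len(e)), key=e.__getitem__) for e in self_energy]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def local(r):
                # Energy of rotamer r at residue i given the current neighbours.
                return self_energy[i][r] + sum(
                    pair_energy(i, r, j, choice[j]) for j in range(n) if j != i)
            best = min(range(len(self_energy[i])), key=local)
            if local(best) < local(choice[i]):  # strict test avoids cycling on ties
                choice[i] = best
                changed = True
        if not changed:
            break
    return choice


# Toy system: two residues, two rotamers each, one clashing pair.
toy_self = [[0.0, 1.0], [0.0, 0.5]]
toy_clash = {(0, 0, 1, 0): 5.0}

def toy_pair(i, r, j, s):
    return toy_clash.get((i, r, j, s), toy_clash.get((j, s, i, r), 0.0))

toy_result = optimize_rotamers(toy_self, toy_pair)
```

On this toy input the sweep settles on `[1, 0]` (total energy 1.0) even though `[0, 1]` (total energy 0.5) is lower, which shows that a greedy scheme of this kind guarantees only a local minimum.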
Article
Predicting the structures of protein-protein complexes is a difficult problem owing to the topographical and thermodynamic complexity of these structures. Past efforts in this area have focussed on fitting the interacting proteins together using rigid body searches, usually with the conformations of the proteins as they occur in crystal structure complexes. Here we present work which uses a rigid body docking method to generate the structures of three known protein complexes, using both the bound and unbound conformations of the interacting molecules. In all cases we can regenerate the geometry of the crystal complexes to high accuracy. We also are able to find geometries that do not resemble the crystal structure but nevertheless are surprisingly reasonable both mechanistically and by some simple physical criteria. In contrast to previous work in this area, we find that simple methods for evaluating the complementarity at the protein-protein interface cannot distinguish between the configurations that resemble the crystal structure complex and those that do not. Methods that could not distinguish between such similar and dissimilar configurations include surface area burial, solvation free energy, packing and mechanism-based filtering. Evaluations of the total interaction energy and the electrostatic interaction energy of the complexes were somewhat better. Of the techniques that we tried, energy minimization distinguished most clearly between the "true" and "false" positives, though even here the energy differences were surprisingly small. We found the lowest total interaction energy from amongst all of the putative complexes generated by docking was always within 5 A root-mean-square of the crystallographic structure. There were, however, several putative complexes that were very dissimilar to the crystallographic structure but had energies that were close to that of the low energy structure. 
The magnitude of the error in energy calculations has not been established in macromolecular systems, and thus the reliability of the small differences in energy remains to be determined. The ability of this docking method to regenerate the crystallographic configurations of the interacting proteins using their unbound conformations suggests that it will be a useful tool in predicting the structures of unsolved complexes.
Article
Molecular recognition is achieved through the complementarity of molecular surface structures and energetics with, most commonly, associated minor conformational changes. This complementarity can take many forms: charge-charge interaction, hydrogen bonding, van der Waals' interaction, and the size and shape of surfaces. We describe a method that exploits these features to predict the sites of interactions between two cognate molecules given their three-dimensional structures. We have developed a “cube representation” of molecular surface and volume which enables us not only to design a simple algorithm for a six-dimensional search but also to allow implicitly the effects of the conformational changes caused by complex formation. The present molecular docking procedure may be divided into two stages. The first is the selection of a population of complexes by geometric “soft docking”, in which surface structures of two interacting molecules are matched with each other, allowing minor conformational changes implicitly, on the basis of complementarity in size and shape, close packing, and the absence of steric hindrance. The second is a screening process to identify a subpopulation with many favorable energetic interactions between the buried surface areas. Once the size of the subpopulation is small, one may further screen to find the correct complex based on other criteria or constraints obtained from biochemical, genetic, and theoretical studies, including visual inspection.
Article
The potential use of monoclonal antibodies in immunological, chemical and clinical applications has stimulated the protein engineering and expression of Fv fragments, which are heterodimers consisting of the light and heavy chain variable domains (VL and VH) of antibodies. Although Fv fragments exhibit antigen binding specificity and association constants similar to their parent antibodies or Fab moieties, similarity in their interactions with antigen at the level of three-dimensional structure has not been investigated. We have determined the high-resolution crystal structure of the genetically engineered FvD1.3 fragment of the anti-hen egg-white lysozyme (HEL) monoclonal antibody D1.3, and of its complex with HEL. On comparison with the crystallographically refined FabD1.3-HEL complex, we find that FvD1.3 and FabD1.3 make, with minor exceptions, very similar contacts with the antigen. Furthermore, a small but systematic rearrangement of the domains of FvD1.3 occurs on binding HEL, bringing the contacting residues closer to the antigen by a mean value of about 0.7 Å for VH (aligning on VL) or of 0.5 Å for VL (aligning on VH). This is indicative of an induced fit rather than a 'lock and key' fit to the antigen.
Article
The crystal structures of the molecular complexes between two serine proteinases and two of their protein inhibitors have been determined: subtilisin Carlsberg with the recombinant form of eglin-c from the leech Hirudo medicinalis and subtilisin Novo with chymotrypsin inhibitor 2 from barley seeds. The structures have been fully refined by restrained-parameter least-squares methods to crystallographic R factors ($\sum ||F_o| - |F_c|| / \sum |F_o|$) of 0.136 at 1.8-Å resolution and 0.154 at 2.1-Å resolution, respectively. The 274 equivalent alpha-carbon atoms of the enzymes superpose with an rms deviation of 0.53 Å. Sequence changes between the enzymes result in localized structural adjustments. Functional groups in the active sites superpose with an rms deviation of 0.19 Å for 161 equivalent atoms; this close similarity in the conformation of active-site residues provides no obvious reason for known differences in catalytic activity between Carlsberg and Novo. Conformational changes in the active-site region indicate a small induced fit of enzyme and inhibitor. Some conformational differences are observed between equivalent active-site residues of subtilisin Carlsberg and alpha-chymotrypsin. Despite differences in tertiary architecture, most enzyme-substrate (inhibitor) interactions are maintained. Subtilisin Carlsberg has a rare cis-peptide bond preceding Thr211 (Gly211 in Novo). Both enzymes contain tightly bound Ca²⁺ ions. Site 1 is heptacoordinate with the oxygen atoms at the vertices of a pentagonal bipyramid. Site 2 in Carlsberg is probably occupied by a K⁺ ion in Novo. Conserved water molecules appear to play important structural roles in the enzyme interior, in the inhibitor beta-sheet, and at the enzyme-inhibitor interface. The 62 equivalent alpha-carbon atoms of the inhibitors superpose with an rms deviation of 1.68 Å. Sequence changes result in somewhat different packing of the alpha-helix, beta-sheet, and reactive-site loop relative to each other.
Hydrogen bonds and electrostatic interactions supporting the conformation of the reactive-site loop are conserved. The 24 main-chain plus Cβ atoms of P4 to P1' overlap with an rms deviation of 0.19 Å. Features contributing to the inhibitory nature of eglin-c and CI-2 are discussed.
Article
We report here the X-ray crystal structure of native subtilisin Carlsberg, solved at 2.5 Å resolution by molecular replacement and refined by restrained least squares to a crystallographic residual of 0.206. We compare this structure to the crystal structure of subtilisin BPN'. We find that, despite 82 amino acid substitutions and one deletion in subtilisin Carlsberg relative to subtilisin BPN', the structures of these enzymes are remarkably similar. We calculate an r.m.s. difference between equivalent α-carbon positions in subtilisin Carlsberg and subtilisin BPN' of only 0.55 Å. This confirms previous reports of extensive structural homology between these two subtilisins based on X-ray crystal structures of the complex of eglin-c with subtilisin Carlsberg [McPhalen, C.A., Schnebli, H.P. and James, M.N.G. (1985) FEBS Lett., 188, 55; Bode, W., Papamokos, E. and Musil, D. (1987) Eur. J. Biochem., 166, 673-692]. In addition, we find that the native active sites of subtilisins Carlsberg and BPN' are virtually identical. While conservative substitutions at residues 217 and 156 may have subtle effects on the environments of substrate-binding sites S1' and S1 respectively, we find no obvious structural correlate for reports that subtilisins Carlsberg and BPN' differ in their recognition of model substrates. In particular, we find no evidence that the hydrophobic binding pocket S1 in subtilisin Carlsberg is 'deeper', 'narrower' or 'less polar' than the corresponding binding site in subtilisin BPN' [Karasaki and Ohno (1978) J. Biochem., Tokyo, 84, 531–538].
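The r.m.s. difference between equivalent atom positions quoted in this abstract is a simple quantity to compute once two structures share a coordinate frame. The sketch below assumes the two coordinate lists are already superposed and listed in matching order; the superposition step itself is not shown, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) positions.

    Assumes both lists are already superposed and ordered so that
    coords_a[k] and coords_b[k] are equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two atoms, each displaced by exactly 1 unit along z.
example_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```

The value depends on which atoms are treated as equivalent, so comparisons are meaningful only when the same atom selection is used.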
Article
In this report we describe an accurate numerical method for calculating the total electrostatic energy of molecules of arbitrary shape and charge distribution, accounting for both Coulombic and solvent polarization terms. In addition to the solvation energies of individual molecules, the method can be used to calculate the electrostatic energy associated with conformational changes in proteins as well as changes in solvation energy that accompany the binding of charged substrates. The validity of the method is examined by calculating the hydration energies of acetate, methyl ammonium, ammonium, and methanol. The method is then used to study the relationship between the depth of a charge within a protein and its interaction with the solvent. Calculations of the relative electrostatic energies of crystal and misfolded conformations of Themiste dyscritum hemerythrin and the VL domain of an antibody are also presented. The results indicate that electrostatic charge-solvent interactions strongly favor the crystal structures. More generally, it is found that charge-solvent interactions, which are frequently neglected in protein structure analysis, can make large contributions to the total energy of a macromolecular system.
Article
A procedure, CONGEN, for uniformly sampling the conformational space of short polypeptide segments in proteins has been implemented. Because the time required for this sampling grows exponentially with the number of residues, parameters are introduced to limit the conformational space that has to be explored. This is done by the use of the empirical energy function of CHARMM [B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan and M. Karplus (1983) J. Comput. Chem. 4, 187-217] and truncating the search when conformations of grossly unfavorable energy are sampled. Tests are made to determine control parameters that optimize the search without excluding important configurations. When applied to known protein structures, the resulting procedure is generally capable of generating conformations where the lowest energy conformation matches the known structure within a rms deviation of 1 Å.
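The idea of truncating an exponentially growing search when a partial conformation becomes grossly unfavorable can be sketched as a depth-first enumeration with an energy cutoff. This is not the CONGEN implementation: the function names, the discrete state lists, and the toy energy function are all hypothetical, and no real energy function is evaluated.

```python
def enumerate_conformations(options, step_energy, cutoff):
    """Depth-first enumeration of discrete conformations with energy pruning.

    options[k]: allowed discrete states (e.g. torsion values) at position k.
    step_energy(partial, state): energy added by appending state to partial.
    cutoff: branches whose accumulated energy exceeds this are abandoned.
    Returns (energy, conformation) pairs sorted by energy.
    """
    results = []

    def extend(partial, energy):
        if len(partial) == len(options):
            results.append((energy, tuple(partial)))
            return
        for state in options[len(partial)]:
            e = energy + step_energy(partial, state)
            if e > cutoff:
                continue  # truncate: do not explore below this branch
            partial.append(state)
            extend(partial, e)
            partial.pop()

    extend([], 0.0)
    results.sort()
    return results


# Toy energy: repeating the previous state is heavily penalised.
def toy_step(partial, state):
    return 10.0 if partial and partial[-1] == state else 1.0

toy_confs = enumerate_conformations([[-60, 60, 180]] * 3, toy_step, cutoff=5.0)
```

With three states at each of three positions there are 27 combinations; the cutoff removes every branch containing an adjacent repeat, leaving 12. Pruning on partial energy is safe only when later terms cannot lower the total enough to rescue a branch, which is why the abstract describes tuning the control parameters so that important configurations are not excluded.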
Article
Chymotrypsin inhibitor 2 (CI-2), a serine proteinase inhibitor from barley seeds, has been crystallized and its three-dimensional structure determined at 2.0-Å resolution by the molecular replacement method. The structure has been refined by restrained-parameter least-squares methods to a crystallographic R factor ($R = \sum \lvert \lvert F_o \rvert - \lvert F_c \rvert \rvert / \sum \lvert F_o \rvert$) of 0.198. CI-2 is a member of the potato inhibitor 1 family. It lacks the characteristic stabilizing disulfide bonds of most other members of serine proteinase inhibitor families. The body of CI-2 shows few conformational changes between the free inhibitor and the previously reported structure of CI-2 in complex with subtilisin Novo [McPhalen, C.A., Svendsen, I., Jonassen, I., & James, M.N.G. (1985) Proc. Natl. Acad. Sci. U.S.A. 82, 7242-7246]. However, the reactive site loop has some significant conformational differences between the free inhibitor and its complexed form. The residues in this segment of polypeptide exhibit relatively large thermal motion parameters and some disorder in the uncomplexed form of the inhibitor. The reactive site bond is between Met-59I and Glu-60I in the consecutive sequential numbering of CI-2 (Met-60-Glu-61 according to the alignment of Svendsen et al. [Svendsen, I., Hejgaard, J., & Chavan, J.K. (1984) Carlsberg Res. Commun. 49, 493-502]). The network of hydrogen bonds and electrostatic interactions stabilizing the conformation of the reactive site loop is much less extensive in the free than in the complexed inhibitor.
Article
OMSVP3 and OMTKY3 (third domains of silver pheasant and turkey ovomucoid inhibitor) are Kazal-type serine proteinase inhibitors. They have been isomorphously crystallized in the monoclinic space group C2 with cell dimensions of a = 4.429 nm, b = 2.115 nm, c = 4.405 nm, beta = 107 degrees. The asymmetric unit contains one molecule corresponding to an extremely low volume per unit molecular mass of 0.0017 nm3/Da. Data collection was only possible for the OMSVP3 crystals. Orientation and position of the OMSVP3 molecules in the monoclinic unit cells were determined using Patterson search methods and the known structure of the third domain of Japanese quail ovomucoid (OMJPQ3) [Papamokos, E., Weber, E., Bode, W., Huber, R., Empie, M. W., Kato, I. and Laskowski, M., Jr (1982) J. Mol. Biol. 158, 515-537]. The OMSVP3 structure has been refined by restrained crystallographic refinement yielding a final R value of 0.199 for data to 0.15 nm resolution. Conformation and hydrogen-bonding pattern of OMSVP3 and OMJPQ3 are very similar. Large deviations occur at the NH2 terminus owing to different crystal packing, and at the C terminus of the central helix, representing an intrinsic property and resulting from amino acid substitutions far away from this site. The deviation of OMSVP3 from OMTKY3 complexed with the Streptomyces griseus protease B is very small [Fujinaga, M., Read, R. J., Sielecki, A., Ardelt, W., Laskowski, M., Jr and James, M. N. G. (1982) Proc. Natl Acad. Sci. USA, 79, 4868-4872].
Article
Correlating the structure and action of biological molecules requires knowledge of the corresponding relation between structure and energy. Probably the most important factors in such a structure-energy correlation are associated with electrostatic interactions. Thus the key requirement for quantitative understanding of the action of biological molecules is the ability to correlate electrostatic interactions with structural information. To appreciate this point it is useful to compare the electrostatic energy of a charged amino acid in a polar solvent to the corresponding van der Waals energy. The electrostatic free energy, $\Delta G_{el}$, can be approximated (as will be shown in Section II) by the Born formula, $\Delta G_{el} = -(166\,Q^{2}/\bar{a})(1 - 1/\varepsilon)$, where $\Delta G_{el}$ is given in kcal/mol, $Q$ is the charge of the given group in units of electron charge, $\bar{a}$ is the effective radius of the group, and $\varepsilon$ is the dielectric constant of the solvent. With an effective radius of charged amino acids of approximately 2 Å, Born's formula gives about -80 kcal/mol for their energy in polar solvents where $\varepsilon$ is larger than 10. This energy is two orders of magnitude larger than the van der Waals interaction of such groups and their surroundings.
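As a quick check on the figure quoted in this abstract, the Born estimate can be evaluated numerically. The sketch below is only an illustration: the function name `born_energy` and its argument names are assumptions introduced here, and the 166 kcal·Å/mol prefactor is taken directly from the formula as quoted.

```python
def born_energy(charge, radius_angstrom, dielectric):
    """Born estimate of the electrostatic free energy in kcal/mol.

    Uses the form quoted in the abstract: -(166 Q^2 / a) * (1 - 1/eps),
    with Q in electron charges and a in angstroms.
    """
    if radius_angstrom <= 0 or dielectric <= 0:
        raise ValueError("radius and dielectric constant must be positive")
    return -166.0 * charge ** 2 / radius_angstrom * (1.0 - 1.0 / dielectric)


if __name__ == "__main__":
    # Unit charge with a 2 angstrom effective radius, in a moderately
    # polar solvent (eps = 10) and in a water-like solvent (eps = 80).
    for eps in (10.0, 80.0):
        print(f"eps = {eps:5.1f}: {born_energy(1.0, 2.0, eps):7.2f} kcal/mol")
```

For a unit charge of radius 2 Å this gives roughly -75 kcal/mol at a dielectric constant of 10 and roughly -82 kcal/mol at 80, consistent with the "about -80 kcal/mol" stated above.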
Article
An analytical formula has been derived for the calculation of the solvent accessible surface area of a protein molecule or equivalently the surface area exterior to an arbitrary number of overlapping spheres. The directional derivative of this function with respect to atomic co-ordinates is provided to facilitate minimization procedures used with molecular docking algorithms and energy calculations. An analytical formula for the calculation of the volume enclosed within the accessible surface, the excluded volume, is also derived. Although the area function is not specific to the structures of proteins, the derivation was motivated by the need for a computationally feasible simulation of the hydrophobic effect in proteins. A computer program using the equations for area has been tested and has had limited application to the docking of protein alpha-helices. Possible relationships of the solvent excluded volume to hydrophobic interaction free energy and transfer free energy of solute molecules are derived from the statistical mechanics of solution.
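The general analytical area formula described in this abstract is not reproduced here. As a much simpler illustration of the idea, the sketch below handles only the two-sphere special case, where the buried part of each sphere is a spherical cap with a closed-form area. The function names and the default probe radius of 1.4 Å are assumptions made for this example.

```python
import math


def buried_cap_area(r_i, r_j, d):
    """Area of sphere i's surface that lies inside sphere j (centres d apart)."""
    if d >= r_i + r_j:
        return 0.0  # spheres do not overlap
    if d <= abs(r_i - r_j):
        # One sphere is entirely inside the other.
        return 4.0 * math.pi * r_i ** 2 if r_i <= r_j else 0.0
    # Distance from the centre of sphere i to the plane of intersection.
    x = (d * d + r_i * r_i - r_j * r_j) / (2.0 * d)
    cap_height = r_i - x
    return 2.0 * math.pi * r_i * cap_height


def two_sphere_accessible_area(r1, r2, d, probe=1.4):
    """Solvent accessible area of two atoms: radii are expanded by the probe."""
    big1, big2 = r1 + probe, r2 + probe
    exposed1 = 4.0 * math.pi * big1 ** 2 - buried_cap_area(big1, big2, d)
    exposed2 = 4.0 * math.pi * big2 ** 2 - buried_cap_area(big2, big1, d)
    return exposed1 + exposed2


if __name__ == "__main__":
    # Two carbon-like atoms (1.9 angstrom radius, an illustrative value) at bonding distance.
    print(round(two_sphere_accessible_area(1.9, 1.9, 1.5), 2))
```

Because each term is an explicit function of the centre-to-centre distance, its derivative with respect to atomic coordinates is also available in closed form, which is the property the abstract highlights for minimization.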
Article
Porcine pancreas kallikrein A has been crystallized in the presence of the small inhibitor benzamidine, yielding tetragonal crystals of space group P41212 containing two molecules per asymmetric unit. X-ray data up to 2·05 Å resolution have been collected using normal rotation anode as well as synchrotron radiation. The crystal structure of benzamidine-kallikrein has been determined using multiple isomorphous replacement techniques, and has subsequently been refined to a crystallographic R-value of 0·220 by applying a diagonal matrix least-squares energy constraint refinement procedure.
Article
The prediction of protein-protein interactions in solution is a major goal of theoretical structural biology. Here, we implement a continuum description of the thermodynamic processes involved. The model differs considerably from previous models in its use of "molecular surface" area to describe the hydrophobic component to the free energy of conformational change in solution. We have applied this model to a data set of alternative docked conformations of protein-protein complexes which were generated independently of this work. It was found previously that commonly used energy evaluation techniques fail to distinguish between near-native and certain non-native complexes in this data set. Here, we found that an energy function that takes into account (1) total electrostatic free energy, (2) hydrophobic free energy and (3) loss in side-chain conformational energy was able to reliably discriminate between near-native and non-native configurations but only when molecular surface is used as a descriptor of the hydrophobic effect. It is shown that the molecular surface and the more conventional surface descriptor "solvent accessible surface" give very different quantitative measures of hydrophobicity. In terms of the contribution of different energy components to the free energy of complex formation it was found that loss in side-chain conformational entropy is a second order effect. Electrostatic interaction energy (which is commonly used to score docked conformations) was a poor indicator of complementarity when starting from unbound conformations. It was found that electrostatic desolvation energy and the hydrophobic contribution (based on a molecular surface area descriptor) are much less sensitive to local fluctuations in atomic structure than point-to-point interaction energies and thus may be more suited for use as a scoring function when docking unbound conformations, where atomic complementarity is much less apparent. 
Whilst a combined energy function was able to distinguish near-native from non-native conformations in the six systems studied here, it remains to be determined to what extent more sizeable conformational changes would influence the results.
Article
The fundamental event in biological assembly is association of two biological macromolecules. Here we present a successful, accurate ab initio prediction of the binding of uncomplexed lysozyme to the HyHel5 antibody. The prediction combines pseudo Brownian Monte Carlo minimization with a biased-probability global side-chain placement procedure. It was effected in an all-atom representation, with ECEPP/2 potentials complemented with the surface energy, side-chain entropy and electrostatic polarization free energy. The near-native solution found was surprisingly close to the crystallographic structure (root-mean-square deviation of 1.57 Å for all backbone atoms of lysozyme) and had a considerably lower energy (by 20 kcal/mol) than any other solution.
Article
Trp62 in the binding subsite B of hen egg-white lysozyme shows general features often observed in protein-carbohydrate interactions including a stacking interaction and a hydrogen bonding network with water molecules. A previous report by our group showed that the perturbation of these interactions by substitution of Trp62 with tyrosine or phenylalanine affects the substrate binding modes and also enhances the hydrolytic activity. In order to elucidate the relationship between structural and functional changes of these protein-carbohydrate interactions, the Trp62Tyr and Trp62Phe mutants complexed with the substrate analogue, (GlcNAc)3, were analyzed at 1.8 Å resolution by X-ray crystallography. The overall structures of the mutant enzymes are indistinguishable from that of the wild-type enzyme. Although the wild-type enzyme binds (GlcNAc)3 in only one binding mode (A-B-C), the Trp62Tyr mutant binds (GlcNAc)3 in two binding modes (A-B-C, B-C-D) and the Trp62Phe mutant has an even weaker binding mode. The aromatic rings of Tyr62 and Phe62 maintain their interactions with the carbohydrate molecules, but make fewer stacking interactions with the GlcNAc in the B site than the wild-type enzyme does. The hydroxyl group of Tyr62 interacts weakly with a water molecule which mediates hydrogen bonding in the GlcNAc residues in the B and C sites. The C-6 hydroxyl group of the GlcNAc residue in the C site rotates around the C-5-C-6 bond to complete the hydrogen bond network in the Trp62Tyr mutant-(GlcNAc)3 complex. On the other hand, this hydrogen bonding network does not form in the Trp62Phe mutant-(GlcNAc)3. In addition to these structural studies, the kinetic parameters of the hydrolysis of 4-methylumbelliferyl N-acetyl-chitotriose, ((GlcNAc)3-MeU), have been determined in order to further characterize the enzymatic properties of these mutant lysozymes. 
This demonstrates that the modulation of the hydrogen bonding network, including the flexible part of the carbohydrate and water molecules and/or the slight reduction of stacking interaction in the B site, alters the binding mode toward the carbohydrate and induces an enhancement of the hydrolytic activity.
Article
We have developed a geometry-based suite of processes for molecular docking. The suite consists of a molecular surface representation, a docking algorithm, and a surface inter-penetration and contact filter. The surface representation is composed of a sparse set of critical points (with their associated normals) positioned at the face centers of the molecular surface, providing a concise yet representative set. The docking algorithm is based on the Geometric Hashing technique, which indexes the critical points with their normals in a transformation invariant fashion preserving the multi-element geometric constraints. The inter-penetration and surface contact filter features a three-layer scoring system, through which docked models with high contact area and low clashes are funneled. This suite of processes enables a pipelined operation of molecular docking with high efficacy. Accurate and fast docking has been achieved with a rich collection of complexes and unbound molecules, including protein-protein and protein-small molecule associations. An energy evaluation routine assesses the intermolecular interactions of the funneled models obtained from the docking of the bound molecules by pairwise van der Waals and Coulombic potentials. Applications of this routine demonstrate the goodness of the high scoring, geometrically docked conformations of the bound crystal complexes.
Article
The three-dimensional structures of the free and antigen-complexed Fabs from the mouse monoclonal anti-hen egg white lysozyme antibody D44.1 have been solved and refined by X-ray crystallographic techniques. The crystals of the free and lysozyme-bound Fabs were grown under identical conditions and their X-ray diffraction data were collected to 2.1 and 2.5 Å, respectively. Two molecules of the Fab-lysozyme complex in the asymmetric unit of the crystals show nearly identical conformations and thus confirm the essential structural features of the antigen-antibody interface. Three buried water molecules enhance the surface complementarity at the interface and provide hydrogen bonds to stabilize the complex. Two hydrophobic buried holes are present at the interface which, although large enough to accommodate solvent molecules, are void. The combining site residues of the complexed FabD44.1 exhibit reduced temperature factors compared with those of the free Fab. Furthermore, small perturbations in atomic positions and rearrangements of side-chains at the combining site, and a relative rearrangement of the variable domains of the light (VL) and the heavy (VH) chains, detail a Fab accommodation of the bound lysozyme. The amino acid sequence of the VH domain, as well as the epitope of lysozyme recognized by D44.1, are very close to those previously reported for the monoclonal antibody HyHEL-5. A feature central to the FabD44.1 and FabHyHEL-5 complexes with lysozyme is a set of three salt bridges between VH glutamate residues 35 and 50 and lysozyme arginine residues 45 and 68. The presence of the three salt bridges in the D44.1-lysozyme interface indicates that these bonds are not responsible for the 1000-fold increase in affinity for lysozyme that HyHEL-5 exhibits relative to D44.1.
Article
In this paper we present a self-consistent ensemble optimization (SCEO) theory for efficient conformational search, which we have applied to predicting the effects of mutations on protein thermostability. This approach takes advantage of a statistical mechanical self-consistency condition to home in iteratively on the global minimum structure. We employ a fast potential of mean-force approximation to cut computation time to a few minutes for a typical protein mutation, with only linear time-dependence on the size of the prediction problem. Rather than seeking a single, static structure of minimum energy, the new method optimizes an ensemble of many conformations, seeking to predict the most likely ensemble for the native state at a desired temperature. Testing this approach with a simple physical model focusing entirely on steric interactions and side-chain rearrangement, we obtain robustly convergent prediction of core side-chain conformation, and of hydrophobic core mutations' effects on protein stability. Self-consistent ensemble optimization is superior to simulated annealing in its speed and convergence to the global minimum, and insensitive to starting conformation. In calculations on lambda repressor protein, structural predictions for an eight-residue molten-zone had side-chain r.m.s. error of 0.49 Å for the wild-type protein. Evaluation of the method's mutant structure predictions should become possible, as structures of these mutant repressors are solved. Predicted energies for a series of nine hydrophobic core mutants correlated with measured free energies of unfolding with a coefficient of 0.82.
Article
The energetics of alkane dissolution and partition between water and organic solvent are described in terms of the energy of cavity formation and solute-solvent interaction using scaled particle theory. Thermodynamic arguments are proposed that allow comparison of experimental measurements of the surface area with values calculated from an all-atom representation of the solute. While the surface tension relating to the accessible surface is shape dependent, it is found that for the molecular surface it is not. This model rationalizes the change in surface tension between the microscopic (20-30 cal/mol/Å²) and macroscopic (70-75 cal/mol/Å²) regimes without the need to invoke Flory-Huggins theory or to apply other corrections. The difference in the values arises (i) to a small extent as a result of the curvature dependence of surface tension and (ii) to a large extent due to the difference in the molecular surface derived from the experiment and that calculated from an extended all-atom model. The model suggests that the primary driving force for alkane association in water is due to the tendency of water to reduce the solute cavity surface. It is argued that to model the energetics of alkane association, the surface tension should be related to the molecular surface (rather than the accessible surface) with a surface tension near the macroscopic limit for water. This model is compared with results from theoretical simulations of the hydrophobic effect for two well-studied systems. The implications for antibody-antigen interactions and the effect of hydrophobic amino acid deletion on protein stability are discussed. The approach can be used to model the solute cavity formation energy in solution as a first step in the continuum modelling of biomolecular interactions.
Article
Understanding the relations between the conformation of the side-chains and the backbone geometry is crucial for structure prediction as well as for homology modelling. To attempt to unravel these rules, we have developed a method which allows us to predict the position of the side-chains from the co-ordinates of the main-chain atoms. This method is based on a rotamer library and refines iteratively a conformational matrix of the side-chains of a protein, CM, such that its current element at each cycle, CM(i,j), gives the probability that side-chain i of the protein adopts the conformation of its possible rotamer j. Each residue feels the average of all possible environments, weighted by their respective probabilities. The method converges in only a few cycles, thereby deserving the name of self-consistent mean field method. Using the rotamer with the highest probability in the optimized conformational matrix to define the conformation of the side-chain leads to the result that on average 72% of χ1, 75% of χ2 and 62% of χ1+2 are correctly predicted for a set of 30 proteins. Tests with six pairs of homologous proteins have shown that the method is quite successful even when the protein backbone deviates from the correct conformation. The second application of the optimized conformational matrix was to provide estimates of the conformational entropy of the side-chains in the folded state of the protein. The relevance of this entropy is discussed.
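The iterative refinement of the conformational matrix described in this abstract can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the published implementation: the Boltzmann update, the damping factor `mix`, the value `rt = 0.6` kcal/mol and the two-residue energy table are all choices made here for demonstration.

```python
import math


def mean_field_rotamers(self_energy, pair_energy, rt=0.6, cycles=50, mix=0.5):
    """Iterate a conformational matrix cm[i][j] = P(residue i adopts rotamer j).

    self_energy[i][j]: energy of rotamer j of residue i with the backbone.
    pair_energy(i, j, k, l): energy between rotamer j of i and rotamer l of k.
    """
    n = len(self_energy)
    # Start from uniform probabilities over each residue's rotamers.
    cm = [[1.0 / len(row)] * len(row) for row in self_energy]
    for _ in range(cycles):
        updated = []
        for i in range(n):
            energies = []
            for j in range(len(self_energy[i])):
                e = self_energy[i][j]
                # Average environment: every other residue's rotamers,
                # weighted by their current probabilities.
                for k in range(n):
                    if k == i:
                        continue
                    for l, p in enumerate(cm[k]):
                        e += p * pair_energy(i, j, k, l)
                energies.append(e)
            e_min = min(energies)
            weights = [math.exp(-(e - e_min) / rt) for e in energies]
            z = sum(weights)
            updated.append([w / z for w in weights])
        # Damped update; a convex mix keeps every row normalised.
        cm = [[mix * new + (1.0 - mix) * old for new, old in zip(nr, orow)]
              for nr, orow in zip(updated, cm)]
    return cm


def toy_pair_energy(i, j, k, l):
    """Hypothetical clash: 5 kcal/mol when both residues sit in rotamer 0."""
    return 5.0 if j == 0 and l == 0 else 0.0


if __name__ == "__main__":
    toy_self = [[0.0, 0.5], [0.0, 0.0]]  # illustrative values only
    for row in mean_field_rotamers(toy_self, toy_pair_energy):
        print([round(p, 3) for p in row])
```

In this toy case the second residue moves out of the clashing rotamer, after which the first residue settles into its intrinsically preferred one. Picking the highest-probability rotamer per row then gives the predicted conformation.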
Article
Several sets of amino acid surface areas and transfer free energies were used to derive a total of nine sets of atomic solvation parameters (ASPs). We tested the accuracy of each of these sets of parameters in predicting the experimentally determined transfer free energies of the amino acid derivatives from which the parameters were derived. In all cases, the calculated and experimental values correlated well. We then chose three parameter sets and examined the effect of adding an energetic correction for desolvation based on these three parameter sets to the simple potential function used in our multiple start Monte Carlo docking method. A variety of protein-protein interactions and docking results were examined. In the docking simulations studied, the desolvation correction was only applied during the final energy calculation of each simulation. For most of the docking results we analyzed, the use of an octanol-water-based ASP set marginally improved the energetic ranking of the low-energy dockings, whereas the other ASP sets we tested disturbed the ranking of the low-energy dockings in many of the same systems. We also examined the correlation between the experimental free energies of association and our calculated interaction energies for a series of proteinase-inhibitor complexes. Again, the octanol-water-based ASP set was compatible with our standard potential function, whereas ASP sets derived from other solvent systems were not.
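A desolvation correction based on atomic solvation parameters is usually written as a sum over atoms of a per-type parameter times the change in that atom's accessible surface area on binding. The sketch below illustrates that bookkeeping only. The function name, the atom-type labels and the numerical ASP values are placeholders chosen for the example and are not taken from any of the nine parameter sets discussed in the abstract.

```python
def desolvation_energy(atom_types, area_unbound, area_bound, asp):
    """Sum of asp[type] * (bound area - unbound area) over all atoms.

    Areas are in square angstroms; asp values are in cal/mol per square
    angstrom, so the result is in cal/mol.
    """
    if not (len(atom_types) == len(area_unbound) == len(area_bound)):
        raise ValueError("per-atom lists must have equal length")
    total = 0.0
    for atom_type, a_free, a_complex in zip(atom_types, area_unbound, area_bound):
        total += asp[atom_type] * (a_complex - a_free)
    return total


if __name__ == "__main__":
    # Placeholder parameters: positive for apolar, negative for polar atoms.
    example_asp = {"C": 16.0, "N/O": -6.0}
    types = ["C", "N/O"]
    free_areas = [25.0, 12.0]     # before binding (illustrative)
    complex_areas = [15.0, 7.0]   # after binding (illustrative)
    print(desolvation_energy(types, free_areas, complex_areas, example_asp))
```

With these placeholder numbers, burying 10 Å² of carbon is favourable and burying 5 Å² of polar surface is unfavourable, so the two terms partly cancel. This shows why the choice of parameter set can change the ranking of docked solutions.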
Article
A protein docking study was performed for two classes of biomolecular complexes: six enzyme/inhibitor and four antibody/antigen. Biomolecular complexes for which crystal structures of both the complexed and uncomplexed proteins are available were used for eight of the ten test systems. Our docking experiments consist of a global search of translational and rotational space followed by refinement of the best predictions. Potential complexes are scored on the basis of shape complementarity and favourable electrostatic interactions using Fourier correlation theory. Since proteins undergo conformational changes upon binding, the scoring function must be sufficiently soft to dock unbound structures successfully. Some degree of surface overlap is tolerated to account for side-chain flexibility. Similarly for electrostatics, the interaction of the dispersed point charges of one protein with the Coulombic field of the other is measured rather than precise atomic interactions. We tested our docking protocol using the native rather than the complexed forms of the proteins to address the more scientifically interesting problem of predictive docking. In all but one of our test cases, correctly docked geometries (interface Cα RMS deviation ≤ 2 Å from the experimental structure) are found during a global search of translational and rotational space in a list that was always less than 250 complexes and often less than 30. Varying degrees of biochemical information are still necessary to remove most of the incorrectly docked complexes.
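The Fourier correlation scoring mentioned in this abstract rests on the correlation theorem: the score for every translation of the ligand grid can be obtained from one forward transform of each grid and one inverse transform of their product. The sketch below demonstrates this on a one-dimensional periodic toy grid. The grid values (a negative core penalty, a positive surface layer) and all function names are assumptions made for illustration, and a naive O(n²) transform stands in for a real FFT library.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * x / n)
                    for x, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def fourier_scores(receptor, ligand):
    """score[s] = sum_x receptor[x] * ligand[(x - s) % n], via transforms."""
    r_hat = dft(receptor)
    l_hat = dft(ligand)
    product = [l.conjugate() * r for l, r in zip(l_hat, r_hat)]
    return [c.real for c in dft(product, inverse=True)]


def direct_scores(receptor, ligand):
    """Same scores by explicit summation, used here as a cross-check."""
    n = len(receptor)
    return [sum(receptor[x] * ligand[(x - s) % n] for x in range(n))
            for s in range(n)]


if __name__ == "__main__":
    # Toy receptor: two core cells (penalised), one surface cell (rewarded).
    receptor = [-15.0, -15.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    # Toy ligand occupying two cells.
    ligand = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    scores = fourier_scores(receptor, ligand)
    best = max(range(len(scores)), key=lambda s: scores[s])
    print("best shift:", best, "score:", round(scores[best], 6))
```

In a real three-dimensional search the same calculation is repeated for each ligand rotation, and shifts that place the ligand in the receptor core receive strongly negative scores.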
Blevins, R. A. & Tulinsky, A. (1985). The refinement and the structure of the dimer of α-chymotrypsin at 1.
Fujinaga, M., Sielecki, A. F., Read, R. J., Ardelt, W., Laskowski, M. J. & James, M. N. G. (1987). Refined structure of α-lytic protease at 1.7 Å resolution. J. Mol. Biol. 195, 397-418.
Islam, S. A. (1986). Computer aided molecular modelling of DNA-drug interactions. PhD dissertation, University of London.
Warshel, A. & Levitt, M. (1976). Theoretical studies of enzymic reactions: dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme. J. Mol. Biol. 103, 227-249.