Article

New Methodology for Computer-Aided Modelling of Biomolecular Structure and Dynamics 2. Local Deformations and Cycles


Abstract

A new methodology for the conformational modelling of biomolecular systems (1) is extended to local deformations of chain molecules and to flexible molecular rings. It is shown that these two cases may be reduced to considering an equivalent molecular model with a regular tree-like topology. A simple procedure is developed to analyze any flexible rings (the five- and six-membered sugar rings of carbohydrates and nucleic acids, in particular) and local deformation regions by energy minimization. Dynamic equations are also derived for such molecular systems. As a result, a unified approach is proposed for the efficient energy minimization and simulation of dynamic behavior of multimolecular systems having any set of variable internal coordinates, local deformation regions and cycles. Advantages and domains of applicability of the approach are discussed.


... Their idea has been further developed into MD formulations in torsion angle coordinates [22]. Mazur and Abagyan [23,24] extended this method to include bond stretches and angle bend coordinates, while Kneller and Hinsen [29] incorporated quaternion parameters and angular velocities for rotations of linked rigid body subunits. However, all these formulations require finding the inverse of a mass matrix at each time step (or at least solving a system of linear equations), which is the costliest part in the whole process. ...
... The nonzero A-matrix elements for internal coordinates can be computed from the direct derivatives of Eq. (27) w.r.t. the Z-coordinates, or from simple relations related to vector rotations [22,23], which will be further investigated in this section. It should be noted that Eq. (28) is well-defined also for the external rotations and translations, with related A-matrix elements being computed from the direct derivatives of Eq. (24). When S c represent not only the internal coordinates but also the molecule's external rotations and translations, we will designate the S c as generalized coordinates. ...
... The method of computing A-matrix elements in the previous ICMD formalisms is based on the observation that (for a specific illustration see Fig. 1) the changes in x 6 due to infinitesimal changes in h 25 and s 15 can be viewed as infinitesimal rotations of x 36 about u 25 and e 23 , respectively [22,23]: [24]. (Note that in this formulation even the atomic ordering given in Fig. 1 cannot be used but should be modified by setting the base to one of the end atoms, e.g., atom 6.) ...
Article
Internal coordinate molecular dynamics (ICMD) has been used in the past in simulations for large molecules as an alternative way of increasing step size with a reduced operational dimension that is not achievable by MD in Cartesian coordinates. A new ICMD formalism for flexible molecular systems is presented, which is based on the spectroscopic B-matrix rather than the A-matrix of previous methods. The proposed formalism does not require an inversion of a large matrix as in the recursive formulations based on robot dynamics, and takes advantage of the sparsity of the B-matrix, ensuring computational efficiency for flexible molecules. Each molecule’s external rotations about an arbitrary atom center, which may differ from its center of mass, are parameterized by the SU(2) Euler representation, giving singularity free parameterization. Although the formalism is based on the use of nonredundant generalized (internal and external) coordinates, an MD simulation in linearly dependent coordinates can be done by finding a transformation to a new set of independent coordinates. Based on the clear separability in the generalized coordinates between fast varying degrees of freedom and slowly varying ones, a multiple time step algorithm is introduced that avoids the previous nontrivial interaction distance classification. Also presented is a recursive method for computing nonzero A-matrix elements that is much easier to apply to a general molecular structure than the previous method.
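The rotation picture quoted in the excerpt above can be checked numerically. What follows is a minimal sketch, not the formalism of any of the cited papers. It assumes an atom that moves rigidly with the branch turned by a torsion, with the bond axis given by a point `axis_point` and a unit vector `axis_dir`, and it compares the analytic derivative e × (x − p) with a central finite difference. All function names and the geometry are hypothetical.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate_about_axis(x, p, e, angle):
    """Rotate point x by `angle` about the axis through p with unit direction e
    (Rodrigues' rotation formula)."""
    v = sub(x, p)
    c, s = math.cos(angle), math.sin(angle)
    rotated = add(add(scale(v, c), scale(cross(e, v), s)),
                  scale(e, dot(e, v) * (1.0 - c)))
    return add(p, rotated)

def torsion_derivative(x, p, e):
    """Analytic dx/d(torsion) for an atom carried by the rotating branch."""
    return cross(e, sub(x, p))

def finite_difference_derivative(x, p, e, h=1e-6):
    """Central difference of the rotated position with respect to the angle."""
    plus = rotate_about_axis(x, p, e, h)
    minus = rotate_about_axis(x, p, e, -h)
    return scale(sub(plus, minus), 1.0 / (2.0 * h))

# Hypothetical geometry: an axis through the origin and one downstream atom.
axis_point = (0.0, 0.0, 0.0)
raw_dir = (1.0, 1.0, 0.0)
axis_dir = scale(raw_dir, 1.0 / math.sqrt(dot(raw_dir, raw_dir)))
atom = (1.5, -0.3, 0.8)

analytic = torsion_derivative(atom, axis_point, axis_dir)
numeric = finite_difference_derivative(atom, axis_point, axis_dir)
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, numeric, max_error)
```

The derivative is perpendicular to the rotation axis, as expected for a pure rotation, and only atoms on the moving side of the bond receive a nonzero entry, which is one reason such derivative matrices are sparse.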
... The first application of torsion dynamics was limited to linear chains [16] and was based on Wittenburg's formalism for connected rigid bodies, which is related, in turn, to equations derived for n-body space satellites [17,18]. The general equations for internal coordinate molecular dynamics of arbitrary fixed branched biomolecules were first introduced and tested on biomolecules in 1989-1991 [19][20][21]. Subsequently, two other implementations of torsion dynamics were proposed and applied to x-ray refinement and NMR-structure determination [22,23] and peptide simulations [24]. This method allows us to easily distinguish between "hard" degrees of freedom, such as bond lengths and bond angles, and "soft" degrees of freedom such as torsion angles. ...
... The first six internal coordinates determine the rotation and translation of the whole molecule. Analytical derivatives of a pair-wise energy function with respect to the four types of variables for such an object are given in [5,19,20] and the equations of motion are given in [19][20][21]25]. ...
Article
Prediction of three-dimensional structures of proteins and peptides by global optimization of the free energy estimate has been attempted without much success for over thirty years. The key problems were the insufficient accuracy of the free energy estimate and the giant size of the conformational space. Global optimization of the free energy function of a peptide in internal coordinate space is a powerful method of structure prediction that outperforms both Molecular Dynamics, bound by the continuity requirement, and Monte Carlo, bound by the Boltzmann ensemble generation requirement. We demonstrate that stochastic global optimization algorithms of the first order, i.e., with local minimization after each iteration (e.g., Monte Carlo-Minimization), have a greater chance of finding the global minimum after a fixed number of function evaluations. Recently, the principle of optimal bias was mathematically justified and the Optimal-Bias Monte Carlo-Minimization algorithm (a.k.a. Biased Probability Monte Carlo-Minimization) was successfully applied to theoretical ab initio folding of several peptides, resulting in more than a 10-fold increase in efficiency compared to the Monte Carlo-Minimization method. The square-root bias is shown to be comparable in performance with the previously derived linear bias. A 23-residue peptide of beta-beta-alpha structure can be predicted from any random starting conformation.
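The idea of "local minimization after each iteration" can be illustrated on a toy problem. The sketch below is a plain Monte Carlo-Minimization loop on a hypothetical one-dimensional energy function. It does not implement the optimal-bias variant described in the abstract, and the energy function, step sizes, temperature, and function names are all assumptions chosen for illustration.

```python
import math
import random

def energy(x):
    # Hypothetical rugged 1D landscape: a shallow parabola with ripples.
    return 0.1 * x * x + math.sin(3.0 * x)

def gradient(x):
    return 0.2 * x + 3.0 * math.cos(3.0 * x)

def local_minimize(x, step=0.01, iterations=2000):
    """Fixed-step gradient descent; the step is small enough for this function
    (its curvature never exceeds 9.2) that every move lowers the energy."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

def monte_carlo_minimization(x0, n_iter=200, max_jump=2.0, temperature=0.5, seed=0):
    """Random jump, local minimization, then a Metropolis test on the
    minimized energies. Returns the lowest-energy minimum visited."""
    rng = random.Random(seed)
    current_x = local_minimize(x0)
    current_e = energy(current_x)
    best_x, best_e = current_x, current_e
    for _ in range(n_iter):
        trial_x = local_minimize(current_x + rng.uniform(-max_jump, max_jump))
        trial_e = energy(trial_x)
        delta = trial_e - current_e
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current_x, current_e = trial_x, trial_e
            if current_e < best_e:
                best_x, best_e = current_x, current_e
    return best_x, best_e

start_x = 4.0
start_e = energy(local_minimize(start_x))
best_x, best_e = monte_carlo_minimization(start_x)
print("minimized start energy:", start_e)
print("best minimum found:", best_x, best_e)
```

Because the Metropolis test compares minimized energies, the walk moves between basins rather than within them, which is the property the abstract credits for the improved chance of reaching the global minimum.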
... The complementary advantages and disadvantages of these two representations have been discussed extensively (for early reviews, see 10,11 ). Abagyan and Mazur have conducted the most extensive development of internal coordinates [12][13][14][15][16][17][18] , beginning with a flexible and general approach to IC dynamics 12,13 and continuing through recent work on a molecular force field optimized for use in internal coordinates 18 . Recently, Chys et al. have advanced the use of spinors and geometric algebra as a formalism for converting between Cartesian and internal coordinates [19][20][21] (an analysis of different approaches may be found here 22 Instead, our work aims primarily to advance capabilities for applications where internal coordinates have significant advantages, but where complex software infrastructure is currently Cartesian-centric and expected to remain so. ...
Preprint
We present a highly parallel algorithm to convert internal coordinates of a polymeric molecule into Cartesian coordinates. Traditionally, converting the structures of polymers (e.g. proteins) from internal to Cartesian coordinates has been performed serially, due to an inherent linear dependency along the polymer chain. We show this dependency can be removed using a tree-based concatenation of coordinate transforms between segments, and then parallelized efficiently on Graphics Processing Units (GPUs). The conversion algorithm is applicable to protein engineering and fitting protein structures to experimental data, and we observe an order of magnitude speedup using parallel processing on a GPU compared to serial execution on a CPU.
... The complementary advantages and disadvantages of these two representations have been discussed extensively (for early reviews, see 10,11 ). Abagyan and Mazur have conducted the most extensive development of internal coordinates [12][13][14][15][16][17][18] , beginning with a flexible and general approach to IC dynamics 12,13 and continuing through recent work on a molecular force field optimized for use in internal coordinates 18 . Recently, Chys et al. have advanced the use of spinors and geometric algebra as a formalism for converting between Cartesian and internal coordinates [19][20][21] (an analysis of different approaches may be found here 22 Instead, our work aims primarily to advance capabilities for applications where internal coordinates have significant advantages, but where complex software infrastructure is currently Cartesian-centric and expected to remain so. ...
Article
We present a highly parallel algorithm to convert internal coordinates of a polymeric molecule into Cartesian coordinates. Traditionally, converting the structures of polymers (e.g., proteins) from internal to Cartesian coordinates has been performed serially, due to an inherent linear dependency along the polymer chain. We show this dependency can be removed using a tree‐based concatenation of coordinate transforms between segments, and then parallelized efficiently on graphics processing units (GPUs). The conversion algorithm is applicable to protein engineering and fitting protein structures to experimental data, and we observe an order of magnitude speedup using parallel processing on a GPU compared to serial execution on a CPU. We present a highly parallel algorithm to convert internal coordinates of a polymeric molecule into Cartesian coordinates. Traditional approaches offer minimal parallelism, because straightforward determination of an atom's Cartesian coordinates requires having the (global) Cartesian coordinates of preceding atoms. We remove this dependency by instead defining coordinate transforms between segments, which allows tree‐based concatenation. We observe an order of magnitude speedup using parallel processing on a graphics processing unit compared to serial execution on a CPU.
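The reason a tree-based concatenation is possible is that composing rigid transforms is associative, so partial products can be combined in any grouping. The following is a minimal sketch of that idea under simplifying assumptions: a planar (2D) chain instead of a 3D polymer, and a Hillis-Steele style prefix scan executed serially in Python. It is not the algorithm of the paper above, and the names `compose`, `local_transform`, and `scan_prefix` are hypothetical.

```python
import math

def compose(a, b):
    """Compose two planar rigid transforms (angle, tx, ty).
    `a` maps its local frame to the global frame; `b` maps the next
    local frame into a's frame. The result maps b's frame to global."""
    ta, xa, ya = a
    tb, xb, yb = b
    c, s = math.cos(ta), math.sin(ta)
    return (ta + tb, xa + c * xb - s * yb, ya + s * xb + c * yb)

def local_transform(bend, bond_length):
    """Turn by `bend`, then step `bond_length` along the new x-axis."""
    return (bend, bond_length * math.cos(bend), bond_length * math.sin(bend))

def serial_prefix(transforms):
    """Reference: accumulate transforms one by one along the chain."""
    out = []
    acc = (0.0, 0.0, 0.0)  # identity transform
    for t in transforms:
        acc = compose(acc, t)
        out.append(acc)
    return out

def scan_prefix(transforms):
    """Inclusive prefix scan in about log2(n) rounds. Within a round every
    update reads only the previous round's values, so the updates are
    independent and could run in parallel on suitable hardware."""
    cur = list(transforms)
    d = 1
    while d < len(cur):
        nxt = list(cur)
        for i in range(d, len(cur)):
            # Left operand covers earlier segments; order matters.
            nxt[i] = compose(cur[i - d], cur[i])
        cur = nxt
        d *= 2
    return cur

# Hypothetical chain: bend angles in radians, constant bond length.
bends = (0.3, -0.7, 1.1, 0.2, -0.4, 0.9, -1.2, 0.5, 0.1)
chain = [local_transform(b, 1.5) for b in bends]
serial = serial_prefix(chain)
scanned = scan_prefix(chain)
max_diff = max(abs(p - q) for s, t in zip(serial, scanned) for p, q in zip(s, t))
positions = [(x, y) for _, x, y in scanned]  # atom positions in the global frame
print(positions)
print("max difference from serial result:", max_diff)
```

This scan performs more total compositions than the serial loop; the benefit appears only when each round is actually executed in parallel.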
... If |N | = N then there should be 3N − 6 independent internal coordinates assigned to each non-collinear shape. Internal coordinate systems have been studied by chemists and molecular physicists for a long time because of their importance for simplifying the study of molecular vibrations, scattering processes, and geometry optimization [6], [7], [16], [20], [28], [31], [32], [67], [70], [76], [77], [90]. It is possible to devise internal coordinate systems which use only distances as coordinates; this approach is related to distance geometry [18] and rigidity theory [36], [85]. ...
... In fact, the recent appearance of [31] illustrates the fact that the term "valence coordinate system" has never been rigorously defined. This problem is becoming more urgent now that some molecular dynamics simulations are being done in internal coordinates [67], [41], [56]. Mazur and Abagyan have proposed a mathematical structure which they call a "BKS tree", originally proposed by Eyring [28], which reflects the inherent structure of biomolecules more closely than a pdb file does. ...
Article
A general theory of molecular internal coordinates of valence type is presented based on the concept of a Z-system. The Z-system can be considered as a discrete mathematical generalization of the Z-matrix (a molecular geometry file format familiar to chemists) which avoids the principal disadvantage of Z-matrices. Z-matrices are usually only employed for small molecules because there is no easy way to glue two Z-matrices together to get the Z-matrix of a larger molecule. It is shown that Z-matrices are simply Z-systems together with additional extraneous structures and that the Z-systems for any two molecules can be naturally glued together to obtain a Z-system for the combined molecule. A general mathematical framework suitable for the detailed study of molecular geometry is introduced and applied to five- and six-membered molecular rings. A classification of shapes of hexagons with opposite sides and angles congruent is given with explicit parameterizations of the flexible and rigid solutions. The entire mathematical formalism generalizes to a theory of polyspherical coordinate systems on orbit spaces of the group of n-dimensional rigid motions acting on finite collections of points in n-dimensional Euclidean space. The n-dimensional Z-system is a new discrete structure related to abstract simplicial complexes, graded posets, and iterated line graphs. Complete proofs of all the n-dimensional results are given, and connections to other areas of mathematics are noted.
... Mazur et al. 87,88 used the internal coordinates representation to investigate the conformations and dynamics of bio-macromolecules. However, solving the EOM with this method scaled exponentially with size and relied on a costly expression of the interatomic potentials in internal coordinates. ...
... As in Monte Carlo sampling, Cartesian coordinates may be substituted by generalized variables. Practical examples include all-atom [1,97] and CG simulations [93]. This approach indeed allows for a significant increase of the integration time step. ...
... Consequently, the torsion bonds around the cycle are not independent (Fig 3A). The incompatibility between such cycles and the tree-based topologies required for efficient processing of macromolecular structure is well-known [36]. One way to handle flexible rings is through the introduction of virtual atoms. ...
Article
The development of models of macromolecular electrostatics capable of delivering improved fidelity to quantum mechanical calculations is an active field of research in computational chemistry. Most molecular force field development takes place in the context of models with full Cartesian coordinate degrees of freedom. Nevertheless, a number of macromolecular modeling programs use a reduced set of conformational variables limited to rotatable bonds. Efficient algorithms for minimizing the energies of macromolecular systems with torsional degrees of freedom have been developed with the assumption that all atom-atom interaction potentials are isotropic. We describe novel modifications to address the anisotropy of higher order multipole terms while retaining the efficiency of these approaches. In addition, we present a treatment for obtaining derivatives of atom-centered tensors with respect to torsional degrees of freedom. We apply these results to enable minimization of the Amoeba multipole electrostatics potential in a system with torsional degrees of freedom, and validate the correctness of the gradients by comparison to finite difference approximations. In the interest of enabling a complete model of electrostatics with implicit treatment of solvent-mediated effects, we also derive expressions for the derivative of solvent accessible surface area with respect to torsional degrees of freedom.
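Validating analytic gradients against finite differences, as mentioned in the abstract above, follows a simple pattern. The sketch below applies it to a toy periodic torsion energy; it is only an illustration of the check, not the Amoeba multipole potential or the authors' code. The energy form k(1 + cos(nφ − δ)), the parameter values, and the function names are assumptions.

```python
import math

def torsion_energy(angles, terms):
    """Sum of k * (1 + cos(n * phi - delta)) over all torsions."""
    return sum(k * (1.0 + math.cos(n * phi - delta))
               for phi, (k, n, delta) in zip(angles, terms))

def torsion_gradient(angles, terms):
    """Analytic derivative of the energy with respect to each torsion angle."""
    return [-k * n * math.sin(n * phi - delta)
            for phi, (k, n, delta) in zip(angles, terms)]

def finite_difference_gradient(func, angles, h=1e-5):
    """Central differences, one torsion at a time."""
    grad = []
    for i in range(len(angles)):
        plus = list(angles)
        minus = list(angles)
        plus[i] += h
        minus[i] -= h
        grad.append((func(plus) - func(minus)) / (2.0 * h))
    return grad

# Hypothetical parameters: (force constant, periodicity, phase) per torsion.
terms = [(2.0, 3, 0.0), (1.2, 2, math.pi), (0.5, 1, 0.3)]
angles = [0.4, -1.1, 2.5]  # radians

analytic = torsion_gradient(angles, terms)
numeric = finite_difference_gradient(lambda a: torsion_energy(a, terms), angles)
max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, numeric, max_error)
```

A mismatch well above the expected truncation error (of order h squared) usually points to a missing term or a sign error in the analytic derivative.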
... This includes internal coordinate representations that treat clusters of atoms as rigid bodies. Mazur et al. [97,98] demonstrated such a scheme in the conformational dynamics of biomacromolecules; however, their method scaled exponentially with size and relied on an expensive expression for the interatomic potential in internal coordinates. ...
Chapter
This chapter outlines our progress toward developing a first-principles-based hierarchical multiscale, multiparadigm modeling and simulation framework for the characterization and optimization of electronic and chemical properties of nanoscale materials and devices. In our approach, we build from the bottom-up by solving the quantum-mechanical (QM) Schrödinger equation for small systems. The results of these calculations lead to physical parameters that feed into methods capable of spanning longer length and time scales with minimum loss of accuracy. This is achieved by having higher-scale quantities self-consistently derived and optimized from the results at finer scales. In contrast to other methods, we are strictly first-principles-based, and all of our parameters at all scales relate to physically measurable or QM-computable observables. Our approach is applicable to the forward (materials phenomenology) and inverse (“materials by design”) problems. The inverse problem involves top-down predictions of structures and compositions at a lower scale from desired properties at a higher scale. The advantages of our strategy over experimental- and phenomenological-based modeling and simulation approaches include the following: (1) providing access to details that are difficult or impossible to measure (e.g., excited electronic states in materials undergoing extreme conditions of pressure, temperature, etc.); (2) the ability to make useful predictions outside the range of experiments (i.e., since all calculations are ultimately related to first principles); and (3) providing sound, first-principles-based, steering for experiments.
... The two diastereoisomers of (-)-lobeline ( Fig. 1a- The docking of ligands to the proteins was performed using the script dock6grid-Lig of the software ICM [57]. The docking procedure was based on a Monte Carlo algorithm, allowing the exploration of the ligand conformations in the torsion angle space [58,59]. The protein electrostatic potential was a distance-dependent potential with a grid size of 0.5 Å, and the van der Waals potential used with an interaction cutoff of 4.0 Å. 30 poses were calculated for each ligand docking. ...
Article
Docking of lobeline, a partial agonist of nicotinic acetylcholine receptors (nAChRs), was investigated both in crystallographic structures of acetylcholine binding proteins (AChBP) and in homology models of α7 and α4β2 nAChRs, and compared to the behavior of the full agonists nicotine and epibatidine. The homology models were built using as templates the different pocket geometries established in crystallographic AChBP structures. Systematic cross-docking of each ligand into the binding pockets of the two other ligands, as well as self-docking into its own pocket, was performed in order to better understand the structural features determining the binding of these three ligands, chosen for their molecular diversity. In AChBPs, epibatidine and nicotine display similar docking scores in their own pocket and in the other ligands' pockets: in particular, they also dock favorably into the lobeline pocket. In contrast, lobeline behaves differently: it only binds favorably to its own pocket in AChBPs. Furthermore, the docking poses observed starting from lobeline stereoisomers support the importance of the intramolecular hydrogen bond between the alcohol function of the β-phenyl-β-hydroxyethyl arm and the piperidinium proton for lobeline binding to AChBP. For the homology models, cross-docking is still discriminating and the specificity of lobeline for its binding pocket is conserved.
... As in Monte Carlo sampling, Cartesian coordinates may be substituted by generalized variables. Practical examples include all-atom (Mazur et al. 1991;Abagyan and Mazur 1989) and CG simulations (Liwo et al. 2005). Although this approach facilitates a significant increase in the integration time step, a matrix inversion is required at every time step, limiting applications to large biomolecules. ...
Chapter
The knowledge of the three-dimensional structure of proteins is crucial for understanding many important biological processes. Most biologically important proteins are too large to handle for the classical simulation tools. In such cases, coarse-grained (CG) models nowadays offer various opportunities for efficient conformational sampling and thus prediction of the three-dimensional structure. A variety of CG models have been proposed, each based on a similar framework consisting of a set of conceptual components such as protein representation, force field, sampling, etc. In this chapter we discuss these components, highlighting ideas which have proven to be the most successful. As CG methods are usually part of multistage procedures, we also describe approaches used for the incorporation of homology data and all-atom reconstruction methods.
... Our focus will be on the description and study of molecular geometries. We will not deal with the dynamics of molecules as described in internal coordinates (but see [33]); nor will we consider geometry optimization. We are interested in molecular flexibility, but from a geometric rather than an energetic perspective. ...
Article
Graph theory has long been applied to molecular structure in regard to the covalent bonds between atoms. Here we extend the graph G whose vertices are atoms and whose edges are covalent bonds to allow a description of the conformation (or shape) of the molecule in three dimensional space. We define GZ-trees to be a certain class of tree subgraphs Γ of a graph AL 2 (G), which we call the amalgamated twice iterated line graph of G, and show that each such rooted GZ-tree (Γ, r) defines a well-behaved system of molecular internal coordinates, generalizing those known to chemists as Z-matrices. We prove that these coordinates are the most general type which give a diffeomorphism of an explicitly determined and very large open subset of molecular configuration space onto the Cartesian product of the overall position and orientation manifold and the internal coordinate space. We give examples of labelled rooted GZ-trees, describing three dimensional (3D) molecular structures, for three types of molecules important in biochemistry: amino acids, nucleotides, and glucose. Finally, some graph theoretical problems natural from the standpoint of molecular conformation are discussed.
... Mazur et al. [103,104] demonstrated the conformational dynamics of biomacromolecules. However, their method scaled exponentially with size and relied on an expensive expression for the inter-atomic potentials in internal coordinates. ...
Article
We expect that systematic and seamless computational upscaling and downscaling for modeling, predicting, or optimizing material and system properties and behavior with atomistic resolution will eventually be sufficiently accurate and practical that it will transform the mode of development in the materials, chemical, catalysis, and Pharma industries. However, despite truly dramatic progress in methods, software, and hardware, this goal remains elusive, particularly for systems that exhibit inherently complex chemistry under normal or extreme conditions of temperature, pressure, radiation, and others. We describe here some of the significant progress towards solving these problems via a general multiscale, multiparadigm strategy based on first-principles quantum mechanics (QM), and the development of breakthrough methods for treating reaction processes, excited electronic states, and weak bonding effects on the conformational dynamics of large-scale molecular systems. These methods have resulted directly from filling in the physical and chemical gaps in existing theoretical and computational models, within the multiscale, multiparadigm strategy. To illustrate the procedure we demonstrate the application and transferability of such methods on an ample set of challenging problems that span multiple fields, system length- and timescales, and that lie beyond the realm of existing computational or, in some cases, experimental approaches, including understanding the solvation effects on the reactivity of organic and organometallic structures, predicting transmembrane protein structures, understanding carbon nanotube nucleation and growth, understanding the effects of electronic excitations in materials subjected to extreme conditions of temperature and pressure, following the dynamics and energetics of long-term conformational evolution of DNA macromolecules, and predicting the long-term mechanisms involved in enhancing the mechanical response of polymer-based hydrogels.
... [102][103][104][105][106][107][108] The first docking method that considered continuous flexibility of interface side-chains during the global minimization process was based on internal coordinate mechanics (ICM). [109][110][111][112] The ICM flexible docking procedure, successfully applied to the prediction of an antibody-lysozyme complex, 113 was tested in a blind prediction contest. 114 Although the ICM pseudo-Brownian method 115 with subsequent global optimization 116 of the interface side chain rotations led to promising results, it was computationally too expensive to be tested on large databases of complexes. ...
Article
Solène Grosdidier (1), Max Totrov (2), Juan Fernández-Recio (1). (1) Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Spain; (2) Molsoft LLC, La Jolla, CA, USA. Abstract: In recent years, protein–protein interactions have become the object of increasing attention in many different fields, such as structural biology, molecular biology, systems biology, and drug discovery. From a structural biology perspective, it would be desirable to integrate current efforts into the structural proteomics programs. Given that experimental determination of many protein–protein complex structures is highly challenging, and in the context of current high-performance computational capabilities, different computer tools are being developed to help in this task. Among them, computational docking aims to predict the structure of a protein–protein complex starting from the atomic coordinates of its individual components, and in recent years, a growing number of docking approaches have been reported with increased predictive capabilities. The improvement of speed and accuracy of these docking methods, together with the modeling of the interaction networks that regulate the most critical processes in a living organism, will be essential for computational proteomics. The ultimate goal is the rational design of drugs capable of specifically inhibiting or modifying protein–protein interactions of therapeutic significance. While rational design of protein–protein interaction inhibitors is at its very early stage, the first results are promising. Keywords: protein–protein interactions, drug design, protein docking, structural prediction, virtual ligand screening, hot-spots
Chapter
Conformational search or global energy optimization procedures constitute the core for many applications of molecular modeling such as protein design and engineering, modeling by homology, structure determination from NMR data, and structure analysis and refinement. Factors complicating global optimization include the high dimensionality of conformational space, accompanied by strong anisotropy of the energy hypersurface and a large number of local minima as well as inaccuracies in calculating energy. Several global search protocols, based on Monte Carlo procedures, are compared and optimized. Performance criteria are reformulated and corresponding techniques for accumulating and clustering the optimal conformations found during a search are developed and applied to Met-enkephalin.
Article
The theoretical prediction of biomolecular structure from first principles and without the crutches of experimental restraints remains a dream. Most theoreticians agree that the answer is the global minimum of the free energy function [1,2], but disagree about strategies to find the minimum. Several schools of thought have formed over the years: dynamicists [3–11], minimalists [12–32], and synthesists [2,33–44]. Dynamicists believe that sufficiently long simulations of a quasi-continuous trajectory of molecular dynamics of atomic models in vacuo or in water will solve the problem using new generations of computers, code parallelization [45,46], and optimized simulation techniques. Minimalists, unwilling to play power games and too impatient to wait until new generations of processors cover the next mile of a hundred-mile road, simplify the system by using a reduced atomic representation or a lattice, inventing a potential and then enjoying the luxury of always finding the global minimum of their energies as well as most of the other possible states for a chain of up to a hundred simplified residues [27,47]. The third school shares the impatience of minimalists, yet resists the temptation to use simple models since it appears that accuracy is a pivotal issue. Synthesists focus on the development of algorithms to replace molecular dynamics as a generator of conformational changes [42,43,48,35] and the design of methods of energy calculation which combine accuracy and speed.
Article
The virulence of the Gram+ bacterium Bacillus anthracis, responsible for anthrax disease, is caused by a capsule and two toxins. Each toxin is formed from the assembly of the protective antigen (PA) and either the lethal factor (LF) or the edema factor (EF) in the cytoplasm of host cells. EF is an adenylate cyclase that produces cAMP from ATP in an uncontrolled fashion, provoking severe cellular dysfunction. EF is activated by calmodulin (CaM), involved in many calcium signaling pathways. Crystallographic structures and an NMR study showed that the level of calcium bound to CaM influences the stability of the EF-CaM complex. Molecular dynamics simulations of the complex, with 0, 2 or 4 calcium ions, made it possible to characterize the effect of calcium on the conformational plasticity of each partner and to propose a model for the EF-CaM interaction. The joint analysis of dynamical correlations and energetic influences raised the concept of residue network connectedness as a stability criterion. The large conformational transition undergone by EF upon CaM binding was described through the determination of a plausible reaction path. The obtained intermediate conformations were further used to guide the rational search for inhibitors of the EF toxin, in an approach combining computational and experimental methods. An innovative strategy involving the virtual screening of an allosteric pocket instead of the catalytic site of the enzyme identified six active compounds able to fully inhibit EF activity at concentrations of 10-100 μM.
Article
The numerical simulation of highly complex biomolecular systems such as DNAs, RNAs, and proteins becomes intractable as the size and fidelity of these systems increase. Herein, efficient techniques to accelerate multibody-based coarse-grained simulations of such systems are presented. First, an adaptive coarse-graining framework is explained which is capable of determining when and where the system model needs to change to achieve an optimal combination of speed and accuracy. The metrics to guide these on-the-fly instantaneous model adjustments and the issues associated with post-transition system’s states are addressed in this book chapter. Due to its highly modular and parallel nature, the Generalized Divide-and-Conquer Algorithm (GDCA) forms the basis for a suite of dynamics simulation tools used in this work. For completeness, the fundamental aspects of the GDCA are presented herein. Finally, a novel method for the efficient and accurate approximation of far-field force and moment terms is developed. This aspect is key to the success of any large molecular simulation since more than 90 % of the computational load in such simulations is associated with pairwise force calculations. The presented approximations are efficient, accurate, and highly compatible with multibody-based coarse-grained models.
Conference Paper
Due to the challenges involved with modeling complex molecular systems, it is essential that computationally intelligent schemes be produced that put the computational effort where and when it is needed to capture important phenomena, and maintain needed accuracy at minimum costs. In this work, we investigate and propose some key issues for the adaptive modeling and simulation of the dynamic behavior of highly complex multiscale processes. This is accomplished through the appropriate use of an adaptive hybridization of existing, newly developed, and proposed advanced multibody dynamics algorithms and modeling strategies for forward dynamic simulation. The adaptive multiscale simulation technique discussed here benefits from the highly parallelizable structure of the divide and conquer (DCA) framework for modeling multibody systems. These algorithms include Flexible Divide and Conquer Algorithm (FDCA), Orthogonal Complement Divide-and-Conquer Algorithm (ODCA) and generalized momentum approaches for modeling discontinuous changes in the system. These algorithms permit a large complex molecule (or systems of molecules) to be seamlessly treated using a hierarchy of reduced order models ranging from atomistic to the continuum scale.
Chapter
Computer simulations of molecular systems provide invaluable insights for understanding the structure-function relationships pertinent to the discovery of new molecules with desired properties, such as new pharmaceutical drugs. The use of molecular simulations and the increasing performance of modern computers now make it possible to study the precise physicochemical nature of protein-ligand interactions, protein engineering, solvation phenomena, and to characterize the thermodynamical properties of complex systems with many thousands of atoms. Even with the availability of high-performance computers, many problems of practical interest are so computationally challenging that new solution methods are required for formulating and solving the resulting mathematical models. This paper explores the capabilities of recursive dynamics methods for reducing the computational effort required to study complex molecular systems. A numerical example is presented that demonstrates the application of the basic recursive algorithms.
Article
A longstanding challenge in using computational methods for protein structure prediction is the refinement of low-resolution structural models derived from comparative modeling methods into highly accurate atomistic models useful for detailed structural studies. Previously, we have developed and demonstrated the utility of the internal coordinate molecular dynamics (MD) technique, Generalized Newton-Euler Inverse Mass Operator (GNEIMO), for refinement of small proteins. Using GNEIMO, the high-frequency degrees of freedom are frozen and the protein is modeled as a collection of rigid clusters connected by torsional hinges. This physical model allows larger integration time steps and focuses the conformational search in the low frequency torsional degrees of freedom. Here, we have applied GNEIMO with temperature Replica Exchange to refine low-resolution protein models of 30 proteins taken from the Continuous Assessment of Structure Prediction (CASP) competition. We have shown that GNEIMO torsional MD method leads to refinement of up to 1.3 Å in the root mean square deviation in coordinates for 30 CASP target proteins without using any experimental data as restraints in performing the GNEIMO simulations. This is in contrast with unconstrained all-atom Cartesian MD method where refinement requires the use of restraints during the simulations. We have also demonstrated refinement using a user-defined clustering scheme in GNEIMO as a viable approach for enhancing localized conformational search. Finally, we provide a protocol based on the GNEIMO replica exchange method for protein structure refinement that can be readily extended to other proteins and possibly applied for high throughput protein structure refinement.
Article
Multiresolution simulations of molecular systems such as DNAs, RNAs, and proteins are implemented using models with different resolutions ranging from a fully atomistic model to coarse-grained molecules, or even to continuum level system descriptions. For such simulations, pairwise force calculation is a serious bottleneck which can impose a prohibitive amount of computational load on the simulation if not performed wisely. Herein, we approximate the resultant force due to long-range particle-body and body-body interactions applicable to multiresolution simulations. Since the resultant force does not necessarily act through the center of mass of the body, it creates a moment about the mass center. Although this potentially important torque is neglected in many coarse-grained models which only use particle dynamics to formulate the dynamics of the system, it should be calculated and used when coarse-grained simulations are performed in a multibody scheme. Herein, the approximation for this moment due to far-field particle-body and body-body interactions is also provided.
Article
In this paper, a scheme for the canonical ensemble simulation of the coarse-grained articulated polymers is discussed. In this coarse-graining strategy, different subdomains of the system are considered as rigid and/or flexible bodies connected to each other via kinematic joints instead of stiff, but elastic bonds. Herein, the temperature of the simulation is controlled by a Nosé–Hoover thermostat. The dynamics of this feedback control system in the context of multibody dynamics may be represented and solved using traditional methods with computational complexity of O(n^3), where n denotes the number of degrees of freedom of the system. In this paper, we extend the divide-and-conquer algorithm (DCA), and apply it to constant temperature molecular simulations. The DCA in its original form uses spatial forces to formulate the equations of motion. The Generalized-DCA applied here properly accommodates the thermostat generalized forces, which control the temperature of the simulation, in the equations of motion. This algorithm can be implemented in serial and parallel with computational complexity of O(n) and O(log n), respectively.
Article
All-atom molecular dynamics simulations are widely used to study the flexibility of protein conformations. However, enhanced sampling techniques are required for simulating protein dynamics that occur on the millisecond timescale. In this work, we show that torsional molecular dynamics simulations enhance protein conformational sampling by performing conformational search in the low-frequency torsional degrees of freedom. In this article, we use our recently developed torsional-dynamics method called Generalized Newton-Euler Inverse Mass Operator (GNEIMO) to study the conformational dynamics of four proteins. We investigate the use of the GNEIMO method in simulations of the conformationally flexible proteins fasciculin and calmodulin, as well as the less flexible crambin and bovine pancreatic trypsin inhibitor. For the latter two proteins, the GNEIMO simulations with an implicit-solvent model reproduced the average protein structural fluctuations and sample conformations similar to those from Cartesian simulations with explicit solvent. The application of GNEIMO with replica exchange to the study of fasciculin conformational dynamics produced sampling of two of this protein’s experimentally established conformational substates. Conformational transition of calmodulin from the Ca2+-bound to the Ca2+-free conformation occurred readily with GNEIMO simulations. Moreover, the GNEIMO method generated an ensemble of conformations that satisfy about half of both short- and long-range interresidue distances obtained from NMR structures of holo to apo transitions in calmodulin. Although unconstrained all-atom Cartesian simulations have failed to sample transitions between the substates of fasciculin and calmodulin, GNEIMO simulations show the transitions in both systems. The relatively short simulation times required to capture these long-timescale conformational dynamics indicate that GNEIMO is a promising molecular-dynamics technique for studying domain motion in proteins.
Article
The original grant proposal comprised three parts, two of which were continuations of previous successful projects. The first project involves more efficient optimization techniques for very large molecules (containing several thousand atoms). The second is the development of algorithms for molecular dynamics in internal coordinates. The third project involves the efficient calculation of correlation energies for large (a few hundred atoms) molecules. This report summarizes our work in all three areas. Progress has been excellent throughout.
Article
Internal coordinate molecular dynamics (ICMD) is a recent efficient method for modeling polymer molecules which treats them as chains of rigid bodies rather than ensembles of point particles as in Cartesian MD. Unfortunately, it is readily applicable only to linear or tree topologies without closed flexible loops. Important examples violating this condition are sugar rings of nucleic acids, proline residues in proteins, and also disulfide bridges. This paper presents the first complete numerical solution of the chain closure problem within the context of ICMD. The method combines natural implicit fixation of bond lengths and bond angles by the choice of internal coordinates with explicit constraints similar to Cartesian dynamics used to maintain the chain closure. It is affordable for large molecules and makes possible 3–5 times faster dynamics simulations of molecular systems with flexible rings, including important biological objects like nucleic acids and disulfide-bonded proteins. © 1999 American Institute of Physics.
Article
A Brownian dynamics treatment in torsional angle space is presented for the simulation of conformational dynamics of macromolecules with fixed bond lengths and bond angles and with an arbitrary intramolecular potential energy function. The advantages of the torsional angle space treatment over similar treatments (Brownian dynamics or molecular dynamics) in atomic coordinate space are that, first, the number of variables is reduced by roughly a factor of 10 and, second, the integration time step size is increased by 3 to 4 orders of magnitude (because, by confining the treatment to the torsional angle space, the time step size is not limited by the fast oscillation modes of covalent bonds but rather by the slow motion of macromolecular segments whose time scale is roughly 3 to 4 orders of magnitude larger than that of bond oscillations). Consequently, the exploration of global conformational relaxation processes becomes computationally possible. The treatment is tested by studying the folding kinetics of off-lattice chains with fixed bond lengths and bond angles and with prescribed sequences. The present treatment is a general purpose one applicable to all macromolecular conformational relaxation processes (e.g., protein folding kinetics, drug/ligand docking on to target proteins, conformational multiple-minima problems, etc.). It serves as a complement to the molecular dynamics or Brownian dynamics treatments in atomic coordinate space. © 1998 American Institute of Physics.
Article
Conventional simulation techniques to model the dynamics of proteins in atomic detail are restricted to short time scales. A simplified molecular description, in which high frequency motions with small amplitudes are ignored, can overcome this problem. In this protein model only the backbone dihedrals ϕ and ψ and the χi of the side chains serve as degrees of freedom. Bond angles and lengths are fixed at ideal geometry values provided by the standard molecular dynamics (MD) energy function CHARMM. In this work a Monte Carlo (MC) algorithm is used, whose elementary moves employ cooperative rotations in a small window of consecutive amide planes, leaving the polypeptide conformation outside of this window invariant. A single of these window MC moves generates local conformational changes only. But, the application of many such moves at different parts of the polypeptide backbone leads to global conformational changes. To account for the lack of flexibility in the protein model employed, the energy function used to evaluate conformational energies is split into sequentially neighbored and sequentially distant contributions. The sequentially neighbored part is represented by an effective (ϕ,ψ)-torsion potential. It is derived from MD simulations of a flexible model dipeptide using a conventional MD energy function. To avoid exaggeration of hydrogen bonding strengths, the electrostatic interactions involving hydrogen atoms are scaled down at short distances. With these adjustments of the energy function, the rigid polypeptide model exhibits the same equilibrium distributions as obtained by conventional MD simulation with a fully flexible molecular model. Also, the same temperature dependence of the stability and build-up of α helices of 18-alanine as found in MD simulations is observed using the adapted energy function for MC simulations. Analyses of transition frequencies demonstrate that also dynamical aspects of MD trajectories are faithfully reproduced. 
Finally, it is demonstrated that even for high temperature unfolded polypeptides the MC simulation is more efficient by a factor of 10 than conventional MD simulations. © 1998 American Institute of Physics.
Article
Innovative algorithms have been developed during the past decade for simulating Newtonian physics for macromolecules. A major goal is alleviation of the severe requirement that the integration timestep be small enough to resolve the fastest components of the motion and thus guarantee numerical stability. This timestep problem is challenging if strictly faster methods with the same all-atom resolution at small timesteps are sought. Mathematical techniques that have worked well in other multiple-timescale contexts, where the fast motions are rapidly decaying or largely decoupled from others, have not been as successful for biomolecules, where vibrational coupling is strong. This review examines general issues that limit the timestep and describes available methods (constrained, reduced-variable, implicit, symplectic, multiple-timestep, and normal-mode-based schemes). A section compares results of selected integrators for a model dipeptide, assessing physical and numerical performance. Included is ...
Article
A new algorithm is presented for performing molecular dynamics simulations of peptides with fixed geometry, with the aim of simulating conformational changes and of exploring conformational space. The principle of the method is to expand the potential energy as a Taylor's series in the coordinates around the current point, retaining the force and its first two derivatives, and obtain a series solution of the resulting differential equations using a method due to Lyapunov. By choosing the time step so that the second term in the series is small compared to the first, the true solution can in principle be approximated to any desired degree of accuracy. The algorithm has been used to solve numerically Lagrange's equations of motion for N-acetyl alanine amide and N-acetyl methionine amide, regarded as fixed at their C-termini, under the influence of the ECEPP/2 potential energy function, and time steps of 15–30 fsec have been achieved with little variation in the total energy. Possible directions for future development are discussed.
Article
The Gō–Scheraga algorithm to produce rigid-geometry chain closures for polypeptide chains (N. Gō and H.A. Scheraga, Macromolecules, 3, 178, 1970) has been updated to allow each residue in the chain to adopt different bond lengths or bond angles. A treatment of five-residue local chain deformations is presented in detail. For chain sections shorter than five residues in length, it is shown that satisfactory closures may be obtained by direct fitting, indicating that the rigid-geometry approximation is adequate to model even short sections of chains having perturbed local geometry. The new implementation of the algorithm has been applied to several problems in protein structure determination and molecular modeling. The first of these is the problem of finding standard-geometry closures for short regions of chains having irregular geometry. It is shown that standard-geometry closures which superimpose well upon the coordinates of the irregular structures may be obtained routinely for chain sections that are five amino acid residues or more in length. Another application of the algorithm is to generate a large number of closures for a short segment of a protein chain, as a method to search the conformational space of this segment. The latter application should prove useful in studies in which the conformation of some region of a given protein has not been determined experimentally. Such applications include the modeling of proteins which have a sequence homology to a crystallized protein, and modeling regions of crystallized proteins which are not well-defined in electron density maps.
Article
Computer methods for analytic surface calculations of molecular systems suffer from numerical instabilities and are CPU time consuming. In this article, we present proposals toward the solution of both problems. Singularities arise when nearly collinear triples of neighboring atoms or multiple vertices are encountered during the calculation. Topological decisions in analytic surface calculation algorithms (accessibility of vertices and arcs) are based upon the comparison of distances or angles. If two such numbers are nearly equal, then currently used computer programs may not resolve this ambiguity correctly and can subsequently fail. In this article, modifications in the analytic surface calculation algorithm are described that recognize singularities automatically and treat them appropriately without restarting parts of the computation. The computing time required to execute these alterations is minimal. The basic modification consists in defining an accuracy limit within which two values may be assumed as equal. The search algorithm has been reformulated to reduce the computational effort. A new set of formulas makes it possible to largely avoid the extraction of square roots. Tests for small- and medium-sized intersection circles and for pairs of vertices with small vertex height help recognize fully buried circles and vertex pairs at an early stage. The new program can compute the complete topology of the surface and accessible surface area of the protein crambin in 1.50–4.29 s (on a single R3000 processor of an SGI 4D/480) depending on the compactness of the conformation where the limits correspond to the fully extended or fully folded chain, respectively. The algorithm, implemented in a computer program, will be made available on request. © John Wiley & Sons, Inc.
Article
Geometric algebra is an elegant and practical merger of classical vector algebra with Hamilton's quaternions. In this paper we show how it can be used to solve the local deformation problem introduced by Gō and Scheraga in 1970. As a special case, this enables us to characterize the torsion angles in cyclic molecules that are consistent with the ring closure constraint.
Article
Recent studies have pointed out the important role of local water structures in protein conformational stability. Here, we present an accurate and computationally effective way to estimate the free energy contribution of the simplest water structure motif–the water bridge. Based on the combination of empirical parameters for accessible protein surface area and the explicit consideration of all possible water bridges with the protein, we introduce an improved protein solvation model. We find that accounting for water bridge formation in our model is essential to understand the conformational behavior of polypeptides in water. The model formulation, in fact, does not depend on the polypeptide nature of the solute and is therefore applicable to other flexible biomolecules (i.e., DNAs, RNAs, polysaccharides, etc.).
Article
General Lagrange's equations of motion for a system of polymeric molecules are obtained in an explicit form. They can be used for simulating molecular dynamics of large molecules. The molecular conformations are described by internal coordinates, i.e., bond lengths, valence angles, and torsion angles. The equations derived permit any internal degrees of freedom to be frozen. The method is applied to an oligopeptide in an α-helical conformation. Three models of the molecule with different degrees of fixation are compared. It is shown that the method permits one to increase significantly the time step in molecular dynamics calculations.
Article
This paper reports a new research effort aimed at using efficient multibody dynamics methods to simulate coarse-grained molecular systems. Various molecular systems are studied and the results of nanosecond-long simulations are analyzed to validate the method. The systems studied include bulk water, alkane chains, alanine dipeptide and carboxyl terminal fragments of calmodulin, ribosomal L7/L12 and rhodopsin proteins. The stability and validity of the simulations are studied through conservation of energy, thermodynamics properties and conformational analysis. In these simulations, a speed up of an order of magnitude is realized for conservative error bounds with a fixed timestep integration scheme. A discussion is presented on the open-source software developed to facilitate future research using multibody dynamics with molecular dynamics.
Article
Until now and based on the success of the helix/coil transition theory it has been assumed that the α-helical propensities of the amino acids are position independent. This has been critical to derive the set of theoretical parameters for the 20 natural amino acids. Here, we have analyzed the behavior of several non-polar residues, Val, Ile, Leu, Met and Gly at the N-cap, at each position of the first helical turn and at a central helical position of a 16-residue peptide model system that starts with eight consecutive alanine residues. We have interpreted the results from these experiments with the model of the helix/coil transition (AGADIR), that indicates that the intrinsic helical propensity is position dependent. Gly, Val and Ile are more favorable at the first turn than in the middle of the α-helix, while for Leu and Met we observe the opposite behavior. The differences between the observed helical propensities are as large as 1.0 kcal/mol in some cases. Molecular modeling calculations using the ECEPP/2 force-field equipped with a hydration potential show that this effect can be explained by the combination of three factors: (a) the side-chains in the first helix turn are more solvent-exposed; (b) they have fewer intramolecular van der Waals’ contacts; and (c) they possess higher configurational entropy than those in the central position of an α-helix. The position-dependent results of the calculations are in reasonable agreement with the experimental estimates and with the intrinsic propensities of the amino acids derived from the statistical analysis of the protein structure database.
Article
Guanylate kinases (GMPK) from Mycobacterium tuberculosis, Mus musculus and Saccharomyces cerevisiae were submitted to virtual screening in order to determine protein-ligand interactions specific to M. tuberculosis. The opening of the cleft between CORE, LID and GMP domains was found to have a large influence on the established interactions and on the determination of ligands binding specifically to the M. tuberculosis GMPK. An extended definition of the active site pocket, which discriminates better between M. tuberculosis and M. musculus, is given. A virtual screening run with the extended pocket definition was used to select compounds having high docking scores on M. tuberculosis GMPK and low ones on M. musculus GMPK. The protein residues involved in hydrogen bonds with ligands were the same as in the GMPK-GMP complex, but the chemical functions of the ligand involved in these hydrogen bonds are often different. On the other hand, the hydrophobic interactions are different from the ones observed in the GMPK-GMP structure, and may be a way to increase the specificity between the M. tuberculosis and M. musculus GMPKs.
Article
Loop closure in proteins has been studied actively for over 25 years. Using spherical geometry and polynomial equations, several loop-closure problems in proteins are solved exactly by reducing them to the determination of the real roots of a polynomial. Loops of seven, eight, and nine atoms are treated explicitly, including the tripeptide and disulfide-bonded loop-closure problems. The number of valid loop closures can be evaluated by the method of Sturm chains, which counts the number of real roots of a polynomial. Longer loops can be treated by three methods: by sampling enough dihedral angles to reduce the problem to a soluble loop-closure problem; by applying the loop-closure algorithm hierarchically; or by decimating the chain into independently moving rigid elements that can be reconnected using loop-closure algorithms. Applications of the methods to docking, homology modeling and NMR problems are discussed. ©1999 John Wiley & Sons, Inc. J Comput Chem 20: 819–844, 1999
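The root-counting step mentioned in the abstract above can be illustrated with a minimal sketch of Sturm's theorem. This is a generic textbook construction and not the authors' implementation; the function names (`sturm_chain`, `count_real_roots`) and the coefficient convention (highest degree first) are assumptions made for illustration. Exact rational arithmetic is used to avoid sign errors from rounding, and the polynomial is assumed to have degree at least one with neither interval endpoint being a root.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Evaluate p (coefficients, highest degree first) at x by Horner's rule."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc

def poly_derivative(p):
    """Coefficients of dp/dx, highest degree first."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def poly_remainder(a, b):
    """Remainder of polynomial long division a / b."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    while a and a[0] == 0:
        a.pop(0)  # strip any remaining leading zeros
    return a

def sturm_chain(p):
    """Sturm sequence p, p', -rem(p, p'), ... (degree of p assumed >= 1)."""
    p = [Fraction(c) for c in p]
    chain = [p, poly_derivative(p)]
    while True:
        r = poly_remainder(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def sign_changes(chain, x):
    """Number of sign changes in the chain evaluated at x, ignoring zeros."""
    values = [v for v in (poly_eval(q, x) for q in chain) if v != 0]
    return sum(1 for s, t in zip(values, values[1:]) if (s > 0) != (t > 0))

def count_real_roots(p, lo, hi):
    """Distinct real roots of p in (lo, hi); endpoints must not be roots."""
    chain = sturm_chain(p)
    return sign_changes(chain, Fraction(lo)) - sign_changes(chain, Fraction(hi))

# x^3 - x has the three real roots -1, 0 and 1
print(count_real_roots([1, 0, -1, 0], -2, 2))
```

In a loop-closure setting, the polynomial would come from the closure equations and each real root in the admissible interval would correspond to a candidate closure; here the cubic is only a stand-in.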
Article
An efficient methodology, further referred to as ICM, for versatile modeling operations and global energy optimization on arbitrarily fixed multimolecular systems is described. It is aimed at protein structure prediction, homology modeling, molecular docking, nuclear magnetic resonance (NMR) structure determination, and protein design. The method uses and further develops a previously introduced approach to model biomolecular structures in which bond lengths, bond angles, and torsion angles are considered as independent variables, any subset of them being fixed. Here we simplify and generalize the basic description of the system, introduce the variable dihedral phase angle, and allow arbitrary connections of the molecules and conventional definition of the torsion angles. Algorithms for calculation of energy derivatives with respect to internal variables in the topological tree of the system and for rapid evaluation of accessible surface are presented. Multidimensional variable restraints are proposed to represent the statistical information about the torsion angle distributions in proteins. To incorporate complex energy terms as solvation energy and electrostatics into a structure prediction procedure, a “double-energy” Monte Carlo minimization procedure in which these terms are omitted during the minimization stage of the random step and included for the comparison with the previous conformation in a Markov chain is proposed and justified. The ICM method is applied successfully to a molecular docking problem. The procedure finds the correct parallel arrangement of two rigid helixes from a leucine zipper domain as the lowest-energy conformation (0.5 Å root mean square, rms, deviation from the native structure) starting from completely random configuration. Structures with antiparallel helixes or helixes staggered by one helix turn had energies higher by about 7 or 9 kcal/mol, respectively. Soft docking was also attempted. 
A docking procedure allowing side-chain flexibility also converged to the parallel configuration starting from the helixes optimized individually. To justify an internal coordinate approach to the structure prediction as opposed to a Cartesian one, energy hypersurfaces around the native structure of the squash seeds trypsin inhibitor were studied. Torsion angle minimization from the optimal conformation randomly distorted up to the rms deviation of 2.2 Å or angular rms deviation of 10° restored the native conformation in most cases. In contrast, Cartesian coordinate minimization did not reach the minimum from deviations as small as 0.3 Å or 2°. We conclude that the most promising detailed approach to the protein-folding problem would consist of some coarse global sampling strategy combined with the local energy minimization in the torsion coordinate space. © 1994 by John Wiley & Sons, Inc.
Article
This article describes an extension to previously developed constraint techniques. These enhanced constraint methods will enable the study of large computational chemistry problems that cannot be easily handled with current constrained molecular dynamics (MD) methods. These methods are based on an O(N) solution to the constrained equations of motion. The benefits of this approach are that (1) the system constraints are solved exactly at each time step, (2) the solution algorithm is noniterative, (3) the algorithm is recursive and scales as O(N), (4) the algorithm is numerically stable, (5) the algorithm is highly amenable to parallel processing, and (6) potentially greater integration step sizes are possible. It is anticipated that application of this methodology will provide a 10- to 100-fold improvement in the speed of a large molecular trajectory as compared with the time required to run a conventional atomistic unconstrained simulation. It is, therefore, anticipated that this methodology will provide an enabling capacity for pursuing the drug discovery process for large molecular systems. © 1995 John Wiley & Sons, Inc.
Article
Semiflexible models are often used to study macromolecules containing stable structural elements. Based on rigid body dynamics, we developed a rigid fragment constraint dynamics algorithm for the simulation of semiflexible macromolecules. Stable structural elements are treated as rigid fragments. Rigid fragment constraints, defined as combinations of distance constraints and position constraints, are introduced to limit internal molecular motion to the required mode. The constraint forces are solved separately for each rigid fragment constraint and iteratively until all constraint conditions are satisfied within a given tolerance at each time step, as is done for the bond length constraint in the SHAKE algorithm. The orientation of a rigid fragment is represented by the quaternion parameters, and both translation and rotation are solved by the leap-frog formulation. We tested the algorithm with molecular dynamics simulations of a series of peptides and a small protein. The computation cost for the constraints is roughly proportional to the size of the molecule. In the microcanonical ensemble simulation of polyvalines, the total energy was conserved satisfactorily with time steps as large as 20 fs. A helix folding simulation of a synthetic peptide was carried out to show the efficiency of the algorithm in a conformational search. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 1555–1566, 1998
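The abstract above compares its iterative constraint solution to the bond-length treatment in SHAKE. As a point of reference, the following is a minimal sketch of that standard SHAKE iteration for plain distance constraints only; it does not implement the rigid fragment constraints of the paper. The function name `shake`, its argument layout, and the tolerance convention (relative error on the squared bond length) are assumptions chosen for illustration.

```python
def shake(pos_new, pos_old, bonds, lengths, inv_mass, tol=1e-10, max_iter=500):
    """Iteratively correct unconstrained positions so each bond has its target length.

    pos_new : positions after an unconstrained step (list of [x, y, z])
    pos_old : positions before the step, assumed to satisfy the constraints
    bonds   : list of (i, j) index pairs; lengths: target distance per bond
    inv_mass: inverse mass per particle
    """
    pos = [list(p) for p in pos_new]
    for _ in range(max_iter):
        converged = True
        for (i, j), d in zip(bonds, lengths):
            r = [a - b for a, b in zip(pos[i], pos[j])]
            diff = sum(c * c for c in r) - d * d
            if abs(diff) > tol * d * d:
                converged = False
                # The correction acts along the bond vector of the previous step
                r_old = [a - b for a, b in zip(pos_old[i], pos_old[j])]
                dot = sum(a * b for a, b in zip(r, r_old))
                g = diff / (2.0 * (inv_mass[i] + inv_mass[j]) * dot)
                for k in range(3):
                    pos[i][k] -= g * inv_mass[i] * r_old[k]
                    pos[j][k] += g * inv_mass[j] * r_old[k]
        if converged:
            return pos
    raise RuntimeError("SHAKE did not converge within max_iter sweeps")

def distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

# Three-particle chain with unit bonds, slightly distorted by a hypothetical free step
old = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
new = [[0.05, 0.02, 0.0], [1.03, -0.04, 0.01], [0.98, 1.06, -0.02]]
fixed = shake(new, old, bonds=[(0, 1), (1, 2)], lengths=[1.0, 1.0],
              inv_mass=[1.0, 1.0, 1.0])
print(distance(fixed[0], fixed[1]), distance(fixed[1], fixed[2]))
```

Each constraint is corrected in turn and the sweep is repeated until all constraints hold within tolerance, which is the sense in which the abstract's per-fragment, iterative solution resembles SHAKE. Because the corrections to the two bonded particles are equal and opposite after mass weighting, the centre of mass is left unchanged.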
Article
Macromolecular structure determination by X-ray crystallography and solution NMR spectroscopy has experienced unprecedented growth during the past decade.
Article
Since determining the crystallographic structure of all peptide-MHC complexes is infeasible, an accurate prediction of the conformation is a critical computational problem. These models can be useful for determining binding energetics, predicting the structures of specific ternary complexes with T-cell receptors, and designing new molecules interacting with these complexes. The main difficulties are (1) adequate sampling of the large number of conformational degrees of freedom for the flexible peptide, (2) predicting subtle changes in the MHC interface geometry upon binding, and (3) building models for numerous MHC allotypes without known structures. Whereas previous studies have approached the sampling problem by dividing the conformational variables into different sets and predicting them separately, we have refined the Biased-Probability Monte Carlo docking protocol in internal coordinates to optimize a physical energy function for all peptide variables simultaneously. We also imitated the induced fit by docking into a more permissive smooth grid representation of the MHC followed by refinement and reranking using an all-atom MHC model. Our method was tested by a comparison of the results of cross-docking 14 peptides into HLA-A*0201 and 9 peptides into H-2K(b) as well as docking peptides into homology models for five different HLA allotypes with a comprehensive set of experimental structures. The surprisingly accurate prediction (0.75 Å backbone RMSD) for cross-docking of a highly flexible decapeptide, dissimilar to the original bound peptide, as well as docking predictions using homology models for two allotypes with low average backbone RMSDs of less than 1.0 Å illustrate the method's effectiveness. Finally, energy terms calculated using the predicted structures were combined with supervised learning on a large data set to classify peptides as either HLA-A*0201 binders or nonbinders. 
In contrast with sequence-based prediction methods, this model was also able to predict the binding affinity for peptides to a different MHC allotype (H-2K(b)), not used for training, with comparable prediction accuracy.
Article
Efficient conformational search or sampling approaches play an integral role in molecular modeling, leading to a strong demand for even faster and more reliable conformer search algorithms. This article compares the efficiency of a molecular dynamics method, a simulated annealing method, and the basin hopping (BH) approach (which are widely used in this field) with a previously suggested tabu-search-based approach called gradient only tabu search (GOTS). The study emphasizes the success of the GOTS procedure and, more importantly, shows that an approach which combines BH and GOTS outperforms the single methods in efficiency and speed. We also show that ring structures built by a hydrogen bond are useful as starting points for conformational search investigations of peptides and organic ligands with biological activities, especially in structures that contain multiple rings. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011.
Article
The unfolded protein response (UPR) is a coordinated program that promotes cell survival under conditions of endoplasmic reticulum stress and is required in tumor progression as well. To date, no specific small molecule inhibitor targeting this pathway has been identified. Pancreatic endoplasmic reticulum kinase (PERK), one of the UPR transducers, is an eIF2α kinase. Compromising PERK function inhibits tumor growth in mice, suggesting that PERK may be a cancer drug target, but identifying a specific inhibitor of any kinase is challenging. The goal of this study was to identify some pair-wise receptor-ligand atomic contacts that confer selective PERK inhibition. Compounds selectively inhibiting PERK-mediated phosphorylation in vitro were identified using an initial virtual library screen, followed by structure-activity hypothesis testing. The most potent PERK selective inhibitors utilize three specific kinase active site contacts that, when absent from chemically similar compounds, abrogate the inhibition: (i) a strong van der Waals contact with PERK residue Met7, (ii) interactions with the N-terminal portion of the activation loop, and (iii) groups providing electrostatic complementarity to Asp144. Interestingly, the activation loop contact is required for PERK selectivity to emerge. Understanding these structure-activity relationships may accelerate rational PERK inhibitor design.
Article
The virulence of the Gram-positive bacterium Bacillus anthracis, the causative agent of anthrax, is due to the presence of a capsule and two toxins. Each toxin results from the assembly of the protective antigen (PA) with one of two factors, lethal (LF) or edema (EF), in the cytoplasm of the host cell. EF is an adenylyl cyclase that converts ATP to cAMP in an uncontrolled manner, causing cellular dysregulation. It is activated by calmodulin (CaM), which is involved in many calcium signalling pathways. Crystal structures and an NMR study have shown that the stability of the EF-CaM complex depends on the level of calcium bound to CaM. Molecular dynamics simulations of the complex with 0, 2, or 4 calcium ions made it possible to characterize the effect of calcium on the conformational plasticity of the two partners and to propose a model of the EF-CaM interaction. The joint analysis of dynamic correlations and energetic influences brought out the concept of connectedness of the residue network as a stability criterion. The large conformational transition induced in EF by CaM binding was described by determining a plausible reaction path through molecular modelling. The intermediate conformations obtained were used to guide the rational search for inhibitors of the EF toxin, as part of an approach combining computational and experimental methods. An innovative strategy, involving virtual screening of an allosteric pocket rather than the catalytic site of the enzyme, identified six active molecules that completely inhibit EF activity at concentrations of 10-100 µM.
Article
Structure-based drug design (SBDD) has emerged as an important tool in drug discovery research. Traditionally, SBDD is based on a static crystal structure of the target protein. However, a protein in solution exists as an ensemble of energetically accessible conformations and is best described when all states are represented. Upon ligand binding, further conformational changes in the receptor can be induced. While ligand flexibility can be accurately reproduced, replicating the innumerable degrees of freedom of the protein is impractical due to limitations in computational power. Previously, Carlson et al. developed a robust method to generate receptor-based pharmacophore models based on an ensemble of protein conformations. The use of multiple protein structures (MPS) allows a range of conformational space that can be assumed by the protein to be sampled and hence, simulates the inherent flexibility of a binding site in a computationally feasible manner. Small molecule probes are used to map energetically favorable regions of each protein active site, and the MPS are then overlaid to identify the most important, chemically relevant features conserved across the conformations. Here, we have refined the MPS method by developing techniques to optimize different steps in the procedure. First, we outline tools to properly overlay flexible proteins based on the rigid regions of the structure by incorporating a Gaussian weight into a standard RMSD alignment. Atoms that barely move between the two conformations will have a greater weighting than those that have a large displacement. Using HIV-1 protease (HIV-1p) as a test case, we next examine the use of various sources of MPS: snapshots of an apo structure across a molecular dynamics simulation, a bound NMR ensemble, and a collection of bound crystal structures. Finally, we implement a simple ranking metric into the MPS method to quantify ligand overlap with a contour-based representation of the pharmacophore model. 
Overlapping in a region of the active site dense with pharmacophore spheres results in a higher ranking of a ligand pose. The refined MPS method and other computational techniques are then applied to study HIV-1p and investigate a novel inhibition mechanism by modulating its conformational behavior.
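The Gaussian weighting idea described in this abstract can be illustrated with a minimal sketch. It assumes the two conformations are already superposed, that the weight takes the form w_i = exp(-d_i^2 / c), and that the scale constant `c` has an arbitrary illustrative value; the function names are hypothetical and the abstract does not give the exact functional form used by the authors.

```python
import math

def gaussian_weights(coords_a, coords_b, c=2.0):
    """Per-atom weights exp(-d^2 / c); c is an assumed, illustrative scale."""
    weights = []
    for p, q in zip(coords_a, coords_b):
        d2 = sum((x - y) ** 2 for x, y in zip(p, q))  # squared displacement
        weights.append(math.exp(-d2 / c))
    return weights

def weighted_rmsd(coords_a, coords_b, weights):
    """Weighted RMSD: sqrt(sum(w_i * d_i^2) / sum(w_i))."""
    num = 0.0
    for w, p, q in zip(weights, coords_a, coords_b):
        num += w * sum((x - y) ** 2 for x, y in zip(p, q))
    return math.sqrt(num / sum(weights))

# Two rigid atoms and one atom displaced by 3 length units.
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
conf_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
w = gaussian_weights(conf_a, conf_b)
print(w, weighted_rmsd(conf_a, conf_b, w))
```

In a full alignment these weights would feed a weighted least-squares superposition that is repeated until the weights stop changing; the sketch shows only how atoms that barely move come to dominate the fit.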
Article
This paper presents an application of the new nonlinear global optimization routine gradient only tabu search (GOTS) to conformational search problems. It is based on the tabu search strategy which tries to determine the global minimum of a function by the steepest descent-modest ascent strategy. The refinement of the ranking procedure of the original GOTS method and the exploitation of simulated annealing elements are described, and the modifications of the GOTS algorithm necessary to adapt it to conformational searches are explained. The utility of the GOTS for conformational search problems is tested using various examples.
Article
The statistical weights of equilibrium conformations of macromolecules contain contributions from internal vibrations. An analysis of such vibrations, in the absence and presence of solvent, is presented from a quantum‐statistical‐mechanical point of view. Several classical approximations to the quantum‐mechanically correct expression, with different degrees of accuracy, are derived. In all of these approximations, all internal degrees of freedom of the polymer are divided into two classes: hard (bond stretching and bond angle bending) and soft (torsional rotations around single bonds, i.e., variation of dihedral angles). Since the hard variables oscillate many times before the soft variables change in value by an appreciable amount, the hard variables can be treated effectively as parameters (i.e., not as independent variables), which are (i) functions of the instantaneous values of the soft variables, or (ii) simply constants, the latter treatment being less accurate. In treatment (i) the molecule is regarded as flexible, whereas in treatment (ii) bond lengths and bond angles are assumed to be rigidly fixed, while the dihedral angles can change. In both treatments, the sum of the zero‐point energies corresponding to the hard degrees of freedom must be added to the soft‐mode part of the energy unless the errors in the calculation of the latter are ≥0.1 kcal/unit, which is the usual magnitude for the change in zero‐point energies for various conformations. The soft degrees of freedom are treated classically, i.e., the statistical weight is given by integration of the Boltzmann factor over phase space. The integration over the momentum space of the soft variables yields a conformation‐dependent term, proportional to ln detG, which may be called the kinetic entropy (G being a coefficient matrix for the kinetic energy of a polymer in the canonical expression for the Hamiltonian). 
A practical method is given for the calculation of detG, and it is applied to a simple example. The result shows that the dependence of ln detG on the coordinates is usually not negligible. For states with small conformational fluctuations (helical polymer structures and globular proteins), the result of treatment (ii) can be used as a perturbational step to proceed to treatment (i). Further approximations, necessary for treating random‐coil states (ones with large conformational fluctuations), are discussed. The effect of solvent on the statistical weights is also discussed.
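The origin of the ln det G term can be sketched in a few lines, assuming a classical Hamiltonian that is quadratic in the momenta of n soft variables. The symbols q, p, V, n, k_B and T are notation introduced here for illustration and are not taken from the abstract.

```latex
% Assumed form of the Hamiltonian for the soft variables q with conjugate momenta p
H(q,p) = \tfrac{1}{2}\, p^{\mathsf T} G(q)\, p + V(q)

% Gaussian integration over the momenta
\int e^{-H/k_B T}\, dp = (2\pi k_B T)^{n/2}\, [\det G(q)]^{-1/2}\, e^{-V(q)/k_B T}

% Resulting configurational weight, with a conformation-dependent kinetic term
Z \propto \int \exp\!\left\{ -\frac{V(q) + \tfrac{1}{2} k_B T \ln\det G(q)}{k_B T} \right\} dq
```

Under these assumptions the momentum integration adds (k_B T / 2) ln det G(q) to the effective potential, which is the conformation-dependent contribution the abstract calls the kinetic entropy.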
Article
A numerical algorithm integrating the 3N Cartesian equations of motion of a system of N points subject to holonomic constraints is formulated. The relations of constraint remain perfectly fulfilled at each step of the trajectory despite the approximate character of numerical integration. The method is applied to a molecular dynamics simulation of a liquid of 64 n-butane molecules and compared to a simulation using generalized coordinates. The method should be useful for molecular dynamics calculations on large molecules with internal degrees of freedom.
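A minimal sketch of the kind of iterative constraint correction this abstract describes, written for fixed-distance constraints only. The function name `shake`, the tolerance, the iteration cap, and the three-particle example are illustrative assumptions, not the authors' implementation.

```python
import math

def shake(old, new, constraints, masses, tol=1e-10, max_iter=500):
    """Correct unconstrained positions `new` so every (i, j, d) distance holds.

    old : positions at the start of the step (constraints satisfied there)
    new : positions after an unconstrained integration step
    """
    pos = [list(p) for p in new]
    for _ in range(max_iter):
        converged = True
        for i, j, d in constraints:
            r_new = [a - b for a, b in zip(pos[i], pos[j])]
            diff = sum(c * c for c in r_new) - d * d
            if abs(diff) > tol:
                converged = False
                # The correction acts along the bond direction of the old step.
                r_old = [a - b for a, b in zip(old[i], old[j])]
                dot = sum(a * b for a, b in zip(r_new, r_old))
                inv_m = 1.0 / masses[i] + 1.0 / masses[j]
                g = diff / (2.0 * inv_m * dot)  # linearized multiplier
                for k in range(3):
                    pos[i][k] -= g * r_old[k] / masses[i]
                    pos[j][k] += g * r_old[k] / masses[j]
        if converged:
            return pos
    raise RuntimeError("constraint iteration did not converge")

# Three particles joined by two unit-length bonds.
old = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
new = [(0.0, 0.05, 0.0), (1.1, 0.0, 0.02), (0.95, 1.1, 0.0)]
bonds = [(0, 1, 1.0), (1, 2, 1.0)]
fixed = shake(old, new, bonds, [1.0, 1.0, 1.0])
print([math.dist(fixed[i], fixed[j]) for i, j, _ in bonds])
```

Each constraint is corrected in turn and the sweep is repeated because fixing one bond slightly disturbs its neighbours; the loop stops once every squared bond length is within `tol` of its target.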
Article
DOI:https://doi.org/10.1103/PhysRev.39.746
Article
The chain-closure algorithm of Gō and Scheraga (Gō, N.; Scheraga, H. A. Macromolecules 1970, 3, 178) has been modified to allow bond angle variations. This modification greatly increases the domain of applicability of the algorithm. Of particular interest is its use for bridging deletions or introducing additions in the homology modeling of proteins. Examples from an α-helix, a β-sheet, a cyclic polypeptide, and several proteins are presented.
Article
A survey of over 50 crystal structures indicates that both imino acid and peptide derivatives of proline populate ring conformers consistent with the torsional potentials about single bonds. In both cases, lower barriers for rotation about C-N bonds relative to those about C-C bonds favor smaller values for dihedral angles about the former bonds. In peptides a minimum in the torsional potential about C-N bonds occurs at zero dihedral angle, further favoring small angles. The pyrrolidine‐ring dihedral angles of the proline compounds in the solid state obey a cyclopentane‐type pseudorotation function. Thus the puckering of the five‐membered ring can be quantitatively described by two parameters. Consistent with small dihedral angles about C-N bonds, Cβ and/or Cγ are puckered out of the mean plane of the ring in nearly all of the nonstrained compounds. Utilizing the consistent force‐field method of Lifson and coworkers [see A. Warshel, M. Levitt, and S. Lifson (1970) J. Mol. Spectrosc. 33, 84] the intramolecular energy of five proline peptides was minimized with respect to all internal coordinates. In addition, the energy surface near minima was explored by constraining a particular dihedral angle and reminimizing the energy with respect to all remaining variables. In linear peptides two types of pyrrolidine‐ring conformers have identical predicted energies. In the cyclic dipeptide cyclo(Pro‐Gly) one of the ring conformers is favored by about 3 kcal/mol, while the cyclic tripeptide cyclo(Pro‐Gly‐Gly) favors the other conformer by a comparable margin. In agreement with observations in the solid state and in solution, Cβ and/or Cγ are puckered in the predicted conformers. A correlation between proline Φ and the details of the puckered conformation was predicted and found to match precisely conformers observed in crystals. 
For the diamides N‐acetyl‐L‐proline‐N′‐methylamide and N‐acetyl‐L‐proline‐N′,N′‐dimethylamide (AcProMe2A) 30% and 60% cis acetyl peptide bonds were predicted in good agreement with observations in nonpolar solvents for the respective compounds. The conformational distributions with respect to proline Ψ are also in accord with experimental observations. For AcProMe2A, a model for a ‐Pro‐Pro‐ sequence in a peptide chain, this study is the first to predict stable conformers for proline Ψ either ca. −50° or 140° for both cis and trans peptides.
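The two-parameter description of five-membered ring puckering mentioned in this abstract can be sketched as follows. The specific functional form, theta_j = theta_m cos(P + 4πj/5), is the commonly used pseudorotation expression and is an assumption here, since the abstract does not state the formula; the function name and the sample values are hypothetical.

```python
import math

def ring_torsions(phase_deg, amplitude_deg):
    """Five endocyclic torsions (degrees) from a phase P and an amplitude theta_m."""
    p = math.radians(phase_deg)
    return [amplitude_deg * math.cos(p + 4.0 * math.pi * j / 5.0) for j in range(5)]

# Illustrative values only: phase 18 degrees, puckering amplitude 40 degrees.
torsions = ring_torsions(18.0, 40.0)
print([round(t, 2) for t in torsions])
```

In this form the five torsions always sum to zero, so the phase and the amplitude fix the whole ring once bond lengths and bond angles are taken as given.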
Article
Conformational analysis of triple helices of a type of collagen was performed with typical collagen tripeptide sequences based on Gly-Pro-Ala, Gly-Ala-Hyp, and Gly-Ala-Ala. During energy minimization, the possibility of continual deformation of the pyrrolidine cycle was taken into account in order to achieve better accuracy in the resulting structure. The (Gly-Pro-Ala)n structure is almost isomorphic to the (Gly-Pro-Hyp)n structure obtained in the previous work [Tumanyan, V. G. & Esipova, N. G. (1982) Biopolymers 21, 475–497]. For a collagen-type structure, the optimal conformation of (Gly-Ala-Hyp)n tends to have a decreased unit twist (t = 15°), although the energy advantage with respect to the conformation with t = 45° is not so significant. A similar situation is observed for (Gly-Ala-Ala)n. In this case, the energy decrease during unwinding to t = 15° from t = 45° is quite small. The conformations of (Gly-Ala-Hyp)n and (Gly-Ala-Ala)n with t = 15° exhibit a similarity with a triple complex of polyproline II helices, a noncoiled coil such as (Gly-Pro-Hyp)n and (Gly-Pro-Ala)n. A similar structure may be postulated for subcomponent C1q of the first component of human complement, which contains substantial Gly-X-Pro and Gly-X-Y tripeptide derivatives in the primary structure (X, Y = any amino acid). The results suggest that the observed helical symmetry of collagen (t = 36°) is a consequence of superposition of diffraction patterns (for sufficiently long segments) from various helices (t varies from ∼15° for Gly-X-Hyp and Gly-X-Y to ∼56° for Gly-Pro-Ala). For short alternating segments, some unification of different helical structures is possible.
Article
The CORELS (COnstrained-REstrained Least-Squares) program is developed for proteins and nucleic acids to take advantage of the intrinsic rigid groups found in the molecules and to overcome the relatively low resolution of the X-ray data from their crystals. CORELS combines Scheringer's rigid groups constraints, extended to allow for variable torsion angles, with distance restraints to maintain stereochemistry between groups within a specified error limit. Even though allowing variable internal dihedral angles introduces torsional degrees of freedom within the otherwise constrained group, it can reduce the total number of structural parameters in the structure by decreasing the number of groups. The advantages of this approach include a large increase in the data-to-parameter ratio over the restrained refinement methods and automatic maintenance of group stereochemistry: within each group, all bond lengths and bond angles are constrained. The use of a constrained-restrained least-squares procedure has proven to be extremely useful in refining macromolecular structures, especially when the initial model has severe errors. This method inherently has many fewer degrees of freedom than restrained refinement procedures and therefore is applicable at extremely low resolution with a very large radius of convergence.
Article
A general methodology is proposed for the conformational modelling of biomolecular systems. The approach allows one (i) to describe the system under investigation by an arbitrary set of internal variables, i.e., torsion angles, bond angles, and bond lengths; it offers a possibility to pass from the free structure to a completely fixed one with the number of variables from 3N to zero, respectively, where N is the number of atoms; (ii) to consider both a single molecule and a complex of many molecules (e.g., proteins, water, ligands, etc.) in terms of one universal model; (iii) to study the dynamics of the system using explicit analytical Lagrangian equations of motion, thus opening up possibilities for investigations of slow concerted motions such as domain oscillations in proteins; (iv) to calculate the partial derivatives of various functions of conformation, e.g., the conformational energy or external constraints imposed, using a standard efficient procedure regardless of the variables and the structure of the system. The approach is meant to be used in various investigations concerning the conformations and dynamics of biomacromolecules.
Article
A general and efficient methodology is presented which allows molecules containing one or many rings of any size to be manipulated within energy minimization procedures. Variables describing the conformation of the molecules concerned are limited to dihedral and ring valence angles and the ring closure conditions are treated as equality constraints. An application is made to the ion transporter valinomycin and its complexes with K+ and Na+ which illustrates the possibilities of the approach and leads to results which allow a better understanding of the conformational mechanics of this important ionophore.
Article
One approach to finding the conformation of minimum energy for a complicated molecule is to perform energy minimization, perhaps coupled to more exhaustive search procedures such as dynamics or Monte Carlo sampling, from many starting conformations. Where there are geometric constraints on the conformations, as in a ring molecule, or a variable loop starting and ending in known constant regions of one of a series of homologous proteins, rapidly generating many such starting conformations, all satisfying the constraints, has been a problem in the past. We have devised an algorithm, which we call random tweak, which performs this task in the context of a torsional description of a molecule, and have used it to model the backbones of the six CDRs (complementarity determining regions) of the immunoglobulin MCPC603. These range in size from 5 to 19 residues, and have from 8 to 36 variable dihedral angles. Ensembles of 100 properly closed backbone structures for each CDR were generated under several conditions of van der Waals screening internally and against the rest of the molecule, and ensembles of 1000 were generated for selected CDRs. These structure “libraries” reveal how the geometry at the base of a CDR and the topography of the surrounding protein surface restrict the region of space that a given CDR can occupy. In accord with simple notions of chain molecule statistics, the more highly extended a CDR at its base, the more similar the possible structures and the fewer that are necessary to span the conformational space. Energy minimization and molecular dynamics studies (reported elsewhere) using these libraries to furnish starting conformations show that, as the number of residues in a CDR goes from five to nine, the number of randomly generated structures necessary to ensure that low-lying energetic minima, such as the native conformation, will be found several times goes from a few tens to a few hundred. 
Some of the spatial features of an ensemble of random conformations are implicit in the histogram of the rms atomic displacements calculated for all the pairs in the ensemble. The random tweak method is carried out by setting each dihedral angle on the main chain of the variable fragment to a random value, then using an iterated linearized Lagrange multiplier technique to enforce the geometric constraints with the minimal conformational perturbation. The time required for the algorithm is linear in fragment length, and the resulting ability of the method to handle large loops makes it especially applicable to the modeling of homologous proteins. In most cases, hundreds of acceptable structures could be generated in a few hours on a VAX 11/780. Where van der Waals screening against fixed atoms need not be performed, as for isolated ring molecules, generation times go down by an order of magnitude or more.
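The "random values, then minimal-perturbation closure" idea can be illustrated on a toy problem. The sketch below is a two-dimensional analogue, not the published algorithm: it assumes a planar chain of unit links with joint angles standing in for dihedral angles, a single end-point constraint, and a simple backtracking step that is an addition of this sketch. All names are hypothetical.

```python
import math
import random

def end_point(angles):
    """End position and joint positions of a planar chain of unit links."""
    x = y = heading = 0.0
    joints = []
    for a in angles:
        joints.append((x, y))
        heading += a
        x += math.cos(heading)
        y += math.sin(heading)
    return (x, y), joints

def close_chain(angles, target, tol=1e-8, max_iter=200):
    """Iterate minimum-norm angle updates until the chain end reaches `target`."""
    angles = list(angles)
    for _ in range(max_iter):
        (x, y), joints = end_point(angles)
        rx, ry = target[0] - x, target[1] - y
        err = math.hypot(rx, ry)
        if err < tol:
            return angles
        # Jacobian columns: turning joint i moves the end perpendicular to (end - joint_i).
        col_x = [-(y - py) for (_, py) in joints]
        col_y = [x - px for (px, _) in joints]
        # Solve (J J^T) lam = r for the two Lagrange multipliers.
        a = sum(v * v for v in col_x)
        b = sum(u * v for u, v in zip(col_x, col_y))
        d = sum(v * v for v in col_y)
        det = a * d - b * b
        if abs(det) < 1e-12:
            raise RuntimeError("singular configuration")
        lam_x = (d * rx - b * ry) / det
        lam_y = (a * ry - b * rx) / det
        step = [lam_x * u + lam_y * v for u, v in zip(col_x, col_y)]  # J^T lam
        scale = 1.0
        while scale > 1e-6:  # backtrack until the closure error decreases
            trial = [t + scale * s for t, s in zip(angles, step)]
            (tx, ty), _ = end_point(trial)
            if math.hypot(target[0] - tx, target[1] - ty) < err:
                angles = trial
                break
            scale *= 0.5
        else:
            raise RuntimeError("line search failed")
    raise RuntimeError("closure did not converge")

rng = random.Random(0)
start = [rng.uniform(-math.pi, math.pi) for _ in range(6)]
closed = close_chain(start, (3.0, 1.0))
print(end_point(closed)[0])
```

Each update is the smallest change in the angles, in the least-squares sense, that satisfies the linearized closure condition, which mirrors the "minimal conformational perturbation" described in the abstract.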
Article
New first and second-order differential equations for changes of dihedral angles characterizing local deformations of chain molecules with fixed bond lengths and bond angles are derived. Two methods for integrating the differential relations are given. The proposed method is used to generate a path of locally deformed conformations around a β-turn region of a small protein, bovine pancreatic trypsin inhibitor. The variable regions change their conformations by more than 3 Å root-mean-square distance value whereas the fixed regions stay within 0.02 Å. Possible applications of this method are in the field of computer graphics, Monte Carlo simulations, and energy minimization calculations of chain molecules.
Article
Native-like folded conformations of bovine pancreatic trypsin inhibitor protein are calculated by searching for conformations with the lowest possible potential energy. Twenty-five random starting structures are subjected to soft-atom restrained energy minimization with respect to both the torsion angles and the atomic Cartesian co-ordinates. The restraints used to limit the search include the three disulphide bridges and the 16 main-chain hydrogen bonds that define the native secondary structure. The potential energy functions used are detailed and include terms that allow bond stretching, bond angle bending, bond twisting, van der Waals' forces and hydrogen bonds. Novel features of the methods used include soft-atoms to make restrained energy minimization work, writhing numbers to classify chain threadings, and molecular dynamics followed by energy minimization to anneal the conformations and reduce their energies further. Conformations are analysed using writhing numbers, torsion angle distributions, hydrogen bonds and accessible surface areas. The resulting conformations are very diverse in their chain threadings, energies and root-mean-square deviations from the X-ray structure. There is a relationship between the root-mean-square deviation and the energy, in that the lowest energy conformations are also closest to the X-ray structure. The best conformation calculated here has a root-mean-square deviation of only 3 Å and shows the same special threading found in the X-ray structure. The methods introduced here have wide ranging applications; they can be used to build models of protein conformations that have low energy values and obey a wide variety of restraints.