The sustained performance of the Protein Explorer with the direct summation algorithm. The calculation time per step (left axis) and the efficiency are plotted. The solid line and dashed one indicate those of the petaflops system and the 2-Tflops system, respectively.

Source publication

Protein Explorer: A Petaflops Special-Purpose Computer System for Molecular Dynamics Simulations

Conference Paper

Jan 2003

We are developing the 'Protein Explorer' system, a petaflops special-purpose computer system for molecular dynamics simulations. The Protein Explorer is a PC cluster equipped with special-purpose engines that calculate nonbonded interactions between atoms, which is the most time-consuming part of the simulations. A dedicated LSI 'MDGRAPE-3 chip' pe...

Context 1

... model is based on the direct summation algorithm, which will show the best sustained performance. Figure 7 shows the sustained performance of the Protein Explorer. The total time T = T_PE + T_host + T_comm + T_MPI and the efficiency T_PE/T are plotted. ...
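
The relation quoted above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the excerpt does not define the four terms, so reading them as time spent in the special-purpose engines, on the host, in host-engine communication, and in MPI is an assumption, and the sample timings are hypothetical values, not figures from the paper.

```python
def sustained_efficiency(t_pe, t_host, t_comm, t_mpi):
    """Return (T, T_PE / T) for T = T_PE + T_host + T_comm + T_MPI.

    All arguments are times per step in the same unit (e.g. seconds).
    """
    total = t_pe + t_host + t_comm + t_mpi
    if total <= 0:
        raise ValueError("total time per step must be positive")
    return total, t_pe / total


# Hypothetical per-step timings in seconds (assumed for illustration only).
total, efficiency = sustained_efficiency(
    t_pe=0.5, t_host=0.25, t_comm=0.125, t_mpi=0.125
)
print(f"time per step: {total} s, efficiency: {efficiency:.0%}")
```

Under this model the efficiency approaches 1 only when the three overhead terms become small relative to T_PE, which is why both the time per step and the efficiency are plotted.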

Accelerators for Classical Molecular Dynamics Simulations of Biomolecules

Article

Jul 2022
J CHEM THEORY COMPUT

Atomistic Molecular Dynamics (MD) simulations provide researchers the ability to model biomolecular structures such as proteins and their interactions with drug-like small molecules with greater spatiotemporal resolution than is otherwise possible using experimental methods. MD simulations are notoriously expensive computational endeavors that have traditionally required massive investment in specialized hardware to access biologically relevant spatiotemporal scales. Our goal is to summarize the fundamental algorithms that are employed in the literature to then highlight the challenges that have affected accelerator implementations in practice. We consider three broad categories of accelerators: Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), and Application Specific Integrated Circuits (ASICs). These categories are comparatively studied to facilitate discussion of their relative trade-offs and to gain context for the current state of the art. We conclude by providing insights into the potential of emerging hardware platforms and algorithms for MD.

Molecular modeling in drug discovery

Article

Feb 2022

Given the high cost and long timelines associated with bringing a commercial drug to market, computer-aided drug design has been recognized as a powerful technology in the drug discovery pipeline. Molecular modeling techniques, which help accelerate drug discovery, have experienced considerable growth in computational capabilities over the last decade. Pharmaceutical companies and academic research organizations currently use various computational modeling techniques to lower the cost and time required to discover an effective drug. In this article, we review three key components of molecular modeling (molecular docking, molecular dynamics, and ADMET modeling), their applications, and their limitations in small-molecule drug discovery. We discuss the technicalities of molecular dynamics and docking, the algorithms used to develop docking software, and the models explored by these algorithms together with their scoring functions. We also review the influence of molecular dynamics simulations (all-atom and coarse-grained) in drug discovery and explain how the ensembles generated from MD simulations could pave the way for novel drug discovery. Furthermore, we briefly explain the role played by pharmacokinetic and pharmacodynamic profiling in discovering new leads for therapeutic efficacy. Beyond the computational success of molecular modeling in drug discovery, we highlight the experimental corroboration of drug candidates discovered in silico. However, as there is hardly a drug on the market discovered primarily through computational modeling, we conclude the review by proposing possible solutions that could foster the advancement and clinical success of such drugs.

86 PFLOPS Deep Potential Molecular Dynamics simulation of 100 million atoms with ab initio accuracy

Preprint

Apr 2020

We present the GPU version of DeePMD-kit, which, upon training a deep neural network model using ab initio data, can drive extremely large-scale molecular dynamics (MD) simulation with ab initio accuracy. Our tests show that the GPU version is 7 times faster than the CPU version with the same power consumption. The code can scale up to the entire Summit supercomputer. For a copper system of 113,246,208 atoms, the code can perform one nanosecond MD simulation per day, reaching a peak performance of 86 PFLOPS (43% of the peak). Such unprecedented ability to perform MD simulation with ab initio accuracy opens up the possibility of studying many important issues in materials and molecules, such as heterogeneous catalysis, electrochemical cells, irradiation damage, crack propagation, and biochemical reactions.

On the Feasibility of FPGA Acceleration of Molecular Dynamics Simulations

Preprint

Aug 2018

Classical molecular dynamics (MD) simulations are important tools in life and material sciences since they allow studying chemical and biological processes in detail. However, the inherent scalability problem of particle-particle interactions and the sequential dependency of subsequent time steps render MD computationally intensive and difficult to scale. To this end, specialized FPGA-based accelerators have been repeatedly proposed to ameliorate this problem. However, to date none of the leading MD simulation packages fully supports FPGA acceleration, and a direct comparison of GPU versus FPGA accelerated codes has remained elusive. With this report, we aim at clarifying this issue by comparing measured application performance on GPU-dense compute nodes with performance and cost estimates of an FPGA-based single-node system. Our results show that an FPGA-based system can indeed outperform a similarly configured GPU-based system, but the overall application-level speedup remains on the order of 2x due to software overheads on the host. Considering the price for GPU and FPGA solutions, we observe that GPU-based solutions provide the better cost/performance tradeoff, and hence pure FPGA-based solutions are unlikely to be commercially viable. However, we also note that scaled multi-node systems could potentially benefit from a hybrid composition, where GPUs are used for compute-intensive parts and FPGAs for latency- and communication-sensitive tasks.

Revisiting FPGA Acceleration of Molecular Dynamics Simulation with Dynamic Data Flow Behavior in High-Level Synthesis

Article

Nov 2016

Molecular dynamics (MD) simulation is one of the past decade's most important tools for enabling biologists and other researchers to explore human health and diseases. However, due to the computational complexity of the MD algorithm, it takes weeks or even months to simulate a comparatively simple biological entity on conventional multicore processors. The critical path in molecular dynamics simulations is the force calculation between particles inside the simulated environment, which has abundant parallelism. Among various acceleration platforms, FPGA is an attractive alternative because of its low power and high energy efficiency. However, due to its high programming cost using RTL, none of the mainstream MD software packages has yet adopted FPGA for acceleration. In this paper we revisit the FPGA acceleration of MD in high-level synthesis (HLS) so as to provide affordable programming cost. Our experience with the MD acceleration demonstrates that HLS optimizations such as loop pipelining, module duplication and memory partitioning are essential to improve the performance, achieving a speedup of 9.5X compared to a 12-core CPU. More importantly, we observe that even the fully optimized HLS design can still be 2X slower than the reference RTL architecture due to the common dynamic (conditional) data flow behavior that is not yet supported by current HLS tools. To support such behavior, we further customize an array of processing elements together with a data-driven streaming network through a common RTL template, and fully automate the design flow. Our final experimental results demonstrate a 19.4X performance speedup and 39X energy efficiency for the widely used ApoA1 MD benchmark on the Convey HC1ex FPGA compared to a 12-core Intel Xeon server.

A piecewise lookup table for calculating nonbonded pairwise atomic interactions

Article

Nov 2015
J MOL MODEL

A critical challenge for molecular dynamics simulations of chemical or biological systems is to improve the calculation efficiency while retaining sufficient accuracy. The main bottleneck in improving the efficiency is the evaluation of nonbonded pairwise interactions. We propose a new piecewise lookup table method for rapid and accurate calculation of interatomic nonbonded pairwise interactions. The piecewise lookup table allows nonuniform assignment of table nodes according to the slope of the potential function and the pair interaction distribution. The proposed method assigns the nodes more reasonably than in general lookup tables, and thus improves the accuracy while requiring fewer nodes. To obtain the same level of accuracy, our piecewise lookup table accelerates the calculation via the efficient usage of cache memory. This new method is straightforward to implement and should be broadly applicable.
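
As a rough illustration of why nonuniform node placement can help, the sketch below tabulates a Lennard-Jones potential on two node sets of equal size and compares the worst-case interpolation error. It is not the authors' method: the geometric node spacing, the linear interpolation, the reduced units, and every function name here are assumptions chosen for brevity, and the pair interaction distribution mentioned in the abstract is not modeled at all.

```python
from bisect import bisect_right


def lj(r):
    """Lennard-Jones potential in reduced units (epsilon = sigma = 1)."""
    inv6 = 1.0 / r**6
    return 4.0 * (inv6 * inv6 - inv6)


def geometric_nodes(r_min, r_cut, n):
    """Nonuniform nodes: spacing grows with r, so nodes are denser where the slope is steep."""
    ratio = (r_cut / r_min) ** (1.0 / (n - 1))
    return [r_min * ratio**k for k in range(n)]


def uniform_nodes(r_min, r_cut, n):
    """Evenly spaced nodes, as in a general lookup table."""
    step = (r_cut - r_min) / (n - 1)
    return [r_min + step * k for k in range(n)]


def lookup(r, nodes, values):
    """Linearly interpolate between the two table nodes that bracket r."""
    i = bisect_right(nodes, r) - 1
    i = max(0, min(i, len(nodes) - 2))  # clamp to a valid segment
    t = (r - nodes[i]) / (nodes[i + 1] - nodes[i])
    return values[i] + t * (values[i + 1] - values[i])


def max_error(nodes, samples):
    """Largest absolute difference between table lookup and direct evaluation."""
    values = [lj(x) for x in nodes]
    return max(abs(lookup(r, nodes, values) - lj(r)) for r in samples)


# 1000 sample distances between 0.95 and 2.4 (reduced units, assumed range).
samples = [0.95 + 1.45 * k / 999 for k in range(1000)]
err_geometric = max_error(geometric_nodes(0.9, 2.5, 64), samples)
err_uniform = max_error(uniform_nodes(0.9, 2.5, 64), samples)
print(f"max error, geometric nodes: {err_geometric:.4f}")
print(f"max error, uniform nodes:   {err_uniform:.4f}")
```

With the same number of nodes, the table that concentrates nodes in the steep short-range region should show the smaller worst-case error, which is the general idea the abstract describes.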

MDGRAPE-4: A special-purpose computer system for molecular dynamics simulations

Article

Aug 2014
PHILOS T R SOC A

We are developing the MDGRAPE-4, a special-purpose computer system for molecular dynamics (MD) simulations. MDGRAPE-4 is designed to achieve strong scalability for protein MD simulations through the integration of general-purpose cores, dedicated pipelines, memory banks and network interfaces (NIFs) to create a system on chip (SoC). Each SoC has 64 dedicated pipelines that are used for non-bonded force calculations and run at 0.8 GHz. Additionally, it has 65 Tensilica Xtensa LX cores with single-precision floating-point units that are used for other calculations and run at 0.6 GHz. At peak performance levels, each SoC can evaluate 51.2 G interactions per second. It also has 1.8 MB of embedded shared memory banks and six network units with a peak bandwidth of 7.2 GB/s for the three-dimensional torus network. The system consists of 512 (8×8×8) SoCs in total, which are mounted on 64 node modules with eight SoCs each. Optical transmitters/receivers are used for internode communication. The expected maximum power consumption is 50 kW. While the MDGRAPE-4 software is still being improved, we plan to run MD simulations on MDGRAPE-4 in 2014. The MDGRAPE-4 system will enable long-time molecular dynamics simulations of small systems. It is also useful for multiscale molecular simulations where the particle simulation parts often become bottlenecks.
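
The figures quoted in this abstract are mutually consistent, which a short calculation makes explicit. The sketch assumes each pipeline completes one pairwise interaction per clock cycle (an assumption that reproduces the stated 51.2 G interactions per second); the system-wide total is a derived number, not one given in the abstract.

```python
PIPELINES_PER_SOC = 64      # dedicated non-bonded force pipelines per SoC
PIPELINE_CLOCK_GHZ = 0.8    # pipeline clock frequency
TORUS_DIMS = (8, 8, 8)      # three-dimensional torus of SoCs
NODE_MODULES = 64
SOCS_PER_NODE = 8

# Assumption: one interaction per pipeline per cycle.
per_soc_g_interactions = PIPELINES_PER_SOC * PIPELINE_CLOCK_GHZ

total_socs = TORUS_DIMS[0] * TORUS_DIMS[1] * TORUS_DIMS[2]

# Derived value (not stated in the abstract): aggregate peak rate of all SoCs.
system_g_interactions = total_socs * per_soc_g_interactions

print(f"per SoC: {per_soc_g_interactions:.1f} G interactions/s")
print(f"SoCs: {total_socs} (node modules x SoCs per module = {NODE_MODULES * SOCS_PER_NODE})")
print(f"system peak (derived): {system_g_interactions:.1f} G interactions/s")
```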

Molecular Dynamics Study of the Effect of Induced Mutations on the Protein Structures Associated with Diseases of A Radiobiological Nature

Article

Jan 2013

Kholmirzo T. Kholmurodov

Induced mutations in biological molecules, such as DNA and proteins, can arise from quite different sources (environmental factors, viruses, ionizing radiation, mutagenic chemicals, inherited genetic alterations, etc.). Induced mutations can destroy the existing chemical (hydrogen) bonds in the native molecular structures or, on the contrary, create new chemical (hydrogen) bonds that do not normally exist there. In protein structures, the cause of such changes might be the substitution of one or several specific amino acid residues (point mutations). At the atomic level, the replacement of one amino acid residue by another causes essential modifications of the molecular force fields of the environment, which can break important hydrogen bonds underlying the structural stability of biological molecules. In this work, using the molecular dynamics (MD) method, we demonstrate the effect of mutational structure changes on several biological protein models (the p53 oncoprotein, visual pigment rhodopsin, cyclin-dependent kinase, and recA protein). Molecular dynamics simulation is a powerful tool for investigating the structural properties of biological molecules at the atomic and molecular levels, and it has been widely used to study the conformational behavior of proteins. We also discuss the scenario of the mutation effects associated with different kinds of diseases that could develop under physiological conditions.

THE EVOGRID: An Approach to Computational Origins of Life Endeavours

Thesis

Jul 2011

Bruce Damer

The quest to understand the mechanisms of the origin of life on Earth could be enhanced by computer simulations of plausible stages in the emergence of life from non-life at the molecular level. This class of simulation could then support testing and validation through parallel laboratory chemical experiments. This combination of a computational, or “cyber” component and a parallel effort investigation in chemical abiogenesis could be termed a cyberbiogenesis approach. The central technological challenge to cyberbiogenesis endeavours is to design computer simulation models permitting de novo emergence of prebiotic and biological virtual molecular structures and processes through multiple thresholds of complexity. This thesis takes on the challenge of designing, implementing and analyzing one such simulation model. This model can be described concisely as: distributed processing and global optimization through the method of search coupled with stochastic hill climbing supporting emergent phenomena within small volume, short time frame molecular dynamics simulations. The original contributions to knowledge made by this work are to frame computational origins of life endeavours historically; postulate and describe one concrete design to test a hypothesis surrounding this class of computation; present results from a prototype system, the EvoGrid, built to execute a range of experiments which test the hypothesis; and propose a road map and societal considerations for future computational origins of life endeavours.

Synthesis and infeasibility analysis for stochastic models of biochemical systems using statistical model checking and abstraction refinement

Article

May 2011
THEOR COMPUT SCI

The stochastic dynamics of biochemical reaction networks can be modeled using a number of succinct formalisms all of whose semantics are expressed as Continuous Time Markov Chains (CTMC). While some kinetic parameters for such models can be measured experimentally, most are estimated by either fitting to experimental data or by performing ad hoc, and often manual search procedures. We consider an alternative strategy to the problem, and introduce algorithms for automatically synthesizing the set of all kinetic parameters such that the model satisfies a given high-level behavioral specification. Our algorithms, which integrate statistical model checking and abstraction refinement, can also report the infeasibility of the model if no such combination of parameters exists. Behavioral specifications can be given in any finitely monitorable logic for stochastic systems, including the probabilistic and bounded fragments of linear and metric temporal logics. The correctness of our algorithms is established using a novel combination of arguments based on survey sampling and uniform continuity. We prove that the probability of a measurable set of paths is uniformly and jointly continuous with respect to the kinetic parameters. Under a suitable technical condition, we also show that the unbiased statistical estimator for the probability of a measurable set of paths is monotonic in the parameter space. We apply our algorithms to two benchmark models of biochemical signaling, and demonstrate that they can efficiently find parameter regimes satisfying a given high-level behavioral specification. In particular, we show that our algorithms can synthesize up to 6 parameters, simultaneously, which is more than that reported by any other synthesis algorithm for stochastic systems. 
Moreover, when parameter estimation is desired, as opposed to synthesis, we show that our approach can scale to even higher dimensional spaces, by identifying the single parameter combination that maximizes the probability of the behavior being true in an 11-dimensional system.
