Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule Docking Approaches

July 2022

DOI:10.5772/intechopen.105815

License
CC BY 3.0

In book: Molecular Docking - Recent Advances [Working Title]

Authors:

Sefika Feyza Maden

Istanbul Medeniyet Universitesi

Saliha Ece Acuner

Istanbul Medeniyet University

Proteins (e.g., enzymes, receptors, hormones, antibodies, transporter proteins, etc.) seldom act alone in the cell, and their functions rely on their interactions with various partners such as small molecules, other proteins, and/or nucleic acids. Molecular docking is a computational method developed to model these interactions at the molecular level by predicting the 3D structures of complexes. Predicting the binding site and pose of a protein with its partner through docking can help us to unveil protein structure-function relationship and aid drug design in numerous ways. In this chapter, we focus on the fundamentals of protein docking by describing docking methods including search algorithm, scoring, and assessment steps as well as illustrating recent successful applications in drug discovery. We especially address protein–small-molecule (drug) docking by comparatively analyzing available tools implementing different approaches such as ab initio, structure-based, ligand-based (pharmacophore-/shape-based), information-driven, and machine learning approaches.

Grid box usage in docking: (A) blind docking with a grid box of size 44 × 72 × 68 and center coordinates 10.711, 0.0, 3.782; (B) specific docking with a grid box of size 14 × 14 × 16 and center coordinates 10.735, −2.409, 21.173.

…

Crystal structure of the SARS-CoV-2 main protease (white, PDB ID: 7KX5, chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted with AutoDock Vina and the reference ligand, the Jun8-76-3A inhibitor (red, PDB ID: 7KX5, chain B). This figure was drawn with PyMOL 2.5.2.

…

SefikaFeyza Maden, SelinSezer and Saliha EceAcuner

Keywords: molecular docking, drug design, drug discovery, protein interactions, machine learning

1. Introduction

The molecular machines of the cell, i.e., proteins, are essential to many cellular processes such as signal transduction and cell regulation. Proteins seldom act alone in the cell; they function by interacting with small molecules or other macromolecules. Therefore, understanding protein interactions at the atomic level is critical to understanding biological processes [1]. The primary structure, i.e., the amino acid sequence, of the interacting proteins is a necessary but insufficient source of information at the atomic level. After being synthesized, proteins fold and acquire a stable native structure, i.e., a tertiary structure defined in three-dimensional (3D) space, in order to be functional. Proteins with different sequences can have similar functional structures; that is, different amino acid sequences can show similar folding trends in 3D space, and structure is more conserved than sequence [2]. Therefore, it is crucial to understand the interaction details at the structural level. Proteins physically interact with their partners via non-covalent associations, namely hydrogen bonds, hydrophobic interactions, and electrostatic interactions, with the exception of covalent disulfide bridges. These intermolecular physical forces also dominate the protein folding process.

The 3D structures of macromolecules can be determined using experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-EM and then deposited in the Protein Data Bank (PDB) (https://www.rcsb.org). However, there is a huge gap between the number of known protein sequences and structures [3, 4]. Computational modeling approaches that can predict 3D structures of macromolecules can help to bridge this gap. A recent machine learning algorithm developed by DeepMind, called AlphaFold [5], can predict 3D structures of proteins from sequence information with high accuracy and has been accepted as a breakthrough in the structural biology field. In 1 year, approximately 1 million new structures have been predicted and deposited in the AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/). For a complete understanding of the proteome, computational techniques are needed not only for modeling single protein structures but also for modeling the interactions between them.

Molecular docking is a method used to predict the structures of proteins in complex with other proteins, nucleic acids, or small molecules. It can be defined as predicting the appropriate low-energy binding pose of the ligand in complex with the target structure by randomly colliding proteins and their potential partners in space, first creating a rigid complex structure model and then focusing on the binding sites of that model with flexible interface refinement [6]. Energy minimization of randomly docked conformations in space requires a multidimensional calculation. The initially developed molecular docking methods treated ligands and receptors as rigid bodies without considering any conformational changes [7]. However, interactions between proteins can become quite complex even with small changes in the conformation of the structures [7], and docking algorithms may not solve this complex problem in a physically correct way [8]. The main source of computational difficulty in docking algorithms is a significant change in the conformation of the protein backbone upon binding [9, 10]. To address this problem, different techniques that consider backbone flexibility have been successfully implemented in docking algorithms [10].

Many diseases, such as cancer, are likely to be linked to problems in protein-protein interactions, and targeting these interactions can therefore enable the development of next-generation therapeutic methods [11]. Modeling the complex structures formed by proteins with other proteins or small molecules holds the key to understanding many biological processes; for example, modeling enzyme-substrate or protein-drug interactions can reveal insights into binding sites/interface regions, function, and mechanism of action. The main protein–small-molecule docking applications in drug discovery include drug repositioning and structure- and ligand-based (pharmacophore-/shape-based) drug design approaches using virtual and reverse screening [11–14]. Today, with continuously developing technology, targeted drug design, drug target search, evaluation of the side effects of existing drugs, and the search for new targets for these drugs can be achieved with the help of molecular modeling and machine learning methods [12]. Deep learning neural network models have strong computational ability on big data and attract attention in the structural biology field [15]. There are antibiotic discovery studies using deep neural networks [16] and deep learning studies adapted to drug design [17].



In this chapter, we focus on the fundamentals of protein–small-molecule docking and describe the steps of the docking algorithm and procedure in detail. We then give recent successful applications in drug design and discovery that use different docking approaches, namely virtual screening, reverse screening, and machine learning. Lastly, we comparatively analyze some of the available protein–small-molecule docking tools using the structure of the SARS-CoV-2 main protease in complex with the non-covalent inhibitor Jun8-76-3A as a case study.

2. Fundamentals of protein–small-molecule docking

Protein–small-molecule interactions are essential for the sustainability of biological processes such as enzymatic catalysis and overall homeostasis in the body [18]. The engineering of protein–small-molecule interactions is one of the computational approaches used to solve critical problems in biology [18]. Protein–small-molecule docking, i.e., modeling the interaction between chemical compounds and their target protein receptors at the atomic level, is an effective tool in drug design. In the structure-based design of small-molecule drugs, a good estimation of the binding pose is required to clearly demonstrate important interactions and design drugs with increased selectivity and efficacy [19]. The procedures that can be followed and the tools that can be used before, during, and after molecular docking are explained in the following subsections and summarized in Figure 1.

2.1 Before docking: molecule preparation

Before starting a docking study, the most suitable protein and ligand structures should be selected [20]. Databases such as PDB, UniProt, and the Therapeutic Target Database (TTD) provide access to the experimentally determined structures of target proteins. If no experimental structure is available, modeled structures can be obtained from the AlphaFold Database or built using relevant structure modeling software. The most frequently used databases for obtaining small-molecule ligand/chemical structures are DrugBank [21], PubChem [22], ZINC [23], ChEMBL [24], and ChemSpider [25] (Figure 1). The DrugBank, ChemSpider, and ZINC databases include more than 500,000, 100 million, and 230 million compounds/drug molecules, respectively.

Figure 1. The procedures that can be followed and the tools that can be used before, during, and after protein-ligand molecular docking in drug design.

Molecular docking algorithms may require preliminary preparation of structures that are obtained in PDB format (lacking H atoms). Tools such as Open Babel [26] and AutoDockTools [27] are available for such preliminary preparations (Figure 1).

It is also crucial to guide docking with preliminary information on the binding site. Otherwise, without binding site constraints, blind docking takes place, and the correct binding poses are more difficult to detect because the ligand search space is large. When binding sites are not known, various algorithms for active site prediction can guide the search, including GRID [28], SurfNet [29], COACH [30], SCFbio [31], CASTp [32], DeepSite [33], and PUResNet [34] (Figure 1).

The capabilities of docking algorithms differ from each other, so it is important to choose the algorithm carefully, in accordance with the purpose of the study, before starting the docking.

2.2 Docking algorithm steps

There are many approaches and algorithms for molecular docking, based on different parameters, and they all aim to perform protein-ligand docking with the best performance [12]. The steps of molecular docking algorithms can be summarized as follows: molecule flexibility, conformational search algorithms (ligand sampling), and scoring functions (Figure 2) [12, 35].

2.2.1 Molecule flexibility

During molecular docking, structures can be considered rigid or flexible. Rigid docking takes into account only the translational and rotational degrees of freedom. Providing flexibility means also considering rotation about single bonds, so that conformations have the same bond lengths and angles but different torsion angles. Although the flexible docking approach is more realistic than rigid docking, when there are many rotatable bonds, the ligand conformational search space becomes so large that it is difficult to find the correct binding pose with the lowest binding free energy (global minimum solution). Some algorithms, such as HADDOCK [36], first treat the structures as rigid to increase time efficiency and then perform flexible refinement on the poses with the best energy scores. Molecular docking software can be grouped according to the flexibility treatment of the molecules into rigid docking, semi-flexible docking, and soft docking [35, 37].

Figure 2. Methods for protein-ligand molecular docking.

In rigid docking, protein and ligand molecules are treated as rigid entities [37, 38]. During docking, the positions of the molecules change without losing their shape [37], i.e., only translational and rotational but no conformational degrees of freedom are considered.
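As an illustration of what "only translation and rotation" means, the following minimal sketch applies a rigid-body move to a set of coordinates and checks that all internal distances are unchanged. The helper names and the four-atom "ligand" are hypothetical and are not part of any docking package.

```python
import numpy as np

def rotation_z(angle_rad):
    """Rotation matrix for a rotation about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def rigid_transform(coords, rotation, translation):
    """Rotate the molecule about its centroid, then translate it."""
    coords = np.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    rotated = (coords - centroid) @ rotation.T + centroid
    return rotated + np.asarray(translation, dtype=float)

def pairwise_distances(coords):
    """All interatomic distances as an (N, N) matrix."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical four-atom ligand (coordinates in angstroms).
ligand = np.array([[0.0, 0.0, 0.0],
                   [1.5, 0.0, 0.0],
                   [1.5, 1.2, 0.0],
                   [2.7, 1.2, 0.8]])

moved = rigid_transform(ligand, rotation_z(np.pi / 3), [5.0, -2.0, 1.0])

# The pose changed, but the shape (all internal distances) did not.
print(np.allclose(pairwise_distances(ligand), pairwise_distances(moved)))
```

A rigid pose is therefore fully described by six numbers (three translations and three rotation angles), whereas each rotatable bond in semi-flexible docking adds one further torsional degree of freedom.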

Semi-flexible docking keeps the protein structure rigid and lets the ligand structure be flexible by allowing rotation about its rotatable bonds. Thus, various conformational poses of the ligand on the protein are sampled [35, 37, 38]. It gives more accurate results than rigid docking [37].

In soft docking, van der Waals interactions between atoms are softened, making the structures of both receptor and ligand molecules implicitly flexible, as overlap is allowed to a small extent [39, 40]. The docking process is thus carried out more realistically, because both the protein and the ligand are allowed some flexibility, as in their natural states [37, 38]. It is an advantageous method due to its computational efficiency and ease of application [35, 37].

2.2.2 Conformational search algorithms

Conformational search algorithms can identify different conformational orientations (poses) of the ligand sampled around the experimentally determined active site or other binding sites on the protein [35, 41, 42]. These algorithms are generally classified as shape matching, systematic, stochastic, and simulation methods [35, 38, 43].

Shape matching algorithms have the advantage of speed over other algorithms [35, 44] and adopt a sampling principle in which the conformation of the ligand should be structurally complementary to the protein binding site [38]. They position the ligand in the way that best complements the molecular surface of the binding site on the protein [35]. Example software using shape matching includes DOCK [45], FLOG [46], EUDOC [47], Surflex [48], LibDOCK [49], SANDOCK [50], and MDock [51].

Systematic search algorithms obtain a large number of possible binding poses by gradually changing the degrees of freedom of the ligand [35, 52] in the direction of minimum energy. They can be divided into two classes: exhaustive search and fragmentation (incremental construction) [35, 41, 53]. An exhaustive search systematically generates flexible ligand conformations by rotating the rotatable bonds in the ligand [35]. If the number of rotatable bonds is large, there is a combinatorial explosion in the number of poses, i.e., in the search space, so some filtering and optimization procedures are applied for practical purposes [35]. Glide [54] and FRED [55] are example docking software using exhaustive conformational search algorithms. In the fragmentation method, the ligand is divided into smaller fragments; one fragment is placed at the binding site, and the others are then added gradually, each through covalent bonding to the previous one [35]. DOCK [56], LUDI [57], FlexX [58], and eHiTs [59] are example software using fragmentation.
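The combinatorial explosion mentioned above can be made concrete with a small back-of-the-envelope calculation. The sketch below assumes that every rotatable bond is sampled independently at a fixed angular step; the 30° step and the function name are illustrative assumptions, not settings of any particular docking program.

```python
def count_torsion_conformations(n_rotatable_bonds, step_degrees=30):
    """Number of torsion-angle combinations in an exhaustive grid search.

    Each rotatable bond is assumed to take 360 / step_degrees discrete
    values, independently of the other bonds.
    """
    if step_degrees <= 0 or 360 % step_degrees != 0:
        raise ValueError("step_degrees must be a positive divisor of 360")
    states_per_bond = 360 // step_degrees
    return states_per_bond ** n_rotatable_bonds

for n in (2, 5, 10):
    print(n, "rotatable bonds:", count_torsion_conformations(n), "conformations")
```

With a 30° step, two rotatable bonds give 144 combinations, whereas ten give more than 6 × 10^10, and this count still excludes the translational and rotational placement of each conformation in the binding site.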

Stochastic search methods are more efficient but do not guarantee an accurate result, because they are based on generating random ligand conformations; the docking process in these algorithms is therefore iterative [41, 44]. Monte Carlo, swarm optimization, evolutionary algorithms, and Tabu search methods are among the most used stochastic algorithms [35, 38, 52]. Example software using stochastic conformational search includes AutoDock [60], GOLD [61], DockThor [62], and MolDock [63].
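To give a flavor of an iterative stochastic search, the following sketch runs a Monte Carlo search with the Metropolis acceptance criterion on a one-dimensional toy "energy landscape". The energy function, step size, and temperature are purely illustrative assumptions; real docking programs search over translations, rotations, and torsions with their own scoring functions.

```python
import math
import random

def toy_energy(x):
    """Hypothetical 1D landscape with several local minima (not a real score)."""
    return 0.1 * (x - 2.0) ** 2 + math.sin(3.0 * x)

def metropolis_search(energy, x0, n_steps=5000, step_size=0.5,
                      temperature=1.0, seed=0):
    """Random-walk search that sometimes accepts uphill moves."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)  # random perturbation
        e_new = energy(x_new)
        delta = e_new - e
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = metropolis_search(toy_energy, x0=8.0)
print(f"best x = {best_x:.3f}, energy = {best_e:.3f}")
```

Accepting some uphill moves is what allows the walk to leave local minima, but there is no guarantee that the global minimum is found, which is why such searches are typically repeated from different starting points or random seeds.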

Simulation methods, i.e., simulations of the obtained ligand poses, represent protein and ligand flexibility better than the other algorithms but are slow and can sample insufficiently [38, 44]. For this reason, they are used as a complement to other conformational search methods [38].

2.2.3 Scoring functions

In the previously described conformational search step, many structures are created, and most of them should be eliminated by selecting the biologically appropriate structures. Therefore, the possible poses created by conformational search algorithms are evaluated and ranked using a scoring function [35]. The scoring function is a measure for evaluating the docking poses obtained [35, 38, 52] in terms of their binding free energies [11, 44, 64].

Scoring functions that estimate the binding energies of the created complex structures should evaluate various physicochemical properties in order to distinguish good results from bad ones. These physicochemical properties include intermolecular interactions, desolvation, and electrostatic and entropic effects [65]. As the number of evaluated parameters increases, the accuracy of the scoring function tends to increase, but so does the computational load. Therefore, especially when working with large ligand sets, the most efficient scoring functions are those that balance accuracy and speed [11]. Scoring functions can be classified as force-field-based, empirical, knowledge-based, and consensus scoring.

Force-field scoring functions (FFSFs) are designed to work with force fields such as AMBER [66], CHARMM [67], GROMOS [68], and OPLS [69], individually or in combination. FFSFs estimate the free energy of ligand binding by considering energy terms such as van der Waals interactions, electrostatic interactions, and hydrogen bonds [35, 38].
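A minimal sketch of a force-field-like interaction energy is given below, combining a Lennard-Jones 12-6 term for van der Waals interactions with a Coulomb term for electrostatics. It is a simplified illustration, not the functional form of AMBER, CHARMM, GROMOS, or OPLS: a single ε and σ are assumed for all atom pairs, the dielectric constant is an arbitrary choice, and the coordinates and partial charges are hypothetical.

```python
import numpy as np

# Converts q_i * q_j / r (elementary charges, angstroms) to kcal/mol.
COULOMB_CONSTANT = 332.06

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy; the minimum is -epsilon at r = 2**(1/6) * sigma."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)

def interaction_energy(protein_xyz, protein_q, ligand_xyz, ligand_q,
                       epsilon=0.15, sigma=3.5, dielectric=4.0):
    """Sum of pairwise van der Waals and Coulomb terms between two molecules.

    epsilon, sigma and dielectric are assumed, uniform parameters.
    """
    protein_xyz = np.asarray(protein_xyz, dtype=float)
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    # (n_protein, n_ligand) matrix of interatomic distances.
    r = np.linalg.norm(protein_xyz[:, None, :] - ligand_xyz[None, :, :], axis=-1)
    qq = np.outer(protein_q, ligand_q)
    energy = lennard_jones(r, epsilon, sigma) + COULOMB_CONSTANT * qq / (dielectric * r)
    return float(energy.sum())

# Hypothetical two-atom "protein" and one-atom "ligand".
protein = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
charges_p = [0.4, -0.4]
ligand = [[1.5, 3.5, 0.0]]
charges_l = [-0.3]
print(interaction_energy(protein, charges_p, ligand, charges_l))
```

Even in this reduced form, the cost grows with the product of the numbers of protein and ligand atoms, which is one reason why many docking programs precompute interaction grids or use simpler empirical terms.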

Empirical scoring functions use simpler energy terms, such as hydrogen bonds and ionic interactions, to estimate the free energy of ligand binding, and they can be calculated more easily and faster than FFSFs [35, 38, 52]. Some examples of empirical scoring functions are GlideScore [54], PLP [70], LigScore [71], LUDI [72], SCORE [73], and X-Score [74].

Knowledge-based scoring functions use statistical analysis of protein-ligand complex structures to derive potentials from the observed protein-ligand atom-pair distances [44]. These functions can show high performance in a short time [52]. They can also model some uncommon interactions, such as sulfur-aromatic interactions, that other functions do not address [44].

Consensus scoring is not a specific scoring function; it combines multiple scoring functions with the aim of minimizing the possible error margins of the individual scoring systems [35, 38, 44].
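One simple way to combine several scoring functions is to average the rank that each of them assigns to a pose. The sketch below illustrates this rank-averaging idea under stated assumptions: the scores are made-up numbers, lower scores are taken to be better, and rank averaging is only one of several possible consensus schemes.

```python
def ranks(scores):
    """Rank items by score; the lowest (best) score gets rank 1."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def consensus_rank(score_table):
    """Average rank of each pose over all scoring functions.

    score_table maps a scoring-function name to a list of scores,
    one per pose, in the same pose order for every function.
    """
    per_function = [ranks(scores) for scores in score_table.values()]
    n_poses = len(per_function[0])
    return [sum(r[i] for r in per_function) / len(per_function)
            for i in range(n_poses)]

# Hypothetical scores for three poses (lower = better).
scores = {
    "force_field":     [-7.2, -8.1, -6.5],
    "empirical":       [-6.9, -7.5, -7.8],
    "knowledge_based": [-5.0, -6.2, -4.1],
}

average_ranks = consensus_rank(scores)
best_pose = average_ranks.index(min(average_ranks))
print("average ranks:", average_ranks)
print("best pose index:", best_pose)
```

Ranks are used here instead of raw scores because different scoring functions generally report values on different scales, so their numbers cannot be averaged directly.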



2.3 After docking: evaluation of the results

After performing protein-ligand docking studies, the accuracy of the pose predictions needs to be evaluated [41, 52]. The best way to evaluate a docking algorithm is to compare the predicted binding pose of the ligand with the position of the reference ligand in the experimentally determined structure, if available. The structural comparison is quantified using the root mean squared deviation (RMSD) (Eq. 1), in units of Å [41, 75]. An RMSD of 2–4 Å or less is preferred for a good docking result. RMSD calculations are simple, but this metric is not normalized to the number of atoms and therefore should not be considered an absolute measure [76]. As a more systematic approach, the consistency of the docking algorithm should be checked by repeating the docking process [52] at least 50 times and clustering the poses of the side chains and references according to a certain threshold value [77]. This procedure determines whether the docking algorithm correctly and consistently creates a pose in the right position [41, 44, 78].

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(x_{ai}-x_{bi})^{2}+(y_{ai}-y_{bi})^{2}+(z_{ai}-z_{bi})^{2}\right]} \tag{1}$$

Eq. (1) Root mean squared deviation for the coordinates of two molecules, a and b, with N atoms.
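Eq. (1) translates directly into a few lines of code. The sketch below assumes that the two coordinate sets list the same atoms in the same order and are already in the same reference frame; it performs no superposition and no correction for symmetric atoms, both of which dedicated tools may apply. The coordinates are hypothetical.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two matched (N, 3) coordinate sets."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both molecules must have the same number of atoms")
    squared_distances = ((a - b) ** 2).sum(axis=1)  # per-atom squared deviation
    return float(np.sqrt(squared_distances.mean()))

# Hypothetical reference and predicted poses of a three-atom ligand.
reference = np.array([[0.0, 0.0, 0.0],
                      [1.5, 0.0, 0.0],
                      [1.5, 1.2, 0.0]])
predicted = reference + np.array([1.0, 0.0, 0.0])  # shifted by 1 Å along x

print(rmsd(reference, predicted))
```

Because every atom in this example is displaced by exactly 1 Å, the RMSD is 1 Å; in a redocking test, the same function would be applied to the docked pose and the crystallographic ligand.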

The modeling successes and capabilities of docking algorithms have been evaluated since 2001 in a competition called CAPRI (Critical Assessment of PRedicted Interactions) (https://www.capri-docking.org/) [79, 80]. Experimentally determined complex structures that have not yet been published in the PDB are submitted to CAPRI, and the participants, without knowing the experimental structure of the complex, try to predict the structure most similar to it using docking algorithms [79]. A solution set of 10 models is presented to the CAPRI committee for evaluation based on the geometric similarity and biological relevance of the predicted complex structures. The results of CAPRI show very good predictions for easy targets with simple conformational changes but considerably worse ones for difficult targets with substantial conformational changes upon binding [9].

3. Molecular docking approaches and applications in drug design

Computational methods have become an important part of the drug discovery process with the increasing accuracy of algorithms. Various docking methods based on different algorithms are constantly being developed to determine the structural relationships between potential drug molecules and their targets [44]. In addition, studies in this area shed light on candidate drugs in terms of their pharmacodynamic properties, affinity, and selectivity [11]. The main molecular docking applications in drug discovery include drug repositioning (repurposing) and structure- and ligand-based drug design approaches using virtual and reverse screening [11–14].

Drug repositioning seeks out new targets for natural compounds, drugs currently in use, or candidate ligands to reveal their unknown therapeutic potential [81]. Many successful repositioning studies are available in the literature [81–83]. Virtual screening (VS) and reverse screening (RS) techniques are frequently used in drug discovery and repositioning. VS offers a more effective and rational approach than traditional methods [36]. The results of virtual screening studies, which can be analyzed at the atomic level, guide us in understanding the function of the target and in discovering new drugs [5, 36, 55]. In the RS approach, the interest is in a single ligand molecule, for which a biological target is sought [12]. Unlike in VS, the search library consists of potential target receptors. The RS approach has the potential to guide studies such as testing the toxicity or side effects of existing drugs [38]. The potential side effects of a drug need to be evaluated in the drug discovery process. Molecular docking studies can offer an important perspective in this regard, and there are inverse (reverse) docking studies that provide bioactivity data by detecting off-target binding [25]. Lastly, machine learning (ML) and deep learning (DL), which are subfields of artificial intelligence (AI), contribute significantly to the pharmaceutical industry [84]. AI can be applied to different steps such as drug design with VS, de novo generation of drug molecules, and computational planning of drug synthesis [85]. Recent developments suggest that molecular docking methods may benefit more from machine learning methods in the future [84].
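The complementary nature of VS and RS can be pictured as reading the same table of docking scores along different axes: VS ranks many ligands against one target, whereas RS ranks many targets for one ligand. The following sketch uses a made-up score matrix with placeholder ligand and target names, assumes that lower scores indicate better predicted binding, and is not tied to any specific docking tool.

```python
import numpy as np

ligands = ["lig_A", "lig_B", "lig_C"]   # placeholder ligand names
targets = ["target_1", "target_2"]      # placeholder target names

# Hypothetical docking scores: rows = ligands, columns = targets.
scores = np.array([[-6.1, -8.3],
                   [-9.0, -5.2],
                   [-7.4, -7.9]])

def virtual_screen(scores, ligands, target_index, top_k=2):
    """VS: rank all ligands in the library against one fixed target."""
    order = np.argsort(scores[:, target_index])[:top_k]
    return [ligands[i] for i in order]

def reverse_screen(scores, targets, ligand_index, top_k=1):
    """RS: rank all candidate targets for one fixed ligand."""
    order = np.argsort(scores[ligand_index, :])[:top_k]
    return [targets[j] for j in order]

print(virtual_screen(scores, ligands, target_index=0))  # best ligands for target_1
print(reverse_screen(scores, targets, ligand_index=0))  # best target for lig_A
```

In practice, scores obtained for different targets are not always directly comparable, so reverse screening workflows often normalize or combine scores before ranking.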

3.1 Virtual screening

The virtual screening (VS) approach uses a target receptor and a library of small molecules. Libraries can be created manually, or existing libraries can be used. The library consists of a large number of chemically diverse bioactive small molecules with a high probability of binding to the receptor. This virtual computing technique is considered the in silico equivalent of in vitro methods such as high-throughput screening (HTS) [11]. VS is preferred as a guide in scientific studies because its reported success rate is 400 times higher [86] and because it is less costly, faster, and less labor-intensive than HTS methods [87]. VS studies aim to reduce a large number of potential drug candidates to a manageable number by applying various filters. The biggest challenge in VS is the detection of false negatives [19].

Ligand-based VS methods work by identifying common properties of compound series, such as molecular volume and protonation state [11]. In addition to chemical similarity [88] and rule-based [89] software included in filtration strategies, various freely available add-on tools such as pharmacophore and quantitative structure-activity relationship (QSAR) models can be used [87, 90]. The most commonly used ligand-based virtual screening method is QSAR. Ligand-based VS does not use structural information about the receptor; it screens only on the basis of ligands known to be active against the receptor and tries to detect further active ligand molecules [85].

Structure-based VS methods are often used when the receptor has different conformations. The aim is to predict receptor binding affinity by processing structural information using a variety of techniques, such as binding site similarity and pharmacophore mapping. By estimating the different binding modes, the molecules are ranked for evaluation [11]. The predicted poses can be analyzed manually using visualization programs. nAPOLI, a web server developed in recent years, has been reported to analyze results automatically [91].

Structure-based pharmacophore generation is one of the most frequently used methods for small molecules in virtual screening. Here, 3D pharmacophore model interfaces of the ligand scaffolds are created, and ligands that will fit the binding site and provide the desired bioactivity are selected. Programs that use pharmacophore modeling include the commercial HipHop [92], PHASE [93], and MOE, as well as SCAMPI [94], PharmaGist [95], and ALADDIN [96], which are suitable for academic use.

A recent example of a VS application on nsp1, a non-structural protein of SARS-CoV-2 and one of the virulence factors causing viral infection, is the study by G. O. Timo et al. [74]. They estimated the exact pattern of nsp1 interaction through molecular simulation studies, analyzed 8694 potential inhibitors from the DrugBank database using virtual screening, and proposed 16 inhibitor molecules with the best binding energy scores [74]. Another recent study concerns the transcription factor BRF2, which is among the therapeutic targets because its upregulation is observed in the formation of various types of cancer, although no specific drug targeting BRF2 is available. By performing drug repositioning through virtual screening of drug molecules that are potential candidates for BRF2 inhibition, Rashidieh et al. found that the bexarotene molecule led to a serious decrease in the proliferation of this type of cancer cells [97].

3.2 Reverse screening

Reverse screening (RS) is also called inverse docking, reverse docking, inverse virtual screening, or target screening. Libraries for target hunting and profiling are more limited [12] and can be created manually using commonly used, accessible databases such as PDB [98] and TTD [12, 99], but this process requires considerable preparation time and effort. Various algorithms are used to detect interactions by reverse screening. Some web platforms (INVDOCK [100], idTarget [101], ACTP [102], etc.) have been developed for reverse docking; they use libraries prepared for specific diseases and dock with standard programs such as AutoDock and AutoDock Vina [12].

A recently developed Consensus Reverse Docking System (CRDS) detects potential binding sites by screening approximately 5200 candidate proteins for the ligand molecule using three different scoring methods [103]. In another example, Stepanova et al. applied reverse screening to chemicals that had shown antimicrobial activity against a Mycobacterium tuberculosis strain in experimental studies and, by performing docking studies with 35 different target protein structures, determined aspartate 1-decarboxylase to be the most appropriate target [104]. Reverse screening was also used for bazedoxifene, an FDA-approved drug for the prevention of postmenopausal osteoporosis: Xiao et al. characterized the inhibitory power of bazedoxifene on the IL-6/GP130 signaling pathway (critical for cancer survival) using computational techniques and confirmed the result with in vivo studies [83].

3.3 Machine-learning-based approaches

Machine learning techniques take information from biological data and make predictions about them, thus contributing to building a structural model [9]. Once a model is built, it must be refined so that the state with the lowest potential energy (global minimum) can be reached. The global minimum corresponds to a stable and sterically acceptable structure, and reaching it without getting stuck in local minima is very important in bioinformatics and computational structural biology. A recent machine learning algorithm developed by DeepMind, called AlphaFold [5], implements deep learning, can predict 3D structures of proteins from sequence information with high accuracy, and has been accepted as a breakthrough in the structural biology field.

Machine learning makes classifications by learning from datasets and needs human intervention to evaluate possible outcomes. Deep learning is a more advanced approach in which a neural network can decide on the right result without human intervention (Figure 3). Machine learning can use supervised or unsupervised learning. Supervised learning trains on datasets whose labels are known, whereas unsupervised learning detects and labels similarities and trends in the clusters it creates [38, 90].

The training set used in machine learning determines the performance of the algorithm. Machine learning studies in the field of virtual screening generally focus on improving the performance of the scoring function [85]. Studies have shown that working with small subsets of the same family, which consist of similar structures, gives better scoring results than working with large datasets drawn from different complexes [105]. Working with subsets of interest is also a better approach in terms of computational requirements [38].

Machine learning and deep learning can describe more diverse data than other computational systems and can represent structural biology data well. Nonparametric machine learning has great potential to be the next step in computer-based programming to improve the accuracy of molecular docking studies [41]. Machine learning can be used to refine predetermined function data as well as to provide high-quality data that complement pharmaceutical discovery research and development.

Figure 3. Schematic illustration of artificial intelligence subfields: machine learning and deep learning.

Case study: comparison of docking tools

As a case study for comparing different protein-ligand docking tools, the crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) is used as the experimental reference structure. The accuracies of the complex structures predicted using the AutoDock Vina, HADDOCK, and SwissDock programs are evaluated against it, and some of the parameters are changed to test their effects on prediction capability. The inhibitor is removed from the experimental protein structure, and molecular docking is then performed using the separate initial coordinates of the SARS-CoV-2 main protease and its inhibitor Jun8-76-3A.

Docking with AutoDock Vina

AutoDock is free software that predicts the binding compatibility of small ligands to macromolecular targets with a flexible-rigid (semi-flexible) docking approach [27]. It uses a grid-based method to place the ligand in the active region determined on the macromolecule [106]. AutoDockTools (http://mgltools.scripps.edu/downloads) is the user interface used to prepare the protein and ligand structures in the required format, to produce and examine the grid information, and to create the configuration file [27].

AutoDock Vina requires as input a configuration file that specifies the protein and ligand structure files and the ligand-binding region on the receptor. For docking the case study ligand to the receptor with AutoDock Vina, the structure file was downloaded from the RCSB PDB database (https://www.rcsb.org) in .pdb format (PDB ID: 7KX5). The AutoDockTools (v1.5.6) interface was used to prepare the input files: water molecules in the protein structure were deleted, polar hydrogen atoms were added to the structure, and both the receptor and ligand structures were saved in .pdbqt format. After the ligand and protein structures are prepared, the most important input for AutoDock Vina is the docking parameter, that is, the coordinates of the ligand-binding region on the target protein. If the binding region on the protein is not known, blind docking can be performed by enclosing the whole protein in the grid box (Figure 4A); otherwise, a small grid box can be placed on the specific known or predicted ligand-binding region of the protein (Figure 4B). After the region where the ligand is to be bound was determined with the “grid box” tool in AutoDockTools, the coordinates of this region were specified in the input configuration file. With all the required inputs prepared, docking was performed using AutoDock Vina, and each docking run was repeated three times in order to observe the consistency of the algorithm (Table 1).

Figure 4. Grid box usage in docking: (A) blind docking with a grid box of size 44 × 72 × 68 and center coordinates 10.711, 0.0, 3.782; (B) specific docking with a grid box of size 14 × 14 × 16 and center coordinates 10.735, −2.409, 21.173.
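A configuration file for the two setups can be sketched as follows. The grid box centers and sizes are the values reported in Figure 4; the file names `receptor.pdbqt` and `ligand.pdbqt` are hypothetical placeholders, and `num_modes = 9` is an assumption chosen to match the nine binding modes listed in Table 1.

```python
# Grid boxes reported in Figure 4: center (x, y, z) and size (x, y, z).
GRID_BOXES = {
    "blind": {"center": (10.711, 0.0, 3.782), "size": (44, 72, 68)},
    "specific": {"center": (10.735, -2.409, 21.173), "size": (14, 14, 16)},
}


def vina_config(mode, receptor="receptor.pdbqt", ligand="ligand.pdbqt"):
    """Return the text of an AutoDock Vina configuration file."""
    box = GRID_BOXES[mode]
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    # Center of the search space along each axis.
    for axis, value in zip("xyz", box["center"]):
        lines.append(f"center_{axis} = {value}")
    # Edge length of the search space along each axis.
    for axis, value in zip("xyz", box["size"]):
        lines.append(f"size_{axis} = {value}")
    lines.append("num_modes = 9")  # assumed; matches the nine modes in Table 1
    return "\n".join(lines) + "\n"


specific_config = vina_config("specific")
blind_config = vina_config("blind")
print(specific_config)
```

Saved to a file such as `specific.conf` (a hypothetical name), the text can be passed to Vina with `vina --config specific.conf --out specific_out.pdbqt`.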

To examine the accuracy of the docking results, the poses obtained from AutoDock Vina were aligned with the original PDB structure using the PyMOL program [107]. When the energies of the poses predicted with specific docking (i.e., using a specific grid on the binding site) and blind docking are compared, the energy scores of the blind docking results are better; however, comparison of the poses with the reference ligand shows that the most accurate binding is achieved with specific docking (Figure 5). Alignment of the first poses (those with the lowest energy score) predicted with specific docking (green) and blind docking (blue) with the reference ligand (red) shows that the specific docking pose lies closer to the reference ligand (green vs. red) than the blind docking pose does (blue vs. red).
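The visual comparison of poses can be complemented by a root-mean-square deviation (RMSD) between the docked and reference ligand coordinates. The sketch below assumes that both ligands are already in the same receptor frame and list the same atoms in the same order; it applies no superposition and no symmetry correction, and the coordinates are hypothetical toy values rather than atoms of Jun8-76-3A.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) coordinates."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom "ligand" and two hypothetical predicted poses.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
pose_near = [(x + 0.3, y + 0.4, z) for x, y, z in reference]
pose_far = [(x, y, z + 2.0) for x, y, z in reference]

print(f"near pose RMSD: {rmsd(reference, pose_near):.2f}")
print(f"far pose RMSD:  {rmsd(reference, pose_far):.2f}")
```

A lower RMSD to the crystallographic ligand indicates a pose closer to the experimental binding mode, independently of the docking score.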

Docking with HADDOCK

High Ambiguity-Driven biomolecular DOCKing (HADDOCK) is an integrative platform for the molecular docking of two or more molecules [108] and is a popular algorithm [36]. Although it is mainly suited to protein-protein interactions, it can also be applied to model protein–small-molecule complexes [109]. HADDOCK automatically decides the most suitable configuration of the ligand according to the given restraints [108]. Protein-protein docking is more complex than protein–small-molecule docking, as the proteins are flexible and the conformational space is larger [110].

The HADDOCK web server does not require local CPU resources and allows the user to follow all the docking steps from start to finish. It should be noted that the success of HADDOCK studies is directly related to the amount of data entered into the system [36]. HADDOCK can process different types of molecules with the help of platforms such as WHATIF, ProDRG, and PDB. There is no need to create different conformers in advance, as the system selects the most compatible conformers based on the shape constraints. With restraint files, the user can set explicit target sites and binding distances or select active and passive residues (areas that are likely to interact). Defining semi-flexible regions is also allowed.

Mode   Specific docking, affinity (kcal/mol)   Blind docking, affinity (kcal/mol)
       Rep 1   Rep 2   Rep 3   AVG             Rep 1   Rep 2   Rep 3   AVG
1      −8.9    −8.8    −8.9    −8.9            −8.9    −8.9    −9.0    −8.9
2      −7.3    −8.7    −7.3    −7.8            −8.2    −8.2    −8.3    −8.2
3      −7.2    −7.2    −7.3    −7.2            −8.1    −8.0    −8.1    −8.1
4      −6.8    −7.0    −7.0    −6.9            −7.9    −7.8    −8.0    −7.9
5      −6.8    −6.9    −7.0    −6.9            −7.9    −7.5    −8.0    −7.8
6      −6.8    −6.8    −6.9    −6.8            −7.7    −7.4    −7.8    −7.6
7      −6.7    −6.5    −6.8    −6.7            −7.7    −7.4    −7.8    −7.6
8      −6.4    −6.4    −6.8    −6.5            −7.6    −7.2    −7.6    −7.5
9      −6.3    −6.4    −6.7    −6.4            −7.5    −6.9    −7.4    −7.3

Table 1. Specific and blind docking studies with AutoDock Vina were repeated three times.
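The consistency of the three AutoDock Vina repetitions in Table 1 can be quantified by the spread (maximum minus minimum affinity) of each binding mode across replicates. The sketch below uses the affinities listed in Table 1; the choice of the spread as a consistency measure is an illustrative assumption, not a metric used in the case study.

```python
# Affinities (kcal/mol) from Table 1: one tuple of three replicates per mode.
TABLE_1 = {
    "specific": [
        (-8.9, -8.8, -8.9), (-7.3, -8.7, -7.3), (-7.2, -7.2, -7.3),
        (-6.8, -7.0, -7.0), (-6.8, -6.9, -7.0), (-6.8, -6.8, -6.9),
        (-6.7, -6.5, -6.8), (-6.4, -6.4, -6.8), (-6.3, -6.4, -6.7),
    ],
    "blind": [
        (-8.9, -8.9, -9.0), (-8.2, -8.2, -8.3), (-8.1, -8.0, -8.1),
        (-7.9, -7.8, -8.0), (-7.9, -7.5, -8.0), (-7.7, -7.4, -7.8),
        (-7.7, -7.4, -7.8), (-7.6, -7.2, -7.6), (-7.5, -6.9, -7.4),
    ],
}


def replicate_spread(replicates):
    """Difference between the weakest and strongest affinity of one mode."""
    return round(max(replicates) - min(replicates), 1)


spread = {
    setup: [replicate_spread(mode) for mode in modes]
    for setup, modes in TABLE_1.items()
}

for setup, values in spread.items():
    print(setup, values)
```

For the top-ranked mode, the three repetitions differ by only 0.1 kcal/mol in both setups, whereas lower-ranked modes vary more between repetitions.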

The HADDOCK algorithm consists of three stages: rigid-body minimization and randomization of orientations (it0), semi-flexible simulated annealing in torsion angle space (it1), and refinement in 3D space with explicit solvent (water) (https://www.bonvinlab.org/education/HADDOCK-protein-protein-basic/). The it0 stage treats the structures as rigid bodies, and the 1000 poses with the best scores are selected. The it1 stage optimizes the orientations by allowing the docking poses from it0 to move in their defined flexible regions. The 200 models with the best energy pass to the final stage. In the final stage, an explicit solvent medium (DMSO or water) is considered to improve the interaction energy, and the final models are automatically clustered.

To dock the case study inhibitor-protein complex (PDB ID: 7KX5), the guideline tutorial (HADDOCK small-molecule binding site screening protocol) [111] was followed, and two different approaches were tested: (i) using an unambiguous (distance) restraint file that indicates the target site that should bind the ligand, and (ii) defining the active and passive residues. This case study consists of a pre-docking run for the detection of the binding region and a second docking run for the detection of the binding pose.
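The two kinds of restraints can be written as CNS-style `assign` statements, which take two atom selections followed by a distance and its lower and upper margins. The sketch below only formats such statements as text; the segment identifiers, residue numbers, atom names, and distances are hypothetical placeholders and are not the restraints used in this case study.

```python
def selection(segid, resid, atom=None):
    """Build a CNS-style atom selection such as (segid A and resid 10)."""
    parts = [f"segid {segid}", f"resid {resid}"]
    if atom is not None:
        parts.append(f"name {atom}")
    return "(" + " and ".join(parts) + ")"


def unambiguous_restraint(sel_a, sel_b, distance, d_minus, d_plus):
    """One specific atom pair kept within distance - d_minus .. distance + d_plus."""
    return f"assign {sel_a} {sel_b} {distance:.1f} {d_minus:.1f} {d_plus:.1f}"


def ambiguous_restraint(ligand_sel, receptor_sels, upper=2.0):
    """Ligand restrained to be close to at least one of several residues."""
    partners = " or ".join(receptor_sels)
    return f"assign {ligand_sel} ({partners}) {upper:.1f} {upper:.1f} 0.0"


# Hypothetical example: receptor is segment A, ligand is residue 1 of segment B.
unambig = unambiguous_restraint(
    selection("A", 10, "CA"), selection("B", 1, "C1"), 5.0, 5.0, 0.0
)
ambig = ambiguous_restraint(
    selection("B", 1), [selection("A", 10), selection("A", 20)]
)
print(unambig)
print(ambig)
```

The unambiguous form fixes which atoms must be close, whereas the ambiguous form is satisfied when the ligand approaches any of the listed residues, which is the idea behind defining active and passive residues.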

First, we tested HADDOCK’s accuracy in binding site detection. Two different binding sites were detected in the top 10 clusters with the best energy scores, and 70% (7 out of 10) of the clusters were in the correct binding site (Figure 6A). Second, ambiguous and unambiguous restraint files were created by identifying the region with the highest number of interactions between the ligand and the receptor. The restraint files can be created manually or by using the link in the protocol; however, it may be necessary to correct the distance restraints. The structure with the best energy is visualized in Figure 6B. Third, active and passive residues were defined on the system, and the pose with the best energy is visualized in Figure 6C. HADDOCK results are summarized in Table 2.

Comparison of the results shows that HADDOCK is successful in detecting the binding site. However, according to the results obtained in the second stage, the algorithm was not successful enough in finding the correct conformation of the ligand in the binding site. Defining ambiguous/unambiguous restraint files or selecting active and passive residues did not contribute significantly to detecting the correct binding pose (Figure 6B and C). Docking with both approaches was repeated several times, and no significant similarity was detected.

Figure 5. Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5, chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted with AutoDock Vina and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5, chain B). This figure was drawn with PyMOL 2.5.2.

Figure 6. Crystal SARS-CoV-2 main protease structure (gray, PDB ID: 7KX5, chain A) in complex with the docking poses (blue) predicted with HADDOCK and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5, chain B). (A) Top 10 clusters for binding site determination. (B) Pose with the best energy using ambiguous/unambiguous restraints. (C) Pose with the best energy using active/passive restraints. This figure was drawn with PyMOL 2.5.2.

Docking with SwissDock

SwissDock is a web server for protein–small-molecule docking. It belongs to a family of Swiss web resources that also offer browser-based and programmatic interfaces for creating three-dimensional protein models from amino acid sequences [112] and user interfaces such as Swiss-Pdb Viewer (DeepView) for analyzing several proteins simultaneously [113]. Using the SwissDock web server, the starting crystal structures of the target proteins can be searched and fetched from protein and ligand structure databases. If no crystal structure is available, homology modeling of the studied protein can be used. During the docking process, the user does not have to perform any calculations, because all calculations are handled on the server side [112]. As a docking constraint, the ligand-binding region can be defined, or blind docking can be applied without any prior information.

Using the case study, both specific and blind docking runs were performed on the SwissDock server, and the results were compared. The server presented 256 poses. The best scores obtained by specific docking (green) and blind docking (blue) were −9.88 and −9.35 kcal/mol, respectively (Figure 7). Although neither of the predicted poses showed the same conformation as the reference ligand, the pose obtained from specific docking (green) was more similar to the reference ligand (red) (Figure 7).

                                                Binding site    Ambiguous/unambiguous   Active/passive
                                                detection       restraints              restraints
HADDOCK score                                   −53.4 ± 1.5     −52.1 ± 0.5             −21.9 ± 2.7
Cluster size                                    69 513
RMSD from the overall lowest-energy structure   0.3 ± 0.2       0.1 ± 0.1               0.2 ± 0.0
Van der Waals energy                            −40.3 ± 1.2     −41.6 ± 0.2             −32.4 ± 4.5
Electrostatic energy                            −22.1 ± 1.9     −15.2 ± 6.0             −25.8 ± 7.3
Desolvation energy                              −10.9 ± 2.5     −9.0 ± 0.2              −6.7 ± 0.3
Restraints violation energy                     0.0 ± 0.00      0.7 ± 0.2               198.5 ± 78.0
Buried surface area                             795.4 ± 21.9    781.6 ± 5.2             783.0 ± 9.4
Z-score                                         −1.7            −2.4                    −1.3

Table 2. HADDOCK results.
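The HADDOCK score in Table 2 is a weighted sum of the energy terms listed below it. The weights used in this sketch (1.0 for van der Waals, 0.1 for electrostatics, 1.0 for desolvation, and 0.1 for restraint violation) are an assumption, since the chapter does not state them; they are chosen because they reproduce the reported scores to within rounding.

```python
# Assumed weights of the energy terms (not stated in the chapter).
WEIGHTS = {"vdw": 1.0, "elec": 0.1, "desolv": 1.0, "air": 0.1}

# Mean energy terms and reported HADDOCK scores from Table 2.
TABLE_2 = {
    "binding site detection": {
        "vdw": -40.3, "elec": -22.1, "desolv": -10.9, "air": 0.0,
        "reported": -53.4,
    },
    "ambiguous/unambiguous restraints": {
        "vdw": -41.6, "elec": -15.2, "desolv": -9.0, "air": 0.7,
        "reported": -52.1,
    },
    "active/passive restraints": {
        "vdw": -32.4, "elec": -25.8, "desolv": -6.7, "air": 198.5,
        "reported": -21.9,
    },
}


def haddock_score(terms, weights=WEIGHTS):
    """Weighted sum of the energy terms under the assumed weights."""
    return sum(weight * terms[name] for name, weight in weights.items())


recomputed = {run: haddock_score(terms) for run, terms in TABLE_2.items()}
for run, score in recomputed.items():
    print(f"{run}: recomputed {score:.2f}, reported {TABLE_2[run]['reported']}")
```

Under these weights, the much higher (worse) score of the active/passive run comes almost entirely from its restraint violation energy of 198.5, while its van der Waals, electrostatic, and desolvation terms are comparable to those of the other two runs.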

Figure 7. Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5, chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted by SwissDock and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5, chain B). This figure was drawn with PyMOL 2.5.2.

Conclusions

Molecular docking is a computational method that predicts the 3D structures of receptor-ligand complexes. Modeling the atomic details of the ligand pose on the receptor protein by molecular docking can assist in understanding the protein structure-function relationship and in drug design studies in several ways. Computational modeling approaches complement and/or lead experiments by eliminating irrelevant drug candidates and selecting the ones with the best binding properties. With continuously developing technology, many different approaches and algorithms are available for molecular docking studies, and they are successfully used in therapeutic applications such as targeted drug design, drug target search, evaluation of the side effects of existing drugs, and finding new targets for these drugs.

The crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) was used as an experimental reference case study to compare and evaluate the prediction accuracies of the AutoDock Vina, HADDOCK, and SwissDock programs as well as to test the effects of some parameters on their prediction capabilities. One of the main observations is that the ligand poses with the lowest binding energy scores are not necessarily the best solutions. Therefore, docking results should always be evaluated in terms of biological relevance. Moreover, when a priori information about the ligand-binding site is included, as grid box placement and size in AutoDock Vina and as ligand-binding residues in SwissDock, the binding accuracy is improved significantly.

In summary, before starting molecular docking, it is crucial to obtain detailed information on the target protein and ligand from various sources and servers and to decide which docking algorithm to use. Moreover, the top predicted poses with the best scores should not be accepted unquestioningly as the best solutions; further structural analyses and evaluations should be incorporated into the decision process.

Acknowledgements

We would like to thank dear Merve DEMİR AKYÜZ and Merve YÜCETÜRK

(a.k.a Merves), who are fourth-year undergraduate students at Molecular Biology and

Genetics Department of Istanbul Medeniyet University, for their contribution to the

writing of the introduction section.

Author details

SefikaFeyza Maden, SelinSezer and Saliha EceAcuner*

Department of Bioengineering and Science and Advanced Technologies Research

Center (BILTAM), Istanbul Medeniyet University, Istanbul, Turkey

*Address all correspondence to: ece.ozbabacan@medeniyet.edu.tr

the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0),

which permits unrestricted use, distribution, and reproduction in any medium, provided

the original work is properly cited.

References

[1] Russell RB, Alber F, Aloy P, Davis FP,

Korkin D, Pichaud M, et al. A structural

perspective on protein–protein

interactions. Current Opinion in

Structural Biology. 2004;(3):313-324

[2] Sadowski MI, Jones DT. The

sequence–structure relationship

and protein function prediction.

Current Opinion in Structural Biology.

2009;(3):357-362

[3] Petrey D, Honig B. Structural

bioinformatics of the interactome.

Annual Review of Biophysics.

2014;(1):193-210

[4] Stein A, Mosca R, Aloy P. Three-

dimensional modeling of protein

interactions and complexes is going

‘omics. Current Opinion in Structural

Biology. 2011;(2):200-208

[5] Senior AW, Evans R, Jumper J,

Kirkpatrick J, Sifre L, Green T, et al.

Improved protein structure prediction

using potentials from deep learning.

Nature. 2020;(7792):706-710

[6] Andrusier N, Mashiach E,

Nussinov R, Wolfson HJ. Principles

of flexible protein-protein docking.

Proteins. 2008;(2):271-289

[7] Bonvin AM. Flexible protein–protein

docking. Current Opinion in Structural

Biology. 2006;(2):194-200

[8] Vakser IA. Protein-protein

docking: From Interaction to

Interactome. Biophysical Journal.

2014;(8):1785-1793

[9] Harmalkar A, Gray JJ. Advances to

tackle backbone flexibility in protein

docking. Current Opinion in Structural

Biology. 2021;:178-186

[10] Wang C, Bradley P, Baker D.

Protein–protein docking with backbone

flexibility. Journal of Molecular Biology.

2007;(2):503-519

[11] Ferreira L, dos Santos R, Oliva G,

Andricopulo A. Molecular docking and

structure-based drug design strategies.

Molecules. 2015;(7):13384-13421

[12] Pinzi L, Rastelli G. Molecular

docking: Shifting paradigms in drug

discovery. IJMS. 2019;(18):4331

[13] March-Vila E, Pinzi L, Sturm N,

Tinivella A, Engkvist O, Chen H, et al.

On the integration of in silico drug design

methods for drug repurposing. Frontiers

in Pharmacology. 2017;(8):298

[14] Wilson GL, Lill MA. Integrating

structure-based and ligand-based

approaches for computational drug

design. Future Medicinal Chemistry.

2011;(6):735-750

[15] Anighoro A. Deep learning in

structure-based drug design. Methods in

Molecular Biology. 2022;:261-271

[16] Stokes JM, Yang K, Swanson K,

Jin W, Cubillos-Ruiz A, Donghia NM,

et al. A deep learning approach

to antibiotic discovery. Cell.

2020;(4):688-702.e13

[17] Elton DC, Boukouvalas Z, Fuge MD,

Chung PW. Deep learning for molecular

design—a review of the state of the

art. Molecular System and Design

Engineering. 2019;(4):828-849

[18] Allison B, Combs S,

DeLuca S, Lemmon G, Mizoue L, Meiler J.

Computational design of protein-small

molecule interfaces. Journal of Structural

Biology. 2014;(2):193-202

[19] Śledź P, Caflisch A. Protein

structure-based drug design: From

docking to molecular dynamics.

Current Opinion in Structural Biology.

2018;:93-102

[20] Guterres H, Im W. Improving

protein-ligand docking results with

high-throughput molecular dynamics

simulations. Journal of Chemical Information and Modeling.

2020;(4):2189-2198

[21] Wishart DS. DrugBank: A

comprehensive resource for in silico drug

discovery and exploration. Nucleic Acids

Research. 2006;(90001):D668-D672

[22] Li Q, Cheng T, Wang Y, Bryant SH.

PubChem as a public resource for drug

discovery. Drug Discovery Today.

2010;(23-24):1052-1057

[23] Irwin JJ, Sterling T, Mysinger MM,

Bolstad ES, Coleman RG. ZINC: A Free

Tool to Discover Chemistry for Biology.

Journal of Chemical Information and

Modeling. 2012;(7):1757-1768

[24] Gaulton A, Bellis LJ, Bento AP,

Chambers J, Davies M, Hersey A, etal.

ChEMBL: A large-scale bioactivity

database for drug discovery.

Nucleic Acids Research.

2012;(D1):D1100-D1107

[25] Pence HE, Williams A. ChemSpider:

An online chemical information

resource. Journal of Chemical Education.

2010;(11):1123-1124

[26] O’Boyle NM, Banck M,

James CA, Morley C, Vandermeersch T,

Hutchison GR. Open Babel: An

open chemical toolbox. Journal of

Cheminformatics. 2011;(1):33

[27] Morris GM, Huey R, Lindstrom W,

Sanner MF, Belew RK, Goodsell DS,

etal. AutoDock4 and AutoDockTools4:

Automated docking with selective

receptor flexibility. Journal

of Computational Chemistry.

2009;(16):2785-2791

[28] Goodford PJ. A computational

procedure for determining energetically

favorable binding sites on biologically

important macromolecules. Journal of

Medicinal Chemistry. 1985;(7):849-857

[29] Laskowski RA. SURFNET: A

program for visualizing molecular

surfaces, cavities, and intermolecular

interactions. Journal of Molecular

Graphics. 1995;(5):323-330

[30] Yang J, Roy A, Zhang Y. Protein-

ligand binding site recognition using

complementary binding-specific

substructure comparison and sequence

profile alignment. Bioinformatics.

2013;(20):2588-2595

[31] Narang P, Bhushan K, Bose S,

Jayaram B. Protein structure evaluation

using an all-atom energy based

empirical scoring function. Journal of

Biomolecular Structure & Dynamics.

2006;(4):385-406

[32] Binkowski TA, Naghibzadeh S,

Liang J. CASTp: Computed Atlas of

Surface Topography of proteins. Nucleic

Acids Research. 2003;(13):3352-3355

[33] Jiménez J, Doerr S,

Martínez-Rosell G, Rose AS, De

Fabritiis G. DeepSite: Protein-binding

site predictor using 3D-convolutional

neural networks. Bioinformatics.

2017;(19):3036-3042

[34] Kandel J, Tayara H, Chong KT.

PUResNet: Prediction of protein-ligand

binding sites using deep residual neural

network. Journal of Cheminformatics.

2021;(1):65

[35] Huang SY, Zou X. Advances and

Challenges in Protein-Ligand Docking.

IJMS. 2010;(8):3016-3034

[36] de Vries SJ, van Dijk M, Bonvin AMJJ.

The HADDOCK web server for data-

driven biomolecular docking. Nature

Protocols. 2010;(5):883-897

[37] Fan J, Fu A, Zhang L. Progress in

molecular docking. Quantitative Biology.

2019;(2):83-89

[38] Crampon K, Giorkallos A,

Deldossi M, Baud S, Steffenel LA.

Machine-learning methods for ligand–

protein molecular docking. Drug

Discovery Today. 2022;(1):151-164

[39] Jiang F, Kim SH. “Soft docking”:

Matching of molecular surface

cubes. Journal of Molecular Biology.

1991;(1):79-102

[40] Ferrari AM, Wei BQ, Costantino L,

Shoichet BK. Soft docking and multiple

receptor conformations in virtual

screening. Journal of Medicinal

Chemistry. 2004;(21):5076-5084

[41] Torres PHM, Sodero ACR,

Jofily P, Silva-Jr FP. Key topics in

molecular docking for drug design. IJMS.

2019;(18):4574

[42] Gioia D, Bertazzo M, Recanatini M,

Masetti M, Cavalli A. Dynamic docking:

A paradigm shift in computational drug

discovery. Molecules. 2017;(11):2029

[43] Sousa SF, Fernandes PA, Ramos MJ.

Protein-ligand docking: Current status

and future challenges. Proteins.

2006;(1):15-26

[44] Meng XY, Zhang HX, Mezei M,

Cui M. Molecular docking: A powerful

approach for structure-based drug

discovery. Current Computer-Aided Drug Design. 2011;(2):146-157

[45] Kuntz ID, Blaney JM, Oatley SJ,

Langridge R, Ferrin TE. A geometric

approach to macromolecule-ligand

interactions. Journal of Molecular

Biology. 1982;(2):269-288

[46] Miller MD, Kearsley SK,

Underwood DJ, Sheridan RP. FLOG: A

system to select?quasi-flexible? ligands

complementary to a receptor of known

three-dimensional structure. Journal

of Computer-Aided Molecular Design.

1994;(2):153-174

[47] Pang YP, Perola E, Xu K,

Prendergast FG. EUDOC: A computer

program for identification of drug

interaction sites in macromolecules and

drug leads from chemical databases.

Journal of Computational Chemistry.

2001;(15):1750-1771

[48] Jain AN. Surflex: Fully automatic

flexible molecular docking using a

molecular similarity-based search

engine. Journal of Medicinal Chemistry.

2003;(4):499-511

[49] Diller DJ, Merz KM. High

throughput docking for library design

and library prioritization. Proteins.

2001;(2):113-124

[50] Burkhard P, Taylor P,

Walkinshaw MD. An example of a

protein ligand found by database

mining: Description of the docking

method and its verification by a 2.3 Å

X-ray structure of a Thrombin-Ligand

complex. Journal of Molecular Biology.

1998;(2):449-466

[51] Huang SY, Zou X. Ensemble

docking of multiple protein structures:

Considering protein structural variations

in molecular docking. Proteins.

2006;(2):399-421

[52] Prieto-Martínez FD, Arciniega M,

Medina-Franco JL. Acoplamiento

Molecular: Avances Recientes y Retos.

TIP RECQB. 2018. [cited 2022 May 15];21. Available from: http://tip.zaragoza.unam.mx/index.php/tip/article/view/143

[53] Guedes IA, de Magalhães CS,

Dardenne LE. Receptor–ligand molecular

docking. Biophysical Reviews.

2014;(1):75-87

[54] Friesner RA, Banks JL, Murphy RB,

Halgren TA, Klicic JJ, Mainz DT, etal.

Glide: A new approach for rapid,

accurate docking and scoring. 1. Method

and assessment of docking accuracy.

Journal of Medicinal Chemistry.

2004;(7):1739-1749

[55] McGann MR, Almond HR,

Nicholls A, Grant JA, Brown FK.

Gaussian docking functions.

Biopolymers. 2003;(1):76-90

[56] Ewing TJA, Kuntz ID. Critical

evaluation of search algorithms

for automated molecular docking

and database screening. Journal

of Computational Chemistry.

1997;(9):1175-1189

[57] Böhm HJ. The computer program

LUDI: A new method for the de novo

design of enzyme inhibitors. Journal

of Computer-Aided Molecular Design.

1992;(1):61-78

[58] Rarey M, Kramer B, Lengauer T,

Klebe G. A fast flexible docking method

using an incremental construction

algorithm. Journal of Molecular Biology.

1996;(3):470-489

[59] Bentham Science Publisher BSP.

eHiTS: An innovative approach to the

docking and scoring function problems.

CPPS. 2006;(5):421-435

[60] Trott O, Olson AJ. AutoDock Vina:

Improving the speed and accuracy

of docking with a new scoring

function, efficient optimization,

and multithreading. Journal of

Computational Chemistry. 2009

[61] Verdonk ML, Cole JC, Hartshorn MJ,

Murray CW, Taylor RD. Improved

protein-ligand docking using GOLD.

Proteins. 2003;(4):609-623

[62] de Magalhães CS, Almeida DM,

Barbosa HJC, Dardenne LE. A dynamic

niching genetic algorithm strategy

for docking highly flexible ligands.

Information Sciences. 2014;:206-224

[63] Thomsen R, Christensen MH.

MolDock: A new technique for

high-accuracy molecular docking.

Journal of Medicinal Chemistry.

2006;(11):3315-3321

[64] Forli S, Huey R, Pique ME,

Sanner MF, Goodsell DS, Olson AJ.

Computational protein–ligand docking

and virtual drug screening with the

AutoDock suite. Nature Protocols.

2016;(5):905-919

[65] Bentham Science Publisher BSP.

Scoring functions for protein-ligand

docking. CPPS. 2006;(5):407-420

[66] Weiner PK, Kollman PA. AMBER:

Assisted model building with energy

refinement. A general program

for modeling molecules and their

interactions. Journal of Computational

Chemistry. 1981;(3):287-303

[67] Brooks BR, Bruccoleri RE,

Olafson BD, States DJ, Swaminathan S,

Karplus M. CHARMM: A program for

macromolecular energy, minimization,

and dynamics calculations. Journal

of Computational Chemistry.

1983;(2):187-217

[68] van Gunsteren WF, Berendsen HJC.

Computer simulation of molecular

dynamics: Methodology, applications,

and perspectives in chemistry.

Angewandte Chemie (International Ed.

in English). 1990;(9):992-1023

[69] Jorgensen WL, Tirado-Rives J. The

OPLS Potential Functions for Proteins.

Energy Minimizations for Crystals of

Cyclic Peptides and Crambin. p. 10

[70] Parrill AL, Reddy MR. Rational

Drug Design: Novel Methodology

and Practical Applications. American

Chemical Society; 1999 [cited 2022 May

23]. (ACS Symposium Series; vol. 719).

Available from: https://pubs.acs.org/doi/

book/10.1021/bk-1999-0719

[71] Krammer A, Kirchhoff PD,

Jiang X, Venkatachalam CM, Waldman M.

LigScore: A novel scoring function for

predicting binding affinities. Journal

of Molecular Graphics & Modelling.

2005;(5):395-407

[72] Böhm HJ. The development of

a simple empirical scoring function

to estimate the binding constant for

a protein-ligand complex of known

three-dimensional structure. Journal

of Computer-Aided Molecular Design.

1994;(3):243-256

[73] Wang R, Liu L, Lai L, Tang Y.

SCORE: A new empirical method for

estimating the binding affinity

of a protein-ligand complex.

Journal of Molecular Modeling.

1998;(12):379-394

[74] Wang R, Lai L, Wang S. Further

development and validation of empirical

scoring functions for structure-based

binding affinity prediction. Journal of

Computer-Aided Molecular Design.

2002;(1):11-26

[75] Dias R, de Azevedo W.

Molecular docking algorithms. CDT.

2008;(12):1040-1047

[76] Waszkowycz B, Clark DE, Gancia E.

Outstanding challenges in protein–

ligand docking and structure-based

virtual screening. WIREs

Computational Molecular Science.

2011;(2):229-259

[77] Morris GM, Lim-Wilby M. Molecular

docking. In: Kukol A, editor. Molecular

Modeling of Proteins. Totowa, NJ:

Humana Press; 2008. pp. 365-382

[78] Verdonk ML, Taylor RD, Chessari G,

Murray CW. Illustration of current

challenges in molecular docking. In:

Structure-Based Drug Discovery.

Dordrecht: Springer Netherlands; 2007.

pp. 201-221

[79] Janin J, Henrick K, Moult J, Eyck LT,

Sternberg MJE, Vajda S, et al. CAPRI:

A critical assessment of PRedicted

interactions. Proteins. 2003;(1):2-9

[80] Janin J. Protein–protein docking

tested in blind predictions: The CAPRI

experiment. Molecular BioSystems.

2010;(12):2351

[81] Hurle MR, Yang L, Xie Q,

Rajpal DK, Sanseau P, Agarwal P.

Computational drug repositioning:

From data to therapeutics. Clinical

Pharmacology and Therapeutics.

2013;(4):335-341

[82] Scherman D, Fetro C. Drug

repositioning for rare diseases:

Knowledge-based success stories.

Thérapie. 2020;(2):161-167

[83] Xiao H, Bid HK, Chen X,

Wu X, Wei J, Bian Y, et al. Repositioning

Bazedoxifene as a novel IL-6/GP130

signaling antagonist for human

rhabdomyosarcoma therapy. PLoS ONE.

2017;(7):e0180297

[84] Gupta RR. Application of artificial

intelligence and machine learning in

drug discovery. Methods in Molecular

Biology. 2022;:113-124

[85] Thomas M, Boardman A,

Garcia-Ortegon M, Yang H, de Graaf C,

Bender A. Applications of artificial

intelligence in drug design:

Opportunities and challenges. Methods

in Molecular Biology. 2022;:1-59

[86] Zhu T, Cao S, Su PC, Patel R, Shah D,

Chokshi HB, et al. Hit identification and

optimization in virtualscreening: Practical

recommendations based on a critical

literature analysis: Miniperspective.

Journal of Medicinal Chemistry.

2013;(17):6560-6572

[87] Neves BJ, Mottin M, Moreira-

Filho JT, Sousa BK de P, Mendonca SS,

Andrade CH. Best practices for docking-

based virtual screening. In: Molecular

Docking for Computer-Aided Drug

Design. Academic Press (Elsevier); 2021.

pp. 75-98

[88] Lipinski CA, Lombardo F,

Dominy BW, Feeney PJ. Experimental

and computational approaches to

estimate solubility and permeability

in drug discovery and development

settings. Advanced Drug Delivery

Reviews. 2001;(1-3):3-26

[89] Veber DF, Johnson SR, Cheng HY,

Smith BR, Ward KW, Kopple KD.

Molecular properties that influence the

oral bioavailability of drug candidates.

Journal of Medicinal Chemistry.

2002;(12):2615-2623

[90] Neves BJ, Braga RC,

Melo-Filho CC, Moreira-Filho JT,

Muratov EN, Andrade CH. QSAR-

based virtual screening: Advances and

applications in drug discovery. Frontiers

in Pharmacology. 2018;:1275

[91] Fassio AV, Santos LH, Silveira SA,

Ferreira RS, de Melo-Minardi RC.

nAPOLI: A graph-based strategy to

detect and visualize conserved protein-

ligand interactions in large-scale.

IEEE/ACM Transactions on Computa-

tional Biology and Bioinformatics.

2019:1-1

[92] Kurogi Y, Güner OF. Pharmacophore

modeling and three-dimensional

database searching for drug design using

catalyst. Current Medicinal Chemistry.

2001;(9):1035-1055

[93] Dixon SL, Smondyrev AM,

Knoll EH, Rao SN, Shaw DE,

Friesner RA. PHASE: A new engine

for pharmacophore perception, 3D

QSAR model development, and 3D

database screening: 1. Methodology

and preliminary results. Journal of

Computer-Aided Molecular Design.

2006;(10-11):647-671

[94] Chen X, Rusinko A, Tropsha A,

Young SS. Automated pharmacophore

identification for large chemical

data sets. Journal of Chemical

Information and Computer Sciences.

1999;(5):887-896

[95] Schneidman-Duhovny D,

Dror O, Inbar Y, Nussinov R, Wolfson HJ.

PharmaGist: A webserver for ligand-

based pharmacophore detection. Nucleic

Acids Research. 2008;:W223-W228

[96] Fan N, Bauer CA, Stork C, de

Bruyn KC, Kirchmair J. ALADDIN:

Docking approach augmented by

machine learning for protein structure

selection yields superior virtual

screening performance. Molecular

Informatics. 2020;(4):e1900103

[97] Rashidieh B, Molakarimi M,

Mohseni A, Tria SM, Truong H,

Srihari S, et al. Targeting BRF2 in cancer

using repurposed drugs. Cancers.

2021;(15):3778

[98] Berman HM. The protein data

bank. Nucleic Acids Research.

2000;(1):235-242





