Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule Docking Approaches

Authors:
  • Istanbul Medeniyet University

SefikaFeyza Maden, SelinSezer and Saliha EceAcuner
Abstract
Proteins (e.g., enzymes, receptors, hormones, antibodies, transporter proteins,
etc.) seldom act alone in the cell, and their functions rely on their interactions
with various partners such as small molecules, other proteins, and/or nucleic acids.
Molecular docking is a computational method developed to model these interactions
at the molecular level by predicting the 3D structures of complexes. Predicting
the binding site and pose of a protein with its partner through docking can help us
to unveil protein structure-function relationships and aid drug design in numerous
ways. In this chapter, we focus on the fundamentals of protein docking by describing
docking methods including search algorithm, scoring, and assessment steps as
well as illustrating recent successful applications in drug discovery. We especially
address protein–small-molecule (drug) docking by comparatively analyzing available
tools implementing different approaches such as ab initio, structure-based,
ligand-based (pharmacophore-/shape-based), information-driven, and machine learning
approaches.
Keywords: molecular docking, drug design, drug discovery, protein interactions,
machine learning
1. Introduction
The molecular machines of the cell, i.e., proteins, are essential to many cellular
processes such as signal transduction and cell regulation. Proteins seldom act alone in
the cell, but they function through interacting with other small or macromolecules.
Therefore, understanding protein interactions at the atomic level is critical to under-
standing biological processes [1]. Primary structure, i.e., amino acid sequence, of the
interacting proteins is a necessary but insufficient source of information at the atomic
level. After being synthesized, proteins fold and acquire a stable native structure,
i.e., a tertiary structure that can be defined in three-dimensional (3D) space, in order
to be functional. It is known that proteins with different sequences can
have similar functional structures, that is, different amino acid sequences can show
similar folding trends in 3D space, and structure is more conserved than sequence [2].
Therefore, it is crucial to understand the interaction details at the structural level.
Proteins physically interact with their partners via non-covalent associations, namely
H-bond, hydrophobic, and electrostatic interactions, with the exception of covalent
disulfide bridges. These intermolecular physical forces also dominate the protein
folding process.
The 3D structure of macromolecules can be determined using experimental
methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and
cryo-EM and then deposited in the Protein Data Bank (PDB) (https://www.rcsb.org).
However, there is a huge gap between the number of known protein sequences
and structures [3, 4]. Computational modeling approaches that can predict 3D
structures of macromolecules can help to bridge this gap. A recent machine learning
algorithm developed by DeepMind, called AlphaFold [5], can predict 3D structures
of proteins using the sequence information with high accuracy and has been accepted
as a breakthrough in the structural biology field. In 1 year, approximately 1 million
new structures have been predicted and deposited at AlphaFold Protein Structure
Database (https://alphafold.ebi.ac.uk/). In order to have a complete understanding
of the proteome, computational techniques are not only needed for modeling single
protein structures, but also the interactions between them.
Molecular docking is a method used to predict the structures of proteins in complex
with other proteins, nucleic acids, or small molecules. It can be defined as predicting
the appropriate low-energy binding pose of the ligand in complex with the target
structure, by randomly colliding proteins and their potential partners in space, first
creating a rigid complex structure model, and then focusing on the binding sites of
that model with flexible interface refinement [6]. Energy minimization of randomly
docked conformations in space requires a multidimensional calculation. The earliest
molecular docking methods treated ligands and receptors as rigid bodies
without considering any conformational changes [7]. However, interactions between
proteins can become quite complex even with small changes in the conformation of the
structures [7], and docking algorithms may not physically solve this complex prob-
lem correctly [8]. The main factor that creates computational difficulties in docking
algorithms is when the protein backbone changes its conformation significantly upon
binding [9, 10]. To address this problem, different techniques that consider backbone
flexibility have been successfully implemented in docking algorithms [10].
Many diseases today, such as cancer, are likely to be linked to problems in protein-
protein interactions and targeting them can therefore enable the development of
next-generation therapeutic methods [11]. Modeling the complex structures formed
by proteins with other proteins or small molecules holds the key to understanding
many biological processes; for example, modeling enzyme-substrate or protein-drug
interactions can reveal insights into binding sites/interface regions, function, and
mechanism of action. The main protein–small-molecule docking applications in drug
discovery include drug repositioning, structure- and ligand-based (pharmacophore/
shape-based) drug design approaches using virtual and reverse screening [11–14].
Today, with continuously developing technology, targeted drug design, drug
target search, evaluation of the side effects of existing drugs, and the search for new targets
for these drugs can be achieved with the help of molecular modeling and machine
learning methods [12]. Deep learning neural network models have strong computational
ability on big data and attract attention in the structural biology field [15]. There
are antibiotic discovery studies using deep neural networks [16] and deep learning
studies adapted to drug design [17].
Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule…
DOI: http://dx.doi.org/10.5772/intechopen.105815
In this chapter, we focus on the protein–small-molecule docking fundamentals
and the steps of the docking algorithm and procedure in detail. We then give recent
successful applications in drug design and discovery that use different docking
approaches, namely virtual screening, reverse screening, and machine learning.
Lastly, we comparatively analyze some of the available protein–small-molecule
docking tools using the structure of SARS-CoV-2 main protease in complex with a
non-covalent inhibitor Jun8-76-3A as a case study.
2. Fundamentals of protein–small-molecule docking
Protein–small-molecule interactions are essential for the sustainability of
biological processes such as enzymatic catalysis and overall homeostasis in the body [18].
The engineering of protein–small-molecule interactions is one of the computational
approaches used to solve critical problems in biology [18]. Protein–small-molecule
docking, i.e., modeling the interaction between chemical compounds and their
target protein receptors at the atomic level, is an effective tool in drug design. In the
structure-based design of small-molecule drugs, a good estimation of the binding
pose is required to clearly demonstrate important interactions and design drugs with
increased selectivity and efficacy [19]. The procedures that can be followed and the
tools that can be used before, during, and after molecular docking are explained in the
following subsections and summarized in Figure 1.
2.1 Before docking: molecule preparation
Before starting docking studies, the most suitable protein
and ligand structures should be selected [20]. Databases such as PDB, UniProt, and the
Therapeutic Target Database (TTD) provide access to the experimentally determined
structures of target proteins. If an experimental structure is not available,
modeled structures can be obtained from the AlphaFold Database or can be modeled
using relevant structure modeling software. The most frequently used databases for
obtaining small-molecule ligand/chemical structures are DrugBank [21], PubChem
[22], ZINC [23], ChEMBL [24], and ChemSpider [25] (Figure 1). The DrugBank,
ChemSpider, and ZINC databases include more than 500,000, 100 million, and 230
million compounds/drug molecules, respectively.
Figure 1.
The procedures that can be followed and the tools that can be used before, during, and after protein-ligand
molecular docking in drug design.
The molecular docking algorithms may require preliminary preparation of the
structures that are obtained in PDB format (lacking H atoms). There are tools avail-
able for such preliminary preparations such as Open Babel [26] and AutoDockTools
(Figure ) [27].
It is also of crucial importance to guide docking with preliminary information
on the binding site. Without binding site constraints, blind docking
takes place, and it is more difficult to detect the correct binding poses when the ligand
search space is large. When binding sites are not known, various algorithms for active site
prediction can be used to guide docking. Some of them can be listed as:
GRID [28], SurfNet [29], COACH [30], SCFbio [31], CASTp [32], DeepSite [33], and
PUResNet [34] (Figure 1).
The capabilities of docking algorithms can differ from each other, and in this
respect, it is important to carefully choose the algorithm to use in accordance with the
purpose of the study before starting the docking.
2.2 Docking algorithm steps
There are many approaches and algorithms for molecular docking, based on
different parameters, and they aim to perform the protein-ligand docking with the
best performance [12]. The steps of molecular docking algorithms can be summarized
as follows: molecule flexibility, conformational search algorithms (ligand sampling),
and scoring functions (Figure ) [12, 35].
.. Molecule flexibility
During molecular docking, structures can be considered rigid or flexible. Rigid
docking takes into account only the translational and rotational degrees of freedom.
Providing flexibility means also considering rotation about single bonds, so that
conformers have the same bond lengths and angles but different torsion angles. Although
the flexible docking approach is more realistic than rigid docking, when there are
many rotatable bonds, the ligand conformational search space becomes so
large that it is difficult to find the correct binding pose with the lowest binding
free energy (global minimum solution). Some algorithms, such as HADDOCK
[36], first treat the structures as rigid to increase time efficiency and then
perform flexible refinement on the poses of molecules with the best energy
scores. Molecular docking software can be grouped according to the flexibility
treatment of the molecules: Rigid Docking, Semi-Flexible Docking, and Soft
Docking [35, 37].
Figure 2.
Methods for protein-ligand molecular docking.
In rigid docking, protein and ligand molecules are treated as rigid entities [37, 38].
During docking, the positions of the molecules change without losing their shape
[37], i.e., only translation and rotation but no conformational degrees of freedom are
considered.
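As a rough illustration of the rigid-body degrees of freedom described above, the following Python sketch applies one rotation and one translation to a toy ligand and checks that its internal geometry is unchanged. The coordinates, the function names, and the restriction to rotation about a single axis are assumptions made for brevity; an actual rigid docking search samples three rotational and three translational degrees of freedom.

```python
import math

def rotate_z(point, angle_rad):
    """Rotate a 3D point about the z-axis by angle_rad."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y, z)

def rigid_transform(coords, angle_rad, shift):
    """Apply the same rotation (about z) and translation to every atom."""
    moved = []
    for p in coords:
        x, y, z = rotate_z(p, angle_rad)
        moved.append((x + shift[0], y + shift[1], z + shift[2]))
    return moved

def pairwise_distances(coords):
    """All interatomic distances; they stay constant under rigid motion."""
    return [math.dist(a, b) for i, a in enumerate(coords) for b in coords[i + 1:]]

# Toy three-atom "ligand" (coordinates in Å, made up for illustration).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
pose = rigid_transform(ligand, math.radians(90), (10.0, -2.0, 3.0))
```

The position and orientation of the ligand change, while every interatomic distance returned by pairwise_distances stays the same, which is what "without losing their shape" means in practice.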
Semi-flexible docking is based on the principle of keeping the protein structure
rigid and letting the ligand structure be flexible by allowing rotatable bonds. Thus,
various conformational poses of the ligand on the protein are sampled [35, 37, 38]. It
gives more accurate results than rigid docking [37].
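The ligand flexibility used in semi-flexible docking can be sketched as a rotation of part of the molecule about a rotatable bond. The Python example below is a minimal sketch that uses Rodrigues' rotation formula on a made-up four-atom chain; the helper names and coordinates are hypothetical and are not taken from any docking package.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate_about_bond(point, bond_start, bond_end, angle_rad):
    """Rotate `point` about the bond axis using Rodrigues' rotation formula."""
    axis = sub(bond_end, bond_start)
    length = math.sqrt(dot(axis, axis))
    k = tuple(x / length for x in axis)
    v = sub(point, bond_start)
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    k_cross_v = cross(k, v)
    k_dot_v = dot(k, v)
    rotated = tuple(v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c)
                    for i in range(3))
    return add(rotated, bond_start)

def torsion_deg(p0, p1, p2, p3):
    """Torsion angle (degrees) about the p1-p2 bond."""
    b1 = sub(p2, p1)
    length = math.sqrt(dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    b0 = sub(p0, p1)
    b2 = sub(p3, p2)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))

# Toy four-atom chain A-B-C-D (coordinates in Å, made up for illustration).
A, B, C, D = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.5), (1.0, 0.0, 1.5)
D_rotated = rotate_about_bond(D, B, C, math.radians(60))
```

Rotating atom D about the B-C bond changes the A-B-C-D torsion angle while the C-D bond length is preserved, so each rotatable bond adds one degree of freedom to the ligand search.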
In soft docking, van der Waals interactions between atoms are softened, making
the structures of both receptor and ligand molecules implicitly flexible, as overlap is
allowed to a small extent [39, 40]. The soft docking process is carried out realistically by
treating both the protein and the ligand as flexible, as in their natural states
[37, 38]. It is an advantageous method due to its computational efficiency and ease of
application [35, 37].
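One simple way to picture the softening of van der Waals interactions is to cap the repulsive wall of a Lennard-Jones potential so that a small atomic overlap is penalized only mildly. The sketch below is an assumption-laden illustration: the capping scheme, the parameter values, and the function names are chosen for clarity and do not reproduce the functional form of any particular soft docking program.

```python
def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """Standard 12-6 Lennard-Jones energy (kcal/mol) at distance r (Å)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def soft_lennard_jones(r, epsilon=0.2, sigma=3.4, cap=1.0):
    """Same potential with the repulsion capped at `cap` kcal/mol (hypothetical scheme)."""
    return min(lennard_jones(r, epsilon, sigma), cap)

# Compare the two potentials for an overlapping and a non-overlapping pair.
for r in (2.5, 3.0, 4.0):
    print(f"r = {r:.1f} Å  hard = {lennard_jones(r):8.3f}  soft = {soft_lennard_jones(r):8.3f}")
```

At short distances the standard potential rises steeply, whereas the capped version stays bounded; at larger distances the two are identical.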
2.2.2 Conformational search algorithms
Conformational search algorithms can identify different conformational orien-
tations (poses) of the ligand sampled around the experimentally determined active
site or other binding sites on the protein [35, 41, 42]. These algorithms are gener-
ally classified as: shape matching, systematic, stochastic, and simulation methods
[35, 38, 43].
Shape matching algorithms have the advantage of speed over other algorithms
[35, 44] and adopt a sampling principle in which the conformation of the ligand
should be structurally complementary to the protein binding site [38]. They ensure that
the ligand is positioned in a way that best complements the molecular surface
of the binding site on the protein [35]. Some example software using shape matching
are: DOCK [45], FLOG [46], EUDOC [47], Surflex [48], LibDOCK [49], SANDOCK
[50], and MDock [51].
Using systematic search algorithms, a large number of possible binding poses
can be obtained by gradually changing the degrees of freedom of the ligands [35, 52]
toward the direction of minimum energy. Systematic search algorithms can be
divided into two as exhaustive search and fragmentation (incremental structure)
[35, 41, 53]. Exhaustive search algorithm is based on systematically generating flex-
ible ligand conformations by rotating the rotatable bonds in the ligand [35]. If the
number of rotatable bonds is large, there is a combinatorial explosion in the number
of poses, i.e., the search space, so that some filtering and optimization procedures
are applied for practical purposes [35]. Glide [54] and FRED [55] are example
docking software using exhaustive conformational search algorithms. In the fragmentation
method, the ligand is divided into smaller fragments, and each fragment is
placed and augmented at the binding site gradually through covalent bonding to the
previous one [35]. DOCK [56], LUDI [57], FlexX [58], and eHiTS [59] are example
software using fragmentation.
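The combinatorial explosion mentioned for exhaustive search can be made concrete with a short Python sketch. It assumes that every rotatable bond is sampled on the same regular torsion grid; the function names and the 30-degree step are illustrative choices, not settings of any specific program.

```python
import itertools

def torsion_grid(n_rotatable_bonds, step_deg):
    """Iterate over every combination of torsion angles on a regular grid."""
    angles = range(0, 360, step_deg)
    return itertools.product(angles, repeat=n_rotatable_bonds)

def n_conformations(n_rotatable_bonds, step_deg):
    """Number of grid conformations: (360 / step) ** n_rotatable_bonds."""
    return (360 // step_deg) ** n_rotatable_bonds

# Two rotatable bonds sampled every 120 degrees can be listed explicitly.
small = list(torsion_grid(2, 120))

# The count grows exponentially with the number of rotatable bonds.
for n in (2, 5, 10):
    print(n, "rotatable bonds ->", n_conformations(n, 30), "conformations")
```

Because the count grows as a power of the number of rotatable bonds, practical exhaustive methods rely on the filtering and optimization procedures mentioned above rather than scoring every grid point.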
The algorithms used in stochastic search methods are more efficient but do not
guarantee an accurate result as they are based on generating random ligand confor-
mations, and therefore, the docking process is iterative in these algorithms [41, 44].
Monte Carlo, swarm optimization, evolutionary algorithms, and Tabu search meth-
ods are among the most used stochastic algorithms [35, 38, 52]. Example software
using stochastic conformational search method include AutoDock [60], GOLD [61],
DockThor [62], and MolDock [63].
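A minimal sketch of a stochastic search, here a Metropolis Monte Carlo walk over a single torsion angle, is given below. The energy function is a made-up one-dimensional surface with one global and two local minima, not a docking score, and the step size, temperature factor, and function names are assumptions chosen for illustration.

```python
import math
import random

def toy_energy(angle_deg):
    """Made-up energy surface: global minimum at 180°, local minima near 60° and 300°."""
    a = math.radians(angle_deg)
    return math.cos(3 * a) + 0.5 * math.cos(a)

def metropolis_search(energy, start, n_steps=2000, max_move=30.0, kT=0.6, seed=0):
    """Random-walk search that keeps track of the best state visited."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(n_steps):
        trial = (current + rng.uniform(-max_move, max_move)) % 360.0
        e_trial = energy(trial)
        # Metropolis criterion: always accept downhill moves; accept uphill
        # moves with probability exp(-dE / kT) so local minima can be escaped.
        if e_trial <= e_current or rng.random() < math.exp(-(e_trial - e_current) / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

best_angle, best_energy = metropolis_search(toy_energy, start=60.0)
```

Different seeds can give different trajectories, which is why stochastic docking runs are usually repeated and the resulting poses compared.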
Simulations of the obtained ligand poses (simulation methods) represent protein
and ligand flexibility better than the other algorithms but are slow and can
sample conformations insufficiently [38, 44]. For this reason, they are used as a complement to
other conformational search methods [38].
2.2.3 Scoring functions
In the previously described conformational search step, many structures are cre-
ated and most of them should be eliminated by selecting the biologically appropriate
structures. Therefore, the possible poses created by conformational search algorithms
are evaluated and ranked by using a scoring function [35]. The scoring function is a
measure to evaluate the docking poses obtained [35, 38, 52] in terms of their binding
free energies [11, 44, 64].
Scoring functions that estimate the binding energies of the created
complex structures must evaluate various physicochemical properties in order
to distinguish good results from bad ones. These physicochemical properties can
be intermolecular interactions, desolvation, electrostatic and entropic
effects, etc. [65]. As the number of evaluated parameters increases, the accuracy
of the scoring function tends to increase, but so does the computational load.
Therefore, the most useful scoring functions, especially when working with
large ligand sets, are those that balance accuracy and speed [11]. The
scoring functions can be classified as: force-field-based, empirical, knowledge-based,
and consensus scoring.
Force-field-based scoring functions (FFSFs) are designed to work with force
fields such as AMBER [66], CHARMM [67], GROMOS [68], and OPLS [69], individually
or in combination. FFSFs estimate the free energy of ligand binding
by summing non-bonded energy terms such as van der Waals and electrostatic
interactions and, in some implementations, hydrogen bonds [35, 38].
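The non-bonded terms of a force-field-based score can be sketched as a sum of Lennard-Jones and Coulomb energies over all protein-ligand atom pairs. The example below is a simplified illustration: it uses a single pair of Lennard-Jones parameters and a constant dielectric of 4, both of which are assumptions, whereas real force fields assign parameters per atom type.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal·Å/(mol·e²); converts q1*q2/r to kcal/mol

def pair_energy(r, q1, q2, epsilon, sigma, dielectric=4.0):
    """12-6 Lennard-Jones plus Coulomb energy (kcal/mol) for one atom pair."""
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 ** 2 - sr6)
    elec = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    return vdw + elec

def force_field_score(protein_atoms, ligand_atoms, epsilon=0.15, sigma=3.5):
    """Sum pair energies over all protein-ligand atom pairs.

    Each atom is a tuple (x, y, z, partial_charge).
    """
    total = 0.0
    for px, py, pz, pq in protein_atoms:
        for lx, ly, lz, lq in ligand_atoms:
            r = math.dist((px, py, pz), (lx, ly, lz))
            total += pair_energy(r, pq, lq, epsilon, sigma)
    return total

# Toy example: one negatively charged protein atom and one ligand atom 4 Å away.
protein = [(0.0, 0.0, 0.0, -0.5)]
attractive = force_field_score(protein, [(4.0, 0.0, 0.0, +0.5)])
repulsive = force_field_score(protein, [(4.0, 0.0, 0.0, -0.5)])
```

With identical geometry, the oppositely charged pair receives the lower (more favorable) score, which reflects the electrostatic term alone.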
Empirical scoring functions use simpler energy terms to estimate the free energy
of ligand binding such as hydrogen bonds and ionic interaction, and they can be
calculated more easily and faster than FFSFs [35, 38, 52]. Some examples of empirical
scoring functions are GlideScore [54], PLP [70], LigScore [71], LUDI [72], SCORE
[73], and X-Score [74].
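In its simplest form, an empirical scoring function is a weighted sum of counted interaction terms, with weights fitted to experimental binding data. The weights and term names in the sketch below are hypothetical placeholders and are not the coefficients of GlideScore, X-Score, or any other published function.

```python
# Hypothetical weights (kcal/mol per term); real functions fit these to data.
WEIGHTS = {
    "hbond": -1.2,
    "ionic": -2.0,
    "lipophilic_contact": -0.3,
    "rotatable_bond": 0.4,   # penalty for ligand flexibility lost on binding
}

def empirical_score(term_counts, weights=WEIGHTS, intercept=0.0):
    """Weighted sum of interaction counts; missing terms count as zero."""
    return intercept + sum(w * term_counts.get(name, 0) for name, w in weights.items())

pose_terms = {"hbond": 2, "ionic": 1}
score = empirical_score(pose_terms)
```

Evaluating such a sum only requires counting interactions in a pose, which is why empirical functions are fast compared with a full force-field evaluation.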
Knowledge-based scoring functions use statistical analysis of protein-ligand complex
structures to derive potentials from the observed protein-ligand interatomic distances [44].
These functions can show high
performance in a short time [52]. They can also model some uncommon interactions,
such as sulfur-aromatic, that other functions do not address [44].
Consensus scoring is not a specific scoring function; it combines
multiple scoring functions with the aim of minimizing
the possible error margins of the individual scoring systems [35, 38, 44].
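One simple consensus strategy is to rank the poses separately with each scoring function and then average the ranks. The sketch below assumes that lower scores are better for every function; the pose names and score values are invented for illustration, and rank averaging is only one of several possible combination rules.

```python
def ranks(scores):
    """Rank poses by score; rank 1 is the lowest (most favorable) score."""
    order = sorted(scores, key=scores.get)
    return {pose: i + 1 for i, pose in enumerate(order)}

def consensus_rank(score_tables):
    """Average each pose's rank across several scoring functions."""
    per_function = [ranks(table) for table in score_tables]
    poses = per_function[0].keys()
    return {p: sum(r[p] for r in per_function) / len(per_function) for p in poses}

# Invented scores from three scoring functions on different scales.
force_field = {"pose_a": -7.1, "pose_b": -8.3, "pose_c": -5.0}
empirical = {"pose_a": -6.0, "pose_b": -6.5, "pose_c": -6.2}
knowledge = {"pose_a": -40.0, "pose_b": -35.0, "pose_c": -20.0}

consensus = consensus_rank([force_field, empirical, knowledge])
best = min(consensus, key=consensus.get)
```

Working with ranks rather than raw scores avoids having to put scoring functions with different units and ranges on a common scale.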
2.3 After docking: evaluation of the results
After performing protein-ligand docking studies, the accuracy of the pose predictions
needs to be evaluated [41, 52]. The best way to evaluate a docking algorithm
is to compare the predicted binding pose of the ligand with the position of the reference
ligand in the experimentally determined structure, if available. The structural comparison
is quantified using the root mean squared deviation (RMSD) (Eq. 1), in
units of Å [41, 75]. For a good docking result, this value should preferably be within 2–4 Å or
lower. RMSD calculations are simple, but this metric depends on the size of the
molecules compared and therefore should not be considered as an absolute measure [76]. As a
more systematic approach, in order to ensure the consistency of the docking algorithm
used, it should be checked whether the same poses are obtained by repeating
the docking process [52] at least 50 times and clustering the poses of the side chains
and references according to a certain threshold value [77]. In this way, it can be determined
whether the docking algorithm correctly and consistently creates a pose in the right position
[41, 44, 78].
$$
\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\left(x_{ai}-x_{bi}\right)^{2}+\left(y_{ai}-y_{bi}\right)^{2}+\left(z_{ai}-z_{bi}\right)^{2}\right]} \tag{1}
$$
Eq. (1) Root mean squared deviation for the coordinates of two molecules, a and b,
with N atoms.
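Eq. (1) translates directly into a few lines of Python. The sketch below assumes that both coordinate lists contain the same atoms in the same order and are already in the same reference frame (for ligand poses, the frame of the receptor); it does not handle symmetry-equivalent atoms, and the example coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean squared deviation (Å) between two matched coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Made-up reference and predicted ligand coordinates (Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
value = rmsd(reference, predicted)
```

Here every atom of the predicted pose is displaced by 2 Å along z, so the RMSD equals 2 Å.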
The modeling success and capabilities of docking algorithms have been evaluated since 2001 in
a competition called CAPRI (Critical Assessment of PRedicted Interactions)
(https://www.capri-docking.org/) [79, 80]. Experimentally determined complex
structures that have not yet been published in PDB are submitted to CAPRI, and
without knowing the experimental structure of the complex, the participants try to
predict the structure most similar to the experimentally determined complex structure
through docking algorithms [79]. A solution set of 10 models is presented to the
CAPRI committee for evaluation based on the geometric similarity and biological
relevance of the predicted complex structures. The results of CAPRI show very good
predictions for easy targets with simple conformational changes, but considerably worse
ones for difficult targets with large conformational changes upon binding [9].
3. Molecular docking approaches and applications in drug design
Computational methods have become an important part of the drug discovery
process with increasing accuracy of algorithms. Various docking methods based
on different algorithms are constantly being developed to determine the structural
relationships of potential drug molecules and their targets [44]. In addition, studies in
this area shed light on the candidate drugs in terms of the pharmacodynamic proper-
ties, affinity, and selectivity [11]. The main molecular docking applications in drug
discovery include drug repositioning (repurposing), structure- and ligand-based
drug design approaches using virtual and reverse screening [11–14].
Drug repositioning seeks out new targets for natural compounds, drugs currently
in use, or candidate ligands to reveal their unknown therapeutic potentials [81]. Many
successful repositioning studies are available in the literature [81–83]. Virtual screening
(VS) and reverse screening (RS) techniques are frequently used in drug discovery
and repositioning. VS offers a more effective and rational approach compared with
traditional methods [36]. The atomic-level results provided by
virtual screening studies guide us in understanding the function of the target and
in new drug discoveries [5, 36, 55]. In the RS approach, interest is on a single ligand
molecule, and there is a search for a biological target for this molecule [12]. Unlike
virtual screening (VS), the search library consists of potential target receptors. RS
approach has the potential to lead studies such as testing toxicity or side effects of the
existing drugs [38]. The potential side effects of a drug need to be evaluated in the
drug discovery process. Molecular docking studies can offer an important perspective
in this regard, and there are inverse (reverse) docking studies that provide bioactivity
data by detecting off-target bindings [25]. Lastly, Machine Learning (ML) and
Deep Learning (DL), which are subclasses of Artificial Intelligence (AI),
contribute significantly to the pharmaceutical industry [84]. AI can be applied to different
steps such as drug design with VS, de novo generation of drug molecules, and
computational planning of drug synthesis [85]. Recent developments suggest
that molecular docking methods may benefit more from machine learning methods
in the future [84].
3.1 Virtual screening
Virtual screening (VS) approach uses a target receptor and a library of small
molecules. Libraries can be created manually, or already existing libraries can be used.
The library consists of a large number of chemically diverse bioactive small molecules
with a high probability of binding to the receptor. This virtual computing technique
is considered as the in silico equivalent of in vitro methods such as high-throughput
screening (HTS) [11]. VS is preferred as a guide in scientific studies because its
success rate is reported to be 400 times higher [86] and because it is less costly, faster,
and less labor-intensive than high-throughput screening methods [87]. VS studies aim to reduce a large
number of potential drug candidates to manageable numbers applying various filters.
The biggest challenge in VS is the detection of false negatives [19].
Ligand-based VS methods conduct research by identifying common properties
of compound series, such as molecular volume and protonation state [11]. In
addition to chemical similarity [88] and rule-based [89] software included in filtration
strategies, there are also freely available add-on tools for pharmacophore
and quantitative structure-activity relationship (QSAR) modeling [87, 90]. The most
commonly used ligand-based virtual screening method is the QSAR method. Ligand-based
VS does not use structural information about the receptor; it only searches
using molecules known to be active and tries to detect other active ligand molecules [85].
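Chemical similarity filtering in ligand-based screening is often expressed with the Tanimoto coefficient between molecular fingerprints. The sketch below represents each fingerprint as a set of "on" bit positions; the fingerprints, compound names, and the 0.5 threshold are invented for illustration, and generating real fingerprints would require a cheminformatics toolkit that is not shown here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def similarity_screen(query_fp, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, most similar first."""
    hits = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Invented fingerprints: each set lists the bit positions that are switched on.
known_active = {1, 4, 7, 9}
library = {
    "compound_1": {1, 4, 7, 8},
    "compound_2": {2, 3},
    "compound_3": {1, 4, 7, 9},
}
hits = similarity_screen(known_active, library)
```

Only the known active molecule and the library fingerprints are needed, which illustrates why no receptor structure enters a ligand-based screen.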
Structure-based VS methods are often used when the receptor has different con-
formations. The aim is to predict receptor binding affinity by processing structural
information using a variety of techniques, such as binding site similarity and phar-
macophore mapping. By estimating the different binding modes, the molecules are
sorted for evaluation [11]. Analysis of the predicted poses can be done manually using
visualization programs. It has been reported that nAPOLI, a web server developed in
recent years, analyzes results automatically [91].
Structure-based pharmacophore generation is one of the most frequently used
methods for small molecules in the virtual screening method. Here, 3D pharmaco-
phore model interfaces of the scaffolds of the ligands are created, and ligands that will
adapt to the binding site and provide the desired bioactivity are selected. Some of the
programs that use pharmacophore modeling are HipHop [92], PHASE [93], and MOE,
which are commercial, and SCAMPI [94], PharmaGist [95], and ALADDIN [96], which are
suitable for academic use.
A recent example of a VS application targeting non-structural protein 1 (nsp1) of SARS-CoV-2,
one of the virulence factors causing viral infection, was reported by G. O. Timo et al. [74].
They estimated the exact pattern of nsp1 interaction through molecular simulation
studies and analyzed 8694 potential inhibitors from the DrugBank database using
the virtual screening method and proposed 16 inhibitor molecules with the best
binding energy scores [74]. There is another recent study on the transcription factor
BRF2, which is among the therapeutic targets as its upregulation is observed in the
formation of various types of cancer, but there is no available specific drug targeting
BRF2. By performing drug repositioning through virtual screening of drug mol-
ecules that are potential candidates for BRF2 inhibition, Rashidieh et al. found that
the bexarotene molecule led to a serious decrease in the proliferation of this type of
cancer cells [97].
3.2 Reverse screening
Reverse screening (RS) is also called inverse docking, reverse docking, inverse
virtual screening, or target screening. Libraries for target hunting
and profiling are more limited [12] and can be created manually using the most commonly
accessible databases such as PDB [98] and TTD [12, 99]. However, this process requires a long
preparation time and effort. There are various algorithms used to detect interactions
by reverse screening. Some web platforms (INVDOCK [100], idTarget [101], ACTP
[102], etc.) have been developed for reverse docking, which use libraries prepared
for specific diseases and docked using programs such as standard AutoDock and
AutoDock Vina [12].
A recently developed Consensus Reverse Docking System (CRDS) detects
potential binding sites by screening approximately 5200 candidate proteins for the
ligand molecule using three different scoring methods [103]. In another example,
Stepanova et al. applied reverse screening to chemicals whose antimicrobial activity against a
Mycobacterium tuberculosis strain had been demonstrated in experimental
studies and determined the most appropriate target to be aspartate 1-decarboxylase by
performing docking studies using 35 different target protein structures [104]. Reverse
screening was also used for Bazedoxifene, an FDA-approved drug for the prevention
of postmenopausal osteoporosis, and Xiao et al. defined the inhibitory power of
Bazedoxifene on IL-6/GP130 signaling pathway (critical for cancer survival) by using
computational techniques and confirmed the result with in vivo studies [83].
3.3 Machine-learning-based approaches
Machine learning techniques take information from biological data and make
predictions about them, thus contributing to building a structural model [9]. Once a
model is built, it must be improved so that the state with the lowest potential energy
(global minimum) can be reached. Global minimum means a stable and sterically
acceptable structure, and reaching it without being stuck at the local minima is
very important in the field of bioinformatics and computational structural biology.
A recent machine learning algorithm developed by DeepMind, called AlphaFold
[5], implements deep learning and can predict 3D structures of proteins using the
sequence information with high accuracy and has been accepted as a breakthrough in
the structural biology field.
Molecular Docking - Recent Advances

Machine learning makes classifications by learning from datasets and needs human
intervention to evaluate possible outcomes. Deep learning is a more advanced approach
based on neural networks that can decide on the right result without human
intervention (Figure 3). Machine learning can use supervised or unsupervised learning.
Supervised learning is performed on datasets whose labels are known,
whereas unsupervised learning detects and labels similarities and orientations in a
created cluster [38, 90].
The training set used in machine learning determines the performance of the algorithm.
Machine learning studies in the field of virtual screening are generally focused
on improving the performance of the scoring function [85]. Studies have shown that
working with small subsets of the same family, which consist of similar structures,
gives better scoring results than working with large data from different complexes
[105]. Working with subsets of interest is also a better approach in terms of
computational requirements [38].
Machine learning and deep learning models can describe more diverse data than other
computational systems and can therefore be representative of structural biology data.
Nonparametric machine learning has great potential to be the next step in improving
the accuracy of molecular docking studies [41]. Machine learning can be used to refine
predetermined scoring functions as well as to provide high-quality data that
complement pharmaceutical discovery research and development.
. Case study: comparison of docking tools
As a case study for comparing different protein-ligand docking tools, the crystal
structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent
inhibitor Jun8-76-3A (PDB ID: 7KX5) is used as the experimental reference structure
to evaluate the accuracy of the complex structures predicted by the AutoDock Vina,
HADDOCK, and SwissDock programs, and some of their parameters are varied to test the
effects on prediction capability. The inhibitor is removed from the experimental
protein structure, and molecular docking is then performed starting from the separated
coordinates of the SARS-CoV-2 main protease structure and its inhibitor Jun8-76-3A.
Figure 3.
Schematic illustration of artificial intelligence subfields: Machine learning and deep learning.

Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule…
DOI: http://dx.doi.org/10.5772/intechopen.105815
. Docking with AutoDock Vina
AutoDock is free software that predicts how small ligands bind to macromolecular
targets using a flexible-rigid (semi-flexible) docking approach [27]. It uses a
grid-based method to place the ligand in the active region defined on the
macromolecule [106]. AutoDockTools (http://mgltools.scripps.edu/downloads) is the
user interface used to prepare the protein and ligand structures in the required
format and to generate and inspect the grid information for the configuration
file [27].
As docking input, AutoDock Vina requires a configuration file that specifies the
protein and ligand structure files and the ligand-binding region on the receptor. To
dock the case study ligand to the receptor using AutoDock Vina, the structure file was
downloaded from the RCSB PDB database (https://www.rcsb.org) in .pdb format (PDB ID:
7KX5). The AutoDockTools (v1.5.6) interface was used to prepare the input files: water
molecules in the protein structure were deleted, polar hydrogens were added to the
structure, and both the receptor and ligand structures were saved in .pdbqt file
format. After the ligand and protein structures are prepared, the most important input
for AutoDock is the docking parameter, that is, the coordinates of the ligand-binding
region on the target protein. If the binding region on the protein is not known, blind
docking can be performed by enclosing the whole protein in the grid box (Figure 4A);
otherwise, a small grid box can be placed on the specific known or predicted
ligand-binding region of the protein (Figure 4B). Lastly, after the region where the
ligand is to be bound was determined using the “grid box” in AutoDockTools, the
corresponding coordinates were specified in the input configuration file. With all the
required inputs prepared, docking was performed using AutoDock Vina, and each docking
run was repeated three times in order to observe the consistency of the algorithm
(Table 1).
Figure 4.
Grid box usage in docking: (A) blind docking with a grid box of size: 44 × 72 × 68 and
center coordinates: 10.711, 0.0, 3.782, (B) specific docking with a grid box of size: …
and center coordinates: 10.735, 2.409, 21.173.
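As an illustration of the configuration file described above, the following sketch assembles one for the blind-docking box. The option names (`receptor`, `ligand`, `out`, `center_x`, `size_x`, `exhaustiveness`, `num_modes`) are standard AutoDock Vina configuration keys; the file names are hypothetical, the center and size values are taken as printed in the Figure 4 caption, and it is an assumption that the box dimensions can be passed to Vina directly in angstroms (AutoDockTools reports the box in grid points, which matches only at a spacing of 1 Å).

```python
def vina_config(receptor, ligand, center, size, out="docked.pdbqt",
                exhaustiveness=8, num_modes=9):
    """Return the text of an AutoDock Vina configuration file.

    center and size are (x, y, z) tuples describing the search box.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
        f"num_modes = {num_modes}",
    ]
    return "\n".join(lines) + "\n"


# Blind-docking box; the values are copied as printed in the Figure 4 caption.
blind_config = vina_config(
    receptor="7kx5_receptor.pdbqt",  # hypothetical file name
    ligand="jun8-76-3a.pdbqt",       # hypothetical file name
    center=(10.711, 0.0, 3.782),
    size=(44, 72, 68),
)
print(blind_config)
```

Saved to a file such as `blind.conf` (a hypothetical name), the text could be passed to the program with `vina --config blind.conf`. Repeating the run three times, as in the case study, only requires calling Vina again with the same file, optionally with a different `seed` value.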
In order to examine the accuracy of the docking results, the poses obtained from
AutoDock Vina were aligned with the original PDB structure using the PyMOL program
[107]. When the poses predicted with specific docking (i.e., using a grid box
restricted to the binding site) and blind docking are compared, the energy scores of
the blind docking results are better, but comparison of the poses with the reference
ligand shows that the most accurate binding is achieved with specific docking
(Figure 5). Alignment of the first poses (those with the lowest energy score)
predicted with specific docking (green) and blind docking (blue) with the reference
ligand (red) shows that the specific docking pose is positioned more similarly to the
reference ligand (green vs. red) than the blind docking pose (blue vs. red).
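The visual comparison above can be complemented with a number. A minimal sketch, assuming the predicted and reference ligands are already in the same receptor frame and list their atoms in the same order, is the root-mean-square deviation (RMSD) over matched atoms. The coordinates below are invented for illustration, and ligand symmetry is ignored.

```python
import math


def ligand_rmsd(pose, reference):
    """RMSD (in the units of the input, typically angstroms) between two
    lists of (x, y, z) coordinates with identical atom ordering.
    No superposition is applied: both ligands share the receptor frame."""
    if len(pose) != len(reference) or len(pose) == 0:
        raise ValueError("pose and reference need the same, non-zero atom count")
    squared = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(pose, reference):
        squared += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(squared / len(pose))


# Invented three-atom example: one pose close to the reference, one far away.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
near_pose = [(x + 1.0, y, z) for x, y, z in reference]
far_pose = [(x + 3.0, y + 4.0, z) for x, y, z in reference]

print(ligand_rmsd(near_pose, reference))
print(ligand_rmsd(far_pose, reference))
```

A cutoff of about 2 Å is a commonly used convention for calling a pose close to the experimental one. With such a measure, a pose with a better energy score but a large RMSD, like the blind docking pose here, can be recognized as less accurate.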
. Docking with HADDOCK
High Ambiguity-Driven biomolecular DOCKing (HADDOCK) is a popular integrative
platform for the molecular docking of two or more molecules [36, 108]. Although it is
mainly suited to protein-protein interactions, it can also be applied to model
protein–small-molecule complexes [109]. HADDOCK automatically selects the most
suitable configuration of the ligand according to the given restraints [108].
Protein-protein docking is more complex than protein–small-molecule docking, as the
proteins are flexible and the conformational space is larger [110].
When run through its web server, HADDOCK does not require local CPU resources and
allows the user to follow all the docking steps from start to finish. It should be
noted that the success of HADDOCK studies is directly related to the amount of
information supplied to the system [36]. HADDOCK can process different types of
molecules with the help of external resources such as WHATIF, ProDRG, and PDB. There
is no need to generate different conformers beforehand, as the system selects the most
compatible conformers based on the shape constraints. With restraint files, the user
can set explicit target sites and binding distances, or select active and passive
residues (areas that are likely to interact). Defining semi-flexible regions is also
allowed.

        Specific docking                    Blind docking
        Affinity (kcal/mol)                 Affinity (kcal/mol)
Mode    Rep 1   Rep 2   Rep 3   AVG         Rep 1   Rep 2   Rep 3   AVG
1       -8.9    -8.8    -8.9    -8.9        -8.9    -8.9    -9.0    -8.9
2       -7.3    -8.7    -7.3    -7.8        -8.2    -8.2    -8.3    -8.2
3       -7.2    -7.2    -7.3    -7.2        -8.1    -8.0    -8.1    -8.1
4       -6.8    -7.0    -7.0    -6.9        -7.9    -7.8    -8.0    -7.9
5       -6.8    -6.9    -7.0    -6.9        -7.9    -7.5    -8.0    -7.8
6       -6.8    -6.8    -6.9    -6.8        -7.7    -7.4    -7.8    -7.6
7       -6.7    -6.5    -6.8    -6.7        -7.7    -7.4    -7.8    -7.6
8       -6.4    -6.4    -6.8    -6.5        -7.6    -7.2    -7.6    -7.5
9       -6.3    -6.4    -6.7    -6.4        -7.5    -6.9    -7.4    -7.3
Table 1.
Specific and blind docking studies with AutoDock Vina were repeated three times.
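The replicate averages reported in Table 1 for AutoDock Vina can be recomputed with a few lines. The sketch below takes the first three modes of each series, treating the affinities as negative values in kcal/mol as Vina reports them, and adds the sample standard deviation as a simple consistency measure; the standard deviation is an addition for illustration and is not reported in the table.

```python
from statistics import mean, stdev

# Affinities (kcal/mol) of the first three modes from Table 1,
# three replicates each, taken as negative values.
specific = {1: (-8.9, -8.8, -8.9), 2: (-7.3, -8.7, -7.3), 3: (-7.2, -7.2, -7.3)}
blind = {1: (-8.9, -8.9, -9.0), 2: (-8.2, -8.2, -8.3), 3: (-8.1, -8.0, -8.1)}


def summarize(replicates):
    """Map each mode to (mean rounded to 0.1, sample std rounded to 0.01)."""
    return {
        mode: (round(mean(values), 1), round(stdev(values), 2))
        for mode, values in replicates.items()
    }


specific_summary = summarize(specific)
blind_summary = summarize(blind)
print(specific_summary)
print(blind_summary)
```

The means agree with the AVG columns of the table. The spread is small for most modes, while mode 2 of the specific docking series varies noticeably between replicates, which is the kind of inconsistency the repeated runs are meant to reveal.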
The HADDOCK algorithm consists of three stages: rigid-body minimization and
randomization of orientations (it0), semi-flexible simulated annealing in torsion
angle space (it1), and refinement in 3D (Cartesian) space with explicit solvent
(water) (https://www.bonvinlab.org/education/HADDOCK-protein-protein-basic/). The it0
stage treats the structures as rigid bodies, and the 1000 poses with the best scores
are selected. The it1 stage optimizes the orientations of the docking poses from it0
while allowing the defined flexible regions to move. The 200 models with the best
energy pass to the final stage. In the final stage, an explicit solvent medium (water
or DMSO) is included to improve the interaction energy, and the final models are
automatically clustered.
To dock the case study inhibitor-protein complex (PDB ID: 7KX5), the guideline
tutorial (HADDOCK small-molecule binding site screening protocol) [111] was followed
and two different approaches were tested: (i) using an unambiguous (distance)
restraint file that indicates the target site to which the ligand should bind, and
(ii) defining the active and passive residues. This case study consists of a
pre-docking run to detect the binding region and a second docking run to detect the
binding pose.
First, we tested the accuracy of HADDOCK in binding site detection. Two different
binding sites were detected in the top 10 clusters with the best energy scores, and
70% (7 out of 10) of the clusters were in the correct binding site (Figure 6A).
Second, ambiguous and unambiguous restraint files were created by identifying the
region with the highest number of interactions between the ligand and the receptor.
The restraint files can be created manually or using the link in the protocol;
however, it may be necessary to correct the distance restraints. The structure with
the best energy is visualized in Figure 6B. Third, active and passive residues were
defined on the system, and the pose with the best energy is visualized in Figure 6C.
HADDOCK results are summarized in Table 2.
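To make the two restraint types more concrete, the sketch below writes restraint lines in the CNS-style `assign (selection) (selection) distance minus plus` syntax that HADDOCK restraint files use. The residue numbers, atom names, and segment identifiers are hypothetical placeholders and are not the restraints used in this case study.

```python
def unambiguous_restraint(selection_a, selection_b, distance, minus, plus):
    """One distance restraint between two explicit selections; the allowed
    range is [distance - minus, distance + plus] in angstroms."""
    return (f"assign ({selection_a}) ({selection_b}) "
            f"{distance:.1f} {minus:.1f} {plus:.1f}")


def ambiguous_restraint(active_resid, active_segid, partner_resids,
                        partner_segid, upper=2.0):
    """One ambiguous interaction restraint: the active residue must come
    within `upper` angstroms of at least one of the partner residues."""
    partners = " or ".join(
        f"(resid {resid} and segid {partner_segid})" for resid in partner_resids
    )
    return (f"assign (resid {active_resid} and segid {active_segid}) "
            f"({partners}) {upper:.1f} {upper:.1f} 0.0")


# Hypothetical selections: protein as segid A, ligand as segid B, residue 1.
unambiguous = unambiguous_restraint(
    "segid A and resid 145 and name SG",
    "segid B and resid 1 and name C1",
    distance=4.0, minus=1.0, plus=1.0,
)
ambiguous = ambiguous_restraint(1, "B", [41, 145], "A")
print(unambiguous)
print(ambiguous)
```

An unambiguous restraint fixes which atom pair must be close, whereas an ambiguous restraint is satisfied by any of the listed partner residues. This difference is why restraints derived from active and passive residues leave more freedom to the ligand pose.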
Comparison of the results shows that HADDOCK is successful in detecting the binding
site. However, according to the results obtained in the second stage, the algorithm
was not successful enough at finding the correct conformation of the ligand in the
binding site. Defining ambiguous/unambiguous restraint files or selecting active and
passive residues did not contribute significantly to detecting the correct binding
pose (Figure 6B and C). Docking with both approaches was repeated several times and no
significant similarity was detected.
Figure 5.
Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5_chain A) in complex with the blind
docking (blue) and specific docking (green) poses predicted with AutoDock Vina and the reference ligand
Jun8-76-3A inhibitor (red, PDB ID: 7KX5_chain B). This figure was drawn with PyMol 2.5.2.
. Docking with SwissDock
SwissDock is a web server for protein–small-molecule docking. Related resources of
the Swiss Institute of Bioinformatics offer a web browser and programmatic interface
for creating three-dimensional protein models from amino acid sequences [112], as well
as user interfaces such as Swiss-Pdb Viewer (DeepView) for analyzing several proteins
simultaneously [113]. Using the SwissDock web server, the starting crystal structures
of the target proteins can be searched and fetched from protein and ligand structure
databases. If no crystal structure is available, a homology model of the studied
protein can be used instead. During the docking process, the user does not have to
perform any calculations because all calculations are handled on the server side
[112]. As a docking constraint, the ligand-binding region can be defined, or blind
docking can be applied when no binding site information is available.
Figure 6.
Crystal SARS-CoV-2 main protease structure (gray, PDB ID: 7KX5_chain A) in complex with the docking poses
(blue) predicted with HADDOCK and reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5_chain B).
A. Top 10 clusters for binding site determination. B. Pose with the best energy using ambiguous/unambiguous
restraints. C. Pose with the best energy using active/passive restraints. This figure was drawn with PyMol 2.5.2.
Using the case study, both specific and blind docking were performed on the SwissDock
server, and the results were compared. The server presented 256 poses. The best scores
obtained by specific docking (green) and blind docking (blue) were -9.88 and
-9.35 kcal/mol, respectively (Figure 7). Although neither of the predicted poses
showed the same conformation as the reference ligand, the pose obtained from specific
docking (green) was more similar to the reference ligand (red) (Figure 7).
                                   Binding site      Ambiguous/Unambiguous   Active/passive
                                   detection         restraints              restraints
HADDOCK score                      -53.4 ± 1.5       -52.1 ± 0.5             -21.9 ± 2.7
Cluster size                       69 513
RMSD from the overall lowest-
energy structure                   0.3 ± 0.2         0.1 ± 0.1               0.2 ± 0.0
Van der Waals energy               -40.3 ± 1.2       -41.6 ± 0.2             -32.4 ± 4.5
Electrostatic energy               -22.1 ± 1.9       -15.2 ± 6.0             -25.8 ± 7.3
Desolvation energy                 -10.9 ± 2.5       -9.0 ± 0.2              -6.7 ± 0.3
Restraints violation energy        0.0 ± 0.00        0.7 ± 0.2               198.5 ± 78.0
Buried Surface Area                795.4 ± 21.9      781.6 ± 5.2             783.0 ± 9.4
Z-Score                            -1.7              -2.4                    -1.3
Table 2.
HADDOCK results.
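The HADDOCK score in Table 2 is a weighted sum of the listed energy terms. The sketch below recombines them, taking the van der Waals, electrostatic, and desolvation terms as negative (favorable) energies and the restraint violation energy as a positive penalty. The weights (1.0, 0.1, 1.0, 0.1) are an assumption, chosen because they reproduce the reported scores to within rounding; the weights actually used in the runs are not stated in the text.

```python
def haddock_score(e_vdw, e_elec, e_desolv, e_air,
                  w_vdw=1.0, w_elec=0.1, w_desolv=1.0, w_air=0.1):
    """Weighted sum of energy terms; the default weights are assumed values."""
    return w_vdw * e_vdw + w_elec * e_elec + w_desolv * e_desolv + w_air * e_air


# Mean energy terms from Table 2 as
# (van der Waals, electrostatic, desolvation, restraint violation).
table2_terms = {
    "binding site detection": (-40.3, -22.1, -10.9, 0.0),
    "ambiguous/unambiguous restraints": (-41.6, -15.2, -9.0, 0.7),
    "active/passive restraints": (-32.4, -25.8, -6.7, 198.5),
}

recomputed = {name: haddock_score(*terms) for name, terms in table2_terms.items()}
for name, score in recomputed.items():
    print(f"{name}: {score:.2f}")
```

Under these assumptions, the much poorer score of the active/passive run comes mainly from its large restraint violation energy rather than from weaker intermolecular energies.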
Figure 7.
Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5_chain A) in complex with the blind
docking (blue) and specific docking (green) poses predicted by SwissDock and the reference ligand
Jun8-76-3A inhibitor (red, PDB ID: 7KX5_chain B). This figure was drawn with PyMol 2.5.2.

. Conclusions
Molecular docking is a computational method that predicts the 3D structures of
receptor-ligand complexes. Modeling the atomic details of the ligand pose with the
receptor protein by molecular docking can assist in understanding protein structure-
function relationship and in drug design studies in several ways. Computational
modeling approaches complement and/or lead experiments by eliminating irrelevant
drug candidates and selecting the ones with the best binding properties. With the
continuously developing technology, there are many different approaches and algo-
rithms for molecular docking studies, and they are successfully used in therapeutic
applications such as targeted drug design, drug target search, evaluation of the side
effects of existing drugs, or finding new targets for these drugs.
The crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex
with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) was used as an experi-
mental reference case study to compare and evaluate the prediction accuracies of
AutoDock Vina, HADDOCK, and SwissDock programs as well as to test the effects of
some parameters on their prediction capabilities. One of the main observations is that
the ligand poses with the lowest binding energy scores are not necessarily the best
solution. Therefore, docking results should always be evaluated in terms of biologi-
cal relevance. Moreover, when a priori information about the ligand-binding site
is included as grid box placement and size in AutoDock Vina and as ligand binding
residues in SwissDock, the binding accuracy is improved significantly.
In summary, before starting the molecular docking, it is of crucial importance to
obtain detailed information on the target protein and ligand from various sources and
servers and to decide which docking algorithm to use. Moreover, the top predicted
poses with the best scores should not be unquestioningly accepted as the best solu-
tions but further structural analyses and evaluations should be incorporated in the
decision process.
Acknowledgements
We would like to thank dear Merve DEMİR AKYÜZ and Merve YÜCETÜRK
(a.k.a Merves), who are fourth-year undergraduate students at Molecular Biology and
Genetics Department of Istanbul Medeniyet University, for their contribution to the
writing of the introduction section.

Author details
SefikaFeyza Maden, SelinSezer and Saliha EceAcuner*
Department of Bioengineering and Science and Advanced Technologies Research
Center (BILTAM), Istanbul Medeniyet University, Istanbul, Turkey
*Address all correspondence to: ece.ozbabacan@medeniyet.edu.tr
© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of
the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided
the original work is properly cited.

References
[1] Russell RB, Alber F, Aloy P, Davis FP,
Korkin D, Pichaud M, et al. A structural
perspective on protein–protein
interactions. Current Opinion in
Structural Biology. 2004;(3):313-324
[2] Sadowski MI, Jones DT. The
sequence–structure relationship
and protein function prediction.
Current Opinion in Structural Biology.
2009;(3):357-362
[3] Petrey D, Honig B. Structural
bioinformatics of the interactome.
Annual Review in Biophysics.
2014;(1):193-210
[4] Stein A, Mosca R, Aloy P. Three-
dimensional modeling of protein
interactions and complexes is going
omics. Current Opinion in Structural
Biology. 2011;(2):200-208
[5] Senior AW, Evans R, Jumper J,
Kirkpatrick J, Sifre L, Green T, et al.
Improved protein structure prediction
using potentials from deep learning.
Nature. 2020;(7792):706-710
[6] Andrusier N, Mashiach E,
Nussinov R, Wolfson HJ. Principles
of flexible protein-protein docking.
Proteins. 2008;(2):271-289
[7] Bonvin AM. Flexible protein–protein
docking. Current Opinion in Structural
Biology. 2006;(2):194-200
[8] Vakser IA. Protein-protein
docking: From Interaction to
Interactome. Biophysical Journal.
2014;(8):1785-1793
[9] Harmalkar A, Gray JJ. Advances to
tackle backbone flexibility in protein
docking. Current Opinion in Structural
Biology. 2021;:178-186
[10] Wang C, Bradley P, Baker D.
Protein–protein docking with backbone
flexibility. Journal of Molecular Biology.
2007;(2):503-519
[11] Ferreira L, dos Santos R, Oliva G,
Andricopulo A. Molecular docking and
structure-based drug design strategies.
Molecules. 2015;(7):13384-13421
[12] Pinzi L, Rastelli G. Molecular
docking: Shifting paradigms in drug
discovery. IJMS. 2019;(18):4331
[13] March-Vila E, Pinzi L, Sturm N,
Tinivella A, Engkvist O, Chen H, et al.
On the integration of in silico drug design
methods for drug repurposing. Frontiers
in Pharmacology. 2017;(8):298
[14] Wilson GL, Lill MA. Integrating
structure-based and ligand-based
approaches for computational drug
design. Future Medicinal Chemistry.
2011;(6):735-750
[15] Anighoro A. Deep learning in
structure-based drug design. Methods in
Molecular Biology. 2022;:261-271
[16] Stokes JM, Yang K, Swanson K,
Jin W, Cubillos-Ruiz A, Donghia NM,
et al. A deep learning approach
to antibiotic discovery. Cell.
2020;(4):688-702.e13
[17] Elton DC, Boukouvalas Z, Fuge MD,
Chung PW. Deep learning for molecular
design—a review of the state of the
art. Molecular System and Design
Engineering. 2019;(4):828-849
[18] Allison B, Combs S,
DeLuca S, Lemmon G, Mizoue L, Meiler J.
Computational design of protein-small
molecule interfaces. Journal of Structural
Biology. 2014;(2):193-202

[19] Śledź P, Caflisch A. Protein
structure-based drug design: From
docking to molecular dynamics.
Current Opinion in Structural Biology.
2018;:93-102
[20] Guterres H, Im W. Improving
protein-ligand docking results with
high-throughput molecular dynamics
simulations. Journal of Chemical
Information and Modeling.
2020;(4):2189-2198
[21] Wishart DS. DrugBank: A
comprehensive resource for in silico drug
discovery and exploration. Nucleic Acids
Research. 2006;(90001):D668-D672
[22] Li Q, Cheng T, Wang Y, Bryant SH.
PubChem as a public resource for drug
discovery. Drug Discovery Today.
2010;(23-24):1052-1057
[23] Irwin JJ, Sterling T, Mysinger MM,
Bolstad ES, Coleman RG. ZINC: A Free
Tool to Discover Chemistry for Biology.
Journal of Chemical Information and
Modeling. 2012;(7):1757-1768
[24] Gaulton A, Bellis LJ, Bento AP,
Chambers J, Davies M, Hersey A, et al.
ChEMBL: A large-scale bioactivity
database for drug discovery.
Nucleic Acids Research.
2012;(D1):D1100-D1107
[25] Pence HE, Williams A. ChemSpider:
An online chemical information
resource. Journal of Chemical Education.
2010;(11):1123-1124
[26] O’Boyle NM, Banck M,
James CA, Morley C, Vandermeersch T,
Hutchison GR. Open Babel: An
open chemical toolbox. Journal of
Cheminformatics. 2011;(1):33
[27] Morris GM, Huey R, Lindstrom W,
Sanner MF, Belew RK, Goodsell DS,
etal. AutoDock4 and AutoDockTools4:
Automated docking with selective
receptor flexibility. Journal
of Computational Chemistry.
2009;(16):2785-2791
[28] Goodford PJ. A computational
procedure for determining energetically
favorable binding sites on biologically
important macromolecules. Journal of
Medicinal Chemistry. 1985;(7):849-857
[29] Laskowski RA. SURFNET: A
program for visualizing molecular
surfaces, cavities, and intermolecular
interactions. Journal of Molecular
Graphics. 1995;(5):323-330
[30] Yang J, Roy A, Zhang Y. Protein-
ligand binding site recognition using
complementary binding-specific
substructure comparison and sequence
profile alignment. Bioinformatics.
2013;(20):2588-2595
[31] Narang P, Bhushan K, Bose S,
Jayaram B. Protein structure evaluation
using an all-atom energy based
empirical scoring function. Journal of
Biomolecular Structure & Dynamics.
2006;(4):385-406
[32] Binkowski TA, Naghibzadeh S,
Liang J. CASTp: Computed Atlas of
Surface Topography of proteins. Nucleic
Acids Research. 2003;(13):3352-3355
[33] Jiménez J, Doerr S,
Martínez-Rosell G, Rose AS, De
Fabritiis G. DeepSite: Protein-binding
site predictor using 3D-convolutional
neural networks. Bioinformatics.
2017;(19):3036-3042
[34] Kandel J, Tayara H, Chong KT.
PUResNet: Prediction of protein-ligand
binding sites using deep residual neural
network. Journal of Cheminformatics.
2021;(1):65
[35] Huang SY, Zou X. Advances and
Challenges in Protein-Ligand Docking.
IJMS. 2010;(8):3016-3034

[36] de Vries SJ, van Dijk M, Bonvin AMJJ.
The HADDOCK web server for data-
driven biomolecular docking. Nature
Protocols. 2010;(5):883-897
[37] Fan J, Fu A, Zhang L. Progress in
molecular docking. Quantitative Biology.
2019;(2):83-89
[38] Crampon K, Giorkallos A,
Deldossi M, Baud S, Steffenel LA.
Machine-learning methods for ligand–
protein molecular docking. Drug
Discovery Today. 2022;(1):151-164
[39] Jiang F, Kim SH. “Soft docking”:
Matching of molecular surface
cubes. Journal of Molecular Biology.
1991;(1):79-102
[40] Ferrari AM, Wei BQ, Costantino L,
Shoichet BK. Soft docking and multiple
receptor conformations in virtual
screening. Journal of Medicinal
Chemistry. 2004;(21):5076-5084
[41] Torres PHM, Sodero ACR,
Jofily P, Silva-Jr FP. Key topics in
molecular docking for drug design. IJMS.
2019;(18):4574
[42] Gioia D, Bertazzo M, Recanatini M,
Masetti M, Cavalli A. Dynamic docking:
A paradigm shift in computational drug
discovery. Molecules. 2017;(11):2029
[43] Sousa SF, Fernandes PA, Ramos MJ.
Protein-ligand docking: Current status
and future challenges. Proteins.
2006;(1):15-26
[44] Meng XY, Zhang HX, Mezei M,
Cui M. Molecular docking: A powerful
approach for structure-based drug
discovery. Caduceus. 2011;(2):146-157
[45] Kuntz ID, Blaney JM, Oatley SJ,
Langridge R, Ferrin TE. A geometric
approach to macromolecule-ligand
interactions. Journal of Molecular
Biology. 1982;(2):269-288
[46] Miller MD, Kearsley SK,
Underwood DJ, Sheridan RP. FLOG: A
system to select?quasi-flexible? ligands
complementary to a receptor of known
three-dimensional structure. Journal
of Computer-Aided Molecular Design.
1994;(2):153-174
[47] Pang YP, Perola E, Xu K,
Prendergast FG. EUDOC: A computer
program for identification of drug
interaction sites in macromolecules and
drug leads from chemical databases.
Journal of Computational Chemistry.
2001;(15):1750-1771
[48] Jain AN. Surflex: Fully automatic
flexible molecular docking using a
molecular similarity-based search
engine. Journal of Medicinal Chemistry.
2003;(4):499-511
[49] Diller DJ, Merz KM. High
throughput docking for library design
and library prioritization. Proteins.
2001;(2):113-124
[50] Burkhard P, Taylor P,
Walkinshaw MD. An example of a
protein ligand found by database
mining: Description of the docking
method and its verification by a 2.3 Å
X-ray structure of a Thrombin-Ligand
complex. Journal of Molecular Biology.
1998;(2):449-466
[51] Huang SY, Zou X. Ensemble
docking of multiple protein structures:
Considering protein structural variations
in molecular docking. Proteins.
2006;(2):399-421
[52] Prieto-Martínez FD, Arciniega M,
Medina-Franco JL. Acoplamiento
Molecular: Avances Recientes y Retos.
TIP RECQB. 2018. [cited 2022 May
15];21. Available from:
http://tip.zaragoza.unam.mx/index.php/tip/article/view/143
[53] Guedes IA, de Magalhães CS,
Dardenne LE. Receptor–ligand molecular
docking. Biophysical Reviews.
2014;(1):75-87
[54] Friesner RA, Banks JL, Murphy RB,
Halgren TA, Klicic JJ, Mainz DT, et al.
Glide: A new approach for rapid,
accurate docking and scoring. 1. Method
and assessment of docking accuracy.
Journal of Medicinal Chemistry.
2004;(7):1739-1749
[55] McGann MR, Almond HR,
Nicholls A, Grant JA, Brown FK.
Gaussian docking functions.
Biopolymers. 2003;(1):76-90
[56] Ewing TJA, Kuntz ID. Critical
evaluation of search algorithms
for automated molecular docking
and database screening. Journal
of Computational Chemistry.
1997;(9):1175-1189
[57] Böhm HJ. The computer program
LUDI: A new method for the de novo
design of enzyme inhibitors. Journal
of Computer-Aided Molecular Design.
1992;(1):61-78
[58] Rarey M, Kramer B, Lengauer T,
Klebe G. A fast flexible docking method
using an incremental construction
algorithm. Journal of Molecular Biology.
1996;(3):470-489
[59] Bentham Science Publisher BSP.
eHiTS: An innovative approach to the
docking and scoring function problems.
CPPS. 2006;(5):421-435
[60] Trott O, Olson AJ. AutoDock Vina:
Improving the speed and accuracy
of docking with a new scoring
function, efficient optimization,
and multithreading. Journal of
Computational Chemistry. 2009
[61] Verdonk ML, Cole JC, Hartshorn MJ,
Murray CW, Taylor RD. Improved
protein-ligand docking using GOLD.
Proteins. 2003;(4):609-623
[62] de Magalhães CS, Almeida DM,
Barbosa HJC, Dardenne LE. A dynamic
niching genetic algorithm strategy
for docking highly flexible ligands.
Information Sciences. 2014;:206-224
[63] Thomsen R, Christensen MH.
MolDock: A new technique for
high-accuracy molecular docking.
Journal of Medicinal Chemistry.
2006;(11):3315-3321
[64] Forli S, Huey R, Pique ME,
Sanner MF, Goodsell DS, Olson AJ.
Computational protein–ligand docking
and virtual drug screening with the
AutoDock suite. Nature Protocols.
2016;(5):905-919
[65] Bentham Science Publisher BSP.
Scoring functions for protein-ligand
docking. CPPS. 2006;(5):407-420
[66] Weiner PK, Kollman PA. AMBER:
Assisted model building with energy
refinement. A general program
for modeling molecules and their
interactions. Journal of Computational
Chemistry. 1981;(3):287-303
[67] Brooks BR, Bruccoleri RE,
Olafson BD, States DJ, Swaminathan S,
Karplus M. CHARMM: A program for
macromolecular energy, minimization,
and dynamics calculations. Journal
of Computational Chemistry.
1983;(2):187-217
[68] van Gunsteren WF, Berendsen HJC.
Computer simulation of molecular
dynamics: Methodology, applications,
and perspectives in chemistry.
Angewandte Chemie (International Ed.
in English). 1990;(9):992-1023

[69] Jorgensen WL, Tirado-Rives J. The
OPLS Potential Functions for Proteins.
Energy Minimizations for Crystals of
Cyclic Peptides and Crambin. p. 10
[70] Parrill AL, Reddy MR. Rational
Drug Design: Novel Methodology
and Practical Applications. American
Chemical Society; 1999 [cited 2022 May
23]. (ACS Symposium Series; vol. 719).
Available from: https://pubs.acs.org/doi/
book/10.1021/bk-1999-0719
[71] Krammer A, Kirchhoff PD,
Jiang X, Venkatachalam CM, Waldman M.
LigScore: A novel scoring function for
predicting binding affinities. Journal
of Molecular Graphics & Modelling.
2005;(5):395-407
[72] Böhm HJ. The development of
a simple empirical scoring function
to estimate the binding constant for
a protein-ligand complex of known
three-dimensional structure. Journal
of Computer-Aided Molecular Design.
1994;(3):243-256
[73] Wang R, Liu L, Lai L, Tang Y.
SCORE: A new empirical method for
estimating the binding affinity
of a protein-ligand complex.
Journal of Molecular Modeling.
1998;(12):379-394
[74] Wang R, Lai L, Wang S. Further
development and validation of empirical
scoring functions for structure-based
binding affinity prediction. Journal of
Computer-Aided Molecular Design.
2002;(1):11-26
[75] Dias R, de Azevedo W.
Molecular docking algorithms. CDT.
2008;(12):1040-1047
[76] Waszkowycz B, Clark DE, Gancia E.
Outstanding challenges in protein–
ligand docking and structure-based
virtual screening. WIREs
Computational Molecular Science.
2011;(2):229-259
[77] Morris GM, Lim-Wilby M. Molecular
docking. In: Kukol A, editor. Molecular
Modeling of Proteins. Totowa, NJ:
Humana Press; 2008. pp. 365-382
[78] Verdonk ML, Taylor RD, Chessari G,
Murray CW. Illustration of current
challenges in molecular docking. In:
Structure-Based Drug Discovery.
Dordrecht: Springer Netherlands; 2007.
pp. 201-221
[79] Janin J, Henrick K, Moult J, Eyck LT,
Sternberg MJE, Vajda S, et al. CAPRI:
A critical assessment of PRedicted
interactions. Proteins. 2003;(1):2-9
[80] Janin J. Protein–protein docking
tested in blind predictions: The CAPRI
experiment. Molecular BioSystems.
2010;(12):2351
[81] Hurle MR, Yang L, Xie Q,
Rajpal DK, Sanseau P, Agarwal P.
Computational drug repositioning:
From data to therapeutics. Clinical
Pharmacology and Therapeutics.
2013;(4):335-341
[82] Scherman D, Fetro C. Drug
repositioning for rare diseases:
Knowledge-based success stories.
Thérapie. 2020;(2):161-167
[83] Xiao H, Bid HK, Chen X,
Wu X, Wei J, Bian Y, et al. Repositioning
Bazedoxifene as a novel IL-6/GP130
signaling antagonist for human
rhabdomyosarcoma therapy. PLoS ONE.
2017;(7):e0180297
[84] Gupta RR. Application of artificial
intelligence and machine learning in
drug discovery. Methods in Molecular
Biology. 2022;:113-124

[85] Thomas M, Boardman A,
Garcia-Ortegon M, Yang H, de Graaf C,
Bender A. Applications of artificial
intelligence in drug design:
Opportunities and challenges. Methods
in Molecular Biology. 2022;:1-59
[86] Zhu T, Cao S, Su PC, Patel R, Shah D,
Chokshi HB, et al. Hit identification and
optimization in virtualscreening: Practical
recommendations based on a critical
literature analysis: Miniperspective.
Journal of Medicinal Chemistry.
2013;(17):6560-6572
[87] Neves BJ, Mottin M, Moreira-
Filho JT, Sousa BK de P, Mendonca SS,
Andrade CH. Best practices for docking-
based virtual screening. In: Molecular
Docking for Computer-Aided Drug
Design. Academic Press (Elsevier); 2021.
pp. 75-98
[88] Lipinski CA, Lombardo F,
Dominy BW, Feeney PJ. Experimental
and computational approaches to
estimate solubility and permeability
in drug discovery and development
settings. Advanced Drug Delivery
Reviews. 2001;(1-3):3-26
[89] Veber DF, Johnson SR, Cheng HY,
Smith BR, Ward KW, Kopple KD.
Molecular properties that influence the
oral bioavailability of drug candidates.
Journal of Medicinal Chemistry.
2002;(12):2615-2623
[90] Neves BJ, Braga RC,
Melo-Filho CC, Moreira-Filho JT,
Muratov EN, Andrade CH. QSAR-
based virtual screening: Advances and
applications in drug discovery. Frontiers
in Pharmacology. 2018;:1275
[91] Fassio AV, Santos LH, Silveira SA,
Ferreira RS, de Melo-Minardi RC.
nAPOLI: A graph-based strategy to
detect and visualize conserved protein-
ligand interactions in large-scale.
IEEE/ACM Transactions on Computational
Biology and Bioinformatics.
2019:1-1
[92] Kurogi Y, Güner OF. Pharmacophore
modeling and three-dimensional
database searching for drug design using
catalyst. Current Medicinal Chemistry.
2001;(9):1035-1055
[93] Dixon SL, Smondyrev AM,
Knoll EH, Rao SN, Shaw DE,
Friesner RA. PHASE: A new engine
for pharmacophore perception, 3D
QSAR model development, and 3D
database screening: 1. Methodology
and preliminary results. Journal of
Computer-Aided Molecular Design.
2006;(10-11):647-671
[94] Chen X, Rusinko A, Tropsha A,
Young SS. Automated pharmacophore
identification for large chemical
data sets. Journal of Chemical
Information and Computer Sciences.
1999;(5):887-896
[95] Schneidman-Duhovny D,
Dror O, Inbar Y, Nussinov R, Wolfson HJ.
PharmaGist: A webserver for ligand-
based pharmacophore detection. Nucleic
Acids Research. 2008;:W223-W228
[96] Fan N, Bauer CA, Stork C, de
Bruyn KC, Kirchmair J. ALADDIN:
Docking approach augmented by
machine learning for protein structure
selection yields superior virtual
screening performance. Molecular
Informatics. 2020;(4):e1900103
[97] Rashidieh B, Molakarimi M,
Mohseni A, Tria SM, Truong H,
Srihari S, et al. Targeting BRF2 in cancer
using repurposed drugs. Cancers.
2021;(15):3778
[98] Berman HM. The protein data
bank. Nucleic Acids Research.
2000;(1):235-242
Molecular Docking - Recent Advances

[99] Chen X. TTD: Therapeutic target
database. Nucleic Acids Research.
2002;(1):412-415
[100] Chen YZ, Zhi DG. Ligand-protein
inverse docking and its potential use
in the computer search of protein
targets of a small molecule. Proteins.
2001;(2):217-226
[101] Wang JC, Chu PY, Chen CM, Lin JH.
idTarget: A web server for identifying
protein targets of small chemical
molecules with robust scoring functions
and a divide-and-conquer docking
approach. Nucleic Acids Research.
2012;:W393-W399
[102] Xie T, Zhang L, Zhang S, Ouyang L,
Cai H, Liu B. ACTP: A webserver for
predicting potential targets and
relevant pathways of autophagy-
modulating compounds. Oncotarget.
2016;(9):10015-10022
[103] Lee A, Kim D. CRDS: Consensus
reverse docking system for target fishing.
Bioinformatics. 2019
[104] Stepanova EE, Balandina SY,
Drobkova VA, Dmitriev MV,
Mashevskaya IV, Maslivets AN. Synthesis,
in vitro antibacterial activity against
Mycobacterium tuberculosis, and
reverse docking-based target fishing of
1,4-benzoxazin-2-one derivatives. Archiv
der Pharmazie. 2021;(2):2000199
[105] Imrie F, Bradley AR, van der
Schaar M, Deane CM. Protein family-
specific models using deep neural
networks and transfer learning
improve virtual screening and highlight
the need for more data. Journal of
Chemical Information and Modeling.
2018;(11):2319-2330
[106] Kitchen DB, Decornez H, Furr JR,
Bajorath J. Docking and scoring in virtual
screening for drug discovery: Methods
and applications. Nature Reviews. Drug
Discovery. 2004;(11):935-949
[107] Yuan S, Chan HCS, Hu Z. Using
PyMOL as a platform for computational
drug design. WIREs Computers
Molecular Science. 2017;(2):70
[108] Koukos PI, Réau M, Bonvin AMJJ.
Shape-restrained modeling of protein–
small-molecule complexes with high
ambiguity driven DOCKing. Journal of
Chemical Information and Modeling.
2021;(9):4807-4818
[109] Koukos PI, Xue LC, Bonvin AMJJ.
Protein–ligand pose and affinity
prediction: Lessons from D3R Grand
Challenge 3. Journal of Computer-Aided
Molecular Design. 2019;(1):83-91
[110] Stanzione F, Giangreco I, Cole JC.
Use of molecular docking computational
tools in drug discovery. In: Progress in
Medicinal Chemistry. Elsevier; 2021.
pp.273-343
[111] Sennhauser G, Amstutz P, Briand C,
Storchenegger O, Grütter MG. Drug
export pathway of multidrug exporter
AcrB revealed by DARPin inhibitors.
PLoS Biology. 2007;(1):e7
[112] Grosdidier A, Zoete V, Michielin O.
SwissDock, a protein-small molecule
docking web service based on
EADock DSS. Nucleic Acids Research.
2011;(suppl):W270-W277
[113] Guex N, Peitsch MC. SWISS-
MODEL and the Swiss-Pdb Viewer:
An environment for comparative
protein modeling. Electrophoresis.
1997;(15):2714-2723