Chem Biol Drug Des. 2023;00:1–17. wileyonlinelibrary.com/journal/cbdd
© 2023 John Wiley & Sons Ltd.

1 | INTRODUCTION
The drug discovery process is very complex and challenging, with a low success rate; an interdisciplinary effort is therefore required for effective commercial drug design and development. This process includes identifying a drug molecule that is therapeutically effective and useful in treating and managing a disease state. Drug discovery and development comprises target molecule identification, synthesis, characterization, screening, and therapeutic measurements. In this process, roughly one molecule is selected from every 2.3 million compounds examined in a research project. Preclinical, clinical, and post-clinical drug discovery and development research requires high budgets and advanced technologies: the average cost of effective drug research and development ranges from $900 million to $2 billion (Zeng et al., 2022), and the time from the discovery of a drug for the treatment of a disease to its release on the market averages 12–15 years (Deore et al., 2019). Also, the success rate of launching a drug from a Phase I clinical trial is daunting, at less than 10% (Deng et al., 2022).
In the last decade, drug discovery has been undergoing
radical transformations driven by rapid development
in artificial intelligence (AI) (Chen et al.,2018; Mater &
Coote, 2019; Schneider,2018; Vamathevan et al.,2019).
Received: 12 November 2022 | Revised: 24 March 2023 | Accepted: 12 April 2023
DOI: 10.1111/cbdd.14262
REVIEW
Explainability and white box in drug discovery
Kevser Kübra Kırboğa 1,2 | Sumra Abbasi 3 | Ecir Uğur Küçüksille 4
1 Bioengineering Department, Bilecik Seyh Edebali University, Bilecik, Turkey
2 Informatics Institute, Istanbul Technical University, Maslak, Turkey
3 Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
4 Department of Computer Engineering, Süleyman Demirel University, Isparta, Turkey
Correspondence
Kevser Kübra Kırboğa, Bioengineering
Department, Bilecik Seyh Edebali
University, Bilecik, Turkey.
Email: kubra.kirboga@yahoo.com
Abstract
Recently, artificial intelligence (AI) techniques have been increasingly used to overcome the challenges in drug discovery. Although traditional AI techniques generally achieve high accuracy, their decision processes and learned patterns can be difficult to explain, which makes the outputs of the algorithms used in drug discovery hard to understand and interpret. Explainable artificial intelligence (XAI) has therefore emerged as a set of processes and methods that make the results and outputs of machine learning (ML) and deep learning (DL) algorithms understandable, so that the causes and consequences of a decision can be traced. This can help further improve the drug discovery process and support sound decisions. Techniques such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) have made the drug-targeting phase clearer and more understandable, and XAI methods are expected to reduce time and cost in future computational drug discovery studies. This review provides a comprehensive overview of XAI-based drug discovery and development prediction, describes XAI mechanisms for increasing confidence in AI and modeling methods, and discusses the limitations and future directions of XAI in drug discovery.
KEYWORDS
artificial intelligence, computational drug discovery, drug development, explainable artificial intelligence
Popular applications of AI in drug discovery include virtual screening (Stumpfe & Bajorath, 2020), reaction prediction (Boström et al., 2018), de novo drug design (Schneider et al., 2020), and retrosynthesis (Deng et al., 2022; Table 1). These applications are powered by various AI techniques, with model designs spanning from common ML models to deep neural networks (DNNs), including convolutional neural networks, recurrent neural networks, graph neural networks, and transformers, as well as other types of networks. In this review, we aim to provide a comprehensive overview of recent XAI (eXplainable Artificial Intelligence) research, highlighting its benefits, limitations, and future opportunities for drug discovery. We first give an overview of critical applications in drug discovery and highlight a collection of previously published perspectives, reviews, and surveys. Relevant techniques, including model architectures and learning paradigms, are then detailed along with information on data and representations. Finally, we discuss current challenges and highlight some future directions.
2 | AI AND XAI IN DRUG DISCOVERY
AI significantly impacts the creation of small-molecule medications thanks to access to new biology, improved or original chemistry, increased success rates, and speedier and less expensive discovery processes (Mak et al., 2022). AI-native drug discovery businesses that offer software or other services to pharmaceutical corporations have been primarily responsible for historical advancement. At various points throughout the value chain, these businesses employ data and analytics to enhance one or more specific use cases. Examples include small-molecule design using generative neural networks, and target finding and validation using knowledge graphs. Large pharmaceutical firms might use partnerships or software licensing agreements to gain access to these capabilities and integrate them into their pipelines. Since the early 2000s, machine learning models such as random forest (RF) have been used for virtual screening (VS) and quantitative structure–activity relationship (QSAR) modeling (Lavecchia, 2015; Ma et al., 2015). Wójcikowski et al. (2019) discussed the use of the R and Python programming languages, features, and regression models, and the creation of RF-Score based on the RF technique.
Potent inhibitors of discoidin domain receptor 1 (DDR1) were discovered in a remarkably short time by Insilico Medicine researchers (Zhavoronkov et al., 2019). At the
same time, AI is used in various applications at different
stages of drug discovery, from target identification and
validation to drug response determination. For instance,
MIT scientists discovered a novel drug candidate against
antibiotic- resistant bacteria in 2020 (Stokes et al.,2020).
Many AI-native drug development businesses have expanded their end-to-end capabilities in the past several years. For instance, Atomwise and Schrödinger established a joint company with a shared portfolio to unite their many platform technologies, while Roivant Sciences bought Silicon Therapeutics (Savage, 2021). The internal resources and investments of the IT behemoths, which are also actively increasing their AI efforts in biology and pharmaceutical research, are significant. For instance, Alphabet has established Isomorphic Labs, building on AI innovations in the DeepMind organization. Another essential line of work, inspired by DeepMind, produced structure predictions with accuracies approaching DeepMind's in the 14th Critical Assessment of Structure Prediction (CASP14) using a three-track network, enabling rapid resolution of challenging structure-modeling problems and the modeling of proteins with currently unknown structures (Baek et al., 2021). Baidu's AI drug discovery division, Sanofi, and Nvidia Clara have likewise invested significantly in several AI technologies and applications (Savage, 2021).
AI's impact on traditional drug discovery is in its early
stages. Still, we have already seen that when layered into
a conventional process, AI- powered capabilities can dra-
matically speed up or improve individual steps and reduce
the costs of running expensive experiments. AI algorithms
can potentially transform most exploratory tasks (such as
molecule design and testing) so that physical experiments
only need to be done when necessary to validate results.
Companies controlling the entire AI-powered discovery process emphasize that their intellectual property supports their assets. They benefit from a network of partners, including CROs and contract development and manufacturing companies, but retain ownership of the molecule. Their investments have potentially significant commercial value through out-licensing, joint ventures (typically after clinical proof of concept), and therapeutic marketing. In addition, the maturing AI-first model has accelerated the transition of AI-native players from software or service providers to asset-owning biotechnology companies.
In addition to this private-sector activity, many studies continue to examine the use of AI in drug discovery. When computational biologists focus on the identification and discovery of new drugs, network-based biology analysis algorithms can identify therapeutic candidates (e.g., for cancer) from molecular networks such as protein–protein interaction networks (Li et al., 2017), gene regulatory networks (Karlebach & Shamir, 2008), metabolic networks (Stelling et al., 2002), and drug–drug interaction networks (Hu & Hayton, 2011; Table 1), from which targets can be identified.
17470285, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/cbdd.14262 by Bilecik Seyh Edebali, Wiley Online Library on [07/06/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
TABLE 1 Use of explainable artificial intelligence in categorized drug studies.

| Application | Algorithms | Technique | References |
| --- | --- | --- | --- |
| Adverse drug reactions | CART decision trees and JRip | Boosting-based feature selection | Bresso et al. (2021) |
| Adverse drug reactions | Random forests (RF), extra random trees (ET), and eXtreme Gradient Boosting machines (XGB) | LIME and Shapley values | Ward et al. (2021) |
| Adverse drug reactions | Support vector machine (SVM) | Shapley values | Joshi et al. (2021) |
| Adverse drug reactions | Recurrent neural network (RNN) | Shapley values | Rebane et al. (2021) |
| Adverse drug reactions | XGB | Shapley values | Zhu et al. (2022) |
| Adverse drug reactions | XGB, CatBoost, AdaBoost, LightGBM, RF, gradient boosting decision tree (GBDT), TPOT | Shapley values | Yu et al. (2021) |
| Adverse drug reactions | GBDT | Shapley values | Imran et al. (2022) |
| Drug repurposing | Knowledge graph | — | Drancé (2022) |
| Drug repurposing | Knowledge graph | Knowledge-based embeddings | He et al. (2022) |
| Drug repurposing | Graphical neural network (GNN) | Drug explorer (meta matrix) | Wang et al. (2022) |
| Drug repurposing | Knowledge graph | Integrated gradients (IG) | Atsuko Takagi et al. (2022) |
| Drug–disease and drug–target interaction | Knowledge graph | — | Wang et al. (2021) |
| Drug–disease and drug–target interaction | GNN | GNN explainer | Pfeifer et al. (2022) |
| Drug–disease and drug–target interaction | Knowledge graphs | Knowledge graph embedding model | Zeng et al. (2020) |
| Drug–drug interactions | Knowledge graph neural network (KGNN) | — | Lin et al. (2020) |
| Drug–drug interactions | Naive Bayes (NB), decision tree (DT), RF, logistic regression (LR), and XGB | Shapley values | Dang et al. (2021) |
| Drug–drug interactions | RF and XGBoost | Shapley values | Hung et al. (2022) |
| Drug design | DT, RF, SVM, XGB, KNN, ANN, RIPPER, RLF | Permutation importance, LIME, Shapley values, integrated gradients, Diverse Counterfactual Explanations (DiCE), Partial Dependence Plot (PDP) + Individual Conditional Expectation (ICE), Accumulated Local Effects (ALE) | Banegas-Luna and Pérez-Sánchez (2022) |
| Drug design | Extreme gradient boosting (GB) | Shapley values | Vangala et al. (2022) |
| Drug design | Naïve Bayes, SVM, tree | Shapley values | Wojtuch et al. (2021) |
| Drug design | XGBoost (XGB), k-nearest neighbor (KNN), extra trees classifier (ETC), support vector machine (SVM), and AdaBoost (ADA) | Shapley values | Akbar et al. (2022) |
| Drug design | Distributed random forest (DRF), extremely randomized trees (XRT), generalized linear model (GLM), XGB, gradient boosting machine (GBM), multilayer artificial neural network, and stacked ensemble models | Shapley values | Czub et al. (2021) |

(Continues)
XAI development frequently aims to make AI intelligi-
ble to humans. All technological ways of understanding,
such as direct interpretability, the generation of an expla-
nation or justification, and the provision of transparent
information, fall within the expansive definition of XAI
(Páez, 2019). Scientists occasionally reach impasses when defining explicability and related concepts such as transparency, interpretability, and intelligibility. We feel that a diverse group of experts, including biologists, computer scientists, doctors, and nurses, should be involved as we focus on XAI applications in the drug development process, since users frequently want a comprehensive understanding of the system and its behavior. The phrase "XAI" was first used in connection with expert systems 50 years ago; the adoption of ML technology has propelled us into its second phase. High-level XAI approaches may be divided into two categories (Guidotti et al., 2019; Lipton, 2016).
1. Select simpler and easier- to- understand models, such
as a decision tree, a rule- based model, or a linear
regression.
2. Select a complicated, opaque model (often called a "black box model"), such as a DNN or a sizable ensemble of trees, and then employ a post-hoc approach to produce explanations.
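The first category can be illustrated with a toy, directly interpretable model in which the prediction and its explanation are the same object. The rules, thresholds, and feature names below are invented for illustration and are not taken from any published drug-discovery model.

```python
def predict_with_explanation(mol):
    """Classify a compound and return the human-readable rule that fired."""
    if mol["mol_weight"] > 500:
        return "inactive", "rule 1: molecular weight > 500"
    if mol["logp"] > 5:
        return "inactive", "rule 2: logP > 5"
    return "active", "default rule: passed all filters"

label, reason = predict_with_explanation({"mol_weight": 320, "logp": 2.1})
```

No post-hoc machinery is needed here: the rule that produced the label is the explanation.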
The term "performance-interpretability tradeoff" is occasionally used to refer to the decision between the two, since opaque models frequently outperform transparent ones at various tasks. However, this decision is not always clear-cut: research has demonstrated that explicitly interpretable models may perform on par with opaque models, especially when given well-structured datasets and significant features (Rudin, 2019). Additionally, a current line of XAI research is creating new algorithms that are both interpretable and performant. When opaque models are nevertheless used, post-hoc XAI techniques must be applied to make them explainable.
Guidotti et al. (2019) categorize post-hoc XAI techniques into global model explanations of the general logic of the model, outcome explanations that focus on a particular model output, and counterfactual analysis that supports understanding how the model would behave with an alternative input. Within these categories, XAI techniques usually produce feature-based explanations to elucidate the inside of the model, or example-based explanations to support case-based reasoning.
It should be highlighted that each category also applies to directly interpretable models: a shallow decision tree can be summarized as a generic statement, used to highlight the specific reasons for a predicted outcome, or probed in several ways for counterfactual analysis. As the examples below show, this is far less straightforward for opaque-box models, which call for other post-hoc methods. A global model explanation gives only a rough picture of how the model functions, because it is hard to comprehend the intricate underlying structure of an opaque model. This is often accomplished by using the same training data to train an immediately interpretable basic model, such as a decision tree, rule set, or regression, and then optimizing it to behave as much like the original model as possible. Outcome explanations work differently: to explain the result of a prediction made on a sample, a set of algorithms can be used to estimate the significance of each sample feature contributing to the prediction. For example, LIME (Local Interpretable Model-agnostic Explanations; Ribeiro et al., 2016a) starts by adding a small amount of noise to the sample to create a set of neighboring samples; it then fits a simple linear model on these neighbors that mimics the behavior of the original model in the local area. The linear model weights can then be used as feature importances to explain the prediction. Another popular algorithm, SHAP
TABLE 1 (Continued)

| Application | Algorithms | Technique | References |
| --- | --- | --- | --- |
| Drug design | Deep neural networks (DNN) | Shapley values | Fan et al. (2022) |
| Drug design | SVM, XGBoost, RF, DNN, GCN, GAT, MPNN | Shapley values | Jiang et al. (2021) |
| Drug design | Convolutional neural network (CNN) | Shapley values | Hosen et al. (2022) |
| Drug monitoring | Support vector regression (SVR), GBRT, RF | Shapley values | Ma et al. (2022) |
| Drug monitoring | RF | Shapley values | Bittremieux et al. (2022) |
| Drug monitoring | LR, least absolute shrinkage and selection operator regression, classification and regression trees, RF, and gradient boost modeling (GBM) | Shapley values | Lin et al. (2022) |
(SHapley Additive exPlanations; Lundberg & Lee, 2017b), identifies feature importance based on Shapley values, a concept from cooperative game theory for assigning credit to each feature. Feature-importance explanations can be displayed to users by visualizing the importance scores or simply describing the features essential to the forecast. Often, rather than asking a detailed "why," people are more interested in questions like "why not a different estimate" or "how can the input be tweaked to achieve a new estimate." Such justifications are especially desired when seeking a remedy or recommendation for an already occurring, frequently undesired event, such as strategies to lessen a patient's anticipated high illness risk. Finally, a variety of algorithms can be employed to generate such actionable explanations; these algorithms frequently identify the modifications needed for a sample to obtain a different estimate, often using the idea of minimal change (Dhurandhar et al., 2018; Lundberg & Lee, 2017b).
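The minimal-change idea behind such counterfactual explanations can be sketched in a few lines: decrease one feature step by step until the prediction flips, and report the smallest change that did so. The toy risk model, feature names, and step size below are hypothetical.

```python
def model(x):
    # Toy risk score: "high" if a weighted sum of features exceeds a threshold.
    return "high" if 0.6 * x["bmi"] + 0.4 * x["age"] > 40 else "low"

def counterfactual(x, feature, step, max_steps=100):
    """Decrease one feature until the prediction flips; report the minimal change."""
    original = model(x)
    for i in range(1, max_steps + 1):
        cand = dict(x)
        cand[feature] = x[feature] - i * step
        if model(cand) != original:
            return {feature: cand[feature]}, i * step
    return None, None

patient = {"bmi": 35, "age": 50}
change, delta = counterfactual(patient, "bmi", step=1)
```

The returned change is exactly the kind of actionable statement discussed above: the smallest adjustment that would move the patient out of the "high" risk class.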
Instance-based approaches frequently draw their examples from the data. The possible risk of adopting approximate post-hoc procedures for explanations, rather than an interpretable model, has long been discussed: such approximations invariably miss certain edge cases or even fail to reproduce results exactly as the original model intended (Dhurandhar et al., 2018). Furthermore, beyond the practical reasons for using opaque-box models cited earlier, there is a pragmatic debate concerning the many communication tools individuals employ to gain "sufficient comprehension" for a given objective. Outlining a causal chain, for example, may be required to develop a firm diagnosis of a condition, whereas rough principles or case-based reasoning may be adequate and less mentally taxing if one only wants to make predictions. It may also be claimed that such methods are a required sort of translation to connect model and person when they have differing epistemic access. A new area of XAI study focuses on creating human-consumable explanations by curating human descriptions (Codella et al., 2018; Ehsan et al., 2019; Kim et al., 2018). This translates model reasoning into meaningful human explanations that apply to the exact prediction. An explanation of this type is ultimately a guess; still, it may be helpful to ordinary people who have trouble understanding how machine learning (ML) models work but want an idea of the validity of their predictions. However, AI developers are responsible for understanding, mitigating, and transparently communicating the limitations of approximate explanations to stakeholders. For example, an explainability metric known as fidelity can detect erroneous post-hoc explanations (Alvarez-Melis & Jaakkola, 2018). However, we must acknowledge that this is an actively researched topic and that there is still a lack of principled approaches to identifying and communicating the limitations of post-hoc explanations.
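As a rough sketch of how a fidelity metric can be computed, the fraction of samples on which a post-hoc surrogate agrees with the original model can serve as a first check. Both models below are toy stand-ins for illustration, not a real XAI pipeline.

```python
import random

def original_model(x):
    # Stand-in for an opaque model: a nonlinear decision boundary.
    return int(x[0] * x[1] > 0.25)

def surrogate(x):
    # Stand-in for a simple post-hoc surrogate: a linear boundary.
    return int(x[0] + x[1] > 1.0)

def fidelity(f, g, samples):
    """Fraction of samples on which the surrogate agrees with the original."""
    return sum(f(s) == g(s) for s in samples) / len(samples)

random.seed(0)
samples = [(random.random(), random.random()) for _ in range(1000)]
score = fidelity(original_model, surrogate, samples)
```

A low score would flag exactly the kind of erroneous post-hoc explanation the fidelity metric is meant to detect.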
Drug discovery is a field that includes medicinal chemistry, and in medicinal chemistry XAI provides increased reliability and interpretability of predicted drug effects. Thanks to deep learning (DL) models, the reliability of the resulting models has increased, and it is possible to associate biological effects with physicochemical properties and to derive accurate and appropriate models from this relationship. Ultimately, XAI aims to reveal what is done, how it is done, and the related information in drug discovery (Holzinger et al., 2022; Polzer et al., 2022). Given the importance of explainability, XAI is emerging as a collection of AI methods focused on generating outputs and recommendations that human experts can understand and interpret. Currently, the AI community focuses on developing XAI methods that balance transparency, explainability, power, performance, and accuracy (Gunning, 2017).
2.1 | SHAP
The SHAP method finds and prioritizes the characteristics that affect how an ML model classifies compounds and estimates their activities. A variant for the precise computation of Shapley values for decision tree techniques, rigorously contrasted with the model-agnostic SHAP method in estimations of compound activity and potency, has advanced the SHAP approach in drug development. Some studies present a theoretically grounded, model-agnostic interpretation technique for ML models of any complexity used for activity prediction. The SHAP approach (Lee et al., 2009) is an extension of LIME (Ribeiro et al., 2016b) in which feature weights are represented as Shapley values from game theory (Kuhn & Tucker, 1953). SHAP can interpret activity estimates from complex ML models: features that increase or decrease the probability of predicted activity are mapped onto molecular graphs to identify and visualize the structural patterns that determine the predictions (Feldmann et al., 2021). In theory, it has been shown that the payout values sum to the model estimate and represent the mean of all associated marginal contributions (Joseph, 2019; Lundberg & Lee, 2017a). Shapley values, which play an essential role in improving the explainability of ML models, enable a model to be evaluated without specifying the functional forms of its components. The functions and property variables in an ML model are related as follows:
ϕ(f(X_i)) ≡ ϕ_0 + Σ_{k=1}^{K} ϕ_k(X_i),  ∀ i = 1, …, n,   (2.1)

where k denotes a single property variable, K denotes the total number of explanatory variables available, and n is the total number of units; ϕ ∈ R^K, ϕ_k ∈ R, and ϕ_k(X_i) are the local Shapley values.
As a general definition, Shapley values (SVs) add accountability to the model's complex decision-making process, revealing more clearly how the model uses its features.
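For a small model, the Shapley values in Equation (2.1) can be computed exactly by enumerating all feature coalitions; the toy model, baseline, and input below are illustrative inventions, and real SHAP implementations approximate this sum, which grows as 2^K.

```python
from itertools import combinations
from math import factorial

def f(x):
    # Toy model with an interaction between the second and third features.
    return 3 * x[0] + 2 * x[1] * x[2]

baseline = [0.0, 0.0, 0.0]  # reference input; f(baseline) plays the role of phi_0

def value(subset, x):
    """Model output with features in `subset` taken from x, others from baseline."""
    z = list(baseline)
    for j in subset:
        z[j] = x[j]
    return f(z)

def shapley(x, k, n=3):
    """Exact Shapley value of feature k by enumerating all coalitions."""
    others = [j for j in range(n) if j != k]
    phi = 0.0
    for size in range(n):
        for S in combinations(others, size):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += weight * (value(set(S) | {k}, x) - value(S, x))
    return phi

x = [1.0, 2.0, 3.0]
phi0 = f(baseline)
phis = [shapley(x, k) for k in range(3)]
# phi0 + sum(phis) reproduces f(x), matching the additivity in Eq. (2.1).
```

Note that the interaction term's credit is split evenly between the two features involved, which is exactly the "mean of all marginal contributions" property described above.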
In feature analysis and ML model interpretation, SVs quantify the contributions of specific features of a given representation to a predicted result. For large feature sets, the explicit calculation of SVs over all possible feature coalitions becomes computationally expensive. This limitation may be overcome using the SHAP model, which generates a local interpretation model for each prediction that closely resembles the original ML model in the relevant regions of feature space (Lundberg & Lee, 2017c).
Regardless of the complexity of an ML model, SHAP
calculations make it feasible to quantify the contributions
of certain chemical features to a successful or unsuccess-
ful prediction for a variety of compound activity prediction
tasks (Rodríguez- Pérez & Bajorath,2019, 2020b). Therefore,
SHAP can be applied to any ML algorithm, including DL
methods. Importantly, for decision tree methods, an algo-
rithm for the precise calculation of local SVs has recently
been presented (Lundberg et al., 2020). They showed that SHAP values and these exactly computed local SVs were strongly correlated in predictions of compound activity for both tree-based classification and regression models (>80%; Feldmann et al., 2021; Rodríguez-Pérez & Bajorath, 2020b).
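A sketch of how per-feature Shapley values are turned into the kind of structural interpretation described above: features are ranked by the magnitude of their signed contribution, and positive and negative contributors are reported separately. The fingerprint-bit names and values below are invented for illustration.

```python
# Hypothetical per-feature Shapley values for one compound's activity prediction.
shap_values = {
    "aromatic_ring": +0.32,
    "nitro_group": -0.18,
    "halogen": +0.05,
    "long_alkyl_chain": -0.41,
}

# Rank features by the magnitude of their contribution.
ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
activity_up = [name for name, v in ranked if v > 0]    # push toward "active"
activity_down = [name for name, v in ranked if v < 0]  # push toward "inactive"
```

In practice, these signed contributions are what gets mapped back onto atoms and substructures of the molecular graph.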
2.2 | LIME
LIME is a technique that fits a local, interpretable model to each prediction of any black-box machine learning model. Models must be understandable to consumers for them to trust AI systems. AI interpretability sheds light on these systems' operations and aids in detecting possible problems, including information leakage, model bias, lack of robustness, and spurious causality. LIME offers a generic framework for deciphering black boxes and explains the "why" behind predictions or suggestions made by AI. LIME attempts to fit a straightforward model around a single observation that replicates the behavior of the global model in that neighborhood. The predictions of the more sophisticated model may then be locally explained using the basic model (Ribeiro et al., 2016a).
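LIME's core idea can be sketched from scratch for a single numeric feature: perturb the instance, query the black-box model, and fit a distance-weighted linear model whose slope serves as the local explanation. The black-box model, kernel, and widths below are illustrative choices, not the library's actual implementation.

```python
import math
import random

def black_box(x):
    # Opaque model: nonlinear in x, so one global line cannot explain it.
    return x ** 2

def lime_slope(x0, n=500, scale=0.2, width=0.2):
    """Fit a distance-weighted linear surrogate around x0; return its slope."""
    random.seed(42)
    xs = [x0 + random.gauss(0.0, scale) for _ in range(n)]   # perturbed neighbors
    ys = [black_box(xi) for xi in xs]                        # black-box queries
    ws = [math.exp(-((xi - x0) ** 2) / width ** 2) for xi in xs]  # proximity kernel
    sw = sum(ws)
    xbar = sum(w * xi for w, xi in zip(ws, xs)) / sw
    ybar = sum(w * yi for w, yi in zip(ws, ys)) / sw
    num = sum(w * (xi - xbar) * (yi - ybar) for w, xi, yi in zip(ws, xs, ys))
    den = sum(w * (xi - xbar) ** 2 for w, xi in zip(ws, xs))
    return num / den

slope = lime_slope(3.0)  # close to the true local gradient of x**2 at x = 3
```

The fitted slope approximates the local gradient of the opaque model, which is the weight LIME would report as the feature's importance at this instance.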
2.3 | Deficiencies of AI and why do we need XAI?
AI addresses various issues by supplying input data samples and the predicted output from neural networks (Gunning, 2017). Definite mathematical rules are used systematically to alter network weights (think of them as numerical knobs adjusted to achieve the desired outcome). These values, which translate an input query into an output response, are what the network truly learns. Image categorization is a common application for neural networks (West, 2018), and for many datasets the performance of neural networks on this task is well documented. These collections contain photos ranging from numbers to flora and wildlife, people, vehicles, motorcycles, aeroplanes, and other items (Goodman & Flaxman, 2017). The image classification process involves taking an input picture and passing it through a trained network, which applies a series of transformations to produce a single output class.
Natural language is another practical use of neural net-
works. Translation, language modeling, text categoriza-
tion, question answering, named entity identification, and
dependency parsing are all areas where neural networks
thrive. Google replaced traditional NLP approaches with
an LSTM- based algorithm for its translation service in late
2016. This free service is offered worldwide and is utilized
daily by over 200 million individuals (Castelvecchi,2016).
Although neural networks have been employed to tackle various issues, it is unclear what the network truly learns (Lipton, 2018). Attempts have been made to depict the learned weights to give us an understanding of the acquired information; however, interpreting the significance of these adjusted knobs remains an open research problem. Employing these networks on real-world issues is dangerous without comprehending what information is genuinely learned (Arrieta et al., 2020; Preece et al., 2018). This issue is conceptually related to the XAI approach, which is often considered necessary for applying AI models: for users to adequately comprehend, trust, and control powerful AI technologies, XAI is crucial (Gunning et al., 2019).
2.3.1 | Scale, growth, diversity
Although it has been studied for decades, XAI has recently seen an upsurge in popularity comparable to that of AI itself. The ability of XAI to connect applications of AI with the people who create or use them has also garnered much attention in recent years. Several XAI support strategies have been put forth, and the importance of XAI in human-machine contexts has been emphasized (Adadi & Berrada, 2018; Guidotti et al., 2018; Rosenfeld et al., 2019). XAI has been a recurring topic in other AI settings, such as expert systems (Swartout et al., 1991), answer set programming (Fandinno et al., 2019), and planning (Chakraborti et al., 2020). Data diversity, which relates to an algorithmic model's capacity to ensure that all types of objects are represented in its output, has lately received much attention (Drosou et al., 2017). Diversity may therefore be viewed as a measure of the quality of a set of items that, when they manifest as a model's output, characterizes the model's tendency to produce a spectrum of outputs rather than narrow predictions. Diversity is crucial in applications focusing on people, where ethical constraints apply to AI modeling (Lerman, 2013).
Similarly, several AI problems seek to generate diverse recommendations rather than high-scoring but similar outcomes (Agrawal et al., 2009). In such instances, XAI techniques may be useful in verifying the model's capabilities without compromising the diversity of the input data at its output. Learning techniques that give a model diversity-keeping skills might be complemented by XAI approaches that shed light on the model internals and assess the effectiveness of such tactics with respect to the diversity of the training data. On the other hand, XAI could make it simpler to spot the model components affecting its ability to preserve variety.
2.3.2 | Transparency
Black box to white box, inherent explanations, under-
standability, and comprehensibility are other phrases
that refer to transparency. These phrases allude to a de-
tailed explanation of the AI model's internal operations'
mathematical, statistical, and computational elements.
Mathematicians, computer scientists, or statisticians
might be able to comprehend this explanation. However,
it frequently serves no purpose for the doctor or the pa-
tient. The “transparency” of an AI in banking, for instance,
determining if a consumer qualifies for a loan, is based
on the equation ln(p(x)1p(x)) = 0 + 1 × 1. AI is predicated
on exchanging the model's interpretability for its perfor-
mance in terms of the precision of the AI's predictions
or homework assignments (Došilović et al.,2018). These
models, for instance, artificial neural networks (ANN) or
RFs, employ a significant number of nonlinear functions
as neurons or decision trees coupled by thousands of cou-
pling factors, such as synapses and weights, respectively,
which in DL can be organized in many layers. The many
thousands or millions of internal parameters for these
models can be optimized thanks to the power of modern
computers, up to and including current PCs, to approxi-
mate the functions that map the high- dimensional multi-
variate input data to multivalued outputs such as various
diagnostic classes. In this approach, explaining the indi-
vidual components (neurons, trees) precisely is possible,
but it is impossible to comprehend the system's aggregate
behavior. This is comparable to how it is impossible to
describe concepts or ideas through brain activity. This is
regarded as one of the essential characteristics of emer-
gent systems in systems theory (Zhang et al.,2018).
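The banking example above reduces to a logistic regression, a white-box model whose decision rule can be read off coefficient by coefficient as contributions to the log-odds. A minimal scikit-learn sketch; the loan data and the feature names are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy loan-approval data: columns are [income, debt_ratio] (both hypothetical).
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
              [0.2, 0.8], [0.3, 0.9], [0.1, 0.7]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = loan approved

model = LogisticRegression().fit(X, y)

# White-box reading: ln(p/(1-p)) = b0 + b1*income + b2*debt_ratio,
# so each coefficient is a per-feature contribution to the log-odds.
for name, coef in zip(["income", "debt_ratio"], model.coef_[0]):
    print(f"{name}: {coef:+.3f}")
```

On this separable toy set, the income coefficient comes out positive and the debt-ratio coefficient negative, which is exactly the kind of direct reading that is unavailable for the subsymbolic models discussed next.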
2.3.3 | Justification
The XAI approach aims to satisfy particular demands, objectives, expectations, and interests related to artificial systems (Tjoa & Guan, 2020). The concept of XAI originated in the computer science community, which built the approach to provide a technical solution to current issues (Arrieta et al., 2020). The rising adaptability of AI adds one more complex layer to human–computer interaction (HCI). At the same time, XAI plays a significant role in the cognitive and behavioral dimensions of AI-associated decision-making, and it constitutes a novel line of inquiry (Tjoa & Guan, 2020; Zhu et al., 2018). The current era needs XAI: although neural networks are used to solve many problems, a core understanding of how a given network works is still lacking, and AI inherits the biases of the dataset it is given. Neural networks therefore require transparency, and justification must accompany the predictions they generate (Zhu et al., 2018). The broad spectrum of Explainable AI applications can be viewed through various lenses. The approach ultimately depends on convincing humans: researchers feel more confident when multiple models support the same prediction. Moreover, the network should be capable of generating reasoning that supports the predicted features.
2.3.4 | Informativeness
The process of developing novel medications is characterized by whether pharmacological activity can be derived from molecular structure and which components of that structure are relevant. However, the added difficulty, and the occasionally ill-posed problems, of multi-objective design lead to molecular architectures that frequently represent compromises. A practical method reduces the syntheses and assays required to discover novel hits and optimize potent leads, particularly when the experiments involved are laborious and costly.
By enabling decision- making while concurrently con-
sidering medicinal chemistry expertise, model logic, and
awareness of the system's limits, XAI- assisted drug de-
sign is intended to help address some of these problems
(Liao et al.,2020). For instance, clinical decision support
systems that help doctors with diagnostic or therapeutic
duties require the informativeness of AI. These AI- based
systems are employed more often in clinical settings, with
a current emphasis on medical imaging. For example,
17470285, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/cbdd.14262 by Bilecik Seyh Edebali, Wiley Online Library on [07/06/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
8
|
KIRBOĞA et al.
DNNs may be trained on magnetic resonance imaging to recognize aberrant brain areas and detect Alzheimer's disease.
To make decisions utilizing these data more accessible,
DNNs were also implemented in clinical imaging (Zhang
et al.,2018). However, the precise processes by which a
diagnosis is determined for a specific patient are still un-
known, even though the findings of these analyses seem
plausible to a medical professional since they are congru-
ent with medical knowledge and, as such, might be com-
municated to the patient. DNNs and RFs are subsymbolic
classifiers in the sense described above; thus, a doctor can-
not fully understand and effectively explain to a patient
the hundreds or thousands of decision trees that make
up these systems. Informativeness aims to provide more
straightforward representations of what an AI performs
on the inside so that the user may learn more from this
abstraction (Carpenter & Huang,2018; Rudin,2019).
2.3.5 | Uncertainty estimation
Another method of interpreting models that quantify
the epistemic error in a prediction is uncertainty esti-
mation. DNNs are ineffective at estimating uncertainty
compared to other ML techniques, such as Gaussian pro-
cesses (Nguyen et al.,2015; Rasmussen,2003). For this
reason, numerous initiatives have been made to quantify
uncertainty in predictions made using neural networks.
Predictions of opaque black- box systems are commonly
used in high- stakes applications, including banking,
healthcare, and criminal justice (Adadi & Berrada,2018).
Posthoc counterfactual explanations can give users valu-
able and actionable information about the inner work-
ings of black- box models (Byrne, 2019). While the XAI
community has offered numerous strategies for generat-
ing counterfactual explanations, much less attention has
been dedicated to investigating the uncertainty of these
explanations (Karimi et al.,2021; Keane et al.,2021). By
giving users uncertainty estimates on counterfactual ex-
planations, we may avoid giving them overconfident and
perhaps dangerous options, which can help them make
better decisions and increase their trust in intelligent sys-
tems (Bhatt et al.,2021; Jesson et al., 2020). Moreover,
recent user tests have shown that consumers are more
likely to concur with a model's forecast when given the
appropriate predictive uncertainty (McGrath et al.,2020),
further driving the need for solutions. These instances
have made it abundantly evident that understanding the
uncertainty of proposed explanations is a crucial first step
in developing a valuable and reliable resource, especially
in high- stakes real- world prediction tasks (Upadhyay
et al.,2021).
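One simple, model-agnostic route to the uncertainty estimates discussed above is a bootstrap ensemble: train several models on resampled data and read the disagreement between members as epistemic uncertainty. This is a minimal sketch of one approach among several (alongside, e.g., Gaussian processes); the task and data are synthetic:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Synthetic activity data: one descriptor, noisy nonlinear response.
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.1, size=200)

# Bootstrap ensemble: each member is trained on a resampled dataset,
# so the spread of member predictions reflects epistemic uncertainty.
ensemble = []
for _ in range(30):
    idx = rng.integers(0, len(X), len(X))
    ensemble.append(DecisionTreeRegressor(max_depth=4).fit(X[idx], y[idx]))

def predict_with_uncertainty(x):
    preds = np.array([m.predict(x) for m in ensemble])
    return preds.mean(axis=0), preds.std(axis=0)

mean, std = predict_with_uncertainty(np.array([[0.0]]))
print(f"prediction {mean[0]:.2f} +/- {std[0]:.2f}")
```

The same mean-and-spread pattern applies to counterfactual explanations: reporting the spread alongside the point estimate is what guards the user against overconfident recommendations.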
Finding novel medications that can cure or prevent
a specific condition is the aim of drug discovery. Drugs
come in various forms, but most are small molecules that can bind precisely to a target molecule, typically a protein implicated in a disease. In the past, scientists have
combed through vast libraries of compounds to find can-
didates that may one day become drugs. Although rational
structure- based drug design has gained popularity over
time, it still necessitates several strategies, synthesis, and
testing phases. However, since it is sometimes challenging
to foresee which chemical construct will have the desired
biological effects and the qualities required to be a suc-
cessful medicine, the drug development process continues
to be costly and time-consuming. Even if a novel medicine candidate performs well in preclinical tests, it may not succeed in human trials; fewer than 10% of candidates that enter Phase I studies ultimately reach the market. Given
this, it is understandable why researchers are looking to
artificial intelligence's superior data processing capabili-
ties to expedite and lower the cost of drug discovery. In
addition, AI technologies can speed up medication devel-
opment, stimulate innovation, improve the effectiveness
of clinical trials, and regulate drug dosage.
AI may be able to study and adjust chemical character-
istics in de novo molecular design more fully and swiftly
than teams of scientists using conventional techniques.
One of the difficulties in AI- driven de novo drug creation,
in addition to the invention of new chemical compounds,
is synthetic feasibility, or the capacity to synthesize the
substance. New XAI models, called neuro- symbolic mod-
els, have been developed. These latest AI models are inher-
ently transparent, meaning users can find and understand
the mechanisms that lead to a prediction without having
to modify, simulate, or process any information about the
model's functioning. Neuro-symbolic models combine symbolic and statistical learning (Das et al., 2017). This
combination enables a neural network to make reliable
predictions, reinforced by the transparency offered by log-
ical principles understandable to humans. The potential
for interaction between the model and users throughout
the learning process is one of the numerous benefits of
employing these neuro- symbolic models. If some bias is
found, models that explain their function may be changed
more quickly. To estimate the likelihood of prostate can-
cer (PCa) and clinically significant PCa (csPCa) in 2020,
Suh et al. built and validated XGBoost- based XAI models.
They also included these models in a web- based structure,
giving doctors simple access to decision guidance before
a prostate biopsy (Suh et al., 2020). In 2022, Kırboğa et al. discussed, using XAI explained through SHAP (Shapley Additive Explanations) values, the impacts on potential medications that may be developed for Friedreich Ataxia (FA/FRDA) (Kırboğa et al., 2022). Such analyses substantially increase the understanding of intricate biological processes, and long-term intervention studies that might disclose gene-editing methods to aid drug discovery are an intriguing prospect. Therefore, with gene expression data (GED) analyses that incorporate XAI, information extraction and functional validation investigations become accessible for discovering physiologically meaningful sequential patterns (Anguita-Ruiz et al., 2020). Because they make it easier to
understand which components of the inputs utilized by
the underlying supervised learning approach are essential
to a given prediction, feature attribution methods are popular options in the explainable AI toolkit. These strategies
often include coloring molecular graphs in the context
of molecular design, and when presented to medicinal
chemists, they can help them choose which compounds
to synthesize or prioritize. Understanding ML models in
drug design is predicted to be aided by the consistency of
highlighted portions and previous specialist knowledge.
However, quantitative analysis of such coloring techniques has thus far focused exclusively on substructure identification tasks (Jiménez-Luna et al., 2022). Using
the suggested benchmark, they discovered that molecular
coloring techniques associated with traditional ML mod-
els frequently outperformed recent alternatives utilizing
graph neural networks. Moreover, they anticipate that the
benchmark data, which is open source, will make it easier
to evaluate recently created molecular feature attribution tools (Jiménez-Luna et al., 2022). Jiménez-Luna et al. emphasized the mathematical methods behind these estimates in their report, while Vo et al. (2022) prepared a review of XAI for drug–drug interactions. In the present review, while the mathematical background of the algorithms is explained, the focus is on their use in fields such as drug–drug and drug–target interactions. The primary objective of machine
learning (ML) in medicinal chemistry is the prediction of
compound characteristics from chemical structures. In
applications like compound screening, virtual library enumeration, or generative chemistry, ML is frequently applied to enormous datasets. While desirable, a thorough comprehension of ML model judgments is typically not required in these circumstances. By contrast, compound optimization efforts rely on small datasets to spot structural adjustments that result in desirable property profiles. If ML is used in this setting, practitioners are frequently hesitant to make choices based on predictions that cannot be explained. Only a select number of ML techniques are inherently understandable; however, explanatory methods may be used to learn more about the decisions of sophisticated ML models (Figure 1; Rodríguez-Pérez & Bajorath, 2021).
Graph neural networks can complete specialized drug
discovery tasks, including predicting chemical properties
and creating brand- new molecules. These models, how-
ever, are regarded as “black boxes” and “hard to debug.”
One study used an integrated gradients XAI technique for graph neural network models to increase modeling transparency for rational molecular design. Models were
trained to forecast cytochrome P450 inhibition, passive
permeability, human ether- a- go- go related gene (hERG)
channel inhibition, and plasma protein binding. The sug-
gested technique focused on structural and molecular
characteristics aligning with well- known pharmacoph-
ore patterns, revealing information on specified property
gaps and general ligand- target interactions. To train new
models at different clinically pertinent endpoints, practi-
tioners may use the created XAI technique, which is en-
tirely open- source (Figure1; Jiménez- Luna et al.,2021).
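The integrated-gradients idea referenced above can be sketched independently of any graph neural network: the attribution of each input is the input-minus-baseline difference scaled by the gradient averaged along a straight path from baseline to input. A toy differentiable model `f` stands in for a trained property predictor (it is not the authors' model):

```python
import numpy as np

def f(x):
    # Stand-in for a trained property-prediction model (not a real GNN).
    return np.sum(x ** 2)

def grad_f(x):
    return 2 * x  # analytic gradient of the toy model

def integrated_gradients(x, baseline, steps=100):
    # Average the gradient along the straight path from baseline to x,
    # then scale by the input difference (a midpoint Riemann sum).
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.array([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, -2.0, 0.5])
attr = integrated_gradients(x, baseline=np.zeros(3))

# Completeness axiom: attributions sum to f(x) - f(baseline).
print(attr, attr.sum(), f(x) - f(np.zeros(3)))
```

The completeness property, attributions summing exactly to the change in model output, is what makes the method attractive for auditing property-gap predictions atom by atom.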
ML models that discriminate between active and inactive substances are trained to detect structural patterns in qualitative or quantitative structure–activity relationship (SAR) investigations. Model decisions might be challenging to grasp yet are essential for guiding compound design. Interpreting machine learning outcomes provides extra
model validation based on expert knowledge. Many so-
phisticated ML methods, especially DL architectures,
have a recognizable "black box" quality. SHAP, a local, model-agnostic explanation technique, is presented in the study to rationalize activity predictions of any ML algorithm, independent of its complexity. To comprehend the models
produced by DNNs, nonlinear support vector machines
(SVM), and RF learning, structural patterns that are used
to predict the likelihood of activity are found and mapped
on test substances. The findings demonstrate that SHAP
can significantly justify the predictions of sophisticated
ML models (Figure1; Rodríguez- Pérez & Bajorath,2020a).
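SHAP's attributions are Shapley values from cooperative game theory; on tiny models they can be computed exactly by enumerating feature coalitions. The brute-force sketch below is didactic only (the shap library uses far more efficient estimators such as TreeSHAP), and the "activity model" and baseline are invented:

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    # predict: function of a full feature vector; features outside the
    # coalition are set to the baseline (a common, simple value function).
    n = len(x)
    def v(S):
        z = list(baseline)
        for i in S:
            z[i] = x[i]
        return predict(z)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Shapley weight |S|! (n-|S|-1)! / n! for coalition size k.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (v(S + (i,)) - v(S))
        phi.append(total)
    return phi

# Toy "activity model": linear with an interaction term.
model = lambda z: 2.0 * z[0] + 1.0 * z[1] + z[0] * z[1]
phi = shapley_values(model, x=[1.0, 1.0], baseline=[0.0, 0.0])
print(phi, sum(phi))  # local accuracy: phi sums to f(x) - f(baseline)
```

The local-accuracy property shown in the final line is the SHAP guarantee that makes per-feature contributions add up to the prediction being explained.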
Finding substances with advantageous pharmacolog-
ical, toxicological, and pharmacokinetic characteristics
continues to be difficult for drug development. DL pro-
vides robust tools to create prediction models appropriate
for expanding data sets. Still, there is a widening divide be-
tween what these neural networks learn and what people
can understand. Additionally, this gap may lead to vulner-
ability and limit the practical implementation of DL appli-
cations. Finally, one study introduced Attentive FP (attentive fingerprint for molecular representation), a new graph neural network architecture that leverages a graph attention mechanism to learn from relevant drug discovery datasets. The authors demonstrate that Attentive FP achieves state-of-the-art prediction results on various datasets and that the information it picks up can be interpreted. Its feature visualization shows that, by automatically learning non-local intramolecular interactions for a given task, it can yield direct chemical insights beyond the scope of human intuition (Xiong et al., 2020). One of
the interesting recent studies is that of Gimeno et al. on multi-dimensional module optimization (MOM). They applied MOM to an acute myeloid leukaemia (AML) cohort of 122 screened drugs and 319 ex-vivo tumor samples with whole-exome sequencing (WES), and successfully validated their results in three large-scale screening experiments. In this way, they demonstrated that XAI can help healthcare providers and drug regulators better understand AI medical decisions (Gimeno et al., 2022). It has been shown that machine
learning models and scoring functions that simplify the screened Coulomb and Lennard-Jones interactions between ligands and residues of the target receptor can significantly improve classification ability in virtual screening, making it easier to identify active ligands with a simplified scoring method (Shimazaki & Tachikawa, 2022).
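As a rough illustration of the pairwise terms behind such scoring functions, the classical (unscreened) Coulomb and Lennard-Jones energies between ligand and residue atoms can be computed as follows. The screened forms used by Shimazaki and Tachikawa are not reproduced here; coordinates, charges, and parameters are invented:

```python
import numpy as np

def pair_energies(coords_a, charges_a, coords_b, charges_b,
                  epsilon=0.2, sigma=3.4):
    # Classical pairwise terms between ligand atoms (a) and receptor-residue
    # atoms (b); units and parameter values are purely illustrative.
    coulomb = 0.0
    lj = 0.0
    for ra, qa in zip(coords_a, charges_a):
        for rb, qb in zip(coords_b, charges_b):
            r = np.linalg.norm(np.asarray(ra) - np.asarray(rb))
            coulomb += qa * qb / r                     # q_i q_j / r
            sr6 = (sigma / r) ** 6
            lj += 4.0 * epsilon * (sr6 ** 2 - sr6)     # 4e[(s/r)^12 - (s/r)^6]
    return coulomb, lj  # two features a simple, white-box classifier could use

c, l = pair_energies([(0.0, 0.0, 0.0)], [+0.5],
                     [(4.0, 0.0, 0.0)], [-0.5])
print(c, l)
```

Collapsing the full interaction into a handful of such physically meaningful features is what makes the resulting classifier easy to inspect.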
Explainable models have been proposed to obtain more transparent and understandable predictions in drug studies. In Table 1, we review the literature on adverse drug reactions, drug repurposing, drug–disease interactions, drug design, drug–drug interactions, and more.
3 | OPPORTUNITIES AND CHALLENGES OF XAI
XAI can educate the general population about how
AI functions. Although there has been much study on
AI in the public sector, XAI has received less atten-
tion. The concept behind XAI is that humans would be
more likely to accept expert system recommendations
if they could understand them (Swartout et al., 1991;
Swartout & Moore,1993). XAI frequently contrasts with
opaque, black- box techniques that leave unclear how or
why a decision was made. The accuracy of AI models will increase as they become more sophisticated, but this may come at the cost of their capacity to be understood (Xu et al., 2019). Explainability is an intuitively
appealing concept but difficult to realize. Belle and
Papantonis (2021) offer four suggestions for creating
FIGURE 1 Studies on the use of XAI in drug discovery. The methodology proposed in (a) highlighted molecular features and structural elements compatible with known pharmacophore motifs, providing information on accurately defined property gaps and non-specific ligand–target interactions. Reprinted (adapted) with permission from (Jiménez-Luna et al., 2021). (b) A study demonstrating the high potential of SHAP to rationalize predictions of complex ML models. Reprinted (adapted) with permission from (Rodríguez-Pérez & Bajorath, 2020a). (c) A study showing that descriptive approaches can be applied to gain insights into complex ML model decisions. Permission from (Rodríguez-Pérez & Bajorath, 2021). (d) A study that used and compared multiple XAI methods on projects with well-established SARs, existing X-ray crystal structures, and lead optimization datasets. Reprinted (adapted) with permission from (Harren et al., 2022).
explainability, such as explanation by simplifying, de-
scribing the contribution of each feature to decisions,
explaining one example rather than a general one, and
using graphical visualization methods for explanations.
They also discuss the complexity of implementing such
proposals. Simplifications may not be accurate, features
may be related, local explanations may fail to provide
the whole picture, and graphical visualization requires
assumptions about the data that may not necessarily be
true. Explainability is assumed to create transparency and trust in AI. However, trust can be affected differently than expected, and situational factors also influence it (Bannister & Connolly, 2011b). Transparency can both increase and decrease trust (Bannister & Connolly, 2011a); similarly, XAI can increase or decrease confidence. Therefore, explainability must be better understood, and strategies are required to build confidence in XAI.
In drug discovery, a domain affected by false predictions, monitoring results reduces the impact of false outcomes, and identifying their root cause improves the underlying model. An explainable system can reduce the effects of such biased estimates by exposing its decision-making criteria. AI models always carry some degree of prediction error; making someone identifiable and accountable for those errors makes the entire system more reliable. For example, applying XAI to detect molecular fingerprints in a drug can
increase the effectiveness of predictions. Most molec-
ular fingerprints are designed, validated, and used in
the context of small molecule drugs within the classi-
cal Lipinski boundaries (Lipinski et al., 2001) and are
not well suited to identifying larger molecules. For example, the most popular molecular fingerprint is the Morgan fingerprint (Morgan, 1965), also known as the extended-connectivity fingerprint ECFP4 (Rogers & Hahn, 2010). ECFP4, along with the corresponding MinHashed fingerprint MHFP6 (Probst & Reymond, 2018), is among the best-performing fingerprints in small-molecule virtual screening (Riniker & Landrum, 2013) and target prediction benchmarks (Awale & Reymond, 2018, 2019). Both fingerprints detect the presence of specific circular substructures around each atom in a molecule, which predict the biological activities of small organic molecules. However, both poorly capture global properties of molecules, such as size and shape. Which compounds
have a greater impact on a drug's efficacy can thus be identified. Finally, explaining and justifying a drug's chemical implications boosts trust in the system. In practice, several safety-critical applications, such as medical diagnostics and the prediction of drug events (chemical absorption, distribution, metabolism, excretion, and toxicity [ADMET]), require a high degree of user trust (Kırboğa et al., 2022).
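The circular-substructure idea behind Morgan/ECFP fingerprints can be sketched in pure Python on a toy atom/bond graph. This illustrates only the iterative neighborhood-hashing principle, not RDKit's actual Morgan algorithm (which should be used in practice); the molecule encoding and folding scheme are invented:

```python
def circular_fingerprint(atoms, bonds, radius=2, nbits=64):
    # atoms: element symbols; bonds: adjacency list of atom indices.
    # Each iteration folds neighbour identifiers into each atom's id,
    # so the id at radius r encodes the substructure within r bonds.
    ids = [hash(a) for a in atoms]
    bits = set()
    for _ in range(radius + 1):
        for i in ids:
            bits.add(i % nbits)  # fold each environment id into nbits bits
        ids = [hash((ids[i], tuple(sorted(ids[j] for j in bonds[i]))))
               for i in range(len(atoms))]
    fp = [0] * nbits
    for b in bits:
        fp[b] = 1
    return fp

# Ethanol as a heavy-atom graph: C-C-O
fp = circular_fingerprint(["C", "C", "O"], {0: [1], 1: [0, 2], 2: [1]})
print(sum(fp), "bits set of", len(fp))
```

Because each bit corresponds to a concrete circular environment, a feature-attribution method run over such a representation can point back to interpretable substructures, which is precisely the property exploited by the fingerprint-based XAI studies above.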
XAI is an intelligent, influential, and attractive aspect of AI: it is a powerful descriptive tool that provides deeper insights than traditional linear models. However, XAI has its own challenges alongside its
benefits. AI algorithms can identify complex relationships
in large datasets used in drug discovery (Nazar et al.,2021;
Samek et al.,2019; Thomas Altmann et al.,2020). However,
it may not be fully understood what these algorithms base their decisions on, so researchers may have a hard time understanding why and how a drug is recommended. In addition, biases learned from the training data can produce false results (de Bruijn et al., 2022;
Ghassemi et al.,2021). This can be particularly problem-
atic in treating diseases that show genetic variations be-
tween sexes, races, and geographic regions. Large datasets
used for drug discovery contain patients' private health in-
formation. These data need to be protected against privacy
and security risks. However, AI algorithms processing
these data may raise concerns about data privacy (Islam
et al.,2022). By explaining the decisions in the drug dis-
covery process, XAI can help researchers understand why
a drug is recommended. However, the responsibility for
decisions such as the approval and use of drugs is still in
the hands of the people (Holzinger et al.,2019). Therefore,
there may be uncertainties about who is responsible for
the decisions of AI algorithms. Another challenge is that, in the drug discovery process, researchers need to understand how the drugs suggested by AI algorithms work and to which diseases they can be applied; understanding how the algorithms' decisions are made and why these drugs are recommended is essential for researchers to make the right choices (Saeed & Omlin, 2023).
4 | CONCLUSION AND FUTURE OUTLOOK
OUTLOOK
XAI, which plays an essential role in making the preliminary stages of drug discovery shorter and less costly, has become more deeply involved in drug discovery in recent years. However, current XAI faces technical challenges, among them the multitude of possible explanations and methods applicable to a given task (Lipton, 2017). Targeting small molecules in
drug discovery, accurate prediction of disease- based mech-
anisms, and the capacity of molecules to become drugs
are essential for many biological information needs. XAI
is a technique used to increase the explainability of deci-
sions in critical fields such as medicine and drug discovery.
Methods used in drug discovery include correlation analysis, interpretation of artificial neural networks (ANNs), t-SNE (t-distributed Stochastic Neighbor Embedding), and PCA (Principal Component Analysis). Correlation analysis examines the relationships between molecules used
for drug discovery, specifically to identify dependencies
between their properties in the dataset. ANN interpretation is a method for understanding how a network's decisions are produced; it measures the contribution of each input to the ANN's outputs.
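PCA, mentioned above, projects a compound-by-descriptor matrix onto a few orthogonal directions of maximal variance. A numpy sketch via the singular value decomposition; the descriptor values are synthetic and the five correlated columns are invented for illustration:

```python
import numpy as np

def pca(X, n_components=2):
    # Center the descriptor matrix, then take the top right singular
    # vectors as principal axes; scores are the projected coordinates.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = (s ** 2) / (s ** 2).sum()  # variance ratio per component
    return scores, explained[:n_components]

rng = np.random.default_rng(1)
# 50 hypothetical compounds, 5 strongly correlated molecular descriptors.
base = rng.normal(size=(50, 1))
X = np.hstack([base + 0.05 * rng.normal(size=(50, 1)) for _ in range(5)])

scores, explained = pca(X)
print(f"variance explained by PC1: {explained[0]:.2%}")
```

Because the five descriptors here share one latent factor, the first component absorbs nearly all the variance, which is exactly the compression effect that makes such plots readable.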
Methods such as t- SNE and PCA compress complex
structures in data sets, making them more understand-
able. On the other hand, XAI techniques are used to make
the decisions of machine learning models understand-
able, making the decision- making processes more trans-
parent. These techniques include LIME, SHAP, model tracking, and consistency analysis. LIME is used to interpret a model's predictions for a given sample. SHAP helps to explain how models work by measuring the effect of a feature or variable on the model's output. Model tracking
allows for tracing the characteristics and decisions of a
model. As a result, methods for drug discovery are often
used to identify relationships between data, make com-
plex structures more understandable, and interpret the
decisions of machine learning models. In contrast, XAI
techniques are used to make machine learning models'
decisions intelligible and increase the reliability of the
models. In addition, the tools, programs, and codes we
will use should also be suitable for the work we aim for.
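The LIME procedure mentioned above, perturb a sample, weight the perturbations by proximity, and fit a weighted linear surrogate whose coefficients serve as local attributions, can be sketched in a few lines. This is a didactic simplification of the lime package, with an invented black-box model and query point:

```python
import numpy as np

def lime_explain(predict, x, n_samples=500, width=0.5, seed=0):
    # Sample around x, weight by an RBF proximity kernel, then fit a
    # weighted least-squares linear model; its slopes are the local
    # feature attributions.
    rng = np.random.default_rng(seed)
    Z = x + width * rng.normal(size=(n_samples, len(x)))
    y = np.array([predict(z) for z in Z])
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * width ** 2))
    A = np.hstack([Z, np.ones((n_samples, 1))])  # add intercept column
    W = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * W, y * W[:, 0], rcond=None)
    return coef[:-1]  # per-feature local slopes (intercept dropped)

# Black-box model: nonlinear, but its local gradient is recoverable.
model = lambda z: z[0] ** 2 + 3.0 * z[1]
attr = lime_explain(model, np.array([1.0, 0.0]))
print(attr)
```

At x = (1, 0) the surrogate's slopes approximate the model's local gradient (2, 3), illustrating how a linear explanation can be faithful in a neighborhood even when the global model is not linear.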
In current studies, most approaches do not come as ready-
to- use solutions but must be tailored to each application.
Therefore, spending considerable time on each of them is necessary. Moreover, before moving to in vivo and in vitro conditions, the accuracy of computational studies should be ensured as far as possible. Since the models used in computational
drug discovery differ in their need for explanation, it should be known which models require more explanation and which are intrinsically explainable. The user must therefore understand what types of responses are needed and whether they are meaningful or meaningless (Goodman & Flaxman, 2017). In the pre-drug
discovery phase, we emphasize the importance of finding solutions to existing problems and working across disciplines while overcoming them; such a concerted effort may shine a new light on drug discovery. For example, data scientists and chemists must work together when working computationally on SMILES strings, and this cooperation is also necessary for identifying molecules of drug-like quality. Looking at the recent studies described above, structural features and molecular descriptors that can be easily interpreted chemically are still scarce; yet they shed a multifaceted light on the biochemical process (Awale & Reymond, 2014;
Jiménez- Luna et al., 2020; Katritzky & Gordeeva, 1993;
Rogers & Hahn, 2010; Sheridan, 2019; Todeschini &
Consonni, 2010). Given XAI's existing potential and constraints in drug discovery, it is fair to expect that the continuous development of more understandable and computationally economical hybrid techniques and alternative models will not lose its value. Although several software packages and code bases exist for XAI in drug development, there are currently no open community platforms for tackling the difficulties brought on by the uniqueness of drug research; only synergistic investigations can change this circumstance.
CONFLICT OF INTEREST STATEMENT
The authors declare that they have no known competing
financial interests or personal relationships that could
have appeared to influence the work reported in this
paper.
ACKNOWLEDGMENTS
This study was supported by the Read&Publish agree-
ment between TÜBİTAK ULAKBİM and Wiley, Republic
of Turkey (2023).
DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets
were generated or analysed during the current study.
ORCID
Kevser Kübra Kırboğa https://orcid.
org/0000-0002-2917-8860
REFERENCES
Adadi, A., & Berrada, M. J. I. A. (2018). Peeking inside the black-
box: A survey on explainable artificial intelligence (XAI). IEEE
Access, 6, 52138– 52160.
Agrawal, R., Gollapudi, S., Halverson, A., & Ieong, S. (2009).
Diversifying search results. Paper presented at the proceedings
of the second ACM international conference on web search and
data mining.
Akbar, S., Ali, F., Hayat, M., Ahmad, A., Khan, S., & Gul, S. (2022).
Prediction of antiviral peptides using transform evolution-
ary & SHAP analysis based descriptors by incorporation with
ensemble learning strategy. Chemometrics and Intelligent
Laboratory Systems, 230, 104682. https://doi.org/10.1016/j.chemolab.2022.104682
Alvarez- Melis, D., & Jaakkola, T. (2018). Towards robust inter-
pretability with self- explaining neural networks. ArXiv, 1,
7786– 7795.
Anguita- Ruiz, A., Segura- Delgado, A., Alcalá, R., Aguilera, C. M., &
Alcalá- Fdez, J. (2020). eXplainable artificial intelligence (XAI)
for the identification of biologically relevant gene expression
patterns in longitudinal human studies, insights from obesity
research. PLoS Computational Biology, 16(4), e1007792. https://doi.org/10.1371/journal.pcbi.1007792
Arrieta, A. B., Díaz- Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S.,
Barbado, A., … Herrera, F. (2020). Explainable artificial intel-
ligence (XAI): Concepts, taxonomies, opportunities and chal-
lenges toward responsible AI. Information Fusion, 58, 82– 115.
https://doi.org/10.1016/j.inffus.2019.12.012
Atsuko Takagi, M. K., Hamatani, E., Kojima, R., & Okuno, Y. (2022).
GraphIX: Graph- based In silico XAI (explainable artificial
intelligence) for drug repositioning from biopharmaceutical
network. ArXiv. https://doi.org/10.48550/arXiv.2212.10788
Awale, M., & Reymond, J.- L. (2014). Atom pair 2D- fingerprints per-
ceive 3D- molecular shape and pharmacophores for very fast
virtual screening of ZINC and GDB- 17. Journal of Chemical
Information and Modeling, 54(7), 1892– 1907.
Awale, M., & Reymond, J.- L. (2018). Polypharmacology browser
PPB2: Target prediction combining nearest neighbors with ma-
chine learning. Journal of Chemical Information and Modeling,
59(1), 10– 17.
Awale, M., & Reymond, J.- L. (2019). Web- based tools for polyphar-
macology prediction. In Systems chemical biology (pp. 255– 272).
Springer.
Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov,
S., Lee, G. R., … Baker, D. (2021). Accurate prediction of pro-
tein structures and interactions using a three- track neural net-
work. Science, 373(6557), 871–876. https://doi.org/10.1126/science.abj8754
Banegas- Luna, A. J., & Pérez- Sánchez, H. (2022). SIBILA: High-
performance computing and interpretable machine learning join
efforts toward personalised medicine in a novel decision- making
tool.
Bannister, F., & Connolly, R. (2011a). The trouble with transpar-
ency: A critical review of openness in e- government. Policy &
Internet, 3(1), 1– 30.
Bannister, F., & Connolly, R. (2011b). Trust and transformational
government: A proposed framework for research. Government
Information Quarterly, 28(2), 137– 147.
Belle, V., & Papantonis, I. (2021). Principles and Practice of
Explainable Machine Learning. Frontiers in Big Data, 4. https://doi.org/10.3389/fdata.2021.688969
Bhatt, U., Zhang, Y., Antorán, J., Liao, Q. V., Sattigeri, P., Fogliato,
R., Melançon, G., Krishnan, R., Stanley, J., Tickoo, O.,
Nachman, L., Chunara, R., Srikumar, M., Weller, A., & Xiang,
A. (2021). Uncertainty as a form of transparency: Measuring,
communicating, and using uncertainty. arXiv preprint arXiv:2011.07586.
Bittremieux, W., Advani, R. S., Jarmusch, A. K., Aguirre, S., Lu, A.,
Dorrestein, P. C., & Tsunoda, S. M. (2022). Physicochemical
properties determining drug detection in skin. Clinical and
Translational Science, 15(3), 761– 770. https://doi.org/10.1111/
cts.13198
Boström, J., Brown, D. G., Young, R. J., & Keserü, G. M. (2018).
Expanding the medicinal chemistry synthetic toolbox.
Nature Reviews. Drug Discovery, 17(10), 709– 727. https://doi.
org/10.1038/nrd.2018.116
Bresso, E., Monnin, P., Bousquet, C., Calvier, F.- E., Ndiaye, N.- C.,
Petitpain, N., … Coulet, A. (2021). Investigating ADR mech-
anisms with explainable AI: A feasibility study with knowl-
edge graph mining. BMC Medical Informatics and Decision
Making, 21(1), 171. https://doi.org/10.1186/s12911-021-01518-6
Byrne, R. M. (2019). Counterfactuals in explainable artificial intelli-
gence (XAI): Evidence from human reasoning. Paper presented
at the IJCAI.
Carpenter, K. A., & Huang, X. (2018). Machine learning- based virtual
screening and its applications to Alzheimer's drug discovery: A
review. Current Pharmaceutical Design, 24(28), 3347– 3358.
Castelvecchi, D. (2016). Can we open the black box of AI? Nature, 538(7623), 20.
Chakraborti, T., Sreedharan, S., & Kambhampati, S. (2020). The emerging landscape of explainable AI planning and decision making. arXiv preprint arXiv: 2002.
Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., & Blaschke, T.
(2018). The rise of deep learning in drug discovery. Drug
Discovery Today, 23(6), 1241– 1250. https://doi.org/10.1016/j.
drudis.2018.01.039
Codella, N., Hind, M., Natesan Ramamurthy, K., Campbell, M.,
Dhurandhar, A., Kush, R., … Mojsilovic, A. (2018). TED:
Teaching AI to explain its decisions.
Czub, N., Pacławski, A., Szlęk, J., & Mendyk, A. (2021). Curated
database and preliminary AutoML QSAR model for 5- HT1A
receptor. Pharmaceutics, 13(10), 1711. https://doi.org/10.3390/pharmaceutics13101711
Dang, L. H., Dung, N. T., Quang, L. X., Hung, L. Q., Le, N. H., Le, N.
T. N., … Le, N. Q. K. (2021). Machine learning- based prediction
of drug- drug interactions for histamine antagonist using hybrid
chemical features. Cells, 10(11), 3092. https://doi.org/10.3390/cells10113092
Das, R., Dhuliawala, S., Zaheer, M., Vilnis, L., Durugkar, I.,
Krishnamurthy, A., … McCallum, A. (2017). Go for a walk and
arrive at the answer: Reasoning over paths in knowledge bases
using reinforcement learning.
de Bruijn, H., Warnier, M., & Janssen, M. (2022). The perils and pit-
falls of explainable AI: Strategies for explaining algorithmic
decision- making. Government Information Quarterly, 39(2),
101666. https://doi.org/10.1016/j.giq.2021.101666
Deng, J., Yang, Z., Ojima, I., Samaras, D., & Wang, F. (2022). Artificial
intelligence in drug discovery: Applications and techniques.
Briefings in Bioinformatics, 23(1). https://doi.org/10.1093/bib/
bbab430
Deore, A., Dhumane, J., Wagh, R., & Sonawane, R. (2019). The stages
of drug discovery and development process. Asian Journal of
Pharmaceutical Research and Development, 7, 62– 67. https://doi.org/10.22270/ajprd.v7i6.616
Dhurandhar, A., Chen, P.- Y., Luss, R., Tu, C.- C., Ting, P., Shanmugam,
K., & Das, P. (2018). Explanations based on the missing:
Towards contrastive explanations with pertinent negatives.
Došilović, F. K., Brčić, M., & Hlupić, N. (2018). Explainable artificial
intelligence: A survey. Paper presented at the 2018 41st interna-
tional convention on information and communication technol-
ogy, electronics and microelectronics (MIPRO).
Drancé, M. (2022). Neuro- symbolic XAI: Application to drug repur-
posing for rare diseases. Paper presented at the database Systems
for Advanced Applications: 27th International Conference,
DASFAA 2022, Virtual Event, April 11– 14, 2022, Proceedings,
Part III. https://doi.org/10.1007/978-3-031-00129-1_51
Drosou, M., Jagadish, H., Pitoura, E., & Stoyanovich, J. (2017). Diversity in big data: A review. Big Data, 5(2), 73– 84.
Ehsan, U., Tambwekar, P., Chan, L., Harrison, B., & Riedl, M. O.
(2019). Automated rationale generation: A technique for explain-
able AI and its effects on human perceptions. Paper presented
at the Proceedings of the 24th International Conference on
Intelligent User interfaces, Marina del Ray, California https://
doi.org/10.1145/3301275.3302316
Fan, Y.- W., Liu, W.- H., Chen, Y.- T., Hsu, Y.- C., Pathak, N., Huang, Y.-
W., & Yang, J.- M. (2022). Exploring kinase family inhibitors and
their moiety preferences using deep SHapley additive exPlana-
tions. BMC Bioinformatics, 23(S4), 242. https://doi.org/10.1186/
s12859-022-04760-5
Fandinno, J., & Schulz, C. (2019). Answering the "why" in answer set programming— A survey of explanation approaches. Theory and Practice of Logic Programming, 19(2), 114– 203.
Feldmann, C., Philipps, M., & Bajorath, J. (2021). Explainable ma-
chine learning predictions of dual- target compounds reveal
characteristic structural features. Scientific Reports, 11(1),
21594. https://doi.org/10.1038/s41598-021-01099-4
Ghassemi, M., Oakden- Rayner, L., & Beam, A. L. (2021). The false
hope of current approaches to explainable artificial intelligence
in health care. The Lancet Digital Health, 3(11), e745– e750.
https://doi.org/10.1016/S2589-7500(21)00208-9
Gimeno, M., San Jose- Eneriz, E., Villar, S., Agirre, X., Prosper, F.,
Rubio, A., & Carazo, F. (2022). Explainable artificial intel-
ligence for precision medicine in acute myeloid leukemia.
Frontiers in Immunology, 13, 977358. https://doi.org/10.3389/
fimmu.2022.977358
Goodman, B., & Flaxman, S. (2017). European Union regulations on
algorithmic decision- making and a “right to explanation”. AI
Magazine, 38(3), 50– 57.
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F.,
& Pedreschi, D. (2018). A survey of methods for explaining
black box models. ACM Computing Surveys (CSUR), 51(5),
1– 42.
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., &
Pedreschi, D. (2019). A survey of methods for explaining black
box models. ACM Computing Surveys, 51(5), 1– 42. https://doi.
org/10.1145/3236009
Gunning, D. (2017). Explainable artificial intelligence (xai). Defense
Advanced Research Projects Agency (DARPA), 2(2), 1.
Gunning, D., Stefik, M., Choi, J., Miller, T., Stumpf, S., & Yang, G.- Z. (2019). XAI— Explainable artificial intelligence. Science Robotics, 4(37), eaay7120.
Harren, T., Matter, H., Hessler, G., Rarey, M., & Grebner, C. (2022).
Interpretation of structure– activity relationships in real- world
drug design data sets using explainable artificial intelligence.
Journal of Chemical Information and Modeling, 62(3), 447– 462.
https://doi.org/10.1021/acs.jcim.1c01263
He, C., Duan, L., Zheng, H., Song, L., & Huang, M. (2022). An ex-
plainable framework for drug repositioning from disease infor-
mation network. Neurocomputing, 511, 247– 258. https://doi.
org/10.1016/j.neucom.2022.09.063
Holzinger, A., Langs, G., Denk, H., Zatloukal, K., & Müller, H. (2019).
Causability and explainability of artificial intelligence in medi-
cine. WIREs Data Mining and Knowledge Discovery, 9(4), e1312.
https://doi.org/10.1002/widm.1312
Holzinger, A., Saranti, A., Molnar, C., Biecek, P., & Samek, W. (2022).
Explainable AI methods— A brief overview (pp. 13– 38). Springer
International Publishing.
Hosen, M. F., Mahmud, S. M. H., Ahmed, K., Chen, W. Y., Moni,
M. A., Deng, H. W., … Hasan, M. M. (2022). DeepDNAbP: A
deep learning- based hybrid approach to improve the identifi-
cation of deoxyribonucleic acid- binding proteins. Computers
in Biology and Medicine, 145, 105433. https://doi.org/10.1016/j.compbiomed.2022.105433
Hu, T. M., & Hayton, W. (2011). Architecture of the drug– drug inter-
action network. Journal of Clinical Pharmacy and Therapeutics,
36(2), 135– 143.
Hung, T. N. K., Le, N. Q. K., Le, N. H., Van Tuan, L., Nguyen, T. P.,
Thi, C., & Kang, J. H. (2022). An AI- based prediction model
for drug- drug interactions in osteoporosis and Paget's diseases
from SMILES. Molecular Informatics, 41(6), e2100264. https://doi.org/10.1002/minf.202100264
Imran, M., Bhatti, A., King, D. M., Lerch, M., Dietrich, J., Doron,
G., & Manlik, K. (2022). Supervised machine learning- based de-
cision support for signal validation classification. Drug Safety,
45(5), 583– 596. https://doi.org/10.1007/s40264-022-01159-2
Islam, M. R., Ahmed, M. U., Barua, S., & Begum, S. (2022). A
systematic review of explainable artificial intelligence in
terms of different application domains and tasks. Applied
Sciences, 12(3), 1353. Retrieved from https://www.mdpi.com/2076-3417/12/3/1353
Jesson, A., Mindermann, S., Shalit, U., & Gal, Y. (2020). Identifying causal- effect inference failure with uncertainty- aware models. Advances in Neural Information Processing Systems, 33, 11637– 11649.
Jiang, D. J., Wu, Z. X., Hsieh, C. Y., Chen, G. Y., Liao, B., Wang, Z.,
… Hou, T. J. (2021). Could graph neural networks learn better
molecular representation for drug discovery? A comparison
study of descriptor- based and graph- based models. Journal of
Cheminformatics, 13(1), 12. https://doi.org/10.1186/s13321-020-00479-8
Jiménez- Luna, J., Grisoni, F., & Schneider, G. (2020). Drug discov-
ery with explainable artificial intelligence. Nature Machine
Intelligence, 2(10), 573– 584. https://doi.org/10.1038/s42256-020-00236-4
Jiménez- Luna, J., Skalic, M., & Weskamp, N. (2022). Benchmarking
molecular feature attribution methods with activity cliffs.
Journal of Chemical Information and Modeling, 62(2), 274– 283.
https://doi.org/10.1021/acs.jcim.1c01163
Jiménez- Luna, J., Skalic, M., Weskamp, N., & Schneider, G. (2021).
Coloring molecules with explainable artificial intelligence
for preclinical relevance assessment. Journal of Chemical
Information and Modeling, 61(3), 1083– 1094. https://doi.
org/10.1021/acs.jcim.0c01344
Joseph, A. (2019). Shapley regressions: A framework for statistical in-
ference on machine learning models.
Joshi, P., Masilamani, V., & Ramesh, R. (2021). An ensembled SVM
based approach for predicting adverse drug reactions. Current
Bioinformatics, 16(3), 422– 432. https://doi.org/10.2174/1574893615999200707141420
Karimi, A.- H., Schölkopf, B., & Valera, I. (2021). Algorithmic re-
course: From counterfactual explanations to interventions. Paper
presented at the Proceedings of the 2021 ACM Conference on
Fairness, Accountability, and Transparency.
Karlebach, G., & Shamir, R. (2008). Modelling and analysis of gene
regulatory networks. Nature Reviews Molecular Cell Biology,
9(10), 770– 780.
Katritzky, A. R., & Gordeeva, E. V. (1993). Traditional topological
indexes vs electronic, geometrical, and combined molecu-
lar descriptors in QSAR/QSPR research. Journal of Chemical
Information and Computer Sciences, 33(6), 835– 857.
Keane, M. T., Kenny, E. M., Delaney, E., & Smyth, B. (2021). If only
we had better counterfactual explanations: Five key deficits to
rectify in the evaluation of counterfactual XAI techniques. arXiv preprint arXiv:2103.01035.
Kim, B., Wattenberg, M., Gilmer, J., Cai, C., Wexler, J., Viegas, F.,
& Sayres, R. (2018). Interpretability beyond feature attribution:
Quantitative testing with concept activation vectors (TCAV).
Paper presented at the Proceedings of the 35th International
Conference on Machine Learning, Proceedings of Machine
Learning Research. https://proceedings.mlr.press/v80/kim18d.html
Kırboğa, K., Kucuksille, E. U., & Köse, U. (2022). Ignition of small
molecule inhibitors in Friedreich's ataxia with explainable arti-
ficial intelligence.
Kuhn, H. W., & Tucker, A. W. (1953). Contributions to the theory of
games. Princeton University Press.
Lavecchia, A. (2015). Machine- learning approaches in drug discov-
ery: Methods and applications. Drug Discovery Today, 20(3),
318– 331. https://doi.org/10.1016/j.drudis.2014.10.012
Lee, D. D., Pham, P., Largman, Y., & Ng, A. (2009). Advances in neu-
ral information processing systems 22.
Lerman, J. (2013). Big data and its exclusions. Stanford Law Review Online, 66, 55.
Li, Z., Ivanov, A. A., Su, R., Gonzalez- Pecchi, V., Qi, Q., Liu, S., …
Pham, C. (2017). The OncoPPi network of cancer- focused
protein– protein interactions to inform biological insights and
therapeutic strategies. Nature Communications, 8(1), 1– 14.
Liao, Q. V., Gruen, D., & Miller, S. (2020). Questioning the AI:
Informing design practices for explainable AI user experiences.
Paper presented at the Proceedings of the 2020 CHI Conference
on Human Factors in Computing Systems.
Lin, H. C., Wang, Z., Hu, Y. H., Simon, K., & Buu, A. (2022).
Characteristics of statewide prescription drug monitoring
programs and potentially inappropriate opioid prescribing
to patients with non- cancer chronic pain: A machine learn-
ing application. Preventive Medicine, 161, 107116. https://doi.
org/10.1016/j.ypmed.2022.107116
Lin, X., Quan, Z., Wang, Z.- J., Ma, T., & Zeng, X. (2020). KGNN:
Knowledge graph neural network for drug- drug interaction
prediction.
Lipinski, C. A., Lombardo, F., Dominy, B. W., & Feeney, P. J. (2001).
Experimental and computational approaches to estimate solu-
bility and permeability in drug discovery and development set-
tings. Advanced Drug Delivery Reviews, 46(3), 3– 26.
Lipton, Z. (2016). The mythos of model interpretability. Communications
of the ACM, 61, 36– 43. https://doi.org/10.1145/3233231
Lipton, Z. (2017). The doctor just won't accept that!
Lipton, Z. C. (2018). The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3), 31– 57.
Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M.,
Nair, B., … Lee, S.- I. (2020). From local explanations to global
understanding with explainable AI for trees. Nature Machine
Intelligence, 2(1), 56– 67.
Lundberg, S. M., & Lee, S.- I. (2017a). A unified approach to interpret-
ing model predictions.
Lundberg, S. M., & Lee, S.- I. (2017b). A unified approach to inter-
preting model predictions. Paper presented at the Proceedings
of the 31st International Conference on Neural Information
Processing Systems, Long Beach, California, USA.
Lundberg, S. M., & Lee, S.- I. (2017c). A unified approach to inter-
preting model predictions. ArXiv, abs/1705.07874. https://doi.org/10.48550/arXiv.1705.07874
Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E., & Svetnik, V. (2015).
Deep neural nets as a method for quantitative structure– activity
relationships. Journal of Chemical Information and Modeling,
55(2), 263– 274. https://doi.org/10.1021/ci500747n
Ma, P., Liu, R. X., Gu, W. R., Dai, Q., Gan, Y., Cen, J., … Chen, Y.
C. (2022). Construction and interpretation of prediction model
of Teicoplanin trough concentration via machine learning.
Frontiers in Medicine, 9, 808969. https://doi.org/10.3389/
fmed.2022.808969
Mak, K.- K., Balijepalli, M. K., & Pichika, M. R. (2022). Success stories
of AI in drug discovery— Where do things stand? Expert Opinion
on Drug Discovery, 17(1), 79– 92. https://doi.org/10.1080/17460441.2022.1985108
Mater, A. C., & Coote, M. L. (2019). Deep learning in chemistry.
Journal of Chemical Information and Modeling, 59(6), 2545–
2559. https://doi.org/10.1021/acs.jcim.9b00266
McGrath, S., Mehta, P., Zytek, A., Lage, I., & Lakkaraju, H. (2020). When does uncertainty matter?: Understanding the impact of predictive uncertainty in ML assisted decision making.
Morgan, H. L. (1965). The generation of a unique machine descrip-
tion for chemical structures- a technique developed at chemi-
cal abstracts service. Journal of Chemical Documentation, 5(2),
107– 113.
Nazar, M., Alam, M. M., Yafi, E., & Su'ud, M. M. (2021). A sys-
tematic review of human– computer interaction and explain-
able artificial intelligence in healthcare with artificial intelli-
gence techniques. IEEE Access, 9, 153316– 153348. https://doi.
org/10.1109/ACCESS.2021.3127881
Nguyen, A., Yosinski, J., & Clune, J. (2015). Deep neural networks
are easily fooled: High confidence predictions for unrecognizable
images. Paper presented at the Proceedings of the IEEE confer-
ence on computer vision and pattern recognition.
Páez, A. (2019). The pragmatic turn in explainable artificial intelli-
gence (XAI). Minds and Machines, 29(3), 441– 459. https://doi.
org/10.1007/s11023-019-09502-w
Pfeifer, B., Saranti, A., & Holzinger, A. (2022). GNN- SubNet: Disease
subnetwork detection with explainable graph neural networks.
Bioinformatics, 38(Supplement_2), ii120– ii126. https://doi.org/10.1093/bioinformatics/btac478
Polzer, A., Fleiß, J., Ebner, T., Kainz, P., Koeth, C., & Thalmann,
S. (2022). Validation of AI- based information systems for sen-
sitive use cases: Using an XAI approach in pharmaceutical
engineering.
Preece, A., Harborne, D., Braines, D., Tomsett, R., & Chakraborty, S. (2018). Stakeholders in explainable AI.
Probst, D., & Reymond, J.- L. (2018). A probabilistic molecular finger-
print for big data settings. Journal of Cheminformatics, 10(1), 1– 12.
Rasmussen, C. E. (2003). Gaussian processes in machine learning.
Paper presented at the Summer school on machine learning.
Rebane, J., Samsten, I., Pantelidis, P., & Papapetrou, P. (2021). Assessing the clinical validity of attention- based
and SHAP temporal explanations for adverse drug event pre-
dictions. Paper presented at the 2021 IEEE 34th International
Symposium on Computer- Based Medical Systems (CBMS).
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016a). “Why should I trust
you?”: Explaining the predictions of any classifier.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016b). "Why should i trust
you?" Explaining the predictions of any classifier. Paper presented
at the Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining.
Riniker, S., & Landrum, G. A. (2013). Open- source platform to
benchmark fingerprints for ligand- based virtual screening.
Journal of Cheminformatics, 5(1), 1– 17.
Rodríguez- Pérez, R., & Bajorath, J. (2019). Interpretation of com-
pound activity predictions from complex machine learning
models using local approximations and Shapley values. Journal of Medicinal Chemistry, 63(16), 8761– 8777.
Rodríguez- Pérez, R., & Bajorath, J. (2020a). Interpretation of com-
pound activity predictions from complex machine learn-
ing models using local approximations and Shapley values.
Journal of Medicinal Chemistry, 63(16), 8761– 8777. https://doi.
org/10.1021/acs.jmedchem.9b01101
Rodríguez- Pérez, R., & Bajorath, J. (2020b). Interpretation of
machine learning models using Shapley values: Application
to compound potency and multi- target activity predic-
tions. Journal of Computer- Aided Molecular Design, 34(10),
1013– 1026.
Rodríguez- Pérez, R., & Bajorath, J. (2021). Explainable machine
learning for property predictions in compound optimization.
Journal of Medicinal Chemistry, 64(24), 17744– 17752. https://
doi.org/10.1021/acs.jmedchem.1c01789
Rogers, D., & Hahn, M. (2010). Extended- connectivity fingerprints.
Journal of Chemical Information and Modeling, 50(5), 742– 754.
Rosenfeld, A., & Richardson, A. (2019). Explainability in human– agent systems. Autonomous Agents and Multi- Agent Systems, 33(6), 673– 705.
Rudin, C. (2019). Stop explaining black box machine learning mod-
els for high stakes decisions and use interpretable models in-
stead. Nature Machine Intelligence, 1(5), 206– 215. https://doi.
org/10.1038/s42256-019-0048-x
Saeed, W., & Omlin, C. (2023). Explainable AI (XAI): A system-
atic meta- survey of current challenges and future opportu-
nities. Knowledge- Based Systems, 263, 110273. https://doi.
org/10.1016/j.knosys.2023.110273
Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K., & Müller, K.- R.
(2019). Explainable AI: Interpreting, explaining and visualizing
deep learning. https://doi.org/10.1007/978-3-030-28954-6_1
Savage, N. (2021). Tapping into the drug discovery potential of AI.
Retrieved from https://www.nature.com/articles/d43747-021-00045-7
Schneider, G. (2018). Automating drug discovery. Nature Reviews.
Drug Discovery, 17(2), 97– 113. https://doi.org/10.1038/
nrd.2017.232
Schneider, P., Walters, W. P., Plowright, A. T., Sieroka, N., Listgarten,
J., Goodnow, R. A., Jr., … Schneider, G. (2020). Rethinking
drug design in the artificial intelligence era. Nature Reviews.
Drug Discovery, 19(5), 353– 364. https://doi.org/10.1038/s41573-019-0050-3
Sheridan, R. P. (2019). Interpretation of QSAR models by coloring
atoms according to changes in predicted activity: How robust
is it? Journal of Chemical Information and Modeling, 59(4),
1324– 1337.
Shimazaki, T., & Tachikawa, M. (2022). Collaborative approach be-
tween explainable artificial intelligence and simplified chemi-
cal interactions to explore active ligands for cyclin- dependent
kinase 2. ACS Omega, 7(12), 10372– 10381. https://doi.org/10.1021/acsomega.1c06976
Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., & Gilles, E. D.
(2002). Metabolic network structure determines key aspects of
functionality and regulation. Nature, 420(6912), 190– 193.
Stokes, J. M., Yang, K., Swanson, K., Jin, W., Cubillos- Ruiz, A.,
Donghia, N. M., … Collins, J. J. (2020). A deep learning approach
to antibiotic discovery. Cell, 180(4), 688– 702.e613. https://doi.
org/10.1016/j.cell.2020.01.021
Stumpfe, D., & Bajorath, J. (2020). Current trends, overlooked is-
sues, and unmet challenges in virtual screening. Journal of
Chemical Information and Modeling, 60(9), 4112– 4115. https://
doi.org/10.1021/acs.jcim.9b01101
Suh, J., Yoo, S., Park, J., Cho, S. Y., Cho, M. C., Son, H., & Jeong, H.
(2020). Development and validation of an explainable artificial
intelligence- based decision- supporting tool for prostate biopsy.
BJU International, 126(6), 694– 703. https://doi.org/10.1111/
bju.15122
Swartout, W., Paris, C., & Moore, J. (1991). Explanations in knowl-
edge systems: Design for explainable expert systems. IEEE
Expert, 6(3), 58– 64.
Swartout, W. R., & Moore, J. D. (1993). Explanation in second gen-
eration expert systems. In Second generation expert systems (pp.
543– 585). Springer.
Thomas Altmann, J. B., Dankers, C., Dassen, T., Fritz, N., Gruber,
S., Kopper, P., Kronseder, V., Wagner, M., & Renkl, E. (2020).
Limitations of interpretable machine learning methods.
Tjoa, E., & Guan, C. (2020). A survey on explainable artificial intelli-
gence (XAI): Toward Medical Xai. IEEE Transactions on Neural
Networks and Learning Systems, 32(11), 4793– 4813.
Todeschini, R., & Consonni, V. (2010). New local vertex invariants
and molecular descriptors based on functions of the vertex
degrees. MATCH Communications in Mathematical and in
Computer Chemistry, 64(2), 359– 372.
Upadhyay, S., Joshi, S., & Lakkaraju, H. (2021). Towards robust and reliable algorithmic recourse. Advances in Neural Information Processing Systems, 34, 16926– 16937.
Vamathevan, J., Clark, D., Czodrowski, P., Dunham, I., Ferran, E.,
Lee, G., … Zhao, S. (2019). Applications of machine learning
in drug discovery and development. Nature Reviews. Drug
Discovery, 18(6), 463– 477. https://doi.org/10.1038/s41573-019-0024-5
Vangala, S. R., Bung, N., Krishnan, S. R., & Roy, A. (2022). An interpre-
table machine learning model for selectivity of small molecules
against homologous protein family. Future Medicinal Chemistry,
14(20), 1441– 1453. https://doi.org/10.4155/fmc- 2022- 0075
Vo, T. H., Nguyen, N. T. K., Kha, Q. H., & Le, N. Q. K. (2022). On
the road to explainable AI in drug- drug interactions pre-
diction: A systematic review. Computational and Structural
Biotechnology Journal, 20, 2112– 2123. https://doi.org/10.1016/j.
csbj.2022.04.021
Wang, Q., Huang, K., Chandak, P., Zitnik, M., & Gehlenborg, N.
(2022). Extending the nested model for user- centric XAI: A de-
sign study on GNN- based drug repurposing. IEEE Transactions
on Visualization and Computer Graphics, 29, 1266– 1276.
https://doi.org/10.1109/tvcg.2022.3209435
Wang, Q., Li, M., Wang, X., Parulian, N., Han, G., Ma, J., …
Onyshkevych, B. (2021). COVID- 19 literature knowledge graph
construction and drug repurposing report generation.
Ward, I. R., Wang, L., Lu, J., Bennamoun, M., Dwivedi, G., &
Sanfilippo, F. M. (2021). Explainable artificial intelligence for
pharmacovigilance: What features are important when pre-
dicting adverse outcomes? Computer Methods and Programs
in Biomedicine, 212, 106415. https://doi.org/10.1016/j.
cmpb.2021.106415
West, D. M. (2018). The future of work: Robots, AI, and automation.
Brookings Institution Press.
Wójcikowski, M., Siedlecki, P., & Ballester, P. J. (2019). Building
machine- learning scoring functions for structure- based
prediction of intermolecular binding affinity. In W. F. de
Azevedo Jr (Ed.), Docking screens for drug discovery (pp. 1– 12).
Springer New York.
Wojtuch, A., Jankowski, R., & Podlewska, S. (2021). How can SHAP
values help to shape metabolic stability of chemical com-
pounds? Journal of Cheminformatics, 13(1), 74. https://doi.org/10.1186/s13321-021-00542-y
Xiong, Z., Wang, D., Liu, X., Zhong, F., Wan, X., Li, X., … Zheng,
M. (2020). Pushing the boundaries of molecular representa-
tion for drug discovery with the graph attention mechanism.
Journal of Medicinal Chemistry, 63(16), 8749– 8760. https://doi.
org/10.1021/acs.jmedchem.9b00959
Xu, F., Uszkoreit, H., Du, Y., Fan, W., Zhao, D., & Zhu, J. (2019).
Explainable AI: A brief survey on history, research areas, ap-
proaches and challenges. Paper presented at the CCF interna-
tional conference on natural language processing and Chinese
computing.
Yu, Z., Ji, H. H., Xiao, J. W., Wei, P., Song, L., Tang, T. T., … Jia, Y.
T. (2021). Predicting adverse drug events in Chinese pediatric
inpatients with the associated risk factors: A machine learn-
ing study. Frontiers in Pharmacology, 12, 659099. https://doi.
org/10.3389/fphar.2021.659099
Zeng, X., Song, X., Ma, T., Pan, X., Zhou, Y., Hou, Y., … Cheng, F. (2020).
Repurpose open data to discover therapeutics for COVID- 19
using deep learning. Journal of Proteome Research, 19(11), 4624–
4636. https://doi.org/10.1021/acs.jproteome.0c00316
Zeng, X., Tu, X., Liu, Y., Fu, X., & Su, Y. (2022). Toward better drug
discovery with knowledge graph. Current Opinion in Structural
Biology, 72, 114– 126. https://doi.org/10.1016/j.sbi.2021.09.003
Zhang, Q., Yang, Y., Liu, Y., Wu, Y. N., & Zhu, S.- C. (2018). Unsupervised learning of neural networks to explain neural networks.
Zhavoronkov, A., Ivanenkov, Y. A., Aliper, A., Veselov, M. S.,
Aladinskiy, V. A., Aladinskaya, A. V., … Aspuru- Guzik, A.
(2019). Deep learning enables rapid identification of potent
DDR1 kinase inhibitors. Nature Biotechnology, 37(9), 1038–
1040. https://doi.org/10.1038/s41587-019-0224-x
Zhu, J., Liapis, A., Risi, S., Bidarra, R., & Youngblood, G. M.
(2018). Explainable AI for designers: A human- centered per-
spective on mixed- initiative co- creation. Paper presented at
the 2018 IEEE Conference on Computational Intelligence
and Games (CIG).
Zhu, X. Q., Hu, J. Q., Xiao, T., Huang, S. Q., Shang, D. W., & Wen, Y.
G. (2022). Integrating machine learning with electronic health
record data to facilitate detection of prolactin level and phar-
macovigilance signals in olanzapine- treated patients. Frontiers
in Endocrinology, 13, 1011492. https://doi.org/10.3389/
fendo.2022.1011492
How to cite this article: Kırboğa, K. K., Abbasi,
S., & Küçüksille, E. U. (2023). Explainability and
white box in drug discovery. Chemical Biology &
Drug Design, 00, 1–17. https://doi.org/10.1111/
cbdd.14262