Conference Paper (PDF available)

Explainable Artificial Intelligence: A Survey

Authors: Došilović, Brčić, and Hlupić

Abstract

In the last decade, with the availability of large datasets and more computing power, machine learning systems have achieved (super)human performance in a wide variety of tasks. Examples of this rapid development can be seen in image recognition, speech analysis, strategic game playing, and many more. The problem with many state-of-the-art models is a lack of transparency and interpretability. The lack thereof is a major drawback in many applications, e.g. healthcare and finance, where a rationale for the model's decision is a requirement for trust. In light of these issues, explainable artificial intelligence (XAI) has become an area of interest in the research community. This paper summarizes recent developments in XAI in supervised learning, starts a discussion on its connection with artificial general intelligence, and gives proposals for further research directions.
... We must also acknowledge the potential limitations of XAIs. Reliance on these tools may perpetuate biases inherent in the data they are trained on, leading to biased insights [69]. Healthcare professionals might overly rely on model outputs, potentially diminishing their critical thinking and clinical judgment. ...
Conference Paper
Full-text available
As an increasing number of states adopt more permissive cannabis regulations, the necessity of gaining a comprehensive understanding of cannabis's effects on young adults has grown exponentially, driven by its escalating prevalence of use. By leveraging popular eXplainable Artificial Intelligence (XAI) techniques such as SHAP (SHapley Additive exPlanations), rule-based explanations, intrinsically interpretable models, and counterfactual explanations, we undertake an exploratory but in-depth examination of the impact of cannabis use on individual behavioral patterns and physiological states. This study explores the possibility of facilitating algorithmic decision-making by combining interpretable artificial intelligence (XAI) techniques with sensor data, with the aim of providing researchers and clinicians with personalized analyses of cannabis intoxication behavior. SHAP analyzes the importance and quantifies the impact of specific factors such as environmental noise or heart rate, enabling clinicians to pinpoint influential behaviors and environmental conditions. SkopeRules simplify the understanding of cannabis use for a specific activity or environmental use. Decision trees provide a clear visualization of how factors interact to influence cannabis consumption. Counterfactual models help identify key changes in behaviors or conditions that may alter cannabis use outcomes, to guide effective individualized intervention strategies. This multidimensional analytical approach not only unveils changes in behavioral and physiological states after cannabis use, such as frequent fluctuations in activity states, nontraditional sleep patterns, and specific use habits at different times and places, but also highlights the significance of individual differences in responses to cannabis use. These insights carry profound implications for clinicians seeking to gain a deeper understanding of the diverse needs of their patients and for tailoring precisely targeted intervention strategies. Furthermore, our findings highlight the pivotal role that XAI technologies could play in enhancing the transparency and interpretability of Clinical Decision Support Systems (CDSS), with a particular focus on substance misuse treatment. This research significantly contributes to ongoing initiatives aimed at advancing clinical practices that aim to prevent and reduce cannabis-related harms to health, positioning XAI as a supportive tool for clinicians and researchers alike.
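Of the techniques named above, the counterfactual one is the most self-contained to sketch. The toy snippet below greedily nudges one feature at a time until a black-box binary prediction flips; it is a generic illustration of counterfactual explanation, not the tooling or sensor pipeline used in the study, and the function name, the fixed step size, and the greedy search strategy are all assumptions.

```python
import numpy as np

def greedy_counterfactual(proba_fn, x, step=0.1, max_iters=200):
    """Toy counterfactual search: repeatedly apply the single-feature nudge of
    size `step` that moves the black-box probability of the positive class the
    furthest toward the other side of 0.5, until the prediction flips."""
    x = np.asarray(x, dtype=float)
    cf = x.copy()
    start = proba_fn(cf.reshape(1, -1))[0]       # probability of the positive class
    for _ in range(max_iters):
        p = proba_fn(cf.reshape(1, -1))[0]
        if (p >= 0.5) != (start >= 0.5):
            return cf, cf - x                    # counterfactual and its feature edits
        candidates = []
        for j in range(len(x)):
            for d in (-step, step):
                cand = cf.copy()
                cand[j] += d
                q = proba_fn(cand.reshape(1, -1))[0]
                gain = (q - p) if start < 0.5 else (p - q)   # progress toward flipping
                candidates.append((gain, cand))
        best_gain, best_cand = max(candidates, key=lambda t: t[0])
        if best_gain <= 0:
            break                                # no single-feature move helps
        cf = best_cand
    return None, None                            # search failed within the budget
```

With a scikit-learn classifier, `proba_fn` could be as simple as `lambda Z: clf.predict_proba(Z)[:, 1]`; the returned difference `cf - x` is the "key change in behaviors or conditions" the explanation reports.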
... Ensuring consistent and explainable outputs from intelligent agents is crucial, because humans are fundamentally limited in understanding AI (artificial intelligence) behavior. Explainability is an essential aspect of AI safety, which we define as the ability of an AI system to stay within the boundaries of desired states, i.e., worst-case guarantees [54]. ...
Article
Full-text available
Integrating large language model (LLM) agents within game theory demonstrates their ability to replicate human-like behaviors through strategic decision making. In this paper, we introduce an augmented LLM agent, called the private agent, which engages in private deliberation and employs deception in repeated games. Utilizing the partially observable stochastic game (POSG) framework and incorporating in-context learning (ICL) and chain-of-thought (CoT) prompting, we investigated the private agent's proficiency in both competitive and cooperative scenarios. Our empirical analysis demonstrated that the private agent consistently achieved higher long-term payoffs than its baseline counterpart and performed similarly or better in various game settings. However, we also found inherent deficiencies of LLMs in certain algorithmic capabilities crucial for high-quality decision making in games. These findings highlight the potential for enhancing LLM agents' performance in multi-player games using information-theoretic approaches of deception and communication with complex environments.
... Thus, ANNs can pose a significant challenge in fields where interpretability and transparency are often critical considerations [8], [9]. This need for explainability has caused the field of explainable artificial intelligence (xAI) to grow substantially in recent years [10]. Consequently, there have been many approaches trying to explain how neural networks compute their predictions [11], [12], [13]. ...
Preprint
Artificial Neural Networks (ANNs) have become a powerful tool for modeling complex relationships in large-scale datasets. However, their black-box nature poses ethical challenges. In certain situations, ensuring ethical predictions might require following specific partial monotonicity constraints. However, certifying whether an already-trained ANN is partially monotonic is challenging. Therefore, ANNs are often disregarded in some critical applications, such as credit scoring, where partial monotonicity is required. To address this challenge, this paper presents a novel algorithm (LipVor) that certifies whether a black-box model, such as an ANN, is positive based on a finite number of evaluations. Therefore, as partial monotonicity can be stated as a positivity condition of the partial derivatives, the LipVor algorithm can certify whether an already trained ANN is partially monotonic. To do so, for every positively evaluated point, the Lipschitzianity of the black-box model is used to construct a specific neighborhood where the function remains positive. Next, based on the Voronoi diagram of the evaluated points, a sufficient condition is stated to certify whether the function is positive in the domain. Compared to prior methods, our approach is able to mathematically certify whether an ANN is partially monotonic without needing constrained ANN architectures or piecewise-linear activation functions. Therefore, LipVor could open up the possibility of using unconstrained ANNs in some critical fields. Moreover, some other properties of an ANN, such as convexity, can be posed as positivity conditions, and therefore LipVor could also be applied.
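The certificate behind this idea is simple: if f is L-Lipschitz and f(x0) > 0, then f stays positive on the ball of radius f(x0)/L around x0. The one-dimensional sketch below illustrates only that ball-covering argument under assumed names (`certify_positive_1d`, a known Lipschitz constant `L`); the actual LipVor algorithm works in higher dimensions via Voronoi diagrams of the evaluated points.

```python
def certify_positive_1d(f, L, lo, hi, max_evals=10_000):
    """Greedy 1-D illustration of the ball-covering idea: each evaluation with
    f(x) > 0 of an L-Lipschitz function certifies f > 0 on the open interval
    (x - f(x)/L, x + f(x)/L), so we hop to the edge of that interval and repeat
    until [lo, hi] is covered or a non-positive value is found."""
    x = lo
    for _ in range(max_evals):
        fx = f(x)
        if fx <= 0:
            return False, x          # positivity fails (or cannot be certified) at x
        radius = fx / L
        if x + radius > hi:
            return True, None        # the certified intervals cover all of [lo, hi]
        x += radius
    return False, None               # evaluation budget exhausted; inconclusive

# Example: certify that d/dx (x**2 + x) = 2*x + 1 is positive on [0, 1],
# i.e. that x**2 + x is monotonic there; 2*x + 1 is 2-Lipschitz.
ok, _ = certify_positive_1d(lambda x: 2.0 * x + 1.0, L=2.0, lo=0.0, hi=1.0)
print(ok)  # True
```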
... According to Kurzweil (1990), artificial intelligence (AI) is the "art of building machines that perform tasks that require intelligence when performed by people." Winston (1992) offered an alternative definition, describing AI as "the study of the computations that make it possible to perceive, reason, and act." Luger (1993) defined AI as "the branch of computer science that is concerned with the automation of intelligent behavior" (Dosilovic, Brcic & Hlupic, 2018; Nazar, 2021). ...
Conference Paper
Full-text available
Artificial intelligence (AI) is one of today's cutting-edge technologies. It has recently become widely accepted in many areas of modern life, such as knowledge acquisition, learning, healthcare, and security. In order to develop interactive intelligent systems, the field of human-computer interaction (HCI) has been fusing AI with interaction research over the past few years. By using different algorithms and leveraging HCI to give users transparency and establish their trust in the computer, AI is being employed in a range of fields. Scientists worldwide are interested in artificial intelligence, a significant technology that mimics human cognitive capacities. The artificial neural network, which loosely mimics the human mind, is the fundamental building block of AI technology. This brain-inspired architecture is composed of densely linked neurons and serves largely as a mechanism for processing input to solve a specific problem. Thanks to rapidly developing technology, robots can now perform tasks that were formerly reserved for humans alone. From the perspective of knowledge development and interactive learning, this paper tries to shed some light on recent research and activities on AI influencing learning through a vertical literature review.
Article
Full-text available
The paper addresses some fundamental and hotly debated issues for high-stakes event predictions underpinning the computational approach to social sciences, especially in criminology and criminal justice. We question several prevalent views against machine learning and outline a new paradigm that highlights the promises and promotes the infusion of computational methods and conventional social science approaches.
Article
Full-text available
In recent years, the design and production of information systems have seen significant growth. However, these information artefacts often exhibit characteristics that compromise their reliability. This issue appears to stem from the neglect or underestimation of certain crucial aspects in the application of Information Systems Design (ISD). For example, it is frequently difficult to prove when one of these products does not work properly or works incorrectly (falsifiability), their usage is often left to subjective experience and somewhat arbitrary choices (anecdotes), and their functions are often obscure for users as well as designers (explainability). In this paper, we propose an approach that can be used to support the analysis and re-(design) of information systems grounded on a well-known theory of information, namely, teleosemantics. This approach emphasizes the importance of grounding the design and validation process on dependencies between four core components: the producer (or designer), the produced (or used) information system, the consumer (or user), and the design (or use) purpose. We analyze the ambiguities and problems of considering these components separately. We then present some possible ways in which they can be combined through the teleological approach. Also, we debate guidelines to prevent ISD from failing to address critical issues. Finally, we discuss perspectives on applications over real existing information technologies and some implications for explainable AI and ISD.
Article
Full-text available
The performance of Artificial Intelligence (AI) systems reaches or even exceeds that of humans in an increasing number of complicated tasks. Highly effective non-linear AI models are generally employed in a black-box form nested in their complex structures, which means that no information as to what precisely helps them reach appropriate predictions is provided. The lack of transparency and interpretability in existing Artificial Intelligence techniques would reduce human users’ trust in the models used for cyber defence, especially in current scenarios where cyber resilience is becoming increasingly diverse and challenging. Explainable AI (XAI) should be incorporated into developing cybersecurity models to deliver explainable models with high accuracy that human users can understand, trust, and manage. This paper explores the following concepts related to XAI. A summary of current literature on XAI is discussed. Recent taxonomies that help explain different machine learning algorithms are discussed. These include deep learning techniques developed and studied extensively in other IoT taxonomies. The outputs of AI models are crucial for cybersecurity, as experts require more than simple binary outputs for examination to enable the cyber resilience of IoT systems. Examining the available XAI applications and safety-related threat models to explain resilience towards IoT systems also summarises the difficulties and gaps in XAI concerning cybersecurity. Finally, various technical issues and trends are explained, and future studies on technology, applications, security, and privacy are presented, emphasizing the ideas of explainable AI models.
Article
Full-text available
Intelligent applications supported by Machine Learning have achieved remarkable performance rates for a wide range of tasks in many domains. However, understanding why a trained algorithm makes a particular decision remains problematic. Given the growing interest in the application of learning-based models, some concerns arise when dealing with sensitive environments that may impact users’ lives. The complex nature of those models’ decision mechanisms makes them the so-called “black boxes,” in which understanding the logic behind automated decision-making is not trivial for humans. Furthermore, the reasoning that leads a model to provide a specific prediction can be more important than performance metrics, which introduces a trade-off between interpretability and model accuracy. Explaining intelligent computer decisions can be regarded as a way to justify their reliability and establish trust. In this sense, explanations are critical tools that verify predictions to discover errors and biases previously hidden within the models’ complex structures, opening up vast possibilities for more responsible applications. In this review, we provide theoretical foundations of Explainable Artificial Intelligence (XAI), clarifying diffuse definitions and identifying research objectives, challenges, and future research lines related to turning opaque machine learning outputs into more transparent decisions. We also present a careful overview of the state-of-the-art explainability approaches, with a particular analysis of methods based on feature importance, such as the well-known LIME and SHAP. As a result, we highlight practical applications of the successful use of XAI.
Article
Full-text available
Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. It is therefore hard to acquire a deeper understanding of model behavior and in particular how different features influence the model prediction. This is important when interpreting the behavior of complex models or asserting that certain problematic attributes (such as race or gender) are not unduly influencing decisions. In this paper, we present a technique for auditing black-box models, which lets us study the extent to which existing models take advantage of particular features in the data set, without knowing how the models work. Our work focuses on the problem of indirect influence: how some features might indirectly influence outcomes via other, related features. As a result, we can find attribute influences even in cases where, upon further direct examination of the model, the attribute is not referred to by the model at all. Our approach does not require the black-box model to be retrained. This is important if, for example, the model is only accessible via an API, and contrasts our work with other methods that investigate feature influence such as feature selection. We present experimental evidence for the effectiveness of our procedure using a variety of publicly available data sets and models. We also validate our procedure using techniques from interpretable learning and feature selection, as well as against other black-box auditing procedures. To further demonstrate the effectiveness of this technique, we use it to audit a black-box recidivism prediction algorithm.
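As a rough illustration of what auditing for indirect influence means in code, the sketch below scores a feature by removing the linearly predictable traces it leaves in the other columns and measuring the resulting drop in black-box accuracy. This is a much-simplified stand-in for the paper's obscuring procedure: the helper name, the linear residualization, and the accuracy metric are all assumptions, and only a fitted classifier with a `predict` method is required.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def indirect_influence(model, X, y, j):
    """Crude indirect-influence score for feature j: strip from every other
    column the part that is linearly predictable from column j, neutralize
    column j itself, and report how much the black-box accuracy drops."""
    X = np.asarray(X, dtype=float)
    X_obscured = X.copy()
    xj = X[:, [j]]
    for k in range(X.shape[1]):
        if k == j:
            continue
        reg = LinearRegression().fit(xj, X[:, k])
        # keep the column's mean but remove its correlation with feature j
        X_obscured[:, k] = X[:, k] - reg.predict(xj) + X[:, k].mean()
    X_obscured[:, j] = X[:, j].mean()            # remove the direct signal as well
    acc_original = np.mean(model.predict(X) == y)
    acc_obscured = np.mean(model.predict(X_obscured) == y)
    return acc_original - acc_obscured           # larger drop = more (indirect) influence
```

A feature the model never reads directly can still produce a large drop here if the model relies on correlated proxies, which is exactly the situation the paper's recidivism audit targets.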
Conference Paper
Full-text available
Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
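The additive attribution idea can be made concrete with a brute-force Shapley value computation over a handful of features. The sketch below is a didactic illustration of the quantity SHAP approximates, using a fixed baseline to "remove" features; it is not the paper's efficient estimators such as Kernel SHAP, and the baseline-replacement payoff is an assumption of this sketch.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for a model f over a few features: a feature's value
    is its average marginal contribution over all subsets, where 'absent'
    features are set to the baseline. Exponential cost; for illustration only."""
    n = len(x)
    phi = np.zeros(n)

    def payoff(S):                         # model output with only features in S "present"
        z = baseline.copy()
        idx = list(S)
        z[idx] = x[idx]
        return f(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi[i] += weight * (payoff(S + (i,)) - payoff(S))
    return phi

# Toy check of the additive (local accuracy) property:
f = lambda z: 2.0 * z[0] + z[1] * z[2]
x, baseline = np.array([1.0, 1.0, 1.0]), np.zeros(3)
phi = shapley_values(f, x, baseline)
print(phi)  # phi = [2.0, 0.5, 0.5]; phi.sum() equals f(x) - f(baseline) = 3.0
```

The check at the end shows the defining property of this class of explanations: the attributions sum to the difference between the prediction and the baseline output.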
Conference Paper
Full-text available
To realize the full potential of machine learning in diverse real-world domains, it is necessary for model predictions to be readily interpretable and actionable for the human in the loop. Analysts, who are the users but not the developers of machine learning models, often do not trust a model because of the lack of transparency in associating predictions with the underlying data space. To address this problem, we propose Rivelo, a visual analytics interface that enables analysts to understand the causes behind predictions of binary classifiers by interactively exploring a set of instance-level explanations. These explanations are model-agnostic, treating a model as a black box, and they help analysts in interactively probing the high-dimensional binary data space for detecting features relevant to predictions. We demonstrate the utility of the interface with a case study analyzing a random forest model on the sentiment of Yelp reviews about doctors.
Article
We present ObamaNet, the first architecture that generates both audio and synchronized photo-realistic lip-sync videos from any new text. Contrary to other published lip-sync approaches, ours is only composed of fully trainable neural modules and does not rely on any traditional computer graphics methods. More precisely, we use three main modules: a text-to-speech network based on Char2Wav, a time-delayed LSTM to generate mouth-keypoints synced to the audio, and a network based on Pix2Pix to generate the video frames conditioned on the keypoints.
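A minimal sketch of the middle module (the time-delayed LSTM that maps audio features to mouth keypoints) might look like the PyTorch snippet below. The dimensions, the delay handling, and the class name are guesses for illustration, not the released architecture, and the Char2Wav and Pix2Pix stages are omitted entirely.

```python
import torch
import torch.nn as nn

class DelayedKeypointLSTM(nn.Module):
    """Sketch of an audio-to-keypoint module: an LSTM maps per-frame audio
    features to mouth keypoints, with the audio advanced by `delay` frames so
    the output at frame t can depend on audio slightly ahead of t."""
    def __init__(self, audio_dim=26, keypoint_dim=40, hidden_dim=128, delay=5):
        super().__init__()
        self.delay = delay
        self.lstm = nn.LSTM(audio_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, keypoint_dim)

    def forward(self, audio):                          # audio: (batch, time, audio_dim)
        if self.delay > 0:
            pad = audio[:, -1:, :].repeat(1, self.delay, 1)   # repeat the last frame
            audio = torch.cat([audio[:, self.delay:, :], pad], dim=1)
        h, _ = self.lstm(audio)
        return self.head(h)                            # (batch, time, keypoint_dim)

# e.g. 100 frames of 26-dim audio features -> 100 frames of 40 keypoint coordinates
keypoints = DelayedKeypointLSTM()(torch.randn(1, 100, 26))
```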
Article
Although deep learning has historical roots going back decades, neither the term "deep learning" nor the approach was popular just over five years ago, when the field was reignited by papers such as Krizhevsky, Sutskever and Hinton's now classic (2012) deep network model of Imagenet. What has the field discovered in the five subsequent years? Against a background of considerable progress in areas such as speech recognition, image recognition, and game playing, and considerable enthusiasm in the popular press, I present ten concerns for deep learning, and suggest that deep learning must be supplemented by other techniques if we are to reach artificial general intelligence.
Article
In this paper we examine the case of Tay, the Microsoft AI chatbot that was launched in March, 2016. After less than 24 hours, Microsoft shut down the experiment because the chatbot was generating tweets that were judged to be inappropriate since they included racist, sexist, and anti-Semitic language. We contend that the case of Tay illustrates a problem with the very nature of learning software (LS is a term that describes any software that changes its program in response to its interactions) that interacts directly with the public, and the developer's role and responsibility associated with it. We make the case that when LS interacts directly with people or indirectly via social media, the developer has additional ethical responsibilities beyond those of standard software. There is an additional burden of care.
Article
Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust in a model. Trust is fundamental if one plans to take action based on a prediction, or when choosing whether or not to deploy a new model. Such understanding further provides insights into the model, which can be used to turn an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We further propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). The usefulness of explanations is shown via novel experiments, both simulated and with human subjects. Our explanations empower users in various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and detecting why a classifier should not be trusted.
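The essence of the local surrogate idea can be sketched in a few lines for tabular data: sample perturbations around the instance, query the black box, weight the samples by proximity, and fit a weighted linear model whose coefficients serve as the explanation. The snippet below is a bare-bones illustration under assumed defaults (Gaussian perturbations on standardized features, an RBF proximity kernel, ridge regression), not the LIME library with its interpretable binary representations and sparsity constraints.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_explanation(predict_fn, x, num_samples=2000, kernel_width=0.75, seed=0):
    """Bare-bones local surrogate: perturb the instance, query the black box,
    weight samples by proximity, and fit a weighted ridge regression whose
    coefficients act as local feature importances."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    Z = x + rng.normal(scale=1.0, size=(num_samples, x.size))   # local perturbations
    y = np.asarray(predict_fn(Z), dtype=float)                  # black-box scores per row
    distances = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)     # proximity kernel
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
    return surrogate.coef_                                      # local explanation
```

Here `predict_fn` is any batch scoring function, e.g. `lambda Z: clf.predict_proba(Z)[:, 1]` for a scikit-learn classifier; the surrogate is only trusted near the explained instance.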
Article
With the availability of large databases and recent improvements in deep learning methodology, the performance of AI systems is reaching or even exceeding the human level on an increasing number of complex tasks. Impressive examples of this development can be found in domains such as image classification, sentiment analysis, speech understanding or strategic game playing. However, because of their nested non-linear structure, these highly successful machine learning and artificial intelligence models are usually applied in a black box manner, i.e., no information is provided about what exactly makes them arrive at their predictions. Since this lack of transparency can be a major drawback, e.g., in medical applications, the development of methods for visualizing, explaining and interpreting deep learning models has recently attracted increasing attention. This paper summarizes recent developments in this field and makes a plea for more interpretability in artificial intelligence. Furthermore, it presents two approaches to explaining predictions of deep learning models, one method which computes the sensitivity of the prediction with respect to changes in the input and one approach which meaningfully decomposes the decision in terms of the input variables. These methods are evaluated on three classification tasks.
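Of the two approaches mentioned, the sensitivity-based one is the easier to sketch: it reduces to the gradient of the predicted score with respect to the input. A minimal PyTorch version, assuming a classifier that returns one score per class for a batched input, is shown below; a companion sketch of the decomposition approach (LRP) appears after the LRP tutorial entry further down.

```python
import torch

def sensitivity_map(model, x, target_class):
    """Gradient-based sensitivity: the absolute gradient of the score for
    `target_class` with respect to each input dimension, for a single input x
    of shape (1, ...) fed to a classifier returning (1, num_classes) scores."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    return x.grad.abs()
```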
Article
Algorithms, particularly machine learning (ML) algorithms, are increasingly important to individuals’ lives, but have caused a range of concerns revolving mainly around unfairness, discrimination and opacity. Transparency in the form of a “right to an explanation” has emerged as a compellingly attractive remedy since it intuitively promises to open the algorithmic “black box” to promote challenge, redress, and hopefully heightened accountability. Amidst the general furore over algorithmic bias we describe, any remedy in a storm has looked attractive. However, we argue that a right to an explanation in the EU General Data Protection Regulation (GDPR) is unlikely to present a complete remedy to algorithmic harms, particularly in some of the core “algorithmic war stories” that have shaped recent attitudes in this domain. Firstly, the law is restrictive, unclear, or even paradoxical concerning when any explanation-related right can be triggered. Secondly, even navigating this, the legal conception of explanations as “meaningful information about the logic of processing” may not be provided by the kind of ML “explanations” computer scientists have developed, partially in response. ML explanations are restricted both by the type of explanation sought, the dimensionality of the domain and the type of user seeking an explanation. However, “subject-centric" explanations (SCEs) focussing on particular regions of a model around a query show promise for interactive exploration, as do explanation systems based on learning a model from outside rather than taking it apart (pedagogical versus decompositional explanations) in dodging developers' worries of intellectual property or trade secrets disclosure. Based on our analysis, we fear that the search for a “right to an explanation” in the GDPR may be at best distracting, and at worst nurture a new kind of “transparency fallacy.” But all is not lost. We argue that other parts of the GDPR related (i) to the right to erasure ("right to be forgotten") and the right to data portability; and (ii) to privacy by design, Data Protection Impact Assessments and certification and privacy seals, may have the seeds we can use to make algorithms more responsible, explicable, and human-centered.
Article
This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. As a tutorial paper, the set of methods covered here is not exhaustive, but sufficiently representative to discuss a number of questions in interpretability, technical challenges, and possible applications. The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which we provide theory, recommendations, and tricks, to make most efficient use of it on real data.
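For a small fully connected ReLU network, the epsilon rule at the heart of LRP fits in a dozen lines of numpy. The sketch below is a didactic reimplementation of the generic rule (redistribute relevance in proportion to each input's contribution to the pre-activation, stabilized by epsilon), not the tuned per-layer recommendations the tutorial gives for real architectures; bias relevance is simply dropped, as is common in simple implementations.

```python
import numpy as np

def lrp_epsilon(weights, biases, x, eps=1e-6):
    """Epsilon-rule LRP for a small fully connected ReLU network described by
    lists of weight matrices (shape in_dim x out_dim) and bias vectors, with a
    linear output layer. Returns input relevances of the same shape as x."""
    # forward pass, keeping the input activations of every layer
    activations = [x]
    a = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = np.maximum(z, 0.0) if l < len(weights) - 1 else z
        activations.append(a)
    # backward relevance pass, starting from the raw output scores
    R = activations[-1]
    for l in range(len(weights) - 1, -1, -1):
        W, a = weights[l], activations[l]
        z = a @ W + biases[l]
        z = z + eps * np.where(z >= 0.0, 1.0, -1.0)   # stabilizer
        s = R / z
        R = a * (s @ W.T)                             # redistribute to the layer's inputs
    return R
```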