Figure 1 - uploaded by Randall L. Mayes
3 disc model with filled rubber as the middle disc (in yellow) 

Source publication
Article
Full-text available
This paper presents a basic tutorial on epistemic uncertainty quantification methods. Epistemic uncertainty, characterizing lack of knowledge, is often prevalent in engineering applications. However, the methods we have for analyzing and propagating epistemic uncertainty are not nearly as widely used or well understood as methods to propagate aleatory...

Context in source publication

Context 1
... treatment of uncertainty in the analysis of computer models is essential for understanding possible ranges of outputs or scenario implications. Most computer models for engineering applications are developed to help assess a design or regulatory requirement. The capability to quantify the impact of uncertainty in the decision context is critical. This paper will focus on situations with epistemic uncertainty, which represents a lack of knowledge about the appropriate value to use for a quantity. Epistemic uncertainty is sometimes referred to as state-of-knowledge uncertainty, subjective uncertainty, Type B, or reducible uncertainty, meaning that the uncertainty can be reduced through increased understanding (research), or increased and more relevant data [7, 8]. Epistemic quantities are sometimes described as quantities that have a fixed value in an analysis, but whose fixed value we do not know. For example, the elastic modulus for the material in a specific component is presumably fixed but unknown or poorly known. In contrast, uncertainty characterized by inherent randomness which cannot be reduced by further data is called aleatory uncertainty. Some examples of aleatory uncertainty are weather or the height of individuals in a population: these cannot be reduced by gathering further information. Aleatory uncertainty is also called stochastic, variability, irreducible, and Type A uncertainty. Aleatory uncertainties are usually modeled with probability distributions, but epistemic uncertainty may or may not be modeled probabilistically. Regulatory agencies, design teams, and weapon certification assessments are increasingly being asked to specifically characterize and quantify epistemic uncertainty and separate its effect from that of aleatory uncertainty [1]. There are many ways of representing epistemic uncertainty, including probability theory, fuzzy sets, possibility theory, and imprecise probability.
The problem of selecting an appropriate mathematical structure to represent epistemic uncertainties can be challenging. At Sandia we have chosen to focus on three approaches: interval analysis, Dempster-Shafer evidence theory, and (for mixed aleatory/epistemic uncertainties) second-order probability. Section 2 presents a structural dynamics example that will be used to demonstrate the various methods. Section 3 discusses interval analysis and shows results, Section 4 discusses evidence theory and shows results, and Section 5 discusses second-order probability and associated results. Section 6 summarizes the paper. We present an example from structural dynamics, where the application of interest is the performance of the bonding material in an aeroshell. In the example we present, the application has been simplified. We have a fairly coarse, 3-D model of 3 discs. The outer 2 discs represent rigid masses (in this case, they are steel) and the inner disc represents a layer of a filled rubber. Figure 1 depicts the geometry of the configuration used in this example. We are interested in understanding the frequencies of the axial and shear modes for this experimental configuration, shown in Figure 2. There is significant epistemic uncertainty in this example associated with the material properties of the filled rubber. Specifically, we have a wide variety of tests and expert opinion on potential values for the modulus of elasticity in tension and compression, E, and Poisson’s ratio, ν. The filled rubber is a rubber material with particles in it. In this case the particles are glass balloons, which are used to lower the density of the material. A filled rubber softens with increased strain (on other rubbers, we have seen as much as an order of magnitude difference in the modulus, depending on the strain level). In vibration, the strain levels are usually very low, e.g. on the order of 0.1% strain or less.
The simulation code used is Salinas [11, 12], which is a finite-element analysis code for modal, vibration, static, and shock analysis developed at Sandia National Laboratories for massively parallel implementations (for more information, see: ). This simulation takes approximately 2 hours to run on a Linux workstation with two Dual-Core Intel® Xeon® 5000 series 64-bit processors and 2 gigabytes of RAM. We have a variety of test data: some dynamic tests, some static, and one ultrasonic. Some of the tests are on the discs and some on the system-level aeroshells. The test data has been taken by several organizations under different conditions and is not very consistent. One of the static tests was taken at strain levels much higher than the small strain of the rubber in vibration, thus invalidating the data for our needs. We don’t have much confidence in the ultrasonic test because the filled rubber layer was too thin in comparison to the other layers they had to send the ultrasonic signal through. Also, some of the test data reported to us involves people using test results and calibrating their models to infer values of E and/or ν. For the purposes of this paper, we are not trying to calibrate our finite-element model; we are simply trying to use it to properly propagate epistemic uncertainty. Finally, there is some correlation between E and ν. To start, based on our assessment of the test data available, we will assume that the value of E falls within the interval of [2000, 25000] psi and the value for ν falls within the interval of [0.45, 0.495]. We used DAKOTA [2, 3], a software framework that allows one to perform uncertainty quantification, optimization, and parameter studies (see: ) to perform the computational runs of the Salinas model presented in subsequent sections. DAKOTA was configured to drive the analysis for 3 case studies: pure interval analysis, Dempster-Shafer evidence theory, and second-order probability analysis.
The simplest way to propagate epistemic uncertainty is by interval analysis. In interval analysis, it is assumed that nothing is known about the uncertain input variables except that they lie within certain intervals [6, 8]. That is, there is no particular structure on the possible values for the epistemic uncertain variables except that they lie within bounds. The problem of uncertainty propagation then becomes an interval analysis problem: given inputs that are defined within intervals, what is the corresponding interval on the outputs? Although interval analysis is conceptually simple, in practice it can be difficult to determine the optimal solution approach. A direct approach is to use optimization to find the maximum and minimum values of the output measure of interest, which correspond to the upper and lower interval bounds on the output, respectively. There are a number of optimization algorithms which solve bound constrained problems, such as bound-constrained Newton methods. In practice, it may require a prohibitively large number of function evaluations to determine these optima, especially if the simulation is very nonlinear with respect to the inputs, has a high number of inputs with interaction effects, exhibits discontinuities, etc. Local optimization solvers will not guarantee finding global optima, and thus to solve this problem properly, one may have to resort to multi-start implementations of local optimization methods or global methods such as genetic algorithms, DIRECT, etc. These approaches can be very expensive. Another approach to interval analysis is to sample from the uncertain interval inputs, and then take the maximum and minimum output values based on the sampling process as the estimate for the upper and lower output bounds. Usually a uniform distribution is assumed over the input intervals, although this is not necessary. 
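The direct approach described above can be sketched in a few lines of Python. The response function below is a hypothetical toy model standing in for the expensive Salinas simulation (an assumption for illustration only), and a dense grid search over the input box serves as a cheap, derivative-free stand-in for a global optimizer; the E and ν intervals are the ones given earlier.

```python
import itertools

# Hypothetical stand-in for the expensive simulator: a nonlinear
# "frequency" response of modulus E (psi) and Poisson's ratio nu.
# Purely illustrative -- not the Salinas model.
def frequency(E, nu):
    return (E ** 0.5) * (1.0 - nu) / (1.0 + nu)

# Epistemic input intervals from the paper.
E_BOUNDS = (2000.0, 25000.0)
NU_BOUNDS = (0.45, 0.495)

def output_interval_by_grid(model, bounds, n=51):
    """Estimate output bounds by dense grid search over the input box.

    A cheap stand-in for the global optimization step: evaluate the
    model on an n x n grid (endpoints included) and take the extremes."""
    (e_lo, e_hi), (v_lo, v_hi) = bounds
    es = [e_lo + (e_hi - e_lo) * i / (n - 1) for i in range(n)]
    vs = [v_lo + (v_hi - v_lo) * i / (n - 1) for i in range(n)]
    values = [model(E, nu) for E, nu in itertools.product(es, vs)]
    return min(values), max(values)

lo, hi = output_interval_by_grid(frequency, (E_BOUNDS, NU_BOUNDS))
print(f"estimated output interval: [{lo:.2f}, {hi:.2f}]")
```

For this monotone toy model the extremes sit at corners of the input box, so the grid (which includes the endpoints) recovers them exactly; for a real nonlinear simulation, grid search at a useful resolution would be far too expensive, which is exactly the difficulty noted above.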
Although uniform distributions may be used to create samples, one cannot assign a probabilistic distribution to them or make a corresponding probabilistic interpretation of the output. That is, one cannot make a CDF of the output: all one can assume is that sample input values were generated, corresponding sample output values were created, and the minimum and maximum of the output are the estimated output interval bounds. This sampling approach is easy to implement, but its accuracy is highly dependent on the number of samples. Often, sampling will generate output bounds which underestimate the true output interval. In this paper, a single input variable is represented as x_i, x represents the vector of m uncertain variables, and the output y is a function of x: y = F(x). Figure 3 shows a Monte Carlo sampling approach that is often used to propagate aleatory uncertainty. In this figure, there are 3 input parameter distributions (m = 3) represented on the left side. Five samples are taken from each (N = 5), and the simulation model is run five times with these sets of inputs, resulting in 5 realizations of the output y shown on the right. In the case of aleatory uncertainty propagation, one can interpret the resulting output samples probabilistically and fit an appropriate ...
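The sampling approach, and its tendency to underestimate the true output interval, can be sketched as follows. The response function is again a hypothetical toy stand-in for the real simulation (an assumption, chosen monotone so that the true bounds are known to sit at corners of the input box).

```python
import random

# Hypothetical stand-in for the expensive simulator (not the paper's
# Salinas model): a nonlinear "frequency" of E (psi) and nu.
def frequency(E, nu):
    return (E ** 0.5) * (1.0 - nu) / (1.0 + nu)

random.seed(0)  # reproducibility

# Sample uniformly over the epistemic intervals from the paper.
N = 100
outputs = [
    frequency(random.uniform(2000.0, 25000.0), random.uniform(0.45, 0.495))
    for _ in range(N)
]
est_lo, est_hi = min(outputs), max(outputs)

# For this monotone toy model, the true output bounds sit at corners
# of the input box, so we can compute them exactly for comparison.
true_lo = frequency(2000.0, 0.495)
true_hi = frequency(25000.0, 0.45)

print(f"sampled interval: [{est_lo:.2f}, {est_hi:.2f}]")
print(f"true interval:    [{true_lo:.2f}, {true_hi:.2f}]")
```

Because random samples almost never land exactly on the extreme corners, the sampled interval is strictly contained in the true one; the estimate tightens only slowly as N grows.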

Similar publications

Preprint
Full-text available
Deep neural networks (DNNs) are becoming more prevalent in important safety-critical applications, where reliability in the prediction is paramount. Despite their exceptional prediction capabilities, current DNNs do not have an implicit mechanism to quantify and propagate significant input data uncertainty -- which is common in safety-critical appl...

Citations

... Previous studies have clarified the relationship between the perception of flood risk and the decision to purchase flood insurance (Botzen and van den Bergh 2012; Gallagher 2013; Royal and Walls 2019), as well as the relationship between the subjective probability of disaster and evacuation behavior during heavy rainfall disasters (Okumura et al. 2001; Kakimoto et al. 2016). When asking for subjective probability using an explicit numerical scale (ranging from 0 to 100%), the "50%" response is typically used as a substitute for an intended response of "fifty-fifty" (Fischhoff and Bruine De Bruin 1999) and can be considered as an expression reflecting epistemic uncertainty (e.g., Swiler et al. 2009; Merz and Thieken 2005). Epistemic uncertainty represents a lack of knowledge about the appropriate value to be used for a quantity. ...
Article
Full-text available
Lack of sufficient knowledge about flood risk can lead to “don’t know” responses or non-response in risk perception surveys, and incorrect treatment of these responses can lead to bias in the results. In this study, we focus on the possibility that the “50%” response actually means “fifty-fifty” and thus reflects epistemic uncertainty in the subjective probability of river flooding. We conduct an analysis that introduces a concomitant-variable latent class model as a method to adjust for this epistemic uncertainty. The results of the analysis accounting for epistemic uncertainty suggest that risk communication, such as simulated evacuation experiences and flood-related information distribution, increases subjective probability. Moreover, the proportion of latent classes with epistemic uncertainty decreased with each successive wave of the survey, thereby suggesting that knowledge acquisition and learning through the demonstration experiment led to a reduction in epistemic uncertainty. The analysis using the evacuation decision-making model also suggested that the introduction of subjective probability contributed to improving the likelihood of the model. These results suggest that knowledge acquisition through risk communication and short-term panel survey can lead to correct risk perception estimations and influence evacuation decisions.
... Epistemic uncertainty, which is typically brought on by a lack of training data, is standard in ML models [95]. The under-representation of minority groups, such as the tribal people, in face recognition algorithms is an example of this problem. ...
Article
Full-text available
ML applications proliferate across various sectors. Large internet firms employ ML to train intelligent models using vast datasets, including sensitive user information. However, new regulations like GDPR require data removal by businesses. Deleting data from ML models is more complex than databases. Machine Un-learning (MUL), an emerging field, garners academic interest for selectively erasing learned data from ML models. MUL benefits multiple disciplines, enhancing privacy, security, usability, and accuracy. This article reviews MUL’s significance, providing a taxonomy and summarizing key MUL algorithms. We categorize modern MUL models by criteria, including model independence, data driven, and implementation considerations. We explore MUL applications in smart devices and recommendation systems. We also identify open questions and future research areas. This work advances methods for implementing regulations like GDPR and safeguarding user privacy.
... Uncertainty can be divided into aleatoric (inexplicable due to randomness) and epistemic (reducible with more/better data or better understanding of the knowledge) uncertainty [37,38]. Point estimation for parameters in DL models is one source of reducible epistemic uncertainty; a non-Bayesian DL model is trained to find a single set of optimal parameter values given observations of x and y, but Bayesian methods account for multiple possible solutions [39][40][41]. ...
Article
Full-text available
Diagnosis of adamantinomatous craniopharyngioma (ACP) is predominantly determined through invasive pathological examination of a neurosurgical biopsy specimen. Clinical experts can distinguish ACP from Magnetic Resonance Imaging (MRI) with an accuracy of 86%, and 9% of ACP cases are diagnosed this way. Classification using deep learning (DL) provides a solution to support a non-invasive diagnosis of ACP through neuroimaging, but it is still limited in implementation, a major reason being the lack of predictive uncertainty representation. We trained and tested a DL classifier on preoperative MRI from 86 suprasellar tumor patients across multiple institutions. We then applied a Bayesian DL approach to calibrate our previously published ACP classifier, extending beyond point-estimate predictions to predictive distributions. Our original classifier outperforms random forest and XGBoost models in classifying ACP. The calibrated classifier underperformed our previously published results, indicating that the original model was overfit. Mean values of the predictive distributions were not informative regarding model uncertainty. However, the variance of predictive distributions was indicative of predictive uncertainty. We developed an algorithm to incorporate predicted values and the associated uncertainty to create a classification abstention mechanism. Our model accuracy improved from 80.8% to 95.5%, with a 34.2% abstention rate. We demonstrated that calibration of DL models can be used to estimate predictive uncertainty, which may enable clinical translation of artificial intelligence to support non-invasive diagnosis of brain tumors in the future.
... Typical epistemic uncertainties are the load of a vehicle, aircraft, or robot. Interval arithmetic or Monte Carlo samples might be used for analysis; see [27]. In the latter case, approximate intervals are computed for the output variables of interest, but no stochastic distributions are determined, as they are for aleatory uncertainties. ...
Article
Full-text available
Modeling and simulation is increasingly used in the design process for a wide span of applications. Rising demands and the complexity of modern products also increase the need for models and tools capable to cover areas such as virtual testing, design-space exploration or digital twins, and to provide measures of the quality of the models and the achieved results. The latter is also called credible simulation process. In an article at the International Modelica Conference 2021, we summarized the state of the art and best practice from the viewpoint of a Modelica language user, based on the experience gained in projects in which Modelica models were utilized in the design process. Furthermore, missing features and gaps in the used processes were identified. In this article, new proposals are presented to improve the quality of Modelica models, in particular by adding traceability, uncertainty, and calibration information of the parameters in a standardized way to Modelica models. Furthermore, the new open-source Modelica library Credibility is discussed together with examples to support the implementation of credible Modelica models.
... It is also known as variable, irreducible, stochastic, or type A uncertainty. • Epistemic uncertainty stems from a lack of knowledge, and may be specified in a probabilistic or non-probabilistic way, including second-order probability, interval, evidence theory, and fuzzy sets (Swiler et al., 2009;and Tian et al., 2018). It is also known as state of knowledge, reducible, subjective, or type B uncertainty. ...
Chapter
Industrial activity concerned with the profitability and safety of investments can be supported and promoted by research through the creation of new mathematical modeling approaches, and the quantification and mitigation of uncertainties. In recent years there has been increasing interest in the adoption of probabilistic approaches to assess sources of uncertainty in solar energy systems to estimate their feasibility, considering yield estimates, investments, operation and maintenance costs, and solar resource. In this context, the synthetic solar irradiance data set approach emerges as a promising tool to emulate the variability inherent to the solar resource in confident designs and feasibility analyses of these systems. Chapter 5 deals with the requirements of the industry with respect to synthetic solar data, and how such requirements are currently addressed during the main stages of development of solar projects. We recap methods for benchmarking the success of generated synthetic irradiance, reviewing statistical indicators for that purpose. We discuss and compare the use of single annual and multiple synthetic annual data sets of solar irradiance in the first stages of solar projects, and present their uses in a case study application in a Concentrating Solar Power (CSP) plant with a similar configuration to a well-known operational Parabolic Trough (PT) plant located in Spain.
... Due to their ease and applicability when only a small amount of data is available, interval-based approaches have garnered attention in many studies (Pradlwarter and Schuëller 2008; Voyles and Roy 2015; Zhu et al. 2016). The evidence theory (Bae et al. 2004; Swiler et al. 2009; Eldred et al. 2011; Salehghaffari and Rais-Rohani 2013; Shah et al. 2015), also referred to as the Dempster-Shafer theory, can be used for interval analysis by aggregating information (e.g., interval data) obtained from different sources and arriving at a degree of belief (i.e., confidence in an interval (Pan et al. 2016)). Similarly, the fuzzy set theory (Moore and Lodwick 2003; Hanss and Turrin 2010; Lima Azevedo et al. 2015) can be used to estimate the interval by combining evidence of different credibility. ...
Article
Full-text available
Computer-aided engineering (CAE) is now an essential instrument that aids in engineering decision-making. Statistical model calibration and validation has recently drawn great attention in the engineering community for its applications in practical CAE models. The objective of this paper is to review the state-of-the-art and trends in statistical model calibration and validation, based on the available extensive literature, from the perspective of uncertainty structures. After a brief discussion about uncertainties, this paper examines three problem categories—the forward problem, the inverse problem, and the validation problem—in the context of techniques and applications for statistical model calibration and validation.
... A distinction is often made between two types of uncertainty: aleatory uncertainty and epistemic uncertainty [27,28]. Aleatory uncertainty (also called variability, stochastic, irreducible, and type A uncertainty) is due to inherent or natural variation of the system under investigation. ...
... A number of non-probabilistic uncertainty analysis methods are emerging to quantify uncertainty given limited information, especially for epistemic uncertainty. Non-probabilistic methods include interval analysis, fuzzy theory, Dempster-Shafer evidence theory, and the affine arithmetic model [27,102,155]. ...
Article
Full-text available
Uncertainty analysis in building energy assessment has become an active research field because a number of factors influencing energy use in buildings are inherently uncertain. This paper provides a systematic review on the latest research progress of uncertainty analysis in building energy assessment from four perspectives: uncertainty data sources, forward and inverse methods, application of uncertainty analysis, and available software. First, this paper describes the data sources of uncertainty in building performance analysis to provide a firm foundation for specifying variations of uncertainty factors affecting building energy. The next two sections focus on the forward and inverse methods. Forward uncertainty analysis propagates input uncertainty through building energy models to obtain variations of energy use, whereas inverse uncertainty analysis infers unknown input factors through building energy models based on energy data and prior information. For forward analysis, three types of approaches (Monte Carlo, non-sampling, and non-probabilistic) are discussed to provide sufficient choices of uncertainty methods depending on the purpose and specific application of a building project. For inverse analysis, recent research has concentrated more on Bayesian computation because Bayesian inverse methods can make full use of prior information on unknown variables. Fourth, several applications of uncertainty analysis in building energy assessment are discussed, including building stock analysis, HVAC system sizing, variations of sensitivity indicators, and optimization under uncertainty. Moreover, the software for uncertainty analysis is described to provide flexible computational environments for implementing uncertainty methods described in this review. This paper concludes with the trends and recommendations for further research to provide more convenient and robust uncertainty analysis of building energy. 
Uncertainty analysis is poised to become the mainstream approach in building energy assessment, although a number of issues still need to be addressed.
... Two types of uncertainties exist in groundwater flow and transport modelling: aleatory and epistemic uncertainty (Helton et al. 2008; Ross et al. 2009; Swiler et al. 2009). Epistemic uncertainty represents a lack of knowledge about the appropriate value to use for a quantity; this uncertainty can be reduced through increased understanding (research) or collecting more relevant data. ...
... Note, however, that one must be careful not to interpret the result with any type of structure other than an interval on the output. Furthermore, while sampling is easy to implement, it may underestimate the true output interval (Swiler et al. 2009). Figure 6: Monte Carlo sampling used for epistemic interval propagation (Swiler et al. 2009). A second way to propagate epistemic uncertainty is by using Dempster-Shafer or evidence theory (Helton et al. 2004; Helton et al. 2008; Swiler et al. 2009). Evidence theory involves two specifications of likelihood, a belief and a plausibility. ...
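The belief/plausibility pair mentioned in this excerpt can be sketched with a toy basic probability assignment (BPA) over intervals; the focal intervals and masses below are hypothetical, not taken from any of the cited papers.

```python
# Hypothetical BPA: mass assigned to focal intervals for an uncertain
# quantity (e.g., a modulus E in psi). Masses sum to 1.
bpa = {
    (2000.0, 10000.0): 0.3,
    (5000.0, 20000.0): 0.5,
    (15000.0, 25000.0): 0.2,
}

def belief(bpa, lo, hi):
    """Bel([lo, hi]): total mass of focal intervals wholly inside [lo, hi]."""
    return sum(m for (a, b), m in bpa.items() if lo <= a and b <= hi)

def plausibility(bpa, lo, hi):
    """Pl([lo, hi]): total mass of focal intervals that intersect [lo, hi]."""
    return sum(m for (a, b), m in bpa.items() if a <= hi and b >= lo)

# Belief is a lower bound and plausibility an upper bound on the
# likelihood that the quantity lies in the query interval.
print(belief(bpa, 2000.0, 20000.0))        # first two focal intervals: 0.8
print(plausibility(bpa, 2000.0, 20000.0))  # all three intersect: 1.0
```

For any query interval, Bel ≤ Pl, and the gap between them reflects the epistemic imprecision in the evidence; with a single focal interval the two coincide and the analysis reduces to plain interval analysis.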
Technical Report
Full-text available
This report summarises approaches to the simulation of the impacts of CSG extraction in regional scale groundwater flow models. It summarises the literature relating to regional scale groundwater modelling approaches and discusses two aspects of the modelling process in detail, specifically: strategies for parameter upscaling and spatially interpolating, and, numerical representations of fault and fracture systems in groundwater flow models.
... These definitions are adopted from the papers by Oberkampf et al. [9][10][11]. Epistemic uncertainty regarding a variable can be of two types: a poorly known stochastic quantity [12] or a poorly known deterministic quantity [13]. ...
Article
Full-text available
This paper proposes evidence theory based methods to both quantify the epistemic uncertainty and validate computational model. Three types of epistemic uncertainty concerning input model data, that is, sparse points, intervals, and probability distributions with uncertain parameters, are considered. Through the proposed methods, the given data will be described as corresponding probability distributions for uncertainty propagation in the computational model, thus, for the model validation. The proposed evidential model validation method is inspired by the idea of Bayesian hypothesis testing and Bayes factor, which compares the model predictions with the observed experimental data so as to assess the predictive capability of the model and help the decision making of model acceptance. Developed by the idea of Bayes factor, the frame of discernment of Dempster-Shafer evidence theory is constituted and the basic probability assignment (BPA) is determined. Because the proposed validation method is evidence based, the robustness of the result can be guaranteed, and the most evidence-supported hypothesis about the model testing will be favored by the BPA. The validity of proposed methods is illustrated through a numerical example.
... That is, they identify the ALE of an attack and the RM level of a specific countermeasure as the most challenging, thus requiring a considerable effort. A great improvement toward this goal is made in [149], where the authors propose statistical methodologies to estimate the above parameters using epistemic uncertainty [150]. The second limitation is only partially addressed in [145], where the authors study the results of their model applying a combination of two or more countermeasures. ...
Article
It is beyond doubt that today the volume and sophistication of cyber attacks keep growing, fueling an endless arms race between attackers and defenders. In this context, full-fledged frameworks, methodologies, or strategies that are able to offer optimal or near-optimal reaction in terms of countermeasure selection, preferably in a fully or semi-automated way, are in high demand. This is reflected in the literature, which encompasses a significant number of major works on this topic spanning a time period of 5 years, that is, from 2012 to 2016. The survey at hand has a dual aim, namely: first, to critically analyze all the pertinent works in this field, and second, to offer an in-depth discussion and side-by-side comparison among them based on 7 common criteria. Also, a quite extensive discussion is offered to highlight the shortcomings and future research challenges and directions in this timely area.