Figure 6 - uploaded by Leliane Nunes de Barros
The Björnstorp estate. The gray areas represent the forest stands and the white areas represent the non-forest areas.

Source publication
Conference Paper
Two noteworthy models of planning in AI are probabilistic planning (based on MDPs and their generalizations) and nondeterministic planning (mainly based on model checking). In this paper we: (1) show that probabilistic and nondeterministic planning are extremes of a rich continuum of problems that deal simultaneously with risk and (Knightia...

Context in source publication

Context 1
... forest land is divided into 623 stands that are mainly dominated by Norway spruce. See Figure 6 for an overview of the Björnstorp estate. The estate was modeled with twenty-year-long time periods, and for each stand two management activities could be selected: to clear-cut the stand or not to clear-cut the stand. ...
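To make the scale of that model concrete, here is a minimal sketch (hypothetical Python encoding; the constant names and the helper are ours, not from the paper) of the per-stand decision structure described above:

    from itertools import product

    N_STANDS = 623                     # forest stands on the Björnstorp estate
    PERIOD_YEARS = 20                  # length of one decision period
    ACTIONS = ("clear_cut", "no_cut")  # the two management activities per stand

    # Each period, a joint decision assigns one action to every stand, so the
    # joint action space has 2**623 elements -- far too many to enumerate,
    # which is why such problems are handled stand by stand or with
    # factored/approximate methods.
    def joint_actions_for(stands):
        """Enumerate joint actions for a (small!) subset of stands."""
        return product(ACTIONS, repeat=len(stands))

    # Example: the 8 joint decisions for a 3-stand subset.
    for decision in joint_actions_for(range(3)):
        print(decision)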

Similar publications

Conference Paper
In this paper an overview is given of the principles of probabilistic budgeting and time planning. Uncertainties related to normal and special events are described. Analytical expressions are presented. To deal with correlations between special events, an alternative to the classical product moment correlation coefficient is proposed.
Article
The aim of this work is to discuss model uncertainties in the case of the biomechanics of phonation. A number of mechanical models of voice production have been proposed in past years; but, in general, they have a deterministic nature. In previous works, data uncertainties were incorporated into a two-mass model of the vocal folds to perform a pro...

Citations

... As future work, we aim to expand the application of CG-iLAO* to more complex models that can benefit from our iterative method of generating applicable actions. Models with imprecise parameters, such as MDPIPs and MDP-STs (White III and Eldeib 1994; Trevizan, Cozman, and Barros 2007), are suitable candidates for our approach since they have a minimax semantics for the Bellman equations. In this minimax semantics, the value function minimises the expected cost-to-go assuming that an adversary aims to maximise the cost-to-go by selecting the values of the imprecise parameters. ...
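One common way to write that minimax semantics (a sketch in our notation for a cost-based model; K(s,a) denotes the set of admissible transition distributions, which the excerpt does not spell out):

    % Minimax Bellman equation for an MDP with imprecise parameters:
    % the adversary picks the worst distribution from the credal set K(s,a).
    V(s) = \min_{a \in A(s)} \max_{P \in K(s,a)}
           \Big[ c(s,a) + \sum_{s'} P(s' \mid s,a) \, V(s') \Big]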
Article
Current methods for solving Stochastic Shortest Path Problems (SSPs) find states’ costs-to-go by applying Bellman backups, where state-of-the-art methods employ heuristics to select states to back up and prune. A fundamental limitation of these algorithms is their need to compute the cost-to-go for every applicable action during each state backup, leading to unnecessary computation for actions identified as sub-optimal. We present new connections between planning and operations research and, using this framework, we address this issue of unnecessary computation by introducing an efficient version of constraint generation for SSPs. This technique allows algorithms to ignore sub-optimal actions and avoid computing their costs-to-go. We also apply our novel technique to iLAO* resulting in a new algorithm, CG-iLAO*. Our experiments show that CG-iLAO* ignores up to 57% of iLAO*’s actions and it solves problems up to 8x and 3x faster than LRTDP and iLAO*.
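For intuition, the connection to operations research can be sketched as a linear program (a schematic view in our notation, not the paper's exact formulation): each applicable action contributes one constraint per state, and constraint generation adds these constraints lazily, which is what lets sub-optimal actions be ignored.

    % Schematic LP view of an SSP with cost function c:
    \max \sum_{s} V(s)
    \quad \text{s.t.} \quad
    V(s) \le c(s,a) + \sum_{s'} P(s' \mid s,a) \, V(s')
    \qquad \forall s,\ \forall a \in A(s)
    % Constraint generation starts from a subset of the action constraints
    % and only adds a constraint when it is violated, so the cost-to-go of
    % actions that remain provably sub-optimal is never computed.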
... White and Eldeib (1986) and White and Eldeib (1994) also produced early uncertain MDP models, with similar work by Harmanec (2002) in the planning literature. From the general class of MDPs with imprecise parameters (Delgado et al. 2011a, 2016), bounded parameter MDPs (Givan et al. 2000) and MDPs with set-valued transitions (Trevizan et al. 2007) were later developed as special cases in the artificial intelligence community. In the operations research literature, robust MDPs allow the parameters for the transition matrix to vary within uncertainty sets and use a worst-case solution approach (Wiesemann et al. 2013; Nilim and El Ghaoui 2005; Iyengar 2005). ...
Article
Recent advances in decision making have incorporated both risk and ambiguity in decision theory and optimization methods. These methods implement a variety of uncertainty representations from probabilistic and non-probabilistic foundations, including traditional probability theory, sets of probability measures, uncertainty sets, ambiguity sets, possibility theory, evidence theory, fuzzy measures, and imprecise probability. The choice of uncertainty representation impacts the expressiveness and tractability of the decision models. We survey recent approaches for representing uncertainty in both decision making and optimization to clarify the trade-offs among the alternative representations. Robust and distributionally robust optimization are surveyed, with particular attention to standard form ambiguity sets. Applications of uncertainty and decision models are also reviewed, with a focus on recent optimization applications. These applications highlight common practices and potential research gaps. The intersection of behavioral decision making and robust optimization is a promising area for future research and there is also opportunity for further advances in distributionally robust optimization in sequential and multi-agent settings.
... Risk management is the process through which project managers identify, consider, and attempt to mitigate potential risks to projects. There are a number of terminological issues discussed in literature relating to risk, such as the difference between risks, hazards, and uncertainties (Trevizan et al., 2007). The word "risk" in this paper is used loosely, in line with common usage of the word, which is defined by the Macquarie dictionary as "the state of being open to the chance of injury or loss". ...
Article
This paper has considered risk management, financial evaluation and funding in seven Australian wastewater and stormwater reuse projects. From the investigated case studies it can be seen that responsible parties have generally been well equipped to identify potential risks. In relation to financial evaluation methods some serious discrepancies, such as time periods for analysis, and how stormwater benefits are valued, have been identified. Most of the projects have required external, often National Government, funding to proceed. As National funding is likely to become less common in the future, future reuse projects may need to be funded internally by the water industry. In order to enable this the authors propose that the industry requires (1) a standard project evaluation process, and (2) an infrastructure funders' forum (or committee) with representation from both utilities and regulators, in order to compare and prioritise future reuse projects against each other.
... For a multitude of reasons, ranging from dynamic environments to conflicting elicitations from experts, from insufficient data to aggregation of states in exponentially large problems, researchers have previously highlighted the difficulty of exactly specifying reward and transition models in Markov Decision Problems. Motivated by this difficulty, a wide variety of models, objectives and algorithms have been presented in the literature: namely Markov Decision Processes (MDPs) with Imprecise Transition Probabilities (White and Eldeib 1994), Bounded-parameter MDPs (Givan, Leach, and Dean 2000), robust MDPs (Nilim and Ghaoui 2005; Iyengar 2004), reward-uncertain MDPs (Regan and Boutilier 2009; Xu and Mannor 2009), uncertain MDPs (Bagnell, Ng, and Schneider 2001; Trevizan, Cozman, and de Barros 2007; Ahmed et al. 2013), etc. We broadly refer to all the above models as uncertain MDPs in the introduction. ...
... Such an objective yields conservative policies (Delage and Mannor 2010), as it primarily assumes that the worst case can be terminal. The second thread (Trevizan, Cozman, and de Barros 2007; Regan and Boutilier 2009; Xu and Mannor 2009; Ahmed et al. 2013) focusses on the minimax objective, i.e., computing the policy that minimizes the maximum regret (difference from optimal for that instantiation) over all instantiations of uncertainty. The regret objective addresses the issue of conservative policies and can be considered an alternate definition of robustness. ...
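Schematically (our notation; ξ ranges over instantiations of the model uncertainty), the minimax-regret objective described here is:

    % Minimax regret: choose the policy whose worst-case gap to the
    % instantiation-optimal value V*_xi is smallest.
    \pi^{*} = \operatorname*{arg\,min}_{\pi} \; \max_{\xi}
              \big[ V^{*}_{\xi} - V^{\pi}_{\xi} \big]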
Article
Markov Decision Problems (MDPs) offer an effective mechanism for planning under uncertainty. However, due to unavoidable uncertainty over models, it is difficult to obtain an exact specification of an MDP. We are interested in solving MDPs where transition and reward functions are not exactly specified. Existing research has primarily focussed on computing infinite horizon stationary policies when optimizing robustness, regret and percentile based objectives. We focus specifically on finite horizon problems with a special emphasis on objectives that are separable over individual instantiations of model uncertainty (i.e., objectives that can be expressed as a sum over instantiations of model uncertainty): (a) First, we identify two separable objectives for uncertain MDPs: Average Value Maximization (AVM) and Confidence Probability Maximisation (CPM). (b) Second, we provide optimization based solutions to compute policies for uncertain MDPs with such objectives. In particular, we exploit the separability of AVM and CPM objectives by employing Lagrangian dual decomposition (LDD). (c) Finally, we demonstrate the utility of the LDD approach on a benchmark problem from the literature.
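The separability that the abstract exploits is visible in the additive form of the AVM objective (a sketch in our notation; the weight p(ξ) over instantiations ξ of the uncertainty is our assumption, not quoted from the paper):

    % Average Value Maximization: a sum over instantiations, which is what
    % makes Lagrangian dual decomposition applicable.
    \max_{\pi} \; \sum_{\xi} p(\xi) \, V^{\pi}_{\xi}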
... The case of an unknown or partially known reward function has been considered in several studies in the context of Markov decision processes [6][7][8][9] as well as in reinforcement learning [10][11][12]. In addition, our work is related to preference-based reinforcement learning [13,14], learning from demonstration [15] and apprenticeship learning [16]. ...
... In [1,15,5], the case where only the probabilities are not accurately known is considered, and solution methods are proposed for searching for robust solutions, i.e. optimizing the worst case. More generally, a unifying extension of MDPs allowing for different kinds of uncertainty has been proposed by [21]. Recently, the dual case where only rewards are partially known has been studied. ...
Article
Setting the values of rewards in Markov decision processes (MDPs) may be a difficult task. In this paper, we consider two ordinal decision models for MDPs where only an order is known over rewards. The first one, which has been proposed recently in MDPs [23], defines preferences with respect to a reference point. The second model, which can be viewed as the dual approach of the first one, is based on quantiles. Based on the first decision model, we give a new interpretation of rewards in standard MDPs, which sheds some interesting light on the preference system used in standard MDPs. The second model, based on quantile optimization, is a new approach in MDPs with ordinal rewards. Although quantile-based optimality is state-dependent, we prove that an optimal stationary deterministic policy exists for a given initial state. Finally, we propose solution methods based on linear programming for optimizing quantiles.
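For intuition, the generic τ-quantile of a policy's return distribution (the textbook definition; the paper's state-dependent ordinal criterion may differ in detail) is:

    % tau-quantile of the return R of policy pi started in state s:
    q_{\tau}(\pi, s) = \inf \big\{ r : \Pr\big( R^{\pi}_{s} \le r \big) \ge \tau \big\}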
... Recent solutions to BMDPs include extensions of real-time dynamic programming (RTDP) [47] and LAO* [48,49] that search for the best policy under the worst model. The Markov Decision Process with Set-valued Transitions (MDPST) [50] is another subclass of MDPIPs, where probability distributions are given over finite sets of states. Since BMDPs and MDPSTs are special cases of MDPIPs, we can represent both by "flat" MDPIPs. ...
Article
This paper investigates Factored Markov Decision Processes with Imprecise Probabilities (MDPIPs); that is, Factored Markov Decision Processes (MDPs) where transition probabilities are imprecisely specified. We derive efficient approximate solutions for Factored MDPIPs based on mathematical programming. To do this, we extend previous linear programming approaches for linear approximations in Factored MDPs, resulting in a multilinear formulation for robust “maximin” linear approximations in Factored MDPIPs. By exploiting the factored structure in MDPIPs we are able to demonstrate orders of magnitude reduction in solution time over standard exact non-factored approaches, in exchange for relatively low approximation errors, on a difficult class of benchmark problems with millions of states.
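The flavour of such a formulation can be sketched as follows (a schematic maximin constraint in our notation, not the paper's exact program; K(s,a) is the credal set of admissible transition distributions):

    % Linear value approximation V(s) = sum_i w_i h_i(s) over basis h_i.
    % Imprecision makes the usual approximate-LP constraints maximin, and
    % hence multilinear in the weights w and the adversarial P:
    \sum_i w_i h_i(s) \;\ge\; \max_{P \in K(s,a)}
        \Big[ R(s,a) + \gamma \sum_{s'} P(s' \mid s,a) \sum_i w_i h_i(s') \Big]
    \qquad \forall s, a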
... But some works already exist for more general theories like imprecise probabilities and capacity theory [6, 24]. In summary, these are the MDPIP (MDP with Imprecise transition Probabilities) [14] and the BMDP (Bounded-parameter MDP) [5], in which the transition probabilities and the rewards are replaced by intervals; the AMDP (Algebraic MDP) [8], which is concerned with extensions of probabilities that can be written in an algebraic form (this is not the case of the belief functions); a possibilistic model has been proposed for qualitative MDPs [13], in which the observations and the preferences of the decision-maker are both modeled with the possibility theory; but the only generalization for belief functions is the MDPST (MDP with Set-valued Transitions) [23], which manipulates random sets (thus belief functions): ...
Conference Paper
This paper proposes a new model, the EMDP (Evidential Markov Decision Process). It is an MDP (Markov Decision Process) for belief functions in which rewards are defined for each state transition, as in a classical MDP, whereas the transitions are modeled as in an EMC (Evidential Markov Chain), i.e. they are transitions between sets of states instead of transitions between states. The EMDP fits more applications than an MDPST (MDP with Set-valued Transitions). Generalizing to belief functions allows us to cope with applications with high uncertainty (imprecise or lacking data) where probabilistic approaches fail. Implementation results are shown on a search-and-rescue unmanned rotorcraft benchmark.
... Another, more substantial, contribution of the paper is the development of algorithms for consequentialist sequential decision making expressed through decision trees [56] and influence diagrams [34]. Algorithms for decision making under Γ-Maximin and similar criteria have appeared in many settings [58,69,74], while algorithms for decision making under Maximality and E-admissibility have been suggested by Kyburg and Pittarelli [43] and proposed more recently by Kikuti et al. [42] and Utkin and Augustin [72]. Section 3 presents algorithms and computational analysis for several criteria of choice. The most valuable contribution of Section 3 is the algorithm for E-admissibility. ...
... Sometimes the very language in which preferences are expressed allows for partial specification; this is particularly relevant in artificial intelligence applications. For instance, the semantics of "nondeterministic" actions in planning [6] is that these actions have effects whose probabilities are unknown, and consequently it is not possible to completely order them with respect to expected utility [69]. Another suggestive example is the theory of CP-nets [7], in which a graph-theoretical language organizes preferences about features of outcomes rather than outcomes themselves. ...
Article
This paper presents new insights and novel algorithms for strategy selection in sequential decision making with partially ordered preferences; that is, where some strategies may be incomparable with respect to expected utility. We assume that incomparability amongst strategies is caused by indeterminacy/imprecision in probability values. We investigate six criteria for consequentialist strategy selection: Γ-Maximin, Γ-Maximax, Γ-Maximix, Interval Dominance, Maximality and E-admissibility. We focus on the popular decision tree and influence diagram representations. Algorithms resort to linear/multilinear programming; we describe implementation and experiments.
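Two of the six criteria in symbols (standard definitions, not quoted from the paper; Γ is the credal set of probability measures and U(σ) the utility of strategy σ):

    % Gamma-Maximin: best worst-case expected utility over Gamma.
    \sigma^{*} = \operatorname*{arg\,max}_{\sigma} \; \min_{P \in \Gamma} E_{P}[U(\sigma)]

    % E-admissibility: sigma is E-admissible iff some P in Gamma makes it
    % an expected-utility maximizer.
    \exists P \in \Gamma :\ \sigma \in \operatorname*{arg\,max}_{\sigma'} E_{P}[U(\sigma')]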
... • MDP with Set-valued Transitions (MDPSTs) [143,144], ...
... They appear as random sets. The MDPST [143] is a generalization of the MDP model. It is modeled by a tuple (Ω, A, F, R, P0) where the parameters are defined as for an MDP except that the transition function F is set-valued. Trevizan et al. [143,144] proposed this MDPST model with a single initial state, i.e. the initial probability vector P0 is equal to 1 for one single coordinate and 0 elsewhere. ...
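A minimal data-structure sketch of that tuple (hypothetical Python encoding; the field names follow the excerpt, not any published implementation):

    from dataclasses import dataclass
    from typing import Callable, Dict, FrozenSet, Tuple

    State = str
    Action = str

    @dataclass
    class MDPST:
        """MDP with Set-valued Transitions: (Omega, A, F, R, P0).

        F maps a (state, action) pair to a probability distribution over
        *sets* of states; which element of a reached set actually occurs
        is left unquantified (Knightian uncertainty).
        """
        states: FrozenSet[State]                          # Omega
        actions: FrozenSet[Action]                        # A
        transitions: Dict[Tuple[State, Action],
                          Dict[FrozenSet[State], float]]  # F: prob. over state sets
        reward: Callable[[State, Action], float]          # R
        initial_state: State                              # P0: degenerate, per [143,144]

    # Tiny example: action "go" from s0 reaches the set {s1, s2} with
    # probability 1, but which of s1/s2 occurs is not quantified.
    m = MDPST(
        states=frozenset({"s0", "s1", "s2"}),
        actions=frozenset({"go"}),
        transitions={("s0", "go"): {frozenset({"s1", "s2"}): 1.0}},
        reward=lambda s, a: -1.0,
        initial_state="s0",
    )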
Thesis
In applications such as crisis management, the decision-maker's role is crucial: if he makes a good and early choice of which actions to take, given his limited available resources, he can avoid an important part of the human or financial losses. We focus on a Markov modeling of a system, which is suitable in many situations (propagating phenomena...). We aim at defining risk measures as decision criteria in such systems, leading to a robust Markov Decision Process (MDP) (Puterman 1994). The motivation of this thesis is first to examine probabilistic risk measures, for which we consider Markov chain models of systems. But the main difficulty in applications such as early crisis management is that the data needed to determine a probabilistic model are missing, due to the uncertainty about what may happen, the lack of observations at the very beginning of a crisis, and their imprecision (e.g. textual information). This is why we consider a generalization of the Markov chain called the Evidential Markov Chain (EMC) (Lanchantin 2005), based on the Dempster-Shafer Theory of Evidence (Shafer 1976). We use this generalization to propose a new measure of risk and involve it in a generalized MDP (Evidential MDP). The main results of this thesis are:
* an algorithm for forecasting risk in Markov chains;
* an algorithm for forecasting risk in a Markov space-time model;
* an algorithm for forecasting risk in Evidential Markov Chains;
* the simulation of a crisis with an Evidential Markov Chain;
* evidential measures of uncertainty applicable to risk (generalized channel capacity);
* an algorithm for solving the Evidential MDP (EMDP);
* an application of the EMDP to a search-and-rescue robot.
These contributions are illustrated on the following case studies:
* a geopolitical crisis;
* a benchmark of the IPC (International Planning Competition) in robotics for search-and-rescue;
* small examples of crisis management.