Non-deterministic Model Validation Methodology for Simulation-based Safety Assessment of Automated Vehicles
Stefan Riedmaier^a,1,*, Jakob Schneider^a, Benedikt Danquah^a, Bernhard Schick^b, Frank Diermeyer^a
aTechnical University of Munich, Institute of Automotive Technology, Boltzmannstr. 15, 85748 Garching b. München, Germany
bKempten University of Applied Sciences, Bahnhofstr. 61, 87435 Kempten, Germany
Abstract
Safeguarding and type approval of automated vehicles is a key enabler for their market launch in our complex traffic environment. Scenario-based testing by means of computer simulation is becoming increasingly important to cope with the enormous complexity and effort. However, there is a huge gap when assessing the safety of the virtual vehicle while the real vehicle will drive on the road. Simulation must be accompanied by model validation to ensure its credibility since errors and uncertainties are inherent in every model. Unfortunately, this is rarely addressed in the current literature. In this paper, a modular process is presented covering both model validation and safeguarding. It is characterized by the fact that it quantifies a large number of errors and uncertainties, represents them in the form of an error model and ultimately integrates them into the safeguarding results. It is applied to a type approval regulation for the lane keeping behavior of a vehicle under various scenario conditions. The paper contains a thorough validation of the methodology itself by comparing its results with actual ground truth values. For this comparison, a binary classifier and confusion matrices are used that relate the binary type approval decisions. The classifier demonstrates that the methodology of this paper identifies a systematic error of the simulation model across several safeguarding scenarios. Finally, the paper provides recommendations for alternative configurations of the modular methodology depending on different requirements.
Keywords: Automated vehicles, Model validation, Predictive models, Safety assessment, Systems simulation,
Uncertainty aggregation
1. Introduction
In recent years, automated driving has raised great hopes of making future mobility more environmentally friendly, comfortable and safe. Today, vehicles with assistance systems and partial automation are available on the market, but automated vehicles (AVs) from Level 3 onward according to SAE [1] pose great challenges for industry and academia. It is not only challenging to develop AVs, but especially to prove their safety in our complex traffic environment with an infinite amount of different situations [2]. The United Nations Economic Commission for Europe (UNECE) has developed a proposal for the type approval of Automated Lane Keeping Systems (ALKS), which should pave the way for the market release of Level 3 vehicles in the next years [3].
Various safety assessment approaches have been developed by the entire AV research community, since classical real-world testing approaches reach their limits for AVs due to enormously high mileage [4, 5]. Above all, the scenario-based approach is to be mentioned here, as it is frequently used in the literature and as it was implemented in the two large research projects PEGASUS [6] and ENABLE-S3 [7]. It is a promising method that tests the AV in individual traffic situations, mainly by means of systems simulation [8] and by discarding most of the driving time without any actions and events. The scenario-based approach can be accompanied by further safety assessment approaches. System functions of the AV can be compared with requirement specifications and standards [9], models of the controller and trajectory planner can be verified by formal methods [10], traffic simulations can analyze the overall impact of AVs on traffic [11], etc. A comprehensive overview of safety assessment approaches can be found in our previous paper [12].

† This document is the result of the research project funded by TÜV SÜD Auto Service GmbH.
* Corresponding author. Email address: riedmaier@ftm.mw.tum.de (Stefan Riedmaier)
1 S. Riedmaier is the first author.
Preprint submitted to Simulation Modelling Practice and Theory, November 6, 2020
We focus in the following on the scenario-based approach, since it is currently the most widely used approach.
The majority of the safety assessment literature uses mathematical or computational models for their proof of concept, as the real testing effort is hardly feasible. However, the model-based techniques are very rarely accompanied by model validation activities, especially in quantitative terms [12]. This represents a very risky gap. Without validation of the simulation models, there can be no trustworthiness of the models and ultimately no credible decision making for such safety-critical systems as AVs.
In our previous paper [13], we gave a survey about publications across several engineering fields, which present Verification, Validation and Uncertainty Quantification (VV&UQ) approaches to assess the credibility of simulation models. The few papers from the automotive safety assessment field were facing great challenges. They often focus only on individual components such as vehicle dynamics or environmental sensors. They use simple qualitative or deterministic validation approaches without considering different types of errors and uncertainties inherent in any simulation model. They do not analyze the effect of those errors and uncertainties on the final safety assessment decisions. The papers from numerical engineering fields presented enhanced VV&UQ approaches. However, those could, to the best of our knowledge, never be applied to complex practical problems [14] such as cyber-physical systems [15]. Therefore, we introduced a novel, modular and unified VV&UQ framework in [13] and integrated several approaches so that the different engineering fields can benefit from each other.
After presenting the general framework, this paper focuses on its application to the specific use case of safeguarding AVs, on its validation and on the comparison of different framework manifestations. In this paper we are less concerned with the quality of a single simulation model compared to reality and more with the quality of our VV&UQ methodology itself. We therefore apply the Method of Manufactured Universes (MMU) [16], which compares a simulation model with a manufactured universe to analyze important characteristics. The application of the VV&UQ methodology to the comparison between simulation and reality follows in a subsequent paper.
The main contributions are:
1. a comprehensive overview about the validation of AV models with derivation of requirements,
2. our modular VV&UQ framework in specific configuration for model-based safety assessment of AVs,
3. the first application of a VV&UQ approach with aggregation of errors and uncertainties to AVs,
4. a detailed analysis and fair comparison of a deterministic and a non-deterministic manifestation of the framework based on MMU.
Section 2 summarizes the state of the art in validation of AV models and in VV&UQ in general. Section 3 concludes the state of the art with an extensive analysis and derivation of requirements for our methodology. Section 4 illustrates the use case of the type approval of an automated vehicle in this paper and the configuration of the manufactured universe. Section 5 describes our model-based methodology based on the use case. Section 6 provides a detailed evaluation of the results in order to validate and compare different manifestations of the methodology. Finally, the conclusion in Section 7 summarizes the most important research findings.
2. Related Work
This section begins by outlining fundamental sources of modeling errors and uncertainties to motivate why VV&UQ activities are so important. Then, it gives a comprehensive overview about validation of AV models. Since this field of application is strongly characterized by deterministic systems simulations (point predictions) and hardly any distinction is made between different sources of errors and uncertainties, a non-deterministic approach is presented in the last part. Further information can be found in our survey paper [13]. Here, we highlight individual aspects that are integral to this paper's central theme and understanding².

² Section 2.1 is a compact summary of [13, Sec. 2.2.2, 2.2.4, 2.2.5] and Section 2.3 of [13, Sec. 5.2.3, 6.2.2, 6.4.5, 7.1.1]. Section 2.2 is a restructured and revised version of [13, Sec. 7.2].
Table 1: Classification of references addressing AV model validation. The columns contain the affiliation of the reference to a component according to Sec. 2.2.1-2.2.4. The rows distinguish between deterministic and non-deterministic simulation. References of the same research group are combined in the same bracket.

                    Sensor                            Vehicle          Closed-loop               Traffic
Deterministic       [22], [23, 24], [25], [26], [27]  [28, 29], [30]   [31, 32, 33], [34], [35]  [36], [37], [38], [39]
Non-deterministic   -                                 [40, 41], [42]   [43, 10], [44]            [45]
2.1. Sources of Errors and Uncertainties
Errors are inherent in every simulation model, since a model is by definition a simplified abstraction of reality. The error, denoted e, indicates the deviation compared to the true value of nature y_true. However, it is often quite difficult to precisely quantify the errors of a simulation model. As soon as it is not possible, associated uncertainties arise. Two basic types of uncertainties can be distinguished. Aleatory uncertainties describe stochastic effects and natural variability and can be quantified by probability distributions. Epistemic uncertainties exist when knowledge is poor and can therefore either be reduced or at least be quantified, e.g. by intervals treating all values equally [17, Sec. 1.2].
One source of errors and uncertainties results from the calculation of computational models with computers of finite precision compared to the exact solution of the mathematical model. These numerical errors can vary greatly depending on the specific application. They include both basic code errors and errors in solving the equations, such as rounding errors or the discretization error e_h. During model verification, the first category is treated by code verification and the second category by solution verification [18, Sec. 2].
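A common way to estimate the discretization error e_h during solution verification is Richardson extrapolation, which is also used later in the PBA framework. The following is a minimal sketch under assumed conditions: the solver, the toy ODE dy/dt = -y and all numbers are illustrative choices of ours, not taken from this paper.

```python
# Hedged sketch: estimating the discretization error e_h via Richardson
# extrapolation from two solutions at different step sizes. The forward
# Euler integrator and the ODE dy/dt = -y are illustrative assumptions.
import math

def integrate_forward_euler(h, t_end=1.0, y0=1.0):
    """First-order Euler integration of dy/dt = -y (formal order p = 1)."""
    y = y0
    for _ in range(round(t_end / h)):
        y += h * (-y)
    return y

p = 1                                       # formal order of accuracy
f_h  = integrate_forward_euler(h=0.01)      # coarse-grid solution
f_h2 = integrate_forward_euler(h=0.005)     # halved step size

# Richardson extrapolation toward the exact mathematical solution
f_exact_est = f_h2 + (f_h2 - f_h) / (2**p - 1)
e_h = f_h2 - f_exact_est    # estimated discretization error, fine grid

print(f_exact_est, e_h)     # extrapolated value close to exp(-1)
```

The extrapolated value replaces the (unknown) exact solution of the mathematical model, so the difference to the fine-grid result serves as the e_h estimate entering the uncertainty budget.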
Another source of errors and uncertainties arises from model inputs (e_x) and parameters (e_θ). Whereas they are usually assumed fully characterized in deterministic simulations, this is not the case for partially characterized and uncharacterized experiments as well as unknown model prediction conditions. Input uncertainty quantification intends to rigorously quantify those uncertainties. In non-deterministic simulations, the input uncertainties are propagated through the simulation model and aggregated on the output side [19].
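The propagation and output-side aggregation just described can be sketched with plain Monte Carlo sampling. The toy model g_m, the normal input distribution and its parameters are assumptions for illustration only.

```python
# Hedged sketch: propagating an aleatory input uncertainty through a
# simulation model by Monte Carlo sampling and aggregating the output
# as an empirical CDF. The toy model g_m is an invented placeholder.
import bisect
import random

random.seed(0)

def g_m(x):
    """Placeholder simulation model: quadratic response to input x."""
    return 2.0 * x + 0.5 * x**2

# Aleatory input: normally distributed, e.g. a measured disturbance
samples = [random.gauss(5.0, 1.0) for _ in range(10_000)]

# Propagate each input sample through the model
outputs = sorted(g_m(x) for x in samples)

def ecdf(y_query):
    """Empirical CDF of the model output at y_query."""
    return bisect.bisect_right(outputs, y_query) / len(outputs)

print(ecdf(22.5))   # P(output <= g_m(5.0)), roughly 0.5 here
```

Each query of the ECDF answers "with what relative frequency does the output stay below this threshold", which is exactly the aggregated non-deterministic result used downstream.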
Furthermore, the model-form itself contains errors e_m and uncertainties due to assumptions or the selection of an inadequate equation. They can be quantified based on physical data from model validation experiments. The experiments are re-simulated and the results of the physical system g_s and the simulation model g_m are compared using validation metrics [20].

The involved errors can be summarized as follows [18, eq. (1-5-6)]:

    y_true = g_s(x) − e_y,obs                                (1)
           = g_m(x, θ, h) − (e_m + e_x + e_θ + e_h).         (2)
The various sources of errors and uncertainties are deeply interwoven and interfere with each other. Depending on the constellation, they can magnify or compensate each other, which results in a misleading trustworthiness of the simulation model [18, Sec. 1]. This effect is sometimes seen during model calibration, when an inadequate model-form is compensated by adjusting the parameters far beyond their physical significance until the results coincide under certain calibration conditions [21]. It is therefore helpful to quantify the different uncertainties separately as far as possible. Certainly, additional sources such as measurement errors e_y,obs or extrapolation errors caused by model predictions outside the scope of validity make this even more difficult.
2.2. Validation of Automated Vehicle Models
We sort this section on the validation of AV models based on different components that form the entire AV, since there exists some literature on component level, but just a few references on system level. The central modules of an AV include environmental sensors, several software units for environment perception, motion planning and control, as well as actuators and vehicle dynamics. An overview about most of the references is given in Table 1.
2.2.1. Sensor Model Validation
The development of models for camera, lidar and radar sensors is currently a key enabler for simulation-based safety assessment of AVs. They are essential for realistic simulations and were recently used to identify challenging test scenarios for AVs [46]. Holder et al. [47] and Rosenberger et al. [48] systematically derive requirements for sensor models to determine required fidelity levels. Schaermann et al. [24] distinguish parametric, non-parametric and ray-tracing approaches for sensor modeling, either on the level of raw sensor data or of processed object lists.

Regarding sensor model validation, it is also possible to distinguish between validation on raw data or object list level. Hanke et al. [23] and Schaermann et al. [24] compare a lidar sensor and its corresponding model with regards to raw point clouds and occupancy grids of traffic objects. For the comparison, they select three validation metrics that can handle the complex nature of the data. They are referred to as overall error, Barons and Pearson correlation coefficients. Abbas et al. [22] develop a metric to compare the visual complexity of real and synthetic images as camera raw data based on color and spatial information.

Nentwig et al. [26] apply a classifier to camera images to perform comparisons on the level of object lists. They compare both the object hypothesis and the bounding boxes from the classifier, the latter based on the box sizes and the former based on a confusion matrix. Gaidon et al. [25] also apply classifiers and present eight metrics to measure the real-to-virtual performance gap. Zec et al. [27] validate the sensor fusion of camera and radar models. On the one hand, they compare time signals of processed traffic objects with a log-likelihood and a Root Mean Square Error (RMSE). On the other hand, they compare statistical histogram data with a Jensen-Shannon divergence.
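Two of the metrics named above, RMSE on paired time signals and the Jensen-Shannon divergence on histogram data, can be sketched compactly. The example signals and histograms below are invented for illustration and are not data from the cited works.

```python
# Hedged sketch of two validation metrics mentioned in the text: RMSE on
# paired time signals and Jensen-Shannon divergence on normalized
# histograms. All example data is invented.
import math

def rmse(real, sim):
    """Root Mean Square Error between two equally sampled signals."""
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(real, sim)) / len(real))

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

real_signal = [0.0, 0.4, 0.9, 1.3, 1.6]
sim_signal  = [0.1, 0.5, 0.8, 1.4, 1.5]
print(rmse(real_signal, sim_signal))       # approx. 0.1

hist_real = [0.1, 0.3, 0.4, 0.2]           # normalized histograms (sum to 1)
hist_sim  = [0.2, 0.3, 0.3, 0.2]
print(js_divergence(hist_real, hist_sim))  # small value; 0 iff identical
```

Unlike the RMSE, the Jensen-Shannon divergence is symmetric and bounded, which makes it convenient for comparing statistical histogram data.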
2.2.2. Vehicle Model Validation
For validation of vehicle dynamics models, standardized test maneuvers such as steady-state cornering [29] or sine with dwell [28] are used. They provoke a characteristic behavior of the vehicle, which can be evaluated on the basis of time signals and Key Performance Indicators (KPIs). Both standards were developed with regard to the simulation-based type approval of electronic stability control systems. The standard [29] assumes one deterministic simulation of the steady-state cornering maneuver as a baseline, which results in one time signal of the steering wheel angle, the sideslip angle and the roll angle, plotted against the lateral acceleration, respectively. It adds a pre-defined tolerance band around each baseline signal and checks whether multiple repetitions of physical experiments fall within these bands. Afterwards, the standard [28] performs a similar comparison based on tolerances for specific KPIs in the sine with dwell maneuver. If all tolerances are met, the generic validity of the vehicle dynamics model is concluded. Then, the actual type approval of different vehicle variants can take place in the simulation [49]. Kutluay and Winner [30] give a comprehensive overview of the literature in validation of vehicle dynamics models.
Kutluay [40] inverts the tolerance approach from those standards in his PhD thesis. Instead of adding the tolerances to the deterministic simulation, he calculates confidence intervals from the experimental repetitions based on Student's t-distribution and adds the tolerances to the confidence intervals around the experimental mean. He offers either absolute magnitudes or relative percentages for the tolerance values. Then, he can check whether the simulation falls within the area around the experimental mean. He distinguishes an averaged input case, where the experimental input data is averaged to perform one deterministic re-simulation, and an averaged output case, where each experiment is re-simulated and the average is taken on the result stage. According to Oberkampf and Roy [50, p. 492], the former will lead to significant deviations for strongly non-linear systems.

Viehof [41] extends the validation methodology from Kutluay [40] in his PhD thesis. He also re-simulates each experimental repetition, but then compares Probability Density Functions (PDFs), without averaging the outputs. He accepts the simulation model based on a statistical t-test, if the PDF from the model lies within the one from the experiment. In case of lower requirements, he switches back to the conventional tolerance values and checks whether the simulation PDF lies within the tolerance around the experimental mean. Similarly, Rhode [42] uses non-deterministic Monte Carlo simulations and confidence bands for vehicle dynamics model validation.
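The inverted tolerance check described above (a Student's-t confidence interval around the experimental mean, widened by a tolerance, then tested against one deterministic simulation result) can be sketched as follows. The measurement values, the tolerance and the hard-coded critical t-value are illustrative assumptions.

```python
# Hedged sketch of a tolerance-inverted validation check in the spirit of
# the approach described above. All numbers are invented; the critical
# t-value is hard-coded for 4 degrees of freedom at 95 % (two-sided)
# instead of being looked up via scipy.
import math
import statistics

def t_confidence_interval(repetitions, t_crit):
    """Confidence interval for the mean of repeated measurements."""
    mean = statistics.mean(repetitions)
    sem = statistics.stdev(repetitions) / math.sqrt(len(repetitions))
    return mean - t_crit * sem, mean + t_crit * sem

# Five repeated measurements of one KPI, e.g. a peak sideslip angle in deg
experiments = [2.11, 2.05, 2.19, 2.08, 2.14]
t_crit_95 = 2.776            # two-sided 95 %, n-1 = 4 degrees of freedom
lo, hi = t_confidence_interval(experiments, t_crit_95)

tolerance = 0.10             # absolute tolerance added on both sides
simulation_kpi = 2.25        # one deterministic simulation result
accepted = (lo - tolerance) <= simulation_kpi <= (hi + tolerance)
print(lo, hi, accepted)
```

The key inversion is that the interval is built from the experimental repetitions, not around the simulation baseline as in the standards.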
2.2.3. Closed-loop Model Validation
In our previous paper [35], we used an adaptive cruise control function to control a test vehicle and its corresponding vehicle model. We compared a purely virtual Model-in-the-Loop (MiL) simulation and a hybrid Vehicle-in-the-Loop (ViL) simulation on a chassis dynamometer with physical proving ground tests. As test scenarios, we selected vehicle following, emergency braking and cutting-in. We used qualitative graphical comparisons and quantitative validation metrics such as the correlation coefficient to validate the MiL and ViL simulation along the control pipeline.

Groh et al. [31], Notz et al. [32] and Wagner et al. [33] focus on the influence of the sensor data on the validation of the closed-loop AV, compared to independent ground truth measurements of the environment. They compare signals such as the velocity, trajectories or an overall risk measure along the control pipeline and evaluate the validation errors via box-plots. Aramrattana et al. [51] analyze the influence of modeling errors and uncertainties on control performance.
Schürmann et al. [10] use formal methods for safety assessment of AVs. They create non-deterministic models based on set theory and apply reachability analysis to determine which states the AV can reach from given initial states and possible inputs and parameters. If the reachable set of the AV does not intersect with predicted ones of other traffic participants, the AV is safe. In contrast to sampling techniques such as Monte Carlo, the reachability analysis can formally guarantee safety. They require non-deterministic models for the vehicle dynamics with controller and for the other traffic participants.

Hartung et al. [43] create those models by using a conformance testing technique. They define a formal notion of conformance, similar to validation metrics, and test whether the behavior of the non-deterministic model encloses the one from the physical system. Strictly speaking, the conformance testing is formulated as an inverse optimization problem, which optimizes the parameter sets so that the bound is as tight as possible. This can be seen as a model calibration approach rather than a model validation approach. Böde et al. [44] propose three notions of model validity and find an optimal split between calibration and validation data. Johnson et al. [34] apply formal methods based on correct-by-construction controller design and validate the closed-loop model by comparing the percentage of collision-free runs in a parking lot scenario.
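The enclosure idea behind such conformance tests can be illustrated with a deliberately simple one-dimensional sketch: a non-deterministic model yields an interval of possible outputs per time step, and conformance requires every measured sample to lie inside it. The interval model, the parameter range and the measurements are all invented; real conformance testing operates on reachable sets, not scalar intervals.

```python
# Hedged, strongly simplified sketch of an enclosure-style conformance
# check: a non-deterministic model predicts an output interval per time
# step and the measured trajectory must stay inside. Model and data are
# invented one-dimensional stand-ins for set-based reachability.

def model_interval(t, k_range=(0.8, 1.2)):
    """Interval of positions the model can reach at time t when an
    uncertain gain-like parameter k varies over k_range (monotone in k)."""
    k_lo, k_hi = k_range
    return k_lo * t, k_hi * t

# Measured (time, position) samples from the physical system
measured = [(1.0, 0.95), (2.0, 1.9), (3.0, 3.1)]

conformant = all(
    model_interval(t)[0] <= pos <= model_interval(t)[1]
    for t, pos in measured
)
print(conformant)   # every sample lies inside the predicted interval
```

Tightening k_range as much as possible while keeping `conformant` true mirrors the inverse optimization formulation mentioned above.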
2.2.4. Traffic Model Validation
In contrast to microscopic systems simulations, traffic simulations with multiple agents do not assess individual AVs, but determine the overall impact of AVs on traffic. Detering et al. [45] measure parametric uncertainties for Monte Carlo simulations. Zheng et al. [52] use extreme value theory to validate statistical traffic models. Rao and Owen [38] apply autoregressive integrated moving average models to analyze the errors of traffic models. An overview about the calibration of traffic models including metrics for comparison is given in [36, 37, 39].
2.3. Probability Bound Analysis
Oberkampf and Roy [50] use Probability Bound Analysis (PBA) as a VV&UQ framework to separately quantify input, numerical and model-form uncertainties. They describe aleatory uncertainties in the form of probability distributions, epistemic uncertainties as intervals and mixed uncertainties as probability boxes (p-boxes). The latter are imprecise probabilities that add an interval width to a Cumulative Distribution Function (CDF). By combining all sources of uncertainties they obtain a final p-box, which bounds the true value with a high probability.

They use the Richardson extrapolation technique to quantify the discretization error by comparing results of different step sizes as a replacement for the exact mathematical model. They quantify input uncertainties and apply nested Monte Carlo sampling to propagate them through the simulation model. In the outer loop they take samples from the epistemic parameters and in the inner loop from the aleatory parameters for each of the epistemic ones. While all aleatory samples form a single CDF – more precisely a stepwise Empirical CDF (ECDF) – the epistemic samples provide the width of the CDF and ultimately result in a p-box. During model validation, they quantify the area between the distribution from simulation and experiment to quantify the model-form uncertainty. They extrapolate the latter from the experimental validation conditions to the ones used for model prediction with a regression model. Finally, they add the epistemic intervals from the model-form and the numerical uncertainty to both sides of the input uncertainty distribution to obtain the final p-box.
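The nested sampling that produces a p-box can be sketched in a few lines: the outer loop samples the epistemic interval, each inner loop propagates aleatory samples into one ECDF, and the envelope of all ECDFs forms the p-box. The toy model, the parameter interval and the sample counts are assumptions for illustration.

```python
# Hedged sketch of nested Monte Carlo sampling producing a p-box: an
# outer loop over an epistemic parameter interval, an inner aleatory
# loop yielding one ECDF per outer sample, and the ECDF envelope as the
# p-box. The toy model g_m and all numbers are invented.
import bisect
import random

random.seed(1)

def g_m(x_aleatory, theta_epistemic):
    """Placeholder simulation model."""
    return theta_epistemic * x_aleatory + 1.0

theta_interval = (0.9, 1.1)    # epistemic: interval, no distribution
n_outer, n_inner = 20, 1000

ecdfs = []
for _ in range(n_outer):
    # outer loop: sample the epistemic interval (LHS/grids also possible)
    theta = random.uniform(*theta_interval)
    # inner loop: aleatory input, here a normally distributed disturbance
    outputs = sorted(g_m(random.gauss(5.0, 1.0), theta)
                     for _ in range(n_inner))
    ecdfs.append(outputs)

def pbox_bounds(y):
    """Lower/upper probability that the output is <= y (p-box at y)."""
    probs = [bisect.bisect_right(e, y) / n_inner for e in ecdfs]
    return min(probs), max(probs)

lo, hi = pbox_bounds(6.0)
print(lo, hi)   # interval of plausible CDF values at y = 6.0
```

Each aleatory loop yields a single stepwise ECDF; the epistemic spread between the ECDFs is exactly the interval width that turns the CDF into a p-box.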
3. Analysis of the Related Work
In this section, we analyze the related work presented so far to identify research gaps and directions for the remaining paper. Again, we address both the validation of AV models and the PBA approach.

3.1. Validation of Automated Vehicle Models
The related work on the validation of AV models contains references addressing specific component models as well as references addressing the entire closed-loop model:
1. The sensor model references focus mainly on the complex nature of the multi-dimensional sensor data and on specific sensor effects. Comparisons on the level of object lists contain the impact of the sensor data on the detection algorithms and allow the application of typical time series metrics.
2. The vehicle dynamics references are based on standardized maneuvers and tolerance values for the permissible deviations between simulation and experiment. These tolerances represent subjective expert estimates with regard to a classic vehicle dynamics evaluation and describe model validity as a simple binary result. Regarding AV safety, these tolerances are not applicable, since model validation by definition always refers to a certain use case.
3. The current closed-loop references are either embedded in a formal framework running online in the AV or analyze how modeling errors flow through the AV pipeline.
4. The traffic model references are interesting for safety assessment approaches that use traffic simulations to analyze the impact of AVs. Since we are focusing on the scenario-based approach, we require a re-simulation of traffic trajectories instead of traffic models.

All approaches lack an aggregation of errors and uncertainties for the safety assessment of the entire AV. It is crucial to analyze the impact of these errors on safety, because in the end we are interested in making a decision on the safety of the entire AV based on the simulation models.
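Whether model errors actually change the final decision can be made measurable by relating the binary pass/fail decisions from the simulation to ground-truth decisions per scenario via a confusion matrix, as in the evaluation of this paper. The decision lists below are invented examples.

```python
# Hedged sketch: relating binary pass/fail type-approval decisions from
# the simulation to ground-truth decisions via a confusion matrix. The
# per-scenario decision lists are invented.
from collections import Counter

ground_truth = [True, True, False, True, False, False, True, False]
simulation   = [True, False, False, True, False, True, True, False]

cm = Counter()
for gt, sim in zip(ground_truth, simulation):
    if gt and sim:
        cm["TP"] += 1    # both pass
    elif gt and not sim:
        cm["FN"] += 1    # simulation too conservative
    elif not gt and sim:
        cm["FP"] += 1    # unsafe scenario passed: the critical case
    else:
        cm["TN"] += 1    # both fail

accuracy = (cm["TP"] + cm["TN"]) / len(ground_truth)
print(dict(cm), accuracy)
```

The false-positive cell is the safety-critical one: scenarios the simulation approves although the ground truth fails them.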
The validation of component models is a powerful tool for the developer of the respective component. Furthermore, it also increases the trustworthiness of the entire model on a qualitative level. On a quantitative level, however, it is very difficult to relate the errors in component modeling to the final decisions about the system. Suppose a robust vs. a sensitive lane keeping function controls the same vehicle. The robust controller might compensate a poor vehicle dynamics model violating the permissible tolerances. The sensor, controller and vehicle dynamics interact with each other and should not only be validated independently.

This type of extrapolation in the system hierarchy from component to system level is generally one of the main current challenges in VV&UQ [13, Sec. 6.6] with only a few recent approaches such as [20, 17]. There are engineering fields such as spacecraft [53] or nuclear reactor safety [54] that make system-level tests impossible for cost or safety reasons and require such VV&UQ approaches. However, this is not the case with AVs. Therefore, we concentrate in this paper exclusively on the entire AV. Nevertheless, we make sure that the methodology is also applicable to component models.
3.2. Probability Bound Analysis
In our survey paper [13], we have also presented a variety of references that focus on advanced non-deterministic approaches. However, there were only a few general approaches which cover the entire model-based process, and those proved difficult to apply to complex systems. For example, Eek et al. [15] found that the number of components and parameters of an aircraft simulation model is too complex to be handled by PBA with reasonable effort. Nevertheless, we presented a modular framework in [13] and are firmly convinced that the advanced approaches can be used in selected blocks of the framework even for complex systems such as AVs.

In this paper, we have chosen to use PBA as part of our framework for safety assessment of AVs. PBA is the main approach of Frequentist VV&UQ. It objectively estimates relative frequencies of model inputs and parameters based on repeated measurements. Unlike Bayesian approaches, it does not incorporate personal beliefs in the form of prior probabilities, nor does it modify the original model with new data based on Bayes' theorem. Therefore, it meets the requirements of an independent type approval of an automated vehicle, which we are particularly interested in.
4. Use Case
In the first part of this section, we present the Lane Keeping Functional Test (LKFT) of UN regulation 79 for the type approval of lateral driving functions. In the second part, we describe the selection of our simulation models based on the Method of Manufactured Universes.

Figure 1: Lane Keeping Functional Test (LKFT). The sketch annotates the curve radius R, the vehicle speed v, the lateral acceleration a and jerk j, and the distance y to the lane marking.
4.1. Lane Keeping Functional Test
We choose UNECE regulation 79 in Revision 4 [55] as the use case of this paper and target the Automatically Commanded Steering Function (ACSF) of category B1. The latter "means a function which assists the driver in keeping the vehicle within the chosen lane, by influencing the lateral movement of the vehicle." In production vehicles, it is either a standalone assistance system (SAE Level 1) or combined with adaptive cruise control for Partial Automation (SAE Level 2). The R-79 lends itself as a use case, because a regulation generally has a public and independent character with clearly defined test cases and pass/fail criteria, and the R-79 is also binding for current production vehicles of Level 1 and 2.

The R-79 describes the Lane Keeping Functional Test (LKFT) [55, Sec. 3.2.1] to check whether the ACSF system can detect and keep its own lane without running over the lane markings. The principle is visualized in Fig. 1. In detail, the test scenario is specified as follows:

- "The vehicle shall be driven [...] with a constant speed on a curved track with lane markings at each side."
- "The necessary lateral acceleration to follow the curve shall be between 80 and 90 per cent of the maximum lateral acceleration specified by the vehicle manufacturer ay,smax."
- "The vehicle speed shall remain in the range from vsmin up to vsmax."
- "The vehicle manufacturer shall demonstrate [...] that the requirements for the whole lateral acceleration and speed range are fulfilled."
The pass criteria are fulfilled if:

- "The vehicle does not cross any lane marking;"
- "The moving average over half a second of the lateral jerk does not exceed 5 m/s³."

Thus, the LKFT focuses on the stationary conditions of a roughly constant lateral acceleration a and velocity v. The ranges depend on the manufacturer and are assumed here as typical values vsmin = 80 km/h, vsmax = 180 km/h and ay,smax = 2.5 m/s². The curve radius R = v²/a is a third dependent variable. We extend the LKFT in this paper by the scenario parameters wind speed vw, road gradient sr and tank load lt. We thus bring the example closer to reality and consider parameters from different categories without losing focus on v and a. The categories include driving states of the vehicle as well as the road and the environment layer of the 5-layer environmental model presented in [56]. Our LKFT example can be summarized in vector notation as

    x = [1  v  a  vw  sr  lt]^T ∈ R^(Nx+1)    (3)
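The second pass criterion can be checked mechanically from a recorded lateral acceleration trace. The sketch below assumes an illustrative sampling rate and a toy acceleration signal; neither is prescribed by the regulation text quoted above.

```python
# Hedged sketch of the jerk pass criterion: the moving average of the
# lateral jerk over half a second must not exceed 5 m/s^3. Sampling rate
# and the acceleration trace are invented for illustration.

dt = 0.01                       # sample time in s -> 50 samples per 0.5 s
window = int(0.5 / dt)

# lateral acceleration trace in m/s^2 (toy data: a gentle linear ramp)
a_y = [0.002 * i for i in range(300)]

# finite-difference lateral jerk in m/s^3
jerk = [(a_y[i + 1] - a_y[i]) / dt for i in range(len(a_y) - 1)]

# moving average of the jerk over the half-second window
moving_avg = [
    sum(jerk[i:i + window]) / window
    for i in range(len(jerk) - window + 1)
]

passed = max(abs(j) for j in moving_avg) <= 5.0
print(passed)   # the ramp yields a constant jerk of about 0.2 m/s^3
```

The half-second averaging deliberately tolerates short jerk spikes and only penalizes sustained uncomfortable steering corrections.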
Figure 2: Minimum distance to line across the scenario space for (a) the simulation model and (b) the manufactured universe. Both surface plots span AV velocities of 80-180 km/h and normalized lateral accelerations of 0.4-0.8; the color scale shows the minimum distance to line in m.
with Nx = 5 inputs. We concentrate on the Ny = 1 output of the distance to line y ∈ R, as it is much more descriptive than the jerk. Finally, we are aware that the regulation with two parameters is currently pursued with physical testing. Nevertheless, the extended use case serves as a blueprint for a model-based safeguarding process, as it is firmly planned for higher automation levels [3, 6]. Therefore, we will use the term AV as a placeholder for all automation levels.
4.2. Method of Manufactured Universes
Stripling et al. [16] introduce the Method of Manufactured Universes (MMU) as an approach to validate uncertainty quantification methods. Thus, it is not a method to validate the models, but to validate the validation methodology itself. The intention of MMU is to create a manufactured universe in which the true values of nature are known. In contrast to reality, where the true values can be estimated, if at all, only by extensive measurements, the manufactured universe offers an arbitrary number of simulations in a known environment. Of course, the user must manufacture the universe so that it is close to reality; otherwise, the transferability of the findings is not guaranteed. In MMU, the actual
simulation model is compared with the manufactured one by means of the desired validation methodology. Stripling
et al. [16] analyze UQ methods in a particle transport universe by comparing the actual low-fidelity model with a high-fidelity reference. Whiting et al. [57] create a CFD universe to compare four validation methodologies including PBA under validation and prediction conditions.
In accordance with the current state of science, we also create a universe for the validation of our methodology. We
adapt the MMU approach by injecting modeling errors into the manufactured universe so that we can verify that a
validation methodology is able to identify these errors. We select a pre-configured vehicle dynamics model of a sports
car, a simple PI controller for lane keeping and an ideal sensor model from a vehicle simulation tool. We use a vehicle mass of 1377 kg for the simulation model and a mass of 1577 kg for the manufactured universe (MU) model. This systematic error of 200 kg can be compared with the weight of two to three passengers. The mass has an influence
on the overall driving behavior of the vehicle. The resulting minimum distance to line across the scenario space is
already anticipated in the two surface plots in Figure 2. Higher lateral accelerations and higher velocities for constant radii lead to a smaller distance to line. This corresponds to the driving physics during cornering, in which the vehicle is pushed outwards by the centrifugal force. The heavier vehicle has smaller distances and, for high lateral accelerations, even a plateau of zero distance to line representing a line crossing. We deliberately generated this behavior in our universe because it is especially safety-critical if the simulation model passes all tests and gives the developer positive feedback although the real system would fail several times. It is now the challenge for the validation methodology to identify those fails.
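The error-injection setup can be sketched as follows. Note that the study itself uses a commercial vehicle dynamics tool; the `distance_to_line` function below is a purely illustrative surrogate we introduce here, not the paper's model, so that only the structure of the injection (a +200 kg mass offset in the manufactured universe) is shown:

```python
# Sketch of the MMU error injection with a hypothetical surrogate model.
# Only the mass offset between the two "worlds" reflects the paper's setup.

def distance_to_line(v_kmh, a_norm, mass_kg):
    """Illustrative surrogate: heavier vehicles and harder cornering
    reduce the minimum distance to the lane marking (clipped at 0)."""
    base = 0.45 - 0.4 * a_norm - 0.001 * (v_kmh - 80) / 100
    mass_penalty = 0.0005 * (mass_kg - 1377)
    return max(0.0, base - mass_penalty)

SIM_MASS = 1377   # kg, simulation model
MU_MASS = 1577    # kg, manufactured universe (injected +200 kg systematic error)

def simulation_model(v_kmh, a_norm):
    return distance_to_line(v_kmh, a_norm, SIM_MASS)

def manufactured_universe(v_kmh, a_norm):
    return distance_to_line(v_kmh, a_norm, MU_MASS)

# In the surrogate, the injected mass error appears as a constant offset:
err = simulation_model(120, 0.6) - manufactured_universe(120, 0.6)
```

The simulation model consistently predicts larger distances than the manufactured universe, which is exactly the safety-critical direction the validation methodology must detect.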
Figure 3: Model-based process for AV homologation, based on previous work in [13, Fig. 1]. (Block diagram across the validation domain (v) and the application domain (a): LKFT validation and approval scenarios, the L1/2 AV model and MU-model, LKFT assessments, the validation metric, linear regression fit and inference, uncertainty expansion, the tolerance approach and the final type approval decision.)
5. Methodology and Application Results
In this section we implement our generic framework from [
13
, Fig. 1] for the use case of safeguarding AVs. The
specific configuration of the framework can be seen in Fig. 3. It describes a model-based process of model validation
and model prediction in the application domain with several individual steps. We describe the methodology step-by-step
from left to right and from top to bottom, using exemplary results of the LKFT from R-79. The mathematical symbols
that describe the interface of each framework block are summarized in Fig. 3. We deal with both a deterministic and a non-deterministic manifestation of the framework during the description of the blocks and will compare them in the next section. The two terms refer to the type of simulation. In contrast to state-of-the-art deterministic validation, our deterministic manifestation also takes interpolation uncertainties into account; unlike the non-deterministic simulation, however, it considers no input uncertainties. The interested reader is referred to [13] for general information and more detailed theory.
5.1. Model Verification
Before proceeding with model validation and prediction, the numerical effects during model verification should be analyzed and the model parameters estimated. We skip the step of model calibration, since we assume already
parameterized models for an independent type approval. Therefore, we dedicate this subsection to a numerical
pre-analysis.
Since we do not have the exact mathematical model, we use the Richardson extrapolation technique to estimate the numerical discretization error. We perform three simulations with a fine step size h_1 = 2.5×10⁻⁴ s, a medium step size h_2 = 5×10⁻⁴ s and the actual coarse step size h_3 = 1×10⁻³ s. This corresponds to a small refinement factor
r = h_3/h_2 = h_2/h_1 = 2. It gives identical results of the distance to line in the relevant decimal places for all three step sizes. This shows high convergence and very small numerical errors e^n ≈ 0.

Table 2: Parameter characteristics. There are several scenarios in the global validation and application space and again several random samples in the local space around each scenario. A normal distribution N(µ, σ²) is specified by its mean µ and variance σ². The repetitive samples refer to the physical validation experiments, the nested samples to the remaining non-deterministic simulations.

Parameter   | Unit | Validation space (min / max / scenarios) | Application space (min / max / scenarios) | Uncertainty type | Size        | Samples (nested) | Samples (rep.)
Velocity    | km/h | 90 / 170 / 3                             | 80 / 180 / 6                              | Aleatory         | N(0, 0.5)   | 10               | 10
Lat. Accel. | -    | 0.4 / 0.8 / 3                            | 0.35 / 0.85 / 5                           | Aleatory         | N(0, 0.01)  | 10               | 10
Wind Speed  | km/h | -5 / 5 / 2                               | -5 / 5 / 2                                | Aleatory         | N(0, 2)     | 10               | 10
Tank Load   | kg   | -20 / 20 / 2                             | -20 / 20 / 2                              | Aleatory         | N(0, 0.5)   | 10               | 10
Road Slope  |      | -1 / 1 / 2                               | -1 / 1 / 2                                | Epistemic        | [-0.1, 0.1] | 3                | -
Total       |      | N_v = 72                                 | N_a = 240                                 |                  |             | 30               | N_v^r = 10
The negligible numerical effects coincide with the findings of Viehof and Winner [58] from vehicle dynamics simulation. However, we would like to point out that complex co-simulations with different solvers and step sizes are sometimes used to assess the safety of AVs, which no longer necessarily lead to similar results. Therefore, this should be investigated at least once per tool chain.
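The verification step above can be sketched in a few lines. Richardson extrapolation estimates the observed order of convergence p from three solutions at step sizes with a constant refinement factor r, and from it the discretization error of the fine solution; the KPI values below are synthetic stand-ins, not the paper's simulation outputs:

```python
import math

def richardson(f1, f2, f3, r):
    """Richardson extrapolation for solutions f1 (fine), f2 (medium),
    f3 (coarse) with constant refinement factor r = h3/h2 = h2/h1.
    Returns the observed order of convergence p, the estimated
    discretization error of f1 and the extrapolated solution."""
    p = math.log(abs(f3 - f2) / abs(f2 - f1)) / math.log(r)
    err_f1 = (f1 - f2) / (r**p - 1)   # error estimate of the fine solution
    f_exact = f1 + err_f1             # extrapolated, step-size-free value
    return p, err_f1, f_exact

# Synthetic example: a KPI converging with second order, f(h) = 0.25 + 3*h^2
f = lambda h: 0.25 + 3.0 * h**2
p, err, f_ex = richardson(f(2.5e-4), f(5e-4), f(1e-3), r=2)
```

For this synthetic KPI the recovered order is 2 and the extrapolated value matches the step-size-free limit 0.25, mirroring the paper's finding that the discretization error of the tool chain is negligible.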
5.2. Scenario Design
Since the LKFT requires an assessment over "the whole lateral acceleration and speed range", we concentrate directly on a good coverage of the scenario space, without restricting ourselves only to the critical band from 80 to 90 % of a_y,smax. The scenario space is specified in Table 2 by the min-max ranges of the five parameters from (3). The application space is slightly larger than the validation space so that it not only requires interpolation but also extrapolation capabilities. The values are chosen so that the manufactured universe reflects the real world properly.
We use a simple full factorial Design of Experiments (DoE) to generate the concrete scenarios. According to Table 2, 72 validation scenarios and 240 application scenarios are selected to legitimize the model-based process. In the future, it is possible to increase the amount of application scenarios and to select more sophisticated DoE techniques to gain efficiency. The crosses in Figure 4 show a two-dimensional cross-section along the velocity and the lateral acceleration dimension with a grid of nine validation and 30 application scenarios. We use this type of visualization to reflect the focus of the R-79 on the two parameters. All N LKFT scenarios will be aggregated in a data matrix

X = \begin{bmatrix} 1 & v_1 & a_1 & v_{w,1} & s_{r,1} & l_{t,1} \\ 1 & v_2 & a_2 & v_{w,2} & s_{r,2} & l_{t,2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & v_N & a_N & v_{w,N} & s_{r,N} & l_{t,N} \end{bmatrix} \in \mathbb{R}^{N \times (N_x+1)} \tag{4}

for each domain: X^v ∈ R^(N_v×(N_x+1)) for validation and X^a ∈ R^(N_a×(N_x+1)) for the application.
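A full factorial design over the levels of Table 2 can be sketched with the standard library. The level counts (3, 3, 2, 2, 2 for validation) are taken from the table; the intermediate level values (e.g. 130 km/h) are illustrative assumptions, since the paper only specifies the min-max ranges:

```python
from itertools import product

def full_factorial(levels):
    """All combinations of the per-parameter level lists, each row
    prefixed with a 1 for the regression intercept as in Eq. (4)."""
    return [[1.0, *combo] for combo in product(*levels)]

# Validation-space levels per Table 2: 3 x 3 x 2 x 2 x 2 = 72 scenarios.
validation_levels = [
    [90, 130, 170],     # velocity (km/h); middle level assumed evenly spaced
    [0.4, 0.6, 0.8],    # normalized lateral acceleration (-)
    [-5, 5],            # wind speed (km/h)
    [-20, 20],          # tank load (kg)
    [-1, 1],            # road slope
]
Xv = full_factorial(validation_levels)   # data matrix X^v, 72 x 6
```

The application-space matrix X^a follows identically from its 6 x 5 x 2 x 2 x 2 levels, yielding 240 rows.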
5.3. Uncertainty Quantification and Propagation
A deterministic simulation performs a point prediction, since it assumes that all parameters are precisely known.
A non-deterministic simulation, however, considers uncertain parameters, quantifies them, propagates them through
the simulation model and aggregates the results. The uncertainties are summarized in Table 2 for the five scenario
parameters. The values were systematically derived depending on how accurately it is possible to measure them during
real validation experiments. We assume tight normal distributions for the velocity and lateral acceleration, since both
can be measured accurately with Inertial Measurement Units (IMUs), and for the tank load, since the vehicle can be
weighed in advance and the fuel level is available. We assume a coarser normal distribution for the wind – based on online measuring stations – and an epistemic uncertainty for the road gradient, since it has to be extracted from map data of highways. This lack of knowledge from publicly accessible maps could only be compensated by extensive measurements. We assume further scenario parameters and the internal vehicle parameters as deterministic for this simplified proof of concept, since they are often precisely known and constant. This assumption is reflected in the model-form uncertainty and can be substantiated by a sensitivity analysis in the future. In addition, we assume the uncertainties from Table 2 to be equal around each validation and application scenario.

Figure 4: Verification, validation and application scenarios. (Nominal verification, validation and application scenarios together with the validation uncertainty samples of model and system and the application uncertainty samples, projected onto the velocity and lateral-acceleration plane.)
We use N_v^r = 10 repetitions of each validation scenario for the manufactured universe to imitate physical experiments and to capture their natural variability. The value is based on the findings of Viehof [41, p. 74], who recommends 10 to 15 repetitions based on the t-distribution for a detailed analysis and later at least three repetitions for practical experiments. These tests are the baseline for both the deterministic and the non-deterministic re-simulation to enable a fair comparison later. In the deterministic case, we re-simulate the mean value of the scenario repetitions (see averaged input case in Section 2.2.2). In the non-deterministic case, we apply the nested two-loop sampling approach from Section 2.3 to propagate the uncertainties. We use a full factorial design with three steps of the epistemic parameter in the outer loop and Monte Carlo sampling with ten random samples for all aleatory parameters in the inner loop. This is equivalent to a resolution in 10 % steps and is sufficient for the proof of concept in this paper. In total, we perform 2160 runs with the simulation model in the validation domain and 7200 runs in the application domain. The uncertainties are also illustrated in Figure 4 by projecting all samples of all parameters onto the velocity and lateral acceleration dimension. They form local point clouds within the large scenario space.
5.4. LKFT Assessment
Each simulation run results in one distance to line signal. It shows a typical behavior as shown in Figure 5, where
the vehicle is pushed hardest to the edge at the entrance of the curve (cross at minimum distance to line), and then
compensates a bit – unless it crosses the line at high lateral accelerations. We extract the minimum distance to the line
as a KPI because it is the most safety critical. If not even the minimum distance falls below zero (crosses the line), the whole time signal will not fall below it either. The behavior over the whole scenario space was already shown in Figure 2. In the deterministic case, we average the repetition results so that we can use a scalar validation metric. In the non-deterministic case, we aggregate the repetitions to an ECDF and the nested uncertainty propagation to a p-box for the use of a non-deterministic metric.

Figure 5: Characteristic lane keeping behavior by means of the distance to line time signal and its minimum value as KPI at the exemplary application scenario v = 100 km/h and a = 0.35 · a_y,smax = 0.875 m/s².
5.5. Validation Metric
In the deterministic case, we use the absolute deviation between the minimum distance to line from the input-averaged re-simulation and the output-averaged minimum distance to line from the manufactured universe

e^v = g_m(\langle x^v \rangle, \theta_m) - \langle g_s(x^v, \theta_s) \rangle \tag{5}

as validation metric with \langle \cdot \rangle as mean operator. In the non-deterministic case, we quantify the difference between the ECDF of the manufactured universe F(y^v_s) and the p-box from the simulation model B(y^v_m). There are different metrics that give different weight to different characteristics of a distribution. We use an Area Validation Metric (AVM) based on Voyles and Roy [59] because it includes the entire shape of the distributions in the calculation. The principle is illustrated in Figure 6a for one validation scenario. Since the manufactured universe shows worse performance, there only exists a left-hand area

e^v_l = \int_{B(y^v_m) \geq F(y^v_s)} \left| B(y^v_m) - F(y^v_s) \right| \, \mathrm{d}y \tag{6}

where the p-box is larger than the ECDF. The area can be determined mathematically by integration of the step functions. Since the right-hand area is zero, we get a one-sided adaption of the area validation metric (without the safety factor from [59]) in the form of the interval

I(e^v) \coloneqq \{ e^v \mid \underline{e}^v \leq e^v \leq \overline{e}^v \} \coloneqq [\underline{e}^v, \overline{e}^v] = [-e^v_l, 0] . \tag{7}
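For two empirical CDFs (here plain sample sets rather than a full p-box, a simplifying assumption), the area metric can be computed by integrating the absolute difference of the two step functions over their merged support:

```python
def ecdf(samples, y):
    """Empirical CDF of `samples` evaluated at y (right-continuous)."""
    return sum(1 for x in samples if x <= y) / len(samples)

def area_validation_metric(model_samples, system_samples):
    """Area between two empirical CDFs, integrated piecewise between
    consecutive breakpoints, where both step functions are constant."""
    points = sorted(set(model_samples) | set(system_samples))
    area = 0.0
    for lo, hi in zip(points, points[1:]):
        area += abs(ecdf(model_samples, lo) - ecdf(system_samples, lo)) * (hi - lo)
    return area

# Shifting a distribution by a constant d yields an area metric of d:
model = [0.30, 0.35, 0.40, 0.45]
system = [x - 0.1 for x in model]     # universe performs worse by 0.1 m
avm = area_validation_metric(model, system)   # approximately 0.1
```

Extending this to a p-box amounts to evaluating the integrand against the nearer of the two bounding CDFs, which reproduces the one-sided left-hand area of Eq. (6) when the ECDF lies entirely on one side of the p-box.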
5.6. Error Learning and Inference
It is essential to take the modeling errors and uncertainties into account for the final decision making in the application domain. The validation metric quantifies the model-form uncertainty for each validation scenario. Ultimately,
however, we need to predict it for each application scenario. Therefore, we learn an error model with the metric results from the validation domain, so that we can use it afterwards to infer the errors in the application domain.

Figure 6: Validation errors. (a) Non-deterministic area metric (value 0.28) for the exemplary validation scenario v = 90 km/h and a = 0.4 · a_y,smax = 1 m/s². (b) Linear regression inference of the deterministic validation error ê^va with prediction intervals across the entire application scenario space.
For this first proof of concept, we apply a multiple linear regression technique based on [50, p. 657] to model the validation error

\hat{e}^v = w \cdot x, \quad w = \begin{bmatrix} w_0 & \cdots & w_{N_x} \end{bmatrix}^T \in \mathbb{R}^{N_x+1} \tag{8}

of the distance to line with the scenario parameters x as regressors. The regression weights w are determined so that the difference between the estimated and the actual validation errors is minimized

\arg\min_w \sum_{i=1}^{N_v} \left( e^v_i - \hat{e}^v_i \right)^2 \tag{9}

across the whole validation data set from Table 2 with N_v distinct scenarios. We denote estimates of the validation error in the validation domain with ê^vv and estimates of the validation error in the application domain with ê^va. To get to the latter, we fit one linear model to infer the deterministic validation error ê^va, whereas we fit two linear models to infer both the left and right (zero here) interval boundaries of the non-deterministic model-form uncertainty I(ê^va) = [e̲^va, e̅^va].

In addition, we calculate prediction intervals (PIs) based on [50, p. 657] to incorporate the uncertainty of the fitted model and the one associated with future observations. Since the prediction variable represents a random variable and we need an interval estimate for its mean value, we apply a non-simultaneous Bonferroni-type prediction interval via the function [60, p. 115]

g_p(x^a) = t_{\alpha/2,\, N_v-(N_x+1)} \cdot s \cdot \sqrt{1 + x^{a\,T} \left( X^{v\,T} X^v \right)^{-1} x^a} \tag{10}

with a confidence level of 95 % for the t-distribution and with the sample variance

s^2 = \frac{1}{N_v-(N_x+1)} \sum_{i=1}^{N_v} \left( e^{vv}_i - \hat{e}^{vv}_i \right)^2 . \tag{11}

In the deterministic case, the prediction interval ±g_p(x^a) shifts the signed estimate of the error in both directions:

I(\hat{e}^{va}) = [\underline{e}^{va}, \overline{e}^{va}] = [\hat{e}^{va} - g_p(x^a),\ \hat{e}^{va} + g_p(x^a)] . \tag{12}
Figure 7: Uncertainty aggregation and decision making at the exemplary application scenario v = 100 km/h and a = 0.35 · a_y,smax = 0.875 m/s², based on the (a) non-deterministic p-box prediction and (b) deterministic point prediction of the nominal simulation model.
In the non-deterministic case, we add the respective function value to the left and right (positive) areas

I(\hat{e}^{va}) = [\underline{e}^{va}, \overline{e}^{va}] = [\hat{e}^{va}_l - g_{p,l}(x^a),\ \hat{e}^{va}_r + g_{p,r}(x^a)] , \tag{13}

since the resulting outer bounds enclose the regression estimates and the inner bounds. Figure 6b illustrates the regression surface of the deterministic metric including prediction intervals. The light blue plane of the linear model follows the trend of the orange metric results used for training.
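A minimal sketch of the error learning and inference step follows, assuming a single regressor for readability instead of the paper's five, and a hardcoded t-quantile of 2.0 as a stand-in for t_{α/2, N_v−(N_x+1)} (the standard library provides no t-distribution). The metric values are illustrative:

```python
import math

def fit_error_model(x, e):
    """Least-squares fit of e ~ w0 + w1*x, a 1-regressor analogue of Eq. (8)/(9)."""
    n = len(x)
    mx, me = sum(x) / n, sum(e) / n
    w1 = (sum((xi - mx) * (ei - me) for xi, ei in zip(x, e))
          / sum((xi - mx) ** 2 for xi in x))
    w0 = me - w1 * mx
    return w0, w1

def prediction_interval(x, e, w0, w1, x_new, t_crit=2.0):
    """Non-simultaneous prediction interval analogous to Eq. (10)/(11);
    t_crit stands in for the t-quantile at the chosen confidence level."""
    n = len(x)
    s2 = sum((ei - (w0 + w1 * xi)) ** 2 for xi, ei in zip(x, e)) / (n - 2)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    return t_crit * math.sqrt(s2) * math.sqrt(1 + 1 / n + (x_new - mx) ** 2 / sxx)

# Illustrative validation-domain metric results: error grows with velocity
v = [90, 110, 130, 150, 170]
e = [0.05, 0.09, 0.12, 0.17, 0.20]
w0, w1 = fit_error_model(v, e)
e_hat = w0 + w1 * 120                      # inferred error at an application scenario
g = prediction_interval(v, e, w0, w1, 120)
interval = (e_hat - g, e_hat + g)          # Eq. (12)
```

The 1 + 1/n + (x−x̄)²/S_xx term is the simple-regression equivalent of 1 + x^aT (X^vT X^v)^{-1} x^a in Eq. (10), where the intercept column of X^v accounts for the 1/n contribution.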
5.7. Uncertainty Expansion
After quantifying the individual errors and uncertainties, they must be aggregated in the application domain. In the non-deterministic case, we get for each application scenario one input uncertainty in the form of a p-box B(y^a_m) = [F̲(y^a_m), F̅(y^a_m)], one inferred model-form uncertainty in the form of an interval I(ê^va) = [e̲^va, e̅^va] and one numerical uncertainty (zero here) in the form of an interval I(ê^na) = [e̲^na, e̅^na]. The combination of the uncertainties can be seen in Figure 7a for one application scenario. The two lower interval boundaries shift the input uncertainty p-box to the left, the two upper interval boundaries to the right. The resulting p-box reflects an estimate of the actual system response

B(\hat{y}^a_s) = \left\{ F(\hat{y}^a_s) \;\middle|\; F\!\left(y^a_m + (\underline{e}^{va} + \underline{e}^{na})\right) \leq F(\hat{y}^a_s) \leq F\!\left(y^a_m + (\overline{e}^{va} + \overline{e}^{na})\right) \right\} . \tag{14}

In the example, the model-form uncertainty has the largest influence on the total uncertainty, since we injected a systematic error into the manufactured universe.
In the deterministic case, we use the inferred error bounds I(ê^va) = [e̲^va, e̅^va] to adapt the actual model prediction y^a_m to a non-deterministic interval estimate of the system response

I(\hat{y}^a_s) = \begin{cases} \left[ y^a_m - \overline{e}^{va},\; y^a_m \right] & \underline{e}^{va} \geq 0,\ \overline{e}^{va} \geq 0 \\ \left[ y^a_m - \overline{e}^{va},\; y^a_m - \underline{e}^{va} \right] & \underline{e}^{va} \leq 0,\ \overline{e}^{va} \geq 0 \\ \left[ y^a_m,\; y^a_m - \underline{e}^{va} \right] & \underline{e}^{va} \leq 0,\ \overline{e}^{va} \leq 0 \end{cases} \tag{15}

depending on different cases. If both error bounds were either greater or less than zero, the middle case would lead to a bias correction of the nominal model in one direction. This would be equivalent to putting more trust into the
data-driven error model than into the physical simulation model itself and would again be associated with uncertainties.
To avoid this, we add two more cases, which can be interpreted as one-sided uncertainty expansion, starting from the
simulation model as a stable baseline. The principle is also visualized in Fig. 7b. This mixed approach neither neglects
the errors as usual, nor does it perform a risky bias correction, but it adds tight bounds for a fair comparison.
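The case distinction of Eq. (15) can be written as a small helper, a direct transcription of the three cases with the error convention e = model − system:

```python
def expand_interval(y_m, e_lo, e_hi):
    """One-sided uncertainty expansion of a deterministic model prediction
    y_m given inferred error bounds [e_lo, e_hi], with error = model - system
    (Eq. 15). The nominal prediction always stays inside the returned
    interval, i.e. no bias correction is performed."""
    if e_lo >= 0 and e_hi >= 0:
        return (y_m - e_hi, y_m)          # model over-predicts the distance
    if e_lo <= 0 <= e_hi:
        return (y_m - e_hi, y_m - e_lo)   # error bounds straddle zero
    return (y_m, y_m - e_lo)              # model under-predicts the distance

# Example: an inferred error interval of [-0.3, 0.0] m widens a prediction
# of 0.25 m only towards larger distances, keeping 0.25 m as the baseline.
lo, hi = expand_interval(0.25, -0.3, 0.0)
```

Because the nominal prediction is always one of the interval endpoints, the expansion stays anchored to the physical simulation model, exactly the "stable baseline" argument made above.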
5.8. Decision Making
Fig. 7a and Fig. 7b also include the regulation threshold as a vertical line, which states that the distance to line must be greater than zero (the vehicle edge does not cross the line). It transforms the real values into boolean decisions because, in the end, the type approval requirements can only be passed or failed. Both in the deterministic and in the non-deterministic case, the entire green area exceeds the threshold and passes the requirements. This even corresponds to the most conservative non-deterministic estimate. A reduction of the confidence level of the CDF in the future would lower the requirements so that not all steps have to pass. Currently, checking all steps refers to a confidence of 90 % due to the 10 aleatory samples from Table 2. In total, the AV passes 123 and fails 117 of the 240 application scenarios in the deterministic case, whereas it passes 97 and fails 143 in the non-deterministic case. The values can be taken later from Table 3. The latter ones are, as expected, somewhat more conservative due to the consideration of the input uncertainties. Ultimately, the number of failed decisions would lead to a failure of the entire homologation.
6. Validation of the Application Results
The last section presented the VV&UQ methodology using an example where the type approval of a lane keeping
assistant was carried out based on simulation. It concluded that the AV fails the type approval. However, what does this
actually mean regarding the quality of the methodology itself? To answer this question, the Method of Manufactured
Universes comes into play, where the actual ground truth values are available in all domains. The first subsection
introduces the principle, how those values can be used to analyze the meaningfulness of the VV&UQ results. The
second sub-section evaluates the tolerance approach as a baseline from the current state of the art. The two following
subsections are dedicated to the validation and comparison of the deterministic and the non-deterministic manifestation
of the framework from the last section.
6.1. Evaluation Methodology
The analyses in this section build on the Method of Manufactured Universes. Its big advantage is that the actual
Ground Truth (GT) values are available for all validation scenarios and in particular for all application scenarios. This
makes it possible to compare the VV&UQ results with the true values and thus to evaluate the selected VV&UQ
methodology itself. We simulate the manufactured universe in the application domain under the same conditions as the nominal simulation model. In reality, even if additional experiments were conducted to validate the methodology, such an exact comparison would never be possible: there will always be measurement errors or poorly quantified input uncertainties that falsify the ground truth values. These factors can be considered in the future to analyze the robustness of the approach against unexpected uncertainties. In the application domain, however, we require two p-boxes for the model and the manufactured universe under the same desired application conditions for a fair comparison.
The comparison can take place at multiple stages along the processing pipeline of the framework:
1. Error inference stage
2. Assessment stage
3. Error integration stage
4. Type approval stage.
We focus on the final two stages because they depend on the previous ones and are ultimately decisive. Starting
backwards, the final binary decisions from the type approval can be compared. For this purpose a binary classifier is
used in pattern recognition and machine learning. On the one hand, it distinguishes positive from negative decisions
based on the absolute simulation results. On the other hand, it distinguishes true from false decisions based on the
relation of simulation and experimental results. In our example, a True Positive (TP) stands for a correctly failed type
approval, whereas a True Negative (TN) stands for a correctly passed type approval. On the contrary, a False Positive
15
(FP) stands for an incorrectly failed type approval and a False Negative (FN) for an incorrectly passed type approval.
FPs are Type I errors that convict the innocent AV. FNs are Type II errors that acquit an unsafe AV. In addition, the
precision P is based on the former and the recall R on the latter:

P = \frac{\sum TP}{\sum TP + \sum FP} \tag{16}

R = \frac{\sum TP}{\sum TP + \sum FN} . \tag{17}
Recall is a very important measure from the safety perspective to quantify how many of the actual approval fails are
detected by the simulation model. Precision is also important to identify the approval fails with as few Type I errors as
possible.
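Precision and recall over the binary approval decisions follow directly from the confusion-matrix counts; the numbers below are those of Table 3c, the deterministic VV&UQ manifestation:

```python
def precision_recall(tp, fp, fn):
    """Precision (Eq. 16) and recall (Eq. 17) from confusion-matrix counts.
    A 'positive' is a failed type approval, so recall measures how many of
    the actual approval fails are detected by the simulation."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Table 3c: deterministic VV&UQ method vs. ground truth
p, r = precision_recall(tp=90, fp=27, fn=0)   # p = 90/117, r = 1.0
```

With zero FNs the recall is exactly 1, reflecting that no unsafe AV is acquitted; the precision of 90/117 ≈ 77 % quantifies the price paid in Type I errors.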
In the error integration stage, it is additionally possible to check whether the actual GT values fall within the bounds
from the VV&UQ method. This looks as follows in the deterministic and the non-deterministic case:
y^a_s \in I(\hat{y}^a_s) \tag{18}

F(y^a_s) \in B(\hat{y}^a_s) \tag{19}
This gives additional insights and confidence in the error model, since, depending on the distance to the threshold, the binary decisions can be classified correctly even if the GT does not fall within the bounds.
6.2. Tolerance Approach
It is important to emphasize that the pure tolerance approach from Section 2.2.2, as it is currently used in the automotive sector, is not directly transferable to controlled AVs. First of all, there are absolute and relative tolerances. The absolute ones struggle in the sense that they do not relate the validation errors to the scenario results. An exemplary assumed model error of 30 cm has much stronger effects when the virtual vehicle drives 15 cm beside the line than when it drives in the middle of the road. The final binary results for the pure tolerance approach (without error pipeline) for the absolute tolerance of 30 cm can be seen in Figure 8. It reveals cases where the model is assumed to be valid according to the tolerances in the validation domain, but false classifications (FP and FN) between the nominal model and the GT occur at neighboring locations in the application domain. They are concerning from a safety perspective. There is also an inverse case with an invalid model and true classifications in the neighborhood that leaves potential. Classical relative tolerances suffer from similar problems. Suppose one allows a 10 % deviation of the jerk. Then larger jerk values benefit, although these are safety critical.
It would be possible to extend the tolerances to the safety perspective by dividing the validation errors by the
distance from the threshold. Regardless of this, we recommend using the tolerances in the validation domain as a tool
for the model developer, but not standalone. They should be combined with the vertical error pipeline in the framework in Figure 3 to consider model-form and extrapolation uncertainties. The use of tolerances alone separates the validation and application domain, neglects uncertainties and is risky.
6.3. Deterministic and Non-Deterministic Manifestation
This subsection continues with the analysis of the overall framework including the error and uncertainty pipeline
as presented in Section 5. Table 3 summarizes the results from the binary classifier. The first column refers to the
deterministic manifestation of the framework and the second column to the non-deterministic manifestation. In the first
row, the classifier is used to compare the nominal simulation model with the GT from the manufactured universe, and
in the second row, the results of the VV&UQ methodology are compared with the GT. The injected systematic error results in many safety-critical FNs and a low recall rate with a perfect precision of 100 % in Tables 3a and 3b.
Checking whether the GT falls within the bounds of the VV&UQ methodology according to Equation
(18)
and
(19)
yields 238 bounded cases for the deterministic manifestation and even 240 of 240 bounded cases for the non-
deterministic manifestation. This shows a working conservative error modeling using linear regression and prediction
intervals. In both Table 3c and 3d, the VV&UQ methodology successfully shifts all the FNs into the TP cell. This
heavily increases the recall from almost 0 to
100 %
. From a safety perspective this is optimal since no unsafe AV is
acquitted anymore. In return, a few TNs are moved into the FP cell, which convicts the innocent AV. Over the entire
scenario space, this results in a precision of 77 % and 86 %, respectively. However, since the vehicle will not pass the homologation anyway due to the many TPs, the additional FPs do not matter much in this case. Generally, it is impossible to guarantee model validity with only a restricted amount of validation experiments [41]. Nevertheless, further cases and alternative metamodeling techniques should be considered in the future to address the trade-off between Type I and II errors.

Figure 8: Decisions across the scenario space. The rectangles refer to general model validity based on the tolerance approach, the circles and crosses to the validity of the application decisions using the binary classifier. The decision of the nominal simulation model is coded in the color channel, the comparison with the ground truth values by the two symbols cross and circle. Thus, an orange circle symbolizes a TP, an orange cross an FP, a green cross an FN and a green circle a TN.
6.4. Comparison and Discussion
Comparing the deterministic and the non-deterministic manifestation of the framework regarding the binary classifications shows similar results. The non-deterministic approach has a slight advantage due to a higher precision at the same recall rate. In principle, the classifier results allow both approaches and do not show a clear selection. Thus, further factors may be included in the process to select the configuration of the framework for the specific use case of this paper. The non-deterministic approach requires a higher effort to quantify the uncertainties and to perform the simulations. In return, it considers different sources of uncertainty including the scenario parameters. This results in a higher confidence in the statement of each application scenario and a larger coverage of the scenario space due to the scatter around each nominal scenario point. The accuracy of the statement can even be adjusted by the number of aleatory samples. In addition, the illustrative example with the comparatively large systematic error makes the strengths of the non-deterministic approach in the quantification of input uncertainties less obvious. Therefore, if the resources are available for the non-deterministic approach, its selection is recommended. Otherwise, the deterministic approach represents a good compromise between effort and risk.
Generally, we recommend using all (here five) uncertain parameters also as scenario parameters and including them in the scenario design. It fits the nature of model validation to quantify the input uncertainties as accurately as possible around a nominal scenario condition. If an uncertain parameter is not included in the scenario design, its global uncertainty must be taken into account (the space columns instead of the uncertainty column in Table 2). This leads
Table 3: Binary classifier

(a) Deterministic results – Nominal model
                 Universe fails   Universe passes
Model fails      2 TPs            0 FPs             P = 100 %
Model passes     88 FNs           150 TNs
                 R ≈ 2 %

(b) Non-deterministic results – Nominal model
                 Universe fails   Universe passes
Model fails      5 TPs            0 FPs             P = 100 %
Model passes     92 FNs           143 TNs
                 R ≈ 5 %

(c) Deterministic results – VV&UQ method
                 Universe fails   Universe passes
VVUQ fails       90 TPs           27 FPs            P ≈ 77 %
VVUQ passes      0 FNs            123 TNs
                 R = 100 %

(d) Non-deterministic results – VV&UQ method
                 Universe fails   Universe passes
VVUQ fails       123 TPs          20 FPs            P ≈ 86 %
VVUQ passes      0 FNs            97 TNs
                 R = 100 %
to wider p-boxes in case of larger input uncertainties and thus to an under-approximation of the model error. This in turn is too little to perform a sufficient shift of the model responses during the uncertainty expansion and would finally lead to some false classifications.
7. Conclusion
For credible safeguarding of automated vehicles based on computer simulation, model validation activities are
essential to assess the errors and uncertainties inherent in every model. Therefore, we presented a modular framework
based on the type approval of a lane keeping function. Due to the modular design, the framework can be used in different manifestations depending on the requirements. On the one hand, we presented a non-deterministic manifestation considering several types of uncertainties. On the other hand, we presented a simplified deterministic manifestation
that still considers an overall model error. We analyzed both manifestations in regard to ground truth values using a
binary classifier.
Both approaches show excellent results in the identification of systematic errors. They corrected all safety-critical
cases in which the nominal simulation model suggests a suitable lane keeping behavior although the actual ground truth
illegally crosses the line. The non-deterministic approach shows slightly better results, but is also more complex. In the
future, we plan to analyze different scenario designs, validation metrics and error learning techniques to further improve
the results. Nevertheless, it is important to note that there is no absolute model validity, since it is impossible to
guarantee validity across the entire scenario space with only a restricted number of validation experiments. In addition,
having validated the methodology in this paper based on the method of manufactured universes, we can apply the framework
to a real vehicle in a next step.
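The correction of such safety-critical cases can be illustrated with a minimal, purely hypothetical sketch of the uncertainty expansion: a nominal model response is shifted conservatively by a validated error bound before the pass/fail decision is made. All numbers below are illustrative and not taken from the paper:

```python
# Hypothetical illustration of how an uncertainty expansion turns a
# false negative into a correct "fail" decision. All values are
# invented for illustration; they are not results from the paper.

LANE_LIMIT_M = 0.0  # distance to the lane line; negative means the line is crossed


def approve(distance_to_line_m):
    """Type-approval decision: pass only if the line is never crossed."""
    return distance_to_line_m >= LANE_LIMIT_M


nominal_sim = 0.05    # nominal model predicts a 5 cm margin -> passes
model_error = 0.08    # assumed validated model error bound in meters
ground_truth = -0.02  # the real vehicle crosses the line -> fails

# The nominal decision disagrees with the ground truth (a false negative).
assert approve(nominal_sim) and not approve(ground_truth)

# Shifting the model response conservatively by the error bound
# recovers the correct "fail" decision.
expanded = nominal_sim - model_error
assert not approve(expanded)
print("expanded prediction fails, matching the ground truth")
```

If the error bound is under-approximated (e.g. only 2 cm here), the shifted prediction would still pass and the false negative would remain, which mirrors the false classifications discussed in Section 6.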
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have
appeared to influence the work reported in this paper.
Acknowledgement
The authors want to thank TÜV SÜD Auto Service GmbH for the support and funding of this work. Additionally,
the authors want to thank Daniel Schneider for proofreading the article and for enhancing its content with his critical
remarks.
Contribution
Stefan Riedmaier initiated and wrote this paper. He was involved in all stages of development and primarily
developed the concept and content of this work. Jakob Schneider wrote his master's thesis on uncertainties in the
safeguarding of an ACC. Afterwards, he continued his research by supporting Stefan Riedmaier in implementing the
framework, in applying it to the LKAS type approval and in enhancing the content in frequent workshops. Benedikt
Danquah contributed to the structure of the paper and improved its content through close cooperation and many
valuable discussions on VV&UQ methods. Bernhard Schick and Frank Diermeyer contributed to the conception of the
research project and revised the paper critically for important intellectual content. Frank Diermeyer gave final approval
of the version to be published and agrees to all aspects of the work. As a guarantor, he accepts responsibility for the
overall integrity of the paper.
References
[1] SAE J3016, Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles, 2018.
[2] H. Beglerovic, M. Stolz, M. Horn, Testing of autonomous vehicles using surrogate models and stochastic optimization, in: 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2017, pp. 1–6.
[3] United Nations Economic Commission for Europe (UNECE), Proposal for a new UN regulation on uniform provisions concerning the approval of vehicles with regard to automated lane keeping systems (ECE/TRANS/WP.29/2020/81), 2020.
[4] N. Kalra, S. M. Paddock, Driving to safety: How many miles of driving would it take to demonstrate autonomous vehicle reliability?, Transportation Research Part A: Policy and Practice 94 (2016) 182–193. URL: http://www.sciencedirect.com/science/article/pii/S0965856416302129.
[5] W. Wachenfeld, H. Winner, The Release of Autonomous Vehicles, in: M. Maurer, J. C. Gerdes, B. Lenz, H. Winner (Eds.), Autonomous Driving: Technical, Legal and Social Aspects, Springer Berlin Heidelberg, Berlin, Heidelberg, 2016, pp. 425–449. URL: https://doi.org/10.1007/978-3-662-48847-8_21.
[6] German Aerospace Center, PEGASUS project, 2019. URL: https://www.pegasusprojekt.de/en/home.
[7] A. Leitner, A. Akkermann, B. Å. Hjøllo, B. Wirtz, D. Nickovic, E. Möhlmann, H. Holzer, J. van der Voet, J. Niehaus, M. Sarrazin, M. Zofka, M. Rooker, M. Kubisch, M. Paulweber, M. Siegel, M. Rautila, N. Marko, P. Tummeltshammer, P. Rosenberger, R. Rott, S. Muckenhuber, S. Kalisvaart, T. d. Graaff, T. D'Hondt, T. Fleck, Z. Slavik, ENABLE-S3: Testing & validation of highly automated systems: Summary of results, 2019.
[8] E. F. Z. Santana, G. Covas, F. Duarte, P. Santi, C. Ratti, F. Kon, Transitioning to a driverless city: Evaluating a hybrid system for autonomous and non-autonomous vehicles, Simulation Modelling Practice and Theory 107 (2021) 102210.
[9] UL, UL 4600: Standard for Safety for the Evaluation of Autonomous Products, 2019.
[10] B. Schürmann, D. Heß, J. Eilbrecht, O. Stursberg, F. Koster, M. Althoff, Ensuring drivability of planned motions using formal methods, in: 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2017, pp. 1–8.
[11] S. Kitajima, K. Shimono, J. Tajima, J. Antona-Makoshi, N. Uchida, Multi-agent traffic simulations to estimate the impact of automated technologies on safety, Traffic Injury Prevention 20 (2019) 58–64.
[12] S. Riedmaier, T. Ponn, D. Ludwig, B. Schick, F. Diermeyer, Survey on scenario-based safety assessment of automated vehicles, IEEE Access 8 (2020) 87456–87477.
[13] S. Riedmaier, B. Danquah, B. Schick, F. Diermeyer, Unified framework and survey for model verification, validation and uncertainty quantification, Archives of Computational Methods in Engineering (2020).
[14] M. G. Faes, M. A. Valdebenito, Fully decoupled reliability-based design optimization of structural systems subject to uncertain loads, Computer Methods in Applied Mechanics and Engineering 371 (2020) 1–17.
[15] M. Eek, H. Gavel, J. Ölvander, Definition and implementation of a method for uncertainty aggregation in component-based system simulation models, Journal of Verification, Validation and Uncertainty Quantification 2 (2017) 011001-1–011001-12.
[16] H. F. Stripling, M. L. Adams, R. G. McClarren, B. K. Mallick, The method of manufactured universes for validating uncertainty quantification methods, Reliability Engineering & System Safety 96 (2011) 1242–1256.
[17] R. G. Hills, Roll-up of validation results to a target application, 2013.
[18] American Society of Mechanical Engineers, Standard for verification and validation in computational fluid dynamics and heat transfer: An American national standard, volume 20-2009 of ASME V&V, reaffirmed 2016 ed., The American Society of Mechanical Engineers, New York, NY, 2009.
[19] J. Mullins, Y. Ling, S. Mahadevan, L. Sun, A. Strachan, Separation of aleatory and epistemic uncertainty in probabilistic model validation, Reliability Engineering & System Safety 147 (2016) 49–59.
[20] S. Sankararaman, S. Mahadevan, Integration of model verification, validation, and calibration for uncertainty quantification in engineering systems, Reliability Engineering & System Safety 138 (2015) 194–209.
[21] S. Atamturktur, F. M. Hemez, J. A. Laman, Uncertainty quantification in model verification and validation as applied to large scale historic masonry monuments, Engineering Structures 43 (2012) 221–234.
[22] H. Abbas, M. O'Kelly, A. Rodionova, R. Mangharam, Safe at any speed: A simulation-based test harness for autonomous vehicles, in: Seventh Workshop on Design, Modeling and Evaluation of Cyber Physical Systems (CyPhy'17), 2017, pp. 94–106.
[23] T. Hanke, A. Schaermann, M. Geiger, K. Weiler, N. Hirsenkorn, A. Rauch, S.-A. Schneider, E. Biebl, Generation and validation of virtual point cloud data for automated driving systems, in: 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2017, pp. 1–6.
[24] A. Schaermann, A. Rauch, N. Hirsenkorn, T. Hanke, R. Rasshofer, E. Biebl, Validation of vehicle environment sensor models, in: 2017 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2017, pp. 405–411.
[25] A. Gaidon, Q. Wang, Y. Cabon, E. Vig, VirtualWorlds as proxy for multi-object tracking analysis, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 4340–4349.
[26] M. Nentwig, M. Miegler, M. Stamminger, Concerning the applicability of computer graphics for the evaluation of image processing algorithms, in: 2012 IEEE International Conference on Vehicular Electronics and Safety (ICVES 2012), IEEE, 2012, pp. 205–210.
[27] E. L. Zec, N. Mohammadiha, A. Schliep, Statistical sensor modelling for autonomous driving using autoregressive input-output HMMs, in: 2018 IEEE 21st International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2018, pp. 1331–1336.
[28] International Organization for Standardization, Passenger cars — Validation of vehicle dynamic simulation — Sine with dwell stability control testing, 2016.
[29] International Organization for Standardization, Passenger cars — Vehicle dynamic simulation and validation — Steady-state circular driving behaviour, 2016.
[30] E. Kutluay, H. Winner, Validation of vehicle dynamics simulation models – a review, Vehicle System Dynamics 52 (2014) 186–200.
[31] K. Groh, S. Wagner, T. Kuehbeck, A. Knoll, Simulation and its contribution to evaluate highly automated driving functions, in: WCX SAE World Congress Experience, SAE Technical Paper Series, SAE International, Warrendale, PA, United States, 2019, pp. 1–11.
[32] D. Notz, M. Sigl, T. Kühbeck, S. Wagner, K. Groh, C. Schütz, D. Watzenig, Methods for improving the accuracy of the virtual assessment of autonomous driving, in: 2019 IEEE International Conference on Connected Vehicles and Expo (ICCVE) Proceedings, 2019, pp. 1–6.
[33] S. Wagner, K. Groh, T. Kuhbeck, A. Knoll, Towards cross-verification and use of simulation in the assessment of automated driving, in: 2019 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2019, pp. 1589–1596.
[34] B. Johnson, F. Havlak, H. Kress-Gazit, M. Campbell, Experimental evaluation and formal analysis of high-level tasks with dynamic obstacle anticipation on a full-sized autonomous vehicle, Journal of Field Robotics 34 (2017) 897–911.
[35] S. Riedmaier, J. Nesensohn, C. Gutenkunst, T. Düser, B. Schick, H. Abdellatif, Validation of X-in-the-loop approaches for virtual homologation of automated driving functions, in: 11th Graz Symposium Virtual Vehicle (GSVF), 2018, pp. 1–12.
[36] W. Daamen (Ed.), Traffic simulation and data: Validation methods and applications, Taylor and Francis and CRC Press, Hoboken and Boca Raton, FL, 2015.
[37] Y. Hollander, R. Liu, The principles of calibrating traffic microsimulation models, Transportation 35 (2008) 347–362.
[38] L. Rao, L. Owen, Validation of high-fidelity traffic simulation models, Transportation Research Record: Journal of the Transportation Research Board 1710 (2000) 69–78.
[39] T. Toledo, H. N. Koutsopoulos, Statistical validation of traffic simulation models, Transportation Research Record: Journal of the Transportation Research Board 1876 (2004) 142–150.
[40] E. Kutluay, Development and Demonstration of a Validation Methodology for Vehicle Lateral Dynamics Simulation Models, Ph.D. thesis, Technische Universität Darmstadt, Darmstadt, 2012.
[41] M. Viehof, Objektive Qualitätsbewertung von Fahrdynamiksimulationen durch statistische Validierung, Ph.D. thesis, Technische Universität Darmstadt, Darmstadt, 2018.
[42] S. Rhode, Non-stationary Gaussian process regression applied in validation of vehicle dynamics models, Engineering Applications of Artificial Intelligence 93 (2020) 103716.
[43] M. Hartung, D. Hess, R. Lattarulo, J. Oehlerking, J. Perez, A. Rausch, Report on conformance testing of application models, 2017.
[44] E. Böde, M. Büker, U. Eberle, M. Fränzle, S. Gerwinn, B. Kramer, Efficient splitting of test and simulation cases for the verification of highly automated driving functions, in: B. Gallina, A. Skavhaug, F. Bitsch (Eds.), Computer Safety, Reliability, and Security, Springer International Publishing, Cham, 2018, pp. 139–153.
[45] S. Detering, L. Schnieder, E. Schnieder, Two-level validation and data acquisition for microscopic traffic simulation models, International Journal on Advances in Systems and Measurements 3 (2010).
[46] T. Ponn, T. Kröger, F. Diermeyer, Performance analysis of camera-based object detection for automated vehicles, Sensors (Basel, Switzerland) 20 (2020).
[47] M. Holder, P. Rosenberger, H. Winner, V. P. Makkapati, M. Maier, H. Schreiber, Z. Magosi, T. D'Hondt, Z. Slavik, O. Bringmann, W. Rosenstiel, Measurements revealing challenges in radar sensor modeling for virtual validation of autonomous driving, in: 2018 IEEE 21st International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2018, pp. 2616–2622.
[48] P. Rosenberger, M. Holder, M. Zirulnik, H. Winner, Analysis of real world sensor behavior for rising fidelity of physically based lidar sensor models, in: 2018 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2018, pp. 611–616.
[49] United Nations Economic Commission for Europe (UNECE), Addendum 139: Regulation No. 140 — Uniform provisions concerning the approval of passenger cars with regard to electronic stability control (ESC) systems, 2017.
[50] W. L. Oberkampf, C. J. Roy, Verification and Validation in Scientific Computing, Cambridge University Press, Cambridge, 2010.
[51] M. Aramrattana, R. H. Patel, C. Englund, J. Härri, J. Jansson, C. Bonnet, Evaluating model mismatch impacting CACC controllers in mixed traffic using a driving simulator, in: 2018 IEEE Intelligent Vehicles Symposium (IV), IEEE, 2018, pp. 1867–1872.
[52] L. Zheng, T. Sayed, M. Essa, Y. Guo, Do simulated traffic conflicts predict crashes? An investigation using the extreme value approach, in: 2019 IEEE 22nd International Conference on Intelligent Transportation Systems (ITSC), IEEE, 2019, pp. 631–636.
[53] D. C. Kammer, P. A. Blelloch, J. Sills, Test-based uncertainty quantification and propagation using Hurty/Craig-Bampton substructure representations, in: Proceedings of the IMAC-XXXVII, 2019, pp. 107–129.
[54] M. N. Avramova, K. N. Ivanov, Verification, validation and uncertainty quantification in multi-physics modeling for nuclear reactor design and safety analysis, Progress in Nuclear Energy 52 (2010) 601–614.
[55] United Nations Economic Commission for Europe (UNECE), Addendum 78: UN Regulation No. 79 — Uniform provisions concerning the approval of vehicles with regard to steering equipment, 2018.
[56] G. Bagschik, T. Menzel, M. Maurer, Ontology based scene creation for the development of automated vehicles, in: 2018 IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 1813–1820. doi:10.1109/IVS.2018.8500632.
[57] N. W. Whiting, C. J. Roy, E. P. Duque, S. Lawrence, Assessment of model validation and calibration approaches in the presence of uncertainty, in: AIAA Scitech 2019 Forum, American Institute of Aeronautics and Astronautics, 2019, pp. 1–16.
[58] M. Viehof, H. Winner, Research methodology for a new validation concept in vehicle dynamics, Automotive and Engine Technology (2018).
[59] I. T. Voyles, C. J. Roy, Evaluation of model validation techniques in the presence of aleatory and epistemic input uncertainties, in: 17th AIAA Non-Deterministic Approaches Conference, American Institute of Aeronautics and Astronautics, 2015, pp. 1–16.
[60] R. G. Miller, Simultaneous Statistical Inference, Springer New York, New York, NY, 1981.