2021-01-0868 Published 06 Apr 2021
Evaluation of Operational Safety Assessment
(OSA) Metrics for Automated Vehicles in
Simulation
Maria Soledad Elli Intel Corp.
Jeffrey Wishart Exponent Inc.
Steven Como and Siddhaarthan Dhakshinamoorthy Arizona State University
Jack Weast Intel Corp.
Citation: Elli, M.S., Wishart, J., Como, S., Dhakshinamoorthy, S. et al., “Evaluation of Operational Safety Assessment (OSA) Metrics for
Automated Vehicles in Simulation,” SAE Technical Paper 2021-01-0868, 2021, doi:10.4271/2021-01-0868.
Abstract
The operational safety of automated driving system (ADS)-equipped vehicles (AVs) must be quantified using well-defined metrics in order to gain an unambiguous understanding of the level of risk associated with AV deployment on public roads. In this research, efforts to evaluate the operational safety assessment (OSA) metrics introduced in prior work by the Institute of Automated Mobility (IAM) are described. An initial validation of the proposed set of OSA metrics involved using the open-source simulation software Car Learning to Act (CARLA) and ScenarioRunner, which are used to place a subject vehicle in selected scenarios and obtain measurements for the various relevant OSA metrics. Car-following scenarios were selected from the list of 37 pre-crash scenarios identified by the National Highway Traffic Safety Administration (NHTSA) as the most common driving situations that lead to crash events involving two light vehicles. The resulting data were used to evaluate different parameters and thresholds of the metrics developed in the prior IAM work. The simulation and analysis results were used to evaluate the relevant metrics in the context of proposed criteria as measurable and applicable to the operational safety of AVs and human-driven vehicles alike in a data-driven approach.
Introduction
As the development of automated driving system (ADS)-equipped vehicles (AVs) continues, the need for a process to evaluate the operational safety performance of the technology has become ever more apparent. The process must provide a consistent, unbiased, and technology-neutral evaluation that will provide public confidence as AVs are deployed.
A possible process for this operational safety performance evaluation is to use the concept of a formalized Safety Case Framework (SCF). A safety case is “a structured argument, supported by a body of evidence, that provides a compelling, comprehensible, and valid case that a product is safe for a given application in a given environment.” [1] An SCF, an example of which is the UL 4600 standard [2], will contain a variety of possible verification and validation (V&V) methods that are used in developing the required evidence to support the AV safety case. A subset of V&V methods is testing methods, which include simulation testing, closed course testing, public road testing, or some combination of the three; each involves placing the AV under test in a set of traffic scenarios and evaluating its operational safety performance. A comprehensive evaluation methodology, with validated metrics, must be developed to derive safety case evidence from the test conduct of a given scenario; to the authors’ knowledge, such a methodology does not exist in the literature.
The Institute of Automated Mobility (IAM) was formed by the Governor’s Executive Order in 2018 to help provide guidance on AVs in the state of Arizona. The IAM has been conducting research to develop an operational safety assessment (OSA) methodology, along with OSA metrics, to be used in an SCF as the evaluation methodology for traffic scenario testing. The intent of the OSA methodology is to evaluate the navigation of a given traffic scenario by a vehicle (primarily AVs, but it can also be used for human-driven vehicles) using the OSA metrics measurements and assigning a score for said test. The aggregate score over the set of traffic scenarios can then be used to build the safety case for the AV, which in turn can be used by AV developers and authorities having jurisdiction (AHJs) to evaluate the status of an AV throughout its development and allow for a determination of readiness for various stages of deployment. The safety case provides assurance of a level of safety achieved by an AV, which is imperative for gaining public trust.
Downloaded from SAE International by Jeffrey Wishart, Monday, April 05, 2021
EVALUATION OF OPERATIONAL SAFETY ASSESSMENT (OSA) METRICS FOR AUTOMATED VEHICLES IN SIMULATION 2
In 2020, IAM researchers proposed a set of OSA metrics in Wishart et al. [3] that were compiled and adapted from a comprehensive literature review. The OSA metrics set is, to the authors’ knowledge, the only comprehensive set in the literature that can be used to measure the operational safety of both AVs and human-driven vehicles.
In the current work, a subset of the OSA metrics proposed in [3] has been measured and evaluated for a selection of traffic scenarios, with the aim of refining the metrics where appropriate. This evaluation has been conducted using simulation software, including Car Learning to Act (CARLA) [4] and ScenarioRunner [5], to determine the OSA metric measurements for a subset of scenarios chosen from NHTSA’s list of 37 pre-crash scenarios in [6].
The proposed set of OSA metrics included sets of parameters and thresholds which were adapted from existing literature and other research studies. Many of these parameters and thresholds are considered subjective and require further research for refinement. The analysis conducted in this work includes an attempt to determine values for parameters and thresholds for some of the metrics that were left as future work in [3]. The simulation results presented in this work provide further insight into the consequences of differing thresholds and parameters for a variety of car-following scenarios, as well as the performance of the proposed OSA metrics in the context of assessing the operational safety of AVs.
The outline of the paper is as follows. First, the OSA metrics are described and summarized. The parameters and thresholds used in the experiments are then listed. The traffic scenarios, including the selection process, are then described. The simulation methodology is discussed, including the CARLA software and test vehicle model. Next, the simulation results are presented and discussed. Finally, overall conclusions and future work are described.
OSA Metrics
The proposed set of OSA metrics was introduced in Wishart et al. [3]. The objective was to develop a comprehensive set that would allow for an assessment of the operational safety of a vehicle (human-driven or automated driving) in a variety of scenarios as part of an SCF. A novel taxonomy is proposed here to organize the OSA metrics into three categories: (1) Black Box metrics, (2) Grey Box metrics, and (3) White Box metrics:
• A Black Box metric allows for measurement of data that can be obtained without requiring any access to ADS data. This could be from an on-board or off-board source (e.g., public road infrastructure, or CAN bus data). However, using ADS data may enhance the accuracy and precision of the measurement(s).
• A Grey Box metric allows for measurement of data that can only be obtained with limited access to ADS data.
• A White Box metric allows for measurement of data that can only be obtained with significant access to ADS data.
There are trade-offs for each metric type that make it advantageous (or disadvantageous) in particular use cases. For example, the Black Box metrics may be preferable where access to proprietary ADS data is not desired. Conversely, White Box metrics allow for specific sub-systems in the ADS to be assessed rather than just the AV system as a whole. The Grey Box metrics represent a balance between sensitivity to proprietary data and assessment granularity. It should be noted that while Black Box metrics are useful for both human-driven vehicles and AVs, Grey Box and White Box metrics are only applicable to AVs.
The proposed OSA metrics are shown in Table 1, along with the proposed taxonomy. This proposed set comprises only Black Box and Grey Box metrics since White Box metrics rely on shared data from AV developers and may be unavailable for evaluation purposes. The Black Box metrics can be further categorized as Minimum Safe Distance-Related, Universal, and Traffic Engineering-Related.
The Minimum Safe Distance-Related metrics are based on the Responsibility-Sensitive Safety (RSS) model [8]. Universal metrics are defined as those which apply to both human-driven vehicles and AVs, including events such as Collision Incidents and Traffic Law Violations. It should be noted that the latter metric in [3] was originally “Rules-of-the-Road Violation” but has since been changed to “Traffic Law Violation” since the definition for Rules of the Road is ambiguous and can include customary practices for a particular region. Traffic engineering metrics consist of traditionally used surrogate safety metrics that have been heavily researched in the past [9]. Lastly, the Grey Box metrics include two metrics (ADS Active (ADSA) and Achieved Behavioral Competency (ABC)) that indicate whether the ADS is completing the dynamic driving task (DDT) and if the AV accomplishes the trajectory as planned, respectively.
The Human Traffic Control Detection Error Rate (HTCDER) and Minimum Safe Distance Calculation Error (MSDCE) provide insight into the perception system performance without requiring raw sensor data (which is particularly sensitive, proprietary data) but rather much more limited ADS data. While the MSDCE metric is also based on the RSS model, it is classified as a “Grey Box” metric because some ADS data are required to calculate the metric value.

TABLE 1 Taxonomy for OSA metrics

| Black Box: Minimum Safe Distance-Related | Black Box: Universal | Black Box: Traffic Engineering-Related | Grey Box |
|---|---|---|---|
| Minimum Safe Distance Violation (MSDV) | Collision Incident (CI) | Time-to-Collision Violation (TTCV) | Human Traffic Control Detection Error Rate (HTCDER) |
| Proper Response Action (PRA) | Traffic Law Violation (TLV) | Modified Time-to-Collision Violation (MTTCV) | ADS Active (ADSA) |
| Minimum Safe Distance Factor (MSDF) | Human Traffic Control Violation Rate (HTCVR) | Post-Encroachment Time Violation (PETV) | Achieved Behavioral Competency (ABC) |
| | Aggressive Driving (AD) | | Minimum Safe Distance Calculation Error (MSDCE) |

© SAE International.
It should be noted that the authors continue to monitor the literature for additional metrics to be considered for the OSA metrics, such as Time Headway (THW) from [10], Collision Avoidance Capability (CAC) from [11], and Model Predictive Instantaneous Safety Metric (MPrISM) from [12]. Future work will consider these metrics for inclusion in the OSA metrics set. The OSA metrics from Table 1 are also being considered for the SAE J3237 Information Report currently being developed by the V&V Task Force of the SAE On-Road Automated Driving (ORAD) Committee.
Selection and Formulation
of OSA Metrics
For this work, not all of the proposed OSA metrics are measurable when using the employed simulation. Within the CARLA simulator, ground truth data of the vehicles’ positions, speeds, and accelerations were used to calculate the aforementioned metrics; therefore, metrics related to quantification of localization and tracking errors, such as MSDCE and HTCDER, are not possible to obtain. Additionally, metrics related to the behavior of the subject vehicle under test (i.e., the AV), such as ABC, HTCVR, ADSA, and PRA, are inapplicable to this work as the algorithm controlling the vehicles’ behavior does not reflect a real AV driving policy.
Therefore, the OSA metrics evaluated in the presented work are:
• MSDV
• TTCV
• MTTCV
• PETV
In addition to the previously discussed metrics, the THW metric discussed in [10] and [13] was also included in the analysis. THW is a time-based metric similar to TTC; however, THW is based only on the distance between the following and lead vehicles in relation to the speed of the following vehicle, rather than on the difference in speed. As with other time-based metrics, the lower the THW value, the higher the risk of a collision; therefore, THW is often used in the literature with a pre-determined threshold. This implies that when the threshold is met, the situation has become unsafe and a proper response action is required. An example of the usage of this metric is the latest United Nations Economic Commission for Europe (UNECE) regulation on Automated Lane-Keeping Systems (ALKS) [10]. The THW formulation was modified to align with the other metrics in the form of a violation when the THW falls to or below the threshold, such that the Time Headway Violation (THWV) is introduced.
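The THW and THWV definitions can be illustrated with a minimal Python sketch. The function names and the example values are illustrative assumptions, not part of the authors' pipeline; the formulas follow Equation 5 in Table 2.

```python
import math


def time_headway(x_lead: float, x_follow: float, v_follow: float) -> float:
    """THW = (X_L - X_F) / v_F, in seconds.

    Returns infinity for a stationary follower (assumed convention),
    since the ratio is undefined when v_F is zero.
    """
    if v_follow <= 0.0:
        return math.inf
    return (x_lead - x_follow) / v_follow


def thw_violation(thw: float, threshold: float) -> int:
    """THWV = 1 if THW <= threshold, else 0."""
    return 1 if thw <= threshold else 0


# Hypothetical snapshot: follower at 20 m/s, 30 m behind the lead vehicle.
thw = time_headway(x_lead=30.0, x_follow=0.0, v_follow=20.0)
```

Because THW depends only on the follower's speed, it remains finite in this snapshot even if both vehicles travel at the same speed, a case in which TTC has a zero denominator.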
The details of the selected metrics are shown in Table 2 (note that the Distance to Stop Violation (DSV) is discussed below). The selected metrics are relevant to the analysis of operational safety and represent the first step in the evaluation of the proposed OSA metrics.
In order to evaluate the performance of the chosen OSA metrics, three evaluation criteria were developed to analyze the efficacy of a given metric in assessing the safety of a driving situation:
• Robustness to changing scenario configurations
• Relevance
• Comparison to a ground truth metric
Robustness to changing scenario configurations refers to the metric providing a timely warning that changes appropriately with variations in scenario conditions such as vehicle initial position and speed, relative headway of the vehicles, changes in the environment, etc. For example, if the scenario configuration changes but the metric’s violation timing does not, then the metric robustness is lower.
Relevance refers to the metric providing safety information throughout the scenario for all scenario permutations. For example, if there is one (or more) instance(s) of the metric exhibiting a nonsensical value (e.g., a denominator being 0), the metric relevance is lower.
For the purpose of this work, a ground truth metric, Distance to Stop Violation (DSV), has been established to enable comparisons of the effectiveness of metrics in identifying a potentially unsafe situation preemptively. This criterion thus evaluates the timeliness of the metric violation temporal occurrence. The ground truth DSV metric is based on the distance that it would take the subject vehicle to come to a full stop at a determined deceleration (DSTOP in Equation 6 in Table 2). When the follower vehicle is driving at a distance that is less than or equal to the distance required for this deceleration to occur, a DSV has occurred. This metric indicates a potentially unsafe situation under ideal conditions, meaning that this metric does not consider the velocity or deceleration capabilities of the lead vehicle (or rather it assumes a stopped lead vehicle) nor the road conditions. It also assumes that the subject vehicle can reach the defined $a_{brake}$ instantly. Because of this, two different values for $a_{brake}$ were selected and evaluated with simulation in order to provide different “thresholds” for comparison with the ground truth. These values were taken from [10] and reflect the minimum deceleration needed to perform an emergency maneuver ($a_{brake}$ = 5 m/s²) and a reasonable (maximum) deceleration applied by automatic emergency braking (AEB) systems ($a_{brake}$ = 8.3 m/s²). The implementation of this evaluation criterion is described in the Metrics Observations and Discussion section.
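As a worked illustration of the DSV ground truth, the following Python sketch evaluates DSTOP for the two deceleration values quoted above. The function names and the 18 m/s example speed (the LVS_18 target speed) are chosen here for illustration only; the formulas follow Equation 6.

```python
def distance_to_stop(v_follow: float, a_brake: float) -> float:
    """DSTOP = v_F^2 / (2 * a_brake), in metres (instant, constant braking)."""
    return v_follow ** 2 / (2.0 * a_brake)


def dsv(d_long: float, v_follow: float, a_brake: float) -> int:
    """DSV = 1 if the longitudinal gap is at or below DSTOP, else 0."""
    return 1 if d_long <= distance_to_stop(v_follow, a_brake) else 0


# Follower travelling at 18 m/s, evaluated for both deceleration values.
d_stop_emergency = distance_to_stop(18.0, 5.0)   # 324 / 10   = 32.4 m
d_stop_aeb = distance_to_stop(18.0, 8.3)         # 324 / 16.6 ≈ 19.5 m
```

With a 25 m gap at 18 m/s, the 5 m/s² setting already reports a DSV while the 8.3 m/s² setting does not, which is the sense in which the two values act as different "thresholds" for the comparison.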
These three criteria will be used to evaluate the metrics based on the experimental results in the following sections.
OSA Metrics Parameterization
The OSA metrics evaluated in this work included subjective assumptions for thresholds determining when a violation of a metric occurred. Therefore, this work focused on evaluating the impact of different threshold and parameter values used to define the OSA metrics.
Time-Based Metrics. The implementation of time-based metrics such as TTCV, MTTCV, PETV, and THWV is highly dependent on the threshold values assigned to them, and it is therefore important to evaluate the metrics’ performance as a function of differing thresholds. The sets of values chosen for the TTCV, MTTCV, and THWV thresholds are based on values suggested by previous literature reviewed in [3]. In the case of PETV, the threshold values were chosen according to the distribution of PET values found in the simulated data, as the spread of PET values was smaller than that of the rest of the time-based metrics (see Figure 7). The threshold values for the time-based metrics evaluated in this work are shown in Table 3.

TABLE 2 OSA metrics formulation

Minimum Safe Distance Violation:
$$MSDV' = \begin{cases} 1 & \text{if } d^{lat} < d^{lat}_{\min} \wedge d^{long} < d^{long}_{\min} \\ 0 & \text{else} \end{cases} \qquad MSDV = \begin{cases} 1 & \text{if } MSDV' = 1 \wedge \text{originated by AV} \\ 0 & \text{else} \end{cases} \tag{1}$$

Time to Collision Violation:
$$TTC = \frac{X_L - X_F}{v_F - v_L} \qquad TTCV = \begin{cases} 1 & \text{if } TTC \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{2}$$

Modified Time to Collision Violation:
$$MTTC = \frac{-\Delta V \pm \sqrt{\Delta V^2 + 2 \Delta A D}}{\Delta A} \qquad MTTCV = \begin{cases} 1 & \text{if } MTTC \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{3}$$

Post Encroachment Time Violation:
$$PET = t_2 - t_1 \qquad PETV = \begin{cases} 1 & \text{if } PET \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{4}$$

Time Headway Violation:
$$THW = \frac{X_L - X_F}{v_F} \qquad THWV = \begin{cases} 1 & \text{if } THW \le \text{threshold} \\ 0 & \text{else} \end{cases} \tag{5}$$

Distance to Stop Violation:
$$DSTOP = \frac{v_F^2}{2 a_{brake}} \qquad DSV = \begin{cases} 1 & \text{if } d^{long} \le DSTOP \\ 0 & \text{else} \end{cases} \tag{6}$$

Where:
$d^{long}$: longitudinal distance between two vehicles
$d^{lat}$: lateral distance between two vehicles
$d^{long}_{\min}$: minimum longitudinal distance between two vehicles ([8])*
$d^{lat}_{\min}$: minimum lateral distance between two vehicles ([8])*
$X_L$: Leading vehicle position
$X_F$: Following vehicle position
$v_L$: Leading vehicle speed
$v_F$: Following vehicle speed
$\Delta V$: Relative velocity
$\Delta A$: Relative acceleration
$D$: Relative space gap (equivalent to $d^{long}$ in car-following situations)
$t_2$: Arrival time of (any part of) Vehicle 2 at Conflict Point
$t_1$: Arrival time of (any part of) Vehicle 1 at Conflict Point
$a_{brake}$: Following vehicle deceleration
*Note: The formulae for $d^{long}_{\min}$ and $d^{lat}_{\min}$ are in Appendix A (equations 7 and 8).
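The time-based formulations in Equations 2 to 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' post-processing code: the sign conventions ΔV = v_F − v_L and ΔA = a_F − a_L, the use of the smallest positive root for MTTC, and the return of infinity when no collision course exists are all choices made here for illustration.

```python
import math


def ttc(x_lead: float, x_follow: float, v_lead: float, v_follow: float) -> float:
    """TTC = (X_L - X_F) / (v_F - v_L); infinite if the gap is not closing."""
    closing_speed = v_follow - v_lead
    if closing_speed <= 0.0:
        return math.inf
    return (x_lead - x_follow) / closing_speed


def mttc(gap: float, dv: float, da: float) -> float:
    """Smallest positive root of 0.5*da*t^2 + dv*t - gap = 0.

    dv and da are follower-minus-lead speed and acceleration (assumed sign
    convention). Falls back to gap / dv when da is approximately zero.
    """
    if abs(da) < 1e-9:
        return gap / dv if dv > 0.0 else math.inf
    discriminant = dv * dv + 2.0 * da * gap
    if discriminant < 0.0:
        return math.inf  # the vehicles never meet
    root = math.sqrt(discriminant)
    positive = [t for t in ((-dv + root) / da, (-dv - root) / da) if t > 0.0]
    return min(positive) if positive else math.inf


def pet(t1: float, t2: float) -> float:
    """PET = t2 - t1, the gap between arrivals at the conflict point."""
    return t2 - t1


def violation(value: float, threshold: float) -> int:
    """Generic indicator used for TTCV, MTTCV and PETV: 1 if value <= threshold."""
    return 1 if value <= threshold else 0
```

For a 30 m gap closing at 5 m/s, `ttc` and `mttc` both give 6 s when the relative acceleration is zero; a relative acceleration of 2 m/s² shortens the MTTC to roughly 3.5 s, which is the additional information MTTC carries over TTC.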
Minimum Safe Distance-Related Metric. For the Minimum Safe Distance-Related metric, there are four parameters that were examined according to the definitions in [8]:
1. Reaction time of the ADS of the subject vehicle, $\rho$
2. Maximum acceleration of the subject vehicle during the response duration, $a^{long}_{\max,accel}$
3. Minimum deceleration of the subject vehicle after the response duration in order to avoid a collision, $a^{long}_{\min,decel}$
4. Maximum assumed deceleration capability of the other vehicle, $a^{long}_{\max,decel}$
The values chosen for the above-mentioned parameters were extracted from previous research that determined such values based on naturalistic driving data and further simulation experiments. For the purpose of this work, the sets of values were separated into three different categories (shown in Table 4): Aggressive, Conservative, and NDS (Naturalistic Driving Study). The values for the Aggressive and Conservative categories were adopted from [14], in which a Falsification Search Engine using simulation was used to find RSS parameter values with an associated robustness value, with robustness being a measure of how close the vehicles under test were during the simulated scenarios. From the search, clusters of parameter sets were divided into Aggressive and Conservative categories as they resulted in more aggressive (i.e., shorter following distances) and more conservative (i.e., larger following distances) behaviors, respectively. The values under the NDS category were adopted from the China Intelligent Transportation Systems (C-ITS) Alliance standard #0116-2019 [15]. The values in this standard were defined after analyzing 3 years of naturalistic driving data collected from Shanghai highways in China.
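To make the effect of the three parameter categories concrete, the sketch below evaluates the RSS minimum longitudinal safe distance for same-direction traffic using the Table 4 values. The formula for $d^{long}_{\min}$ is given in Appendix A and is not reproduced in this section; the code assumes it matches the standard RSS longitudinal expression published by the RSS authors. The class and function names, and the 15 m/s example speed, are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RssParams:
    rho: float          # response time [s]
    a_max_accel: float  # max follower acceleration during response [m/s^2]
    a_min_decel: float  # min follower braking after response [m/s^2]
    a_max_decel: float  # max assumed braking of the lead vehicle [m/s^2]


# Parameter categories as listed in Table 4.
PARAMS = {
    "Aggressive": RssParams(0.5, 4.1, 4.6, 8.0),
    "Conservative": RssParams(1.9, 5.9, 4.1, 9.5),
    "NDS": RssParams(0.2, 1.8, 3.6, 6.1),
}


def min_safe_long_distance(v_follow: float, v_lead: float, p: RssParams) -> float:
    """Standard RSS same-direction longitudinal safe distance (assumed form)."""
    v_after_response = v_follow + p.rho * p.a_max_accel
    distance = (
        v_follow * p.rho
        + 0.5 * p.a_max_accel * p.rho ** 2
        + v_after_response ** 2 / (2.0 * p.a_min_decel)
        - v_lead ** 2 / (2.0 * p.a_max_decel)
    )
    return max(distance, 0.0)


# Both vehicles at 15 m/s (hypothetical operating point).
distances = {name: min_safe_long_distance(15.0, 15.0, p) for name, p in PARAMS.items()}
```

At this operating point the sketch gives roughly 25.5 m (Aggressive), 111.1 m (Conservative), and 17.4 m (NDS), so the Conservative set demands a gap several times larger than the other two.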
Experimental Design
The experimental design in this work involved developing various scenarios and then implementing said scenarios in simulation, as described in the following sections.
Scenario Selection
One of the difficult challenges currently facing the AV industry is the selection of scenarios to be evaluated for the assessment of vehicle safety. Unique scenario generation is out of the scope of this project and the authors relied upon the 37 pre-crash scenarios documented by the National Highway Traffic Safety Administration (NHTSA) [6]. From the list of pre-crash scenarios, a subset of car-following situations was selected for evaluation. The 37 pre-crash scenarios were filtered and car-following scenarios that involved “Two-Vehicle”, “Light-Vehicle Crashes” were then selected based on frequency of occurrence. Scenarios involving signalized and unsignalized junctions are out of the scope of this work. Car-following scenarios were used in this work to simplify the simulation setup. Future work will expand the discussed methodology to consider more complex scenarios such as intersection-related environments. Details of the chosen scenarios extracted from [6] are summarized in Table 7 in Appendix B and include:
1. Lead vehicle stopped (LVS)
2. Lead vehicle decelerating (LVD)
3. Lead vehicle moving at lower constant speed (LVMLCS)
4. Lead vehicle accelerating (LVA)
Together, the four scenarios accounted for some 18.4% of all light-duty vehicle crashes from 2004-2008, according to Table 7 in [6].
Scenarios Realization
In this work, each selected scenario from NHTSA’s pre-crash typology is considered a scenario category in which all scenarios defined within the same category share the same behavior, but variations on the vehicles’ speeds and/or initial positions were defined to evaluate the sensitivity to changes in such conditions. Speed variations and positions for the vehicles were defined with the goal of creating relevant situations that may affect the resulting OSA metrics calculation. This allows for the inference of data trends when calculating the associated OSA metrics. In total, 13 different scenarios were defined, shown in Table 5, that were simulated and then analyzed. The scenarios defined in this study are not meant to be exhaustive of different driving situations, but rather an initial selection for studying the impact of different driving conditions of vehicles (e.g., vL > vF, vL = vF, etc.) on the OSA metrics.
TABLE 3 Evaluated thresholds for time-based metrics

| Metric | Thresholds [s] |
|---|---|
| TTCV | {1, 2, 3, 4, 5} |
| MTTCV | {1, 2, 3, 4, 5} |
| PETV | {0.5, 1, 1.2, 1.5, 2} |
| THWV | {1, 2, 3, 4, 5} |

TABLE 4 Categories of RSS parameters for MSDV metric

| Category | $\rho$ [s] | $a^{long}_{\max,accel}$ [m/s²] | $a^{long}_{\min,decel}$ [m/s²] | $a^{long}_{\max,decel}$ [m/s²] |
|---|---|---|---|---|
| Aggressive | 0.5 | 4.1 | 4.6 | 8 |
| Conservative | 1.9 | 5.9 | 4.1 | 9.5 |
| NDS | 0.2 | 1.8 | 3.6 | 6.1 |

In the simulation experiments, the selected scenarios involve only two vehicles, the subject vehicle and the lead vehicle. Both vehicles start from a rest position and accelerate to their target speeds unless explicitly stated otherwise. The two vehicles are positioned with the subject vehicle behind the lead vehicle in the same lane at a specified distance, on a straight road with no breaks (e.g., junctions, signals, turns, stop signs, etc.) that is long enough for both vehicles to achieve their designated target speeds. Each scenario progresses such that the subject vehicle approaches the lead vehicle and a collision occurs. The scenario is terminated at the collision time.
The behavior of the vehicles in a scenario is described as follows:
Lead Vehicle Stopped (LVS). The lead vehicle is positioned at an initial distance ahead of the subject vehicle to provide enough distance for the subject vehicle to reach its designated target speed. The subject vehicle maintains this speed until it eventually collides with the lead vehicle. Throughout the entire scenario, the lead vehicle stays at rest. Example vehicle dynamics for the LVS scenarios are depicted in Figure 1.
Lead Vehicle Decelerating (LVD). The lead vehicle is positioned at an initial distance ahead of the subject vehicle and both vehicles start moving from rest position. Both vehicles maintain a constant acceleration after reaching the target speed. After the vehicles achieve their target speeds, the lead vehicle starts decelerating until reaching a full stop. The subject vehicle eventually collides with the decelerating lead vehicle. Example vehicle dynamics for the LVD scenarios are depicted in Figure 2.
Lead Vehicle Moving at Lower Constant Speed (LVMLCS). The lead vehicle is positioned at an initial distance ahead of the subject vehicle and both vehicles start moving from rest position. The target speed of the lead vehicle is lower than that of the subject vehicle. The subject vehicle eventually collides with the slower-moving lead vehicle. Example vehicle dynamics for the LVMLCS scenarios are depicted in Figure 3.
Lead Vehicle Accelerating (LVA). The lead vehicle is positioned at an initial distance ahead of the subject vehicle and both vehicles start moving from rest position. The lead vehicle has an initial speed that is lower than that of the subject vehicle. As the subject vehicle approaches the lead vehicle, the lead vehicle starts accelerating to its final target speed. The subject vehicle eventually collides with the accelerating lead vehicle. Example vehicle dynamics for the LVA scenarios are depicted in Figure 4.
TABLE 5 Scenario categories and details

| Scenario Category | Scenario ID | Initial Headway Distance [m] | Subject Vehicle Target Speed [m/s] | Lead Vehicle Target Speed [m/s] |
|---|---|---|---|---|
| Lead Vehicle Stopped | LVS_10 | 200 | 10 | 0 |
| | LVS_15 | 200 | 15 | 0 |
| | LVS_18 | 200 | 18 | 0 |
| Lead Vehicle Decelerating | LVD_14 | 5* | 14.1 | 14 |
| | LVD_15 | 5* | 15 | 15.1 |
| | LVD_16 | 30 | 16 | 18 |
| | LVD_18 | 30 | 18 | 18 |
| | LVD_20 | 30 | 20 | 18 |
| Lead Vehicle Moving at Lower Constant Speed | LVMLCS_12 | 30 | 12 | 10 |
| | LVMLCS_15 | 30 | 15 | 10 |
| | LVMLCS_20 | 30 | 20 | 15 |
| Lead Vehicle Accelerating | LVA_15 | 30 | 15 | 20 |
| | LVA_20 | 30 | 20 | 25 |

*Note: Scenarios LVD_14 and LVD_15 are special situations in which the subject vehicle is initially 5 m away from the leading vehicle, both driving at high speeds, purposefully creating unsafe situations for the sake of metrics evaluation.
FIGURE 1 Example: LVS_18 scenario dynamics
FIGURE 2 LVD_16 scenario dynamics
Simulation Setup
CARLA [4] is an open-source AV simulation software which can be used to simulate an AV and diverse sensor suites for use in different weather conditions and road topologies. ScenarioRunner for CARLA [5] is an open-source traffic scenario definition and execution engine that can be used for defining the interaction of an AV with other road users in different driving situations. Within ScenarioRunner, it is possible to define scenarios using the standard OpenScenario format [16], which allows for controlling the involved traffic participants’ initial conditions, driving inputs, and environmental conditions for evaluation purposes.
For the purpose of this work, a Tesla Model 3 vehicle from the CARLA vehicle library was selected as the test vehicle. Both the subject and leading vehicle within each scenario were defined using the same vehicle model to avoid possible confounding results associated with nuances of different vehicle models. The particular vehicle used for the simulation is not important so long as the vehicle characteristics are representative of realistic conditions, as the metrics are not sensitive to a specific vehicle type.
The subject vehicle used in the scenarios behaved according to a basic behavior defined by the “Roaming Agent” in CARLA. This agent controls the longitudinal and lateral motion of the vehicle using a PID controller (with parameters for the longitudinal control as: Kp = 0.1, Kd = 0.1 and Ki = 1.0; the parameters for lateral control are not relevant to this work). Only longitudinal control is relevant in the car-following scenarios evaluated in this work. Additionally, the agent responds to traffic lights and other vehicles within a certain proximity threshold. In our experiments, this proximity threshold was set to 0 meters in order to prevent interference with the subject vehicle’s behavior, thereby purposefully creating unsafe situations and collisions for the sake of the metrics evaluation. As a result, the subject vehicle did not apply any response when approaching the leading vehicle.
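For readers unfamiliar with PID speed control, the following Python sketch shows a generic discrete-time PID loop using the longitudinal gains quoted above. It is not CARLA's Roaming Agent implementation: the class name, the 0.05 s time step, and the clipping of the output to [-1, 1] (positive for throttle, negative for brake) are assumptions made here for illustration.

```python
class LongitudinalPID:
    """Generic discrete PID on speed error (illustrative, not CARLA's code)."""

    def __init__(self, kp: float = 0.1, ki: float = 1.0, kd: float = 0.1,
                 dt: float = 0.05) -> None:
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self._integral = 0.0
        self._prev_error = 0.0

    def step(self, target_speed: float, current_speed: float) -> float:
        """Return a control command in [-1, 1] for one simulation tick."""
        error = target_speed - current_speed
        self._integral += error * self.dt
        derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        command = (self.kp * error
                   + self.ki * self._integral
                   + self.kd * derivative)
        # Saturate to the assumed actuator range.
        return max(-1.0, min(1.0, command))


controller = LongitudinalPID()
first_command = controller.step(target_speed=10.0, current_speed=0.0)
```

Starting from rest with a 10 m/s target, the first command saturates at full throttle, consistent with a vehicle that accelerates toward its target speed and then holds it regardless of the vehicle ahead.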
Within each scenario run, the data needed to compute
the metrics were collected and a post-processing Python
pipeline was generated to calculate the instantaneous measure
of the OSA metrics throughout each scenario execution.
In order to calculate the Minimum Safe Distance-Related metrics from [3], the subject vehicle used the open-source implementation of the RSS model from [8] and [17] that is integrated within CARLA [18]. An “RSS Sensor” was attached to the subject vehicle, which analyzed the situation at each time step in order to calculate the longitudinal and lateral minimum safe distances with respect to the other road users during the simulation. Within the simulation setup, the RSS Sensor was only used to detect dangerous situations (according to RSS definitions), but the subject vehicle did not actuate based on this information (i.e., did not apply a Proper Response as defined by RSS), thus resulting in a collision for each scenario. By allowing collision events to occur, the capability of the OSA metrics to pre-emptively determine potential collisions could be assessed.
A quantitative analysis as well as a visual depiction of the metrics was generated to better understand the relationship between different metric values and the impact that thresholds or parameter values have in the context of assessing the safety of the situation.
Experimental Results
The evaluation of the metrics’ relationship, redundancies, areas of inapplicability, and any other potential observations associated with the proposed metric thresholds and parameters is explained in the following sections.
Metric values as well as metric violation duration distributions across all scenarios were evaluated in order to understand the relevance of each metric for the assessment of the operational safety performance of AVs. In particular, metric violation durations are important for assessing violation temporal occurrence when identifying unsafe situations. Short metric violation durations could be indicative of failure to identify the safety of the situation. In other words, short metric violation durations could indicate that the metric identified an unsafe situation too late, or that it is not giving a continuous safety assessment. On the other hand, large metric violation durations could be indicative of increased conservativeness. Additionally, metric violations for a given threshold/parameter set were also analyzed across the scenarios.
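One way to obtain violation durations from a per-time-step violation signal is to measure the length of each contiguous run of violations. The Python sketch below illustrates this idea; the function name, the fixed time step, and the example signal are assumptions for illustration and do not reproduce the authors' pipeline.

```python
from typing import List, Sequence


def violation_durations(flags: Sequence[int], dt: float) -> List[float]:
    """Durations [s] of each contiguous run of violations (flag == 1).

    `flags` holds one 0/1 violation indicator per simulation step and
    `dt` is the (assumed constant) step length in seconds.
    """
    durations: List[float] = []
    run_length = 0
    for flag in flags:
        if flag:
            run_length += 1
        elif run_length:
            durations.append(run_length * dt)
            run_length = 0
    if run_length:  # a run still open at the end of the scenario
        durations.append(run_length * dt)
    return durations


# Hypothetical signal: one interrupted violation, then one lasting to the end.
example = violation_durations([0, 1, 1, 0, 1, 1, 1], dt=0.5)
```

A metric that yields several short runs rather than a single run ending at the collision would, in the terms used above, not be giving a continuous safety assessment.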
Metric Parametrization Based
on Duration Distributions
For each of the evaluated metrics, specific thresholds and, in some cases, assumed parameters must be assigned in order to establish a violation occurrence. In previous work, Wishart et al. [3] compiled thresholds and assumed parameters collected from previous research as proposed values for each metric. This analysis considered a range of values for each scenario to further assess the impact of these values on generating meaningful results. Comparing the results for each threshold value demonstrated that for high threshold values on the time-based metrics, violations were generated well before the vehicle would need to react to avoid an unsafe situation, whereas for lower thresholds, an unsafe situation was not detected until the vehicle would not have enough time to safely avoid the collision. This evaluation provided additional context to justify threshold values within the metric violation evaluations.

FIGURE 3 LVMLCS_15 scenario dynamics
FIGURE 4 LVA_20 scenario dynamics
Minimum Safe Distance
Violation (MSDV)
The Minimum Safe Distance Violation metric depends on the
current longitudinal and lateral distances between the subject
vehicle and the lead vehicle as well as the calculated
RSS-defined longitudinal and lateral minimum safe distances
using the defined parameters. In the time steps during which
the instantaneous longitudinal and lateral distances were less
than the calculated minimum safe distances, a violation
of the Minimum Safe Distance occurs (eq. (1)). In order to
better understand the impact of different parameter sets on
the MSDV metric, the distributions across the selected
scenarios of the Minimum Safe Distance calculated using the
different parameter sets are depicted in Figure 5.
The results indicated that the Minimum Safe Distances
calculated with the Aggressive and NDS parameter sets led to
similar distributions, with median longitudinal distances of
23.0 m and 16.2 m, respectively. In contrast, the Conservative
category had a median longitudinal distance of 103.8 m, more
than four times the median of the Aggressive category.
When analyzing the distribution of the duration of an MSD
Violation with the different parameter sets, the more
conservative set of parameters yielded longer violation
durations, with a median violation duration of 16.6 s (see
Figure 6). In the case of the Aggressive and NDS parameters,
the median durations of an MSD Violation were 8.2 s and 5.4 s,
respectively. The minimum violation durations were 2.0 s,
1.6 s, and 7.4 s for the Aggressive, NDS, and Conservative
categories, respectively.
While the Aggressive and NDS parameter sets generated
similar distributions of MSD values and violation durations,
it is worth noting the effect of choosing one set over the other.
In the case of the Aggressive category, the subject vehicle
assumes that it can accelerate at up to a_max,accel = 4.1 m/s^2
during the response time ρ = 0.5 s and is then expected to brake
with at least a_min,brake = 4.6 m/s^2. Additionally, the subject
vehicle assumes that the front vehicle can brake with up to
a_max,brake = 8.0 m/s^2. These values certainly reflect a more
aggressive and jerky behavior than the values from the NDS
category. The primary reason for this is that the values in the
NDS category were extracted from human drivers’ behavior,
while those in the Aggressive category came from simulation
experiments. Moreover, the values in the NDS category are in
line with previous NHTSA research on the average maximum
deceleration values achieved by humans [19]. In that work, it
was found that the mean maximum deceleration applied by
humans on dry and wet surfaces was approximately 0.67 g
(SD 0.25, max 1.15, min 0.004).
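To make the role of each parameter concrete, the sketch below evaluates the RSS longitudinal minimum safe distance from Shalev-Shwartz et al. [8] with the Aggressive values quoted above. The function names and the example speeds (both vehicles at 15 m/s) are assumptions chosen for illustration; the lateral term is omitted, so this is only the longitudinal half of the check and not the script used to produce the reported results.

```python
def rss_min_longitudinal_distance(v_rear: float, v_front: float, rho: float,
                                  a_max_accel: float, a_min_brake: float,
                                  a_max_brake: float) -> float:
    """RSS minimum safe longitudinal gap (m) for same-direction traffic.

    Worst case: the rear vehicle accelerates at a_max_accel for the response
    time rho and then brakes at a_min_brake, while the front vehicle brakes
    at a_max_brake. Speeds in m/s, accelerations in m/s^2 (positive values).
    """
    v_rear_after_response = v_rear + rho * a_max_accel
    distance = (v_rear * rho
                + 0.5 * a_max_accel * rho ** 2
                + v_rear_after_response ** 2 / (2.0 * a_min_brake)
                - v_front ** 2 / (2.0 * a_max_brake))
    return max(distance, 0.0)


def is_longitudinal_msd_violation(gap_m: float, d_min_m: float) -> bool:
    """Longitudinal part of the check only; the lateral part is not modeled here."""
    return gap_m < d_min_m


# Aggressive parameter values as quoted in the text.
AGGRESSIVE = dict(rho=0.5, a_max_accel=4.1, a_min_brake=4.6, a_max_brake=8.0)

# Hypothetical example: both vehicles travelling at 15 m/s.
d_min = rss_min_longitudinal_distance(15.0, 15.0, **AGGRESSIVE)
```

With these assumed speeds the Aggressive set yields a minimum longitudinal distance of roughly 25.5 m, so a 20 m gap would be flagged and a 30 m gap would not.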
Time-Based Metrics In this section, the time-based
metrics TTCV, MTTCV, PETV and THWV are analyzed,
in addition to the impact of differing thresholds on the
initiation of a metric violation and the duration of the violations.
Figure 7 depicts the distribution of the time-based metric
values across all scenarios. The TTC, MTTC, and THW values
were capped at a maximum of 10 s whenever the calculations
resulted in larger values, skewing the distributions of TTC and
MTTC values towards 10 s, with median values of 10 s in both
cases. Conversely, THW had a median value of 1.9 s. This
speaks to the limited relevance of the information provided by
the TTCV and MTTCV metrics overall. In the case of PET, a
narrower distribution of values was observed, with a median
of 1.5 s and a maximum of 3.7 s.
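The capping step and the two simplest time-based metrics can be sketched as follows. The formulas are the standard car-following definitions (TTC as gap over closing speed, THW as gap over subject speed); the function names and the choice to return the 10 s cap when a value is undefined are assumptions of this sketch rather than details confirmed by the paper.

```python
CAP_S = 10.0  # maximum reported value, as stated in the text


def time_to_collision(gap_m: float, v_subject: float, v_lead: float,
                      cap: float = CAP_S) -> float:
    """TTC in seconds, capped at `cap`."""
    closing_speed = v_subject - v_lead
    if closing_speed <= 0.0:
        # Assumption: when the gap is not closing, TTC is undefined and the
        # cap is reported instead.
        return cap
    return min(gap_m / closing_speed, cap)


def time_headway(gap_m: float, v_subject: float, cap: float = CAP_S) -> float:
    """THW in seconds, capped at `cap`; ignores the lead vehicle's speed."""
    if v_subject <= 0.0:
        return cap  # assumption: a stationary subject has no finite headway
    return min(gap_m / v_subject, cap)


# Hypothetical snapshots with a 30 m gap and a subject speed of 15 m/s.
ttc_closing = time_to_collision(30.0, 15.0, 0.0)   # lead vehicle stopped
ttc_opening = time_to_collision(30.0, 15.0, 20.0)  # lead vehicle faster
thw = time_headway(30.0, 15.0)
```

Because every non-closing time step maps to the cap in this sketch, a scenario in which the vehicles spend most of their time at similar speeds would push the median towards 10 s, which is consistent with the skew described above.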
Time-To-Collision Violation (TTCV) The TTC metric is
arguably the most popular surrogate safety metric found in
the literature. Therefore, it is important to understand the
meaning of a certain threshold value for TTC in relation to
the operational safety of vehicles. The TTCV threshold was
varied from 1 s to 5 s and the resulting TTC Violation duration
is shown in Figure 8. For a threshold value of 1 s, the TTCV
duration has a maximum value of 1.3 s and a median of 1 s.
This implies that the metric likely fails to identify unsafe
situations in time (i.e., 1 s before collision). For threshold values
of 2 s and 3 s, the duration of a TTCV yields a median of 1.9 s
and 2.9 s, respectively. The minimum violation duration for
the 2 s threshold is 0.8 s, whereas for a threshold of 3 s, the
minimum violation duration is 0.85 s. This implies that a
threshold of 3 s for TTCV might be too conservative for
real-world driving for the scenarios studied in this work, since
the shorter violations that occurred with a threshold of 2 s were
also present with a threshold of 3 s. Beyond a threshold of 4 s,
the increasing trend of the mean TTCV duration flattens out.
The maximum maintains its trend because a higher threshold
means that the TTCV is activated much earlier during a
scenario. After the 4 s threshold, the metric becomes too
conservative to pick up on the shorter violations.
FIGURE 5 Minimum Safe Distance calculations boxplot with
varying parameter sets
FIGURE 6 MSDV duration boxplot with varying
parameter sets
FIGURE 7 Time-based metrics calculation boxplot with
varying threshold values
Modified Time-To-Collision Violation (MTTCV) The
MTTC metric was introduced as a more sophisticated TTC,
considering the relative distances, speeds and accelerations of
the vehicles. However, unless linked to a threshold, MTTC by
itself cannot give a sense of the severity of a situation. This is
mainly due to the fact that two vehicles under test might have
the same MTTC value for different combinations of relative
speeds and distances. In this work, MTTC Violations were
analyzed with thresholds varying from 1 s to 5 s and the
resulting MTTCV durations at different thresholds are shown
in Figure 9.
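The point that equal MTTC values can arise from different kinematic states is easy to see with a small example. The sketch below uses a commonly cited constant-acceleration formulation of MTTC (the smallest positive root of 0.5·Δa·t² + Δv·t − d = 0); the exact definition used in this work is given in [3] and may differ in detail, and the function name and sample inputs are hypothetical.

```python
import math
from typing import Optional


def modified_ttc(gap_m: float, dv: float, da: float) -> Optional[float]:
    """Smallest positive t with 0.5*da*t**2 + dv*t - gap_m == 0, else None.

    dv = v_subject - v_lead (m/s), da = a_subject - a_lead (m/s^2).
    Both relative accelerations are assumed constant over the horizon.
    """
    if abs(da) < 1e-9:
        # No relative acceleration: reduces to ordinary TTC.
        return gap_m / dv if dv > 0.0 else None
    discriminant = dv * dv + 2.0 * da * gap_m
    if discriminant < 0.0:
        return None  # the gap never closes
    root = math.sqrt(discriminant)
    candidates = [t for t in ((-dv + root) / da, (-dv - root) / da) if t > 0.0]
    return min(candidates) if candidates else None


# Two different situations (hypothetical values) with the same MTTC:
mttc_far_fast = modified_ttc(gap_m=20.0, dv=10.0, da=0.0)
mttc_near_accel = modified_ttc(gap_m=8.0, dv=2.0, da=2.0)
```

Both calls evaluate to 2 s even though one describes a 20 m gap closing at a steady 10 m/s and the other an 8 m gap with a small but growing closing speed, which is why a threshold alone says little about severity.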
From Figure 9 we observe that the minimum MTTCV
duration is always close to 0 (0.05 s in all cases, as this is the
time cycle of the simulation), even in cases of higher threshold
values. This highlights the sensitivity of the MTTC calculation
to changes in the relative speed, acceleration and distance
between the vehicles. In all cases, very short violations were
found, suggesting that MTTCV could result in multiple
conservative violations (e.g., unwarranted warnings of an
unsafe situation) regardless of the threshold value.
Furthermore, the median MTTCV duration for thresholds of
3, 4 and 5 seconds is biased towards the lower values, implying
shorter violation durations even for the highest threshold. The
variability demonstrated by the MTTCV metric across
different threshold values illustrated some of the drawbacks of
this measurement in the context of reliably and consistently
evaluating the safety of a vehicle across scenarios.
Post-Encroachment Time Violation (PETV) PET is a
traditional surrogate safety measure used frequently in traffic
engineering studies. By including PET in the proposed set of
metrics, the intent was to provide measures comparable to
existing datasets in addition to a metric that is relatable to
both human-driven and automated vehicles. In the context of
the tested scenarios, PET was measured as the difference in
time at which the lead and subject vehicles pass through a
conflict point. In the car-following scenarios analyzed, several
conflict points were defined throughout the scenarios,
resulting in a PET curve over time, rather than a single PET
value for each scenario. The latter case was the original intent
of the PET [9]. In the cases where the subject vehicle’s path
does not coincide with the path of the lead vehicle (i.e., the rear
bumper of the lead vehicle and the front bumper of the subject
vehicle do not both pass through the conflict point), the PET
value was non-existent. The threshold for a PET violation was
varied from 0.5 s to 2 s. As depicted in Figure 10, the PET
violation duration was minimal for all thresholds due to
scenarios in the LVS category where there is no post
encroachment. In all cases, the maximum violation duration
was larger than 15 s due to scenarios LVD_14 and LVD_15,
where the vehicles were initially 5 m away from each other. For
thresholds of 0.5 s, 1 s, 1.2 s, and 1.5 s, there is an increasing
trend in the median values of the violation duration, with
values of 1 s, 2 s, 2.4 s, and 3.6 s, respectively. The variability
demonstrated by the PET violation metric illustrated some of
the pitfalls associated with this measurement in the context of
reliably and consistently evaluating the safety of a vehicle
across scenarios.
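The PET measurement described above can be sketched as a simple time difference at one conflict point. The function names, the use of `None` for a vehicle that never reaches the point, the strict less-than comparison against the threshold, and the example timestamps are all hypothetical; repeating the call for successive conflict points along the lane would give a PET curve over time like the one described.

```python
from typing import Optional


def post_encroachment_time(t_lead_rear_clears: Optional[float],
                           t_subject_front_arrives: Optional[float]
                           ) -> Optional[float]:
    """PET (s) at one conflict point, or None if either vehicle never passes it."""
    if t_lead_rear_clears is None or t_subject_front_arrives is None:
        return None
    return t_subject_front_arrives - t_lead_rear_clears


def is_pet_violation(pet_s: Optional[float], threshold_s: float) -> bool:
    """A missing PET can never register as a violation."""
    return pet_s is not None and pet_s < threshold_s


# Hypothetical timestamps: the lead vehicle's rear bumper clears the point at
# 10.0 s and the subject vehicle's front bumper arrives at 11.25 s.
pet_following = post_encroachment_time(10.0, 11.25)
# The subject vehicle never reaches the point (e.g., it collides first).
pet_missing = post_encroachment_time(10.0, None)
```

The `None` branch mirrors the non-existent PET values noted in the text: when the subject vehicle never passes the conflict point, no threshold can produce a violation.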
Time Headway Violation (THWV) The THW metric is
similar to the TTC metric; however, it is defined by the relative
distance between the subject vehicle and lead vehicle and only
the velocity of the subject vehicle. THW violations were
evaluated for thresholds varying from 1 s to 5 s across all
scenarios, and the THWV duration distributions are depicted in
Figure 11.
FIGURE 8 TTCV duration boxplot with varying
threshold values
FIGURE 9 MTTCV duration boxplot with varying
threshold values
FIGURE 10 PETV metric boxplot with varying
threshold values
The duration of the THWV for a threshold value of 1 s
has a median of 3 s and a maximum of 18.3 s, due to the
conditions in scenarios LVD_14 and LVD_15 (the same as with
PETV). When the threshold is increased, the violation duration
distributions are skewed towards the maximum value because
the THW calculation only considers the relative distance
between both vehicles, regardless of the speed of the lead
vehicle. Therefore, a THWV will happen when the ratio
between the distance between the vehicles and the subject
vehicle’s speed is equal to the threshold value
(Δd/speed = threshold). Since most scenarios in this work start
with Δd = 30 m and the vehicles accelerate until reaching
their target speed (often greater than 15 m/s), a THWV will
be triggered before the subject vehicle reaches its target speed,
independent of the behavior of the lead vehicle. In these
situations, a THWV triggers when the subject vehicle is
travelling at 15 m/s, 10 m/s, 7.5 m/s and 6 m/s for thresholds of
2 s, 3 s, 4 s, and 5 s, respectively. This results in longer violation
durations, as seen in Figure 11.
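The trigger speeds quoted above follow directly from Δd/speed = threshold. The short sketch below reproduces that arithmetic for the 30 m initial gap; the function name is hypothetical, and the calculation assumes the gap is still 30 m at the moment the threshold is crossed.

```python
def thw_trigger_speed(gap_m: float, threshold_s: float) -> float:
    """Subject speed (m/s) at which gap_m / speed equals the THW threshold."""
    return gap_m / threshold_s


INITIAL_GAP_M = 30.0  # initial separation used in most scenarios

# Trigger speed for each threshold considered in the text.
trigger_speeds = {th: thw_trigger_speed(INITIAL_GAP_M, th) for th in (2, 3, 4, 5)}
```

The higher the threshold, the lower the speed at which the violation starts, so the violation begins earlier in the acceleration phase and lasts longer.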
Metrics Parameterization Based on Temporal
Occurrence As mentioned in previous sections, one of the
criteria for evaluating the metrics is to compare the results
against the DSV, considered to be the ground truth metric.
This was done by examining the temporal occurrence of each
metric for each scenario with respect to the defined ground
truth. From this, three temporal regions were identified:
1. Metric violation prior to DSV with a_brake = 5 m/s^2:
This indicates that the subject vehicle may not require
immediate action since a forced braking event would
not require excessive deceleration. However, if a
metric violation occurred too early compared to DSV
at 5 m/s^2, this may be an overly conservative violation.
2. Metric violation between DSV with a_brake = 5 m/s^2
and DSV with a_brake = 8.3 m/s^2: This indicates the
subject vehicle is in a situation in which an avoidance
maneuver may need to be taken, since a forced
braking event might require a deceleration greater
than 5 m/s^2, considered to be an emergency maneuver
[10] that may not be a comfortable deceleration rate
for passengers [19]. Metric violations happening in
this temporal region may not indicate failure to
identify the safety of the situation, as the DSV does
not consider the reaction time of the subject vehicle
but also does not consider the speed of the
lead vehicle.
3. Metric violation after DSV with a_brake = 8.3 m/s^2:
This indicates that the subject vehicle is in a situation
in which an avoidance maneuver is recommended,
since a forced braking event might require a
deceleration rate greater than 8.3 m/s^2, which was
established in [10] as an appropriate expected value
for AEB systems; thus, the deceleration rate would
exceed (or nearly exceed) the braking capability of the
AV, and a collision may not be avoided. Metric
violations happening in this temporal region may
indicate failure to identify the safety of the situation
at an appropriate time.
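The three regions above amount to comparing the start time of a metric violation against the two DSV onset times. A minimal sketch follows, using the LVS_10 timings reported in the Scenario Results section of this paper (DSV at 23.0 s and 23.4 s, collision at 24.0 s, and time-based violations starting 2 s before the collision); the function name and the handling of ties at the region boundaries are assumptions made for illustration.

```python
def temporal_region(t_violation: float, t_dsv_5: float, t_dsv_83: float) -> int:
    """Classify a violation start time into temporal region 1, 2, or 3.

    t_dsv_5 and t_dsv_83 are the onset times of the DSV at 5 m/s^2 and
    8.3 m/s^2. Boundary ties are assigned to region 2 (an assumption).
    """
    if t_violation < t_dsv_5:
        return 1
    if t_violation <= t_dsv_83:
        return 2
    return 3


# LVS_10 timings as reported in the paper (seconds of scenario execution).
T_DSV_5, T_DSV_83, T_COLLISION = 23.0, 23.4, 24.0

# TTCV, MTTCV and THWV started 2 s before the collision; PETV at the collision.
region_ttcv = temporal_region(T_COLLISION - 2.0, T_DSV_5, T_DSV_83)
region_petv = temporal_region(T_COLLISION, T_DSV_5, T_DSV_83)

# Lead time of the TTCV relative to the DSV at 5 m/s^2.
ttcv_lead_time = T_DSV_5 - (T_COLLISION - 2.0)
```

In this example the time-based violations fall in region 1 with a 1 s lead over the DSV at 5 m/s^2, while the PETV falls in region 3.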
The results of metric violation temporal occurrences
across all scenarios at varying thresholds/parameters based
on the three regions are listed in Table 8. Additionally, the
average time difference between a metric violation and the
DSV at 5 m/s^2, the DSV at 8.3 m/s^2, or a collision was
calculated for each threshold, depending on whether the metric
violation occurred in temporal region 1, 2, or 3, respectively.
This allows for analyzing the trade-offs between different
thresholds and parameter sets. The optimal threshold/parameter
set should maximize violation occurrences in region 1 at the
minimum average time prior to DSV at 5 m/s^2, maximize the
time difference between violation occurrence and DSV at
8.3 m/s^2 in region 2, and minimize violation occurrences in
region 3. Otherwise, the metric at the given threshold could
become overly conservative or not relevant enough.
The TTCV metric with a threshold of 2 s has as many
violations as TTCV with thresholds of 3 s and 4 s in region 1
but with the lowest average time prior to DSV at 5 m/s^2. The
MTTCV metric with a threshold of 2 s, while it achieved a
lower number of violations in region 1 than did higher
threshold values, has the lowest average time prior to DSV at
5 m/s^2. The PETV metric with a threshold of 1.5 s achieved a
slightly higher average time prior to DSV at 5 m/s^2 (only
0.13 s higher than a threshold of 1 s) but with a lower number
of metric violations within region 3. The THWV with a
threshold of 2 s achieves all occurrences in region 1, but the
average time of a THWV prior to DSV at 5 m/s^2 of 2.4 s
indicates that the metric with the given threshold is possibly
overly conservative. Finally, the MSDV metric occurs most
frequently in the first region with varying parameters, but the
NDS parameter category achieves the lowest average metric
violation time prior to DSV at 5 m/s^2.
Metrics Parametrization
Selection
Based on the metrics parametrization analysis from the
previous section, the results shown in Table 8 in Appendix C,
and the threshold/parameter set selection process discussed
above, a threshold value or a parameter set was chosen for
each metric and the OSA metrics in the scenarios defined in
this study were evaluated. A value of 2 s was chosen for the
TTCV, MTTCV, and THWV metrics and 1.5 s for PETV, as
these values demonstrated a balance between conservativeness
and usefulness when compared to other threshold values for
each given metric. For the MSDV metric, the parameter set
from the NDS category was chosen as it demonstrated
reasonable following distances (Figure 5) at the minimum
violation duration (median of 5.1 s) with a reasonable
compromise between conservativeness and usefulness compared
to the other parameter categories. It is worth noting that
parameter/threshold values may differ when evaluating the
safety performance of human-driven vehicles, compared to
that of AVs, as human drivers could have longer reaction times
than the ones proposed in this work.
FIGURE 11 THW metric boxplot with varying
threshold values
Scenario Results
The times at which metric violations occurred throughout the
execution of the scenarios with respect to the collision point
are shown in Figure 12 through Figure 15. The violation
symbols on the figures depict the start of a metric violation
and the trace after it demarcates the violation duration. Thus,
in scenarios where more than one violation occurred for a
given metric¹, the cause was either a change in the
dynamics of the scenario or an initial, overly conservative
violation. Additionally, the temporal occurrence of the ground
truth DSV is shown in each figure for both the minimum
deceleration needed to perform an emergency maneuver
(a_brake = 5 m/s^2, in orange) and a reasonable (maximum)
deceleration applied by automatic emergency braking (AEB)
systems (a_brake = 8.3 m/s^2, in red) for a visual comparison of
the metrics’ performance in preemptive warning versus the
emergency warning time needed to respond to a conflict. For
example, in the LVS_10 scenario of Figure 13, the DSV at
5 m/s^2 occurred at 23.0 s and the DSV at 8.3 m/s^2 occurred at
23.4 s.
As depicted in each of the figures, a collision resulted
from each scenario and the timing was dictated by scenario
dynamics, including the speeds and initial separation of the
vehicles. For example, in LVS_10 shown in Figure 13, a
collision happened after 24.0 s of scenario execution
(demonstrated by the Headway Distance curve reaching 0 m on
the secondary y-axis at right) due to the lower speed of the
subject vehicle compared to scenarios LVS_15 and LVS_18, in
which the vehicles collided earlier, at 18.1 s and 16.8 s,
respectively. In the case of the metric violations, in all
scenarios within the LVS category (Figure 13), the THWV, TTCV
and MTTCV occurred at the same time (2 s before the collision)
and lasted until the collision occurred. PETV occurred at the
time of collision, which is not useful as a warning. The only
metric whose results varied with the different scenario
configurations was the MSDV metric.
For TTCV, it is worth noting that for scenarios in which
the lead vehicle is driving faster than the subject vehicle (e.g.,
LVD_16), the TTCV metric fails to provide any information
with respect to the safety of the situation during the condition
v_l > v_f, since it results in a negative denominator for the TTC
calculation. A similar behavior occurs in cases in which the
vehicles are driving at the same speed (e.g., LVD_18): a TTCV
will not occur unless one of the vehicles changes speed, and
even in that case, the TTCV might occur too late. Moreover,
when the vehicles have slightly different speeds, as in LVD_14
and LVD_15, the vehicles drive 5 m away from each other
without triggering a TTCV until the lead vehicle starts
decelerating. The results for TTCV show that this metric is
not always useful, leading to numerous delayed violations in
some cases. Additionally, TTCV is not robust to variations in
the relative vehicle speeds.
¹ This occurred for MTTCV in scenarios LVD_14 and LVD_15 and for
THWV in scenario LVD_16.
MTTCV showed similar behavior to that of TTCV, often
triggering a violation less than 1 s before TTCV. In some cases,
due to variations in the velocities and accelerations of both
vehicles throughout the scenarios, the metric experienced
changing violation status with short durations, indicating
that MTTCV may not be a consistently reliable metric for
different scenarios (e.g., LVD_14). This observation would
suggest that in scenarios where vehicles may experience
minimal changes in velocity and acceleration (i.e., typical
highway car-following scenarios) other metrics may
be more suitable.
The PETV metric did not provide relevant safety information
in cases where there was no post encroachment (e.g.,
LVS_10, LVS_15, and LVS_18), meaning that the subject
vehicle did not travel through the conflict point. These can
be considered analogous to “false negatives” in perception
systems. This is a major pitfall for the robustness and usefulness
of the PETV metric in the car-following scenarios evaluated
in this work.
FIGURE 12 Lead Vehicle Decelerating metrics violations
THWV is a metric that was not considered in the initial
proposed set of safety metrics; however, it has been evaluated
here due to its relevance in a recently published standard for
ALKS features [10]. This metric depends on the distance
between the subject and lead vehicles and the velocity of the
subject vehicle, which may be confounding depending on the
particular driving situation. Consider, for example, scenarios
LVD_16, LVD_18, and LVD_20. A THWV always happens at
the same time even when the lead vehicle is behaving
differently (higher, same, and lower target speed than the
subject vehicle, respectively). This metric produces a violation
once the subject vehicle is moving at a speed of 15 m/s or more.
Specifically in the case of LVD_16, the first THWV disappears
once the distance between the vehicles increases as the lead
vehicle moves faster, and a second THWV appears
approximately 2 s prior to collision once the lead vehicle starts
decelerating. This indicates a lack of robustness to changing
scenario configurations. One benefit of the THWV metric over
other metrics, such as TTCV and PETV, is the relevance and
avoidance of discontinuities and negative values in the data.
The MSDV metric seemed robust to changes in scenario
dynamics, providing a continuous safety assessment even
when the lead vehicle was moving faster than the subject
vehicle (e.g., LVD_16). Variations in its subjective parameters
yielded differing results, but overall, it is a continuous metric
that considers many relevant aspects of driving, such as the
reaction time and braking capability of the vehicles. As a
result, overly conservative violations were not seen in the
scenarios for the parameter values chosen.
The criteria introduced earlier defining the efficacy of a
metric according to robustness, relevance, and comparison
to the ground truth DSV metric were evaluated for each metric
based on the experimental results. Table 6 summarizes the
efficacy of each metric in the evaluation criteria categories
with “-” for low efficacy, “+” for medium efficacy, and “++” for
high efficacy.
FIGURE 13 Lead Vehicle Stopped scenarios and
metrics violations
FIGURE 14 Lead Vehicle Moving at Lower Constant Speed
metrics violations
FIGURE 15 Lead Vehicle Accelerating metrics violations
Conclusions and Future
Work
This work demonstrates the capability of calculating the
previously proposed OSA metrics through the use of simulation
and evaluates the sensitivity and interplay of these metrics for
variations in initial conditions and different scenarios. The
metric calculations were computed directly from the
CARLA outputs to produce graphical data that could
be analyzed for further evaluation.
The following highlights the conclusions drawn from the
evaluation of the scenarios presented in this paper and the
OSA metrics refinement proposed by this work:
• The TTCV metric is unreliable in cases where there is no
relative velocity difference between the lead and
following vehicles. If the lead and subject vehicles are in
close proximity with the same velocity, the TTCV does
not provide relevant safety information. The TTC metric
has been utilized in many human driver studies,
providing a common reference metric, but the
experiments in this work demonstrated several pitfalls of
the metric. As such, the TTCV metric may only
be relevant within the context of an OSA methodology
for comparison purposes to previous traffic engineering
research and naturalistic studies.
• The MTTCV metric is sensitive to changes in the
accelerations of the vehicles involved in the simulated
scenarios. MTTC violations occurred frequently with
short durations at high threshold values, indicating high
sensitivity to any variation in the vehicle position, velocity,
and acceleration. As such, the MTTCV metric does not
provide reliable and robust notification of a collision and
may not be an adequate metric for the scenarios evaluated
in this paper. As a result, the MTTCV metric may not
be relevant for a finalized OSA metrics set.
• The PETV metric may be more applicable to intersection
scenarios; yet, overall it is an ex post facto metric that in
many cases will not provide a continuous, real-time
measurement of a given situation and is more useful in
reactive assessments. Therefore, PETV may not be an
adequate metric for the scenarios evaluated in this paper
and may not be relevant in the finalized OSA metrics set.
• The THWV metric utilizes the relative distance of the
involved vehicles and the speed of the following vehicle;
however, it does not consider any information regarding
the dynamics of the lead vehicle. As a result, the THWV
metric can lead to confounding results in situations where
the parameters of the lead vehicle play a major role in the
context of the scenario (e.g., lead vehicle stopped).
Therefore, THWV may not be an adequate metric for the
scenarios evaluated in this paper and may not be relevant
in the finalized OSA metrics set.
• The MSDV metric is subject to the parameters used in its
formulation; however, research has been conducted in
order to propose sets of parameters that follow
naturalistic driving behavior from humans [15]. The use
of naturalistic studies and generalized vehicle dynamic
capabilities provides a more comprehensive metric
which incorporates the complexity of the physical
attributes of the vehicles. The MSDV formulation
accounts for “what if” worst-case situations, making it
more robust to changes in the dynamics of the situation
without being an overly conservative measure, whereas the
time-based metrics are based solely on relative
measurements of vehicle motion, including relative
position, velocity, and acceleration.
With respect to future work, there is a need to optimize
the selection process for the thresholds/parameter sets
of the OSA metrics. A deeper understanding of violation
duration and temporal occurrence will allow the selection
process to determine the optimal threshold or parameter set.
This is important for comparing the results for optimized
metrics, which will lead to refinement and finalization of the
OSA metrics set.
Additional scenarios will also be evaluated in future
work. This paper focused on car-following scenarios, but the
use of the script and database to calculate the metrics for any
given scenario will allow for others to be considered. Although
the current work examines scenarios composing approximately
18.4% of all light-duty vehicle collisions in the U.S.,
further simulation work should be conducted to consider
additional scenarios, such as intersection or lane-changing
scenarios. The analysis of the additional scenarios will provide
further insight into the relative merits of the OSA metrics,
with the intent to further validate the set and remove any
metrics that are not necessary to the OSA methodology
under development.
To facilitate the OSA metrics set finalization, there is a
need to automate the process of calculating all applicable
metrics proposed within [3] with inputs of the required
parameters. Although the CARLA script was modified to
calculate and output the safety metric results independently, a
script and database combination is planned to establish a
repository and accompanying methodology capable of calculating
the metrics from any simulation software that is capable of
outputting the necessary variables, rather than limiting the
scope specifically to CARLA.
One such software that is planned to be used for future
work is Human, Vehicle, Environment (HVE), which is a
physics-based accident reconstruction software traditionally
used in the evaluation of vehicle dynamics in collision
scenarios. Although CARLA provided a useful platform to
evaluate the discussed metrics, collision incidents were not
evaluated. CARLA is capable of providing the timing for
collision incidents within the scenarios; however, one limitation
of CARLA is the inability of the software to handle the
collision and post-collision vehicle dynamics associated with an
impact. The simulation events were terminated at the
initiation of a collision event since the vehicles are not modeled
to a level of fidelity that is capable of accurately calculating the
physics of the vehicles, including rotation, crush, and exit
velocities associated with the momentum and energy
characteristics of the collision. The proposed Collision Incident
(CI) metric in [3] utilizes the KABCO index to determine the
severity of the collision. The use of HVE could enhance the
severity quantification for any given scenario by providing
information such as delta-v, dissipated energy, and principal
direction of force (PDOF). This refinement would provide
additional granularity to not only consider whether a collision
occurs but also understand the severity and dynamics of a
collision if one does occur.
TABLE 6 Evaluation of OSA metrics based on
presented criteria
Criteria         TTCV   MTTCV   PETV   THWV   MSDV
Robustness       +      +       +      +      ++
Relevance        -      +       -      ++     ++
DSV Comparison   -      -       -      ++     +
When the OSA metrics have been finalized and validated,
the OSA methodology will be developed as part of an overall
SCF. This will allow for a score to be assigned to the navigation
of any given scenario by an AV (or human-driven vehicle) that
will be a part of the AV safety case.
References
1. Safetyengineering.wordpress.com, “e Safety Engineering
Resource,” April 18, 2008, accessed Jan. 8, 2021.
2. Underwriters Laboratories (UL), ANSI/UL 4600- Standard
for Safety For the Evaluation of Autonomous Products, 2020.
3. Wishart, J., Como, S., Elli, M., Russo, B. et al., “Driving
Safety Performance Assessment Metrics for ADS-Equipped
Vehicles,” SAE Technical Paper 2020-01-1206, 2020. https://
doi.org/10.4271/2020-01-1206.
4. Dosovitskiy, A., Ros, G., Codevilla, F., Lopez, A., and Koltun,
V., “CARLA: An Open Urban Driving Simulator,” in 1st
Annual Conference on Robot Learning, 2 017.
5. “ScenarioRunner,” https://github.com/carla-simulator/
scenario_runner, accessed Nov. 1, 2020.
6. Wassim, N., Ranganathan, R., Srinivasan, G., Smith, J.,
Toma, S., Swanson, E., and Burgett, A., “Description of
Light-Vehicle to-Vehicle Communications for Safety
Applications Based on Vehicle-to-Vehicle Communications,”
National Highway Trac Safety Administration, Report No.
DOT HS 811 731, 2013.
7. Najm, W.G., Smith, J.D., and Yanagisawa, M., “Pre-Crash
Scenario Typology for Crash Avoidance Research,” National
Highway Trac Safety Administration, Report No. DOT-
VNTSC-NHTSA-06-02, 2007.
8. Shalev-Shwartz, S., Shammah, S., and Shashua, A., “On a
Formal Model of Safe and Scalable Self-Driving Cars,”
arXiv:1708 .0 6374, 2017.
9. Gettman, D. and Head, L., “Surrogate Safety Measures from
Trac Simulation Models,” 1840(1):104-115, 2003.
10. United Nations Economic Commission for Europe
(UNECE), “Proposal for a New UN Regulation on Uniform
Provisions Concerning the Approval of Vehicles with
Regards to Automated Lane Keeping System,” 2020,
https://undocs.org/ECE/TRANS/WP.29/2020/81.
11. Silberling, J., Wells, P., Acharya, A., Kelly, J., and Lenkeit, J.,
“Development and Application of a Collision Avoidance
Capability Metric,” SAE Technical Paper 2020-01-1207, 2020.
https://doi.org/10.4271/2020-01-1207.
12. Weng, B., Rao, S., Deosthale, E., Schnelle, S., and Barickman,
F., “Model Predictive Instantaneous Safety Metric for
Evaluation of Automated Driving Systems,” in IEEE
Intelligent Vehicles Symposium (IV), 2020.
13. Javed, M.A. and Khan, J.Y., “Performance Analysis of a Time
Headway Based Rate Control Algorithm for VANET Safety
Applications,” in 7th International Conference on Signal
Processing and Communication Systems (ICSPCS), Carrara,
VIC, 2013.
14. Rodionova, A., Alvarez, I., Elli, M.S., Oboril, F., Quast, J.,
and Mangharam, R., “How Safe Is Safe Enough? Automatic
Safety Constraints Boundary Estimation for Decision-Making
in Automated Vehicles,” in IEEE Intelligent
Vehicles Symposium (IV), 2020.
15. China Intelligent Transportation Systems (ITS) Alliance,
“Safety Assurance Technical Requirements for Decision-
Making on Autonomous Vehicles,” C-ITS Alliance
Report, 2020.
16. ASAM, “OpenScenario,”
https://www.asam.net/standards/detail/openscenario/, accessed Jan. 11, 2020.
17. Gassmann, B., Oboril, F., Buerkle, C., Liu, S. et al., “Towards
Standardization of AV Safety: C++ Library for Responsibility
Sensitive Safety,” in IEEE Intelligent Vehicles Symposium
(IV), 2019.
18. Gassmann, B., Pasch, F., Oboril, F., and Scholl, K.-U.,
“Integration of Formal Safety Models on System Level Using
the Example of Responsibility Sensitive Safety and CARLA
Driving Simulator,” in International Conference on Computer
Safety, Reliability, and Security, 2020.
19. Mazzae, E.N., Barickman, F.S., Forkenbrock, G., and
Baldwin, G.H., “NHTSA Light Vehicle Antilock Brake
System Research Program Task 5.2/5.3: Test Track
Examination of Drivers’ Collision Avoidance Behavior Using
Conventional and Antilock Brakes,” DOT HS809, 2003.
20. Balas, V.E. and Balas, M.M., Driver Assisting by Inverse Time
to Collision (World Automation Congress (WAC):
Budapest, 2006).
Acknowledgement
This work was made possible by the generous contributions
and funding provided by the Institute for Automated Mobility
(IAM). The authors would like to thank the IAM for its
continued support in advancing research surrounding
vehicle automation.
Definitions/Abbreviations
ABC - Achieved Behavioral Competency
AD - Aggressive Driving
ADS - Automated Driving System
ADSA - ADS Active
AEB - Automatic Emergency Braking
AHJ - Authority Having Jurisdiction
ALKS - Automated Lane Keeping System
AV - ADS-Equipped Vehicle
CAC - Collision Avoidance Capability
CARLA - Car Learning to Act
CI - Collision Incident
HTCDER - Human Traffic Control Detection Error Rate
HTCVR - Human Traffic Control Violation Rate
IAM - Institute of Automated Mobility
MPrISM - Model Predictive Instantaneous Safety Metric
MSD - Minimum Safe Distance
MSDCE - Minimum Safe Distance Calculation Error
MSDF - Minimum Safe Distance Factor
MSDV - Minimum Safe Distance Violation
MTTC - Modified Time-to-Collision
MTTCV - Modified Time-to-Collision Violation
NHTSA - National Highway Traffic Safety Administration
ORAD - On-Road Automated Driving
OSA - Operational Safety Assessment
PET - Post Encroachment Time
PETV - Post Encroachment Time Violation
PRA - Proper Response Action
RSS - Responsibility-Sensitive Safety
SCF - Safety Case Framework
THW - Time Headway
THWV - Time Headway Violation
TLW - Traffic Law Violation
TTC - Time-to-Collision
TTCV - Time-to-Collision Violation
V&V - Verification and Validation
Appendix A Minimum Longitudinal and Lateral
Distances of MSDV (from [])
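$$
d_{\min}^{\text{long}} = \left[ v_1^{\text{long}} \rho_1 + \frac{1}{2} a_{1,\max,\text{accel}}^{\text{long}} \rho_1^2 + \frac{\left( v_1^{\text{long}} + \rho_1 a_{1,\max,\text{accel}}^{\text{long}} \right)^2}{2 a_{1,\min,\text{decel}}^{\text{long}}} - \frac{\left( v_2^{\text{long}} \right)^2}{2 a_{2,\max,\text{decel}}^{\text{long}}} \right]_{+} \tag{7}
$$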
Where the subject vehicle (subscript 1) is following behind another entity (subscript 2) and both are moving in the same
direction, and [x]+≔max{x, 0}.
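As a minimal sketch of how Equation (7) can be evaluated, the Python function below computes the minimum longitudinal distance for a following (subscript 1) and lead (subscript 2) vehicle. The class name, function name, and numerical parameter values are illustrative assumptions and are not taken from this paper; only the formula itself follows Equation (7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LongitudinalParams:
    """Parameters of Equation (7). Field names are illustrative assumptions."""
    rho_1: float          # response time of the following vehicle [s]
    a_1_max_accel: float  # max acceleration of vehicle 1 during rho_1 [m/s^2]
    a_1_min_decel: float  # min braking deceleration of vehicle 1 [m/s^2]
    a_2_max_decel: float  # max braking deceleration of vehicle 2 [m/s^2]


def min_longitudinal_distance(v_1: float, v_2: float, p: LongitudinalParams) -> float:
    """Evaluate Equation (7); speeds in m/s, result in m."""
    # Speed of vehicle 1 after accelerating for the full response time.
    v_1_after_response = v_1 + p.rho_1 * p.a_1_max_accel
    distance = (
        v_1 * p.rho_1
        + 0.5 * p.a_1_max_accel * p.rho_1 ** 2
        + v_1_after_response ** 2 / (2.0 * p.a_1_min_decel)
        - v_2 ** 2 / (2.0 * p.a_2_max_decel)
    )
    # [x]+ := max{x, 0}
    return max(distance, 0.0)


# Hypothetical parameter set, chosen only to make the arithmetic easy to follow.
params = LongitudinalParams(rho_1=1.0, a_1_max_accel=2.0,
                            a_1_min_decel=4.0, a_2_max_decel=8.0)
print(min_longitudinal_distance(20.0, 20.0, params))  # 20 + 1 + 60.5 - 25 = 56.5
```

With these assumed values, two vehicles travelling at 20 m/s require 56.5 m of separation. A violation of the kind counted by MSDV would correspond to the measured gap falling below this value.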
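$$
d_{\min}^{\text{lat}} = \mu + \left[ \frac{2 v_1^{\text{lat}} + \rho_1 a_{1,\max,\text{accel}}^{\text{lat}}}{2} \rho_1 + \frac{\left( v_1^{\text{lat}} + \rho_1 a_{1,\max,\text{accel}}^{\text{lat}} \right)^2}{2 a_{1,\min,\text{decel}}^{\text{lat}}} - \left( \frac{2 v_2^{\text{lat}} - \rho_2 a_{2,\max,\text{accel}}^{\text{lat}}}{2} \rho_2 - \frac{\left( v_2^{\text{lat}} - \rho_2 a_{2,\max,\text{accel}}^{\text{lat}} \right)^2}{2 a_{2,\min,\text{decel}}^{\text{lat}}} \right) \right]_{+} \tag{8}
$$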
Where the subject vehicle (subscript 1) is to the left of the other entity (subscript 2), $d_{\min}^{\text{lat}}$ is the distance between the right
side of the ego vehicle and the left side of the other entity, μ is a lateral fluctuation margin [m], and [x]+ ≔ max{x, 0}.
Appendix B Pre-Crash Scenario Category Summary
TABLE 7 Pre-crash scenario categories and descriptions for two-vehicle crashes from [6]

| Scenario Category | Description | Scenario Category ID | Proportion of Collisions* |
|---|---|---|---|
| Lead Vehicle Stopped | Subject vehicle is traveling straight in an urban area, in daylight, under clear weather conditions, at an intersection-related location with a posted speed limit of 35 mph and approaches a stopped lead vehicle. | LVS | 10.2% |
| Lead Vehicle Decelerating | Subject vehicle is traveling straight and following a lead vehicle in a rural area, in daylight, under clear weather conditions, at a non-junction with a posted speed limit of 55 mph or more, and the lead vehicle suddenly decelerates. | LVD | 4.2% |
| Lead Vehicle Moving at Lower Constant Speed | Subject vehicle is traveling straight in an urban area, in daylight, under clear weather conditions, at a non-junction with a posted speed limit of 55 mph or more, and approaches a lead vehicle moving at lower constant speed. | LVMLCS | 3.7% |
| Lead Vehicle Accelerating | Subject vehicle is traveling straight in an urban area, in daylight, under clear weather conditions, at an intersection-related location with a posted speed limit of 45 mph and approaches an accelerating lead vehicle. | LVA | 0.3% |
| Total | | | 18.4% |

* The relative frequency is based on the collision statistics from Table 7 in [6].
© SAE International.
© 2021 SAE International. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of SAE International.
Positions and opinions advanced in this work are those of the author(s) and not necessarily those of SAE International. Responsibility for the content of the work lies
solely with the author(s).
ISSN 0148-7191
Appendix C Temporal Occurrence Results
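TABLE 8 Temporal occurrence results of metrics violations at varying thresholds across all scenarios*

| Metric | Threshold | # Violations Prior to DSV at 5 m/s² | Avg. Time Difference of Violation and DSV at 5 m/s² [s] | # Violations Between DSV at 5 and 8.3 m/s² | Avg. Time Difference of Violation and DSV at 8.3 m/s² [s] | # Violations After DSV at 8.3 m/s² | Avg. Time Difference of Violation and Collision [s] |
|---|---|---|---|---|---|---|---|
| MSDV | Aggressive | 13 | 1.27 | 0 | - | 0 | - |
| MSDV | NDS | 9 | 0.58 | 4 | 2.39 | 0 | - |
| MSDV | Conservative | 13 | 7.65 | 0 | - | 0 | - |
| TTCV | 1 s | 0 | - | 3 | 0.20 | 10 | 0.91 |
| TTCV | 2 s | 4 | 0.48 | 1 | 0.20 | 8 | 1.73 |
| TTCV | 3 s | 4 | 1.36 | 3 | 0.43 | 6 | 2.20 |
| TTCV | 4 s | 4 | 2.21 | 3 | 1.28 | 6 | 2.67 |
| TTCV | 5 s | 8 | 2.03 | 2 | 0.93 | 5 | 2.76 |
| MTTCV | 1 s | 2 | 1.75 | 4 | 0.26 | 10 | 0.94 |
| MTTCV | 2 s | 6 | 0.97 | 1 | 0.85 | 8 | 1.84 |
| MTTCV | 3 s | 14 | 6.16 | 7 | 0.58 | 7 | 2.54 |
| MTTCV | 4 s | 23 | 4.72 | 3 | 1.17 | 7 | 4.61 |
| MTTCV | 5 s | 22 | 5.57 | 9 | 1.02 | 5 | 6.14 |
| THWV | 1 s | 2 | 0.75 | 6 | 0.51 | 5 | 2.68 |
| THWV | 2 s | 14 | 2.43 | 0 | - | 0 | - |
| THWV | 3 s | 13 | 4.63 | 0 | - | 0 | - |
| THWV | 4 s | 13 | 5.51 | 0 | - | 0 | - |
| THWV | 5 s | 13 | 6.09 | 0 | - | 0 | - |
| PETV | 0.5 s | 0 | - | 0 | - | 13 | 3.25 |
| PETV | 1 s | 2 | 0.45 | 1 | 0.95 | 10 | 1.33 |
| PETV | 1.2 s | 2 | 0.55 | 1 | 1.95 | 10 | 1.70 |
| PETV | 1.5 s | 3 | 0.58 | 5 | 0.74 | 5 | 0.13 |
| PETV | 2 s | 8 | 2.66 | 2 | 1.35 | 4 | 0.03 |

*Note that the threshold/parameter sets selected for the scenario analysis are in bold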
© SAE International.
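The column structure of Table 8 can be read as a binning of each metric violation by when it occurs relative to three reference events. The Python sketch below illustrates one possible reading: a violation is assigned to the "prior", "between", or "after" column, and its time difference is measured to the next reference event (DSV at 5 m/s², DSV at 8.3 m/s², or the collision). The function names, bin labels, boundary handling, and timestamps are assumptions made for illustration; the paper's actual post-processing may differ.

```python
from collections import defaultdict
from statistics import mean

# Bin labels mirror the three column groups of Table 8 (names are assumptions).
BINS = ("prior_to_dsv_5", "between_dsv_5_and_8_3", "after_dsv_8_3")


def classify_violation(t_violation, t_dsv_5, t_dsv_8_3, t_collision):
    """Return (bin label, time difference [s]) for one metric violation.

    All arguments are timestamps in seconds within a single scenario run.
    """
    if t_violation < t_dsv_5:
        return BINS[0], t_dsv_5 - t_violation
    if t_violation < t_dsv_8_3:
        return BINS[1], t_dsv_8_3 - t_violation
    return BINS[2], t_collision - t_violation


def summarize(events):
    """Aggregate (count, average time difference) per bin over many runs."""
    gaps = defaultdict(list)
    for event in events:
        label, gap = classify_violation(*event)
        gaps[label].append(gap)
    return {
        label: (len(gaps[label]), mean(gaps[label]) if gaps[label] else None)
        for label in BINS
    }


# Hypothetical timestamps: (t_violation, t_dsv_5, t_dsv_8_3, t_collision).
runs = [(3.0, 4.5, 5.0, 6.0), (2.5, 4.5, 5.0, 6.0),
        (4.75, 4.5, 5.0, 6.0), (5.5, 4.5, 5.0, 6.0)]
print(summarize(runs))
```

Under this reading, a metric whose violations mostly fall in the first bin with a large average time difference gives earlier warning than one whose violations occur only shortly before the collision.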